You can do this with the popular stringr and dplyr packages.
library(dplyr)
library(stringr)
df <- tibble(
  sentence = c(
    "Time & tide waits for none",
    " Tit for tat",
    "Eyes are mirror of person's thoughts",
    "Some Other Sentence",
    "Odd sentences failure"
  )
)
df <- df %>%
  # Split each sentence on spaces and store the words in a new list column
  mutate(split_sentence = str_split(sentence, " ")) %>%
  # Work row-wise from here on, because each row holds its own vector of words
  rowwise() %>%
  # Keep only words with an even number of characters (str_length modulo 2 is 0)
  mutate(split_sentence = list(split_sentence[str_length(split_sentence) %% 2 == 0])) %>%
  # Drop empty strings (""), which have length 0 and so pass the even-length test
  mutate(split_sentence = list(split_sentence[str_length(split_sentence) > 0])) %>%
  # Keep the first word with the longest length (which.max returns the first maximum)
  mutate(split_sentence = list(split_sentence[which.max(str_length(split_sentence))])) %>%
  # Take the one remaining word, or NA if no word is left
  mutate(first_even = first(split_sentence)) %>%
  # Ungroup because we no longer need to work row-wise
  ungroup() %>%
  # Convert any NA values to "00", as the question asks
  mutate(first_even = ifelse(is.na(first_even), "00", first_even)) %>%
  select(-split_sentence)
# A tibble: 5 x 2
#   sentence                             first_even
#   <chr>                                <chr>
# 1 Time & tide waits for none           Time
# 2 " Tit for tat"                       00
# 3 Eyes are mirror of person's thoughts person's
# 4 Some Other Sentence                  Sentence
# 5 Odd sentences failure                00
In your description you said that thoughts would be the longest word, but my algorithm found that person's is just as long (both have 8 characters), and because which.max() returns the first maximum, person's wins the tie. If you want to remove the apostrophe, you can work out how to do that with the str_remove_all() function. I'll leave that to you.
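In case it helps as a starting point, here is a minimal sketch of that step. It assumes apostrophes should simply be stripped before word lengths are measured. The helper name first_longest_even() is my own hypothetical choice, not part of stringr or dplyr; it repeats the same filtering steps for a single sentence.

```r
library(stringr)

# Hypothetical helper: returns the first longest even-length word of one
# sentence, or "00" if no even-length word exists.
first_longest_even <- function(sentence) {
  # Assumption: apostrophes should not count towards a word's length
  cleaned <- str_remove_all(sentence, "'")
  # str_split() returns a list; take the word vector for this one sentence
  words <- str_split(cleaned, " ")[[1]]
  # Drop empty strings and keep only words with an even number of characters
  words <- words[str_length(words) > 0 & str_length(words) %% 2 == 0]
  if (length(words) == 0) {
    return("00")
  }
  # which.max() returns the index of the first maximum
  words[which.max(str_length(words))]
}

first_longest_even("Eyes are mirror of person's thoughts")
```

With the apostrophe removed, person's becomes persons (7 characters, odd), so thoughts is picked for that sentence. Note that the returned word comes from the cleaned sentence, so any apostrophes in it are gone. To apply the helper to the whole column you could use something like vapply(df$sentence, first_longest_even, character(1)).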