I have a data frame with n rows containing some text. Some of these rows contain extra text that I would like to remove, and this extra text always appears after one of a few specific sentences.
Let me give an example:
df = structure(list(Text = c("The text you see here is fine, no problem with this.",
"The text you see here is fine, no problem with this.", "The text you see here is fine, no problem with this. We are now ready to take your questions. Life is great even if it is too hot to work at the moment.",
"The text you see here is fine, no problem with this.", "The text you see here is fine, no problem with this.",
"The text you see here is fine, no problem with this. We are now at your disposal for questions. I really need to remove this bit that comes after since I don't need it. Hopefully SE will sort this out.",
"The text you see here is fine, no problem with this.", "The text you see here is fine, no problem with this.",
"The text you see here is fine, no problem with this.", "The text you see here is fine, no problem with this. Transcript of the questions asked and the answers. Summertime is nice.",
"The text you see here is fine, no problem with this.", "The text you see here is fine, no problem with this."
)), class = "data.frame", row.names = c(NA, -12L))
I would like to get:
# Text
# 1 The text you see here is fine, no problem with this.
# 2 The text you see here is fine, no problem with this.
# 3 The text you see here is fine, no problem with this. We are now ready to take your questions.
# 4 The text you see here is fine, no problem with this.
# 5 The text you see here is fine, no problem with this.
# 6 The text you see here is fine, no problem with this. We are now at your disposal for questions.
# 7 The text you see here is fine, no problem with this.
# 8 The text you see here is fine, no problem with this.
# 9 The text you see here is fine, no problem with this.
# 10 The text you see here is fine, no problem with this. Transcript of the questions asked and the answers.
# 11 The text you see here is fine, no problem with this.
# 12 The text you see here is fine, no problem with this.
This data frame is a simplified version of the real one. The extra text (which is the same throughout the example but varies in the real data) always comes after one of three sentences: "We are now at your disposal for questions.", "Transcript of the questions asked and the answers.", and "We are now ready to take your questions."
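To make the goal concrete, here is a minimal sketch of the behaviour I am after. It assumes the three marker sentences appear verbatim in the text and uses literal (non-regex) matching in base R. The names `trim_after_markers` and `markers` are only illustrative, not from any package.

```r
# Marker sentences; everything after the first occurrence of one is dropped.
markers <- c(
  "We are now ready to take your questions.",
  "We are now at your disposal for questions.",
  "Transcript of the questions asked and the answers."
)

# Hypothetical helper: keep each string up to and including the marker.
trim_after_markers <- function(text, markers) {
  for (m in markers) {
    # Start position of the first literal match, -1 where the marker is absent.
    pos <- regexpr(m, text, fixed = TRUE)
    hit <- !is.na(pos) & pos > 0
    if (any(hit)) {
      # Cut right after the last character of the marker sentence.
      text[hit] <- substr(text[hit], 1, pos[hit] + nchar(m) - 1)
    }
  }
  text
}

# Small illustrative input (shortened versions of the rows above).
example <- c(
  "Fine text.",
  "Fine text. We are now ready to take your questions. Extra bit.",
  "Fine text. Transcript of the questions asked and the answers. Summertime is nice."
)
result <- trim_after_markers(example, markers)
```

On the real data this would be applied as `df$Text <- trim_after_markers(df$Text, markers)`. Rows without any marker are left unchanged.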
Can anyone help me sort this out?
You would really make my day.
Thanks!