DATA
mystring1 <- "Other work has shown that, in addition to language-general features such as a decreased speaking rate and an expanded pitch range, clear speech production involves the enhancement of the acoustic-phonetic distance between phonologically contrastive categories ?e.g., Ferguson and Kewley-Port, 2002; Krause and Braida, 2004, Picheny et al, 1986; Smiljanic and Bradlow, 2005, 2007?."
mystring2 <- "Other work has shown that, in addition to language-general features such as a decreased speaking rate and an expanded pitch range, clear speech production involves the enhancement of the acoustic-phonetic distance between phonologically contrastive categories ?e.g., Ferguson and Kewley-Port, 2002; Krause and Braida, 2004, Picheny et al, 1986; Smiljanic and Bradlow, 2005, 2007?. Therefore, reduced sensitivity to any or all of the language-specific acoustic-phonetic dimensions of contrast and clear speech enhancement would yield a diminished clear speech benefit for non-native listeners. This may appear somewhat surprising given that clear speech production was elicited in our studies by instructing the talkers to speak clearly for the sake of listeners with either a hearing impairment or from a different native language background. However, as discussed further in Bradlow and Bent ?2002?, the limits of clear speech as a means of enhancing non-native speech perception likely reflect the “mistuning” that characterizes spoken language communication between native and non-native speakers."
I would like some help with a regular expression. I have some text data, and I want to remove the citation parts that appear between the last word of a sentence and the period. However, the parentheses are somehow missing from the text. mystring1 is an example of this. In this example, I want to remove e.g., Ferguson and Kewley-Port, 2002; Krause and Braida, 2004, Picheny et al, 1986; Smiljanic and Bradlow, 2005, 2007?. But this sentence is only one of several sentences in a paragraph: mystring2 contains three more sentences after mystring1. My goal is to remove the citation part from mystring2, but I have not been successful; my pattern removes more text than I want. How can I revise the regex pattern? Thanks in advance for your help.
# This works for mystring1.
gsub(x = mystring1, pattern = "e\\.g\\.,.*[0-9]{4}(?=.)", replacement = "", perl = T)
[1] "Other work has shown that, in addition to language-general features such as a
decreased speaking rate and an expanded pitch range, clear speech production involves
the enhancement of the acoustic-phonetic distance between phonologically contrastive
categories ??."
# But this pattern does not work for mystring2; gsub() removes more text than I want.
gsub(x = mystring2, pattern = "e\\.g\\.,.*[0-9]{4}(?=.)", replacement = "", perl = T)
[1] "Other work has shown that, in addition to language-general features such as a decreased
speaking rate and an expanded pitch range, clear speech production involves the
enhancement of the acoustic-phonetic distance between phonologically contrastive
categories ??, the limits of clear speech ... (I trimmed texts here) speakers."
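One possible revision, offered as a sketch rather than a definitive fix: the greedy `.*` in the pattern above runs to the last four-digit number in the string that is followed by any character, which in mystring2 is the 2002 in "Bradlow and Bent ?2002?". Assuming that a literal "?" never occurs inside a citation, a negated character class `[^?]*` stops the match at the "?" that closes the citation. The shortened `sample_text` below is a hypothetical stand-in with the same structure as mystring2, not the original data.

```r
# Hypothetical, shortened stand-in for mystring2 (same structure:
# a "?e.g., ...?" citation followed by a later "?2002?" citation).
sample_text <- paste0(
  "Clear speech enhances contrastive categories ",
  "?e.g., Ferguson and Kewley-Port, 2002; Smiljanic and Bradlow, 2005, 2007?. ",
  "However, as discussed further in Bradlow and Bent ?2002?, the limits remain."
)

# [^?]* matches any run of characters other than "?", so the match
# starts at "e.g.," and ends just before the closing "?" of that citation.
# Assumption: no "?" appears inside the citation text itself.
cleaned <- gsub(pattern = "e\\.g\\.,[^?]*", replacement = "", x = sample_text)

print(cleaned)
```

This keeps the two question marks, as in the mystring1 output shown earlier, and leaves the later "?2002?" citation untouched. If the surrounding question marks should go as well, a pattern such as `"\\s*\\?e\\.g\\.,[^?]*\\?"` may be worth trying, under the same assumption.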