Ключевой концепцией в этом случае я считаю использование оператора |
, который работает как или или или для шаблона. Это очень полезно для этих случаев.
Вот как я бы решил проблему:
import pandas as pd
df = pd.DataFrame({'index':[0,1,2,3,4,5,6,7],
'summary':['Fwd: Please look at the attached documents and take action ',
'NSN for the ones who care',
'News for all team members ',
'Fwd:RE:Re: Please take action on the action needed items',
'Fix all the mistakes please ',
'Fwd:Re: Take action on the attachments in this email',
'Fwd:RE: Action is required',
'Redemption!']})
df['clean'] = df['summary'].str.replace(r'^Fwd:|R[eE]:\s*','')
print(df)
Вывод:
index ... clean
0 0 ... Please look at the attached documents and tak...
1 1 ... NSN for the ones who care
2 2 ... News for all team members
3 3 ... Please take action on the action needed items
4 4 ... Fix all the mistakes please
5 5 ... Take action on the attachments in this email
6 6 ... Action is required
7 7 ... Redemption!