Я могу что-то упустить, но вы можете просто использовать .replace()
:
sample_string = 'Patient Jane Candy was seen by Dr. Smith on 12/1/2018 and her number is 5041112222'
dic_list = [
{'id': 'T1', 'type': 'PATIENT', 'start': 0, 'end': 6, 'text': 'Jane'},
{'id': 'T2', 'type': 'PATIENT', 'start': 8, 'end': 11, 'text': 'Candy'},
{'id': 'T3', 'type': 'DOCTOR', 'start': 35, 'end': 39, 'text': 'Smith'},
{'id': 'T4', 'type': 'DATE', 'start': 44, 'end': 52, 'text': '12/1/2018'},
{'id': 'T5', 'type': 'PHONE', 'start': 72, 'end': 81, 'text': '5041112222'}]
for dic in dic_list:
sample_string = sample_string.replace(dic['text'], '**PHI**')
print(sample_string)
Хотя regex
, вероятно, будет быстрее:
import re
sample_string = 'Patient Jane Candy was seen by Dr. Smith on 12/1/2018 and her number is 5041112222'
dic_list = [
{'id': 'T1', 'type': 'PATIENT', 'start': 0, 'end': 6, 'text': 'Jane'},
{'id': 'T2', 'type': 'PATIENT', 'start': 8, 'end': 11, 'text': 'Candy'},
{'id': 'T3', 'type': 'DOCTOR', 'start': 35, 'end': 39, 'text': 'Smith'},
{'id': 'T4', 'type': 'DATE', 'start': 44, 'end': 52, 'text': '12/1/2018'},
{'id': 'T5', 'type': 'PHONE', 'start': 72, 'end': 81, 'text': '5041112222'}]
pattern = re.compile('|'.join(dic['text'] for dic in dic_list))
result = pattern.sub('**PHI**', sample_string)
print(result)
Оба вывода:
Patient **PHI** **PHI** was seen by Dr. **PHI** on **PHI** and her number is **PHI**