Option 1: re.findall with collections.Counter:
import re
from collections import Counter

with open('test.txt') as f:
    # One captured name per line that contains at least one '@'
    data = re.findall(r'(?m)^(\w+).*@.*$', f.read())
print(Counter(data))
# Counter({'tom': 5, 'peter': 4, 'edwin': 3, 'amy': 3, 'john': 1})
Regex explanation:
(?m) # enables multiline mode, so ^ and $ match at the start and end of every line
^ # asserts position at the start of the line
(\w+) # captures one or more word characters in group 1 (this is the name you want)
.* # Greedily matches any character besides line breaks
@ # Matches an @ symbol
.* # Greedily matches any character besides line breaks
$ # Asserts position at end of line
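To see what the pattern captures, here is a minimal sketch that runs it against an in-memory string instead of test.txt. The sample lines are made up for illustration; the file is assumed to have the form "name message", with mentions written as @someone.

```python
import re
from collections import Counter

# Hypothetical sample data standing in for test.txt
sample = """tom hello @peter
peter no mentions here
amy hi @tom and @john
tom thanks @amy
"""

pattern = r'(?m)^(\w+).*@.*$'

# findall returns group 1 (the leading name) for each line containing an '@'
names = re.findall(pattern, sample)
counts = Counter(names)

print(names)
print(counts)
```

Note that amy's line contains two mentions but contributes only one entry: the pattern matches each line at most once, so this counts lines, not individual mentions.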
If you actually need the number of times they mention people, and not just the number of lines in which they mention people:
Option 2: using collections.defaultdict:
from collections import defaultdict

with open('test.txt') as f:
    dct = defaultdict(int)
    for line in f:
        # First token is the speaker; add every '@' found on the line
        dct[line.split()[0]] += line.count('@')
print(dct)
# defaultdict(<class 'int'>, {'peter': 5, 'amy': 3, 'tom': 5, 'edwin': 3, 'john': 2})
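The two counts only differ when a single line contains more than one '@'. A small sketch with made-up lines (not the contents of test.txt) that computes both side by side:

```python
from collections import Counter, defaultdict

# Hypothetical lines; the second one mentions two people
lines = [
    "peter hi @tom",
    "peter thanks @amy and @john",
    "john no mentions",
]

lines_with_mentions = Counter()    # how many lines contain at least one '@'
total_mentions = defaultdict(int)  # how many '@' characters in total

for line in lines:
    name = line.split()[0]
    n = line.count('@')
    if n:
        lines_with_mentions[name] += 1
    total_mentions[name] += n

print(dict(lines_with_mentions))
print(dict(total_mentions))
```

One side effect worth knowing: because the defaultdict loop adds `line.count('@')` unconditionally, a speaker who never mentions anyone still ends up in the result with a count of 0.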
Option 3: living on the edge with pandas:
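import pandas as pd

with open('test.txt') as f:
    # Split each line once: column 0 is the name, column 1 is the rest of the line
    data = [i.split(' ', 1) for i in f.read().splitlines()]
df = pd.DataFrame(data)
# Summing strings concatenates each person's messages; then count the '@' characters
print(df.groupby(0).sum()[1].str.count('@'))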
# Result
0
amy 3
edwin 3
john 2
peter 5
tom 5