What is the best (or quickest) way to make this code shorter? - PullRequest
0 votes
/ May 26, 2020

For clarity, the file I am reading the data from looks like this, and runs to several thousand lines:

[12:29, 8.2.2020] Fabian Obst: Wir sind stammtisch heute raus
[12:30, 8.2.2020] Benedikt Stumpf: Dito
[12:40, 8.2.2020] Louis Rückel: Ich wär da
[12:41, 8.2.2020] Jan Hofmann: Ich geb nochmal bescheid

Professional programmers will probably want to claw their eyes out when they see this code, but I don't yet know an efficient way to shorten it. Could you help me?

class Months():
    December17 = []
    January18 = []
    February18 = []
    March18 = []
    April18 = []
    May18 =[]
    June18 = []
    July18 = []
    August18 = []
    September18 = []
    October18 = []
    November18 = []
    December18 = []
    January19 = []
    February19 = []
    March19 = []
    April19 = []
    May19 =[]
    June19 = []
    July19 = []
    August19 = []
    September19 = []
    October19 = []
    November19 = []
    December19 = []
    January20 = []
    February20 = []
    March20 = []
    April20 = []
    May20 =[]

    with open('whatsapp.txt','r', encoding="UTF-8") as file:
        for line in file:
            if '12.2017' in line:
                December17.append(line)
            elif '.1.2018' in line:
                January18.append(line)
            elif '.2.2018' in line:
                February18.append(line)
            elif '3.2018' in line:
                March18.append(line)
            elif '4.2018' in line:
                April18.append(line)
            elif '5.2018' in line:
                May18.append(line)
            elif '6.2018' in line:
                June18.append(line)
            elif '7.2018' in line:
                July18.append(line)
            elif '8.2018' in line:
                August18.append(line)
            elif '9.2018' in line:
                September18.append(line)
            elif '10.2018' in line:
                October18.append(line)
            elif '11.2018' in line:
                November18.append(line)
            elif '12.2018' in line:
                December18.append(line)
            elif '.1.2019' in line:
                January19.append(line)
            elif '.2.2019' in line:
                February19.append(line)
            elif '3.2019' in line:
                March19.append(line)
            elif '4.2019' in line:
                April19.append(line)
            elif '5.2019' in line:
                May19.append(line)
            elif '6.2019' in line:
                June19.append(line)
            elif '7.2019' in line:
                July19.append(line)
            elif '8.2019' in line:
                August19.append(line)
            elif '9.2019' in line:
                September19.append(line)
            elif '10.2019' in line:
                October19.append(line)
            elif '11.2019' in line:
                November19.append(line)
            elif '12.2019' in line:
                December19.append(line)
            elif '.1.2020' in line:
                January20.append(line)
            elif '.2.2020' in line:
                February20.append(line)
            elif '3.2020' in line:
                March20.append(line)
            elif '4.2020' in line:
                April20.append(line)
            elif '5.2020' in line:
                May20.append(line)

    print (" December17:", len(December17),"\n",
        "January18:", len(January18),"\n",
        "February18:", len(February18),"\n",
        "March18:", len(March18),"\n",
        "April18:", len(April18),"\n",
        "May18:", len(May18),"\n",
        "June18:", len(June18),"\n",
        "July18:", len(July18),"\n",
        "August18:", len(August18),"\n",
        "September18:", len(September18),"\n",
        "October18:", len(October18),"\n",
        "November18:", len(November18),"\n",
        "December18:", len(December18),"\n",
        "January19:", len(January19),"\n",
        "February19:", len(February19),"\n",
        "March19:", len(March19),"\n",
        "April19:", len(April19),"\n",
        "May19:", len(May19),"\n",
        "June19:", len(June19),"\n",
        "July19:", len(July19),"\n",
        "August19:", len(August19),"\n",
        "September19:", len(September19),"\n",
        "October19:", len(October19),"\n",
        "November19:", len(November19),"\n",
        "December19:", len(December19),"\n",
        "January20:", len(January20),"\n",
        "February20:", len(February20),"\n",
        "March20:", len(March20),"\n",
        "April20:", len(April20),"\n",
        "May20:", len(May20),"\n",
        )

    Summary = len(December17+January18+February18+March18+April18
        +May18+June18+July18+August18+September18+October18
        +November18+December18+January19+February19+March19
        +April19+May19+June19+July19+August19+September19
        +October19+November19+December19+January20+February20
        +March20+April20+May20)
    print ("There are", Summary, "messages in total.")

It returns what it should:

December17: 19 
 January18: 13 
 February18: 41 
 March18: 43 
 April18: 80 
 May18: 241 
 June18: 67 
 July18: 183 
 August18: 280 
 September18: 83 
 October18: 61 
 November18: 116 
 December18: 228 
 January19: 145 
 February19: 111 
 March19: 131 
 April19: 188 
 May19: 151 
 June19: 120 
 July19: 222 
 August19: 289 
 September19: 141 
 October19: 127 
 November19: 107 
 December19: 190 
 January20: 92 
 February20: 73 
 March20: 90 
 April20: 45 
 May20: 136 

There are 3813 messages in total.

I would like to need only a few lines for the 30 lists at the top, and perhaps also for the if and print statements towards the end.

1 Answer

1 vote
/ May 26, 2020

You need something like this:

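from collections import OrderedDict
from datetime import datetime

# Maps a month label such as 'Feb 2020' to the list of lines from that month,
# in the order the months first appear in the file.
months = OrderedDict()

with open('whatsapp.txt', 'r', encoding='utf-8') as file:
    for line in file:
        # '[12:29, 8.2.2020] Name: text' -> '[12:29, 8.2.2020' -> datetime
        ts = datetime.strptime(line.split(']')[0], '[%H:%M, %d.%m.%Y')
        # Create the month's list on first use, then append the line to it.
        months.setdefault(ts.strftime('%b %Y'), []).append(line)

for month, messages in months.items():
    print(f'{month}:', len(messages))

# The total is the sum of the per-month list lengths.
print('There are {} messages in total.'.format(sum(map(len, months.values()))))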

line.split(']')[0] extracts the start of each line, for example "[12:29, 8.2.2020", which is then parsed into a datetime object. That datetime is used to build a key such as "Feb 2020" (the %b format code yields the abbreviated month name) in the ordered dictionary, and the line is appended to the list stored under that key. The rest is just computation on the aggregated data.
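
If only the counts are needed, the same grouping idea can be sketched with a Counter instead of lists. This is a variation and not part of the answer above: the function name count_by_month, the '%Y-%m' key format, and the choice to skip lines that do not start with a timestamp (assumed here to be continuations of multi-line messages) are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def count_by_month(lines):
    """Count messages per month, keyed as 'YYYY-MM'.

    Hypothetical helper: lines that do not start with a timestamp are
    skipped, on the assumption that they continue a multi-line message.
    """
    counts = Counter()
    for line in lines:
        try:
            ts = datetime.strptime(line.split(']')[0], '[%H:%M, %d.%m.%Y')
        except ValueError:
            # No parsable '[HH:MM, D.M.YYYY' prefix: not the start of a message.
            continue
        counts[ts.strftime('%Y-%m')] += 1
    return counts


sample = [
    '[12:29, 8.2.2020] Fabian Obst: Wir sind stammtisch heute raus',
    '[12:30, 8.2.2020] Benedikt Stumpf: Dito',
    'a continuation line without a timestamp',       # made-up line
    '[09:15, 3.12.2017] Example User: made-up line',  # made-up line
]

counts = count_by_month(sample)
for month in sorted(counts):
    print(f'{month}: {counts[month]}')
print(f'There are {sum(counts.values())} messages in total.')
```

The 'YYYY-MM' key does not depend on the locale and sorts chronologically as plain text, which is why sorted(counts) is enough to print the months in order. To read the real file, pass the open file object to count_by_month in place of sample.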
