I found a solution to my problem. Instead of building the 'polarity' column by hand, assigning trump['polarity'] to the result of nested lists, I merged the tidy_format and sent DataFrames (sent has a polarity column holding the polarity score of each word in the VADER lexicon, and it is indexed by the word itself) and performed the operations on the resulting table:
>>> tidy_sent = tidy_format.merge(sent, left_on = 'word', right_index = True)
>>> tidy_sent.fillna(0, inplace = True)
>>> tidy_sent.index = tidy_sent.index.set_names('id')
>>> tidy_sent.head()
                    num word  polarity
id
786204978629185536    0  pay      -0.4
783477966906925056    5  pay      -0.4
771294347501461504    2  pay      -0.4
771210555822477313    2  pay      -0.4
764552764177481728   20  pay      -0.4
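>>> # Select the numeric columns explicitly so 'word' is not aggregated on newer pandas
>>> ts_grouped = tidy_sent.groupby('id')[['num', 'polarity']].sum()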
>>> ts_grouped.head()
                    num  polarity
id
690171403388104704   10      -2.6
690173226341691392   27      -6.0
690176882055114758   39       4.3
690180284189310976   38      -2.6
690271688127213568   18      -5.2
>>> trump['polarity'] = ts_grouped['polarity']
>>> trump.fillna(0, inplace = True)
>>> trump['polarity'].head()
786204978629185536 1.0
786201435486781440 -6.9
786189446274248704 1.8
786054986534969344 1.5
786007502639038464 1.2
Name: polarity, dtype: float64
Since my original mistake was in how I computed trump['polarity'], merging the tables gives me the correct values for this Series, which in turn lets me call sort_values() correctly.
>>> print('Most negative tweets:')
>>> for t in trump.sort_values(by = 'polarity').head()['text']:
...     print('\n ', t)
Most negative tweets:
the trump portrait of an unsustainable border crisis is dead on. “in the last two years, ice officers made 266,000 arrests of aliens with criminal records, including those charged or convicted of 100,000 assaults, 30,000 sex crimes & 4000 violent killings.” america’s southern....
it is outrageous that poisonous synthetic heroin fentanyl comes pouring into the u.s. postal system from china. we can, and must, end this now! the senate should pass the stop act – and firmly stop this poison from killing our children and destroying our country. no more delay!
the rigged russian witch hunt goes on and on as the “originators and founders” of this scam continue to be fired and demoted for their corrupt and illegal activity. all credibility is gone from this terrible hoax, and much more will be lost as it proceeds. no collusion!
...this evil anti-semitic attack is an assault on humanity. it will take all of us working together to extract the poison of anti-semitism from our world. we must unite to conquer hate.
james comey is a proven leaker & liar. virtually everyone in washington thought he should be fired for the terrible job he did-until he was, in fact, fired. he leaked classified information, for which he should be prosecuted. he lied to congress under oath. he is a weak and.....
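For anyone who wants to try the same approach without the original data, here is a minimal self-contained sketch on toy data. The three frames only imitate the shape of trump, tidy_format and sent as described above. The tweet texts, the ids and every lexicon score except pay = -0.4 are invented for illustration and are not real VADER values. It also assumes that a plain whitespace split is good enough to build the toy tidy_format.

```python
import pandas as pd

# Toy stand-in for `trump`: one row per tweet, indexed by tweet id (made-up data).
trump = pd.DataFrame(
    {"text": ["pay the terrible bill", "great win", "the of and"]},
    index=pd.Index([1, 2, 3], name="id"),
)

# Toy stand-in for `sent`: one polarity score per word, indexed by the word.
# Only pay = -0.4 comes from the output above; the other scores are invented.
sent = pd.DataFrame(
    {"polarity": [-0.4, -2.1, 3.1, 2.8]},
    index=["pay", "terrible", "great", "win"],
)

# Toy stand-in for `tidy_format`: one row per (tweet, word), indexed by tweet id.
rows = [
    (tweet_id, num, word)
    for tweet_id, text in trump["text"].items()
    for num, word in enumerate(text.split())
]
tidy_format = pd.DataFrame(rows, columns=["id", "num", "word"]).set_index("id")

# Inner join on the word: rows whose word is not in the lexicon are dropped.
tidy_sent = tidy_format.merge(sent, left_on="word", right_index=True)

# Sum word polarities per tweet; grouping by index level avoids relying on its name.
ts_grouped = tidy_sent.groupby(level=0)["polarity"].sum()

# Assignment aligns on the tweet id; tweets with no lexicon words get NaN, then 0.
trump["polarity"] = ts_grouped
trump["polarity"] = trump["polarity"].fillna(0)

# Ascending sort puts the most negative tweets first.
most_negative = trump.sort_values(by="polarity").head(2)
print(most_negative)
```

Because the merge is an inner join, a tweet made up entirely of words that are missing from the lexicon disappears from tidy_sent and only comes back as 0 through the final fillna. The sketch fills just the polarity column rather than the whole frame, so missing values in other columns are left alone.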