How to resolve "TypeError: string indices must be integers" with a list comprehension over a list of dicts?
Asked May 9, 2020

Can someone explain to me why this works (standalone example):

import numpy as np
import pandas as pd

# dtype=object keeps each list of dicts as one cell; newer NumPy releases raise an error for ragged nested input without it
numpy_data = np.array([[1, [{'id': 1495, 'name': 'fishing'}, {'id': 12392, 'name': 'best friend'}]],
                       [3, [{'id': 818, 'name': 'based on novel'}, {'id': 10131, 'name': 'interracial relationship'}]]], dtype=object)
df = pd.DataFrame(data=numpy_data, index=["row1", "row2"], columns=["id", "keywords_text"])
df['keywords_list'] = df['keywords_text'].apply(lambda column_value: " ".join([sub['name'] for sub in column_value]))
df.head(20)

Here is the output of the head command:

df is a <class 'pandas.core.frame.DataFrame'> datatype
       id   keywords_text                                       keywords_list
==== =====  =================================================== ========================
row1    1   [{'id': 1495, 'name': 'fishing'}, {'id': 12392...   fishing best friend
row2    3   [{'id': 818, 'name': 'based on novel'}, {'id':...   based on novel interracial relationship

And this does not (the data comes from the keywords file of the Kaggle Movies dataset):

df_movie_keywords['keywords_list'] = df_movie_keywords['keywords'].apply(lambda column_value: " ".join([sub['name'] for sub in column_value]))

I get this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1473-18a756783d63> in <module>
     15 
     16 # df_movie_keywords['keywords_list'] = df_movie_keywords.apply(lambda row: string_all_keywords(row), axis=1)
---> 17 df_movie_keywords['keywords_list'] = df_movie_keywords['keywords'].apply(lambda column_value : " ".join([sub['name'] for sub in column_value]))
     18 
     19 # df['keywords_list'] = df['keywords_text'].apply(lambda column_value : " ".join([sub['name'] for sub in column_value]))
~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3846             else:
   3847                 values = self.astype(object).values
-> 3848                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   3849 
   3850         if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-1473-18a756783d63> in <lambda>(column_value)
     15 
     16 # df_movie_keywords['keywords_list'] = df_movie_keywords.apply(lambda row: string_all_keywords(row), axis=1)
---> 17 df_movie_keywords['keywords_list'] = df_movie_keywords['keywords'].apply(lambda column_value : " ".join([sub['name'] for sub in column_value]))
     18 
     19 # df['keywords_list'] = df['keywords_text'].apply(lambda column_value : " ".join([sub['name'] for sub in column_value]))
<ipython-input-1473-18a756783d63> in <listcomp>(.0)
     15 
     16 # df_movie_keywords['keywords_list'] = df_movie_keywords.apply(lambda row: string_all_keywords(row), axis=1)
---> 17 df_movie_keywords['keywords_list'] = df_movie_keywords['keywords'].apply(lambda column_value : " ".join([sub['name'] for sub in column_value]))
     18 
     19 # df['keywords_list'] = df['keywords_text'].apply(lambda column_value : " ".join([sub['name'] for sub in column_value]))
TypeError: string indices must be integers

1 Answer

Answered May 9, 2020
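The keywords column is read from the CSV as strings, not as lists of dicts, so the comprehension iterates over single characters and sub['name'] indexes a str. Convert each cell with ast.literal_eval first:

from ast import literal_eval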
import pandas as pd

df = pd.read_csv('keywords.csv')

print(type(df.keywords[0]))

# Output: <class 'str'>

df.keywords = df.keywords.apply(literal_eval)

print(type(df.keywords[0]))

# Output: <class 'list'>

df['keywords_list'] = df['keywords'].apply(lambda column_value : " ".join([sub['name'] for sub in column_value]))

print(df['keywords_list'].head())

0    jealousy toy boy friendship friends rivalry bo...
1    board game disappearance based on children's b...
2     fishing best friend duringcreditsstinger old men
3    based on novel interracial relationship single...
4    baby midlife crisis confidence aging daughter ...
Name: keywords_list, dtype: object
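
The failure can also be reproduced without the dataset or pandas, which may make the cause easier to see. The sketch below is only an illustration: raw_cell is a hand-written stand-in for one CSV cell (its contents are copied from the example rows in the question), and the names raw_cell, parsed_cell and error_message are made up for this example.

```python
from ast import literal_eval

# A cell as it would arrive from a CSV file: the text of a list, not a list.
# (Assumed stand-in for one value of the keywords column.)
raw_cell = "[{'id': 1495, 'name': 'fishing'}, {'id': 12392, 'name': 'best friend'}]"

# Iterating over a str yields single characters, so sub is '[' on the first
# pass and sub['name'] raises the TypeError shown in the question.
try:
    " ".join([sub['name'] for sub in raw_cell])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# literal_eval parses the text into real Python objects (a list of dicts),
# after which the same comprehension works.
parsed_cell = literal_eval(raw_cell)
keywords = " ".join([sub['name'] for sub in parsed_cell])

print(type(raw_cell).__name__, "->", type(parsed_cell).__name__)  # str -> list
print(keywords)  # fishing best friend
```

If every value in the column has this format, pd.read_csv also accepts a converters argument, for example converters={'keywords': literal_eval}, which should do the same parsing at load time. That is an alternative to the apply call, not what the answer above used.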