When I run this code in Jupyter Notebook:
columns = ['nkill', 'nkillus', 'nkillter', 'nwound', 'nwoundus', 'nwoundte', 'propvalue', 'nperps', 'nperpcap', 'iyear', 'imonth', 'iday']
for col in columns:
    # needed for any missing values set to '-99'
    df[col] = [np.nan if (x < 0) else x for x in df[col].tolist()]
    # calculate the mean of the column
    column_temp = [0 if math.isnan(x) else x for x in df[col].tolist()]
    mean = round(np.mean(column_temp))
    # then apply the mean to all NaNs
    df[col].fillna(mean, inplace=True)
I get the following error:
AttributeError                            Traceback (most recent call last)
<ipython-input-56-f8a0a0f314e6> in <module>()
      3 for col in columns:
      4     # needed for any missing values set to '-99'
----> 5     df[col] = [np.nan if (x < 0) else x for x in df[col].tolist()]
      6     # calculate the mean of the column
      7     column_temp = [0 if math.isnan(x) else x for x in df[col].tolist()]

/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
   4374             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4375                 return self[name]
-> 4376             return object.__getattribute__(self, name)
   4377
   4378     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'tolist'
The code works fine when I run it in PyCharm, and everything I have found while researching the error suggests it should work. Am I missing something?
I've created a Minimal, Complete, and Verifiable example below:
import numpy as np
import pandas as pd
import os
import math
# get the path to the current working directory
cwd = os.getcwd()
# then add the name of the Excel file, including its extension to get its relative path
# Note: make sure the Excel file is stored inside the cwd
file_path = cwd + "/data.xlsx"
# Copy the database to file
df = pd.read_excel(file_path)
columns = ['nkill', 'nkillus', 'nkillter', 'nwound', 'nwoundus', 'nwoundte', 'propvalue', 'nperps', 'nperpcap', 'iyear', 'imonth', 'iday']
for col in columns:
    # needed for any missing values set to '-99'
    df[col] = [np.nan if (x < 0) else x for x in df[col].tolist()]
    # calculate the mean of the column
    column_temp = [0 if math.isnan(x) else x for x in df[col].tolist()]
    mean = round(np.mean(column_temp))
    # then apply the mean to all NaNs
    df[col].fillna(mean, inplace=True)
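For reference, one situation in which `df[col]` is a DataFrame rather than a Series (which would produce exactly this AttributeError) is a column label that appears more than once in `df.columns`. The sketch below uses a hypothetical toy frame, not data.xlsx, and whether this is what happens in the notebook session is only an assumption; it is a way to check, not a confirmed diagnosis.

```python
import pandas as pd

# Hypothetical toy frame (not data.xlsx): the label 'nkill' appears twice.
df = pd.DataFrame([[1, -99, 3], [4, 5, -99]],
                  columns=['nkill', 'nkill', 'nwound'])

# A unique label selects a Series; a duplicated label selects a DataFrame.
print(type(df['nwound']).__name__)
print(type(df['nkill']).__name__)

# List every label that occurs more than once in the frame.
duplicated_labels = df.columns[df.columns.duplicated()].tolist()
print(duplicated_labels)

# A Series has .tolist(); a DataFrame does not, hence the AttributeError.
has_tolist = hasattr(df['nkill'], 'tolist')
print(has_tolist)
```

Running the `df.columns.duplicated()` check on the real `df` inside the notebook, right before the loop, would show whether the frame held in that session differs from the one PyCharm builds from a fresh read.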