Pandas DataFrame: columns unintentionally dropped - strange behavior
0 votes
/ September 21, 2018

I'm seeing strange behavior when trying to create a dataframe. I have a list of dicts that I convert to a dataframe. However, during creation two columns seem to be unintentionally dropped. I'm not sure why this happens.

Here is my list:

    data_income_stmt = [{'ticker': 'ADBE', 'FY': 2017, 'statement': 'income_statement', 'operatingrevenue': 7301505000.0, 'totalrevenue': 7301505000.0, 'operatingcostofrevenue': 1010491000.0, 'totalcostofrevenue': 1010491000.0, 'totalgrossprofit': 6291014000.0, 'sgaexpense': 624706000.0, 'marketingexpense': 2197592000.0, 'rdexpense': 1224059000.0, 'amortizationexpense': 76562000.0, 'totaloperatingexpenses': 4122919000.0, 'totaloperatingincome': 2168095000.0, 'totalinterestexpense': 74402000.0, 'totalinterestincome': 7553000.0, 'otherincome': 36395000.0, 'totalotherincome': -30454000.0, 'totalpretaxincome': 2137641000.0, 'incometaxexpense': 443687000.0, 'netincomecontinuing': 1693954000.0, 'netincome': 1693954000.0, 'netincometocommon': 1693954000.0, 'weightedavebasicsharesos': 493632000.0, 'basiceps': 3.43, 'weightedavedilutedsharesos': 501123000.0, 'dilutedeps': 3.38, 'weightedavebasicdilutedsharesos': 493900000.0, 'basicdilutedeps': 3.43}, {'ticker': 'ADBE', 'FY': 2016, 'statement': 'income_statement', 'operatingrevenue': 5854430000.0, 'totalrevenue': 5854430000.0, 'operatingcostofrevenue': 819908000.0, 'totalcostofrevenue': 819908000.0, 'totalgrossprofit': 5034522000.0, 'sgaexpense': 576202000.0, 'marketingexpense': 1910197000.0, 'rdexpense': 975987000.0, 'amortizationexpense': 78534000.0, 'totaloperatingexpenses': 3540920000.0, 'totaloperatingincome': 1493602000.0, 'totalinterestexpense': 70442000.0, 'totalinterestincome': -1570000.0, 'otherincome': 13548000.0, 'totalotherincome': -58464000.0, 'totalpretaxincome': 1435138000.0, 'incometaxexpense': 266356000.0, 'netincomecontinuing': 1168782000.0, 'netincome': 1168782000.0, 'netincometocommon': 1168782000.0, 'weightedavebasicsharesos': 498345000.0, 'basiceps': 2.35, 'weightedavedilutedsharesos': 504299000.0, 'dilutedeps': 2.32, 'weightedavebasicdilutedsharesos': 497400000.0, 'basicdilutedeps': 2.35}, {'ticker': 'ADBE', 'FY': 2015, 'statement': 'income_statement', 'operatingrevenue': 4795511000.0, 'totalrevenue': 
4795511000.0, 'operatingcostofrevenue': 744317000.0, 'totalcostofrevenue': 744317000.0, 'totalgrossprofit': 4051194000.0, 'sgaexpense': 533478000.0, 'marketingexpense': 1683242000.0, 'rdexpense': 862730000.0, 'amortizationexpense': 68649000.0, 'totaloperatingexpenses': 3148099000.0, 'totaloperatingincome': 903095000.0, 'totalinterestexpense': 64184000.0, 'totalinterestincome': 961000.0, 'otherincome': 33909000.0, 'totalotherincome': -29314000.0, 'totalpretaxincome': 873781000.0, 'incometaxexpense': 244230000.0, 'netincomecontinuing': 629551000.0, 'netincome': 629551000.0, 'netincometocommon': 629551000.0, 'weightedavebasicsharesos': 498764000.0, 'basiceps': 1.26, 'weightedavedilutedsharesos': 507164000.0, 'dilutedeps': 1.24, 'weightedavebasicdilutedsharesos': 499600000.0, 'basicdilutedeps': 1.26}, {'ticker': 'AMZN', 'FY': 2017, 'statement': 'income_statement', 'operatingrevenue': 177866000000.0, 'totalrevenue': 177866000000.0, 'operatingcostofrevenue': 137183000000.0, 'totalcostofrevenue': 137183000000.0, 'totalgrossprofit': 40683000000.0, 'sgaexpense': 3888000000.0, 'marketingexpense': 10069000000.0, 'rdexpense': 22620000000.0, 'totaloperatingexpenses': 36577000000.0, 'totaloperatingincome': 4106000000.0, 'totalinterestexpense': 848000000.0, 'totalinterestincome': 202000000.0, 'otherincome': 346000000.0, 'totalotherincome': -300000000.0, 'totalpretaxincome': 3806000000.0, 'incometaxexpense': 769000000.0, 'othergains': -4000000.0, 'netincomecontinuing': 3033000000.0, 'netincome': 3033000000.0, 'netincometocommon': 3033000000.0, 'weightedavebasicsharesos': 480000000.0, 'basiceps': 6.32, 'weightedavedilutedsharesos': 493000000.0, 'dilutedeps': 6.15, 'weightedavebasicdilutedsharesos': 479900000.0, 'basicdilutedeps': 6.32}, {'ticker': 'AMZN', 'FY': 2016, 'statement': 'income_statement', 'operatingrevenue': 135987000000.0, 'totalrevenue': 135987000000.0, 'operatingcostofrevenue': 105884000000.0, 'totalcostofrevenue': 105884000000.0, 'totalgrossprofit': 30103000000.0, 
'sgaexpense': 2599000000.0, 'marketingexpense': 7233000000.0, 'rdexpense': 16085000000.0, 'totaloperatingexpenses': 25917000000.0, 'totaloperatingincome': 4186000000.0, 'totalinterestexpense': 484000000.0, 'totalinterestincome': 100000000.0, 'otherincome': 90000000.0, 'totalotherincome': -294000000.0, 'totalpretaxincome': 3892000000.0, 'incometaxexpense': 1425000000.0, 'othergains': -96000000.0, 'netincomecontinuing': 2371000000.0, 'netincome': 2371000000.0, 'netincometocommon': 2371000000.0, 'weightedavebasicsharesos': 474000000.0, 'basiceps': 5.01, 'weightedavedilutedsharesos': 484000000.0, 'dilutedeps': 4.9, 'weightedavebasicdilutedsharesos': 473300000.0, 'basicdilutedeps': 5.01}, {'ticker': 'AMZN', 'FY': 2015, 'statement': 'income_statement', 'operatingrevenue': 107006000000.0, 'totalrevenue': 107006000000.0, 'operatingcostofrevenue': 85061000000.0, 'totalcostofrevenue': 85061000000.0, 'totalgrossprofit': 21945000000.0, 'sgaexpense': 1918000000.0, 'marketingexpense': 5254000000.0, 'rdexpense': 12540000000.0, 'totaloperatingexpenses': 19712000000.0, 'totaloperatingincome': 2233000000.0, 'totalinterestexpense': 459000000.0, 'totalinterestincome': 50000000.0, 'otherincome': -256000000.0, 'totalotherincome': -665000000.0, 'totalpretaxincome': 1568000000.0, 'incometaxexpense': 950000000.0, 'othergains': -22000000.0, 'netincomecontinuing': 596000000.0, 'netincome': 596000000.0, 'netincometocommon': 596000000.0, 'weightedavebasicsharesos': 467000000.0, 'basiceps': 1.28, 'weightedavedilutedsharesos': 477000000.0, 'dilutedeps': 1.25, 'weightedavebasicdilutedsharesos': 465600000.0, 'basicdilutedeps': 1.28}, {'ticker': 'BA', 'FY': 2017, 'statement': 'income_statement', 'operatingrevenue': 93392000000.0, 'totalrevenue': 93392000000.0, 'operatingcostofrevenue': 76066000000.0, 'totalcostofrevenue': 76066000000.0, 'totalgrossprofit': 17326000000.0, 'sgaexpense': 4094000000.0, 'rdexpense': 3179000000.0, 'otherspecialcharges': -21000000.0, 'totaloperatingexpenses': 
7252000000.0, 'totaloperatingincome': 10074000000.0, 'totalinterestexpense': 360000000.0, 'totalinterestincome': 204000000.0, 'otherincome': 129000000.0, 'totalotherincome': -27000000.0, 'totalpretaxincome': 10047000000.0, 'incometaxexpense': 1850000000.0, 'netincomecontinuing': 8197000000.0, 'netincome': 8197000000.0, 'netincometocommon': 8197000000.0, 'weightedavebasicsharesos': 602500000.0, 'basiceps': 13.6, 'weightedavedilutedsharesos': 602500000.0, 'dilutedeps': 13.43, 'weightedavebasicdilutedsharesos': 602500000.0, 'basicdilutedeps': 13.6, 'cashdividendspershare': 5.97}, {'ticker': 'BA', 'FY': 2016, 'statement': 'income_statement', 'operatingrevenue': 94571000000.0, 'totalrevenue': 94571000000.0, 'operatingcostofrevenue': 80790000000.0, 'totalcostofrevenue': 80790000000.0, 'totalgrossprofit': 13781000000.0, 'sgaexpense': 3616000000.0, 'rdexpense': 4627000000.0, 'otherspecialcharges': 7000000.0, 'totaloperatingexpenses': 8250000000.0, 'totaloperatingincome': 5531000000.0, 'totalinterestexpense': 306000000.0, 'totalinterestincome': 303000000.0, 'otherincome': 40000000.0, 'totalotherincome': 37000000.0, 'totalpretaxincome': 5568000000.0, 'incometaxexpense': 673000000.0, 'netincomecontinuing': 4895000000.0, 'netincome': 4895000000.0, 'netincometocommon': 4895000000.0, 'weightedavebasicsharesos': 635500000.0, 'basiceps': 7.7, 'weightedavedilutedsharesos': 635500000.0, 'dilutedeps': 7.61, 'weightedavebasicdilutedsharesos': 635500000.0, 'basicdilutedeps': 7.7, 'cashdividendspershare': 4.69}, {'ticker': 'BA', 'FY': 2015, 'statement': 'income_statement', 'operatingrevenue': 96114000000.0, 'totalrevenue': 96114000000.0, 'operatingcostofrevenue': 82088000000.0, 'totalcostofrevenue': 82088000000.0, 'totalgrossprofit': 14026000000.0, 'sgaexpense': 3525000000.0, 'rdexpense': 3331000000.0, 'otherspecialcharges': 1000000.0, 'totaloperatingexpenses': 6857000000.0, 'totaloperatingincome': 7169000000.0, 'totalinterestexpense': 275000000.0, 'totalinterestincome': 274000000.0, 
'otherincome': -13000000.0, 'totalotherincome': -14000000.0, 'totalpretaxincome': 7155000000.0, 'incometaxexpense': 1979000000.0, 'netincomecontinuing': 5176000000.0, 'netincome': 5176000000.0, 'netincometocommon': 5176000000.0, 'weightedavebasicsharesos': 686900000.0, 'basiceps': 7.52, 'weightedavedilutedsharesos': 686900000.0, 'dilutedeps': 7.44, 'weightedavebasicdilutedsharesos': 686900000.0, 'basicdilutedeps': 7.52, 'cashdividendspershare': 3.82}]

Here is the code I use to convert it to a dataframe:

import pandas as pd
df = pd.DataFrame(data_income_stmt)

The result appears to be missing two columns: ticker and statement.

Here is the output of print(df.columns.values.tolist()):

['FY', 'amortizationexpense', 'basicdilutedeps', 'basiceps', 'cashdividendspershare', 'dilutedeps', 'incometaxexpense', 'marketingexpense', 'netincome', 'netincomecontinuing', 'netincometocommon', 'operatingcostofrevenue', 'operatingrevenue', 'othergains', 'otherincome', 'otherspecialcharges', 'rdexpense', 'sgaexpense', 'statement', 'ticker', 'totalcostofrevenue', 'totalgrossprofit', 'totalinterestexpense', 'totalinterestincome', 'totaloperatingexpenses', 'totaloperatingincome', 'totalotherincome', 'totalpretaxincome', 'totalrevenue', 'weightedavebasicdilutedsharesos', 'weightedavebasicsharesos', 'weightedavedilutedsharesos']

I'm not sure why the columns are being removed/dropped.

1 Answer

0 votes
/ September 22, 2018

No columns were lost.

Your dataframe contains 32 columns:

len(df.columns.values.tolist())

If you go through the list dict by dict, collect all the keys, and compare them with the dataframe's columns, they are the same.

keys = []
for e, k in enumerate(data_income_stmt):
    # Accumulate the keys of every row dict seen so far
    keys.extend(k.keys())
    print('row', e, ' keys so far', len(set(keys)),
          'statement found in keys', 'statement' in k.keys(),
          'ticker found in keys', 'ticker' in k.keys())

# The dataframe's columns should be exactly the set of all keys
print('compare columns to keys', set(df.columns.values.tolist()) == set(keys))

print('ticker found in keys', 'ticker' in keys)
print('ticker found in df', 'ticker' in df.columns)
print('statement found in keys', 'statement' in keys)
print('statement found in df', 'statement' in df.columns)

This prints:

row 0  keys so far 29 statement found in keys True ticker found in keys True
row 1  keys so far 29 statement found in keys True ticker found in keys True
row 2  keys so far 29 statement found in keys True ticker found in keys True
row 3  keys so far 30 statement found in keys True ticker found in keys True
row 4  keys so far 30 statement found in keys True ticker found in keys True
row 5  keys so far 30 statement found in keys True ticker found in keys True
row 6  keys so far 32 statement found in keys True ticker found in keys True
row 7  keys so far 32 statement found in keys True ticker found in keys True
row 8  keys so far 32 statement found in keys True ticker found in keys True
compare columns to keys True
ticker found in keys True
ticker found in df True
statement found in keys True
statement found in df True
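
The same check can be done more compactly with plain Python sets. This is only a sketch: `rows` is a made-up three-row stand-in for `data_income_stmt`, and the names `all_keys` and `missing_per_row` are chosen here for illustration.

```python
# Hypothetical miniature of data_income_stmt: every row has 'ticker' and
# 'statement', but the remaining keys differ from row to row.
rows = [
    {'ticker': 'ADBE', 'statement': 'income_statement', 'amortizationexpense': 76562000.0},
    {'ticker': 'AMZN', 'statement': 'income_statement', 'othergains': -4000000.0},
    {'ticker': 'BA', 'statement': 'income_statement', 'cashdividendspershare': 5.97},
]

# Union of the keys over all rows.
all_keys = set().union(*(row.keys() for row in rows))

# For each row, the keys that exist elsewhere in the list but not in this row.
missing_per_row = [sorted(all_keys - set(row)) for row in rows]

print('distinct keys:', len(all_keys))
for row, missing in zip(rows, missing_per_row):
    print(row['ticker'], 'has', len(row), 'keys, missing:', missing)
```

Run against the real list, a loop like this would show which keys each row lacks, and that `ticker` and `statement` are never among them.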

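You may be confused by the fact that each dict has 29 keys, not 32. But statement and ticker are there. Note also that the printed column list is in alphabetical order, so those two columns appear in the middle of the list rather than at the front.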
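
To see this in isolation, here is a minimal sketch with a made-up two-row list (`rows` and its values are illustrative, not the full data from the question). `pd.DataFrame` builds one column per distinct key across all dicts and fills the gaps with NaN. Moving the identifying columns to the front at the end is an optional convenience, not something the question requires.

```python
import pandas as pd

# Hypothetical rows: each dict has 4 keys, but together they have 5 distinct keys.
rows = [
    {'ticker': 'ADBE', 'FY': 2017, 'statement': 'income_statement',
     'amortizationexpense': 76562000.0},
    {'ticker': 'BA', 'FY': 2017, 'statement': 'income_statement',
     'cashdividendspershare': 5.97},
]

df = pd.DataFrame(rows)

# One column per distinct key across all rows.
print(df.shape)

# Keys absent from a row show up as NaN in that row.
print(df.isna().sum())

# Optional: put the identifying columns first, keep the rest in their current order.
front = ['ticker', 'FY', 'statement']
df = df[front + [c for c in df.columns if c not in front]]
print(df.columns.tolist())
```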
