Use an alternative solution with set_index and unstack:
df = (df.set_index(['u_id','date','social_interaction_type_id'])['Total_Count']
        .unstack()
        .reset_index()
        .rename_axis(None, axis=1))
print(df)
   u_id        date    1    2    4
0     4  2018-08-19  NaN  NaN  5.0
1     4  2018-08-21  4.0  NaN  NaN
2     4  2018-08-24  NaN  3.0  NaN
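For reference, a self-contained sketch of the same chain. The input frame here is an assumption, reconstructed from the output shown above, and `wide` is just an illustrative name:

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed output above.
df = pd.DataFrame({
    'u_id': [4, 4, 4],
    'date': ['2018-08-19', '2018-08-21', '2018-08-24'],
    'social_interaction_type_id': [4, 1, 2],
    'Total_Count': [5, 4, 3],
})

wide = (df.set_index(['u_id', 'date', 'social_interaction_type_id'])['Total_Count']
          .unstack()                   # innermost index level becomes the columns
          .reset_index()               # u_id and date back to ordinary columns
          .rename_axis(None, axis=1))  # drop the leftover columns name

print(wide)
```

Combinations that do not occur in the input come out as NaN, which is why the counts are shown as floats.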
If the first 2 columns contain duplicates, aggregate with a function such as mean or sum:
print(df)
   u_id        date  social_interaction_type_id  Total_Count
0     4  2018-08-19                           4            5   <- 4 2018-08-19
1     4  2018-08-19                           6            4   <- 4 2018-08-19
2     4  2018-08-24                           2            3
3     4  2018-08-21                           1            4
df2 = (df.groupby(['u_id','date','social_interaction_type_id'])['Total_Count']
         .mean()
         .unstack()
         .reset_index()
         .rename_axis(None, axis=1))
Or:
df2 = (df.pivot_table(index=['u_id','date'],
                      columns='social_interaction_type_id',
                      values='Total_Count')
         .reset_index()
         .rename_axis(None, axis=1))
print(df2)
   u_id        date    1    2    4    6
0     4  2018-08-19  NaN  NaN  5.0  4.0
1     4  2018-08-21  4.0  NaN  NaN  NaN
2     4  2018-08-24  NaN  3.0  NaN  NaN
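Aggregation becomes strictly necessary when a whole (u_id, date, social_interaction_type_id) combination repeats. A minimal sketch of that case, using made-up data (the `dup` frame and the other names are illustrative, not from the question): plain set_index + unstack raises a ValueError, while pivot_table collapses the repeats. Its aggfunc defaults to 'mean'; 'sum' is passed here to total the counts instead.

```python
import pandas as pd

# Hypothetical data: the first two rows repeat all three key columns.
dup = pd.DataFrame({
    'u_id': [4, 4, 4],
    'date': ['2018-08-19', '2018-08-19', '2018-08-24'],
    'social_interaction_type_id': [4, 4, 2],
    'Total_Count': [5, 4, 3],
})

keys = ['u_id', 'date', 'social_interaction_type_id']

# unstack cannot reshape an index that contains duplicate entries.
try:
    dup.set_index(keys)['Total_Count'].unstack()
    unstack_failed = False
except ValueError:
    unstack_failed = True

# pivot_table aggregates the duplicates before reshaping.
summed = (dup.pivot_table(index=['u_id', 'date'],
                          columns='social_interaction_type_id',
                          values='Total_Count',
                          aggfunc='sum')
             .reset_index()
             .rename_axis(None, axis=1))

print(unstack_failed)
print(summed)
```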