Sample data:
import pandas as pd

Combined_Relevant = pd.DataFrame({
    'Date': ['2019-01-01'] * 6,
    'Countries': list('aaabbb'),
    'A': [1, 5, 4, 2, 5, 8],
    'B': [7, 8, 9, 4, 2, 3],
})
Use GroupBy.transform with sum to get Series of the same size as the original DataFrame, so the two group sums can be divided row by row:
g = Combined_Relevant.groupby(['Date', 'Countries'])
# each transform returns the group sum repeated for every row of that group
Combined_Relevant['Ratio'] = g['A'].transform('sum') / g['B'].transform('sum')
print(Combined_Relevant)
         Date Countries  A  B     Ratio
0  2019-01-01         a  1  7  0.416667
1  2019-01-01         a  5  8  0.416667
2  2019-01-01         a  4  9  0.416667
3  2019-01-01         b  2  4  1.666667
4  2019-01-01         b  5  2  1.666667
5  2019-01-01         b  8  3  1.666667
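To see why transform is used here rather than a plain aggregation, the following minimal sketch compares the two on the same sample. The names df, per_group and per_row are only illustrative and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2019-01-01'] * 6,
    'Countries': list('aaabbb'),
    'A': [1, 5, 4, 2, 5, 8],
    'B': [7, 8, 9, 4, 2, 3],
})

g = df.groupby(['Date', 'Countries'])

# A plain aggregation collapses each group to one row (2 groups here)
per_group = g['A'].sum()

# transform broadcasts the group sum back onto every original row (6 rows),
# keeping the original index, so it can be assigned as a new column directly
per_row = g['A'].transform('sum')

print(len(per_group), len(per_row))
print(per_row.index.equals(df.index))
```

Because the transformed Series shares the index of the original DataFrame, no join or merge is needed before the division.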
Your solution also works if you name the resulting Series with rename and attach it back to the original rows with DataFrame.join:
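def divide_two_cols(df_sub):
    # ratio of the group totals, one scalar per group
    return df_sub['A'].sum() / float(df_sub['B'].sum())

# apply returns a Series indexed by (Date, Countries); join needs it to have a name
s = Combined_Relevant.groupby(['Date', 'Countries']).apply(divide_two_cols).rename('Ratio')
Combined_Relevant1 = Combined_Relevant.join(s, on=['Date', 'Countries'])
print(Combined_Relevant1)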
         Date Countries  A  B     Ratio
0  2019-01-01         a  1  7  0.416667
1  2019-01-01         a  5  8  0.416667
2  2019-01-01         a  4  9  0.416667
3  2019-01-01         b  2  4  1.666667
4  2019-01-01         b  5  2  1.666667
5  2019-01-01         b  8  3  1.666667
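As a quick sanity check, the sketch below confirms that the transform route and a join route give the same ratios. Note that the join variant here swaps the custom function passed to apply for a plain per-group sum, which is a substitution for illustration and not the code from the question; the names df, keys, sums, via_transform and via_join are likewise hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': ['2019-01-01'] * 6,
    'Countries': list('aaabbb'),
    'A': [1, 5, 4, 2, 5, 8],
    'B': [7, 8, 9, 4, 2, 3],
})
keys = ['Date', 'Countries']

# Route 1: broadcast both group sums to every row, then divide
g = df.groupby(keys)
via_transform = g['A'].transform('sum') / g['B'].transform('sum')

# Route 2: sum once per group, divide, name the Series, join back on the keys
sums = df.groupby(keys)[['A', 'B']].sum()
ratio = (sums['A'] / sums['B']).rename('Ratio')
via_join = df.join(ratio, on=keys)['Ratio']

# Compare with a tolerance, since both are floating-point results
same = bool(np.allclose(via_transform, via_join))
print(same)
```

The transform route is usually the shorter one to write, while the join route is handy when the per-group table is also needed on its own.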