Data:
Profit Amount Rate Account Status Yr
0.3065 56999 1 Acc3 S1 1
0.3956 57000 1 Acc3 S1 1
0.3065 57001 1 Acc3 S1 1
0.3956 57002 1 Acc3 S1 1
0.3065 57003 1 Acc3 S1 2
0.3065 57004 0.89655 Acc3 S1 3
0.3956 57005 0.89655 Acc3 S1 3
0.2984 57006 0.89655 Acc3 S1 3
0.3956 57007 1 Acc3 S2 2
0.3956 57008 1 Acc3 S2 2
0.2984 57009 1 Acc3 S2 2
0.2984 57010 1 Acc1 S1 1
0.3956 57011 1 Acc1 S1 1
0.3065 57012 1 Acc1 S1 1
0.3065 57013 1 Acc1 S1 1
0.3065 57013 1 Acc1 S1 1
Code:
from pyspark.sql import functions as F

df = df1\
    .join(df2, df1.code == df2.code, how='left').drop(df2.code)\
    .filter(F.col('Date') == '20Jan2019')\
    .join(df3, df1.id == df3.id, how='left').drop(df3.id)\
    .join(df4, df1.id == df4.id, how='left').drop(df4.id)\
    .join(df5, df1.id2 == df5.id2, how='left').drop(df5.id2)\
    .withColumn('Account', F.concat(F.trim(df3.name1), F.trim(df4.name1)))\
    .withColumn('Status', F.when(df1.FB_Ind == 1, 'S1').otherwise('S2'))\
    .withColumn('Year', df1['date'].substr(6, 4).cast('int') + df1['Year'])
df6 = df.distinct()
df7 = df6.groupBy('Yr', 'Status', 'Account')\
    .agg(F.sum(F.col('Profit') * F.col('Amount') / F.col('Rate')).alias('output'))
The output I get contains small fractions such as 0.234, not values in the thousands such as 23344.2. How do I express Sum((Profit * Amount) / Rate) correctly in PySpark so the aggregation produces the expected output?
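As a sanity check on the expected magnitude, the same aggregation can be sketched in plain Python over the first four sample rows above (the `(Yr, Status, Account) = (1, 'S1', 'Acc3')` group). This is only a sketch of the arithmetic that `F.sum(col('Profit') * col('Amount') / col('Rate'))` should perform per group; the row values are taken from the sample data, and it shows the result is indeed thousands-scale, so a small fraction in the real output suggests the wrong columns are being multiplied.

```python
from collections import defaultdict

# Sample rows from the data above: (Profit, Amount, Rate, Account, Status, Yr)
rows = [
    (0.3065, 56999, 1.0, 'Acc3', 'S1', 1),
    (0.3956, 57000, 1.0, 'Acc3', 'S1', 1),
    (0.3065, 57001, 1.0, 'Acc3', 'S1', 1),
    (0.3956, 57002, 1.0, 'Acc3', 'S1', 1),
]

totals = defaultdict(float)
for profit, amount, rate, account, status, yr in rows:
    # sum(Profit * Amount / Rate) per (Yr, Status, Account) group,
    # mirroring the intended groupBy().agg() in the PySpark code
    totals[(yr, status, account)] += profit * amount / rate

print(totals[(1, 'S1', 'Acc3')])  # thousands-scale, not a small fraction
```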