С кодом, предоставленным OP
, мы получаем следующее DataFrame
-
df.show(truncate=False)
+----+---+----+------------------------------+-------------+
|Name|yr |cash|cash_list |discount_rate|
+----+---+----+------------------------------+-------------+
|b |5 |43 |[43] |0.3 |
|b |4 |55 |[43, 55] |0.3 |
|b |3 |34 |[43, 55, 34] |0.3 |
|b |2 |32 |[43, 55, 34, 32] |0.3 |
|b |1 |23 |[43, 55, 34, 32, 23] |0.3 |
|a |6 |600 |[600] |0.3 |
|a |5 |500 |[600, 500] |0.3 |
|a |4 |400 |[600, 500, 400] |0.3 |
|a |3 |300 |[600, 500, 400, 300] |0.3 |
|a |2 |200 |[600, 500, 400, 300, 200] |0.3 |
|a |1 |100 |[600, 500, 400, 300, 200, 100]|0.3 |
+----+---+----+------------------------------+-------------+
ОП хочет вычислить Net Present Value (NPV)
и для этого он хочет использовать UDF
. Для Name=a yr=1
NPV выглядит следующим образом -
600 / (1,3) ^ 5 + 500 / (1,3) ^ 4 + 400 / (1,3) ^ 3 + 300 / (1,3) ^ 2 + 200 / (1,3) ^ 1 + 100 / (1,3)
# Creating a function and it's corresponding UDF
from pyspark.sql.functions import udf
def calculate_npv(cash_list,rate):
# Reverse the List
cash_list = cash_list[::-1]
return float(np.npv(rate,cash_list))
calculate_npv = udf(calculate_npv,FloatType())
# Applying the UDF to the DataFrame below
df = df.withColumn('NPV',calculate_npv('cash_list','discount_rate'))
df.show(truncate=False)
+----+---+----+------------------------------+-------------+----------+
|Name|yr |cash|cash_list |discount_rate|NPV |
+----+---+----+------------------------------+-------------+----------+
|b |5 |43 |[43] |0.3 |43.0 |
|b |4 |55 |[43, 55] |0.3 |88.07692 |
|b |3 |34 |[43, 55, 34] |0.3 |101.75148 |
|b |2 |32 |[43, 55, 34, 32] |0.3 |110.27037 |
|b |1 |23 |[43, 55, 34, 32, 23] |0.3 |107.823364|
|a |6 |600 |[600] |0.3 |600.0 |
|a |5 |500 |[600, 500] |0.3 |961.53845 |
|a |4 |400 |[600, 500, 400] |0.3 |1139.645 |
|a |3 |300 |[600, 500, 400, 300] |0.3 |1176.65 |
|a |2 |200 |[600, 500, 400, 300, 200] |0.3 |1105.1154 |
|a |1 |100 |[600, 500, 400, 300, 200, 100]|0.3 |950.08875 |
+----+---+----+------------------------------+-------------+----------+