How do I do matrix multiplication in PySpark?
August 1, 2020

I have a "taste" vector for each user:

+------+--------------------+
|userId|      scaledFeatures|
+------+--------------------+
|    18|[0.0,0.0,0.0,0.0,...|
|    65|[0.0,0.0023910733...|
|    96|[0.0,0.0,0.005268...|
|   121|[0.0,0.0021253985...|
|   129|[0.0,0.0029224229...|
+------+--------------------+

And a content vector for each movie:

+-------+--------------------+
|movieId|      scaledFeatures|
+-------+--------------------+
|      1|[1.0,0.0,0.0,0.0,...|
|      2|[1.0,0.0,0.0,0.0,...|
|      3|[0.0,0.0,0.0,0.0,...|
|      4|[0.0,0.0,0.0,0.0,...|
|      5|[0.0,0.0,0.0,0.0,...|
+-------+--------------------+

How can I take a user's taste vector by userId and multiply it by the movie content table in PySpark, so that I can get the most similar movies by entering a movieId?
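One possible way to approach this is sketched below. It is not a definitive solution. It assumes that `scaledFeatures` in both tables is a `pyspark.ml.linalg` vector column of the same length (the name suggests scaler output, but the post does not confirm it) and that a plain dot product is an acceptable similarity score. The names `dot` and `top_movies_for_user` are hypothetical helpers introduced only for illustration.

```python
def dot(u, v):
    """Plain-Python dot product of two equal-length numeric sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return float(sum(a * b for a, b in zip(u, v)))


def top_movies_for_user(users_df, movies_df, user_id, n=5):
    """Score every movie against one user's taste vector, return the top n.

    Assumption: both DataFrames have a `scaledFeatures` column holding
    pyspark.ml.linalg vectors (dense or sparse) of identical size.
    """
    # PySpark is imported here so that `dot` stays usable without Spark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    # Pull the single user vector to the driver; it is small.
    row = (
        users_df.filter(F.col("userId") == user_id)
        .select("scaledFeatures")
        .first()
    )
    if row is None:
        raise ValueError(f"userId {user_id} not found")
    user_vec = row["scaledFeatures"].toArray().tolist()

    # The UDF closes over user_vec, so Spark ships it to the executors.
    score = F.udf(lambda v: dot(user_vec, v.toArray().tolist()), DoubleType())

    return (
        movies_df.withColumn("score", score(F.col("scaledFeatures")))
        .orderBy(F.desc("score"))
        .limit(n)
    )
```

With the two tables above loaded as `users_df` and `movies_df`, a call such as `top_movies_for_user(users_df, movies_df, 18).show()` would list the highest-scoring movies for user 18. If cosine similarity is wanted instead of a raw dot product, the vectors would need to be normalized first.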

...