"explode" followed by "pivot" may help here; please check the "Result" section in the output:
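// Assumes a SparkSession named `spark` is already in scope, as in spark-shell.
import org.apache.spark.sql.functions.{collect_list, explode, max, struct}
import spark.implicits._
val data = List(
  (1, "A", 160, "Chess"), (1, "B", 100, "Hockey"), (1, "C", 1200, "Football"), (1, "D", 900, "Cricket"),
  (2, "E", 700, "Cricket"), (2, "F", 1000, "Chess"),
  (3, "G", 1900, "Basketball"), (3, "I", 1000, "Cricket"), (3, "H", 9000, "Football")
)
// Flat input: one row per (id, Team, Amount, Game).
val unstructured = data.toDF("id", "Team", "Amount", "Game")
unstructured.show(false)
// Nested form: one row per id, holding an array of (Team, Amount, Game) structs.
val original = unstructured.groupBy("id").agg(collect_list(struct($"Team", $"Amount", $"Game")).alias("Games"))
println("--- Original ----")
original.printSchema()
original.show(false)
// Explode the array into one row per element, then flatten the struct fields.
val exploded = original.withColumn("Games", explode($"Games")).select("id", "Games.*")
println("--- Exploded ----")
exploded.show(false)
println("--- Result ----")
// Pivot on Game: each game gets its own <Game>_Amount and <Game>_Team column.
exploded.groupBy("id").pivot("Game").agg(max($"Amount").alias("Amount"), max($"Team").alias("Team")).orderBy("id").show(false)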
Output:
+---+----+------+----------+
|id |Team|Amount|Game      |
+---+----+------+----------+
|1  |A   |160   |Chess     |
|1  |B   |100   |Hockey    |
|1  |C   |1200  |Football  |
|1  |D   |900   |Cricket   |
|2  |E   |700   |Cricket   |
|2  |F   |1000  |Chess     |
|3  |G   |1900  |Basketball|
|3  |I   |1000  |Cricket   |
|3  |H   |9000  |Football  |
+---+----+------+----------+
--- Original ----
root
 |-- id: integer (nullable = false)
 |-- Games: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- Team: string (nullable = true)
 |    |    |-- Amount: integer (nullable = false)
 |    |    |-- Game: string (nullable = true)
+---+-------------------------------------------------------------------+
|id |Games                                                              |
+---+-------------------------------------------------------------------+
|3  |[[G,1900,Basketball], [I,1000,Cricket], [H,9000,Football]]         |
|1  |[[A,160,Chess], [B,100,Hockey], [C,1200,Football], [D,900,Cricket]]|
|2  |[[E,700,Cricket], [F,1000,Chess]]                                  |
+---+-------------------------------------------------------------------+
--- Exploded ----
+---+----+------+----------+
|id |Team|Amount|Game      |
+---+----+------+----------+
|3  |G   |1900  |Basketball|
|3  |I   |1000  |Cricket   |
|3  |H   |9000  |Football  |
|1  |A   |160   |Chess     |
|1  |B   |100   |Hockey    |
|1  |C   |1200  |Football  |
|1  |D   |900   |Cricket   |
|2  |E   |700   |Cricket   |
|2  |F   |1000  |Chess     |
+---+----+------+----------+
--- Result ----
+---+-----------------+---------------+------------+----------+--------------+------------+---------------+-------------+-------------+-----------+
|id |Basketball_Amount|Basketball_Team|Chess_Amount|Chess_Team|Cricket_Amount|Cricket_Team|Football_Amount|Football_Team|Hockey_Amount|Hockey_Team|
+---+-----------------+---------------+------------+----------+--------------+------------+---------------+-------------+-------------+-----------+
|1  |null             |null           |160         |A         |900           |D           |1200           |C            |100          |B          |
|2  |null             |null           |1000        |F         |700           |E           |null           |null         |null         |null       |
|3  |1900             |G              |null        |null      |1000          |I           |9000           |H            |null         |null       |
+---+-----------------+---------------+------------+----------+--------------+------------+---------------+-------------+-------------+-----------+
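One caveat about the pivot step: max on Amount and max on Team are computed independently. That is fine for the sample data, where each id has at most one row per game. If an id had several rows for the same game, the two values could come from different rows. Below is a minimal sketch of one possible way around that, not part of the original answer: aggregate a single struct so both fields come from the same row, and pass the pivot values explicitly. The local SparkSession, the extra row (1, "Z", 50, "Chess"), and the names games, best and flat are assumptions made for illustration only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, max, struct}

// Local session for illustration only; master and app name are assumptions.
val spark = SparkSession.builder().master("local[*]").appName("pivot-sketch").getOrCreate()
import spark.implicits._

// Same shape as the exploded DataFrame, plus one hypothetical extra row
// (1, "Z", 50, "Chess") so that id 1 has two Chess entries.
val exploded = Seq(
  (1, "A", 160, "Chess"), (1, "Z", 50, "Chess"), (1, "D", 900, "Cricket"),
  (2, "E", 700, "Cricket"), (2, "F", 1000, "Chess")
).toDF("id", "Team", "Amount", "Game")

// Listing the pivot values up front avoids the extra pass that Spark
// otherwise needs to discover the distinct games.
val games = Seq("Chess", "Cricket")

// max over a struct compares fields left to right, so Amount decides the
// winner and Team is taken from that same row.
val best = exploded
  .groupBy("id")
  .pivot("Game", games)
  .agg(max(struct($"Amount", $"Team")))

// With a single aggregate the pivoted columns are named after the game,
// so flatten each struct back into <Game>_Amount and <Game>_Team.
val flat = best.select(
  col("id") +: games.flatMap(g => Seq(
    col(s"$g.Amount").alias(s"${g}_Amount"),
    col(s"$g.Team").alias(s"${g}_Team")
  )): _*
).orderBy("id")

flat.show(false)
```

With the extra row, two independent max calls would give Chess_Amount = 160 and Chess_Team = Z for id 1, which pairs values from different rows. The struct version keeps 160 together with A. Game names that contain dots or spaces would need backtick quoting in the col(...) paths.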