С оконной функцией «провод»:
val df = List(
(1, "a", "1/1/2018"),
(2, "b", "1/1/2018"),
(3, "c", "1/1/2018"),
(4, "d", "1/1/2018"),
(1, "e", "10/1/2018")
).toDF("id", "Value", "start_date")
val idWindow = Window.partitionBy($"id")
.orderBy($"start_date")
val result = df.withColumn("end_date", lead($"start_date", 1).over(idWindow))
result.show(false)
Выход:
+---+-----+----------+---------+
|id |Value|start_date|end_date |
+---+-----+----------+---------+
|3 |c |1/1/2018 |null |
|4 |d |1/1/2018 |null |
|1 |a |1/1/2018 |10/1/2018|
|1 |e |10/1/2018 |null |
|2 |b |1/1/2018 |null |
+---+-----+----------+---------+