You need to use a Window function:
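import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, lead}
import spark.implicits._ // needed for toDF and $"..."; already in scope in spark-shell

val df = Seq(
  (1, "12:00:00", "12:04:00"),
  (1, "12:05:00", "12:08:00"),
  (1, "12:20:00", "12:22:00"),
  (2, "13:00:00", "13:04:00"),
  (2, "13:05:00", "13:08:00"),
  (2, "13:20:00", "13:22:00")
).toDF("trackId", "start_time", "end_time")

// Order rows by start_time within each trackId
val window = Window.partitionBy("trackId").orderBy("start_time")

// lead(..., 1) takes start_time from the next row in the window; the last row of each partition gets null
df.withColumn("lead", lead(col("start_time"), 1).over(window)).show(false)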
If you don't want null values, you can also pass a default value as the third argument: lead($"start_time", 1, defaultValue)
Output of the code above (without a default value):
+-------+----------+--------+--------+
|trackId|start_time|end_time|lead    |
+-------+----------+--------+--------+
|1      |12:00:00  |12:04:00|12:05:00|
|1      |12:05:00  |12:08:00|12:20:00|
|1      |12:20:00  |12:22:00|null    |
|2      |13:00:00  |13:04:00|13:05:00|
|2      |13:05:00  |13:08:00|13:20:00|
|2      |13:20:00  |13:22:00|null    |
+-------+----------+--------+--------+
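
For the default-value variant mentioned above, here is a minimal sketch. It assumes a local SparkSession and uses "23:59:59" purely as a placeholder default; neither is from the original answer, so pick whatever default makes sense for your data. The name `withDefault` is also just illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, lead}

// Assumption: a local session for the sketch; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("lead-default-sketch")
  .getOrCreate()
import spark.implicits._

val df = Seq(
  (1, "12:00:00", "12:04:00"),
  (1, "12:05:00", "12:08:00"),
  (1, "12:20:00", "12:22:00")
).toDF("trackId", "start_time", "end_time")

val window = Window.partitionBy("trackId").orderBy("start_time")

// The third argument replaces the null that the last row of each partition would get.
// "23:59:59" is a placeholder chosen for illustration only.
val withDefault = df.withColumn(
  "lead",
  lead(col("start_time"), 1, "23:59:59").over(window)
)

withDefault.show(false)
```

With this, the last row of each trackId should carry the default instead of null, while all other rows still get the next row's start_time.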