I imported the data from your post and wrote the SQL below to help answer your question (based on my understanding of your question ...). I hope it helps.
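# spark_sql is assumed to be an existing SparkSession (or SQLContext).
# The chained calls are wrapped in parentheses so they parse as one Python expression.
df = (
    spark_sql.read
    .format("csv")
    .option("header", "true")
    .option("delimiter", ",")
    # Infer column types so temperatures and timestamps sort as numbers, not strings.
    .option("inferSchema", "true")
    .load("/path/to/minimum-difference-in-timestamps.txt")
)
# registerTempTable is deprecated since Spark 2.0; createOrReplaceTempView replaces it.
df.createOrReplaceTempView("TABLE1")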
# View1: the single row with the highest temperature (the end of the rise).
spark_sql.sql("""
    CACHE TABLE View1 AS
    SELECT * FROM TABLE1 ORDER BY TemperatureF DESC LIMIT 1
""")
# View2: the most recent coldest row before the peak, joined to the peak row.
spark_sql.sql("""
    CACHE TABLE View2 AS
    SELECT A.Date AS StartTime,
           B.Date AS EndTime,
           CAST(B.TemperatureF - A.TemperatureF AS FLOAT) AS difference,
           A.TemperatureF,
           A.MinTemp,
           A.MaxTemp,
           B.timestamp AS END_TIME_TS
    FROM (
        SELECT * FROM TABLE1
        WHERE timestamp < (SELECT timestamp FROM View1)
        ORDER BY TemperatureF ASC, timestamp DESC
        LIMIT 1
    ) AS A
    JOIN View1 AS B ON A.MaxTemp = B.TemperatureF
""")
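# Start time, end time and size of the largest temperature rise.
spark_sql.sql("SELECT StartTime, EndTime, difference FROM View2").show()

# Every row that shares the start row's MinTemp/MaxTemp and precedes the peak.
# View2.TemperatureF is the start row's temperature, so Temperature_END shows that value.
spark_sql.sql("""
    SELECT A.Date AS StartTime,
           A.TemperatureF AS Temperature,
           B.EndTime,
           B.TemperatureF AS Temperature_END,
           B.MinTemp,
           B.MaxTemp
    FROM TABLE1 AS A
    JOIN View2 AS B
      ON A.MaxTemp = B.MaxTemp
     AND A.MinTemp = B.MinTemp
     AND A.timestamp < B.END_TIME_TS
""").show()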
+-----------------+-----------------+----------+
|        StartTime|          EndTime|difference|
+-----------------+-----------------+----------+
|01/01/2000 8:53AM|01/02/2000 4:39AM|       9.4|
+-----------------+-----------------+----------+

+-----------------+-----------+-----------------+---------------+-------+-------+
|        StartTime|Temperature|          EndTime|Temperature_END|MinTemp|MaxTemp|
+-----------------+-----------+-----------------+---------------+-------+-------+
|01/01/2000 6:53AM|       28.0|01/02/2000 4:39AM|           28.0|   28.0|   37.4|
|01/01/2000 7:53AM|       28.0|01/02/2000 4:39AM|           28.0|   28.0|   37.4|
|01/01/2000 8:53AM|       28.0|01/02/2000 4:39AM|           28.0|   28.0|   37.4|
+-----------------+-----------+-----------------+---------------+-------+-------+
minimum-difference-in-timestamps.txt:
TemperatureF,Date,timestamp,MinTemp,MaxTemp
28.0,01/01/2000 6:53AM,946709580,28.0,37.4
28.0,01/01/2000 7:53AM,946713180,28.0,37.4
28.0,01/01/2000 8:53AM,946716780,28.0,37.4
30.2,01/01/2000 10:24PM,946765440,30.2,37.4
30.9,01/01/2000 10:53PM,946767180,30.9,37.4
37.4,01/02/2000 4:39AM,946787940,28.0,37.4
36.0,01/02/2000 4:53AM,946788780,28.0,36.0
36.0,01/02/2000 5:53AM,946792380,28.0,36.0
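
If it helps to sanity-check the logic without a Spark session, here is a minimal plain-Python sketch of what the View1 and View2 queries are meant to compute, run on the sample rows above. The names `SAMPLE`, `load_rows` and `largest_rise` are hypothetical helpers I made up for illustration. The sketch also skips the `A.MaxTemp = B.TemperatureF` join condition (it holds for this sample) and assumes at least one row precedes the peak.

```python
import csv
import io

# Sample rows copied from minimum-difference-in-timestamps.txt above.
SAMPLE = """\
TemperatureF,Date,timestamp,MinTemp,MaxTemp
28.0,01/01/2000 6:53AM,946709580,28.0,37.4
28.0,01/01/2000 7:53AM,946713180,28.0,37.4
28.0,01/01/2000 8:53AM,946716780,28.0,37.4
30.2,01/01/2000 10:24PM,946765440,30.2,37.4
30.9,01/01/2000 10:53PM,946767180,30.9,37.4
37.4,01/02/2000 4:39AM,946787940,28.0,37.4
36.0,01/02/2000 4:53AM,946788780,28.0,36.0
36.0,01/02/2000 5:53AM,946792380,28.0,36.0
"""


def load_rows(text):
    """Parse the CSV text and convert the numeric columns from strings."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "Date": rec["Date"],
            "timestamp": int(rec["timestamp"]),
            "TemperatureF": float(rec["TemperatureF"]),
            "MinTemp": float(rec["MinTemp"]),
            "MaxTemp": float(rec["MaxTemp"]),
        })
    return rows


def largest_rise(rows):
    """Return (start_row, end_row, difference) for the rise into the peak."""
    # View1: the row with the highest temperature.
    end = max(rows, key=lambda r: r["TemperatureF"])
    # Subquery A: rows strictly before the peak ...
    earlier = [r for r in rows if r["timestamp"] < end["timestamp"]]
    # ... ORDER BY TemperatureF ASC, timestamp DESC LIMIT 1.
    start = min(earlier, key=lambda r: (r["TemperatureF"], -r["timestamp"]))
    # Rounded to one decimal to match the precision of the input data.
    return start, end, round(end["TemperatureF"] - start["TemperatureF"], 1)


if __name__ == "__main__":
    start, end, diff = largest_rise(load_rows(SAMPLE))
    print(start["Date"], end["Date"], diff)
    # prints: 01/01/2000 8:53AM 01/02/2000 4:39AM 9.4
```

This agrees with the first result table from the Spark queries: the rise starts at the last 28.0 reading before the peak and ends at the 37.4 reading.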