I think you need DataFrame.between_time, which works on a DatetimeIndex, to select the rows between two times of day, and then aggregate with mean:
import pandas as pd

# sample data changed so that some rows fall inside the time window
print (df)
          Id                timestamp  data        Date
27585  27826  2020-01-02 11:55:46.297  19.0  2020-01-02
27586  27827  2020-01-02 12:55:46.397  25.0  2020-02-02
27587  27828  2020-01-02 13:55:47.283  20.0  2020-02-02
27588  27829  2020-01-02 14:55:47.383  21.5  2020-03-02
27589  27830  2020-01-02 08:55:48.287  21.5  2020-04-02
# between_time needs a DatetimeIndex, so parse the column and use it as the index
df['timestamp'] = pd.to_datetime(df['timestamp'])
print (df.set_index('timestamp')
         .between_time('12:00:00','16:00:00'))
                            Id  data        Date
timestamp
2020-01-02 12:55:46.397  27827  25.0  2020-02-02
2020-01-02 13:55:47.283  27828  20.0  2020-02-02
2020-01-02 14:55:47.383  27829  21.5  2020-03-02
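To illustrate why set_index('timestamp') comes first, here is a minimal sketch on a hypothetical two-row frame (demo is not the data from the question): between_time is expected to raise a TypeError on the default integer index and to work once the timestamps are the index.

```python
import pandas as pd

# Hypothetical two-row frame, used only for illustration.
demo = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2020-01-02 11:55:46", "2020-01-02 12:55:46"]
        ),
        "data": [19.0, 25.0],
    }
)

try:
    # Default RangeIndex: there are no times of day to compare against.
    demo.between_time("12:00:00", "16:00:00")
    raised = False
except TypeError:
    raised = True

# With the timestamps as the index, the selection works.
selected = demo.set_index("timestamp").between_time("12:00:00", "16:00:00")
print(raised, len(selected))
```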
df1 = (df.set_index('timestamp')
         .between_time('12:00:00','16:00:00')
         .groupby('Date')['data']
         .mean())
print (df1)
Date
2020-02-02 22.5
2020-03-02 21.5
Name: data, dtype: float64
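For anyone who wants to run the steps above end to end, here is a self-contained sketch. The frame is rebuilt inline from the printed sample, which is an assumption made only so the snippet runs on its own; the names afternoon and daily_mean are illustrative.

```python
import pandas as pd

# Rows copied from the printed sample above so the snippet is self-contained.
df = pd.DataFrame(
    {
        "Id": [27826, 27827, 27828, 27829, 27830],
        "timestamp": [
            "2020-01-02 11:55:46.297",
            "2020-01-02 12:55:46.397",
            "2020-01-02 13:55:47.283",
            "2020-01-02 14:55:47.383",
            "2020-01-02 08:55:48.287",
        ],
        "data": [19.0, 25.0, 20.0, 21.5, 21.5],
        "Date": [
            "2020-01-02",
            "2020-02-02",
            "2020-02-02",
            "2020-03-02",
            "2020-04-02",
        ],
    },
    index=[27585, 27586, 27587, 27588, 27589],
)
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Keep only rows whose time of day is between 12:00 and 16:00.
afternoon = df.set_index("timestamp").between_time("12:00:00", "16:00:00")

# Average the remaining values per Date.
daily_mean = afternoon.groupby("Date")["data"].mean()
print(daily_mean)
```

The rows at 11:55 and 08:55 fall outside the window, so their Date values do not appear in the result at all.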
If you need resample together with groupby on the timestamps:
# '1s' replaces the older '1S' alias, which is deprecated in recent pandas
df1 = (df.set_index('timestamp')
         .between_time('12:00:00','16:00:00')
         .groupby('Date')['data']
         .resample('1s')
         .ffill())
print (df1)
Date        timestamp
2020-02-02  2020-01-02 12:55:46     NaN
            2020-01-02 12:55:47    25.0
            2020-01-02 12:55:48    25.0
            2020-01-02 12:55:49    25.0
            2020-01-02 12:55:50    25.0
                                   ...
            2020-01-02 13:55:44    25.0
            2020-01-02 13:55:45    25.0
            2020-01-02 13:55:46    25.0
            2020-01-02 13:55:47    25.0
2020-03-02  2020-01-02 14:55:47     NaN
Name: data, Length: 3603, dtype: float64
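The NaN at the start of each group in the output above can look surprising. A small sketch on a hypothetical two-point Series (s and filled are illustrative names) suggests the reason: the resampled labels are whole seconds, and ffill only carries forward a value observed at or before a label, so the first label, which precedes the first observation, has nothing to fill from.

```python
import pandas as pd

# Hypothetical series with sub-second timestamps, as in the sample data.
s = pd.Series(
    [25.0, 20.0],
    index=pd.to_datetime(
        ["2020-01-02 12:55:46.397", "2020-01-02 12:55:49.283"]
    ),
    name="data",
)

# Upsample to whole seconds and forward-fill from earlier observations.
filled = s.resample("1s").ffill()
print(filled)

# The label 12:55:46 comes before the first observation (12:55:46.397),
# so it stays NaN; the last label 12:55:49 comes before 12:55:49.283,
# so the value 20.0 never appears in the result.
```

This is also why a group with a single observation, such as 2020-03-02 above, ends up as a lone NaN.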
After that, the mean can be computed over the first index level, Date:
# mean(level=0) was removed in pandas 2.0; groupby(level=0).mean() is the equivalent
df1 = (df.set_index('timestamp')
         .between_time('12:00:00','16:00:00')
         .groupby('Date')['data']
         .resample('1s')
         .ffill()
         .groupby(level=0)
         .mean()
         .reset_index())
print (df1)
         Date  data
0  2020-02-02  25.0
1  2020-03-02   NaN