My approach:
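    # Split each timestamp into a calendar day and an hour of day
    df = df.assign(day=df['ds time'].dt.normalize(),
                   hour=df['ds time'].dt.hour)

    # 7-day rolling mean per (id, hour), computed over the daily index
    rolled = (df.drop('ds time', axis=1)
                .set_index('day')
                .groupby(['id', 'hour']).rolling('7D').mean()
                # older pandas keeps the grouping columns here, newer drops them
                .drop(['hour', 'id'], axis=1, errors='ignore'))

    # Attach the rolled values back to the original rows
    ret_df = (df.merge(rolled,
                       on=['id', 'hour', 'day'],
                       how='left',
                       suffixes=('', '_roll'))
                .drop(['day', 'hour'], axis=1))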
Sample data:
    import numpy as np
    import pandas as pd

    # hourly timestamps ('h' replaces the 'H' alias deprecated in pandas 2.2)
    dates = pd.date_range('2020-02-21', '2020-02-25', freq='h')
    np.random.seed(1)
    df = pd.DataFrame({
        'id': np.repeat([6, 7], len(dates)),
        'ds time': np.tile(dates, 2),
        'X': np.arange(len(dates) * 2),
        'Y': np.random.randint(0, 10, len(dates) * 2)
    })
    df.head()
Output of ret_df.head():
       id             ds time  X  Y  X_roll  Y_roll
    0   6 2020-02-21 00:00:00  0  5     0.0     5.0
    1   6 2020-02-21 01:00:00  1  8     1.0     8.0
    2   6 2020-02-21 02:00:00  2  9     2.0     9.0
    3   6 2020-02-21 03:00:00  3  5     3.0     5.0
    4   6 2020-02-21 04:00:00  4  0     4.0     0.0
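
The first day is not very telling, because each window holds a single value and the rolled columns just repeat X and Y. As a rough sanity check of what the rolled column should contain on later days, here is a brute-force sketch. It assumes the default window of `rolling('7D')`, which is open on the left and closed on the right. The helper `manual_roll` and the column `X_manual` are made up for this illustration and are not part of the approach above.

```python
import numpy as np
import pandas as pd

# Same layout as the sample data: two ids, hourly timestamps over four days.
dates = pd.date_range('2020-02-21', '2020-02-25', freq=pd.Timedelta(hours=1))
df = pd.DataFrame({
    'id': np.repeat([6, 7], len(dates)),
    'ds time': np.tile(dates, 2),
    'X': np.arange(len(dates) * 2),
})

def manual_roll(row, frame):
    """Mean of X over rows with the same id and hour of day whose
    timestamp lies in the window (t - 7 days, t] ending at this row."""
    same_slot = ((frame['id'] == row['id'])
                 & (frame['ds time'].dt.hour == row['ds time'].hour))
    in_window = ((frame['ds time'] > row['ds time'] - pd.Timedelta(days=7))
                 & (frame['ds time'] <= row['ds time']))
    return frame.loc[same_slot & in_window, 'X'].mean()

# Slow (one scan per row), but easy to reason about on a small sample.
df['X_manual'] = [manual_roll(row, df) for _, row in df.iterrows()]

# Midnight rows of id 6: one row per calendar day.
print(df.loc[[0, 24, 48, 72, 96], ['ds time', 'X', 'X_manual']])
```

For id 6 at midnight, X takes the values 0, 24, 48, 72, 96 on consecutive days, so the running means are 0, 12, 24, 36, 48. The `X_roll` column from the merge should show the same numbers on those rows.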