Given the input DataFrame, df, built as:
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randint(20,500,(2,144)),
columns = pd.MultiIndex.from_product([['Measure1','Measure2'], [f'Month{i}' for i in range(1,73)]]),
index=[1,2]).rename_axis('Cust_no').reset_index()
df.columns = df.columns.map('_'.join).str.strip('_')
df
Output:
Cust_no Measure1_Month1 Measure1_Month2 ... Measure2_Month70 Measure2_Month71 Measure2_Month72
0 1 385 402 ... 153 380 129
1 2 106 66 ... 363 361 173
[2 rows x 145 columns]
Format 1:
df = df.set_index('Cust_no')
df.columns = pd.MultiIndex.from_arrays(zip(*df.columns.str.split('_')), names=['Measure', 'Month'])
df_format1 = df.stack([0,1]).rename('Value').reset_index()
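# Raw string avoids the invalid-escape warning for '\d'; expand=False returns a Series
df_format1['Month'] = df_format1['Month'].str.extract(r'(\d+)', expand=False)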
df_format1
Output:
Cust_no Measure Month Value
0 1 Measure1 1 385
1 1 Measure1 10 143
2 1 Measure1 11 77
3 1 Measure1 12 234
4 1 Measure1 13 245
.. ... ... ... ...
283 2 Measure2 70 363
284 2 Measure2 71 361
285 2 Measure2 72 173
286 2 Measure2 8 65
287 2 Measure2 9 461
[288 rows x 4 columns]
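Note that Month in the output above is sorted as text (1, 10, 11, ...) because the extracted digits are still strings. If numeric order is wanted, one option is to cast the month to an integer and sort. The sketch below is an alternative route, not the method used above: it rebuilds the same sample frame and uses melt instead of stack. The names wide, long, parts and df_format1_sorted are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame (same construction as in the answer).
np.random.seed(123)
wide = pd.DataFrame(
    np.random.randint(20, 500, (2, 144)),
    columns=pd.MultiIndex.from_product(
        [['Measure1', 'Measure2'], [f'Month{i}' for i in range(1, 73)]]
    ),
    index=[1, 2],
).rename_axis('Cust_no').reset_index()
wide.columns = wide.columns.map('_'.join).str.strip('_')

# melt yields one row per (customer, original column name).
long = wide.melt(id_vars='Cust_no', var_name='key', value_name='Value')

# Split e.g. 'Measure1_Month10' into 'Measure1' and 'Month10',
# then keep only the digits of the month and store them as integers.
parts = long['key'].str.split('_', expand=True)
long['Measure'] = parts[0]
long['Month'] = parts[1].str.extract(r'(\d+)', expand=False).astype(int)

# Integer months sort as 1, 2, 3, ... rather than 1, 10, 11, ...
df_format1_sorted = (
    long[['Cust_no', 'Measure', 'Month', 'Value']]
    .sort_values(['Cust_no', 'Measure', 'Month'])
    .reset_index(drop=True)
)
```

The same cast, .astype(int), could also be applied to df_format1['Month'] directly, followed by sort_values, if you prefer to keep the stack-based approach.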
Format 2:
df_format2 = (df_format1.set_index(['Cust_no','Month','Measure'])['Value']
.unstack().reset_index().rename_axis(None, axis=1))
df_format2
Output:
Cust_no Month Measure1 Measure2
0 1 1 385 90
1 1 10 143 379
2 1 11 77 479
3 1 12 234 458
4 1 13 245 475
.. ... ... ... ...
139 2 70 108 363
140 2 71 258 361
141 2 72 235 173
142 2 8 453 65
143 2 9 276 461
[144 rows x 4 columns]
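As a possible shortcut, Format 2 can also be reached straight from the flat wide frame with pd.wide_to_long, skipping Format 1. This is a hedged sketch rather than the approach shown above; it assumes every value column follows the Measure<k>_Month<n> naming pattern, and the names wide and df_format2_alt are invented for the example.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame (same construction as in the answer).
np.random.seed(123)
wide = pd.DataFrame(
    np.random.randint(20, 500, (2, 144)),
    columns=pd.MultiIndex.from_product(
        [['Measure1', 'Measure2'], [f'Month{i}' for i in range(1, 73)]]
    ),
    index=[1, 2],
).rename_axis('Cust_no').reset_index()
wide.columns = wide.columns.map('_'.join).str.strip('_')

# Each column is read as <stub><sep><suffix>: stub 'Measure1', sep '_Month', suffix '10'.
# The numeric suffix becomes the new 'Month' column.
df_format2_alt = (
    pd.wide_to_long(
        wide,
        stubnames=['Measure1', 'Measure2'],
        i='Cust_no',
        j='Month',
        sep='_Month',
    )
    .reset_index()
    [['Cust_no', 'Month', 'Measure1', 'Measure2']]
    .sort_values(['Cust_no', 'Month'])
    .reset_index(drop=True)
)
```

With the default numeric suffix pattern, wide_to_long stores Month as a number, so the rows come out in month order 1, 2, 3, ... rather than the text order seen in the output above.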