I have data that looks like this:
                                       Location+Type       tract state_abbr  year  tract_state_year     County_name     hpi
0  Census Tract 201, Autauga County, Alabama: Sum...  1001020100         AL  2012  1001020100AL2012  Autauga County  134.41
1  Census Tract 201, Autauga County, Alabama: Sum...  1001020100         AL  2013  1001020100AL2013  Autauga County  129.82
2  Census Tract 201, Autauga County, Alabama: Sum...  1001020100         AL  2014  1001020100AL2014  Autauga County  135.34
3  Census Tract 201, Autauga County, Alabama: Sum...  1001020100         AL  2015  1001020100AL2015  Autauga County  134.66
4  Census Tract 201, Autauga County, Alabama: Sum...  1001020100         AL  2016  1001020100AL2016  Autauga County  140.84
I want to fill the missing hpi values with the median for each year, state, and county, using this code:
medians = (df.groupby(['year', 'state_abbr', 'County_name'])['hpi']
             .transform(lambda x: x.median() if x.notnull().any() else np.nan)
          )
df['hpi'] = df['hpi'].fillna(medians)
But I get this error:
ValueError: Length mismatch: Expected axis has 151291 elements, new values have 152159 elements
How can I fix this?
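
One thing that may be worth checking, offered as an assumption and not a confirmed diagnosis: the two lengths in the error differ by 868 rows, and by default groupby leaves out rows whose grouping keys are missing. If some rows have NaN in year, state_abbr, or County_name, the transform result could come back shorter than the frame. The sketch below uses a small made-up frame (the values and the missing County_name are hypothetical) to count such rows and to fill hpi only where all keys are present.

```python
import numpy as np
import pandas as pd

# Toy data, NOT the real dataset: row 2 has a missing grouping key.
df = pd.DataFrame({
    "year": [2012, 2012, 2012, 2013, 2013],
    "state_abbr": ["AL", "AL", "AL", "AL", "AL"],
    "County_name": ["Autauga County", "Autauga County", None,
                    "Autauga County", "Autauga County"],
    "hpi": [134.41, np.nan, 120.0, 129.82, np.nan],
})

keys = ["year", "state_abbr", "County_name"]

# Rows where at least one grouping key is missing; groupby skips these by default.
complete = df[keys].notna().all(axis=1)
rows_with_missing_key = int((~complete).sum())
print("rows with a missing key:", rows_with_missing_key)

# Compute group medians only on rows with complete keys. The result keeps
# the original index labels of those rows, so fillna can align on the index.
medians = df.loc[complete].groupby(keys)["hpi"].transform("median")

# Rows with a missing key have no entry in `medians` and are left unchanged.
df["hpi"] = df["hpi"].fillna(medians)
print(df)
```

If the count printed for the real data matches the 868-row gap, missing keys are the likely cause. Rows with a missing key keep whatever hpi they had, so they would need a separate decision (for example, filling the key first or using a coarser grouping).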