Use the .years attribute of relativedelta with apply and axis=1 to process the DataFrame row by row:
import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({'start': ['2015-10-02', '2014-11-05'],
                   'end': ['2018-01-02', '2018-10-05']})
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])

# relativedelta(end, start).years is the number of complete years between the dates
df['y'] = df.apply(lambda x: relativedelta(x['end'], x['start']).years, axis=1)
Or use a list comprehension:
df['y'] = [relativedelta(i, j).years for i, j in zip(df['end'], df['start'])]
print(df)
       start        end  y
0 2015-10-02 2018-01-02  2
1 2014-11-05 2018-10-05  3
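To see what .years actually measures, here is a minimal sketch using the first row's dates. It assumes plain datetime.date inputs instead of pandas timestamps. The point is that .years is only the whole-years component of the difference, with the remainder left in .months and .days.

```python
from datetime import date
from dateutil.relativedelta import relativedelta

# Difference between the first row's end and start dates
delta = relativedelta(date(2018, 1, 2), date(2015, 10, 2))

# The difference is split into components; .years holds only complete years
print(delta.years, delta.months, delta.days)
```

So a gap of two years and three months gives 2, not a rounded or fractional value.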
EDIT: If some dates are missing (NaT), compute the difference only for the rows where both columns are non-null:
import numpy as np
import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({'start': ['2015-10-02', '2014-11-05'],
                   'end': ['2018-01-02', np.nan]})
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])

# Boolean mask: True for rows where both dates are present
m = df[['start', 'end']].notnull().all(axis=1)
df.loc[m, 'y'] = df[m].apply(lambda x: relativedelta(x['end'], x['start']).years, axis=1)
print(df)
       start        end    y
0 2015-10-02 2018-01-02  2.0
1 2014-11-05        NaT  NaN
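The y column ends up as float (2.0 and NaN) because NaN cannot be stored in a plain integer column. If integer years are preferred, one possible variant is sketched below. It is an assumption on my part rather than part of the approach above: full_years is a hypothetical helper, and the result is stored in pandas' nullable Int64 dtype.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

def full_years(start, end):
    """Hypothetical helper: complete years from start to end, or pd.NA if either is missing."""
    if pd.isna(start) or pd.isna(end):
        return pd.NA
    return relativedelta(end, start).years

df = pd.DataFrame({'start': ['2015-10-02', '2014-11-05'],
                   'end': ['2018-01-02', None]})
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])

# Nullable Int64 keeps whole numbers and marks missing results as <NA>
df['y'] = pd.array(
    [full_years(s, e) for s, e in zip(df['start'], df['end'])],
    dtype='Int64',
)
print(df)
```

This handles the missing check inside the helper, so no separate boolean mask is needed.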