Use numpy.where:
df['real_lastName'] = np.where(df['LastName'].isnull(), df['Middle'], df['LastName'])
print(df)
  FirstName Middle LastName real_lastName
0       Tom     Ju      NaN            Ju
1      Kity    NaN      Rob           Rob
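For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample DataFrame is an assumption, rebuilt from the printed output above; the original data may differ.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output (an assumption).
df = pd.DataFrame({
    "FirstName": ["Tom", "Kity"],
    "Middle": ["Ju", np.nan],
    "LastName": [np.nan, "Rob"],
})

# Where LastName is missing, take Middle; otherwise keep LastName.
df["real_lastName"] = np.where(
    df["LastName"].isnull(), df["Middle"], df["LastName"]
)

result = df["real_lastName"].tolist()
print(df)
```

np.where evaluates the condition element-wise, so each row picks its value from one of the two columns independently.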
Another possible solution is to use fillna or combine_first:
df['real_lastName'] = df['LastName'].fillna(df['Middle'])
df['real_lastName'] = df['LastName'].combine_first(df['Middle'])
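A short sketch, using the same assumed sample data, suggesting that both methods give the same result here. It also shows that a row stays NaN when both columns are missing (the both_missing series is hypothetical and exists only for illustration).

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's output (an assumption).
df = pd.DataFrame({
    "FirstName": ["Tom", "Kity"],
    "Middle": ["Ju", np.nan],
    "LastName": [np.nan, "Rob"],
})

# Both calls fill missing LastName values from Middle, aligned on the index.
via_fillna = df["LastName"].fillna(df["Middle"])
via_combine = df["LastName"].combine_first(df["Middle"])

# Hypothetical edge case: if both values are missing, the result stays NaN.
both_missing = pd.Series([np.nan, "Rob"]).fillna(pd.Series([np.nan, "X"]))

print(via_fillna.tolist())
print(via_combine.tolist())
```

Both methods align on the index rather than on position, so they are safest when the two columns come from the same DataFrame, as they do here.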
Performance of all three approaches is similar:
#[200000 rows x 4 columns]
df = pd.concat([df] * 100000, ignore_index=True)
In [41]: %timeit df['real_lastName'] = np.where(df['LastName'].isnull(), df['Middle'], df['LastName'] )
13.3 ms ± 51.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [42]: %timeit df['real_lastName'] = df['LastName'].fillna(df['Middle'])
16.2 ms ± 58.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [43]: %timeit df['real_lastName'] = df['LastName'].combine_first(df['Middle'])
13 ms ± 100 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)