You'll need str.split, followed by stack:
r = df.set_index('item').feature.str.split('|', expand=True).stack()
r.index = r.index.get_level_values(0)
r.reset_index(name='feature')
    item    feature
0      1  Adventure
1      1  Animation
2      1   Children
3      1     Comedy
4      1    Fantasy
5      2  Adventure
6      2   Children
7      2    Fantasy
8      3     Comedy
9      3    Romance
10     4     Comedy
11     4      Drama
12     4    Romance
13     5     Comedy
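To try the split-then-stack approach end to end, here is a minimal sketch. The input frame is not shown in this answer, so the `df` below is an assumption reconstructed from the printed output; the trailing `dropna()` is also an addition, included only as a precaution for pandas versions whose `stack` may keep the padding values.

```python
import pandas as pd

# Assumed input, reconstructed from the printed output (not given in the answer)
df = pd.DataFrame({
    'item': [1, 2, 3, 4, 5],
    'feature': [
        'Adventure|Animation|Children|Comedy|Fantasy',
        'Adventure|Children|Fantasy',
        'Comedy|Romance',
        'Comedy|Drama|Romance',
        'Comedy',
    ],
})

# One column per genre; shorter rows are padded with missing values
wide = df.set_index('item').feature.str.split('|', expand=True)

# Stack the columns into rows. dropna() is a defensive addition
# (not part of the original snippet) that removes any padding left over.
r = wide.stack().dropna()

# Keep only the 'item' level of the (item, position) MultiIndex
r.index = r.index.get_level_values(0)

result = r.reset_index(name='feature')
print(result)
```

Each item ends up with one row per genre, so the result has as many rows as there are genres in total across all items.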
Another option uses np.repeat:
u = df.set_index('item').feature.str.split('|')
pd.DataFrame({
    'item': np.repeat(u.index, u.str.len()),
    'feature': [y for x in u for y in x]
})
    item    feature
0      1  Adventure
1      1  Animation
2      1   Children
3      1     Comedy
4      1    Fantasy
5      2  Adventure
6      2   Children
7      2    Fantasy
8      3     Comedy
9      3    Romance
10     4     Comedy
11     4      Drama
12     4    Romance
13     5     Comedy
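The np.repeat variant can be run on its own in the same way. This is a sketch under the same assumption: the `df` below is reconstructed from the printed output, not taken from the answer. The key point is that each item label is repeated once per genre, so the repeated index and the flattened genre list always have the same length.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the printed output (not given in the answer)
df = pd.DataFrame({
    'item': [1, 2, 3, 4, 5],
    'feature': [
        'Adventure|Animation|Children|Comedy|Fantasy',
        'Adventure|Children|Fantasy',
        'Comedy|Romance',
        'Comedy|Drama|Romance',
        'Comedy',
    ],
})

# Series of lists, indexed by item
u = df.set_index('item').feature.str.split('|')

# Number of genres per item, used as the repeat counts
counts = u.str.len()

out = pd.DataFrame({
    # Repeat each item label once per genre in its list
    'item': np.repeat(u.index, counts),
    # Flatten the lists of genres in the same order
    'feature': [y for x in u for y in x],
})
print(out)
```

This avoids building the padded wide frame that `expand=True` creates, which may matter when a few rows have many more genres than the rest.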