IIUC, we can use str.split and str.extract together with stack:
s = df['Fish Count'].str.split(',', expand=True).stack()
s.str.extract(r'(\d+)(\D+)')
       0                  1
0 0   38          Sand Bass
  1   16            Sculpin
  2   10         Blacksmith
1 0  138            Sculpin
  1   28          Sand Bass
2 0  150   Sculpin Released
  1  102            Sculpin
  2   40            Sanddab
3 0  156            Sculpin
  1   29          Sand Bass
  2    5      Black Croaker
  3    3                ...
4 0  161            Sculpin
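For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is hypothetical (modelled on the first rows of the output above), and the named groups `count` and `fish`, the whitespace stripping, and the explicit `dropna()` are my additions, not part of the original one-liner.

```python
import pandas as pd

# Hypothetical sample data modelled on the first rows of the output above
df = pd.DataFrame({
    'Fish Count': [
        '38 Sand Bass, 16 Sculpin, 10 Blacksmith',
        '138 Sculpin, 28 Sand Bass',
        '150 Sculpin Released, 102 Sculpin, 40 Sanddab',
    ]
})

# One column per comma-separated entry, then stack into a long Series.
# dropna() is a guard in case stack() keeps the missing entries of short rows.
s = df['Fish Count'].str.split(',', expand=True).stack().dropna()

# Named groups become column names; \s* skips the space left over by the split
parsed = s.str.extract(r'\s*(?P<count>\d+)\s*(?P<fish>\D+)')
parsed['count'] = parsed['count'].astype(int)
parsed['fish'] = parsed['fish'].str.strip()
print(parsed)
```

The result keeps the same two-level index (original row, position within the row), so the groupby and unstack calls work on it unchanged.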
From there, it is up to you which format you need. For example:
s.str.extract(r'(\d+)(\D+)').groupby(level=[1]).agg(list)
                          0                                               1
0  [38, 138, 150, 156, 161]  [ Sand Bass, Sculpin, Sculpin Released, Scu...
1         [16, 28, 102, 29]       [ Sculpin, Sand Bass, Sculpin, Sand Bass]
2               [10, 40, 5]           [ Blacksmith, Sanddab, Black Croaker]
3                       [3]                                          [ ...]
or
s.str.extract(r'(\d+)(\D+)').unstack(1)
     0                                1
     0    1    2    3                 0          1              2    3
0   38   16   10  NaN         Sand Bass    Sculpin     Blacksmith  NaN
1  138   28  NaN  NaN           Sculpin  Sand Bass            NaN  NaN
2  150  102   40  NaN  Sculpin Released    Sculpin        Sanddab  NaN
3  156   29    5    3           Sculpin  Sand Bass  Black Croaker  ...
4  161  NaN  NaN  NaN           Sculpin        NaN            NaN  NaN
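The unstacked frame has two-level columns (capture group, position within the row). If flat column names are easier to work with, something like the sketch below may help. The sample data and the `count_0` / `fish_0` naming scheme are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical two-row sample; the second row has fewer entries than the first
df = pd.DataFrame({
    'Fish Count': [
        '38 Sand Bass, 16 Sculpin',
        '138 Sculpin',
    ]
})

s = df['Fish Count'].str.split(',', expand=True).stack().dropna()
wide = s.str.extract(r'\s*(?P<count>\d+)\s*(?P<fish>\D+)').unstack(1)

# Columns are (group name, position) pairs; join each pair into one flat name
wide.columns = [f'{name}_{pos}' for name, pos in wide.columns]
print(wide.columns.tolist())
```

Rows with fewer entries get NaN in the trailing columns, as in the output above.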
or
s.str.extract(r'(\d+)(\D+)').values
array([['38', ' Sand Bass'],
       ['16', ' Sculpin'],
       ['10', ' Blacksmith'],
       ['138', ' Sculpin'],
       ['28', ' Sand Bass'],
       ['150', ' Sculpin Released'],
       ['102', ' Sculpin'],
       ['40', ' Sanddab'],
       ['156', ' Sculpin'],
       ['29', ' Sand Bass'],
       ['5', ' Black Croaker'],
       ['3', ' ...'],
       ['161', ' Sculpin']], dtype=object)
which you can turn into a dict:
# I'd actually prefer fish: num, but dict keys must be unique,
# so repeated fish names would overwrite one another.
{num: fish for num, fish in s.str.extract(r'(\d+)(\D+)').values}
{'38': ' Sand Bass',
'16': ' Sculpin',
'10': ' Blacksmith',
'138': ' Sculpin',
'28': ' Sand Bass',
'150': ' Sculpin Released',
'102': ' Sculpin',
'40': ' Sanddab',
'156': ' Sculpin',
'29': ' Sand Bass',
'5': ' Black Croaker',
'3': ' ...',
'161': ' Sculpin'}
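Since dict keys must be unique, a fish-to-count mapping only works if the repeats are dealt with first. Below is a rough sketch of two options: summing counts per species, or building one dict per original row. The sample data is hypothetical, and whether summing is what you want depends on your use case.

```python
import pandas as pd

# Hypothetical sample data; the column name follows the answer above
df = pd.DataFrame({
    'Fish Count': [
        '38 Sand Bass, 16 Sculpin, 10 Blacksmith',
        '138 Sculpin, 28 Sand Bass',
    ]
})

s = df['Fish Count'].str.split(',', expand=True).stack().dropna()
parsed = s.str.extract(r'\s*(?P<count>\d+)\s*(?P<fish>\D+)')
parsed['count'] = parsed['count'].astype(int)
parsed['fish'] = parsed['fish'].str.strip()

# Option 1: sum counts per species so that each key appears exactly once
totals = parsed.groupby('fish')['count'].sum().to_dict()

# Option 2: one dict per original row, keyed by fish name
per_row = [
    dict(zip(group['fish'], group['count']))
    for _, group in parsed.groupby(level=0)
]

print(totals)
print(per_row)
```

Option 2 still assumes that a fish name does not repeat within a single row.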