Data:
import pandas as pd

applications = pd.DataFrame({'application_id': [1, 2, 3, 4, 5],
    'date': ['2015-01-05', '2015-01-06', '2015-01-07', '2015-01-08', '2015-01-09'],
    'client_employer': ['company A', 'company B', 'company C', 'company A', 'company B'],
    'client_name': ['Bill', 'John', 'Steve', 'Bill', 'Alex']})
Table:
         date  client_employer  client_name
0  2015-01-05        company A         John
1  2015-01-06        company B         Bill
2  2015-01-07        company B         Bill
3  2015-01-08        company A        Sarah
4  2015-01-09        company B         Alex
5  2015-01-10        company B        Brian
For each application, how many different people with the same employer have we seen in the past? The solution must not use loops.
Desired result:
         date  client_employer  client_name  employers_count
0  2015-01-05        company A         John                0
1  2015-01-06        company B         Bill                0
2  2015-01-07        company B         Bill                0
3  2015-01-08        company A        Sarah                1
4  2015-01-09        company B         Alex                1
5  2015-01-10        company B        Brian                2
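Reading the desired result above, the count appears to be the running number of distinct people per employer, not counting the current person. A minimal sketch of that reading, assuming the rows are already sorted by date (the helper name `is_new_person` is illustrative, not from the original post):

```python
import pandas as pd

# Data copied from the table above.
applications = pd.DataFrame({
    'date': ['2015-01-05', '2015-01-06', '2015-01-07',
             '2015-01-08', '2015-01-09', '2015-01-10'],
    'client_employer': ['company A', 'company B', 'company B',
                        'company A', 'company B', 'company B'],
    'client_name': ['John', 'Bill', 'Bill', 'Sarah', 'Alex', 'Brian'],
})

# True on the first row where a given (employer, name) pair appears.
# Assumption: rows are in chronological order.
is_new_person = ~applications.duplicated(['client_employer', 'client_name'])

# Running number of distinct people per employer, minus the current person.
applications['employers_count'] = (
    is_new_person.astype(int)
    .groupby(applications['client_employer'])
    .cumsum()
    - 1
)

print(applications)
```

Everything here is vectorized: `duplicated` marks repeat appearances, and the grouped `cumsum` counts first appearances per employer without an explicit loop.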
The suggested solution does not work correctly:
import numpy as np
import pandas as pd

applications = pd.DataFrame({'application_id': [1, 2, 3, 4, 5, 6],
    'date': ['2015-01-05', '2015-01-06', '2015-01-07', '2015-01-08', '2015-01-09', '2015-01-10'],
    'client_employer': ['company B', 'company B', 'company B', 'company B', 'company B', 'company B'],
    'client_name': ['Bill', 'John', 'Steve', 'Bill', 'Alex', 'Bill'],
    'cnt_desired': [0, 1, 2, 2, 3, 3]})
# Map each name to the rank of its first appearance within its employer group.
emp_count = applications.groupby(['client_employer'])['client_name'].transform(
    lambda x: x.map(dict(zip(x.unique(), np.arange(len(x.unique()))))))
applications['cnt'] = emp_count
   application_id        date  client_employer  client_name  cnt_desired  cnt
0               1  2015-01-05        company B         Bill            0    0
1               2  2015-01-06        company B         John            1    1
2               3  2015-01-07        company B        Steve            2    2
3               4  2015-01-08        company B         Bill            2    0
4               5  2015-01-09        company B         Alex            3    3
5               6  2015-01-10        company B         Bill            3    0
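The mismatch in rows 3 and 5 comes from the mapping itself: it gives every name the rank of its first appearance, so a returning client (Bill) falls back to 0. One possible repair, sketched here under the assumption that rows are sorted by date, is to take the running maximum of that rank within each employer. The name `first_seen_rank` is illustrative, and `pd.factorize` stands in for the `zip`/`unique` mapping used above:

```python
import pandas as pd

applications = pd.DataFrame({
    'application_id': [1, 2, 3, 4, 5, 6],
    'date': ['2015-01-05', '2015-01-06', '2015-01-07',
             '2015-01-08', '2015-01-09', '2015-01-10'],
    'client_employer': ['company B'] * 6,
    'client_name': ['Bill', 'John', 'Steve', 'Bill', 'Alex', 'Bill'],
    'cnt_desired': [0, 1, 2, 2, 3, 3],
})

# 0-based rank of each name by first appearance within its employer.
first_seen_rank = (
    applications.groupby('client_employer')['client_name']
    .transform(lambda names: pd.factorize(names)[0])
    .astype(int)
)

# The running maximum of that rank equals the number of distinct people
# seen so far for the employer, minus one, so repeat clients no longer reset.
applications['cnt'] = (
    first_seen_rank.groupby(applications['client_employer']).cummax()
)

print(applications[['client_name', 'cnt_desired', 'cnt']])
```

On this sample the `cnt` column matches `cnt_desired` row for row. The lambda passed to `transform` runs once per employer group, not once per row.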