Let's do a self-join on Project, then use pd.crosstab with reindex:
#do a self-join and eliminate same row matches
dfm = df.merge(df, on='Project').query('University_x != University_y')
#get unique universities
lu = df['University'].unique()
#create a crosstab report and reindex to fill zeroes
pd.crosstab(dfm['University_x'], dfm['University_y'])\
.reindex(index=lu, columns=lu, fill_value=0)
Output:
University_y  UniA  UniB  UniC  UniD  UniE
University_x
UniA             0     2     2     2     0
UniB             2     0     2     1     0
UniC             2     2     0     1     0
UniD             2     1     1     0     0
UniE             0     0     0     0     0
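If you don't want to see the all-zero rows and columns, drop the reindex. Universities that share no project with anyone else (UniE here) then disappear from the table: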
pd.crosstab(dfm['University_x'], dfm['University_y'])
Output:
University_y  UniA  UniB  UniC  UniD
University_x
UniA             0     2     2     2
UniB             2     0     2     1
UniC             2     2     0     1
UniD             2     1     1     0
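For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample data is hypothetical (the original df is not shown here, so the counts will differ from the tables above), and the helper name collaboration_matrix and its keep_isolated flag are made up for illustration. It only assumes a frame with Project and University columns.

```python
import pandas as pd

# Hypothetical sample data, NOT the df from the question.
df = pd.DataFrame({
    "Project": ["P1", "P1", "P2", "P2", "P2", "P3"],
    "University": ["UniA", "UniB", "UniA", "UniB", "UniC", "UniE"],
})

def collaboration_matrix(df, keep_isolated=True):
    """Count shared projects per pair of universities (illustrative helper)."""
    # Self-join on Project, then drop rows where a university is paired with itself.
    dfm = df.merge(df, on="Project").query("University_x != University_y")
    ct = pd.crosstab(dfm["University_x"], dfm["University_y"])
    if keep_isolated:
        # Reindex on all universities so those without partners show up as zeros.
        lu = df["University"].unique()
        ct = ct.reindex(index=lu, columns=lu, fill_value=0)
    return ct

full = collaboration_matrix(df)                          # includes UniE as all zeros
trimmed = collaboration_matrix(df, keep_isolated=False)  # UniE is dropped

print(full)
print(trimmed)
```

In this sample UniE works alone on P3, so it survives only in the reindexed version. The self-join produces both (A, B) and (B, A) for every shared project, which is why the matrix is symmetric.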