Yes, this is possible with DataFrame.set_index, but then the names of the remaining columns are duplicated:
print (dlung)
   Year  Cancer    Country  Gender    ASR    SE
0  1950    Lung  Australia    Male  13.89  0.56
1  1951    Lung  Australia    Male  14.84  0.57
2  1952    Lung  Australia    Male  17.19  0.61
3  1953    Lung  Australia    Male  18.21  0.62
4  1954    Lung  Australia    Male  19.05  0.63
print (dcolorectal)
    Year      Cancer    Country  Gender    ASR    SE
6   1950  colorectal  Australia    Male  22.05  0.67
7   1951  colorectal  Australia    Male  23.93  0.69
8   1952  colorectal  Australia    Male  23.77  0.68
9   1953  colorectal  Australia    Male  26.12  0.71
10  1954  colorectal  Australia    Male  27.08  0.72
df_lung_colorectal = pd.concat([dlung.set_index(['Year','Country','Gender']),
                                dcolorectal.set_index(['Year','Country','Gender'])], axis=1)
print (df_lung_colorectal)
                       Cancer    ASR    SE      Cancer    ASR    SE
Year Country   Gender
1950 Australia Male      Lung  13.89  0.56  colorectal  22.05  0.67
1951 Australia Male      Lung  14.84  0.57  colorectal  23.93  0.69
1952 Australia Male      Lung  17.19  0.61  colorectal  23.77  0.68
1953 Australia Male      Lung  18.21  0.62  colorectal  26.12  0.71
1954 Australia Male      Lung  19.05  0.63  colorectal  27.08  0.72
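To see why the duplicated names are a nuisance, here is a small sketch. It rebuilds two-row versions of the frames from the printed values above (the shortened data and the names `keys`, `wide`, `has_duplicates` and `asr` are only for illustration) and shows that selecting a single label no longer gives a single column:

```python
import pandas as pd

# Two-row stand-ins for the frames printed above (values copied from the output).
dlung = pd.DataFrame({
    'Year': [1950, 1951],
    'Cancer': ['Lung', 'Lung'],
    'Country': ['Australia', 'Australia'],
    'Gender': ['Male', 'Male'],
    'ASR': [13.89, 14.84],
    'SE': [0.56, 0.57],
})
dcolorectal = pd.DataFrame({
    'Year': [1950, 1951],
    'Cancer': ['colorectal', 'colorectal'],
    'Country': ['Australia', 'Australia'],
    'Gender': ['Male', 'Male'],
    'ASR': [22.05, 23.93],
    'SE': [0.67, 0.69],
})

keys = ['Year', 'Country', 'Gender']
wide = pd.concat([dlung.set_index(keys), dcolorectal.set_index(keys)], axis=1)

# Column labels repeat: Cancer, ASR, SE, Cancer, ASR, SE.
has_duplicates = wide.columns.duplicated().any()

# Selecting a repeated label returns a DataFrame holding both columns,
# so you cannot tell lung from colorectal by name alone.
asr = wide['ASR']
print(has_duplicates)
print(asr.shape)
```

With repeated labels, later steps that expect `wide['ASR']` to be a Series will likely misbehave, which is the main reason to prefer distinct column names.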
But I think it is better to first concatenate the DataFrames with axis=0 (the default, so the argument can be omitted) and then reshape with DataFrame.set_index and DataFrame.unstack:
df = pd.concat([dlung, dcolorectal]).set_index(['Year','Country','Gender','Cancer']).unstack()
df.columns = df.columns.map('_'.join)
df = df.reset_index()
print (df)
   Year    Country  Gender  ASR_Lung  ASR_colorectal  SE_Lung  SE_colorectal
0  1950  Australia    Male     13.89           22.05     0.56           0.67
1  1951  Australia    Male     14.84           23.93     0.57           0.69
2  1952  Australia    Male     17.19           23.77     0.61           0.68
3  1953  Australia    Male     18.21           26.12     0.62           0.71
4  1954  Australia    Male     19.05           27.08     0.63           0.72
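If you have more than two cancer types, the same steps can be wrapped in a small helper. This is only a sketch: the function name `widen_by_cancer` and its parameters are made up for illustration, the sample frames are shortened copies of the data above, and it assumes every combination of Year, Country, Gender and Cancer occurs at most once (otherwise `unstack` raises an error):

```python
import pandas as pd

def widen_by_cancer(frames, keys=('Year', 'Country', 'Gender'), pivot_col='Cancer'):
    """Stack the frames row-wise, then spread pivot_col into columns (hypothetical helper)."""
    stacked = pd.concat(frames)  # axis=0 is the default
    wide = stacked.set_index([*keys, pivot_col]).unstack()
    # Flatten the (value, cancer) MultiIndex into names like 'ASR_Lung'.
    wide.columns = wide.columns.map('_'.join)
    return wide.reset_index()

# Shortened sample data, values copied from the printed frames above.
dlung = pd.DataFrame({
    'Year': [1950, 1951],
    'Cancer': ['Lung', 'Lung'],
    'Country': ['Australia', 'Australia'],
    'Gender': ['Male', 'Male'],
    'ASR': [13.89, 14.84],
    'SE': [0.56, 0.57],
})
dcolorectal = pd.DataFrame({
    'Year': [1950, 1951],
    'Cancer': ['colorectal', 'colorectal'],
    'Country': ['Australia', 'Australia'],
    'Gender': ['Male', 'Male'],
    'ASR': [22.05, 23.93],
    'SE': [0.67, 0.69],
})

result = widen_by_cancer([dlung, dcolorectal])
print(result)
```

Because the cancer name ends up in each column label, adding a third frame to the list just produces more uniquely named columns instead of more duplicates.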