I misunderstood the goal. Here is a corrected answer. It is long, but the actual work is done in three lines of code.
First, create and combine the two data frames:
from io import StringIO
import pandas as pd
# create the 2 data frames
data = '''id Stu_Name Class Fees
1 Jack primary 2333
2 mack primary 2363
3 may primary 2833
3 Mark primary 1333
3 John primary 9333
4 Moon Secondary 6589
5 daisy Secondary 6565
6 shawn Secondary 6545
6 roy Secondary 6596
9 hary higher 8526
10 Joy higher 9654
10 nick higher 7845
10 julie higher 9633
'''
df1 = pd.read_csv(StringIO(data), sep=r'\s+', engine='python')
data = '''id Stu_Name Class Fees
11 eric primary 2333
21 fick primary 2363
42 Moon Secondary 6589
56 anki Secondary 6565
18 menk higher 7845
17 rock higher 9633
'''
df2 = pd.read_csv(StringIO(data), sep=r'\s+', engine='python')
# combine the 2 data frames
df = pd.concat([df1, df2], ignore_index=True)
Now create two helper columns and sort:
# create the 1st helper column (for sorting at end)
# this will group (and sort) primary, Secondary, higher
df['class_num'] = df['Class'].factorize()[0]
# create 2nd helper column (to identify repeated IDs)
df['id_count'] = df.groupby('id')['id'].transform('count')
# if the logic is correct, then drop 'class_num', 'id_count'
df = df.sort_values(['class_num', 'id_count']).set_index('id')
Result:
print(df)
   Stu_Name      Class  Fees  class_num  id_count
id
1      Jack    primary  2333          0         1
2      mack    primary  2363          0         1
11     eric    primary  2333          0         1
21     fick    primary  2363          0         1
3       may    primary  2833          0         3
3      Mark    primary  1333          0         3
3      John    primary  9333          0         3
4      Moon  Secondary  6589          1         1
5     daisy  Secondary  6565          1         1
42     Moon  Secondary  6589          1         1
56     anki  Secondary  6565          1         1
6     shawn  Secondary  6545          1         2
6       roy  Secondary  6596          1         2
9      hary     higher  8526          2         1
18     menk     higher  7845          2         1
17     rock     higher  9633          2         1
10      Joy     higher  9654          2         3
10     nick     higher  7845          2         3
10    julie     higher  9633          2         3
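If the order above looks right, the two helper columns can be dropped once the sort is done. Below is a minimal sketch of that cleanup, not a definitive implementation: the function name `sort_by_class_and_repeats` and the small `demo` frame are hypothetical, chosen only for illustration.

```python
import pandas as pd


def sort_by_class_and_repeats(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: the same three steps as above, then drop the helpers."""
    out = df.copy()
    # number the classes in order of first appearance
    out['class_num'] = out['Class'].factorize()[0]
    # count how many rows share each id
    out['id_count'] = out.groupby('id')['id'].transform('count')
    return (out.sort_values(['class_num', 'id_count'])
               .drop(columns=['class_num', 'id_count'])
               .set_index('id'))


# small illustrative frame in the style of the data above
demo = pd.DataFrame({
    'id': [3, 3, 1, 6, 4],
    'Stu_Name': ['may', 'Mark', 'Jack', 'shawn', 'Moon'],
    'Class': ['primary', 'primary', 'primary', 'Secondary', 'Secondary'],
    'Fees': [2833, 1333, 2333, 6545, 6589],
})
result = sort_by_class_and_repeats(demo)
print(result)
```

Working on a copy leaves the input frame untouched, so the helper columns never leak into the caller's data.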
ORIGINAL POST
You can use a categorical type to define a custom sort order:
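# astype() expects a dtype, so use CategoricalDtype rather than CategoricalIndex
class_dtype = pd.CategoricalDtype(
    categories=['primary', 'Secondary', 'higher'],
    ordered=True)
# kind='stable' keeps the original row order within each class
df = pd.concat([df1, df2]).astype(
    {'id': 'int',
     'Stu_Name': 'string',
     'Class': class_dtype,
     'Fees': 'int'}).sort_values('Class', kind='stable')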
print(df.head())
   id Stu_Name    Class  Fees
0   1     Jack  primary  2333
1   2     mack  primary  2363
2   3      may  primary  2833
3   3     Mark  primary  1333
4   3     John  primary  9333
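The two ideas can also be combined: an ordered categorical can stand in for `class_num`, so the class order is stated explicitly. `factorize` numbers the classes in order of first appearance, which only gives the desired order if the input already happens to be arranged that way. The following is a hedged sketch, assuming the `id_count` helper from the corrected answer and a small illustrative `demo` frame that is deliberately out of class order:

```python
import pandas as pd

# explicit class order (assumed to be the order wanted)
class_dtype = pd.CategoricalDtype(
    categories=['primary', 'Secondary', 'higher'], ordered=True)

# small illustrative frame; 'higher' appears first on purpose
demo = pd.DataFrame({
    'id': [10, 10, 4, 1, 9],
    'Stu_Name': ['Joy', 'nick', 'Moon', 'Jack', 'hary'],
    'Class': ['higher', 'higher', 'Secondary', 'primary', 'higher'],
    'Fees': [9654, 7845, 6589, 2333, 8526],
})

demo['Class'] = demo['Class'].astype(class_dtype)
# count how many rows share each id
demo['id_count'] = demo.groupby('id')['id'].transform('count')

# sort by category order first, then by how often the id repeats
result = (demo.sort_values(['Class', 'id_count'])
              .drop(columns='id_count')
              .set_index('id'))
print(result)
```

With `factorize` this frame would put `higher` first, because it is the first class encountered; the categorical dtype avoids that dependence on input order.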