Use value_counts with normalize=True:
s = pd.cut(loan, np.arange((loan.min()-100), (loan.max()+100), 500))
out = s.value_counts(normalize=True)
Or count the rows per bin with groupby and divide by the total:
s1 = loan.groupby(s).size()
out = s1.div(s1.sum())
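A minimal sketch, on made-up values rather than the data from the question, suggesting that the two approaches give the same proportions. The series contents, the fixed bin edges, and the explicit observed=False (which keeps empty bins in the groupby result on recent pandas versions) are assumptions added for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical loan amounts, chosen so the last bin stays empty
loan = pd.Series([1700, 1800, 2300, 2400, 2900, 3000, 3000, 3100], name="LOAN")

# Bin edges: 1600, 2100, 2600, 3100, 3600
edges = np.arange(1600, 3700, 500)
s = pd.cut(loan, edges)

# Approach 1: normalized counts per bin, ordered by bin
via_value_counts = s.value_counts(normalize=True).sort_index()

# Approach 2: group sizes divided by their total;
# observed=False keeps bins that contain no values
counts = loan.groupby(s, observed=False).size()
via_groupby = counts.div(counts.sum())

print(via_value_counts)
print(via_groupby)
```

Both results are indexed by the same intervals, so they can be compared bin by bin.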
Sample:
import numpy as np
import pandas as pd

np.random.seed(123)
data = pd.DataFrame({
'LOAN':np.random.randint(17, 65, 50) * 100
})
loan = data["LOAN"]
s = pd.cut(loan, np.arange((loan.min()-100), (loan.max()+100), 500))
out = s.value_counts(normalize=True).sort_index()
print(out)
(1600, 2100] 0.155556
(2100, 2600] 0.066667
(2600, 3100] 0.066667
(3100, 3600] 0.088889
(3600, 4100] 0.088889
(4100, 4600] 0.088889
(4600, 5100] 0.244444
(5100, 5600] 0.111111
(5600, 6100] 0.088889
Name: LOAN, dtype: float64
s1 = loan.groupby(s).size()
print(s1)
LOAN
(1600, 2100] 7
(2100, 2600] 3
(2600, 3100] 3
(3100, 3600] 4
(3600, 4100] 4
(4100, 4600] 4
(4600, 5100] 11
(5100, 5600] 5
(5600, 6100] 4
Name: LOAN, dtype: int64
out = s1.div(s1.sum())
print(out)
LOAN
(1600, 2100] 0.155556
(2100, 2600] 0.066667
(2600, 3100] 0.066667
(3100, 3600] 0.088889
(3600, 4100] 0.088889
(4100, 4600] 0.088889
(4600, 5100] 0.244444
(5100, 5600] 0.111111
(5600, 6100] 0.088889
Name: LOAN, dtype: float64
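One thing worth checking in the sample: the counts add up to 45, although 50 values were generated. np.arange excludes its stop value, so the last edge (6100 here) can end up below loan.max(); pd.cut returns NaN for values outside the bins, and both value_counts(normalize=True) and groupby(...).size() then leave those rows out. The sketch below uses made-up values to show the effect. Extending the stop by one full step is only one possible way to cover the maximum, not something the original answer does:

```python
import numpy as np
import pandas as pd

# Hypothetical values; the two largest lie above the last edge built below
loan = pd.Series([1700, 2000, 2500, 6200, 6400], name="LOAN")
step = 500

# Edges built as in the answer: 1600, 2100, ..., 6100 (stop 6500 is excluded)
edges = np.arange(loan.min() - 100, loan.max() + 100, step)
binned = pd.cut(loan, edges)
n_outside = binned.isna().sum()  # values that fell into no bin

# Assumed fix: push the stop one step past the maximum,
# so the last edge is always >= loan.max()
edges_full = np.arange(loan.min() - 100, loan.max() + step, step)
binned_full = pd.cut(loan, edges_full)

print(n_outside)                 # number of unbinned values with the original edges
print(binned_full.isna().sum())  # number of unbinned values with the wider edges
print(binned_full.value_counts(normalize=True).sort_index())
```

With the wider edges every value lands in a bin, so the proportions are relative to the whole series rather than to the binned subset.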