Get all rows with and without NaN in a pandas DataFrame
0 votes
/ January 6, 2019

What is the most efficient way to split a pandas DataFrame into the rows that contain at least one NaN and the rows that contain none?

Input:

ID    Gender    Dependants    Income    Education    Married
1     Male      2             500       Graduate     Yes
2     NaN       4             2500      Graduate     No
3     Female    3             NaN       NaN          Yes
4     Male      NaN           7000      Graduate     Yes
5     Female    4             500       Graduate     NaN
6     Female    2             4500      Graduate     Yes

Expected output without NaN:

ID    Gender    Dependants    Income    Education    Married
1     Male      2             500       Graduate     Yes
6     Female    2             4500      Graduate     Yes

Expected output with NaN:

ID    Gender    Dependants    Income    Education    Married
2     NaN       4             2500      Graduate     No
3     Female    3             NaN       NaN          Yes
4     Male      NaN           7000      Graduate     Yes
5     Female    4             500       Graduate     NaN 

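1 Answer

0 votes
/ January 6, 2019

Use boolean indexing: check every cell for a missing value with isnull, then use any with axis=1 to flag each row that has at least one True. The resulting mask selects the rows with NaN, and its negation (~mask) selects the rows without: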

mask = df.isnull().any(axis=1)

df1 = df[~mask]
df2 = df[mask]
print(df1)
   ID  Gender  Dependants  Income Education Married
0   1    Male         2.0   500.0  Graduate     Yes
5   6  Female         2.0  4500.0  Graduate     Yes

print(df2)
   ID  Gender  Dependants  Income Education Married
1   2     NaN         4.0  2500.0  Graduate      No
2   3  Female         3.0     NaN       NaN     Yes
3   4    Male         NaN  7000.0  Graduate     Yes
4   5  Female         4.0   500.0  Graduate     NaN
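
The snippet above assumes df already exists. A minimal self-contained sketch follows, with the sample data rebuilt by hand from the table in the question. The names complete_rows and incomplete_rows are illustrative and do not come from the original answer:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question's input table
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Gender": ["Male", np.nan, "Female", "Male", "Female", "Female"],
    "Dependants": [2, 4, 3, np.nan, 4, 2],
    "Income": [500, 2500, np.nan, 7000, 500, 4500],
    "Education": ["Graduate", "Graduate", np.nan,
                  "Graduate", "Graduate", "Graduate"],
    "Married": ["Yes", "No", "Yes", "Yes", np.nan, "Yes"],
})

# True for every row that holds at least one missing value
mask = df.isnull().any(axis=1)

complete_rows = df[~mask]    # rows without any NaN
incomplete_rows = df[mask]   # rows with at least one NaN

print(complete_rows)
print(incomplete_rows)
```

Dependants and Income print as floats (2.0, 500.0) because a numeric column that contains NaN is stored as float64.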

Details:

print(df.isnull())
     ID  Gender  Dependants  Income  Education  Married
0  False   False       False   False      False    False
1  False    True       False   False      False    False
2  False   False       False    True       True    False
3  False   False        True   False      False    False
4  False   False       False   False      False     True
5  False   False       False   False      False    False

print(mask)
0    False
1     True
2     True
3     True
4     True
5    False
dtype: bool
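
As a side note, the NaN-free half can also be obtained with dropna, which by default drops every row that has at least one missing value. The sketch below uses a small made-up frame (the columns a and b are hypothetical) to show that both approaches agree. It is only an illustration, not a claim about which is faster:

```python
import numpy as np
import pandas as pd

# Small hypothetical frame: row 0 is complete, rows 1 and 2 have a gap
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", None]})

# Mask-based split, as in the answer
mask = df.isna().any(axis=1)
without_nan_mask = df[~mask]
with_nan = df[mask]

# dropna() with default arguments removes rows containing any NaN
without_nan_dropna = df.dropna()

# Both approaches select the same rows
print(without_nan_mask.equals(without_nan_dropna))
```

The mask is still needed for the rows with NaN, since dropna returns only the complete rows.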
...