I apologize in advance for the length of this question, but R is returning a result I can't make sense of, so I wanted to include as much of my data as possible. I have the following data frame:
str(CompleteData)
'data.frame': 7830 obs. of 65 variables:
$ StateCD : chr "ALABAMA 1" "ALABAMA 1" "ALABAMA 1" "ALABAMA 1" ...
$ Year : num 2001 2002 2003 2004 2005 ...
$ Congress : Factor w/ 9 levels "107","108","109",..: 1 1 2 2 3 3 4 4 5 5 ...
$ AGRICULTURE : Factor w/ 3 levels "0","1","2": 1 1 2 2 2 2 2 2 1 1 ...
$ APPROPRIATIONS : Factor w/ 2 levels "0","1": 2 2 1 1 1 1 2 2 2 2 ...
$ NATIONALSECURITY : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ FINANCIALSERVICES : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ BUDGET : Factor w/ 2 levels "0","1": 1 1 2 2 2 2 2 2 1 1 ...
$ EDUCATIONANDTHEWORKFORCE : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ ENERGYANDCOMMERCE : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ INTERNATIONALRELATIONS : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ GOVERNMENTREFORMANDOVERSIGHT : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ HOUSEOVERSIGHT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ JUDICIARY : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ RESOURCES : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ TRANSPORTATIONANDINFRASTRUCTURE : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ RULES : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SCIENCE : Factor w/ 2 levels "0","1": 1 1 1 1 2 2 1 1 1 1 ...
$ SMALLBUSINESS : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ STANDARDSOFOFFICIALCONDUCT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 2 2 2 2 ...
$ VETERANSAFFAIRS : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ WAYSANDMEANS : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ INTELLIGENCE_SELECT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SELECTCOMMITTEEONHOMELANDSECURITY : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ LIBRARY_JOINT : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ PRINTING_JOINT : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ TAXATION_JOINT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ ECONOMIC_JOINT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ MAJORITYWHIP : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ MAJORITYLEADER : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SPEAKER : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ MINORITYLEADER : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ MINORITYWHIP : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SCIENCEANDTECHNOLOGY : Factor w/ 3 levels "0","1","2": 1 1 2 2 1 1 2 2 1 1 ...
$ ARMEDSERVICES : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ GOVERNMENTREFORM : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ HOUSEADMINISTRATION : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ HOMELANDSECURITY : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ EDUCATIONANDLABOR : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ FOREIGNAFFAIRS : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ OVERSIGHTANDGOVERNMENTREFORM : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ NATURALRESOURCES : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ ENERGYINDEPENDENCEANDGLOBALWARMING_SELECT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ INVESTIGATETHEVOTINGIRREGULARITIESOFAUGUST2.2007_SELECT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ EDUCATIONANDTHEWORKPLACE : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SCIENCE.SPACE.ANDTECHNOLOGY : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ ETHICS : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ DEFICITREDUCTION_JOINT.SELECT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ ASSISTANTMINORITYLEADER : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ EVENTSSURROUNDINGTHE2012TERRORISTATTACKONBENGHAZI_SELECT: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ NA : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Majority : Factor w/ 7 levels "0","1","2","3",..: 2 2 4 4 4 4 1 1 1 1 ...
$ Minority : Factor w/ 7 levels "0","1","2","3",..: 1 1 1 1 1 1 5 5 3 3 ...
$ MinorityAddition : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ MajorityReplacement : Factor w/ 4 levels "0","1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ MinorityReplacement : Factor w/ 4 levels "0","1","2","3": 1 1 1 1 1 1 2 2 1 1 ...
$ MajorityAddition : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ OtherParty : Factor w/ 2 levels "0","2": 1 1 1 1 1 1 1 1 1 1 ...
$ Republican : Factor w/ 8 levels "0","1","2","3",..: 2 2 4 4 4 4 6 6 3 3 ...
$ Democratic : Factor w/ 8 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Independent : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ candidatevotes : num 0 108102 0 161067 0 ...
$ totalvotes : num 0 178687 0 255164 0 ...
$ VoteShare : num 0 60.5 0 63.1 0 ...
$ election : num 0 1 0 1 0 1 0 1 0 1 ...
This data frame was created by joining two other data frames with left_join. The code is shown below:
CompleteData <- Full_Congress %>%
  mutate(Year = as.character(Year),
         Year = as.numeric(Year),
         StateCD = as.character(StateCD)) %>%
  left_join(HORElections2, by = c("StateCD", "Year" = "year")) %>%
  mutate(election = ifelse(is.na(candidatevotes), 0, 1),
         candidatevotes = ifelse(election == 1, candidatevotes, 0),
         totalvotes = ifelse(election == 1, totalvotes, 0),
         VoteShare = ifelse(election == 1, VoteShare, 0))
The two other data frames have the following structures:
str(Full_Congress)
'data.frame': 7830 obs. of 61 variables:
$ StateCD : Factor w/ 459 levels "ALABAMA 1","ALABAMA 2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Year : Factor w/ 18 levels "2001","2002",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Congress : Factor w/ 9 levels "107","108","109",..: 1 1 2 2 3 3 4 4 5 5 ...
$ AGRICULTURE : Factor w/ 3 levels "0","1","2": 1 1 2 2 2 2 2 2 1 1 ...
$ APPROPRIATIONS : Factor w/ 2 levels "0","1": 2 2 1 1 1 1 2 2 2 2 ...
$ NATIONALSECURITY : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ FINANCIALSERVICES : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ BUDGET : Factor w/ 2 levels "0","1": 1 1 2 2 2 2 2 2 1 1 ...
$ EDUCATIONANDTHEWORKFORCE : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ ENERGYANDCOMMERCE : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ INTERNATIONALRELATIONS : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ GOVERNMENTREFORMANDOVERSIGHT : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ HOUSEOVERSIGHT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ JUDICIARY : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ RESOURCES : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ TRANSPORTATIONANDINFRASTRUCTURE : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ RULES : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SCIENCE : Factor w/ 2 levels "0","1": 1 1 1 1 2 2 1 1 1 1 ...
$ SMALLBUSINESS : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ STANDARDSOFOFFICIALCONDUCT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 2 2 2 2 ...
$ VETERANSAFFAIRS : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ WAYSANDMEANS : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ INTELLIGENCE_SELECT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SELECTCOMMITTEEONHOMELANDSECURITY : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ LIBRARY_JOINT : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ PRINTING_JOINT : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ TAXATION_JOINT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ ECONOMIC_JOINT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ MAJORITYWHIP : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ MAJORITYLEADER : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SPEAKER : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ MINORITYLEADER : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ MINORITYWHIP : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SCIENCEANDTECHNOLOGY : Factor w/ 3 levels "0","1","2": 1 1 2 2 1 1 2 2 1 1 ...
$ ARMEDSERVICES : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ GOVERNMENTREFORM : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ HOUSEADMINISTRATION : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ HOMELANDSECURITY : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ EDUCATIONANDLABOR : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ FOREIGNAFFAIRS : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ OVERSIGHTANDGOVERNMENTREFORM : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ NATURALRESOURCES : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ ENERGYINDEPENDENCEANDGLOBALWARMING_SELECT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ INVESTIGATETHEVOTINGIRREGULARITIESOFAUGUST2.2007_SELECT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ EDUCATIONANDTHEWORKPLACE : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ SCIENCE.SPACE.ANDTECHNOLOGY : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ ETHICS : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ DEFICITREDUCTION_JOINT.SELECT : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ ASSISTANTMINORITYLEADER : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ EVENTSSURROUNDINGTHE2012TERRORISTATTACKONBENGHAZI_SELECT: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ NA : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Majority : Factor w/ 7 levels "0","1","2","3",..: 2 2 4 4 4 4 1 1 1 1 ...
$ Minority : Factor w/ 7 levels "0","1","2","3",..: 1 1 1 1 1 1 5 5 3 3 ...
$ MinorityAddition : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ MajorityReplacement : Factor w/ 4 levels "0","1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ MinorityReplacement : Factor w/ 4 levels "0","1","2","3": 1 1 1 1 1 1 2 2 1 1 ...
$ MajorityAddition : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ OtherParty : Factor w/ 2 levels "0","2": 1 1 1 1 1 1 1 1 1 1 ...
$ Republican : Factor w/ 8 levels "0","1","2","3",..: 2 2 4 4 4 4 6 6 3 3 ...
$ Democratic : Factor w/ 8 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Independent : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
and
str(HORElections2)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 3915 obs. of 5 variables:
$ StateCD : chr "ALABAMA 1" "ALABAMA 1" "ALABAMA 1" "ALABAMA 1" ...
$ year : num 2002 2004 2006 2008 2010 ...
$ candidatevotes: num 108102 161067 112944 210660 129063 ...
$ totalvotes : num 178687 255164 165841 214367 156281 ...
$ VoteShare : num 60.5 63.1 68.1 98.3 82.6 ...
I wanted to check whether the new data frame (CompleteData) has any missing (NA) values, using the following code:
which(is.na(CompleteData))
[1] 495145
However, the CompleteData data frame has only 7,830 rows.
dim(CompleteData)
[1] 7830 65
Why does R return a row index that is far outside the range of rows in the data frame? Since 495,145 is greater than 7,830 (the number of rows in the data frame), does that mean there are no NAs in the data frame?
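For context, a minimal sketch of how which() behaves on the result of is.na() for a data frame. The toy data frame and all variable names below are hypothetical and are not the real CompleteData. The point it illustrates is that is.na() on a data frame returns a logical matrix, so which() reports a column-major position within that matrix rather than a row number; arr.ind = TRUE asks for row and column positions instead.

```r
# Toy data frame (hypothetical, not the real CompleteData): 4 rows, 3 columns,
# with a single NA in row 2 of the third column.
toy <- data.frame(a = 1:4, b = letters[1:4], c = c(0.5, NA, 1.5, 2.5))

# is.na() on a data frame returns a logical matrix, so which() gives a
# linear index that counts down column 1, then column 2, and so on.
linear_idx <- which(is.na(toy))

# arr.ind = TRUE returns the row and column of each NA instead.
pos <- which(is.na(toy), arr.ind = TRUE)

# Per-column NA counts are often easier to read on a wide data frame.
na_per_col <- colSums(is.na(toy))

# The same conversion done by hand for the index from the question,
# assuming 7830 rows as reported by dim(CompleteData).
idx <- 495145
n_rows <- 7830
row_of_idx <- (idx - 1) %% n_rows + 1
col_of_idx <- (idx - 1) %/% n_rows + 1

print(linear_idx)
print(pos)
print(na_per_col)
print(c(row = row_of_idx, col = col_of_idx))
```

By this arithmetic, 495145 would point at row 1855 of column 64. If the column order matches the str(CompleteData) listing above, that is VoteShare, which would suggest there is one NA there rather than none. Running which(is.na(CompleteData), arr.ind = TRUE) on the real data should confirm or refute this.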