Removing a filter with dplyr in R - PullRequest
January 31, 2019

I'm trying to filter for certain groups in a data set, take the maximum success rate among those groups where N > 10, and then remove the filter so that I can apply the result to every group in the set.

results<-df1 %>%
  group_by(course, Race) %>%
  summarize(DI=sum(success), pct=sum(success/n()), n=n())%>%
  ungroup %>%
  group_by(course) %>% 
  #mutate(ref=max(pct[n>]))
  filter(Race == "White" | Race == "Asian" | Race == "African American" | Race == "Hispanic/Latino") %>%
  mutate(reference=max(pct[n>10])) %>% 
  ungroup %>%
  mutate(di_80_index=pct/reference, di_indicator=ifelse(di_80_index < 0.80, 1, 0)) %>% 
  ungroup %>%
  arrange(course, Race)

This returns the maximum over the rows where Race is White, Asian, African American, or Hispanic/Latino and N > 10, but it also drops every other race from the rest of the analysis. Essentially, I'm looking for a way to bring back all the rows that were filtered out and then apply

mutate(di_80_index=pct/reference, di_indicator=ifelse(di_80_index < 0.80, 1, 0))

to the whole data set. Instead, only the White, Asian, African American, and Hispanic/Latino rows remain. Basically I want an "unfilter" to happen. Any thoughts?
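One possible way to get the "unfilter" effect, sketched under assumptions: compute the reference value on a filtered copy, then join it back onto the unfiltered summary by `course`. The toy `summary_tbl` stands in for the output of `summarize()` above, and the names `ref_races` and `reference_tbl` are made up for illustration.

```r
library(dplyr)

# Hypothetical per-course, per-race summary (stand-in for the summarize() output)
summary_tbl <- tibble(
  course = c("ENG 1", "ENG 1", "ENG 1", "ENG 3", "ENG 3"),
  Race   = c("White", "Asian", "Filipino", "White", "Filipino"),
  pct    = c(0.50, 0.80, 0.90, 0.60, 0.30),
  n      = c(20, 15, 30, 8, 25)
)

ref_races <- c("White", "Asian", "African American", "Hispanic/Latino")

# Filter only on a side copy: one reference value per course
reference_tbl <- summary_tbl %>%
  filter(Race %in% ref_races, n > 10) %>%
  group_by(course) %>%
  summarize(reference = max(pct))

# Join the reference back onto the full, unfiltered summary
results <- summary_tbl %>%
  left_join(reference_tbl, by = "course") %>%
  mutate(
    di_80_index  = pct / reference,
    di_indicator = ifelse(di_80_index < 0.80, 1, 0)
  ) %>%
  arrange(course, Race)
```

With this approach, a course that has no eligible reference row (ENG 3 in the toy data, where the only reference race has n = 8) ends up with `NA` in `reference` and in the derived columns, which may or may not be what you want.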

Edit: alternatively, if I could somehow nest the Race filter inside

mutate(reference=max(pct[n>10]))

, that would solve the problem. Something like:

mutate(reference=max(pct[n>10] & Race == "White" | Race == "Asian" | Race == "African American" | Race == "Hispanic/Latino"))
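A minimal sketch of how that nesting might look, assuming the summary has the columns `course`, `Race`, `pct`, and `n` from the pipeline above. The race test has to go inside the subscript brackets, combined with `n > 10` via `&`, and `%in%` avoids the chain of `|` comparisons. The toy `summary_tbl` and the name `ref_races` are hypothetical.

```r
library(dplyr)

# Hypothetical per-course, per-race summary (stand-in for the summarize() output)
summary_tbl <- tibble(
  course = c("ENG 1", "ENG 1", "ENG 1", "ENG 3", "ENG 3", "ENG 3"),
  Race   = c("White", "Asian", "Filipino", "White", "Asian", "Filipino"),
  pct    = c(0.50, 0.80, 0.90, 0.60, 0.40, 0.30),
  n      = c(20, 15, 30, 12, 5, 25)
)

ref_races <- c("White", "Asian", "African American", "Hispanic/Latino")

results <- summary_tbl %>%
  group_by(course) %>%
  # Subset pct inside max(): only reference races with n > 10 compete,
  # but every row in the course receives the resulting value
  mutate(reference = max(pct[n > 10 & Race %in% ref_races])) %>%
  ungroup() %>%
  mutate(
    di_80_index  = pct / reference,
    di_indicator = ifelse(di_80_index < 0.80, 1, 0)
  ) %>%
  arrange(course, Race)
```

No rows are dropped here, so the Filipino rows keep flowing through the rest of the pipeline. One caveat: if a course has no row that satisfies the condition, `max()` is called on an empty vector and returns `-Inf` with a warning, so that case may need separate handling.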

Sample data:

Race = c("African American","African American","African American","African American","African American", "Asian","Asian","Asian","Asian","Asian", "Hispanic","African American","African American","African American","African American","African American", "Asian","Asian","Asian","Asian","Asian", "Hispanic","Hispanic","Hispanic","Hispanic","Hispanic", "White","White","White","White","White","Hispanic","Hispanic","Hispanic","Hispanic", "White","White","White","White","White","African American","African American","African American","African American","African American", "Asian","Asian","Asian","Asian","Asian", "Hispanic","Hispanic","Hispanic","Hispanic","Hispanic", "White","White","White","White","White", "Filipino","Filipino","Filipino","Filipino","Filipino","Filipino","Filipino","Filipino","Filipino","Filipino","Filipino","Filipino")
course = c("ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5","ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5", "ENG 8","ENG 1", "ENG 3", "ENG 5")
success = c(0,0,1,0,1,0,1,1,0,1,0,0,1,0,1,0,1,1,0,1,0,0,1,0,1,0,1,1,0,1,0,0,1,0,1,0,1,1,0,1,0,0,1,0,1,0,1,1,0,0,1,0,1,0,1,1,0,1,0,0,1,0,1,0,1,1,0,1,0,1,0,0)

df = data.frame(course, Race, success)
...