I am trying to find the day of the week with the lowest proportion of successes.
I filtered the data and used mutate to add the proportion of successes for each day:
structure(list(weekday = c("Friday", "Friday", "Monday", "Monday",
"Saturday", "Saturday", "Sunday", "Sunday", "Thursday", "Thursday",
"Tuesday", "Tuesday", "Wednesday", "Wednesday"), successful = c(FALSE,
TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE,
TRUE, FALSE, TRUE), n = c(38404L, 19923L, 39467L, 21761L, 22023L,
10694L, 13655L, 7393L, 39231L, 21365L, 48520L, 28787L, 43405L,
24033L), proportion = c(65.8425771940954, 34.1574228059046, 64.4590710132619,
35.5409289867381, 67.313629000214, 32.686370999786, 64.8755226149753,
35.1244773850247, 64.7418971549277, 35.2581028450723, 62.7627511092139,
37.2372488907861, 64.3628221477505, 35.6371778522495)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -14L), groups = structure(list(
weekday = c("Friday", "Monday", "Saturday", "Sunday", "Thursday",
"Tuesday", "Wednesday"), .rows = list(1:2, 3:4, 5:6, 7:8,
9:10, 11:12, 13:14)), row.names = c(NA, -7L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE))
This was produced with:
mutate(weekday = weekdays(as.Date(launched)),
successful = state == "successful") %>%
count(weekday, successful) %>%
group_by(weekday) %>%
mutate(proportion = n/sum(n) * 100)
applied to the original dataset.
Trying %>% filter(proportion == min(proportion, na.rm = TRUE))
only drops the rows where successful is FALSE and still leaves one row per weekday:
structure(list(weekday = c("Friday", "Monday", "Saturday", "Sunday",
"Thursday", "Tuesday", "Wednesday"), successful = c(TRUE, TRUE,
TRUE, TRUE, TRUE, TRUE, TRUE), n = c(19923L, 21761L, 10694L,
7393L, 21365L, 28787L, 24033L), proportion = c(34.1574228059046,
35.5409289867381, 32.686370999786, 35.1244773850247, 35.2581028450723,
37.2372488907861, 35.6371778522495)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -7L), groups = structure(list(
weekday = c("Friday", "Monday", "Saturday", "Sunday", "Thursday",
"Tuesday", "Wednesday"), .rows = list(1L, 2L, 3L, 4L, 5L,
6L, 7L)), row.names = c(NA, -7L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE))
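A likely explanation for this result is that the data frame is still grouped by weekday, so min(proportion) is evaluated separately within each day rather than over the whole column. The sketch below illustrates the difference on a small stand-in table; the name `props` and the three rounded rows are assumptions made for illustration, not the full dataset.

```r
library(dplyr)

# Hypothetical stand-in for the summarised data (values rounded from the post)
props <- tibble(
  weekday    = rep(c("Friday", "Saturday", "Sunday"), each = 2),
  successful = rep(c(FALSE, TRUE), times = 3),
  proportion = c(65.8, 34.2, 67.3, 32.7, 64.9, 35.1)
) %>%
  group_by(weekday)

# Still grouped: min() is computed once per weekday, so one row per day survives
per_group <- props %>%
  filter(proportion == min(proportion))

# Ungrouped: min() sees the whole column, so only a single row survives
overall <- props %>%
  ungroup() %>%
  filter(successful) %>%
  filter(proportion == min(proportion))
```

Here `per_group` keeps the successful row of every weekday, while `overall` keeps only the weekday whose success proportion is smallest.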
The only function that worked for me on the proportion column was %>% arrange(proportion),
but using %>% slice(proportion, 1)
right after it gives the same result as above.
To clarify, filtering on the minimum value of proportion gives me:
# A tibble: 7 x 4
# Groups:   weekday [7]
  weekday   successful     n proportion
  <chr>     <lgl>      <int>      <dbl>
1 Friday    TRUE       19923       34.2
2 Monday    TRUE       21761       35.5
3 Saturday  TRUE       10694       32.7
4 Sunday    TRUE        7393       35.1
5 Thursday  TRUE       21365       35.3
6 Tuesday   TRUE       28787       37.2
7 Wednesday TRUE       24033       35.6
Instead of a single row such as:
  weekday successful     n proportion
  <chr>   <lgl>      <int>      <dbl>
1 Sunday  TRUE        7393       35.1
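One way to get a single row is sketched below, assuming dplyr 1.0.0 or later (for slice_min()) and that the grouping by weekday should be dropped before picking the minimum. The object names `props` and `lowest` are hypothetical; the counts are copied from the dput output in the post. Note that with these counts the smallest success proportion belongs to Saturday (about 32.7), not Sunday.

```r
library(dplyr)

# Rebuild the summarised table from the counts shown in the post
props <- tibble(
  weekday = rep(c("Friday", "Monday", "Saturday", "Sunday",
                  "Thursday", "Tuesday", "Wednesday"), each = 2),
  successful = rep(c(FALSE, TRUE), times = 7),
  n = c(38404L, 19923L, 39467L, 21761L, 22023L, 10694L, 13655L,
        7393L, 39231L, 21365L, 48520L, 28787L, 43405L, 24033L)
) %>%
  group_by(weekday) %>%
  mutate(proportion = n / sum(n) * 100)

lowest <- props %>%
  ungroup() %>%                   # drop the weekday grouping first
  filter(successful) %>%          # keep only the success rows
  slice_min(proportion, n = 1)    # row with the smallest proportion overall
```

The same row should also come out of `arrange(proportion) %>% slice(1)` once the data are ungrouped and restricted to the successful rows; slice() takes row positions, so the column name is not passed to it.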