I want to run a t.test in R, comparing the average length of stay between two peer groups (Large and Medium hospitals).
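library(dplyr)

# Subset the two peer groups being compared
alos1 <- alos %>% filter(`Peer group` == "Large hospitals")
alos2 <- alos %>% filter(`Peer group` == "Medium hospitals")
# Stack the two subsets row-wise (no key matching is needed, so bind_rows rather than full_join)
Large_Medium <- bind_rows(alos1, alos2)
# Drop rows where the length of stay is recorded as "NP"
Clean <- Large_Medium %>% filter(`Average length of stay (days)` != "NP")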
t.test(
  peer_group ~ Average_length_of_stay,
  data = Clean,
  var.equal = TRUE,
  alternative = "two.sided"
)
The code above subsets the data and then runs the t.test, but I keep getting an error.
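Before running the test, it may help to confirm that the grouping column really contains two groups and that the outcome column is numeric. Below is a minimal sketch in base R on a small made-up data frame; the `toy` name and all its values are hypothetical and only stand in for the real `alos` data.

```r
# Hypothetical toy data standing in for the real `alos` table
toy <- data.frame(
  `Peer group` = c("Large hospitals", "Large hospitals",
                   "Medium hospitals", "Medium hospitals"),
  `Average length of stay (days)` = c("3.9", "NP", "2.8", "3.1"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# How many rows fall in each peer group? t.test needs exactly two groups.
group_counts <- table(toy$`Peer group`)
print(group_counts)

# A column that contains "NP" is stored as character,
# so convert it to numeric after dropping those rows
clean <- toy[toy$`Average length of stay (days)` != "NP", ]
clean$`Average length of stay (days)` <-
  as.numeric(clean$`Average length of stay (days)`)
str(clean)
```

If the table shows fewer than two groups, or `str()` reports a character column, the t.test call is likely to fail regardless of how the formula is written.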
Using dput, as Ronak suggested, here is my data.
structure(list(`Reporting unit` = c("Albury Wodonga Health [Albury Campus]",
"Albury Wodonga Health [Albury Campus]", "Albury Wodonga Health [Albury Campus]",
"Albury Wodonga Health [Albury Campus]", "Albury Wodonga Health [Albury Campus]",
"Albury Wodonga Health [Albury Campus]", "Albury Wodonga Health [Albury Campus]",
"Albury Wodonga Health [Albury Campus]", "Albury Wodonga Health [Albury Campus]",
"Albury Wodonga Health [Albury Campus]"), `Reporting unit type` = c("Hospital",
"Hospital", "Hospital", "Hospital", "Hospital", "Hospital", "Hospital",
"Hospital", "Hospital", "Hospital"), State = c("NSW", "NSW",
"NSW", "NSW", "NSW", "NSW", "NSW", "NSW", "NSW", "NSW"), `Local Hospital Network (LHN)` = c("Albury Wodonga Health",
"Albury Wodonga Health", "Albury Wodonga Health", "Albury Wodonga Health",
"Albury Wodonga Health", "Albury Wodonga Health", "Albury Wodonga Health",
"Albury Wodonga Health", "Albury Wodonga Health", "Albury Wodonga Health"
), `Peer group` = c("Large hospitals", "Large hospitals", "Large hospitals",
"Large hospitals", "Large hospitals", "Large hospitals", "Large hospitals",
"Large hospitals", "Large hospitals", "Large hospitals"), `Time period` = c("2011–12",
"2012–13", "2013–14", "2014–15", "2015–16", "2016–17", "2011–12",
"2012–13", "2013–14", "2014–15"), Category = c("Cellulitis",
"Cellulitis", "Cellulitis", "Cellulitis", "Cellulitis", "Cellulitis",
"Chronic Obstructive Pulmonary Disease (without complications)",
"Chronic Obstructive Pulmonary Disease (without complications)",
"Chronic Obstructive Pulmonary Disease (without complications)",
"Chronic Obstructive Pulmonary Disease (without complications)"
), `Total number of stays` = c(111, 116, 141, 155, 210, 196,
109, 116, 75, 132), `Number of overnight stays` = c(92, 98, 115,
123, 166, 155, 108, 113, 71, 122), `Percentage of overnight stays` = c(0.83,
0.84, 0.82, 0.79, 0.79, 0.79, 0.99, 0.97, 0.95, 0.92), `Average length of stay (days)` = c(3.9,
3.3, 3.1, 2.5, 2.6, 2.7, 5.8, 4.6, 5.7, 4.4), `Peer group average (days)` = c(3.7,
3.5, 3.3, 3.2, 3, 3, 4.8, 4.4, 4.2, 3.9), `Total overnight patient bed days` = c(356,
326, 351, 306, 431, 418, 622, 518, 405, 538)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
I get a new error after naming my columns correctly. The new error is:
Error in t.test.formula(`Peer group` ~ `Average length of stay (days)`, : grouping factor must have exactly 2 levels
I would appreciate any help, please.
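One possible reading of this error, offered as a sketch and not a definitive answer: `t.test` expects the formula as `outcome ~ group`. With `Peer group` on the left, the numeric length-of-stay column is treated as the grouping factor, and it has many distinct values instead of two. Note also that the ten rows posted above contain only "Large hospitals", so the test would still fail on that sample alone. The example below uses made-up numbers (the `toy` data frame and its values are hypothetical) to show the call with the two sides swapped.

```r
# Hypothetical toy data: two peer groups and a numeric length of stay
toy <- data.frame(
  `Peer group` = rep(c("Large hospitals", "Medium hospitals"), each = 4),
  `Average length of stay (days)` = c(3.9, 3.3, 3.1, 2.5, 2.9, 2.4, 2.2, 2.0),
  check.names = FALSE
)

# Outcome on the left, two-level grouping variable on the right
result <- t.test(
  `Average length of stay (days)` ~ `Peer group`,
  data = toy,
  var.equal = TRUE,
  alternative = "two.sided"
)
print(result)
```

On the real data the same call would take `data = Clean`, provided `Clean` still contains both peer groups and the length-of-stay column is numeric.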