Group by category, then compute the difference between categories [r] - PullRequest
0 votes
/ October 28, 2019

I am calculating the mean employment rate for different groups from 1995 to 2015, and then I want to compute the difference in mean employment rates between the groups.

The result should be ordered by year.

Most of the time I tried to use the summarise function in dplyr, but could not get it to work.

The code below is what I have set up so far.

    library(dplyr)

    # Keep ages 19-44 and label each person with one of six groups
    diff_in_diff <- Cps_total %>%
      filter(age >= 19 & age <= 44) %>%
      mutate(women_and_black_men =
        ifelse(female == 1 & marstat != 1 & nfchild == 0, "Single without children",
        ifelse(female == 1 & marstat != 1 & nfchild > 0, "Single with children",
        ifelse(female == 1 & marstat == 1 & nfchild == 0, "Married without children",
        ifelse(female == 1 & marstat == 1 & nfchild > 0, "Married with children",
        ifelse(female == 0 & wbhao == 2, "Black Men", "Otherwise Men"))))))

    # Mean employment rate per year and group
    diff_in_diff_2 <- diff_in_diff %>%
      filter(!is.na(empl)) %>%
      group_by(year, women_and_black_men) %>%
      summarize(mean_empl = mean(empl))

year |  women_and_black_men      |      mean_empl

1995 |  Black Men                |      0.8772406       
1995 |  Married with children    |      0.6810999       
1995 |  Married without children |      0.8227718       
1995 |  Otherwise Men            |      0.9048232       
1995 |  Single with children     |      0.8330486       
1995 |  Single without children  |      0.8927759       
1996 |  Black Men                |      0.8415265       
1996 |  Married with children    |      0.6800505       
1996 |  Married without children |      0.8188101       
1996 |  Otherwise Men            |      0.9035344   

This is what I got.

However, I want to find the following differences: Single with children minus Black Men, Single with children minus Single without children, Single with children minus Married with children, Single with children minus Married without children, and Single with children minus Otherwise Men.

So my expected output is:

year |  Single_with_children_vs      |      diff_in_diff

1995 |  vs Married with children     |      0.031230201
1995 |  vs Married without children  |     -0.130002012
1995 |  vs Single without children   |     -0.190230201
1995 |  vs Black Men                 |      0.002030210
1996 |
.
.
.

and so on.

1 Answer

1 vote
/ October 28, 2019

Perhaps not the most elegant solution, but here is a quick one:

    library(dplyr)

    # I created a basic dataset similar to yours: one row per year and group
    set.seed(1)  # make the random example reproducible
    groups <- c("married with children", "married without children",
                "otherwise men", "single with children",
                "single without children", "black men")
    diff_in_diff <- data.frame(year = rep(1995:1996, each = length(groups)),
                               women_and_black_men = rep(groups, times = 2),
                               empl = abs(rnorm(2 * length(groups), 0, 0.5))) %>%
      arrange(year)

    # create a dataframe that is just single with children
    diff_in_diff_single <- diff_in_diff %>%
      filter(women_and_black_men == "single with children") %>%
      select(year, single.emp = empl)

    # join with our original dataframe by year and take the difference
    # (single with children minus each other group, as the question asks);
    # inner_join drops years that have no "single with children" row
    diff_in_diff %>%
      inner_join(diff_in_diff_single, by = "year") %>%
      filter(women_and_black_men != "single with children") %>%
      mutate(diff = single.emp - empl)
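
The same per-year differences can probably also be computed without a join, by looking up the reference group's value inside a grouped `mutate`. The sketch below is only an illustration: it uses the 1995 means from the question's output as input, and the names `means_1995`, `reference` and `diffs` are made up for this example. It assumes every year contains exactly one "Single with children" row; a year without one would make the `mutate` step fail.

```r
library(dplyr)

# 1995 group means copied from the question's output table
means_1995 <- data.frame(
  year = 1995,
  women_and_black_men = c("Black Men", "Married with children",
                          "Married without children", "Otherwise Men",
                          "Single with children", "Single without children"),
  mean_empl = c(0.8772406, 0.6810999, 0.8227718,
                0.9048232, 0.8330486, 0.8927759)
)

reference <- "Single with children"  # hypothetical helper variable

diffs <- means_1995 %>%
  group_by(year) %>%
  # within each year: reference group's mean minus every group's mean
  mutate(diff_in_diff = mean_empl[women_and_black_men == reference] - mean_empl) %>%
  ungroup() %>%
  # the reference row would always be zero, so drop it
  filter(women_and_black_men != reference) %>%
  mutate(Single_with_children_vs = paste("vs", women_and_black_men)) %>%
  select(year, Single_with_children_vs, diff_in_diff) %>%
  arrange(year)

print(diffs)
```

With the full `diff_in_diff_2` table from the question in place of `means_1995`, the `group_by(year)` step should give one set of five differences per year, already ordered by year.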