Converting an unusual data format from long to long
0 votes
/ October 10, 2018

My data:

    # A tibble: 6 x 4
  X__1            X__6                                                     X__7     X__8        
  <chr>           <chr>                                                    <chr>    <chr>       
1 Emp #:          xxyy                                                    Departm~ Corporate S~
2 Reason of Resi~ I think below are areas of improvement within my team C~ NA       NA          
3 Emp #:          xyyy                                                    Departm~ Corporate S~
4 Reason of Resi~ better oppurtunity                                       NA       NA          

I want to reshape the data into the following format:

Emp #     Reason                                                 Department
10282     I think below are areas of improvement within my team  Corporate
10308     better oppurtunity                                     Corporate

Data to reproduce:

structure(list(X__1 = c("Emp #:", "Reason of Resignation:", "Emp #:", 
"Reason of Resignation:", "Emp #:", "Reason of Resignation:", 
"Emp #:", "Reason of Resignation:", "Emp #:", "Reason of Resignation:"
), X__6 = c("10282", "I think below are areas of improvement within my team CS / SME or my be cross the organization on my level (L1-L2). Lack of career growth specially in my department i.e. CS HOD/RSM/TLs/KAMs are on same position from last 5 years. Many people are here on same position from last 10-12 years. lack in focus on low level staff (L1 / L2) in terms of capacity building and career growth i.e. not a single training for my team on it. No rotation plans (for capacity building) for CS i.e. not a single team member rotated since I joined. Better opportunity in terms of career and financials outside ", 
"10308", "better oppurtunity", "11230", "Moving on another organization for career persuade", 
"13370", "Get a new job outside the company.", "14694", "Health Issues"
), X__7 = c("Department:", NA, "Department:", NA, "Department:", 
NA, "Department:", NA, "Department:", NA), X__8 = c("Corporate Solutions", 
NA, "Corporate Solutions", NA, "Region Central A", NA, "Region North", 
NA, "Finance Operations", NA)), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))

A bit more detail.

"Emp #:" in X__1 should become the header of the first column, whose values come from X__6, and so on for the other labels.

Answers [ 2 ]

0 votes
/ October 10, 2018

Given that your format strictly follows what you show, here is another (slightly overfitted) idea: split the odd and even rows, then bind the relevant columns side by side.

d1 <- df[c(TRUE, FALSE),]
d2 <- df[c(FALSE, TRUE),]

setNames(data.frame(d1[2], d1[4], d2[2]), c(d1[1,1], d1[1,3], d2[1,1]))

which gives

   Emp #:         Department:                                                       Reason of Resignation:
1  10282 Corporate Solutions I think below are areas of improvement within my team CS / SMEs outside JAZZ
2  10308 Corporate Solutions                                                           better oppurtunity
3  11230    Region Central A                           Moving on another organization for career persuade
4  13370        Region North                                           Get a new job outside the company.
5  14694  Finance Operations                                                                Health Issues
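Because this approach depends on the rows strictly alternating, it may be worth checking that assumption before splitting. Below is a minimal sketch of such a check in base R; the small `df` is a shortened stand-in for the question's data (the reason texts are placeholders), and the names `odd`, `even` and `result` are introduced here for illustration only.

```r
# Shortened stand-in for the question's data (plain data.frame, placeholder reason text)
df <- data.frame(
  X__1 = c("Emp #:", "Reason of Resignation:", "Emp #:", "Reason of Resignation:"),
  X__6 = c("10282", "lack of career growth", "10308", "better oppurtunity"),
  X__7 = c("Department:", NA, "Department:", NA),
  X__8 = c("Corporate Solutions", NA, "Corporate Solutions", NA),
  stringsAsFactors = FALSE
)

odd  <- seq(1, nrow(df), by = 2)  # rows holding "Emp #:" and "Department:"
even <- seq(2, nrow(df), by = 2)  # rows holding "Reason of Resignation:"

# Fail early if the layout is not the strict two-row pattern assumed above
stopifnot(
  nrow(df) %% 2 == 0,
  all(df$X__1[odd] == "Emp #:"),
  all(df$X__1[even] == "Reason of Resignation:"),
  all(df$X__7[odd] == "Department:")
)

# Same odd/even split as in the answer, with explicit column names
result <- data.frame(
  `Emp #`    = df$X__6[odd],
  Department = df$X__8[odd],
  Reason     = df$X__6[even],
  check.names = FALSE,
  stringsAsFactors = FALSE
)
```

If any label is out of place, `stopifnot()` stops with an error instead of silently pairing the wrong rows.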
0 votes
/ October 10, 2018

I added a new column named rid that groups each pair of rows, then filtered out the columns I needed and used left_join() to put them back together by rid.

library(dplyr)

# rid numbers each consecutive pair of rows: 1, 1, 2, 2, ...
df <- mutate(df, rid = rep(seq_len(nrow(df) / 2), each = 2))

left_join(
  df %>%
    filter(X__1 == "Emp #:") %>%
    select(rid, X__6) %>%
    rename("Emp #" = "X__6"),
  df %>%
    filter(X__1 == "Reason of Resignation:") %>%
    select(rid, X__6) %>%
    rename("Reason" = "X__6"),
  by = "rid") %>%
  left_join(df %>%
              filter(X__7 == "Department:") %>%
              select(rid, X__8) %>%
              rename("Department" = "X__8"),
            by = "rid") %>%
  select(-rid)

#  `Emp #` Reason                                                    Department     
#   <chr>   <chr>                                                     <chr>          
# 1 10282   I think below are areas of improvement within my team CS~ Corporate Solu~
# 2 10308   better oppurtunity                                        Corporate Solu~
# 3 11230   Moving on another organization for career persuade        Region Central~
# 4 13370   Get a new job outside the company.                        Region North   
# 5 14694   Health Issues                                             Finance Operat~
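The position-based rid above assumes exactly two rows per employee. As a hedged variation (not part of the answer), the id could instead be derived from the labels themselves with `cumsum()`, so that a record without a reason row still gets its own id. The sketch below uses base R `merge()` in place of `left_join()`; the sample `df`, including the third employee with no reason row, is hypothetical, as are the names `is_emp`, `emp`, `reason` and `out`.

```r
# Hypothetical sample: the third employee has no "Reason of Resignation:" row
df <- data.frame(
  X__1 = c("Emp #:", "Reason of Resignation:", "Emp #:",
           "Reason of Resignation:", "Emp #:"),
  X__6 = c("10282", "lack of career growth", "10308", "better oppurtunity", "11230"),
  X__7 = c("Department:", NA, "Department:", NA, "Department:"),
  X__8 = c("Corporate Solutions", NA, "Corporate Solutions", NA, "Region Central A"),
  stringsAsFactors = FALSE
)

# rid increases by one at every "Emp #:" row, so all rows of one record share an id
is_emp <- df$X__1 == "Emp #:"
df$rid <- cumsum(is_emp)

emp <- data.frame(rid = df$rid[is_emp], emp = df$X__6[is_emp],
                  department = df$X__8[is_emp], stringsAsFactors = FALSE)
reason <- data.frame(rid = df$rid[!is_emp], reason = df$X__6[!is_emp],
                     stringsAsFactors = FALSE)

# all.x = TRUE behaves like a left join: employees without a reason are kept with NA
out <- merge(emp, reason, by = "rid", all.x = TRUE)
out <- setNames(out[c("emp", "reason", "department")],
                c("Emp #", "Reason", "Department"))
```

With the question's data, where every employee has both rows, this should produce the same pairing as the position-based rid.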
...