I am trying to use pivot_wider to reduce the number of rows in my data and add new columns. However, the number of columns increases while the number of rows stays the same. Ideally, each "Indicator" would be a single observation (one row) wherever DataYear, Company, Market, Country, etc. match. I suspect the problem is caused by duplicate observations, but I don't understand why the IndicatorID column doesn't resolve this.
Example of my data:
LongTest <- structure(list(DataYear = c(2018L, 2017L, 2016L, 2018L, 2017L,
2016L, 2018L, 2017L, 2016L), Company = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = "One", class = "factor"), Market = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Total", class = "factor"),
Country = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "ALL", class = "factor"),
ISO = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "ALL", class = "factor"),
Sector = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Insurance", class = "factor"),
Division = c(NA, NA, NA, NA, NA, NA, NA, NA, NA), Furtherdetails1 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA), Furtherdetails2 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA), Indicator = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("Tax Avoidance",
"Turnover"), class = "factor"), IndicatorID = c(20L, 20L,
20L, 20L, 20L, 20L, 26L, 26L, 26L), InputName = structure(c(3L,
3L, 3L, 2L, 2L, 2L, 1L, 1L, 1L), .Label = c("Number of employees",
"Profit before tax (Attributable to shareholder profit)",
"Tax Paid"), class = "factor"), InputCode = structure(c(2L,
2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("InputA", "InputB"
), class = "factor"), UnitRequired = structure(c(2L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L), .Label = c("#", "GBP"), class = "factor"),
Value = c(4.47e+08, 6.2e+08, 6.47e+08, 2.129e+09, 2.003e+09,
1.193e+09, 37628, 42431, 39833.44), UniqueID = 1:9), class = "data.frame", row.names = c(NA,
-9L))
And the code I am currently using:
library(tidyr)
outTest <- pivot_wider(LongTest, names_from = InputCode, values_from = c(Value, UnitRequired, InputName))
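For reference, here is a minimal sketch of one possible cause, not confirmed on the full data. By default, pivot_wider treats every column not named in names_from or values_from as an identifier, so a column that is unique per row (such as UniqueID in the sample) would keep every row distinct. The data frame `toy` is a hypothetical, cut-down stand-in for LongTest, and the choice of DataYear and IndicatorID as identifiers is an assumption made for illustration.

```r
library(tidyr)

# Hypothetical stand-in for LongTest, reduced to the relevant columns
toy <- data.frame(
  DataYear    = c(2018L, 2017L, 2018L, 2017L, 2018L, 2017L),
  IndicatorID = c(20L, 20L, 20L, 20L, 26L, 26L),
  InputCode   = c("InputB", "InputB", "InputA", "InputA", "InputB", "InputB"),
  Value       = c(4.47e8, 6.2e8, 2.129e9, 2.003e9, 37628, 42431),
  UniqueID    = 1:6
)

# UniqueID is left in, so it is used as an identifier column and
# each input row stays on its own output row.
wide_with_id <- pivot_wider(toy, names_from = InputCode, values_from = Value)

# Restricting the identifiers with id_cols lets rows that share
# DataYear and IndicatorID collapse into a single row.
wide_no_id <- pivot_wider(
  toy,
  id_cols     = c(DataYear, IndicatorID),
  names_from  = InputCode,
  values_from = Value
)

nrow(wide_with_id)
nrow(wide_no_id)
```

If this is what is happening, dropping the row-unique column before pivoting, or listing the intended identifiers in id_cols, should reduce the row count on the sample data.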
When I use my full data frame, I get these warning messages:
Warning messages:
1: Values in `InputName` are not uniquely identified; output will contain list-cols.
* Use `values_fn = list(InputName = list)` to suppress this warning.
* Use `values_fn = list(InputName = length)` to identify where the duplicates arise
* Use `values_fn = list(InputName = summary_fun)` to summarise duplicates
2: Values in `UnitRequired` are not uniquely identified; output will contain list-cols.
* Use `values_fn = list(UnitRequired = list)` to suppress this warning.
* Use `values_fn = list(UnitRequired = length)` to identify where the duplicates arise
* Use `values_fn = list(UnitRequired = summary_fun)` to summarise duplicates
3: Values in `Value` are not uniquely identified; output will contain list-cols.
* Use `values_fn = list(Value = list)` to suppress this warning.
* Use `values_fn = list(Value = length)` to identify where the duplicates arise
* Use `values_fn = list(Value = summary_fun)` to summarise duplicates
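The second suggestion in each warning can be sketched as follows: counting how many values land in each output cell shows which identifier combinations are duplicated. The data frame `toy` and the chosen id_cols are hypothetical and only illustrate the idea; the real data would need its own identifier columns.

```r
library(tidyr)

# Hypothetical data in which (2018, 20, InputA) appears twice
toy <- data.frame(
  DataYear    = c(2018L, 2018L, 2018L, 2017L),
  IndicatorID = c(20L, 20L, 20L, 20L),
  InputCode   = c("InputA", "InputA", "InputB", "InputA"),
  Value       = c(1, 2, 3, 4)
)

# Each cell now holds the number of input rows that map to it
counts <- pivot_wider(
  toy,
  id_cols     = c(DataYear, IndicatorID),
  names_from  = InputCode,
  values_from = Value,
  values_fn   = list(Value = length)
)

# Keep the rows where at least one cell received more than one value;
# which() drops comparisons that evaluate to NA (empty cells).
dupes <- counts[which(counts$InputA > 1 | counts$InputB > 1), ]
dupes
```

Any row returned in `dupes` marks an identifier combination that pivot_wider cannot place in a single cell without a list-column or a summary function.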
The ideal output would look something like this:
structure(list(DataYear = c(2018L, 2017L, 2016L, 2018L, 2017L,
2016L), Company = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "One", class = "factor"),
Market = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Total", class = "factor"),
Country = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "ALL", class = "factor"),
ISO = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "ALL", class = "factor"),
Sector = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Insurance", class = "factor"),
Division = c(NA, NA, NA, NA, NA, NA), Furtherdetails1 = c(NA,
NA, NA, NA, NA, NA), Furtherdetails2 = c(NA, NA, NA, NA,
NA, NA), Indicator = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("Tax Avoidance",
"Turnover"), class = "factor"), IndicatorID = c(20L, 20L,
20L, 26L, 26L, 26L), Value_InputA = c(2129000000L, 2003000000L,
1193000000L, NA, NA, NA), InputName_InputA = structure(c(2L,
2L, 2L, 1L, 1L, 1L), .Label = c("", "Profit before tax (Attributable to shareholder profit)"
), class = "factor"), UnitRequired_InputA = structure(c(2L,
2L, 2L, 1L, 1L, 1L), .Label = c("", "GBP"), class = "factor"),
Value_InputB = c(4.47e+08, 6.2e+08, 6.47e+08, 37628, 42431,
39833.44), InputName_InputB = structure(c(2L, 2L, 2L, 1L,
1L, 1L), .Label = c("Number of employees", "Tax Paid"), class = "factor"),
UnitRequired_InputB = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("#",
"GBP"), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
Any help would be greatly appreciated!
Thanks