I want to compare the words in the words column against the values in columns V1 through V576, row by row. If any word in words matches the word in a V column, the value in that V column should be replaced with 1; if there is no match, it should be replaced with 0. Any ideas on how to do this? I'm not sure how to apply it across all rows and columns.
The data frame is called Data.
The words column is a list ($ words: List of 42201); there are 42201 rows.
There are about 576 word columns to compare against (V1 through V576).
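To make the intended result concrete, here is a minimal sketch in base R on a toy data frame. The toy values and the helper name v_cols are assumptions made for illustration only, and the sketch uses a plain data.frame rather than a data.table, so it shows the desired outcome rather than a tested solution for the full data.

```r
# Toy stand-in for Data: a list column `words` plus two V columns.
# All values below are made up for illustration.
Data <- data.frame(id = c("Te-1", "Te-2", "Te-3"), stringsAsFactors = FALSE)
Data$words <- list(c("the", "best", "wash"),
                   c("this", "help", "clean"),
                   c("i", "never", "go"))
Data$V1 <- c("best", "best", "best")
Data$V2 <- c("clean", "clean", "clean")

# Pick the comparison columns by name (V1, V2, ...); v_cols is a hypothetical helper name.
v_cols <- grep("^V[0-9]+$", names(Data), value = TRUE)

# For each V column, check row by row whether its value occurs in that row's words,
# then store 1 for a match and 0 otherwise.
Data[v_cols] <- lapply(v_cols, function(col) {
  as.integer(mapply(function(v, w) v %in% w, Data[[col]], Data$words))
})

Data$V1  # 1 0 0
Data$V2  # 0 1 0
```

With 42201 rows and 576 columns, a row-wise loop like this may be slow, so a vectorised or data.table-based variant would probably be preferable.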
Here is the dput output for just the first 3 rows and first 20 columns:
structure(list(id = c("Te-1", "Te-2", "Te-3"), category = c("Fabric Care",
"Fabric Care", "Home Care"), brand = c("Tide", "Tide", "Cascade"
), sub_category = c("Laundry", "Laundry", "Auto Dishwashing"),
market = c("US", "US", "US"), review_title = c("the best in a very crowded market",
"first time", "i have been using another well known brand and did not expect "
), review_text = c("the best general wash detergent convenient container that keeps the product driy ",
"this helped to clean our washing machine after getting it from someone else this review was collected as part of a promotion ",
"i have been using another well known brand and did not expect much difference wow was i ever mistaken i will never go back "
), review_rating = c(5L, 5L, 5L), words = list(c("the", "best",
"general", "wash", "deterg", "conveni", "contain", "that",
"keep", "the", "product", "driy"), c("this", "help", "to",
"clean", "our", "wash", "machin", "after", "get", "it", "from",
"someon", "els", "this", "review", "was", "collect", "as",
"part", "of", "a", "promot"), c("i", "have", "been", "use",
"anoth", "well", "known", "brand", "and", "did", "not", "expect",
"much", "differ", "wow", "was", "i", "ever", "mistaken",
"i", "will", "never", "go", "back")), V1 = c("absolut", "absolut",
"absolut"), V2 = c("action", "action", "action"), V3 = c("actionpac",
"actionpac", "actionpac"), V4 = c("actual", "actual", "actual"
), V5 = c("addit", "addit", "addit"), V6 = c("adverti", "adverti",
"adverti"), V7 = c("afford", "afford", "afford"), V8 = c("agent",
"agent", "agent"), V9 = c("allerg", "allerg", "allerg"),
V10 = c("allergi", "allergi", "allergi"), V11 = c("alon",
"alon", "alon")), row.names = c(NA, -3L), class = c("data.table",
"data.frame"))
Please see the snippet above of what the data frame looks like, to better understand my question.
Thank you very much for your help!