I would like to run an lm regression inside a function, along the lines of this post:
    somefun <- function(vardep, varindep1, varindep2, DATA) {
      summary(lm(paste(vardep, "~", varindep1, "+", varindep2), data = DATA))
    }
Sample data:
    panelID <- 1:50
    year <- 2001:2010
    country <- c("NLD", "BEL", "GER")
    urban <- c("A", "B", "C")
    indust <- c("D", "E", "F")
    sizes <- c(1, 2, 3, 4, 5)
    n <- 2
    library(data.table)
    set.seed(123)
    DT <- data.table(panelID = rep(sample(panelID), each = n),
                     country = rep(sample(country, length(panelID), replace = TRUE), each = n),
                     year = c(replicate(length(panelID), sample(year, n))),
                     some_NA = sample(0:5, 6),
                     Factor = sample(0:5, 6),
                     industry = rep(sample(indust, length(panelID), replace = TRUE), each = n),
                     urbanisation = rep(sample(urban, length(panelID), replace = TRUE), each = n),
                     size = rep(sample(sizes, length(panelID), replace = TRUE), each = n),
                     income = round(runif(100) / 10, 2),
                     sales = round(rnorm(10, 10, 10), 2),
                     happiness = sample(10, 10),
                     Sex = round(rnorm(10, 0.75, 0.3), 2),
                     Age = sample(100, 100),
                     educ = round(rnorm(10, 0.75, 0.3), 2))
    DT[, uniqueID := .I]  # add a unique row ID
    DT <- as.data.frame(DT)

    somefun("happiness", "educ", "income", DT)
However, I would also like to be able to pass a subset to lm through the function. So I tried:
    somefun <- function(vardep, varindep1, varindep2, DATA, subset = NULL) {
      summary(lm(paste(vardep, "~", varindep1, "+", varindep2), data = DATA, subset = paste(subset)))
    }
    somefun("happiness", "educ", "income", DT, subset = (year < 2005))
    somefun("happiness", "educ", "income", DT, subset = "(year<2005)")
I even tried:
    somefun <- function(vardep, varindep1, varindep2, DATA, subset = NULL) {
      summary(lm(paste(vardep, "~", varindep1, "+", varindep2), data = DATA, subset = paste(subset, "")))
    }
    somefun("happiness", "educ", "income", DT, subset = (year < 2005))
    somefun("happiness", "educ", "income", DT, subset = "(year<2005)")
But in both cases I get:
    Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
      0 (non-NA) cases
Yet calling lm directly, outside the function, works fine:

    summary(lm(paste("happiness", "~", "educ", "+", "income"), data = DT, subset = (year > 2005)))

How can I pass a subset condition through the function?
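
For reference, here is one direction that might work. It is only a sketch under assumptions, not a confirmed fix. Since paste() always returns a character vector, lm ends up receiving text rather than a logical row filter. The variant below assumes the condition is passed as a string, that DATA is a plain data.frame, and that filtering the rows before calling lm is acceptable. The `toy` data frame and the names `f`, `keep`, `res_all` and `res_sub` are made up for illustration.

```r
# Hypothetical variant of somefun(): `subset` is a string holding an R condition.
somefun <- function(vardep, varindep1, varindep2, DATA, subset = NULL) {
  f <- as.formula(paste(vardep, "~", varindep1, "+", varindep2))
  if (!is.null(subset)) {
    # Evaluate the condition with the columns of DATA in scope.
    keep <- eval(parse(text = subset), DATA)
    # Rows where the condition is NA are dropped.
    DATA <- DATA[!is.na(keep) & keep, , drop = FALSE]
  }
  summary(lm(f, data = DATA))
}

# Toy data (made up for illustration, not the DT from the question).
set.seed(1)
toy <- data.frame(year = rep(2001:2010, each = 5),
                  educ = rnorm(50),
                  income = runif(50))
toy$happiness <- 1 + 2 * toy$educ + rnorm(50)

res_all <- somefun("happiness", "educ", "income", toy)
res_sub <- somefun("happiness", "educ", "income", toy, subset = "year < 2005")
```

With the data from the question, the call would then look like somefun("happiness", "educ", "income", DT, subset = "year < 2005"). Note that the condition has to be quoted: an unquoted year < 2005 would be evaluated against the global year vector rather than the year column of DT.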