I wrote a loop that tests three different subsets for three different values of k, but for my last value of k (k = 5) I get an error that I can't understand and don't know how to fix.
library(MASS)   # provides the Boston data set
library(class)  # provides knn.cv()
boston <- data.frame(Boston)
boston1 <- data.frame(Boston)
boston$medv01 <- ifelse((boston$medv > median(boston$medv)), 1, 0)
boston$medv <- NULL
# Subset 1: zn-2, chas-4, rm-6, dis-8, black-12
# Subset 2: crim-1, indus-3, nox-5, age-7, tax-10, ptratio-11
# Subset 3: all
kvals <- c(1,3,5)
subset1 <- c("zn", "chas", "rm", "dis", "black")
subset2 <- c("crim", "indus", "nox", "age", "tax", "ptratio")
subset3 <- c(boston[,1:13])
x1.train <- boston[, c(subset1)]
x2.train <- boston[, c(subset2)]
x3.train <- boston[, 1:13]
y.train <- boston$medv01
xtrain.list <- list(x1.train, x2.train, x3.train)
mean(knn.cv.pred <- knn.cv(xtrain.list[[2]], y.train, k = 5) != y.train)
for (j in kvals) {
  message("~~~~ K = ", j, " ~~~~")
  for (s in seq_along(xtrain.list)) {
    knn.cv.pred <- knn.cv(xtrain.list[[s]],
                          y.train,
                          k = kvals[j])
    message("Subset ", s, " K = ", j, " Error: ", mean(knn.cv.pred != y.train)*100, "%")
  }
  message("\n")
}
The error I get is
~~~~ K = 1 ~~~~
Subset 1 K = 1 Error: 29.2490118577075%
Subset 2 K = 1 Error: 24.1106719367589%
Subset 3 K = 1 Error: 19.3675889328063%
~~~~ K = 3 ~~~~
Subset 1 K = 3 Error: 25.6916996047431%
Subset 2 K = 3 Error: 24.1106719367589%
Subset 3 K = 3 Error: 17.5889328063241%
~~~~ K = 5 ~~~~
Error in if (ntr - 1 < k) { : missing value where TRUE/FALSE needed
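
A possible lead on the cause, offered as a sketch rather than a confirmed diagnosis: the loop iterates over the values in kvals (1, 3, 5) but then uses each value as a position via kvals[j]. The base-R snippet below reproduces only that indexing step. The name used_as_index is made up for illustration, and the snippet does not call knn.cv at all.

```r
kvals <- c(1, 3, 5)

# `for (j in kvals)` makes j take the values 1, 3, 5 (not the positions 1, 2, 3).
for (j in kvals) {
  # kvals[j] then treats that value as a position in a length-3 vector.
  message("j = ", j, "  kvals[j] = ", kvals[j])
}

# The same lookup done in one step (hypothetical helper name):
used_as_index <- kvals[kvals]
print(used_as_index)  # position 5 does not exist, so the last element is NA
```

If this is what is happening, the run labelled K = 3 would actually have used k = 5, and the run labelled K = 5 would have passed k = NA, which would fit a "missing value where TRUE/FALSE needed" error from a comparison against k. One way to check would be to pass k = j directly, or to loop with seq_along(kvals) and keep kvals[j].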