KNN using a cross-validation function - PullRequest
0 votes
/ May 28, 2019

I need to run R code with the number of folds set to 1 and k = c(1:12), but the following warnings were shown:

> warnings()
Mensagens de aviso:
1: model fit failed for Fold1.Rep1: k= 1 Error in x[1, 1] : subscript out of bounds

2: model fit failed for Fold1.Rep1: k= 2 Error in x[1, 1] : subscript out of bounds

3: model fit failed for Fold1.Rep1: k= 3 Error in x[1, 1] : subscript out of bounds

. . .

12: model fit failed for Fold1.Rep1: k=12 Error in x[1, 1] : subscript out of bounds

13: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,  ... :
  There were missing values in resampled performance measures.

This is the R code, which uses the caret package.

library(MASS)   # assumed source of the biopsy data set
library(caret)

biopsy_ <- na.omit(biopsy[, -1])  # drop the ID column and rows with NA

# number = 1 (a single fold) is the setting in question
ctrl <- trainControl(method = "repeatedcv", number = 1, repeats = 1)
nn_grid <- expand.grid(k = 1:12)
nn_grid

best_knn <- train(class ~ ., data = biopsy_,
                  method = "knn",
                  trControl = ctrl,
                  preProcess = c("center", "scale"),  # standardize
                  tuneGrid = nn_grid)
print(best_knn)

1 Answer

0 votes
/ May 28, 2019

Try this.

grid <- expand.grid(k = 1:12)

{
  set.seed(1)

  index <- caret::createDataPartition(biopsy_$class, p = 0.75, list = FALSE) # train/test partition

  train <- biopsy_[index, ]
  test  <- biopsy_[-index, ]

  ctrl <- caret::trainControl(method  = "repeatedcv", 
                              number  = 10, # 10 folds instead of 1
                              repeats = 10  # repeat the cross-validation 10 times
                              )  

  model <- caret::train(class~., 
                        data = train, 
                        method = "knn",
                        trControl = ctrl,
                        preProcess = c("center","scale"),
                        tuneGrid = grid)
}
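
Why the fold count matters can be sketched in base R. The helper below is hypothetical (it is not caret's own code, and the name `train_rows_per_fold` is made up for illustration): it assigns rows to k folds and lists the rows left for training when each fold is held out. Under that assumption, a single fold holds out every row and leaves nothing to fit on, which would be consistent with the "subscript out of bounds" failures in the question.

```r
# Hypothetical helper, not part of caret: for each of k folds,
# return the row indices that remain for training when that fold is held out.
train_rows_per_fold <- function(n, k) {
  fold_id <- rep(seq_len(k), length.out = n)            # fold label for every row
  lapply(seq_len(k), function(f) which(fold_id != f))   # rows outside fold f
}

one_fold   <- lengths(train_rows_per_fold(n = 20, k = 1))  # training rows with 1 fold
five_folds <- lengths(train_rows_per_fold(n = 20, k = 5))  # training rows with 5 folds

print(one_fold)
print(five_folds)
```

With k = 1 the training set is empty, so any value of `number` from 2 upward avoids that situation; 10 is simply a common choice.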

# plot(model)
# model$bestTune # best k

# predictions <- predict(model, newdata = test)
# caret::confusionMatrix(predictions, test$class)  # class is a factor, so use a classification metric rather than RMSE
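
Because `class` is a factor, predictions on the held-out `test` set are judged with classification metrics. A minimal base-R sketch of that check follows. The helper name `evaluate_classes` and the toy labels are assumptions made up for illustration; they are not results from the biopsy data.

```r
# Hypothetical helper: compare predicted class labels with the true ones.
evaluate_classes <- function(predicted, actual) {
  stopifnot(length(predicted) == length(actual))
  list(
    confusion = table(predicted = predicted, actual = actual),  # counts per label pair
    accuracy  = mean(predicted == actual)                       # share of correct labels
  )
}

# Toy labels, invented purely for illustration.
lv        <- c("benign", "malignant")
actual    <- factor(c("benign", "benign", "malignant", "malignant"), levels = lv)
predicted <- factor(c("benign", "malignant", "malignant", "malignant"), levels = lv)

res <- evaluate_classes(predicted, actual)
print(res$confusion)
print(res$accuracy)
```

With a fitted model, `predicted` would come from `predict(model, newdata = test)` and `actual` from `test$class`.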
...