This is my first question, so it feels a little awkward.
I am working with the Airbnb data, and the dependent variable is `price`.
Here are the column types:
> glimpse(air.df)
Rows: 56,092
Columns: 16
$ id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,...
$ name <chr> "Clean & quiet apt home by the park", "Skylit Midtown Cast...
$ host_id <chr> "2787", "2845", "4632", "4869", "7192", "7322", "7356", "8...
$ host_name <fct> John, Jennifer, Elisabeth, LisaRoxanne, Laura, Chris, Garo...
$ neighbourhood_group <fct> Brooklyn, Manhattan, Manhattan, Brooklyn, Manhattan, Manha...
$ neighbourhood <fct> Kensington, Midtown, Harlem, Clinton Hill, East Harlem, Mu...
$ latitude <dbl> 40.64749, 40.75362, 40.80902, 40.68514, 40.79851, 40.74767...
$ longitude <dbl> -73.97237, -73.98377, -73.94190, -73.95976, -73.94399, -73...
$ room_type <fct> Private room, Entire home/apt, Private room, Entire home/a...
$ price <dbl> 149, 225, 150, 89, 80, 200, 60, 79, 79, 150, 135, 85, 40, ...
$ minimum_nights <int> 1, 1, 3, 1, 10, 3, 45, 2, 2, 1, 5, 2, -73, 0, 2, 90, 2, 2,...
$ number_of_reviews <int> 9, 45, 0, 270, 9, 74, 49, 430, 118, 160, 53, 188, NA, 0, 1...
$ last_review <date> 2018-10-19, 2019-05-21, NA, 2019-07-05, 2018-11-19, 2019-...
$ reviews_per_month <dbl> 0.21, 0.38, 0.00, 4.64, 0.10, 0.59, 0.40, 3.47, 0.99, 1.33...
$ calculated_host_listings_count <int> 6, 2, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 167, 0, 1, 1, 1, 1, 1,...
$ availability_365 <int> 365, 355, 365, 194, 0, 129, 0, 220, 0, 188, 6, 39, NA, 0, ...
Here is the problem:
> fit_rf <- randomForest(price ~ .,
+ data = air.df_train,
+ importance = TRUE,
+ proximity = TRUE,
+ na.action = na.roughfix)
Error in na.roughfix.data.frame(list(price = c(100L, 195L, 100L, 245L, :
na.roughfix only works for numeric or factor
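
Judging from the `glimpse()` output above, a likely cause is that `name` and `host_id` are `<chr>` and `last_review` is `<date>`, while `na.roughfix` accepts only numeric or factor columns. Below is a minimal sketch of how one might confirm and work around this. The data frame `toy` and its values are hypothetical stand-ins for `air.df_train`, and dropping `name`/`host_id` and converting the date to a day count are assumptions about what is acceptable for the model, not the only option.

```r
# Hypothetical stand-in for air.df_train; only the column types matter here.
toy <- data.frame(
  name        = c("Apt A", "Apt B", "Apt C", "Apt D", "Apt E", "Apt F"),
  host_id     = c("2787", "2845", "4632", "4869", "7192", "7322"),
  room_type   = factor(c("Private room", "Entire home/apt", "Private room",
                         "Entire home/apt", "Private room", "Entire home/apt")),
  last_review = as.Date(c("2018-10-19", "2019-05-21", NA,
                          "2019-07-05", "2018-11-19", "2019-06-22")),
  number_of_reviews = c(9L, 45L, 0L, 270L, NA, 74L),
  price       = c(149, 225, 150, 89, 80, 200),
  stringsAsFactors = FALSE
)

# Find the columns that are neither numeric nor factor.
is_ok <- vapply(toy, function(col) is.numeric(col) || is.factor(col), logical(1))
bad_cols <- names(toy)[!is_ok]
print(bad_cols)  # "name" "host_id" "last_review"

# One possible fix (an assumption): drop free-text / identifier columns
# and turn the date into a number of days since 1970-01-01.
fixed <- toy
fixed$name <- NULL
fixed$host_id <- NULL
fixed$last_review <- as.numeric(fixed$last_review)

# Fit only if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit_rf <- randomForest::randomForest(price ~ .,
                                       data = fixed,
                                       importance = TRUE,
                                       na.action = randomForest::na.roughfix)
}
```

On the real data, note that `randomForest` also refuses factor predictors with more than 53 levels, so high-cardinality factors such as `host_name` or `neighbourhood` may need to be dropped or recoded as well.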