Я пытаюсь запустить обобщенную линейную модель со своими данными и получаю сообщение об ошибке «Ошибка в maxkappa.train [1, 2]: неверное количество измерений». Я не уверен, почему я получаю эту ошибку и почему выходные данные для моих моделей-кандидатов оказываются пустыми.
str(ChallengeEvents_Daily_BySpp)
'data.frame': 2190 obs. of 14 variables:
$ Date : Factor w/ 1095 levels "2017-01-01","2017-01-02",..: 1 1 2 2 3 3 4 4 5 5 ...
$ Species : Factor w/ 2 levels "PDFH","SCBC": 1 2 1 2 1 2 1 2 1 2 ...
$ N.unique : int 0 0 0 0 0 0 0 0 0 0 ...
$ Challenger : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Season : Factor w/ 4 levels "FAL","SPR","SUM",..: 4 4 4 4 4 4 4 4 4 4 ...
$ Temp : num 0.127 0.127 0.281 0.281 0.322 ...
$ Hydraulic.head.m: num 2.69 2.69 2.69 2.69 2.74 ...
$ D.lock.n : int 0 0 0 0 0 0 0 0 0 0 ...
$ U.lock.n : int 0 0 0 0 0 0 0 0 0 0 ...
$ Rec.D.n : int 0 0 0 0 0 0 0 0 0 0 ...
$ Rec.U.n : int 0 0 0 0 0 0 0 0 0 0 ...
$ Barge.D.n : int 0 0 0 0 0 0 0 0 0 0 ...
$ Barge.U.n : int 0 0 0 0 0 0 0 0 0 0 ...
$ Year : Factor w/ 3 levels "2017","2018",..: 1 1 1 1 1 1 1 1 1 1 ...
Код, который я использую для запуска различных моделей.
test.models <- list(
m0.form <- formula(Challenger ~ Temp + Hydraulic.head.m + Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n + Year + Season),
m1.form <- formula(Challenger ~ Barge.D.n),
m2.form <- formula(Challenger ~ Barge.D.n + Hydraulic.head.m),
m3.form <- formula(Challenger ~ Barge.D.n + Temp),
m4.form <- formula(Challenger ~ Barge.D.n + Season),
m5.form <- formula(Challenger ~ Barge.D.n + Year),
m6.form <- formula(Challenger ~ Barge.D.n + Hydraulic.head.m + Temp),
m7.form <- formula(Challenger ~ Barge.D.n + Temp + Season),
m8.form <- formula(Challenger ~ Barge.D.n + Hydraulic.head.m + Temp + Year),
m9.form <- formula(Challenger ~ Barge.U.n),
m10.form <- formula(Challenger ~ Barge.U.n + Hydraulic.head.m),
m11.form <- formula(Challenger ~ Barge.U.n + Temp),
m12.form <- formula(Challenger ~ Barge.U.n + Season),
m13.form <- formula(Challenger ~ Barge.U.n + Year),
m14.form <- formula(Challenger ~ Barge.U.n + Hydraulic.head.m + Temp),
m15.form <- formula(Challenger ~ Barge.U.n + Temp + Season),
m16.form <- formula(Challenger ~ Barge.U.n + Hydraulic.head.m + Temp + Year),
m17.form <- formula(Challenger ~ Rec.D.n),
m18.form <- formula(Challenger ~ Rec.D.n + Hydraulic.head.m),
m19.form <- formula(Challenger ~ Rec.D.n + Temp),
m20.form <- formula(Challenger ~ Rec.D.n + Season),
m21.form <- formula(Challenger ~ Rec.D.n + Year),
m22.form <- formula(Challenger ~ Rec.D.n + Hydraulic.head.m + Temp),
m23.form <- formula(Challenger ~ Rec.D.n + Temp + Season),
m24.form <- formula(Challenger ~ Rec.D.n + Hydraulic.head.m + Temp + Year),
m25.form <- formula(Challenger ~ Rec.U.n),
m26.form <- formula(Challenger ~ Rec.U.n + Hydraulic.head.m),
m27.form <- formula(Challenger ~ Rec.U.n + Temp),
m28.form <- formula(Challenger ~ Rec.U.n + Season),
m29.form <- formula(Challenger ~ Rec.U.n + Year),
m30.form <- formula(Challenger ~ Rec.U.n + Hydraulic.head.m + Temp),
m31.form <- formula(Challenger ~ Rec.U.n + Temp + Season),
m32.form <- formula(Challenger ~ Rec.U.n + Hydraulic.head.m + Temp + Year),
m33.form <- formula(Challenger ~ Year),
m34.form <- formula(Challenger ~ Season),
m35.form <- formula(Challenger ~ Temp),
m36.form <- formula(Challenger ~ Hydraulic.head.m),
m37.form <- formula(Challenger ~ Barge.D.n + Barge.U.n),
m38.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Hydraulic.head.m),
m39.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Temp),
m40.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Season),
m41.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Year),
m42.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Hydraulic.head.m + Temp),
m43.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Temp + Season),
m44.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Hydraulic.head.m + Temp + Year),
m45.form <- formula(Challenger ~ Rec.D.n + Rec.U.n),
m46.form <- formula(Challenger ~ Rec.D.n + Rec.U.n + Hydraulic.head.m),
m47.form <- formula(Challenger ~ Rec.D.n + Rec.U.n + Temp),
m48.form <- formula(Challenger ~ Rec.D.n + Rec.U.n + Season),
m49.form <- formula(Challenger ~ Rec.D.n + Rec.U.n + Year),
m50.form <- formula(Challenger ~ Rec.D.n + Rec.U.n + Hydraulic.head.m + Temp),
m51.form <- formula(Challenger ~ Rec.D.n + Rec.U.n + Temp + Season),
m52.form <- formula(Challenger ~ Rec.D.n + Rec.U.n + Hydraulic.head.m + Temp + Year),
m53.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n),
m54.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n + Hydraulic.head.m),
m55.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n + Temp),
m56.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n + Season),
m57.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n + Year),
m58.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n + Hydraulic.head.m + Temp),
m59.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n + Temp + Season),
m60.form <- formula(Challenger ~ Barge.D.n + Barge.U.n + Rec.D.n + Rec.U.n + Hydraulic.head.m + Temp + Year),
m61.form <- formula(Challenger ~ Barge.D.n + Rec.D.n),
m62.form <- formula(Challenger ~ Barge.D.n + Rec.D.n + Hydraulic.head.m),
m63.form <- formula(Challenger ~ Barge.D.n + Rec.D.n + Temp),
m64.form <- formula(Challenger ~ Barge.D.n + Rec.D.n + Season),
m65.form <- formula(Challenger ~ Barge.D.n + Rec.D.n + Year),
m66.form <- formula(Challenger ~ Barge.D.n + Rec.D.n + Hydraulic.head.m + Temp),
m67.form <- formula(Challenger ~ Barge.D.n + Rec.D.n + Temp + Season),
m68.form <- formula(Challenger ~ Barge.D.n + Rec.D.n + Hydraulic.head.m + Temp + Year),
m69.form <- formula(Challenger ~ Barge.U.n + Rec.U.n + (1|TRANSMITTERID)),
m70.form <- formula(Challenger ~ Barge.U.n + Rec.U.n + Hydraulic.head.m),
m71.form <- formula(Challenger ~ Barge.U.n + Rec.U.n + Temp),
m72.form <- formula(Challenger ~ Barge.U.n + Rec.U.n + Season),
m73.form <- formula(Challenger ~ Barge.U.n + Rec.U.n + Year),
m74.form <- formula(Challenger ~ Barge.U.n + Rec.U.n + Hydraulic.head.m + Temp),
m75.form <- formula(Challenger ~ Barge.U.n + Rec.U.n + Temp + Season),
m76.form <- formula(Challenger ~ Barge.U.n + Rec.U.n + Hydraulic.head.m + Temp + Year))
CompareModels <- list()
for(i in 1:length(spp)){ # begin loop through species
spp.i <- spp[i]
df.i <- ChallengeEvents_Daily_BySpp[ChallengeEvents_Daily_BySpp$Species == spp.i, ]
# Randomly sample in 80/20 split
df.pos <- df.i[df.i$Challenger == 1, ]
pos.random <- sample(nrow(df.pos), nrow(df.pos), replace = FALSE)
pos.cut <- floor(length(pos.random)*.8)
df.pos.train <- df.pos[pos.random[c(1:pos.cut)], ]
df.pos.test <- df.pos[pos.random[c((pos.cut+1):length(pos.random))], ]
df.neg <- df.i[df.i$Challenger == 0, ]
neg.random <- sample(nrow(df.neg), nrow(df.neg), replace = FALSE)
#neg.random <- sample(nrow(df.neg), nrow(df.pos), replace = FALSE) # to balance pos/neg challenge
neg.cut <- floor(length(neg.random)*.8)
df.neg.train <- df.neg[neg.random[c(1:neg.cut)], ]
df.neg.test <- df.neg[neg.random[c((neg.cut + 1):length(neg.random))], ]
df.i.test <- rbind(df.neg.test, df.pos.test)
df.i.train <- rbind(df.neg.train, df.pos.train)
zz <- file(paste0(out.dir, 'ChallengeEventModels/', 'Challenge_Models_', spp.i, '.txt'), open = 'wt')
sink(zz, type = "message")
sink(zz, type = "output")
for(j in 1:length(test.models)){ # begin loop through models
name.i.j <- paste0(spp.i, "_m", j)
model.j <- test.models[[j]]
glm.j <-glm(model.j,
data = df.i.train, family = binomial)
print(name.i.j)
print(model.j)
print(summary(glm.j))
print(confint(glm.j))
#AUC for global model in model selection table
p.train <- predict(glm.j, df.i.train, type="response")
pr.train <- prediction(p.train, df.i.train$Challenger)
auc.train <- performance(pr.train, measure = "auc")
auc.train <- auc.train@y.values[[1]]
maxkappa.train <- ecospat.max.kappa(p.train, df.i.train$Challenger)[[2]]
#AUC for global model in model selection table
p.test <- predict(glm.j, df.i.test, type="response")
pr.test <- prediction(p.test, df.i.test$Challenger)
auc.test <- performance(pr.test, measure = "auc")
auc.test <- auc.test@y.values[[1]]
maxkappa.test <- ecospat.max.kappa(p.test, df.i.test$Challenger)[[2]]
CompareModels[[name.i.j]] <- data.frame(Species = spp.i,
Model = as.character(model.j)[3],
AIC = AIC(glm.j),
converged = glm.j$converged,
boundary = glm.j$boundary,
AUC.train = auc.train,
AUC.test = auc.test,
maxKappa.train = maxkappa.train[1,2],
Kappacutoff.train = maxkappa.train[2,2],
maxKappa.test = maxkappa.test[1,2],
Kappacutoff.test = maxkappa.test[2,2])
} #end models
sink()
sink()
} #end species
CompareModels <- do.call(rbind, CompareModels)
Это единственный вывод, который я получаю. Я должен получить 152 модели (76 для одного вида и 76 для другого вида).
[1] "SCBC_m1"
Challenger ~ Temp + Hydraulic.head.m + Barge.D.n + Barge.U.n +
Rec.D.n + Rec.U.n + Year + Season
Call:
glm(formula = model.j, family = binomial, data = df.i.train)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.22549 -0.32947 -0.08226 -0.00009 2.78997
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.908158 1.444232 -4.091 4.3e-05 ***
Temp 0.206311 0.060722 3.398 0.00068 ***
Hydraulic.head.m -0.830706 0.203861 -4.075 4.6e-05 ***
Barge.D.n -0.099814 0.051126 -1.952 0.05090 .
Barge.U.n -0.020138 0.044261 -0.455 0.64912
Rec.D.n 0.006503 0.152140 0.043 0.96591
Rec.U.n 0.218535 0.180073 1.214 0.22490
Year2018 -1.207097 0.433933 -2.782 0.00541 **
Year2019 -1.150210 0.450021 -2.556 0.01059 *
SeasonSPR 2.058405 1.086584 1.894 0.05817 .
SeasonSUM 2.555601 1.079296 2.368 0.01789 *
SeasonWINT -10.694876 721.821804 -0.015 0.98818
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 432.14 on 874 degrees of freedom
Residual deviance: 285.54 on 863 degrees of freedom
AIC: 309.54
Number of Fisher Scoring iterations: 18
Waiting for profiling to be done...
2.5 % 97.5 %
(Intercept) -9.32226029 -3.402828646
Temp 0.09406931 0.332836137
Hydraulic.head.m -1.24665216 -0.444214806
Barge.D.n -0.20263859 -0.001542306
Barge.U.n -0.10886309 0.065337244
Rec.D.n -0.30888543 0.294943423
Rec.U.n -0.13895160 0.571089446
Year2018 -2.08842313 -0.378075754
Year2019 -2.05408570 -0.282466826
SeasonSPR 0.30546815 5.007097435
SeasonSUM 0.83911673 5.498842642
SeasonWINT -262.95844145 17.513941454
Error in maxkappa.train[1, 2] : incorrect number of dimensions
In addition: There were 50 or more warnings (use warnings() to see the first 50)