I am trying to run this simple Naive Bayes classifier in PySpark:
from pyspark.ml.classification import NaiveBayes

nb = NaiveBayes(modelType='multinomial', smoothing=0.1)
model = nb.fit(dataset_test)
pred_nb = model.transform(dataset_test)
When I run the code above, I get this error:
Py4JJavaError: An error occurred while calling o480.fit.
: java.util.NoSuchElementException: next on empty iterator
at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:63)
at scala.collection.IterableLike$class.head(IterableLike.scala:107)
at scala.collection.mutable.ArrayOps$ofRef.scala$collection$IndexedSeqOptimized$$super$head(ArrayOps.scala:186)
at scala.collection.IndexedSeqOptimized$class.head(IndexedSeqOptimized.scala:126)
at scala.collection.mutable.ArrayOps$ofRef.head(ArrayOps.scala:186)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2552)
at org.apache.spark.ml.classification.NaiveBayes$$anonfun$trainWithLabelCheck$1.apply(NaiveBayes.scala:156)
at org.apache.spark.ml.classification.NaiveBayes$$anonfun$trainWithLabelCheck$1.apply(NaiveBayes.scala:129)
at org.apache.spark.ml.util.Instrumentation$$anonfun$11.apply(Instrumentation.scala:183)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:183)
at org.apache.spark.ml.classification.NaiveBayes.trainWithLabelCheck(NaiveBayes.scala:129)
at org.apache.spark.ml.classification.NaiveBayes.train(NaiveBayes.scala:118)
at org.apache.spark.ml.classification.NaiveBayes.train(NaiveBayes.scala:78)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Unknown Source)
- What does this error mean?
- Is it possible to fix it?