Surprise NMF throws ZeroDivisionError: float division
February 10, 2020

I am trying to build a basic recommendation system. For this I am using the NMF model from Surprise.

Here is my dataset right before I start working with NMF:

    store_id    item_id     quantity
0   62693933    912003029   3.000
1   62693933    912003034   4.000
2   62693933    913003004   1.000
3   62693933    913050001   2.024
4   62693933    913163001   11.838
...     ...     ...     ...
353843  101931000   4140870025  9.000
353844  101931000   19136680005     3.000
353845  101931000   50012447358     3.000
353846  101931000   51010204669     3.000
353847  101931000   51010208567     3.000

353848 rows × 3 columns

After that, I run the code below to prepare this dataset for training the model:

import surprise
from surprise import NMF

min_quantity = df.quantity.min()
max_quantity = df.quantity.max()

reader = surprise.Reader(
    rating_scale=(min_quantity, max_quantity)
)

surprise_df = surprise.Dataset.load_from_df(df, reader)

surprise_trainset = surprise_df.build_full_trainset()

After these steps, the code below raises an error:

model = NMF().fit(surprise_trainset)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-73-4f2929f79206> in <module>
----> 1 model = NMF().fit(surprise_trainset)

/usr/local/lib/python3.6/dist-packages/surprise/prediction_algorithms/matrix_factorization.pyx in surprise.prediction_algorithms.matrix_factorization.NMF.fit()

/usr/local/lib/python3.6/dist-packages/surprise/prediction_algorithms/matrix_factorization.pyx in surprise.prediction_algorithms.matrix_factorization.NMF.sgd()

ZeroDivisionError: float division

This system used to work fine, so I assume the problem is with the dataset, but I could not figure out what is causing it. I checked for null and NaN values, etc. None of the values are null, but there are zeros in the quantity (rating) column.
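The checks described above can be sketched with pandas. This is only an illustration: the small DataFrame is a made-up stand-in for the real data, and `all_zero_ids` is a hypothetical helper. Whether stores or items whose quantities are all zero are actually what triggers the ZeroDivisionError is an assumption, not something the traceback confirms.

```python
import pandas as pd

# Made-up miniature of the DataFrame from the question (same column names).
df = pd.DataFrame({
    "store_id": [1, 1, 2, 2, 3],
    "item_id": [10, 11, 10, 12, 12],
    "quantity": [3.0, 0.0, 1.5, 0.0, 0.0],
})

# Missing values per column (the null/NaN check mentioned in the question).
print(df.isna().sum())

# Number of ratings that are exactly zero.
n_zero = int((df["quantity"] == 0).sum())
print("zero quantities:", n_zero)


def all_zero_ids(frame, key):
    """Hypothetical helper: ids in column `key` whose quantities are all zero."""
    max_abs = frame["quantity"].abs().groupby(frame[key]).max()
    return max_abs[max_abs == 0].index.tolist()


zero_stores = all_zero_ids(df, "store_id")
zero_items = all_zero_ids(df, "item_id")
print("stores with only zero quantities:", zero_stores)
print("items with only zero quantities:", zero_items)
```

If either list is non-empty on the real data, dropping those rows before building the trainset and refitting would be one way to test the assumption.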

I would be glad if anyone has an idea of what could be causing this error. I can provide more information about the dataset if needed.

I am not sure this is the right way to share it, but here is a data sample you can use. You can save it as JSON and read it with pandas:

'{"store_id":{"5000":62693933,"5001":62693933,"5002":62693933,"5003":62693933,"5004":62693933,"5005":62693933,"5006":62693933,"5007":62693933,"5008":62693933,"5009":62693933,"5010":62693933,"5011":62693933,"5012":62693933,"5013":62693933,"5014":62693933,"5015":62693933,"5016":62693933,"5017":62693933,"5018":62693933,"5019":62693933,"5020":62693933,"5021":62693933,"5022":62693933,"5023":62693933,"5024":62693933,"5025":62693933,"5026":62693933,"5027":62693933,"5028":62693933,"5029":62693933,"5030":62693933,"5031":62693933,"5032":62693933,"5033":62693933,"5034":62693933,"5035":62693933,"5036":62693933,"5037":62693933,"5038":62693933,"5039":62693933,"5040":62693933,"5041":62693933,"5042":62693933,"5043":62693933,"5044":62693933,"5045":62693933,"5046":62693933,"5047":62693933,"5048":62693933,"5049":62693933,"5050":62693933,"5051":62693933,"5052":62693933,"5053":62693933,"5054":62693933,"5055":62693933,"5056":62693933,"5057":62693933,"5058":62693933,"5059":62693933,"5060":62693933,"5061":62693933,"5062":62693933,"5063":62693933,"5064":62693933,"5065":62693933,"5066":62693933,"5067":62693933,"5068":62693933,"5069":62693933,"5070":62693933,"5071":62693933,"5072":62693933,"5073":62693933,"5074":62693933,"5075":62693933,"5076":62693933,"5077":62693933,"5078":62693933,"5079":62693933,"5080":62693933,"5081":62693933,"5082":62693933,"5083":62693933,"5084":62693933,"5085":62693933,"5086":62693933,"5087":62693933,"5088":62693933,"5089":62693933,"5090":62693933,"5091":62693933,"5092":62693933,"5093":62693933,"5094":62693933,"5095":62693933,"5096":62693933,"5097":62693933,"5098":62693933,"5099":62693933},"item_id":{"5000":9060030036,"5001":9060710006,"5002":9060710013,"5003":9072080032,"5004":9072080034,"5005":9079990002,"5006":9081700008,"5007":9090080014,"5008":9090080018,"5009":9092110006,"5010":9092110014,"5011":9092110027,"5012":9100500001,"5013":9106660004,"5014":9122110013,"5015":9126660006,"5016":9130030022,"5017":9140130013,"5018":9141350009,"5019":9141350021,"5020":9141350038,"5021":9148340005,"5022":9148340009,"5023":9148340010,"5024":9148340011,"5025":9148340024,"5026":9151350008,"5027":9156650003,"5028":9163200003,"5029":9163200039,"5030":9163200053,"5031":9163200058,"5032":9190020003,"5033":9200020021,"5034":9212260008,"5035":9220020005,"5036":9220320027,"5037":9240020005,"5038":9240030015,"5039":9240710006,"5040":9244340002,"5041":9250030005,"5042":9252180012,"5043":9252180017,"5044":9290710002,"5045":9300320002,"5046":9310710004,"5047":9331350015,"5048":9336650002,"5049":9353790001,"5050":10022180033,"5051":10072180005,"5052":10072180011,"5053":10119830001,"5054":10119830005,"5055":10119830011,"5056":10119830013,"5057":10122360001,"5058":10911080003,"5059":11040140001,"5060":11050140003,"5061":11051690003,"5062":11051690004,"5063":11061310012,"5064":11062030023,"5065":11062030040,"5066":12010740022,"5067":12010740023,"5068":12011310001,"5069":12011310002,"5070":12011310008,"5071":12011940008,"5072":12011940087,"5073":12011940100,"5074":12020580003,"5075":12021940010,"5076":12021940032,"5077":12021940058,"5078":12021940083,"5079":12030150007,"5080":12030150008,"5081":12032170013,"5082":12051310001,"5083":13011940007,"5084":13030230057,"5085":13030230059,"5086":13030230063,"5087":13030230079,"5088":13030230080,"5089":13030230089,"5090":13030580003,"5091":13030740054,"5092":13030740055,"5093":13031230002,"5094":13031230004,"5095":13032510029,"5096":13041330011,"5097":13041330017,"5098":13041940042,"5099":13042040002},"quantity":{"5000":0.0,"5001":0.0,"5002":0.0,"5003":0.0,"5004":0.0,"5005":0.0,"5006":0.0,"5007":0.0,"5008":0.0,"5009":0.0,"5010":0.0,"5011":0.0,"5012":0.0,"5013":0.0,"5014":0.0,"5015":0.0,"5016":0.0,"5017":0.0,"5018":0.0,"5019":0.0,"5020":0.0,"5021":0.0,"5022":0.0,"5023":0.0,"5024":0.0,"5025":0.0,"5026":0.0,"5027":0.0,"5028":0.0,"5029":0.0,"5030":0.0,"5031":0.0,"5032":0.0,"5033":0.0,"5034":0.0,"5035":0.0,"5036":0.0,"5037":0.0,"5038":0.0,"5039":0.0,"5040":0.0,"5041":0.0,"5042":0.0,"5043":0.0,"5044":0.0,"5045":0.0,"5046":0.0,"5047":0.0,"5048":0.0,"5049":0.0,"5050":0.0,"5051":0.0,"5052":0.0,"5053":0.0,"5054":0.0,"5055":0.0,"5056":0.0,"5057":0.0,"5058":0.0,"5059":0.0,"5060":0.0,"5061":0.0,"5062":0.0,"5063":0.0,"5064":0.0,"5065":0.0,"5066":0.0,"5067":0.0,"5068":0.0,"5069":0.0,"5070":0.0,"5071":0.0,"5072":0.0,"5073":0.0,"5074":0.0,"5075":0.0,"5076":0.0,"5077":0.0,"5078":0.0,"5079":0.0,"5080":0.0,"5081":0.0,"5082":0.0,"5083":0.0,"5084":0.0,"5085":0.0,"5086":0.0,"5087":0.0,"5088":0.0,"5089":0.0,"5090":0.0,"5091":0.0,"5092":0.0,"5093":0.0,"5094":0.0,"5095":0.0,"5096":0.0,"5097":0.0,"5098":0.0,"5099":0.0}}'
...
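For anyone who wants to load the sample, here is a minimal sketch of reading a JSON string in this layout with pandas. The `sample_json` variable is a stand-in holding only the first three rows of the sample, copied from the string above; paste the full string in its place to load all of it. Wrapping the string in `io.StringIO` is a choice made here because recent pandas versions deprecate passing a literal JSON string to `read_json`.

```python
import io

import pandas as pd

# Stand-in: only the first three rows of the sample from the question.
sample_json = (
    '{"store_id":{"5000":62693933,"5001":62693933,"5002":62693933},'
    '"item_id":{"5000":9060030036,"5001":9060710006,"5002":9060710013},'
    '"quantity":{"5000":0.0,"5001":0.0,"5002":0.0}}'
)

# The default orient for a DataFrame is "columns": {column: {index: value}}.
sample_df = pd.read_json(io.StringIO(sample_json))

print(sample_df)
print(sample_df.dtypes)

# Every quantity in this excerpt is zero, as in the sample itself.
print((sample_df["quantity"] == 0).all())
```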