Error when using doc2vec
June 26, 2018

I'm trying to generate vectors from a list of sentences.

x1 = 'Today I’d like to start a series of some posts concerning extreme value analysis using R.'
x2 = 'Basically, there are several very useful packages in R which provide methods and functions for extreme value analysis. Information on different software (including all relevant R packages) for extreme value analysis can of course be found at the R Task View on Extreme Value Analysis as well as on Eric Gilleland’s website'
x3 = 'In addition, Gilleland, Ribatet & Stephenson have published A software review for extreme value analysis back in 2012, which provides a comprehensive overview of the most important software tools related to this topic.'
self.sentences = [x1, x2, x3]

Then:

        documents = []
        for uid, line in enumerate(self.sentences):
            documents.append(LabeledSentence(line.split(), 'LOG_' + str(uid)))

        self.model_d2v = Doc2Vec(alpha=0.025, min_alpha=0.025, workers = self.workers, size = self.size)
        self.model_d2v.build_vocab(documents)
        for epoch in range(20):
            self.model_d2v.train(documents)
            self.model_d2v.alpha -= 0.002
            self.model_d2v.min_alpha = self.model_d2v.alpha

I then get this error:

RuntimeError: you must first build vocabulary before training the model  

on the train(documents) line.

I have no idea why, since I called build_vocab right before that.

Could you give me some advice?
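
One possible cause worth checking, offered as a guess rather than a confirmed diagnosis: the Doc2Vec(...) call above does not set min_count, so gensim's default of 5 presumably applies. If no token in the corpus occurs at least 5 times, every word would be discarded and the vocabulary could end up empty even though build_vocab was called. The sketch below uses only the standard library to count tokens in the three sentences the same way the question does (plain split()). The names ASSUMED_MIN_COUNT, counts and surviving are made up for this illustration.

```python
from collections import Counter

# The three sentences from the question, unchanged.
x1 = 'Today I’d like to start a series of some posts concerning extreme value analysis using R.'
x2 = 'Basically, there are several very useful packages in R which provide methods and functions for extreme value analysis. Information on different software (including all relevant R packages) for extreme value analysis can of course be found at the R Task View on Extreme Value Analysis as well as on Eric Gilleland’s website'
x3 = 'In addition, Gilleland, Ribatet & Stephenson have published A software review for extreme value analysis back in 2012, which provides a comprehensive overview of the most important software tools related to this topic.'
sentences = [x1, x2, x3]

# Assumption: gensim's default min_count of 5 is in effect, because the
# Doc2Vec(...) call in the question does not override it.
ASSUMED_MIN_COUNT = 5

# Count tokens exactly as the question tokenizes them: whitespace split,
# no lowercasing, punctuation left attached.
counts = Counter(token for line in sentences for token in line.split())

# Tokens frequent enough to survive the assumed threshold.
surviving = {tok: n for tok, n in counts.items() if n >= ASSUMED_MIN_COUNT}

print("most frequent tokens:", counts.most_common(3))
print("tokens with count >= %d:" % ASSUMED_MIN_COUNT, surviving)
```

If the second line prints an empty dict, passing min_count=1 to Doc2Vec (or training on a much larger corpus) would be one thing to try.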

...