Sphinx 4: corrupt ARPA LM? - PullRequest

Sphinx 4: corrupt ARPA LM?

1 vote
/ February 28, 2011

I have an ARPA LM generated by kylm. When I run Sphinx, I get this exception stack trace:

Exception in thread "main" java.lang.RuntimeException: Allocation of search manager resources failed
        at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.allocate(WordPruningBreadthFirstSearchManager.java:242)
        at edu.cmu.sphinx.decoder.AbstractDecoder.allocate(AbstractDecoder.java:87)
        at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:168)
        at transcribing.Main.main(Main.java:78)
Caused by: java.io.IOException: Corrupt Language Model file:./corpus.arpa at line 2420:Premature EOF
        at edu.cmu.sphinx.linguist.language.ngram.SimpleNGramModel.corrupt(SimpleNGramModel.java:458)
        at edu.cmu.sphinx.linguist.language.ngram.SimpleNGramModel.readLine(SimpleNGramModel.java:404)
        at edu.cmu.sphinx.linguist.language.ngram.SimpleNGramModel.load(SimpleNGramModel.java:307)
        at edu.cmu.sphinx.linguist.language.ngram.SimpleNGramModel.allocate(SimpleNGramModel.java:110)
        at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.allocate(LexTreeLinguist.java:342)
        at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.allocate(WordPruningBreadthFirstSearchManager.java:238)
        ... 3 more
Java Result: 1

Here is an excerpt from the ARPA LM:

[n]
3

[smoother]
kylm.model.ngram.smoother.KNSmoother

[closed]
true

[max_length]
1091

[vocab_cutoff]
0

[start_symbol]
<s>

[terminal_symbol]
</s>

[unknown_symbol]
<unk>

\data\
ngram 1=406
ngram 2=768
ngram 3=937
\1-grams: 
-99.0000    <s> -0.3630
...
...

\end\

PS: there is a newline after \end\.

The exception says that Sphinx hits an unexpected EOF on the last line (isn't that exactly where it should encounter EOF?).

Please help!

1 Answer

1 vote
/ March 1, 2011

It turns out this is a Sphinx 4 bug.

If the \1-grams: directive (or any other directive, really) has trailing whitespace, SimpleNGramModel fails to parse it! The loader never recognizes the directive, keeps reading, and reports "Premature EOF" at the last line of the file. I have just submitted a patch, but you can find it here.
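Until a patched build is available, one possible workaround is to strip trailing whitespace from the LM file before Sphinx loads it. The following is only a minimal sketch, not the submitted patch: the class name `ArpaTrailingWhitespaceFix`, its method names, and the output file name are hypothetical, and it assumes the file is UTF-8 encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Sphinx 4 or kylm): rewrites an ARPA LM
// file so that no line ends in spaces or tabs.
final class ArpaTrailingWhitespaceFix {

    // Removes trailing spaces and tabs from a single line.
    // Leading whitespace and the separators between fields are left untouched.
    static String stripTrailing(String line) {
        return line.replaceAll("[ \\t]+$", "");
    }

    // Reads every line of 'in', strips trailing whitespace, and writes the
    // result to 'out'. UTF-8 is an assumption; adjust it to match your corpus.
    static void clean(Path in, Path out) throws IOException {
        List<String> cleaned = Files.readAllLines(in, StandardCharsets.UTF_8).stream()
                .map(ArpaTrailingWhitespaceFix::stripTrailing)
                .collect(Collectors.toList());
        Files.write(out, cleaned, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ArpaTrailingWhitespaceFix <input.arpa> <output.arpa>");
            System.exit(2);
        }
        clean(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

For example, running `java ArpaTrailingWhitespaceFix corpus.arpa corpus.clean.arpa` and pointing the language model location in the Sphinx configuration at the cleaned file should turn a line such as `\1-grams: ` into `\1-grams:`.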

...