Failed to create scorer model: DS_ERR_SCORER_NO_TRIE [SOLVED]

Hi, I couldn’t find anything about DS_ERR_SCORER_NO_TRIE. It happens when running generate_package.py.
Any ideas?

The source code says:

DS_ERR_SCORER_NO_TRIE, 0x2007, "Reached end of scorer file before loading vocabulary trie."

```
    ╰─ python3 generate_lm.py --input_txt ../test_final.txt --output_dir . --top_k 500000 --kenlm_bins /home/jyri/ieud/projects/deepspeech/kenlm/build/bin/ --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie 

    Converting to lowercase and counting word occurrences ...
    | |                                                                                                                                     #                     | 10519777 Elapsed Time: 0:01:46

    Saving top 500000 words ...

    Calculating word statistics ...
      Your text file has 64277395 words in total
      It has 1561629 unique words
      Your top-500000 words are 97.3967 percent of all words
      Your most common word "ve" occurred 2110118 times
      The least common word in your top-k is "paonun" with 3 times
      The first word with 4 occurrences is "sallanıyordur" at place 483963

    Creating ARPA file ...
    === 1/5 Counting and sorting n-grams ===
    Reading /home/jyri/Desktop/deepspeech/data/lm/lower.txt.gz
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************
    Unigram tokens 64227371 types 1599291
    === 2/5 Calculating and sorting adjusted counts ===
    Chain sizes: 1:19191492 2:1390338944 3:2606885888 4:4171016960 5:6082733568
    Statistics:
    1 1599291 D1=0.555044 D2=1.38028 D3+=1.74191
    2 19200762 D1=0.613093 D2=1.57666 D3+=1.85088
    3 4887608/42923262 D1=0.890855 D2=1.26068 D3+=1.3821
    4 2292531/46125983 D1=0.95244 D2=1.41186 D3+=1.46762
    5 1075956/40097829 D1=0.960827 D2=1.53103 D3+=1.51773
    Memory estimate for binary LM:
    type     MB
    probing 661 assuming -p 1.5
    probing 818 assuming -r models -p 1.5
    trie    378 without quantization
    trie    227 assuming -q 8 -b 8 quantization 
    trie    323 assuming -a 22 array pointer compression
    trie    173 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
    === 3/5 Calculating and sorting initial probabilities ===
    Chain sizes: 1:19191492 2:307212192 3:97752160 4:55020744 5:30126768
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    **##################################################################################################
    === 4/5 Calculating and writing order-interpolated probabilities ===
    Chain sizes: 1:19191492 2:307212192 3:97752160 4:55020744 5:30126768
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ####################################################################################################
    === 5/5 Writing ARPA model ===
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************
    Name:lmplz	VmPeak:14132308 kB	VmRSS:112920 kB	RSSMax:3961724 kB	user:73.651	sys:13.0078	CPU:86.6589	real:75.457

    Filtering ARPA file using vocabulary of top-k words ...
    Reading ./lm.arpa
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************

    Building lm.binary ...
    Reading ./lm_filtered.arpa
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************
    Identifying n-grams omitted by SRI
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************
    Quantizing
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************
    Writing trie
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    ****************************************************************************************************
    SUCCESS
```
 python3.6 generate_package.py --alphabet ../alphabet.txt --lm lm.binary --vocab vocab-500000.txt --package kenlm.scorer --default_alpha 0.931289039105002 --default_beta 1.1834137581510284
500000 unique words read from vocabulary file.
Doesn't look like a character based model.
Using detected UTF-8 mode: False
Traceback (most recent call last):
  File "generate_package.py", line 153, in <module>
    main()
  File "generate_package.py", line 148, in main
    args.default_beta,
  File "generate_package.py", line 58, in create_bundle
    if err != ds_ctcdecoder.DS_ERR_SCORER_NO_TRIE:
AttributeError: module 'ds_ctcdecoder' has no attribute 'DS_ERR_SCORER_NO_TRIE'
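
The AttributeError above suggests that the installed ds_ctcdecoder module and the generate_package.py script come from different versions. A minimal sketch for checking whether the installed module exposes the constant the script expects, using only the standard library; the helper name `module_has_attr` is hypothetical and not part of DeepSpeech:

```python
import importlib


def module_has_attr(module_name, attr_name):
    """Return True/False if the module imports, or None if it is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # Module and attribute names are taken from the traceback above.
    result = module_has_attr("ds_ctcdecoder", "DS_ERR_SCORER_NO_TRIE")
    print("ds_ctcdecoder exposes DS_ERR_SCORER_NO_TRIE:", result)
```

If this prints False, the decoder package is likely older or newer than the script that is calling it.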

DeepSpeech is in active development. Don’t use the current master; use v0.7.1.

Actually, I’m using 0.7.1:
╰─ cat VERSION
0.7.1

╰─ pip list
Package Version Location


absl-py 0.9.0
alembic 1.4.2
astor 0.8.1
attrdict 2.0.1
audioread 2.1.8
beautifulsoup4 4.9.1
bs4 0.0.1
certifi 2020.4.5.1
cffi 1.14.0
chardet 3.0.4
cliff 3.1.0
cmaes 0.5.0
cmd2 0.8.9
colorlog 4.1.0
decorator 4.4.2
deepspeech 0.7.1
deepspeech-gpu 0.7.1
deepspeech-training 0.7.1 /home/jyri/Desktop/deepspeech/training

Check the version of your CTC decoder. I had this happen to me on the current master, but it was fine with 0.7.1.
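
One way to check the decoder version from Python is sketched below. It assumes Python 3.8 or newer (for `importlib.metadata`), and the distribution names passed in are assumptions that should be adjusted to match what `pip list` shows; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match `pip list`.
    names = ["ds_ctcdecoder", "deepspeech", "deepspeech-training"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

All reported versions should agree with the tag that is checked out in the DeepSpeech repository.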

I ran
$ python3 util/taskcluster.py --branch "v0.7.1" --target "."
and extracted the native client.

Still getting the same error :frowning:
I guess I should start with a fresh project or go with v0.6.1.
But I think I should write a comprehensive guide after a successful model build.
Kinda sad.

That line 58 is not present in the 0.7.1 version of generate_package.py, so you are doing something wrong.

OK, I get it.
I did git checkout v0.7.1. The problem was that I had pulled the native client before switching branches.

Please check your branch, generate_package.py is not part of the native client.

Yep, my bad. Thanks for the response :partying_face: