TypeError: __init__() missing 1 required positional argument: 'config_path' when training language model

Hello everyone!

I’m trying to add more sentences to the language model. To do this, I downloaded a wiki dump, preprocessed and cleaned it so that there is one sentence per line, and removed all sentences containing numbers.
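For anyone following along, the cleaning step can be sketched roughly as below. This is only an illustration, not the exact script I used: it assumes the text is already split into one sentence per line, and the function name clean_lines is made up for this example.

```python
import re

# Matches any decimal digit; sentences containing one are dropped.
DIGIT = re.compile(r"\d")


def clean_lines(lines):
    """Yield stripped, non-empty lines that contain no digits.

    Assumes the input is already one sentence per line; sentence
    splitting of the raw wiki dump is out of scope for this sketch.
    """
    for line in lines:
        sentence = line.strip()
        if sentence and not DIGIT.search(sentence):
            yield sentence


if __name__ == "__main__":
    sample = ["Hello world.\n", "I have 3 cats.\n", "\n", "  Fine  \n"]
    for sentence in clean_lines(sample):
        print(sentence)
```

The same generator can be fed an open file object, so a large dump does not have to fit in memory.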

I am doing this because my WER is noticeably higher than my CER in evaluation.
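To make the two metrics concrete, here is a minimal sketch of how WER and CER can be computed with a plain Levenshtein distance. It is not DeepSpeech's own evaluation code; the helper names are made up, the CER here counts spaces as characters, and the reference is assumed to be non-empty.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref, hyp = "the cat sat", "the cat sit"
    print("WER:", wer(ref, hyp))
    print("CER:", cer(ref, hyp))
```

A single wrong character makes the whole word count as an error, so WER is usually the larger of the two; a better language model mostly helps on the word level.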

Here are some details:
DeepSpeech 0.7.4
Ubuntu 20.04
Python 3.7
tensorflow 1.15

Below is the command I used to create the binaries, which were created successfully.

python3 generate_lm.py --input_txt /media/kamla/data/Voice_dataset/language_model_stuff/corpus/librispeech.txt --output_dir /media/kamla/847636CE7636C0AA/Users/offic/Documents/Kenlm --top_k 50000 --kenlm_bins /media/kamla/data/Voice_dataset/language_model_stuff/kenlm/build/bin --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie

Now, when I try to use generate_package.py as shown below:
python3 generate_package.py --alphabet ../alphabet.txt --lm lm.binary --vocab vocab-50000.txt --package kenlm.scorer --default_alpha 0.931289039105002 --default_beta 1.1834137581510284

I get the following error:

50000 unique words read from vocabulary file.
Doesn't look like a character based model.
Using detected UTF-8 mode: False
Traceback (most recent call last):
  File "generate_package.py", line 157, in <module>
  File "generate_package.py", line 152, in main
  File "generate_package.py", line 48, in create_bundle
    alphabet = NativeAlphabet()
TypeError: __init__() missing 1 required positional argument: 'config_path'

Please use 0.9.3; 0.7.4 is old and no longer supported.

Dear lissyx,

Thank you for the timely response.

I shall try what you suggest and post an update.

Hello, what is librispeech.txt? Is it the same for all languages? Thank you so much.

I came across the same issue and had to downgrade ds-ctcdecoder:
pip install ds-ctcdecoder==0.7.0
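
If it helps anyone else, this kind of error looks like a mismatch between the DeepSpeech checkout providing generate_package.py and the installed ds-ctcdecoder package. A rough sketch for checking what is installed is below. The helper names are made up, and treating "same major.minor" as compatible is only my assumption, not an official rule. The deepspeech package may not be installed at all in a training-only setup, in which case None is printed for it.

```python
try:
    # Standard library on Python 3.8+
    from importlib.metadata import version, PackageNotFoundError
except ImportError:
    # Python 3.7 needs the importlib_metadata backport from PyPI
    from importlib_metadata import version, PackageNotFoundError


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def same_release_line(a, b):
    """True if two version strings share major.minor (assumed compatibility check)."""
    return a.split(".")[:2] == b.split(".")[:2]


if __name__ == "__main__":
    for name in ("deepspeech", "ds-ctcdecoder"):
        print(name, installed_version(name))
    # Example: a 0.7.4 checkout against a 0.7.0 decoder package
    print(same_release_line("0.7.4", "0.7.0"))
```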