generate_lm.py error when reproducing the external scorer

Hi, I tried to reproduce the scorer, but I have an issue with generate_lm.py.

Here it is:

python3 generate_lm.py --input_txt librispeech-lm-norm.txt.gz --output_dir . \
>   --top_k 500000 --kenlm_bins /home/nathan/Téléchargements/kenlm/build/bin/ \
>   --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" \
>   --binary_a_bits 255 --binary_q_bits 8 --binary_type trie
 
Converting to lowercase and counting word occurrences ...
Traceback (most recent call last):
  File "generate_lm.py", line 210, in <module>
    main()
  File "generate_lm.py", line 200, in main
    data_lower, vocab_str = convert_and_filter_topk(args)
  File "generate_lm.py", line 31, in convert_and_filter_topk
    for line in progressbar.progressbar(file_in):
TypeError: 'module' object is not callable

I am using Python 3.7.3.

If you could help me, that would be great :slight_smile:

This looks like an environment issue. As you didn’t give any information on version or setup, this is likely the cause. Reinstall everything according to the docs and report more if you need help.

Hi Othiele,

Thank you for your answer! I decided to reinstall everything. I have the latest version of Ubuntu (Ubuntu 20.04.1 LTS) for my project.

I have this example https://github.com/mozilla/DeepSpeech-examples/tree/r0.9/mic_vad_streaming running on Windows and a venv python 3.6 with a French model and scorer which works well.

Now I want to basically run my own scorer, so I followed this guide:

https://deepspeech.readthedocs.io/en/latest/Scorer.html#building-your-own-scorer

So, as it is said:

“first we must create a KenLM language model (using data/lm/generate_lm.py”)

Basically, I cloned KenLM with git and built it following the README. Then I also cloned the whole project: https://github.com/mozilla/DeepSpeech

After that, I ran this, with --kenlm_bins pointing to the binaries in the KenLM folder I had just cloned and built:

cd data/lm
python3 generate_lm.py --input_txt librispeech-lm-norm.txt.gz --output_dir . \
  --top_k 500000 --kenlm_bins /home/nathan/Téléchargements/kenlm/build/bin/ \
  --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" \
  --binary_a_bits 255 --binary_q_bits 8 --binary_type trie

But I am not sure that it’s the right way to do it…

have you properly followed the setup steps? that does not sound like you did
your error would be consistent with the wrong progressbar package being installed
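
A quick way to check this, as a minimal sketch: as far as I know, with progressbar2 installed the name progressbar.progressbar is a callable helper, while with the older progressbar package the same name resolves to a submodule, which would give exactly the "'module' object is not callable" error above. The function name diagnose_progressbar and its return labels are made up for illustration; they are not part of DeepSpeech or of either package.

```python
import importlib
import types


def diagnose_progressbar(module_name="progressbar"):
    """Classify what `<module_name>.progressbar` resolves to.

    Hypothetical helper: the labels returned here are illustrative only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Nothing importable under that name at all.
        return "missing"

    attr = getattr(module, "progressbar", None)
    if isinstance(attr, types.ModuleType):
        # A submodule: calling it raises "'module' object is not callable".
        return "wrong-package"
    if callable(attr):
        # A callable helper, which is what generate_lm.py tries to call.
        return "ok"
    return "unknown"


if __name__ == "__main__":
    print(diagnose_progressbar())
```

Run it with the same interpreter (and the same virtualenv) that you use for generate_lm.py, otherwise it may inspect a different set of installed packages.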

Well, that is the thing, I think not. The README is just this:

I really don’t know what to do next, and I don’t want to start something without understanding what I’m doing.

I went there:
https://deepspeech.readthedocs.io/en/latest/?badge=latest

Do you think I should just follow the first steps?

Create and activate a virtualenv

virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate

Install DeepSpeech

pip3 install deepspeech

yes please, we put up a simple readme with direct link to the doc because people were getting lost …

basically, we expect people who are rebuilding a scorer using generate_lm.py to be people who are training, and thus who have already followed the steps in https://deepspeech.readthedocs.io/en/latest/TRAINING.html#prerequisites-for-training-a-model

Yes, ok! I will follow these steps like I did first. The doc was clear enough to be understandable :slight_smile:
Should have tried this way!

Yeah, but I found a good French model that is already trained, and I wanted to test it with my own scorer to see what happens. This is for a student project, and I have been trying some things.
Thank you for the quick responses and for the help !

this one? Modèle Français 0.6 pour DeepSpeech v0.7, v0.8, v0.9

This one :), so yes but an old version I guess.

And so, thanks even more :rofl:

I also faced this error with progressbar.
It was fixed by running "pip install progressbar2".

I’m not sure this is the right method. I tried it and got other problems. But if it works for you, that’s cool.
I installed Python 3.6 and used DeepSpeech in a virtual env, and it worked.

It is the right package, but if you follow the docs properly, it is installed by setup: https://github.com/mozilla/DeepSpeech/blob/fcbd92d0d75beee36473aa44669fa330ca522cdc/setup.py#L54

I got this issue too.
But pip install progressbar2 wasn’t enough. I had to uninstall progressbar first, then uninstall and reinstall progressbar2.
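
For anyone who wants that sequence written down, here is a rough sketch of the order I described, run through the current interpreter's pip so it targets the active environment. The function names build_pip_commands and reinstall_progressbar2 are my own, and dry_run defaults to True so nothing is changed unless you ask for it. This is an assumption about what worked on my machine, not an official procedure.

```python
import subprocess
import sys


def build_pip_commands():
    """Return the pip invocations in the order described above."""
    # Use the running interpreter's pip so the active venv is the target.
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", "progressbar"],   # remove the old package first
        pip + ["uninstall", "-y", "progressbar2"],  # then remove progressbar2
        pip + ["install", "progressbar2"],          # and install it again cleanly
    ]


def reinstall_progressbar2(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in build_pip_commands():
        print(" ".join(command))
        if not dry_run:
            # check=False: keep going even if one of the packages was absent.
            subprocess.run(command, check=False)


if __name__ == "__main__":
    reinstall_progressbar2(dry_run=True)
```

Call reinstall_progressbar2(dry_run=False) once the printed commands look right for your environment.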