Language model compilation

Hi, I’m following the excellent tutorial from elpimous_robot to build a Portuguese DeepSpeech model, and I’m stuck at the language model step.
I followed the steps from Estimating Large Language Models with KenLM to get the tools for building the LM.
wget -O - https://kheafield.com/code/kenlm.tar.gz |tar xz
mkdir kenlm/build
cd kenlm/build
cmake ..
make -j2
The next steps were:
cd /Documentos/SpeechToText
(deepspeech-venv) oscar@ubuntuDS:~/Documentos/SpeechToText$ /bin/bin/./lmplz --text vocabulary.txt --arpa words.arpa --o 3
but I got an error:
bash: /bin/bin/./lmplz: No such file or directory
This is my first compilation ever and I have no clue why there is no lmplz. I also searched the web for the same or a similar error, with no luck.
What am I doing wrong?
Thanks in advance

Hi Oscar,
in which directory did you build kenlm? Did the compilation finish successfully or did it report any errors after make -j2?

When you build kenlm, you should be able to find the binary in

{my_kenlm_directory}/kenlm/build/bin/lmplz

So when running it, you need to use the full path to lmplz. E.g.

/home/myuser/development/kenlm/build/bin/lmplz --text vocabulary.txt --arpa words.arpa --o 3
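
A quick way to confirm that the build really produced the binary before calling it is sketched below. The KENLM_DIR variable and the check_lmplz helper are made-up names for illustration only, and the default location is an assumption; adjust it to wherever you unpacked kenlm. The sketch uses -o, the short form of lmplz's order option.

```shell
# KENLM_DIR is a hypothetical variable; point it at your own kenlm checkout.
KENLM_DIR="${KENLM_DIR:-$HOME/kenlm}"
LMPLZ="$KENLM_DIR/build/bin/lmplz"

# check_lmplz is an illustrative helper, not part of kenlm.
# It succeeds only if the given path is an existing, executable file.
check_lmplz() {
    if [ -x "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1 (re-check the cmake/make output for errors or warnings)" >&2
        return 1
    fi
}

# Only run the language model build if the binary is really there.
if check_lmplz "$LMPLZ"; then
    "$LMPLZ" --text vocabulary.txt --arpa words.arpa -o 3
fi
```

If the check reports the file as missing, the build did not complete, so the cmake and make output is the place to look.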

Yv

@yv001 Thanks for your response. I ran the compilation again and it finished with no errors, but there was a warning that I had overlooked: Eigen3 not found.
I just installed Eigen (https://bitbucket.org/eigen/eigen/get/3.2.8.tar.bz2 |tar xj) and compiled again. This time the build folder and lmplz were generated successfully.