Using in-tree KenLM for building LM?

dvz · March 13, 2019, 9:51am

Should the KenLM inside the DeepSpeech repo be used for building the lm files or is the official KenLM compatible as well?

lissyx · March 13, 2019, 9:53am

As we have it documented in data/lm/README.md?

dvz · March 13, 2019, 10:00am

I don’t think that recipe implies where the lmplz and build_binary executables are searched for. I have used ones built from the official KenLM repo, but was wondering whether DS depends on the in-tree version.

kdavis · March 13, 2019, 11:21am

In the past I’ve used the official repo and had no problems with the resulting language models.

lissyx · March 13, 2019, 11:39am

Yeah, we have no requirement here. Maybe you could file an issue and/or a PR to augment the doc and make that clear? It was clear in our minds, but obviously it’d be better to state it.

javi.rahman · August 28, 2019, 4:57pm

Hi @kdavis @lissyx ,

I am stuck creating lm.binary. Where should I run the Python code that is in data/lm/README.md?
I saved it as a Python file and tried running it, but I am getting a syntax error:

  File "lmbinary.py", line 21
    !lmplz --order 5
    ^
SyntaxError: invalid syntax

I tried running it from inside the kenlm/build/bin folder, since that is the only place where I can see lmplz and build_binary.

Please clarify.

nmstoker · August 28, 2019, 8:58pm

In the README where it says:

following this recipe (Jupyter notebook code):

… that means it’s Jupyter notebook code and as a result you would need to run it in a Jupyter notebook (it doesn’t say to save it as a python script and run that, as you appear to have done).

The reason it’s going wrong is that “!” is a special Jupyter feature that lets you run command line commands (which lmplz is), and it won’t work in a regular Python script. Alternatively you could use Python’s subprocess module (e.g. subprocess.run) or os.system to make it work from within your script (but I’d advise the Jupyter option). Running commands that way is a general Python thing unrelated to anything here, so I’ll leave you to Google that if you want to go that route.
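For reference, the script route could be sketched roughly as below. This is only an illustration, not the recipe from the README: it assumes lmplz and build_binary are either on your PATH or in a folder you pass in as bin_dir, and the file names (vocab.txt, lm.arpa, lm.binary) and the helper function names are made up for the example. The README recipe may pass further options to both tools, which you would add to the argument lists.

```python
import shutil
import subprocess
from pathlib import Path


def _exe(name, bin_dir=None):
    """Return the executable name, optionally prefixed with the folder it was built in."""
    return str(Path(bin_dir) / name) if bin_dir else name


def lmplz_command(text_path, arpa_path, order=5, bin_dir=None):
    """Argument list for KenLM's lmplz (plain text in, ARPA file out)."""
    return [
        _exe("lmplz", bin_dir),
        "--order", str(order),
        "--text", str(text_path),
        "--arpa", str(arpa_path),
    ]


def build_binary_command(arpa_path, binary_path, bin_dir=None):
    """Argument list for KenLM's build_binary (ARPA file in, binary LM out)."""
    return [_exe("build_binary", bin_dir), str(arpa_path), str(binary_path)]


def build_lm(text_path, arpa_path, binary_path, bin_dir=None):
    """Run both steps in order; stop with an error if either one fails."""
    commands = [
        lmplz_command(text_path, arpa_path, bin_dir=bin_dir),
        build_binary_command(arpa_path, binary_path, bin_dir=bin_dir),
    ]
    for cmd in commands:
        # Fail early with a readable message if the executable cannot be found.
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"cannot find executable: {cmd[0]}")
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths; adjust to your own corpus and KenLM build folder.
    build_lm("vocab.txt", "lm.arpa", "lm.binary", bin_dir="kenlm/build/bin")
```

Each “!” line from the notebook becomes one argument list handed to subprocess.run, which is why the same commands work here but not when pasted into a script unchanged.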

Hope that helps!

swarajbadhei · May 22, 2020, 6:32am

I am using version 0.7.1. I could not really understand what to do for the KenLM file. It would be really helpful if someone could explain.

othiele · May 22, 2020, 8:55am

Check that you are using 0.7.1 and not current master as it changes stuff with trie …

Then this should get you going:

swarajbadhei · May 23, 2020, 6:41am

It worked, sir. Thanks a lot.