Can I use language modelling tools other than KenLM?

(Matti Meikäläinen) #1


I want to set up a training pipeline for my own audio and text corpus by following the tutorial: TUTORIAL : How I trained a specific french model to control my robot

However, I got stuck building the language model with KenLM (in the tutorial, the command is “/bin/bin/./lmplz --text vocabulary.txt --arpa --o 3”), because it requires Boost and I had a lot of problems installing it on Mac.
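For readers following along, here is a minimal sketch of how the lmplz call could be assembled from Python once KenLM is built. The binary path and the output file name `words.arpa` are hypothetical placeholders and not taken from the tutorial; `-o`, `--text` and `--arpa` are lmplz's order, input and output options.

```python
import shlex


def build_lmplz_command(lmplz_path, text_path, arpa_path, order=3):
    """Return the argument list for a KenLM lmplz run of the given n-gram order."""
    return [lmplz_path, "-o", str(order), "--text", text_path, "--arpa", arpa_path]


# Hypothetical paths: adjust them to wherever KenLM was built and where the corpus lives.
cmd = build_lmplz_command("kenlm/build/bin/lmplz", "vocabulary.txt", "words.arpa")
print(shlex.join(cmd))

# To actually run it (requires a built lmplz binary):
# import subprocess
# subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids shell quoting problems if the corpus path contains spaces.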

Is Deep Speech compatible with other language modelling tools such as:

so that I am not bound to KenLM?

(Lissyx) #2

We only support KenLM; we even have a specific CTC decoder bound to it. However, if you change the code so that it no longer relies on the specific CTC/KenLM decoder, you might be able to plug in your own. Be prepared to hack, though.

What’s your problem with Boost? Maybe you should start a separate topic; others might be able to help you.

(Matti Meikäläinen) #3

The problem I ran into while installing Boost from source was this error:
error: no matching constructor for initialization of ‘storage_type’ (aka ‘boost::atomics::detail::storage128_type’)

but I solved it by trying version 1.54:

(Lissyx) #4

So now it works for you? :slight_smile:

(Matti Meikäläinen) #5

Yeah, I got KenLM installed :slight_smile: thanks!

(Krishna mohan) #6

Can you explain how the CTC decoder is bound to KenLM?

(Lissyx) #7

@reuben implemented dedicated code so that the CTC decoder uses KenLM to score beams.
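To give a rough idea of what "scoring beams with a language model" means, here is a toy sketch of the scoring step only. It is not the actual DeepSpeech decoder: the bigram table stands in for a KenLM model, and the function names, the weights `alpha` and `beta`, and all the numbers are assumptions made up for illustration.

```python
import math

# Toy stand-in for a KenLM model: log10 probabilities for a few bigrams.
# All values are invented for illustration.
TOY_BIGRAM_LOG10 = {
    ("<s>", "turn"): -0.5,
    ("turn", "left"): -0.3,
    ("turn", "lift"): -3.0,
    ("left", "</s>"): -0.2,
    ("lift", "</s>"): -0.4,
}
UNSEEN_LOG10 = -5.0  # flat penalty for bigrams the toy model has never seen


def toy_lm_log10(sentence):
    """Sum the log10 bigram scores of a sentence, including sentence boundaries."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    return sum(TOY_BIGRAM_LOG10.get(pair, UNSEEN_LOG10) for pair in zip(words, words[1:]))


def beam_score(acoustic_log_prob, sentence, alpha=0.75, beta=1.0):
    """Combine the acoustic score, the weighted LM score and a word-count bonus."""
    lm_ln = toy_lm_log10(sentence) * math.log(10)  # convert log10 to natural log
    return acoustic_log_prob + alpha * lm_ln + beta * len(sentence.split())


def pick_best(beams, alpha=0.75, beta=1.0):
    """Return the transcript of the highest-scoring (transcript, acoustic_log_prob) beam."""
    return max(beams, key=lambda b: beam_score(b[1], b[0], alpha, beta))[0]


# The acoustic model alone slightly prefers "turn lift"; the LM term flips the choice.
beams = [("turn lift", -1.0), ("turn left", -1.2)]
best = pick_best(beams)
print(best)
```

With `alpha=0` the language model is ignored and the acoustically preferred "turn lift" wins, which is why the LM weight matters for small, domain-specific vocabularies.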

(Jageshmaharjan) #8

Is the Boost library necessary? In any case, I installed it with: sudo apt-get install cmake libblkid-dev e2fslibs-dev libboost-all-dev libaudit-dev