Build the generate_trie binary

Hi,
I’m trying to create a French-specific model with DeepSpeech. To do that, I needed to create my own language model. I wrote a vocabulary.txt file and generated lm.binary. Now I need to create the trie, but I can’t manage to build the generate_trie binary from ./DeepSpeech/native_client/generate_trie.cpp.
I followed the README.rst located in the native_client folder and didn’t succeed with older versions of DeepSpeech, I think because of mismatched versions of TF, Bazel, and DeepSpeech. Now I’m on the 0.7.0-alpha.2 version of DeepSpeech and the build works well, but in the line
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:generate_trie

of README.rst, “//native_client:generate_trie” has been removed, so the generate_trie binary is not created. If I add it back, an error occurs:

ERROR: Skipping '//native_client:generate_trie': no such target '//native_client:generate_trie': target 'generate_trie' not declared in package 'native_client'

Do I have to go back to an older version? Any hint would be great :)

Current master does not need generate_trie anymore.

Thanks for the quick answer. So how do I generate my trie?

As documented in data/lm, there is no need to generate a trie anymore on current master; use generate_scorer.py instead.

For previous releases, though, you don’t need to rebuild generate_trie: it’s bundled as part of the prebuilt native_client.tar.xz.
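In case it helps, here is a minimal sketch of pulling the bundled binary out of an already downloaded native_client.tar.xz using only the Python standard library. The helper name extract_tool and the destination folder are made up for illustration, and I’m assuming the binary sits in the archive under the file name generate_trie (the sketch matches on the base name so the exact folder inside the archive doesn’t matter).

```python
import shutil
import tarfile
from pathlib import Path


def extract_tool(archive, dest_dir, tool="generate_trie"):
    """Copy one file named `tool` out of a .tar.xz archive into dest_dir.

    Hypothetical helper: the member is matched on its base name, because
    the exact layout inside native_client.tar.xz is an assumption here.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:xz") as tar:
        for member in tar:
            if member.isfile() and Path(member.name).name == tool:
                dest = dest_dir / tool
                # Stream the member to disk instead of extracting the whole archive.
                with tar.extractfile(member) as src, open(dest, "wb") as out:
                    shutil.copyfileobj(src, out)
                dest.chmod(0o755)  # make sure the binary is executable
                return dest
    raise FileNotFoundError(f"{tool} not found in {archive}")
```

Calling extract_tool("native_client.tar.xz", "bin") would then leave the binary at bin/generate_trie, provided the archive matches the release you train with.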

BTW, don’t forget you can get some inspiration from DeepSpeech/Dockerfile.train on the master branch of the common-voice/commonvoice-fr repository on GitHub, and you can join us on Matrix if needed.

Thank you very much. How do I know which version I should use? Is it better to stay on 0.6.1 rather than an alpha version?

It depends on exactly what you are working on. FYI, the current French model I’m working on is still compatible with 0.6.1.

I’m trying to understand how to do it with 0.6.1, and I don’t get how I can use a prebuilt trie. Isn’t the trie generated from a specific vocabulary?
Sorry for all these questions, I’m a newbie at all of this, haha.

It is. I’m talking about the prebuilt generate_trie binary we package, not a prebuilt trie. You need to run that binary to produce your trie file.
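As a rough sketch of that step for 0.6.x: as far as I recall, generate_trie takes three positional arguments, in the order alphabet, language model binary, output trie path. Running the binary with no arguments prints its usage string, so check that against your version first. The function names below are made up, and alphabet.txt and trie are assumed file names; lm.binary is the file from the first post.

```python
import subprocess
from pathlib import Path


def generate_trie_command(binary, alphabet, lm_binary, trie_out):
    """Assemble the argv list for generate_trie.

    Assumed argument order (0.6.x): <alphabet> <lm_model> <trie_path>.
    """
    return [str(binary), str(alphabet), str(lm_binary), str(trie_out)]


def build_trie(binary, alphabet, lm_binary, trie_out):
    """Run the prebuilt binary and return the path of the trie it wrote."""
    cmd = generate_trie_command(binary, alphabet, lm_binary, trie_out)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(trie_out)
```

For example, build_trie("./generate_trie", "alphabet.txt", "lm.binary", "trie") would write a file named trie next to your lm.binary. The alphabet has to be the same one used for training.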