Error: Trie file version mismatch (4 instead of expected 3). Update your trie file

I am trying to get the mic_vad_streaming example running with my own language model, but it throws
Error: Trie file version mismatch (4 instead of expected 3). Update your trie file.
followed by:
Successfully loaded LM and TRIE
[The program does not stop and continues to the inference stage]
Streaming then works as if there were no language model post-processing (equivalent to lm_alpha = lm_beta = 0).
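Since the program keeps running after the message, one way to avoid silently decoding without the language model is to scan the captured output for it and stop early. This is only a minimal sketch: the pattern is built from the message quoted above, the function name `find_trie_mismatch` is made up for illustration, and the wording of the message may differ in other releases.

```python
import re

# Pattern taken from the message quoted in this post; other releases
# may word it differently, so treat it as an assumption.
MISMATCH = re.compile(
    r"Trie file version mismatch \((\d+) instead of expected (\d+)\)"
)


def find_trie_mismatch(log_text):
    """Return (found, expected) trie versions if the message is present, else None."""
    match = MISMATCH.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "Error: Trie file version mismatch (4 instead of expected 3). "
    "Update your trie file.\n"
    "Successfully loaded LM and TRIE\n"
)
print(find_trie_mismatch(sample))
```

A wrapper script could call this on the captured stderr and exit before inference starts whenever the result is not None.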

I had this issue while training, but we fixed it by installing the correct decoder version. I cannot seem to fix it now. Any help?

[INFO]

Python 3.6.8

tensorflow.version
'1.14.0'

ds-ctcdecoder==0.6.0a0

deepspeech-gpu==0.5.0

I do not get that error when I run it with the pretrained model folder I downloaded from the DeepSpeech repo.

Can you please retry with the proper dependencies installed: pip install -r examples/mic_vad_streaming/requirements.txt?

Also, it's unclear what you did when you say:

Have you built it yourself? You need to use the generate_trie matching requirements.txt.

That's not super consistent either.

Previously, I ran it with these requirements. I retried just in case anyway, same results.

I've generated a language model with kenlm, then generated the trie with the generate_trie I got after compiling libdeepspeech.so & generate_trie.

The deepspeech version is from the mic_vad_streaming requirements. ctcdecoder is from when you fixed the same issue I had while training.

I cannot seem to find the requirements.txt. Where would I find it?

So, can you justify why you rebuilt everything? If you do so, you need to rebuild from the matching tag.

I gave you the path earlier, under examples/mic_vad_streaming. You cannot mix the DeepSpeech 0.4.1 runtime and generate your trie file with generate_trie from 0.6.0, for example…
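A coarse way to catch this kind of mix before running anything is to compare the release numbers of the pieces involved. The sketch below is an illustration only: `release_of` and `mismatched` are hypothetical helpers, the version strings are the ones posted above, and the check does not know which trie format a given release expects. It only flags components that do not share the same major.minor.patch.

```python
import re


def release_of(version):
    """Return the (major, minor, patch) tuple of a version string such as '0.6.0a0'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def mismatched(components):
    """Return True when the components do not all share one release."""
    releases = {release_of(version) for version in components.values()}
    return len(releases) > 1


# Versions as reported earlier in this thread.
setup = {"deepspeech-gpu": "0.5.0", "ds-ctcdecoder": "0.6.0a0"}
print(mismatched(setup))
```

For the setup posted above this prints True, which matches the advice here: take the runtime, decoder and generate_trie from the same tag.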

I couldn't figure out how else I'd use it. I'll go back and try to figure out how to do it vanilla.

Ah, OK. I'll try to fix the version mismatch as well. Thank you!

Please give us feedback on what is unclear in the docs then. I think we make it pretty clear :confused:

It just might not be clear to me. Let me try and figure out what's going on.

If I were not to rebuild everything, where and how would I get access to the generate_trie executable? I cannot seem to find that. I might just be missing something very basic.

Well, if it's not clear to you, it'd be great to know what is unclear.

It's packaged inside native_client.tar.xz
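For anyone scripting this, pulling a single binary out of the archive can be sketched with the standard library alone. Assumptions: the archive has already been downloaded from the release that matches your runtime, the member is named generate_trie (possibly with a leading "./" or directory, which the sketch ignores), and `extract_member` is just an illustrative helper name.

```python
import os
import shutil
import tarfile


def extract_member(archive_path, member_basename, dest_dir):
    """Copy the first regular file whose basename matches out of a .tar.xz archive."""
    with tarfile.open(archive_path, "r:xz") as archive:
        for member in archive.getmembers():
            if member.isfile() and os.path.basename(member.name) == member_basename:
                target = os.path.join(dest_dir, member_basename)
                source = archive.extractfile(member)
                with open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                # Make the copied binary executable for the current user.
                os.chmod(target, 0o755)
                return target
    raise FileNotFoundError(
        "%s not found in %s" % (member_basename, archive_path)
    )


# Example call (paths are placeholders):
# extract_member("native_client.tar.xz", "generate_trie", ".")
```

Copying through `extractfile` instead of extracting the whole archive keeps the working directory clean and sidesteps any directory prefix inside the tarball.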

Now that you said this, I looked up the releases and found what I needed. (I think there is no mention of this in the READMEs, which is probably why a lot of people end up rebuilding the binaries. It might be common knowledge to most, though.) I regenerated the trie with the CUDA Linux native_client and the error is gone. (I'm testing this remotely right now and cannot tunnel the microphone; I will update in 8 hours or so after thorough testing.) Also, I have to use deepspeech 0.5.0 instead of the 0.4.1 in the requirements because the model doesn't load otherwise, but that's probably because of the version I used for training.

deepspeech==0.5.0
ds-ctcdecoder==0.6.0a0

and I picked up the native_client from 0.4.1

I am not sure whether this mix should work, but the error seems to have vanished. I'll keep you posted on the results!

Looks like it's working.

Is this in the READMEs?

I didn't know this existed, which was why I rebuilt everything. This confused me while going through the docs.

I'm pretty sure it is. If you want to improve the docs, do not hesitate to send a PR.

You're right. The problem was that in certain places, like the README in data, generate_trie was not referenced back to where it lives (and it says it is generated from generate_trie.cpp, which was why I looked up how to build it). The main README didn't have any specific "trie"-related mentions either.

This was where it might have been helpful.

(which includes the deepspeech binary and associated libraries.)

I'll send a PR with some updates to the docs; hope it's useful.

The issue is that the README says:
python util/taskcluster --target
and this will download generate_trie for the version of the DeepSpeech tag, but that won't work.

What do you mean? What won't work?

I tried to build my own language model from my own vocabulary.txt and use it for inference with the v0.5.0 acoustic model.
Here is what I did:
I generated lm.binary using kenlm (so far nothing to do with DeepSpeech).
To generate the trie (without having to compile everything) I used taskcluster.py.
See:
https://github.com/mozilla/DeepSpeech/blob/master/USING.rst#using-the-command-line-client which says: python util/taskcluster.py --target .
I had checked out the v0.5.0 branch.
then:
./generate_trie alphabet.txt /path-to-own/lm.binary ./output-path/trie

Then inference with deepspeech (pip install deepspeech) reports the trie mismatch error but continues anyway, so the generated lm.binary is not used.

But if you use the generate_trie packed in native_client.tar.xz, as you said in this thread, everything works fine.
So I think either I'm missing something here, or README.md needs to be updated, or taskcluster.py is to blame.

Also see: Tune MoziilaDeepSpeech to recognize specific sentences

opinion?

You need to pass the branch parameter to util/taskcluster.py; the default is master.
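As a rough illustration of pinning the download to the checked-out release, the command line can be assembled like this. It is a sketch under assumptions: the flag is taken to be spelled --branch (please confirm with --help), "v0.5.0" is the branch name mentioned earlier in the thread, and `taskcluster_command` is a made-up helper that only builds the argument list without running it.

```python
def taskcluster_command(branch, target_dir="."):
    """Build the argument list for util/taskcluster.py pinned to one branch or tag."""
    # "--branch" is assumed from the reply above; "--target" is the flag
    # quoted from the README earlier in this thread.
    return [
        "python", "util/taskcluster.py",
        "--branch", branch,
        "--target", target_dir,
    ]


command = taskcluster_command("v0.5.0")
print(" ".join(command))
```

The resulting list can be handed to subprocess.run from the repository root once the flag spelling has been checked against --help.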

OK, I will try to change the README or taskcluster.py to make this more explicit.

Well, I have spotted a few people struggling with that recently. I would have expected people to use --help and sort it out, but that does not seem to be the case. Maybe we should change the behavior and pull from the currently matching tag?

Yeah, I had not checked the help, my bad. But defaulting to master is also strange behavior: if I check out a branch, I'd expect all other scripts to follow that tag.

Well, I think you are the one who filed #2418 and this is actually now fixed on master :slight_smile:
