I am trying to get the mic_vad_streaming example running with my language model, but it throws: Error: Trie file version mismatch (4 instead of expected 3). Update your trie file.
followed by: Successfully loaded LM and TRIE
[The program does not stop and continues to the inference stage]
The streaming then behaves as if there were no language model post-processing (equivalent to lm_alpha, lm_beta = 0).
I had this issue while training, but we fixed it by installing the correct decoder version; I cannot seem to fix it now. Any help?
[INFO]
Python 3.6.8
tensorflow.__version__
'1.14.0'
ds-ctcdecoder==0.6.0a0
deepspeech-gpu==0.5.0
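For reference, a quick way to double-check which versions are actually installed (assuming a pip-based setup):
python3 -c "import tensorflow as tf; print(tf.__version__)"
pip3 list | grep -iE "deepspeech|ds-ctcdecoder"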
I do not get that error when I run it with the pretrained model folder I downloaded from the DeepSpeech repo.
Previously, I ran it with these requirements. I retried it just in case anyway; same results.
I generated a language model with KenLM, then generated the trie with the generate_trie binary I got after following 'Compiling libdeepspeech.so & generate_trie'.
The DeepSpeech version is from the mic_vad_streaming requirements; the ctcdecoder is from when you fixed the same issue I had while training.
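For context, this is roughly the sequence I followed (the file names and n-gram order are my own placeholders, not an official recipe; the generate_trie binary has to come from the same DeepSpeech version as the runtime):
lmplz --order 5 --text vocabulary.txt --arpa words.arpa
build_binary words.arpa lm.binary
./generate_trie alphabet.txt lm.binary trie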
I cannot seem to find the requirements.txt. Where would I find it?
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
So, can you justify why you rebuilt everything? If you do so, you need to rebuild from the matching tag.
I gave you the path earlier, under examples/mic_vad_streaming. You cannot mix the DeepSpeech 0.4.1 runtime with a trie file generated by generate_trie from 0.6.0, for example…
It just might not be clear to me. Let me try and figure out what's going on.
If I am not to rebuild everything, where and how am I to get the generate_trie executable? I cannot seem to find it. I might just be missing something very basic.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
Well, if it's not clear to you, it'd be great to know what is unclear.
Now that you said this, I looked up the releases and found what I needed. [I think there is no mention of this in the READMEs, which is probably why a lot of people end up rebuilding the binaries (it might be common knowledge to most, though).] I regenerated the trie with the CUDA Linux native_client and the error is gone. (Testing this remotely right now, I cannot tunnel a microphone; I will update in 8 hours or so after thorough testing.) Also, I have to use DeepSpeech 0.5.0 instead of the 0.4.1 that was in the requirements, because the model doesn't load, but that's probably down to the version I used for training.
deepspeech==0.5.0
ds-ctcdecoder==0.6.0a0
and I picked up the native_client from 0.4.1.
I am not sure whether this mix should be working or not, but the error seems to have vanished. I'll keep you posted on the results!
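For anyone hitting the same thing, this is roughly what I did to get a matching generate_trie without rebuilding anything (the exact asset name is an assumption, pick the one for your platform and version from the release page):
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.4.1/native_client.amd64.cuda.linux.tar.xz
tar xvf native_client.amd64.cuda.linux.tar.xz
./generate_trie alphabet.txt lm.binary trie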
You're right. The problem was that in certain places, like the README in data/, generate_trie is not referenced back to where it actually lives (it just says it is generated from generate_trie.cpp, which is why I looked up how to build it). The main README didn't specifically mention 'trie' either.
This is where a mention might have been helpful:
"(which includes the deepspeech binary and associated libraries.)"
I'll send a PR with some updates to the docs; hope it's useful.
The issue is that the README says:
python util/taskcluster.py --target .
and this downloads the generate_trie built from master, not from the DeepSpeech tag you have checked out, so it won't work.
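If I read util/taskcluster.py --help correctly, you can point it at the matching tag explicitly instead of the default master artifacts (I am assuming the --branch flag here):
python3 util/taskcluster.py --target . --branch v0.5.0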
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
I tried to build my own language model from my own vocabulary.txt and use it for inference with the v0.5.0 acoustic model.
Here is what I did:
lm.binary generation using KenLM (so far nothing to do with DeepSpeech)
to generate the trie (and not have to compile everything) I used taskcluster.py
See: https://github.com/mozilla/DeepSpeech/blob/master/USING.rst#using-the-command-line-client which says: python util/taskcluster.py --target .
I had checked out the v0.5.0 branch.
then:
./generate_trie alphabet.txt /path-to-own/lm.binary ./output-path/trie
then inference with deepspeech (pip install deepspeech) results in the trie mismatch error, but it continues with inference, so the generated lm.binary is not used
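For completeness, the inference call that triggers the mismatch looks like this (v0.5.x command-line client; the model and audio paths are placeholders):
deepspeech --model output_graph.pbmm --alphabet alphabet.txt --lm lm.binary --trie trie --audio test.wav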
But if you use the generate_trie binary packed in native_client.tar.gz, as you have said in this thread, everything works fine.
So I think either I'm missing something here, or README.md needs to be updated, or taskcluster.py is to blame.
OK, I will try to change the README or taskcluster.py to make it more explicitly visible.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
Well, I have spotted a few people recently struggling with that. I would have expected people to use --help and sort it out, but it looks like that is not the case. Maybe we should change the behavior and pull from the matching tag by default?
Yeah, I had not checked the help, my bad. But defaulting to master is also strange behavior; if I check out a branch, I'd expect all the other scripts to follow that tag.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
Well, I think you are the one who filed #2418, and this is actually now fixed on master.