"Error: Can't parse trie file, invalid header. Try updating your trie file" problem

Yes, I'm already doing that with DeepSpeech 0.4.1, but this command gives me a 404 error when I try to download the pre-built binaries and trie file: python3 util/taskcluster.py --branch "v0.4.1" --target "." And when I run the same command without specifying a branch, it pulls from master.

Again, download the proper native_client from the GitHub release page.

I downloaded 0.4.1 again from the link you gave me and then tried
python3 util/taskcluster.py --target "." without specifying a branch.
After that I cloned KenLM from GitHub, created the ARPA and binary files, and then created the trie with the generate_trie binary I got from running the taskcluster script.
And then when I run my model it gives me this error:
Error: Trie file version mismatch (4 instead of expected 3). Update your trie file.
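The mismatch above means the trie file was produced by a generate_trie from a different release than the client reading it. The check behind that message amounts to a simple version gate; the sketch below is illustrative only (not DeepSpeech's actual code), with the expected version 3 taken from the error text above:

```python
# Illustrative sketch of a file-version gate like the one behind the
# "Trie file version mismatch" error. Not DeepSpeech's actual code;
# the expected version 3 comes from the error message above.

EXPECTED_TRIE_VERSION = 3

def check_trie_version(file_version: int,
                       expected: int = EXPECTED_TRIE_VERSION) -> None:
    """Raise if the trie was produced by a mismatched generate_trie."""
    if file_version != expected:
        raise ValueError(
            f"Trie file version mismatch ({file_version} instead of "
            f"expected {expected}). Update your trie file."
        )

check_trie_version(3)      # matching versions: passes silently
try:
    check_trie_version(4)  # mismatched, as in the error above
except ValueError as err:
    print(err)
```

The practical fix is always the same: regenerate the trie with the generate_trie binary shipped in the same native_client release you run inference with.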

I can't figure out what else to do to get rid of this error.

I don't understand why you keep trying to download with taskcluster.py if you downloaded the tar from GitHub. Extract it; generate_trie is inside.

Okay, I downloaded native_client.amd64.cpu.linux.tar.xz and ran the model. This time it didn't give that error, but it prints "Running inference" and then, without producing a result, ends with "Segmentation fault (core dumped)".

Have you used generate_trie?

I'm afraid you really need to share more information with us …

Solved, thanks, I was mixing versions. Thank you so much. You are always the first to answer on this forum regardless of the time; I really admire that. Thanks a lot.
Can you tell me what the trie is used for in the LM? I was getting correct inference despite the trie failing.

The "Segmentation fault (core dumped)" problem is appearing again. Yes, I used generate_trie after extracting it from the native client, and I'm using the resulting trie during inference, but I still get "Segmentation fault (core dumped)" on the command line after it starts running inference.

I can’t do divination, so you will have to share more context again … But honestly, I don’t have time to debug a segfault on 0.4.1.

Sorry, but can you please explain the reason to me? It only gives a segmentation fault (core dumped) when I include the LM and trie; if I don't include them, I get correct inference.
So is there something wrong with how I create my trie file or LM model?
These are the commands I am using:

For the ARPA file, after going into kenlm/build/bin:

./lmplz --text /home/sanjay/DEEPSPEECH\ WORK/words.txt --arpa words.arpa --o 3 --discount_fallback

For the binary:

./build_binary -T -s words.arpa lm.binary

Then I generate the trie after extracting generate_trie from native_client.amd64.cpu.linux.tar.xz, using the following command:

./generate_trie /home/sanjay/DEEPSPEECH\ WORK/models/alphabet.txt /home/sanjay/DEEPSPEECH\ WORK/models/lm.binary /home/sanjay/DEEPSPEECH\ WORK/models/trie

And then I use them during inference on the command line. Please help.

Reason for what?

And no segfault with default LM / trie ?

Please check the documentation.

That does not look like what we document in https://github.com/mozilla/DeepSpeech/blob/v0.4.1/data/lm/README.md

I tried generating the trie and LM following data/lm/README.md as well, but after using them I still get the segmentation fault error. If I leave them off the command line I get inference, so inference works with the default files but not with mine. I downloaded native_client.amd64.cpu.linux.tar.xz, but the same thing happened with the CUDA one as well. So is the error because I am using some other generate_trie or something?

I can't reply until you share more info. And you need to answer my previous questions.

This is unrelated to the LM.

When I run the following command to run the model:

deepspeech --model /home/sanjay/DEEPSPEECH_WORK/models/output_graph.pb \
  --alphabet /home/sanjay/DEEPSPEECH_WORK/models/alphabet.txt \
  --lm /home/sanjay/DEEPSPEECH_WORK/models/lm.binary \
  --trie /home/sanjay/DEEPSPEECH_WORK/models/trie \
  --audio /home/sanjay/maharashtra1/aaloo_cheese_sandwich_grill.wav

it gives me this output:

Loading language model from files /home/sanjay/DEEPSPEECH_WORK/models/lm.binary /home/sanjay/DEEPSPEECH_WORK/models/trie
Loaded language model in 0.000131s.
Running inference.
Segmentation fault (core dumped)

And when I run it without the LM and trie, using this command, it gives me correct inference:

deepspeech --model /home/sanjay/DEEPSPEECH_WORK/models/output_graph.pb \
  --alphabet /home/sanjay/DEEPSPEECH_WORK/models/alphabet.txt \
  --audio /home/sanjay/maharashtra1/aaloo_cheese_sandwich_grill.wav

Output

Loaded model in 0.101s.
Running inference.

aaloo cheese sandwich grill
Inference took 2.745s for 3.079s audio file.

Version

TensorFlow: v1.12.0-10-ge232881
DeepSpeech: v0.4.1-0-g0e40db6

I've asked you if you had the same behavior with the default LM. Also, what are the sizes of your generated files?

I just tried with the default LM and trie, and no, it gives correct inference. The sizes of my trie and lm.binary are 75 bytes and 385.5 kB.
The problem only occurs when I run the model with my generated files.

That's very low. The 385 kB LM might be legit, but the 75-byte one seems suspect.
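A quick way to catch this before running inference is a size sanity check on the generated files. A minimal sketch, assuming a hypothetical minimum-size threshold (the 1024-byte cutoff and the helper name are illustrative guesses, not values from DeepSpeech or KenLM):

```python
import os

# Hypothetical threshold: a real trie built from even a small vocabulary
# should be well above a few hundred bytes, so anything smaller than this
# (like the 75-byte trie above) is worth regenerating. Value is a guess.
MIN_PLAUSIBLE_BYTES = 1024

def suspicious_files(paths, min_bytes=MIN_PLAUSIBLE_BYTES):
    """Return the paths that are missing or implausibly small on disk."""
    return [p for p in paths
            if not os.path.isfile(p) or os.path.getsize(p) < min_bytes]

# Example usage with the model files from the commands above:
# bad = suspicious_files([
#     "/home/sanjay/DEEPSPEECH_WORK/models/lm.binary",
#     "/home/sanjay/DEEPSPEECH_WORK/models/trie",
# ])
# if bad:
#     print("Regenerate these before running inference:", bad)
```

This would have flagged the 75-byte trie immediately instead of surfacing as a segfault at inference time.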

Yes, thank you, that was the problem. The trie file was from a previous vocabulary and I had mixed them up. Now everything is working, thanks a ton!!!
