"Error: Can't parse trie file, invalid header. Try updating your trie file" problem

Yes, I'm already doing that with DeepSpeech 0.4.1, but this command gives me a 404 error when I try to download the pre-built binaries and trie file: python3 util/taskcluster.py --branch "v0.4.1" --target "." And when I run the same command without specifying a branch, it pulls from master.

Again, download the proper native_client from the GitHub release page.

I downloaded 0.4.1 again from the link you gave me and then tried
python3 util/taskcluster.py --target "." without specifying a branch.
After that I cloned KenLM from GitHub, created the ARPA and binary files, and then created the trie with the generate_trie binary I got from running the taskcluster script.
And then when I run my model it gives me this error:
Error: Trie file version mismatch (4 instead of expected 3). Update your trie file.
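The mismatch above means the trie file was produced by a generate_trie from a different release than the client reading it. The check behind that message amounts to a simple version gate; the sketch below is illustrative only (not DeepSpeech's actual code), with the expected version 3 taken from the error text above:

```python
# Illustrative sketch of a file-version gate like the one behind the
# "Trie file version mismatch" error. Not DeepSpeech's actual code;
# the expected version 3 comes from the error message above.

EXPECTED_TRIE_VERSION = 3

def check_trie_version(file_version: int,
                       expected: int = EXPECTED_TRIE_VERSION) -> None:
    """Raise if the trie was produced by a mismatched generate_trie."""
    if file_version != expected:
        raise ValueError(
            f"Trie file version mismatch ({file_version} instead of "
            f"expected {expected}). Update your trie file."
        )

check_trie_version(3)      # matching versions: passes silently
try:
    check_trie_version(4)  # mismatched, as in the error above
except ValueError as err:
    print(err)
```

The practical fix is always the same: regenerate the trie with the generate_trie binary shipped in the same native_client release you run inference with.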

I can't figure out what else to do to get rid of this error.

I don't understand why you keep trying to download with taskcluster.py if you downloaded the tar from GitHub. Extract it; generate_trie is inside.

Okay, I downloaded native_client.amd64.cpu.linux.tar.xz and ran the model. This time it didn't give that error, but it prints "Running inference" and then, without producing a result, ends with "Segmentation fault (core dumped)".

Have you used generate_trie?

I'm afraid you really need to share more information with us …

Solved, thanks, I was mixing versions. Thank you so much. You are always the first to answer on this forum regardless of the time; I really admire that. Thanks a lot.
Can you tell me what the trie is used for in the LM? I was getting correct inference despite the trie failing.

The "Segmentation fault (core dumped)" problem is appearing again. Yes, I used generate_trie after extracting it from the native client, and I'm using the resulting trie during inference, but I still get "Segmentation fault (core dumped)" on the command line after it starts running inference.

I can’t do divination, so you will have to share more context again … But honestly, I don’t have time to debug a segfault on 0.4.1.

Sorry, but can you please explain the reason to me? It only gives a segmentation fault (core dumped) when I include the LM and trie; if I don't include them, I get correct inference.
So is there something wrong with how I create my trie file or LM model?
These are the commands I am using:

For the ARPA file, after going into kenlm/build/bin:

./lmplz --text /home/sanjay/DEEPSPEECH\ WORK/words.txt --arpa words.arpa --o 3 --discount_fallback

For the binary:

./build_binary -T -s words.arpa lm.binary

Then I generate the trie after extracting generate_trie from native_client.amd64.cpu.linux.tar.xz, using the following command:

./generate_trie /home/sanjay/DEEPSPEECH\ WORK/models/alphabet.txt /home/sanjay/DEEPSPEECH\ WORK/models/lm.binary /home/sanjay/DEEPSPEECH\ WORK/models/trie

And then I use them during inference on the command line. Please help.

Reason for what?

And no segfault with default LM / trie ?

Please check the documentation.

That does not look like what we document in https://github.com/mozilla/DeepSpeech/blob/v0.4.1/data/lm/README.md

I tried generating the trie and LM following data/lm/README.md as well, but after using them I still get the segmentation fault error. If I leave them off the command line I get inference, so inference works with the default files but not with mine. I downloaded native_client.amd64.cpu.linux.tar.xz, but the same thing happened with the CUDA one as well. So is the error because I am using some other generate_trie or something?

I can't reply until you share more info. And you need to answer my previous questions.

This is unrelated to the LM.

When I run the following command to run the model:

deepspeech --model /home/sanjay/DEEPSPEECH_WORK/models/output_graph.pb \
  --alphabet /home/sanjay/DEEPSPEECH_WORK/models/alphabet.txt \
  --lm /home/sanjay/DEEPSPEECH_WORK/models/lm.binary \
  --trie /home/sanjay/DEEPSPEECH_WORK/models/trie \
  --audio /home/sanjay/maharashtra1/aaloo_cheese_sandwich_grill.wav

it gives me this output:

Loading language model from files /home/sanjay/DEEPSPEECH_WORK/models/lm.binary /home/sanjay/DEEPSPEECH_WORK/models/trie
Loaded language model in 0.000131s.
Running inference.
Segmentation fault (core dumped)

And when I run it without the LM and trie, using this command, it gives me correct inference:

deepspeech --model /home/sanjay/DEEPSPEECH_WORK/models/output_graph.pb \
  --alphabet /home/sanjay/DEEPSPEECH_WORK/models/alphabet.txt \
  --audio /home/sanjay/maharashtra1/aaloo_cheese_sandwich_grill.wav

Output

Loaded model in 0.101s.
Running inference.

aaloo cheese sandwich grill
Inference took 2.745s for 3.079s audio file.

Version

TensorFlow: v1.12.0-10-ge232881
DeepSpeech: v0.4.1-0-g0e40db6

I've asked you if you had the same behavior with the default LM. Also, what are the sizes of your generated files?

I just tried with the default LM and trie, and no, it gives correct inference. The sizes of my trie and lm.binary are 75 bytes and 385.5 kB.
The problem only occurs when I run the model with my generated files.

That's very low. The 385 kB LM might be legit, but the 75-byte one seems suspect.
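A quick way to catch this before running inference is a size sanity check on the generated files. A minimal sketch, assuming a hypothetical minimum-size threshold (the 1024-byte cutoff and the helper name are illustrative guesses, not values from DeepSpeech or KenLM):

```python
import os

# Hypothetical threshold: a real trie built from even a small vocabulary
# should be well above a few hundred bytes, so anything smaller than this
# (like the 75-byte trie above) is worth regenerating. Value is a guess.
MIN_PLAUSIBLE_BYTES = 1024

def suspicious_files(paths, min_bytes=MIN_PLAUSIBLE_BYTES):
    """Return the paths that are missing or implausibly small on disk."""
    return [p for p in paths
            if not os.path.isfile(p) or os.path.getsize(p) < min_bytes]

# Example usage with the model files from the commands above:
# bad = suspicious_files([
#     "/home/sanjay/DEEPSPEECH_WORK/models/lm.binary",
#     "/home/sanjay/DEEPSPEECH_WORK/models/trie",
# ])
# if bad:
#     print("Regenerate these before running inference:", bad)
```

This would have flagged the 75-byte trie immediately instead of surfacing as a segfault at inference time.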

Yes, thank you, that was the problem. The trie file was from a previous vocabulary and I had mixed them up. Now everything is working, thanks a ton!!!
