Error: Can't parse trie file, invalid header. Try updating your trie file problem

sanjay.pandey · November 4, 2019, 10:40am

My model gives good inference results and loads both the model and the language model, but it fails to load the trie file. I downloaded KenLM and followed the steps to create the ARPA and LM files, then generated the trie with the same command given by @elpimous_robot, but during inference I get the following error.

lissyx · November 4, 2019, 10:56am

Have you tried following the docs under data/lm/README.rst? The tutorial from @elpimous_robot is old and likely describes a different way of generating the trie.

sanjay.pandey · November 4, 2019, 1:34pm

Yes, I followed that too, but now I am getting “Error: Trie file version mismatch (4 instead of expected 3). Update your trie file”. I also updated to ds-ctcdecoder==0.6.0a11 for Python 3.6.
Also, running python3 util/taskcluster.py --branch "v0.4.1" --target "." gives me a 404 error, so I am not able to use generate_trie from native-client.tar.gz.

lissyx · November 4, 2019, 1:37pm

@sanjay.pandey Please share more context on your setup, there’s a lot of things going in every direction right now.

sanjay.pandey · November 4, 2019, 1:42pm

I have built my own language model, but when I run inference from the command line it gives
“Error: Trie file version mismatch (4 instead of expected 3). Update your trie file”.
One solution I found was to download the native client with python3 util/taskcluster.py --branch "v0.4.1" --target "." and use the generate_trie from that, but I am unable to download it this way because the command returns an HTTP 404 error.

I am not sure if this is the solution to my main problem, but I am currently stuck at the first step itself.

lissyx · November 4, 2019, 1:52pm

You are still not documenting your context. Versions, etc …

sanjay.pandey · November 4, 2019, 2:26pm

Sorry.

I am using DeepSpeech 0.4.1 and have fine-tuned the pretrained model on more than 3 lakh (300,000) Indian audio files.
The training data covers around 20k words, so I wanted to include the same 20k words in my language model.
I created my language model by downloading KenLM, creating the ARPA file, then the binary file, and then generating the trie. When I ran inference after that, it gave a trie error, although the inference result was correct.
What I noticed is that I had downloaded the master version of generate_trie, because I had not specified any branch when using taskcluster.py, so I think the problem may lie there. But now, when I try to download the native client by specifying branch 0.4.1, it gives me an HTTP 404 (not found) error.

lissyx · November 4, 2019, 2:34pm

Yes, this is known, please download it from github release page: Release Deep Speech 0.4.1 · mozilla/DeepSpeech · GitHub

sanjay.pandey · November 5, 2019, 6:03am

Yes, I am already working from DeepSpeech 0.4.1, but the command python3 util/taskcluster.py --branch "v0.4.1" --target "." gives me a 404 error when I try to download the prebuilt binaries and generate_trie. When I run the same command without specifying a branch, it downloads from master.

lissyx · November 5, 2019, 6:44am

Again, download the proper native_client from the github release page.

sanjay.pandey · November 5, 2019, 7:05am

I downloaded 0.4.1 again from the link you gave me and then ran
python3 util/taskcluster.py --target "." without specifying a branch.
After that I cloned KenLM from GitHub, created the ARPA and binary files, and then created the trie with the generate_trie I got from running taskcluster.py.
When I then run my model, it gives me this error:
Error: Trie file version mismatch (4 instead of expected 3). Update your trie file.

I do not understand what else to do to get rid of the error.

lissyx · November 5, 2019, 7:28am

I don’t understand why you keep trying to download using taskcluster.py if you downloaded the tar from github. Extract it, generate_trie is inside.
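
For readers following along, that step can be sketched roughly as below. The version and archive name come from this thread; the download URL pattern and the target directory name are assumptions, so check them against the GitHub release page before relying on them.

```shell
# Version and archive name as mentioned in this thread.
DS_VERSION="0.4.1"
ARCHIVE="native_client.amd64.cpu.linux.tar.xz"

# Assumed URL pattern for GitHub release assets; verify on the release page.
release_url() {
    printf 'https://github.com/mozilla/DeepSpeech/releases/download/v%s/%s\n' "$1" "$2"
}

# Hypothetical helper: download the archive into directory $1 and unpack it there.
fetch_native_client() {
    mkdir -p "$1" &&
    curl -fL -o "$1/$ARCHIVE" "$(release_url "$DS_VERSION" "$ARCHIVE")" &&
    tar -xJf "$1/$ARCHIVE" -C "$1"
}

# Usage (performs a network download):
#   fetch_native_client ./native_client_0.4.1
#   ls ./native_client_0.4.1    # generate_trie should be among the extracted files
```

The point is to use the generate_trie that ships with the same release as the runtime, rather than the one taskcluster.py fetches from master when no branch is given.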

sanjay.pandey · November 5, 2019, 7:54am

Okay, I downloaded native_client.amd64.cpu.linux.tar.xz and then ran the model. This time it did not give the error, but it shows that it is running inference and then ends with “Segmentation fault (core dumped)” without giving any result.

lissyx · November 5, 2019, 8:15am

Have you used generate_trie ?

I’m afraid you really need to share more information with us …

sanjay.pandey · November 5, 2019, 8:15am

Solved, thanks. I was mixing versions. Thank you so much. You are always the first to answer on this forum, whatever the time. I really admire that. Thanks a lot.
Can you tell me what the trie is used for in the LM? I was getting correct inference even though loading the trie failed.

sanjay.pandey · November 5, 2019, 10:38am

The “Segmentation fault (core dumped)” problem is appearing again. Yes, I used generate_trie: I extracted it from the native client archive, generated the trie with it, and used that trie in inference. After inference runs, “Segmentation fault (core dumped)” appears on the command line.

lissyx · November 5, 2019, 10:49am

I can’t do divination, so you will have to share more context again … But honestly, I don’t have time to debug a segfault on 0.4.1.

sanjay.pandey · November 5, 2019, 11:10am

Sorry, but can you please explain the reason to me? It only gives “Segmentation fault (core dumped)” when I include the LM and trie; if I do not include them, I get correct inference.
So is there something wrong with how I build my trie file or my LM?
The commands I am using are:

For the ARPA file,
after going into kenlm/build/bin:

./lmplz --text /home/sanjay/DEEPSPEECH\ WORK/words.txt --arpa words.arpa --o 3 --discount_fallback

For the binary:

./build_binary -T -s words.arpa lm.binary

Then I generate the trie, after extracting generate_trie from native_client.amd64.cpu.linux.tar.xz, with the following command:

./generate_trie /home/sanjay/DEEPSPEECH\ WORK/models/alphabet.txt /home/sanjay/DEEPSPEECH\ WORK/models/lm.binary /home/sanjay/DEEPSPEECH\ WORK/models/trie

I then use it during inference on the command line. Please help.

lissyx · November 5, 2019, 11:14am

Reason for what ?

And no segfault with default LM / trie ?

Please check the documentation.
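
One way to answer that question is to run the same audio through three configurations: no LM, the default LM and trie from the release, and the custom LM and trie. A minimal sketch follows, assuming the 0.4.1 command-line client with its --model, --alphabet, --lm, --trie and --audio flags and a model directory containing output_graph.pb and alphabet.txt; the helper name, the DEEPSPEECH_BIN variable and the example paths are hypothetical.

```shell
# Hypothetical override so a different client binary can be substituted.
DEEPSPEECH_BIN="${DEEPSPEECH_BIN:-deepspeech}"

# run_case MODEL_DIR AUDIO [LM TRIE]
# Runs one inference; the LM and trie are passed only when both are given.
run_case() {
    if [ -n "${3:-}" ] && [ -n "${4:-}" ]; then
        "$DEEPSPEECH_BIN" --model "$1/output_graph.pb" --alphabet "$1/alphabet.txt" \
            --lm "$3" --trie "$4" --audio "$2"
    else
        "$DEEPSPEECH_BIN" --model "$1/output_graph.pb" --alphabet "$1/alphabet.txt" \
            --audio "$2"
    fi
}

# Usage (paths are placeholders):
#   run_case models audio.wav                                    # no LM
#   run_case models audio.wav models/lm.binary models/trie       # default LM and trie
#   run_case models audio.wav custom/lm.binary custom/trie       # custom LM and trie
```

If only the third case crashes, the custom LM or trie is the likely culprit rather than the model or the client.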

lissyx · November 5, 2019, 11:17am

That does not look like what we document in DeepSpeech/data/lm/README.md at v0.4.1 · mozilla/DeepSpeech · GitHub
