It does not look like you are passing the correct arguments to build_binary.
Please reply and provide feedback on the other items I asked you to check.
I was following this tutorial: "How I trained a specific french model to control my robot". Creating the binary file:
/bin/bin/./build_binary -T -s words.arpa lm.binary
He built it the same way; otherwise, please tell me what params to pass.
Also, if I use the lm.binary from the default package with my trie, it gives a core dump. But if I use my lm.binary with the trie from the default package, it works. Not sure why?
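For reference, the 0.6-era data/lm docs build the LM roughly as follows (a sketch from memory of that README; verify the exact flags against your checkout):

# build a pruned 5-gram ARPA model from the vocabulary text
lmplz --order 5 --temp_prefix /tmp/ --memory 50% --text vocabulary.txt --arpa words.arpa --prune 0 0 0 1

# convert it to a quantized trie-format KenLM binary; note the positional
# "trie" type argument rather than -T/-s
build_binary -a 255 -q 8 trie words.arpa lm.binary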
Because you keep insisting without checking your ds_ctcdecoder package: I will stop helping you until you actually read and act on what I asked earlier.
I am really sorry, I forgot to mention that I did run
pip install --upgrade $(python util/taskcluster.py --decoder)
but the issue still persists.
I am continuously referring to data/lm as well as the tutorial to generate the language model. Maybe I am missing some small thing; I just can't spot it.
Wait, can we avoid confusion and get the whole picture? It’s completely unclear what you are doing now.
Can you cross-check and share pip list | grep ds_ctcdecoder as well as git describe --tags?
Do you have the crash with the default language model / trie? Since you failed to share proper status at first, I assumed you had a mismatch…
Please read the doc and the script. Don’t refer to anything else.
pip list | grep ds_ctcdecoder
ds-ctcdecoder 0.6.1
git describe --tags
v0.6.1-35-g94882fb
Yes, the default lm.binary and trie are working perfectly fine.
OK, I will check the generate_lm script and look at the docs.
Weird. If you were on v0.6.1, you would not have that tag: v0.6.1-35-g94882fb means 35 commits past the tag. This shows you are on master, so you’re going to have trouble if you don’t stick to matching versions.
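A minimal sketch of getting back in sync, assuming a standard git clone of the DeepSpeech repo:

# check out the release tag so the training code matches the released decoder
git checkout v0.6.1
# reinstall the ds_ctcdecoder package built for this checkout
pip install --upgrade $(python util/taskcluster.py --decoder)
# verify both sides now report the same version
git describe --tags
pip list | grep ds_ctcdecoder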
I will check out that tag and let you know.
Now I have matching versions:
pip list | grep ds_ctcdecoder
ds-ctcdecoder 0.6.1
git describe --tags
v0.6.1
Even now, after regenerating the LM, I am getting this error while training from the checkpoint with my added vocabulary and my LM binary and trie.
Command:
python3 DeepSpeech.py \
--train_files /home/Downloads/indian_train.csv \
--dev_files /home/Downloads/indian_dev.csv \
--test_files /home/Downloads/indian_test.csv \
--n_hidden 2048 \
--train_batch_size 20 \
--dev_batch_size 10 \
--test_batch_size 10 \
--epochs 1 \
--learning_rate 0.0001 \
--export_dir /home/Desktop/mark3/trieModel/ \
--checkpoint_dir /home/Desktop/mark3/DeepSpeech/deepspeech-0.6.1-checkpoint/ \
--cudnn_checkpoint /home/Desktop/mark3/DeepSpeech/deepspeech-0.6.1-checkpoint/ \
--alphabet_config_path /home/Desktop/mark3/mfit-models/alphabet.txt \
--lm_binary_path /home/Desktop/mark3/mfit-models/lm.binary \
--lm_trie_path /home/Desktop/mark3/mfit-models/trie
The error comes after training ends and the dev phase completes, while evaluating on test.csv:
I Restored variables from best validation checkpoint at /home/Desktop/mark3/DeepSpeech/deepspeech-0.6.1-checkpoint/best_dev-234353, step 234353
Testing model on /home/Downloads/indian_test.csv
Test epoch | Steps: 0 | Elapsed Time: 0:00:00 Fatal Python error: Fatal Python error: Fatal Python error: Fatal Python error: Segmentation faultSegmentation faultSegmentation fault
Segmentation faultThread 0x
Segmentation fault (core dumped)
There’s something wrong in your CTC decoder setup / trie production…
Can you please tell us exactly how you proceed? I’m really starting to lose patience here.
What are the sizes of:
- the vocabulary.txt file
- the lm.binary file
- the trie file
Can you ensure you used exactly the same alphabet file? Maybe there’s something bogus in your dataset.
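A quick way to collect those numbers (the two alphabet paths below are placeholders for wherever your copies actually live):

# on-disk sizes of the generated artifacts
ls -lh vocabulary.txt lm.binary trie
# checksums should match if the same alphabet file was used everywhere
md5sum /path/to/training/alphabet.txt /path/to/trie/alphabet.txt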
vocabulary.txt = 1.7MB
lm.binary = 20.1MB
trie = 80 Bytes
So you failed at generating the trie file. Since you have not yet shared how you do that, we can’t help you…
I took generate_trie from native_client.amd64.cpu.linux.tar.xz and ran ./generate_trie ../data/alphabet.txt lm.binary trie to generate the trie.
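As a side note: if generate_trie comes from a tarball that does not match the checked-out tag, it can produce a broken trie. A sketch of fetching a matching build with util/taskcluster.py (flag names from memory; check python3 util/taskcluster.py -h):

# download the native_client build for the matching release into ./native_client
python3 util/taskcluster.py --branch v0.6.1 --arch cpu --target native_client
# regenerate the trie with that matched binary and the training alphabet
./native_client/generate_trie /home/Desktop/mark3/mfit-models/alphabet.txt lm.binary trie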
So that’s not the alphabet you are using for the training?!
--alphabet_config_path /home/Desktop/mark3/mfit-models/alphabet.txt
does not look like the same path as ../data/alphabet.txt…
I am sorry, I used the same one, i.e. /home/Desktop/mark3/mfit-models/alphabet.txt. I pasted the other path here by mistake, sorry.
Also, I checked the size of the default lm.binary in ./data/lm: it’s 945MB. Is there some issue with mine being 20MB?