Hello, I am trying to train DeepSpeech on my own small dataset.
I installed DeepSpeech this way:
~$ git clone https://github.com/mozilla/DeepSpeech
~$ cd DeepSpeech
~/DeepSpeech$ pip3 install --user -r requirements.txt
~/DeepSpeech$ pip3 uninstall tensorflow -y
~/DeepSpeech$ pip3 install --user 'tensorflow-gpu==1.12.0rc2'
Then I built a new language model with KenLM:
~/kenlm/build$ ./bin/lmplz --order 5 --text ~/lm/vocab.txt --arpa ~/lm/lm.arpa
~/kenlm/build$ ./bin/build_binary -a 255 -q 8 trie ~/lm/lm.arpa ~/lm/lm.binary
To create the trie file:
~/DeepSpeech$ cat VERSION
0.4.0-alpha.0
~/DeepSpeech$ python3 util/taskcluster.py --branch "v0.4.0-alpha.0" --arch gpu --target .
~/DeepSpeech$ ./generate_trie ~/lm/alphabet.txt ~/lm/lm.binary ~/lm/trie
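In case it helps to see how I pick the --branch value, here is a small sketch of the step above. It assumes the release tag is simply "v" followed by the contents of the VERSION file, as in my commands; branch_for_version and taskcluster_args are hypothetical helper names of my own, not part of DeepSpeech.

```python
from pathlib import Path


def branch_for_version(version_text):
    """Build the tag name from the VERSION contents (assumption: tag = 'v' + version)."""
    return "v" + version_text.strip()


def taskcluster_args(version_text, arch="gpu", target="."):
    """Argument list mirroring the util/taskcluster.py call shown above."""
    return [
        "python3", "util/taskcluster.py",
        "--branch", branch_for_version(version_text),
        "--arch", arch,
        "--target", target,
    ]


if __name__ == "__main__":
    # Run from the DeepSpeech checkout; does nothing if VERSION is missing.
    version_file = Path("VERSION")
    if version_file.exists():
        print(" ".join(taskcluster_args(version_file.read_text())))
```

This only prints the command, so the tag can be compared by eye with the checkout that DeepSpeech.py is run from.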
Finally, when I run this:
python3 -u DeepSpeech.py \
  --train_files "$data_dir"/test.csv \
  --dev_files "$data_dir"/test.csv \
  --test_files "$data_dir"/test.csv \
  --train_batch_size 1 \
  --dev_batch_size 1 \
  --test_batch_size 1 \
  --n_hidden 494 \
  --epoch 75 \
  --checkpoint_dir "$checkpoint_dir" \
  --decoder_library_path libctc_decoder_with_kenlm.so \
  --alphabet_config_path "$lm_dir"/alphabet.txt \
  --lm_binary_path "$lm_dir"/lm.binary \
  --lm_trie_path "$lm_dir"/trie \
  "$@"
I get this error:
I Training epoch 74...
100% (1 of 1) |##########################################################################################################| Elapsed Time: 0:00:00 ETA: 00:00:00
Error: Trie file version mismatch (3 instead of expected 2). Update your trie file.
I Training of Epoch 74 - loss: 322.916992
I FINISHED Optimization - training time: 0:00:25
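To make the two numbers in that message explicit, here is a minimal sketch that pulls them out of the log text. The regular expression is written against the exact wording in my log, and parse_trie_mismatch is a hypothetical helper of my own. My reading, which is an assumption, is that the first number is the version stored in the trie file and the second is the version the decoder expects.

```python
import re

# Matches the decoder message exactly as it appears in the log above.
MISMATCH_RE = re.compile(
    r"Trie file version mismatch \((\d+) instead of expected (\d+)\)"
)


def parse_trie_mismatch(log_text):
    """Return (found, expected) as ints, or None if no mismatch is reported."""
    match = MISMATCH_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


log_line = (
    "ETA: 00:00:00Error: Trie file version mismatch "
    "(3 instead of expected 2). Update your trie file."
)
result = parse_trie_mismatch(log_line)
if result is not None:
    found, expected = result
    # Assumption: 'found' is the version written into the trie file.
    print(f"trie file version {found}, decoder expects {expected}")
```

If that reading is right, the generate_trie binary I downloaded writes a newer trie format than the decoder library used during training can read.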
*I also tried running util/taskcluster.py without the --branch option, but the result was the same.
Any ideas on how to solve this problem?
If you need more information about something else, please let me know!