I can't find native_client.tar.xz [DeepSpeech 0.6.1]

Hello,

I am following the French model training tutorial, but I am stuck at the step where the trie file is created.
I was looking for native_client.tar.xz, which contains the generate_trie binary, but I just can't find it. My native_client folder has only the generate_trie.cpp file, no tar.xz.
I also tried to build the binary with the command:

bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:generate_trie

Extracting Bazel installation...
ERROR: The 'build' command is only supported from within a workspace (below a directory having a WORKSPACE file).
See documentation at https://docs.bazel.build/versions/master/build-ref.html#workspace

But the above error appears (I am running the command from inside the native_client folder).

Can someone help me?

Python: 3.7.3
OS: Debian Buster
GPU: RTX 2060 SUPER
Using the Common Voice Portuguese dataset.

For training I am using the command below:

./DeepSpeech.py --checkpoint_dir /home/deep_train/checkpoint --train_files /home/deep_train/portugues/clips/train.csv --dev_files /home/deep_train/portugues/clips/dev.csv --test_files /home/deep_train/portugues/clips/test.csv --export_dir /home/deep_train/portugues --alphabet_config_path /home/deep_train/portugues/alphabet.txt --lm_binary_path /home/deep_train/portugues/clips/lm.binary --train_batch_size 8 --test_batch_size 8 --dev_batch_size 8 --export_batch_size 1 --epochs 75 > /home/train.log &

And when it finishes and I try to transcribe any audio, I only get the letters “a”, “e” and “o”. Example transcription command:
deepspeech --model /home/deep_train/portugues/output_graph.pb --audio /home/deep_train/portugues/clips/common_voice_pt_19887570.wav

I appreciate any help with the trie file.
Best regards.

Have you tried looking at the release page? https://github.com/mozilla/DeepSpeech/releases/tag/v0.6.1


Have you properly followed the documentation to build it?

This is not actionable if you don't share your training log. How can we know what your model learnt? What was the test set result?

You might want to try and join efforts on building on top of this Docker image https://github.com/Common-Voice/commonvoice-fr/blob/master/DeepSpeech/Dockerfile.train like the Italian community does: https://github.com/MozillaItalia/DeepSpeech-Italian-Model

What material did you use to build your LM? Also, you pass no trie, but I guess that is because of the generate_trie problem.


I built KenLM from source.
I am using the Common Voice Portuguese dataset. For lm.binary I used:

cat dev.csv >> vocabulary.txt && cat test.csv >> vocabulary.txt && cat train.csv >> vocabulary.txt
/home/deep_train/kenlm/build/bin/lmplz --text vocabulary.txt --arpa words.arpa --o 3
/home/deep_train/kenlm/build/bin/build_binary -T -s words.arpa lm.binary

cat is used to concatenate the dev, train and test CSVs into one big file.

I guess it might work with some pre-filtering. But here you are also passing CSV metadata, filenames and file sizes. That's likely going to confuse your LM a lot.

For example, this should be a bit better.

$ tail -n +2 dev.csv | cut -d ',' -f 3 > vocabulary.txt
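If the transcript column ever contains quoted commas, cut will split it incorrectly. A more robust sketch using Python's csv module, assuming the DeepSpeech CSV layout (wav_filename, wav_filesize, transcript, so the sentence is the third column):

```python
import csv

def extract_sentences(csv_path, column=2):
    """Return only the transcript column from a DeepSpeech-style CSV,
    skipping the header row; the csv module handles quoted commas."""
    sentences = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header line
        for row in reader:
            if len(row) > column:
                sentences.append(row[column])
    return sentences
```

You would run this over train.csv, dev.csv and test.csv and write the results into vocabulary.txt before calling lmplz.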

I get it, it's only the text in this file, not the wav filename or file size.
Thanks.

Hello again, I will post the log from the test phase here. The command to train the model:

./DeepSpeech.py --checkpoint_dir /home/deep_train/checkpoint --train_files /home/deep_train/portugues/clips/train.csv --dev_files /home/deep_train/portugues/clips/dev.csv --test_files /home/deep_train/portugues/clips/test.csv --export_dir /home/deep_train/portugues --alphabet_config_path /home/deep_train/portugues/alphabet.txt --automatic_mixed_precision=True --early_stop=False --lm_binary_path /home/deep_train/portugues/clips/lm.binary --lm_trie_path /home/deep_train/portugues/clips/trie --train_batch_size 32 --test_batch_size 16 --dev_batch_size 16 --export_batch_size 1 --learning_rate 0.00095 --epochs 100 > /home/train.log &

I set early stop to false because I wanted to run all the iterations. But the results in the log are still very bad:

Test on /home/deep_train/portugues/clips/test.csv - WER: 0.866583, CER: 0.601828, loss: 107.132584

WER: 2.500000, CER: 0.866667, loss: 34.759850

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_19435022.wav
  • src: “ativar legendas”
  • res: “o de vale de no”

WER: 2.333333, CER: 0.629630, loss: 62.630676

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_19401141.wav
  • src: “pode solicitar participação”
  • res: “o sol e a aros e as”

WER: 2.000000, CER: 0.636364, loss: 21.256580

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_19312911.wav
  • src: “notarização”
  • res: “duraria a”

WER: 2.000000, CER: 0.700000, loss: 27.122267

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_20059720.wav
  • src: “brevemente”
  • res: “de onde”

WER: 2.000000, CER: 0.750000, loss: 30.222834

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_19403802.wav
  • src: “é lógico”
  • res: “e o de o”

WER: 2.000000, CER: 0.642857, loss: 33.146614

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_20017611.wav
  • src: “muito dinheiro”
  • res: “o de de mero”

WER: 2.000000, CER: 0.611111, loss: 33.159065

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_19375886.wav
  • src: “eu chamei benjamin”
  • res: “eu te a me de a vi”

WER: 2.000000, CER: 0.750000, loss: 36.686123

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_19458070.wav
  • src: “outra plataforma”
  • res: “de prata sua mar”

WER: 2.000000, CER: 0.750000, loss: 49.787704

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_19338110.wav
  • src: “quanto foi novamente”
  • res: “o sino na a me o”

WER: 2.000000, CER: 0.695652, loss: 51.571831

  • wav: file:///home/deep_train/portugues/clips/common_voice_pt_19420984.wav
  • src: “desporto associativismo”
  • res: “são um ano teve”

I Exporting the model…
I Models exported at /home/deep_train/portugues
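For context on the numbers above: WER is the word-level edit distance between the reference and the transcript, divided by the number of reference words, which is why it can exceed 1.0 when the transcript has more words than the reference. A minimal sketch (not DeepSpeech's internal implementation):

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance between the two word sequences,
    # keeping only the previous row of the DP table.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1] / len(ref)

wer("ativar legendas", "o de vale de no")  # → 2.5, as in the first sample
```

A two-word reference transcribed as five unrelated words costs five edits, hence 5 / 2 = 2.5.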

The Common Voice Portuguese dataset has only 27h validated. Is that so little that the model can't even learn?
Also, I used 100 epochs, but I am not sure that improves much (since 75 epochs is the recommendation).
I fixed the trie and it is clean now, only the sentences.

Oh yeah, with 27h you really can't expect anything. Your best bet here would be to use transfer learning. That is only on master, though.