DeepSpeech model training

I was wrong: the 0.5.1 model was not released with noise robustness improvements. But the rest of my comment is valid.

Thanks @lissyx for your response.

The audio files I mentioned were created by me using an online tool for text-to-speech conversion.

The audio files work fine for me, but currently I am working on live streaming via microphone, and that is not giving proper accuracy, as mentioned.

I will re-check the mic audio quality.

And yes, I noticed the deepspeech version in the requirements file and updated it to 0.5.1 for my local run. It was giving an error if I tried to execute with version 0.4.1 (I think maybe because I have the deepspeech interface of version 0.5.1).
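(For anyone hitting the same mismatch, the simple way to pin the Python binding to the 0.5.1 release, assuming the standard PyPI deepspeech package, is:

pip3 install deepspeech==0.5.1

and the entry in the requirements file then just needs to match that version.)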

Thanks!!

So it’s not you speaking?

Then maybe it is also a problem of accent.

yes… correct.

How could I mitigate it? Do I need to train the English model with my accent, for example, and with background noise as well?

Wanted to check: are you and your team going to release a noise-robust model for English in the near future?

Mostly, yes

This is something we are working on, but whether it will be in the near future I can’t tell.

Maybe try some denoising library in front?
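For example, a rough sketch using SoX’s noise reduction effect (file names here are placeholders, you need a short clip containing only the background noise, and the 0.21 reduction amount is just a starting point to tune):

sox noise-only.wav -n noiseprof noise.prof
sox recording.wav cleaned.wav noisered noise.prof 0.21
deepspeech --model output_graph.pbmm --alphabet alphabet.txt --lm lm.binary --trie trie --audio cleaned.wav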


Can anyone enlighten me? I am stuck at a fatal Python error: Segmentation fault.
I am using a virtual environment and have run DeepSpeech using the .sh file below.


This is my error log,

This is my .sh file

Could you solve this error? I have the same error.
How can I solve it?

Yes, I have finally solved it.


Hi @lissyx,

I have started to train my model for actual use with an Indian accent, but each time I get an early stopping message with the default learning rate, and if I increase the learning rate to 0.01 or 0.05 it starts to give an infinite loss.

To cross-check, I started training with the official English dataset downloaded from the voice-web site, which is about 30 GB in size. I started training with the default parameters for learning rate and epochs, and I got the same early stop message after 4 epochs.

Could you please guide me in identifying the cause/reason and what I must be missing?

Without more information on your dataset, its size, and the other training parameters, it’s hard to give a definitive answer. It’s not unlikely that your learning rate, at least, is wrong, and you should try values like 1e-5 or 1e-6.
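For example, something like this (a sketch with placeholder CSV paths; the flag name comes from util/flags.py):

python3 DeepSpeech.py --train_files train.csv --dev_files dev.csv --test_files test.csv --learning_rate 1e-5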

Hi @lissyx,

Currently, to verify the training process at our end, we have started training on the official English dataset from https://voice.mozilla.org/en/datasets, which is about 30 GB in size. We are using the default training parameters as defined in https://github.com/mozilla/DeepSpeech/blob/master/util/flags.py

Thanks

Here is the command I am using to train the dataset:

python3 DeepSpeech.py --checkpoint_dir /home/laxmikantm/deepspeech_training/checkpoints/checkpoint_local/ --export_dir /home/laxmikantm/deepspeech_training/model/ --train_files ./train.csv --dev_files ./dev.csv --test_files ./test.csv

And I get the message below:

I Early stop triggered as (for last 4 steps) validation loss: 151.497941 with standard deviation: 0.247606 and mean: 151.535807
I FINISHED optimization in 1 day, 13:32:10.745527

If you don’t want early stopping and want to complete the epochs, set --noearly_stop.
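Applied to the command above, that would look roughly like this (checkpoint and export paths shortened to placeholders):

python3 DeepSpeech.py --noearly_stop --checkpoint_dir ./checkpoints/ --export_dir ./model/ --train_files ./train.csv --dev_files ./dev.csv --test_files ./test.csv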


Hi,
Yeah, but if the model loss is not decreasing then the model won’t be of much use, right?
As suggested by @lissyx, I have started training with a 1e-6 learning rate; currently it’s running on the 7th epoch.

I want to check whether, like the learning rate, I need to change the default value of any other parameters. Or is there any good reference where I could get more information about the training parameters?
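Besides reading util/flags.py, one way to see every training flag with its default value should be the built-in flags help (assuming the absl/TensorFlow flags machinery that DeepSpeech.py uses handles it):

python3 DeepSpeech.py --helpfull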

Hi @lissyx,
I was able to verify the re-training of the acoustic model (output_graph.pb) based on your comment, using the other parameters from the release notes of the 0.5.1 model.

Now we want to verify whether we could fine-tune the language model (lm.binary & trie) with our domain-related keywords. I followed the two discussions below, and what I understood is that “fine-tuning of the language model is not possible yet”… is that right?

You are basing your understanding on very old threads. Have a look at data/lm, it has everything you need.

@laxmikant04.yadav: if you are still looking for DeepSpeech results on the German language and the training process, check this paper and repository. It might be useful.

https://www.researchgate.net/publication/336532830_German_End-to-end_Speech_Recognition_based_on_DeepSpeech


Hi @lissyx,

I was able to create new trie and lm.binary files based on our organisation-specific keywords. I followed the two references below, and I am using DeepSpeech version 0.5.1:

1. TUTORIAL : How I trained a specific french model to control my robot
2.https://github.com/mozilla/DeepSpeech/tree/v0.5.1/data/lm

When I started training with the newly generated trie and lm.binary files to generate the acoustic model, train.csv and dev.csv gave no error, but I got a fatal error at the test step.

Fatal Python error: Segmentation fault

When I looked further on the forum, in some places I found that it could be because of a version mismatch. Could that be the case? If yes, where should I be looking first to get it right?

Note: when I train with the trie and lm.binary that are in the git repo, it works fine.

Make sure you properly create the LM and trie file as documented in data/lm. The tutorial is likely out of date, so I’d advise not to spend too much time on it.

What’s the size of your LM and trie? Can you verify you are using the right ones? Can you share your exact steps for producing the trie?

Hi @lissyx,

Below are the steps I used for generating the trie and lm.binary files.

Generating the language model:

  • Clone the DeepSpeech git repo, branch 0.5.1, with git lfs
  • Install the dependencies
    • pip3 install -r requirements.txt
  • Install the CTC decoder
    • pip3 install $(python3 util/taskcluster.py --decoder)
  • Clone TensorFlow in the same directory as DeepSpeech
  • In the TensorFlow directory, run
    • git checkout origin/r1.13
  • As the TensorFlow version is 1.13, the matching Bazel build tool version is 0.19.2
  • I am using Ubuntu 16.04, so based on https://docs.bazel.build/versions/master/install-ubuntu.html I executed the commands below
    • sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python3
    • Downloaded the Bazel installer - bazel-0.19.2-installer-linux-x86_64.sh
    • chmod +x bazel-0.19.2-installer-linux-x86_64.sh
    • ./bazel-0.19.2-installer-linux-x86_64.sh --user
      • Used all the default/recommended options
    • export PATH="$PATH:$HOME/bin"
  • Navigated to the tensorflow directory and executed
    • ./configure
      • Used all the default/recommended options
    • ln -s ../DeepSpeech/native_client ./
    • bazel build --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:generate_trie
  • So far, I was able to compile DeepSpeech and could see the binaries in the /tensorflow/bazel-bin/native_client directory.
  • Navigated to DeepSpeech/native_client
    • Cloned the kenlm repo - git clone --depth 1 https://github.com/kpu/kenlm
    • In the kenlm directory, created a build folder
    • Navigated to the build folder and executed
      • cmake ..
      • make -j 4
  • After the above step I could see lmplz and build_binary in the /DeepSpeech/native_client/kenlm/build/bin directory
  • From the /DeepSpeech/native_client/kenlm/build/bin directory, executed
    • ./lmplz --order 5 --memory 50% --text /home/laxmikantm/proto_1/vocabulary.txt --arpa /tmp/lm.arpa --prune 0 0 0 1 --temp_prefix /tmp/
    • ./build_binary -a 255 -q 8 trie /tmp/lm.arpa /tmp/lm.binary
  • From the /tensorflow/bazel-bin/native_client directory
    • ./generate_trie /home/xxxxxxx/proto_1/vocabulary.txt /tmp/lm.binary /tmp/trie
  • After the above step I had the trie and lm.binary in the /tmp folder; I copied these files to a new folder and then used them from there.

For testing purposes I have only 10 files, and the generated file sizes are:
trie - 75 bytes
lm.binary - 9.4K
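For completeness, a sketch of how such files would be wired into 0.5.1 training (flag names as in util/flags.py, all paths placeholders), making sure the alphabet passed to training is the same alphabet.txt used with generate_trie:

python3 DeepSpeech.py --train_files ./train.csv --dev_files ./dev.csv --test_files ./test.csv --alphabet_config_path data/alphabet.txt --lm_binary_path /path/to/lm.binary --lm_trie_path /path/to/trie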