DeepSpeech model training

If you don't want early stopping and want to complete all epochs, set --noearly_stop.
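For example, a sketch of what such an invocation could look like (the file paths and other flags are placeholders, not specific to this thread); the command is only echoed here so nothing is executed:

```shell
# Sketch of a 0.5.1-style training invocation with early stopping disabled.
# All paths below are placeholders - substitute your own CSV files.
TRAIN_CMD="python3 DeepSpeech.py \
  --train_files data/train.csv \
  --dev_files data/dev.csv \
  --test_files data/test.csv \
  --noearly_stop"
echo "$TRAIN_CMD"
```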


Hi,
Yeah, but if the model loss is not decreasing then the model won't be of much use, right?
As suggested by @lissyx, I have started training with a 1e-6 learning rate; currently it's running on the 7th epoch.

I want to check whether, like the learning rate, I need to change the default value of any other parameters. Is there a good reference where I could get more information about the training parameters?

Hi @lissyx,
I was able to verify the re-training of the acoustic model (output_graph.pb) based on your comment, using the other parameters from the release notes of the 0.5.1 model.

Now we want to verify whether we can fine-tune the language model (lm.binary & trie) with our domain-related keywords. I followed the two discussions below, and what I understood is that "fine-tuning of the language model is not possible yet"… is that right?

You are basing your understanding on very old threads. Have a look at data/lm, it has everything you need.

@laxmikant04.yadav: if you are still looking for DeepSpeech results on German and details of the training process, check this paper and repository. It might be useful.

https://www.researchgate.net/publication/336532830_German_End-to-end_Speech_Recognition_based_on_DeepSpeech


Hi @lissyx,

I was able to create new trie and lm.binary files based on our organisation-specific keywords. I followed the two references below, and I am using DeepSpeech version 0.5.1:

1. TUTORIAL : How I trained a specific french model to control my robot
2. https://github.com/mozilla/DeepSpeech/tree/v0.5.1/data/lm

When I started training with the newly generated trie and lm.binary files to generate the acoustic model, train.csv and dev.csv gave no errors, but I got a fatal error at the test step:

Fatal Python error: Segmentation fault

When I looked around the forum, in some places I found that it could be because of a version mismatch. Could that be the case? If yes, where should I look first to get it right?

Note: When I train with the trie and lm.binary that are in the git repo, it works fine.
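One quick way to check for such a mismatch is to compare the installed ds-ctcdecoder wheel against the release you expect (the "0.5.1" below is an example; in a fresh shell the wheel may simply be absent):

```shell
# Compare the expected DeepSpeech release with the installed ds-ctcdecoder
# wheel; a mismatch between the two is a common cause of segfaults.
expected="0.5.1"
installed=$(pip3 show ds-ctcdecoder 2>/dev/null | awk '/^Version:/ {print $2}')
if [ -z "$installed" ]; then
  result="ds-ctcdecoder not installed"
elif [ "$installed" = "$expected" ]; then
  result="versions match: $installed"
else
  result="version mismatch: $installed vs $expected"
fi
echo "$result"
```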

Make sure you properly create the LM and trie files as documented in data/lm. The tutorial is likely out of date, so I'd advise not to spend too much time on it.

What's the size of your LM and trie? Can you verify you are using the right ones? Can you share your exact steps for producing the trie?

Hi @lissyx,

Below are the steps I used for generating the trie and lm.binary files.

Generating the language model –

  • Clone the DeepSpeech git repo, branch 0.5.1, with git lfs
  • Install the dependencies
    • pip3 install -r requirements.txt
  • Install the CTC decoder
    • pip3 install $(python3 util/taskcluster.py --decoder)
  • Clone TensorFlow in the same directory as DeepSpeech
  • In the TensorFlow directory run
    • git checkout origin/r1.13
  • As the TensorFlow version is 1.13, the matching Bazel version is 0.19.2
  • I am using Ubuntu 16.04, so based on https://docs.bazel.build/versions/master/install-ubuntu.html I executed the commands below
    • sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python3
    • Downloaded the Bazel installer - bazel-0.19.2-installer-linux-x86_64.sh
    • chmod +x bazel-0.19.2-installer-linux-x86_64.sh
    • ./bazel-0.19.2-installer-linux-x86_64.sh --user
      • Used all the default/recommended options
    • export PATH="$PATH:$HOME/bin"
  • Navigated to the tensorflow directory and executed
    • ./configure
      • Used all the default/recommended options
    • ln -s ../DeepSpeech/native_client ./
    • bazel build --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:generate_trie
  • So far, I was able to compile DeepSpeech and I could see the binaries in the /tensorflow/bazel-bin/native_client directory.
  • Navigated to DeepSpeech/native_client,
    • Cloned the kenlm repo - git clone --depth 1 https://github.com/kpu/kenlm
    • In the kenlm directory, created a build folder
    • Navigated to the build folder and executed
      • cmake ..
      • make -j 4
  • After the above step I could see lmplz and build_binary in the /DeepSpeech/native_client/kenlm/build/bin directory
  • From the /DeepSpeech/native_client/kenlm/build/bin directory, executed
    • ./lmplz --order 5 --memory 50% --text /home/laxmikantm/proto_1/vocabulary.txt --arpa /tmp/lm.arpa --prune 0 0 0 1 --temp_prefix /tmp/
    • ./build_binary -a 255 -q 8 trie /tmp/lm.arpa /tmp/lm.binary
  • From the /tensorflow/bazel-bin/native_client directory
    • ./generate_trie /home/xxxxxxx/proto_1/vocabulary.txt /tmp/lm.binary /tmp/trie
  • After the above step I had trie and lm.binary in the tmp folder; I copied these files to a new folder and then used them from there.
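For anyone following along: the vocabulary.txt that lmplz consumes in the steps above is just plain text, one normalized sentence per line. A made-up example (these sentences are invented for illustration):

```shell
# A toy vocabulary file of the kind lmplz expects: one sentence per line,
# lowercased, no punctuation. Contents are invented for illustration.
cat > /tmp/vocabulary.txt <<'EOF'
turn the robot left
turn the robot right
stop the robot now
EOF
wc -l < /tmp/vocabulary.txt
```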

For testing purposes I have only 10 files, and the generated file sizes are:
trie - 75 bytes
lm.binary - 9.4K

You don't need to rebuild libdeepspeech.so or generate_trie, just download the prebuilt ones.

Check your trie creation, 75 bytes is wrong.
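A quick way to eyeball the generated artifacts (paths as used earlier in this thread); a trie of only a few dozen bytes usually means the generate_trie step got the wrong inputs:

```shell
# Print the sizes of the generated LM artifacts; report missing files
# instead of failing, so this is safe to run at any point.
for f in /tmp/lm.binary /tmp/trie; do
  if [ -f "$f" ]; then
    ls -l "$f"
  else
    echo "$f: missing"
  fi
done
```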

You may need to adjust the lmplz parameters as well, if you don't have a lot of data.

I don’t see a python virtualenv being setup, you should use one.

Especially this might get tricky; can you run pip3 list and share the output?

Hi @lissyx

Here is the list -

Package Version


absl-py 0.8.1
astor 0.8.0
attrdict 2.0.1
audioread 2.1.8
bcrypt 3.1.7
beautifulsoup4 4.8.1
bs4 0.0.1
certifi 2019.9.11
cffi 1.13.2
chardet 3.0.4
cryptography 2.8
cycler 0.10.0
decorator 4.4.1
ds-ctcdecoder 0.5.1
gast 0.3.2
grpcio 1.25.0
h5py 2.10.0
idna 2.8
joblib 0.14.0
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.0
kiwisolver 1.1.0
librosa 0.7.1
llvmlite 0.30.0
Markdown 3.1.1
matplotlib 3.0.3
mock 3.0.5
numba 0.46.0
numpy 1.15.4
pandas 0.24.2
paramiko 2.6.0
pip 19.3.1
pkg-resources 0.0.0
progressbar2 3.47.0
protobuf 3.10.0
pycparser 2.19
PyNaCl 1.3.0
pyparsing 2.4.4
python-dateutil 2.8.1
python-utils 2.3.0
pytz 2019.3
pyxdg 0.26
requests 2.22.0
resampy 0.2.2
scikit-learn 0.21.3
scipy 1.3.1
setuptools 41.6.0
six 1.13.0
SoundFile 0.10.2
soupsieve 1.9.5
sox 1.3.7
tensorboard 1.13.1
tensorflow 1.13.1
tensorflow-estimator 1.13.0
termcolor 1.1.0
urllib3 1.25.6
Werkzeug 0.16.0
wheel 0.33.6

I created a virtualenv; sorry, I forgot to mention it.

Sorry, I didn't get that. You mean downloading these files externally?
I can see generate_trie.cpp in the /DeepSpeech/native_client folder.

PS: forgive me for replying separately, I will keep it in mind from next time.

Yeah, generate_trie is bundled in native_client.tar.xz
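For completeness, unpacking it looks like this (tarball name taken from the 0.5.1 release asset mentioned earlier; the guard just reports if it hasn't been downloaded yet):

```shell
# Unpack the prebuilt native client next to DeepSpeech; generate_trie is
# one of the binaries inside the tarball.
TARBALL=native_client.amd64.cuda.linux.tar.xz
mkdir -p external_native_client
if [ -f "$TARBALL" ]; then
  tar -xf "$TARBALL" -C external_native_client
  ls external_native_client
else
  echo "$TARBALL not found - download it from the release page first"
fi
```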

Hi @lissyx,

I was able to generate the language model and test it using the steps below. Thank you for your help.

  • Created a virtualenv -
    • virtualenv -p python3 $HOME/tmp/deepspeech-venv/
    • source $HOME/tmp/deepspeech-venv/bin/activate
  • Cloned the DeepSpeech git repo, branch 0.5.1, with git lfs
  • Installed the dependencies
    • pip3 install -r requirements.txt
    • pip3 uninstall tensorflow
    • pip3 install 'tensorflow-gpu==1.13.1'
  • Installed the CTC decoder
    • pip3 install $(python3 util/taskcluster.py --decoder)
  • Downloaded the native client from https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/native_client.amd64.cuda.linux.tar.xz and extracted it into an external_native_client folder in the same directory as DeepSpeech.
  • Navigated to DeepSpeech/native_client,
    • Deleted the existing kenlm folder - rm -rf kenlm
    • Cloned the kenlm repo - git clone https://github.com/kpu/kenlm.git
    • In the kenlm directory, created a build folder
    • Navigated to the build folder and executed
      • cmake ..
      • make -j 4
  • From the /DeepSpeech/native_client/kenlm/build/bin directory, executed
    • ./lmplz --text /home/laxmikantm/proto_1/vocabulary.txt --arpa /tmp/words.arpa --order 5 --discount_fallback --temp_prefix /tmp/
    • ./build_binary -T -s trie /tmp/words.arpa /tmp/lm.binary
  • Then, using the external_native_client files, created the trie file
    • ./generate_trie /home/laxmikantm/DeepSpeech/data/alphabet.txt /tmp/lm.binary /tmp/trie
  • After the above step I had trie and lm.binary in the tmp folder; I copied these files to a new folder and then used them from there.
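A sketch of how the new files could be smoke-tested with the 0.5.1 client (the model and audio paths are placeholders); the command is echoed here rather than executed:

```shell
# 0.5.1-era client flags: --alphabet/--lm/--trie. Paths are placeholders;
# substitute your own acoustic model and a real WAV file.
INFER_CMD="deepspeech \
  --model output_graph.pb \
  --alphabet data/alphabet.txt \
  --lm /tmp/lm.binary \
  --trie /tmp/trie \
  --audio test.wav"
echo "$INFER_CMD"
```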

Thanks!!

Thanks @laxmikant04.yadav. Have you been able to identify the mistake you made before? If you can explain it clearly, it might help others.

Hi @lissyx,

Below are the major changes that made it work for me -

  • I am working with the DeepSpeech 0.5.1 model, but for language model creation I was referring to the latest docs, which are for version 0.6.0-alpha.14. So please refer to the docs for the correct version.

  • I was setting up TensorFlow and Bazel separately, but as you advised we don't need to do that. We can get native_client.tar.xz from the release page.

  • Instead of cloning kenLM as
    git clone --depth 1 https://github.com/kpu/kenlm , I cloned it as
    git clone https://github.com/kpu/kenlm.git . (I am not sure if it made a difference.)

  • Again, not so sure, but when I set up with plain tensorflow I got a segmentation fault (core dumped) error while testing the files, and when I did the set-up again in a new virtualenv with tensorflow-gpu it worked fine for me. (It could be an issue with the virtualenv.)

  • And most important, DO set up in a new virtualenv :smiley:

Thanks!!


If you are referring to the links, those are right if you select the right branch.

I’m not sure why people constantly go to the full build, if you see the pattern in the doc that leads to that, please feel free to open an issue / send a PR.

Should make no difference

Was your first virtualenv fresh, or an old one? I ran into that as well, with an old virtualenv on a Debian sid that gets upgraded regularly.

Hi @lissyx,

This comment may look like a repeat of the ones discussed earlier; forgive me for that.

But here are my observations -

  1. The newly trained language and acoustic models work well on the data I trained on.
  2. But when I recorded audio files for a few sentences (10 sentences) and created a whole new language and acoustic model, it is not giving good accuracy (very poor accuracy).

I know there is a bit of background noise, but we were expecting it to work on the training files, and as it was a fresh training run it was not using any previous files/checkpoints.

I tried changing the following hyper-parameters -
test batch size - 1/2/3
dev batch size - 1/2/3
train batch size - 1/2/3
learning rate - 0.001/0.0001/0.00001
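For the record, the sweep above can be scripted (flag names as in 0.5.1's DeepSpeech.py; batch sizes fixed at 1 for illustration); the commands are echoed here rather than executed:

```shell
# Echo one training command per learning rate; the values mirror the ones
# listed above. Replace echo with the real invocation to actually run it.
for lr in 0.001 0.0001 0.00001; do
  echo python3 DeepSpeech.py \
    --train_batch_size 1 --dev_batch_size 1 --test_batch_size 1 \
    --learning_rate "$lr"
done
```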

What I wanted to check is: do I really need to clean the audio files to have literally zero background noise, or are there training parameters that I need to tune to get it right?

Any suggestions please.

Sorry, I don’t get the exact question here.