Training a Vietnamese model

Following the result of my first training run, I don’t know whether this is a bad or a good result.


  1. Data: my dataset is about 12,000 wav files (train:dev:test = 9,500:1,700:800), and the transcripts are clean.

  2. I created alphabet.txt containing every character that appears in the transcripts.
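
For reference, a minimal shell sketch of that step, assuming the standard DeepSpeech CSV column order (wav_filename, wav_filesize, transcript) and GNU tools; the CSV names are the ones from the training command below:

# Collect every distinct character from the transcript column of the CSVs.
# Inspect the output by hand afterwards: the space character must be kept,
# and a literal '#' needs attention since alphabet.txt treats '#' lines as comments.
tail -q -n +2 train.csv dev.csv test.csv \
  | cut -d',' -f3- \
  | grep -o . \
  | sort -u > alphabet.txt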

  3. Language model: I built the LM from 100% of the transcript text of the data above.

./lmplz --text vocabulary.txt --arpa words.arpa -o 3
./build_binary -T -s words.arpa lm.binary

The lm.binary file is about 4.1 MB. Then I used the LM to generate the trie file (the trie is about 65 KB).
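
The trie itself comes from the generate_trie tool in the native client. Its argument list has changed between releases, so check generate_trie --help in your checkout; on the 0.4.x tree it is roughly:

# alphabet, binary LM, LM vocabulary, output trie path (0.4.x order; verify locally)
./generate_trie alphabet.txt lm.binary vocabulary.txt vntrie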

  4. Then I started training the stock DeepSpeech.py with this command line:
python3 DeepSpeech.py \
--train_files=/Users/tringuyen/Documents/DeepSpeech/train.csv \
--test_files=/Users/tringuyen/Documents/DeepSpeech/test.csv \
--dev_files=/Users/tringuyen/Documents/DeepSpeech/dev.csv \
--alphabet_config_path=/Users/tringuyen/Documents/DeepSpeech/mymodels/alphabet.txt \
--lm_binary_path=/Users/tringuyen/Documents/DeepSpeech/mymodels/vnlm.binary \
--lm_trie_path=/Users/tringuyen/Documents/DeepSpeech/mymodels/vntrie \
--checkpoint_dir=/Users/tringuyen/Documents/DeepSpeech/myresult/checkpoints \
--export_dir=/Users/tringuyen/Documents/DeepSpeech/myresult/export \
--summary_dir=/Users/tringuyen/Documents/DeepSpeech/myresult/summary \
--epoch=80 \
--train_batch_size=64 \
--dev_batch_size=64 \
--test_batch_size=32 \
--report_count=100 \
--use_seq_length=False \
--es_std_th=0.1 \
--es_mean_th=0.1

After 4 epochs it stopped training and printed these results:

I Finished validating epoch 3 - loss: 377.712136
I Early stop triggered as (for last 4 steps) validation loss: 377.712136 with standard deviation: 4.922094 and mean: 363.410538
Preprocessing ['/Users/tringuyen/Documents/DeepSpeech/test.csv']
Preprocessing done
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Computing acoustic model predictions...
100% |##########|
Decoding predictions...
100% |##########|
Test - WER: 0.997571, CER: 0.974905, loss: 185.690430
--------------------------------------------------------------------------------
WER: 1.000000, CER: 5.000000, loss: 33.746929
 - src: "bà nói"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 5.000000, loss: 36.778442
 - src: "ôi dào"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 8.000000, loss: 47.495766
 - src: "vì tối đó"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 7.000000, loss: 47.781513
 - src: "trán giô"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 9.000000, loss: 48.339012
 - src: "cái khó là"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 8.000000, loss: 50.391266
 - src: "làm khung"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 25.000000, loss: 108.753731
 - src: "đàn ông có hai thứ để buồn"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 23.000000, loss: 108.777031
 - src: "là tương đối có hiệu quả"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 22.000000, loss: 108.902710
 - src: "chín mươi chín mươi mốt"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 25.000000, loss: 110.358658
 - src: "để chọn máy in cho phù hợp"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 25.000000, loss: 110.703247
 - src: "tôi có thể căm giận đủ thứ"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 23.000000, loss: 110.721542
 - src: "được mỹ ráo riết tung ra"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 23.000000, loss: 110.758110
 - src: "nhưng giấc ngủ chập chờn"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 24.000000, loss: 111.529968
 - src: "dọc hai bên sông ngàn sâu"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 22.000000, loss: 111.639183
 - src: "làm đẹp không đúng cách"
 - res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 25.000000, loss: 112.300140
 - src: "hai mươi tám hai mươi chín"
 - res: "ở "
--------------------------------------------------------------------------------
I Exporting the model...
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/tools/freeze_graph.py:232: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/framework/graph_util_impl.py:245: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
I Models exported at /Users/tringuyen/Documents/DeepSpeech/myresult/export


I don’t know what is going on with this result. Is it bad or good after 4 epochs? What can I do to improve it?

I had the same issue while training on Chinese.
Let me get back to you later about what I did.
But I think it helps to enlarge the language model by adding other text to the transcript corpus.
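
If you try that, rebuilding is just a rerun of the KenLM steps over the enlarged text. A sketch, with extra_text.txt standing in for whatever external text you collect:

# Append external text to the transcript corpus and rebuild the LM
cat vocabulary.txt extra_text.txt > lm_text.txt
./lmplz -o 3 --text lm_text.txt --arpa words.arpa
./build_binary words.arpa lm.binary
# Remember to regenerate the trie afterwards; it must match the new LM.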


I’m still stuck on this issue. I hope someone can help me.

Add --early_stop False on the command line; these flags can be found in util/flags.py.
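
For example, the earlier training command with early stopping disabled (a sketch; keep the remaining flags from the original post unchanged):

python3 DeepSpeech.py \
  --train_files=/Users/tringuyen/Documents/DeepSpeech/train.csv \
  --dev_files=/Users/tringuyen/Documents/DeepSpeech/dev.csv \
  --test_files=/Users/tringuyen/Documents/DeepSpeech/test.csv \
  --checkpoint_dir=/Users/tringuyen/Documents/DeepSpeech/myresult/checkpoints \
  --early_stop=False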

No no, I just want to know whether my data is too bad. Why are all the outputs the same?

I think you need to change the batch size and the learning rate; your model is not converging.
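
Both are ordinary flags on the training script. The values below are only illustrative starting points, not tuned recommendations:

python3 DeepSpeech.py \
  --train_files=/Users/tringuyen/Documents/DeepSpeech/train.csv \
  --dev_files=/Users/tringuyen/Documents/DeepSpeech/dev.csv \
  --test_files=/Users/tringuyen/Documents/DeepSpeech/test.csv \
  --train_batch_size=16 \
  --dev_batch_size=16 \
  --learning_rate=0.00005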

I am working on Urdu and facing the same issue.
Please help if somebody has found a solution to this problem.
My data is almost 100 hours.

@lissyx @kdavis Kindly help.

What does “same issue” mean in this context?

My trained model gives me only one word for all files, whether I check it on train files or test files.

When I trained on a very small corpus, almost 6 files, it learned only one word, though that word doesn’t belong to any of the data, and it gives the same word whether I decode a train file or a test file.

And when I tried almost 100 hours, it doesn’t give a single word in the results.
Epochs: 20, with early stop = TRUE
Learning rate: 0.0001

Why is that?

What happens if you run run-ldc93s1.sh as follows:

(.virtualenv) kdavis-19htdh:DeepSpeech kdavis$ ./bin/run-ldc93s1.sh


WER: 1.000000, CER: 48.000000, loss: 27.773754
 - src: "she had your dark suit in greasy wash water all year"
 - res: "edted"

This is the result of ldc93s1.

I’m not sure how, but your install is very broken. This is basically a “smoke test” that every one of our PRs has to pass.

If I were you, I’d check everything out from scratch and follow the README instructions again.


Are we supposed to train the model with the virtual environment activated?

It’s recommended that you use a virtual environment.
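
For completeness, the usual setup looks roughly like this (the environment path is arbitrary):

# Create and activate a Python 3 virtual environment,
# then install the training dependencies inside it
virtualenv -p python3 ~/deepspeech-venv
source ~/deepspeech-venv/bin/activate
pip3 install -r requirements.txt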

I followed the GitHub guidelines throughout and am working on 0.4.1 master.
I also got the exported model with the best validation loss.

Installation:
Linux 16.04 LTS
CUDA 9.0
cuDNN 7.1.3
Python 3.6.3
DeepSpeech 0.4.1 master
requirements.txt installed, but with tensorflow-gpu == 1.12.0
Installed Git LFS from the link given on GitHub
Bazel 0.5.1
Downloaded the pre-trained model, checked it on Common Voice utterances; the results are almost 99% correct
Installed the CTC decoder
Prepared the data in CSV format
Built the native client
Prepared a language model with KenLM and generated the trie file

I got no errors during training, got the .pb model at the end, and the test step reported a WER, but the decoded output is empty.

Please let me know where I am going wrong.
Thank you for all the help.

There are many steps here and a problem can creep in anywhere.

To help in debugging, can you supply the final training log?

log.zip (106.9 KB)

Here it is.

Don’t know if it’s the encoding, but the log looks like line noise.