After my first run training the model, I'm not sure whether this is a good or bad result.
- Data: about 12,000 WAV files (train:dev:test = 9,500:1,700:800); the transcripts are clean.
- I created an alphabet.txt containing every character that appears in the transcripts.
- Language model: I built the LM from 100% of the transcript text above:

      ./lmplz --text vocabulary.txt --arpa words.arpa --o 3
      ./build_binary -T -s words.arpa lm.binary

  The resulting lm.binary is about 4.1 MB. I then used the LM to generate the trie file (about 65 KB).
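For context, this is roughly how I generated alphabet.txt (a minimal sketch of my own, not DeepSpeech code; `build_alphabet` is a name I made up):

```python
# Collect every unique character seen in the transcripts, with the space
# character forced to the front (DeepSpeech expects space to be listed
# explicitly in alphabet.txt, one character per line).
def build_alphabet(transcripts):
    chars = set()
    for line in transcripts:
        chars.update(line)
    chars.discard(" ")
    return [" "] + sorted(chars)

# Writing it out, one character per line:
# with open("alphabet.txt", "w", encoding="utf-8") as f:
#     f.write("\n".join(build_alphabet(all_transcripts)) + "\n")
```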
- Then I started training the stock DeepSpeech.py with this command line:
python3 DeepSpeech.py \
--train_files=/Users/tringuyen/Documents/DeepSpeech/train.csv \
--test_files=/Users/tringuyen/Documents/DeepSpeech/test.csv \
--dev_files=/Users/tringuyen/Documents/DeepSpeech/dev.csv \
--alphabet_config_path=/Users/tringuyen/Documents/DeepSpeech/mymodels/alphabet.txt \
--lm_binary_path=/Users/tringuyen/Documents/DeepSpeech/mymodels/vnlm.binary \
--lm_trie_path=/Users/tringuyen/Documents/DeepSpeech/mymodels/vntrie \
--checkpoint_dir=/Users/tringuyen/Documents/DeepSpeech/myresult/checkpoints \
--export_dir=/Users/tringuyen/Documents/DeepSpeech/myresult/export \
--summary_dir=/Users/tringuyen/Documents/DeepSpeech/myresult/summary \
--epoch=80 \
--train_batch_size=64 \
--dev_batch_size=64 \
--test_batch_size=32 \
--report_count=100 \
--use_seq_length=False \
--es_std_th=0.1 \
--es_mean_th=0.1
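Since --es_std_th and --es_mean_th control the early stop, here is how I understand the check in DeepSpeech.py (my own sketch of the logic as I read it in this version's source, not the exact code):

```python
import statistics

# Look at the last es_steps validation losses and stop when the newest loss
# is worse than every earlier loss in the window, or when the losses have
# flatlined (newest loss close to the window mean, tiny standard deviation).
def should_early_stop(dev_losses, es_steps=4, es_mean_th=0.1, es_std_th=0.1):
    if len(dev_losses) < es_steps:
        return False
    window = dev_losses[-es_steps:]
    mean = statistics.mean(window[:-1])
    std = statistics.pstdev(window[:-1])
    return window[-1] > max(window[:-1]) or (
        abs(window[-1] - mean) < es_mean_th and std < es_std_th
    )
```

If this sketch is right, then with a last validation loss of 377.71 against a window mean of 363.41, my stop fired on the "newest loss is worse than the whole window" branch, not the flatline one.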
After 4 epochs it stopped training and printed these results:
I Finished validating epoch 3 - loss: 377.712136
I Early stop triggered as (for last 4 steps) validation loss: 377.712136 with standard deviation: 4.922094 and mean: 363.410538
Preprocessing ['/Users/tringuyen/Documents/DeepSpeech/test.csv']
Preprocessing done
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Computing acoustic model predictions...
100% |#######################################################################################################################################################################################################################################|
Decoding predictions...
100% |#######################################################################################################################################################################################################################################|
Test - WER: 0.997571, CER: 0.974905, loss: 185.690430
--------------------------------------------------------------------------------
WER: 1.000000, CER: 5.000000, loss: 33.746929
- src: "bà nói"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 5.000000, loss: 36.778442
- src: "ôi dào"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 8.000000, loss: 47.495766
- src: "vì tối đó"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 7.000000, loss: 47.781513
- src: "trán giô"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 9.000000, loss: 48.339012
- src: "cái khó là"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 8.000000, loss: 50.391266
- src: "làm khung"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 25.000000, loss: 108.753731
- src: "đàn ông có hai thứ để buồn"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 23.000000, loss: 108.777031
- src: "là tương đối có hiệu quả"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 22.000000, loss: 108.902710
- src: "chín mươi chín mươi mốt"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 25.000000, loss: 110.358658
- src: "để chọn máy in cho phù hợp"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 25.000000, loss: 110.703247
- src: "tôi có thể căm giận đủ thứ"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 23.000000, loss: 110.721542
- src: "được mỹ ráo riết tung ra"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 23.000000, loss: 110.758110
- src: "nhưng giấc ngủ chập chờn"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 24.000000, loss: 111.529968
- src: "dọc hai bên sông ngàn sâu"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 22.000000, loss: 111.639183
- src: "làm đẹp không đúng cách"
- res: "ở "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 25.000000, loss: 112.300140
- src: "hai mươi tám hai mươi chín"
- res: "ở "
--------------------------------------------------------------------------------
I Exporting the model...
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/tools/freeze_graph.py:232: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From /usr/local/lib/python3.7/site-packages/tensorflow/python/framework/graph_util_impl.py:245: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
I Models exported at /Users/tringuyen/Documents/DeepSpeech/myresult/export
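One thing about the numbers above: CER looks like it is printed as a raw character edit distance here, not a 0-1 ratio (e.g. 5.0 for the 6-character reference "bà nói"). I checked with a plain Levenshtein distance (my own helper, not DeepSpeech's internal scorer):

```python
# Classic dynamic-programming Levenshtein edit distance. It works on strings
# (character-level distance, i.e. the reported "CER") and on word lists
# (word-level distance, which divided by the reference length gives WER).
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
        prev = cur
    return prev[-1]

# For the first sample above:
# levenshtein("bà nói", "ở ") == 5             -> matches the reported CER 5.0
# levenshtein("bà nói".split(), "ở ".split()) == 2, and 2 / 2 words = WER 1.0
```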
I don't know what is going on with this result. Is it good or bad after only 4 epochs? What can I do to improve it?