Hi @lissyx,
I used the same hyperparameters as the pre-trained models, and this is the output I got:
Epoch 0 | Training | Elapsed Time: 1:12:57 | Steps: 7574 | Loss: 44.740725
Epoch 0 | Validation | Elapsed Time: 0:04:18 | Steps: 1528 | Loss: 49.311232 | Dataset: /home/speech/DeepSpeech1/data/corpus/clips/dev.csv
I Saved new best validating model with loss 49.311232 to: /home/speech/DeepSpeech/data/checkpoint/best_dev-474930
Epoch 1 | Training | Elapsed Time: 1:13:19 | Steps: 7574 | Loss: 40.200257
Epoch 1 | Validation | Elapsed Time: 0:04:16 | Steps: 1528 | Loss: 50.055145 | Dataset: /home/speech/DeepSpeech1/data/corpus/clips/dev.csv
Epoch 2 | Training | Elapsed Time: 1:13:16 | Steps: 7574 | Loss: 37.661443
Epoch 2 | Validation | Elapsed Time: 0:04:17 | Steps: 1528 | Loss: 48.825582 | Dataset: /home/speech/DeepSpeech1/data/corpus/clips/dev.csv
I Saved new best validating model with loss 48.825582 to: /home/speech/DeepSpeech/data/checkpoint/best_dev-490078
Epoch 3 | Training | Elapsed Time: 1:13:19 | Steps: 7574 | Loss: 36.223534
Epoch 3 | Validation | Elapsed Time: 0:04:20 | Steps: 1528 | Loss: 47.598456 | Dataset: /home/speech/DeepSpeech1/data/corpus/clips/dev.csv
I Saved new best validating model with loss 47.598456 to: /home/speech/DeepSpeech/data/checkpoint/best_dev-497652
Epoch 4 | Training | Elapsed Time: 1:13:15 | Steps: 7574 | Loss: 34.446500
Epoch 4 | Validation | Elapsed Time: 0:04:13 | Steps: 1528 | Loss: 47.315255 | Dataset: /home/speech/DeepSpeech1/data/corpus/clips/dev.csv
I Saved new best validating model with loss 47.315255 to: /home/speech/DeepSpeech/data/checkpoint/best_dev-505226
Epoch 5 | Training | Elapsed Time: 1:13:16 | Steps: 7574 | Loss: 31.589634
Epoch 5 | Validation | Elapsed Time: 0:04:18 | Steps: 1528 | Loss: 47.296769 | Dataset: /home/speech/DeepSpeech1/data/corpus/clips/dev.csv
I Saved new best validating model with loss 47.296769 to: /home/speech/DeepSpeech/data/checkpoint/best_dev-512800
Epoch 6 | Training | Elapsed Time: 1:13:17 | Steps: 7574 | Loss: 30.819907
Epoch 6 | Validation | Elapsed Time: 0:04:16 | Steps: 1528 | Loss: 48.929547 | Dataset: /home/speech/DeepSpeech1/data/corpus/clips/dev.csv
I Early stop triggered as (for last 4 steps) validation loss: 48.929547 with standard deviation: 0.138065 and mean: 47.403493
I FINISHED optimization in 9:02:49.285819
INFO:tensorflow:Restoring parameters from /home/speech/DeepSpeech/data/checkpoint/best_dev-512800
I0921 03:05:05.785426 140454949234496 saver.py:1280] Restoring parameters from /home/speech/DeepSpeech/data/checkpoint/best_dev-512800
I Restored variables from best validation checkpoint at /home/speech/DeepSpeech/data/checkpoint/best_dev-512800, step 512800
Testing model on /home/speech/DeepSpeech1/data/corpus/clips/test.csv
Test epoch | Steps: 3014 | Elapsed Time: 0:18:11
Test on /home/speech/DeepSpeech1/data/corpus/clips/test.csv - WER: 0.545397, CER: 0.351697, loss: 54.847588
--------------------------------------------------------------------------------
WER: 8.000000, CER: 3.666667, loss: 146.070511
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_54384.wav
- src: "undefined"
- res: "the dinner and the man is durandarte in"
--------------------------------------------------------------------------------
WER: 2.250000, CER: 2.764706, loss: 234.570007
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_17645060.wav
- src: "did you know that"
- res: "it detentat the juno that tecolote the denotat the quotas"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.571429, loss: 10.787078
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_18320583.wav
- src: "nosiree"
- res: "no there"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.750000, loss: 13.659227
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_191353.wav
- src: "amen"
- res: "a man "
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.454545, loss: 20.949049
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_16047346.wav
- src: "kettledrums"
- res: "catal drams"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.727273, loss: 31.451639
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_629809.wav
- src: "kettledrums"
- res: "do drames"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.642857, loss: 36.353798
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_17267925.wav
- src: "any volunteers"
- res: "and i am the"
--------------------------------------------------------------------------------
WER: 1.666667, CER: 0.814815, loss: 59.893326
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_2421.wav
- src: "programming requires brains"
- res: "the game is won the"
--------------------------------------------------------------------------------
WER: 1.600000, CER: 0.689655, loss: 63.139175
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_18798367.wav
- src: "valentin dubinin has two sons"
- res: "the man on divan in his cheek on"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.500000, loss: 12.955254
- wav: file:///home/speech/DeepSpeech1/data/corpus/clips/common_voice_en_599362.wav
- src: "itching palm"
- res: "i in part"
--------------------------------------------------------------------------------
I Exporting the model...
INFO:tensorflow:Restoring parameters from /home/speech/DeepSpeech/data/checkpoint/train-520374
I0921 03:23:23.183858 140454949234496 saver.py:1280] Restoring parameters from /home/speech/DeepSpeech/data/checkpoint/train-520374
WARNING:tensorflow:From /home/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/tools/freeze_graph.py:233: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
W0921 03:23:23.356780 140454949234496 deprecation.py:323] From /home/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/tools/freeze_graph.py:233: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /home/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/framework/graph_util_impl.py:270: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
W0921 03:23:23.356925 140454949234496 deprecation.py:323] From /home/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/framework/graph_util_impl.py:270: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
INFO:tensorflow:Froze 12 variables.
I0921 03:23:23.395812 140454949234496 graph_util_impl.py:311] Froze 12 variables.
INFO:tensorflow:Converted 12 variables to const ops.
I0921 03:23:23.472876 140454949234496 graph_util_impl.py:364] Converted 12 variables to const ops.
I Models exported at /home/speech/DeepSpeech/data/export/
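A note on the per-sample WER values in the test report above: they can exceed 1.0 because WER is the word-level edit distance divided by the number of *reference* words, so a long hypothesis against a one-word reference (e.g. src "undefined") gives WER 8.0. A minimal sketch of that computation (my own illustration, not the DeepSpeech source):

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = ref.split(), hyp.split()
    # One-row dynamic-programming table for edit distance over words.
    d = list(range(len(h) + 1))
    for i in range(1, len(r) + 1):
        prev = d[:]
        d[0] = i
        for j in range(1, len(h) + 1):
            sub = prev[j - 1] + (r[i - 1] != h[j - 1])  # substitution or match
            d[j] = min(sub, prev[j] + 1, d[j - 1] + 1)  # vs. deletion/insertion
    return d[len(h)] / len(r)
```

With the worst sample above, `wer("undefined", "the dinner and the man is durandarte in")` reproduces the reported 8.0 (one substitution plus seven insertions against a single reference word), which is why the "worst WER" list is dominated by very short references rather than by typical sentences.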
Here is the command I used:
export TF_FORCE_GPU_ALLOW_GROWTH=true
python -u DeepSpeech.py \
--checkpoint_dir /home/speech/DeepSpeech/data/checkpoint/ \
--train_files /home/speech/DeepSpeech1/data/corpus/clips/train.csv \
--dev_files /home/speech/DeepSpeech1/data/corpus/clips/dev.csv \
--test_files /home/speech/DeepSpeech1/data/corpus/clips/test.csv \
--train_batch_size 8 \
--dev_batch_size 8 \
--test_batch_size 4 \
--n_hidden 2048 \
--learning_rate 0.0001 \
--dropout_rate 0.15 \
--epochs 75 \
--lm_alpha 0.75 \
--lm_beta 1.85 \
--lm_binary_path /home/speech/DeepSpeech/data/lm/lm.binary \
--lm_trie_path /home/speech/DeepSpeech/data/lm/trie \
--export_dir /home/speech/DeepSpeech/data/export/ \
"$@"
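For reference, the early-stop message in the log ("for last 4 steps" with standard deviation 0.138065 and mean 47.403493) is a plateau check over recent validation losses. Roughly, it can be sketched like this — the function name, threshold names, and default values here are my assumptions for illustration, not the exact upstream code:

```python
import statistics

def should_early_stop(dev_losses, es_steps=4, mean_th=0.5, std_th=0.5):
    """Plateau check: compare the newest validation loss against the
    statistics of the previous (es_steps - 1) losses."""
    if len(dev_losses) < es_steps:
        return False
    window = dev_losses[-es_steps:]
    prev, current = window[:-1], window[-1]
    mean = statistics.mean(prev)
    std = statistics.pstdev(prev)
    # Stop if the loss regressed past the recent maximum, or if it has
    # flattened out (close to the mean, with little variation).
    return current > max(prev) or (abs(current - mean) < mean_th and std < std_th)
```

Feeding in the validation losses from the log above, the epoch-6 loss (48.929547) is above the maximum of the previous three, which matches the trigger in my run.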
But the exported model (output_graph.pbmm) performs poorly compared to the old one.
My question is: can the Mozilla Common Voice corpus alone give good accuracy, or do I need to train on my own data as well?
I am training on English. The checkpoint step count increased from 467356 to 520374.