Test only shown at the end of training

Hi everyone. I’m following this tutorial, but when training the model to overfit it, the test only runs when training finishes, not after each epoch. Here are the output and the run.sh:

I STARTING Optimization
Epoch 0 |   Training | Elapsed Time: 0:00:00 | Steps: 0 | Loss: 0.000000
2019-08-07 08:18:57.382234: W tensorflow/core/framework/allocator.cc:107] Allocation of 134217728 exceeds 10% of system memory.
Epoch 0 |   Training | Elapsed Time: 0:00:28 | Steps: 1 | Loss: 110.038635
Epoch 0 | Validation | Elapsed Time: 0:00:01 | Steps: 1 | Loss: 106.234268 | Dataset: /home/juan/dataset_overfit/train.csv
I Saved new best validating model with loss 106.234268 to: /home/juan/.local/share/deepspeech/checkpoints/best_dev-8
Epoch 1 |   Training | Elapsed Time: 0:00:00 | Steps: 0 | Loss: 0.000000
2019-08-07 08:19:38.255979: W tensorflow/core/framework/allocator.cc:107] Allocation of 134217728 exceeds 10% of system memory.
Epoch 1 |   Training | Elapsed Time: 0:00:35 | Steps: 1 | Loss: 106.234268
Epoch 1 | Validation | Elapsed Time: 0:00:01 | Steps: 1 | Loss: 99.310707 | Dataset: /home/juan/dataset_overfit/train.csv
W0807 08:20:19.372156 140061870270272 deprecation.py:323] From /home/juan/tmp/deepspeech-venv/lib/python3.6/site-packages/tensorflow/python/training/saver.py:960: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
I Saved new best validating model with loss 99.310707 to: /home/juan/.local/share/deepspeech/checkpoints/best_dev-9
Epoch 2 |   Training | Elapsed Time: 0:00:29 | Steps: 1 | Loss: 99.310707
Epoch 2 | Validation | Elapsed Time: 0:00:01 | Steps: 1 | Loss: 91.462814 | Dataset: /home/juan/dataset_overfit/train.csv
I Saved new best validating model with loss 91.462814 to: /home/juan/.local/share/deepspeech/checkpoints/best_dev-10
Epoch 3 |   Training | Elapsed Time: 0:00:29 | Steps: 1 | Loss: 91.462814
Epoch 3 | Validation | Elapsed Time: 0:00:01 | Steps: 1 | Loss: 84.872940 | Dataset: /home/juan/dataset_overfit/train.csv
I Saved new best validating model with loss 84.872940 to: /home/juan/.local/share/deepspeech/checkpoints/best_dev-11
Epoch 4 |   Training | Elapsed Time: 0:00:29 | Steps: 1 | Loss: 84.872940
Epoch 4 | Validation | Elapsed Time: 0:00:01 | Steps: 1 | Loss: 81.304314 | Dataset: /home/juan/dataset_overfit/train.csv
I Saved new best validating model with loss 81.304314 to: /home/juan/.local/share/deepspeech/checkpoints/best_dev-12
Epoch 5 |   Training | Elapsed Time: 0:00:33 | Steps: 1 | Loss: 81.304314
Epoch 5 | Validation | Elapsed Time: 0:00:01 | Steps: 1 | Loss: 81.230461 | Dataset: /home/juan/dataset_overfit/train.csv
I Saved new best validating model with loss 81.230461 to: /home/juan/.local/share/deepspeech/checkpoints/best_dev-13
Epoch 6 |   Training | Elapsed Time: 0:00:32 | Steps: 1 | Loss: 81.230461
Epoch 6 | Validation | Elapsed Time: 0:00:01 | Steps: 1 | Loss: 83.050926 | Dataset: /home/juan/dataset_overfit/train.csv
Epoch 7 |   Training | Elapsed Time: 0:00:33 | Steps: 1 | Loss: 83.050926
Epoch 7 | Validation | Elapsed Time: 0:00:02 | Steps: 1 | Loss: 84.060440 | Dataset: /home/juan/dataset_overfit/train.csv
I Early stop triggered as (for last 4 steps) validation loss: 84.060440 with standard deviation: 0.841309 and mean: 81.861900
I FINISHED optimization in 0:05:30.997140
I0807 08:24:27.838671 140061870270272 saver.py:1280] Restoring parameters from /home/juan/.local/share/deepspeech/checkpoints/best_dev-13
I Restored variables from best validation checkpoint at /home/juan/.local/share/deepspeech/checkpoints/best_dev-13, step 13
Testing model on /home/juan/dataset_overfit/train.csv
Test epoch | Steps: 1 | Elapsed Time: 0:00:02
Test on /home/juan/dataset_overfit/train.csv - WER: 1.000000, CER: 0.870968, loss: 81.230461
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.870968, loss: 81.230461
 - wav: file:///home/juan/dataset_overfit/wav/common_voice_es_18572713.wav
 - src: "pero tenemos para dar y regalar"
 - res: "era "
--------------------------------------------------------------------------------
I Exporting the model...

And the run.sh:

#!/usr/bin/env bash

set -xe
if [ ! -f DeepSpeech.py ]; then
    echo "Please make sure you run this from DeepSpeech's top level directory."
    exit 1
fi

# Overfit sanity check: train, dev and test all point at the same single-sample CSV.
python -u DeepSpeech.py \
    --train_files "/home/juan/dataset_overfit/train.csv" \
    --dev_files "/home/juan/dataset_overfit/train.csv"  \
    --test_files "/home/juan/dataset_overfit/train.csv"  \
    --alphabet_config_path "/home/juan/overfit-models/alphabet.txt" \
    --lm_binary_path "/home/juan/overfit-models/lm.binary" \
    --lm_trie_path "/home/juan/overfit-models/trie" \
    --learning_rate 0.000025 \
    --dropout_rate 0 \
    --word_count_weight 3.5 \
    --log_level 1 \
    --display_step 1 \
    --epoch 200 \
    --export_dir "/home/juan/overfit-models"

Also, I don’t understand why it early-stops when I didn’t set up any criteria. Is it predefined? Thanks in advance.

We have updated the training code and flags since that tutorial was written. The code now only runs a test epoch at the end of training, since testing is computationally expensive and it usually does not make sense to test after every single epoch.
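
On the early stopping question: yes, it is enabled by default. Your log line actually shows the criterion: the trainer tracks the last few validation losses (4 in your run) and stops once they plateau or start climbing again. Here is a minimal sketch of that kind of check, not the exact DeepSpeech implementation; the window size and thresholds are illustrative:

import numpy as np

def should_early_stop(dev_losses, window=4, mean_th=0.5, std_th=0.5):
    # Illustrative sketch only: window and thresholds are made-up values,
    # not necessarily the defaults DeepSpeech uses.
    if len(dev_losses) < window:
        return False
    recent = dev_losses[-window:]
    mean = np.mean(recent[:-1])  # statistics over the window, excluding the newest loss
    std = np.std(recent[:-1])
    # Stop when the newest loss sits close to the recent mean with little
    # spread (a plateau), or when it has started rising again.
    return (abs(recent[-1] - mean) < mean_th and std < std_th) \
        or recent[-1] > max(recent[:-1])

You can see this in your output: the reported mean (81.86) and standard deviation (0.84) match the three validation losses before the last one, and the newest validation loss (84.06) had started rising. If you want to turn it off, check the flag listing for your version; there should be a boolean early-stop flag you can disable.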


And is there any parameter to test every 5 epochs, for example? I can’t find one.

As @reuben stated, we removed that because it was computationally intensive and not that useful.
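
If you really want a test result every N epochs, one workaround is to run training in short chunks against the same checkpoint directory, so that the test epoch at the end of each invocation gives you a periodic measurement. Below is a rough sketch of such a wrapper; it is hypothetical, reuses the flag names from the run.sh above, and assumes your version resumes from the checkpoint dir (as the checkpoint paths in your log suggest) and treats --epoch as an absolute target. Check both assumptions for your version, and you would likely also want early stopping disabled so chunks are not cut short:

import subprocess

# Hypothetical wrapper: 40 chunks of 5 epochs = 200 epochs total, with the
# test epoch running at the end of every chunk.
for chunk in range(40):
    subprocess.run([
        "python", "-u", "DeepSpeech.py",
        "--train_files", "/home/juan/dataset_overfit/train.csv",
        "--dev_files", "/home/juan/dataset_overfit/train.csv",
        "--test_files", "/home/juan/dataset_overfit/train.csv",
        "--alphabet_config_path", "/home/juan/overfit-models/alphabet.txt",
        # ... plus the rest of the flags from the run.sh above ...
        "--epoch", str(5 * (chunk + 1)),  # absolute target; use a relative
                                          # value if your version expects one
    ], check=True)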
