Loss is fluctuating, but WER and edit distance are decreasing

I’m using a pre-trained model (https://github.com/mozilla/DeepSpeech/releases) to bootstrap training on a different language. After a few epochs the validation and test loss fluctuate while the training loss keeps decreasing. I think that result is expected, because my own dataset is small compared to the data the pre-trained model was trained on (over-fitting). However, the WER and mean edit distance are still decreasing. Can anyone explain why this is happening?
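(For reference: by WER I mean the word-level edit distance normalized by the reference length, and mean edit distance is the character-level analogue. A rough plain-Python sketch of these metrics, not DeepSpeech’s actual implementation:)

def levenshtein(a, b):
    # Classic two-row dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]

def wer(ref, hyp):
    # Word error rate: word-level edit distance / number of reference words.
    r, h = ref.split(), hyp.split()
    return levenshtein(r, h) / max(len(r), 1)

def char_distance(ref, hyp):
    # Character-level edit distance normalized by reference length.
    return levenshtein(ref, hyp) / max(len(ref), 1)

print(wer("the cat sat", "the cat sad"))  # 0.333... (1 substitution / 3 words)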

Are you sure you’re not looking at the training WER and mean edit distance? Test WER is only displayed during test epochs.

It’s the test WER and mean edit distance.
I’m not using a language model.

I Training of Epoch 1 - loss: 36.160567
I Validation of Epoch 1 - loss: 28.253072
I Test of Epoch 2 - WER: 0.952353, loss: 28.234670582939597, mean edit distance: 0.270661
I Training of Epoch 2 - loss: 29.365231
I Validation of Epoch 2 - loss: 25.669184
I Training of Epoch 3 - loss: 24.687250
I Validation of Epoch 3 - loss: 30.121517
I Test of Epoch 4 - WER: 0.902173, loss: 24.7095708286061, mean edit distance: 0.237376
I Training of Epoch 4 - loss: 20.829074
I Validation of Epoch 4 - loss: 23.312354
I Test of Epoch 5 - WER: 0.883708, loss: 23.685436304877786, mean edit distance: 0.226368
I Training of Epoch 5 - loss: 17.884079
I Validation of Epoch 5 - loss: 23.330662
I Test of Epoch 6 - WER: 0.861326, loss: 24.185932608211743, mean edit distance: 0.221251
I Training of Epoch 6 - loss: 15.401079
I Validation of Epoch 6 - loss: 23.116984
I Test of Epoch 7 - WER: 0.855501, loss: 23.522796462563907, mean edit distance: 0.21414
I Training of Epoch 7 - loss: 13.353899
I Validation of Epoch 7 - loss: 22.591326
I Test of Epoch 8 - WER: 0.818344, loss: 23.33637237548828, mean edit distance: 0.205596
I Training of Epoch 8 - loss: 11.634906
I Validation of Epoch 8 - loss: 22.598969
I Test of Epoch 9 - WER: 0.798256, loss: 23.326629638671875, mean edit distance: 0.197845

and so on …
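To eyeball the trend more easily, here is a quick throwaway script (assuming the log format pasted above, and a hypothetical file name train.log) that pulls the test WER, loss, and distance out of the log:

import re

# Matches the "Test of Epoch ..." lines in the output above.
pattern = re.compile(r"Test of Epoch (\d+) - WER: ([\d.]+), "
                     r"loss: ([\d.]+), mean edit distance: ([\d.]+)")

with open("train.log") as f:  # hypothetical file holding the output above
    for line in f:
        m = pattern.search(line)
        if m:
            epoch, wer, loss, dist = m.groups()
            print(f"epoch {epoch}: WER={wer} loss={loss} distance={dist}")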

Hi,
may I ask you a question? How do you show the training WER and mean edit distance during training? I tried to add an op in calculate_mean_edit_distance_and_loss(model_feeder, tower, dropout, reuse, step) like this:

decoded, _ = tf.nn.ctc_beam_search_decoder(logits, batch_seq_len, FLAGS.beam_width)
wer = tf.edit_distance(tf.cast(decoded[0], tf.int32), batch_y, normalize=True)

and then wrote it to the summary during training. But it is too slow (the beam search decoding is probably the bottleneck). Could you give me some suggestions about this problem?
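One thing that might help (a sketch under my own assumptions, not DeepSpeech’s code): swap the beam search for tf.nn.ctc_greedy_decoder, which does a single best-path pass and is much cheaper, and only run the resulting summary op every N batches. The placeholders below are hypothetical stand-ins for the tensors the training graph already has:

import tensorflow as tf  # TF1-style graph code, like the snippet above

# Hypothetical stand-ins for tensors DeepSpeech already provides:
# time-major logits [max_time, batch_size, num_classes], per-example
# sequence lengths, and the sparse label tensor.
logits = tf.placeholder(tf.float32, [None, None, 29])
batch_seq_len = tf.placeholder(tf.int32, [None])
batch_y = tf.sparse_placeholder(tf.int32)

# Greedy (best-path) decoding avoids the beam search entirely.
decoded, _ = tf.nn.ctc_greedy_decoder(logits, batch_seq_len)
distance = tf.edit_distance(tf.cast(decoded[0], tf.int32), batch_y,
                            normalize=True)
mean_distance = tf.reduce_mean(distance)
distance_summary = tf.summary.scalar('train_mean_edit_distance', mean_distance)

Then only session.run(distance_summary, ...) every 100 batches or so instead of on every step, so the decoding cost is amortized; the curve is noisier per point but the epoch-level trend is the same.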