Fine-tuning with custom dataset doubles WER

I am using DeepSpeech 0.7.0. With the default released checkpoint and language model I had a ~41% WER on my custom data set. With a customized language model the WER on my custom data set only went down to ~40%.

So I attempted fine-tuning the released 0.7.0 model for 100 epochs (with early stopping) on about 100 hours of clear audio with clean transcripts. This training data is closer to my real-world data:

    python DeepSpeech.py \
        --n_hidden 2048 \
        --checkpoint_dir deepspeech-0.7.0-checkpoint \
        --epochs 100 \
        --train_files data/train100.csv \
        --dev_files data/dev100.csv \
        --learning_rate 0.000001 \
        --scorer_path models/deepspeech-0.7.0-models.scorer \
        --train_cudnn \
        --use_allow_growth \
        --train_batch_size 32 \
        --dev_batch_size 32 \
        --es_epochs 20 \
        --early_stop True \
        --export_dir custommodel \
        --save_checkpoint_dir custommodel

Early stopping was triggered after epoch 60, and the best validating model was the one saved at epoch 40.
With this model the WER doubled to ~81% and ~80% on the same custom dataset (with the LibriSpeech LM and my customized LM, respectively).

Can you please provide some pointers on why the WER is almost doubling? Is it because:

  1. there is some error in the training process,
  2. 100 hours is too little to make any difference,
  3. I should let the model overfit and not run with early stopping, or
  4. the dataset is not helping at all?

My real-world data, which I am using for testing (7 hours), is conversational American English. My training data (100 hours) is along similar lines. In all cases I used evaluate.py to compute the WER.
Thanks, and any input is appreciated.
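
To be explicit about what I mean by WER: the word-level edit distance between reference and hypothesis, divided by the number of reference words. A minimal sketch of that computation (my own re-implementation for illustration, not DeepSpeech's code):

    # Word error rate: word-level Levenshtein distance divided by the
    # number of words in the reference transcript.
    def wer(ref: str, hyp: str) -> float:
        r, h = ref.split(), hyp.split()
        # dp[i][j] = edit distance between r[:i] and h[:j]
        dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
        for i in range(len(r) + 1):
            dp[i][0] = i
        for j in range(len(h) + 1):
            dp[0][j] = j
        for i in range(1, len(r) + 1):
            for j in range(1, len(h) + 1):
                dp[i][j] = min(
                    dp[i - 1][j - 1] + (r[i - 1] != h[j - 1]),  # substitution or match
                    dp[i - 1][j] + 1,                           # deletion
                    dp[i][j - 1] + 1,                           # insertion
                )
        return dp[len(r)][len(h)] / max(len(r), 1)

    print(wer("oh good", "algoma"))  # 1.0, matching the worst-WER samples below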

Data preparation:

  1. Converted the .webm files to .wav with ffmpeg (see sketch 1 below the list):

     ffmpeg -i <input.webm> -ab 160k -ac 1 -ar 16000 -vn <output.wav>

  2. Split the transcripts and wav files into <15 sec chunks using code similar to import_fisher.py (see sketch 2 below)

  3. Created the test/dev/train split after shuffling the whole set, picking the last 1K utterances for dev, the next 1K for test, and the remainder for train (see sketch 3 below)

  4. Ended up with 97:14 (hh:mm) of train, 1:26 of dev, and 1:29 of test data
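
Sketch 1, for step 1: batch-converting the .webm recordings with the same ffmpeg flags, wrapped in Python. The webm/ and wav/ directory names are hypothetical placeholders:

    # Batch-convert .webm files to 16 kHz mono .wav via ffmpeg.
    # "webm" and "wav" are hypothetical directory names.
    import pathlib
    import subprocess

    out_dir = pathlib.Path("wav")
    out_dir.mkdir(exist_ok=True)
    for src in sorted(pathlib.Path("webm").glob("*.webm")):
        dst = out_dir / (src.stem + ".wav")
        subprocess.run(
            ["ffmpeg", "-i", str(src), "-ab", "160k", "-ac", "1",
             "-ar", "16000", "-vn", str(dst)],
            check=True,
        )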
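
Sketch 2, for step 2: cutting a wav file into <15 sec chunks at known (start, end) second boundaries taken from the transcript timestamps. This mirrors the idea of import_fisher.py, not its actual code:

    # Cut a 16 kHz mono .wav into chunks at (start, end) second boundaries.
    import wave

    def cut_segments(wav_path, segments, out_prefix):
        with wave.open(wav_path, "rb") as w:
            params = w.getparams()
            rate = w.getframerate()
            for i, (start, end) in enumerate(segments):
                assert end - start < 15.0, "keep chunks under 15 seconds"
                w.setpos(int(start * rate))  # seek to segment start
                frames = w.readframes(int((end - start) * rate))
                with wave.open(f"{out_prefix}_{i:04d}.wav", "wb") as out:
                    out.setparams(params)  # same rate/channels/sample width
                    out.writeframes(frames)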
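
Sketch 3, for step 3: the shuffle-and-split itself. all.csv is a hypothetical manifest of all chunks, and the test CSV name is also an assumption; the train/dev names match the training command:

    # Shuffle the full manifest, peel off the last 1K rows for dev and the
    # 1K before that for test; everything else is train.
    import csv
    import random

    with open("all.csv", newline="") as f:  # hypothetical full manifest
        reader = csv.reader(f)
        header, rows = next(reader), list(reader)

    random.seed(42)  # fixed seed so the split is reproducible
    random.shuffle(rows)
    splits = {
        "data/train100.csv": rows[:-2000],
        "data/test100.csv": rows[-2000:-1000],  # hypothetical filename
        "data/dev100.csv": rows[-1000:],
    }
    for path, subset in splits.items():
        with open(path, "w", newline="") as f:
            w = csv.writer(f)
            w.writerow(header)
            w.writerows(subset)
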
Training:

    I STARTING Optimization
    Epoch 0 | Training | Elapsed Time: 0:18:15 | Steps: 1033 | Loss: 66.129022
    Epoch 0 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 64.608861 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 64.608861 to: youtubemodel/best_dev-733555
    Epoch 1 | Training | Elapsed Time: 0:16:36 | Steps: 1033 | Loss: 48.541638
    Epoch 1 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 60.441715 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 60.441715 to: youtubemodel/best_dev-734588
    Epoch 2 | Training | Elapsed Time: 0:16:37 | Steps: 1033 | Loss: 46.005513
    Epoch 2 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 58.070462 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 58.070462 to: youtubemodel/best_dev-735621
    Epoch 3 | Training | Elapsed Time: 0:16:38 | Steps: 1033 | Loss: 44.431500
    Epoch 3 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 56.244110 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 56.244110 to: youtubemodel/best_dev-736654
    Epoch 4 | Training | Elapsed Time: 0:16:42 | Steps: 1033 | Loss: 43.353938
    Epoch 4 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 54.991581 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 54.991581 to: youtubemodel/best_dev-737687
    Epoch 5 | Training | Elapsed Time: 0:16:47 | Steps: 1033 | Loss: 42.467951
    Epoch 5 | Validation | Elapsed Time: 0:00:06 | Steps: 15 | Loss: 54.254252 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 54.254252 to: youtubemodel/best_dev-738720
    Epoch 6 | Training | Elapsed Time: 0:16:49 | Steps: 1033 | Loss: 41.723224
    Epoch 6 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 53.478734 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 53.478734 to: youtubemodel/best_dev-739753
    Epoch 7 | Training | Elapsed Time: 0:16:49 | Steps: 1033 | Loss: 41.048253
    Epoch 7 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 53.091078 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 53.091078 to: youtubemodel/best_dev-740786
    Epoch 8 | Training | Elapsed Time: 0:16:51 | Steps: 1033 | Loss: 40.471919
    Epoch 8 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 52.443399 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 52.443399 to: youtubemodel/best_dev-741819
    Epoch 9 | Training | Elapsed Time: 0:16:58 | Steps: 1033 | Loss: 39.939238
    Epoch 9 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 52.124259 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 52.124259 to: youtubemodel/best_dev-742852
    Epoch 10 | Training | Elapsed Time: 0:16:57 | Steps: 1033 | Loss: 39.448903
    Epoch 10 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.744253 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 51.744253 to: youtubemodel/best_dev-743885
    Epoch 11 | Training | Elapsed Time: 0:16:59 | Steps: 1033 | Loss: 39.037509
    Epoch 11 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.529975 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 51.529975 to: youtubemodel/best_dev-744918
    Epoch 12 | Training | Elapsed Time: 0:17:01 | Steps: 1033 | Loss: 38.621032
    Epoch 12 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.258250 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 51.258250 to: youtubemodel/best_dev-745951
    Epoch 13 | Training | Elapsed Time: 0:17:08 | Steps: 1033 | Loss: 38.235733
    Epoch 13 | Validation | Elapsed Time: 0:00:06 | Steps: 15 | Loss: 51.048094 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 51.048094 to: youtubemodel/best_dev-746984
    Epoch 14 | Training | Elapsed Time: 0:17:06 | Steps: 1033 | Loss: 37.901955
    Epoch 14 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.851648 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 50.851648 to: youtubemodel/best_dev-748017
    Epoch 15 | Training | Elapsed Time: 0:17:08 | Steps: 1033 | Loss: 37.595092
    Epoch 15 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.733523 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 50.733523 to: youtubemodel/best_dev-749050
    Epoch 16 | Training | Elapsed Time: 0:17:20 | Steps: 1033 | Loss: 37.285941
    Epoch 16 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.541233 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 50.541233 to: youtubemodel/best_dev-750083
    Epoch 17 | Training | Elapsed Time: 0:17:24 | Steps: 1033 | Loss: 36.988033
    Epoch 17 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.353015 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 50.353015 to: youtubemodel/best_dev-751116
    Epoch 18 | Training | Elapsed Time: 0:17:28 | Steps: 1033 | Loss: 36.733689
    Epoch 18 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.312286 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 50.312286 to: youtubemodel/best_dev-752149
    Epoch 19 | Training | Elapsed Time: 0:17:36 | Steps: 1033 | Loss: 36.463198
    Epoch 19 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.179744 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 50.179744 to: youtubemodel/best_dev-753182
    Epoch 20 | Training | Elapsed Time: 0:17:35 | Steps: 1033 | Loss: 36.180759
    Epoch 20 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.011344 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 50.011344 to: youtubemodel/best_dev-754215
    Epoch 21 | Training | Elapsed Time: 0:17:40 | Steps: 1033 | Loss: 35.951546
    Epoch 21 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.127234 | Dataset: data/dev100.csv
    Epoch 22 | Training | Elapsed Time: 0:17:39 | Steps: 1033 | Loss: 35.706959
    Epoch 22 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.927496 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 49.927496 to: youtubemodel/best_dev-756281
    Epoch 23 | Training | Elapsed Time: 0:17:43 | Steps: 1033 | Loss: 35.490063
    Epoch 23 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.996520 | Dataset: data/dev100.csv
    Epoch 24 | Training | Elapsed Time: 0:17:50 | Steps: 1033 | Loss: 35.242190
    Epoch 24 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.989241 | Dataset: data/dev100.csv
    Epoch 25 | Training | Elapsed Time: 0:17:49 | Steps: 1033 | Loss: 35.000803
    Epoch 25 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.916563 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 49.916563 to: youtubemodel/best_dev-759380
    Epoch 26 | Training | Elapsed Time: 0:17:50 | Steps: 1033 | Loss: 34.793564
    Epoch 26 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.900704 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 49.900704 to: youtubemodel/best_dev-760413
    Epoch 27 | Training | Elapsed Time: 0:17:53 | Steps: 1033 | Loss: 34.584330
    Epoch 27 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.089868 | Dataset: data/dev100.csv
    Epoch 28 | Training | Elapsed Time: 0:17:57 | Steps: 1033 | Loss: 34.364640
    Epoch 28 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.003384 | Dataset: data/dev100.csv
    Epoch 29 | Training | Elapsed Time: 0:18:00 | Steps: 1033 | Loss: 34.164118
    Epoch 29 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.005821 | Dataset: data/dev100.csv
    Epoch 30 | Training | Elapsed Time: 0:18:14 | Steps: 1033 | Loss: 33.933112
    Epoch 30 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.115147 | Dataset: data/dev100.csv
    Epoch 31 | Training | Elapsed Time: 0:18:09 | Steps: 1033 | Loss: 33.725534
    Epoch 31 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.020507 | Dataset: data/dev100.csv
    Epoch 32 | Training | Elapsed Time: 0:18:06 | Steps: 1033 | Loss: 33.535337
    Epoch 32 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.869817 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 49.869817 to: youtubemodel/best_dev-766611
    Epoch 33 | Training | Elapsed Time: 0:18:07 | Steps: 1033 | Loss: 33.325055
    Epoch 33 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.940694 | Dataset: data/dev100.csv
    Epoch 34 | Training | Elapsed Time: 0:18:03 | Steps: 1033 | Loss: 33.138998
    Epoch 34 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.078021 | Dataset: data/dev100.csv
    Epoch 35 | Training | Elapsed Time: 0:18:01 | Steps: 1033 | Loss: 32.956792
    Epoch 35 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.143071 | Dataset: data/dev100.csv
    Epoch 36 | Training | Elapsed Time: 0:18:00 | Steps: 1033 | Loss: 32.749378
    Epoch 36 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.153516 | Dataset: data/dev100.csv
    Epoch 37 | Training | Elapsed Time: 0:17:54 | Steps: 1033 | Loss: 32.551314
    Epoch 37 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.338773 | Dataset: data/dev100.csv
    Epoch 38 | Training | Elapsed Time: 0:17:49 | Steps: 1033 | Loss: 32.376073
    Epoch 38 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.302928 | Dataset: data/dev100.csv
    Epoch 39 | Training | Elapsed Time: 0:17:50 | Steps: 1033 | Loss: 32.182514
    Epoch 39 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.237611 | Dataset: data/dev100.csv
    Epoch 40 | Training | Elapsed Time: 0:17:52 | Steps: 1033 | Loss: 32.175969
    Epoch 40 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.486913 | Dataset: data/dev100.csv
    I Saved new best validating model with loss 49.486913 to: youtubemodel/best_dev-774875
    Epoch 41 | Training | Elapsed Time: 0:18:09 | Steps: 1033 | Loss: 32.093791
    Epoch 41 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 49.655365 | Dataset: data/dev100.csv
    Epoch 42 | Training | Elapsed Time: 0:17:51 | Steps: 1033 | Loss: 31.873352
    Epoch 42 | Validation | Elapsed Time: 0:00:06 | Steps: 15 | Loss: 49.913037 | Dataset: data/dev100.csv
    Epoch 43 | Training | Elapsed Time: 0:16:53 | Steps: 1033 | Loss: 31.663920
    Epoch 43 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.145547 | Dataset: data/dev100.csv
    Epoch 44 | Training | Elapsed Time: 0:16:26 | Steps: 1033 | Loss: 31.462906
    Epoch 44 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.364806 | Dataset: data/dev100.csv
    Epoch 45 | Training | Elapsed Time: 0:16:10 | Steps: 1033 | Loss: 31.261476
    Epoch 45 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.741706 | Dataset: data/dev100.csv
    Epoch 46 | Training | Elapsed Time: 0:15:54 | Steps: 1033 | Loss: 31.076214
    Epoch 46 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.820950 | Dataset: data/dev100.csv
    Epoch 47 | Training | Elapsed Time: 0:15:49 | Steps: 1033 | Loss: 30.889643
    Epoch 47 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 50.925875 | Dataset: data/dev100.csv
    Epoch 48 | Training | Elapsed Time: 0:15:46 | Steps: 1033 | Loss: 30.677845
    Epoch 48 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.038690 | Dataset: data/dev100.csv
    Epoch 49 | Training | Elapsed Time: 0:15:53 | Steps: 1033 | Loss: 30.504118
    Epoch 49 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.121799 | Dataset: data/dev100.csv
    Epoch 50 | Training | Elapsed Time: 0:15:54 | Steps: 1033 | Loss: 30.367995
    Epoch 50 | Validation | Elapsed Time: 0:00:06 | Steps: 15 | Loss: 51.354299 | Dataset: data/dev100.csv
    Epoch 51 | Training | Elapsed Time: 0:15:56 | Steps: 1033 | Loss: 30.178685
    Epoch 51 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.579119 | Dataset: data/dev100.csv
    Epoch 52 | Training | Elapsed Time: 0:15:52 | Steps: 1033 | Loss: 29.992545
    Epoch 52 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.444832 | Dataset: data/dev100.csv
    Epoch 53 | Training | Elapsed Time: 0:15:52 | Steps: 1033 | Loss: 29.805326
    Epoch 53 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.793347 | Dataset: data/dev100.csv
    Epoch 54 | Training | Elapsed Time: 0:15:49 | Steps: 1033 | Loss: 29.610107
    Epoch 54 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 51.844049 | Dataset: data/dev100.csv
    Epoch 55 | Training | Elapsed Time: 0:15:53 | Steps: 1033 | Loss: 29.443057
    Epoch 55 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 52.088029 | Dataset: data/dev100.csv
    Epoch 56 | Training | Elapsed Time: 0:15:51 | Steps: 1033 | Loss: 29.275235
    Epoch 56 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 52.262704 | Dataset: data/dev100.csv
    Epoch 57 | Training | Elapsed Time: 0:15:48 | Steps: 1033 | Loss: 29.093619
    Epoch 57 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 52.075927 | Dataset: data/dev100.csv
    Epoch 58 | Training | Elapsed Time: 0:15:50 | Steps: 1033 | Loss: 28.943511
    Epoch 58 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 52.302304 | Dataset: data/dev100.csv
    Epoch 59 | Training | Elapsed Time: 0:15:45 | Steps: 1033 | Loss: 28.771473
    Epoch 59 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 52.588906 | Dataset: data/dev100.csv
    Epoch 60 | Training | Elapsed Time: 0:15:43 | Steps: 1033 | Loss: 28.596827
    Epoch 60 | Validation | Elapsed Time: 0:00:05 | Steps: 15 | Loss: 52.745685 | Dataset: data/dev100.csv
    I Early stop triggered as the loss did not improve the last 20 epochs
    I FINISHED optimization in 17:27:37.095537
    I Exporting the model…
    I Loading best validating checkpoint from deepspeech-0.7.0-checkpoint/best_dev-732522
    I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
    I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
    I Loading variable from checkpoint: layer_1/bias
    I Loading variable from checkpoint: layer_1/weights
    I Loading variable from checkpoint: layer_2/bias
    I Loading variable from checkpoint: layer_2/weights
    I Loading variable from checkpoint: layer_3/bias
    I Loading variable from checkpoint: layer_3/weights
    I Loading variable from checkpoint: layer_5/bias
    I Loading variable from checkpoint: layer_5/weights
    I Loading variable from checkpoint: layer_6/bias
    I Loading variable from checkpoint: layer_6/weights

Test set:

Test epoch | Steps: 1980 | Elapsed Time: 0:22:09
Test on audio/test/test.csv - WER: 0.879427, CER: 0.841129, loss: 789.895447

Best WER:

WER: 0.000000, CER: 0.000000, loss: 5.918374

  • wav: file:audio/test/28.wav
  • src: “the term new normal”
  • res: “the term new normal”

WER: 0.000000, CER: 0.000000, loss: 2.896821

  • wav: file:audio/test/48.wav
  • src: “really instantly take away”
  • res: “really instantly take away”

WER: 0.000000, CER: 0.000000, loss: 1.120514

  • wav: file:audio/test/80.wav
  • src: “and this is actually a great idea for”
  • res: “and this is actually a great idea for”

WER: 0.111111, CER: 0.087719, loss: 20.143023

  • wav: file:audio/test/file_733.wav
  • src: " digging into things and using the search technology to "
  • res: "digging into things and using the search technology "

WER: 0.142857, CER: 0.192771, loss: 34.474407

  • wav: file:audio/test/file_1567.wav
  • src: "and see where they talk positively or maybe not so positively about certain topics "
  • res: “and see where they talk positively or maybe not so positively about”

Median WER:

WER: 0.883721, CER: 0.867299, loss: 916.896118

  • wav: file:audio/test/file_1008.wav
  • src: “yeah you see you’ve hit it right on the nail i mean we have been aggressively looking at home and you know it’s funny we started probably a month or so ago as we thought about look if we cancel this wedding like”
  • res: "yeah you had read on the nail "

WER: 0.883721, CER: 0.831050, loss: 915.033203

  • wav: file:audio/test/file_844.wav
  • src: "where you’re spending time so i’m only getting like the cliff notes version of this and if you find it important enough right then that’s where you can kind of dig it so i’m not looking at the entire page or the report "
  • res: “where you’re spending time in getting”

WER: 0.883721, CER: 0.848341, loss: 912.311768

  • wav: file:audio/test/file_1147.wav
  • src: " is there a way to set up a notification when companies report or something or new companies i guess oh yeah so i guess a new transcript has been added right if you add a watch list to an industry right its just"
  • res: “he was set up a notification when”

WER: 0.884615, CER: 0.893238, loss: 1369.062134

  • wav: file:audio/test/2042.wav
  • src: “at the bottom of the language it said it was now in the digital media market which was the new market they bought their way into and it was catching the word market and it was catching the word expanded or improved or increased or whatever it might be completely different sentence”
  • res: "at the bottom of the language "

WER: 0.884615, CER: 0.843750, loss: 1217.601196

  • wav: file:audio/test/file_42.wav
  • src: " available if you think about yourself as the ai you have these literally at your disposal for hey i want to search across this i don’t what are your kind of thoughts you think about the way that’s organized well we can get as you said a can’t get a lot of"
  • res: "are available if you think about yourself the "

Worst WER:

WER: 1.000000, CER: 0.866667, loss: 39.513969

  • wav: file:audio/test/file_718.wav
  • src: "awesome awesome so i think we "
  • res: “some”

WER: 1.000000, CER: 0.714286, loss: 29.982481

  • wav: file:audio/test/14.wav
  • src: “oh good”
  • res: “algoma”

WER: 1.000000, CER: 0.880952, loss: 24.731026

  • wav: file:audio/test/15.wav
  • src: “yeah but so as a kind of noted in my email”
  • res: “yahya”

WER: 1.000000, CER: 1.000000, loss: 18.603931

  • wav: file:audio/test/11.wav
  • src: “oh good”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 1.595282

  • wav: file:audio/test/57.wav
  • src: “yeah”
  • res: “”

How did this go? I’m actually very curious about this case 🙂

Have you manually listened to a sample of the audio files with the worst WER, to verify that those samples are actually correct?