Fine-tuning improves results on my training data but regresses on the audio shipped with the v0.7.4 release

Lately I’ve been trying to fine-tune the latest release model to consistently recognize certain medical terms such as “eczema.”

Computing Environment:

  • DeepSpeech v0.7.4
  • Ubuntu 18.04 LTS
  • Python 3.6.9
  • TensorFlow 1.15.2

I have not been able to find many resources about this, either on this forum or through Google. I understand that in this release deepspeech-0.7.4-models.scorer replaced the language model and trie, so my first guess is that I need to rebuild the scorer as well to match the newly fine-tuned model. But I’m not sure that’s the root of the problem.

Here’s what I get when I run the original pre-trained model:
Given Audio:

$ deepspeech --model deepspeech-0.7.4-models.pbmm --scorer deepspeech-0.7.4-models.scorer --audio audio/2830-3980-0043.wav
Loading model from file deepspeech-0.7.4-models.pbmm
TensorFlow: v1.15.0-24-gceb46aa
DeepSpeech: v0.7.4-0-gfcd9563
2020-07-29 11:46:06.984198: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.187s.
Loading scorer from files deepspeech-0.7.4-models.scorer
Loaded scorer in 0.136s.
Running inference.
experience proves this
Inference took 12.036s for 1.975s audio file.

One of my test files: (transcription: “tell me about your eczema”)

$ deepspeech --model deepspeech-0.7.4-models.pbmm --scorer deepspeech-0.7.4-models.scorer --audio data/bm_test/Eggs.wav
Loading model from file deepspeech-0.7.4-models.pbmm
TensorFlow: v1.15.0-24-gceb46aa
DeepSpeech: v0.7.4-0-gfcd9563
2020-07-29 12:03:59.036974: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.018s.
Loading scorer from files deepspeech-0.7.4-models.scorer
Loaded scorer in 0.00878s.
Running inference.
tell me about your eggs a
Inference took 10.176s for 2.805s audio file.

Here’s what I get when I run the fine-tuned model:
Given Audio:

$ deepspeech --model exports/output_graph.pb --scorer deepspeech-0.7.4-models.scorer --audio audio/2830-3980-0043.wav
Loading model from file exports/output_graph.pb
TensorFlow: v1.15.0-24-gceb46aa
DeepSpeech: v0.7.4-0-gfcd9563
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2020-07-29 11:46:44.780034: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 1.03s.
Loading scorer from files deepspeech-0.7.4-models.scorer
Loaded scorer in 0.00873s.
Running inference.
iii
Inference took 7.992s for 1.975s audio file.

One of my test files: (transcription: “tell me about your eczema”)

$ deepspeech --model exports/output_graph.pb --scorer deepspeech-0.7.4-models.scorer --audio data/bm_test/Eggs.wav
Loading model from file exports/output_graph.pb
TensorFlow: v1.15.0-24-gceb46aa
DeepSpeech: v0.7.4-0-gfcd9563
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2020-07-29 12:09:32.695840: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.227s.
Loading scorer from files deepspeech-0.7.4-models.scorer
Loaded scorer in 0.00892s.
Running inference.
tell me about your eczema
Inference took 9.169s for 2.805s audio file.
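
To put numbers on the four runs above, here is a minimal sketch that computes word error rate (WER) with a plain word-level edit distance. It is not DeepSpeech’s own evaluation code, and it assumes that “experience proves this” is the correct transcript for the release clip, since that is what the pre-trained model returns.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# (reference, hypothesis) pairs copied from the four runs above.
# Assumption: the release clip's true transcript is "experience proves this".
runs = {
    "pre-trained, release audio": ("experience proves this", "experience proves this"),
    "pre-trained, Eggs.wav": ("tell me about your eczema", "tell me about your eggs a"),
    "fine-tuned, release audio": ("experience proves this", "iii"),
    "fine-tuned, Eggs.wav": ("tell me about your eczema", "tell me about your eczema"),
}

for name, (ref, hyp) in runs.items():
    print(f"{name}: WER = {word_error_rate(ref, hyp):.2f}")
```

Under that assumption, the fine-tuned model goes from 0.40 to 0.00 on Eggs.wav, which is one of the training clips, and from 0.00 to 1.00 on the release clip.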

Here’s my .sh file to run the fine-tuning session:

#!/bin/sh

set -xe

bm_dir="./data/bm_test"
bm_csv="${bm_dir}/bm.csv"

if [ ! -f DeepSpeech.py ]; then
    echo "Please make sure you run this from DeepSpeech's top level directory."
    exit 1
fi;

# Force only one visible device because the dataset is tiny (29 clips)
# and running on multiple devices (like GPUs) may break with so few samples
export CUDA_VISIBLE_DEVICES=0

python -u DeepSpeech.py --noearly_stop \
  --train_files "${bm_csv}" --train_batch_size 1 \
  --dev_files "${bm_csv}" --dev_batch_size 1 \
  --test_files "${bm_csv}" --test_batch_size 1 \
  --n_hidden 2048 --epochs 10 \
  --max_to_keep 1 --save_checkpoint_dir './fine_tuning_checkpoints' \
  --load_checkpoint_dir './load_checkpoints' \
  --load_cudnn \
  --export_dir './exports' \
  --learning_rate 0.0005 --dropout_rate 0.05 \
  --scorer_path 'deepspeech-0.7.4-models.scorer' | tee /tmp/resume.log

if ! grep "Loading best validating checkpoint from" /tmp/resume.log; then
  echo "Did not resume training from checkpoint"
  exit 1
else
  exit 0
fi
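
The script above passes the same bm.csv to --train_files, --dev_files and --test_files, so the validation and test results only show how well the model fits clips it has already trained on. As a minimal sketch, assuming a held-out split is wanted, the CSV could be divided like this. The function name, the split fractions and the bm_train.csv / bm_dev.csv / bm_test.csv file names are all hypothetical, and with only 29 clips the held-out sets are too small to be more than a sanity check.

```python
import csv
import random
from pathlib import Path


def split_csv(source, out_dir, dev_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle the rows of a DeepSpeech CSV and write train/dev/test files."""
    with open(source, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        fieldnames = reader.fieldnames
        rows = list(reader)

    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_dev = max(1, round(len(rows) * dev_fraction))
    n_test = max(1, round(len(rows) * test_fraction))
    parts = {
        "dev": rows[:n_dev],
        "test": rows[n_dev:n_dev + n_test],
        "train": rows[n_dev + n_test:],
    }

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, part in parts.items():
        paths[name] = out_dir / f"bm_{name}.csv"  # hypothetical file names
        with open(paths[name], "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(part)
    return paths


# Example call (paths taken from the script above):
# split_csv("data/bm_test/bm.csv", "data/bm_test/splits")
```

The three flags in the script would then each point at a different file instead of all at ${bm_csv}.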

In the ‘load_checkpoints’ directory I have the checkpoint data from the v0.7.4 release:

$ ls load_checkpoints/
alphabet.txt                         best_dev-732522.index  best_dev_checkpoint
best_dev-732522.data-00000-of-00001  best_dev-732522.meta   checkpoint

This is the data/bm_test/ directory I’m using for the training data:

$ ls data/bm_test/
Common.wav         EczemaFlareUps.wav  IBSEffect.wav        ManageIBS.wav         SetIBS.wav        WhatManage.wav
DietGuide.wav      Eggs.wav            IBSPrescription.wav  Metals.wav            ShelterFood.wav   WorkEffect.wav
Difficult.wav      Episode.wav         IBSSymptoms.wav      ObtainPermission.wav  TalkAboutIBS.wav  bm.csv
DiscussIBS.wav     Foods.wav           Instigate.wav        Prescription.wav      Topical.wav
DiscussMetals.wav  HiAlan.wav          Life.wav             ProbsWithIBS.wav      TreatIBS.wav
Eczema.wav         Hygiene.wav         ManageEczema.wav     Religious.wav         WhatHelps.wav

This is my bm.csv:

wav_filename,wav_filesize,transcript
/home/wes/DeepSpeech/data/bm_test/Eggs.wav,89804,tell me about your eczema
/home/wes/DeepSpeech/data/bm_test/HiAlan.wav,101278,hi alan my name is wesley
/home/wes/DeepSpeech/data/bm_test/ObtainPermission.wav,173930,is it alright if we talk in this non clinical setting
/home/wes/DeepSpeech/data/bm_test/Religious.wav,135198,do you have any religious or spiritual affiliations
/home/wes/DeepSpeech/data/bm_test/TalkAboutIBS.wav,88844,tell me about your ibs
/home/wes/DeepSpeech/data/bm_test/ProbsWithIBS.wav,80664,what problems have you been having with your ibs
/home/wes/DeepSpeech/data/bm_test/EczemaFlareUps.wav,73976,please describe you eczema flare ups
/home/wes/DeepSpeech/data/bm_test/Life.wav,138718,how has eczema affected your life 
/home/wes/DeepSpeech/data/bm_test/Eczema.wav,66764,eczema
/home/wes/DeepSpeech/data/bm_test/Metals.wav,121438,certain metals can cause eczema to develop
/home/wes/DeepSpeech/data/bm_test/DiscussMetals.wav,114718,discuss metals as an eczema trigger
/home/wes/DeepSpeech/data/bm_test/Topical.wav,117324,have you considered using topical creams for eczema
/home/wes/DeepSpeech/data/bm_test/Prescription.wav,130398,lets talk about possibly using prescriptions for your eczema
/home/wes/DeepSpeech/data/bm_test/ManageEczema.wav,86284,how do you manage your eczema
/home/wes/DeepSpeech/data/bm_test/WhatManage.wav,86878,what do you do to manage your eczema
/home/wes/DeepSpeech/data/bm_test/WorkEffect.wav,134558,how has work afftected your eczema and ibs
/home/wes/DeepSpeech/data/bm_test/SetIBS.wav,87884,what sets off your ibs
/home/wes/DeepSpeech/data/bm_test/Instigate.wav,104524,what instigates your ibs 
/home/wes/DeepSpeech/data/bm_test/Foods.wav,109004,do you know of any food that triggers ibs
/home/wes/DeepSpeech/data/bm_test/Common.wav,100684,some common foods can trigger ibs
/home/wes/DeepSpeech/data/bm_test/IBSSymptoms.wav,111518,please describe some of your ibs symptoms
/home/wes/DeepSpeech/data/bm_test/Episode.wav,131690,can you tell me about an ibs episode you have had
/home/wes/DeepSpeech/data/bm_test/IBSEffect.wav,90764,how has ibs affected your life
/home/wes/DeepSpeech/data/bm_test/DiscussIBS.wav,122078,lets talk about how ibs has affected your quality of life
/home/wes/DeepSpeech/data/bm_test/ShelterFood.wav,110878,how does the shelter food affect your ibs
/home/wes/DeepSpeech/data/bm_test/TreatIBS.wav,102238,lets look at some treatment options for ibs
/home/wes/DeepSpeech/data/bm_test/IBSPrescription.wav,89758,have your considered prescriptions for ibs
/home/wes/DeepSpeech/data/bm_test/DietGuide.wav,112158,discuss dietary guidelines for ibs
/home/wes/DeepSpeech/data/bm_test/WhatHelps.wav,83038,what will help your ibs
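
Since training depends on every row of this file, a quick sanity check can rule out data problems before looking at the model. The sketch below is not part of DeepSpeech. It checks that each wav file exists, that wav_filesize matches the size on disk, and that each transcript is free of stray whitespace and uses only characters from the alphabet. The alphabet is an assumption (space, a to z, apostrophe) and should be compared against the alphabet.txt in load_checkpoints/. As pasted here, two transcripts appear to end in a space, which this check would flag.

```python
import csv
import os

# Assumption: alphabet.txt holds a space, a-z and an apostrophe.
ASSUMED_ALPHABET = set(" abcdefghijklmnopqrstuvwxyz'")


def check_csv(csv_path, alphabet=ASSUMED_ALPHABET):
    """Return a list of human-readable problems found in a DeepSpeech CSV."""
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # Data rows start on line 2 because line 1 is the header.
        for line_no, row in enumerate(csv.DictReader(handle), start=2):
            wav = row["wav_filename"]
            transcript = row["transcript"]
            if not os.path.isfile(wav):
                problems.append(f"line {line_no}: missing file {wav}")
            elif os.path.getsize(wav) != int(row["wav_filesize"]):
                problems.append(f"line {line_no}: wav_filesize does not match {wav}")
            if transcript != transcript.strip():
                problems.append(f"line {line_no}: leading or trailing whitespace in transcript")
            unknown = sorted(set(transcript) - alphabet)
            if unknown:
                problems.append(f"line {line_no}: characters outside the alphabet: {unknown}")
    return problems


# Example call (path taken from the listing above):
# for problem in check_csv("data/bm_test/bm.csv"):
#     print(problem)
```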

I understand that this is not a large amount of training data. I’m really just trying to get the basics of fine-tuning down first.

I’m not entirely sure whether the problem is the way I have trained the model or the way I’m running it for inference. When I searched the internet for how to run a newly trained model, the results referred to older releases rather than v0.7.4, where the scorer has replaced the lm and trie. Please point me in the right direction.