Fine-tuned model not giving the expected output

Hi, I am making a model to recognize the Indian English accent. I am using the pretrained v0.7.0 model and fine-tuning it.
I have a limitation on storage space, so I am training in parts, i.e. fine-tuning in stages: fine-tune the model on x amount of data, checkpoint after 3 epochs, then resume fine-tuning from that checkpoint with new data y, and repeat the same process.

Here is the command line I am using for it:

!python3 '/content/DeepSpeech/DeepSpeech.py' \
  --n_hidden 2048 \
  --checkpoint_dir '/content/drive/My Drive/fine_tuning/deepspeech-0.7.0-checkpoint/' \
  --epochs 3 \
  --alphabet_config_path '/content/drive/My Drive/fine_tuning/deepspeech-0.7.0-checkpoint/alphabet.txt' \
  --train_batch_size 24 \
  --dev_batch_size 24 \
  --test_batch_size 48 \
  --train_files '/content/drive/My Drive/ytd_project/audio_files/csv_files/train.csv' \
  --dev_files '/content/drive/My Drive/ytd_project/audio_files/csv_files/dev.csv' \
  --test_files '/content/drive/My Drive/ytd_project/audio_files/csv_files/test.csv' \
  --export_dir '/content/drive/My Drive/fine_tuning/exported_model' \
  --scorer '/content/drive/My Drive/transfer_learning/testing/deepspeech-0.7.0-models.scorer' \
  --load_cudnn

After each stage of training completes, I update the train, dev and test CSV files with my new data.

Now the problem is that the WER gets worse after every stage I train.
Is something fundamentally wrong with this approach?

Also, as per my understanding, I can reuse the scorer file released with the pretrained v0.7.0 model.

I exported the model after training completed, converted it to a .pbmm file, and used the same v0.7.0 scorer to test some files myself.
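Roughly, the conversion and inference steps look like this (file names are the defaults from the export; actual paths omitted):

convert_graphdef_memmapped_format --in_graph=output_graph.pb --out_graph=output_graph.pbmm
deepspeech --model output_graph.pbmm --scorer deepspeech-0.7.0-models.scorer --audio test_file.wav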
Here is the output for one such file:

i

Then, to compare, I tried transcribing the same file with the pretrained model. Here is its output:

our nearest martingale and superman devil ah then you have

Please help me understand the issue and find a solution.

Looks like you are suffering from catastrophic forgetting …

Training in stages like that sounds strange; train on the whole dataset in one go if you can.

Otherwise, use transfer learning to drop a layer (check the docs), set a lower learning rate (search the forums), and maybe use a higher dropout (0.25-0.4), along the lines of the sketch below.
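Roughly something like this on top of your existing flags (the values and the save directory are only a starting point; check the transfer-learning section of the docs for the exact flag names in your version):

  --drop_source_layers 1 \
  --learning_rate 0.00001 \
  --dropout_rate 0.3 \
  --load_checkpoint_dir '/content/drive/My Drive/fine_tuning/deepspeech-0.7.0-checkpoint/' \
  --save_checkpoint_dir '/content/drive/My Drive/fine_tuning/indian_english_checkpoint/'

Using separate load and save checkpoint directories also keeps the original 0.7.0 checkpoint intact, so you can always restart from it.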

Ok I will try training it using transfer learning.

Hi, could the catastrophic forgetting be due to the way I am preparing my dataset?
I am making multiple copies of a single file and superimposing each copy with the same noise (a traffic-noise recording I downloaded from YouTube) at different loudness levels.
I am doing this because I want the model to be robust to background noise, since after training I will be using it to transcribe audio that has noise in the background.
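A rough sketch of what I mean, using sox (file names and volume factors are just examples; the noise file is first resampled to mono 16 kHz to match the speech clips, and should be about the same length as the clip):

sox traffic_noise.wav -r 16000 -c 1 traffic_16k.wav
sox -m -v 1.0 clip.wav -v 0.1 traffic_16k.wav clip_noise_low.wav
sox -m -v 1.0 clip.wav -v 0.3 traffic_16k.wav clip_noise_med.wav
sox -m -v 1.0 clip.wav -v 0.6 traffic_16k.wav clip_noise_high.wav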