Fine-tuning a Chinese model and the loss suddenly increases

Hi, I am very happy to use DeepSpeech, but I ran into a problem recently. When I fine-tune the model, the loss decreases steadily and then suddenly becomes very high, causing the training to fail. The code is based on 0.9.3; the environment and parameters are shown below.

tensorflow-gpu == 1.14.0
CUDA version: 10.0
cuDNN: 7
OS Platform: Linux Ubuntu 7.5.0
Python: 3.6

Epoch 30 | Training | Elapsed Time: 0:16:17 | Steps: 916 | Loss: 17.802369
Epoch 30 | Validation | Elapsed Time: 0:01:41 | Steps: 54 | Loss: 16.134481 | Dataset: …/data/cleaned_dev_without_p.csv
I Saved new best validating model with loss 16.134481 to: …/ini_fin_test/keep_ini_fin_aug_big_cleaned_drop4/best_dev-1494871

Epoch 31 | Training | Elapsed Time: 0:16:18 | Steps: 916 | Loss: 17.626082
Epoch 31 | Validation | Elapsed Time: 0:01:40 | Steps: 54 | Loss: 15.901346 | Dataset: …/data/cleaned_dev_without_p.csv
I Saved new best validating model with loss 15.901346 to: …/ini_fin_test/keep_ini_fin_aug_big_cleaned_drop4/best_dev-1495787

Epoch 32 | Training | Elapsed Time: 0:16:12 | Steps: 916 | Loss: 17.451552
Epoch 32 | Validation | Elapsed Time: 0:01:40 | Steps: 54 | Loss: 15.847291 | Dataset: …/data/cleaned_dev_without_p.csv
I Saved new best validating model with loss 15.847291 to: …/ini_fin_test/keep_ini_fin_aug_big_cleaned_drop4/best_dev-1496703

Epoch 33 | Training | Elapsed Time: 0:16:15 | Steps: 916 | Loss: 17.296975
Epoch 33 | Validation | Elapsed Time: 0:01:41 | Steps: 54 | Loss: 15.649380 | Dataset: …/data/cleaned_dev_without_p.csv
I Saved new best validating model with loss 15.649380 to: …/ini_fin_test/keep_ini_fin_aug_big_cleaned_drop4/best_dev-1497619

Epoch 34 | Training | Elapsed Time: 0:16:05 | Steps: 916 | Loss: 219.944838
Epoch 34 | Validation | Elapsed Time: 0:01:41 | Steps: 54 | Loss: 638.858899 | Dataset: …/data/cleaned_dev_without_p.csv

Epoch 35 | Training | Elapsed Time: 0:15:33 | Steps: 916 | Loss: 700.915154
Epoch 35 | Validation | Elapsed Time: 0:01:40 | Steps: 54 | Loss: 637.616827 | Dataset: …/data/cleaned_dev_without_p.csv

Epoch 36 | Training | Elapsed Time: 0:15:29 | Steps: 916 | Loss: 699.750439
Epoch 36 | Validation | Elapsed Time: 0:01:41 | Steps: 54 | Loss: 636.648227 | Dataset: …/data/cleaned_dev_without_p.csv

Epoch 37 | Training | Elapsed Time: 0:15:33 | Steps: 916 | Loss: 698.731593
Epoch 37 | Validation | Elapsed Time: 0:01:41 | Steps: 54 | Loss: 635.740994 | Dataset: …/data/cleaned_dev_without_p.csv

Epoch 38 | Training | Elapsed Time: 0:15:33 | Steps: 916 | Loss: 697.751794
Epoch 38 | Validation | Elapsed Time: 0:01:40 | Steps: 54 | Loss: 634.852043 | Dataset: …/data/cleaned_dev_without_p.csv

Epoch 39 | Training | Elapsed Time: 0:15:35 | Steps: 916 | Loss: 696.783801
Epoch 39 | Validation | Elapsed Time: 0:01:40 | Steps: 54 | Loss: 633.970629 | Dataset: …/data/cleaned_dev_without_p.csv

python -u DeepSpeech.py \
  --train_files …/data/cleaned_train_without_p.csv \
  --dev_files …/data/cleaned_dev_without_p.csv \
  --test_files …/data/cleaned_test_without_p.csv \
  --train_batch_size 128 \
  --dev_batch_size 128 \
  --test_batch_size 128 \
  --n_hidden 2048 \
  --learning_rate 0.0001 \
  --dropout_rate 0.20 \
  --epochs 75 \
  --train_cudnn \
  --drop_source_layers 4 \
  --reduce_lr_on_plateau \
  --early_stop \
  --feature_cache "…/tmp/feature_ini_fin_aug_big_cleaned_drop4.cache" \
  --load_checkpoint_dir "…/deepspeech-0.9.1-checkpoint" \
  --save_checkpoint_dir "…/ini_fin_test/keep_ini_fin_aug_big_cleaned_drop4" \
  --summary_dir "…/ini_fin_test/keep_ini_fin_aug_big_cleaned_drop4/summaries" \
  --scorer_path ./data/zh_ini_fin_big/kenlm_ini_fin.scorer \
  --report_count 20 \
  --alphabet_config_path ./data/zh_ini_fin_big/alphabet.txt \
  --test_output_file "…/test_output/output_ini_fin_aug_big_cleaned.json" \
  --augment pitch[pitch=1~0.1] \
  --augment tempo[factor=1~0.1] \
  --augment resample[p=0.2,rate=12000~4000] \
  --augment codec[p=0.2,bitrate=32000~16000] \
  --augment reverb[p=0.2,decay=0.7~0.15,delay=10~8] \
  --augment volume[p=0.2,dbfs=-10~10]
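
In case it is useful, here is a rough sketch of how the training and dev CSVs could be checked for clips whose transcripts look too long for their audio, since such samples can produce extreme CTC losses. It only assumes the standard DeepSpeech CSV columns (wav_filename, wav_filesize, transcript); the 16 kHz 16-bit mono WAV estimate, the 44-byte header, the 20 ms feature step, and the way labels are counted (non-space characters) are my own assumptions, and the script itself is just an illustration, not part of my actual pipeline.

```python
# check_csv.py -- hypothetical sanity check, not part of DeepSpeech itself.
# Flags rows whose transcript is long relative to the estimated number of
# feature frames; such rows are only candidates to inspect by hand.
import csv
import sys

FEATURE_STEP_MS = 20               # default --feature_win_step in DeepSpeech
BYTES_PER_MS = 16000 * 2 // 1000   # assumes 16 kHz, 16-bit mono WAV audio

def check_csv(path):
    """Print rows whose transcript looks too long for the audio."""
    with open(path, encoding="utf-8") as f:
        for row in csv.DictReader(f):
            audio_bytes = int(row["wav_filesize"]) - 44        # minus WAV header
            frames = max(audio_bytes // BYTES_PER_MS // FEATURE_STEP_MS, 1)
            labels = len(row["transcript"].replace(" ", ""))   # rough label count
            if labels >= frames:
                print(f"suspicious: {row['wav_filename']} "
                      f"({labels} labels vs ~{frames} frames)")

if __name__ == "__main__":
    for csv_path in sys.argv[1:]:
        check_csv(csv_path)
```

It would be run as `python check_csv.py <train.csv> <dev.csv>`; anything it prints is only a hint, not proof that the sample is broken.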

Please change your images to text.

In addition to the general information requested in the link above, please give some more information on the data you are using for fine-tuning.