The training loss is rising

I am trying to do transfer learning for Russian. I downloaded the pretrained model with its scorer, prepared a new Russian alphabet, froze one layer, and began training with the Common Voice (Russian) dataset. With batch size = 1 the training loss dropped from approximately 600 to approximately 72 and then began to rise.
With batch size = 128 the training loss falls to approximately 226 and then rises.
I tried changing the dropout from 0.05 to 0.25. Nothing helps.
Could the problem be that the scorer is for English and not for Russian?
How can I solve this problem?

python3 DeepSpeech.py \
  --drop_source_layers 1 \
  --alphabet_config_path ~/ASR/data-cv/alphabet.ru \
  --save_checkpoint_dir ~/ASR/ru-output-checkpoint \
  --load_checkpoint_dir ~/ASR/ru-release-checkpoint \
  --train_files ~/ASR/data-cv/clips/train.csv \
  --dev_files ~/ASR/data-cv/clips/dev.csv \
  --test_files ~/ASR/data-cv/clips/test.csv \
  --train_batch_size 64 \
  --dropout_rate 0.25

The scorer is only used at test/decoding time, so it cannot be the cause of a rising training loss.

How big is your dataset?

You don’t change the learning rate; you might want to reduce it.
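For example, you could add an explicit learning rate to your command (I am assuming the standard --learning_rate training flag here; check the available training flags to confirm the name and default), e.g.:

python3 DeepSpeech.py ... --train_batch_size 64 --dropout_rate 0.25 --learning_rate 0.0001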

The Russian dataset is approximately 9600 samples. The training loss begins to rise before the first epoch finishes.
Thanks for the recommendation to change the learning rate. I will try changing it now and compare the results.

If I take batch size = 100, should I reduce the learning rate by a factor of 100 to get a similar result?

Please read more about deep learning first; batch size and learning rate are somewhat “correlated”, but not in the way you think. If you can use a batch size of 128 you are fine; then try different learning rates, e.g. 1e-4, 1e-5, …

If you are doing research or something similar, be aware that the --drop_source_layers flag doesn’t freeze layers. It drops a given number of layers from the top, so if you specify --drop_source_layers 1, the output layer is dropped (re-initialized). The remaining layers are fine-tuned by default.
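A minimal sketch of that idea, not the actual DeepSpeech code: the dropped layers are simply skipped when the source checkpoint is restored, so they start from fresh weights, and every layer remains trainable afterwards. The layer-name prefixes below are assumptions and have to be checked against the real graph.

import tensorflow.compat.v1 as tf

drop_source_layers = 1
# Assumed bottom-to-top layer name prefixes of the acoustic model (placeholders).
layer_order = ['layer_1', 'layer_2', 'layer_3', 'lstm', 'layer_5', 'layer_6']
dropped = layer_order[-drop_source_layers:] if drop_source_layers > 0 else []

# Restore only the layers that were not dropped; the dropped ones keep their
# fresh initialization, but nothing is excluded from training.
vars_to_restore = [v for v in tf.global_variables()
                   if not any(v.name.startswith(p) for p in dropped)]
# saver = tf.train.Saver(var_list=vars_to_restore)
# saver.restore(session, source_checkpoint_path)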

How can I freeze layers?

You will have to code that yourself; there are no built-in flags for that.
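If you want to try it anyway, one common approach is to exclude the frozen layers' variables from the optimizer. A minimal sketch, assuming you edit the training code yourself; the 'layer_1'/'layer_2' prefixes are placeholders for whatever the layers you want to freeze are actually called in the graph:

import tensorflow.compat.v1 as tf

FROZEN_PREFIXES = ('layer_1', 'layer_2')  # placeholder names of layers to freeze

def trainable_vars_without(prefixes):
    # Everything that does not match a frozen prefix stays trainable.
    return [v for v in tf.trainable_variables()
            if not v.name.startswith(prefixes)]

# Inside the training code, instead of optimizing over all trainable variables:
# optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
# gradients = optimizer.compute_gradients(loss, var_list=trainable_vars_without(FROZEN_PREFIXES))
# train_op = optimizer.apply_gradients(gradients)

An alternative with the same effect is to wrap the frozen layers' outputs in tf.stop_gradient so their weights never receive gradients.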

That’s what I also thought