Essentially I would like to continue training a pre-trained model. The pre-trained model is the one provided by the latest release of DeepSpeech Italia (from there I took the files alphabet.txt, checkpoint_it, and the scorer); it was built with transfer learning and with a different alphabet. I downloaded the zip of the DeepSpeech v0.9 repository, ran python3 setup.py install, and created the environment like this:
python3 -m pip install --no-cache-dir --upgrade pip==20.0.2 wheel==0.34.2 setuptools==46.1.3
python3 -m pip install numpy==1.18.5
python3 -m pip install pandas==1.1.5
python3 -m pip install scipy==1.5.1
python3 -m pip install tensorflow==1.15.4
python3 -m pip install tensorflow-gpu==1.15.4
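(For reproducibility, the same pins can also be collected in a requirements file and installed in one step; this is just the versions listed above, nothing added:)

```text
pip==20.0.2
wheel==0.34.2
setuptools==46.1.3
numpy==1.18.5
pandas==1.1.5
scipy==1.5.1
tensorflow==1.15.4
tensorflow-gpu==1.15.4
```

installed with python3 -m pip install -r requirements.txt.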
Then I ran the following command to test training on a small dataset called cv-tiny (consisting of only 50 clips):
python3 DeepSpeech.py \
--load_cudnn False \
--alphabet_config_path /alphabet.txt \
--checkpoint_dir /transfer_checkpoint_it \
--train_files cv-tiny/train.csv \
--dev_files cv-tiny/dev.csv \
--test_files cv-tiny/test.csv \
--scorer_path /scorer \
--train_batch_size 64 \
--dev_batch_size 64 \
--test_batch_size 64 \
--n_hidden 2048 \
--epochs 30 \
--learning_rate 0.0001 \
--dropout_rate 0.4 \
--es_epochs 10 \
--early_stop 1 \
--drop_source_layers 1 \
--export_dir /ckpt/ \
--export_file_name 'output_graph'
The problem comes now: after training, while testing the model I get suspiciously bad results across the board, like this:
WER: 1.000000, CER: 2.000000, loss: 462.202454
- wav: file:///home/pablo/deep-speech/cv-tiny/common_voice_it_17544185.wav
- src: "il vuoto assoluto"
- res: "mnmnmnmnmnmnm incensurato finanzierebbe "
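(As a side note on reading these numbers: a CER of 2.0 is possible. Below is a minimal sketch, not DeepSpeech's actual evaluation code, of how WER and CER are conventionally computed: edit distance divided by reference length, so a hypothesis much longer than the reference pushes CER above 1.0.)

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming.
    Works on strings (characters) or lists of words alike."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

ref = "il vuoto assoluto"
hyp = "mnmnmnmnmnmnm incensurato finanzierebbe"

# WER: edit distance over word sequences, divided by reference word count.
wer = edit_distance(ref.split(), hyp.split()) / len(ref.split())
# CER: edit distance over characters, divided by reference character count.
cer = edit_distance(ref, hyp) / len(ref)
print(f"WER: {wer:.2f}, CER: {cer:.2f}")
```

Since all three reference words are wrong, WER is 1.0, and the hypothesis is more than twice as long as the reference, so CER exceeds 1.0.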
If I use the pre-trained model I can correctly transcribe the clip "il vuoto assoluto", but the newly trained model gets it completely wrong. The result shouldn't be like that: in theory I'm adding knowledge to the pre-trained model, which should then transcribe the clip correctly.
Did I forget any steps or flags?
P.S. I am training on the CPU.