Hello,
I fine-tuned the DeepSpeech model using the command below:
python3 DeepSpeech.py --n_hidden 2048 --checkpoint_dir ~/workspace/chib/model/DeepSpeech/checkPointsDir/ --epochs 1 --train_files ~/workspace/chib/data/DeepSpeech/commonVoiceData/data/clips/train.csv --dev_files ~/workspace/chib/data/DeepSpeech/commonVoiceData/data/clips/dev.csv --test_files ~/workspace/chib/data/DeepSpeech/commonVoiceData/data/clips/testNew.csv --learning_rate 0.0001
The test file is the same as the Common Voice test file shared on GitHub, but I am puzzled by the results:
Testing model on /home/aml/workspace/chib/data/DeepSpeech/commonVoiceData/data/clips/testNew.csv
Test epoch | Steps: 43 | Elapsed Time: 0:00:21
Test on /home/aml/workspace/chib/data/DeepSpeech/commonVoiceData/data/clips/testNew.csv - WER: 0.873239, CER: 0.637005, loss: 90.517876
WER: 1.000000, CER: 0.642857, loss: 26.292532
- src: "do you want me"
- res: "no man no"
WER: 1.000000, CER: 0.812500, loss: 27.239510
- src: "what was he like"
- res: "maimie"
WER: 1.000000, CER: 0.642857, loss: 38.179359
- src: "do you mean it"
- res: "more rent"
WER: 1.000000, CER: 0.666667, loss: 40.403831
- src: "you wanna take this outside"
- res: "i want missy"
WER: 1.000000, CER: 0.689655, loss: 46.982464
- src: "that would be funny if he did"
- res: "at obfuscated"
WER: 1.000000, CER: 0.818182, loss: 54.964172
- src: "what do you advise sir"
- res: "alabaster"
WER: 1.000000, CER: 0.863636, loss: 58.110783
- src: "i'm so glad to see you"
- res: "in preparing"
WER: 1.000000, CER: 0.842105, loss: 61.973919
- src: "she'll be all right"
- res: "every"
WER: 1.000000, CER: 0.800000, loss: 72.811066
- src: "you yellow giant thing of the frost"
- res: "a alcantro"
WER: 1.000000, CER: 0.535714, loss: 74.692726
- src: "groves started writing songs when she was four years old"
- res: "go i started it in song washpool"
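For context on the numbers above: WER and CER are Levenshtein edit distances normalized by the reference length, computed over words and characters respectively. This is a minimal sketch of that computation (not DeepSpeech's own code), which reproduces, for example, the WER of 1.0 on the first sample, where no hypothesis word matches any reference word:

```python
def edit_distance(ref, hyp):
    # Classic Levenshtein distance using a single-row dynamic program.
    m, n = len(ref), len(hyp)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,                           # deletion
                dp[j - 1] + 1,                       # insertion
                prev + (ref[i - 1] != hyp[j - 1]),   # substitution (free if equal)
            )
    return dp[n]

def wer(ref, hyp):
    # Word error rate: word-level edit distance / number of reference words.
    r, h = ref.split(), hyp.split()
    return edit_distance(r, h) / len(r)

def cer(ref, hyp):
    # Character error rate: character-level edit distance / reference length.
    return edit_distance(ref, hyp) / len(ref)

print(wer("do you want me", "no man no"))  # 1.0, as in the first result above
```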
Can anyone tell me why the results are so bad?