OK. So I'm using DeepSpeech v0.6.1 and downloaded the checkpoints for the same version. The Common Voice dataset was downloaded from https://voice.mozilla.org/en/datasets.
I'm getting the result below.
Test on /home/user/en/clips/test.csv - WER: 0.422665, CER: 0.253988, loss: 44.272888
--------------------------------------------------------------------------------
WER: 6.000000, CER: 3.222222, loss: 193.430511
- wav: file:///home/user/en/clips/common_voice_en_54384.wav
- src: "undefined"
- res: "everything on her and he banterer "
--------------------------------------------------------------------------------
WER: 3.750000, CER: 3.882353, loss: 363.831543
- wav: file:///home/user/en/clips/common_voice_en_17645060.wav
- src: "did you know that"
- res: "the two road the du know that did you know that they do know that did you know that"
--------------------------------------------------------------------------------
WER: 2.666667, CER: 0.655172, loss: 120.730492
- wav: file:///home/user/en/clips/common_voice_en_125325.wav
- src: "elizabeth reclined gracefully"
- res: "it is a bet to an integrate full"
--------------------------------------------------------------------------------
WER: 2.285714, CER: 1.928571, loss: 343.015198
- wav: file:///home/user/en/clips/common_voice_en_17832183.wav
- src: "as you sow so shall you reap"
- res: "i just she didn't fall it all over myself i just sit in front at all"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 1.000000, loss: 20.358313
- wav: file:///home/user/en/clips/common_voice_en_191353.wav
- src: "amen"
- res: "the men"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.600000, loss: 21.027395
- wav: file:///home/user/en/clips/common_voice_en_18442278.wav
- src: "behave yourself"
- res: "the head or self"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.785714, loss: 33.058380
- wav: file:///home/user/en/clips/common_voice_en_17267925.wav
- src: "any volunteers"
- res: "in a woman to"
--------------------------------------------------------------------------------
WER: 1.833333, CER: 1.285714, loss: 141.307816
- wav: file:///home/user/en/clips/common_voice_en_680693.wav
- src: "find me the saga air cavalry"
- res: "fin made the aga i will cover for time the saga a cabal"
--------------------------------------------------------------------------------
WER: 1.666667, CER: 0.600000, loss: 42.782768
- wav: file:///home/user/en/clips/common_voice_en_18429519.wav
- src: "ideas are uncopyrightable"
- res: "idea for an operator well"
--------------------------------------------------------------------------------
WER: 1.666667, CER: 0.666667, loss: 45.606365
- wav: file:///home/user/en/clips/common_voice_en_2421.wav
- src: "programming requires brains"
- res: "so came i guess in"
--------------------------------------------------------------------------------
I Exporting the model...
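Several of the per-sample WER values above are greater than 1. As a sanity check on how I am reading them, here is a minimal sketch of a word-level WER: the word edit distance divided by the number of words in the reference (src). This is my own assumption about how the report computes it, not code taken from DeepSpeech, and the function names are hypothetical. It does reproduce the numbers for the samples in the log.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists (insert/delete/substitute each cost 1)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def sample_wer(src, res):
    """Assumed per-sample WER: edit distance over reference length (src must be non-empty)."""
    ref_words = src.split()
    hyp_words = res.split()
    if not ref_words:
        raise ValueError("reference transcript is empty")
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)


print(sample_wer("amen", "the men"))  # 2.0, same as the log entry for this clip
```

Because the denominator is the reference length, a one-word reference with a long hypothesis gives a WER well above 1, which is what happens for the "undefined" and "amen" samples.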
Now, my question is: even though I followed all the steps in the documentation for training a model on the Common Voice dataset, I'm not able to get good accuracy. I just want to know whether there is anything wrong with my approach.
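One thing I noticed is that the worst sample has src "undefined", which looks more like a bad transcript in the CSV than a model error. Below is a rough sketch of how I could scan test.csv for such rows. It assumes the CSV has the `wav_filename` and `transcript` columns; the function name and the list of suspect values are my own choices for illustration.

```python
import csv
import io


def find_suspect_rows(csv_text, suspects=("undefined", "")):
    """Return wav_filename values whose transcript is empty or a known placeholder."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["wav_filename"]
            for row in reader
            if row["transcript"].strip().lower() in suspects]


# Example usage (path taken from the log above):
# with open("/home/user/en/clips/test.csv", newline="", encoding="utf-8") as f:
#     print(find_suspect_rows(f.read()))
```

If this returns any files, those clips would inflate the overall WER regardless of how well the model performs.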
Thanks!