I am currently trying to use DeepSpeech to train an Indonesian ASR model. The test results are below:
Worst WER:
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.142857, loss: 4.199829
- wav: file:///home/jovyan/data2/data/peter/asr_data/cv-corpus-6.1-2020-12-11/id/clips/common_voice_id_19647960.wav
- src: "jadilah tamuku"
- res: "jadi lah hamuku"
--------------------------------------------------------------------------------
WER: 1.666667, CER: 0.250000, loss: 24.360512
- wav: file:///home/jovyan/data2/data/peter/asr_data/cv-corpus-6.1-2020-12-11/id/clips/common_voice_id_20954939.wav
- src: "zamenhof penggagas esperanto"
- res: "zaben hof pengaga a keranto"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 1.100000, loss: 26.081905
- wav: file:///home/jovyan/data2/data/peter/asr_data/cv-corpus-6.1-2020-12-11/id/clips/common_voice_id_20260506.wav
- src: "kamu bodoh"
- res: "kamu modoh un dina a"
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.375000, loss: 4.821108
- wav: file:///home/jovyan/data2/data/peter/asr_data/cv-corpus-6.1-2020-12-11/id/clips/common_voice_id_19315802.wav
- src: "makanlah"
- res: "akan lah "
--------------------------------------------------------------------------------
WER: 2.000000, CER: 0.333333, loss: 4.393207
- wav: file:///home/jovyan/data2/data/peter/asr_data/cv-corpus-6.1-2020-12-11/id/clips/common_voice_id_20953138.wav
- src: "perhatian"
- res: "pehati aan"
--------------------------------------------------------------------------------
I guess the poor performance could be improved by adding a language model. I have searched the docs but found no hint on how to do that.
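
For context on the numbers above: a WER greater than 1 is possible because WER is normally the word-level edit distance divided by the number of words in the reference, so a single reference word that gets split into two wrong words already gives 2.0. Below is a minimal sketch of that calculation, assuming the standard Levenshtein-based definitions of WER and CER; the function names are my own and this is not DeepSpeech's evaluation code, which may differ in details.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (word lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(src, res):
    """Word error rate: word-level edits / number of reference words."""
    ref_words = src.split()
    return edit_distance(ref_words, res.split()) / len(ref_words)


def cer(src, res):
    """Character error rate: character-level edits / reference length."""
    return edit_distance(src, res) / len(src)


# (src, res) pairs copied from the report above
samples = [
    ("jadilah tamuku", "jadi lah hamuku"),
    ("kamu bodoh", "kamu modoh un dina a"),
    ("makanlah", "akan lah "),
]

for src, res in samples:
    print(f"WER: {wer(src, res):.6f}, CER: {cer(src, res):.6f}  src={src!r} res={res!r}")
```

Under these assumptions, several of the worst cases are mostly word-boundary mistakes ("jadilah" vs. "jadi lah", "makanlah" vs. "akan lah"): the CER stays fairly low while each misplaced space counts as multiple word errors.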