High WER / CER / loss when training my own model

I am trying to build a DeepSpeech model for the Indonesian language.
For the dataset I used Common Voice, with a total of 13k hours for training and 11k hours for validation.
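
For context, the train/dev/test CSVs used below came out of the standard Common Voice import step; a rough sketch of that invocation (assuming the stock bin/import_cv2.py importer from the DeepSpeech repo, which converts the mp3 clips to wav and writes the CSVs into the clips/ directory):

%cd /content/DeepSpeech/
# Sketch of the import step (assumed, not verbatim): the stock Common Voice
# importer writes train.csv, dev.csv and test.csv into
# /content/id/cv-corpus-7.0-2021-07-21/id/clips/.
! python3 bin/import_cv2.py /content/id/cv-corpus-7.0-2021-07-21/id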

I have tried training my model with these parameters:
%cd /content/DeepSpeech/
! python3 DeepSpeech.py \
  --train_files /content/id/cv-corpus-7.0-2021-07-21/id/clips/train.csv \
  --dev_files /content/id/cv-corpus-7.0-2021-07-21/id/clips/dev.csv \
  --test_files /content/id/cv-corpus-7.0-2021-07-21/id/clips/test.csv \
  --checkpoint_dir /content/drive/MyDrive/DeepSpeech/checkpoint_1 \
  --export_dir /content/drive/MyDrive/DeepSpeech/model \
  --alphabet_config_path /content/id/alphabet.txt \
  --scorer data/lm/kenlm.scorer \
  --train_batch_size 1 \
  --test_batch_size 1 \
  --n_hidden 100 \
  --epochs 10 \
  --utf8

And the results are very bad; I don't know what I am doing wrong.

Test epoch | Steps: 3038 | Elapsed Time: 0:13:44
Test on /content/id/cv-corpus-7.0-2021-07-21/id/clips/test.csv - WER: 1.000000, CER: 1.000000, loss: 76.047173

Best WER:

WER: 1.000000, CER: 1.000000, loss: 771.664917

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_22967183.wav
  • src: “cintai teman kelasmu cintai kedua orang tuamu cintai tanah airmu”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 509.124603

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_19773611.wav
  • src: “dia berkata pada dirinya sendiri aku pasti bisa”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 355.355682

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_20954734.wav
  • src: “terima kasih untuk pertolongan anda”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 354.010773

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_20287820.wav
  • src: “satu hal bagi lelaki yang sudah menikah adalah jangan pernah melupakan hari perayaan pernikahan”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 332.152679

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_20954648.wav
  • src: “apakah saya harus membelikannya barang”
  • res: “”

Median WER:

WER: 1.000000, CER: 1.000000, loss: 68.902573

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_20962340.wav
  • src: “di sini tempat yang sangat terkenal di jepang”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 68.881241

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_19967474.wav
  • src: “malam ini saya tidak ingin pergi ke mana mana”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 68.862602

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_25221497.wav
  • src: “saya yakin saya akan dapat menemukannya”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 68.847015

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_24976979.wav
  • src: “ketika berbelanja saya menggunakan kartu”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 68.812393

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_23967336.wav
  • src: “sesekali ikutlah acara kami”
  • res: “”

Worst WER:

WER: 1.000000, CER: 1.000000, loss: 7.721856

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_22412572.wav
  • src: “keren”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 4.876138

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_25221714.wav
  • src: “perhatian”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 4.278893

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_22366433.wav
  • src: “iya”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 2.609815

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_22528019.wav
  • src: “satu”
  • res: “”

WER: 1.000000, CER: 1.000000, loss: 1.907428

  • wav: file:///content/id/cv-corpus-7.0-2021-07-21/id/clips/common_voice_id_20362675.wav
  • src: “tidak”
  • res: “”

For generating lm.binary I collected some Indonesian sentences, which produced 38,130 vocabulary entries in the vocabulary-500000.txt file.
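
The scorer passed as --scorer above was built along the lines of the standard DeepSpeech external-scorer recipe; roughly like this (a sketch, assuming the 0.9.x data/lm/generate_lm.py script and the generate_scorer_package binary from native_client; the input file name and the alpha/beta values are placeholders, not tuned):

%cd /content/DeepSpeech/
# Sketch of the LM/scorer build; indonesian_sentences.txt is a placeholder name
# for my collected text, and alpha/beta below are untuned placeholder values.
! python3 data/lm/generate_lm.py \
  --input_txt /content/id/indonesian_sentences.txt \
  --output_dir data/lm \
  --top_k 500000 \
  --kenlm_bins /content/kenlm/build/bin \
  --arpa_order 5 \
  --max_arpa_memory "85%" \
  --arpa_prune "0|0|1" \
  --binary_a_bits 255 \
  --binary_q_bits 8 \
  --binary_type trie \
  --discount_fallback
# Package lm.binary plus the vocabulary file into kenlm.scorer.
! ./generate_scorer_package \
  --alphabet /content/id/alphabet.txt \
  --lm data/lm/lm.binary \
  --vocab data/lm/vocabulary-500000.txt \
  --package data/lm/kenlm.scorer \
  --default_alpha 0.93 \
  --default_beta 1.18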

For the alphabet, here is what I got from the train, dev, and test datasets:

### Reading in the following transcript files: ###
### ['/content/id/cv-corpus-7.0-2021-07-21/id/clips/train.csv', '/content/id/cv-corpus-7.0-2021-07-21/id/clips/dev.csv', '/content/id/cv-corpus-7.0-2021-07-21/id/clips/test.csv'] ###
### The following unique characters were found in your transcripts: ###
p
”
b
j
f
t
!
c
,
“
—
o
a
ł
ń
s
é
z
m
á
–
n
q
i
e
w
g
‘
’
 
d
'
l
x
y
v
r
h
k
u
### ^^^ You can copy-paste these into data/alphabet.txt ###

I have a few questions about training my own DeepSpeech model.
Q1 : Are there any steps I have missed?
Q2 : As you can see, the alphabet is messy; there are characters that do not belong to Indonesian. Should I clean the dataset first? (See the sketch after these questions for how I am checking this.)
Q3 : To generate lm.binary with KenLM, how many sentences should I put in?

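For Q2, this is roughly how I am checking which characters actually occur in the transcripts (a sketch; it assumes the transcript is the third CSV column and a UTF-8 locale, and any real normalization of the transcript column would probably be easier to do in Python):

# Sketch: print every character in the train transcripts with its frequency,
# so stray characters (curly quotes, dashes, é, ł, ...) stand out.
! tail -n +2 /content/id/cv-corpus-7.0-2021-07-21/id/clips/train.csv \
  | cut -d',' -f3- \
  | grep -o . | sort | uniq -c | sort -rn
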
Thank you…