Empty results in inference mode

Hi, I’m training DeepSpeech on the Common Voice zh-TW data.
I installed DeepSpeech version 0.6.1 along with native_client.tar.xz version 0.6.0.

I train the model with the following command line:

python3 DeepSpeech.py \
 --train_files mozilla_common_voice/clips/new_train.csv \
 --dev_files mozilla_common_voice/clips/new_dev.csv \
 --test_files mozilla_common_voice/clips/new_test.csv \
 --checkpoint_dir checkpoints \
 --export_dir checkpoints \
 --alphabet_config_path data/alphabet.txt \
 --lm_binary_path data/lm/lm.binary \
 --lm_trie_path ./trie \
 --train_batch_size 32 \
 --test_batch_size 32 \
 --dev_batch_size 32

About the data:
new_train.csv has 48966 .wavs (I used import_cv2.py to convert the mp3s to wav; a sketch of that step follows the sample below)
new_dev.csv has 5281 .wavs
new_test.csv has 2430 .wavs
A sample of the data looks like this:

wav_filename,wav_filesize,transcript
mozilla_common_voice/clips/common_voice_zh-TW_18500863.wav,107564,在 黑 暗 中 進 行
mozilla_common_voice/clips/common_voice_zh-TW_19673313.wav,148268,地 面 層 平 均 單 價 約 為 每 坪 四 十 萬 元
mozilla_common_voice/clips/common_voice_zh-TW_17850420.wav,157484,報 名 費 太 貴 了
mozilla_common_voice/clips/common_voice_zh-TW_19424053.wav,184364,突 然 想 到 一 個 念 頭
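
The import step was roughly the following (a sketch of what I ran; the directory layout is mine, and the exact flags should be checked against bin/import_cv2.py --help in your checkout):

 # convert the Common Voice zh-TW mp3s to 16 kHz wavs and write the CSVs
 python3 bin/import_cv2.py \
  --filter_alphabet data/alphabet.txt \
  mozilla_common_voice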

I used the space-separated transcripts from new_train.csv as the corpus to train the KenLM lm.binary.
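
For completeness, the LM build was along these lines (a sketch: the n-gram order is my choice, corpus.txt is just the transcript column, and generate_trie comes from the native_client package; double-check the exact generate_trie arguments against your version):

 # pull the space-separated transcripts out of the training CSV
 tail -n +2 mozilla_common_voice/clips/new_train.csv | cut -d',' -f3 > corpus.txt
 # train the ARPA model and convert it to the binary format with KenLM
 lmplz -o 5 --text corpus.txt --arpa lm.arpa
 build_binary lm.arpa data/lm/lm.binary
 # build the trie from the same alphabet used for training
 generate_trie data/alphabet.txt data/lm/lm.binary ./trie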
Training ran for 4 epochs and then stopped because of early stopping.
The evaluation results are as follows:

--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 15.238984
 - wav: file://mozilla_common_voice/clips/common_voice_zh-TW_19354170.wav
 - src: "於 是"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 16.088865
 - wav: file://mozilla_common_voice/clips/common_voice_zh-TW_19275254.wav
 - src: "沒 有"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 17.799671
 - wav: file://mozilla_common_voice/clips/common_voice_zh-TW_18072642.wav
 - src: "近 年"
 - res: ""
--------------------------------------------------------------------------------

Is my configuration wrong, or is some step of the installation incorrect?
Thank you very much.

You should have the same versions everywhere to avoid problems.

How much data is that in terms of hours of audio? The importer shows it at the end of the process.

Likely the default early-stopping parameters are not suitable in your case and you just have a model that has learnt nothing.

Hi @lissyx,
thanks for your quick reply. I’ve tried to do what you said:
I’ve pinned the version to v0.6.0 for both deepspeech and native_client.
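
Concretely, the pinning looked roughly like this (a sketch; check util/taskcluster.py --help for the exact options in your checkout):

 # training code and Python package on the same tag
 cd DeepSpeech && git checkout v0.6.0
 pip3 install deepspeech==0.6.0
 # matching native client binaries (generate_trie etc.)
 python3 util/taskcluster.py --branch v0.6.0 --target native_client/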

As for the audio length,
the training set is about 43:59:27
the test set is about 2:20:21
the dev set is about 2:13:32
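
(I estimated these from the wav_filesize column, assuming the 16 kHz, 16-bit mono wavs that import_cv2.py writes, i.e. roughly 32000 bytes per second plus a 44-byte header; a sketch:)

 awk -F',' 'NR > 1 { s += ($2 - 44) / 32000 } END { printf "%.2f hours\n", s / 3600 }' \
   mozilla_common_voice/clips/new_train.csv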

I also set es_steps to 15,
so now the results look like this:

Test on /home/nsml/workspace/nlp_data/mozilla_common_voice/clips/new_test.csv - WER: 0.985370, CER: 0.926809, loss: 66.073044
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.666667, loss: 14.619208
 - wav: file:///home/nsml/workspace/nlp_data/mozilla_common_voice/clips/common_voice_zh-TW_18500869.wav
 - src: "你 好"
 - res: "我 "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.666667, loss: 20.093563
 - wav: file:///home/nsml/workspace/nlp_data/mozilla_common_voice/clips/common_voice_zh-TW_17666033.wav
 - src: "站 住"
 - res: "我 "
--------------------------------------------------------------------------------

The overall WER did drop to 0.98x,
would you mind having a look?
In your experience, is this caused by the early-stop steps (I hope so), or by a data issue?
Thank you very much

I told you to use 0.6.1.

That’s a really small amount of data.

Early stopping analyzes the behavior of the loss; you gave me no log over time, so I can’t help you with that. In your case, I think it’s both.

Please take a look at the other early-stop flags, or just disable early stopping.
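
For example, something along these lines (the flag names should match util/flags.py in 0.6, and the boolean uses the absl-style --no prefix; verify against your checkout):

 # give early stopping more validation history before it can trigger
 python3 DeepSpeech.py ... --es_steps 25
 # or switch it off entirely
 python3 DeepSpeech.py ... --noearly_stop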