Hi, I’m training DeepSpeech on the Common Voice zh-TW data.
I installed DeepSpeech version 0.6.1
along with native_client.tar.xz version 0.6.0.
I train the model with this command:
python3 DeepSpeech.py \
  --train_files mozilla_common_voice/clips/new_train.csv \
  --dev_files mozilla_common_voice/clips/new_dev.csv \
  --test_files mozilla_common_voice/clips/new_test.csv \
  --checkpoint_dir checkpoints \
  --export_dir checkpoints \
  --alphabet_config_path data/alphabet.txt \
  --lm_binary_path data/lm/lm.binary \
  --lm_trie_path ./trie \
  --train_batch_size 32 \
  --test_batch_size 32 \
  --dev_batch_size 32
About the data:
new_train.csv has 48966 .wav files (I used import_cv2.py to convert the mp3s to wav),
new_dev.csv has 5281 .wav files,
new_test.csv has 2430 .wav files.
A sample of the data looks like this:
wav_filename,wav_filesize,transcript
mozilla_common_voice/clips/common_voice_zh-TW_18500863.wav,107564,在 黑 暗 中 進 行
mozilla_common_voice/clips/common_voice_zh-TW_19673313.wav,148268,地 面 層 平 均 單 價 約 為 每 坪 四 十 萬 元
mozilla_common_voice/clips/common_voice_zh-TW_17850420.wav,157484,報 名 費 太 貴 了
mozilla_common_voice/clips/common_voice_zh-TW_19424053.wav,184364,突 然 想 到 一 個 念 頭
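One thing that can be checked on CSVs in this format is whether every character in the transcript column also appears in data/alphabet.txt. The following is only a rough sketch of such a check, not something taken from DeepSpeech itself. It assumes the alphabet file has one label per line and that lines starting with # are comments; the function names load_alphabet and missing_characters are made up for illustration, and the demo uses in-memory data instead of the real files.

```python
import csv
import io


def load_alphabet(lines):
    """Collect labels from alphabet lines (assumed: one label per line, '#' starts a comment)."""
    labels = set()
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("\\#"):
            labels.add("#")          # escaped '#' is assumed to be a real label
        elif line.startswith("#") or line == "":
            continue                 # skip comments and empty lines
        else:
            labels.add(line)         # a line holding a single space is the space label
    return labels


def missing_characters(csv_file, labels):
    """Return the transcript characters that are not in the alphabet."""
    seen = set()
    for row in csv.DictReader(csv_file):
        seen.update(row["transcript"])
    return seen - labels


# In-memory demo; with real files, pass open(path, encoding="utf-8") instead.
demo_alphabet = load_alphabet(["# comment\n", " \n", "在\n", "黑\n"])
demo_csv = io.StringIO("wav_filename,wav_filesize,transcript\na.wav,1,在 黑 暗\n")
print(sorted(missing_characters(demo_csv, demo_alphabet)))
```

An empty result would mean the alphabet covers the transcripts; anything printed is a character that the model has no label for.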
I used the space-separated new_train.csv as the corpus to train the KenLM lm.binary.
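Since new_train.csv also contains the wav_filename and wav_filesize columns, the language-model corpus presumably should hold only the transcript text, one sentence per line. Below is a minimal sketch of that extraction step, under that assumption; write_corpus is a hypothetical helper name, and the demo works on in-memory data rather than the real CSV.

```python
import csv
import io


def write_corpus(csv_file, out_file):
    """Copy only the transcript column to out_file, one transcript per line.

    Returns the number of transcripts written.
    """
    count = 0
    for row in csv.DictReader(csv_file):
        out_file.write(row["transcript"].strip() + "\n")
        count += 1
    return count


# In-memory demo; with real files, open new_train.csv for reading and a
# text file for writing (both with encoding="utf-8") and pass them in.
demo_csv = io.StringIO(
    "wav_filename,wav_filesize,transcript\n"
    "a.wav,107564,在 黑 暗 中 進 行\n"
    "b.wav,157484,報 名 費 太 貴 了\n"
)
demo_out = io.StringIO()
write_corpus(demo_csv, demo_out)
print(demo_out.getvalue(), end="")
```

The resulting text file would then be the input to KenLM's lmplz, and the ARPA file it produces would be converted with build_binary.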
Training ran for 4 epochs and then stopped automatically because of early stopping.
The evaluation results are as follows:
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 15.238984
- wav: file://mozilla_common_voice/clips/common_voice_zh-TW_19354170.wav
- src: "於 是"
- res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 16.088865
- wav: file://mozilla_common_voice/clips/common_voice_zh-TW_19275254.wav
- src: "沒 有"
- res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 17.799671
- wav: file://mozilla_common_voice/clips/common_voice_zh-TW_18072642.wav
- src: "近 年"
- res: ""
--------------------------------------------------------------------------------
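For reference, WER and CER of exactly 1.0 are what an empty result gives under the usual edit-distance definitions: every word and every character of the source counts as a deletion. The sketch below works through that arithmetic. It is a plain Levenshtein implementation written for illustration, not DeepSpeech's own evaluation code, and it assumes WER and CER are the edit distance divided by the length of the source.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(src, res):
    """Word error rate: word-level edit distance over the number of source words."""
    ref_words = src.split()
    return levenshtein(ref_words, res.split()) / len(ref_words)


def cer(src, res):
    """Character error rate: character-level edit distance over the source length."""
    return levenshtein(src, res) / len(src)


# First sample from the report: src "於 是", res "".
print(wer("於 是", ""), cer("於 是", ""))  # 1.0 1.0
```

So the three samples above are not partially wrong transcriptions: the decoder returned nothing at all for them.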
Is my configuration wrong, or is one of my installation steps incorrect?
Thank you very much.