Dear Support,
I am training a UTF-8 Cantonese model with the dataset from Common Voice. The Training Phase and Validation Phase completed successfully with the command below.
./DeepSpeech.py \
  --train_files /mnt/deepspeechdata/filter/CV/zh-HK/clips/train.csv \
  --dev_files /mnt/deepspeechdata/filter/CV/zh-HK/clips/dev.csv \
  --epochs 30 \
  --checkpoint_dir /mnt/deepspeechdata/filter/results/checkpoint/ \
  --alphabet_config_path /mnt/deepspeechdata/filter/CV/zh-HK/alphabet.txt \
  --scorer_path /mnt/deepspeechdata/filter/lm/kenlm.scorer \
  --reduce_lr_on_plateau \
  --learning_rate 0.0001 \
  --n_hidden 2048 \
  --train_batch_size 160 \
  --dev_batch_size 20 \
  --dropout_rate 0.28 \
  --utf8
But when the Test Phase starts, an error occurs. The error log is as below:
I Loading best validating checkpoint from /mnt/deepspeechdata/filter/results/checkpoint/best_dev-28
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /mnt/deepspeechdata/filter/CV/zh-HK/clips/test.csv
I Test epoch...
Fatal Python error: Segmentation fault
Thread 0x00007f50df91b740 (most recent call first):
File "/Segmentation fault
- Have I written custom code (as opposed to running examples on an unmodified clone of the repository) : NO
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04) : Windows - Dockerfile - Ubuntu 18.04
- TensorFlow installed from (our builds, or upstream TensorFlow) : our builds
- TensorFlow version (use command below) : tensorflow r1.15.0
- Python version : Python3.7
- Bazel version (if compiling from source) :
- GCC/Compiler version (if compiling from source) :
- CUDA/cuDNN version :
- GPU model and memory :
- Exact command to reproduce :
./DeepSpeech.py \
  --noshow_progressbar \
  --test_files /mnt/deepspeechdata/filter/CV/zh-HK/clips/test.csv \
  --checkpoint_dir /mnt/deepspeechdata/filter/results/checkpoint/ \
  --alphabet_config_path /mnt/deepspeechdata/filter/CV/zh-HK/alphabet.txt \
  --scorer_path /mnt/deepspeechdata/filter/lm/kenlm.scorer \
  --n_hidden 2048 \
  --test_batch_size 20 \
  --utf8
The installed package versions are as below:
root@94f02792732d:/DeepSpeech# pip list
Package Version Location
-------------------- ------------------------------------ --------------------
absl-py 0.9.0
alembic 1.4.2
astor 0.8.1
attrdict 2.0.1
audioread 2.1.8
beautifulsoup4 4.9.1
bs4 0.0.1
certifi 2020.4.5.2
cffi 1.14.0
chardet 3.0.4
cliff 3.1.0
cmaes 0.5.0
cmd2 0.8.9
colorlog 4.1.0
decorator 4.4.2
deepspeech 0.7.3
deepspeech-training training-deepspeech-training-VERSION /DeepSpeech/training
ds-ctcdecoder 0.7.3
gast 0.2.2
google-pasta 0.2.0
grpcio 1.29.0
h5py 2.10.0
idna 2.9
importlib-metadata 1.6.1
joblib 0.15.1
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
librosa 0.7.2
llvmlite 0.31.0
Mako 1.1.3
Markdown 3.2.2
MarkupSafe 1.1.1
numba 0.47.0
numpy 1.18.5
opt-einsum 3.2.1
optuna 1.5.0
opuslib 2.0.0
pandas 1.0.4
pbr 5.4.5
pip 20.0.2
prettytable 0.7.2
progressbar2 3.51.3
protobuf 3.12.2
pycparser 2.20
pyparsing 2.4.7
pyperclip 1.8.0
python-dateutil 2.8.1
python-editor 1.0.4
python-utils 2.4.0
pytz 2020.1
pyxdg 0.26
PyYAML 5.3.1
requests 2.23.0
resampy 0.2.2
scikit-learn 0.23.1
scipy 1.4.1
semver 2.10.1
setuptools 39.1.0
six 1.15.0
SoundFile 0.10.3.post1
soupsieve 2.0.1
sox 1.3.7
SQLAlchemy 1.3.17
stevedore 2.0.0
tensorboard 1.15.0
tensorflow 1.15.2
tensorflow-estimator 1.15.1
tensorflow-gpu 1.15.0
termcolor 1.1.0
threadpoolctl 2.1.0
tqdm 4.46.1
urllib3 1.25.9
wcwidth 0.2.4
Werkzeug 1.0.1
wheel 0.33.6
wrapt 1.12.1
zipp 3.1.0
I have reviewed all the datasets and filtered out the better-quality clips for training.
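For reference, the filtering was roughly along the lines of the sketch below. This is only a minimal illustration with hypothetical criteria (the validated.tsv path and the vote-count rule are assumptions, not the exact script I ran):

# Rough sketch of quality filtering on the Common Voice zh-HK release.
# Standard Common Voice TSV columns include: client_id, path, sentence,
# up_votes, down_votes, age, gender, accent.
import pandas as pd

df = pd.read_csv("/mnt/deepspeechdata/filter/CV/zh-HK/validated.tsv", sep="\t")

# Keep only clips with a net-positive vote count and a non-empty transcript.
filtered = df[(df["up_votes"] > df["down_votes"]) & df["sentence"].notna()]

filtered.to_csv("/mnt/deepspeechdata/filter/CV/zh-HK/validated_filtered.tsv",
                sep="\t", index=False)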
Please find the datasets and .csv files attached below:
The lm.binary and kenlm.scorer files are attached below:
When I train on the datasets without a Validation Phase, the Test Phase also fails with the same error.
Please find the logs below:
root@94f02792732d:/DeepSpeech# ./DeepSpeech.py \
> --train_files /mnt/deepspeechdata/filter/CV/zh-HK/clips/train.csv \
> --test_files /mnt/deepspeechdata/filter/CV/zh-HK/clips/test.csv \
> --epochs 30 \
> --checkpoint_dir /mnt/deepspeechdata/filter/results/checkpoint/ \
> --alphabet_config_path /mnt/deepspeechdata/filter/CV/zh-HK/alphabet.txt \
> --scorer_path /mnt/deepspeechdata/filter/lm/kenlm.scorer \
> --reduce_lr_on_plateau \
> --learning_rate 0.0001 \
> --n_hidden 2048 \
> --train_batch_size 160 \
> --test_batch_size 20 \
> --dropout_rate 0.28 \
> --utf8 \
>
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:01:09 | Steps: 1 | Loss: 1095.837158
Epoch 1 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 666.401001
Epoch 2 | Training | Elapsed Time: 0:01:10 | Steps: 1 | Loss: 261.214905
Epoch 3 | Training | Elapsed Time: 0:01:09 | Steps: 1 | Loss: 238.135345
Epoch 4 | Training | Elapsed Time: 0:01:08 | Steps: 1 | Loss: 294.818665
Epoch 5 | Training | Elapsed Time: 0:01:10 | Steps: 1 | Loss: 296.967285
Epoch 6 | Training | Elapsed Time: 0:01:10 | Steps: 1 | Loss: 264.965759
Epoch 7 | Training | Elapsed Time: 0:01:10 | Steps: 1 | Loss: 221.122955
Epoch 8 | Training | Elapsed Time: 0:01:10 | Steps: 1 | Loss: 191.377640
Epoch 9 | Training | Elapsed Time: 0:01:09 | Steps: 1 | Loss: 199.586411
Epoch 10 | Training | Elapsed Time: 0:01:08 | Steps: 1 | Loss: 200.823044
Epoch 11 | Training | Elapsed Time: 0:01:08 | Steps: 1 | Loss: 188.246185
Epoch 12 | Training | Elapsed Time: 0:01:09 | Steps: 1 | Loss: 177.267181
Epoch 13 | Training | Elapsed Time: 0:01:09 | Steps: 1 | Loss: 172.420319
Epoch 14 | Training | Elapsed Time: 0:01:10 | Steps: 1 | Loss: 172.861816
Epoch 15 | Training | Elapsed Time: 0:01:09 | Steps: 1 | Loss: 175.622055
Epoch 16 | Training | Elapsed Time: 0:01:08 | Steps: 1 | Loss: 177.549561
Epoch 17 | Training | Elapsed Time: 0:01:10 | Steps: 1 | Loss: 177.466965
Epoch 18 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 175.106949
Epoch 19 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 171.883148
Epoch 20 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 168.839874
Epoch 21 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 166.874115
Epoch 22 | Training | Elapsed Time: 0:01:09 | Steps: 1 | Loss: 165.515945
Epoch 23 | Training | Elapsed Time: 0:01:10 | Steps: 1 | Loss: 164.082031
Epoch 24 | Training | Elapsed Time: 0:01:09 | Steps: 1 | Loss: 161.798309
Epoch 25 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 159.209564
Epoch 26 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 157.023651
Epoch 27 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 155.863556
Epoch 28 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 155.569336
Epoch 29 | Training | Elapsed Time: 0:01:07 | Steps: 1 | Loss: 155.955887
I FINISHED optimization in 0:36:19.105478
I Could not find best validating checkpoint.
I Loading most recent checkpoint from /mnt/deepspeechdata/filter/results/checkpoint/train-30
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /mnt/deepspeechdata/filter/CV/zh-HK/clips/test.csv
Test epoch | Steps: 0 | Elapsed Time: 0:00:00 Fatal Python error: Segmentation fault
Thread 0x00007f28117fa700 (most recent call first):
File "/usr/lib/python3.6/threading.py", line 295 in wait
File "/usr/lib/python3.6/queue.py", line 164 in get
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/summary/writer/event_file_writer.py", line 159 in run
File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap
Thread 0x00007f2811ffb700 (most recent call first):
File "/usr/lib/python3.6/threading.py", line 295 in wait
File "/usr/lib/python3.6/queue.py", line 164 in get
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/summary/writer/event_file_writer.py", line 159 in run
File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap
Thread 0x00007f28c5f09740 (most recent call first):
File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/swigwrapper.py", line 364 in ctc_beam_search_decoder_batch
File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 134 in ctc_beam_search_decoder_batch
File "/DeepSpeech/training/deepspeech_training/evaluate.py", line 110 in run_test
File "/DeepSpeech/training/deepspeech_training/evaluate.py", line 128 in evaluate
File "/DeepSpeech/training/deepspeech_training/train.py", line 645 in test
File "/DeepSpeech/training/deepspeech_training/train.py", line 917 in main
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250 in _run_main
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299 in run
File "/DeepSpeech/training/deepspeech_tSegmentation fault
But when I reduce --n_hidden from 2048 to 512, the Test Phase executes normally.
root@94f02792732d:/DeepSpeech# ./DeepSpeech.py \
> --train_files /mnt/deepspeechdata/filter/CV/zh-HK/clips/train.csv \
> --test_files /mnt/deepspeechdata/filter/CV/zh-HK/clips/test.csv \
> --epochs 30 \
> --checkpoint_dir /mnt/deepspeechdata/filter/results/checkpoint/ \
> --alphabet_config_path /mnt/deepspeechdata/filter/CV/zh-HK/alphabet.txt \
> --scorer_path /mnt/deepspeechdata/filter/lm/kenlm.scorer \
> --reduce_lr_on_plateau \
> --learning_rate 0.0001 \
> --n_hidden 512 \
> --train_batch_size 160 \
> --test_batch_size 20 \
> --dropout_rate 0.28 \
> --utf8 \
>
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 1093.047852
Epoch 1 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 1046.377197
Epoch 2 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 990.846680
Epoch 3 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 930.426086
Epoch 4 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 864.664734
Epoch 5 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 794.309631
Epoch 6 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 717.662781
Epoch 7 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 636.994812
Epoch 8 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 552.994995
Epoch 9 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 470.793152
Epoch 10 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 392.140533
Epoch 11 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 325.533356
Epoch 12 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 276.200684
Epoch 13 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 244.471344
Epoch 14 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 229.381546
Epoch 15 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 226.331833
Epoch 16 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 230.216553
Epoch 17 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 237.089355
Epoch 18 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 243.226028
Epoch 19 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 249.238281
Epoch 20 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 252.872406
Epoch 21 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 255.091064
Epoch 22 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 255.621185
Epoch 23 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 255.067184
Epoch 24 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 253.103195
Epoch 25 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 250.204681
Epoch 26 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 246.448883
Epoch 27 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 242.263184
Epoch 28 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 238.072708
Epoch 29 | Training | Elapsed Time: 0:00:06 | Steps: 1 | Loss: 233.581329
I FINISHED optimization in 0:03:30.042398
I Could not find best validating checkpoint.
I Loading most recent checkpoint from /mnt/deepspeechdata/filter/results/checkpoint/train-30
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/weights
Testing model on /mnt/deepspeechdata/filter/CV/zh-HK/clips/test.csv
Test epoch | Steps: 1 | Elapsed Time: 0:00:12
Test on /mnt/deepspeechdata/filter/CV/zh-HK/clips/test.csv - WER: 1.000000, CER: 0.985772, loss: 322.528870
--------------------------------------------------------------------------------
Best WER:
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.913043, loss: 268.580719
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20137012.wav
- src: "姨 媽 同 我 去 長 洲 冰 廠 路 買 餸"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.923077, loss: 450.225311
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20137138.wav
- src: "老 闆 請 我 去 葵 涌 童 子 街 間 餐 廳 食 西 多 士 飲 奶 茶"
- res: "請問我想去大埔滘科�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.960000, loss: 273.003296
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20197584.wav
- src: "有 個 老 人 去 左 西 貢 沙 咀 街 食 齋"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.972973, loss: 433.278564
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20101461.wav
- src: "八 號 風 球 好 大 風 西 營 盤 爹 核 里 依 家 橫 風 橫 雨"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.974359, loss: 447.331390
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20148149.wav
- src: "老 友 唔 記 得 去 石 硤 尾 澤 安 道 南 參 加 緩 步 跑 練 習"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
Median WER:
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 402.582550
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20137044.wav
- src: "細 佬 喺 沙 田 沙 田 車 站 圍 收 養 左 一 隻 流 浪 狗"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 397.107635
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20226277.wav
- src: "有 無 人 知 道 愉 景 灣 深 水 埗 徑 係 點 去 㗎"
- res: "請問我想去大埔�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 392.437347
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20197632.wav
- src: "亞 爸 喺 灣 仔 盧 押 道 買 左 三 磅 士 多 啤 梨 返 屋 企"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 372.739594
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20136999.wav
- src: "流 浪 貓 喺 沙 田 禾 盛 街 嘅 垃 圾 桶 搵 野 食"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 366.272583
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20101462.wav
- src: "有 個 老 婆 婆 喺 牛 池 灣 紫 葳 路 等 緊 小 巴"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
Worst WER:
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 358.208649
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20226141.wav
- src: "有 個 老 婆 婆 喺 東 涌 翔 東 路 等 緊 小 巴"
- res: "請問我想去大埔滘�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 209.531494
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20137132.wav
- src: "我 住 喺 何 文 田 站 附 近"
- res: "請問我想去大�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 78.237846
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20143952.wav
- src: "寧 波 街"
- res: "請問�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.000000, loss: 74.291573
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20143973.wav
- src: "咩 事 呀"
- res: "請問我�"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 1.200000, loss: 73.213852
- wav: file:///mnt/deepspeechdata/filter/CV/zh-HK/clips/test/common_voice_zh-HK_20137365.wav
- src: "義 德 道"
- res: "請問我想去�"
--------------------------------------------------------------------------------