- Have I written custom code (as opposed to running examples on an unmodified clone of the repository) : No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04) : Linux Ubuntu 18.04.1 LTS
- TensorFlow installed from (our builds, or upstream TensorFlow) : Using Docker
- TensorFlow version (use command below) : Using Docker
- Python version : Using Docker
- Bazel version (if compiling from source) : Docker
- GCC/Compiler version (if compiling from source) : Docker
- CUDA/cuDNN version : Docker
- GPU model and memory : GeForce GTX 1650/PCIe/SSE2
- Exact command to reproduce : Provided below
This is my command:
root@e658b51810f6:/DeepSpeech# python3 DeepSpeech.py \
--train_files deepspeech-data/cv-corpus-6.1-2020-12-11/zh-CN/clips/train.csv \
--dev_files deepspeech-data/cv-corpus-6.1-2020-12-11/zh-CN/clips/dev.csv \
--test_files deepspeech-data/cv-corpus-6.1-2020-12-11/zh-CN/clips/test.csv \
--checkpoint_dir deepspeech-data/checkpoints --export_dir deepspeech-data/exported-model --n_hidden 256 --reduce_lr_on_plateau true --plateau_epochs 8 --plateau_reduction 0.08 --early_stop true --es_epochs 10 --es_min_delta 0.06 --dropout_rate 0.4 --bytes_output_mode --automatic_mixed_precision --train_batch_size 128 --dev_batch_size 128 --test_batch_size 128 --lm_alpha 0.6940122363709647 --lm_beta 4.777924224113021 --epochs 1
This is the error log I received:
Testing model on deepspeech-data/cv-corpus-6.1-2020-12-11/zh-CN/clips/test.csv
Test epoch | Steps: 0 | Elapsed Time: 0:00:00
Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/DeepSpeech/training/deepspeech_training/train.py", line 982, in run_script
    absl.app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/DeepSpeech/training/deepspeech_training/train.py", line 958, in main
    test()
  File "/DeepSpeech/training/deepspeech_training/train.py", line 682, in test
    samples = evaluate(FLAGS.test_files.split(','), create_model)
  File "/DeepSpeech/training/deepspeech_training/evaluate.py", line 132, in evaluate
    samples.extend(run_test(init_op, dataset=csv))
  File "/DeepSpeech/training/deepspeech_training/evaluate.py", line 114, in run_test
    cutoff_prob=FLAGS.cutoff_prob, cutoff_top_n=FLAGS.cutoff_top_n)
  File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 228, in ctc_beam_search_decoder_batch
    for beam_results in batch_beam_results
  File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 228, in <listcomp>
    for beam_results in batch_beam_results
  File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 227, in <listcomp>
    [(res.confidence, alphabet.Decode(res.tokens)) for res in beam_results]
  File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 138, in Decode
    return res.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 0: invalid continuation byte
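For reference, the error message itself can be reproduced outside DeepSpeech. With --bytes_output_mode the decoder output is a raw byte sequence, and 0xe5 is the lead byte of a three-byte UTF-8 character, so it has to be followed by two continuation bytes in the range 0x80 to 0xBF. A minimal sketch, assuming a made-up byte string rather than real decoder output:

```python
# Hypothetical input: a stray lead byte 0xe5 followed by one complete
# three-byte character (0xe5 0xa5 0xbd encodes U+597D).
bad = b"\xe5\xe5\xa5\xbd"


def is_continuation(byte):
    """Return True if the byte has the UTF-8 continuation pattern 10xxxxxx."""
    return 0x80 <= byte <= 0xBF


try:
    bad.decode("utf-8")
    message = None
except UnicodeDecodeError as err:
    # Same kind of failure as in the traceback above.
    message = str(err)

print(message)
# The byte right after the first 0xe5 is another lead byte, not a continuation.
print(is_continuation(bad[1]))
```

So the message seems to say that the model emitted a lead byte that was not followed by a valid continuation byte.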
After searching, I found a solution from @yang_jiao:
change the Decode function in ds_ctcdecoder/__init__.py
def Decode(self, input):
    '''Decode a sequence of labels into a string.'''
    res = super(UTF8Alphabet, self).Decode(input)
    # 'ignore' drops bytes that are not valid UTF-8 instead of raising
    return res.decode('utf-8', 'ignore')
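To see what that change does, here is a small sketch comparing two error handlers of bytes.decode on a made-up byte string (again, not real decoder output). 'ignore' silently drops the invalid bytes, so the traceback goes away but the transcript can lose characters; 'replace' leaves a visible U+FFFD marker instead.

```python
# Hypothetical input: one invalid lead byte, then a valid three-byte character.
sample = b"\xe5\xe5\xa5\xbd"

# 'ignore' skips the bytes that cannot be decoded.
ignored = sample.decode("utf-8", "ignore")

# 'replace' substitutes U+FFFD for each undecodable piece.
replaced = sample.decode("utf-8", "replace")

print(repr(ignored), len(ignored))
print(repr(replaced), len(replaced))
```

Either handler only hides the symptom during testing; the model is still predicting byte sequences that are not valid UTF-8.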
But when I try to edit the file inside the Docker container, the file appears to be empty:
root@e658b51810f6:/DeepSpeech# cd /usr/local/lib/python3.6/dist-packages/ds_ctcdecoder
root@e658b51810f6:/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder# vim __init__.py
How can I solve this problem?
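One way to rule out opening the wrong file is to ask Python which file it would actually import and how large it is. Note that the package file is named __init__.py with double underscores, and opening any other name in vim creates a new empty buffer. This is only a sketch; module_file_info is a hypothetical helper name, and "json" is used as a stand-in module that exists everywhere.

```python
import importlib.util
import os


def module_file_info(name):
    """Return (path, size_in_bytes) of the file Python would import for
    `name`, or None if no source file can be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        return None
    if not os.path.isfile(spec.origin):
        # Built-in or namespace modules have no regular file on disk.
        return None
    return spec.origin, os.path.getsize(spec.origin)


# Inside the container this should point at ds_ctcdecoder/__init__.py;
# elsewhere it prints None because the package is not installed.
print(module_file_info("ds_ctcdecoder"))
print(module_file_info("json"))
```

If the reported size is greater than zero, the installed file is not empty and the editor was probably pointed at a different name.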
BTW, another solution is to use a scorer.
So I used the zh-CN scorer:
root@e658b51810f6:/DeepSpeech# python3 DeepSpeech.py --train_files deepspeech-data/cv-corpus-6.1-2020-12-11/zh-CN/clips/train.csv --dev_files deepspeech-data/cv-corpus-6.1-2020-12-11/zh-CN/clips/dev.csv --test_files deepspeech-data/cv-corpus-6.1-2020-12-11/zh-CN/clips/test.csv --checkpoint_dir deepspeech-data/checkpoints --export_dir deepspeech-data/exported-model --n_hidden 256 --reduce_lr_on_plateau true --plateau_epochs 8 --plateau_reduction 0.08 --early_stop true --es_epochs 10 --es_min_delta 0.06 --dropout_rate 0.4 --bytes_output_mode --automatic_mixed_precision --train_batch_size 128 --dev_batch_size 128 --test_batch_size 128 --lm_alpha 0.6940122363709647 --lm_beta 4.777924224113021 --epochs 1 \
--scorer_path deepspeech-data \
--scorer deepspeech-0.9.3-models-zh-CN.scorer
The error I received:
Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/DeepSpeech/training/deepspeech_training/train.py", line 982, in run_script
    absl.app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/DeepSpeech/training/deepspeech_training/train.py", line 949, in main
    early_training_checks()
  File "/DeepSpeech/training/deepspeech_training/train.py", line 934, in early_training_checks
    FLAGS.scorer_path, Config.alphabet)
  File "/usr/local/lib/python3.6/dist-packages/ds_ctcdecoder/__init__.py", line 36, in __init__
    raise ValueError('Scorer initialization failed with error code 0x{:X}'.format(err))
ValueError: Scorer initialization failed with error code 0x2005
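One thing that may be worth checking here: the command passes deepspeech-data as the scorer path, which is the data directory rather than the .scorer file itself. As far as I can tell from the DeepSpeech 0.9 error table, 0x2005 means the scorer file could not be read, but treat that mapping as an assumption. A small sketch of a sanity check; describe_scorer_path is a hypothetical helper and not part of DeepSpeech:

```python
import os


def describe_scorer_path(path):
    """Return a short description of what `path` points at."""
    if not os.path.exists(path):
        return "does not exist"
    if os.path.isdir(path):
        return "is a directory, not a scorer file"
    if not os.access(path, os.R_OK):
        return "is not readable"
    if os.path.getsize(path) == 0:
        return "is empty"
    return "looks like a readable file"


# Path taken from the command above; the result depends on the working directory.
print("deepspeech-data", describe_scorer_path("deepspeech-data"))
```

This only checks the path, not whether the scorer contents are compatible with the alphabet or the bytes output mode.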