I generated my model with
python3 ./../DeepSpeech/DeepSpeech.py \
--beam_width 1024 \
--train_files clips/train.csv \
--dev_files clips/dev.csv \
--test_files clips/test.csv \
--train_batch_size 20 \
--dev_batch_size 48 \
--test_batch_size 48 \
--n_hidden 2048 \
--epochs 1 \
--report_count 900000 \
--dropout_rate 0.11 \
--early_stop True \
--learning_rate 0.0001 \
--lm_alpha 0.75 \
--lm_beta 2.2 \
--lm_binary_path ./../my-model/lm.binary \
--export_dir result_pb/models-tl \
--checkpoint_dir result_pb/ckpts-tl2 \
--alphabet_config_path ./../alphabet.txt
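Looking at this command again, I only passed --lm_binary_path and never built a trie. If it matters, my understanding is that 0.6 ships a generate_trie tool with native_client that builds one from the alphabet and the LM binary; a sketch of what I think I would have had to run (the output path ./../my-model/trie is just a guess for my layout):

```shell
# Hedged sketch, assuming the DeepSpeech 0.6 native_client generate_trie tool.
# Argument order (as I understand it): alphabet file, LM binary, output trie path.
./generate_trie ./../alphabet.txt ./../my-model/lm.binary ./../my-model/trie
```
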
and during training it recognized the phrases with great accuracy. Example:
WER: 0.000000, CER: 0.000000, loss: 32.239586
- wav: file:///home/adrian/es/clips/common_voice_es_19708616.wav
- src: “la iglesia fue restaurada en diversas ocasiones”
- res: “la iglesia fue restaurada en diversas ocasiones”
I then tried the same sentence with the exported model (output_graph.pb). I executed:
deepspeech --model ./../es/result_pb/models-tl/output_graph.pb --lm lm.binary --audio ./../es/clips/common_voice_es_19708616.wav
But the result was:
la lecia fuedestabrada endivesas opaciones
2020-03-06 12:20:40.526840: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-03-06 12:20:40.527180: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-06 12:20:40.527717: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-06 12:20:40.528159: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-06 12:20:40.528580: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10783 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
Loading model from file ./../es/result_pb/models-tl/output_graph.pb
Loaded model in 2.73s.
Running inference.
Inference took 3.545s for 4.680s audio file.
la lecia fuedestabrada endivesas opaciones
Where do you think the mistake might be? Is it because I don’t have the trie file?
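In case the missing trie really is the problem: my understanding is that the 0.6.x client accepts a --trie flag alongside --lm, so I suppose the call should have looked something like this (the trie path is hypothetical, since I never generated one):

```shell
# Hedged sketch: the same inference call as above, but with a trie passed
# explicitly next to the LM binary (trie path is a placeholder).
deepspeech --model ./../es/result_pb/models-tl/output_graph.pb \
  --lm lm.binary \
  --trie ./../my-model/trie \
  --audio ./../es/clips/common_voice_es_19708616.wav
```
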
Version: DeepSpeech 0.6.1
Thank you very much!