My model recognizes speech well during training, but not when I use the exported model.

I trained my model with:

python3 ./../DeepSpeech/DeepSpeech.py \
  --beam_width 1024 \
  --train_files clips/train.csv \
  --dev_files clips/dev.csv \
  --test_files clips/test.csv \
  --train_batch_size 20 \
  --dev_batch_size 48 \
  --test_batch_size 48 \
  --n_hidden 2048 \
  --epochs 1 \
  --report_count 900000 \
  --dropout_rate 0.11 \
  --early_stop True \
  --learning_rate 0.0001 \
  --lm_alpha 0.75 \
  --lm_beta 2.2 \
  --lm_binary_path ./../my-model/lm.binary \
  --export_dir result_pb/models-tl \
  --checkpoint_dir result_pb/ckpts-tl2 \
  --alphabet_config_path ./../alphabet.txt

and during training it recognized the phrases with great accuracy. Example:

WER: 0.000000, CER: 0.000000, loss: 32.239586

  • wav: file:///home/adrian/es/clips/common_voice_es_19708616.wav
  • src: “la iglesia fue restaurada en diversas ocasiones”
  • res: “la iglesia fue restaurada en diversas ocasiones”

I tried the same sentence with the exported model (output_graph.pb).

I executed:

deepspeech --model ./../es/result_pb/models-tl/output_graph.pb --lm lm.binary --audio ./../es/clips/common_voice_es_19708616.wav

But the result was:

la lecia fuedestabrada endivesas opaciones

2020-03-06 12:20:40.526840: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N
2020-03-06 12:20:40.527180: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-06 12:20:40.527717: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-06 12:20:40.528159: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-06 12:20:40.528580: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10783 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
Loading model from file ./../es/result_pb/models-tl/output_graph.pb
Loaded model in 2.73s.
Running inference.
Inference took 3.545s for 4.680s audio file.
la lecia fuedestabrada endivesas opaciones

Where do you think the mistake might be? Is it because I don’t have the trie file?

Version: DeepSpeech 0.6.1

Thank you very much!

You can’t use the LM without the trie. Please generate it.
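As a sketch of what that looks like in DeepSpeech 0.6.1: the trie is built with the `generate_trie` tool from the native client, and then passed to the `deepspeech` CLI alongside the LM. The paths below are illustrative placeholders based on the ones in this thread, not exact instructions:

```shell
# Build the trie from your alphabet and KenLM binary
# (generate_trie ships with the DeepSpeech 0.6.x native client)
./generate_trie ./../alphabet.txt ./../my-model/lm.binary ./../my-model/trie

# Then pass both --lm and --trie at inference time
deepspeech \
  --model ./../es/result_pb/models-tl/output_graph.pb \
  --lm ./../my-model/lm.binary \
  --trie ./../my-model/trie \
  --audio ./../es/clips/common_voice_es_19708616.wav
```

Without `--trie`, the decoder can't use the language model to constrain the output, which is why the raw acoustic-model transcript looks garbled.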

Is that result from training, dev, or test?

You only train for one epoch? How much data do you have?

Aside: Are you working on Spanish? Have you had a look at joining the efforts and leveraging https://github.com/Common-Voice/commonvoice-fr/blob/master/DeepSpeech/Dockerfile.train like Italian Community did? https://github.com/MozillaItalia/DeepSpeech-Italian-Model

There is no need to duplicate the efforts.


Thanks for your incredible support @lissyx
I used `--epochs 1` to do a quick test, but I am aware that I have to increase it. Thanks again :smiley:

I am working on an ambitious project; in a few weeks I will show it to all Mozillians :+1:


This is nice, but please try to share efforts. Everyone will move faster.

Totally agree with you! #community-portal:open-source-community

@lissyx, another question related to the trie: is it normal that training with the trie parameter gives me worse results?

The LM and trie are not used during training.


Thanks @lissyx !!!