My inference is completely blank after training

araf.hasan15 · November 29, 2019, 1:34pm

I trained on just 500 audio clips (Bangla). The amount of data I used is:
train.csv --> 500
dev.csv --> 100
test.csv --> 30
I trained for approximately 15 hours and completed 3 epochs (in addition, my machine takes a very long time to train). When I export ‘output_graph.pb’ and ‘output_graph.pbmm’ from this training run and run inference to check the output, the result is completely blank: no transcript is shown. I expected at least one character to appear, but that does not happen. (NB: I successfully built my own ‘lm’ and ‘trie’ files.) The inference log also shows that these files loaded successfully, yet no transcription appears. Here is my complete inference run:

araf15@araf15-HP-Pavilion-Notebook:~/test_Gen_Bangla$ deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio common_voice_bn_00000008.wav
Loading model from file models/output_graph.pbmm
TensorFlow: v1.13.1-10-g3e0cc53
DeepSpeech: v0.5.1-0-g4b29b78
2019-11-29 19:01:41.863077: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-29 19:01:41.868988: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "CPU"') for unknown op: UnwrapDatasetVariant
2019-11-29 19:01:41.869023: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: WrapDatasetVariant
2019-11-29 19:01:41.869032: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "CPU"') for unknown op: WrapDatasetVariant
2019-11-29 19:01:41.869108: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: UnwrapDatasetVariant
Loaded model in 0.00804s.
Loading language model from files models/lm.binary models/trie
Loaded language model in 0.00941s.
Running inference.

Inference took 2.314s for 2.880s audio file.

Can anyone please give me an idea of where I am going wrong?

lissyx · November 29, 2019, 1:35pm

You just don’t have enough data, and you haven’t trained for long enough. So you do get a transcript, it’s just empty.

araf.hasan15 · November 29, 2019, 1:47pm

@lissyx can you please tell me roughly how much training data I should have, and how many epochs I should complete, to get at least a rough transcription?

lissyx · November 29, 2019, 1:49pm

This is hard to say as is. For English we have 3000-4000 hours of data for training. The number of epochs depends on other parameters, but you would need at least 15-20 with a decent amount of data. For French, I use ~600h of data and train for 25 epochs.
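
To put those hour counts next to a 500-clip dataset, here is a rough sketch that estimates total audio hours from a DeepSpeech-style CSV using its `wav_filesize` column. It assumes 16 kHz, 16-bit, mono PCM WAV files with a 44-byte header, and the stand-in CSV assumes every clip is about as long as the 2.880 s sample in the log above; both are assumptions, not facts about the actual dataset. The `estimate_hours` helper and the generated rows are hypothetical.

```python
import csv
import io

# Assumption: 16 kHz, 16-bit (2 bytes), mono PCM WAV with a 44-byte header.
BYTES_PER_SECOND = 16000 * 2 * 1
WAV_HEADER_BYTES = 44


def estimate_hours(csv_file):
    """Estimate total audio hours from the wav_filesize column of a CSV."""
    total_seconds = 0.0
    for row in csv.DictReader(csv_file):
        payload = max(int(row["wav_filesize"]) - WAV_HEADER_BYTES, 0)
        total_seconds += payload / BYTES_PER_SECOND
    return total_seconds / 3600.0


# Hypothetical stand-in for train.csv: 500 clips of 2880 ms each.
clip_bytes = WAV_HEADER_BYTES + 2880 * BYTES_PER_SECOND // 1000
rows = ["wav_filename,wav_filesize,transcript"]
rows += [f"clip_{i}.wav,{clip_bytes},placeholder" for i in range(500)]
sample_csv = io.StringIO("\n".join(rows))

hours = estimate_hours(sample_csv)
print(f"Estimated training audio: {hours:.2f} hours")
```

Under these assumptions, 500 clips come to roughly 0.4 hours of audio, far below the hundreds of hours mentioned above. To check a real dataset, pass an opened file instead, for example `estimate_hours(open("train.csv", newline=""))`.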

araf.hasan15 · November 29, 2019, 1:54pm

@lissyx Thanks, this is very useful information that I was searching for!