Gibberish inference result when using pre-trained model for unknown distribution

sakhawatsumit · June 8, 2018, 9:35am

I was trying to run inference with the pre-trained LibriSpeech model on some audio samples collected at random from the web. The result is quite disappointing: the model predicted every single character wrong.

Ground truth: “The Story of Arthur the Rat. Once upon a time there was a rat who couldn’t make up his”

Predicted : “HAM AUUEWIR CCHIUVHE C O HO AA UBBUSH”

Is there any way to solve this?

lissyx · June 8, 2018, 9:38am

Random audio samples? Can you share more information on their characteristics?

sakhawatsumit · June 8, 2018, 10:01am

I downloaded the audio sample from here, then split it into 7-second segments.
Audio properties:
Duration: 7 s
Channels: 2
Sampling rate: 44.1 kHz
Bit rate: 112 kbps

If you need more info please feel free to ask.

lissyx · June 8, 2018, 10:34am

Okay, then I’d guess our automatic resampling is not good enough and likely destroys the data. The model expects mono, 16 kHz, 16-bit PCM. We do have code that performs the conversion to that format, but evidently it is not good enough.
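For anyone hitting the same problem, a quick way to see whether a file already matches that format is to read its header before running inference. This is only a minimal sketch using Python's standard `wave` module; the function name `check_wav_format` is made up for illustration, and it only works on WAV files (a compressed file such as an MP3 would need to be converted first).

```python
import wave


def check_wav_format(path, rate=16000, channels=1, sample_width_bytes=2):
    """Return a list of mismatches between a WAV file and the expected format.

    Defaults follow the format mentioned above: mono, 16 kHz, 16-bit PCM
    (16 bits = 2 bytes per sample). An empty list means the header matches.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != channels:
            problems.append(
                f"channels: {wav.getnchannels()} (expected {channels})"
            )
        if wav.getframerate() != rate:
            problems.append(
                f"sample rate: {wav.getframerate()} Hz (expected {rate} Hz)"
            )
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"sample width: {wav.getsampwidth() * 8} bits "
                f"(expected {sample_width_bytes * 8} bits)"
            )
    return problems
```

A stereo 44.1 kHz file like the one described above would come back with two mismatches (channels and sample rate), which suggests converting it explicitly instead of relying on automatic resampling.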

lissyx · June 8, 2018, 10:36am

Would you have a direct link to share one sample? I’d like to see what happens after transformation.

sakhawatsumit · June 8, 2018, 10:49am

Sure: audio sample, “Arthur the Rat”.

deepakgupta1313 · June 18, 2018, 11:42pm

Use the sox or ffmpeg command for proper encoding of the input file:
"sox " + ip_file + " --bits 16 --channels 1 --rate 16000 " + op_file;
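The same sox call can be made from Python without string concatenation, which avoids quoting problems with file names that contain spaces. This is a rough sketch, assuming `sox` is installed and on the PATH; the helper names `build_sox_command` and `convert_for_inference` are hypothetical, and the flags are the ones from the command above.

```python
import shutil
import subprocess


def build_sox_command(ip_file, op_file):
    """Build the sox argument list for 16-bit, mono, 16 kHz output.

    Format options placed before the output file name apply to the output.
    """
    return [
        "sox", ip_file,
        "--bits", "16",
        "--channels", "1",
        "--rate", "16000",
        op_file,
    ]


def convert_for_inference(ip_file, op_file):
    """Run sox on ip_file and write the converted audio to op_file."""
    if shutil.which("sox") is None:
        raise RuntimeError("sox was not found on PATH")
    # check=True raises CalledProcessError if sox exits with an error.
    subprocess.run(build_sox_command(ip_file, op_file), check=True)
    return op_file
```

Passing the arguments as a list means no shell is involved, so paths are handed to sox exactly as given. Whether sox can read a compressed input such as MP3 depends on how it was built, so that part is an assumption about the local install.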
