I recently tried out DeepSpeech, following the instructions from the docs:
I installed it via pip and downloaded the pre-trained English models. My source was a video of a press conference, which I first converted to a (hopefully) fitting wav file via ffmpeg:
ffmpeg -i [videofile] -acodec pcm_u8 -ar 16000 out.wav
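In case it helps diagnose things, here is how one could double-check what that conversion actually produced, using Python's stdlib wave module (a sketch only: since I can't attach out.wav, it writes a tiny placeholder file with the same parameters as my ffmpeg command and then reads its header back):

```python
import wave

# Hypothetical stand-in for out.wav: one second of silence with the
# parameters from the ffmpeg command above (pcm_u8, mono, 16 kHz).
with wave.open("check.wav", "wb") as w:
    w.setnchannels(1)            # mono
    w.setsampwidth(1)            # pcm_u8 -> 1 byte per sample
    w.setframerate(16000)        # -ar 16000
    w.writeframes(bytes(16000))  # 16000 frames of silence

# Read the header back to see what a consumer of the file would see.
with wave.open("check.wav", "rb") as w:
    print(w.getnchannels(), w.getsampwidth(), w.getframerate())
    # -> 1 1 16000  (sample width 1 byte, i.e. 8-bit audio)
```

The same read-back check on the real out.wav would show whether the file ended up with the channel count, sample width, and sample rate one intended.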
The video was around 1:15 h long, mostly spoken English without much background noise, so I'd expect it to produce reasonable output. The transcription process ran for around 25 minutes.
The output looked like this:
“entertainments internationalisation teetotallers teetotallers teetotalers teetotallers oesterreichischer disconsolately specialisation inaccessibleness teetotallers teetotalers teetotallers teetotallers teetotallers teetotallers secessionists etiennette itineraries teetotalers etiennette […]”
So it was just meaningless gibberish (the word "teetotallers" appeared a lot, for whatever reason). The whole output was only ~5000 bytes; for over an hour of spoken language one would expect much, much more text.
I suspect I made some fundamental mistake somewhere that produced this useless output, but I have no idea where. Any pointers?