Why 5s audio?

m3m3.chan · June 26, 2019, 1:10pm

Hello,

I would like to know why the instructions state that recognition works only on 5s-long audio files. Will it not work on files >10s, or is it that the quality just drops?

Thanks,

lissyx · June 26, 2019, 1:22pm

This was an old constraint, and I can’t find any references to it. Can you please link to where you saw it?

dabinat · June 26, 2019, 2:20pm

Oh, that’s been lifted now? I was still splitting everything up into small segments.

Is the length limit still there, only higher, or can we assume it to be unlimited?
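For anyone reading along, the kind of splitting I mean can be sketched roughly as below. This is only an illustration using Python's standard `wave` module; the function name `split_wav`, the output naming scheme, and the 5-second default are my own assumptions, not anything from the DeepSpeech tooling. It also cuts at fixed offsets, so it can slice a word in half.

```python
import wave
from pathlib import Path


def split_wav(src_path, out_dir, segment_seconds=5.0):
    """Split a WAV file into fixed-length segments (hypothetical helper).

    Returns the list of segment paths in order. The last segment may be
    shorter than segment_seconds.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        # Number of audio frames that make up one segment.
        frames_per_segment = int(params.framerate * segment_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_segment)
            if not frames:
                break
            out_path = out_dir / f"{Path(src_path).stem}_{index:04d}.wav"
            with wave.open(str(out_path), "wb") as dst:
                # Copy channel count, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
            index += 1
    return written
```

Splitting on silence instead of fixed offsets would probably give cleaner transcripts, but that needs more than the standard library.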

lissyx · June 26, 2019, 2:24pm

Well, we had that “limit” back then because we knew the model did not perform well on long audio, due to the bidirectional layer. Now that we have proper streaming, it should be better.

If you are referring to training, there’s still a limit of sorts, because audio that is too long is hard to fit into GPU memory.
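On the training side, one way to deal with this is to set aside overly long clips before building the training set. A minimal sketch, assuming plain WAV files and using only Python's standard `wave` module; the helper names and the 10-second cutoff are arbitrary illustrations, and the right value depends on your GPU and batch size:

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()


def filter_by_duration(paths, max_seconds=10.0):
    """Partition WAV paths into (kept, dropped) by duration.

    max_seconds is an illustrative threshold, not a DeepSpeech setting.
    """
    kept, dropped = [], []
    for path in paths:
        if wav_duration_seconds(path) <= max_seconds:
            kept.append(path)
        else:
            dropped.append(path)
    return kept, dropped
```

The dropped clips could then be split into shorter pieces (with matching transcripts) rather than discarded.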

m3m3.chan · June 26, 2019, 3:15pm

I read it on GitHub here and thought that was still the case, since it’s mentioned on the project homepage:

Once everything is installed, you can then use the deepspeech binary to do speech-to-text on short (approximately 5-second long) audio files

Anyway, thanks for the response.
