Using Deep Speech

(Buvana R) #21

Hello, what are the training data sets that went into the model that is available at

Will you release a fully trained NN?

(kdavis) #22

LibriSpeech[1], Fisher[2,3,4,5], and Switchboard[6]

(Mirko) #23

I was wondering about execution time:

Inference took 3.607s for 1.393s audio file.

Is this a normal execution time? I have seen some examples online where 30s was needed for a 28s audio file.
I also know of a few examples where the usual time is half the duration of the audio.
I haven’t tried GPU-powered DeepSpeech yet, since my hardware and OS are fighting with the Nvidia drivers at the moment.


(kdavis) #24

Unfortunately, whether this is “normal” depends entirely on the hardware.

(Mirko) #25

Thanks for the reply :slightly_smiling_face:
So if I set up a box with a more powerful GPU, I should expect much better results.
The only reason I ask is that some proprietary software boasts a 1/0.5 ratio of audio duration to transcription time …

(kdavis) #26

A 1/0.5 ratio should be achievable on a GeForce GTX 1070 or above for clips a few seconds long.
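
For anyone comparing numbers, here is a minimal sketch of how the ratio discussed in this thread can be computed from the timings the client prints. The function names `real_time_factor` and `meets_target` are hypothetical helpers made up for illustration and are not part of DeepSpeech; the default target of 0.5 simply mirrors the 1/0.5 ratio mentioned above.

```python
def real_time_factor(inference_seconds: float, audio_seconds: float) -> float:
    """Return inference time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return inference_seconds / audio_seconds


def meets_target(inference_seconds: float, audio_seconds: float,
                 target: float = 0.5) -> bool:
    """Check a timing against a target ratio (0.5 is an assumed default)."""
    return real_time_factor(inference_seconds, audio_seconds) <= target


# Timings quoted earlier in this thread: 3.607s of inference for 1.393s of audio.
rtf = real_time_factor(3.607, 1.393)
print(f"real-time factor: {rtf:.2f}")
print("meets 0.5 target:", meets_target(3.607, 1.393))
```

A value above 1 means transcription is slower than the audio itself. Very short clips tend to look worse on this measure if a fixed startup cost is included in the timing, so it may be worth measuring a longer file as well.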

(Mirko) #27

Great, thanks for info kdavis :slight_smile:

(Shriya485) #28

How can I use the pre-trained model?

(kdavis) #29

The README of the current v0.1.1 release describes usage.

(David Radio) #30

Hi! I got the same error, did you fix it?

(David Radio) #31

Can anybody fix this? :roll_eyes::roll_eyes::roll_eyes::frowning:

(Lissyx) #32

Please avoid hijacking threads, and properly document your error and your setup; otherwise, nobody can help you.

(rinin_farina) #33

Based on the snapshot by @yesterdays, Python 2.7.5 is being used. Use Python 3.5 instead. I am using Python 3.5 and it works fine.
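
A quick way to check which interpreter a script is actually running under is sketched below. The helper name `is_supported` is hypothetical, and the (3, 5) minimum is only taken from the experience described in this post, not from any official requirement.

```python
import sys


def is_supported(version_info=sys.version_info, minimum=(3, 5)):
    """Return True if the major.minor version is at least `minimum`.

    The (3, 5) default is an assumption based on this thread.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("At least 3.5:", is_supported())
```

If this reports an older version than expected, the `python` command is probably resolving to a different interpreter than the one the package was installed into.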