Using Deep Speech


(Buvana R) #21

Hello, what are the training data sets that went into the model that is available at https://github.com/mozilla/DeepSpeech/releases?


Will you release a fully trained NN?


(kdavis) #22

LibriSpeech[1], Fisher[2,3,4,5], and Switchboard[6]


(Mirko) #23

Hi,
I was wondering about the execution time:

Inference took 3.607s for 1.393s audio file.

Is this a normal execution time? I have seen some examples online where 30 s was needed for a 28 s audio file.
I also know of a few examples where the usual time is half the duration of the audio.
I haven’t tried GPU-powered DeepSpeech, since my hardware and OS are fighting with Nvidia at the moment.

Thanks,
mirko


(kdavis) #24

Unfortunately, whether this is “normal” depends entirely on the hardware.


(Mirko) #25

Thanks for the reply :slightly_smiling_face:
So if I prepare a more powerful GPU box, I should expect much better results.
The only reason I ask is that some proprietary software brags about a 1/0.5 ratio of audio duration to transcription time …


(kdavis) #26

A 1/0.5 ratio should be achievable on a GeForce GTX 1070 or above for clips a few seconds long.
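
To make the ratios in this thread easier to compare, here is a minimal sketch that expresses them as a real-time factor (inference time divided by audio duration), so a 1/0.5 ratio corresponds to a factor of 0.5. This is not part of DeepSpeech; the helper names `wav_duration_seconds` and `real_time_factor` are hypothetical, and only the Python standard library is assumed.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(inference_seconds, audio_seconds):
    """Inference time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return inference_seconds / audio_seconds


# Figures quoted earlier in this thread.
print(round(real_time_factor(3.607, 1.393), 2))  # 2.59 (slower than real time)
print(round(real_time_factor(30.0, 28.0), 2))    # 1.07 (roughly real time)
```

By that measure the 3.607 s run on a 1.393 s clip is about 2.6 times slower than real time, while the target discussed here is 0.5.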


(Mirko) #27

Great, thanks for the info, kdavis :slight_smile:
Regards,
mirko


(Shriya485) #28

How can I use the pre-trained model?


(kdavis) #29

The README of the current v.0.1.1 release describes usage.


(David Radio) #30

Hi! I got the same error. Did you fix it?