Hello, which training data sets went into the model that is available at https://github.com/mozilla/DeepSpeech/releases?
I was wondering about execution time:
Inference took 3.607s for 1.393s audio file.
Is this a normal execution time? I have seen some examples online where 30 s was needed for a 28 s audio file.
I also know of a few examples where the usual time is half the duration of the audio.
I haven’t tried GPU-powered DeepSpeech, since my hardware and OS are fighting with Nvidia at the moment.
Unfortunately, whether this is “normal” depends entirely on the hardware.
Thanks for the reply.
So if I prepare a more powerful GPU box, I should expect much better results.
The only reason I ask is that some proprietary software brags about a 1/0.5 ratio of audio duration to transcription time …
A 1/0.5 ratio should be achievable on a GeForce GTX 1070 or above for clips a few seconds long.
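To compare the figures mentioned in this thread on one scale, it can help to express each as a real-time factor, i.e. inference time divided by audio duration (below 1.0 means faster than real time). This is only a minimal sketch: the function name `real_time_factor` is hypothetical and not part of DeepSpeech, and the 15 s / 30 s pair is just an illustrative instance of the 1/0.5 ratio, not a measurement.

```python
def real_time_factor(inference_seconds: float, audio_seconds: float) -> float:
    """Return inference time divided by audio duration.

    Hypothetical helper for illustration; not a DeepSpeech API.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return inference_seconds / audio_seconds


# Figure reported in this thread: 3.607 s of inference for 1.393 s of audio.
reported = real_time_factor(3.607, 1.393)

# Example seen online: 30 s needed for 28 s of audio.
online_example = real_time_factor(30.0, 28.0)

# Illustrative instance of the 1/0.5 ratio (assumed values): 15 s for 30 s of audio.
half_duration = real_time_factor(15.0, 30.0)

print(f"reported:       {reported:.2f}")        # about 2.59
print(f"online example: {online_example:.2f}")  # about 1.07
print(f"1/0.5 ratio:    {half_duration:.2f}")   # 0.50
```

By this measure the reported run is roughly 2.6 times slower than real time, whereas the 1/0.5 ratio corresponds to a factor of 0.5.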
Great, thanks for the info, kdavis.
How can I use the pre-trained model?
Hi! I got the same error. Did you fix it?