Inference time run speeds

Hi, I’ve seen some talk about how fast DeepSpeech will run on various platforms at inference time, and I just wanted to verify my understanding.

The GitHub README says that “The realtime factor on a GeForce GTX 1070 is about 0.44”. I just want to confirm that this means that for every 1 second of real speech, it takes about 0.44 seconds of computing time to process it (and not the converse, i.e. 1 second of processing handles 0.44 seconds of real speech)?
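In other words, I’m reading the real-time factor as processing time divided by audio duration, something like this rough Python sketch (`stt_fn` here is just a placeholder for whatever transcription call is actually used):

```python
import time
import wave

def real_time_factor(stt_fn, wav_path):
    """Measure RTF = processing_time / audio_duration.

    RTF < 1 means inference runs faster than real time.
    `stt_fn` is a placeholder callable that transcribes raw audio bytes.
    """
    with wave.open(wav_path) as w:
        audio_duration = w.getnframes() / w.getframerate()
        audio = w.readframes(w.getnframes())
    start = time.perf_counter()
    stt_fn(audio)
    processing_time = time.perf_counter() - start
    return processing_time / audio_duration
```

Under that reading, 0.44 would mean about 0.44 seconds of compute per second of speech.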

Another question: this blog post mentions, “On a MacBook Pro, using the GPU, the model can do inference at a real-time factor of around 0.3x, and around 1.4x on the CPU alone.” Was this MacBook Pro using a GPU that’s more powerful than a GTX 1070? It appears to be faster than the figure reported for the GTX 1070. Also, why is the GPU’s speed-up over the CPU so modest? In my experience, my 1080 Ti gives me a 50-100x speed-up over my CPU in my deep neural networks.
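Just to make the comparison explicit, this is the arithmetic I’m doing on the figures quoted above:

```python
# Real-time factors quoted above (seconds of compute per second of audio;
# lower is faster)
rtf_gtx_1070 = 0.44
rtf_macbook_gpu = 0.3
rtf_macbook_cpu = 1.4

# The MacBook GPU figure looks faster than the GTX 1070 figure
print(rtf_macbook_gpu < rtf_gtx_1070)     # True

# Implied GPU-over-CPU speed-up on the MacBook
print(rtf_macbook_cpu / rtf_macbook_gpu)  # ~4.67x, far below the 50-100x I see
```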

Lastly, what’s the real-time factor for speech inference on a Titan X?