Inference time run speeds

Hi, I’ve seen some talk about how fast DeepSpeech will run on various platforms at inference time, and I just wanted to verify my understanding.

The GitHub README says that “The realtime factor on a GeForce GTX 1070 is about 0.44”. I just want to confirm that this means that for every 1 second of real speech, it takes about 0.44 seconds of computing time to process it (and not the converse, i.e. 1 second of processing handles 0.44 seconds of real speech)?
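In other words, I’m reading the real-time factor as processing time divided by audio duration, something like this rough Python sketch (`stt_fn` here is just a placeholder for whatever transcription call is actually used):

```python
import time
import wave

def real_time_factor(stt_fn, wav_path):
    """Measure RTF = processing_time / audio_duration.

    RTF < 1 means inference runs faster than real time.
    `stt_fn` is a placeholder callable that transcribes raw audio bytes.
    """
    with wave.open(wav_path) as w:
        audio_duration = w.getnframes() / w.getframerate()
        audio = w.readframes(w.getnframes())
    start = time.perf_counter()
    stt_fn(audio)
    processing_time = time.perf_counter() - start
    return processing_time / audio_duration
```

Under that reading, 0.44 would mean about 0.44 seconds of compute per second of speech.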

Another question: this blog post mentions, “On a MacBook Pro, using the GPU, the model can do inference at a real-time factor of around 0.3x, and around 1.4x on the CPU alone.” Was this MacBook Pro using a GPU that’s more powerful than a GTX 1070? It appears to be faster than the figure reported for the GTX 1070. Also, why is the GPU’s speed-up over the CPU so modest? In my experience, my 1080 Ti gives me a 50-100x speed-up over my CPU in my deep neural networks.
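Just to make the comparison explicit, this is the arithmetic I’m doing on the figures quoted above:

```python
# Real-time factors quoted above (seconds of compute per second of audio;
# lower is faster)
rtf_gtx_1070 = 0.44
rtf_macbook_gpu = 0.3
rtf_macbook_cpu = 1.4

# The MacBook GPU figure looks faster than the GTX 1070 figure
print(rtf_macbook_gpu < rtf_gtx_1070)     # True

# Implied GPU-over-CPU speed-up on the MacBook
print(rtf_macbook_cpu / rtf_macbook_gpu)  # ~4.67x, far below the 50-100x I see
```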

Lastly, what’s the real-time factor for speech inference on a Titan X?