GPU requirements: half (FP16), single (FP32) or double (FP64) precision floating point for calculations?

Hi,

Does training DeepSpeech require half (FP16), single (FP32), or double (FP64) precision floating point for its calculations?

I’m looking at price/performance for GPU cards. Any specific recommendations?

Tan

As for the exact use of floating-point precision during training, it’s not something we have looked at in detail, and it may depend on TensorFlow and/or CUDA behavior, so we don’t have any reliable information on that.

For sure, the faster, the better: we train on some TITAN X GPUs. I guess you want to max out performance for your budget, but it would still be useful to know what kind of workload you expect, and to get a rough idea of your budget.

I guess there should be some way to get more instrumentation from TensorFlow. Maybe @reuben or @kdavis has some insight?
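
For instance, a minimal sketch (assuming the TF 1.x graph API, not anything specific in our training code) of dumping the dtypes that the ops in a built graph actually produce, to check whether anything runs below FP32:

```python
import tensorflow as tf

# Sketch only: after the training graph has been built, walk its operations
# and print the floating-point dtype of each output tensor.
graph = tf.get_default_graph()
for op in graph.get_operations():
    for t in op.outputs:
        if t.dtype in (tf.float16, tf.float32, tf.float64):
            print(op.name, t.dtype.name)
```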

We train at FP32 and, as far as I know, have never tried FP16 or FP64 training.

As to GPU recommendations, it depends. I’d suggest taking a look at Tim Dettmers’ Which GPU(s) to Get for Deep Learning as a start.

Yes, that’s what I was going to comment on: the few places where we explicitly set a floating-point precision are around the RNN implementation, and it’s FP32. Still, I’m not sure precisely what happens below that, in TensorFlow and even in CUDA: maybe the computation is done, at some level, at a different precision? See the sketch below for what I mean by "explicitly set".
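
Here is a minimal, hypothetical illustration (not our actual code) of the kind of place where a precision gets pinned, using the TF 1.x RNN API with an explicit FP32 dtype:

```python
import tensorflow as tf

# Sketch only: a recurrent layer built with an explicit FP32 dtype.
# Shapes and sizes are made up for illustration.
inputs = tf.placeholder(tf.float32, shape=[None, None, 26])  # [batch, time, features]
cell = tf.nn.rnn_cell.LSTMCell(num_units=2048)
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
```

That only covers what gets declared at the graph level; whatever cuDNN/cuBLAS do internally is exactly the part that is hard to see from the Python side.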

No, not unless you explicitly opt in. (It often requires additional changes to the training code.)
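
For reference, a hedged sketch of what explicitly opting in can look like with TF 1.14+ automatic mixed precision (an illustration of the mechanism, not something our training code does): the optimizer is wrapped so that eligible ops are rewritten to FP16 and loss scaling is added automatically.

```python
import tensorflow as tf

# Sketch only: opt into automatic mixed precision by wrapping the optimizer.
# The learning rate here is a placeholder for illustration.
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
optimizer = tf.train.experimental.enable_mixed_precision_graph_rewrite(optimizer)
# The rest of the training loop then calls optimizer.minimize(loss) as usual.
```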