DeepSpeech inference with multiple GPUs

Hello,
I am currently trying to run inference on a large number of files using a trained model. During inference, the DeepSpeech Python client uses only a single GPU out of 3. How can I extend it to use all of them in parallel?

We don’t currently have support for batch inference in the library; your best option is evaluate.py or transcribe.py.

The inference process is bottlenecked by the decoder, which is CPU-only. Using all GPUs won’t gain you much performance, which is why evaluate.py and transcribe.py only use a single GPU.
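
If you still want to keep all three GPUs busy, a common workaround (not specific to DeepSpeech) is to shard the file list across one worker process per GPU, pinning each worker to its GPU with `CUDA_VISIBLE_DEVICES` before it loads the model. Below is a minimal sketch of that idea, assuming the `deepspeech` Python package (0.9.x `Model` API) and 16 kHz mono 16-bit WAV input; the model and scorer paths are placeholders:

```python
import multiprocessing as mp
import os
import sys
import wave

import numpy as np

MODEL_PATH = "deepspeech-0.9.3-models.pbmm"    # placeholder paths
SCORER_PATH = "deepspeech-0.9.3-models.scorer"
NUM_GPUS = 3

_model = None  # one Model instance per worker process


def init_worker(gpu_queue):
    """Pin this worker to one GPU, then load the model.

    CUDA_VISIBLE_DEVICES must be set before the CUDA context is
    created, so we import deepspeech and build the Model only after
    setting the environment variable.
    """
    global _model
    gpu_id = gpu_queue.get()  # each worker pops a unique GPU id
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)

    from deepspeech import Model  # deferred import, after env var is set
    _model = Model(MODEL_PATH)
    _model.enableExternalScorer(SCORER_PATH)


def transcribe(path):
    """Read a 16 kHz mono 16-bit WAV file and run STT on it."""
    with wave.open(path, "rb") as w:
        audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    return path, _model.stt(audio)


if __name__ == "__main__":
    # Hand out one GPU id per worker via a shared queue.
    gpu_queue = mp.Queue()
    for gpu_id in range(NUM_GPUS):
        gpu_queue.put(gpu_id)

    wav_files = sys.argv[1:]
    with mp.Pool(processes=NUM_GPUS,
                 initializer=init_worker,
                 initargs=(gpu_queue,)) as pool:
        for path, text in pool.imap_unordered(transcribe, wav_files):
            print(f"{path}: {text}")
```

A side benefit of this approach: each worker also gets its own CPU decoder instance, so the CPU-bound decoding step runs in parallel as well. How much the extra GPUs help beyond that will depend on your model and hardware.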