You can either:
- Rebuild the Python client to use multiprocessing; see "Running multiple inferences in parallel on a GPU".
- Run DeepSpeech inference inside a Flask app. Flask will handle serving predictions from multiple threads. Be sure to run DeepSpeech 0.6.0, since 0.5.1 is not thread-safe.
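
For the threaded route, here is a minimal sketch of the idea without the Flask part: several worker threads call one shared prediction function. `transcribe_all` and its parameters are hypothetical names, not part of DeepSpeech or Flask, and `transcribe` stands for whatever callable runs inference on your side (for example the `stt` method of a loaded 0.6.0 model, assuming it is safe to share between threads as noted above).

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe_all(transcribe, clips, max_workers=4):
    """Call transcribe(clip) for every clip on a thread pool.

    `transcribe` is a placeholder for your inference callable; it is an
    assumption of this sketch, not a DeepSpeech API. Results come back in
    the same order as `clips`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even if threads finish out of order.
        return list(pool.map(transcribe, clips))
```

If the model turns out not to be thread-safe in your version, the same structure works with one model per worker process instead of shared threads, which is the first option.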