Set memory limit for deepspeech-gpu during prediction

I run predictions through a Python Flask API using deepspeech-gpu, and it is throwing an out-of-memory error.

How can I set a memory limit inside the deepspeech-gpu code?

There is no simple switch for that. As far as I remember, you will have to implement some sort of scheduling. What do you do to trigger the error? What is your typical workload?
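A minimal sketch of what such scheduling could look like, assuming a single shared model instance guarded by a lock so that only one request runs inference at a time (the model path and function name here are illustrative, not from the thread):

```python
import threading

import numpy as np
from deepspeech import Model

# Load the model once at startup; the file name is just an example.
model = Model("deepspeech-0.9.3-models.pbmm")

# Serialize inference so concurrent requests don't compete for GPU memory.
gpu_lock = threading.Lock()

def transcribe(audio: np.ndarray) -> str:
    """Run STT on 16 kHz, 16-bit mono PCM samples, one caller at a time."""
    with gpu_lock:
        return model.stt(audio)
```

A single lock is the simplest form of scheduling; a worker queue would give the same effect with better back-pressure under load.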

I am using an NVIDIA GTX 1080 with 8 GB of memory for both prediction and training.

I meant: how are you using DeepSpeech, with how much material, and over what time span, when you run into memory problems?

I am using deepspeech 0.9.2 for prediction. I wrote an API with Flask, a Python web framework, that uploads an audio file and then runs prediction with DeepSpeech.

When I call the API, it predicts and returns the output as usual. But when I install the “deepspeech-gpu” library and predict, it returns a “memory insufficient” error. I have already installed the prerequisites given in the DeepSpeech docs.
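For illustration, a minimal version of such an endpoint might look like this (the route, upload field name, and model path are assumptions, not the original code):

```python
import wave

import numpy as np
from deepspeech import Model
from flask import Flask, jsonify, request

app = Flask(__name__)

# Model path is illustrative for a 0.9.x release model.
model = Model("deepspeech-0.9.3-models.pbmm")

@app.route("/transcribe", methods=["POST"])
def transcribe():
    # Expects a 16 kHz, 16-bit mono WAV upload in the "audio" field.
    with wave.open(request.files["audio"], "rb") as wav:
        audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)
    return jsonify({"text": model.stt(audio)})
```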

Do you have deepspeech and deepspeech-gpu installed at the same time? Try running them in separate virtual environments. Both work fine for me with Flask.

Yeah, I installed deepspeech and deepspeech-gpu in separate environments and ran each as an API. The deepspeech package works fine, but deepspeech-gpu gives the memory insufficient error.

If it works fine for you, could you please share your code for Flask with deepspeech-gpu prediction? And how much memory does your GPU have?

Sorry, that code belongs to a client. But search the forum a bit for CUDA; there is probably some problem with your environment or setup.

Does deepspeech-gpu run on the CLI with the same audio?

deepspeech-gpu works perfectly on the CLI with the same audio.

The problem only occurs when it is used as an API. How much GPU memory do you have when running Flask?

Ah, you could have mentioned that earlier. Then this is not a DeepSpeech problem.

Look at some of the other DeepSpeech servers on GitHub and check how you are loading and running the model.
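One common pitfall to check for, sketched below (hypothetical code, not the poster's): constructing a new Model inside the request handler, so every request allocates GPU memory again.

```python
from deepspeech import Model

MODEL_PATH = "deepspeech-0.9.3-models.pbmm"  # illustrative path

# Anti-pattern: building the model inside the Flask handler allocates
# GPU memory on every request and can eventually exhaust it:
#
#     @app.route("/transcribe", methods=["POST"])
#     def transcribe():
#         model = Model(MODEL_PATH)  # new GPU allocation per request
#         ...
#
# Better: create the model once at import time and reuse it everywhere.
model = Model(MODEL_PATH)
```

Also note that a multi-process server (e.g. several Gunicorn workers in front of Flask) loads one copy of the model per worker, which multiplies GPU memory usage accordingly.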

Depending on the model you use, 8 GB of GPU RAM can be insufficient, especially when you are not using a memory-mapped model (.pbmm file).

Ah, thank you so much @dkreutz @othiele. I will try some of the other servers.