Right CUDA version for using deepspeech-gpu

caucheteux · June 27, 2019, 9:41am

Hello Everyone,

Maybe it’s a dumb question, but I can’t find any doc or answer that settles which version of CUDA is the right one.
The documentation says you need CUDA 10.0 and CuDNN 7.5 to make deepspeech-gpu work: “Currently with TensorFlow 1.13 it depends on CUDA 10.0 and CuDNN v7.5.”
But in this topic they used CUDA 10.2, just like me.
Here is the output of the nvidia-smi command:

I installed deepspeech-gpu and tensorflow-gpu with pip3 in a virtualenv, just as the doc says, and when I ran deepspeech with the pre-trained model I got the following error:
deepspeech --model models/output_graph.pbmm --alphabet models/alphabet.txt --lm models/lm.binary --trie models/trie --audio agni1.wav
Traceback (most recent call last):
  File "/home/leopaul/tmp/deepspeech-venv/lib/python3.7/site-packages/deepspeech/impl.py", line 14, in swig_import_helper
    return importlib.import_module(mname)
  File "/home/leopaul/tmp/deepspeech-venv/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 670, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 583, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1043, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/leopaul/tmp/deepspeech-venv/bin/deepspeech", line 6, in <module>
    from deepspeech.client import main
  File "/home/leopaul/tmp/deepspeech-venv/lib/python3.7/site-packages/deepspeech/__init__.py", line 14, in <module>
    from deepspeech.impl import PrintVersions as printVersions
  File "/home/leopaul/tmp/deepspeech-venv/lib/python3.7/site-packages/deepspeech/impl.py", line 17, in <module>
    _impl = swig_import_helper()
  File "/home/leopaul/tmp/deepspeech-venv/lib/python3.7/site-packages/deepspeech/impl.py", line 16, in swig_import_helper
    return importlib.import_module('_impl')
  File "/home/leopaul/tmp/deepspeech-venv/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_impl'

Did I get something wrong? I’m relatively new to Ubuntu…
Does it have something to do with the fact that my Nvidia card may not be powerful enough?

Thank you !

caucheteux · June 27, 2019, 9:51am

Note that the final goal is to train my own model using CommonVoice data and DeepSpeech.

Training takes too long on CPU, so I want to use the GPU instead.

It works well with the pre-trained model on CPU, but training is way too expensive in terms of RAM without a GPU…

alchemi5t · June 27, 2019, 10:06am

This problem has nothing to do with how powerful your card is; it is more likely an issue with your installation of the CUDA toolkit and cuDNN.

Check this out.

Also, your program is looking for 10.0 while you have 10.2. Either configure it and build it again against 10.2, or pick the easier option, which is to get rid of the 10.2 toolkit and install 10.0.
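To see which CUDA libraries the dynamic loader can actually find, a small check like the following may help. It is only a sketch: `libcublas.so.10.0` comes from the traceback above, while `libcudart.so.10.0` and `libcudnn.so.7` are assumed from the usual CUDA 10.0 / cuDNN 7 naming, and `check_libs` is a hypothetical helper, not part of DeepSpeech.

```python
import ctypes

# libcublas.so.10.0 is the name from the ImportError above; the other two
# sonames are assumptions based on the usual CUDA 10.0 / cuDNN 7 naming.
REQUIRED_LIBS = ["libcudart.so.10.0", "libcublas.so.10.0", "libcudnn.so.7"]


def check_libs(names):
    """Try to load each shared library; map its name to None or an error string."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            # Raised when the loader cannot find or open the library.
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, error in check_libs(REQUIRED_LIBS).items():
        status = "OK" if error is None else "MISSING ({})".format(error)
        print("{}: {}".format(name, status))
```

Run it inside the same virtualenv and shell as deepspeech, so that it sees the same `LD_LIBRARY_PATH`.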

lissyx · June 27, 2019, 11:01am

So we document CUDA 10.0, you use 10.2, it does not work, and you are surprised?

So you should not care about deepspeech-gpu but tensorflow-gpu

caucheteux · June 27, 2019, 11:10am

Not surprised. As I said, I saw two contradictory pieces of information about the CUDA version (10.0 in the doc and 10.2 in a topic); that’s why I asked how this is possible.

Now, it seems easier to downgrade CUDA to 10.0 than to adapt the code to 10.2. Is that the right move?

Regarding “So you should not care about deepspeech-gpu but tensorflow-gpu”: OK, thanks. Is it a problem if I have both deepspeech-gpu and tensorflow-gpu installed in my virtualenv?

Thank you

lissyx · June 27, 2019, 11:13am

no

I don’t see anything related to CUDA 10.2 in this topic

It’s not that it is easier, it’s that you don’t have a choice

No, I’m saying that you are mixing up the inference-only deepspeech-gpu and the training requirement tensorflow-gpu. In both cases, you depend on TensorFlow r1.13, which depends on CUDA 10.0, as we document.
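For the training side, one way to confirm that tensorflow-gpu in the virtualenv can reach the GPU is sketched below. `tensorflow_gpu_status` is a hypothetical helper; it relies on `tf.test.is_built_with_cuda()` and `tf.test.is_gpu_available()`, which exist in TensorFlow 1.x (the latter is deprecated in later releases), and it assumes a missing CUDA library surfaces as an `ImportError`.

```python
def tensorflow_gpu_status():
    """Report whether TensorFlow imports and whether it can use a GPU."""
    try:
        import tensorflow as tf
    except ImportError as exc:
        # Assumption: a missing CUDA library such as libcublas.so.10.0
        # makes the import itself fail with ImportError.
        return {"installed": False, "error": str(exc)}
    return {
        "installed": True,
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpu_available": tf.test.is_gpu_available(),
    }


if __name__ == "__main__":
    print(tensorflow_gpu_status())
```

If `installed` is false and the error mentions a `.so.10.0` file, the CUDA 10.0 runtime is most likely not on the library path.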

caucheteux · June 27, 2019, 11:16am

Alright, downgrade it is then.

I was confused by this topic, where it seems to me that he/she uses CUDA 10.2; that’s why I asked. But whatever.

I’ll keep you posted once it’s done
Thanks again

lissyx · June 27, 2019, 11:18am

Are you referring to the CUDA mention in nvidia-smi ? I don’t think it’s really reliable, my Debian system has 10.1 exposed like that and I still use a 10.0 user-level install of CUDA.

alchemi5t · June 27, 2019, 11:29am

@lissyx is right. I use 10.0. nvidia-smi is just garbage at showing the right version.
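For context, the "CUDA Version" in the nvidia-smi header is the highest CUDA version the installed driver supports, not the toolkit that is installed. To read the toolkit version, one option is to parse `nvcc --version`. This is a sketch: it assumes `nvcc` is on the `PATH`, which is not always true even when the runtime libraries are installed, and the helper names are hypothetical.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Extract 'X.Y' from a line like 'Cuda compilation tools, release 10.0, V10.0.130'."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def toolkit_version():
    """Return the toolkit release reported by nvcc, or None if nvcc cannot be run."""
    try:
        completed = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        # nvcc is not installed or not on PATH.
        return None
    return parse_nvcc_release(completed.stdout)


if __name__ == "__main__":
    print("CUDA toolkit release:", toolkit_version())
```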

caucheteux · June 27, 2019, 11:34am

Okay, well, that’s good to know. I checked and indeed the version exposed is not the version actually running. Thanks nvidia-smi…

I had 10.2 exposed and 10.1 running. I’ll go to 10.0 and hope it works.

Thanks a lot both of you !

caucheteux · June 27, 2019, 12:45pm

Last question, just to be sure.
What version of CuDNN do I have to use? The doc says 7.5, so does that mean I can use 7.5.0 or 7.5.1 with no impact?

Thanks

lissyx · June 27, 2019, 12:46pm

Yes, 7.5 targets anything like 7.5.0, 7.5.1.
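The "7.5 means any 7.5.x" rule can be checked against the installed library, as in this sketch. It assumes the soname `libcudnn.so.7` and the cuDNN 7 encoding of `cudnnGetVersion()` as major * 1000 + minor * 100 + patch; the helper names are hypothetical.

```python
import ctypes


def decode_cudnn_version(value):
    """Split an integer such as 7501 into (major, minor, patch)."""
    major, rest = divmod(value, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def satisfies(version, required):
    """True if version starts with the required prefix, e.g. (7, 5, 1) vs (7, 5)."""
    return tuple(version[:len(required)]) == tuple(required)


def installed_cudnn_version(soname="libcudnn.so.7"):
    """Return the installed cuDNN version tuple, or None if the library is missing."""
    try:
        lib = ctypes.CDLL(soname)
    except OSError:
        return None
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


if __name__ == "__main__":
    version = installed_cudnn_version()
    if version is None:
        print("cuDNN not found")
    else:
        print("cuDNN", version, "matches 7.5:", satisfies(version, (7, 5)))
```

With these helpers, `decode_cudnn_version(7501)` gives `(7, 5, 1)`, which satisfies `(7, 5)`, while a 7.6.x install would not.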

caucheteux · June 27, 2019, 1:37pm

All good, it works with CUDA 10.0 and CuDNN 7.5.1 !
I hope this topic will help a lot of people

We’ll see how the training will go now

Thanks again