GPU support

Hello, I’m having trouble using the GPU.
I installed TensorFlow with GPU support using conda, and deepspeech-gpu using pip, in the same conda environment.
TensorFlow’s GPU test passes:

In [2]: from tensorflow import test                                                                                                                                                                          

In [3]: test.is_gpu_available()                                                                                                                                                                              
2020-06-11 16:55:26.062288: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-06-11 16:55:26.122973: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2099935000 Hz
2020-06-11 16:55:26.126706: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5619cc00c9c0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-11 16:55:26.126750: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-06-11 16:55:26.130136: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-06-11 16:55:27.403423: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:03:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7335GHz coreCount: 20 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 298.32GiB/s
2020-06-11 16:55:27.407490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 1 with properties: 
pciBusID: 0000:82:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7335GHz coreCount: 20 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 298.32GiB/s
2020-06-11 16:55:27.448564: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-06-11 16:55:27.790562: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-06-11 16:55:27.967516: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-11 16:55:28.085097: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-06-11 16:55:28.460026: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-06-11 16:55:28.574367: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-06-11 16:55:29.177483: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-06-11 16:55:29.184343: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0, 1
2020-06-11 16:55:29.185599: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-06-11 16:55:29.189935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-11 16:55:29.189963: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 1 
2020-06-11 16:55:29.189974: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N N 
2020-06-11 16:55:29.189983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 1:   N N 
2020-06-11 16:55:29.196639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 7526 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:03:00.0, compute capability: 6.1)
2020-06-11 16:55:29.278384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:1 with 7526 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080, pci bus id: 0000:82:00.0, compute capability: 6.1)
2020-06-11 16:55:29.284814: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5619cd351070 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-06-11 16:55:29.284847: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1080, Compute Capability 6.1
2020-06-11 16:55:29.284861: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (1): GeForce GTX 1080, Compute Capability 6.1
Out[3]: True

However, when I import deepspeech (or use the CLI), I get the following warning, and deepspeech doesn’t seem to use the GPU when running inference.

In [1]: import deepspeech                                                                                                                                                                                    
2020-06-11 16:54:17.050751: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64
2020-06-11 16:54:17.060613: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

Note that CUDA 10.1 is installed on my machine (you can also see it in the TF test log above), whereas the warning refers to ‘libcudart.so.10.0’.
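For anyone hitting the same error, a quick way to confirm which cudart versions the dynamic loader can actually find is to try loading them with ctypes (a minimal sketch; the version suffixes are the ones from the logs above):

import ctypes

# Try the exact soname each binary expects; per the warning above,
# deepspeech-gpu wants the 10.0 runtime, while this machine has 10.1.
for lib in ("libcudart.so.10.0", "libcudart.so.10.1"):
    try:
        ctypes.CDLL(lib)
        print(lib, "-> found")
    except OSError as err:
        print(lib, "->", err)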

Here’s the output of nvidia-smi, in case that helps:

(deepspeech) lerner@m150:/vol/work/lerner/pyannote-db-plumcot$ nvidia-smi 
Thu Jun 11 16:58:17 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:03:00.0 Off |                  N/A |
| 23%   33C    P0    38W / 180W |      0MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 00000000:82:00.0 Off |                  N/A |
| 23%   30C    P0    37W / 180W |      0MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Thank you for your help.

Best regards,

What version of TensorFlow are you using?

It’s loading CUDA 10.1, which, as documented, is not compatible with TensorFlow r1.15.

So now you know why: you have a CUDA version mismatch with the requirements. Just install CUDA 10.0 side-by-side and set LD_LIBRARY_PATH accordingly so that the inference code can find it.
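Since LD_LIBRARY_PATH is read by the dynamic loader at process startup, the simplest fix is to export it in your shell before launching Python. If you’d rather do it from a script, you can launch the CLI in a child process with a modified environment (a sketch; the CUDA install path and the model/audio file names are placeholders for your own):

import os
import subprocess

env = dict(os.environ)
# Prepend the CUDA 10.0 runtime libraries (adjust the path to your install).
env["LD_LIBRARY_PATH"] = "/usr/local/cuda-10.0/lib64:" + env.get("LD_LIBRARY_PATH", "")

# Run inference in a child process that can see the 10.0 libraries.
# "models.pbmm" and "audio.wav" are placeholder file names.
subprocess.run(
    ["deepspeech", "--model", "models.pbmm", "--audio", "audio.wav"],
    env=env,
    check=True,
)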

OK, thank you. I guess I overlooked the TF docs about CUDA versions; it looks like it’s working with TF 2.2.0.

Yes, but we don’t have training support for r2.2, so you should adjust and use the TF r1.15 requirements.