deepspeech-gpu dumping core when running the pretrained model

I have been following the instructions on this page to set up DeepSpeech on my GPU: https://github.com/mozilla/DeepSpeech/blob/master/doc/index.rst

Details on my CUDA-enabled GPUs:
01:00.0 VGA compatible controller: NVIDIA Corporation GP104M [GeForce GTX 1070 Mobile] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
02:00.0 3D controller: NVIDIA Corporation GP104M [GeForce GTX 1070 Mobile] (rev a1)
02:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)

Linux Version: Ubuntu 16.04.6 LTS
GCC Version: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
Kernel headers: 4.15.0-96-generic
CUDA Version: 10.2.89
CuDNN Version: 7.6.5
Python: 3.5.2
Tensorflow-gpu: 1.15.2

nvidia-smi
Wed Apr 29 10:55:07 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
| N/A   52C    P8    10W /  N/A |   1663MiB /  8085MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1070    Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   42C    P8     6W /  N/A |      2MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1031      G   /usr/lib/xorg/Xorg                           866MiB |
|    0      1984      G   compiz                                       398MiB |
|    0      2481      G   ...AAAAAAAAAAAACAAAAAAAAAA= --shared-files   394MiB |
+-----------------------------------------------------------------------------+

I am able to run the non-GPU version of DeepSpeech without any issue, but the GPU version fails when running the inference engine:

deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav 
2020-04-29 10:34:17.009050: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Loading model from file deepspeech-0.7.0-models.pbmm
TensorFlow: v1.15.0-24-gceb46aa
DeepSpeech: v0.7.0-0-g3fbbca2
2020-04-29 10:34:17.124255: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-29 10:34:17.125135: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-04-29 10:34:17.258251: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.258554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
2020-04-29 10:34:17.258597: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.258874: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 1 with properties: 
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:02:00.0
2020-04-29 10:34:17.258884: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-04-29 10:34:17.259737: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-04-29 10:34:17.260611: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-04-29 10:34:17.260776: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-04-29 10:34:17.261633: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-04-29 10:34:17.262057: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-04-29 10:34:17.263867: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-29 10:34:17.263934: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.264248: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.264558: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.264848: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.265136: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1
2020-04-29 10:34:17.265783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-29 10:34:17.265792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 1 
2020-04-29 10:34:17.265810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N Y 
2020-04-29 10:34:17.265815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 1:   Y N 
2020-04-29 10:34:17.265870: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.266177: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.266482: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.266773: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.267055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6343 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-04-29 10:34:17.267255: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.267551: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-29 10:34:17.267874: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 7619 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1070, pci bus id: 0000:02:00.0, compute capability: 6.1)
Loaded model in 0.221s.
Loading scorer from files deepspeech-0.7.0-models.scorer
Loaded scorer in 0.00351s.
Running inference.
2020-04-29 10:34:17.343444: F tensorflow/stream_executor/cuda/cuda_driver.cc:175] Check failed: err == cudaSuccess || err == cudaErrorInvalidValue Unexpected CUDA error: invalid argument
Aborted (core dumped)

Can you please help me identify what I am missing here?

Thanks

Did you do as advised here? And please post the command lines you used.

https://deepspeech.readthedocs.io/en/v0.7.0/TRAINING.html#recommendations

Thanks for the quick response. I am not yet training the model (I want to do that as the next step). I am following the steps in https://deepspeech.readthedocs.io/en/v0.7.0/USING.html (the "Install Python bindings" section) and just running this command:
deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio my_audio_file.wav
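
For reference, this is roughly the setup I followed from that page (a sketch from memory; the virtualenv name is my own choice and the release file names are the ones I downloaded):

# create an isolated environment and install the GPU wheel
python3 -m venv deepspeech-gpu-venv
source deepspeech-gpu-venv/bin/activate
pip3 install deepspeech-gpu==0.7.0
# fetch the 0.7.0 acoustic model and scorer from the GitHub release
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer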

Check the dependencies here: you have CUDA 10.2, but the package is built for CUDA 10.0. Search the forum, but you might need to install the lower CUDA version to get GPU support.

https://deepspeech.readthedocs.io/en/v0.7.0/USING.html#cuda-dependency
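
If you want to double-check which CUDA libraries the GPU wheel can actually find at runtime, something like this will show it (just a diagnostic sketch, adjust to your paths):

# libraries the dynamic linker can resolve; deepspeech-gpu 0.7.0 expects the 10.0 runtime
ldconfig -p | grep -E 'libcudart|libcudnn'
# toolkit versions installed side by side, if any
ls -d /usr/local/cuda* 2>/dev/null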

Alternatively, just install deepspeech instead of deepspeech-gpu if you only want to test it a bit. If you are running a couple of files, there is not much of a difference :slight_smile:

Plain DeepSpeech (the non-GPU version) works. Since I am interested in transcribing audio in bulk, I would like to utilize my GPU.
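
For the bulk runs I have in mind, the plan is simply to loop the same CLI over a directory of WAV files, roughly like this (just a sketch; the audio/ directory and the output naming are placeholders):

# transcribe every WAV in ./audio and write a .txt next to it
for f in audio/*.wav; do
  deepspeech --model deepspeech-0.7.0-models.pbmm \
             --scorer deepspeech-0.7.0-models.scorer \
             --audio "$f" > "${f%.wav}.txt"
done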

Yes, I have CUDA 10.2.89 with TensorFlow 1.15.2, and as per the TF docs:
* NVIDIA® GPU drivers —CUDA 10.1 requires 418.x or higher.
* CUDA® Toolkit —TensorFlow supports CUDA 10.1 (TensorFlow >= 2.1.0)

Should the CUDA version be 10.0, as per the DeepSpeech cuda-dependency link, and not 10.2 (which TF 1.15.2 supports)?
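
For reference, this is how I checked the versions on my machine (a quick sketch; /usr/local/cuda is the default toolkit symlink and may differ):

nvcc --version                                                 # release 10.2, V10.2.89
cat /usr/local/cuda/version.txt                                # same info from the symlinked toolkit
python3 -c "import tensorflow as tf; print(tf.__version__)"   # 1.15.2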

Thanks

At least with the latest branch of DeepSpeech, CUDA 10.0 was required; I got the same errors with CUDA 10.2 and 10.1. This guide worked for me: https://dmitry.ai/t/topic/33

TensorFlow 1.15 does not support CUDA 10.1/10.2; the 1.15 docs were simply removed from the TensorFlow website at some point in the 2.x cycle.

The 1.15 release in general had lots of documentation problems. They tried to go to a single package (instead of separate CPU and GPU packages), but then went back on it. It’s a bit of a mess.

These are the actual 1.15 GPU dependencies: https://web.archive.org/web/20191024091316/https://www.tensorflow.org/install/gpu
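
In practice that means CUDA 10.0 plus cuDNN 7.x. If you keep the 10.0 toolkit installed alongside a newer one, pointing the loader at the 10.0 libraries before running is usually enough (a sketch; /usr/local/cuda-10.0 is the default install path and may differ on your system):

# make the CUDA 10.0 libraries visible to deepspeech-gpu
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
# quick check that the 10.0 runtime the wheel links against is actually there
ls /usr/local/cuda-10.0/lib64/libcudart.so.10.0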

Thank you very much. Uninstalling CUDA 10.2 and installing CUDA 10.0 resolved this, and I am now able to run the pretrained model.
