How to build the GPU version from source

I compiled from source with TensorFlow's configure script, with all the NVIDIA CUDA options turned on. How can I then get DeepSpeech to use the CUDA version?

There’s nothing specific to do, as long as you follow the TensorFlow CUDA steps and thus build with --config=cuda.
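A minimal sketch of those steps, assuming the layout described in the native_client README (a TensorFlow checkout with the DeepSpeech native_client linked in; exact paths and Bazel targets may differ between versions):

```shell
# Assumption: CUDA and cuDNN are already installed, and the DeepSpeech
# native_client directory is symlinked into the TensorFlow checkout.
cd tensorflow

# Answer "y" when configure asks about CUDA support, and point it at
# your CUDA/cuDNN install paths.
./configure

# The key part: --config=cuda goes on the bazel build command line.
bazel build --config=opt --config=cuda //native_client:libdeepspeech.so
```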

Where do I add --config=cuda? On the Bazel command line? That wasn't specified anywhere. Going to try that now; it takes an awfully long time to build, though.

That doesn't seem to make any difference either. It built a lot of things for GPU, yet when I run it, it doesn't use the GPU. Is there an option at runtime that makes native_client/deepspeech use the GPU?

No, just --config=cuda. But you need to have set up everything for using CUDA.

We refer to TensorFlow's docs for the specifics of each platform: https://github.com/mozilla/DeepSpeech/blob/master/native_client/README.md#building. CUDA setup is documented by TensorFlow.

Is there any reason you would need to rebuild from source? Prebuilt binaries not working? https://tools.taskcluster.net/index/project.deepspeech.deepspeech.native_client.master/gpu

My CPU doesn't have AVX extensions :frowning: I'm going to try with CUDA 8.0 and cuDNN 6, since 9.1 and 7 aren't working. Going to use the TF 1.5 branch instead of master; is that the best one to use?

Ok, no AVX :(. Best one for what? CUDA 8.0? I'm not so sure. But there is really no magic: bazel build --config=cuda [...] //native_client:libdeepspeech.so and you have it built with CUDA.

Does it fall back to CPU if the GPU isn't found? Is there a way I can check that isn't happening?

Building TensorFlow with CUDA links it against CUDA:

  • the runtime stdout/stderr will show it's using the GPU
  • if you are missing the CUDA libs, it will not even fall back to CPU, since the linker will complain.
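You can make that linker check explicit by inspecting the binary's dynamic dependencies. A hypothetical helper (the name check_cuda_link is illustrative; the libcudart pattern is the key part):

```shell
# Sketch: a CUDA-enabled libdeepspeech.so lists libcudart among its dynamic
# dependencies; a CPU-only build does not. Feed it the output of ldd.
check_cuda_link() {
  if grep -q 'libcudart\.so'; then
    echo "linked against CUDA"
  else
    echo "CPU-only build"
  fi
}

# Usage:
#   ldd libdeepspeech.so | check_cuda_link
```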

@Ironvil FTR, here is a sample output of running the build from the link above:

$ time ./deepspeech ../models/output_graph.pb ../models/alphabet.txt ../audio/ -t
TensorFlow: v1.6.0-9-g236f83e
DeepSpeech: v0.1.1-44-gd68fde8
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2018-03-17 23:58:25.631405: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-03-17 23:58:25.839521: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-03-17 23:58:25.839879: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.797
pciBusID: 0000:41:00.0
totalMemory: 7.92GiB freeMemory: 7.47GiB
2018-03-17 23:58:25.839894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-17 23:58:25.965245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7230 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:41:00.0, compute capability: 6.1)
Running on directory ../audio/
> ../audio//2830-3980-0043.wav
experience proves tis
cpu_time_overall=2.29062 cpu_time_mfcc=0.00429 cpu_time_infer=2.28632
> ../audio//4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=0.57249 cpu_time_mfcc=0.00561 cpu_time_infer=0.56689
> ../audio//8455-210777-0068.wav
your powr is sufficient i said
cpu_time_overall=0.52592 cpu_time_mfcc=0.00427 cpu_time_infer=0.52165

real	0m4,223s
user	0m2,881s
sys	0m1,199s

@Ironvil And the ldd output:

$ ldd deepspeech libdeepspeech.so 
deepspeech:
	linux-vdso.so.1 (0x00007ffe8d8b3000)
	libcudart.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcudart.so.9.0 (0x00007f54c70e1000)
	libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f54c6541000)
	libdeepspeech.so => /home/alexandre/tmp/deepspeech/gpu/./libdeepspeech.so (0x00007f54afd1b000)
	libdeepspeech_utils.so => /home/alexandre/tmp/deepspeech/gpu/./libdeepspeech_utils.so (0x00007f54afb16000)
	libsox.so.2 => /usr/lib/x86_64-linux-gnu/libsox.so.2 (0x00007f54af881000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f54af4fc000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f54af169000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f54aef51000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f54aeb97000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f54ae993000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f54ae775000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f54ae56d000)
	libnvidia-fatbinaryloader.so.390.42 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.390.42 (0x00007f54ae321000)
	libcusolver.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcusolver.so.9.0 (0x00007f54a9726000)
	libcublas.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcublas.so.9.0 (0x00007f54a62f0000)
	libcudnn.so.7 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcudnn.so.7 (0x00007f5494e59000)
	libcufft.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcufft.so.9.0 (0x00007f548cdb8000)
	libcurand.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcurand.so.9.0 (0x00007f5488e54000)
	libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f5488c25000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f54c734e000)
	libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007f5488a1b000)
	libpng16.so.16 => /usr/lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f54887e8000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f54885ce000)
	libmagic.so.1 => /usr/lib/x86_64-linux-gnu/libmagic.so.1 (0x00007f54883ac000)
	libgsm.so.1 => /usr/lib/x86_64-linux-gnu/libgsm.so.1 (0x00007f548819f000)
libdeepspeech.so:
	linux-vdso.so.1 (0x00007ffc1a97b000)
	libcusolver.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcusolver.so.9.0 (0x00007ffa86564000)
	libcublas.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcublas.so.9.0 (0x00007ffa8312e000)
	libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007ffa8258e000)
	libcudnn.so.7 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcudnn.so.7 (0x00007ffa710f7000)
	libcufft.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcufft.so.9.0 (0x00007ffa69056000)
	libcurand.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcurand.so.9.0 (0x00007ffa650f2000)
	libcudart.so.9.0 => /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/CUDA/lib64/libcudart.so.9.0 (0x00007ffa64e85000)
	libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007ffa64c56000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007ffa64a52000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ffa646bf000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ffa644a1000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007ffa6411c000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ffa63f04000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffa63b4a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ffaa1985000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007ffa63942000)
	libnvidia-fatbinaryloader.so.390.42 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.390.42 (0x00007ffa636f6000)

HP-Z600-Workstation:~/development/gitrepos/gpuDeeps/DeepSpeech/native_client$ ldd deepspeech libdeepspeech.so
deepspeech:
linux-vdso.so.1 => (0x00007ffeff1be000)
libdeepspeech.so => /usr/local/lib/libdeepspeech.so (0x00007f1751639000)
libdeepspeech_utils.so => /usr/local/lib/libdeepspeech_utils.so (0x00007f175393a000)
libsox.so.3 => /usr/local/lib/libsox.so.3 (0x00007f17513af000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f175102d000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1750d24000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1750b0e000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1750744000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1750540000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1750323000)
/lib64/ld-linux-x86-64.so.2 (0x00007f175373e000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f1750101000)
libdeepspeech.so:
ldd: ./libdeepspeech.so: No such file or directory

HP-Z600-Workstation:~/development/gitrepos/gpuDeeps/tensorflow/native_client$ time ./deepspeech ~/deepspeech/models/output_graph.pb ~/deepspeech/models/alphabet.txt ~/deepspeech/audio/ -t
TensorFlow: v1.6.0-rc1-1443-g8cbf4dd
DeepSpeech: v0.1.1-44-gd68fde8
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
Running on directory /home/jacobmh/deepspeech/audio/

/home/jacobmh/deepspeech/audio//2830-3980-0043.wav
experience proves tis
cpu_time_overall=17.84354 cpu_time_mfcc=0.00434 cpu_time_infer=17.83920
/home/jacobmh/deepspeech/audio//8455-210777-0068.wav
your powr is sufficient i said
cpu_time_overall=9.12456 cpu_time_mfcc=0.00542 cpu_time_infer=9.11913
/home/jacobmh/deepspeech/audio//4507-16021-0012.wav
why should one halt on the way
cpu_time_overall=10.29592 cpu_time_mfcc=0.00568 cpu_time_infer=10.29024

real 0m30.819s
user 0m29.590s
sys 0m9.068s

These are the results; it seems like it's trying to link against the wrong libraries.

HP-Z600-Workstation:~/development/gitrepos/gpuDeeps/tensorflow/native_client$ make deepspeech
c++ -o deepspeech `pkg-config --cflags sox` client.cc -Wl,--no-as-needed -Wl,-rpath,$ORIGIN -L/home/jacobmh/development/gitrepos/gpuDeeps/tensorflow/bazel-bin/native_client -ldeepspeech -ldeepspeech_utils `pkg-config --libs sox`

Why are they there? They should be in your Bazel output dir. Did you copy them by hand? Please run ldd /usr/local/lib/libdeepspeech.so, but given the output, I'd bet it's not linked against CUDA. This is wrong; you need to verify your configure and build steps again, something is not right.
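One way to make sure the installed copy in /usr/local/lib matches the CUDA build, sketched under the assumption that an earlier CPU-only build was installed there (paths are illustrative; adjust to your checkout):

```shell
# Rebuild with CUDA, then reinstall over the stale copy in /usr/local/lib.
bazel build --config=opt --config=cuda //native_client:libdeepspeech.so
cd native_client
make deepspeech
PREFIX=/usr/local sudo make install
sudo ldconfig                                          # refresh the linker cache
ldd /usr/local/lib/libdeepspeech.so | grep libcudart   # should now list CUDA libs
```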

Right, the GPU version works now, after using PREFIX=/usr/local sudo make install

Just need to get multithreading working now. Still seems pretty quick on this GPU, though.

Makes sense. If you did install, then obviously you need to ensure you install the new build, because the installed copy takes precedence :).