opened 07:57AM - 23 May 19 UTC
closed 07:51AM - 24 May 19 UTC
invalid
For support and discussions, please use our [Discourse forums](https://discourse.mozilla.org/c/deep-speech).
If you've found a bug, or have a feature request, then please create an issue with the following information:
- **OS Platform and Distribution (e.g., Linux Ubuntu 16.04)**: Ubuntu 18.04.2
- **TensorFlow installed from (our builds, or upstream TensorFlow)**: tensorflow-rocm 1.13.3 from tensorflow-upstream
- **TensorFlow version (use command below)**: b'v1.13.1-691-gf092438' 1.13.1
- **Python version**: 3.6.7
- **Bazel version (if compiling from source)**: I don't know.
- **GCC/Compiler version (if compiling from source)**: I don't know.
- **GPU model and memory**: AMD WX 9100, 16 GB HBM2
I have installed tensorflow-rocm and got my GPU working successfully with TensorFlow and the ROCm drivers. I can run a Python script that drives my GPU between 80 W and 180 W of power draw, so TensorFlow works.
Now I'm asking you: what can I do to make DeepSpeech use my GPU? Right now only the CPU is working; my GPU is idling with DeepSpeech.
With TensorFlow directly my GPU works, so it should work here too, right?
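One quick way to confirm whether the GPU is actually busy during training is to watch it from a second terminal. A minimal sketch, not from the thread itself, assuming the ROCm tools are installed (`rocm-smi` ships with the ROCm driver stack):

```shell
# Refresh the rocm-smi summary every second while DeepSpeech trains.
# The table includes power draw and GPU use; if DeepSpeech is really on
# the GPU, "GPU use" should climb well above idle.
watch -n 1 rocm-smi
```

If the utilization stays at idle while training steps advance, the graph is running on the CPU regardless of what the device logs claim.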
Here are my pip packages:
```
Package Version
-------------------- --------
absl-py 0.7.1
asn1crypto 0.24.0
astor 0.8.0
attrdict 2.0.1
audioread 2.1.7
bcrypt 3.1.6
beautifulsoup4 4.7.1
bs4 0.0.1
certifi 2019.3.9
cffi 1.12.3
chardet 3.0.4
cryptography 2.6.1
cycler 0.10.0
decorator 4.4.0
ds-ctcdecoder 0.5.0a9
gast 0.2.2
grpcio 1.20.1
h5py 2.9.0
idna 2.8
joblib 0.13.2
Keras-Applications 1.0.7
Keras-Preprocessing 1.0.9
kiwisolver 1.1.0
librosa 0.6.3
llvmlite 0.28.0
Markdown 3.1.1
matplotlib 3.1.0
mock 3.0.5
numba 0.43.1
numpy 1.15.4
pandas 0.24.2
paramiko 2.4.2
pip 19.1.1
pkg-resources 0.0.0
progressbar2 3.39.3
protobuf 3.7.1
pyasn1 0.4.5
pycparser 2.19
PyNaCl 1.3.0
pyparsing 2.4.0
python-dateutil 2.8.0
python-utils 2.3.0
pytz 2019.1
pyxdg 0.26
requests 2.22.0
resampy 0.2.1
scikit-learn 0.21.1
scipy 1.3.0
setuptools 41.0.1
six 1.12.0
SoundFile 0.10.2
soupsieve 1.9.1
sox 1.3.7
tensorboard 1.13.1
tensorflow-estimator 1.13.0
tensorflow-rocm 1.13.3
termcolor 1.1.0
urllib3 1.25.2
Werkzeug 0.15.4
wheel 0.33.4
```
Here is how I run DeepSpeech:
```
./DeepSpeech/DeepSpeech.py --train_files clips/train.csv --dev_files clips/dev.csv --test_files clips/test.csv
```
When I add this code:
```
if __name__ == '__main__':
    sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))  # this line here
    create_flags()
    tf.app.run(main)
```
in DeepSpeech.py, I get this:
```
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Vega 10 XT [Radeon PRO WX 9100], pci bus id: 0000:03:00.0
```
When I run this:
```
import tensorflow as tf

if tf.test.gpu_device_name():
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("Please install GPU version of TF")
```
I get this:
```
2019-05-22 11:54:46.383769: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-22 11:54:46.385204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1531] Found device 0 with properties:
name: Vega 10 XT [Radeon PRO WX 9100]
AMDGPU ISA: gfx900
memoryClockRate (GHz) 1.5
pciBusID 0000:03:00.0
Total memory: 15.98GiB
Free memory: 15.73GiB
2019-05-22 11:54:46.385304: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1642] Adding visible gpu devices: 0
2019-05-22 11:54:46.385368: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-22 11:54:46.385389: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1059] 0
2019-05-22 11:54:46.385405: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1072] 0: N
2019-05-22 11:54:46.385498: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/device:GPU:0 with 15306 MB memory) -> physical GPU (device: 0, name: Vega 10 XT [Radeon PRO WX 9100], pci bus id: 0000:03:00.0)
2019-05-22 11:54:46.406955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1642] Adding visible gpu devices: 0
2019-05-22 11:54:46.406990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-22 11:54:46.406996: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1059] 0
2019-05-22 11:54:46.407000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1072] 0: N
2019-05-22 11:54:46.407018: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/device:GPU:0 with 15306 MB memory) -> physical GPU (device: 0, name: Vega 10 XT [Radeon PRO WX 9100], pci bus id: 0000:03:00.0)
Default GPU Device: /device:GPU:0
```
So the GPU is detected and working in TensorFlow... why wouldn't it work with DeepSpeech? :D Please tell me what I'm doing wrong, or help me get this AMD card working with DeepSpeech.
And here is what I get with Python 3.6.7 (I ran these two commands):
```
>>> from tensorflow.python.client import device_lib
>>> local_device_protos = device_lib.list_local_devices()
```
```
2019-05-23 10:01:23.148688: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-23 10:01:23.150208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1531] Found device 0 with properties:
name: Vega 10 XT [Radeon PRO WX 9100]
AMDGPU ISA: gfx900
memoryClockRate (GHz) 1.5
pciBusID 0000:03:00.0
Total memory: 15.98GiB
Free memory: 15.73GiB
2019-05-23 10:01:23.150272: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1642] Adding visible gpu devices: 0
2019-05-23 10:01:23.150313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-23 10:01:23.150334: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1059] 0
2019-05-23 10:01:23.150349: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1072] 0: N
2019-05-23 10:01:23.150445: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/device:GPU:0 with 15306 MB memory) -> physical GPU (device: 0, name: Vega 10 XT [Radeon PRO WX 9100], pci bus id: 0000:03:00.0)
```
So the GPU is detected everywhere...
And this is how my command runs (I added a print call in deepspeech/util/gpu.py to get information at DeepSpeech startup):
```
(venv36_2) root@JPVELOIA001:~/deepspeech# ./DeepSpeech/DeepSpeech.py --train_files clips/train.csv --dev_files clips/dev.csv --test_files clips/test.csv
[name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 16049923687
locality {
bus_id: 2
numa_node: 1
links {
}
}
incarnation: 10152498493872152683
physical_device_desc: "device: 0, name: Vega 10 XT [Radeon PRO WX 9100], pci bus id: 0000:03:00.0"
]
('/device:GPU:0',)
WARNING:tensorflow:From /root/venv36_2/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py:429: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, use
tf.py_function, which takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
WARNING:tensorflow:From /root/venv36_2/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py:358: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /root/venv36_2/lib/python3.6/site-packages/tensorflow/contrib/rnn/python/ops/lstm_ops.py:696: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /root/venv36_2/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
I Restored variables from most recent checkpoint at /root/.local/share/deepspeech/checkpoints/train-5039, step 5040
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:00:18 | Steps: 2 | Loss: 38.506773
```
So tell me: why is the GPU displayed, but not used?
Here is my GitHub ticket with more information…
I have an AMD WX 9100 GPU with 16 GB of HBM2 RAM.
I'm trying to make DeepSpeech work; currently only the CPU is being used.
They suggested I increase the batch size.
Is util/flags.py the file I need to change?
lissyx ((slow to reply) [NOT PROVIDING SUPPORT])
May 24, 2019, 8:00am
Still lacking the information about the dataset that was asked for on that issue.
No, please read the documentation and use `--{train,dev,test}_batch_size X` when starting training.
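Spelled out, the suggestion above looks like this on the command line. The batch sizes here are placeholders, not values from the thread; pick numbers that fit the 16 GB of GPU memory:

```shell
# Same invocation as before, with explicit batch sizes for each phase.
# Larger batches keep the GPU fed; reduce them if you hit out-of-memory errors.
./DeepSpeech/DeepSpeech.py \
  --train_files clips/train.csv \
  --dev_files clips/dev.csv \
  --test_files clips/test.csv \
  --train_batch_size 24 \
  --dev_batch_size 48 \
  --test_batch_size 48
```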
lissyx ((slow to reply) [NOT PROVIDING SUPPORT])
May 24, 2019, 2:15pm
Looks like you are unblocked. To update the context: I've linked you the Dockerfile for the French model, which includes other datasets, and the goal now is to have a Dockerfile for AMD with feature parity to the NVIDIA-based Dockerfile.