Hi Guys,
I am continuing training of the DeepSpeech 0.5.1 model from the downloaded checkpoint.
1). Cloned DeepSpeech 0.5.1 and cherry-picked commit 007e512 (git cherry-pick 007e512).
2). Downloaded the DeepSpeech.py given in "How to find which file is making loss inf" (to find the file that makes the training loss infinite).
3). Downloaded the DeepSpeech 0.5.1 checkpoint.
4). Downloaded the Mozilla Common Voice corpus data.
5). Installed TensorFlow 1.14.0 GPU for GPU-accelerated training.
The model is training far too slowly. Here is the command:
```
python3 -u DeepSpeech.py \
  --n_hidden 2048 \
  --epochs 3 \
  --checkpoint_dir data/checkpoint/ \
  --train_files data/corpus/clips/train.csv \
  --dev_files data/corpus/clips/dev.csv \
  --test_files data/corpus/clips/test.csv \
  --train_batch_size 8 \
  --dev_batch_size 10 \
  --test_batch_size 10 \
  --dropout_rate 0.15 \
  --lm_alpha 0.75 \
  --lm_beta 1.85 \
  --learning_rate 0.0001 \
  --lm_binary_path data/originalLmBinary/lm.binary \
  --lm_trie_path data/originalLmBinary/trie \
  --export_dir data/export/
```
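While training runs, I poll GPU utilization with a small script of my own (this is just my helper, not part of DeepSpeech; it assumes nvidia-smi's `--query-gpu` CSV interface is available on PATH):

```python
import subprocess

def parse_gpu_util(csv_text):
    """Parse the output of
    `nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits`
    into a {gpu_index: utilization_percent} dict."""
    utils = {}
    for line in csv_text.strip().splitlines():
        idx, util = (field.strip() for field in line.split(","))
        utils[int(idx)] = int(util)
    return utils

def current_gpu_util():
    """Query the live utilization values (needs nvidia-smi on PATH)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,utilization.gpu",
         "--format=csv,noheader,nounits"]).decode()
    return parse_gpu_util(out)

# Sample of what the query text looks like on my box (not live output):
print(parse_gpu_util("0, 4\n1, 0"))  # {0: 4, 1: 0}
```

If utilization stays in the low single digits while the CPU is busy, that usually points at the input pipeline (audio decoding/feature extraction) rather than the GPU itself.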
My system configuration:
1). Quadro RTX 4000 (8 GB VRAM) x 1
2). 500 GB SSD
3). Ubuntu 18.04
4). CUDA 10.0 and cuDNN v7.5
5). NVIDIA driver 435
Here is the nvidia-smi output. Ignore the CUDA version 10.1 shown there; it is wrong. I have two GPUs, but I think only the RTX 4000 is running, and only at 4–15% utilization, never fully used (checked with nvtop).
```
Fri Sep 6 17:44:50 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro RTX 4000     Off  | 00000000:01:00.0  On |                  N/A |
| 30%   44C    P8    11W / 125W |    645MiB /  7981MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 105...  Off  | 00000000:02:00.0 Off |                  N/A |
|  0%   37C    P8    N/A /  75W |      2MiB /  4040MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1314      G   /usr/lib/xorg/Xorg                            39MiB |
|    0      1363      G   /usr/bin/gnome-shell                          55MiB |
|    0      1672      G   /usr/lib/xorg/Xorg                           222MiB |
|    0      1816      G   /usr/bin/gnome-shell                         136MiB |
|    0      2223      G   ...equest-channel-token=222207177073321633   141MiB |
|    0      4780      G   ...pareRendererForSitePerProcess --disable    47MiB |
+-----------------------------------------------------------------------------+
```
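In case the mismatched second card is what slows things down, pinning the run to the RTX 4000 alone can be sketched like this (CUDA_VISIBLE_DEVICES is the standard CUDA mechanism; the training flags are the ones from my command above, abbreviated here):

```shell
# Expose only GPU 0 (the RTX 4000) to TensorFlow, hiding the smaller card,
# then launch training exactly as before.
export CUDA_VISIBLE_DEVICES=0
python3 -u DeepSpeech.py --n_hidden 2048 --epochs 3 ...
```

With only one visible device, TensorFlow should stop waiting on the slower GPU to synchronize each step.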
Please correct me if I am wrong.