The same speed with CPU and with GPU

I made a virtual environment for training without a GPU and tested the model. With my dataset I got 1 step per 40 seconds. After that I stopped training, made a separate virtual environment for GPU training, and installed there:
pip3 uninstall tensorflow
pip3 install 'tensorflow-gpu==1.15.2'
and tested training with the same parameters. In this case I get the same training speed as with the CPU. Why didn't I get faster training?

python3 DeepSpeech.py --drop_source_layers 1 --alphabet_config_path ~/ASR/data-cv/alphabet.ru --save_checkpoint_dir ~/ASR/ru-output-checkpoint --load_checkpoint_dir ~/ASR/ru-release-checkpoint --train_files ~/ASR/data-cv/clips/train.csv --dev_files ~/ASR/data-cv/clips/dev.csv --test_files ~/ASR/data-cv/clips/test.csv --scorer_path ~/ASR/ru-release-checkpoint/deepspeech-0.7.0-models.scorer --train_batch_size 64 --dropout_rate 0.4 --learning_rate 0.0001 --dev_batch_size 64
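
A quick sanity check, run in the GPU virtual environment, that the tensorflow-gpu wheel is the one actually being imported and that it was built with CUDA support (tf.test.is_built_with_cuda() is available in TensorFlow 1.15):

python3 -c "import tensorflow as tf; print(tf.__version__, tf.test.is_built_with_cuda())"

If this prints False, or if the plain tensorflow package is still installed alongside tensorflow-gpu, training will quietly run on the CPU.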

Try train_cudnn as a flag. See more info in flags.py. If that doesn't help, post the output of training. Usually it tells you whether you are using CUDA or not.
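
For example, the same command as above with the flag added (a sketch assuming --train_cudnn is the boolean flag defined in flags.py, so it can be passed bare):

python3 DeepSpeech.py --drop_source_layers 1 --alphabet_config_path ~/ASR/data-cv/alphabet.ru --save_checkpoint_dir ~/ASR/ru-output-checkpoint --load_checkpoint_dir ~/ASR/ru-release-checkpoint --train_files ~/ASR/data-cv/clips/train.csv --dev_files ~/ASR/data-cv/clips/dev.csv --test_files ~/ASR/data-cv/clips/test.csv --scorer_path ~/ASR/ru-release-checkpoint/deepspeech-0.7.0-models.scorer --train_batch_size 64 --dropout_rate 0.4 --learning_rate 0.0001 --dev_batch_size 64 --train_cudnn

Note the two plain ASCII hyphens; some forum software converts them into a dash, which the flag parser will not recognize.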

I tried the train_cudnn flag.

Here are the results.

With GPU: 1 step took 1m 02s.

python3 DeepSpeech.py --drop_source_layers 1 --alphabet_config_path ~/ASR/data-cv/alphabet.ru --save_checkpoint_dir ~/ASR/ru-output-checkpoint --load_checkpoint_dir ~/ASR/ru-release-checkpoint --train_files ~/ASR/data-cv/clips/train.csv --dev_files ~/ASR/data-cv/clips/dev.csv --test_files ~/ASR/data-cv/clips/test.csv --scorer_path ~/ASR/ru-release-checkpoint/deepspeech-0.7.0-models.scorer --train_batch_size 64 --dropout_rate 0.25 --learning_rate 0.00005 --dev_batch_size 64 --train_cudnn True
W WARNING: You specified different values for --load_checkpoint_dir and --save_checkpoint_dir, but you are running training and testing in a single invocation. The testing step will respect --load_checkpoint_dir, and thus WILL NOT TEST THE CHECKPOINT CREATED BY THE TRAINING STEP. Train and test in two separate invocations, specifying the correct --load_checkpoint_dir in both cases, or use the same location for loading and saving.
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 |   Training | Elapsed Time: 0:01:02 | Steps: 1 | Loss: 293.825439 

Without GPU:
1 step took 0m 56s.

python3 DeepSpeech.py --drop_source_layers 1 --alphabet_config_path ~/ASR/data-cv/alphabet.ru --save_checkpoint_dir ~/ASR/ru-output-checkpoint --load_checkpoint_dir ~/ASR/ru-release-checkpoint --train_files ~/ASR/data-cv/clips/train.csv --dev_files ~/ASR/data-cv/clips/dev.csv --test_files ~/ASR/data-cv/clips/test.csv --scorer_path ~/ASR/ru-release-checkpoint/deepspeech-0.7.0-models.scorer --train_batch_size 64 --dropout_rate 0.5 --learning_rate 0.00005 --dev_batch_size 64
W WARNING: You specified different values for --load_checkpoint_dir and --save_checkpoint_dir, but you are running training and testing in a single invocation. The testing step will respect --load_checkpoint_dir, and thus WILL NOT TEST THE CHECKPOINT CREATED BY THE TRAINING STEP. Train and test in two separate invocations, specifying the correct --load_checkpoint_dir in both cases, or use the same location for loading and saving.
I Could not find best validating checkpoint.
I Could not find most recent checkpoint.
I Initializing all variables.
I STARTING Optimization
Epoch 0 |   Training | Elapsed Time: 0:00:56 | Steps: 1 | Loss: 297.249207 

Why is training without the GPU faster than with the GPU?
Maybe I did something wrong.
I have a GTX 1060 card with 3 GB of memory.

You are doing something wrong, and you don't provide much info to go on; check the info here:

https://deepspeech.readthedocs.io/en/v0.7.0/TRAINING.html#installing-deepspeech-training-code-and-its-dependencies

I did
pip3 uninstall tensorflow
pip3 install 'tensorflow-gpu==1.15.2'

and all was OK. The "CUDA dependency" page doesn't exist (page not found).
How can I provide the additional info you need for help?

Really, how long have you been in IT? You simply look at the link given and you'll see. Otherwise, try the search function, which lets you find things on web pages…

@lissyx Looks like the link is broken in the docs.

I have been in IT for 40+ years, but I am a novice with Ubuntu and GPUs.


I don’t know where you got that link, because it works if you pick it on rtd: https://deepspeech.readthedocs.io/en/v0.7.0/USING.html#cuda-dependency

Have you verified it is really using the GPU? Run nvidia-smi during training.

Don’t expect too much from that, though.
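
A minimal way to watch it, assuming the watch utility is installed (it refreshes the display every second):

watch -n 1 nvidia-smi

While a training step is running you should see a python process in the process list and a non-zero GPU-Util percentage.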

6 secs of delta? My money is on “there was never a GPU used” here. Please raise log level and share more tensorflow training output: if it’s loading the GPU, you will see it.

Thanks for your replies.
nvidia-smi
Sun May 3 13:50:38 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0  On |                  N/A |
| 53%   60C    P2   104W / 120W |   2890MiB /  3018MiB |     76%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1982      G   /usr/lib/xorg/Xorg                           370MiB |
|    0      2367      G   /usr/bin/gnome-shell                         231MiB |
|    0      3095      G   gnome-control-center                           2MiB |
|    0      3372      G   /usr/lib/firefox/firefox                       2MiB |
|    0      5284      G   /usr/lib/firefox/firefox                       2MiB |
|    0     10257      C   python                                      2265MiB |
|    0     14734      G   /usr/lib/firefox/firefox                       2MiB |
+-----------------------------------------------------------------------------+
(deepspeech-train-venv) (base) v@gpu:~/DeepSpeech$

I want to run some experiments with the GTX 1060 to tune the whole training process, and later work with the Cristofary supercomputer for training.

hm, if I click the link I provided above I get 404 as well for the CUDA dependency, as it doesn't convert the .rst into .html on Firefox 🙂

On this page:
https://deepspeech.readthedocs.io/en/v0.7.0/TRAINING.html#transfer-learning-new-alphabet
the "CUDA dependency" link is broken.

On the USING page everything is OK.

Can DeepSpeech use the GPU while training with DeepSpeech.py?
Or do I need deepspeech-gpu?

DeepSpeech 100% supports GPU training. There is no such thing as a deepspeech-gpu.

What can happen, and often does happen, is that if the dependencies are not correctly set up (or for whatever other reason), the GPU is not visible to TensorFlow, and TensorFlow then ‘falls back’ to CPU-only training. So even if you ask it to do GPU training, it actually only uses the CPU.

Getting the GPU recognized by TensorFlow is the first thing to figure out. Once you have that sorted, you should be able to use DeepSpeech for training with the GPU.
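
A quick check of whether TensorFlow can see the card at all, using tf.test.is_gpu_available() from TensorFlow 1.15 (deprecated in later versions):

python3 -c "import tensorflow as tf; print(tf.test.is_gpu_available())"

If this prints False, the problem is in the driver/CUDA/cuDNN setup rather than in DeepSpeech itself.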

One way to determine if the GPU is being used is to run nvidia-smi while training is happening; it shows how much activity is going on in the GPU. If things are working correctly, DeepSpeech should be giving the GPU a good workout. If it is ‘falling back’ to the CPU, the GPU will sit at almost 0% usage while training is occurring.

So, 99% chance is that you need to work out which dependencies are not working. Often it's the NVIDIA drivers, CUDA, or cuDNN that are the problem.

If you read the links here on dependencies it might be of some use, or google things like ‘tensorflow use gpu on <insert your os/version>’. It's quite a common problem that TensorFlow can't see the GPU, so don't feel bad. But it's also out of scope for DeepSpeech. You have to solve that first, and then DeepSpeech should be able to make use of it.
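
Two commands that help pin down which layer is broken (nvcc may not be on your PATH, depending on how CUDA was installed):

nvidia-smi       # driver version and the highest CUDA version the driver supports
nvcc --version   # version of the CUDA toolkit actually installed

As far as I recall, the tensorflow-gpu 1.15 wheels are built against CUDA 10.0 and cuDNN 7.x, so a version mismatch here is a common culprit.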

Hope this is helpful. Best of luck!


it’s fixed on master

Thanks for the detailed recommendations. I will try to train on the GPU. But for now, the similar DeepSpeech package from NVIDIA's OpenSeq2Seq works on the GPU without problems.

Please, can you just share the details we ask you for? That's your fourth reply, and you still have not provided more complete training logs with a more verbose --log_level. We really cannot help you if you don't share that: GPU training works very well for us.

On top of everything else that’s been said here, so far you have only provided the timing of a single training step, the first one. This is not a useful benchmark because there is a bunch of setup work that happens on the first step, and it is independent of using the CPU or the GPU. You need to look at step timings over an entire epoch to get a good idea of the performance.