I’ve already read a few posts about this, but I can’t seem to find an answer.
My problem is that DeepSpeech doesn’t seem to run on the GPU when training a model. Maybe I’m missing something, but I think I did everything according to the README of the repo. Here is what I did:
created a virtualenv with ‘… -p python3’ and activated it
I should also have all the CUDA dependencies; my colleague set those up, and nvidia-smi prints the following:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P4000        On   | 00000000:02:00.0  On |                  N/A |
| 46%   31C    P8     6W / 105W |    383MiB /  8116MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1173      G   /usr/lib/xorg/Xorg                           243MiB |
|    0      1391      G   /usr/bin/gnome-shell                         137MiB |
+-----------------------------------------------------------------------------+
When I start training, DeepSpeech gives me some warnings though. One of them is:
WARNING:tensorflow:From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
which makes me wonder: shouldn’t it be /site-packages/tensorflow-GPU/ or something? Also, pip3 list gives me this output:
What am I missing? I’m hoping someone can give me a hint.
Thanks in advance!
gneulyn
P.S.: Everything seems to work fine, just not on the GPU.
lissyx
It’s installed. If you are starting training in the properly activated virtualenv there is no reason this would not work.
Is this during training?
You could start by sharing more training logs, and by running with --log_level x with x > 1.
That would be consistent with an improperly uninstalled tensorflow Python wheel, or an incorrectly set up virtualenv. But no, the path in that warning is expected: the tensorflow-gpu wheel also installs its module under site-packages/tensorflow (the warnings themselves are harmless).
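One way to look for a leftover wheel is to list which TensorFlow distributions are installed in the active environment. The sketch below is only an illustration: the helper names `installed_names` and `tensorflow_distributions` are hypothetical, and it assumes that finding both `tensorflow` and `tensorflow-gpu` side by side is the situation worth flagging.

```python
# Sketch: list the TensorFlow distributions installed in the current environment.
# Helper names are hypothetical; run this with the virtualenv's own interpreter.


def installed_names():
    """Return the names of all installed distributions."""
    try:
        from importlib import metadata  # Python 3.8+
        return [d.metadata["Name"] for d in metadata.distributions()]
    except ImportError:
        import pkg_resources  # fallback for older Pythons such as 3.6
        return [d.project_name for d in pkg_resources.working_set]


def tensorflow_distributions(names):
    """Return the sorted subset of `names` that is a TensorFlow wheel."""
    wanted = {"tensorflow", "tensorflow-gpu"}
    # Normalize case and underscores so "TensorFlow_GPU" matches "tensorflow-gpu".
    normalized = {n.lower().replace("_", "-") for n in names if n}
    return sorted(normalized & wanted)


if __name__ == "__main__":
    found = tensorflow_distributions(installed_names())
    print("TensorFlow distributions:", found or "none")
    if len(found) > 1:
        print("Both CPU and GPU wheels are installed; they share one module directory.")
```

If this prints both names, the two wheels are writing into the same `site-packages/tensorflow` directory, which would fit the "improperly uninstalled wheel" explanation.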
Thanks for helping. I was kind of hoping for your reply.
Here is the log at log level 2 from when I start training:
./DeepSpeech.py --train_files ~/Desktop/Files/train.csv --dev_files ~/Desktop/Files/dev.csv --test_files ~/Desktop/Files/test.csv --epochs 1 --export_dir ~/Desktop/190919_Deepspeech/model_export --checkpoint_dir ~/Desktop/190919_Deepspeech/checkpoints --test_batch_size 200 --train_batch_size 200 --dev_batch_size 200 --log_level 2
WARNING:tensorflow:From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py:494: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
options available in V2.
- tf.py_function takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
- tf.numpy_function maintains the semantics of the deprecated tf.py_func
(it is not differentiable, and manipulates numpy arrays). It drops the
stateful argument making all functions stateful.
W0919 16:54:59.203839 140230625675072 deprecation.py:323] From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py:494: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
options available in V2.
- tf.py_function takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
- tf.numpy_function maintains the semantics of the deprecated tf.py_func
(it is not differentiable, and manipulates numpy arrays). It drops the
stateful argument making all functions stateful.
WARNING:tensorflow:From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py:348: Iterator.output_types (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.data.get_output_types(iterator)`.
W0919 16:54:59.264383 140230625675072 deprecation.py:323] From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py:348: Iterator.output_types (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.data.get_output_types(iterator)`.
WARNING:tensorflow:From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py:349: Iterator.output_shapes (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.data.get_output_shapes(iterator)`.
W0919 16:54:59.264564 140230625675072 deprecation.py:323] From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py:349: Iterator.output_shapes (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.data.get_output_shapes(iterator)`.
WARNING:tensorflow:From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py:351: Iterator.output_classes (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.data.get_output_classes(iterator)`.
W0919 16:54:59.264675 140230625675072 deprecation.py:323] From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py:351: Iterator.output_classes (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.data.get_output_classes(iterator)`.
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
W0919 16:55:01.491833 140230625675072 lazy_loader.py:50]
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0919 16:55:01.493896 140230625675072 deprecation.py:506] From /home/encoder80/Desktop/190919_Deepspeech/lib/python3.6/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:Entity <bound method LSTMBlockWrapper.call of <tensorflow.contrib.rnn.python.ops.lstm_ops.LSTMBlockFusedCell object at 0x7f89304ef940>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method LSTMBlockWrapper.call of <tensorflow.contrib.rnn.python.ops.lstm_ops.LSTMBlockFusedCell object at 0x7f89304ef940>>: AttributeError: module 'gast' has no attribute 'Num'
W0919 16:55:01.522188 140230625675072 ag_logging.py:145] Entity <bound method LSTMBlockWrapper.call of <tensorflow.contrib.rnn.python.ops.lstm_ops.LSTMBlockFusedCell object at 0x7f89304ef940>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method LSTMBlockWrapper.call of <tensorflow.contrib.rnn.python.ops.lstm_ops.LSTMBlockFusedCell object at 0x7f89304ef940>>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:From ./DeepSpeech.py:232: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0919 16:55:01.582261 140230625675072 deprecation.py:323] From ./DeepSpeech.py:232: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Epoch 0 | Training | Elapsed Time: 0:00:26 | Steps: 1 | Loss: 358.728058
By “properly activated virtualenv” you mean running the activate script, right?
source path/to/bla/activate
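A quick way to confirm that the activated environment is really the one running the training script is to ask the interpreter itself. This is a minimal sketch, assuming a standard virtualenv or venv layout; the function name `in_virtualenv` is hypothetical.

```python
# Sketch: check whether the running interpreter belongs to a virtualenv.
import sys


def in_virtualenv():
    """Return True when this interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:", sys.prefix)
    print("inside a virtualenv:", in_virtualenv())
```

Running it with the same `python3` that launches `DeepSpeech.py` shows which interpreter and which site-packages directory are actually in use.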
The nvidia-smi output while training looks like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P4000        On   | 00000000:02:00.0  On |                  N/A |
| 46%   32C    P8     8W / 105W |    342MiB /  8116MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1173      G   /usr/lib/xorg/Xorg                           129MiB |
|    0      1391      G   /usr/bin/gnome-shell                         132MiB |
|    0     27317      C   python                                        77MiB |
+-----------------------------------------------------------------------------+
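In this output the python process holds only 77MiB and GPU-Util stays at 0%, which suggests the training ops are not running on the GPU. To see what TensorFlow itself reports, the devices it can use can be listed from inside the same virtualenv. The sketch below uses `device_lib.list_local_devices()` from TensorFlow; the wrapper names `visible_devices` and `has_gpu` are hypothetical.

```python
# Sketch: list the devices TensorFlow can see from the current environment.


def visible_devices():
    """Return (name, device_type) pairs, or None if TensorFlow is not importable."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [(d.name, d.device_type) for d in device_lib.list_local_devices()]


def has_gpu(devices):
    """Return True when at least one device has type GPU."""
    return any(device_type == "GPU" for _, device_type in devices)


if __name__ == "__main__":
    devices = visible_devices()
    if devices is None:
        print("TensorFlow is not importable in this environment")
    else:
        for name, device_type in devices:
            print(device_type, name)
        print("GPU visible to TensorFlow:", has_gpu(devices))
```

If only CPU devices are listed, the interpreter is most likely loading a CPU-only TensorFlow build, or the GPU build cannot load its CUDA libraries.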
I think I followed every step in the README. How can I properly uninstall tensorflow? I already tried removing the folder, but then of course the module is not found…
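Deleting the folder by hand leaves pip's metadata behind, so a cleaner route is to let pip remove both wheels and then install only the GPU one. The following is a hedged sketch, not the project's official procedure: the helper `reinstall_commands` is hypothetical, and the pin `tensorflow-gpu==1.14.0` is an assumption inferred from the 1.14-era warnings in the log above, so substitute the version your DeepSpeech checkout asks for. It only prints the commands unless `--apply` is passed.

```python
# Sketch: uninstall every TensorFlow wheel with pip, then install the GPU wheel.
import subprocess
import sys

# Assumption: version inferred from the warnings in the log, not confirmed by
# the thread. Replace it with the version your DeepSpeech checkout requires.
TF_GPU_SPEC = "tensorflow-gpu==1.14.0"


def reinstall_commands(python=sys.executable, spec=TF_GPU_SPEC):
    """Build the pip command lines for a clean GPU-only reinstall."""
    pip = [python, "-m", "pip"]
    return [
        # Remove both wheels, since they share the same module directory.
        pip + ["uninstall", "-y", "tensorflow", "tensorflow-gpu"],
        pip + ["install", spec],
    ]


if __name__ == "__main__":
    apply_changes = "--apply" in sys.argv[1:]
    for cmd in reinstall_commands():
        print(" ".join(cmd))
        if apply_changes:
            subprocess.check_call(cmd)
```

Using `sys.executable -m pip` rather than a bare `pip3` keeps the commands tied to the interpreter of the activated virtualenv.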