- Mozilla STT version: DeepSpeech 0.9.1
- OS: Ubuntu 18.04
- Python: 3.6.5
- tensorflow-gpu version: 1.15.4
- GPU: NVIDIA GeForce MX230
- CUDA version:
(env) ghada@ghada-Inspiron-3593:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
- cuDNN version: cudnn-10.1-linux-x64-v7.6.5.32
Hello, I’ve been looking into DeepSpeech for a while.

I installed DeepSpeech with pip:

```
pip3 install deepspeech
```

and downloaded both the pre-trained model and the scorer from the latest release (v0.9.1). I then ran inference with:

```
deepspeech --model /my/path/to/deepspeech-0.9.1-models.pbmm --scorer /my/path/to/deepspeech-0.9.1-models.scorer --audio /my/path/to/myaudio.wav
```

and got the following results:
2020-11-27 03:42:35.544279: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Loading model from file /home/ghada/deepspeech-0.9.1-models.pbmm
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.1-0-gab8bd3e
2020-11-27 03:42:35.646252: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-27 03:42:35.647121: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-11-27 03:42:35.669893: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-27 03:42:35.670202: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce MX230 computeCapability: 6.1
coreClock: 1.531GHz coreCount: 2 deviceMemorySize: 1.96GiB deviceMemoryBandwidth: 44.76GiB/s
2020-11-27 03:42:35.670292: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-11-27 03:42:35.673877: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-11-27 03:42:35.674952: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-11-27 03:42:35.675210: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-11-27 03:42:35.676835: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-11-27 03:42:35.677750: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-11-27 03:42:35.681210: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-11-27 03:42:35.681316: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-27 03:42:35.681637: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-27 03:42:35.681894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-11-27 03:42:35.941946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-27 03:42:35.941990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-11-27 03:42:35.941993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-11-27 03:42:35.942156: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-27 03:42:35.942463: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-27 03:42:35.942728: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-27 03:42:35.943037: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1259 MB memory) -> physical GPU (device: 0, name: GeForce MX230, pci bus id: 0000:01:00.0, compute capability: 6.1)
Loaded model in 0.321s.
Loading scorer from files /home/ghada/deepspeech-0.9.1-models.scorer
Loaded scorer in 0.000138s.
Running inference.
2020-11-27 03:42:35.998799: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
karen read was on them by the greatness of your arm there is still as a stone and talouel as overlord until the people pass over who you have purchased
Inference took 4.402s for 9.189s audio file.
The goal of this work is to fine-tune the DeepSpeech model on Bible verses so it becomes familiar with Bible vocabulary.
I have tested DeepSpeech with several .wav files and was getting a few mistakes in the transcripts, which I’m trying to avoid by training the pre-trained model on 73 hours of Bible audio (31,080 files).
I will try to document my every step so the experts can point out my mistakes.
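To quantify those transcript mistakes before and after fine-tuning, a word error rate (WER) can be computed. Here is a minimal sketch (a plain Levenshtein distance over words; the example phrases are taken from the transcript above):

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[-1][-1] / max(len(ref), 1)

print(wer("there is still as a stone", "there is still as stone"))  # 1 edit / 6 words ≈ 0.167
```

This is only for sanity-checking results; for serious evaluation DeepSpeech’s own test epoch already reports WER.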
I’ve followed the documentation steps:

```
git clone --branch v0.9.1 https://github.com/mozilla/DeepSpeech
python3 -m venv $HOME/tmp/env/
source $HOME/tmp/env/bin/activate
cd DeepSpeech
pip3 install --upgrade pip==20.2.2 wheel==0.34.2 setuptools==49.6.0
pip3 install --upgrade -e .
sudo apt-get install python3-dev
```
I also followed these steps to get CUDA and cuDNN in the required versions:

- skipped the Dockerfile part,
- downloaded the checkpoint, the pre-trained model, and the scorer from the latest release (v0.9.1),
- prepared the data:
  - I converted the .wav files to int16, 16000 Hz sample rate, mono channel.
  - I split my corpus into a 7:2:1 ratio for train:dev:test respectively.
  - My CSV files contain the `wav_filename,wav_filesize,transcript` columns.
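As a sketch of the split and CSV preparation described above (`write_split_csvs` is a hypothetical helper, not part of DeepSpeech; it only assumes the audio files already exist and have transcripts):

```python
import csv
import os
import random

def write_split_csvs(samples, out_dir, seed=42):
    """samples: list of (wav_path, transcript) pairs.
    Writes train.csv / dev.csv / test.csv with the
    wav_filename,wav_filesize,transcript columns DeepSpeech expects,
    split roughly 7:2:1."""
    random.Random(seed).shuffle(samples)
    n = len(samples)
    n_train, n_dev = int(n * 0.7), int(n * 0.2)
    splits = {
        "train": samples[:n_train],
        "dev": samples[n_train:n_train + n_dev],
        "test": samples[n_train + n_dev:],
    }
    os.makedirs(out_dir, exist_ok=True)
    for name, rows in splits.items():
        with open(os.path.join(out_dir, name + ".csv"), "w", newline="") as f:
            w = csv.writer(f)
            w.writerow(["wav_filename", "wav_filesize", "transcript"])
            for wav_path, transcript in rows:
                w.writerow([wav_path, os.path.getsize(wav_path), transcript])
```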
Finally, I used the following command to train the model:

```
python3 DeepSpeech.py \
  --n_hidden 2048 \
  --checkpoint_dir /home/ghada/deepspeech-0.9.1-checkpoint/ \
  --epochs 3 \
  --train_cudnn \
  --train_files train/train.csv \
  --dev_files dev/dev.csv \
  --test_files test/test.csv \
  --scorer /home/ghada/deepspeech-0.9.1-models.scorer \
  --learning_rate 0.0001 \
  --export_dir output/ \
  --export_tflite
```
I’ve set epochs to only 3 because I first want to train the model on only a few files to estimate time consumption; later I will set it to something between 10 and 20 epochs (please correct me if I’m wrong).
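For the time estimate, a back-of-the-envelope linear extrapolation from the short trial run might look like this (the sample numbers are made-up placeholders, not measurements):

```python
def estimate_total_hours(secs_per_epoch_sample, sample_files, total_files, epochs):
    """Linearly extrapolate full training time from a small trial run."""
    secs_per_file = secs_per_epoch_sample / sample_files
    return secs_per_file * total_files * epochs / 3600

# e.g. if one epoch over a 1000-file sample took 1800 s,
# 15 epochs over all 31080 files would take roughly:
print(estimate_total_hours(1800, 1000, 31080, 15))  # ≈ 233 hours
```

In practice this overestimates somewhat, since per-epoch time varies with file lengths and batch packing, but it gives an order of magnitude.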
Running this command gave me the following error:
I Loading variable from checkpoint: beta1_power
Traceback (most recent call last):
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1348, in _run_fn
self._extend_graph()
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1388, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'CudnnRNNCanonicalToParams' used by {{node tower_0/cudnn_lstm/cudnn_lstm/CudnnRNNCanonicalToParams}}with these attrs: [seed=4568, dropout=0, num_params=8, input_mode="linear_input", T=DT_FLOAT, direction="unidirectional", rnn_mode="lstm", seed2=247]
Registered devices: [CPU, XLA_CPU, XLA_GPU]
Registered kernels:
device='GPU'; T in [DT_DOUBLE]
device='GPU'; T in [DT_FLOAT]
device='GPU'; T in [DT_HALF]
[[tower_0/cudnn_lstm/cudnn_lstm/CudnnRNNCanonicalToParams]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "DeepSpeech.py", line 12, in <module>
ds_train.run_script()
File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 976, in run_script
absl.app.run(main)
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 948, in main
train()
File "/home/ghada/DeepSpeech/deepspeech_training/train.py", line 527, in train
load_or_init_graph_for_training(session)
File "/home/ghada/DeepSpeech/deepspeech_training/util/checkpoints.py", line 137, in load_or_init_graph_for_training
_load_or_init_impl(session, methods, allow_drop_layers=True)
File "/home/ghada/DeepSpeech/deepspeech_training/util/checkpoints.py", line 98, in _load_or_init_impl
return _load_checkpoint(session, ckpt_path, allow_drop_layers, allow_lr_init=allow_lr_init)
File "/home/ghada/DeepSpeech/deepspeech_training/util/checkpoints.py", line 71, in _load_checkpoint
v.load(ckpt.get_tensor(v.op.name), session=session)
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/ops/variables.py", line 1033, in load
session.run(self.initializer, {self.initializer.inputs[1]: value})
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'CudnnRNNCanonicalToParams' used by node tower_0/cudnn_lstm/cudnn_lstm/CudnnRNNCanonicalToParams (defined at /home/ghada/anaconda3/envs/env/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) with these attrs: [seed=4568, dropout=0, num_params=8, input_mode="linear_input", T=DT_FLOAT, direction="unidirectional", rnn_mode="lstm", seed2=247]
Registered devices: [CPU, XLA_CPU, XLA_GPU]
Registered kernels:
device='GPU'; T in [DT_DOUBLE]
device='GPU'; T in [DT_FLOAT]
device='GPU'; T in [DT_HALF]
[[tower_0/cudnn_lstm/cudnn_lstm/CudnnRNNCanonicalToParams]]
I searched around for a similar error, and it turned out to be caused by wrong CUDA and cuDNN versions (for other DeepSpeech releases). Yet I have the required versions, as mentioned above, and I have even uninstalled and reinstalled them several times.
When I check whether TensorFlow can see the GPU with:

```
import tensorflow as tf; tf.test.is_gpu_available()
```

I get the following:
2020-11-27 04:33:19.587097: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1497600000 Hz
2020-11-27 04:33:19.587869: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560309ea6170 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-27 04:33:19.587921: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-27 04:33:19.591450: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-27 04:33:19.648328: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-27 04:33:19.648744: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560309f3f2a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-27 04:33:19.648757: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce MX230, Compute Capability 6.1
2020-11-27 04:33:19.648926: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-27 04:33:19.649194: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: GeForce MX230 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:01:00.0
2020-11-27 04:33:19.649328: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda/lib64:/usr/local/cuda-10.1/lib64
2020-11-27 04:33:19.649394: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda/lib64:/usr/local/cuda-10.1/lib64
2020-11-27 04:33:19.649485: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda/lib64:/usr/local/cuda-10.1/lib64
2020-11-27 04:33:19.649579: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda/lib64:/usr/local/cuda-10.1/lib64
2020-11-27 04:33:19.649683: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda/lib64:/usr/local/cuda-10.1/lib64
2020-11-27 04:33:19.649745: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/lib64:/usr/local/cuda/lib64:/usr/local/cuda-10.1/lib64
2020-11-27 04:33:19.652300: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-27 04:33:19.652318: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1662] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-11-27 04:33:19.652338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-27 04:33:19.652354: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0
2020-11-27 04:33:19.652364: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N
False
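The warnings above show this TensorFlow build looking specifically for the CUDA 10.0 libraries (`libcudart.so.10.0` etc.), while only the 10.1 versions are on my LD_LIBRARY_PATH. A small ctypes sketch can confirm which runtime libraries are actually loadable (the library names are the ones from the log):

```python
import ctypes

def can_load(libname):
    """Return True if the shared library can be dlopen'd, False otherwise."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

for lib in ["libcudart.so.10.0", "libcudart.so.10.1"]:
    print(lib, "->", "found" if can_load(lib) else "missing")
```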
If I run the same DeepSpeech command to train on the CPU (replacing the --train_cudnn flag with --load_cudnn), it works perfectly, but it takes so long, which is why I want to train on the GPU.
Did I miss a step somewhere?