Running pip3 install --upgrade -e . does complain about the numpy version.
Then I run ./bin/run-ldc93s1.sh, which works.
To test the GPU, I duplicated the data lines in ./data/ldc93s1/ldc93s1.csv until I had 100+ inputs instead of just 1.
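For anyone repeating this, the duplication step can be scripted instead of done by hand. This is only a rough sketch, assuming the CSV has one header row followed by the data rows; the function name duplicate_rows and the output filename in the comment are made up for illustration and are not part of DeepSpeech.

```python
import csv


def duplicate_rows(src_path, dst_path, min_rows=100):
    """Copy src_path to dst_path, repeating the data rows until there are
    at least min_rows of them. Returns the number of data rows written."""
    with open(src_path, newline="") as f:
        rows = list(csv.reader(f))
    if len(rows) < 2:
        raise ValueError("expected a header row and at least one data row")
    header, data = rows[0], rows[1:]

    # Cycle through the original rows until the target count is reached.
    target = max(min_rows, len(data))
    repeated = [data[i % len(data)] for i in range(target)]

    with open(dst_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(repeated)
    return len(repeated)


# Example call (hypothetical output path):
# duplicate_rows("data/ldc93s1/ldc93s1.csv", "data/ldc93s1/ldc93s1_x100.csv")
```

Writing to a separate file leaves the original single-sample CSV untouched, so the stock script still works; --train_files would then point at the new file.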
Then I modified the ./bin/run-ldc93s1.sh file:
# Force only one visible device because we have a single-sample dataset
# and when trying to run on multiple devices (like GPUs), this will break
# export CUDA_VISIBLE_DEVICES=0
python -u DeepSpeech.py --noshow_progressbar --train_files data/ldc93s1/ldc93s1.csv --train_batch_size 100 --n_hidden 100 --epochs 5 --bytes_output_mode --checkpoint_dir /home/anon/.local/share/deepspeech/ldc93s1
I Could not find best validating checkpoint.
I Loading most recent checkpoint from /home/anon/.local/share/deepspeech/ldc93s1/train-120
I Loading variable from checkpoint: beta1_power
I Loading variable from checkpoint: beta2_power
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Adam
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Adam_1
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel/Adam
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel/Adam_1
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/bias/Adam
I Loading variable from checkpoint: layer_1/bias/Adam_1
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_1/weights/Adam
I Loading variable from checkpoint: layer_1/weights/Adam_1
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/bias/Adam
I Loading variable from checkpoint: layer_2/bias/Adam_1
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_2/weights/Adam
I Loading variable from checkpoint: layer_2/weights/Adam_1
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/bias/Adam
I Loading variable from checkpoint: layer_3/bias/Adam_1
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_3/weights/Adam
I Loading variable from checkpoint: layer_3/weights/Adam_1
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/bias/Adam
I Loading variable from checkpoint: layer_5/bias/Adam_1
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_5/weights/Adam
I Loading variable from checkpoint: layer_5/weights/Adam_1
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/bias/Adam
I Loading variable from checkpoint: layer_6/bias/Adam_1
I Loading variable from checkpoint: layer_6/weights
I Loading variable from checkpoint: layer_6/weights/Adam
I Loading variable from checkpoint: layer_6/weights/Adam_1
I Loading variable from checkpoint: learning_rate
I STARTING Optimization
I Training epoch 0…
I Finished training epoch 0 - loss: 357.920593
I Training epoch 1…
I Finished training epoch 1 - loss: 357.626068
I Training epoch 2…
I Finished training epoch 2 - loss: 357.542145
I Training epoch 3…
I Finished training epoch 3 - loss: 357.455444
I Training epoch 4…
I Finished training epoch 4 - loss: 357.503204
I FINISHED optimization in 0:00:04.511124
Running this on MX Linux, a Debian-based distro.
Graphics card: Nvidia GTX 1650
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
Please enable more verbose logging so we get CUDA feedback from TensorFlow.
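One way to get that CUDA feedback outside of the training script is to raise TensorFlow's C++ log level and ask it which devices it sees. This is a hedged sketch, not the DeepSpeech way of doing it: TF_CPP_MIN_LOG_LEVEL and device_lib.list_local_devices() are standard in TensorFlow 1.x, but the helper names below are invented for illustration, and whether DeepSpeech.py sets the variable itself is not covered here.

```python
import os


def enable_tf_cpp_logging(env=None):
    """Set TF_CPP_MIN_LOG_LEVEL=0 so TensorFlow's C++ layer prints INFO
    messages, including the lines about loading the CUDA libraries."""
    env = os.environ if env is None else env
    env["TF_CPP_MIN_LOG_LEVEL"] = "0"
    return env


def list_gpus():
    """Return the names of the GPU devices TensorFlow can see.
    Requires TensorFlow 1.x to be installed."""
    # The variable must be set before TensorFlow is imported.
    enable_tf_cpp_logging()
    from tensorflow.python.client import device_lib
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]
```

If list_gpus() comes back empty, the INFO lines printed during the import usually name the CUDA library that failed to load.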
$ python -u DeepSpeech.py --noshow_progressbar --train_files data/ldc93s1/ldc93s1.csv --train_batch_size 100 --n_hidden 100 --epochs 3 --bytes_output_mode --checkpoint_dir /home/anon/.local/share/deepspeech/ldc93s1 --verbosity 1
I Could not find best validating checkpoint.
I Loading most recent checkpoint from /home/anon/.local/share/deepspeech/ldc93s1/train-195
I Loading variable from checkpoint: beta1_power
I Loading variable from checkpoint: beta2_power
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Adam
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Adam_1
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel/Adam
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel/Adam_1
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/bias/Adam
I Loading variable from checkpoint: layer_1/bias/Adam_1
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_1/weights/Adam
I Loading variable from checkpoint: layer_1/weights/Adam_1
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/bias/Adam
I Loading variable from checkpoint: layer_2/bias/Adam_1
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_2/weights/Adam
I Loading variable from checkpoint: layer_2/weights/Adam_1
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/bias/Adam
I Loading variable from checkpoint: layer_3/bias/Adam_1
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_3/weights/Adam
I Loading variable from checkpoint: layer_3/weights/Adam_1
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/bias/Adam
I Loading variable from checkpoint: layer_5/bias/Adam_1
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_5/weights/Adam
I Loading variable from checkpoint: layer_5/weights/Adam_1
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/bias/Adam
I Loading variable from checkpoint: layer_6/bias/Adam_1
I Loading variable from checkpoint: layer_6/weights
I Loading variable from checkpoint: layer_6/weights/Adam
I Loading variable from checkpoint: layer_6/weights/Adam_1
I Loading variable from checkpoint: learning_rate
I STARTING Optimization
I Training epoch 0…
I Finished training epoch 0 - loss: 351.114716
I Training epoch 1…
I Finished training epoch 1 - loss: 350.875793
I Training epoch 2…
I Finished training epoch 2 - loss: 350.717255
Alright. However, the only reason I made this post was that I couldn't find much; the only other similar post was the one I linked. Anyway, I will try using the Docker file. Thank you!
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
If you had shared full text logs, instead of screenshots, I could have seen you are using CUDA 10.1 which IS NOT SUPPORTED BY TensorFlow 1.15.4. Please use proper setup as documented.
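A quick check of the installed CUDA toolkit version can catch this kind of mismatch before training. This is a minimal sketch, assuming the prebuilt TensorFlow 1.15 GPU packages expect CUDA 10.0; the function names and the EXPECTED_CUDA constant are illustrative, not something DeepSpeech or TensorFlow provides.

```python
import re

# Assumption: prebuilt tensorflow-gpu 1.15 packages are built against CUDA 10.0.
EXPECTED_CUDA = (10, 0)


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if
    no release number is found."""
    m = re.search(r"release (\d+)\.(\d+)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def cuda_matches(text, expected=EXPECTED_CUDA):
    """True if the nvcc output reports exactly the expected CUDA version."""
    return parse_nvcc_release(text) == expected
```

The text would typically come from subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout. Note that nvcc reports the toolkit on the PATH, which is not necessarily the set of libraries TensorFlow ends up loading, so the full training log is still the better evidence.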
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
@Shravan_Shetty it's really a mess: you don't share the full training logs with the CUDA info, so I can only infer that you are using CUDA 10.1 from your inference screenshots.