Hi,
I’m training a Portuguese model and I’m getting the error below right after training finishes, when the test phase starts:
I Saved new best validating model with loss 97.882270 to: <checkpoint_dir>\best_dev-2516
--------------------------------------------------------------------------------
I FINISHED optimization in 0:23:12.610174
2020-08-18 17:42:47.411058: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed to get device attribute 13 for device 0: CUDA_ERROR_UNKNOWN: unknown error
Fatal Python error: Aborted
Thread 0x00001a44 (most recent call first):
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 295 in wait
File "<...>\appdata\local\programs\python\python36\lib\queue.py", line 164 in get
File "<...>\virtualenv\<virtualenv>\lib\site-packages\tensorflow_core\python\summary\writer\event_file_writer.py", line 159 in run
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 916 in _bootstrap_inner
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 884 in _bootstrap
Thread 0x00001b0c (most recent call first):
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 295 in wait
File "<...>\appdata\local\programs\python\python36\lib\queue.py", line 164 in get
File "<...>\virtualenv\<virtualenv>\lib\site-packages\tensorflow_core\python\summary\writer\event_file_writer.py", line 159 in run
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 916 in _bootstrap_inner
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 884 in _bootstrap
Thread 0x00001a54 (most recent call first):
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 295 in wait
File "<...>\appdata\local\programs\python\python36\lib\queue.py", line 164 in get
File "<...>\virtualenv\<virtualenv>\lib\site-packages\tensorflow_core\python\summary\writer\event_file_writer.py", line 159 in run
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 916 in _bootstrap_inner
File "<...>\appdata\local\programs\python\python36\lib\threading.py", line 884 in _bootstrap
Current thread 0x00004238 (most recent call first):
File "<...>\virtualenv\<virtualenv>\lib\site-packages\tensorflow_core\python\client\session.py", line 699 in __init__
File "<...>\virtualenv\<virtualenv>\lib\site-packages\tensorflow_core\python\client\session.py", line 1585 in __init__
File "<deepspeech_dir>\DeepSpeech\training\mozilla_voice_stt_training\evaluate.py", line 86 in evaluate
File "<deepspeech_dir>\DeepSpeech\training\mozilla_voice_stt_training\train.py", line 665 in test
File "<deepspeech_dir>\DeepSpeech\training\mozilla_voice_stt_training\train.py", line 937 in main
File "<...>\virtualenv\<virtualenv>\lib\site-packages\absl\app.py", line 250 in _run_main
File "<...>\virtualenv\<virtualenv>\lib\site-packages\absl\app.py", line 299 in run
File "<deepspeech_dir>\DeepSpeech\training\mozilla_voice_stt_training\train.py", line 961 in run_script
File "DeepSpeech.py", line 12 in <module>
The .bat file I use to launch training:
python DeepSpeech.py ^
--alphabet_config_path D:\Pedro\EtherCity\deepspeech-test\cv-corpus-5.1-2020-06-22\pt\alphabet.txt ^
--train_files D:\Pedro\EtherCity\deepspeech-test\cv-corpus-5.1-2020-06-22\pt\clips\train-all.csv ^
--dev_files D:\Pedro\EtherCity\deepspeech-test\cv-corpus-5.1-2020-06-22\pt\clips\dev.csv ^
--test_files D:\Pedro\EtherCity\deepspeech-test\cv-corpus-5.1-2020-06-22\pt\clips\test.csv ^
--train_batch_size 80 ^
--dev_batch_size 80 ^
--test_batch_size 40 ^
--n_hidden 375 ^
--epochs 1 ^
--early_stop True ^
--dropout_rate 0.22 ^
--learning_rate 0.00095 ^
--report_count 100 ^
--export_dir D:\Pedro\EtherCity\deepspeech-test\ptModel\results\model_export/ ^
--checkpoint_dir D:\Pedro\EtherCity\deepspeech-test\ptModel\results\checkpoint
My virtualenv pip freeze:
absl-py==0.9.0
astor==0.8.1
attrdict==2.0.1
certifi==2020.6.20
chardet==3.0.4
gast==0.2.2
google-pasta==0.2.0
grpcio==1.31.0
h5py==2.10.0
idna==2.10
importlib-metadata==1.7.0
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.2
Markdown==3.2.2
mozilla-voice-stt-tflite==0.9.0a6
mvs-ctcdecoder==0.9.0a6
numpy==1.16.0
opt-einsum==3.3.0
pandas==0.25.3
progressbar2==3.47.0
protobuf==3.12.4
python-dateutil==2.8.1
python-utils==2.3.0
pytz==2020.1
pyxdg==0.26
requests==2.24.0
semver==2.10.2
six==1.13.0
sox==1.4.0
tensorboard==1.15.0
tensorflow-estimator==1.15.1
tensorflow-gpu==1.15.2
termcolor==1.1.0
urllib3==1.25.10
webrtcvad==2.0.10
Werkzeug==1.0.1
wrapt==1.12.1
zipp==3.1.0
I’m using CUDA 10.1 with the cuDNN build for CUDA 10.1 (cudnn64_7.dll).
How can I check what’s missing here?
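For reference, here is a minimal sketch of the kind of check I have in mind. It only tries to load the NVIDIA libraries by name with ctypes and reports which ones Windows can find. The DLL names in the list are my assumption: as far as I know the prebuilt tensorflow-gpu 1.15 wheels target CUDA 10.0 (so they would look for cudart64_100.dll), while CUDA 10.1 installs cudart64_101.dll. The helper name probe_libraries is made up for this post.

```python
import ctypes

# Assumed DLL names (not taken from the log above):
# - nvcuda.dll is the NVIDIA driver library on Windows
# - cudart64_100.dll is the CUDA 10.0 runtime, cudart64_101.dll the CUDA 10.1 runtime
# - cudnn64_7.dll is the cuDNN 7 library mentioned in this post
CANDIDATE_DLLS = [
    "nvcuda.dll",
    "cudart64_100.dll",
    "cudart64_101.dll",
    "cudnn64_7.dll",
]


def probe_libraries(names):
    """Return {name: bool}, True when the library could be loaded by name."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # uses the normal OS search path (PATH on Windows)
            results[name] = True
        except OSError:
            # Raised when the library is missing or one of its dependencies is
            results[name] = False
    return results


if __name__ == "__main__":
    for name, found in probe_libraries(CANDIDATE_DLLS).items():
        print("{:<20} {}".format(name, "found" if found else "NOT found"))
```

A library reported as "NOT found" would only mean it is not on the search path of this Python process. It would not by itself explain the CUDA_ERROR_UNKNOWN above, but it might show whether a CUDA version mismatch is worth looking into.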
Thanks in advance for any kind of support.