Hello guys,
I am fine-tuning DeepSpeech 0.9.2 for Indian-subcontinent English. I have collected 10 hours of male speech and 5 hours of female speech for bn_IN, and 7 hours of both male and female English speech recordings for hi_IN. I chose those two groups specifically because their accents are closest to my target audience.
Fine-tuning has not been easy at all; I keep hitting roadblocks. The alphabet characters were a big pain point too. Although the sentences were in the English alphabet, check_characters.py falsely flagged a lot of characters as missing from the alphabet. I replaced all the missing characters using this code and ignored the check_characters.py report.
import re

def replace_func(text):
    # Spell out ampersands, then drop everything except a-z, apostrophes and whitespace.
    text = text.replace('&', "and")
    return re.sub(r"[^a-z'\s]", "", text)
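Since the regex above only keeps lowercase a-z, any capital letters are deleted rather than lowercased, which is one possible reason for characters being reported as missing. The following is a minimal sketch, not the check that check_characters.py performs: it lowercases before cleaning and lists whatever characters still fall outside the alphabet. The names `ALLOWED`, `normalize` and `unexpected_chars` are hypothetical, and the alphabet (a-z, apostrophe, space) is an assumption taken from the regex.

```python
import re

# Assumed alphabet: lowercase a-z, apostrophe and space, mirroring the regex above.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz' ")

def normalize(text):
    """Lowercase first so capital letters are kept rather than deleted."""
    text = text.lower().replace("&", " and ")
    text = re.sub(r"[^a-z'\s]", "", text)
    # Collapse runs of whitespace (tabs, newlines, double spaces) into single spaces.
    return re.sub(r"\s+", " ", text).strip()

def unexpected_chars(text):
    """Return the sorted characters in text that are outside ALLOWED."""
    return sorted(set(text) - ALLOWED)

if __name__ == "__main__":
    raw = "Tom & Jerry's\tShow!"
    print(unexpected_chars(raw))             # characters the alphabet would reject
    print(normalize(raw))                    # cleaned transcript
    print(unexpected_chars(normalize(raw)))  # should be empty after cleaning
```

Running `unexpected_chars` over every transcript in the CSVs before and after cleaning would show exactly which characters are being flagged.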
Because training takes so long, I have cut down the data and am now using only the bn_IN data (15 hours). I am now facing a completely different type of issue: after the first epoch, training crashes with "double free or corruption (out)". I could not find a proper fix for this. Can anyone knowledgeable please help me out here?
Epoch 1 | Training | Elapsed Time: 0:30:14 | Steps: 5381 | Loss: 174.223625
WARNING: sample rate of sample "/kaggle/tmp/bengali_female_english/wav/f0088.wav" ( 48000 ) does not match FLAGS.audio_sample_rate. This can lead to incorrect results.
double free or corruption (out)
Fatal Python error: Aborted
Thread 0x00007fe5137fe700 (most recent call first):
File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 379 in _recv
File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 407 in _recv_bytes
File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 250 in recv
File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 470 in _handle_results
File "/opt/conda/lib/python3.7/threading.py", line 870 in run
File "/opt/conda/lib/python3.7/threading.py", line 926 in _bootstrap_inner
File "/opt/conda/lib/python3.7/threading.py", line 890 in _bootstrap
Thread 0x00007fe512ffd700 (most recent call first):
File "/kaggle/tmp/DeepSpeech/training/deepspeech_training/util/helpers.py", line 97 in _limit
File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 292 in _guarded_task_generation
File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 426 in _handle_tasks
File "/opt/conda/lib/python3.7/threading.py", line 870 in run
File "/opt/conda/lib/python3.7/threading.py", line 926 in _bootstrap_inner
File "/opt/conda/lib/python3.7/threading.py", line 890 in _bootstrap
Thread 0x00007fe513fff700 (most recent call first):
File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 413 in _handle_workers
File "/opt/conda/lib/python3.7/threading.py", line 870 in run
File "/opt/conda/lib/python3.7/threading.py", line 926 in _bootstrap_inner
File "/opt/conda/lib/python3.7/threading.py", line 890 in _bootstrap
Thread 0x00007fe5977fe700 (most recent call first):
File "/opt/conda/lib/python3.7/threading.py", line 296 in wait
File "/opt/conda/lib/python3.7/queue.py", line 170 in get
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/summary/writer/event_file_writer.py", line 159 in run
File "/opt/conda/lib/python3.7/threading.py", line 926 in _bootstrap_inner
File "/opt/conda/lib/python3.7/threading.py", line 890 in _bootstrap
Thread 0x00007fe597fff700 (most recent call first):
File "/opt/conda/lib/python3.7/threading.py", line 296 in wait
File "/opt/conda/lib/python3.7/queue.py", line 170 in get
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/summary/writer/event_file_writer.py", line 159 in run
File "/opt/conda/lib/python3.7/threading.py", line 926 in _bootstrap_inner
File "/opt/conda/lib/python3.7/threading.py", line 890 in _bootstrap
Thread 0x00007fe59cd1d700 (most recent call first):
File "/opt/conda/lib/python3.7/threading.py", line 296 in wait
File "/opt/conda/lib/python3.7/queue.py", line 170 in get
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/summary/writer/event_file_writer.py", line 159 in run
File "/opt/conda/lib/python3.7/threading.py", line 926 in _bootstrap_inner
File "/opt/conda/lib/python3.7/threading.py", line 890 in _bootstrap
Thread 0x00007fe6e28d2740 (most recent call first):
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443 in _call_tf_sessionrun
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350 in _run_fn
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365 in _do_call
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359 in _do_run
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180 in _run
File "/opt/conda/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956 in run
File "/kaggle/tmp/DeepSpeech/training/deepspeech_training/train.py", line 570 in run_set
File "/kaggle/tmp/DeepSpeech/training/deepspeech_training/train.py", line 605 in train
File "/kaggle/tmp/DeepSpeech/training/deepspeech_training/train.py", line 948 in main
File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 251 in _run_main
File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 303 in run
File "/kaggle/tmp/DeepSpeech/training/deepspeech_training/train.py", line 976 in run_script
File "./DeepSpeech.py", line 12 in <module>
Aborted (core dumped)
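The warning at the top of the log says at least one file (f0088.wav) is 48000 Hz while the trainer expects a different rate. Whether this is related to the crash is not established, but it is easy to check how many files are affected. Below is a minimal sketch using only the standard library; `TARGET_RATE = 16000` is an assumption (the usual DeepSpeech default for audio_sample_rate), and the function names are hypothetical.

```python
import wave
from pathlib import Path

# Assumption: the trainer expects 16 kHz audio; change this if your
# audio_sample_rate flag is set to something else.
TARGET_RATE = 16000

def wav_sample_rate(path):
    """Read the sample rate from the WAV header without loading the audio."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate()

def find_mismatched(wav_dir, target_rate=TARGET_RATE):
    """Return (path, rate) pairs for WAV files whose rate differs from target_rate."""
    mismatched = []
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        rate = wav_sample_rate(path)
        if rate != target_rate:
            mismatched.append((str(path), rate))
    return mismatched

if __name__ == "__main__":
    # Directory taken from the warning in the log above.
    for path, rate in find_mismatched("/kaggle/tmp/bengali_female_english/wav"):
        print(f"{path}: {rate} Hz")
```

Any files it reports would need to be resampled to the target rate before training, or the sample-rate flag changed to match the data.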
!python3 util/taskcluster.py --source tensorflow --artifact convert_graphdef_memmapped_format --branch r1.15 --target .
!./convert_graphdef_memmapped_format --in_graph=/kaggle/working/models/ft_model.pb --out_graph=/kaggle/working/models/ft_model.pbmm
Downloading https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r1.15.cpu/artifacts/public/convert_graphdef_memmapped_format ...
Downloading: 100%
2020-12-05 01:30:20.906529: E tensorflow/contrib/util/convert_graphdef_memmapped_format.cc:79] Conversion failed Failed to load graph at '/kaggle/working/models/ft_model.pb' : /kaggle/working/models/ft_model.pb; No such file or directory
These were the training parameters. I did not specify a dropout or learning rate because I was not sure what good values would be.
!python3 ./DeepSpeech.py --train_cudnn True --early_stop True --es_epochs 3 --n_hidden 2048 --epochs 5 --export_dir /kaggle/working/models/ --checkpoint_dir /kaggle/tmp/model_checkpoints/ --train_files /kaggle/tmp/train.csv --dev_files /kaggle/tmp/dev.csv --test_files /kaggle/tmp/test.csv --export_file_name 'ft_model' --augment reverb[p=0.2,delay=50.0~30.0,decay=10.0:2.0~1.0] --augment volume[p=0.2,dbfs=-10:-40] --augment pitch[p=0.2,pitch=1~0.2] --augment tempo[p=0.2,factor=1~0.5]
I have also shared the full code I used to train the model. If anyone has any suggestions, please let me know. I am new to this, and any help is really appreciated.
NB: I used Kaggle for training.