Training on Common Voice Error - Error after importing

lissyx · September 16, 2019, 7:50am

An RTX is useless if it can’t be fed data fast enough. We’ve had reports from people on specific PCIe configurations seeing huge slowdowns depending on TensorFlow, CUDA, or our model changes (we don’t know which of these has an impact). So I’m just warning people.

Also, a real model really needs more than just one RTX GPU if you have a serious amount of data. With ~250h of French, it takes ~4h to train a model on 2x RTX 2080 Ti.

lissyx · September 16, 2019, 7:51am

Please make an effort and use proper code formatting. Some important information might be mangled by the Markdown parsing.

That’s indeed the case here: your error is incomplete, so I cannot help you.

reyxuan · September 16, 2019, 9:16am

@lissyx Thanks for the warning. I think my PCIe configuration is fine. For the moment I have ~250h and an RTX 2080 Super.

> With ~250h of French, it takes ~4h to train a model on 2x RTX 2080 Ti.

Do you mean training a model or training an epoch?

lissyx · September 16, 2019, 11:13am

One model. Please check https://github.com/Common-Voice/commonvoice-fr/blob/master/DeepSpeech/Dockerfile.train.fr for details.

JohnWayne · September 17, 2019, 12:17am

Apologies for that:

The code:
(deepspeech-train-venv) chabani@chabani-VirtualBox:~/DeepSpeech/DeepSpeech$ bin/import_cv2.py --filter_alphabet alphabet.txt /media/sf_en/

/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint8 = np.dtype([(“qint8”, np.int8, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_quint8 = np.dtype([(“quint8”, np.uint8, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint16 = np.dtype([(“qint16”, np.int16, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_quint16 = np.dtype([(“quint16”, np.uint16, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint32 = np.dtype([(“qint32”, np.int32, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
np_resource = np.dtype([(“resource”, np.ubyte, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint8 = np.dtype([(“qint8”, np.int8, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_quint8 = np.dtype([(“quint8”, np.uint8, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint16 = np.dtype([(“qint16”, np.int16, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_quint16 = np.dtype([(“quint16”, np.uint16, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint32 = np.dtype([(“qint32”, np.int32, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
np_resource = np.dtype([(“resource”, np.ubyte, 1)])
Loading TSV file: /media/sf_en/train.tsv
Saving new DeepSpeech-formatted CSV file to: /media/sf_en/clips/train.csv
Importing mp3 files…

Traceback (most recent call last):
File “bin/import_cv2.py”, line 166, in
_preprocess_data(PARAMS.tsv_dir, AUDIO_DIR, label_filter_fun, PARAMS.space_after_every_character)
File “bin/import_cv2.py”, line 43, in _preprocess_data
_maybe_convert_set(input_tsv, audio_dir, label_filter, space_after_every_character)
File “bin/import_cv2.py”, line 100, in _maybe_convert_set
bar = progressbar.ProgressBar(max_value=num_samples, widgets=SIMPLE_BAR)
TypeError: init() got an unexpected keyword argument ‘max_value’

Hope the posting is appropriate. This is the output I receive from the terminal.

lissyx · September 17, 2019, 7:53am

Please use proper code formatting; this is unreadable and the Markdown parser is eating important Python information.

reyxuan · September 17, 2019, 8:20am

@JohnWayne Please use ``` your code ```.

JohnWayne · September 17, 2019, 3:04pm

Okay, hope it’s readable now.

```
(deepspeech-train-venv) chabani@chabani-VirtualBox:~/DeepSpeech/DeepSpeech$ bin/import_cv2.py --filter_alphabet alphabet.txt /media/sf_en/

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/chabani/tmp/deepspeech-train-venv/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Loading TSV file:  /media/sf_en/train.tsv
Saving new DeepSpeech-formatted CSV file to:  /media/sf_en/clips/train.csv
Importing mp3 files...
Traceback (most recent call last):
  File "bin/import_cv2.py", line 166, in <module>
    _preprocess_data(PARAMS.tsv_dir, AUDIO_DIR, label_filter_fun, PARAMS.space_after_every_character)
  File "bin/import_cv2.py", line 43, in _preprocess_data
    _maybe_convert_set(input_tsv, audio_dir, label_filter, space_after_every_character)
  File "bin/import_cv2.py", line 100, in _maybe_convert_set
    bar = progressbar.ProgressBar(max_value=num_samples, widgets=SIMPLE_BAR)
TypeError: __init__() got an unexpected keyword argument 'max_value'
```

Placed the inverted commas as @reyxuan suggested. It’s quite different on Linux from Mac. Hope it works now.

lissyx · September 17, 2019, 12:43pm

No, it’s still the same; you have used the wrong ones.

nmstoker · September 17, 2019, 1:30pm

@JohnWayne you may have missed it, but you can also typically edit a post.
That way you avoid a whole repost; you just need to go in and type the correct character. My guess is that you’ve somehow got inverted commas that are “smart” (i.e. adjusted for opening and closing quotes), and those are not the ones to use.
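
For reference, the fence characters are plain backticks (U+0060), not quotation marks of any kind. Below is a minimal Python sketch of wrapping a log in such a fence before pasting it. `fence` is a hypothetical helper written for this illustration, not part of any forum tooling, and making the fence longer than any backtick run already in the text is a deliberately conservative choice.

```python
import re


def fence(text: str) -> str:
    """Wrap text in a Markdown code fence built from plain backticks."""
    # Longest run of backticks already present in the text (0 if none).
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    # Use at least three backticks, and more than any run inside the text,
    # so nothing in the log can be mistaken for the closing fence.
    marker = "`" * max(3, longest + 1)
    return f"{marker}\n{text}\n{marker}"


log = "TypeError: __init__() got an unexpected keyword argument 'max_value'"
print(fence(log))
```

Pasting the printed result keeps underscores and straight quotes intact, because Markdown does not interpret anything inside a fenced block.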

JohnWayne · September 17, 2019, 3:05pm

Oh, thanks for the help, and thanks to @reyxuan too. Got the formatting done correctly.

lissyx · September 18, 2019, 7:52am

Have you properly set up your virtualenv? It looks like you have an incompatible progressbar …

JohnWayne · September 18, 2019, 10:41pm

I tried to set up my virtualenv again but still had the same issue. A user on Stack Overflow pointed out that progressbar2 accepts max_value, while the plain progressbar package (the one I had on GNU/Linux) accepts maxval.

It works now after installing progressbar2.
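
For anyone hitting the same TypeError: both packages are imported as `progressbar`, so the import itself does not tell you which one is installed. The sketch below inspects the constructor to see which keyword it accepts. `make_bar` is a hypothetical helper for illustration only; the fix that worked in this thread was simply installing progressbar2, not patching the importer.

```python
import inspect


def make_bar(bar_cls, total, **kwargs):
    """Construct bar_cls with whichever 'maximum' keyword its __init__ accepts."""
    params = inspect.signature(bar_cls.__init__).parameters
    if "max_value" in params:   # progressbar2 style
        return bar_cls(max_value=total, **kwargs)
    if "maxval" in params:      # older progressbar style
        return bar_cls(maxval=total, **kwargs)
    raise TypeError(f"{bar_cls.__name__} accepts neither max_value nor maxval")


if __name__ == "__main__":
    try:
        import progressbar  # either distribution provides this module name
    except ImportError:
        print("no progressbar package is installed in this environment")
    else:
        names = inspect.signature(progressbar.ProgressBar.__init__).parameters
        print("accepts max_value:", "max_value" in names)
        print("accepts maxval:", "maxval" in names)
```

If the first line printed is `False`, the environment has the older package, and the call on line 100 of `bin/import_cv2.py` will fail as shown in the traceback.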

Thank you for the help

lissyx · September 19, 2019, 7:58am

Exactly like I said: requirements.txt lists progressbar2, which confirms your virtualenv was not installed correctly.

JohnWayne · September 19, 2019, 12:07pm

My apologies, I didn’t read that properly in the documentation.

Lastly, my terminal hangs when importing the mp3 files. I have tried the process multiple times and it hangs at the same point.

Any idea why this would happen? I didn’t have this issue in the macOS terminal.

```
Loading TSV file:  /media/sf_en/train.tsv
Saving new DeepSpeech-formatted CSV file to:  /media/sf_en/clips/train.csv
Importing mp3 files...
Progress |#################################################### |  98% completedWriting CSV file for DeepSpeech.py as:  /media/sf_en/clips/train.csv
Progress |#####################################################| 100% completed
Imported 12123 samples.
Skipped 103 samples that failed on transcript validation.
Skipped 12 samples that were longer than 10 seconds.
Final amount of imported audio: 14:52:21.
Loading TSV file:  /media/sf_en/test.tsv
Saving new DeepSpeech-formatted CSV file to:  /media/sf_en/clips/test.csv
Importing mp3 files...
Progress |#################################################### |  98% completedWriting CSV file for DeepSpeech.py as:  /media/sf_en/clips/test.csv
Progress |#####################################################| 100% completed
Imported 6810 samples.
Skipped 360 samples that failed on transcript validation.
Skipped 206 samples that were longer than 10 seconds.
Final amount of imported audio: 10:21:17.
Loading TSV file:  /media/sf_en/dev.tsv
Saving new DeepSpeech-formatted CSV file to:  /media/sf_en/clips/dev.csv
Importing mp3 files...
Progress |###########################################          |  82% completed
```

Stops at the same % all the time.

lissyx · September 19, 2019, 12:17pm

No idea, sorry, this is not something we experienced.

reyxuan · September 19, 2019, 3:10pm

@JohnWayne Try printing which mp3 is being read so you know which one the script is failing on, just to make sure the problem is not the file itself.
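
Something along these lines would do it. This is only a sketch of the idea, not the importer's actual code: `convert_all` and `convert_one` are hypothetical names, and `convert_one` stands in for whatever function converts a single clip.

```python
import sys


def convert_all(mp3_paths, convert_one):
    """Run convert_one on each path, logging the path first so a hang can be traced."""
    paths = list(mp3_paths)
    failed = []
    for index, path in enumerate(paths, start=1):
        # flush=True makes the line appear immediately, so if convert_one
        # never returns, the last line printed names the offending file.
        print(f"[{index}/{len(paths)}] converting {path}", file=sys.stderr, flush=True)
        try:
            convert_one(path)
        except Exception as exc:
            print(f"failed on {path}: {exc}", file=sys.stderr, flush=True)
            failed.append(path)
    return failed


if __name__ == "__main__":
    def fake_convert(path):
        # Stand-in converter: pretend one clip is corrupt.
        if path.endswith("bad.mp3"):
            raise ValueError("corrupt file")

    print(convert_all(["a.mp3", "bad.mp3", "c.mp3"], fake_convert))
```

If the last logged path is the same on every run, the file is the likely culprit; if it changes between runs, the cause is probably elsewhere (for example the shared folder or the VM).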

JohnWayne · September 23, 2019, 12:36pm

Thank you, issue is sorted now.

lissyx · September 23, 2019, 2:32pm

Could you please share more details? It can be useful if others run into the same issue.

JohnWayne · September 24, 2019, 3:38am

I rebooted the system and ran the code again. It worked properly afterwards.