ERROR: Inexistent --validate_label_locale specified

Hello. I’m starting my journey with DeepSpeech.

Language: Polish
DeepSpeech: 0.9.3
System: Ubuntu 20.04
Common Voice: pl_129h_2020-12-11

I managed to run the test training, then tried to train my model on Polish Common Voice.

python3 bin/import_cv2.py --validate_label_locale /home/validate_label_pl.py --filter_alphabet /home/alphabet.txt /home/utomek/Polskids/cv-corpus-6.1-2020-12-11/pl 

This is the command I used.

And here is the output:

Loading TSV file:  /home/utomek/Polskids/cv-corpus-6.1-2020-12-11/pl/test.tsv

Importing mp3 files…
ERROR: Inexistent --validate_label_locale specified. Please check.
Process ForkPoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 103, in worker
    initializer(*initargs)
  File "bin/import_cv2.py", line 54, in init_worker
    alphabet = Alphabet(params.filter_alphabet) if params.filter_alphabet else None
  File "/home/utomek/tmp/deepspeech-train-venv/lib/python3.6/site-packages/ds_ctcdecoder/__init__.py", line 47, in __init__
    raise ValueError('Alphabet initialization failed with error code 0x{:X}'.format(err))
ValueError: Alphabet initialization failed with error code 0x1
ERROR: Inexistent --validate_label_locale specified. Please check.
Process ForkPoolWorker-2:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 103, in worker
    initializer(*initargs)
  File "bin/import_cv2.py", line 54, in init_worker
    alphabet = Alphabet(params.filter_alphabet) if params.filter_alphabet else None
  File "/home/utomek/tmp/deepspeech-train-venv/lib/python3.6/site-packages/ds_ctcdecoder/__init__.py", line 47, in __init__
    raise ValueError('Alphabet initialization failed with error code 0x{:X}'.format(err))
ValueError: Alphabet initialization failed with error code 0x1
ERROR: Inexistent --validate_label_locale specified. Please check.
Process ForkPoolWorker-3:

I have a Polish alphabet file containing the Polish letters, and this is my validate_label_pl.py:

def validate_label(label):
    if 'a' in label:  # disallow labels with 'a'
        return None
    return label.lower()  # lower-case valid labels

I'm not sure why it says my file is “Inexistent”. The alphabet.txt, validate_label_pl.py, and Common Voice files are all located inside my home directory. I tried my best to follow the documentation and Discourse threads like this discussion.

This is just a side-effect of failing to load validate_label_pl.py

Looks like this is your problem. No idea why.

Can it be related to my alphabet file?
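
One way to rule the alphabet file in or out is to read it yourself before handing it to the importer. The snippet below is only a sketch: `read_alphabet` and `missing_labels` are hypothetical helper names, not part of DeepSpeech, and treating lines that start with `#` as comments is an assumption about the alphabet file format. It checks that the file exists, that it decodes as UTF-8, and which characters of a sample sentence are not covered.

```python
import os


def read_alphabet(path):
    """Return the labels found in an alphabet file, one per line.

    Assumption: lines starting with '#' are comments and empty lines are
    ignored. A line holding a single space counts as the space label.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError("alphabet file not found: " + os.path.abspath(path))
    labels = []
    # open() raises UnicodeDecodeError if the file is not valid UTF-8
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")  # keep a lone space, drop only the newline
            if line == "" or line.startswith("#"):
                continue
            labels.append(line)
    return labels


def missing_labels(text, labels):
    """Return the characters of `text` that do not appear in `labels`."""
    known = set(labels)
    return sorted({ch for ch in text if ch not in known})
```

For example, `missing_labels("zażółć gęślą jaźń", read_alphabet("/home/alphabet.txt"))` lists any Polish letters the file does not cover. If `read_alphabet` already fails with `FileNotFoundError`, the path is the problem rather than the file contents.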

OK, it looks like I somehow got the path wrong…
After fixing that, I installed sox and successfully imported the files.

sudo apt-get install sox libsox-fmt-mp3
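
Since the cause here turned out to be a wrong path, a small standalone check can confirm that the label validator is where you think it is before running the importer. This is a minimal sketch, not the importer's own code: `load_validate_label` is a hypothetical helper, and it assumes the file defines a function named `validate_label`, as in the post above. Note that the command above passes /home/validate_label_pl.py while the corpus lives under /home/utomek/, which may be the kind of mismatch to look for.

```python
import importlib.util
import os


def load_validate_label(path):
    """Load validate_label() from a Python file given by path.

    Raises FileNotFoundError with the resolved absolute path, so a typo in
    the path is easy to spot.
    """
    resolved = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(resolved):
        raise FileNotFoundError("no such file: " + resolved)
    # Import the file as a module without touching sys.path
    spec = importlib.util.spec_from_file_location("validate_label_locale", resolved)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # surfaces syntax/indentation errors in the file
    return module.validate_label
```

Calling `load_validate_label("/home/validate_label_pl.py")` from the same shell and directory you run the import from will either return the function or report the exact path that could not be found.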