Need to create Arabic models

limino · July 14, 2020, 1:21pm

I am trying to create an Arabic audio model using about 80,000 audio files.

This is the error I get when I try to train. Is something wrong with the parameters I'm passing, or is it something else?

python -u DeepSpeech.py --train_files /Users/naveen/Downloads/all_datasets/DeepSpeech/TRAIN/train.csv --dev_files /Users/naveen/Downloads/all_datasets/DeepSpeech/DEV/dev.csv --test_files /Users/naveen/Downloads/all_datasets/DeepSpeech/TEST/test.csv --train_batch_size 80 --dev_batch_size 80 --test_batch_size 40 --n_hidden 375 --epoch 33 --validation_step 1 --early_stop True --earlystop_nsteps 6 --estop_mean_thresh 0.1 --estop_std_thresh 0.1 --dropout_rate 0.22 --learning_rate 0.00095 --report_count 100 --use_seq_length False --export_dir /Users/naveen/Downloads/all_datasets/DeepSpeech/results/model_export/ --checkpoint_dir /Users/naveen/Downloads/all_datasets/DeepSpeech/results/checkout/ --decoder_library_path /Users/naveen/Downloads/DeepSpeech/DeepSpeech/libctc_decoder_with_kenlm.so --alphabet_config_path /Users/naveen/Downloads/all_datasets/DeepSpeech/alphabet.txt --lm_binary_path /Users/naveen/Downloads/all_datasets/DeepSpeech/lm.binary --lm_trie_path /Users/naveen/Downloads/all_datasets/DeepSpeech/trie
/Users/naveen/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "/Users/naveen/Downloads/DeepSpeech/DeepSpeech/util/audio.py", line 7, in <module>
    from deepspeech.utils import audioToInputVector
ModuleNotFoundError: No module named 'deepspeech'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "DeepSpeech.py", line 24, in <module>
    from util.audio import audiofile_to_input_vector
  File "/Users/naveen/Downloads/DeepSpeech/DeepSpeech/util/audio.py", line 10, in <module>
    from python_speech_features import mfcc
ModuleNotFoundError: No module named 'python_speech_features'
Deepak’s MacBook Pro:DeepSpeech naveen$ pip install deepspeech
Collecting deepspeech
Using cached https://files.pythonhosted.org/packages/14/c9/e969fbdaac6b2ce7a0fc4c24f0bc96ab4aaaac0e5c0be85f0dceb90c6fb9/deepspeech-0.1.1-cp36-cp36m-macosx_10_10_x86_64.whl
Requirement already satisfied: scipy in /Users/naveen/anaconda3/lib/python3.6/site-packages (from deepspeech)
Requirement already satisfied: numpy in /Users/naveen/anaconda3/lib/python3.6/site-packages (from deepspeech)
Installing collected packages: deepspeech
Successfully installed deepspeech-0.1.1
You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Deepak’s MacBook Pro:DeepSpeech naveen$ sh run_file.sh

othiele · July 15, 2020, 7:46am

This doesn't look right. Start over in a fresh virtual environment. The current deepspeech release should be 0.7.4, but pip installed 0.1.1, so something is wrong with how you are getting packages from PyPI.
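
As a rough illustration of the checks suggested here, the following sketch reports whether the interpreter is running inside a virtual environment, which of the two modules named in the traceback are missing, and which deepspeech version is installed. It is only a sketch under assumptions: it needs Python 3.8 or newer (for importlib.metadata), and the helper names in_virtualenv, missing_modules, and installed_version are made up for this example and are not part of DeepSpeech.

```python
import importlib.util
import sys
from importlib import metadata  # assumption: Python 3.8 or newer


def in_virtualenv():
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # In a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_modules(names):
    """Return the top-level module names that cannot be found here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The two modules below are the ones the traceback reported as missing.
    print("virtual environment:", in_virtualenv())
    print("missing modules:", missing_modules(["deepspeech", "python_speech_features"]))
    print("deepspeech version:", installed_version("deepspeech"))
```

If this reports that no virtual environment is active, or shows a deepspeech version far behind the current release, that matches the symptoms in the log above.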

lissyx · July 15, 2020, 9:33am

Please read the documentation. The deepspeech pip package is for inference, but you want to perform training.

deepspeech 0.1.1 is very old code. Please use the current version.

You won’t be able to perform any serious training on macOS since there is no CUDA support there for TensorFlow.

khalilrhouma · November 24, 2020, 1:46pm

@limino Is the Arabic dataset publicly available or private?
If it is public, could you please share the link?
