Okay, so I removed --filter_alphabet ./20140421/path/to/some/alphabet.txt
and now I get valid CSV output.
But when I run this:
python DeepSpeech.py --train_files ./20140421/scripts/Ib/clips/train.csv --dev_files ./20140421/scripts/Ib/clips/dev.csv --test_files ./20140421/scripts/Ib/clips/test.csv
I get the following error:
I Could not find best validating checkpoint.
I Loading most recent checkpoint from /home/ritish/.local/share/deepspeech/checkpoints/train-0
I Loading variable from checkpoint: beta1_power
I Loading variable from checkpoint: beta2_power
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Adam
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Adam_1
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel/Adam
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel/Adam_1
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/bias/Adam
I Loading variable from checkpoint: layer_1/bias/Adam_1
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_1/weights/Adam
I Loading variable from checkpoint: layer_1/weights/Adam_1
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/bias/Adam
I Loading variable from checkpoint: layer_2/bias/Adam_1
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_2/weights/Adam
I Loading variable from checkpoint: layer_2/weights/Adam_1
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/bias/Adam
I Loading variable from checkpoint: layer_3/bias/Adam_1
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_3/weights/Adam
I Loading variable from checkpoint: layer_3/weights/Adam_1
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/bias/Adam
I Loading variable from checkpoint: layer_5/bias/Adam_1
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_5/weights/Adam
I Loading variable from checkpoint: layer_5/weights/Adam_1
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: layer_6/bias/Adam
I Loading variable from checkpoint: layer_6/bias/Adam_1
I Loading variable from checkpoint: layer_6/weights
I Loading variable from checkpoint: layer_6/weights/Adam
I Loading variable from checkpoint: layer_6/weights/Adam_1
I Loading variable from checkpoint: learning_rate
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:00:00 | Steps: 0 | Loss: 0.000000
Traceback (most recent call last):
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/text.py", line 30, in _label_from_string
    return self._str_to_label[string]
KeyError: 'é'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/text.py", line 128, in text_to_char_array
    transcript = alphabet.encode(transcript)
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/text.py", line 44, in encode
    res.append(self._label_from_string(char))
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/text.py", line 36, in _label_from_string
    ).with_traceback(e.__traceback__)
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/text.py", line 30, in _label_from_string
    return self._str_to_label[string]
KeyError: "ERROR: Your transcripts contain characters (e.g. 'é') which do not occur in '/home/ritish/DeepSpeech/DeepSpeech/data/alphabet.txt'! Use util/check_characters.py to see what characters are in your [train,dev,test].csv transcripts, and then add all these to '/home/ritish/DeepSpeech/DeepSpeech/data/alphabet.txt'."
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 942, in run_script
    absl.app.run(main)
  File "/home/ritish/.local/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/ritish/.local/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 914, in main
    train()
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 592, in train
    train_loss, _ = run_set('train', epoch, train_init_op)
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/train.py", line 553, in run_set
    exception_box.raise_if_set()
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 117, in raise_if_set
    raise exception # pylint: disable = raising-bad-type
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/helpers.py", line 125, in do_iterate
    yield from iterable()
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/feeding.py", line 123, in generate_values
    transcript = text_to_char_array(sample.transcript, Config.alphabet, context=sample.sample_id)
  File "/home/ritish/DeepSpeech/DeepSpeech/training/deepspeech_training/util/text.py", line 136, in text_to_char_array
    raise ValueError('While processing: {}\n{}'.format(context, e))
ValueError: While processing: 20140421/scripts/Ib/clips/common_voice_ga-IE_17638591.wav
"ERROR: Your transcripts contain characters (e.g. 'é') which do not occur in '/home/ritish/DeepSpeech/DeepSpeech/data/alphabet.txt'! Use util/check_characters.py to see what characters are in your [train,dev,test].csv transcripts, and then add all these to '/home/ritish/DeepSpeech/DeepSpeech/data/alphabet.txt'."
Please help me get this fixed. I am not sure whether the --filter_alphabet option is necessary or not.
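For reference, the check that the error message asks for can be sketched roughly as below. This is not util/check_characters.py itself, only a minimal stand-in. It assumes the CSVs have a transcript column, and that alphabet.txt lists one character per line, with lines starting with # treated as comments and \# standing for a literal #. The function names load_alphabet and missing_characters are made up for this illustration.

```python
import csv
import io
from collections import Counter


def load_alphabet(lines):
    """Collect the characters from alphabet.txt-style lines (one per line)."""
    chars = set()
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("\\#"):
            chars.add("#")   # assumed escape for a literal '#'
        elif line.startswith("#") or line == "":
            continue         # comment or blank line
        else:
            chars.add(line)  # includes the single-space line
    return chars


def missing_characters(csv_file, alphabet):
    """Count transcript characters that are absent from the alphabet."""
    missing = Counter()
    for row in csv.DictReader(csv_file):
        for ch in row["transcript"]:  # assumed column name
            if ch not in alphabet:
                missing[ch] += 1
    return missing


# Tiny in-memory demo; for real data, pass
# open(path, encoding="utf-8", newline="") objects instead.
demo_alphabet = load_alphabet(["# comment\n", " \n", "a\n", "b\n", "c\n"])
demo_csv = io.StringIO("wav_filename,wav_filesize,transcript\nx.wav,1,abé c\n")
demo_missing = missing_characters(demo_csv, demo_alphabet)
print(dict(demo_missing))
```

Any character this reports for train.csv, dev.csv or test.csv would either have to be added to data/alphabet.txt on its own line, or the rows containing it would have to be left out of the CSVs.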