I have created a speech dataset to train with DeepSpeech while following this tutorial: https://medium.com/@klintcho/creating-an-open-speech-recognition-dataset-for-almost-any-language-c532fb2bc0cf
But I couldn't train my dataset with DeepSpeech. Running the training command:
python DeepSpeech.py --train_files /mnt/c/wsl/teneke_out_bolum1/
gives this error:
pandas.errors.ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.
I created the dataset after forced alignment with aeneas and fine-tuning with finetuneas.
Here is my code that I used on Google Colab to train with DeepSpeech:
I found some suggested solutions on Google, such as:
data = pd.read_csv('file1.csv', error_bad_lines=False)
Also, as the error output itself suggests, I might solve it by setting
engine='python'
But I couldn't figure out where I should make this change.
So, where should I edit to fix this issue?
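To show what I mean, here is a minimal, self-contained sketch of the engine switch as I understand it. The sample CSV below is a hypothetical stand-in following the DeepSpeech manifest column format; where exactly `pandas.read_csv` is called inside the DeepSpeech sources is what I can't locate:

```python
import io
import pandas as pd

# Hypothetical stand-in for one DeepSpeech manifest CSV; the real files
# are the ones passed to --train_files.
sample = io.StringIO(
    "wav_filename,wav_filesize,transcript\n"
    "clip_0001.wav,48000,hello world\n"
)

# engine='python' switches from the default C parser (which raised
# "Calling read(nbytes) on source failed" for me) to the slower but
# more tolerant pure-Python parser.
df = pd.read_csv(sample, engine='python')
print(df)
```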
Thanks.