German dataset doesn't work for training

I’m trying to train a German model with DeepSpeech. But after downloading the German dataset from the website, I noticed that the folder names and the whole structure of the tsv files differ from the English ones, which I downloaded with the import_cv.py script (those were even csv files).

I tried to unpack the German dataset with the import_cv.py script, hoping it would reformat the files and folders like it did with the English dataset. But it didn’t.

Is there some kind of converter, or do I have to write one myself? And why aren’t the files already in the right form so DeepSpeech.py can use them?

Did you try using import_cv2.py?
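
In case it helps to see what the conversion roughly involves, here is a minimal sketch. It is not the code from import_cv2.py, only an illustration under some assumptions: that the tsv files have "path" and "sentence" columns, that the clips have already been converted from MP3 to WAV in a clips folder (the real importer handles the audio conversion itself), and that the training csv uses the columns wav_filename, wav_filesize and transcript. The function name tsv_to_deepspeech_csv is made up for this example.

```python
import csv
import os


def tsv_to_deepspeech_csv(tsv_path, clips_dir, csv_path):
    """Map Common Voice style tsv rows to DeepSpeech style csv rows.

    Hypothetical helper for illustration only. Assumes the WAV files
    already exist in clips_dir; rows without a WAV file are skipped.
    Returns the number of rows written.
    """
    written = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        # Sentences may contain quote characters, so read them literally.
        reader = csv.DictReader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        writer = csv.DictWriter(
            dst, fieldnames=["wav_filename", "wav_filesize", "transcript"]
        )
        writer.writeheader()
        for row in reader:
            # "path" is assumed to name the MP3 clip, e.g. "abc.mp3".
            base = os.path.splitext(row["path"])[0]
            wav_path = os.path.abspath(os.path.join(clips_dir, base + ".wav"))
            if not os.path.isfile(wav_path):
                continue
            writer.writerow({
                "wav_filename": wav_path,
                "wav_filesize": os.path.getsize(wav_path),
                # Simplified normalization; the real importer does more.
                "transcript": row["sentence"].strip().lower(),
            })
            written += 1
    return written
```

You would call it once per split, for example tsv_to_deepspeech_csv("train.tsv", "clips", "train.csv"). For actual training, using import_cv2.py is the safer route, since it also takes care of the audio conversion and transcript cleanup.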


In fact, I did not. But with import_cv2.py it works.
Thank you.


@wagnrd: If you are still looking for DeepSpeech results on German, take a look at this paper and its repository. They might be useful.

https://www.researchgate.net/publication/336532830_German_End-to-end_Speech_Recognition_based_on_DeepSpeech
