Problems training a model with Common Voice

Hello,
I want to use Mozilla TTS with Mozilla Common Voice (German).
When I start the training I get the following error: Pastebin: error log. My config.json can be found here: Pastebin: config.json.
Thanks for your help.

Check whether that’s a valid path for the wav file. This usually happens because of wrong paths or corrupt files. If the path is valid, the file itself is probably corrupt, so remove it from the train or test csv.
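
To run that check over the whole csv at once, something like the sketch below might help. It assumes a tab-separated metadata file with a `client_id` header row and the clip file name in the second column, with the audio under a `clips` folder. The function name `find_missing_clips` and the `suffix` parameter are made up for illustration and are not part of TTS.

```python
import os


def find_missing_clips(root_path, meta_file, suffix=""):
    """Return the clip paths listed in meta_file that do not exist on disk.

    Hypothetical helper: assumes tab-separated rows with the clip file
    name in the second column and audio stored under root_path/clips.
    """
    missing = []
    with open(os.path.join(root_path, meta_file), "r", encoding="utf-8") as f:
        for line in f:
            if line.startswith("client_id"):  # skip the header row
                continue
            cols = line.rstrip("\n").split("\t")
            if len(cols) < 2:  # skip blank or malformed rows
                continue
            clip = os.path.join(root_path, "clips", cols[1] + suffix)
            if not os.path.isfile(clip):
                missing.append(clip)
    return missing
```

If nearly every row comes back as missing, the path construction is the likely culprit rather than individual files.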

Hi, thanks for your answer.
I already removed one file from the train and validate.csv, but then the same error happens again with another file. I now think the problem is that the Mozilla dataset only contains mp3 files.
Greetings

If that’s always happening, chances are the path isn’t valid. If it’s only happening for a couple of files, then remove them from the csv.

It’s always happening. The path to the folder which contains the audio files is correct. I think the problem is that Mozilla TTS tries to open .wav files, but the dataset only contains .mp3 files.

I looked at the error again. That path is definitely invalid. It’s looking for a ‘.*.mp3.wav’ file, which it won’t find there. Go to TTS/datasets/preprocess.py and, in the common_voice function, make sure the wav_file variable holds a proper path, i.e., path/to/mp3.
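
Roughly what I mean, as a sketch and not the actual TTS code: `clip_path` is a hypothetical helper, and I’m assuming the file name column in the metadata may or may not already carry its `.mp3` extension.

```python
import os


def clip_path(root_path, file_name, default_ext=".mp3"):
    """Build the path to a clip, adding an extension only if none is present.

    Hypothetical helper: avoids names like 'foo.mp3.wav' that appear when
    an extension is appended unconditionally.
    """
    _, ext = os.path.splitext(file_name)
    if not ext:
        file_name += default_ext
    return os.path.join(root_path, "clips", file_name)
```

Inside the loader you would then build the path from the file name column with this helper instead of appending ".wav".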

Thank you for your answer.
I changed the following line in preprocess.py
wav_file = os.path.join(root_path, "clips", cols[1] + ".wav")
to
wav_file = os.path.join(root_path, "clips", cols[1] + "")
and then the path is correct. But now I get the following error:
RuntimeError: Error opening 'CommonVoice/clips/common_voice_de_18482210.mp3': File contains data in an unknown format.
Greetings

Can I have the full stack traceback?

Also, I believe it’s because you’re trying to open an mp3 with soundfile. Anyway, go to utils/audio.py and try opening it with librosa instead. That should work.
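
A minimal sketch of the idea, not the real contents of utils/audio.py: `load_audio` and its `loader` parameter are names I made up here. `librosa.load` returns the samples together with the sample rate and resamples to `sr` when one is given. Whether it can decode mp3 on your machine depends on the audio backend you have installed, so treat that part as an assumption.

```python
def load_audio(path, sample_rate, loader=None):
    """Load an audio file and return its samples at sample_rate.

    Hypothetical helper. By default it uses librosa.load; a different
    loader with the same (path, sr=...) call shape can be passed in.
    """
    if loader is None:
        import librosa  # imported lazily, only needed for the default loader
        loader = librosa.load
    # librosa.load returns a (samples, sample_rate) tuple
    samples, _ = loader(path, sr=sample_rate)
    return samples
```

Usage would be along the lines of `load_audio("CommonVoice/clips/common_voice_de_18482210.mp3", 22050)`.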

I have made the changes you recommended, but now I am getting the following error: error log. My audio.py can be found here: audio.py.
Thank you!

Change the sample rate in the config to 48000 (48k) instead of 22050.
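
If you want to catch this kind of mismatch before training starts, a small check like the one below could work. It is only a sketch: `check_sample_rate` is a hypothetical helper, and I’m assuming the loaded config is a dict with the rate stored under `"audio"` and `"sample_rate"`. The native rate of a clip can be read with `librosa.load(path, sr=None)`, which keeps the file’s own sample rate.

```python
def check_sample_rate(config, native_rate):
    """Raise if the configured sample rate differs from the clips' native rate.

    Hypothetical helper: assumes config["audio"]["sample_rate"] holds the
    rate the training code expects.
    """
    configured = config["audio"]["sample_rate"]
    if configured != native_rate:
        raise ValueError(
            f"config sample_rate is {configured}, but the audio is {native_rate} Hz"
        )
    return configured
```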

Hi Timo,
I get the same error while trying to train with the Italian Common Voice dataset. I implemented the same solution as you, but the error is still there. Did you modify anything else inside the common_voice function in preprocess.py?
Thanks

Hello,
I don’t think so, but I am not sure; it was a long time ago.
Greetings,
Timo