Trouble running TTS_example colab

Good Evening,

I am in the process of trying to fine-tune a Tacotron model on a few of my own voice samples plus the French part of M-AI-Labs, and I wanted to first try training it on a classic dataset to see how it works.
I found @erogol's Colab example here that describes how to train it on LJSpeech.
But somehow the script gives me an error saying it can't load the module utils.io: No module named 'utils.io'.
I am not sure what I am doing wrong but would appreciate any pointers.

Thanks a lot,

The examples all seem faulty for me too. I guess the import path for utils.io is incorrect and should be changed to something like TTS.utils.io or TTS.tts.utils.io, depending on where you are. You can think of the dotted module path as a directory path, where the dots stand in for what would normally be "/".
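
For example (load_config is only an illustration here; use whichever function the failing notebook cell actually imports from utils.io):

# before (fails when TTS is installed as a package):
# from utils.io import load_config

# after:
from TTS.utils.io import load_config

config = load_config("config.json")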

Hi @CrazyJoeDevola
TTS.utils.io did the trick. I am encountering other issues, but now that I know I need to modify the paths, I will try to resolve them.
Thanks for the tip,
Have a good day,

The Colab examples should be used with their corresponding checkouts. I don't think there would be a problem in that case.

Otherwise, feel free to contribute your fix back to the repo.

Hi,
Thanks for taking the time to answer. I don't have a background in computer science, so I sometimes struggle to understand certain terms, sorry about that. What exactly do you mean by "corresponding checkouts"?

No problem :slight_smile: If you look at the code, you'll see I check out a specific version of the repo with "git checkout <commit-hash>". This sets the code to a version that is compatible with the example code. Hope that clarifies it.

Great!
I checked out this hash: 7e799d5. I had to modify the losses.py script (as explained here), and now it seems to run smoothly.
Thanks a lot.
Next step is using another dataset.

I've also released a small update which handles the module name mismatch when you use the load_checkpoint function.

After using LJSpeech, I am trying to feed it another dataset, namely the French part of M-AI-Labs.
I used preprocess.py to generate a nested list containing all the wav files, which I saved as a CSV file using csv.writer:

import csv
from datasets.preprocess import mailabs

# collect the dataset items from the M-AI-Labs folder
items = mailabs("/content/gdrive/My Drive/Tacotron/fr_FR")

# write them out as metadata.csv in the dataset root
with open('/content/gdrive/My Drive/Tacotron/fr_FR/metadata.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(items)
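
For reference, each item the mailabs formatter returns should be a [text, wav_path, speaker_name] list (at least in the version I am using), so a quick sanity check looks like this:

print(len(items))  # total number of utterances found
print(items[0])    # e.g. [text, wav_path, speaker_name]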

I then proceeded to split it into train and eval with this code:

# shuffle, then take 12000 lines for training and 1100 for validation
shuf metadata.csv > metadata_shuf.csv
head -n 12000 metadata_shuf.csv > metadata_train.csv
tail -n 1100 metadata_shuf.csv > metadata_val.csv

Then I changed the path and name of the dataset in config.json

"datasets": [{"name": "mailabs", "path": "/content/gdrive/My Drive/Tacotron/fr_FR", "meta_file_train": null, "meta_file_val": null}]

But I keep getting the same error:

  File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.3+7e799d5-py3.6.egg/TTS/utils/generic_utils.py", line 78, in split_dataset
    assert eval_split_size > 0, " [!] You do not have enough samples to train. You need at least 100 samples."
AssertionError:  [!] You do not have enough samples to train. You need at least 100 samples.

However, I have 31143 rows in my CSV with valid wav files.
I am not sure where to look next; I am thinking it might have to do with the order of the elements in the CSV, but I am not sure.

@julian.weber, I think you successfully trained on the French part of the M-AI-Labs dataset; would you be willing to share your process?

EDIT: Never mind, I realised that the preprocessing is run inside the train function, so it worked once I changed the regex in the mailabs preprocessor to:

speaker_regex = re.compile("(male|female)/(?P<speaker_name>[^/]+)/")
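
As a quick check, assuming a typical M-AI-Labs-style layout (the path below is only illustrative), the regex pulls the speaker name out of a wav path like this:

import re

speaker_regex = re.compile("(male|female)/(?P<speaker_name>[^/]+)/")
wav_path = "/content/gdrive/My Drive/Tacotron/fr_FR/female/ezwa/some_book/wavs/some_file.wav"
match = speaker_regex.search(wav_path)
print(match.group("speaker_name"))  # -> ezwa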

Have you successfully trained the model in French?

Hey @kr6k3n,
Yes, it worked for M-AI-Labs and the results were nice, but not the fine-tuning I was hoping to do with my own dataset. I think it might have to do with the quality of my own data.

Nice! Would you mind sharing your results? I have been searching for someone like you for months, since I can't afford a good GPU to train on. Otherwise, I would totally understand.

Yeah, it was the regex that wasn't right for the French part of the dataset, as I said in this issue back in March. But maybe this is worth a PR.