Quite a lot has happened.
I upgraded my server to an 8-core CPU and 16 GB of RAM.
Finally I could run
coqui-ai / TTS inference on my server, and it was quite fast.
Now my goal is to train my own voices.
I have had good experiences fine-tuning models that were already good.
So I downloaded a model that
coqui-ai / TTS uses and tried to train it on my dataset with
coqui-ai / TTS, using a modified version of its config.json.
Alas, it complained that
stopnet and other parameters were missing.
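As an illustration only, here is a minimal sketch of patching such keys into a config.json. stopnet comes from the error message; separate_stopnet and stopnet_pos_weight are assumed field names from older Tacotron configs and may not match this particular model:

```python
import json

# Sketch: add the Tacotron fields the trainer complained about to an
# existing config.json. Key names and values are assumptions and should
# really be copied from the config that ships with the checkpoint.
with open("config.json") as f:
    config = json.load(f)

config.setdefault("stopnet", True)             # named in the error message
config.setdefault("separate_stopnet", True)    # assumed companion field
config.setdefault("stopnet_pos_weight", 15.0)  # assumed default value

with open("config.json", "w") as f:
    json.dump(config, f, indent=4)
```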
I copied them from another
config.json, but then I got the following traceback:
Traceback (most recent call last):
  File "/content/TTS/TTS/bin/train_tacotron.py", line 664, in <module>
    main(args)
  File "/content/TTS/TTS/bin/train_tacotron.py", line 548, in main
    optimizer.load_state_dict(checkpoint['optimizer'])
  File "/usr/local/lib/python3.7/dist-packages/torch/optim/optimizer.py", line 141, in load_state_dict
    raise ValueError("loaded state dict has a different number of "
ValueError: loaded state dict has a different number of parameter groups
which means that the downloaded checkpoint and the model I am trying to train are incompatible.
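One possible workaround, sketched here under the assumption that the checkpoint is an ordinary PyTorch dict with an 'optimizer' entry (as the traceback suggests), is to drop the saved optimizer state so that only the model weights are restored and training starts with a fresh optimizer:

```python
import torch

# Sketch: strip the saved optimizer state from a checkpoint so that
# restoring it does not try to load incompatible parameter groups.
# The file names are placeholders.
ckpt = torch.load("downloaded_model.pth.tar", map_location="cpu")
ckpt.pop("optimizer", None)  # keep only the model weights and metadata
torch.save(ckpt, "downloaded_model_no_optim.pth.tar")
```

Whether the trainer then accepts the checkpoint still depends on the model weights matching the architecture defined by config.json.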
How do you fine-tune a model so that it works with the tts / synthesize.py command?
Are there models that just work for fine-tuning and can then be reused with the current version?
I also tried it the other way round: I used a model from the GitHub page, and it wasn't accepted by the TTS installation on my server.
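For debugging this kind of mismatch, a small inspection sketch (pure PyTorch, no TTS-specific APIs; the file name and key names such as "model" are assumptions) at least shows what a checkpoint contains, which makes a working and a rejected checkpoint easy to compare:

```python
import torch

# Sketch: list a checkpoint's top-level entries and the shapes of its
# first few model tensors. Path and key names are assumptions.
ckpt = torch.load("downloaded_model.pth.tar", map_location="cpu")
print("top-level keys:", list(ckpt.keys()))

# Some checkpoints store the weights under a "model" key, others are the
# state dict itself.
state = ckpt.get("model", ckpt)
for name, value in list(state.items())[:10]:
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
    print(name, shape)
```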