ASR and TTS models for Italian with Common Voice data

Hi,
I would like to train an ASR model and a TTS model for Italian based on the Common Voice dataset.
I have some questions.

  1. Is it correct that DeepSpeech can only train ASR models?
  2. Is the procedure for training my own model the one described in "TUTORIAL : How I trained a specific french model to control my robot"? Do I have to follow that one?
    If so: I would like to apply my model to home automation, so is there any pre-processing, any audio property, or anything else I need to know to train the model properly? Can I find everything I need in the paper https://arxiv.org/abs/1412.5567 , or can you suggest other references?
  3. Can someone recommend a good architecture for training TTS?

Thanks a lot!
Christian

You should get in touch with @Mte90.

Please follow the official documentation: https://github.com/mozilla/DeepSpeech/blob/master/TRAINING.rst . That tutorial is good, but it is old and written for a specific case.
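
On the question about sound properties, here is a minimal sketch of a sanity check for training clips. It assumes the clips have already been converted to WAV and that the target format is 16 kHz, mono, 16-bit PCM, which is the format DeepSpeech's released models use; check the training documentation for the version you install. The function name and the constants are hypothetical, chosen only for this illustration, and the code uses just the Python standard library.

```python
import wave

# Assumed target format for this sketch; adjust to what your setup requires.
EXPECTED_RATE = 16000      # samples per second
EXPECTED_CHANNELS = 1      # mono
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit PCM


def wav_format_problems(path):
    """Return a list of format mismatches for one WAV file (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected {EXPECTED_CHANNELS}")
    if width != EXPECTED_SAMPLE_WIDTH:
        problems.append(f"{width * 8}-bit samples, expected {EXPECTED_SAMPLE_WIDTH * 8}-bit")
    return problems
```

Running this over every clip before training and resampling the files it flags is usually cheaper than finding a format mismatch halfway through a training run.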

For TTS, you could have a look at what @erogol does.

Hi Christian, we already have a model for Italian. You can find the Italian category at https://discourse.mozilla.org/c/voice/it .

Also, if you use Telegram, our community is there discussing how to improve the model: check @mozitabot and pick the developers group.

The model is at https://github.com/MozillaItalia/DeepSpeech-Italian-Model/