Hello. First of all, I would like to thank you all for your efforts. The demo voices
sound really great!
For my project, a game with many AI characters, I am looking for suggestions on how
the following might be achieved:
1- TTS for a lot of different voices: male, female, young, adolescent, adult, old, sick, fantasy & sci-fi (monsters, aliens).
Could I use a few base voices and do some kind of morphing on the speaker embedding?
Or could the speaker embedding be randomly generated based on some range?
2- Control intonations and emotions.
3- Introduce foreign accents, speech defects, or just variations.
By variations I mean changes in the length, pitch, or emphasis of some syllables, sometimes only on one specific word.
4- Other sounds that humans sometimes make: sighing, sneezing, breathing, breathing while talking, clearing the throat, coughing, burping, "argh!", humming, hyperventilating.
5- More complex sounds like crying, laughing, humming, whistling, singing.
6- Could animal ‘voices’ be simulated too? (bird chirp/sing, cat, dog).
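
To make question 1 more concrete, here is a rough sketch of what I mean by morphing and random generation. It assumes that a speaker embedding is just a fixed-length, L2-normalized vector; the size of 256, the helper names `morph` and `random_voice`, and the random stand-in vectors are all my own assumptions for illustration, not anything taken from the project. I also do not know whether embeddings produced this way would still give natural-sounding voices.

```python
import numpy as np

EMBED_DIM = 256  # assumed embedding size, purely illustrative


def l2_normalize(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def morph(a, b, t):
    """Linearly interpolate between two speaker embeddings (t in [0, 1])."""
    return l2_normalize((1.0 - t) * a + t * b)


def random_voice(bases, spread, rng):
    """Sample a new embedding around the mean of some base voices.

    The noise is scaled per dimension by the standard deviation of the
    base embeddings, so `spread` controls how far the result may drift.
    """
    bases = np.asarray(bases)
    mean = bases.mean(axis=0)
    std = bases.std(axis=0)
    noise = rng.standard_normal(mean.shape)
    return l2_normalize(mean + spread * std * noise)


rng = np.random.default_rng(0)

# Random stand-ins for embeddings of two real base voices (hypothetical).
voice_a = l2_normalize(rng.standard_normal(EMBED_DIM))
voice_b = l2_normalize(rng.standard_normal(EMBED_DIM))

halfway = morph(voice_a, voice_b, 0.5)                  # a blend of both voices
new_voice = random_voice([voice_a, voice_b], 0.5, rng)  # a random nearby voice
```

The idea would be to feed `halfway` or `new_voice` to the synthesizer in place of an embedding computed from a real recording. Is that a reasonable approach, or does the embedding space not behave well under this kind of arithmetic?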