Contributing my German voice for TTS

Hey guys.

I wish all of you a “merry Christmas” :christmas_tree: and some relaxing days.

Best
Thorsten

5 Likes

Hello.

I hope all of you had some relaxing days and a good (and healthy) start to 2021.

In case it’s of interest to you: our “thorsten” dataset is now available on openslr.org too.

http://openslr.org/95/

8 Likes

Hello.
The nice guy @sanjaesc recently experimented with my dataset and sent me some samples made with the HifiGAN vocoder, which I think are quite usable :smiley:.

The breathing is nice, but it occurs too often, so I think it would be better without breathing at this speaking speed.

Soundcloud Playlist HifiGAN


What do you think?

Thanks for your great support :+1:

4 Likes

Hello @mrthorstenm, @sanjaesc Thank you for the update on HifiGAN.

Currently HifiGAN is my personal favorite on your vocoder comparison page with respect to the combination of inference speed and voice quality. I am waiting for the final results from @monatis regarding Multiband MelGAN.

Regarding the breathing: I like it in short sentences, as it gives your voice an additional natural touch. In long sentences it seems too much. I had never thought about how many breaths we take when speaking :slight_smile:

p.s. Is the HifiGAN model already available somewhere so that we can “play” with it?

1 Like

I’ve just added version 3 of my “thorsten” dataset. It’s based on v02, but the speaking speed has been increased by 10%. TTS models trained on it will generate a slightly faster (but still natural) speech flow.

3 Likes

Recording my emotional dataset is finished :smiley: .

It took longer than expected, and it was more difficult to speak emotionally on non-emotional (or wrongly emotional) phrases, but it’s done.
Now @dkreutz is doing his audio optimization magic. Once he’s done, I’ll publish the “Thorsten emotional dataset”.

Always keep in mind that I’m no professional voice actor, just a normal guy contributing his voice.

Details and an audio sample can be found here:

4 Likes

In case it’s of interest to you: I’ve created a Twitter account for my German voice contribution, where I plan to post new models, news and updates around the “Thorsten” dataset.

https://twitter.com/ThorstenVoice

2 Likes

Hey guys.

@Erogol from Coqui released my first trained open German TTS model :partying_face: .

It consists of:

  • Tacotron2 DCA model (based on “Thorsten” dataset)
  • WaveGrad vocoder

The WaveGrad vocoder has a bad real-time factor (RTF) on CPU and an acceptable RTF on CUDA. Next I’m training a Fullband-MelGAN vocoder to get a better RTF, so that it works with the Mycroft voice assistant.
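
For anyone unfamiliar with the term: the real-time factor is usually defined as the time needed for synthesis divided by the duration of the generated audio, so values below 1 mean faster than real time. Here is a minimal sketch of how it could be measured. The `synthesize` callable is a hypothetical stand-in (not part of Coqui TTS) that is assumed to return a sequence of audio samples.

```python
import time


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize, text: str, sample_rate: int) -> float:
    """Time one call of a hypothetical synthesize(text) -> samples callable."""
    start = time.perf_counter()
    samples = synthesize(text)  # assumed to return a sequence of audio samples
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples) / sample_rate)
```

With made-up numbers: 2 seconds of computation for 4 seconds of audio gives `real_time_factor(2.0, 4.0) == 0.5`, which would be fast enough for a voice assistant, while 6 seconds for 3 seconds of audio gives an RTF of 2.0.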

Want to give it a try?

pip install -U tts
tts --model_name tts_models/de/thorsten/tacotron2-DCA --text "Was geht, was geht, ich sags dir ganz konkret." --use_cuda=true

For updates on new models, check my Twitter account (https://twitter.com/ThorstenVoice).

Thank you all for your great support on this :blush:

Happy Easter holidays :rabbit:

3 Likes

Hello.

I’ve just released my open German “emotional” dataset :partying_face: .
For details on the dataset, audio samples and the download link, visit my GitHub page:

https://twitter.com/ThorstenVoice

I hope it’s useful for someone and (as always) please keep in mind that I’m no professional voice talent, just a guy contributing his voice :slightly_smiling_face:.

Wishing you nice Easter holidays :rabbit:

1 Like

My “open German voice dataset” now has an article on German Wikipedia :partying_face: :smiley:.

It’s a great journey together with you guys :+1:

3 Likes

I recently gave a public talk at TensorFlow Turkey on “How to make machines talk with your voice”.
If you’re interested in the steps I took to record my dataset and train a TTS model of my voice, you might want to take a look. In addition, I named some mistakes I made and lessons I learnt.

2 Likes

I just released v02 of my EMOTIONAL dataset.
It now includes “drunk” and “whispering”. Details, samples and the download link are available on my GitHub page:

3 Likes

I recorded a practical walk-through screencast video on the process of creating your own TTS voice.
It goes from preparing a text corpus for recording all the way to synthesizing speech.

3 Likes

As I have a passion for TTS, I thought: why shouldn’t I share my thoughts, mistakes and lessons learned with the community? Nothing cool or fancy, just a little bit of tech talk.

Did I mention that I’m recording a new version of my neutral dataset? This time respecting my lessons learned:

  • Better microphone
  • Better recording room situation
  • More natural speechflow

I’ve only recorded 8,000 phrases so far (so there’s still a lot of work to do), but I’m sharing this recording-in-progress dataset with you.

Any feedback is highly appreciated and might affect further recordings.

Download link is here: https://github.com/thorstenMueller/deep-learning-german-tts#dataset-thorsten-neutral

3 Likes

Hi :wave:. I could use your help :pray:.

Do you know of any public articles, papers, projects or resources in which my “Thorsten” German dataset is involved?

1 Like

Both of my datasets are now listed on Zenodo with a DOI :partying_face:.

Emotional dataset:
DOI

Neutral dataset:
DOI

1 Like

I’ve created an OpenVoice-Tech Wiki as a central knowledge base for ALL OPEN VOICE enthusiasts :partying_face:.

1 Like

Hi,
it’s been some time since my last post :grin:.

I trained two variations of a new Tacotron2 DDC based TTS “Thorsten” model, but I’m unsure which variation to release.

Maybe you can give some samples a listen and vote for the variation you like more. The winning variation will be released as the new TTS model.

Thanks :blush:

The vote is over. Thanks to everybody who participated in my poll. “Variante 2” has won and is now available using Coqui TTS (version 0.8.0) :smiley:.

https://www.thorsten-voice.de/2022/08/22/neues-thorsten-tts-modell-verfuegbar-🥳/