Tip for improving voice naturalness: review power parameter (again)

It’s worth reviewing the settings for the power parameter in your config.json file post-training to see if you can improve the quality of your output speech.

Adjusting various parameters, including power, is covered here, so you may well have looked at it already, but it’s potentially worth going a little higher than the upper bound of 1.5 suggested in the associated comment in the config file. I got the best results around 1.8-1.9.

NB: only applies if you’re using Griffin-Lim.
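
For anyone wondering why this setting matters: as far as I understand it, the linear magnitude spectrogram is raised to the power value just before Griffin-Lim runs, which widens the gap between spectral peaks and the quieter background. The toy sketch below only illustrates that effect; the frame values and the helper names `sharpen` and `peak_to_floor_db` are made up for the example and are not taken from the repo.

```python
import math


def sharpen(magnitudes, power):
    """Raise each linear magnitude to `power` (the assumed pre-Griffin-Lim step)."""
    return [m ** power for m in magnitudes]


def peak_to_floor_db(magnitudes):
    """Ratio between the largest and smallest magnitude, in decibels."""
    return 20 * math.log10(max(magnitudes) / min(magnitudes))


# Hypothetical single spectrogram frame: one strong peak over a quiet floor.
frame = [0.05, 0.2, 1.0, 0.3, 0.05]

for power in (1.0, 1.4, 1.8):
    contrast = peak_to_floor_db(sharpen(frame, power))
    print(f"power={power}: peak-to-floor contrast = {contrast:.1f} dB")
```

The contrast in dB scales linearly with the exponent, which is consistent with very high values pushing quieter components down and making the voice sound quieter overall.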

Credit for making me look closer at this goes to the poster of this reply to an issue here (for another, similar TTS repo that also uses Griffin-Lim).

To demonstrate the impact, there are two samples in the attached zip file: one with 1.4 (more “robotic” / “reverberating”) and one with 1.8 (which to my ear sounds more natural). You’ll likely want to try various levels, YMMV. Beware: if you go too far, the voice starts to sound quieter and more muffled. Fortunately, this can all be experimented with post-training.
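
If you want to try a range of values in one go, something like the rough sketch below could help. It assumes you already have your config loaded as a Python dict and that the setting is a `power` key, either at the top level or under an `audio` section (check your own config.json, the layout may differ). The `base` dict and the output file names are placeholders, and the synthesis step is left for you to fill in.

```python
import copy


def power_variants(config, values, section="audio", key="power"):
    """Yield (value, config_copy) pairs with the power setting overridden.

    Assumption: the key lives under `section` if that section exists,
    otherwise at the top level of the config.
    """
    for value in values:
        variant = copy.deepcopy(config)  # leave the original config untouched
        target = variant[section] if section in variant else variant
        target[key] = value
        yield value, variant


# Placeholder config; load your real one however your setup normally does.
base = {"audio": {"power": 1.5, "griffin_lim_iters": 60}}

# Candidate values from 1.4 up to 1.9 in steps of 0.1.
sweep = [round(1.4 + 0.1 * i, 1) for i in range(6)]

for value, cfg in power_variants(base, sweep):
    # Replace this print with your own synthesis call, saving one wav per value.
    print(f"power={value}: would save sample_power_{value}.wav")
```

Listening to the resulting files side by side makes it easier to spot where the voice starts to get muffled.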

Thought I’d share it in case others found it useful.

adjusting_power_parameter.zip (253.0 KB)

Hey there,
I was wondering if you’d be up for discussing “improving voice naturalness” (just taco/taco2 + GL, or with neural vocoders) in the weekly meet. I don’t see anything on the agenda; it would be a shame to skip it.

We could discuss what we’ve tried and what we’re currently trying to address each problem.

@carlfm01 @nmstoker?

Hi @alchemi5t - that’s a good suggestion and in principle I’d be happy to share details / discuss it further on the call, but as I’m carrying out my efforts in regard to this in a non-work capacity and the weekly call is during my working day, it will normally be infeasible for me to join the call. Therefore I’d prefer to discuss via the forum.

Oh ok! That’s perfectly understandable.

Just in case: if any other day would be more convenient, I can make time whenever it suits for a weekly/biweekly/monthly meet.

@alchemi5t Unfortunately, my attempt to adapt the female voice using the pretrained male voice did not end well. I’ve reviewed with DS, but something somewhere is still broken, so now I’m reviewing manually and it is taking most of my time. At the time of the meeting I’ll be in traffic :confused:

I think I can join when I’m not working close to a deadline.

