Final results LPCNet + Tacotron2 (Spanish)

@carlfm01 16GB version

Hello @carlfm01.

Thank you for your answers; I have already started Tacotron training and it is running normally. I have another question.

Is it normal for the audio generated during Tacotron training to sound like this?

http://www.mediafire.com/file/r7p3ggsbfqrqpud/step-2500-wave-from-mel.wav/file
(‘I’m still a new user’) :cry:

Thank you.

Please share your attention plot; it sounds like the attention is broken.
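In case it helps later: a minimal sketch for rendering a saved alignment array with matplotlib. The .npy path and the array shape are assumptions; adjust them to wherever your training run writes its alignments.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical path; point this at an alignment array saved by your training run.
# Assumed shape: (decoder_steps, encoder_steps).
alignment = np.load("logs-Tacotron/plots/alignment_step2500.npy")

plt.figure(figsize=(8, 4))
# A healthy plot shows a clear diagonal band; a blurry or flat plot means
# the attention has not yet learned to align text and audio.
plt.imshow(alignment.T, aspect="auto", origin="lower", interpolation="none")
plt.xlabel("Decoder timestep")
plt.ylabel("Encoder timestep")
plt.colorbar()
plt.savefig("alignment_step2500.png")
```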

@carlfm01 this is my attention plot

Can you share a single file created by this script so I can check whether it is correct? Try Mozilla Send.

What about the silence at the beginning and at the end? Long silences can hurt performance.
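If the clips do carry long silences, here is a minimal trimming sketch with librosa; the 16 kHz rate and the top_db threshold are assumptions, so tune them for your corpus.

```python
import librosa
import soundfile as sf

# Assumed sample rate and silence threshold; adjust for your data.
wav, sr = librosa.load("example.wav", sr=16000)
trimmed, _ = librosa.effects.trim(wav, top_db=30)
sf.write("example_trimmed.wav", trimmed, sr)
```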

Did you change anything?

@carlfm01 here is an audio file.

I can’t upload the file with Mozilla Send; I’m still a new user.
https://transfer.sh/5kb14/audio-archivo-156579483968273.npy
I didn’t change anything.

audio-archivo-156579483968273.zip (164,8 KB)

I’m able to synthesize your training file, so your training format is correct. The issue could be transcription or audio quality: wrong transcriptions or empty audio, like the last one you removed.
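If it helps, here is a minimal sketch to flag empty or suspiciously short clips for manual review; the wavs/ folder name and the thresholds are assumptions.

```python
import os
import numpy as np
import librosa

wav_dir = "wavs"  # assumed dataset folder
for name in sorted(os.listdir(wav_dir)):
    if not name.endswith(".wav"):
        continue
    wav, sr = librosa.load(os.path.join(wav_dir, name), sr=None)
    duration = len(wav) / sr
    rms = float(np.sqrt(np.mean(wav ** 2))) if len(wav) else 0.0
    # Flag clips that are very short or near-silent.
    if duration < 0.5 or rms < 1e-3:
        print(f"{name}: duration={duration:.2f}s rms={rms:.5f}")
```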

The last things I removed were audios with sentences that were too long. I used erogol’s notebook to remove them. Those removed files did contain audio, but they could not be processed, and I don’t know why. Besides that, I had already trained Tacotron without LPCNet, and this was the result.

Could it be that I resumed from a training run that was saved before I deleted the long sentences?

I think the problem is resuming training with different files. For now, I have started a new training run from scratch.

I will discuss the attention plot later, once the two training runs reach the same number of steps.

Ok, let’s wait.

Yes, you need to delete the model trained with the wrong sentences.

Hello @carlfm01
In fact, my attention plot doesn’t look like it used to; this is the current one.

But the audio still has the same noise as before.

Can you share the generated feature so I can test it? It looks like it needs silence trimming at the end.
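For reference, a minimal sketch to inspect a generated feature file and estimate how many trailing low-energy frames it carries; the file name and the threshold heuristic are assumptions.

```python
import numpy as np

feats = np.load("test-out.npy")  # hypothetical name of the Tacotron output features
print("feature shape:", feats.shape)  # expected (frames, feature_dim)

# Rough heuristic: count trailing frames with very low mean magnitude,
# which would suggest the output needs silence trimming at the end.
frame_energy = np.abs(feats).mean(axis=1)
threshold = 0.1 * frame_energy.max()
tail = 0
for value in frame_energy[::-1]:
    if value >= threshold:
        break
    tail += 1
print("trailing low-energy frames:", tail)
```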

@carlfm01 This is an audio clip synthesized with Tacotron and processed with LPCNet.

https://transfer.sh/HvMt2/test-out.wav

Sounds good, but it’s hard to tell whether it needs more training from just three words. Can you share a longer audio?

And the issue? How did you fix it?

Now I’m trying to adapt a new voice with just 3 hours of data, starting from the model pretrained on the two old voices (Tux and Epachuko); it is at 10k steps so far.
3h.zip (346,7 KB)

The model still needs more training; once I reach at least about 25 thousand steps, I will start synthesizing longer sentences.

The noisy audio was generated by Tacotron during evaluation; those evaluation audios still come out the same.

I appreciate all your support; all the credit is yours.

@carlfm01 Have you tried to freeze the model?

Yes, you need to use Tacotron_model/inference/add as the output node name.
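For reference, a minimal freezing sketch, assuming TensorFlow 1.x and that the Tacotron inference graph has already been built in the default graph (for example via the repo’s synthesizer); the checkpoint path is just an example.

```python
import tensorflow as tf

checkpoint_path = "checkpoints01/tacotron_model.ckpt-55000"  # example path

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, checkpoint_path)
    # Freeze everything reachable from the output node mentioned above.
    frozen_graph = tf.graph_util.convert_variables_to_constants(
        sess,
        sess.graph.as_graph_def(),
        output_node_names=["Tacotron_model/inference/add"],
    )
    with tf.gfile.GFile("tacotron_frozen.pb", "wb") as f:
        f.write(frozen_graph.SerializeToString())
```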


@carlfm01 Could we talk through email or some other means, please?

Could you use the forum’s DMs?

Hi @carlfm01! I was trying to run synthesize.py from your Tacotron-2 fork using your checkpoints, but it looks like the Tacotron checkpoints are broken for me. Here is what I did:

  1. Fork your repo carlfm01/Tacotron-2
  2. Put the checkpoints from GDrive into a local ./checkpoints01 folder
  3. Run tacotron-synthesize with all the default args (mode=eval, model=Tacotron and so on) and add some example sentences

The checkpoint loads correctly:

Loading checkpoint: ./checkpoints01/tacotron_model.ckpt-55000
INFO:tensorflow:Restoring parameters from ./checkpoints01/tacotron_model.ckpt-55000

But then I get some missing-variable errors:

    NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

2 root error(s) found.
  (0) Not found: Key Tacotron_model/inference/decoder/Location_Sensitive_Attention/attention_bias_1 not found in checkpoint
	 [[node save_5/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Not found: Key Tacotron_model/inference/decoder/Location_Sensitive_Attention/attention_bias_1 not found in checkpoint
	 [[node save_5/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
	 [[GroupCrossDeviceControlEdges_0/save_5/restore_all/_8]]

This is a full minimal repro notebook of what I am trying to do: https://colab.research.google.com/drive/1Ys6oWXIRUnGDYUVWYppiJXFOOUTHT-JN
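One way to narrow this down is to list what the checkpoint actually contains and check whether the attention_bias key is present at all; if it is missing, that points to a graph/code mismatch between the fork and the checkpoint rather than a corrupt file. A minimal sketch:

```python
import tensorflow as tf

# Same checkpoint path as in the notebook above.
ckpt = "./checkpoints01/tacotron_model.ckpt-55000"
for name, shape in tf.train.list_variables(ckpt):
    if "Location_Sensitive_Attention" in name:
        print(name, shape)
```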