Final results LPCNet + Tacotron2 (Spanish)

How much data are you training on? Is this epoch 2?

How many hours? It sounds good for epoch 2. Train until epoch 10 or so.

Good numbers and good sound. Nothing to worry about, just train for longer.

@erogol this may interest you, I didn’t know that IBM is using LPCNet:
http://srv-wtts.haifa.il.ibm.com/TTS-voice-conversion-IS2019/


Hello @carlfm01, could you please explain how to do these steps? I don’t know exactly how to do them. I would appreciate it.

Are these results generated by Tacotron training?

How long did the Tacotron 2 training take you for the 47k steps?

Thanks. I saw the work at Interspeech, but it is a complex system with proprietary parts for the linguistic features. Still, it shows how promising LPCNet is.

Yes, I tried to adapt to a new speaker from male to female but failed. Now I’m trying a new run with a male-to-male voice. New male voice data is on the way!

Use Tacotron’s preprocess.py, then replace the generated audio directory with your feature-extraction output directory.
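
In case it helps, here is a minimal sketch of that directory swap in Python. The layout is an assumption on my side: I take it that preprocess.py wrote a `training_data/audio` folder, and the function name `swap_audio_dir` and the backup folder name are made up for illustration. It keeps the original directory as a backup instead of deleting it. The feature file names still have to match what the training metadata expects.

```python
import shutil
from pathlib import Path


def swap_audio_dir(training_data: Path, lpcnet_features: Path) -> Path:
    """Replace training_data/audio with a copy of the LPCNet feature directory.

    Assumption: preprocess.py created training_data/audio. The original
    directory is renamed to audio_preprocess_backup rather than deleted.
    """
    audio_dir = training_data / "audio"
    backup_dir = training_data / "audio_preprocess_backup"  # hypothetical name
    if backup_dir.exists():
        raise FileExistsError(f"backup already exists: {backup_dir}")
    if audio_dir.exists():
        audio_dir.rename(backup_dir)
    # Copy (not move) so the extracted features stay available in LPCNet too.
    shutil.copytree(lpcnet_features, audio_dir)
    return audio_dir
```

Usage would be something like `swap_audio_dir(Path("training_data"), Path("path/to/features"))`, with both paths adjusted to your setup.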

The 47k audios? Yes

About 2 days using a single K80

Thanks @carlfm01.

I saw that in your Spanish version of Tacotron, the hparams have a sample_rate of 16000, but the data you shared is at 22050 Hz. Did you process it that way?

Yes, just make sure your header removal script converts it to 16 kHz before running ./feature_extract.sh
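
A quick way to see which source files are not at 16 kHz is to read the WAV headers. This is only a sketch using Python's standard `wave` module (it handles plain PCM WAV only); the function names are made up for illustration and the directory is whatever holds your .wav files. It does not resample anything, it only reports.

```python
import wave
from pathlib import Path


def wav_sample_rates(wav_dir):
    """Map each .wav file name in wav_dir to the sample rate in its header (Hz)."""
    rates = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            rates[path.name] = wav.getframerate()
    return rates


def files_needing_resample(wav_dir, target_rate=16000):
    """List the files whose header sample rate differs from target_rate."""
    return [
        name
        for name, rate in wav_sample_rates(wav_dir).items()
        if rate != target_rate
    ]
```

Any file listed by `files_needing_resample` still has to go through the conversion step before feature extraction.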

Result for a new speaker, fine-tuned with 10k steps for Taco2 and 1 epoch for LPCNet. Still training.
Too many ‘s’ sounds, like a whisper from the dataset, and the breathing is too loud.
voice adapt.zip (1,1 MB)

19h for this new speaker (I’ll share :smiley:)

Wow @carlfm01. Nice.

Have you tried Tacotron training without LPCNet? If so, are the results good?

Yes, Mozilla’s version with Griffin-Lim (GL):

tuxmozillatts.zip (266,1 KB)

The WaveRNN fork did not converge, and I’m limited in the compute power that I can spend. I guess it needs more training.

@carlfm01

I get this error in Tacotron training. Do you know what it could be?

Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/manuel_garcia02/Tacotron-2/tacotron/feeder.py", line 162, in _enqueue_next_train_group
    examples = [self._get_next_example() for i in range(n * _batches_per_group)]
  File "/home/manuel_garcia02/Tacotron-2/tacotron/feeder.py", line 162, in <listcomp>
    examples = [self._get_next_example() for i in range(n * _batches_per_group)]
  File "/home/manuel_garcia02/Tacotron-2/tacotron/feeder.py", line 196, in _get_next_example
    mel_target = np.resize(mel_target, (-1, self._hparams.num_mels))
  File "/usr/local/lib/python3.5/dist-packages/numpy/core/fromnumeric.py", line 1174, in resize
    return mu.zeros(new_shape, a.dtype)
ValueError: negative dimensions are not allowed
Exception in thread background:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/manuel_garcia02/Tacotron-2/tacotron/feeder.py", line 176, in _enqueue_next_test_group
    test_batches, r = self.make_test_batches()
  File "/home/manuel_garcia02/Tacotron-2/tacotron/feeder.py", line 145, in make_test_batches
    examples = [self._get_test_groups() for i in range(len(self._test_meta))]
  File "/home/manuel_garcia02/Tacotron-2/tacotron/feeder.py", line 145, in <listcomp>
    examples = [self._get_test_groups() for i in range(len(self._test_meta))]
  File "/home/manuel_garcia02/Tacotron-2/tacotron/feeder.py", line 129, in _get_test_groups
    mel_target = np.resize(mel_target, (-1, self._hparams.num_mels))
  File "/usr/local/lib/python3.5/dist-packages/numpy/core/fromnumeric.py", line 1174, in resize
    return mu.zeros(new_shape, a.dtype)
ValueError: negative dimensions are not allowed

There is definitely something wrong with your extracted features. Do you mind sharing the extraction scripts so I can check?
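
One thing worth checking first: the traceback ends in `return mu.zeros(new_shape, a.dtype)`, which appears to be the branch np.resize takes when the input array is empty, so at least one feature file may have zero length. Below is a rough sketch that scans the feature directory by file size alone. It rests on assumptions that are mine, not confirmed in this thread: that the files written by dump_data are raw float32 values despite the .npy extension, and that there are 20 values per frame (whatever `num_mels` is in your hparams). The function name is made up for illustration.

```python
from pathlib import Path

NUM_MELS = 20        # assumption: values per frame, should equal hparams.num_mels
BYTES_PER_VALUE = 4  # assumption: features are stored as raw float32


def find_bad_feature_files(feature_dir, num_mels=NUM_MELS):
    """Return (file name, reason) pairs for files that cannot form whole frames."""
    frame_bytes = num_mels * BYTES_PER_VALUE
    bad = []
    for path in sorted(Path(feature_dir).glob("*.npy")):
        size = path.stat().st_size
        if size == 0:
            # An empty file yields an empty array in the feeder.
            bad.append((path.name, "empty"))
        elif size % frame_bytes != 0:
            # The byte count does not divide into whole frames.
            bad.append((path.name, "partial frame"))
    return bad
```

If this returns an empty list, the problem is probably somewhere else, for example in how the directory was swapped in.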

@carlfm01 can you give me your pip3 list, please?
I would greatly appreciate it.

@carlfm01

feature_extract.sh

mkdir -p /home/manuel_garcia02/LPCNet/spanish/audio/
for i in /home/manuel_garcia02/LPCNet/spanish/s16/*.s16
do
    ./dump_data -test "$i" "/home/manuel_garcia02/LPCNet/spanish/audio/$(basename "$i" | cut -d. -f1).npy"
    echo "$i"
done

header_removal.sh

mkdir -p spanish/s16
for i in spanish/locutores/wavs/*.wav
do
    sox "$i" -r 16000 -c 1 -t sw - > "spanish/s16T/audio-$(basename "$i" | cut -d. -f1).s16"
    echo "$i"
done
# Merge all PCM files into a single file
mkdir -p spanish/pcm
for i in spanish/s16T/*.s16
do
    cat "$i" >> spanish/pcm/final.pcm
    echo "$i"
done
echo "Final.pcm created..."

Did you make sure dump_data is compiled with taco=1?

Did you replace the audio directory created by preprocess.py with this one?

@carlfm01 Yes, I did all of this.

Then it may be a broken audio file. Can you sort by duration and check whether the shortest one is correct?
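
A minimal sketch for the duration sort, assuming the .s16 files are headerless 16-bit mono PCM at 16 kHz as produced by the header removal step, so the duration is simply the file size divided by bytes per second. The function name is made up for illustration.

```python
from pathlib import Path

SAMPLE_RATE = 16000   # Hz, the rate the header removal step converts to
BYTES_PER_SAMPLE = 2  # assumption: 16-bit mono PCM without a header


def durations_shortest_first(pcm_dir, pattern="*.s16"):
    """Return (seconds, file name) tuples sorted from shortest to longest."""
    items = []
    for path in Path(pcm_dir).glob(pattern):
        seconds = path.stat().st_size / (SAMPLE_RATE * BYTES_PER_SAMPLE)
        items.append((seconds, path.name))
    return sorted(items)
```

Printing the first few entries, for example `durations_shortest_first("spanish/s16")[:5]`, shows the candidates to listen to. A duration of 0.0 means the file is empty.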