I am trying to fine-tune the Tacotron-DDC 130k model (trained on LJSpeech) on an Arabic dataset using the phonemizer. Arabic is supported under the ar tag.
During the training, these warnings didn’t stop for the entire epoch:
[WARNING] extra phones may appear in the "ar" phoneset
[WARNING] language switch flags have been removed (applying "remove-flags" policy)
[WARNING] 1 utterances containing language switches on lines 1
[WARNING] extra phones may appear in the "ar" phoneset
[WARNING] language switch flags have been removed (applying "remove-flags" policy)
[WARNING] 1 utterances containing language switches on lines 1
[WARNING] extra phones may appear in the "ar" phoneset
[WARNING] language switch flags have been removed (applying "remove-flags" policy)
[WARNING] 1 utterances containing language switches on lines 1
[WARNING] extra phones may appear in the "ar" phoneset
[WARNING] language switch flags have been removed (applying "remove-flags" policy)
[WARNING] 1 utterances containing language switches on lines 1
............
Where does the problem come from? Is there anything I need to change before training besides these parameters:
text_cleaner: phoneme_cleaners
use_phonemes: true
phoneme_language: ar
I think it may be related to text cleaning. Any help?
Thank you
You also posted this as an issue. Please close the issue if you continue here, otherwise it is hard for people to follow.
Just to understand, you are using a 130k TacoDDC trained on LJSpeech English and you want to transfer that to Arabic with how much material?
The warning states that there is a mix of languages (not ar in this case) in your input. And the flags (e.g. <en>...</en>) for that have been removed. Do you have mixed material?
But @mrthorstenm and @repodiac are the experts on that. Post more and we might be able to help you.
Hello @othiele
Yes, I posted here because I didn’t get a reply on the issue. I have closed it now, as you asked.
The dataset is about 3 hours. This is a sample of diacritized Arabic, which is supported by the phonemizer:
I’d suggest running phonemizer over your whole dataset, listing the unique phones, and comparing them with the phoneme set in symbols.py. If there are extras, you can add them there and send a PR for us to update it.
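A minimal sketch of that comparison, in case it helps. It assumes you already have the phonemized lines (for example from phonemizer's phonemize() with language="ar" and the espeak backend) and that each phone is a single character. The function names and the toy data are hypothetical, and known_symbols stands in for whatever phoneme set your symbols.py defines.

```python
from collections import Counter


def collect_phones(phonemized_lines, ignore=" \n"):
    """Count every distinct character in already-phonemized text."""
    counts = Counter()
    for line in phonemized_lines:
        counts.update(ch for ch in line if ch not in ignore)
    return counts


def extra_phones(phonemized_lines, known_symbols):
    """Return {phone: count} for phones missing from known_symbols."""
    known = set(known_symbols)
    counts = collect_phones(phonemized_lines)
    return {ph: n for ph, n in counts.items() if ph not in known}


# Toy data only: replace with your phonemized dataset and your symbol set.
lines = ["ab", "amx"]
known_symbols = "abm"
print(extra_phones(lines, known_symbols))
```

Anything this returns is a candidate to add to the phoneme set, and the counts show how often each one occurs in the data.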
@erogol
We tested the whole dataset with phonemizer alone, and it gives its output as expected without any warnings.
Actually, the problem was related to phoneme_cleaners, specifically the step where the ASCII conversion happens: text = convert_to_ascii(text)
Phonemizer accepts the Arabic letters as they are, but that conversion changes the letters completely into another form that is no longer Arabic.
So for now we are using basic_cleaners in place of phoneme_cleaners, and the warnings are gone (so far).
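A quick way to check whether a given cleaner keeps the Arabic text intact is to count Arabic characters before and after it runs. This is only a sketch: the helper names are hypothetical, and ascii_only is a stand-in that simply drops non-ASCII characters, not the real convert_to_ascii from the TTS cleaners.

```python
def count_arabic(text):
    """Count characters in the basic Arabic Unicode block (U+0600 to U+06FF)."""
    return sum(1 for ch in text if "\u0600" <= ch <= "\u06ff")


def cleaner_preserves_arabic(cleaner, text):
    """True if the cleaner leaves the number of Arabic characters unchanged."""
    return count_arabic(cleaner(text)) == count_arabic(text)


def ascii_only(text):
    # Stand-in for an ASCII conversion step (assumption, not the TTS code):
    # every non-ASCII character is dropped.
    return text.encode("ascii", "ignore").decode("ascii")


sample = "\u0645\u0631\u062d\u0628\u0627"  # the word "marhaban" in Arabic script

print(cleaner_preserves_arabic(str.lower, sample))   # lowercasing leaves Arabic alone
print(cleaner_preserves_arabic(ascii_only, sample))  # ASCII-only step removes it
```

Passing your actual cleaner function in place of ascii_only should show whether the Arabic letters survive before the text reaches the phonemizer.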
Is there a way to disable the cleaners?
Thank you all for your support and your time.
As @erogol said, we just need a function that returns the input as it is. The basic_cleaners function seems to do that properly, and we checked that it returns the input unchanged without issues.
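If someone wants an explicit pass-through instead, a minimal sketch could look like the following. The name passthrough_cleaners is hypothetical and not part of Mozilla TTS; the assumption is that it would have to be added to the cleaners module so that the text_cleaner setting can find it by name.

```python
import re

_whitespace_re = re.compile(r"\s+")


def passthrough_cleaners(text):
    """Hypothetical cleaner: collapse whitespace only, never transliterate."""
    # No lowercasing, no ASCII conversion, no abbreviation expansion,
    # so Arabic letters and diacritics reach the phonemizer untouched.
    return _whitespace_re.sub(" ", text).strip()


sample = "  \u0645\u0631  \u062d\u0628\u0627 "
print(passthrough_cleaners(sample))
```

Collapsing whitespace is kept only because stray spaces and newlines in transcripts are rarely meaningful; drop that line too if you want the text completely untouched.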
So thanks again, I really appreciate your work on Mozilla TTS.
I installed espeak-ng to get “ar” support, but got the error “xcb_connection_has_error() returned true”. I fixed it by running “unset DISPLAY”, and training is running now. I will report the results.