Margarita
(Margarita)
October 5, 2019, 12:34am
1
Hi,
I’m trying to train Tacotron2 on a new Chinese TTS dataset, using the dev branch.
If I set “use_phonemes”=true and “phoneme_language”=zh, then the following errors occur:
The version is eSpeak text-to-speech: 1.48.03 and phonemizer-1.1
available backends: festival-2.5.0, espeak-ng-1.49.2, segments-2.1.0
What confuses me is that espeak with language “zh” worked well when training Tacotron on the master branch.
Has anyone been through this before? Thanks!
nmstoker
(Neil Stoker)
October 5, 2019, 1:33am
2
Have you managed to get eSpeak to output speech on the command line using zh? If you can’t get that working, then phonemizer (and in turn TTS) is also likely to struggle.
I’ve no experience with training for Chinese, but I wonder whether you might be better off with espeak-ng. If you use that, I believe the language code is slightly different: cmn
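One rough way to check which code each installed binary reports is to read its `--voices` listing. This is only a sketch: the helper names `parse_voice_codes` and `installed_voice_codes` are made up for illustration, and it assumes the listing starts with a header row and carries the language code in the second column. It does not look at the trailing "Other Languages" column, so an alias listed only there would be reported as not listed.

```python
import shutil
import subprocess


def parse_voice_codes(voices_output):
    """Collect language codes from the text printed by `espeak --voices`.

    Assumes the first line is a column header and the second
    whitespace-separated field of each row is the language code.
    """
    codes = set()
    for line in voices_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2:
            codes.add(fields[1])
    return codes


def installed_voice_codes(binary):
    """Return the codes reported by `binary`, or None if it is not on PATH."""
    if shutil.which(binary) is None:
        return None
    result = subprocess.run([binary, "--voices"], capture_output=True, text=True)
    return parse_voice_codes(result.stdout)


if __name__ == "__main__":
    for binary in ("espeak", "espeak-ng"):
        codes = installed_voice_codes(binary)
        if codes is None:
            print(f"{binary}: not installed")
            continue
        for code in ("zh", "cmn"):
            status = "listed" if code in codes else "not listed"
            print(f"{binary}: {code} {status}")
```

If a code shows up as listed, a quick follow-up is to try speaking a short phrase with it, for example `espeak -v zh "你好"`, before going back to training.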
# Languages
The languages in espeak-ng are grouped by their
[ISO 639-5](https://en.wikipedia.org/wiki/List_of_ISO_639-5_codes) language
family code. They are identified by their
[BCP 47](https://en.wikipedia.org/wiki/BCP47) identifier. For several accents
and dialects,
[private-use extensions](https://raw.githubusercontent.com/espeak-ng/bcp47-data/master/bcp47-extensions)
have been used.
The 107 supported languages and accents are:
| Family Code | Identifier | Language Family | Language | Accent/Dialect |
|-------------|-------------------|-----------------------|-----------------------------|------------------------|
| `gmw` | `af` | West Germanic | Afrikaans | |
| `ine` | `sq` | Indo-European | Albanian | |
| `sem` | `am` | Semitic | Amharic | |
| `sem` | `ar` | Semitic | Arabic<sup>\[1\]</sup> | |
| `roa` | `an` | Romance | Aragonese | |
| `ine` | `hy` | Indo-European | Armenian | East Armenian |
This file has been truncated.