Adding to @sanjaesc: in our German training you can already make out what the voice is saying at about 20k steps. Word endings are missing and some pronunciations are wrong, but you can clearly hear words. The quality is not great yet, of course; it gets better at 300k steps. I'd also suggest training Tacotron first and then moving on to a vocoder.