Speaker ID split between train/test/dev

Hi,

there’s a beta release for the new dataset in this thread:

You can split the related metadata file with the linked CorporaCreator tool, which makes sure that no speaker appears in more than one of the train/dev/test sets.
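
To illustrate what a split without speaker overlap means, here is a minimal Python sketch. It is not CorporaCreator's actual algorithm, just the general idea: group clips by speaker, then assign each speaker to exactly one set. The `client_id` and `path` column names, the split fractions, and the `split_by_speaker` helper are all assumptions for the example.

```python
import random
from collections import defaultdict


def split_by_speaker(rows, speaker_key="client_id",
                     dev_frac=0.1, test_frac=0.1, seed=0):
    """Assign each speaker, with all of their clips, to exactly one split.

    `speaker_key` is a hypothetical column name; adjust it to the metadata file.
    """
    # Collect every clip under the speaker who recorded it.
    by_speaker = defaultdict(list)
    for row in rows:
        by_speaker[row[speaker_key]].append(row)

    # Sort first so the seeded shuffle is reproducible across runs.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    # Fractions apply to the number of speakers, not the number of clips.
    n_dev = max(1, int(len(speakers) * dev_frac))
    n_test = max(1, int(len(speakers) * test_frac))
    groups = {
        "dev": speakers[:n_dev],
        "test": speakers[n_dev:n_dev + n_test],
        "train": speakers[n_dev + n_test:],
    }
    return {name: [row for spk in ids for row in by_speaker[spk]]
            for name, ids in groups.items()}


# Toy metadata: 10 made-up speakers with 5 clips each.
rows = [{"client_id": f"spk{i % 10}", "path": f"clip{i}.mp3"} for i in range(50)]
splits = split_by_speaker(rows)
```

Because the fractions are taken over speakers, the resulting sets can differ in clip count when some speakers recorded far more clips than others.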

If you’d rather not do that manual work, we’ll also publish a full release in less than a month.
