What if people are using text-to-speech to record?

DaDiRa · March 11, 2019, 9:06pm

I’ve come across sentences that were recorded with text-to-speech. Should I vote them positively or not?

nukeador · March 11, 2019, 9:05pm

Welcome to the community discourse!

This is interesting; I haven’t come across this situation. Did you have the chance to document which sentences were using this?

I would say that this is not ideal, since it is the same voice over and over again, so having more than 15 minutes of this voice is not super helpful. We really need at least 1000 different and diverse voices for each language, and this is definitely not very diverse.

DaDiRa · March 11, 2019, 9:04pm

No, I didn’t document it, but I’ll do so from now on. This has happened about 3 times so far in the 55 sentences I’ve voted on.

lissyx · March 11, 2019, 9:05pm

I’m pretty sure this is something we already discussed with @kdavis, and the answer was a clear no, as far as I can recall. Not only is it not going to be very good for the dataset, but chances are that it is against the terms of use of the text-to-speech service.

kdavis · March 12, 2019, 8:44am

If the voice is indeed synthetic the clip should be marked as invalid, and I agree with @nukeador that…

Michael_Maggs · March 13, 2019, 3:54pm

I’ve added this to the draft reviewing guidelines, here:

Codigo_Logo_Programacao_e_Inteligencia_Artificial · May 10, 2019, 11:00am

Hi, so people are recording TTS clips. I’m rejecting them, since it doesn’t make sense to have them in the dataset. I’m worried this will slow down the validation process of actual clips.

nukeador · May 10, 2019, 11:00am

@Codigo_Logo_Programacao_e_Inteligencia_Artificial how many of these have you found?

Codigo_Logo_Programacao_e_Inteligencia_Artificial · May 10, 2019, 6:13pm

@nukeador About 10 in a set of 150 clips.

nukeador · May 14, 2019, 11:28pm

@gregor, is there a way we can help people identify and flag these clips so we can find out who is sending them?

gregor · May 15, 2019, 10:42am

Unfortunately we don’t have flagging functionality yet, though it’s been requested a couple of times already.