Sentence & Clip Reporting Strategy


We’ve had the sentence & clip reporting feature up for about a month now. So far the reports just go into our database, with no automatic action being taken. Attached to this post you’ll find all the reports we’ve received so far (Content Warning for the reported clips with the “offensive-speech” tag):

Some ideas that have already been floated for acting on the reports:

  • Auto-downvote clips that people report (only for particular reasons given?)
  • Disable recorded sentences and mark their respective clips as invalid

There might be more or different things we can do here:

  • How do you think we should deal with sentences and clips with reports?
  • Should we handle each report category differently? How?

Thank you!

I looked at a few of the clips, and it seems people are flagging them for things like bad quality, which is what the No button is for. So there is clearly confusion over when to use No and when to use Report.

Another thing is that choosing the “Other” option seems to assign the report to the clip even when the complaint could be about the sentence. There’s no way to specify which one you mean.

In terms of what actions should be taken:

  1. Flagging up a clip should count as a No vote for that user.

  2. We’ve had users record 50 or 100 clips in a row of racist abuse. There needs to be some way of getting those other clips out of rotation quickly once a certain number have been flagged. Maybe if 3 different users flag 3 different clips by that user in a certain space of time it puts that user’s remaining clips in quarantine? It could be limited to apply only to users with new accounts or with no approved recordings.

  3. Five votes for the same reason on a sentence removes it from rotation. Eventually there should be some way of viewing this in Sentence Collector and submitting a corrected one if necessary.
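
To make points 2 and 3 concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration only: the `Report` fields, the function names, the 24-hour window and the reading of "3 different users flag 3 different clips" as "at least 3 distinct reporters and at least 3 distinct clips inside the window". It does not reflect how reports are actually stored, and it leaves out the suggested limit to new accounts.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


# Hypothetical report record; the real database schema is not known here.
@dataclass(frozen=True)
class Report:
    reporter_id: str      # user who filed the report
    target_id: str        # id of the reported clip or sentence
    owner_id: str         # user who recorded the clip ("" for sentences)
    reason: str           # e.g. "offensive-speech", "other"
    created_at: datetime


def speakers_to_quarantine(clip_reports, window=timedelta(hours=24),
                           min_reporters=3, min_clips=3):
    """Return speakers whose clips were flagged by enough distinct users,
    across enough distinct clips, inside one sliding time window."""
    by_owner = defaultdict(list)
    for report in clip_reports:
        by_owner[report.owner_id].append(report)

    flagged = set()
    for owner, reports in by_owner.items():
        reports.sort(key=lambda r: r.created_at)
        start = 0
        for end in range(len(reports)):
            # Shrink the window until it spans at most `window`.
            while reports[end].created_at - reports[start].created_at > window:
                start += 1
            in_window = reports[start:end + 1]
            reporters = {r.reporter_id for r in in_window}
            clips = {r.target_id for r in in_window}
            if len(reporters) >= min_reporters and len(clips) >= min_clips:
                flagged.add(owner)
                break
    return flagged


def sentences_to_remove(sentence_reports, threshold=5):
    """Return sentences reported for the same reason by `threshold` users.
    Each user counts once per (sentence, reason) pair."""
    seen = set()
    counts = Counter()
    for r in sentence_reports:
        key = (r.target_id, r.reason, r.reporter_id)
        if key in seen:
            continue
        seen.add(key)
        counts[(r.target_id, r.reason)] += 1
    return {sentence for (sentence, _), n in counts.items() if n >= threshold}
```

Counting each reporter only once per sentence and reason is a deliberate choice in this sketch, so that a single user cannot pull a sentence out of rotation by reporting it five times.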

I’d like to suggest adding an admin (or admins) to manage the feedback. Sometimes people make good recordings of misspelled sentences, so we could correct the sentences rather than lose the recordings.

@gregor1, is the TSV file UTF-8 encoded?

That’s correct! :slightly_smiling_face:
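
For anyone who wants to look through the attached reports, a minimal Python sketch for loading a UTF-8 TSV is below. The function name `load_reports` and the file path in the usage line are placeholders, and the column names come from whatever header row the file has; nothing about the attachment's layout is assumed beyond tab-separated values with a header.

```python
import csv


def load_reports(path):
    """Read a tab-separated file as UTF-8 and return one dict per row.

    Quoting is disabled because sentences may contain unbalanced quote
    characters that should be kept as plain text.
    """
    with open(path, encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


# Usage (hypothetical path):
# rows = load_reports("reports.tsv")
# print(len(rows), "reports loaded")
```

Passing `encoding="utf-8"` explicitly matters on systems where the default encoding is something else, otherwise non-ASCII sentences can come out garbled.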

How would you scale your suggestion for languages where we can potentially have a lot of reports?

Is there a way to crowdsource what you are proposing so we don’t have to rely on just a few individuals (potential bottleneck)?

@nukeador,
I would suggest an admin (or admins) for each of those locales; that’s what I’m asking for. I don’t know how other locales are organised. For Kabyle, we have a large number of graduates from language departments who can help with corrections. I’m planning a training session for some of them (they are not technical) to help them with GitHub, since there is no suitable UI for making corrections.
NB: Some reported errors actually aren’t errors.

@belkacem77 I think that’s a great idea. I’m seeing some errors in the Portuguese corpus; I report the sentences, but nothing happens. If I had access to the repo, I could fix them and commit the changes.

@Codigo_Logo_Programacao_e_Inteligencia_Artificial, you can correct them on GitHub, but it’s not a suitable UI for people who aren’t comfortable with such tools.

@belkacem77 I’m glad to hear this. I’ll take some time this week to improve our repo; it will certainly be a great step towards our goal.

My current thinking is about how we can implement something that scales and that the community can manage.

It would work the same way we validate clips, so that we have more eyes and hands helping rather than relying on a few individuals.
