We’ve had the sentence & clip reporting feature up for about a month now. So far the reports just go into our database, with no automatic action being taken. Attached to this post you’ll find all the reports we’ve received so far (Content Warning for the reported clips with the “offensive-speech” tag):
Some ideas that were already floated with regards to what we should do based on the reports:
Auto-downvote clips that people report (only for particular reasons given?)
Disable recorded sentences and mark their respective clips as invalid
There might be more or different things we can do here:
How do you think we should deal with sentences and clips with reports?
Should we deal with each category differently? How?
I looked at a few of the clips, and it seems people are flagging them for things like bad quality, which is what the No button is for. So clearly there is confusion over when to use No and when to use Report.
Another thing is that choosing the “Other” option seems to assign it to the clip when the complaint could be about the sentence. There’s no way to specify.
In terms of what actions should be taken:
Flagging up a clip should count as a No vote for that user.
We’ve had users record 50 or 100 clips in a row of racist abuse. There needs to be some way of getting those other clips out of rotation quickly once a certain number have been flagged. Maybe if 3 different users flag 3 different clips by that user in a certain space of time it puts that user’s remaining clips in quarantine? It could be limited to apply only to users with new accounts or with no approved recordings.
Five votes for the same reason on a sentence removes it from rotation. Eventually there should be some way of viewing this in Sentence Collector and submitting a corrected one if necessary.
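To make the two threshold ideas above a bit more concrete, here is a rough Python sketch. It is not how Common Voice stores or processes reports; the `Report` record, the function names and the 24-hour window are all made up for illustration, and I'm assuming "3 different users flag 3 different clips" means at least 3 distinct reporters and at least 3 distinct clips, and that the five sentence votes must come from five different users.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class Report:
    """Hypothetical report record; the field names are assumptions."""
    reporter_id: str                # user who filed the report
    target_id: str                  # id of the reported clip or sentence
    reason: str                     # report category chosen by the reporter
    created_at: datetime
    owner_id: Optional[str] = None  # user who recorded the clip (clips only)


def should_quarantine(reports: Iterable[Report], owner_id: str, now: datetime,
                      min_reporters: int = 3, min_clips: int = 3,
                      window: timedelta = timedelta(days=1)) -> bool:
    """True if enough distinct users flagged enough distinct clips by
    `owner_id` within `window` before `now` (window length is an assumption)."""
    recent = [r for r in reports
              if r.owner_id == owner_id
              and timedelta(0) <= now - r.created_at <= window]
    reporters = {r.reporter_id for r in recent}
    clips = {r.target_id for r in recent}
    return len(reporters) >= min_reporters and len(clips) >= min_clips


def sentence_removal_reason(reports: Iterable[Report], sentence_id: str,
                            threshold: int = 5) -> Optional[str]:
    """Return the reason that reached `threshold` distinct reporters for
    this sentence, or None if no single reason has enough votes yet."""
    reporters_by_reason = defaultdict(set)
    for r in reports:
        if r.target_id == sentence_id:
            reporters_by_reason[r.reason].add(r.reporter_id)
    for reason, reporters in reporters_by_reason.items():
        if len(reporters) >= threshold:
            return reason
    return None
```

Counting distinct reporters rather than raw reports means one person can't trigger either rule by reporting repeatedly. The restriction to new accounts or users with no approved recordings would be an extra check before calling `should_quarantine`.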
I’d like to suggest adding one or more admins to manage the feedback. Sometimes people make good recordings of misspelled sentences, so we could correct the wrong sentences rather than lose the recordings.
@nukeador,
I would suggest an admin (or admins) for each locale. That’s what I’m asking for. I don’t know how other locales are organised. For Kabyle, we have a significant number of graduates from language departments who can help with corrections. I’m planning a training session for some of them (they are not technical) to help them with GitHub, since there is no suitable UI for making corrections.
NB: Some reported errors actually aren’t errors.
@belkacem77 I think that’s a great idea. I’m seeing some errors in the Portuguese corpus; I report the sentences, but nothing happens. If I had access to the repo, I could fix them and do a commit.
Has a policy for this been determined yet? Because now that a lot of the harder foreign words have been filtered out, I’m finding the most common reason I flag a sentence is because of poor grammar. Some examples:
In those days the councillors was called commissioners.
With these schools they creates the Easton Valley Community School District.
The storm caused severe flooding states such as New Jersey, New York and Pennsylvania.
The is the only line.
I can’t really think of a good way of automatically filtering these out, so I think the reporting function is really the only way. Of course, this requires someone to actually see it (which is a bad user experience) and flag it, by which time it’s probably already been recorded, but I really can’t think of a way to do it accurately without human review.
Thanks for pinging back on this.
Since we have been moving our main dev around, we haven’t been able to properly prioritize this feedback.
@mbranson is this something we can make sure we have in the backlog so it is properly triaged when we have time?
Just realized I didn’t respond here, apologies! Yes, this is captured in our backlog and we’re working to prioritize this among various other needs / requests as part of 2020 planning. In the immediate future the Common Voice team’s focus is on infrastructure improvements, database optimization and releasing the latest dataset. In the meantime, please keep providing feedback and proposals here. Our goal is to incorporate these learnings into the reporting feature improvements.