Suggestions for the sentence collector

sentence-collection
#1

Here are some suggestions I have after reviewing all the sentences in Danish.

  • A better UI, more like the clip recorder/validator, where you get one sentence and then click yes/no. It gets tiring to click up - up - up - next page - up - up - down - repeat, etc.

  • Visibility of the requirements directly on the review page, and upload page. I had to go through a lot of bad and broken sentences.

  • (!!) Randomization of the order of sentences to review (or at least some kind of algorithmic mix-up of the order). Instead of having to go through 50-100 sentences from the same boring text and then stopping, it would be more fun and encouraging to keep going if they were mixed up. There would also be more diversity in the sentences being validated.
    As an example for Danish right now: 641 total sentences. Only 80 validated sentences. If I understand the tool correctly those 80 sentences are most likely just among the ones to get added early on. So if I add 100 new quality sentences they will just get added to the end of the queue, and people will still have to validate all the sentences before it. And if one user uploads 200 really bad sentences early on, every user will have to go through them until they are validated or discarded.
    I hope it makes sense!

  • More visibility/links to find this tool from the main site would be great. Like directly on this page https://voice.mozilla.org/da (or any other language without enough sentences)

Common Voice Sentence Collection Tool launch
(Michael Maggs) #2

Have a look at GitHub, where several of your suggestions have already been noted: https://github.com/Common-Voice/sentence-collector/issues/174. It’s been suggested that further discussion should continue here on Discourse in a new thread, but no-one has opened that yet. Feel free to do it - more discussion would indeed be useful.

(Michael Kohler) #3

Yes, let’s create a new thread. Unfortunately I don’t have the permissions here to split out topics. @Michael_Maggs or @sixten, it would be great if one of you could create a new topic here to further discuss these suggestions :slight_smile:

(Rubén Martín) #4

Topic split into a new topic.

(Michael Maggs) #5

Copying here so it doesn’t get lost, a suggestion previously made in the GitHub thread:

The sentences appear for review in the same order that they were submitted (grouped by the number of previous votes, I understand). Where they come from a public domain source, such as a book, that tends to result in similar words and phrases appearing close together as the sentences sequentially track through the story. That makes them rather boring to review, and it would give reviewers a better experience if all the review sentences could be randomized - not only within a single upload, but also across uploads.
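To make the suggestion concrete, here is a rough sketch of what shuffling across uploads while keeping the vote grouping could look like. It is not the sentence collector's actual code: the `Sentence` fields, the `review_order` function, and the choice to show sentences that already have votes first are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentence:
    text: str
    upload_id: int  # hypothetical: which upload batch the sentence came from
    votes: int      # hypothetical: number of review votes received so far


def review_order(sentences, seed=None):
    """Return the sentences shuffled across uploads, grouped by vote count."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)  # mixes sentences from all uploads together
    # sorted() is stable, so the random order survives inside each vote group.
    # Assumption: sentences that already have votes are shown first.
    return sorted(shuffled, key=lambda s: s.votes, reverse=True)


if __name__ == "__main__":
    queue = [
        Sentence("First sentence from a book", upload_id=1, votes=0),
        Sentence("Second sentence from the same book", upload_id=1, votes=0),
        Sentence("A sentence from a later upload", upload_id=2, votes=0),
        Sentence("A sentence someone already voted on", upload_id=1, votes=1),
    ]
    for sentence in review_order(queue):
        print(sentence.votes, sentence.upload_id, sentence.text)
```

With something like this, a new upload of good sentences would no longer sit behind every earlier upload, since position within a vote group would not depend on submission time.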

(Daniele Scasciafratte) #6

As I wrote on https://github.com/Common-Voice/sentence-collector/issues/210, if the review could be sent as soon as all the sentences are marked, it would simplify things a lot.
Otherwise I will write an extension, because I have 1020 pages of sentences to review. There are only 5 of them at a time, so the task is already very time-consuming and boring, and the extra step of clicking this button makes it even more tedious.