November community campaign

Good Monday everyone,

This month we followed up on our community campaigns and, as we did in October, we also ran efforts through snippets and social media in the languages for which we had at least 1M sentences.

This time the efforts were run in: English, German, French, Spanish and Italian.

Once again, the Common Voice November community campaign built on the success of the previous month. Last week the number of weekly recorded and validated clips increased by between 1.5x and 36x, with the largest increase coming from our new campaign participant, Italian!

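| Language | Recorded | Validated | Increase from previous week |
|----------|----------|-----------|-----------------------------|
| English  | 47 hrs   | 30 hrs    | 8x                          |
| Italian  | 36 hrs   | 16 hrs    | 36x                         |
| Spanish  | 18 hrs   | 11 hrs    | 9x                          |
| French   | 16 hrs   | 9 hrs     | 3x                          |
| German   | 15 hrs   | 8 hrs     | 1.5x                        |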

We also got 57K visitors, 6 times more than the previous week.

Thanks to everyone who participated in and supported this effort!



That’s great. Any idea why German only had an increase of 1.5x this time?

Also, congratulations on 1k hours of validated voice and 50k donors in the English dataset! I made this screenshot on Saturday:

Apparently Firefox in German also had a lot of marketing snippets last week, which is why we reached far fewer people.

As a follow-up, we plan to run a snippets campaign again next week (Dec 9th - Dec 15th) in the same languages.

@nukeador How far is Dutch from having 1M sentences?

Dutch currently has around 5K sentences. In my view, the easiest way to get to 1M sentences (or close) is to get technical people to help with the Wikipedia extractor for Dutch.

This is how we managed to get Spanish, German, French, Italian and Catalan to that number of sentences. Please let me know if you have any questions.


A post was split to a new topic: Generate sentences using ML