The Sentence Collector is going to change!

We’ve been working on some exciting changes to the Common Voice Sentence Collector tool you know and love!

First, many thanks to Michael Kohler for his work building our existing Sentence Collector tool. And thanks to Justin Grant for his 3 months of user research into the Sentence Collector, as well as all of you who participated in that research. Based on this research, we’ll soon be integrating the Sentence Collector more closely into the Common Voice platform for a more connected experience. We will be rolling this out in phases over several months, and expect the first iteration of these changes to be available as part of a March 2023 release.

In our first release, you can expect a feature set similar to the current Sentence Collector, but with a new look and location.

Over time, we will be adding more features. For example, we expect to bring bulk sentence upload into the sentence collection UI shortly afterwards.

New features will include:

  • A UI for bulk sentence upload (GitHub will still be available for those who prefer that)
  • In-line editing for bulk ingestion reviewers
  • A more streamlined process to speed up legal reviews on our side
  • The ability to flag sentences for harmful content

As a result of the initial migration of the Sentence Collector experience into the core MCV platform, there will be some downtime. This is likely to be in early March, but we will confirm once we have a more specific time.

Once the migration / transition is complete, the Sentence Collector in its current form will be deprecated and will no longer be available for use.

Many thanks again to everyone for the feedback and sentence contributions that steered us toward making these improvements!


Hi Jesslyn, thanks for sharing this with the community! I am super excited to finally see this happening, as this was something I talked about more than 2 years ago, and it just never happened due to limited resources on both my side and the CV team side. I think there are many synergies here and we can simplify quite a few processes. Also thanks to the CV team for involving me in this project.

To be correct here, I usually go for “maintaining” as I was not involved in the very first iteration of the Sentence Collector. I then took it over and created the current version of it and extended it, but the very first version I can’t take credit for. :slight_smile: @mhenretty take some credits if you still read here :wink:

I’m sure everyone would love to see those :wink:

Keep up the good work!


I will lengthen my future citations to include @mhenretty and appreciate the correction.

:sweat_smile: on the screenshots bit, it’s been a long week

Long awaited, congrats on getting this shipped.


Thanks for the shoutout @mkohler! To be clear, I built a very rough prototype of the sentence collector years ago. It was @mkohler who turned it into a full production application and process.

I am continually in awe of all the incredible work being done for Common Voice. I too had hoped the sentence collection would one day be part of the main site, so I think these changes are great!

Keep up the great work @jesslynnrose, @mkohler and the entire Common Voice community!!!


I for one am really enjoying the Michael <> Michael passing back and forth of polite credit!

I’m also glad to hear you’re excited about the changes to come, thank you (both!) for all your work on the project that got us here!