Hello everyone, I would like to open this topic to collect feedback from all our communities and partners. This topic contains a proposal crafted by the Common Voice and Deep Speech staff teams based on conversations with different volunteers and linguistics experts. We will keep this topic open f…

I’m assuming if I selected a city it would set the country and region fields automatically so that it was still searchable with broader criteria?

That’s correct, this is one of the lists we have been looking into.

(Trying to interpret what Jef meant:) An example is the word “aanrijden”. In most parts of the Netherlands, this would mean to hit somebody with e.g. your car, but in the South, it means that you just got into your car. It’s not slang, but a perfectly regular word that simply has an additional meani…

I think it’s generally a good idea to collect as much data as possible (and as much as people agree to), because based on that raw data it might be possible to optimize the model in the future, and raw, accurate data is always a good base for any work of this kind.

I largely agree with using birthplace data for accent data, which we currently use in the accent menu for the zh-cn and zh-tw locales. Many language researchers have told me that it is easier and simpler for people to choose and for them to process the data.

nukeador: That’s correct, this is one of the li…

irvin: I largely agree with using birthplace data for accent data

Please note that the proposal is not about “birth places” but “where is your accent coming from”, allowing people to self-evaluate this. You may have been born in a specific place, but now you are living in another and you rec…

nukeador: Please note that the proposal is not about “birth places” but “where is your accent coming from”, allowing people to self-evaluate this.

Sure, I’m giving feedback on how to come up with the list of places; it works whether it’s self-identified or not. Perhaps we should say “selec…

Please also consider accents, or a different sound, that come from a cultural background rather than a different location. For example, immigrants who use a different tongue, e.g. Turkish-German, Russian-German, … The same goes for non-native speakers, e.g. German-English, French-English, … which might result in a very diffe…

Matthias84: Please also consider accents, or a different sound, that come from a cultural background rather than a different location. For example, immigrants who use a different tongue, e.g. Turkish-German, Russian-German, … The same goes for non-native speakers, e.g. German-English, French-English, … which migh…

Hi Rubén, thanks for soliciting this feedback! Would you be open to including an accent category (or categories) for people with impaired enunciation from Dysarthria (see e.g. https://www.asha.org/Practice-Portal/Clinical-Topics/Dysarthria-in-Adults/ ) ? I’m developing a speech-to-text model for som…

🗣 Feedback needed: Languages and accents strategy

Common Voice
