Weekly Update Thread 2023

Hello! It’s a new(ish) year, weekly updates are back and we’re so excited to be bringing you all the Common Voice news:

From the community:

From the team:

MozFest Panel Discussion: Breaking the Anglocentricism of the Internet - Perspectives from the Common Voice Community

Date and Time: 03-21, 15:00–16:00 (Europe/Amsterdam)

Join a panel of Common Voice community members as they tackle a call for greater diversity, inclusivity, and representation on the Internet to create a more equitable global digital community. Be part of the conversation as they share their lived experiences of the Anglocentric digital world and their ongoing efforts to dismantle Anglocentrism. We hope that the panel discussion will also provide feasible solutions that contribute to an inclusive digital community that accurately reflects the diverse linguistic and cultural communities across the globe.

Governance & Algorithmic Design in Speech AI - with Mozilla Common Voice and NVIDIA

Date and Time: 03-21, 16:00–17:00 (Europe/Amsterdam)

Technologists and curious practitioners are welcome to hear a talk about how governance considerations can be baked into every stage of building a speech recognition algorithm, from data collection through to model testing. There will be several interactive scenarios for workshopping together.

:wave: That’s it for this week.

Did this update miss something important? Are you doing something cool we can help you show off? Reply to this thread, message Gina or myself, chat to us on Matrix or email commonvoice@mozilla.com to let us know about it, or just say hello!

Thanks Jess, great to have weekly updates back

It’s time for your weekly update from the Common Voice team! As always, if you’re doing something cool that we missed (or have something coming up we can show off for you) just reply and let us know!

Tomorrow, online: Our East African language communities have been doing incredible work, with a focus on Kinyarwanda, Kiswahili and Luganda. You can come learn about their successes at 4pm EAT by signing up for Creating community-driven datasets: Insights from Mozilla Common Voice activities in East Africa.

Call for presenters: Are you an academic working in or with under-resourced languages? RESOURCEFUL 2023 is seeking submissions from speakers working in this space for their conference on May 22-24th in Tórshavn, Faroe Islands. They’re looking for submissions of 4-8 pages, and you can learn more about the CFP and the conference itself here.

IRL in South Africa: The launch workshop for the “Voices of Mzansi” project will be held on behalf of GIZ’s Fair Forward project and Stellenbosch University. This project aims to localise the South African languages on Mozilla’s Common Voice platform and collect voice data that is open and accessible for everyone. The workshop will be at the CSIR International Convention Centre on 16 March 2023 at 08:30. We’ll add more details as we find them!

Over at the Mozilla Corporation: The Mozilla community call on March 9th will be looking at how Firefox handles machine translations with privacy in mind. This might be of interest to those of you (most of you?) excited about languages and translation technologies. The presenter, André Natal, was involved in the early days of the Common Voice project. Catch it when it goes live: https://www.youtube.com/watch?v=J06koBcfm5w

:wave: That’s it for this week.

Did this update miss something important? Are you doing something cool we can help you show off? Reply to this thread, message Gina or myself, chat to us on Matrix or email commonvoice@mozilla.com to let us know about it, or just say hello!

It’s time for another weekly update!

Common Voice at MozFest

Want a refresher on the basics of what Common Voice is? Here’s an overview of the project focused on our Tamil language community!

@gina is running a really exciting session on Breaking the anglocentricism of the internet - perspectives from the Common Voice community

And I’ll be putting together a holistic look at the governance and design principles behind each stage of the Speech AI lifecycle: Governance & Algorithmic Design in Speech AI - with Mozilla Common Voice and NVIDIA

The Funder’s Track for MozFest is also doing some exciting things. I would encourage you to check it out and share with your contacts if it’s of interest!

Speaking of presentations, @stergro recently spoke at an Open Data Day in Karlsruhe and wanted to share his thoughts (and slides!)

The Sentence Collector is changing
We’re bringing the Sentence Collector into the Common Voice platform more neatly in this upcoming release. More details can be found in this post.

Dataset 13.0 is coming
We’re working on the release of the freshest Common Voice data in the new dataset. We’ll be updating you shortly (later today!) with more details.

Did we miss something?
Reply to this thread or message Jess or Gina if we can be bragging about the cool things you’re doing with Common Voice!

Thanks for the update and your mention of my slides! Maybe it’s worth pinning this thread on top of the forum like the old one?

Quite ironic how the talk about anglocentrism is presented haha:

You stole part of my speech :slight_smile:

It’s time for your weekly update from the Common Voice team! :tada:

MozFest Recap: Last week was MozFest 2023, an event that stresses the importance of creating a fair and inclusive internet accessible to all. This year, over 6,000 volunteers, facilitators, wranglers and active participants from various corners of the globe took part. The festival featured insightful discussions, interactive sessions on art, music and food, and meaningful exchanges of ideas among the attendees. We extend our gratitude to the members of the community who played vital roles as attendees and panelists at MozFest. “The festival serves as a unique platform that brings together two seemingly different groups - tech experts and human rights activists,” one of the speakers highlighted. At its core, MozFest is about people and community. We hope that you had the opportunity to be part of this incredible experience and enjoyed it. If you were unable to attend the event or wish to revisit some of the sessions, you can visit the MozFest website to access recorded sessions.

Data Science for Health in Africa Virtual Networking Exchange on 3 May 2023: Attend the Virtual Networking Exchange on 3 May 2023 to meet and interact with organizations working on data science and health in Africa! During this completely online event, you will learn about exciting work happening across the continent, share information about your work, and identify potential collaborators. The Networking Exchange is completely free and open to any organization working on data science and/or health in Africa.

How to Participate: To participate, you may register as a participant or as a presenter.

  • Participants may join the event and move around between different Zoom rooms to learn about different data science and health organizations.
  • Presenters may represent their organization on the agenda for the event. They will have a Zoom breakout room for an allotted time during which they can share information about their organization and interact with participants.

More information on the Virtual Networking Exchange and registration is available on the DS-I Africa website.

Note: The deadline to register as a presenter is 14 April 2023. There are a limited number of presenter slots available, so please register as soon as possible.

Uzbek AI: The Uzbekvoice.ai team has collected about 1400 hours of high-quality audio recordings with accompanying texts to create a valuable resource for the Uzbek language. Common Voice recognizes, commends, and appreciates the team’s dedication to building open and accessible resources for language technology. Read more about it and get in contact with the team here: Uzbekvoice.ai Project.

Did this update miss something important? Are you doing something cool that we can help you show off? Reply to this thread, message Jess or myself, or chat to us on Matrix :)

Hello and welcome back to another weekly(ish) update from the Common Voice team!

Our Catalan community is going to be running a contribution campaign April 14th-16th. They’ve been doing just a stellar job and I bet they’re going to meet their ambitious goal of 3000 contributed hours! Come help them out or cheer them on.

@chenaichair joined the BBC recently to talk about global access in AI and about Common Voice. While (I think) the whole segment is interesting, you can skip to 7:18 to listen to her shine. You can listen here (requires a BBC Sounds login).

Common Voice has been nominated for 2 Webby awards, for Accessible Technology and Responsible Innovation. While votes for us do help get more visibility on the project, we’re just excited to be nominated!

We’ve also updated the sentence corpus for Catalan, Abkhaz, and Esperanto! Release notes here.

Did we miss something? Do you have something cool you want us to talk (or brag!) about? Reply here, message @gina or myself, or chat to us on Matrix!

Voted… But it requires registration…

Hi everyone :smiley:

Welcome back to another weekly update from the Common Voice team :tada:!

Our Catalan community contribution campaign is still ongoing until April 16th. Read more about the campaign here.

The voting period for the Webby Awards remains open until April 20th, and Common Voice has been nominated in the categories of Accessible Technology and Responsible Innovation. If you could spare some time to cast your vote for Common Voice, it would be greatly appreciated.

Lacuna Fund announced two new calls for proposals to develop open and accessible machine learning (ML) datasets that will improve Sexual, Reproductive & Maternal Health and Rights (SRMHR) and illuminate the relationship between Climate & Forests to help identify interventions that could mitigate or adapt to climate impacts. Read more and access the full requirements and submission portal here. Round 1 closing date is April 19th, Round 2 June 20th.

Did this update miss something important? Are you doing something cool that we can help you show off? Reply to this thread, message @jesslynnrose or myself (@gina), or chat to us on Matrix :)

Hi everyone :smiley:

Welcome back to another weekly update from the Common Voice team!

Exciting News :tada:
We have new bulk sentences in last week’s update for Japanese and Swahili. Here’s the release.

Making the Latvian Language AI-Compatible :tada:
The Latvian Open Technology Association (LOTA) is taking the lead in a joint initiative to make the Latvian language work seamlessly with AI tools worldwide. LOTA is collaborating with partners to achieve this goal, and several activities are planned for the coming months.

The first two activities will happen around the 4th of May and on the 11th of May. The 4th of May is the day Latvia regained its independence, and for this occasion LOTA is planning a social media influencer campaign asking Latvians to record their voices on the Common Voice platform. On the 11th of May, LOTA will host an annual conference focusing on open data, where voice donations on Common Voice will also have a dedicated spot.

At a later stage, activities planned for this summer and autumn will be part of a research project by the Artificial Intelligence Laboratory at IMCS, University of Latvia. The project aims to gather Latvian voice donations on Common Voice.

LOTA is open to collaborating with any other organizations or individuals from Latvia working on or interested in contributing to Common Voice. For further information or enquiries, contact Raivis Dejus at raivis.dejus@gmail.com.

Did this update miss something important? Are you doing something cool that we can help you show off? Reply to this thread, message jesslynnrose or myself (Gina_Moape), or chat to us on Matrix :)

Hi everyone :smiley:

Welcome back to another weekly update from the Common Voice team!

The New Common Voice Sentence Collector :tada:

New look, new features for the new Common Voice Sentence Collector. Read more about it here.

Latvian Open Technology Association (LOTA) Initiative Success
The Latvian Open Technology Association (LOTA) has taken the lead in a joint initiative to make the Latvian language work seamlessly with AI tools worldwide. LOTA collaborated with partners to achieve this goal. The initial activity, held on the 4th of May, was a great success. Their campaigns were featured in the top 2 news programs:

During the campaign, the team managed to attract almost 7,000 people, about 400 of whom donated their voices on the Common Voice platform. They managed to increase the recordings from ~18 hours to ~88 hours. On the 11th of May, LOTA will host an annual conference focusing on open data, where voice donations on Common Voice will also have a dedicated spot.

The French Geek Festival 2023
Geek Faëries 2023 will be held on 03-04 June 2023 at the Château de Selles-sur-Cher, 1 Le Château, 41130 Selles-sur-Cher, France. The event aims to bring together tech enthusiasts. RSVP for the event here.

Did this update miss something important? Are you doing something cool that we can help you show off? Reply to this thread, message @jesslynnrose or myself (@Gina_Moape), or chat to us on Matrix :)

Great news! When will the new sentence collector be online?

Going live right now!

Live now at https://commonvoice.mozilla.org/ under the contribute tab!

Welcome back to the weekly update from the Common Voice Community team. We’ve had a busy week with the new Sentence Collector update. Now you can write your own original sentences right in the main Common Voice UI. Anyone can write or submit new sentences, but you’ll need to be logged in with an account to review new sentences. If you spot a bug or problem with the new Sentence Collector, could you let us know by raising an issue?

Right now you can only submit one qualifying sentence at a time using the Sentence Collector, so we’ve also created some new documentation showing you how to create a bulk sentence submission to support the corpus of your favorite languages.

If you work with voice and/or speech data and have the time to help academics better understand your dataset documentation practices, @kath at ANU has a short survey open now that you can take.

If you’re a French speaker (or ever wanted to learn a bit more French!), Common Voice will be represented at the upcoming Geek Faëries (https://www.geekfaeries.fr/) festival June 2-4th in Loir-et-Cher. A rare chance to contribute to Common Voice in person, in cosplay!

That’s it for this week. Did we miss something? Got something cool coming up we can share with the community? Or do you want us to brag about something amazing you’ve done? Just reply here, say hello on Matrix or email commonvoice@mozilla.com and we’ll joyously include you in the next update.

Best,

Jess

Hi Everyone :smiley:

New Week, New Update :tada:

Newly Launched Locale :dancer:
We are excited to announce the newly launched locale for Tamazight ‘zgh’. Common Voice continues to grow and we are grateful.

The French Geek Festival 2023
Geek Faëries 2023 will be held on 03-04 June 2023 at the Château de Selles-sur-Cher, 1 Le Château, 41130 Selles-sur-Cher, France. The event aims to bring together tech enthusiasts. RSVP for the event here.

Voice Data Collection
We are currently working on a “How to” guide for contributors who wish to gather voice data in their specific languages. Please email us or comment on this update if you have any recommendations, suggestions or input you would like us to include in the guide.

Did this update miss something important? Are you doing something cool that we can help you show off? Reply to this thread, message @Jess or myself (@ginamoape), say hello on Matrix, or email commonvoice@mozilla.com and we’ll joyously include you in the next update.

Hello and welcome to your weekly-ish update from the Common Voice team!

We’re still excited about the changes to the Common Voice Sentence Collector. If you see any weird bugs or unexpected behavior, could you let us know by raising an issue on GitHub?

The new Sentence Collector got put to work at a stunning event in the DRC run by our Kiswahili community, focusing on writing, speaking and review contributions and on enabling female contributors. So many excited thanks to fellow Rebecca Ryakitimbo Mwimbi for organizing this event in Lubumbashi.

Le Voice Lab hosted the MCV project to talk about innovation and inclusion in open source. The webinar video is available in French, for speakers of French (or learners!).

Our wonderful @gina is going to be speaking at AfricaAI in Kigali next week, looking at the Common Voice project.

Are you doing something exciting and want us to help shout about it? Reply here, DM me (or @gina!) or email us and we’ll include you in next week’s update.

Thank you for the update and sharing the exciting news! It’s fantastic to hear about the newly launched locale for Tamazight ‘zgh’.

I’m definitely interested in attending the French Geek Festival 2023, Geek Faëries. The event sounds like a great opportunity to connect with fellow tech enthusiasts.

Lastly, I don’t have anything specific to share at the moment, but I’ll definitely keep it in mind if there’s something cool I’d like to showcase.
