Happy International Women’s Day!
This year’s theme is Break the Bias.
Gender bias is present in speech recognition, and Common Voice can play a role in tackling it by reflecting on and working towards co-liberation in how we create the dataset, from community engagement to reflecting on how power enables or disables people.
I wanted to highlight some opportunities and reading that might be of interest:
Apply for the Doria Feminist Fund on Knowledge Production if you are based in the MENA region
Read "Investigating the Impact of Gender Representation in ASR Training Data: a Case Study on Librispeech"
Join a local Women in Voice Community Group (open to all genders)
Join the Gender, Tech and Intersectionality Sessions at Mozfest
Why have I added these resources?
I believe that our bulk sentence collection process can reinforce gender bias, as we rely on public domain texts from authors who died 70+ years ago. In the 1950s, only a quarter of New York Times bestsellers were written by women.
Also, the use of Wikipedia brings its own biases: word embeddings reproduce stereotypes (she’s a homemaker and he’s a doctor), and Women in Red are hardly ever seen. Our corpora production lacks space for feminist knowledge production. How can we break the bias when online mobility is punctuated by the patriarchal gaze?
Call to action
If you have any texts or opportunities you would like to share, please feel free to add them in the comments section. I would love for the resources to be in multiple languages, and I will be adding these to the comments.
If you have made any interventions to support the inclusion of women or to diversify text in gendered languages, please share them with the community.
Thanks, everyone!