Adding Cantonese/ zh-hk


#1

Hi all,

I'm interested in helping to add the Cantonese / Chinese (Hong Kong) / Yue Chinese language.

  • Will Mozilla follow the Firefox convention and give it the zh-HK language code?

  • If I would like to submit sentences to the Global Sprint form (https://voice-sprint.mozilla.community/upload/), how should I name this language?

  • Would it be possible to start a Pontoon project to translate the UI into Cantonese?


(terry123) #2

I am also interested in helping.


(Michael Henretty) #3

Great to hear! Let’s start with the localization of the website. Can you send an email to me (mikey (at) mozilla (dot) com) and Peiying (pmo (at) mozilla (dot) com) to get the process started?


#4

Hi Terry! The UI translation project is enabled here now: https://pontoon.mozilla.org/zh-HK/
:star_struck::raised_hands:


(terry123) #5

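@dtylam, let's get started!!

After that, let's talk about how we can collect more voice recordings. We could bring in some friends to discuss how to go about it.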


(Liwangyau) #6

Hi, I am Kenneth. I just found that the localization part is almost done. Is there anything I can help with on the Common Voice project (Cantonese / zh-HK)? Thanks!


(Luc Salommez) #7

Hi Kenneth, in order to start gathering voice samples from people, we first need to gather written sentences for people to read.

For every language, we need 5,000 sentences before it can be launched.
Cantonese is currently at 636 / 5,000 gathered sentences.

If you want to help get Cantonese launched, you can write sentences for people to read.

If you have programming skills, you can also look for open datasets of Cantonese sentences that are not under copyright (for example, books in the public domain) and parse them.

Sentences should be fewer than 15 words, and ideally each one should take between 3 and 5 seconds to read aloud.

The Common Voice team is currently working on a platform for submitting sentences. Until that platform is available, you will have to submit your sentences by making a pull request here: https://github.com/mozilla/voice-web

Your file should contain at least 50 sentences, with one sentence per line.

Bonus: the more diverse your sentences are, the better! (For example, diversity in vocabulary.) :slight_smile:
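
If you are parsing a dataset, a small script can check your file against the limits above before you open a pull request. This is only a rough sketch and not an official Common Voice tool: the function names are made up for illustration, and the way it counts "words" is an assumption (each Chinese character counts as one word, and each run of other letters or digits counts as one word), since nothing above defines how to count words in Cantonese.

```python
import re

MAX_WORDS = 14       # "fewer than 15 words"
MIN_SENTENCES = 50   # "at least 50 sentences"

# Assumption: one CJK ideograph = one word; any other run of
# letters/digits = one word. Punctuation and spaces are ignored.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[^\W\u4e00-\u9fff]+")


def count_words(sentence):
    """Count words in a sentence under the assumption stated above."""
    return len(_TOKEN.findall(sentence))


def check_sentences(lines):
    """Return (kept, problems) for an iterable of lines, one sentence per line.

    kept     -- sentences that pass the length limit
    problems -- (line_number, message) pairs; line_number 0 is used for
                problems that concern the file as a whole
    """
    kept, problems = [], []
    for number, raw in enumerate(lines, start=1):
        sentence = raw.strip()
        if not sentence:
            continue  # skip blank lines
        words = count_words(sentence)
        if words > MAX_WORDS:
            problems.append((number, f"too long ({words} words)"))
        else:
            kept.append(sentence)
    if len(kept) < MIN_SENTENCES:
        problems.append(
            (0, f"only {len(kept)} usable sentences, need {MIN_SENTENCES}")
        )
    return kept, problems
```

To run it on a file (the file name here is just an example), call `check_sentences(open("sentences.txt", encoding="utf-8"))` and look at the returned list of problems. It does not check reading time, so you should still read the sentences aloud to see whether they fit in 3 to 5 seconds.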