Adding the Occitan language: ideas and strategies

Hello,

I am writing to get information about the process and to share what I plan to do to add the Occitan language.
I attended a talk about Common Voice in Tolosa, France, last November. I think I understood the different steps, but to be sure I would like to list them and say what needs to be done.

1) Translating the website

I started some time ago, but it is a bit too much to do alone. I would like to teach people how to participate. I have already written an article about this (here: http://sapiencia.eu/traduire-mozilla-firefox-en-occitan/).
In January I’m organising a translation session with, hopefully, 4 other people. That would make 2 proofreaders and 3 translators. At the moment there are 360 strings left to handle. We can make it :slight_smile:

2) Creating a corpus

A collection of at least 5 000 sentences is needed to complete this step.
Not to brag, but Occitan has been written for more than 1000 years and was an administrative language, so the archives of more than 33 French departments are full of documents in it. Lots of books are no longer under copyright, if they ever were.
I saw in the GitHub repository that people have contributed sayings and proverbs for their languages; I’m already gathering some.
I will contact the Occitan multimedia library (Lo Cirdòc) and another public body that has developed online dictionaries, a spellchecker, and voice recordings for Wikipedia (Lo Congrès).
In addition, I will try to contact book publishers and other associations to ask whether they can donate some sentences.
Since all these texts would be in a formal register, I have an idea. Tell me what you think about it.

I am thinking of building a website where people could submit 10 sentences of their own. I would give the address to people and ask them to write 10 spontaneous sentences, such as:
I went shopping for Christmas – I couldn’t go by train because of the strike - etc. Everyday, useful sentences. Maybe ask people to ask their friends to do the same and, if those friends are not native Occitan speakers, to translate their friends’ sentences into Occitan.
I’m making a list of teachers, singers, writers, and friends to approach.

As for the last 2 steps, well, we have time before we get there.

Thanks for your feedback!


Thanks for sharing and contributing to Common Voice.

As a reminder, this is the place to learn everything about how to get a language launched:

Cheers.


Hello everyone!
The translation of the website is now complete :fireworks:
So the next step is creating a corpus from public sources, am I right?


That’s about right :slight_smile:


@Quenti

Welcome to Common Voice. Glad to see Occitan again.

Since you have just finished translating the site, you can join the Sentence Collector at this address: https://common-voice.github.io/sentence-collector/#/add

  • You first need to create an account
  • Read the content on this page to understand the nature of the sentences (CC0 licence) and some rules https://common-voice.github.io/sentence-collector/#/how-to
  • You can host your sentences on GitHub, for example, then paste the sentences into Sentence Collector and paste the link at the bottom as the source.
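
Before pasting a batch into Sentence Collector, it may help to tidy the list first. Here is a minimal sketch in Python, not part of the Common Voice tooling: it normalises whitespace and Unicode form, drops blank lines and exact duplicates, and reports how far the list is from the 5 000-sentence target mentioned earlier in this thread. The function names are invented for illustration, and the sample sentences are the English examples from the first post.

```python
import unicodedata

# Corpus size mentioned earlier in this thread.
TARGET = 5000


def clean_sentences(lines):
    """Return sentences with tidy whitespace, in NFC form, without blanks or exact duplicates.

    The original order is preserved; only the first copy of a duplicate is kept.
    """
    seen = set()
    result = []
    for line in lines:
        # Collapse runs of whitespace, then normalise accents (e.g. "o" + combining grave -> "ò").
        sentence = unicodedata.normalize("NFC", " ".join(line.split()))
        if sentence and sentence not in seen:
            seen.add(sentence)
            result.append(sentence)
    return result


def remaining(sentences, target=TARGET):
    """Number of sentences still missing to reach the target (never negative)."""
    return max(0, target - len(sentences))


if __name__ == "__main__":
    raw = [
        "I went shopping for Christmas",
        "  I went shopping   for Christmas ",
        "",
        "I couldn't go by train because of the strike",
    ]
    cleaned = clean_sentences(raw)
    print(f"{len(cleaned)} unique sentences, {remaining(cleaned)} still needed")
```

In practice the list would be read from the text file hosted on GitHub rather than written inline.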

If you need anything, don’t hesitate to ask me.

For information, I am involved in the corpus for Kabyle, a North African Berber language.
