Sentence Collector Localization Update

I meant Option 2. Thanks for calling that out. Would it be significant dev work to streamline the process now? It might be a lot of work upfront, but downstream it is much easier to manage. I don’t understand the complexity of the syncing issue you raised here.

Thanks for the input. I quickly talked to Jenny and she would prefer Option 2. She also reminded me that we don’t need to keep the translation files in sync in this repo; we can fetch them when building and pushing the deployment. This means there is less synchronization work, so I now agree with Option 2 as well. So we’re going to implement this in the same Pontoon project as the existing Common Voice strings. This will require some work on the Sentence Collector side, for which I will file separate issues. In terms of Pontoon itself, there is nothing to be done.


Update

By now most of the underlying tasks have been completed. In a few hours you will see the Sentence Collector strings pop up to be localized in Pontoon.

We are not exposing a language switch dropdown yet, but we will do so soon depending on the progress of the translations.

Remaining tasks: https://github.com/common-voice/sentence-collector/labels/localization

Michael


That’s awesome!

Eagerly waiting :grinning:

@mkohler, is “Sentence Collector” a brand name, or can it be translated?

That’s a very good question, I hadn’t thought about that yet. While many resources will eventually be localized, I think some will not be and will therefore refer to Sentence Collector in English. This includes Discourse posts, for example. Treating it as a brand name might be the least confusing option here. I don’t have a strong opinion though. What do you think?

Well, “Cümle Toplayıcı” has a good ring to it and means the same. If people write it with capital initials as I did, it will be a localized brand name…


One important thing for translators though:

There are examples in the original English text which refer to English. These must be replaced with localized counterparts, not with direct translations. Otherwise they will not mean anything.

Examples:

  • For example, the acronym “ICE” could be pronounced “I-C-E” or as a single word.
  • For example, an apostrophe is included in English words like “don’t” and “we’re” and should be included in the source text, but it’s unlikely you’ll ever need a special symbol like “@” or “#.”

I’ve just started on the new Common Voice strings (related, I guess, to the Sentence Collector) and I’ve come across this:
Home
COMMENT Don’t rename the following section, its contents are auto-inserted based on the name. These strings are automatically exported from Sentence Collector. [SentenceCollector]
Can you clarify this? What does rename and section mean? If they are not to be translated why are they included in Pontoon? Thanks.


@rprys, I think you are looking at the repo. You should do the translation through Pontoon. That part in the repo will be filled programmatically…

No, I’m in Pontoon…

https://pontoon.mozilla.org/cy/common-voice/all-resources/?status=missing&string=235053


@rprys thanks for reporting this. I will have a look. This was not meant as a comment on that specific string, you can safely ignore it.


One more thing to be aware of, a minor issue: if the original sentence contains a variable whose name is a valid English word (e.g. {$sentences}) and you start from a Google translation, the variable name gets translated too. Pontoon will then report an error saying there is no closing “}”. To get past that, restore the original variable name.
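
If you want to check a machine-translated draft before pasting it into Pontoon, something like the following might help. It is only a rough sketch in Python, not part of Pontoon or Sentence Collector: the function names are made up for illustration, and the pattern assumes variables are written as { $name } with plain ASCII names, as in the strings quoted in this thread.

```python
import re

# Matches Fluent-style variable references such as {$sentences} or { $totalSentences }.
# Assumption: variable names are ASCII letters, digits, "_" or "-", starting with a letter.
PLACEABLE = re.compile(r"\{\s*\$([A-Za-z][A-Za-z0-9_-]*)\s*\}")


def variable_names(text: str) -> set:
    """Return the set of variable names referenced in a string."""
    return set(PLACEABLE.findall(text))


def missing_variables(source: str, translation: str) -> set:
    """Return the variables that appear in the source but not in the translation."""
    return variable_names(source) - variable_names(translation)


if __name__ == "__main__":
    source = "{$sentences} sentences"
    # Illustrative Turkish drafts: the second one has the variable name translated.
    print(missing_variables(source, "{$sentences} cümle"))
    print(missing_variables(source, "{$cümleler} cümle"))
```

An empty result means every variable from the English string survived; anything else lists the names that need to be restored.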


I’m not sure what ‘total sentences’ means. The number of sentences, or whether they are complete?

0 No total sentences.
one 1 total sentence.
other { $totalSentences } total sentences.

GROUP COMMENT Validation criteria

CONTEXT sc-lang-info-total

RESOURCE Common Voice web/locales/en/messages.ftl

https://pontoon.mozilla.org/cy/common-voice/all-resources/?status=missing&string=235049

Should the keyboard shortcuts be localized?
They would be C, G and H in Welsh for Cymeradwyo, Gwrthod a Hepgor?

You can also use Keyboard Shortcuts: Y to Approve, N to Reject, S to Skip

GROUP COMMENT Validation criteria

CONTEXT sc-review-form-keyboard-usage

RESOURCE Common Voice web/locales/en/messages.ftl
https://pontoon.mozilla.org/cy/common-voice/all-resources/?status=missing&string=235041

Contributions Agreement template
Will these be available for localization?

Will there be a separate staging site for the Sentence Collector?
There’s a lot to check :slight_smile:

“x total sentences” is from the first bullet on the Statistics page, showing total posted sentences in a language, validated or not.
https://commonvoice.mozilla.org/sentence-collector/#/en/stats
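
In case it helps to see how the three variants of that string (0, one, other) relate to the count, here is a minimal sketch in Python. It is not the actual Fluent implementation: the plural rule is simplified to English only, and the function names are made up for illustration. An exact numeric key such as 0 is tried first, then the plural category.

```python
# The three English variants quoted from Pontoon for sc-lang-info-total.
VARIANTS = {
    "0": "No total sentences.",
    "one": "1 total sentence.",
    "other": "{ $totalSentences } total sentences.",
}


def english_plural_category(count: int) -> str:
    """Simplified English rule: 'one' for exactly 1, 'other' for everything else."""
    return "one" if count == 1 else "other"


def format_total_sentences(count: int) -> str:
    """Pick the variant for a count and fill in the variable."""
    exact_key = str(count)
    key = exact_key if exact_key in VARIANTS else english_plural_category(count)
    return VARIANTS[key].replace("{ $totalSentences }", str(count))


if __name__ == "__main__":
    for count in (0, 1, 42):
        print(format_total_sentences(count))
```

Other languages have different plural categories, which is why Pontoon may show a different set of forms to fill in for your locale.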

About keyboard shortcuts:

No, these are checked by the software as keypresses; as of now they cannot be localized.
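
To illustrate why changing the letters in the translation alone would not work, here is a minimal sketch in Python. It assumes the review form maps fixed key values to actions; the names `SHORTCUTS` and `action_for_key` are hypothetical and not taken from the Sentence Collector code.

```python
from typing import Optional

# Hypothetical fixed mapping from pressed key to review action (assumption for illustration).
SHORTCUTS = {"y": "approve", "n": "reject", "s": "skip"}


def action_for_key(key: str) -> Optional[str]:
    """Return the action bound to a key, or None if the key has no binding."""
    return SHORTCUTS.get(key.lower())


if __name__ == "__main__":
    print(action_for_key("Y"))  # bound key
    print(action_for_key("c"))  # a Welsh "C" for Cymeradwyo would have no binding
```

With a mapping like this, a translated string that advertises C, G and H would describe keys that do nothing, so the translation should keep Y, N and S.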