Trying to create a 'Common Voice overview' for newbies, SVG drawin' style

bozden · September 26, 2022, 4:23pm

No need to apologize; this is a discussion area.

I was merely saying the following:

Each medium has its limits and a different set of audiences. In each medium, you try to give the information that you (the developer) find necessary. This is usually more than the end user requires. Therefore, from the very start of software engineering practice, and even for any appliance you buy for your home, there have been multiple levels of documentation: a “Quick Start Guide”, a “User Manual”, a more technical one, sometimes a service manual, etc.

So a multi-level approach is required to cover multiple user requirements and/or knowledge levels. Some info can be replicated/rephrased, but you can also put “for xxx please refer to yyy for more info” etc., which is done here.

These media are not designed specifically for documentation. E.g. if you design a mobile app and put a “?” button for help, you cannot put the whole manual there; it has to fit the screen/popup/whatever. And if it is multilingual, it becomes harder.

That is in addition to the fact that “people don’t read anymore”. The whole UX field was born from that: it tries to give users intuitive layouts, icons, actions, etc. so that they do not have to read and can just start using the product. That is very good for Common Voice, but it has its own consequences. People do not read, so they do not understand how important their demographic data is for Voice AI development, and therefore they do not create a profile and stay logged in. They don’t know that their mic cable has problems, they don’t know what should not be validated, etc.

The same is happening here. The CV frontend is React, so in the end it is plain HTML with div tags and text. It uses Pontoon for translations, which works sentence by sentence. That is not appropriate for the work we are doing, which should be deeply localized and adapted where needed.

For example, the “contribution criteria” in English give dinosaur examples and explain how English contractions work (things like “they’ve been”), which is not valid for any other locale and cannot be translated; it has to be converted/localized. And there are (say) 3 lines available to convert, while you might need 5 examples for your locale. So the medium puts limits.
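To make the “3 lines vs. 5 examples” point concrete, here is a rough Python sketch. It is not how Pontoon or the CV frontend is actually implemented; the key names, function names and placeholder strings are all made up for illustration. It only contrasts a fixed set of source strings with a list that each locale team manages itself.

```python
# Hypothetical illustration only: not Pontoon's or Common Voice's real data model.

# Sentence-by-sentence model: the source defines exactly three slots.
SOURCE_KEYS = ["example-1", "example-2", "example-3"]


def translate_fixed(translations):
    """Return translations only for keys that exist in the source.

    Anything a locale adds beyond the source keys has nowhere to go.
    """
    return [translations[key] for key in SOURCE_KEYS if key in translations]


def localized_examples(examples_by_locale, locale, fallback="en"):
    """Locale-managed model: each locale owns a list of any length."""
    return examples_by_locale.get(locale, examples_by_locale[fallback])


# A locale team that needs five examples (placeholder strings).
tr_fixed = {f"example-{i}": f"tr example {i}" for i in range(1, 6)}
tr_free = {
    "en": [f"en example {i}" for i in range(1, 4)],
    "tr": [f"tr example {i}" for i in range(1, 6)],
}

print(len(translate_fixed(tr_fixed)))           # only the 3 source slots survive
print(len(localized_examples(tr_free, "tr")))   # all 5 locale examples are kept
```

In the first model the two extra examples are silently dropped; in the second the locale team decides how many examples it needs.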

To overcome this, we tried these:

Became a moderator, opened a locale-specific forum and wrote my own documentation for end users:
Süreç, doğrular, yanlışlar ve veri kümesinin iyileştirilmesi (in English: “Process, do’s, don’ts and improving the dataset”)
Opened a YouTube channel for how-to videos:
https://www.youtube.com/channel/UC1Om7zlNV36QIJK69o8ZlRA
Created guides in our Facebook group:
Facebook

etc… All of these could be incorporated into a multilingual documentation tool managed by the locale teams, such as the one I mentioned. Such tools are made for these purposes.

For the levels, just check the main headings of any good documentation: Introduction, How-To Guides, FAQs, Basic Usage, Advanced Usage, Technical Details, etc.

It refers to “Regular Expressions”, which are available in many programming languages. The cleanup procedures we’ve been talking about are mainly based on them. They are sometimes hard for humans to decode and very error-prone to write.
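To give an idea of what that looks like, here is a minimal Python sketch of regex-based sentence cleanup. The rules are my own assumptions for illustration, not the actual Common Voice cleanup rules, and `clean_sentence` is a made-up name. The last part shows one typical way a pattern goes wrong.

```python
import re

# Hypothetical cleanup rules, for illustration only.
BRACKETED = re.compile(r"\s*[\(\[][^\)\]]*[\)\]]")   # "(aside)" or "[note]" plus leading spaces
MULTISPACE = re.compile(r"\s+")                      # any run of whitespace
SPACE_BEFORE_PUNCT = re.compile(r"\s+([,.;:!?])")    # space left in front of punctuation


def clean_sentence(text):
    """Drop bracketed asides, collapse whitespace, tidy spacing before punctuation."""
    text = BRACKETED.sub("", text)
    text = MULTISPACE.sub(" ", text)
    text = SPACE_BEFORE_PUNCT.sub(r"\1", text)
    return text.strip()


print(clean_sentence("Hello   world (aside) , again !"))  # Hello world, again!

# Why they are error-prone: a greedy ".*" runs from the first "(" to the LAST ")".
sample = "a (b) c (d) e"
greedy = re.sub(r"\(.*\)", "", sample)      # also deletes the " c " in between
careful = re.sub(r"\([^)]*\)", "", sample)  # stops at the first closing parenthesis
print(repr(greedy), repr(careful))
```

The two patterns at the end differ by only a few characters, but one of them quietly removes real text, which is exactly the kind of mistake that is easy to miss in review.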

They don’t; you can keep a documentation repo on GitHub and link to it. E.g. each tool can have its own doc repo, and that would go into the appropriate sub-topic in the documentation.

As an example, check the Coqui STT repo: there are no docs in the repo itself, just a link:

And here it is:
https://stt.readthedocs.io/en/latest/