Bulk Requests

Hi everyone :slightly_smiling_face:

This is a notice to everyone who has submitted bulk import requests: we are currently reviewing our policy and procedures for validating bulk sentence submissions. Until that review is complete, we will not be ingesting any bulk sentences. We will share the necessary information once the review process is finished. We apologize for any inconvenience this may cause.

Thank you



Thank you for the update on the bulk import requests review. I understand the need for a thorough review process to ensure quality and compliance.

I submitted approximately 2,000 sentences a couple of days ago, all from public domain articles on SNL.no and adhering to the Mozilla Common Voice requirements. They were extracted by a script that accesses the SNL.no API.

I’m eager to contribute these to the project. In light of the current pause on bulk imports, would it be acceptable for me to submit these sentences individually, or would you recommend that I wait until the review of the bulk import policy is completed?

Best regards

Hey @Gina_Moape, regardless of whether additions are bulk or single, please also consider changing how public sentence additions work.

IMO, it is necessary to keep track of who added each sentence, so that you can reach them in case of problems (license issues, junk, low-quality additions, etc.). The source field is not sufficient for dealing with bad additions; in some cases you need to remove all additions from a spammer, for example.

I urge you to consider allowing only registered users to add sentences. The text corpus and its quality are the most important aspect of the dataset.