[Legal] [Sentence extraction] Can I use Wikisource(CC0) for sentence collection

mkohler · May 18, 2021, 4:28pm

Thanks for the follow-up Jenny!

To clarify: does this also cover articles that are explicitly marked as CC0 in the US? I haven't looked at the dump yet, and this might be hard to extract in general, but I'm just wondering. Either way, 3 sentences per article is definitely a safe limit, given the possible technical challenge of correctly identifying CC0 content.
