Sentence collector copyright issues

bozden · September 11, 2021, 10:21am

There are roughly 590 Turkish sentences waiting for review. I scanned the first 10-20 or so, and they come from:
https://tr.wikisource.org/wiki/En_Alttakiler

The top of the page marks the text as “public domain in Turkey”, but the footer at the bottom says CC BY-SA…

Also, many of them are not complete sentences, only fragments split at every punctuation mark, including commas. So many of them are grammatically incorrect anyway…
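To illustrate the splitting problem, here is a minimal sketch. I do not know how these sentences were actually split before submission, so both functions and the sample text are made up for illustration only. It compares splitting at any punctuation (which produces the fragments I am seeing) with splitting only after sentence-ending punctuation.

```python
import re


def split_at_any_punctuation(text):
    # Hypothetical reproduction of the problem: every comma, colon,
    # semicolon, or sentence-final mark becomes a boundary.
    parts = re.split(r"[,;:.!?]+", text)
    return [p.strip() for p in parts if p.strip()]


def split_at_sentence_end(text):
    # Split only on whitespace that follows ".", "!" or "?",
    # so clauses joined by commas stay in one sentence.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


# Made-up sample text, not taken from the Wikisource pages.
sample = "Ali eve geldi, yemek yedi. Sonra uyudu."

print(split_at_any_punctuation(sample))
print(split_at_sentence_end(sample))
```

The first call yields three pieces, two of which are only clauses; the second keeps the comma-joined clause together. This is still naive (abbreviations and poetry line breaks would need more care).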

PS: I did not review all of them… I accepted a couple, then checked the CC0 status.

Edit: I scanned the other sources in the set; they are mostly poetry with a similar copyright status, such as:
https://tr.wikisource.org/wiki/Takatım_Tak_Oldu_Bican_Olmuşum
https://tr.wikisource.org/wiki/Çocuklara
https://tr.wikisource.org/wiki/Bir_Roman_Kahramanı
https://tr.wikisource.org/wiki/Sayfa:Halk_Edebiyatı_Antolojisi.pdf/238
