Following the “3 sentences” practice that CV currently uses with Wikipedia, is it legally acceptable to extract a very small amount of content from each news report and subtitle file?
If yes, what is the suggested amount?
Also, “daily news” (reports of facts) is considered a work without copyright in Thailand. There is an uncertain area between what is a mere report of facts and what has already crossed the line into analysis (which is copyrightable). But if we are theoretically able to extract only the reports of facts, can we use those sentences as public domain?
I’m asking this because there was recently a discussion in the Thai community about adding more sources, and there is a proposal to use scraped content from news sites and other sources such as subtitles (from the Open Parallel Corpus project): https://m.facebook.com/groups/527601721545161/permalink/548317636140236/
So, to summarize, I think we have two questions:
Is it OK to use “daily news” scraped from news sites? (Some of us think it can be considered public domain, while the copyright notices on the websites may say otherwise.)
For copyrighted works, like articles and subtitles, is it OK to extract a few percent of them? What is the suggested percentage?