Copyright violation issues on Korean sentences

Someone added many quotes from the Korean Holy Bible, the New Korean Revised Version (NKRV), to the Korean sentences, and they were accepted.

The problem is that while the Bible itself is in the public domain, the Korean translation is not. Although it is not well known, the translation is copyrighted by the Korean Bible Society.

How can they be removed from the sentence database?

//translated from Korean

The Bible itself is basically in the public domain, but the Korean translation is copyrighted.
These sentences need to be removed from the sentence DB.

Hi Gofeel

Thank you for bringing this to our attention. We will look into this issue immediately and resolve it.


Dear @gofeel, thank you for reporting this. Can you let us know what the sentences are (or at least some of them) so we can identify the contributor and remove them along with any related sentences?

Here are some examples:

내가 여호와로 말미암아 득남하였다 하니라
사람이 교만하면 낮아지게 되겠고 마음이 겸손하면 영예를 얻으리라
독주는 죽게된 자에게, 포도주는 마음에 근심하는 자에게 줄지어다

  • I captured and retyped these, so they may not be exactly the same as the sentences in the DB.
  • “여호와” means Yahweh, so I think you can find more sentences by searching for that word.
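The keyword search suggested above could be sketched roughly as follows. This is only an illustration: the function name `find_sentences_with_keyword` and the in-memory list are hypothetical, the third sentence is a made-up placeholder, and a real check would read the sentences from wherever the Common Voice team stores them.

```python
def find_sentences_with_keyword(sentences, keyword="여호와"):
    """Return the sentences that contain the given keyword."""
    return [s for s in sentences if keyword in s]


# Sample data: two of the example sentences from this thread,
# plus one unrelated placeholder sentence (made up for illustration).
sentences = [
    "내가 여호와로 말미암아 득남하였다 하니라",
    "사람이 교만하면 낮아지게 되겠고 마음이 겸손하면 영예를 얻으리라",
    "오늘 날씨가 좋습니다",
]

matches = find_sentences_with_keyword(sentences)
for sentence in matches:
    print(sentence)
```

A single keyword only catches verses that contain that word, so the second example sentence would be missed by this search. Trying a few more characteristic words would probably be needed to find the rest.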

Thanks, I’ll pass this on to the team.

There are many different Korean Bible translations, and AFAIK CommonVoice used the KRV (from 1952/1961), not the NKRV (from 1998).

For example, the source sentence of common_voice_ko_36880175.mp3 is “여호와 하나님이 에덴동산에서 그 사람을 내어 보내어” (Genesis 3:23), which is identical to the KRV text. The NKRV text is “여호와 하나님이 에덴 동산에서 그를 내보내어”. Very similar, but slightly different.
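A comparison like this one could be automated along the following lines. This is a minimal sketch using Python's standard `difflib`; the helper names (`normalize`, `similarity`, `closest_version`) are hypothetical, and ignoring whitespace is my own assumption, since spacing often differs between editions.

```python
import difflib
import unicodedata


def normalize(text):
    """Apply NFC normalization and drop all whitespace so spacing differences are ignored."""
    return "".join(unicodedata.normalize("NFC", text).split())


def similarity(a, b):
    """Return a ratio between 0 and 1; 1.0 means identical after normalization."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def closest_version(sentence, versions):
    """Return the name of the version whose text is most similar to the sentence."""
    return max(versions, key=lambda name: similarity(sentence, versions[name]))


# Genesis 3:23 in both translations, as quoted in this thread.
versions = {
    "KRV": "여호와 하나님이 에덴동산에서 그 사람을 내어 보내어",
    "NKRV": "여호와 하나님이 에덴 동산에서 그를 내보내어",
}
source_sentence = "여호와 하나님이 에덴동산에서 그 사람을 내어 보내어"

print(closest_version(source_sentence, versions))
```

Only an exact match after normalization (a similarity of 1.0) really shows which edition a sentence came from; a merely high score just means the two texts are close.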

The CommonVoice team could check the old SentenceCollector database to see what it stated about the source of these sentences. About a year ago I had concerns similar to gofeel’s, but another participant let me know that the sentences were not from the NKRV but from the KRV, whose copyright is known to have expired.

Useful links from the Korean Bible Society (written only in Korean, though):

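(Just in case, here is my opinion again, originally written in Korean.)

There are several Korean translations of the Bible, but as far as I know, the one used in Mozilla Common Voice is the Korean Revised Version (KRV, 1952/1961), not the New Korean Revised Version (NKRV, 1998).

For example, the source text of the file common_voice_ko_36880175.mp3 is Genesis 3:23, “여호와 하나님이 에덴동산에서 그 사람을 내어 보내어”, which is the KRV translation. The NKRV translates it as “여호와 하나님이 에덴 동산에서 그를 내보내어”. Very similar, but slightly different.

The CommonVoice team should be able to look up what the old SentenceCollector DB stated about the source of this sentence. About a year ago I had concerns similar to gofeel’s, but another participant in the CommonVoice project told me that the sentence was taken from the KRV translation, whose copyright has expired. The NKRV’s copyright has not expired, though.

Related links on the Korean Bible Society website: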

When Bible passages were first submitted to the Sentence Collector, I rejected several of them because their source and copyright status were not clearly stated. If all of the sentences are from the KRV, that is a relief.

There may have been more than one uploader.



Some things I also wrote in the Korean Telegram chatroom:

As far as I understood, corporate works (team projects) have special copyright terms counted from the date of creation, and “author’s life + # years” applies to solo works.

I am not a lawyer, and so I do not know where the line lies between solo and corporate.

A Disney movie with 200 animators?

Edgar Allan Poe’s “The Raven”?

Something translated by a group of 4 people? … I don’t know

A source of confusion is tied to the re-printing and re-issuing of various editions.

Unlike the case with source code, no one makes comments next to every line in a book.

Therefore, there is no easy way to know if a given page in the 2nd Edition of a book is wholly new content for the 2nd Edition or is content from the 1st edition.

The best thing I can recommend is to search for randomly chosen sentences and see whether they appeared in the same form in a much older edition of the book. Statistically speaking, if one were to pull 30 random Korean Bible sentences from Common Voice and all of them had already appeared in a very old edition of the Korean Bible, then we are most likely in the clear. Otherwise, we would have to check individual sentences to decide whether each one is a problem or not.
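The random spot check described above could look roughly like this. It is a sketch under assumptions: `spot_check` is a hypothetical helper, the text of the old edition is assumed to be available as a list of sentences, and matching is exact, so spacing differences would first have to be normalized.

```python
import random


def spot_check(sentences, old_edition_sentences, sample_size=30, seed=None):
    """Sample sentences at random and return those NOT found in the old edition."""
    rng = random.Random(seed)
    k = min(sample_size, len(sentences))  # cannot sample more than we have
    sample = rng.sample(sentences, k)
    old = set(old_edition_sentences)
    return [s for s in sample if s not in old]


# Tiny illustration using the two Genesis 3:23 renderings quoted in this thread.
krv = "여호와 하나님이 에덴동산에서 그 사람을 내어 보내어"
nkrv = "여호와 하나님이 에덴 동산에서 그를 내보내어"

old_edition = [krv]        # stands in for the full text of the old edition
candidates = [krv, nkrv]   # stands in for the Bible sentences in Common Voice

unmatched = spot_check(candidates, old_edition, sample_size=30, seed=0)
print(unmatched)
```

An empty result for a sample of 30 would make it likely, though not certain, that the remaining sentences also come from the old edition. Any sentence that is returned needs a closer look.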