Is the United Nations Parallel Corpus clear of license issues?

I notice that the United Nations corpus mentioned in this post is described as public domain on the first page of its website. I am not a copyright expert. May I know whether this corpus is free of licensing issues?

There are millions of sentences in the corpus. My community would be thrilled to have it imported into the sentence repository.


I can check with our legal team at our next meeting, but in their terms of use I read:

When using the United Nations Corpus, the user must acknowledge the United Nations as the source of the information

From past inquiries, my understanding is that an attribution requirement is not compatible with the CC0 public domain dedication.

I’ll let you know, cheers.

Thank you very much.

The EU Parliament corpus has a similar statement on its front page: “Please cite the paper, if you use this corpus in your work”. Since that corpus is fine for CV, I hope the UN one is also fine.

The corpus is obviously not public domain. The result of cleaning and sorting sentences from PD documents into a corpus can absolutely be licensed. You can do whatever you want with PD materials, including licensing the work derived from them.

We should look for the PD documents that the corpus was originally scraped from.

A post was split to a new topic: New Zealand parliament corpus