Hello @aiteam,
you have made a lot of contributions. Could you please add a more precise description of where the sentences come from? In my opinion, “own choosed and edited” does not clearly indicate that the sentences are copyright-free.
Some sentences seem broken, which makes them hard to read and process. For example: “Wyskoczy do klubu tury…iej wielkości gwiazd.”
Also, I am not sure whether several sentences in one sentence string are OK for the dataset (for example: “Wytrzeźwiej. Odpocznij. I przyjdź do mnie.”).
It is very nice that you have contributed so much, but keep in mind that we also aim for a certain level of quality in the dataset ;). If you could pre-process the sentences somehow, it would reduce both the review time and the percentage of rejected sentences.
If it is possible to refine the sentences that you have uploaded by script, maybe it would be good to remove them from the Sentence Collector for now and add them back after refinement?
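
In case it helps, here is a minimal Python sketch of the kind of pre-processing I have in mind. It only flags the two problems mentioned above (an ellipsis that may mean the text was cut, and more than one sentence in a single string). The function names and the rules themselves are my own assumptions, not official Sentence Collector checks, and the simple regex will also flag abbreviations that end with a period, so the flagged list would still need a manual look.

```python
import re

# Assumed rule: a sentence terminator followed by whitespace and more text
# suggests that the string holds more than one sentence.
SENTENCE_END = re.compile(r"[.!?]+(?=\s+\S)")


def find_problems(sentence):
    """Return a list of problem labels for one sentence string (hypothetical helper)."""
    problems = []
    text = sentence.strip()
    # An ellipsis (single character or three dots) may mean the text was truncated.
    if "…" in text or "..." in text:
        problems.append("ellipsis")
    if SENTENCE_END.search(text):
        problems.append("multiple_sentences")
    return problems


def split_clean_and_flagged(sentences):
    """Separate sentences with no detected problems from those needing review."""
    clean, flagged = [], []
    for s in sentences:
        problems = find_problems(s)
        if problems:
            flagged.append((s, problems))
        else:
            clean.append(s)
    return clean, flagged


if __name__ == "__main__":
    samples = [
        "Wytrzeźwiej. Odpocznij. I przyjdź do mnie.",
        "Wyskoczy do klubu tury…iej wielkości gwiazd.",
        "To jest proste zdanie.",
    ]
    clean, flagged = split_clean_and_flagged(samples)
    print("clean:", clean)
    for sentence, problems in flagged:
        print("flagged:", sentence, problems)
```

Only the sentences in the clean list would be uploaded directly; the flagged ones could be fixed by hand or dropped before submitting.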