Sentence Collector need help to remove

Thanks for doing this! With 87% of the sentences being correct, the share of good sentences is in my opinion way too high to remove the full source. We’d be removing a lot of sentences that are correct.

Instead, let’s think about improvements. I think removing the sentences with emojis, and in the future not allowing sentences with emojis to be uploaded, would be good for all languages.
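
To make the emoji idea a bit more concrete, here is a rough sketch in Python. It is not the Sentence Collector's actual validation code. The code point ranges are my own non-exhaustive assumption, and the names `contains_emoji` and `split_by_emoji` are hypothetical, just for illustration.

```python
import re

# Assumed, non-exhaustive set of code point ranges commonly used for emojis.
# This is an illustration, not a complete definition of "emoji".
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator symbols (used for flags)
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F"                 # variation selector requesting emoji presentation
    "]"
)


def contains_emoji(sentence: str) -> bool:
    """Return True if the sentence has at least one character in the ranges above."""
    return EMOJI_PATTERN.search(sentence) is not None


def split_by_emoji(sentences):
    """Split sentences into (kept, rejected), rejecting those with emojis."""
    kept, rejected = [], []
    for sentence in sentences:
        if contains_emoji(sentence):
            rejected.append(sentence)
        else:
            kept.append(sentence)
    return kept, rejected
```

The same check could be used twice: once to clean up the existing sentences, and once at upload time to reject new ones. Note that the symbols and dingbats range also covers characters like check marks, so it might need adjusting if those are legitimate in some language.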

Apart from that, did you notice any patterns in the wrong sentences that we might be able to reject automatically? I’d be happy to help out, but not knowing the language at all I’d need help defining the rules.
