Sentence Collector - Cleanup before export vs. cleanup on upload

@HelloTheWorld Yes, I think you got most parts right. There are some missing steps though. I have created this Pull Request to update the README in the repository. Could you have a look at it? :slight_smile:

This is contrary to my initial suggestion in the post; however, I think it does bring up a good point. With my suggestion, the cleanup could not be used to correct anything that would fail the automatic validation. That’s why I commented on your PR that parts of your cleanup proposals are not needed: those sentences would not pass validation, and would therefore never reach the cleanup step.

I have to say, I kinda like the approach of running the cleanup before the validation, so that for example numbers could be converted from “2” to “two” etc. Of course, that might not be the best example (or easy to do) for each language. Overall we wouldn’t lose anything by switching this around. @bozden @ftyers do you have any preferences on which way around this is done?
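
To make the difference between the two orderings concrete, here is a rough Python sketch. It is not the Sentence Collector's actual code: `cleanup`, `validate`, and the digit mapping are hypothetical stand-ins, and the "reject anything containing a digit" rule is only an assumed example of a validation rule.

```python
import re

# Hypothetical English-only mapping; real number normalization is
# language-specific and considerably harder than this.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def cleanup(sentence: str) -> str:
    """Replace standalone single digits with words (illustrative only)."""
    return re.sub(r"\b[0-9]\b", lambda m: DIGIT_WORDS[m.group(0)], sentence)


def validate(sentence: str) -> bool:
    """Assumed validation rule: reject any sentence that contains a digit."""
    return re.search(r"[0-9]", sentence) is None


def validate_then_cleanup(sentences):
    # Sentences with digits are dropped before cleanup ever sees them.
    return [cleanup(s) for s in sentences if validate(s)]


def cleanup_then_validate(sentences):
    # Cleanup gets a chance to fix sentences that would otherwise be rejected.
    cleaned = [cleanup(s) for s in sentences]
    return [s for s in cleaned if validate(s)]


sentences = ["I have 2 cats.", "No numbers here.", "Call 911 now."]
print(validate_then_cleanup(sentences))
print(cleanup_then_validate(sentences))
```

With validation first, only the sentence without digits survives. With cleanup first, "I have 2 cats." is rewritten and kept, while "Call 911 now." is still rejected because this toy cleanup leaves multi-digit numbers alone.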

Thanks for bringing up this point. I was not aware of this. I definitely agree that we should support these use cases. I’m just not sure whether saving both variants is the way to go here, or what implications it would have. Let me think about this for a bit.