Hi, I’m new here. My name is Georgi. I want to review Hebrew sentences in “Sentence Collector”, but I can’t, because I don’t know whether we should use niqqud in the Hebrew corpus or not. We don’t use it in everyday writing, but if you think it can be useful, I’d like to know why. If not, I’d like to ask someone to delete all niqqud from the sentences in SC with a bot or a program. Thank you!
Thanks for starting this discussion.
@ftyers is this something you can help with? In terms of Sentence Collector, I’m happy to do whatever is needed. However, I would need to know the specifics of what to delete.
This is a very good question, thanks @zedva.
I would say that if your aim is to collect recordings only from native speakers, then you can remove the niqqud. If your aim is to also collect recordings from non-native speakers, then you should keep it.
You could also potentially leave both in the database, the version with niqqud and the version without niqqud.
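In case it helps whoever ends up doing the cleanup, here is a rough Python sketch of how the niqqud could be stripped while keeping both versions. It is only an illustration, not how Sentence Collector actually stores or processes sentences: the function names and the dictionary keys are made up for this example. It assumes that "niqqud" means every combining mark in the Unicode Hebrew block (U+0591 to U+05C7), which also removes cantillation marks but keeps punctuation such as the maqaf.

```python
import unicodedata

# Assumption: treat every nonspacing mark (category "Mn") in the Hebrew
# block range U+0591..U+05C7 as niqqud. This also drops cantillation marks,
# but keeps punctuation like maqaf (U+05BE), which is not a combining mark.
HEBREW_MARKS_START = 0x0591
HEBREW_MARKS_END = 0x05C7


def strip_niqqud(sentence: str) -> str:
    """Return the sentence with Hebrew vowel points and other marks removed."""
    # NFD splits precomposed presentation forms (e.g. U+FB2A) into
    # base letter + mark, so the marks can be filtered out below.
    decomposed = unicodedata.normalize("NFD", sentence)
    kept = [
        ch
        for ch in decomposed
        if not (
            HEBREW_MARKS_START <= ord(ch) <= HEBREW_MARKS_END
            and unicodedata.category(ch) == "Mn"
        )
    ]
    return unicodedata.normalize("NFC", "".join(kept))


def both_versions(sentence: str) -> dict:
    """Hypothetical helper: keep the original and the stripped sentence together."""
    return {
        "with_niqqud": sentence,
        "without_niqqud": strip_niqqud(sentence),
    }


if __name__ == "__main__":
    # "shalom" written with niqqud, given as escapes to avoid encoding issues.
    print(both_versions("\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"))
```

Text without any niqqud passes through unchanged, so something like this could be run over all Hebrew sentences without first checking which ones are pointed.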
If you have any further questions, please feel free to ask!
Hi, can you say more causes please? I don’t think it’s such a big problem. Non-native speakers can also work out the pronunciation of a word from context (beginners can’t, but people at other levels can).
I also don’t think it’s a big problem. What do you mean by more causes? I think whatever you do it will be fine.