Hi, I’m exploring the possibility of using Mozilla’s Deep Speech for general phonetic transcription (ideally of connected speech). The audio collection would have a slightly different format than Common Voice: audio (in any language) would be transcribed into the International Phonetic Alphabet (IPA), including diacritics. This would likely require more review, given that most people are not versed in phonetic transcription and that transcription conventions vary somewhat by region and language. It’s just a thought, and I’d love to hear what people think.
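To make the data format a bit more concrete, here is a rough sketch of how the symbol inventory could be collected from IPA transcripts. It assumes a character-based model needs a fixed list of output symbols, and that diacritics should count as separate symbols (hence the NFD normalization). The file names, sample transcripts, and the `build_alphabet` helper are all made up for illustration and are not part of Deep Speech or Common Voice.

```python
import unicodedata


def build_alphabet(transcripts):
    """Collect the sorted set of symbols used across IPA transcripts."""
    symbols = set()
    for text in transcripts:
        # NFD decomposes precomposed characters, so a base letter and its
        # combining diacritic end up as two separate symbols.
        symbols.update(unicodedata.normalize("NFD", text))
    return sorted(symbols)


# Hypothetical sample rows: (audio file, IPA transcript)
samples = [
    ("clip_0001.wav", "həˈloʊ wɜːld"),
    ("clip_0002.wav", "bɔ̃ʒuʁ"),
]

alphabet = build_alphabet(transcript for _, transcript in samples)
```

Treating diacritics as their own symbols keeps the inventory small, but it is only one possible choice; treating each base-plus-diacritic combination as a single unit would be another.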
Hello, that would be very interesting. Did you continue to explore that topic?
I’m still looking into it! I’m not in a rush, since I think what’s needed right now is reliable input/training data.
Ooh I’ll take a look when I get a chance! Thank you for sharing!