I’m creating a mobile game in Unity and want to run speech recognition on the device. Because of another technical constraint (the game needs voice chat for multiplayer), I can’t use the built-in iOS and Android speech APIs for this.
I was hoping someone could point me in the right direction. My requirements are:
- Support both iOS and Android
- Pass raw audio snippets up to ~5s long (e.g. an array of floats) to DeepSpeech running on the device and get text back.
- Recognize a predetermined list of around 50 phrases in various languages (the user tells us ahead of time which language they are speaking)
- Use as little CPU and memory as possible. The game is already quite demanding, so recognition has to run without hurting performance.
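
To make the data flow concrete, here is a rough sketch of the two pieces I expect to need around the recognizer. It is in Python only for illustration (the real code would live in Unity), and it deliberately leaves out the DeepSpeech call itself. My assumptions: the model wants 16-bit mono PCM rather than floats, the audio has already been resampled to whatever rate the model expects, and fuzzy-matching the transcript against the phrase list is good enough. The names `floats_to_pcm16` and `match_phrase` and the `cutoff` value are made up for this example.

```python
import difflib
import struct


def floats_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    # Clamp first so out-of-range samples don't overflow the 16-bit range.
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack("<%dh" % len(ints), *ints)


def match_phrase(transcript, phrases, cutoff=0.6):
    """Return the known phrase closest to the transcript, or None if nothing is close."""
    # Compare case-insensitively but return the phrase as originally written.
    normalized = {p.lower(): p for p in phrases}
    hits = difflib.get_close_matches(
        transcript.lower().strip(), list(normalized), n=1, cutoff=cutoff
    )
    return normalized[hits[0]] if hits else None
```

The idea is that with only ~50 phrases per language, the recognizer output doesn't have to be perfect as long as it lands closer to the right phrase than to any other.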
I’d also be interested in hiring someone to help me with this. If that’s you, please reach out to me directly rather than in this thread; I don’t want this to turn into a job posting.