I’ve released the first version of Bumblebee:
This is my attempt at creating an Electron app for DeepSpeech, along with a websocket API for writing voice-controlled JavaScript apps.
The idea is that Bumblebee handles the troublesome parts of installing and setting up DeepSpeech, manages the microphone, and adds an “Alexa”-style hotword system (Porcupine) and a text-to-speech system (mespeak). Together, these systems can be treated as an always-available service that applications can use. There’s no need to start and stop DeepSpeech while you’re writing an application - Bumblebee stays running in the background, and your application communicates with it through a websocket API.
The API is very simple, as you can see in the hello world example:
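As a rough sketch of what a voice app built on a background service like this could look like - note that the message shapes, event names, and port used here are assumptions for illustration, not the actual Bumblebee API:

```javascript
// Hypothetical sketch of a tiny Bumblebee-style voice app.
// The "hotword" / "recognize" / "say" message shapes are assumptions,
// not the real protocol - see the Bumblebee repo for the actual API.

// Route an incoming websocket message (a JSON string) to app callbacks.
function handleMessage(raw, app) {
  const msg = JSON.parse(raw);
  if (msg.type === 'hotword') {
    app.onHotword(msg.hotword);   // the hotword was spoken
  } else if (msg.type === 'recognize') {
    app.onSpeech(msg.text);       // a speech-to-text transcription arrived
  }
}

// Build an outgoing text-to-speech request to send back to the service.
function say(text) {
  return JSON.stringify({ type: 'say', text });
}
```

In a real app these would be wired to a websocket connection (e.g. feeding `handleMessage` from the socket’s message event and sending `say(...)` back over it), while Bumblebee runs in the background handling the microphone, hotword detection, and DeepSpeech.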
I think the really nice thing about having a shared base system like this is that I can write a small voice app in a single JavaScript file, share just that file, and run it on another computer without installing a new instance of DeepSpeech for each app.
I don’t have any sophisticated NLP/intent parsing going on here, and I’m only using the pre-trained English models. This release was mainly about getting the base system installing and running correctly on Mac, Linux, and Windows.
I don’t have an exact roadmap for where this project goes from here, but I have a lot of ideas for applications I want to write. I’m primarily interested in voice-controlling pretty much everything: my computer, things around my home, and web applications.
I’d appreciate feedback: would other people find use in an API like this, one that handles all the STT/TTS/hotword plumbing and leaves you free to focus on what you want your “voice app” to do?