I’m using Vim for text editing and was browsing the internet for a Vim plugin that allows voice typing. The only thing I found is https://github.com/w0rp/vim-speech, but it works with the Google STT cloud service, which means that it only works online, requires registration, etc.
So I thought, why not modify this plugin to use the DeepSpeech project? A working (a bit rough) prototype is here: https://github.com/yetanotherdeveloper/vim-speech .
Shortened installation instructions:
- Clone it into your plugin directory (I tested with the Vim 8 pack directory).
- Create a Python 3 virtual environment and install pyaudio and deepspeech into it. The procedure for Google STT is in install.sh, so by analogy create a venv for Python 3 with the needed Python 3 modules installed.
- Download a DeepSpeech model and, optionally, a scorer. I fine-tuned the 0.9.3 model on my own voice samples (a bit over 600), in a similar way to what is described in the release notes of DeepSpeech 0.9.3.
- export DEEPSPEECH_MODEL=
- Optionally, export DEEPSPEECH_SCORER=
- Inside Vim, check with :scriptnames that vim-speech is loaded. You should then have three commands available: :SpeechRecord, :SpeechStop and :SpeechToggle. :SpeechRecord starts recording, while :SpeechStop finishes recording and runs inference.
- If everything works, it may be useful to add some mappings for these commands.
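Going back to the DEEPSPEECH_MODEL and DEEPSPEECH_SCORER steps: as a rough illustration of what the two variables are for, here is a minimal Python sketch of how they could be read and passed to the deepspeech package. The helper names `resolve_deepspeech_paths` and `load_model` are made up for this example and are not taken from the plugin. Only `Model(...)` and `enableExternalScorer(...)` come from the DeepSpeech 0.9 Python API.

```python
import os


def resolve_deepspeech_paths(env=os.environ):
    """Return (model_path, scorer_path) taken from the environment.

    DEEPSPEECH_MODEL is required; DEEPSPEECH_SCORER is optional.
    (Hypothetical helper, not part of the plugin.)
    """
    model_path = env.get("DEEPSPEECH_MODEL")
    if not model_path:
        raise RuntimeError("DEEPSPEECH_MODEL is not set")
    # An unset or empty scorer variable means "run without a scorer".
    scorer_path = env.get("DEEPSPEECH_SCORER") or None
    return model_path, scorer_path


def load_model(env=os.environ):
    """Load the acoustic model and, if configured, the external scorer."""
    # Imported lazily so the path check above works without deepspeech installed.
    from deepspeech import Model

    model_path, scorer_path = resolve_deepspeech_paths(env)
    model = Model(model_path)
    if scorer_path:
        model.enableExternalScorer(scorer_path)
    return model
```

Failing early when DEEPSPEECH_MODEL is missing gives a clearer error than a model-loading failure later, when recording has already stopped.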
Some observations from using vim STT:
- The biggest problem is that inference time is long, because DeepSpeech streaming is not used. The reason is that skipping streaming was the easiest thing to implement using the original plugin as a base.
- Accuracy of inference is good (I’m using a DeepSpeech model fine-tuned with my own voice).
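For reference, the streaming approach mentioned above could look roughly like the sketch below. It is not what the prototype does. The function name `transcribe_streaming` and the `on_partial` callback are invented for illustration, while `createStream`, `feedAudioContent`, `intermediateDecode` and `finishStream` are the streaming calls from the DeepSpeech 0.9 Python API. The idea is that audio is fed while recording is still running, so less work should remain when recording stops.

```python
def transcribe_streaming(model, chunks, on_partial=None):
    """Feed audio chunks to a DeepSpeech stream as they arrive.

    model      -- a deepspeech.Model (any object with createStream() works here)
    chunks     -- iterable of 16-bit mono PCM buffers at the model's sample rate;
                  with the real API these should be numpy int16 arrays
    on_partial -- optional callback that receives the intermediate transcript
                  after each chunk (hypothetical, for illustration only)
    """
    stream = model.createStream()
    for chunk in chunks:
        # Audio is handed to the stream during recording, not after it.
        stream.feedAudioContent(chunk)
        if on_partial is not None:
            on_partial(stream.intermediateDecode())
    # Finalizes decoding, returns the full transcript and frees the stream.
    return stream.finishStream()
```

With PyAudio, each `read()` on the input stream returns raw bytes, so a chunk would need converting first, for example with `numpy.frombuffer(data, dtype=numpy.int16)`. Calling `intermediateDecode()` after every chunk costs extra time, so leaving `on_partial` unset is probably the better default if only the final text is needed.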
If you know of another Vim plugin with STT (working offline, not using the Google STT service), please let me know.