DeepSpeech-based STT plugin for Vim

Jacek_Czaja1 · January 6, 2021, 12:05pm

Hi,

I use the Vim editor for text editing and was browsing the internet for a Vim plugin that allows voice typing. The only thing I found is https://github.com/w0rp/vim-speech , but it works with the Google STT cloud service, which means that it works only online, requires registration, etc.

So I thought: why not modify this plugin to use the DeepSpeech project? A working (if a bit rough) prototype is here: https://github.com/yetanotherdeveloper/vim-speech .

Short installation instructions:

1) Clone the repository into your plugin directory (I tested with the Vim 8 pack directory).
2) Create a Python 3 virtual environment and install pyaudio and deepspeech into it. The procedure for Google STT is in install.sh , so by analogy create a Python 3 venv with the needed Python 3 modules installed.
3) Download a DeepSpeech model and, optionally, a scorer. I fine-tuned the 0.9.3 model on my own voice samples (a bit over 600 samples), in a similar way to what is described in the release notes of DeepSpeech 0.9.3.
4) export DEEPSPEECH_MODEL=
5) Optionally, export DEEPSPEECH_SCORER=
6) Inside Vim, check with :scriptnames that vim-speech is loaded. You should then have three commands available: :SpeechRecord , :SpeechStop and :SpeechToggle . :SpeechRecord starts recording, while :SpeechStop finishes recording and runs inference.
7) If all is fine, it may be useful to add some mappings like:
map (vim_speech_toggle)
imap (vim_speech_toggle)a
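
To illustrate how the two environment variables could be consumed on the Python side, here is a minimal sketch. It is not taken from the plugin: the function names resolve_deepspeech_paths and load_model are hypothetical, and I am assuming DEEPSPEECH_MODEL is mandatory while DEEPSPEECH_SCORER may be left unset. The only DeepSpeech calls used are Model(...) and enableExternalScorer(...) from the 0.9.x Python bindings.

```python
import os


def resolve_deepspeech_paths(env=os.environ):
    """Return (model_path, scorer_path) read from the environment.

    Assumption: DEEPSPEECH_MODEL is required, DEEPSPEECH_SCORER is optional.
    """
    model_path = env.get("DEEPSPEECH_MODEL")
    if not model_path:
        raise RuntimeError("DEEPSPEECH_MODEL is not set")
    # An empty or missing scorer variable means "run without a scorer".
    scorer_path = env.get("DEEPSPEECH_SCORER") or None
    return model_path, scorer_path


def load_model(env=os.environ):
    """Hypothetical helper: load the model and attach the scorer if one is configured."""
    # Imported lazily so the path check above works even outside the venv.
    from deepspeech import Model

    model_path, scorer_path = resolve_deepspeech_paths(env)
    model = Model(model_path)
    if scorer_path:
        model.enableExternalScorer(scorer_path)
    return model
```

Calling resolve_deepspeech_paths() before starting Vim's recording job is a cheap way to catch a forgotten export early instead of failing after the audio has already been recorded.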

Some observations from using vim STT:

The biggest problem is that inference time is long, because DeepSpeech streaming is not used. The reason is that this was the easiest thing to implement using the original plugin as a base.
Accuracy of inference is good (I’m using a DeepSpeech model fine-tuned on my own voice).
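
For reference, this is roughly what the streaming variant could look like. It is only a sketch and not what the plugin does today: transcribe_streaming and record_and_transcribe are hypothetical names, and the fixed recording length is an assumption made to keep the example short. It relies on createStream(), feedAudioContent(), finishStream() and sampleRate() from the DeepSpeech 0.9.x Python bindings. Since audio is decoded while it is being recorded, the wait after stopping should be much shorter than with a single stt() call on the whole recording.

```python
def transcribe_streaming(model, chunks):
    """Feed audio chunks to a DeepSpeech stream as they arrive and return the text.

    `model` is expected to behave like deepspeech.Model; `chunks` yields
    16-bit mono sample buffers (numpy int16 arrays for the real bindings).
    """
    stream = model.createStream()
    for chunk in chunks:
        # Decoding work happens incrementally here, while recording continues.
        stream.feedAudioContent(chunk)
    return stream.finishStream()


def record_and_transcribe(model, seconds=5, frames_per_buffer=1024):
    """Hypothetical helper: record from the microphone and decode while recording."""
    # Imported lazily; both packages must be installed in the plugin's venv.
    import numpy as np
    import pyaudio

    rate = model.sampleRate()  # record at the rate the model was trained for
    pa = pyaudio.PyAudio()
    mic = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                  input=True, frames_per_buffer=frames_per_buffer)

    def chunks():
        # Assumption: a fixed-length recording instead of a :SpeechStop signal.
        for _ in range(int(rate / frames_per_buffer * seconds)):
            yield np.frombuffer(mic.read(frames_per_buffer), dtype=np.int16)

    try:
        return transcribe_streaming(model, chunks())
    finally:
        mic.stop_stream()
        mic.close()
        pa.terminate()
```

In the plugin, the fixed-length loop would have to be replaced by one that stops when :SpeechStop is issued.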

If you know of another Vim plugin with STT (working offline, not using the Google STT service), please let me know.