DeepSpeech Live RESTful API - using React and Node

Hey guys, check out my newly released app: an open API (mind the speed, it's a free service) using the DeepSpeech pretrained model. Voice activity detection is implemented, as well as client-side audio resampling from the mic to 16 kHz mono 16-bit WAV.
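For the resampling, the rough idea is to render the recorded audio through an OfflineAudioContext at 16 kHz and convert the float samples to 16-bit PCM before writing the WAV header. A simplified sketch, not the exact code from the repo (resampleTo16k and floatTo16BitPCM are just illustrative names):

// resample a recorded AudioBuffer to 16 kHz mono
async function resampleTo16k(audioBuffer) {
  const targetRate = 16000;
  const frames = Math.ceil(audioBuffer.duration * targetRate);
  // 1 channel === mono output
  const offlineCtx = new OfflineAudioContext(1, frames, targetRate);
  const source = offlineCtx.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(offlineCtx.destination);
  source.start(0);
  // rendered buffer holds Float32 samples in [-1, 1] at 16 kHz
  const rendered = await offlineCtx.startRendering();
  return rendered.getChannelData(0);
}

// convert Float32 samples to 16-bit signed PCM for the WAV payload
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}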

live demo

Check out and star the GitHub repo

Author:
Alex Lizarraga
Portfolio


Looks cool, how did you implement the VAD in JavaScript?

@im_alex That does look nice, and it behaves as expected with my English accent. There's some latency though; I see you are running the CPU implementation with the .pbmm model, maybe switching to the .tflite model could help there?

For the VAD I used the Web Audio API: an AudioContext plus an AnalyserNode, something like this:

// create an AudioContext
let audioCtx = new (window.AudioContext || window.webkitAudioContext)();
// analyser node for the audio context
let analyser = audioCtx.createAnalyser();

// request microphone access
navigator.mediaDevices.getUserMedia({ audio: true, video: false })
.then((stream) => {
    // analyze sound waves from the stream source
    let source = audioCtx.createMediaStreamSource(stream);
    // the analyser can now read data from the source
    source.connect(analyser);

    // set the fftSize
    analyser.fftSize = 2048;

    // get the buffer length from the analyser
    let bufferLength = analyser.frequencyBinCount;

    // create a Uint8Array to hold the frequency data
    let dataArray = new Uint8Array(bufferLength);
    // call this to read the current frequency data into dataArray
    analyser.getByteFrequencyData(dataArray);

    // now handle the dataArray, which holds values from 0-255 (0 === total silence)
    // if all elements === 0 then there is no voice
    // call analyser.getByteFrequencyData(dataArray) as often as you want to analyze voice frequency
    // I used 5ms in my app
    handleMicSilence(dataArray);
});
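handleMicSilence itself can be as simple as counting consecutive silent reads. This is just a rough sketch, not the exact code from my app (the threshold and the stop logic are placeholders), assuming analyser and dataArray from the snippet above are in scope:

// rough sketch: count consecutive silent reads and cut the
// recording after enough of them (threshold is an example value)
let silentTicks = 0;

function handleMicSilence(dataArray) {
  // every bin at 0 means the analyser sees total silence
  const isSilent = dataArray.every((value) => value === 0);
  silentTicks = isSilent ? silentTicks + 1 : 0;
  // ~150ms of silence at a 5ms polling interval
  if (silentTicks > 30) {
    silentTicks = 0;
    // stop recording here and send the utterance to the API
  }
}

// poll the analyser on a timer (I used 5ms)
setInterval(() => {
  analyser.getByteFrequencyData(dataArray);
  handleMicSilence(dataArray);
}, 5);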

Latency is definitely a problem; I'll test with the TFLite and GPU implementations as well, thanks for the feedback.

Interesting implementation, thanks for sharing.