For the VAD (voice activity detection) I used the Web Audio API: an AudioContext with an AnalyserNode, something like this:
//create a new AudioContext (webkit-prefixed fallback for older Safari)
let audioCtx = new (window.AudioContext || window.webkitAudioContext)();
//create an AnalyserNode on that context
let analyser = audioCtx.createAnalyser();
//request microphone access
navigator.mediaDevices.getUserMedia({ audio: true, video: false })
.then((stream) => {
//analyze the sound waves
//create a source node from the microphone stream
let source = audioCtx.createMediaStreamSource(stream);
//connect it so the analyser can read data from the source
source.connect(analyser);
//set the fftSize (must be a power of 2; 2048 is the default)
analyser.fftSize = 2048;
//frequencyBinCount is always fftSize / 2, so 1024 here
let bufferLength = analyser.frequencyBinCount;
//create a Uint8Array with one slot per frequency bin
let dataArray = new Uint8Array(bufferLength);
//copy the current frequency data into dataArray
analyser.getByteFrequencyData(dataArray);
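//now handle dataArray: each element is the magnitude of one frequency bin, scaled to 0-255
//(0 === at or below the analyser's minDecibels, i.e. effectively silence)
//if all elements === 0 there is no sound at all, so no voice
//call analyser.getByteFrequencyData(dataArray) repeatedly (e.g. from a timer) for as long as you want to analyze the voice
//I polled every 5ms in my app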
handleMicSilence(dataArray);
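})
.catch((err) => {
//getUserMedia rejects if the user denies access or no microphone is available
console.error("Could not access the microphone:", err);
});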
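
The snippet above reads the analyser only once and leaves handleMicSilence undefined. Here is a rough sketch of how the silence check and the repeated polling might look. The names isSilent and startSilencePolling, the 50ms interval and the threshold of 10 are all assumptions made for illustration, not code from my app:

```javascript
// Hypothetical helper: the frame counts as silent when every bin's
// magnitude is at or below `threshold` (0 means strictly all zeros).
function isSilent(dataArray, threshold = 0) {
  return dataArray.every((magnitude) => magnitude <= threshold);
}

// Hypothetical polling loop: re-reads the analyser every `intervalMs`
// and calls `onChange(silent)` whenever the silent/not-silent state flips.
// Returns a function that stops the polling.
function startSilencePolling(analyser, onChange, intervalMs = 50, threshold = 10) {
  const dataArray = new Uint8Array(analyser.frequencyBinCount);
  let lastSilent = null; // unknown until the first read
  const timer = setInterval(() => {
    analyser.getByteFrequencyData(dataArray);
    const silent = isSilent(dataArray, threshold);
    if (silent !== lastSilent) {
      lastSilent = silent;
      onChange(silent);
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Inside the .then callback you would call something like startSilencePolling(analyser, (silent) => console.log(silent ? "silence" : "sound")) after source.connect(analyser). A small non-zero threshold is probably more practical than checking for exact zeros, since a live microphone rarely delivers a perfectly silent frame. Also note that with fftSize = 2048 each read covers roughly 46ms of audio at a 44.1kHz sample rate, so polling much faster than that mostly re-reads overlapping data.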