VAD-splitting only supported for frame durations 10, 20, or 30 ms

DSAlign,

Can this hard-coded minimum frame length be changed? Can it be reduced below 10 ms? Please help and guide.

The VAD-splitting relies upon this repo to do the splitting and, if I remember correctly, you can use smaller values. But when I played with it, 10 or 20 ms proved to be good for regular speech.

Clone the repo and play around with the example given to get a feel of how it works. It does the job, but could be better :slight_smile:

Dear othiele, thank you for always being helpful. Yes, I know, and I tried to go down to 2 ms, but was unsuccessful. Can you tell me where this value has to be changed, and how?

WebRTC VAD does not support other frame durations. Do not use values other than 10, 20 or 30 ms.
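To make that constraint concrete, here is a minimal sketch of how one might validate the frame duration and cut raw audio into fixed-size frames before handing them to the VAD. It is not DSAlign's actual code: the names `frame_size_bytes` and `split_frames` are hypothetical, and it assumes 16-bit mono PCM at one of the sample rates the py-webrtcvad README lists (8000, 16000, 32000 or 48000 Hz).

```python
# Hypothetical helpers, not part of DSAlign or webrtcvad.
SUPPORTED_FRAME_MS = (10, 20, 30)            # the only durations WebRTC VAD accepts
SUPPORTED_SAMPLE_RATES = (8000, 16000, 32000, 48000)


def frame_size_bytes(sample_rate, frame_ms, sample_width=2):
    """Return the size in bytes of one frame of mono PCM audio."""
    if frame_ms not in SUPPORTED_FRAME_MS:
        raise ValueError("frame duration must be 10, 20 or 30 ms")
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError("unsupported sample rate")
    # samples per frame times bytes per sample (2 bytes for 16-bit audio)
    return sample_rate * frame_ms // 1000 * sample_width


def split_frames(pcm, sample_rate, frame_ms):
    """Split raw PCM bytes into complete frames, dropping a trailing partial frame."""
    size = frame_size_bytes(sample_rate, frame_ms)
    return [pcm[i:i + size] for i in range(0, len(pcm) - size + 1, size)]
```

Each frame from `split_frames` is the kind of chunk you would pass to `webrtcvad.Vad.is_speech(frame, sample_rate)`. A value such as 2 ms is rejected here up front instead of failing inside the VAD.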

I guess that you are not satisfied with how VAD is currently splitting your audio. In my experience it is just not good enough for some situations. It works great for slow conversations with pauses in between, but it doesn’t work for fast spoken audio like radio shows.

If you find something more suitable, let us know

Hello @reuben, you are right, but I will still try my best.
@othiele, you have understood perfectly. Yes, I am working on it at my end and will definitely share what I find with you. I will be in touch. Let me know as well if you find anything suitable.