Shouldn't adding some buffer time before and after each segment
that webrtcvad outputs solve the problem?
For example, if the VAD says the voice lies between 4.20 s (start) and 6.80 s (end),
we can cut the chunk from 4.18 s to 6.82 s instead,
i.e. add a 20 ms buffer before the start time and after the end time.
The only difficulty would be choosing how much buffer time to use.
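
To make the idea concrete, this is roughly what I have in mind, as a minimal sketch only. The names `pad_segments`, `pad_s` and `total_s` are made up for illustration and are not part of webrtcvad. It assumes the VAD output has already been converted into (start, end) times in seconds, sorted by start time. Segments that touch or overlap after padding are merged so the same audio is not cut twice.

```python
def pad_segments(segments, pad_s=0.02, total_s=None):
    """Widen each (start, end) segment by pad_s seconds on both sides.

    segments: list of (start, end) tuples in seconds, sorted by start time
              (assumed to come from the VAD step).
    pad_s:    buffer added before the start and after the end (hypothetical default).
    total_s:  optional audio duration in seconds, used to clamp the end time.
    """
    padded = []
    for start, end in segments:
        # Never go before the beginning of the audio.
        new_start = max(0.0, start - pad_s)
        new_end = end + pad_s
        # Never go past the end of the audio, if its duration is known.
        if total_s is not None:
            new_end = min(total_s, new_end)
        # Merge with the previous segment if the padding made them overlap.
        if padded and new_start <= padded[-1][1]:
            prev_start, prev_end = padded[-1]
            padded[-1] = (prev_start, max(prev_end, new_end))
        else:
            padded.append((new_start, new_end))
    return padded


# The example from above: voice between 4.20 s and 6.80 s, 20 ms buffer.
print(pad_segments([(4.20, 6.80)], pad_s=0.02))
```

Keeping the buffer as a parameter makes it easy to try several values (say 20 ms, 100 ms, 300 ms) and listen to which one stops clipping the words.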
Am I correct to follow this approach to deal with this error?
Thanks in advance