I am trying to create a voice corpus for DeepSpeech training. I have downloaded YouTube videos and their subtitles. I noticed that for many videos, video frames and video subtitles are not 100% correctly aligned.
Is there any tool available that will align the frames with the subtitles? Please advise.
lissyx
Is this content you are allowed to use? How were the subtitles made?
Can you elaborate on that?
There are several projects that allow you to perform forced alignment, but they require an existing, not-perfect-but-good-enough model for your language.
Could you send me links to those projects that do forced alignment?
I noticed that for many videos, video frames and video subtitles are not 100% correctly aligned.
I mean that each subtitle frame has a start time and an end time. I split the entire audio file into multiple audio files based on the start and end times of each subtitle frame. I noticed that in many frames the start and end times are wrongly annotated. I do not know how the subtitles and their timings were generated.
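For reference, the splitting step described above can be sketched roughly as follows. This is only an illustration, assuming the subtitles are in SRT format and the audio has already been converted to WAV; the helper names `parse_srt` and `cut_wav` are made up for this example and are not part of DeepSpeech or any other tool.

```python
import re
import wave

# Matches an SRT timing line such as "00:00:01,500 --> 00:00:04,250".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(text):
    """Return a list of (start_ms, end_ms, caption) tuples from SRT text."""
    cues = []
    # Subtitle frames are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                caption = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((to_ms(*g[:4]), to_ms(*g[4:]), caption))
                break
    return cues


def cut_wav(src_path, dst_path, start_ms, end_ms):
    """Copy the audio between start_ms and end_ms into a new WAV file."""
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        start = min(start_ms * rate // 1000, total)
        end = min(end_ms * rate // 1000, total)
        src.setpos(start)
        frames = src.readframes(max(end - start, 0))
        params = src.getparams()
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The frame count in the header is corrected when the file is closed.
        dst.writeframes(frames)
```

Calling `cut_wav` once per tuple returned by `parse_srt` gives one clip per subtitle frame. The clips inherit whatever timing errors the subtitle file contains, which is exactly the problem I am seeing.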
lissyx