I am training a model with a significant amount of data (about 260 hours, 215k files). Even though that may not be enough for the end result, when I start training I get the following error after a few minutes: “Not enough time for target transition sequence” … “You can turn this error into a warning by using the flag ignore_longer_outputs_than_inputs”
@lissyx @reuben Is there a way to use this flag?
Searching manually for the offending file (or files) is almost impossible.
Thank you.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
Why? On every dataset where this occurred, some basic filtering checks to ensure a rough match between audio and text fixed the issue.
Well, you have to apply that flag locally, so technically you end up maintaining a fork, which is never fun.
It shouldn’t be that hard to identify the problematic files. Plot a histogram of WAV duration (easily computable from the file size) divided by transcript length for your training set and then look at the outliers.
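As a rough sketch of that check, the snippet below computes seconds of audio per transcript character for each sample and flags the outliers numerically instead of plotting them. It assumes things the thread does not state: a training CSV with `wav_filename`, `wav_filesize` and `transcript` columns, and 16 kHz, 16-bit, mono PCM WAV files with a 44-byte header. The function names and the `k` threshold are made up for illustration.

```python
import csv
import statistics

# Assumptions (not confirmed in the thread): 16 kHz, 16-bit, mono PCM WAVs
# with a plain 44-byte header. Adjust both constants for other formats.
BYTES_PER_SECOND = 16000 * 2
WAV_HEADER_BYTES = 44


def seconds_per_char(wav_filesize, transcript):
    """Estimate audio duration from the file size, divided by transcript length."""
    duration = max(wav_filesize - WAV_HEADER_BYTES, 0) / BYTES_PER_SECOND
    return duration / max(len(transcript), 1)


def load_rows(csv_path):
    """Read a CSV assumed to have wav_filename, wav_filesize, transcript columns."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def find_outliers(rows, k=3.0):
    """Return (filename, ratio) pairs lying more than k median absolute
    deviations away from the median ratio (k is an arbitrary starting point)."""
    ratios = [
        (r["wav_filename"], seconds_per_char(int(r["wav_filesize"]), r["transcript"]))
        for r in rows
    ]
    values = [v for _, v in ratios]
    med = statistics.median(values)
    # Guard against a zero deviation when most ratios are identical.
    mad = statistics.median([abs(v - med) for v in values]) or 1e-9
    return [(name, v) for name, v in ratios if abs(v - med) / mad > k]
```

Something like `find_outliers(load_rows("train.csv"))` (the file name is a placeholder) would then list the samples worth inspecting. Very low ratios mean a long transcript for very little audio, which is the likely trigger for this error.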
@lissyx, @othiele and @reuben Thank you for the fast replies.
I really appreciate it. @othiele Maybe I should have mentioned that this is for a test run.
You saved me quite some time with the solution provided.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
Yeah, for a test run, just hack it as suggested. Once you get more serious, you should really just fix your dataset or act at importer level to eradicate those broken components.
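For the importer-level route, a minimal sketch of such a filter might look like the following. It relies on the general CTC constraint that the input needs at least one frame per target character, plus one extra frame between each pair of identical adjacent characters. The 20 ms feature step, the `(duration_ms, transcript)` sample layout and the function names are assumptions for illustration, not taken from any particular importer, and real feature extraction also depends on the window length.

```python
def min_ctc_frames(transcript):
    """Minimum number of input frames CTC needs to emit this transcript:
    one per character, plus one blank between identical neighbours."""
    repeats = sum(1 for a, b in zip(transcript, transcript[1:]) if a == b)
    return len(transcript) + repeats


def is_trainable(duration_ms, transcript, step_ms=20):
    """True if the audio yields enough frames for the transcript.
    step_ms=20 is an assumed feature step; match it to your configuration."""
    frames = duration_ms // step_ms
    return frames >= min_ctc_frames(transcript)


def filter_samples(samples, step_ms=20):
    """Split (duration_ms, transcript) pairs into kept and dropped lists."""
    kept, dropped = [], []
    for duration_ms, transcript in samples:
        target = kept if is_trainable(duration_ms, transcript, step_ms) else dropped
        target.append((duration_ms, transcript))
    return kept, dropped
```

Logging the dropped list rather than discarding it silently makes it easier to see whether the problem is bad alignment, truncated audio, or transcripts attached to the wrong file.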