FailedPreconditionError

yihong · June 28, 2019, 5:07am

Hello,

I got this MFCC error when running DeepSpeech:
Missing 5 bands starting at 0 in mel-frequency design. Perhaps too many channels or not enough frequency resolution in spectrum. (input_length: 257 input_sample_rate: 44100 output_channel_count: 40 lower_frequency_limit: 20 upper_frequency_limit: 4000
What should I do to improve the frequency resolution?
Each of my wav files is 2 seconds long.

Thank you

lissyx · June 28, 2019, 5:21am

Is that at inference or training?

yihong · June 28, 2019, 5:45am

It happens during training, from epoch 0. I am using the transfer learning 2 branch. My utterances mix English and Chinese, with more English words than Chinese. @lissyx

lissyx · June 28, 2019, 6:04am

Is your whole dataset 44.1kHz stereo? Have you made any changes to the code?

yihong · June 28, 2019, 6:12am

No, I converted my whole dataset to 16-bit, 44.1kHz mono (though some of the data might be smaller or bigger than that).
Example of the properties of one of my wav files:

I did not make any changes to the code.
Should I make them all the same size? @lissyx

lissyx · June 28, 2019, 6:14am

Please avoid screenshots.

This shows 48kHz, your error reports 44.1kHz, and you say it has been converted to 16kHz. Could you make sure your whole dataset is mono 16kHz?
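
One way to check this as text rather than with screenshots is to read each file's header and count the distinct formats. The snippet below is only a minimal sketch using Python's standard `wave` module. The helper names `wav_format` and `summarize_formats` are made up for illustration and are not part of DeepSpeech, and it assumes the files are plain PCM WAV (the `wave` module raises an error on other encodings).

```python
import wave
from collections import Counter
from pathlib import Path


def wav_format(path):
    """Return (sample_rate_hz, channels, bits_per_sample) from a PCM WAV header."""
    with wave.open(str(path), "rb") as wav:
        return (wav.getframerate(), wav.getnchannels(), wav.getsampwidth() * 8)


def summarize_formats(root):
    """Count how many .wav files under `root` use each distinct format."""
    return Counter(wav_format(p) for p in Path(root).rglob("*.wav"))


# Example use (the directory name is a placeholder):
# for (rate, channels, bits), count in summarize_formats("my_dataset").items():
#     print(f"{count} files: {rate} Hz, {channels} channel(s), {bits}-bit")
```

If the summary contains more than one entry, the dataset mixes formats, and the files that differ need to be converted before training.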

yihong · June 28, 2019, 6:18am

So the whole dataset has to be 16kHz instead of 44.1kHz? @lissyx

lissyx · June 28, 2019, 6:22am

No, the whole dataset has to be in the same format. Our default setting and what we train on is 16kHz, so it’s easier for you. Just don’t mix different sample rates.
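
For context on why the sample rate matters for the error above, here is a rough back-of-the-envelope sketch. It assumes that `input_length: 257` is the number of spectrum bins covering 0 Hz up to half the sample rate, so that neighbouring bins are `(sample_rate / 2) / (input_length - 1)` apart. The function name `bin_width_hz` is made up for illustration.

```python
def bin_width_hz(sample_rate, input_length):
    """Spacing between spectrum bins, assuming they span 0 Hz to sample_rate / 2."""
    return (sample_rate / 2) / (input_length - 1)


for rate in (16000, 44100, 48000):
    print(f"{rate} Hz audio: {bin_width_hz(rate, 257):.2f} Hz per bin")
```

Under that assumption, 44.1kHz audio has bins nearly three times as wide as 16kHz audio for the same `input_length`. With 40 mel channels packed between 20 Hz and 4000 Hz, the lowest channels are narrow, so wider bins make it more likely that some channels contain no bin at all, which would be consistent with the "Missing 5 bands" message.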