Resample training input samples to align with inference constraints

alchemi5t · July 23, 2019, 11:54am

Continuing the discussion from:

There could be value in ensuring that training can be done at other sample rates than 16kHz, but I’m unsure that resampling is the proper solution, to be honest.

I have added a variable (it could be changed to a flag) that can be set to the desired target SR. But won’t having a single sample rate, the same for training and inference, improve convergence?
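
To make the idea concrete, here is a minimal sketch of resampling to a target rate. It is not the actual patch: `TARGET_SR` and `resample_linear` are hypothetical names, and it uses plain linear interpolation with no anti-aliasing filter, so a real pipeline would more likely rely on a band-limited resampler such as SoX.

```python
def resample_linear(samples, source_sr, target_sr):
    """Resample a mono signal by linear interpolation (no anti-aliasing filter)."""
    if source_sr <= 0 or target_sr <= 0:
        raise ValueError("sample rates must be positive")
    if source_sr == target_sr or len(samples) < 2:
        return list(samples)
    # Number of output samples that covers the same duration.
    n_out = max(1, round(len(samples) * target_sr / source_sr))
    step = source_sr / target_sr  # input samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        # Weighted average of the two neighbouring input samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


TARGET_SR = 16000  # hypothetical stand-in for the variable described above

one_second_44k = [0.0] * 44100  # one second of silence at 44.1 kHz
resampled = resample_linear(one_second_44k, 44100, TARGET_SR)
print(len(resampled))  # 16000
```

With something like this in the import path, every training clip would reach the model at the same rate regardless of how it was recorded.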

I’ve tested training models at different SRs and running inference at 16 kHz, and as expected, the model produces unacceptable results (WER, CER, loss, and output taken together). But the same model performs much better when the test files use the same SR as the training data (test results after training).
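
For reference, WER and CER are both edit-distance ratios. The sketch below shows the usual textbook definition; it is an assumption that this matches how the training script computes its reported numbers, and the function names are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("the cat sat", "the bat sat"))
print(cer("abc", "abd"))
```

Comparing these two numbers for matched and mismatched sample rates is one way to quantify the gap described above.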

The audio does retain its original characteristics after resampling.

Do you think this would not be useful for training?

alchemi5t · July 23, 2019, 11:57am

Maybe resample to a bracket near 16 kHz (e.g., 14-18 kHz), if you think other SRs are worth preserving (if only to keep some variety of SRs in the dataset to help the model generalize).

lissyx · July 23, 2019, 11:57am

I don’t get your point: this is exactly the current situation, training at 16kHz, inference at 16kHz.

alchemi5t · July 23, 2019, 12:00pm

The point is that I can train at 44 kHz and then not run inference on it.

Like this guy.

Instead of him having to do whatever he did, if the code had resampled his 44 kHz data to 16 kHz, he would not have had that issue.

lissyx · July 23, 2019, 12:01pm

I don’t understand a single word of what you say. Each sentence contradicts the next one.

alchemi5t · July 23, 2019, 12:02pm

I meant that I can train a model on 44 kHz data, but my inference will require 16 kHz data, which would make it hard for the model to predict accurately, if that makes sense.

lissyx · July 23, 2019, 12:04pm

I still don’t get what you are trying to achieve. Can you describe your problem precisely?

alchemi5t · July 23, 2019, 12:11pm

Let me simplify the issue. I want to know whether I can currently run inference at any SR other than 16 kHz without modifying the code.

reuben · July 23, 2019, 12:14pm

Yes, just specify the sample rate when passing in the audio. Every function that takes audio samples in the API also takes a sample rate.
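
As a rough illustration, the sample rate can be read from the WAV header and passed along with the samples. In this sketch `stt` is a hypothetical callable standing in for whichever API function is used; only the Python standard library calls are real, and the helper assumes 16-bit mono PCM input.

```python
import struct
import wave


def read_wav(path):
    """Return (samples, sample_rate) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        sample_rate = wav.getframerate()
        n_frames = wav.getnframes()
        raw = wav.readframes(n_frames)
    # Little-endian signed 16-bit integers, one per frame.
    samples = struct.unpack("<%dh" % n_frames, raw)
    return samples, sample_rate


def transcribe(path, stt):
    """Call a hypothetical stt(samples, sample_rate) with the file's own rate."""
    samples, sample_rate = read_wav(path)
    return stt(samples, sample_rate)
```

Taking the rate from the file rather than hard-coding 16000 avoids silently mislabelling audio recorded at another rate.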

alchemi5t · July 23, 2019, 12:25pm

Got it. I thought only 16 kHz was possible (I misinterpreted the README). Thanks for clearing things up.