source = audiofile_to_input_vector(wav_file, self._model_feeder.numcep, self._model_feeder.numcontext)
source_len = len(source)
target = text_to_char_array(transcript, self._alphabet)
target_len = len(target)
if source_len < target_len:
    raise ValueError('Error: Audio file {} is too short for transcription.'.format(wav_file))
This tells me that whenever the audio yields fewer feature frames than the transcript has characters (that is, the audio is too short for the text spoken), it raises the error.
I tried to apply this condition to my audio files to filter such files out, but I am not able to recreate text_to_char_array, as it comes from another module. What are your suggestions at this point?
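One way to pre-filter files without importing the training code is to estimate both lengths directly. The following is only a rough sketch, not the project's own check: it assumes one feature frame per 0.01 s of audio and one label per transcript character, and the helper names (wav_sample_info, estimated_frame_count, has_enough_frames) are made up for illustration. The real frame count returned by audiofile_to_input_vector may differ, so treat the result as an approximation.

```python
import wave


def wav_sample_info(wav_path):
    """Return (samples per channel, sample rate) for a PCM WAV file."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getnframes(), wav_file.getframerate()


def estimated_frame_count(n_samples, sample_rate, step_ms=10):
    """Estimate feature frames, ASSUMING one frame per step_ms of audio."""
    return (n_samples * 1000) // (sample_rate * step_ms)


def has_enough_frames(n_samples, sample_rate, transcript, step_ms=10):
    """Mirror the `source_len < target_len` check using the estimates above.

    ASSUMPTION: the label sequence has one entry per transcript character.
    """
    return estimated_frame_count(n_samples, sample_rate, step_ms) >= len(transcript)
```

For a file on disk, something like `n, rate = wav_sample_info(path)` followed by `has_enough_frames(n, rate, transcript)` would flag the clips to drop before training.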
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
22
Yeah, I checked that it's coming from the text.py code, but since that code requires a ‘config_file’, I don’t know how to recreate the ‘text_to_char_array’ function independently for my purpose. Is there any other method to filter out the shorter audio files?
lissyx
24
Sorry to insist, but read the source. Your config_file is the … alphabet file. So I guess that is something you have?
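To illustrate the idea of an alphabet file driving the character-to-label mapping, here is a minimal sketch. It is not the code from text.py: the file format (one symbol per line, lines starting with # treated as comments) and the function names load_alphabet and text_to_labels are assumptions made for this example.

```python
def load_alphabet(lines):
    """Build a character -> index mapping from alphabet file lines.

    ASSUMED format: one symbol per line, '#' lines are comments.
    """
    char_to_index = {}
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines (assumption about the format)
        symbol = line.rstrip("\n")
        if symbol == "":
            continue  # skip blank lines
        char_to_index[symbol] = len(char_to_index)
    return char_to_index


def text_to_labels(text, char_to_index):
    """Convert a transcript into a list of integer labels, one per character."""
    return [char_to_index[ch] for ch in text]
```

With a mapping like this, len(text_to_labels(transcript, mapping)) gives the target length to compare against the number of audio frames. A character missing from the alphabet raises a KeyError, which is also a useful signal when cleaning transcripts.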
def audiofile_to_input_vector(audio_filename, numcep, numcontext):
    r"""
    Given a WAV audio file at ``audio_filename``, calculates ``numcep`` MFCC features
    at every 0.01s time step with a window length of 0.025s. Appends ``numcontext``
    context frames to the left and right of each time step, and returns this data
    in a numpy array.
    """
    # Load wav files
    fs, audio = wav.read(audio_filename)
    return audioToInputVector(audio, fs, numcep, numcontext)
What do I pass for ‘numcep’ and ‘numcontext’? How are they calculated, or where do they come from?
lissyx
26
Can you read the source calling that? It’s clearly trivial. Hint: git grep audiofile_to_input_vector
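For what the two parameters mean: per the docstring above, numcep is the number of MFCC features per time step and numcontext is the number of neighbouring frames appended on each side, and in the snippet at the top of the thread both are read from self._model_feeder. The sketch below only illustrates the context-window idea on plain Python lists. The function name add_context and the zero padding at the edges are assumptions for this example, not the library's implementation.

```python
def add_context(frames, numcontext):
    """Append numcontext frames of left and right context to every time step.

    frames: list of feature vectors (lists of floats), one per time step.
    ASSUMPTION: positions before the start or after the end are zero-filled.
    """
    if not frames:
        return []
    numcep = len(frames[0])
    padding = [0.0] * numcep
    windows = []
    for t in range(len(frames)):
        window = []
        for offset in range(-numcontext, numcontext + 1):
            i = t + offset
            window.extend(frames[i] if 0 <= i < len(frames) else padding)
        windows.append(window)
    return windows
```

Each output row then has numcep * (2 * numcontext + 1) values, while the number of rows (time steps) stays the same, which is why the length check compares time steps against transcript characters.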
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
And when I run ‘pip list | grep tensorflow’, I get: