Hello,
I am using DeepSpeech to transcribe audio files in bulk. I generally use the script below, but it takes a long time because it starts a new deepspeech process for every file, so the model is loaded again each time.
Please note: I don’t have the exact source transcripts for these audio clips.
Is there a faster way to transcribe audio in bulk?
import subprocess
import pandas as pd

csv_path = 'test.csv'
files = []  # (path, sentence) rows for the output CSV
with open(csv_path, 'r', encoding='utf-8') as my_file:
    for line in my_file:
        columns = line.strip().split(',')
        wav_path = columns[1]
        if wav_path != 'wav_filename':  # skip the header row
            # Each call starts a new process, so the model is reloaded for every file.
            proc = subprocess.Popen(["deepspeech", "--model", "model/output_graph.pb", "--lm", "../dependencies_swiss/lm.binary", "--trie", "../dependencies_swiss/trie", "--audio", wav_path], stdout=subprocess.PIPE)
            output = proc.communicate()[0]
            output = output.decode('utf-8', 'ignore')
            files.append((wav_path, 'res:' + output))
df = pd.DataFrame(data=files, columns=["path", "sentence"])
df.to_csv("model/submission-test.csv", index=False)
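
For comparison, here is a rough sketch of what I think loading the model once in-process would look like, instead of calling the command-line client per file. It rests on assumptions I have not verified for my setup: that the installed package exposes the DeepSpeech 0.6.x Python API (`Model(model_path, beam_width)`, `enableDecoderWithLM(lm, trie, lm_alpha, lm_beta)`, `stt(audio)`), that the values 500, 0.75 and 1.85 are acceptable decoder settings, and that the clips are already 16-bit mono WAV at the sample rate the model expects. The helper names (`read_wav_paths`, `load_transcriber`, `transcribe_all`) are made up for illustration.

```python
import csv
import wave


def read_wav_paths(csv_path, column="wav_filename"):
    """Yield the audio path from each row of a CSV that has a header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            yield row[column]


def load_transcriber(model_path, lm_path, trie_path):
    """Load the model once and return a function mapping a WAV path to text.

    ASSUMPTION: DeepSpeech 0.6.x Python API. Other releases use different
    constructor arguments and decoder setup, so check the installed version.
    """
    import numpy as np
    from deepspeech import Model

    model = Model(model_path, 500)  # 500 = beam width (assumed value)
    # 0.75 and 1.85 are assumed LM alpha/beta weights, not tuned values.
    model.enableDecoderWithLM(lm_path, trie_path, 0.75, 1.85)

    def transcribe(wav_path):
        # Assumes 16-bit mono PCM at the model's sample rate; no resampling here.
        with wave.open(wav_path, "rb") as wav:
            frames = wav.readframes(wav.getnframes())
        return model.stt(np.frombuffer(frames, np.int16))

    return transcribe


def transcribe_all(wav_paths, transcribe):
    """Apply one already-loaded transcriber to every path, in order."""
    return [(path, transcribe(path)) for path in wav_paths]


# Intended usage (paths taken from the script above):
# transcribe = load_transcriber("model/output_graph.pb",
#                               "../dependencies_swiss/lm.binary",
#                               "../dependencies_swiss/trie")
# rows = transcribe_all(read_wav_paths("test.csv"), transcribe)
```

The resulting list of `(path, sentence)` pairs can go straight into `pd.DataFrame(rows, columns=["path", "sentence"])` as before. The point of the split is that `load_transcriber` runs once, so the per-file cost should be only reading the audio and running inference.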