Is the microphone the only available input for STT?

JohnSmith1 · July 12, 2020, 10:14am

From what I know, DeepSpeech-based transcription is only available via microphone input.

Is it possible to transcribe some audio on the web?

Obviously, transcribing it before it has been decoded would be something completely different, but what about after it has been decompressed, that is, after the codec has processed it?

Can that audio be transcribed internally?

That is, without the audio first being played on the speaker and the microphone then becoming the input.

othiele · July 12, 2020, 12:46pm

I am not quite sure I understand what you want to do.

Usually DeepSpeech takes 16 kHz, 16-bit WAV audio as input. This can come from a microphone, a file, a stream, … How you get the audio to DeepSpeech is something you have to solve yourself.
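For the file case, a minimal sketch might look like the following. It assumes the `deepspeech` Python package (0.7 or later) and `numpy` are installed, and that the recording is mono, which is an assumption on top of the 16 kHz, 16-bit format mentioned above. The helper names `read_pcm16_mono_16k` and `transcribe_file` and the model path are hypothetical, not part of DeepSpeech.

```python
import array
import sys
import wave


def read_pcm16_mono_16k(path):
    """Return the samples of a WAV file as an array of signed 16-bit ints.

    Raises ValueError if the file is not 16 kHz, 16-bit, mono
    (mono is an assumption made for this sketch).
    """
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            raise ValueError("expected a 16 kHz sample rate")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")  # "h" = signed 16-bit integer
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples


def transcribe_file(model_path, wav_path):
    """Hypothetical helper: run DeepSpeech on a prerecorded WAV file."""
    # Imported lazily so the reader above works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    audio = np.frombuffer(read_pcm16_mono_16k(wav_path), dtype=np.int16)
    return model.stt(audio)
```

Audio in any other format would first have to be converted to this layout, for example with an external tool, before being handed to the model.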

If you are looking for an end-user product, sorry, this is not it. Please read the docs for more info:

https://discourse.mozilla.org/t/what-and-how-to-report-if-you-need-support/62071/2

JohnSmith1 · July 13, 2020, 1:50am

By file, do you mean it has to be prerecorded?

Exactly which types of file and stream are supported?

So could DeepSpeech be used for transcribing a Zoom video call, for example?

othiele · July 13, 2020, 7:43am

Please ask a developer to help you. This is somewhat possible with DeepSpeech, but you’ll need somebody to programmatically set it up and there are some drawbacks.

lissyx · July 13, 2020, 9:14am

The DeepSpeech library takes WAV audio as input; you can feed it from wherever you like. Please check the various API docs at https://deepspeech.readthedocs.io
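To illustrate "from wherever you like", here is a rough sketch of feeding audio that arrives in arbitrary pieces (a socket, a pipe, a decoder's output) through the streaming API. It assumes the `deepspeech` Python package (0.7 or later) and `numpy`, and that the source already delivers raw 16 kHz, 16-bit mono PCM bytes. `pcm_chunks`, `transcribe_stream`, and the 640-byte chunk size are hypothetical choices for this example; capturing the audio from a particular application is a separate problem not covered here.

```python
from typing import Iterable, Iterator


def pcm_chunks(source: Iterable[bytes], chunk_bytes: int = 640) -> Iterator[bytes]:
    """Re-chunk an arbitrary byte stream into fixed-size pieces of 16-bit PCM.

    640 bytes = 320 samples = 20 ms at 16 kHz (an arbitrary choice).
    """
    buf = bytearray()
    for piece in source:
        buf.extend(piece)
        while len(buf) >= chunk_bytes:
            yield bytes(buf[:chunk_bytes])
            del buf[:chunk_bytes]
    # Emit what is left, dropping a trailing odd byte (one sample is 2 bytes).
    tail = len(buf) - (len(buf) % 2)
    if tail:
        yield bytes(buf[:tail])


def transcribe_stream(model_path: str, source: Iterable[bytes]) -> str:
    """Hypothetical helper: feed PCM chunks to a DeepSpeech stream."""
    # Imported lazily so pcm_chunks can be used without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    stream = model.createStream()
    for chunk in pcm_chunks(source):
        stream.feedAudioContent(np.frombuffer(chunk, dtype=np.int16))
    return stream.finishStream()
```

Re-chunking is kept separate from the model calls so the same generator can sit behind a file, a network connection, or a capture device without changes.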