This is a package I made to make transcription easier. DeepSpeech is very particular about the input formats it accepts, so I built a front end that fetches the resource and transcodes it before passing it on to DeepSpeech to create the transcription.
This toolset has automated tests for Windows/Mac/Linux that run on GitHub Actions. As far as I know, it’s the easiest toolset out there. It downloads the 0.9.3 models and caches them. No model installation required!
Example
- Example (cmd):
transcribe_anything <YOUTUBE_URL> > out_subtitles.txt
transcribe_anything <LOCAL.MP4/WAV> > out_subtitles.txt
- Example (api):
from transcribe_anything.api import bulk_transcribe

urls = ['https://www.youtube.com/watch?v=Erk4_jFDjzQ']

def onresolve(url, sub):
    print(url, sub)

def onfail(url):
    print(f'Failed: {url}')

bulk_transcribe(urls, onresolve=onresolve, onfail=onfail)
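The callbacks in the API example can also gather results in one place instead of printing them. The following is a minimal sketch, assuming that bulk_transcribe calls onresolve(url, sub) and onfail(url) exactly as in the example above. TranscriptCollector and run are hypothetical names introduced here for illustration and are not part of the package.

```python
from typing import Dict, List


class TranscriptCollector:
    """Hypothetical helper: stores results delivered through the callbacks."""

    def __init__(self) -> None:
        self.subtitles: Dict[str, str] = {}
        self.failed: List[str] = []

    def onresolve(self, url: str, sub: str) -> None:
        # Called for a URL that was transcribed; keep its subtitles.
        self.subtitles[url] = sub

    def onfail(self, url: str) -> None:
        # Called for a URL that could not be transcribed.
        self.failed.append(url)


def run(urls: List[str]) -> TranscriptCollector:
    # Imported lazily so this module still loads where the package is absent.
    from transcribe_anything.api import bulk_transcribe

    collector = TranscriptCollector()
    bulk_transcribe(urls, onresolve=collector.onresolve, onfail=collector.onfail)
    return collector
```

Whether bulk_transcribe returns only after every callback has fired is not stated here, so it is worth checking before relying on the returned collector being complete.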
Quick start
Optional: Create a virtual Python environment
- Works for Ubuntu/macOS/Win32
mkdir transcribe_anything
cd transcribe_anything
- Download and run the virtual environment setup script:
curl -X GET https://raw.githubusercontent.com/zackees/make_venv/main/make_venv.py -o make_env.py
python make_env.py
- Enter the environment:
source activate.sh
The environment is now active, and the next step will install only into this local Python environment. If the terminal is closed, get back into the environment by running cd transcribe_anything and then source activate.sh
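To confirm that the environment is active before installing, a quick check from Python can help. This is a minimal sketch, assuming the environment created by the script is a standard venv-style environment; in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv-style environment, sys.prefix points at the environment
    # while sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the check reports that no environment is active, pip install would go to the system or user Python instead.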
Required: Install into the current Python environment
- Install the package:
pip install transcribe-anything
- The command transcribe_anything will then be available.
- Run the command:
transcribe_anything <YOUTUBE_URL> > out_subtitles.txt
-or-
transcribe_anything <MY_LOCAL.MP4/WAV> > out_subtitles.txt
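The same commands can be driven from a script. Below is a minimal sketch, assuming the transcribe_anything command is on the PATH and writes the subtitles to standard output, as the redirect in the commands above suggests. build_command and transcribe_to_file are hypothetical helper names, not part of the package.

```python
import subprocess
from typing import List


def build_command(source: str) -> List[str]:
    # Mirrors the documented form: transcribe_anything <URL or local file>
    return ["transcribe_anything", source]


def transcribe_to_file(source: str, out_path: str) -> None:
    # Equivalent of the shell redirect "> out_subtitles.txt".
    # check=True raises CalledProcessError on a non-zero exit code.
    with open(out_path, "wb") as out:
        subprocess.run(build_command(source), stdout=out, check=True)
```

For example, transcribe_to_file("my_local.mp4", "out_subtitles.txt") would correspond to the second command above.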
Testing
- All tests are run by tox. Go to the project root directory and run it.