Transcribe-anything: a Python package to make transcribing YouTube/local files easy

ZachVorhies · May 19, 2021, 7:40pm

This is a package I made that makes transcription easier. DeepSpeech is VERY particular about the formats it accepts, so I built a front end that takes care of fetching and then transcoding the resource before passing it on to DeepSpeech to create the transcriptions.

This toolset has automated tests for Windows/macOS/Linux that run on GitHub Actions. As far as I know, it’s the easiest toolset out there. It downloads the DeepSpeech 0.9.3 models and caches them. No model installation required!

Example

Example (cmd):
- transcribe_anything <YOUTUBE_URL> > out_subtitles.txt
- transcribe_anything <LOCAL.MP4/WAV> > out_subtitles.txt

Example (api):

from transcribe_anything.api import bulk_transcribe

urls = ['https://www.youtube.com/watch?v=Erk4_jFDjzQ']

# Called for each URL that resolves, with the resulting subtitles.
def onresolve(url, sub):
    print(url, sub)

# Called for each URL that fails.
def onfail(url):
    print(f'Failed: {url}')

bulk_transcribe(urls, onresolve=onresolve, onfail=onfail)
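If you want to keep the results instead of printing them, one option is to have the callbacks write into shared containers. The sketch below assumes only the callback signatures shown in the example above, onresolve(url, sub) and onfail(url). The helper name make_collectors is hypothetical, and the callbacks are invoked by hand here so the snippet runs without the package installed.

```python
from typing import Callable, Dict, List, Tuple

Collectors = Tuple[
    Callable[[str, str], None],  # onresolve(url, sub)
    Callable[[str], None],       # onfail(url)
    Dict[str, str],              # url -> subtitles
    List[str],                   # urls that failed
]


def make_collectors() -> Collectors:
    """Build callbacks that record results instead of printing them.

    Hypothetical helper for illustration; not part of transcribe_anything.
    """
    results: Dict[str, str] = {}
    failures: List[str] = []

    def onresolve(url: str, sub: str) -> None:
        results[url] = sub

    def onfail(url: str) -> None:
        failures.append(url)

    return onresolve, onfail, results, failures


onresolve, onfail, results, failures = make_collectors()

# Stand-in for what bulk_transcribe would do with these callbacks.
onresolve("https://example.com/a", "hello world")
onfail("https://example.com/b")

print(results)
print(failures)
```

With the package installed, the same two callbacks would be passed as bulk_transcribe(urls, onresolve=onresolve, onfail=onfail). I have not checked whether the callbacks are invoked from worker threads; if they are, guard the containers with a lock.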

Quick start

Optional: Create a virtual Python environment

Works on Ubuntu/macOS/Win32.
Create and enter a project directory:
- mkdir transcribe_anything
- cd transcribe_anything
Download and install the virtual environment:
- curl -X GET https://raw.githubusercontent.com/zackees/make_venv/main/make_venv.py -o make_env.py
- python make_env.py
Enter the environment:
- source activate.sh

The environment is now active, and the next step will install only into this local Python. If the terminal
is closed, get back into the environment by running cd transcribe_anything and then source activate.sh
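If you are unsure whether the environment is actually active before installing, a quick check from Python can help. This is a generic sketch based on how venv and virtualenv set sys.prefix; it makes no assumption about what make_venv.py does internally beyond creating a standard virtual environment. The function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv (and recent virtualenv) leave base_prefix pointing at the system Python.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Very old virtualenv releases set real_prefix instead.
    return sys.prefix != base or hasattr(sys, "real_prefix")


print("virtual environment active:", in_virtualenv())
```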

Required: Install to the current Python environment

- pip install transcribe-anything
The command transcribe_anything will magically become available:
- transcribe_anything <YOUTUBE_URL> > out_subtitles.txt
- or: transcribe_anything <MY_LOCAL.MP4/WAV> > out_subtitles.txt

Testing

All tests are run by tox. Go to the project root directory and run tox.