Transcription results very bad in English

Hi, I have installed the newest version of DeepSpeech and want to transcribe a long text. I upload the audio and convert it to 16 kHz WAV. All the words are wrong, whichever test file I upload. I am a German speaker and I say "My name is Falko." The software outputs "there arthur". What is my problem? What do I need to tune?
Thank you.

The English model is not good with accents. Listen to your converted audio to check whether it sounds fine, and run DS without the scorer to see which letters it recognizes.
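
Both checks can be scripted. Below is a minimal sketch, not an official recipe: `check_wav` and `transcribe` are hypothetical helper names, the expected format (16 kHz, mono, 16-bit) is an assumption about the released English model, and `transcribe` assumes the `deepspeech` and `numpy` Python packages are installed.

```python
import wave


def check_wav(path, rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches between the WAV header and the expected format."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {rate}")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channels, expected {channels}")
        if wav.getsampwidth() != sample_width:
            problems.append(f"{wav.getsampwidth()} bytes per sample, expected {sample_width}")
    return problems


def transcribe(model_path, wav_path, scorer_path=None):
    """Run DeepSpeech on a WAV file; leave scorer_path as None to skip the scorer."""
    # Imported lazily so check_wav works even without these packages installed.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path is not None:
        model.enableExternalScorer(scorer_path)
    with wave.open(wav_path, "rb") as wav:
        audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)
    return model.stt(audio)
```

An empty list from `check_wav` means the header matches. Calling `transcribe` once without and once with `scorer_path` shows how much of the output comes from the acoustic model and how much from the scorer.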

Thank you. It does not help. I generated a simple audio file: "My name is Falko."
Without scorer: ur verer eartho curv
With scorer: there arthur
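
To put a number on how far off such outputs are, one common measure is the word error rate (WER): the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. Here is a minimal sketch in plain Python; `word_error_rate` is a hypothetical helper, not part of DeepSpeech.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("my name is falko", "there arthur"))          # 1.0
print(word_error_rate("my name is falko", "ur verer eartho curv"))  # 1.0
```

Both outputs above score 1.0, meaning no reference word was recognized either with or without the scorer.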

I have very good results with the English model, but my accent is that of a pretty average US guy. As Olaf mentioned, it’s not great with other accents.

Is there any option to get better results or to use other models?

Fine tune it with some data you record.

Try Daniel’s German model. The English model is not great, but it usually understands general sentences. Have you checked your audio files?

Help ignite contributions from non-native English speakers to Common Voice English.

Unfortunately I have the same problem.

The results of Mozilla Voice STT are pretty bad for my accent, as you can see here: https://www.youtube.com/watch?v=7_ECqKnpg3s&ab_channel=SoftwareEngineeringCourses-SECourses

I have used deepspeech-0.8.2-models.pbmm and deepspeech-0.8.2-models.scorer

YouTube’s auto-captioning is also not dependable. For some of my videos it generates captions and the results are great, but for this one it didn’t generate any.

For example, this is another video of mine, 3.45 hours long, for which YouTube automatically generated pretty good captions: https://www.youtube.com/watch?v=43LB4ZAUqgs

Define your accent? If you are speaking English with a Turkish accent, I’m not surprised at all that the model yields poor results; this is documented on the release page. Being French myself, I have the same issue.

FWIW, the audio in that video has a lot of echo which our English model can have a hard time with.

How can you tell it has echo? It sounds fine to me. Also, when YouTube decides to generate captions, they are pretty good for my videos.

Example: https://www.youtube.com/watch?v=43LB4ZAUqgs

I tested Google Speech using the free credits they give, and the results are simply amazing.

I converted the audio to mono, 16-bit, 16000 Hz with the command below:

ffmpeg -i "introduction to programming lecture 1 week 1.mkv" -af aformat=s16:16000:mono introduction_program_lecture_1.flac
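
For trying several sample rates, the same conversion can be scripted. This is a minimal sketch that builds the equivalent ffmpeg command from Python, with the `aformat` options spelled out by name; `build_ffmpeg_cmd` and `convert` are hypothetical helpers, and the sketch assumes ffmpeg is on the PATH.

```python
import subprocess


def build_ffmpeg_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command that outputs mono, 16-bit audio at sample_rate."""
    audio_filter = (
        f"aformat=sample_fmts=s16:sample_rates={sample_rate}:channel_layouts=mono"
    )
    return ["ffmpeg", "-i", src, "-af", audio_filter, dst]


def convert(src, dst, sample_rate=16000):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_cmd(src, dst, sample_rate), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, such as the one above.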

Here is the audio file: 1 minute 22 seconds (I kept it short for testing).

4m.zip (1.4 MB)

And here is the generated transcript:

Hello, dear students.

Welcome to the lecture 1 of introduction to programming course.

In this course, you will learn how to program you will learn the fundamentals of programming. You will learn how to be a software engineer. This course is the primary the most important cause of your Carriage. Why is that because in this course you will you will learn how to do

Programming haftar called how to compose a software. So this is your most important lesson among all of the courses you are going to take because this lesson will teach you how to program.

okay, so if you want to be a good programmer a good software engineer you have to

Perfect.

This course you have to give your most attention to this.

The audio quality is not good on that video as mentioned above. You can hear there’s a tinny after effect with the speech which is the echo, plus background noises. It’s also heavily accented. Both of those were mentioned above as not ideal for the English model.

If you’re trying to transcribe videos, Google may be your best bet for now.

With Mozilla I did not find a good solution, so I tested a few other open-source systems.
On a Debian system I used pocketsphinx with pocketsphinx-en-us. The results were not so bad.
I got the best results with https://github.com/alphacep/vosk-api and the model https://alphacephei.com/vosk/models/vosk-model-en-us-aspire-0.2.zip
It is free of charge.
https://alphacephei.com/vosk/models
Yes, the paid Google Cloud and Amazon APIs offer the best results.

Thanks for testing. It’s not always free of charge.

True. It gave me $300 of credit, valid for 3 months.

I will test with 44100 Hz to see if it makes any difference.

Edit:

I tested with 48000 Hz and it got even better, wow.

Hello, dear students.

Welcome to the lecture 1 of introduction to programming course.

In this course, you will learn how to program you will learn the fundamentals of programming. You will learn how to be a software engineer. This course is the primary the most important course of your Carriage. Why is that because in this course you will you will learn how to do

Programming how to code how to compose a software. So this is your most important lesson.

Among all of the courses you are going to take because these lesson will teach you how to program.

okay, so if you want to be a good programmer a good software engineer you have to

Perfect.

This course you have to give your most attention to this.