How to get the DeepSpeech model output text without the LM (language model) and send that text to the LM separately

narasimman.saravana · November 8, 2019, 11:02am

Can anyone give an idea…

lissyx · November 8, 2019, 1:21pm

@narasimman.saravana Write code that runs with and without the LM?

narasimman.saravana · November 9, 2019, 6:25am

I think you did not understand my question.

How can I run the LM separately? For example, I have a text that I want to send to the LM. Is there a command line for running the LM on its own?

lissyx · November 9, 2019, 7:49am

Well, when you just throw a question in a title, it’s not surprising people don’t understand what you want. So you want to do what exactly, because I don’t get you.

narasimman.saravana · November 9, 2019, 7:54am

I’m really sorry for the bad question.
For example, I got the text ‘neaar branc’ from the DeepSpeech model without the LM, but the correct sentence is ‘nearest branch’. Now I want to give the ‘neaar branc’ text to the LM and have it return ‘nearest branch’. I know that if I use DeepSpeech with the LM it will give the correct prediction, but I want to run the LM separately, because I want to process the raw text from DeepSpeech (without the LM) and then pass it to the LM to correct the sentence.

lissyx · November 9, 2019, 7:56am

That’s an unusual usage. Problem is, the LM runs on the output of the networks, logits, during the decode phase. So you cannot achieve that through the API. Can you explain why you want to proceed this way?
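To illustrate why decoding with the LM differs from correcting finished text, here is a toy sketch. It is not DeepSpeech's decoder (a real decoder uses a beam search with weighted log-scores rather than enumerating every alignment), and the symbols, probabilities and LM prior are all made-up values for illustration.

```python
import itertools
from collections import defaultdict

SYMBOLS = ["a", "b", "-"]  # "-" stands for the CTC blank
# Hypothetical per-timestep probabilities from an acoustic model (2 steps).
PROBS = [
    [0.55, 0.40, 0.05],
    [0.10, 0.30, 0.60],
]
# Hypothetical language-model prior over the possible output strings.
LM_PRIOR = {"a": 0.10, "b": 0.70, "ab": 0.10, "ba": 0.05, "": 0.05}


def collapse(path):
    """CTC collapse: merge repeated symbols, then drop blanks."""
    merged = [s for s, _ in itertools.groupby(path)]
    return "".join(s for s in merged if s != "-")


def acoustic_scores(probs):
    """Sum the probability of every alignment that collapses to each string."""
    totals = defaultdict(float)
    for idx_path in itertools.product(range(len(SYMBOLS)), repeat=len(probs)):
        p = 1.0
        for t, i in enumerate(idx_path):
            p *= probs[t][i]
        totals[collapse([SYMBOLS[i] for i in idx_path])] += p
    return dict(totals)


scores = acoustic_scores(PROBS)
# Decoding on acoustic evidence alone.
without_lm = max(scores, key=scores.get)
# Decoding with the LM prior folded into the score of each candidate.
with_lm = max(scores, key=lambda s: scores[s] * LM_PRIOR[s])
print(without_lm, with_lm)
```

With these numbers the acoustic-only choice is "a" and the LM-weighted choice is "b". Once only the text "a" is kept, the fact that "b" was almost as likely is lost, which is why the LM is applied during decoding rather than afterwards.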

narasimman.saravana · November 9, 2019, 8:10am

For noise audio I’m planning to include *. For example: hello audio (0.6 sec) + noise audio (1 sec), transcription: hello ** (one * for every 0.5 sec of noise). If the audio contains only noise, the number of * in the transcript depends on its length in seconds. Why I am doing it this way: when voice + noise is run through DeepSpeech the output is somewhat OK, but if the audio contains only noise it predicts random words. So I want to train on noise-only audio files as well, but I do not want to include * in my LM.

lissyx · November 9, 2019, 8:16am

You want your model to learn to output * when there is noise detected?

narasimman.saravana · November 9, 2019, 8:17am

Yes, if I send noise audio it should return multiple * based on the number of seconds.
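The labelling scheme described above (one * per 0.5 sec of noise) could be sketched as follows. The function names are hypothetical, and rounding down to whole 0.5 sec steps and placing the marks after the speech text are assumptions; the thread only gives the "hello **" example.

```python
def noise_marks(noise_seconds, seconds_per_mark=0.5):
    """One '*' per full `seconds_per_mark` of noise (rounding down is an assumption)."""
    return "*" * int(noise_seconds // seconds_per_mark)


def build_transcript(speech_text, noise_seconds):
    """Join the spoken text and the noise marks, skipping whichever is empty."""
    parts = [speech_text.strip(), noise_marks(noise_seconds)]
    return " ".join(p for p in parts if p)


print(build_transcript("hello", 1.0))  # speech followed by 1 sec of noise
print(build_transcript("", 2.0))       # noise-only clip
```

For training, the * character would presumably also need to be listed in the alphabet the acoustic model is trained with, while being left out of the text used to build the LM.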

lissyx · November 9, 2019, 8:18am

Do you have a use case for the *, or is it just a way to get rid of noise?

narasimman.saravana · November 9, 2019, 8:28am

To detect noise.
We have trained with nearly 10,000 real-time noise recordings, used to augment our own 200,000 clean audio files. The model predicts voice mixed with noise correctly, but on a real-time telephone stream we can also get noise only. The problem is that the model predicts random words for noise-only audio.
So we are planning to train on noise-only audio files as well.

lissyx · November 9, 2019, 8:49am

Have you tried VAD (voice activity detection)? From my experiments it was already good enough to avoid recognition when there is noise. Maybe you should add some noise-only audio to your training, with multiple * as the matching transcript?

Or even another tool in front of DeepSpeech that would detect voice / pure noise. I remember finding something like that on GitHub, made by the French INA.
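A minimal sketch of gating recognition behind a VAD, assuming 16-bit mono PCM input. The helper names and the `min_ratio` threshold are hypothetical, and the classifier is passed in as a callable so that any detector can be plugged in; the stand-in used here only checks for non-zero samples.

```python
def frames(pcm, sample_rate, frame_ms=30, sample_width=2):
    """Split mono PCM bytes into fixed-size frames, dropping a short tail."""
    size = int(sample_rate * frame_ms / 1000) * sample_width
    for start in range(0, len(pcm) - size + 1, size):
        yield pcm[start:start + size]


def speech_ratio(pcm, sample_rate, is_speech, frame_ms=30):
    """Fraction of frames that the classifier flags as speech."""
    flags = [bool(is_speech(f, sample_rate)) for f in frames(pcm, sample_rate, frame_ms)]
    return sum(flags) / len(flags) if flags else 0.0


def should_transcribe(pcm, sample_rate, is_speech, min_ratio=0.3):
    """Only hand the audio to the recognizer if enough frames look like speech."""
    return speech_ratio(pcm, sample_rate, is_speech) >= min_ratio


def toy_is_speech(frame, sample_rate):
    """Stand-in classifier: treats any non-zero sample as speech."""
    return any(frame)


silence = bytes(16000 * 2)        # 1 sec of zeros at 16 kHz
signal = b"\x01\x00" * 16000      # 1 sec of non-zero samples
print(should_transcribe(silence, 16000, toy_is_speech))
print(should_transcribe(signal, 16000, toy_is_speech))
```

If the `webrtcvad` package is the VAD in use, `webrtcvad.Vad(3).is_speech` takes the same `(frame, sample_rate)` arguments and could be passed in place of `toy_is_speech`.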

narasimman.saravana · November 9, 2019, 9:00am

@lissyx We tried VAD, but it only detects silence.

Could you please share the github link… if possible.
Thanks

lissyx · November 9, 2019, 9:06am

I don’t have it here, I’ll try and find it later today

narasimman.saravana · November 9, 2019, 9:24am

@lissyx Can I use KenLM (GitHub - kpu/kenlm: KenLM: Faster and Smaller Language Model Queries) to run the LM on the text separately?

I tried it, but it does not return the correct words; it reports OOV for me.

The image mentioned above will help you see that.

The actual words are “nearest branch”.
The given words are “ne branc”, but it does not return the matching words.
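An n-gram LM such as KenLM scores word sequences; it does not propose replacements, so misspelled words like “branc” are simply reported as OOV. Correcting text after the fact would need a separate step that generates candidate words, with the LM only ranking them. The sketch below shows that idea with a hypothetical vocabulary and made-up bigram scores. It is not how DeepSpeech uses its LM.

```python
import difflib
import itertools

# Hypothetical vocabulary and bigram log-scores, for illustration only.
VOCAB = ["nearest", "near", "branch", "brand", "bank"]
BIGRAM_LOGP = {
    ("nearest", "branch"): -0.5,
    ("near", "branch"): -2.0,
    ("nearest", "bank"): -1.0,
    ("near", "bank"): -1.5,
}
UNSEEN_LOGP = -6.0  # flat penalty for bigrams missing from the toy table


def lm_score(words):
    """Sum toy bigram log-scores over adjacent word pairs."""
    return sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP) for pair in zip(words, words[1:]))


def candidates(token, n=3, cutoff=0.5):
    """In-vocabulary words that look similar to a possibly misspelled token."""
    if token in VOCAB:
        return [token]
    # Fall back to the token itself if nothing is close enough.
    return difflib.get_close_matches(token, VOCAB, n=n, cutoff=cutoff) or [token]


def correct(text):
    """Pick the candidate word sequence with the best toy LM score."""
    options = [candidates(tok) for tok in text.split()]
    best = max(itertools.product(*options), key=lm_score)
    return " ".join(best)


print(correct("neaar branc"))
```

With a real KenLM model, `lm_score` could be replaced by `kenlm.Model.score` on the joined sentence. The result still depends entirely on the candidate step, which sees only spelling and none of the acoustic information the decoder has.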

lissyx · November 9, 2019, 10:32am

Please avoid sharing such content as a screenshot; it’s hard to read on mobile and not searchable.

What’s in your test.py?

narasimman.saravana · November 9, 2019, 10:34am

Sorry for the attachment.

Please refer to this link for test.py:
https://github.com/kpu/kenlm/blob/master/python/example.py

lissyx · November 9, 2019, 10:38am

No, I don’t think that is enough. Look at how the logits get passed to the CTC decoder in DeepSpeech.py.

lissyx · November 9, 2019, 1:56pm

Here it is, @narasimman.saravana: https://github.com/ina-foss/inaSpeechSegmenter/
