Using CTC decoder with custom logits on Android

Hi! In short, I would like to run the CTC decoder on Android with custom logits. I have a model of my own that produces logits, which should then be processed by a CTC decoder. I would like to use the DeepSpeech CTC decoder with a scorer.

I went over the documentation for Android support and the Java API, and I also tested libdeepspeech from JCenter. It seems that the current API doesn’t provide such an option: it only runs the whole process, recognition plus decoding. I hope it won’t be too hard to add a binding for the CTC decoder, since the API already exposes functions for providing your own scorer and setting the beam width.

Could somebody please point me to the place where I should make the appropriate changes to be able to call the CTC decoder separately from Java (Android)?

Thank you in advance.

Sorry, but providing bindings for the CTC decoder is not really something we want to support.

Yes, because that’s the purpose of our API. CTC decoding in itself is an implementation detail.

The CTC decoder is well separated in native_client/ctcdecode, so you should be able to call it directly from Java through JNI.

From there you can take a look at the scorer’s API https://github.com/mozilla/DeepSpeech/blob/ffcec7f9aa8112390bcf6b990f126f94674664a6/native_client/ctcdecode/scorer.h and follow our deepspeech code for scorer call sites.
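
For readers unfamiliar with JNI, the Java side of such a hand-written binding could look roughly like the sketch below. Everything in it is hypothetical: the class name, the library name `ctcdecoder_jni`, the native method names and their parameters are assumptions made for illustration, not part of libdeepspeech. The matching C++ glue that actually constructs the Scorer and calls the beam search would still have to be written and built against native_client/ctcdecode. The only part that runs without that glue is the `flatten` helper, which assumes the native side wants a flat, time-major array plus its two dimensions.

```java
// Hypothetical Java side of a hand-written JNI binding. None of these names
// exist in libdeepspeech; the matching C++ glue has to be written separately.
final class CtcDecoderBinding {

    // Loads the shared library containing the JNI glue. The library name is an
    // assumption; call this once before using any native method.
    static void load() {
        System.loadLibrary("ctcdecoder_jni");
    }

    // Would construct a native Scorer and return its address as an opaque handle.
    // The parameter list is an assumption; check scorer.h for what is really needed.
    static native long createScorer(String scorerPath, String alphabetPath);

    // Would run the native beam search over a flat, time-major matrix
    // (timeSteps x numClasses) and return the best transcript.
    static native String decode(double[] probs, int timeSteps, int numClasses,
                                int beamWidth, long scorerHandle);

    // Would release the native Scorer behind the handle.
    static native void freeScorer(long scorerHandle);

    // Pure-Java helper: flattens a [time][class] matrix into row-major order,
    // the layout a C function taking (pointer, time_dim, class_dim) would expect.
    static double[] flatten(double[][] probs) {
        if (probs.length == 0) {
            return new double[0];
        }
        int numClasses = probs[0].length;
        double[] flat = new double[probs.length * numClasses];
        for (int t = 0; t < probs.length; t++) {
            if (probs[t].length != numClasses) {
                throw new IllegalArgumentException("row " + t + " has a different length");
            }
            System.arraycopy(probs[t], 0, flat, t * numClasses, numClasses);
        }
        return flat;
    }
}
```

Passing the Scorer around as a `long` handle is a common JNI pattern for keeping a native object alive across calls; it also means the Java code is responsible for calling `freeScorer` when done.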

Thank you for pointing me in the right direction. It makes sense that you don’t provide this in the API. I don’t have much experience with JNI and SWIG, but I will try to figure it out. I understand that basically all I need to do is create a Scorer and run the beam search (https://github.com/mozilla/DeepSpeech/blob/fcd9563fcd8b47ee5719b24a9d7f0d9a4eaf372f/native_client/ctcdecode/ctc_beam_search_decoder.h#L95).
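
One thing worth checking before running the beam search on custom model output: if the decoder expects per-time-step probabilities rather than raw logits (the linked header should say which), the logits need a softmax first. The sketch below is a plain-Java illustration of that step under that assumption; the class and method names are made up for this example.

```java
// Illustrative helper, not part of DeepSpeech: turns raw logits into
// per-time-step probabilities with a numerically stable softmax.
final class LogitsToProbs {

    // logits[t][c] is the raw score of class c at time step t.
    // Returns a matrix of the same shape where each row sums to 1.
    static double[][] softmaxRows(float[][] logits) {
        double[][] probs = new double[logits.length][];
        for (int t = 0; t < logits.length; t++) {
            float[] row = logits[t];

            // Subtract the row maximum before exponentiating to avoid overflow.
            double max = Double.NEGATIVE_INFINITY;
            for (float v : row) {
                max = Math.max(max, v);
            }

            double sum = 0.0;
            double[] out = new double[row.length];
            for (int c = 0; c < row.length; c++) {
                out[c] = Math.exp(row[c] - max);
                sum += out[c];
            }
            for (int c = 0; c < row.length; c++) {
                out[c] /= sum;
            }
            probs[t] = out;
        }
        return probs;
    }
}
```

If the model already ends in a softmax layer, this step should be skipped, since applying it twice would flatten the distribution.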

I guess you can get inspiration from our use of SWIG, but that’ll require some reading of makefiles :)

More or less, yeah, that should be the big picture.