I am writing a native UI wrapper for DS. It accepts files (MP3, WAV) and audio input devices as input. Inside my voiceToText loop I have something like:
// Feed the next chunk of 16-bit PCM samples into the open stream.
DS_FeedAudioContent(stream_st_ctx, aBuffer, nsamples);
// Returns the transcript of *all* audio fed so far, not just this chunk.
char* words = DS_IntermediateDecode(stream_st_ctx);
// QString makes a deep copy, so the DS string can be freed right away.
CResult* res = new CResult(PARTIAL_TEXT, QString(words));
DS_FreeString(words);
emit resultReady(res);
However, “words” keeps getting the entire text. Is there any plan to provide a similar function where the already processed words are omitted (i.e. removed from DS’s internal buffers)?
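In the meantime, one workaround on the wrapper side is to remember the previous intermediate result and emit only the new suffix. A minimal sketch, assuming DS_IntermediateDecode always returns the full transcript-so-far (prevPartial and the suffix logic are mine, not part of the DS API):

#include <string>  // for the prefix/suffix bookkeeping below

static std::string prevPartial;  // transcript already emitted

char* words = DS_IntermediateDecode(stream_st_ctx);
std::string full(words);
DS_FreeString(words);

// If the old text is still a prefix, emit only what was appended;
// otherwise the decoder revised earlier words, so resend everything.
std::string fresh = (full.rfind(prevPartial, 0) == 0)
                        ? full.substr(prevPartial.size())
                        : full;
prevPartial = std::move(full);
if (!fresh.empty())
    emit resultReady(new CResult(PARTIAL_TEXT, QString::fromStdString(fresh)));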
It might take some time before @reuben gets around to changing things like that in the core. If you have some spare time, prepare a PR on GitHub. This is the quickest way.
lissyx
I don’t think it’s quite a good idea, given how the decoder works.
You should just close / create new streams when you need to. Can you describe your use case?
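For reference, a rough sketch of that close/recreate pattern with the C API’s DS_FinishStream / DS_CreateStream (FINAL_TEXT and model_ctx are assumed wrapper-side names, not DS symbols):

// At a natural boundary (e.g. a detected silence), finish the stream;
// DS_FinishStream returns the final transcript and destroys the stream.
char* finalText = DS_FinishStream(stream_st_ctx);
emit resultReady(new CResult(FINAL_TEXT, QString(finalText)));
DS_FreeString(finalText);

// Then open a fresh stream for the next segment (0 is DS_ERR_OK).
StreamingState* nextStream = nullptr;
if (DS_CreateStream(model_ctx, &nextStream) == 0)
    stream_st_ctx = nextStream;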
Thanks. The use case is a desktop app that converts files and live audio input streams to documents (UI EditText controls).
note: the project is still at a very early stage… still prototyping the main operations… no UI elements yet.
The files I have to convert are 20~30 minutes long… hence the idea to “stream” and process in chunks. However, it looks like DS does not have the same concept of streaming. It seems that DS keeps the entire audio and text in the core… that is not practical.
Unless DS gets true streaming API functions, I will need to detect silences myself and close/open new streams as you suggested.
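A first cut at that detection can be a plain RMS energy gate per chunk; the threshold below is a placeholder to tune, not a DS value:

#include <cmath>

// Crude energy-based silence check over one chunk of 16-bit PCM samples.
static bool isSilence(const short* buf, unsigned int n, double rmsThreshold = 300.0)
{
    if (n == 0)
        return true;
    double sum = 0.0;
    for (unsigned int i = 0; i < n; ++i)
        sum += double(buf[i]) * double(buf[i]);
    return std::sqrt(sum / double(n)) < rmsThreshold;
}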
BTW… anyone who had a hard time adding SoX should feel free to use my re-sampling code. SoX sucks. Same for my libmad build: it is easier, since “configure” sucks for MinGW and MSYS.
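For illustration only (a generic sketch, not the re-sampling code mentioned above): DS models expect 16 kHz, 16-bit mono, and a naive linear-interpolation resampler looks roughly like this; a real one must low-pass filter first to avoid aliasing.

#include <cstddef>
#include <vector>

// Toy linear-interpolation resampler; omits the anti-aliasing filter.
std::vector<short> resampleLinear(const std::vector<short>& in,
                                  double srcRate, double dstRate = 16000.0)
{
    std::vector<short> out;
    if (in.size() < 2 || srcRate <= 0.0 || dstRate <= 0.0)
        return out;
    const double step = srcRate / dstRate;
    for (double pos = 0.0; pos + 1.0 < double(in.size()); pos += step) {
        const std::size_t i = static_cast<std::size_t>(pos);
        const double frac = pos - double(i);
        out.push_back(static_cast<short>(in[i] * (1.0 - frac) + in[i + 1] * frac));
    }
    return out;
}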
lissyx
Understood, thanks. I still do not see a good point in having the “streaming” API, though.
I will implement a silence splitter in my app and use the full converter.
Have a great 2021,
Varanda