Hey Kuba,
I've built a semi-automatic solution to extract sentences (using
NLTK) from old movie subtitles. The process is not fully automated, and
I try to review the sentences before I add them to the collector
(though sometimes I may miss a low-quality sentence). I'll try to modify
the process so that it takes only those sentences in which every word is
in a Polish dictionary (btw, my blacklist is still growing, so the
quality of the batches should improve with each iteration).
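
For illustration, here is a rough sketch of the dictionary-plus-blacklist filter described above. It is not my actual script: the tiny `POLISH_WORDS` and `BLACKLIST` sets and the function names are made up for the example, and it uses a plain regex for word splitting where the real pipeline relies on NLTK.

```python
import re

# Hypothetical stand-ins: a real run would load a full Polish word list
# and the growing blacklist from files.
POLISH_WORDS = {"to", "jest", "dobry", "film", "kot", "śpi"}
BLACKLIST = {"film"}

# Matches runs of Unicode letters (no digits, no underscores).
WORD_RE = re.compile(r"[^\W\d_]+")


def keep_sentence(sentence, dictionary=POLISH_WORDS, blacklist=BLACKLIST):
    """Return True only if every word is in the dictionary and none is blacklisted."""
    words = [w.lower() for w in WORD_RE.findall(sentence)]
    if not words:
        # Nothing but digits or punctuation: not a usable sentence.
        return False
    return all(w in dictionary and w not in blacklist for w in words)


def filter_sentences(sentences, dictionary=POLISH_WORDS, blacklist=BLACKLIST):
    """Keep only the sentences that pass the dictionary and blacklist check."""
    return [s for s in sentences if keep_sentence(s, dictionary, blacklist)]
```

With these toy sets, `filter_sentences(["Kot śpi.", "Kot sleeps.", "To jest dobry film."])` keeps only the first sentence: the second contains a non-dictionary word and the third a blacklisted one. Sentences that pass would still go through manual review.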
Do I need to include the movie title in the description of a sentence
batch?
BR