I’m looking for an OCR implementation in Gluon. My specific problem is recognizing a single line of well-formed text of no more than twenty characters at a known location in an image with wildly varying backgrounds. Conventional CV solutions, such as Tesseract, don’t work since I cannot make the image clean enough.
I have come across several OCR implementations in Gluon, such as the AWS Labs handwriting recognition repository, but they all solve much harder problems and include code I don’t need, such as page and line segmentation and lexicon search. Adapting any of these seems to be about as much work as starting afresh.
The CTC example in the MXNet repository appears to be the closest to what I need, since it’s a CAPTCHA recognizer and so much like my problem, but it uses the Symbol API. It has a pretty good README, but the README says nothing about modifying the example. The implementation also has several hard-coded values scattered throughout, which makes such modifications more difficult.
- My Web searches haven’t turned up anything focused on my problem. Am I missing any existing implementations?
- Although it uses the Symbol API, is it worth modifying the CTC example? It recognizes three or four glyphs from an alphabet of ten digits, but I need to recognize up to twenty glyphs from an alphabet of the twenty-six English capital letters.
- Does it make sense to use something out of gluonnlp, such as gluonnlp.model.train.StandardRNN, for my situation? Is OCR considered NLP?
- Am I better off starting afresh?
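
For concreteness, here is a minimal sketch of the label side of the problem as I understand it, independent of both Gluon and the Symbol API. The blank index, the padding value, and all function names are my own assumptions for illustration and are not taken from the CTC example or any library; the blank and padding conventions in particular differ between CTC implementations.

```python
import string

# Assumed conventions (not from any particular library): index 0 is the CTC
# blank and 'A'..'Z' map to 1..26.
ALPHABET = string.ascii_uppercase
BLANK = 0
MAX_LEN = 20  # at most twenty glyphs per line
PAD = -1      # hypothetical padding value for labels shorter than MAX_LEN


def encode(text):
    """Map a string of capital letters to a fixed-length list of class indices."""
    if len(text) > MAX_LEN:
        raise ValueError("text longer than MAX_LEN")
    indices = [ALPHABET.index(ch) + 1 for ch in text]
    return indices + [PAD] * (MAX_LEN - len(indices))


def min_time_steps(text):
    """Fewest output steps a CTC network needs to be able to emit `text`.

    One step per glyph, plus one blank between each pair of identical
    neighbouring glyphs.
    """
    repeats = sum(1 for a, b in zip(text, text[1:]) if a == b)
    return len(text) + repeats


def greedy_decode(step_classes):
    """Collapse per-step argmax classes: merge repeats, then drop blanks."""
    out = []
    prev = BLANK
    for c in step_classes:
        if c != prev and c != BLANK:
            out.append(ALPHABET[c - 1])
        prev = c
    return "".join(out)
```

If this is right, the network would need 27 output classes, and its output sequence would have to be at least 39 steps long to cover the worst case of twenty identical letters, which seems to be one of the values I would have to change in the CTC example.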