12 Commits

Author SHA1 Message Date
Roberto Rosario
317d07a355 Refactor OCR app. Removes document parsing. Moves OCR processing to
model manager. Add submit and finish events.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2017-08-23 02:04:57 -04:00
Roberto Rosario
43d2539c95 Update OCR backend to work with the new document image caching system. 2016-11-02 05:05:25 -04:00
Roberto Rosario
d04117d345 PEP8 and code style cleanups. Replace lists with tuples. 2015-08-12 04:41:59 -04:00
Roberto Rosario
bec85f38f4 Text parsers and OCR backends are now used in tandem for each document. 2015-08-08 04:49:08 -04:00
Roberto Rosario
8382df91a6 Update PDF text parser classes. Remove SlateParser and substitute with a PDFMiner based parser. 2015-07-31 02:09:48 -04:00
Roberto Rosario
4527563d89 PEP8 cleanups, specially E501 line too long. 2015-07-22 18:21:37 -04:00
Roberto Rosario
47a74360dd Remove double execution of backend. Store the language in the instance. 2015-07-08 04:15:58 -04:00
Roberto Rosario
48df3dcafa PEP8 cleanups 2015-06-24 17:11:24 -04:00
Roberto Rosario
e4623fadcd PEP8 cleanups 2015-06-23 02:23:23 -04:00
Roberto Rosario
78198f3398 Smart settings refactor 2015-06-22 21:04:06 -04:00
Roberto Rosario
08a8ae2554 Move document page content code to the OCR app. Prep work for issue #186. 2015-06-17 00:21:35 -04:00
Roberto Rosario
5275061f9f Refactor OCR backend class to be file object based and use images from document page not the actual file. Use pytesseract instead of calling the CLI directly. 2015-06-09 03:28:38 -04:00