11 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Roberto Rosario | aa0f48b1a0 | Update OCR app to use organizations. | 2016-06-08 19:29:20 -04:00 |
| Roberto Rosario | d04117d345 | PEP8 and code style cleanups. Replace lists with tuples. | 2015-08-12 04:41:59 -04:00 |
| Roberto Rosario | bec85f38f4 | Text parsers and OCR backends are now used in tandem for each document. | 2015-08-08 04:49:08 -04:00 |
| Roberto Rosario | 8382df91a6 | Update PDF text parser classes. Remove SlateParser and substitute with a PDFMiner based parser. | 2015-07-31 02:09:48 -04:00 |
| Roberto Rosario | 4527563d89 | PEP8 cleanups, specially E501 line too long. | 2015-07-22 18:21:37 -04:00 |
| Roberto Rosario | 47a74360dd | Remove double execution of backend. Store the language in the instance. | 2015-07-08 04:15:58 -04:00 |
| Roberto Rosario | 48df3dcafa | PEP8 cleanups | 2015-06-24 17:11:24 -04:00 |
| Roberto Rosario | e4623fadcd | PEP8 cleanups | 2015-06-23 02:23:23 -04:00 |
| Roberto Rosario | 78198f3398 | Smart settings refactor | 2015-06-22 21:04:06 -04:00 |
| Roberto Rosario | 08a8ae2554 | Move document page content code to the OCR app. Prep work for issue #186. | 2015-06-17 00:21:35 -04:00 |
| Roberto Rosario | 5275061f9f | Refactor OCR backend class to be file object based and use images from document page not the actual file. Use pytesseract instead of calling the CLI directly. | 2015-06-09 03:28:38 -04:00 |