Roberto Rosario | bec85f38f4 | Text parsers and OCR backends are now used in tandem for each document. | 2015-08-08 04:49:08 -04:00
Roberto Rosario | 8382df91a6 | Update PDF text parser classes. Remove SlateParser and substitute it with a PDFMiner-based parser. | 2015-07-31 02:09:48 -04:00
Roberto Rosario | 4527563d89 | PEP8 cleanups, especially E501 line too long. | 2015-07-22 18:21:37 -04:00
Roberto Rosario | 47a74360dd | Remove double execution of backend. Store the language in the instance. | 2015-07-08 04:15:58 -04:00
Roberto Rosario | 48df3dcafa | PEP8 cleanups | 2015-06-24 17:11:24 -04:00
Roberto Rosario | e4623fadcd | PEP8 cleanups | 2015-06-23 02:23:23 -04:00
Roberto Rosario | 78198f3398 | Smart settings refactor | 2015-06-22 21:04:06 -04:00
Roberto Rosario | 08a8ae2554 | Move document page content code to the OCR app. Prep work for issue #186. | 2015-06-17 00:21:35 -04:00
Roberto Rosario | 5275061f9f | Refactor OCR backend class to be file object based and use images from the document page, not the actual file. Use pytesseract instead of calling the CLI directly. | 2015-06-09 03:28:38 -04:00