mayan-edms

Author	SHA1	Message	Date
Roberto Rosario	6f585a2836	Update and re-enable ocr app	2012-09-16 03:30:32 -04:00
Roberto Rosario	babdc4e93a	Initial changes to update the OCR app	2012-09-10 23:30:13 -04:00
Roberto Rosario	58019de21b	Don't pass mimetype to render_to_viewport method	2012-08-15 01:34:29 -04:00
Roberto Rosario	576a2cc643	Support passing MIMETypes and actual document filenames to TextParser for better lexer guessing	2012-08-06 03:00:09 -04:00
Roberto Rosario	f77c886e51	Register TextParser for OCR based on the list of MIME type it supports	2012-07-28 04:45:45 -04:00
Roberto Rosario	0ec1cc3823	Add text parser and render using Pygments	2012-07-28 02:22:45 -04:00
Roberto Rosario	58f027db60	Clean up (unused imports, PEP8, etc)	2012-06-08 16:43:54 -04:00
Roberto Rosario	2849fd6e79	Detect blank pages with the PopplerParser, raise ParserError to fallback to OCR if all parsers fail	2012-06-03 21:08:22 -04:00
Roberto Rosario	d1ccca4d2e	Final updates for the PopplerParser	2012-05-30 16:15:57 -04:00
Roberto Rosario	babd3ec2f3	Refactor parser system to be class based, add poppler based PDF parser, allow multiple parsers for each mimetype with fallback	2012-05-30 12:57:25 -04:00
Roberto Rosario	f9a3c4611b	PEP8 cleanups, remove OCR_CACHE_URI	2012-01-18 13:53:02 -04:00
Roberto Rosario	1e38369919	Update parser to use the latest version of a document when extracting text	2011-12-02 05:56:34 -04:00
Roberto Rosario	922971274f	Add office document text extractor	2011-12-01 04:54:14 -04:00
Roberto Rosario	90e876ca93	Code cleanup	2011-07-21 11:46:15 -04:00
Roberto Rosario	d566dfbb1d	Added the first text parser backend (PDF) and updated the requirements files and README	2011-07-18 04:06:59 -04:00

15 Commits