19 Commits

Author SHA1 Message Date
Roberto Rosario
31ea558b60 Update the REPLICATION_DELAY default to be 0 seconds 2011-12-01 04:46:00 -04:00
Roberto Rosario
648be556a6 Finished adapting the OCR app to the new transformations refactor 2011-07-19 04:21:36 -04:00
Roberto Rosario
5bfd607b31 Removed pdftotext from the requirements, move unpaper calling to the OCR app 2011-07-18 04:06:19 -04:00
Roberto Rosario
9e61213241 Created new smart_settings app and move everything related to app settings to it 2011-05-07 01:32:02 -04:00
Roberto Rosario
7f2c563192 Converted whole project to a smarter method of defining app settings 2011-05-07 01:15:40 -04:00
Roberto Rosario
7469fe991f Made the OCR cache backend used for locking configurable, move ocr locking to queued document from periodic task, added again a random delay fallback in case no cache backend is used 2011-05-06 15:31:49 -04:00
Roberto Rosario
ebdcede59f Made the queue processing interval configurable by means of a new setting: OCR_QUEUE_PROCESSING_INTERVAL 2011-04-23 05:38:59 -04:00
Roberto Rosario
eaaaa5b645 Added support for the command line program pdftotext from the poppler-utils packages to extract text from PDF documents without doing OCR 2011-04-15 23:59:52 -04:00
Roberto Rosario
6b5a17af39 Made English the default language for Tesseract if none is specified 2011-04-13 03:25:45 -04:00
Roberto Rosario
71a3c218f4 PEP8, pylint and django-lint cleanups 2011-04-08 02:09:39 -04:00
Roberto Rosario
283df926d1 Made automatic OCR a function of the OCR app and not of Documents app (via signals); renamed setup option DOCUMENT_AUTOMATIC_OCR to OCR_AUTOMATIC_OCR 2011-04-04 15:36:00 -04:00
Roberto Rosario
3cb0f37b5b Made the concurrent ocr code more granular, per node, every node can handle different amounts of concurrent ocr tasks 2011-03-22 04:17:48 -04:00
Roberto Rosario
f9ab61647e Reduced default delay time 2011-03-22 03:43:18 -04:00
Roberto Rosario
bbcc0ead65 Added a new option OCR_REPLICATION_DELAY to allow the storage some time for replication before attempting to do OCR to a document 2011-03-21 12:24:42 -04:00
Roberto Rosario
6a9e114acb Set all *.py files permissions to 644 2011-03-07 12:15:25 -04:00
Roberto Rosario
595d7227a2 Added navigation link from document page view and document page transformation back to document view 2011-02-17 23:27:25 -04:00
Roberto Rosario
478fb3502e Changed from python's multiprocessing to celery to handle concurrency 2011-02-17 03:45:30 -04:00
Roberto Rosario
d6afcc64bb Changed file permissions 2011-02-09 13:55:01 -04:00
Roberto Rosario
6569faad11 Added OCR capabilities 2011-02-09 02:12:14 -04:00