Commit Graph

94 Commits

Author SHA1 Message Date
Roberto Rosario
20dadafd61 Message tweaks and translation updates 2011-04-27 14:44:46 -04:00
Roberto Rosario
df60924ebb Implement local task locking using Django locmem cache backend 2011-04-25 16:41:42 -04:00
Roberto Rosario
cd77a8ffdf Fixed OCR's active node list view when there are no active nodes 2011-04-25 16:40:33 -04:00
Roberto Rosario
f155c0725b Prevented undefined variable use 2011-04-25 12:39:09 -04:00
Roberto Rosario
a7be566f50 Added OCR view displaying all active OCR tasks from all cluster nodes 2011-04-25 02:17:13 -04:00
Roberto Rosario
908183fad9 Merged two for loops, and made it so a processing document in a multi select doesn't abort the re-queueing 2011-04-24 04:48:42 -04:00
Roberto Rosario
d33ced5e34 Make sure processing documents cannot be deleted 2011-04-24 04:42:49 -04:00
Roberto Rosario
c505dad667 Fixed active task checking logic 2011-04-24 04:41:53 -04:00
Roberto Rosario
202bde555b Removed unused imports 2011-04-23 22:49:06 -04:00
Roberto Rosario
c18316d3bf Added detection and reset of orphaned ocr documents being left as 'processing' when celery dies 2011-04-23 05:40:24 -04:00
Roberto Rosario
ebdcede59f Made the queue processing interval configurable by means of a new setting: OCR_QUEUE_PROCESSING_INTERVAL 2011-04-23 05:38:59 -04:00
Roberto Rosario
2d24651189 Changed the ocr queue processing task from a Class to a function 2011-04-23 04:53:50 -04:00
Roberto Rosario
2a744cefea PEP8, pylint cleanups and removal of relative imports 2011-04-23 02:49:07 -04:00
Roberto Rosario
dc08f96414 Spanish translation updates 2011-04-21 04:36:40 -04:00
Roberto Rosario
680c33227c Reorganized and trimmed document actions links 2011-04-21 01:04:48 -04:00
Roberto Rosario
eaaaa5b645 Added support for the command line program pdftotext from the poppler-utils packages to extract text from PDF documents without doing OCR 2011-04-15 23:59:52 -04:00
Roberto Rosario
f87beff00e Fixed Non-ASCII character error in the English OCR cleanup backend 2011-04-13 03:26:55 -04:00
Roberto Rosario
6b5a17af39 Made English the default language for Tesseract if none is specified 2011-04-13 03:25:45 -04:00
Roberto Rosario
6b67cff5d7 Changed the way document page count is parsed from the graphics backend, fixing issue #7 2011-04-08 03:29:48 -04:00
Roberto Rosario
71a3c218f4 PEP8, pylint and django-lint cleanups 2011-04-08 02:09:39 -04:00
Roberto Rosario
d54fd98ec5 Finished adding language specific ocr cleanup code 2011-04-07 12:23:26 -04:00
Roberto Rosario
d1ff305a3f Initial commit for the ocr_cleanup branch 2011-04-07 04:07:59 -04:00
Roberto Rosario
f66c8ec6e2 Fixed error and some warning returned by pylint 2011-04-05 00:04:11 -04:00
Roberto Rosario
283df926d1 Made automatic OCR a function of the OCR app and not of Documents app (via signals). Renamed setup option DOCUMENT_AUTOMATIC_OCR to OCR_AUTOMATIC_OCR 2011-04-04 15:36:00 -04:00
Roberto Rosario
1d48325a92 Clear node name when requeueing a document for OCR 2011-04-04 09:24:25 -04:00
Roberto Rosario
c2ba7eaf1d Spanish translation updates 2011-04-01 02:45:27 -04:00
Roberto Rosario
604cd60255 Clear last ocr results when requeueing a document 2011-03-25 16:37:30 -04:00
Roberto Rosario
f417344758 Introduce a random delay to each node to further reduce the chance of a race condition, until row locking can be implemented or is implemented by Django 2011-03-23 17:03:00 -04:00
Roberto Rosario
9765a7f607 Added an additional check to lower the chance of OCR race conditions between nodes 2011-03-23 16:45:49 -04:00
Roberto Rosario
a3fbe7f896 Allow OCR requeue of pending documents 2011-03-23 15:45:50 -04:00
Roberto Rosario
0f1526f3d8 Allow deletion of non existing documents from OCR queue 2011-03-23 09:51:54 -04:00
Roberto Rosario
3cb0f37b5b Made the concurrent ocr code more granular, per node, every node can handle different amounts of concurrent ocr tasks 2011-03-22 04:17:48 -04:00
Roberto Rosario
d0942a203b Reimplemented OCR delay code, only delay new documents 2011-03-22 03:46:34 -04:00
Roberto Rosario
f9ab61647e Reduced default delay time 2011-03-22 03:43:18 -04:00
Roberto Rosario
70e5e4c470 Moved navigation code to its own app 2011-03-22 00:54:43 -04:00
Roberto Rosario
75dc4c84b3 Removed old code 2011-03-21 18:49:34 -04:00
Roberto Rosario
75324ce581 Disabled single OCR document action as multiple actions are now enabled by default 2011-03-21 16:32:01 -04:00
Roberto Rosario
5d9302e583 Added multi ocr queued document delete support 2011-03-21 16:29:04 -04:00
Roberto Rosario
bef40d958e Added OCR multi document re-queue support 2011-03-21 16:19:19 -04:00
Roberto Rosario
bbcc0ead65 Added a new option OCR_REPLICATION_DELAY to allow the storage some time for replication before attempting to do OCR to a document 2011-03-21 12:24:42 -04:00
Roberto Rosario
31d1641fa4 Added simple statistics page (total used storage, total docs, etc) 2011-03-20 04:35:21 -04:00
Roberto Rosario
fe2c031dfb Added missing alt attribute 2011-03-16 17:04:18 -04:00
Roberto Rosario
33089ccd08 Don't display an error for the thumbnail of nonexistent documents 2011-03-16 16:37:30 -04:00
Roberto Rosario
c9d82da28a Added indexing flags to ocr model 2011-03-16 04:57:59 -04:00
Roberto Rosario
9569992caf Removed debug code 2011-03-12 04:03:11 -04:00
Roberto Rosario
242c39690f Spanish translation updates 2011-03-11 14:36:14 -04:00
Roberto Rosario
0a91b7ff7d Don't allow duplicate documents in queues 2011-03-11 01:01:56 -04:00
Roberto Rosario
67c8f26d7f Renamed document queue state links 2011-03-10 00:02:04 -04:00
Roberto Rosario
cc6e8220c0 Changed ocr status display sidebar from form based to text based 2011-03-10 00:01:30 -04:00
Roberto Rosario
9bd22f65d1 Do not reinitialize document queue and/or queued document on reentry 2011-03-09 22:50:20 -04:00