Commit Graph

103 Commits

Author SHA1 Message Date
Roberto Rosario
7469fe991f Made the OCR cache backend used for locking configurable, move ocr locking to queued document from periodic task, added again a random delay fallback in case no cache backend is used 2011-05-06 15:31:49 -04:00
Roberto Rosario
472a55f5f3 Updated Spanish translation 2011-05-06 13:42:09 -04:00
Roberto Rosario
07e9b12e78 flake8 cleanups, unused imports and variables cleanup, changed register_diagnostics to use reverse_lazy instead of reverse 2011-05-06 10:39:54 -04:00
Roberto Rosario
438e8c89a9 Refactored the tools menu and added method for apps to register tools themselves 2011-05-03 21:58:55 -04:00
Roberto Rosario
ae35e89549 Unicode updates 2011-05-03 21:11:35 -04:00
Roberto Rosario
1e0d8d1f25 Added docstring description 2011-05-03 20:58:58 -04:00
Roberto Rosario
766567ec49 Added alt attribute to several img html tags 2011-05-01 23:10:51 -04:00
Roberto Rosario
71049f33bd Display loading spinner until document thumbnail is ready 2011-05-01 22:57:35 -04:00
Roberto Rosario
28f87690b2 Added document tagging support 2011-04-28 01:14:27 -04:00
Roberto Rosario
20dadafd61 Message tweaks and translation updates 2011-04-27 14:44:46 -04:00
Roberto Rosario
df60924ebb Implement local task locking using Django locmem cache backend 2011-04-25 16:41:42 -04:00
Roberto Rosario
cd77a8ffdf Fixed OCR's active node list view when there are no active nodes 2011-04-25 16:40:33 -04:00
Roberto Rosario
f155c0725b Prevented undefined variable use 2011-04-25 12:39:09 -04:00
Roberto Rosario
a7be566f50 Added OCR view displaying all active OCR tasks from all cluster nodes 2011-04-25 02:17:13 -04:00
Roberto Rosario
908183fad9 Merged two for loops, and make it so a processing document in a multi select doesn't abort the re-queueing 2011-04-24 04:48:42 -04:00
Roberto Rosario
d33ced5e34 Make sure processing documents cannot be deleted 2011-04-24 04:42:49 -04:00
Roberto Rosario
c505dad667 Fixed active task checking logic 2011-04-24 04:41:53 -04:00
Roberto Rosario
202bde555b Removed unused imports 2011-04-23 22:49:06 -04:00
Roberto Rosario
c18316d3bf Added detection and reset of orphaned ocr documents being left as 'processing' when celery dies 2011-04-23 05:40:24 -04:00
Roberto Rosario
ebdcede59f Made the queue processing interval configurable by means of a new setting: OCR_QUEUE_PROCESSING_INTERVAL 2011-04-23 05:38:59 -04:00
Roberto Rosario
2d24651189 Changed the ocr queue processing task from a Class to a function 2011-04-23 04:53:50 -04:00
Roberto Rosario
2a744cefea PEP8, pylint cleanups and removal of relative imports 2011-04-23 02:49:07 -04:00
Roberto Rosario
dc08f96414 Spanish translation updates 2011-04-21 04:36:40 -04:00
Roberto Rosario
680c33227c Reorganized and trimmed document actions links 2011-04-21 01:04:48 -04:00
Roberto Rosario
eaaaa5b645 Added support for the command line program pdftotext from the poppler-utils packages to extract text from PDF documents without doing OCR 2011-04-15 23:59:52 -04:00
Roberto Rosario
f87beff00e Fixed Non-ASCII character error in the English OCR cleanup backend 2011-04-13 03:26:55 -04:00
Roberto Rosario
6b5a17af39 Made English the default language for Tesseract if none is specified 2011-04-13 03:25:45 -04:00
Roberto Rosario
6b67cff5d7 Changed the way document page count is parsed from the graphics backend, fixing issue #7 2011-04-08 03:29:48 -04:00
Roberto Rosario
71a3c218f4 PEP8, pylint and django-lint cleanups 2011-04-08 02:09:39 -04:00
Roberto Rosario
d54fd98ec5 Finished adding language specific ocr cleanup code 2011-04-07 12:23:26 -04:00
Roberto Rosario
d1ff305a3f Initial commit for the ocr_cleanup branch 2011-04-07 04:07:59 -04:00
Roberto Rosario
f66c8ec6e2 Fixed error and some warning returned by pylint 2011-04-05 00:04:11 -04:00
Roberto Rosario
283df926d1 Made automatic OCR a function of the OCR app and not of Documents app (via signals); renamed setup option DOCUMENT_AUTOMATIC_OCR to OCR_AUTOMATIC_OCR 2011-04-04 15:36:00 -04:00
Roberto Rosario
1d48325a92 Clear node name when requeueing a document for OCR 2011-04-04 09:24:25 -04:00
Roberto Rosario
c2ba7eaf1d Spanish translation updates 2011-04-01 02:45:27 -04:00
Roberto Rosario
604cd60255 Clear last ocr results when requeueing a document 2011-03-25 16:37:30 -04:00
Roberto Rosario
f417344758 Introduce a random delay to each node to further reduce the chance of a race condition, until row locking can be implemented or is implemented by Django 2011-03-23 17:03:00 -04:00
Roberto Rosario
9765a7f607 Added an additional check to lower the chance of OCR race conditions between nodes 2011-03-23 16:45:49 -04:00
Roberto Rosario
a3fbe7f896 Allow OCR requeue of pending documents 2011-03-23 15:45:50 -04:00
Roberto Rosario
0f1526f3d8 Allow deletion of non existing documents from OCR queue 2011-03-23 09:51:54 -04:00
Roberto Rosario
3cb0f37b5b Made the concurrent ocr code more granular, per node, every node can handle different amounts of concurrent ocr tasks 2011-03-22 04:17:48 -04:00
Roberto Rosario
d0942a203b Reimplemented OCR delay code, only delay new document 2011-03-22 03:46:34 -04:00
Roberto Rosario
f9ab61647e Reduced default delay time 2011-03-22 03:43:18 -04:00
Roberto Rosario
70e5e4c470 Moved navigation code to its own app 2011-03-22 00:54:43 -04:00
Roberto Rosario
75dc4c84b3 Removed old code 2011-03-21 18:49:34 -04:00
Roberto Rosario
75324ce581 Disabled single OCR document action as multiple actions are now enabled by default 2011-03-21 16:32:01 -04:00
Roberto Rosario
5d9302e583 Added multi ocr queued document delete support 2011-03-21 16:29:04 -04:00
Roberto Rosario
bef40d958e Added OCR multi document re-queue support 2011-03-21 16:19:19 -04:00
Roberto Rosario
bbcc0ead65 Added a new option OCR_REPLICATION_DELAY to allow the storage some time for replication before attempting to do OCR to a document 2011-03-21 12:24:42 -04:00
Roberto Rosario
31d1641fa4 Added simple statistics page (total used storage, total docs, etc) 2011-03-20 04:35:21 -04:00