Roberto Rosario
|
eaaaa5b645
|
Added support for the command line program pdftotext from the poppler-utils package to extract text from PDF documents without doing OCR
|
2011-04-15 23:59:52 -04:00 |
|
Roberto Rosario
|
f87beff00e
|
Fixed Non-ASCII character error in the English OCR cleanup backend
|
2011-04-13 03:26:55 -04:00 |
|
Roberto Rosario
|
6b5a17af39
|
Made English the default language for Tesseract if none is specified
|
2011-04-13 03:25:45 -04:00 |
|
Roberto Rosario
|
6b67cff5d7
|
Changed the way document page count is parsed from the graphics backend, fixing issue #7
|
2011-04-08 03:29:48 -04:00 |
|
Roberto Rosario
|
71a3c218f4
|
PEP8, pylint and django-lint cleanups
|
2011-04-08 02:09:39 -04:00 |
|
Roberto Rosario
|
d54fd98ec5
|
Finished adding language specific ocr cleanup code
|
2011-04-07 12:23:26 -04:00 |
|
Roberto Rosario
|
d1ff305a3f
|
Initial commit for the ocr_cleanup branch
|
2011-04-07 04:07:59 -04:00 |
|
Roberto Rosario
|
f66c8ec6e2
|
Fixed an error and some warnings returned by pylint
|
2011-04-05 00:04:11 -04:00 |
|
Roberto Rosario
|
283df926d1
|
Made automatic OCR a function of the OCR app and not of Documents app (via signals)
Renamed setup option DOCUMENT_AUTOMATIC_OCR to OCR_AUTOMATIC_OCR
|
2011-04-04 15:36:00 -04:00 |
|
Roberto Rosario
|
1d48325a92
|
Clear node name when requeueing a document for OCR
|
2011-04-04 09:24:25 -04:00 |
|
Roberto Rosario
|
c2ba7eaf1d
|
Spanish translation updates
|
2011-04-01 02:45:27 -04:00 |
|
Roberto Rosario
|
604cd60255
|
Clear last ocr results when requeueing a document
|
2011-03-25 16:37:30 -04:00 |
|
Roberto Rosario
|
f417344758
|
Introduce a random delay to each node to further reduce the chance of a race condition, until row locking can be implemented or is implemented by Django
|
2011-03-23 17:03:00 -04:00 |
|
Roberto Rosario
|
9765a7f607
|
Added an additional check to lower the chance of OCR race conditions between nodes
|
2011-03-23 16:45:49 -04:00 |
|
Roberto Rosario
|
a3fbe7f896
|
Allow OCR requeue of pending documents
|
2011-03-23 15:45:50 -04:00 |
|
Roberto Rosario
|
0f1526f3d8
|
Allow deletion of non existing documents from OCR queue
|
2011-03-23 09:51:54 -04:00 |
|
Roberto Rosario
|
3cb0f37b5b
|
Made the concurrent OCR code more granular: each node can now handle a different number of concurrent OCR tasks
|
2011-03-22 04:17:48 -04:00 |
|
Roberto Rosario
|
d0942a203b
|
Reimplemented OCR delay code, only delay new document
|
2011-03-22 03:46:34 -04:00 |
|
Roberto Rosario
|
f9ab61647e
|
Reduced default delay time
|
2011-03-22 03:43:18 -04:00 |
|
Roberto Rosario
|
70e5e4c470
|
Moved navigation code to its own app
|
2011-03-22 00:54:43 -04:00 |
|
Roberto Rosario
|
75dc4c84b3
|
Removed old code
|
2011-03-21 18:49:34 -04:00 |
|
Roberto Rosario
|
75324ce581
|
Disabled single OCR document action as multiple actions are now enabled by default
|
2011-03-21 16:32:01 -04:00 |
|
Roberto Rosario
|
5d9302e583
|
Added multi ocr queued document delete support
|
2011-03-21 16:29:04 -04:00 |
|
Roberto Rosario
|
bef40d958e
|
Added OCR multi document re-queue support
|
2011-03-21 16:19:19 -04:00 |
|
Roberto Rosario
|
bbcc0ead65
|
* Added a new option OCR_REPLICATION_DELAY to allow the storage some time for replication before attempting to do OCR to a document
|
2011-03-21 12:24:42 -04:00 |
|
Roberto Rosario
|
31d1641fa4
|
Added simple statistics page (total used storage, total docs, etc)
|
2011-03-20 04:35:21 -04:00 |
|
Roberto Rosario
|
fe2c031dfb
|
Added missing alt attribute
|
2011-03-16 17:04:18 -04:00 |
|
Roberto Rosario
|
33089ccd08
|
Don't display an error for the thumbnail of nonexistent documents
|
2011-03-16 16:37:30 -04:00 |
|
Roberto Rosario
|
c9d82da28a
|
Added indexing flags to ocr model
|
2011-03-16 04:57:59 -04:00 |
|
Roberto Rosario
|
9569992caf
|
Removed debug code
|
2011-03-12 04:03:11 -04:00 |
|
Roberto Rosario
|
242c39690f
|
Spanish translation updates
|
2011-03-11 14:36:14 -04:00 |
|
Roberto Rosario
|
0a91b7ff7d
|
Don't allow duplicate documents in queues
|
2011-03-11 01:01:56 -04:00 |
|
Roberto Rosario
|
67c8f26d7f
|
Renamed document queue state links
|
2011-03-10 00:02:04 -04:00 |
|
Roberto Rosario
|
cc6e8220c0
|
Changed ocr status display sidebar from form based to text based
|
2011-03-10 00:01:30 -04:00 |
|
Roberto Rosario
|
9bd22f65d1
|
Do not reinitialize document queue and/or queued document on reentry
|
2011-03-09 22:50:20 -04:00 |
|
Roberto Rosario
|
9bcd2d33ed
|
Added debugging logging
|
2011-03-09 22:50:03 -04:00 |
|
Roberto Rosario
|
f1771158d6
|
Fixed OCR queue list showing wrong thumbnail
|
2011-03-09 12:59:16 -04:00 |
|
Roberto Rosario
|
739c2ee299
|
Converted modules to use the new simpler permission checking
|
2011-03-09 01:20:07 -04:00 |
|
Roberto Rosario
|
2eafc75d29
|
Revert ocr issue test
|
2011-03-08 01:05:32 -04:00 |
|
Roberto Rosario
|
b0700c5729
|
Try to fix issue #2
|
2011-03-07 23:40:49 -04:00 |
|
Roberto Rosario
|
bc4c3b6c75
|
Remove unused function
|
2011-03-07 23:40:35 -04:00 |
|
Roberto Rosario
|
e4912a8d4d
|
Close file descriptors to prevent memory leaks
|
2011-03-07 23:22:53 -04:00 |
|
Roberto Rosario
|
efdd180483
|
Show document thumbnail in document ocr queue list
|
2011-03-07 19:24:27 -04:00 |
|
Roberto Rosario
|
86ed128dbe
|
Make ocr document date submitted column non breakable
|
2011-03-07 19:22:00 -04:00 |
|
Roberto Rosario
|
118e3d2e4a
|
Merge remote branch 'origin/master'
|
2011-03-07 18:20:37 -04:00 |
|
Roberto Rosario
|
5563e74e77
|
Fix permissions once more, directories to 755 and files to 644
|
2011-03-07 12:27:58 -04:00 |
|
Roberto Rosario
|
6a9e114acb
|
Set all *.py files permissions to 644
|
2011-03-07 12:15:25 -04:00 |
|
Roberto Rosario
|
7eee9c44f4
|
* Added document queue property side bar window to the document queue list view
|
2011-03-06 02:35:42 -04:00 |
|
Roberto Rosario
|
d05295bf54
|
Added links, views and permissions to disable or enable an OCR queue
|
2011-03-06 00:47:16 -04:00 |
|
Roberto Rosario
|
661d38aa41
|
Spanish translation updates
|
2011-03-05 19:52:50 -04:00 |
|