Changed from Python's multiprocessing to Celery to handle concurrency

Roberto Rosario
2011-02-17 03:45:30 -04:00
parent 409a52af95
commit 478fb3502e
13 changed files with 102 additions and 87 deletions

@@ -75,3 +75,17 @@ Fancybox - FancyBox is a tool for displaying images, html content and
unpaper - post-processing scanned and photocopied book pages
Jens Gulden 2005-2007 - unpaper@jensgulden.de.
http://unpaper.berlios.de/
celery - Celery is an open source asynchronous task queue/job queue
based on distributed message passing. It is focused on real-time
operation, but supports scheduling as well.
Copyright 2009-2011, Ask Solem & contributors
http://ask.github.com/celery/getting-started/introduction.html
django-celery - django-celery provides Celery integration for Django;
Using the Django ORM and cache backend for storing
results, autodiscovery of task modules for applications
listed in INSTALLED_APPS, and more.
Copyright Ask Solem & contributors
http://github.com/ask/django-celery/

@@ -12,3 +12,4 @@
* Added views to create, edit and grant/revoke permissions to roles
* Apply default transformations to document before OCR
* Added unpaper to the OCR conversion pipe
* Added support for concurrent, queued OCR processing using celery

@@ -32,7 +32,11 @@
* DB stored transformations - DONE
* Recognize multi-page documents - DONE
* Add unpaper to pre OCR document cleanup - DONE
* Count pages in a PDF file http://pybrary.net/pyPdf/ - NOT NEEDED
* Support distributed OCR queues (RabbitMQ & Celery?) - DONE
* Multithreading deferred OCR - DONE
* Role editing view under setup - STARTED
* Scheduled maintenance (cleanup, deferred OCR's) - DONE
* Document list filtering by metadata
* Filterform date filtering widget
* Validate GET data before saving file
@@ -46,20 +50,15 @@
from a queryset
* Allow metadata entry form to mix required and non required metadata
* Link to delete and recreate all document links
* Multithreading deferred OCR
* Versioning support
* Generic document annotations using layer overlays
* Workflows
* Scheduled maintenance (cleanup, deferred OCR's)
* Add tags to documents
* Field for document language or autodetect
* Count pages in a PDF file http://pybrary.net/pyPdf/
* Download a document in different formats: (jpg, png, pdf)
* Download a document in different formats: (jpg, png, pdf)
* Cache.cleanup function to delete cached images when document hash changes
* Divide navigation links search by object and by view
* Add show_summary method to model to display as results of a search
* Support distributed OCR queues (RabbitMQ & Celery?)
* DXF viewer - http://code.google.com/p/dxf-reader/source/browse/#svn%2Ftrunk
* Support spreadsheets, wordprocessing docs using openoffice in server mode
* WebDAV support