TODO, WISHLIST
==============
* Fix repeated search results - DONE
* File renaming dropdown - DONE
* Create indexing filesystem folders from document type metadata type - DONE
* Document detail to view document metadata - DONE
* Add file checksums (hashlib) - DONE
* Delete symlinks when document is deleted - DONE
* Handle NULL mimetypes during model save - DONE
* Raise exception instead of returning error msg - DONE
* Option to delete source staging file after upload - DONE
* jQuery document upload form with AJAX widget - NOT NEEDED (commit: b0f31f2a8f82ff0daca081005f2fcae3f5573df5)
* Rename dropbox from document edit view - DONE
* Ability to rename staging file during upload - DONE
* Implement single sign on or LDAP for intranets - DEFERRED, provided by Django AuthBackends
* Database storage backend (sql, nosql: [mongodb]) - DEFERRED, provided by https://bitbucket.org/david/django-storages/wiki/Home
* Staging file previews - DONE
* Display file size in list and details - DONE
* Document previews - DONE
* Document previews on demand w/ imagemagick - DONE
* Add document description - DONE
* Integrate with http://code.google.com/p/pytesser/ - DEFERRED, done using Popen
* Show abbreviated uuid in document list - DEFERRED, impractical
* Update symlinks when document or metadata changed - DONE
* Cache thumbnails and preview by document hash not by uuid - DONE
* Show document metadata in document list - DONE
* Add css grids - DONE
* If there's only one document type in the db, skip step 1 of wizard - DONE
* Be able to delete staging file - DONE
* Group documents by metadata - DONE
* Permissions - DONE
* Roles - DONE
* Assign default role to new users - DONE
* DB stored transformations - DONE
* Recognize multi-page documents - DONE
* Add unpaper to pre OCR document cleanup - DONE
* Count pages in a PDF file http://pybrary.net/pyPdf/ - NOT NEEDED
* Support distributed OCR queues (RabbitMQ & Celery?) - DONE
* Multithreading deferred OCR - DONE
* Handle zipped or rar archives - DONE (zip only)
* Scheduled maintenance (cleanup, deferred OCRs) - DONE
* Tesseract default option ocr setup - DONE
* Check duplicated files using checksum - DONE
* Link to delete and recreate all document links - DONE
* Indicate in generic list which documents don't exist in storage backend - DONE
* Change to model signals - NOT NEEDED, found way to prevent save method recursion
* Show current page in generic list template - DONE
* Enable/disable ocr queue view & links - DONE
* Role editing view under setup - STARTED
* Document list filtering by metadata
* Filterform date filtering widget
* Validate GET data before saving file
* Show last 5 recent metadata setups for easy switch
* Allow document type to be changed in document edit view
* Encrypting storage backend
* Document model's delete method might not get called when deleting in bulk from a queryset
* Allow metadata entry form to mix required and non required metadata
* Versioning support
* Generic document annotations using layer overlays
* Workflows
* Add tags to documents
* Field for document language or autodetect
* Cache.cleanup function to delete cached images when document hash changes
* Divide navigation links search by object and by view
* Add show_summary method to model to display as results of a search
* DXF viewer - http://code.google.com/p/dxf-reader/source/browse/#svn%2Ftrunk
* Support spreadsheets, word processing docs using openoffice in server mode
* WebDAV support
* Include annotations in transformed documents downloads
* Implement permissions decorators
* Block Setup menu item to non staff and non superuser users
* Don't append an extension separator if extension is nonexistent
* Publish document option
* Merge all generic templates into template widget object based rendering
* Multiple document select in generic list template
* Multiple document actions (clear transformations, delete, publish)

Permissions
===========
* Add permissions support to menus

Documents
=========
* Restrict view permission free form rename
* Skip step 2 of wizard (metadata) if no document type metadata types have been defined
* Tile based image server
* Do separate default transformations for staging and for local uploads
* Download a document in different formats: (jpg, png, pdf)
* Download metadata group documents as a single zip file
* Download original document or transformed document
* Display preferences 'document transformations' (Rotation, default zoom)
* Document view temp transformations
* Gallery view for document groups

Filesystem
==========
* Avoid metadata indexing folders name clash

Search
======
* Advanced search by metadata fields
* Save advanced search by metadata setup as a virtual folder

Convert
=======
* Create mimetype conversion map for convert app
* Migrate ocr app tesseract handling to convert app - STARTED
* Add timeout support to convert tasks

Storage
=======
* Storage backend to storage backend copy support, to move/migrate document to new storage backend

GridFSStorage
=============
* Implement user settings - DONE
* Implement delete-open soft locking - DEFERRED
* Implement master_slave_connection
* If a file exists, add _ plus a counter - avoid file versioning

OCR
===
* Don't do OCR on word processing or spreadsheet documents, strip tags and store text
* Add timeout support to ocr tasks
* Allow for OCR document requeue on error and requeue limit
* Multiple ocr queue support - STARTED
* Add per node max ocr concurrent execution

ISSUES
===========
* Staging file hash collision when same file with different name, newhash = content hash + filename hash
* Fix field error on search action for documents while processing OCR queue
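
A possible sketch of the staging file hash fix noted under ISSUES (newhash = content hash + filename hash). The function name ``staging_file_hash`` and the choice of SHA-256 are assumptions made for illustration, not existing project code, and the "+" is read here as string concatenation followed by a re-hash:

```python
import hashlib


def staging_file_hash(content: bytes, filename: str) -> str:
    """Hypothetical helper: derive a staging file id from content and name."""
    # Hash the file content and the filename separately.
    content_hash = hashlib.sha256(content).hexdigest()
    filename_hash = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Assumption: "+" means concatenation; re-hash to keep a fixed length.
    combined = (content_hash + filename_hash).encode("ascii")
    return hashlib.sha256(combined).hexdigest()
```

With this approach, two staging files with identical content but different names get different hashes, while the same content under the same name always maps to the same hash.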