TODO, WISHLIST =========== * Fix repeated search results - DONE * File renaming dropdown - DONE * Create indexing filesystem folders from document type metadata type - DONE * Document detail to view document metadata - DONE * Add file checksums (hashlib) - DONE * Delete symlinks when document is deleted - DONE * Handle NULL mimetypes during model save - DONE * Raise exception instead of returning error msg - DONE * Option to delete source staging file after upload - DONE * Jquery upload document upload form with ajax widget - NOT NEEDED (commit: b0f31f2a8f82ff0daca081005f2fcae3f5573df5) * Rename dropbox from document edit view - DONE * Ability to rename staging file during upload - DONE * Implement single sign on or LDAP for intranets - DEFERRED, provided by Django AuthBackends * Database storage backend (sql, nosql: [mongodb]) - DEFERRED, provided by https://bitbucket.org/david/django-storages/wiki/Home * Staging file previews - DONE * Display file size in list and details - DONE * Document previews - DONE * Document previews on demand w/ imagemagick - DONE * Add document description - DONE * Integrate with http://code.google.com/p/pytesser/ - DEFERRED, done using Popen * Show abbreviated uuid in document list - DEFERRED, Impractical * Update symlinks when document or metadata changed - DONE * Cache thumbnails and preview by document hash not by uuid - DONE * Show document metadata in document list - DONE * Add css grids - DONE * If theres only one document type on db skip step 1 of wizard - DONE * Be able to delete staging file - DONE * Group documents by metadata - DONE * Permissions - DONE * Roles - DONE * Assign default role to new users - DONE * DB stored transformations - DONE * Recognize multi-page documents - DONE * Add unpaper to pre OCR document cleanup - DONE * Count pages in a PDF file http://pybrary.net/pyPdf/ - NOT NEEDED * Support distributed OCR queues (RabbitMQ & Celery?) - DONE * MuliThreading deferred OCR - DONE * Handle ziped or rar archives - DONE (zip only) * Scheduled maintenance (cleanup, deferred OCR's) - DONE * Tesserat default option ocr setup - DONE * Check duplicated files using checksum - DONE * Link to delete and recreate all document links - DONE * Indicate in generic list which don't exist in storage backend - DONE * Change to model signals - NOT NEEDED, found way to prevent save method recursion * Show current page in generic list template - DONE * Enable/disable ocr queue view & links - DONE Common ====== * Filterform date filtering widget * Divide navigation links search by object and by view * Merge all generic templates into template widget object based rendering * Multiple document select in generic list template * Keyboard navigation * Default button linking to 'Enter' and ESC key for cancel * Dismiss all messages Permissions =========== * Add permissions support to menus * Role editing view under setup - STARTED * Implement permissions decorators * Add user editing under roles menus * Workflows app Documents ========= * Restrict view permission free form rename * Skip step 2 of wizard (metadata) if no document type metadata types have been defined * Tile based image server * Do separate default transformations for staging and for local uploads * Download a document in different formats: (jpg, png, pdf) * Download metadata group documents as a single zip file * Download original document or transformed document * Display preferences 'document transformations' (Rotation, default zoom) * Document view temp transformations * Gallery view for document groups * Versioning support * Generic document anotations using layer overlays * Field for document language or autodetect * Validate GET data before saving file * Multiple document actions (clear transformations, delete, publish) * Publish document option * Document list filtering by metadata * Show last 5 recent metadata setups for easy switch * Allow document type to be changed in document edit view * Document model's delete method might not get called when deleting in bulk from a queryset * Allow metadata entry form to mix required and non required metadata * Block Setup menu item to non staff and non superuser users * Include annotations in transformed documents downloads * Toggable option to include default transformation on document upload * Add document tagging * Separate free form document rename and require new permission * Test zip file upload with multi directories zip file * Don't append an extension separator if extension is non existant Filesystem serving ================== * Avoid metadata indexing folders name clash * WebDAV support Search ====== * Advanced search by metadata fields * Save advanced search by metadata setup as a virtual folder * Add show_summary method to model to display as results of a search * Cross model inclusion search - DONE Convert ======= * Create mimetype convertion map for convert app * Migrate ocr app tesseract handling to convert app - DONE * Add timeout support convert tasks * DXF viewer - http://code.google.com/p/dxf-reader/source/browse/#svn%2Ftrunk * Support spreadsheets, wordprocessing docs using openoffice in server mode * Cache.cleanup function to delete cached images when document hash changes Storage ======= * Storage backend to storage backend copy support, to move/migrate document to new storage backend * Encrypting storage backend GridFSStorage ============= * Implement user settings - DONE * Implement delete-open soft locking - DEFERRED * Implement master_slave_connection * if exists adding _ plus a counter - avoid file versioning * GridFS FUSE to filesystem serving bridge OCR === * Don't do OCR on wordproccessing or spreadsheet document, strip tags and store text * Add timeout support to ocr tasks * Allow for OCR document requeue on error and requeue limit * Multiple ocr queue support - STARTED * Add per node max ocr concurrent execution ISSUES =========== 1) Staging file hash colition when same file with different name, newhash = content hash + filename hash 2) Fix field error on search action for documents while processing OCR queue