405 Commits

Author SHA1 Message Date
Roberto Rosario
bc816ccdda OCR: Turn off parsing in OCR tests properly
The document parsing was being turned off in the OCR tests
by setting the binary to an invalid value. A proper way
to disable automatic parsing was added in a previous commit
and this commit updates the test case class to use that method.

Signed-off-by: Roberto Rosario <Roberto.Rosario@mayan-edms.com>
2018-12-12 21:06:58 -04:00
Roberto Rosario
55e9b2263c Celery: Update Celery to version 4.1.1
Upgrade Celery version used from 3.1.26 to 4.1.1. The following
settings have been renamed: CELERY_ALWAYS_EAGER to
CELERY_TASK_ALWAYS_EAGER, BROKER_URL to CELERY_BROKER_URL.

Signed-off-by: Roberto Rosario <Roberto.Rosario@mayan-edms.com>
2018-12-08 22:49:15 -04:00
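The setting rename described in commit 55e9b2263c could be applied mechanically to an existing settings mapping. The following is only a sketch: `rename_celery_settings`, the sample dictionary and the broker URL value are hypothetical and not part of Mayan EDMS; only the two old/new setting names come from the commit message.

```python
# Old-to-new setting names, taken from the commit message above.
RENAMED_SETTINGS = {
    'CELERY_ALWAYS_EAGER': 'CELERY_TASK_ALWAYS_EAGER',
    'BROKER_URL': 'CELERY_BROKER_URL',
}


def rename_celery_settings(settings):
    """Return a copy of `settings` with the old Celery names replaced.

    Hypothetical helper for illustration; names that are not listed in
    RENAMED_SETTINGS are copied unchanged.
    """
    return {
        RENAMED_SETTINGS.get(name, name): value
        for name, value in settings.items()
    }


old_settings = {
    'CELERY_ALWAYS_EAGER': True,
    'BROKER_URL': 'redis://127.0.0.1:6379/0',  # example value, an assumption
    'DEBUG': False,
}
new_settings = rename_celery_settings(old_settings)
```

Settings that were not renamed, such as `DEBUG` in this example, pass through untouched.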
Roberto Rosario
3ae991c9cd Style: Minor PEP8 code cleanups
Signed-off-by: Roberto Rosario <Roberto.Rosario@mayan-edms.com>
2018-12-07 20:24:18 -04:00
Roberto Rosario
255b1c75ea Style: Prepend "operation_" to data migrations
Prepend "operation_" to the data migration functions
to make their purpose clear. Add keyword arguments to the RunPython
migration operation.

Signed-off-by: Roberto Rosario <Roberto.Rosario@mayan-edms.com>
2018-12-07 17:28:22 -04:00
Roberto Rosario
d1945b6190 OCR: Update app to use document image cache
Update the OCR app to use the document image cache instead
of trying to read the image file directly from
the document storage.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-12-05 04:35:12 -04:00
Roberto Rosario
8e69178e07 Project: Switch to full app paths
Instead of inserting the path of the apps into the Python path,
the apps are now referenced by their full import path.

This avoids app name clashes with external or native Python libraries.
Example: Mayan statistics app vs. Python's new statistics library.

Every app reference is now prepended with 'mayan.apps'.

Existing config.yml files need to be updated manually.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-12-05 02:04:20 -04:00
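The manual update mentioned in commit 8e69178e07 amounts to prefixing short app references with 'mayan.apps'. A minimal sketch, assuming references are plain dotted paths: `to_full_app_path`, the set of short app names and the example dotted paths are all hypothetical; only the 'mayan.apps' prefix comes from the commit message.

```python
APP_PREFIX = 'mayan.apps'  # prefix named in the commit message


def to_full_app_path(dotted_path, app_names):
    """Prefix `dotted_path` with 'mayan.apps' when it starts with an app name.

    Hypothetical helper; `app_names` is the set of short app names that
    used to be importable directly. Other paths are returned unchanged.
    """
    first_component = dotted_path.split('.', 1)[0]
    if first_component in app_names:
        return '{}.{}'.format(APP_PREFIX, dotted_path)
    return dotted_path


short_app_names = {'statistics', 'ocr'}  # illustrative subset, an assumption

# Both dotted paths below are invented for the example.
full_path = to_full_app_path('statistics.apps.StatisticsApp', short_app_names)
already_full = to_full_app_path('mayan.apps.ocr.apps.OCRApp', short_app_names)
```

A path that already starts with 'mayan' is left alone, so running the helper twice does not add the prefix a second time.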
Roberto Rosario
2ca38c20b0 Tests: Fix failing tests
Fix failing tests in the OCR and parsing apps.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-11-27 17:17:07 -04:00
Roberto Rosario
67e79d0e19 OCR, Parsing: Revert iterator stop
Revert how the OCR and document parsing generators end
their iteration. Originally they issued an empty return,
then a blank yield was added. This commit reverts the
blank yield and restores the original 'return' behavior.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-11-27 17:15:38 -04:00
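The difference between the two ways of ending a generator in commit 67e79d0e19 can be shown in isolation. The generators below are hypothetical stand-ins, not the actual OCR or parsing code: a bare `return` ends the iteration without producing anything, while a blank `yield` hands a `None` item to the consumer.

```python
def pages_with_bare_return(pages):
    """Stop with a bare `return` when there is nothing to process."""
    if not pages:
        return
    for page in pages:
        yield page


def pages_with_blank_yield(pages):
    """Emit a blank `yield` when there is nothing to process."""
    if not pages:
        yield  # a blank yield produces None
    for page in pages:
        yield page


items_after_return = list(pages_with_bare_return([]))
items_after_yield = list(pages_with_blank_yield([]))
```

With an empty input, `items_after_return` is an empty list and `items_after_yield` contains a single `None`, which callers would have to filter out.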
Roberto Rosario
e9411514c7 PEP8: Code cleanup
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-11-27 05:28:55 -04:00
Roberto Rosario
aaf9f7a8be OCR: Add 'ocr_content' attribute
Add the 'ocr_content' attribute to documents to allow access
to a document's OCR content for indexing and other purposes.

Fixes the OCR indexing failing test.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-11-27 05:20:31 -04:00
Roberto Rosario
51f15a3131 Settings: Update defaults formats
Update the default values of the settings which pass
arguments to backends to be valid Python values and not
YAML strings.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-11-26 17:45:43 -04:00
Roberto Rosario
d5224d93a7 Settings: Remove support for quoted settings
Instead of passing strings as arguments to backends, all settings must
be formatted according to YAML specifications. This is to remove the
need to add separate YAML parsing to each backend argument in each
app that needs it. Argument passing to backends is now fully
uniform.

Users need to update their config files.
  Example:

    DOCUMENTS_STORAGE_BACKEND_ARGUMENTS: '{location: /home/rosarior/development/mayan-edms/mayan/media/document_storage}'

  must be changed to:

    DOCUMENTS_STORAGE_BACKEND_ARGUMENTS:
      location: /home/rosarior/development/mayan-edms/mayan/media/document_storage

  Example 2:

    CONVERTER_GRAPHICS_BACKEND_CONFIG: '        {            libreoffice_path: /usr/bin/libreoffice,            pdftoppm_dpi:
    300,            pdftoppm_format: jpeg,            pdftoppm_path: /usr/bin/pdftoppm,            pdfinfo_path:
    /usr/bin/pdfinfo,            pillow_format: JPEG        }    '

  must be changed to:

    CONVERTER_GRAPHICS_BACKEND_CONFIG:
      libreoffice_path: /usr/bin/libreoffice
      pdftoppm_dpi: 300
      pdftoppm_format: jpeg
      pdftoppm_path: /usr/bin/pdftoppm
      pdfinfo_path: /usr/bin/pdfinfo
      pillow_format: JPEG

  Example 3:

    OCR_BACKEND_ARGUMENTS: ''

  must be changed to:

    OCR_BACKEND_ARGUMENTS: {}

  Settings that need to be updated are:

  - COMMON_SHARED_STORAGE_ARGUMENTS
  - CONVERTER_GRAPHICS_BACKEND_CONFIG
  - DOCUMENTS_CACHE_STORAGE_BACKEND_ARGUMENTS
  - DOCUMENTS_STORAGE_BACKEND_ARGUMENTS
  - OCR_BACKEND_ARGUMENTS
  - SIGNATURES_STORAGE_BACKEND_ARGUMENTS
  - SOURCES_STAGING_FILE_CACHE_STORAGE_BACKEND_ARGUMENTS

  The following error will appear in the console if a setting is not yet
  updated to this new format::

      TypeError: type object argument after ** must be a mapping, not str

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-11-26 17:27:57 -04:00
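The TypeError quoted in commit d5224d93a7 is what Python raises when a string is unpacked with `**` where a mapping is expected. A minimal sketch of that failure: `ExampleStorage` and the `/tmp/document_storage` path are hypothetical and only stand in for a real backend and its arguments.

```python
class ExampleStorage:
    """Hypothetical backend that accepts its arguments as keywords."""

    def __init__(self, location=None):
        self.location = location


# Old style: the setting value is a string holding YAML text.
quoted_arguments = '{location: /tmp/document_storage}'

# New style: the setting value is already a mapping.
mapping_arguments = {'location': '/tmp/document_storage'}

# Unpacking a mapping works as expected.
storage = ExampleStorage(**mapping_arguments)

# Unpacking a string raises TypeError, as described in the commit message.
try:
    ExampleStorage(**quoted_arguments)
except TypeError as exception:
    error_message = str(exception)
else:
    error_message = None
```

The exact wording of the message varies between Python versions, so the sketch only records that a TypeError occurred.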
Roberto Rosario
b04b205fb6 Add docstrings for almost all models
Also adds docstring to some managers and model methods.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-11-24 22:56:35 -04:00
Roberto Rosario
5a8455bfc2 Update translation files.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-10-29 13:24:07 -04:00
Roberto Rosario
bcd2427ab6 Move the noop OCR backend to the right place.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-10-18 16:21:12 -04:00
Roberto Rosario
a99b044555 Code style improvement. Test code consolidation. PEP8 cleanups.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-10-06 02:13:36 -04:00
Roberto Rosario
fb83a838fb Add support for indexing on OCR content changes.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-10-02 03:54:29 -04:00
Roberto Rosario
26ac7de70b Synchronize and compile translations
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-26 22:50:48 -04:00
Roberto Rosario
3c2557fb47 Update translation source files.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-26 22:29:54 -04:00
Roberto Rosario
eda8d18146 Database access in data migrations defaults to the 'default' database. Force it to the user selected database instead.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-19 20:53:04 -04:00
Roberto Rosario
a986b58338 Prepare release files.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-17 18:52:26 -04:00
Roberto Rosario
ac07d4a63f Add more icons to links.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-17 03:09:04 -04:00
Roberto Rosario
a372fc5a07 Improve model help texts. Add respective migrations.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-17 02:46:16 -04:00
Roberto Rosario
03c54395cc Refactor the ModelAttribute class into two separate classes: ModelAttribute for executable model attributes and ModelField for actual ORM fields. Expose more document fields for use in smart links.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-17 02:43:04 -04:00
Roberto Rosario
55930689bb Update language files.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-12 03:45:51 -04:00
Roberto Rosario
4ae7a32443 Update OCR app tests to work with Python 3.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-06 16:59:38 -04:00
Roberto Rosario
c312a2a304 Remove the duplicated setting pdftotext_path from the OCR app. This is now handled by the document parsing app.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-09-01 02:12:08 -04:00
Roberto Rosario
85a5bd995f Update failing OCR tests.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-08-30 13:31:08 -04:00
Roberto Rosario
5eba4f67e5 Add link to view a specific page's OCR content.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-08-28 03:39:42 -04:00
Roberto Rosario
e6db0ff098 The document type OCR setup permission can now be granted for individual document types. Instead of the document OCR permissions, the document type OCR setting permission is required to view the global OCR error list.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-08-27 03:55:45 -04:00
Roberto Rosario
3c57f7ffa7 Merge branch 'master' into merge_master
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-08-22 03:18:30 -04:00
Roberto Rosario
8e39016f12 Code cleanups.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-08-21 18:57:38 -04:00
Roberto Rosario
e400327770 Language translation synchronization.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-08-16 21:12:11 -04:00
Roberto Rosario
e18c043c1f Improve natural key handling for the Document, Metadata, DocumentMetadata, DocumentTypeOCRSetting and UserProfileLocale models.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-08-02 16:00:22 -04:00
Roberto Rosario
c665e75871 Improve serialization migration for the models: Document, DocumentVersion, DocumentMetadata and DocumentTypeOCRSettings
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-08-01 05:09:43 -04:00
Roberto Rosario
2e3ae3f78b Merge branch 'esclear/mayan-edms-patch-1' into merge_patch-1
2018-07-08 02:37:48 -04:00
Roberto Rosario
fd87e28113 French and Polish language translation updates.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-07-08 02:35:43 -04:00
Daniel Albert
8cea56aceb Fix string concatenation to fix error messages
Without using parentheses, the strings are not joined.
2018-07-02 20:57:45 +00:00
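The pitfall fixed in commit 8cea56aceb can be reproduced in a few lines. The functions and message text below are invented for illustration and are not the original code: adjacent string literals are only joined when they belong to the same expression, which is what the parentheses ensure.

```python
def message_without_parentheses():
    text = 'Unable to process the document; '
    'check the file format.'  # separate statement, its value is discarded
    return text


def message_with_parentheses():
    # The parentheses make both literals part of one expression,
    # so Python joins them into a single string.
    text = (
        'Unable to process the document; '
        'check the file format.'
    )
    return text


truncated_message = message_without_parentheses()
complete_message = message_with_parentheses()
```

The first function silently returns only the first half of the message, which is why the bug shows up as a truncated error message rather than as an exception.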
Roberto Rosario
aa38b1c0e8 PEP8 cleanups.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-06-29 03:10:17 -04:00
Roberto Rosario
f5e3470deb Update the OCR app to use the new Icon class.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-06-28 16:47:23 -04:00
Roberto Rosario
85926ae8f8 The conditional_escape call caused downloaded OCR text to contain HTML entities like &quot;
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-06-28 02:04:49 -04:00
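The effect described in commit 85926ae8f8 can be illustrated with the standard library. This is a sketch under an assumption: `html.escape` stands in for Django's `conditional_escape` so the example needs no third-party imports, and the sample text is invented.

```python
import html

ocr_text = 'The sign read "No entry"'  # invented sample OCR text

# Escaping is appropriate when the text is rendered inside an HTML page,
# but it corrupts text that is offered as a plain download.
escaped_text = html.escape(ocr_text)
```

Here `escaped_text` contains `&quot;` in place of each double quote, while `ocr_text` is what a plain text download should contain.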
Roberto Rosario
0f6d33140a Synchronize translation files.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-06-07 23:49:43 -04:00
Roberto Rosario
ffbac43293 Fix failing OCR test.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-06-03 14:42:55 -04:00
Roberto Rosario
15badf4ff9 Update single and multiple document OCR submit views to use MultipleObjectConfirmActionView instead of the deprecated MultipleInstanceActionMixin.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-05-30 19:12:20 -04:00
Roberto Rosario
f7ca35c9b6 Download and compile translations from Transifex.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-04-11 20:39:30 -04:00
Roberto Rosario
0641b568ee Update translation sources and compiled files.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-04-11 15:18:50 -04:00
Roberto Rosario
bce5411ea7 Fix typos.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-04-10 21:22:25 -04:00
Roberto Rosario
3484dc8f33 Update translation source and compiled files.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-04-10 04:23:16 -04:00
Roberto Rosario
99c4f2ccfb Use the document image generation task to create the images for the OCR.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-04-05 19:31:55 -04:00
Roberto Rosario
a0b7561ed7 Add support for passing arguments to the OCR backend.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
2018-04-05 17:23:32 -04:00