The document parsing was being turned off in the OCR tests
by setting the binary to an invalid value. A proper way
to disable automatic parsing was added in a previous commit
and this commit updates the test case class to use that method.
Signed-off-by: Roberto Rosario <Roberto.Rosario@mayan-edms.com>
Upgrade Celery version used from 3.1.26 to 4.1.1. The following
settings have been renamed: CELERY_ALWAYS_EAGER to
CELERY_TASK_ALWAYS_EAGER, BROKER_URL to CELERY_BROKER_URL.
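The rename can be sketched as a small mapping applied to a settings
dictionary. This is only an illustration: the helper name
rename_celery_settings, the sample broker URL and the DEBUG entry are
assumptions and are not part of this commit.

```python
# Old setting name -> new setting name, as listed in this commit message.
RENAMED_SETTINGS = {
    'CELERY_ALWAYS_EAGER': 'CELERY_TASK_ALWAYS_EAGER',
    'BROKER_URL': 'CELERY_BROKER_URL',
}


def rename_celery_settings(settings):
    """Return a copy of `settings` with the renamed keys applied.

    Hypothetical helper; keys that were not renamed are kept as-is.
    """
    return {
        RENAMED_SETTINGS.get(name, name): value
        for name, value in settings.items()
    }


# Sample values are placeholders chosen for the example.
old_settings = {
    'CELERY_ALWAYS_EAGER': True,
    'BROKER_URL': 'redis://127.0.0.1:6379/0',
    'DEBUG': False,
}
new_settings = rename_celery_settings(old_settings)
```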
Signed-off-by: Roberto Rosario <Roberto.Rosario@mayan-edms.com>
Prepend "operation_" to the data migration functions
to make their purpose clear. Add keyword arguments to the RunPython
migration operation.
Signed-off-by: Roberto Rosario <Roberto.Rosario@mayan-edms.com>
Update the OCR app to use the document image cache instead
of trying to read the image file directly from
the document storage.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
Instead of inserting the path of the apps into the Python path,
the apps are now referenced by their full import path.
This avoids app name clashes with external or native Python libraries.
Example: Mayan's statistics app vs. Python's new statistics library.
Every app reference is now prepended with 'mayan.apps'.
Existing config.yml files need to be updated manually.
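A minimal sketch of the prefixing rule, which may help when updating
app references by hand. The helper name qualify_app_name is an
assumption made for illustration and does not exist in the code base.

```python
APP_PREFIX = 'mayan.apps'


def qualify_app_name(app_name, prefix=APP_PREFIX):
    """Return the full import path for a bare app name.

    Hypothetical helper. Names that already carry the prefix are
    returned unchanged so the function can be applied more than once.
    """
    if app_name == prefix or app_name.startswith(prefix + '.'):
        return app_name
    return '{}.{}'.format(prefix, app_name)


# The bare name 'statistics' is also the name of a standard library
# module; the qualified path is unambiguous.
qualified = qualify_app_name('statistics')
```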
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
Revert how the OCR and document parsing generators end
their iteration. Originally they issued an empty return,
then a blank yield was added. This commit reverts the
blank yield and restores the original 'return' behavior.
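The difference between the two endings can be seen in a small,
self-contained sketch. The generator names and the upper-casing step
are placeholders and are not the actual OCR or parsing code.

```python
def content_with_return(pages):
    """End the iteration with a bare return (the restored behavior)."""
    for page in pages:
        yield page.upper()
    # A bare return stops the generator without emitting a value.
    return


def content_with_blank_yield(pages):
    """End the iteration with a blank yield (the reverted behavior)."""
    for page in pages:
        yield page.upper()
    # A blank yield emits one extra None item to the consumer.
    yield


with_return = list(content_with_return(['a', 'b']))
with_blank_yield = list(content_with_blank_yield(['a', 'b']))
```

The bare return produces only the page content, while the blank yield
appends a trailing None that consumers would have to filter out.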
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
Add the 'ocr_content' attribute to documents to allow access
to a document's OCR content for indexing and other purposes.
Fixes the failing OCR indexing test.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
Update the default values of the settings which pass
arguments to backends to be valid Python values and not
YAML strings.
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
Instead of passing strings as arguments to backends, all settings must
be formatted according to YAML specifications. This is to remove the
need to add separate YAML parsing to each backend argument in each
app that needs it. Argument passing to backends is now fully
uniform.
Users need to update their config files.
Example:
DOCUMENTS_STORAGE_BACKEND_ARGUMENTS: '{location: /home/rosarior/development/mayan-edms/mayan/media/document_storage}'
must be changed to:
DOCUMENTS_STORAGE_BACKEND_ARGUMENTS:
  location: /home/rosarior/development/mayan-edms/mayan/media/document_storage
Example 2:
CONVERTER_GRAPHICS_BACKEND_CONFIG: ' { libreoffice_path: /usr/bin/libreoffice, pdftoppm_dpi:
300, pdftoppm_format: jpeg, pdftoppm_path: /usr/bin/pdftoppm, pdfinfo_path:
/usr/bin/pdfinfo, pillow_format: JPEG } '
must be changed to:
CONVERTER_GRAPHICS_BACKEND_CONFIG:
  libreoffice_path: /usr/bin/libreoffice
  pdftoppm_dpi: 300
  pdftoppm_format: jpeg
  pdftoppm_path: /usr/bin/pdftoppm
  pdfinfo_path: /usr/bin/pdfinfo
  pillow_format: JPEG
Example 3:
OCR_BACKEND_ARGUMENTS: ''
must be changed to:
OCR_BACKEND_ARGUMENTS: {}
Settings that need to be updated are:
- COMMON_SHARED_STORAGE_ARGUMENTS
- CONVERTER_GRAPHICS_BACKEND_CONFIG
- DOCUMENTS_CACHE_STORAGE_BACKEND_ARGUMENTS
- DOCUMENTS_STORAGE_BACKEND_ARGUMENTS
- OCR_BACKEND_ARGUMENTS
- SIGNATURES_STORAGE_BACKEND_ARGUMENTS
- SOURCES_STAGING_FILE_CACHE_STORAGE_BACKEND_ARGUMENTS
The following error will appear in the console if a setting is not yet
updated to this new format::
TypeError: type object argument after ** must be a mapping, not str
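The error comes from unpacking the setting value with ** when the
backend is instantiated. A minimal sketch, assuming a stand-in
Storage class and an instantiate helper that are not part of the code
base; the exact prefix of the error message varies between Python
versions.

```python
class Storage:
    """Stand-in for a backend class (hypothetical)."""

    def __init__(self, location=None):
        self.location = location


def instantiate(backend_class, arguments):
    # The arguments are unpacked as keyword arguments, so they must
    # be a mapping.
    return backend_class(**arguments)


# New format: the YAML mapping is loaded as a Python dictionary.
new_style = {'location': '/tmp/document_storage'}
storage = instantiate(Storage, new_style)

# An empty mapping ({}) is also valid and passes no arguments.
default_storage = instantiate(Storage, {})

# Old format: the value is still a string and cannot be unpacked.
old_style = '{location: /tmp/document_storage}'
try:
    instantiate(Storage, old_style)
except TypeError as exception:
    error_message = str(exception)
```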
Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>