diff --git a/docs/topics/ocr_backend.rst b/docs/topics/ocr_backend.rst
index ab0de3c835..cdef0c5fff 100644
--- a/docs/topics/ocr_backend.rst
+++ b/docs/topics/ocr_backend.rst
@@ -2,8 +2,9 @@
 OCR backend
 ===========
 
-Mayan EDMS ships an OCR backend that uses the FLOSS engine Tesseract, but it can
-use other engines. To support other engines a wrapper that subclasess the
+Mayan EDMS ships an OCR backend that uses the FLOSS engine Tesseract
+(https://github.com/tesseract-ocr/tesseract/), but it can
+use other engines. To support other engines create a wrapper that subclasses the
 ``OCRBackendBase`` class defined in mayan/apps/ocr/classes. This subclass should
 expose the ``execute`` method. For an example of how the Tesseract backend
 is implemented take a look at the file ``mayan/apps/ocr/backends/tesseract.py``
@@ -13,3 +14,8 @@
 OCR_BACKEND and point it to your new OCR backend class path. The default value of
 OCR_BACKEND is ``"ocr.backends.tesseract.Tesseract"``
 
+To OCR additional languages when using Tesseract, install the
+corresponding language file. If using a Debian-based OS, this command will
+display the available language files:
+
+    apt-cache search tesseract-ocr
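
Review note: the wrapper described in the patched text could look roughly like the sketch below. It is only an illustration under stated assumptions. The ``OCRBackendBase`` defined here is a stand-in so the snippet runs on its own; the real class lives in mayan/apps/ocr/classes and its ``execute`` signature may differ. The class name ``ExternalEngineOCR``, the binary ``my-ocr-engine``, the ``--lang`` flag, and the ``build_arguments`` helper are all hypothetical.

```python
import shutil
import subprocess


class OCRBackendBase:
    """Stand-in for the base class in mayan/apps/ocr/classes (assumed shape)."""

    def execute(self, input_filename, language=None):
        raise NotImplementedError


class ExternalEngineOCR(OCRBackendBase):
    """Hypothetical backend that wraps an external OCR command-line tool."""

    command = 'my-ocr-engine'  # hypothetical binary name, not a real engine

    def build_arguments(self, input_filename, language=None):
        # Assemble the command line; '--lang' is a made-up flag for illustration.
        arguments = [self.command, input_filename]
        if language:
            arguments.extend(['--lang', language])
        return arguments

    def execute(self, input_filename, language=None):
        # Fail early with a clear message if the engine is not installed.
        if shutil.which(self.command) is None:
            raise RuntimeError('OCR engine not found: %s' % self.command)
        result = subprocess.run(
            self.build_arguments(input_filename, language),
            capture_output=True, text=True, check=True
        )
        # Treat whatever the engine prints to stdout as the recognized text.
        return result.stdout
```

With a class like this in place, OCR_BACKEND would be pointed at its dotted class path, in the same way the default points at ``"ocr.backends.tesseract.Tesseract"``.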