From d13c4443126c2b1af49ee213b9313ff59a85a567 Mon Sep 17 00:00:00 2001
From: Roberto Rosario
Date: Mon, 24 Oct 2016 01:20:43 -0400
Subject: [PATCH] Add tesseract homepage link and note on how to add extra languages.

---
 docs/topics/ocr_backend.rst | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/docs/topics/ocr_backend.rst b/docs/topics/ocr_backend.rst
index ab0de3c835..cdef0c5fff 100644
--- a/docs/topics/ocr_backend.rst
+++ b/docs/topics/ocr_backend.rst
@@ -2,8 +2,9 @@
 OCR backend
 ===========
 
-Mayan EDMS ships an OCR backend that uses the FLOSS engine Tesseract, but it can
-use other engines. To support other engines a wrapper that subclasess the
+Mayan EDMS ships an OCR backend that uses the FLOSS engine Tesseract
+(https://github.com/tesseract-ocr/tesseract/), but it can
+use other engines. To support other engines, create a wrapper that subclasses the
 ``OCRBackendBase`` class defined in mayan/apps/ocr/classes. This subclass should
 expose the ``execute`` method. For an example of how the Tesseract backend
 is implemented take a look at the file ``mayan/apps/ocr/backends/tesseract.py``
@@ -13,3 +14,8 @@ OCR_BACKEND and point it to your new OCR backend class path.
 
 The default value of OCR_BACKEND is ``"ocr.backends.tesseract.Tesseract"``
 
+To OCR additional languages when using Tesseract, install the
+corresponding language file. If using a Debian-based OS, this command will
+display the available language files:
+
+    apt-cache search tesseract-ocr
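
The wrapper pattern that the patched documentation describes can be sketched roughly as below. This is only an illustration under stated assumptions: the ``OCRBackendBase`` class here is a local stand-in, not the real class from mayan/apps/ocr/classes, and the ``execute`` signature, the ``CallableOCRBackend`` name, and the ``load_backend`` helper are all hypothetical. The patch does not show the real interface; ``mayan/apps/ocr/backends/tesseract.py`` is the authoritative example.

```python
import importlib


class OCRBackendBase:
    """Local stand-in for the base class in mayan/apps/ocr/classes (assumed shape)."""

    def execute(self, *args, **kwargs):
        raise NotImplementedError


class CallableOCRBackend(OCRBackendBase):
    """Hypothetical wrapper that delegates OCR to another engine."""

    def __init__(self, engine):
        # `engine` is any callable that takes image bytes and returns text.
        self.engine = engine

    def execute(self, image_bytes):
        # Call the wrapped engine and trim surrounding whitespace.
        return self.engine(image_bytes).strip()


def load_backend(dotted_path):
    """Resolve a dotted class path, the kind of value OCR_BACKEND holds."""
    module_path, class_name = dotted_path.rsplit('.', 1)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

With a layout like this, pointing OCR_BACKEND at the dotted path of the new class would be enough for a loader of this kind to find it. How Mayan EDMS actually resolves the setting is not shown in the patch.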