OCR: Add 'ocr_content' attribute

Add the 'ocr_content' attribute to documents to allow access
to a document's OCR content for indexing and other purposes.

Fixes the failing OCR indexing test.

Signed-off-by: Roberto Rosario <roberto.rosario.gonzalez@gmail.com>
Roberto Rosario
2018-11-27 05:20:31 -04:00
parent 0f5625a356
commit aaf9f7a8be
5 changed files with 20 additions and 10 deletions

@@ -13,6 +13,11 @@ def get_document_ocr_content(document):
         try:
             page_content = page.ocr_content.content
         except DocumentPageOCRContent.DoesNotExist:
-            pass
+            yield ''
         else:
             yield force_text(page_content)
+
+
+@property
+def document_property_ocr_content(self):
+    return ' '.join(get_document_ocr_content(self))
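
The behavior this change describes can be illustrated with a minimal, self-contained sketch. It is not the project's code: `Page`, `Document`, and `OCRContentMissing` are hypothetical stand-ins for the Django models and for `DocumentPageOCRContent.DoesNotExist`, `str` stands in for `force_text`, and the property is attached directly to the class here, which may differ from how the commit wires it up.

```python
class OCRContentMissing(Exception):
    """Hypothetical stand-in for DocumentPageOCRContent.DoesNotExist."""


class Page:
    """Hypothetical page; content of None means the page has no OCR record."""

    def __init__(self, content=None):
        self._content = content

    @property
    def ocr_content(self):
        if self._content is None:
            raise OCRContentMissing
        return self._content


def get_document_ocr_content(document):
    # Yield one string per page; pages without OCR content yield ''
    # instead of being skipped, mirroring the change in this hunk.
    for page in document.pages:
        try:
            page_content = page.ocr_content
        except OCRContentMissing:
            yield ''
        else:
            yield str(page_content)


class Document:
    """Hypothetical document holding a plain list of pages."""

    def __init__(self, pages):
        self.pages = pages

    @property
    def ocr_content(self):
        # Join the per-page strings with single spaces.
        return ' '.join(get_document_ocr_content(self))


document = Document([Page('first page'), Page(), Page('third page')])
joined = document.ocr_content
empty = Document([Page()]).ocr_content
```

Under these assumptions, a document whose pages all lack OCR content produces an empty string, so code that reads the attribute (an indexing expression, for example) always receives a string value.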