Adds ability to validate and normalize metadata.

I felt that it would be very handy to be able to validate
user-supplied metadata.  It occurred to me that if a metadata
type had an explicit list of options, it would need no validation.
Therefore, the "lookup" field of a metadata type could be overloaded
to provide EITHER a list of items that could be selected by the user
OR a function to provide data validation.  The system, therefore,
would need to be able to discriminate between a lookup function
and a validation function.

To this end, I created a global setting
('METADATA_AVAILABLE_VALIDATORS') that contains a dictionary of
available validation functions.  If the name specified in
'metadata_type.lookup' is present in METADATA_AVAILABLE_VALIDATORS,
the system treats the function as a validator.  Otherwise, the
lookup is treated as an expression that generates an iterable
providing the choices for the user.
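
The decision described above can be sketched in isolation as follows. This is a minimal illustration, not the code in this commit: 'classify_lookup', the category names it returns, and the placeholder dictionary contents are hypothetical and exist only for this example.

```python
# Hypothetical stand-in for the METADATA_AVAILABLE_VALIDATORS setting.
METADATA_AVAILABLE_VALIDATORS = {
    'is_valid_posting_date': lambda value: value,  # placeholder validator
}


def classify_lookup(lookup, validators=METADATA_AVAILABLE_VALIDATORS):
    """Return how a metadata type's 'lookup' value would be interpreted."""
    if not lookup:
        return 'free_text'   # no lookup at all: plain text field
    if lookup in validators:
        return 'validator'   # name is registered as a validation function
    return 'choices'         # anything else is evaluated for a list of choices
```

With this sketch, classify_lookup('is_valid_posting_date') yields 'validator', while a lookup string that is not a registered name falls through to 'choices'.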

Django contains a pre-existing mechanism to support field
validation.  A validator takes a single argument (the data to
be validated).  If the argument to the validator is valid,
the validator simply returns.  If there is a problem with
the data, the validator raises a 'ValidationError' exception
and passes an error message which is then displayed by Django
as a mouseover tip in the browser. Validators to be used
with Mayan-EDMS may follow this convention (i.e., take a
single argument and raise an exception if the validation
fails).  The validators in Mayan-EDMS, however, may actually
do more!
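
As a rough sketch of the plain Django convention (raise on failure, otherwise simply return), a validator might look like the following. The name 'is_not_blank' is hypothetical, and the fallback 'ValidationError' class is only a stand-in so the example runs where Django is not installed.

```python
try:
    from django.core.exceptions import ValidationError
except ImportError:
    # Stand-in so this sketch runs without Django installed.
    class ValidationError(Exception):
        pass


def is_not_blank(value):
    """Hypothetical validator: reject empty or whitespace-only input."""
    if not value or not value.strip():
        raise ValidationError('This value may not be blank')
    # Falling off the end returns None: the value is valid and unchanged.
```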

If a validator function RETURNS a value, that value is used
in place of the original data.  This allows the validator to
make data conform to a valid value or to "normalize" a value
before it is stored in the database.  This allows for more
uniform metadata and improves the ability to index on the
metadata values.  Let's take a look at an example of this
functionality.

Assume that a document requires a date (perhaps, an
"original posting date").  We can have a 'metadata_type' of
"original_posting_date", and we can create a validator with
the name "is_valid_posting_date".  The validator module
(which is read by the settings routine) contains the function:

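def is_valid_posting_date(value):
    from dateutil import parser
    from django.core.exceptions import ValidationError

    try:
        # Accepts many common date formats.
        dt = parser.parse(value)
    except (ValueError, OverflowError):
        # dateutil raises these for unparseable or out-of-range input.
        raise ValidationError('Invalid date')
    # Normalize to ISO format (YYYY-MM-DD).
    return dt.date().isoformat()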

This is placed in a dictionary in the user's
settings file, thus:

import my_settings
METADATA_AVAILABLE_VALIDATORS = {
    'is_valid_posting_date': my_settings.is_valid_posting_date,
}

The user creates a metadata type called "original_posting_date"
with a label of "Original Posting Date" and a 'lookup' value
of "is_valid_posting_date".  When the metadata form is filled
in and submitted, the date value is validated by our validator.
Since the dateutil 'parser' module accepts many kinds of input,
the user can enter (for example) '9/1/2014', '2014/10/2',
or even 'Feb 4, 2001'.  If the user enters something that
does not (as far as dateutil is concerned) represent a valid date,
the validator will raise a "ValidationError" and the form will
be re-displayed with an appropriate error message.  If, however,
the data is valid, the value of the field (and, hence, the value
stored in the database) will be "normalized" to ISO format
YYYY-MM-DD.  This allows consistent lookup and indexing regardless
of the user's particular idiosyncrasies.
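
The substitution step that makes this normalization possible can be sketched on its own, without Django or dateutil. The helper names here ('apply_validator', 'collapse_whitespace', 'accept_anything') are hypothetical. The truthiness check is meant to mirror the 'if new_value:' test in the form's clean_value method, so a validator that returns None or an empty string leaves the original value in place.

```python
def apply_validator(validator, value):
    """Run a validator and use its return value, if any, as the new value."""
    new_value = validator(value)  # may raise a validation error
    return new_value if new_value else value


def collapse_whitespace(value):
    """Hypothetical normalizing validator: squeeze runs of whitespace."""
    return ' '.join(value.split())


def accept_anything(value):
    """Hypothetical raise-only validator: always valid, returns None."""
    return None
```

For example, apply_validator(collapse_whitespace, '  Annual   Report ') gives 'Annual Report', while apply_validator(accept_anything, 'x') gives back 'x' unchanged.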
Gary Walborn
2014-09-22 12:30:15 -04:00
parent 1050fc71af
commit 9cd3753746
2 changed files with 28 additions and 4 deletions


@@ -16,6 +16,9 @@ default_available_models = {
'User': User
}
default_available_validators = {
}
register_settings(
namespace=u'metadata',
module=u'metadata.conf.settings',
@@ -23,5 +26,6 @@ register_settings(
# Definition
{'name': u'AVAILABLE_FUNCTIONS', 'global_name': u'METADATA_AVAILABLE_FUNCTIONS', 'default': default_available_functions},
{'name': u'AVAILABLE_MODELS', 'global_name': u'METADATA_AVAILABLE_MODELS', 'default': default_available_models},
{'name': u'AVAILABLE_VALIDATORS', 'global_name': u'METADATA_AVAILABLE_VALIDATORS', 'default': default_available_validators},
]
)


@@ -1,27 +1,41 @@
from __future__ import absolute_import
from django import forms
from django.forms.formsets import formset_factory
from django.forms.formsets import formset_factory, BaseFormSet
from django.utils.translation import ugettext_lazy as _
from common.widgets import ScrollableCheckboxSelectMultiple
from .conf.settings import AVAILABLE_MODELS, AVAILABLE_FUNCTIONS
from .conf.settings import AVAILABLE_MODELS, AVAILABLE_FUNCTIONS, AVAILABLE_VALIDATORS
from .models import MetadataSet, MetadataType, DocumentTypeDefaults
class MetadataForm(forms.Form):
def clean_value(self):
value = self.cleaned_data['value']
metadata_id = self.cleaned_data['id']
metadata_type = MetadataType.objects.get(pk=metadata_id)
if ( metadata_type.lookup
and AVAILABLE_VALIDATORS.has_key(metadata_type.lookup) ):
val_func=AVAILABLE_VALIDATORS[metadata_type.lookup]
new_value = val_func(value)
if new_value:
value = new_value
return value
def __init__(self, *args, **kwargs):
super(MetadataForm, self).__init__(*args, **kwargs)
# Set form fields initial values
if 'initial' in kwargs:
self.metadata_type = kwargs['initial'].pop('metadata_type', None)
# FIXME:
# required = self.document_type.documenttypemetadatatype_set.get(metadata_type=self.metadata_type).required
required = False
required_string = u''
if required:
self.fields['value'].required = True
required_string = ' (%s)' % _(u'required')
@@ -31,8 +45,10 @@ class MetadataForm(forms.Form):
self.fields['name'].initial = '%s%s' % ((self.metadata_type.title if self.metadata_type.title else self.metadata_type.name), required_string)
self.fields['id'].initial = self.metadata_type.pk
if self.metadata_type.lookup:
if ( self.metadata_type.lookup
and not AVAILABLE_VALIDATORS.has_key(self.metadata_type.lookup)):
try:
choices = eval(self.metadata_type.lookup, AVAILABLE_MODELS)
self.fields['value'] = forms.ChoiceField(label=self.fields['value'].label)
@@ -51,12 +67,16 @@ class MetadataForm(forms.Form):
except Exception as exception:
self.fields['value'].initial = exception
id = forms.CharField(label=_(u'id'), widget=forms.HiddenInput)
name = forms.CharField(label=_(u'Name'),
required=False, widget=forms.TextInput(attrs={'readonly': 'readonly'}))
value = forms.CharField(label=_(u'Value'), required=False)
update = forms.BooleanField(initial=True, label=_(u'Update'), required=False)
MetadataFormSet = formset_factory(MetadataForm, extra=0)