This plug-in validates documents against the configured regex patterns. The regex patterns described in the Regular Expression Listing are used for this validation. Every value of every index field in a document is compared against the configured pattern. If all values match, the document is marked as valid, i.e. its valid tag is set to true. If any index field does not match, the document is marked as invalid, i.e. its valid tag is set to false.
An additional validation for index fields is based on the OCR confidence generated by Recostar. Every index field has an OCR confidence threshold field whose default value is 90. If the OCR confidence of an index field is less than its OCR confidence threshold, the index field is displayed in “Red” on the validation UI, with an error message at the top.
Such an index field is not marked as validated until the user intervenes on it.
Steps for configuring the plugin
- The user can select the batch class module and create the regex pattern by navigating to the Regular Expression Configuration page.
- The user can create multiple validation rules for each index field.
- Every index field has a validation operator field. It can be assigned one of two values, ‘OR’ or ‘AND’. The default validation operator is OR.
- With OR as the validation operator, an index field is considered valid if it satisfies at least one of its validation rules. With AND as the validation operator, the index field must satisfy all of its validation rules to be considered valid.
- To configure OCR confidence validation, a switch is provided in <Ephesoft Installation path>\Application\WEB-INF\classes\META-INF\application.properties as follows:
#OCR confidence switch value
OCR confidence validation can be enabled or disabled using this switch.
The switch is enabled by default.
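The OR/AND validation operator rule described in the steps above can be sketched as follows. This is only an illustrative sketch, not the plug-in's actual code: the function name `field_is_valid` and its parameters are hypothetical, and it assumes a rule matches only when the pattern matches the whole value, which the documentation does not state.

```python
import re

def field_is_valid(value, patterns, operator="OR"):
    """Hypothetical helper: combine regex validation rules for one index field.

    Assumption: a rule is satisfied only if the pattern matches the whole value.
    """
    # Evaluate every validation rule configured for the index field.
    results = [re.fullmatch(p, value) is not None for p in patterns]
    if operator == "OR":
        # OR (the default): at least one rule must be satisfied.
        return any(results)
    if operator == "AND":
        # AND: every rule must be satisfied.
        return all(results)
    raise ValueError("validation operator must be 'OR' or 'AND'")
```

For example, the value `12345` checked against the two rules `\d{5}` and `[A-Z]+` is valid under OR, because the first rule matches, but invalid under AND, because the second rule does not.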
Steps of execution
- This plug-in validates the index fields using the regex patterns defined for the index fields of each document type and, if enabled, the OCR confidence threshold.
- For any index field, the validation takes place in the following priority order:
  - Regex Pattern Validation
  - OCR confidence validation
- In regex pattern validation, the value of each index field from batch.xml is matched against the regex patterns defined for that index field. If all index field values match their regex patterns, the document’s “Valid” tag is set to true; otherwise it is set to false.
- In OCR confidence validation, the OCR confidence of the index field is compared with its OCR confidence threshold. The index field is marked as invalid in batch.xml if its OCR confidence is less than the threshold. The document’s “Valid” tag is set to true only if all of its index fields are valid; otherwise it is set to false.
- Documents that are valid need no further validation. Documents whose “Valid” tag is set to false must be validated during Validation.
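The execution steps above can be sketched as follows. This is a rough illustration under stated assumptions, not the plug-in's implementation: the `IndexField` class, `validate_field`, `validate_document`, and the `ocr_check_enabled` flag are hypothetical names, and whole-value regex matching is assumed. The default threshold of 90 and the "less than" comparison are taken from the text.

```python
import re
from dataclasses import dataclass
from typing import List

@dataclass
class IndexField:
    """Hypothetical in-memory stand-in for an index field read from batch.xml."""
    name: str
    value: str
    ocr_confidence: float
    patterns: List[str]
    operator: str = "OR"          # default validation operator
    ocr_threshold: float = 90.0   # default OCR confidence threshold

def validate_field(f, ocr_check_enabled=True):
    # Step 1: regex pattern validation (assumes whole-value matching).
    checks = [re.fullmatch(p, f.value) is not None for p in f.patterns]
    regex_ok = all(checks) if f.operator == "AND" else any(checks)
    if not regex_ok:
        return False
    # Step 2: OCR confidence validation, only when the switch is enabled.
    if ocr_check_enabled and f.ocr_confidence < f.ocr_threshold:
        return False
    return True

def validate_document(fields, ocr_check_enabled=True):
    # The document's "Valid" tag is true only if every index field is valid.
    return all(validate_field(f, ocr_check_enabled) for f in fields)
```

In this sketch, a field whose value matches its regex but whose OCR confidence is 72 makes the whole document invalid while the OCR switch is enabled, and leaves it valid once the switch is disabled.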
The following are a few common error messages seen when the plugin malfunctions:
|S. No.||Error message||Possible root cause|
|1||Invalid initialization of field service.||No field type initialized in a document.|
|2||Invalid input pattern sequence.||Regex pattern is not supplied for required field.|