Test extraction lets the user check extraction results without running an entire batch. By running a test extraction, the user can evaluate the result for a particular input and, at the same time, verify which kind of extraction is best suited to a particular document type.
Steps to run Test Extraction
To begin a test extraction, upload an image from the batch class screen, either by dropping a file or by clicking the Upload Test Extraction File(s) link at the bottom of the screen. The uploaded image can be a single-page or multipage TIF/TIFF file or a PDF file.
Test extraction can be run for different extraction plugins by altering only two properties: Classification Types and Extraction Plugins. Test extraction requires the input document to undergo the operations configured in the Document Assembly and Page Processing modules for the selected classification type. It supports three kinds of classification.
For instance, on selecting ImageClassification in the Classification drop-down, the Document Assembly and Page Processing modules perform only those operations that are necessary for image classification. If scripts are configured for both modules, they are executed at the same time.
So, to run a test extraction, all the relevant plugins in both modules must be configured for the batch class.
The following extraction plugins can be selected to carry out the extraction on the input document:
- Recostar Extraction
- Barcode Extraction
- Regular Regex Extraction
- Key Value Extraction
- Table Extraction
- Extraction Scripting Plugin
The extraction plugins that appear in the selection box are the ones configured for the particular batch class, listed in their order of execution. To run a test extraction, select the plugins from the multiple-selection box. One or more plugins can be selected to operate on the input document, and their order of execution follows the order defined in the batch class.
For instance, on selecting Recostar Extraction, the extraction is performed on the input document and the result reflects the values extracted by that plugin. If Recostar Extraction and Barcode Extraction are selected together, the extraction is done on the input document according to the extraction rules defined in both plugins, and the order of execution is the same as that defined in the batch class.
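The ordering rule described above can be illustrated with a minimal sketch. It is not the product's implementation: the function name `execution_order` and the sample plugin order are assumptions made purely for illustration. It only shows that the order in which plugins are picked does not matter, because the batch class order decides.

```python
def execution_order(batch_class_plugins, selected_plugins):
    """Return the selected plugins in the order defined by the batch class.

    batch_class_plugins: plugin names in batch class order (assumed input).
    selected_plugins: plugin names picked in the selection box, in any order.
    """
    selected = set(selected_plugins)
    # A plugin that is not configured in the batch class cannot be run.
    unknown = selected - set(batch_class_plugins)
    if unknown:
        raise ValueError(f"Not configured in the batch class: {sorted(unknown)}")
    # Walk the batch class order and keep only the selected plugins.
    return [name for name in batch_class_plugins if name in selected]


# Hypothetical batch class configuration, for illustration only.
configured = ["Recostar Extraction", "Barcode Extraction", "Key Value Extraction"]
chosen = ["Barcode Extraction", "Recostar Extraction"]
print(execution_order(configured, chosen))
# prints ['Recostar Extraction', 'Barcode Extraction']
```

Selecting a plugin that is not configured raises an error in this sketch, which mirrors the requirement that plugins be configured in the batch class before they can be used.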
NOTE: The extraction runs on the input document irrespective of whether the extraction switch is ON or OFF. However, the plugins, including the Extraction Scripting Plugin, must be configured in the batch class for the extraction to be performed.
The test extraction screen has only three major buttons: Extract, Download, and Clear.
Clicking the Extract button performs the extraction on the input document and generates the result. If no extraction result is returned, a message stating that no values were extracted is displayed on the screen.
Clicking the Download button makes the XML file of extracted results available for download. The schema of the downloaded XML file is similar to that of the batch.xml produced while running a batch. However, the batch instance identifier fields in it are empty.
Clicking the Clear button clears the extraction results and the XML files, so that the user can perform another extraction on the input document.
Apart from these, the user may click the Close button to close the test extraction screen.
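The downloaded results file can also be inspected outside the screen. The sketch below is only an illustration under stated assumptions: the element names (`BatchInstanceIdentifier`, `DocumentLevelField`, `Name`, `Value`, `Confidence`) and the sample content are hypothetical stand-ins for a batch.xml-like layout, so they should be checked against a real downloaded file before use.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real downloaded file may use different element names.
SAMPLE = """<Batch>
  <BatchInstanceIdentifier></BatchInstanceIdentifier>
  <Documents>
    <Document>
      <DocumentLevelFields>
        <DocumentLevelField>
          <Name>InvoiceNumber</Name>
          <Value>12345</Value>
          <Confidence>90.0</Confidence>
        </DocumentLevelField>
      </DocumentLevelFields>
    </Document>
  </Documents>
</Batch>"""


def read_fields(xml_text):
    """Return (batch instance identifier, list of extracted field dicts)."""
    root = ET.fromstring(xml_text)
    # Expected to be empty for a test extraction download.
    identifier = (root.findtext("BatchInstanceIdentifier") or "").strip()
    fields = []
    for dlf in root.iter("DocumentLevelField"):
        fields.append({
            "name": dlf.findtext("Name") or "",
            "value": dlf.findtext("Value") or "",
            # A missing or empty confidence is treated as 0.0 in this sketch.
            "confidence": float(dlf.findtext("Confidence") or 0),
        })
    return identifier, fields


identifier, fields = read_fields(SAMPLE)
print(repr(identifier))
for field in fields:
    print(field["name"], field["value"], field["confidence"])
```

An empty identifier together with a non-empty field list is what a successful test extraction would look like under these assumptions. An empty field list corresponds to the case where no values were extracted.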
After an extraction plugin returns a result, it is displayed as follows.
The ExtractedDLF section displays the page ID and page-level name, the document-level fields, the extracted values, the type of extraction, and the confidence with which each value was extracted.
In the case of table extraction, the data is populated in the DataTable section under the classified document.
If no value is extracted from the input document, no ExtractedDLF section appears.