- batchClassIdentifier (Batch Class Id)
- Input files (single-page PDF files)
Output result: An XML file (Ephesoft batch.xml)
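As an illustration of how a client might read the returned XML, here is a minimal Python sketch. The element names (`Batch`, `Document`, `Type`, `DocumentLevelField`, `Name`, `Value`) and the sample content are assumptions made for this example and should be checked against an actual batch.xml; `read_fields` is a hypothetical helper, not part of any Ephesoft API.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified layout of the response; a real batch.xml carries
# many more elements and may differ in naming.
SAMPLE = """<Batch>
  <Documents>
    <Document>
      <Type>Invoice</Type>
      <DocumentLevelFields>
        <DocumentLevelField>
          <Name>InvoiceNumber</Name>
          <Value>INV-1001</Value>
        </DocumentLevelField>
      </DocumentLevelFields>
    </Document>
  </Documents>
</Batch>"""


def read_fields(xml_text):
    """Return a list of (document type, {field name: value}) pairs."""
    root = ET.fromstring(xml_text)
    result = []
    for doc in root.iter("Document"):
        fields = {
            f.findtext("Name", default=""): f.findtext("Value", default="")
            for f in doc.iter("DocumentLevelField")
        }
        result.append((doc.findtext("Type", default=""), fields))
    return result


if __name__ == "__main__":
    for doc_type, fields in read_fields(SAMPLE):
        print(doc_type, fields)
```

The sketch collects only the document type and the document-level fields, since those are the values the user is asked to verify.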
Working: When documents are captured with the SnapDOC mobile application, a PDF file is generated for each page that is scanned or photographed with the mobile camera. All generated PDFs are submitted to the mobile web service for further processing. The mobile web service performs the following operations on the input files:
- The input PDFs are converted into single-page TIF files so that they can be processed further.
- OCR is performed on the TIF files using the OCR engine configured in the batch class (Tesseract, Recostar, or Nuance).
- Once the OCR results are available, the input files are grouped and classified under the document types configured in the batch class.
- After the input files have been classified successfully, KV (key-value) extraction is performed on the OCR results for each classified document type.
- The result of the KV extraction is sent back to the mobile application client as an XML response.
- The extraction result obtained from the XML response is presented to the user for verification.
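The order of these steps can be sketched in Python as follows. This is only an outline under stated assumptions: the names `Page`, `Document`, `convert_to_tif`, and `process_upload` are hypothetical, the OCR, classification, and extraction stages are passed in as plain functions, and grouping consecutive pages of the same type into one document is a simplification chosen for the example rather than the service's actual classification logic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Page:
    """One captured page as it moves through the pipeline."""
    pdf_name: str
    tif_name: str = ""
    ocr_text: str = ""


@dataclass
class Document:
    """A group of pages classified under one document type."""
    doc_type: str
    pages: List[Page] = field(default_factory=list)
    fields: Dict[str, str] = field(default_factory=dict)


def convert_to_tif(page: Page) -> Page:
    # Placeholder: a real service would rasterize the PDF here.
    page.tif_name = page.pdf_name.rsplit(".", 1)[0] + ".tif"
    return page


def process_upload(
    pdf_names: List[str],
    ocr: Callable[[str], str],
    classify: Callable[[str], str],
    extract: Callable[[str, str], Dict[str, str]],
) -> List[Document]:
    # Step 1: one single-page TIF per input PDF.
    pages = [convert_to_tif(Page(name)) for name in pdf_names]
    # Step 2: OCR every TIF with the configured engine.
    for page in pages:
        page.ocr_text = ocr(page.tif_name)
    # Step 3: classify pages and group consecutive pages of the same type
    # (assumed grouping rule, for illustration only).
    documents: List[Document] = []
    for page in pages:
        doc_type = classify(page.ocr_text)
        if documents and documents[-1].doc_type == doc_type:
            documents[-1].pages.append(page)
        else:
            documents.append(Document(doc_type, [page]))
    # Step 4: KV extraction on the OCR text of each classified document.
    for doc in documents:
        text = "\n".join(p.ocr_text for p in doc.pages)
        doc.fields = extract(doc.doc_type, text)
    return documents
```

Passing the stages in as functions keeps the sketch independent of any particular OCR engine or batch class configuration; the returned documents would then be serialized into the XML response.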
As mentioned above, the mobile web service uses only KV extraction to extract document-level fields. The batch class used for mobile upload processing must therefore have KV extraction rules defined for its document-level fields.
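To make the idea of a KV extraction rule concrete, the following Python sketch pairs a key pattern with a value pattern and looks for the value to the right of the key on the same line. It is a simplified illustration, not the product's implementation: `KVRule` and `extract_fields` are hypothetical names, and real rules in a batch class may support other locations and options.

```python
import re
from typing import Dict, List, NamedTuple


class KVRule(NamedTuple):
    field_name: str     # document-level field to fill
    key_pattern: str    # regex that locates the key text
    value_pattern: str  # regex the value must match


def extract_fields(ocr_text: str, rules: List[KVRule]) -> Dict[str, str]:
    """Apply each rule to the OCR text; keep the first match per field."""
    results: Dict[str, str] = {}
    for rule in rules:
        for line in ocr_text.splitlines():
            key = re.search(rule.key_pattern, line)
            if key is None:
                continue
            # Assumption: the value sits to the right of the key, same line.
            value = re.search(rule.value_pattern, line[key.end():])
            if value is not None:
                results[rule.field_name] = value.group(0)
                break
    return results


if __name__ == "__main__":
    text = "Invoice No: INV-1001\nDate: 2014-03-05\nTotal 99.50"
    rules = [
        KVRule("InvoiceNumber", r"Invoice\s+No", r"[A-Z]+-\d+"),
        KVRule("InvoiceDate", r"Date", r"\d{4}-\d{2}-\d{2}"),
    ]
    print(extract_fields(text, rules))
```

A field with no matching rule, or whose key never appears in the OCR text, is simply absent from the result, which is why a batch class without KV rules would return no document-level values to the mobile client.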