Overview:

This feature lets you control how the Document Assembler (DA) plugin adjusts its separation results at input file boundaries. The property DA File Name Boundary Classification selects the file boundary classification strategy, which is applied to the separation results generated by the DA algorithm.

 

File Boundary Classification Strategies

The three File Boundary Classification strategies are:

  1. UseDAGeneratedDocument: The classification results generated by the DA algorithm are kept unchanged.
  2. MergeDocumentsBelongingToSameFile: All pages belonging to a single multipage input file must be part of the same document. If the results generated by the DA algorithm contain a document with pages from two different input files, a new document is started at the file boundary. If the pages of a single input file are classified into two or more documents, those documents are merged so that all pages from the source file end up in one document.

Example:

Original file name   Broken file name   Page ID   Doc ID as generated by DA   Result
File1.tiff           File1-0001.tiff    PG0       DOC1                        DOC1
File1.tiff           File1-0002.tiff    PG1       DOC1                        DOC1
File1.tiff           File1-0003.tiff    PG2       DOC2                        DOC1
File1.tiff           File1-0004.tiff    PG3       DOC2                        DOC1
File2.tiff           File2.tiff         PG4       DOC2                        DOC2
File3.tiff           File3-0001.tiff    PG5       DOC3                        DOC3
File3.tiff           File3-0002.tiff    PG6       DOC3                        DOC3

Original file name   Broken file name   Page ID   Doc ID as generated by DA   Result
File1.tiff           File1-0001.tiff    PG0       DOC1                        DOC1
File1.tiff           File1-0002.tiff    PG1       DOC2                        DOC1
File1.tiff           File1-0003.tiff    PG2       DOC3                        DOC1
File1.tiff           File1-0004.tiff    PG3       DOC4                        DOC1
File2.tiff           File2.tiff         PG4       DOC5                        DOC2
File3.tiff           File3-0001.tiff    PG5       DOC6                        DOC3
File3.tiff           File3-0002.tiff    PG6       DOC7                        DOC3
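
The merge behaviour shown in the example above can be sketched in a few lines of Python. This is only an illustration, not the plugin's implementation: the function name, the (source file, DA doc ID) tuple layout, and the sequential DOCn renumbering are assumptions chosen to reproduce the Result column. It also assumes that the pages of one input file are contiguous in the batch.

```python
def merge_documents_belonging_to_same_file(pages):
    """Hypothetical sketch of the MergeDocumentsBelongingToSameFile outcome.

    pages: list of (source_file, da_doc_id) tuples in page order.
    Returns the resulting document ID for each page.
    """
    doc_for_file = {}  # source file -> resulting document ID (assumed numbering)
    result = []
    for source_file, _da_doc_id in pages:
        # Splitting at file boundaries and merging within a file leaves
        # exactly one document per input file, so the DA doc ID is not needed.
        if source_file not in doc_for_file:
            doc_for_file[source_file] = f"DOC{len(doc_for_file) + 1}"
        result.append(doc_for_file[source_file])
    return result


# Rows of the first example: (original file name, Doc ID as generated by DA)
example_pages = [
    ("File1.tiff", "DOC1"), ("File1.tiff", "DOC1"),
    ("File1.tiff", "DOC2"), ("File1.tiff", "DOC2"),
    ("File2.tiff", "DOC2"),
    ("File3.tiff", "DOC3"), ("File3.tiff", "DOC3"),
]
print(merge_documents_belonging_to_same_file(example_pages))
```

Because both rules together force one document per input file, the sketch keys only on the source file name; the DA-assigned document ID does not influence the result under this strategy.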

  3. CreateNewDocumentForDifferentFile: If a document consists of pages belonging to two different input files, a new document is created at the file boundary. Pages belonging to one source file may still be separated into multiple documents, but no document may span more than one input file.

Example:

Original file name   Broken file name   Page ID   Doc ID as generated by DA   Result
File1.tiff           File1-0001.tiff    PG0       DOC1                        DOC1
File1.tiff           File1-0002.tiff    PG1       DOC1                        DOC1
File1.tiff           File1-0003.tiff    PG2       DOC2                        DOC2
File1.tiff           File1-0004.tiff    PG3       DOC2                        DOC2
File2.tiff           File2.tiff         PG4       DOC2                        DOC3
File3.tiff           File3-0001.tiff    PG5       DOC3                        DOC4
File3.tiff           File3-0002.tiff    PG6       DOC3                        DOC4

Original file name   Broken file name   Page ID   Doc ID as generated by DA   Result
File1.tiff           File1-0001.tiff    PG0       DOC1                        DOC1
File1.tiff           File1-0002.tiff    PG1       DOC2                        DOC2
File1.tiff           File1-0003.tiff    PG2       DOC3                        DOC3
File1.tiff           File1-0004.tiff    PG3       DOC4                        DOC4
File2.tiff           File2.tiff         PG4       DOC5                        DOC5
File3.tiff           File3-0001.tiff    PG5       DOC6                        DOC6
File3.tiff           File3-0002.tiff    PG6       DOC7                        DOC7
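
The split-at-boundary behaviour of CreateNewDocumentForDifferentFile can be sketched in Python as follows. This is an illustration under stated assumptions rather than the plugin's actual code: the function name, the (source file, DA doc ID) tuple layout, and the sequential DOCn renumbering are hypothetical and were chosen to reproduce the Result column of the example.

```python
def create_new_document_for_different_file(pages):
    """Hypothetical sketch of the CreateNewDocumentForDifferentFile outcome.

    pages: list of (source_file, da_doc_id) tuples in page order.
    Returns the resulting document ID for each page.
    """
    result = []
    doc_count = 0
    previous = None  # (source_file, da_doc_id) of the preceding page
    for page in pages:
        # Start a new document when the DA separated here, or when the
        # page comes from a different input file than the previous page.
        if page != previous:
            doc_count += 1
        previous = page
        result.append(f"DOC{doc_count}")  # assumed sequential numbering
    return result


# Rows of the first example: (original file name, Doc ID as generated by DA)
boundary_pages = [
    ("File1.tiff", "DOC1"), ("File1.tiff", "DOC1"),
    ("File1.tiff", "DOC2"), ("File1.tiff", "DOC2"),
    ("File2.tiff", "DOC2"),
    ("File3.tiff", "DOC3"), ("File3.tiff", "DOC3"),
]
print(create_new_document_for_different_file(boundary_pages))
```

Unlike the merge strategy, this sketch keeps every separation the DA algorithm produced and only adds extra boundaries where the source file changes.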
