
This API extracts tables from the input HOCR page or pages. Extraction follows the table extraction rules defined for the document type in the batch class that is mapped in the input XML file.

Request Method: POST

Input Parameters

The Web Service API takes the following input parameters:

  1. One or more HOCR pages.
  2. An XML file that specifies the batch class identifier, the document type, and the HOCR pages classified under that document type.


Web Service URL: http://{serverName}:{port}/dcma/rest/batchClass/tableExtractionHOCR


Sample XML

<ExtractTableParam>
	<BatchClassId>BC4</BatchClassId>
	<Documents>
		<Document>
			<Name>Invoice-Table</Name>
			<Page>PG1_HOCR.xml</Page>
			<Page>PG2_HOCR.xml</Page>
		</Document>
	</Documents>
</ExtractTableParam>
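If the parameter file is generated rather than written by hand, the sample XML could be produced by something like the following sketch. The element names are taken from the sample; the class name ExtractTableParamBuilder and its methods are hypothetical helpers, not part of the product, and the sketch assumes a single document type per request.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the parameter XML shown in the sample.
class ExtractTableParamBuilder {

    // Escapes the characters that are not allowed in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds the parameter XML for one document type and its HOCR pages.
    static String build(String batchClassId, String documentName, List<String> pages) {
        StringBuilder xml = new StringBuilder();
        xml.append("<ExtractTableParam>\n");
        xml.append("  <BatchClassId>").append(escape(batchClassId)).append("</BatchClassId>\n");
        xml.append("  <Documents>\n");
        xml.append("    <Document>\n");
        xml.append("      <Name>").append(escape(documentName)).append("</Name>\n");
        for (String page : pages) {
            xml.append("      <Page>").append(escape(page)).append("</Page>\n");
        }
        xml.append("    </Document>\n");
        xml.append("  </Documents>\n");
        xml.append("</ExtractTableParam>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Reproduces the sample: batch class BC4, document type Invoice-Table, two pages.
        System.out.println(build("BC4", "Invoice-Table",
                Arrays.asList("PG1_HOCR.xml", "PG2_HOCR.xml")));
    }
}
```

The returned string can be written to a file such as sample.xml and sent as one part of the multipart request.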


  1. Table extraction is performed only if the Table Extraction switch is ON.
  2. Every HOCR page mapped in the input XML must be sent as an input parameter to the Web Service to initiate table extraction.
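The second requirement can be checked on the client before the request is sent. The sketch below is one possible pre-flight check: it reads the Page entries from the parameter XML and reports any that are missing from the list of files about to be uploaded. It assumes that each Page value equals the name of the uploaded file, which the sample suggests but the service may not strictly require. The class MappedPageCheck and its method are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: verifies that every mapped HOCR page is in the upload list.
class MappedPageCheck {

    // Returns the <Page> names in the parameter XML that are absent from the upload list.
    static List<String> missingPages(String paramXml, List<String> uploadFileNames) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(paramXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Parameter XML could not be parsed.", e);
        }
        Set<String> uploads = new HashSet<>(uploadFileNames);
        List<String> missing = new ArrayList<>();
        NodeList pages = doc.getElementsByTagName("Page");
        for (int i = 0; i < pages.getLength(); i++) {
            String name = pages.item(i).getTextContent().trim();
            if (!uploads.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String xml = "<ExtractTableParam><BatchClassId>BC4</BatchClassId><Documents><Document>"
                + "<Name>Invoice-Table</Name><Page>PG1_HOCR.xml</Page><Page>PG2_HOCR.xml</Page>"
                + "</Document></Documents></ExtractTableParam>";
        // Only the first page is queued for upload, so the second is reported as missing.
        System.out.println(missingPages(xml, Arrays.asList("PG1_HOCR.xml")));
    }
}
```

An empty result means every mapped page has a matching file; any other result lists the pages that still need to be attached.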

Sample client code using Apache Commons HttpClient:

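import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.methods.multipart.FilePart;
import org.apache.commons.httpclient.methods.multipart.MultipartRequestEntity;
import org.apache.commons.httpclient.methods.multipart.Part;

private static void tableExtractionHOCR() {
	HttpClient client = new HttpClient();
	String url = "http://localhost:8080/dcma/rest/batchClass/tableExtractionHOCR";
	PostMethod mPost = new PostMethod(url);
	// Files to send: the parameter XML and the HOCR page it maps.
	File file1 = new File("C:\\sample\\sample.xml");
	File file2 = new File("C:\\sample\\US-Invoice_HOCR.xml");
	Part[] parts = new Part[2];
	try {
		parts[0] = new FilePart(file1.getName(), file1);
		parts[1] = new FilePart(file2.getName(), file2);
		MultipartRequestEntity entity = new MultipartRequestEntity(parts, mPost.getParams());
		// Attach the multipart body; without this call the POST is sent empty.
		mPost.setRequestEntity(entity);
		int statusCode = client.executeMethod(mPost);
		if (statusCode == 200) {
			System.out.println("Web service executed successfully.");
			String responseBody = mPost.getResponseBodyAsString();
			// Print the result returned in the response body.
			System.out.println(statusCode + " *** " + responseBody);
		} else if (statusCode == 403) {
			System.out.println("Invalid username/password.");
		} else {
			// Any other status: print the code and the server's response.
			System.out.println(statusCode + " *** " + mPost.getResponseBodyAsString());
		}
	} catch (FileNotFoundException e) {
		System.err.println("File not found for processing.");
	} catch (HttpException e) {
		System.err.println("HTTP protocol error: " + e.getMessage());
	} catch (IOException e) {
		System.err.println("I/O error while calling the web service: " + e.getMessage());
	} finally {
		if (mPost != null) {
			// Release the connection back to the connection manager.
			mPost.releaseConnection();
		}
	}
}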
