This API extracts document level fields for the key-value pattern supplied in the input XML. It takes an HOCR file as input. If the key-value pattern is not found in the HOCR file, the API creates empty document level fields.

Request Method: POST

Input Parameters

Input Parameter | Values | Descriptions
AdvancedKV | Either "true" or "false" | Specifies whether key-value extraction is performed using advanced key-value extraction.
LocationType | One of: TOP, RIGHT, LEFT, BOTTOM, TOP_RIGHT, TOP_LEFT, BOTTOM_LEFT, BOTTOM_RIGHT | The location, relative to the key pattern, at which the value pattern is fetched.
NoOfWords | Integer | Used only when AdvancedKV is false. Number of words at the RIGHT location to add to the result of the value pattern found in the HOCR.
KeyPattern | Must not be empty; must be a valid regular expression | Used to verify that the key pattern is present in the given HOCR.
ValuePattern | Must not be empty; must be a valid regular expression | Used to verify that the value pattern is present in the given HOCR for that particular key pattern.
KVFetchValue | One of: ALL, FIRST, LAST | Specifies whether to fetch all, the first, or the last value pattern found.
Multiplier | Float between 0 and 1 | Multiplied with the confidence to update the confidence of the fields extracted using advanced KV.
Length | Integer | Obtain the length value from the Ephesoft Admin screen, as displayed in the screenshot above.
Width | Integer | Obtain the width value from the Ephesoft Admin screen, as displayed in the screenshot above.
Xoffset | Integer | Obtain the xoffset value from the Ephesoft Admin screen, as displayed in the screenshot above.
Yoffset | Integer | Obtain the yoffset value from the Ephesoft Admin screen, as displayed in the screenshot above.
Weight | Float between 0 and 1 | Sets the weightage of an extraction rule for a particular document level field.
KeyFuzziness | Float between 0 and 1 | Defines the acceptable fuzziness in the key generated from the HOCR.
hocrFileName | String | Name of the HOCR file, in XML format, that is passed for processing.

Along with these parameters, the string parameter hocrFileName must also be supplied, containing the name of the uploaded HOCR file.
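
The value constraints listed for the input parameters can be checked on the client side before a request is sent. The following is a minimal sketch of such checks, not part of the web service itself: the class ExtractKVParamCheck and its method names are hypothetical, and it assumes that "between 0 and 1" includes both end points (the sample XML uses a Multiplier of 1) and that patterns follow Java regular expression syntax.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical client-side checks mirroring the documented parameter constraints. */
final class ExtractKVParamCheck {

    // Allowed values copied from the parameter descriptions.
    static final List<String> LOCATION_TYPES = Arrays.asList(
            "TOP", "RIGHT", "LEFT", "BOTTOM",
            "TOP_RIGHT", "TOP_LEFT", "BOTTOM_LEFT", "BOTTOM_RIGHT");
    static final List<String> FETCH_VALUES = Arrays.asList("ALL", "FIRST", "LAST");

    /** True when the pattern is non-empty and compiles as a Java regular expression. */
    static boolean isValidPattern(String pattern) {
        if (pattern == null || pattern.isEmpty()) {
            return false;
        }
        try {
            Pattern.compile(pattern);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    /** True when the value lies in [0, 1]; inclusive bounds are an assumption. */
    static boolean isUnitFloat(float value) {
        return value >= 0f && value <= 1f;
    }

    static boolean isValidLocationType(String value) {
        return LOCATION_TYPES.contains(value);
    }

    static boolean isValidFetchValue(String value) {
        return FETCH_VALUES.contains(value);
    }
}
```

For example, ExtractKVParamCheck.isValidPattern("Invoice") accepts the KeyPattern used in the sample XML, while an empty string or an unclosed character class such as "[a-z" is rejected.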

Web Service URL: http://{serverName}:{port}/dcma/rest/extractKV

CheckList:

  • To use AdvancedKV, the user should have admin access in order to fetch accurate values for Length, Width, Xoffset and Yoffset. Before using AdvancedKV, test the image in the Ephesoft Admin screen and note the values of Length, Width, Xoffset, Yoffset and LocationType for the particular key-value pattern.
  • If AdvancedKV is true, NoOfWords is not used and all other parameters are used.
  • If AdvancedKV is false, only NoOfWords, KeyPattern, ValuePattern and LocationType are used.
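
The last two checklist rules can be written down as a small lookup. This is only an illustrative sketch: the class ExtractKVParamUsage and the method paramsInEffect are hypothetical names, and "all other parameters" is interpreted here as the remaining elements of the Params block, which is an assumption about the service.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper listing which Params elements are used for each AdvancedKV setting. */
final class ExtractKVParamUsage {

    static Set<String> paramsInEffect(boolean advancedKV) {
        if (advancedKV) {
            // AdvancedKV = true: NoOfWords is not used, everything else is.
            return new LinkedHashSet<>(Arrays.asList(
                    "LocationType", "KeyPattern", "ValuePattern", "KVFetchValue",
                    "Multiplier", "Length", "Width", "Xoffset", "Yoffset",
                    "Weight", "KeyFuzziness"));
        }
        // AdvancedKV = false: only these four parameters take effect.
        return new LinkedHashSet<>(Arrays.asList(
                "NoOfWords", "KeyPattern", "ValuePattern", "LocationType"));
    }
}
```

A client could use such a lookup to warn when, for instance, NoOfWords is set to a non-zero value while AdvancedKV is true.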

Format for XML:

<ExtractKVParams>
	<Params>
		<AdvancedKV>true</AdvancedKV>
		<LocationType>BOTTOM_LEFT</LocationType>
		<NoOfWords>0</NoOfWords>
		<KeyPattern>Invoice</KeyPattern>
		<ValuePattern> [a-zA-Z]{10,15}</ValuePattern>
		<KVFetchValue>ALL</KVFetchValue>
		<Multiplier>1</Multiplier>
		<Length>384</Length>
		<Width>251</Width>
		<Xoffset>284</Xoffset>
		<Yoffset>105</Yoffset>
		<Weight>0.1</Weight>
		<KeyFuzziness>0.2</KeyFuzziness>
	</Params>
</ExtractKVParams>
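
The parameter XML can also be generated in code instead of being written by hand. The sketch below uses only the JDK's built-in DOM and transformer classes; the class name ExtractKVXmlBuilder is hypothetical. It emits the same element names as the format above and omits the XML declaration because the sample has none; whether the service accepts or requires a declaration is not stated here.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper that serializes parameters into the ExtractKVParams layout. */
final class ExtractKVXmlBuilder {

    /** Builds the XML; pass an ordered map (e.g. LinkedHashMap) to keep element order. */
    static String build(Map<String, String> params) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("ExtractKVParams");
            doc.appendChild(root);
            Element paramsElement = doc.createElement("Params");
            root.appendChild(paramsElement);

            // One child element per parameter, e.g. <AdvancedKV>true</AdvancedKV>.
            for (Map.Entry<String, String> entry : params.entrySet()) {
                Element child = doc.createElement(entry.getKey());
                child.setTextContent(entry.getValue());
                paramsElement.appendChild(child);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build ExtractKVParams XML", e);
        }
    }
}
```

The returned string can be written to a file such as ExtractKVParam.xml and attached to the request as the XML part.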

 

Sample client code using Apache Commons HttpClient:

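// Imports required by this method (Apache Commons HttpClient 3.x).
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.methods.multipart.FilePart;
import org.apache.commons.httpclient.methods.multipart.MultipartRequestEntity;
import org.apache.commons.httpclient.methods.multipart.Part;
import org.apache.commons.httpclient.methods.multipart.StringPart;

private static void extractKV() {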
		HttpClient client = new HttpClient();
		String url = "http://localhost:8080/dcma/rest/extractKV";
		PostMethod mPost = new PostMethod(url);
		// Adding XML for the input.
		File f1 = new File("C:\\sample\\ExtractKVParam.xml");
		// Adding HOCR for processing.
		File f2 = new File("C:\\sample\\US-Invoice_HOCR.xml");
		Part[] parts = new Part[3];
		try {
			parts[0] = new FilePart(f1.getName(), f1);
			parts[1] = new FilePart(f2.getName(), f2);
			parts[2] = new StringPart("hocrFileName", f2.getName());
			MultipartRequestEntity entity = new MultipartRequestEntity(parts, mPost.getParams());
			mPost.setRequestEntity(entity);
			int statusCode = client.executeMethod(mPost);
			if (statusCode == 200) {
				System.out.println("Web service executed successfully.");
				String responseBody = mPost.getResponseBodyAsString();
				// Generating result as responseBody.
				System.out.println(statusCode + " *** " + responseBody);
			} else if (statusCode == 403) {
				System.out.println("Invalid username/password.");
			} else {
				System.out.println(mPost.getResponseBodyAsString());
			}
		} catch (FileNotFoundException e) {
			System.err.println("File not found for processing.");
		} catch (HttpException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		} finally {
			if (mPost != null) {
				mPost.releaseConnection();
			}
		}
}
