
What’s New In Transact 4.5?


Extraction | Custom REST API Lookup Guide

 

Quick Reference

Introduction

Prerequisites

Configure REST API Lookup using the Web Service URL

Configure REST API Lookup using the custom JAR file

REST API Lookup on the Validation screen

Configuring XPath/JSONPath

 

Introduction

Ephesoft Transact 4.5.0.0 supports REST API Lookup, which extracts the document-level fields of a document based on a Web Service response. With REST API Lookup, there is no need to provide direct access to databases or to write special scripts for REST API based extraction. All configuration can be done directly from the UI.

This feature is implemented as a separate extraction plugin at the Batch Class level, a configuration section at the Document Type level, and a lookup function on the Validation screen. It provides the same user experience as the Fuzzy DB Lookup. However, rather than extracting the Index Fields directly from a database, it passes the configured values to a REST Web Service in the customer environment, and the Web Service returns an XML/JSON response to Ephesoft Transact. The application parses the response and extracts the required information according to the mapped XPath/JSONPath strings. The extracted values are then mapped to the Index Fields created for the Document Types.

REST API Lookup can be configured in two ways:

  • REST URL based lookup, where the user provides the REST URL, the parameter and field mappings, and the authentication details (the username and password required to access the Web Service). The returned XML/JSON response is then used to extract values based on the configured XPath or JSONPath.
  • JAR based lookup, where the user provides a custom implementation for authentication and API invocation. The custom JAR should return the Web Service response (XML/JSON) to Ephesoft Transact. The response is then used to extract values based on the specified XPath or JSONPath.

REST API Lookup is implemented as a group-based lookup, much like the Fuzzy DB extraction. The user can create multiple groups for a Document Type, each with a different URL/JAR and a different set of mapped columns. Each group can be assigned a specific weight, which is used to select the primary value from the results of separate groups. If no weights are specified, the first result from the first group is selected as the primary value.

If REST API Lookup is used along with Key-Value Extraction Rules, the system first extracts the values based on the configured KV Extraction and then uses this data to map and look up the remaining values using the Web Service.

 

Prerequisites

1. In the Extraction module, drag and drop the REST_API_LOOKUP Plugin into the Selected Plugins field. Note that it should always be placed after the KEY_VALUE_EXTRACTION plugin.


2. On the REST_API_LOOKUP Plugin Configuration screen, make sure that the Rest API Lookup Plugin Switch is ON.

 

Configure REST API Lookup using the Web Service URL

1. Navigate to the REST API Lookup tab which is available for each Document Type.


2. Click Add to add a new group. The left-side menu is disabled, and the REST API Lookup Configuration screen is displayed.

3. On the REST API Lookup Configuration screen, provide the Group Name, and select the REST URL Lookup option.


4. Fill in the Connection Details section.

Field: Description
Response Type: Drop-down field that indicates whether the REST API returns an XML or JSON response. Required for parsing the response from the Web Service.
Request Method: Drop-down field that indicates a GET or POST request.
Authentication Type: Drop-down field with two authentication options:

  • NONE
  • BASIC AUTHENTICATION (requires username and password)
Username: Username used to access the Web Service. Mandatory if the BASIC AUTHENTICATION type is selected.
Password: Password used to access the Web Service. Mandatory if the BASIC AUTHENTICATION type is selected.
Connection URL: URL used to connect to the Web Service. The Connection URL field has auto-suggestion functionality that suggests path parameters such as BatchClassID, BatchInstanceID, DocumentID and Index Field names in the following format: http://www.dummy.com/rest/${BatchClassID}/${DocumentID}/${InvoiceNo}

The URL can have both dynamic and static path parameters. Press $ to see the drop-down list of available dynamic parameters.

When you select an item from the drop-down list, it is enclosed in ${} to denote its special behavior.
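The placeholder behavior described above can be pictured with a short Java sketch. It only illustrates how ${...} path parameters might be substituted before a request is sent; the UrlTemplate class, its expand method and the sample values are hypothetical and are not part of Ephesoft Transact.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the Ephesoft Transact API. */
final class UrlTemplate {

    // Matches ${Name} placeholders such as ${BatchClassID} or ${InvoiceNo}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    private UrlTemplate() { }

    /** Replaces each ${Name} with its value; unknown names are left untouched. */
    static String expand(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("BatchClassID", "BC1");      // sample values, assumed
        values.put("DocumentID", "DOC1");
        values.put("InvoiceNo", "ABC-123");
        // prints http://www.dummy.com/rest/BC1/DOC1/ABC-123
        System.out.println(expand(
                "http://www.dummy.com/rest/${BatchClassID}/${DocumentID}/${InvoiceNo}", values));
    }
}
```

The sketch inserts values as-is. A real implementation would probably also need to escape characters that are not valid in a URL path.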

 

5. Fill in the Parameters Mappings section. To add a parameter, click +; to delete a parameter, click –.

Field: Description
Parameter Name: Name of the parameter, used as a query parameter if the request method is GET, or as a body parameter if the request method is POST. The parameter name must match the parameter name configured in the Web Service. All parameter names should be unique.
Parameter Value: The value of a parameter can be selected from the drop-down list, which includes default values such as ${BATCH_CLASS_ID}, ${BATCH_CLASS_NAME}, ${BATCH_INSTANCE_ID} and ${DOCUMENT_TYPE}, as well as the names of all Index Fields configured for the Document Type. Press $ to see the drop-down list of available dynamic parameters. You can also type any static value directly in this field.

Note: Combining static and dynamic parameters in this field is not supported.

Note: Parameters that are mapped against configured Index Fields and contain dynamic values are displayed on the Validation screen with the REST API Lookup icon, which allows the user to look up available values using the Web Service.

Consider an example. Suppose the configured parameters are BatchID, DocID and InvNo, mapped to ${BATCH_CLASS_ID}, ${DOCUMENT_TYPE} and ${Invoice No} respectively.

Conventionally, if the GET method is used, the parameters are appended to the Web Service URL as a query string:

http://www.dummy.com?BatchID=${BATCH_CLASS_ID}&DocID=${DOCUMENT_TYPE}&InvNo=${Invoice No}

If the POST method is used, the Web Service URL is http://www.dummy.com and the same parameters are included in the body of the request:

BatchID=${BATCH_CLASS_ID}

DocID=${DOCUMENT_TYPE}

InvNo=${Invoice No}

However, you can include query parameters in the URL or use them in the body for both methods.
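The difference between the two request methods can be sketched in Java as follows. This is an illustration only: the LookupRequestSketch class and the sample values are hypothetical, and it assumes that POST parameters are sent as form-encoded name=value pairs, which this guide does not state.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper for illustration; not part of the Ephesoft Transact API. */
final class LookupRequestSketch {

    private LookupRequestSketch() { }

    /** Encodes the parameters as name=value pairs joined by '&'. */
    static String encode(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                joiner.add(URLEncoder.encode(e.getKey(), "UTF-8")
                        + "=" + URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
        return joiner.toString();
    }

    /** GET: the parameters travel in the query string of the URL. */
    static String getUrl(String baseUrl, Map<String, String> params) {
        String query = encode(params);
        return query.isEmpty() ? baseUrl : baseUrl + "?" + query;
    }

    /** POST: the URL stays unchanged and the same pairs form the request body. */
    static String postBody(Map<String, String> params) {
        return encode(params);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("BatchID", "BC1");       // resolved ${BATCH_CLASS_ID}, sample value
        params.put("DocID", "Invoice");     // resolved ${DOCUMENT_TYPE}, sample value
        params.put("InvNo", "ABC-123");     // resolved ${Invoice No}, sample value
        // prints http://www.dummy.com?BatchID=BC1&DocID=Invoice&InvNo=ABC-123
        System.out.println(getUrl("http://www.dummy.com", params));
        // prints BatchID=BC1&DocID=Invoice&InvNo=ABC-123
        System.out.println(postBody(params));
    }
}
```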

 

6. Finally, fill in the Index Field Mappings section. To add an Index Field, click +; to delete an Index Field, click –.

Field: Description
Index Field: Index Field to be mapped against the Web Service metadata. The drop-down list includes all Index Fields configured for the Document Type. Each Index Field should be unique.

If an Index Field is renamed or updated, the corresponding mapping is deleted and has to be created again using the updated Index Field.

Value (XPath/JSON path): XPath mapped to the Index Field if the selected response type is XML, or JSONPath if the selected response type is JSON. It is recommended to test the configured XPath/JSONPath using online evaluation/testing tools. Use the Help icon to get detailed information about XPath/JSONPath configuration and testing, or see Configuring XPath/JSONPath.

Sample XPath: /List/item/invoiceNumber

Sample JSONPath: $.[0].invoiceNumber

Confidence (XPath/JSON path): Like the Value field, this field takes an XPath/JSONPath as input. The XPath/JSONPath provided in this field is used to extract the confidence value for the extracted field.

The Confidence value in an Index Field mapping is optional. Note the following points:

  • If no value is provided, 100 is taken as the confidence of the extracted value.
  • If the confidence path is incorrectly defined, 0% confidence is returned.
  • If the confidence values for all fields are between 0 and 1, they are normalized by multiplying each value by 100.

The confidence range can be defined from 1 to 100 or from 0 to 1. The system assumes the 1-100 range. However, if it detects that the 0-1 range is used, it multiplies the confidence value by 100 so that both cases are handled consistently.
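The confidence rules above can be summarized in a small Java sketch. It is one possible reading of those rules rather than the product's actual logic: the ConfidenceNormalizer class is hypothetical, and a null entry is assumed to stand for a field with no Confidence path configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Ephesoft Transact API. */
final class ConfidenceNormalizer {

    private ConfidenceNormalizer() { }

    /**
     * Returns confidences on a 0-100 scale.
     * A null entry means that no Confidence path was configured for that field.
     */
    static List<Double> normalize(List<Double> raw) {
        // Scale only when every extracted confidence lies between 0 and 1.
        boolean allFractional = true;
        for (Double c : raw) {
            if (c != null && (c < 0.0 || c > 1.0)) {
                allFractional = false;
            }
        }
        List<Double> result = new ArrayList<>();
        for (Double c : raw) {
            if (c == null) {
                result.add(100.0);                     // no value provided: assume 100
            } else {
                result.add(allFractional ? c * 100.0 : c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [50.0, 100.0, 25.0]
        System.out.println(normalize(Arrays.asList(0.5, null, 0.25)));
        // prints [80.0, 20.0]
        System.out.println(normalize(Arrays.asList(80.0, 20.0)));
    }
}
```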

 

Note: The confidence for any field is the product of the confidence value from the XPath/JSONPath mapping and the weight of the REST API Lookup group.

Suppose the REST API Lookup groups are configured as follows:

Group 1: Weight = 0.5

Index Field     Field Value        Confidence
Invoice No      ABC-123            80%
Company Name    XYZ Inc.           20%

Group 2: Weight = 0.75

Index Field     Field Value        Confidence
Invoice No      DLF-456            50%
Company Name    ABC Enterprises    (not extracted)

Then the calculation is as follows:

Invoice No:

Confidence = MAX (0.5*80, 0.75*50) = MAX (40, 37.5) = 40, so the field value is “ABC-123” with a confidence of 40%

Company Name:

Confidence = MAX (0.5*20, 0.75*100) = MAX (10, 75) = 75, so the field value is “ABC Enterprises” with a confidence of 75%

In Group 2, no confidence value is extracted from the XML/JSON response for the Company Name field, which is why its confidence is assumed to be 100%.

The field value with the highest confidence is set as the Document Level Field.
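The worked example above can be reproduced with a few lines of Java. This is a sketch of the stated rule (weight multiplied by confidence, highest product wins, the first group wins a tie), not the product's source code; the WeightedLookupSketch and Candidate names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Ephesoft Transact API. */
final class WeightedLookupSketch {

    /** One group's result for a single Index Field. */
    static final class Candidate {
        final String value;
        final double groupWeight;  // 0.1 to 1.0
        final double confidence;   // 0 to 100; 100 when the response supplies none

        Candidate(String value, double groupWeight, double confidence) {
            this.value = value;
            this.groupWeight = groupWeight;
            this.confidence = confidence;
        }

        double score() {
            return groupWeight * confidence;
        }
    }

    private WeightedLookupSketch() { }

    /** Returns the candidate with the highest score; on a tie the earlier group is kept. */
    static Candidate pick(List<Candidate> candidatesInGroupOrder) {
        Candidate best = null;
        for (Candidate c : candidatesInGroupOrder) {
            if (best == null || c.score() > best.score()) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Candidate invoiceNo = pick(Arrays.asList(
                new Candidate("ABC-123", 0.5, 80),
                new Candidate("DLF-456", 0.75, 50)));
        Candidate company = pick(Arrays.asList(
                new Candidate("XYZ Inc.", 0.5, 20),
                new Candidate("ABC Enterprises", 0.75, 100)));
        System.out.println(invoiceNo.value + " " + invoiceNo.score()); // prints ABC-123 40.0
        System.out.println(company.value + " " + company.score());     // prints ABC Enterprises 75.0
    }
}
```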

 

7. Now, test your configuration by clicking Test Connection.

Note: If any of the fields are left empty or contain invalid values, an error message is displayed and the missing/invalid field(s) are highlighted.

 

8. If all the fields are properly configured, a dialogue window is displayed with the list of all configured parameters. To test the connection, provide sample values for the parameters, and click Test.


If you configure two, three or more parameters in your Web Service, all of them will be displayed in this window. You can provide the value for any or all of them to test the connection.

Note: Depending on your Web Service, you might be able to test the connection even if there are no parameters provided. In that case, the dialogue window with Parameter Values and Sample Values will be empty, and you can proceed by simply clicking Test to ensure that the application can establish connection with the Web Service.

 

9. Confirm the Web Service response that will appear in a new popup window.


Note: In case the REST API returns an error due to authentication failure, server not responding etc., the relevant message and error code will be displayed in the response popup.

 

10. Click Apply Mappings to save your configuration.


 

11. Click Apply on the REST API Lookup screen to save the group.

The REST API Lookup screen will contain a list of all configured REST API Lookup groups with the following columns:

Field: Description
Group Name: Name of the REST API Lookup group. The Group Name is saved after you click Apply on the REST API Lookup screen. Once saved, the group name cannot be edited.
Connection URL/JAR Name: URL used to connect to the Web Service, or the name of the uploaded JAR file.
Field Mappings: List of mapped Index Fields, separated by semicolons.
Weight: Used to select the primary value from the results of separate groups. If no weight is specified or all the groups are assigned the same weight, the first result from the first group is selected as the primary value. The weight can range from 0.1 to 1.0, with 0.1 being the lowest weight and 1.0 the highest. The default value is 0.5.
Enabled: This switch enables or disables the REST API group mappings. If the group is disabled, it is not used when performing the REST API Lookup.

 

Here, you can Add new groups as well as Edit or Delete already configured groups.

 

Limitations of the REST URL-based Lookup:

1. The use of HTTP headers and cookies in the request body is not supported.

2. The only supported authentication method is Basic Authentication.

3. Only GET/POST requests are supported.

All these limitations can be handled through the REST API Custom Jar Lookup described below.

 

Configure REST API Lookup using the custom JAR file

1. Create the custom JAR for REST API Lookup.

Implement the following interface and upload the class via JAR.


import java.util.List;
import java.util.Map;

import com.ephesoft.dcma.batch.schema.Batch;

/**
 * This interface should be implemented for the "Custom Lookup" functionality of REST API lookup based extraction.
 * The implementation class should handle the authentication and the Web Service lookup.
 *
 * @author Ephesoft
 * @since 4.5.0.0
 */
public interface CustomLookup {

    /**
     * Used for the lookup for a particular "Group Name" and "Document Type". The method should handle the
     * invocation of the Web Service and provide a response in XML/JSON format to be parsed by the REST Lookup.
     *
     * @param groupName name of the group currently being executed
     * @param documentType name of the document type for which extraction is being done
     * @param batch batch.xml to fetch any parameters/variables required during lookup
     * @return XML/JSON response
     */
    public String executeLookup(final String groupName, final String documentType, final Batch batch);

    /**
     * Used by the "Test Connection" functionality to fetch any parameters/variables that are required
     * during custom lookup.
     *
     * @return list of parameters/variables to be used during the lookup
     */
    public List<String> getTestParameters();

    /**
     * Used by the "Test Connection" functionality after sample values are provided for the parameters required
     * for custom lookup. The method should test the lookup functionality based on the map of parameters and
     * sample values from the test connection grid. It should provide the XML/JSON response to be shown in the
     * test connection response.
     * The map is created from the list of parameters: each key is a parameter name and each value is the
     * sample value provided by the user in the Test Connection grid.
     *
     * @param parameters map of parameter names and their sample values provided from the test connection grid
     * @return XML/JSON response corresponding to the map of parameters in input and lookup response
     */
    public String testConnection(final Map<String, String> parameters);

}

The Web Service implementation has to include the Web Service URL, the return type, the GET/POST method reference and the authentication details (if required).

To use the Test Connection feature:

  • in the testConnection method, provide the URL that will be used to connect to the Web Service;
  • in the getTestParameters method, specify the parameters that will be used to fetch the response.

All three methods have to be implemented in the JAR. However, if you do not want to use the Test Connection functionality, you can leave the getTestParameters and testConnection methods as minimal stubs (a Java method that declares a return type must still return a value).
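A minimal implementation might look like the sketch below. Everything specific in it is an assumption made for illustration: the endpoint http://www.dummy.com/rest/orders, the PONumber parameter, the readPoNumber placeholder and the use of HttpURLConnection. The Batch class and the CustomLookup interface are redeclared locally only so that the sketch compiles on its own. A real JAR would use the Ephesoft types instead and would normally place the class in a package, such as com.ephesoft.rest.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Stand-ins so that this sketch compiles on its own. In a real JAR, remove both
// declarations and use com.ephesoft.dcma.batch.schema.Batch and the CustomLookup
// interface supplied by Ephesoft instead.
class Batch { }

interface CustomLookup {
    String executeLookup(String groupName, String documentType, Batch batch);
    List<String> getTestParameters();
    String testConnection(Map<String, String> parameters);
}

class RestCustomLookup implements CustomLookup {

    // Hypothetical endpoint and parameter name, used for illustration only.
    static final String SERVICE_URL = "http://www.dummy.com/rest/orders";
    static final String PARAMETER = "PONumber";

    @Override
    public String executeLookup(String groupName, String documentType, Batch batch) {
        return fetch(buildUrl(readPoNumber(batch)));
    }

    @Override
    public List<String> getTestParameters() {
        return Collections.singletonList(PARAMETER);
    }

    @Override
    public String testConnection(Map<String, String> parameters) {
        String sample = parameters.get(PARAMETER);
        return fetch(buildUrl(sample == null ? "" : sample));
    }

    /** Placeholder: how the value is read from the batch depends on your batch class. */
    static String readPoNumber(Batch batch) {
        return "";
    }

    static String buildUrl(String poNumber) {
        try {
            return SERVICE_URL + "?" + PARAMETER + "=" + URLEncoder.encode(poNumber, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Sends a GET request and returns the response body as text. */
    static String fetch(String url) {
        try {
            HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
            connection.setRequestMethod("GET");
            // Custom headers, tokens or cookies could be set here.
            connection.setRequestProperty("Accept", "application/json");
            try (InputStream in = connection.getInputStream();
                 Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
                return scanner.hasNext() ? scanner.next() : "";
            } finally {
                connection.disconnect();
            }
        } catch (IOException e) {
            throw new IllegalStateException("Lookup request failed: " + url, e);
        }
    }
}
```

Keeping the URL construction in a separate buildUrl method lets executeLookup and testConnection share the same request logic, so a successful Test Connection exercises the same path that is used during extraction.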

2. In Ephesoft Transact, navigate to the REST API Lookup tab which is available for each Document Type.


3. Click Add to add a new group. The left-side menu is disabled, and the REST API Lookup Configuration screen is displayed.

4. On the REST API Lookup Configuration screen, provide the Group Name, and select the Custom Lookup option.


5. Fill in the Connection Details section and upload the Custom JAR file.


Field: Description
Response Type: Drop-down field that indicates whether the REST API returns an XML or JSON response. Required for parsing the response from the Web Service.
Fully-Qualified Class Name: Package name + class name. For example, if the package name is com.ephesoft.rest and the class name is RestCustomLookup, the fully-qualified class name is com.ephesoft.rest.RestCustomLookup.
Upload JAR File: File upload browser used to select the custom JAR for REST API lookup, which includes details about the authentication method and parameters.

 

6. Fill in the Index Field Mappings section. To add an Index Field, click +; to delete an Index Field, click –.

Field: Description
Index Field: Index Field to be mapped against the Web Service metadata. The drop-down list includes all Index Fields configured for the Document Type. Each Index Field should be unique.

If an Index Field is renamed or updated, the corresponding mapping is deleted and has to be created again using the updated Index Field.

Value (XPath/JSON path): XPath mapped to the Index Field if the selected response type is XML, or JSONPath if the selected response type is JSON. It is recommended to test the configured XPath/JSONPath using online evaluation/testing tools. Use the Help icon to get detailed information about XPath/JSONPath configuration and testing, or see Configuring XPath/JSONPath.

Sample XPath: /List/item/invoiceNumber

Sample JSONPath: $.[0].invoiceNumber

Confidence (XPath/JSON path): Like the Value field, this field takes an XPath/JSONPath as input. The XPath/JSONPath provided in this field is used to extract the confidence value for the extracted field.

The Confidence value in an Index Field mapping is optional. Note the following points:

  • If no value is provided, 100 is taken as the confidence of the extracted value.
  • If the confidence values for all fields are between 0 and 1, they are normalized by multiplying each value by 100.

The confidence range can be defined from 1 to 100 or from 0 to 1. The system assumes the 1-100 range. However, if it detects that the 0-1 range is used, it multiplies the confidence value by 100 so that both cases are handled consistently.

Note: The confidence for any field is the product of the confidence value from the XPath/JSONPath mapping and the weight of the REST API Lookup group.

Suppose the REST API Lookup groups are configured as follows:

Group 1: Weight = 0.5

Index Field     Field Value        Confidence
Invoice No      ABC-123            80%
Company Name    XYZ, Inc.          20%

Group 2: Weight = 0.75

Index Field     Field Value        Confidence
Invoice No      DLF-456            50%
Company Name    ABC Enterprises    (not extracted)

Then the calculation is as follows:

Invoice No:

Confidence = MAX (0.5*80, 0.75*50) = MAX (40, 37.5) = 40, so the field value is “ABC-123” with a confidence of 40%

Company Name:

Confidence = MAX (0.5*20, 0.75*100) = MAX (10, 75) = 75, so the field value is “ABC Enterprises” with a confidence of 75%

In Group 2, no confidence value is extracted from the XML/JSON response for the Company Name field, which is why its confidence is assumed to be 100%.

 

7. Now, test your configuration by clicking Test Connection.

Note: If any of the UI fields are left empty or contain invalid values, an error message is displayed and the missing/invalid field(s) are highlighted.

 

8. If everything is properly configured, the system calls the URL provided in the testConnection method, and you see a dialogue window with the list of all parameters included in the getTestParameters method.

Suppose you have configured the testConnection method with your Web Service URL and included the “PONumber” parameter in the getTestParameters method.

When you click Test Connection, the system connects to the specified URL and displays a pop-up window that lists the PONumber parameter.

To test the connection, provide a sample value for the parameter that you used in your Web Service, and click Test.

Notes:

  • When defining parameters in the getTestParameters method, make sure to provide the exact parameter name as configured in your Web Service to successfully use the Test Connection functionality.
  • Depending on your Web Service, you can test the connection even if there are no parameters provided. In that case, the dialogue window with Parameter Values and Sample Values will be empty and you can proceed by simply clicking Test to ensure that the application can establish connection with the Web Service.

 

9. Confirm the Web Service response that will appear in a new popup window. Depending on the configuration, the response field will or will not contain data.

Note: If the REST API returns an error due to authentication failure, server not responding etc., the relevant message and error code are displayed in the response popup.

10. Click Apply Mappings to save your configuration.


11. Click Apply on the REST API Lookup screen to save the group.


The REST API Lookup screen will contain a list of all configured REST API Lookup groups with the following columns:

Field: Description
Group Name: Name of the REST API Lookup group. The Group Name is saved after you click Apply on the REST API Lookup screen. Once saved, the group name cannot be edited.
Connection URL/JAR Name: URL used to connect to the Web Service, or the name of the uploaded JAR file.
Field Mappings: List of mapped Index Fields, separated by semicolons.
Weight: Used to select the primary value from the results of different groups. If no weight is specified or all the groups are assigned the same weight, the first result from the first group is selected as the primary value. The weight can range from 0.1 to 1.0, with 0.1 being the lowest weight and 1.0 the highest. The default value is 0.5.
Enabled: This switch enables or disables the REST API group mappings. If the group is disabled, it is not used when performing the REST API Lookup.

 

Here, you can Add new groups as well as Edit or Delete already configured groups.

 

REST API Lookup on the Validation screen

When your batch reaches the Validation stage, you can perform the lookup using the configured Web Service. On the Validation screen, a Lookup icon is displayed next to the Index Fields that are used as parameters in your REST API Lookup configuration. Clicking the icon opens a pop-up window with search functionality for the looked-up Index Field.

1. Perform REST API Lookup.

You can type the required value directly into the text field in the middle pane and then click the Lookup icon to perform lookup.


 

Alternatively, click the Lookup icon to open the Lookup dialogue window, provide the value in the search field, and then click the Search icon.

 

Percentage in brackets indicates the level of confidence for extracted values.

The Lookup dialogue window will contain results from all the configured and enabled REST API Lookup groups.

 

2. Select the required row(s) and click OK to populate the results on the Validation screen.

Note: You can select one or more groups in the pop-up window. When you click OK, the Index Field confidence for the selected groups is calculated as the product of the group weight and the field value confidence.

  • The maximum resultant confidence from all selected groups is used to select the final Index Field value.
  • If two groups have the same resultant confidence, the first group is selected and its Index Field value is populated.

The selected REST API Lookup results are populated on the Validation screen.


 

3. Click Validate to confirm extraction results and proceed with the batch processing flow.

Note: If your Web Service is configured to fetch all values, no parameters are specified for the REST API Lookup, and no KV Extraction rules are defined, the values on the Validation screen are automatically populated with the first extracted result.

For example, suppose the REST API Lookup is configured without parameters and the Web Service returns a list of records, each identified by an “id”.

In this case, the REST API Lookup fetches the data for the first “id” and populates the results on the Validation screen automatically. Since no parameters are configured, the REST API Lookup icon is not displayed.

You can fine-tune the extraction results by defining Key-Value Extraction Rules or by using the XPath/JSONPath operators and wildcards described in Configuring XPath/JSONPath.

Configuring XPath/JSONPath

XPath: XPath (XML Path Language) is a query language for selecting nodes from an XML document. In addition, XPath may be used to compute values (e.g. strings, numbers, or Boolean values) from the content of an XML document.

JSONPath: JSONPath is a query language, similar to XPath, for selecting nodes from a JSON object. JSONPath expressions refer to a JSON structure in the same way that XPath expressions are used with an XML document. It is used to compute values from JSON objects.

JSONPath uses special notation to represent nodes and their connections to adjacent nodes in a path. There are two styles of notation: dot and bracket.

Both of the following paths refer to the same node: the third element of the location field of the creator node, which is a child of the JSONPath object that belongs to tool under the root node.

  • With dot notation: $.tool.JSONPath.creator.location[2]
  • With bracket notation: $['tool']['JSONPath']['creator']['location'][2]

JSONPath provides several helpful operators:

  1. Root node ($): Denotes the root member of a JSON structure, whether it is an object or an array. Usage examples are shown above.
  2. Current node (@): Represents the node that is being processed, mostly used in predicate expressions. For example, for a book array in a JSON document, the expression book[?(@.price == 49.99)] selects the books in that array whose price is 49.99.
  3. Wildcard (*): Matches all elements within the specified scope. For instance, book[*] refers to all nodes inside a book array.

The XPath language is based on a tree representation of the XML document, and provides the ability to navigate around the tree, selecting nodes by a variety of criteria. In popular use (though not in the official specification), an XPath expression is often referred to simply as “an XPath”.
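To try an XPath expression before entering it in the Value field, a few lines of Java using the standard javax.xml.xpath package are enough. The sketch below is illustrative only: the XPathSample class and the sample XML are hypothetical, and how Ephesoft Transact itself treats an expression that matches several nodes is not covered here.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical helper for illustration; not part of the Ephesoft Transact API. */
final class XPathSample {

    private XPathSample() { }

    /** Evaluates the expression and returns the text of the first matching node, or "" if none matches. */
    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Sample response, assumed for this sketch.
        String xml = "<List>"
                + "<item><invoiceNumber>ABC-123</invoiceNumber><confidence>80</confidence></item>"
                + "<item><invoiceNumber>DLF-456</invoiceNumber><confidence>50</confidence></item>"
                + "</List>";
        System.out.println(evaluate(xml, "/List/item/invoiceNumber"));    // prints ABC-123
        System.out.println(evaluate(xml, "/List/item[2]/invoiceNumber")); // prints DLF-456
        System.out.println(evaluate(xml, "/List/item[1]/confidence"));    // prints 80
    }
}
```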

Below is the list of operators which can be used to compute values in XPath and JSONPath.

XPath   JSONPath            Description
/       $                   the root object/element
.       @                   the current object/element
/       . or []             child operator
..      n/a                 parent operator
//      ..                  recursive descent. JSONPath borrows this syntax from E4X.
*       *                   wildcard. All objects/elements regardless of their names.
@       n/a                 attribute access. JSON structures don’t have attributes.
[]      []                  subscript operator. XPath uses it to iterate over element collections and for predicates. In JavaScript and JSON, it is the native array operator.
|       [,]                 union operator. In XPath it results in a combination of node sets. JSONPath allows alternate names or array indices as a set.
n/a     [start:end:step]    array slice operator borrowed from ES4.
[]      ?()                 applies a filter (script) expression.
n/a     ()                  script expression, using the underlying script engine.
()      n/a                 grouping in XPath