Database Deadlocks in Ephesoft Cluster Environment
Applies to Ephesoft Versions: v2.5 & v3.0
Deadlocks can occur in a clustered Ephesoft environment under certain conditions. Each server in the cluster runs a Pickup Service that selects a batch in the NEW or READY status for processing. The pickup runs on a schedule defined by a cron expression in the dcma-workflows.properties file.
When a deadlock occurs, you will see errors similar to the following in DCMA-All.log:
[TimeStamp] ERROR pool-1-thread-1 org.hibernate.util.JDBCExceptionReporter – Transaction (Process ID 298) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
[TimeStamp] ERROR pool-1-thread-1 org.hibernate.event.def.AbstractFlushingEventListener – Could not synchronize database state with session
org.hibernate.exception.LockAcquisitionException: Could not execute JDBC batch update
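To check how often this happens, the log can be scanned for the two signatures shown above. The following is a minimal sketch, not an Ephesoft tool: the function name find_deadlock_lines is hypothetical, and the patterns are taken only from the sample lines in this article, so other databases or versions may word the messages differently.

```python
import re

# Signatures taken from the sample log lines in this article (assumption:
# your database reports deadlocks with the same wording).
DEADLOCK_PATTERN = re.compile(r"deadlock victim|LockAcquisitionException")


def find_deadlock_lines(lines):
    """Return (line_number, text) pairs for lines matching a deadlock signature."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if DEADLOCK_PATTERN.search(line)
    ]
```

The function accepts any iterable of lines, so an open file handle for DCMA-All.log can be passed to it directly.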
These errors indicate deadlocks in your database. The most likely cause is that the servers in the cluster are trying to pick up batches at the same time and are contending for the same database locks. To avoid this, follow the recommendation below:
Edit the Pickup Service setting in the dcma-workflows.properties file. By default, the service picks up batches once a minute, at the 15th second, as defined by a cron expression. Make sure the pickup services on the two servers run at different seconds, for example:
Server 1: dcma.pickup.cronjob.expression=15 0/1 * ? * * (pickup every minute, at second 15)
Server 2: dcma.pickup.cronjob.expression=45 0/1 * ? * * (pickup every minute, at second 45)
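For clusters with more than two servers, the same idea can be extended by spacing the seconds field evenly across the minute. The following is a rough sketch only: staggered_pickup_expressions is a hypothetical helper, not part of Ephesoft, and it assumes that only the seconds field of the expression needs to differ between servers.

```python
def staggered_pickup_expressions(server_count, start_second=15):
    """Return one cron expression per server with evenly spaced second offsets."""
    if not 1 <= server_count <= 60:
        raise ValueError("server_count must be between 1 and 60")
    step = 60 // server_count  # gap in seconds between consecutive servers
    # Only the seconds field changes; the remaining fields match the
    # expressions shown in this article.
    return [
        f"{(start_second + index * step) % 60} 0/1 * ? * *"
        for index in range(server_count)
    ]


for number, expression in enumerate(staggered_pickup_expressions(3), start=1):
    print(f"Server {number}: dcma.pickup.cronjob.expression={expression}")
```

With two servers and the default start second, this reproduces the offsets 15 and 45 used above. Each generated value still has to be entered by hand in that server's dcma-workflows.properties file.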
We recommend updating to the latest version of Ephesoft (v22.214.171.124 SP3).
A new Pickup Service feature is included in the latest Service Pack for Ephesoft (v126.96.36.199). It allows each Ephesoft server to choose its pickup time randomly, so that as you add servers, they pick up batches at different times. This helps prevent database deadlocks in general and is especially relevant for Ephesoft installations using MSSQL.
Note: You must apply v188.8.131.52 SP2 before deploying SP3.