Systematic Community

nikhilb · ‎11-10-2023

Hi Pros!

We have a challenge with concurrency (parallel jobs) when consuming events from SQS into Salesforce with Workato.

The use case: batch load processes in our source application can create a large number of events in SQS, possibly with multiple events for a single record.
When we consume the events from SQS and process them into Salesforce using Workato, we use a concurrency of 5 (to handle large volumes).

This results in duplicate records in Salesforce when different create/update events for a single record are spread across concurrently running jobs and processed at the same time.

Note: We have a check that updates the record in Salesforce if a Salesforce Id is already associated with it in our source application. But because the events are processed in parallel, several jobs can read the same new record at the same time, before any Salesforce Id has been written back. Each job then treats it as a new record, and in the next steps the concurrently running jobs each create it in Salesforce (duplicates).

Question: How can we avoid processing streaming events for a single record multiple times while using concurrency?
P.S.: We may get multiple events for a single record in the SQS queue. The queue is not FIFO, and we already extract 2k messages in a single batch and deduplicate them (within a single job).
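
For reference, the per-batch deduplication mentioned in the P.S. could look roughly like the sketch below. It is only an illustration: the field names `record_id` and `timestamp` are assumptions, not the actual event schema, and keeping the newest event per record is one possible rule.

```python
from typing import Iterable


def latest_event_per_record(events: Iterable[dict]) -> list[dict]:
    """Collapse a batch to one event per record, keeping the newest one.

    Assumes each event carries a 'record_id' and a comparable 'timestamp'
    (both hypothetical field names).
    """
    latest: dict[str, dict] = {}
    for event in events:
        key = event["record_id"]
        current = latest.get(key)
        # Keep this event if it is the first for the record or at least as new.
        if current is None or event["timestamp"] >= current["timestamp"]:
            latest[key] = event
    return list(latest.values())


batch = [
    {"record_id": "A", "timestamp": 1, "op": "create"},
    {"record_id": "B", "timestamp": 2, "op": "create"},
    {"record_id": "A", "timestamp": 3, "op": "update"},
]
deduplicated = latest_event_per_record(batch)
```

This only removes duplicates inside one batch. Events for the same record that land in different concurrently running jobs can still race with each other.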

Any quick suggestion or guidance is highly appreciated.

Thanks

rachelnatik · ‎11-13-2023

I had this issue with another client. Every edit to a record in System A would create or update a record in System B, which means that if a new record in System A was created and then modified, it could create 2 new records in System B. To resolve this, we do the following:
1. Check if there is a unique Id in System A.
2. If Yes - match with that record in System B.
3. If No -
3.1. Wait 5 minutes.
3.2. Retrieve the same record details.
3.3. Check if the Id is now populated (since another job might have completed in the meantime).
3.4. Yes - use that record.
3.5. No - create a new record.

This has eliminated 99% of our duplicates.
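
The wait-and-recheck steps above can be sketched as follows. This is a minimal illustration, not the actual recipe: `fetch_target_id` and `create_target` are hypothetical callables standing in for "retrieve the record from System A and read its System B Id" and "create the record in System B", and `sleep` is injectable only so the wait can be skipped in a test.

```python
import time
from typing import Callable, Optional


def resolve_target_id(
    record_key: str,
    fetch_target_id: Callable[[str], Optional[str]],  # hypothetical lookup
    create_target: Callable[[str], str],              # hypothetical create
    wait_seconds: float = 300.0,                      # the 5-minute wait
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the System B Id for a System A record, creating it if needed."""
    # Steps 1-2: if the Id is already known, match the existing record.
    target_id = fetch_target_id(record_key)
    if target_id:
        return target_id

    # Steps 3.1-3.3: wait, then retrieve the record again and re-check.
    sleep(wait_seconds)
    target_id = fetch_target_id(record_key)
    if target_id:
        # Step 3.4: another job populated the Id in the meantime.
        return target_id

    # Step 3.5: still no Id, so create the record.
    return create_target(record_key)


# Simulated run: the Id is missing on the first read and present on the second.
reads = iter([None, "B-1"])
created: list[str] = []
resolved = resolve_target_id(
    "A-1",
    fetch_target_id=lambda key: next(reads),
    create_target=lambda key: created.append(key) or "B-new",
    sleep=lambda seconds: None,
)
```

There is still a small window between the re-check and the create in which two jobs can both decide to create, which fits with this removing most duplicates rather than all of them.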

nikhilb · ‎12-06-2023

Thanks for the response @rachelnatik.
This looks like a good idea, but in our case we process batches of records at a time to handle large volumes, and the wait times may cause delays as long as running on a single thread!
The unique Id gave us a new idea! We'll have System B enforce a unique constraint on the Id from System A. That way, any duplicate created in System B for the same record (Id in System A) will be rejected. (We're OK with rejections on duplicates and will log them.)
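
To illustrate the idea, here is a small sketch that uses an in-memory SQLite table as a stand-in for System B. It is not Salesforce or Workato code; the table name `system_b`, the column `source_id`, and the helper `create_once` are made up for the example. The point is only that the target store, not the concurrent jobs, decides which create wins.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create a stand-in for System B with a unique constraint on the System A Id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE system_b ("
        "id INTEGER PRIMARY KEY, "
        "source_id TEXT NOT NULL UNIQUE, "  # Id from System A
        "name TEXT)"
    )
    return conn


def create_once(conn: sqlite3.Connection, source_id: str, name: str,
                rejected: list[str]) -> bool:
    """Try to create the record; log and skip it if the source Id already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO system_b (source_id, name) VALUES (?, ?)",
                (source_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique constraint rejected a duplicate; record it for logging.
        rejected.append(source_id)
        return False


conn = make_store()
rejected: list[str] = []
first = create_once(conn, "A-1", "from job 1", rejected)
second = create_once(conn, "A-1", "from job 2", rejected)
```

One thing to watch: a rejected create may carry newer field values than the record that won, so the logged rejections may need to be replayed as updates.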

Duplicate records created in Salesforce with Concurrent processing of SQS events