
Duplicate records created in Salesforce with Concurrent processing of SQS events

nikhilb
Deputy Chef II
Hi Pros!

We're running into a problem with concurrency (parallel jobs) when consuming events from SQS into Salesforce with Workato.
 
The use case: batch load processes in our source application can publish a large number of events to SQS, possibly with multiple events for a single record.
When we consume the events from SQS and process them into Salesforce using Workato, we use a concurrency of 5 (to handle large volumes).
This results in duplicate records in Salesforce when different create/update events for a single record are spread across concurrently running jobs.
 
Note: We have a check that updates the record in Salesforce if a Salesforce Id is already associated with it in our source application. But because these events are processed in parallel, several jobs can pick up events for the same new record at the same time. Each job finds no Salesforce Id, so each one goes on to create a new record in Salesforce, which produces the duplicates.

Question: How can we avoid processing streaming events for a single record multiple times while using concurrency?
P.S.: We may get multiple events for a single record in the SQS queue. The queue is not FIFO, and we already try to pull 2k messages in a single batch and deduplicate them (within a single job).
 
Any quick suggestions or guidance would be highly appreciated.
 
Thanks
1 ACCEPTED SOLUTION

rachelnatik
Deputy Chef III

I had this issue with another client. Every edit to a record in System A would create or update a record in System B, which means that if a new record in System A was created and then modified, it could create 2 new records in System B. To resolve this, we do the following:
1. Check whether the record in System A has the unique Id populated.
2. If Yes - match with that record in System B.
3. If No -
3.1 Wait 5 minutes
3.2 Retrieve the same record details again
3.3 Check if the Id is now populated (since another job might have completed in the meantime)
3.4 Yes - use that record
3.5 No - create a new record

This has eliminated 99% of our duplicates. 
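
The steps above can be sketched roughly as follows. This is only an illustration in Python, not a Workato recipe: `resolve_target_id`, `fetch_target_id`, and `create_record` are hypothetical names standing in for "look up the Id on the System A record" and "create the record in System B". The `sleep` parameter is injectable only so the wait can be skipped when trying it out.

```python
import time
from typing import Callable, Optional


def resolve_target_id(
    fetch_target_id: Callable[[], Optional[str]],
    create_record: Callable[[], str],
    wait_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the System B Id for a record, creating it only as a last resort.

    fetch_target_id and create_record are hypothetical callbacks, not real APIs.
    """
    # Steps 1-2: if System A already holds the Id, match on it.
    existing = fetch_target_id()
    if existing is not None:
        return existing

    # Step 3.1: give any concurrent job time to finish and write the Id back.
    sleep(wait_seconds)

    # Steps 3.2-3.4: re-read the same record and use the Id if it has appeared.
    existing = fetch_target_id()
    if existing is not None:
        return existing

    # Step 3.5: still no Id, so create the record.
    return create_record()
```

Note that this narrows the race window rather than closing it: two jobs that start at the same moment will both wait, both re-check, and both still find no Id.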




6 REPLIES

rachelnatik
Deputy Chef III

If no record is returned, have the system wait a few seconds and then check again. 

nikhilb
Deputy Chef II

Do you mean we should wait while we check the source system for the Salesforce Id?

chris-wiechmann
Workato employee

Hi @nikhilb, instead of checking for an existing record in Salesforce manually, have you tried using the Salesforce Upsert action? With that, you hand it over to Salesforce to perform an INSERT or UPDATE.
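
To illustrate why an upsert can help with concurrent jobs, here is a toy in-memory model in Python. It is not the Salesforce API or the Workato action: `InMemoryTarget` and the key `SRC-42` are hypothetical, and the sketch assumes the target matches on a unique external key (for example, the source system's own record Id) and applies each upsert atomically.

```python
import threading
from typing import Dict


class InMemoryTarget:
    """Toy stand-in for a target system that upserts on a unique external key."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}
        self._lock = threading.Lock()

    def upsert(self, external_id: str, fields: dict) -> str:
        # The match-then-write happens under one lock, so two callers with the
        # same key can never both take the "insert" branch.
        with self._lock:
            if external_id in self._rows:
                self._rows[external_id].update(fields)
                return "updated"
            self._rows[external_id] = dict(fields)
            return "created"

    def count(self) -> int:
        with self._lock:
            return len(self._rows)


if __name__ == "__main__":
    store = InMemoryTarget()
    # Five concurrent jobs, all carrying events for the same source record.
    jobs = [
        threading.Thread(target=store.upsert, args=("SRC-42", {"version": i}))
        for i in range(5)
    ]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    print(store.count())  # one row, however the jobs interleave
```

The point is that the "does it exist?" decision moves into the target, keyed on something every event already carries, instead of depending on an Id that only exists after the first create.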

nikhilb
Deputy Chef II

Thanks for the response, @chris-wiechmann.
Actually, we're not checking manually on the Salesforce side but in our source application: if the incoming record has a Salesforce Id, it goes through as an update, otherwise as a create. As soon as we create the record in Salesforce, we send an acknowledgement with the Salesforce Id back to the source application. (This helps with subsequent updates.)

Upsert would require an external Id/primary key, and even this might fail in our case. When a new record arrives via SQS as multiple events (there can be multiple updates to a single record, each producing an event in SQS), none of those events has a Salesforce Id associated. We have a concurrency of 5, so these events for the same record are processed concurrently, and they all go through as new records because no Salesforce Id was found to update.

Hope this gives more details.

For now we are switching to a single thread and planning to add another staging queue/file to deduplicate events for a single record within some time period.
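
A rough sketch of what that staging-side deduplication could look like, in Python. The field names `record_id` and `timestamp` are assumptions about the event shape, and keeping only the latest event per record assumes each event carries the full record state rather than a delta.

```python
from typing import Any, Dict, Iterable, List


def collapse_events(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the most recent event per record within one staging window.

    Assumes (hypothetically) that every event has 'record_id' and 'timestamp'.
    """
    latest: Dict[str, Dict[str, Any]] = {}
    for event in events:
        key = event["record_id"]
        current = latest.get(key)
        # On equal timestamps the later arrival wins.
        if current is None or event["timestamp"] >= current["timestamp"]:
            latest[key] = event
    return list(latest.values())


if __name__ == "__main__":
    window = [
        {"record_id": "A", "timestamp": 1, "name": "first"},
        {"record_id": "B", "timestamp": 1, "name": "other"},
        {"record_id": "A", "timestamp": 2, "name": "second"},
    ]
    for event in collapse_events(window):
        print(event["record_id"], event["name"])
```

With one surviving event per record in each window, concurrent jobs no longer see the same new record twice within that window. Events for the same record that land in different windows would still need the Id check.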

Any other suggestions are appreciated.

Thanks