I had this issue with another client. Every edit to a record in System A would create or update a record in System B, which meant that a record created in System A and then modified shortly afterwards could create 2 new records in System B. To resolve this, we do the following:
1. Check if there is a unique Id in System A.
2. If yes, match with that record in System B.
3. If no:
3.1. Wait 5 minutes.
3.2. Retrieve the same record details.
3.3. Check if the Id is now populated (since another job might have completed in the meantime).
3.4. If yes, use that record.
3.5. If no, create a new record.
This has eliminated 99% of our duplicates.
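For anyone who wants to try the same check, here is a minimal Python sketch of those steps. It is not the actual job we run. The names `resolve_target_id`, `fetch_source_record`, and the `system_b_id` field are hypothetical, and it assumes the unique Id is a System B identifier stored on the System A record.

```python
import time
from typing import Callable, Optional


def resolve_target_id(
    fetch_source_record: Callable[[str], dict],
    record_key: str,
    wait_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the System B Id to update, or None if a new record is needed.

    fetch_source_record is a hypothetical callable that reads the current
    System A record; "system_b_id" is an assumed field name.
    """
    # Steps 1 and 2: if the Id is already there, match on it.
    record = fetch_source_record(record_key)
    target_id = record.get("system_b_id")
    if target_id:
        return target_id

    # Step 3.1: wait, so a concurrent job has time to write the Id back.
    sleep(wait_seconds)

    # Steps 3.2 to 3.5: re-read the record and check the Id again.
    record = fetch_source_record(record_key)
    return record.get("system_b_id") or None
```

A caller would update System B when this returns an Id and create a new record when it returns `None`. The wait only narrows the race window, which fits with some duplicates still getting through.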
Thanks for the response @chris-wiechmann
Actually, we're not checking manually on the Salesforce side. The check happens in our source application: if the incoming record already has a Salesforce Id, it goes as an update, otherwise as a create. As soon as we create the record in Salesforce, we send an acknowledgement with the Salesforce Id back to the source application, which helps with subsequent updates.
Upsert would require an external Id or primary key, and even that might fail in our case. When a new record arrives via SQS as multiple events (several updates to a single record can produce several events in SQS), none of those events has a Salesforce Id associated with it. We run with a concurrency of 5, so these events for the same record are processed concurrently, and they all go to Salesforce as new records because no Salesforce Id was found to update.
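To make the race concrete, here is a toy Python model of it. This is a simplification and not our real pipeline. The names `route` and `process_batch` and the fields `record_key` and `salesforce_id` are made up for illustration. It assumes that events handled at the same time only see the Ids known before any of them finished.

```python
from typing import Dict, List


def route(event: dict) -> str:
    """Routing rule from the post: update if a Salesforce Id exists, else create."""
    return "update" if event.get("salesforce_id") else "create"


def process_batch(events: List[dict], id_by_record: Dict[str, str]) -> List[str]:
    """Route a batch of events that are in flight at the same time.

    Every event is routed against the Ids known before the batch started,
    because acknowledgements only arrive after the creates complete.
    """
    snapshot = dict(id_by_record)
    actions = []
    for event in events:
        known_id = snapshot.get(event["record_key"])
        actions.append(route({**event, "salesforce_id": known_id}))

    # Acknowledgements land afterwards and record a (fake) Salesforce Id.
    for n, (event, action) in enumerate(zip(events, actions)):
        if action == "create":
            id_by_record[event["record_key"]] = f"SF-{n}"
    return actions
```

In this model, three events for the same new record in one batch all route to `create`. An event for that record in a later batch routes to `update`, because the acknowledgement has been stored by then.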
Hope this gives more details.
For now we are switching to a single thread and planning to add another staging queue/file to deduplicate events for a single record within some time period.
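A rough Python sketch of what that staging step could look like is below. Nothing here is built yet, so treat it as an assumption-heavy outline. The function `collapse_events` and the fields `record_key` and `timestamp` are hypothetical. It assumes each event carries the full record state, so the latest event in a window can stand in for the earlier ones.

```python
from typing import Dict, Iterable, List, Tuple


def collapse_events(events: Iterable[dict], window_seconds: float) -> List[dict]:
    """Keep only the latest event per record within each fixed time window.

    Events need "record_key" and a numeric "timestamp" in seconds
    (both assumed field names).
    """
    latest: Dict[Tuple[str, int], dict] = {}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        bucket = int(event["timestamp"] // window_seconds)
        # A later event for the same record and window replaces the earlier one.
        latest[(event["record_key"], bucket)] = event

    # Emit in window order, then by record key, for a stable result.
    ordered_keys = sorted(latest, key=lambda k: (k[1], k[0]))
    return [latest[k] for k in ordered_keys]
```

Fixed windows are the simplest choice, but two events that straddle a window boundary would not be merged. A sliding window or a per-record delay would close that gap at the cost of more state.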
Any other suggestions are appreciated.