cancel
Showing results for 
Search instead for 
Did you mean: 

Merging / upserting items in a list

mlamotte
Deputy Chef II
Deputy Chef II

I have an API which i call on a schedule, and it returns me all status updates since the last poll.

Due to API limits etc... I do this on a 15 minute interval.

However, if the item gets updated more than once within this interval, I get the same item returned an equal amount of times, but with different values.

The return I get, after flattening the result and putting it into a list is something like this:

OrderStatusArrival TimestampDeparture Timestamp
ORD-112200PLANNEDNULLNULL
ORD-112200BOOKEDNULLNULL
ORD-112200BOOKED2024-06-05 10:00:00NULL
ORD-112200BOOKED2024-06-05 10:00:002024-06-05 11:00:00
ORD-112201PLANNEDNULLNULL

 

I would like to upsert this into a table in a SQL database, but can't do this in batch as I have multiple instances of the unique column identifier, which is "Order".

So what needs to be done is either merge them in a list, or add something like a rownumber for each order, and then only select the item with the highest rownumber per Order. 

I would preferably not want to loop over each item in the list and upsert it into the database, which would have the same result, but would consume a LOT of tasks.

Does anyone have a good solutions for this?
Maybe I could use a dictionary key for each order, or something like that, or SQL collections?

1 ACCEPTED SOLUTION

gary1
Executive Chef III
Executive Chef III

I would dump the API response into a SQL Collection, then query and group by Order ID. The query response will only include a single result per Order ID, and the rest of the fields will include the data from the first instance of the Order ID found in the collection. I think this works for you, because I assume 1) you always want the newest update for that order ID and 2) the API response is already ordered from the newest to oldest update per order ID. Either way, I would test it out to make sure.

Here's the SQL query:

SELECT * FROM [table data pill] GROUP BY [order ID data pill]

If the above doesn't work, I would dump the API response into a Ruby action and write a quick loop to logically dedupe the list.

Hope this helps!

View solution in original post

4 REPLIES 4

gary1
Executive Chef III
Executive Chef III

I would dump the API response into a SQL Collection, then query and group by Order ID. The query response will only include a single result per Order ID, and the rest of the fields will include the data from the first instance of the Order ID found in the collection. I think this works for you, because I assume 1) you always want the newest update for that order ID and 2) the API response is already ordered from the newest to oldest update per order ID. Either way, I would test it out to make sure.

Here's the SQL query:

SELECT * FROM [table data pill] GROUP BY [order ID data pill]

If the above doesn't work, I would dump the API response into a Ruby action and write a quick loop to logically dedupe the list.

Hope this helps!

mlamotte
Deputy Chef II
Deputy Chef II

Thanks, I will be trying this!

The only possible issue I'm seeing is that in my response there is a nested list, which I now flatten using .pluck('value')[0] and 1 and 2

But this is a good starting point, I'm trying it tomorrow!

gary1
Executive Chef III
Executive Chef III

It should work with your flattened list, but if not we can figure it out. Good luck!

It worked.
I tried the sql collection yesterday, but the output was weird.
After querying it was OK though.
 i did a select * from bookings AS T0 WHERE index = (select max(index) from bookings where transport_number = T0.transport_number)

i added an index in the flattened list, as they are sorted from oldest to newest. 

Thanks for the support!