Showing results forย 
Search instead forย 
Did you mean:ย 

Pagination of API endpoints with Workato

Deputy Chef I
Deputy Chef I

I'm running into a couple cases where I need to process a lot of data out of an API (like all projects in Jira, or all channels in Slack), and these endpoints tend to be paginated.

I've been playing around with the repeat action, but that takes a fixed length list to iterate, and with paginated data you often don't know how many interations you're going to have to do. I guess I could set up a helper list with 50000 iterations, and stop when I'm done, but it feels like there should be a better solution.

I haven't been able to find any good design documentation/tutorials on how to work with paginated data in Workato. Does anyone know where to find that?

Thanks, Michiel


Deputy Chef II
Deputy Chef II

One strategy we have used for large lists that may require lots of calculations (like analysis on all POs for the entire year) is to do something like this:

Recipe 1 (Controller Recipe) - determines number of pages and kicks off the individual processes (or for an unknown number of pages monitors for when the process is complete)

Recipe 2 (Grunt Work Recipe) - callable recipe that gets each page of data and stores data in a temporary location (potentially doing some calculation/transformation)

Recipe 3 (Clean-up) - once all the pages are processed the controller recipe will start the clean-up recipe to create the final deliverable

We have built recursive-style processes using workato pub/sub and we have daisy-chained multiple processes of these together to do multiple large scale extractions and transformations.

Generating the 1099 report is a good example. This process has multiple large scale extractions that are dependent on each other. We had to get:

1. all vendors that are 1099-able out of NetSuite

2. all accounts in NetSuite that are designated as 1099-able

3. all bills for the vendors

4. all payments for those bills

5. calculate the percentage of lines on the bill that were 1099-able, and what percentage of the payment was applied to each line to get the 1099 amount

6. after all the calculations were complete - do the roll up to calculate the 1099 amount per payment, per bill, and eventually per vendor.

We didn't know the number of vendors, bill, or payments and had to track all of this independently (in addition to dealing with delays in getting data out of netsuite and doing calculations). The process was 100% automated (click "Start" and then 20 minutes later you received the report of all 1099s for vendors).

That sounds like quite the contstructions. I'm just trying to pull data on a couple thousand slack channels! ๐Ÿ˜„

What's been your transient storage system of choice for all that data?

Sometimes lookup tables if the data doesn't return too many columns or multiple relationships. For more complicated use cases we use a MySQL database.

Deputy Chef I
Deputy Chef I

The design that we use is to move the logic to a callable recipe and perform one job per page of the API call. When all the records in the page are processed, call the same recipe with the next page URL (you can call callable recipe xyz). Stop the execution when the last page is reached.

- Main Recipe: MR_1: Can run on a schedule. It will async call CR_1.

- Callable Recipe 1: CR_1 - This recipe simply calls another callable recipe (CR_2 - Async)

- Callable Recipe 2: CR_2 - This recipe will have an input parameter "page_url" that can be used to make a paginated call and then process the records.

If CR_2 recieves another page, then it will call and pass the page_url to CR_1 which will call CR_2 again. This process will continue unless you reach the last page.

Additionally, if you have huge volume of data, then create another Callable Recipe CR_3 to process individual records. CR_2 can pass the data for each record to CR_3. Pros: Use concurrency to increase 5x throughput, Better error debugging as each record will have it's own job.

Executive Chef I
Executive Chef I

It would be great if we could have the ability to call a callable from itself. Deven Maru