Systematic Community

michielb · ‎12-21-2021

I'm running into a couple cases where I need to process a lot of data out of an API (like all projects in Jira, or all channels in Slack), and these endpoints tend to be paginated.

I've been playing around with the repeat action, but that takes a fixed length list to iterate, and with paginated data you often don't know how many interations you're going to have to do. I guess I could set up a helper list with 50000 iterations, and stop when I'm done, but it feels like there should be a better solution.

I haven't been able to find any good design documentation/tutorials on how to work with paginated data in Workato. Does anyone know where to find that?

Thanks, Michiel

jeremyo · ‎12-21-2021

I had to do this recently. Everyone's advice is great. I just wanted to point to some recipe examples that show this. This post has two good examples:

Recipe pagination examples : Workato Success Center

rbflood · ‎12-22-2021

We do this with a few recipes. One thing I think Workato sshold consider is building this pagination into the connectors themselves directly. With the billing model including number of actions, doing these paginations with helper lists can create some pretty huge action numbers from a recipe run making these recipes quite expensive. I can't think of any use cases outside of testing where it would make sense to not retrieve all of the results and the logic to figure these paginations out really inhibits the idea of citizen develpment as it can get pretty complex.

For example, Okta will paginate group members and you need to dig through the http header with a custom action to get the link to the next result page if it exists which is different than a lot of other APIs that simply add the next page or whatever into the body attributes. We use helper lists for this and it wrks but consumes a lot of actions.

A much more challenging use case is BigQuery. We have a recipe that runs a query that results is a relatively large file we need to SFTP to a vendor. BigQuery will paginate on number of records OR size of data. In this case we are retrieving a lot of columns so we hit the data limit first (which btw took a very long time to figure out). So if you add another column to your query the number of results will change. In this case we had to create a helper list to take the total results and divide by the number of rows returned to know how many iterations were needed to get the full results since it can be different depending on the run and the data returned. We create an entry in the helper list to represent the iteration so it's usually just a list with like 3 or 4 rows.

It's possible to do all of this inside recipes but it seems like it would be a lot more trivial to do behind the scenes in the ruby built into the connector itself and also limit the number of actions we need to consume to run these things and not make us figure it out on our own.

michielb · ‎12-22-2021

Yeah, it feels like there should be some control mechanism to support pagination. I suppose you could build a custom connector to handle the case, but that's somewhat defeating the point of a low-code system.

james-ng · ‎12-22-2021

I have a simple implementation using "flag":

(1) create variable "flag"="" (empty string)

(2) create helper list 100, assume max 100 pages

(3) iterate through helper list

(4) inside loop:

(4a) if "flag" does not equal "pagination_end", call API and process the data. For example start pagination with ([index]*50)+1 for 50 records each page.

(4b) if list size < 50, update variable "flag"="pagination_end"

(5) after pagination the recipe can continue

This will ensure that the remaining iteration will be skipped while the recipe continues.

See example: https://app.workato.com/recipes/1969723?st=7b3e58

Hope it helps!

Systematic Community

Pagination of API endpoints with Workato