
Working w/ Bulk Data in NetSuite

jessica-lie
Workato employee

[Nov 25, 2020] Joe Blanchett (Product Manager, Finance Systems at SeatGeek) posted:

We have a few recipes where we need to work with > 200 rows of data in NetSuite. The search records call maxes out at 200 results on the first page, so the only way I've come up with to get all results involves multiple steps that iterate through each page and accumulate the results into a Workato list variable along the way. As you can see below, this method involves 10 steps just to get all the data that's needed. Has anyone come up with a more elegant way to get high volumes of data from NetSuite?
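For reference, a minimal sketch of that page-by-page accumulation pattern. `search_netsuite_page` is a hypothetical stand-in for the NetSuite "search records" call (not the actual connector API); here it just fabricates three pages of dummy rows so the sketch runs as-is.

```python
def search_netsuite_page(search_id: str, page_index: int) -> dict:
    # Stand-in: replace with the real NetSuite search call.
    # Fabricates 3 pages of 200 dummy rows each.
    return {
        "total_pages": 3,
        "rows": [{"search_id": search_id, "page": page_index, "row": i} for i in range(200)],
    }

def fetch_all_rows(search_id: str) -> list:
    all_rows = []
    page_index, total_pages = 1, 1           # total_pages corrected after the first call
    while page_index <= total_pages:
        page = search_netsuite_page(search_id, page_index)
        all_rows.extend(page["rows"])         # accumulate, like the Workato list variable
        total_pages = page["total_pages"]
        page_index += 1
    return all_rows

print(len(fetch_all_rows("customsearch_example")))   # 600 with the dummy stand-in
```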


10 REPLIES

jessica-lie
Workato employee

[Nov 27, 2020] Mike Flynn (Principal Software Engineer at Rapid7) replied:

Joe - I agree with Gordon, we follow a similar strategy for getting large searches back from NetSuite (like generating all 1099s at the end of the year).  We created some utility recipes to help with getting all the results back from a saved search.  There are 3 recipes - the main recipe that initiates the search, the pub/sub recipe that gets all the results from all the pages, and the recipe that processes the results.  Here's basically how it works:

1.  Main recipe calls the saved search to get the Search ID and the number of pages.
2.  Main recipe then does a loop for the number of pages in the search (does not get the page from NetSuite yet):
     2a.  Posts to a Pub/Sub topic with the search ID and page number.
     2b.  Adds an entry into a lookup table with the search ID, page number, and status of "PENDING".
3.  Uses Recipe Ops to start the recipe that will monitor and process all the results from the search.

4.  Main recipe job ends.
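A rough sketch of that fan-out step, assuming in-memory stand-ins for the pub/sub topic and the lookup table; the names here are illustrative, not Workato's or NetSuite's actual API.

```python
import queue

page_topic = queue.Queue()   # stands in for the Workato pub/sub topic
lookup_table = []            # stands in for the Workato lookup table

def start_search(search_id: str, total_pages: int) -> None:
    # Step 2: loop over the pages without fetching any of them yet.
    for page in range(1, total_pages + 1):
        page_topic.put({"search_id": search_id, "page": page})          # 2a: publish the page task
        lookup_table.append({"search_id": search_id, "page": page,
                             "status": "PENDING"})                      # 2b: record PENDING
    # Step 3 (starting the monitoring recipe via Recipe Ops) is omitted here.
```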

1.  Pub/Sub recipe is listening for Pub/Sub posts for searches; when a new post is made to the topic it does the following:
2.  Executes the search and gets all the results for the page.
3.  Stores the results in a database or a lookup table.
4.  Updates the entry in the lookup table to have a status of "COMPLETED".
5.  Pub/Sub recipe job ends.
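A matching sketch of that worker, again with its own in-memory stand-ins so it runs on its own; `run_page_search` is a hypothetical placeholder for the per-page NetSuite search.

```python
import queue

page_topic = queue.Queue()   # same kind of in-memory stand-ins as the previous sketch
lookup_table = []
results_store = {}           # stands in for MySQL / a lookup table holding page results

def run_page_search(search_id: str, page: int) -> list:
    # Stand-in for executing the saved search for a single page in NetSuite.
    return [{"search_id": search_id, "page": page, "row": i} for i in range(200)]

def pubsub_worker() -> None:
    while not page_topic.empty():
        task = page_topic.get()                                    # 1: a new post on the topic
        rows = run_page_search(task["search_id"], task["page"])    # 2: fetch that page's results
        results_store[(task["search_id"], task["page"])] = rows    # 3: store them
        for entry in lookup_table:                                  # 4: mark the page COMPLETED
            if entry["search_id"] == task["search_id"] and entry["page"] == task["page"]:
                entry["status"] = "COMPLETED"
```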

1.  Processing recipe is started by the main recipe, and checks every 5 minutes if all the pages of the search have been found (does a search of the lookup table to see if all pages have a status of "COMPLETED").  When all pages are "COMPLETED", then do the following:
2.  Do whatever action is needed to process the rows (perform calculations, kick off new searches, update NetSuite or other systems, etc.).
3.  Delete all the rows from the lookup table for this search.
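And a sketch of the processing recipe's polling check under the same assumptions (the lookup table has been seeded by the main recipe); `process_rows` is a placeholder for whatever downstream work is needed.

```python
import time

lookup_table = []            # same in-memory stand-in as the previous sketches

def all_pages_completed(search_id: str) -> bool:
    pages = [e for e in lookup_table if e["search_id"] == search_id]
    return bool(pages) and all(e["status"] == "COMPLETED" for e in pages)

def process_rows(search_id: str) -> None:
    # Placeholder for step 2: calculations, follow-up searches, updates to
    # NetSuite or other systems.
    pass

def processing_recipe(search_id: str, poll_seconds: int = 300) -> None:
    while not all_pages_completed(search_id):    # step 1: check every 5 minutes
        time.sleep(poll_seconds)
    process_rows(search_id)                      # step 2
    lookup_table[:] = [e for e in lookup_table   # step 3: delete this search's rows
                       if e["search_id"] != search_id]
```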


This strategy allows us to split the search into multiple threads, so that the pub/sub recipe is fetching 5 pages of results at a time.  When using this for generating 1099s, the pub/sub recipe is fetching all vendors, all vendor bills, and all vendor payments (so executing thousands of searches, with multiple pages of results for each search).  There is one main recipe - the Slack command that starts the 1099 process.  There is a processing recipe for vendors, vendor bills, vendor payments, and the relationships between payments and bills (since it is a many-to-many relationship).  The processing recipe will initiate new pub/sub searches (first get all the vendors, then for each vendor get all their bills, then for each bill get the payments made for that bill, and finally do the calculation for what is 1099-able).  Like Gordon mentioned, it is handy for the processing recipe to know when to start and when to kick off the next level of processing.

Here are some screenshots of examples:

1.  main recipe - for each vendor start the search for vendor payments



2.  pub-sub example recipe - executes the search for vendor payments, updates MySQL (where we are temporarily storing data), and starts the process for more searches




3.  processing recipe - this is the final processing recipe that generates the report.  There are other processing recipes for handling results for vendors/payments/bills



jessica-lie
Workato employee

[Nov 27, 2020] Ryan Koh (Customer Success at Workato) replied:

Lots of good discussion points here. I would love to find out what some of the business scenarios are where you require the full data set from your systems as opposed to more real-time data. And how often do you run these types of workflows?


And when you have the full data set, what do you do with that data? How do you detect what is new and what has changed, or is that even necessary?

jessica-lie
Workato employee

[Nov 27, 2020] Joe Blanchett (Product Manager, Finance Systems at SeatGeek) replied:

Thanks everybody for the great info. We will definitely be implementing some of these strategies at SeatGeek. 


@Mike - way to utilize the entire Workato Suite! I like the idea of leveraging lookup tables and pub/sub. Have not had the opportunity to use Recipe Ops yet.

@Brian - I totally agree. It would be great if pagination was a built-in option in the connector itself. 

@Ryan - we have at least two use cases that come to mind: (1) payment processor settlements and (2) customer syncs from NetSuite to Adaptive.


For (1), we drop all transactions that our payment processor settles to us (we process payments on behalf of our enterprise clients) into a custom NetSuite table, calculate our fees, then create bill/bill payment transactions to pay out the remaining funds to our clients. We get up to 8K transactions per batch and need to ensure a batch is fully processed and never left in a partially processed state, for a few reasons, one of which is that our cash release process is based on comparing incoming and outgoing cash at the batch level. For (2), we have > 200 customers and sync the customer table nightly to Adaptive Insights.

jessica-lie
Workato employee

[Nov 30, 2020] Brian Flood (Vice President, IT, Business Systems, & Data at Fastly) replied:

There are several use cases we have where a large bulk upload makes more sense than real-time. One is daily aggregated customer usage statistics. Once a day we update every single customer account in Salesforce with aggregated usage data from our data warehouse. The underlying tables contain billions of events, and updating Salesforce in real time would quickly overwhelm their APIs and hit our call limits. We also don't really care about intraday usage, so more than once a day is not useful. There are also use cases where either the source or destination does not support real-time and only supports something like a daily FTP feed. We have a couple of those with our equity management tool (Shareworks) and our LMS (Bridge). We also use our data warehouse in BigQuery to calculate all types of things we want to push to various systems, as well as use it as lightweight master data management. BigQuery just does not lend itself to real-time processing, so some sort of batch update is always needed whenever it is involved, which is often for us.

jessica-lie
Workato employee

[Nov 30, 2020] Gordon Hu from WeGalvanize replied:

I found a "workaround" to overcome the real-time API limit.


Original: we needed multiple Workato real-time recipes, each polling every 5 minutes.


Workaround - slow down the calls:

1. Define a master schedule recipe that calls the callable recipes on every-5-minute, every-15-minute, hourly, daily, and weekly schedules.

2. Make callable recipes according to these schedules.

3. Then make another callable recipe to extract data between now and now-<period> (a rough sketch follows below).
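A minimal sketch of that last step, assuming a hypothetical `fetch_events(since, until)` call into the source system (e.g. an Okta system-log pull); the period names and values here are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Supported periods; names and values are illustrative.
PERIODS = {
    "5min": timedelta(minutes=5),
    "15min": timedelta(minutes=15),
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}

def fetch_events(since: datetime, until: datetime) -> list:
    # Placeholder: call the source system's API filtered to this time window.
    return []

def extract_window(period_name: str) -> list:
    until = datetime.now(timezone.utc)
    since = until - PERIODS[period_name]          # now - <period>
    return fetch_events(since=since, until=until)
```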

We use this approach for Okta events to monitor login failures, new hire activations, user offboarding, etc. We've also used this to pull data for aggregated statistics (e.g. Gong, Salesforce).


Pros: one recipe design that can be reused in many systems.

Cons: now I need to make sure the master schedule recipes are correct and attach the callable recipes to the right schedule. However, once attached, there is pretty much no debugging needed; we only have to tweak the last callable recipe.