Are there plans to support batching of messages with the PubSub Trigger? For example, instead of calling a recipe once per message, I would like to get 10, 20, 30, etc. messages from the top of the queue on each recipe call. These would be batched and sent to the recipe input either every time the batch size is achieved, or with however many messages are left in the queue after a max elapsed time (for example, 10 seconds), whichever happens first.
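To make the ask concrete, the "batch size or max elapsed time, whichever comes first" behavior could be sketched like this (a hypothetical Python sketch using an in-process queue; `drain_batch` and its parameters are my own names, not anything Workato provides):

```python
import time
from queue import Queue, Empty

def drain_batch(q: Queue, batch_size: int = 10, max_wait_s: float = 10.0):
    """Collect up to batch_size messages from q, or whatever arrives
    within max_wait_s seconds, whichever happens first."""
    batch = []
    deadline = time.monotonic() + max_wait_s
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # max elapsed time reached; ship a partial batch
        try:
            batch.append(q.get(timeout=remaining))
        except Empty:
            break  # queue drained before the batch filled up
    return batch
```

Each call to the recipe would then receive one `batch` list instead of a single message.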
It would be SO GREAT if Workato added this feature. It would solve the problem of having to figure out where to park the data for batching calls.
I guess you could argue that the desired functionality could be achieved by writing to a Lookup Table & then using a recipe to search for, process, then remove the desired number of records. Do you see any limitations there, Dan Ferguson?
That approach should work fine for a concurrent job count of 1, and only until you run up against the lookup table limit. 10k entries is the maximum number of rows in a lookup table, and I am not sure how you could handle multiple jobs updating/deleting messages from your "queue" table.
That might be a very nice enhancement!
In the meantime, is it possible in your use case to batch the messages in the Publisher (publish x number of messages in an array), while still operating in near real-time? Of course, you need to be mindful of the Pub-Sub message size limit, which was 64KB but I believe was recently increased to 128KB.
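The publisher-side batching under a size limit could be sketched roughly like this (a hypothetical Python sketch; `chunk_by_size` and the byte budget are my own assumptions, and the limit should be set to whatever your tier actually allows):

```python
import json

def chunk_by_size(records, max_bytes=64_000):
    """Group records into batches whose compact JSON payload stays
    under max_bytes, so each batch fits in one Pub-Sub message."""
    batches, current, current_size = [], [], 2  # 2 bytes for the "[]"
    for rec in records:
        # Estimate this record's serialized size (+1 for a separator comma).
        size = len(json.dumps(rec, separators=(",", ":")).encode("utf-8")) + 1
        if current and current_size + size > max_bytes:
            batches.append(current)          # flush the full batch
            current, current_size = [], 2
        current.append(rec)
        current_size += size
    if current:
        batches.append(current)              # flush the remainder
    return batches
```

Each element of `batches` would then be published as one array-valued message.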
We have an interface that runs near real-time (polls every 5 minutes) and it publishes up to 1,000 messages per "Publisher" job. Due to the Pub-Sub message size limitation, we publish only the primary key (PK) for each record. The subscriber receives the 1,000 PKs and calls a helper or utility recipe to retrieve the 1,000 records from the source application, maintaining the Pub-Sub pattern advantage of decoupling source from target. All processing is done in bulk/batch and it runs pretty fast end-to-end. We have concurrency set to 5 for all 3 solution components (the Publisher, the Subscriber and the Utility/Helper recipe). We will be adding a second Subscriber in the coming weeks.
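In pseudocode form, the "publish PKs only, re-fetch in bulk" pattern looks roughly like this (a hypothetical Python sketch; the function names and the `publish`/`fetch_records`/`process_batch` callables are stand-ins for the actual recipes, not Workato APIs):

```python
def publisher(changed_rows, publish):
    # Publish only the primary keys, staying well under the
    # Pub-Sub message size limit.
    publish({"pks": [row["id"] for row in changed_rows]})

def subscriber(message, fetch_records, process_batch):
    # Re-fetch the full records from the source application in one
    # bulk call (the helper/utility recipe), so the publisher and
    # subscriber stay decoupled.
    records = fetch_records(message["pks"])
    process_batch(records)
```

The helper recipe behind `fetch_records` is what keeps the message payload small while still processing full records in bulk on the subscriber side.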
Also, we considered a Datahub approach where the Publisher would write each new/modified event to the hub near real-time and each subscriber would retrieve however they want/need from the Hub (event-driven or batch/bulk, timing/frequency, etc.). However, we opted for the lighter-weight solution that I mention above, as a full-blown datahub, with its requisite on-prem DB components, was not warranted for our requirements.
I should have also mentioned that in our solution, the Publisher is using the New/Updated Batch of Rows SQL Trigger to take advantage of out-of-the-box Workato capability for cursor management, processing of transactions in order, etc., while also allowing concurrency of jobs.