Get Lead Activities - Design issue, performance unusable | Community
New Participant
July 23, 2021
Solved

Get Lead Activities - Design issue, performance unusable

  • July 23, 2021
  • 1 reply
  • 3111 views

The main issue I am running into is with the "Get Lead Activities" API:

https://developers.marketo.com/rest-api/endpoint-reference/lead-database-endpoint-reference/#!/Activities/getLeadActivitiesUsingGET

 

For example, if I request all of the Interesting Moments from the past 90 days for a single lead, it takes 14 API calls, each passing a new "nextPageToken", before Marketo stops returning "moreResult" = true.
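The paging loop described above can be sketched roughly as follows. This is a minimal sketch, not Marketo's client code: `fetch_page` is a hypothetical callable standing in for one GET against the activities endpoint, and the fake pages only imitate the response fields named in this post ("result", "moreResult", "nextPageToken").

```python
from typing import Callable, Dict, List, Tuple


def collect_activities(
    fetch_page: Callable[[str], Dict],
    first_token: str,
    max_calls: int = 1000,
) -> Tuple[List[dict], int]:
    """Follow nextPageToken until moreResult is false.

    Returns the collected records and the number of calls made.
    fetch_page is a hypothetical stand-in for one HTTP request.
    """
    records: List[dict] = []
    token = first_token
    for calls in range(1, max_calls + 1):
        page = fetch_page(token)
        # Pages with no matching activities may carry no "result" at all.
        records.extend(page.get("result") or [])
        if not page.get("moreResult"):
            return records, calls
        token = page["nextPageToken"]
    raise RuntimeError("gave up after max_calls pages")


def fake_fetch(token: str) -> Dict:
    """Imitates the 14-page sequence from the post; only 2 pages have data."""
    index = int(token[1:])
    page: Dict = {"moreResult": index < 13, "nextPageToken": f"t{index + 1}"}
    if index in (3, 9):
        page["result"] = [{"leadId": 835054, "page": index}]
    return page


records, calls = collect_activities(fake_fetch, "t0")
print(len(records), calls)  # 2 14
```

The loop has to keep going on empty pages, because an empty "result" does not mean the log is exhausted; only "moreResult" does.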

 

This is extremely slow: in our testing, it takes about 28 seconds.

 

In my first request, I pass a nextPageToken set to 90 days ago. My test lead, id 835054, has 2 Interesting Moments in that time period.

 

So the result set is far smaller than 300, the maximum batch size for a single response, and a single response should be able to return all Interesting Moment activities for this lead in the last 90 days.

 

However, it appears the Marketo implementation is not paginating by the batchSize parameter, but by some arbitrary period of time represented by each "nextPageToken".

 

This design causes the performance issue by requiring us to execute 14 API calls for what could be returned in 1. 12 of these API calls return no data: that's 12 wasted calls against our API quota, in addition to the processing time this adds.

 

I need help understanding whether there is anything we can do on our side, using the Marketo REST API, to improve this performance, or whether this API needs a design change/enhancement so that it paginates by returned row count rather than by arbitrary time periods.

 

Marketo seems to use a less limited, better implementation in their own UI, since it can return this lead's activities in a single call. So I'm not sure why customers are only given the option of this extremely limited API.


SanfordWhiteman (Accepted solution)
New Participant
July 23, 2021

If you need to do this kind of work for time ranges of more than a day or so, you should be using Bulk Activity Extract, not the paginated API. Maintain your own offline mirror using the Extract results and query that.

 

The behavior of the nextPageToken is well-known: it's a cursor through the entire log, not through a filtered subset of the log by only a single activity.
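A toy model may help show why a cursor over the entire log produces empty pages for a filtered request. This is only an illustration under stated assumptions: the log contents, the window of 100 entries per page, and the type id 46 are invented for the example, and nothing here reflects how Marketo actually stores or slices its log.

```python
from typing import List, Tuple

Entry = Tuple[int, int]  # (lead_id, activity_type_id), a made-up log record


def walk_log(log: List[Entry], lead_id: int, type_id: int, window: int):
    """Advance a cursor through the whole log, filtering each page afterwards.

    Returns (matching entries, total pages read, pages with no matches).
    """
    hits: List[Entry] = []
    pages = 0
    empty_pages = 0
    for start in range(0, len(log), window):
        pages += 1
        page_hits = [
            entry
            for entry in log[start:start + window]
            if entry == (lead_id, type_id)
        ]
        if not page_hits:
            empty_pages += 1
        hits.extend(page_hits)
    return hits, pages, empty_pages


# 1400 log entries from other leads, plus 2 matching entries for lead 835054.
log: List[Entry] = [(i % 50 + 1, 1) for i in range(1400)]
log[250] = (835054, 46)
log[1100] = (835054, 46)

hits, pages, empty_pages = walk_log(log, 835054, 46, window=100)
print(len(hits), pages, empty_pages)  # 2 14 12
```

In this model the number of calls depends on the size of the whole log in the requested range, not on how many rows match the filter, which matches the 14-call, 12-empty pattern reported in the question.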

xceled (Author)
New Participant
July 23, 2021

Why is that an acceptable solution?

 

That is an awful solution for API consumers. It places a significant burden on them to store and maintain entire databases and sync processes for ALL activity records, just to cover the situation where we might need to get the activity details on demand for a single lead.

 

You mention "The behavior of the nextPageToken is well-known: it's a cursor through the entire log, not through a filtered subset of the log by only a single activity." It's not well known to me. I read the API docs, and this behavior was not mentioned or flagged as something to be aware of; I had to find out by testing directly against the API. Your comment states exactly what the problem is: if they know our filter criteria from our API call, why are they not using those criteria in the database query itself, instead of applying the filter AFTER getting the database results and giving us this poor API experience? A benefit of databases is that they do the heavy lifting with your predicates, rather than you applying them after the fact. So this answer only seems to reinforce my concern about the design issues of the API.

 

Why even offer an API if it's so inflexible that the answer is to copy all the data locally and avoid using the API?

SanfordWhiteman
New Participant
July 23, 2021
Well, it's just a longstanding reality with Marketo that reading large ranges of the database in real-time via the API isn't feasible.

That's why the Bulk Extract APIs were added (relatively recently), so people with more ambitious read workloads could maintain a high-performance local mirror.
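As a rough sketch of the "local mirror" idea, the snippet below loads extract-style CSV rows into SQLite and answers the original per-lead question with an indexed query. The column names, the sample rows, and the use of 46 as the Interesting Moment type id are assumptions made for illustration; check them against your own extract files before relying on them.

```python
import csv
import io
import sqlite3

# Hypothetical sample of extract output; real files have more columns.
SAMPLE_CSV = "\n".join([
    "marketoGUID,leadId,activityDate,activityTypeId,primaryAttributeValue",
    "1,835054,2021-05-10T09:00:00Z,46,Web",
    "2,835054,2021-06-20T15:30:00Z,46,Email",
    "3,835054,2021-06-21T10:00:00Z,1,example.com/pricing",
    "4,111,2021-06-22T10:00:00Z,46,Web",
    "5,835054,2021-01-01T00:00:00Z,46,Web",
])


def build_mirror(csv_text: str) -> sqlite3.Connection:
    """Load CSV rows into an in-memory table indexed for per-lead lookups."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activities ("
        "marketoGUID TEXT PRIMARY KEY, leadId INTEGER, activityDate TEXT, "
        "activityTypeId INTEGER, primaryAttributeValue TEXT)"
    )
    conn.execute(
        "CREATE INDEX idx_lead_type_date "
        "ON activities (leadId, activityTypeId, activityDate)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    # INSERT OR REPLACE keeps repeated loads of the same rows idempotent.
    conn.executemany(
        "INSERT OR REPLACE INTO activities VALUES (:marketoGUID, :leadId, "
        ":activityDate, :activityTypeId, :primaryAttributeValue)",
        rows,
    )
    conn.commit()
    return conn


def lead_activities(conn, lead_id: int, type_id: int, since_iso: str):
    """Return (activityDate, primaryAttributeValue) rows, oldest first."""
    cursor = conn.execute(
        "SELECT activityDate, primaryAttributeValue FROM activities "
        "WHERE leadId = ? AND activityTypeId = ? AND activityDate >= ? "
        "ORDER BY activityDate",
        (lead_id, type_id, since_iso),
    )
    return cursor.fetchall()


mirror = build_mirror(SAMPLE_CSV)
moments = lead_activities(mirror, 835054, 46, "2021-04-24T00:00:00Z")
print(moments)
```

Here the filter is part of the query itself, so the lookup cost tracks the number of matching rows. The ISO 8601 timestamps compare correctly as text only because they share one format and time zone.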