How we use the Salesforce APIs
Choosing the right API, especially for data retrieval, can be difficult: there are REST APIs, SOAP APIs, and two generations of Bulk APIs.
This page provides a summary of which APIs we use in different circumstances. There are some further edge cases we left out, but for the most part this should paint an accurate picture of how we interact with Salesforce.
Inbound Syncs
Salesforce can have many hundreds of objects, and people often want to simply select all of them. However, querying hundreds of objects creates significant overhead.
So we first use the Record Count API to get an estimated record count of all objects. If an object returns 0 records from this, we skip it. In most situations, this saves us from checking many hundreds of objects.
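The Record Count endpoint itself is real (`GET /services/data/vXX.0/limits/recordCount?sObjects=...`), but the helper below is only a sketch of the filtering step; the function name and payload handling are illustrative, not Omnata's actual code.

```python
# Hypothetical sketch of the "skip empty objects" step.
# Real endpoint: GET {instance_url}/services/data/v57.0/limits/recordCount?sObjects=Account,Contact
# Example response shape: {"sObjects": [{"name": "Account", "count": 120}, ...]}

def objects_worth_querying(record_count_payload, object_names):
    """Filter out objects the Record Count API estimates as empty."""
    counts = {s["name"]: s["count"] for s in record_count_payload["sObjects"]}
    return [o for o in object_names if counts.get(o, 0) > 0]
```

Because the counts are estimates rather than exact, this is a cheap pre-filter, not a correctness guarantee.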
Then, if the stream is configured as Incremental and there is some stream state present, we run a SOQL query to determine the count of new changes. If that is 0, we skip it.
Then, we use the Bulk V1 query API, provided that:
The object supports the Bulk APIs (not all do)
There are more than 2000 records to retrieve
We fall back to the regular REST query API for the remainder, to avoid the overhead of bulk jobs.
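The routing above can be sketched as a small decision function. The 2000-record threshold comes from the text; the function shape and return values are illustrative.

```python
# Sketch of the Bulk-vs-REST routing described above.
BULK_THRESHOLD = 2000

def choose_query_api(supports_bulk, record_count):
    """Pick Bulk V1 for large, bulk-capable objects; plain REST otherwise."""
    if supports_bulk and record_count > BULK_THRESHOLD:
        return "bulk_v1"
    return "rest"
```

Small result sets go to REST because spinning up a bulk job (create job, add batch, poll, fetch results) costs more round trips than it saves.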
In most cases, we can use the nextRecordsUrl property to incrementally fetch the whole result set. However, there's one more scenario we have to cater for. The SOQL query we submit contains all of the fields of the object, e.g. select Id,Name,..... from Account.
In some situations, this query can get so long that it causes Salesforce to throw an error. So we check the length, and if it exceeds 18,000 characters we fall back to using the FIELDS(ALL) function. The caveat here is that you have to use LIMIT 200, and OFFSET only supports values up to 2,000. So we fetch 200 records at a time with a moving cursor value until we get no more results.
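The moving-cursor loop can be sketched as below. The LIMIT 200 constraint on FIELDS(ALL) is real; the helper names are hypothetical, and run_query stands in for whatever function executes a SOQL string and returns a list of record dicts.

```python
# Sketch of the FIELDS(ALL) fallback: page 200 rows at a time, advancing a
# cursor on Id rather than using OFFSET (which is capped at 2,000).
PAGE_SIZE = 200  # FIELDS(ALL) requires a LIMIT of 200 or less

def fetch_all_fields(object_name, run_query):
    """Fetch every record by repeatedly querying past the last seen Id."""
    records, cursor = [], None
    while True:
        where = f"WHERE Id > '{cursor}' " if cursor else ""
        page = run_query(
            f"SELECT FIELDS(ALL) FROM {object_name} "
            f"{where}ORDER BY Id LIMIT {PAGE_SIZE}"
        )
        if not page:
            return records
        records.extend(page)
        cursor = page[-1]["Id"]
```

Keying the cursor on Id works because Salesforce Ids sort consistently, so each query resumes exactly where the previous page ended.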
When the Bulk query API is used, PK Chunking is enabled if the object supports it and there are more than 1 million records to fetch.
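In Bulk API V1, PK chunking is requested via the Sforce-Enable-PKChunking header when creating the job. That header name is real; the threshold mirrors the rule above, and the helper shape and chunk size choice are illustrative.

```python
# Sketch of building Bulk V1 job-creation headers with optional PK chunking.
PK_CHUNKING_THRESHOLD = 1_000_000

def bulk_job_headers(session_id, supports_pk_chunking, record_count):
    """Enable PK chunking only for large, chunking-capable objects."""
    headers = {"X-SFDC-Session": session_id}
    if supports_pk_chunking and record_count > PK_CHUNKING_THRESHOLD:
        # chunkSize is optional; 250,000 is the maximum Salesforce allows
        headers["Sforce-Enable-PKChunking"] = "chunkSize=250000"
    return headers
```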
If there are more than 10 million records to fetch and the object does not support PK chunking, we do our own form of client side pagination to avoid scale-related errors in the bulk job.
Outbound Syncs
For outbound syncs, we always use the Bulk v1 API. The reason for this is that it gives us access to the "Serial Mode" parameter, which can be used to prevent contention-related errors in busy orgs.
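In Bulk API V1, serial mode is set via the concurrencyMode element of the jobInfo body when the job is created. A sketch of such a request body follows; the operation, object, and content type are illustrative, not Omnata's exact job settings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <operation>insert</operation>
  <object>Contact</object>
  <contentType>CSV</contentType>
  <!-- Serial processes batches one at a time, avoiding record-lock contention -->
  <concurrencyMode>Serial</concurrencyMode>
</jobInfo>
```

Serial mode trades throughput for safety: batches run one after another, so concurrent updates to the same parent records can't deadlock each other.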
Salesforce API quota considerations
The typical usage patterns of Omnata (scheduled SOQL queries and/or ingest jobs into Salesforce) do not tend to make a significant contribution to API quota usage. You can check your API quotas by Salesforce edition in the Salesforce docs for the REST APIs and Bulk APIs.
Inbound syncs
For inbound syncs, there are two stages with different API usage profiles: the initial sync run and the ongoing sync runs.
It is not possible to accurately estimate the API quota usage of the initial sync, since large objects may use either the REST or Bulk APIs, and may trigger automated scaling techniques that increase the number of batches.
For the ongoing sync runs, Omnata automatically fetches incremental changes (unless configured otherwise). If you run the sync on a schedule frequent enough that each run sees fewer than 2000 changed records, Omnata will only query the REST API. In this case, you can approximate the number of REST API calls in 24 hours as [number of streams in all syncs] * [number of sync runs in 24 hours].
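As a worked example of that approximation (the figures here are made up for illustration):

```python
# 10 streams across all syncs, each sync running hourly (24 runs per day)
streams_across_all_syncs = 10
sync_runs_per_24_hours = 24
approx_rest_calls_per_day = streams_across_all_syncs * sync_runs_per_24_hours
# approx_rest_calls_per_day == 240
```

Runs that exceed 2000 changed records fall outside this approximation, since they are routed to the Bulk API instead.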
Outbound syncs
For outbound syncs, the quota usage depends on the size and shape of the data being ingested into Salesforce. You can see how this is limited in the docs under Limits Specific to Ingest Jobs; jobs that hit these limits will be automatically scaled out across multiple batches.