I stumbled across an API from a popular vendor in my sector that asks the client to dynamically determine the max records allowed in a single POST request by first performing a GET request and retrieving a property called MaxRequestSize. The idea is that the client is to use this property to split up their records and make multiple POST requests to the API based on the value of this property.

Is there a reason you would want to do this? Is it a bad practice in general?

Every request carries a certain amount of overhead plus some number of records. The more records per request, the less overhead per record. But only so many records are allowed before the server wants to stop and deal with something else. So the client wants to know the most it can send before getting cut off.
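As a rough sketch of what the client side of this looks like: fetch the limit, then split the records into batches no larger than that limit. The names `get_max_request_size` and `post_batch` are hypothetical stand-ins for the vendor's GET and POST calls, which are not shown in the question.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split_into_batches(records: Sequence[T], max_request_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most max_request_size records."""
    if max_request_size < 1:
        raise ValueError("max_request_size must be at least 1")
    for start in range(0, len(records), max_request_size):
        yield records[start:start + max_request_size]


def post_all(
    records: Sequence[T],
    get_max_request_size: Callable[[], int],   # hypothetical: wraps the GET call
    post_batch: Callable[[Sequence[T]], R],    # hypothetical: wraps one POST call
) -> List[R]:
    """Ask the server for its limit once, then send one POST per batch."""
    limit = get_max_request_size()
    return [post_batch(batch) for batch in split_into_batches(records, limit)]
```

With a limit of 3 and 7 records, this would send three POSTs of sizes 3, 3 and 1.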

This is a good idea. It's such a good idea that it's one of the biggest differences between how IPv6 works and how IPv4 works. In IPv4, routers along the way may fragment a packet that is too large for the next link. In IPv6, routers never fragment; the sender uses Path MTU Discovery to learn the smallest packet size supported along the path and sizes its packets accordingly. The traffic then starts life small enough that it does not need to be refragmented when it reaches the next network.

Edit: I see in a comment that you explain it is a geocoding API. Presumably you send addresses and receive latitude/longitude pairs.

In this geocoding case, the server is under the same load per address, so the only difference between one large request and many small ones is the response time of each request.

Given that the client can still request the same number of results at the same time either way, via many simultaneous small requests, a better design is to queue each address separately for processing and return them when complete. Ideally you could push individual results back to the client.
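A minimal sketch of that idea, using `asyncio` to run each address as its own task and hand results back as they finish. The `geocode` coroutine here is a made-up placeholder (it just sleeps and returns dummy coordinates); a real service would do the actual lookup there.

```python
import asyncio
from typing import AsyncIterator, Iterable, List, Tuple

Coordinates = Tuple[float, float]


async def geocode(address: str) -> Coordinates:
    """Hypothetical stand-in for the real lookup; returns dummy coordinates."""
    await asyncio.sleep(0.01)
    return (float(len(address)), 0.0)


async def geocode_each(addresses: Iterable[str]) -> AsyncIterator[Tuple[str, Coordinates]]:
    """Process every address separately and yield each result when it completes."""

    async def run(address: str) -> Tuple[str, Coordinates]:
        return address, await geocode(address)

    tasks = [asyncio.create_task(run(a)) for a in addresses]
    for finished in asyncio.as_completed(tasks):
        yield await finished


async def collect(addresses: Iterable[str]) -> List[Tuple[str, Coordinates]]:
    """Gather the streamed results into a list, in completion order."""
    return [pair async for pair in geocode_each(addresses)]
```

Results arrive in completion order rather than submission order, so each one carries its address with it.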

You could also implement a round-robin processing strategy if you want to distribute processing time fairly among many clients.
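One way to picture round-robin processing is to take one item from each client's queue in turn, so that a client with a large backlog cannot starve the others. This is only an illustrative sketch; the `pending` mapping of client name to queued addresses is an assumed data shape.

```python
from collections import deque
from typing import Deque, Dict, Iterator, List, Tuple


def round_robin(pending: Dict[str, List[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (client, item) pairs, taking one item per client per turn."""
    queues: Deque[Tuple[str, Deque[str]]] = deque(
        (client, deque(items)) for client, items in pending.items() if items
    )
    while queues:
        client, items = queues.popleft()
        yield client, items.popleft()
        if items:
            # This client still has work, so it goes to the back of the line.
            queues.append((client, items))
```

For example, if client `a` has three addresses queued and client `b` has one, `b` is served second instead of waiting behind all of `a`'s work.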

Usually a rate limit on an API restricts you to a maximum number of requests per time period rather than the size of the request.

(Returning multiple records in a single response generally incurs no extra CPU load, and a single large response uses less bandwidth than the same information transmitted in many small responses.)

Say you are retrieving or uploading n records. You can do it in one request or several. Usually the single (large) request will put less load on the server.

(Because each individual request incurs some general handling cost, which you only have to pay once for the single large request vs many times for the multiple small requests)
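To make the arithmetic concrete, here is a toy cost model. It assumes a fixed handling cost per request and a fixed cost per record; the units and the numbers in the example are invented purely for illustration.

```python
import math


def total_cost(
    n_records: int,
    batch_size: int,
    per_request_overhead: float,
    per_record_cost: float,
) -> float:
    """Total cost when n_records are sent in batches of batch_size."""
    requests = math.ceil(n_records / batch_size)
    return requests * per_request_overhead + n_records * per_record_cost
```

With an overhead of 5 units per request and 1 unit per record, 1000 records cost 1005 units in a single request but 1500 units when sent 10 at a time, because the overhead is paid 100 times.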

Given that the client still needs to process all its requests, splitting them up would seem to offer no performance value.

(If there is also a rate limit and the client needs the data now, then the client will hit the limit and get an error. If the client doesn't need the data immediately, then the server could send a response later.)

Having the client dynamically change its batch size also seems problematic. Should I make the GET call every time? Can I cache the value? What if there is a delay between getting the value and sending the batch?

(These are all problems you would need to solve, which you would not have with a fixed limit.)

A better design would surely be to have the server perform the splitting to the required number, before passing it on to the next stage internally and then combining the results back into a single response.

(Assuming the cost of processing a large request is not linear, say it's O(n^2), the server could split the request as easily as the client.)
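A sketch of what server-side splitting could look like: the handler accepts the whole request, feeds it to the next stage in chunks of an internal size, and stitches the results back together. `process_chunk` and `internal_limit` are hypothetical names for the next stage and its preferred chunk size.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def handle_request(
    records: Sequence[T],
    process_chunk: Callable[[Sequence[T]], List[R]],  # hypothetical next stage
    internal_limit: int,                              # hypothetical chunk size
) -> List[R]:
    """Split internally, process chunk by chunk, return one combined response."""
    if internal_limit < 1:
        raise ValueError("internal_limit must be at least 1")
    results: List[R] = []
    for start in range(0, len(records), internal_limit):
        results.extend(process_chunk(records[start:start + internal_limit]))
    return results
```

The client sends one request and receives one response, and never needs to know the internal limit.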

I suppose you can imagine timeout issues with large batches being a problem. But this is better served by switching to a command query or async pattern rather than RPC.

(The only constraint on server-side splitting compared with client-side splitting is a per-request timeout while waiting for the response. If you change to an async pattern with a callback to, or polling from, the client, this can be negated.)
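The polling variant of that async pattern might look roughly like this in-memory sketch: the client submits a batch and immediately gets a job id, then polls until the result is ready. The `JobStore` class and its method names are invented for illustration; a real server would run `work` in a background worker and expose `submit` and `poll` over HTTP.

```python
import itertools
from typing import Callable, Dict, List, Optional, Sequence


class JobStore:
    """Hypothetical in-memory job tracker for a submit-then-poll API."""

    def __init__(self, process: Callable[[List[str]], List[str]]) -> None:
        self._process = process
        self._ids = itertools.count(1)
        self._pending: Dict[int, List[str]] = {}
        self._done: Dict[int, List[str]] = {}

    def submit(self, records: Sequence[str]) -> int:
        """Accept a batch of any size and return a job id straight away."""
        job_id = next(self._ids)
        self._pending[job_id] = list(records)
        return job_id

    def work(self) -> None:
        """Process the oldest pending job, if there is one."""
        if self._pending:
            job_id = next(iter(self._pending))
            self._done[job_id] = self._process(self._pending.pop(job_id))

    def poll(self, job_id: int) -> Optional[List[str]]:
        """Return the result if finished, otherwise None."""
        return self._done.get(job_id)
```

Because `submit` returns before any processing happens, the size of the batch no longer determines whether the request times out.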

Usually a rate limit on an API restricts you to a maximum number of requests per time period rather than the size of the request. -- OK. But returning multiple records in the same request still incurs a non-trivial expense.
– Robert Harvey♦ Aug 25 '16 at 18:14

Given that the client still needs to process all its requests, splitting them up would seem to offer no performance value. -- Unless there's also a number of requests rate limit.
– Robert Harvey♦ Aug 25 '16 at 18:15

Should I make the GET call every time? Can I cache the value? What if there is a delay between getting the value and sending the batch? -- The rate limit should be effective for a specified period of time, after which it is renegotiated.
– Robert Harvey♦ Aug 25 '16 at 18:16

A better design would surely be to have the server perform the splitting to the required number, before passing it on to the next stage internally and then combining the results back into a single response. -- What? This doesn't even make any sense.
– Robert Harvey♦ Aug 25 '16 at 18:17

this is better served by switching to a command query or async pattern rather than RPC -- Unless blocking is not the problem they're trying to solve.
– Robert Harvey♦ Aug 25 '16 at 18:18