The control-flow module for GeoServer allows the administrator to control the number of concurrent requests actually executing inside the server,
and provides a way to slow down users making too many requests.
This kind of control is useful for a number of reasons:

Performance: tests show that, with local data sources, the maximum throughput in GetMap requests is achieved when the number of requests allowed to run in parallel is at most twice the number of CPU cores.

Resource control: requests such as GetMap can use a significant amount of memory. The WMS request limits make it possible to control the amount of memory used per request, but an OutOfMemoryError is still possible if too many requests run in parallel. By also controlling the number of requests executing, it’s possible to keep the total amount of memory used below the memory actually given to the Java Virtual Machine.

Fairness: a single user should not be able to overwhelm the server with a lot of requests, leaving other users with tiny slices of the overall processing power.

The control flow method does not normally reject requests; it just queues up those in excess and executes them later. However, it’s possible to configure the module to reject requests that have been waiting in the queue for too long.
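The queue-then-optionally-reject behaviour can be illustrated with a small Python sketch. This is not GeoServer's implementation: the `ConcurrencyGate` class and its parameters are hypothetical names chosen for illustration, and the sketch only assumes that excess requests wait for a free slot and are rejected once an optional timeout expires.

```python
import threading


class ConcurrencyGate:
    """Hypothetical gate: at most `limit` handlers run at once, the rest wait."""

    def __init__(self, limit, timeout_s=None):
        self._slots = threading.BoundedSemaphore(limit)
        self._timeout_s = timeout_s  # None means "wait as long as needed"

    def run(self, handler):
        # Wait for a free slot; with a timeout set, give up once it expires.
        acquired = self._slots.acquire(timeout=self._timeout_s)
        if not acquired:
            raise TimeoutError("request waited in queue for too long")
        try:
            return handler()
        finally:
            # Always free the slot so that queued requests can proceed.
            self._slots.release()
```

With `timeout_s=None` the gate never rejects and only delays, which mirrors the default behaviour described above.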

There are two mechanisms to identify user requests. The first one is cookie-based, so it works fine for browsers but less well for other kinds of clients. The second one is IP-based, which works for any type of client but can limit all the users sitting behind the same router.

This prevents a single user (as identified by a cookie) from making too many requests in parallel:

user=<count>

Where <count> is the maximum number of requests a single user can execute in parallel.

The following prevents a single IP address from making too many requests in parallel:

ip=<count>

Where <count> is the maximum number of requests a single ip address can execute in parallel.

It is also possible to be more specific and throttle one particular IP address by using the following:

ip.<ip_addr>=<count>

Where <count> is the maximum number of requests the IP address specified in <ip_addr> can execute in parallel.
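To make the three rule forms concrete, here is a minimal Python sketch that reads them from configuration text and lists which ones would concern a request from a given address. The function names and the address `192.168.1.8` are hypothetical, and the sketch deliberately does not decide how several matching rules combine, since the text above does not say.

```python
def parse_concurrency_rules(text):
    """Collect 'user=<count>', 'ip=<count>' and 'ip.<ip_addr>=<count>' rules."""
    rules = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        key = key.strip()
        # Other rule types (timeout, ows.*, rate rules) are ignored here.
        if key in ("user", "ip") or key.startswith("ip."):
            rules[key] = int(value)
    return rules


def matching_rules(rules, ip_addr):
    """Return the (key, count) pairs that concern a request from ip_addr."""
    candidates = ["user", "ip", "ip." + ip_addr]
    return [(key, rules[key]) for key in candidates if key in rules]
```

For example, with the rules `user=6`, `ip=6` and `ip.192.168.1.8=2`, a request from `192.168.1.8` matches all three, while a request from any other address matches only the first two.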

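The rate control rules allow setting the maximum number of requests per unit of time, based either
on a cookie or on an IP address. These rules look as follows (see “Per user concurrency control” for the meaning of “user” and “ip”):

user.ows[.<service>[.<request>[.<outputFormat>]]]=<requests>/<unit>[;<delay>s]

ip.ows[.<service>[.<request>[.<outputFormat>]]]=<requests>/<unit>[;<delay>s]

Where: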

<service> is the OWS service in question (at the time of writing, it can be wms, wfs, or wcs)

<request>, optional, is the request type. For example, for the wms service it can be GetMap, GetFeatureInfo, DescribeLayer, GetLegendGraphics, GetCapabilities

<outputFormat>, optional, is the output format of the request. For example, for the wms GetMap request it could be image/png, image/gif and so on

<requests> is the number of requests in the unit of time

<unit> is the unit of time, can be “s”, “m”, “h”, “d” (second, minute, hour and day respectively).

<delay> is an optional delay applied to the requests that exceed the maximum number of requests in the current time slot. If not specified, once the limit is exceeded an immediate failure response with HTTP code 429 (“Too many requests”) will be sent back to the caller.
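As an illustration of how the fields fit together, the following Python sketch splits a rate control rule into its parts. It is not GeoServer's parser: `RateRule` and `parse_rate_rule` are hypothetical names, and the sketch assumes the delay is written as a whole number of seconds followed by `s`, as in the rule examples in this section.

```python
import re
from dataclasses import dataclass
from typing import Optional

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
# <requests>/<unit>, optionally followed by ;<delay>s (assumed form)
_VALUE = re.compile(r"^(\d+)/([smhd])(?:;(\d+)s)?$")


@dataclass
class RateRule:
    scope: str                    # "user" (cookie based) or "ip"
    service: Optional[str]
    request: Optional[str]
    output_format: Optional[str]
    requests: int                 # allowed requests per period
    period_s: int                 # length of the unit of time, in seconds
    delay_s: Optional[int]        # None means excess requests fail with 429


def parse_rate_rule(line):
    key, sep, value = line.strip().partition("=")
    # At most 5 parts, so dots inside an output format stay intact.
    parts = key.strip().split(".", 4)
    if not sep or len(parts) < 2 or parts[0] not in ("user", "ip") or parts[1] != "ows":
        raise ValueError("not a rate control rule: " + line)
    match = _VALUE.match(value.strip())
    if match is None:
        raise ValueError("bad rate specification: " + value)
    service, request, output_format = (parts[2:] + [None, None, None])[:3]
    count, unit, delay = match.groups()
    return RateRule(parts[0], service, request, output_format,
                    int(count), UNIT_SECONDS[unit],
                    int(delay) if delay is not None else None)
```

Parsing `user.ows.wps.execute=1000/d;30s` with this sketch yields a rule scoped to `user`, service `wps`, request `execute`, 1000 requests per 86400 seconds, and a 30 second delay.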

The following rule will allow 1000 WPS Execute requests a day, and delay each one in excess by 30 seconds:

user.ows.wps.execute=1000/d;30s

The following rule will instead allow up to 30 GetMap requests a second, but will immediately fail any request exceeding the cap:

user.ows.wms.getmap=30/s
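The two rules above differ only in what happens to requests over the cap: one delays them, the other fails them. The Python sketch below shows that decision with a simple fixed-window counter. It is an assumption-laden illustration, not GeoServer's algorithm: the class name, the injected `clock` parameter, and the choice to start each time slot at the first request seen are all hypothetical.

```python
import time


class FixedWindowLimiter:
    """Hypothetical per-key limiter counting requests in fixed time slots."""

    def __init__(self, max_requests, period_s, delay_s=None, clock=time.monotonic):
        self._max_requests = max_requests
        self._period_s = period_s
        self._delay_s = delay_s      # None: excess requests are rejected
        self._clock = clock
        self._windows = {}           # key -> (slot start, requests seen)

    def check(self, key):
        """Return ('allow', 0), ('delay', seconds) or ('reject', 429)."""
        now = self._clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self._period_s:
            start, count = now, 0    # the previous slot expired, start a new one
        count += 1
        self._windows[key] = (start, count)
        if count <= self._max_requests:
            return ("allow", 0)
        if self._delay_s is None:
            return ("reject", 429)   # HTTP 429, "Too many requests"
        return ("delay", self._delay_s)
```

Here `key` stands for whatever identifies the caller, a cookie value or an IP address, so each user gets an independent counter.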

In both cases, headers informing the user of the request rate control will be added to the HTTP response.

GeoWebCache contributes three cached tile services to GeoServer: WMS-C, TMS, and WMTS. It is also possible to use the
control-flow module to throttle them by adding the following rule to the configuration file:

ows.gwc=<count>

Where <count> is the maximum number of concurrent tile requests that will be delivered by GeoWebCache at any given time.

Note also that tile requests are subject to the other rules as well (user based, ip based, timeout, etc).

Assuming the server we want to protect has 4 cores, a sample configuration could be:

# if a request waits in queue for more than 60 seconds it's not worth executing,
# the client will likely have given up by then
timeout=60
# don't allow the execution of more than 100 requests total in parallel
ows.global=100
# don't allow more than 10 GetMap in parallel
ows.wms.getmap=10
# don't allow more than 4 outputs with Excel output as it's memory bound
ows.wfs.getfeature.application/msexcel=4
# don't allow a single user to perform more than 6 requests in parallel
# (6 being the Firefox default concurrency level at the time of writing)
user=6
# don't allow the execution of more than 16 tile requests in parallel
# (assuming a server with 4 cores, GWC empirical tests show that throughput
# peaks up at 4 x number of cores. Adjust as appropriate to your system)
ows.gwc=16
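The core-based rules of thumb quoted in this section (at most 2 x cores for GetMap with local data sources, 4 x cores for GeoWebCache tiles) can be turned into starting values with a few lines of Python. This is only a sketch: the function names are hypothetical, and the results are starting points to adjust after testing, not recommended settings. The sample above, for instance, uses a slightly higher GetMap value of 10.

```python
import os


def suggested_limits(cores=None):
    """Derive starting values from the core-based rules of thumb."""
    if cores is None:
        # os.cpu_count() may return None, so fall back to a single core.
        cores = os.cpu_count() or 1
    return {
        "ows.wms.getmap": 2 * cores,  # GetMap throughput peaks at up to 2 x cores
        "ows.gwc": 4 * cores,         # tile throughput peaks at about 4 x cores
    }


def to_properties(limits):
    """Render the limits as key=value lines for the configuration file."""
    return "\n".join(f"{key}={value}" for key, value in limits.items())
```

Calling `to_properties(suggested_limits(4))` produces the two lines `ows.wms.getmap=8` and `ows.gwc=16`.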