The
information provided applies to Informix Server version 11.70.xC4
and later.

With the default configuration of the Informix Warehouse Accelerator,
it is possible that on a single machine the existing CPU resources of
that machine are not utilized to the maximum when executing queries.
One way to increase CPU utilization is to increase the number of worker
nodes, as described in an earlier entry of this blog (topic Configuration
Tips (1): CPU Resources).

However, on a single machine this approach has the drawback that more
worker nodes also increase the amount of memory needed for the data
marts, because the dimension table data is duplicated for every worker
node (described in another earlier blog entry on the topic of Estimating
the Size of a Data Mart). On a cluster system where each accelerator
worker node runs on its own cluster hardware node, this duplication of
dimension tables is unavoidable, because by the architectural design of
clusters the memory cannot be shared among nodes. But with the
accelerator on a single (SMP) machine this really hurts, because all
worker nodes have their portion of shared memory on the same machine.
There, the memory could in principle be shared not only among the
threads of a single worker node, but also among the different worker
nodes themselves. Duplicating the dimension tables in memory for each
worker node is therefore not efficient on a single machine.

Fortunately, there is a way to avoid having multiple worker nodes
with the dimension tables duplicated in memory and still achieve a
higher degree of parallelism, and with that better CPU utilization, on a
single machine. Each worker node is already by itself inherently
parallel, using threads to execute queries concurrently.
Using more parallel threads for query execution allows a single
worker node to use more CPU resources and thus execute the query faster.
The same can be achieved for the work that needs to be done while
loading data into a data mart, although here the performance gain
is smaller, because the load tasks are distributed among different
worker nodes more effectively than a single worker node can distribute
the whole work among its threads.

Two configuration parameters can be used to control the CPU utilization
of each worker node, for query execution as well as for data mart loads:

CORES_FOR_SCAN_THREADS_PERCENTAGE

Determines the percentage of the present CPU resources that should be
used for query execution. (A major task within query execution is
scanning the table data, hence the "SCAN" in the name of this
parameter.)

CORES_FOR_LOAD_THREADS_PERCENTAGE

Determines the percentage of the present CPU resources that should be
used for data mart loading tasks (such as building data histograms and
dictionaries, and compressing the data).

Both parameters are optional and can be specified independently in
the accelerator's configuration file dwainst.conf. If a parameter is
absent, its default setting is 17 (percent). Once the
parameters and their values have been put into the dwainst.conf
file, they need to be propagated to the individual worker and
coordinator nodes to take effect. For this, the accelerator must
be stopped (using the command "ondwa stop"), the parameters
need to be dispersed (using the command "ondwa setup"), and
finally the accelerator needs to be started again (using the command
"ondwa start").
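Putting the steps above together, the procedure might look like this (the percentage values of 80 are merely sample settings for a smaller machine, not recommendations for your configuration):

```
# In dwainst.conf, add or change the two parameters, e.g.:
#   CORES_FOR_SCAN_THREADS_PERCENTAGE=80
#   CORES_FOR_LOAD_THREADS_PERCENTAGE=80

# Then propagate the new settings to all nodes:
ondwa stop     # stop the accelerator
ondwa setup    # disperse the parameters to the worker and coordinator nodes
ondwa start    # start the accelerator again
```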

The accelerator uses the settings of these parameters to determine
the number of parallel threads to start for execution of the
respective task at hand. This is done by an internal calculation based
on empirical values and the given percentage of present CPU cores.
Here, present CPU cores refers to the number of CPU cores that the
Linux OS has at its disposal, which may differ from two related
figures: the number of installed CPU cores and the number of available
CPU cores. Especially in a virtualized environment, the number of
present CPU cores often does not equal the number of physically
installed CPU cores, as the virtual machine may be entitled to utilize
only a portion of the physically existing hardware. (Though it may not
be very feasible to run the accelerator in a virtual machine for
serious production applications.) On the other hand, the number of
present CPU cores may not equal the number of available CPU cores, as
some CPU cores may be more or less permanently occupied by other
processes (like an Informix server with configured processor affinity).

It should be noted that, in general, parallel threads for query
execution scale quite well. However, on a big machine (e.g. 64 CPU cores
or even more), it is quite possible to configure the CPU percentage too
high, resulting in so many threads that the overhead becomes noticeable
and performance no longer improves. On a smaller machine, with
say 8 or 16 CPU cores, it is quite safe to configure both parameters
close to 80 or 90 percent.
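The accelerator's actual thread-count calculation is internal and also uses empirical values, but the effect of the percentage can be illustrated with a rough sketch. The function below and its rounding are an assumption for illustration only, not the accelerator's real formula:

```python
def cores_for_percentage(present_cores: int, percentage: int) -> int:
    """Rough illustration only: how many of the present CPU cores a
    given percentage corresponds to (at least 1). The accelerator's
    real thread-count calculation is internal and may differ."""
    return max(1, (present_cores * percentage) // 100)

# Default of 17 percent on a 16-core machine:
print(cores_for_percentage(16, 17))   # → 2
# 80 percent on the same 16-core machine:
print(cores_for_percentage(16, 80))   # → 12
# 90 percent on a big 64-core machine means many more threads:
print(cores_for_percentage(64, 90))   # → 57
```

This makes it plausible why a high percentage is harmless on an 8- or 16-core machine but can spawn an excessive number of threads on a 64-core machine.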

Everything said so far applies to the accelerator installed on a single
(SMP) machine. In a cluster installation the accelerator normally
runs each individual coordinator or worker node exclusively on its
own cluster hardware node. For such an installation it is recommended
to set both parameters to 100 percent.