The kernel extension is designed to be activated on a single source node from which it will selectively drop packets either outbound to a target node or received from it. It is not necessary to install the packet drop simulator on the target node. The target node itself does not even need to be AIX-based but just needs to support the TCP/IP protocol.

The article first presents some background to TCP/IP performance and why dropped packets can have a serious impact on application performance. You will then see how to build and use the kernel extension as provided. This enables you to get up and running quickly with it.

The supplied functionality is deliberately very simple. It allows you to set a given percentage of TCP/IP packets to be dropped randomly to and from a specified host, with all other hosts and protocols unaffected and no retained history. In practice, you might need to add further functionality to the kernel extension to meet your own requirements, so the article also explains how the extension works and how it can be customised. You might choose to customise it if you need to:

Permit packet dropping to and from multiple hosts.

Support different packet drop rates for each of those hosts.

Create historic logs of total packets received, transmitted, and dropped in both directions.

Implement the packet drop simulator on different versions of AIX or C compilers to those used when the software was written.

The kernel extension could be readily developed further to simulate other network issues that could give rise to performance problems, such as packet corruption, packets arriving out of order, and jitter.

It is recommended the kernel extension is only used on non-production systems.

Background

First, let's review some basic terms that are important when considering network performance: bandwidth, latency, and congestion.

Bandwidth is the capacity of the network, that is, the rate at which data can be pushed across it. This is usually measured in Megabits per second (Mbps), where one Megabit is 10^6 bits. It can be thought of as the width of the network pipeline.

Latency is the time it takes for one piece of data to traverse the network. This is usually measured in milliseconds. It can be thought of as the length of the network pipeline.

The difference between bandwidth and latency is illustrated in Figure 1.

Figure 1. Latency and bandwidth and the network pipeline

Congestion is the effect of heavy usage of the network. It can result in delays in propagating data, loss of packets resulting in packet retransmission, and an inability to make connections to the network. Packet loss is usually measured as a percentage. TCP can normally handle losses up to 0.1% with little direct impact. If the losses increase above this, the effect can be more severe.

Network latency, bandwidth, and congestion can have a serious impact on application performance. Consider, for example, the situation where a front-end application on one system makes frequent calls across the network to another system which hosts a database used by the application. The traffic across the network to the database might consist of many bursts of small data, or might consist of fewer sets of relatively large data transfers. Either way, the ability of the network to transfer this data reliably and speedily can be a significant factor in the application's perceived end-to-end performance by a customer.

Application development teams often have access to isolated, high-speed networks that can be much more lightly used than the environment into which an application is to be deployed. This can mean the effect of poor network performance on the application may never be observed within the environment in which it is developed and tested. This can, in turn, mean that the performance observed within a customer's environment might be very different from that observed in the environment in which it was tested.

This article focuses on the effect of dropped packets on a particular application's performance. Very large numbers of packets can be transferred between a front-end and a back-end system, and the patterns in which they are transmitted are specific to the nature of those applications. The simplest way to understand the impact on the application is therefore to simulate the dropping of packets.

Packets can be dropped when transferring data between systems for two key reasons:

Heavy network utilization and resulting congestion

Faulty network hardware or connectors

TCP is designed to react when packets are dropped on a network. When a packet is successfully delivered to its destination, the destination system sends an acknowledgement back to the source system. If this acknowledgement is not received within a certain interval, either the destination system never received the packet or the packet containing the acknowledgement was itself lost. In either case, the source system assumes that the destination system never received the packet and retransmits it. If packets are already being lost because the network is performing poorly, these retransmissions add further load to the network, which causes still more packets to be lost. This behaviour can very quickly create a critical situation on the network.

There are some freely available sophisticated software products that can help to simulate different characteristics of varying network performance. These include:

WANem runs from a Knoppix-based CD and can simulate a large range of different network characteristics. It does offer the ability to drop a random number of packets with filtering options but would need to run on a separate system in between the source and target systems. This may not be convenient, for example if the source and target systems are installed on dedicated high-speed networks.

Although ipfilter looked very promising, it had only been partially ported to AIX 5.3 and it became evident that it would be quicker to write a bespoke utility providing the minimal functionality required rather than porting the whole of ipfilter to both AIX 6.1 and 7.1.

dummynet is a component of the FreeBSD operating system and has also been ported to other platforms including Linux and Microsoft® Windows®. The disadvantage of using this though is that to simulate packet drops for packets originating from an AIX host, an intermediate FreeBSD/Linux host has to be configured and set up with dummynet.

ALTQ is an alternate queueing framework that helps to provide bandwidth control, mostly on BSD routers.

There are also other packages commercially available such as LANforge ICE but these were ruled out for cost reasons. LANforge also requires dedicated hardware.

How to build and use the kernel extension as supplied

Requirements

The packet drop simulator was built and tested in the following environments:

AIX V7.1.4.0 with gcc V4.7.2-1

AIX V6.1.2.0 with gcc V4.2.0

AIX V6.1.4.0 with gcc V4.2.0

The kernel extension and associated C programs were compiled using the gcc compiler. It should be straightforward to use an AIX C compiler instead if required, although some of the compilation flags will need to be changed.

Note that it will be necessary to install the AIX bos.adt.syscalls file set to build the kernel extension. This will be available on your AIX installation media. The file set is not required on systems where the kernel extension is only to be deployed, as long as the AIX level is identical to that of the build system. If there are any differences in AIX levels between the two systems, the kernel extension should be built on each.

The kernel extension, the kctrl utility to load and activate it, and the C example control and monitor application are all built with the AIX make facility. A makefile to build all of these components is included in the downloadable zip file.

Precautions

Two important factors should be kept in mind before using this utility:

1) Care should be exercised when developing and testing any kernel extension. It is common for errors to result in system failure. A dedicated test system is ideal, but avoid using systems that are also used for other purposes or by other users. If the system is located in a remote data centre, you should also ensure you have the means for it to be rebooted when required if it does not reboot automatically.

2) Avoid the use of flood pings to test the utility on a shared network as this may severely impact other users.

Building the kernel extension and utilities

First make sure you have met the requirements on your development system as outlined above.

Next, extract the downloadable attachment into a working directory of your choice on the target AIX system.

The download does not include pre-built binaries so as to avoid the risk of running it on incompatible operating system levels. The software will therefore need to be built once it has been extracted. To do this, change directory to the working directory you created above and then run the make commands as follows:

Loading and activating the kernel extension

One of the binaries that will have been built in the preceding step is the kctrl program. This permits the kernel extension to be loaded, activated, and unloaded. To load and activate the kernel extension, proceed as follows:

In this example, shortly after we started pdrop_ccm we started sending IP traffic to the target host. The pdrop_ccm utility will keep running until it is interrupted with a ctrl+c and will print an update line every 10 seconds.

Please note that the first command line parameter, the kernel extension name, must be specified exactly as it was when loaded with kctrl.

The drop percentage argument is optional. If it is not supplied, pdrop_ccm will display the current drop rate only and will not re-set it.

The rest of this article presents more details on how the kernel extension works and how it and the control and monitor applications can be developed and customized for individual use.

How the kernel extension works

The AIX kernel extension is written in the C programming language. The kernel extension exposes its functionality through a set of system calls. The C and Java control and monitor applications use these system calls to drive the kernel extension to drop packets to and from a particular host at a given rate. For the Java environment, a JNI wrapper class provides access to the kernel extension's system calls from Java.

The kernel extension itself consists of:

A set of system calls which control the dropping of packets and the collection of packet statistics

Two functions which hook into the kernel's networking layers. One of these is for inbound traffic and the other for outbound. These functions are called whenever a packet is received or is to be dispatched to control the dropping of packets.

The system calls are accessed by user-level applications to set the remote system for which incoming and outgoing packets should be dropped and the percentage of packets to drop. The system calls also include functions to retrieve counters recording how many packets have been dropped in the inbound and outbound directions, and a function to reset the counters.

For the C environment, these system calls are used directly in the example control and monitor application pdrop_ccm.c, which we used in the Initiating packet dropping and monitoring section above. For the Java environment, the system calls can be accessed through JNI. The JNI wrapper can then be used by a bespoke Java application to control and monitor the dropping of packets. Again, a sample Java application is provided as a downloadable resource with this article. This is called PDrop_jcm.java. Refer to the Java control and monitor section for further details on the example Java utility.

Figure 2 illustrates the different components of the kernel extension and the flow of data through it. The following two sections discuss the flow of packets through the extension, both outbound data to be sent to the target node and inbound data received from it.

Outbound traffic flow

First, consider the case of outbound traffic, where an application under test submits data to be transmitted across the network. Refer to point A in Figure 2. The application submits a set of data to the kernel for sending to the remote node on the network. The TCP subsystem breaks this down into a number of packets which are normally then sent to the appropriate network device driver for dispatch across the network. When using the kernel extension, however, the hook outbound_fw is set in the IP layer, which results in the kernel extension's outbound filter function pdrop_outbound_filter being called for every packet being dispatched. This retrieves the IP header from the packet and matches it against the target node for which packets are to be dropped. If the node does not match, the function ip_output_post_fw is called, which routes the packet to the appropriate interface on the host. If the node does match, then a random decision whether to pass the packet is made, which is weighted according to the percentage of packets to be dropped. If this decision is that the packet should be sent, the ip_output_post_fw function is called and the packet is passed. Otherwise, the mbuf holding the data is freed, the outbound filter returns, and the packet is not sent.

Inbound traffic flow

For inbound traffic, a corresponding operation applies when data is received from the network by the device driver. Refer to point B in Figure 2. The device driver forwards the packet of data to the TCP/IP subsystem. This now calls the inbound filter, which again has been set by the kernel extension by assigning the inbound hook inbound_fw to the pdrop_inbound_filter function. This function makes the decision whether to pass the packet or not. If the packet is not from the target node for which packets are being dropped, it is automatically passed by calling the ipintr_noqueue_post_fw function. If it is from the target node, then again a random decision is made whether to pass the packet according to the required percentage of packets that are to be dropped. If the decision is to pass the packet, the ipintr_noqueue_post_fw function is called and the packet is passed. Otherwise, the mbuf holding the data is freed, the inbound filter returns, and the packet is not passed up to the higher layers.

The arguments to the filter function include an mbuf containing the received data. The mtod macro is used to convert the pointer to the mbuf into a pointer to the received data, which is addressed as an IP header. A counter of the total number of received packets from the target address is maintained.

The function then compares the IP address selected for dropping to the source address specified in the received packet. If these match, then the pdrop_random() function is called, and its result is reduced with the modulus operator to select the required percentage of packets for dropping. If the packet is to be dropped, then an internal counter of dropped packets is incremented. If it is not to be dropped, then the ipintr_noqueue_post_fw function is called to pass the packet to TCP for processing.

The fetch_and_addlp system functions are called to increment the counts of the total number of received and dropped packets. These ensure that the counts are maintained atomically so as to be thread safe.
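The inbound decision described above can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the extension's source: struct sketch_ip_hdr, sketch_inbound_filter, the counter names, and the random_value parameter are hypothetical stand-ins (the extension works on an mbuf and calls pdrop_random() itself), and C11 atomics stand in for the fetch_and_addlp kernel service. The names amDropping, dropip, and dropMod follow the extension's globals, but their types here are assumed.

```c
#include <assert.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the IP header fields the filter inspects. */
struct sketch_ip_hdr {
    uint8_t  protocol;   /* IP protocol number, for example IPPROTO_TCP */
    uint32_t src_addr;   /* source address of the received packet */
};

/* Globals named after those in the extension; the types are assumptions. */
int      amDropping = 0; /* non-zero while dropping is active */
uint32_t dropip     = 0; /* address of the target node */
long     dropMod    = 0; /* modulus that sets the drop rate */

/* Counters updated atomically (user-space stand-in for fetch_and_addlp). */
atomic_long rx_from_target;
atomic_long rx_dropped;

enum sketch_verdict { SKETCH_PASS = 0, SKETCH_DROP = 1 };

enum sketch_verdict sketch_inbound_filter(const struct sketch_ip_hdr *ip,
                                          long random_value)
{
    assert(ip != NULL);

    /* Other hosts and protocols always pass; so does everything while
     * dropping is inactive (counting only while active is an assumption). */
    if (!amDropping || dropMod <= 0 ||
        ip->protocol != IPPROTO_TCP || ip->src_addr != dropip)
        return SKETCH_PASS;

    atomic_fetch_add(&rx_from_target, 1);

    /* Drop when the random value modulo dropMod equals one. */
    if (random_value % dropMod == 1) {
        atomic_fetch_add(&rx_dropped, 1);
        return SKETCH_DROP;  /* the real filter would free the mbuf here */
    }
    return SKETCH_PASS;      /* the real filter would call ipintr_noqueue_post_fw */
}
```

Passing the random value in as a parameter is a choice made only so that the sketch is deterministic and easy to exercise: with dropMod set to 100, a value of 201 is dropped and a value of 200 is passed.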

You can see from this example that it is possible to specify which protocol is to be filtered. In this case, we are only dropping packets for the TCP protocol. For testing, it is a good idea to make it use another protocol, such as Internet Control Message Protocol (ICMP), so that the packet dropper can be used with commands such as ping. To do that, the IPPROTO_TCP constant in the test above would be changed to IPPROTO_ICMP.

Note: amDropping, dropip, and dropMod in the above extract are defined as global variables.

The outbound filter works in a very similar way to the inbound filter. Counters are maintained for the total number of packets destined for the target IP address and for the total number of outbound packets dropped.

The pointer to the mbuf is freed if the packet is to be dropped.

Note: amDropping, dropip, and dropMod in the above extract are defined as global variables.

Randomizing the packets to be dropped

It is important for the kernel extension to select packets at random for dropping. If say, packets were dropped after a set number of packets had been sent, this would not necessarily be a true emulation of what happens on a heavily used network. Further, it is possible that the software under test may behave differently when it misses packets at regular and irregular intervals.

The standard C library rand() call to generate random numbers cannot be used in a kernel extension. This is because such functions are not safe to use in the re-entrant kernel environment. If you attempt to use this function, the system might fail when it is called from the kernel extension.

A simple function to mimic a call to random was therefore used. This generates a long random number based on the following inputs:

The seconds field from the current time

The nano-seconds field from the current time

The number of calls to the random function

The number of processor ticks since system boot

This is the source of the random function that is used in the kernel extension.

This is not of course a true random function but is a good enough mechanism for assuring that packets are dropped in a sufficiently irregular manner. As the calls variable is static, it is incremented atomically with the fetch_and_addlp kernel service.
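As a rough user-space illustration of combining the four inputs listed above, consider the following sketch. It is not the extension's function: sketch_mix and sketch_random are hypothetical names, the multiplier 31 is an arbitrary choice, timespec_get() and clock() stand in for the kernel's time and tick sources, and a C11 atomic stands in for fetch_and_addlp.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <time.h>

/* Deterministic mixing step. Unsigned arithmetic wraps on overflow,
 * which is well defined in C. The multiplier is arbitrary. */
long sketch_mix(unsigned long sec, unsigned long nsec,
                unsigned long calls, unsigned long ticks)
{
    unsigned long x = sec;
    x = x * 31UL + nsec;
    x = x * 31UL + calls;
    x = x * 31UL + ticks;
    /* Clear the sign bit so the result is a non-negative long. */
    return (long)(x & (unsigned long)LONG_MAX);
}

/* Gathers the four inputs from user-space stand-ins and mixes them. */
long sketch_random(void)
{
    static atomic_ulong calls;   /* zero-initialised; counts invocations */
    struct timespec now;
    int base = timespec_get(&now, TIME_UTC);

    assert(base == TIME_UTC);    /* timespec_get returns base on success */
    (void)base;

    unsigned long n = atomic_fetch_add(&calls, 1UL) + 1UL;
    return sketch_mix((unsigned long)now.tv_sec,
                      (unsigned long)now.tv_nsec,
                      n,
                      (unsigned long)clock());
}
```

Keeping the mixing step separate from the gathering of inputs makes the arithmetic easy to check in isolation, while the call counter ensures that two calls within the same clock reading still produce different values.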

System calls provided in the packet drop kernel extension

In this section, you are shown details of the various system calls provided in the kernel extension. These will be of interest if you need to write your own control and monitor applications or customize the kernel extension.

This function returns the target address as a 32-bit unsigned integer.

Setting the rate at which packets are dropped: extern void pdrop_setDropMod(long m);

This function takes a long input value, which represents how often packets should be dropped. The way this is worked out is that the kernel extension generates a random number and then takes a modulus of that random number with the long value. If the result is one, then the packet is dropped, otherwise it is passed.

The call is designed in this way as it is not possible to perform floating point operations within the kernel extension.

Here is a simple way of calling pdrop_setDropMod() from the application level based on a double percentage value:

So, for example, if it was required to drop 0.1% of packets, then d would be 0.1 and the long value passed to pdrop_setDropMod would be 1000.
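The conversion implied by this example (0.1% maps to 1000) can be sketched as follows. The helper names pct_to_drop_mod and drop_mod_to_pct are hypothetical, and rounding to the nearest integer is an assumption rather than something the extension's own code is known to do.

```c
#include <assert.h>

/* Convert a drop percentage into the modulus expected by
 * pdrop_setDropMod(): one packet in every (100 / pct) is dropped. */
long pct_to_drop_mod(double pct)
{
    assert(pct > 0.0 && pct <= 100.0);
    return (long)(100.0 / pct + 0.5);   /* round to the nearest integer */
}

/* Inverse conversion, for displaying the current rate as a percentage. */
double drop_mod_to_pct(long mod)
{
    assert(mod > 0);
    return 100.0 / (double)mod;
}
```

An application would then call pdrop_setDropMod(pct_to_drop_mod(d)). Because the modulus is an integer, not every percentage can be represented exactly: 30% rounds to a modulus of 3, which corresponds to an effective rate of about 33.3%. Note also that if a packet is dropped only when the remainder equals one, a modulus of 1 never matches, so this scheme cannot express 100% loss.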

Retrieving the rate at which packets are dropped: extern long pdrop_getDropMod();

This function returns the long representation of how often packets should be dropped, which is expressed in the same way as in a call to pdrop_setDropMod().

Activate the dropping of packets: void pdrop_startDropping()

When the kernel extension is started, packets are not dropped until the pdrop_startDropping() function is called. You should call this function after setting the target address and the rate at which packets should be dropped.

Stop the dropping of packets: void pdrop_stopDropping()

This function stops the dropping of packets. The functions pdrop_startDropping() and pdrop_stopDropping() can be called as required to enable and disable packet loss.

Query whether packets are being dropped: int pdrop_amDropping()

This function returns 1 if the kernel extension is currently dropping packets and 0 if it is not.
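The order in which a control application might use these calls can be sketched as below. To keep the example compilable without the extension loaded, the five functions are defined here as simple user-space stand-ins with the signatures given above; in a real application they would be resolved against the loaded kernel extension instead. The helper sketch_enable_loss is hypothetical.

```c
#include <assert.h>

/* User-space stand-ins for the extension's system calls. These only
 * record state; they do not touch any network traffic. */
static long mock_drop_mod = 0;
static int  mock_dropping = 0;

void pdrop_setDropMod(long m)  { mock_drop_mod = m; }
long pdrop_getDropMod(void)    { return mock_drop_mod; }
void pdrop_startDropping(void) { mock_dropping = 1; }
void pdrop_stopDropping(void)  { mock_dropping = 0; }
int  pdrop_amDropping(void)    { return mock_dropping; }

/* Hypothetical helper: apply a new drop rate and activate dropping.
 * Returns 1 if the extension reports the expected state afterwards. */
int sketch_enable_loss(long drop_mod)
{
    assert(drop_mod > 1);

    /* Pause dropping while the settings change. */
    if (pdrop_amDropping())
        pdrop_stopDropping();

    /* The target address would be set at this point, using the
     * extension's target-address call. */
    pdrop_setDropMod(drop_mod);
    pdrop_startDropping();

    return pdrop_amDropping() && pdrop_getDropMod() == drop_mod;
}
```

Stopping before reconfiguring is a design choice of this sketch, intended to avoid a window in which a new rate is applied to the previous target.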

Details about exporting the system calls provided in the kernel extension to the application level

Details about loading and activating the kernel extension

Information about the use of system logging to record when the extension is loaded and unloaded

AIX 6.1 and AIX 7.1 build issues

Beginning with AIX 6.1, the AIX operating system simplified its kernel environment by providing only the 64-bit kernel. AIX 6.1 and AIX 7.1 maintain application binary compatibility with previous AIX versions as specified above, but device drivers and kernel extensions that are only 32-bit cannot be built on AIX 6.1 or AIX 7.1.

As this article presents the kernel extension as suitable for AIX 6.1 and AIX 7.1, it has been built in the 64-bit mode.

The build process

A Makefile is included with the compressed file provided with this article. This can be used to build the kernel extension and the application layer programs associated with it. Note that this Makefile has been written for use with the AIX make utility and will need to be modified if you wish to use GNU make.

As noted earlier, the kernel extension is built as a 64-bit binary. The -ffreestanding and -msoft-float options are used to prevent the use of floating point instructions to manipulate certain data structures. This was required on AIX 7.1, but was not necessary on AIX 6.1.

The ld command specifies the entry point into the kernel extension, which in this case is the pdrop_init() function. This function is called when the kernel extension is activated.

The ld command also refers to the files kernex.exp and netinet.exp. These files are provided with the bos.adt.syscalls file set. This file set may not be installed on your test system in which case you will need to ask the system administrator to install it.

Note that it is necessary to make one small modification to the /usr/include/sys/socketvar.h header file for gcc to compile the kernel extension satisfactorily. You should retain a safe copy of the file and locate the following line:

extern struct free_sock_hash_bucket free_sock_hash_table[];

This should be changed to:

extern struct free_sock_hash_bucket * free_sock_hash_table;

The kernel extension makes this #define in the source:

#define _MSGQSUPPORT 1

This is necessary to prevent the fd_select() system call being made, which is not available to the kernel extension. Using this #define changes the fd_select() call to the original select() call, which is available to the kernel extension.

Exporting the provided system calls

The pdrop_syscall.exp file defines the kernel extension system calls that are exported to the application level. The contents of this file are as follows:

If you need to customize the kernel extension to include further system calls, you will need to amend this file. The syscall3264 identifier at the end of each line makes the system calls available to both 32-bit and 64-bit processes. The flag may also be set to syscall32 to support calls from 32-bit processes only or syscall64 for 64-bit processes only. If the flag is not set for the correct target process environment, the process will fail with a segmentation fault when the system call is made. For further details on how to set this identifier, refer to the topic Exporting Kernel Services and System Calls.

Use of kctrl to control loading of the kernel extension

The kernel extension is loaded and unloaded with the kctrl program provided as described earlier. The program should be invoked with the full path name of the kernel extension. It then interactively accepts the following commands:

q – checks whether the kernel extension has been loaded

l – loads the kernel extension

I – initializes the kernel extension

t – terminates the kernel extension

u – unloads the kernel extension

e – exits the utility

Here is an example where kctrl is used to query, load, initialize, terminate, unload, and quit the utility:

In this example, the kctrl executable works with interactive user input. It would be straightforward to modify kctrl to take the command as another argument on the command line and thereby make it easy to incorporate within system startup and shutdown scripts. However, this is not generally recommended given the nature of the utility.

Logging in the kernel extension

The kernel extension uses the syslogd daemon to record when the extension is loaded and unloaded. This can be useful for debugging or auditing purposes. To enable this logging:

Ensure that logging is enabled in /etc/syslog.conf. For example, the following line can be appended to this file:

*.debug /var/log/syslog.out rotate size 100k files 4

Ensure that the log file enabled in /etc/syslog.conf exists.

Refresh the syslogd subsystem (refresh -s syslogd)

Use with other IP protocols

Although the extension is primarily considered for TCP/IP, it would be simple to change it to work with other IP-based network protocols. This was discussed in the How the kernel extension works section.

Control and monitor applications

Two example applications that can use the system calls exposed by the kernel extension are provided. One is for the C environment and the other for the Java environment.

These applications show you how to write your own custom applications to control packet dropping. You can decide to do this if you need to write your own automated test framework to simulate different network conditions.

It should be noted that only the applications that collect statistics directly from the kernel extension can provide meaningful statistics on packets that have been dropped. Operating system provided utilities that report on dropped packets will not include details of packets that have been dropped through the kernel extension. This is because the packets are dropped by the extension before they are passed through the TCP/IP subsystem for dispatch or delivery.

C control and monitor

The test application, pdrop_ccm.c, is provided in the downloads section of this article. This is invoked with the kernel extension name, a target host name, and an optional drop percentage on the command line. The application passes details of the target host and the drop rate to the kernel extension and then reports the inbound and outbound packet counts for the target system every 10 seconds.

The packet drop system calls can be invoked from both 32-bit and 64-bit applications. The makefile provided demonstrates this by building both 32-bit and 64-bit versions of the control and monitor application. These are called pdrop_ccm32 and pdrop_ccm64 respectively.

pdrop_ccm shows the total number of input and output packets, the number of input and output packets that were dropped, and the numbers of dropped packets as a percentage. It takes the name of the kernel extension and the name of the target host as arguments, plus an optional third value specifying the target drop rate. If this third parameter is not specified on the command line, no change to the drop rate is made. Here is an example of the application in use:

Here, we can see that the host to which packets are to be dropped is fred and 1.0% of the packets are to be dropped. Status updates on total and dropped packets are then displayed at 10 second intervals.

The test application calls the pdrop_reset_counters() function to reset the kernel extension counters. After 10 seconds, you can see that network activity to the target system is started and the packet statistics are displayed.

Note that the example controls a single target system only. If you run the pdrop_ccm command again and specify a different target host name, packets to and from the original host will no longer be dropped.

Java control and monitor

A sample Java application, PDrop_jcm.java, is also provided, which demonstrates how to use the packet drop simulator from the Java environment.

PDrop_jcm.java uses JNI to access the C environment for controlling and monitoring the kernel extension. The shared library libpdrop_jni.so provides native methods which can be called from the Java environment. This shared library is built from the source, pdrop_jni.c, as part of the build process described earlier.

The JNI wrapper provides the following native methods at the Java level which map to the functions in the shared library:

When these native methods are invoked, the corresponding function in the shared library will be called and the result will be returned to the Java environment.

To build and use this test Java application, first, make sure that you have set the PATH to the Java SDK environment correctly, for example:

export PATH=/usr/java6/bin:${PATH}

Next, set the LIBPATH environment variable to reference the directory where the kernel extension is located, so that the shared library libpdrop_jni.so can be resolved:

export LIBPATH=/home/jerry/kernext:/usr/lib

Now compile the PDrop_jcm.java application, for example:

javac -d . PDrop_jcm.java

Optionally, if you need to regenerate the JNI header file, run the following command. This should not be necessary unless you have customized the kernel extension and changed or added to the native methods:

javah -d . PDrop_jcm

If you have regenerated the header, you will need to rebuild the shared library using the process described earlier.

The Java application is then run with the path to the kernel extension, target host name, and optional new drop rate as arguments, as in the C example above:

In this example, you can see that when monitoring started there were zero packets sent or received to or from the target host, but there was then network activity that caused packets to be dropped in both directions. In this example, the drop rate had been configured at 1%.

Measuring non-simulated packet loss

It was mentioned earlier that operating system utilities should not be used to monitor packets dropped through the use of the kernel extension. However, these will be required when you need to assess packets that are actually dropped across the network. This section gives some useful hints and tips for using these utilities.

The ping utility displays the packet loss on the ICMP packets being sent across the network. You can see the packet loss statistics in the above example, where it is zero. However, this measurement is based only on the packets transferred to and from the ping utility and does not measure any other packets traversing the network.

On some systems, ping also supports an option to write ICMP packets as fast as possible onto the network and again gives you the loss statistics at the end. This is a so-called flood ping. You should use this with caution as it is likely to impact general network performance while running. It should be run only for a few seconds and only under test conditions.

The netstat utility can also be used to see how many packets have been dropped on a network. Here is an example output from netstat -D on AIX.

On AIX, the netstat -Zs -p tcp command can be used to reset the protocol statistics before running the activity you want to measure.

If packet drops are consistently in excess of 0.1%, then you should raise this with your network administrator.

Retransmitted packets can be seen using the following command:

$ netstat -s -p tcp | grep retrans

Statistics of interest from the output of this command are:

Packets sent

Data packets

Data packets retransmitted

Packets received

Completely duplicate packets

Retransmit timeouts
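Once these counters have been read, a simple check against the 0.1% guideline mentioned earlier might look like the following sketch. Treating "Data packets retransmitted" as a proportion of "Data packets" is an assumption about how to interpret the statistics, and the function names are hypothetical.

```c
#include <assert.h>

/* Retransmitted data packets as a percentage of data packets sent. */
double retransmit_pct(unsigned long long data_packets,
                      unsigned long long retransmitted)
{
    assert(data_packets > 0);
    return 100.0 * (double)retransmitted / (double)data_packets;
}

/* Returns 1 when retransmissions exceed 0.1% of data packets.
 * Integer arithmetic avoids floating point rounding at the boundary. */
int retransmits_exceed_threshold(unsigned long long data_packets,
                                 unsigned long long retransmitted)
{
    if (data_packets == 0)
        return 0;
    return retransmitted * 1000ULL > data_packets;
}
```

For example, 200 retransmissions out of 100000 data packets (0.2%) exceeds the threshold, whereas 50 out of 100000 (0.05%) does not.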

Conclusion

This article provided reference material and instructions to build, use, and customize a simple utility to simulate dropped TCP packets on AIX. Such a utility is invaluable when writing cross-network software, as it lets you model how the software will behave under non-ideal network conditions.

The tool may be adapted as required and can easily be enhanced to support simulation of other network issues that can give rise to performance problems, such as packet corruption, packets arriving out of order, and jitter.