October 2014

October 15, 2013

Are You Being Naughty?

I have an AIX server—on the Internet—and I have been
naughty! Shame on me!

My intent is that this server is “open” just enough so "random"
activity looking for servers to breach does not take it down. I say
"random" because I doubt my ISP would be happy if I were the target of
directed or sustained attacks. So, I try not to be too inviting.

So, how have I been naughty? By being complacent about an
excessive amount of processor usage by the named9 process. What was my bad
behavior? Noticing, but choosing to ignore this “cry for help.” Naughty me.

What should I have done?

In an open environment, by definition, an eye-catching activity
should activate my attention (curiosity) sufficiently that I look for a root cause.
Finally, after weeks of complacency a different unusual activity prodded me
into action.

At the start of a system maintenance cycle I changed the
default route from the external interface to the internal interface,
effectively removing its active or open connection to the Internet. However,
after this change I continued to see a high amount of incoming packets on my Internet-facing
interface. My expectation was that the interface would quickly go idle, but it
did not. Finally, alarm bells went off in my head.

How could this be happening?!

The interface facing the “open” side (the NAT router) was still
active, so packets were still arriving. What was lacking from my side
was a route to respond to the incoming packets. Complacency aside—awake finally—I
started tcpdump and saw numerous
incoming UDP packets directed at my named service.

I realized I should have suspected something long ago as I
had wondered why the named process was frequently the largest consumer of CPU
cycles. Now I had a possible explanation for this unexpected load. Fortunately,
I did not have to pay dearly for my lack of attention: The result was neither
really extreme nor damaging to what was supposed to be happening on the server.
However, I realize this could have been traumatic.

What was the root cause of failure here?

I made an assumption and accepted the unusual behavior
rather than investigating it. Another way to describe this: I was being lazy.
This is my wake-up call—“Do as I preach!” (FYI: I had accepted the high CPU
usage as a result of queries being made by tests I was running on other
servers. Now, in hindsight, I realize I was being naïve.)

Taking Action

Because this is a well-defined situation (protocol UDP, port
number 53), I used smit to create an ipsec filter rule that would "shun any
host" that tried to query my server using UDP. In other words, dynamically
block any host (IP address) that sends a packet to port 53 using UDP incoming
on the en0 interface (with address XXX.168.2.1). For now, I’m leaving the TCP port
open. I may need to add a rule to block TCP requests that start “outside” while
permitting TCP queries originating on my server to continue.
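To make the "shun any host" idea concrete, here is a minimal Python sketch of the logic as I understand it. It is a toy model, not how AIX implements it: the class name ShunHostFilter, the handle method, and the example source addresses (taken from the documentation ranges) are all made up for illustration, and it leaves out any expiry of the dynamic rules.

```python
from dataclasses import dataclass, field


@dataclass
class ShunHostFilter:
    """Toy model of a shun-host rule (hypothetical; not the AIX implementation)."""
    local_addr: str          # destination address the trigger rule watches
    port: int = 53           # destination port that triggers the shun
    protocol: str = "udp"    # protocol that triggers the shun
    denied: set = field(default_factory=set)  # dynamically blocked source addresses

    def handle(self, src: str, dst: str, protocol: str, dst_port: int) -> str:
        # A host that was shunned earlier is dropped, whatever it sends now.
        if src in self.denied:
            return "deny"
        # Trigger: an incoming packet that matches the static rule.
        if (dst, protocol, dst_port) == (self.local_addr, self.protocol, self.port):
            self.denied.add(src)
            return "shun"
        # Everything else (for example TCP from an unknown host) is left alone.
        return "permit"


flt = ShunHostFilter(local_addr="192.168.2.1")
print(flt.handle("203.0.113.7", "192.168.2.1", "udp", 53))   # shun
print(flt.handle("203.0.113.7", "192.168.2.1", "tcp", 53))   # deny
print(flt.handle("198.51.100.9", "192.168.2.1", "tcp", 53))  # permit
```

The point of the sketch is that one static rule acts as a tripwire, and every host that trips it gets its own dynamically generated deny rule.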

Flag(s)                                 Meaning

-m 0.0.0.0                              Any source address, because the match pattern is 0 bits

-d '192.168.2.1' -M '255.255.255.255'   Only this destination address, because -M 255.255.255.255 is a 32-bit match

-c udp                                  Protocol udp

-o 'any'                                Any source port number

-O 'eq' -P '53'                         Destination port number equals 53

-r 'L'                                  Packets destined for or starting from the local host

-w 'I'                                  Direction: incoming

-l 'N'                                  No logging

-t 0                                    Not a tunnel (t0 is not a tunnel)

-i 'en0'                                Interface en0
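The address-and-mask pairs above are easier to see with a small worked example. The Python sketch below assumes the usual bitwise rule (an address matches when it equals the pattern on every bit set in the mask). The function name matches is made up, and the source pattern 0.0.0.0 is an assumption, since only the source mask appears in the list above.

```python
import ipaddress


def matches(addr: str, pattern: str, mask: str) -> bool:
    """Return True if addr equals pattern on every bit that is set in mask."""
    a = int(ipaddress.IPv4Address(addr))
    p = int(ipaddress.IPv4Address(pattern))
    m = int(ipaddress.IPv4Address(mask))
    return (a & m) == (p & m)


# Source side: mask 0.0.0.0 compares zero bits, so any address matches.
print(matches("203.0.113.7", "0.0.0.0", "0.0.0.0"))              # True
# Destination side: mask 255.255.255.255 compares all 32 bits.
print(matches("192.168.2.1", "192.168.2.1", "255.255.255.255"))  # True
print(matches("192.168.2.2", "192.168.2.1", "255.255.255.255"))  # False
```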

Immediate Results

About 15 minutes after first activating the packet filter, I
had 60 dynamically generated deny hosts rules similar to below. When I checked
a day later, the count was down to about 30. When I started writing this blog
post (after about three days running) the count was down to one.

Since I had not saved any data, I was afraid I would need
some time to get some new data for my examples. I need not have feared. The
moment I turned ipsec off, activity jumped as if someone was watching. One
positive denial (a response) was enough to wake up others. Result: immediately
I had the data I needed to write this blog post. Remember, there was only one
IP address being actively blocked but within seconds there were three systems
making DNS requests (see results). And as I finish the post, the number of
systems continues to grow even though no more replies are being sent; it is currently
holding steady between five and nine dynamic deny rules. My guess is that these
systems share information with one another, and once one reports success more start trying.

I wish I could give an accurate description of what this
output means. Unfortunately, I’m not certain. From what I could find I became concerned that
these queries are meant to poison my DNS process. What jumps out is the
repeated queries on the domain fkfkfkfk.com. What I am not showing (for brevity)
is the sudden jump in activity from several locations within seconds of
deactivating the filter!
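For anyone who wants to look inside such packets rather than rely on the tcpdump summary, the question section of a DNS message is simple enough to decode by hand. The Python sketch below follows the RFC 1035 wire format. The helper names build_query and parse_question are made up for illustration, the query type (1, an A record) is an assumption and not something I observed, and extracting the UDP payload from a capture file is not shown.

```python
import struct


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query (RFC 1035 wire format) to use as sample input."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in name.split(".")) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN


def parse_question(payload: bytes):
    """Return (is_response, qname, qtype) for the first question in a DNS message."""
    _txid, flags, qdcount, _an, _ns, _ar = struct.unpack("!HHHHHH", payload[:12])
    if qdcount < 1:
        raise ValueError("no question section")
    labels, pos = [], 12
    while payload[pos] != 0:
        length = payload[pos]
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        labels.append(payload[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, _qclass = struct.unpack("!HH", payload[pos + 1:pos + 5])
    # The top bit of the flags word is QR: 0 for a query, 1 for a response.
    return bool(flags & 0x8000), ".".join(labels), qtype


print(parse_question(build_query("fkfkfkfk.com")))  # (False, 'fkfkfkfk.com', 1)
```

Run against real payloads, this would at least show which names are being asked for and whether a packet is a query or a response, which is a starting point for working out what the senders are after.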

Monitoring Continuing Activity

The command below gives me an indication of how busy the outside is by counting the number of
deny filters the shun-host action is creating dynamically. Notice that the -a
flag is needed for the lsfilt command to show the dynamic, active filters:

Although my prompt may not look like it, I
have been running all of these commands with an euid of 0 (aka superuser): tcpdump
-U -w ..., genfilt ..., and (not shown here)
mkfilt -v4 -u to activate the changes to the rules in the ODM. If you
want to clear all active rules, the simplest way is to deactivate then
reactivate the ipsec_v4 device using these two commands:

rmdev -l ipsec_v4

mkdev -l ipsec_v4

If you know enough to explain the meaning of the tcpdump
example above, please share that knowledge in a comment, and also, maybe
suggest a book or a site for people like me to learn more!

Comments

Being an AIX expert for years, I have to admit that I also can still learn a lot on the topic of networking and traffic flow.

Having a 10Mb ADSL line at home, with a whole family sucking the line dry 24/7 until it gets shaped, leaves me unhappy when I occasionally want some bandwidth.

QOS comes to mind, yes, but something tells me that there is a deeper and better level to dig into. It always starts with 'knowledge/information' and I am clueless about where to start my focus.

Now the ADSL router's QOS configuration is to laugh about. This makes me think that I should put a server down with 2 NICs and build my own QOS router, so that I can choose who gets what and when. This will empower me also to look at and manage the 'baddies', whether it be an attacker, a knocker, a sniffer, a spoofer or anything else.

Being a "servant" in the industry I will never be able to afford an AIX system at home, so will look at a Linux flavor to work on.

How would one take charge of such kind of control?
...and looking at "entstat -d entX" on the AIX systems in our data centre, the extremely high number of "Receive Interrupts" tells me that there is a lot of "shouting" on the network with nobody owning this, since all the systems have 0 Transmit Interrupts.

Lots to respond to! Thanks. And, I never really worry about "on/off" topic - that is what editors are for :).

First, on never being able to own an AIX box: the first AIX box I bought was actually 4 of them, for 100 guilders (45 euros), and the stats I'll show are from the last one still running.

You mention entstat output. As you say, no transmit interrupts - which just means the drivers are (no longer) interrupt driven for xmit. Receive is by definition an interrupt because it is outside of any control/timing of the host/server. Something knocks on the port - system is diverted/interrupted.

QOS and AIX - it is in there - somewhere, but I have never used it so I cannot really help you there.

re: your ADSL modem and weak QOS controls - weak is 'weak', but if it is enough to get done what you need/want - use it first. I have pretended my IP address is the "phone" or the "TV" that needs priority, and while it does not give me everything, it is noticeably more.

However, if you are looking for an excuse to build your own router - my experience of over 10 years ago - hard to beat the throughput/performance of an embedded system (aka ADSL router with NAT, IPv6 support, etc. etc.).

My secret is that I have two providers - one with a better ADSL router (that I 'own') and one that wants to 'own it' so I cannot really change anything. That second one has 'required' that I do more security at the host level because I cannot do it at the router.

Security should be like an onion - many layers.

Anyway, I hope I responded (do not dare say answered) to most of your comments.

IBM Systems Magazine is a trademark of International Business Machines Corporation. The editorial content of IBM Systems Magazine is placed on this website by MSP TechMedia under license from International Business Machines Corporation.