Notes:
You don't actually need to include the ip.src and ip.dst fields, since they're not extracted by the grep command. I include them in case I want to do an ad-hoc grep for an IP address during the analysis process. Another way to do the same thing would be to modify the display filter to look only for certain addresses, e.g.:

Friday, November 1, 2013

I was working on a problem today in which vendor tech support was suggesting that a firewall was subtly modifying TCP data payloads. I couldn't find any suggestion of this in the firewall logs, but seeing as how I've seen that vendor's firewall logs lie egregiously in the past, I wanted to verify it independently.

I took a packet capture from both hosts involved in the conversation and started thinking about how to see if the data sent by the server was the same as the data received by the client. I couldn't just compare the capture files themselves, because elements like timestamps, TTLs, and IP checksums would be different.

After a bunch of fiddling around, I came up with the idea of using tshark to extract the TCP payloads for each stream in the capture file and hash the results. If the hashes matched, the TCP payloads were being transferred unmodified. Here are the shell commands to do this:
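The pipeline itself seems to have been lost in formatting; based on the step-by-step description that follows, it would have looked something like this (a sketch; the exact `-T fields` invocation is an assumption about the original tshark options):

```shell
# Extract each packet's TCP stream index, de-duplicate the indexes,
# strip trailing carriage returns, then follow each stream in raw
# format and hash the concatenated payloads.
tshark -r server.pcap -T fields -e tcp.stream |
    sort -u |
    sed 's/\r//' |
    xargs -i tshark -r server.pcap -q -z follow,tcp,raw,{} |
    md5sum
```

Each stage is explained below. (Newer GNU xargs prefers `-I {}` over the deprecated `-i`, but the behavior is the same.)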

This invokes tshark to read the "server.pcap" file and output the TCP stream indexes of each packet. This is just a long series of integers:
0
0
1
2
1
etc.
The next command, sort -u, produces a logical set of the unique (hence the "-u") stream indexes. In other words, it removes duplicates from the previous list. Not all Unix-like operating systems have the "sort -u" option; if yours is missing it, you can use "| sort | uniq" instead.
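A quick demonstration of the de-duplication step, using a hand-made list of stream indexes:

```shell
# Both pipelines produce the same de-duplicated, sorted output:
# 0, 1, 2 on separate lines
printf '2\n0\n1\n0\n' | sort -u
printf '2\n0\n1\n0\n' | sort | uniq
```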

Next, sed 's/\r//' removes the carriage return from the end of each resulting stream index. If you don't do this, you'll get an error from the next command.
The next one's a bit of a doozy: xargs -i takes each stream index (remember, these are just integers) and executes the tshark -r server.pcap -q -z follow,tcp,raw,{} command once for each stream index, substituting the input stream index for the {} characters.

The tshark -r server.pcap -q -z follow,tcp,raw,{} command itself reads the capture file a second time, running the familiar "Follow TCP Stream in Raw Format" command from Wireshark on the TCP stream index that replaces the {} characters. If you're rusty on Wireshark, "Follow TCP Stream" just dumps the TCP payload data in one of a variety of formats, such as raw or ASCII. If you've never used this option in Wireshark, make sure you try it today! The final command, md5sum, computes an MD5 hash of the preceding output.
To summarize, we've done this: taken a file, extracted all the raw TCP data payloads from its packets (without headers), and hashed the data with MD5. If we do this on two files and the hashes are the same, we know they contain exactly the same TCP data (barring the infinitesimally small probability of an MD5 hash collision).

In my case, both capture files produced the same hash, proving that the firewall was (for once) playing nice.

Wednesday, October 16, 2013

Some recent discussions at work have led me to the surprising realization that lots of people working in IT don't understand that Java and JavaScript are almost completely unrelated to each other. This is actually a fairly important misunderstanding to correct: it leads to wasted troubleshooting efforts, such as downgrading or upgrading Windows Java installations in response to browser JavaScript errors.

I found the title of this blog entry in a StackOverflow post: "Java is to JavaScript as Car is to Carpet". That's pretty much it, in a nutshell. For the record, the only things that Java and JavaScript have in common are:

They are both programming languages.

The word "Java".

Both came out of the web technology explosion of the mid-1990s.

Both are frequently encountered in the context of web browsers.

Java is a compiled programming language that was originally developed with a major goal of allowing similar or identical codebases to run on different platforms without needing to be recompiled. It does this by compiling to "bytecode" rather than platform-specific machine code, which then typically runs inside a so-called "Java Virtual Machine". Java was originally developed and controlled by Sun Microsystems (now Oracle), but it has since been re-licensed under the GNU General Public License. Numerous open-source Java implementations now exist, but the Oracle/Sun version is still the most familiar to the average user.

Java is associated with the web browser experience because of the widespread use of Java "applets" that are embedded in browser windows. Applets are not technically part of the browser; the compiled Java bytecode is downloaded by the browser and executed in a Java Virtual Machine (JVM) as a separate process. Applets are frequently transferred as a compressed "Java archive", or JAR file. Applets downloaded by a browser do not necessarily need to run in a browser window, but the fact that they are frequently embedded there leads to some confusion.

Neither is Java necessarily a client-side technology: many popular server-side applications are written in Java and execute in a server-side JVM. Google's Android platform extends things even further, using Java as the programming language but compiling the bytecode to run on its own virtual machine (Dalvik).

JavaScript, on the other hand, is an interpreted (i.e., non-compiled) programming language that was originally developed to run inside web browsers. It was developed at Netscape and was later adopted by Microsoft and standardized as "ECMAScript". The use of "Java" in the name "JavaScript" was probably an attempt to piggyback on the popularity of Java; the two languages have almost nothing in common from a technical perspective.

JavaScript is most frequently used to control the web browser experience, but there are many projects that use JavaScript completely outside the browser. My first experience with this dates back to the late 1990s, when I used a JavaScript-based commercial tool to automate software deployments to Windows workstations. Today, there are many interesting non-browser-embedded JavaScript platforms, such as Node.js and PhantomJS.

Friday, August 2, 2013

The variety of terms used to describe network flow export technologies and components can be pretty confusing. Just last year I wrote a post on web usage tracking and NetFlow that is already a bit obsolete, so here's an attempt to explain some of the newer terms and capabilities in use today.

NetFlow Version 5
NetFlow v5 is sort of the least common denominator in flow technologies. Almost all vendors and devices that support a flow export technology will do NetFlow v5. Because it's only capable of exporting information about packet fields up to layer 4, however, it's not flexible enough to use for analytics that require information about the application layer. NetFlow v5 tracks only the following data:

Source interface

Source and destination IP address

Source and destination port

Layer 4 protocol

TCP flags

Type of Service

Egress interface

Packet count

Byte count

BGP origin AS

BGP peer AS

IP next hop

Source netmask

Destination netmask

NetFlow Version 9
NetFlow v9 was Cisco's first attempt at defining an extensible flow export format, defined in RFC 3954 back in 2004. It provides a flexible format for building customizable flow export records that contain a wide variety of information types. Many of the goals for flexible flow export were defined in RFC 3917:

Usage-based accounting

Traffic profiling

Traffic engineering

Attack/Intrusion Detection

QoS monitoring

The RFC defines 79 field types that may be exported in NetFlow v9 packets, and directs the reader to the Cisco website for further field types. The latest document I could find there defines 104 field types, several of which are reserved for vendor proprietary use and some of which are reserved for Cisco use.

IPFIX
IPFIX is the IETF standard for extensible flow export. The basic protocol is specified in RFC 5101, but details are included in many other RFCs (Wikipedia has a partial list). IPFIX is based directly on NetFlow v9 and is generally interoperable, but since it's an open standard it is extensible without Cisco involvement. Hundreds of field types are defined in the IANA IPFIX documentation.

IPFIX is being used by various vendors (Plixer, Lancope, and nProbe/nTop come to mind) to export HTTP header data, making it capable of being used as a web usage tracker or web forensics tool with the appropriate collector/analyzer software.

Flexible NetFlow
As far as I can tell, Flexible NetFlow is a marketing term used by Cisco to encompass everything about their approach to configuring and implementing NetFlow v9 and IPFIX.

NSEL (NetFlow Security Event Logging)
NSEL is a proprietary extension of NetFlow v9 used by Cisco's ASA firewalls to export firewall log data. It's not clear to me why Cisco didn't use IPFIX for this purpose.

Cisco AVC (Application Visibility and Control)
AVC is another Cisco marketing term that encompasses a variety of technologies surrounding the DPI and application-based routing capabilities in its routers, such as IPFIX, NetFlow v9, NBAR, PfR, ART (Application Response Time), and more.

Other Vendors
As mentioned above, most network technology vendors support NetFlow v5 and/or v9. IPFIX support is now becoming very common. Some vendors use proprietary extensions of NetFlow v9; Riverbed's CascadeFlow is one example of this.

In a followup post, I'll take a look at some tools that produce flow data without using export technologies.

I have no idea what the port numbers in columns six and seven are. Fortunately, if the IOS device has the Tcl or Bash shells available, we can quickly convert them from hex to decimal.

Method 1: Tcl Shell

Most routers have the Tcl shell available:

Router#tclsh
Router(tcl)#puts [expr 0xda7]
3495

Router(tcl)#puts [expr 0x6d61]
28001

You could write a callable Tcl script to make this permanently available from normal EXEC mode too.

Method 2: Bash Shell

The Bash shell came out in one of the early IOS 15.0 versions, so you may or may not have it available. You need to explicitly enable it by entering "shell processing full" in global configuration mode.
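I don't have the IOS prompt output handy, but assuming the IOS Bash shell behaves like standard Bash, arithmetic expansion does the same conversion as the Tcl example above:

```shell
# Bash treats a 0x prefix as hexadecimal inside $(( ))
echo $(( 0xda7 ))       # 3495
printf '%d\n' 0x6d61    # 28001
```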

Friday, April 12, 2013

I wanted a way to do some controlled tests of WAN acceleration products, using a production network. You can buy or rent commercial WAN emulators, but for my purposes it seemed like an improvised solution would suffice. I had a couple of Cisco 2800 routers, a switch, and an ESXi box in my lab that I could press into service, so I built a test network that looks like this:

R1 acts like the WAN router at a branch site. It has a QoS policy with a "shape average" statement on its "WAN" interface to change the bandwidth to whatever we want to test.

R2 simply NATs the test traffic onto an IP address in the production network, since I didn't feel like configuring a new production subnet just for the test.

The ESXi box is where the fun part lives: I created two vSwitches and connected one physical NIC to each. I then spun up a simple Ubuntu 12.04 VM with eth0 and eth1 connected to each of the two vSwitches, giving me a separate network connected to each Cisco router. I then enabled routing on the Linux VM and created the appropriate static routes to enable the test and production networks to communicate. Finally, I used the "netem" WAN emulator built into the Linux kernel to inject delay, jitter, and packet loss into the network. Voila -- a network de-optimizer!

For testing the WAN accelerators, we'll just install one in the test VLAN and one between the Linux router and R2.

Here are the basic steps required to set up the Linux de-optimizer, in case you want to try to build your own:

1) Basic Ubuntu 12.04 VM. I used 4GB RAM and 24GB disk, but you could get away with less.

2) Install the SSH server (sudo apt-get install openssh-server) so you can still get to the VM from the test network when you break the rest of the test environment. Do this before you break something... ask me how I know. Don't forget to enable SSH on your lab routers too...

3) Turn off Network Manager so it doesn't mess with your static addressing and routing config: edit the file /etc/NetworkManager/NetworkManager.conf and change "managed=true" to "managed=false", so that NetworkManager leaves interfaces configured in /etc/network/interfaces alone.
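To round out the setup, here's roughly what the netem configuration itself looks like. The interface name and impairment values here are examples, not the ones from my lab; these commands need root:

```shell
# Enable IPv4 forwarding so the VM routes between eth0 and eth1
sysctl -w net.ipv4.ip_forward=1

# Impair traffic leaving eth1: 80 ms delay with 10 ms of jitter,
# plus 0.5% packet loss
tc qdisc add dev eth1 root netem delay 80ms 10ms loss 0.5%

# Verify, and later remove, the impairment
tc qdisc show dev eth1
tc qdisc del dev eth1 root
```

Note that netem applied to eth1's root qdisc only impairs egress traffic on that interface; applying it on both interfaces impairs both directions.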

Send those prefixes one-by-one to the "whois" command, extract the "netname" field, and remove duplicates again:

| xargs -i whois {} | grep netname | sort -u
Note that this takes a while to run because of the time it takes the Whois server to respond.

The prefix-list that I used to get the output from the first step is as follows:

ip prefix-list GT24 permit 0.0.0.0/0 ge 25

Note that I used this as an argument to "show ip bgp", not as part of the config!

Now, this obviously isn't entirely accurate, because it only shows the providers that are leaking long prefixes that aren't being filtered by any of my providers, but it's interesting. I also searched based only on the parent /16, so there could be lower-level providers that I'm missing.

Some of them are clearly the same provider tagged with different whois records (e.g., "TBROAD" and "TBROAD-KR").

netname: AFRINIC-NET-TRANSFERRED-200909

netname: ASI

netname: ASIANET

netname: AquaRaySARL-2

netname: BIDCMain

(about 100 more)

etc. Run it yourself to get the full list from your router's perspective!

Saturday, March 9, 2013

The Python module "SimpleHTTPServer" is traditionally a quick and dirty way to test web code without the overhead of installing a full webserver. You can also use it as a quick way to transfer files between systems, if you have Python available:
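The command itself seems to have been lost in formatting; a minimal sketch, assuming the module's standard invocation with the port number mentioned below:

```shell
# Serve the current directory over HTTP on port 8080 (Python 2)
python -m SimpleHTTPServer 8080

# Python 3 equivalent (the module was renamed to http.server)
python3 -m http.server 8080
```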

This causes Python to make all the files in the current directory available over HTTP on port 8080. From the client system, you can then use a browser, curl, or wget to transfer the file.

No, it's not secure, and yes, it may violate data exfiltration policies (and as an aside, Bro detects this by default). But I've used it fairly often to move files between Windows and Mac or Linux in situations where SCP isn't available and I don't feel like setting up a fileshare.

I've also used this in conjunction with the "time" command to test the effects of latency on various network protocols, and as an improvised way to test WAN optimization software.

Sunday, March 3, 2013

I've been largely out of touch with the IT certification scene lately, but I'm sure that people are still complaining incessantly about the fact that they need to memorize "trivia" in order to pass certification tests. Back when I was teaching Cisco classes full-time, my certification-oriented students were particularly bitter about this. Of course, this is a legitimate debate and the definition of "trivia" varies from person to person.

When I saw this article about CloudFlare's world-wide router meltdown, however, I immediately felt a bit smug about all those hours spent learning and teaching about packet-level trivia. If you don't want to read the article, here's the tl;dr:

Their automated DDoS detection tool detected an attack against a customer using packets sized in the 99,000-byte range.

Their ops staff pushed rules to their routers to drop those packets.

Their routers crashed and burned.

So at this point you should be saying what some of the commenters did: huh? IP packets have a maximum size of 65,535 bytes, because the length field is 16 bits long.
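For anyone who wants to double-check the arithmetic: a 16-bit Total Length field tops out at 2^16 - 1.

```shell
# Largest value representable in the 16-bit IP Total Length field
echo $(( (1 << 16) - 1 ))   # 65535
```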

In order for this meltdown to happen, they had to have a compounded series of errors:

the attack detection tool was coded to allow detection of packet sizes that can't actually occur: no bounds checking.

the ops staff didn't retain the "trivia" that they learned in Networking 101, and thus couldn't see the problem with the output generated by the detection tool.

the router OS didn't do input validation, and blew up when attempting to configure itself to do something crazy.

There's lots of blame to go around here, and my intent isn't to add to that; rather, my point is to tell everybody who dutifully memorized and retained stuff like the maximum IP packet size to feel good about yourself for a few minutes! And next time you write networking code: do input validation and bounds checking.

Sunday, January 27, 2013

Bro has four main container types, which I'm going to cover in somewhat nontraditional order:

tables

sets

vectors

records

Tables
A table is a collection of indexed key-value pairs: the same idea is referred to as a dictionary, associative array, or hash table in other languages. Here's a simple example that pairs letters with their place in the alphabet:

Because we printed only the value associated with the key, we never see the key in the output.

Sets
It's common in programming to need a data type that allows one to make a collection of distinct objects, without containing multiple instances of identical objects. This is the mathematical notion of a set. Bro has a native set type. Consider an example where you have a large list of IP addresses that you got from some other source: an intel feed, a firewall log, a web server log, etc. You want to get a unique set of addresses that have appeared, but you don't care how many times they appeared. This is the perfect use for a set:

Note that 2.2.2.2 appears only once in the set, despite having been included three times as different variables.

Vectors
A vector is Bro's version of a one-dimensional array. Having spent most of my recent programming time in Python, I was surprised to find that Bro vectors work like hash tables for iteration: when you loop over a vector, you get the index into the vector rather than the object itself. Here's an example:

Bro's vectors work pretty much exactly like a table with "counts" as keys (a count is yet another native Bro type that we haven't discussed yet; it's the same as an integer, except that it's always unsigned, so it can't be negative). In fact, some of the earlier Bro documentation doesn't even show vectors as valid types, so I wonder if they are actually implemented internally as tables.

Records
The last Bro container type is the record, which is sort of the meat and potatoes of Bro's wonderful logging tools. Records are discussed in detail in most of the other beginner-Bro material out there, so I'm not going to cover them here.

This will probably be the last "Baby Bro" post that I do with just the raw language features demonstrated inside the bro_init() event. Any further Bro posts will probably be using Bro in its intended context. Hope this was helpful!

Thursday, January 24, 2013

Recently I had to set up a transparent proxy with the Cisco Ironport Web Security Appliance (WSA) using WCCP on a Catalyst 6500 with a Sup720, with IP spoofing and web cache ACLs enabled. Like with many technologies, this turned out to be pretty simple but I couldn't find it documented all in one place. Perfect blog fodder!

The network topology looked like this (simplified, but not by much):

Normally when you set up a transparent proxy with WCCP, the IP address of the proxy server is used as the source of the HTTP requests. The problem in this topology is that I wanted the real source address of the client to appear in the firewall logs. The IP spoofing feature on the WSA allows this to happen, but it requires configuring bidirectional WCCP redirection on the Cat6k. If this had been a Cisco ASA firewall, we could have enabled WCCP there and saved some trouble, but in this case the network was using a firewall from another vendor that didn't support WCCP.

One important thing to realize about WCCP on the Catalyst 6500 with the Sup720 is that WCCP egress redirection is done with software switching rather than in hardware, so if you find yourself wanting to use the command "ip wccp redirect out", you're virtually guaranteeing that you're going to redline the CPU on your supervisor. Thus, we want to do only ingress redirection.

! define the web cache, referencing the ACL above
! this WCCP service handles only standard HTTP traffic
ip wccp web-cache redirect-list ACL_WCCP
! a second WCCP service is used for reverse-path redirection, which
! is required for IP spoofing to work
ip wccp 90

interface Vlan 10
 description to client networks
 ip wccp web-cache redirect in

interface Vlan 30
 description to firewall cluster
 ! WCCP service group 90 is applied inbound on the return path
 ! from the Internet so that IP spoofing will work
 ip wccp 90 redirect in

If you have multiple layer 3 interfaces going to client networks, you need to enable WCCP on all of them if reverse-path redirection is enabled. If we weren't using reverse redirection (i.e., if we weren't using IP spoofing), this wouldn't be the case: we could simply leave WCCP disabled on interfaces whose traffic shouldn't be proxied. With reverse redirection, though, the return traffic is always sent to the proxy server; if the proxy server sees the return traffic but not the egress traffic, it gets dropped. If you need to use IP spoofing and still have certain types of web traffic bypass the proxy, you would need to do this with ACLs applied to the WCCP service, rather than simply by leaving WCCP disabled on certain interfaces.

On the WSA, you create two WCCPv2 services under the Network-->Transparent Redirection menu, one with service group 0 (the default), and one with service group 90 (matching the IOS configuration above). On group 90, you enable "redirect based on source port". For both groups, you enter the IP address of the SVI as the upstream router address (the IP address of VLAN 20 in this case). The WCCP service on the WSA then registers automatically with the Sup720.

That's it: now you have a transparent proxy, with IP spoofing so that your firewall logs show accurate client IP addresses. Handling HTTPS traffic or HTTP traffic on non-standard ports is beyond the scope of this post.

Saturday, January 19, 2013

Bro has native types for addresses and networks, making it much easier to work with network data. Today's Baby Bro script shows global variable definition, the use of the address and subnet types, and a simple conditional:

Bro also has several interesting built-in functions for working with network data that we'll explore in upcoming posts. For now, we'll take a look at the mask_addr function, which allows you to use Bro as an improvised subnet calculator. You can run a Bro micro-script from the CLI with the -e option, just like the -e flag in Perl or the -c flag in Python:

Tuesday, January 15, 2013

[Note: Blogger seems to have done something nasty to my new blog template, so it's back to the old one at least temporarily]

Here's my first "Baby Bro" post. Before getting into using Bro scripting for its intended use of network traffic analysis, I wanted to figure out how to accomplish basic tasks common to most programming languages:

Functions

Common types and variable definitions

Loops

Conditionals

Iteration of container types

Basic string and arithmetic operations

This is the kind of stuff that many programmers can figure out instantly by looking at a language reference sheet, but I think it helps the rest of us to have explicit examples.

I'm not sure if I'll get through all of them in this series, but here's a start: a main dish of functions, with a side of string formatting and concatenation.

Friday, January 11, 2013

One of the reasons I don't blog that much is that I generally assume that everything worth blogging has already been done, and that everyone reading is probably smarter than me and doesn't need me to explain things. I'm going to pretend that those are non-issues and try to blog more, no matter how basic the topic.

I just got back from FloCon 2013, which was quite interesting. The highlight for me was some informal after-hours knowledge-dumps from Seth Hall (@remor) and Liam Randall (@hectaman) on the subject of Bro (@Bro_IDS).

Before these mini-lessons, I didn't really have a good idea of how to get started with Bro scripting. Now I do!

The stuff we covered was actually a lot more complex than Hello World, but in the spirit of beginning coders everywhere, here's how you do "Hello World" in Bro (and a little more):