Code Read

Friday, November 18, 2011

This was a small project I did to help me teach my brother-in-law to play the piano. View source for the code. The possibly interesting parts: indexing an anonymous associative array, generating a random integer, string concatenation with integers, and halfway-done code factoring.

Monday, August 1, 2011

Drag this link: down? to your bookmarks toolbar for a button that checks downforeveryoneorjustme.com to see if the website you just failed to load is actually down, or if you just have a problem getting to it.

There are some interesting things in these few lines of code that are worth pointing out. That $() business is jQuery, a really great JavaScript library for writing compact, powerful, cross-browser JavaScript. I can use it in this bookmarklet because Twitter already uses it, so the library is already loaded. To (perhaps dangerously) oversimplify, $() lets you "select" page elements by class ($('.last-new-tweet')), reference ($(window)), id, and others, giving you a slew of useful functions (offset(), height(), etc.) to query and manipulate them.

The other fascinating thing here is the immediate calling of an anonymous function. The basic syntax of this construct is:

(function (){statement;})()

In essence, a function is created, called, and discarded, all in one statement. This is useful when you specifically do not want a return value, or when you need to fit multiple statements into one statement.

So this would run every night at 3am, performing a verbose TCP SYN scan of my network, showing only open ports, and creating output files in all 3 formats (Normal, XML, and Greppable) in the /root/nmap/ directory. Just getting this far presented some challenges, since I was unfamiliar with some aspects of the crontab file format:

Commands run by cron have their environment stripped down for security reasons. Specifically, the PATH variable is set to /usr/bin:/bin, which is pretty restrictive. Since cron just logs that it ran the command, and not the output of the command, I was very confused as to why the logs showed it being run, but no output was generated.

Unescaped percent signs (%) are interpreted as newlines by cron. The command ends at the first one, and everything after it is passed to the command on STDIN, similar to a here-doc in shell programming. To pass the time format specifiers to Nmap, I needed to escape the percent signs with backslashes.
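The percent-sign rule is easier to see in a small model. This is a simplified Python sketch of the behavior described in the crontab man page, not cron's actual implementation. The function name split_cron_command is made up for illustration, and the command string is just an example.

```python
def split_cron_command(field):
    """Split a crontab command field into (command, stdin_text).

    Simplified model: a backslash-escaped percent sign becomes a literal
    '%', the first unescaped '%' ends the command, and each later
    unescaped '%' becomes a newline in the text fed to standard input.
    """
    parts = [[]]  # parts[0] is the command, the rest are stdin lines
    i = 0
    while i < len(field):
        if field[i] == "\\" and field[i + 1:i + 2] == "%":
            parts[-1].append("%")  # escaped: keep a literal percent sign
            i += 2
        elif field[i] == "%":
            parts.append([])  # unescaped: start a new line
            i += 1
        else:
            parts[-1].append(field[i])
            i += 1
    lines = ["".join(p) for p in parts]
    return lines[0], "\n".join(lines[1:])


# Unescaped: the command is cut off at the first '%'.
print(split_cron_command("nmap -oX lan-%y%m%d 192.168.1.0/24"))
# Escaped: the whole command survives, with literal percent signs.
print(split_cron_command(r"nmap -oX lan-\%y\%m\%d 192.168.1.0/24"))
```

In the unescaped case, Nmap would never see the target at all, because everything after "lan-" ends up on standard input.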

So this was pretty good, but it left a lot to be desired. To get an idea of what had changed, I needed to manually run an Ndiff on the last two scans. Also, I wasn't taking advantage of Nmap's advanced version detection capabilities. So I decided to automate the diffing process and do a follow-up in-depth scan of new services I detected.

To schedule a complicated job like this, I needed to move the logic out of the crontab and into a shell script. I broke the task down into 3 basic steps:

1. Scan the network

2. Perform a diff

3. Scan new stuff for version information

In order to make it worthwhile to scan things twice, I wanted my first scan to be fast. I decided early to ignore UDP ports, since scanning firewalled hosts for UDP can take hours. I also decided to use a more aggressive timing template. Nmap runs at T3 by default, but since all of my targets are just one hop away, I can easily bump that up to T4. I don't consider T5 to be worth the possible loss in accuracy, but for such a small network, it could have been useful. Finally, since I will only be looking at differences, I don't need all the extra output files, just the XML. Here's the command to do all that:

nmap -v --open -T4 -oX lan-%y%m%d 192.168.1.0/24
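Because the %y%m%d suffix puts the year before the month before the day, the output files sort chronologically by name (within the same century, at least). That makes it easy to pick out the two most recent scans for the diff. Here is a minimal Python sketch, assuming all the scan files share the lan- prefix. The helper names scan_name and latest_two_scans are hypothetical.

```python
import datetime


def scan_name(day):
    """Build the name that lan-%y%m%d expands to for a given date."""
    return day.strftime("lan-%y%m%d")


def latest_two_scans(names):
    """Return the (older, newer) pair of the two most recent scan names."""
    ordered = sorted(n for n in names if n.startswith("lan-"))
    if len(ordered) < 2:
        raise ValueError("need at least two scans to diff")
    return ordered[-2], ordered[-1]


# Three nightly scans, deliberately listed out of order.
names = [scan_name(datetime.date(2011, 8, d)) for d in (3, 1, 2)]
print(latest_two_scans(names))  # ('lan-110802', 'lan-110803')
```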

Next, I needed to do a diff. Nmap ships with a great tool called Ndiff, which is written in Python. It takes two Nmap XML files and generates a text or XML diff. This was a tricky decision: I wanted to be able to review the diff every morning, so text output would be best for that. But I also wanted to have my script scan all the new hosts and services, which meant parsing the output. Luckily, I have done some development work on Ndiff, so I knew that it would have the whole diff in a data structure before printing it. I just needed to run through it and pull out the new stuff.

Ndiff, like any well-written Python program, consists of a bunch of class and function definitions, and a conditional statement to run the main function if the program is run as a program, not imported as a module. This ensures there are no side-effects if it IS imported, which I planned on doing. I started by making a symlink to the ndiff program in my working directory:

ln -s /usr/local/bin/ndiff ndiff.py

I tried using the PYTHONPATH environment variable set to /usr/local/bin, but Ndiff is not installed with a .py extension, so the interpreter complained that it couldn't find the ndiff module. The symlink ends up being the way to go here.
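For comparison, Python 3 can load a source file that lacks a .py extension without a symlink. The following sketch shows that alternative using importlib. It is not the approach described above, and it loads a throwaway stand-in script rather than the real ndiff. The names load_script and demo_tool are made up for the example.

```python
import importlib.util
import os
import tempfile
from importlib.machinery import SourceFileLoader


def load_script(module_name, path):
    """Import a Python source file as a module, whatever its extension."""
    loader = SourceFileLoader(module_name, path)
    spec = importlib.util.spec_from_loader(module_name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


# Stand-in for an installed script with no .py extension.
source = (
    "def main():\n"
    "    return 'ran main'\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    main()\n"
)
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "demo_tool")
    with open(script, "w") as handle:
        handle.write(source)
    demo = load_script("demo_tool", script)

print(demo.__name__)  # demo_tool, so the __main__ guard did not fire
print(demo.main())    # ran main
```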

Not a lot of functionality yet. I wanted a similar invocation to the ndiff program itself, so I started by copying the main function from ndiff and stripping out the options I didn't need: help, text, and xml.

So at this point, the main function doesn't produce any output. It just creates a ScanDiff object from the two scans. The original ndiff.main function just prints out the text or XML representation of that object, but I wanted more. I wanted a list of new hosts and ports, so that I could generate a shell script to do the details scan. Here's what I wanted the shell script to look like:

The first two lines set up a default output filename but let me pass a different one as the first argument ($1). I debated using the -A or -O flags (which would both add Operating System fingerprinting), but since I'm only scanning ports that I know are open, OS fingerprinting wouldn't be as accurate. Nmap needs both open and closed ports to get a complete fingerprint.

Back in ndiffdetails.py, I needed to build a list of targets and ports. Targets would just be a subset of the first scan's results, which would not include duplicates, so I can use a list to hold them. Ports, on the other hand, could show up on multiple targets. I only want to specify each port once, though, so I stored them as keys to a dictionary, which guarantees no duplicates.
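A minimal sketch of that bookkeeping, using made-up addresses and ports rather than real Ndiff objects. All of the names and data here are illustrative only.

```python
# Hypothetical new open ports per host, as a diff might report them.
new_ports_by_host = {
    "192.168.1.10": [22, 80],
    "192.168.1.23": [80, 443],
}

targets = []  # each host shows up once, so a plain list is enough
ports = {}    # dictionary keys are unique, so repeated ports collapse

for address, host_ports in new_ports_by_host.items():
    targets.append(address)
    for port in host_ports:
        ports[port] = True  # the value is unused; only the key matters

print(targets)        # ['192.168.1.10', '192.168.1.23']
print(sorted(ports))  # [22, 80, 443]
```

Port 80 appears on both hosts but only once in the result. In current Python a set would do the same job as the dictionary.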

Here's what's happening: ScanDiff and HostDiff objects have a property called cost that tells how many changes it would take to change one object (scan or host) into another. If it's greater than zero, then there is a difference, and I want to scan it, but only if the host is still up in the latest scan, and only if the host has new open ports.
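The three conditions can be sketched on their own. The class below is a stand-in invented for this example: apart from cost, its attribute names are assumptions and do not reflect Ndiff's real classes.

```python
from dataclasses import dataclass, field


@dataclass
class FakeHostDiff:
    """Stand-in for one host's diff; NOT Ndiff's real class or attributes."""
    address: str
    cost: int           # number of changes between the two scans
    up_in_latest: bool  # is the host up in the newer scan?
    new_open_ports: list = field(default_factory=list)


def hosts_to_rescan(host_diffs):
    """Keep hosts that changed, are still up, and gained open ports."""
    return [
        diff for diff in host_diffs
        if diff.cost > 0 and diff.up_in_latest and diff.new_open_ports
    ]


diffs = [
    FakeHostDiff("192.168.1.10", cost=0, up_in_latest=True),   # unchanged
    FakeHostDiff("192.168.1.11", cost=2, up_in_latest=False),  # went down
    FakeHostDiff("192.168.1.12", cost=1, up_in_latest=True),   # no new ports
    FakeHostDiff("192.168.1.13", cost=3, up_in_latest=True,
                 new_open_ports=[22, 8080]),
]
print([d.address for d in hosts_to_rescan(diffs)])  # ['192.168.1.13']
```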

Nearly done with ndiffdetails.py! I just needed to write my two output files: the text-format diff, and the shell script for running the follow-up scan.

Writing the diff out is straightforward, since that's the original purpose of ndiff. The shell script was also fairly easy, once I remembered to use the absolute path to nmap. The one complexity was getting a comma-separated list of ports. My first attempt used string.join, but here's how that went:

Saturday, September 4, 2010

Pie in the sky. In a New York minute. On the other hand. Costs an arm and a leg. In the black. Mad skills. All Greek to you? To a non-native English speaker, common idioms like these are often challenging, since their meaning is only loosely tied to the words that are used. In the same way, most programming languages have idioms that can look confusing to someone first learning the language, but which are used to perform common tasks.

Perl has many idioms. Here is a common one for "slurping" a file, or reading the entire file into a single variable (rather than the default of reading one line per "readline" call):

{ local $/; $contents = <$filehandle> }

This code introduces a new scope block with curly braces, and declares the Input Record Separator, $/, local to that block, which makes its value undef instead of a newline. Then the <> operator is used to read a "line" of text from the filehandle, which turns out to be the entire contents of the file (from the current position in the file, of course). The diamond operator itself is rather like an idiom, being a slightly magical way of saying readline $filehandle. The closing curly brace ends the scope block and returns $/ to its previous value.

Here's an idiom from Python that is often used in modules:

if __name__ == "__main__":
    main()

The idea here is to define a special behavior for when the module is used as a script, rather than being imported. When the file is imported, __name__ will be set to the name of the module, and this block will not run. When the file is used like python filename.py, however, the condition will be true, and the main function will be called. This is a convenient way to make a dual-purpose program that can be included as a module or run on its own. It could also be used as a place to put module tests.
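As a sketch of the module-tests idea, the guard can run the module's doctests when the file is executed directly, while staying silent on import. The function double is a throwaway example, not taken from any real module.

```python
import doctest


def double(value):
    """Return twice the given value.

    >>> double(21)
    42
    >>> double("ab")
    'abab'
    """
    return value * 2


if __name__ == "__main__":
    # Runs only when the file is executed as a script, never on import.
    failures, attempted = doctest.testmod()
    print(f"{attempted} doctests run, {failures} failed")
```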

C has lots of idioms to choose from, but here's the most recent one I came across:

This code isn't portable due to compiler differences, but it definitely works with Microsoft Visual Studio 2005. Essentially, declaring a struct with a one-element array at the end lets you allocate a struct with a variable-length array element. The trick is to never declare an instance of the struct, but instead use pointers and allocate dynamic memory. Since this idiom is fairly common, the C99 standard defined a way to declare flexible array members by leaving out the array length, like so:

struct c99struct {
    int num;
    char array[];
};

This form is guaranteed to work in C99-compliant compilers (a set that does not include Visual Studio 2005).

Just a few idioms to get you started. I find the best way of learning new idioms in a programming language is reading other people's code and looking for parts I don't understand.