IDFAQ: What is AMap and how does it fingerprint applications?

Abstract
Gathering information about a remote host is often the first step in launching an
attack. In order to break into a system exploiting some kind of vulnerability it is
important to find as much information as possible. Port scanning, OS fingerprinting and
banner grabbing are only some of the techniques that can be used. This paper
summarises briefly the most common intelligence gathering techniques in use today,
describing some of the tools that employ such techniques. Finally, a tool (amap) is
presented which can be used to probe remote systems in the attempt to recognise an
application listening on a non-standard port.

Introduction
Gathering information about a remote system is often considered the first step an
"intelligent hacker" takes in launching an attack against, or gaining privileged access to, a
target machine. Intelligence gathered in this phase can provide useful information
about vulnerabilities or misconfigurations that can be successfully exploited by the
potential intruder. The more a hacker knows about a particular system (e.g. the OS,
the hardware architecture and services that are running), the greater are his or her
chances of launching a successful attack. By knowing the operating system and
system type, a hacker can do a little research and come up with a list of known
vulnerabilities.
Ofir Arkin describes in [4] a series of steps that an "intelligent hacker" would take in
this intelligence gathering attempt:

Footprinting: this phase consists in gathering as much information as possible
on the target from authorised sources of information (IP address ranges, DNS
servers, mail servers);

Scanning: this phase consists in determining which hosts in the targeted
network are alive and reachable (through ping sweeps), which services they
offer (through port scanning) and which operating systems they run (OS
fingerprinting);

The second phase has a particularly strong impact on all networks, since the number
of automated scanners is constantly increasing and so is this type of traffic at the
borders of every network.
Arkin also classifies the scan types according to the protocol used, as follows:

PING SWEEPS: consists in querying multiple hosts using ICMP packets. It is an old
approach to mapping and the scan is fairly slow. Automated tools for this scan include
fping and gping on Unix, and Pinger on Windows.

BROADCAST ICMP: consists in sending echo requests to the network and/or
broadcast address. Some operating systems (Unix machines in general) will send back
an ECHO REPLY to the attacker source IP, others will ignore these packets.

NON-ECHO ICMP: consists in sending ICMP messages different from ECHO
REQUEST. This is useful when ECHO REQUESTS (PING) are filtered. Messages
used for this purpose are ICMP type 13 (Timestamp request) and type 17 (address
mask request). Automated tools for this type of scan include icmpush and icmpquery.

TCP SWEEPS: consists in sending a TCP ACK or SYN. Receiving a RST response
is an indication that there is a host. However, information provided by this type of
scan is not completely reliable if the target is behind a firewall that can reply with an
RST packet on behalf of the targeted host. Tools that can be used for this type of scan
include nmap and hping.

UDP SWEEPS: consists in sending a UDP packet. This method relies on the ICMP
Port unreachable message as a reply to a UDP packet sent to a closed UDP port. This
type of scan too can be done using nmap and hping.

All of the above are used to determine which hosts on a targeted network are
alive.
Port scanning, on the other hand, is used to determine which services are running on a
host.
Port scanning techniques include:

TCP connect() scan:
A SYN is sent to an "interesting" port;
If a SYN/ACK is received, a service is listening and the TCP handshake phase
is concluded by sending an ACK.

TCP half-opening scan:
A SYN is sent to an "interesting" port;
If a SYN/ACK is received, a service is listening, a RST packet is sent to close
the connection.

Stealth scan:
This is a technique that is meant to pass through filtering rules, not to be
logged by system logging mechanisms. It consists in forging non-standard
combination of TCP flags and relies on the fact that some filtering devices do
not log a TCP connection if the three-way handshake is not completed.

SYN/ACK:
Packets are sent with SYN and ACK flags set. If a port is closed, TCP replies
with a RST because there is no SYN corresponding to the received
SYN/ACK; if the port is open, the packet is discarded silently.
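
As an illustration of the simplest of these techniques, the TCP connect() scan, the following is a minimal Python sketch. The function name connect_scan and its parameters are my own choices for illustration; the other scan types need hand-built packets and raw sockets and are not shown.

```python
import socket


def connect_scan(host, ports, timeout=1.0):
    """Return the subset of `ports` that accept a full TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex() returns 0 when the three-way handshake completes
            # and an errno value (e.g. ECONNREFUSED after a RST) otherwise.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

Because the operating system completes the handshake on behalf of the caller, this kind of scan is the one most likely to be logged by the probed service.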

The techniques that are employed for port scanning are also successfully employed
for identification of the remote operating systems (OS fingerprinting).
Basically, OS fingerprinting is a process for determining the operating system a
remote host computer is running, based on characteristics of the data returned from
the remote host. This can be as simple as connecting to the host and reading a service
banner or as complex as statistical analysis of TCP initial sequence numbers and
flags. OS fingerprinting is based on the fact that there are slight differences in the
implementation of the TCP/IP stack from different vendors. In some cases, these
differences can reveal information as detailed as the version number of the operating
system and the processor architecture.
Tools are available today that can tell with a high degree of precision which
operating system is on the other side by examining subtle details in the way TCP/IP
was implemented in that particular system. They can be divided, according to the
approach they follow, into active and passive fingerprinting tools.
The first approach consists in sending particular combinations of TCP flags or options
(or ICMP messages) observing the responses obtained and comparing them to a
database of known "fingerprints", while the second approach consists in monitoring
(sniffing) incoming traffic and observing certain characteristics of the received
packets.
Active port scanning and OS identification techniques are extensively described in
[1], while [21] describes the basis of passive fingerprinting. More recently another
approach has been described to remote fingerprinting based on the Round Trip Time
(RTT) between a SYN and the SYN/ACK sent by the server. This approach is
described in [16] which also presents a tool (ring) that has been implemented as a
proof of concept for this approach.
An alternative method to TCP/IP stack fingerprinting is identification using client
applications. These methods rely on the behaviour of certain daemons in error
conditions or on the "greeting" information that some applications send as part of the
application level handshaking process. Quite a number of network clients send
revealing information about their host system, either directly or indirectly. Email
clients, for example, often include a lot of information about their systems in the headers;
[12] provides interesting information about the behaviour of the pine mail client in
this respect. Web browsers also send this kind of information.
The different approaches to OS fingerprinting are summarised in the diagram on the
following page (also described in [16]); some of the tools that employ the various
techniques are also indicated.

Active TCP/IP Stack fingerprinting
Several publicly available tools exist that use active fingerprinting techniques. Of
these tools nmap [1] seems to be the most popular choice. Version 3.0 of nmap was
released last August. Nmap uses several techniques for attempting to determine the
host operating system from a network level, some of them primitive in their approach
and others more complex, requiring a good understanding of the TCP/IP protocol.
They include testing the response of the remote system to undefined combinations of
TCP flags, TCP Initial Sequence Number (ISN) sampling, determining the default
setting of the DF bit, TCP initial windows size, ToS setting, fragmentation handling,
types and order of TCP options.
Nmap fingerprints a system in three steps: port scanning, which produces a
list of open and closed TCP and UDP ports; sending of "ad-hoc forged" packets; and
analysis of the responses received, with comparison against a database of known OS
behaviour (fingerprints).
In version 3, nmap has introduced the following additional features:

protocol scan, which determines which protocols (TCP, IGMP, GRE, UDP,
ICMP, etc.) are supported by a given host;

"idlescan" which performs a scan via a "zombie" machine;

ICMP timestamp and netmask requests;

detection of host uptime;

option to specify payload length;

IP Identification Number and TCP Initial Sequence Number predictability
report;

Another tool that is very popular for active scanning is xprobe, based on the
work described in [23]. Xprobe introduced the use of ICMP messages for OS
fingerprinting. Its first version was not very flexible, as it did not have a signature
database and relied on a static decision tree hardcoded in the binary to produce
its results. Xprobe v2.0 [9] is an evolution of xprobe. It uses a "fuzzy" approach to
analyse the results produced by its various tests on the remote system. In this
approach each fingerprinting test is implemented as a separate module. Upon
initialisation, xprobe2 builds its own vector of possible "test matches" (i.e. it builds a
matrix associating a starting value with each operating system that the software
recognises). When a test is executed, the received packet is examined, and the result is
scored and put in the matrix. The "score" can be one of:

YES(3)

PROBABLY_YES(2)

PROBABLY_NO(1)

NO(0)

Once all tests are run, the scores for each test are summed. The top-score OS is
declared as the final result.
The system is modular: new tests can be implemented and added as additional
modules.
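
The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the summing step: the function name best_match, the operating system labels and the individual scores are hypothetical and are not taken from xprobe2.

```python
# Score values as listed in the text.
YES, PROBABLY_YES, PROBABLY_NO, NO = 3, 2, 1, 0


def best_match(score_matrix):
    """score_matrix maps an OS name to the list of its per-test scores.

    Returns (os_name, total) for the highest summed score.
    """
    totals = {name: sum(scores) for name, scores in score_matrix.items()}
    winner = max(totals, key=totals.get)
    return winner, totals[winner]


# Hypothetical results of three tests against three candidate systems.
matrix = {
    "OS A": [YES, PROBABLY_YES, NO],
    "OS B": [YES, YES, PROBABLY_YES],
    "OS C": [PROBABLY_NO, NO, NO],
}
```

With these made-up scores, best_match(matrix) selects "OS B", whose tests sum to 8.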
Other tools that deploy similar techniques are hping [3] and iQ [13].

Passive fingerprinting
Passive host fingerprinting is the practice of determining a remote operating system
by measuring the peculiarities of observed traffic without actively sending probes to
the host.
Five parameters are particularly useful in this technique:

The value of the "Time to Live" field (TTL) in the IP header

The Initial Window Size in the TCP header

The value of the "Don't Fragment" bit (DF) in the IP header

The value of the "Type of Service" (TOS) field in the IP header

The types of TCP options used (if any)

No single signature can reliably determine the remote operating system. However, by
looking at several signatures and combining the information, the accuracy of
identifying the remote host increases.
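
A minimal Python sketch of how several observed parameters can be combined is given below. The signature table is a placeholder invented for illustration and is not taken from p0f, siphon or any real database; the sketch also assumes that the observed TTL has already been normalised to the sender's initial value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    ttl: int          # initial TTL (assumed already normalised)
    window: int       # TCP initial window size
    df: bool          # Don't Fragment bit
    tos: int          # Type of Service field
    options: tuple    # TCP options, in the order seen


# Illustrative placeholder signatures, NOT real fingerprint data.
SIGNATURES = {
    "example-os-1": Observation(64, 5840, True, 0, ("mss", "sackOK", "timestamp")),
    "example-os-2": Observation(128, 16384, True, 0, ("mss", "nop", "nop", "sackOK")),
    "example-os-3": Observation(255, 8760, False, 0, ("mss",)),
}

FIELDS = ("ttl", "window", "df", "tos", "options")


def rank_candidates(obs):
    """Count how many of the five fields agree with each signature.

    Returns a list of (matching_fields, name) tuples, best match first.
    """
    scored = [
        (sum(getattr(obs, f) == getattr(sig, f) for f in FIELDS), name)
        for name, sig in SIGNATURES.items()
    ]
    return sorted(scored, reverse=True)
```

A single matching field (for instance the DF bit) is shared by more than one placeholder signature, which is why the ranking uses the count of all agreeing fields.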
Passive fingerprinting was first described in [21]. Tools based on this technique
include p0f [24] and siphon [12].
Passive fingerprinting has some limitations. If used to analyse incoming traffic, it will
not help in gathering useful information about malicious users since applications that
build their own packets (such as nmap, hping, xprobe, etc.) will not use the same
signatures as the operating system. In addition, it is relatively simple for a remote host
to modify the default values for the TTL, Window Size, DF or TOS settings and,
indeed, this is considered one of the countermeasures system administrators could and
should take against passive fingerprinting.

Using RTT for TCP/IP Stack fingerprinting
A new approach to remote OS fingerprinting at the TCP/IP stack level is described in
[16]. The technique described here relies on the fact that timeouts and regeneration
cycles between a SYN sent by the client and successive SYN/ACK sent by the server
to complete the TCP handshake are loosely specified in the RFC, which means that
almost each OS uses its own method and set of values. Ring is a tool that has been
implemented to prove how the Round Trip Time can be effectively used to recognise
the remote OS.
A typical ring identification session has the following steps:

ring sends a SYN packet to an open port of the target

the target enters the state "SYN_RCVD" and sends back a SYN-ACK

Ring ignores the SYN-ACK

the target remains in the SYN_RCVD state while retransmitting SYN-ACK
segments from time to time; ring measures the time between these segments.

Ring is extensively described in Tod Beardsley's GIAC practical.
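
The last step of such a session, turning the arrival times of the repeated SYN-ACK segments into a guess, can be sketched as follows. This is not ring's code: the function names, the tolerance and the two timing profiles are hypothetical values chosen only to show the comparison.

```python
def retransmission_intervals(timestamps):
    """Seconds between successive SYN-ACK arrivals."""
    return [round(b - a, 3) for a, b in zip(timestamps, timestamps[1:])]


def matching_profile(intervals, profiles, tolerance=0.5):
    """Return the first profile whose intervals all lie within `tolerance`
    seconds of the measured ones, or None when nothing matches."""
    for name, expected in profiles.items():
        if len(expected) == len(intervals) and all(
            abs(e - i) <= tolerance for e, i in zip(expected, intervals)
        ):
            return name
    return None


# Hypothetical timing profiles, for illustration only.
PROFILES = {
    "example-os-1": [3.0, 6.0, 12.0],   # interval doubles each time
    "example-os-2": [3.0, 3.0, 3.0],    # constant interval
}
```

The tolerance absorbs network jitter; on a congested path it would have to be larger, at the cost of more ambiguous matches.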

Banner grabbing
One of the oldest techniques used to identify a remote operating system is "banner
grabbing", which consists in opening a connection to a remote application daemon
and determining the operating system by examining the responses received from
applications like telnet or ftp.
Tools that use this technique span from scanners like Hackbot [10] and ScanSSH [11]
to ad-hoc scripts aimed at particular application services [18] [19]. Hackbot is a
banner grabber that can scan for ftp, mail, ssh banner and DNS version, can perform
whois lookup and various types of web scanning including Nimda and "path revealing
NT problems" [10]. ScanSSH is a scanner that probes SSH servers and classifies them
according to their advertised version number.
Fingerprinting at the application level is also extensively described in [12].
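
In its simplest form, banner grabbing needs nothing more than a TCP connection and a read. The Python sketch below is a minimal example under that assumption; grab_banner is a name chosen here, not a function of any of the tools mentioned.

```python
import socket


def grab_banner(host, port, timeout=3.0, max_bytes=1024):
    """Connect and return whatever the service sends first.

    Returns b"" when the service stays silent until the timeout expires.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            return sock.recv(max_bytes)
        except socket.timeout:
            return b""
```

Services that wait for the client to speak first (web servers, for example) return nothing to this function, which is the gap that trigger-based tools try to fill.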

Defeating Fingerprinting
Various techniques have also been described to defeat fingerprinting. Among them,
the simplest and most immediate is the modification of the default values of a TCP/IP
stack implementation, such as the TTL, Window Size or TCP options.
Another interesting approach can be found in [8] which describes the design and
implementation of a TCP/IP stack "fingerprint scrubber". A "fingerprint scrubber" is
a tool aimed at restricting a remote user's ability to determine the operating system of
another host on the network. It is a piece of software that is transparently interposed
between the Internet and the network under protection (a typical position would be on
the firewall) and performs a set of kernel modifications to avoid recognition of the
operating system based on the characteristics of IP and TCP implementations. It
works both at the network and transport layers by converting ambiguous traffic from a
heterogeneous group of hosts into sanitized packets that do not reveal clues about the
hosts' operating systems. For example, for all packets generated by hosts in the
protected network, it normalizes the IP header flags, forces all ICMP error messages to
contain data payloads of only 8 bytes, keeps track of open TCP connections by
following the three-way handshake, blocks all TCP packets that do not belong
to a valid three-way handshake sequence, and reorders the TCP options within the TCP
header. According to [8], the fingerprint scrubber was tested against nmap, which was
completely unable to determine the operating system with the scrubber interposed.

Probing application level services: amap 0.95
In the previous sections various approaches to remote information gathering were
described that allow identification of the remote Operating System or of the version of
a particular application running on a remote host. A further step ahead in gathering
information about a remote host is provided by amap [25]. Amap is a scanning tool
that probes services running on a remote server on a given port to identify the specific
application that is listening on that specific port. Its purpose is to identify
services that are not running on the standard ports. This tool was released in
March 2002 under the GNU General Public License and can be downloaded from
http://www.thehackerschoice.com/download.php?t=r&d=amap-0.95.tar.gz. It is also
available as a package in the Debian Linux distribution. Its authors describe it as "a
next-generation scanning tool, it identifies applications and services even if they are
not listening on the default port by creating a bogus-communication. amap has a
growing database of know applications also including non-ASCII based applications
and even enterprise services.".
The purpose of the following sections is to explain how amap works and to present
the results of its use in a test environment.
Amap probes the target by sending a number of "trigger" packets at the rate of about
one per millisecond. By default it sends 16 such packets; this value can be modified
with the "-T" option. However, I counted 11 such packets in my tests, probably
because there are only 11 different triggers defined in the signature files for TCP
based application protocols. These "trigger" packets are typically the initiating packet
of an application protocol handshake (see the SSL example in the following sections).
Amap has a list of "triggers" which include binary as well as text handshake
messages.
Triggers are defined in the file: appdefs.trig. The triggers currently defined are shown
in the following table:

The hex string in the table (indicated by a 0x before the first octet) is sent as the
payload of the "trigger" packet in the first message sent after the completion of the
TCP handshake or in the UDP datagram (depending on whether the service uses TCP
or UDP as transport). This list can be expanded very easily, provided one knows the
handshake message of the application that one wants to trigger.
Amap defines a format for describing the trigger:

Where ":" is the separator and:

PROTO_ID is the name of the application level protocol (service) for
which a handshake trigger is provided (e.g. SSL, Telnet, etc.). This value is
looked up when the "-p" command line option is used.

"t" or "u" indicates whether TCP or UDP must be used as transport.

"0|1" is a flag to mark "dangerous" protocols, i.e. applications that might
crash if unexpected or long data is received. When the "-H" command line
option is specified, triggers with a value of 1 in this field will not be sent.

<optional trigger data> can be a hex string or an ASCII string, depending on the
application. A hex string is identified by a leading "0x". All strings are
terminated with a newline character ("\n"). A trigger string is not defined for
application protocols that provide a banner string upon successful completion
of the TCP handshake (e.g. mail servers, ftp servers, ssh daemons, etc.). These
are simply recognised with the same mechanism used by any banner grabbing tool.
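
To make the field descriptions concrete, here is a Python sketch of a parser for one trigger definition. It is not amap's code, and it assumes a line layout of PROTO_ID:t|u:0|1:<optional trigger data>, which I reconstructed from the field list above. It also assumes that the terminating newline is not part of the payload. Both assumptions should be checked against the real appdefs.trig.

```python
from dataclasses import dataclass


@dataclass
class Trigger:
    proto_id: str
    transport: str   # "t" for TCP, "u" for UDP
    harmful: bool    # True when the flag field is "1"
    payload: bytes   # empty when no trigger data is defined


def parse_trigger(line):
    """Parse one line in the ASSUMED layout PROTO_ID:t|u:0|1:data."""
    # Split on the first three ":" only, so the data may itself contain ":".
    proto_id, transport, flag, data = line.rstrip("\n").split(":", 3)
    if transport not in ("t", "u") or flag not in ("0", "1"):
        raise ValueError("malformed trigger line: %r" % line)
    if data.startswith("0x"):
        # Hex string: drop the prefix and any spaces between octets.
        payload = bytes.fromhex(data[2:].replace(" ", ""))
    else:
        payload = data.encode("ascii")
    return Trigger(proto_id, transport, flag == "1", payload)


def triggers_to_send(triggers, skip_harmful=False):
    """Mimic the described effect of -H: drop triggers flagged with 1."""
    return [t for t in triggers if not (skip_harmful and t.harmful)]
```

For example, parse_trigger("EXAMPLE:t:0:0x80 01 02") yields a TCP trigger with a three-byte binary payload; the protocol name EXAMPLE is made up.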
After the trigger has been sent, amap then looks up the response in a list, contained in
the file appdefs.resp and prints out any match it finds.
The possible responses are contained in this file with the following format:

Where ":" is the separator and:

PROTO_ID is the name of the application level protocol (service) containing
the string in its response.

<response string> can be an ASCII string or a binary string, as in the triggers,
and can be prefixed with either a "^", meaning that the specified string must
be found at the beginning of the response, or a "/", meaning that the specified
string must be found somewhere in the received string.

As for the "triggers", it is very easy to expand the list of "recognised" services by
providing the appropriate description in this file.
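
The matching step can be sketched in Python as follows. Again this is not amap's code: the line layout PROTO_ID:<response string>, the position of the "0x" marker after the "^" or "/" prefix, and the three example definitions are assumptions made for illustration.

```python
def parse_response(line):
    """Parse one line in the ASSUMED layout PROTO_ID:<^|/><string>.

    Returns (proto_id, must_be_at_start, needle).
    """
    proto_id, pattern = line.rstrip("\n").split(":", 1)
    if not pattern or pattern[0] not in "^/":
        raise ValueError("malformed response line: %r" % line)
    body = pattern[1:]
    if body.startswith("0x"):
        needle = bytes.fromhex(body[2:].replace(" ", ""))
    else:
        needle = body.encode("ascii")
    return proto_id, pattern[0] == "^", needle


def identify(response, definitions):
    """Return the PROTO_IDs whose pattern matches the bytes received."""
    matches = []
    for proto_id, at_start, needle in definitions:
        hit = response.startswith(needle) if at_start else needle in response
        if hit:
            matches.append(proto_id)
    return matches


# Hypothetical definitions, not copied from appdefs.resp.
DEFINITIONS = [
    parse_response("ftp:^220"),
    parse_response("ssh:^SSH-"),
    parse_response("web:/Server:"),
]
```

A response may match more than one definition, so identify returns a list of candidates and leaves the choice to the caller.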
Amap supports both TCP and UDP, and both ASCII and binary protocols, and provides
a number of options to tune the probe being sent. It can take an nmap machine-readable
output file as its input and probe the services that are listening on ports
found open by nmap.
The options currently available are described below:

-i <file>   Read hosts and ports from the specified file. The format of this
            file is as produced by nmap using the option "-m".

-sT         Scan only TCP ports.

-sU         Scan only UDP ports.

-d          Print the hex dump of the received response. The default is to
            print only the responses that are recognised.

-b          Print ASCII banners if any are received from the probed service.

-o <file>   Log results to the specified file.

-D <file>   Read trigger and response definitions from the specified files,
            instead of the defaults appdefs.trig and appdefs.resp.

-p <proto>  Use only the trigger associated with the specified PROTO_ID.

-T n        Open "n" parallel connections. The default is indicated as 16 in
            the manual pages; however, I counted only 11 in all the tests I made.

-t n        Wait "n" seconds for a response. Default is 5.

-H          Skip potentially harmful triggers. This will skip triggers that are
            marked with the 1 flag in the triggers description file
            (appdefs.trig).

The syntax for running amap is:

amap [-sT|-sU] [options] [target port | -i <file>]

Either -sT or -sU must be specified. "target" is the IP address or fully qualified name
of the probed host and "port" is the probed port number. Target and port must not be
specified if the "-i" option is used.

Testing amap
Amap was downloaded from http://www.thehackerschoice.com/ and compiled on a machine running RedHat Linux 7.2.
No changes were made to the default configurations.
The test environment included the RedHat 7.2 machine running amap at the address
10.0.0.2 and the "target" host running Debian 3.0 at the address 10.0.0.1, both hosts
on the same subnet. A number of services were activated on the debian host, for most
of them the default port was changed to verify that amap could correctly recognise the
applications listening on the ports probed.
Tcpdump was activated on the RedHat host to record the traffic exchanged between
the two hosts.
Amap was used to probe services listening on TCP ports.
Services were distributed as follows:

When amap was started, in each probe 11 TCP connections were opened, with SYN
packets sent a few milliseconds apart. Amap forks as many
child processes as the number of parallel connections specified with the -T option.
Once the TCP handshake is completed, amap sends one trigger packet for each
trigger found in the appdefs.trig file for the chosen protocol (TCP in this case). In
addition, it sends a trigger packet containing the string "\r\nHELP\r\n".
Upon reception of the response from the server, amap checks in the appdefs.resp file
for a match with the pre-defined responses. The response from the server can be
a banner, an error message, or a reply to the handshake initiated by the amap trigger.
As soon as a message is received from the server, the corresponding TCP connection
is closed.
Obviously, depending on the level of logging of the application listening on the
probed port, an error will be recorded on the log file for each "wrong" trigger
received. Finding eleven connections open from the same host all of which, except
possibly one, generating errors on the application level protocol, could be a good
indication of a probe from amap.
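
This observation can be turned into a simple log heuristic. The Python sketch below assumes that application log entries have already been reduced to (timestamp, source address, error flag) tuples. The function name, the two-second window and the default threshold of eleven connections are choices made here for illustration.

```python
from collections import defaultdict


def suspected_probes(events, min_connections=11, window=2.0):
    """Flag sources that look like a default amap run.

    events: iterable of (timestamp_seconds, source_ip, was_error).
    A source is flagged when it opens at least `min_connections`
    connections within `window` seconds and all but at most one of
    them produced an application-level error.
    """
    by_source = defaultdict(list)
    for ts, src, was_error in events:
        by_source[src].append((ts, was_error))

    flagged = []
    for src, items in by_source.items():
        items.sort()
        # Slide over every run of min_connections consecutive entries.
        for i in range(len(items) - min_connections + 1):
            burst = items[i:i + min_connections]
            if burst[-1][0] - burst[0][0] <= window:
                errors = sum(1 for _, err in burst if err)
                if errors >= min_connections - 1:
                    flagged.append(src)
                    break
    return flagged
```

The threshold has to follow the number of triggers actually in use, so a run with a different -T value or a modified trigger file would need different parameters.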
The next two sections describe the results of running amap against an application that
responds with an ASCII banner (FTP) and an application that requires the successful
completion of a binary handshake.

"Text banner" applications: ftp
The traces provided in this section show an extract of a probe on port 31 (running ftp).
Amap was run on 10.0.0.2 with the following options:

For brevity, only some of the connections are shown and the payload is shown only
for data transfer packets (PUSH and ACK bits set).

Amap successfully recognised ftp listening on port 31:

Recognition of the ftp service is based on the banner received from the server. In
particular, the match of the response received from the server with the string:

On the server side, the following error messages are logged in the syslog file. Error
messages are also sent back to the client.

"Binary handshake" application: SSL
The traces provided here show how amap can simulate an SSL connection and
recognise an SSL application running on port 80.
The steps involved in the SSL handshake are as follows:

The client sends the CLIENT_HELLO message containing:

Client's SSL version number

Supported ciphering schemes

Challenge

The server sends the SERVER_HELLO message containing:

Handshake type (server hello)

Server's SSL version

Cipher settings

Cipher suite

Session_ID

Random number

Timestamp

Compression method

The server then sends its certificate

Handshake type (certificate)

Server certificate

Messages 2 and 3 can be combined into a single message like in the trace below.
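
As an illustration of message 1, the Python sketch below builds a CLIENT_HELLO in the SSL version 2 record format, which carries exactly the three items listed above (version, cipher list, challenge). It is not the byte string stored in appdefs.trig. The function name, the advertised version 3.0 and the single cipher spec are assumptions made for the example.

```python
import os
import struct


def build_v2_client_hello(version=(3, 0),
                          cipher_specs=(b"\x01\x00\x80",),
                          challenge=None):
    """Build an SSLv2-format CLIENT-HELLO record.

    cipher_specs: 3-byte SSLv2 cipher kinds (the default is
    SSL_CK_RC4_128_WITH_MD5). challenge: 16 to 32 random bytes.
    """
    if challenge is None:
        challenge = os.urandom(16)
    specs = b"".join(cipher_specs)
    body = struct.pack(
        "!BBBHHH",
        1,                        # message type: CLIENT-HELLO
        version[0], version[1],   # client's SSL version number
        len(specs),               # cipher-spec length (3 bytes per spec)
        0,                        # session-id length (new session)
        len(challenge),           # challenge length
    ) + specs + challenge
    # Two-byte record header: high bit set, low 15 bits hold the length.
    return struct.pack("!H", 0x8000 | len(body)) + body
```

Sending the returned bytes as the first payload after the TCP handshake is what prompts an SSL server to answer with its SERVER_HELLO.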
The trigger that is used for SSL probing is the starting message of the SSL handshake,
i.e. the CLIENT_HELLO message. The binary string contained in the appdefs.trig file
and actually sent by amap is:

The decoded equivalent of this string is (decoding has been obtained using ethereal
[22]):

The response received from the server that allows amap to recognize SSL is (the
decoded format has been obtained using ethereal [22], the content of the certificate is
not shown for brevity, but it can be seen in the trace in the following section)

Amap was run on 10.0.0.2 with the following options:

The following page shows the tcpdump log recorded during the probing on port 80.
Hex dump is shown only for data transfers for brevity. All the connections opened by
amap are shown, as well as all the triggers sent in one run of amap. Payload in red
is the trigger sent by amap (the application protocol probed is indicated beside it). Payload
in blue is the response sent by the server. In this case, the probed service replies only
to the correct trigger (i.e. the SSL CLIENT_HELLO handshake message).

Amap successfully recognized SSL listening on port 80. Here is the content of the
results file:

The "banner" is, in fact, the message sent by the server in response to the handshake
initiated by amap.
The apache-ssl daemon recorded the following error messages in its error log file,
showing that the server also received messages that were not recognized as valid
"client_hello" messages. There was one such entry for each of the triggers sent by
amap.

Detecting amap probes
Amap is not very stealthy if run in its default mode. Eleven parallel connections, each one
(with the possible exception of one) sending an unexpected message at the application
protocol level, are surely recorded in the application log file, provided that the
application maintains a good logging level. In the test that I made, the probes on the
ssh port did not leave any trace at the application level, since no logging was enabled
for this application. On the other hand extensive logging was available for the ftp, http
and http-ssl applications. Therefore the most effective means to detect this probe is to
maintain and check logs at the application level. After all if your mail server receives
a NETBIOS request, something strange must be happening.
Apart from logging at the application level, it is difficult to detect an amap probe
since it uses the OS system calls to the TCP/IP stack and therefore no signature can be
found at the level of the TCP, UDP or IP packet. Nevertheless, it is still possible to
write a snort rule that is able to detect probes from amap when it is run in its default
mode. In fact, we can observe that in all attempts, amap always sends the trigger for
the mount service, specifying a machine name that is hard-wired in the binary string it
sends for this type of trigger. The machine name is "kpmg-pt" and it can be found in
any default probe from this tool.
It is possible to write a rule that looks for this string in the payload for each service
that we possibly want to monitor against this type of probe.
For instance I wrote the following rules for my test environment:

Obviously, $HOME_NET could be substituted with the IP address of the host on
which the specific service is running.
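
Outside snort, the same payload check can be sketched in a few lines of Python. The marker string comes from the observation above; the function names and the (destination port, payload) input format are assumptions made for this example.

```python
# Machine name observed in the "mount" trigger of a default amap probe.
AMAP_MOUNT_MARKER = b"kpmg-pt"


def contains_amap_marker(payload):
    """True when the payload carries the hard-wired machine name."""
    return AMAP_MOUNT_MARKER in payload


def flag_probed_ports(packets, monitored_ports):
    """packets: iterable of (dst_port, payload_bytes).

    Returns the sorted monitored ports that received the marker.
    """
    return sorted({
        port for port, payload in packets
        if port in monitored_ports and contains_amap_marker(payload)
    })
```

Like the snort rules, this check only sees probes that include the "mount" trigger.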

The following alerts were produced by snort, running in NIDS mode with the -dv and
-c options:

Obviously these rules fail if one runs amap with the -p option,
specifying the triggers that should be used and not using the "mount" trigger. But then
again, being a tool that targets the application level, detection is done most
appropriately at the application level by careful monitoring of "strange" messages
sent to the server.

Conclusions
Tools like amap are an additional proof that "security through obscurity" is not the
right approach to secure a network: simply running a service on a different port is not
sufficient to go unnoticed. However, amap can be very useful for system
administrators in finding "hidden" services, in those cases where users run
unauthorised services and try to disguise them using a non-standard port. In this
function it can be used to good effect in combination with tools like nmap. The list of
signatures (triggers and responses) is customisable and can be easily expanded with
the addition of signatures of proprietary protocols. As its authors say: "With amap,
you will be able to identify that SSL server running on port 3445 and some oracle
listener on port 23!".