
BOTNET METRICS

Basis and Outline of This Report

The CSRIC III Working Group Descriptions and Leadership document (last updated November 15th, 2012)1 provides, among other things, that "[Working Group 7] shall identify performance metrics to evaluate the effectiveness of the ISP Botnet Remediation Business Practices at curbing the spread of botnet infections."

This report is provided in fulfillment of that requirement, and has the following structure:

I. Expected Audiences for This Report
II. Thinking Precisely About What Is and Isn't a "Bot"
III. What Sort of Botted "Things" Should We Be Trying to Count?
IV. Some Substantive Questions About Bots
V. Some Statistical Questions Associated With Botnet Measurements
VI. ISPs As a Potential Source of Botnet Data
VII. Sinkholing, DNS-Based Methods, Direct Data Collection and Simulations?
VIII. Recommendations
Appendices

1 http://transition.fcc.gov/pshs/advisory/csric3/wg-descriptions.pdf at PDF page 7.


I. Expected Audiences for This Report

While the primary audience for this report on botnet metrics is the FCC CSRIC itself, it is not the only audience.

FCC's efforts been compared to ours? Are there opportunities for us to collaborate on targeted joint initiatives? If so, where's the low-hanging fruit?"

(e) Senators and Representatives may need botnet metrics to determine if new legislation is required, or if existing legislation requires additional funding in order to be fully effective.

(f) Public interest organizations may want botted users to be protected from botnet threats, but only in a way that's appropriate and privacy-respectful. A sense of the magnitude of the problem is critical to sizing up what might be necessary, and metrics also provide programmatic transparency.

(g) Members of the media will be interested in understanding and reporting on government efforts and initiatives, and will want to see documentation about how much work is being done on botnets, and to what effect.

(h) Security software and security hardware vendors may view botnet metric requirements as potentially driving new markets for new security gear -- or defining how well their existing gear works. In a competitive marketplace, metrics may define "winners" and "losers" and be important drivers for keeping existing customers and gaining new ones.

(i) Law enforcement agencies may eagerly seek botnet metrics to help them target and optimize their enforcement activities, endeavoring to target their limited cyber enforcement dollars in a way that gives taxpayers the most "bang for their buck."

(j) Academic and commercial sector cybersecurity researchers might want access to raw empirical data about botnets for use in their own analyses.

(k) Governments overseas may look at our bot metrics to see if this program is something that they should be doing, too.


An unfocused/ad hoc botnet metrics program is unlikely to serendipitously meet the requirements of all those diverse audiences. The metrics that are needed now -- and that may be needed in the future -- must be explicitly and carefully defined, or we run the risk of finding ourselves with no evidence with which to answer critical operational and policy questions relating to bots.

At the same time, we must remain cognizant of the fact that collecting and reporting data about bots is potentially burdensome, intrusive, and expensive. Therefore, any data that is targeted for collection should be data that's needed and which will be used in meaningful ways that justify the cost of its acquisition.

A Specific NON-Audience: Botmasters and Other Cybercriminals: While there are many legitimate audiences that are welcome to industry botnet metrics, there is one explicit non-audience: botmasters (and other cybercriminals). We need to explicitly recognize that botnet metrics, done wrongly, have the ability to potentially help our enemies and undercut anti-botnet goals. A few simple examples:

(a) Some ISPs may worry that if publicly identified as working diligently to combat botnets, they may be targeted for serious and ongoing Distributed Denial of Service (DDoS) attacks by unhappy botmasters.

(b) Giving detailed and accurate information about where and when bot activity was observed may be sufficient for a botmaster to identify (and subsequently avoid!) honeypots (or other data collection infrastructure) in the future. If that happens, valuable (sometimes literally irreplaceable) data collection sources and methods may be compromised.

(c) If our botnet metrics include "per-bot" cleaning and removal statistics, botmasters might be able to use that feedback to learn which bots have proven hardest to remove, information that they can then use to "improve" future bots, making them harder to mitigate or remove.

4 ASNs, or Autonomous System Numbers, are a convenient way of referring to a particular ISP, or perhaps part of an ISP. For example, Google uses AS15169, Sprint uses AS1239, Intel uses AS4983, the University of California at Berkeley uses AS25, and so on. For more on ASNs, see http://pages.uoregon.edu/joe/one-pager-asn.pdf

5 ISPs and other entities receive "blocks" or "ranges" of IP addresses for their use. For example, the University of Oregon has 128.223.0.0/16 (the IP addresses 128.223.0.0 through 128.223.255.255) for its use, among other netblocks. These network blocks represent another way of referring to an entity online, albeit one that is less convenient than ASNs because a large ISP may have accumulated hundreds of netblocks over time as their requirements evolved, or as a result of mergers and acquisitions with other ISPs. Ownership of IP netblocks is documented in a distributed online database known as whois. Unfortunately, whois servers often rate limit (or otherwise control) the number of queries that a given user can make during any given period of time, making whois frustrating to use in conjunction with large datasets.

6 Inverse addresses (also known as "in-addrs" or "PTRs") are the domain names that get returned when you look up an IP address. They can provide hints about who controls a given IP address (although they are often not present, and can be subject to spoofing).

ISPs code only targets U.S. ISPs and their customers.

(d) That study looked at infection rates for households, rather than computers. There might be a half dozen computers in a household, but if even one is infected, the entire household will get flagged as bad. This can skew the proportion of a population that ends up getting reported as infected.8

(e) On the other hand, what about infected devices other than just desktops or laptops? For example, what about smart phones and tablets? Are we also counting infections on those devices? What about other sorts of devices, such as "smart TVs" or Internet-connected gaming consoles?

(f) Not all broadband customers (nor all infected broadband customers) are "households."

level that we may eventually reach (even if it isn't zero) that we can all agree is "good enough?"

The answers to those questions largely shape the botnet metrics space, and those choices largely determine the answer that one ultimately finds.

We need to address these issues if we're to be able to provide meaningful metrics about the state of bots in the United States, and if we're to be able to measure the potential impact of the ABCs for ISPs code.

Let's begin with the issue of what is or isn't a bot.

7 There's often a tendency to treat "North America" as if it is just comprised of the United States and Canada (with everything else in the Western Hemisphere being part of Latin/South America and the Caribbean), but in fact there are actually 29 countries that are serviced by ARIN, the "North American" Internet number allocation organization. If a researcher determines what's a "North American" household by checking to see if the IP address associated with each infection came from ARIN (rather than some other entity, such as RIPE, APNIC, LACNIC or AFRINIC), the anti-botnet efforts of U.S. ISPs will be potentially conflated with the botnet experiences of 28 other countries or territories. That means that even if U.S. botnet numbers improved, domestic improvements (if any) may end up marginalized or eliminated by a hypothetical worsening of the bot numbers of Canadian/Mexican/Caribbean/other "North American" countries.

8 To understand this distinction, imagine a hypothetical media report about the impact of the flu on 150 area businesses employing 30,000 people. If each business had exactly one employee sick with the flu (a total of 150 sick people among all area businesses), we could either report that "100% of businesses had been hit by the flu" (since each business does in fact have exactly one employee sick with the flu), or that "just 1/2 of one percent of all employees have the flu" (e.g., since 150/30,000*100 = 0.5%). These two different metrics convey radically different stories about the hypothetical flu problem in area businesses, right?
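The footnote's arithmetic is worth making concrete, since the same choice (what population do we divide by?) recurs throughout the report's botnet metrics. A minimal sketch using the footnote's hypothetical numbers:

```python
# Hypothetical figures from the footnote: 150 area businesses,
# 30,000 employees total, exactly one sick employee per business.
businesses = 150
employees = 30_000
sick_total = businesses * 1  # one sick person per business -> 150 overall

# Unit of analysis: businesses. Every business has at least one case.
pct_businesses_hit = 100 * businesses / businesses   # 100.0%

# Unit of analysis: employees. Only 150 of 30,000 are sick.
pct_employees_sick = 100 * sick_total / employees    # 0.5%
```

Same data, two radically different headline numbers; the difference is entirely in the denominator.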


II. Thinking Precisely About What Is and Isn't a Bot

What Exactly Is a Bot?

In an earlier report,9 Working Group 7 provided a general definition of "what's a bot," stating:

A malicious (or potentially malicious) "bot" (derived from the word "robot" [...]) refers to a program that is installed on a system in order to enable that system to automatically (or semi-automatically) perform a task or set of tasks, typically under the command and control of a remote administrator (often referred to as a "bot master" or "bot herder"). Computer systems and other end-user devices that have been "botted" are also often known as "zombies". Malicious bots are normally installed surreptitiously, without the user's consent, or without the user's full understanding of what the user's system might do once the bot has been installed. Bots are often used to send unwanted electronic email ("spam"), to reconnoiter or attack other systems, to eavesdrop upon network traffic, or to host illegal content such as pirated software, child exploitation materials, etc.

While that's a fine definition as far as it goes, it may not sufficiently emphasize one critically important point: not all malware is bot malware.

Characteristics that can be used to differentiate malware in general from bot

5. The extent to which the industry often fails to identify what malware is or isn't bot malware can be seen in this graph from the Microsoft Security Intelligence Report,10 which breaks out ten different types of malware, but makes no mention of what is or isn't a bot:

A Botnet Malware Registry? To help eliminate ambiguity over what is and isn't a bot, one option would be for the industry to create a voluntary botnet malware registry. An excellent foundation for a registry of this sort might be the site http://botnets.fr/ which currently catalogs over 300 botnet families by name.11

Once an agreed upon bot registry is available, whether that's botnets.fr or something else, malware that has been found to be "bot" malware could then be listed in that registry.

While this might sound like a small step, it actually enables significant bot-related research. For instance, anti-malware vendors, when analyzing and cataloging malware they detect, could then potentially voluntarily add an "is this malware a bot?" attribute to their malware catalog entries (based on the registry), and potentially employ that attribute as part of their periodic malware reporting. For example, in addition to any other statistics an anti-malware vendor might share, an anti-malware vendor might also hypothetically report on:

(a) The number of new bot malware families discovered that quarter,

(b) The percent of systems seen infected with each of the dozen most significant bots, and

(c) The total number of hosts detected as infected with one or more bots.

Having a common botnet definition would allow multiple reports of that sort to be compared: do all anti-malware vendors see the same number of new bot malware families? Do they see approximately comparable new levels of infection? Until we agree on what is and isn't a bot, it is impossible to tell if apparent differences are due to bot definitional differences, or other differences (such as a different customer base, differing detection efficacy, etc.)
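As a sketch of how such a registry-driven attribute might work in a vendor's catalog: the family names, registry entries, and record format below are invented purely for illustration.

```python
# Hypothetical shared registry of malware families agreed to be "bot" malware.
BOT_REGISTRY = {"zeus", "kelihos", "torpig"}  # invented example entries

# A vendor's detections for the quarter: (family, host_id) pairs (invented).
detections = [
    ("zeus", "host-1"),
    ("zeus", "host-2"),
    ("fakeav", "host-2"),   # malware, but not in the bot registry
    ("kelihos", "host-3"),
]

# Tag each catalog entry with an "is this malware a bot?" attribute.
tagged = [(family, host, family in BOT_REGISTRY) for family, host in detections]

# Metric (c) above: total number of hosts infected with one or more bots.
botted_hosts = {host for family, host, is_bot in tagged if is_bot}
```

With every vendor tagging against the same registry, a count like `len(botted_hosts)` becomes comparable across vendor reports.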

"If It Acts Like a Botnet:" In some cases (for example, in the routine case of an ISP that does not have direct administrative access to a customer's system), if a system exhibits empirically observable bot behavior (such as checking in with a botnet command and control host,12 spewing spam, or contributing to a DDoS13 attack), even if a particular bot cannot be identified, the system should still get tagged as being botted.

Tagging botted systems based on their externally observable behavior may be necessary when direct access to the system isn't possible, but also in cases where systems are infected with malware that's so new that antivirus companies haven't yet had time to identify that malware.

Therefore: If a system acts like it is infected by a bot, even if it cannot be identified as infected by a particular type of bot malware, tag it as botted.

11 While the primary language of that site is French, content is also available in English via the selector in the left hand column.

12 A "command and control host" is a system that a botmaster uses to run his botnet.

13 A DDoS attack is a Distributed Denial of Service attack, often conducted by flooding a site with so much bogus traffic that the site's network connection or servers can't keep up, thereby preventing legitimate users from being able to use that site.


III. What Sort of Botted "Things" Should We Be Trying to Count?

Now that we've agreed on what is and isn't a bot, we're a large part of the way to being able to ask meaningful/measurable questions about them. However, we also need to decide one other critical issue, and that's deciding precisely what sort of botted "things" we're going to count.

What I'm Able to Measure Will Depend on My Role In the Ecosystem:

(a) If someone were to go "boots on the ground" and actually check all the devices in a number of households to see whether any device is infected, those researchers would have the option to

While going "boots on the ground" might seem to provide the most flexibility and most comprehensive data collection options, it is also the most potentially expensive option, and it presumes access to a household's systems, access that might be viewed as intrusive and routinely denied.

(b) On the other hand, if I'm an ISP, and I detect bots based on malicious network activity associated with a particular IP address, I'm likely going to count botted IP addresses or botted subscribers.

(c) If I'm an antivirus company or an operating system vendor and I scan/clean end-user systems, I'm going to end up counting individual infections,14 or perhaps botted systems.15

(d) If I'm a survey research outfit, and I call people up on the phone and ask, "Have you ever been infected with a 'bot'?" those survey researchers are going to end up counting botted users.16

Different parties will contribute different views of the problem. While those views may be different, all are potentially valuable and important.

Desktops and Laptops Only? Are we going to measure all kinds of botted devices, or are we just going to count botted desktops and laptops? For example, consider smart phones in particular. The number of smart phones is now material, and malware is increasingly attacking and infecting at least some types of those devices.17 Users also have a growing number of tablets, Internet-connected "smart TVs" and set-top boxes, gaming consoles, and other devices that may be targeted for compromise. Should all those sorts of devices be counted, if botted?

We think so, yes. To ensure a comprehensive botnet "picture," we suggest that any program of botnet metrics should include ALL types of Internet-connected devices, but, as that data is collected, it should include the type of device involved, thereby allowing analysts the option of reporting about all devices, or just some particular subset of all devices, such as just traditional laptops and desktops, or just smart phones.
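A sketch of that suggestion: if each detection record carries the device type, analysts can report on all devices or on any subset after the fact. The records and type labels here are invented for illustration.

```python
# Hypothetical detection records, each tagged with the device type involved.
detections = [
    {"host": "h1", "device_type": "laptop"},
    {"host": "h2", "device_type": "smart_phone"},
    {"host": "h3", "device_type": "desktop"},
    {"host": "h4", "device_type": "gaming_console"},
]

all_devices = len(detections)  # report covering ALL Internet-connected devices

# ...or just a particular subset, e.g. traditional laptops and desktops:
traditional = [d for d in detections
               if d["device_type"] in ("laptop", "desktop")]

# ...or just smart phones:
phones = [d for d in detections if d["device_type"] == "smart_phone"]
```

The key design point is that the subsetting happens at reporting time, so no re-collection is needed when a new question ("what about consoles?") comes up.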

14 A single system might have multiple simultaneous infections.

15 One user might have two or three systems, or one system might be shared by multiple users.

16 If we talk to three different people from the same household, and they all used the same botted computer, we might potentially get three reports that are all about that single botted system. This scenario also runs afoul of multiple other issues, including things as basic as the fact that users may not know what a bot is, or they may forget having been botted.

17 Virtually All New Mobile Malware is Aimed at Android, http://www.androidauthority.com/mobile-malware-aimed-android-112403/


Online Devices Only? We must recognize that in most cases we can only count botted systems that are "live" or "that we can see" on the network. Other devices may be botted, but remain undetected/unknown as a result of how we detect botted hosts.

An easy way to understand this is to imagine that we're an enterprise that actively scans its corporate systems with Nessus18 or a similar security scanning tool in an effort to identify systems that appear to be botted. Obviously, if a system isn't connected to the network, or isn't powered up when our network scan takes place, a potentially infected system won't be able to be found with that tool.19

Similarly, if an ISP identifies botted hosts based on spam (or other network artifacts visible to the ISP "on the wire"), a botted host that's offline or in a walled garden won't be seen "making noise" or "causing problems" on the ISP's backbone20 where instrumentation exists, and also won't/can't be noted as being botted by external parties relying on externally visible symptoms to tag systems as being botted.21

For instance, if an ISP is detecting bots based on characteristic botnet network activity, when and for how long the ISP collects bot-related network flow records will strongly influence botnet detection rates. To understand why, note that if we only watch for botted hosts during a brief window during the business day, we might miss home systems that are turned off except when a family member is using them at night. On the other hand, if we collect bot data during evening "prime time" hours, we will likely miss any botted work systems that may only be on and in use during the normal 8-to-5 work day. Therefore, if you try to count botted hosts during too brief a time period, you may miss some bots.

If we go to the other extreme, and continually watch for botted hosts, we will virtually certainly see the same botted host more than once, and since we can't tell one bot apart from another, we run the risk of counting a single bot more than once, simply because in most cases there's no unique identifier that we can use to track a particular botted host from one sighting to the next time we see it.

Unique Identifiers: If each botted host did have a unique identifier, we could collect data over a protracted period and not have to worry about counting the same botted host multiple times. Unique identifiers for botted hosts would also greatly simplify the process of aggregating (or "rolling up") fine grained records appropriately: for example, we could tag a record about each individual infection with the unique identifier associated with that botted host, and then we could easily consolidate that data if/when we wanted to do so. Unfortunately, if we don't or can't use unique identifiers, our measurements may end up profoundly flawed.
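A sketch of the roll-up described above, assuming (hypothetically) that each sighting record carries a stable unique host identifier; the identifiers and dates are invented:

```python
from collections import defaultdict

# Hypothetical sightings collected over a protracted period:
# (unique_host_id, date). The same botted host appears on several days.
sightings = [
    ("host-A", "2012-12-01"),
    ("host-A", "2012-12-03"),
    ("host-B", "2012-12-02"),
    ("host-A", "2012-12-09"),
]

# Without unique identifiers, each sighting looks like a separate bot.
naive_count = len(sightings)          # 4 "bots" (overcounted)

# With identifiers, fine-grained records roll up to distinct hosts.
by_host = defaultdict(list)
for host_id, date in sightings:
    by_host[host_id].append(date)
unique_count = len(by_host)           # 2 distinct botted hosts
```

The gap between `naive_count` and `unique_count` is exactly the double-counting risk the continuous-observation approach runs into.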

18 http://en.wikipedia.org/wiki/Nessus_%28software%29

19 This raises an interesting methodological question: if we scan and can't reach a potentially botted host, how often should we attempt to rescan it? Once? Twice? Time after time after time? Never?

20 A related potentially important methodological question (at least from the point of view of ISPs actively mitigating botted hosts): if a botted system has been detected as being botted and successfully put into a so-called "walled garden" where it can't cause problems for other Internet users, should that host still be counted as "botted"? Or do we need an additional category to capture systems in this status, "botted but not online and able to cause problems," perhaps?

21 This may be a material problem, akin to giving cough syrup to lung cancer patients. You may stop the externally visible symptoms with symptomatic treatment, but you're not curing the underlying disease. Sometimes having bad symptoms can be a good thing when it comes to forcing attention to be paid to a serious underlying problem.

DHCP address assignments are normally recorded in DHCP logs, and can be correlated with customer name and contact information when necessary, at least until DHCP logs get discarded.


Another Reason Why IP Addresses Are Not Unique -- Network Address Translation (NAT): Another potential complication when it comes to mapping bots to IP addresses is the use of NAT. NAT is a public-address-conserving technology that allows multiple systems to share a single public IP address. Because multiple systems share a single public IP address, malicious traffic from multiple systems may appear to come from just one IP address, resulting in an underestimate of the number of truly infected systems.
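A minimal sketch of that undercount; the private addresses and NAT mappings below are invented (using documentation address ranges):

```python
# Hypothetical NAT table: internal (private) hosts -> shared public address.
nat_table = {
    "10.0.0.2": "203.0.113.7",
    "10.0.0.3": "203.0.113.7",
    "10.0.0.4": "203.0.113.7",
    "192.168.1.5": "198.51.100.9",
}

# Suppose all four internal systems emit malicious traffic. An observer
# outside the NAT sees only the public source addresses.
infected_internal = set(nat_table)                           # 4 truly infected
observed_public = {nat_table[h] for h in infected_internal}  # only 2 IPs seen

undercount = len(infected_internal) - len(observed_public)   # 2 hosts missed
```

An IP-address-based census here would report 2 "bots" where 4 infected systems actually exist.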

IPv6 Addresses: As the Internet runs out of traditional IPv4 network addresses, ISPs are beginning to use IPv6 addresses to supplement rapidly depleting stocks of IPv4 addresses. IPv6 significantly complicates the process of measuring botnets via their network traffic. Let's just mention a few of many reasons why this is true:

(a) Many ISPs may not have IPv6 network flow monitoring that's on par with their IPv4 monitoring

High end server with multiple CPUs/multiple cores, lots of RAM, and gigabit ethernet connectivity

Should each of those systems simply be counted as one "botted host"? There would certainly be a huge difference when it comes to the amount of spam or the volume of DDoS traffic that each of those two systems might respectively deliver... Does this mean that we should weight botnet detections by some measure of capacity (such as their average spam throughput, or their average DDoS output)?
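One way to sketch such a capacity weighting, using invented throughput figures (average spam messages per hour) for two hypothetical machines:

```python
# Hypothetical detections, each weighted by an assumed capacity measure
# (average spam throughput, in messages per hour -- invented numbers).
detections = {
    "low-end-pc": 200,       # modest machine on a slow link
    "gbit-server": 50_000,   # high-end server with gigabit connectivity
}

raw_count = len(detections)           # each counts as one "botted host": 2
weighted = sum(detections.values())   # total capacity: 50,200 msgs/hour

# By capacity, the server dominates even though it's half the raw count.
share = detections["gbit-server"] / weighted
```

Under the raw count the two hosts are equal; under the weighted view the server accounts for over 99% of the harm capacity, which is the asymmetry the question above is pointing at.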

Some Apparent "Bot" "Hits" May Not Be Real: For example, imagine researchers investigating a botnet: it is conceivable that they might attempt to "register" or "check in" fake "bots" of their own creation in an effort to figure out how a botnet operates, sometimes in substantial volume. In other cases, imagine an antibot organization that is attempting to proactively interfere with a bot by "poisoning" it with intentionally bogus information about fake botted hosts. Thus, if you're measuring bots by counting the hosts that "check in" rather than the bots that are actually seen "doing bad stuff," you run the risk of overestimating the number of "real" bots that actually exist.


What One Nascent Metrics Program Chose to Count: The recently established MAAWG malware metrics program focuses on "subscribers" as the unit of analysis. Because the number of subscribers will fluctuate over the course of a typical month, MAAWG decided to use the number of subscribers as of the last day of the month.

Participant ISPs then report the number of unique subscribers that have been found to be infected one or more times during the month. (What is/isn't an infection isn't explicitly defined, except to say that it should be an "infection" that's serious enough to motivate the ISP to contact the user about the infection.)

Participant ISPs will also report the number of unique subscribers that have been notified of a problem by whatever means (SMS, phone call, email, web redirection/browser notification, etc.) Multiple notices to the same subscriber count as one. This does not imply that the subscriber has read/received the notice.

Given those values, one can compute the percentage of subscribers that have been found to be infected one or more times during a given month, and the percentage that have been notified of that infection.
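The computation itself is straightforward; here it is as a sketch with invented monthly figures (the MAAWG program defines only the inputs, not these particular numbers):

```python
# Hypothetical monthly figures, following the MAAWG definitions:
subscribers = 1_000_000    # subscriber count as of the last day of the month
unique_infected = 4_200    # unique subscribers found infected >= 1 time
unique_notified = 3_900    # unique subscribers notified (multiple notices
                           # to the same subscriber count as one)

pct_infected = 100 * unique_infected / subscribers   # 0.42%
pct_notified = 100 * unique_notified / subscribers   # 0.39%
```

Note that fixing the denominator to the last-day subscriber count is what makes these percentages comparable month over month despite churn.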

As a metric, note that this metric implicitly accepts some compromises, e.g.:

(a) Given the definition of this metric, we can't talk about how many infected customer systems may be present, nor how many distinct infections were seen, nor can we talk about whether a particular customer was repeatedly reinfected, or if the infection was on a laptop, smart phone, gaming console, etc.

(b) The MAAWG program doesn't focus solely on bots, since many ISPs want to protect their customers from all types of serious malware infections, not just bots.

(c) Choice of a month-long window means that day-to-day or week-to-week infection trends won't be able to be identified.

(d) There are many other potential measurements related to infected customers that aren't getting reported (for example, how much customer effort was required to disinfect and harden a typical infected system?)

These and other limitations were explicitly recognized and accepted by MAAWG as part of its pragmatic program design decisions, recognizing that if it made the malware metrics reporting program too difficult or too time-consuming or too complex, many ISPs would simply opt out of participating. Keeping the program simple and easy to participate in increases the number of participating ISPs.

Another example of a pragmatic measurement choice was the decision to focus on the number of unique customer detections and customer notifications rather than the number of customer systems that have been cleaned up or rebuilt. (Because customers may use third party services to clean up or rebuild their systems, ISPs may not know if a customer's system has been cleaned up, rebuilt, replaced, or remains infected but offline/dormant.)


IV. Some Substantive Questions About Bots

The MAAWG example just mentioned will yield some botnet related metrics. However, what are the other substantive questions about bots that we might like to be able to answer?

What's the Order of Magnitude of the Bot Problem? If botted hosts are rare, we likely don't need to worry about them. On the other hand, if ISPs are being overrun with botted hosts, we ignore all those botted hosts at our peril. If we don't (or can't!) at least roughly measure botnets, we won't know if bots are a minor issue or a huge problem, and if we don't know roughly the size of the problem, it will be impossible for industry or others to craft an appropriate response.

Note that when we talk about "order of magnitude," we're NOT talking about precise measurements; we're just asking, "Are 10% of all consumer hosts botted? 1% of all hosts? 1/10th of 1% of all hosts?" etc. We should at least be able to do that, right?

One example of such an estimate can be seen in Gunter Ollmann's "Household Botnet Infections":27

Out of the aggregated 125 million subscriber IP addresses that Damballa CSP product monitors from within our ISP customer-base from around the world, the vast majority of those subscriber IP's would be classed as "residential" -- so it would be reasonable to say that roughly 1-in-5 households contain botnet infected devices. [...] Given that the average number of devices within a residential subscriber network is going to be greater than one (let's say "two" for now -- until someone has a more accurate number), I believe that it's reasonable to suggest that around 10% of home computers are infected with botnet crimeware.

There are 81.6 million US households with broadband connectivity as of 10/2010.28 If 20% of 81.6 million US broadband households were actually to be botted, that would imply that there are 16 million+ bots in the US alone... I'm not sure that I "buy" that.

Let's consider another estimate, from the Composite Block List ("CBL"). On Sunday, December 9th, 2012, the Composite Block List knew about 174,391 botted host IPs in the United States.29 There are 245,000,000 Internet users in the US as of 2009, according to the CIA World Fact Book.

174,391/245,000,000*100 = 0.0711% of all US Internet users are potentially botted [assuming 1 computer/user]

Worldwide, that puts the US near the bottom of all countries, in 149th place on the CBL. On a per-capita-normalized basis, that means that the US is among the least botted of all countries.

Wonder which nations are the worst? Looking just at countries with 100,000 or more listings, the most-botted countries are Byelorussia (137,658 listings, with 5.2% of its users botted), Iraq (196,046; 2.4%), Vietnam (431,642; 1.85%), and India (1,093,289; 1.8%).

In fact, if there are only 175,000 bots here in the U.S., botted hosts have effectively become a "rare disease," in that when it comes to traditional medicine, the U.S. definition for a rare disease is one which afflicts fewer than 200,000 people in the United States.31
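The two back-of-the-envelope estimates above can be reproduced directly from the report's own figures:

```python
# (1) Household-based estimate (Ollmann / Damballa, "1-in-5 households"):
us_broadband_households = 81_600_000               # as of 10/2010
botted_households = us_broadband_households // 5   # 16,320,000 implied bots

# (2) CBL-based estimate (snapshot of December 9th, 2012):
cbl_us_listings = 174_391                          # botted US host IPs on the CBL
us_internet_users = 245_000_000                    # CIA World Fact Book, 2009
pct_users_botted = 100 * cbl_us_listings / us_internet_users
# ~0.071% of US Internet users [assuming 1 computer/user]
```

The two figures differ by roughly two orders of magnitude, which is exactly the discrepancy examined next.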

How Could Those Two Estimates Be So Vastly Different? We think that there are two main reasons for that vast discrepancy in those estimates:

(a) The two estimates count different sorts of things (households that are detected as being botted vs. botted IPs seen sending spam)

(b) The two estimates measure different populations (users worldwide who are connected via ISPs with enough of a bot problem that those ISPs have been motivated to purchase a commercial network security solution vs. users in the United States, where a real push to control bots has been underway for years)

Another Very Basic Question: How Many Different Families of Bots Are Out There? That is, are there three main types of bots actively deployed right now? Thirty? Three hundred? Three thousand? The proposed malware registry should allow us to answer this question...

The number of unique types of bots is important because it tells us a lot about how hard it might be to get the "bot problem" under control. If there are only a handful of major bots, concerted effort should allow government authorities to shut them down, if the government makes doing so a priority. Conversely, if there are three thousand different types of bots out there, getting all those bots under control would be far harder.

Closely related to the question of how many types of bots are out there: how many botmasters are out there? We might expect that the number of botmasters would roughly track the number of unique bots, but one bot "code base" might be "franchised" and in use by multiple botmasters, or one botmaster might run multiple different bots, eroding a direct one-to-one relationship between the two metrics.

How Many Users Are Covered by the ABCs for ISPs Code? While some organizations have attempted to identify the number of users covered by the ABCs for ISPs Code, it can be hard to dig out subscriber estimates for participating ISPs. Should one of our "metrics" simply be a clean report of how many subscribers are covered by the Code?

31 http://rarediseases.info.nih.gov/RareDiseaseList.aspx


Are There Any Trends Relating to Bots? That is, in general, is the bot problem getting better or worse over time? Some anti-malware companies are already sharing data of this sort, at least for some types of bots. See, for example, the following graph from McAfee for the United States:32

Do Bots Show Any Sort of Operational Patterns?

For example, hypothetically, does most botnet spam get sent"overnight" when US anti-spam folks are asleep but Europeans have already woken up? (remember, Europe is +7or +8 relative to the US Pacific Time)

Does the number of bots increase during the weekend, and then go backdown during the week? (This might be the case if a regularly employed botmaster just ran his or her botnet as away to supplement his or her income on weekends, or if fewer anti-botnet people were paying attention/whackingbots on weekends)

Does the number of bots increase at the start of the month when people get paid and havemoney to buy spamvertised products, or does it peak in the month before Christmas (when people are most likelyto be Christmas shopping), perhaps?

If law enforcement arrests a botmaster or takes down a botnet, can we see a noticeable drop in the amount of spam sent, or do other botnets immediately step up and fill that now-vacant niche in the botnet ecosystem?

Are there bots that are installed but totally idle? If so, why? Excess capacity?

Understanding how bots are being used will help us to figure out how we should try to measure bots.

For example, if bots are no longer being widely used to send email spam, we shouldn't attempt to measure botnet populations based on the amount of email spam we observe, right?

34 http://www.eleven.de/botnet-timeline-en.html


How Bots Are Being Used May Change Who's Interested in Them:

Hypothetically, if bots are no longer in widespread use for spamming, anti-spammers may lose interest in bots. On the other hand, if bots start to be widely used to conduct distributed denial of service attacks against critical government sites, that change might increase interest in bots in the homeland security and national security communities.

We really need to understand/monitor the botnet workload profile as seen "in the wild," recognizing that this can change as quickly as the weather.

"Comparatively Speaking..." Another set of potentially interesting botnet metrics are comparative metrics:

(a) Are American computers getting botted more (or less) than Canadian computers, or computers in other countries?

(b) Not all countries are the same size. Should we normalize botnet infection rates by the population of each country (or by the number of people in each country who have broadband connectivity)?

(c) Are all ISPs within the United States equally effective at fighting bots, or are some doing better than others? For example, if an ISP adopts the voluntary "ABCs for ISPs" code, do they have fewer bots than other ISPs that don't adopt it?

(d) Are there other important comparative differences that we can identify? For example, are older users (or younger users) more likely to get botted? Does it seem to matter what antivirus product or web browser or email client or operating system people use?

Comparative Raw Bot Levels Per Country From the CBL:35

China looks pretty bad in that list, but then again, remember that China's a big country. How do they look, comparatively, once we adjust for their population?

35 http://cbl.abuseat.org/country.html as of Sunday, December 9th, 2012.


Selected CBL Listings By Country, Normalized Per Capita:

Once we've normalized per capita, China is no longer leading the list (that dubious "honor" now goes to Byelorussia), but China is still fully an order of magnitude more botted than the United States is, even if China is fully an order of magnitude less botted than Byelorussia is.
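The per-capita adjustment described above can be sketched in a few lines of Python. The listing counts and populations below are invented placeholders for illustration, not the actual CBL or census figures:

```python
# Hypothetical illustration of normalizing raw botnet listings by
# country population. All numbers here are invented placeholders.
raw_listings = {"China": 2_000_000, "United States": 200_000, "Belarus": 120_000}
population = {"China": 1_350_000_000, "United States": 315_000_000, "Belarus": 9_500_000}

# Listings per 1,000 inhabitants for each country.
per_capita = {
    country: 1000 * raw_listings[country] / population[country]
    for country in raw_listings
}

# Sorted worst-first: a big country can top the raw list while
# ranking far lower once normalized.
for country, rate in sorted(per_capita.items(), key=lambda kv: -kv[1]):
    print(f"{country}: {rate:.2f} listings per 1,000 people")
```

With these placeholder numbers the normalized ranking inverts the raw one, which is exactly the China-versus-Byelorussia effect the text describes.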

Pre/Post Longitudinal Studies: Given that it may be difficult to compare bot-related statistics collected by ISP A with bot-related statistics collected by ISP B, another option might be to track botnet stats longitudinally, within an individual ISP, over time.

For example, assume the FCC would like to know if an ISP has fewer botted customers after adopting the ABCs for ISPs than before (this is what some might call a "pre/post" study). If so, we'd expect to see a downward-sloping curve as the number of bots drops over time.

In practice, it may be difficult to do a study of that sort, since many of the most important/most interesting ISPs have already implemented important parts of the ABCs for ISPs code. Thus, we cannot get a "clean" pre-adoption baseline profile for many ISPs, because those ISPs have ALREADY begun doing what the ABCs for ISPs code recommends.
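One minimal way to operationalize the "downward-sloping curve" idea is to fit a least-squares line to an ISP's monthly botted-customer counts and examine the sign of the slope. The monthly counts below are invented for illustration:

```python
# Sketch of a simple pre/post trend check for one ISP: fit a
# least-squares line to monthly botted-customer counts and look at
# the sign of the slope. The counts below are hypothetical.
def trend_slope(counts):
    """Least-squares slope of counts vs. month index (0, 1, 2, ...)."""
    n = len(counts)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical monthly botted-customer counts after adopting the Code.
monthly = [5200, 5100, 4800, 4700, 4400, 4100]
slope = trend_slope(monthly)  # negative slope = downward-sloping curve
print(f"{slope:.1f} botted customers/month")
```

A negative slope is consistent with the Code helping, though of course a real pre/post study would also need a pre-adoption baseline and controls for seasonal effects.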

Drilling Down on a Per-Bot-Family Basis: In addition to measurements made about overall bot infectivity, we also need the ability to "drill down" and get more precise estimates on a bot-family-by-bot-family basis, ideally both at any given point in time, and historically. Per-bot-family measurements might include the number of systems infected with each particular major bot, but also related measurements such as:

(a) the amount of spam attributed to each particular spam botnet

(b) the volume of DDoS traffic attributed to each DDoS botnet

(c) the number of command and control hosts that a bot uses

(d) the geospatial distribution of hosts infected with each bot


Micro as Well as Macroscopic Measurements: Not all metrics are macroscopic measurements related to botnet infection rates. Some measurements of interest might be per-system micro values:

(a) What does it cost to rent a bot on the open market?

(b) How long does it take/what does it cost to de-bot a single botted host? What factors make a system take more or less time to de-bot? Can we build a standardized cost model? For example, what's it worth to have a clean backup of a botted system? Does that make it significantly easier to get a botted system cleaned up and hardened?

(c) When a system is found to be botted, does it tend to be botted with just one type of bot? If co-infections are routinely found, can we identify "clusters" of bot malware that are routinely found together, so that an anti-malware technician can then be told, "If you find bot A on a system, also be on the lookout for bot B, too"?

(d) If a user's botted once, does that make them more (or less) likely to get botted again? That is, can we expect that a once-botted user will become less likely to be rebotted as a result of that presumably unpleasant experience? Or are some types of users just inherently more prone to get themselves reinfected, perhaps because of a failure to apply available patches, or inherently risky online behavior?
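The co-infection "cluster" idea in item (c) reduces to counting how often pairs of bot families turn up on the same machine. A minimal sketch, with entirely invented bot names and scan results:

```python
# Sketch of finding bot "clusters": count how often two bot families
# are found on the same machine. Bot names and scan results are
# invented placeholders.
from collections import Counter
from itertools import combinations

scans = [
    {"BotA", "BotB"},
    {"BotA", "BotB", "BotC"},
    {"BotC"},
    {"BotA", "BotB"},
]

pair_counts = Counter()
for infections in scans:
    for pair in combinations(sorted(infections), 2):
        pair_counts[pair] += 1

# The most common pair is a candidate "if you find A, look for B" rule.
(bot_a, bot_b), count = pair_counts.most_common(1)[0]
print(f"{bot_a} and {bot_b} co-occur on {count} of {len(scans)} systems")
```

Real scan data would of course be far noisier, and a production analysis would want a statistical test (not just raw counts) before issuing technician guidance.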

Looking at This From A Different Direction: How Long Will A Typical Bot Live?

Hypothetically, assume that you're running a blocklist, and you list the IP addresses of botted systems when you see those systems send spam or check in with a C&C that you're monitoring.

If you don't observe any subsequent activity from a botted and blacklisted system, when could you "safely" remove it? After a day? After a week? After a month? After 90 days? Never?

Some botnet blocklists deal with this issue by simply rolling off the oldest entries after the list reaches some target maximum size (after all, if the system turns up being bad again, you can always freshly relist it)...
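That "roll off the oldest entries" policy is easy to sketch. This is a toy model (with example IPs from the documentation range), not any actual blocklist's implementation:

```python
# Sketch of the "roll off the oldest entries" blocklist policy: once
# the list hits a target maximum size, the oldest listing is dropped;
# a relisted IP is simply re-added fresh.
from collections import OrderedDict

class RollingBlocklist:
    def __init__(self, max_size):
        self.max_size = max_size
        self._entries = OrderedDict()  # IP -> time last listed

    def list_ip(self, ip, when):
        # Relisting moves the IP back to the "freshest" end of the list.
        self._entries.pop(ip, None)
        self._entries[ip] = when
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the oldest entry

    def __contains__(self, ip):
        return ip in self._entries

bl = RollingBlocklist(max_size=2)
bl.list_ip("192.0.2.1", when=1)
bl.list_ip("192.0.2.2", when=2)
bl.list_ip("192.0.2.3", when=3)   # pushes 192.0.2.1 off the list
print("192.0.2.1" in bl)  # False: rolled off
```

The design choice here is that list size, not listing age, bounds the data structure, which is exactly why a quiet-but-still-botted host can silently fall off such a list.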

Measuring Botnet Backend Infrastructure: While we've been talking about botted end user hosts, another potential target for measurement is botnet backend infrastructure, such as botnet command and control hosts.36 Potentially one could also track authoritative name servers associated with bot-related domains, sites known to be dropping bot malware, and a host of other botnet-related things (other than just botted hosts).

A philosophical aside: is there any risk that focusing on backend botnet infrastructure (including potentially doing C&C takedowns) will result in interference with ongoing legal investigations? If third parties don't target botnet backend infrastructure, can the Internet community be confident that law enforcement will in fact track and take down those botnet-critical resources? Are there ways that we can deconflict this work without compromising operational security?

36 See for example Zeus Tracker, https://zeustracker.abuse.ch/


V. Some Statistical Questions Associated With Botnet Measurements

How Precise Do Our Answers to These Questions Need To Be? "High precision" answers cost more than "rough" answers. (Think of this as the width of a confidence interval around a point estimate.)

If you want to estimate a value within +/- 10%, that requires less work than if you want to know that same value within +/- 5% or even +/- 1%. Exactly how precise do our measurements need to be, and why?

How Much Confidence Do We Need That Our Estimates Include the Real Value? For example, if we need 99% confidence that our estimate includes the real value for a parameter of interest, we can get that level of confidence; however, getting 99% confidence might require accepting broader bounds around an estimate (or drawing more observations) than we'd need if we could live with just a 90% level of confidence.

Notice the interaction between (a) the required precision, (b) the required confidence, and (c) the cost of obtaining those answers (typically the number of observations required).

Most people want high precision and high confidence and low cost, but you can't have all three at the same time.
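The precision/confidence/cost interaction can be made concrete with the standard sample-size formula for estimating a proportion, n = z² · p(1 − p) / E², using the worst case p = 0.5:

```python
# Illustration of how precision and confidence drive cost: the classic
# sample-size formula for estimating a proportion to within margin E
# at a given confidence level, n = z^2 * p*(1-p) / E^2.
from math import ceil
from statistics import NormalDist

def required_sample_size(margin, confidence, p=0.5):
    """Observations needed to estimate a proportion to +/- margin
    at the given confidence (worst case p = 0.5)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z-score
    return ceil(z * z * p * (1 - p) / (margin * margin))

# Tightening the margin from +/-10% to +/-1%, or raising confidence
# from 90% to 99%, sharply increases the observations required.
for margin in (0.10, 0.05, 0.01):
    for conf in (0.90, 0.99):
        print(f"+/-{margin:.0%} at {conf:.0%} confidence: "
              f"{required_sample_size(margin, conf)} observations")
```

For instance, +/-10% at 90% confidence needs only about 68 observations, while +/-1% at 99% confidence needs over sixteen thousand: the "can't have all three" trade-off in numbers.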

Budget: We really need to emphasize that if bespoke, hard numerical answers to questions about botnets are needed, it's going to cost money to obtain those values. How much are we willing to spend to get those answers?

If the answer is "zero," then I would suggest that in fact all our substantive questions about bots are just a matter of simple curiosity, and not something that's actually valuable ("value" implies a willingness to pay).

If we also don't have a budget for data collection, our ability to rationally set the required level of precision (and the required confidence in our estimates) is also going to be impaired.


VI. ISPs As A Potential Source of Botnet Data

The CSRIC WG7 working presumption has been that ISPs themselves might be a potential source of bot data about their botted customers. While this is an understandable assumption, it might be problematic in practice for multiple reasons:

(a) Collecting botnet metrics requires time and effort. Who will reimburse ISPs for the cost of this work, or for the capital costs associated with passively instrumenting the parts of the ISP's network that may not currently be set up to gather the required data?

(b) There are many ISPs in the United States, and even more ISPs in other countries. Hypothetically, assume that no ISPs voluntarily submit metrics on their botnet experiences to the FCC.37

In that case, having no other option (short of mandating reporting, which would likely be resisted by ISPs and others), let's assume that the FCC begins to look at publicly available third party data sources, and begins to use that data as a basis for evaluating ISP performance when it comes to combatting bots.

Let us further assume that the third party data the FCC obtains is inconsistent, or radically different from what ISPs believe to be accurate. Those data discrepancies might potentially motivate an ISP to voluntarily contribute data supporting its alternative (and more authoritative) perspective -- if those companies could be assured that, having shared botnet data, they'd be safe from threats of lawsuits, involuntary public disclosure of shared data, or eventual compulsory reporting.

Finally, if we believe that both parties are committed to collaborative, data-driven security work, it would be terrific if operational data sharing was bidirectional. That is, if ISPs are good about sharing botnet metric data with the FCC, how will the FCC reciprocate and share data back with the ISPs? Data sharing partnerships should not be just unidirectional, flowing only from industry to government!38

37 Has the FCC explicitly indicated that they'd be interested in receiving data of this sort, and told ISPs about an address/department to which such data might be sent? Are there clear terms and conditions around how such data would be used or potentially redisclosed?

38 Yes, there are statutory limits to what data can be shared by the government with ISPs, and there was legislation proposed to deal with this problem, but that legislation hasn't passed to date.


VII. Sinkholing, DNS-Based Methods, Direct Data Collection, and Simulations?

Sinkholing Specific Botnets: Sometimes a researcher or the authorities are able to gain control of part of a botnet's infrastructure. When that happens, the researcher or government person may be able to direct botnet traffic to a sinkhole, and use the data visible as a result of that sinkhole to measure a particular botnet.

Some might hope that sinkholing would provide a general purpose botnet estimation technique. Unfortunately, because this is a bot-by-bot approach, and requires the researcher or authorities to "inject" themselves into the botnet's infrastructure, it will not work to get broad ISP-wide botnet measurements for all types of botnets. Many modern botnets now also take special care to prevent or deter efforts at sinkholing.

DNS-Based Approaches: Another approach that's sometimes mentioned is measuring botnets by their DNS traffic. That is, if you know that all botted hosts "check in" to a specific fully qualified domain name, then if an ISP sees a customer attempt to resolve that "magic" known-bad domain name, there's a substantial likelihood that that customer is botted.39

Big botnets might tend to make more extensive use of DNS than smaller and less sophisticated botnets, but caching and other subtleties associated with DNS can complicate DNS-based measurements, and of course, not all bots even use DNS.40
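The core matching step of the DNS-based approach is simple: flag subscribers who try to resolve a known-bad domain. A minimal sketch, with invented domain names and a made-up resolver log (the report names no actual C&C domains):

```python
# Sketch of the DNS-based approach: flag subscribers who attempt to
# resolve a known-bad C&C domain. All domains, subscriber IDs, and
# log entries here are invented placeholders.
KNOWN_BAD = {"cc.bad-botnet.example", "update.bad-botnet.example"}

# (subscriber_id, queried_domain) pairs, as a resolver might log them.
query_log = [
    ("subscriber-1", "www.example.com"),
    ("subscriber-2", "cc.bad-botnet.example"),
    ("subscriber-2", "www.example.org"),
    ("subscriber-3", "update.bad-botnet.example"),
]

# A hit suggests (but does not prove -- consider security researchers!)
# that the subscriber is botted.
suspected = {sub for sub, domain in query_log if domain.lower() in KNOWN_BAD}
print(sorted(suspected))
```

In practice caching means the ISP's recursive resolver may never see repeat lookups from a given host, which is one of the "subtleties" that complicates using such counts as a population measurement.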

Some might also be tempted to try an RPZ-like approach (as implemented in BIND) to "take back" DNS and prevent bots from using DNS as part of their infrastructure. While this approach certainly has technical promise, any effort to instantiate policy via DNS is a potentially tricky one,41 and should only be considered if customers can opt out of a filtered DNS view should they want or need to do so.

Directly Checking Systems to Find Botted Hosts? Assume that you want a direct information-gathering approach that doesn't rely on ISPs providing data, or on third party data sources. That is, you want to go out and collect your own data, much as survey research groups survey entities about political viewpoints or consumer spending. How many individuals might you need to survey to get sufficient data about botted users?

The required number of users will depend on the breakdown between botted and non-botted users, and the number of ISPs whose customers you'd like to be able to individually track. If you don't have "hints" about who may be a botted user a priori, and you just need to discover them at random, you may be facing a daunting task, at least if bots are indeed a "rare disease."

Let's arbitrarily assume you want 350 botted users to study.

If 1.5% of all users are botted, on average you'd see 15 botted users per thousand. Given that you want 350 botted users, that would imply you'd likely need to check (350/15)*1,000 = ~23,333 users in order to find the 350 botted users you needed. But now what if just 0.0711% of all users are botted (recall that this was the CBL-reported rate for the US on December 9th, 2012)? On average you'd see just 0.711 botted users per thousand. To get 350 botted users to study, you'd likely need to check (350/0.711)*1,000 = ~492,264 users. That's a LOT of data to collect.

Now assume that you want 350 users PER ISP, and assume you're interested in a dozen ISPs... 12*492,264 users = 5,907,168 users. That's REALLY going to be a lot of work!
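The survey-sizing arithmetic above can be packaged as a small calculation: how many users must be checked, at a given botted rate, to find a target number of botted users in expectation?

```python
# The report's survey-sizing arithmetic: users to check, in
# expectation, to find `target_botted` botted users at `botted_rate`.
def users_to_check(target_botted, botted_rate):
    return target_botted / botted_rate

n_high = users_to_check(350, 0.015)    # 1.5% botted rate
n_low = users_to_check(350, 0.000711)  # 0.0711% (CBL US rate, Dec. 9, 2012)

print(round(n_high))        # ~23,333 users
print(round(n_low))         # ~492,264 users
print(12 * round(n_low))    # 5,907,168 users across a dozen ISPs
```

This is only the expected-value calculation; a careful study design would pad these numbers further to account for sampling variability in how many botted users actually turn up.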

39 One noteworthy exception: customers who are security researchers!

40 Hypothetically, a botnet might choose to use raw IP addresses, or to use a peer-to-peer alternative to traditional DNS, such as distributed hash tables.

41 Remember the Internet's negative reaction to SOPA/PIPA.


Assume that you were charged with going out and checking 492,264 computers to see if those systems had been botted. To keep this simple, let's assume that we'll call a system botted if a bot is found when we run a commercial antivirus product on that system.

If we assume that it would take an hour to run a commercial antivirus program on one machine (a very low estimate given the increasing size of consumer hard drives today), and techs work 40 hours a week, it would take 492,264/40 = ~12,307 "technician weeks" to scan 492,264 systems.

If a tech works 50 weeks a year, that would be 246.14 "technician years" worth of work.
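The technician-time arithmetic above, written out as a quick calculation (assuming, as the text does, one hour of scanning per machine):

```python
# The report's technician-time arithmetic, assuming one hour of
# antivirus scanning per machine.
systems = 492_264
hours_per_week = 40
weeks_per_year = 50

tech_weeks = systems / hours_per_week            # ~12,307 technician-weeks
tech_years = round(tech_weeks) / weeks_per_year  # 246.14 technician-years
print(round(tech_weeks), tech_years)
```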

Even setting aside that labor cost, many users would probably simply refuse to allow their computers to be checked, even if they thought that their systems might actually be infected.

Volunteers? Some users who think that their systems might be infected might welcome the opportunity to have their systems scanned. Unfortunately, a "convenience sample" of that sort would not result in data that would allow us to generalize or extrapolate from the sample to the population as a whole.

Simulating Bot Infections in a Lab/Cyber Range? Another option, if we wanted to avoid the problems inherent in surveying/checking users (as just discussed), might be to try simulating bot infections in a lab or on a so-called "cyber range." While conceptually intriguing, this might not be easy. For example, the fidelity of the results from such a simulation will depend directly on researchers' ability to:

(a) Replicate the full range of systems seen on the Internet (operating systems used, antivirus systems used, applications used, patching practices, etc. -- do we have the data we'd need to do that?)

(b) Replicate the range of botnet malware seen on the Internet (constantly changing)

(c) Accurately model the ISP response to the malware threat

Given these difficulties, we believe this is a fundamentally impractical approach.


VIII. Recommendations

The CSRIC Working Group 7 recognizes that summary metrics with which to determine the effectiveness of the U.S. Anti-Bot Code of Conduct are not yet available.

The working group recommends that the following course be undertaken in order to enable such metrics for the future:

• We recommend that specific Case Studies be supported to gain metrics around particular bot efforts. Summary metrics which may address and validate the overall effectiveness of the Code can only be developed based upon component metrics derived from more specific efforts in combating specific bots. Further collaborative efforts will be required to arrive at a foundation of metrics. These efforts will need to involve not only the ISP community but the larger ecosystem as well.

• We further recommend that specific Pilot Programs be supported to gain metrics around particular bot efforts. Such programs are required to test which metrics may be useful and which are not. For comparative purposes, the metrics definitions will have to be reasonably standardized between ISPs. We anticipate that some metrics methods used by ISPs will lend themselves to comparative analysis and some will not. Participation in pilot programs will indicate which are viable.

• We recommend that the ongoing effort involving the Georgia Tech Information Security Center be considered as an initial case study. This effort centers on the DNSChanger bot and related customer notification methods used by ISPs. This approach, although preliminary, may well show the cooperative, collaborative steps required for the future development of overarching metrics involving multiple ISP approaches. These steps are a microcosm of the recommendations in the Code. Additionally, the test results may provide insight into the relative efficacy of different notification methods, and into resulting best practice methods for notification of customers.

• It is recommended that the FCC promote voluntary methods for the standardization of metrics, for the purpose of comparative analysis of methods and best practice development. Such voluntary methods can be applied ultimately toward education, detection, notification, and remediation of bots, as well as to the collaborative efforts required by the broadband ecosystem at large.

The participants in CSRIC III Working Group 7 would be happy to address any remaining questions that CSRIC or the FCC might have about this proposed program of work.

While botnets are often thought of purely as a nuisance, e.g., as a source of spam and similar low-grade unwanted Internet traffic, bots have also been used to attack government agencies and Internet-connected critical infrastructure. Viewed in that light, bots might properly be considered a threat to national security.

If bots are indeed a threat to national security, "other government agencies" may be able to directly apply"national technical means" to collect intelligence about botnets, including per-ISP estimates.

Such information, once collected, might then be able to be shared with appropriately cleared government officials with a legitimate need-to-know.

If domestic collection mechanisms aren't an option or appropriate, it may also be possible to make estimates about domestic bot populations based on data collected by international partner agencies.