Michal Zalewski on the Wire

The eccentric security researcher Michal Zalewski recently published his first
book, Silence on the Wire: A Field Guide to Passive Reconnaissance and Indirect
Attacks. Because the book is anything but a conventional security manual,
Federico Biancuzzi interviewed Michal to learn more about his curious approach
to information security. Among other things, they discussed the need for
randomness, how a hacker's mind works, unconventional uses for search engines
such as Google, and applying AI to security tools.

Could you introduce yourself?

Well, I am just a computer geek. I am a relatively young, self-taught
enthusiast who is fairly proficient in the field of computer security, and
simply enjoys playing with this stuff. Since the mid-'90s, I managed to contribute
some probably worthwhile research to this area, as witnessed by a number of
BUGTRAQ readers. I found and helped to solve a bunch of interesting security
problems, and wrote a couple of well-received papers; I also developed several
small but cool open source infosec utilities such as p0f, memfetch, Fenris, and
fakebust.

Well, enough with blatant self-promotion. Curious readers can find more
information about my work (and what else I do in my free time) at lcamtuf.coredump.cx.

Silence on the Wire is a fairly unusual guide to the world of computer
security. Unusual, because instead of taking the reader through the frequently
repeated fundamentals, I tell the story of this field as I witnessed it when
first learning this stuff.

I show that security problems are inherent to the way we design systems,
bound to just about any aspect of modern computing; and that only by
understanding this can you follow and mitigate threats effectively. Along the
way, I focus on some of the more unusual, fascinating, and often arcane topics
in a way that hopefully is both easy to follow and entertaining, even if you
have no professional interest in security.

Who should read it? Well--if you just want a solid grasp of the basics, this
book is not for you, at least not for that purpose. If you
are a seasoned computer user or a developer, and want to learn to see the
technology in a different way, I believe you should give SotW a try. If you are
an infosec professional who wants to learn more about the technology and
rediscover the fascinating world of computer mechanics, I hope you'll enjoy
SotW, too.

In the first chapter, you write about the need for randomness, and how
it's difficult to get truly random data from a machine built to behave
deterministically. Could this necessity disappear as the resources available to
ordinary people grow? For example, a blind spoofing attack could become more
feasible with broadband access to the internet, and in some countries you can
easily and cheaply get a 10Mbps or 100Mbps connection.

Computers need to be able to generate truly unpredictable numbers for
various purposes--implementing cryptography is a prominent example. This is
not going to change anytime soon.

When users have access to more and more bandwidth and computing power, they
can more easily carry out brute-force attacks against protocols and algorithms.
But this only means the need for strong cryptography, cryptographically secure
ISN generation, and so forth is on the rise. And to get there, we need
computers to be able to deliver high-quality, unpredictable entropy--more
than ever.
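The difference can be illustrated with a short sketch (in Python, purely for
illustration): a deterministic PRNG replays its output for anyone who knows the
seed, while the OS entropy pool mixes hard-to-predict physical events and
cannot be replayed that way.

```python
import random
import secrets

# A deterministic PRNG: anyone who learns (or guesses) the seed
# can reproduce the entire output stream.
rng = random.Random(1234)
predictable = [rng.randrange(2**32) for _ in range(3)]

# Reseeding with the same value replays the "random" numbers exactly.
rng_again = random.Random(1234)
replayed = [rng_again.randrange(2**32) for _ in range(3)]
assert predictable == replayed

# The secrets module draws on the OS entropy pool, which mixes
# hard-to-predict events (interrupt timings, device noise, and so
# on) -- the kind of source an ISN or key generator needs.
isn = secrets.randbits(32)
```

A TCP stack that derived its ISNs the first way would be exposed to exactly the
blind spoofing attacks mentioned above.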

Do you think that security concerns will require the adoption of a
new version of TCP in the near future?

In my opinion, TCP has some shortcomings, and these are bound to become
more and more of an issue in the near future, but I do not think we're going
to reach a point where we must switch to something else overnight; there is no
mysterious failure threshold. Rather, performance and security features within
the protocol, and kludges around it, are becoming less efficient as the
surrounding technology advances.

In fact, even if we had to replace TCP on short notice, it
would be next to impossible to carry out such an operation. Look at
how we're moving toward the IPv6 protocol suite--oh boy!

HTTP does not use crypto, while HTTPS does. Do you think that in the
future we'll use crypto for every single connection?

Well, because of the shortcomings of TCP (and the increasing ease of blindly
tampering with data as bandwidth grows and new attacks are published),
almost all communications, even those of nominally little relevance, should by
now be either encrypted or cryptographically tamper-proofed.
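As a rough sketch of the tamper-proofing option (not Zalewski's own design,
just a minimal illustration assuming a hypothetical pre-shared key), a message
can carry an HMAC so that blind modification in transit is detectable even when
the payload itself stays in plaintext:

```python
import hashlib
import hmac

KEY = b"shared-secret"  # hypothetical pre-shared key

def protect(message: bytes) -> bytes:
    # Append a 32-byte SHA-256 HMAC; flipping any bit of the
    # payload or tag in transit invalidates the check below.
    tag = hmac.new(KEY, message, hashlib.sha256).digest()
    return message + tag

def verify(blob: bytes) -> bytes:
    message, tag = blob[:-32], blob[-32:]
    expected = hmac.new(KEY, message, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking tag bytes via timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message failed integrity check")
    return message
```

Note that a MAC alone does not stop replay: an attacker can retransmit a
validly tagged message unchanged, which is why careful protocol engineering
still matters.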

Unfortunately, this is a complex and costly process, and implementing
advanced cryptography may introduce new flaws elsewhere. Furthermore, unless
carefully engineered, it may remain susceptible to disruptions on underlying
layers, replay attacks, etc. Last but not least, end users simply don't
understand encryption and PKI, and hence can easily be tricked into ignoring or
bypassing our sophisticated protections.

In other words, "perfect world" solutions may not really be that desirable
or easy to implement, and we might have to stick with simpler short-term
options and strategies for now.

Your book is full of interesting and original ideas for studying a
network or a single host; however, how can we focus on those advanced topics if
most of the break-ins on the internet come from worms, spyware, and other dumb
things--or dumb users?

There are plenty of books on these topics, some of them very, very good;
there is no point in writing another summary of threats just because worms or
spyware are a prominent problem.

What I wanted to achieve is to show how to think creatively and see problems
that go beyond textbook examples; I try to show that these flaws don't come out
of nowhere, and are inherent to every single tiny design decision ever made. If
there is a software engineer, a system administrator, or a security professional
who, after reading SotW, puts a bit more thought and insight into their work,
that's good news--we may be preventing the new classes of exploits and attacks
of tomorrow.

There are a lot of books and courses that teach "how to think like a
hacker". Your book should open a reader's mind by showing original points of
view for different situations and problems. Do you think that it is possible
to learn this way of thinking, or is it just part of some people's
personality?

I don't think that ("good") hackers have any special, hardwired mental
abilities or specific personality traits, and I do believe you can easily learn
to think like a hacker, even when you come from a different background.

The difference between hackers and people who just deal with computers for a
living, 9 to 5, is quite simple--hackers share a genuine passion for this
stuff, they learn and analyze computers just for fun, and hence can more
readily see beyond textbook problems and scenarios, and invent or explore on
their own.

And so, if you have ambivalent feelings about computer science and just
want to get your paycheck, no amount of books or courses is going to turn you
into a skillful, passionate enthusiast. On the other hand, if you have the
genuine desire to explore computing as a true hobby, you're likely to succeed
and become an old-school hacker with (or without!) proper guidance.

I was thinking that so-called hackers often have other hobbies
beyond computers, and that being open-minded and cultivating mental elasticity
could explain why they get better results than people who do things just
because it's their job. For example, you like to practice photography, and this
interest in expressing yourself with images showed when you published your
famous research on ISNs, where you used a graphical format to spot algorithm
weaknesses. Thinking of the people you have met and the hackers you know, does
this theory sound right?

I think it's an oversimplification to attribute any special mental skills or
capabilities to hackers, either as a result of or as a reason for being one. In
fact, I know several hard-core hacker geeks who have remarkably few other
interests or any form of mental or social elasticity. (In fact, they're really
hard to get along with, and have very serious problems adapting to everyday
situations.)

Also, I don't think that hackers necessarily "have better results" than
people who do not fall into this category. It's a comforting thought for us
geeks, but I'm afraid it is not very true. Some hackers are either far too
obsessed with a particular concept or set of problems, or too disorganized, to
outperform well-trained, detached professionals.

Hackers are generally more determined to do the things they're interested
in, for their own sake, that's all.

Some time ago you played a joke, claiming to have founded a company
called eProvisia LLC that provided a 100 percent guaranteed antispam service.
The very interesting twist was that its antispam technology used human beings
who manually analyzed email.

Yes, of course the company is not real; it was just a silly joke that got
out of hand (and was carried as a true story by ZDNet, Yahoo, Slashdot,
and others).

This idea is original, and it made me think of the saying that the best
firewall would be a human being analyzing packets one by one, manually.

I don't think so, no. Regular users lack the understanding of machines'
inner workings and the protocol subtleties needed to determine what to do in even the
most trivial scenarios. They want to watch a movie they got from a friend, but
they have no clue if that translates to clicking Yes or No in some firewall
pop-up prompt.

Of course we could hire professional security experts to somehow perform
near real-time review of someone else's traffic--but then, they have no way
of knowing what the user really wants or expects his computer to do. Clicking
on a link to some obscure .exe file may be a phishing attack or a
legitimate download of a software update for some Mongolian Tetris-like game we
have never heard of. Without feedback from the user, it's hard to tell what
should
be done.

Why do you think we cannot develop software that behaves
autonomously like a human being in such restricted environments?

We did! Most antivirus programs and personal firewalls attempt to
implement a certain level of "smart" adaptive protection, rather than forcing
the user to define everything from scratch. I doubt this could be improved much
without reading the user's mind and scanning it for the true intentions and
motivation behind each mouse click.

Four years ago, you wrote an interesting paper entitled Against the System: Rise of
the Robots for Phrack #57. The main idea was to use search engines'
spiders as an attack vector: you just write specially formatted URLs on a web
page and wait for the spider to follow those links. After all this time, do you
see a better or worse situation?

Our ability to understand how search engines address potential security
threats and other abuse scenarios is very limited; most of them, Google
included, are very secretive and not willing to discuss their business. As
such, I can only guess--but my impression is that a good number of major
crawlers have implemented basic checks to trap the most obvious exploitation
attempts, simply by rejecting URLs that appear either clearly malicious or just
very strange.

That said, I don't think the problem can ever be fully addressed; what
exploits one web script is a valid and expected parameter to another.
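The coarse rejection he guesses at might look something like the following
sketch (the rules here are hypothetical; real crawlers' checks are not public):

```python
import re
from urllib.parse import unquote

# Hypothetical blacklist of URL fragments that rarely appear in
# legitimate links; a real crawler's actual rules are unknown.
SUSPICIOUS = re.compile(r"""
      \.\./       # path traversal
    | [;|`]       # shell metacharacters
    | \x00        # embedded NUL byte
    | <script     # reflected-script bait
""", re.IGNORECASE | re.VERBOSE)

def looks_malicious(url: str) -> bool:
    # Check both the raw URL and its percent-decoded form.
    return bool(SUSPICIOUS.search(url) or SUSPICIOUS.search(unquote(url)))
```

Of course, any such blacklist is inherently incomplete, for precisely the
reason given: the same string can be an exploit for one script and a valid
parameter to another.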

You developed a chat bot that uses Google to create answers.
Obviously it's far from being perfect, but do you think that AI software could
use Google as a repository of human knowledge to make decisions such as what is
spam and what is a valid email?

Certainly Google and other search engines are in possession of databases
that may come in handy for various applications--AI, automated learning, and
content classification included. Unfortunately for us mortals, they do not
share their databases with others; they only provide you with a very limited
search front end or similarly limited API, which is obviously not enough to
write much more than a toy chatbot (and even that is a possible violation of
their Terms of Service, as I learned from a friendly cease-and-desist-style
letter a while ago).

I don't know if you play modern videogames, but I saw some projects
about AI on your website. I think we could learn something from the work game
developers have done with AI. We both act in a restricted context (rules of the
virtual world/rules of protocols), but in my experience their "intelligence"
is much better than that of most security tools.

Funny you say that--many gamers complain that mainstream game AIs are
awful and too dumb, and ruin the realism.

As to your question--AI is a meaningless term; it can mean just about any
algorithm that appears to have certain poorly defined "human" characteristics,
or an algorithm that mimics a certain aspect of wetware processing, or simply a
program that adapts in some way. As such, yes, some of the techniques and
tricks collectively labeled "AI" can be of some use--if only we had a good
idea of how to put them to work. ;-)

Some of these tools seem to work as regular expression catchers. Think of
Snort, for example; it seems to be a pattern-matching tool. Should we start
playing with AI in security too?

Snort could be greatly improved simply by working on signature quality,
making it more stateful, incorporating more heuristics, and so on. (Some may
argue that heuristic algorithms are AI, but that's what most commercial AV and
IDS products already do, and so it is beyond the scope of the question.)
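For contrast, the pattern-matching core that such tools build on can be reduced
to a few lines (a toy sketch in Python, not Snort's actual engine or rule
language):

```python
import re

# Toy signature set: each rule pairs a label with a byte pattern.
SIGNATURES = [
    ("cmd.exe probe", re.compile(rb"cmd\.exe")),
    ("NOP sled",      re.compile(rb"\x90{16,}")),
]

def match_payload(payload: bytes) -> list:
    """Return the labels of every signature found in a packet payload."""
    return [name for name, rx in SIGNATURES if rx.search(payload)]
```

Statefulness and heuristics are what lift a real IDS above this kind of
single-packet grep.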

It's easy to claim that "adding AI" would help (or to claim the opposite),
but as I mentioned, this means nothing--it's like saying "computers would be
faster if we invented some cool new computing method." Sure, but so what?
Unless you're talking about a very specific use of a specific, known algorithm,
you're not going to get anywhere.

Federico Biancuzzi
is a freelance interviewer. His interviews have appeared in publications such as ONLamp.com, LinuxDevCenter.com, SecurityFocus.com, NewsForge.com, Linux.com, TheRegister.co.uk, ArsTechnica.com, the Polish print magazine BSD Magazine, and the Italian print magazine Linux&C.