Posted by samzenpus on Wednesday January 25, 2012 @05:22PM
from the read-all-about-it dept.

brothke writes "In the classic poem Inferno, Dante passes through the gates of Hell, which bear the inscription 'Abandon all hope, ye who enter here' above the entrance. After reading The Tangled Web: A Guide to Securing Modern Web Applications, one gets the feeling that writing secure web code is akin to Dante's experience." Read below for Ben's review.

The Tangled Web: A Guide to Securing Modern Web Applications

author: Michal Zalewski
pages: 320
publisher: No Starch Press
rating: 10/10
reviewer: Ben Rothke
ISBN: 1593273886
summary: Incredibly good and highly technical book on browser security coding

In this incredibly good and highly technical book, author Michal Zalewski writes that modern web applications are built on a tangled mesh of technologies that have been developed over time and then haphazardly pieced together. Every piece of the web application stack, from HTTP requests to browser-side scripts, comes with important yet subtle security consequences. In the book, Zalewski dissects those consequences to show what their dangers are, and how developers can take them to heart and write secure code for browsers.

The Tangled Web: A Guide to Securing Modern Web Applications is written in the same style as Zalewski's last book - Silence on the Wire: A Field Guide to Passive Reconnaissance and Indirect Attacks, which is another highly technical and dense book on its topic. This book tackles the issues surrounding insecure web browsers. Since the browser is the portal of choice for so many users, its inherent security flaws leave those users at significant risk. The book details what developers can do to mitigate those risks.

This book starts out with the observation that while the field of information security seems to be a mature and well-defined discipline, there is not even a rudimentary usable framework for understanding and assessing the security of modern software.

In chapter 1, the book provides a brief overview of the development of the web and how so many security issues have crept in. Zalewski writes that perhaps the most striking and nontechnical property of web browsers is that the people who use them are overwhelmingly unskilled. Most users simply do not know enough to use the web in a safe manner, which leads to the predicament we are in now.

Zalewski then spends the remainder of the book detailing specific problems, how they are exploited, and how they can be fixed.

In chapter 2, the book shows that something as elementary as resolving relative URLs isn't a trivial exercise. It details how misunderstandings between application-level URL filters and the browser over the handling of these relative references can lead to security problems.
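As a rough illustration of the kind of mismatch described above (this is not an example taken from the book), the sketch below compares a deliberately naive filter with one that resolves the reference before judging it. The base URL, the sample references, and both function names are hypothetical, and Python's urljoin only approximates what a real browser does with a reference.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical page that contains the links being checked.
BASE = "http://example.com/app/page"


def naive_looks_relative(ref: str) -> bool:
    """Too-simple filter: anything without an explicit http(s) scheme is 'local'."""
    return not ref.lower().startswith(("http://", "https://"))


def stays_on_host(base: str, ref: str) -> bool:
    """Resolve the reference against the base first, then compare host names."""
    resolved = urljoin(base, ref)
    return urlsplit(resolved).hostname == urlsplit(base).hostname


for ref in ("../admin", "//evil.example/x"):
    # "../admin" resolves to http://example.com/admin
    # "//evil.example/x" resolves to http://evil.example/x
    print(ref, urljoin(BASE, ref), naive_looks_relative(ref), stays_on_host(BASE, ref))
```

The scheme-relative reference passes the naive check even though it resolves to a different host, which is the sort of disagreement between a filter and the component that actually resolves the URL that the chapter warns about.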

For those that want a feel for the book, chapter 3 on the topic of HTTP is available here.

Chapter 4 deals with HTML and the book notes that HTML is the subject of a fascinating conceptual struggle with a clash between the ideology and the reality of the on-line world. Tim Berners-Lee had the vision of a semantic web, namely a common framework that allows data to be shared and reused across applications, companies and the entire web. The notion of a semantic web, though, has not really caught on.

Chapter 4 continues with a detailed overview of how to understand HTML parser behavior. The author writes that HTML parsers will second-guess the intent of the page developer, which can lead to security problems.

In chapter 12, the book deals with third-party cookies and notes that since their inception, HTTP cookies have been misunderstood as the tool that enables online advertisers to violate users' privacy. Zalewski observes that the public's fixation on cookies is deeply misguided. He writes that there is no doubt that some sites use cookies for malicious purposes, but that nothing makes cookies uniquely suited for this task, as there are many other equivalent ways to store unique identifiers on visitors' computers, such as cache-based tags.

Chapter 14 details the issue of rogue scripts and how to manage them. In the chapter, the author goes slightly off-topic and asks whether the current model of web scripting is fundamentally incompatible with the way human beings work. That leads to the question of whether it is possible for a script to consistently outsmart victims simply due to the inherent limits of human cognition.

Part 3 of the book takes up the last 35 pages and is a glimpse of things to come. Zalewski optimistically writes that many of the battles being fought in today's browser war are around security, which is a good thing for everyone.

Chapter 16 deals with new and upcoming security features of browsers and details many compelling security features such as security model extension frameworks and security model restriction frameworks.

One of the more powerful frameworks the chapter deals with is the Content Security Policy (CSP) from Mozilla. CSP is meant to fix a large class of web application vulnerabilities, including cross site scripting, cross site request forgery and more. The book notes that as powerful as CSP is, one of its main problems is not a security one, in that it requires a webmaster to move all inline scripts on a web page to a separately requested document. Given that many web pages have hundreds of short scripts, this can be an overwhelmingly onerous task.
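To make the inline-script requirement concrete, here is a minimal sketch of assembling a CSP header value on the server side. It assumes the header name and directive syntax of the later standardized Content-Security-Policy, which may differ from the early Mozilla version the book covers; the build_csp helper and the static.example.com host are hypothetical.

```python
from typing import Dict, List


def build_csp(directives: Dict[str, List[str]]) -> str:
    """Join directives into a single Content-Security-Policy header value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


# Hypothetical policy: scripts may load only from the site itself and one
# static host. Because 'unsafe-inline' is not listed, inline <script> blocks
# are refused, which is why they have to move into separate files.
policy = build_csp({
    "default-src": ["'self'"],
    "script-src": ["'self'", "https://static.example.com"],
})

headers = {"Content-Security-Policy": policy}
print(headers)
```

With a policy like this, every short inline script on a page needs to be moved to an external file served from an allowed source, which is the migration cost the review describes.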

The chapter concludes with other developments such as in-browser HTML sanitizers, XSS filtering and more.

Each chapter also concludes with a security engineering cheat sheet that details the core themes of the chapter.

For anyone involved in programming web pages, The Tangled Web: A Guide to Securing Modern Web Applications should be considered required reading to ensure they write secure web code. The book takes a deep look at the core problems with various web protocols, and offers effective methods for mitigating those vulnerabilities.

Michal Zalewski brings his extremely deep technical understanding to the book and combines it with a most readable style. The book is an invaluable resource and provides a significant amount of information needed to write secure code for browsers. There is a huge amount of really good advice in this book, and for those that are building web applications, this is a book they should read.

However, I do understand your point. To really appreciate the Paradiso, you have to know a lot of Medieval cosmology and theology (actually, that is true of the entire Divine Comedy, but especially the last part). The entirety of the work has four interpretative levels, which are the literal, figurative or allegorical (or metaphorical), moral, and anagogical (yes spellcheck that is a real word). This is true of many of the works of literature which are called "great", but to really understand these levels you usually have to have read an absolutely massive body of other works. Most people really only see the literal and figurative levels. Oh and you should read it in the original Italian too. Really, literature is a much more in-depth field than most people realize.

Just as a quick example: Dante meets 3 creatures at the beginning of the Inferno, a leopard, a lion, and a she-wolf. Those are actually representative of the 3 main levels of hell at one level (the appetitive sins, like lust, the spirited sins, like anger, and the intellectual ones, like fraud and treachery), and of those tendencies in Dante's (the main character's) soul at another level. He can't get around the she-wolf, which is figurative of his problems with intellectual sins (pride, most likely), and Virgil (considered a prime example of intellectual guidance) is required to show him the path around her.

For reference, despite having read a lot of great works, I don't understand most of the symbolism in the Divine Comedy, just enough to see the depth there.

It's been a few years since I read it so my memory is a bit hazy. Still, I do remember copious footnotes in my translation, pointing out lots of symbolism and historical background. Eventually I found that I just didn't care about those features. After the morbidly interesting punishments in Hell and Purgatory were over, I suppose there wasn't much left to keep my interest. The structure, while obviously laden with symbolism, is also incredibly repetitive. I'm sure it was great for its time, but the same ca

Granted, the Divine Comedy was rather repetitive, but the imagery conjured up is what I liked most. It might have been that I was lucky enough to have had a teacher who knew all of the necessary background info to get the most out of it (as well as the rest of the classics), but still it is a book of historical significance. Comparatively, Shakespeare's works are less deep and don't develop as much. Granted, they were targeted towards an entirely different medium, time, and class of people so I am not trying to knoc

I was lucky enough to have read it in high school and to have a teacher who could explain and provide the necessary background information to fully understand it. Granted that was years ago but one of the things that I remember is one day he was talking about one pope and how he was considered at the time to be awful. If you had kept up with the outside of class reading you understood when the same pope was mentioned in the Divine Comedy as being in the replica baptistry of Pisa in hell upside down as Dante

Yep, I remember only the 1st book from high school; however, I remember the explanations being mostly literal as well. At times it was hard enough interpreting the text. I think to really get into it you'd probably have to read it at least 3-4 times in the entirety of the trilogy. Alternatively you can get/have a life:)

But, it's often not the code that secures the application... rather the underlying technologies... many of the modern applications they're referring to sit behind a username / password over https that requires brute force (99.9% of the time unfeasible). The ones that aren't have much higher security research budgets than this book did:)

Cross site scripting (XSS) & SQL injection are real threats, but are they really the weakest link usually?

Coming from a guy whose DBA in college left a test server w/o a root pa

What frustrates me about web security or security in general is alluded to in the review: that there is really not a good idea of what security is.

More specifically, it is the idea of security as a binary property. For me, it seems a more realistic approach is to ask how much information and resource access I have to gain to perform an action, what the paths to do so are, how likely those paths are to occur, and what the cost of the breach is. The web makes this analysis harder without question, but still possible.

For example, I have a web site that contains account ids in a post to change something for that account. If the account id is an email, then forging the request is trivial. If the id is an opaque token, but sent in the clear (HTTP), it is less trivial, but still relatively easy. If the id is sent via https POST, sniffing is harder, but replay attacks may occur.

And so on. The point is to decide when the advantage gained is overwhelmed by the cost and risk of the attack.
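One way to picture the replay case mentioned above is a server that signs each request with a one-time nonce and refuses a nonce it has already seen. This is only a sketch under assumptions of my own: the key handling, the in-memory nonce set, and the names sign_request and accept_request are all hypothetical, and it assumes the action string never contains the "|" separator.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load it from
# protected configuration rather than generate it at import time.
SECRET_KEY = secrets.token_bytes(32)

# Nonces already accepted. An in-memory set is only good enough for a sketch.
_seen_nonces = set()


def sign_request(account_token: str, action: str, nonce: str) -> str:
    """Return an HMAC-SHA256 tag binding the token, the action and the nonce."""
    message = f"{account_token}|{action}|{nonce}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def accept_request(account_token: str, action: str, nonce: str, signature: str) -> bool:
    """Accept a request once: reject bad signatures and reused nonces."""
    expected = sign_request(account_token, action, nonce)
    if not hmac.compare_digest(expected, signature):
        return False  # forged or altered request
    if nonce in _seen_nonces:
        return False  # replay of a previously accepted request
    _seen_nonces.add(nonce)
    return True


token = secrets.token_urlsafe(16)  # opaque account token instead of an email
nonce = secrets.token_urlsafe(16)
signature = sign_request(token, "change-email", nonce)
first_try = accept_request(token, "change-email", nonce, signature)
second_try = accept_request(token, "change-email", nonce, signature)
print(first_try, second_try)
```

Each step raises the cost of the attack rather than making the request "secure" in a binary sense, which is the trade-off being argued for here.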

I remember a time when somebody was worried about GUIDs in a URL because you could guess a valid one. Of course, it is much easier (understatement) to capture a valid one than guess one. This book will have value if it helps you avoid red herrings like that and focus on the real threats.

Of course, I am not a security specialist, so this may be naive at best.
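On the GUID point above, a back-of-the-envelope calculation shows why guessing is the less likely path. The sketch assumes the GUIDs are random version 4 UUIDs with 122 random bits, which the comment does not state; time-based GUIDs are far more predictable. The function name and the sample numbers are hypothetical.

```python
# Number of random bits in a version 4 UUID (assumption: the GUIDs are v4).
RANDOM_BITS = 122


def guess_probability(valid_ids: int, attempts: int) -> float:
    """Upper bound on the chance that any of `attempts` random guesses is valid."""
    return min(1.0, valid_ids * attempts / 2 ** RANDOM_BITS)


# Hypothetical scale: one million valid GUIDs and one billion guesses.
p = guess_probability(10 ** 6, 10 ** 9)
print(f"{p:.3e}")
```

Even at that scale the bound stays vanishingly small, so under the stated assumption capturing a valid GUID from traffic or logs is the path that deserves the attention.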