Friday, September 03, 2010

Our Infrastructure -- Assessing Over 2,000 Websites

Recently I asked a colleague how desktop black box web application vulnerability scanners approach, from a scalability perspective, scanning large numbers of websites (i.e. 100 to 500+) simultaneously. I was curious how they address the physical infrastructure requirements to support big enterprise deployments as compared to our own. Anyone with experience knows commercial desktop black box scanners can easily eat up several gigs of memory and disk space for even a single website. Nothing high-end workstations can’t handle, but multiplied out it’s an entirely different story. The person had an unexpected answer: “Desktop black box scanners don’t have the use-case you [WhiteHat] do; their technology doesn’t need to scale.” Say Whaaa!?!

Asking for clarification, he said to consider how black box scanners are normally used in the field. They are “developer” or “pen-test” tools where the use-case is one person, one machine, one website, one configured scan, then let it run for however many hours or days it takes until completion. Attempting to perform dozens or hundreds of scans at the same time would be exceedingly rare, if it happens at all, so the capability to do so doesn’t need to exist. He said, “Who besides you guys [WhiteHat] needs to scan that many websites at a time?” To which I humbly replied, “the customer.”

As we know, new Web attack techniques are published weekly and Web application code changes rapidly (Agile). Web applications, even those that are old and unchanged, need to be tested often for these issues. Testing once a year, or even once a month, isn’t enough in an environment like the Web where daily attacks are normal. So if an enterprise has, say, 10 or more websites, to say nothing of those with hundreds or thousands, mass scanning is essential to get through them regularly. Burdening enterprises by having them wire scan nodes together with command and control systems to achieve scale is patently absurd. That’s an inefficient one-to-one model: 100 simultaneous scans = 100 scan boxes. Of course I’m sure they are happy to sell the hardware.

So yes, I was a bit surprised that the desktop scanner guys haven’t seen fit to tackle the technology scaling problem, even though two of them are mega corps. They, above all, should know that scaling must be addressed if performing routine vulnerability assessments on all the Internet’s most important websites is to become a reality. To be fair, we’ve never pulled back the curtain to show off our own infrastructure. Maybe it’s time we did, because over the years we’ve invested heavily in it and it’s something we’re particularly proud of. I think others would be interested and impressed as well. The physical requirements for WhiteHat Sentinel, a SaaS-based website vulnerability management platform, are in a word -- massive.

Operationally we’re assessing over 2,000 websites on a fairly routine basis (~weekly). A dedicated IT staff monitors the systems for over 300 points of interest (utilization of network, CPU, memory, uptime, latency, etc.) ensuring everything is running smoothly 24x7. Metrics show that at any given moment 450 scans are running concurrently, generating about 300 million HTTP requests per month and processing 90,000 potential vulnerabilities per day. We preserve a copy of every request sent and response received for audit, trending, tracking, and reporting purposes. The system itself is accessed by over 350 different customers with tens of thousands of individual Sentinel users.
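A quick back-of-the-envelope check of those figures (a rough sketch; the numbers are the platform-wide averages quoted above, spread naively over a 30-day month):

```python
# Rough averages implied by the Sentinel operational figures quoted above.
requests_per_month = 300_000_000
seconds_per_month = 30 * 24 * 3600          # assume a ~30-day month

avg_requests_per_sec = requests_per_month / seconds_per_month

concurrent_scans = 450
avg_requests_per_scan_per_sec = avg_requests_per_sec / concurrent_scans

potential_vulns_per_day = 90_000
vulns_per_scan_per_day = potential_vulns_per_day / concurrent_scans

print(f"~{avg_requests_per_sec:.0f} requests/sec across the platform")
print(f"~{avg_requests_per_scan_per_sec:.2f} requests/sec per concurrent scan")
print(f"~{vulns_per_scan_per_day:.0f} potential vulns per scan per day")
```

That works out to roughly 116 requests per second platform-wide -- modest per individual scan, but sustained around the clock across hundreds of targets.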

CPU- and memory-wise, our ESX virtualization chassis allow us to control resource allocation and scale fast between multiple scanning instances and load-balanced front-end & back-end Web servers. As you can see from the pictures, we have some serious storage requirements. Our clustered storage arrays have 250TB ready to go (with additional capacity available at a moment’s notice), write about 500GB to disk per day, and are connected by dual 10GB backplane Ethernet connections. Sick!
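The storage headroom those two figures imply is easy to work out (a sketch; decimal units assumed, and it ignores deletion, compression, and growth in write rate):

```python
# Headroom implied by the storage figures above: 250TB capacity, ~500GB/day written.
capacity_tb = 250
write_gb_per_day = 500

days_of_headroom = (capacity_tb * 1000) / write_gb_per_day   # decimal TB -> GB
years_of_headroom = days_of_headroom / 365

print(f"{days_of_headroom:.0f} days (~{years_of_headroom:.1f} years) "
      "before the current 250TB fills at today's write rate")
```

About 500 days at the current rate, which is why the note about adding capacity at a moment's notice matters.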

Oh, did I forget to mention the two 100MB links to the Internet? Also very important is that the infrastructure is fully redundant. Pull any network cable, push any power button, and the system keeps on humming. I left out the pictures of the backside of the cages, which is every bit as cool as the front, but there’s a lot of network cords, firewalls, routers and other stuff we’d prefer to keep to ourselves. :) If someone else claims to have a SaaS scanning platform, I wonder if it looks anything remotely like ours.

The data center where everything is housed is SAS70 Type II certified and state-of-the-art when it comes to power, fire protection, cabling, construction, cooling, and physical security. Guards are on site 24/7/365, with active patrols both inside and outside the facility, and 54 closed-circuit video cameras covering the interior and exterior of the building. Getting access to our area requires an appointment, a government-issued ID, a thumbprint, and a retina scan, and only then do they hand over the key to our private space, to which only two people at WhiteHat have access. I’m not one of them. :) Compare this to a scanner on a laptop sitting somewhere unguarded in the enterprise. Clearly we’re not a desktop scanner behind a curtain like others out there. We’re not playing around. We take this stuff extremely seriously.

Other than the first three paragraphs, a nice post. (Although, 2 100MB links seems a bit odd - I'd think you'd be better off if you spread them out a bit between telcos for better peering - but maybe you are using someone like InterNAP.)

Now, about those three paragraphs... :)

>Say Whaaa!?!

Are you suggesting that "Desktop" scanners _should_ scale?

That doesn't make any sense.

If you need to scan 100 websites within tight time constraints, surely you might consider that a desktop tool is the wrong one to do so?

>He said, “Who besides you guys [WhiteHat] needs to scan that many websites at a time?” To which I humbly replied, “the customer.”

The customer of the "Desktop" tool vendor or a Whitehat customer?

If you meant the "Desktop" customer - they should consider an appropriate "Enterprise" tool. Strapping together 100 desktop tools seems too goofy to consider.

If you mean a WhiteHat SaaS customer - well, that's what you are there for. :)

In the real world, I don't think you will find too much demand for vast scaling to the degree that a SaaS vendor needs.

In my experience, even if the threats change daily, test schedules/windows, CM policies, approvals, backups, etc. slow things down to a more cautious pace.

Even when/if you do get authorization to do a BIG scan, a little bit of creative scheduling and risk assessment/prioritization will help with bottlenecks and keep your server and network teams happier with you anyway.

(This discussion almost ends up as an advertisement for WAF as 0-day protection.)


For large enterprises that require scanning numerous applications, in a recurring manner, we have a different product called AppScan Enterprise, which does exactly that. It sits on dedicated hardware, uses a robust database, and can handle the load mentioned.

@Anonymous: yes, file-system level crypto is applied in case of physical hardware theft, the risk of which is already extremely low. Application level encryption is a more complicated subject because we must be able to read the data to perform our duties.

@Ory: Would you mind describing what AppScan Enterprise's hardware requirements would be when an organization needs to scan 100 sites simultaneously? And if the answer is "it depends," perhaps explaining how such an estimate is approached.

@Dan: I'm seeing your comments emailed to me, but for some reason they are not being posted to Blogger. Not sure if this is a bug or you are deleting the message. Either way, it's hard to respond to comments that no one else sees.

@Dan: The data center provider handles the relationships and connections through multiple telcos.

Desktop scanner vendors claim their solution scales, when it clearly does not at multiple levels. Secondly, in network VA, a single scan box is capable of scanning a huge host/IP space. The perception among many is that the same can be done in webappsec. Obviously not true.

"In the real world", our experience has been that there are a great many organizations responsible for literally hundreds, if not thousands, of websites. For those, yes, I think we'd be a fine match. :)

I'll not be addressing the WAF issue here, except to say that something needs to be done with the vulns found.

@Ory: Propaganda? I'm specifically comparing against the invalid claims perpetuated by desktop scanner vendors, which include IBM. A conversation that, yes, not only protects my business, but also protects customers against such false and misleading scalability claims.

I asked you a fair and direct question about the hardware requirements when deploying AppScan Enterprise to scan 100 sites simultaneously. You did not answer. If that's the way you protect your business, so be it.


I am not trying to stall or avoid giving an answer; I simply don't have one, since I don't deal with AppScan Enterprise. I'm mostly involved in the AppScan desktop product (Standard Edition), as you probably know.

Having said that, I guess our support team has a formula to help our (successful) customers with their scale-up questions.

What can I say - Whitehat does it all, it's the best company, with the best scanning solution on the planet. There you go, I admit it, we suck, you rule :-)

Aside from the initial scan for a new effort, wouldn't "normal" operations be to scan as part of the ongoing lifecycle, periodically, and as the threat changes?

So, once you get past that first "Scan Everything for Everything" scan (which I can't imagine would really be done all at once on 100 servers anyway - given the constraints I mentioned earlier), then you have:

1. Full "as needed" scans on some subset of hosts as their code & environment changes,
2. full "periodic" scans,
3. and delta scans for specific newly discovered vulnerabilities & techniques

So, maybe you could reasonably cover 100 sites with a desktop tool, since the problem doesn't necessarily require a full crawl and scan over and over daily.

Running several new tests on 100 pre-crawled sites doesn't seem too far beyond the capabilities of desktop scanners - although, they would probably be serially executed (?) and not parallel.

But I'm not sure the average company will get too excited about serial versus parallel execution. If you're that time sensitive (and not a SaaS vendor) you probably need another control anyway - since fixes are going to take a while in an "Enterprise" environment.
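The serial-versus-parallel trade-off is easy to put rough numbers on. A minimal sketch, where the per-site delta-scan time and worker count are hypothetical figures chosen purely for illustration:

```python
import math

# Hypothetical figures -- not from any vendor's spec sheet.
sites = 100
minutes_per_delta_scan = 30      # assumed time for one delta scan of a pre-crawled site
parallel_workers = 10            # assumed concurrency of an "enterprise" deployment

# Serial: one scan at a time, back to back.
serial_hours = sites * minutes_per_delta_scan / 60

# Parallel: batches of `parallel_workers` scans run simultaneously.
parallel_hours = math.ceil(sites / parallel_workers) * minutes_per_delta_scan / 60

print(f"serial:   {serial_hours:.0f} hours")
print(f"parallel: {parallel_hours:.0f} hours with {parallel_workers} workers")
```

Under these assumptions serial execution takes a couple of days of wall-clock time versus a single evening in parallel -- which matters a great deal to a SaaS vendor on a weekly cadence, and rather less to a shop scanning quarterly.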

Aside from SaaS vendors (little control of the schedule) and universities or hosting companies (little control of the content/configuration), what sort of users are you finding that require a high degree of parallel testing?

"Aside from SaaS vendors (little control of the schedule) and universities or hosting companies (little control of the content/configuration), what sort of users are you finding that require a high degree of parallel testing?"

If a single customer has 100 (or more) apps to look at then I think they have bigger business issues to consider.

Hence WhiteHat is an excellent choice for service providers like www.proactiverisk.com to leverage in helping clients meet enterprise needs -- no silver bullet, just a machine gun.

Home Depot has lots of HAMMERS you have to pick the right one for the type of nail you want to hit ;)

The desktop web application scanning products definitely do not scale, and as mentioned, they are not supposed to because that is not their intent.

As far as commercial scanning products go (and I have used and have licenses to all the top commercial scanners), the vendors almost always offer two products: the desktop version and the "enterprise" version. The difference between the two tends to average about $20,000 for a desktop license versus $1,000,000 for the enterprise product.

The enterprise product has to scale, so it will definitely require more hardware. That being said, there are a couple of things I have seen.

Company A will offer up the service in a SaaS manner, much like WhiteHat. This is great, because it removes a lot of the hardware burden (managing & building labs, etc.) from the customer, who is already struggling as it is.

Company B's $1,000,000 price tag includes a company-wide license for unlimited scanning using the enterprise product, BUT they DO NOT provide the hardware/infrastructure/resources. The customer has to provide their own servers, virtual machines, etc. Company B sucks. : )

Then there are open-source web application vulnerability scanners. They don't scale either in an enterprise fashion, and because they are free there is little customer support. But for a single penetration tester with a single web application target, they are great!

When talking about desktop and enterprise scanning products, it is VERY important that both products use the same scanning engine, so that there is consistency. There are certain commercial scanning products out there that use a different code base in their desktop and enterprise products. So what happens is... you scan a web application with the enterprise and desktop product, and end up with a different list of vulnerabilities. False positives and, even worse, false negatives.

In addition, certain vendor enterprise scanners do not have the same "nerd knobs" as the desktop product. What that means is there is very little ability to fine-tune the enterprise scanner to a particular web application with the same granularity.

@Aaron: Speaking from experience, when customers engage with us they quickly mature from the phase of just finding vulnerabilities to actually implementing a process to fix them.

"Fixes" are typically an application code change, a configuration change, or a web application firewall rule. Whatever the case may be, the vulnerability details and recommended action (policy?) must filter down from the security team to the appropriate people in the organization.

The way many of our customers have done this is by using the open XML API in Sentinel. The results are automatically pulled into a bug tracking system or into a higher-level dashboard like Archer. Of course the XML can also be automatically converted into WAF virtual patch rules.
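The parse-map-file pattern behind that kind of integration can be sketched in a few lines. Note the XML layout, element names, and attributes below are entirely hypothetical -- the real Sentinel API schema will differ -- but the shape of the workflow (pull XML, map each finding, hand a ticket to the tracker) is the point:

```python
# Sketch of turning scanner XML results into bug-tracker tickets.
# The <vulns>/<vuln> schema here is made up for illustration only.
import xml.etree.ElementTree as ET

SAMPLE = """<vulns>
  <vuln id="1001" class="Cross-Site Scripting" severity="high">
    <url>https://example.com/search</url>
    <param>q</param>
  </vuln>
</vulns>"""

def to_tickets(xml_text):
    """Map each <vuln> element to a dict a bug tracker's API could accept."""
    tickets = []
    for v in ET.fromstring(xml_text).findall("vuln"):
        tickets.append({
            "title": f'[{v.get("severity").upper()}] {v.get("class")} at {v.findtext("url")}',
            "body": f'Parameter: {v.findtext("param")} (scanner id {v.get("id")})',
        })
    return tickets

for t in to_tickets(SAMPLE):
    print(t["title"])   # in practice, POST to the tracker instead of printing
```

The same mapping step is where a WAF virtual-patch rule could be emitted instead of a ticket -- it's just a different template over the same parsed fields.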

About Me

Jeremiah Grossman's career spans nearly 20 years; he has lived a lifetime in computer security and become one of the industry's biggest names. He has received a number of industry awards and been publicly thanked by Microsoft, Mozilla, Google, Facebook, and many others for his security research. Jeremiah has written hundreds of articles and white papers. As an industry veteran, he has been featured in hundreds of media outlets around the world and has been a guest speaker on six continents at hundreds of events, including many top universities. All of this came after Jeremiah served as an information security officer at Yahoo!