This is not so much a question as a meditation. But then again maybe it is a question but one using some clever obfuscation, who knows?

Anyhoo... grab a cuppa and settle back...

How fast is fast?

Let's say you want to set up a site and your aim is to be sure that it will handle up to 1 million hits per hour. So you shop around and get yourself a dedicated server with 24 processor cores, 12GB RAM, a 100Mbps pipe, and 15TB of bandwidth for about $259/month. (Yeh, I know a place...)

Anyway, so you do a little maths and divide 3,600 seconds by 1,000,000 hits, and get the answer of 0.0036. So here's your target: get your code to build a response in that length of time, and you can do 1,000,000 in an hour. That is of course neglecting the multiple cores you just ordered, each of which can be producing a result simultaneously, so your server is apparently capable of 24,000,000 hits per hour. Less, of course, processor time for the O/S, DB, and other sundry odds and sods. So let's call it 20,000,000 hits per hour.
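The sums above can be sketched in a few lines (Python here purely for the arithmetic; the core counts and the "4 cores lost to the O/S" figure are the post's own assumptions, not measurements):

```python
# Back-of-envelope capacity maths from the figures above.
target_hits_per_hour = 1_000_000
seconds_per_hit = 3600 / target_hits_per_hour    # time budget per request
print(seconds_per_hit)                           # 0.0036

cores = 24
raw_capacity = cores * target_hits_per_hour      # every core meeting the budget
usable_capacity = 20 * target_hits_per_hour      # ~4 cores lost to O/S, DB, etc.
print(raw_capacity, usable_capacity)             # 24000000 20000000
```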

But how big is a hit?

With broadband now the de facto standard, the days of optimising sites so that they load in less than 10 seconds on a standard 56K modem are becoming a distant memory. Let's say you have about 50KB for the headers, cookies, and HTML, plus the high-quality JPEG images, maybe some Flash animations, and video content, all of which cost bandwidth to send out.

So let's be conservative with the estimate of average KB per hit, and suggest a value of about 200KB total.

20,000,000 * 200KB = 4,000 GB (4TB)

Oh dear... running at full tilt, your super-fast, optimised-down-to-the-bone program has just served up so many hits so fast that you have run out of bandwidth in a mere 3 hours and 45 minutes.

And that was assuming you had a pipe wide enough to fit all that data through it at that speed!
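That burn rate is easy to check (a quick re-derivation of the post's own numbers, using decimal units throughout):

```python
# Bandwidth consumed at the full 20M hits/hour rate.
hits_per_hour = 20_000_000
kb_per_hit = 200
tb_per_hour = hits_per_hour * kb_per_hit / 1_000_000_000  # KB -> TB (decimal)
monthly_cap_tb = 15
hours_until_cap = monthly_cap_tb / tb_per_hour
print(tb_per_hour)        # 4.0
print(hours_until_cap)    # 3.75  (3 hours 45 minutes)
```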

So how big was the pipe again?

The server comes with a 100Mbps pipe with a throughput capability of around or just under 12.5 MB/s.

12,500 KB / 200KB = 62.5 average hits per second

oh...

More / wider pipes!!

So having worked out that the pipe can only sustain a throughput of around 62.5 hits / second * 60 * 60 = 225,000 / hour, you realise you need a bigger pipe!

So you plump for the 1Gbps pipe: 10x the bandwidth, with a maximum theoretical rate of 2,250,000 average hits per hour.

Nice.
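The pipe arithmetic generalises to a one-liner (assuming 8 bits per byte and decimal units, as in the figures above):

```python
def max_hits_per_hour(pipe_mbps, kb_per_hit=200):
    """Upper bound on hits/hour imposed by the pipe alone."""
    kb_per_second = pipe_mbps * 1000 / 8    # Mbps -> KB/s
    return kb_per_second / kb_per_hit * 3600

print(max_hits_per_hour(100))     # 225000.0  (62.5 hits/s)
print(max_hits_per_hour(1000))    # 2250000.0
```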

But wait a minute, the code is capable of running 20,000,000 hits per hour because we optimised it so heavily. So even at maximum throughput on the pipe, on a dedicated server with unlimited bandwidth, the CPUs are nearly 90% idle!

So how fast does it need to be?

So we have just determined that a server with even a 1Gbps pipe can only do around 625 hits per second before it simply hits the bandwidth limit and can go no faster.

So how many cores have you got?

In the example above the server had 24 cores, of which I suggested 4 would be busy with odds and sods leaving the other 20 cores free for rendering activity.

With a 15TB bandwidth cap, you're going to run out of bandwidth again in about 33 hours and 20 minutes.

But recall our spec only called for 1,000,000 hits per hour, which was around 277 per second.

277 hits/s * 200KB * 60 * 60 = 199.44GB / hour

15TB / 199.44GB = 75.21 hours.

Clearly more bandwidth is needed, or the server will run out just 3 days into the month!
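To double-check the monthly budget at the original target load (using the post's rounded figure of 277 hits per second):

```python
# Monthly bandwidth budget at ~1,000,000 hits/hour.
hits_per_second = 277              # 1,000,000 / 3600, rounded down
kb_per_hit = 200
gb_per_hour = hits_per_second * kb_per_hit * 3600 / 1_000_000   # KB -> GB
monthly_cap_gb = 15_000            # 15 TB cap
hours_until_cap = monthly_cap_gb / gb_per_hour
print(gb_per_hour)                  # 199.44
print(round(hours_until_cap, 2))    # 75.21
```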

Therefore, in this day and age, bandwidth availability is far more important than server speed and available cores. That trend is likely only to increase as web apps become ever more complex and graphically polished, and take on high-bandwidth overheads like streaming video and audio content.

Conclusion

The processing power available today is huge! I also recently saw a video of Larry talking about Perl 6, in which he mentioned utilising CUDA graphics hardware to do extra processing. Things like regexes, which have traditionally been considered expensive on CPU time because each character must be tested against the matching rules, can be done in parallel on the graphics card at super-fast speed.

Nvidia Tesla cards (and Tesla-enabled servers) can have as many as 512 processing cores available!

With that much processor power backed up inside the bottleneck of available bandwidth, questions like "how fast is fast?" and "is my code fast enough?" are soon set to sit alongside questions like "will my database of 100,000 records fit onto the server's 10MB hard disc, or do I need to drop the first two characters of the 'Year' column to save space?"

In short, don't worry about it too much. If you're using mod_perl2 (or similar) and your code is producing results in less than 0.3 seconds, you're pretty much good to go! Remember, your time is several orders of magnitude more valuable and expensive than extra server power, and considering the scale to which servers can be built now (think Deep Blue, Tianhe-1A, etc.), the sky really is the limit. Remember the Cray? Would you believe it only ran at 80MHz? I'm pretty sure I've got a wristwatch somewhere around here that can outperform that now.

Yesterday's supercomputer is tomorrow's laptop! And yesterday's code, which absolutely must be optimised to the bone for the sake of efficiency, is tomorrow's day off having fun. (Unless of course you enjoy debugging and code optimisation; each to their own, I guess.)

Let us consider caching. The front page of Google is 300k... but 270k of that is cached. 30k a hit. Getting a search result is about 130k, 80k of which is cached. In addition, with a site that large much of your requests will be AJAX calls returning small bits of JSON and XML. We're talking an order of magnitude below the 200k estimate.
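The caching point can be made concrete (the 300k/270k and 130k/80k figures are the rough estimates quoted above, not measurements):

```python
def wire_kb_per_hit(total_kb, cached_kb):
    """KB actually sent per hit once cacheable assets come from the browser cache."""
    return total_kb - cached_kb

print(wire_kb_per_hit(300, 270))   # front page: 30 KB on the wire
print(wire_kb_per_hit(130, 80))    # search result: 50 KB on the wire
```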

Now consider the Twitter problem. You need to efficiently pipe 140 characters to just the followers of each user in real time and you have millions of users constantly sending messages. The problem seems simple, and the payload is small, but it is an extremely expensive calculation. Social networks at the scale you're discussing are CPU intensive.

This brings us to the problem with your calculations: they're wrong. They're wrong because they are premature optimization. Or in this case, premature sub-optimization. The evil of premature optimization is not so much the optimization, it's thinking you can predict performance. It's thinking you can predict the future. It's thinking you know everything you need to know about how a system will be used and react before it actually happens. In any sufficiently complex system the best performance prediction is this: your prediction is wrong. You simply do not have the data. I don't either. Most of us do not.

A site with a million hits a day doesn't just appear out of thin air. Nobody should sit down and try to design a site that big unless they already have a slightly smaller one doing the same thing. That's the only way to get real experience and data to plug into the equations to find the real bottlenecks. Nobody should start by buying the sort of hardware you're talking about. You should start by knowing you will be wrong, plan accordingly, gather metrics and optimize on that. Be modest in your performance expectations until you have something to profile.

You can think of it like the Drake Equation. As a thought experiment and discussion piece about the probability of alien life, it's fantastically focusing. As a practical predictor it's meaningless. Most of the numbers we plug in are sheer speculation. There are so many variables with such high variability being multiplied that the end result swings by orders of magnitude depending on which valid-seeming estimates you plug in. Errors multiply. You can get anything from millions down to just 1, and you'll tweak the results to your liking (nothing personal, just human nature). It's seductive to think that meaningless number is proof enough to take action.

If you are getting so many legitimate hits that you are overwhelming your giant bit cruncher, fed with the latest light speed ethernet, you might consider selling out to someone who knows how to handle it. :-)

Like the old saying in the Old West: there will always be someone faster in this game. That's what IBM and the plethora of cloud server providers are for. They have air-conditioned rooms full of racks of the fastest equipment, hooked directly into whatever is the internet backbone. You won't even have to buy any equipment; they will sell you so many MIPS per month on a virtual machine, and they will determine which giant data centre the MIPS get run at, assuring you of no roadblocks and protecting you from DoS attacks and storm failures. Now that's what I call fast.

Your site content must be pretty awesome to get that many hits, unless it's just a Denial of Service Attack, which you, I presume, would somehow get, and blame on Perl. :-)

By the way Logicus, can you be faster than the Cosmos itself, as it transforms from one instance of Time to the next? Om.

I just treat my files much the same way I've always treated them, coming at the problem from a '90s graphics-demo-coder perspective. It's how I've evolved as a programmer, and now, oh look, Perl is about to learn how to exploit graphics processing power :)

If I needed my code to run on a 386, I'd write it in C/asm and hand optimise the inner loops... I don't need to do that!

A chain is only as strong as its weakest link. A processing system is only as fast as its slowest component, multiplied by the queue-size that results from that differential in speed.

It is very, very often the case that “the slowest component” is precisely the same as it has always been: the disk drives. Not the networks, not the CPUs or the number thereof.

In my opinion, the best way to get definitive answers is: simulation. Except in the very rare case of processing that actually involves CPU-intensive activity, the number of CPUs or cores can almost always be omitted. What matters are things like ... cache hit ratios, data distributions, and I/O avoidance. And this has more or less been true for about sixty years now.
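As a minimal illustration of the weakest-link point (all the rates below are invented for illustration, not measured), the throughput of a two-stage pipeline is set by its slowest stage, and the queue in front of that stage grows with the rate differential:

```python
# Toy two-stage pipeline: every request passes through a CPU stage,
# then a disk stage. All rates are requests per second.
cpu_rate = 1000      # what the CPU stage can sustain (illustrative)
disk_rate = 120      # what the disk stage can sustain (illustrative)
offered_load = 500   # incoming request rate (illustrative)

throughput = min(offered_load, cpu_rate, disk_rate)
queue_growth = max(0, min(offered_load, cpu_rate) - disk_rate)  # req/s piling up at the disk

print(throughput)     # 120 -- the disk sets the pace
print(queue_growth)   # 380 -- the backlog ahead of the disk grows this fast
```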

That's what's wrong with your position. If "aXML" really became the next hot thing, it could never scale to the level of Facebook or Google without requiring 42 nuclear power plants to sacrifice their lives on the altar of your awesomeness.

I put "aXML" in quotes because it really is all about the "a" and has very little to do with actual XML. (Else you'd just use XML::LibXML like the rest of us mere mortals.)

Where/when did I ever suggest that the system is the core design of the next Facebook or Google? I never suggested such a scope. I suggested that, when understood, the system is easy to extend, modify and maintain, and saves programmer time. I also suggested that, given the state of current hardware, the extra processing overhead is largely irrelevant, even more so on the next version, which I'm working on at the moment.

If I was writing a core for a Google or a Facebook, then the whole approach would have to be different. I know that... it's not news to me... and wasn't 4 years ago. To get down to 0.003 seconds for your average page hit, you can't be using anything that requires much processing time on the stone-age classical hardware currently available. My point is that 99.9% of people who just want to use Perl to put together a site for their business or community have no need to worry about whether their site is going to become a global hotspot. Such people are looking for something simple they can set up easily, without having to spend years reading and studying up on how to make it fast.

This, to my mind, is one of the big reasons Perl is losing out to PHP: PHP is very quick and easy to get results with, and Perl takes effort. People don't like effort; they are either too lazy or too busy to dedicate the time and resources it takes. Especially when they think they can just get an offshore PHP programmer to build what they want for peanuts. And there is no shortage of such people.

What you expect from aXML, a system you know very little of, is a reflection of your own vain ego: it's got to be the biggest/best/fastest to satisfy your craving for power. Not all cars have to be Ferraris; there is room for Ford people carriers as well.

"This, to my mind, is one of the big reasons Perl is losing out to PHP: PHP is very quick and easy to get results with, and Perl takes effort."

In my mind, that's more about deployment than it is about learning syntax. People don't learn syntax. They copy and paste and tweak example code until it appears to behave appropriately.

Plack helps with Perl's deployment. Plack helps a lot.

"I suggested that, when understood, the system is easy to extend, modify and maintain, and saves programmer time."

That'd be easier to judge if you were to provide more examples.

"I also suggested that, given the state of current hardware, the extra processing overhead is largely irrelevant..."

The hardware specs you quoted are pretty heavyweight though, especially for a site measuring traffic in the hundreds of thousands of hits per month. When a persistent pure-Perl Plack-based server like Twiggy or Starman can serve thousands of hits per second, the bottleneck gets back to inefficiencies in processing, IO latency, and database traffic.

Even so, moving to a persistent process model is probably the best thing you can do. (I prefer Plack to mod_perl myself these days for many reasons, and not only performance.)

When putting a smiley right before a closing parenthesis, do you:

Use two parentheses: (Like this: :) )
Use one parenthesis: (Like this: :)
Reverse direction of the smiley: (Like this: (: )
Use angle/square brackets instead of parentheses
Use C-style commenting to set the smiley off from the closing parenthesis
Make the smiley a dunce: (:>
I disapprove of emoticons
Other