[Good "rant" about why networking is lagging behind processors in terms of speed, and that the gap is NOT closing, but rather expanding]

> Let me make a bottom-line statement: networking will *never* be as
> fast as internal computer speeds. Technology trends point to computers
> increasing at a greater rate. And then there's physics. Unless you're
> planning on breaking the speed of light, it's always going to take a
> lot longer to send information across a room than across a chip.

On the other hand, if the software can make good use of certain features, the hardware manufacturers will in turn see that it benefits them to provide more of those features.

In the case at hand, if Linux allows us to build a cluster of machines that, transparently to the user, handles a workload much larger than one machine could, then faster and lower-latency networking products will become more and more profitable to produce and sell.

Suppose you want to make a distributed shared memory machine. What "far-memory" latency would be acceptable? Inside the machine we currently achieve:

OK, now let's assume that the "further away" latency penalty can be allowed a factor of 10. So we have a 1 us timeframe. On "current" technology (SCSI, 80 MB/sec) we can transfer 80 bytes in that time.
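A quick sketch of that arithmetic (figures taken straight from the text, nothing measured):

```python
# How much data fits in the 1 us "far memory" latency budget on an
# 80 MB/s link? (Both numbers are the ones quoted in the text.)

def bytes_in_window(bandwidth_bytes_per_s, window_s):
    """Bytes transferable in the given time window."""
    return bandwidth_bytes_per_s * window_s

budget = bytes_in_window(80e6, 1e-6)
print(budget)  # 80.0 -- just enough for a small "get me page xyz" request
```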

Enough to tell the other side "please get me (sub-)page xyz". Current interrupt latencies and software limitations don't allow us to do it that fast. But those can and will be improved if they become an issue.

One microsecond is also enough for light to travel 300 meters. So at those latencies, "across the room" is not yet a problem. Not even close. Electrical signals can travel at about a third of the speed of light, so that doesn't pose any trouble either.

However, having a bit too little memory already pushes you to "swap", and that costs MUCH more than a single microsecond. Suppose a task with 64 MB of VM needs to be considered for migration. On 80 MB-per-second networks, you want to save at least the transfer time. That comes to about one second of CPU. If the application has already been running a few minutes, and you could assign it a say 20% faster CPU, that is likely to be worth the trouble...

Even if at first latencies are on the order of one millisecond, it is still likely to be better to "page" to another machine's RAM than to DISK. It is already useful. (Consider paging to another machine over 100 Mb ethernet: throughput is the same as a reasonable disk, but seek times are much better.)
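The break-even point in that argument can be sketched as follows (the 64 MB image, 80 MB/s link, and 20% speedup are the numbers from the text; the break-even rule itself is my reading of the argument, not an existing interface):

```python
# When does migrating a task to a faster host pay for itself?
# Cost: shipping the VM image. Gain: finishing the remaining work sooner.

def transfer_time(vm_bytes, bw_bytes_per_s):
    """Seconds needed to ship the task's memory image."""
    return vm_bytes / bw_bytes_per_s

def time_saved(remaining_s, speedup):
    """CPU seconds saved by running the remaining work `speedup`x faster."""
    return remaining_s - remaining_s / speedup

cost = transfer_time(64 * 2**20, 80e6)   # ~0.84 s to move 64 MB at 80 MB/s
print(cost)

# Migration pays off once the time saved exceeds the transfer cost:
for remaining in (1, 10, 60, 600):       # seconds of work left on the task
    print(remaining, time_saved(remaining, 1.2) > cost)
```

With a 20% speedup the task only needs a handful of seconds of remaining work before migration wins, which is why "already running a few minutes" is comfortably worth the trouble.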

In my master's thesis, I realized that there are two types of applications that can be parallelized: those whose data-dependency graph has loops, and those whose graph doesn't.

Consider FFT. No loops. Here latency is not an issue; bandwidth is. The "100 users doing independent things" case also falls under this category.

Most applications, however, DO have loops in their data dependencies. In that case the latencies are the bottleneck. Consider for example a finite-elements program: for the current timestep you need the previous-timestep results of the adjacent elements.
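A toy illustration of that loop-carried dependency (a hypothetical 1-D explicit diffusion step, not from the text): each element's new value needs its neighbours' values from the previous timestep, so if the elements are partitioned across machines, every step forces a boundary exchange and per-step latency, not bandwidth, bounds the speed.

```python
# 1-D explicit diffusion step with periodic boundaries: u_new[i] depends
# on u[i-1], u[i] and u[i+1] from the PREVIOUS step. This is the loop in
# the data-dependency graph -- no step can start before the last finished.

def timestep(u, alpha=0.25):
    """One explicit step over the whole (periodic) domain."""
    n = len(u)
    return [u[i] + alpha * (u[(i - 1) % n] - 2 * u[i] + u[(i + 1) % n])
            for i in range(n)]

u = [0.0] * 8
u[4] = 1.0                       # a single "hot" element
for _ in range(3):               # each step must wait for neighbour data
    u = timestep(u)
print(u)                         # the spike has diffused to its neighbours
```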

If Linux moves towards allowing clusters of networked machines to function as one, the push for faster and lower-latency network devices will get more and more significant.

The software layer should "know" about the latency and bandwidth restrictions in the form of tunable parameters. A task that has run for an hour and does on average one kbyte of IO per second of CPU time (e.g. a raytracing program) is worth migrating to a different (faster) host. On any network.

An application doing 1 MB of IO per second might be worth migrating towards the data. On the other hand, on gigabit ethernet or 80 MB/sec SCSI networks that rate is still insignificant enough to allow this application to migrate towards a faster host, if the running time seems to warrant that.
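The tunable policy suggested in the last two paragraphs might look something like this. All names and the 1% threshold are invented for illustration; nothing here is an existing Linux interface:

```python
# Hypothetical migration policy: move a task toward a faster host only if
# its IO rate is an insignificant fraction of the link, and the target is
# actually faster. Otherwise it is a candidate for moving toward its data.

def worth_migrating(io_bytes_per_cpu_s, link_bytes_per_s, speedup,
                    io_budget_fraction=0.01):   # tunable knob, 1% assumed
    return (io_bytes_per_cpu_s < io_budget_fraction * link_bytes_per_s
            and speedup > 1.0)

# The raytracer from the text: ~1 KB of IO per CPU second.
print(worth_migrating(1e3, 10e6 / 8, 1.25))   # even on 10 Mb/s ethernet
# 1 MB/s of IO on that same slow link: better to move toward the data.
print(worth_migrating(1e6, 10e6 / 8, 1.25))
# ...but on gigabit ethernet (~125 MB/s), 1 MB/s is again insignificant.
print(worth_migrating(1e6, 1e9 / 8, 1.25))
```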

Most of this is long-term future. I don't know whether we can get Linux to the point where the clustering features become important enough for the industry to start thinking about our "needs".

Current state of the art proves Richard and Larry right: for best performance the API should match the hardware. (Don't provide a DSM interface on message-passing hardware, or the other way around!)

But looking towards the future, I don't think we should dismiss thinking about clusters of machines and how we can make them perform, even if the interfaces are not optimal right now.

Regards,

Roger.

--
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2137555 **
-- BitWizard writes Linux device drivers for any device you may have! --
** Never blow in a cat's ear because if you do, usually after three or **
** four times, they will bite your lips! And they don't let go for at  **
** least a minute. -- Lisa Coburn, age 9

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/