Posted
by
Hemos
on Monday November 29, 2004 @10:13AM
from the busting-outta-school dept.

rdwald writes "An international team of scientists led by Caltech has set a new Internet2 speed record of 101 gigabits per second. They even helpfully converted this into one LoC/15 minutes. Lots of technical details in this press release; in addition to the obviously better network infrastructure, new TCP congestion-control protocols were used."

Important Point:
When the LHC at CERN comes online in about 5 years, it's expected to churn out petabytes of data. Yeah. I meant that. Petabytes, as in 1024 terabytes. Fermilab is already turning out terabytes, but it will be greatly surpassed by the LHC.

A particle accelerator is basically taking very high resolution images in 3 dimensions hundreds of times per second.

I mean that's a full terabyte almost every minute and a half. What else produces so much data?
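For reference, that "terabyte every minute and a half" figure works out to just about this record's line rate. A quick sanity check (the 90-second figure is the parent's rough number):

```python
# Sanity check: one terabyte every ~90 seconds vs. the 101 Gbit/s record.
TB_BYTES = 1e12                       # decimal terabyte
seconds = 90

rate_gbit = TB_BYTES * 8 / seconds / 1e9
print(f"{rate_gbit:.0f} Gbit/s")      # 89 Gbit/s -- just under the record
```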

Internet2 as well as the Energy Sciences Network [es.net] are being set up to move massive amounts of data from science labs to computational labs within the US Department of Energy.
Labs like Fermilab are expecting to produce hundreds of terabytes of data per day from research when they come online.

Not counting the time to load, burn and read the discs, a non-stop flight from Pittsburgh to LA takes around 5 hours = 18,000 seconds. This amounts to 920,000,000 / 18,000 ≈ 51,000 Gbit/sec.

Considering a very approximate cost of $1/kg for the transport, and $2 for each disc, it amounts to around $4,653,000 total. Which is about $0.04/GByte, or around the same price per GB as a cheap 160GB hard drive.
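The parent's arithmetic, redone as a parametrized sketch. The payload and cost totals are taken straight from the post above; the per-disc breakdown is left out, since only the totals are given:

```python
# Sneakernet bandwidth: a cargo flight full of DVDs, parametrized.
payload_gbit   = 920_000_000   # total data on board, from the parent post
flight_seconds = 5 * 3600      # PIT -> LAX, non-stop

bandwidth_gbit_s = payload_gbit / flight_seconds
print(f"effective bandwidth: {bandwidth_gbit_s:,.0f} Gbit/s")  # ~51,111

# Cost per gigabyte under the parent's stated totals:
total_cost    = 4_653_000                 # USD (transport + media)
payload_gbyte = payload_gbit / 8          # 115,000,000 GB
print(f"cost: ${total_cost / payload_gbyte:.2f}/GB")           # $0.04/GB
```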

Unfortunately the LHC needs a nuclear power station (and a hydroelectric one for the computers 8)). I'm not joking, it really does. Running for more than 100 days a year + 30 for some other stuff is not practical unless you want Geneva to freeze during the winter.

The Swedish LHC (CERN) guys are going to need to be sending *petabytes* of collision data around the world for analysis over the course of their experiments. It's crazy to think it, but some people really do have a need for this. I suppose this is why the Internet2 is largely restricted to research and education purposes.

I know the difference. I just couldn't remember where Geneva was. After a nap I now remember it lies across the borders of both France and Switzerland. I'd be ashamed if it weren't for the weekend I just had.

Canadian researchers on CA*net 3 did an interesting experiment around this very question.

What do you do when your network is faster than your drives? You turn the network itself into a drive - a giant drive made of light, 1,000 miles in diameter.

Basically, the idea is that instead of accessing data relatively slowly from a server's drive, you instead keep the data spinning around the fibre network at the speed of light. If anyone wants something - a DVD quality movie for example - they peel it off as it comes around.

One of the major reasons why this is important (and higher bandwidth is good) is for scientific data. Things like data from CERN and other particle accelerators. These produce -huge- amounts of data every time they run, and this will allow researchers to access that data without actually having to go to places like CERN.

When people jumped from 56k to 1Mbps, the only thing that really changed for the *average joe* was grabbing mp3s and checking out more trailers.

Contrary to popular belief, most people are not out there downloading a 9GB collection of Friends, season 1 or grabbing a 20GB MAME set with flyers and cabinets. Most people will just go buy the DVD or grab Midway Arcade Treasures and be happy.

When people jumped from 1Mbps to 5Mbps, I've seen them take advantage of it by shopping on amazon 2ms faster than before.

I think the real "danger" with higher speeds would lie in the realm of more annoying/higher def advertising. When the day comes that it becomes trivial and technically possible on a large number of computers to download and display a 1920x1080 30-second interstitial ad before you can view a webpage, it *will* be done.

You can already see this transition happening with lower res video as people try to pack a highly-compressed 30 second FMV ad into a flash box.

Just food for thought, this isn't so much about speed as it is about size (and you all thought that didn't matter).

Think about it this way. If you have a 1" pipe, and you send a little bit of water down it, the water reaches from one end to the other in a certain amount of time. Now take that up to a 4" pipe. Does the water travel any faster simply because it's a larger pipe? No. But the difference is that you can send MORE water in the same amount of time, not that you can send it from one end to the other any faster.
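The pipe analogy in code: total transfer time is latency plus size over rate, so a fatter pipe helps big transfers but does nothing for a single packet. The 40 ms latency here is an assumed coast-to-coast figure, not a measured one:

```python
# Latency vs. bandwidth: a wider pipe doesn't move each bit faster,
# it moves more bits at once.  Total time = latency + size / rate.
def transfer_time(size_bits, rate_bps, latency_s):
    return latency_s + size_bits / rate_bps

latency = 0.040                      # assumed one-way latency, seconds
packet = 8 * 1500                    # one 1500-byte packet, in bits

# A lone packet takes essentially the same time on both links:
print(f"{transfer_time(packet, 1e6, latency):.3f} s")    # 0.052 s at 1 Mbit/s
print(f"{transfer_time(packet, 101e9, latency):.3f} s")  # 0.040 s at 101 Gbit/s

# A big transfer is another story entirely:
terabyte = 8 * 1e12                  # bits
print(f"{transfer_time(terabyte, 1e6, latency) / 86400:.1f} days")  # 92.6 days
print(f"{transfer_time(terabyte, 101e9, latency):.1f} s")           # 79.2 s
```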

I remember watching a lecture on the Research Channel, where a comparison was made of growth rates of different technologies: CPU, storage and network bandwidth. The bottom line was that CPU performance growth follows Moore's Law (i.e. the perf increase is dominated by manufacturing issues) while network performance is increasing at 10x the CPU rate (disk is somewhere in between). The talk discussed the implications of this.

The summary was that we'd need to revisit system tradeoffs. We currently compress data to trade cheap CPU cycles for scarce bandwidth; if bandwidth keeps growing faster than CPU, that tradeoff flips.

Does your surgeon really need these speeds? Or does he need a connection with near-zero latency, and very consistent near-zero latency?

Real-time video can be achieved with today's bandwidth (albeit with expensive solutions). What we can't deal with is a five second hiccup (sorry about cutting all the way through you ma'am... lag is horrible today).

Cue the gags about "Finally, I shall be able to download my pr0n collection".

Cue questions about whether that's gigabytes or gigabits.

Cue questions about "How can I get such gaping-a$$ bandwidth?"

One of these days I will write the ultimate FAQ to /. posts, including all the possible combinations of arguments started by SCO stories, how politics is treated here, and a whole chapter on non-funny memes.

The way to write 11GBps is to use a distributed array of disks. A parallel filesystem can easily handle it. Over 100 networked computers with a parallel filesystem like Lustre, GPFS or PVFS (1, 2... is there a 3?) can do it. I mean, there are disk arrays that have sustained throughput of over 55GBps. Also, the 11GBps that we see now may one day carry all sorts of communication, so in a way it is a way of the future.
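A rough sizing sketch for the striping idea. The 50 MB/s per-drive figure is an assumption (2004-era sustained sequential throughput), not something from the record itself:

```python
# How many spindles to sustain ~11 GB/s, striped across nodes?
import math

target_bytes_s   = 11e9      # ~11 GB/s, the rate quoted above
per_disk_bytes_s = 50e6      # assumed sustained sequential rate per drive

disks = math.ceil(target_bytes_s / per_disk_bytes_s)
print(disks)                    # 220 drives, before any redundancy

# Spread over 100 nodes (as suggested above), that's only a few per node:
print(math.ceil(disks / 100))   # 3
```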

For the first time, a comment that starts "Imagine a Beowulf cluster..." might actually be on topic.

More seriously, the Internet2 is designed for transferring massive scientific data sets between research institutions. The folks at CERN are planning to run experiments that generate terabytes of data per second. They're no doubt going to be using buckets of RAM and monster arrays of drives operating in parallel to keep on top of that. They wouldn't be developing this kind of network if they didn't need it.

"How did they sustain a transfer like that? Unless my math is wrong, that's 11GBps... what has that kind of read/write speed?"

Good point, but that's the aggregate throughput of the data pipe and not necessarily generated or used by any two single end-point devices. They may test it this way as a proof of concept, but it's more likely that 1000 computers in a lab on one coast would send that total data through such a link to a lab on the other coast.

As this was an experiment, it is likely that they merely sent pseudo-random data. Probably even just the same blocks of data repeatedly. You don't need a large data set to generate traffic for testing purposes, just something that is easily verifiable.
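One common trick along these lines: both endpoints derive identical pseudo-random blocks from a shared seed, so the receiver can verify the stream without any real data set on disk. A minimal sketch (requires Python 3.9+ for `randbytes`):

```python
# Verifiable test traffic: sender and receiver generate the same
# pseudo-random blocks from a shared seed, so integrity can be
# checked without storing a real data set anywhere.
import random

def test_blocks(seed, block_size, count):
    rng = random.Random(seed)
    for _ in range(count):
        yield rng.randbytes(block_size)

sender   = list(test_blocks(seed=42, block_size=1500, count=1000))
receiver = list(test_blocks(seed=42, block_size=1500, count=1000))
assert sender == receiver   # identical streams, nothing kept on disk
```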

People comment about the bandwidth of a car full of DVDs or tape drives and the like, but do they ever stop to think about exactly how LONG it takes to write information to the medium? Driving from one place to another with the data is trivial, but converting the raw data into the transportable medium takes absolutely forever.

This might be a bit derogatory, but I've heard that in England they toss midgets (some sort of bar game), and surely the information content of a midget is much more than 200MB. So the Brits have transferred information in their drunken stupor for centuries faster than these dudes.

If anybody shorter than me (5'11") is offended by midget tossing, blame the Brits not me.

Blame the Brits? Unfortunately, this bar game is around in the States, too. The problem is not just that most dwarfs (we call ourselves "dwarfs", or "short-statured") see this as degrading, but that it is dangerous. Not that we're particularly worried about the dwarfs that subject themselves to this - they are probably aware of the risks, even if they are ignoring them - but the fact that this is seen as "acceptable" creates a danger for Joe Dwarf walking down the street, in that some day, some drunk ass will try it on someone who never volunteered.

They could probably get much better speeds if they compressed it first. The Library of Congress is quite compressible, as there is a lot of redundant data. Text in general is known to be quite compressible.
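A quick demonstration with zlib. The sample text is made up, and this is a best case (real scanned LoC material would compress far less), but it shows how dramatically redundant text shrinks:

```python
# Redundant text compresses dramatically.
import zlib

page = (b"The Library of Congress holds many copies of many similar "
        b"phrases, and similar phrases compress very well. ") * 1000

packed = zlib.compress(page, level=9)
print(len(page), len(packed))      # 107,000 bytes down to well under 1 KB
print(f"ratio: {len(page) / len(packed):.0f}x")
```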

Here's a question. Sure, you can send 101 Gigabits per second. But what kind of power do you need on either end to send or interpret that much data? I know my hard drive doesn't go that fast. I don't even think my RAM is that fast.

That was kind of my point. It would be nice to have this kind of connection running underneath the ocean to link continents. However. We first have to have a line under the ocean that can handle that kind of bitrate. We also need a computer (or a cluster of computers) on either end that is capable of dealing with all this data, parsing the packets, routing it, and doing anything else that may be necessary.

SCTP was specifically devised as a replacement for TCP, as it can emulate the 1 -> 1 connection of TCP but can do connection-based 1 -> N too. I thought it had been designed with high speed in mind as well. Does anyone know whether this protocol is being used more and more, or has it just become another good-idea-at-the-time that got run over by the backwards compatibility steamroller?

SCTP is also connection-oriented and provides all the transport
services that TCP provides. Many Internet applications therefore
should find that either TCP or SCTP will meet their transport
requirements. Note, for applications conscious about processing
cost, there might be a difference in processing cost associated with
running SCTP with only a single ordered stream and one address pair
in comparison to running TCP.

This is great and all, but has anyone stopped to ask why we need such fast networks? The stock-frenzy driven surplus of unneeded bandwidth was a major contributing factor to the dot-com bust. I remember when I was working on a multi-gigabit, next-generation optical switch, and the project manager was assuring us that in just a few years, people would be downloading their movies from Blockbuster instead of actually traveling there to pick up a DVD. We were all supposed to be videoconferencing left and right by now, with holographic communications just around the corner. A massive growth in online gaming was supposed to cripple the existing legacy networks, forcing providers to upgrade or perish. All of this was supposed to generate a huge demand for bandwidth, which we were poised to deliver.

Well, as we all know, that demand never materialized. We had way more bandwidth than the market needed, and when the bandwidth finally became stressed, providers opted to cap bandwidth and push less-intensive services rather than pay for expensive upgrades to their infrastructures.

I think we should instead be focusing on technologies that can a) generate real new revenue for the providers that we're trying to sell these ultra-fast networks to, b) have obvious and legitimate research or quality of life improvements, and c) are sure-fire hits to attract consumer attention (and $$$).

Don't get me wrong, this is very cool and all, but until Netflix actually lives up to its moniker and sends me my rented movies through my phone/cable line rather than UPS, then it doesn't really matter to me if the network is capable of 5 Gbps or 500 Gbps. Slashdot will still load in a second or 2 either way. We need real products to take advantage of this massive bandwidth, and that revenue will drive research even further, faster. I fear we're going to stall out unless we find a way to embrace these faster networks and make money off of them.

Just because "the future" isn't happening in N. America yet doesn't mean that it isn't happening elsewhere. N. America is constrained by its last-mile problem, but Asian nations like S. Korea and Japan don't suffer this, which is why they already have multi-megabit fiber drops to homes and businesses. Sure, we on our minuscule ADSL and cable hookups may not see the need for such massive bandwidth since we can't use it, but when you have a 1000-unit apartment complex with 100Mb fiber drops, this kind of interconnect starts to make sense.

Instead of looking at the possibility of beefing up your catalog of Futurama episodes, think about the new uses for it.

Medical imaging produces very large files, and the need to transfer them over distances quickly to save lives is real.

The possibility for video is great as well. Imagine getting multiple feeds of the next WTO event from different sources on the ground. Or quality alternative broadcasting that isn't just some postage-stamp-sized, pixelated blobs. Torrents are nice, but there is something to be said for being jacked in live.

...how fast this could transfer the sum of all data (DNA, memory, etc.) contained in a human.

Yes, I'm kidding. But only half kidding. In some crazy future where we can reconstitute energy into matter, how much bandwidth would be needed to do this practically? Do we even have any ideas or estimates on how much storage would be needed to accurately represent the nature of the human body in terms of data? And no, I'm not talking about the "memory" of the brain - I'm talking about the physical manifestation of the body itself, of which the memory of the brain is a part.
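For fun, here is that estimate with some loudly arbitrary assumptions: roughly 7 × 10^27 atoms in a 70 kg adult, and 100 bytes per atom of "state" (both figures are just placeholders, not physics):

```python
# Order-of-magnitude only: encode an adult human at an assumed
# 100 bytes per atom and push it through the 101 Gbit/s link.
atoms      = 7e27            # rough atom count for a 70 kg adult
bytes_each = 100             # arbitrary encoding assumption

total_bits = atoms * bytes_each * 8          # 5.6e30 bits
seconds    = total_bits / 101e9
years      = seconds / (365.25 * 86400)
print(f"{years:.2e} years")                  # ~1.76e12 -- don't hold your breath
```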

...how fast this could transfer the sum of all data (DNA, memory, etc.) contained in a human.

Another poster has already provided an excellent summary of how long it would take to transfer a whole 'human', assuming 100 bytes per atom.

I will note that DNA is actually easy. Since it's massively redundant--just about every cell has a copy of the same stuff--you only need to send it once. The entire human genome is three billion (3E9) base pairs. Each base is one of only four possibilities, so that's just two bits per base: 6E9 bits, or 750 megabytes, before any compression.
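Two bits per base makes the genome transfer almost comically fast at these speeds. A sketch that ignores protocol overhead and latency:

```python
# The whole (uncompressed, 2-bit-packed) human genome over the record link.
base_pairs    = 3e9
bits_per_base = 2                 # A, C, G, T -> 2 bits

genome_bits = base_pairs * bits_per_base     # 6e9 bits = 750 MB
seconds = genome_bits / 101e9
print(f"{genome_bits / 8 / 1e6:.0f} MB in {seconds * 1000:.0f} ms")  # 750 MB in 59 ms
```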

Such a blazingly fast connection is amazing, but how the hell do they get the data onto and off the pipe? Are the disk read and write speeds up to that? What about the RAM? How the hell do they do that!!!

Some Perceived Problems with the Introduction of Terabit Network
Technologies.

This short paper attempts to highlight some potential problems
associated with the introduction of high speed networking -
specifically at the Terabit per second level. These problems are still
in the theoretical arena as practical Terabit networks are probably
still several weeks away from fabrication.

Introduction.

The primary problem when considering Terabit networks must be the
enormous speed that the packets on such networks will be traveling.
Naturally there are problems at the protocol level with very large
window sizes necessary for useful throughput, and enormous quantities
of data "in flight" at any one point. However, these problems
are encountered at the Gigabit level and are solvable in principle (by
appropriate window and packet size negotiation for instance).

The major problem that is perceived at such high speeds is that data
is now flowing at a significant fraction of the speed of light. This
brings into play a number of relativistic effects that must be taken
into account when designing such high speed networks.

Physical Considerations.

There are firstly a number of physical considerations that must be
taken into account. These are problems associated with any body
traveling close to the speed of light (c).

A large amount of energy is required to accelerate the
packets to the required velocity. However, the closer one approaches
c - the more of that energy is transformed into mass. Thus
packets will become heavier. A related problem is the slowing down of
packets, when they enter conventional lower speed Megabit networks. The
large amounts of energy that have gone into accelerating the packets
and giving them extra mass will be lost. This will require large heat
sinks. Cable fractures may also be explosive in these cases (which is
in keeping with the abbreviation TNT Terabit Network Technologies).
Alternatively, a special large coil of cable could be used to allow
the packets to naturally slow down.

Networks often need to be laid to fit the physical shapes
of buildings and other infrastructure. When any body traveling close
to c undergoes acceleration it tends to emit "braking
radiation" or bremsstrahlung. This is particularly noticeable when
bodies have to undergo angular acceleration when turning corners.
Thus any bends in the cabling will have to be heavily shielded with
lead plates to stop the intense burst of high energy particles.
At high enough speeds, following the curvature of the Earth may also
prove a significant source of energy loss.

Whilst traveling at high speeds, the packets will undergo
time-dilation effects. Thus whilst two ends of a connection may agree
on a round-trip time for a packet, this may well differ from the
packet's perceived RTT. The packet's estimate of the RTT will be shorter
than the measured delay. Therefore if timestamps are kept in the packet
this will lead to confusion.

When a body is traveling at high speeds, it tends to shrink in the
direction of travel. This means that a packet taking 1400 bytes
might actually take up only 1300 bytes of space on the network. This
leads to more capacity being available than might first be perceived.
However, all routers must be able to handle packets at speed, to stop
them suddenly growing. This leads to circuit switching being possibly
a better base technology.

A perhaps more serious problem is the case of collisions on a network
technology such as ethernet. The collision of two very high speed
packets could give rise to many spectacular effects, equivalent to
those seen in current particle accelerators.

I read a lot of: "Is this needed? Let's be clever and ask ourselves what we are doing..."

Frankly, it is hilarious coming from folks who probably jumped on GMail, iPods, stupid phones which do everything but work when needed, and other devices which are arguably the most un-needed gadgets on the planet. (No, you won't get me to believe your 200MB of email is worth keeping...)

Ciao

As a reminder, the ALICE experiment at CERN will produce 1 PB (petabyte) of _raw_ data per year. This is only _one_ experiment out of _four_. Add DB overhead and you start getting the picture. And no: there won't be backups: too big.
Particle physics is by nature statistical. The search is for slight deviations from what is predicted. So the amount of raw data is huge. In some cases, the rate of raw data produced per second will be orders of magnitude bigger still.

It is thought that some data will not be stored at CERN at all, but sent straight to remote storage farms. Too much data to store locally.

The people analysing those data will be scattered over the planet, which indeed creates the need for big transfers.
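For scale, here is one year of ALICE's raw output over the record link, assuming the full 101 Gbit/s is available end to end (a sketch; decimal petabyte):

```python
# Shipping ALICE's yearly 1 PB of raw data over the 101 Gbit/s record link.
PB_BITS = 1e15 * 8            # decimal petabyte, in bits
link    = 101e9               # bits per second

seconds = PB_BITS / link
print(f"{seconds / 3600:.1f} hours")   # 22.0 hours -- and that's one
                                       # experiment of four, at full line rate
```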

Ha ha ha: is this needed? Hi hi, let's think about it... Please dump all the crap data you pretend to need and then ask the question again.