Posted
by
timothy
on Tuesday October 21, 2008 @10:26AM
from the build-me-a-beowulf-cluster dept.

alphadogg writes with a story that's about the possibilities for the next generation(s) of Ethernet, stuff far beyond 10base-T: "Ethernet has conquered much of the network world and is now headed deep into the data center to handle everything from storage to LAN to high-performance computing applications. Cisco, IBM and other big names are behind standards efforts, and while there is some dispute over exactly what to call this technology, vendors seem to be moving ahead with it, and it's already showing up in pre-standard products. 'I don't see any show-stoppers here — it's just time,' says one network equipment-maker rep. 'This is just another evolutionary step. Ethernet worked great for mundane or typical applications — now we're getting to time-sensitive applications and we need to have a little bit more congestion control in there.'"

You probably don't have a storage area network running over some proprietary fiber protocol, a high-performance proprietary cluster, or a supercomputer around, do you? All those things are fading out as Ethernet evolves to do those kinds of jobs, but they haven't disappeared yet.

FTA: "But in its current state, Ethernet is not optimized to provide the service required for storage and high-performance computing traffic -- speed alone won't cut it, vendors say. Ethernet, which drops packets when traffic congestion occurs, needs to evolve into a low latency, "lossless" transport technology with congestion management and flow control features, CEE and DCE backers say."

If I understand right, they're trying to change Ethernet because of TCP/IP? Isn't that kinda backwards as a concept?

Something new won't sell. People won't adopt revolutionary products as easily as they will adopt incremental upgrades with a known and trusted brand. So calling it "Uber-fiber hyper gylde" won't sell as well as "Ethernet v10".

People will deal with confusion. They deal with it all the time. It's the only way they know to deal with the walrus.

Since nobody buys anything not labeled Ethernet, it's going to be called Ethernet anyway. Maybe Ethernet+ or Ethernet Ring or some BS marketing term.

The technology has changed substantially since Xerox was pumping 3Mbit/s through a coaxial cable, but we continue to call it Ethernet because PHBs don't want to bother implementing a new networking standard.

No. Ethernet uses collision detection and resending to manage packets. Well, it used to anyway; I am not sure about Gigabit Ethernet. The way Ethernet used to work is that a sender would listen to see if the line was clear, then send a packet while listening at the same time. If the packet was damaged by a collision, the sender would wait a random amount of time and then try to resend. This system really bugged a lot of people, but it was cheap and it worked. This is the actual physical layer, not TCP/IP.
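The listen-send-backoff loop described above is easy to sketch in Python. This is a toy model of classic CSMA/CD, not driver code; the 16-attempt limit and the exponent cap at 10 are the values the original standard used.

```python
import random

def backoff_slots(attempt, max_exponent=10):
    """Truncated binary exponential backoff as classic Ethernet used it:
    after the n-th collision, wait a random number of slot times chosen
    uniformly from [0, 2**min(n, 10) - 1]."""
    k = min(attempt, max_exponent)
    return random.randrange(2 ** k)

def send_with_csma_cd(try_send, max_attempts=16):
    """Keep retransmitting until the frame gets through or 16 attempts
    fail (at which point real hardware reports an excessive-collision
    error). `try_send` returns True on success, False on a collision."""
    for attempt in range(1, max_attempts + 1):
        if try_send():
            return attempt
        wait = backoff_slots(attempt)  # slots of 512 bit times each
        # (a real NIC would idle for `wait` slot times before retrying)
    raise RuntimeError("excessive collisions: frame dropped")
```

The randomness is the whole trick: two colliding stations almost never pick the same backoff twice in a row, so they naturally de-synchronize.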

Ahh thank you. I don't know that much with regards to networking, but I would imagine well before this standards issue haven't there been many forms of this already implemented, say as routing ethernet connections to switches? I mean only one true "Sender" is on each ethernet line, right?

Ahh thank you. I don't know that much with regards to networking, but I would imagine well before this standards issue haven't there been many forms of this already implemented, say as routing ethernet connections to switches? I mean only one true "Sender" is on each ethernet line, right?

Wrong.

Collisions still occur when multiple computers try to talk to a single computer at once. Of course this is an extremely common scenario in a typical client-server network. However, with a switch, at least one packet always gets through.

No. Switches don't notify the source that the packet was dropped. TCP's retransmit works without explicit notification.

Some switches report port buffer overflow as a collision.

Consider dropping a packet destined for a remote system with high latency. The RTT used to tune TCP retransmits might be several thousand milliseconds, whereas if the switch reports that it dropped the packet, the CSMA/CD retransmit time would be MUCH shorter.

Clearly if the congestion issues are transient and temporary and the laten
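To put numbers on that timescale gap: the standard TCP retransmission-timer update (RFC 6298) smooths the RTT, tracks its variance, and sets RTO = SRTT + 4 * RTTVAR. A minimal sketch, with the standard gains:

```python
def update_rto(srtt, rttvar, sample, alpha=0.125, beta=0.25):
    """One step of the TCP retransmission-timer estimate (RFC 6298):
    smooth the measured RTT, track its mean deviation, and set the
    retransmission timeout to SRTT + 4 * RTTVAR."""
    rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
    srtt = (1 - alpha) * srtt + alpha * sample
    rto = srtt + 4 * rttvar
    return srtt, rttvar, rto
```

On a path with a 2000 ms RTT and some jitter, the RTO lands in the multi-second range, while a link-level retransmit after a reported drop happens within microseconds.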

Wrong. There is no "collision detection", the only way to tell that you had a collision is if the packet didn't get there. If two devices transmit at the same time, you get a mangled packet that won't pass CRC and gets dropped.

What they're really looking for is token ring, which was (and still is) a superior protocol. For Ethernet, as you increase bandwidth utilization beyond 10%, you get so many collisions that your throughput goes through the floor, while for token ring the throughput degradation is much more gradual. For bandwidth utilization above 10%, token ring is far superior to Ethernet.

Why Ethernet was adopted over token ring has more to do with religious issues and who can scream the loudest and bully their way through technical issues with emotion than it has to do with technical superiority.
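The shape of that utilization claim can be pictured with a deliberately crude toy model. This is slotted-ALOHA flavour contention math, NOT a faithful simulation of CSMA/CD or 802.5; it only illustrates why a contention scheme collapses under load while a token scheme merely saturates.

```python
import math

def contention_toy_throughput(offered_load):
    """Toy contention model: a slot carries a frame only if exactly one
    station transmits, so useful throughput is G * e**(-G), which peaks
    and then collapses as offered load G grows."""
    return offered_load * math.exp(-offered_load)

def token_toy_throughput(offered_load):
    """Toy token-passing model: there are no collisions, the ring just
    saturates, so throughput is min(G, 1)."""
    return min(offered_load, 1.0)
```

At twice the channel capacity (G = 2) the contention model delivers about 0.27 of a channel while the token model is still saturated at 1.0, which is the gradual-versus-cliff degradation the comment describes.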

I think it had a lot more to do with cost. Ethernet was available first and had more hardware suppliers, so the cost went down. Token ring was really popular with IBM; it was almost a standard for IBM systems. I have a few Microchannel Token Ring adapters if you need them :) FDDI is token ring on fiber.

Huh? In Ethernet, which is CSMA/CD, you listen to the wire before starting to transmit. This doesn't avoid all possible garbled packets, but it does avoid most if things are working to spec. Also, because VTT goes higher than normal transmit levels during a collision, there IS detection. The reason Ethernet won over token ring is that IBM charged a hefty fee per port for token ring. There were also real-world reliability problems with token ring, as early designs used a physical ring or string layout instead of a star.

What they're really looking for is token ring, which was (and still is) a superior protocol. For Ethernet, as you increase bandwidth utilization beyond 10%, you get so many collisions that your throughput goes through the floor, while for token ring the throughput degradation is much more gradual. For bandwidth utilization above 10%, token ring is far superior to Ethernet.

That was true in hub based networks, but hasn't been true for a long time in switch based networks.

I think GP was referring to two packets coming in on two switch ports, both destined for a third. Even if the switch doesn't buffer one of the packets (or frames, whatever the appropriate jargon is for layer 2), it can still send the other one out the third port.

In the case where the second incoming packet is dropped and not buffered, I don't see why the switch can't jam the collision line on that port to notify the sender, so long as it detects the drop in time.

Collisions still occur when multiple computers try to talk to a single computer at once.

Collisions occur when there is more than one sender on a collision domain; they don't have to be sending to the same host. Imagine you have four computers on a hub. Computer A sends a message to B while C simultaneously sends a message to D -- this is a collision.

When a packet is sent to a hub, the hub immediately sends it out all ports -- it's like a set of spliced-together wires. A switch switches: it tries to figure out which port the destination is on and forwards the frame only there.
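The hub-versus-switch difference boils down to a MAC learning table. A minimal sketch (port numbers and method names are illustrative, not any vendor's API):

```python
class LearningSwitch:
    """Minimal model of the behavior described above: a hub repeats
    every frame out every other port, while a switch learns which MAC
    address lives on which port and forwards only there."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.mac_table = {}  # MAC address -> port number

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame goes out of."""
        self.mac_table[src_mac] = in_port  # learn the sender's location
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]  # known: one port only
        # unknown destination: flood like a hub (every port but the source)
        return [p for p in range(self.num_ports) if p != in_port]
```

Until the table is populated the switch floods exactly like a hub; after one frame in each direction, traffic between two hosts touches no other port.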

Collisions occur when there is more than one sender on a collision domain; they don't have to be sending to the same host. Imagine you have four computers on a hub. Computer A sends a message to B while C simultaneously sends a message to D -- this is a collision.

We are really just talking about how collisions occur on a switch. Technically, they CAN'T occur on a full duplex switched network. The collision domain is the switch port and the PC port, and both can talk at once (full duplex).

Hypothetically though, if you set aside buffering, a 'collision-like' conflict occurs when multiple PCs try to talk to a single port: one gets through and the rest are 'blocked', which is what I was trying to say. Of course, due to buffering, this is 'handled' and the conflict is actually pushed back to when the buffer overflows instead.

And yes, switches do have outbound buffers for each port so that if two sources try to send to the same host they can be done in sequence rather than causing an outbound collision on the destination port's collision domain. I am not sure what happens if this buffer becomes full, I had always assumed the switch would just begin dropping the packets (as indicated by this Cisco document).

Dropping packets is one option. The other is to use 'back pressure' to signal the PC to back off a bit. This can be done by sending 'fake collisions' or via 802.3x Flow Control 'pause' frames. Many switches support these modes, including those from Intel and Cisco.

It's often better to just drop the packets and let TCP deal with it, but in some cases you can get better performance by using flow control/back pressure features.
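The 802.3x pause frame mentioned above has a very small on-the-wire format: the reserved MAC-control multicast destination, EtherType 0x8808, opcode 0x0001, and a 16-bit pause time in quanta of 512 bit times. A sketch of the layout (padding to the 60-byte minimum and the trailing CRC are left to the NIC here):

```python
import struct

PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved MAC-control multicast
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001

def build_pause_frame(src_mac, pause_quanta):
    """Build the header + payload of an 802.3x PAUSE frame. The pause
    time is a 16-bit count of 512-bit-time quanta; 0 means 'resume'."""
    if not 0 <= pause_quanta <= 0xFFFF:
        raise ValueError("pause time is a 16-bit quantum count")
    header = PAUSE_DST + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    payload = struct.pack("!HH", PAUSE_OPCODE, pause_quanta)
    return header + payload
```

A pause with quanta 0xFFFF asks the neighbor to hold off for the maximum interval; sending another pause with 0 releases it early.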

Yes, that is the case strictly at layer 1 of the OSI model. However, at layer 2 we have the switch. By segmenting the collision domain and creating one for each port rather than one for the entire unit, we no longer have collisions and CSMA/CD is no longer needed. Unfortunately, wireless still uses CSMA/CA, which operates similarly to what you described, except it requests silence on the 'wire' first before trying to send rather than retransmitting when a collision occurs. Switches are still part of Ethernet since they still carry standard Ethernet frames.

Doesn't it also send a big blast down the line to make damn sure that everyone knows the packet has been mangled? Not particularly important, but I always thought it was kind of like screaming profanities when something goes wrong.

Actually I think it still does. I got this from Wikipedia: "Despite the physical star topology, hubbed Ethernet networks still use half-duplex and CSMA/CD, with only minimal activity by the hub, primarily the Collision Enforcement signal, in dealing with packet collisions. Every packet is sent to every port on the hub, so bandwidth and security problems aren't addressed. The total throughput of the hub is limited to that of a single link and all links must operate at the same speed." Switches are different.

You can still buy hubs for 10BaseT. Clearly nobody trying to use Ethernet for a SAN is going to be using hubbed 10BaseT.

Nobody in their right mind uses 100BaseT hubs unless they just want to sniff the port (since most dumb switches do not have monitoring features). I've never even seen 100BaseT hubs, do they even exist? Well, there's no reason to buy one anyway even if they do.

There has not been a cost basis for using a HUB for at least a decade. There's no point even arguing about it any more.

"They may drop packages" -- do you mean frames? Pretty much the same thing as a collision. The problem with Ethernet is that it is non-deterministic. For really high-performance uses it is less than ideal. For most things it is just fine and dandy.

No, Ethernet itself is the reason those packets are dropped. It is possible to have IP on some other network, like token ring or FDDI, both of which actually achieve higher throughput than Ethernet for a given bandwidth. IP is known to be "unreliable" because there is nothing in IP that corrects for dropped packets, but those packets are dropped because of the network type that is used (or because of physical considerations on that network, like a disconnected cable or radio interference).

It is possible to have IP on some other network, like token ring or FDDI, both of which actually achieve higher throughput than Ethernet for a given bandwidth.

Nope, both of which have higher overhead than full-duplex ethernet. They have better throughput than half-duplex ethernet, which is about as useful as being better than avian carriers. Half-duplex ethernet should just be banned entirely. Maybe that would make Linksys wake up.

Nope, both of which have higher overhead than full-duplex ethernet. They have better throughput than half-duplex ethernet, which is about as useful as being better than avian carriers. Half-duplex ethernet should just be banned entirely. Maybe that would make Linksys wake up.

Half-duplex Ethernet corresponds to the way things work on a shared peer-to-peer radio channel. Like WiFi. (Which uses the Ethernet MAC and collision/backoff algorithms -- though I think the collision is inferred rather than detected directly.)

No, they want Ethernet as a transport to contain a lot of the features of TCP so that they can lay their own protocols on top of it while being able to assume it's a reliable transport. That will increase the cost of Ethernet by pushing that intelligence down the stack. Basically, the cost of Ethernet ports is plummeting compared to things like Fibre Channel due to economies of scale, so cash-strapped data centers are trying to use it for everything; but it's not ideally designed to handle those other protocols, so the other technology areas are trying to mold Ethernet to meet their needs. The way I see it, if the industry does what is right there will be no 100Gbit Fibre Channel; there will be 100Gbit FCoE adapters.

Since 10Gbit Ethernet has the collision domain defined as the two endpoints, there IS no longer a collision domain on the wire, just a virtual one in an oversubscribed switch. This isn't about guaranteeing transmission over the Internet; it's about guaranteeing reliability in a LAN/MAN/WAN Ethernet network. The idea is you will have one set of wires, one physical protocol with several personalities sitting on top. The biggies are TCP/IP and FCoE, but there are other things, like remote DMA, that can greatly benefit.

Oh, I agree completely; a converged network is desirable and some would say inevitable. When I was at Storage Networking World this spring there were multiple presenters who basically said the same thing: FC as a physical transport is dying. As an end user I think this is a GOOD thing. Give me a single dual fabric to manage for data, voice, storage, video, etc. and I'll be very happy, because it means fewer things to break, fewer technologies to learn and fewer products to master. I guess if I was a SAN specialist I might feel differently.

It sounds like they just want bandwidth reservation and isochronous transfers on Ethernet. Something that would establish a virtual circuit and then not drop frames. Something with an Asynchronous Transfer Mode, perhaps?

So they want to make networks more expensive for EVERYONE so that THEY can sell their products for less.

You don't have to buy switches which support the new features. Ethernet is just a low-overhead way to serialize frames onto 4 pairs of wire (or one pair of fiber).

The concept is problematic for a different reason: they believe that advanced features are needed because Ethernet doesn't have enough bandwidth. That will only be true for a few years; then everyone will be doing multiple 10Gbps links to switches with 500+ Gbps of backplane bandwidth -- you can even buy that today, it's just a bit pricey.

No, they want it because a LOT of money gets sunk into the development of new standards and PHY chips for other protocols that could be more easily implemented as a personality on top of a reliable Ethernet transport. A good example is Fibre Channel: most of the physical equipment and low-level signal encoding is the same between, say, 8Gbit FC and 10Gbit Ethernet, so if you had a reliable Ethernet layer with reliable switches you would only need one type of switch, one type of wire, and one type of admin.

For a high-performance system with a large number of nodes, the actual network to connect everything together can cost more than the CPUs and servers themselves. To get high performance from this network, everything has to be tied together so tightly that it is considered a component in itself: the network fabric. Also, the actual communication through the network cables is the slowest part of the system. So this price/performance ratio is what customers will be considering when buying a system.

I got like 10 responses; about half were a flamewar with some people. I thought switches and such already make up for this, and the comment above yours makes me concerned as well, the whole decommoditization aspect (if it's accurate).

This seems like a total kludge being put together by networking equipment vendors to find a way to differentiate their products and return to the days where a 10 Base-T hub was $1000.

Network gear is now mostly a commodity, except at the super high end.

The vendors hate that - so they are trying to push the host's functionality into the LAN gear instead. They don't want to provide "dumb pipes" any longer, they want to monkey around with the traffic and protocols, and provide virtual servers and such in their boxes.

Really, it's just an attempt to make the servers the commodity and their gear the expensive part.

Except... you already can implement this yourself with existing equipment and software, with much better control and no vendor lock-in. For low-end solutions, a Linux cluster works great behind an UltraMonkey front end. For higher-end ones, well, that's what IBM z-series mainframes are for.

Well, most of the good traffic-control algorithms are already provided as standard by most GOOD server OSes (Linux, OpenBSD, NetBSD, and the like), and most routers and router/switches provide those same algorithms, leaving the fast-n-basic switches as the only "dumb" devices (well, other than the CEOs, who are the dumbest devices in any system).

There is a drawback to making Ethernet too expensive by adding too much smarts to the system that I think people are missing. Infiniband isn't THAT much more expensive.

What? This isn't between Infiniband and ETHERNET! This is the convergence of Fibre Channel with Ethernet. This LOWERS the price of Fibre Channel, which is already king of storage networking in the enterprise. There is no way in hell this will HELP Infiniband in the corporate world. Infiniband will continue to live in HPC land.

Amen. We looked at TOE cards for iSCSI and found that they were absolutely uncompetitive with adding CPU power and running the stack in software. There's a reason for all the commoditization going on. And collisions? Dropped packets? That's only a problem if you designed a bottleneck in. As long as you're aware that storage interconnects are different from your ordinary bursty LAN, there is no problem. That said, I'm looking forward to an open source FCoE implementation.

From the article: "Ethernet is not optimized to provide the service required for storage and high-performance computing traffic -- speed alone won't cut it, vendors say. Ethernet, which drops packets when traffic congestion occurs, needs to evolve into a low latency, "lossless" transport technology with congestion management and flow control"

Q: Packet loss and traffic congestion are to Ethernet as:
A) blue screens are to Windows
B) registers are to assembly
C) mustard is to sausages

There's a draft of Firewire that hasn't hit yet that uses standard Ethernet cables. The port is supposed to be able to use either Firewire OR Ethernet and be able to switch between them depending on what it's plugged into. This sounds ideal to me.

Would have been nice if Apple put that into the new MacBook, since they were so tight on space for ports. Though it wouldn't surprise me if it did have that capability in hardware but it hasn't been worked out in software yet and will appear later as an update. Believe me, I'd happily cough up the $2 for that enabler.

Ethernet is more of a generic name than a specific thing. It's more like "soup" than it is like "VHS".

Ethernet started as a daisy-chained garden-hose-size coax cable with vampire taps. Then RG-58 with BNC connectors, then twisted pairs to a hub, then switching hubs, then wireless... Not much stayed the same, not speed, media, topology,... except maybe carrier-sense. It's basically a comforting name, with the Ethernet-of-the-day varying at the chef's whim.

Keeping the name while tossing out the last remaining bit of commonality is a bit bizarre.

Ethernet started as a daisy-chained garden-hose-size coax cable with vampire taps. Then RG-58 with BNC connectors, then twisted pairs to a hub, then switching hubs, then wireless... Not much stayed the same, not speed, media, topology,... except maybe carrier-sense. It's basically a comforting name, with the Ethernet-of-the-day varying at the chef's whim.

Even better, all of those are merely the physical layer. Ethernet itself is the formatting of the frame (destination, source, protocol, data (payload), CRC). However, that too has evolved with the addition of VLAN tags, MPLS, etc., such that Ethernet is more a colloquialism for a whole alphabet soup of standards that mostly work together.

As long as everyone keeps that in mind and has the good taste to add any heavy extra features like reliable transport as layer 3, things will be fine. Otherwise, it will prob
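The frame layout that comment describes (destination, source, optional 802.1Q tag, EtherType, payload, CRC) can be sketched directly. This builds an Ethernet II frame in Python; the FCS here uses zlib.crc32, which matches the IEEE CRC-32 polynomial, appended least-significant byte first, and it skips minimum-size padding for brevity:

```python
import struct
import zlib

def build_frame(dst, src, ethertype, payload, vlan=None):
    """Assemble destination + source MACs, an optional 802.1Q tag
    (TPID 0x8100 followed by the 12-bit VLAN ID), the EtherType, the
    payload, and a trailing CRC-32 frame check sequence."""
    header = dst + src
    if vlan is not None:
        header += struct.pack("!HH", 0x8100, vlan & 0x0FFF)
    header += struct.pack("!H", ethertype)
    body = header + payload
    fcs = struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF)
    return body + fcs
```

Note how the VLAN tag is just four extra bytes spliced in before the EtherType, which is exactly why it could be retrofitted without breaking the base format.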

Why do we still use Ethernet? Ethernet was designed to work with multiple-access cables in 10B2 and 10B5 layouts, with backoff algorithms and all the other stuff that goes with detecting and avoiding collisions. With 10BT all that stuff is irrelevant now: what 10BaseT effectively is, is a highly complex serial cable carrying just one machine's data to a router or switch. All the overhead of frame encoding and decoding and the collision system should be ditched and something more appropriate to a 1 -> 1 connection used instead.

Ethernet encoding is simple and cheap. CSMA/CD is gone with 10G and I haven't seen a 1Gbps half duplex connection.

Yes, half duplex should just be banned entirely, but if you can implement it in a $0.10 10Mbps ethernet chip, you can probably survive the added $0.01 in your 1Gbps adapter, even if it never gets used.

Fibre Channel over Ethernet has real promise, but these new requirements are a real stumbling block.

What many of the posters here may not realize is that storage traffic (and the "standard" SCSI it uses) is extremely intolerant of dropped frames. A link that in the TCP/IP world would be spectacular is wholly unsuited for block-level storage, which is too latency-sensitive to have time to recover from dropped data.

Since the most common cause of dropped frames within a data center is congestion, FCoE requires your Ethernet to implement frame-based flow control, which prevents the congestion from occurring, along with the resulting frame loss.

Fibre Channel over Ethernet has real promise, but these new requirements are a real stumbling block.

Something to note is that the Ethernet in FCoE is really not the same Ethernet we use today. The acronym really confuses things. The article offers some better names for the new Ethernet standard: "Converged Enhanced Ethernet (CEE)" and "Data Center Ethernet (DCE)." It really is the convergence of Fibre Channel and Ethernet, NOT Fibre Channel glued to the back of Ethernet. Think of it more like a gigantic leap for Ethernet (and TCP/IP eventually, as functionality is pushed down a few layers), not so much a do

Ethernet already has flow control at the link level -- they're called pause frames (and since all modern switches give you dedicated links to end workstations and have some amount of hardware buffering, collisions/overruns aren't an issue). Now, since the world really runs on IP (doing raw Ethernet would only ever work in the most local of LAN applications, which is rather pointless in most deployments), and IP has TOS bits (which every real modern router can classify, queue, and throttle per-queue, all in the hardware fast path with no additional latency), I'm failing to see what they're proposing to solve, since the problem is already solved. 1G/10G switches are used all over data centers and in HPC situations today (and have been for years)...
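Those TOS bits are still settable from ordinary user space. A small sketch: mark a UDP socket with DSCP EF (Expedited Forwarding, codepoint 46), which is 0xB8 once shifted into the old TOS byte, so DSCP-aware routers can queue it ahead of bulk traffic. (Behavior verified on Linux; other platforms may restrict IP_TOS.)

```python
import socket

def make_expedited_socket():
    """Create a UDP socket whose IP packets carry DSCP EF (46).
    The DSCP occupies the top six bits of the TOS byte, so the
    value written is 46 << 2 == 0xB8."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 46 << 2)
    return sock
```

Whether anything honors the marking is up to the routers in the path, which is the commenter's point: the classification hooks exist end to end already.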

doing raw ethernet would only ever work in the most local of LAN applications which is rather pointless in most deployments

Which is exactly the deployment FCoE and several upcoming ethernet uses are aimed at.

A handful of SAN boxes serving FCoE on the same segment as the servers they're serving. Basically the same way you provision FC today. The storage and servers are extremely local, and there's no reason to stuff IP in the middle when it will never be routed.

If you want less performance but the ability to route it over an IP network, use iSCSI.

They're doing it so as to get rid of specs that are cumbersome to translate to and from current Ethernet. The rationale is that in order to reach speeds of 40Gbps and beyond they will have to pull out bottlenecks in the frame-forwarding components of their switches.

Translating one frame protocol to another is probably a big overhead for buffering and processing (especially if there are dropped frames), so they want to take advantage of a spec with a huge installed base of network equipment and nodes but

Given the direction SATA & USB are going, the rate at which their bandwidth has increased relative to traditional CATx Ethernet, and the relatively lower cost of interconnection devices, is Ethernet really the best? If we're going to make significant wiring changes in server rooms, I'd prefer to just do it once and standardize on the cheapest, fastest "2-wire" solution that makes sense.

I do get the diff between the layers. What I'm trying to suggest is that if there's going to be some rewiring in the fishbowl, why not use the fastest, cheapest solution that has the potential to support all the scenarios? I don't have a 10 Gb/s server room, I have a little 1 Gb/s server room; the SATA & USB specs are either already greater than MY twisted-pair bandwidth at a lower cost, or will be there in the near future, and by the trends I've observed will continue to outpace twisted-pair bandwidth/$.

My point wasn't based on Infiniband or Myrinet, or any other interconnect. It just seems to me that all of this is a solution desperately searching for a problem.

Different technologies exist for a reason: they do what they are designed for as optimally as possible. Trying to make one network technology that does "Everything" is doomed to epic failure. It will become a compromise that does lots of things, and few of them well.

It is the high speed at a very low cost, and the ubiquity of the technology.

It doesn't really matter that the technology is inferior to the numerous very fast, low-latency solutions available today, because all of those solutions are also very high-cost, low-volume solutions relative to Ethernet, and have no ability to be incrementally upgraded. It doesn't matter that there are congestion issues when pushing a packet-switched technology to

Collisions became a non-issue the moment low-cost 10BaseT switches came into existence, which was, what, over 15 years ago? Don't bother arguing about collisions; not even my home network uses hubs any more. There are no collisions.

Congestion is a lot easier to handle than people seem to realize. All modern switches have packet buffers. Even the CHEAP switches have a few megabytes of buffer. GigE switches have flow control on top of that.

No more USB cables with a million different connector types. No more PATA or SATA cables.
No more serial or parallel cables. No more trying to figure out where to plug a given device
in on a motherboard or looking for spare PCI/whatever slots - Just one type of cable and
they all plug into a switch-like section of the motherboard.

Now, some devices (video cards as the most obvious) will still require extra power, but
most devices could probably manage with a variant on PoE, meaning the inside of your case
goes from a rat's nest of assorted cable types to a half-dozen or so tidy round cables.

* Yes, you can already get network enabled versions of these, but they count as a real
full-fledged network endpoint, not as a slave device "local" to a particular computer.

And sadly, you'd see the same issues with this standard too, because an Ethernet RJ-45 plug isn't appropriate to plug into a cell phone, digital camera, or MP3 player, but a 5-pin mini connector isn't appropriate to run 25 feet to a switch/router either.

I think the biggest reason there are hundreds of different USB connectors is that standardized plugs don't help sell 30-dollar Apple- or Sony-branded AC/DC adapters.
I really loved my old Motorola phone with the mini-USB connector; now my LG phone doesn't share its connector with any other device I own.

I disagree completely. All you ever need is power and signal. How did PC-Card makers make the connection? A slide out connector. Digital camera and mp3 player, they can use it too.

The 5-pin mini connector was designed to make it easier to get the orientation correct without bending the pins (it's called a DIN connector, and it's a German design/standard). I think it's appropriate to run 100 feet to a router, but the RJ physical standard is so much more reliable.

* Yes, you can already get network enabled versions of these, but they count as a real full-fledged network endpoint, not as a slave device "local" to a particular computer.

You can with protocols like iSCSI or ATAoE. A lot of enterprise gear uses iSCSI, which makes a remote device appear like a local SCSI device, and some consumer-grade NAS devices running Linux can act as ATAoE devices, which does the same thing but with ATA instead of SCSI and over raw Ethernet frames rather than over IP.
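ATAoE is a nice example of how thin a raw-Ethernet protocol can be. Per the public AoE spec, everything after the Ethernet header (EtherType 0x88A2) is a ten-byte header: version/flags, an error byte, shelf (major) and slot (minor) addresses, a command code, and a 32-bit tag for matching responses to requests. A sketch of packing it:

```python
import struct

AOE_ETHERTYPE = 0x88A2

def build_aoe_header(major, minor, command, tag, version=1, flags=0):
    """Pack the AoE header that follows the Ethernet header:
    version/flags nibbles, error byte (zero in requests), 16-bit shelf
    address, 8-bit slot address, command (0 = issue ATA command,
    1 = query config), and a 32-bit request-matching tag."""
    return struct.pack("!BBHBBI",
                       (version << 4) | (flags & 0x0F),
                       0,        # error field, zero in requests
                       major,    # shelf address
                       minor,    # slot address
                       command,
                       tag)
```

Because addressing is shelf/slot rather than IP, the whole thing lives in a single broadcast domain, which is why AoE is unroutable by design.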

I actually called Ma Bell and talked to some design engineers for the RJ connectors when I was thinking about wiring my first network. Phone jacks and Ethernet cables are ULTRA reliable when implemented properly. (Umm... did I tell you my first network is still in use and the cables are still working?)

The idea that every component uses them is extraordinary. If all my devices except my video card used Ethernet/RJ11 and were self-configuring? It sure would make everything a WHOLE LOT EASIER.

Ethernet has nothing to do with the connector type. It is a layer 2 protocol that sits on top of the physical transport medium. There is a little bit of overlap with things like wiring specs for distances and attenuation, but Ethernet itself doesn't really care what plugs or wires you use. Even if connectors were in the spec, it would still likely be extended to allow for new connector types to fit the appropriate devices (mobile phones, MP3 players, etc.).

Thus, for the consumer world you probably wouldn't see much difference on the user end. Developers, on the other hand, would have to start pushing their device drivers into the network stack in order to get them working. Say hello to firewalls and IDS/IPS on your HDD and video card.

First of all, SANs by inherent design have the ability to aggregate data across multiple ISLs (trunks) in real time. If you have two pipes between switches, your I/Os will be evenly distributed across the links, adjusting in real time as needed to fully utilize both links. Need more bandwidth? Simply plug in another ISL. Done.