"EVE Online's complicated inter-corporate politics are often held together by fragile diplomatic treaties and economic agreements. So fragile, in fact, that a single misclick can lead to a fracas that quickly snowballs into all-out warfare. That's what happened to two of the spacefaring sandbox MMO's largest player alliances in the Battle of Asakai, a massive fleet vs. fleet onslaught involving 3,000 players piloting ships ranging from small interceptors to gargantuan capital ships. Straight from the wreckage-strewn outcome of the battle, we're breaking down the basics of what happened for everyone to truly fathom one of the biggest engagements in the game's history." The cost of this battle in in-game currency is, so far, 700 billion. While MMOs don't float my boat, I have to say that this is still pretty awesome. Penny Arcade looks at the technical details server-side, and what a battle like this does to the game's backend infrastructure.

If you want to stretch the definition of "distributed system" to include the "least distributed possible" cases, then you could pretend almost anything is a distributed system (all the way back to the old "telnet into a server" MUDs, MUSHes and MOOs) and it becomes a meaningless joke.

Traditional MUDs... no. But there was some effort to make a distributed system of MUDs.

Note there is no mention of fault tolerance, load balancing, distributed processing, or anything of the sort.

I'm fine with that if that's what you want - let's call it a "barely distributed system" (no fault tolerance, no load balancing, etc; where the entire pile of crud has to be taken down for a few hours every week because they're too stupid to figure out live migration or even handle the complexities of symbolic links).

I agree with your gripe - only pointing out that fault tolerance and load balancing are only tangentially related to distributed systems architecture. Many distributed systems have none of those attributes, and many systems having those attributes are not distributed systems.

So; are they lying about it being a "more distributed than everything else" system?

If they run a large number of primarily independent servers that do most of their work on local independent datasets, only communicating to each other over narrow channels, then they are textbook distributed systems. The more state they share the less "distributed" they are...

"If you want to stretch the definition of "distributed system" to include the "least distributed possible" cases, then you could pretend almost anything is a distributed system (all the way back to the old "telnet into a server" MUDs, MUSHes and MOOs) and it becomes a meaningless joke.

Traditional MUDs... no. But there was some effort to make a distributed system of MUDs."

I was looking at Wikipedia's list of "distributed computing architectures":

You'll see that boring old "client-server" (potentially including one client on one computer talking to one server on a different computer) is the first architecture on their list.

In my opinion, boring old "client-server" (including multiple clients talking to one server, and multiple clients talking to multiple separate servers) is just client-server and doesn't really qualify as a true distributed system.

Now; EVE Online (as I imagine it) is a slightly more complex case of client-server. I'd imagine that each individual client is talking to at least 3 different servers (one for chat, one for the economy/trade, and another for "objects in space"); but despite this it's still all just client-server, and still doesn't really qualify as a true distributed system in my opinion.

If they run a large number of primarily independent servers that do most of their work on local independent datasets, only communicating to each other over narrow channels, then they are textbook distributed systems. The more state they share the less "distributed" they are...

These are all practical considerations only (e.g. shared state and/or heavy communication tends to kill scalability/performance, and local independent datasets are the result of minimising shared state and communication).

What does qualify as "true distributed" is when multiple computers work together, rather than independently. Google (many computers working in parallel for each query), Wikipedia (front-ends, caches, databases, media servers, etc. for each page request), BitTorrent, SETI@home, supercomputers.

I want to avoid getting lost in the weeds when it comes to terminology is all...

A distributed system is really just a collection of network-connected machines running software to facilitate a common goal.

BitTorrent is a good example of a distributed system. So are SETI@home and the other examples you gave.

But one of the things you mentioned is live migration. Here is one way to do live migration between two nodes, in simple layman's terms:

1. On the source node, snapshot the current server's complete state, and start taking incremental differential snapshots at set intervals.
2. Send the full snapshot to the destination node. Once it has it, start sending the incrementals until the destination is "caught up".
3. Freeze the state on the source node, send the last incremental to the destination and have it wake up, and then direct all clients to reconnect to the destination node.
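
The three steps above can be sketched roughly as follows. This is only a toy illustration, not how EVE or any real hypervisor does it: `SourceNode`, `migrate`, the `workload` callback and the `threshold`/`max_rounds` parameters are all made-up names, state is just a dict, and key deletions are ignored.

```python
class SourceNode:
    """Hypothetical server whose entire state is a plain dict."""

    def __init__(self, state):
        self.state = dict(state)
        self.dirty = set()      # keys written since the last snapshot
        self.frozen = False

    def write(self, key, value):
        if self.frozen:
            raise RuntimeError("source is frozen during cutover")
        self.state[key] = value
        self.dirty.add(key)

    def full_snapshot(self):
        # Step 1: copy everything and start tracking changes from here.
        self.dirty.clear()
        return dict(self.state)

    def incremental(self):
        # Only the keys that changed since the previous snapshot.
        delta = {key: self.state[key] for key in self.dirty}
        self.dirty.clear()
        return delta


def migrate(source, workload, threshold=0, max_rounds=10):
    """Copy source state to a new dict while `workload` keeps writing.

    `workload(source, round_no)` stands in for clients that continue to
    change state while each snapshot is in flight (an assumption made
    for this sketch).
    """
    dest = source.full_snapshot()            # steps 1 and 2: full copy
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        workload(source, rounds)             # state changes during transfer
        delta = source.incremental()
        dest.update(delta)                   # step 2: apply incrementals
        if len(delta) <= threshold:          # destination is "caught up"
            break
    source.frozen = True                     # step 3: freeze the source
    dest.update(source.incremental())        # send the last incremental
    return dest, rounds                      # clients would reconnect here
```

If the workload dirties more than `threshold` keys on every round, the loop never catches up and only stops at `max_rounds`, which is the "more state, changing faster, takes longer" problem in miniature.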

My point is this process has absolutely nothing to do with distributed computing - it is the same process you would go through if you only had one server and wanted to migrate to another one. It also has an Achilles' heel, namely the larger the amount of state and the faster it changes, the longer it takes to complete. There is almost always a significant amount of "lag" involved with a failover unless the amount of state is trivial.

All I was getting at is that EVE Online may be a poorly designed system in some respects (I really don't know much about it), but the deficiencies you mentioned don't have anything to do with it being more or less distributed...

Take SETI@home... How does it deal with a non-responsive node (i.e. a user's computer that goes offline)? It simply moves on and gives the work to someone else. The point is the system doesn't care about a node going away, because all a node is to it is a compute resource working on a small dataset. For SETI@home, compute cycles are the gold - they don't care about latency because it doesn't matter to them.
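
As a rough sketch of that "just move on" behaviour (not SETI@home's actual scheduler; `distribute`, the worker callables and the `max_attempts` cap are invented for illustration), a coordinator can simply requeue any work unit whose node never answered:

```python
from collections import deque


def distribute(units, workers, max_attempts=100):
    """Hand work units to workers round-robin, requeueing on no answer.

    Each worker is a callable that returns a result, or None to model a
    node that went offline (an assumption of this sketch).
    """
    pending = deque(units)
    results = {}
    attempts = 0
    while pending:
        if attempts >= max_attempts:
            raise RuntimeError("no worker finished the remaining units")
        unit = pending.popleft()
        worker = workers[attempts % len(workers)]
        attempts += 1
        result = worker(unit)
        if result is None:
            pending.append(unit)     # node vanished: give it to someone else
        else:
            results[unit] = result
    return results
```

Nothing here tracks how long a unit waits before it is finally processed, because for this kind of system only the total amount of completed work matters.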

EVE is the opposite - the problem they have to solve is reducing the latency to a large number of clients in a world represented by a shared state - latency is their gold. Performing compute in separate "islands" is the way they reduce latency.
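
To make the "islands" idea concrete, here is a minimal sketch under my own assumptions (I don't know how EVE actually partitions its world; `IslandNode`, `Cluster` and the region names are hypothetical). Each node owns the state for some regions, and the only traffic between nodes is a handoff when something crosses from one island to another:

```python
class IslandNode:
    """Hypothetical backend node that owns the state of some regions."""

    def __init__(self, name):
        self.name = name
        self.ships = {}                      # ship id -> region (local state)


class Cluster:
    def __init__(self, region_to_node):
        self.region_to_node = region_to_node # region name -> IslandNode
        self.cross_node_messages = 0

    def enter(self, ship, region):
        self.region_to_node[region].ships[ship] = region

    def move(self, ship, src_region, dst_region):
        src = self.region_to_node[src_region]
        dst = self.region_to_node[dst_region]
        if src is dst:
            src.ships[ship] = dst_region     # purely local, low-latency update
        else:
            del src.ships[ship]              # hand the ship off to another node
            dst.ships[ship] = dst_region
            self.cross_node_messages += 1    # the only inter-node traffic
```

Most moves stay inside one node and never touch the network between servers, so latency stays low as long as a single island doesn't get overcrowded.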

They are both distributed systems, but they are almost opposite in purpose and design. The bigger an EVE backend node is, the more clients it can handle with low latency, but the more state it has to manage. Live migration of a large amount of state, without negatively impacting the latency of the users, is simply not an easy problem to solve. Not saying it can't be done, just that it isn't trivial and it would almost certainly not be transparent to the users...