Posted by Unknown Lamer on Wednesday April 11, 2012 @10:29AM from the udp-reunion-tour dept.

An anonymous reader writes "Launched in 1995, SSH quickly became the king of network login tools, supplanting the old insecure mainstays TELNET and RLOGIN. But 17 years later, a group of MIT hackers have come out with "mosh", which claims to modernize the most annoying parts of SSH. Mosh keeps its connection alive when clients roam among WiFi networks or switch to 3G, and gives instant feedback on typing (and deleting). No more annoying network lag on typing, the MIT boffins say, citing Bufferbloat, which has been increasing latencies."
The folks involved have a pre-press research paper with the gritty details (to be presented at USENIX later this year). Mosh itself is not particularly exciting; the new State Synchronization Protocol it is based upon might be: "This is accomplished using a new protocol called the State Synchronization Protocol, for which Mosh is the first application. SSP runs over UDP, synchronizing the state of any object from one host to another. Datagrams are encrypted and authenticated using AES-128 in OCB mode. While SSP takes care of the networking protocol, it is the implementation of the object being synchronized that defines the ultimate semantics of the protocol."
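The quoted description of SSP is abstract, so here is a minimal, invented sketch of the core idea (not the real protocol or wire format, and all class and field names are hypothetical): the sender keeps numbered snapshots of the object's state, ships the receiver a jump from the newest version it has acknowledged to the current one, and garbage-collects versions the receiver has moved past. Because each datagram carries absolute version numbers, a lost or duplicated one only costs a retransmit, never corruption.

```python
# Toy model of SSP-style state synchronization (illustrative only).

class SyncSender:
    def __init__(self, initial_state: str):
        self.versions = {0: initial_state}   # version number -> full state
        self.current = 0
        self.acked = 0                       # newest version the peer confirmed

    def update(self, new_state: str) -> None:
        self.current += 1
        self.versions[self.current] = new_state

    def make_datagram(self) -> dict:
        # A real diff would be byte-level; here we just ship the target state.
        return {"from": self.acked,
                "to": self.current,
                "state": self.versions[self.current]}

    def handle_ack(self, version: int) -> None:
        self.acked = max(self.acked, version)
        # Versions the peer has moved past are no longer needed.
        self.versions = {v: s for v, s in self.versions.items()
                         if v >= self.acked}

class SyncReceiver:
    def __init__(self):
        self.version = 0
        self.state = ""

    def apply(self, datagram: dict) -> int:
        # Stale or reordered datagrams targeting an older version are ignored.
        if datagram["to"] > self.version:
            self.state = datagram["state"]
            self.version = datagram["to"]
        return self.version                  # ack: newest version we hold
```

The point of the model is that the receiver's ack, not a TCP-style byte stream, drives what the sender transmits next, which is what lets the real protocol run over unreliable UDP.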

Then they discover there was usually a good reason for something being done the way it was in the past. E.g., local echo was very useful for line-buffered programs such as MUDs and chat servers, or even for talking to SENDMAIL or an FTP server directly. It was easier to write the server to cope with input line by line rather than character by character, and it used up less network resources in the process.

.. a negotiable LOCAL_ECHO mode. Then they invented ssh, and dropped the LOCAL_ECHO and line-buffered flags, considering them archaic.
And 15 years later, LOCAL_ECHO is back in mosh!

Right. Breaking local echo in Telnet was a Berkeley misfeature. It was in 3COM's UNET, which predated Berkeley networking in UNIX. (Berkeley did not introduce networking in UNIX. Theirs was the third or fourth implementation, after ones from BBN, 3COM, and Phil Karn.) With UNET, circa 1983, Telnet had local echo until you used something like vi or the RAND full-screen editor, at which point the server noticed the stty call that switched to "raw mode" and switched to remote echo.

Seamless transition from local echo to remote echo is even older. It was in Tymnet [rogerdmoore.ca], which used markers called a "red ball" and a "green ball" to do the switch seamlessly.

Oh God! The flashbacks are killing me! Back in the mid-70's I worked for Tymshare (sister company/parent/?? of Tymnet) doing load testing on a project called OnTyme (commercial email). I was hip-deep in the Tymnet protocol trying to record and then re-create realistic pseudo-user-loads from different points in the country. Massive PITA.

That sounds like a step backwards to me. Any utility in that is lost when something doesn’t sync up properly. When I hit a key, I want to know it has been sent and received and see the result.. not see the result as my shell predicts it. Maybe I’m just having local echo flashbacks from past telnet experiences.

Everything else sounds really neat though. I don’t jump wifi often enough for re-connecting and re-attaching to screen to be a big deal.. but I can see the utility for those who do.

I've been using it for a few days now, and I find the local echo to be quite a useful feature. Many of the machines I remotely use are on different continents, and waiting for my keypress to make a round trip can be frustrating at times.

Mosh also makes it clear which characters have been successfully transmitted by underlining those that are still finding their way through the tubes... I've never been unsure about what has or has not been received.

After a few days of using mosh, I don't see myself going back to plain old ssh anytime soon.
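The underlining behaviour described above is easy to picture with a toy model (a hedged sketch, not mosh's actual code): the display is a confirmed prefix plus a locally predicted suffix, and the suffix is wrapped in the ANSI underline attribute until the server's echo confirms it.

```python
# Toy model of predictive local echo with underlined in-flight characters.

UL_ON, UL_OFF = "\x1b[4m", "\x1b[24m"   # ANSI underline on / off

class PredictiveEcho:
    def __init__(self):
        self.confirmed = ""    # text the server has echoed back
        self.predicted = ""    # text typed locally, awaiting confirmation

    def keypress(self, ch: str) -> None:
        self.predicted += ch

    def server_echo(self, text: str) -> None:
        # The server confirmed some prefix of our prediction; promote it.
        assert self.predicted.startswith(text)
        self.confirmed += text
        self.predicted = self.predicted[len(text):]

    def render(self) -> str:
        if not self.predicted:
            return self.confirmed
        return self.confirmed + UL_ON + self.predicted + UL_OFF
```

The real client is far more careful (it must cope with the server echoing something other than the prediction), but this is the visual contract: underlined means unconfirmed.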

Sounds like a great solution. Responsiveness is critical for user interaction, therefore, local echo is vital for high latency links. Knowing that the remote end has received the same thing you see locally (and if it's performing character-by-character filtering, seeing those results) is also important. Local echo, with remote echo verification.

It seems to me that for my typical usage it is going to have limited utility - I'm either in a shell where I'm leaning heavily on tab completion, or in vi where it would need to second-guess what vi is going to display.

IIRC in the original telnet protocol the list of keys that prompt synchronization with the server is negotiable. Normally that would be carriage return, and going to a full-screen editor would disable local echo, but adding tab to the list when in a shell should be trivial.

That sounds like a step backwards to me. Any utility in that is lost when something doesn't sync up properly. When I hit a key, I want to know it has been sent and received and see the result.. not see the result as my shell predicts it.

Easy to say until you are (attempting to) type a longish shell command into a server across an overloaded Internet connection. In that situation, you have two choices: either wait 1 second after each character to make sure you typed it correctly, or blindly type a bunch of characters at once and hope you didn't make any typos. And if you did make a typo, then you're really in the sh*t, because now you have to figure out how many characters you have actually typed since then, and press backspace EXACTLY that number of times.

So mosh has brought back the ages-old idea of local echo on the terminal. It disappeared as soon as terminal connections became faster than the old teletype links. I have often wished for such a feature in ssh, some kind of 'cooked mode'. However I usually run a 'screen' session on the other end of ssh, with emacs inside that, and finally a shell-mode under Emacs! Mosh would need to do something quite clever to enable local editing in that.

I think the point is that it will automatically do the reconnect for you. What I'm not sold on is that this now requires an arbitrary port to be open on the server side in order to connect to the mosh server, and how irritated are the security guys who control the edge firewalls on your corporate network going to get?

And many (most?) SSH clients support auto-reconnect on short network drops. And many even support reconnect on IP change (like when switching wireless networks or to 3G). And you can even configure your tmux (way better than screen) session to connect on login.

Thus, many SSH clients already do everything that MOSH does, but without having to install any new software anywhere.
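For reference, the kind of setup the parent describes can be approximated with stock OpenSSH options plus autossh and tmux. This is a sketch of one common configuration, not an endorsement of the specific values:

```
# ~/.ssh/config -- detect dead connections quickly instead of hanging
Host *
    ServerAliveInterval 15
    ServerAliveCountMax 3

# Reconnect automatically and reattach (or create) a tmux session:
#   autossh -M 0 myserver -t 'tmux new-session -A -s main'
```

With that, a dropped link gets torn down within ~45 seconds, autossh redials, and tmux puts you back where you were; mosh's pitch is doing the same thing transparently and without the reconnect gap.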

And many (most?) SSH clients support auto-reconnect on short network drops.

It's not even a case of reconnect. A TCP connection lasts forever or until one side says "stop"; all the client has to do is *not* explicitly time out after N seconds.
Misconfigured NAT devices tend to fuck this up though; one of many reasons NAT is evil.

This. My guess is fixups for automatically opening pinholes for Cisco gear will be a long time coming for this new protocol. Really this needs to be integrated into the SSH server and run on the SSH port to be useful in a modern enterprise environment.

So that means it's just like GNU Screen? ctrl+a d on one connection, hop wifi, ssh and screen -x. wow. Really?

Except the feature will be there even if you don't want it, I guess.
Who cleans up the dead sessions, I wonder?
It's already a problem with screen in some setups I've seen; people create sessions and then forget about them.

(Screen or similar software. I don't care what the latest and greatest is; we're discussing the general feature of attached/detached terminal sessions.)

Yeah, always had issues with that. I always set to CTRL-G, since the only thing I ever use that key for is bailing out of thinkos in emacs, and double-pumping that in non-screen mode is pretty harmless.

"To bootstrap an SSP connection, the user first logs in to the remote host using conventional means, such as SSH or Kerberos."

I stopped reading where it said they use UDP. People who say "I can outperform TCP" are almost always wrong, and I'm quite fed up with badly behaved UDP-based protocols.
Citing the "bufferbloat" theory is also a bad sign, but that may be just a misleading summary.

It's a neat idea for those who are currently in areas with spotty wireless coverage, but for most users I don't think it's that much of an issue, even at the moment.

Fast forward five years and I just don't see this software being all that useful. Sure, there's always gonna be that handful of people who will scream that this is extremely useful because they're always hopping between wifi hotspots, but most users are using 3G/4G when they're on the move, and coverage for those is already "good enough" in most civilized places and steadily improving. I've taken 5+ hour train trips several times and only had ssh connections drop once or twice on those trips (due to spotty coverage in what would qualify as the middle of nowhere in northern Sweden).

This is like "solving" the IPv4 address exhaustion problem with NAT, it's a neat workaround but doesn't actually solve the problem.

Satellite links, network congestion/delays, and other sources of high latency aren't going to magically disappear in 5 years, nor 10, 15, or 20 years. Until you can bypass the speed of light (in x transmission medium) as the limiting factor, this is useful.

Fast forward five years and I just don't see this software being all that useful. Sure, there's always gonna be that handful of people who will scream that this is extremely useful because they're always hopping between wifi hotspots but most users are using 3G/4G when they're on the move

Dunno about 4G, but 3G has enough latency to make ssh annoying, so Mosh would definitely be an improvement.

I just started using it (after seeing this article) to connect from my laptop, which I suspend and carry in my backpack from work to home. Opened the lid, and the session is still seemingly intact after the few seconds it takes to find my home wifi. No 4G connection in the world is gonna help a device that's effectively turned off.

The "problem" with screen is that it requires manual intervention to resume. FWIW I've been using screen for this for the past ten years or so (well, tmux the last one or two...), but it's still annoying to have to reconnect and attach. mosh handles the reconnection transparently.

I see the need for this all the time. It's commonplace in large enterprises like hospitals, factories, and financial services corporations.

Example: I'm working on my hospital laptop. I get called urgently to do something elsewhere in the hospital so somebody won't die right now. I grab the lappie and run, then when I get to the theatre I plug into the malfing imager and fix it. Meanwhile all my SSH connections died because I crossed three wireless boundaries at high speed.

If we only ever built technology that appeared immediately useful for at least 95% of the population we would still be trying to figure out how to transport messages across long distances without using a horse.

I work with tele-operated robots, and I must say this is an amazingly useful feature. The robots can establish a connection with either WiFi or 3G, and are meant to navigate in indoor environments. With WiFi, you can go a short distance before losing signal. With 3G, you can go a short distance before losing signal.

People with laptops or mobiles seldom notice the dead zones - they don't suddenly stop walking whenever they hit one.

This is like "solving" the IPv4 address exhaustion problem with NAT, it's a neat workaround but doesn't actually solve the problem.

I think you're not focusing on what the actual "problem" is. NAT really is a bandaid solution for a lack of IP addresses. NAT does solve the issue of multiple devices sharing a common WAN address.

Mosh addresses the issue of connections essentially being treated as static routes, no more, no less. The problem is the proliferation of different protocols and networks, and devices which attempt to seamlessly hop between them to remain in a coverage area. While the hopping bit works quite well (as soon as I am ne

We tried to put OCB mode in 802.11i. So IBM sent a guy to explain the 'licensing terms' for their patents on OCB mode. The next vote in 802.11i after that presentation was to replace OCB mode with CCM.

Until the patents expire or are freely licensed, OCB mode should be considered off limits for free and open projects.

Two U.S. patents have been issued for OCB mode. [1] However, a special exemption has been granted so that OCB mode can be used in software licensed under the GNU General Public License without cost, as well as for any non-commercial, non-governmental application. Since the authors have only applied for patent protection in the U.S., the algorithm is free to use in software not developed and not sold inside the U.S. [2].

I've looked into mosh recently, and it is GPL. The battle would be: does mosh live under Rogaway's OCB patents, which make it free, or IBM's patents, which make it unclear... From a "money is justice" perspective, I dunno if UC Davis would win against IBM, but they'd have better odds.

IP roaming looks nice & ought to be secure with the right steps (no reply from the old IP:port, correct crypto negotiation with the new IP:port).

But LOCAL ECHO is a big problem -- applications have to be aware of it. On CLI, many keystrokes are commands, not text to be entered. On vi in command-mode, G goes to the last line.

Personally, a bigger thing is traffic reduction, particularly keystroke combining. Nagle's algorithm is a start, but I've modded ssh to delay and buffer likely-text keystrokes for a short time (400ms) while letting likely commands through immediately to retain responsiveness. The delays aren't irksome, and I reduce outbound traffic by ~80%.
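The parent's buffering scheme can be sketched roughly as follows. This is an illustrative reconstruction, not the commenter's actual ssh patch: the 400 ms figure is theirs, everything else (the text/command split on printability, the class names) is invented. Printable characters are held briefly so several of them ride in one packet; control characters, which are likely editor or shell commands, flush the buffer and go out immediately.

```python
# Sketch of keystroke combining: buffer likely text, pass commands through.

DELAY = 0.4  # seconds to hold likely-text keystrokes

class KeystrokeCombiner:
    def __init__(self, send, now):
        self.send = send          # callback taking the string to transmit
        self.now = now            # clock function, injectable for testing
        self.buf = ""
        self.deadline = None

    def key(self, ch: str) -> None:
        if ch.isprintable():
            if not self.buf:
                self.deadline = self.now() + DELAY
            self.buf += ch
        else:
            # Likely a command (^C, ESC, arrow-key prefix...): flush
            # any buffered text, then send the command immediately.
            self.flush()
            self.send(ch)

    def tick(self) -> None:
        # Call periodically; sends the buffer once the delay has elapsed.
        if self.buf and self.now() >= self.deadline:
            self.flush()

    def flush(self) -> None:
        if self.buf:
            self.send(self.buf)
            self.buf = ""
            self.deadline = None
```

Injecting the clock makes the timing behaviour testable without real delays; a production version would hook `tick` into the event loop's timer.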

It looks like local echo can be turned off with a runtime flag. Additionally, my few experiments with it indicate that it is somehow intelligent enough to properly interpret command keystrokes as such.

We use the existing infrastructure for authenticating hosts and users. To bootstrap an SSP connection, the user first logs in to the remote host using conventional means, such as SSH or Kerberos. From there, the user or her script runs the server: an unprivileged process that chooses a random shared encryption key and begins listening on a UDP port. The server conveys the port number and key over the initial connection back to the client, which uses the information to start talking to the server over UDP.
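A rough sketch of that bootstrap step, with details invented beyond what the paper states: the unprivileged server process binds a random UDP port (binding port 0 lets the OS pick a free one), generates a 128-bit key, and reports both on stdout, which travels back to the client over the already-authenticated SSH channel. The banner format below is in the style mosh-server actually prints, but treat it here as illustrative.

```python
# Sketch of the SSP bootstrap: random UDP port + key, announced over SSH.

import os
import socket
import base64

def start_udp_server():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))              # port 0 = let the OS choose
    port = sock.getsockname()[1]
    key = base64.b64encode(os.urandom(16)).decode()  # 128-bit shared key
    # This line goes to stdout; SSH carries it back to the client, which
    # then starts sending UDP datagrams to <host>:<port> under <key>.
    banner = f"MOSH CONNECT {port} {key}"
    return sock, port, key, banner
```

Note this is exactly the property the firewall discussion below worries about: the port number only ever travels inside the encrypted SSH channel, so middleboxes cannot see it.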

You open an SSH connection (client->server:22). This port is allowed on the firewall, so it lets you through. But then the server decides to listen on UDP:(random port) and tells the client, back through the (encrypted) initial connection, which UDP port to contact. So you initiate an SSP UDP session on that port. How does the firewall know it should let you through? Since the port number is communicated on an encrypted session, it doesn't have access to that information.
So how does this work in a secure environment? The paper doesn't mention any means for the server to communicate to the network which port it's listening on.

No, it is not. Security is made of layers. You let UDP out (and actually, for mosh you need UDP in, because unlike you, I tried), and anyone can use this to get a remote shell, among other things. UDP in makes this easier than that, of course.

You open an SSH connection (client->server:22). This port is allowed on the firewall, so it lets you through. But then the server decides to listen on UDP:(random port) and tells the client, back through the (encrypted) initial connection, which UDP port to contact. So you initiate an SSP UDP session on that port. How does the firewall know it should let you through? Since the port number is communicated on an encrypted session, it doesn't have access to that information.
So how does this work in a secure environment? The paper doesn't mention any means for the server to communicate to the network which port it's listening on.

My guess is as good as anyone else's, but I surmise it does a bit of packet trickery. Once device A (behind a firewall) is connected to device B (which may or may not be behind a firewall, but at least one port is open, 22 by default in this case), device A can create an SSH tunnel... they really are rather neat and VERY useful as a means of security. For example, I have webmin running on a server, but its port (10000) is blocked by the firewall. Once I connect to SSH I can redirect packets to a certain IP:Port combination.

Does not make sense though. They use UDP to bypass the buffering delays of TCP. If you tunnel UDP in TCP, well... while you get local echo and state saving, you might as well type ssh blah.com screen -rd. autossh also auto-reconnects 'n stuff. kitty.exe on Windows.

I would say for the most part it works just like every other protocol that requires inspection by the gateway/firewall device. It will look inside the data stream and fish out the port/address numbers, then store them for later use; it might even change them to suit its needs on the way out. "It can't do that if it's encrypted!" you say.
Get with the times: if your gateway device does not intercept and MITM TLS/SSL/SSH traffic that you otherwise allow out, you are not in a "secure environment".

> Since the port number is communicated on an encrypted session, it doesn't have access to that information. So how does this work in a secure environment? The paper doesn't mention any means for the server to communicate to the network which port it's listening on.

I assume that mosh relies on stateful firewalls allowing outbound UDP packets. So, given that the ssh channel allows handshaking, here's what I guess happens: the client sends the first datagram outbound to the server's UDP port, the stateful firewall records that flow, and replies on the same address/port pair are then allowed back in.

That was the point of my OP. The paper doesn't describe what you said. It says the client initiates the Mosh UDP session once it got the destination UDP port (the port the server listens on) from the SSH session. A firewall would never allow that. Also, opening a wide range of ports is a... let's say... challenging idea. This goes against all rules of network security known to network administrators.

As we're talking about things related to terminals, the one thing I'm still anxiously missing is a terminal emulator which implements smooth scrolling of new text, a feature that was also present in some hardware terminals a million years ago. I guess some smart guy could modify an existing terminal to support this. Heck, if I had a bit more skills, I'd roll up my sleeves and do it myself. It would be sweet.

> It's kind of hard to do things like roaming using TCP because endpoint IPs can change.

Bullshit. With UDP you have to abstract the connection so that the source IP can change. With TCP you can do the exact same fucking thing. Close the old socket when you get a connection attempt from a new client with the right handshake.

Bullshit. With UDP you have to abstract the connection so that the source IP can change. With TCP you can do the exact same fucking thing. Close the old socket when you get a connection attempt from a new client with the right handshake.

I'm a little out of my depth here, but I'd imagine it'd be much easier with UDP because UDP is connectionless. With this sort of roaming, the server isn't expected to change addresses, but the client is. So, have the client sign everything with a negotiated public key, and the server doesn't even have to care where each packet is coming from, or even open any new connections when the client moves across IPs.

Since this is an SSH replacement, I'd expect the key signing to be done already, so once you build an
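A minimal sketch of the roaming idea described above, with invented framing (the real SSP authenticates datagrams with AES-128 in OCB mode; this sketch substitutes a standard-library HMAC so it stays self-contained): the server identifies the session by whether the MAC verifies under the shared key, not by the packet's source address, and simply replies to wherever the latest authentic datagram came from.

```python
# Sketch: a server that follows an authenticated client across addresses.

import hmac
import hashlib

KEY = b"shared-session-key"   # negotiated out of band (e.g. over SSH)

def seal(payload: bytes) -> bytes:
    """Prefix the payload with a MAC, as a client would before sending."""
    return hmac.new(KEY, payload, hashlib.sha256).digest() + payload

class RoamingServer:
    def __init__(self):
        self.peer = None      # last source address that authenticated

    def handle(self, datagram: bytes, source):
        tag, payload = datagram[:32], datagram[32:]
        expected = hmac.new(KEY, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None       # forged or corrupted: ignore, don't roam
        self.peer = source    # authentic packet: follow the client here
        return payload
```

Because the session lives in the key rather than in a socket's 4-tuple, a client that wakes up on a new network just keeps sending; no reconnect handshake is needed. (A real implementation also needs sequence numbers to stop an attacker replaying an old datagram from a stale address.)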

I'm a little out of my depth here, but I'd imagine it'd be much easier with UDP because UDP is connectionless. With this sort of roaming, the server isn't expected to change addresses, but the client is. So, have the client sign everything with a negotiated public key, and the server doesn't even have to care where each packet is coming from, or even open any new connections when the client moves across IPs.

In the context of persistent logical connections, you have to consider that the TCP connection will

On the other hand that's also an argument for why HTML5 video is pointless: We already have Flash and it has a near-100% install base. HTML5 video doesn't add killer features; Flash already does pretty much everything we want in embedded video.

Oh, wait. Flash doesn't play well with mobile browsers. Just like ssh, really, since every time you switch from one wireless cell to another you may get a new IP address, which would kick you out of your ssh session with no indication that it happened. Which is exactly the problem mosh addresses.

I get reconnectability (which I already have, either by using a VPN or by using screen on the server), but now it's built-in.

But now it's automatic.

I get local echo so I have no clue whether my connection has been dropped -- but OTOH, this is great if you have the brain of a goldfish and so can't remember what you just typed for a couple seconds till it gets echoed back. I presume this is optional, so non-goldfish-brains can tell it to 'degrade' to be as useful as ssh.

It also is automatic and shows what hasn't been echoed. Further, typing while lagging by a character or two is incredibly frustrating to almost all brains in existence. It's like listening to headphones which have a half second delay in what you said. Your brain simply freezes.

I get better unicode support -- well, that one's cool, anyway.

And it needs ssh for login, but also needs a mosh server -- so I can ssh into every server, but only mosh into a few.

Am I missing some really great thing about it? It seems like a major hassle for a minor improvement.

For the most part, if you are connecting via a command line you are using "an older" system (I am using this lightly, as a lot of new systems still have a command line, and even new versions of Windows are expanding theirs), designed for ASCII or VT100 transmission of data. Adding Unicode is a minor improvement, because for the bulk of us, whatever language we speak, ls -l is still ls -l.

Unicode will be handy for newer systems that may have a more human language interface to it.

I'm sorry, but that is simply not true. At least not outside the English only speaking world.

ls -l may still be ls -l, but its man page, and the filenames it spits out on stdout are localized, with non US-ASCII characters. The files we view with cat and less are filled with non US-ASCII characters.

Unicode isn't insecure; it is how it is used that could cause security problems. There are letters that are different from ASCII but look almost the same, if not identical. Which means too-wide support of Unicode could allow people to trick others into going from trusted names to an untrustworthy file/location.

Once something becomes widely used and stable, making drastic changes becomes next to impossible.

That's why we went CVS -> SVN -> git. Too many people were using CVS to make the changes that went into SVN. Too many people are using SVN now to fix its (very old and oft-complained-about) problems.

See also NFS. There are issues with NFS that people have complained about for years.. and they will never be fixed for the same reason.

Yes and no. SVN is a lot younger and thus can more easily adapt. For instance, in 1.7, they changed from having those .svn folders in every single directory in your working copy and moved it all into a central .svn folder for the entire working copy. Which opens up some possibilities (now that the code is cleaner) that it might get some "git-like" features.

SVN isn't as fossilized as CVS, but it's still a server-centric architecture, not a distributed version control system. Which has its benefits and drawbacks.

Moreover, I regard Mosh as solving a higher-level problem, and it is simply good Unix style to use another application on top of ssh. There are also many use cases where ssh does not need these improvements, so it is better to keep the core protocol simple.

As with TLS, I'd like to see any future revisions of these secure protocols trim more fat.

Dude, SSH is half a meg. Calm down.

The problem with "Arcane ciphers, modes, etc" is not executable size at all, but security.

For example, MD2 finally got the axe from OpenSSL back in '09, not because MD2 took up too much executable space, but because it was obsolete and pseudo-broken. I call it pseudo-broken because the crypto guys would call it broken, cracked wide open, but it's not totally broken like DRM or copy-protection schemes. It still has about 50 bits or so of security, and for many apps that's more than enough, as long

As with TLS, I'd like to see any future revisions of these secure protocols trim more fat.

Dude, SSH is half a meg. Calm down.

I think buddy's point is that SSH should deprecate support for old crypto because no one uses it anymore, and it is sort of an Achilles' heel... look at how easily GSM can be subverted because it supports old cipher protocols (including one that is "no encryption" encryption!)... anyway, point is: get rid of the stuff no one uses anymore, use only strong crypto with no option for in-the-clear, to reduce the potential for security issues. Our good friend AC just isn't so verbose about his idea...

The thing is, there's no way to 'untrust' those protocols. With my GSM example, A5/0 is classed as an 'encryption level', but in fact is in-the-clear. With TLS, and a man-in-the-middle attack, there's no way to know if the MITM has renegotiated the encryption to be in-the-clear. There's no way to turn off support for in-the-clear. The worry about SSH is that a weakness may be found in an older crypto protocol, and that protocol will still exist in all versions of SSH. If the attacker/MITM can force a

Note that ssh does that too, and you can configure the amount of time, and/or turn it off entirely (which has its advantages: if the connection breaks and comes back, you don't lose the session at all; if you change IP, it takes forever to figure that out).