Monday, October 21, 2013

Just a quickie to share a tweak I had to make to my xmonad.hs. Not sure if there's a better way to do this, but hey.

The goal was to finally, actually get working hibernation on my laptop. I usually use it in short bursts, so I just got used to shutting it down between sessions. However, I recently started using a work laptop running Windows 7 and hibernation has been useful there[1], and I'll be damned if the non-free shitbox is going to have a mildly useful feature that my machine doesn't.

The way you get a Debian machine to hibernate or suspend is with the appropriately named pm-hibernate and pm-suspend commands[2], so I figured this would be a fairly easy key binding.

Unfortunately, the pm-* commands are root-only. And Xmonad doesn't automatically prompt for a password when you do something like su -c pm-suspend. And, unlike with sudo, you can't pass a password into su. So that approach is right out.

I googled around for alternatives for a little while, but what I ended up doing was finally adding myself to the sudo group, and defining this function for my own nefarious purposes.

Footnotes

1 - Granted, because the boot time on that machine is something like 5 minutes instead of the 12 seconds I'm used to waiting, Hibernate is a goddamn necessity, but I digress.

2 - Ideally, I'd just be using hibernate, but there are some issues. I've upgraded my RAM since installing the OS, which means that my swap partition isn't big enough to store a memory dump, and I can't seem to resize it with gparted, with or without swapoff/swapon magic. Luckily, I've had a larger hard drive waiting for me to crack open the box and configure it to my liking, so I'll just do that this week rather than procrastinating. In the meantime though, I'm suspending instead.

Tuesday, October 15, 2013

I'm going to get to the reflections piece eventually, I swear. Or maybe I won't. Fuck I don't know.

Anyhow, sessions are things you'll need to deal with if you want to build any kind of stateful application on top of HTTP, because an HTTP conversation is stateless by default. When you send an HTTP request out, as a general rule there's nothing in it that could let the server positively identify you. Which means that if you make two serial requests to the same site, the server usually can't be absolutely sure that both of them came from you. It'll get data on your user agent[1], operating system, and your IP[2]. And that's it. Now, granted, if you're me, it's fairly easy for the server to pick out the Debian Jessie/Conkeror user originating at IP foo, but that's not something a server operator can normally rely on.

What the operator has to do is hand you some piece of data, and ask you to hand it back every time you visit. Usually this takes the form of a cookie, and if they've done their job sufficiently well, they can now take any bunch of requests that arrived with the same cookie and reasonably assume that they came from the same user.

How Well is "Sufficiently"?

A couple of things should be obvious there. First, unless you're using SSL, that piece of state you've been handed is trivially sniffable. Which means that if you have a habit of logging into a server that doesn't make you use HTTPS, well, I hope you're not keeping anything really secret there. Second, unless your session state is pretty hard to guess, someone who wants to impersonate you probably can.

From a server operator's perspective, the HTTPS thing is easy. Just use SSL[3]. As for guessability, we want the following properties:

each active user should have a unique session token, unless they choose to share it

knowing any number of previous tokens shouldn't give you any edge in guessing others[4]

knowing how the tokens are generated shouldn't give you any edge in guessing others[5]

It's probably not necessary to generate a new key for each session, but it doesn't seem to be too expensive, so I'll spring for it.

sha256 is a thin wrapper around a particular digest-sequence call, and it produces a 32-element vector of octets representing the digest. We feed that to an aes cipher as a key, along with a message made up of a (gensym), a random number and the current time in milliseconds. aes is itself just a call to a set of ironclad functions that return a vector of octets representing the AES-encrypted message described above. That result is then fed through cl-base64:usb8-array-to-base64-string, which gives us a string we can use as a reasonably secure session token, provided we're using SSL. Here's a sample

The profiler says session generation probably isn't going to be my bottleneck. I could probably tune it if I liked, though I can't see the gains offsetting the readability hit we'd take. If I had to start cutting somewhere, I'd make sure to only generate one key per server session, and figure out a more efficient way than format to put the key content together.

Actually, that gensym+rand-call+get-universal-time method strikes me as programming by superstition. Even more so than the Hunchentoot session mechanism, which also includes the target IP/user-agent and validates these against the incoming request[8]. If we were implementing the real requirements as I understand them, we'd just need

No encryption, no fiddling with random, no assigning results of make-random-state calls. Just initialize a :fortuna instance, and collect random output in batches of 32.
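A minimal sketch of that idea, assuming ironclad and cl-base64 are installed through Quicklisp. This is not the code from the original post; the names *token-prng* and new-session-token are made up for illustration, and the :uri t option is an assumption about how the token will be transported (in a cookie or URL).

```lisp
;; Load the two libraries named in the text. Assumes Quicklisp is set up.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload '(:ironclad :cl-base64)))

;; Hypothetical name. One Fortuna generator shared by the whole server.
;; With the default :seed option, ironclad seeds it from the OS entropy source.
(defvar *token-prng* (ironclad:make-prng :fortuna))

(defun new-session-token ()
  "Return a fresh token: 32 random octets, base64-encoded with the URI-safe alphabet."
  (cl-base64:usb8-array-to-base64-string
   (ironclad:random-data 32 *token-prng*)
   :uri t))
```

Calling (new-session-token) twice should give two unrelated strings. The generator is created once because building it is the slow part, and every later call just draws another 32 octets from it.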

Wed, 16 Oct, 2013

Other than that, what's left is putting together a session table with its own lock to store session information indexed by these IDs. Oh, and also sending them out to the client. I guess that's kind of important. Both are waiting for next time though, or this will quickly cease being "brief".
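For a rough idea of what such a table could look like, here is a sketch under a few assumptions: locking comes from bordeaux-threads (a library this post doesn't mention), tokens are strings, and the names *sessions*, *sessions-lock*, session-put, session-get and session-drop are all hypothetical.

```lisp
;; Assumes Quicklisp is set up; bordeaux-threads is an assumption, not something the post names.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :bordeaux-threads))

;; Tokens are strings, so the table needs an EQUAL test rather than the default EQL.
(defvar *sessions* (make-hash-table :test 'equal)
  "Maps session token strings to per-session data.")

;; The table gets its own lock, separate from anything else the server guards.
(defvar *sessions-lock* (bt:make-lock "sessions"))

(defun session-put (token data)
  "Store DATA under TOKEN, replacing any previous entry."
  (bt:with-lock-held (*sessions-lock*)
    (setf (gethash token *sessions*) data)))

(defun session-get (token)
  "Return the data stored under TOKEN, or NIL if there is none."
  (bt:with-lock-held (*sessions-lock*)
    (gethash token *sessions*)))

(defun session-drop (token)
  "Forget TOKEN entirely."
  (bt:with-lock-held (*sessions-lock*)
    (remhash token *sessions*)))
```

Every access goes through the same lock, so two threads can't modify the table at the same time. Expiry and the actual session contents are left out on purpose.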

Footnotes

1 - Unless you've spoofed it, as I often do to access the many "IE only" pages built by the legion of typing monkeys in my current company's HR department.

3 - I'm not implementing this myself, obviously. The current plan is still to hide behind nginx for static file serving, so we can have it handle SSL certificates for us too. It's not even terribly difficult.

4 - Except in the trivial sense that each active user should have a unique one, so if as an attacker you write a script to grab a few thousand keys, you can be sure that other people aren't using those specific ones.

5 - Except in the trivial sense that you can avoid guessing short dictionary words, or dates or something.

6 - That key, incidentally, should have similar properties to session tokens. It should be difficult to guess no matter how many of them you've seen, and running your own copy of Deal to extract a bunch of keys should give you no advantage when guessing another server's secret key.

7 - And that's not a certainty. I'm not exactly a math guy, so it's entirely possible that I'm misunderstanding the requirement at some step of this process. I'll certainly keep you up to date on any revelations.

9 - Also, the runtime of ironclad:make-prng is extremely inconsistent. It takes between 8 and 76 hippopotomi to complete, and I'm not entirely sure what plays into that. Possibly entropy shortages in the underlying OS? Which also reminds me; this version isn't Windows friendly. So if you were planning to run Deal on Windows, I'm deeply sorry for you.

On the Mechanisms of Stopping A Server...

See, because the server I'm putting together is single-threaded, you need to C-c C-c out of it to get back to the REPL. Except, that still leaves the socket-server listening on the specified TCP port. The half-assed solution I'd come up with involved setting up a handle into which I'd put the listener, so that I could close the process and kill the listener externally later.

That'll automatically clean up on any kind of error, including an Emacs interrupt, and it completely removes the need for stop and *socket-handle*. The above also uses usocket:*wildcard-host* instead of "127.0.0.1", but that's a tiny change.
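One way to get that kind of automatic cleanup is unwind-protect around the server loop. The sketch below assumes usocket is loaded through Quicklisp; it isn't the server from this post, and the accept-then-close loop body is only a placeholder for real request handling.

```lisp
;; Assumes Quicklisp is set up.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :usocket))

(defun start (port)
  "Listen on PORT until something unwinds the stack, then close the listener."
  (let ((server (usocket:socket-listen usocket:*wildcard-host* port
                                       :reuse-address t)))
    (unwind-protect
         (loop
           ;; Block until the listener has a pending connection.
           (usocket:wait-for-input server)
           (let ((client (usocket:socket-accept server)))
             ;; Placeholder: a real server would read and answer a request here.
             (usocket:socket-close client)))
      ;; Cleanup form. Runs when an error or an interrupt-then-abort unwinds
      ;; out of the loop, so the port never stays bound.
      (usocket:socket-close server))))
```

Since the listener only lives in a local variable and is closed on the way out, nothing global has to remember it, and no separate stop function is required.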

On The Mechanism for Listening to Sockets

There's a less obvious place that I wanted to figure something out for. Here's the above server with elided chunklets, just so we can focus in on the relevant details.

The point I've been thinking about in particular is that bit that says (wait-for-input conns ...), and the associated places where I either remove things from, or add things to conns. As written up there, it's a list. Which is to say, a singly linked list. And that means that adding a thing to it is O(1), but removing a thing from it is O(n), and since we're doing that (setf conns (remove ready conns)) inside of a loop, this version of start is effectively an O(n^2) procedure in the worst case. Not horrible, I guess, but I think I can do better.

The challenge here is that no matter what data structure we use to store connections, wait-for-input needs either a socket, or a list of sockets. Here's one attempt to do somewhat better.

If we represent conns as a hash table, we can effectively pay some memory and some best-case time to mitigate worst-case time. Seems worth it, I'd say, but I'm not at all sure. New connection insertion now takes the form of (setf (gethash (socket-accept ready) conns) :in), and connection removal is written as (remhash ready conns), both of which are O(1). The thing that gets markedly worse, ironically, is the wait-for-input call itself. Unlike the original, which just passed the raw conns, we now have to pass (cons server (hash-keys conns)), which requires not only the consing of an entirely new list each time through, but also a full traversal of conns. Since the interface of wait-for-input demands an actual list, and not a generator or similar, the best you can do on the implementation of hash-keys is something like (loop for k being the hash-keys of conns collect k). Which works, but isn't exactly stellar.

To its credit though, it does save us time in the worst case. As a result of this representation change, start now has O(n) performance in both the worst and the best case. I get the feeling we could save some constants by opening up usocket and twiddling with wait-for-input, but a glance at the relevant files tells me that it's implemented something like four times, sometimes in expected configurations I can't easily test.
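To make the trade-off concrete, here is a small sketch of the hash-table bookkeeping on its own. It uses plain keywords as stand-ins for sockets so that it runs without usocket; wait-list and *conns* are hypothetical names, and hash-keys is written the way the text suggests.

```lisp
(defun hash-keys (table)
  "Collect every key of TABLE into a fresh list. O(n) time, n new conses per call."
  (loop for k being the hash-keys of table collect k))

(defun wait-list (server conns)
  "Build the list a wait-for-input style call needs: the listener plus every client."
  (cons server (hash-keys conns)))

;; Stand-ins for sockets: keywords instead of real connections.
(defparameter *conns* (make-hash-table))

(setf (gethash :client-a *conns*) :in)  ; insertion, O(1) on average
(setf (gethash :client-b *conns*) :in)
(remhash :client-a *conns*)             ; removal, O(1) on average

;; The price: one full traversal and one fresh list per trip through the loop.
(wait-list :server *conns*)             ; => (:SERVER :CLIENT-B)
```

With a list, each removal walks the whole structure. With the table, removals are cheap and the only O(n) step left is rebuilding the list once per iteration.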

Ah well. The naive hash is a good enough improvement for now, and I am in the middle of reading through notes about Berkeley Sockets and their uses. Hopefully, when I get to work I can convince one of my co-workers to take me through the nuts and bolts of the implementation in C. Maybe that will give me enough insight to write something that solves this problem in a satisfactory fashion.

Ruby and Erlang each come with their own modes, and recent Emacs versions ship with a built-in Python mode and shell. Smalltalk uses its own environment (though GNU Smalltalk does have its own mode), and I'd really rather not talk about PHP. If you're writing in it, chances are you're using Eclipse or some other IDE anyway.