Thursday, November 29, 2012

Except that github still thinks this project is 20% Perl. If I were mean, I'd make a joke about binary files being improperly recognized.

Granted, the re-write encompassed player.py and not just main.py, but that's because I never stopped to sit down and think through the threading model. Because web.py does a thread-per-request, it technically worked anyway, but that module was due for a proper tear-down and re-build whether I moved servers or not.

The relevant parts are actually just those two functions in the center. I'll assume you know what all the imports mean, and that we can just gloss over the MASSIVE CONFIG. The utility functions are self-explanatory-ish. __getPlayerCommand takes a file name, and figures out which player that file is going to be using by looking its extension up in the command table. By default, that's mplayer, but as you can see by that try block in the config section, if omxplayer is available, we use it for mp4s and ogvs[1]. __clearQueue takes a Queue and pulls from it while it's not empty, resulting in an empty queue.
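For illustration, here is a minimal sketch of what those two utilities might look like. It is not the actual player.py: the names `COMMAND_TABLE`, `DEFAULT_PLAYER`, `get_player_command` and `clear_queue` are hypothetical, it uses Python 3's `queue` module, and it detects omxplayer with `shutil.which` rather than the try block mentioned above.

```python
import os
import queue
import shutil

# Hypothetical command table: extension -> player command (argv list).
# The real config is not shown, so these entries are illustrative only.
DEFAULT_PLAYER = ["mplayer"]
COMMAND_TABLE = {}

# Assumption: prefer omxplayer for mp4/ogv whenever it is on the PATH.
if shutil.which("omxplayer"):
    COMMAND_TABLE["mp4"] = ["omxplayer"]
    COMMAND_TABLE["ogv"] = ["omxplayer"]


def get_player_command(filename, table=COMMAND_TABLE, default=DEFAULT_PLAYER):
    """Look the file's extension up in the table, falling back to the default."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return table.get(ext, default)


def clear_queue(q):
    """Pull from q until it is empty."""
    while True:
        try:
            q.get_nowait()
        except queue.Empty:
            return
```

Catching `queue.Empty` instead of checking `q.empty()` first avoids a race if another thread drains the queue between the check and the `get`.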

listen pulls stuff out of the playQ[2], checks that the thing it got is a file it should be able to play and if so, pulls the relevant metadata and passes it on to playFile.

playFile is probably the oddest function I've ever had to write. It has to be blocking, because we don't want its caller to think it can play another file before the last one is done[3], but it also has to launch its player in an asynchronous subprocess, because it needs to be able to receive input from the user, but it can't wait for input because that means that it would have to receive some before it returned[4]. The result is what you see there. The first thing we do is clear the commandQueue[5] and launch the player and retain a handle to it. Then, until playback finishes, we poll commandQueue for user input. We have to leave a timeout for that input check, because we'd otherwise wait here even after the file finished playing, and that's not fun. The ServerStatus.write_message_to_all calls write out an SSE notifying the front-end of

playing a file

receiving a user command

finishing the file

respectively. Hmm. I should probably notify the front end that I've finished playback even when a stop command is received. Just for completeness. I'll make a note of it.
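The shape of that function can be sketched roughly as follows. This is a sketch under assumptions, not the real playFile: `play_file`, `notify` and `poll_seconds` are made-up names, `notify` stands in for ServerStatus.write_message_to_all, and forwarding non-stop commands to the player's stdin is a guess at how commands reach the player. It also sends the "finished" notification after a stop, which is the change noted above rather than the behaviour described.

```python
import queue
import subprocess


def play_file(command, command_queue, notify, poll_seconds=0.5):
    """Block until the player exits, relaying queued user commands to it.

    command is the argv list for the player; notify is any callable that
    takes a message (a stand-in for ServerStatus.write_message_to_all).
    """
    # Drop any commands that were meant for a previous file.
    while True:
        try:
            command_queue.get_nowait()
        except queue.Empty:
            break

    # Popen returns immediately, so the player runs asynchronously.
    player = subprocess.Popen(command, stdin=subprocess.PIPE)
    notify("playing")
    while player.poll() is None:
        try:
            # The timeout keeps us from waiting on input after playback ends.
            user_command = command_queue.get(timeout=poll_seconds)
        except queue.Empty:
            continue
        notify("command: " + user_command)
        if user_command == "stop":
            player.terminate()
            break
        try:
            # Assumption: the player accepts commands on its stdin.
            player.stdin.write(user_command.encode())
            player.stdin.flush()
        except OSError:
            break  # the player went away while we were writing
    try:
        player.stdin.close()
    except OSError:
        pass
    player.wait()
    notify("finished")
```

The caller stays blocked for the whole playback, but the loop wakes up every `poll_seconds` to check whether the subprocess has exited, so a file that ends on its own is noticed without any user input.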

Those changes essentially make the player into an actor, except that it reaches into surrounding state in order to send notifications. If I really felt strongly about it, I could instead give it an output queue that other processes could pull from in order to communicate, rather than have it send messages into ServerStatus directly. I don't today, but you knever no.

Now that we've got that out of the way, here's what the new main.py looks like using tornado

As you can tell, it's not significantly different. The handler classes now need lowercase POST/GET methods, we use self.write and self.redirect instead of return and raise, handlers now subclass tornado.web.RequestHandler, the routing table has slightly different syntax, and that's basically it. We also communicate with the player slightly differently, but that's due to the rewrite in the player. The only really significant difference (which I actually prefer the tornado approach for) is

You can specify your own static directory.

This bugged the ever-loving crap out of me in web.py, where doing the same required non-trivially subclassing StaticMiddleware. It's also not obvious based on the documentation, but the default was a static folder relative to cwd, rather than relative to __file__, which meant that running python a/long/path/to/my-app.py 4141 was needlessly tricky. tornado just takes a path, and you get to decide how complete/relative it is.
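As a small illustration of the cwd versus `__file__` point, the helper below builds a static directory path next to a given script, so it resolves the same way no matter where the process was started. `static_dir_for` is a hypothetical name; the only tornado detail assumed is that the application accepts a `static_path` setting.

```python
import os


def static_dir_for(script_path, folder="static"):
    """Return an absolute static directory that sits next to script_path."""
    return os.path.join(os.path.dirname(os.path.abspath(script_path)), folder)


# In main.py you would call static_dir_for(__file__) and hand the result to
# tornado as the static_path application setting.
```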

Oh, I should mention, ServerStatus is not actually a default tornado class. I didn't have to write it myself, but the sse.py file is derived from this. The diff is minimal; I added the capability to specify event fields, and made the id auto-increment by default. The class itself implements Server Sent Events; an asynchronous handler which assumes it isn't getting closed by the other end; a message written to it is going to be sent over to the client without a new request coming your way.
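To make the wire format concrete, here is a rough sketch of how a single Server-Sent Events message with an optional event field and an auto-incrementing id could be built. It is not the code from sse.py; `format_sse` and its parameters are invented for this example.

```python
import itertools

# Module-level counter used when the caller does not supply an id.
_next_id = itertools.count(1)


def format_sse(data, event=None, msg_id=None):
    """Build one Server-Sent Events message as a string."""
    if msg_id is None:
        msg_id = next(_next_id)  # auto-increment by default
    lines = ["id: %s" % msg_id]
    if event is not None:
        lines.append("event: %s" % event)
    # Each line of the payload needs its own "data:" field.
    payload = str(data).splitlines() or [""]
    lines.extend("data: %s" % line for line in payload)
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"
```

The handler would write that string to every open connection and flush, leaving the connections open for the next message.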

That's an essentially working, non-blocking media server.

Now, it's not done yet. I still have to re-write the Play handler, because I'm currently doing something fishy with time.sleep and the stop command instead of formalizing new-queue as a separate directive, and I still have to make mild edits to the front-end to actually use all this data that's being SSEd over, and setting up a play queue makes it almost trivial to implement skip forward/backward functionality so you bet I'll fucking do it, and it would be really nice to be able to make config changes through the front-end somehow.

Footnotes

1 - [back] - That's the RasPi video player; it's more primitive than mplayer, works specifically on the RasPi hardware, and can only really play a few different kinds of video, but the upside is that it can do surprisingly smooth HD output. So we really want to use it if at all possible.

Sunday, November 25, 2012

My wife and mother-in-law went out for lunch while I was at work. They dined[1] at KFC, and brought the leftovers of their feast home, ostensibly to serve as dinner for me. Three pieces later, my body is loudly and clearly telling me to fuck off. So having torn myself away from the porcelain god, I'm going to write something in an effort to expunge the taste of grease and fail.

Fatherhood

Children are a lot of effort. I'm still constantly being told that it gets easier, and that the second and third ones are cakewalks, but I'm not seeing it yet. All I can definitively say so far is that if you're planning on generating larva with your significant other, expect to sleep significantly less than you're used to.

Two more things actually. First, if at all possible, have a female of your family unit breast-feed. Milk-fed babies' initial output doesn't smell like, well, shit, for about a month. Second, babies have a ridiculously poor API. They just open a stream and send "Waaaaah" at various volumes and modulations. That might mean "feed me", "change me", "burp me", "pay attention to me" or "fuck off"; and they don't close the stream until you do the right one(s) and/or get very lucky. So ... be prepared to iterate through that pattern.

A lot.

Finally, a word of advice for the programmer daddies specifically: hands down the best investment you can make is a sling carrier of some kind. It'll let you cradle your baby while keeping both hands free. I'd have basically no hope of finishing this article without one.

Haskell

I've finally gotten past Absolute N00b stage with Haskell. Like I said last time, it only took three years. There's a few patterns I'm detecting in the libraries and community that I thought I'd point out.

The strong typing stereotype turns out to be right on the money. There are precious few libraries on hackage that have any kind of usage example in addition to type signatures, and fewer still that have actual documentation. I guess I'll have to get good at reading type signatures, but guys, these are for the fucking compiler. Some human readable media beyond basic explanations would be nice. The upshot is that, if you're hopping around ghci, you can use :browse Library.Name.Here, and the hoogle docs are available for local use, so I guess it might kind-of-almost-sort-of even out once you get to the point where you're comfortable with language basics. The other upshot is that the community is very responsive. I wasn't expecting to have as easy a time getting stupid questions answered as I've actually had, whether that's been in-person, via IRC or on the appropriate SO tag.

Clojure

swank-clojure seems to have been deprecated! I went to install my usual Clojure environment to do a bit of hacking for this article, and noticed that giant note in its git repo. The good news is that there's apparently a thing called nrepl that provides more or less the same functionality, except through the Clojure Networked REPL rather than through SWANK. nrepl.el is available here, for those of us who still install Emacs packages manually. The model is a bit different; where SWANK is a thing that gets started as part of SLIME, and then loads projects, nREPL is theoretically a thing that your project needs as a dependency, then gets started once you start editing that project.

I say "theoretically" because, just like swank-clojure, nrepl fails pretty spectacularly on my machine. I guess I'm sticking to inferior-lisp for Clojure code.

On the language in general by the way, it feels surprisingly comfortable after a couple months playing around in Haskell-land. To a first approximation, it's Common Lisp with more emphasis on the stuff I like and less emphasis on the stuff I don't. Because of the ecosystem it's embedded in, you frequently find yourself having to call Java code for one task or another, but doing so is easier than you'd think. The only part I really don't like about it is its weight. Every time I lein run a project, or start up the repl/run-lisp, there's a visible few-second delay during which all my cores start spinning into the 98% range. That doesn't happen with any other language I use regularly, even while running what seem to be more compute-intensive operations.

Web Mote

On that note, a very small chunklet of my time has been going towards the tweaking of a Python project I started for my RasPi a little while ago.

Web-Mote is in a usable state at the moment[2]; I've got the Pi hooked up to a separate wireless router and running a subset[3] of my media library through the livingroom TV's HDMI port. I still haven't figured out how to control the TV itself from the device. Hopefully, I'll fix that soon[4].

I ended up not following my thought process out to its ultimate conclusion. Ok, I did experiment a bit with a completely client-centric approach, but the downsides I mentioned turned out to be more severe than anticipated. Specifically, it ended up causing all sorts of headaches relating to what to do when my remote lost the signal, or when it ran out of battery power. Those kinds of problems seem to be inherent to keeping most program state on the client rather than the server, and being that they directly got in the way of my enjoying the use of the system, I will not be going down that path.

What I do have to do is put together a handler that deals with sending out a log in the form of an SSE response. That'll be critical for the future when I actually want clients to start interacting with one another[5]; they'll each need to know what the state of play is on the server, so a coordination handler is in order. A cursory googling tells me that web.py is built more or less like hunchentoot in terms of the threading model, so I may actually need to move to a different server if I intend to make this puppy support more than a handful of clients at a time. Which I may as well, just for the fun of it. The Python situation is a bit better than the CL one here, since there's actually a production-ready non-blocking web-server waiting to be used, where Lispers warned me away from the comparable CL application for fear of its prime-time readiness. The good news is that, thanks to this front-end separation experiment I'm running, porting the backend away from web.py will involve changing exactly one file in a not-very-extensive way.

Footnotes

2 - [back] - Though oddly listed as 21.2% Perl, even though the only non-JS/HTML/CSS code I've got in there is Python.

3 - [back] - It would be the complete set, but I'm still waiting on a drive enclosure that will finally let me store more than 32GB of data there without using up a second wall outlet.

4 - [back] - Realistically, I don't need anything like complete control. I need to be able to tell it power on/power off, go to the channel [this] is connected through and volume up/volume down.

5 - [back] - For instance, one use I've already dreamed up but haven't come close to implementing yet, is something I'm calling democracy mode. The idea is that the server tallies votes for the next thing to play, and plays the highest voted rather than next-in-queue when a media change occurs. When I think about how to implement something like that... Well, it seems like it would be both simple and in keeping with the general design principles of the semi-client approach. You keep a running total and a list of IPs that have already voted, and you give each client a handler by which to register a vote. Done. Now, thinking about how I would do it without central state being kept on the server. It seems like the best I could do is let the user register their current vote. Keep the clients synchronized with the server somehow, and send a message every once in a while that says VOTE NOW, BITCHES, at which point each client would report its current vote and clear it. There'd need to be a momentary stateful operation, but it would literally be getting the len of the collected [vote-ip]. That seems like it would be a bit more complicated, if theoretically elegant, to actually implement.
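A minimal sketch of the server-side version of that idea, assuming one vote per IP per round. `VoteTally`, `register` and `winner` are hypothetical names and none of this exists in Web-Mote.

```python
from collections import Counter


class VoteTally:
    """Running vote totals plus the set of IPs that have already voted."""

    def __init__(self):
        self.totals = Counter()
        self.voted_ips = set()

    def register(self, ip, choice):
        """Count one vote per IP; return False for a repeat voter."""
        if ip in self.voted_ips:
            return False
        self.voted_ips.add(ip)
        self.totals[choice] += 1
        return True

    def winner(self, default=None):
        """Return the top choice and reset for the next media change."""
        top = self.totals.most_common(1)
        self.totals.clear()
        self.voted_ips.clear()
        return top[0][0] if top else default
```

A vote handler would call `register` with the request's remote IP, and the player would call `winner` with the next-in-queue file as the default whenever a media change occurs.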

Friday, November 16, 2012

Man, I'd better wrap this shit up before my Authentication series becomes Zeno's Article. This particular column won't be contributing to the cause, unfortunately; this is more errata than another installment.

It's been pointed out to me that SHA-2[1] is actually a pretty poor choice of hash function for password storage. The why of it is explained over in this article, which conveniently starts off by recommending bcrypt and linking you to implementations for a variety of popular languages. bcrypt looks good for password storage for a number of reasons, including pre-resolved salt, slow hashing speed and an adjustable work factor. Still, read through the entire article, and then look through this one, aggressively titled "Don't Use Bcrypt", which introduces another couple of algorithms which you might want to pick over it for various reasons. As it turns out, scrypt is also implemented for a variety of languages and provides much poorer performance[2], while PBKDF2 has been around longer and has therefore seen more battlefield sorties.

I'm not going to recommend one.

They're all better than the SHA family for this particular purpose, and they all implement salting for you, so any of them will be an improvement if you ended up blindly copying out the code I had previously posted. Thing is, like I mentioned last time, you really should understand the possible attacks in a given situation, and pick a hash that counters them appropriately. I kind of agree with the second guy; yes, bcrypt is much better than some options, but don't take that to mean "Just use bcrypt from now on". You need to evaluate your situation and pick a hash function that fits it.

All that having been said, bcrypt is going to beat out SHA-2 in a known-cyphertext attack. That is, in the situation where your attacker has a copy of your user database, including all the salts and password hashes. In this situation, they can probably brute-force passwords hashed with SHA-2. The problem is that SHA-2 is fast, so it's possible to try several hundred thousand per second even with a relatively modest machine, whereas hashing a string with bcrypt or similar actually takes a second or so. You can't brute force faster than the algorithm produces output, so the slower ones are going to have a security advantage there. bcrypt is definitely better than the SHAs, but I wanted to point out the kind of endgame we're into here.
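To show what a deliberately slow, salted hash looks like in practice, here is a small sketch using PBKDF2 from Python's standard library. It picks PBKDF2 only because `hashlib.pbkdf2_hmac` ships with Python, not as a recommendation over bcrypt or scrypt. The function names and the iteration count are illustrative assumptions, and with this particular API you supply and store the salt yourself.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor; tune it for your hardware


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, iterations, digest) for storage alongside the user."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Re-derive the digest and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(digest, expected)
```

Raising the iteration count is what slows a brute-force attempt down: every guess costs the attacker the same number of rounds it costs you.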

All that having been said. You know what would completely sidestep the entire fucking question? Using RSA keys to identify and authenticate your users. You wouldn't need to hash dick because all you'd store is their public key. Because you wouldn't need to hash anything, you wouldn't need to salt anything. Because you wouldn't be using a password, your users wouldn't have to remember any additional information.

Yes, it's currently more work to put together a working RSA-based authentication system, and yes you have to offer it as an option because the general public hasn't caught on to it yet, but it's the Right Way to do auth[3]. Just putting it out there.

Footnotes

1 - [back] - Whichever SHA-2 you like, it doesn't matter for the purposes of this exercise.

2 - [back] - Slow, as both of those articles note, is actually what we want in a password hashing function. So while that may sound like a dig, it's not.

3 - [back] - At least until cheap-ish quantum computers become available. Hopefully someone works up a better trap-door function before then. Anyway, I get the feeling that it's far enough off that RSA auth would still be worth implementing in the interim.

Firstly

I swore not to make a multiprocessing joke about this, so I won't.

As of about a week ago, I am one of two custodians to a freshly-hatched human being. He's apparently a bit big, which will come as absolutely no surprise to anyone that's met me in meatspace. It's pretty taxing in the sleep department, but I'm told that's temporary, and that the experience is worth it in the long term. Granted, I am told this by people who have gone through the process, so it may just be them rationalizing a fundamentally damaging experience, but let's give them the benefit of the doubt for now. I'll keep you posted, I guess. My wife is still in the recovery stages and has become, shall we say, slightly less certain that she wants to repeat the process. Otherwise, it's going ok. We've got the pretty good Ontario Health system at our backs, and various online/literary resources, all of which have helped prepare us. There's also the entirely unexpected benefit of having a sufficiently well-behaved infant, to the point that we've managed to sleep semi-properly and actually go out in the week since.

Raising him is going to be another can of worms altogether, especially if Stross is anywhere near the mark. I'm about to raise a child to whom I will have to explain that we used to have this thing called "being lost", and that it used to be impossible to keep in touch with your friends every hour of every day, even if you wanted to. I don't even want to begin thinking about that right now though, it'll just get me riled up.

I've got a son, my wife lived, and they both seem happy and healthy. So... that went well.

Secondly

Earlier this week, I bit the bullet and upgraded my laptop to the latest wheezy release. That's actually something I've been meaning to do for a while for various reasons, and my pack-rat data storage habits finally got me to the point where most of my admittedly meager 64GB hard drive was full. The installation routine is down pat by this point

I'm not quite finished yet. A couple other small items still need to be put together[1], but that's a comfortably functioning if minimal development machine. There are a few changes from last time, but before we get to those,

I Need to Brag for a Moment

I thought that would end up being true for the very short term, seeing as the wireless drivers this machine uses are all blobs. Turns out that not installing those has done nothing but kept me from trawling Reddit for porn. My 3G kindle performs admirably when I need to take a look at some new piece of documentation, or just pull up a previously downloaded reference manual, my desk at work has three CAT5 jacks so I'm always on the wired network anyway, and Toronto libraries have wifi hot-spots consistently shitty enough that I've yet to ping www.google.ca successfully through one[2]. I'll stay disconnected for the short term, though I have no idea how long that'll remain the case. In the meanwhile, rms would be proud.

Different Languages

Ruby and Smalltalk got left out again. I'm not particularly happy about either of those. I definitely wish that Matz had become more popular than van Rossum, but it seems that he hasn't. A little while ago I realized that I was reaching for Ruby and Python in roughly the same situations and, despite the fact that I like Ruby better, Python was coming out more often. That's because everyone in IT at the office has at least a cursory knowledge of Python[3] and because Python comes with Debian. It would be nice if python-setuptools was also included by default, and if the language/community didn't have this anti-functional-programming stick up its ass, but whatever I guess. I've got nothing against Smalltalk either, but of the languages I don't currently have installed, both forth and prolog are ahead on the "things to learn" list. On the other hand, I have been fooling around with RasPi recently, and that comes with a Squeak image, so I dunno. Maybe it jumps the queue at some point. It just probably won't be on my main machine.

Erlang and Node.js are both absent, but I'm not complaining. I haven't gone back to kick at Erlang since my last rebar failure. Honestly, I haven't been missing it. It has some excellent ideas, but the language itself is fairly ugly, and a few strokes of bad luck on deployment have soured me on it. Maybe that'll change in the future, but it's out for the moment. Node is a bit odd. On the one hand, there's nothing overtly offensive about JavaScript from my perspective[4]. On the other, it just isn't interesting enough that I want to use it anywhere I don't have to. That wouldn't usually prevent me from at least installing the runtime, but

It still isn't in the testing repos, and I'll be damned if I start apt-pinning from sid for a language that I have an at best passing interest in

Once you've taken the leap of faith and installed npm itself that way, you apparently need to run npm install as root too. Which sounds like the sort of bullshit that made windows the marvelously insecure block of Swiss cheese it's been for the last couple decades. It's not the only one, so I guess I can't kick its ass too hard over this, but I long for the day when the wisdom of quicklisp/lein install/cabal is picked up by all language designers.

I can see myself installing it somewhere other than my main machine, just to give me handlebars.js pre-compilation as a service, if no one's done that yet, but that's about it. In fact, here. Now you don't need to install it either.

Clojure and Haskell are now part of the standard lineup, neither of which should surprise you if you've been following the blog at all. Both place emphasis on functional programming, laziness and composability, but that's about where the similarities end. Clojure is one of Lisp's bastards; a dynamic, fully parenthesized, prefix-notated language running on a virtual machine with a heavy focus on Java interoperability. Haskell is a member of the ML family, which means a fanatic devotion to strong, static typing, a heavy emphasis on compile-time rather than run-time optimization, a complete lack of a VM, plus a strong aversion to parenthesizing anything and the ability to vary fixity based on context. I'm making an attempt to learn both over the next few months, and that will hopefully convince you that I take cognitive diversity seriously.

Switching WMs. Again.

Last time I hopped back into StumpWM from XMonad. This time, I'm hopping back. It turns out that, just like there are a couple of small annoyances in XMonad that make Stump preferable, there are a couple of small annoyances in StumpWM that do the same for XMonad.

StumpWM really really doesn't like floating windows. Far as I know, there isn't a way to detach one randomly, or do anything with one once it's detached. The WM also occasionally throws errors when a program tries to put up an alert, like a file-save notification or print dialog. XMonad has yet to yell at me about that, and it elegantly deals with floating windows using the mouse[5].

Stump still crashes with GIMP. I vaguely hoped that the single-window mode would outright resolve that issue, but it hasn't. Sure you can now run the program, but attempting to open a file with it results in the WM becoming unresponsive to keyboard input[6]. XMonad has no such problems, and being that I occasionally like to draw things, I'd prefer my window manager to not explode while loading drawing tools. Even apart from the specific GIMP problem, I've found StumpWM to crash more in general than XMonad does[7].

Taking screenshots using import caused some odd errors. It would very occasionally crash the WM, and very frequently create a black rectangle rather than a screenshot of the appropriate screen area. I normally wouldn't put this down to the window manager, except that I haven't observed the effect in XMonad, XFCE or Gnome.

I'm prepared to make peace with the fact that C-t has to be an exclusively window-manager keystroke, and I've changed my keymap a bit to mitigate the second-class status of chorded mod keys. Specifically, I've bound any repetitive keystrokes to C-t C-[key] rather than C-t [key]. It doesn't entirely solve the problem, but using hold C + t h t h t h t h t h to resize windows is still preferable to C+t h C+t h C+t h C+t h C+t h. Speaking of configs

Finally

The main thing I've been kicking around is actually Haskell. I finally buckled down and went through most of the Happstack Crash Course[8], and it just about feels like I have a less tenuous grip on the language than I used to. After reading through the references available, hitting my head rather hard against the concept of monads, going through several tutorials, and attempting a few small programs of my own, it is possible for me to write a medium sized program in Haskell without pulling a reference text out every two minutes. That only took about three years. I'm not entirely sure whether the effort has been worth it in the direct sense, but I still stand by my prior assessment of the situation. Understanding a new mode of thinking about a problem cannot be a waste of time. Even if it turns out to be less effective than another mode, or even outright incorrect, understanding the process will give you some insight, whether about the problem, about the current practitioners of its solutions, or about your own cognitive assumptions.

All of those are powerful things, and you get very surprisingly few of them if you only know one language.

Ruby and Erlang each come with their own modes, and recent Emacs versions ship with a built-in Python mode and shell. Smalltalk uses its own environment (though GNU Smalltalk does have its own mode), and I'd really rather not talk about PHP. If you're writing in it, chances are you're using Eclipse or an IDE anyway.