Thursday, June 17, 2010

My giant epistemology paper, that is. Forty-six pages. I'm fairly pleased with it. It certainly lacks nothing for boldness. Skipper said that writing a good paper takes a month, and he wasn't exaggerating. I just hope he agrees that it's good.

I had four separate people copy-edit it, and they all found a ton of completely non-overlapping egregious typos that I overlooked in my many readings of it. Matt and Rachael both found 20 to 30, each! Different ones, too! And I was pretty sure I had caught everything! It's amazing how much tiny errors damage credibility, and how unable I appear to be to see them.

Good to be done with it. It's by far the longest paper I've ever written, and also the most difficult in terms of content. It pulls in lots of the reading I've done over the last couple years; it's almost like a year 1 paper (if I were actually in a PhD program).

Tuesday night I went to bed all freaked out. I had 8500 words written, but I had painted myself into a corner and I figured I needed to backtrack and delete a 1000-word section. I was lying in bed thinking "maybe I'm not cut out for this, maybe I should just get a banking job or something; maybe cognitive science research is too much for me and I should just be an observer...". I shook off the funk in the morning, deleted the offending section, and wrote the last 2500 words or so straight through.

Friday, June 11, 2010

My goodness, it seems that finishing theses has a strongly negative effect on blog-motivation! Never fear though, soon I'll have a 9,000-word paper on Simplicity (I know, right?) to share with you! I was afraid for a while that it was straying dangerously far from the theme of the class (Occam's Razor, and the tortured modern interpretations of it), but the professor reassured me that he was equally nonplussed by the several hundred pages of reading he had us do for the class, and that he'd be much more interested in reading something about the cognitive science of simplicity, which is what I'm writing about. Probably wise to at least mention the assigned readings, though; so far I've done lots of exposition on stuff I've read over the last year outside the class and plenty of integration thereof with nary a mention of 'grue' or 'minimal reversals.'

I had a moment of programming-induced rage last night (involving lots of swearing at hung terminals and throwing empty cranberry-juice bottles), triggered by realizing just how awesomely specialized R apparently is for some tasks, as compared to Python. I'm working on this "Think Python" book, and one of the tasks it had me do really stretched the capabilities of the language. It asked me to read in the official crossword list and do some list operations on it; such an apparently easy task (from my R-soaked perspective) that I almost skipped it. As it turns out, making a list by traversing the word file and appending each of the 100k or so words to the previous ones takes up a LOT of memory, and it was a "great learning experience". Apparently Python creates a new list at each iteration, so instead of one list with 100k-some elements you get 100k-some lists, each with as many elements as precede it in the list. I know I could do the math to say how much excess memory that uses, but meh. I tried it several different ways, looking up efficient idioms in Python to do the task, and every one of them caused the shell to hang for at least a half hour. None of them actually "finished" in a way that gave me a usable list; if I had the patience it would eventually give me back my command prompt, but would remain hung.
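For what it's worth, here is a minimal sketch of the two ways a list can be grown in a loop, assuming the slowdown came from concatenation rather than from `append`. The function names and the sample words are made up for illustration, and which form the attempts above actually used isn't recorded. The point of the sketch: `words = words + [word]` builds a whole new list on every pass, while `words.append(word)` modifies the one existing list in place.

```python
import io

# Hypothetical stand-in for the word file: a small in-memory "file".
SAMPLE = "aa\naah\naahed\naahing\naahs\n"


def build_by_concatenation(lines):
    """Grow a list with +, which builds a brand-new list on every iteration."""
    words = []
    for line in lines:
        words = words + [line.strip()]  # copies all earlier elements each time
    return words


def build_by_append(lines):
    """Grow a list with append, which mutates the same list in place."""
    words = []
    for line in lines:
        words.append(line.strip())  # no copy of the earlier elements
    return words


slow = build_by_concatenation(io.StringIO(SAMPLE))
fast = build_by_append(io.StringIO(SAMPLE))
print(slow == fast)  # both give the same list; only the cost differs
```

With five words the difference is invisible; with 100k-some words the concatenating version does on the order of billions of element copies, which may be one explanation for a shell that appears to hang.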

Maybe if my first real programming experience had been with a general purpose language like Python, I would have been expecting this. But I've been working with super-specialized-for-statistics-"R" for the last year, which does tasks like this efficiently in the bleary-eyed milliseconds of post-waking-up pre-first-cup-of-coffee time frame that it takes for the enter key to spring back up after being depressed. Want to read a file and do list operations on it? Great! Type "someListthat. Happily I found that it's quite comfortable with that sort of operation, but my travails weren't quite over yet. The "range(x)" function will create a vector of length x with pleasing rapidity and is the closest equivalent to R's "vector(mode, length)" function I could find. Unhappily, if you try to print this vector with x=100k-some, it hangs the shell. What the EverlovingEff? This is another thing that R does without blinking. I suppressed my boiling anger long enough to find that if I just assign it to a variable without printing it, like this: "someList=range(x)," everything is fine. It makes a list of the requested length, but fills it with consecutive integers. Presumably this isn't the most memory-efficient way to go, since all I really wanted was zeros, but it does the trick.
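If all that's wanted is a list of a given length filled with zeros, something like the following sketch may do, assuming a plain Python list is an acceptable stand-in for an R vector. The variable names and the value of `n` are illustrative only. In Python 2, `range(n)` returns a list directly; wrapping it in `list(...)` keeps the snippet working in Python 3 as well, where `range` is lazy.

```python
n = 100000  # roughly the size of the word list; the exact count is an assumption

zeros = [0] * n            # n slots, every one holding 0
counted = list(range(n))   # n slots holding 0, 1, 2, ..., n-1

zeros[0] = "aa"            # any slot can be overwritten by index
print(len(zeros), len(counted), counted[-1])
```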

Now, at long last, I could read the file and assign words to slots in this damn vector. Here's the code that I finally came up with, at 1am:

    i=0
    fin=open('/Users/Tj/Downloads/swampy.1.4/words.txt')
    for line in fin:
        i=i+1
    result=range(i)
    fin.close()

    fin=open('/Users/Tj/Downloads/swampy.1.4/words.txt')
    j=0
    for line in fin:
        word=line.strip()
        result[j]=word
        j=j+1
    fin.close()

It works, just don't type "result" into the shell, for the love of god! Why was I doing all this, you might ask? Merely so that I could eventually implement the "bisect" algorithm (which incidentally Python already has a version of) to reduce the search space in sorted lists. It's pretty neat: If you've got an alphabetical list of words and you want to know whether some other word is in it, you could just read through the list, testing each word to see if it matches the new word, and after 100k-some operations you'd be done. OR, the non-naive way to do it: test whether the word is in the first or second half of the file based on alphabetization, then test the half that it's in, then test that half... and continue till you've got a list of just one or two words you can test. The virtue is that the bisect algorithm completes the search in twenty-or-so operations, rather than 100k-some. Neat, eh? Now that I've finally got a list with indexes, I can actually write the function!
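A minimal sketch of the halving search described above, offered as one possible way to write it rather than the book's solution. The function names and the six-word sample list are hypothetical. The second function does the same check with `bisect_left` from the standard library's `bisect` module, which is the built-in version mentioned above.

```python
from bisect import bisect_left


def in_sorted_list(words, target):
    """Return True if target is in the sorted list words, halving the range each step."""
    low, high = 0, len(words) - 1
    while low <= high:
        mid = (low + high) // 2
        if words[mid] == target:
            return True
        if words[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return False


def in_sorted_list_stdlib(words, target):
    """Same check using the standard library's bisect_left."""
    i = bisect_left(words, target)
    return i < len(words) and words[i] == target


# Hypothetical sample data; the real word list is far longer.
words = ["aa", "aah", "aahed", "aahing", "aahs", "aal"]
print(in_sorted_list(words, "aahed"), in_sorted_list(words, "zebra"))
```

Each pass through the loop discards half of what is left, so a list of 100k-some words needs about 17 comparisons at most, in line with the twenty-or-so figure.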

I'd be pleased if some expert Python programmer would offer a comment explaining that I'm doing it all wrong and give me a nice simple solution to the problem. Pleased in an ironic way.

A few more complaints, though: IDLE doesn't seem exceptionally stable on OS X. I can't use the keyboard interrupt to stop processes that are taking forever, and I end up force-quitting it much more often than seems ideal. Also, I can't seem to run more than one shell at a time. Neither of these things is a problem on my sparkly-new Ubuntu-running salvaged laptop from 2004, though that's got its share of headaches.

Speaking of which: Cory Doctorow has a glowing review of the latest Ubuntu release, co-opting the "It just works!" slogan erstwhile applied to Macs. As much as I'm enjoying Ubuntu, I can't say that I've had the same experience. Version 9.10 worked pretty well, aside from issues with the wireless driver (5 hrs of (clueless) shell scripting), not being able to hibernate when the lid is closed (no solution, just have to do it manually), and the wireless being disabled when it comes back from manual hibernation (no solution but reboot). I enthusiastically upgraded to version 10.04 when it prompted me to do so... only to find that I had to do some shell scripting to get the video driver to work, plus the wi-fi issues were back and harder to solve. I ended up going back to 9.10 in frustration. Complaints issued, I must say that I rather like it. It's sleek and fast, and there's free software available for everything I want to do. My advice to would-be Ubuntu explorers: be prepared for deep-googling problem-solving sessions, at least if you're using a Dell x300.