The papers are out for WWW2009 (and have been for a bit), but I’ve only just gotten a chance to start looking at them. First of all, kudos to the ePrints people for improving the presentation of conference proceedings. This is a lot easier than having to do a Google Scholar search for each paper and hoping I find something, like I have to do with some conferences.

WWW2009 Madrid

There are a lot of very interesting ones, and here are a few that bubbled to the top of my reading list:

The books on string and tree algorithms and collective intelligence should be self-explanatory. I wanted the book on data visualization because it was an overlooked skill in my education. I appreciate great data visualizations, and taking some steps to improve my understanding and skills in that area is worth doing. Finally, the book on evolutionary computing is for personal enrichment. I’ve been playing around with genetic algorithms since 1994, even before I got out of high school. It’s always been playing, though, and I wanted a more rigorous introduction to them.

With any luck, I’ll be posting some thoughts on these books in the coming months.

Twitrratr is a new service that attempts to do sentiment analysis on Twitter (follow me while you’re at it). According to their about page, they started off by tracking opinions on Obama but have since expanded to any term. Enter a keyword and it searches Twitter for occurrences. It then assigns a sentiment to each post and returns the percentages of positive, neutral, and negative tweets for that word. You can also track your own sentiment by searching for @your-username. I come up neutral, but there’s not a lot of data to go on there.

Their method appears to be fairly simple. They have a collection of adjectives with sentiment values (negative, positive), and based on which of them appear in a given tweet, they classify it. Of course, this probably has low recall (meaning it misses a lot of tweets that do express sentiment), since sentiment can be expressed without using adjectives. I’m not sure if it tries to do anything with negation, but from my scans of the results so far, it looks like it ignores it.
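To make that concrete, here is a minimal sketch of what an adjective-lexicon classifier like this could look like. It is my guess at the general approach, not Twitrratr's actual code: the word lists and the function names `classify` and `sentiment_percentages` are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon of adjectives; the real service's lists are unknown.
POSITIVE = {"good", "great", "awesome", "cool", "happy"}
NEGATIVE = {"bad", "awful", "terrible", "slow", "sad"}


def classify(tweet):
    """Label a tweet by counting lexicon adjectives; negation is ignored."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_percentages(tweets):
    """Return the percentage of positive, neutral, and negative tweets."""
    counts = Counter(classify(t) for t in tweets)
    total = len(tweets) or 1  # avoid dividing by zero on an empty search
    return {
        label: 100.0 * counts[label] / total
        for label in ("positive", "neutral", "negative")
    }
```

Both weaknesses show up right away in a sketch like this: "Twitter is not great" still counts as positive because the negation is never looked at, and "This made my day" comes out neutral because it contains no adjective from the list.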

So even though it’s pretty crude, it’s a nice toy. If they care to extend the algorithm, they have some pretty cool data to work with. I think it would be cool to get some (possibly donated, probably not paid) human effort together to tag some of their data and release it as a research dataset.

After hearing about it for weeks, I caved and decided to check out FriendFeed last night [and again, ht @dpn]. In previous posts I mentioned something I like to call the information diaspora. This is the phenomenon created by posting all sorts of personal information about your likes, dislikes, thoughts, opinions, etc., all over the web and your subsequent loss of that information because it can’t be managed. I can see FriendFeed coming in handy for reducing this problem. You can attach a number of different social networking sites, Flickr, YouTube, etc., to your FriendFeed account. Whenever you post something new on one of these sites, that information will be updated on FriendFeed for all of your friends (and yourself) to view. It’s not the perfect solution, but it is a very big step in the right direction.

Check it out. As usual, my username there is ealdent and feel free to friend me.

My friend Israel clued me in on Dapper a few weeks ago. I have played around with it only a little, but that was all it took to recognize its potential. The idea is simple, the implementation not so much. When you browse videos on YouTube, the layout of the search results is always the same. So why can’t something recognize this and treat any search result as an RSS feed, checking it periodically for changes? Enter Dapper. One thing that has bothered me for the past couple years is the fact that the ACM Technews does not have an RSS feed. WTF, ACM? Thanks to Dapper, now it does.
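For the curious, the basic idea can be sketched in a few lines of Python. This is not how Dapper itself works (I have no idea what their implementation looks like); it just assumes every result on a page is a link with a known CSS class. The class name and the helpers `extract_items`, `new_items`, and `to_rss` are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (title, href) for every <a> tag carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.items = []
        self._href = None  # set while we are inside a matching link
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and self.css_class in (attrs.get("class") or "").split():
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_items(html, css_class):
    """Pull the repeated result entries out of one page of HTML."""
    parser = LinkExtractor(css_class)
    parser.feed(html)
    parser.close()
    return parser.items


def new_items(items, seen):
    """Return items whose link has not been seen yet, and remember them."""
    fresh = [item for item in items if item[1] not in seen]
    seen.update(href for _, href in fresh)
    return fresh


def to_rss(feed_title, feed_link, items):
    """Render the items as a bare-bones RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Scraped from a repeated page layout"
    for title, href in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = href
    return ET.tostring(rss, encoding="unicode")
```

To check a page periodically, you would fetch it on a schedule, run `extract_items` on the HTML, and pass the result through `new_items` with a persistent `seen` set so that only the changes end up in the feed.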

Unfortunately, Dapper is not perfect. It took me a few tries to get my first dapp working (what they call a single instance of the service). Granted, it was on fairly complicated output (not ACM Technews). If the service you are trying to create a dapp of uses sessions, your attempt will probably fail (and if it doesn’t, let me know how you did it). They are still improving the service, though, so perhaps that will change.

If you are into information trapping, though, Dapper is a must-have in your arsenal of traps.

Is it Hallowe’en already? A fellow NLP blogger (and twitterer) pointed me to Plurk just a few minutes ago. I have been messing with Twitter’s API over the past couple days, which hasn’t been as easy as you’d think since they are suffering from massive growing pains. Fetching the public timeline takes between 5 and 30 seconds. However, they just got like $15 million in funding, so maybe they’ll be able to address the issue. The even bigger question is whether they can turn this free advertising service (which is what it is partially becoming) into a revenue stream.

Plurk is basically Twitter with a makeover and some extra social features thrown in. It still has the 140-character status update style interface, but includes a function selection for each plurk (what they call qualifiers): you can say, think, ask, wish, etc. You can also add smileys. Rather than appearing as a series of boxes scrolling down the screen, your plurks appear as floating boxes on a side-scrolling timeline. Plurks of friends also appear on this timeline and the result is a more graphical and pleasing (to me) interface. You can reply directly to other plurks in the boxes and conversations are tracked very nicely. This is far superior to Twitter, which requires you to visit the other person’s timeline and wade through their tweets to find previous tweets in a thread. With Twitter being slower than a drunken monkey with three broken legs, that’s even harder.

As my esteemed colleague pointed out, however, scaling is an issue for any service like this. Ultimately, you are bound by how fast you can access the database. If Plurk becomes as popular as Twitter (and I have every reason to believe it won’t), it will also become bogged down. Also, Plurk is just getting started and has no discernible API (unless I’m just missing it). Twitter already has quite a few third party apps.

I must say, though, I am sorely tempted to abandon Twitter in favor of Plurk just for the fact that Plurk is accessible. The massive lag of Twitter is getting to me. Of course, if no one is there to listen to my ramblings, what’s the point?

Science fiction author Arthur C. Clarke died yesterday. He touched many lives through his writing and his ideas had an impact on me at an early age with short stories like “The Nine Billion Names of God” and movies based on his books like 2010 (which I saw in the theater) and later 2001 (which I saw as a young man). His novel Rendezvous with Rama is being made into a movie and IMDB is quoting 2009 as the release date. I thought it was interesting to find out he had been living in Sri Lanka for some time.

I visited my family in Ohio this past weekend and my uncle made a few interesting points. He’s an old-school spring engineer, meaning he learned coming up through the trade rather than by going to school, and he supervises a number of employees at a relatively small spring company. My grandfather used to own a spring company called, shockingly enough, Adams & Sons Spring Co. That was later bought out and a number of the employees were moved to a different plant, including my dad and uncle. So anyhow, my uncle was telling me a story, which I won’t go into, but the heart of it is that you should not wait for people to hand you “what you deserve.” If you are a leader, regardless of your job title, then lead. If you see someone who needs help, don’t wait for them to ask you. Help. Show that you have the initiative. That’s probably fairly obvious, I mean we’ve all heard it before, but it came at a particularly important time for me.

I’ve been on Twitter for a while now, though I don’t update it super-regularly like some people. It’s fun and I hope more of my friends start using it, but I’ve noticed an interesting trend. Just about anything is open to potential spam. Friendster is sick with it. MySpace is abominable. LinkedIn seems fairly immune and I’ve gotten very few spam friend requests from Facebook. Twitter has so far been very good about it, but something new is happening there. You can follow people and people can follow you on Twitter, so your status updates are public and potentially seen by thousands of people. How do you increase the number of people who follow you? Follow them, of course! I’m having random people follow me left and right. It only helps me, since I don’t follow them back, but it’s interesting to note.