Random Crap

Posts Tagged ‘blagoblag’

Thanks to Odiogo.com (via @johndcook), this blog now has a podcast powered by speech synthesis. Not having heard any decent speech synthesis for open-domain text (maybe I’m behind the times here), I was pretty impressed with it. John had a post with a quote from The Agony and the Ecstasy, and Odiogo got it pretty close to right in terms of pronunciation and intonation. Hopefully it will turn out as well for my blog. Let me know if you give it a listen.

If you are in search of a blog that will put an end to all of your earthly troubles, look no further than the Noisy Channel. Aside from being a font of knowledge that will turn you into an AI from a futurist’s dream, there have been reports that regular TNC readers have been cured of certain debilitating illnesses such as halitosis, trichodaganomania, and the often fatal googlemania.

What are you waiting for?

P.S. Daniel, email me for the address of where you can send the check.

There has been much ballyhoo in the blogosphere touting Google’s so-called foray into semantic search. The blog post announcing the new feature doesn’t even mention the word “semantics,” but it does say it looks at associations and concepts related to your query. I see no mention of tuples or anything of the sort, and the suggested queries are the kind of thing that I would expect to come out of a background closer to document/query classification than semantic analysis.

Related search results for much ado about nothing

And the results are pretty meh. Except for taming of the shrew, those results are no-brainers. Those are query-completion-quality results. Of course, you can’t judge the whole system by one isolated example.

When PC World and a host of other pop tech media zines started toasting the entrance of Google to the semantic arena, I was excited to see some cool stuff. Imagine my disappointment when I was underwhelmed not only by the quality of the results, but also by the lack of novelty. How long has that feature been there? Seems like I’ve seen it for ages. Maybe it got a technological face-lift (I guess that would be a face-lift on the inside), but it looks about the same as I remember it. Plus, its placement at the bottom of the results page relegates it to search engine hell.

In summary: boring. My complaints are first and foremost with those elements of the blagoblag who over-hyped this. Secondly, I am complaining to Google for not being better. I am feeling demanding today.

It is bad journalism when an old news story is debunked and continues to be rehashed! How sloppy! Shame on you, Houston Chronicle!

Back around 2000, when Palem began thinking about the future of computer chip technology, power consumption wasn’t a big consideration. Only speed mattered.

But today, the energy consumed by information technology – a January news story likened the energy used in just two Google searches to boiling a kettle of tea – has become a major consideration.

Google debunked the results quite quickly after that article ran. Why is it acceptable to cite stories without checking on whether those stories are accurate? Isn’t this what we pay journalists for? I guess it’s too hard to check up on facts and instead we can just say there was a news story that reported it rather than making any claims about its correctness. Isn’t that what we have bloggers for?

Since I started blogging almost a year and a half ago, I have been following many blogs. Early on, I managed to find some blogs dealing with computational linguistics and natural language processing, but they were few and far between. Since then, I’ve discovered quite a few NLP people who have entered the blagoblag. Here is a non-exhaustive list of the many that I follow.

Many of these bloggers post sporadically and even then only post about CL/NLP occasionally. I’ve tried to organize the list into those who post exclusively on CL/NLP (at least as far as I have followed them) and those who post sporadically on CL/NLP. I would fall into the latter group, since I frequently blog about my dogs, regular computer science-y and programming stuff, and other rants. P.S. I group Information Retrieval in with CL/NLP here, but only the blogs I actually read. I’m sure there are a bazillion I don’t.

If I’ve missed one or more, please let me know. I’m always on the lookout. Ditto if you think I’ve miscategorized someone. I’ve excluded a few that haven’t posted in a while.

Tom Preston-Werner, aka mojombo, rocks. When GitHub announced GitHub Pages recently, they pointed to a new blog engine, Jekyll. Jekyll generates the blog as a set of static pages — no database reads, no PHP, just fast HTML. I was instantly drawn to it, and since I’ve been itching to switch blog engines, I damn near moved this blog. It would be hosted on GitHub, for free. And it would be backed up using my favorite version control system. I would have complete access to all of my content. If WordPress went belly up, I would lose all of my content. That bothers me.
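
To make the static-pages idea concrete, here is a rough sketch in Python. It is not Jekyll's code (Jekyll is written in Ruby and does far more); the template, the `render_post` and `build_site` functions, and the sample posts are all made up for illustration. The point is only that every page is rendered once, ahead of time, into a plain HTML file, so serving the blog needs no database and no server-side code.

```python
import html
import tempfile
from pathlib import Path

# Hypothetical page template; a real engine would use a templating language.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><title>{title}</title></head>
<body>
<h1>{title}</h1>
<p>{body}</p>
</body>
</html>
"""


def render_post(title: str, body: str) -> str:
    """Render one post to a complete HTML page, escaping the text."""
    return PAGE_TEMPLATE.format(title=html.escape(title), body=html.escape(body))


def build_site(posts: dict[str, tuple[str, str]], out_dir: Path) -> list[Path]:
    """Write one static .html file per post and return the paths written.

    `posts` maps a URL slug to a (title, body) pair.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, (title, body) in sorted(posts.items()):
        path = out_dir / f"{slug}.html"
        path.write_text(render_post(title, body), encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    # Made-up sample posts, just to exercise the build step.
    sample_posts = {
        "git-for-research": ("Git for Research", "Version control is handy."),
        "parsing-guide": ("A Guide to Parsing", "Start with a grammar."),
    }
    with tempfile.TemporaryDirectory() as tmp:
        for page in build_site(sample_posts, Path(tmp)):
            print(page.name)
```

Once the files exist, any plain web server (or GitHub Pages) can hand them out as-is. The flip side is that anything dynamic has no natural place to live in a setup like this.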

Jekyll is still in its infancy. But for two things, I would switch right now. First, support for tags is incomplete, so pages on my blog such as http://mendicantbug.com/category/computational-linguistics/ would no longer be supported under Jekyll. That would play hell with my Google traffic. I’m willing to make that sacrifice since most of that traffic is from people who don’t care about the main topics I’m interested in. Second, and this is the killer, Jekyll does not support comments. Yet. The good news is, it can be forked and someone may implement comments. I hope so, but the static nature of Jekyll means handling comments is not very straightforward. I can imagine how it might be done, so we’ll see. I suppose I could do it myself, but my plate is so full right now I’m having a hard time getting what I need to get done done.

So what I’m doing instead, for now, is hosting my code there. Jekyll has code highlighting built in, available through its Liquid templates. Handy! I put up the source for my post on Bandwidth simulation. I’ll be adding more soon, which I’ll make note of, if for some reason you’re actually interested.

Looking back over 2008, there have been a lot of changes in my life. Many of those are reflected in my blog, but few are reflected in the posts that have gotten the most traffic. But for the hell of it, here are the top posts anyway.

Of all of those posts, the best one is hands down 10 Reasons to Use Git for Research. After that, the Noob’s Guide to Parsing. Some of the posts with the most hits are just link-sharing, where I saw something cool (Salad Fingers, Steampunk Star Wars, Ambigrams) and then other people found my link first. One definite change on this blog was a decrease in the frequency of my posts. Around the end of last year, I was posting close to 2 items per day. Now it has stretched out to about 2 items per week. Maybe I’ll reflect more on that later.