Thursday, February 21, 2008

With the recent release of JRuby 1.1 RC2 I’ve rerun my ‘Real World Performance’ benchmarks. In addition to my normal run I’ve added a second longer run, and the results surprised me. Let me tell you what I did, and then we can take a look at the results.

In both cases I use an application that builds a catalog of good and bad patterns. Then it reads in syslog output and compares the log entries against the catalogs, sending different kinds of notifications depending on the configuration. While this sounds like it might be IO bound, profiling shows that much of the time is actually spent in Array operations and comparisons.
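To make the workload concrete, here is a minimal sketch of that kind of catalog matching. It is not LogWatchR's code: the pattern lists, the `classify` and `tally` helpers, and the sample log lines are all invented for illustration, and I'm assuming the catalogs are plain arrays of regular expressions.

```ruby
# Hypothetical sketch: none of these names or patterns come from LogWatchR.
GOOD_PATTERNS = [/session opened for user/, /Accepted publickey/].freeze
BAD_PATTERNS  = [/Failed password/, /segfault/].freeze

# Classify one syslog line as :bad, :good, or :unknown by scanning the catalogs.
def classify(line)
  return :bad  if BAD_PATTERNS.any?  { |pattern| pattern.match?(line) }
  return :good if GOOD_PATTERNS.any? { |pattern| pattern.match?(line) }
  :unknown
end

# Count how many lines fall into each category.
def tally(lines)
  counts = Hash.new(0)
  lines.each { |line| counts[classify(line)] += 1 }
  counts
end

# Invented sample input, shaped like syslog output.
sample = [
  "Feb 21 10:00:01 host sshd[123]: Failed password for root from 10.0.0.5",
  "Feb 21 10:00:02 host sshd[124]: Accepted publickey for deploy",
  "Feb 21 10:00:03 host cron[125]: (root) CMD (run-parts /etc/cron.hourly)",
]
p tally(sample)
```

Every line is checked against every pattern until one matches, so with large catalogs the array scans and comparisons can easily outweigh the cost of reading the file.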

(Between RC1 and RC2, the JRuby team added the -X prefix to the compilation switch to prevent a conflict with MRI’s -C switch.) The -C switch turns off compilation, which is faster on these shorter runs. I also tried using the +C, but am ignoring the results for now.

The results of the new style benchmark look like this:

| Version | Switches | Average Run Time | Stdev | Time/syslog entry |
|---|---|---|---|---|
| JRuby 1.0.3 | -C | 127.90 | 15.8 | 0.00174 |
| JRuby 1.1 RC1 | -C | 231.86 | 30.2 | 0.00315 |
| JRuby 1.1 RC2 | -X-C | 218.93 | 34.89 | 0.00297 |
| JRuby 1.1 RC2 | -X+C | 193.15 | 8.56 | 0.00262 |

The first thing that jumps out at me is that the 1.1 release candidates are significantly worse than the 1.0.3 release. The other thing is that performance (measured per entry) degraded on the longer run. The former is something that the JRuby team might want to look at, while the latter means I probably need to look at my code.
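If you want to produce the same three columns for your own runs, something like the following would do it. This is a sketch, not the script behind the table: the helper names are made up, the run times in the usage line are placeholders, and I'm assuming a sample standard deviation (dividing by n - 1), which may not be how the figures above were calculated.

```ruby
# Summarize benchmark timings: mean, sample standard deviation,
# and mean time per log entry. Helper names are hypothetical.
def mean(values)
  values.sum.to_f / values.size
end

# Sample standard deviation (n - 1 in the denominator); an assumption.
def sample_stdev(values)
  m = mean(values)
  variance = values.sum { |v| (v - m)**2 } / (values.size - 1)
  Math.sqrt(variance)
end

def summarize(run_times, entries_per_run)
  avg = mean(run_times)
  { average: avg, stdev: sample_stdev(run_times), per_entry: avg / entries_per_run }
end

invented_runs = [12.0, 14.0, 13.0] # placeholder timings, not measured data
p summarize(invented_runs, 1_000)
```

Dividing the average by the number of entries is what makes runs of different lengths comparable, which is how the per-entry slowdown on the longer run shows up.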

I used the same JAVA_OPTS settings shown in the original style tests, but I did run the RC2 version with compilation turned on (and did see an improvement). I should probably set aside a couple of hours to let these tests run against an even larger data set at some point.

Please, don’t assume that your app will end up with the same performance characteristics that LogWatchR has. Run your own performance measurements … and think about posting them; it will help all the implementations get better.

Thursday, February 14, 2008

After running three years worth of quizzes, I retired as quizmaster and passed the project on to some loyal fans. New quizzes are still posted to the Ruby Talk mailing list so look for them there if you want to participate.

Just because I knew it was coming didn’t make it any less unpleasant. I don’t think I’ll be alone in missing James’ weekly Ruby Quiz summaries. I’m not entirely sure who’s picking up the reins on the weekly quiz, but I hope they’ll work hard to live up to the expectations that James has set for them.

As for James, I’m sure we’ll continue to see his contributions on ruby-talk, and in the larger community. Good luck in all your future endeavors, my friend.

I’ve interviewed both Luke and James about Ruby, Puppet, and James’ book. So, it’s only fair that I turn an eye to the book as well. In the interest of full disclosure I should mention that Apress gave me a PDF of this book for review purposes.

Pulling Strings with Puppet was published as an eBook through Apress’ firstPress imprint. It’s a really big book to see published this way, coming in at 187 pages, two or three times the size of a lot of ebooks. Like other books in the firstPress line, this one is also available in print (the Amazon links in this review point to the paper copy; if you want the PDF, look here and follow the link).

James is an accomplished Apress author, having also published Pro Nagios 2.0 and Hardening Linux (both of which are highly regarded). As such, he’s certainly proven his chops in the sysadmin world, so his book on Puppet can be seen as something of a vote of confidence in the system.

The book contains seven chapters, as follows:

Introducing Puppet

Installing and Running Puppet

Speaking Puppet

Using Puppet

Reporting on Puppet

Advanced Puppet

Extending Puppet

The first chapter should whet your appetite for the rest of the book, and chapters three and four will get you running along smoothly, but I think it’s the last three chapters that really make this book worthwhile. In chapter five, the discussion of the built-in report tools and the mechanism for building custom reporting will be of interest. I really liked chapter six’s coverage of scalability (both the admission that Puppet still needs work and the workarounds that can help scale it until the work gets done). All of chapter seven was interesting reading, and it shows the ease of extending Puppet and Facter (one of the underlying libraries that makes Puppet work).

This book is filled with helpful code samples and pointers to external resources that look very useful. It’s well written and easy to understand. As good a tool as Puppet looks to be, this looks like an equally good book to get you going. If you’re doing configuration management for anything more than a box or two, run, don’t walk, and pick up your copy of Pulling Strings with Puppet.

Monday, February 11, 2008

Well, I’ve heard from a bunch of the MountainWest Regional Sponsors, and I’m getting really excited as we draw a bit closer. Addison-Wesley is donating some books from their Professional Ruby Series for us to give away. Apress and O’Reilly are also donating books that will be passed out as part of the MWRC’s famous ‘binary lottery’ (well, maybe it’s not famous, but it sure is fun).

One of the sponsors has let me know that they’re planning something really special for attendees. I can’t say what it is yet, but they’ll be announcing it soon, so make sure you keep an eye on all the sponsors' websites. (You can find the complete list of sponsors on the MountainWest RubyConf website.) Of course, I’ll post the news here as soon as they announce it too, so if you don’t mind getting your news second hand …

If you want to get in on the action, you can go to the MWRC registration page. It’s $100 for 2 days, 10 regular sessions, 2 lightning round sessions, keynotes by Evan Phoenix and Ezra Zygmuntowicz of Engine Yard and Jim Weirich of EdgeCase, a conference shirt, food, and chances to win great conference schwag, not to mention the awesome surprise to be announced soon. How can you go wrong?

Recently, I’ve been reading about Puppet because of James Turnbull’s excellent Pulling Strings with Puppet. James has been good enough to do a short interview with me as well. I hope you enjoy reading it as much as I enjoyed chatting with James (you might want to check out my interview with Luke Kanies too).

I normally think of you as a Pythonista, yet Puppet comes from the land of Ruby. How much does that matter to you? How much should it matter to an end user?

Actually I’d probably be classed a Perl Monger rather than a Pythonista. Either way I am a “hack & slash” programmer in both languages – functionality over elegance.

I took my first serious look at Ruby about the same time I found Puppet. I initially thought that the language wouldn’t matter to me at all but I confess Ruby has sucked me in a great deal more than I had anticipated. It is a very elegant language and one well suited to beginners and system administrators. I find Ruby’s syntax and flow control very easy to grasp and yet a few simple extensions and you are doing some very powerful things. I think when more system administrators discover Ruby we’ll see a slow movement away from Perl and Python as the core “sysadmin scripting” languages.

As a Puppet end user Ruby almost shouldn’t matter to you though. The “almost” relates to exactly where Puppet is in its life cycle. Puppet is still a young product. It is highly flexible and featured and if you’re using Puppet in serious anger about 80% of what you need is there. But there are still some resources you can’t manage. The fastest way to fix that? Develop these resources yourself. Luke Kanies, Puppet’s author, has built a very simple framework that with some Ruby knowledge allows you to do this. We are also seeing more development input as the community expands and people are starting to get more involved. This may provide most of the additional resource types needed right now.

How did you discover Puppet?

By accident mostly. I was researching configuration management tools and I was reading a blog post and someone had commented on this upstart new tool called Puppet. I went over to the Reductive Labs site and downloaded it. Fifteen minutes later I had a master and 5 clients running managing my DMZ servers. Two days after that I was starting to think – “Gee it’d be nice if Puppet did…” And then I was hooked.

Any recommendations on books, blogs, or websites for sysadmins who want to learn more about Ruby?

I like Apress' (disclosure – I am obviously an Apress author) Beginning Ruby. I found it a very useful book – easy to use as a reference and easy to read through. I also found that the Ruby documentation and the basic tutorial at tryruby.hobix.com are good places to start.

Why not CFEngine, one of the other systems management tools, or even a roll-your-own solution?

I have used CFEngine in the past and I know CFEngine has a very loyal user base. It’s a solid tool for many purposes but it has weaknesses and a lot of things that I personally don’t like about it. It also has a long development cycle and a limited number of people contributing to its development. Alternatively I see Puppet as having a quite dynamic community with developers who are responsive to community requirements. Hence for me Puppet is a better choice of tool and one I felt it was important to support and develop.

Using “roll-your-own” tools I think defeats the whole purpose of configuration management. The scripts usually require trees of case statements to cater for varying operating systems and platforms, they usually aren’t portable and they require a communications mechanism that usually defaults to ssh and a for loop. Puppet makes all of that go away – it is a configuration management abstraction language combined with a secure client-server mechanism to configure your hosts. It takes all the pain out of systematically and efficiently managing your hosts.

Puppet seems like a good answer on the configuration management end of the sysadmin house. Is there room for a Ruby solution on the monitoring side as well? Or do you think nagios, mon, or some other solution have that all sewn up? (If so, which one?)

There is a Ruby solution that I quite like for monitoring called “god”. It is very immature compared to Nagios or others of that ilk but it has some interesting concepts. You can see it at god.rubyforge.org. I think there is always room for new ideas and new approaches in system administration tools.

Since you were learning Ruby and Puppet at the same time, what kinds of things did Puppet teach you about the language?

Well I had little understanding of classes, modules and related concepts—actually very little OO experience. Understanding Puppet required gaining an understanding of Ruby’s OO nature. That has also made a big difference to how I code overall.

What kinds of things did writing the book teach you about Ruby and Puppet?

Well I hadn’t looked at storing hosts in LDAP or had much experience with Mongrel. Writing the book meant I had to delve quite deeply into both. Also explaining to people how to create your own extensions to Puppet to manage other resources also meant I had to ensure I learnt enough Ruby to comfortably explain the concepts involved.

Can you show us a little bit of Puppet in action?

Well I can show you a very simple example of managing a resource. Let’s say you wanted to ensure the sshd service was enabled and running on hosts. In Puppet you would define the following resource:

service { "sshd":
  enable => true,
  ensure => running,
}

And that’s it! You can then assign this resource to the hosts which need it and Puppet will automatically ensure that the service is enabled and started. The best thing is that this same resource definition will work on Red Hat, Debian, Solaris, BSD and others. This is the magic of Puppet’s configuration abstraction – you only need to define what the resource should look like. Puppet takes care of the how.
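For readers who think in Ruby, the split between the "what" and the "how" that James describes can be sketched roughly like this. It is an illustration only, not how Puppet is implemented: `ServiceResource`, `PROVIDERS`, and `plan_for` are invented names, and the command strings are simplified stand-ins rather than verified commands for each platform.

```ruby
# Hypothetical sketch of "declare the what, let a provider supply the how".
# Names and command strings are illustrative, not Puppet internals.
ServiceResource = Struct.new(:name, :enable, :ensure_state, keyword_init: true)

# Each provider knows how one platform family would start a service.
PROVIDERS = {
  redhat:  ->(service) { "service #{service} start" },
  debian:  ->(service) { "/etc/init.d/#{service} start" },
  solaris: ->(service) { "svcadm enable #{service}" },
}.freeze

# Turn a declared resource into the commands a given platform would need.
# Only the ensure state is acted on here; enable is carried but unused.
def plan_for(resource, platform)
  provider = PROVIDERS.fetch(platform) do
    raise ArgumentError, "no provider for #{platform}"
  end
  resource.ensure_state == :running ? [provider.call(resource.name)] : []
end

sshd = ServiceResource.new(name: "sshd", enable: true, ensure_state: :running)
%i[redhat debian solaris].each do |platform|
  puts "#{platform}: #{plan_for(sshd, platform).join('; ')}"
end
```

The declaration of `sshd` never changes; only the provider lookup does, which is the portability point made above.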

That is a hard choice because lots of people are doing lots of cool things. There is a large multi-coloured search engine company who are managing many thousands of OS X desktops with Puppet and the Fedora Project who use it to manage all of their infrastructure.

Probably the coolest thing I see is how people use Puppet in ways I hadn’t considered. I regularly see people on the mailing list and IRC channel say “Wow – I just found out the syntax can do x”. I look at what they have posted and they have often found some new way to take advantage of the language to configure hosts. It is being part of this sort of community that makes working with Puppet very cool.

What non-Puppet Ruby projects are you watching/using?

I’ve recently started playing with rbot—an IRC bot written in Ruby. That also led me to another IRC bot framework called Autumn Leaves. But generally due to a distinct lack of time most of my focus is on Puppet and Facter.

Who gets the credit (Or is it blame?) for the title of your book, “Pulling Strings with Puppet”?

That’d be my editor and the marketing guys at Apress. Do you know how excited marketing people are when a product allows amusing alliteration and puns? :) But I like it – it’s both kitsch and catchy.

Thursday, February 07, 2008

I’m currently reading Pulling Strings with Puppet, the new book from Apress. So far I’m really enjoying it; look for a review in a week or two. The book whetted my appetite to learn more about Puppet though, so I decided to go to the source and talk to Luke Kanies of Reductive Labs, the development team behind Puppet.

About 4 years ago, I had a pretty clear idea in my head of how I wanted to implement a tool to manage resources, and specifically how each resource type would basically just be a collection of attributes, each with their own behaviour and with the majority of the resource’s behaviour coming from the attributes, not the resource itself. I’m not saying this was a good idea, just that I had it. :)

I was a sysadmin who did a lot of development at the time, which basically meant I was a perl developer, and I was mostly doing OO in my perl. I tried to implement my idea in perl, but I just couldn’t get the class relationships to work (the attributes and resource types each needed to be classes, according to the design in my head). This was back when Python was the shiznit, so I naturally tried it, but Python just makes my eyes bleed (and no, it wasn’t the whitespace, it was things like the fact that ‘print’ was a statement instead of a function, and ‘len’ was a function instead of a method).

I had a friend who had heard Ruby was cool but hadn’t actually tried it himself. Since I was just messing around at the time, I figured I’d give it a go. Four hours in, never having seen a line of Ruby previously, I had a functional prototype.

When I decided to go full time on Puppet, I spent a lot of time agonizing over whether to stick with Ruby (I can’t seem to find a link to the original discussion), because I was rightly concerned that not many people had Ruby deployed and in fact it was a very niche language at the time (this was long before Rails had ever been released).

In the end, I figured developer productivity trumped nearly everything, so I jumped.
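The design Luke describes, where attributes are classes and carry most of a resource's behaviour, is easy to picture in Ruby. The sketch below is my own guess at the general shape and is not taken from Puppet's source: `Attribute`, `Owner`, `Mode`, and `FileResource` are invented names, and real attributes would do far more than compare values.

```ruby
# Hypothetical sketch: a resource type as a collection of attribute objects.
# Nothing here is taken from Puppet's source; all names are invented.
class Attribute
  attr_reader :desired

  def initialize(desired)
    @desired = desired
  end

  # Default behaviour: in sync when the current value equals the desired one.
  def in_sync?(current)
    current == desired
  end

  def describe_change(current)
    "#{self.class.name.downcase}: #{current.inspect} -> #{desired.inspect}"
  end
end

class Owner < Attribute; end

# Mode overrides the comparison so "0644" and 0o644 count as the same mode.
class Mode < Attribute
  def in_sync?(current)
    normalize(current) == normalize(desired)
  end

  private

  def normalize(value)
    value.is_a?(String) ? value.to_i(8) : value
  end
end

class FileResource
  def initialize(path, attributes)
    @path = path
    @attributes = attributes
  end

  # The resource only iterates; each attribute decides what "in sync" means.
  def pending_changes(current_state)
    @attributes.reject { |key, attr| attr.in_sync?(current_state[key]) }
               .map { |key, attr| attr.describe_change(current_state[key]) }
  end
end

file = FileResource.new("/etc/motd", owner: Owner.new("root"), mode: Mode.new("0644"))
puts file.pending_changes(owner: "nobody", mode: 0o644)
```

Adding a new kind of attribute means adding a class, and the resource type itself stays tiny. That is the class relationship Luke says he couldn't get working in Perl.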

How did you get started with systems management?

I am apparently fantastically good at breaking computers, and not so good at fixing them. Thus, when I did fix them, I tended to write code to allow me to repeat that fix. Even in my earliest computer-using days (I’m a Johnny-come-lately; my first computer had a CD-ROM drive and a 500MB hard drive), I could rebuild my computer by booting from a different drive and moving a few directories around.

As I used this ability to get computer jobs, I found that life was always easier when I could teach the computer to do the work rather than doing it myself—the computer doesn’t get bored, and if it’s complaining, I can’t tell. Things got really going for me when I switched from MacOS to *nix (yes, I was doing near-unattended MacOS installs in 1997, using AppleScript and everything).

As the tools got bigger, so did my dreams, and eventually I realized that the only person who was going to create the tool I wanted was me.

Why Puppet?

Puppet’s fundamental advance is its concept of a resource—it builds an abstraction over the things we need to manage, and then uses this abstraction as a kind of API into the operating system. This abstraction allows us to build truly portable configurations that work across multiple platforms, and it also enforces a declarative view of the world instead of the traditional procedural scripting view. See my most recent LCA presentation [1] for more information.

The goal for Puppet’s resource abstraction layer is that it can be the lowest layer in a configuration management stack, but so far we haven’t had the time to get much beyond Puppet itself.

What are your future plans for Puppet?

I’m pushing toward a 1.0 this year, hopefully, as soon as I can get the critical APIs stable. I’m also hoping to add a lot of interesting functionality around making each host’s resource catalog more useful outside of Puppet—e.g., you could have all of your resource relationships set up in it, modify /etc/ssh/sshd_config, and then tell Puppet to figure out what services need to restart because of that change.

As we move toward a more database-backed catalog, vs. the current YAML-dumped version, we’ll get a lot more functionality out of it yet, and I can’t really even see most of that functionality right now.
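The sshd_config example is essentially a graph query: start from the resource that changed and follow the relationships to find the affected services. Here is a rough sketch of that idea. The `SUBSCRIBERS` hash and the "type:name" resource strings are a made-up stand-in for a catalog and are not Puppet's actual catalog format.

```ruby
# Hypothetical sketch: a catalog stored as "resource => resources subscribed to it".
# The resource names are illustrative; this is not Puppet's catalog format.
SUBSCRIBERS = {
  "file:/etc/ssh/sshd_config" => ["service:sshd"],
  "package:openssh-server"    => ["file:/etc/ssh/sshd_config"],
  "service:sshd"              => [],
}.freeze

# Walk the subscription edges breadth-first from a changed resource and
# return every service reachable from it.
def services_to_restart(changed, subscribers = SUBSCRIBERS)
  seen = []
  queue = [changed]
  until queue.empty?
    current = queue.shift
    subscribers.fetch(current, []).each do |dependent|
      next if seen.include?(dependent)
      seen << dependent
      queue << dependent
    end
  end
  seen.select { |name| name.start_with?("service:") }
end

p services_to_restart("file:/etc/ssh/sshd_config")
p services_to_restart("package:openssh-server")
```

With the relationships in a database rather than a YAML dump, a query like this could be run by tools other than Puppet, which seems to be the direction Luke is pointing at.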

What are some cool things that people are doing with Puppet?

Google’s managing more than 6000 Mac OS X laptops and desktops with it, which I think is pretty cool. Red Hat has built a tool called cft [2], which goes some way toward converting traditional-style work into Puppet code. Julian Simpson at Thoughtworks has done some cool integration with Cruise Control that uses Puppet and virtualization to automate testing Puppet configuration [3].

Puppet seems like a great solution on the front end of the sysadmin space (configuration management). Is there room for (or a need for) a good Ruby monitoring framework? If not, why not?

I’m not really sure that the language of the framework matters all that much. I chose Ruby because it did the job the best for me during development, not because I wanted a Ruby version of an existing piece of software.

If someone can identify clear shortcomings in existing monitoring tools, to the point where a new tool should be created, it might make sense to implement such a tool in Ruby, but I think the monitoring space is pretty congested right now, so I certainly wouldn’t want to try to produce a competitive product there right now.

Do you have a monitoring app that you’re happy with?

Fortunately, I haven’t been an operational admin in years, so I haven’t had to deal with monitoring in a long time. Thus, no, I have no app I’m particularly fond of. I hear a lot about monit, and god (which is apparently pretty new).

I haven’t read the book yet, but I worked pretty closely with James Turnbull as he wrote it, and he’s had a huge impact on the community during its writing. He updated many docs online with the info he gleaned from me, and it seems like the community is very fond of the book.

I just got back from Melbourne, Australia, where James lives, and he’s a great guy; it’s definitely nice to get to meet the various community members in person.

Okay, so this is pretty cool. For the last couple of years, I’ve been working for the Family and Church History department of the LDS Church. We’ve been working on a set of massive genealogical projects, aiming at putting huge piles of data into the hands of researchers everywhere, and making collaboration on genealogical work vastly easier. Yesterday, my department made another major announcement.

This time, instead of news about indexing data and calls for volunteers, we’re announcing a conference aimed at Ruby, Perl, Python, PHP, Java, and other developers. To me, the most exciting bit of news in the press release is:

The newly released FamilySearch Family Tree API and soon-to-be-released Record Search API will be main topics of discussion.

These APIs provide access to billions of records. I’ve had a chance to play a bit with Ruby libraries for accessing the APIs, and I’m excited to see these get more publicity. I can’t wait to see the kinds of tools people start building with them.

Wednesday, February 06, 2008

When Dr. Kelly first broke the word that he was going to stop working on the Ruby.NET project, M. David Peterson and Ted Neward replied to say that they thought there were still reasons for people to keep working on the project. (I wrote about this flurry of emails here.)

M. David Peterson also wrote that he’d be writing a longer explanation of his thoughts in a day or so. That time has come, and I think it’s worth examining this longer email.

Peterson starts out by commending Dr. Kelly for making the difficult decision to begin supporting the IronRuby platform at the expense of the Ruby.NET platform. But, as he says, “[t]hat doesn’t mean that the Ruby.NET project should just fold up and die. It simply means it was the right decision for them to make, a decision that, quite obviously, was not an easy decision to make for either of them.”

He goes on to talk about two ways that Ruby.NET can still play an important role:

as “a stepping stone for the Ruby language on the .NET platform that Ruby and .NET developers alike can use to get started with Ruby development for the .NET platform right now.”

as “a way that someone can take the same code base they write for IronRuby and compile that code into a static and reusable assembly that is portable and reusable inside of any CLI-compliant language.”

The first is mostly a matter of Ruby.NET being ‘closer’ to a running Ruby platform right now than IronRuby, a claim I’m not in a good position to judge. The second is the more important. Peterson wrote:

One of the most commonly asked questions on the IronPython development list — IronPython as we all know being the basis of what brought about the DLR — is “Can I write my libraries in Python and then call those libraries in my C# or VB.NET code?” While the answer is a bit more complicated than this, for the most part the answer is “Probably not.” This is one area of the DLR that both the IronPython and IronRuby teams have specified would be nice to have, but at present time is a non-goal for the 1.0 release of the DLR and the projects based on that release.

In this regard there is a clean separation between what Ruby.NET can offer the .NET developer right now and what IronRuby will not offer the .NET developer in the near term future. As such this is one area that I believe should be the core focus of the Ruby.NET project moving forward. But not from the standpoint of creating a competitive Ruby language project for the .NET platform — that would be silly and prideful — and instead from the standpoint of merging the focus of the two projects together in such a way that interop between the two code bases is seamless. In other words, and in my own opinion, the purpose of the Ruby.NET project moving forward should not be one of being a separate project with a separate focus …

I don’t think it was by any strange coincidence that when Dr. Kelly created this project on Google Code he chose rubydotnetcompiler as the project name, as ultimately that’s what this project is all about. Ultimately and eventually it may turn out that the IronRuby and DLR teams decide to enable static compilation. And maybe they won’t, deciding instead to focus their time on making the dynamic nature of dynamic languages that much more dynamic than would otherwise be the case.

In the mean time, however, there’s the Ruby .NET compiler project, a project in which I believe should follow the direct path of the IronRuby team, making it possible to take code targeted for IronRuby, statically compile that code, and reuse the static assembly within any other .NET code project. To me, anyway, this is an area of great desire with the .NET development communities and as such the perfect focus for moving this project forward.

Whether as an interim move, or as the only static compiler for Ruby on the .NET platform, this kind of interoperability is really important. Making it work well is going to require a lot of communication between the new Ruby.NET team and the IronRuby folks (and probably more CLR folks that I don’t know about). Hopefully everyone involved can carry it off.

Tuesday, February 05, 2008

Christian Sepulveda, of Pivotal Labs, was kind enough to talk to me a bit about the 2008 MountainWest RubyConf and why they’ve stepped up as a sponsor. I really like his description of regional Ruby conferences as “a group conversation”; I think that’s just what we’re all shooting for.

By the way, if you’re looking to work with Ruby and/or Rails at a cool company like Pivotal Labs (see Christian’s first answer below), you should register for MWRC and be a part of the conversation.

Why would a company like yours sponsor a regional Ruby conference like MWRC?

We have lots of Rails developers and have made a significant strategic investment in Ruby and Rails. Increasing our brand awareness in this market, both for prospective clients and recruiting talent, is important for us.

What’s the difference between a regional conference and RubyConf or RailsConf from a sponsor’s perspective?

The larger conferences have more reach, but they are also getting a little saturated. The difference for us is like the difference between talking via a megaphone vs. a group conversation.

What do you think attendees will get out of a conference like MWRC?

Idea exchange, networking, knowledge of what’s going on in the larger community.

If you could sit down with a Rubyist from the MountainWest, why/how would you encourage her to go to MWRC?

Ruby is getting really close to going main-stream (some would argue that has started). Now’s the time to get involved and be in front of the wave rather than behind it.

What message about Pivotal Labs do you hope attendees will take home?

Pivotal has some of the best technical chops around. We are committed to Ruby/Rails and its wide adoption.

I’m not really a Windows or .NET kind of guy, but I’ve tried to keep an eye on Ruby.NET and IronRuby. While neither has been as visible inside the Ruby community as, say, Rubinius or JRuby, Ruby.NET has always had a harder time ‘getting the word out’.

Yesterday, Dr. Wayne Kelly wrote an email that said in part:

[W]e set our selves(sic) the goal of running Rails on .NET and we haven’t achieved that yet. If we can leverage our experience to help IronRuby get to that point, then I’d at least have the personal satisfaction of helping see the job completed.

These are just my views. As a researcher, my prime interest is not in developing products, but in developing innovative new ideas and having an impact by having those ideas used in the real world. I’m aware that others in the community will have different goals and so will presumably have a different take on this – I’m keen to hear what you think. If anyone wants to press ahead, then the code base is still owned and controlled by you the community, so you are free to do with it as you please with our full blessing.

I’d also like to make it very clear that this decision is entirely my own – based on research and technical considerations. Microsoft did not in any way suggest or encourage us to kill the project and we thank them again for their support of the project.

I’d like to thank all of our contributors and supporters and apologize if this decision comes as a disappointment. I hope many of you will join me in contributing to the IronRuby project and see it through to a successful completion.

There were two quick responses from the community, from M. David Peterson and Ted Neward, which read as though they (at least) will be continuing to work on Ruby.NET. Peterson wrote:

No, in all seriousness after my recent “Ruby.NET Is NOT Dead” speech Dr. Kelly contacted me directly to let me know he was considering what follows. And after reading his thoughts and discussing things with him further I absolutely 100% both stand behind and believe this is the right decision for HIM to make.

Neward responded:

He sent same to me, and while I’m completely behind the idea of HIM moving on if he feels the need/desire to, I still believe that there is room in the world for both a statically-compiled Ruby and a dynamically-interpreted Ruby on the CLR.

Assuming Peterson and Neward move forward with this, and I hope they do, they need to ensure the Ruby.NET community is working harder to engage the broader Ruby community. There are a number of places that this interaction can and should take place: the rubinius spec, future directions of Ruby, Ruby conferences (regional and otherwise), and the Ruby blog-space. Peterson and I talked about this in an interview about Ruby.NET, and I believe he’ll work hard to make sure this communication happens.

Who knows, maybe this is just the kind of shake-up the Ruby.NET community needs to get things moving again.

(By the way, I think the XRuby project is a remarkable parallel. They’re also aiming for static compilation of Ruby, and aren’t communicating as well as the corresponding dynamic implementation of Ruby for the same VM. Maybe that’s a post for another day.)

Monday, February 04, 2008

Ok, I guess I’m turning into a grumpy old man. Last week I climbed up on my soap box about IBM Press. Then, this morning I received yet another email from Salve Regina University asking me to participate in their 60th Anniversary Alumni event. I guess this wouldn’t be too bad, except that I’m not an alumnus, I’ve never been to the school, and I’ve never even lived in Rhode Island (Salve Regina is in Newport, RI). Oh, and to make things worse, I’ve been asking them nicely for the last year to take me off their mailing list.

I don’t know which is worse, that a university can’t figure out who their own alumni are, or that they can’t figure out how to pull me off their own mailing list. It certainly doesn’t speak very highly of their abilities. If you were considering attending Salve Regina, maybe you’ll want to think twice given this stunning display of incompetence.

If you’re interested, here’s the timeline for the latest batch of emails:

Dec 10, 2007 I received an email from Stephen Kumnick that read as follows:

Hello All,

We are currently planning an Alumni and Parent Event in Providence on Wednesday, February 13, 2008 at 10 Prime Steak and Sushi. Your help is greatly appreciated to make this event a success! This will be a networking event for alumni and parents in the area.

Are you interested in being on the event committee? As an event committee member you will call and email alumni in the area to ensure that they attend this event. We look forward to hearing from you.

I responded thusly:

Get me off of your mailing list! I’m not now, and haven’t ever been involved with Salve Regina as a student or a parent of a student. The more of these emails I get the less likely I’ll ever want to become involved with your institution.

Mr. Kumnick didn’t get the message. On January 9, 2008 he sent me another email (and this one was all html-y—yuck!). The gist of it was that I was invited to attend a $35 event. No thanks.

Here’s my response:

Mr Kumnick,

I need to tell you once again that I have no relationship with Salve Regina and never have. I do not want to continue to receive spam from your institution. Take me off of your mailing list.

I haven’t received anything else from Mr. Kumnick, so maybe he figured it out. If so, he didn’t pass the word along, because this morning I received an email from a generic looking email address, alumni@salve.edu. Once again, it’s an invitation to attend their event, which is now being called a ‘Providence Networking Event’ for Parents, Alumni, and Friends. I’m certainly neither of the first two, and with their incessant emails I’m not likely to be the third either.

I’m emailing them the link to this post. Hopefully it will help them get the message and get me off their mailing list.

Friday, February 01, 2008

It’s not often that I get up on my soapbox about ‘IP’ issues, though I did write about a copyright infringement problem I had a few years ago — that one turned out well, maybe this one will too.

It’s also not often that I will write a negative review — not because I like everything that I read, just that I'd rather follow "Thumper's Rule" and just not write anything. Every once in a while though, something comes up that makes me go against these personal policies. Yesterday it happened again.

Lately, I’ve been getting interested in extracting information from unstructured data. (I blame some of my recent reading, like Visualizing Data and Programming Collective Intelligence.) So, I was really excited when IBM Press sent me a review copy of “Mining the Talk”.

I sat down and started reading. I’d nearly finished the introduction when I hit a problem. Right there on page xix, the last paragraph reads:

On the other hand, if you want to take our methods and create your own software solution to sell as a product, while we applaud your initiative and enthusiasm, you really should first discuss this with suitable representatives from IBM business development. IBM has sole ownership of all the intellectual property described in this book, all of which is protected by U.S. patents, both granted and pending. All rights reserved, etc., etc.

Wow! How broken is this? IBM has put two senior staff members through the effort of writing a good looking book; put untold others to work editing and marketing said book (at $45 a pop); then put the whole thing under lock and key by claiming that they hold patents over the whole of it. They might as well have written the book in Klingon -- ehh, no I guess not, too many geeks can read Klingon.

I don’t know about you, but I read books so that I can learn how to do something. If a book carries a warning label which states that I’m putting myself at risk of patent litigation by implementing the ideas in the book, I’ve got a problem with that. A big enough problem that I'm not going to read it.

I brought this to the attention of the authors, and I got the following response from Scott Spangler:

You raise a very important point, and it is one that we wrestled with in writing the book and in making the software that implements its methods available for free on ibm.alphaworks.com. It seems to me there are competing interests here. There's the interest I have as an author to publish my work so that interesting and thoughtful people will read it and comment on it and use it for productive purposes, thus giving me valuable feedback and new avenues for application of my ideas. There's the interest IBM has as a publicly traded company to produce value for its shareholders and make a return on the investment it has made in its research assets. There's the interest of the reader has to get a return on her investment in purchasing the book and spending the time to read it. I believe all these interests are fairly balanced by this book and by the underlying patents that support IBM's ownership of the intellectual property described. ...

Is this a perfect solution? Perhaps not. It would be better if the reader could just read the book and do whatever they wanted to based on the ideas within it without fear of lawyers. And then somehow, magically, if the reader used the ideas to make significant money, IBM's fair share of that profit would be transferred to compensate the company for its foresight in investing and supporting our work over the years. But this is the world we have. And in that imperfect world, I don't regret writing this imperfect book (and I hope Pearson [the publisher behind IBM Press] doesn't regret publishing it!).

Scott closes by saying "I don't know if I have persuaded you to keep reading, but I sincerely hope so." Sadly, while I understand his point of view, I'm still not willing to cross the line they've drawn in the text. Hopefully, I'll hear back from their business development person with a clear statement that Free/Open Source software developers can implement the ideas contained in the book without fear of patent litigation.

I hope that IBM Press, and any other parties involved in this book, will see the light and specifically allow individuals or companies to use the information that they have worked so hard to publish. Until that happens, I can only recommend that you stay away from this book and look for other alternatives.