That just searches for either a set of quotes with some non-quote characters between them or a run of non-whitespace characters. Those are the two possibilities for the fields. Note that the two separate captures here mean scan() will return contents in the form:

[[nil,"one"],["two",nil],["a longer three",nil]]

That's why I added a flatten() and compact() to get down to the actual matches.
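As a minimal sketch of that behavior: the pattern and input string below are assumptions reconstructed from the description above, so the exact expression may differ. The point is only to show why scan() returns pairs and what flatten() and compact() do to them.

```ruby
# Assumed pattern: a quoted field (capture 1) or a bare run of
# non-whitespace characters (capture 2). Illustrative only.
FIELD = /"([^"]+)"|(\S+)/

# Made-up input with one bare field and two quoted fields.
input = 'one "two" "a longer three"'

pairs  = input.scan(FIELD)      # one [quoted, bare] pair per match
fields = pairs.flatten.compact  # drop the nil left by the unused capture

p pairs   # [[nil, "one"], ["two", nil], ["a longer three", nil]]
p fields  # ["one", "two", "a longer three"]
```

Each match fills only one of the two captures, so every pair carries exactly one nil, and compact() removes it once flatten() has merged the pairs.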

The regular expression approach can get pretty complex though if any kind of escaping for quotes is involved. When that happens, you may need to step up to a parser.


This article was written for the Railscasts 100th Episode Contest. I think the idea is great and I look forward to reading great tips from all who decide to participate.

1. create_or_find_by_…

I imagine most of you know that ActiveRecord can handle finders like:

MyARClass.find_or_create_by_name(some_name)

This will attempt to find the object that has some_name in its name field or, if the find fails, a new object will be created with that name. It's important to note that the order is exactly as I just listed it: find then create. Here are the relevant lines from the current Rails source showing the process:

The above code is inside a String literal fed to class_eval(), which is why you see interpolation being used.

Unfortunately, this process is subject to race conditions because the object could be created by another process (or Thread) between the find and the creation. If that happens, you are likely to run into another hardship in that calls to create() fail quietly (returning the unsaved object). These are some pretty rare happenings for sure, but they can be avoided under certain conditions.
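To make the race and one way around it concrete, here is a small sketch that does not use ActiveRecord at all. FakeTable, DuplicateName, and create_or_find_by_name() are hypothetical names invented for this illustration, and the sketch assumes the "certain conditions" include a uniqueness constraint on the field, which the in-memory table imitates with a Mutex.

```ruby
# Hypothetical stand-in for a table with a unique index on name.
class DuplicateName < StandardError; end

class FakeTable
  def initialize
    @rows = {}
    @lock = Mutex.new
  end

  # Imitates an INSERT guarded by a unique constraint:
  # the check and the insert happen as one atomic step.
  def create!(name)
    @lock.synchronize do
      raise DuplicateName, name if @rows.key?(name)
      @rows[name] = { id: @rows.size + 1, name: name }
    end
  end

  def find_by_name(name)
    @lock.synchronize { @rows[name] }
  end

  # Reverse the usual order: create first, and only fall back to
  # the find when the constraint rejects the insert.
  def create_or_find_by_name(name)
    create!(name)
  rescue DuplicateName
    find_by_name(name)
  end
end

table = FakeTable.new
rows  = 10.times.map {
  Thread.new { table.create_or_find_by_name("some_name") }
}.map(&:value)

p rows.uniq.size  # every thread ends up with the same row
```

Because the constraint decides the winner, there is no window between a find and a create for another process or Thread to slip into. Without such a constraint this ordering would simply create duplicates.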


The call came down from on high just before the Ruby 1.9 release: replace the standard csv.rb library with faster_csv.rb. With only hours to make the change it was a little harder than I expected. The FasterCSV code base was pretty vanilla Ruby, but it required more work than I would have guessed to get running on Ruby 1.9. Let me share a few of the tips I learned while doctoring the code in the hope that it will help others get their code ready for Ruby 1.9.

Ruby's String Class Grows Up

One of the biggest changes in Ruby 1.9 is the addition of m17n (multilingualization). This means that Ruby's Strings are now encoding aware and we must clarify in our code if we are working with bytes, characters, or lines.
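A short sketch of those three views of the same String, using String#bytes, String#chars, and String#lines. The sample text is made up for illustration. The trailing to_a calls are there because these methods returned Enumerators in Ruby 1.9; on later versions they are harmless.

```ruby
# Two lines of UTF-8 text, each containing one two-byte character.
str = "h\u00e9llo\nw\u00f6rld\n"

bytes = str.bytes.to_a  # raw bytes, ignoring the encoding
chars = str.chars.to_a  # characters, as defined by the encoding
lines = str.lines.to_a  # chunks split on the line separator

p bytes.size  # 14
p chars.size  # 12
p lines.size  # 2
```

The byte and character counts differ because each accented letter takes two bytes in UTF-8, which is exactly the distinction the code now has to spell out.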

This is a good change, but the odds are that most of us have lazily used the old way to our advantage in the past. If you've ever written code like:

lines = str.to_a

you have bad habits to break. I sure did. Under Ruby 1.9 that code would translate to:


If your number one concern when working with CSV data in Ruby is raw speed, you might want to know that FasterCSV is no longer the fastest option.

There are a couple of new contenders for Ruby CSV processing including a C extension called SimpleCSV and a pure Ruby library called LightCsv. I haven't been able to test SimpleCSV locally, because I can't get it to build on my box, but users do tell me it's faster. I have run some trivial benchmarks for LightCsv though and it too is pretty quick:

It's important to note that LightCsv is indeed very "light." FasterCSV has grown up into a feature rich library that provides many different ways to look at your data. In contrast, LightCsv doesn't yet allow you to set column or row separators. Given that, it's only an option for vanilla CSV you just need to iterate over. If that's what you have though, and speed counts, it might just be the right choice.
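For a sense of what setting column and row separators looks like, here is a small sketch using Ruby's standard csv library (the library FasterCSV became in Ruby 1.9), not LightCsv, whose API is not shown here. The sample data is made up for illustration.

```ruby
require "csv"  # standard library; this is what FasterCSV became in Ruby 1.9

# Made-up data using ";" between columns and "|" between rows.
data = "name;language|James;Ruby|"

# Plain arrays, one per row.
rows = CSV.parse(data, col_sep: ";", row_sep: "|")
p rows  # [["name", "language"], ["James", "Ruby"]]

# The same data, treating the first row as headers.
table = CSV.parse(data, col_sep: ";", row_sep: "|", headers: true)
table.each { |row| puts row["language"] }  # prints "Ruby"
```

Data like this, with non-standard separators, is the case where a lighter parser with fixed separators would not be an option.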

About

James Edward Gray II was a part of the Ruby community before Rails
ever shipped. He wrote code and documentation that now come with
the language. He ran two Red Dirt Ruby Conferences and was
a regular on the Ruby Rogues podcast for years. He now creates
videos showing real programming in action.
He does all of this just because he loves to program. This site is
where he writes about that.
