Don’t ask me because, with all due respect to the amazing work from committers, I won’t do it.

Unless I’ve made an actual mistake, if my pull request has 5 commits, it is because each of them is independent and I feel they should remain so.

If they really want to, project committers can manually squash and merge the commits, wait for someone else to make a PR with the commits squashed, or even reject the PR.

Call it lunacy or pride, but I just won’t squash my commits. Hopefully committers won’t take offense at that as I’ll do my best to not take offense at their suggestion that I didn’t segment my commits correctly.

Why do committers ask to squash commits?

I imagine the historical reason is that many contributors aren’t super comfortable with git and git rebase -i in particular, so their commits represent their train of thought, not a sequence of independent changes to apply. E.g.:

Introduce <great feature>

Oops, fix typo

Oh oh, fix a bug of <great feature>

Fix bug of <great feature> (for real this time)

For these contributors, the commits must not be accepted as is. Squashing the whole thing is usually the safest bet.

There are many contributors who use git rebase -i and git commit -p in their sleep, though, so that won’t always be the situation.

The other reason I’ve been told is that “we generally squash all commits so that it’s easy to backport/revert”. I can’t agree with that. First, it is much more difficult to backport/revert just part of the PR if it is squashed. Moreover, it’s pretty easy to backport/revert the whole PR by using git revert and git cherry-pick, either with a range of commits or with the merge commit. Less than 2% of commits get reverted anyways.

When should committers not ask to squash commits?

If all commits can stand on their own, i.e. all tests pass after each individual commit, then the commits are atomic and do not need to be squashed. I’d even say they probably shouldn’t be squashed.

My commits are typically the smallest unit of change that will work and still pass all tests.
The main exception is a commit of a bunch of trivial changes that are isolated (e.g. removing trailing whitespace, fixing a bunch of typos in the doc, renaming a local variable). Even then, I won’t commit doc typos and variable renames together, say.

When doing a refactor, I will usually split the changes in small independent refactoring commits. I believe it makes it easier to understand and judge than one big commit.

A rather telling example was this PR I made recently. It aims at fixing one bug, but I broke it down into 15 commits. Each consists of a single refactoring step in the right direction, until the last commit, which is the one fixing the bug per se. It’s easier to see the validity of each change this way, while the combined diff has a lot of noise and does a bunch of different things at once. I just can’t grok the combined diff.

I’m not asking everyone to structure their commits with such detail and attention; I’m only asking that it be, if not appreciated, at least accepted to do so.

2013-03-23T00:27:00-04:00
http://blog.marc-andre.ca/2013/03/23/method-lookup-in-ruby-20

Tech.pro sponsored a tutorial on method lookup in Ruby 2.0.0.

```ruby
def name(required_arguments, ...,
         optional_arguments = default, ...,
         *rest, additional_required_arguments, ..., # Did you know?
         keyword_arguments: "with_defaults", ...,
         **rest_of_keyword_arguments,
         &block_capture)
```

In Ruby 2.0.0, keyword arguments must have defaults, or else must be captured by **extra at the end. The next version will allow mandatory keyword arguments, e.g. def hello(optional: 'default', required:), but there are ways to do it now.

Defaults, for optional parameters or keyword arguments, can be almost any expression, including method calls on the current object, and can use previous parameters.
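For instance (method and argument names below are made up for illustration), a default can call a method and can reference parameters bound earlier in the list:

```ruby
def default_size
  'medium'
end

# `size:` defaults to a method call; `label:` uses the previous
# parameters `size` and `dish`; `**rest` catches everything else.
def order(dish, size: default_size, label: "#{size} #{dish}", **rest)
  [dish, size, label, rest]
end

order('soup')
# => ["soup", "medium", "medium soup", {}]
order('soup', size: 'large', hot: true)
# => ["soup", "large", "large soup", with rest capturing hot: true]
```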

Core class changes

Prepend

Module#prepend inserts a module at the beginning of the call chain. It can nicely replace alias_method_chain:

```ruby
# Ruby 1.9:
class Range
  # From active_support/core_ext/range/include_range.rb
  # Extends the default Range#include? to support range comparisons.
  def include_with_range?(value)
    if value.is_a?(::Range)
      # 1...10 includes 1..9 but it does not include 1..10.
      operator = exclude_end? && !value.exclude_end? ? :< : :<=
      include_without_range?(value.first) && value.last.send(operator, last)
    else
      include_without_range?(value)
    end
  end
  alias_method_chain :include?, :range
end
Range.ancestors # => [Range, Enumerable, Object...]

# Ruby 2.0
module IncludeRangeExt
  # Extends the default Range#include? to support range comparisons.
  def include?(value)
    if value.is_a?(::Range)
      # 1...10 includes 1..9 but it does not include 1..10.
      operator = exclude_end? && !value.exclude_end? ? :< : :<=
      super(value.first) && value.last.send(operator, last)
    else
      super
    end
  end
end

class Range
  prepend IncludeRangeExt
end
Range.ancestors # => [IncludeRangeExt, Range, Enumerable, Object...]
```

Refinements [experimental]

In Ruby 1.9, if you alias_method_chain a method, the new definition takes place everywhere. In Ruby 2.0.0, you can make this kind of change just for yourself using Module#refine:

```ruby
# Ruby 2.0
module IncludeRangeExt
  refine Range do
    # Extends the default Range#include? to support range comparisons.
    def include?(value)
      if value.is_a?(::Range)
        # 1...10 includes 1..9 but it does not include 1..10.
        operator = exclude_end? && !value.exclude_end? ? :< : :<=
        super(value.first) && value.last.send(operator, last)
      else
        super
      end
    end
  end
end

def test_before(r)
  r.include?(2..3)
end

(1..4).include?(2..3) # => false (default behavior)

# Now turn on the refinement:
using IncludeRangeExt

(1..4).include?(2..3) # => true (refined behavior)

def test_after(r)
  r.include?(2..3)
end
test_after(1..4) # => true (defined after using, so refined behavior)

3.times.all? do
  (1..4).include?(2..3)
end # => true (refined behavior)

# But the refined version happens only for calls defined after the using:
test_before(1..4) # => false (defined before, not affected)

require 'some_other_file' # => not affected, will use the default behavior

# Note:
(1..4).send :include?, 2..3 # => false (for now, send ignores refinements)
```

Full spec is here and is subject to change in later versions. More in-depth discussion here.

Lazy enumerators

An Enumerable can be turned into a lazy one with the new Enumerable#lazy method:

```ruby
# Ruby 2.0:
lines = File.foreach('a_very_large_file')
            .lazy # so we only read the necessary parts!
            .select { |line| line.length < 10 }
            .map(&:chomp)
            .each_slice(3)
            .map { |lines| lines.join(';').downcase }
            .take_while { |line| line.length > 20 }
# => Lazy enumerator, nothing executed yet

lines.first(3) # => Reads the file until it returns 3 elements
               #    or until an element of length <= 20 is
               #    returned (because of the take_while)

# To consume the enumerable:
lines.to_a # or...
lines.force # => Reads the file and returns an array
lines.each { |elem| puts elem } # => Reads the file and prints the resulting elements
```

Note that lazy will often be slower than a non-lazy version. It should be used only when it really makes sense, not just to avoid building an intermediate array.

caller_locations

It used to be tricky to know which method just called yours. It wasn’t very efficient either, since the whole backtrace had to be returned: each frame was a string that Ruby first had to compute and that you probably had to parse afterwards.

Enter caller_locations, which returns the information as objects, with a better API that can limit the number of frames requested.
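A small sketch (the method names here are made up): asking for a single frame is cheap, and each frame is an object with path, lineno and label, so no string parsing is needed.

```ruby
def whoami
  # Start at frame 1 (our caller), return just 1 frame.
  loc = caller_locations(1, 1).first
  "#{loc.path}:#{loc.lineno} in #{loc.label}"
end

def outer
  whoami
end

puts outer # e.g. "example.rb:9 in outer"
```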

For those who can’t upgrade yet, you can still have some of the fun with my backports gem. It makes lazy, bsearch and a couple more available for any version of Ruby. The complete list is in the readme.

Enjoy Ruby 2.0.0!

2011-04-25T00:00:00-04:00
http://blog.marc-andre.ca/2011/04/25/dry-migrations

I wanted to write a post about the many things that should be fixed with Rails.

Interestingly, Rails 3.1 fixes quite many of these.

At last, jQuery takes over from Prototype. Prototype was nice and didn’t solve exactly the same problem, but in my experience jQuery is mandatory for developing anything decent. Same thing for Sass, and I’m glad they have corrected the mistake of the default Sass location (which used to be /public/stylesheets/sass when it had to be somewhere in /app). Handling assets was also sorely missing; I’ve been using Sprockets before and it’s a fine choice.

I’m happily surprised at CoffeeScript. I’ve also been using it, but I didn’t expect it to become the default, especially given that it’s quite young; I’d argue it’s a much bolder move than using Haml would be. I have no idea why Haml doesn’t also come standard.

It’s interesting that we are now targeting the web platform without writing anything directly in it: using Haml instead of HTML, Sass instead of CSS, CoffeeScript instead of JavaScript (and accessing the DOM more often via jQuery than directly).

The last goodie is DRY migrations. I find it irritating to write most migrations as I’d really like to generate them automatically from a change to the schema, maybe because my ancient development tool 4D gave me that 25 years ago…

I’d rather write the schema in the model (where it belongs IMO) and generate a “diff” as a migration, but at the very least I wanted to avoid writing the drop_table and remove_column that always correspond one to one with create_table and add_column.

I was actually looking at the code to see where one could implement automatically undoable migrations, as that is much easier than my dream solution, and lo and behold, we can now do this!
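The code sample from the original post is lost here; presumably it showed a reversible migration along these lines, where a single change method replaces the old up/down pair and Rails infers how to roll back create_table and add_column (the table and column names below are made up):

```ruby
class CreateBooks < ActiveRecord::Migration
  def change
    create_table :books do |t|
      t.string :title
      t.timestamps
    end
    add_column :books, :isbn, :string
    # On rollback, Rails automatically issues the corresponding
    # remove_column and drop_table; no `down` method to write.
  end
end
```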

Much better. Hopefully we’ll soon be able to specify :from => ... when issuing change_column_default or similar so that they become undoable too.

I still have a couple of gripes on my list. In no particular order:

Haml

Default template

Way too basic. There should be a standard solution for the page title (one that isn’t static!), default content_for sections, etc… Easy to do yourself, but why not encourage a standard convention?

test environment & fixtures

Also too basic. I find fixtures longer to generate and harder to maintain when the schema changes, compared to factory-based data.

config/database.yml

It mixes important production information with less important and more local information for the test & dev environments. I’ve always had problems with source control and that file, because I stick with SQLite for dev/test while other developers prefer other DBs.

Yaml

Now that I think of it, I’m not sure there should be any yml files in a rails project. The gain over a strictly Ruby file is minimal, even more so in Ruby 1.9.2, and it’s just less flexible. It also encourages crazy stuff like cucumber yml config file with ERB in it.
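To illustrate the point, here is a hypothetical database configuration written as plain Ruby instead of YAML (the constant name and values are made up); computed values need no ERB interpolation:

```ruby
# Hypothetical plain-Ruby replacement for config/database.yml.
DATABASE_CONFIG = {
  'development' => { 'adapter' => 'sqlite3', 'database' => 'db/development.sqlite3' },
  'test'        => { 'adapter' => 'sqlite3', 'database' => 'db/test.sqlite3' },
  # Plain Ruby expressions, no ERB-in-YAML tricks required:
  'production'  => { 'adapter' => 'postgresql', 'database' => ENV.fetch('DB_NAME', 'myapp') },
}
```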

MVC…L?

Maybe it’s just me, but I like to write separate functionality that acts like a library. It doesn’t fit as a Model, so I stick that code in /lib, with the caveats that there is no default structure and that it doesn’t autoload or auto-reload. It should probably go in app/lib or similar.

2010-04-01T00:00:00-04:00
http://blog.marc-andre.ca/2010/04/01/fixing-mri-dozen-steps-at-time

Is there a term like bugfield? You know, when every time you get to take a couple of steps in a code base you encounter a different bug, which leads to another one, …, like a minefield of bugs?

Here was my last sequence in Ruby (MRI)…

Main goal: improve Matrix#determinant and #rank after a suggestion of Yu Ichino. The bulk of the work took me quite a while, as I had to check a bunch of things, understand the algorithm, do some performance testing, etc…

When modifying Matrix#rank to use this different approach, I take the opportunity to improve the styling. A variable name of ii is not as clear as row, and… it actually reveals that something is amiss because that loop goes up to the number of columns, not rows…

1) So I find a minimal test case to convince myself I’m not mistaken. Yup, a simple 3x2 matrix has the wrong rank. I add that to the spec and fix Matrix#rank. When cleaning up, I make sure that Matrix#regular? and Matrix#singular? are using the right determinant function and not a bad variant that’s now deprecated.

Turns out they are checking the rank of the matrix, which is not as efficient but more importantly…

2) they both return false if the matrix is not square. This doesn’t make much mathematical sense.

Since I’m now the happy maintainer of the lib and I am confident there is no other reasonable solution, I have them raise an error for rectangular matrices. This means specs are either wrong or incomplete in Rubyspec, though, so I check them out…

3) Turns out Rubyspec is incomplete for those, so I specify what error should be raised in case of a rectangular matrix. Double-checking my change by running it gives me 0 assertions. Oops?

Turns out that the guard I wrote to signify this was a bug never passes. Ah, right, ruby_bug "", "1.9" means “this is a bug present in the whole 1.9.x line”, so it will not be executed until Ruby 2.0!

My bad, but the program that runs the specs shouldn’t allow that, so…

4) Discussion with Brian Ford, the maintainer of RubySpec. Good thing he’s always on IRC. Anyways, he might put in a max version to avoid such nonsense in the future. Meanwhile…

5) A quick search in RubySpec reveals about a half dozen such bad guards, so I set about fixing each one, and…

6) One of the specs that were not guarded properly fails on the latest Ruby trunk. It’s not clear it’s a bug, though. At least not to me, as I’ve never tried to open the singleton class of a Bignum!

So I investigate, try a couple of things, and yeah, the more I dig, the more it looks like a bug, so I open an issue to confirm with ruby-core. There’s one spec left…

7) The last spec clearly shows a small bug in String#sub!, so I fix that in MRI… and I realize that the error message for the wrong number of parameters is misleading.

8) It takes about a microsecond to fix that error message. A quick find reveals other similar error messages in the MRI code. A quick review leads to… 18 issues of all sorts. Some more inaccuracies, some uninformative messages, some that don’t follow the standard format and typos in the doc.

9) I fix all of these too. Ideally this should be refactored, but I’m getting tired. Yet I’m still awake enough to realize that one more method has the wrong doc…

10) From the code, I gather that the interface for SignalException.new is a bit more complex than advertised. I supplement the doc as best as I can.

11) That extra param is a bit odd. Looks like you can build a regexp with a third parameter equal to “n” or “N” and the encoding switches to binary. Other values will get you a warning, and any letter after the “n” will be ignored. Smells like legacy.

git blame tracks the changes back years, giving me a reference to the ruby-dev list. Lucky me, it’s not in Japanese and refers to uri/common.rb. A quick check reveals no Regexp.new with that third argument there. Ah, there’s a Regexp.new(HEADER_PATTERN, 'N') in uri/mailto. The ‘N’ doesn’t mean binary, though, since it’s in second place (so it means “case insensitive”, as would true), which…

12) is a bug; the regexp is already case insensitive, so that ‘N’ has no effect. I don’t understand well enough what an extra “N” really does to be sure whether it can be removed (since it doesn’t have any effect right now) or should be put in third position.

I’m a bit dizzy. I should really go to sleep. Even though this is all pretty minor, I fire a redmine issue about the doc and another one about the lib and go to bed…

And I thought fixing Matrix#regular? would be trivial…

2009-09-01T00:00:00-04:00
http://blog.marc-andre.ca/2009/09/01/best-time-to-get-involved-in-ruby-core

Apart from enjoying the summer, I’ve spent time hacking on MRI, especially since I’ve been accepted as a committer. The feature freeze for Ruby 1.9.2 was planned for yesterday, but a couple of days before, it was pushed back. Rejoice!

Why? The reason stated was that the next version of Ruby will, for the first time ever, pass the RubySpec. This makes RubySpec the official meeting point for all Ruby implementations, not just Rubinius (the originator of RubySpec), JRuby and others. This should also give a bit more time to decide on a couple of new features that might make it in 1.9.2.

Much work has been done to have the specs meet MRI 1.9.x, and the language and core sections only have a couple of failures¹. Most are due to cases for which the best decisions still have to be figured out. I’ll remind you that it’s easy to gain commit access to RubySpec: any accepted patch grants you your commit bit.

There is still quite a bit of work to be done spec’ing the libraries. Actually there’s a lot of work to be done in the libraries themselves. Some are quite badly maintained, others don’t even have an official maintainer. And that’s all about to change, hopefully!

It was announced yesterday that being a maintainer is no longer for life. Not doing anything about open issues? Sorry, we’ll get someone else to take care of it. Many libraries currently have no maintainer, and there will probably be many others that won’t be claimed in the confirmation process.

Feeling competent to maintain a library? You talk using only sockets? You dream in yaml? Might as well apply to maintain your favorite lib…

I sincerely hope 1.9.2 kicks some serious ass. It’s bound to be the version of Ruby 1.9 that most people will use and target for the first time. All the more reason to get it right!

¹ Actually, the bulk of the work was spec’ing Ruby 1.8.6 under the supervision of Brian Ford. I helped finish the specs for 1.8.7, and the mysterious and tireless Run Paint Run Run did most of the 1.9-specific specs. Spec’ing Ruby usually leads to finding bugs or asking for clarifications. Indeed, Run Paint opened more issues on redmine than any other user!

2009-06-01T00:00:00-04:00
http://blog.marc-andre.ca/2009/06/01/stickler-in-silicon-valley

I have not been actively looking for a job yet. Nevertheless, I was contacted by a startup and invited to spend a week in Silicon Valley / San Francisco, hacking around with them to see if I could become part of their team, which I found quite flattering. I learned lots of new things in California. A couple of new words too. I’m still unsure as to what exactly a hipster is, but “stickler” was easier to grasp: one who insists on exactness or completeness in the observance of something.

It was fascinating to witness the startup culture. Tens of thousands of users is considered a small test bed; the target is millions. Every newcomer on the web scene is analysed & probed. There was technology and technology talk everywhere. It seems like everyone in the Bay Area has an iPhone. And I mean everyone! Lacking a decent map of the city, I asked two random strangers for directions and both dug out their iPhone to help me out. When I needed to call someone I was meeting, I asked another stranger if I could use his phone. It was an iPhone, of course, and after a thorough examination to estimate the chances I’d run away with it, he graciously let me use it. I found people particularly nice too, although maybe my tourist status helped, I don’t know.

My timing for the trip was great because Brian Ford and Evan Phoenix were also in town and invited me to have a drink. It turns out the monthly SF Ruby meetup was on that very same day, so I met them there. I’d say the crowd was about three times that of a typical Montreal.rb meetup. There were other noticeable differences too. Many people were part of pretty exciting projects and companies (EngineYard, GitHub, PeepCode and the like). Chris Wanstrath (of GitHub) presented his newest gem rip, while Mike Dirolf was presenting his mongoDB project. Three people stood up announcing they were looking for developers, which has yet to happen in Montreal… I guess recession doesn’t have the same meaning in the Valley.

Back to Palo Alto and the startup. I realized a couple of things there. I really enjoy thinking about what a product could look like, how it should be presented to users. Finding ways to improve it by analysing its use is something I’ve never had the chance to do and is quite appealing. On the other hand, I somehow assumed that the “Joel” approach would be a sine qua non for an ambitious startup: hire the best, only the best, give them the best tools and let them loose.

It turns out that when considering what a good programmer is, different qualities can be given different weights. Most will agree that getting things done is the main one. Without it, not much can save you. As a reflection of my values though, I expected that embracing standards, learning the available tools and applying principles like DRY, refactoring, etc…, was also part of it. That’s apparently not the case, and that’s why we all realized I wouldn’t mesh as nicely as we hoped in their startup.

I couldn’t help but notice that all the rails programmers are Windows guys. Except one; he is a Linux guy and although I didn’t have the chance to really work with him, he gave me a really good impression. I’m ready to bet his values are more aligned with mine. The HTML/CSS/design expert was the only Mac guy and I could not have agreed more with his opinions and point of view. So is there a Windows/Mac divide? Something like “Get things done” vs “Design it well so it just works”?

Nah. Things are never that simple, as I was reminded when taking part in the interview of a Mac guy who clearly didn’t care for DRY or nice tools like named scopes, despite otherwise decent technical skills. So no, I just have to face the fact that, for better or for worse, I’m a stickler for getting things done well.

Update: My friend Pascal suggested this might be related to an Engineer/Scientist divide: using tools vs understanding them; making things work vs comprehension through abstraction. Interesting idea.

```ruby
# Without writing any method/block/lambda,
# can you find ways to obtain, in Ruby 1.8.7 or 1.9:
x == y # ==> true
y == x # ==> false
```

Here’s how I got to check out Ruby’s source and stumble upon that.

Age of Innocence

This is all Mathieu’s fault. He asked innocently if my backports gem was compatible with Rails. I thought “duh! of course!”. After all, it’s meant to be compatible with any Ruby code.

Of course, he was right, there were bugs. Hundreds of tests were failing! It turned out to be two bugs. It dawned on me that my small bunch of unit tests was not even close to being enough. I really needed to test some more.

So I set out to test it on JRuby. I found a bug, but it was JRuby’s this time. It was easy to circumvent though, so “JRuby compatibility: check”.

How about rubinius? Well, that’s where the story really begins… Rubinius is a bit different because most of the builtin library is written in Ruby and many methods use other core methods. That won’t make a difference for you until you fiddle with core methods. For example, I was redefining String#upto by calling Range#each. Kosher in MRI, but rubinius’ Range#each handles string ranges by calling… String#upto!

There were other problems though, because rubinius was doing all sorts of stuff it wasn’t really supposed to do. And because rubinius is mostly Ruby, it was easy for me to fix. Or should I say tempting to fix? I have difficulty resisting that kind of temptation, so I submitted my first patch and eagerly awaited my commit access (granted to anyone who submits a patch)…

Eye Opener

I discussed a bit with Evan Phoenix, the creator of rubinius, about ‘backports’ and told him I’d build it into rubinius, avoiding a bunch of alias_method_chain. I thought it would be dirt quick. That is, until I started.

See, to change things in rubinius, you first start by showing they’re broken. And to do that, enter RubySpec. It’s a huge collection of tests that check if what you’re running works as expected. Or as MRI runs it, should I say. You knew that Ruby has no official spec, right?

With the help of Brian Ford, I started to modify my first RubySpecs. That’s when I realized there were so many questions I had never asked myself! Time for another quiz, this time with answers (just click on what you think is right!)

```ruby
# Assume we have:
class MyArray < Array; end
foo = MyArray.new
# What is the class of:
```

foo.to_ary: MyArray or Array?

foo.to_a: MyArray or Array?

Array.try_convert(foo): MyArray or Array?

foo.dup: MyArray or Array?

(foo + foo): MyArray or Array?

(foo * 2): MyArray or Array?

foo.pop(2): MyArray or Array?

foo.shift(2): MyArray or Array?

foo[0..2]: MyArray or Array?

foo.slice(0, 2): MyArray or Array?

foo.slice!(0, 2): MyArray or Array?

foo.first(2): MyArray or Array?

foo.sample(2): MyArray or Array?

foo.flatten: MyArray or Array?

foo.product: MyArray or Array?

foo.combination(1).first: MyArray or Array?

foo.shuffle: MyArray or Array?
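The answers depend on which Ruby you run (several of these changed over the years), so rather than spoil the quiz, here is a small script to probe a few of the calls on your own Ruby and print which class comes back:

```ruby
class MyArray < Array; end

foo = MyArray.new([1, 2, 3])

# Evaluate a few of the quiz expressions and report the resulting class.
{
  'foo.to_a'     => foo.to_a,
  'foo.dup'      => foo.dup,
  'foo + foo'    => foo + foo,
  'foo.first(2)' => foo.first(2),
  'foo.flatten'  => foo.flatten,
}.each do |expr, result|
  puts format('%-12s => %s', expr, result.class)
end
```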

Some are intuitive, like #shuffle, some less so, like #+. I wonder how you’re going to do, because I think I did worse than a monkey would have by guessing randomly!

The complexity and amount of detail found in RubySpecs was a real eye opener. The fact is, often you won’t care about that level of implementation detail. But inevitably some people will.

So far I’ve ported all 1.8.7 Array methods and I’m working on the rest. Writing the specs usually takes a bit longer than the implementation and is damn difficult to get right. Well, at least for me; luckily there are people like Ujihisa who fix my specs minutes after I commit them.

It’s because of a question he asked that I had to refer to the Ruby C source and realized there was a potential problem like the x == y but !(y == x).

That cost me a bunch of hours today, because fixing it was another of those challenges I can hardly refuse, even if I had to delve in the C code!

Next blog entry: update on that bug, along with the solution (unless someone posts them in the comments)!

Thanks to Brian Ford and Evan Phoenix for their help and Ujihisa for pointing me to the complexity of the <=> operator he calls the spacecraft operator. And yeah, to Mathieu Houle for his damn question! ;-)

```ruby
# Without writing any method/block/lambda,
# can you find ways to obtain, in Ruby 1.8.7 or 1.9:
x == y # ==> true
y == x # ==> false
```

Before giving the answer, let me give you a bit of background…

In a blog post, Ujihisa was discussing how to compare arrays in Ruby, and I was curious about the implementation, which deals with recursion.

So what’s recursion you may ask? Just check:

```ruby
x = []
x << x # => [[...]]
```

x is an array containing a single element: x itself. At this point, the choice is yours. You can ask “why should I care?”. I have no good answer and you might as well stop reading now. Or you can say “cool” and read on.

So recursion happens whenever part of an object refers to the object itself. If you’re not careful about it, you can get infinite loops. For example, if you attempt to compare arrays naively by comparing their elements, you’ll get into trouble:

```ruby
x = []; x << x    # => [[...]]
xx = []; xx << xx # => [[...]]
x == xx           # => ???
```

Can you guess the answer?

Older Ruby 1.8.6 raises a SystemStackError because it uses the naive algorithm of comparing the elements (x and xx) over and over.

Current ruby 1.8.7 and 1.9 detect the recursion and say “woah, I don’t want to deal with that, let’s just say they’re different”, so it returns false.

How is that implemented exactly? Well, any call that can be recursive (like x.==(xx) in this case) goes through rb_exec_recursive which keeps track of the receiver (x) on which the method (:==) is called. Recursion is detected when an attempt to call the same method is made on the same object. The method :== returns false for recursive cases.

Note that x == x will still return true, because before the call to rb_exec_recursive, :== checks if the two objects being compared are the same.

What struck me immediately was the lack of symmetry. It didn’t smell good and it didn’t take long to find a problem.

Comparing x and y = [x] works fine, actually. x and y are not the same object, so :== calls rb_exec_recursive, which stores x in its ‘deja-vu’ list. The elements of x and y are then examined, and since they are both the same object (x), true is returned. y == x also returns true. So far so good.

Now x and z = [y] are another matter. Again, x and z are not the same object, so rb_exec_recursive gets called. It pushes x on the ‘deja-vu’ list and compares their elements (x and y). The comparison of x and y is considered recursion, because x is already on the list. So x == z returns false.

But what about z == x? z and x are not the same object, so z is put on the recursion-list and elements are compared. y and x are not the same, so a second call to rb_exec_recursive is made, but y is not on the list (only z is at this point) so their elements are compared. x and x are the same object and thus the comparison returns true. In summary:

```ruby
x = []; x << x # => [[...]]
x == [[x]]     # => false
[[x]] == x     # => true
```

Fixing this inconsistency is not that difficult. Can you imagine how? Instead of pushing only x when calling x.==(y), we need to push the pair [x, y]. Recursion will be triggered only if x.==(y) gets called again, but not for x.==(z). I set out to make a patch in the C code. With the stricter criterion, both x == z and z == x return true.

On the other hand, we still get false for identical recursive arrays that are built independently, like x and xx.

I then realized that if we detect a recursion when comparing x and xx, it simply means that there is no use in looking further down for differences, so we should return true, not false. Unless a difference is detected somewhere else, x and xx are equal! This made it possible to compare recursive arrays that have the same contents, even though they were constructed differently:
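In pure Ruby, a sketch of this pair-based scheme could look like the following (the method name is made up, and this is not MRI’s actual C implementation, which uses rb_exec_recursive):

```ruby
require 'set'

# Sketch of pair-based recursion detection for array equality.
# On detecting recursion we return true: no use looking further down.
def recursive_eql?(a, b, seen = Set.new)
  return true if a.equal?(b)
  return a == b unless a.is_a?(Array) && b.is_a?(Array)
  pair = [a.object_id, b.object_id]
  return true if seen.include?(pair) # recursion detected: equal so far
  seen << pair
  a.size == b.size &&
    a.zip(b).all? { |x, y| recursive_eql?(x, y, seen) }
end

x = []; x << x
xx = []; xx << xx
recursive_eql?(x, xx)    # => true (same contents, built independently)
recursive_eql?(x, [[x]]) # => true
recursive_eql?([[x]], x) # => true (symmetry restored)
recursive_eql?(x, [[1]]) # => false
```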

2009-04-03T00:00:00-04:00
http://blog.marc-andre.ca/2009/04/03/zombies-hashes-archaisms-of-ruby-core

I just love hashes. So much so that I named my blog after them. I also like that the hash sign is used for comments in Ruby, and the way hash resembles hatch, thus the messy graphic theme and all. But I really like hashes. They are like mini-objects (object hatchlings?) and I tend to use them to store all sorts of information, or instead of many conditions with case x; when :a ...; when :b ....

So I was quite surprised to note that in Ruby, either it’s really easy and natural to create a hash (with the super nice {:key => value, ...} syntax) or, when you need to generate a hash programmatically, you’re basically stuck with

```ruby
h = {}
foo.each do |key|
  h[key] = bar(key)
end
```

Well, that’s not quite true; there’s the Hash[key, value, key, value, ...] one can use. Do you use that one? So I decided to propose something. Now I don’t want to risk disturbing people. Especially important people. Except on my blog, of course; it’s your damn fault if you’ve read this far! I still have a bonus coming up for all your effort.

So I thought about this, researched it a bit and came up with the very best I could think of. I was quite nervous and excited when clicking on “Create”! My very first ruby posting was born: Feature #666: Enumerable::to_hash.

I didn’t quite know what to think of the strange omen of ID 666, though. In any event, I must admit that the excitement died down after waiting for anything to happen. It took a month for it to be assigned to Matz. Another two weeks for it to have the target set to “1.9.x”. Complete silence after that.

I must confess I was not registered to the ruby-core mailing list, so I would not have known of anyone writing directly to the list and not through the issue tracker. I believe no one did, though. At least according to Google, because… there is no search on ruby-core’s archives! It’s quite an archaic system, actually. The web front end is horrendous, the user interface is arcane (if not outright buggy). Don’t expect a web link to confirm your registration; you have to send a mail back with a specific body. Short of registering, everything is done by email, actually. There might be a search command you send via email? Argh!

The fact that the search on the issue tracker itself (an otherwise fine product) doesn’t appear to work makes it next to impossible to check previous discussions for something. Like why has Ruby not moved to git yet? I guess I shouldn’t complain since the move to svn only happened a couple of months ago! Or why is the ruby C code indented using 4 spaces, then 1 tab, then 1 tab + 4 spaces, etc… How do you even indent like that using TextMate? I’m 37, I’m used to feeling old-generation and to finding that things are moving quite fast, but damn, how come it’s quite the contrary here?

I pointed out a simple bug two months ago and even provided a patch for the small change in the C code. New releases of ruby 1.8.6 and .7 were made today, and still no update on my bug report. I presume that the whole ruby-core team has a lot on their plate, but it’s hard not to be discouraged from contributing with that kind of (non-)feedback. Even clueless tourists seem to get more attention.

All this to say that 6 months after my feature request, still nothing. That’s when I discovered a cool new way to create hashes out of key-value pairs that is undocumented. This time, I did my best so that it wouldn’t go unnoticed. I conjured demons, invoked strange incantations, made dubious attempts at being humorous and documented the whole thing (zombies will be next!). Here it is. So that’s my bonus to you. Matz coded it, I’m letting you know about it! ;-)
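If memory serves, the trick in question is that Hash[] also accepts an array of [key, value] pairs (in Ruby 1.8.7 and later; treat the exact version as my recollection, not gospel):

```ruby
# Hash[] with an array of [key, value] pairs:
pairs = [:a, :b].map { |key| [key, key.to_s] }
h = Hash[pairs]
# h == {:a => "a", :b => "b"}
```

No splat, no flatten: just map to pairs and hand them over.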

That at least got my original issue noticed… and shot down. Some mustered the courage to speak their mind; we’ll see if this goes anywhere.

2011 update: For those interested, a proposal similar to my original one can be seen in this ruby-core thread.

2009-04-02 · http://blog.marc-andre.ca/2009/04/02/whats-point-of-ruby-187

Can you guess how many built-in methods were introduced or modified when Ruby 1.8.5 came out? How about Ruby 1.8.6? Or the most recent 1.8.7?

Ruby    Changes
1.8.5   2
1.8.6   3
1.8.7   137

I’d love to check that the number of changes was minimal for earlier 1.8.x releases, but I can’t find a good list of changes (other than going through the full changelogs). Does anyone have that info?

Are you writing code that targets 1.8.7? I know I’m not. The code I’m releasing on GitHub is aimed at Ruby 1.8 and Ruby 1.9. The thing is, code that runs on 1.8.7 doesn’t necessarily run on 1.9, and is even less likely to run on 1.8.6 or earlier. At least if you’re writing Ruby in Ruby and using the new Enumerable features, among others. So you have to test all three?

The fact is, Ruby 1.8.7 has a different API from the rest of the 1.8.x line, yet one that is still different from Ruby 1.9’s. So not only is it already difficult to know if some code is compatible with Ruby 1.9 (e.g. isitruby19.com), there are many more possibilities: some gems can be compatible with Ruby 1.8.7 only, for example. Or with 1.8.7 and 1.9.1 but not 1.8.6 and before. It’s actually possible to be compatible with just 1.8.7! Try [:red_pill, :blue_pill].choice.
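To illustrate just how version-specific that is: Array#choice exists only in 1.8.7 (1.9 renamed it to sample), so portable code ends up feature-detecting. A sketch:

```ruby
pills = [:red_pill, :blue_pill]
# Feature-detect: Ruby 1.8.7 has #choice, 1.9+ has #sample
pick = pills.respond_to?(:choice) ? pills.choice : pills.sample
# pick is one of the two pills, whichever Ruby you run
```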

The solution should have been clear, though. Don’t change the API. Instead, use forward compatibility, and that’s easy to do in ruby. I’ve written my own collection of backports after looking in vain for one. I’m still wondering why change the API instead of releasing a standard forward compatibility gem which would work for all Ruby 1.8.x. I mean, all those OS X users with their default 1.8.6 installation… I’m supposed to tell them to upgrade to 1.8.7 because I want to use map(&:to_s)? In any case, I hope that a single require "backports" will enable 1.8.7 specific code to run on earlier versions of Ruby.
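To make the forward-compatibility idea concrete, here is my own sketch (not the actual backports code) of how such a gem works: define a missing method only when it is absent, so map(&:to_s) runs even on a Ruby 1.8.6 that lacks Symbol#to_proc.

```ruby
# Define Symbol#to_proc only if this Ruby lacks it:
unless :to_s.respond_to?(:to_proc)
  class Symbol
    def to_proc
      proc { |obj, *args| obj.send(self, *args) }
    end
  end
end

[40, 2].map(&:to_s)  # => ["40", "2"]
```

On any Ruby that already has the method, the guard makes the patch a no-op.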

Why was sum not included? Probably because the new inject makes it easier to sum enumerables (e.g. [40, 2].inject(:+)) and because Matz wants the methods of Enumerable to remain as generic as possible (and not assume that elements respond to :+, for instance). Still, I quite like the idea of sum.
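The inject form, for reference, works on anything whose elements respond to +, which is exactly the genericity argument:

```ruby
[40, 2].inject(:+)        # => 42
["foo", "bar"].inject(:+) # => "foobar"
```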

Product

Now the irony is that product is not defined in rails, but it is in ruby 1.8.7+.

Naming methods is quite a delicate task. My belief is that a more appropriate and descriptive name would have been cartesian_product, cross_product or product_set. product might be shorter, but I think it runs against the principle of least surprise for a lot of folks. The most frustrating part is that product used without any argument is pretty useless. If you really need that result, there are other ways to get it!

```ruby
[2, 3, 7].product
[2, 3, 7].combination(1).to_a
[2, 3, 7].each_slice(1).to_a
# => same result: [[2], [3], [7]]
```
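By contrast, product with arguments is where the method earns its keep:

```ruby
[1, 2].product([:a, :b])
# => [[1, :a], [1, :b], [2, :a], [2, :b]]
```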

So that’s the hate part.

Now the love part. I had some fun backporting more features of Ruby 1.8.7/1.9 to older Ruby in my backports gem. At some point I had ported enough that I decided I might as well port everything. As of version 1.6, that’s done. This includes, of course, Array#product… which turned out to be the most interesting thing to backport! My first version used a recursive function, but I then thought about using enumerators. After 3 refactors, I got to a really nice version:

I get an enumerator for all the combinations by building it up successively using inject and starting from a trivial enumerator. It would be easy to have product accept a block but the standard simply returns an array, so you’ll find a simple call to to_a at the end. I love enumerators and… I love this implementation of product!
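Here is my reconstruction of that approach (a sketch, not the actual backports source): fold each array into an enumerator of partial tuples, starting from a trivial enumerator that yields the empty tuple.

```ruby
# Build the cartesian product lazily via enumerators, then materialize it.
def product_sketch(*arrays)
  trivial = Enumerator.new { |y| y << [] }   # yields one empty tuple
  arrays.inject(trivial) do |enum, array|
    Enumerator.new do |y|
      # extend every tuple seen so far with every element of this array
      enum.each { |tuple| array.each { |x| y << tuple + [x] } }
    end
  end.to_a
end

product_sketch([1, 2], [:a, :b])
# => [[1, :a], [1, :b], [2, :a], [2, :b]]
```

Dropping the final to_a would keep the whole thing lazy, but the standard product returns an array.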

2009-03-02 · http://blog.marc-andre.ca/2009/03/02/leave-my-options-alone

Let me start by asking you a small quiz:

```erb
# Will there be any difference between the output of:
<% content_tag_for(:tr, Foo.new, :class => "css_class") do %>...<% end %>
<% content_tag_for(:tr, Bar.new, :class => "css_class") do %>...<% end %>

# and the output of:
<%- @style = {:class => "css_class"} -%>
<% content_tag_for(:tr, Foo.new, @style) do %>...<% end %>
<% content_tag_for(:tr, Bar.new, @style) do %>...<% end %>
# ?
```

If you answered “Nope”, congratulations, you’re a normal, sane human being. I like you. Anyone answering “Yup” is either slightly crazy or guessed that I wasn’t asking a trivial question (or both?). Because indeed, the output is different. Why? Because the first content_tag_for modifies its options argument, @style[:class] in this case.
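Distilled to its essence, the problem looks like this (tag_sketch is a hypothetical stand-in, not the real content_tag_for):

```ruby
# A helper that mutates the options hash it receives:
def tag_sketch(options)
  options[:class] = "#{options[:class]}_modified"
end

style = {:class => "css"}
tag_sketch(style)
style[:class]  # => "css_modified", the shared hash changed under us
```

The second caller of tag_sketch(style) would see the already-modified class, which is exactly the surprise in the quiz.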

You’re probably not expecting yet another apparently trivial question, but here goes: is this a bug?

If you had to bet about my opinion, your 5 bucks would be safe on “yeah, it’s a bug”. But I can’t really say that it is a bug. I’ve never read anywhere that options won’t be modified. It’s (of course) not specified in the doc of content_tag_for. It’s generally not stated what happens when you pass an unrecognized option, so forget about things like that. I’m not aware of any official general rule of rails. I doubt there is one, because I can find many places where options are modified (e.g. error_message_on, truncate, highlight, excerpt, word_wrap, …). These other examples, though, won’t modify the options in a harmful way. Indeed, writing:

```ruby
@options.reverse_merge!(:foo => "default_bar")
```

will not cause a problem like the one I just showed (unless anything else relies on options[:foo] being left unspecified).

```ruby
# If this works:
truncate("hello", {:length => 4})

# Shouldn't this work too?
truncate("hello", {:length => 4}.freeze)
```

Methods that modify their options won’t let you pass a frozen hash, though. Do you freeze your constants? I like freezing things. I freeze my constants, I freeze my settings, I freeze everything I can. And it upsets me when I can’t pass a frozen options hash. Is this a bug too?

Unless I’m mistaken, the current stance is that options passed can be changed, tortured and abused as much as the implementation desires and god damn it, check the source if you care.

I believe rails should take a clear and reasonable stance on options. I can think of two:

1) options can be modified, but only in a way that is independent of any other arguments or internal state.

2) options will never be modified.

truncate would be considered buggy only if the second stance is taken, while content_tag_for, as in my example, would be buggy under either position, since it depends on the class of the second argument.

My personal vote goes for the second stance: leave my options alone!
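Stance 2 is cheap to honor, too: a helper just dups the hash before touching it. A sketch (truncate_safely is my illustrative name, not a Rails method):

```ruby
def truncate_safely(text, options = {})
  options = options.dup  # leave the caller's hash, even a frozen one, alone
  length = options[:length] || 30
  text.length > length ? text[0, length - 3] + "..." : text
end

truncate_safely("hello world, this is long", {:length => 10}.freeze)
# => "hello w..."
```

One dup per call, and both the shared-hash surprise and the frozen-hash error disappear.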

2009-03-01 · http://blog.marc-andre.ca/2009/03/01/does-bill-gates-use-ie

Anyone who knows me personally is bound to know that I despise Windows (and Internet Explorer, among other Microsoft products). I’m the first to admit that my hatred borders on irrationality. The fact that I’m a complete newbie on Windows probably doesn’t help either. I can count on my fingers the number of hours I’ve spent playing/cursing on Windows. That being said, every single time I have to use Windows, I always wonder: does Bill Gates use it? What’s his reaction to all those things that pop up? Does he browse on Internet Explorer? Does he ever wonder if he just clicked properly and something is happening, or if the computer is just waiting for another click?

A couple of weeks ago, I was staying with my best friend’s family in the middle of the French Alps. They had internet through the owner’s extremely paranoid device that not only required a password to join the network but also needed a physical acknowledgment to allow the MAC address. I didn’t insist on having my trusty PowerBook blessed and instead accessed my mail with their Dell on Windows XP.

First task: browse. This was a machine owned by a reasonably technical person; Firefox was installed and the little keyboard gizmo was already there, giving me a quick way to switch from the French AZERTY keyboard to the US QWERTY. Side note: if anyone knows of a single reasonable motivation for changing the base layout for any Latin language, please enlighten me!1 Anyways, that’s not Microsoft’s fault and I’m glad I could switch easily. Well, most of the time. Sometimes I’d switch and the ‘FR’ just wouldn’t budge. Repeat, still no change. Again… still no change. After 5 or 6 attempts, woo-hoo, it changes. Another time it would change visually (the gizmo said ‘EN’) but the layout used when I entered text was still wrong. The menu disappeared altogether once. What am I supposed to do then?

Note that even if it worked more than half of the time, I’d still rant about the design lunacy of having this setting be per application. Why didn’t they make the only reasonable choice of a per session setting? Beats me. Luckily for me, I didn’t have to really use any other application besides Firefox.

Of course, I gave up entirely on typing any accents in my emails since I don’t know the dozens of 3-digit codes I’d need. Did you know that on any Mac you can type all special symbols with easily remembered keys, like alt-c for ç and alt-` + a,e,u for à,è,ù… ? That holding the shift key will yield the uppercase version, like alt-shift-c for Ç, and alt-` + A for À, … ? That Apple introduced this… 25 years ago? Before Windows even existed? Bonus point if you know the alt-code for É!

OK. Second task: Testing my luck, I thought uploading photos to Facebook would be fun. Alas when copying the photos from the USB keys, the machine would freeze about one time out of three. Better than my old compact flash adapter that would make any PC reboot when the card was 4 GB or bigger, but still! The reboot time was really long; actually most everything took forever. My five-year old powerbook was da-bomb compared to it. It took ages to copy everything to the Dell and I was finally able to upload stuff to Facebook.

After a couple of days, my website on Amazon EC2 froze and I then really wanted to have internet on my machine. We found an ethernet cable (note to self: always pack one) and enabled the internet sharing on the Dell. I would not have been able to find it myself, mind you. My friend Pascal showed me the intricate way2. I’m still glad it was there at all! Like the keyboard switching, it would work a bit less than half the time. When it didn’t, I had to go back into the settings, turn it off, click OK, wait for the window to close (~10 seconds), click ‘Advanced’ again, turn it on, click OK, wait some more (~1 minute!), and that would do the trick (most of the time; otherwise, goto 10).

So back to my original thought: does Bill Gates use his computer at all? Presumably, he doesn’t change the keyboard layout a lot, type in French much, need to share an internet connection, or do much of anything worthwhile? Or else wouldn’t he see it doesn’t work properly? He must have the power to fix anything he wants, no? Even if he had to pay from his own pocket to have it fixed, what would it represent for him? He could buy 10 condos like mine, every day, for the rest of his life without running out of money. If you had this money and power, wouldn’t you say “ok, I’ll just get that fixed”? Forget about making profits, forget about making things better for the planet. “Just fix it for me, yesterday, thank you very much”.

Note: I’ll try not to make more than one (or two?) rants against Microsoft per year!

1 The introduction of new letters (é, ß, …) justifies changes to the overall layout, but I’m wondering why the common 26 letters couldn’t stay put. Geeks could still curse because the needed symbols [} and such would be placed differently, but for normal needs, there would be a common ground. Let’s thus focus on changes to letters only. One of the changes between the AZERTY and the QWERTY is a swap between the W and Z. These two letters are the two least frequently used in French. Ergo this is the swap, among all 325 possible swaps (ignoring the zillions of longer permutations), that will yield the least noticeable gain in efficiency! I leave the trivial proof as an exercise to the reader :-)
If the most popular keyboard layout were Dvorak, I could see how a reasonable way to keep the layout optimized would yield different layouts depending on the language. The fact is, QWERTY is quite far from being optimized in any rational way. It’s reputed to have been designed to ensure that successive letters wouldn’t jam a typewriter. I call BS. The most frequent pairs of letters in English are th, he, an, re, er, in, on, at, nd, st, es, en, of, te, ed, or, ti, hi, as, to (source). You’ll notice that almost half of these are more or less adjacent (`th, re, er, in, es, te, ed, as`) while Dvorak has only `th, st` and `hi` that are (and the latter still fits Dvorak’s goals.) So not only is the “official” optimization practically obsolete, it doesn’t even fit the bill. Anyways, all that to say:

QWERTY is a terrible layout in English

it’s not clear if it is worse in French or other Latin languages, but small changes won’t lead to any noticeable gain and will confuse any globetrotter

Why, oh why, are we stuck in a world where not only do we use a bad layout, but we can’t stick with that bad layout for most latin languages that use A-Z?
2 On XP, it is:
Control Panels -> Network Connections -> Local Area Connection -> Properties -> Advanced -> check ‘Allow other network users to connect through this computer’s Internet connection’ -> OK
Compare that to OS X:
System Preferences -> Sharing -> click on ‘Internet Sharing’
Not only will you notice that the number of operations is more than double on XP, but the choices to make are more difficult. On the mac, the only non trivial choice is between ‘Network’ and ‘Sharing’. On Windows, your first choice is between ‘Network Connections’ vs ‘Internet Options’ (and hesitation with ‘Network Setup Wizard’ and ‘Wireless Network Setup Wizard’). Since control panels are not grouped like on the mac, you have to consider all of them, if you don’t know what the answer is. On the mac, there are only 2 other possibilities under ‘Internet & Network’ besides ‘Network’ and ‘Sharing’.
Then you have to select the currently active internet connection. It is the most probable choice but it’s not obvious which connection is the currently active one! Finally, you have to think of looking in the ‘advanced’ tab.
Undoubtedly, the most important difference is that on the Mac… it works.

Thanks to Pascal for his comments on my first draft.

2009-02-27 · http://blog.marc-andre.ca/2009/02/27/please-write-ruby-in-ruby

I’m always surprised when I see bright people writing ruby code without using ruby’s standard lib. Do I need to point out that it’s less readable and more error prone?

I plead with all rubyists to re-read the doc for Array, Hash and Enumerable/Enumerator. Refer back to it. Use it. Please!
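To give a flavor of the kind of rewrite being pleaded for, here is a made-up example (not the code from the post, which is in the embedded snippet below):

```ruby
# Long-hand, reinventing Enumerable by hand:
result = []
[1, 2, 3, 4].each { |x| result << x * 2 if x.odd? }

# The same thing, written in Ruby's own vocabulary:
[1, 2, 3, 4].select(&:odd?).map { |x| x * 2 }  # => [2, 6]
```

The second version states its intent (filter, then transform) instead of narrating an accumulator.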

I was quite amazed to see the following code (written by an ex rails-core programmer, nothing less!). Check out the three methods and ask yourself what they do and how they should be written (mouse-over the code for the answers).

The extract_is_cool! method was actually not even needed because there was a merge!(options) later on, just adding insult to injury…

2009-02-23 · http://blog.marc-andre.ca/2009/02/23/ruby-doesnt-dig-threads

Either I’m missing something, or threads in both MRI and YARV just plain suck. My test program goes through a 10 MB file of random data, splitting it into chunks (either 1K, 10K or 100K each). The results for MRI show the threaded version is much slower (~2x); in YARV, performance is similar but usually slower for the threaded version. Mind you, I’m running this on 4 cores! rubinius looks like YARV on a valium overdose (20x slower…). Only in JRuby are things like what I expected, i.e. similar performance or faster with threads, with the difference being noticeable with more processing.

Note: it’s understandable that 1.9 is much slower than 1.8 here because I process strings and only 1.9 deals with encoding.

2009-02-21 · http://blog.marc-andre.ca/2009/02/21/ruby-threads

I’m pondering a really neat scheme for my upcoming FLV editor. My editor can be thought of as a series of processors acting on tags; the first processor reads them, then others analyse/modify them and the last one writes them. The scheme would need some sort of disconnection in the processing, either with continuations (which appear to be implemented in two different ways in ruby 1.8 and 1.9) or threads. Which leads to the question:

What’s the performance of a program that successively reads and writes chunks of data, compared to one where one thread reads and another writes?
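The disconnected scheme can be sketched with a Queue between two threads — a toy version of the reader/writer split, with in-memory chunks standing in for the file:

```ruby
require 'thread'

queue = Queue.new
chunks = ["a" * 1024, "b" * 1024]   # stand-ins for chunks read from a file

reader = Thread.new do
  chunks.each { |c| queue << c }
  queue << nil                      # end-of-stream marker
end

written = []
writer = Thread.new do
  while (chunk = queue.pop)         # nil marker is falsy, so the loop ends
    written << chunk
  end
end

[reader, writer].each(&:join)
# written now holds both chunks, in order
```

Queue handles the locking, so the reader and writer never see a half-delivered chunk; whether the split actually buys speed is exactly what the benchmark in the follow-up post measures.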

Update: Many reader comments (lost in migration, sorry) agreed with me, and some pointed out http://w3fools.com/

2008-10-17 · http://blog.marc-andre.ca/2008/10/17/please-dont-abbreviate

Abbreviation sucks. I’ll add famous people who agree with me here when I get the time. And if I find any!

I dislike the fact that ruby’s Time class has a mon method (c’mon!), but at least it is aliased as month. Now why oh why does ruby’s Time class have a min method and no minute method? Same goes for sec vs second. At least sec isn’t as ambiguous as min.
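Nothing stops us from adding the unabbreviated names ourselves, of course; here is a small monkey patch (my own, not core Ruby) that aliases them in:

```ruby
class Time
  # Add the spelled-out names only if some future Ruby hasn't already:
  alias_method :minute, :min unless method_defined?(:minute)
  alias_method :second, :sec unless method_defined?(:second)
end

t = Time.at(0).utc
t.minute == t.min  # => true
t.second == t.sec  # => true
```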