I want to scale each row of a matrix by the corresponding value of x. This operation is not standard matrix multiplication, though I feel like it should be!

I admit I slept through Linear Algebra class, so I was a bit dumbfounded about how to express it using normal matrix multiplication. While I could do it using LinAlg’s mapping functions, I knew it had to be expressible as a matrix product, since it’s all linear transformations.

Luckily James Lawrence, who maintains LinAlg, happened to be emailing me and asked me about it (thanks!). He gave me some code that did it both ways. After I read it, I slapped myself.
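James’s code isn’t reproduced here, but here’s roughly what the two ways look like, using Ruby’s stdlib Matrix as a stand-in for LinAlg (so the API details differ; calculate1 and calculate2 are the names I use below for the multiply and map versions):

    require 'matrix'

    # The matrix-multiply way: scaling row i of m by x[i] is the same as
    # left-multiplying m by the diagonal matrix diag(x).
    def calculate1(x, m)
      Matrix.diagonal(*x) * m
    end

    # The mapping way: walk the rows and scale each one directly.
    def calculate2(x, m)
      Matrix.rows(m.row_vectors.each_with_index.map { |row, i| (row * x[i]).to_a })
    end

    # e.g. calculate1([1, 2], Matrix[[1, 1], [1, 1]]) => Matrix[[1, 1], [2, 2]]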

I wasn’t sure which one would be faster. Maybe they’d be about the same. Maybe the matrix multiply would be slower, since it has to grind through all the multiplications. Per James’s suggestion, I benchmarked them, in the fine form of yak shaving.
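A minimal sketch of the kind of harness I mean, using Ruby’s Benchmark module (the sizes and the random fill are illustrative, not the exact experiment):

    require 'benchmark'
    require 'matrix'

    rows = 800
    [100, 200, 400, 800].each do |cols|
      m = Matrix.build(rows, cols) { rand }
      x = Array.new(rows) { rand }

      t1 = Benchmark.realtime { calculate1(x, m) }
      t2 = Benchmark.realtime { calculate2(x, m) }
      puts "#{cols} cols: multiply=#{t1}s map=#{t2}s"
    end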

I ran two experiments on a laptop (1.1GHz Pentium M, 758MB RAM). For the first, I kept the matrix at 800 rows, then grew the columns. Let’s see what it looks like:

[Graph: time (secs) vs column size]

I should have labeled the axes in GnuPlot, but you’ll live. The y axis is the number of seconds, and the x axis is the column size. Uninteresting, but nice to know it’s linear. The two functions aren’t significantly different in the amount of time they take. I expected the map (calculate2) to be much faster, since it doesn’t have to do all the multiplies. Oh well.

I almost didn’t run the second test, but it proved to be a bit more interesting. This time I kept 800 columns and grew the number of rows. Same axes, different graph:

[Graph: time (secs) vs row size]

Whoa! It’s exponential. Or quadratic. I can’t tell. Anyway, anything curving up is bad news. I suspected this might have something to do with row-major/column-major ordering: C stores matrices row by row, whereas Fortran stores them column by column. Update: As corrected by James L., in the first experiment, growing the columns creates more multiplies inside the diag() call, but the size of the diagonal stays the same. Growing the rows, however, creates fewer multiplies inside the diag() call, but each increase in row size increases both dimensions of the resulting diagonal matrix, so diag(x) has n² entries for n rows. So it’s quadratic.

So what I found wasn’t what I went looking for. But given that I’ve read about it before, and that it would have made sense had I thought about it, I guess it’s not super surprising. Let’s see if we can use transpose to cut down on the time. We’ll grow the rows as before, and compare that against growing the rows but transposing the input and then transposing the output, to get the same result. What does it look like?

[Graph: time (secs) vs row size]

This is good news. Even though the transposes are a couple of extra manipulations, they save us computation at bigger matrix sizes. The most interesting part of the graph is where the two curves cross. If LinAlg (or any other package, for that matter) could somehow detect where that crossover point is going to be, it could switch between the two implementations. The only way I can think of is another package underneath that randomly samples each implementation whenever a user calls the function, interpolates each one’s growth curve, and then calculates the crossing analytically. I don’t currently know of any package that does this (or if one does, I don’t know about it, cuz it already performs so well by doing the switch!)
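For the record, the transpose trick works because diag(x) is symmetric: transposing the input, multiplying by the diagonal on the right, and transposing the result gives the same answer as multiplying by diag(x) on the left. In the same stdlib-Matrix sketch as above (again, the real code used LinAlg):

    # Same result as calculate1, but computed on the other side:
    # transpose in, multiply, transpose out.
    def calculate_transposed(x, m)
      (m.transpose * Matrix.diagonal(*x)).transpose
    end

Whether the extra transposes pay off depends on the storage order underneath, which is exactly what the crossover in the graph shows.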

This was a nice little diversion from my side projects…a side project of side projects. Back to learning about information gain and its ilk. At least something came out of it. I have a nice little experiment module that I can use to do other experiments. And I spent way too much time on it not to post something…

Optimization isn’t something you should do too early on, but I think a little housecleaning every so often to make sure your pages aren’t ridiculously slow is healthy. With any optimization task, you’ll want to benchmark the results and see if there’s an actual gain. The most basic tool for benchmarking is the ordinary script/performance/benchmarker. The easiest analysis tool to find is the rails_analyzer gem. The last time I used rails_analyzer, it wasn’t that easy to use; the command-line arguments seemed arcane. But its bench tool, which can benchmark controllers as opposed to just models, is fairly easy to use.
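If you haven’t seen it, the benchmarker script takes an optional iteration count and a snippet of Ruby to time; the invocation looks something like this (the model call here is a made-up example, and the exact script name varies by Rails version):

    script/performance/benchmarker 100 'Book.find(:all)'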

In the listing of books, we display whether each book is bookmarked by the user or not. Normally, without the :include, the listing makes a separate query to the DB every time it displays a book list element, since it uses bookmarked_by?(user_id) to determine whether the user bookmarked the book. So instead of just 1 query, it makes n + 1 queries.
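Here’s the shape of the problem with a hypothetical Book/Bookmark pair (old-style finder syntax; this bookmarked_by? is my guess at such a helper, not the actual code):

    class Book < ActiveRecord::Base
      has_many :bookmarks

      # Hypothetical helper: did this user bookmark the book? Touching
      # `bookmarks` fires one query per book unless it was preloaded.
      def bookmarked_by?(user_id)
        bookmarks.any? { |b| b.user_id == user_id }
      end
    end

    books = Book.find(:all)                          # n + 1 queries when listed
    books = Book.find(:all, :include => :bookmarks)  # bookmarks preloaded up front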

Preloading child tables isn’t necessarily wise all the time. It really depends on what you intend to do with the data after you fetch it. As the Agile Rails book warns, preloading all that data takes time. If you look at your log files, you’ll see that it’s a significant amount.

If you’re only going to load a limited number of these book list elements on a single page at a time, it actually might make sense to forgo preloading of child tables, and just use a find() instead of a select.

And if you’re going to display counts of associated records, by all means use counter caching. It’s easy to do (as long as you follow the instructions!) for most situations.
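The standard setup looks like this, sticking with the hypothetical Book/Bookmark models (the column has to be named bookmarks_count for Rails to pick it up):

    class Bookmark < ActiveRecord::Base
      # Rails keeps books.bookmarks_count in sync on create and destroy.
      belongs_to :book, :counter_cache => true
    end

    class AddBookmarksCountToBooks < ActiveRecord::Migration
      def self.up
        add_column :books, :bookmarks_count, :integer, :default => 0
      end

      def self.down
        remove_column :books, :bookmarks_count
      end
    end

    # book.bookmarks.size now reads the cached column instead of COUNT(*).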

Intuitively, if you want to display more than a certain number n of book list elements, it makes more sense to use :include and select them all up front. However, I wanted to point out that when you make decisions like this, you’ll always want to measure the load times, because you earn what you measure.

Also, use the right number of runs. The fewer times you run a function, the more variation you’ll see in your benchmarks. Let’s say that you get two numbers for two different methods.

So it’s obvious that method2 is better, right? Well, not necessarily. Benchmarks usually only show averages, but you’ll need to pay attention to standard deviations too. The bigger the standard deviation, the more runs you’ll need to pin down the average load time and the number of decimal places you can trust. That way, you can figure out whether the difference in load times is statistically significant or not.
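A sketch of what I mean: time the same call repeatedly and look at the spread, not just the mean (the run count and the timed call are placeholders):

    require 'benchmark'

    # Time the block `runs` times; return [mean, standard deviation] in seconds.
    def timing_stats(runs = 30)
      times = Array.new(runs) { Benchmark.realtime { yield } }
      mean = times.inject(0.0) { |sum, t| sum + t } / runs
      variance = times.inject(0.0) { |sum, t| sum + (t - mean) ** 2 } / runs
      [mean, Math.sqrt(variance)]
    end

    mean, stddev = timing_stats { Book.find(:all, :include => :bookmarks) }
    puts "mean=#{mean}s stddev=#{stddev}s"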

And that tells you whether the optimizations you made were worth the trouble or not.
