Tuesday, 11 March 2008

As I've mentioned in a previous post, I've been doing a lot of performance tweaking recently. By far and away the biggest gains I've garnered have been by tweaking the SQL/DB in various ways. One of the biggest things I learned was to use indexes appropriately. This means two things:

Put an index on anything important

Get the field order right

So what's important?

From the perspective of an index, immportant means anything that you will need to fetch repeatedly - especially if that fetch is in a tight loop. This is one place where RubyProf is your friend. You can also take a look at the SQL queries scrolling up the screen while watching the console output to get a rough idea of the most important queries.

What about the field order?

Indexes work better if you put the "most discerning" fields before the less-discerning ones. As an example, suppose you have a (completely spurious) table as follows:

owner_name

pet_name

toy_name

Bob

Fido

Bone

Bob

Fido

Rubber chicken

Bob

Whiskers

Jingly ball

Bob

Whiskers

paper on string

Jane

Woofy

chew toy

Jane

Woofy

bouncy ball

Jane

Kitty

paper bag

Jane

Kitty

laser pointer

Assume for some reason it's important to have an index on all three fields above. The field that is most unique is the "toy_name" - doing a lookup on that field will get the best bang for your buck, so it should come first in your index, followed by the next-most discerning field. In the example data above it's the pet_name, but maybe you know your intended audience a bit better. If you know that a lot of people are calling their dogs 'fido' and their cats 'kitty', you might index on the owner_name. The most likely circumstance is that you'll have the owners listed in a users list and there's actually an owner_id - in which case you'd definitely set that as the second key for the index.

Is it really that important?

The contract I'm currently working on requires us to display reports with tens of thousands of orders and trades in them, cross-referenced against multiple calculated benchmark values. It's imperative that the indexes are set up to pull that data out of the db in the most efficient way possible.

I acheived an order of Magnitude speedup just by changing around the ordering of the indexes! This was worth it for us - YMMV

Tuesday, 4 March 2008

So we wanted to test whether our code actually puts things into the log when it should - eg to log important events such as "User X just got archived by User Y" or "Someone tried to login as User X but failed!"

Unfortunately, the Ruby Logger seems to be write-only. There's no way to actually get back the messages that you carefully put into it. So how to test it?

Asking around, I got a few ideas for how to go forward. The best option would be to have a separate "audit log" for special events that need to be independantly auditable. However this would take Time(tm) (of which we seem to be in short supply these days). The next best option was to use mocha... but that,, to my mind, doesn't really test the actual logger, which I want to be certain actually works the way I expect.

So I began to think... hmmm - monkey-patch.

What follows is quick and dirty... I had intended to go for elegeant, but I'm satisfied that it at least tests the actual logger object rather than having to do something dodgy by creating a temporary, second log and reading it in as a file :P

The monkey--patches

Put these into test_helper.rb somewhere up the top (not inside class test::Unit::TestCase)

# a way of actually getting the usual rails logger object
def logger
RAILS_DEFAULT_LOGGER
end
# assert that the log has received the given message at the given log
# level. Can pass in matching message as a string or a regular expression.
def assert_event_logged(regex = nil, lvl = nil, match = true, &the_block)
logger.clear_log_messages
regex = /#{regex}/ if regex.is_a? String
yield the_block
last_logged = logger.last_logged(lvl)
if match
assert !last_logged.blank?, "should have logged some message at given level #{lvl.inspect}"
assert last_logged.detect {|msg| msg =~ regex }, "should have matched the given argument #{regex} at the given level: #{lvl.inspect} Instead got: #{last_logged.join('\n')}"
else
return if last_logged.blank? # trivial case
assert last_logged.all? {|msg| msg !=~ regex }, "should not have matched the given argument #{regex} at the given level: #{lvl.inspect}. Instead got: #{last_logged.join('\n')}"
end
end
# assert that this specific event was not logged
# ie - just an alias for the opposite of event_logged
def assert_event_not_logged(regex = nil, lvl = nil, &the_block)
assert_event_logged(regex, lvl, false) { yield the_block }
end
# check that no events were logged at the given level
def assert_no_event_logged(lvl = nil, &the_block)
logger.clear_log_messages
yield the_block
last_logged = logger.last_logged(lvl)
assert last_logged.blank?, "should not have logged a message at the given level at the given level: #{lvl.inspect}"
end

Here are some basic tests that will test the above assertions are working correctly... and stand as basic examples of use:

def test_should_show_msg_logged
msg = "forcing the logger to log"
assert_event_logged(msg) do
logger.debug msg
end
end
def test_should_show_message_logged_at_a_given_log_level
msg = "forcing the logger to log"
assert_event_logged(msg, Logger::DEBUG) do
logger.debug msg
end
assert_event_logged(msg, Logger::INFO) do
logger.info msg
end
end
def test_should_show_msg_logged_matches_given_regex
msg = "forcing the logger to log"
assert_event_logged(/forcing/, Logger::INFO) do
logger.info msg
end
assert_event_logged(/oogley/, Logger::INFO, false) do
logger.info msg
end
end
def test_should_show_msg_logged_matches_given_regex
assert_no_event_logged() do
# do something that doesn't log anything
three = 1 + 2
end
end
def test_should_not_match_msg_that_doesnt_match
assert_event_not_logged("yada yada", Logger::INFO) do
logger.info "blah blah"
end
end
def test_should_not_match_msg_logged_at_different_log_level
assert_no_event_logged(Logger::INFO) do
# do something that doesn't log anything
three = 1 + 2
# log something at a log level different to that specified
logger.debug "should not get this one"
end
end

PS - if anyone wondered where I've been for the past months I've been pretty busy performance-tuning the site in preparation for some sort of slightly-more-official-launch-than-just-to-special-customers-only...

So haven't had much chance to actually do anything interesting, but hopefully we're actually *that* much closer to really actually launching the site to Real People (tm).