I've done some more work on this and have created a test case that reliably
throws errors, although the errors themselves are not consistent.
About 1 out of every 4 times, I get the "can't convert Symbol to Hash" error
in server/lib/backgroundrb/results.rb:40 in 'merge!'.
I created the following worker class in
{RAILS_ROOT}/lib/workers/results_test_worker.rb:

# This class repeatedly writes values to the results, to
# test the results process
class ResultsTestWorker < BackgrounDRb::Worker::RailsBase
  def do_work(args)
    logger.info "Started ResultsTestWorker"
    results[:started_at] = Time.now
    args ||= {}
    limit = args[:limit] || 10_000
    logger.info "Limit is #{limit}"
    limit.times do |i|
      results[:last_update] = Time.now
      results[:counter] = i
    end
    stop_time = Time.now
    logger.info "Stopped ResultsTestWorker at #{stop_time}"
    results[:stopped_at] = stop_time
    self.delete
  end
end
ResultsTestWorker.register

Then in {RAILS_ROOT}/test/unit/drb_results_test.rb I have:

require File.dirname(__FILE__) + '/../test_helper'

class DrbResultsTest < Test::Unit::TestCase
  def setup
    # start backgroundrb server
    `../../script/backgroundrb start`
    sleep 5 # give it time to startup
  end

  def teardown
    # stop backgroundrb server
    `../../script/backgroundrb stop`
  end

  def test_results
    limit = 10
    keys = []
    4.times do |i|
      job_key = "#{self.class.name}_#{i}"
      keys << job_key
      MiddleMan.new_worker(:class => :results_test_worker,
                           :job_key => job_key,
                           :args => {:limit => limit})
    end
    sleep 2 # wait for workers to finish
    keys.each_with_index do |k, i|
      assert_not_nil MiddleMan[k], "checking job_key #{k} on iteration #{i}"
      assert_not_nil MiddleMan[k].object, "checking object on iteration #{i}"
      assert_not_nil MiddleMan[k].object.results,
                     "checking results on iteration #{i}"
      assert_equal(limit - 1, MiddleMan[k].object.results.to_hash[:counter],
                   "checking counter on iteration #{i}")
    end
  end
end

This test does the following:
- Spawns 4 results_test_worker processes that each write several values to
  the ResultsWorker (in parallel). Increasing the limit value increases the
  odds of these processes writing results at the same time, but I've found
  that a limit of 10 works pretty well.
- Waits a couple of seconds for the workers to finish. (Is there a better
  way to determine whether the processes are all done?)
- Then tries to access the results for each job_key, specifically to
  ensure that the counter value is equal to limit - 1.
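
On the question of waiting for the workers: one possible alternative to a fixed sleep is to poll for a completion condition with a timeout. The following is only a minimal sketch. wait_until is a hypothetical helper, not part of BackgrounDRb, and the condition shown is a stand-in. In the test above the block might check for results[:stopped_at] on each job_key, but whether that value is still readable after the worker calls self.delete is an assumption that would need checking.

```ruby
# Hypothetical helper, not part of BackgrounDRb: calls the block repeatedly
# until it returns a truthy value or the timeout (in seconds) expires.
# Returns the block's truthy value, or false on timeout.
def wait_until(timeout = 10, interval = 0.1)
  deadline = Time.now + timeout
  loop do
    result = yield
    return result if result
    return false if Time.now >= deadline
    sleep interval
  end
end

# Stand-in condition for illustration: "done" once a short delay has passed.
# In the real test the block would inspect the workers' results instead.
finished_at = Time.now + 0.3
done = wait_until(2, 0.05) { Time.now >= finished_at }
puts "workers finished: #{done}"
```

With a helper like this, the test fails fast with a clear timeout instead of asserting against results that may not have been written yet.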
NOTE: I've never gotten this test to complete successfully. In addition to
the "can't convert Symbol to Hash" error, I've seen the following:
- The [:counter] value is much lower than the expected value. If limit is
  10,000 this value might be 246 when 9,999 was expected.
- The job_key is not recognized: the call to MiddleMan[k] returns nil. When
  this occurs, I can usually see in the backgroundrb.log that fewer than 4
  workers were actually created. I can see this by counting the number of
  "Started ResultsTestWorker" messages in the log.
- The job_key is resolved, but the call to MiddleMan[k].object.results
  returns nil.
- The call to MiddleMan.new_worker hangs and never returns.
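
Counting the start messages can be scripted rather than done by eye. This is a small sketch only: count_started_workers is a hypothetical helper, and the log path passed to it is an assumption that depends on where backgroundrb.log is written in a given setup.

```ruby
# Hypothetical helper: counts log lines containing the worker's start message.
# Returns 0 if the log file does not exist.
def count_started_workers(log_path, marker = "Started ResultsTestWorker")
  return 0 unless File.exist?(log_path)
  File.readlines(log_path).count { |line| line.include?(marker) }
end

# Assumed location of the log; adjust to the actual path.
puts count_started_workers("log/backgroundrb.log")
```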
I'm sharing this code so that others can try it out. It's a bit of a hack to
get some testing working (starting and stopping the BackgrounDRb server on
each test, having a test worker class in lib/workers, etc.), but it is
self-contained, and it replicates the real-world environment of my code
running in Rails. If you have suggestions for improving the testing approach
I'm all ears.
I'm also interested in feedback on the code itself. Maybe I'm not working
with the MiddleMan object correctly. I have to admit I'm still wrapping my
head around DRb.
Resolving this issue is critical to my project so I will continue trying to
track things down. I'll start by adding a mutex to the Results#[]= method.
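
To illustrate the general idea of guarding writes with a mutex, here is a minimal sketch. SynchronizedResults is a hypothetical stand-in, not BackgrounDRb's actual Results class, and it uses threads in a single process where the real workers go through DRb. It only shows the pattern of serializing []= so that concurrent writers cannot interleave.

```ruby
require 'thread' # provides Mutex on older Rubies; harmless on newer ones

# Hypothetical stand-in for a results store. This is NOT BackgrounDRb's
# Results class, only an illustration of mutex-guarded access.
class SynchronizedResults
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  # Serialize writes so concurrent callers cannot interleave.
  def []=(key, value)
    @lock.synchronize { @data[key] = value }
  end

  def [](key)
    @lock.synchronize { @data[key] }
  end

  # Return a copy so callers never hold a reference to the live hash.
  def to_hash
    @lock.synchronize { @data.dup }
  end
end

# Four concurrent writers, mirroring the four workers in the test.
results = SynchronizedResults.new
limit = 1_000
threads = (0...4).map do |n|
  Thread.new do
    limit.times { |i| results["counter_#{n}"] = i }
  end
end
threads.each { |t| t.join }
puts results.to_hash.inspect
```

Each writer should end with its counter at limit - 1, which is the same condition the unit test asserts.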
Mason
On 1/10/07, skaar <skaar at waste.org> wrote:
> It might be that we have to introduce a mutex in the results worker
> where this happens. I'll try to get this reproduced sometime this
> weekend.
>
> /skaar