Image Registration

This work investigates the use of clouds and autonomic cloud-bursting
to support medical image registration. The goal is to enable a
virtual computational cloud that integrates local computational
environments and public cloud services on-the-fly, and support image
registration requests from different distributed researcher groups with
varied computational requirements and QoS constraints. The virtual cloud
essentially implements shared and coordinated task-spaces, which
coordinate the scheduling of jobs submitted by a dynamic set of
research groups to their local job queues. A policy-driven scheduling
agent uses the QoS constraints along with performance history and the
state of the resources to determine the appropriate size and mix of the
public and private cloud resources that should be allocated to a specific
request. The virtual computational cloud and the medical image
registration service have been developed using the CometCloud engine and
have been deployed on a combination of private clouds at Rutgers
University and the Cancer Institute of New Jersey, and on Amazon EC2. An
experimental evaluation is presented and demonstrates the effectiveness
of autonomic cloudbursts and policy-based autonomic scheduling for this
application.

Medical Image Registration

Nonlinear image registration is the process of determining the mapping T
between two images of the same object, or of similar objects, acquired at
different times, in different positions, or using different acquisition
parameters or modalities. Both intensity/area-based and landmark-based
methods have been reported to be effective in handling various
registration tasks. Hybrid methods that integrate both techniques have
demonstrated advantages in the literature. In general, intensity/area-based
methods are widely accepted for fully automatic registration. Landmark-based
methods, though also commonly used, sometimes still rely
on human intervention in selecting landmark points and/or performing
point matching. Point matching in medical images is particularly
challenging due to the variability in image acquisition and anatomical
structures.

We developed an alternative landmark point detection and matching method
as part of our hybrid image registration algorithm for both 2D and 3D
images. The algorithm starts with automatic detection of a set of
landmarks in both fixed and moving images, followed by a coarse to fine
estimation of the nonlinear mapping using the landmarks. For 2D images,
multi-resolution orientation histograms and intensity templates are
combined to obtain a fast affine-invariant local descriptor of the
detected landmarks. For 3D volumes, considering both speed and accuracy,
a global registration is first applied to pre-align the two 3D images.
Intensity template matching is further used to obtain the point
correspondence between landmarks in the fixed and moving images. Because
the initial landmark correspondence contains a large proportion of
outliers, a robust estimator, RANSAC, is applied to reject them. The
final refined inliers are used to robustly estimate a Thin-Plate Spline
(TPS) transform to complete the final nonlinear registration.
The proposed algorithm can handle much larger transformations and
deformations than common image registration methods such as the finite
element method (FEM) or B-spline fitting, while still providing good
registration results. The flowchart of the hybrid image registration
algorithm is shown below.
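The RANSAC-based outlier rejection described above can be sketched in a few lines. The sketch below is a minimal illustration, not the authors' implementation: it assumes 2D point correspondences and fits an affine model for each hypothesis (the actual pipeline estimates a TPS transform on the final inliers), and all function names are our own.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping src points to dst points."""
    ones = np.ones((src.shape[0], 1))
    A = np.hstack([src, ones])                        # N x 3 design matrix
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)  # 3 x 2 parameter matrix
    return params

def apply_affine(params, pts):
    return np.hstack([pts, np.ones((pts.shape[0], 1))]) @ params

def ransac_affine(src, dst, n_iter=500, thresh=2.0, seed=0):
    """Reject outlier correspondences with RANSAC, then refit on inliers."""
    rng = np.random.default_rng(seed)
    best = np.zeros(src.shape[0], dtype=bool)
    for _ in range(n_iter):
        # minimal sample: 3 point pairs determine a 2D affine transform
        idx = rng.choice(src.shape[0], size=3, replace=False)
        cand = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(apply_affine(cand, src) - dst, axis=1)
        inliers = err < thresh                # consensus set for this hypothesis
        if inliers.sum() > best.sum():
            best = inliers
    # final refit on the largest consensus set
    return fit_affine(src[best], dst[best]), best
```

With 3-point samples, each hypothesis fits its sample exactly, so correspondences whose reprojection error exceeds `thresh` (in pixels) are treated as outliers of that hypothesis.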

Cloud-Based Medical Image Registration Using CometCloud

Application Scenario

An overview of the operation of the CometCloud-based medical image registration application scenario is presented below.

In this application scenario, there are multiple (possibly
distributed) job queues through which users insert image registration
requests into CometCloud. Each of these entry points represents a
research site in the collaboration, and maintains its own storage
where medical images are stored. Each site generates its own requests
with its own policies and QoS constraints. Note that a site can join the
collaboration and CometCloud at any time (provided it has the right
credentials) and can submit requests. The requests (tasks) generated by
the different sites are logged in the CometCloud virtual shared space
that spans master nodes at each of the sites. These tasks are then
consumed by workers, which may run on local computational nodes at the
site, a shared datacenter or on a public cloud infrastructure. These
workers can access the space using appropriate credentials, access
authorized tasks (i.e., image registration requests), and return results
back to the appropriate master indicated in the task itself.
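The task-space pattern just described can be illustrated with a minimal sketch. This is not the CometCloud API: its coordination space is a distributed tuple space with credential-based access, whereas the sketch below uses a plain in-process queue and threads, with hypothetical task tuples and site names, purely to show how tasks tagged with a master ID are consumed by generic workers and routed back to the right site.

```python
import queue
import threading

# Hypothetical task tuple: (task_id, master_id, image_ref). CometCloud's
# actual tuple-space API differs; this only illustrates the pattern.
space = queue.Queue()                   # stands in for the shared task space
results = {"siteA": [], "siteB": []}    # per-master result collections
lock = threading.Lock()

def register(image_ref):
    # placeholder for the real image registration computation
    return f"registered:{image_ref}"

def worker():
    while True:
        try:
            task_id, master_id, image_ref = space.get_nowait()
        except queue.Empty:
            return                      # no tasks left: worker retires
        result = register(image_ref)    # would fetch the image from the site's server
        with lock:                      # return the result to the master named in the task
            results[master_id].append((task_id, result))

# two research sites insert their registration requests into the shared space
for i in range(5):
    space.put((i, "siteA", f"imgA_{i}.dcm"))
    space.put((i, "siteB", f"imgB_{i}.dcm"))

workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
```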

Experimental Environment

The virtual cloud environment used for the experiments consisted of
two research sites located at Rutgers University and University of
Medicine and Dentistry of New Jersey, one public cloud, i.e., Amazon Web
Service (AWS) EC2, and one private datacenter at Rutgers, i.e., TW.
The two research sites hosted their own image servers and job queues,
and workers running on EC2 or TW accessed these image servers to get the
images described in the tasks assigned to them (see Figure below).

Each image server hosted 250 images, resulting in a total of 500 tasks.
Each image is two-dimensional, between 17 KB and 65 KB in size. On
EC2, we used standard small instances with a computing cost of
$0.10/hour, data transfer costs of $0.10/GB for inward transfers and
$0.17/GB for outward transfers.
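As a quick sanity check (our arithmetic, not stated in the text), even a worst-case estimate of the data-transfer charge for this workload is negligible next to the $0.10/hour instance cost:

```python
# Worst-case EC2 data-transfer cost for the 500-image workload
images = 500                        # total tasks/images
max_kb = 65                         # largest image size in the dataset
gb = images * max_kb / 1024 ** 2    # KB -> GB, assuming every image is maximal
inward_cost = gb * 0.10             # $0.10/GB for inward transfers
print(f"{gb:.3f} GB in, ${inward_cost:.4f}")  # -> 0.031 GB in, $0.0031
```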

Costs for the TW datacenter included hardware investment, software,
electricity, etc., and were estimated at $1.37/hour per rack. In the
experiments we set the maximum number of available nodes to 25 for TW
and 100 for EC2. Note that TW nodes outperform EC2 nodes, but are more
expensive. We used a budget-based policy for scheduling, where the
scheduling agent tries to complete tasks as soon as possible without
violating the budget. We set the maximum available budget in the
experiments to $3, enough to complete all tasks. The motivation for this
choice is as follows. If the available budget were sufficiently high, all
the available TW nodes would be allocated, and tasks would be assigned
until all the tasks were completed. If the budget were too small, the
scheduling agent would not be able to complete all the tasks within it.
Hence, we set the budget to a value in between.
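The budget-based policy can be illustrated with a greedy sketch. The actual scheduling agent also uses performance history and QoS constraints; the function below is our own simplification, with hypothetical parameters (costs in integer cents per node per interval, rates in tasks per node per interval), and only captures the basic tradeoff: prefer the fast-but-expensive TW nodes, then fill with cheaper EC2 nodes while the budget lasts.

```python
def plan_allocation(tasks, budget, tw_max, ec2_max,
                    tw_cost, ec2_cost, tw_rate, ec2_rate):
    """Greedy sketch of a budget-constrained allocation (not the actual
    CometCloud agent). Returns a (tw_nodes, ec2_nodes) mix for one
    scheduling interval."""
    tw = 0
    # take TW nodes while the budget allows and demand remains
    while tw < tw_max and (tw + 1) * tw_cost <= budget and tw * tw_rate < tasks:
        tw += 1
    left = budget - tw * tw_cost            # budget left for EC2 nodes
    remaining = max(0, tasks - tw * tw_rate)
    needed = -(-remaining // ec2_rate)      # ceiling division: nodes needed
    ec2 = min(ec2_max, left // ec2_cost, needed)
    return tw, ec2
```

For example, with 100 tasks, a 500-cent budget, and TW nodes that cost 10 cents and finish 2 tasks per interval versus EC2 nodes at 2 cents and 1 task, the sketch allocates all 25 TW nodes and covers the rest with EC2.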
Finally, the monitoring component of the scheduling agent evaluated the
performance every minute. The results from the experiments are shown
below.

Figure (a) shows the scheduled number of workers on TW and EC2, and
Figure (b) shows the average cost per task in each scheduling period for
TW and EC2.

Note that since the scheduling interval is 1 min, the X-axis
corresponds to both time (in minutes) and the scheduling iteration
number. Initially, the CometCloud scheduling agent does not know the
cost of completing a task, so it allocated 10 nodes each from TW and
EC2.

In the beginning, since the budget is sufficient, the scheduling
agent tries to allocate TW nodes even though they cost more than EC2
nodes. In the 2nd scheduling iteration, there are 460 tasks still
remaining, and the agent attempts to allocate 180 TW nodes and 280 EC2
nodes to finish all tasks as soon as possible within the available
budget. If TW and EC2 could provide the requested nodes, all the tasks
would be completed by the next iteration. However, since only 25 TW
nodes are available, the agent allocates all of them and estimates a
completion time of 7.2 iterations. It then decides on the number of EC2
workers to use based on the estimated rounds.
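These numbers are consistent with a simple back-of-the-envelope model, assuming each node consumes roughly one task per scheduling iteration (our assumption; the text does not state the per-node rate). The resulting EC2 worker count is our inference, shown only for illustration:

```python
remaining = 460      # tasks left at the 2nd scheduling iteration
tw_wanted = 180      # TW nodes the agent would like to allocate
ec2_wanted = 280     # EC2 nodes the agent would like to allocate
assert tw_wanted + ec2_wanted == remaining   # i.e., finish in one iteration

tw_available = 25    # actual TW capacity
# the 180 TW-bound tasks now take 180/25 iterations on 25 nodes
rounds = tw_wanted / tw_available            # 7.2 iterations, as quoted
# spreading the 280 EC2-bound tasks over those rounds gives an
# (illustrative) EC2 worker count
ec2_workers = round(ec2_wanted / rounds)     # ~39 workers
```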

In the case of EC2, it takes around 1 minute to launch a node (from
the start of the virtual machine to its ready state for consuming
tasks), and as a result, by the 4th iteration the cost per task for EC2
increases. At this point, the scheduling agent decides to decrease the
number of TW nodes, which are expensive, and instead to increase the
number of EC2 nodes using the available budget. By the 9th iteration, 22
tasks are still remaining. The scheduling agent now decides to release
78 EC2 nodes because they will not have jobs to execute. The reason the
remaining jobs had not completed by the 10th iteration (i.e., 10
minutes), even though there were 22 nodes still working, is that there
was an unexplained decrease in EC2 performance during our experiments. The
variations in the cost per task in Figure (b) are because the task
completions are not uniformly distributed across the time intervals.
Since the cost per interval is fixed (defined by AWS), the cost per task
varies depending on the number of tasks completed in a particular time
interval.

Figure (c) shows the used budget over time. It shows that all the tasks were completed within the budget, taking around 13 minutes.

This figure shows a comparison of execution time and budget used
with and without the CometCloud scheduling agent. In the case where only
EC2 nodes are used, when the number of EC2 nodes is decreased from 100 to
50 and then 25, the execution time increases and the budget used decreases,
as shown in (a) and (b). Comparing the same number of EC2 and TW nodes (25
EC2 and 25 TW), the execution time for 25 TW nodes is approximately half
that for 25 EC2 nodes; however, the cost for 25 TW nodes is
significantly higher than that for 25 EC2 nodes. When the CometCloud
autonomic scheduling agent is used, the execution time is close to that
obtained using 25 TW nodes, but the cost is much smaller and the tasks
complete within the budget. The reason the execution time in this
case is larger than in the 100-EC2-node case is as follows: the cost
peaks at time = 11 minutes, as seen in Figure (b), and this causes the
autonomic scheduler to reduce the number of EC2 nodes to approximately
20 (see Figure (a)), causing the execution time to increase.

An interesting observation from the plots is that if there is no limit
on the number of EC2 nodes used, then a better solution is to allocate
as many EC2 nodes as possible. However, if only a limited number of EC2
nodes is available and the job must be guaranteed to complete within a
limited budget, then the autonomic scheduling approach achieves an
acceptable tradeoff. Since different cloud services will have different
performance and cost profiles, the scheduling agent will have to use
historical data and more complex models to compute schedules as we
extend CometCloud to include other service providers.