Abstract

The computational grid, characterized by distributed load sharing, has evolved into a platform for large-scale problem solving. A grid is a collection of heterogeneous resources offering services of varying nature, in which jobs may be submitted to any of the participating nodes. Scheduling these jobs in such a complex and dynamic environment poses many challenges. Reliability analysis of the grid is of paramount importance because a grid involves a large number of resources, any of which may fail at any time, making it unreliable. Such failures waste both computational power and the money spent on scarce grid resources. It is therefore desirable to schedule a job in an environment that maximizes the reliability of its execution. This work presents a reliability-based scheduling model for jobs on a computational grid. The model considers the failure rates of both the software and hardware constituents of the grid: the application demanding execution, the nodes executing the job, and the network links supporting data exchange between the nodes. Job allocation under the proposed scheme can be trusted because it schedules the job based on an a priori reliability computation.

Introduction

The scientific community always thirsts for powerful computational tools and methods. This has driven enormous developments in the computing world: faster processors, larger and faster memory, and efficient network devices for fast and reliable data transmission, along with advances in software technology. The thirst for computational energy led to newer tools, which in turn fed back to improve scientific research. This self-feeding cycle resulted in the aggregation of heterogeneous resources known as the Grid, enabling collaborative engineering (Foster & Kesselman, 1998; Foster, 2002; Tarricone & Esposito, 2005; Taylor & Harrison, 2009).

A grid can be viewed as a number of clusters, each comprising computing resources of nearly the same nature, although the nature of the nodes may differ across clusters. Participants inside a cluster agree to cooperate in problem solving, thus forming a virtual organization (VO). At any moment there may be many virtual organizations inside the grid, each with a dynamic constitution. Jobs may enter the grid through any of the participating nodes. To harness the advantages of the grid, these jobs should be scheduled over the grid so as to exploit their parallel and concurrent nature. Scheduling is the problem of mapping jobs onto grid resources, and it is said to be efficient if the mapping takes the job requirements into account, e.g., the nature of the job, its inherent parallelism, and proper load balancing. Since scheduling is an NP-hard problem, many scheduling models have been proposed in the literature, each optimizing one parameter or another.

Whenever a job enters the grid for execution, it may fail for reasons ranging from application failure to resource failure (node failure, etc.). Failures can result from many causes: specification mistakes (incorrect algorithms, architectures, etc.), hardware failures (hot crash, network partition, etc.), software failures (numerical exception, failed application, etc.), implementation mistakes, component defects, external disturbances (radiation, electromagnetic waves, interference, etc.), performance failures (application not completing within a specified time, etc.), or other failures (machine rebooted by the owner, excessive CPU load, decreased priority assigned by the local resource to the current task, etc.) (Huda, Schmidt, & Peake, 2005). A fault-tolerant system is one that continues to perform even in the presence of hardware and software failures. A fault is a physical defect, imperfection, or flaw that occurs within some hardware or software component, whereas an error is the manifestation of a fault and is a deviation from accuracy or correctness. Specifically, faults cause errors, and errors cause failures. Depending on its type, a grid may be susceptible to any or all of these types of faults.

Reliability is the ability of a system to perform and maintain its functions in routine circumstances as well as in hostile or unexpected ones. The more fault tolerant a system is, the more reliable it is. Reliability adds quality to the system and is an often-desired parameter for schedulers, owing to the large size of the grid and its composition of scarce resources. Failures can result in huge losses, both of money and of computational energy. Thus, a grid scheduler is always expected to ensure a reliable environment for job execution. When a grid is designed, the hardware components come with a failure rate specified by the manufacturer as part of the hardware specifications. Software components also have a failure rate, specified during software design using the software engineering paradigm. These failure rates reflect the reliability of the system, which is desired to be high. For the scheduling decision, reliability should be computed beforehand, taking into account the contributions of both the hardware and the software, so that the probability of successful job execution increases. In this work, we propose a Reliability-based Scheduling Model (RSM), which allocates a modular job to the cluster of the grid that matches the job's requirements and offers the most reliable environment for its execution.
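
The idea of computing reliability a priori from component failure rates can be illustrated with a minimal sketch. The RSM formulation itself is not given in this excerpt, so the code below rests on assumptions that are not taken from the source: each component fails independently at a constant rate (so its reliability over an execution time t is exp(-rate * t)), the job succeeds only if the application, every node, and every link survive, and all names (`Cluster`, `job_reliability`, `most_reliable_cluster`) and numbers are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    """Hypothetical description of a cluster as seen by the scheduler."""
    name: str
    node_failure_rates: List[float]  # failures per hour, one entry per node used
    link_failure_rates: List[float]  # failures per hour, one entry per link used


def component_reliability(failure_rate: float, duration: float) -> float:
    """Assumed exponential model: probability of surviving `duration` hours."""
    return math.exp(-failure_rate * duration)


def job_reliability(cluster: Cluster, app_failure_rate: float, duration: float) -> float:
    """Assumed series system: the application, all nodes, and all links must survive."""
    rates = [app_failure_rate, *cluster.node_failure_rates, *cluster.link_failure_rates]
    return math.prod(component_reliability(rate, duration) for rate in rates)


def most_reliable_cluster(clusters: List[Cluster], app_failure_rate: float,
                          duration: float) -> Cluster:
    """Pick the cluster that maximizes the job's estimated reliability."""
    return max(clusters, key=lambda c: job_reliability(c, app_failure_rate, duration))


if __name__ == "__main__":
    # Illustrative failure rates only; not measurements from the paper.
    clusters = [
        Cluster("A", node_failure_rates=[0.001, 0.002], link_failure_rates=[0.0005]),
        Cluster("B", node_failure_rates=[0.01], link_failure_rates=[0.01]),
    ]
    best = most_reliable_cluster(clusters, app_failure_rate=0.0001, duration=10.0)
    print(best.name, job_reliability(best, 0.0001, 10.0))
```

This sketch ranks clusters by reliability alone. A scheduler along the lines described above would also need to check that a cluster matches the job's requirements before comparing reliabilities.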