An Opportunistic Storage System for UnaGrid

UnaGrid is an opportunistic virtual grid infrastructure that takes advantage of the idle processing capabilities of conventional desktop machines in computer rooms through the use of Customizable Processing Virtual Clusters (CPVCs). These capabilities are used in the development of e-Science projects. This paper presents the design, implementation and assessment of a virtual storage system that allows UnaGrid to take advantage, simultaneously, of the storage and processing capabilities available in tens or hundreds of desktop machines. The first tests have shown that this system attains large storage capacity at low cost and better performance than a dedicated NFS-NAS solution.

4.
Project: Campus Grid Uniandes - UnaGrid
Take advantage of the idle processing capabilities available in conventional computer labs.
Support the development of e-Science projects.

5.
UnaGrid
[Figure: a Linux Processing Virtual Machine (X cores) running on a physical machine of a computer room. a. When there is an end user using the physical machine. b. When there is not an end user using the physical machine.]
A Processing Virtual Machine (PVM) is executed on each computer of a lab, running in the background as a low-priority process.
The PVM is executed in a transparent manner while the users carry out their daily activities.

8.
UnaGrid – Current Storage System
A dedicated NFS-NAS storage solution has been used, in which all CPVCs store their data.

9.
Problem and motivation
UnaGrid can benefit from the disk space available in computer labs.
A strategy to implement a Virtual Distributed Storage System:
Take advantage of the idle storage capabilities.
A transparent system for users and applications.
Provide new storage capabilities to the UnaGrid infrastructure.

10.
Possible Solutions
A new file system or opportunistic system.
Use an opportunistic system.
Use a distributed or parallel file system.
The UnaGrid requirements require another approach.

11.
UnaGrid Requirements
The desktop machines of the computer labs have Windows, Linux or Mac as their base operating system.
The CPVCs operate with the Linux operating system.
The virtual distributed storage system must be executed from Windows, Linux or Mac desktops and used from Linux CPVCs.
Solution: virtualization technologies.

13.
Two Virtual Machines on each Computer
[Figure: a Linux Storage Virtual Machine (X gigabytes) and a Linux Processing Virtual Machine (X cores) running on a Windows physical machine of a computer lab. a. When there is not a user using the physical machine. b. When there is a user using the physical machine.]
Intrusion level on the end user.
Priorities and resources assigned to VMs.
Resource competition between the VMs.

14.
Solution Strategy
Definition of a virtual storage cluster per computer lab.
Concurrent execution with the CPVCs.
Take advantage of the idle processing and storage capabilities of each computer.
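As a rough illustration of what one virtual storage cluster per lab could add up to, the sketch below sums the disk space and cores contributed by the desktops of a single lab. The `Desktop` type, the machine names, and the figures (30 desktops, 100 GB and 2 cores each) are hypothetical values chosen for the example, not numbers from the UnaGrid deployment.

```python
from dataclasses import dataclass


@dataclass
class Desktop:
    name: str
    donated_gb: int  # disk space given to the storage VM (hypothetical value)
    cores: int       # cores given to the processing VM (hypothetical value)


def cluster_totals(desktops):
    """Return (total storage in GB, total cores) contributed by one lab."""
    total_gb = sum(d.donated_gb for d in desktops)
    total_cores = sum(d.cores for d in desktops)
    return total_gb, total_cores


# Hypothetical lab: 30 desktops, each donating 100 GB and 2 cores.
lab = [Desktop(f"pc{i:02d}", donated_gb=100, cores=2) for i in range(30)]
total_gb, total_cores = cluster_totals(lab)
print(f"storage: {total_gb} GB, cores: {total_cores}")
```

Under these assumed figures a single lab already reaches the terabyte range, which is the scale the strategy targets when several labs are combined.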

15.
Solution Strategy
Any opportunistic system or distributed file system may be executed on a Customizable Storage Virtual Cluster (CSVC).
This strategy must be validated.
Current opportunistic solutions do not meet the UnaGrid requirements.
Parallel and distributed file systems can be used to validate the proposed strategy.

16.
Methodology
Intrusion level on the end user.
Resource competition between virtual machines.
Performance evaluation of the proposed strategy.

17.
Intrusion level on the end user
[Figure: a Linux Storage Virtual Machine (X gigabytes) and a Linux Processing Virtual Machine (X cores) running on a Windows physical machine of a computer lab. b. When there is a user using the physical machine.]
Several tests were conducted to determine the best resource assignment to the two virtual machines, executed in a non-intrusive manner:
VMs executed in the background.
Resources assigned to the VMs.
Tasks executed by the end user.
Tasks executed by the two virtual machines.

25.
Performance Evaluation – Read from several clients
[Figure: average aggregate read rate (MB/sec, 0 to 120) versus number of concurrent clients (1 to 30) for NFS, PVFS, Gfarm, Lustre and GPFS.]
As the number of clients increases, the average bandwidth per client decreases.
As the number of file system clients (PVMs) increases, there is a higher probability that the two VMs executed on each physical machine operate as a client (PVM) and a server (SVM) of the file systems.
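The first observation can be illustrated with simple arithmetic: if the aggregate read rate is shared evenly, each client's share shrinks as clients are added. The sketch below assumes an even split and a constant aggregate rate. Both are simplifications, and the 90 MB/s figure is a hypothetical value, not a measurement from these tests.

```python
def per_client_rate(aggregate_mb_s: float, clients: int) -> float:
    """Average read rate seen by one client, assuming an even split."""
    if clients < 1:
        raise ValueError("clients must be at least 1")
    return aggregate_mb_s / clients


# Hypothetical aggregate rate, NOT a measurement from the tests above.
AGGREGATE_MB_S = 90.0

for n in (1, 3, 6, 15, 30):
    rate = per_client_rate(AGGREGATE_MB_S, n)
    print(f"{n:2d} clients -> {rate:6.2f} MB/s each")
```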

29.
Performance Evaluation Analysis
When the CPVCs execute intensive processing tasks, the performance of the file systems (CSVCs) is affected by less than 4%.
With the use of a CSVC it is possible to achieve read bandwidths of 4.5 Gbps and write bandwidths of 2.2 Gbps.
Several terabytes may be grouped through the proposed strategy.
With a CSVC, bandwidths higher than 1 Gbps are attained without the need to lay down more cable.
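To put the reported bandwidths in the same units as the read-rate plot, the sketch below converts Gbps to MB/s and counts how many 1 Gbps links would be needed to carry the same traffic through a single server. It assumes decimal units (1 Gbps = 1000 Mbps, 8 bits per byte) and a 1 Gbps link as the baseline. The function names are illustrative only.

```python
import math


def gbps_to_mb_per_s(gbps: float) -> float:
    """Convert gigabits per second to megabytes per second (decimal units)."""
    return gbps * 1000 / 8


def links_needed(gbps: float, link_gbps: float = 1.0) -> int:
    """Minimum number of links of link_gbps needed to carry gbps."""
    return math.ceil(gbps / link_gbps)


READ_GBPS = 4.5   # read bandwidth reported for a CSVC
WRITE_GBPS = 2.2  # write bandwidth reported for a CSVC

for label, rate in (("read", READ_GBPS), ("write", WRITE_GBPS)):
    print(f"{label}: {rate} Gbps = {gbps_to_mb_per_s(rate):.1f} MB/s, "
          f"at least {links_needed(rate)} x 1 Gbps links on a single server")
```

Under these assumptions, 4.5 Gbps is 562.5 MB/s and 2.2 Gbps is 275 MB/s. The aggregate exceeds a single 1 Gbps link because the traffic is spread over the network links of many desktops.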

30.
Conclusions
The strategy of using CPVCs and CSVCs in computer labs, concurrently and transparently, makes it possible to take advantage of unused processing and storage capabilities.
Hundreds of processing cores and several terabytes may be grouped through the proposed strategy for the development of e-Science projects.
The strategy allows the tools, middleware, applications, and configurations of the CPVCs and the CSVCs to be customized, guaranteeing the usability of the UnaGrid infrastructure.

31.
Future Work
Assessment of the proposed strategy in a production environment, and of its scalability.
Assessment of the performance of applications that use CSVCs.
The use of redundancy policies and mechanisms in the CSVCs.
The use of strategies for data placement.
Performance evaluation with other opportunistic systems and file systems.