Friday, 20 February 2015

Project Description

Cluster computing with cloud services is a new approach to high performance computing (HPC). Before cloud resources were available, HPC resources were limited, and it was difficult to run an arbitrary application on an arbitrary cluster because of dependency conflicts and configuration problems on statically configured cluster nodes. Cloud services solve these problems and give us an opportunity to reinvent HPC.

Today, cluster solutions on the cloud mainly focus on big data frameworks such as Hadoop and MapReduce. Using the same approach, we developed an HPC cluster job management tool for specific scientific and engineering problems. The system was developed mainly on Amazon EC2, but the architecture is designed so that it can be used with other cloud APIs such as Google Compute Engine and OpenStack.

The main parts of the system are:

Cluster manager server running on cloud

Parallel solver implementations (open source or custom)

Desktop client (User)

Mobile client (User)

Web client

The CLUSTER MANAGER is developed mainly to drive MPI, OpenMP, and hybrid jobs on the cloud through the cloud APIs. The server is multithreaded so that it can drive many jobs at the same time, which solves the queue waiting problem of classical cluster systems. The server works like the job scheduler of a classical cluster system; the difference is that it launches a cluster on demand according to the resources requested by the user. For example, suppose a user needs to solve a CFD problem with OpenFOAM on 256 cores. The user prepares the input files and sends them with a job submission request using one of the clients. The manager checks the status of the jobs and, if it detects a pending job, takes action to run it according to the solver properties, the problem geometry, the mesh, and so on. All steps are carried out over SSH connections between the manager, the master node, and the worker nodes.

All of these procedures can be followed from the user clients, and the solvers are tracked step by step.
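The dispatch behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the actual manager code: the `Job` fields, the status strings, the `CORES_PER_NODE` value, and the `launch_cluster`, `run_solver`, and `terminate_cluster` callables are all hypothetical stand-ins for the cloud API and SSH steps, which are not shown in this post.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Assumed instance size for illustration only; not taken from the project.
CORES_PER_NODE = 16


@dataclass
class Job:
    """Hypothetical job record as submitted by a client."""
    job_id: str
    solver: str
    cores: int
    status: str = "PENDING"


def nodes_needed(cores, cores_per_node=CORES_PER_NODE):
    """Number of nodes required to supply the requested core count."""
    return math.ceil(cores / cores_per_node)


def run_job(job, launch_cluster, run_solver, terminate_cluster):
    """Drive one job: launch a cluster, run the solver, tear the cluster down."""
    job.status = "LAUNCHING"
    cluster = launch_cluster(nodes_needed(job.cores))
    try:
        job.status = "RUNNING"
        run_solver(cluster, job)
        job.status = "FINISHED"
    except Exception:
        job.status = "FAILED"
    finally:
        # The cluster exists only for this job, so it is always released.
        terminate_cluster(cluster)
    return job


def dispatch_pending(jobs, launch_cluster, run_solver, terminate_cluster,
                     max_workers=8):
    """Run pending jobs concurrently, up to max_workers at a time."""
    pending = [j for j in jobs if j.status == "PENDING"]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_job, j, launch_cluster, run_solver, terminate_cluster)
            for j in pending
        ]
        return [f.result() for f in futures]
```

With the assumed 16 cores per node, the 256-core OpenFOAM example would request 16 nodes. Because each job gets its own cluster, jobs do not compete for a fixed pool of nodes; the only limit in this sketch is the thread pool size.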

PARALLEL SOLVERS developed with MPI or OpenMP libraries can easily be adapted to run together with the manager application. Abstract solver implementations provide a plugin mechanism, so any other parallel solver can be integrated into the system without any change to the manager. The project was tested with two solvers:

OpenFOAM

FDS (Fire Dynamics Simulator)

but there are many other open source parallel solvers that could be integrated, such as:

SU2

CalculiX

LAMMPS

NAMD

Overflow...
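The plugin mechanism for solvers could look something like the sketch below. It is only an illustration under stated assumptions: the `Solver` interface, the `SOLVERS` registry, the `register` decorator, and `build_commands` are hypothetical names, not the project's actual classes. The OpenFOAM commands use `simpleFoam` purely as an example application; a real case would use whichever OpenFOAM solver the problem needs.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping a solver name to its plugin class.
SOLVERS = {}


def register(name):
    """Class decorator that adds a solver plugin to the registry."""
    def wrap(cls):
        SOLVERS[name] = cls
        return cls
    return wrap


class Solver(ABC):
    """Interface the manager relies on; method names are illustrative."""

    @abstractmethod
    def prepare_commands(self, case_dir, cores):
        """Return shell commands to run before the solve."""

    @abstractmethod
    def run_command(self, case_dir, cores):
        """Return the shell command that starts the parallel solve."""


@register("openfoam")
class OpenFoamSolver(Solver):
    def prepare_commands(self, case_dir, cores):
        # Assumes the case's decomposeParDict already asks for `cores` subdomains.
        return [f"decomposePar -case {case_dir}"]

    def run_command(self, case_dir, cores):
        # simpleFoam is an assumed example application, not from the source.
        return f"mpirun -np {cores} simpleFoam -parallel -case {case_dir}"


def build_commands(solver_name, case_dir, cores):
    """Look up a plugin by name and return the full command list for a job."""
    solver = SOLVERS[solver_name]()
    return solver.prepare_commands(case_dir, cores) + [
        solver.run_command(case_dir, cores)
    ]
```

In this arrangement the manager only calls `build_commands`, so adding another solver means writing one new class with the `@register` decorator and leaving the manager untouched.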

USER CLIENTS provide easy access to cluster resources without a single line of shell script to run a job, which makes the job submission procedure easier for engineers and mathematicians. The difficulties of the Unix platform and the complexity of job submission procedures make clusters harder to use for people unfamiliar with Unix systems, and a large share of engineers and scientists fall into this category.

As a result of the details explained above, the on-demand cluster computing solution provides: