Clusters at CÉCI

The aim of the Consortium is to provide researchers with
access to powerful computing equipment (clusters). Clusters
are installed and managed locally at the different sites of the
universities taking part in the Consortium, but they are accessible by
all researchers from the member universities. A single
login/passphrase is used to access all clusters through SSH.

All of them run Linux and use Slurm as the job manager. Basic parallel
computing libraries (OpenMP, MPI, etc.) are installed, as well as
optimized computing subroutines (e.g. BLAS, LAPACK). Common interpreters
such as R, Octave and Python are also installed. See each cluster's FAQ
for more details.
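
To give a concrete idea of how that software stack is used, below is a minimal
MPI "hello world" sketch in C. It is only a sketch: the mpicc compiler wrapper
and the srun launcher mentioned in the comments are the usual names, but the
exact modules, wrappers and launch commands differ from cluster to cluster
(see each cluster's FAQ).

    /* Minimal MPI "hello world" sketch (C).
     * Typical compilation:  mpicc hello_mpi.c -o hello_mpi
     * Typical launch under Slurm:  srun ./hello_mpi
     * (wrapper and launcher names are assumptions; they vary per cluster) */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size, name_len;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                  /* start the MPI runtime     */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's rank       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes */
        MPI_Get_processor_name(name, &name_len); /* node this rank runs on    */

        printf("Rank %d of %d running on %s\n", rank, size, name);

        MPI_Finalize();                          /* shut MPI down cleanly     */
        return 0;
    }

Each rank prints its rank, the total number of ranks and the node it runs on,
which is a convenient first check that a parallel job is indeed spread over
the requested CPUs.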

* In this context, a CPU is to be understood as a core or a hardware thread; the CPU count is #nodes × CPUs/node.
** Filesystem refers to the global scratch space (other than /home); RAID refers to a filesystem local to each node.
*** SMP means all processes/threads run on the same node; MPI means the job spans multiple nodes.
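
As a worked example of that CPU count: NIC4, described below, has 128 nodes with two 8-core processors each, i.e. 128 × 16 = 2048 CPUs.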

The CÉCI clusters have been designed to accommodate the wide diversity of workloads and needs of the researchers from the five universities.

At one end of the spectrum is the sequential workload (Northern part of the figure). That type of workload needs very fast CPUs and often a long maximum job time, which in turn requires limiting the number of jobs a user can run simultaneously to ensure fair sharing of the cluster.

At the other end of the spectrum is the massively parallel workload (Southern part of the figure). For such workloads, individual core performance is less crucial, as long as many cores are available. A job is allowed to use a very large number of CPUs at a time, but only for a limited period of time, again to ensure fair sharing of the cluster. The maximum walltime must be even shorter for researchers engaged in development activities (to keep the waiting time in the queue to a minimum), while those mainly concerned with production will prefer slightly larger maximum times (to avoid the unnecessary overhead of the checkpointing tools needed when the maximum time is short). Parallel workloads naturally also require a fast, low-latency network.

Some of the 'parallel' clusters are made of fat nodes (South-West), meaning that the number of cores per node is large (e.g. 48 or even 64), while others rely on a large number of smaller, thin nodes (North-East). Fat nodes are more suitable for shared-memory work, such as programs using OpenMP or pthreads, and they can host jobs with very large shared-memory requirements (up to half a terabyte of RAM). By contrast, thin nodes require message-passing libraries such as MPI or PVM. They offer a better network bandwidth to core count ratio, which makes them more suitable for jobs issuing lots of I/O operations, e.g. jobs that put a heavy load on the centralized scratch filesystem.
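
As an illustration of the shared-memory model that fat nodes favour, here is a
minimal OpenMP sketch in C. It assumes a GCC-style compiler (hence the
-fopenmp flag), which may or may not be the recommended compiler on a given
cluster. All the threads it creates must share the memory of a single node,
which is why this approach scales up to the core count of one (fat) node,
whereas spanning several nodes requires message passing (MPI).

    /* Minimal OpenMP sketch (C): all threads share the same memory, so this
     * only scales within a single node (e.g. up to 48 or 64 cores on a fat node).
     * Typical compilation (assumption, GCC-style):  gcc -fopenmp omp_sum.c -o omp_sum */
    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        const int n = 100000000;
        double sum = 0.0;

        /* the loop iterations are split among the threads; the reduction
         * clause combines the per-thread partial sums without a race condition */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += 1.0 / (i + 1.0);

        printf("Up to %d threads, sum = %f\n", omp_get_max_threads(), sum);
        return 0;
    }

The number of threads is typically controlled through the OMP_NUM_THREADS
environment variable (or the CPUs allocated by Slurm), and on a fat node it
can be as high as the 48 or 64 cores mentioned above.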

The clusters have been installed gradually since early 2011, first at UCL, with HMEM being a proof of concept. At that time, the whole account infrastructure was designed and deployed so that every researcher from any of the universities could create an account and log in to HMEM. Then, LEMAITRE2 was set up as the first cluster entirely funded by the F.N.R.S. for the CÉCI. DRAGON1, HERCULES, VEGA and NIC4 have followed, in that order, as shown in the timeline below.

Thanks to a private, dedicated, 10 Gbps network connecting all CÉCI sites, all the CÉCI clusters share a common storage space in addition to their local spaces. That CÉCI shared storage is based on two main storage systems hosted in Liège and Louvain-la-Neuve. Those storage systems are synchronously replicated, meaning that any file written to one of them is automatically written to the other one. Through the dedicated 10 Gbps network, they are connected to five smaller storage systems that serve as buffers/caches. Those caches are located at each site and are tightly connected to the cluster compute nodes.

NIC4

Hosted at the University of Liège (SEGI facility), it features 128 compute
nodes, each with two 8-core Intel E5-2650 processors at 2.0 GHz and 64 GB of
RAM (4 GB/core), interconnected with a QDR InfiniBand network, and with
exclusive access to a fast 144 TB FHGFS parallel filesystem.

Suitable for:

Massively parallel jobs (MPI, several dozen cores) with heavy communication
and/or a lot of parallel disk I/O; maximum walltime of 2 days.

VEGA

Hosted at the University of Brussels, it features 44 fat compute nodes, each
with 64 cores (four 16-core AMD Bulldozer 6272 processors at 2.1 GHz) and 256
GB of RAM, interconnected with a QDR InfiniBand network, and with 70 TB of
high-performance GPFS storage.

HERCULES

Hosted at the University of Namur, this system currently consists of
approximately 900 cores spread across 65 compute nodes. It mainly comprises 32
Intel Sandy Bridge nodes, each with two 8-core E5-2660 processors at 2.2 GHz
and 64 GB or 128 GB (8 nodes) of RAM, and 32 Intel Westmere nodes, each with
two 6-core X5650 processors at 2.66 GHz and 36 GB, 72 GB (5 nodes) or 24 GB
(5 nodes) of RAM. All the nodes are interconnected by a Gigabit Ethernet
network and have access to three NFS file systems with a total capacity of 98
TB.

DRAGON1

Hosted at the University of Mons, this cluster is made of 26 compute nodes,
each with two 8-core Intel Sandy Bridge E5-2670 processors at 2.6 GHz, 128 GB
of RAM and 1.1 TB of local scratch disk space. The compute nodes are
interconnected with a Gigabit Ethernet network (10 Gigabit for the 36 TB NFS
file server). Two additional nodes have two high-end NVIDIA Tesla 2175 GPUs
(448 CUDA cores, 6 GB GDDR5, 515 Gflops double precision).

HMEM

Hosted at the Université catholique de Louvain, it mainly comprises 17 fat
nodes with 48 cores each (four 12-core AMD Opteron 6174 processors at 2.2
GHz). 2 nodes have 512 GB of RAM, 7 nodes have 256 GB and 7 nodes have 128 GB.
All the nodes are interconnected with a fast InfiniBand QDR network and have a
1.7 TB fast RAID setup for scratch disk space. All the local disks are
furthermore gathered in a global 31 TB Fraunhofer filesystem (FHGFS).

ZENOBE

Hosted at, and operated by, Cenaero, it features a total of 13,536 cores (Haswell and Ivy Bridge) with up to 64 GB of RAM per node, interconnected with a mixed QDR/FDR InfiniBand network, and having access to a fast 350 TB GPFS parallel filesystem.

Suitable for:

Massively parallel jobs (MPI, several hundred cores) with heavy communication
and/or a lot of parallel disk I/O; maximum walltime of 1 day.