11 RAM disk
Available at Venus (at the moment):
- only allowed from within an LSF job; it can only use a part of the memory assigned to the LSF job
- setup: module load ramdisk, then make-ramdisk <size of the ramdisk in GB>
- the path to the ramdisk is fixed to /ramdisks/<jobid> and is accessible from the whole machine
- for fast copying: parallel-copy.sh <source directory or file> <target directory>
- the ramdisk is deleted automatically at the end of the job
More info at https://doc.zih.tu-dresden.de
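A minimal sketch of how these commands might be combined inside an LSF job script; the 20 GB size, the application name and the staging paths are only placeholders, and it is assumed here that the LSF variable $LSB_JOBID matches the <jobid> in the ramdisk path:

    module load ramdisk
    make-ramdisk 20                                   # create a 20 GB ramdisk under /ramdisks/<jobid>
    RAMDISK=/ramdisks/$LSB_JOBID                      # assumption: <jobid> equals $LSB_JOBID
    parallel-copy.sh /scratch/<login>/input $RAMDISK  # stage input data onto the ramdisk
    ./my_app $RAMDISK/input $RAMDISK/out              # placeholder application doing heavy I/O
    parallel-copy.sh $RAMDISK/out /scratch/<login>/   # copy results back before the job ends and the ramdisk is deleted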

12 Local disk
Recommended at Taurus (Atlas):
- SSD: best option for lots of small I/O operations
- limited size (about 50 GB)
- ephemeral: data will be deleted automatically after 7 days
- each node has its own local disk
- Attention: multiple processes on the same node share their local disk
- the path to the local disk is /tmp
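A hedged sketch of the usual pattern with the node-local disk (application and scratch paths are placeholders): do the small-file I/O in /tmp and copy the results to a shared file system before the job finishes, because /tmp is not visible from other nodes:

    WORKDIR=/tmp/$USER/run_$$                         # per-run directory on the node-local SSD (placeholder name)
    mkdir -p $WORKDIR
    cp /scratch/<login>/input.dat $WORKDIR/           # stage input onto the local disk
    ./my_app $WORKDIR/input.dat > $WORKDIR/out.log    # many small I/O operations stay local
    cp $WORKDIR/out.log /scratch/<login>/results/     # save results to a shared file system
    rm -rf $WORKDIR                                   # clean up the shared local disk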

13 Scratch file system
Fastest parallel file system at each HPC machine:
- large parallel file system for high bandwidth
- data may be deleted after 100 days
- paths to the scratch file system are /scratch/<login> and /scratch/<project>, the latter with access rights for the whole HPC project
- all nodes of the machine share this file system
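A small illustrative example (directory and program names are placeholders): since all nodes see the scratch file system, a per-job working directory there is a convenient place for input and output of parallel runs:

    RUN=/scratch/<login>/run_$LSB_JOBID     # per-job directory on the parallel file system
    mkdir -p $RUN
    cp /home/<login>/config.in $RUN/        # keep only the small config in /home, large data on /scratch
    cd $RUN
    ./my_solver config.in                   # every node of the job can read and write here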

14 Permanent file systems
Common file systems for all of ZIH's HPC machines:
- very slow and small, but with multiple backups
- deleted files are accessible via the logical .snapshot directory, which contains weekly, daily, and hourly snapshots; copy your file to where you need it
- paths to permanent storage are /home/<login> (20 GB!) and /projects/<projectname>, with different access rights (cf. Terms of Use)
- all HPC systems of ZIH share these file systems
- Do not use the permanent file system for production! Frequent changes slow down or even disable the backup.
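A hedged example of recovering an accidentally deleted file from a snapshot; the snapshot name hourly.0 is only an assumed label, so list the directory first to see what is actually available:

    ls /home/<login>/.snapshot/                                     # list the available snapshots
    ls /home/<login>/.snapshot/hourly.0/                            # hypothetical name of the newest hourly snapshot
    cp /home/<login>/.snapshot/hourly.0/thesis.tex /home/<login>/   # copy the deleted file back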

15 Archive
Common tape-based file system:
- really slow and large
- expected storage time of data: about 3 years
- access under the user's control

16 Data transfer
Special data transfer nodes run in batch mode to comfortably transfer large data between different file systems:
- commands for data transfer are available on all HPC systems with the prefix dt: dtcp, dtls, dtmv, dtrm, dtrsync, dttar
- the transfer job is created, queued, and processed automatically; the user gets an e-mail after completion of the job
- additional commands: dtinfo, dtqueue
- very simple usage, e.g. (see also the sketch below):
    dttar -czf /archiv/jurenz/taurus_results.tgz \
        /taurus_scratch/jurenz/results
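As a further hedged illustration (paths are placeholders patterned on the example above), a recursive copy from scratch to the project file system and a check of the transfer queue might look like this:

    dtcp -r /scratch/<login>/results /projects/<projectname>/results   # submit a copy job to the data transfer nodes
    dtqueue                                                            # show your queued and running transfer jobs
    dtinfo                                                             # general information about the transfer service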

24 Overview
Who can use the HPC systems at ZIH? ZIH is the state computing center for HPC, available for universities and research institutes in Saxony, free of charge.
Life cycle of a project (* outside TU Dresden):
1 Project admin (leader) fills in an online application form
2 Each user fills out an HPC login form; stamp, fax (*)
3 An account with a limited number of CPUh is generated to evaluate the computational needs
4 Prepare the full project application
5 The scientific board (Wissenschaftlicher Beirat) decides, and resources are granted
6 Data removal at the end of the project - where to?

27 Data handling
We assume that only project-related files are in the HPC file systems. (The support team has root access.)
Access to data after closing a login:
- in /projects: user and project administrator
- in /home: only the user
For seamless work over multiple years: store project data only in /projects.
Data can be erased by ZIH (e.g. automatically):
- after 7 days in /tmp
- after 100 days in /scratch
- 15 months after the closing of the project or login in /projects and /home

36 Full proposal
The test period should be used to determine the further needs and to document them in an extended proposal for the scientific board (Wissenschaftlicher Beirat). The extended proposal should include:
- presentation of the problem and description of the project content (with references to publications)
- preliminary work already achieved, pre-studies with results, experiences
- target objectives and intended insights
- physical and mathematical methods or solutions
- computational aspects: algorithms, software; for parallel codes: parallel efficiency
- needed resources: CPU time, memory per core, storage in terms of capacity and frequency
A few figures might be helpful to understand the description.

43 Channels of communication
ZIH users:
- next training course: Introduction to HPC at ZIH, November 6, 2014
- HPC wiki: https://doc.zih.tu-dresden.de - link to the operation status, knowledge base for all our systems, howtos, tutorials, examples...
- mass notifications per signed e-mail from the sender [ZIH] HPC Support to your e-mail address, for: problems with the HPC systems, new features interesting for all HPC users, training courses
- phone - in case of urgent requests or emergencies (e.g. a problem that stops the file system).

50-51 Kinds of support
HPC software questions:
- help with compiling new software
- installation of new applications, libraries, tools
- update to a newer / different version
Restrictions of this support:
- only if several user groups need it
- no support for one particular software package
- allow for some time

52-53 Kinds of support
Performance issues:
- joint analysis of a piece of software
- discussion of performance problems
- detailed inspection of self-developed code
- in the long run: help users to help themselves
Storage capacity issues:
- joint analysis of storage capacity needs
- joint development of a storage strategy
