3.
High Performance Computing in Our Everydays
What Is New in HPC? Cloud HPC
Cloud computing: think of it as a utility
  E.g., you get to use 10 small computer instances for $0.82 an hour
Your computer instances do not necessarily correspond to actual computers
  Virtualization
  Demo: ReactOS
Latest contestant in cloud computing: HPC
  Not ordinary computer instances

4.
What Is New in HPC? Massive Parallelism
Figure: Floating-Point Operations per Second for the CPU and GPU

6.
What Is New in HPC? Massive Parallelism
Parallel versus distributed computing
Distributed nodes do not share memory:
  They are connected through a network;
  Calculations may run in parallel;
  Other nodes do not see what one node has computed;
  Nodes may fail.

7.
What Is New in HPC? Why You Should Care
Digital libraries and HPC?
  No need for upfront investment;
  Go beyond full-text search;
  Machine learning;
  Pattern matching;
  Social media and graph mining;
  You can define a new field
Freedom

9.
Supporting Frameworks: MapReduce
Published in 2004 by Google researchers
Since then it has become widespread in data-intensive processing
Core idea: keep things simple; you can do only two things:
  Map: Send out chunks of data and then do something on them
  Reduce: Collect chunks of data and do something on them while collecting
Intermediate data structure: key-value pairs
The framework should also take care of the mundane tasks, such as handling failing nodes, network latency, etc.
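
To make the two phases concrete, here is a minimal single-process sketch in Python. It is an illustration only, not the Google implementation or any real framework: the names map_reduce, count_words and sum_counts are hypothetical, everything runs on one machine, and the fault tolerance and network handling mentioned above are left out entirely.

```python
from collections import defaultdict


def map_reduce(chunks, mapper, reducer):
    """Toy, single-process stand-in for a MapReduce run (assumption: no
    distribution, no fault tolerance)."""
    # Map phase: every chunk is processed independently and emits
    # (key, value) pairs, which are grouped by key as they arrive.
    intermediate = defaultdict(list)
    for chunk in chunks:
        for key, value in mapper(chunk):
            intermediate[key].append(value)
    # Reduce phase: each key is reduced together with all of its values.
    return {key: reducer(key, values) for key, values in intermediate.items()}


def count_words(chunk):
    """Hypothetical mapper: emit (word, 1) for every word in a text chunk."""
    for word in chunk.lower().split():
        yield word, 1


def sum_counts(key, values):
    """Hypothetical reducer: add up the counts collected for one word."""
    return sum(values)


word_counts = map_reduce(["the cat sat", "the dog sat down"], count_words, sum_counts)
```

The only data passed between the two phases is the set of key-value pairs, which is what lets a real framework spread the map and reduce calls over many nodes.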

10.
Supporting Frameworks: A MapReduce Inverted Indexer
The task is: formulate your problem in MapReduce terms
Map: gets a chunk of text. Emits:
  Key: term
  Value: document id and corresponding frequency
Reduce: Merges by key
There might be a different number of map and reduce tasks
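
The indexer described above could be sketched as follows. This is a hedged, single-process illustration that assumes each chunk of text is one whole document and that tokens are split on whitespace; map_document, reduce_postings and build_inverted_index are hypothetical names, not part of any framework.

```python
from collections import Counter, defaultdict


def map_document(doc_id, text):
    """Map: emit (term, (doc_id, frequency)) for each distinct term.
    Assumption: one chunk equals one document, tokens split on whitespace."""
    for term, freq in Counter(text.lower().split()).items():
        yield term, (doc_id, freq)


def reduce_postings(term, postings):
    """Reduce: merge all postings of one term into a list sorted by doc id."""
    return sorted(postings)


def build_inverted_index(documents):
    """Run the map step over every document, group the emitted pairs by key,
    then reduce each key. A real framework would do the grouping itself."""
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for term, posting in map_document(doc_id, text):
            grouped[term].append(posting)
    return {term: reduce_postings(term, postings)
            for term, postings in grouped.items()}


documents = {1: "to be or not to be", 2: "to do is to be"}
index = build_inverted_index(documents)
```

Here index["to"] holds one (document id, frequency) pair per document containing the term, which is the posting list a search engine would look up.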

11.
Supporting Frameworks: Another MapReduce Example
Sometimes it is worth bypassing the reduce phase
  Then we do not need to emit key-value pairs at all
Distributed GPU random projection
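
A map-only job of this kind might look like the sketch below. It is a plain CPU illustration in pure Python, not the GPU implementation the slide refers to, and it assumes a Gaussian random projection whose matrix every mapper regenerates from a shared seed; all function names are hypothetical.

```python
import random


def make_projection_matrix(input_dim, output_dim, seed):
    """Build an output_dim x input_dim Gaussian matrix from a fixed seed, so
    every mapper can recreate the same matrix without communication."""
    rng = random.Random(seed)
    scale = 1.0 / output_dim ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(input_dim)]
            for _ in range(output_dim)]


def project_chunk(chunk, matrix):
    """Map task: multiply every vector in the chunk by the projection matrix."""
    return [[sum(w * x for w, x in zip(row, vector)) for row in matrix]
            for vector in chunk]


def map_only_projection(chunks, input_dim, output_dim, seed=0):
    """Run one map task per chunk. There is no reduce phase and no key-value
    pairs: each mapper's output is already a final result."""
    results = []
    for chunk in chunks:
        matrix = make_projection_matrix(input_dim, output_dim, seed)
        results.append(project_chunk(chunk, matrix))
    return results


chunks = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[1.0, 1.0, 1.0, 1.0]],
]
projected = map_only_projection(chunks, input_dim=4, output_dim=2, seed=42)
```

Because each output vector depends only on its own input vector and the shared matrix, nothing has to be collected or merged afterwards, which is why the reduce phase can be skipped.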

12.
Supporting Frameworks: Exploiting GPU Resources
Low-level frameworks: CUDA and OpenCL
  They certainly do not make GPUs much friendlier
Higher-level libraries: BLAS, cuSPARSE
  As long as you know maths...

19.
Open Issues: Obstacles to Adoption
Persistence and high-reliability MapReduce
  Not just a technological issue
Service-level agreement
  Particularly problematic
  Another EU FP7 project working on it: SLA@SOI
  Niche for alternative cloud providers
Difficulty of integration

20.
Conclusions: Acknowledgment
Work has been funded by Sustaining Heritage Access through Multivalent ArchiviNg (SHAMAN), an EU FP7 large integrated project.
http://shaman-ip.eu/shaman/
Additional funding has been received from Amazon Web Services.
http://aws.amazon.com/