Latest revision as of 01:20, 30 September 2014

The Haskell implementors' workshop is a forum for those involved in
implementing Haskell systems, infrastructure, libraries and tools,
generally for people involved in implementing Haskell technology, to
share their work and discuss future directions and collaborations with
others.

In 2014, the Haskell Implementors Workshop will be co-located with ICFP 2014 in Gothenburg.

The workshop does not have proceedings. Talks and/or demos are proposed by
submitting an abstract, and selected by a small program committee.
The workshop will be informal and interactive, with a flexible timetable
and plenty of room for ad-hoc discussion, demos, and impromptu short talks.

Traditionally, HIW is an open forum for people writing compilers, tools, or
libraries, people with cool ideas for directions in which we should take
the platform, proposals for new features to be implemented, and
half-baked crazy ideas. For more details see the Call for Contributions in full.

HIW 2014 accepted 13 talk proposals. To make this possible, some of the talks had to be accepted as "short talks" of 15 minutes; the other talks have 20 minutes. All talks are followed by 5 minutes for questions, discussion, and changeover.

The workshop will also offer the opportunity for lightning talks, organised on
the workshop day.
Speakers sign up for lightning talks during the workshop; slots will be scheduled first-come, first-served, for ~5 minutes each in the afternoon sessions.

Abstracts of Accepted Talks

CLaSH: Compiling circuit descriptions

Christiaan Baaij, U. Twente

The CLaSH compiler sees Haskell programs as digital circuit descriptions, in
particular, it sees them as structural circuit descriptions. We take a
structural view so that the implicit parallelism in the description is mapped to
(sub-)circuits actually running in parallel. As a result, annotation-free,
"normal" Haskell code gets mapped to a "maximally" parallel circuit. Perhaps
unsurprisingly, there is a catch: you have to specify sequential circuits
manually. The output of the CLaSH compiler is low-level, synthesizable VHDL
code.
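As an illustration of the structural view (plain Haskell, not the CLaSH library API): a 1-bit full adder written as an ordinary pure function, whose independent subexpressions correspond to sub-circuits computing in parallel.

```haskell
-- A 1-bit full adder as an ordinary pure function. Under a structural
-- view of the program, 's' and 'cout' are independent subexpressions,
-- so they map to combinational sub-circuits running in parallel.
fullAdder :: Bool -> Bool -> Bool -> (Bool, Bool)   -- (sum, carry out)
fullAdder a b cin = (s, cout)
  where
    s    = (a /= b) /= cin                 -- sum: an XOR chain
    cout = (a && b) || (cin && (a /= b))   -- carry out
```

For example, `fullAdder True True False` yields `(False, True)`: the sum bit is 0 and the carry is 1.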

In this talk I will explain the general flow of the CLaSH compiler, the design
decisions behind its internal rewrite system, and its system for overloadable
primitive operations. I will also compare the structural synthesis approach
taken by CLaSH to the embedded DSL approach (such as Lava) and the behavioural
synthesis approach (such as Verity), covering the advantages and disadvantages
for both the language user and the language implementer. I will conclude with
some of the limitations of the current CLaSH
compiler and directions for future work.

The Past, Present and Future of the Programmer-friendly Helium Compiler

Helium was developed as a programmer-friendly compiler for Haskell, in
particular to improve upon the type error diagnosis as reported by Haskell
compilers in a class room setting. Helium has been dormant for some time,
although it is still in use in various places around the world. Over the last
few years, we have, off and on, worked on making Helium available through
Hackage, and we want to use the HIW to reintroduce it to the world, including a
whole new generation of Haskell programmers and researchers who may only have
remotely heard of it. In the talk, I will discuss the history of Helium,
describing its unique features and the limitations of what it supports, advertise
its availability on Hackage (a task that will be taken care of over the summer
holidays), and, importantly, present our future plans for this compiler, in
particular how it relates to that other compiler, UHC, which the
Software Technology group at Utrecht develops.

Of particular interest is the recently started DOMSTED project, funded by the
Dutch government, on providing domain-specific type error diagnosis for
embedded domain-specific languages within the full Haskell language, supporting
also many of the type system extensions that everyday Haskell programmers like
to employ. Helium was the first compiler, back in 2003, to provide such support,
but this never went beyond (approximately) Haskell 98. This may not be such a
limitation when teaching students how to program in Haskell, but it certainly is
a limitation if we want the professional Haskell programmer to enjoy the same
benefits in the presence of type families, GADTs, multi-parameter type classes,
existentials and higher-ranked types.

GUMSMP: a Multilevel Parallel Haskell Implementation

GUMSMP is a new parallel Haskell runtime implementation designed for
hierarchical platforms, such as clusters of multi-cores. It combines
distributed-memory parallelism, using a virtual shared heap over a cluster as
developed for the GHC-GUM system, with low-overhead shared-memory parallelism
on the multi-cores as developed for the GHC-SMP system. The combination of both
systems results in a high-performance execution engine that can flexibly use the
computational power of modern clusters of multi-cores. In particular, the GUMSMP
design features effective latency hiding and mostly passive load distribution
to exploit high-latency clusters. We present improvements to the load balancing
system that account for the presence of multi-cores and use pre-fetching to
increase overall utilisation. Specifically, we studied the impact of several
load balancing policies, including work pre-fetching and hierarchical work
distribution, resulting in significant improvements of performance. We present
novel performance results for this implementation on a cluster of 8-core
machines, demonstrating good scalability of a set of 9 benchmarks, including a
newly added blackscholes implementation, on up to 96 cores with speedups ranging
from 20 to 87, and performance gains of up to 20

The GUMSMP system also proves to be beneficial for executing parallel Haskell
code on state-of-the-art NUMA shared-memory machines, which suffer from varying
latencies when accessing memory in different NUMA regions. Our recent performance
results show that our hybrid system, GUMSMP, consistently outperforms the shared
memory GHC-SMP implementation on seven benchmarks. The best results are achieved
using one shared heap within a single NUMA region, but using distributed heaps
across the regions. The combination of both shared and distributed heaps in
GUMSMP gives us the flexibility to choose these heap settings.

Managing a Haskell heap in JavaScript

Luite Stegeman (unaffiliated)

The GHC runtime system uses a garbage collector to clean up unused memory and
to perform several additional tasks, such as resetting unreferenced CAFs, running
finalizers for weak references, removing indirections for updated thunks and
selector thunks, and checking for "indefinitely blocked" situations. GHCJS relies on the
JavaScript garbage collector for the basic cleanup task. Unfortunately, the
JavaScript runtime provides no information that can help with the other tasks. Even
the WeakMap proposed for ECMAScript 6 does not allow us to observe when a
reference is dead. Currently, GHCJS uses a fairly simple heap scanning technique
to fill the gaps. In this talk we explore variations on the basic technique and
changes in the GHCJS runtime system to improve performance and reduce the amount
of work.
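The weak-reference bookkeeping mentioned above can be observed from ordinary Haskell; a minimal sketch using base's System.Mem.Weak (the GHC semantics that GHCJS must emulate by heap scanning):

```haskell
import Data.IORef (newIORef, readIORef)
import System.Mem (performGC)
import System.Mem.Weak (mkWeakPtr, deRefWeak)

-- A weak pointer does not keep its key alive: once the key becomes
-- unreachable, the RTS clears the pointer (and would run a finalizer,
-- if one was supplied). GHCJS has to emulate this bookkeeping by
-- scanning the heap, because the JavaScript GC reports nothing back.
demo :: IO Int
demo = do
  ref   <- newIORef (42 :: Int)
  w     <- mkWeakPtr ref Nothing        -- no finalizer in this sketch
  performGC
  alive <- deRefWeak w                  -- ref is still reachable below
  putStrLn (maybe "dead" (const "alive") alive)
  readIORef ref
```

Because `ref` is read after `deRefWeak`, it is still reachable at collection time, so the weak pointer is not cleared here; dropping that final read is what would allow the RTS to clear it.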

GHC status update

Simon Peyton-Jones

Contributing to GHC

The core component of the Haskell ecosystem, the Glasgow Haskell Compiler (GHC), is not only open source, it is also a proper open source project relying on the work of volunteers. Despite its age and its apparent complexity, new contributors are not only needed but also genuinely useful to the project.

Recently, the project has seen some changes that make it even easier for you to start hacking on it, more convenient to get your changes reviewed, and harder to break anything: our repositories have a less custom setup; a tool called Phabricator is used for efficient and meme-ridden code review; and various quality assurance services detect breakage and performance regressions early. This extends our existing tools (Trac, the mailing lists) and practices (notes, an extensive test suite) that keep working on GHC manageable.

In this talk, we give an overview of old and new practices and tools, aimed especially at interested newcomers, to lower the barrier to entry for contributing to GHC.

ToolCabal: a Redesign of Cabal

I propose ToolCabal, a redesign of Cabal. It improves upon Cabal in the
following ways: (1) it improves the handling of preprocessors; (2) it supports
a native plugin system, meaning no recompilation is needed to use plugins; and
(3) it allows multiple backends of a compiler to be built simultaneously.

The core of the redesign is using the Haskell class system to deal uniformly
with compilers and preprocessors, and using the Shake build system to track
dependencies of all source files, be they sources for the preprocessors or for
the compiler. Shake is particularly useful here as it allows dynamic
dependencies, meaning it can correctly track dependencies of the generated
files.
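The class-based idea can be sketched as follows, with entirely hypothetical names (not ToolCabal's actual API): one class abstracts over compilers and preprocessors, so the build system can treat both uniformly.

```haskell
import System.FilePath (replaceExtension)

-- Hypothetical sketch: a single class covering both compilers and
-- preprocessors, so a build system can plan outputs uniformly.
-- The names 'Tool', 'outputFor', 'Ghc' and 'Alex' are illustrative only.
class Tool t where
  toolName  :: t -> String
  outputFor :: t -> FilePath -> FilePath   -- where a tool puts its result

data Ghc  = Ghc    -- a compiler
data Alex = Alex   -- a lexer-generating preprocessor

instance Tool Ghc where
  toolName _      = "ghc"
  outputFor _ src = replaceExtension src "o"

instance Tool Alex where
  toolName _      = "alex"
  outputFor _ src = replaceExtension src "hs"
```

A build planner can then chain tools generically: the output of `Alex` on `Lexer.x` is `Lexer.hs`, which in turn is an input for `Ghc`, without either side being a special case.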

It steps away from the custom builds and build-hook structure of Cabal by using
a full plugin system that can be used at both the preprocessor and the
compiler level. This is completely native, and the build system deals uniformly
with plugin and non-plugin preprocessors and compilers.

Another major improvement of ToolCabal over Cabal is the support for multiple
targets of a compiler, e.g. the bytecode and profiling backends of GHC, and the
bytecode and JavaScript backends of UHC.

Due to the generic type classes, the tool is completely independent of both the
source and the target. It is no longer Haskell-specific and can be used with any
other source language and compiler.

Implementing Backpack in GHC

I am spending this summer at GHC HQ working on an implementation of Backpack, a
new module system for Haskell. Why should you care? Along the road to Backpack,
one may: (1) solve Cabal hell, by supporting multiple instances of a package
linked against different versions and allowing us to move away from version
number ranges as the primary mechanism of specifying package dependencies, (2)
support module-level pluggability, so you never have to use a silly type class
to make your library support both String and Text, (3) improve the state of the
art for large complicated libraries such as the ghc-api and their users, making
it far easier for users to say, "Here is the subset of the library I’m depending
on." This work is very much in progress, and I’m hoping to report as much as
I’ve accomplished by the time ICFP rolls around. This would also be a good
opportunity to talk about Haskell type classes and their differences from
modular type classes from the ML tradition, and what the future of Haskell type
classes might be.

Dependent Haskell

Richard A. Eisenberg, U.Pennsylvania

This talk will be a preview of Dependent Haskell, with its implementation
currently under way in a fork of GHC. Dependent Haskell is a dependently typed
variant of Haskell. It is intended to be a conservative (that is, backward
compatible) extension to GHC’s Haskell, preserving type erasure wherever
possible. Dependent Haskell does not require all functions to terminate. Thus,
programs in Dependent Haskell are not proofs; instead, properties expressed in
types hold only when the program compiles and runs in finite time (partial
correctness). This seems reasonable in practice, as any divergence of a running
program is caused by a loop written by the programmer – compilation will not
introduce new loops to overcome programmer mistakes. I hope to merge this into
the main branch of GHC when it is mature enough. One of my goals in this talk is
to spur useful design discussion about how best to serve the needs of users.
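As a point of reference for "properties expressed in types", here is the kind of encoding today's GHC already supports via GADTs and promoted data kinds, and that Dependent Haskell aims to make more direct (a standard length-indexed vector, not code from the talk):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Type-level natural numbers, promoted to the kind level by DataKinds.
data Nat = Z | S Nat

-- A vector whose length is tracked in its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Total by construction: the type rules out empty vectors, so no
-- VNil case can even be written, and no runtime check is needed.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x
```

Calling `vhead VNil` is a compile-time type error rather than a runtime crash; Dependent Haskell aims to let such properties be stated with ordinary terms and functions instead of this mirrored type-level encoding.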

Partial Type Signatures

In Haskell, programmers face a binary choice between omitting the type signature
(and relying on type inference) and providing the type explicitly and entirely;
there are no intermediate options. Partial type signatures bridge the gap between the
two extremes.

In a partial type signature, annotated types can be mixed with inferred types. A
type signature is written as before, but can now contain wildcards, written as
underscores. Also, placing a wildcard in the constraints part of a type
signature allows the type checker to infer an arbitrary number of
constraints. The type checker verifies that the inferred type matches the
form of the partial type signature, while at the same time inferring the types
of the wildcards. E.g., a partial type signature _ => a -> _ could lead to the
type Show a => a -> String. Partial type signatures give the programmer
complete control over the amount of type information he wants to annotate.
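The example from the abstract can be written down directly; a minimal sketch of the feature as it later shipped in GHC as the PartialTypeSignatures extension (the OPTIONS_GHC flag merely silences the informational warnings about inferred types):

```haskell
{-# LANGUAGE PartialTypeSignatures #-}
{-# OPTIONS_GHC -Wno-partial-type-signatures #-}

-- The wildcard before '=>' lets the checker collect the needed
-- constraints (here, Show a); the result wildcard is inferred
-- as String. The full inferred type is Show a => a -> String.
describe :: _ => a -> _
describe x = "value: " ++ show x
```

So `describe (42 :: Int)` yields `"value: 42"`, with the annotated parts of the signature checked and the wildcarded parts inferred.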

Partial type signatures can be useful in the development phase; the programmer
can annotate the parts of the type signature he knows and replace the unknown
parts with underscores. They can also be used to hide complex parts of a type
signature or to emphasise the relevant annotated parts.

We are currently working on integrating partial type signatures in GHC as a
variant of TypedHoles. Whereas TypedHoles allow holes in programs or
expressions, PartialTypeSignatures allow holes in types. Depending on the
extensions that are enabled, partial type signatures are treated either as
temporary placeholders that must be replaced with full signatures for the
code to compile successfully, or as a permanent way to leave out uninteresting
parts of types.

In this presentation, I will give an overview of partial type signatures and
some insights into the implementation.

The Implementation of the HdpH DSLs: Details and Difficulties

HdpH and HdpH-RS are a pair of Haskell DSLs for irregular task-parallel computation on
distributed-memory platforms. These DSLs were specifically developed to address the issues
of non-uniform communication topologies and fault tolerance arising on large-scale distributed
platforms. They were also specifically designed to rely only on standard GHC: no runtime
system modifications, no third-party libraries (apart from standard Hackage packages).
The design and semantics of HdpH and HdpH-RS will be presented at the Haskell
Symposium; this talk will focus on details of the implementation of the DSLs and on
barriers that limit further improvements.

Implementation details include:

Serialisable Closures, and the differences to CloudHaskell. HdpH Closures are fully polymorphic and composable like applicative functors. Overheads on composition are low, making "computing with Closures" a natural paradigm that is used extensively in the implementation of polymorphic skeletons.

Networking backends. The HdpH DSLs implement distributed work stealing protocols, placing a number of requirements on the networking backend. We compare experiences with two networking backends: the TCP-based network-transport package, and a hand-crafted MPI-backend.

Barriers include:

Distributed work stealing relies on low-latency messaging. HdpH currently achieves this by dedicating one core per node exclusively to message handling. IO threads with priorities would be a better solution.

Parallel DSLs often compute on large data structures in normal form. A compact in-memory representation of normal forms (similar to unboxed vectors) would be beneficial for cache performance and might reduce garbage collection and serialisation overheads.

A GHC language extension for static values

Mathieu Boespflug and Facundo Domínguez, Tweag I/O

The Haskell package distributed-process implements a framework for programming
distributed systems. One of the cornerstones of the approach is the ability to
send closures for evaluation on remote peers. So far this has been achieved by
pervasive use of Template Haskell, which, although effective, is still
demanding on the programmer, who must manage tables of functions that may be
used by peers.

This pitfall has been noticed since the conception of the framework. The
suggested remedy was to extend the Haskell language with a notion of static
values, used as references to Haskell values, which are valid in all processes
of a given distributed application. As far as we are aware, such a language
extension has remained unimplemented and has not been defined with precision.

In this talk we will discuss an implementation of such an extension for the GHC
compiler, its potential impact on the distributed-process framework, and its
limitations. A working implementation is publicly available and is being
prepared for submission for inclusion in GHC.
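This work eventually shipped in GHC 7.10 as the StaticPointers extension; a minimal sketch of the feature in that released form:

```haskell
{-# LANGUAGE StaticPointers #-}

import GHC.StaticPtr (StaticPtr, deRefStaticPtr)

-- 'static e' is only legal for closed expressions, which is what makes
-- the resulting StaticPtr a stable reference that every process of a
-- distributed application can resolve locally.
double :: StaticPtr (Int -> Int)
double = static (\x -> x * 2)
```

Locally, `deRefStaticPtr double 21` evaluates to `42`; in a framework like distributed-process, it is the static reference, not the function itself, that travels over the wire.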