4.7. Filenames and separate compilation

This section describes what files GHC expects to find, what
files it creates, where these files are stored, and what options
affect this behaviour.

Note that this section is written with
hierarchical modules in mind (see Section 7.3.5, “Hierarchical Modules”); hierarchical modules are an
extension to Haskell 98 which extends the lexical syntax of
module names to include a dot ‘.’. Non-hierarchical
modules are thus a special case in which none of the module names
contain dots.

Pathname conventions vary from system to system. In
particular, the directory separator is
‘/’ on Unix systems and
‘\’ on Windows systems. In the
sections that follow, we shall consistently use
‘/’ as the directory separator;
substitute the appropriate character for your
system.

4.7.1. Haskell source files

Each Haskell source module should be placed in a file on
its own.

Usually, the file should be named after the module name,
replacing dots in the module name by directory separators. For
example, on a Unix system, the module A.B.C
should be placed in the file A/B/C.hs,
relative to some base directory. If the module is not going to
be imported by another module (Main, for
example), then you are free to use any filename for it.

GHC assumes that source files are
ASCII or
UTF-8 only; other
encodings are
not recognised. However, invalid UTF-8 sequences will be
ignored in comments, so it is possible to use other encodings
such as
Latin-1, as
long as the non-comment source code is ASCII only.

4.7.2. Output files

When asked to compile a source file, GHC normally
generates two files: an object file, and
an interface file.

The object file, which normally ends in a
.o suffix, contains the compiled code for the
module.

The interface file,
which normally ends in a .hi suffix, contains
the information that GHC needs in order to compile further
modules that depend on this module. It contains things like the
types of exported functions, definitions of data types, and so
on. It is stored in a binary format, so don't try to read one;
use the --show-iface option instead (see Section 4.7.7, “Other options related to interface files”).

You should think of the object file and the interface file as a
pair, since the interface file is in a sense a compiler-readable
description of the contents of the object file. If the
interface file and object file get out of sync for any reason,
then the compiler may end up making assumptions about the object
file that aren't true; trouble will almost certainly follow.
For this reason, we recommend keeping object files and interface
files in the same place (GHC does this by default, but it is
possible to override the defaults as we'll explain
shortly).

Every module has a module name
defined in its source code (module A.B.C where
...).

The name of the object file generated by GHC is derived
according to the following rules, where
osuf is the object-file suffix (this
can be changed with the -osuf option).

If there is no -odir option (the
default), then the object filename is derived from the
source filename (ignoring the module name) by replacing the
suffix with osuf.

If
-odir dir
has been specified, then the object filename is
dir/mod.osuf,
where mod is the module name with
dots replaced by slashes. GHC will silently create the necessary directory
structure underneath dir, if it does not
already exist.

The name of the interface file is derived using the same
rules, except that the suffix is
hisuf (.hi by
default) instead of osuf, and the
relevant options are -hidir and
-hisuf instead of -odir and
-osuf respectively.

For example, if GHC compiles the module
A.B.C in the file
src/A/B/C.hs, with no
-odir or -hidir flags, the
interface file will be put in src/A/B/C.hi
and the object file in src/A/B/C.o.
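The naming rules above can be sketched as a small pure function. This is only a simplified model of the prose, not GHC's own implementation; the names moduleToPath, dropSuffix and outputFile are hypothetical, and '/' is assumed as the directory separator.

```haskell
-- A simplified model of the naming rules above; not GHC's implementation.

-- | Replace dots in a module name with '/', the separator used in this section.
moduleToPath :: String -> FilePath
moduleToPath = map (\c -> if c == '.' then '/' else c)

-- | Drop the final suffix of a file name: "src/A/B/C.hs" becomes "src/A/B/C".
dropSuffix :: FilePath -> FilePath
dropSuffix path =
  case break (== '.') (reverse path) of
    (revSuffix, '.' : revBase)
      | '/' `notElem` revSuffix -> reverse revBase
    _ -> path

-- | Output file for a module. The first argument models -odir or -hidir
-- (Nothing when the option is absent); the second is osuf or hisuf.
outputFile :: Maybe FilePath -> String -> String -> FilePath -> FilePath
outputFile Nothing suffix _moduleName sourceFile =
  dropSuffix sourceFile ++ "." ++ suffix
outputFile (Just dir) suffix moduleName _sourceFile =
  dir ++ "/" ++ moduleToPath moduleName ++ "." ++ suffix

main :: IO ()
main = do
  putStrLn (outputFile Nothing "o" "A.B.C" "src/A/B/C.hs")        -- src/A/B/C.o
  putStrLn (outputFile Nothing "hi" "A.B.C" "src/A/B/C.hs")       -- src/A/B/C.hi
  putStrLn (outputFile (Just "build") "o" "A.B.C" "src/A/B/C.hs") -- build/A/B/C.o
```

Note that without -odir the module name plays no part in the result, which is why the source file name should normally match the module name.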

For any module that is imported, GHC requires that the
name of the module in the import statement exactly matches the
name of the module in the interface file (or source file) found
using the strategy specified in Section 4.7.3, “The search path”.
This means that for most modules, the source file name should
match the module name.

However, note that it is reasonable to have a module
Main in a file named
foo.hs, but this only works because GHC
never needs to search for the interface for module
Main (because it is never imported). It is
therefore possible to have several Main
modules in separate source files in the same directory, and GHC
will not get confused.

In batch compilation mode, the name of the object file can
also be overridden using the -o option, and the
name of the interface file can be specified directly using the
-ohi option.

4.7.3. The search path

In your program, you import a module
Foo by saying import Foo.
In --make mode or GHCi, GHC will look for a
source file for Foo and arrange to compile it
first. Without --make, GHC will look for the
interface file for Foo, which should have
been created by an earlier compilation of
Foo. GHC uses the same strategy in each of
these cases for finding the appropriate file.

This strategy is as follows: GHC keeps a list of
directories called the search path. For
each of these directories, it tries appending
basename.extension
to the directory, and checks whether the file exists. The value
of basename is the module name with
dots replaced by the directory separator ('/' or '\', depending
on the system), and extension is a
source extension (hs, lhs)
if we are in --make mode or GHCi, or
hisuf otherwise.

For example, suppose the search path contains directories
d1, d2, and
d3, and we are in --make
mode looking for the source file for a module
A.B.C. GHC will look in
d1/A/B/C.hs, d1/A/B/C.lhs,
d2/A/B/C.hs, and so on.
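As a rough illustration of this lookup order, the following sketch enumerates the candidate file names for a module. The function name candidatePaths is hypothetical, '/' is assumed as the separator, and the sketch only lists names; it does not check whether any file exists.

```haskell
-- | Candidate file names for a module, in the order described above:
-- every extension is tried in one directory before moving to the next.
candidatePaths :: [FilePath] -> [String] -> String -> [FilePath]
candidatePaths searchPath extensions moduleName =
  [ dir ++ "/" ++ basename ++ "." ++ ext
  | dir <- searchPath
  , ext <- extensions
  ]
  where
    -- The module name with dots replaced by the directory separator.
    basename = map (\c -> if c == '.' then '/' else c) moduleName

main :: IO ()
main = mapM_ putStrLn (candidatePaths ["d1", "d2", "d3"] ["hs", "lhs"] "A.B.C")
```

Outside --make mode and GHCi, the extension list would contain only hisuf.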

The search path by default contains a single directory:
“.” (i.e. the current directory). The following
options can be used to add to or change the contents of the
search path:

-idirs

This flag appends a colon-separated
list of dirs to the search path.

-i

This flag resets the search path back to nothing.
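The effect of a sequence of -i flags can be modelled as a left-to-right fold over the default search path. This is a sketch under the assumption that the list contains only -i flags (it does not try to distinguish other flags that happen to begin with "-i"); the names splitOnColon, applyIFlag and searchPathFor are hypothetical.

```haskell
import Data.List (stripPrefix)

-- | Split a colon-separated list of directories.
splitOnColon :: String -> [String]
splitOnColon s =
  case break (== ':') s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitOnColon rest

-- | Apply one flag: a bare "-i" empties the path, "-idirs" appends to it.
applyIFlag :: [FilePath] -> String -> [FilePath]
applyIFlag path flag =
  case stripPrefix "-i" flag of
    Just ""   -> []
    Just dirs -> path ++ splitOnColon dirs
    Nothing   -> path

-- | Search path after processing the flags, starting from the default ".".
searchPathFor :: [String] -> [FilePath]
searchPathFor = foldl applyIFlag ["."]

main :: IO ()
main = do
  print (searchPathFor ["-isrc:lib"])   -- [".","src","lib"]
  print (searchPathFor ["-i", "-isrc"]) -- ["src"]
```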

This isn't the whole story: GHC also looks for modules in
pre-compiled libraries, known as packages. See the section on
packages (Section 4.9, “Packages”) for details.

4.7.4. Redirecting the compilation output(s)

-o file

GHC's compiled output normally goes into a
.hc, .o, etc.,
file, depending on the last-run compilation phase. The
option -o file
re-directs the output of that last-run phase to
file.

Note: this “feature” can be
counterintuitive: ghc -C -o foo.o
foo.hs will put the intermediate C code in the
file foo.o, name
notwithstanding!

This option is most often used when creating an
executable file, to set the filename of the executable.
For example:

ghc -o prog --make Main

will compile the program starting with module
Main and put the executable in the
file prog.

Note: on Windows, if the result is an executable
file, the extension ".exe" is added
if the specified filename does not already have an
extension. Thus

ghc -o foo Main.hs

will compile and link the module
Main.hs, and put the resulting
executable in foo.exe (not
foo).

If you use ghc --make and you don't
use the -o option, the name GHC will choose
for the executable will be based on the name of the file
containing the module Main.
Note that with GHC the Main module doesn't
have to be put in file Main.hs.
Thus both

ghc --make Prog

and

ghc --make Prog.hs

will produce Prog (or
Prog.exe if you are on Windows).

-odir dir

Redirects object files to directory
dir. For example:

$ ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `uname -m`

The object files, Foo.o,
Bar.o, and
Bumble.o would be put into a
subdirectory named after the architecture of the executing
machine (x86,
mips, etc).

Note that the -odir option does
not affect where the interface files
are put; use the -hidir option for that.
In the above example, they would still be put in
parse/Foo.hi,
parse/Bar.hi, and
gurgle/Bumble.hi.

-ohi file

The interface output may be directed to another file
bar2/Wurble.iface with the option
-ohi bar2/Wurble.iface (not
recommended).

WARNING: if you redirect the interface file
somewhere that GHC can't find it, then the recompilation
checker may get confused (at the least, you won't get any
recompilation avoidance). We recommend using a
combination of -hidir and
-hisuf options instead, if
possible.

To avoid generating an interface at all, you could
use this option to redirect the interface into the bit
bucket: -ohi /dev/null, for
example.

-hidir dir

Redirects all generated interface files into
dir, instead of the
default.

-dumpdir dir

Redirects all dump files into
dir. Dump files are generated when
-ddump-to-file is used with other
-ddump-* flags.

-outputdir dir

The -outputdir option is shorthand for
the combination
of -odir, -hidir,
-stubdir and -dumpdir.

-osuf suffix, -hisuf suffix, -hcsuf suffix

The -osuf suffix option will change the
.o file suffix for object files to
whatever you specify. We use this when compiling
libraries, so that objects for the profiling versions of
the libraries don't clobber the normal ones.

4.7.5. Keeping Intermediate Files

-keep-llvm-file,
-keep-llvm-files

Keep intermediate .ll files when
doing .hs-to-.o
compilations via LLVM.
(Note: .ll files are not generated when using the
native code generator; you may need to use -fllvm to
force them to be produced.)

-keep-s-file,
-keep-s-files

Keep intermediate .s files.

-keep-tmp-files

Instructs the GHC driver not to delete any of its
temporary files, which it normally keeps in
/tmp (or possibly elsewhere; see Section 4.7.6, “Redirecting temporary files”). Running GHC with
-v will show you what temporary files
were generated along the way.

4.7.6. Redirecting temporary files

-tmpdir

If you have trouble because of running out of space
in /tmp (or wherever your
installation thinks temporary files should go), you may
use the -tmpdir
<dir> option to specify
an alternate directory. For example, -tmpdir
. says to put temporary files in the current
working directory.

Alternatively, use your TMPDIR
environment variable. Set it to the
name of the directory where temporary files should be put.
GCC and other programs will honour the
TMPDIR variable as well.

Even better idea: set the
DEFAULT_TMPDIR make variable when
building GHC, and never worry about
TMPDIR again (see the build
documentation).

4.7.7. Other options related to interface files

-ddump-hi

Dumps the new interface to standard output.

-ddump-hi-diffs

The compiler does not overwrite an existing
.hi interface file if the new one is
the same as the old one; this is friendly to
make. When an interface does change,
it is often enlightening to be informed. The
-ddump-hi-diffs option will make GHC
report the differences between the old and
new .hi files.

-ddump-minimal-imports

Dump to the file
M.imports
(where M is the name of the
module being compiled) a "minimal" set of import
declarations. The directory where the
.imports files are created can be
controlled via the -dumpdir
option.

You can safely replace all the import
declarations in
M.hs with
those found in its respective .imports
file. Why would you want to do that? Because the
"minimal" imports (a) import everything explicitly, by
name, and (b) import nothing that is not required. It can
be quite painful to maintain this property by hand, so
this flag is intended to reduce the labour.

4.7.8. The recompilation checker

-fforce-recomp

Turn off recompilation checking (which is on by
default). Recompilation checking normally stops
compilation early, leaving an existing
.o file in place, if it can be
determined that the module does not need to be
recompiled.

In the olden days, GHC compared the newly-generated
.hi file with the previous version; if they
were identical, it left the old one alone and didn't change its
modification date. In consequence, importers of a module with
an unchanged output .hi file were not
recompiled.

This doesn't work any more. Suppose module
C imports module B, and
B imports module A. So
changes to module A might require module
C to be recompiled, and hence when
A.hi changes we should check whether
C should be recompiled. However, the
dependencies of C will only list
B.hi, not A.hi, and some
changes to A (changing the definition of a
function that appears in an inlining of a function exported by
B, say) may conceivably not change
B.hi one jot. So now…

GHC calculates a fingerprint (in fact an MD5 hash) of each
interface file, and of each declaration within the interface
file. It also keeps in every interface file a list of the
fingerprints of everything it used when it last compiled the
file. If the source file's modification date is earlier than
the .o file's date (i.e. the source hasn't
changed since the file was last compiled), and the recompilation
checking is on, GHC will be clever. It compares the fingerprints
on the things it needs this time with the fingerprints
on the things it needed last time (gleaned from the
interface file of the module being compiled); if they are all
the same it stops compiling early in the process saying
“Compilation IS NOT required”. What a beautiful
sight!
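The decision just described can be summarised in a few lines. The sketch below models only the prose, not GHC's internal data structures: fingerprints are plain strings, timestamps are integers, and the names Decision, ModuleState and decide are hypothetical.

```haskell
type Fingerprint = String

data Decision = CompilationNotRequired | Recompile
  deriving (Eq, Show)

data ModuleState = ModuleState
  { checkingEnabled :: Bool                     -- False models -fforce-recomp
  , sourceModified  :: Integer                  -- modification time of the source file
  , objectModified  :: Integer                  -- modification time of the .o file
  , recordedUsages  :: [(String, Fingerprint)]  -- fingerprints used at the last compile
  }

-- | Skip compilation only if checking is on, the source is older than the
-- object file, and every fingerprint recorded last time is unchanged.
decide :: [(String, Fingerprint)] -> ModuleState -> Decision
decide current st
  | checkingEnabled st
  , sourceModified st < objectModified st
  , all unchanged (recordedUsages st) = CompilationNotRequired
  | otherwise = Recompile
  where
    unchanged (name, fp) = lookup name current == Just fp

main :: IO ()
main = do
  let st = ModuleState True 10 20 [("A.f", "fp1")]
  print (decide [("A.f", "fp1")] st)  -- CompilationNotRequired
  print (decide [("A.f", "fp2")] st)  -- Recompile
```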

4.7.9. How to compile mutually recursive modules

GHC supports the compilation of mutually recursive modules.
Suppose module A imports B, but B imports
A with a {-# SOURCE #-} pragma, which breaks the
circular dependency. Every loop in the module import graph must be broken by a {-# SOURCE #-} import;
or, equivalently, the module import graph must be acyclic if {-# SOURCE #-} imports are ignored.

For every module A.hs that is {-# SOURCE #-}-imported
in this way there must exist a source file A.hs-boot. This file contains an abbreviated
version of A.hs.
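As a minimal sketch of such a pair of modules and the accompanying boot file, consider the three files below. The names TA, TB, f and g are chosen for illustration only. The three parts belong in separate files, as the comments indicate, so the block is not a single compilable unit.

```haskell
-- File: A.hs
module A where
import B (TB(..))

newtype TA = MkTA Int

f :: TB -> TA
f (MkTB x) = MkTA x

-- File: B.hs
module B where
import {-# SOURCE #-} A (TA(..))   -- refers to A.hs-boot, breaking the cycle

data TB = MkTB !Int

g :: TA -> TB
g (MkTA x) = MkTB x

-- File: A.hs-boot
module A where
newtype TA = MkTA Int
```

With this layout the boot file would be compiled first, then B.hs, then A.hs; ghc --make works out that order by itself.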

The file A.hs-boot is a programmer-written source file.
It must live in the same directory as its parent source file A.hs.
Currently, if you use a literate source file A.lhs you must
also use a literate boot file, A.lhs-boot; and vice versa.

A hs-boot file is compiled by GHC, just like a hs file:

ghc -c A.hs-boot

When a hs-boot file A.hs-boot
is compiled, it is checked for scope and type errors.
When its parent module A.hs is compiled, the two are compared, and
an error is reported if the two are inconsistent.

Just as compiling A.hs produces an
interface file A.hi, and an object file
A.o, so compiling
A.hs-boot produces an interface file
A.hi-boot, and a pseudo-object file
A.o-boot:

The pseudo-object file A.o-boot is
empty (don't link it!), but it is very useful when using a
Makefile, to record when the A.hi-boot was
last brought up to date (see Section 4.7.10, “Using make”).

The hi-boot generated by compiling a
hs-boot file is in the same
machine-generated binary format as any other GHC-generated
interface file (e.g. B.hi). You can
display its contents with ghc
--show-iface. If you specify a directory for
interface files with the -hidir flag, then that
affects hi-boot files
too.

If hs-boot files are considered distinct from their parent source
files, and if a {-# SOURCE #-} import is considered to refer to the
hs-boot file, then the module import graph must have no cycles. The command
ghc -M will report an error if a cycle is found.

A module M that is
{-# SOURCE #-}-imported in a program will usually also be
ordinarily imported elsewhere. If not, ghc --make
automatically adds M to the set of modules it tries to
compile and link, to ensure that M's implementation is included in
the final program.

A hs-boot file need only contain the bare
minimum of information needed to get the bootstrapping process
started. For example, it doesn't need to contain declarations
for everything that module
A exports, only the things required by the
module(s) that import A recursively.

A hs-boot file is written in a subset of Haskell:

The module header (including the export list), and import statements, are exactly as in
Haskell, and so are the scoping rules.
Hence, to mention a non-Prelude type or class, you must import it.

There must be no value declarations, but there can be type signatures for
values. For example:

double :: Int -> Int

Fixity declarations are exactly as in Haskell.

Vanilla type synonym declarations are exactly as in Haskell.

Open type and data family declarations are exactly as in Haskell.

A closed type family may optionally omit its equations, as in the following example:

type family ClosedFam a where ..

The .. is meant literally -- you should write two dots in your file. Note that the where clause is still necessary to distinguish closed families from open ones. If you give any equations of a closed family, you must give all of them, in the same order as they appear in the accompanying Haskell file.

A data type declaration can either be given in full, exactly as in Haskell, or it
can be given abstractly, by omitting the '=' sign and everything that follows. For example:

data T a b

In a source program
this would declare T to have no constructors (a GHC extension: see Section 7.4.1, “Data types with no constructors”),
but in an hs-boot file it means "I don't know or care what the constructors are".
This is the most common form of data type declaration, because it's easy to get right.
You can also write out the constructors but, if you do so, you must write
them out precisely as in the real definition.

If you do not write out the constructors, you may need to give a kind
annotation (Section 7.12.5, “Explicitly-kinded quantification”), to tell
GHC the kind of the type variable, if it is not "*". (In source files, this is worked out
from the way the type variable is used in the constructors.) For example:

data R (x :: * -> *) y

You cannot use deriving on a data type declaration; write an
instance declaration instead.

Class declarations are exactly as in Haskell, except that you may not put
default method declarations. You can also omit all the superclasses and class
methods entirely; but you must either omit them all or put them all in.

You can include instance declarations just as in Haskell; but omit the "where" part.

The default role for class and datatype parameters is now representational. To get another role, use a role annotation. (See Section 7.24, “Roles”.)

4.7.10. Using make

It is reasonably straightforward to set up a
Makefile to use with GHC, assuming you name
your source files the same as your modules. Thus:

Note also the inter-module dependencies at the end of the
Makefile, which take the form

Foo.o Foo.hc Foo.s : Baz.hi # Foo imports Baz

They tell make that if any of
Foo.o, Foo.hc or
Foo.s have an earlier modification date than
Baz.hi, then the out-of-date file must be
brought up to date. To bring it up to date,
make looks for a rule to do so; one of the
preceding suffix rules does the job nicely. These dependencies
can be generated automatically by ghc; see
Section 4.7.11, “Dependency generation”.

4.7.11. Dependency generation

Putting inter-dependencies of the form Foo.o :
Bar.hi into your Makefile by
hand is rather error-prone. Don't worry, GHC has support for
automatically generating the required dependencies. Add the
following to your Makefile:

depend :
	ghc -M $(HC_OPTS) $(SRCS)

Now, before you start compiling, and any time you change
the imports in your program, run
make depend before you run make
cool_pgm. The command ghc -M will
append the needed dependencies to your
Makefile.

In general, ghc -M Foo does the following.
For each module M in the set
Foo plus all its imports (transitively),
it adds to the Makefile:

A line recording the dependence of the object file on the source file.

M.o : M.hs

(or M.lhs if that is the filename you used).

For each import declaration import X in M,
a line recording the dependence of M on X:

M.o : X.hi

For each import declaration import {-# SOURCE #-} X in M,
a line recording the dependence of M on X:

M.o : X.hi-boot

If M imports multiple modules, then there will
be multiple lines with M.o as the
target.
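The lines generated for a single module can be sketched as follows. This is an illustration of the rules above rather than the code ghc -M actually runs; the Import type and the dependencyLines function are hypothetical, and module names are assumed to be non-hierarchical so that no directory components are needed.

```haskell
-- | An import declaration: ordinary, or marked with {-# SOURCE #-}.
data Import = Normal String | Source String

-- | Makefile dependency lines for module M, given its source file and imports.
dependencyLines :: String -> FilePath -> [Import] -> [String]
dependencyLines moduleName sourceFile imports =
  (object ++ " : " ++ sourceFile) : map importLine imports
  where
    object = moduleName ++ ".o"
    importLine (Normal x) = object ++ " : " ++ x ++ ".hi"       -- import X
    importLine (Source x) = object ++ " : " ++ x ++ ".hi-boot"  -- import {-# SOURCE #-} X

main :: IO ()
main = mapM_ putStrLn (dependencyLines "M" "M.hs" [Normal "X", Source "Y"])
```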

There is no need to list all of the source files as arguments to the ghc -M command;
ghc traces the dependencies, just like ghc --make
(a new feature in GHC 6.4).

Note that ghc -M needs to find a source
file for each module in the dependency graph, so that it can
parse the import declarations and follow dependencies. Any pre-compiled
modules without source files must therefore belong to a
package.

By default, ghc -M generates all the
dependencies, and then concatenates them onto the end of
makefile (or
Makefile if makefile
doesn't exist) bracketed by the lines "# DO NOT
DELETE: Beginning of Haskell dependencies" and
"# DO NOT DELETE: End of Haskell
dependencies". If these lines already exist in the
makefile, then the old dependencies are
deleted first.
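The update of the makefile can be pictured as a function on its lines: any previously generated block is removed, and the new block is added at the end between the two marker lines. The following is a sketch of that behaviour as described in the text, not of GHC's own code; the function name updateMakefile is hypothetical.

```haskell
beginMarker, endMarker :: String
beginMarker = "# DO NOT DELETE: Beginning of Haskell dependencies"
endMarker   = "# DO NOT DELETE: End of Haskell dependencies"

-- | Remove an existing generated block (if any) and append the new one.
updateMakefile :: [String] -> [String] -> [String]
updateMakefile deps makefileLines =
  stripOld makefileLines ++ [beginMarker] ++ deps ++ [endMarker]
  where
    stripOld ls =
      let (before, rest) = break (== beginMarker) ls
          -- Everything after the end marker is kept.
          after = drop 1 (dropWhile (/= endMarker) rest)
      in if null rest then ls else before ++ after

main :: IO ()
main = mapM_ putStrLn
  (updateMakefile ["M.o : M.hs"]
     ["all:", beginMarker, "Old.o : Old.hs", endMarker, "clean:"])
```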

Don't forget to use the same -package
options on the ghc -M command line as you
would when compiling; this enables the dependency generator to
locate any imported modules that come from packages. The
package modules won't be included in the dependencies
generated, though (but see the
--include-pkg-deps option below).

The dependency generation phase of GHC can take some
additional options, which you may find useful.
The options which affect dependency generation are:

-ddump-mod-cycles

Display a list of the cycles in the module graph. This is
useful when trying to eliminate such cycles.

-v2

Print a full list of the module dependencies to stdout.
(This is the standard verbosity flag, so the list will
also be displayed with -v3 and
-v4; see
Section 4.6, “Verbosity options”.)

-dep-makefile file

Use file as the makefile,
rather than makefile or
Makefile. If
file doesn't exist,
mkdependHS creates it. We often use
-dep-makefile .depend to put the dependencies in
.depend and then
include the file
.depend into
Makefile.

-dep-suffix <suf>

Make extra dependencies that declare that files
with suffix
.<suf>_<osuf>
depend on interface files with suffix
.<suf>_hi, or (for
{-# SOURCE #-}
imports) on .hi-boot. Multiple
-dep-suffix flags are permitted. For example,
-dep-suffix a -dep-suffix b
will make dependencies
for .o on
.hi,
.a_o on
.a_hi, and
.b_o on
.b_hi. (Useful in
conjunction with NoFib "ways".)

--exclude-module=<file>

Regard <file> as
"stable"; i.e., exclude it from having dependencies on
it.

--include-pkg-deps

Regard modules imported from packages as unstable,
i.e., generate dependencies on any imported package modules
(including Prelude, and all other
standard Haskell libraries). Dependencies are not traced
recursively into packages; dependencies are only generated for
home-package modules on external-package modules directly imported
by the home package module.
This option is normally
only used by the various system libraries.

4.7.12. Orphan modules and instance declarations

Haskell specifies that when compiling module M, any instance
declaration in any module "below" M is visible. (Module A is "below"
M if A is imported directly by M, or if A is below a module that M imports directly.)
In principle, GHC must therefore read the interface files of every module below M,
just in case they contain an instance declaration that matters to M. This would
be a disaster in practice, so GHC tries to be clever.

In particular, if an instance declaration is in the same module as the definition
of any type or class mentioned in the head of the instance declaration
(the part after the “=>”; see Section 7.6.3.3, “Relaxed rules for instance contexts”), then
GHC has to visit that interface file anyway. Example:

module A where
instance C a => D (T a) where ...
data T a = ...

The instance declaration is only relevant if the type T is in use, and if
so, GHC will have visited A's interface file to find T's definition.

The only problem comes when a module contains an instance declaration
and GHC has no other reason for visiting the module. Example:

module Orphan where
instance C a => D (T a) where ...
class C a where ...

Here, neither D nor T is declared in module Orphan.
We call such modules “orphan modules”.
GHC identifies orphan modules, and visits the interface file of
every orphan module below the module being compiled. This is usually
wasted work, but there is no avoiding it. You should therefore do
your best to have as few orphan modules as possible.

Functional dependencies complicate matters. Suppose we have:

module B where
instance E T Int where ...
data T = ...

Is this an orphan module? Apparently not, because T
is declared in the same module. But suppose class E had a
functional dependency:

module Lib where
class E x y | y -> x where ...

Then in some importing module M, the constraint (E a Int) should be "improved" by setting
a = T, even though there is no explicit mention
of T in M.

These considerations lead to the following definition of an orphan module:

An orphan module
contains at least one orphan instance or at
least one orphan rule.

An instance declaration in a module M is an orphan instance if

The class of the instance declaration is not declared in M, and

Either the class has no functional dependencies, and none of the type constructors
in the instance head is declared in M; or there
is a functional dependency for which none of the type constructors mentioned
in the non-determined part of the instance head is defined in M.

Only the instance head
counts. In the example above, it is not good enough for C's declaration
to be in module A; it must be the declaration of D or T.

A rewrite rule in a module M is an orphan rule
if none of the variables, type constructors,
or classes that are free in the left hand side of the rule are declared in M.
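The definition of an orphan instance can be written out as a predicate. The sketch below follows the wording of the definition only and makes no claim about how GHC represents instances; the types TyCon and Instance and the function isOrphanInstance are hypothetical, and the module names in the example are illustrative.

```haskell
data TyCon = TyCon
  { tyConName   :: String
  , tyConModule :: String     -- module in which the type constructor is declared
  }

data Instance = Instance
  { classModule :: String     -- module in which the class is declared
  , fundeps     :: [[Int]]    -- per functional dependency: determined parameter positions
  , headArgs    :: [[TyCon]]  -- type constructors mentioned in each class argument
  }

-- | True if none of the given type constructors is declared in module m.
declaredOutside :: String -> [TyCon] -> Bool
declaredOutside m = all ((/= m) . tyConModule)

-- | Is the instance, declared in module m, an orphan instance?
isOrphanInstance :: String -> Instance -> Bool
isOrphanInstance m inst =
  classModule inst /= m && typesCondition
  where
    typesCondition
      | null (fundeps inst) = declaredOutside m (concat (headArgs inst))
      | otherwise           = any fundepIsOrphan (fundeps inst)
    -- Look only at the arguments the dependency does not determine.
    fundepIsOrphan determined =
      declaredOutside m
        [ tc | (position, tcs) <- zip [0 ..] (headArgs inst)
             , position `notElem` determined
             , tc <- tcs ]

main :: IO ()
main = do
  let t      = TyCon "T" "B"
      int    = TyCon "Int" "GHC.Types"
      noDeps = Instance "Lib" [] [[t], [int]]      -- class E x y
      withFd = Instance "Lib" [[0]] [[t], [int]]   -- class E x y | y -> x
  print (isOrphanInstance "B" noDeps)  -- False: T is declared in B
  print (isOrphanInstance "B" withFd)  -- True: only Int is in the non-determined part
```

The two cases correspond to the instance E T Int in module B discussed above, first without and then with the functional dependency on class E.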

If you use the flag -fwarn-orphans, GHC will warn you
if you are creating an orphan module.
Like any warning, you can switch the warning off with -fno-warn-orphans,
and -Werror
will make the compilation fail if the warning is issued.

You can identify an orphan module by looking in its interface
file, M.hi, using the
--show-iface mode. If there is an [orphan module] on the
first line, GHC considers it an orphan module.