Thursday, January 31, 2013

Shake is a build system I have been working on sporadically for the last four years; think of it as a better alternative to writing Makefiles. In the past few weeks I've released versions 0.6, 0.7 and 0.8.

Questions about Shake

Unlike many of my other libraries, Shake invites user questions. It's a complex tool with lots of power to wield, and lots of aspects that emerge from the API, rather than being obvious from it. Therefore, I encourage anyone with any questions about Shake to ask them against the shake-build-system StackOverflow tag (thanks to Lennart Augustsson for creating the tag, as my reputation is too low). I've already asked one question, but I'm sure there are lots of others - "how does Shake/Make compare to monad/arrow?", "why did the Oracle change type?", "how would I define a rule that downloads from the web?". The more questions the easier it will be for future Shake users to find the information they need.

API Change: getDirectoryFiles

There is only one real breaking API change in the above series of versions: getDirectoryFiles now takes a list of FilePatterns. So you can write:

getDirectoryFiles "Configuration" ["*.xml","*.json"]

to find all XML and JSON files in your Configuration directory. The reason for this change is to introduce a new and more powerful matching capability, so you can also write:

getDirectoryFiles "Configuration" ["//*.xml","//*.json"]

And that will find all XML and JSON files anywhere under the Configuration directory. Shake tries hard to issue the minimum number of directory traversals, so searching for a list of patterns results in fewer file system queries than searching for each pattern individually.
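For context, here is a minimal sketch of how such a call might sit inside a complete program. It assumes the shake package is installed and that a Configuration directory exists relative to the working directory; configPatterns is just an illustrative name, and printing the result stands in for whatever the build would really do with the files.

```haskell
module Main (main) where

import Development.Shake

-- The patterns from the text: "//" lets a pattern match at any depth.
configPatterns :: [FilePattern]
configPatterns = ["//*.xml", "//*.json"]

main :: IO ()
main = shake shakeOptions $ action $ do
    -- One call with both patterns, rather than one call per pattern.
    files <- getDirectoryFiles "Configuration" configPatterns
    -- Print each match; a real build would typically 'need' them instead.
    liftIO $ mapM_ putStrLn files
```

Passing both patterns in a single list is what gives Shake the chance to share the directory traversal between them.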

Sunday, January 06, 2013

Shake is a build system I have been working on sporadically for the last four years; think of it as a better alternative to writing Makefiles. In the past week I've released version 0.4 followed by version 0.5. I strongly recommend all users upgrade to the latest version of Shake, as it fixes a massive space leak present in versions 0.2 and 0.3. I have measured memory savings of over 1 GB in some build systems (I intend to write a post on the space leak later). There are two notable API changes for people upgrading from version 0.3.

Change 1: The oracle is now strongly typed

Shake has a notion of oracle rules, which store information about the system, for example which GHC version you are running. The intent is to allow users to track this extra information, so if you upgrade GHC, the build system will automatically rebuild all Haskell files, but leave the C files alone, without requiring users to perform a wipe.

An Oracle is essentially a question/answer pair. In shake-0.3 both questions and answers were of type [String]; the documentation attempted to justify this abomination by saying:

"This type is a compromise. Questions will often be the singleton list, but allowing a list of strings allows hierarchical schemes such as ghc-pkg shake, ghc-pkg base etc. The answers are often singleton lists, but sometimes are used as sets - for example the list of packages returned by ghc-pkg."

Shake required you to impose a global namespace on questions, and to encode results in an impoverished type. No more! The oracle now allows arbitrary question/answer types. Namespacing is automatic, since the type definitions act as unique questions, and the answer can be any type required.

The API for working with the oracle has changed, but the necessary modifications should be fairly localised.
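To give a feel for the typed style, here is a hedged sketch. It is written against a recent Shake release rather than 0.5, so details may differ from the API described in this post: in particular, the RuleResult type instances are something newer versions require and are an assumption relative to the text here. The names GhcVersion and GhcPkgVersion are illustrative, and the returned version strings are placeholders rather than real queries. In the shake-0.3 style, the second question would have been a list of strings such as ["ghc-pkg","shake"].

```haskell
{-# LANGUAGE DeriveDataTypeable, GeneralizedNewtypeDeriving, TypeFamilies #-}
module Main (main) where

import Development.Shake
import Development.Shake.Classes

-- Each question is its own type, so two oracles cannot clash by name.
newtype GhcVersion = GhcVersion ()
    deriving (Show, Typeable, Eq, Hashable, Binary, NFData)

newtype GhcPkgVersion = GhcPkgVersion String
    deriving (Show, Typeable, Eq, Hashable, Binary, NFData)

-- Required by recent Shake releases (assumption: not part of the 0.5 API).
type instance RuleResult GhcVersion = String
type instance RuleResult GhcPkgVersion = String

main :: IO ()
main = shake shakeOptions $ do
    -- Placeholder answers; a real build would query ghc and ghc-pkg here.
    ghcVersion <- addOracle $ \(GhcVersion ()) -> return "7.2.1"
    pkgVersion <- addOracle $ \(GhcPkgVersion pkg) -> return (pkg ++ "-1.0")
    action $ do
        v <- ghcVersion (GhcVersion ())
        p <- pkgVersion (GhcPkgVersion "shake")
        liftIO $ putStrLn (v ++ " " ++ p)
```

The newtype declarations and deriving clauses are the extra cost of the typed approach; in exchange, the compiler checks that every question is asked with the right argument and answered with the right type.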

There is a little more boilerplate with the new version, but all the problems caused by [String] are gone, and in larger projects it can lead to a significant reduction in complexity and cross-module namespace issues.

Change 2: validStored becomes storedValue

When defining your own rules (a rare occurrence), you previously had to supply a definition:

validStored :: key -> value -> IO Bool

Given a key, look up that value on disk and check it matches the value you were given. This API is very much what Shake needs to do at runtime: it needs to check that the value it has is still valid. Almost all definitions followed the same form: look up the current value on disk, then compare it for equality with the value supplied.

The storedValue definition that replaces it simplifies most rules, still has the same power as the old system (because you can define a custom Eq instance for your value type), and has allowed the addition of the AssumeClean feature to Shake (more about that in a future post, or read the Shake Haddock documentation).
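To make the relationship between the two functions concrete, here is a minimal sketch that uses only the base library and none of Shake itself. The shape key -> IO (Maybe value) for the new-style function is an assumption made for this illustration, so check the Shake Haddock documentation for the real signature. The names StoredValue, stillValid, validStoredFrom and fakeDisk are all hypothetical.

```haskell
module Main (main) where

-- Assumed shape of the new-style function: report what is on disk, if anything.
type StoredValue key value = key -> IO (Maybe value)

-- The comparison that old-style definitions had to perform by hand.
stillValid :: Eq value => Maybe value -> value -> Bool
stillValid onDisk remembered = onDisk == Just remembered

-- An old-style validity check recovered from a new-style lookup.
validStoredFrom :: Eq value => StoredValue key value -> key -> value -> IO Bool
validStoredFrom stored key remembered = do
    onDisk <- stored key
    return (stillValid onDisk remembered)

-- Hypothetical stand-in for a real disk lookup.
fakeDisk :: StoredValue String Int
fakeDisk key = return (lookup key [("a.txt", 1), ("b.txt", 2)])

main :: IO ()
main = do
    validStoredFrom fakeDisk "a.txt" 1 >>= print        -- value unchanged
    validStoredFrom fakeDisk "a.txt" 9 >>= print        -- value differs
    validStoredFrom fakeDisk "missing.txt" 1 >>= print  -- nothing on disk
```

Because the comparison goes through Eq, a value type with a custom Eq instance can still decide for itself what counts as "unchanged".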