If you're a Ruby programmer, you've almost certainly used Rake, the build utility created by the late Jim Weirich. But you might not realize just how powerful and flexible a tool it can be. I certainly didn't, until I decided to use it as the basis for Quarto, my e-book production toolchain.

This post is part of a series on Rake, starting with the basics and then moving on to advanced usage. It originated as a series of RubyTapas videos published to subscribers in August-September 2013. Each post begins with a video, followed by the script for those who prefer reading to viewing.

My hope in publishing these episodes for free is that more people will come to know and love the full power of this ubiquitous but under-appreciated tool. If you are grateful for Rake, please consider donating to the Weirich Fund in Jim's memory.

We’re going to spend some time looking at Rake over the next few episodes. I hope you don’t mind.

Chances are you’ve used Rake at some point. If nothing else, you’ve probably run various Rake tasks associated with Rails projects. Perhaps you’ve written some Rakefiles of your own.

Chances are, though, that you’ve barely scratched the surface of Rake’s capabilities. That was certainly true of me until a few weeks ago. I’d written my share of Rakefiles and task files, sure, but I’d never really dug deeply into all that Rake can do. Now that I’ve spent some time really learning Rake, I’ve realized that it’s a tool of extraordinary power. I’d like to share some of what I’ve learned with you.

We’re going to start, though, with a review of Rake basics.

Let’s say we have a directory full of Markdown files we want to convert to HTML using the Pandoc tool. We could write a simple script to iterate over the files and convert them one by one.

```ruby
%W[ch1.md ch2.md ch3.md].each do |md_file|
  html_file = File.basename(md_file, ".md") + ".html"
  system("pandoc -o #{html_file} #{md_file}")
end
```

But this script is going to remake every single one of the HTML files every time we run it, even if the source files haven’t changed. If the markdown files are very large, this could mean a long wait.

Instead, let’s make a Rakefile and write a Rake task to generate the HTML. It starts similarly, by iterating over a list of input files and determining the corresponding HTML file. But then it starts to differ. We use Rake’s `file` method to declare that the `html_file` has a dependency on the Markdown file. Then, inside the block, we tell Rake how to get an HTML file from a Markdown file, using a shell command.

What we’ve written here is a rule, or actually three rules, each one telling Rake how to build a particular HTML file from a Markdown source file.

```ruby
%W[ch1.md ch2.md ch3.md].each do |md_file|
  html_file = File.basename(md_file, ".md") + ".html"
  file html_file => md_file do
    sh "pandoc -o #{html_file} #{md_file}"
  end
end
```

This by itself is already a usable Rakefile. On the command line, we can tell rake to build one of the HTML files and it will oblige us. We can already see an advantage over our script: Rake shows us the command that it is executing.

```
$ rake ch1.html
pandoc -o ch1.html ch1.md
```

If we tell Rake to build the same file again, nothing happens. This is because Rake checks file modification times to see if the Markdown file has changed since the HTML file was created. Since it hasn’t, Rake knows that the HTML file doesn’t need to be rebuilt.

```
$ rake ch1.html
$
```
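The timestamp comparison described here can be sketched in plain Ruby. This is a simplified model, not Rake's actual implementation: `needs_rebuild?` is a hypothetical helper name, and the sketch uses `File.utime` to set modification times explicitly so the result does not depend on how fast the script runs.

```ruby
require "tmpdir"

# Hypothetical helper, not part of Rake: a target is stale when it is
# missing or when its source has a newer modification time.
def needs_rebuild?(target, source)
  return true unless File.exist?(target)

  File.mtime(source) > File.mtime(target)
end

Dir.mktmpdir do |dir|
  md   = File.join(dir, "ch1.md")
  html = File.join(dir, "ch1.html")
  File.write(md, "# Chapter 1\n")

  puts needs_rebuild?(html, md)   # true: ch1.html does not exist yet

  # Pretend the HTML was built one minute after the Markdown was saved.
  now = Time.now
  File.write(html, "<h1>Chapter 1</h1>\n")
  File.utime(now, now - 60, md)
  File.utime(now, now, html)
  puts needs_rebuild?(html, md)   # false: ch1.html is newer than ch1.md

  # Simulate an edit to the Markdown after the build.
  File.utime(now, now + 60, md)
  puts needs_rebuild?(html, md)   # true: ch1.md is now newer
end
```

The point of the sketch is only that the decision needs nothing beyond the two files' modification times, which is why no extra bookkeeping appears in the Rakefile.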

If we then modify the file and run Rake again, it once again builds the HTML file.

```
$ rake ch1.html
pandoc -o ch1.html ch1.md
```

It’s nice that Rake is tracking when files need to be rebuilt. But specifying which file we want to be built is tedious. We’d prefer to simply have Rake rebuild any HTML files that are out of date.

To make that happen, we add a task to our Rakefile. We name it “html”, and give it a dependency on our three HTML files.

```ruby
task :html => %W[ch1.html ch2.html ch3.html]

%W[ch1.md ch2.md ch3.md].each do |md_file|
  html_file = File.basename(md_file, ".md") + ".html"
  file html_file => md_file do
    sh "pandoc -o #{html_file} #{md_file}"
  end
end
```

This task has no code of its own. But when we tell Rake to build the “html” task, it follows the dependency to the HTML files. It knows how to build those files because of the rules we already wrote, so it proceeds to build them.

```
$ rake html
pandoc -o ch1.html ch1.md
pandoc -o ch2.html ch2.md
pandoc -o ch3.html ch3.md
```
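As a rough illustration of what "following the dependency" means, here is a toy model in plain Ruby. It is not how Rake is implemented: the `PREREQS` table and the `visit` method are hypothetical names that simply mirror the Rakefile above, and the sketch ignores timestamps entirely.

```ruby
# Toy model of prerequisite resolution. PREREQS and visit are
# hypothetical; they only mirror the dependencies declared in the Rakefile.
PREREQS = {
  "html"     => %w[ch1.html ch2.html ch3.html],
  "ch1.html" => %w[ch1.md],
  "ch2.html" => %w[ch2.md],
  "ch3.html" => %w[ch3.md]
}.freeze

# Visit every prerequisite before the task itself (depth-first),
# returning names in the order they would be considered.
def visit(name, seen = [])
  return seen if seen.include?(name)

  PREREQS.fetch(name, []).each { |dep| visit(dep, seen) }
  seen << name
end

puts visit("html").join(" ")
# prints: ch1.md ch1.html ch2.md ch2.html ch3.md ch3.html html
```

Each HTML file is considered only after its Markdown source, and the `html` task itself comes last, which matches the order of the `pandoc` commands in the output above.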

If we then edit one of the Markdown files and re-run the Rake task, we can see that Rake only rebuilds the one that was updated.

```
$ rake html
pandoc -o ch2.html ch2.md
```

If we’re going to be running this command a lot, we can make it even more convenient by declaring a `:default` task with a dependency on our `html` task.

```ruby
task :default => :html
task :html => %W[ch1.html ch2.html ch3.html]

%W[ch1.md ch2.md ch3.md].each do |md_file|
  html_file = File.basename(md_file, ".md") + ".html"
  file html_file => md_file do
    sh "pandoc -o #{html_file} #{md_file}"
  end
end
```

This allows us to rebuild our files by simply running `rake` with no arguments.

```
$ rm *.html
$ rake
pandoc -o ch1.html ch1.md
pandoc -o ch2.html ch2.md
pandoc -o ch3.html ch3.md
```

So far we’ve seen how to declare file rules and tasks. Now let’s learn how to write generic rules.

Our three file rules all share a common pattern: converting a “.md” file to a “.html” file. In fact, this pattern is so repetitive that we automated the generation of the rules using an `each` loop. Instead of writing an explicit loop, let’s teach Rake how to convert “.md” files to “.html” files, and let it work the rest out for itself.

We do this by declaring a rule whose name is the file extension `.html`. This rule’s dependency is on the file extension `.md`. We then open a block. This block will accept a block argument we’ll call `t`. We call it `t` because it will be bound to a Rake `Task` object.

Inside the block, we use the `sh` command to run a shell command. It starts out with the `pandoc` command as before. But for the output filename, we interpolate the task’s `name` attribute. And for the input file, we use the task’s `source` attribute.

```ruby
task :default => :html
task :html => %W[ch1.html ch2.html ch3.html]

rule ".html" => ".md" do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end
```

That’s it. When we remove our HTML files and run Rake again, we can see that it regenerates them as before.

```
$ rm *.html
$ rake
pandoc -o ch1.html ch1.md
pandoc -o ch2.html ch2.md
pandoc -o ch3.html ch3.md
```

So what just happened here? Since we specified no arguments, Rake executed the `:default` task, which has a dependency on the `:html` task. The `:html` task, in turn, depends on three `.html` files. Rake started with the first one, `ch1.html`, and looked to see if it existed. It found that it didn’t. So Rake then tried to find a way to build the file.

First it looked for any rules explicitly named `ch1.html`, but we removed all of those.

What it did find was our new rule. It saw that using the rule it could generate a `.html` file from a corresponding `.md` file. Applying the rule to `ch1.html`, it found that the corresponding file, `ch1.md`, existed. This meant that the rule was a match, so it went ahead and executed it. It then repeated the whole process for the remaining missing `.html` files.
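The matching step can be approximated in a few lines of plain Ruby. This is a sketch of the idea only, not Rake's own code: `source_for` is a hypothetical helper, and it handles just the simple extension-to-extension case used in this episode.

```ruby
require "tmpdir"

# Hypothetical helper, not Rake's implementation: given a target such as
# "ch1.html", swap the extension and return the source path if it exists.
def source_for(target, target_ext: ".html", source_ext: ".md")
  return nil unless File.extname(target) == target_ext

  candidate = target.sub(/#{Regexp.escape(target_ext)}\z/, source_ext)
  File.exist?(candidate) ? candidate : nil
end

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    File.write("ch1.md", "# Chapter 1\n")

    p source_for("ch1.html")  # "ch1.md": the rule applies
    p source_for("ch9.html")  # nil: there is no ch9.md to build from
    p source_for("ch1.pdf")   # nil: the extension does not match the rule
  end
end
```

When a source is found, it plays the role of `t.source` in the rule's block, with the requested target as `t.name`.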

There is so much more I want to show you about Rake, but RubyTapas is all about one idea at a time, so I’ll save it for future episodes. Stay tuned, and happy hacking!

I hope you've enjoyed this episode/article on Rake. If you've learned something today, please consider "paying it forward" by donating to the Weirich Fund, to help carry on Jim's legacy of educating programmers. If you want to see more videos like this one, check out RubyTapas. If you want to learn more about Rake, check out my book-in-progress The Rake Field Manual.