Just enough static site generator

2018-07-10

I am a huge fan of static site generators. There are a number of fantastic
static site generators around: jekyll being one of the
most used as it renders static files hosted via
github pages. Jekyll is written in
Ruby (a language I do not know at all) and there
are a number of others, including many written in Python (a language I do know).
On a number of occasions I've found myself not quite entirely happy with the
various options and recently I've started just writing a short Python script
that does the same job. In this post I'll describe the relatively few
lines of Python required to make a static site generator.

TLDR: All you need to create a static site generator is a small number of
lightweight and awesome Python libraries. Here is the full file I'm about to
describe: main.py.

What is a static site generator?

First things first: most of the web is now powered by server-based sites
that take a request, access a database and serve the corresponding html on the
fly. A static site generator instead does a one-off read of all the source files
(the "database") and generates all the html in one go.

Most of these will, for example, let you write blog posts in the popular
markdown file format and convert them to html.

As an example this blog post is written in markdown and is currently in a file
in a directory called src:

|
|---src/
    |---2018-07-10-just-enough-static-site-generator.md

The first few lines of this file look like:

title: Just enough static site generator
description: A description of a small python script as a static site generator
---
I am a huge fan of static site generators. There are a number of fantastic
static site generators around: [jekyll](https://jekyllrb.com/) being one of the
most used as it renders static files hosted via
[github pages](https://pages.github.com/). Jekyll is written in
[Ruby](https://www.ruby-lang.org/en/) (a language I do not know at all) and there
are a number of others, including many written in Python (a language I do know).
On a number of occasions I've found myself not quite entirely happy with the
various options and recently I've started just writing a short Python script
that does the same job. In this post I'll describe the **relatively** few
lines of Python required to make a static site generator.

**TLDR** All you need to create a static site generator is a small number of
lightweight and awesome Python libraries. Here is the full file I'm about to
describe [`main.py`](blog/main.py).

### What is a static site generator?

First things first: whilst most of the web is now powered by server based sites

The first thing we need to be able to do is find all those files.

Using Pathlib to find all the markdown files

Pathlib is a fantastic
library that provides an abstraction over file systems (so things work on *nix and
Windows, for example).

We can use Pathlib to easily find all the .md files in the src directory.
Here is the first step of a python function main that does this; it
essentially boils down to the src_path.glob("*.md") part.
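
As a rough sketch of that first step (not the exact code from main.py), finding the files could look something like the following. The helper name find_markdown_files is hypothetical, and sorting the results is my own assumption so that the posts come back in a predictable order.

```python
from pathlib import Path


def find_markdown_files(src_path="src"):
    """Return the .md files directly inside src_path, sorted by file name."""
    # glob("*.md") only matches files in src_path itself, not in subdirectories.
    return sorted(Path(src_path).glob("*.md"))


def main(src_path="src"):
    """First step of a hypothetical main: list every markdown source file."""
    for path in find_markdown_files(src_path):
        print(path.name)
```

Because the file names start with the date, sorting by name also happens to sort the posts chronologically.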

All of that just makes use of the Pathlib library but where things get
interesting is the ability to get the preamble material at the top of a markdown
file (technically speaking this is usually in a format called
yaml). Here is the
get_content_and_metadata function that does this:

The first step is to split the file on the delimiter (---) which is
used to separate the yaml and md content. Then we use the pyyaml library
to transform the yaml into a python dictionary and the markdown library to
transform the rest into html.
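
The steps above can be sketched roughly as follows. This is a minimal version under a few assumptions of mine: the function takes the raw text of the file rather than a path, it uses yaml.safe_load, and it only splits on the first delimiter so that a later --- in the body of a post is left alone.

```python
import markdown
import yaml


def get_content_and_metadata(text, delimiter="---"):
    """Split the raw text of a post into (html content, metadata dict)."""
    # maxsplit=1: everything after the first delimiter is treated as markdown.
    raw_metadata, raw_markdown = text.split(delimiter, 1)
    metadata = yaml.safe_load(raw_metadata)  # yaml preamble -> python dict
    content = markdown.markdown(raw_markdown.strip())  # markdown -> html string
    return content, metadata
```

Calling it on a file would then look like get_content_and_metadata(path.read_text()), where path is one of the Path objects found in the src directory.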

The last step of the read_post function is to return a Post instance. This
is just a namedtuple which makes things simpler to manage at a later stage:
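
A minimal sketch of such a namedtuple might look like this. Only content and date are mentioned in this post, so the other field names (title, description, stub) and the build_post helper are hypothetical, as is the idea of reading the date from the start of the file name.

```python
from collections import namedtuple
from pathlib import Path

# Field names other than `content` and `date` are assumptions for illustration.
Post = namedtuple("Post", ["title", "description", "date", "stub", "content"])


def build_post(path, metadata, content):
    """Combine a file path, its metadata dict and its rendered html into a Post."""
    stem = Path(path).stem  # file name without the directory or the ".md" suffix
    # Assumes file names of the form "YYYY-MM-DD-some-title.md".
    date, stub = stem[:10], stem[11:]
    return Post(
        title=metadata["title"],
        description=metadata["description"],
        date=date,
        stub=stub,
        content=content,
    )
```

The advantage over a plain dictionary is that later code can write post.date or post.content, and a typo in a field name fails loudly.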

Next we write this html to files that will actually be accessed/read online.

Using jinja2 to template how our site will look

The next part of the main function shown previously is to call the
write_post function. This makes use of the very versatile jinja2 library
which makes using templates (so that we only need to write the structure of
pages once) straightforward. jinja2 is actually used by a number of other
libraries but here we're using it "raw":

This function takes a Post instance (the named tuple shown before) and an
output directory (I'll be using posts in my case) and then calls
render_template which is where jinja2 passes the information
post.content, post.date etc to a template file post.html.
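
Here is a rough sketch of how these two functions could fit together, not the exact code from main.py. The templates directory, the stub and title attributes, and the choice to write each post to its own index.html are all assumptions of mine.

```python
from pathlib import Path

from jinja2 import Environment, FileSystemLoader


def render_template(template_name, template_dir="templates", **variables):
    """Render the named template file with the given variables."""
    # Autoescaping is off by default in jinja2, so html content passes through as is.
    env = Environment(loader=FileSystemLoader(str(template_dir)))
    return env.get_template(template_name).render(**variables)


def write_post(post, output_dir="posts", template_dir="templates"):
    """Write one post to <output_dir>/<post.stub>/index.html and return that path."""
    html = render_template(
        "post.html",
        template_dir,
        title=post.title,
        date=post.date,
        content=post.content,
    )
    target = Path(output_dir) / post.stub / "index.html"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html)
    return target
```

The template file itself is ordinary html with placeholders such as {{ title }}, {{ date }} and {{ content }} where the values should go.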

Building the site and serving it locally thanks to the http library

If you want to see this site locally on your computer, python comes with a handy
server right out of the box. Go to the parent directory and run it:

$ cd ..
$ python -m http.server

Then go to your browser and type in http://localhost:8000/. You should see a
number of directories there, including the one for the blog site. Click on that
and you get a nicely rendered webpage. Of course, because this site is entirely
static you can also just inspect the various html files too.

Pushing to production!

My approach to "publishing" this site is to render locally (by running python
main.py), push to github and serve via github pages.

I choose not to render my static sites (the python main.py part) using a
continuous integration (CI) service, although that's possible to do. This is
probably 50% laziness and 50% not wanting to add a tiny layer of complexity
that could break.

The test_main.py file contains some unit tests and
I do use CI to make sure that they don't break and also to make sure that python
main.py runs without failure.

Why do this?

If you are happy with any of the awesome static site generators out there
you should not do this.

I've just often found myself wanting to make slight tweaks and either not being
willing to learn Ruby or not being entirely satisfied with the tweaks that would
have been required for the Python options.