The Architecture of Open Source Applications: The Blog

We're happy to announce that "500 Lines or Less" is now generally available!
You can read the web version for free on aosabook.org,
while the PDF and paperback versions are available for
purchase from Lulu. Within a few weeks,
the paperback will also be available on Amazon. All proceeds from sales will go
to Amnesty International.

We're happy to announce that the fourth AOSA volume "500 Lines or Less" will be
released on Tuesday July 12th, 2016. As usual, the web version will be
accessible for free on aosabook.org, while the PDF and
paperback versions will be available for purchase from Lulu. Within a few
weeks, the paperback will also be available on Amazon. All proceeds from sales
will go to Amnesty International.

Since "500 Lines" is a very code-heavy book, we've decided not to officially
produce an ePub or mobi version of this volume. We still haven't seen good
examples of code-first books being produced in this format, and we feel that
the free web version is much easier to read on mobile devices. Those who
can't live without an ePub can follow our instructions on how to build the
book from source.

We're extremely grateful for all the support we've received from the hundreds
of people who've reached out to us throughout the production of this book. Your
reviews, issue reports, and direct contributions made this a much better volume
than it would have been otherwise.

The production of 500Lines was greatly accelerated by the financial support of
PagerDuty. Please join us in thanking
them.

Finally, I'd like to extend a huge thanks to AOSA general editor Amy
Brown, who singlehandedly copy-edited all of the
chapters, designed the cover, and rounded out the thousands of rough edges
that I left behind. If you or your colleagues are ever looking for someone who
knows how to produce books from start to finish, Amy is the person to talk to.

In this chapter, Brandon and Daniel explore the problem of dependency
management in build systems. Throughout their exploration, we learn about how
to use encapsulation, classes, and design patterns in a programming
language with a strong set of builtin data structures and a rich standard
library -- and, perhaps more importantly, when not to.

We'll write up another post next week about the release process and schedule
for both the print and web versions of 500Lines. Thanks for reading along with
us, and a special thank-you to everyone who opened issues and sent feedback on
these chapters.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Throughout the course of this early access release, we've already seen a couple of different concurrency models -- a thread-based webserver, and a web crawler that uses coroutines. In this chapter, we apply another approach to concurrency by building an event-driven webserver in Common Lisp. We then extend this server into a full-featured web framework by leveraging the power of domain-specific languages.

Next week, we'll be releasing the final chapter in the early-access program. We'll then do a test print of the final draft, and the first print version will become available a couple of weeks after that. More on that to follow!

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

In this chapter, we build an interactive image filtering program in Java. We also learn about the joy of prototyping, and how to test our code when it relies on frameworks that weren't designed with testability in mind.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

This chapter is a bit different from the others in the book. Instead of
building a working program, we are building models of systems that allow us
to explore their behaviors. We do this using a modelling language called
Alloy. Along the way, we learn how agile
modelling can help us discover gaps in our designs before or after we've fully
built them out.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

In this chapter, Dann walks us through building a graph database from first
principles in JavaScript. Along the way, we pay careful attention to the design
decisions we are making, and have several opportunities to assess the kinds of
predictions and tradeoffs that practicing programmers face in their work every
day. The end result is a database with surprisingly powerful query capabilities
that runs in your browser.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

In this chapter, we start out by writing the simplest possible web server we
can think of, and then gradually extend it to support richer and more complex
features. In the process, we learn a few lessons about writing software that is
resilient to change.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Most programmers feel a certain sense of wonder the first few times they write
a working program. However, there is a special sense of wonder that comes from
writing a program that can actually learn behaviours that the programmer has
not explicitly specified.

In this chapter, Marina shows us how to write a program that recognizes
handwritten characters by using the machine-learning technique of training a
neural network to do the job for us.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Most programmers eventually encounter the basic idea of the Strategy pattern,
which allows you to switch between multiple implementations of an operation at
runtime depending on the scenario.

In this chapter, Christian shows us how this pattern can be useful when trying
to solve the flow shop scheduling problem, which is a notoriously difficult
problem in computer science and operations research. More generally, we learn
how to use this pattern in our own practice when faced with intractable
problems.
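
As a rough illustration of the Strategy pattern described above (a minimal sketch, not the code from Christian's chapter), the Python below keeps several interchangeable job-ordering heuristics in a dictionary and picks one by name at runtime. The heuristic names, the `schedule` helper, and the sample job data are all hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List

# A job is a list of processing times, one per machine, visited in order.
Job = List[int]
Strategy = Callable[[List[Job]], List[Job]]


def makespan(jobs: List[Job]) -> int:
    """Time at which the last job leaves the last machine, for this job order."""
    if not jobs:
        return 0
    finish = [0] * len(jobs[0])  # when each machine finishes its latest job
    for job in jobs:
        for machine, duration in enumerate(job):
            ready = finish[machine - 1] if machine > 0 else 0
            # A job starts once the machine is free AND the job has left
            # the previous machine.
            finish[machine] = max(finish[machine], ready) + duration
    return finish[-1]


# Hypothetical strategies: each one is just a different way to order the jobs.
def as_given(jobs: List[Job]) -> List[Job]:
    return list(jobs)


def shortest_first(jobs: List[Job]) -> List[Job]:
    return sorted(jobs, key=sum)


def longest_first(jobs: List[Job]) -> List[Job]:
    return sorted(jobs, key=sum, reverse=True)


STRATEGIES: Dict[str, Strategy] = {
    "as_given": as_given,
    "shortest_first": shortest_first,
    "longest_first": longest_first,
}


def schedule(jobs: List[Job], strategy_name: str) -> List[Job]:
    """Order the jobs using whichever strategy the caller names at runtime."""
    return STRATEGIES[strategy_name](jobs)


if __name__ == "__main__":
    sample_jobs = [[3, 2], [1, 4], [2, 1]]
    for name in STRATEGIES:
        print(name, makespan(schedule(sample_jobs, name)))
```

Because every strategy has the same signature, the calling code never changes when a new heuristic is added; only the dictionary grows.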

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Visual programming languages are popular in early computer science education because, as Dethe notes in his chapter, "a well-done block language can eliminate syntax errors completely."

In this week's early-access release, we create our own visual programming language in JavaScript from first principles. Along the way, we think twice about the wisdom of following conventional software organization principles like MVC (model-view-controller) without first analyzing the nature of our problem.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

In this chapter, Jess teaches us about sampling, a generally applicable technique for complex problems in which it is difficult to find an exact solution. She uses an interesting motivating example -- generating magical items in a game -- to explain how this works. Along the way, we also learn how to write code that works at the limits of floating point precision.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

In this chapter, Malini shows us how to design a fault-tolerant, scalable continuous integration system by splitting it into multiple components that each have a specialized role. We've seen a more academic example of this once before in the cluster chapter; in this one, we get to see a more practical implementation of a distributed system that solves a real-life problem.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

As computers continue to become more fully integrated into human life, more of us are writing programs that are forced to interact with the messiness of the real, physical world. Dessy demonstrates how we can build and refine effective models of reality (much like many other scientists do every day!) in order to accomplish the practical goal of building a working pedometer.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

In this chapter, Allison demystifies the process of writing an interpreter by building one up from very modest beginnings. In the process, we learn about stack machines and how they can be used as a foundation for simply expressing what appear at first glance to be very complicated computations.
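
To give a feel for what a stack machine is (a minimal sketch only, not the interpreter from Allison's chapter), the Python below evaluates a small list of instructions by pushing values onto a stack and popping them off again. The instruction names `PUSH`, `ADD`, and `MUL` and the `run` function are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple


def run(program: List[Tuple]) -> int:
    """Execute a list of (instruction, *arguments) tuples on a fresh stack."""
    stack: List[int] = []
    for instruction, *args in program:
        if instruction == "PUSH":
            stack.append(args[0])
        elif instruction == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif instruction == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    # By convention in this sketch, the result is whatever is left on top.
    return stack.pop()


if __name__ == "__main__":
    # (7 + 5) * 2, written as a flat sequence of stack operations.
    program = [
        ("PUSH", 7),
        ("PUSH", 5),
        ("ADD",),
        ("PUSH", 2),
        ("MUL",),
    ]
    print(run(program))
```

The nested expression becomes a flat list of steps, which is what makes this model a convenient foundation for an interpreter.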

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Any student of 3D graphics programming will quickly encounter the pattern of the 'scene graph', which is used to organize and orient the components of a scene to be rendered. Erick's chapter teaches us about this pattern, but additionally shows us how this core data structure can be integrated into an interactive system.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Continuing with our persistent datastore theme from last week, Taavi's chapter focuses on how we characterize and implement guaranteed behaviour in our systems. This chapter is also different from many of the others, in that we learn about this system the way that most programmers do: by exploring it in its completed form, rather than building it up in increments.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Programmers often encounter some sort of persistent data store very early in
their careers. There is nothing quite as opaque and mysterious as a full-featured
database system, regardless of the underlying paradigm used for data
organization.

In this chapter, Yoav shows how using immutability as a guiding principle
allows us to build an astonishingly capable database system in fewer than 500
lines of Clojure. This includes a transaction system, a miniature declarative
query language, and a host of other features.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

In this chapter, Carl teaches us that we don't need to know how to write a compiler just to experiment with producing different semantics. He also teaches us a bit about test-driven development along the way.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Donald Knuth once said that the best theory is motivated by practice, and the best practice by theory. In this chapter, Dustin does something that many programmers have to do at least once in their career: he implements a rather opaque algorithm from primary sources (PAXOS) that solves a very practical problem (distributed consensus). He then demonstrates how sound engineering practices allow us to implement something that at first seems slightly beyond our capability, and to convince ourselves that it actually works.

As usual, if you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Many programmers treat their programming language implementations as black
boxes. Static analysis tools or language analysis tools can seem
indistinguishable from magic, and good treatments of these topics have been
buried deep in academic curricula that few of us make time for.

In our opinion, Leah has done a terrific job of demystifying the nature of the
techniques used in static analysis and introspection in her chapter. While her
work is specific to Julia, the lessons learned are
applicable in other languages. Enjoy!

If you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

We'd also like to take this opportunity to remind our readers that this book
would not have been possible without the financial support of
PagerDuty. Please take this opportunity to let them
know if you've found this work useful.

We hope that this chapter provides Python programmers with a deep introduction
to how coroutines work, and why and when we should use them. Coroutines first
became a hot topic in Python with the release of the asyncio framework; now, in
Python 3.5, they are also built directly into the language itself.

Beyond Python, there has been a longstanding tension between
thread-based and
event-based
systems in computer science. It can be helpful to a practising programmer to
understand the nuances in this debate, as anyone building things atop
frameworks that use these constructs will eventually have to understand them.

We will have chapters on building thread-based and event-based systems in "500
Lines or Less", and we think that this chapter will serve as an interesting
contrast to both of these contributions in the final draft of the book.

If you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

When Audrey's chapter was first presented to readers, many protested at how
dependent it was on external frameworks and libraries. This is only half of the
truth: instead of just leveraging these components, Audrey provides a miniature
ontology for the kinds of patterns that web programmers can expect to use now
and in the future. (For the curious, there is also a 99-line version that uses
only DOM
APIs.)

Most engineers spend a lot of time figuring out how to build software based on
the work of others. We feel that reading about how an experienced engineer
identifies the patterns in software she is reusing is a valuable lesson for us
all.

This chapter was written way back in 2014 when Traceur was the leading ES6
compiler implementation. Audrey has informed us that she probably would have
used Babel if she were writing this chapter now. If
you are interested in updating the source and the chapter to reflect this,
we're open to pull requests!

If you find errors you think are worth reporting, please open an issue on our
GitHub tracker.

Audrey also helpfully translated her chapter into Traditional Chinese; you can
find the source for this version
here.

It has been a long time coming, but I'm happy to announce that the first
artifact from 500 Lines or Less will be published on aosabook.org today.

Writing a chapter for 500 Lines is a rigorous process. The code for
each chapter is first reviewed by technical reviewers; then, the chapter drafts
are reviewed by different readers for content and accessibility. Finally, our
heroic copy editor Amy Brown examines each line of text for grammatical errors
or tricky phrasings. And yet, we're sure there are still bugs in this book
waiting to be found.

Just as it is very expensive to fix a bug once it hits production, it is
similarly painful to find an error in a book once it has already gone to print.
To this end, we're going to be doing an early-access release of each chapter on
the 500Lines website at the rate of one per week for the next 20 weeks. We hope
that getting these chapters out early will give all of you the chance to find
the snags that we missed throughout this process. We expect to go to print very
shortly after this process winds down.

These early-access chapters will only be published from this blog, and will be
linked from the news section on
aosabook.org and announced on Twitter from
@aosabook. The production version of the
website will be published around the same time as the final draft goes to
print.

With all of that out of the way, I'm happy to share Ned Batchelder's template engine chapter
with you. Ned's chapter is a great example of what we were hoping to
accomplish with 500Lines: it is compact, well-explained, and there are several
discussions of interesting decisions and trade-offs he had to make while writing
it.

If you find errors you think are worth reporting, please open an issue on our
GitHub tracker. We'll try our
best to collate issues sent to us on Twitter and via other channels, but we
have a lot of book to work on still, and we're probably going to miss things
unless they're persisted in that tracker somehow.

Many thanks again to
PagerDuty for their generous
sponsorship of this project. Please give them a
public thank-you or two throughout this
early-access release for their contribution.

We're very happy to announce that PagerDuty will be
sponsoring the production of "500 Lines or Less",
the fourth book in the Architecture of Open Source Applications series.

The AOSA books are published and distributed on a not-for-profit basis, with
all of our proceeds going to Amnesty International. However, there are always labor
costs associated with the production of each book. With PagerDuty's support, we
will be able to continue to donate all of the book's proceeds while still
covering these costs.

PagerDuty is well-known for its strong engineering culture, and we feel that
they are the perfect partner to support the kind of work we're doing with AOSA.

This volume is called 500 Lines or Less, and is focused on short but complete
implementations of canonical programs and architectural patterns in computer
science.

The motivation is described succinctly in the project statement:

Every architect studies family homes, apartments, schools, and other common
types of buildings during her training. Equally, every programmer ought to
know how a compiler turns text into instructions, how a spreadsheet updates
cells, and how a browser decides what to put where when it renders a page.
This book's goal is to give readers that broad-ranging overview, and while
doing so, to help them understand how software designers think.

Each chapter will consist of a (max) 500 line, self-contained program, and a
narrative surrounding that program that describes how it works, and (more
importantly) why it is designed the way it is.

Contributors are already working on
their 500 line implementations. Once each contributor completes a "first draft"
of their code, we assign them technical reviewers to provide constructive
commentary.

That's where we need your help.

There are a lot of contributors to this project, and we want them to get as
much feedback as possible before they begin writing up their chapters!

We're looking for reviewers at all levels of experience. If you're still early in
your programming career, we'd especially love to hear from you, as your
opinions as to which parts are easily accessible and which are confusing will
be incredibly helpful.

In terms of scheduling, our first-draft code submissions are due by the end of February, but
many contributors are ahead of schedule and could use reviewers right now!

We're also hoping that our code reviewers will stick around to provide
commentary on the drafts of completed chapters once those start to roll in from
March onwards.

All technical reviewers will of course receive credit for their work in the
book, and will be supporting a good cause -- all royalties from paid-for
versions of the book will go to Amnesty International.

The Performance of Open Source Applications is now available for the Kindle
through
Amazon.
As with the other electronic versions, this one costs US$5.99, with royalties
going to Amnesty International.

If you don't have a Kindle, see the Buy page
for more information on available formats.

I'm happy to announce that The Performance of Open Source Applications is finally out.
You can read it on our
main site or
buy a paper copy from Lulu for US$25.00.
As with the previous two books, all royalties go to Amnesty International.

Thank you to everyone who contributed, and everyone who cheered us on.
I hope you learn as much from the authors as I did.

Releng 2013 is a one-day workshop (co-located with ICSE) to
bring together release engineers and researchers to discuss the
challenges in release engineering and develop areas for further
research. Areas of discussion include research and practice of all
activities in between regular development and actual usage of a software
product by the end user, i.e., integration, build, test execution,
packaging and delivery of software. The conference is May 20 in San
Francisco and the deadline for submission of talks is February 7th. For more
information see http://releng.polymtl.ca.

One of the reasons we started this project is that almost every other
book on software architecture doesn't actually show readers the
architectures of any actual systems. In particular, most university
students can get through a four-year degree without ever seeing how
large applications are put together. We're therefore very excited to
learn that Dr. Neil Ernst is using AOSA in an undergrad course he's
teaching at the University of British Columbia. If you'd like to see
what his students think of the systems we've described, their
presentations are online. And if you know of anyone else doing this,
please send us a pointer.

Well, it's been a busy two months since Tavish or I have posted anything
on here. Things have been progressing nicely though! We've got 21
authors lined up, and have already gone through 13 proposed outlines.
I'm extremely excited about the proposed chapters so far! We've got a
great mixture of low-level nitty-gritty projects and higher level
applications.

Here are a few more examples:

Ilya Grigorik is going to be discussing the insane number of hoops
that the Chrome team has jumped through to improve Chrome's network
performance.

Clint Talbert and Joel Maher are going to be recounting the journey
of improving Talos, the performance testing framework at Mozilla.

You provide a wonderful workspace where millions of people can share
what they make, but there's something you could do to make yourself even
better. Could you please provide a way for us to plug
applications directly into your site the way Facebook does? Thousands of
really cool software tools are just waiting to be discovered, from code
analyzers to project history visualizers, and while your APIs already
let them use your data on their own web sites, people would be much more
likely to find them and use them if they were nested directly in
projects' GitHub pages. It wouldn't be much of a technological
challenge—given the way Facebook's stock has been doing, you could
probably even persuade one or two of their engineers to come and help
build it—and it would kickstart the same healthy explosion of
third-party tools we saw when Eclipse first came on the scene.

John Hunter, the creator of matplotlib and a contributor to Volume 2
of this series, passed away on August 28 as a result of complications
arising from treatment for cancer. A memorial fund has been set up
to help with his three children's education; please give generously.

A few weeks ago Greg posted about the next book we're doing: The
Performance of Open Source Applications. Well, we don't have a book
yet, but we've made some progress. Earlier this week we had our 15th
"yes" from an author, which puts us close to the chapter counts of
AOSA. We're excited about that and we hope you are too. Here are a
couple of the chapters we're planning:

Jessica McKellar on Twisted again, getting into the details of
asynchronous I/O.

Audrey Tang on profiling and optimizing Ethercalc, a Google
Docs-esque spreadsheet application written in node.js.

Kyle Huey on Mozilla's Memshrink project, which aims to make
Firefox faster and "slimmer".

Rosangela Canino-Koning and Eric McDonald on processing terabytes of
data in the Khmer project, which is a "library and toolkit for
doing k-mer-based dataset analysis and transformations".

Remember, if you want to see your favourite Open Source Application
discussed in the new book, we're accepting contributions. If you're
interested, please get in touch at posabook@gmail.com. If you don't
want to write a chapter yourself, why not send someone our way?

We are pleased to announce that we are starting work on a third book in
this series, which will be titled The Performance of Open Source
Applications. Each chapter will discuss a performance issue in a real
open source system—it could be an over-the-shoulder view of how a
performance problem was fixed, a discussion of how design decisions
affected performance in a particular application, or something else
along those lines. Each entry will be 12-15 pages long, and we hope to
have first drafts by the end of October so that we can publish the book
in Spring 2013. As with AOSA, royalties will go to Amnesty International
and the book will be available for free online under a Creative Commons
license. If you are interested in participating, please contact us at
posabook@gmail.com.

Why performance rather than architecture? Because it's something that
every programmer has to deal with eventually, but which is usually left
out of their education. The last general book on making programs fast
that we know of was Jon Louis Bentley's Writing Efficient
Programs, which was published thirty years ago. There have been lots
of more specialized books since (we're particularly fond of Steve
Souders' High Performance Web Sites and Even Faster Web
Sites, and of John Lakos's Large-Scale C++ Software Design),
but we think the time is right for something that touches on everything
from squeezing the last few cycles out of every precious milliwatt in an
embedded sensor to maximizing throughput of large-scale e-commerce
applications. We hope you'll think so too, and we look forward to
hearing from you.

Amy Brown, our indefatigable editor, has posted a series of articles on
her blog about what's involved in going from LaTeX to a published book
on Lulu. We think this is a great alternative to traditional (read:
expensive and restricted) academic publishing, and would welcome other
tips and tricks.