Programming data for display: the PDF story

At the 2017 Papers We Love Conf, a previously scheduled
speaker fell ill. With just 90 minutes to go before the vacant slot, the
organizers asked me if I could fill in. I didn't have any sort of appropriate
talk prepared, but given my long history working with PDF documents, I thought
I'd be able to put together a reasonably entertaining presentation on the
history, heritage, and design decisions that led to the PDF file format and
specification while living up to the high standards and expectations of the
Papers We Love community.

I was so relieved that the result was well-received!

The buildup

I said at the top that I had just 90 minutes to prepare the talk. How that went
down is a good story…

Since I was volunteer staff at the Papers We Love Conf, I heard pretty early
that one of the scheduled speakers was ill. At first there was hope that she
would be able to recuperate enough to present, but by noontime, that had
evaporated. Darren Newton, one of the lead
organizers for the event, asked me if I'd be able to fill in. I
said "yes", but it wasn't a done deal yet: Zeeshan
Lakhani, the other lead organizer, was
working on maybe convincing a Strange Loop keynote speaker to take the spot.
While that process churned along, I told Darren I'd be back in a little while,
as I had a video hangout with my family planned for lunchtime.

At some point, Darren thought I'd simply gotten cold feet and disappeared, so
there was a period of high anxiety for him and Zeeshan, especially once
the Strange Loop keynote speaker eventually declined to fill in. I
made my way back down to the conference space after my family call
and got some lunch. There followed a flurry of Twitter DMs and then a confirming
conversation with Darren about the topic I had in mind.

I then rushed back to my hotel room to snag my laptop, and sat down at the
conference swag table (of all places) just before 2:00pm to figure out exactly
what I'd be talking about. To settle my nerves, I asked no one in particular
whether someone could procure a beverage; David Ashby (another
conference volunteer) heard the call and showed up with an old fashioned in a
coffee cup 5 minutes later. I plugged in my headphones, and settled in for the
most intense ~90 minutes of outlining and brainstorming and googling and slide
preparation of my life. The result was never going to be "done", or exactly what
I wanted, but around 3:25pm, I sidled up to the podium and gave what I could.

The Papers We Love Conf site has a page for the
talk that includes video of the result,
along with its abstract and references.

The slides for the talk (such as they are, given how little time went into
preparing them!) are available as well.

Though Papers We Love talks are generally motivated by one or many academic
papers that are influential in their field or that the speaker personally finds
illuminating or inspiring, that was not true in this case. Since the most common
page description languages — PostScript and PDF, both of which I discuss in the
talk — were developed and refined within commercial organizations during a
period when it was rare for such organizations to publish findings, there sadly
aren't any published papers to love. Thus, their history is far less
well established than that of other common and important technologies.

Though there are a handful of internal corporate memos that
provide a window into the motivations of the engineers at the time, our best
source of information on the development of page description languages, and
PostScript and PDF in particular, has been passed along via narrative histories.
Those were the sources I relied upon most in forming the talk, and I've simply
internalized much of their content over many years of working with PDF documents.

Reception and revision

After I delivered the talk, and throughout the rest of the week at Strange
Loop (with which PWLConf was co-located), I had
numerous conversations with people who had seen the talk, or heard about it
later (perhaps because the snap preparation is a pretty good story). I came away
surprised on a few fronts:

People really enjoyed the talk! Perhaps because I've been working
with PDF documents for so long, I didn't expect such an enthusiastic reaction
from a "general" audience (even among software professionals).

There is a deep desire among many to better understand the things we use and
rely upon every day. In computing, PDF is absolutely one of those things,
since it is used so pervasively not only for publishing (i.e. as
electronically distributable paper), but for data interchange in the most
important and sensitive domains. The latter is obviously of great importance
to me, to us here at PDFDATA.io, and to our customers
that — for better or worse — rely upon PDFs as a data source…but again, I
underestimated the broader level of interest.

With this in mind, I am newly resolved to dig deeper into the history and
heritage of PDF, which I plan to publish in future posts here and disseminate as
widely as I can through future events. As exciting as the future of computing
might be, it is incredibly important that we have a solid grounding in the why
and how of the present state of computing, and PDF is both a big part of that
and a reflection of it.