What’s Next for PDF

A couple of months ago I started hearing rumors about a new version of PDF, often mysteriously referred to as PDF “next.” As it happened, the PDF Days Europe 2017 conference in Berlin on May 15 had an entire track dedicated to this topic. Could PDF.next be the easy-to-output-to, easy-to-distribute, low-cost digital publishing format that so many of us are seeking? I decided to go to Berlin to find out.

As soon as I arrived, I was reminded of the diverse market that PDF must serve. This was a hardcore technical conference with sessions on accessibility, PDF as a service, PDF/X, PDF/A, PDF/UA, and much more. But I, like many others, was there to learn about PDF.next!

Happy quarter-century, PDF

First, it’s worth remembering that the PDF format was invented 25 years ago, when laptops were in their infancy, and cell phones didn’t even have screens. PDF was all about reliable printing and exchanging documents between Macintosh and Windows computers.

Today, people want to access their information “from watch to wall”—on screens large and small. Mobile phones now generate over 50% of web traffic. There are many people whose only computing device is a phone. I think we can all agree: for small screens, PDF is not a good format. At best on small screens PDF forces us to pinch and zoom in order to read. Furthermore, PDF is “where data goes to die.” The original document may contain all kinds of rich tables, charts, and spreadsheets, but once it is exported to PDF, only the visual appearance remains. There is no way for readers to sort table columns, view alternate graph types, or change data ranges.

PDF 2.0

PDF Days 2017 was organized by the PDF Association, which exists to “promote the adoption and implementation of International Standards for PDF technology.” After nine years of work, they are about to submit the specifications for PDF 2.0 to the ISO.

PDF 2.0 is an evolutionary, not revolutionary, update to the PDF 1.7 specification. It adds some new capabilities that will eventually be useful for printing, color management, and accessibility. You can learn more about PDF 2.0 here.

But the real juicy talk at the conference was about PDF.next, which leverages some of the new features in this updated specification, but lies “beyond” PDF 2.0.

PDF.next

PDF.next is all about making PDFs relevant in the era of mobile devices with small screens. What if a PDF could reflow and adapt to these screens, presenting alternate views depending on the type of device or size of screen the viewer is using? In other words, what if a PDF could be “responsive”?

You’ve probably heard of “Tagged PDF,” and you may know that with enough patience and skill you can create a tagged PDF from InDesign. Today, the primary reason for doing so is to make the file accessible to people with vision impairment, since screen-reading devices need this tag information to specify reading order, information hierarchy, and alternate text that describes images.

InDesign’s Edit All Export Tags command allows you to assign PDF tags to Paragraph styles (top). These tags can then be viewed in Acrobat (bottom).

The idea behind PDF.next is to add more complete, accurate tagging (often referred to as “semantic information”) to PDFs so they can be transformed on-the-fly into reflowable, responsive HTML for reading on small screens.

Maybe someday in the future, when a user opens a PDF.next document with an old reader or browser, they will see the pixel-perfect page rendition they are used to now. But when a user opens the same PDF in a modern PDF reader on a mobile device, the reader software will use the tags and structure in the PDF to transform that information into HTML. CSS will be supplied either by the document creator or by the PDF reader software, so that the HTML reflows and looks great on the smaller screen.
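To make the tag-to-HTML step a little more concrete, here is a minimal Python sketch of what such a transformation might look like. It is only an illustration, not how any actual PDF reader or the PDF.next proposal works. H1, P, L, LI, and Figure are standard PDF structure tags, but the mapping to HTML elements, the StructElem class, and the to_html function are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from html import escape

# Hypothetical mapping from PDF structure tags to HTML elements.
# The tag names are standard PDF structure types; the mapping itself
# is an assumption for illustration only.
TAG_TO_HTML = {
    "Document": "article",
    "H1": "h1",
    "H2": "h2",
    "P": "p",
    "L": "ul",
    "LI": "li",
}


@dataclass
class StructElem:
    """A simplified stand-in for one node of a tagged PDF's structure tree."""
    tag: str
    text: str = ""
    alt: str = ""  # alternate text, used here for Figure elements
    children: list = field(default_factory=list)


def to_html(node: StructElem) -> str:
    """Recursively turn a structure tree into reflowable HTML."""
    if node.tag == "Figure":
        # Carry the alternate text over so the image stays accessible.
        return f'<figure><img alt="{escape(node.alt, quote=True)}"></figure>'
    element = TAG_TO_HTML.get(node.tag, "div")  # unknown tags fall back to <div>
    inner = escape(node.text) + "".join(to_html(c) for c in node.children)
    return f"<{element}>{inner}</{element}>"


doc = StructElem("Document", children=[
    StructElem("H1", "What's Next for PDF"),
    StructElem("P", "Tags & structure drive the reflow."),
    StructElem("Figure", alt="Sales chart"),
])
print(to_html(doc))
```

The CSS that makes this HTML look good on a small screen would then come from the document creator or from the reader software, as described above.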

The benefits of PDF.next

In this scenario, the user would get all the benefits of a traditional PDF: an easy-to-distribute, document-centric, single file that is backward compatible, easy to print, and can be validated, signed, and protected if desired. These would be combined with the read-anywhere, accessible, responsive benefits of HTML and CSS.

An “on-the-fly” conversion of InDesign-generated PDF content to reflowable HTML using the early proof-of-concept tool ngpdf.com by Dual Lab.

Another interesting idea is embedding the data for a chart or table in a PDF. Someday, in PDF.next, that data could be referenced and called upon when the HTML/CSS representation of the file is viewed. So a chart that appears static in the PDF page rendition could come alive in the HTML rendition with sortable table rows, range sliders, or interactive charts.
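As a hedged sketch of the idea, suppose the data behind a table were embedded in the PDF as a small JSON payload. Nothing here reflects an actual PDF.next format: the payload layout, the sample values, and the render_table function are all invented for illustration. The point is simply that once the data travels with the file, the HTML rendition can re-sort it on demand instead of showing a frozen picture of a table.

```python
import json
from html import escape

# Hypothetical embedded payload; the structure and the values are made up.
EMBEDDED_DATA = json.loads("""
{
  "columns": ["Region", "Sales"],
  "rows": [["North", 120], ["South", 340], ["East", 95]]
}
""")


def render_table(data: dict, sort_by: str, descending: bool = False) -> str:
    """Build an HTML table from the embedded data, sorted by one column."""
    col = data["columns"].index(sort_by)
    rows = sorted(data["rows"], key=lambda row: row[col], reverse=descending)
    head = "".join(f"<th>{escape(name)}</th>" for name in data["columns"])
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# The same data, presented two different ways at the reader's request.
print(render_table(EMBEDDED_DATA, sort_by="Region"))
print(render_table(EMBEDDED_DATA, sort_by="Sales", descending=True))
```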

Adobe and the future of PDF

Keep in mind that Adobe no longer “owns” the PDF format. As of 2008, PDF is an ISO standard. So while Adobe is certainly interested in the development of PDF, they are no longer the sole driver of future development. They are only one of many players. Leonard Rosenthal from Adobe is on the PDF Association board, and he represents Adobe in the ISO. (Leonard will be on hand at PePcon: The Print + ePublishing Conference 2017 to discuss the future of PDF.)

When are PDF 2.0 and PDF.next coming?

PDF 2.0 should become an ISO standard within the next few weeks. (Remember that it took about nine years to get this far!) But the rest of this stuff (the PDF.next stuff) is a long way off. There is a lot that needs to happen before we see the benefits. At the very least, we need robust tools that help us accurately tag InDesign files with the proper tagging structure. Then, we need PDF reader software that can transcode those tags into HTML and display it on small screens. Many of us will also need education in creating structured documents from InDesign and editing CSS.

All of these things may or may not happen. Leonard Rosenthal says, “We are taking our time and doing this right.” If all these pieces fall into place, we could have the best of both worlds: robust PDFs that adapt to any type of reading situation. Stay tuned!

Keith Gilbert

Keith Gilbert is a digital publishing consultant and educator, Adobe Certified Instructor, Adobe Community Professional, conference speaker, lynda.com author, and contributing writer for various publications. His work has taken him throughout North America, Africa, Europe, and Asia. During his 30 years as a consultant, his clients have included Adobe, Apple, Target, the United Nations, Best Buy, General Mills, Lands' End, and Medtronic. Follow him on Twitter @gilbertconsult and at blog.gilbertconsulting.com.

15 Comments on “What’s Next for PDF”

Marvelously good news? Maybe or maybe not. Depending on who guides its direction, PDF.next might step in where epub is failing so dreadfully, not progressing beyond HTML circa 1995. I have given up on that standards group. They’re uber-geeks obsessed with metadata and minutiae rather than making ebooks look attractive. If cars had taken a similar path, we’d still be hand-cranking to start them.

Here are some of my suggestions:

1. Smart formatting, and by that I mean not merely having a PDF contain coding that converts it to HTML. That simply will not work with books. People scroll down webpages, so HTML is fine for that. But for numerous practical reasons, they page through ebooks and will refuse to do anything else. Any scheme that allows long documents to display on devices with various screen sizes cannot be like a browser. Not ever! It has to be smart enough to take a document and turn it into a series of attractive, well-laid-out pages. It must handle page breaks, tables, and graphics well. It must create attractive book pages from a flow of text not one long flow like websites. That’s obvious, but few seem to realize that.

2. Add human intervention to that automatic smartness. Allow designers to specify layout features that affect how page elements move around, roughly what Adobe tried back in the pre-CC era. Give designers the ability to say, “Place this picture at the top of the next screen” and the like. And give them the ability to do things that print can never do, such as accordion text. I grow tired of hearing morons—yes, morons—talk of ebooks bringing a brave new world of publishing and yet continue to turn out ebooks that have virtually no features that print books don’t have. I often wonder if their goal for ebooks is to make them “Like print books only uglier.”

There’s not enough space here to give the specifics, but years ago I read a marvelous article about how long it took factories to adapt to the differences between steam-powered factories and ones powered by electric motors. I see something similar here, but with two factions of non-adapters, both trying to make the future like some element of the past.

1. Those who want ebooks to continue to be like books. Those are the ones who’re keeping ebooks from having a rich new set of features. If it can’t be done with a book, they care not if an ebook should be able to do it. I suspect the executives at large publishers are in that camp. Making ebooks more powerful would mean spending additional money on ebook editions.

2. Those who want ebooks to be more like webpages. I’ve already mentioned the scrolling versus paging issues that creates. But these are also the idiots who want to clutter ebooks with multimedia. Never mind that every effort to do that has bombed in the marketplace. Indeed, in the late 1980s, I was on the fringes of a Microsoft effort to do that with CDs. Nothing came of it. When people want to read, they want to read.

Do you know what these groups should do? They should look at FrameMaker. It is brilliantly designed to smoothly reflow technical documents that might run to thousands of pages. Add a couple of paragraphs on page 381, and FM will automatically reflow the pages that follow. A table that was on page 516 may now be on 518, but it will be at the top of the page as intended.

There’s only one glitch with FM. It assumes a certain page size while intelligently flowing text to display properly on those pages. What’s needed is an FM-Plus that can also do that cleverly directed reflowing as page/screen sizes change.

And yes, as you might have guessed, I created books with FM before shifting to ID. There are FM features I miss.

With PDF 2.0, I’d say a new era has started, one which might replace the e-pub market. Personally, I just hate e-pubs (they are not so attractive) and therefore don’t like reading an e-pub book. If I have to read a book on screen, I go with the PDF version.

I also agree with Mike Perry and would like to quote it here again:
“It has to be smart enough to take a document and turn it into a series of attractive, well-laid-out pages. It must handle page breaks, tables, and graphics well. It must create attractive book pages from a flow of text not one long flow like websites.”

Jean-Renaud, yes, Adobe Reader has had a “reflow” command for a long time. There are two main problems with it, however. First, all too often it doesn’t render the page in any kind of readable way. Sometimes it works quite well, other times not so much.

Second, neither the author of the original PDF nor the reader of the PDF has any control over the appearance of this “reflow” view of the document. The idea with PDF.next is that the author and the reader could control the appearance through embedded CSS and/or on-screen controls.

Thanks, Keith, for a very good article and summary of the PDF Days agenda. I was lucky to be able to participate in that event as well (it was awesome to meet you, Keith!)

I just wanted to comment on Michael W. Perry’s message. I think the people who are designing PDF.next are already thinking about the challenges you mentioned. If I understood correctly, pagination is part of the new responsive PDF as well. The idea is not to use HTML just to re-create the layout so that pages are merged together into one scrollable HTML page. Instead, I think there will be something like what ePub viewers already do, but maybe with a little bit more author control.

Actually, there will be lots of similarities between PDF.next and the next version of ePub. Leonard showed us a comparison table between the planned features of PDF.next and ePub4, and most of the implementations were almost identical.

And there was quite a lot of room for so-called “author design control” in those concepts as well. So I believe there’s no intention to make all the layout renditions purely automatic. I think there will be all kinds of rule-based “liquid layout” possibilities to control how the different renditions are made.

And Jean-Renaud, this is definitely not fake news. I can assure you that if the presented concept sees the light of day, it will be much, much more than just reflowing the pages.

But as Keith mentioned, lots of work has to be done before regular authors can start to produce these wonderful new PDFs, and before regular end users can start to consume them. But I really do hope we will see that day in the next few years.

Just please keep the legacy features that we print/book lovers like and use—still good for sharing files with folks who work in different programs, on different platforms, have different fonts, etc. Too often new versions forget their roots! (I have been creating PDFs since the old days of service bureaus.)

Hi Keith, great article. How do you think PDF.next will affect PDF forms? Specifically, I’m looking to make a text field with a fixed font size and have the field expand with as much text as someone wants to enter and have the content after it move down to accommodate the expansion, much like you can do with a Word or LiveCycle form now. I know you said it’s a long way off, but I can’t tell you how many times it’s been asked for around here at TR ;-)