The points below highlight some of the more noticeable changes this month. Elements and Zine are the products mainly affected by this month’s changes.

How to update

You typically don’t need to update your converted resources if you’re only interested in behavioural fixes. You can replace the js/ and css/ directories of any existing publication to update.

Changes

Added support for YouTube’s short url variant (youtu.be) (Desktop Publisher)
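To illustrate what handling the short variant involves, here is a minimal Python sketch that extracts the video id from both the long and the short URL form. It is not the Desktop Publisher's implementation, and the function name youtube_video_id is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def youtube_video_id(url):
    """Return the video id from a long or short YouTube URL, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short variant: the id is the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        # Long variant: the id is carried in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

Both forms resolve to the same id, so a viewer only needs one embed path once the id has been extracted.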

This release mainly contains smaller corrections and improvements. We’re getting this release out a little earlier than usual because iOS 11 is due to be released in the next 48h and we found some optimisations we wanted to include before then. We have seen some issues in iOS 11 related to how background images are loaded and unloaded in the Safari browser, and we have made optimisations accordingly. These issues mainly affected the Zine viewer.

How to update

You don’t need to republish existing publications to update if you’re mainly interested in the adjustments for iOS 11. You can simply replace the existing “FlowPaperViewer.js” file in the js/ directory of any publication or installation you have created with the new version. You can grab the minified version of FlowPaperViewer.js here.
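If you maintain many publications, the replacement described above can be scripted. The following is a minimal Python sketch, assuming every publication lives somewhere under a single root folder and has the viewer at js/FlowPaperViewer.js. The function name update_viewer and the .bak backup suffix are hypothetical choices, not part of FlowPaper.

```python
import shutil
from pathlib import Path


def update_viewer(new_viewer, publications_root):
    """Copy new_viewer over every js/FlowPaperViewer.js found under publications_root."""
    updated = []
    # sorted() collects all matches before any file is written.
    for target in sorted(Path(publications_root).rglob("js/FlowPaperViewer.js")):
        # Keep a backup next to the original before overwriting it.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
        shutil.copy2(new_viewer, target)
        updated.append(target)
    return updated
```

Keep the downloaded file outside the root folder so it is not picked up as a target itself.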

Changes

Cosmetically improved the 3D page turn effect (Zine)

The vertical reflow template will now scroll on devices with a display width larger than 500px (Elements)

It’s now possible to deep link into reflowable vertical publications using the #section hash (Elements)

You may have noticed that text selection can behave erratically when selecting text in your PDF reader (such as Adobe Reader). This is because text in a PDF document is not necessarily stored in logical reading order. You can see an example of this behaviour in Figure 1.

Figure 1

So why is this a problem? It’s a problem for two reasons. Firstly, it means that Google will most likely index the text in your document in the wrong order. Sentences are in some cases broken, and as a result you may not be getting the search hits you’re after for your content.

The second problem is that people who use screen readers won’t be able to read the documents properly. Text will most likely be read out in the wrong order.

Fixing the reading order automatically

The typical solution when dealing with reading order problems is to use something called PDF tagging. This requires you to go through the entire document and tag each piece of text, marking which text should follow it in reading order.

Using machine learning, FlowPaper is able to reconstruct the logical reading order of your documents without any manual tagging work. Single column layouts, multiple column layouts, you name it. FlowPaper does this by analysing the layout of the page in much the same way as the human eye recognises a page and its reading order.
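To make the idea of layout-based ordering concrete, here is a minimal Python sketch. It is not FlowPaper's machine learning approach: it is a plain geometric heuristic that groups text blocks into columns by their left edge and then reads each column top to bottom. The Block type, the reading_order function and the column_gap threshold are all hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    x: float      # left edge of the text block
    y: float      # top edge, measured downwards from the top of the page
    text: str


def reading_order(blocks, column_gap=50.0):
    """Order blocks column by column (left to right), top to bottom within a column."""
    columns = []
    for block in sorted(blocks, key=lambda b: b.x):
        # Start a new column when the left edge jumps by more than column_gap.
        if columns and block.x - columns[-1][-1].x <= column_gap:
            columns[-1].append(block)
        else:
            columns.append([block])
    ordered = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y))
    return [b.text for b in ordered]
```

A heuristic like this breaks down on headings that span several columns, sidebars and captions, which is the kind of case a learned layout analysis is meant to handle.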

How to fix the reading order of your PDF

Make sure you have the desktop publisher installed. It can be downloaded from our public download page.

Figure 2

Firstly, start up the desktop publisher and select the “Elements – Slide” viewer in the top right corner, as seen in Figure 2.

Make sure the “Improved Accessibility” checkbox is ticked in the “Accessibility & SEO” section.

You can now go ahead and import your PDF. Make your desired adjustments to the style and click “Publish” at the top.

Verifying the Results

So how can we check that the reading order has been corrected?

You could let a screen reader read it out if you have one installed. You can also check the text order manually if you know your way around Chrome a little bit.

To use Chrome to check that the elements are in the correct order, open the publication in Chrome using “View Offline Version” from the desktop publisher and open the Chrome dev tools from the View->Developer->Developer tools menu.

What you will see now is all the HTML5 elements that the desktop publisher has created when converting your PDF document. An easy way to check the reading order is just to step through the elements like in the animation below.
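The same check can be done outside the browser. The sketch below uses Python's standard html.parser module to list the text of a saved HTML page in document order, which is the order in which the elements appear in the markup. The class and function names are hypothetical, and the sketch assumes the text is present in the static HTML rather than injected by script.

```python
from html.parser import HTMLParser


class TextOrder(HTMLParser):
    """Collect non-empty text nodes in the order they appear in the markup."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.texts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Ignore whitespace-only nodes and anything inside script/style.
        if text and not self._skip_depth:
            self.texts.append(text)


def text_in_document_order(html):
    parser = TextOrder()
    parser.feed(html)
    parser.close()
    return parser.texts
```

Printing the returned list and reading it top to bottom shows whether sentences follow each other as they would when read on the page.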

Voila! Please let us know if you have any questions regarding reading order or how it can be used in other scenarios!

We’re pleased to announce our latest release of FlowPaper, featuring a number of enhancements and improvements especially around our Desktop Publisher. Version 3.1.0 will be rolled out during the next 24-48h and will appear in the commercial download archive accordingly.

The main changes in this release can be seen below:

Improved the font loading in the vertical reflow template (Reflow)

Now displaying source page when hovering over text while in edit mode in reflowable publications (Desktop Publisher)

The desktop publisher now allows multiple domain names to be entered at the same time for Creative and Creative Team license holders. Separate the domains with a semicolon (;) to generate keys for multiple domains at the same time (Desktop Publisher)

Fixed an issue where the reflowable mode would fail on texts that were filled with patterns (Desktop Publisher)

Fixed an issue where some image resources were incorrectly cleared when switching from 10 pages to all in Zine (Desktop Publisher)

Fixed an issue where links/videos/images added to two-fold publications would be duplicated in certain scenarios (Desktop Publisher)

Fixed an issue where two-fold Elements publications were not being adjusted when reopening them (Desktop Publisher)

Fixed an issue where Unicode characters were incorrectly unescaped in the Zine bookmark list (Desktop Publisher)

Fixed an issue where differently sized pages would default to their unadjusted dimensions in some scenarios, causing the annotation marks to be mispositioned (Classic)

Fixed an issue where the TOC background color wasn’t being set when using Zine in portrait mode (Desktop Publisher)
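For the multiple domain entry mentioned in the list above, the input format is simply a semicolon-separated string. The following Python sketch shows one way such a string could be split into individual domains. It only illustrates the input format: the function name split_domains is hypothetical, and key generation itself is not shown.

```python
def split_domains(value):
    """Split a semicolon-separated domain list, dropping blanks and duplicates."""
    domains = []
    for part in value.split(";"):
        domain = part.strip().lower()
        if domain and domain not in domains:
            domains.append(domain)
    return domains
```

For example, split_domains("example.com; www.example.com") gives the two domains as separate entries.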

Summer is running at full throttle in the northern hemisphere, and so are we down under, working to get a fresh new build out to you as you come back from your summer holidays! This release features major updates, particularly to the reflow viewer.

A new reflow template has been added, and new supervised and unsupervised machine learning algorithms are in place to help with the reflowing of publications. Following this release, there will be multiple updates to the reflow functionality over the next few months as we fine-tune and adjust these algorithms.

Changelog

Huge overhaul of the reflow functionality with new machine learning algorithms in place for better text and layout analysis (Desktop Publisher)

New template using a vertical layout introduced for reflowed publications (Desktop Publisher)

Fixed an issue where the bookmarks were not resolved initially causing some of the reflowable documents to not use them correctly (Desktop Publisher)

We’re in the process of implementing new machine learning (AI) algorithms for our desktop publisher which we hope will help a lot in creating responsive publications from PDF documents. The results we’re seeing so far are very encouraging. We have been working hard under the hood on getting these in place since the start of the year, and we had a big breakthrough in late April when we took a new approach to how documents can be republished using unsupervised and supervised algorithms.

As we’re in the midst of this, we have decided to delay the next release until the end of July or the beginning of August. If you are waiting for something you have reported to be fixed and would like a pre-release, you are welcome to flick us an email and we’ll help you out.

We’re happy to announce our latest version of FlowPaper with a number of great updates. This month we have been focusing on providing some clarity on how FlowPaper can improve search ranking compared to a normal PDF. FlowPaper Elements has had a number of improvements around search indexing, and this version greatly improves the way Google and other search engines are able to discover sections and headers in your publications. You can try the new SEO features in Elements by ticking the “Improved SEO” checkbox under behaviour.

Changes in this release

In order to improve SEO, Elements is now using h1/h2/h3 tags instead of div tags for headers (Elements)

FlowPaper Elements is now detecting and creating a table of contents for publications if they do not have one specified in the PDF (Elements)

Zine now checks the proportions of the available area when deciding on viewing mode. If the viewer is very narrow but tall, it will display a single page as opposed to two pages (Zine)

Improved indexing (SEO) for Elements publications so that Google and other search engines are now able to index each section separately (Desktop Publisher)
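The proportion check for Zine described in the list above can be sketched as a simple aspect ratio test. This is a minimal Python illustration and not the viewer's actual logic. The function name choose_view_mode and the min_ratio threshold are hypothetical, since the real cut-off is not stated.

```python
def choose_view_mode(width, height, min_ratio=1.0):
    """Return "two-page" when the area is wide enough for a spread, else "single-page"."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # A spread needs a landscape-like area; a narrow, tall area shows one page.
    return "two-page" if width / height >= min_ratio else "single-page"
```

Basing the decision on proportions rather than on width alone means a tall, narrow embed on a large screen still gets a readable single page.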

This is the first blog post in a series in which we explore how Google ranks PDF documents versus web content created using our upcoming version (version 3.0.1) of FlowPaper Elements. We are going to be completely transparent about how we set up our tests and why we think Google prefers publications created with FlowPaper Elements over the PDF, so that you can test and verify the results yourself.

Setting up the test

We decided that we wanted to explore how Google ranked a PDF versus the same content published as HTML5 content. To do this we staged a little test. We created a blog post with links to 5 different publications. Each publication was converted using FlowPaper Elements and listed with its corresponding PDF document next to it. You can still see the blog post here. It looks like this:

The following assumptions were made around how Google would treat these links:

Google would treat the PDF and the FlowPaper publication equally on a domain name basis since both are hosted under flowpaper.com. The FlowPaper Elements publication is actually hosted under online.flowpaper.com, but according to Google itself, subdomains are treated the same as subdirectories. Please see this YouTube link on this.

Both links were added with absolute positions in the blog post to avoid Google ranking one better than the other because it appears first in the layout.

Google would treat both equally on a file name basis since both had the same file name.

Results & Analysis

We allowed a bit more than a week to pass before starting to collect results. We then decided to do 3 different tests to see how Google found content within these links: main title, sub title and body text searches. Below are the results of our findings. Note that we appended “site:flowpaper.com” (in italic) to restrict searches to our own domain in case the same publication would appear elsewhere.
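If you want to repeat searches of this kind, the queries can be built programmatically. The sketch below is a minimal Python example that appends the site: restriction to a phrase and URL-encodes the result. The function name site_search_url is hypothetical, and the sketch makes no assumption about whether the phrases in our tests were quoted.

```python
from urllib.parse import urlencode


def site_search_url(phrase, site="flowpaper.com"):
    """Build a Google search URL for phrase, restricted to one site."""
    query = f"{phrase} site:{site}"
    return "https://www.google.com/search?" + urlencode({"q": query})
```

Opening the returned URL in a browser runs the same restricted search as typing the phrase followed by site:flowpaper.com.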

Main title searches

FlowPaper was able to outrank the PDF in every case that we tried for main title. Here are the title searches we performed for the publications:

So how does a PDF define a title compared to a title in FlowPaper? Well, PDFs typically do not contain metadata marking which text is a title, so a main title in a PDF is just text in a larger font that typically appears on the first pages of a document. A main title in FlowPaper Elements, on the other hand, is an actual header tag (typically an H1 tag) as seen in the screen shot below.

Why is this important? Because according to Google, titles are relevant when matching a page to a search. FlowPaper Elements makes sure that headers are real headers and that they match the title of the publication.

Section title searches

Now that’s all fine, you might say, because the PDFs may or may not have well-defined titles in their text content, so that’s a relatively easy thing to beat the PDF on. Well, how about titles in sections? Titles in sections should rank high in a PDF too, but there is one major difference in how we treat sub titles and how a PDF treats them. Google claims that having too many titles on the same page would be considered crud. Since a PDF contains all titles in the same document, it’s quite natural to think that a header that appears further down in a document would not get the same search relevance as one at the very top. Let’s have a go and see what happens in our test. We performed the following sub title searches (sub titles marked in bold):

FlowPaper was able to outrank the PDF documents in each and every case of sub title searches we made. How is it possible? Well, just as with the main title, FlowPaper defines each sub section of a publication with a title using a proper HTML5 tag. It also exports each section into its own HTML page and sets the title of the HTML page to correspond with each section as seen in the screen shot below.
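The idea of one HTML page per section, with the page title matching the section title, can be sketched as follows. This is a minimal Python illustration and not FlowPaper's actual export: the function names, the file naming scheme, the choice of an h1 tag and the generated markup are all assumptions made for the example.

```python
from html import escape


def section_page(title, body_html):
    """Return a standalone HTML page whose <title> and <h1> both carry the section title."""
    safe_title = escape(title)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{safe_title}</title></head>\n"
        f"<body><h1>{safe_title}</h1>\n{body_html}\n</body></html>"
    )


def export_sections(sections):
    """Map each (title, body_html) pair to its own page, keyed by a numbered file name."""
    return {
        f"section-{index}.html": section_page(title, body)
        for index, (title, body) in enumerate(sections, start=1)
    }
```

Because every section becomes its own page, each sub title sits at the top of a page instead of deep inside one long document.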

Body Text Searches

So far so good, so how about body text? A PDF and a FlowPaper publication contain the same body text, so these shouldn’t rank differently, right? Well, there is one major difference that we noted briefly in the previous section. A PDF document contains all body text in one long page-by-page structure, whereas FlowPaper splits the document into sections. This means that body text in a sub section would appear higher up in a FlowPaper publication than in a PDF. Let’s see what results we’re getting. The body text fragment we searched for is marked in bold.

2016VacationGuideIndexTest as with most things on the beach site:flowpaper.com (see screen shot)

FlowPaper was able to outrank the PDF in 4 of 5 cases. Test number 2 did not return any Google result from the FlowPaper publication at all (only from the PDF). Whether this was a random fluctuation, or Google for some reason decided not to index that body text, is not yet known. It could be that it will appear in a few weeks.

Conclusions

We have shown that FlowPaper does indeed have the capacity to improve search ranking for your PDF documents while providing accessibility and loading speed that far exceed anything a normal PDF document can deliver. In our next part we will look at how to prevent Google and other search engines from indexing and saving your content using FlowPaper Elements. Keen to get your hands on our upcoming version? Contact us via email and we’ll send you a pre-release!