Layout-8000 Roadmap

by Richard J. Cichelli, President of SCS

Where are newspapers headed?

Newspapers are designed and manufactured. There was a time, not that long ago and prior to the rise of Google, when owning a newspaper was delightfully profitable. Now, not so much. This has led to the sale of independent newspapers to ever larger newspaper groups.

These newspaper corporations are looking for efficiencies in design and production. They have consolidated IT services, picking common systems for use by their business units. They have often co-located their computing technology to gain further cost reductions. Newspaper design centers (NDCs) have been set up to create display ads. Ad building can be done with off-the-shelf commodity desktop graphic design tools, like Adobe InDesign, QuarkXPress, MultiAd Creator, etc. Often ad building has been outsourced to services in low-wage countries.

Centralized servers with databases are used for ad tracking and production workflow. This roadmap will show how a well-engineered, server-based ad dummying system can play a pivotal role in improving the efficiency of a newspaper design center.

Spending too much time with dummies?

Dummying newspapers involves fitting rectangles onto bigger rectangles, i.e., taking the space for each display ad and allocating it to a position on the pages of an edition. Manual dummying looks like having fun playing Tetris. It is not that simple. There are many complex constraints involved in doing it well.

Manual processes don't scale. Critical expertise known only to certain individuals doesn't scale. What does scale is a server-based computing architecture providing a knowledge database and highly automated services.

Consider a design center serving a large newspaper group, say one with 100 publications produced daily. All these need to be dummied. One might expect that there are 25 layout operators working throughout the group designing these publications. Do the math. With each operator doing four products in an eight-hour day, that's two hours per product.

With a well-engineered dummying system, designing two products per hour per operator, instead of spending two hours per product, is an achievable goal. That's four times the productivity.
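
The arithmetic can be checked in a few lines. This is only a sketch of the figures quoted above; the eight-hour shift is an assumption implied by "two hours per product", and the variable names are illustrative.

```python
# Figures from the text: 100 daily products, 25 operators.
# The 8-hour shift is an assumption implied by "two hours per product".
PRODUCTS_PER_DAY = 100
OPERATORS = 25
SHIFT_HOURS = 8  # assumed

products_per_operator = PRODUCTS_PER_DAY / OPERATORS              # 4.0
manual_hours_per_product = SHIFT_HOURS / products_per_operator    # 2.0
manual_rate = 1 / manual_hours_per_product                        # 0.5 products/hour

target_rate = 2.0  # products per hour per operator (the stated goal)
speedup = target_rate / manual_rate                               # 4.0

# Operator-shifts needed to dummy the same 100 products at the target rate.
operators_needed = PRODUCTS_PER_DAY / (target_rate * SHIFT_HOURS)  # 6.25

print(f"speedup: {speedup:.0f}x, operators needed: {operators_needed}")
```

At the target rate the same 100 products take about a quarter of the operator time, which is the whole point of the exercise.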

How can a newspaper design center achieve greater dummying productivity?

There are a number of issues that need to be considered in building a scalable, efficient dummying platform that can be deployed not just in one design center, but industry wide:

1) What services should be provided before, during and after dummying.
2) What site-specific expertise should be moved to the NDC and how.
3) What programming and deployment strategies are needed to make an appropriate system suitable for NDCs and an entire industry.
4) What innovative technologies are needed for task optimization.

What services should be provided before dummying?

To dummy a product one needs to know what the edition will look like and the set of ad insertions that will go in it. Insertion orders describe ads, or, more specifically, space requests. These come from a front-end advertising management system (AMS). AMSs support the sales, order entry and accounting functions.

Edition designs specify what the products are to look like. If you think of edition designs as being in Edition Design Files (EDF), there are likely to be many of them tailored to various publications. Edition designs have parts that are relatively constant for all editions of a product. There are also design constraints and policies that vary with each product, e.g., desired ad news ratios, the size of the sections and the paper as a whole.

EDFs can be thought of as programs or specifications for combining the ads into a product. They are design templates.

One complexity in designing newspapers is that there is not a one-to-one correspondence between what is designed and what is printed on the press. Successful designing must accommodate products with multiple variants, called zones. These are best designed all at once, so that corresponding common pages line up. In contrast, what eventually goes on the press are complete zones, one at a time. They may include multiple designs, such as when a tab section is part of a broadsheet paper. Typically they have different column measures.

Press limitations, especially with regard to the color availability on older presses, are particularly difficult to deal with.

A dummied paper is a blueprint for the assembly of the product by the pagination system.

So dummying systems sit as middleware between front-end advertising and newsroom systems and back-end pagination systems. To be suitable for design center (and industry-wide) deployment, the middleware needs to be able to accept insertion orders (preferably in near real time) from a multiplicity of business units and their individual systems, store them for retrieval, and then extract, transform and load them into the dummying engine.

Technology that scales for NDC use offers support for multiple data protocols for insertion orders, including XML, JSON, CSV and fixed-field formats.
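
A minimal sketch of what "multiple data protocols" might mean in practice: one small parser per format, each producing the same normalized record. The field names (ad_id, columns, inches), the XML element name (insertion) and the fixed-field column positions are all hypothetical; every real AMS interface defines its own.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

FIELDS = ("ad_id", "columns", "inches")  # hypothetical common record shape


def _normalize(raw):
    """Coerce one raw record (strings or numbers) to the common shape."""
    return {
        "ad_id": str(raw["ad_id"]).strip(),
        "columns": int(raw["columns"]),
        "inches": float(raw["inches"]),
    }


def from_json(text):
    return [_normalize(r) for r in json.loads(text)]


def from_csv(text):
    return [_normalize(r) for r in csv.DictReader(io.StringIO(text))]


def from_xml(text):
    root = ET.fromstring(text)
    return [_normalize({f: el.findtext(f) for f in FIELDS})
            for el in root.iter("insertion")]


# Hypothetical fixed-field layout: ad_id in 0-7, columns in 8-9, inches in 10-15.
FIXED_SLICES = {"ad_id": slice(0, 8), "columns": slice(8, 10), "inches": slice(10, 16)}


def from_fixed(text):
    return [_normalize({f: line[s] for f, s in FIXED_SLICES.items()})
            for line in text.splitlines() if line.strip()]


PARSERS = {"json": from_json, "csv": from_csv, "xml": from_xml, "fixed": from_fixed}


def load_insertions(fmt, text):
    """Dispatch on the declared format and return normalized insertion orders."""
    return PARSERS[fmt](text)
```

Whatever the source format, the dummying engine only ever sees the normalized record, so adding a new front-end means adding one parser, not touching the engine.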

Of course there are situations where front-ends do not supply appropriate ad attributes in their interface files. Some files lack essential information. Others present not attributes, but instructions, usually as text commands. Dealing with this automatically requires a named entity recognizer, one which can translate "RHP" into a right-hand page placement request attribute. Further, the ability to programmatically examine ad images to find out if they are in color, have coupons, are reverse ads, are about selling tires, etc. can help bridge the gap.
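
To make the "RHP" example concrete, here is a deliberately simple sketch: a keyword lookup, which falls well short of a real named entity recognizer. The command vocabulary and attribute names are hypothetical. Tokens it does not recognize are returned so an operator (or a smarter recognizer) can deal with them.

```python
import re

# Hypothetical command vocabulary; each front-end uses its own abbreviations.
COMMANDS = {
    "RHP": ("page_side", "right"),
    "LHP": ("page_side", "left"),
    "SPORTS": ("section", "sports"),
    "BUSINESS": ("section", "business"),
}


def extract_attributes(instruction):
    """Turn a free-text placement instruction into (attributes, unknown_tokens)."""
    attrs = {}
    unknown = []
    for token in re.findall(r"[A-Za-z]+", instruction.upper()):
        if token in COMMANDS:
            key, value = COMMANDS[token]
            attrs[key] = value
        else:
            unknown.append(token)
    return attrs, unknown


attrs, unknown = extract_attributes("rhp, sports pls")
print(attrs)    # {'page_side': 'right', 'section': 'sports'}
print(unknown)  # ['PLS']
```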

Another way to supplement what AMSs provide is to use historic data. How this might be achieved will be discussed later.

There are several services that the middleware can provide which can help facilitate sales. One is to support premium space reservation management. Using this service allows sales reps to sell preferred locations to advertisers willing to pay extra for guaranteed space. Deploying this service on the internet allows sales reps to cross sell products for multiple business units.

Another is to help manage standby ad sales. Using this service allows sales reps to offer advertisers a lower cost way to have their ads run. Standby ads are run on a space available basis. Cross selling standby ads should also be possible. Running them instead of fillers is not just a new revenue source; it can also make for better-looking pages.

When dealing with dozens of products, it is helpful for there to be a centralized set of applications that provide management reporting for both sales (ad request distribution reporting) and production (page tracking) to all staff via the internet. Well run newspapers have dummies everywhere.

Having a space inventory and standby ad support can accelerate sales.

The dummying system's middleware must provide facilities for automating both pre- and post-dummying tasks for NDCs.

Server-based technology easily outperforms desktop solutions for these tasks.

What services should be provided during dummying?

Dummying newspapers on an industrial scale is neither an art nor a craft. It requires a multi-product view and all the automation that computer aided design technology can bring to the task.

One of the things you do when designing a newspaper is placing display ads. In fact, it is usually what you do first when designing an edition. Sell ad space, tightly fit the rectangles onto pages and use the leftover space for news: "All the news that fits, they print."

You could use Tetris as a model for an ad dummying system, but as many newspapers have found, systems based on this model don't easily scale to the needs of NDCs. There, efficiency concerns are paramount.

Manual dummying may be fun for operators, but it falls short if you are trying to save labor costs. It does have one advantage over smarter technologies. Take such a system out of the box and you can immediately dummy with it as you would with an electronic pencil, just like you did with a graphite one. However, having a person do something a machine can do better dehumanizes the person.

Being able to dummy multiple products at once yields more flexible production timing. (No more waiting for something for one product, just start working on another concurrently.) With this comes better workflow automation. Dummying is on the critical path to pagination. Anything that improves its efficiency improves overall production efficiency.

Automatic dummying is key to efficient design.

What is automated dummying?

Let's say you have a list of ad records and a set of page thumbnails. You drag and drop an ad onto a page. Instead of requiring you to re-position the already placed ads to accommodate the added one, the software rearranges them automatically. That's part of auto-dummying.

There may be a style requested for the page. A pyramid toward the edge means that the ads are arranged in stair step fashion. Bigger ads will be in the bottom outside corners, smaller ones on top. (It's not good form to bury an ad under others. They should touch news as well.) Ads will align with columns and should not span the tops of multiple ads. Besides pyramid styles, there are thousands of others made up of combinations of style factors.
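
One simplified reading of a pyramid style can be sketched as follows. It assumes a single row of bottom-aligned ads stepping down from the outside (here, right) edge, with positions counted in columns and inches from the top of the page. Real styles stack ads and combine many more factors; the function name and record layout are illustrative only.

```python
def pyramid_right(ads, page_cols, page_depth):
    """ads: list of (ad_id, cols, inches).

    Returns {ad_id: (left_col, top_inch)} with the deepest ad in the bottom
    right corner and shallower ads stepping down toward the left.
    """
    placed = {}
    right = page_cols  # next free column boundary, moving inward from the edge
    for ad_id, cols, inches in sorted(ads, key=lambda a: a[2], reverse=True):
        if cols > right or inches > page_depth:
            raise ValueError(f"{ad_id} does not fit on the page")
        left = right - cols
        placed[ad_id] = (left, page_depth - inches)  # bottom-aligned
        right = left
    return placed


layout = pyramid_right([("a", 1, 5), ("b", 2, 10), ("c", 1, 7)],
                       page_cols=6, page_depth=21)
print(layout)  # {'b': (4, 11), 'c': (3, 14), 'a': (2, 16)}
```

Because every ad sits on the bottom edge in this simplified version, no ad is buried and each one touches news above it.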

Designing an entire product requires scaling up from individual pages to an entire edition. One good strategy is to deal with sets of pages by their content. Advertisers may request that their ads be put on sports, business, main news, etc. pages. Such requests correspond to the editorial "desks". Advertisers wish their ads to be among certain news content. Similarly, they often prefer that ads with products like theirs be away from their ads. (I.e., "Don't put another's tire ad on a page with my tire ad.")

So dummying the sports pages together might be a good strategy. To minimize the cases where advertisers don't get their requested placement, it is useful to dummy the most restrictive situations before others. This is called targeted dummying.
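
The ordering idea behind targeted dummying can be sketched as "most constrained first": count the pages each ad could legally go on and handle the ads with the fewest choices before the rest. The data shapes below (a section request per ad, a section per page) are assumptions for illustration; a real system weighs many more constraints.

```python
def targeted_order(ads, pages):
    """ads: {ad_id: requested_section or None}; pages: {page_no: section}.

    Returns ad ids sorted so that ads with the fewest eligible pages come first.
    """
    def eligible_count(ad_id):
        wanted = ads[ad_id]
        return sum(1 for section in pages.values()
                   if wanted is None or section == wanted)

    # Ties are broken by ad id so the order is reproducible.
    return sorted(ads, key=lambda ad_id: (eligible_count(ad_id), ad_id))


pages = {1: "news", 2: "news", 3: "sports", 4: "business", 5: "business", 6: "business"}
ads = {"x": None, "y": "sports", "z": "news", "w": "business"}
print(targeted_order(ads, pages))  # ['y', 'z', 'w', 'x']
```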

Think of an edition with 40 pages and 200 ads. (You wish.) Misplace 10 percent of the ads, i.e., 20 of them. How many pages are you likely to have made a mess of? Not 4, i.e., 10 percent of 40. The probable answer is close to 20 pages, i.e., every page where an ad was misplaced. You might as well do it manually.

Computer scientists would think of dummying as solving an instance of the difficult problem of 2-dimensional bin packing.
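
For readers who have not met bin packing, here is a sketch of one textbook heuristic: shelf packing with the tallest ads first. It is named plainly as a stand-in, not as the method any production dummying system uses, and it ignores every placement rule except "the rectangles must fit". Positions are (column, inches from the top of the page).

```python
def _try_place(page, ad_id, cols, inches, page_cols, page_depth):
    # Try existing shelves first. Ads arrive tallest first, so a later ad
    # is never taller than a shelf opened before it on the same page.
    for shelf in page["shelves"]:
        if shelf["next_col"] + cols <= page_cols and inches <= shelf["height"]:
            page["placed"].append((ad_id, shelf["next_col"], shelf["top"]))
            shelf["next_col"] += cols
            return True
    # Otherwise open a new shelf below the existing ones, if depth allows.
    used = sum(s["height"] for s in page["shelves"])
    if used + inches <= page_depth:
        page["shelves"].append({"top": used, "height": inches, "next_col": cols})
        page["placed"].append((ad_id, 0, used))
        return True
    return False


def shelf_pack(ads, page_cols, page_depth):
    """ads: list of (ad_id, cols, inches). Returns one list of
    (ad_id, col, top) placements per page used."""
    pages = []
    for ad_id, cols, inches in sorted(ads, key=lambda a: a[2], reverse=True):
        if cols > page_cols or inches > page_depth:
            raise ValueError(f"{ad_id} is larger than a page")
        for page in pages:
            if _try_place(page, ad_id, cols, inches, page_cols, page_depth):
                break
        else:  # no existing page had room, so start a new one
            page = {"shelves": [], "placed": []}
            pages.append(page)
            _try_place(page, ad_id, cols, inches, page_cols, page_depth)
    return [page["placed"] for page in pages]
```

A heuristic like this is fast but gives no guarantee of using the fewest pages, which is exactly why the problem is considered difficult.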

Are there other constraints? Many. To take a description of a press and compute the color availability is called press impositioning. Like other parts of the dummying problem, this, too, is an NP-hard problem. NP stands for nondeterministic polynomial time, and an NP-hard problem is at least as hard as the hardest problems in NP. No efficient (polynomial-time) exact algorithm is known for any of them, so they are some of the most difficult problems to solve.

Is doing impositions automatically important? The layout person calls the pressroom foreman and says "I'm designing a 40 page paper with 12 full-page color ads. Where can I run them?" The answer requires very specialized knowledge. (Even the units where inks are to be used in the sequence of press runs.) If this expertise is only in the head (as it often is) of the pressroom foreman, this is not a good thing. Consider that he or she might be the union representative. BTW - Presses run at night and dummying is usually done during the day shift. You could end up paying considerable overtime to cover both.

Artificial intelligence technology can come to the rescue.

Artificial intelligence and newspaper design

It was recently reported that a way one could understand artificial intelligence (AI) was to look at Google's auto-completion of queries. I've been an AI researcher since the late sixties, when I was heavily involved in writing chess programs. Auto-completion? Really?

I query Google with "Van" and it auto-completes "Vanguard login". I'm not surprised, since I type "Van" nearly once every day. But then I tried to give the auto-complete analogy the benefit of the doubt. So auto-completion looks like recording behavior (a user's query set), matching a new query to partial strings of the stored set of queries and then predicting the new query. Well, it's almost this. It needs one more operation for the AI part. One needs to improve its predicting performance over time. To do this, add the eventual new query to the query set, weight its importance and use that knowledge to predict the next similar query. (In the early seventies, pre-Alta Vista, a smart new information retrieval system predicted a query of "porn" when the query entered was unintelligible or empty.)

The feedback loop is the machine learning behavior. Machine learning is a sub-field of AI. When you appreciate this you realize that with AI you aren't really building smart machines, just machines that exhibit smart behavior.
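
The record, match, predict and feed-back cycle described above fits in a few lines. This is a toy sketch under stated assumptions: the weighting is a plain usage count, and the class and method names are illustrative, not any search engine's actual design.

```python
from collections import Counter


class QueryPredictor:
    """Prefix-based query completion that learns from what users finally type."""

    def __init__(self):
        self.counts = Counter()  # stored query -> how often it was entered

    def predict(self, prefix):
        matches = [(n, q) for q, n in self.counts.items() if q.startswith(prefix)]
        if not matches:
            return None
        # Highest count wins; ties are broken alphabetically for determinism.
        best = min(matches, key=lambda m: (-m[0], m[1]))
        return best[1]

    def record(self, query):
        # The feedback step: the eventual query updates the stored weights.
        self.counts[query] += 1


p = QueryPredictor()
for q in ("vanguard login", "vanguard login", "van gogh"):
    p.record(q)
print(p.predict("van"))  # vanguard login
```

Record "van gogh" twice more and the prediction for "van" changes. Nothing in the machine got smarter; its behavior did.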

Why did the dummying algorithm/heuristic end up placing an ad where it did? Having a full trace offers a better understanding of the dummying logic.

AI techniques can help with both getting ad attributes and selecting templates. As operators dummy products, either manually or automatically, recording both the decisions and the context in which they are made becomes part of the growing knowledge base of dummying preferences. As new products are dummied, what happened with them is used to adjust derived preferences. This feedback loop becomes a tool for predicting dummying decisions. In short, it gets smarter.

How smart? As the knowledge base grows, manually-made designs are compared to automated ones computed in the background. Automated design can be trusted to take over when the differences are found not to matter.
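
One way to picture the comparison is an agreement score per edition plus a hand-over rule. Everything specific here is an assumption: exact position matching is a stricter test than "differences that do not matter", and the threshold and edition count are placeholder values a site would tune.

```python
def agreement_rate(manual, automated):
    """Each argument maps ad_id -> (page, col, top).

    Returns the share of manually placed ads that the automated design
    put in exactly the same position.
    """
    if not manual:
        return 1.0
    same = sum(1 for ad_id, pos in manual.items() if automated.get(ad_id) == pos)
    return same / len(manual)


def can_automate(history, threshold=0.95, min_editions=20):
    """history: agreement rates for past editions, oldest first.

    Hand over to automation only after enough consecutive recent editions
    agree closely. Both parameters are hypothetical defaults.
    """
    recent = history[-min_editions:]
    return len(recent) >= min_editions and min(recent) >= threshold


manual = {"a": (1, 0, 0), "b": (1, 3, 0)}
auto = {"a": (1, 0, 0), "b": (2, 3, 0)}
print(agreement_rate(manual, auto))  # 0.5
```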

Having a dummying knowledge base eases the transition from manual business unit based dummying to NDC automated dummying. The local expertise about advertisers' style preferences, etc. is saved in the system. This achieves an important corporate goal of making specialized expertise widely available.

Systems that support automated dummying are expert systems. Expert systems are another sub-branch of AI.

Good Old Fashioned Artificial Intelligence (GOFAI).

Dummying display ads, paginating classified sections, designing news layouts and computing press impositions are examples of applied GOFAI. As with chess programming, getting results in well less than a second makes for a good interactive user experience. The common challenge is to examine a large set of possibilities by trial and error searching quickly and effectively. This is called backtracking. Even the most pleasing screen layouts of page thumbnails can be found by backtrack search.
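
Backtracking is easy to show on a toy page. The sketch below assumes a page divided into a grid of cells and ads measured in whole cells; it tries each position in turn, and when a later ad cannot be placed it undoes the earlier choice and tries the next one. It illustrates the search technique only and carries none of the placement rules a real system enforces.

```python
def backtrack_place(ads, page_cols, page_rows):
    """ads: list of (ad_id, cols, rows).

    Returns {ad_id: (col, row)} for a layout with no overlaps, or None if
    no such layout exists.
    """
    grid = [[None] * page_cols for _ in range(page_rows)]
    result = {}

    def fits(c, r, w, h):
        return all(grid[y][x] is None
                   for y in range(r, r + h) for x in range(c, c + w))

    def mark(c, r, w, h, value):
        for y in range(r, r + h):
            for x in range(c, c + w):
                grid[y][x] = value

    def solve(i):
        if i == len(ads):
            return True  # every ad has a position
        ad_id, w, h = ads[i]
        for r in range(page_rows - h + 1):
            for c in range(page_cols - w + 1):
                if fits(c, r, w, h):
                    mark(c, r, w, h, ad_id)
                    result[ad_id] = (c, r)
                    if solve(i + 1):
                        return True
                    mark(c, r, w, h, None)  # undo and try the next position
                    del result[ad_id]
        return False

    return result if solve(0) else None


# Ad "b" is first tried at (1, 0), which blocks "c"; the search backs up.
print(backtrack_place([("a", 1, 1), ("b", 1, 1), ("c", 1, 2)], 2, 2))
# {'a': (0, 0), 'b': (0, 1), 'c': (1, 0)}
```

The worst case grows exponentially with the number of ads, so practical systems order the candidates well and prune hopeless branches early, much as chess programs do.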

Some newspapers have very explicit styles. Perhaps they want page A3 to have three 2 column by 6 inch high ads across the bottom with a column of one column ads stacking up the right edge. It helps to have a pattern language (or domain specific language) for specifying such patterns. The dummying routines have to sort through the possibilities to make such special shapes.

There is no shortage of interesting problems to solve when doing computer aided design of newspapers.

What services should be provided after dummying?

The results of the dummying process are exported to the pagination subsystem. Commodity desktop applications such as QuarkXPress and Adobe's InDesign are well suited for newspaper assembly. The dummying system provides the geometry in machine readable form and interfaces to these products to construct the framework for the print edition.

Layout staff are designers, not paginators. Design isn't construction, nor should it be thought of as such. Layout staff need to be aware of both the advertisers' and editorial department's needs and the publisher's policies. If they do more than sketch, they do too much. The interface technology should map the design into pagination construction instructions. Columns, which are counted for dummying, are mapped into measured positions for pagination. (Sometimes newspapers change their column measure, such as when doing a web reduction. This should be the concern of the interface, not the designers.)
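
The column-to-measure mapping is simple arithmetic, sketched here in points. The function name, the 0-based column numbering and the sample widths are assumptions for illustration. The key property is that a change of column measure changes only the numbers passed in, never the dummied design.

```python
def column_to_points(first_col, span, col_width_pt, gutter_pt, left_margin_pt=0.0):
    """Map a dummied ad (0-based starting column, columns spanned) to a
    measured (x, width) in points for the pagination system."""
    x = left_margin_pt + first_col * (col_width_pt + gutter_pt)
    # An ad spanning several columns also covers the gutters between them.
    width = span * col_width_pt + (span - 1) * gutter_pt
    return x, width


# The same dummied ad (columns 2-3) before and after a hypothetical web reduction.
print(column_to_points(2, 2, col_width_pt=100.0, gutter_pt=10.0))  # (220.0, 210.0)
print(column_to_points(2, 2, col_width_pt=90.0, gutter_pt=10.0))   # (200.0, 190.0)
```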

While there are popular commodity and proprietary tools for newspaper pagination, new free open source technology may soon challenge them. Scribus is the current leading FOSS application for print publication pagination. For developers wishing to automate and optimize newspaper pagination, Scribus offers numerous advantages. Python is its scripting language and its internal document format is plain text in XML. It is wide open for programming.

Interfaces should be available for all pagination platforms.

The output files from dummying systems drive other production subsystems such as those for managing tearsheets, paper checking, design approvals and the monitoring of the production of ads, pages, operator and system performance, etc.

Many key performance indicators are best computed from the data produced within the dummying system.

Programming and deployment strategies which support NDCs.

Inventing software technology is fun. Invention alone isn't sufficient to sustain a business. That requires innovation. Innovation is where invention yields revenue. Revenue only comes with deployment. The goal is to deploy durable technology that allows the fun to continue.

Which do you think is more agile? An architecture built around a monolithic database that holds everything you might know about an ad or advertiser, or one that is built to provide microservices that know exactly what each needs to know to deliver a particular service?

A microservice architecture allows capabilities to be delivered and debugged quickly. Supporting a microservice architecture requires an enterprise service bus, something that securely supports the transfer of data among services. A side benefit of such architecture is that the services, being independent, can be deployed on multiple smaller, less expensive hardware platforms. Roll-outs can be done in chunks for both services and equipment.
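
The shape of the idea can be shown with an in-process stand-in for a service bus. This is a sketch only: a real enterprise service bus adds security, persistence and network transport, none of which appear here, and the topic and service names are hypothetical.

```python
class ServiceBus:
    """Minimal publish/subscribe hub: services know topics, not each other."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of handler callables

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        """Deliver message to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = ServiceBus()
reserved, reported = [], []

# Two independent services, each keeping only the data it needs.
bus.subscribe("insertion.received", lambda m: reserved.append(m["ad_id"]))
bus.subscribe("insertion.received", lambda m: reported.append((m["ad_id"], m["columns"])))

bus.publish("insertion.received", {"ad_id": "A1", "columns": 2, "inches": 6.0})
print(reserved, reported)  # ['A1'] [('A1', 2)]
```

A new service is enabled by subscribing it to the topics it cares about; nothing already deployed has to change.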

A microservice architecture allows customers to enable services as they are ready for them. Developers can get new releases into customer sites faster. It's a better way to manage a growing, complex, evolving system. And it is the key to providing new value for customer support payments.

==================================

SCS is a leader in providing applications used in newspaper design centers. These are used by 5 of the 10 largest newspaper companies throughout their 560 business units. These alone have over 12.5 million subscribers. SCS's installed user base of newspapers is over twice this large.