Here’s How Indexing Could Evolve with Ebooks

Last month I shared some thoughts about how indexes seem to be a thing of the past, at least when it comes to ebooks. I’ve given more consideration to the topic and would like to offer a possible vision for the future.

Long ago I learned the value an exceptional indexer can bring to a project. For example, there’s a huge difference between simply capturing all the keywords in a book and producing an index that’s richly filled with synonyms, cross-references, and related topics. And while we may never be able to completely duplicate the human element in a computer-generated index, I’d like to think value can be added via automated text analysis, algorithms, and all the resulting tags.

Perhaps it’s time to think differently about indexes in ebooks. As I mentioned in that earlier article, I’m focused exclusively on non-fiction here. Rather than a static compilation of entries in the book I’m currently reading, I want something that’s more akin to a dynamic Google search.

Let me tap a phrase on my screen and, by all means, show me the other occurrences of that phrase in this book, but let’s also make sure those results can be sorted by relevance, not just by the order in which they appear in the book. Why do the results have to be limited to the book I’m reading, though? Maybe that author or publisher has a few other titles on that topic or closely related topics. Those references and excerpts should be accessible via this pop-up e-index as well. If I own those books I’m able to jump directly to the pages within them; if not, these entries serve as a discovery and marketing vehicle, encouraging me to purchase the other titles.
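To make the idea a bit more concrete, here is a minimal sketch of what such a lookup might do. Everything in it is an assumption for illustration: the `Passage` structure, the function names, and especially the toy relevance score (occurrences of the phrase per word of the passage), which stands in for whatever ranking a real reader app would use.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    book: str       # title of the book the passage comes from
    position: int   # reading order within that book
    text: str


def find_occurrences(phrase, passages):
    """Return passages containing the phrase, case-insensitively."""
    needle = phrase.lower()
    return [p for p in passages if needle in p.text.lower()]


def relevance(phrase, passage):
    """Toy score (an assumption): occurrences of the phrase per word."""
    words = passage.text.split()
    if not words:
        return 0.0
    return passage.text.lower().count(phrase.lower()) / len(words)


def e_index_lookup(phrase, passages, owned_books):
    """Rank hits by relevance; label each as openable or preview-only."""
    hits = find_occurrences(phrase, passages)
    ranked = sorted(hits, key=lambda p: relevance(phrase, p), reverse=True)
    return [
        (p.book, p.position, "open" if p.book in owned_books else "preview")
        for p in ranked
    ]


# Hypothetical two-book library; the reader owns only "Book A".
library = [
    Passage("Book A", 1, "Indexing is mentioned here in a long passage about many other things."),
    Passage("Book A", 2, "Indexing basics: indexing terms and indexing rules."),
    Passage("Book B", 1, "Ebook indexing differs from print."),
]
results = e_index_lookup("indexing", library, owned_books={"Book A"})
for book, position, access in results:
    print(book, position, access)
```

The densest passage comes first regardless of where it sits in the book, and the hit from the unowned title shows up as a preview, which is where the discovery and marketing angle comes in.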

This approach lends itself to an automated process. Once the logic is established, a high-speed parsing tool would analyze the content and create the initial entries across all books. The tool would be built into the ebook reader application, tracking the phrases that are most commonly searched for and perhaps refining the results over time based on which entries get the most click-throughs. Sounds a lot like one of the basic attributes of web search results, right?
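The refinement step could be sketched roughly as follows. This is only an illustration under simple assumptions: the class name, the in-memory click counter, and the rule "most-clicked entries first" are all hypothetical, and a real reader app would need something far more robust.

```python
from collections import defaultdict


class ClickThroughRanker:
    """Reorders index entries for a phrase by how often readers follow them."""

    def __init__(self):
        # clicks[phrase][entry] -> number of times that entry was followed
        self.clicks = defaultdict(lambda: defaultdict(int))

    def record_click(self, phrase, entry):
        self.clicks[phrase.lower()][entry] += 1

    def rank(self, phrase, entries):
        """Most-clicked entries first; ties keep their original order."""
        counts = self.clicks.get(phrase.lower(), {})
        return sorted(entries, key=lambda e: -counts.get(e, 0))


# Hypothetical usage: three entries for the phrase "metadata".
ranker = ClickThroughRanker()
entries = ["ch. 2", "ch. 5", "ch. 9"]
ranker.record_click("metadata", "ch. 5")
ranker.record_click("metadata", "ch. 5")
ranker.record_click("metadata", "ch. 9")
ranked = ranker.rank("Metadata", entries)
print(ranked)
```

Because Python's sort is stable, entries nobody has clicked yet simply stay in the order the parsing tool first produced.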

Note that this could all be done without a traditional index. However, I also see where a human-generated index could serve as an additional input, providing an even richer experience.

How about leveraging the collective wisdom of the community as well? Provide a basic e-index as a foundation but let anyone contribute their own thoughts and additions to it. Don’t force the crowdsourced results on all readers. Rather, let each consumer decide which other members of the community add the most value and filter out all the others.
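Here is one way the opt-in filtering might look, as a small sketch. The function name, the `(contributor, entry)` pairs, and the per-reader `trusted` set are assumptions made up for the example.

```python
def visible_entries(base_entries, contributions, trusted):
    """Return the base e-index plus additions from contributors this reader trusts.

    contributions: list of (contributor, entry) pairs from the community.
    trusted: set of contributor names this particular reader has chosen.
    """
    merged = list(base_entries)
    for contributor, entry in contributions:
        if contributor in trusted and entry not in merged:
            merged.append(entry)
    return merged


# Hypothetical data: the reader follows "ana" but not "bob".
base = ["DRM, 12", "EPUB, 30"]
community = [
    ("ana", "EPUB 3 accessibility, 31"),
    ("bob", "random note, 5"),
    ("ana", "DRM, 12"),  # duplicate of a base entry, so it is skipped
]
shown = visible_entries(base, community, trusted={"ana"})
print(shown)
```

A reader who trusts nobody sees only the base e-index, so the crowdsourced layer is never forced on anyone.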

This gets back to a point I’ve made a number of times before. We’re stuck consuming dumb content on smart devices. As long as we keep looking at ebooks through a print book lens we’ll never fully experience all the potential a digital book has to offer.

Joe Wikert is Publishing President at Our Sunday Visitor (www.osv.com). Before joining OSV Joe was Director of Strategy and Business Development at Olive Software. Prior to Olive Software he was General Manager, Publisher, & Chair of the Tools of Change (TOC) conference at O’Reilly Media, Inc., where he managed each of the editorial groups at O’Reilly as well as the Microsoft Press team and the retail sales organization. Before joining O’Reilly Joe was Vice President and Executive Publisher at John Wiley & Sons, Inc., in their P/T division.

Joe, you’re definitely on to something here when you talk about a “high-speed parsing tool” which would create initial index entries. The problem with traditional, standalone print-book indexing is that the terms are limited to the author’s usage, which may be very idiosyncratic. The resulting index is “rich,” but only as far as that single book goes. If we instead started with standardized index terms created with a parsing tool, the indexer could create the necessary connections (cross-references) between the generalized terms and the particular usage of the author. The result, an index that is “rich” both vertically (book-specific) and horizontally (field-specific), would open broad avenues that could enhance the user experience (for example, allowing the user to connect with other books by the same author or publisher, as you suggest). How do we get there from here? A publisher would need to start somewhere, perhaps with a series of reference books in a particular area of knowledge. A standardized thesaurus would be used to create a list of initial terms. The indexer would then use these terms as a starting point for indexing. Whether non-fiction ebooks will have the same level of detailed indexing we have come to expect from print books remains to be seen; if indexes are to survive and thrive in ebooks, we need to ensure that they contribute to a unique reader experience in a way that print-book indexes cannot.