2015: Envisioning The Year Ahead, Part 1

Just as it has been The Semantic Web Blog’s tradition to look back to the high points of the last year as we approach its end (see here and here), so too do we look ahead to expectations for the New Year, with the help of experts in the arena and its related fields.

To that end, we present their thoughts about what they believe – or at least hope – will take place in 2015 (and beyond), the goals towards which they and the industry are working, and ideas, issues and technologies that should be treated with greater care. (You also should head over to Dataversity.net for additional articles exploring the future for the Semantic Web, Cognitive Computing, NLP, Big Data and other affiliated areas.)

Commitment by the community to improved data quality, Linked Data publication standards, SLA (e.g. uptime/responsiveness); transition of emphasis from quantity to quality (not “more” but “better”); creation and deployment of standards and tools for validation (e.g. RDF Shapes) and collaborative data quality improvement.

Phil Archer, W3C Activity Lead:

We need to get closer to ‘regular Web developers.’ That is, people who don’t want or need to know about graphs but who do know how to build really cool applications. People who can work magic with JSON without the Linked Data bits. I’d like to see better interfaces between Sem Web technologies and everyone else. We need to think about the needs of data visualization folks, user interface design and so on.

The RDF Data Shapes work that started recently is a small part of that, since the file that tells you how to validate a dataset can also be used to generate a user interface with required fields, drop down lists of allowed values, etc. We’ve just started the RDF Data Shapes work and should be starting the Spatial Data on the Web work in January because we think they’re the most important ones. By this time next year I hope we’ll have some work going on to bring SemWeb and general developers closer together. Anything that increases the pool of users is obviously important for SemWeb.
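To make the "one file, two uses" idea concrete, here is a minimal Python sketch. It is not the RDF Data Shapes syntax, which the Working Group had only begun to define. The `PERSON_SHAPE` dictionary, its keys, and the `validate` and `form_fields` helpers are all hypothetical names invented for illustration. The point is only that one declarative description can drive both data validation and form generation.

```python
# Hypothetical shape description; NOT the syntax of any W3C specification.
# Each property states whether it is required and, optionally, which
# values are allowed.
PERSON_SHAPE = {
    "name": {"required": True, "allowed": None},
    "status": {"required": True, "allowed": ["active", "retired"]},
    "homepage": {"required": False, "allowed": None},
}


def validate(record, shape):
    """Return a list of problems found in record; an empty list means valid."""
    problems = []
    for prop, rule in shape.items():
        value = record.get(prop)
        if value is None:
            if rule["required"]:
                problems.append(f"missing required property: {prop}")
            continue
        if rule["allowed"] is not None and value not in rule["allowed"]:
            problems.append(f"value {value!r} not allowed for {prop}")
    return problems


def form_fields(shape):
    """Derive user-interface field descriptions from the same shape."""
    fields = []
    for prop, rule in shape.items():
        # A restricted value list becomes a drop-down; anything else is free text.
        widget = "select" if rule["allowed"] is not None else "text"
        fields.append({
            "name": prop,
            "widget": widget,
            "required": rule["required"],
            "options": rule["allowed"] or [],
        })
    return fields
```

Calling `validate({"status": "unknown"}, PERSON_SHAPE)` reports both the missing name and the disallowed status, while `form_fields(PERSON_SHAPE)` describes a required text box, a required drop-down, and an optional text box.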

I’m very excited by the fact that Wikidata will be tripling in size through taking over Freebase’s historical dataset and bringing it to another level by putting in place tools to help verify/cite the sources of the various facts therein (in a Wikipedia-like manner). This should grow to become an ever greater resource for the research community in both academia and industry, and I hope it will be another much-used hub of semantic data just as DBpedia currently is.

I’m excited about the academic work surrounding automatically generating captions for images. This is one step closer to being able to search unlabeled images with a text search (e.g. “find all of the pictures of my wife and me at football games”).

2015 will be the year of “intentions.” This is a feature that has been getting more press during 2014 as various vendors roll out “intention” miners. We’re excited not only because we have a very robust implementation of intentions across a variety of types (buy, quit, recommend), but also because we see intentions as a feature unlike anything ever released in the Text Mining world; a feature that has an obvious and immediate impact on a user’s bottom line. This could take the form of identifying buyers, or finding customers who are at risk of dropping service. In either case, intentions provide actionable business insight that we see being adopted across a variety of industries. It is our belief that this feature will drive continued adoption of text mining services throughout 2015.

I hope that the release of Drupal 8 will boost the adoption of schema.org. With schema.org and RDF built into its core, Drupal 8 will put schema.org in the hands of mainstream site builders and webmasters. We learned a lot from the work that went into Drupal 7 and were able to bring improved support for schema.org and HTML5+RDFa into core, supporting data types like telephone, email, link, etc. We also have a good prototype for an RDF mapping user interface, built by a 2014 Google Summer of Code student, which will be useful during site building for mapping Drupal’s content to schema.org types and properties. Drupal 8 is planned to be released sometime in 2015.

I’ll deconstruct this heap of jargon. Situational means cognizant of both the data production and data consumption contexts: How and why the data was produced (since we’re now massively into secondary uses, with social listening a prime example), and what we’re aiming to learn and do in using it. By fully-informed recommendations, I mean based on behavioral models (tracking and market baskets), profile matches, AND likeness (semantic similarity). Per Marti Hearst, “Sense-making refers to an iterative process of formulating a conceptual representation from a large volume of information,” and semantic data integration is an approach to linking together diverse information elements across disparate sources.

Surface all of this stuff in improved (and steadily improving) digital assistants, diagnostic and control systems, robotics, and recommendations.

I believe that, as more and more data becomes available with semantic annotations such as schema.org, more and more people will realize the power of the Linked Data proposition. New applications that we could only have dreamed of before will start becoming a reality. On a technology level, I’m chairing the new W3C RDF Data Shapes Working Group, which is chartered to address a long standing gap in the collection of RDF capabilities. Indeed, we’re missing a standard way of defining structural constraints on RDF graphs. Such a capability is necessary to enable the definition of graph topologies for interface specification, code development, and data verification. All of which will help with the adoption of Linked Data technologies in many different use cases.
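As a rough illustration of what a "structural constraint on an RDF graph" can mean, the Python sketch below checks a cardinality rule over a list of triples. It is an assumption-laden toy, not the Working Group's design: the `check_cardinality` function, the tuple representation of triples, and the `ex:` identifiers are all hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def check_cardinality(triples, target_class, prop, min_count=0, max_count=None):
    """Return (subject, count) pairs for instances of target_class whose
    number of prop values falls outside [min_count, max_count]."""
    # Collect every subject declared to be an instance of the target class.
    subjects = {s for s, p, o in triples if p == RDF_TYPE and o == target_class}

    # Count how many values each of those subjects has for the property.
    counts = defaultdict(int)
    for s, p, o in triples:
        if s in subjects and p == prop:
            counts[s] += 1

    violations = []
    for s in sorted(subjects):
        n = counts[s]
        if n < min_count or (max_count is not None and n > max_count):
            violations.append((s, n))
    return violations


# Hypothetical data: every ex:Person should have exactly one ex:name.
triples = [
    ("ex:a", "rdf:type", "ex:Person"),
    ("ex:a", "ex:name", "Alice"),
    ("ex:b", "rdf:type", "ex:Person"),
    ("ex:c", "rdf:type", "ex:Person"),
    ("ex:c", "ex:name", "Carol"),
    ("ex:c", "ex:name", "Caroline"),
]
violations = check_cardinality(triples, "ex:Person", "ex:name", 1, 1)
```

Here `violations` lists `ex:b` (no name) and `ex:c` (two names). A standard constraint language would let publishers state such rules declaratively instead of hand-coding each check.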

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.