
The Best Presentation Award of the International Marketing and Output Database Conference IMAODBC 2016 in Gozd Martuljek, Slovenia, goes to Susanne Hagenkort-Rieger and her team from DESTATIS (Statistisches Bundesamt, Germany).

In her presentation, Susanne highlighted the importance of web search statistics and explained why intuition alone is often not sufficient when deciding which statistical data to emphasize. To make the most popular statistical data relevant and accessible, we should not ignore what the web search data say.

From the New York Times:
“PARIS — Google, which organizes the world’s information digitally, is linking up with a precursor that aimed to do something similar, on paper.
It plans to announce Tuesday [13 March 2012] that it is forming a partnership with a museum in Mons, Belgium, dedicated to a long-ago venture to compile and index knowledge in a giant, library-style card catalog with millions of entries — an analog-era equivalent of a search engine or Wikipedia. …
… Long before them, in 1895, two Belgians, Paul Otlet and Henri La Fontaine, began the project that grew into the Mundaneum. Their card catalog, initially called the Universal Bibliographic Repertory, compiled links to books, newspaper and magazine articles, pictures and other documents from libraries and archives around the world. People were able to submit queries via the mail or telegraph. The collection expanded to 16 million cards, and Mr. Otlet and Mr. La Fontaine envisioned a “city of knowledge,” complete with museum exhibits and other archival material. …
… The partnership is part of a broader campaign by Google to demonstrate that it is a friend of European culture, at a time when its services are being investigated by regulators on a variety of fronts.”

Two years ago Wolfram launched Wolfram|Alpha, a search engine (‘computational knowledge engine’) that does more than find objects or link them in the sense of linked data: it is an engine that computes answers from a huge amount of (partially manually) curated data and millions of algorithms used in Wolfram’s software Mathematica.

(not the freshest data by the way ;-))

‘Oh, and by the way, these days the majority of queries to Wolfram|Alpha give zero hits in a search engine; they don’t ever appear literally on the web. So the only way to get an answer is to actually compute it.’
So said Stephen Wolfram in his keynote speech at the Wolfram Summit 2010. The speech gives extensive insight into Wolfram’s philosophy and objectives. Fascinating!

CDF

And now, some days ago, Wolfram launched a new data format – CDF: Computable Document Format. From the announcement: ‘CDF is a new standard that’s as everyday as a document, but as interactive as an app. It empowers readers to drive content and generate results live for a deeper understanding. And authoring interactivity is easy enough for teachers, journalists, analysts, managers, or researchers to add to reports, presentations, blogs, infographics, articles, and textbooks.’

A viewer (100 MB download) is needed to read CDFs, and Mathematica (Wolfram’s software) is needed to edit a CDF.

Take this example: Age Distribution in the World as a CDF document. It is a 78 KB download and presents this ternary graph:

This ‘ternary diagram is a graph that shows the proportion of three variables as a position in an equilateral triangle. The three variables have a constant sum (in this case to unity). This particular diagram shows the population proportion of children (0<=age<15), adults (15<=age<65), and elderly (65<=age) for different countries. The proportions have been color-coded to facilitate interpretation. … You can choose a continent or the whole world.’ (manually or with the autorun function).
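How three shares that sum to one end up as a single point in a triangle can be sketched in a few lines of Python. This is only an illustration of the general idea, not the code behind the CDF document: the function name `ternary_to_xy`, the corner layout and the sample shares are assumptions made for this example.

```python
import math


def ternary_to_xy(children, adults, elderly):
    """Map three non-negative shares to a point in an equilateral triangle.

    Assumed corner layout (hypothetical, chosen for this sketch):
    children at (0, 0), adults at (1, 0), elderly at (0.5, sqrt(3)/2).
    """
    total = children + adults + elderly
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    # Normalise so that the three shares sum to unity.
    a = adults / total
    e = elderly / total
    # Weighted average of the corner coordinates (barycentric coordinates);
    # the children corner sits at the origin and contributes nothing.
    x = a + 0.5 * e
    y = (math.sqrt(3) / 2) * e
    return x, y


if __name__ == "__main__":
    # Made-up shares for illustration only, not real population data.
    print(ternary_to_xy(0.25, 0.65, 0.10))
```

A country with only children would sit exactly on the first corner, and one with equal shares in all three age groups would sit in the centre of the triangle.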

Some days ago I mentioned in a post how Google and others go semantic and provide in their search results not only information about information (that is, links to web pages) but the information itself, for example cinema showtimes.

Search on Google for cinema or weather in a region and you will get more than a link: the weather forecast and the showtimes for today or tomorrow.

Increasingly, search engines are going to provide more than just links: they provide the information being looked for. To do so, Google has used semantic markup on web pages since 2009 in order to present search results with information instead of links to sites containing that information. Such so-called rich snippets describe people, reviews, products, recipes, etc.
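To make the idea of semantic markup a bit more concrete, here is a minimal Python sketch of how a program could pull labelled values out of a page. It assumes the page uses microdata attributes (`itemprop`), which is only one possible syntax; the class name `ItempropCollector`, the sample HTML and the person in it are invented for this example, and it says nothing about how Google actually processes pages.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect itemprop name/value pairs from simple, flat microdata."""

    def __init__(self):
        super().__init__()
        self.items = {}
        self._current = None  # itemprop whose text content we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # Elements such as <meta> carry the value in an attribute.
            self.items[prop] = attrs["content"]
        else:
            self._current = prop

    def handle_data(self, data):
        # The first non-empty text after the tag is taken as the value.
        if self._current and data.strip():
            self.items[self._current] = data.strip()
            self._current = None


# Hypothetical page fragment describing a person.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Example</span>
  <span itemprop="jobTitle">Statistician</span>
</div>
"""

if __name__ == "__main__":
    collector = ItempropCollector()
    collector.feed(SAMPLE_HTML)
    print(collector.items)
```

The sketch ignores nested items and everything else a real parser has to deal with; it only shows that marked-up values can be read without guessing from the surrounding text.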

Wolfram Alpha has this ambition, too. But Wolfram follows another road: incoming search questions are analyzed via language recognition and linked to the Wolfram Alpha knowledge base, which then delivers the corresponding content:

For the query ‘weather Spain’, Wolfram Alpha does even better than Google 😉

And now we see a step forward by Google & Co in the direction of the Semantic Web: on 2 June 2011, Google, Bing and Yahoo! announced schema.org, a ‘new initiative to create and support a common set of schemas for structured data markup on web pages. Schema.org aims to be a one stop resource for webmasters looking to add markup to their pages to help search engines better understand their websites.’

This is the next step after rich snippets and one further step towards the Semantic Web in action. But: Google unfortunately doesn’t use an existing standard like RDF! 😦

Many new markup categories will be added. Something relevant for statistical sites? Perhaps ‘GovernmentOrganization’ and ‘DataType’.

Website providers now have to decide how to integrate such new markup into their content in order to be well represented in search engines.
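What such markup could look like for a statistical office can be sketched with a small Python helper that wraps a name and a link in microdata using the schema.org type ‘GovernmentOrganization’. The helper name `government_org_microdata` and the placeholder URL are assumptions for this example, and the choice of the `name` and `url` properties is just one plausible minimal set.

```python
from html import escape


def government_org_microdata(name, url):
    """Return an HTML fragment marking up an organisation with microdata.

    Hypothetical helper for illustration; only the properties
    'name' and 'url' are emitted.
    """
    safe_name = escape(name)
    safe_url = escape(url, quote=True)
    return (
        '<div itemscope itemtype="http://schema.org/GovernmentOrganization">\n'
        f'  <a itemprop="url" href="{safe_url}">'
        f'<span itemprop="name">{safe_name}</span></a>\n'
        '</div>'
    )


if __name__ == "__main__":
    # The URL is a placeholder, not the real address of the office.
    print(government_org_microdata("Statistisches Bundesamt", "https://example.org/"))
```

For the visitor the fragment renders as an ordinary link; the extra attributes are there only for programs that read the page.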