
Advertising is the number one option to monetize content on the web. Even with the advent of the real-time web, until a viable model of in-stream advertising is conceived, search engines and their means of online marketing, such as SEO, AdWords and AdSense remain dominant. As in-stream and other real-time web marketing models mature, become relevant and non-intrusive, search engines will have to undergo fundamental change to keep a significant segment of the market.

Keywords are bad

In order to induce fundamental change first we have to identify the fundamental flaw. At the dawn of the World Wide Web the paradigm of search was borrowed from text documents where a certain paragraph is easily spotted by looking up a few words we presume to be in it.

With more extensive content and less prior knowledge about it this paradigm became harder and harder to apply. However, in our efforts to rank content by its structure, context and user preferences we kept keywords all along. Moreover, keywords today fuel an entire industry of online advertising, recklessly overlooking the distortion they add between content and the user’s specific preferences.

To make search and search-based ads as relevant and as non-intrusive as the real-time web has to offer, keywords must be forgotten once and for all.

Content mapping

Content mapping connects units of content directly by user interaction via a rich, irreducible set of relations. Between your content and others there may be similarities, equivalence, references and other sorts of relations of various significance and strength (relevance). Content that is relevant to yours makes up its ideal context. The first n results a search engine returns, on the other hand, are the actual context.

The distance between the ideal and actual context marks the accuracy of a search engine.
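One way to make that distance concrete is sketched below. The post does not define a metric, so the Jaccard distance between the two sets is an assumption chosen for illustration, and the name `context_distance` and the sample identifiers are hypothetical.

```python
def context_distance(ideal, actual):
    """Jaccard distance between two sets of content identifiers.

    0.0 means the actual context matches the ideal exactly,
    1.0 means the two have nothing in common.
    """
    ideal, actual = set(ideal), set(actual)
    union = ideal | actual
    if not union:
        return 0.0  # two empty contexts are trivially identical
    return 1.0 - len(ideal & actual) / len(union)


# Hypothetical data: the content that is truly relevant to a page,
# and the first n results a search engine actually returned for it.
ideal_context = {"a", "b", "c", "d"}
actual_context = ["a", "b", "x", "y"]

distance = context_distance(ideal_context, actual_context)
print(round(distance, 3))
```

A set-based metric like this ignores ranking order; a rank-aware measure would be a natural refinement if the order of results matters.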

Now, when you’re searching with Google, you’re basically trying to define the ideal context for the content you need. Imagine just how clumsy and inefficient it is to do that through a couple of keywords.

Using a content mapping engine you type in a piece of content, not context. That content (or one that’s semantically identical) is probably already placed and centered in its ideal context. You’ll receive the elements inside as results in decreasing order of relevance.
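A lookup of this kind might be sketched as follows. This is only an illustration under stated assumptions: the `ContentMap` class, its method names, the relation labels, and the relevance scores are all hypothetical, and connections are stored symmetrically for simplicity even though a relation such as a reference is really directional.

```python
from collections import defaultdict


class ContentMap:
    """A toy content map: typed, weighted connections between content items."""

    def __init__(self):
        # content id -> list of (other id, relation, relevance)
        self._edges = defaultdict(list)

    def connect(self, a, b, relation, relevance):
        """Record a connection in both directions, with relevance in [0, 1]."""
        self._edges[a].append((b, relation, relevance))
        self._edges[b].append((a, relation, relevance))

    def context(self, content, n=10):
        """Return up to n connected items, most relevant first."""
        ranked = sorted(
            self._edges.get(content, []),
            key=lambda edge: edge[2],
            reverse=True,
        )
        return ranked[:n]


cmap = ContentMap()
cmap.connect("post:keywords-are-bad", "post:decay-of-search", "similar", 0.9)
cmap.connect("post:keywords-are-bad", "article:help-engines", "reference", 0.4)
cmap.connect("post:keywords-are-bad", "article:pagerank", "reference", 0.6)

results = cmap.context("post:keywords-are-bad", n=2)
for other, relation, relevance in results:
    print(other, relation, relevance)
```

The query takes a piece of content as its input rather than keywords, and the answer is simply that content's neighborhood in the map, sorted by relevance.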

SEO

Search engine optimization is an attempt to match the actual context to the ideal. Inevitably, when you tune your webpage for certain keywords you’re guaranteed to bolt it into the wrong context.

With content mapping, there’s no need for SEO. Not in a fair use scenario anyway, but on the other hand, the well-known SEO exploits (black hat, article spinning, keyword stuffing) obviously won’t work either. If the actual context of your content changes, it will re-position itself automatically to a new context that approximates the ideal as closely as possible.

Shooting in the dark

When it comes to online marketing, SEO is just one of your options. Ranking algorithms may change, and your content is easily ripped out of the context you worked so hard to match. So, you turn to a different, somewhat more reliable marketing tool, AdWords for example.

What happens from then on is again viewed through the smudgy glass of keywords. First, you take a wild guess at what keywords will best match your ideal context, bid for them and see what happens. If the conversion rates are not satisfactory, repeat the process until you get the best achievable results.

Assuming your campaign was successful, along the way you’ve probably

lost a lot of time tweaking

lost potential customers / deals

paid for the wrong keywords

taken an exam or hired a consultant

and ended up in a wrong context anyway

In a content mapping environment, however, you land at the center of your ideal context, with no tweaking and no time or money lost.

What’s the catch?

I’ve hinted in the definition that content mapping relies on user input. In fact it relies on almost nothing but that. I admit that building and maintaining the connection index takes huge collective efforts, but I’m convinced about its feasibility.

We only have to make sure it

Provides frictionless tools for contribution: When the entire index has to be collected from the network it’s vital for the process not to demand more time and attention from contributors than what’s necessary.

Treats harmful activity as noise: Random noise is natural in content mapping. Useful information within the system – however small a percentage – is expected to be coherent and thus extractable. In order to suppress useful information, a successful attack would have to insert harmful information of at least equal coherence. Input gathering tools within the system must be designed with that in mind.
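As a rough illustration of extracting coherent input from noise, the sketch below keeps only the connections that several distinct contributors proposed independently. The post does not say how coherence would be measured, so the agreement threshold, the function name `coherent_connections`, and the sample submissions are assumptions.

```python
from collections import defaultdict


def coherent_connections(submissions, min_contributors=3):
    """Keep connections proposed by at least `min_contributors` distinct users.

    `submissions` is an iterable of (user, content_a, content_b) tuples.
    """
    supporters = defaultdict(set)
    for user, a, b in submissions:
        # Treat (a, b) and (b, a) as the same connection.
        supporters[frozenset((a, b))].add(user)
    return {
        pair
        for pair, users in supporters.items()
        if len(users) >= min_contributors
    }


submissions = [
    ("u1", "A", "B"), ("u2", "B", "A"), ("u3", "A", "B"),  # independent agreement
    ("u4", "A", "Z"),                                      # isolated noise
    ("u5", "C", "Q"), ("u5", "C", "Q"), ("u5", "C", "Q"),  # one user repeating
]

kept = coherent_connections(submissions)
print(kept)
```

Counting distinct users rather than raw submissions means a single account cannot manufacture coherence by repetition. A coordinated group of accounts still could, which is exactly the attack of equal coherence described above.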

Regardless of how cleverly we gather information from the network, latency remains an integral property of content mapping. A changing actual context needs time to catch up to the ideal, and how much time depends on the size of the network. The bigger it is, the faster the response. At the start of a campaign one must be clear about the delay with which the content gets centered in its context.

Unfortunately, neither of the concerns above is comparable to building the network in terms of size and difficulty, and the steps through which this can be achieved are yet to be defined.

Updates

Click-through rates: It’s sort of self-explanatory, but it may be necessary to emphasize the following. When a piece of content is centered in its ideal context it will yield the highest click-through rates when placed on a blog or website in an AdSense fashion.

Similar solutions: MyLikes has implemented a system in which advertisers may reach a higher click-through rate by placing their ads next to (or embedded into) relevant content produced by trusted “influencers”.

In-stream solutions: Take a look at this list of Twitter-based marketing tools on oneforty. They may not be all in-stream, but they can give you a general idea of advertising in the real-time web.

We are witnessing the decay of Google Search. The recent improvements (categories, promotion, real-time results) are insignificant compared to the magnitude of the problem, namely, poor relevance of results. By relevance I mean the results’ relation to the specific idea in the user’s mind, and not their relation to the keywords.

Keyword-based search increases the distance and distortion between results and what the user is really looking for.

Poor relevance

Why do keywords perform so poorly? After all, they would work perfectly in a world where all data on the web is semantically indexed through relevant metadata. In reality however, the gap between relevant information and noise is so huge that keywords are likely to be caught in both. The keyword meta tag fiasco around the millennium has proven the inefficiency and vulnerability of metadata.

The widely criticized Semantic Web, which is anticipated to be an integral part of the third generation Web, aims in that direction anyway. How it’s going to deal with obvious obstacles such as entropy and human behavior remains unanswered.

Making sense of noise

Instead of going into the problems posed by metadata, let’s focus on the naturally noisy web. Since reducing entropy in general requires immense effort, let’s turn the problem around and start digging in the noise.

There are two ways to do this:

treat the entire set of data as noise and recognize patterns that are interesting to us

prepare useful data for extraction from the background noise as we come across them

The first option calls for some sort of AI. While this is a viable solution, I’d question its feasibility. I don’t see algorithms – no matter how complex they are – covering every single aspect of content recognition and interpretation.

For the second option I can show a very fitting example. In digital watermarking we’re hiding drops of information in a vast ocean of noise. In order to recover that tiny amount of data we have to make sure that it’s one or both of the following:

significantly more coherent than the background noise (coherent)

repeated over and over throughout different domains of the signal (redundant)

We can put the same concept in Web terms by connecting relevant content through user interaction.
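The redundancy half of that idea can be shown with a toy example. This is a plain repetition code with majority voting, not a real watermarking scheme (those typically spread the payload across the signal in more elaborate ways); the function names, the flip probability, and the number of copies are assumptions picked for illustration.

```python
import random


def embed(bits, copies):
    """Repeat every bit `copies` times (the redundancy)."""
    return [b for b in bits for _ in range(copies)]


def add_noise(signal, flip_probability, rng):
    """Flip each carried bit independently with the given probability."""
    return [b ^ 1 if rng.random() < flip_probability else b for b in signal]


def recover(signal, copies):
    """Majority vote over each group of repeated bits."""
    groups = [signal[i:i + copies] for i in range(0, len(signal), copies)]
    return [1 if sum(group) * 2 > len(group) else 0 for group in groups]


message = [1, 0, 1, 1, 0]
rng = random.Random(42)  # fixed seed so the run is repeatable

noisy = add_noise(embed(message, copies=101), flip_probability=0.2, rng=rng)
recovered = recover(noisy, copies=101)
print(recovered == message)
```

Each individual copy is unreliable, yet the vote over many copies recovers the message. In content mapping the analogous redundancy would come from many users independently making the same connection.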

Content mapping

There are a couple of attempts at using the crowd to add context to content: Google’s Promote button, Digg and Twitter lists, just to name a few. It’s easy to see that these tools don’t connect content to content. They connect content to metadata, which brings us back to the original problem. OWL, the language of the Semantic Web, can be used to define connections indirectly via class connections, but this solution again favors the metadata domain.

Direct content-to-content connections are practically non-existent as of today, except for online stores where articles refer to each other through a recommendation system. These connections are quite limited by the narrow niche and the very few and specific relations (also bought / also viewed / similar). Unquestionably, creating these connections on a grand scale is an enormous task, yet a far more feasible one than keeping entropy low. The good news is that tools like the ones mentioned above (Digg, Twitter) spread a completely new user behavior that will perfectly fit content mapping.

By defining a sufficiently rich set of relations in content connections, the mapping becomes machine readable. It won’t know that, for example, a certain text element represents a book author, as it would in a semantic solution, but through a series of connections it’s going to have implicit knowledge about it.

The “Google killer” cometh

Whatever is going to follow in the footsteps of Google Search (perhaps a new Google Search?), it’s going to end the era of keywords. Ideally it’s going to feature strong content mapping, induced by fundamentally changing online behavior, mixed with light semantics. It will be dumb enough in terms of algorithmic complexity, yet smart enough to harness the collective intelligence and knowledge of content creators and consumers alike.

Updates

In “Google abandons Search”, Andrew Orlowski elaborates on how real-time results and voting kill PageRank and, through the generated noise and irrelevance, push the entire Internet back into the chaos from which it emerged.

Nova Spivack tears down the hype encircling search engines in “Eliminating the Need for Search” by observing that search is an “intermediary stepping stone” that’s “‘in the way’ between intention and action”. He lists a couple of solutions that aim to break out of the conventional search engine image, but in the end fail to bring about drastic change. Instead, he proposes the concept of “help engines” that supposedly help the user in a proactive way.