Ever wonder why Google named certain algorithms after black and white animals (i.e., black hat vs. white hat)? Hummingbird is a broader algorithm altogether, and hummingbirds can be any color of the rainbow.

One aspect of Hummingbird is about better understanding of your content, not just specific SEO tactics.

Hummingbird also represents an evolutionary step in entity-based search that Google has worked on for years, and it will continue to evolve. In a way, optimizing for entity search is optimizing for search itself.

Many SEOs limit their understanding of entity search to vague concepts of structured data, Schema.org, and Freebase. They fall into the trap of thinking that the only way to participate in the entity SEO revolution is to mark up your HTML with complex schema.org microdata.

Not true.

Don't misunderstand; schema.org and structured data are awesome. If you can implement structured data on your website, you should. Structured data is precise, can lead to enhanced search snippets, and helps search engines to understand your content. But Schema.org and classic structured data vocabularies also have key shortcomings:

Schema types are limited. Structured data is great for people, products, places, and events, but these cover only a fraction of the entire content of the web. Many of us mark up our content using Article schema, but this falls well short of describing the hundreds of possible entity associations within the text itself.

Markup is difficult. Realistically, in a world where it's sometimes difficult to get authors to write a title tag or get engineers to attach an alt attribute to an image, adding proper structured data to source HTML can be a daunting task.

Google is able to extract the semantic meaning from this page even though the properties of "length" and its value of 50-60 characters are not structured using classic schema.org markup.

Matt Cutts recently revealed that Google uses over 500 algorithms. That means 500 algorithms that layer, filter, and interact in different ways. The evidence indicates that Google has many techniques for extracting entity and relationship data that may work independently of each other.

Regardless, whether you are a master of schema.org or not, here are tips for communicating entity and relationship signals within your content.

1. Keywords

Yes, good old fashioned keywords.

Even without structured markup, search engines have the ability to parse keywords into their respective structure.

But keywords by themselves only go so far. In order for this method to work, your keywords must be accompanied by appropriate predicates and objects. In other words, your sentences provide fuel to search engines when they contain detailed information with clear subjects and organization.
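To make the subject, predicate, and object idea concrete, here is a minimal Python sketch. It is an illustration only, not a description of how any search engine actually works: the `Triple` type, the `extract_triple` helper, and the tiny list of verbs are all hypothetical, and real entity extraction is far more sophisticated than a single regular expression.

```python
import re
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """A subject-predicate-object statement pulled from one sentence."""
    subject: str
    predicate: str
    obj: str


# Hypothetical, deliberately tiny pattern: "<subject> <verb> <object>."
# The verb list is an assumption made for this example.
PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?P<predicate>is|was|has|wrote|founded)\s+(?P<obj>.+?)\.?$"
)


def extract_triple(sentence: str) -> Optional[Triple]:
    """Return a triple for a simple declarative sentence, or None if no match."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return Triple(match["subject"], match["predicate"], match["obj"])


if __name__ == "__main__":
    # A sentence with a clear subject, predicate, and object yields a triple.
    print(extract_triple("John Adams was the second US president."))
    # A bare keyword has no predicate or object, so nothing can be extracted.
    print(extract_triple("Keywords"))
```

The second call is the point of the exercise: a lone keyword gives a parser nothing to attach a relationship to, while a complete sentence does.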

2. Tables and HTML elements

HTML (and HTML5), by default, provide structure to webpages that search engines can extract. By using lists, tables, and proper headings, you organize your content in a way that makes sense to robots.

In the example below, the technology exists for search engines to easily extract structured relationships about US president John Adams from this Wikipedia table.

The goal isn't to get into Google's Knowledge Graph (which is exclusive to Wikipedia and Freebase). Instead, the objective is to structure your content in a way that makes the most sense and makes the relationships between words and concepts clear.
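As a rough sketch of why plain HTML structure is machine-friendly, the Python below pulls label and value pairs out of a simple table using only the standard library. The `TableFactParser` class, the `extract_facts` helper, and the sample rows are assumptions written for this example; they are not copied from Wikipedia's markup and say nothing about how Google parses tables.

```python
from html.parser import HTMLParser


class TableFactParser(HTMLParser):
    """Collect (header, value) pairs from rows shaped like <tr><th>..</th><td>..</td></tr>."""

    def __init__(self):
        super().__init__()
        self.facts = []
        self._cell = None   # "th" or "td" while the parser is inside a cell
        self._header = ""
        self._value = ""

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # Start each row with empty header and value buffers.
            self._header, self._value = "", ""
        elif tag in ("th", "td"):
            self._cell = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None
        elif tag == "tr" and self._header and self._value:
            # Keep only rows that had both a header cell and a data cell.
            self.facts.append((self._header.strip(), self._value.strip()))

    def handle_data(self, data):
        if self._cell == "th":
            self._header += data
        elif self._cell == "td":
            self._value += data


def extract_facts(html):
    """Return a list of (header, value) tuples found in the HTML."""
    parser = TableFactParser()
    parser.feed(html)
    parser.close()
    return parser.facts


# Illustrative rows only, written for this sketch.
SAMPLE = """
<table>
  <tr><th>Born</th><td>October 30, 1735</td></tr>
  <tr><th>Political party</th><td>Federalist</td></tr>
</table>
"""

if __name__ == "__main__":
    for header, value in extract_facts(SAMPLE):
        print(f"John Adams | {header} | {value}")
```

Because each row pairs a header cell with a data cell, the relationship comes out without any schema.org vocabulary at all. The same content written as one long paragraph would be much harder to pull apart.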

3. Entities and synonyms

What do you call the President of the United States? How about:

Barack Obama

POTUS (President Of The United States)

Commander in Chief

Michelle Obama's Husband

First African American President

In truth, all of these apply to the same entity, even though searchers will look for them in different ways. If you wanted to make clear what exactly your content was about (which president?) two common techniques would be to include:

Synonyms of the subject: We mean the President of the United States → Barack Obama → Commander in Chief and → Michelle Obama's Husband

Co-occurring phrases: If we're talking about Barack Obama, we're more likely to include phrases like Honolulu (his place of birth), Harvard (his college), 44th (he is the 44th president), and even Bo (his dog). This helps specify exactly which president we mean, and goes way beyond the individual keyword itself.

Using synonyms and entity association also has the benefit of appealing to broader searcher intent. A recent case study by Cognitive SEO demonstrated this by showing significant gains after adding semantically related synonyms to their content.
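The co-occurrence idea can be sketched in a few lines of Python. This is a toy under stated assumptions: the `ENTITY_CONTEXT` profiles are hand-written for this example (the Obama phrases come from the list above, and the John Adams profile is added purely as a contrast), and counting overlapping words is a stand-in for whatever far richer signals a search engine uses.

```python
import re

# Hypothetical context profiles, hand-written for this example.
ENTITY_CONTEXT = {
    "Barack Obama": {"honolulu", "harvard", "44th", "bo", "potus"},
    "John Adams": {"braintree", "federalist", "2nd", "abigail"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score_entities(text):
    """Count how many context terms for each entity appear in the text."""
    tokens = tokenize(text)
    return {entity: len(tokens & context) for entity, context in ENTITY_CONTEXT.items()}


def best_entity(text):
    """Return the entity with the most co-occurring terms, or None if nothing matches."""
    scores = score_entities(text)
    entity, score = max(scores.items(), key=lambda item: item[1])
    return entity if score > 0 else None


if __name__ == "__main__":
    print(best_entity("The president was born in Honolulu and studied at Harvard."))
    print(best_entity("The president gave a speech."))
```

The word "president" alone matches nothing in either profile, so the second call cannot pick an entity. It is the surrounding phrases that settle which president is meant.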

4. Anchor text

When looking at Google answer box results, you almost always find related keyword-rich anchor text pointing to the referenced URL. Ask Google "How many people walked on the moon?" and you'll see these words in the anchor text that points to the URL Google displays as the answer.

Other queries:

In these examples and more that I researched, matching anchor text was present every time in addition to the relevant information and keywords on the page itself.

Additionally, there seems to be an indication that internal anchor text might also influence these results.

This is another argument to avoid generic anchor text like "click here" and "website." Descriptive and clear anchor text, without overdoing it, provides a wealth of information for search engines to extract meaning from.
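If you want to check your own pages for generic anchor text, a small audit script is one way to start. The sketch below uses Python's standard-library HTML parser; the `GENERIC_ANCHORS` list and the `generic_anchors` helper are assumptions for illustration, so adjust the phrases to suit your site.

```python
from html.parser import HTMLParser

# Assumed list of generic phrases; extend it for your own site.
GENERIC_ANCHORS = {"click here", "website", "here", "read more", "link"}


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None   # set while the parser is inside an <a> element
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the anchor text.
            text = " ".join("".join(self._text).split())
            self.anchors.append((self._href, text))
            self._href = None


def generic_anchors(html):
    """Return the (href, text) pairs whose anchor text is on the generic list."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return [
        (href, text)
        for href, text in collector.anchors
        if text.lower() in GENERIC_ANCHORS
    ]


if __name__ == "__main__":
    page = (
        '<p><a href="/moon">How many people walked on the moon</a> '
        'and <a href="/more">Click here</a></p>'
    )
    print(generic_anchors(page))
```

Only the "Click here" link is flagged. The descriptive link passes because its text says what the target page is about.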

5. Leverage Google Local

For local business owners, the easiest and perhaps most effective way to establish structured relationships is through Google Local. The entire interface is like a structured data dashboard without Schema.org.

When you consider all the data you can upload both in Google+ and even Moz Local, your ability to map your business data is fairly complete in the local search sense.

In case you missed it, last week Google introduced My Business, which makes maintaining your listings even easier.

6. Google Structured Data Highlighter

Sometimes, structured data is still the way to go.

In times when you have trouble adding markup to your HTML, Google offers its Structured Data Highlighter tool. This allows you to tell Google how your data should be structured, without actually adding any code.

The tool uses a type of machine learning to understand what type of schema applies to your pages, up to thousands at a time. No special skills or coding required.

Although the Structured Data Highlighter is both easy and fun, the downsides are:

The data is only available to Google. Other search engines can't see it.

Markup types are limited to a few major top categories (Articles, Events, etc.).

If your HTML changes even a little, the tool can break.

Even though it's simple, the Structured Data Highlighter should only be used when it's impossible to add actual markup to your site. It's not a substitute for the real thing.

7. Plugins

For pure schema.org markup, depending on the CMS you use, there's often a multitude of plugins to make the job easier.

Looking forward

If you have a chance to add Schema.org (or any other structured data) to your site, this will help you earn those coveted SERP enhancements that may help with click-through rate, and may help search engines better understand your content.

That said, semantic understanding of the web goes far beyond rich snippets. Helping search engines to better understand all of your content is the job of the SEO. Even without Hummingbird, these are exactly the types of things we want to be doing.

It's not "create content and let the search engines figure it out." It's "create great content with clues and proper signals to help the search engines figure it out."

70 Comments

Cyrus - Thanks for working on dispelling the notion that optimizing for semantic search requires schema or metadata markup when, as you've been pointing out so well, it doesn't. Appreciate the mentions and links to articles I've written. I'm not aspiring in any way to SEO Godhood, but I definitely like seeing people learn about and better understand how they can have Google understand what they are saying.

The search metrics study on the use (or lack of use) of schema markup wasn't a surprise. It is limited. I agree with Gianluca that "Semantic SEO is now an essential part of our SEO job," and it's actually a fascinating time to be involved in SEO as Google starts learning to understand meaning.

Today, I'm happy to continue to point folks in your direction. In the process, I hope to add my own observations, research and analysis to help educate and contribute to our collective SEO knowledge base.

Part of the fun of doing SEO for a living is the mystery of how the search engines do what they do and the chance to unravel some of that, and the opportunity to watch Google and learn how it tries to do what it does. Part of the fun is also helping clients present themselves on the Web in ways that enable them to be found by people interested in what they offer.

Being able to share ideas and collaborate with others while doing this is also one of the things that makes SEO interesting - the chance to teach, to tutor, to mentor, to be mentored. The way that you put the ideas within this post together to illustrate the many ways Google uses semantic information defines the spirit of collaboration that makes this industry special. Thanks. :)

Hi Cyrus, that's a very interesting post and I was probably on the good vibration when I decided to adapt my post about Google Trends and Freebase to English yesterday. Here is a quick link for reference :

Very good post, Cyrus. The fact that Semantic SEO is not just a synonym for Structured Data is something I have been insisting on a lot lately too. Somehow, "Semantic SEO = Schema.org" is a new myth in the already huge SEO mythology.

And it is true that Hummingbird is probably the most undervalued update Google has done: no websites (or only very, very specific ones) saw themselves hammered by it, and not many people were able to understand how Hummingbird is at the base of the SERPs we are seeing right now and how it is influencing everything, including those updates that usually worry SEOs the most: just think of Panda 4.0.

Matt Cutts defined it as a totally new architecture. Nobody, from what I've read, asked himself: "Why was a new architecture for Panda needed?", and if they did, the answer was on the "softer Panda" side. But a Panda update that changes 7.something % of the SERPs in Google.com cannot be defined as "soft": it'd be a contradiction in terms.

Panda 4.0 is a consequence of Hummingbird, as its previous versions were a consequence of Caffeine.

Hummingbird, with its purpose of offering better SERPs thanks to a better natural understanding of the Internet's content, obliged Google to reconsider Panda and transform it from a pure combination of different Pandarank factors into something that is able to answer this question: "Is this site able to create concepts that pair with its information architecture and content so as to offer real value to users?" If the answer is no: doomed.

For this reason Semantic SEO is now an essential part of our SEO job. And that's why it is even more important to work with Content and Devs in order to create sites based on content hubs and provided with what Russell Smith defines as clear structured Storylines.

To end this... Google is still attending Primary School. Triplets are a very basic rule of how humans think. But Google has an army of quantum computers working just on learning human thinking, hence human language. So go to the library and start studying the principles not only of Semantics, but also of Semiotics and Rhetoric, because Google's next steps are toward understanding those areas and implementing them in its Hummingbird-based algorithms. When Google is really able to understand whether a mention is ironic or sarcastic, then things will get seriously serious :).

Gianluca Fiorelli, the comment with the most words is really impressive, but can you please make it shorter so we can read it and learn your opinions? We have just finished reading a long post, and now you give us a second one to read, lol. I wonder how many have the patience to read all your words.

While it'd be nice to believe that semantics are "coming" to SEO, for Google, they've been directly involved since at least August of 2013. With Hummingbird, you're looking at an algorithm that can break down keyword phrases to their absolute value, and then draw correlation between that absolute keyword value and user intent to deliver the best results possible in SERP for the query in question. In my experience, this can be seen most at a page level with keyword bleed between sub-pages & the home page. Sub-pages focusing on a specific concept tend to rank over the home page for "absolute value keyword" terms that are more specific to the content on the sub-page than the home page. If you happen to talk about that concept on the home page as well, and spend very much time on it, you'll see your home page show up in SERP instead of your sub-page. Taken at a website level, post-hummingbird, you're better off having pages of your sites focus on specific concepts (using all keywords appropriate in copy that represent that concept) instead of worrying about a page for "dallas italian restaurants", a page for "italian restaurants in Dallas" & another page focusing on "italian food in Dallas". They all mean the same thing. They all generate the same SERP results (given a 5% variance in the 1st 100 results). Use them all on the same page & keep the absolute value of that concept page-focused to get the maximum effect from your content.

_______________________

TL;DR Version:

Dallas italian restaurants

italian restaurants in dallas

italian food in dallas

--- Absolute value: dallas, italian, restaurant/food

Any combination of these terms in conjunction with each other will do a better job of reinforcing a concept on-page than trying to saturate 1 page with just 1 version of the keyword concept and using another version on a separate page.

Exactly, which has implications in "long-tail" search. In the old days, it was possible to rank for obscure queries simply by having the right words on the page.

Today, we're seeing much more consolidation in the SERPs and decreasing domain diversity as Google combines these types of queries together and rewards the "best match" URL to the query. Even if it doesn't match the exact keywords, it should match the meaning, and the site with the highest authority/trust/relevancy metrics wins.

For "how many people walked on the moon" Google still has some work to do, as they are bolding the "20" as a part of the date (July 20, 1969). I think the crowning achievement will be just a box which confidently and boldly displays "12".

I also think of Google's leaked Quality Rating Guidelines which talk about the "purpose of the page" - and how this result does a really poor job: http://www.universetoday.com/55512/how-many-people... - It's titled "how many people walked on the moon?" but doesn't actually give the literal answer until way at the bottom, under a sea of other numbers and information. Not good for semantics, and not as useful for the user.

The Discovery result is far shorter, yet achieves an answer box result - definitely emphasizing best length is totally relative to the "purpose of the page". Long content is not always best.

Awesome post Cyrus! I have been following this topic for the past month and was focusing on Knowledge Graph optimization (KGO). I also tried to use Freebase, which you mentioned, but got stuck in some confusion. We all do schema markup, but the prime essence behind all this is the ERD (Entity Recognition & Disambiguation) project started by Microsoft to determine the relationship of defined entities in the internet galaxy. The big entity daddy Freebase is the storehouse of the defined entities, and this is where we can make some input. All of us SEO folks have been through authorship, schema markup, data highlighter, and many more. Creating content that actually serves the queries related to your business niche is the best way to get the bird to pick your site. Semiotics is definitely coming to SEO, and we should expect a more refined search from Google. Thanks again Cyrus for this post.

First off, great post Cyrus! My only question is does it matter WHERE the Schema markup is placed? For instance, if I only list out my address and business hours on my contact page, will the markup be just as effective if it's on a page that doesn't rank as well as my homepage, for instance? I know Google doesn't like hidden text, and in some cases it wouldn't be "natural" to include certain "item properties" on certain pages.

If you have a G+ page with the hours field populated and have publisher schema on your website linking the G+ page, Google will have their own markup on the hours from the G+ page take care of that aspect of your result in SERP, knowledge graph style.

The only place I'd recommend against placing schema.org is in meta tags, except when you absolutely have to. While it's allowed and validates, Google has indicated they prefer you don't put schema in meta data, and my personal experience has hinted that it's not always as effective as placing it in the text body itself.

It's posts like this that force me to realize I need to rely on my creativity, intuition and moxie to keep getting those hits from Google.

I'm certainly not going to be coding anything - I could write more content faster than I could do that. I'm also not paying anyone, so...yep there it is, I'll have to rely on myself. When you get backed into dark corners like that, where the options seem slim, great things can be created.

Great post. This brings together the underlying influences of SEO - how humans understand language, how humans are training computers to parse language, and the root structure of language.

In studying Linguistics and then coming into SEO, I found Noam Chomsky's Syntactic Structure of all human language to be so applicable to SEO, yet largely missed by the field. While Chomsky's political writings have not gained wide acceptance - his outline for the root structure of all language is bearing proof, not only in the Linguistics field, but here in SEO as well.

As Chomsky said in Syntactic Structures, a transformational grammar has a "natural tripartite arrangement," which was groundbreaking at the time, and now Google's algorithms are catching up, and then we SEOs follow in line.

This is one of the best articles regarding Semantics I have ever read. Simple and concise. Of course, there is always more to add, but if approx. 20% of the websites out there use markup (I doubt that it is more than 5%), then we have a very good chance to take advantage of that.

We could probably say 100% of SEOs want to mark up their sites but web developers who've already quoted the project don't want to add structured markup because it's extremely time-consuming, even on an ecomm platform like Magento that is traditionally considered extremely customizable.

Very useful post Cyrus - thanks for sharing these ideas. As someone just learning the nuances of semantic search and semantic web technologies, I find this post helps provide some clarity about what's important to SEO. In fact, it aligns well with the Hangout from last week by Eric Enge, Mark Traphagen and David Amerland: "Give me structure or give me death". A consistent theme there was that we need to make sure our content is clear and organized (Amerland is more specific, saying good structure provides: clarity, consistency and classification) and that if we do this well it can provide the benefits of structured data without the markup.

I appreciate how you pointed out how we need to better understand the writing of our content and how it relates with entities. It makes sense, and the layout of your content is super helpful in explaining the triples.

I sometimes get the feeling that SEO's are becoming content editors for the search engines. :)

There's much more in semantic meanings than the naked eye can see. Cyrus, you always write great in-depth articles, with loads of info and input for our work! :) Thanks, this one was great too. I also liked some of the external sources you pointed to :)

This article is truly a #FUFISM-based masterpiece, as it explains so many of the core FUFISM principles in ways that everybody can understand. There is very little technical jargon, and it is well written for the average user to get a good, well-rounded understanding of the issues at stake within search.

I have shared this again to a number of different platforms as this article is a masterpiece.

Interesting take on this and to add more fuel to the fire, I've also noticed Google not identifying schema.org structured data (even when validated through Google's validator) but when the same data on that page was changed to non-schema.org structured data, Google then treated it as structured and showed it in search results (in one of our client's cases it then showed businesses' star ratings in search results as well as being shown as reviews on Google Places listings). So it seems even if it's structured, Google doesn't always take the data the way you might think they would. Another takeaway is that possibly Google is not yet sophisticated enough to parse some types of structured data.

Hi Cyrus -- great overview of Hummingbird. Indeed, the semantic web has not yet seen widespread adoption so Google focuses instead on keyword relevance and entity extraction.

I’d be curious to hear what you think of our tools. We at MarketMuse.co make tools to help marketers identify related keywords and content ideas, to help create content that's Hummingbird and Panda 4.0-compatible.

Dear Cyrus, as always your posts are useful and rich in suggestions. Do you think Freebase can also give signals for rankings within the semantic world wide web of meanings and entities? Thanks for sharing! Eugenio (SEO)

Google is trying hard to display the best possible results, applying so many updates. I hope all these articles keep up to date the users about it. You have done a great job Cyrus by mentioning some very useful information in your post.

A very good post Cyrus. Completely agree on the usage and how difficult it can be to use markup/schema.org. This also gives a very good perspective on how Google might be working towards using and sourcing these. Gives a different perspective for sure, thanks Cyrus.

I really love markup. Here in Germany I saw some sites which truly overdid it. They marked up everything possible and more. Hidden markup, putting videos into the content to have another thing to mark up ...

Cyrus - I really appreciate the time you have taken to explain the importance of schema and structured markup. I totally agree with Gianluca Fiorelli as well on what he has said about Panda 4.0, Hummingbird and semantic SEO.

In this new generation of SEO, webmasters should focus more on how to implement structured markup. I know it's not easy, but considering its benefits, I think it's definitely worth spending your time on.

I believe it is Panda which affects most websites. A few of our clients' websites were affected by Panda too. Thanks Cyrus for sharing such useful information. One useful link where one can find more related information about Hummingbird is here: http://techyblog.org/will-hummingbird-update-kill-the-relevancy-of-keywords/

Thank you for the post Cyrus, very helpful and informative. At least we have alternatives that we know are proven. SEO is such a big beast to conquer, and all of the tips from great minds such as yourself are greatly appreciated.