What Not to Do When Using Alternate Language Markup

I wrote about having multiple language versions of your pages in June 2014. I thought it was important because so many people were perplexed about marking pages for various languages. No doubt, when you first started to think about how to do these things properly, it seemed daunting for you, too.

As I’ve written before, it’s really not quantum physics. If you know the rules, you’ll be set, and your pages will appear properly in search engine results pages (SERPs), no matter what version of Google visitors are using.

I won’t go over those rules again here, but I think we need to discuss a couple of things you may not have considered. These issues are uber important because if you’re doing things wrong, you can have 100 language translations and none of them will work properly in search.

I know it. You know it. We all make mistakes. Well… You could be perfect, but I highly doubt it. OK, close, but well… Mistakes are us, and spotting the kinds of errors that keep your pages from appearing properly is rather simple.

Here’s what I mean.

These are Things NOT to Do When Using “hreflang” Markup

Not Using a Return Tag

This happens when you use a language tag on certain pages and neglect to use it on others. If page A names page B as its alternate, page B has to name page A in return. Google calls the failure to do so “missing return tags.”

Let’s illustrate. Maybe you have an English version of a page and a French version of a page.

If one tag is missing (on either page), Google considers this a missing return tag, and because of that, your listings may not work properly in different language versions of Google. For example, the English page might be served to those using Google.fr, which you wouldn’t want to happen, right? This can also occur when you use a sitemap to distinguish alternate language pages or provide HTTP headers for a .pdf, for example.

Whichever form of tagging you use, always be sure that every page concerned is tagged, no matter how many languages you’ve translated a page into, and that every page carries return tags pointing to all the others.
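To make the return-tag rule concrete, here is a minimal sketch in Python. It assumes you have already collected each page’s annotations (the values you would put in tags such as `<link rel="alternate" hreflang="fr" href="...">`) into a dictionary. The function name `find_missing_return_tags` and the example.com URLs are made up for illustration, and this is not how Google itself performs the check.

```python
def find_missing_return_tags(annotations):
    """Return (source, target) pairs where source names target as an
    alternate but target does not name source in return."""
    missing = []
    for source, alternates in annotations.items():
        for target in alternates.values():
            if target == source:
                continue  # a self-reference needs no return tag
            # The target page must list the source among its own alternates.
            if source not in annotations.get(target, {}).values():
                missing.append((source, target))
    return sorted(missing)


# Hypothetical site: page URL -> {hreflang value: alternate URL}
pages = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "fr": "https://example.com/fr/",
    },
    # The French page forgot to point back to the English one.
    "https://example.com/fr/": {
        "fr": "https://example.com/fr/",
    },
}

for source, target in find_missing_return_tags(pages):
    print(f"{target} has no return tag for {source}")
```

Running something like this over a crawl of your own pages before you publish can catch the one-sided case described above.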

Using an Incorrect Language Code

This is just lazy coding because modern language codes are very easy to find. But let’s say you try to use “gr” for German, when the code is “de” (for Deutsch). “gr” isn’t a language code at all: GR is the country code for Greece, and the language code for Greek is “el.” When you mess this simple stuff up, you’ll confuse the spiders and they’ll start running around in circles. If you’re unsure of a language code, always look it up.

And don’t go on thinking that one English (en) version is appropriate for both the U.S. and Great Britain, either. We use different spellings and different punctuation. Wouldn’t it be better to show each country the version it prefers?

In order to do that, you’d use the tag “en-US” for the U.S. version and “en-GB” for the Great Britain version. (A bare “en” works as a catch-all for English speakers everywhere.) Makes perfect sense to me. You, too?
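If you’d rather let a script do the looking up, a rough sketch might look like the following. It assumes hreflang values take the form of a two-letter ISO 639-1 language code, optionally followed by a hyphen and a two-letter ISO 3166-1 region code, plus the special value “x-default.” The code sets below are a tiny illustrative subset rather than the full lists, script subtags are ignored, and `check_hreflang` is a made-up name.

```python
import re

# Illustrative subsets only; a real check would load the complete lists.
LANGUAGES = {"de", "el", "en", "es", "fr", "tl"}   # ISO 639-1
REGIONS = {"DE", "FR", "GB", "GR", "US"}           # ISO 3166-1 alpha-2


def check_hreflang(value):
    """Return None if the value looks valid, else a short problem description."""
    if value.lower() == "x-default":
        return None
    match = re.fullmatch(r"([A-Za-z]{2})(?:-([A-Za-z]{2}))?", value)
    if not match:
        return "not in language or language-region form"
    language, region = match.group(1).lower(), match.group(2)
    if language not in LANGUAGES:
        if language.upper() in REGIONS:
            return f"'{language}' is a region code, not a language code"
        return f"unknown language code '{language}'"
    if region is not None and region.upper() not in REGIONS:
        return f"unknown region code '{region}'"
    return None


for candidate in ["de", "gr", "en-GB", "en-UK", "en_GB"]:
    problem = check_hreflang(candidate)
    print(candidate, "->", problem or "looks fine")
```

Note how “gr” is flagged as a region code, and how “en-UK” fails because the region code for Great Britain is GB.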

Using Both Header Tags and a Sitemap

You are so going to mess things up! Choose one way to alert the search engines or the other. If you do both, you have two sets of annotations to keep in sync, and any disagreement between them is liable to muck things up. Just don’t do it.
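If you settle on the sitemap route, one way to keep things consistent is to generate every annotation from a single list of language versions. The sketch below uses Python’s standard `xml.etree.ElementTree` and the `xhtml:link` form that sitemaps use for alternate language pages. The function name `build_sitemap` and the example.com URLs are hypothetical, and a real sitemap would cover many groups of pages rather than one.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("", SITEMAP_NS)       # default namespace for <urlset>
ET.register_namespace("xhtml", XHTML_NS)    # prefix for <xhtml:link>


def build_sitemap(alternates):
    """Build a sitemap in which every URL lists every language version,
    itself included, so no return tag can be forgotten."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in alternates.values():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for language, href in alternates.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": language, "href": href},
            )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical language versions of one page.
versions = {
    "en": "https://example.com/en/",
    "fr": "https://example.com/fr/",
}
print(build_sitemap(versions))
```

Because each `<url>` entry is built from the same dictionary, the English and French entries always point at each other.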

Expecting Immediate Results

Let’s say you coded your pages and then immediately searched different language versions of Google.

Don’t bother!

Why?

Because it takes time for your pages to be crawled and for the search engine results pages to catch up. Give it a few days if your site is crawled regularly. If not, wait however long your site usually takes to be recrawled, and then check.

You can check Google’s crawl rate for your site in Google Webmaster Tools and Bing’s crawl rate in Bing Webmaster Tools. A little checking can save you tons of frustration and thinking that you did things wrong, when you may well have done them right.

A Good Example of a Markup Error

Let’s pick on Firefox. When you go to their English-US page, you can check the page source and you’ll see how many different languages they have accounted for. This is a mere sample:

These are just the first few of Mozilla’s language tags and they go on for days. But notice that before they start, Mozilla names the canonical version of the page, which is the English version.

Here’s what we find when we search for Firefox in Google U.S.:

Notice that the canonical version URL shows in the 1st result. This is fine because it’s the page Americans would want to see. No worries there.

But, let’s move over to Google.gr or Google for Greek speakers:

Again, the canonical is 1st and the Greek version is second.

And in French, take away the AdWords ads, and the same thing happens:

Canonical, and then French.

Is Mozilla shooting itself in the foot here? Should they have specified a canonical page?

Perhaps not.

Google confirms this in a Webmaster Central blog post from 2010. Multilingual content should use the hreflang tags rather than canonicals. This would no doubt allow the proper versions of the Mozilla pages to show up first, since they already rank second. (However, don’t think this will happen automatically across the different language SERPs. A page at position #1 in French might be #5 or #6 in another language. These just happen to work so that if the canonical hadn’t been designated, the proper version would appear first.)

I know that if I were searching in the U.S. and found the Tagalog version of a page first, I’d be rather perplexed, wouldn’t you? Better to use the hreflang markup and forget the canonical.
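If you want to check your own pages for this situation, here is a minimal sketch using Python’s built-in `html.parser`. It flags a page that carries hreflang annotations while its canonical points at some other URL. The markup in `greek_page` is invented for illustration and is not Mozilla’s actual source, and `canonical_conflicts` is a made-up helper name.

```python
from html.parser import HTMLParser


class HeadLinkParser(HTMLParser):
    """Collect the canonical URL and hreflang alternates from <link> tags."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "canonical" in rel:
            self.canonical = href
        elif "alternate" in rel and attributes.get("hreflang"):
            self.alternates[attributes["hreflang"]] = href


def canonical_conflicts(page_url, html):
    """True if the page has hreflang tags and a canonical pointing elsewhere."""
    parser = HeadLinkParser()
    parser.feed(html)
    return bool(parser.alternates) and parser.canonical not in (None, page_url)


# Hypothetical head markup for a Greek page whose canonical is the English page.
greek_page = """
<link rel="canonical" href="https://example.com/en/product/">
<link rel="alternate" hreflang="en" href="https://example.com/en/product/">
<link rel="alternate" hreflang="el" href="https://example.com/el/product/">
"""

print(canonical_conflicts("https://example.com/el/product/", greek_page))
```

A page whose canonical names itself, or that has no canonical at all, is not flagged by this check.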

Why is Any of This Important?

Making mistakes when coding for international pages can mean that your pages show up wrong for different languages, across Google’s different language SERPs. It can also mean that, if your various language pages are identical, you have duplicate content on the same domain. Not good.

So… It’s time to get this right, as you shouldn’t be using canonical and hreflang at the same time. With hreflang in place, spiders will get that alternate language versions of one page carry the same content, just written in different languages. Just be sure that you’re getting everything else right.

Let’s face it, we’re living globally online, not just locally anymore. If your business can serve everyone around the globe, proper language tagging isn’t just something to consider, it’s something to rock!

Comments

Holy cow, Stefano, you're right! Mea culpa, Mozilla and readers! I didn't notice the category tag in the first instances. Totally spaced on those. I was only looking at the end of the URL.

But even so, Google is saying it's better to avoid the canonical when you're using hreflang. Hmm... Now, it really makes me wonder. Is it better to use both so your pages show twice? Have to do more digging.

Are you using both hreflang and canonical on any of your pages or your clients' pages? I'd be really interested to see SERP results for those.

I'm not sure I got the issue with the canonical and hreflang... since the screenshots show what I'd expect to see, and what I believe Mozilla guys want to see: 2 URLs in the right language in each case (the Greek screenshot shows 2 Greek results, and so the French...).