Archive for the 'Web Development' Category

Google have just issued an email to users of Google Merchant Centre (which powers Google Product Search), telling merchants who sell electronics that they need to start including at least two unique product identifiers from a choice of MPN, Brand and EAN.

The email reads as follows:

Starting in the first months of 2011, we are making some changes to how your products in the Electronics category may appear on the UK and German versions of Google Product Search. In order to provide as much information as possible to our users, we need your help in matching your products to our “product pages” where users are able to view useful data such as product specifications or reviews. If you are submitting products in the Electronics category, please start including at least two of the following three unique product identifiers: MPN, Brand, and EAN.

These attributes will not be required for your feed to process correctly, however, to avoid seeing a drop in traffic from Product Search, we urge you to start including these unique identifiers as soon as possible. You can find unique product identifier information on Product Search product pages, under the “technical specifications” section for Electronics items.
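For those submitting XML feeds, the identifiers are just extra elements on each item. Here's a sketch of what that might look like (the product details are invented and the element names are my understanding of the feed format – check the current Merchant Centre feed specification for the exact attribute names):

```xml
<item>
  <title>Acme SuperPhone 3000</title>
  <link>http://www.example.com/products/superphone-3000</link>
  <g:price>199.99</g:price>
  <!-- Unique product identifiers: include at least two of these three -->
  <g:brand>Acme</g:brand>
  <g:mpn>SP-3000-UK</g:mpn>
  <g:ean>5012345678900</g:ean>
</item>
```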

If you look closely at the header of my website, under my phone number and email address, you’ll see I’ve added one of Facebook’s new Like buttons. There are all sorts of implications for Facebook’s developments in this area, not least that they move Facebook from a destination to a platform, enabling people to find things and communicate across sites without having to visit Facebook itself – fundamentally changing online marketing and advertising (at some point…). You can read more about how this will impact us all on the web here (Search Engine Land) and here (Web Pro News).

This post isn’t to get all excited predicting the future though – it’s to help you engage with this shift right now, by simply adding a Like button to your site and enabling your visitors to start sharing their support for you and connecting with each other – driving word of mouth and traffic to your site.

There are two methods and I’ll concentrate on the simple one – using an iFrame to add the button to your site. You could also use the XFBML method, which gives you much more control, but also requires you to be utilising the Facebook JavaScript SDK and my guess is that only the web developers amongst you will fancy that…

All you need is this page: Facebook Like Button Page. Scroll down to the form and enter the relevant details – your page/site’s address, choose a layout style (mine is Standard without Show Faces), set the width to match where you’re going to put the button on your page, choose the font to match your style and either the light or dark colour scheme. The code you will get will look something like this:
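Something like this, in fact – this is the iframe the generator produces, with my details swapped for example.com as a placeholder:

```html
<iframe src="http://www.facebook.com/plugins/like.php?href=http%3A%2F%2Fwww.example.com%2F&amp;layout=standard&amp;show_faces=false&amp;width=450&amp;action=like&amp;colorscheme=light&amp;height=35"
        scrolling="no" frameborder="0"
        style="border:none; overflow:hidden; width:450px; height:35px;"
        allowTransparency="true"></iframe>
```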

All you need to do is cut and paste the code into the right place in the HTML of your page/template, and the Like button will magically appear. The astute amongst you will notice that there are some things you can play with to affect the appearance, notably the use of a frameborder and the use of a CSS style to define the size and appearance of the iFrame that encompasses the button.

I made some slight alterations to the code for my site, as the button was sitting too close to the email address:
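The altered code looked something like this (again with my site’s address swapped for example.com) – note the extra margin-top in the style attribute:

```html
<iframe src="http://www.facebook.com/plugins/like.php?href=http%3A%2F%2Fwww.example.com%2F&amp;layout=standard&amp;show_faces=false&amp;width=450&amp;action=like&amp;colorscheme=light&amp;height=35"
        scrolling="no" frameborder="0"
        style="border:none; overflow:hidden; width:450px; height:35px; margin-top: 4px;"
        allowTransparency="true"></iframe>
```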

Notice that I have added an extra style declaration: margin-top: 4px; This gives me an extra four pixels of space above the Like button, making things neater. In fact, you can add any relevant CSS declaration here, meaning you can position and size the iframe accordingly. I had to play around with the original code a bit, as it was pushing parts of my site’s design around and breaking it.

So, what are you waiting for? Get off and get people liking your site (right after you click my Like button!)

As you might expect, I have a lot of friends who work in the web industry, from programmers to graphic designers, SEO experts to pay per click wizards. In a strange and unconnected coincidence, I have three different friends all with a pretty high degree of experience looking for new opportunities in the web development arena.

So, if you’re looking for a web developer to join your team right now, drop me an email and I’ll put you in touch. All of them have more than six years’ experience (many more in some cases), and by and large they are front-end developers with some scripting skills: they can do graphic design, layout, HTML/CSS and implement some JavaScript, ASP, PHP etc. They have a fair bit of management and client liaison experience too, so these aren’t techies with no social skills.

I also heard from a client today looking for a junior web manager, someone who can do a bit of graphics manipulation, HTML, SEO, AdWords etc. Some of that can be provided as training, the key is that you understand the web and have some basic skills to start with. Again, if you know someone, drop me an email.

Long time no blog! I hope you all had a good festive season. I thought I would kick off the new year with a technical post, as Google announced cross-domain support of the Canonical tag last month (worth reading for the explanations of when you might want to use it and how to implement).

You may remember from my earlier post on the canonical tag that it is a way of telling the search engines the “master” address of a page, when multiple addresses for the same content might exist. Why would you have multiple addresses (URLs) for a page, you might wonder? Well, how about a product list on an e-commerce website with options for ordering the products alphabetically, by price or by manufacturer? It’s likely that the URL will be different in some way for each version of the list, even though its contents are actually the same. That means that a search engine will index all three versions (or possibly six if you have reverse-order options too).

Why the problem? Well, you probably want visitors to see that list in a certain order the first time they visit, let’s say ordered by price, cheapest first. If Google has all six versions of that page in its database, what’s to say it won’t link to your price: descending (i.e. most expensive first) list from its search results? That might make you look expensive and put off potential buyers.

The other issue is link juice – with multiple addresses for the same page, you might have some links to one URL, some to another, all essentially to the same page but for Google, they are different pages. That means the link juice is being split between those different versions of the page. So, using the rel=canonical tag, you can tell Google what the master version of the page is and that therefore, all link juice should be applied to that version and that’s the one that should appear in search results.
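As a reminder, the tag itself looks something like this (the domain and path here are placeholders for your own master URL):

```html
<link rel="canonical" href="http://www.example.com/products/toys" />
```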

It goes in the <head> section of each version of the page, so in the product list example, every version of the page would contain the same canonical tag regardless of which sort order is being displayed at the time. This would probably be done automatically by your content management system, so that when a different category of products is being displayed, the canonical tag references the correct category/product list, because it’s likely the same page template is used for all categories.

In effect, the canonical tag works like a 301 redirect, but without you having to mess around with server settings. What changed in December is that you can now make cross-domain (i.e. cross-website) canonical tags, when before, you could only use it within one domain. So, even those of you with problematic servers (for example, on shared Windows hosting without access to IIS Admin) can now create “301”-style redirects, avoiding duplicate content issues.
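For example, if a duplicate copy of a page lives on a secondary domain, that page can now point at the master copy on your main site (both domains here are placeholders):

```html
<!-- In the <head> of http://old-site.example.com/products/toys -->
<link rel="canonical" href="http://www.example.com/products/toys" />
```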

Do you have a search box on your website that contains a phrase like “Enter Search” or similar? Are you using Google Analytics to track Site Search?

I’ve noticed this on several sites for a while, so thought I would post about it. In most cases where there is text already in the search box, instructing the user what to do, that text tops the list of keywords searched for on the site. Take this example:

As you can see, the search box in the top left of the page has the text “Keyword/Code Search…” inserted by default, and it disappears when you click in the box. Can you guess what the most popular keyword used to search on the site is?

Yep, “Keyword/Code Search…” by a long way! What does that tell us about this use of text in the search box on a website?

My opinion is that it isn’t sufficiently clear to the user what they are supposed to do. So they click the arrow next to the box, expecting it to take them to a full search page, but instead, it gives them the search results from the site for “Keyword/Code Search…” In this site’s case, that gives you a full list of all the products in the catalogue, but not a search page – you get the same box and text again.

My take on this is that designers need to give clearer instructions on how to use the search box on a web page. Just putting “Keyword” in the box doesn’t tell people how to use the function, but “Type what you’re looking for in this box” might just work.

Google launched a new Labs section of Webmaster Tools today, containing two features. The first is called Fetch as Googlebot, which shows you the page that Google gets when you enter a URL from your website. Quite handy to see what Googlebot sees, particularly HTTP headers. Here’s a screenshot of the tool showing the 301 permanent redirect from the old holding page to my new homepage on the Keyword Examiner site:

The other tool reports any Malware found on your site, but I’m happy to report I can’t give you a screenshot from one of my sites for that!

As a result of the MyDeco experience (see earlier post), we found that the site in question wasn’t recording campaign tracking (although obviously we can see referring websites). In case you’re not aware of Campaign Tracking, there’s a guide here.

MyDeco are keen for retailers to use campaign tracking to ensure more accurate results with a better quality of data. This is usually done by appending “?partner=mydeco” to the end of any link to the retailer, so that it shows up in their logfiles. If you’re using Google Analytics, this won’t do, as Google wants campaign tracking to be done in its own UTM format (see the guide linked above).
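For comparison, here’s what the two tagging styles look like on a link to a hypothetical retailer (the URLs and parameter values are made up for illustration):

```html
<!-- Plain logfile-style tagging (not understood by Google Analytics): -->
<a href="http://www.example-retailer.com/product/123?partner=mydeco">View product</a>

<!-- The Google Analytics UTM equivalent: -->
<a href="http://www.example-retailer.com/product/123?utm_source=mydeco&amp;utm_medium=referral&amp;utm_campaign=product-listing">View product</a>
```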

So, we tried this with the site in question, which is hosted on a Microsoft IIS server and written in ASP .NET (.aspx). This caused an error – the pages really didn’t like having a query string appended to the URL, which is what the “?” introduces. So, we needed a way to get Analytics to accept an alternative character to replace the “?” and thereby stop the website from throwing errors.

The solution, after some searching, was to use the anchor signifier “#” instead of “?”, which the website is happy to accept. However, you can’t just make campaign URLs with “#” instead of “?”, because by default Analytics won’t know what it means. You need to add this line of code to your Analytics tracking code (the code inserted into every page of your website when you set up Analytics):

pageTracker._setAllowAnchor(true);

To insert it, find the Google Analytics tracking code in your web page and add the line before the call to _trackPageview(), like this:
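Here’s the classic ga.js snippet with the extra line added – replace UA-XXXXXXX-1 with your own profile ID; the important thing is that _setAllowAnchor(true) runs before _trackPageview():

```html
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
  var pageTracker = _gat._getTracker("UA-XXXXXXX-1");
  // Tell Analytics to read campaign parameters after a "#" instead of "?"
  pageTracker._setAllowAnchor(true);
  pageTracker._trackPageview();
} catch(err) {}
</script>
```

Your campaign links then take the form http://www.example.com/#utm_source=mydeco&utm_medium=referral&utm_campaign=product-listing, which the site accepts happily because nothing after the “#” is sent to the server.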

I was discussing the issues around “hidden” or “protected” content with a client yesterday, specifically the problem that as a website owner you want as much content in the search engine’s index as possible, so that your site will be found, but you don’t actually want humans to see it without registering/paying.

This is an issue that has plagued paid-for content sites for years (see Danny Sullivan’s history lesson here). The problem is that whilst there are pretty simple technical solutions for letting search engine spiders into your site while preventing access to the casual human browser, pretty much any way of doing this you can come up with constitutes “cloaking” in the eyes of the search engines. If you have a look at Google’s Webmaster Guidelines on the subject, you can understand why this practice is frowned upon – they don’t want users to be taken somewhere they weren’t expecting, as that could severely affect the quality of the user experience and ultimately lead to people using another search engine.

I noticed that Google had made a blog post attempting to deal with this problem while I was on holiday – they want users to be able to find “protected” content because it may be just what they’re looking for, but not at the expense of inviting spam into the index. The solution is simple – allow Googlebot to index your site and when a user finds that page via a Google search, let them see the full page. If they want to access another “protected” page, Google is quite happy for you to require registration/payment; but not for that first page/article they clicked to from the search result. They call it “First Click Free” (FCF), something that has been accepted in Google News search for some time.

Initially, that sounds like a sterling solution. But it doesn’t take long to realise the problems here – firstly, a simple site: command search on Google for the site in question will reveal every page on the site. According to Google’s rules, if you click on any of those pages in the search result, you should see the whole article for free. So, a simple run down the full list of pages provided by that site: search gives you access to every page of paid content on the site in question.

Secondly, there are some simple technologies freely available out there to make you appear to be Googlebot or to make it look like every page you view has been referred from a Google search (here’s just one). So, using these, it would be simple to browse a site conforming to Google’s FCF rules and get access to every page – you wouldn’t even need to keep going back to that site: search listing.

So, what should the webmasters of such sites do? Well, you could take the view that the vast majority of web users have no idea about the site: command, changing user agents or accessing Google’s cache (the “Cache” link that appears under each search result that shows Google’s copy of the page in its database, rather than the “live” page). In which case, the vast majority of your site’s visitors will experience the site just as Google suggests.

However, if this becomes a popular method of allowing Google access to hidden content, how long before tools are developed and widely publicised to make things like changing your user agent incredibly easy? Eventually, there will be enough users doing it to really affect your site. In that case, there are a couple of options:

Create summary pages that contain “teaser” information to get the user’s attention and to work well enough in terms of SEO. In this case, your full protected pages won’t be accessible to Google or anyone else, but if the summary pages contain sufficient information and are optimised, they should still appear in searches and therefore do the job.

Change your business model slightly. Allow everyone access to at least one page of protected content when they arrive, then request registration when they move to another page. This is like Google’s FCF model, except it is universal rather than applying only to Google users. If so desired, you could use the <meta name="robots" content="noarchive"> tag in the head of your pages to prevent search engines making copies in their cache. However, this may have a negative impact on pages’ performance in search results, as search engines like to compare copies of a page over time to assess its “trustworthiness” and topical relevancy. Remember also that this may restrict crawling of your pages, as Google will experience the site in the same way – it will be able to access one page, but then get the “registration required” message. I would be interested to know if anyone has tried this and whether an XML sitemap gets all the pages indexed anyway?

Google has somewhat changed its mind about re-writing URLs, as they now claim to be better able to understand dynamic URLs (the sort of query strings you often see in e-commerce website addresses, for instance, along with many content management systems). The reason is that they now see query strings such as “search.php?keyword=toys” as more meaningful to the page’s intention and content than “search.php/keyword/toys”, which is how many URLs are re-written. The structure of the former is now properly identified by Google as a search term, whereas previously it may have had little meaning. Conversely, the latter now looks like a page three layers deep in the site, but doesn’t necessarily represent a search query, so Google is less likely to identify the true purpose of that page.

My take on this is that if you are re-writing URLs from something meaningless such as “page.php?id=76” to something meaningful like “page.php/seo-urls-still-good”, that still helps both the search engines and users to understand the contents of the page and I would continue to use it. If you are re-writing search queries like the “toys” example above, maybe you could try a few without the re-writes – but remember that you could lose the PageRank of the originals, so be sure to 301 re-direct the old URLs to the new ones (and update your sitemap accordingly!)

We had carried out a number of 301 redirects on some of their pages, as for reasons known only to the original developer, a lot of pages had been created as sub-domains, which was causing duplicate content and indexing issues with Google.

What I wasn’t aware of was that there isn’t any code in the site to auto-update the sitemap.xml file provided to Google Webmaster Tools. I hadn’t seen the error above before – clearly, Google is unhappy if too many of the URLs in your sitemap don’t match what it sees on the site. A lot of those URLs of course no longer exist (e.g. the sub-domains), so we have updated the sitemap using GSiteCrawler – it’s a bit techie, but it certainly does the job and can be scheduled to make regular updates with automatic FTP of the new sitemap.xml file.