Install and Enable the Pathauto Module

The Pathauto module is highly recommended. It automatically generates clean, customized URL aliases based on patterns that can include the node title, taxonomy terms, content type, and username. The core Path module must also be enabled for Pathauto to work.

Think carefully about how you want your URLs to look. It takes some experience with Drupal to get exactly the URL paths you want. The URLs are controlled by a combination of taxonomy and Pathauto, which I hope to cover in another tutorial. You can also use the Path module to write a custom URL for each page, but that can become tedious and inconsistent on a large site.
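As a sketch, a Pathauto pattern set for a blog-style site might look like this (the bracketed token names are assumptions based on Pathauto's token list and may vary by version):

```text
Content (story) paths:  blog/[title-raw]
Taxonomy term paths:    category/[catpath-raw]
User paths:             users/[user-raw]
```

The goal is short, keyword-bearing paths that stay stable over time.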

At the very least, enable the path module and install the pathauto module. It will generate nice-looking URLs for you without much configuration.

Caution: The above advice is directed at new Drupal sites. If you have an existing Drupal site, be very careful not to rename your existing URLs with the Pathauto module. It is generally a very bad idea to change existing URLs, because the search engines will no longer be able to find those pages.

Here are some pathauto settings to watch out for:

For the update action, choose "Do nothing. Leave the old alias intact." Otherwise the URL of a node will change every time you edit its title, causing problems with search engines.

Install the Global Redirect Module

The Global Redirect module automatically issues 301 redirects to your URL aliases. So if you have a node at example.com/node/5, Global Redirect will redirect that URL to your alias at example.com/my-page.

Install the Meta Tags (Nodewords) Module

The Meta Tags Module (formerly called "Nodewords Module") can be highly beneficial to your site. There is a myth in some search engine optimization circles that says, "meta tags are not important". This is not true.

Meta tags are not meant to be used for keyword stuffing. Don't use them for that purpose because it isn't going to help you. The really important meta tag is the meta description.

The meta description should be different on every page for best results. The meta description should be one or two brief sentences to summarize the page. It should be written for your human visitors, but it is not a bad idea to tastefully and sparingly insert a couple of your keywords. Often when a search engine lists your site in the search engine results pages, it will use your page's HTML title for the title, and your meta description for the text snippet. That is why the meta description should be written with human visitors in mind. You want a text snippet that is going to make them want to click on the link.
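To make this concrete, here is a hypothetical example of the tag as it would appear in a page's rendered HTML head (the content text is invented for illustration):

```html
<meta name="description" content="A step-by-step checklist for optimizing a Drupal site for search engines, covering URL aliases, redirects, and meta tags." />
```

One or two plain sentences, unique per page, written to earn the click.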

Here is one textbook example from this site: in the Google SERPs, the listing's text snippet is taken straight from the page's meta description.

I generally configure the Drupal Nodewords module to output the meta description and meta keywords on every page. I have a few default keywords set, and add a couple more on every post to make a unique combination of relevant keywords. I don't spend much time with it because I don't think the meta keywords are that important.

On the Nodewords module's administration page, be sure to check the box that says "Use the teaser of the page if the meta description is not set?". That way each page gets a unique meta description, even if some users are not allowed to create custom meta tags for their nodes.

Install the Page Title Module

The Page Title Module allows you to set custom page titles on every page. Highly recommended.

Google Sitemaps Module

Google Sitemaps are not essential, and I haven't been adding them to my Drupal sites. I think Google Sitemaps were created by Google primarily for debugging Googlebot, not for the benefit of search engine optimizers.

I recommend not using the Drupal Sitemaps Module. [See the comments on this article for a longer discussion about XML sitemaps and Drupal.]

Drupal Rewrite Rules

Make sure that your site does a permanent (301) redirect in either of the following two ways:

http://example.com to http://www.example.com, or

http://www.example.com to http://example.com

You can set up this redirect in your .htaccess file.

To remove the www from your site, look for the following code in your .htaccess file and uncomment and adapt:

# To redirect all users to access the site WITHOUT the 'www.' prefix,
# (http://www.example.com/... will be redirected to http://example.com/...)
# uncomment and adapt the following:
# RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
# RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]

To redirect to the www version of the site, look for the following code and uncomment and adapt:

# To redirect all users to access the site WITH the 'www.' prefix,
# (http://example.com/... will be redirected to http://www.example.com/...)
# adapt and uncomment the following:
# RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

Be sure to replace example.com with your domain name, and then test the redirects in a browser.

Fix Your HTML Headers

There should be exactly one <h1> element on every page, and it should contain your keywords.

Enclose your site name in <div> tags, not header tags.

I would add one <h1> element to the home page.

On teaser views, node titles should be wrapped in <h2> tags, while the main header of the page (e.g., the taxonomy term name) should be the <h1>.

On node view pages, the node title should be wrapped in <h1> tags.
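As a sketch, the relevant fragments of a theme's page.tpl.php and node.tpl.php might look like this (variable names from Drupal 5/6 core templates; your theme's actual markup and class names will differ):

```php
<?php // page.tpl.php: site name in a plain div, page title as the single h1 ?>
<div class="site-name"><?php print $site_name; ?></div>
<?php if ($title): ?>
  <h1 class="title"><?php print $title; ?></h1>
<?php endif; ?>

<?php // node.tpl.php: node titles become h2 links on teaser listings ?>
<?php if ($page == 0): ?>
  <h2><a href="<?php print $node_url; ?>"><?php print $title; ?></a></h2>
<?php endif; ?>
```

The $page check is what distinguishes a teaser listing (h2) from a full node view, where the page template's h1 carries the title instead.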

Duplicate Content from /node

By default, the front page of a Drupal site has nearly identical content to the page at /node. Search engines are going to spider and index /node because on the paginated home page view, the link to the first page in the series points at /node.

The fix is simple: always use a custom front page when building a Drupal site.

Drupal PHP Session IDs

I haven't seen this problem on Drupal sites in a long time, but if you see PHP session IDs in your URLs, it is very bad for search engines. They have to be removed if you want search engines to be able to spider your site well. A PHP session ID in your URL might look something like this: ?PHPSESSID=37765439acbd6c12345ee987776e65be.

From what I understand, if your server supports mod_php, the fix is a pair of PHP settings in your .htaccess file.
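A minimal sketch of those settings, assuming your host allows php_value overrides in .htaccess (the directive names are PHP's standard session settings):

```apache
# Stop PHP from appending PHPSESSID to URLs (mod_php only)
php_value session.use_only_cookies 1
php_value session.use_trans_sid 0
```

If these lines cause a 500 error, the server probably isn't running mod_php, and the same settings belong in php.ini instead.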

Otherwise you can probably fix it by modifying your php.ini file (or creating one). I don't know the exact procedure for every host; I only know that your web site must not have PHP session IDs in its URLs if you want good spidering by search engines. Search Drupal.org or Google for how to turn off PHP session IDs on your server.

Drupal and Robots.txt

The default Drupal robots.txt file has critical errors in it, even in Drupal 6.2 (a bug report has already been filed).

Comments

Saw your post on the Drupal web site right after mine - this is a great synopsis page. I'm sure you've read the post on my web site about Drupal search engine optimization and search engine crawlers. The more people getting the info about Drupal out there, the better - it's a great CMS to optimize for the search engines. My only complaint is that you can't use pathauto or a compatible module to automatically generate either tags or taxonomy terms and vocab.

Thank you very much for providing useful information about basic procedures to be followed. I love these tips and also I have got more benefits from JTPratt's http://www.smorgasbord.net/how_to_optimize_drupal_web_site_for_google_yahoo_msn_search_crawlers

On the whole I thank both of you people for providing useful information to webmasters in SEO

That is a great article about SEO...
Great, but it did not handle the problem of duplicate content in depth.
I think that is the missing piece...
There is a common problem with Drupal friendly URLs: the trailing slash.
Take for example:
http://yourdomain.tld/articles/drupal-seo
http://yourdomain.tld/articles/drupal-seo/
On a normal Drupal site with clean URLs enabled, these two addresses are basically interchangeable... to prevent the miserable "Page not Found" message.
It is fixed in your .htaccess file.
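One possible .htaccess rule for this (a sketch, not part of Drupal's stock file - test it on your own site before relying on it, and place it before Drupal's main rewrite rule so the redirect fires first):

```apache
# Redirect trailing-slash URLs to the canonical non-slash form
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [L,R=301]
```

The directory check keeps real directories (which legitimately end in a slash) from being redirected.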

Content duplication is an issue with most CMS packages, including Drupal. When using friendly URLs, one easy way is to block the non-friendly URLs from the sitemap submitted to Google to ensure they do not get indexed. The 301 redirect you suggest is a great way to get rid of the competing trailing-slash content.

Friendly URLs, Pathauto, the meta tag module, and Views together make Drupal an absolutely fantastic offering, without knowing anything about SEO.

Also, in addition to the Global Redirect module, you also want to make sure you 1) resolve any canonical domain name issues (www vs. non-www) in your .htaccess file and 2) set your preferred domain in Google Webmaster Central. Hope that helps!

I just wanted to point out that the idea of NOT including/submitting an XML sitemap to google really only applies in cases where someone is willing/able to dedicate time to checking and optimizing their SEO ratings (the sitemap submissions give "unnatural" results which can hide more important problems that SEO experts would otherwise be able to detect). Not all sites have the budget/resources/knowledge to do this. So ... if you know that this isn't going to be done, it is still advisable to submit an XML sitemap to the search engines so that the pages get indexed.

I disagree. I think the opposite is true -- it's only worth submitting an XML sitemap if you have an enterprise-size site and really know what you are doing and want to take a lot of time making sure everything is correct. IMO, if you're talking about a site with fewer than tens of thousands of pages, it's a waste of time. And if you have that many pages, you're still going to need a lot of inbound links to get them to rank anyway.

It's good to signup for Google's Webmaster Tools and verify, but I don't think it's worth the time to create and submit an XML sitemap.

The Drupal XML sitemap module has been broken for at least a year, which is another reason not to use it.

XML sitemaps don't help with rankings at all. They're only for getting pages indexed.

If you want all the pages to get indexed, create a good site architecture. It's easy with Drupal. You can either make an HTML sitemap, and or use taxonomy or views to create pages with links based on keyword. Then get inbound links to those category pages in order to distribute the PR/juice.

If you want new content to get instantly indexed, send your RSS feed through Feedburner and have Feedburner ping Google Blogsearch.

I guess I'll have to eat my words. I definitely think that site architecture should come first. Doing things to ensure that your content is properly linked together in a logical manner, like setting up taxonomy and tagging with appropriate keywords, will make sure there are links to your content so that crawlers can find all your pages to index them. I was under the impression that sitemap submission would give you a small boost in ranking, since there are weights that can be set in the sitemap, but I'll defer to the experts if you say this isn't so.

Thanks, by the way for a great collection of SEO tips, and for the speedy response.

The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0. This value does not affect how your pages are compared to pages on other sites—it only lets the search engines know which pages you deem most important for the crawlers.

The default priority of a page is 0.5.

Please note that the priority you assign to a page is not likely to influence the position of your URLs in a search engine's result pages. Search engines may use this information when selecting between URLs on the same site, so you can use this tag to increase the likelihood that your most important pages are present in a search index.

Also, please note that assigning a high priority to all of the URLs on your site is not likely to help you. Since the priority is relative, it is only used to select between URLs on your site.
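For illustration, a single sitemap entry carrying a priority looks like this (element names from the sitemaps.org protocol; the URL is a placeholder):

```xml
<url>
  <loc>http://example.com/my-page</loc>
  <priority>0.8</priority>
</url>
```

Omitting the priority element entirely is equivalent to the default of 0.5.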

You can also let search engines know what pages are important without an XML sitemap by linking to a page with internal and external links. The Google Webmaster Tools let you view how many internal pages link to each page on your site.

I wouldn't install a stock XML sitemap generator plugin on a small- or medium-sized site. I think Google's ability to find a site's content is more sophisticated than the average automated sitemap generator.

I think that XML sitemaps might be a good idea on really large websites that aren't getting all of their pages indexed. In that case, I don't think a one-size-fits-all sitemap generator is going to do the trick either.

"It’s really not about the ranking; it’s more about crawling… Sitemaps doesn’t impact your ranking at all.

The only way it impacts ranking is that it helps with that very first obstacle of learning about all your pages, because if we don’t [know] about them we won’t index them, and we won’t rank them. But other than that it has no impact on ranking."

I'm working on a Drupal website and these days I'm dealing with the SEO issues. This tutorial is perfect, and very clear, but I have one question:

Maybe this is a stupid question, but, regarding "Duplicate Content from /node", what does "always use a custom front page" mean?

I've developed a front page called "inicio". On the "Administer/Site configuration/Site information" page, I've set the Drupal front page to: www.mysite.com/inicio.

So, when the URL is "www.mysite.com", it shows my front page, and when the URL is "www.mysite.com/inicio", it shows the same page. This may be interpreted as duplicated content, am I wrong?

Reading that topic, to solve this problem I have to use a custom front page, but I don't know what that means. Must I create an index.html page or something like that? Can my custom front page be made with Drupal? If I'm not wrong, the "front page" parameter is mandatory, so how could I solve this problem?

Just wanted to let you know (since this was written way back in 2008) that I have been using the XML sitemap module for Drupal 5 with pretty good results. I like being able to prioritize the content I really want prioritized. It definitely takes some tweaking, as out of the box it finds every available page, many of which you won't use. For example, it automatically indexes all of your taxonomy term pages, which you may simply use for sorting via Views and not want visible as pages.

However, with about an hour's worth of dedication to XML sitemap, I now have my Drupal site generating spot-on sitemaps, reflective of how I want my content prioritized, and it automatically submits them to Google.

There is also now another version of XML sitemap (http://drupal.org/node/449710) that has been rewritten to solve the annoying bugs and architectural problems of previous versions. Currently it's incomplete, with support for nodes and menu links; taxonomy terms and user profile links are coming soon.