How To Optimize WordPress Robots.txt File For Search Engine Bots

Whenever we talk about the SEO of WordPress blogs, the robots.txt file plays a major role in search engine ranking. It lets us block search engine bots from crawling and indexing parts of our blog that we want to keep out of search results. However, a wrongly configured robots.txt file can make your site disappear from search engines entirely. So whenever you make changes to your robots.txt file, make sure it is well optimized and does not block access to the important parts of your blog.

There are many misunderstandings regarding the indexing and non-indexing of content via robots.txt, and we will look into that aspect in this article as well.

SEO consists of hundreds of elements, and one of its essential parts is the robots.txt file. This small text file sitting at the root of your website can seriously help optimize it. Most webmasters tend to avoid editing the robots.txt file, but it's not as hard as it looks. Anyone with basic knowledge can create and edit a robots.txt file, and if you are new to this, this post is perfect for you.

If your website doesn't have a robots.txt file yet, this post will show you how to create one. If your blog or website already has one but it isn't optimized, follow along and optimize it.

What is WordPress Robots.txt and why should we use it?

The robots.txt file tells search engine robots which parts of your site to crawl and which parts to avoid. When a search bot or spider arrives at your site and wants to index it, it reads the robots.txt file first and follows its directives when deciding whether to crawl any page of your website.

If you are using WordPress, you will find the robots.txt file at the root of your WordPress installation. For static websites, if you or your developer have created one, you will find it in your root folder. If you can't find one, simply create a new text file in Notepad, name it robots.txt, and upload it to the root directory of your domain using FTP. Here is ShoutMeLoud's robots.txt file; you can see its content, and it sits at the root of the domain.

As I mentioned earlier, robots.txt is a plain text file. So if you don't have this file on your website, open any text editor you like (for example, Notepad) and create a robots.txt file with one or more records. Every record carries important information for the search engine.
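For example, a minimal record looks like this (the bot name and path here are placeholders; adjust them for your own site). The User-agent line names the bot the record applies to, and the Disallow line names the path that bot should skip:

User-agent: Googlebot
Disallow: /wp-admin/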

By using the Disallow option, you can restrict any search bot or spider from crawling a page or folder. Many sites disallow their archive pages or folders to avoid creating duplicate content.
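For instance, a record like this (assuming the archives live under /archives/, which depends on your permalink settings) keeps compliant bots out of the archive section:

User-agent: *
Disallow: /archives/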

Where Can You Get the Names of Search Bots?

You can find them in your website's logs, but if you want lots of visitors from search engines, you should allow every search bot. That way, every search bot can index your site. You can write User-agent: * to address every search bot at once. Example:

User-agent: *
Disallow: /cgi-bin

With this rule in place, every search bot can index your website, except for the /cgi-bin directory.

What Shouldn't You Do?

1. Don't clutter your robots.txt file with unnecessary comments.

2. Don't put a space at the beginning of any line, and don't insert stray spaces inside directives. Example:

Bad Practice:

User-agent: *
Dis allow: /support

Good Practice:

User-agent: *
Disallow: /support

3. Don't change the order of the directives.

Bad Practice:

Disallow: /support
User-agent: *

Good Practice:

User-agent: *
Disallow: /support

4. If you want to block more than one directory or page, don't write their names together on a single line:

Bad Practice:

User-agent: *
Disallow: /support /cgi-bin /images/

Good Practice:

User-agent: *
Disallow: /support
Disallow: /cgi-bin
Disallow: /images

5. Use uppercase and lowercase letters correctly. For example, if you want to block the "Download" directory but write "download" in your robots.txt file, the search bot will not match it, because robots.txt paths are case-sensitive.
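Assuming the folder on your server is literally named "Download" (used here purely as an illustration), the rule must match that exact case:

User-agent: *
Disallow: /Download/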

6. If you want every page and directory of your site to be indexed, write:

User-agent: *
Disallow:

7. But if you want to block all pages and directories of your site, write:

User-agent: *
Disallow: /

After editing your robots.txt file, upload it via any FTP software to the root (home) directory of your site.

Robots.txt for WordPress:

You can edit your WordPress robots.txt file either by logging into your server via FTP or by using a plugin like Robots Meta to edit it from the WordPress dashboard. There are a few things you should add to your robots.txt file along with your sitemap URL. Adding the sitemap URL helps search engine bots find your sitemap file, which leads to faster indexing of your pages.

Here is a sample robots.txt file for any domain. In the Sitemap line, replace the URL with your own blog's sitemap URL:
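(The entries below are an illustrative sketch rather than a definitive recommendation; they reflect the kind of rules discussed in this post, so review each Disallow before using it on your own site, and note that example.com is a placeholder.)

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /index.php
Disallow: /archives/

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

Sitemap: https://example.com/sitemap.xml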

Now that you have made some changes to your robots.txt file, it's time to check whether any of your content is impacted by the update. You can use the 'Fetch as Googlebot' tool in Google Webmaster Tools to see whether your content can still be accessed under your robots.txt rules. The step is simple: log in to Google Webmaster Tools, go to Diagnostics > Fetch as Googlebot, add your site's posts, and check whether there is any issue accessing them.

Fetch as Google Bot

You can also check for crawl errors caused by your robots.txt file under the Crawl Errors section of GWT. Under Diagnostics > Crawl Errors, select 'Restricted by robots.txt' and you will see which links have been blocked by the robots.txt file.

Here is an example of a robots.txt crawl error for ShoutMeLoud:

Google Crawl Error

You can clearly see that the replytocom links have been blocked by robots.txt, along with other links that should not be part of Google's index. As a reminder, the robots.txt file is an essential element of SEO, and you can avoid many post-duplication issues by keeping it up to date.
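If you want to block such replytocom parameter URLs on your own blog, a rule along these lines should work (the * wildcard is understood by Googlebot and most major crawlers, though it is not part of the original robots.txt standard):

User-agent: *
Disallow: /*?replytocom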

Are you using a robots.txt file to optimize your WordPress site? Do you have more insights to add about robots.txt? Let us know in the comments. Don't forget to subscribe to our e-mail newsletter to keep receiving more SEO tips.

Thanks, I have edited my robots.txt file and it looks great now. Harsh, can you help me with robots meta tags? I have seen many websites that don't use the noindex meta tag yet still get high rankings, but when I tried that approach, I got duplicate pages in GWT. Please suggest something…

Thank you for these wonderful tips. I had a problem with my site not being reachable by Googlebot for the past two weeks. I had been searching for the source of the problem but couldn't find it until I came across this post and realized that wp-includes was disallowed. Oh damn it! Thank you.

Thank you very much for your post. After reading it, I corrected the robots.txt files I had on my WordPress websites and, believe it or not, they were very poor. I hope things now get better little by little. See you!

Also, please explain why you have disallowed the index.php file; it is the main landing page and should be indexed by search engines. I know you know better than me, but please explain why you disallowed it and what effect it will have on search engines crawling the index page.

Hello, this may be off topic, but I need help with Blocked Resources and how to remove them. Last time I checked, 400 links were blocked because of the "Disallow: /wp-includes/" line, so I removed that line from my robots file to allow it. But now, after two weeks, the blocked links are up to 670, and I don't understand what to do. I checked all the blocked links with the Fetch and Render option; the page layouts are not affected and display perfectly, but the ad banners and some other minor code are blocked. So I'm confused now: does an increase in Blocked Resources affect a website's SEO, or is there no problem if the number of links increases? Please help with this.

I just went through a Yoast post, and they say we should no longer disallow wp-content and wp-includes, as Googlebot has started fetching CSS and JS files too. He also said there is no point in adding the sitemap to the robots file, and that wp-admin is excluded by default from WordPress 4.0 onwards. Can you please research this and let us know more details?
Thank you.

Yes, that is true! This article is misleading, as it suggests disallowing wp-includes. This alone can cause problems with indexing and can have an impact on site ranking. I just received an email from Google noting that Googlebot was not able to reach all parts of my site. That's why I searched a little and found that I had wp-includes disallowed in robots.txt; by allowing it again, I fixed my issue.

I really found it helpful! I moved my site to GoDaddy but forgot to modify the robots.txt file and was searching Google for the best robots file. Well, ShoutMeLoud may be the best blog, so why not use the same robots.txt file for myself? 😉
Thank you, Harsh bro 😉

Superb article!
Even I faced many problems editing the robots.txt file for my WordPress blog initially, while trying to improve its search engine optimization, until I learned exactly how to edit it. Now I know the importance of the robots.txt file for search engine traffic.

One more superb article. After reading this post, I made changes to my robots.txt file as you mention here; it will really help me keep my website search engine friendly in the future.
Thank you so much for sharing your knowledge. 🙂

I had little knowledge about robots.txt and had never touched this area. As I had been told elsewhere, you have to be very careful before using it, and I never even knew about this option. The examples helped a lot.
Thanks 😉

Hello Harsh,
AdSense tells me to allow its crawler for better indexing and for placing targeted ads, so I have to edit my robots.txt. This post is really helpful for me. I just want to ask: is it better to use a WordPress robots.txt plugin instead of editing the file myself?

This was a useful post, thank you. I have a standard robots.txt that I have been using for about 5 years. I knew it was way out of date but I didn’t know exactly what to change. This has given me a lot of ideas.

My domain is 3 years old. About a month ago I started SEO for this project, and within one month I had good results in all the search engines. But after I uploaded a robots.txt file, all my keywords disappeared. I don't know why; sometimes a robots.txt file can behave very badly.

I have disallowed mediapartners and adbot because I am not using AdSense. At the same time, my main question is: by allowing only
User-agent: Googlebot-Image
Allow: /wp-content/uploads/
we are authorizing the Google image bot, but what about Bing and Yahoo images? Is this line of mine correct:
Allow: /wp-content/uploads
and will it help images get indexed in other search engines? Also, I have listed tags and categories separately, as I had a large number of (useless) tags, categories, and pages indexed in Google, and my aim is to recover from the Panda effect.
Any advice would be helpful. Thanks.

OK, thanks, I have received an answer to my first question, but the second one remains: what should I add to robots.txt so that wp-content images get indexed by Bing, Yahoo, and other search engines? Is this line correct:
Allow: /wp-content/uploads
instead of using
User-agent: Googlebot-Image
Allow: /wp-content/uploads/
or should I use both?
Secondly, if I am not using AdSense, do I need to allow mediapartners and adbot?

Hi Abby Stone, thanks a lot for this info. I guess blocking via robots.txt is the easier way to block content. I have recently seen a lot of 404s in GWMT, with /href= at the end of the URL after a theme change. Can this be blocked using Disallow: /href= in robots.txt without the original URL being affected?