Robots.txt to disallow /index.php/ path

I have a problem with my Joomla site (yeah - me too!). I get a large number of /index.php/ URLs despite using a program to handle these issues. The URLs cause indexation errors with Google (404). Now, I fixed this issue once before, but the problem persists. So I thought, instead of wasting more time, couldn't I just disallow all paths containing /index.php/?

I don't use that extension, but would it cause me any problems from an SEO perspective?

How do I disallow all index.php's? Is it a simple: Disallow: /index.php/
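One way to sanity-check a rule like this before deploying it is to run it through a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser`; the `www.example.com` host, the sample paths, and the `is_allowed` helper are made up for illustration, and this parser does plain prefix matching, so treat it as an approximation of how a crawler would read the file rather than a guarantee of Google's behavior.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt containing only the rule discussed above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php/
"""

def is_allowed(path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # The host is a placeholder; only the path is matched against the rules.
    return parser.can_fetch(agent, "https://www.example.com" + path)

if __name__ == "__main__":
    for path in ["/index.php/some-article", "/some-article", "/index.php"]:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Because the rule ends with a slash, it covers everything under /index.php/ but not the bare /index.php URL itself.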

8 Responses

Do you have inbound links pointing to your index.php pages? If so, disallowing them might affect your SEO. Disallow: /index.php/ is perfect, but after implementing it, don't link internally to those index.php pages. Can you share your website URL so that I can show you how to do it with an example?

Couldn't you rewrite those /index.php/ urls to remove the /index.php/?

Like this in .htaccess:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /index.php/$1 [L]

I've only used Joomla once, but there must be a way to configure Joomla to just use "/" instead of "/index.php/"?

Update:

Here's a solution to your /index.php/ issue:

http://www.eprcreations.com/remove-index-php-from-joomla-urls/

Once you've updated that, and have your urls working properly without the /index.php/, you could add this slight modification of the rewrite rule above so that all your old /index.php/ urls would be 301'd to your new ones:
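The modified rule is not shown in this thread. As a rough sketch of the mapping such a 301 needs to perform, here is the path rewrite expressed in Python rather than .htaccess syntax. The function name `strip_index_php` and the sample paths are hypothetical, and an equivalent redirect rule would still have to be configured on the server.

```python
import re

# Matches a leading "/index.php" only when it is a whole path segment,
# i.e. followed by "/" or by the end of the path.
_INDEX_PHP = re.compile(r"^/index\.php(?=/|$)")

def strip_index_php(path: str) -> str:
    """Return the clean path that an old /index.php/ URL should redirect to."""
    cleaned = _INDEX_PHP.sub("", path, count=1)
    # A bare "/index.php" collapses to an empty string; send it to the home page.
    return cleaned or "/"

if __name__ == "__main__":
    for old in ["/index.php/about-us", "/index.php", "/contact"]:
        print(old, "->", strip_index_php(old))
```

Running a list of old URLs from a crawl report through a helper like this can help confirm that each one has a sensible target before the redirect goes live.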

Well, I tried the sensible solution of redirecting to the correct URL instead. However, the SEF program is quite limited and keeps on creating new URLs regardless of my modifications. I'm looking for a more permanent solution, and the disallow seems a bit simpler, as I'm not a super programmer.

Hi Mikkel,
I have checked your robots.txt, and it looks perfect. If you redirect /index.php to the home page, either with the .htaccess file or with a Joomla plugin, that would be great for you. And it's also a permanent solution. :)

Like Chris, I spidered your site and couldn't find any links to /index.php files, which probably indicates one of three things:

1. You've fixed the problem - Yay!

2. Google is finding those links from external sources.

3. Google found those links at one time in the past, and is still trying to crawl them.

In the Crawl Errors report in Google Webmaster Tools, if you click on the link of each 404, there's often a "linked from" source where you can see where Google discovered the broken link. This is really helpful in rooting out the cause.

Regardless, I'm going to go with #1 and optimistically believe that you were able to fix the problem. :)
