This would count a click for example2.com in our script, then redirect to example2.com.
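Roughly speaking, a link.php?url= script like this boils down to something along the following lines (a simplified sketch only - the counts.txt flat file is just a stand-in for whatever logging the real script does):

<?php
// link.php - simplified sketch of a click-counting redirect
// Usage: link.php?url=http://www.example2.com/

$url = isset($_GET['url']) ? $_GET['url'] : '';

// Only pass through http/https URLs; anything else goes back to the home page
if (!preg_match('#^https?://#i', $url)) {
    header('Location: /');
    exit;
}

// Append one line per click to a flat file (a real script might log to a database)
$line = date('Y-m-d H:i:s') . "\t" . $url . "\n";
file_put_contents('counts.txt', $line, FILE_APPEND | LOCK_EX);

// Send the visitor on; header('Location: ...') defaults to a 302
header('Location: ' . $url);
exit;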

The problem is that we have over 100,000 outbound links on our site, and it is virtually impossible to check all of them manually.

What preventive measures can I take to make sure I'm not penalized for broken links, or for linking to sites that use methods search engines frown upon? (I'm referring to the links after the url= parameter.)

I would prefer a script that redirects in such a way that the target doesn't count as an outbound link. Is this possible?

I'm seriously thinking about writing a script that puts the link in a text box, since I'm so sick of seeing 404 errors every time I use a validator. I don't want to be penalized if any of these links are bad.

Any thoughts?

Since it is a "redirect" and I don't actually have a link to anything directly, could I still be penalized for my redirected links? (every outside link on my site uses the link.php?url= format)

...I was thinking, what if I add a robots.txt file like this:

User-Agent: *
Disallow: /link.php

That should solve the whole problem, right?

I'm not sure whether blocking the robots has anything to do with how an outbound link is counted, though.

This doesn't answer your question, but I've been in the same boat for a while. I had a blog-like site that linked to everything through a 302 redirect so that I could count clicks on the links. After reading here about "page hijacking" and learning that Google may really frown upon 302 redirect links because of all the hijackers, and after Google decided for some reason to kick me where it counts at the beginning of February and send me only 10% of the visitors I was getting before, I've given up counting click-throughs and just use direct outbound links instead. I'm waiting to see if my Google traffic picks up at all. So far, nothing.

At anybrowser.com there is a tool called 'Link Check' that will check the links on a page for you. I think this is better than secretly redirecting, because I believe secret redirection is something your site could potentially be penalized for (more so than just some broken links, etc.).

Having quality, direct, outbound links is a good thing!

If you do not agree, just put the link directory in a separate folder and block that folder from robots.

If you are going to continue to use a method to hide the links, then, yes, I think #1 is worse. While you may be using the technique for good, others are not. You could get lumped in with the "bad hats" or "black hats", or whatever they are currently called, as opposed to the "white hats". I personally would not take this risk. You could be dropped completely from Google (and other search engines). I have seen this happen.

Direct links are the best way to go. But 25% bad links (404 errors) is a lot! You could easily get this down to a much lower percentage by using a free program, like the one at anybrowser.com, to cut your numbers in half or delete the bad links completely. You aren't going to be able to keep running the link directory in future years unless you take the time to do this.

Hire someone in another country (with a lower minimum wage) to do the work for you. For 100,000 links, it would take your employee about 20 days at 8 hours a day. It would be worth the money or time. My timetable is based on how long it takes me, or my own foreign employee, to check my links, a little over 5,000. Either one of us can check them all in one 8-hour day.

Thanks, larryhatch, for the link. One thing I did notice: don't click on anything until it finishes! I clicked on something and the entire process started over!

I found help with my site through word of mouth; I know someone who has a site for WAHMs (work-at-home moms). Visit some sites like this and post a message that you're looking for help. There are plenty of WAHMs looking for work.

I still like to manually check the links every now and again. If a page turns into a search engine or into a different site, link checking programs won't always catch it.

My directory has 13,000 outbound links, and I want to keep a high percentage of them working - my target is 99% working at all times.

1. I copy my working directory to an Apache htdocs folder on my design machine and, using UltraEdit, remove all the redirection scripts across the 800 pages.

2. Using Apache and Xenu, I check all the links for a HEAD response and for redirection to a new domain, and make changes to the directory as needed (see the sketch after this list). I call each company after 2 weeks to see if they have a new domain or what is happening.

3. As NetSol no longer returns a redirection, I then locate every Netscape server and check those links for redirections through Internet Researcher.

4. As that still does not catch the dumb designers using an HTML meta-refresh redirection (as opposed to a 301 server redirection) to a new domain, I run Internet Researcher every month on 1/12 of the URLs to locate all the refresh lines and see whether any of them refresh to a new domain. (So they are all checked on an annual basis.)

5. As a final check, I pay my father-in-law to check 1/12th of the links every month, about 6 months out of phase with #4.
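For anyone who wants to automate the first pass of step 2, here is a rough sketch of a HEAD check in PHP (urls.txt and the output format are just assumptions - it only flags candidates for the manual follow-up, it doesn't touch the directory):

<?php
// checklinks.php - rough sketch: HEAD-check a list of URLs and flag
// dead links and redirects that land on a different domain.
// urls.txt (one URL per line) is an assumed input file.

$urls = file('urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request only
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow 301/302 chains
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    curl_exec($ch);

    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $final  = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);

    if ($status == 404 || $status == 0) {
        echo "DEAD\t$status\t$url\n";                  // 0 = no response at all
    } elseif (parse_url($final, PHP_URL_HOST) != parse_url($url, PHP_URL_HOST)) {
        echo "MOVED\t$url -> $final\n";                // redirected to a new domain
    }
    // Note: meta-refresh "redirects" (step 4) need the page body, not just a HEAD
}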

This is a lot of work, but I have 4,000 user sessions a day, 150,000 referrals per month, and 480 advertisers at an average of 30 cents per click for the advertisers - and I make money and have fun skiing at Deer Valley in Utah and playing golf.

What you need is the Xenu Link Sleuth, a robot that spiders your website and reports whatever you tell it to.

When you have a website with thousands of outbound links, you really should be using a MySQL database (or similar) in combination with a scripting language; that way you will also be able to track clicks without worrying about the SEs.
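For example, something along these lines - a sketch, not a drop-in script; the links table, its columns, and the connection details are made up for illustration:

<?php
// out.php?id=123 - sketch: look the target URL up in MySQL, count the
// click, then redirect. The links table (id, url, clicks) is an assumption.

$id = isset($_GET['id']) ? (int) $_GET['id'] : 0;

$db = new PDO('mysql:host=localhost;dbname=directory', 'user', 'password');

$stmt = $db->prepare('SELECT url FROM links WHERE id = ?');
$stmt->execute(array($id));
$url = $stmt->fetchColumn();

if ($url === false) {
    header('Location: /');    // unknown id - send the visitor to the home page
    exit;
}

// Count the click, then send the visitor on their way
$db->prepare('UPDATE links SET clicks = clicks + 1 WHERE id = ?')->execute(array($id));

header('Location: ' . $url);
exit;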

Have an application written that automatically removes the link anchors for 404 errors from the HTML code. That way the text stays, the hyperlink disappears, and it is all done automatically on your hard drive before you upload to the server.
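A rough sketch of that idea, assuming a link checker has already written the dead URLs to a dead.txt file (the file names and command-line usage are just assumptions); it unwraps the <a> tags but keeps the anchor text:

<?php
// strip_dead_links.php page.html - sketch: unwrap anchors whose href is on
// a dead-URL list, keeping the anchor text. dead.txt is an assumed input.

$dead = array_flip(file('dead.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));

$file = $argv[1];
$doc  = new DOMDocument();
@$doc->loadHTMLFile($file);   // @ silences warnings about sloppy real-world HTML

// Snapshot the node list first, since we modify the DOM while looping
$anchors = iterator_to_array($doc->getElementsByTagName('a'));
foreach ($anchors as $a) {
    if (isset($dead[$a->getAttribute('href')])) {
        // Replace the <a> element with a plain text node holding its anchor text
        $a->parentNode->replaceChild($doc->createTextNode($a->textContent), $a);
    }
}

$doc->saveHTMLFile($file);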

My problem is that my website has an extreme number of outbound links.

Note: The most important goal here is to stay in Google's good graces (following their webmaster guidelines so we're not penalized in any way), since they send the large majority of our traffic. A close second goal is to be user friendly. Without Google our site wouldn't exist, so we have to be 100% strict (or as close as possible) with their guidelines. It sounds bad that user-friendliness isn't the number one concern, but there wouldn't be users if it weren't for Google.

I'm curious to hear feedback about which of the following setups would look "better" in Google's eyes. Note: I'm trying to find a way to deal with outbound links that, because of their sheer quantity, can't be checked by hand for links to bad neighborhoods, etc.

1) PHP redirection script used for every outbound link; it can be set to send either a 301 or a 302 (see the sketch after these options). Downside: potential of linking to "bad neighborhoods". Question: if a link goes through a redirection script and the script is blocked via robots.txt, does the link still get crawled and open the door to a potential bad-neighborhood penalty?

2) Javascript outbound links, either stored in an external .js file (blocked from search engines) or as regular javascript links on the page (made uncrawlable in some way). Downside: Google may frown upon excessive use of javascript. Questions: Is this cloaking? Has excessive use of javascript ever been shown to cause a penalty?

4) Text URLs, no outbound links. Example: Title: Widgets for sale, URL: www[dot]widgets[dot]com (without the [dot]'s of course - in the form of a URL, just not a hyperlink) instead of <a href="http://www.widgets.com">Widgets for sale</a>. Downside: usability. Questions: Does Google crawl text URLs? Will Google penalize for text URLs? Is there anything wrong with text URLs in this situation, aside from usability?
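On option 1, the only difference in the redirect script itself is the status code sent along with the Location header - a quick sketch (not anyone's actual script; widgets.com is just the example URL from above):

<?php
// Sketch of the two redirect flavours from option 1 - pick one, not both.

// 302 "Found" (temporary) - what header('Location: ...') sends by default:
header('Location: http://www.widgets.com/', true, 302);

// 301 "Moved Permanently" - tells crawlers the destination is the real URL:
// header('Location: http://www.widgets.com/', true, 301);
exit;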

Note: The links are NOT paid nor necessarily reciprocal, so there's no link partner obligation.

I run a regional directory and use a popular directory script program where my links are using the following format:

www.#*$!.com/cgi-bin/jump.cgi?ID=1111

Obviously, these re-direct/resolve to the listing's actual URL.

Is this hurting me in the eyes of Google, Yahoo, MSN, etc. when they spider my site?

The main reason I used this format is so that it's more difficult for folks to scrape the site for URLs. I've had this happen several times in the past. I know it won't prevent it, but it will make it more difficult.

Also, how about using a product that doesn't allow the links on a web page to be copied?

Google can, and sometimes does, follow and index any kind of recognisable URL, whether a hyperlink, text, javascript, or whatever.

In practice, most of the time it obeys robots.txt, but with various exceptions - if it saw the URL before it saw robots.txt, if there's an external link to the URL, if it just feels like it, and so on.

Of course, if your site has so many links that you don't know where they go, Google may not consider it to be the sort of "quality" site that it wants to rank well.