How Google Indexes Your Web Pages

Google Hocus Pocus Part 2 –

One day your lens is in Google, next day it’s gone. Sound familiar? While we’ve talked before about why your lens falls out of Google, I think it’s time we talk about it on a deeper level. I’ve hesitated to do this mainly because it is a more advanced topic – but also because Google does what Google wants. There is no absolute understanding of Google – and I don’t want anyone to obsess over trying to do just that. But, let’s talk about ol’ Googlebot and your web pages.

Rule #1 – You Can’t Understand Google

Cold, hard fact is this… it’s Google’s search engine and they’ll index however the heck they want to. They do not WANT us to totally understand how it all works. Heck, I read that last year (2008), Google changed their algorithm over 450 times! That works out to more than one change every DAY.

There is no “code” to crack with Google. Whatever the “code” is today WILL BE different tomorrow. The sooner you accept that, the better off you’ll be.

Rule #2 – The Definition of Insanity Is TRYING To Understand Google

Seriously, the more and more you try and “figure out” Google, the crazier you will make yourself. All my men readers – C’mon guys… how many times have you looked at your wife or girlfriend and thought, “I am going crazy trying to figure you out!” ???

Honey, we women have NOTHING on Google. Google is the most temperamental “woman” you will EVER come across. Worse yet, Google has absolutely NO desire for you to figure out a THING about her. Ironically, the most temperamental woman in the world was developed by two men in a dorm room. (makes me laugh a little)

How Google Indexes Web Pages

With all those disclaimers and warnings out of the way, let’s talk about Google and web pages. Did you know it is Google’s objective to index every page on the web?

This is a HUGE job.

They also take it one step further by trying to rank these web pages in order of relevance to a search query. Just think about all the information that is processed in that 0.15 seconds it took to give you your search results! It’s amazing to me – astounding!

Now, Google also wants to find all the newest and freshest online content and get it in their index asap, too. Yet another HUGE undertaking. Just think how many people are publishing blog posts and articles and lenses and videos and ALLLLLL that every single second of every single day!

Google’s crawlers (GoogleBot) are out on the web all the time looking for new content, updating older content – just spidering the web like crazy.

In fact, GoogleBot actually has two types of crawlers – FreshBot and DeepBot (no, I am not making this up…lol)

Meet FreshBot and DeepBot

FreshBot has one job – find as much of the newest, freshest content as it can and get it in the Google index.

DeepBot has a monthly job – to deep crawl all the web pages on the internet, follow the links, evaluate the web pages, and then completely re-update the entire index and get it to all the Google data centers.

Let me use my blog as an example –

Good ol’ FreshBot is hanging around here or at least waiting on me all the time. I love FreshBot! Within minutes of me publishing a new post, here comes ol’ FreshBot. He grabs it and gets it in the Google index within about 30 minutes. Then, he will tell ol’ DeepBot to get over here to PotPieGirl.com and be sure to do a deep index of my site each month.

DeepBot also “remembers” to come do a deep crawl here by finding my blog from other links all over the internet (remember how important I said back links are???)

FreshBot finds you and gives you a bonus of getting in the index quickly. DeepBot decides what to do with you. In other words, DeepBot checks your back links and ALL that stuff that FreshBot couldn’t care less about.

What Does This GoogleBot Activity Look Like?

Now I want to show you an image from inside my Google WebMaster Tools. This is an image of the number of pages that have been crawled per day in the last 90 days here at PotPieGirl.com.

As you can see, ol’ FreshBot is in here daily checking for new content, new posts, new comments, etc. But, see that BIG spike there in the beginning of December? That is DeepBot coming in for a BIG look at my site. I’ve recently had two more big scans. I first realized this was happening here at PotPieGirl.com when I got a nice little “Warning!” message about my bandwidth usage from my hosting. Yes, DeepBot can eat up some bandwidth. The better your Page Rank, the deeper they go… and the more bandwidth they use.

I can only imagine what the big sites with high Page Rank go through! You know, sites like Squidoo.com.
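If you want to see this kind of crawl activity without Webmaster Tools, your own server access log can give a rough picture. Here is a minimal sketch in Python, assuming your host writes the common "combined" log format and that the crawler identifies itself with "Googlebot" in its user agent (anyone can fake a user agent, so treat the counts as approximate). The helper names, the sample dates and hit counts, and the "3x the typical day" spike rule are all made up for illustration.

```python
import re
from collections import Counter
from statistics import median

# Pulls the day out of a combined-format timestamp like [03/Dec/2009:06:00:00 +0000].
DATE_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}):")


def googlebot_hits_per_day(lines):
    """Count log lines per day whose text mentions Googlebot."""
    counts = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue  # skip human visitors and other bots
        match = DATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def find_spikes(counts, factor=3):
    """Return days with more than `factor` times the median daily hits.

    The factor of 3 is an arbitrary assumption, not a known threshold.
    """
    if not counts:
        return []
    typical = median(counts.values())
    return [day for day, hits in counts.items() if hits > factor * typical]


def make_line(day, agent):
    """Build one fake access-log line (sample data only)."""
    return f'1.2.3.4 - - [{day}:06:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "{agent}"'


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (Windows NT 6.1) Firefox/3.5"

sample_lines = (
    [make_line("01/Dec/2009", GOOGLEBOT)] * 4
    + [make_line("02/Dec/2009", GOOGLEBOT)] * 5
    + [make_line("03/Dec/2009", GOOGLEBOT)] * 40  # pretend deep-crawl day
    + [make_line("04/Dec/2009", GOOGLEBOT)] * 3
    + [make_line("04/Dec/2009", BROWSER)] * 20    # ignored: not Googlebot
)

daily = googlebot_hits_per_day(sample_lines)
print(dict(daily))
print(find_spikes(daily))
```

To try it on a real log, read the file's lines and pass them to `googlebot_hits_per_day` in place of `sample_lines`. A day that stands far above the usual trickle is the kind of "BIG spike" described above.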

FreshBot, DeepBot, and Your Squidoo Lens

Ok – now Squidoo has a nice high Page Rank of 8. FreshBot hangs around there A LOT- probably all the time. He KNOWS there is constantly new content to crawl on that site.

Problem is, sometimes he misses stuff when crawling the Squidoo site. This is why we get some links to our new lenses to help FreshBot get there. Or we could do some edits and republish, which helps remind FreshBot that we’re there. Even if we do all that, he still might miss. Hey, he’s a computer program…give him a break…lol

OR, FreshBot DOES find your new lens and you are in that Google index in a matter of hours – or minutes! Awesome! BUT – then days go by and DeepBot hasn’t been….and your new lens falls out of the index.

OR, FreshBot DOES get to you and DeepBot does, too – BUT, ol’ DeepBot didn’t find or wasn’t aware of any back links TO your content – and your lens falls out again. Remember now, just because YOU know there is a back link to your lens does NOT mean that DeepBot knows about it yet. And also keep in mind that you may SEE your back link in the Google index, but that could be the work of FreshBot – not DeepBot.

Regardless, it’s Google’s party and they’ll crawl if they want to =)
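For readers who think better in code, the scenarios above can be sketched as a tiny toy model in Python. To be clear, this is only an illustration of my observations, not how Google actually works: the `ToyIndex` class, its method names, and the 7-day grace period are all made-up assumptions.

```python
class ToyIndex:
    """Toy model of a two-stage index (an illustration, not Google's logic)."""

    def __init__(self, provisional_days=7):
        # Assumed grace period a freshly listed page gets before dropping out.
        self.provisional_days = provisional_days
        self.provisional = {}   # url -> day the quick crawl listed it
        self.confirmed = set()  # pages kept after a deep crawl

    def fresh_crawl(self, url, today):
        """Quick pass: list the page right away without checking links."""
        if url not in self.confirmed:
            self.provisional[url] = today

    def deep_crawl(self, known_backlinks):
        """Slow pass: keep only provisional pages with known back links.

        known_backlinks maps url -> back links the deep crawl has found so
        far, which may lag behind the links that really exist.
        """
        for url in list(self.provisional):
            del self.provisional[url]
            if known_backlinks.get(url, 0) > 0:
                self.confirmed.add(url)

    def expire(self, today):
        """Drop provisional pages the deep crawl never got around to."""
        for url, listed in list(self.provisional.items()):
            if today - listed > self.provisional_days:
                del self.provisional[url]

    def contains(self, url):
        return url in self.provisional or url in self.confirmed


index = ToyIndex(provisional_days=7)
index.fresh_crawl("lens-with-links", today=0)
index.fresh_crawl("lens-without-links", today=0)

# The deep crawl only knows about back links to one of the two lenses.
index.deep_crawl({"lens-with-links": 3})
print(index.contains("lens-with-links"))     # True
print(index.contains("lens-without-links"))  # False

# A lens the deep crawl never visits drops out once the grace period passes.
index.fresh_crawl("lens-never-deep-crawled", today=1)
index.expire(today=9)
print(index.contains("lens-never-deep-crawled"))  # False
```

The point of the sketch is the ordering: getting listed quickly and staying listed are two separate steps, and the second one depends on back links the deep crawl already knows about.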

Bottom Line

When your lens is new and it appears to be struggling with Google, keep it fresh. I’m not talking about major updates – just a little sumthin-sumthin and republish. Also, keep making those links TO your lens (preferably from places that FreshBot hangs out).

All this with FreshBot and DeepBot applies to article marketing and blogs and ALL web sites. I know it’s confusing, but the best tip I can give to understand all this is to simply accept that you will never completely understand all this.

Good content that is well-optimized with links pointing TO that content WILL come back into the Google index. The higher the Page Rank of the site your content sits on, the better your odds of being found and STICKING in the Google index. The only thing that WON’T work to bring your lens back into the Google index is obsessing over why it’s not in the Google index. As I always say – KEEP MOVING FORWARD =)

Just for the record, I do not KNOW all this. Nor do I personally KNOW anyone at Google who has told me all this. These are my observations and the results of my reading and testing. You can read more about the GoogleBot and all her buddies here.


Great post as usual, you certainly have become the squidoo pro in my opinion. None of the other internet marketers seem to go into detail about squidoo as much as yourself. Which I find makes you stand out from the rest.

I’m glad you made this post about google indexing lenses as I find my lenses also drop out of the index occasionally and wonder why, now I know. Thanks.

P.S: FreshBot and DeepBot LOL interesting names for the crawlers! I will have to remember that HA!

Yeah you are right Jeni, we all have to accept that we can’t understand all this fully. Somebody understands it better, somebody less, but nobody can understand it fully. Anyways, as usual, good article from your side. I have some knowledge about it, but you made it more clear. Thanks.

Thanks Jennifer, I didn’t know about the dual google bots. I know that you’ve recommended freshening up the squidoo lenses, and now I know why! It’s like putting out bird seeds. Google is like a flock of birds! No seeds, no birds! Speaking of Google, here’s a link to a 44 page google internal document about how to rank pages, and how to determine if a page is spam. The info you’ve provided in your sites is right on target! Provide useful content! Here’s the link: http://www.mauriziopetrone.com/blog/wp-content/uploads/quality-rater-guidelines-2007.pdf
Thanks for all your help.

Wow… being new to all this I’m constantly being seduced to go down this road and that road to try to figure this stuff out. I should just stay right here and read through ALL your posts, because you do such a great job of explaining it all, while keeping it light, friendly and conversational.

What’s that old saying; “Even a blind pig finds an acorn every once in a while”. Well, this blind pig has found a whole mess of acorns right here!

If you’re to pick between editing your lens and getting more inbound links, go for the latter. Links will not only help you stay in the index but will also help your lens rank better (editing often won’t) 😉

I’m new to the blogging thing, and I just wanted to say that, this being my first time to your blog has been a very helpful experience in learning something new about the way that google works. Thanks for the information about deepbot and freshbot, because I had no idea about them. Needless to say I’ll be back to find out just what squidoo is.
