As I said, it’s very basic right now, but I’ve added the ability to see which pages are being crawled. I’ll add an updated screenshot later today so you can see some more of my progress.

MSNBot is included in the tracking, it just hasn’t hit my site yet.

What I’ve noticed is the really weird way crawlers visit. At least from what I’ve seen so far, there are sometimes 5-15 minute breaks between page hits, so it’s very hard to tell what a “session” is. Sometimes the crawlers will hit pages that do NOT directly link to each other. And what makes it even harder to track a session is that sometimes back-to-back hits come from different IPs.

Maybe it’s just the way the crawlers hit my site, but I can’t tell a real pattern between sessions so far. Sometimes they’ll just hit one page an hour or so, very strange. I’ll have to do a little bit more research about how they operate.
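
For what it’s worth, here is a rough sketch of one way a “session” could be defined given those quirks. It is not what the tracker actually does; it just groups hits by crawler name rather than by IP (since back-to-back hits can come from different IPs) and starts a new session whenever the gap between hits is longer than a cutoff. The 30-minute cutoff, the `group_sessions` name, and the tuple layout are all my own assumptions for illustration.

```python
from datetime import datetime, timedelta

# Assumed cutoff: longer than the 5-15 minute pauses seen between hits,
# shorter than the "one page an hour" pattern. Tune to taste.
SESSION_GAP = timedelta(minutes=30)

def group_sessions(hits, gap=SESSION_GAP):
    """Group crawler hits into sessions.

    hits: iterable of (crawler_name, timestamp, ip, path) tuples (hypothetical layout).
    Returns a dict mapping crawler_name -> list of sessions,
    where each session is a list of hits in time order.
    """
    sessions = {}
    for hit in sorted(hits, key=lambda h: h[1]):
        crawler, ts = hit[0], hit[1]
        crawler_sessions = sessions.setdefault(crawler, [])
        # Continue the current session if the previous hit by the same
        # crawler (regardless of IP) was recent enough; otherwise start a new one.
        if crawler_sessions and ts - crawler_sessions[-1][-1][1] <= gap:
            crawler_sessions[-1].append(hit)
        else:
            crawler_sessions.append([hit])
    return sessions

# Made-up sample hits, purely for illustration.
sample_hits = [
    ("Googlebot", datetime(2024, 1, 1, 10, 0), "192.0.2.1", "/"),
    ("Googlebot", datetime(2024, 1, 1, 10, 12), "192.0.2.2", "/about"),
    ("MSNBot", datetime(2024, 1, 1, 10, 5), "198.51.100.7", "/"),
    ("Googlebot", datetime(2024, 1, 1, 11, 30), "192.0.2.1", "/contact"),
]
sample_sessions = group_sessions(sample_hits)
```

With the sample data, the two Googlebot hits 12 minutes apart end up in one session even though the IPs differ, and the hit at 11:30 starts a new one. Whether that matches how the crawlers really behave is exactly the open question, so the cutoff is a guess.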

If it’s gonna segregate it like that, it’s gonna be great. I trawl for scrapers by looking through the raw server logs, so if it’s picking up activity by somebody trying to run stealthily through the pages, this would be a boon to my activity.