“It's not about how to achieve your dreams, it's about how to lead your life, ... If you lead your life the right way, the karma will take care of itself, the dreams will come to you.”
― Randy Pausch, The Last Lecture

This one just helps to filter out common user agents used by crawlers. Here's the command: tail -f /var/log/apache/access.log | grep -f bots.txt
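Here is a minimal, self-contained sketch of that filter. The contents of bots.txt and the log lines are made up for illustration; tail -f follows a live log, so plain grep stands in for it on a static sample file.

```shell
# Hypothetical bots.txt: one crawler user-agent pattern per line (assumed contents)
printf '%s\n' 'Googlebot' 'bingbot' 'YandexBot' > bots.txt

# Two made-up lines in Apache combined log format, standing in for access.log
printf '%s\n' \
  '1.2.3.4 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"' \
  '5.6.7.8 - - [10/Oct/2024:13:55:37 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (Windows NT 10.0)"' \
  > access.log

# tail -f follows a growing log; on a static sample, grep alone does the filtering
grep -f bots.txt access.log
```

Only the Googlebot line survives the filter; the ordinary browser request is dropped.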

7 - Top Crawlers

This command will show you all the spiders that crawled your site, with a count of the number of requests from each: cut -d'"' -f6 /var/log/apache/access.log | grep -f bots.txt | sort | uniq -c | sort -rg
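The pipeline can be broken down stage by stage on made-up sample data (the bots.txt patterns and log lines below are assumptions, not real traffic):

```shell
printf '%s\n' 'Googlebot' 'bingbot' > bots.txt

# Three made-up requests: two from Googlebot, one from bingbot
printf '%s\n' \
  '1.2.3.4 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"' \
  '1.2.3.4 - - [10/Oct/2024:13:55:40 +0000] "GET /a HTTP/1.1" 200 512 "-" "Googlebot/2.1"' \
  '9.9.9.9 - - [10/Oct/2024:13:55:41 +0000] "GET /b HTTP/1.1" 200 512 "-" "bingbot/2.0"' \
  > access.log

cut -d'"' -f6 access.log |  # the 6th quote-delimited field is the user agent
  grep -f bots.txt |        # keep only lines matching a known crawler
  sort | uniq -c |          # count identical user agents (uniq needs sorted input)
  sort -rg                  # biggest count first
```

In the combined log format the user agent is the third quoted value, which is why splitting on the double quote puts it in field 6.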

How To Get A Top Ten

You can easily turn the commands above that aggregate (the ones using uniq) into a top ten by adding this to the end: | head

That is, pipe the output to the head command, which prints the first ten lines by default. Simple as that.
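A quick illustration of the cap, using seq to fake fifty distinct "user agents" (stand-in data, not a real log):

```shell
# 50 distinct made-up agents, one request each; head trims the ranking to ten lines
seq 1 50 | sed 's/^/agent-/' | sort | uniq -c | sort -rg | head
```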

Zipped Log Files

If you want to run the above commands on a logrotated (gzipped) file, you can adjust easily: start with zcat on the file, then pipe into the first command of the pipeline (the one that had the filename, which now reads from stdin instead).
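A minimal sketch of that adjustment, with a gzipped sample file standing in for a rotated log (the filename and contents are made up):

```shell
# Create a gzipped sample standing in for a rotated log
printf '%s\n' \
  '1.2.3.4 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"' \
  | gzip > access.log.1.gz

# zcat decompresses to stdout; the rest of the pipeline is unchanged,
# with the filename dropped from the first command so it reads stdin
zcat access.log.1.gz | cut -d'"' -f6 | sort | uniq -c | sort -rg
```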