What is a Log File?

A Log File contains records of the hits, or requests, a server receives from different user agents. Each record includes data such as the time the request was made, the IP address it came from, the URL requested and the user agent used.

Why is Log File Analysis useful?

Log File analysis is one of the best ways to understand exactly how different user agents are crawling your site. It is particularly useful for understanding how much Crawl Budget is being wasted and, crucially, which URLs it is being wasted on. Accessibility errors and other crawl deficiencies will also come to light during log file analysis. For example, if you suspect Googlebot is ignoring certain pages because of thin content, but you need to prove this to your client, log file analysis lets you demonstrate it categorically through our old friend: Data.

For the purposes of SEO, we will mainly be filtering on the Googlebot User-Agent. (Other User-Agents are available.) To perform a decent analysis you are going to need around 60,000-120,000 rows of log data in Excel.
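As a minimal sketch of that filtering step, the snippet below counts Googlebot requests per URL in Apache/NGINX-style log lines. The sample lines are invented for illustration, and note that user-agent strings can be spoofed, so a serious analysis should also verify Googlebot hits by IP address (for example via reverse DNS lookup):

```python
from collections import Counter

def googlebot_hits(lines):
    """Count requests per URL for log lines whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue
        # The quoted request ("GET /page HTTP/1.1") is the first
        # double-quoted field in Apache/NGINX access logs.
        parts = line.split('"')
        if len(parts) > 1:
            request = parts[1].split()
            if len(request) > 1:
                hits[request[1]] += 1
    return hits

# Invented sample lines: one Googlebot hit, one ordinary browser hit.
sample_lines = [
    '66.249.66.1 - - [12/Mar/2020:10:15:32 +0000] "GET /widgets HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.9 - - [12/Mar/2020:10:16:01 +0000] "GET /widgets HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

print(googlebot_hits(sample_lines))  # Counter({'/widgets': 1})
```

In practice you would pass an open log file straight into googlebot_hits rather than a hand-built list.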

The Anatomy of a Log File

Now that you know the basics of log file analysis, it is time to look at the different sections of a Log File. Log Files are consistent in that they will almost always include the following:

Server IP

Date and Time

Method (GET / POST)

Request URI

HTTP Status Code

The User Agent

Log Files may contain other information, such as the host name, client IP address or the bytes downloaded.

Crawl Budget: What is it exactly?

Crawl budget is the number of pages a search engine (such as Google) will crawl on your website each time it visits your site.

How is Crawl Budget determined?

Crawl budget allocation is based on the authority of your site. In the olden days this was determined in part by PageRank. Essentially, the more authority your site has, the more URLs will potentially be crawled.

Even if your site has a large Crawl Budget, Google may choose to ignore certain sections of your website if it sees that you are producing content that is thin, or low quality on a large scale.

Log File Analysis Tools: Screaming Frog Log File Analyser & Splunk

The Screaming Frog Log File Analyser is one of the most user-friendly ways to analyse your Log Files. As this product is made specifically for SEOs, it is the one I recommend above all others. If you are looking for an alternative there is always Splunk, which is also reasonably user friendly, although it lacks the option to simply drag and drop a log file from any server (such as Apache, IIS or NGINX) onto the tool for fast analysis.