SEO Spider Tabs

Internal

The internal tab combines the data from all other tabs except the external and custom tabs. It brings together the response codes, URI, page titles, meta description, meta keywords, h1, h2, images, meta and canonical data so that everything can be viewed or exported together.

Address – The URI crawled.

Content – The content type of the URI.

Status Code – HTTP response code.

Status – The HTTP header response.

Title 1 – The (first) page title.

Title 1 Length – The character length of the page title.

Title 1 Pixel Width – The pixel width of the page title as described in our pixel width post.

Meta Description 1 – The meta description.

Meta Description Length 1 – The character length of the meta description.

Meta Description Pixel Width – The pixel width of the meta description as described in our pixel width post.

Meta Keyword 1 – The meta keywords.

Meta Keywords Length – The character length of the meta keywords.

h1 – 1 – The first h1 (heading) on the page.

h1 – Len-1 – The character length of the h1.

h2 – 1 – The first h2 (heading) on the page.

h2 – Len-1 – The character length of the h2.

Meta Data 1 – Meta robots data.

Meta Refresh 1 – Meta refresh data.

Canonical – The canonical link element data.

Size – The size in bytes; divide by 1024 to convert to kilobytes. The value is taken from the Content-Length header if provided, and set to zero if not. For HTML pages it is updated to the size of the (uncompressed) HTML in bytes.

Word Count – The number of ‘words’ inside the body tag, excluding HTML markup. We define a word by taking the text and splitting it by spaces. Our figures may not exactly match a manual count, as the parser performs certain fix-ups on invalid HTML.
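As a rough illustration of this definition, the sketch below collects the text found inside the body tag and splits it on whitespace. It uses Python's standard html.parser rather than the SEO Spider's own parser, so results on invalid HTML may differ. The names BodyTextExtractor and word_count are hypothetical, and splitting on any whitespace (not only the space character) is an assumption.

```python
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collects the text nodes that appear between <body> and </body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        # Markup never reaches this method, only text content does.
        if self.in_body:
            self.chunks.append(data)


def word_count(html):
    """Count whitespace-separated tokens in the body text (assumed definition)."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())
```

For example, a page whose body holds an h1 reading "Hello world" and a paragraph reading "Three more words" would be counted as five words, and the page title in the head would not be counted.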

Level – Depth of the page from the start page (number of ‘clicks’ away from the start page).

Inlinks – Number of internal inlinks to the URI. ‘Internal inlinks’ are links pointing to a given URI from the same subdomain that is being crawled.

Outlinks – Number of internal outlinks from the URI. ‘Internal outlinks’ are links from a given URI to another URI on the same subdomain that is being crawled.

External Outlinks – Number of external outlinks from the URI. ‘External outlinks’ are links from a given URI to another subdomain.
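The internal and external distinction used by these three columns can be sketched as a hostname comparison. This is a minimal illustration, assuming that "same subdomain" means an identical hostname; the function name classify_outlinks and its parameters are hypothetical and not part of the SEO Spider.

```python
from urllib.parse import urljoin, urlsplit


def classify_outlinks(crawl_host, page_url, hrefs):
    """Split the links found on one page into internal and external lists.

    crawl_host is the hostname being crawled, for example "www.example.com".
    """
    internal, external = [], []
    for href in hrefs:
        # Resolve relative links against the page they were found on.
        absolute = urljoin(page_url, href)
        if urlsplit(absolute).hostname == crawl_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external
```

Under this assumption a link from www.example.com to blog.example.com counts as an external outlink, because the subdomain differs.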

Hash – Hash value of the page. This is a duplicate content check. If two hash values match the pages are exactly the same in content.
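The idea behind the hash check can be sketched as follows: hash each page's content and group the URIs that share a hash value. The guide does not say which hash function the SEO Spider uses, so SHA-256 is an assumption here, and find_exact_duplicates is a hypothetical name.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(pages):
    """Group URIs whose content hashes to the same value.

    pages maps each URI to its response body as bytes.
    """
    by_hash = defaultdict(list)
    for uri, body in pages.items():
        # SHA-256 is an assumed choice; any stable hash illustrates the idea.
        digest = hashlib.sha256(body).hexdigest()
        by_hash[digest].append(uri)
    # Only hashes shared by more than one URI indicate duplicates.
    return [uris for uris in by_hash.values() if len(uris) > 1]
```

Because the comparison is on the hash of the whole content, a single differing byte gives a different hash, so only exact duplicates are grouped.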

Response Time – Time in seconds to download the URI. More detailed information can be found in our FAQ.

Last-Modified – Read from the Last-Modified header in the server’s HTTP response. If the server does not provide this, the value will be empty.

Title 2, meta description 2, h1-2, h2-2 etc – The spider will collect data from the first two elements it encounters in the source code. Hence, h1-2 is data from the second h1 heading on the page.

Filter by –

HTML – HTML pages.

JavaScript – Any JavaScript

Images – Any images.

PDF – Any portable document files.

External

The external tab includes information about external URIs.

Address – The external URI address.

Content – The content type of the URI.

Status Code – HTTP response code.

Status – The HTTP header response.

Level – Depth of the page from the homepage or start page (number of ‘clicks’ away from the start page).

Inlinks – Number of links found pointing to the external URI.

Filter by –

HTML – HTML pages.

JavaScript – Any JavaScript

Images – Any images.

PDF – Any portable document files.

Response Codes

The response codes tab includes response information from internal and external URIs.

Address – The URI crawled.

Content – The content type of the URI.

Status Code – HTTP response code.

Status – The HTTP header response.

Redirect URI – If the address URI redirects, this column will include the redirect target URI. The status code above will display the type of redirect (301, 302 etc).

Filter by –

No Response – Where we receive no response to our request. Typically a malformed URI or a connection time out.

URI

Hash – Hash value of the page. This is a duplicate content check. If two hash values match the pages are exactly the same in content.

Length – The character length of the URI.

Canonical 1 – The canonical link element data.

Filter by –

Non ASCII Characters – The URI has characters in it that are not included in the ASCII character encoding scheme.

Underscores – The URI has underscores within it which are not always seen as word separators.

Duplicate – This is a duplicate content check. It filters for all duplicate pages found via the hash value. If two hash values match the pages are exactly the same in content.

Dynamic – The URI could be dynamic in nature (includes parameter characters such as ‘?’ or ‘&’ etc).

Over 115 characters – The URI is over 115 characters in length (hence getting fairly long).
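The single-URI filters above can be approximated with a few string checks. This is a minimal sketch, not the SEO Spider's implementation: the function name uri_filters is hypothetical, and the Duplicate filter is left out because it needs hash values from other pages.

```python
def uri_filters(uri):
    """Return the names of the URI tab filters that a single URI would match."""
    matched = []
    if not uri.isascii():
        matched.append("Non ASCII Characters")
    if "_" in uri:
        matched.append("Underscores")
    # Assumption: the presence of '?' or '&' is enough to flag a URI as dynamic.
    if "?" in uri or "&" in uri:
        matched.append("Dynamic")
    if len(uri) > 115:
        matched.append("Over 115 characters")
    return matched
```

A URI can match several filters at once, which is why the sketch returns a list rather than a single label.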

Page Titles

The page title tab includes data related to page titles.

Address – The URI crawled.

Occurrences – The number of page titles found on the page (maximum we find is 2).

Title 1/2 – The page title.

Title 1/2 length – The character length of the page title.

Filter by –

Missing – Any pages which have a missing page title.

Duplicate – Any pages which have duplicate page titles.

Over 70 characters – Any pages which have page titles over 70 characters in length.

Same as h1 – Any page titles which match their h1.

Multiple – Any pages which have multiple page titles.
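The page title filters can be sketched over a small in-memory crawl result. The data layout (a dict with "titles" and "h1" keys) and the name title_filters are hypothetical, and the sketch assumes exact, case-sensitive comparison against the first title only.

```python
from collections import Counter


def title_filters(pages):
    """Map each URI to the page title filters it would match.

    pages maps each URI to {"titles": [str, ...], "h1": str or None}.
    """
    # Count how often each first title is used across the whole crawl.
    first_titles = Counter(p["titles"][0] for p in pages.values() if p["titles"])
    result = {}
    for uri, page in pages.items():
        titles = page["titles"]
        matched = []
        if not titles:
            matched.append("Missing")
        else:
            if first_titles[titles[0]] > 1:
                matched.append("Duplicate")
            if len(titles[0]) > 70:
                matched.append("Over 70 characters")
            if titles[0] == page.get("h1"):
                matched.append("Same as h1")
            if len(titles) > 1:
                matched.append("Multiple")
        result[uri] = matched
    return result
```

The same pattern would carry over to the meta description, h1 and h2 filters by changing the field and the length limit.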

Meta Description

The meta description tab includes data related to meta descriptions.

Address – The URI crawled.

Occurrences – The number of meta descriptions found on the page (maximum we find is 2).

Meta Description 1/2 – The meta description.

Meta Description 1/2 length – The character length of the meta description.

Filter by –

Missing – Any pages which have a missing meta description.

Duplicate – Any pages which have duplicate meta descriptions.

Over 156 characters – Any pages which have meta descriptions over 156 characters in length.

Multiple – Any pages which have multiple meta descriptions.

Meta Keyword

The meta keywords tab includes data related to meta keywords. PLEASE NOTE – We advise ignoring the meta keywords tag. It is widely ignored by search engines; Google in particular does not consider it at all when scoring sites for ranking.

Address – The URI crawled.

Occurrences – The number of meta keywords found on the page (maximum we find is 2).

Meta Keyword 1/2 – The meta keywords.

Meta Keyword 1/2 length – The character length of the meta keywords.

Filter by –

Missing – Any pages which have missing meta keywords.

Duplicate – Any pages which have duplicate meta keywords.

Multiple – Any pages which have multiple meta keywords.

h1

The h1 tab includes data related to the h1 heading.

Address – The URI crawled.

Occurrences – The number of h1s found on the page (maximum we find is 2).

h1- 1/2 – The h1 data.

h1-len- 1/2 – The character length of the h1.

Filter by –

Missing – Any pages which have a missing h1.

Duplicate – Any pages which have duplicate h1.

Over 70 characters – Any pages which have h1 over 70 characters in length.

Multiple – Any pages which have multiple h1.

h2

The h2 tab includes data related to the h2 heading.

Address – The URI crawled.

Occurrences – The number of h2s found on the page (maximum we find is 2).

h2- 1/2 – The h2 data.

h2-len- 1/2 – The character length of the h2.

Filter by –

Missing – Any pages which have a missing h2.

Duplicate – Any pages which have duplicate h2.

Over 70 characters – Any pages which have h2 over 70 characters in length.

Multiple – Any pages which have multiple h2.

Images

The images tab includes data related to any images crawled.

Address – The URI crawled.

Content – The content type of the image (jpeg, gif, png etc).

Size – Size of the image. File size is in bytes, divide by 1024 to convert to kilobytes.

Filter by –

Over 100kb – Large images over 100kb in size.

Missing Alt Text – Images that are missing alt text. Click the address (URI) of the image and then the ‘image info’ tab in the lower window pane to see which pages use the image and on which of those pages its alt text is missing.

Alt Text Over 100 Characters – Images which have one instance of alt text over 100 characters in length.
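The three image filters can be sketched as below. The function name image_filters and its inputs are hypothetical, and treating "100kb" as 100 × 1024 bytes is an assumption based on the bytes to kilobytes conversion described for the Size column.

```python
def image_filters(size_bytes, alt_texts):
    """Return the image tab filters that one image would match.

    alt_texts holds one entry per page that uses the image; an entry is
    None or an empty string where the alt text is missing on that page.
    """
    matched = []
    # Convert bytes to kilobytes by dividing by 1024.
    if size_bytes / 1024 > 100:
        matched.append("Over 100kb")
    if any(not alt for alt in alt_texts):
        matched.append("Missing Alt Text")
    # One instance of long alt text is enough to match.
    if any(alt and len(alt) > 100 for alt in alt_texts):
        matched.append("Alt Text Over 100 Characters")
    return matched
```

An image of exactly 102400 bytes is 100 kilobytes and so would not match the first filter under this reading.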

Directives

The directives tab includes all information related to meta data, canonical and rel=“next” and rel=“prev” link elements crawled by the SEO spider.

Address – The URI crawled.

Meta Data 1/2 etc – Meta data found on the URI. The spider will find all instances if there are multiple.

Meta Refresh 1/2 etc – Meta Refresh found on the URI. The spider will find all instances if there are multiple.

Canonical Link Element 1/2 etc – Canonical link element data on the URI. The spider will find all instances if there are multiple.

HTTP Canonical 1/2 etc – Canonical issued via HTTP. The spider will find all instances if there are multiple.

X-Robots-Tag 1/2 etc – X-Robots-tag data. The spider will find all instances if there are multiple.

rel=“next” and rel=“prev” – The SEO Spider collects these HTML link elements designed to indicate the relationship between URLs in a paginated series.

Filter by –

Canonical

rel=“next” and rel=“prev”

Index

Noindex

Follow

Nofollow

NoArchive

NoSnippet

NoODP

NoYDIR

NoImageIndex

NoTranslate

Unavailable_After

Refresh

AJAX

The Ajax tab shows both ugly and pretty URLs, with filters for hash fragments. Some Ajax pages may not use hash fragments (such as a homepage), so the ‘fragment’ meta tag can be used to recognise an Ajax page. In the same way as Google, the SEO Spider will then fetch the ugly version of the URL.

Pretty URL – The pretty URL of the page.

Ugly URL – The ugly URL actually requested.

Status Code – HTTP response code.

Status – The HTTP header response.
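The mapping from a pretty URL to an ugly URL follows Google's AJAX crawling scheme, in which a "#!" hash fragment is moved into an "_escaped_fragment_" query parameter. The sketch below illustrates that mapping; the function name to_ugly_url is hypothetical, and the exact set of characters left unescaped is an assumption that may differ from what the SEO Spider does.

```python
from urllib.parse import quote


def to_ugly_url(pretty_url):
    """Convert a '#!' pretty URL into its '_escaped_fragment_' ugly form."""
    base, _, fragment = pretty_url.partition("#!")
    # Append to an existing query string if there is one.
    joiner = "&" if "?" in base else "?"
    # With no '#!' present the fragment is empty, which corresponds to a
    # page recognised through the 'fragment' meta tag.
    escaped = quote(fragment, safe="=/:,;@!$'()*")
    return base + joiner + "_escaped_fragment_=" + escaped
```

For example, http://example.com/#!key=value becomes http://example.com/?_escaped_fragment_=key=value.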

Custom

The custom tab works alongside the ‘custom’ configuration feature. This feature allows you to search the source code of html pages. You cannot ‘scrape’ or extract data from html elements using this feature at the moment. There are 10 filters under configuration which relate directly to the 10 filters in the custom report.

Address – The URI crawled.

Content – The content type of the URI.

Status Code – HTTP response code.

Status – The HTTP header response.

Occurrences – The number of times the search query appears within the source code of the URL.

Filter by –

Filter -1-10 – Shows URI that either contain or do not contain the query string entered in the relevant custom filter.
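A single custom filter can be thought of as a substring search over the page source. The sketch below is an illustration only: the name custom_filter and the mode strings are hypothetical, and a plain case-sensitive substring match is an assumption.

```python
def custom_filter(source, query, mode="contains"):
    """Return (matches_filter, occurrences) for one page's source code."""
    occurrences = source.count(query)
    if mode == "contains":
        return occurrences > 0, occurrences
    if mode == "does not contain":
        return occurrences == 0, occurrences
    raise ValueError(f"unknown mode: {mode!r}")
```

A typical use would be checking which pages lack a tracking snippet by running a "does not contain" filter for the snippet's script name.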

URL Info

If you highlight a URI in the top window, this bottom window tab populates. This contains a very brief overview of the URL in question.

URL – The URI crawled.

Status Code – HTTP response code.

Status – The HTTP header response.

Content – The content type of the URI.

Size – File or web page size.

Level – Depth of the page from the homepage or start page (number of ‘clicks’ away from the start page).

Inlinks – Number of internal inlinks to the URI.

Outlinks – Number of internal outlinks from the URI.

In Links

If you highlight a URI in the top window, this bottom window tab populates. This contains a list of internal links pointing to the URI.

You can also now edit page titles and descriptions directly in the interface.

The SEO Spider will by default remember the edits you make to page titles and descriptions, unless you click the ‘reset title and description’ button. This allows you to make as many changes as you like and then export and send to a client or development team.
