Web Scraping & Data Extraction with Screaming Frog

What is Web Scraping?

Web scraping, also known as web data extraction or screen scraping, is the automated extraction of large amounts of data from websites. The data can then be exported into spreadsheets or databases for further analysis.

Why do I need web scraping for SEO?

Content Idea Inspiration and Research

Understanding Competitors' Content Strategy

Creating alt text entries for thousands of images quickly

Collect plain text

Google Analytics IDs

Schema Markup

Social Meta Tags (Open Graph Tags, Twitter Cards)

Mobile Annotations

Comment Scraping

Email Scraping

Hreflang code

Prices of Products

Stock Availability

A Beginner's Guide to Web Scraping with Screaming Frog

Web scraping is one of the lesser-used features of the Screaming Frog SEO Spider, but it is a useful trick to have up your sleeve when you need to extract large amounts of data from the HTML of a webpage.

Screaming Frog is by no means the only tool that you can use for web scraping (Python is generally considered the go-to solution). But for beginners to web scraping, Screaming Frog provides all the features you need, letting you extract data using CSS Path, XPath and regex.

The three methods of web scraping with Screaming Frog are:

XPath – This option allows you to scrape data using XPath selectors. Recommended for most web scraping scenarios.

Extracting with XPath (this expression selects every h2 heading on the page):

//h2

CSS Path – CSS selectors are patterns used to select elements, and they allow you to scrape data quickly. Recommended for most web scraping scenarios.

Regex – A string of text used to match patterns in data. Regex is flexible and can be used to scrape HTML comments or inline JavaScript. Recommended for advanced web scraping.
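For readers curious how the XPath and regex methods look outside Screaming Frog, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not how Screaming Frog works internally: the sample page, the tracking ID and the function names are all hypothetical, and the standard library's ElementTree supports only a subset of XPath and needs well-formed markup.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, well-formed sample page (not taken from any real site).
SAMPLE_PAGE = """<html>
  <head>
    <script>ga('create', 'UA-1234567-1', 'auto');</script>
  </head>
  <body>
    <h1>Guide</h1>
    <h2>What is Web Scraping?</h2>
    <h2>How to scrape</h2>
  </body>
</html>"""


def extract_h2_headings(page: str) -> List[str]:
    """Return the text of every h2 element, in the spirit of //h2."""
    root = ET.fromstring(page)
    # ".//h2" is the ElementTree spelling of "h2 at any depth".
    return [(h2.text or "").strip() for h2 in root.findall(".//h2")]


def extract_ga_ids(page: str) -> List[str]:
    """Return quoted strings that look like Universal Analytics IDs."""
    # The capture group keeps the ID and drops the surrounding quotes.
    return re.findall(r"""["'](UA-\d+-\d+)["']""", page)


if __name__ == "__main__":
    print(extract_h2_headings(SAMPLE_PAGE))
    print(extract_ga_ids(SAMPLE_PAGE))
```

The regex function reads the ID straight out of inline JavaScript, which is the kind of data that element-based selectors such as XPath and CSS Path cannot reach easily.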

How to web scrape with Screaming Frog

Click on: Configuration > Custom > Extraction. This will open up a new extractor page, which will have 10 separate inactive extractors.

Inspect an element on a webpage (in Chrome, click on ‘Inspect Element’) and find the specific data that you want to pull. Then select CSS Path, XPath or Regex as the extraction method. (These are the three methods for web scraping that Screaming Frog accepts.)

Input the syntax into the relevant fields on the extractor page.

If your syntax is valid, a green tick will appear next to the input fields.

Close the extractor page and go back to the main Screaming Frog page. Enter the URL of the website that you want to scrape the data from and click on Start.

Once Screaming Frog has completed the crawl, you will be able to view your data under the Custom tab by selecting the Extraction filter.

Export the data into Excel.
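Once the data is exported, it can also be checked with a short script instead of by hand. The sketch below assumes the export was saved as CSV and finds pages where the extractor returned nothing, for example products with no price. The column names ("Address", "Extracted Price"), the sample rows and the function names are hypothetical; check the header row of your own export and adjust them.

```python
import csv
import io
from typing import Dict, List

# Hypothetical export contents; real column names depend on your
# extractor configuration, so treat these as placeholders.
SAMPLE_EXPORT = """Address,Extracted Price
https://example.com/a,9.99
https://example.com/b,
https://example.com/c,19.50
"""


def load_extractions(csv_text: str, value_column: str) -> Dict[str, str]:
    """Map each crawled URL to the value extracted from it."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Address"]: row[value_column].strip() for row in reader}


def urls_missing_value(extractions: Dict[str, str]) -> List[str]:
    """List the URLs where the extractor returned an empty value."""
    return [url for url, value in extractions.items() if not value]


if __name__ == "__main__":
    data = load_extractions(SAMPLE_EXPORT, "Extracted Price")
    print(urls_missing_value(data))
```

To run this against a real file, read its contents with open(...).read() and pass the text in place of SAMPLE_EXPORT.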

A Video Guide to Web Scraping with Screaming Frog



About the Author

Chris

Chris is a London SEO Consultant working as an SEO Account Director for Blue 449, part of Publicis Groupe.