Web Scraping Articles

What is Web Scraping?

Web Scraping (also termed Web Data Extraction, Web Harvesting, Screen Scraping, etc.) is the use of automated computer software to extract large amounts of information from websites. Web Scraping scripts and applications simulate a person viewing a website with a browser. With these scripts you can connect to a website and request a page, exactly as a browser would.
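
As a minimal sketch of what "requesting a page as a browser would" can look like, the Python snippet below uses only the standard library's urllib.request module. The User-Agent string, the function names build_request and fetch_page, and the example.com URL in the comment are illustrative assumptions, not part of any particular scraping product.

```python
import urllib.request

# Hypothetical browser-like User-Agent string, chosen for illustration only.
USER_AGENT = "Mozilla/5.0 (compatible; ExampleScraper/0.1)"


def build_request(url: str) -> urllib.request.Request:
    """Prepare a GET request that carries a browser-style User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Download a page over the network and return its body as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (placeholder URL; use a site you are permitted to scrape):
#     html = fetch_page("https://example.com")
```

The request is built separately from the download so that the headers can be inspected or adjusted before anything is sent over the network.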

Website data on the Internet can normally be viewed only with a web browser. Most websites do not offer an API for saving a copy of the data they list to your local storage. To do that, you have to manually copy and paste the data displayed by the website in your browser into a local file on your computer, a very tedious task that can take many hours or sometimes days to complete.

Web Scraping is the technique of automating this process: instead of you manually copying the data from websites, Web Scraping software performs the same task in a much shorter time.

Web Scraping software interacts with websites in the same way as your web browser. But instead of displaying the data served by the website on screen, it saves the required data from the web page to a local file or database.
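
To illustrate the "save instead of display" step, here is a small sketch in Python that pulls the cells out of an HTML table and writes them to a CSV file, using only the standard library. It assumes the wanted data sits in plain tr/td table markup. The class and function names, the output file name, and the weather figures in the usage comment are all made up for the example.

```python
import csv
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being read, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard a cell left open by malformed markup


def extract_rows(html):
    """Return the table rows found in an HTML string as lists of strings."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def save_rows(rows, path):
    """Write the extracted rows to a local CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


# Example with invented data:
#     page = "<table><tr><td>London</td><td>12 C</td></tr></table>"
#     save_rows(extract_rows(page), "weather.csv")
```

A real scraper would usually need site-specific extraction rules. The same pattern applies if the rows are inserted into a database rather than written to a file.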

Web scraping is essentially a form of data mining. Items like weather reports, auction details, market pricing, or any other list of collected data can be the target of a Web scraping effort.

The practice of Web scraping has drawn a lot of controversy because the terms of use for some websites do not allow certain kinds of data mining. Despite the legal challenges, Web scraping promises to become a popular way of collecting information as these kinds of aggregated data resources become more capable.