Web Data is Ephemeral

Web data is here today, but may be gone tomorrow.

Anyone, at any time, can change the information provided on a website they control. Prices and products change, content changes, statistics change, news may not always be archived, companies go under, and websites disappear. The point is, you don’t know what data you’re missing until it is too late. That’s why many Import.io customers are collecting web data every day, even if they don’t have an immediate use for the data.

By continuously collecting data:

Organizations have historical data to track trends in pricing and product offerings.

Market and financial research firms have past data on industries, news, and businesses.

Non-profit watch groups have web data that may one day not be available due to policy changes.

You may remember when the Trump administration took down climate science data from the EPA website. Watch groups around the country started scraping the EPA website, but for the climate science data it was too late: those pages had already been hidden from the public. The only way to keep access to data like this is to collect and store it proactively, before it disappears.

What if you are a retailer and your brand has a dip in online sales? If you only start collecting pricing data today, you won’t know what prices competitors were charging over the past few weeks. Unless you have already collected the data, you can’t analyze and resolve the problem. By being proactive about collecting pricing, positioning, and web page changes, you can prevent the loss of market share before it occurs.
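To make the retailer scenario concrete, here is a minimal Python sketch of the kind of question historical data lets you answer: when did a competitor change its price? The snapshot rows, the competitor name, and the `price_changes` helper are all hypothetical and purely illustrative; they do not come from Import.io.

```python
from datetime import date

# Hypothetical weekly snapshots: (collection date, competitor, price).
# In practice these rows would come from data you have already collected.
snapshots = [
    (date(2024, 3, 1), "competitor_a", 49.99),
    (date(2024, 3, 8), "competitor_a", 49.99),
    (date(2024, 3, 15), "competitor_a", 44.99),
    (date(2024, 3, 22), "competitor_a", 39.99),
]


def price_changes(rows):
    """Return (date, competitor, old_price, new_price) for every price change."""
    last_seen = {}  # most recent price observed per competitor
    changes = []
    for day, competitor, price in sorted(rows):  # oldest snapshot first
        previous = last_seen.get(competitor)
        if previous is not None and price != previous:
            changes.append((day, competitor, previous, price))
        last_seen[competitor] = price
    return changes


for day, competitor, old, new in price_changes(snapshots):
    print(f"{day}: {competitor} moved from {old} to {new}")
```

Without the earlier snapshots, the price cuts in mid-March would be invisible; only the current price could be observed.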

Import.io offers the capability to set up data extractors that automatically collect and store data in the Import.io cloud. Web data can be extracted hourly, daily, weekly, or monthly. Once set up, Import.io runs in the background. You can also download the data as a spreadsheet or JSON file, or feed it into another program using APIs.
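The underlying idea, keeping every extraction together with the time it was collected, can be sketched in a few lines of Python. This is not the Import.io API: `append_snapshot`, `load_history`, and the record fields are hypothetical names chosen for illustration, and the sketch writes JSON Lines to any file-like object rather than to a cloud store.

```python
import io
import json
from datetime import datetime, timezone


def append_snapshot(stream, records, collected_at=None):
    """Write one timestamped snapshot as a single JSON line."""
    if collected_at is None:
        collected_at = datetime.now(timezone.utc)
    line = {"collected_at": collected_at.isoformat(), "records": records}
    stream.write(json.dumps(line) + "\n")


def load_history(stream):
    """Read back every snapshot, in the order it was written."""
    return [json.loads(line) for line in stream if line.strip()]


# Usage with an in-memory buffer; a real job would open a file in append mode.
buffer = io.StringIO()
append_snapshot(buffer, [{"sku": "A1", "price": 9.99}],
                datetime(2024, 3, 1, tzinfo=timezone.utc))
append_snapshot(buffer, [{"sku": "A1", "price": 8.99}],
                datetime(2024, 3, 2, tzinfo=timezone.utc))
buffer.seek(0)
history = load_history(buffer)
```

Because each run only appends, earlier snapshots are never overwritten, which is what makes later trend analysis possible.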

Bottom line: it’s important to start collecting data now! Don’t wait until your data strategy and analysis tools are perfect. As Andrew Ng, one of the most influential minds in artificial intelligence and deep learning, said, “Whoever has the most data wins.”