Many of you may know Ned from various parts of MATLAB Central, such as the community blog "MATLAB Spoken Here". If you're a frequent visitor to MATLAB Central, you may have also visited Trendy, which allows you to quickly query and plot trends from the web. One of the utility functions provided within Trendy is urlfilter, a convenient function for scraping numeric data from a web page. Now, you can use urlfilter outside of Trendy!

To see how it works, take a look at the Trendy tutorial or the published example script included with Ned's entry. But here's a quick example of how it could be used.

Let's say that I want to grab and plot the high and low temperatures in Natick, MA for the next 10 days. I will grab data from this URL at http://www.wunderground.com. As you can see from the web page, the 10-day forecast is displayed about halfway down the page in a table. Each day has a header in the format of "day of week, day", e.g. "Friday, 17".

First, I calculate the days I'm interested in, from today through 10 days from today. I also determine the day of the week for each date using the weekday function. I need this information because urlfilter searches for these day headers to locate the data to scrape.
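As a rough sketch of this step (not the original listing from the post), the dates and their header strings could be built as follows. It uses only base MATLAB functions (now, weekday, datevec). The variable names are illustrative, and the assumption that the page shows the day of the month without a leading zero comes from the "Friday, 17" example, not from inspecting the page.

```matlab
% Serial date numbers for today through 10 days from today (11 values).
% Trim the range if the page lists fewer days.
days = floor(now) + (0:10);

% Long day-of-week names, one row per date (rows are padded with spaces).
[~, dayNames] = weekday(days, 'long');

% Day of the month for each date, taken from the third column of datevec.
dateParts = datevec(days);
dayOfMonth = dateParts(:, 3);

% Build search strings in the "day of week, day" format, e.g. "Friday, 17".
% Assumption: the page does not zero-pad single-digit days.
headers = cell(1, numel(days));
for k = 1:numel(days)
    headers{k} = sprintf('%s, %d', strtrim(dayNames(k, :)), dayOfMonth(k));
end
```

Each entry of headers can then serve as the search keyword that marks where a given day's numbers begin on the page.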

Note that I could have done this more efficiently with a single call to urlfilter, extracting about 40 numbers at once, and then parsing the numbers to get the necessary high and low temperatures. I used the day-by-day approach because it is easier to follow.
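To make the difference between the two strategies concrete, here is a small sketch that runs on a made-up text snippet instead of the live page. It does not call urlfilter. It imitates the general idea (find a keyword, then read the numbers that follow) with strfind and regexp. The sample text, its three-numbers-per-day layout, and all variable names are assumptions for illustration only; the real page will differ.

```matlab
% Made-up sample text standing in for the downloaded page (not real forecast data).
pageText = ['Friday, 17 High 41 Low 28 ' ...
            'Saturday, 18 High 38 Low 25 ' ...
            'Sunday, 19 High 35 Low 20'];
headers = {'Friday, 17', 'Saturday, 18', 'Sunday, 19'};
numPattern = '-?\d+\.?\d*';   % simple pattern for integers and decimals

% Approach 1: one keyword lookup per day, taking the next two numbers.
highs = zeros(1, numel(headers));
lows  = zeros(1, numel(headers));
for k = 1:numel(headers)
    startIdx = strfind(pageText, headers{k});
    tail = pageText(startIdx(1) + numel(headers{k}):end);  % text after the header
    tok = regexp(tail, numPattern, 'match');
    vals = str2double(tok(1:2));
    highs(k) = vals(1);
    lows(k)  = vals(2);
end

% Approach 2: grab every number after the first header in one pass,
% then pick out the values by position.
startIdx = strfind(pageText, headers{1});
allNums = str2double(regexp(pageText(startIdx(1):end), numPattern, 'match'));
% In this sample, each day contributes 3 numbers: day of month, high, low.
perDay = reshape(allNums, 3, []);
highs2 = perDay(2, :);
lows2  = perDay(3, :);
```

The second approach touches the text only once, but it depends on knowing exactly how many numbers each day contributes, which is why the per-day lookup is easier to reason about.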

Comments

Wasn't that easy? Give this a try, and let us know what you think here or leave a comment for Ned. If you find interesting data, consider tracking the trend using Trendy!

Yes, you should be able to use “urlfilter” to scrape data from the page. The key is to find the keyword in the web page near the value that you are interested in and capture the result for any postprocessing.

Also, check out Trendy. It’s set up to automatically scrape the data every day.

@Paul,

Thanks for this! I like the “occurrence” option. That provides additional flexibility to make it easier to scrape data. Can you suggest the enhancement to Ned?
