6 Replies - 1317 Views - Last Post: 20 April 2014 - 07:19 PM

Article/Media Scraping App

Posted 19 April 2014 - 05:04 PM

I want to create an app that scrapes Google and some other news sites for articles, images and videos. The user enters a keyword, then the app scrapes news and media for that keyword and delivers the results through the app over the course of the day. Its main function would be to provide news updates on certain topics in a simplified manner. I'm relatively new to programming and not very familiar with scrapers, so I'm not sure how this would work. How hard would this be for someone like me?

Also, would there be any legal implications of scraping data from other websites?

Re: Article/Media Scraping App

Posted 19 April 2014 - 07:49 PM

Naive screen scraping is easy. Making an intelligent indexer is much harder. For example, if the keyword you are looking for is the noun "taxes", you'll probably want to filter out pages where the word only appears in boilerplate such as "Taxes and shipping not included", and pages where "taxes" is used as a verb (e.g. "Installing Photoshop taxes the system"), but you'll want to keep pages that say "Remember to pay your taxes".
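
To make the difference concrete, here is a rough Python sketch of the "taxes" example. It is only an illustration, not a real indexer: the names naive_match and filtered_match, the boilerplate list, and the "followed by a determiner means verb" rule are all assumptions made up for this post. A serious version would likely use a proper part-of-speech tagger instead of these hand-written rules.

```python
import re

# Match "taxes" as a whole word, ignoring case.
KEYWORD = re.compile(r"\btaxes\b", re.IGNORECASE)

# Hypothetical list of boilerplate phrases that should not count as a hit.
BOILERPLATE = ("taxes and shipping not included",)

# Crude, assumed heuristic: "taxes" directly followed by a determiner
# ("taxes the system") is probably the verb, not the noun.
VERB_PATTERN = re.compile(
    r"\btaxes\s+(?:the|a|an|my|your|his|her|our|their)\b", re.IGNORECASE
)


def naive_match(text):
    """Naive scraping: any occurrence of the keyword counts."""
    return KEYWORD.search(text) is not None


def filtered_match(text):
    """Drop boilerplate and likely verb uses, then look for the keyword."""
    cleaned = text.lower()
    for phrase in BOILERPLATE:
        cleaned = cleaned.replace(phrase, " ")
    cleaned = VERB_PATTERN.sub(" ", cleaned)
    return KEYWORD.search(cleaned) is not None


samples = [
    "Taxes and shipping not included",
    "Installing Photoshop taxes the system",
    "Remember to pay your taxes",
]

for s in samples:
    print(s, "| naive:", naive_match(s), "| filtered:", filtered_match(s))
```

The naive check accepts all three sample sentences, while the filtered check keeps only the last one. Even so, the rules are brittle (they would miss "taxes most systems", for example), which is exactly why the intelligent part is the hard part.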