Abstract
Everything you need to know about screen scraping, from simply pulling down a page to more complex tasks such as submitting forms and handling cookies. Here you will learn how to use the WebClient and HttpWebResponse classes and which is better suited to which task.

There have been articles on ASPAlliance about data scraping before; today we will look at the different techniques. The System.Net namespace provides classes for accessing data over the web, and we will be looking at two of them: WebClient and HttpWebResponse (which is obtained through HttpWebRequest, a class derived from WebRequest).

Both classes can handle the scraping tasks covered here; it is more a case of which to use for which job.

Here we will cover everything you would want to do with the two classes and see which comes out best.
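
As a rough illustration of the two approaches, the sketch below fetches the same page once with WebClient and once with HttpWebRequest/HttpWebResponse. The URL http://example.com/ and the class and method names (ScrapeSketch, FetchWithWebClient, FetchWithHttpWebRequest) are placeholders made up for this example; it is a minimal sketch, not the article's own code.

```csharp
using System;
using System.IO;
using System.Net;

class ScrapeSketch
{
    // Simplest route: WebClient hides the request/response plumbing.
    static string FetchWithWebClient(string url)
    {
        using (WebClient client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }

    // More control: HttpWebRequest exposes the method, headers and cookies.
    static string FetchWithHttpWebRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        // Stores any cookies the server sets on this request.
        request.CookieContainer = new CookieContainer();

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        const string url = "http://example.com/"; // placeholder URL (assumption)
        Console.WriteLine(FetchWithWebClient(url).Length);
        Console.WriteLine(FetchWithHttpWebRequest(url).Length);
    }
}
```

WebClient is the shorter option for a plain download, while the HttpWebRequest version is the natural starting point when cookies or custom headers have to be carried between requests.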

Title:
Thanks for sharing this
Name:
Evgeniy
Date:
2011-06-14 2:28:07 PM
Comment: Nice article. Usually I use commercial libraries for .NET scraping, like Scraper or Gogybot. I think your article describes how these libraries work internally. Thanks!

Title:
error in links
Name:
harish
Date:
2011-04-07 8:11:53 AM
Comment: Most of the links on this website are not working.

Title:
Frustrating code
Name:
Milind
Date:
2010-05-14 3:28:08 AM
Comment: The article seems to be good and looks like it covers all the aspects. But damn, I am not able to see the code!! It just keeps giving a 503 error.

Hello Admin/Webmaster,
Any reason? Or when can we expect this to be fixed?
Thanks,
Milind

Title:
Great Article
Name:
Seamus McMahon
Date:
2010-04-21 12:24:01 PM
Comment: This article is very useful. I have been reading up on screen scraping, and in particular entering data into forms, but very little of the subject is covered in any books. This has been very helpful.

Title:
mr
Name:
G P Zob
Date:
2009-12-03 6:50:51 AM
Comment: what happens if the page you are scraping errors? Do you get the error page in the response stream? No. So how do you display the scraped error page in the scraping page? Any ideas?

Title:
How does this translate for using with Siebel?
Name:
Marc Tucker
Date:
2009-10-13 12:23:09 PM
Comment: I am interested in the code you use to do this in Siebel. I am using Siebel 7.8 currently, and my manager has asked if we can screen scrape data from the Siebel screen to populate some notifications to our sales reps. I can use your example to scrape basic info, but how do I drill into the specific frame and/or object to get the date I'm wanting to retrieve?

Title:
Service Unavailable error
Name:
H Yeung
Date:
2009-09-08 5:31:48 PM
Comment: Is the service down? I received service unavailable error when I tried to see the source.

Title:
Scraping w/o request
Name:
John McKenney
Date:
2008-03-07 8:34:05 AM
Comment: How do I implement a scrape if I cannot request the URI? Meaning, I have to read a static HTML page served by PeopleSoft; I cannot request the URL, I already have the page. I have a VB.NET app that I want to read a certain piece of data from that static page. Any pointers?

Title:
Screen scraping from embedded ActiveX controls?
Name:
Marc Tucker
Date:
2008-02-24 7:25:07 AM
Comment: I have a Siebel application that we copy and paste values out of and into another app made in VB.NET. We've sent in an enhancement request to the dev team to get the data an easier way but it's not a top priority for them. Can we use screen scraping to extract the data from the specific applet in question? Siebel uses frames within frames within frames also. I have tried mapping out the frames to access the data via the DOM but that isn't getting me the info I want and need to know.

The problem was with my .Net assembly folder. I did a 2.0 repair and it worked fine after.

My apologies.

Title:
Fortunate
Name:
Brendan
Date:
2007-12-18 9:14:39 AM
Comment: It isn't Article.HttpWebRequest; that is why you get an error. You should be using System.Net.HttpWebRequest like the author does in this article.

Title:
Screen scraping with all the links having absolute paths
Name:
Ross
Date:
2007-12-07 9:57:12 AM
Comment: I need more information on scraping pages so that all the links have absolute paths, so that they can be mapped to the local web application.

Title:
re:Webscrap a Website
Name:
DamianM
Date:
2007-11-09 6:13:56 AM
Comment: It would be hard for me to say if you're doing anything wrong; this would depend on the site you are scraping. You need to mimic 100% what the browser is doing. It is possible to do what you want; it is just tricky sometimes.

Title:
Webscrap a Website
Name:
Sandeep
Date:
2007-11-09 3:33:25 AM
Comment: I tried passing values, but it didn't work. Am I doing anything wrong? What I want the webscraper program to do is pass the login ID and password to the login page and invoke the "LonIn" button click event so that I get the response and then the page after the login page is called. Is it possible?

Title:
re:Webscrap a Website
Name:
DamianM
Date:
2007-11-08 4:33:42 AM
Comment: Read the passing forms section. The password and user name are probably passed as a form. To simulate pressing the submit button, you will need to post the form to whatever URL the form is submitted to.

I want to do web scraping for a web site which asks for a user ID and password (which I have). How do I pass this info to the website, and how do I invoke the button click event so that it will execute the code behind that button and give a response? Also, once in, I want to perform various tasks like buying a product out of many and finally making payment using a credit card. All this needs to be done using web scraping.

Title:
Source Code
Name:
Code?
Date:
2007-10-05 8:25:38 AM
Comment: The source code seems to display fine; it's just some of the examples that do not work.

Title:
Pages can't load
Name:
Someone interested in this topic
Date:
2007-10-04 10:07:04 PM
Comment: Looks like a great article where the author intends to show the full code directly. But the thing is... the pages can't load, and I see error pages all the time :(

Title:
Too bad...
Name:
Can't see code
Date:
2007-04-26 3:34:08 AM
Comment: Would be a great article, but something is wrong with the ASPAlliance setup here. Saw a couple of the examples yesterday, but today all I get is the "500" error.

Title:
Any solution to my above problem
Name:
Ritesh
Date:
2006-08-02 4:26:09 AM
Comment: Is there any solution or has anyone ever encountered this

Title:
Error : The underlying connection was closed: The remote name could not be resolved.
Name:
Ritesh
Date:
2006-08-02 4:25:11 AM
Comment: Exception Details: System.Net.WebException: The underlying connection was closed: The remote name could not be resolved.