Introduction

This article is a guide to programmatically performing actions on web sites. I use the submission of links to DZone to illustrate the concept. These days there is a lot of emphasis on sharing your posts, links, and articles with the whole community, so DZone link submission makes a good example. At the heart of the implementation are the HttpWebRequest and HttpWebResponse classes, which I mention because I am using the .NET Framework. Behind the scenes, though, it is as simple as sending an HTTP request and analyzing the response, so you can use whatever tool you have at hand. I will explain each step that I followed to come up with the solution. The same steps work for most web applications of this kind.

Use Web Site to Perform the Action

First you need to analyze the action you are performing: how the web site sends its request and what kind of response it returns. These two analysis steps drive the whole solution. Let's take the example of submitting a new link on the DZone site. You click on "Add a new link" and you are taken to a page that asks you to log in. After you log in, you are sent to a page where you supply values for URL, Title, Description, Tags, etc. Then you click the "Submit" button and you are done. Based on this, the following are the steps that you need to perform programmatically.

Submit request to add link

Catch redirect to login page

Perform login into site

Send request to add page after successful login

To see what the browser does to perform these actions, fire up a tool like Fiddler and monitor all the requests and responses. If you can mimic them, you are good to go. Now let's see how to perform each action programmatically.

Submit Add Request

You use the HttpWebRequest object to send a request to http://www.dzone.com/links/add.html. At this point, you do not have to worry about specifying any other parameters like Title, URL, etc., because the request is not going to go through: you are not logged into the site. In technical terms, you have not established an authenticated session with the site.

Catch Redirect To Login Page

When you send an unauthorized request to add a link, the site redirects you to the login page. This means that when you send an HTTP request for the add.html page, the server sends back an HTTP response with status code 302, which tells the client to fetch a different URL, and it puts that URL in the Location header of the response. So programmatically you need to submit a request with automatic redirection turned off, check the response status code, and read the Location header.
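A minimal sketch of this step, assuming the .NET Framework classes named in the introduction, could look like the following. The class and method names (RedirectFollower, FollowRedirects, ResolveLocation) are illustrative and are not taken from the sample project.

```csharp
using System;
using System.Net;

static class RedirectFollower
{
    // Resolves a Location header value, which may be absolute or relative,
    // against the URL that produced it.
    public static Uri ResolveLocation(Uri current, string location)
    {
        return new Uri(current, location);
    }

    // Follows 3xx responses by hand, up to maxHops, and returns the last URL
    // that did not redirect. Cookies from every hop land in the same container.
    public static Uri FollowRedirects(Uri start, CookieContainer cookies, int maxHops = 20)
    {
        Uri current = start;
        int hops = 0;
        while (hops < maxHops)
        {
            var request = (HttpWebRequest)WebRequest.Create(current);
            request.Method = "GET";
            request.AllowAutoRedirect = false; // we want to see the 302 ourselves
            request.CookieContainer = cookies; // reuse one container on every hop

            using (var response = (HttpWebResponse)request.GetResponse())
            {
                int status = (int)response.StatusCode;
                string location = response.Headers["Location"];
                if (status < 300 || status >= 400 || string.IsNullOrEmpty(location))
                {
                    return current; // not a redirect: this is the final page
                }
                current = ResolveLocation(current, location);
            }
            hops++;
        }
        throw new InvalidOperationException("Too many redirects.");
    }
}
```

Calling RedirectFollower.FollowRedirects(new Uri("http://www.dzone.com/links/add.html"), new CookieContainer()) would return the URL of the page the site finally lands on, which for an anonymous session should be the login page.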

This check belongs in a while loop, because some sites redirect you through a couple of pages before sending you to the final login page. I have limited the loop to 20 hops.

Cookies

This is the most important part of the whole implementation. When you start a session with a site, it sends some cookies in its response, and it expects some of those cookies to be sent back in subsequent requests as proof that you have an authorized session open. In my code, I attach a CookieContainer object to each request so that all the cookies sent in the responses are collected. The same container is then attached to every subsequent request.
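To see what the container does for you, here is a small offline sketch. The cookie name JSESSIONID and its value are stand-ins for whatever the server actually sets, and CookieDemo is a hypothetical helper class.

```csharp
using System;
using System.Net;

static class CookieDemo
{
    // Builds a container holding one cookie, as if a response had set it.
    // The name and value are made up for illustration.
    public static CookieContainer BuildSampleContainer()
    {
        var cookies = new CookieContainer();
        cookies.Add(new Cookie("JSESSIONID", "abc123", "/", "www.dzone.com"));
        return cookies;
    }

    // Returns the Cookie header that the container would attach to a
    // request for the given URL.
    public static string HeaderFor(CookieContainer cookies, Uri target)
    {
        return cookies.GetCookieHeader(target);
    }
}
```

Because the container matches cookies by domain and path, a request to a URL under www.dzone.com picks the cookie up automatically once the same container is assigned to request.CookieContainer, while a request to another host gets nothing.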

Perform Login

When you log in on the site, the browser submits a FORM to the server with key-value pairs that contain the data required to validate the user. You can use the Internet Explorer Toolbar, FireBug, or any other tool to inspect the HTML of the page and locate the FORM tag and the values that need to be sent. I used FireBug to inspect that section and find the values that I needed. The following images show the result:

You can see that there is a FORM that uses the POST method and whose action points to /links/j_acegi_security_check. It has two text boxes with the element names j_username and j_password, which take the login information and are submitted with the POST request. These are the pieces of information you need to perform the login action.
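A minimal sketch of such a login request is shown below, assuming the form is posted as application/x-www-form-urlencoded and that the two fields above are all the server needs. The field names come from the form; the class and method names are illustrative.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class DzoneLogin
{
    // Builds the URL-encoded body for the two login fields found in the form.
    public static string BuildLoginBody(string user, string password)
    {
        return "j_username=" + Uri.EscapeDataString(user) +
               "&j_password=" + Uri.EscapeDataString(password);
    }

    // Posts the credentials to the form's action URL and returns the raw
    // response. The caller is responsible for disposing it.
    public static HttpWebResponse PostLogin(Uri loginAction, string user,
                                            string password, CookieContainer cookies)
    {
        byte[] body = Encoding.UTF8.GetBytes(BuildLoginBody(user, password));

        var request = (HttpWebRequest)WebRequest.Create(loginAction);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = body.Length;
        request.AllowAutoRedirect = false; // keep any 302 visible to the caller
        request.CookieContainer = cookies; // same container as the earlier requests

        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        return (HttpWebResponse)request.GetResponse();
    }
}
```

The loginAction argument would be the form's action resolved against the site, for example new Uri("http://www.dzone.com/links/j_acegi_security_check"), and cookies must be the container that collected the session cookies during the redirect step.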

Did Login Succeed?

After you execute the above request and get the response back, the big question is how to check whether the login succeeded. You can't rely on the response status code alone: a status of 200 only means that the request succeeded, not that the login did. There are a couple of things that you can check. Some sites redirect you to a landing page after login, so you can check whether you got a 302 response code. A surer way is to parse the response and see whether the page still contains a login box. For example, in the case of the DZone.com site, you can check whether there is a markup node on the page that has a name attribute with the value j_username, or any other markup that is unique to the login page. If you find that node, the login did not work.
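The check for the login box can be sketched as follows. This version uses a regular expression instead of an HTML parser, which is a simplification of mine and is only good enough for spotting one well-known attribute. The names LoginCheck, ReadBody and LooksLikeLoginPage are hypothetical.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

static class LoginCheck
{
    // Reads the whole response body as text.
    public static string ReadBody(HttpWebResponse response)
    {
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // True if the HTML still contains an element named j_username,
    // which means the server sent the login form back to us.
    public static bool LooksLikeLoginPage(string html)
    {
        return Regex.IsMatch(html, @"\bname\s*=\s*[""']?j_username\b",
                             RegexOptions.IgnoreCase);
    }
}
```

After the login POST, a 302 response is followed to its target first; the login is then treated as failed if LooksLikeLoginPage(ReadBody(response)) returns true for the final page.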

Submit New Request with Authorized Session

During this whole process of login and redirection, make sure that you keep the cookie container around so that it keeps collecting all the cookies. You need this cookie container to send the request that submits your link. Now you just need to send a new POST request to the target URL with the appropriate FORM parameters, such as title, URL, and description.
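The final request can be sketched in the same way. The exact field names of the add-link form are not given in this article, so the keys used in the usage note below are placeholders that you would replace with the names you find when inspecting the form. LinkSubmitter is an illustrative class name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

static class LinkSubmitter
{
    // URL-encodes arbitrary key-value pairs into a form body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (var field in fields)
        {
            parts.Add(Uri.EscapeDataString(field.Key) + "=" +
                      Uri.EscapeDataString(field.Value));
        }
        return string.Join("&", parts.ToArray());
    }

    // Posts the form body with the authenticated cookie container attached
    // and returns the status code of the response.
    public static HttpStatusCode Submit(Uri addUrl, string formBody, CookieContainer cookies)
    {
        byte[] body = Encoding.UTF8.GetBytes(formBody);

        var request = (HttpWebRequest)WebRequest.Create(addUrl);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = body.Length;
        request.CookieContainer = cookies; // carries the logged-in session

        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return response.StatusCode;
        }
    }
}
```

A call might pass a list of pairs such as ("title", "My post") and ("url", "http://example.com/post") through BuildFormBody and hand the result to Submit. As with the login, a 200 status alone does not prove the link was accepted, so it is worth inspecting the returned page as well.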

Sample Project

A sample project and other prerequisites for the code shown here are available at ByteBlocks.

Hi,
this seems to be the solution to my problem,
but I am facing a problem implementing it.
I have copied all of your code and also added a reference to htmlparser pro,
but I am getting an error on "RequestAttributes" (it is undefined in the program). Can you tell me in which assembly RequestAttributes is defined?