Monday, December 18, 2006

I ran into a problem while reading binary data from a site for a web-spidering application that I was developing a couple of months ago. I was able to read text from a few sites, but the same code failed on many others because the ResponseStream was not seekable. See this code snippet:

I could not get the stream’s length up front: asking the response stream for its length threw “This stream does not support seek operations” exceptions, and the ContentLength property of the WebResponse object is not a dependable substitute because it is -1 when the server sends no Content-Length header. I later realized that it was not actually a problem with the WebResponse object; the way I intended to retrieve binary data from the web was not right. OK, so how can I retrieve data out of this stream, say, straight HTML text for further parsing? A reliable way to do this is to copy the stream into a MemoryStream and finally convert it into a byte array.

Here, I am instantiating a new MemoryStream object, reading fixed-size chunks from the stream and copying them over to the MemoryStream. The Stream.Read() method reads at most the requested number of bytes from the current stream, stores them in the buffer, and returns the number of bytes it actually read ("bytesRead"). In this example it requests a maximum of 2048 bytes each time, stores them in a buffer and then writes them into the MemoryStream. The method returns 0 when there is no more data to be read. One important thing to note here is that Stream.Read() can return fewer bytes than requested (< 2048) even if the end of the stream has not been reached, so only the first bytesRead bytes of the buffer should be written on each pass.
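Since the copy loop is the heart of this post, here is a minimal sketch of it. The class and method names (StreamHelper, ReadFully) are my own placeholders for illustration, not necessarily what the original snippet used.

```csharp
using System;
using System.IO;

public static class StreamHelper
{
    // Copies every byte of 'source' into memory and returns it as an array.
    // It never touches Length, Position or Seek, so it also works on
    // non-seekable streams such as a web response stream.
    public static byte[] ReadFully(Stream source)
    {
        byte[] buffer = new byte[2048];
        using (MemoryStream memory = new MemoryStream())
        {
            int bytesRead;
            // Read() returns 0 only when the stream is exhausted; it may
            // return fewer than 2048 bytes on any pass before that.
            while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Write only the bytes that were actually read.
                memory.Write(buffer, 0, bytesRead);
            }
            return memory.ToArray();
        }
    }
}
```

With a WebResponse in hand, the call would look something like StreamHelper.ReadFully(response.GetResponseStream()).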

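The MemoryStream.ToArray() method finally converts the contents into a byte array. If the retrieved data is plain text, which can be determined from the response headers, it can be converted into a string using the System.Text.Encoding.ASCII.GetString(buffer) method. Keep in mind that ASCII decoding only preserves 7-bit characters, so a page served in another character set needs the matching Encoding (for example System.Text.Encoding.UTF8). Otherwise, write the byte array to a file using a FileStream object.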
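The two outcomes for the byte array (decode it as text, or save it to disk) could be sketched roughly like this. BodyDecoder, ToText and SaveToFile are hypothetical names, and falling back to UTF-8 when the charset is missing or unknown is my assumption, not something the response headers guarantee.

```csharp
using System;
using System.IO;
using System.Text;

public static class BodyDecoder
{
    // Decodes 'body' using the charset name taken from the Content-Type
    // header (for example "utf-8"). Falls back to UTF-8 (an assumption)
    // when the name is missing or not recognised.
    public static string ToText(byte[] body, string charset)
    {
        Encoding encoding = Encoding.UTF8;
        if (!string.IsNullOrEmpty(charset))
        {
            try
            {
                encoding = Encoding.GetEncoding(charset);
            }
            catch (ArgumentException)
            {
                // Unknown charset name: keep the UTF-8 fallback.
            }
        }
        return encoding.GetString(body);
    }

    // Writes binary content to 'path', replacing any existing file.
    public static void SaveToFile(byte[] body, string path)
    {
        using (FileStream file = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            file.Write(body, 0, body.Length);
        }
    }
}
```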

You will find this implementation very useful if you’re planning to download something and later resume broken downloads in your web-spidering applications!

Sunday, December 3, 2006

Encrypting sensitive data is very important in most software applications today. In this post I’ll show you how to encrypt and decrypt string data using the encryption classes contained within .NET. The method I’m going to employ here is a “private-key” (symmetric) algorithm that uses one key to encrypt and the same key to decrypt the data. I’ll be using the TripleDES encryption algorithm, which is considerably stronger than standard DES because it applies the DES cipher three times to each block.

DES, the Data Encryption Standard, encrypts data in 64-bit (8-byte) blocks and applies 16 rounds of encryption to every block of data. Its key is 64 bits long, but 8 of those bits serve as parity bits, which leaves an effective key size of 56 bits. As mentioned earlier, TripleDES runs the DES cipher three times on each block, using three 56-bit keys for an effective key size of 168 bits (stored as 192 bits, or 24 bytes, once the parity bits are included). That is, each block of data actually goes through 3 × 16 = 48 rounds, making TripleDES more secure than DES.

In this example, instead of using a 24-byte private key directly, I’m converting a pass phrase into a TripleDES key using the hash computation found in the MD5CryptoServiceProvider class. An MD5 hash is 16 bytes long; TripleDES in .NET also accepts a 16-byte key, in which case the first 8 bytes are reused as the third DES key. I am doing this because a pass phrase makes more sense when you want to share the private key with the people you exchange secure data with. You definitely don’t want to memorize a private key of this kind {12, 26, 13, 44, 95, 16, 17, 38, 29, 10, 11, 22, 43, 24, 15, 56, 37, 78, 29, 27, 23, 52, 43, 4} to decrypt or encrypt your data.
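To make the pass-phrase idea concrete, here is a rough sketch of the key derivation. KeyHelper and DeriveKey are hypothetical names, encoding the pass phrase as UTF-8 is my assumption, and I call MD5.Create() rather than constructing MD5CryptoServiceProvider directly. MD5 yields 16 bytes, so the sketch repeats the first 8 bytes at the end to fill a 24-byte key, which is the same as two-key TripleDES.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class KeyHelper
{
    // Turns a pass phrase into a 24-byte TripleDES key.
    public static byte[] DeriveKey(string passPhrase)
    {
        byte[] hash;
        using (MD5 md5 = MD5.Create())
        {
            // MD5 always produces 16 bytes (128 bits).
            hash = md5.ComputeHash(Encoding.UTF8.GetBytes(passPhrase));
        }

        // Lay the key out as K1 | K2 | K1 (two-key TripleDES).
        byte[] key = new byte[24];
        Buffer.BlockCopy(hash, 0, key, 0, 16);
        Buffer.BlockCopy(hash, 0, key, 16, 8);
        return key;
    }
}
```

The same pass phrase always gives the same key, which is what lets both sides decrypt each other's data.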

Let’s look at the code now. To begin with, we’ll add a few namespaces.

An initialization vector (IV) is used to mask repeating patterns in private-key algorithms, which process data in blocks. In general terms, a block cipher encrypts one block and then moves on to the next, using the same key each time. If two blocks contain the same data, their encrypted output would also be identical, because both were encrypted with the same key. With chaining, each block is combined with the previous encrypted block before it is encrypted, and the initialization vector plays the role of the “previous block” for the very first one, so identical input blocks produce different encrypted data.
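The effect is easy to see by encrypting two identical blocks with and without chaining. This is only a sketch: ChainingDemo and its methods are hypothetical names, the IV is an arbitrary value I picked for illustration, and I use TripleDES.Create() rather than a specific provider class.

```csharp
using System;
using System.Security.Cryptography;

public static class ChainingDemo
{
    // Example 8-byte IV; the values are arbitrary.
    public static readonly byte[] Iv = { 21, 43, 65, 87, 109, 131, 153, 175 };

    // Encrypts 'data' (a whole number of 8-byte blocks) with no padding,
    // so that block i of the output corresponds to block i of the input.
    public static byte[] Encrypt(byte[] data, byte[] key, CipherMode mode)
    {
        using (TripleDES tdes = TripleDES.Create())
        {
            tdes.Mode = mode;
            tdes.Padding = PaddingMode.None;
            tdes.Key = key;
            tdes.IV = Iv;   // ignored in ECB mode
            using (ICryptoTransform encryptor = tdes.CreateEncryptor())
            {
                return encryptor.TransformFinalBlock(data, 0, data.Length);
            }
        }
    }

    // True when the first two 8-byte blocks of 'cipher' are identical.
    public static bool FirstTwoBlocksEqual(byte[] cipher)
    {
        for (int i = 0; i < 8; i++)
        {
            if (cipher[i] != cipher[8 + i]) return false;
        }
        return true;
    }
}
```

Feeding it 16 bytes made of the same 8-byte block twice, CipherMode.ECB gives two identical encrypted blocks, while CipherMode.CBC (the .NET default, which uses the IV) gives two different ones.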

The above initialization vector contains 8 bytes, which matches the 8-byte block size of TripleDES. You can replace the values with your own. Take a look at the Encrypt method.

I’m first converting the plain text string to a byte array, since most cryptographic methods expect the data in this format. I’m then instantiating a couple of objects: first, TripleDESCryptoServiceProvider, which accepts the initialization vector and the private key, and second, MD5CryptoServiceProvider, which computes the hash and creates a valid TripleDES private key from the pass phrase that I was telling you about. The ICryptoTransform interface defines the basic operations of cryptographic transformations. Its TransformFinalBlock method transforms the input byte array into an encrypted byte array by applying the private key and the initialization vector. The result is finally converted into a Base64 string and returned to the calling method.

The process of decrypting the data is very similar to encrypting it, because both use exactly the same private key. The Decrypt method shown below performs the decryption.
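Putting the pieces together, a pair of Encrypt and Decrypt methods along the lines described above might look like this. Treat it as a sketch under assumptions: StringCipher is a hypothetical class name, the IV values are placeholders, text is encoded as UTF-8, the 16-byte MD5 hash is stretched to 24 bytes by repeating its first 8 bytes, and I use the TripleDES.Create() and MD5.Create() factory methods instead of the provider classes. The fixed IV mirrors the approach in this post; a fresh random IV per message is the more common practice.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class StringCipher
{
    // Placeholder 8-byte initialization vector (assumed values).
    private static readonly byte[] Iv = { 21, 43, 65, 87, 109, 131, 153, 175 };

    // Pass phrase -> MD5 hash (16 bytes) -> 24-byte key laid out as K1 | K2 | K1.
    private static byte[] DeriveKey(string passPhrase)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(passPhrase));
            byte[] key = new byte[24];
            Buffer.BlockCopy(hash, 0, key, 0, 16);
            Buffer.BlockCopy(hash, 0, key, 16, 8);
            return key;
        }
    }

    // Returns the Base64 form of the encrypted text.
    public static string Encrypt(string plainText, string passPhrase)
    {
        byte[] plainBytes = Encoding.UTF8.GetBytes(plainText);
        using (TripleDES tdes = TripleDES.Create())
        {
            tdes.Key = DeriveKey(passPhrase);
            tdes.IV = Iv;
            using (ICryptoTransform encryptor = tdes.CreateEncryptor())
            {
                byte[] cipherBytes = encryptor.TransformFinalBlock(plainBytes, 0, plainBytes.Length);
                return Convert.ToBase64String(cipherBytes);
            }
        }
    }

    // Reverses Encrypt; needs the same pass phrase and IV.
    public static string Decrypt(string cipherText, string passPhrase)
    {
        byte[] cipherBytes = Convert.FromBase64String(cipherText);
        using (TripleDES tdes = TripleDES.Create())
        {
            tdes.Key = DeriveKey(passPhrase);
            tdes.IV = Iv;
            using (ICryptoTransform decryptor = tdes.CreateDecryptor())
            {
                byte[] plainBytes = decryptor.TransformFinalBlock(cipherBytes, 0, cipherBytes.Length);
                return Encoding.UTF8.GetString(plainBytes);
            }
        }
    }
}
```

Calling StringCipher.Decrypt(StringCipher.Encrypt(text, phrase), phrase) should hand back the original text, while a wrong pass phrase will typically fail with a CryptographicException.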

This completes the encryption/decryption of string data using the classes and methods contained in the System.Security.Cryptography namespace. Screenshots of the sample application that I created are shown below.

You can create your own Encryptor/Decryptor class library with these methods and develop more secure enterprise applications. Some areas of interest would be encrypting passwords before saving them to the database, or encrypting the query strings in your web applications.