TopMenu

Wednesday, 12 October 2011

Isaac Newton once said he stood on the shoulders of giants. Dennis Ritchie was a giant whose shoulders people like Steve Jobs stood on.

Dennis Ritchie, Creator of the C Programming Language & UNIX is dead at age 70.....all programmers owe him a moment of silence! Read his biography at Wikipedia.

Pioneer of Computing Resting in Peace...*Dennis Ritchie has graduated from Harvard University with degrees in physics and applied mathematics and has started began working at the Bell Labs Computing Sciences Research Center and received a Ph.D. from Harvard.*He is the creator of the C programming language and a key developer of the Unix operating system, and as co-author of the definitive book on C, The C Programming Language

* He is one of the pioneers of modern computing. The C language is still widely used today in application and operating system development

You don't know him ? He is the Father of C, without him Operating Systems(Windows, Apple) wouldn't be here, Games, Programs, Apps on your Apple, Android, wouldn't exist. Many say he is the Father of Computer Science, He has so many achievements..

Dennis Ritchie (1941-2011). His pointer has been cast to void *; his process has terminated with exit code 0.

Tuesday, 4 October 2011

Sometimes its required to fetch random records from database. The simplest approach is to use NewID() with order by clause. Well I have no problem with that but when you have large table to query then It’ll start popping up the issues of performance. Why? I’ll try to answer this question with alternative solution.

To start with I’ve Adventureworks database with around 260000 records in a table and we’ve a query that will take a random record from it.

The problem with this query is, when you look at the execution plan then you’ll find it performs a sort operation before it picks a top record.

So 91% of the cost is due to the sort operation. Well this will happen if you use NewID() to pick the random records it first sorts the records based on NewID() which is worst case for sorting based on GUID values.

Now get rid of Sort operation of NEWID() by- BINARY_CHECKSUM

Best approach is to use the BinaryCheckSum instead of NewID to reduce the sort operation to perform on NEWID column generated in Memory.

SELECTTOP 1 [BusinessEntityID]

FROM HumanResources.EmployeePayHistory

WHERE (ABS(CAST(

(BINARY_CHECKSUM(*)*

RAND())asint))% 100)< 10

The BINARY_CHECKSUM function generates a checksum value that is based on the values of the columns that you specify. If two rows are different, they typically will generate different checksum numbers. TheBINARY_CHECKSUM function generally is used to verify whether any of the columns in a row in a table have changed. However, for our purposes, it generates a number that looks like a random number for each row.

Another form of using BinaryCheckSum can be:

SELECTTOP 1 [BusinessEntityID]

FROM HumanResources.EmployeePayHistory

WHERE

(ABS(CAST(

(BINARY_CHECKSUM

([BusinessEntityID],NEWID()))asint))

% 100)< 10

Both will form of Binary_Cheksum will have same query Execution plan with similar cost.

Now lets have a look at the Figures analyzed by the Microsoft team to compare performance of both queries.

13 million rows

NEWID query

253

347,420

13,810,132

14,157,552

422,891

BINARY_CHECKSUM query

21

347,420

0

347,420

49,203

And in our case here I’ve also did some comparison based on Query plans of Old query of NEWID and new query with BINARY_CHECKSUM

Just another tips improving performance but if you have to pick random records from small tables in SQL then you can still use NEWID() with no issues.

Monday, 3 October 2011

Alright folks who knows the panic of using SqlDataAdapter class to fill the Datatable in case if you required only a single table to fill from database. This post is to show you there’s an alternate way to get a DataTable and that’s alternate way is returning DataReader from the command object after executing the command.

Well I always thought of DataReader when it comes to performance but I can’t forget to keep the data in memory and also keeping in mind that DataReader works in connected mode. So I came up with the mixed approach means reading the fetching the data from DataReader and then fill the data into a DataTable. The next thing was to make it reusable so finally I did some juggling with the code and came with the code below that can convert your DataReader object to DataTable.

Always use IDataReader to receive the object returned from the Command.ExecuteReader() method.

you can get the schema of the Table by method that is available through the IDataReader interface GetSchemaTable(). I’ll use this method to get the schema of the table that IDataReader is holding. Then creating a new DataTable using the data we received from the schema and then start populating it by using the data in the DataReader. See below snippet I’ve put all the ideas together and created this method GetDataTableFromDataReader().

Now you can use this method to fetch the data in the DataReader and then get the DataTable from it. Its really fast I’m telling you. Yeah people can ask me about the complexity of creating another table’s schema and looping through the records. But this method is still worth using SqlDataAdapter class instead.

Let me go though a small demo over this. Let create a DB connection and use this method and the Adapter method to get the Performance records.

Now Run the code and lets have a look at figures. I ran this program 2 times just to make sure counts are correct

First Run - Time Elapsed in Adapter: 93.0053 Time Elapsed in Custom : 13.0008

Second Run - Time Elapsed in Custom : 129.0074 Time Elapsed in Adapter: 14.0008

So I’m sure you must be impressed with performance here. Moreover you can make it more fast by using the Parallel.For() and Parallel.Foreach() loop if you are working on .Net 4.0 and have a multicore processor.