Saturday, April 23, 2016

This is not something new with the latest Microsoft SQL Server, but it is still being discussed and remains unclear to many database developers. While I was discussing the SQL Server transaction log with a few people, we also covered how SQL Server accepts our requests and modifies records. I thought it would be useful to everyone, hence this post;

What really happens when we send a request to SQL Server? It can be an update or a delete, and it might be related to one record or many. Have a look at the image below;

This starts with the request. Either using an application or connecting directly to SQL Server using something like SSMS, we send the request. Once SQL Server receives the request, it checks the data pages related to the records. If the required data pages are not in memory (the buffer cache), it loads the relevant data pages from disk into memory. Then, remember, it modifies the records in the pages that are in memory, not the pages on disk. That is what the 1st and 2nd steps in the image explain.

The third step is updating the transaction log on disk. Once the page (or pages) in memory are modified, they become dirty pages. SQL Server then writes redo and undo information to the log file. During this update, the related pages are locked until the transaction log is completely updated.

Once the log is updated, an acknowledgement is sent to the application. Note that even though the data files are not yet updated, we receive a message saying the records were successfully updated. But we do not need to worry: even if something happens after we receive the message, SQL Server can recover committed records, making sure that durability, one of the properties of a transaction, is satisfied.

Later, after one or more transactions, a process called Checkpoint writes all dirty pages back to disk, updating the data files. This is how SQL Server handles our update requests.
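The write path described above can be loosely observed with a quick sketch. This is a rough illustration only: the database name TestDb is an assumption, and sys.dm_os_buffer_descriptors simply exposes which in-memory pages are currently marked as modified.

```sql
-- Assumes a test database named TestDb with some recent modifications
USE TestDb;
GO

-- Count dirty (modified-in-memory) pages belonging to this database
SELECT COUNT(*) AS DirtyPages
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID() AND is_modified = 1;

-- Force the checkpoint process; dirty pages are written back to the data files
CHECKPOINT;

-- Dirty page count should now be at or near zero
SELECT COUNT(*) AS DirtyPagesAfterCheckpoint
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID() AND is_modified = 1;
```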

Thursday, April 21, 2016

When PolyBase was opened up to SQL Server editions other than APS and PDW, I tested it with SQL Server 2016 CTP 2.1. It had some issues, but I was able to get it working with some workarounds. I thought I would do the same with RC3, since there are many enhancements, and it works fine now. If you are interested in PolyBase, have a look at the post I wrote in June 2015; SQL Server 2016 features: Polybase: SQL Server connector to Hadoop.

What is PolyBase? It is a feature available with SQL Server 2016 that lets us use TSQL against data stored in Hadoop or Azure Blob Storage, allowing us to combine structured data with semi-structured data. Data warehousing benefits a lot from this, as it reduces the time spent on ETL and supports real-time analysis. It can also be used with OLTP databases, and for archiving data as well.

Currently this feature is available with SQL Server 2016 RC3. I have not come across an Edition Comparison for 2016, hence I have no idea which editions will support it in the final release. Note that there are a few prerequisites, such as .NET Framework 4.5 or later and the Oracle Java SE Runtime Environment. Make sure you have installed everything required before installing SQL Server and enabling PolyBase. You can read a post I wrote on installing SQL Server RC3: Installing SQL Server 2016 RC1 - Sharing issues.

Generally, we should install the PolyBase feature as stand-alone PolyBase. If you want to make a collection of SQL Server instances into a PolyBase Scale-Out Group, select PolyBase Scale-Out Group during the installation.

PolyBase supports two Hadoop providers: Hortonworks Data Platform (HDP) and Cloudera's CDH, on either Linux or Windows. In addition, it can connect to Azure Blob Storage too.

Let's start with a simple one. I have uploaded a file to my Azure Storage account. This file holds a small number of CDR (call detail record) records; let's see how we can read them using TSQL.

In order to access this file, we need to make sure PolyBase is configured properly. Make sure the two new services, SQL Server PolyBase Data Movement and SQL Server PolyBase Engine, are running. In addition, make sure that PolyBase is enabled and connectivity is set as below.

For understanding the value that needs to be set with Hadoop Connectivity, read this.
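The connectivity setting is changed with sp_configure. A sketch of the command follows; the value 7 is only an example (at the time of RC3 it mapped to Hortonworks HDP on Linux and Azure Blob Storage), so check the documentation for the value matching your provider. A restart of the PolyBase services is needed afterwards.

```sql
-- Set the 'hadoop connectivity' option (value 7 assumed here as an example)
EXEC sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
RECONFIGURE;
GO
-- Restart the SQL Server PolyBase Engine and Data Movement services
-- for the new value to take effect
```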

Let's create a database and configure a Data Source. Note that we need to create a Database Scoped Credential first for use with the Data Source.
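The objects involved can be sketched as below. The database name, storage account, container, secret and the CDR column layout are all assumptions for illustration; replace them with your own values.

```sql
CREATE DATABASE CDRAnalysis;
GO
USE CDRAnalysis;
GO

-- A database master key is required before creating a scoped credential
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Str0ng_Pa$$w0rd!';

-- Credential holding the storage account access key (placeholder secret)
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = 'user', SECRET = '<storage account access key>';

-- External data source pointing at the blob container
CREATE EXTERNAL DATA SOURCE AzureStorage
WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://cdrcontainer@mystorageaccount.blob.core.windows.net',
    CREDENTIAL = AzureStorageCredential
);

-- File format describing the uploaded file (comma-delimited assumed)
CREATE EXTERNAL FILE FORMAT CsvFileFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- External table over the file; column list is hypothetical
CREATE EXTERNAL TABLE dbo.CDR
(
    CallId int,
    CallingNumber varchar(20),
    CalledNumber varchar(20),
    CallDurationSeconds int
)
WITH (
    LOCATION = '/cdr/',
    DATA_SOURCE = AzureStorage,
    FILE_FORMAT = CsvFileFormat
);
```

Once the external table exists, it can be queried with plain SELECT statements like any other table.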

Now I can simply query the data in my Azure Storage using the table I created. Not only that, I can join this dataset with my other tables and do more analysis. I will add more complex examples in the next set of posts.

Tuesday, April 19, 2016

Although the word "versioning" was not used, we have been maintaining versions of records, or more precisely the history of records, using different techniques. SQL Server offers various ways of maintaining history records, or the changes that have been made to our records, using features like CDC, CT and optimistic isolation levels. Most of these features do not support actual "versioning" of records, but they can be used for handling different scenarios.

SQL Server 2016 introduces a new feature called System-Versioned Temporal Tables that provides the entire history of our records in relation to the changes made. This feature records versions of records based on update and delete operations, along with the validity period of each version, allowing us to see not only the current record but also the state of a record during any given period, and to do point-in-time analysis. The feature is based on the ANSI SQL 2011 standard, but the current SQL Server 2016 release does not support everything the standard describes.

Let's write some code and see how it works. The following code creates a database called Sales and a table called Customer in it. The code creates Customer as a temporal table, which requires a few additional elements in the CREATE TABLE statement: two datetime2 columns for maintaining the validity period, SYSTEM_VERSIONING = ON and, optionally, a name for the history table. If no name is given, the system will create one for us.
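A sketch of the statements described above; the Customer column list here is an assumption, but the period columns, PERIOD FOR SYSTEM_TIME clause and SYSTEM_VERSIONING option are the required elements.

```sql
CREATE DATABASE Sales;
GO
USE Sales;
GO

CREATE TABLE dbo.Customer
(
    CustomerId int NOT NULL PRIMARY KEY,
    Name varchar(100) NOT NULL,
    CreditLimit decimal(18,2) NOT NULL,
    -- Period columns maintained automatically by the system
    ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.CustomerHistory));
```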

As you see, the history table does not show any records, and the last two datetime2 columns have been filled automatically. Now let's make some changes. Note that the records were inserted on 14/04/2016, and I will be updating records on the 16th, 17th and 19th.
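The changes can be sketched as below. The values are hypothetical, and dbo.CustomerHistory is assumed as the history table name; each UPDATE and the DELETE moves the previous version of the row into the history table with its validity period.

```sql
-- Two modifications to Customer Id 1 (run on the 16th and 17th)
UPDATE dbo.Customer SET CreditLimit = 20000 WHERE CustomerId = 1;
UPDATE dbo.Customer SET CreditLimit = 25000 WHERE CustomerId = 1;

-- One deletion of Customer Id 2 (run on the 19th)
DELETE dbo.Customer WHERE CustomerId = 2;

-- Current table shows only the latest state; old versions are in the history table
SELECT * FROM dbo.Customer;
SELECT * FROM dbo.CustomerHistory;
```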

As you see, three records are in the history table: two records for Customer Id 1 for the two modifications made, and one record for Customer Id 2 for the deletion. This table says exactly how records were changed and when, and it also allows us to see the state of a record based on its validity period.

Let's see how we can retrieve records. There are multiple ways of querying them. We can simply query the table without considering the history, or we can go through the history using the new clause given, FOR SYSTEM_TIME. Note the difference between BETWEEN and FROM.

-- Retrieving the current record
SELECT *
FROM dbo.Customer
WHERE CustomerId = 1;

-- Retrieving the version valid at a given point in time
-- Better include the time too
SELECT *
FROM dbo.Customer
FOR SYSTEM_TIME AS OF '2016-04-17 00:00:00'
WHERE CustomerId = 1;

-- BETWEEN: includes versions that became active exactly on the upper boundary
SELECT *
FROM dbo.Customer
FOR SYSTEM_TIME BETWEEN '2016-04-16 00:00:00' AND '2016-04-19 00:38:43'
WHERE CustomerId = 1
ORDER BY ValidFrom;

-- FROM ... TO: excludes versions that became active exactly on the upper boundary
SELECT *
FROM dbo.Customer
FOR SYSTEM_TIME FROM '2016-04-16 00:00:00' TO '2016-04-19 00:38:43'
WHERE CustomerId = 1
ORDER BY ValidFrom;

Friday, April 15, 2016

SQL Server uses Parameter Sniffing for creating the best plan for your query (or stored procedure): it sniffs the values passed for the parameters and generates the plan accordingly. In most cases this is best for queries and stored procedures, unless the values for the parameters vary greatly with each execution. There was no easy way of enabling and disabling this in previous versions, although trace flag 4136 allows you to disable Parameter Sniffing at server level. However, the question is: should we disable it, and why would we?

Let's try to understand this before looking at the setting given with SQL Server 2016. Have a look at the following code;
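A sketch of two such ad-hoc statements, assuming the AdventureWorks sample database; the literal values are examples chosen so that one predicate matches very few rows and the other matches many, which is what lets the optimizer produce different plans.

```sql
-- Matches a single row; likely optimized as a seek
SELECT *
FROM Sales.SalesOrderDetail
WHERE SalesOrderDetailID <= 1;

-- Matches most of the table; likely optimized as a scan
SELECT *
FROM Sales.SalesOrderDetail
WHERE SalesOrderDetailID <= 121317;
```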

SQL Server generates two different plans for these two queries because they are ad-hoc statements and two different values are passed for SalesOrderDetailID. The plans were generated based on the values passed, basically using Parameter Sniffing. That is why we see two different plans for the same query.

With stored procedures, this is a bit different. The plan for a procedure is created on its initial execution, based on the values passed for the parameters, and the plan is stored in the cache. SQL Server does not sniff parameter values on subsequent executions to generate the plan again, unless it cannot find the cached plan. This speeds up execution because SQL Server does not need to spend time and resources generating the plan again and again. But remember, it uses the same plan for all values passed, and that plan may not be optimal for all of them.

Okay, now why should we disable this behavior? If we disable it, SQL Server uses average distribution statistics for generating the plan, which is not the best for most implementations. But if you really need a plan that does not depend on the values passed, a sort of average plan for all executions, then you will benefit from disabling Parameter Sniffing.
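For a single query, the same "average statistics" behavior can be requested with the OPTIMIZE FOR UNKNOWN hint rather than changing a server- or database-wide setting. A sketch, again assuming an AdventureWorks-style table:

```sql
DECLARE @DetailId int = 1;

-- The optimizer ignores the actual value of @DetailId and
-- builds the plan from average density statistics instead
SELECT *
FROM Sales.SalesOrderDetail
WHERE SalesOrderDetailID <= @DetailId
OPTION (OPTIMIZE FOR UNKNOWN);
```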

As I mentioned above, the trace flag can be used for disabling it, but that applies to all databases. With SQL Server 2016, this setting is available at database level and can be enabled or disabled at any time.
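The database-level switch is part of the new database scoped configurations in SQL Server 2016:

```sql
-- Disable Parameter Sniffing for the current database only
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF;

-- It can be re-enabled at any time
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = ON;
```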

As you see, a plan has been generated without considering the values passed, and it is being reused. If you think your code benefits from this, then this new option can be applied to your database; otherwise, keeping the default is best.

Wednesday, April 13, 2016

This is something we always worry about: what will happen if the host machine is renamed? After renaming, can I connect to my SQL Server instance using the new host name, or do I have to re-install SQL Server, or make some setting changes? Let me share what I just did.

My computer name was initially set to DESKTOP-QOKBL3L because I forgot to set the name properly when installing the OS. Not only that, I had installed an instance of SQL Server 2012 as a named instance too. After realizing that the computer name was not what I wanted, I changed it to DINESH-HM-LAP01. Now, can I connect to my SQL Server instance?

I first tried with the old name, as I used to;

As you see, it did not allow me to connect. Then I tried with my new computer name;

And I was able to connect to my instance, yes, without doing anything. But does that mean I do not need to do anything at all?

Let's check the server name using TSQL;
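A sketch of the check; @@SERVERNAME is read from instance metadata recorded at installation, while SERVERPROPERTY reads the current machine name, so after a rename the two can disagree.

```sql
SELECT @@SERVERNAME                    AS ServerNameFromMetadata,
       SERVERPROPERTY('MachineName')   AS CurrentMachineName;
```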

As you see, not all metadata has been updated, which means some updates are needed. To update the metadata, we need to drop the server name and add it again. After that, a service restart is required; once done, if the above query is run again, both values will show the new name instead of the old one.
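The drop-and-add step looks like this; the instance name SQL2012 is hypothetical, so substitute your own old and new names (for a default instance, just the machine names).

```sql
-- Remove the old server name from the metadata
EXEC sp_dropserver 'DESKTOP-QOKBL3L\SQL2012';
GO
-- Register the new name as the local server
EXEC sp_addserver 'DINESH-HM-LAP01\SQL2012', 'local';
GO
-- Restart the SQL Server service, then @@SERVERNAME returns the new name
```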

Wednesday, April 6, 2016

We know that Point-In-Time restoration is possible with SQL Server, and it is one of the useful ways of recovering data in a disaster-recovery situation. It requires a backup that contains the data to be recovered. If no suitable backup is available, we can still achieve this by taking a backup of the database, perhaps a tail-log backup.

However, Azure SQL Database works differently. For Point-In-Time restoration, you do not need backups if the recovery relates to a recent time period. Azure SQL Database allows you to restore your database to any restore point within 7 days if the tier is Basic, 14 days for Standard, and 35 days for Premium. This is a very simple process, and it restores the database with a different name, in the form of Database Name + timestamp.

Here are the steps for Point-In-Time restoration;

First, log in to your Azure Portal and open the SQL Servers blade. Select the server and open the preferred database server blade.

Then select the database from the Databases section and open the database's Properties. Click on the Restore button to open the blade for restoring.

You can see the oldest restore point based on your tier. Select the Point-In-Time you want, change other properties if needed (for example, a different server) and click OK to get it restored.

Once restored, it can be accessed via Management Studio just like any other database. One thing to remember is that you cannot browse both databases when connecting to the server, hence, when connecting, make sure you specify the database name in order to connect to the newly created database.

Sunday, April 3, 2016

Administrators always try to automate routine tasks, making sure that all required operations are done on time and without failures. The main component used for automation is SQL Server Agent, which is used for creating jobs. If you notice that a scheduled job has not run, or that no jobs are running, how do you start troubleshooting?

Here are some guidelines for troubleshooting based on a discussion I had;

The first thing to check is whether the SQL Server Agent service is running. You need to make sure that its Startup type is set to Automatic; this makes sure the service is started when the server is restarted. If, for some reason, the service is not running and you cannot even start it manually, check the following;

Check whether the account assigned to the Agent is valid and that there are no issues with the password. The account may have expired, been changed or been disabled. Check the system log for more details.

Check the msdb database. If it is corrupted or offline, the Agent will not start.

Check the job history, and whether the last run was successful. There may be issues with the business logic implemented.

Check whether the job is enabled. Someone might have disabled it.

Check whether the schedule set is either expired or disabled.

Check whether proxy accounts are working properly, if you have used them. Check the credentials used for the proxy accounts.

Check dependencies. There can be steps in the job that run without any issues while others fail. Check whether all required items, such as files and folders, are available, along with the permissions required for accessing them.
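Much of the history checking above can be done with a query against msdb; this sketch lists recent job outcomes (run_status 0 = Failed, 1 = Succeeded).

```sql
-- Recent job outcomes, newest first
SELECT j.name       AS JobName,
       h.run_date,
       h.run_time,
       h.run_status,
       h.message
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobhistory AS h
    ON h.job_id = j.job_id
WHERE h.step_id = 0   -- job-level outcome rows, not individual steps
ORDER BY h.run_date DESC, h.run_time DESC;
```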

About Me

Dinesh Priyankara (MSc IT) is an MVP – Data Platform (Microsoft Most Valuable Professional) in Sri Lanka with 16 years’ experience in various aspects of database technologies including business intelligence. He has done many local and foreign business intelligence implementations and has worked as a subject matter expert on various database and business intelligence related projects. He is the Founder/Principal Architect of dinesQL (Pvt) Ltd and he consults, teaches and runs training programs on data related solutions and subjects.