Normalization and T-SQL in SQL Server

(4.0)

| 2899 Ratings

Introduction

In this topic, let us try to understand the intrinsic details of what database normalization is and also, at the same time, we will try to understand the concepts of T-SQL in Microsoft SQL Server. Though the first concept is more relevant to almost all the relational databases available in the market today, the second concept that we would be discussing here would be very much specific to Microsoft’s SQL Server database alone. Let us go through these two concepts in detail and understand these to the core.

What is Normalization?

To understand the term normalization in a generic sense, it means to bring something to a normal condition or state. But this term has a very specific meaning for it in the areas of Databases. Hence, let us try to understand the concept of Normalization in the context of databases. Database normalization is the process of reorganizing data in a relational database in accordance with the series of so-called normal forms in order to reduce the amount of redundant data and also, at the same time, to improve upon the data integrity. The two principles that govern the process of Database Normalization are:

There is no redundant data available (all the data is stored only at one place)

Data dependencies should be logical (all related data should be stored together)

Database normalization was proposed and invented by Edgar F Codd and is an integral part of his relational model. He is also considered the Father of the relational data model. Almost all the relational databases engines that exist today still use the rules laid by him. As per the rules laid by him, he had extracted the first 3 Normal Forms (acronym as NF) – namely 1NF, 2NF and 3NF. He had proposed the theory of Normalization with the introduction of First NF and then continued / extended the theory with the Second and Third NFs. This theory had then been extended with the help of Raymond F Boyce to form the BCNF. We will discuss these in further detail in the following sections for better understanding.

Learn how to use MSBI, from beginner basics to advanced techniques, with online video tutorials taught by industry experts. Enroll for Free MSBI Training Demo!

What are the types of Normalization?

With the understanding that we have gained through the sections that are covered till now, let us expand our horizon of understanding by going through each and every one of these normal forms. Let us now go through each and every one of these Normal Forms and also understand what it brings to the table when it is applied.

First Normal Form (1NF):

The First Normal Form is achieved when each of the table cell contains only a single value, and also, each record needs to be unique.

The first point seems to be clear enough that there can’t be more than one value assigned for each column in a database table.

The second point is achieved by the usage of Primary Key (A Primary key is the single column of a table that uniquely identifies a database record.)

A point to note here is that the Primary Key can be composed of more than one key. A Composite Key is the set of columns that can be used in conjunction with each other to uniquely identify a database record.

These definitions would help in understanding the concepts of a table, database and etc. Please take a look at these if you feel are required:

A Table can be defined as a set of data elements using database rows and columns

A Table cell can be defined as the value when a specific table row and column intersect

A Table record / Row is a specific set of values for a set of columns that define a table

A Column is defined as a specific element of a table and a combination of such columns constitute a table

Second Normal Form (2NF):

A table can be taken to the 2nd Normal Form only if it is fully compliant with the rules laid for being called 1st NF.

The next rule that comes in to picture is that there should be one and only one column that should act as a Primary Key. The concept of Composite Key doesn’t allow us to a table to be in 2nd NF if it has a Composite Primary Key.

In order to reduce the dependency of using a Composite Primary Key, the table which has a Composite Primary Key should be disintegrated into 2 different tables and use the dependencies via the newly created tables. Hence comes the concept of Foreign Key.

A Foreign Key is a column that references the Primary Key of another table to uniquely connect your tables

Third Normal Form (3NF):

A table can be taken to the 3rd Normal Form only if it is fully compliant with the rules laid for being called 2nd NF.

The next rule that comes into picture is that there should be no transitive functional dependencies.

A transitive functional dependency is that scenario when changing a non-key column may affect in changing any other non-key column values.

Ideally, a table cannot go further than this level in the process of normalization, but for the sake of knowing things, we will go through the others as well.

Boyce-Codd Normal Form (BCNF):

Generally, most of the scenarios won’t reach to this level, but there can still be anomalies resulted if there is more than one Candidate key.

Sometimes, the BCNF form is also called as the 3.5NF.

Fourth Normal Form (4NF):

If there are no database table instances containing two or more, independent and also multivalued data describing a relevant entity – it is then called to be in the 4th NF form.

Fifth Normal Form (5NF):

Firstly, a table can be in the 5th NF only if it is compliant with all the rules laid for it to be called 4th NF.

A database table should no longer be in a state to be decomposed to further tables without losing any data, only then it will be called to be in the 5th NF form.

Domain/Key Normal Form(DKNF):

A domain/key normal form is a normal form used in the process of database normalization which requires that the database contains no constraints other than database constraints and key constraints.

A domain constraint mentions the permissible values for a provided attribute

A key constraint specifies the attributes that uniquely identifies a given row of a table

A table is said to be DKNF compliant when every constraint on the relation is logical consequence of the definitions of keys and domains.

Enforcing both these constraints are met ensures that there are no non-temporal anomalies existing in the database.

Sixth Normal Form(6NF):

Generally speaking, 6th NF is not a standardized form of a database table

There are still discussions happening amongst the database experts to be clear on what rules should it meet to be called a 6th NF form table.

What is Transact-SQL?

T-SQL, often abbreviated as Transact SQL, is a proprietary extension to SQL owned by Sybase and Microsoft SQL Server. T-SQL expands the horizons on the SQL standard by including more features than what SQL standards provide. T-SQL is much more centralized to SQL Server as such and most of the operations that are performed in SQL Server are done via T-SQL. This is very much true in case of GUI tools such as SSMS or DBeaver. From now, remember that any operation that you will and wish to do on Microsoft SQL Server on any given GUI tool, it is all T-SQL that runs in the background. T-SQL is proprietary in the case of Microsoft SQL Server and also Microsoft Azure SQL database.

Transact SQL (T-SQL) is not the only extension to the already set SQL standard, there are many as such. T-SQL is an extension owned by Sybase & Microsoft SQL Server. PL/SQL is an extension owned by Oracle whereas PL/pgSQL is an extension that is owned by PostgreSQL folks. Although there are various advantages that you gain while using it, it makes the process of moving from one database system to another extremely difficult.

T-SQL features:

With the understanding that we have on T-SQL, let us deep dive into the features that it provides to us. Some of the features that T-SQL provides (which are rather not provided by SQL) are as follows:

Procedural programming

Local variables to provide more flexible control over the application flow.

Various support functions for processing strings, dates and logical / mathematical functions. This helps T-SQL comply with the Turing completeness test (a test that determines the universality of any given computing language).

Provides a specific implementation to DELETE and UPDATE than that of SQL as it allows FROM clause to be added. Once this is added, we will be able to use or include JOINs.

Provides a way to use BULK INSERT to import lots of data into a database table or a view in a specified user format.

T-SQL also allows you to have a greater grip on the programmability. An example to this is the use of stored procedures where you can alter the input parameters to have a changed output parameter as well.

What is T-SQL used for?

We have already discussed in the earlier sections that T-SQL is a proprietary version of SQL standards for Microsoft SQL Server just the same way as PL/SQL is for Oracle database. An individual can create the required T-SQL units such as any other SQL scripts, Triggers, Functions or Stored procedures etc. to cater to specific requirements. Hence, to cater specific requirements on SQL Server, there can be specific code written such as a Trigger to handle the audit on a specific table or something.

Such requirements can be handled on the SQL Server instance itself using some SQL code, since SQL Server has its own notion of its syntax in the form of T-SQL – the database artifacts thus created would be called T-SQL pieces of code rather than SQL code. Mentioning it T-SQL, you’re drawing a definitive line of difference that it works only with SQL Server and may / may not work with the other database vendors – as there can be changes to the syntaxes, functionalities that are not made available for usage and the like.

What is T-SQL vs SQL?

Let us try to understand the differences between SQL & T-SQL in detail in this section here. Let us go through each point to understand the strengths of each of these that we are discussing here:

SQL

T-SQL

SQL (Structured Query Language) can be defined as a language to operate over sets of data.

T-SQL is a proprietary procedural language designed specifically for Microsoft SQL Server, and can also be said that T-SQL is an extension to SQL

SQL is an open format, which has been followed by various database providers (eg. Oracle, Sybase, PostgreSQL, MySQL, MS SQL Server)

T-SQL is proprietary to Microsoft SQL Server

SQL is not Turing complete and it is very limited in the scope of what can be done with it.

T-SQL is Turing complete

SQL doesn’t have any procedural programming on its own

T-SQL on the other hand contains procedural programming, local variables.

SQL has its own implementation of DELETE and UPDATE

T-SQL has its own implementation of DELETE and UPDATE than that of SQL.

Advantages of Normalization and T-SQL:

There are numerous advantages that are available for us to discuss here and we will just do that. Let us take a closer look at the advantages that Normalization provides us:

Normalizing a database to the lowest possible level gives us a greater overall database organization

It also ensures that there is no redundant data available

It also ensures that there is data consistency within the database itself

It also provides a better grip over the database security as well.

Functional dependencies are very important and handled better during the normalization process itself.

Provides ways and means for flexible database design.

Normalization also ensures that there are smaller tables with smaller rows ensuring more rows per page (less logical I/O).

Index searching gets quicker as the indexes tend to be narrower and shorter.

Let us now take a closer look at the advantages that T-SQL provides us:

T-SQL provides us ways and means to interact with SQL Server in a better way with SQL queries

Stored procedures bring in the concept of doing multiple things under a single transaction context

Triggers help in getting audit related stuff handled in a much efficient manner

There are more and more database related artifacts that we can handle at the database level itself, rather than relying on gaining these features via other programming languages for SQL Server.

The main advantage of T-SQL is that it provides better control over the database instance from the database instance level itself.

Conclusion:

In this article, we have tried to understand the concepts of Normalization and T-SQL (pertaining to MS SQL Server). We have also gone through the various available forms of normalization in a RDBMS system. We have tried to focus on the features that T-SQL has to offer and also have gone through the advantages that it brings to the table with its usage. On the whole, these two concepts put to use in an efficient manner will offer a greater performance to your application.

We have taken utmost care to provide you best of the details, but we would still encourage you to go through the official documentation for the latest updates as things are getting changed on a quicker pace. Hope that you have got all the details at once and hope all the details provided were of good use as well.

List of Related Microsoft Certification Courses:

Subscribe For Free Demo

Phone *

E-mail Address *

Free Demo for Corporate & Online Trainings.

About The Author

Ravindra Savaram is a Content Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.

Mindmajix - Online global training platform connecting individuals with the best trainers around the globe. With the diverse range of courses, Training Materials, Resume formats and On Job Support, we have it all covered to get into IT Career. Instructor Led Training - Made easy.