In this post “Skip header and footer rows in Hive“, we are going to learn that how we can ignore few header and footer records in Hive without loading or reading these records in another table or in a view temporarily. If you want to read more about Hive, visit my post “Preserve Hive metastore in Azure HDInsight” which explains Hive QL in detail.

Skip header and footer records in Hive

We can ignore N number of rows from top and bottom from a text file without loading that file in Hive using TBLPROPERTIES clause. The TBLPROPERTIES clause provides various features which can be set as per our need. It can be used in this scenario to handle the files which are being generated with additional header and footer records. Let’s have a look at the below sample file:

In this blog “Preserve Hive metastore in Azure HDInsight“, we are going to learn how we can preserve the hive metadata while working with the Azure HDInsight services. Microsoft Azure HDInsight is an on-demand managed Open source Big Data analytics service for the enterprises. We can provision clusters as per the demand in few minutes, perform the computations, and then we can shut it down to avoid charges. We pay as per the usage only. You can visit this link to know more about Azure HDInsight.

What is Hive?

Apache Hive is a SQL like Big Data query language which is used as an abstraction for the map reduce jobs. The Hive query seamlessly converts into an equivalent map reduce job without the need to write low-level code. This increases the productivity of a developer to a great extent. If you want to read more about Hive … More

In this post “Connecting Python 3 to SQL Server 2017 using pyodbc”, we are going to learn that how we can connect Python 3 to SQL Server 2017 to execute SQL queries. We can change the settings accordingly to connect to other versions of SQL Server also. If you are interested to know more about Python and why you should learn it, visit our post “Why Python and how to use it in SQL Server 2017“.

What is pyodbc?

pyodbc is an open source DB API 2 Python module. It provides a convenient interface to connect a database which accepts an ODBC connection. In order to use pyodbc module, firstly, we need to install it. Click here for more information on pyodbc.

pip install pyodbc module

We can use pip install command to install the pyodbc module in Python 3 on a Windows machine. Before executing the … More

C# Entity Generator or Class Generator Tool

A few years back, I created a tool “C# Entity Generator”. Though it was created for an older version of visual studio, still it can be useful. I am sharing this tool here so that anyone can download and use it for free. This tool does not require any installation and is a copy paste utility.

C# Entity Generator is a tool which can be used to generate C# Entity Layer classes without writing a single line of code. If we are using three-tier architecture in our applications and need to create entities frequently to map the output of the relational queries, this tool can be very useful. It increases the developer’s productivity especially if the entity contains a large set of properties. Only once would we need to define the data types, their prefixes as per the naming convention guidelines, and their … More

As Microsoft has integrated Python in SQL Server 2017 for advanced data analytics and machine learning purpose, it can also be used to ease the complex data transformation and analysis which might be tedious and a bit complex while doing the same using T-SQL.

Let’s create the required table with the sample data to demonstrate the use case example.

Python use case – Get employees for given skill set

Assume that, we have a table employee master named as tbl_EmpMaster which has … More

We know that Microsoft has integrated Python in SQL Server 2017 to enable rich data analytics capabilities within the database itself. Python is one of the most powerful languages which provides lots of built-in libraries for advanced data analytics and transformations. We can use Python for almost everything from website development to robotics and Data Science. In SQL Server 2017, Python can be used primarily for Machine learning purposes but it is not limited to that only. We can also use Python for complex data transformations and analysis which might be a bit tedious and complex while doing the same using T-SQL in SQL Server.

In this post, we will be exploring an use case example of Python for data transformation in SQL Server 2017. If you want to read more about Python and how to use it in SQL Server, you can visit my previous blog post “Why … More

Microsoft has integrated Python in SQL Server 2017 which can be used for in-database analysis purpose. In this post, we are going to explore “Why Python and how to use it in SQL Server 2017”, and then we will explore that how we can use it in SQL Server 2017.

Why Python

Python is a general purpose object oriented programming language which can be used to develop applications for a variety of domains. We can use Python for almost everything from desktop and website development, gaming, robotics, scientific and numeric computing to spacecraft control and much more. Python is a high-level programming language which is an interpreted language (execute line by line) instead of compiled language. The Python has gained popularity because of its user friendliness. The developers fall in love with Python because it is easy to learn, but still very powerful. The technology giants like Google, YouTube, Dropbox, … More

Microsoft has launched the most recent SQL Server 2017 release candidate (RC1, July 2017). It can be downloaded from this link. SQL Server 2017 will run on both Windows as well as on Linux OS. It also supports macOS via Docker too. In this post, we will discuss the new features of SQL Server 2017.

SQL Server 2017 New Features (SQL Server vNext)

Though the SQL Server 2017 has many new features, in this post, we are going to highlight the features which can be mostly used by SQL Server Developers.

1. SQL Server Machine Learning Services – R and Python

SQL Server 2016 integrated the R programming which can be run within the database server and can be embedded into T-SQL script too. Now, in SQL Server 2017, we can execute the Python script within the database server itself. Both, R and Python are most popular programming language … More

Sometimes it happens that we need to open an SSIS package which contains connections which are not accessible or not allowed to be accessed due to some security reasons. In this post “Work offline with SSIS package”, we are going to learn that how we can open, read, and modify the components of an SSIS package when its connection managers are not able to connect to the underlying data source. This feature can be highly useful when it comes that there is a developer who is not authorized to access the underlying data source being used in the package but he needs to read some column mappings information from the source and destinations being used in the package.

Whenever we open an SSIS package in Visual Studio designer, the designer tries to connect to the data sources being used by the package to verify the metadata. And if it fails … More

In this post, “Specific row at the top then sort the rest result set in SQL Server“, we are going to learn that how we can order a result set in a customized way which cannot be achieved using ORDER by Clause in a simple way.

To demonstrate this, let’s create a sample table named as “tbl_Department” and insert some dummy rows in it. Below is the code to create the sample table: