On 2 February I’ll be speaking at TECHunplugged in Austin, Texas. This event, which offers free registration, focuses on how technology innovation is changing business and IT.

TECHunplugged is a full day conference focused on cloud computing and IT infrastructure.

Its innovative formula combines three essential parts of the industry for an exceptional exchange of information, insights and education:

– A group of independent, insightful and well-recognized influencers
– Leading disruptive technology vendors
– End users who manage rich technology environments

The ultimate goal of the TECHunplugged conference is to deliver quality information to IT decision makers by bringing them together with independent influencers and industry vendors to engage, debate and be informed through open discussions on topics such as IT infrastructure, virtualization, cloud computing and storage.

I’m going to be talking about how data has changed over the years and how data quality issues can become obstacles to business innovation.

If you are in IT and would like to attend, use the registration form below. If you use my special code, you’ll be entered to win a special prize of an Amazon Echo (I SO LOVE MINE!) at the event.

On 12 April 2011 it was Yuri’s Night — the night we space fans celebrate Yuri Gagarin’s history-making 1961 flight into space. In 2011 we were celebrating 50 years of manned spaceflight. On that same day in 2011, we reached the end of mainstream support for SQL Server 2005 SP4. On 12 April 2016 we will reach the end of extended support for SQL Server 2005. That means no more hotfixes, no help from Microsoft and no love for your data still living in SQL Server 2005 databases.

I’m hoping your organization is already on its way to upgrading and migrating data and applications to newer versions of SQL Server. SQL Server 2016 is already being used in production by early-access customers. No matter which version you’re migrating to, I want to share some of the features and perks you’ll have available now that you’re moving away from a dead version. Of course there have been hundreds of enhancements since 2005, but today I’m focusing on those that a data architect would want to use in a database design for better performance, security and data quality.

Columnstore Indexes

If you are designing for data warehouse type solutions, this is the closest thing we have to a "turbo switch" for SQL Server. Columnstore indexes achieve high compression rates because they store data by column instead of by row. They also support much faster query performance for batch and aggregation queries, typically delivering 10x performance increases, sometimes even more. This feature was introduced in SQL Server 2012, but you’ll want the advances to this feature that came with SQL Server 2014.
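A minimal sketch of both flavors, assuming a hypothetical fact table dbo.FactSales (the table and column names are placeholders, not from the original post):

```sql
-- SQL Server 2012+: nonclustered columnstore index over selected columns.
-- (In 2012 this makes the table read-only; 2014 relaxed the restrictions.)
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactSales_Columnstore
    ON dbo.FactSales (DateKey, ProductKey, StoreKey, SalesQuantity, SalesAmount);

-- SQL Server 2014+: an updatable CLUSTERED columnstore index, which stores
-- the entire table in columnar format (it replaces the rowstore, so it
-- cannot coexist with the nonclustered index above on the same table).
CREATE CLUSTERED COLUMNSTORE INDEX IX_FactSales_CCI
    ON dbo.FactSales;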

SEQUENCEs

SEQUENCEs have been around in other DBMSs for a while, but were introduced to SQL Server in 2012. These special objects work much like IDENTITY columns, but offer more flexibility and more use cases. The main benefit is that you can grab a sequence value (or a range of them) before you insert a row. Many developers use GUIDs for similar reasons, but GUIDs are much longer and therefore have performance downsides. SEQUENCEs are integer types.
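Here’s a small sketch of the grab-before-insert pattern; the sequence and table names are hypothetical:

```sql
-- A sequence is a standalone object, not tied to any one table
CREATE SEQUENCE dbo.OrderNumbers
    AS int
    START WITH 1000
    INCREMENT BY 1;

-- Grab the next value BEFORE the insert, e.g. to share it across related rows
DECLARE @NextOrder int = NEXT VALUE FOR dbo.OrderNumbers;

INSERT INTO dbo.Orders (OrderID, CustomerName)
VALUES (@NextOrder, N'Acme Supply');
```

Because the value is known up front, the same @NextOrder can be reused in child-table inserts without round-tripping to find what IDENTITY generated.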

New Data Types

So many new data types have been introduced since SQL Server 2005, but the ones that really stand out for me are DATE, TIME, DATETIMEOFFSET and the geospatial types, along with the deprecation of TIMESTAMP.

It wasn’t until SQL Server 2008 that we had access to data types that hold only the date or only the time portion of a point in time. Before that, we had to do all kinds of conversions just to strip out unwanted data (00:00:00), and we had to make room to store that unwanted precision. Storing millions of rows of unneeded zeros hurts performance, both operationally and for backup and recovery.

SQL Server 2008 also introduced DATETIMEOFFSET, which allows us to track data in context of its time zone. If you remember the days when meeting invites did not include this important piece of information, you’ll know why this is important.
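A quick sketch of the three types together, using a hypothetical events table (names and values are illustrative only):

```sql
CREATE TABLE dbo.Event (
    EventID     int IDENTITY PRIMARY KEY,
    EventDate   date,              -- 3 bytes, date only: no 00:00:00 baggage
    DoorsOpen   time(0),           -- time only, precision to the second
    StartsAt    datetimeoffset(0)  -- a point in time PLUS its time zone offset
);

INSERT INTO dbo.Event (EventDate, DoorsOpen, StartsAt)
VALUES ('2016-02-02', '08:30', '2016-02-02 09:00:00 -06:00');
```

The offset (-06:00 here) travels with the value, so consumers in other time zones can interpret the instant correctly without out-of-band conventions.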

The spatial data types GEOGRAPHY and GEOMETRY have added a new, feature-rich way of tracking places and their geometry, plus special functions that make it much easier to answer questions like "which is the closest?" or "is this address located in this neighbourhood?". If your data story includes location points, you’ll want to use these.
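For instance, a "which is closest" question reduces to one method call; the coordinates below are placeholders (roughly Austin, TX):

```sql
-- geography::Point(latitude, longitude, SRID); 4326 is the common GPS datum
DECLARE @office geography = geography::Point(30.2672, -97.7431, 4326);
DECLARE @venue  geography = geography::Point(30.4021, -97.7266, 4326);

-- STDistance on geography returns the distance in meters
SELECT @office.STDistance(@venue) AS DistanceInMeters;
```

Combined with a spatial index, the same STDistance call can drive efficient nearest-neighbour queries over millions of stored points.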

SQL Server was always an oddball when it came to the TIMESTAMP data type. In other DBMSs, this data type holds a date and time with very high precision. In SQL Server, TIMESTAMP is a type of row version identifier that has nothing to do with time. So data architects migrating from other DBMSs were often bitten when they used the wrong data type. Microsoft announced in 2008 that it was deprecating TIMESTAMP and recommending the use of ROWVERSION, which is similar (but not the same) in functionality.
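What ROWVERSION is actually for is optimistic concurrency, not timekeeping. A minimal sketch, with hypothetical table and variable names:

```sql
CREATE TABLE dbo.Customer (
    CustomerID   int IDENTITY PRIMARY KEY,
    CustomerName nvarchar(100) NOT NULL,
    RowVer       rowversion  -- auto-updated on every modification; NOT a date/time
);

-- Read the row and remember its version...
DECLARE @OriginalRowVer binary(8) =
    (SELECT RowVer FROM dbo.Customer WHERE CustomerID = 42);

-- ...then update only if nobody else changed the row in the meantime.
-- Zero rows affected means a concurrent writer got there first.
UPDATE dbo.Customer
SET CustomerName = N'New Name'
WHERE CustomerID = 42
  AND RowVer = @OriginalRowVer;
```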

Encryption

SQL Server 2016 includes support for Always Encrypted, a feature that does just that: it supports the encryption of data from application to database and back, so that it is better protected than solutions that encrypt data only once it is written to the database. I’m always reminding you that keeping your CIO out of jail is part of your job description, right?
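On the database side, the encryption is declared per column. A sketch, assuming a column master key and column encryption key (CEK_Auto1 here, a placeholder name) have already been provisioned:

```sql
CREATE TABLE dbo.Patient (
    PatientID int IDENTITY PRIMARY KEY,
    -- Deterministic encryption allows equality lookups on the ciphertext;
    -- string columns must use a BIN2 collation for this option
    SSN char(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (
            COLUMN_ENCRYPTION_KEY = CEK_Auto1,
            ENCRYPTION_TYPE = DETERMINISTIC,
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        )
);
```

The key point: the database engine only ever sees ciphertext; encryption and decryption happen in the client driver, next to the application.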

Data Archiving

As our data gets bigger and bigger, the size of our databases is growing as well, and performance takes a hit. Developers want us to take shortcuts on data quality to improve performance, because size matters. One way to help manage data volumes is to move "cold" data to other storage locations. Starting in SQL Server 2016, we can stretch a database to Azure: data that isn’t accessed as often is stored in the cloud and retrieved when needed. Our hot data stays local and fast, the cooler data is more economical to store but still there, and your application doesn’t even have to manage the difference.
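As a rough sketch of how it’s enabled (the syntax below reflects the SQL Server 2016 previews and may change by release; the table name is a placeholder):

```sql
-- Enable the feature at the instance level
EXEC sp_configure 'remote data archive', 1;
RECONFIGURE;

-- Mark a cold-data table for migration to Azure
ALTER TABLE dbo.OrderHistory
    SET (REMOTE_DATA_ARCHIVE = ON (MIGRATION_STATE = OUTBOUND));
```

After that, queries against dbo.OrderHistory keep working unchanged; the engine transparently fetches remote rows when a query needs them.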

JSON Support

In SQL Server 2016 we are getting support for JSON processing. This isn’t a JSON data type like we have with XML, but a set of import and export features for providing relational data as JSON documents and bringing JSON data into SQL Server. Now you won’t have to manage all those curly brackets on your own.
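Both directions in one short sketch, with hypothetical table and column names:

```sql
-- Export: format relational rows as a JSON document
SELECT OrderID, CustomerName, OrderTotal
FROM dbo.Orders
FOR JSON PATH, ROOT('orders');

-- Import: shred a JSON document into a rowset with typed columns
DECLARE @json nvarchar(max) =
    N'[{"OrderID":1,"CustomerName":"Acme","OrderTotal":99.50}]';

SELECT *
FROM OPENJSON(@json)
WITH (
    OrderID      int,
    CustomerName nvarchar(100),
    OrderTotal   decimal(10,2)
);
```

Because the result of OPENJSON is an ordinary rowset, it can be joined, filtered or inserted into tables like any other query source.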

One Last Thing…

As vendors withdraw support for their products, third-party tool makers do so as well. If you are supporting older, out-of-support database versions, it’s likely that your data modeling, data quality and data integration tools are also dropping support for them. You’ll be left supporting database systems without vendor support and without professional, enterprise-class modeling and design tools. I know how hard it is to keep track of databases my tools can’t connect to. Don’t let sticking with an old version be the end of data modeling support for that data.

Show Your Data Some Love

These are just a tiny number of the features that will be available to you when you upgrade to a modern version of SQL Server. New security, data quality and performance features are leaving your old solutions behind, putting your data at risk and leaving your customer data feeling unloved. There’s a data space race going on. Don’t leave your company using old technology to manage data. Go fix that!

I conducted a Twitter poll last week about how to spell DATATYPE (or is it DATA TYPE?). Many compound words start out as two separate words, then get hyphenated, then get concatenated into a new word. We saw this with:

I have a couple of presentations where I describe how generalized data modeling can offer both benefits and unacceptable costs. In my Data Modeling Contentious Issues presentation, the one where we vote via sticky notes, we debate the trade-offs of generalization in a data model and database design. In 5 Classic Data Modeling Mistakes, I talk about over-generalization.

Over the last 20-some years (and there’s more “some” there than ever before), I’ve noticed a trend towards more generalized data models. This means that instead of having a box for almost every noun in our business, we have concepts that have categories. Drawing examples from the ARTS Data Model, instead of having entities for:

– Purchase Order
– Shipping Notice
– Receipt
– Invoice
– etc.

…we have one entity, InventoryControlDocument, with a DocumentType of Purchase Order, Shipping Notice, Receipt, Invoice, etc.

See what we did there? We took metadata that was on the diagram as separate boxes and turned them into rows in a table in the database. This is brilliant, in some form, because it means when the business comes up with a new type of document we don’t have to create a new entity and a new table to represent that new concept. We just add a row to the DocumentType table and we’re done. Well, not exactly…we probably still have to update code to process that new type…and maybe add a new user interface for that…and determine what attributes of InventoryControlDocument apply to that document type so that the code can enforce the business rules.
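The generalized design sketched as DDL might look like this (table and column names are my illustration of the ARTS-style pattern, not taken from the model itself):

```sql
-- The former entity boxes become rows in a reference table
CREATE TABLE dbo.DocumentType (
    DocumentTypeCode char(4) PRIMARY KEY,
    DocumentTypeName varchar(50) NOT NULL
);

-- One generalized table replaces Purchase Order, Shipping Notice,
-- Receipt, Invoice, and friends
CREATE TABLE dbo.InventoryControlDocument (
    DocumentID       int IDENTITY PRIMARY KEY,
    DocumentTypeCode char(4) NOT NULL
        REFERENCES dbo.DocumentType (DocumentTypeCode),
    DocumentDate     date NOT NULL
    -- attributes that apply to only SOME document types can no longer be
    -- declared NOT NULL here; enforcing those rules moves into the code
);

-- A new business document is now a row, not a new table:
INSERT INTO dbo.DocumentType VALUES ('RTRN', 'Return Authorization');
```

Notice the trade-off in the comments: the flexibility is real, but the type-specific constraints the old design enforced declaratively now have to be enforced somewhere else.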

Ah! See what we did there this time? We moved responsibility for managing data integrity from the data architect to the coders. Sometimes that’s great and sometimes, well, it just doesn’t happen.

So my primary reason to raise generalization as an issue is that sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems. Just because you engineered a requirement from a table to a row does not mean it is no longer your responsibility. I’ve even seen architects become so enamoured with moving the work from their plate to another’s that they have generalized the heck out of everything while leaving the data quality responsibility up to someone else. That someone else typically is not measured or compensated for data integrity, either.

Sometimes data architects apply these patterns but don’t bother to apply the governance of those rules to the resulting systems

Alec Sharp has written a few blog posts on generalization, with some great examples in his 5 Ways to Go Wrong with Generalisation. I especially like his use of the term literalism, since I never seem to get the word specificity out when I’m speaking. I recommend you check out all five ways, since I agree with all of them.

Jefferies is also leading a US$425m covenant-lite credit to back Idera’s acquisition of Embarcadero Technologies. Idera is backed by TA Associates. The deal, which launches on Thursday, includes a US$25m revolving credit, a US$300m first-lien term loan and a US$100m second-lien term loan.

So last year we had Embarcadero attempting to purchase ERwin from CA; now today we have Idera, maker of SQL Server-focused, database-related solutions, moving towards buying Embarcadero.

The Embarcadero-buying-ERwin deal fell through, in part, due to regulatory concerns over market consolidation of the database/data modeling tool business. I’m wondering how regulators will feel about this consolidation of tools.

I’ve worked with both vendors in the past. Both are based in Austin, TX. Standing by to see what happens next.

UPDATE: I’m now seeing official communications about the sale, with a very aggressive closing date. This is in contrast to the prolonged, ultimately failed acquisition attempt by Embarcadero for CA’s ERwin Data Modeler product.

Thoma Bravo, a leading private equity investment firm, today announces the sale of Embarcadero Technologies, a leading provider of software solutions for application and database development, to Idera, Inc., an application and server management software provider. The exit represents the culmination of a long and successful partnership between Thoma Bravo and Embarcadero Technologies. The deal is expected to close in mid-October, subject to normal closing conditions and approvals.

Are you an educator teaching at an accredited university or college? Do you want Azure cloud subscription grants for you and your students? All you need to do is register with some basic information at: