I recently got a new laptop and had to go through the ritual of reinstalling all my programs and drivers. I sometimes work with SSIS locally to import data from Excel and occasionally do demos with Power BI where I read from an Access database, so I needed to install the ACE OLE DB provider. If you try to do either of those things without the ACE OLE DB provider, you will get a message like the one below.

The 'Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine.

I needed both the 32-bit and the 64-bit adapters to cover my various use cases. I started by installing the 64-bit Microsoft Access Database Engine 2010 Redistributable. If you have ever attempted this, you know there is a secret. You cannot install the 2010 Access database engine with a different processing architecture (bitness) from your Office install, or install both versions of the 2010 Access database engine side by side, without a workaround. If you try, you will get an error message.

The 2007 Access database engine can read the same format, but it doesn’t have a 64-bit installer. It also doesn’t check what other products are on your machine, so it can coexist peacefully with the 64-bit 2010 Access engine.

I recently worked on a project that used Flow to update a SharePoint list each time an item was updated in the Power Apps Common Data Service. In order to update a SharePoint list item, you must have the unique ID, even if there are other fields that are unique to the item. I spent a while searching through forums to find out how to accomplish this, so I’m documenting it here in the hopes it helps someone else.

We know we need a starting point from which the updates originate. For me, that was when a record is updated in an entity in the Common Data Service, but this could be anything (an email arriving, an update occurring in another SharePoint list, etc.). This source doesn’t contain the unique ID for the list item, since that ID is unique to SharePoint. In my example below I have a CDS entity called Tactic. Tactic has three fields that are submitted to the SharePoint list upon creation: Tactic Name, ID, and Tactic ID. My SharePoint list is called Meagan Test Flow List and contains the corresponding fields Title, TacticID, and RecordID.

The next part is the trick that makes this work. You’ll need to add a new action and choose SharePoint – Get items. Be sure that you choose SharePoint – Get items (items is plural). There is another choice called SharePoint – Get item, which is not the same and will not work for this purpose. In the Get items action, populate the site address and list name. Then expand the advanced options and add a filter. This filter should use a field in your source that uniquely identifies a record. For my example, the ID field in the CDS entity has unique values for each record. My filter condition is RecordID eq followed by the CDS ID field. RecordID is an integer, so the value needs no quotes. If the value on which you are filtering is a string, you’ll want to put single quotes around it.
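To make the quoting rule concrete, here is a minimal Python sketch that builds the same kind of filter text you would type into the Get items filter box. The function name and the sample values are mine, purely for illustration; Flow itself just takes the finished string.

```python
def odata_eq_filter(field, value):
    """Build a 'field eq value' filter string like the one used in Get items."""
    if isinstance(value, bool):
        # Yes/No columns are out of scope for this sketch.
        raise TypeError("only integers and strings are handled here")
    if isinstance(value, int):
        # Numbers go in bare, with no quotes.
        return f"{field} eq {value}"
    if isinstance(value, str):
        # Strings are wrapped in single quotes; an embedded quote is doubled.
        escaped = value.replace("'", "''")
        return f"{field} eq '{escaped}'"
    raise TypeError("only integers and strings are handled here")


print(odata_eq_filter("RecordID", 1042))          # integer field, no quotes
print(odata_eq_filter("Title", "Launch Campaign"))  # string field, quoted
```

In the flow itself, the right-hand side comes from dynamic content (the CDS ID field) rather than a hard-coded value.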

Once you have completed your filter condition, you can add the final action: SharePoint – Update Item. Flow will automatically add a for each container as you begin to fill in the required fields – this is fine. Even though you are only updating a single record, the Get Items action could potentially get multiple records. As long as you are using a field that uniquely identifies the single record you want to change, this will work.

You should end up with something resembling the image below.

You should populate the ID field with the ID from the Get Items result. You want to make sure you have mapped the fields from your source to any field that should be updated. The only field I was updating in my list was Title. The TacticID and RecordID do not change after creation.
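If it helps to see the logic outside of the Flow designer, here is a rough Python sketch of what the flow does, using plain dictionaries as a stand-in for the SharePoint list. Nothing here calls SharePoint; the function names and sample rows are hypothetical, and the field names follow my example list.

```python
def get_items(list_items, field, value):
    """Stand-in for Get items with a filter: it always returns a list."""
    return [item for item in list_items if item[field] == value]


def apply_update(list_items, source_record):
    """Filter by RecordID, then update each match using SharePoint's own ID."""
    matches = get_items(list_items, "RecordID", source_record["ID"])
    updated_ids = []
    for item in matches:  # this loop is the "for each" container Flow adds
        item["Title"] = source_record["Tactic Name"]  # only Title changes
        updated_ids.append(item["ID"])  # the ID that Update Item requires
    return updated_ids


sharepoint_list = [
    {"ID": 1, "Title": "Old name", "TacticID": 7, "RecordID": 501},
    {"ID": 2, "Title": "Other tactic", "TacticID": 8, "RecordID": 502},
]
cds_record = {"ID": 501, "Tactic Name": "New name", "Tactic ID": 7}
print(apply_update(sharepoint_list, cds_record), sharepoint_list[0]["Title"])
```

Because the filter field is unique, the loop runs exactly once, which is why the automatically added for each container is harmless.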

I’m pleased to say that I am speaking at IT/Dev Connections 2017. This year the conference will be held in San Francisco October 23-26. I had a great experience speaking at IT/Dev Connections in 2015, so I am excited to return again this year.

This conference is special to me because of its focus on providing great content for developers and IT pros – the conference website describes it as the “anti-keynote” conference with no forced marketing content.

I also enjoy it because it is more than just SQL Server/Data Platform (they have tracks for Cloud & Data Center, Enterprise Collaboration, Development & Dev Ops, and Enterprise Mobility and Security), and it’s nice to get out of my comfort zone a bit.

If you have been wanting to get into Azure Data Factory (ADF) development, join me for this demo-filled overview. In this session, we’ll go over the basic anatomy of an ADF solution. You’ll learn what ADF is and isn’t as we walk through a solution to pull data from an on-premises SQL Server database to blob storage and then populate an Azure SQL Data Warehouse. You’ll learn tips for creating ADF solutions in Visual Studio, and I’ll show you how to make ADF development less tedious with a free Visual Studio add-in called Biml Express. You’ll leave with a basic understanding of ADF and a list of tools and skills you’ll want to acquire as you begin your ADF development.

Improve Data Warehouse ETL Delivery with a Patterns-Based Approach

What if I told you that 90% of your data integration development in SQL Server could be automated? In 5 years, you will be “old fashioned” if you are hand coding SSIS packages. Developers with different skill levels and design preferences create databases and SSIS packages however they see fit to get the job done. Documentation is frequently omitted. Maintenance and small enhancements consume too much development time while manual errors and inconsistencies slip through the testing and release process. You can use tools and frameworks to rearrange the development process and alleviate these common problems. The implementation and automation of design patterns leads to improved efficiency and communication. Join me in this session to learn how to use Business Intelligence Markup Language (Biml) and Excel to facilitate metadata-driven SSIS development. I’ll use database schema information plus Excel inputs to implement a small data mart from staging through the dimensional model.

On a recent project I used Azure Data Factory (ADF) to retrieve data from an on-premises SQL Server 2014 instance and land it in Azure Data Lake Store (ADLS) as ORC files. This required the use of the Data Management Gateway (DMG). Setup was quick and easy in our development environment. We installed the DMG for development on a separate server in the client’s network, where we also installed SQL Server Management Studio (SSMS) for query development and data validation. We set up resource groups in Azure for development and production, and made sure the settings for development and production were the same. Then we set up a separate server for the production DMG.

Deployment and execution went well in the dev environment. Testing was completed, so we deployed to prod. Deployment went fine, but the pipelines failed execution and returned the following error on the output data sets.

Java Runtime Environment is not found.

The Java Runtime Environment (JRE) is not required for the DMG to run successfully, but it is needed for the creation of ORC files. The only problem with this error message was that we did indeed have the JRE installed on the server.

After reinstalling the JRE and the DMG and getting the same error, I consulted the troubleshooting guide. After finding nothing relevant there, I asked some colleagues for suggestions.

I double-checked that I had the same version of the DMG that worked in dev and that I had the most current version of the JRE.

I double-checked that the DMG and JRE matched bit-wise (32-bit vs. 64-bit). Both were 64-bit in my case.

I checked that JAVA_HOME was set correctly in the environment variables.
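For what it’s worth, the JAVA_HOME check is easy to script. This is a minimal Python sketch, not anything the DMG itself runs; the function name is mine, and looking for a launcher under bin is my own assumption about what "set correctly" means.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return (ok, message) describing whether JAVA_HOME looks usable."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return False, "JAVA_HOME is not set"
    home = Path(java_home)
    if not home.is_dir():
        return False, f"JAVA_HOME points to a missing folder: {home}"
    # The launcher lives in bin; its file name differs by platform.
    launchers = [home / "bin" / "java.exe", home / "bin" / "java"]
    if not any(path.is_file() for path in launchers):
        return False, f"No java launcher found under {home / 'bin'}"
    return True, f"JAVA_HOME looks valid: {home}"


ok, message = check_java_home()
print(message)
```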

When none of those things worked, I logged a support ticket with Microsoft. They had me do the following:

Check the registry key – HKEY_LOCAL_MACHINE\Software\JavaSoft\Java Runtime Environment should have a Current Version entry that shows the current JRE version.

Check that the subkey in the folder labeled with the version has a JAVAHOME entry with the correct path (something like C:\Program Files\Java\jre1.8.0_74).
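If you would rather script those two registry checks than click through regedit, here is a small Python sketch using the standard winreg module. Treat the value names as my assumption: I have used CurrentVersion and JavaHome, which is how I believe the JRE installer writes them, so adjust them if your server shows something different. The reader function only works on Windows, and a 32-bit Python process may see a different registry view than a 64-bit JRE writes to.

```python
JRE_KEY = r"SOFTWARE\JavaSoft\Java Runtime Environment"


def read_hklm_value(subkey, name):
    """Read a value from HKEY_LOCAL_MACHINE (Windows only); None if absent."""
    import winreg  # imported here so the rest of the sketch loads on any OS

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
            value, _value_type = winreg.QueryValueEx(key, name)
            return value
    except OSError:
        return None


def read_jre_registration(read_value=read_hklm_value):
    """Return (current_version, java_home); either may be None if not found."""
    version = read_value(JRE_KEY, "CurrentVersion")
    if version is None:
        return None, None
    # The version string is also the name of the subkey holding the path.
    java_home = read_value(JRE_KEY + "\\" + version, "JavaHome")
    return version, java_home


# On the gateway server you would run:
#     print(read_jre_registration())
```

The reader is passed in as a parameter only so the lookup logic can be exercised without a Windows registry.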

After review, we realized that we had installed SSMS on the dev DMG server but not on the prod DMG server. SSMS would have required the installation of the C++ redistributable package, which is why we didn’t encounter this error in dev.

I will confess that I don’t understand exactly why missing C++ libraries manifest themselves in an error claiming a missing Java Runtime Environment. If you have a good explanation, please leave it in the comments and I’ll update this and give you credit.

I hope that someone else who runs into this issue will find this blog post and avoid days of troubleshooting and confusion.

I’ve spent the last couple of months working on a project that includes Azure Data Factory and Azure Data Warehouse. ADF has some nice capabilities for file management that never made it into SSIS, such as zip/unzip files and copy from/to SFTP. But it also has some gaps I had to work around. My project involved copying data from on-premises SQL Server to an ORC file in a data lake staging area for ingestion into an Azure SQL Data Warehouse through PolyBase. Then I had planned to move that file to a raw area of the data lake for archiving.

In other words, as sung below by a great lemur, I like to move it.

But at this time ADF doesn’t support that. You can copy a file with a copy activity, but you cannot actually move (i.e., copy and delete).

Luckily, we had a workaround for our situation. If you tell ADF to copy data to a file that already exists in the specified location in the data lake, it will overwrite the existing file. We made sure the file name is always the same for each table in the staging area so there is always only one file per table.

I landed the data in Raw first so that we would not have to pull from SQL again if we needed to re-run a slice. Data latency wasn’t a huge issue for this client – we had some pipelines that ran hourly and some that ran daily. The extra seconds it took to land the file in Raw were not a concern.
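The pattern is simple enough to sketch in a few lines of Python, with local folders standing in for the data lake. This is only an illustration of the copy-then-overwrite idea, not ADF code; the function name, the .orc file naming, and the slice label used to name the raw copy are all my own assumptions.

```python
import shutil
from pathlib import Path


def land_and_stage(source_file, raw_dir, staging_dir, table_name, slice_label):
    """Keep a copy per slice in raw, then overwrite the one staging file."""
    raw_dir, staging_dir = Path(raw_dir), Path(staging_dir)
    raw_dir.mkdir(parents=True, exist_ok=True)
    staging_dir.mkdir(parents=True, exist_ok=True)

    # Raw keeps one file per slice so a slice can be re-run without SQL.
    raw_path = raw_dir / f"{table_name}_{slice_label}.orc"
    shutil.copyfile(source_file, raw_path)

    # Staging always uses the same name, so each copy replaces the last one.
    staging_path = staging_dir / f"{table_name}.orc"
    shutil.copyfile(raw_path, staging_path)
    return raw_path, staging_path
```

Calling it once per slice leaves a growing archive in raw and exactly one file per table in staging, with no delete step needed.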

For now, if you do need to actually move or delete, you can use a custom C# activity to delete files. I chose not to do this because I didn’t want to add another technology for the client to learn/manage while adopting Azure. This may be the way to go for other projects.

If you think moving (copying and deleting) files should be a first class citizen in Azure Data Factory, please vote for the idea and spread the word for others to vote.

I support 'Move Activity' – An activity that copies and then deletes in Azure Data Factory. What do you think? https://t.co/fyBE7tyw8w

Years ago in a previous job, I worked at a company that had no DBAs. I am/was a BI developer, so I know my way around a database, but I wasn’t dedicated to keeping all databases in good health. There were several application developers at this company (mostly focused on .NET and JavaScript) who built applications with SQL Server databases as the back end. And there was a guy who acted as a system admin among his many other duties. The application developers had built a web app that was to be used by users around the world. The application had been launched and things were fine for several weeks. I wasn’t involved with the project, but I was aware of it.

One day, a manager asked me if I could help on an urgent matter: the application suddenly could no longer execute transactions on the production database and the database connection was intermittently failing. The system admin was busy with other duties, so I was the closest thing they had to a DBA. All they could tell me was the production database had crashed and they got an error message about insufficient disk space.

I logged on to the server that housed the database to see what was going on. The server itself had been set up appropriately and seemed to have sufficient memory and CPU to support the load of this application. I saw 3 volumes on the server: a C volume for application and system files, a large F volume for data, and a large G volume for logs.

I connected to the database with Management Studio to do some more digging. The first thing I noticed is that the dev, test, and prod databases for this application were all on the same SQL Server instance. The dev and test databases weren’t very large, so while that wasn’t what I would have recommended, that didn’t seem to be the main problem. As I looked at the prod database, I noticed that the MDF and LDF files were sitting on the C volume rather than the spacious F volume that was made for them! The person who configured the server hadn’t made the C volume very large since user databases weren’t supposed to be there.

Then I looked at the size of the log file. It was huge! A bit more digging revealed that they had left all the defaults on the database for full recovery and autogrowth of the log file, but they had never done a transaction log backup. (Sidenote: You can check the Log_Reuse_Wait_Desc column in sys.databases to verify the database is waiting on a transaction log backup.) The developers had worked long and hard to get the application up and running and hadn’t quite finished up the maintenance and disaster recovery tasks.
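If you want to script that sidenote, here is a hedged Python sketch. It assumes you already have a DB-API cursor connected to the instance (for example from pyodbc, which uses ? as its parameter marker); the function name is mine. LOG_BACKUP is the value SQL Server reports in log_reuse_wait_desc when the log is waiting on a log backup.

```python
LOG_WAIT_QUERY = (
    "SELECT name, recovery_model_desc, log_reuse_wait_desc "
    "FROM sys.databases WHERE name = ?"
)


def waiting_on_log_backup(cursor, database_name):
    """Return True if the database's log cannot be reused until a log backup."""
    cursor.execute(LOG_WAIT_QUERY, (database_name,))
    row = cursor.fetchone()
    if row is None:
        raise LookupError(f"No database named {database_name!r} on this instance")
    _name, _recovery_model, wait_desc = row
    return wait_desc == "LOG_BACKUP"
```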

Once I knew what I was dealing with, I was able to fix the problem. A full backup and a log backup later, we were back in business. I went ahead and shrank the log file back to a reasonable size (please remember this is reserved for special occasions). I took the database offline (which was acceptable since the application was currently unusable anyway), moved the MDF and LDF files to their rightful home, and brought it back online. A lesson on recovery models and setting up SQL Agent jobs that scheduled such backups ensured this didn’t happen again anytime soon.
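I don’t have the exact commands I ran that day, but the backup and shrink steps look roughly like the T-SQL this Python sketch generates. The database name, logical log file name, backup folder, and target size are all placeholders, and the helper names are mine. It only builds the statements as text; you would review them and run them yourself in Management Studio.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(text):
    """Quote a Unicode string literal, doubling any single quote."""
    return "N'" + text.replace("'", "''") + "'"


def backup_and_shrink_script(database, log_logical_name, backup_dir, target_log_mb):
    """Return T-SQL statements: full backup, log backup, then shrink the log."""
    db = quote_name(database)
    full_path = quote_literal(f"{backup_dir}\\{database}_full.bak")
    log_path = quote_literal(f"{backup_dir}\\{database}_log.trn")
    return [
        f"BACKUP DATABASE {db} TO DISK = {full_path};",
        f"BACKUP LOG {db} TO DISK = {log_path};",
        # DBCC SHRINKFILE acts on the current database, so switch to it first.
        f"USE {db};",
        f"DBCC SHRINKFILE ({quote_literal(log_logical_name)}, {int(target_log_mb)});",
    ]


for statement in backup_and_shrink_script("AppProd", "AppProd_log", r"G:\Backups", 1024):
    print(statement)
```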

This should be a good reminder to have a healthy respect and understanding for your database settings and to make sure you have (and test) your backups (both full and transaction logs) for your production databases.