For years, developers and management alike have been stating that vendor lock-in is a bad thing. Obviously, being too dependent on one vendor can lead to complications if that vendor fails to deliver. Logically, this vendor lock-in question has been raised regarding cloud providers. A recent VentureBeat post sums up the problem nicely:

Big corporations are still worried about security and about getting “locked in” to staying with a certain provider once all their data is stuck in certain applications.

What some people fail to realize is that this cloud lock-in question is the same as any other lock-in question.

For example, consider database vendors and software development. For years, people worried that their teams would build software that required a specific database. Microsoft introduced ODBC to address this problem, and Java followed suit with JDBC. These technologies let you talk to a database through a common API without knowing which database product sat behind it. As long as the database had the proper table structures, and your queries stuck to SQL that both products supported, your application would still run correctly. The software side of IT has continuously tried to keep things at the API level because of the risk of vendor lock-in or vendor failure.
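To make the API-level idea concrete, here is a minimal sketch in Python, whose DB-API 2.0 interface plays a role similar to ODBC and JDBC. The `orders` table, the function names, and the use of the built-in `sqlite3` driver are assumptions chosen for illustration. The point is only that `total_order_value` touches nothing beyond the standard connection and cursor calls.

```python
import sqlite3


def total_order_value(conn):
    """Sum order amounts using only standard DB-API 2.0 calls.

    `conn` can be any DB-API 2.0 connection; the function never
    refers to a specific database product.
    """
    cur = conn.cursor()
    try:
        # Plain SQL that most relational databases accept.
        cur.execute("SELECT SUM(amount) FROM orders")
        (total,) = cur.fetchone()
        return total or 0
    finally:
        cur.close()


def make_demo_connection():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    # The "?" placeholder is sqlite3's parameter style; other drivers may differ.
    conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10,), (25,), (7,)])
    conn.commit()
    return conn
```

Only `make_demo_connection` knows which database is in use. Swapping vendors would, in principle, mean changing that one function, although details such as parameter placeholders and SQL dialects still vary between drivers.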

On the hardware side, we also have general standards for how things work. An Ethernet connection on one machine does not work any differently from a connection on another machine. Obviously there are differences when developing on these machines depending on the operating system installed, but operating systems are the one area where people are OK with lock-in. Even so, most companies tend to purchase several machines from the same vendor, either because they have developed software that runs on those machines or because they buy enough to get volume discounts.

Even though these things have been done to avoid lock-in, companies invariably lock themselves in through their own decisions anyway. If a company decides to store its data in an Oracle database, it is highly unlikely to switch database providers. If it does, there will be a significant data transfer project as well as long QA projects to confirm that its applications still work with the new database server. For server hardware, the problem is typically the same. Many companies are now using Linux, so there are few differences between the operating systems. The hardware is typically certified by the vendor to support various operating systems. This also allows companies to move quickly, but again they tend to stay with the same vendor. Why? This avoids the unknown. If you always purchase HP servers with a specific version of Linux, you know about most of the problems you will see. Why would you switch to Dell servers with a different distribution of Linux?

Why Is The Cloud Different?

With cloud providers, you are getting your hardware preconfigured. You are also installing “machine images”, so your operating system and system software (e.g. Apache, Tomcat) are all configured in the same way. So, what makes cloud lock-in different? One answer is that the data storage is sometimes a proprietary mechanism. That used to be true, but many providers now allow standard database installations such as Oracle or MySQL. If you are not using a standard database and you want some options for data storage, you can always look at things like Lucene (not exactly storage, but a search index), Hadoop, or any other alternative storage mechanism. This is possible because most cloud providers give you a way to install whatever software you want.

So, what is the problem? Why are people concerned about cloud lock-in? Cloud services are currently used mostly for application hosting and scalability. From that perspective, you have only two things to move: the application, which you likely developed elsewhere and have the source code for, and the data storage. Typical database software has ways of extracting the data in a delimited or structured form, so you would just need to import the data into your new server. The other data storage mechanisms are either file system based, meaning you can just copy the directories, or they have a data export mechanism of their own.
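The export-and-import step can be sketched in a few lines. This is an illustration under stated assumptions, not a migration tool: the function names are hypothetical, `sqlite3` stands in for both the old and the new database, the target table is assumed to exist already with matching columns, and the table name is assumed to come from a trusted source.

```python
import csv
import io
import sqlite3


def export_table(conn, table, out):
    """Write every row of `table` to `out` as CSV, with a header row."""
    cur = conn.cursor()
    # Table name is interpolated directly, so it must be trusted input.
    cur.execute(f"SELECT * FROM {table}")
    writer = csv.writer(out)
    # cursor.description holds one entry per column; item 0 is the name.
    writer.writerow([col[0] for col in cur.description])
    writer.writerows(cur.fetchall())
    cur.close()


def import_table(conn, table, src):
    """Load CSV rows from `src` into an existing `table` on the new server."""
    reader = csv.reader(src)
    columns = next(reader)  # header row written by export_table
    # "?" is sqlite3's placeholder style; other drivers may use "%s".
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    cur = conn.cursor()
    cur.executemany(sql, reader)
    conn.commit()
    cur.close()
```

In practice you would write to a file rather than an in-memory buffer and use the database's own bulk export and import tools for large tables. Values also arrive as text on the import side, so type handling depends on the target database.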