Perhaps we should remind ourselves of the many ways data models can be caused to churn. Here are some examples that are top-of-mind for me. They do overlap a lot, and the whole discussion overlaps with my post about schema complexity last January, and more generally with what I’ve written about dynamic schemas for the past several years.

Just to confuse things further — some of these examples show the importance of RDBMS, while others highlight the relational model’s limitations.

The old standbys

Product and service changes. Simple changes to your product line may not require any changes to the databases recording their production and sale. More complex product changes, however, probably will.

A big help in MCI’s rise in the 1980s was its new Friends and Family service offering. AT&T couldn’t respond quickly, because it couldn’t get the programming done, where by “programming” I mainly mean database integration and design. If all that was before your time, this link seems like a fairly contemporaneous case study.

Organizational changes. A common source of hassle, especially around databases that support business intelligence or planning/budgeting, is organizational change. Kalido’s whole business was based on accommodating that, last I checked, as were a lot of BI consultants’.

That ability was also the most noteworthy feature of PeopleSoft’s application development technology, back in the 1990s, at least the way I remember Rick Berquist explaining PeopleTools to me.

Mergers & acquisitions. Obviously, accommodating a business combination has a huge effect on data management, especially if you follow the usual path of starting with separate legacy systems and combining them where possible over time. And it plays merry hell with the trend-tracking parts of your accounting and BI systems.

Application replacement. Replace your third-party apps, for whatever reason, and you almost surely get a new database structure too. The same, of course, goes when you deploy entirely new apps. And when things get either more integrated (e.g. by replacing silos with an application suite) or less so (e.g. by introducing selective SaaS apps), special fun ensues.

Refactoring and MDM. There are numerous ways it can make sense to refactor your apps, including your custom/in-house ones. One important reason among many is to increase your adoption of master data management.

Third-party data. Enterprises are making ever more use of data supplied by third parties. That data typically shows up whenever the customer chooses to pay for it, in whatever form the data vendor chooses to supply.

Internet log data. Website logs are a mess, and the same goes for many mobile-app equivalents. Part of the reason is nested data structures. But even leaving those aside, it’s a best practice to extract and directly store different fields at different points in time.

The examples I’ve written about explicitly are eBay and Zynga. Satisfying a similar need is one of the pillars of the Splunk value proposition.
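To make the "extract different fields at different points in time" idea concrete, here is a minimal sketch in Python. It assumes the logs are kept as raw JSON lines and parsed only when read. The field names, the sample records, and the helper functions `extract` and `project` are all invented for illustration; they are not taken from eBay, Zynga, or Splunk.

```python
import json

# Hypothetical raw log lines, stored unparsed. All field names are made up.
RAW_LOGS = [
    '{"ts": "2013-08-01T12:00:00Z", "user": {"id": 7, "geo": {"country": "US"}}, "event": "click"}',
    '{"ts": "2013-08-01T12:00:05Z", "user": {"id": 9}, "event": "purchase", "cart": {"total": 19.99}}',
]


def extract(record, path, default=None):
    """Walk a dotted path such as 'user.geo.country' through nested dicts."""
    node = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default  # the field is absent from this particular record
        node = node[key]
    return node


def project(raw_lines, paths):
    """Parse each raw line and pull out only the requested fields."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        rows.append({path: extract(record, path) for path in paths})
    return rows


# First pass: only the basics are extracted.
first_pass = project(RAW_LOGS, ["ts", "event"])

# Later, country and cart total turn out to matter too. The raw data is
# untouched; only the list of extracted fields changes.
later_pass = project(RAW_LOGS, ["ts", "event", "user.geo.country", "cart.total"])
```

The design choice worth noting is that nothing about the stored data has to change between the two passes. Records that lack a nested field simply yield `None` for it.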

Machine-generated data. Besides the points I’ve already noted about log data, there are other issues with machine-generated data. In particular:

There’s a lot of it. Moore’s Law teaches us that some sensors will always be able to throw off more data than it will be affordable to store. Hence, selections will have to be made as to what constitutes a signal, or is even worth storing at all. As those choices change, data structures are apt to change as well.

There are many different kinds of it. If you’re taking data from all the machines in a factory, or all the major parts of an automobile, then new sources or aspects of data will be introduced frequently, as engineers find new ways to use ever more affordable chips. And when the devices are powerful multipurpose computers, such as smartphones, any one model can keep changing its mind as to what data it sends.
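One way to picture that kind of churn is to track which fields a fleet of devices has sent so far and flag anything new. The sketch below is only an illustration: the function `observe` and the sensor field names are hypothetical, and a real pipeline would also have to handle type changes and fields that disappear.

```python
def observe(known_fields, message):
    """Return the fields in `message` not seen before, and remember them."""
    new_fields = set(message) - known_fields
    known_fields.update(new_fields)  # mutate the caller's set in place
    return new_fields


# Hypothetical messages from an automobile; the field names are made up.
known = set()
first = observe(known, {"rpm": 2100, "coolant_temp": 88})
second = observe(known, {"rpm": 2300, "tire_pressure": 32})
```

Here `first` holds both fields of the initial message, while `second` holds only `tire_pressure`, the one field that showed up after the data model had already been "settled".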