Introduction

For the 2018 DZone Guide to Big Data, we surveyed 540 software and data professionals to get their thoughts on various topics surrounding the field of big data and the practice of data science. In this article, we focus on what respondents told us about database management systems (DBMS) and using the cloud to house and analyze data sets.

Database Management Systems

Database popularity has, according to db-engines.com, followed fairly steady patterns for the past several years. In 2012 and 2013, MS SQL Server and MySQL competed closely before MySQL overtook Microsoft’s database, while Oracle’s database stood above them both (although it has dropped significantly since 2016 and has at times been in danger of being surpassed by MySQL). Among our Big Data survey respondents, MySQL remains the most popular DBMS, as it was in 2017, though its popularity has dropped (61% used it in production in 2017 vs. 55% in 2018). Oracle’s use in production, on the other hand, increased from 48% in 2017 to 54% in 2018. Other changes in production use include an increase in respondents’ usage of PostgreSQL from 35% to 41% and a decrease in respondents’ usage of MS SQL Server from 49% to 42%.
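The year-over-year shifts above can be read either as percentage-point changes or as changes relative to the 2017 figure. The following minimal sketch shows both readings using only the production-use numbers quoted in this section. The names `usage`, `point_change`, and `relative_change` are illustrative and are not part of the survey.

```python
# Production usage (percent of respondents) as quoted in the text: (2017, 2018).
usage = {
    "MySQL": (61, 55),
    "Oracle": (48, 54),
    "PostgreSQL": (35, 41),
    "MS SQL Server": (49, 42),
}


def point_change(before, after):
    """Difference between the two years, in percentage points."""
    return after - before


def relative_change(before, after):
    """Change relative to the earlier year's value, in percent."""
    return (after - before) / before * 100


# Percentage-point change per DBMS (hypothetical helper dictionary).
changes = {name: point_change(b, a) for name, (b, a) in usage.items()}

for name, (before, after) in usage.items():
    print(
        f"{name}: {before}% -> {after}% "
        f"({point_change(before, after):+d} points, "
        f"{relative_change(before, after):+.1f}% relative)"
    )
```

Read this way, MySQL's drop and Oracle's gain are both 6 percentage points, but they are not the same size relative to each product's 2017 share.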

Finally, when asked which databases respondents used specifically for their “Big Data needs,” the most common response was the NoSQL DBMS MongoDB, chosen by 11% more respondents than the next DBMS for “Big Data,” Oracle, at 29%. While we’ve seen distributed file systems (like Hadoop’s HDFS) start making an impact on Big Data collection and analysis, non-relational databases also seem to be showing their worth for dealing with data beyond the standard levels of volume, velocity, and variety.

Data in the Cloud

The share of respondents who typically work with data “in the cloud,” rather than on-premises or in a hybrid manner, has increased since last year’s survey. Those who work with data in the cloud (particularly respondents who answered that they have data science experience) increased from 31% in 2017 to 39% in 2018. Meanwhile, respondents saying they typically deal with data on-premises or in a hybrid format decreased from last year's responses by 6% and 4%, respectively. While an increase in the adoption of data living in the cloud is unsurprising, given general development trends toward overall cloud usage, the growth in cloud data specifically for Big Data needs is minor compared to cloud adoption in other areas we have researched, such as Continuous Delivery. This is likely because truly “big” data is often easier and faster to work with the closer it is to the systems working on it.

Conclusion

If we compare the results of our survey to the rankings of the site DB-Engines, a few differences emerge. Whereas MySQL was the top DBMS among our respondents, DB-Engines places MySQL second, behind Oracle. The bronze medal in the DB-Engines rankings went to Microsoft SQL Server. However, the DB-Engines rankings are not geared toward big data, which could account for some of the difference in findings.