why are noSQL databases more scalable than SQL? - Software Engineering Stack Exchangemost recent 30 from softwareengineering.stackexchange.com2019-09-15T10:51:58Zhttps://softwareengineering.stackexchange.com/feeds/question/194340https://creativecommons.org/licenses/by-sa/4.0/rdfhttps://softwareengineering.stackexchange.com/q/19434096why are noSQL databases more scalable than SQL?ducinhttps://softwareengineering.stackexchange.com/users/877072013-04-08T21:24:30Z2019-01-18T15:08:20Z
<p>Recently I read a lot about noSQL DBMSs. I understand the <a href="http://en.wikipedia.org/wiki/CAP_theorem">CAP theorem</a>, <a href="http://en.wikipedia.org/wiki/ACID">ACID</a> rules, <a href="http://en.wikipedia.org/wiki/Eventual_consistency">BASE</a> rules and the basic theory. But I didn't find any resources on why noSQL scales more easily than an RDBMS (e.g. in the case of a system that requires lots of DB servers).</p>
<p>I guess that maintaining constraints and foreign keys costs resources, and that this gets a lot more complicated when a DBMS is distributed. But I expect there's a lot more to it than this.</p>
<p>Can someone please explain how noSQL/SQL affects scalability?</p>
https://softwareengineering.stackexchange.com/questions/194340/-/194344#19434475Answer by Michael Kohne for why are noSQL databases more scalable than SQL?Michael Kohnehttps://softwareengineering.stackexchange.com/users/62942013-04-08T21:55:19Z2013-04-08T21:55:19Z<p>noSQL databases give up a massive amount of functionality that a SQL database gives you by its very nature.</p>
<p>Things like automatic enforcement of referential integrity, transactions, etc. These are all things that are very handy to have for some problems, and which require some interesting techniques to scale outside of a single server (think about what happens if you need to lock two tables for an atomic transaction, and they are on different servers!).</p>
<p>noSQL databases don't have all that. If you need that stuff, you need to do it yourself, but if you DON'T need it (and there are a lot of applications that don't), then boy howdy are you in luck. The DB doesn't have to do all of these complex operations and locking across much of the dataset, so it's really easy to partition the thing across many servers/disks/whatever and have it work really fast.</p>
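<p>A minimal sketch of why that partitioning is easy when no operation spans more than one record (not part of the original answer; the <code>ShardedStore</code> class and the use of plain dicts as stand-ins for servers are assumptions made purely for illustration):</p>

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys across independent shards."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate server (hypothetical setup).
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash, so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        # Only one shard is touched; no cross-shard lock is ever taken.
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)


store = ShardedStore(shard_count=4)
for user_id in ("alice", "bob", "carol", "dave"):
    store.put(user_id, {"name": user_id})
```

<p>Every <code>put</code> and <code>get</code> talks to exactly one shard, so adding shards adds capacity. An atomic update that spans two keys on different shards is exactly the part this sketch leaves out, and it is the part that is hard.</p>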
https://softwareengineering.stackexchange.com/questions/194340/-/194367#1943674Answer by RandomProgrammer for why are noSQL databases more scalable than SQL?RandomProgrammerhttps://softwareengineering.stackexchange.com/users/877272013-04-09T03:04:38Z2013-04-09T03:33:52Z<p>It's true that NoSQL databases (MongoDB, Redis, Riak, Memcached, etc.) don't maintain foreign key constraints, and atomic operations must be more explicitly specified. It's also true that SQL databases (SQL Server, Oracle, PostgreSQL, etc.) can be scaled to handle very large performance requirements by seasoned DBAs.</p>
<p>NoSQL databases allow seasoned programmers, who are well aware of race conditions and atomic operations, to forgo a large amount of processing that is only required in a small percentage of today's web application code. NoSQL databases certainly have atomic operations, and almost all transactional requirements present in SQL databases can also be met in NoSQL databases. The difference is the level of abstraction. NoSQL databases remove the higher levels of abstraction and hand that capability to the application programmer, resulting in faster code overall, with an increased probability of data corruption by unseasoned programmers.</p>
<p>As a result we are much more likely to see NoSQL databases being used more and more heavily in the web application space, where development time and performance are very important. Financial and corporate software is likely to retain its SQL heritage because hardware performance is relatively cheap, they have seasoned DBAs on hand, and the increased risk caused by unseasoned programmers is not palatable.</p>
https://softwareengineering.stackexchange.com/questions/194340/-/194381#1943814Answer by Md Mahbubur Rahman for why are noSQL databases more scalable than SQL?Md Mahbubur Rahmanhttps://softwareengineering.stackexchange.com/users/637152013-04-09T06:05:10Z2015-11-24T14:15:34Z<p>From IBM developerWorks: <a href="http://www.ibm.com/developerworks/cloud/library/cl-nosqldatabase/index.html?ca=drs-" rel="nofollow noreferrer">Supply cloud-level data scalability with NoSQL databases</a></p>
<p><strong>Scalability</strong> means that the system should be able to support very large databases with very high request rates at very low latency.</p>
<p><strong>NoSQL systems have a number of design features in common:</strong></p>
<ul>
<li>The ability to horizontally scale out throughput over many servers.</li>
<li>A simple call level interface or protocol (in contrast to a SQL
binding).</li>
<li>Support for weaker consistency models than the ACID transactions in
most traditional RDBMS.</li>
<li>Efficient use of distributed indexes and RAM for data storage.</li>
<li>The ability to dynamically define new attributes or data schema.</li>
</ul>
<p><strong>Why relational databases may not be optimal for Scaling</strong></p>
<p>In general, relational database management systems have been considered as a "one-size-fits-all solution for data persistence and retrieval" for decades. They have matured after extensive research and development efforts and very successfully created a large market and solutions in different business domains.</p>
<p>The ever-increasing need for scalability and new application requirements have created new challenges for traditional RDBMS, including some dissatisfaction with this one-size-fits-all approach in some web-scale applications. The answer to this has been a new generation of low-cost, high-performance database software designed to challenge dominance of relational database management systems. A big reason for the NoSQL movement is that different implementations of web, enterprise, and cloud computing applications have different requirements of their databases — not every application requires rigid data consistency, for example.</p>
<p>Another example: For high-volume websites like eBay, Amazon, Twitter, or Facebook, scalability and high availability are essential requirements that cannot be compromised. For these applications, even the slightest outage can have significant financial consequences and impact customer trust.</p>
<p>Over on DBA.SE: <a href="https://dba.stackexchange.com/questions/4508/what-does-horizontal-scaling-mean">What does horizontal scaling mean?</a></p>
<p>Horizontal Scaling is essentially building out instead of up. You don't go and buy a bigger beefier server and move all of your load onto it, instead you buy 1+ additional servers and distribute your load across them.</p>
<p>Horizontal scaling is used when you have the ability to run multiple instances on servers simultaneously. Typically it is much harder to go from 1 server to 2 servers than it is to go from 2 to 5, 10, 50, etc.</p>
<p>Once you've addressed the issues of running parallel instances, you can take great advantage of environments like Amazon EC2, Rackspace's Cloud Service, GoGrid, etc., as you can bring instances up and down based on demand, reducing the need to pay for server power you aren't using just to cover those peak loads.</p>
<p>Relational Databases are one of the more difficult items to run full read/write in parallel.</p>
https://softwareengineering.stackexchange.com/questions/194340/-/194408#194408169Answer by Joeri Sebrechts for why are noSQL databases more scalable than SQL?Joeri Sebrechtshttps://softwareengineering.stackexchange.com/users/20632013-04-09T10:36:17Z2017-10-16T07:18:10Z<p>It's not about NoSQL vs SQL, it's about BASE vs ACID.</p>
<p><strong>Scalable</strong> has to be broken down into its constituents:</p>
<ul>
<li>Read scaling = handle higher volumes of read operations</li>
<li>Write scaling = handle higher volumes of write operations</li>
</ul>
<p>ACID-compliant databases (like traditional RDBMS's) can scale reads. They are not inherently less efficient than NoSQL databases because the (possible) performance bottlenecks are introduced by things NoSQL (sometimes) lacks (like joins and where restrictions) which you can opt not to use. Clustered SQL RDBMS's can scale reads by introducing additional nodes in the cluster. There are constraints to how far read operations can be scaled, but these are imposed by the difficulty of scaling up writes as you introduce more nodes into the cluster.</p>
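<p>As a rough illustration of that asymmetry (a toy model, not any real clustering product; <code>ReplicatedCluster</code> and its methods are hypothetical names), reads below are spread round-robin over the replicas while every write has to reach every node:</p>

```python
import itertools


class ReplicatedCluster:
    """Toy primary/replica cluster: reads fan out, writes go everywhere."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.reads_served = [0] * replica_count
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # The write is applied on the primary and copied to each replica.
        self.primary[key] = value
        for replica in self.replicas:
            replica[key] = value
        # Number of nodes this single write had to touch.
        return 1 + len(self.replicas)

    def read(self, key):
        # Each read is served by exactly one replica, chosen round-robin.
        index = next(self._next_replica)
        self.reads_served[index] += 1
        return self.replicas[index].get(key)


cluster = ReplicatedCluster(replica_count=3)
nodes_touched = cluster.write("greeting", "hello")
results = [cluster.read("greeting") for _ in range(6)]
```

<p>Adding a replica lowers the read load per node, but it raises the number of nodes each write must touch, which is where the limit on read scaling comes from.</p>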
<p>Write scaling is where things get hairy. There are various constraints imposed by the ACID principle which you do not see in eventually-consistent (BASE) architectures:</p>
<ul>
<li>Atomicity means that transactions must complete or fail as a whole, so a lot of bookkeeping must be done behind the scenes to guarantee this.</li>
<li>Consistency constraints mean that all nodes in the cluster must be identical. If you write to one node, this write must be copied to all other nodes before returning a response to the client. This makes a traditional RDBMS cluster hard to scale.</li>
<li>Durability constraints mean that in order to never lose a write you must ensure that before a response is returned to the client, the write has been flushed to disk.</li>
</ul>
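<p>The cost of the consistency constraint above can be put in rough numbers. This is a back-of-the-envelope model under stated assumptions: replicas are contacted in parallel, the latencies are made up, and the two function names are hypothetical.</p>

```python
def synchronous_write_latency(primary_ms, replica_ms):
    """Client waits for the primary plus the slowest replica acknowledgement."""
    # Replicas are contacted in parallel, so the slowest one sets the pace.
    return primary_ms + max(replica_ms, default=0)


def asynchronous_write_latency(primary_ms, replica_ms):
    """Client waits only for the primary; replicas catch up afterwards."""
    return primary_ms


# Made-up latencies in milliseconds; the third replica is a slow node.
replica_latencies = [2.0, 3.5, 40.0]
sync_ms = synchronous_write_latency(5.0, replica_latencies)
async_ms = asynchronous_write_latency(5.0, replica_latencies)
```

<p>With these numbers the synchronous write takes 45.0 ms and the asynchronous one 5.0 ms. Under the synchronous rule every node added is one more chance to be the slowest, so write latency can only stay flat or get worse as the cluster grows.</p>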
<p>To scale up write operations or the number of nodes in a cluster beyond a certain point you have to be able to relax some of the ACID requirements:</p>
<ul>
<li>Dropping Atomicity lets you shorten the duration for which tables (sets of data) are locked. Example: MongoDB, CouchDB.</li>
<li>Dropping Consistency lets you scale up writes across cluster nodes. Examples: riak, cassandra.</li>
<li>Dropping Durability lets you respond to write commands without flushing to disk. Examples: memcache, redis.</li>
</ul>
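<p>As an illustration of what dropping Consistency looks like in practice, here is a minimal sketch of two replicas that accept writes independently and reconcile later with a last-write-wins rule. The <code>Replica</code> class, the <code>anti_entropy</code> function and the integer timestamps are assumptions for this example and are not taken from Riak or Cassandra.</p>

```python
class Replica:
    """Toy replica that stores (timestamp, value) pairs per key."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        # Last-write-wins: keep the entry with the highest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replica_a, replica_b):
    """Exchange entries so both replicas end up with the newest value per key."""
    for key, (timestamp, value) in list(replica_a.data.items()):
        replica_b.write(key, value, timestamp)
    for key, (timestamp, value) in list(replica_b.data.items()):
        replica_a.write(key, value, timestamp)


node_a, node_b = Replica(), Replica()
node_a.write("cart", ["book"], timestamp=1)   # acknowledged by node_a alone
stale_read = node_b.read("cart")              # node_b has not seen it yet
node_b.write("cart", ["pen"], timestamp=2)    # concurrent write elsewhere
anti_entropy(node_a, node_b)                  # background reconciliation
```

<p>Neither write waited for the other node, which is what makes writes scale. The price is visible too: <code>stale_read</code> is <code>None</code>, and after reconciliation both nodes hold <code>["pen"]</code>, so the earlier <code>["book"]</code> write is silently discarded.</p>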
<p>NoSQL databases typically follow the BASE model instead of the ACID model. They give up the A, C and/or D requirements, and in return they improve scalability. Some, like Cassandra, let you opt into ACID's guarantees when you need them. However, not all NoSQL databases are more scalable all the time.</p>
<p>The SQL API lacks a mechanism to describe queries where ACID's requirements are relaxed. This is why the BASE databases are all NoSQL.</p>
<p>Personal note: one final point I'd like to make is that in most cases where NoSQL is currently being used to improve performance, a solution would be possible on a proper RDBMS by using a correctly normalized schema with proper indexes. As proven by this very site (powered by MS SQL Server), RDBMS's can scale to high workloads if you use them appropriately. People who don't understand how to optimize RDBMS's should stay away from NoSQL, because they don't understand what risks they are taking with their data.</p>