What does ACID mean in database technologies?

Why is ACID important?

Atomicity, Consistency, Isolation, and Durability (ACID) are import to database, because ACID is a set of properties that guarantee that database transactions are processed reliably.

Where is the ACID Concept described?

Originally described by Theo Haerder and Andreas Reuter, 1983, in ‘Principles of Transaction-Oriented Database Recovery’, the ACID concept has been codified in ISO/IEC 10026-1:1992, Section 4

What is Atomicity?

Atomicity ensures that only two possible results from transactions, which are changing multiple data sets:

either the entire transaction completes successfully and is committed as a work unit

or, if part of the transaction fails, all transaction data can be rolled back to databases previously unchanged dataset

What is Consistency?

To provide consistency a transaction either creates a new valid data state or, if any failure occurs, returns all data to its state, which existed before the transaction started. Also, if a transaction is successful, then all changes to the system will have been properly completed, the data saved, and the system is in a valid state.

What is Isolation?

Isolation keeps each transaction’s view of database consistent while that transaction is running, regardless of any changes that are performed by other transactions. Thus, allowing each transaction to operate, as if it were the only transaction.

What is Durability?

Durability ensures that the database will keep track of pending changes in such a way that the state of the database is not affected, if a transaction processing is interrupted. When restarted, databases must return to a consistent state providing all previously saved/committed transaction data

Here is a table quick reference of some common database and/or connection types, which use connection level isolation and the equivalent isolation levels. This quick reference may prove useful as a job aid reference, when working with and making decisions about isolation level usage.

Definition of a Foreign Key

A foreign Key (FK) is a constraint that references the unique primary key (PK) of another table.

Facts About Foreign Keys

Foreign Keys act as a cross-reference between tables linking the foreign key (Child record) to the Primary key (parent record) of another table, which establishing a link/relationship between the table keys

Foreign keys are not enforced by all RDBMS

The concept of referential integrity is derived from foreign key theory

Because Foreign keys involve more than one table relationship, their implementation can be more complex than primary keys

A foreign-key constraint implicitly defines an index on the foreign-key column(s) in the child table, however, manually defining a matching index may improve join performance in some database

The SQL, normally, provides the following referential integrity actions for deletions, when enforcing foreign-keys

Cascade

The deletion of a parent (primary key) record may cause the deletion of corresponding foreign-key records.

No Action

Forbids the deletion of a parent (primary key) record, if there are dependent foreign-key records. No Action does not mean to suppress the foreign-key constraint.

Set null

The deletion of a parent (primary key) record causes the corresponding foreign-key to be set to null.

Set default

The deletion of a record causes the corresponding foreign-keys be set to a default value instead of null upon deletion of a parent (primary key) record

Rebuilding Netezza view sometimes becomes necessary when the view’s source table have changed underneath the view. Rebuilding a view can be done on Netezza or in Aginity. In Aginity, it is a simple process, assume your user has permissions to create or replace a view. The process breaks down into just a few steps:

Generate the create / replace view SQL of the original view into the query window, if you don’t have it already

In the object browser:

Navigate to the Database and view you wish to rebuild

Select the view and right click

Select ‘Scripts’, then ‘DDL to Query window’

Make may updates to create / replace View SQL

This step is not always necessary, sometimes the changes which invalided the view did not actually impact the code of the view. If changes are necessary, make may updates to the SQL code.

Execute The code

This I usually do by choosing the ‘Execute as a single batch’ option. Make sure the code executes successfully.

Verify the view

To verify the simply execute a select statement and make it executes without errors and/or warning.

Isolation levels are part of the ACID (Atomicity, Consistency, Isolation, Durability) paradigms in database control. Isolation levels allow developers and user to trade-off consistency for a potential gain in performance. Therefore, it is essential to understand them and how the apply in structured Query Language(SQL). The dashDB RDBMS has four isolations levels:

Repeatable Read (RR)

The repeatable read (RR) isolation level locks all the rows that an application references during a unit of work (UOW). If an application issues a SELECT statement twice within the same unit of work, the same result is returned each time. Under RR, lost updates, access to uncommitted data, non-repeatable reads, and phantom reads are not possible.

Under RR, an application can retrieve and operate on the rows as many times as necessary until the UOW completes. However, no other application can update, delete, or insert a row that would affect the result set until the UOW completes. Applications running under the RR isolation level cannot see the uncommitted changes of other applications. This isolation level ensures that all returned data remains unchanged until the time the application sees the data, even when temporary tables or row blocking is used.

Every referenced row is locked, not just the rows that are retrieved. For example, if you scan 20 000 rows and apply predicates to them, locks are held on all 20 000 rows, even if, say, only 200 rows qualify. Another application cannot insert or update a row that would be added to the list of rows referenced by a query if that query were to be executed again. This prevents phantom reads.

Because RR can acquire a considerable number of locks, this number might exceed limits specified by the locklist and maxlocks database configuration parameters. To avoid lock escalation, the optimizer might elect to acquire a single table-level lock for an index scan, if it appears that lock escalation is likely. If you do not want table-level locking, use the read stability isolation level.

While evaluating referential constraints, the dashDB might, occasionally, upgrade the isolation level used on scans of the foreign table to RR, regardless of the isolation level that was previously set by the user. This results in additional locks being held until commit time, which increases the likelihood of a deadlock or a lock timeout. To avoid these problems, create an index that contains only the foreign key columns, which the referential integrity scan can use instead.

Read Stability (RS)

The read stability (RS) isolation level locks only those rows that an application retrieves during a unit of work. RS ensures that any qualifying row read during a UOW cannot be changed by other application processes until the UOW completes, and that any change to a row made by another application process cannot be read until the change is committed by that process. Under RS, access to uncommitted data and non-repeatable reads are not possible. However, phantom reads are possible. Phantom reads might also be introduced by concurrent updates to rows where the old value did not satisfy the search condition of the original application but the new updated value does.

For example, a phantom row can occur in the following situation:

Application process P1 reads the set of rows n that satisfy some search condition.

Application process P2 then inserts one or more rows that satisfy the search condition and commits those new inserts.

P1 reads the set of rows again with the same search condition and obtains both the original rows and the rows inserted by P2.

In a dashDB environment, an application running at this isolation level might reject a previously committed row value, if the row is updated concurrently on a different member. To override this behavior, specify the WAIT_FOR_OUTCOME option.

This isolation level ensures that all returned data remains unchanged until the time the application sees the data, even when temporary tables or row blocking is used.

The RS isolation level provides both a high degree of concurrency and a stable view of the data. To that end, the optimizer ensures that table-level locks are not obtained until lock escalation occurs.

The RS isolation level is suitable for an application that:

Operates in a concurrent environment

Requires qualifying rows to remain stable for the duration of a unit of work

Does not issue the same query more than once during a unit of work, or does not require the same result set when a query is issued more than once during a unit of work

Cursor Stability (CS)

The cursor stability (CS) isolation level locks any row being accessed during a transaction while the cursor is positioned on that row. This lock remains in effect until the next row is fetched or the transaction terminates. However, if any data in the row was changed, the lock is held until the change is committed.

Under this isolation level, no other application can update or delete a row while an updatable cursor is positioned on that row. Under CS, access to the uncommitted data of other applications is not possible. However, non-repeatable reads and phantom reads are possible.

Cursor Stability (CS) is the default isolation level.

Cursor Stability (CS) is suitable when you want maximum concurrency and need to see only committed data.

In a dashDB environment, an application running at this isolation level may return or reject a previously committed row value, if the row is concurrently updated on a different member. The WAIT FOR OUTCOME option of the concurrent access resolution setting can be used to override this behavior.

Uncommitted Read (UR)

The uncommitted read (UR) isolation level allows an application to access the uncommitted changes of other transactions. Moreover, UR does not prevent another application from accessing a row that is being read, unless that application is attempting to alter or drop the table.

Under UR, access to uncommitted data, non-repeatable reads, and phantom reads are possible. This isolation level is suitable if you run queries against read-only tables, or if you issue SELECT statements only, and seeing data that has not been committed by other applications is not a problem.

Uncommitted Read (UR) works differently for read-only and updatable cursors.

Read-only cursors can access most of the uncommitted changes of other transactions.

Tables, views, and indexes that are being created or dropped by other transactions are not available while the transaction is processing. Any other changes by other transactions can be read before they are committed or rolled back. Updatable cursors operating under UR behave as though the isolation level were CS.

If an uncommitted read application uses ambiguous cursors, it might use the CS isolation level when it runs. To prevent this escalation, modify the cursors in the application program to be unambiguous and/or Change the SELECT statements to include the for read-only

Recently, I had reason to during the months between two dates to test some data against business rule/requirement. Pleasantly surprised I was to find that Netezza had an easy to use function ‘months_between’ function to calculate the difference.

The Months Between SQL Function syntax

The months_Between function uses two dates to perform the calculation. Whether you want the output to be a positive or negative number determines the field order within the function.

For a positive number result, put the Newest Date Field first, separated by a comma, then Oldest Date Field

For a negative number result, just reverse the order putting the Oldest Date Field first, separated by a comma, then the Newest Date Field

The results will contain a decimal for the days of the month and you will need to round, based on your business requirements, to achieve a whole number.

Substring is a common enough function in SQL, however, the exact language format used to perform this function can vary from one database to another. So, here are a few quick notes on the substring format in Netezza / PureData.

What is the purpose of a substring?

SUBSTRING allows SQL to extract part of a string with a field, which in Netezza / PureData is based on positions with the string.

Sometimes it is useful to be able number data with it groups, which can be quickly and easily done suing the Row Number function in Netezza. However, you do need to pay particular attention to you ‘partition by’ (conceptually a group by operation) field list and field order, as well as, you ‘Order by’ field list. Otherwise, you can get some bizarre results. it find it helpful to think of them as having a parent, child, relationship. In this line of thought, your ‘partition by’ field list should usually have one less more field than than your ‘order by’. This is not an absolute rule, but under normal circumstances your ‘order by’ field list should contain all the field in your ‘partition by’ and, occasionally, provide a orby direction (ascending or descending).

SQL Netezza Row Number Format

ROW_NUMBER() OVER( PARTITION BY <<Partition_By_Field_list>> ORDER BY <<Sort_By_Field_List [direction asc/desc]>> ) as ROW_NUM

The Case Statement, is one which I occasionally use within Netezza/PureData SQL and Stored Procedures. Basically, the Case Function provides an ‘IF-THEN-ELSE’ decision capability. This ‘IF-THEN-ELSE’ capability allows the evaluation conditions and return a value when the first condition is met and/ or else break out logic, if the condition or conditions are not met.

In the classic Software Development Life Cycle (SDLC) process, Data Models are typically initiated, by model type, at key process steps and are maintained as data model detail is added and refinement occurs.

The Concept Data Model (CDM) is, usually, created in the Planning phase. However, creation the Concept Data Model can slide forwarded or backward, somewhat , within the System Concept Development, Planning, and Requirements Analysis phases, depending upon whether the application being modeled is a custom development effort or a modification of a Commercial-Off-The-Shelf (COTS) application. The CDM is maintained, as necessary, through the remainder of the SDLC process.

The Logical Data Model (LDM) is created in the Requirement Analysis phase and is a refinement of the information entities of the Concept Data Model. The LDM is maintained, as necessary, through the remainder of the SDLC process.

The Physical Data Model (PDM) is created in the Design phase to facilitate creation of accurate detail technical designs and actual database creation. The PDM is maintained, as necessary, through the remainder of the SDLC process.