One of the humbling things about working at Oracle with the
various MySQL personnel is that you are regularly blown away
by something one of them says or does. In this case it is
Dr. Charles Bell, who gave a great series of presentations
last June at the Southeast Linuxfest.
In particular, he presented in a fully formed state some ideas that
had been rattling around in my skull (though nowhere near as
coherently) on how to take advantage of the MySQL JSON data
type. Below are his points from his slide deck. I was
reviewing my notes from his presentation when I realized that
this information really needs to be more widely
disseminated, and I would like your feedback on these
ideas.
1. We can use a JSON field to eliminate one of the issues of
traditional database solutions: many-to-many joins (see the sketch below).

This allows more freedom to store unstructured data (data
with pieces …
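
To make that point concrete, here is a minimal sketch (my own illustration, not from the slides) of how a JSON column in MySQL 5.7 and later can stand in for a classic many-to-many join table; the table and column names are assumptions made up for the example.

-- Traditional approach: a join table between posts and tags (many-to-many).
-- CREATE TABLE post_tags (post_id INT, tag_id INT, PRIMARY KEY (post_id, tag_id));

-- JSON approach: store the tags directly on the row as a JSON array.
CREATE TABLE posts (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200) NOT NULL,
    tags  JSON
);

INSERT INTO posts (title, tags)
VALUES ('Using JSON in MySQL', '["mysql", "json", "schema"]');

-- Find posts carrying a given tag without touching a join table.
SELECT id, title
FROM   posts
WHERE  JSON_CONTAINS(tags, '"json"');

The trade-off, of course, is that the tag list is no longer enforced by foreign keys, which is part of the extra freedom the excerpt mentions.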

Well, every now and then, when we start a new project or
app that has some data storage requirement, we have to think
deeply about how best to represent the data structure so
as to support a variety of needs, including but not limited to
(the ACID rules):

1. Normalization
2. Reliability
3. Consistency
4. And many others

Below, I provide a set of steps which you can follow to arrive at
a data model that correctly suits your requirements.

Steps:
1. Identify the project or app requirements / specifications and business rules, which tell you what your app will be able to do when it is ready.
2. From these business rules, identify possible objects for each business rule and mark them on paper using rectangular sections, like authors, posts, etc. (see the sketch after these steps).
3. Once you have recognized the …
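
By way of illustration only (the post's own walk-through is cut off above), the authors and posts objects from step 2 might eventually be mapped to tables along these lines; every name and column here is an assumption for the sketch.

-- Hypothetical tables derived from the 'authors' and 'posts' objects
-- identified from the business rules.
CREATE TABLE authors (
    author_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name      VARCHAR(100) NOT NULL,
    email     VARCHAR(255) NOT NULL UNIQUE
);

CREATE TABLE posts (
    post_id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    author_id  INT UNSIGNED NOT NULL,
    title      VARCHAR(200) NOT NULL,
    body       TEXT,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (author_id) REFERENCES authors (author_id)
);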

Before you start creating database entities, spend some time
designing your database to ensure that it is fit for your
purpose. The time you invest in this process saves a lot of time
and trouble later. Professional database designers fine-tune
their design using a process called 'normalization'. The
normalization process takes your database design through a number
of 'normal forms', which aim to ensure efficient data access,
greater query flexibility, and easier maintenance.

For example, the First Normal Form (or '1NF') ensures that all
your database columns contain only a single value. A column that
contains multiple data values is difficult to access and keep up
to date. It also ensures that each table row only represents a
single 'real world' item. Like all the other normal forms, this
encourages you to split your data across multiple tables, with
fewer rows in each table. You can quickly see the benefits of this
approach as your database …
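
As a small illustration of 1NF (my example, not the article's), compare a column that crams several phone numbers into one value with the same data split into its own table:

-- Not in 1NF: one column holds several values at once,
-- e.g. phones = '555-0100, 555-0101'.
-- CREATE TABLE contacts (id INT PRIMARY KEY, name VARCHAR(100), phones VARCHAR(255));

-- In 1NF: each column holds a single value, so phone numbers move to a child table.
CREATE TABLE contacts (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE contact_phones (
    contact_id INT UNSIGNED NOT NULL,
    phone      VARCHAR(20)  NOT NULL,
    PRIMARY KEY (contact_id, phone),
    FOREIGN KEY (contact_id) REFERENCES contacts (id)
);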

A couple of years ago I wrote a post about
key/value tables and how they can ruin the day of any honest
person that wants to create BI solutions. The obvious
advice I gave back then was to not use those tables in the first
place if you’re serious about a BI solution. And if you
have to, do some denormalization.

However, there are occasions where you need to query a source
system and get some reports going on it. Let's take a look
at an example:
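
The original example is not reproduced here, but a minimal sketch of the kind of key/value (entity-attribute-value) table the post is describing, and one common way to pivot it back into columns for reporting, might look like this (table and attribute names are my own):

-- A typical key/value (EAV) table in a source system.
CREATE TABLE product_attributes (
    product_id INT UNSIGNED NOT NULL,
    attr_name  VARCHAR(50)  NOT NULL,
    attr_value VARCHAR(255),
    PRIMARY KEY (product_id, attr_name)
);

-- Pivoting the rows back into columns so a report or BI tool can consume them.
SELECT product_id,
       MAX(CASE WHEN attr_name = 'color'  THEN attr_value END) AS color,
       MAX(CASE WHEN attr_name = 'weight' THEN attr_value END) AS weight,
       MAX(CASE WHEN attr_name = 'brand'  THEN attr_value END) AS brand
FROM   product_attributes
GROUP  BY product_id;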

This has really been a long debate as to which approach is more
performance-oriented: normalized databases or denormalized
databases. So this article is a step on my part to figure out the
right strategy, because neither of these approaches can be
rejected outright. I will start off by discussing the pros and
cons of both approaches. Pros and cons of a normalized
database design: normalized databases fare very well under
conditions where the applications are write-intensive and the
write-load is more than the read-load. This is because of the
following reasons: normalized tables are usually smaller and...
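
To give the two approaches concrete shape (a sketch of my own, not taken from the article), the same order data could be stored either normalized or denormalized:

-- Normalized: smaller rows; a customer update touches exactly one row in one table.
CREATE TABLE customers (
    customer_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name        VARCHAR(100) NOT NULL,
    city        VARCHAR(100)
);

CREATE TABLE orders (
    order_id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer_id INT UNSIGNED NOT NULL,
    order_date  DATE NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);

-- Denormalized: reads skip the join, but the customer details repeat on every order row
-- and must be rewritten wherever they appear whenever they change.
CREATE TABLE orders_flat (
    order_id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    order_date    DATE NOT NULL,
    customer_name VARCHAR(100) NOT NULL,
    customer_city VARCHAR(100)
);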

At Kscope this year, I attended a half day in-depth session
entitled Data Warehousing Performance Best Practices, given by
Maria
Colgan of Oracle. My impression, which was confirmed by folks
in the Oracle world, is that she knows her way around the Oracle
optimizer.

These are my notes from the session, which include comparisons of
how Oracle works (which Maria gave) and how MySQL works (which I
researched to figure out the difference, which is why this blog
post took a month after the conference to write). Note that I am
not an expert on data warehousing in either Oracle or MySQL, so
these are more concepts to think about than hard-and-fast advice.
In some places, I still have questions, and I am happy to have
folks comment and contribute what they know.

One interesting point brought up:
Maria quoted someone (she said the name but I did not grab it)
from …

Does having small data-sets really help? Of course it does! Are
memory lookups faster than disk lookups? Of course! So many
times I have seen people complain about queries taking too long
now, while they were not taking that long earlier. There is one
big reason for this: earlier, the data-set was small enough to
fit into memory. Now that the data-set has grown too large to
fit entirely into memory, the disk seeks have really slowed
down the queries significantly. What to do now? Vertical
partitioning. Divide the data-set into separate data-sets
vertically....
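
A minimal sketch of what that vertical split can look like (my own example; the column choices are assumptions): the frequently read columns stay in a small table that fits in memory, while the bulky, rarely touched columns move to a second table sharing the same primary key.

-- Before: one wide table; large, rarely used columns crowd the hot data out of memory.
-- CREATE TABLE users (id INT PRIMARY KEY, username VARCHAR(50), last_login DATETIME,
--                     bio TEXT, avatar MEDIUMBLOB, preferences TEXT);

-- After: vertical partitioning into a hot table and a cold table on the same key.
CREATE TABLE users_hot (
    id         INT UNSIGNED NOT NULL PRIMARY KEY,
    username   VARCHAR(50) NOT NULL,
    last_login DATETIME
);

CREATE TABLE users_cold (
    id          INT UNSIGNED NOT NULL PRIMARY KEY,
    bio         TEXT,
    avatar      MEDIUMBLOB,
    preferences TEXT,
    FOREIGN KEY (id) REFERENCES users_hot (id)
);

-- Frequent queries now scan only the small, memory-resident table.
SELECT username, last_login FROM users_hot WHERE id = 42;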

If you’ve been reading up on the various NoSQL offerings and have
wanted to try out one but don’t know how to get started, this is
one of the easiest ways. I chose MongoDB for this example because
I’m going to start using it for a project that needs features
that MySQL isn’t as fast at: namely denormalized data with
billions of rows. MongoDB has plenty of drivers for other
scripting and high-level languages but I’ll focus on the PHP
driver today. If there is interest, I can do a write-up on Python
usage later. This example is limited to CentOS, Fedora, and
Redhat 5 servers that use the yum package management system. For
more information you can reference their download page: http://www.mongodb.org/display/DOCS/Downloads
