When creating tables from multiple joins for use in analysis, when is it preferred to use views versus creating a new table?

One reason that I would prefer to use views is that the database schema has been developed by our administrator from within Ruby, and I am not familiar with Ruby. I can request that tables be created, but that requires an additional step, and I would like more flexibility when developing and testing new joins.

I started using views following the answer to a related question on SO (When to use R, when to use SQL). The top-voted answer begins "do the data manipulations in SQL until the data is in a single table, and then do the rest in R."

Since then I have run into a few issues with views:

Queries are much slower.

Views do not get dumped from the production database to the backup database that I use for analysis.

Are views appropriate for this use? If so, should I expect a performance penalty? Is there a way to speed up queries on views?

It sounds like views are appropriate here, but I'm not sure what could be causing the slowdown when querying them.
– FrustratedWithFormsDesigner Apr 11 '12 at 18:42

@FrustratedWithFormsDesigner are there any diagnostics that would help (short of creating a reproducible example)? The same complex query takes < 4s when done directly on joined tables and > 25s when done on views. Are views expected to not have a performance penalty?
– David Apr 11 '12 at 18:56

I use MySQL, and I can tell you views are terrible: unusable once you get to 100K and above. Just use straight queries, where you have control over which fields to return and which joins to use.
– ssmusoke Apr 11 '12 at 19:07

2 Answers

Views in MySQL are handled using one of two different algorithms: MERGE or TEMPTABLE. MERGE is simply a query expansion with appropriate aliases. TEMPTABLE is just what it sounds like: the view's results are put into a temporary table before the WHERE clause is applied, and that temporary table has no indexes.
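To make the difference concrete, here is a minimal sketch of what MERGE expansion amounts to. The tables `customers` and `orders`, their columns and indexes, and the view `order_details` are all hypothetical names invented for this illustration; they are not from the question.

```sql
-- Hypothetical base tables, for illustration only.
CREATE TABLE customers (
  id   INT PRIMARY KEY,
  name VARCHAR(100),
  INDEX idx_name (name)
);
CREATE TABLE orders (
  id          INT PRIMARY KEY,
  customer_id INT,
  total       DECIMAL(10,2),
  INDEX idx_customer (customer_id)
);

-- A plain join: nothing here prevents the MERGE algorithm.
CREATE ALGORITHM = MERGE VIEW order_details AS
SELECT o.id AS order_id, o.total, c.name AS customer_name
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id;

-- A query against the view ...
SELECT order_id, total
FROM order_details
WHERE customer_name = 'Alice';

-- ... is handled roughly as if it had been written against the base
-- tables, so the WHERE condition can still use idx_name:
SELECT o.id AS order_id, o.total
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE c.name = 'Alice';
```

Had the same view been declared with ALGORITHM = TEMPTABLE, the whole join would be written to a temporary table first and the filter on `customer_name` applied to that unindexed result afterwards.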

The 'third' option is UNDEFINED, which tells MySQL to select the appropriate algorithm. MySQL will attempt to use MERGE because it is more efficient. Main Caveat:

If the MERGE algorithm cannot be used, a temporary table must be used instead. MERGE cannot be used if the view contains any of the following constructs:

Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)

DISTINCT

GROUP BY

HAVING

LIMIT

UNION or UNION ALL

Subquery in the select list

Refers only to literal values (in this case, there is no underlying table)
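As a sketch of how this shows up in practice, the following view uses two of the listed constructs. The table `line_items` and the view `order_totals` are hypothetical. The MySQL manual says that requesting MERGE for a view that can only be processed with a temporary table produces a warning and sets the algorithm to UNDEFINED, so the two SHOW statements should confirm that.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE line_items (
  order_id INT,
  amount   DECIMAL(10,2),
  INDEX idx_order (order_id)
);

-- SUM() and GROUP BY are both on the list, so MERGE cannot apply.
CREATE ALGORITHM = MERGE VIEW order_totals AS
SELECT order_id, SUM(amount) AS total
FROM line_items
GROUP BY order_id;

-- Expect a warning saying the merge algorithm cannot be used here.
SHOW WARNINGS;

-- The stored definition should report ALGORITHM=UNDEFINED, not MERGE.
SHOW CREATE VIEW order_totals;
```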

My guess is that your views require the TEMPTABLE algorithm, which is causing the performance issues.
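One way to check that guess is to compare the EXPLAIN output for a query through the view with the equivalent query on the base tables. The sketch below uses a hypothetical table `page_views` and view `distinct_visits`; substitute your own view and query. The exact plan depends on the server version, so treat the comments as what to look for rather than guaranteed output.

```sql
-- Hypothetical table and view, for illustration only.
CREATE TABLE page_views (
  user_id INT,
  page    VARCHAR(100),
  INDEX idx_user (user_id)
);

-- DISTINCT prevents MERGE, so this view is materialized when queried.
CREATE VIEW distinct_visits AS
SELECT DISTINCT user_id, page
FROM page_views;

-- Through the view: look for a row with select_type DERIVED and a
-- table name such as <derived2>, which indicates a temporary table.
EXPLAIN
SELECT user_id, page
FROM distinct_visits
WHERE user_id = 42;

-- Directly on the base table: the filter can use idx_user.
EXPLAIN
SELECT DISTINCT user_id, page
FROM page_views
WHERE user_id = 42;
```

If the first plan shows a DERIVED step scanning the whole base table while the second uses an index, the view is being materialized before the WHERE clause is applied.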

Here is a really old blog post on the performance of views in MySQL; the situation doesn't seem to have improved since.

There might, however, be some light at the end of the tunnel on this issue of temporary tables not containing indexes (causing full table scans). In MySQL 5.6:

For cases when materialization is required for a subquery in the FROM clause, the optimizer may speed up access to the result by adding an index to the materialized table.
...
After adding the index, the optimizer can treat the materialized derived table the same as a usual table with an index, and it benefits similarly from the generated index. The overhead of index creation is negligible compared to the cost of query execution without the index.

As @ypercube points out, MariaDB 5.3 has added the same optimization. This article has an interesting overview of the process:

The optimization is applied when the derived table could not be merged into its parent SELECT which happens when the derived table doesn't meet criteria for mergeable VIEW
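On servers without this optimization, a similar effect can be had by hand: materialize the non-mergeable query yourself and index the result. This is a workaround sketch, not something proposed in the sources quoted above. The table `payments` and the temporary table `customer_totals` are hypothetical, and a TEMPORARY table is assumed so that the shared schema is left untouched (it requires the CREATE TEMPORARY TABLES privilege).

```sql
-- Hypothetical source table, for illustration only.
CREATE TABLE payments (
  customer_id INT,
  amount      DECIMAL(10,2)
);

-- Materialize the aggregate once; the table lives only for this session.
CREATE TEMPORARY TABLE customer_totals AS
SELECT customer_id, SUM(amount) AS total_paid
FROM payments
GROUP BY customer_id;

-- Add the index that a TEMPTABLE view's result would lack.
ALTER TABLE customer_totals ADD INDEX idx_customer (customer_id);

-- Later lookups can now use the index instead of a full scan.
SELECT total_paid
FROM customer_totals
WHERE customer_id = 42;
```

The trade-off is that the snapshot does not follow changes to the base table, so it has to be rebuilt whenever fresh data is needed.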

I have done no testing on these claims, but MariaDB 5.3 (recently released as stable) has some major improvements to the optimizer, including for views: "Fields of merge-able views and derived tables are involved now in all optimizations employing equalities"
– ypercube Apr 11 '12 at 20:00

@ypercube Thanks for that link. It appears MySQL 5.6 has at least the optimization of adding an index to derived tables.
– Derek Downey Apr 11 '12 at 20:35