SQL decomposition and testing (SQLite)

Posted 27 January 2014 - 06:54 AM

Motivation: (skip to the questions if you don't care)

In a personal project that I'm planning, I want to keep as much of the logic in the database as possible. I usually avoid all things database and have been looking for an excuse to get my hands dirty for a while. This project seems ideal because the data that the user will want to see is a distilled summary of the data that needs to be stored in the database.

At the same time, I have a work project that deals with analysing some scientific data. I've done the required analysis for some experiments "manually" in a spreadsheet and think it's time to automate it. Rather than my usual approach of writing some scripts to parse the data, I thought it was about time to do this sort of stuff in a database.

SQLite seemed great for both these (embedded application database for the former and standalone engine for data processing for the latter)

Questions:

What are the best techniques for decomposing SQL. This is difficult to google because I keep getting this. However, I have found a couple of pages saying views are useful for decomposing and reusing SQL. I've also found some comments saying it's a bad idea and others saying it helps performance, although I'm sure this is dependant on the DBMS. Are there better things I could be doing?

Following on from this, what is the best way of separating logical aspects of my design? For example, my second project might have several tables in each category of raw data, interim processing, final results, visualisation scripts. Java has packages. .NET has namespaces. Is there any equivalent for databases?

Finally, what about testing? I would usually do this stuff in a programming language and write unit tests. I could still use unit tests to populate some data, test the results and clean up the test database but I can't help wondering if there is a better tool.

Re: SQL decomposition and testing (SQLite)

Posted 27 January 2014 - 07:41 AM

I'm also not entirely following this...

Right, you have tables. Most of your joins are those tables are to essentially de-normalize things. Most of the rest aggregate. You can make views to those tables that are all the foreign key style joins you could make. Make the logical group bys. There are probably more, but you'll figure it out as you go. Now, from this point forward, you consider your tables and views the record sets available: no more joins or grouping.

At this point, the only thing you need work with are essentially filters. Where clauses. These can simply be picked from the available fields and use basic where grammar.

Re: SQL decomposition and testing (SQLite)

Posted 27 January 2014 - 09:34 AM

Quote

Right, you have tables. Most of your joins are those tables are to essentially de-normalize things. Most of the rest aggregate. You can make views to those tables that are all the foreign key style joins you could make. Make the logical group bys. There are probably more, but you'll figure it out as you go. Now, from this point forward, you consider your tables and views the record sets available: no more joins or grouping.

I think this is pretty much what I was talking about, but expressed far more clearly. Thanks!

What about testing then? In making those views of joins and group bys, I will also be including formulae. One example would be standard deviation (chosen because SQLite doesn't appear to have a function for that). This is something I would want to test if it were coded in a traditional language and I feel like I should do so here. Is the done thing to throw together some unit tests in another language or is there something better?

Quote

i would reccommend dl ms sql -- it will make your life a lot easier

It seems to be Windows only. The attraction of SQLite to me is the lack of installation, configuration and its availability on multiple platforms (I work in Windows and Linux and aspire to own a Mac one day). What makes you recommend MS SQL? Is it just the stored procedures thing?

Re: SQL decomposition and testing (SQLite)

This is something I would want to test if it were coded in a traditional language

SQL is a declarative language. In essence, it is a functional language. If you've written your expression correctly and it returns the result set you expect, you're pretty much done.

The only way it would be invalid is if the data returned didn't conform to expectations. That test is done by users. If you can think of a unit test for a SQL query, then you've probably already written your SQL to pass it.

It's more of a GIGO (Garbage In, Garbage Out) thing. Your SQL will always validate, even if your question is dumb.

Re: SQL decomposition and testing (SQLite)

Right, rule #1 with stored procedures. They should only be used if a view won't work. In particular, loops should only be used if all else fails. Additionally, using IN with a select is worst case scenario.

Re: SQL decomposition and testing (SQLite)

SQL is a declarative language. In essence, it is a functional language. If you've written your expression correctly and it returns the result set you expect, you're pretty much done.

Hmm... I see similar things written about functional languages. However, from the little functional code I've written, unit testing has helped there. I know I have written SQL that returned the wrong result in the past. It wasn't until I threw some data at it that I noticed my mistake. I thought unit testing was just that: throwing data at some code to see if it meets expectations. I have a hard time believing that not keeping and rerunning the test is a good thing.

I'm torn between just going along with your experience or bashing out test code until I reach enlightenment.

Quote

Right, rule #1 with stored procedures. They should only be used if a view won't work.

I'm intrigued as to why. If I'm using SQLite then I don't have much choice but what makes views the preferred option?

Re: SQL decomposition and testing (SQLite)

Agreed. However, keep in mind that SQL is a very domain specific language; not general purpose. There's only so much it can, will, and should do. This makes the test domain even smaller. Essentially, testing involves feeding a query all possible data variations. Once you've done that, what else can you do?

cfoley, on 27 January 2014 - 02:27 PM, said:

I know I have written SQL that returned the wrong result in the past. It wasn't until I threw some data at it that I noticed my mistake.

Right. And, once you've seen it work with the data... what is there to test. Also, YOU had to observe the behavior. A test harness would do... what?

For procedure vs. view:
Well, if you can express your request in the form of a SQL query, you're playing to the strength of your database. Databases are most efficient when processing a SQL expressions.

A procedure is always a bit of a hack. You can only ask it one question and the black box that answers might do all kinds of heinous things behind the scenes. It can be the only way to solve a problem, but it only solves one.

A view is transparent. Since it must conform to query rules, there's only so bad it can get. It is also reusable. You can treat a view as a table and ask it any questions you like. A procedure is a one trick pony.