I feel very comfortable with Imperative programming. I never have trouble expressing algorithmically what I want the computer to do once I figured out what is it that I want it to do. But when it comes to languages like SQL or I often get stuck because my head is too used to Imperative programming.

For example, suppose you have the relations band(bandName, bandCountry), venue(venueName, venueCountry), plays(bandName, venueName), and I want to write a query that says: all venueNames such that for every bandCountry there's a band from that country that plays in venue of that name.

Example: I want all the venueNames in which bands from all countries (bandCountry) played. Also, by "relation" I mean an SQL table.

In my mind I immediately go "for each venueName iterate over all the bandCountries and for each bandCountry get the list of bands that come from it. If none of them play in venueName, go to next venueName. Else, at the end of the bandCountries iteration add venueName to the set of good venueNames".

...but you can't talk like that in SQL and I actually need to think about how to formulate this, with the intuitive Imperative solution constantly nagging in the back of my head. Did anybody else had this problem? How did you overcome this? Did you figured out a paradigm shift? Made a map from Imperative concepts to SQL concepts to translate Imperative solutions into Declarative ones? Read a good book?

This is a good question because you're voicing a weakness that many (including myself) have.
–
DavidDec 31 '10 at 2:47

It might be useful to define what you mean by "relation" in your question. In the relational model (the math behind SQL), "relation" is roughly analogous to a SQL table. A lot of people will say "relation" when they really mean to say "relationship".
–
Jason BakerDec 31 '10 at 3:05

@Jase21, I personally am quite familiar with both, but non-trivial stuff in SQL still feels funny. None of the clean math examples deal with the weird real-world stuff. Additionally, one can use LINQ, and thus not be bothered with SQL. Finally, to the asker: you will get used to it with time.
–
JobDec 31 '10 at 4:20

9 Answers
9

The idea behind doing things declaratively is that you are supposed to specify what, not how.

To me, it sounds like you're on the right track. The problem isn't that you are thinking about things the wrong way. It's that you're going too far. Let's look at what you're trying to do:

For example, suppose you have the
relations band(bandName, bandCountry),
venue(venueName, venueCountry),
plays(bandName, venueName), and I want
to write a query that says: all
venueNames such that for every
bandCountry there's a band from that
country that plays in venue of that
name.

So far, this is great. But then you do this:

In my mind I immediately go "for each
venueName iterate over all the
bandCountries and for each bandCountry
get the list of bands that come from
it. If none of them play in venueName,
go to next venueName. Else, at the end
of the bandCountries iteration add
venueName to the set of good
venueNames".

In essence, you're doing unnecessary work. You know what you want, which is all you really need. But then you go on and try to figure out how to get it.

If I were you, I'd try to get in the following habit:

Define what you want.

Consciously stop yourself from defining how to get it.

Figure out how to represent what you want in SQL.

It might take some time and effort on your part, but once you really grok declarative programming, it becomes very useful. In fact, you might find yourself using declarative programming in the rest of your code.

If you're looking for a book, I'd recommend SQL and Relational Theory. It really helps you understand the theory behind SQL databases. Just remember to take Date's recommendations with a grain of salt. He gives very good information, but he can be a bit opinionated at times.

I don't understand how figuring out how to get something is the wrong approach. It doesn't matter what kind of language you are using you have to figure out how to tell it to do what you want.
–
davidk01Jan 1 '11 at 0:39

think in terms of sets, not iterators; the sql statements define the properties of the desired output set (aka table/relation)

all venueNames such that for every bandCountry there's a band from that country that plays in venue of that name

the result of this (if i understood your intentions correctly!) would be the set of venues that have at least one band that plays in that venue. The iteration over bandCountry is unnecessary, as the PLAYS relation already has the information that you seek, you just have to eliminate the duplicates

so in SQL this would be:

select
distinct venueName
from PLAYS

EDIT: ok, so the actual set desired is a bit more complicated. The question being asked of the database is: what venues have hosted bands from all countries?

So, we define the membership criteria for an element of the desired set as the goal, then work backwards to populate the set. A venue is a member of the output set if it has a PLAYS row for at least one band from every country. How do we get this information?

One way is to count the distinct countries for each venue and compare it to the count of all countries. But we don't have a COUNTRY relation. If we think about the model given for a moment, we see that the set of all countries is not the right criteria; it is the set of all countries that have at least one band. So we don't need a country table (though for a normalized model we should have one), and we don't care about the country of the venue, we can just count the countries that have bands, e.g. (in MS-SQL)

Now, that's not the prettiest query possible (GROUP BY and HAVING might be considered a more 'elegant' solution than temp variables and a subquery) but it's fairly obvious what we're after so we'll leave it at that for the OP's purpose.

The OP's purpose was to learn how to shift the mindset from imperative to declarative. To that end, look at what the imperative solution described was doing:

for each venueName iterate over all the bandCountries and for each bandCountry get the list of bands that come from it. If none of them play in venueName, go to next venueName. Else, at the end of the bandCountries iteration add venueName to the set of good venueNames

What is the determining criteria in the above? I think it is:

...If none of them [the set of bands from a particular country] play in venueName...

This is a disqualifying criteria. The imperative thought process is starting with a full bucket and throwing out things that don't fit the criteria. We're filtering out data.

That's fine for simple stuff, but it helps to think in terms of constructing the desired result set; what is the corresponding qualifying criteria that would allow one to fill up the bucket instead?

disqualifier: if there is no band from a bandCountry that plays at a venue, the venue is disqualified

(partial) qualifier: if at least one band from a bandCountry plays at a venue, then the venue might be ok; keep checking the rest of the bandCountries

(full) qualifier: if at least one band from each bandCountry plays at a venue, then the venue is qualified

The final qualifier can be simplified using counts: a bandCountry is 'satisfied' if at least one band from there plays at a venue; the number of 'satisfied' band countries for a venue must equal the number of band countries for the venue to be qualified.

Now we can reason across relations by navigation:

start with the VENUE relation [we don't need it for the answer, but it is the conceptual starting point for the relational navigation]

join to PLAYS on venueName

join to BAND on bandName to get the bandCountry

we don't care about the band name; select only the venueName and bandCountry

we don't care about redundant bandCountries; eliminate duplicates using DISTRICT or GROUP BY

we only care about the count of distinct bandCountries, not the names

we only want venues where the count of distinct bandCountries is the same as the total number of bandCountries

which leads up back to the solution above (or a reasonable facsimile thereof)

One way to learn how to think and program in a declarative style is to learn a general purpose array language like APL or J. SQL is probably not the best vehicle for learning how to program declaratively. In APL or J you learn to operate on entire arrays (vectors, matrices, or arrays of higher rank), without explicit looping or iteration. This make understanding SQL and relational algebra much easier. As a very simple example, in order to select items from a vector V whose value are greater than 100, in APL we write:

(V>100)/V

Here V>100 evaluates to a boolean array of the same shape as V, with 1's marking the values we want to keep. It does not occur to the seasoned APLer that there is iteration going on, we are just applying a mask to the vector V, returning a new vector. This of course is conceptually what an SQL where clause or relational algebra restrict operation is doing.

I don't think you can get a good grip on declarative programming without doing a lot of it, and SQL generally is too specific. You need to write a lot of general purpose code, learning how to do without loops and if/then/else structures and all of the apparatus that attends imperative, procedural and scalar-style programming.

There may be other functional languages that help with this way of thinking too, but the array languages are very close to SQL.

+1 for "[you can't] get a good grip ... without doing a lot of it". Nobody learned imperative programming (with its clearly counter-intuitive constructs like a = a + 1) overnight either. It takes time to learn declarative styles like logic, functional, et al just like it took time to learn imperative programming.
–
JUST MY correct OPINIONDec 31 '10 at 6:27

First, you have to learn both. You may have a preference, but when working in areas where the other is better, don't fight it. Lots of programmers are tempted to use cursors in relational databases since they are so use to stepping through each record, but database is much better at sets. You don't want to get into the mindset of "I know how to do it this way and I have the most control, blah, blah, blah".

You learn to think declaratively like you learned to think imperatively: by practice starting with simpler problems and working on up as you "get it".

Your first experiences with imperative programming included a whole bunch of counter-intuitive (and, in fact, utterly ridiculous) statements like "a = a + 1". You wrapped your mind around that to the point that now you likely don't even recall the recoiling from the obvious untruth of the statement. Your problem with declarative styles is that you're back where you were when you first started with imperative styles: a "clueless newb". Even worse, you've got years of practice with one style that is utterly at odds with this new style and have years of habits to undo -- like the habit to "control at all costs".

Declarative styles work with a different approach that you lack intuition for now (unless you've kept your maths skills very sharp over the years -- which most people don't). You have to relearn how to think and the only way to relearn that is to do it, one simple step at a time.

Picking SQL as your first foray into declarative programming might be a mistake if you really want to learn the concepts. Sure the tuple calculus it's based on is as declarative as it gets, really, but unfortunately the purity of the tuple calculus has been badly compromised by the realities of implementation and the language has, in effect, become a bit of a muddled mess. You may wish instead to look at other more directly useful (in the sense you're used to) declarative languages like the Lisps (esp. Scheme), Haskell and the MLs for (mostly) functional programming or, alternatively, Prolog and Mercury for (mostly) logic programming.

Learning these other languages will give you better insight, in my opinion, into how declarative programming works for a few reasons:

They're useful for "cradle to grave" programming -- as in you can write a full program in these languages from beginning to end. They're useful standing alone, unlike SQL which is really quite useless for most people as a standalone language.

They each give you a different slant on declarative programming that can give you different roads to finally "getting it".

They also each give you a different slant on thinking about programming in general. They'll improve your ability to reason about problems and coding even if you never directly use them yourself.

The lessons you learn from them will help you with your SQL as well -- especially if you brush up on the tuple calculus behind relational databases for the pure form of thinking about data.

I'd especially recommend learning one of the functional languages (Clojure, as one of the Lisps, is probably a good choice here) and one of the logic languages (I like Mercury best, but Prolog has a lot more useful material for learning) for maximal thought process expansion.

It's not wrong to think imperatively in a declarative setting like SQL. It's just that the imperative thinking should be happening at a bit higher level than what you described. Whenever I need to query a database that uses SQL I always think to myself:

Here are the pieces I need.

I'm going to put them together in this fashion.

I'm going to whittle down what I just obtained with the following predicates to get to what I'm really looking for.

The above is a high level imperative algorithm and it works pretty well for me in the SQL setting. I think this is considered a top-down approach and Steven A. Lowe describe a pretty good bottom-up approach.

The key to your question is in what you said in the next-to-last paragraph: "You can't talk like that in SQL." It may be more useful for you at this stage to approach SQL as a foreign language instead of a programming language. If you think of it this way, writing a SQL query is really translating an English statement of what you want into "SQLish". The computer understands SQLish perfectly and will do exactly what you say, so you don't need to worry about implementation as long as you translate correctly.

That said, what's the best way to learn a foreign language? You obviously have to learn the grammar and vocabulary, which you can get from your SQL documentation. The most important thing is practice. You should read and write as much SQL as you can, and don't feel that you have to know the syntax thoroughly first; you can, and should, look up things as you go along. You'll know you've got it when you find it easier to describe what data you want in SQL than in English.

It took me a long time to wrap my head around SQL as well. We did some relational theory at university and at the time, that only served to complicate matters. In the end, my learning process was very much trial and error informed by various learning materials and examples that I found useful along the way. Essentially, you will get used to it eventually, and adding a new way of thinking about data and queries will be of some value to your mental development.

I found that I was able to accelerate my learning by gradually building up a collection of simple scripts demonstrating how to use each language feature and how to achieve certain results on a known table (table definitions included for reference).

Earlier this year I did some formal training involving a data migration project on a messy Oracle database, where I had to gradually piece together fragments from my library to filter the query results in various ways until I had exactly what I wanted, then transform them as necessary, and so on. Some of the queries became very complex and difficult to debug. I doubt I could read them now, but I hope I could arrive at a similar solution again by using my reference building blocks.

Other ways to increase your intuitive awareness of the declarative and functional spaces are learning set theory and programming languages more suited to a particular paradigm. I'm currently in the process of learning some Haskell, for example, to maintain and improve my mathematical abilities.

When you face a problem you usually think how to solve it.
But if you know how computer solves it for you! Then you're concern about how will be eliminated.

I try to say how it happens.

You may already are familiar with recursive programs, in recursive programs, you define the problem rather to say how it is solved. you define the base, and define n based on n-1. (for example factorial(n) = n * factorial(n-1)) But you may already know how the computer solve it. it start from the function and call the function recursively until it reach a base definition, then evaluate all the other functions based on the base value.

It's what happen in declarative programming. you define everything based on the existing definitions. And computer knows how to derive the answer for you based on the basic functions.

In SQL you may don't relate definitions to each other but you relate objects or information to each other, you specify what you want and computer search to something (object, information) based on the relations you have provided.