Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I have a tree structured model Group which has the columns id, parent_id, name, lft and rght. I also have a model User which has the columns id, name and password. Group and User have a Has-and-Belongs-to-Many (HABTM) relationship using the join-table groups_users

I want to create a view that creates a new tree which includes all the groups and all the users as children of the groups they belong to. The view should have the columns id, parent_id, model, foreign_key, lft and rght.

The id column should just be the Nth row and parent_id, lft and rght can't be just copied from the Group model, they will have to calculated for the tree of the view.

Your lft/rght fields are not correct. I doubt you could build a table like that on the fly
–
dogmatic69Sep 19 '12 at 21:36

Fixed the example. Why do you doubt that? I thought SQL was capable of doing very complicated stuff?
–
EvertSep 19 '12 at 21:43

For every insert on a MPTT table you need an update to fix the lft/rght values of other rows. How are you going to union two tables and at the same time update all the lft/rght values to be a non corrupt tree?
–
dogmatic69Sep 19 '12 at 23:30

Well, for the Group model I have that covered with a behavior in CakePHP. However, for the VIEW that shouldn't be a problem. The idea is that the VIEW is generated on the fly via some sort of super-complex query. So the lft/rght values of the VIEW should also be calculated on the fly by the query.
–
EvertSep 19 '12 at 23:47

Can you clarify: a) what tables you currently have and b) what views you want to create? (sample rows would be fine)
–
ypercubeSep 20 '12 at 8:37

1 Answer
1

Views in MySQL cannot reference variables. Try defining even a simple view that references a session variable, and there's no joy. Take a working query that references a variable and try creating a view using that query:

ERROR 1351 (HY000): View's SELECT contains a variable or parameter

Granted, there's a hack-around for this: you can call a stored function that you wrote, which returns the value of the variable you want to be accessible to the view. But, oops, we're already out of the scope of "view".

Additionally, a view, like a query, has no ability to iterate through rows and do something in "this" row because of what happened in "that" row.

Granted, again, there are hacks with variables in queries that can emulate some of that behavior but on the best days they can only look backwards and not forwards, and they don't always "see" the rows in the same sequence that the rows are returned.

I see three possible approaches to this, if SQL is where you really want to do it:

Option 1 involves a stored procedure and a temporary table. A stored procedure can iterate through rows with a cursor, and you'd need a temporary table since stored procedures don't have arrays or hashes. Iterate through your source data, either with a cursor or in a loop that starts by finding the top parent node and then each subsequent select based on what you've found so far... populating the temporary table as you go, taking precautions to avoid infinite loops caused by circular references in parent_id values (gotcha!), and then SELECT from your temporary table within the proc to return its contents as a result set to the client.

You might even end up with a second stored procedure that you call from the first one, which then recursively calls itself, to traverse your tree and build your nested set.

Option 2 involves insert, update, and delete triggers on the group and user tables, which would rebuild your MPTT structures every time any modification is done to group and user.

This would, in a sense, be the "most correct" way to do it, since your left and right ids would never be inconsistent with the underlying data... but as @dogmatic69 has pointed out, it's a very expensive operation... and not without its own snags, since you're limited to what you can to to table_x while you're inside a trigger on table_x.

On the other hand, with this option, when you needed to see the tree, the work to build it has already been done... so, much faster on SELECT.

Option 3 is to calculate the tree values with something other than SQL... so, really, I lied earlier, and there were only two SQL options I've come up with. Calculate it all in PHP, Perl, etc., and then populate the database with the values you calculated, which is how I generally do it. :) But I get away with it, because the parent/child relationships in my databases are not updated by any other process. If they were, I'd be staring down the barrel of Option 2.

Well I know I can very easily do it with option 3, but I thought that MySQL is faster at these things than PHP. That is why I wanted to do it with MySQL. However, if you say that it makes no difference performance wise between option 2 and 3 then I will take option 3. What do you think?
–
EvertSep 20 '12 at 9:01

There are a lot of times when putting the logic and the data "as close together" as possible (i.e., letting the database server do all of the work including the logic itself) is clearly the optimum choice. Intuitively, this doesn't feel like one of those times, and in fact it seems like it would be distinctly slower... however, now, you've made me curious. I can't promise a timetable but I'm going to tinker with this on some of the data I work with on a daily basis and see how it ends up looking.
–
Michael - sqlbotSep 20 '12 at 22:44

I wrote a pair of stored procedures, build_nest() and nest_recurse(parent_id,last_lr_id,current_depth)... build_nest finds all of the root nodes with various tests (parent_id = my id, parent_id is null, parent_id doesn't exist in id) and calls nest_recurse on them... nest_recurse sets left_id on the current parent, then calls itself recursively on THOSE children, setting right_id on the parent_id before returning... it works, but so far my Perl implementation (which has the benefit of one massive SELECT at the start) is still faster. I'm still tinkering.
–
Michael - sqlbotSep 26 '12 at 2:05

Would you like to share both your SQL and your Perl code with me?
–
EvertSep 27 '12 at 3:19