This document provides a step-by-step introduction to the Rose::DB::Object module distribution.
It demonstrates all of the important features using a semi-realistic example database.
This tutorial does not replace the actual documentation for each module,
however.
The "reference" documentation found in each ".pm" file is still essential,
and contains some good examples of its own.

This tutorial provides a gradual introduction to Rose::DB::Object.
It also describes "best practices" for using Rose::DB::Object in the most robust,
maintainable manner.
If you're just trying to get a feel for what's possible,
you can skip to the end and take a look at the completed example database and associated Perl code.
But I recommend reading the tutorial from start to finish at least once.

The examples will start simple and get progressively more complex.
You,
the developer,
have to decide which level of complexity or abstraction is appropriate for your particular task.

Some of the examples in this tutorial will use the fictional My:: namespace prefix.
Some will use no prefix at all.
Your code should use whatever namespace you deem appropriate.
Usually,
it will be something like MyCorp::MyProject:: (i.e.,
your corporation,
organization,
and/or project).
I've chosen to use My:: or to omit the prefix entirely simply because this produces shorter class names,
which will help this tutorial stay within an 80-column width.

For the sake of brevity,
the use strict directive and associated "my" declarations have also been omitted from the example code.
Needless to say,
you should always use strict in your actual code.

Similarly,
the traditional "1;" true value used at the end of each ".pm" file has been omitted from the examples.
Don't forget to add this to the end of your actual Perl module files.

Although most of the examples in this tutorial use the base.pm module to set up inheritance,
directly modifying the @ISA package variable usually works just as well.
In situations where there are circular relationships between classes,
the use base ... form may be preferable because it runs at compile-time,
whereas @ISA modification happens at run-time.
In either case,
it's a good idea to set up inheritance as early as possible in each module.

package Product;
# Set up inheritance first
use base qw(Rose::DB::Object);
# Then do other stuff...
...

The PostgreSQL database will be used in the examples in this tutorial, but the features demonstrated will not be specific to that database. If you are following along with a different database, you may have to adjust the specific syntax used in the SQL table creation statements, but all of the same features should be present in some form.

This tutorial is based on a fictional database schema for a store-like application. Both the database schema the corresponding Perl classes will evolve over the course of this document.

Operations 2 through 6 are done through the setup method on the metadata object associated with this class. The table must have a primary key, and may have zero or more unique keys. The primary key and each unique key may contain multiple columns.

Now Product will create a My::DB object when it needs to connect to the database.

Note that the My::DB->new call in init_db() means that each Product object will have its own, private My::DB object. See the section below, "A brief digression: database objects", for an explanation of this setup and some alternatives.

Looking forward, it's likely that all of our Rose::DB::Object-derived classes will want to use My::DB objects when connecting to the database. It's tedious to repeat this code in all of those classes. A common base class can provide a single, shared location for that code.

This use of a common base class is strongly recommended. You will see this pattern repeated in the Rose::DB tutorial as well. The creation of seemingly "trivial" subclasses is a cheap and easy way to ensure ease of extensibility later on.

For example, imagine we want to add a copy() method to all of our database objects. If they all inherit directly from Rose::DB::Object, that's not easy to do. But if they all inherit from My::DB::Object, we can just add the copy() method to that class.

The lesson is simple: when in doubt, subclass. A few minutes spent now can save you a lot more time down the road.

If there is no row in the database table with the specified primary or unique key value, the call to load() will fail. Under the default error mode, an exception will be thrown. To safely check whether or not such a row exists, use the speculative parameter.

Since the primary key of the products table, id, is a SERIAL column, a new primary key value will be automatically generated if one is not specified. After the object is saved, we can retrieve the auto-generated value.

The examples above show SELECT, INSERT, UPDATE, and DELETE operations on one row at time based on primary or unique keys. What about manipulating rows based on other criteria? What about manipulating multiple rows simultaneously? Enter Rose::DB::Object::Manager, or just "the manager" for short.

But why is there a separate class for dealing with multiple objects? Why not simply add more methods to the object itself? Say, a search() method to go alongside load(), save(), delete() and friends? There are several reasons.

First, it's somewhat "semantically impure" for the class that represents a single row to also be the class that's used to fetch multiple rows. It's also important to keep the object method namespace as sparsely populated as possible. Each new object method prevents a column with the same name from using that method name. Rose::DB::Object tries to keep the list of reserved method names as small as possible.

Second, inevitably, classes grow. It's important for the object manager class to be separate from the object class itself so each class can grow happily in isolation, with no potential for namespace or functionality clashes.

All of that being said, Rose::DB::Object::Manager does include support for adding manager methods to the object class. Obviously, this practice is not recommended, but it exists if you really want it.

Anyway, let's see some examples. Making a manager class is simply a matter of inheriting from Rose::DB::Object::Manager, specifying the object class, and then creating a series of appropriately named wrapper methods.

The names are pretty much self-explanatory. You can read the Rose::DB::Object::Manager documentation for all the gory details. The important thing to note is that the methods were all named based on the "products" argument to make_manager_methods(). You can see how "products" has been incorporated into each of the method names.

This naming scheme is just a suggestion. You can name these methods anything you want (using the methods parameter to the make_manager_methods() call), or you can even write the methods yourself. Each of these methods is a merely a thin wrapper around the generically-named methods in Rose::DB::Object::Manager. The wrappers pass the specified object class to the generic methods.

The Perl code for the Product::Manager class shown above can be generated automatically by calling the perl_manager_class method on the Rose::DB::Object::Metadata that's associated with the Product class. Similarly, the make_manager_class method called on the Product metadata object will both generate the code and evaluate it for you, automating the entire process of creating a manager class from within your Rose::DB::Object-derived class.

If you decide not to heed my advice, but instead decide to create these methods inside your Rose::DB::Object-derived class directly, you can do so by calling make_manager_methods() from within your object class.

The most common task for the manager is fetching multiple objects. We'll use the get_products() method to do that. It's based on the get_objects() method, which takes many parameters.

One (optional) parameter is the now-familiar db object used to connect to the database. This parameter is valid for all Rose::DB::Object::Manager methods. In the absence of this parameter, the init_db() method of the object class will be called in order to create one.

Passing no arguments at all will simply fetch every Product object in the database.

This section has covered the bare minimum usage and functionality of the Rose::DB::Object module distribution. Using these features alone, you can automate the basic CRUD operations (Create, Retrieve, Update, and Delete) for single or multiple objects. But it's almost a shame to stop at this point. There's a lot more that Rose::DB::Object can do for you. The "sweet spot" of effort vs. results is much farther along the curve.

In the next section, we will expand upon our Product class and tap more of Rose::DB::Object's features. But first...

You can read the Rose::DB documentation to explore the capabilities of these db objects. Most of the time, you won't have to be concerned about them. But it's sometime useful to deal with them directly.

The first thing to understand is where the database object comes from. If the db attribute doesn't exist, it is created by calling init_db(). The typical init_db() method simply builds a new database object and returns it. (See the Rose::DB tutorial for an explanation of the possible arguments to new(), and why there are none in the call below.)

package Product;
...
sub init_db { My::DB->new }

This means that each Product object will have its own My::DB object, and therefore (in the absence of modules like Apache::DBI) its own connection to the database.

If this not what you want, you can make init_db() return the same My::DB object to every Product object. This will make it harder to ensure that the database handle will be closed when all Product objects go out of scope, but that may not be important for your application. The easiest way to do this is to call new_or_cached instead of new.

package Product;
...
sub init_db { My::DB->new_or_cached }

Since init_db() is only called if a Product object does not already have a db object, another way to share a single My::DB object with several Product objects is to do so explicitly, either by pre-creating the My::DB object:

But now we're faced with a few problems. First, while the status column only accepts a few pre-defined values, our Product object will gladly accept any status value. But maybe that's okay because the database will reject invalid values, causing an exception will be thrown when the object is saved.

The date/time fields are more troubling. What is the format of a valid value for a TIMESTAMP column in PostgreSQL? Consulting the PostgreSQL documentation will yield the answer, I suppose. But now all the code that uses Product objects has to be sure to format the date_created and release_date values accordingly. That's even more difficult if some of those values come from external sources, such as a web form.

Worse, what if we decide to change databases in the future? We'd have to hunt down every single place where a date_created or release_date value is set and then modify the formatting to match whatever format the new database wants. Oh, and we'll have to look that up too. Blah.

Finally, what about all those default values? The price column already had a default value, but now two more columns also have defaults. True, the database will take care of this when a row is inserted, but now the Perl object is diverging more and more from the database representation.

Let's solve all of these problems. If we more accurately describe the columns, Rose::DB::Object will do the rest.

Before examining what new functionality this new class gives us, there are a few things to note about the definition. First, the primary key is no longer specified with the primary_key_columns() method. Instead, the id column has its primary_key attribute set to a true value in its description.

Second, note the default value for the date_created column. It's a string containing a call to the PL/SQL function now(), which can actually only be run within the database. But thanks to the allow_inline_column_values attribute being set to a true value, Rose::DB::Object will pass the string "now()" through to the database as-is.

In the case of "creation date" columns like this, it's often better to let the database provide the value as close as possible to the very moment the row is created. On the other hand, this will mean that any newly created Product object will have a "strange" value for that column (the string "now()") until/unless it is re-loaded from the database. It's a trade-off.

Let's see the new Product class in action. The defaults work as expected.

Conveniently, Rose::DB::Object::Manager queries can also use any values that the corresponding column methods will accept. For example, here's a query that filters on the release_date column using a DateTime object.

The upshot is that you no longer have to be concerned about the details of the date/time format(s) understood by the underlying database. You're also free to use DateTime objects as a convenient interchange format in your code.

This ability isn't just limited to date/time columns. Any data type that requires special formatting in the database, and/or is more conveniently dealt with as a more "rich" value on the Perl side of the fence is fair game for this treatment.

Some other examples include the bitfield column type, which is represented by a Bit::Vector object on the Perl side, and the boolean column type which evaluates the "truth" of its arguments and coerces the value accordingly. In all cases, column values are automatically formatted as required by the native column data types in the database.

In some circumstances, Rose::DB::Object can even "fake" a data type for use with a database that does not natively support it. For example, the array column type is natively supported by PostgreSQL, but it will also work with MySQL using a VARCHAR column as a stand-in.

Finally, if you're concerned about the performance implications of "inflating" column values from strings and numbers into (relatively) large objects, rest assured that such inflation is only done as needed. For example, an object with ten date/time columns can be loaded, modified, and saved without ever creating a single DateTime object, provided that none of the date/time columns were among those whose values were modified.

Put another way, the methods that service the columns have an awareness of the producer and consumer of their data. When data is coming from the database, the column methods accept it as-is. When data is being sent to the database, it is formatted appropriately, if necessary. If a column value was not modified since it was loaded from the database, then the value that was loaded is simply returned as-is. In this way, data can make a round-trip without ever being inflated, deflated, or formatted.

This behavior is not a requirement of all column methods, but it is a recommended practice--one followed by all the column classes that are part of the Rose::DB::Object distribution.

The Product class set up in the previous section is useful, but it also takes significantly more typing to set up. Over the long term, it's still a clear win. On the other hand, a lot of the details in the column descriptions are already known by the database: column types, default values, maximum lengths, etc. It would be handy if we could ask the database for this information instead of looking it up and typing it in manually.

This process of interrogating the database in order to extract metadata is called "auto-initialization." There's an entire section of the Rose::DB::Object::Metadata documentation dedicated to the topic. The executive summary is that auto-initialization saves work in the short-run, but with some long-term costs. Read the friendly manual for the details. For the purposes of this tutorial, I will simply demonstrate the features, culminating in the suggested best practice.

Believe it or not, that class is equivalent to the previous incarnation, right down to the details of the columns and the unique key. As long as the table is specified, Rose::DB::Object will dig all the rest of the information out of the database. Handy!

In fact, that class can be shortened even further with the help of the convention manager.

Now even the table is left unspecified. How does Rose::DB::Object know what to do in this case? Why, by convention, of course. The default convention manager dictates that class names are singular and TitleCased, and their corresponding table names are lowercase and plural. Thus, the omitted table name in the Product class is, by convention, assumed to be named "products".

Like auto-initialization, the convention manager is handy, but may also present some maintenance issues. I tend to favor a more explicitly approach, but I can also imagine scenarios where the convention manager is a good fit.

Keep in mind that customized convention managers are possible, allowing individual organizations or projects to define their own conventions. You can read all about it in the Rose::DB::Object::ConventionManager documentation.

Anyway, back to auto-initialization. Yes, auto_initialize() will dig out all sorts of interesting and important information for you. Unfortunately, it will dig that information out every single time the class is loaded. Worse, this class will fail to load at all if a database connection is not immediately available.

Auto-initialization seems like something that is best done only once, with the results being saved in a more conventional form. That's just what Rose::DB::Object::Metadata's code generation functions are designed to do. The perl_* family of methods can generate snippets of Perl code, or even entire classes, based on the results of the auto-initialization process. They'll even honor some basic code formatting directives.

Copy and paste that output back into the "Product.pm" file and you're in business.

The door is open to further automation through scripts that call the methods demonstrated above. Although it's my inclination to work towards a static, explicit type of class definition, the tools are there for those who prefer a more dynamic approach.

When a column in one table references a row in another table, the referring table is said to have a "foreign key." As with primary and unique keys, Rose::DB::Object supports foreign keys made up of more than one column.

In the context of Rose::DB::Object, a foreign key is a database-supported construct that ensures that any non-null value in a foreign key column actually refers to an existing row in the foreign table. Databases that enforce this constraint are said to support "referential integrity." Foreign keys are only applicable to Rose::DB::Object-derived classes when the underlying database supports "native" foreign keys and enforces referential integrity.

While it's possible to define foreign keys in a Rose::DB::Object-derived class even if there is no support for them in the database, this is considered bad practice. If you're just trying to express some sort of relationship between two tables, there's a more appropriate way to do so. (More on that in the next section.)

Let's add a foreign key to the products table. First, we'll need to create the table that the foreign key will reference.

Note that a vendor_id column is added to the column list. This needs to be done independently of any foreign key definition. It's a new column, so it needs to be in the column list. There's nothing more to it than that.

There's also the foreign key definition itself. The name/hashref-value pair passed to the foreign_keys() method is (roughly) shorthand for this.

In other words, vendor is the name of the foreign key, and the rest of the information is used to set attributes on the foreign key object. You could, in fact, construct your own foreign key objects and pass them to foreign_keys() (or add_foreign_keys(), etc.) but that would require even more typing.

Going in the other direction, since our class and column names match up with what the convention manager expects, we could actually shorten the foreign key setup code to this.

foreign_keys => [ 'vendor' ],

Given only a foreign key name, the convention manager will derive the Vendor class name and will find the vendor_id column in the Product class and match it up to the primary key of the vendors table. As with most things in Rose::DB::Object class setup, you can be as explicit or as terse as you feel comfortable with, depending on how closely you conform to the expected conventions.

So, what does this new vendor foreign key do for us? Let's add some data and see. Imagine the following two objects.

Note the use of the idiomatic way to create and then save an object in "one step." This is possible because both the new and save methods return the object itself. Anyway, let's link the two objects. One way to do it is to set the column values directly.

$p->vendor_id($v->id);
$p->save;

To use this technique, we must know which columns link to which other columns, of course. But it works. We can see this by calling the method named after the foreign key itself: vendor().

$v = $p->vendor; # Vendor object
print $v->name; # "Acme"

The vendor() method can be used to link the two objects as well. Let's start over and try it that way:

Remember that there is no column named "vendor" in the "products" table. There is a "vendor_id" column, which has its own vendor_id() get/set method that accepts and returns an integer value, but that's not what we're doing in the example above. Instead, we're calling the vendor() method, which accepts and returns an entire Vendor object.

The vendor() method actually accepts several different kinds of arguments, all of which it inflates into Vendor objects. An already-formed Vendor object was passed above, but other formats are possible. Imagine a new product also made by Smith.

Here the arguments passed to the vendor() method are name/value pairs which will be used to construct the appropriate Vendor object. Since name is a unique key in the vendors table, the Vendor class can look up the existing vendor named "Smith" and assign it to the "Rope" product.

If no vendor named "Smith" existed, one would have been created when the product was saved. In this case, the save process would take place within a transaction (assuming the database supports transactions) to ensure that both the product and vendor are created successfully, or neither is.

Like the name/value pair argument format, a primary key value will be used to construct the appropriate object. (This only works if the foreign table has a single-column primary key, of course.) And like before, if such an object doesn't exist, it will be created. But in this case, if no existing vendor object had an id of 1, the attempt to create one would have failed because the name column of the inserted row would have been null.

To summarize, the foreign key method can take arguments in these forms.

An object of the appropriate class.

Name/value pairs used to construct such an object.

A reference to a hash containing name/value pairs used to construct such an object.

A primary key value (but only if the foreign table has a single-column primary key).

In each case, the foreign object will be added to the database it if does not already exist there. This all happens when the "parent" (Product) object is saved. Until then, nothing is stored in the database.

There's also another method created in response to the foreign key definition. This one allows the foreign object to be deleted from the database.

Again, the actual database modification takes place when the parent object is saved. Note that this operation will fail if any other rows in the products table still reference the Acme vendor. And again, since this all takes place within a transaction (where supported), the entire operation will fail or succeed as a single unit.

Finally, if we want to simply disassociate a product from its vendor, we can simply set the vendor to undef.

$p->vendor(undef); # This product has no vendor
$p->save;

Setting the vendor_id column directly has the same effect, of course.

$p->vendor_id(undef); # set vendor_id = NULL
$p->save;

Before moving on to the next section, here's a brief note about auto-initialization and foreign keys. Since foreign keys are a construct of the database itself, the auto-initialization process can actually discover them and create the appropriate foreign key metadata.

Since all of the column and table names are still in sync with the expected conventions, the Product class can still be defined like this:

Foreign keys are a database-native representation of a specific kind of inter-table relationship. This concept can be further generalized to encompass other kinds of relationships as well. But before we delve into that, let's consider the kind of relationship that a foreign key represents.

In the product and vendor example in the previous section, each product has one vendor. (Actually it can have zero or one vendor, since the vendor_id column allows NULL values. But for now, we'll leave that aside.)

When viewed in terms of the participating tables, things look slightly different. Earlier, we established that several products can have the same vendor. So the inter-table relationship is actually this: many rows from the products table may refer to one row from the vendors table.

Rose::DB::Object describes inter-table relationships from the perspective of a given table by using the cardinality of the "local" table (products) followed by the cardinality of the "remote" table (vendors). The foreign key in the products table (and Product class) therefore represents a "many to one" relationship.

If the relationship were different and each vendor was only allowed to have a single product, then the relationship would be "one to one." Given only the foreign key definition as it exists in the database, it's not possible to determine whether the relationship is "many to one" or "one to one." The default is "many to one" because that's the less restrictive choice.

To override the default, a relationship type string can be included in the foreign key description.

Even foreign keys are included under the umbrella of this concept. When foreign key metadata is added to a Rose::DB::Object-derived class, a corresponding "many to one" or "one to one" relationship is actually added as well. This relationship is simply a proxy for the foreign key. It exists so that the set of relationship objects encompasses all relationships, even those that correspond to foreign keys in the database. This makes iterating over all relationships in a class a simple affair.

Given the two possible cardinalities, "many" and "one", it's easy to come up with a list of all possible inter-table relationships. Here they are, listed with their corresponding relationship object classes.

one to one - Rose::DB::Object::Metadata::Relationship::OneToOne
one to many - Rose::DB::Object::Metadata::Relationship::OneToMany
many to one - Rose::DB::Object::Metadata::Relationship::ManyToOne
many to many - Rose::DB::Object::Metadata::Relationship::ManyToMany

We've already seen that "one to one" and "many to one" relationships can be represented by foreign keys in the database, but that's not a requirement. It's perfectly possible to have either of those two kinds of relationships in a database that has no native support for foreign keys. (MySQL using the MyISAM storage engine is a common example.)

If you find yourself using such a database, there's no reason to lie to your Perl classes by adding foreign key metadata. Instead, simply add a relationship.

Here's an example of our Product class as it might exist on a database that does not support foreign keys. (The Product class is getting larger now, so previously established portions may be omitted from now on.)

They syntax and semantics are similar to those described for foreign keys. The only slight differences are the names and types of parameters accepted by relationship objects.

In the example above, a "many to one" relationship named "vendor" is set up. As demonstrated before, this definition can be reduced much further, allowing the convention manager to fill in the details. But unlike the case with the foreign key definition, where only the name was supplied, we must provide the relationship type as well.

relationships => [ vendor => { type => 'many to one' } ],

There's an even more convenient shorthand for that:

relationships => [ vendor => 'many to one' ],

(Again, this all depends on naming the tables, classes, and columns in accordance with the expectations of the convention manager.) The resulting vendor() and delete_vendor() methods behave exactly the same as the methods created on behalf of the foreign key definition.

Here, the product_id column in the local table (prices) is connected to the id column in the foreign table (products).

The methods created by "... to many" relationships behave much like their "... to one" and foreign key counterparts. The main difference is that lists or references to arrays of the previously described argument formats are also acceptable, while name/value pairs outside of a hashref are not.

Here's a list of argument types accepted by "many to one" methods like prices.

A list or reference to an array of objects of the appropriate class.

A list or reference to an array of hash references containing name/value pairs used to construct such objects.

A list or reference to an array of primary key values (but only if the foreign table has a single-column primary key).

Setting a new list of prices will delete all the old prices. As with foreign keys, any actual database modification happens when the parent object is saved. Here are some examples.

Since rows in the prices table now link to rows in the products table, a product cannot be deleted until all of the prices that refer to it are also deleted. There are a few ways to deal with this.

The best solution is to add a trigger to the products table itself in the database that makes sure to delete any associated prices before deleting a product. This change will allow the naive approach shown above to work correctly.

A less robust solution is necessary if your database does not support triggers. One such solution is to manually delete the prices before deleting the product. This can be done in several ways. The prices can be deleted directly, like this.

The final relationship type is the most complex. In a "many to many" relationship, a single row in table A may be related to multiple rows in table B, while a single row in table B may also be related to multiple rows in table A. (Confused? A concrete example will follow shortly.)

This kind of relationship involves three tables instead of just two. The "local" and "foreign" tables, familiar from the other relationship types described above, still exist, but now there's a third table that connects rows from those two tables. This third table is called the "mapping table," and the Rose::DB::Object-derived class that fronts it is called the "map class."

Let's add such a relationship to our growing family of classes. Imagine that each product may come in several colors. Right away, both the "one to one" and "many to one" relationship types are eliminated since they can only provide a single color for any given product.

But wait, isn't a "one to many" relationship suitable? After all, one product may have many colors. Unfortunately, such a relationship is wasteful in this case. Let's see why. Imagine a colors table like this.

Notice that the colors "green" and "red" appear twice. Now imagine that there are 50,000 products. What are the odds that there will be more than a few colors in common among them?

This is a poor database design. To fix it, we must recognize that colors will be shared among products, since the set of possible colors is relatively small compared to the set of possible products. One product may have many colors, but one color may also belong to many products. And there you have it: a textbook "many to many" relationship.

Let's redesign this relationship in "many to many" form, starting with a new version of the colors table.

Note that only the map class needs to be used in the Product class. The relationship definition specifies the name of the map class, and (optionally) the names of the foreign keys or "many to one" relationships in the map class that connect the two tables.

In most cases, these two parameters (map_from and map_to) are unnecessary. Rose::DB::Object will figure out what to do given only the map class, so long as there's no ambiguity in the mapping table.

In this case, there is no ambiguity, so the relationship definition can be shortened to this.

Finally, the same caveats described earlier about deleting products that have associated prices apply to colors as well. Again, I recommend using a trigger in the database to handle this, but Rose::DB::Object's cascading delete feature will work in a pinch.

# Delete all associated rows in the prices table, plus any
# rows in the product_color_map table, before deleting the
# row in the products table.
$p->delete(cascade => 1);

To summarize this exploration of inter-table relationships, here's a terse summary of the current state of our Perl classes, and the associated database tables.

For the sake of brevity, I've chosen to use the shorter versions of the foreign key and relationship definitions in the Perl classes shown below. Just remember that this only works when your tables, columns, and classes are named according to the expected conventions.

If the code above still looks like too much work to you, try letting Rose::DB::Object::Loader do it all for you. Given the database schema shown above, the suite of associated Perl classes could have been created automatically with a single method call.

You can also ask the loader to make actual Perl modules (that is, a set of actual *.pm files in the file system) by calling the aptly named make_modules method.

The code created by the loader is shown below. Compare it to the manually created Perl code shown above and you'll see that it's nearly identical. Again, careful table name choices really help here. Do what the convention manager expects (or write your own convention manager subclass that does what you expect) and automation like this can work very well.

The Product::Manager class we created earlier is deceptively simple. Setting it up can actually be reduced to a one-liner, but it provides a rich set of features.

The basics demonstrated earlier cover most kinds of single-table SELECT statements. But as the Product class has become more complex, linking to other objects via foreign keys and other relationships, selecting rows from just the products table has become a lot less appealing. What good is it to retrieve hundreds of products in a single query when you then have to execute hundreds of individual queries to get the prices of those products?

This is what SQL JOINs were made for: selecting related rows from multiple tables simultaneously. Rose::DB::Object::Manager supports a two kinds of joins. The interface to this functionality is presented in terms of objects via the require_objects and with_objects parameters to the get_objects() method.

Both parameters expect a list of foreign key or relationship names. The require_objects parameters will use an "inner join" to fetch related objects, while the with_objects parameter will perform an "outer join."

If you're unfamiliar with these terms, it's probably a good idea to learn about them from a good SQL book or web tutorial. But even if you've never written an SQL JOIN by hand, there's not much you need to understand in order to use your manager class effectively.

The rule of thumb is simple. When you want each and every object returned by your query to have a particular related object, then use the require_objects parameter. But if you do not want to exclude objects even if they do not have a particular related object attached to them yet, then use the with_objects parameter.

Sometimes, this decision is already made for you by the table structure. For example, let's modify the products table in order to require that every single product has a vendor. To do so, we'll change the vendor_id column definition from this:

vendor_id INT REFERENCES vendors (id)

to this:

vendor_id INT NOT NULL REFERENCES vendors (id)

Now it's impossible for a product to have a NULL vendor_id. And since our database enforces referential integrity, it's also impossible for the vendor_id column to have a value that does not refer to the id of an existing row in the vendors table.

While the with_objects parameter could technically be used to fetch Products with their associated Vendor objects, it would be wasteful. (Outer joins are often less efficient than inner joins.) The table structure basically dictates that the require_objects parameter be used when fetching Products with their Vendors.

As you can see, the query includes "tN" aliases for each table. This is important because columns in separate tables often have identical names. For example, both the products and the vendors tables have columns named id and name.

In the query, you'll notice that the name => { like => 'Kite%' } argument ended up filtering on the product name rather than the vendor name. This is intentional. Any unqualified column name that is ambiguous is considered to belong to the "primary" table (products, in this case).

The "tN" numbering is deterministic. The primary table is always "t1", and secondary tables are assigned ascending numbers starting from there. You can find a full explanation of the numbering rules in the Rose::DB::Object::Manager documentation.

In the example above, if we wanted to filter and sort on the vendor name instead, we could do this.

Now let's see an example of the with_objects parameter in action. Each Product has zero or more Prices. Let's fetch products with all their associated prices. And remember that some of these products may have no prices at all.

Each Product also has zero or more Colors which are related to it through a mapping table (fronted by the ProductColorMap class, but we don't need to know that). The with_objects parameter can handle that as well.

Anyone who knows SQL well will recognize that there is a danger lurking when combining JOINs. Multiple joins that each fetch multiple rows can result in a geometric explosion of rows returned by the database. For example, the number of rows returned when fetching products with their associated prices and colors would be:

<number of matching products> x
<number of prices for each product> x
<number of colors for each product>

That number can get very large, very fast if products have many prices, colors, or both. (The last two terms in the multiplication maybe switched, depending on the order of the actual JOIN clauses, but the results are similar.) And the problem only gets worse as the number of objects related by "... to many" relationships increases.

That said, Rose::DB::Object::Manager does allow multiple objects related by "... to many" relationships to be fetched simultaneously. But it requires the developer to supply the multi_many_ok parameter with a true value as a form of confirmation. "Yes, I know the risks, but I want to do it anyway."

As an example, let's try fetching products with their associated prices, colors, and vendors. To do so, we'll have to include the multi_many_ok parameter.

It's questionable whether this five-way join will be faster than doing a four- or three-way join and then fetching the other information after the fact, with separate queries. It all depends on the number of rows expected to match. Only you know your data. You must choose the most efficient query that suits your needs.

Moving beyond even the example above, it's possible to chain foreign key or relationship names to an arbitrary depth. For example, imagine that each Vendor has a Region related to it by a foreign key named "region". The following call will get region information for each product's vendor, filtering on the region name.

The same caveat about performance and the potential explosion of redundant data when JOINing across multiple "... to many" relationships also applies to the "chained" selectors demonstrated above--even more so, in fact, as the depth of the chain increases. That said, it's usually safe to go a few levels deep into "... to one" relationships when using the require_objects parameter.

Finally, it's also possible to load a single product with all of its associated foreign objects. The load() method accepts a with parameter that takes a list of foreign key and relationship names.

The same "multi many" caveats apply, but the multi_many_ok parameter is not required in this case. The assumption is that a single object won't have too many related objects. But again, only you know your data, so be careful.

I hope you've learned something from this tutorial. Although it is by no means a complete tour of all of the features of Rose::DB::Object, it does hit most of the highlights. This tutorial will likely expand in the future, and a separate document describing the various ways that Rose::DB::Object can be extended is also planned. For now, there is a brief overview that was pulled from the Rose::DB::Object mailing list in the wiki.