Implementing a data mapper

I've been thinking lately that the ActiveRecord approach to ORM I've been using isn't flexible enough, and I've been looking into other patterns. I'm aware of the ones discussed in PoEAA, but not having read the book, I'm not 100% on what the best implementation would be. Here are some ideas I have, though:

The first is to have the mapper be contained by the model, passed as a parameter in the constructor, like this:

PHP Code:

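<?php

class Model {
    // The mapper that handles persistence for this model
    private $mapper;

    public function __construct($mapper) {
        $this->mapper = $mapper;
    }

    // Delegates persistence to the injected mapper
    public function save() {
        $this->mapper->save($this);
    }
}

class Mapper {
    public function save($model) {
        // Persistence logic goes here
    }
}

$model = new Model(new Mapper());

$model->save();

?>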

I like this approach because it achieves the decoupling of the data access code from the business logic with very little change to application code; simply the extra parameter in the constructor.

However, the code could be even further decoupled like this:

PHP Code:

<?php

class Model {

}

class Mapper {
    function save($model) {

    }
}

$model = new Model();
$mapper = new Mapper();

$mapper->save($model);

?>

However, this would require rewriting a lot of the code that uses the model, and I don't currently see what advantages it has over the other approach.

Are there any other approaches that are worth looking into? How is the Data Mapper pattern typically implemented? And how about the Row and the Table Data Gateway patterns?

Just wondering, I know that in general loose coupling is preferred but why exactly should the Model and Data Mapper be decoupled? In any decent separation the Data Mapper will be used only by the Model layer and personally, I'd say it's part of the Model layer in that it should be separated from the rest of the application.

This would mean that in OOD terms, the relationship would be composition whereas with your examples, a different layer creates (aggregates) the Mapper and the Model merely sends messages (association) to it.

How about when you want to use a different mapper? Creating the mapper sometimes depends on some decision-making and maybe configuration, so the models shouldn't bear the responsibility of creating them.

Besides, if you use a database abstraction layer, even the mapper has to be decoupled from it, so what you get is:

which means that, unless you want the model to be tied to an application context (i.e. retrieve the $db from a registry/singleton/godforbid-global), you have to extract the creation of the mapper and place it outside. Or maybe provide a factory class that does the code above and is context-aware.

This also allows different models to utilise different databases, which I find far more flexible.

The Database Access Layer is part of the Model in any 3-tiered application (which I'm assuming is the context of this discussion). Therefore, it shouldn't be globally accessible. What the hell is a view or controller going to do with a database!?

Can you please post here some of the inflexibilities you are dealing with? Thanks.

I can't speak for 33degrees, but a thing that keeps me from picking up one of the nice ORM-layers out there is the single-table approach they all seem to share. There is no reason why an ActiveRecord can't span several tables, and in fact it could be quite useful with multi-table inheritance.

We must have very different ideas of what a Controller does. In my opinion, a controller doesn't take the data from the database and place it in the model (and vice versa); it merely co-ordinates the creation, operation and collaboration of the Model with the View. For me, the Model is responsible for interacting with external data sources.

Everyone is entitled to their own opinions and interpretations I suppose.

What's wrong with the following?
...
This also allows different models to utilise different databases, which I find far more flexible.

In my opinion, at least two things are "wrong" here:

1) you're retrieving/constructing a database from inside of the model, which couples the model with the DatabaseFactory class. This is not flexible enough, if you want to reuse the model.

2) the model is aware of a completely irrelevant piece of data (irrelevant to the model, that is): the database type string. To provide some portability, you'd have to pass that string to the Model constructor, so _it_ could pass it down to the Mapper, so the proper DB gets constructed, right? Instead, you could just pass an existing mapper to the model (a model factory does that), which results in two good side-effects:
- switching to another db only happens at the construction time of the models (from within the factory), not from the models themselves. It's not the model's responsibility to be aware of the database in use.
- models could reuse the mapper instance, instead of each constructing their own. While we could argue about which is better, it's definitely more resource-friendly to reuse an instance. After all, in a large application there could be thousands of models in use at some time.

The AppModelFactory is the only class tied to a context. It only provides some application-specific glue, and it's lightweight and easy to port. It can cache mappers for later re-use if the same types of models are requested. The mapper and model are decoupled to some extent: the model is not aware of what type of mapper it uses, it relies on it to be the valid one.

The database/mapper/model switching is concentrated in these few lines of code for the whole project. The logic of selecting the db lies here.
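A factory along the lines described above might be sketched roughly as follows. This is only an illustration, not the poster's code: the Database, Mapper and Model classes, the getMapper() accessor and the create() signature are all assumptions made for the example.

```php
<?php

// Hypothetical stand-in for a database abstraction object.
class Database {
    public $type;

    public function __construct($type) {
        $this->type = $type;
    }
}

// The mapper receives its database; it never constructs one itself.
class Mapper {
    private $db;

    public function __construct(Database $db) {
        $this->db = $db;
    }

    public function save($model) {
        // A real mapper would write $model to $this->db here.
        return true;
    }
}

// The model receives a ready-made mapper and knows nothing about the database.
class Model {
    private $mapper;

    public function __construct(Mapper $mapper) {
        $this->mapper = $mapper;
    }

    public function save() {
        return $this->mapper->save($this);
    }

    public function getMapper() {
        return $this->mapper;
    }
}

// The only class that knows which database the application uses.
class AppModelFactory {
    private $db;
    private $mappers = array();

    public function __construct(Database $db) {
        $this->db = $db;
    }

    public function create($modelClass, $mapperClass = 'Mapper') {
        // Cache one mapper per mapper class so that models can share it.
        if (!isset($this->mappers[$mapperClass])) {
            $this->mappers[$mapperClass] = new $mapperClass($this->db);
        }
        return new $modelClass($this->mappers[$mapperClass]);
    }
}

$factory = new AppModelFactory(new Database('sqlite'));
$a = $factory->create('Model');
$b = $factory->create('Model');
```

Both models end up sharing one mapper instance, and switching databases only means changing the single line that builds the factory.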

// Class name assumed from the usage of ExampleMapper in the controller code
class ExampleMapper {
    private $model;
    private $db;

    public function __construct($model) {
        $this->model = $model;
        // Creates an instance of the DB abstraction. Therefore, Mapper composes Database
        // The database "type" could be taken from a registry, passed through a parameter, etc.
        $this->db = DatabaseFactory::create('sqlite');
    }

    public function insert() {
        // Utilise $this->db and $this->model to map $this->model->getData() to a table or tables
    }
}

Now, when we look at the controller, the controller no longer has anything to do with the database and is in fact working solely with the Data Model layer.

PHP Code:

// Controller aggregates Model (nothing new here)
$model = new ExampleModel();
$model->setData('foo');
// Controller also aggregates Mapper (this, therefore, can be considered part of the Data Model layer).
$mapper = new ExampleMapper($model);
$mapper->insert();

I'd throw together some UML to illustrate this further, but I don't have a UML editor at work.

After re-reading your post, I must admit I got it wrong. I meant that the database is the model (or part of it). But you're right that when using a DataMapper, the object model gets separated from the storage (database). Using TableGateway/ActiveRecord, the separation isn't as sharp.

Agreed, though I would still class the Model, the Mapper and the Database Abstraction as all being part of the Data Model layer. While they are loosely separated, in a 3-tier design I'd place them firmly in the Data Model.

Heh, don't worry, I got a few things wrong too (my examples, for one thing). But instead of drawing attention to that, I've opted to draw attention to your admission of incorrectness in the hope that no one will notice mine :P

No sweat.

I noticed that in the following code, you'd be creating an instance of ExampleMapper for each instance of ExampleModel. Is this on purpose or just for the sake of simplifying the example?

PHP Code:

// Controller aggregates Model (nothing new here)
$model = new ExampleModel();
$model->setData('foo');
// Controller also aggregates Mapper (this, therefore, can be considered part of the Data Model layer).
$mapper = new ExampleMapper($model);
$mapper->insert();

It seems logical that you'd only need one Mapper per class, not one per object. So rather something like:

That's a good point, I hadn't thought of it. Your solution appears to be a satisfactory way to solve it (assuming that your access of a registry is interchangeable with whatever means people would want to use).

Of course, the method signature for ExampleMapper->insert would change to:

PHP Code:

public function insert(ExampleModel $model)

and perhaps even the method signature for Mapper->insert could change to:

Can you please post here some of the inflexibilities you are dealing with? Thanks.

There are two main issues I'm having. The first is that, from a testing perspective, it's easier to run tests on my business logic with the persistence mechanism abstracted out; I can simply replace it with a mock mapper. The other is that I want to get away from models being necessarily tied to a database, and make persistence optional. This would have the added benefit of being able to change persistence mechanisms at run time, loading objects from one place and then saving them to another. A good example would be an Email object. Using different types of mappers, I could load it from an IMAP mailbox, keep it in the session for modifying, save it in the database, write it out to a text file, etc.
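To make the Email example concrete, here is a minimal sketch of swapping persistence mechanisms at run time. The EmailMapper interface, both mapper classes and the file naming scheme are assumptions invented for illustration; an IMAP or database mapper would implement the same interface.

```php
<?php

// The model holds data and business logic only; it has no persistence code.
class Email {
    public $id;
    public $subject;

    public function __construct($id, $subject) {
        $this->id = $id;
        $this->subject = $subject;
    }
}

// Hypothetical interface that every persistence mechanism implements.
interface EmailMapper {
    public function save(Email $email);
    public function load($id);
}

// Keeps emails in an array; also usable as a stand-in mapper in tests.
class InMemoryEmailMapper implements EmailMapper {
    private $store = array();

    public function save(Email $email) {
        $this->store[$email->id] = clone $email;
    }

    public function load($id) {
        return isset($this->store[$id]) ? clone $this->store[$id] : null;
    }
}

// Writes each email to its own JSON text file in a directory.
class FileEmailMapper implements EmailMapper {
    private $dir;

    public function __construct($dir) {
        $this->dir = $dir;
    }

    public function save(Email $email) {
        $data = array('id' => $email->id, 'subject' => $email->subject);
        file_put_contents($this->path($email->id), json_encode($data));
    }

    public function load($id) {
        $path = $this->path($id);
        if (!is_file($path)) {
            return null;
        }
        $data = json_decode(file_get_contents($path), true);
        return new Email($data['id'], $data['subject']);
    }

    private function path($id) {
        return $this->dir . '/email_' . (int) $id . '.json';
    }
}

// Load from one place, modify, then save to another.
$memory = new InMemoryEmailMapper();
$files = new FileEmailMapper(sys_get_temp_dir());

$memory->save(new Email(1, 'Hello'));
$email = $memory->load(1);
$email->subject = 'Hello again';
$files->save($email);
$copy = $files->load(1);
```

Because the Email class never mentions a mapper, business logic can be tested with the in-memory mapper alone and no database.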

Originally Posted by kyberfabrikken

I can't speak for 33degrees, but a thing that keeps me from picking up one of the nice ORM-layers out there is the single-table approach they all seem to share. There is no reason why an ActiveRecord can't span several tables, and in fact it could be quite useful with multi-table inheritance.

Inheritance is an issue, but as you said, it can be implemented using ActiveRecord and is not one of the reasons why I was looking into mappers.

Originally Posted by kyberfabrikken

It seems logical that you'd only need one Mapper per class - not one per object. So rather something like

Actually, my goal would be to have a single Mapper per persistence mechanism; I would rather avoid having to code a Mapper per Model per datasource...

How would you structure your domain classes such that the mapper would be able to figure out how to perform the CRUD operations? Would there be a standard method that would basically list all fields?

Clearly, there's a certain amount of metadata needed for CRUD, and in an ActiveRecord approach it can all be stuffed into the Model. For a mapper, I think field information would be found in the model, whereas persistence-implementation-specific data would be in the mapper (table name, etc). In my current ORM system, I use a schema object to store this metadata, so that it can be stored in a variety of formats (e.g. simply serialised to the file system), or read directly from the database, which is handy when you're developing and your tables are changing often. I think I would probably do something similar with a DataMapper approach...
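As a rough sketch of the schema-object idea, one generic mapper could build its SQL from metadata instead of needing a hand-written mapper per model. The Schema and SqlMapper classes and the buildInsert() method are hypothetical names for this example, and for simplicity the field list lives in the schema rather than in the model. The mapper returns the statement and its parameters instead of executing them, so no database is needed to follow along.

```php
<?php

// Hypothetical schema object: table name and field list for one model class.
class Schema {
    public $table;
    public $fields;

    public function __construct($table, array $fields) {
        $this->table = $table;
        $this->fields = $fields;
    }
}

// A plain model with public fields and no persistence code.
class Article {
    public $title = '';
    public $body = '';
}

// One generic mapper that works for any model described by a Schema.
class SqlMapper {
    private $schemas = array();

    public function register($modelClass, Schema $schema) {
        $this->schemas[$modelClass] = $schema;
    }

    // Returns array($sql, $params) for an INSERT with named placeholders.
    public function buildInsert($model) {
        $schema = $this->schemas[get_class($model)];
        $params = array();
        foreach ($schema->fields as $field) {
            $params[':' . $field] = $model->$field;
        }
        $sql = sprintf(
            'INSERT INTO %s (%s) VALUES (%s)',
            $schema->table,
            implode(', ', $schema->fields),
            implode(', ', array_keys($params))
        );
        return array($sql, $params);
    }
}

$mapper = new SqlMapper();
$mapper->register('Article', new Schema('articles', array('title', 'body')));

$article = new Article();
$article->title = 'Data mappers';
$article->body = 'Draft';

list($sql, $params) = $mapper->buildInsert($article);
```

The resulting statement and parameter array could then be handed to whatever database layer is in use, for example a prepared statement.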

Originally Posted by gaialucien

In addition, how do you plan on going about the different finder methods if you have only one data mapper?

There are two options I can think of offhand, either the finder methods go into the data mapper, or there's a third Finder object that takes care of this. Not being too familiar with the pattern itself, I'm not sure what the recommended approach would be.
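The second option, a separate Finder object, could look something like the sketch below. Everything here is an assumption for illustration: the class names, the rows() and hydrate() helpers, and the array that stands in for a database table. With the first option, findById() and findByName() would simply be methods on the mapper itself.

```php
<?php

class User {
    public $id;
    public $name;

    public function __construct($id, $name) {
        $this->id = $id;
        $this->name = $name;
    }
}

// The mapper handles writing and turning raw rows back into objects.
class UserMapper {
    private $rows = array(); // stand-in for a database table

    public function save(User $user) {
        $this->rows[$user->id] = array('id' => $user->id, 'name' => $user->name);
    }

    public function rows() {
        return $this->rows;
    }

    public function hydrate(array $row) {
        return new User($row['id'], $row['name']);
    }
}

// The finder only knows how to query; it relies on the mapper to build objects.
class UserFinder {
    private $mapper;

    public function __construct(UserMapper $mapper) {
        $this->mapper = $mapper;
    }

    public function findById($id) {
        $rows = $this->mapper->rows();
        return isset($rows[$id]) ? $this->mapper->hydrate($rows[$id]) : null;
    }

    public function findByName($name) {
        $found = array();
        foreach ($this->mapper->rows() as $row) {
            if ($row['name'] === $name) {
                $found[] = $this->mapper->hydrate($row);
            }
        }
        return $found;
    }
}

$mapper = new UserMapper();
$mapper->save(new User(1, 'alice'));
$mapper->save(new User(2, 'bob'));

$finder = new UserFinder($mapper);
$alice = $finder->findById(1);
$bobs = $finder->findByName('bob');
```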

I know this thread is really old, but I was just wondering if anyone is now aware of anything out there that implements the data mapper pattern properly. There is phpDataMapper (phpdatamapper.com), but it's in very early stages and at the moment it only supports MySQL; I need Postgres support.

I was just about to ask about ZF's data mapper! It didn't seem like the Table gateway was doing a full data mapper job but it seems pretty close though.

Is my understanding correct that ZF doesn't do a data mapper because Zend_Db_Table doesn't take domain entity objects as its arguments for inserts, updates and deletes? Also, it returns Zend_Db_Table_Row and Zend_Db_Table_Rowset for selects instead of entity objects?