Application Development using Catalyst, Moose, Plack, DBIx::Class and other Modern Perl software!

September 2008

09/08/2008

Exciting times continue for Catalyst, one of Perl's leading web development frameworks. In a previous interview, we spoke with Groditi and Konobi, the lead implementers of the project to port Catalyst to the Moose object-oriented development toolkit. We spoke broadly about the value of Moose to the Perl community and why having Catalyst on Moose is going to help drive adoption, ease development and increase quality for both Catalyst and Perl development in general. Today we follow up on this topic with Matt S. Trout (a.k.a. 'mst'), a luminary figure in the Catalyst community. Matt is on the core Catalyst team and is the founder of the DBIC project, one of the most complete and popular Object-Relational Mappers for Perl. He is also an active Moose evangelist and member of the Moose community, contributes to many other Perl projects, and administers many of the tools Catalyst developers use, such as our version control repository, mailing lists and IRC channels. He is currently the lead tech and cofounder of a UK-based consulting company (http://www.shadowcat.co.uk/).

In today's interview, Matt expands on the goals of the Catalyst to Moose porting project (“Catamoose”). He also helps us understand the Catalyst development roadmap while musing on some of the 'shiny cool' features Moose is going to enable, and speaks a bit about Perl and our community in general. Along the way, we will try to provide some details and examples of what these changes will mean for Catalyst developers.

<question>: How do you feel the Catamoose port is shaping up? Where are we at with this project?

<mst>: There's two fairly distinct phases to this, just like there will be for DBIx::Class (a.k.a. DBIC). The first phase is to switch all of Catalyst core over to being Moose classes, with compatibility (to the current stable version). That's now done except that we can't create an “inlined” constructor yet because the constructor has to specially deal with the fact that Class::Accessor::Fast just blesses what's passed and completely ignores what has accessors and what doesn't since it doesn't have any concept of “attributes”. That may actually have -just- been fixed (Editorial Note: It has been fixed since this interview took place). Basically we have to shake out any breakage in the straight conversion. Phase two -which can pretty much go in parallel now- is to take advantage of the fact that everything is now Moose to refactor the living hell out of the core.

As part of the awareness that Moose grants programmers a richer toolkit and a more complete ecology of best practices and shared knowledge, there will eventually be a port of the popular Perl ORM, DBIC, to use Moose for its underpinnings. DBIx::Class uses its own accessor generators and C3-based component system (which are now independently on CPAN as Class::Accessor::Grouped and Class::C3::Componentised, respectively). In many ways, having DBIC on Moose could have an even broader impact on Perl web application development than the Catamoose project, since the expressiveness of the Moose toolkit is directly applicable to a best practices approach to modeling your domain logic. Early discussions on this project are underway. If you are interested in joining in, please shout out on the DBIC mailing list or meet up on the appropriate IRC channel (irc://irc.perl.org#dbix-class).

The issue with “inlined” constructors and compatibility with Class::Accessor::Fast (a.k.a. “CAF”) has to do with one of the optimization tricks that Moose uses in order to give you all the power of a modern Meta Object Protocol with a minimal performance penalty. This issue has been fixed since this interview took place, which means you can take advantage of optimizations like “make_immutable”. The only outstanding issue now is the ability to use method modifiers on your "MyApp.pm" class (the class that inherits from Catalyst.pm). You have to take care that all your modifiers come after your __PACKAGE__->setup() call, and they can't be used with attributes.

Since Catalyst stable is built on top of Class::Accessor::Fast, fixing any compatibility issues at this level is a very high priority. Also, since CAF is part of Catalyst::Component, it's common for developers to use it in their customized applications. This makes compatibility doubly important, since the goal is for people to be able to upgrade to the Moose port of Catalyst without having to change their existing code.

The main difference here (at the syntax level) is that Class::Accessor::Fast (CAF) lets you quickly create classes with 'accessors' (think mutators for instance variables) like so:
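A minimal sketch of such a class, assuming Class::Accessor::Fast is installed from CPAN. The package name MyApp::MyClass and the 'name' and 'age' fields are illustrative choices that mirror the Moose example further down:

```perl
# Hypothetical class for illustration; requires Class::Accessor::Fast.
package MyApp::MyClass;
use strict;
use warnings;
use base 'Class::Accessor::Fast';

# Generate read/write accessors for the listed fields.
__PACKAGE__->mk_accessors(qw(name age));

package main;

# CAF's constructor takes a hash reference of initial values.
my $person = MyApp::MyClass->new({ name => 'John', age => 25 });

print $person->name, "\n";   # read a value
$person->age(26);            # set a new value
```

Each accessor reads the field when called without arguments and sets it when called with one.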

However, CAF was designed for speed and ease of use. There is no easy way to set defaults, to constrain the incoming fields to particular 'types', etc. It has no underlying meta object to allow introspection of the class.

Moose also has some sugar to make it easy to declare instance accessors. Using the terminology from Perl6, they are called attributes and the scope of their functionality is a significant superset of CAF. At the minimum they give you CAF style mutators, along with an underlying meta object protocol for introspection and runtime modification.

package MyApp::MyMooseClass;
use Moose;

has 'name' => (is => 'rw');
has 'age'  => (is => 'rw');

And the usage is (nearly) identical to the last example:

## Note that ->new() accepts incoming arguments as a hash. CAF requires a hash
## reference, but Moose allows either a hash or hashref. Current practice is to
## use the hash method, since it's a few less characters to type.
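A minimal sketch of that usage, assuming Moose is installed. The class declaration is repeated so the example stands alone, and the values passed in are illustrative:

```perl
package MyApp::MyMooseClass;
use Moose;

has 'name' => (is => 'rw');
has 'age'  => (is => 'rw');

package main;

# Moose accepts a plain list of key/value pairs...
my $person = MyApp::MyMooseClass->new(name => 'John', age => 25);

# ...or a hash reference, which is the form CAF-style code already uses.
my $same = MyApp::MyMooseClass->new({ name => 'John', age => 25 });

print $person->name, "\n";   # read a value
$person->age(26);            # set a new value
```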

Moose attributes can go well beyond this. By adding type constraints and options such as 'required' and 'default' to the declarations, your attributes could validate that 'name' is a string, which is required and defaults to 'John Doe', while 'age' is an integer and is not required. Since Catamoose is built on top of Moose, you'll be able to use Moose style attribute declaration and more in all your Controllers, Models and Views. The end result of all this is that Catamoose will run all your old Class::Accessor::Fast based code, as well as allow you to start taking advantage of all the Moose goodness for your Catalyst components.
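The richer declarations described here might look like the following sketch. It assumes Moose is installed; the class name MyApp::Person is hypothetical, while 'isa', 'required', 'default' and make_immutable are standard Moose features:

```perl
package MyApp::Person;
use Moose;

# 'name' must be a string; it is required, but a default satisfies that.
has 'name' => (is => 'rw', isa => 'Str', required => 1, default => 'John Doe');

# 'age' must be an integer when given, and may be omitted entirely.
has 'age' => (is => 'rw', isa => 'Int');

# Inline the constructor and accessors for speed.
__PACKAGE__->meta->make_immutable;

package main;

my $anon = MyApp::Person->new;
print $anon->name, "\n";

# A value that fails the type constraint makes the constructor die.
my $ok = eval { MyApp::Person->new(age => 'not a number'); 1 };
print "age was rejected\n" unless $ok;
```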

This has been a very short overview of Moose attributes, intended to demonstrate differences from Class::Accessor::Fast and to generate interest. For a more complete overview, please see the Moose documentation and the tutorials. More examples of cool Moose tricks for Catamoose will follow.

<question>: Could you point out some things you'd like to see heavily refactored and what that will buy the community?

<mst>: Well, to be honest quite a lot of the internals need a go over. Catalyst::Helper isn't nearly as flexible as it could be and basically is quite independent of anything else, so I'm hoping somebody will step forward to deal with that.

Catalyst::Helper is a namespace for a set of bootstrapping tools designed to help people get started quickly. You can use the helpers to generate a skeleton structure of directories and files, such as scripts to start the development web server, standard tests, root controllers and views, etc. Since most advanced developers skip using the helpers, they are in sore need of a talented leader to take charge and increase their usefulness.

<mst (continued)>: I'm hoping this time round for the dispatcher to get the complete overhaul it needs which I got halfway to for 5.50 but got cut short by the code freeze and to be honest, if I'd finished it then I'd probably -still- want to overhaul it by now :)

Catalyst::Dispatcher is the code responsible for deciding which Controller and Action handles an incoming request. Since the beginning of Catalyst development, lots of different features have been grafted on to this, making it somewhat brittle and probably not as optimized as it could be. For example, features that are not part of a dispatcher's core functionality could be moved to a Moose Role, such as the code that builds up the table of debug output you see when you start up the development server.

<mst (continued)>: I also want to clean up how action registration works. Attributes are cute, but the implementation makes them a bit of a pain to use with Moose and the implementation shouldn't be as tied to the syntax as it currently is.

Attributes in Catalyst are different from attributes in Moose, so take care. Catalyst attributes are built on top of the attributes pragma introduced in Perl 5.6, which is a system that lets you attach arbitrary metadata to your variables and subroutines. These are the familiar 'tags' you add to your action subroutines so that Catalyst knows what action is handling what request. For example:
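A minimal sketch of a controller using these tags, assuming Catalyst is installed. The controller name and action names are hypothetical; :Local, :Path and :Args are standard Catalyst action attributes:

```perl
package MyApp::Controller::Books;
use strict;
use warnings;
use base 'Catalyst::Controller';

# :Local maps this action to /books/list
# (the controller's namespace plus the subroutine name).
sub list :Local {
    my ($self, $c) = @_;
    $c->response->body('A list of books');
}

# A relative :Path with :Args(1) matches /books/view/<id>,
# and the captured URL segment arrives as an extra argument.
sub view :Path('view') :Args(1) {
    my ($self, $c, $id) = @_;
    $c->response->body("Book number $id");
}

1;
```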

For Moose, attributes are values that are attached to an instance of an object. See the code above for an example or read the documentation.

There is an extension package for Moose, called MooseX-MetaDescription, which can be used as a flexible replacement for associating arbitrary metadata with classes, instances, attributes, etc.

<mst (continued)>: The component loader is starting to feel more and more like it's not -quite- in line with what we need but it's a bit difficult to work out exactly what we do need so the current idea is to move that to using Module::Pluggable::Object together with Bread::Board, and then things like $c->model(...) will be a simple call onto a service locator, i.e. $c->service('model/Foo') or similar. If we can get to there, we open up the chance to use as much of Bread::Board as you want and even to throw out the Catalyst M/V/C standard stuff entirely but allows us to trivially have the -default- configuration behave exactly like what everybody's used to and provide other default configurations as separate classes as and when we have a good feel for what actually needs to go in. It's always hard to guess in advance what features will actually be useful so the best approach is to clean up the core every time we refactor so it's easier for people to add features instead and then after a while we can spot what the patterns are in what users are doing and turn those patterns into nice reusable modules.

Bread::Board is an Inversion of Control container (see IOC for more), mentioned in the previous interview. To understand how an IOC container can help Catalyst, you need to realize that if you are developing for Catalyst now you are already being helped by a framework that incorporates some IOC concepts. One of the things that's nice about Catalyst is how the core system will automatically load up all your Controllers, Models and Views, initializing them via the global configuration object and making them available via an internal registry of components. Internally, you can use $c->forward to dispatch to a controller method via its internal, private name. This works as well for views and models. In Catalyst, you can call up a model with $c->model('Name') and behind the scenes Catalyst will provide a nicely instantiated object via something like MyApp::Model::Name->new($config->{'Model::Name'}), which you can just use. Another nice thing about using a component registry is that you can detach your functionality from its actual namespace, giving you a lot of power in refactoring. It is also the thing that makes it easy to detach actual URLs from the code which handles a request, powering stuff like $c->uri_for(...), one of Catalyst's killer features. So basically Catalyst has a lot of the functionality that's generally associated with IOC, but it's not particularly extensible. The idea here is that instead of trying to grow our own half done IOC, let's just use a well developed and tested one. This will give us the flexibility of creating custom component registries that are not tied to the standard Catalyst Model, View, and Controller directories. For example, you may wish to expand the standard MVC idea to include related concepts, such as splitting the Model tier into two parts: Domain Model (your core business logic) and Presentation Model (a class that represents all the data and behavior of your User Interface).
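To make the container idea a little more concrete, here is a small sketch of Bread::Board used on its own, outside of Catalyst. It assumes Bread::Board and Moose are installed. The MyApp::Mailer class and the service names are hypothetical, and nothing here reflects how Catalyst will actually integrate the container:

```perl
# A hypothetical component that needs one piece of configuration.
package MyApp::Mailer;
use Moose;

has 'from' => (is => 'ro', isa => 'Str', required => 1);

package main;
use Bread::Board;

my $container = container 'MyApp' => as {

    # A literal service: just a named value.
    service 'default_from' => 'noreply@example.com';

    # A block service: the container resolves the dependency first,
    # then hands it to the block. 'Singleton' means one shared instance.
    service 'mailer' => (
        lifecycle    => 'Singleton',
        dependencies => { from => depends_on('default_from') },
        block        => sub {
            my $s = shift;
            return MyApp::Mailer->new(from => $s->param('from'));
        },
    );
};

# Calling code asks for a service by name and never calls ->new itself.
my $mailer = $container->resolve(service => 'mailer');
print $mailer->from, "\n";
```

The calling code only knows the name 'mailer', so the class behind it can be swapped or reconfigured in one place.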

Of course if you find this all confusing and too much to learn today, you can just skip it for now and carry on using the tried and true methods (just create your Model, View and Controller directories as expected, or use one of the helpers to do it for you). You can just hold the warm fuzzy feeling that behind the scenes there is some really cutting edge technology running the show, and you don't need to let your Java developer friends hold anything over you.

Why choose Bread::Board? It's a good choice for this task since: 1) It's written by someone who has already written another IOC container, so consider B::B to be the “second one done better.” 2) It's written in Moose, so it will play nicely with a 'Moosified' Catalyst, and 3) it's written by Stevan Little, who you might recognize as the founder of the Moose project, so you can be sure it's well written.

<question>: Can you give an example of something cool and shiny that Catamoose is going to let us do?

<mst>: Cool and shiny? Well, (in terms of potential future syntax) I really hope to be able to say:

A Moose Role is basically a way to extend your class functionality. A Role provides abilities that might cut across class hierarchies, making them unsuitable for classic inheritance. If the relationship type in inheritance is 'is a', then for Roles it would be 'does a'. Roles are a great way to extend a class without the old 'huck everything into a base class' approach. This encourages good design and reusability. Almost all of what makes a good Catalyst Plugin will make a better Role. For more about Moose Roles, see the Moose documentation.
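As a small illustration of the 'does a' relationship, here is a sketch of one Role applied to two unrelated classes. It assumes Moose is installed, and all the package names are hypothetical:

```perl
package MyApp::Role::Describable;
use Moose::Role;

# Any class consuming this role must provide a 'name' method.
requires 'name';

sub describe {
    my $self = shift;
    return ref($self) . ' named ' . $self->name;
}

package MyApp::User;
use Moose;

has 'name' => (is => 'ro', isa => 'Str', required => 1);
with 'MyApp::Role::Describable';

package MyApp::Product;
use Moose;

has 'name' => (is => 'ro', isa => 'Str', required => 1);
with 'MyApp::Role::Describable';

package main;

# The two classes share no base class, yet both 'do' the role.
my @things = (
    MyApp::User->new(name => 'Alice'),
    MyApp::Product->new(name => 'Widget'),
);

for my $thing (@things) {
    print $thing->describe, "\n"
        if $thing->does('MyApp::Role::Describable');
}
```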

<mst (continued)>: I also hope to be able to add extra conditions so a given chained action can assert that it -and- anything underneath it only respond to, say, (HTTP) POST requests in terms of components. (Additionally) I want to free us from the tyranny of MVC. For example there's things like sending mail that don't really fit into the approach so it'd be nice to just have service/Mail or something:

(This approach could also be used for providing some core Catalyst functionality.) So there'll be a session service somewhere and "sub session" in $c becomes just another sort of lookup (to the IOC container,) $c->service('Session') or something.

Some of the things that are currently plugins would probably make better models that can be accessed via the service registry of Bread::Board. Things like Sessioning, Authentication and Caching, which are written as plugins to the Application context (basically, things that extend the Catalyst.pm class, which you are extending in your custom Catalyst application), would be much better off if they were decoupled via IOC. This will make them a lot easier to test and troubleshoot. Additionally, it will simplify the application context and allow you to use those services outside your Catalyst application. So, for example, if you needed to write a bunch of command-line applications for adding and removing users, you could do that much more smoothly. A few things that are currently integrated into the core application context, like Logging and Configuration, might also make good targets for moving out of core.

<mst (continued)>: I'm not sure how all of this will fall out. The trouble with sketching is until we build it and can experiment it's hard to tell what we'll -want- to do with it. Something I do think is a definite goer is to allow scoped attributes. Currently we only really have per-app instances and per-context instances. I'd like to be able to say:

has 'foo' => (..., scope => 'Session');

And not have to know or care how that gets in and out of the session. We currently do 'global controller' (except when your component uses ACCEPT_CONTEXT), Ruby on Rails does 'controller instance per request'. I think there's room for being richer than that and being completely transparent as well. But like I say, this is all about enabling possibilities not about any specific plans.

Think of a “per application instance” as an object instance that is created once and only once when the application is started. A configuration object is a good example. Instantiating configuration is typically expensive, and it is not expected to change over time or for different users that may log into your site. A “per context instance” covers all those other objects that get created as part of servicing a particular HTTP request and response. For example, you get a fresh Request object each time an incoming HTTP request hits your application. What Matt is pondering is whether or not we need to limit ourselves to those two levels of granularity. Would it be valuable if some of your components (either the classic Models, Views and Controllers, or other components managed by Bread::Board) could be instantiated on a 'per session' basis, that is, once and only once when your user logs in, enduring until she logs out or the session expires? This is the kind of flexibility that Bread::Board can buy us, along with Moose's rich underlying Meta Object Protocol. I'd also like to point out that this kind of 'out of the box' thinking is helpful if we are going to counter the hype machines of some competing web development frameworks. This kind of thinking is greatly enabled when we can all agree on our underlying development tools (i.e. Moose), since that frees us from having to reinvent basic wheels for each new project. It's exactly what I mean when I say Moose is a tool around which a rich ecology of best practices and shared knowledge can grow.

<mst (continued)>: Something I've found quite a lot is that if you factor everything to be clean and general then you get accidental features because things flex nicely when the user tries to do something you didn't think of. The DBIx::Class resultset system was like that - the aim of that was cleaner internals, not any specific features and yet people have figured out tricks involving resultset chaining, and ways to take advantage of that in code like DBIx::Class::Schema::RestrictWithObject that nobody had even considered being possible so simply before we did the work. So I'm not heavily worried about exactly what features fall out of this yet. The most exciting ones are always the ones you don't think of until later, once you've had some practice with the new cleaner APIs.

A ResultSet in DBIx-Class is an object that represents the definition of a query, typically on a SQL database. Conditions and attributes can be built up in several discrete steps, making it much easier to create reusable code. This feature of DBIx-Class (or DBIC as mentioned above, which is the shorthand name for this project) is one of its most powerful, yet it was not originally planned.
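Here is a sketch of building a query up in discrete steps. It assumes DBIx::Class and DBD::SQLite are installed and uses an in-memory database. The schema, table and column names are all hypothetical, and the classes are declared inline only to keep the example in one file:

```perl
package MyApp::Schema::Result::Book;
use strict;
use warnings;
use base 'DBIx::Class::Core';

__PACKAGE__->table('book');
__PACKAGE__->add_columns(
    id     => { data_type => 'integer', is_auto_increment => 1 },
    title  => { data_type => 'text' },
    author => { data_type => 'text' },
    year   => { data_type => 'integer' },
);
__PACKAGE__->set_primary_key('id');

package MyApp::Schema;
use base 'DBIx::Class::Schema';

__PACKAGE__->register_class(Book => 'MyApp::Schema::Result::Book');

package main;

my $schema = MyApp::Schema->connect('dbi:SQLite:dbname=:memory:');

# Create the table by hand to avoid needing a deployment tool.
$schema->storage->dbh_do(sub {
    my ($storage, $dbh) = @_;
    $dbh->do('CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT, year INTEGER)');
});

$schema->resultset('Book')->populate([
    [qw(title author year)],
    ['Book A', 'Alice', 1999],
    ['Book B', 'Alice', 2005],
    ['Book C', 'Bob',   2007],
]);

# Step one: a reusable base query. No SQL has been run yet.
my $recent = $schema->resultset('Book')->search({ year => { '>=' => 2000 } });

# Step two: refine the existing ResultSet rather than rewriting the query.
my $recent_by_alice = $recent->search({ author => 'Alice' }, { order_by => 'title' });

print $recent->count, " recent books\n";
print $_->title, "\n" for $recent_by_alice->all;
```

Each call to search returns a new ResultSet, so $recent can be passed around and narrowed further by other code without being modified.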

DBIx-Class-Schema-RestrictWithObject is a system to automatically 'preload' conditions on your base ResultSets. One common use for this is to automatically restrict a table by the currently logged in user.

<question>: If someone could spend 8 hours on Catalyst what types of things would you have them do?

<mst>: Go through the RT queues for a few catalyst modules and see if there are any missed tickets that look like they're pending a quick test case or a simple patch.

I know from first-hand experience that getting started writing patches and test cases can be a bit daunting. There's documentation for this, but it's a bit scattered right now, so the best thing is to come on to IRC and ask for help (irc://irc.perl.org#catalyst-dev). Or you can browse the RT tickets.

<mst (continued)>: Write up something that made you go "oh, -cool-" when you got it working onto the wiki. Go through the mail archives for interesting tricks and put them onto the wiki or as POD patches.

<mst (continued)>: Read through some piece of documentation and think "what things do I know that aren't in here?" Pick a piece of the internals that interests you and think about adding something about how that works as comments, or stuff in the Manual::Internals, or as a wiki entry, or whatever. You can't attempt anything -big- in 8 hours, but you can do a hell of a lot of useful things. I once sent a patch to an author and he said he wasn't able to release soon because he was piled under with RT tickets for it and wanted to fix some first. I spent an hour with him running the queue and triaging it. We knocked >75 tickets down to about 20 and the release went out end of that day. To anybody thinking "yes, but I can't go as fast as that", so? If you spend 8 hours and only manage to close one ticket that's one less for the author to deal with and you've just spent 8 hours learning new things about some code so the time was still productive :)

If you can just find 8 hours a month to help, I am certain you can accomplish something useful. Just choose something that needs to be done that you would enjoy doing. I decided that we needed more advocacy so I said, “John, take 8 hours a month and blog some interviews about Catalyst and Moose, since those are two things you enjoy.” 8 hours a month, or even every other month is something most of us can achieve. I think the main thing that is stopping us is paralysis about what to do. Matt gave some great suggestions. I realize there is a knowledge gap for people to get started, but I know that over time that can be overcome.

<question>: Conversely, if there was something related to Catalyst that you could stop immediately, what would that be?

<mst>: Well, I have to answer two different areas. In terms of extensions, I really, really, really wish people would stop writing plugins by default. Things are a lot better than they used to be but unless you're doing something like the session code that has to wrap the request cycle there's really no point in hugging the application class so close. Adding stuff to $c should be the last resort, not the first. We saw massive code re-use issues here with the first generation form plugins which all used $c->form. The modern form controllers use $self->form or ->formbuilder or whatever. So the choice is per controller and code re-use and gradual conversions are much easier.

When Catalyst first came out, I think there was a big knowledge gap about how to use it. Mostly, the problems centered around the idea that Catalyst programming was something other than Perl programming, which caused a lot of people to throw out sensible coding practices as they tried to get their heads around Catalyst and the MVC system. One of the more common problems was that people thought all functionality needed to be a plugin. Since plugins are loaded into the root application object, this is convenient but messy. To be fair, this idea was probably propagated by the fact that there was some easy documentation about how to write a plugin, but no other best practices documents. As of now, the document “Extending Catalyst” is considered the last word on this subject. I would simply add not to forget that Catalyst is just Perl, and that all the other best practices you've hopefully learned still apply. I think again this is a place where Moose can help, since using Moose tends to encourage a programmer to think in terms of properly organizing code.

<mst (continued)>: In terms of application code: GET THE BUSINESS LOGIC OUT OF THE CONTROLLER DAMMIT! Everything that's business logic rather than UI flow logic should live in the model. Write methods in your DBIC ->table classes. Use load_namespaces and write methods in your resultset classes. If there's an extra layer of logic that needs to be in front, then -that- should be a model that's exposed to your app. This way everything can be re-used in batch jobs, cron scripts, command line tools, etc. And it can be unit tested. Mech testing is a lot more elegant than it used to be but it's still much harder work than writing a standard set of tests so you can get a much more reliable app much more easily if you do proper MVC. It also means that your "user" interface and your "admin" interface - and let's face it, most apps -do- have those as two semi distinct areas - can share code a lot easier because the code they share is all in the model. It's just the right way to do it - sadly the world got seduced by the whole "web MVC" / "MVC2" paradigm for quite a long time which basically took the logic people used to have at the top of their template pages, stuck it in another file and called it the controller and web development as a whole is only now really going back and understanding MVC. I mean, the original Smalltalk MVC and a bunch of the Java GUI stuff the model is actually the UI components, i.e. a sortable grid would have a model and your db objects that are handed to it don't even have a name - they're "the stuff shared among all apps that use this info". We're not quite going that far back because it would just be too much of a paradigm shift but your HTML::FormFu or Rose::HTML::Form object -that- is more like a "model" in old MVC so what we call the MVC model is more like the domain model but the point is that means all the domain logic should live there, not in the controller.
I think now we're getting newer example apps out like jrockway's KitiWiki that do it this way. People will slowly shift over but it'll be an uphill fight even so - MVC2 -looks- easier. It just turns out not to be 3 months into development.

There's a lot to learn if you are going to do MVC properly. In and of itself, Catalyst (or any web development framework for that matter) is not going to be a magic bullet solution. Really, its major advantage is that you are using a system that sets you up to do things the right way, in a framework that lots of other people are also using, so that over time the documentation of best practices and knowledge will accumulate. I realize that there is a documentation gap for this type of development using Perl. That is improving slowly with the development of the types of good examples that Matt mentioned. KitiWiki is a Catalyst-based wiki; you can find more at http://www.jrock.us/fp2008/catalyst/. If you are good at writing, this is a high-impact job you can do for the community.

One of the biggest code smells for Catalyst development is when your Controllers are chock full of logic that rightly should be encapsulated in a Model. We've all done it when under heavy time pressure and when it didn't seem like there was any other good thing to do. For example you might:
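One hypothetical shape such a controller might take is sketched below. It assumes Catalyst is installed and that the application has a Catalyst::Model::DBIC::Schema model named 'DB'. The controller, table and column names are invented for illustration, and the code is deliberately written the wrong way:

```perl
package MyShop::Controller::Orders;
use strict;
use warnings;
use base 'Catalyst::Controller';

# Anti-pattern: business logic and raw SQL living inside an action.
sub total :Local {
    my ($self, $c) = @_;

    # User input goes straight into the SQL string below.
    my $customer = $c->request->params->{customer};

    # Reaching through the (assumed) 'DB' model for a raw database handle.
    my $dbh = $c->model('DB')->schema->storage->dbh;
    my ($total) = $dbh->selectrow_array(
        "SELECT SUM(amount) FROM orders WHERE customer_id = $customer"
    );

    $c->response->body("Total: $total");
}

1;
```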

There's a whole mess of stuff wrong with the above, even if you skip the security issues. First of all, you are deeply tying your logic to your physical database model, which is going to be a nightmare should you ever need to change that stuff. Also, your query is 100% not reusable. What happens when eventually you need this in a transaction to make it reliable? And of course this is not easy to test at all, since the only way to write a test case is to write a Selenium script (http://selenium.openqa.org/) or use the previously mentioned Test::WWW::Mechanize::Catalyst, which will tie your test very tightly to a particular URL (and we know how often those can change...). Also, what happens when you need to replicate the same functionality in a cron job, or in a command line application? You end up with a lot of cut and paste code that ALL needs to be changed should you ever change the underlying database structure.

Matt's discussion of the best practice approach to organizing your DBIC based storage models merits a bit of explanation. When you create your Schema, DBIC gives you some ways to automatically load ResultSet and Result classes, similar to how Catalyst will automatically load your Models, Views and Controllers (perhaps a future Moosified version of DBIC might also use Bread::Board for this?). This allows you to properly organize your business logic. Look for a future blog post discussing this in detail, but for now you can ponder the DBIx::Class documentation.
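As a rough illustration of putting logic in a ResultSet class, here is a sketch that assumes DBIx::Class and DBD::SQLite are installed. All package, table and method names are hypothetical. Because the classes are declared inline in one file, the sketch wires them up with resultset_class and register_class; in a real application laid out on disk, load_namespaces would normally do that for you:

```perl
package MyShop::Schema::ResultSet::Order;
use strict;
use warnings;
use base 'DBIx::Class::ResultSet';

# Business logic lives here, where any caller can reuse and test it.
# The value is passed as a bind parameter, not pasted into SQL.
sub total_for_customer {
    my ($self, $customer_id) = @_;
    return $self->search({ customer_id => $customer_id })
                ->get_column('amount')
                ->sum;
}

package MyShop::Schema::Result::Order;
use base 'DBIx::Class::Core';

__PACKAGE__->table('orders');
__PACKAGE__->add_columns(
    id          => { data_type => 'integer', is_auto_increment => 1 },
    customer_id => { data_type => 'integer' },
    amount      => { data_type => 'integer' },
);
__PACKAGE__->set_primary_key('id');
__PACKAGE__->resultset_class('MyShop::Schema::ResultSet::Order');

package MyShop::Schema;
use base 'DBIx::Class::Schema';

__PACKAGE__->register_class(Order => 'MyShop::Schema::Result::Order');

package main;

# No web server or controller involved: this could be a cron job or a test.
my $schema = MyShop::Schema->connect('dbi:SQLite:dbname=:memory:');
$schema->storage->dbh_do(sub {
    my ($storage, $dbh) = @_;
    $dbh->do('CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)');
});

$schema->resultset('Order')->populate([
    [qw(customer_id amount)],
    [1, 10],
    [1, 20],
    [2, 5],
]);

my $total = $schema->resultset('Order')->total_for_customer(1);
print "Total: $total\n";
```

With a Catalyst::Model::DBIC::Schema model (assumed here to be named 'DB'), a controller action would shrink to something like $c->model('DB::Order')->total_for_customer($id), leaving only UI flow in the controller.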

<question>: What project management issues does Catalyst face? What do you think we could do to improve it?

<mst>: I think the biggest challenge to the project is making sure examples, docs etc. keep up with the work we're doing and making sure that as the number of people depending on and releasing code in the area increases, there's an effective shared understanding of how things should be done so everything plays nice together and so that people coming to catalyst new can get up to speed with all the latest stuff rather than getting to mid-level and suddenly realizing the experts are using a different set of tools. It's very hard to do that without raising the barrier to entry and we really need more projects that will lower it but it's very hard to get people to work on that, because once you're an expert you don't really find the bootstrap process hard anymore. It's the same effect as "newbies write the best docs" because once you get to the point where you know all the answers it's not obvious which questions people are going to ask in what order and the answers to a lot of them -seem- self evident. Which is why any time somebody says "I'm not good enough to help with code" I tell them that means they're inexperienced enough to be able to improve the docs far more than somebody like me ever could.

Helping to close the gap that newcomers to the framework experience is probably one of the most useful things anyone could do. A lot of the documentation exists, stuff like getting a revision control system working, setting up local::lib, using the makefile to manage your application, and what your application should do versus when you should break things out to a separate CPAN module. This is something that as a community we really need to do better. There's a ton of knowledge locked up in the heads of people too busy to write it all down. If you can't do anything except ask a few questions and add one or two wiki pages, that would be enough.

I'd like to thank Matt S. Trout for his time and apologize for the lengthy delay in getting this interview out. I hope you learned something about the direction of Catalyst, and that these interviews have helped dispel some fear you may have had about moving to Moose. I am hoping that this move will generate excitement and that some of you can find the 8 hours you need to make a difference.