The last blog post was just notes on the feedback I got as a reaction to a moment of frustration on Twitter. People had asked me to summarize the feedback, and that is what I did in that blog post. But I realize now that what is more relevant is that the CMF team needs to explain more clearly that we are not trying to go head on against the likes of Drupal, Typo3 or eZ Publish, and why our work might still matter to others. The people interested in the CMF are developers who have realized that while they need to maintain dynamic content on a website, none of the current crop of CMS solutions were really a good basis for solving their clients' needs. Drupal and friends provide a bazillion features, and if your clients for the most part simply need a subset of these features, then by all means use one of these CMS solutions, as you and your clients are not in our target group. Obviously we would love to also provide these bazillion features, but realistically it would take us years to catch up. We see our target group among those whose clients' needs aren't a subset of the offered features; rather, where it is necessary to do a custom implementation of the bulk of the features. Today any project that has "content management needs" is built on top of some CMS, even if that CMS doesn't deliver any of the key features required. As a result, the final solutions end up bastardizing these CMS, creating a lot of pain for the developers and needless constraints for their customers. We want to be "disruptive technology" in the way Clayton Christensen described in his book The Innovator's Dilemma.

So what does this mean? We want to offer a solution for those developers that need to create a custom CMS solution for their customers. Again, to make it clear, I am talking about customers whose needs are not a subset of existing CMS solutions. As a matter of fact, this is why we decided to call it the Symfony CMF initiative and not the CMS initiative. So rather than trying to mold these ideas into an architecture that made lots of assumptions to optimize for other needs, we want to provide the very basic building blocks that we believe are common to content management, i.e. a content repository supporting versioning, tree structures, references and schema management via node types, with a tool chain to expose this content to UIs via a clearly defined API. We believe that these tools, combined with the power of Symfony2, are going to be a faster, more reliable path to creating custom CMS solutions with less legacy baggage for customers to accept. So when people wonder why I keep talking about low level pieces like PHPCR and RDFa, it's because I believe that to build a custom CMS solution you need to know these layers. But contrary to legacy monolithic CMS solutions, these layers have clearly defined APIs that even support your use cases, rather than constrain and complicate them needlessly.
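To make the "clearly defined API" point concrete, here is a minimal sketch of what coding against the PHPCR API looks like. It assumes you already have a configured PHPCR session (e.g. via Jackalope); the node names and properties are made up for illustration, not part of any real site:

```php
<?php
// Assumes $session is a \PHPCR\SessionInterface obtained from a configured
// PHPCR implementation (e.g. Jackalope + Jackrabbit). Names are illustrative.

// Tree structure: create a child node under the root.
$root = $session->getRootNode();
$page = $root->addNode('about', 'nt:unstructured');
$page->setProperty('title', 'About us');
$page->setProperty('body', '<p>Hello world</p>');

// References: link another node to this one via a weak reference.
$team = $root->addNode('team', 'nt:unstructured');
$team->setProperty('related', $page, \PHPCR\PropertyType::WEAKREFERENCE);

// Persist all pending changes in one save.
$session->save();

// Later: read the content back via its repository path.
$title = $session->getNode('/about')->getPropertyValue('title');
```

The point is that tree structure, references and node types are first-class concepts of the API itself, so custom code targets these interfaces rather than the storage details of one particular backend.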

If you feel that being able to interface your custom code with such clean low level APIs isn't something you need, then again I fear that you will be happier with more established CMS products than with what we are offering with a content management framework. But from what I see, the reality is that many people are stuck in a place that can probably only be explained by Stockholm syndrome. Things that should be simple, like migrating content between development, staging and production, should actually be easy. It should not require some hackery to map auto increment values. Being able to hook in a custom UI shouldn't require forking some module; it should just mean coding against a defined API. There is a reason why we are using Symfony2 rather than hacking our needs into some finished product. Users of web frameworks believe that it's more effective to build customers a solution based on a web architecture, rather than within the constraints of a product. We want to provide the same for the world of content management.

Now I totally understand that without a stable release, tons of documentation, tutorials and several top of the line success stories from various different companies, it is quite a big investment to turn your development model upside down: i.e. rather than spending your time molding some CMS that doesn't do what you need, but that you have come to know well over the years, into what you do need, suddenly your development starts and ends with doing exactly what you need. For those of us who are working on the CMF today, the pain with these legacy systems was enough that we felt it was necessary to formulate a vision for a better future and invest the time to make this vision a reality. If you do not share this vision, then maybe the pain isn't big enough, which is great for you, because it means today's CMS solutions are already doing what you need them to.

Now I want to say a final word on Drupal 8 and eZ Publish 5. Maybe if these products were finished today, the motivation for us to work on the CMF project would be smaller too. But they are not finished yet, and they will not be for quite some time. Drupal 8 is planned to be released in the second half of 2013 if everything goes according to plan, but at that point lots of modules will not yet be ported either. eZ Publish 5 is going to be released this fall, but at that point, from my understanding, it will mostly be an eZ Publish 4 made to work inside Symfony2. This is already going to expose a lot of the power of Symfony2 to the world of eZ Publish, and it's going to lay down the foundation for even more improvements to come, but it's imho mostly a great step for current users. It will take more time for it to be fully integrated into Symfony2. And you know what is great? Both will benefit from the ground work we are laying with the CMF. As a matter of fact, the CMF team is very welcoming of both Drupal 8 and eZ Publish 5, and we are doing our best to support both products. So we are not seeing it as an us-or-them battle. We do believe that our low level architecture is indeed interesting for them as well, and the fact that both Drupal core developers and eZ Publish core developers are interested in adopting PHPCR sometime in the next couple of years validates this. But we all of course acknowledge that it makes no sense for them to jump over to what we are doing if that means leaving behind their current feature set and user base.

So the summary is: if you are doing fine with your currently chosen CMS, by all means stick to it. If you are feeling considerable pain, please do consider investing the time to break out of your current rut. We believe we have a valuable proposition for you, which, albeit not yet released in a stable form, is already a great base for custom CMS solutions. And the good news is that we have enough people committed to the effort that it will get to a stable release; the more people that join the effort, the sooner our vision will come together in a stable release. For those who are interested but have no time, please keep an eye on our website (http://cmf.symfony.com) :)

I think the CMF is a great project and very valuable for many Symfony projects, which consist primarily of application code and secondarily of should-be-editable content pages.

Now as mentioned on Twitter, the problem with the CMF that I see is its image. It does not matter whether the CMF contains all the features of a full-blown CMS. Again, Symfony 0.6 did not create an application for you, but people had the feeling that it would greatly simplify their coding.

I think that the CMF probably (because I haven't used it) also greatly simplifies coding of CMS functionality. But you don't come across like that. When you speak about the CMF, you speak about how it's meant for "custom solutions". But the phrase "custom solutions" to me implies lots of coding. It doesn't raise any feeling in me like "wow" or "nice, creating CMS functionality is easy with that".

It's not about what the CMF is or can do. It's about emotions. Marketing. You are excited for the project, now get others excited. If you don't sell, people won't buy.

The more I look at PHPCR, the more the operations side worries me. And it's not the Java process (even though you pretty much open Pandora's box there). I think this adds an amount of complexity which is hard to grasp right now.

When you compare PHPCR or the CMF to something like Drupal or eZ Publish, I think you compare apples and oranges. eZ Publish and Drupal can be considered turn-key solutions, whereas PHPCR is far from it. And the CMF, being a framework and not a CMS product, won't ever be turn-key.

I looked at Jackrabbit too, and it has the most verbose HTTP API I have seen in a long while. It reminds me a little of SOAP – without its benefits. The sheer amount of requests and parsing necessary for something like a content page is a little worrying. None of this is straightforward or exactly fast. I could see that in the end this gets moved to a C extension again, and you'll be where Midgard is.

But still, maybe I got it all wrong, so feel free to enlighten me why this is the future.

This is just the tip of the iceberg for me, and your blog entries are, in my opinion, just scratching the surface. A little too much judging and smart talk from the ivory tower of open source.

I miss some real world examples of why I would want this. Or why I would want to invest 20 man-days into something which Drupal, eZ Publish or even WordPress make a lot easier.

[Btw, your "login" is still broken. I log in and get redirected to your blog homepage.]

The question is: at what are Drupal, eZ Publish and friends actually easy? I mean, do you really claim that Drupal is easy for operations? They do not even support UUIDs yet, so you have to do all sorts of hacks to even be able to deploy a feature. They do have some increasingly good hacks in place to help there, and yes, they are also developing UUID support. But claiming that Drupal is great for operations is kind of a strange claim.

In the same way, Drupal is a one-size-fits-all implementation. Right now they have two storage APIs. One of them can be deployed on MongoDB, the other only on an RDBMS. Each supports different types of queries. Again, they are looking to clean this up.

But what is the most interesting thing? They have acknowledged that they need to go back and really define a storage-implementation-agnostic API in order to provide an optimal solution for all their different user types. Sound familiar?

Now, is Jackrabbit perfect? Do we have all the features under the sun for Doctrine DBAL just yet? No, neither would be true. But the difference is that we have a clean API which allows improvements without having to touch any of the business logic. With the steps that I mentioned above, Drupal will have to break quite a bit of business logic, though maybe not super high level modules; I don't know Drupal in that much detail to be able to tell. The fact that Adobe CQ5, Magnolia and HippoCMS do manage to rake in big customers seems to imply that from an operations perspective Jackrabbit is quite doable. The fact that many PHP websites are using Solr/ElasticSearch also tells me that even PHP shops can run Java daemons.

At any rate, my goal is obviously to get PHPCR into Drupal, eZ Publish and Typo3, and none of the three is against this on principle, because they know how hard it is to please all user types with a single implementation. Do you think that Magento is doing their customers a favor by coming up with band-aids for their enterprise customers to work around the limitations of their EAV database schema?

So let me turn this around: did you even have a look at the PHPCR API? Is there anything in the API you think is missing? That we should do differently?

From my personal experience, using a "framework" instead of a "turnkey solution" is not always the magic bullet you seem to describe.
The main problem I had with using frameworks (e.g. Symfony) was not visible from a developer's perspective, but rather from a customer's one: it was the amount and "type" of custom code that had to be written. In fact, most developers love frameworks, as they generally love to code and dislike to learn.

A concrete example: the "glue" used to build the site was too much, and it depended heavily on the mindset/skillset/coding style of the team who built the site. When maintenance was handed over to a second team, there was little documentation for it, and the code quality was not stellar. So it was redone. And this happened again when a third team was called in. A colossal waste of money.

What the (biggish) customer expects from a turnkey solution is more of a long-term assurance that when calling in another (certified?) supplier to take over, the project will not be rebuilt from scratch.

Custom code will need to be written both when using a framework and when using a CMS, but since project-time code is always written in haste, and lacks documentation and wide exposure, it is better to relegate it to the execution of small, specific tasks rather than having it at the core of the system.

Now, not all CMS are equal in that respect and fulfill that promise: some offer a much wider API to code against than others, a much more solid/extended content model, and stronger backward compatibility promises, which makes writing "plugins" a more streamlined task and should be a guarantee both of code quality and of long-term supportability.

But in general, the wider your API, the more learning there is to do. And after a while you might find that developers leave the CMS in droves because of it.
As someone else pointed out, there is a risk that with this "framework" thing you will have to both learn a lot and code a lot - things that only go down well in Java-land. But since it's PHP, it will also be slow...

@gggeek: I agree on all your points, and I definitely didn't want to imply that standard CMS solutions have no place. All I wanted to say is that in some cases the needs are sufficiently custom that it makes sense to start with a framework, rather than chipping away at the cruft of a CMS product.

That being said, some of the architecture components we are working on, like PHPCR and createjs.org, will hopefully also make their way into CMS products eventually, thereby reducing their pain points and maybe even making the CMF obsolete in the end. But e.g. with Drupal it will take them at least until Drupal 9 to get there ... which I expect sometime between 2016-2017.

I've had a (very) quick look at the Symfony CMF every now and then from the beginning. At first I was confused about all these new things being mentioned: Jackalope, PHPCR, Jackrabbit. In the meantime I came to understand (I think) what it's all about, and I think it's great. I'm seriously considering working with the CMF, but there's one big thing that worries me so far (and I think it may worry others as well, hence I'm sharing it): my absolute lack of experience with Jackalope. As far as I understand, Jackalope is a must for serious projects (the other persistence layers are there only to ease development/deployment for small-ish projects). First of all, there's scalability (and performance). A Google search for "jackrabbit scalability" doesn't offer much info, a blog post or a tutorial on how to scale the storage. And I haven't found that much info on performance either. I haven't looked that hard, though. The first step towards further considering the CMF would be to run some tests - importing some medium-large amount of data (a few hundred thousand items) into Jackrabbit and seeing how it performs - but that takes time. I accidentally came across a note that Jackrabbit doesn't perform well if you have more than 10000 subnodes in a node. So I have to think about a different way of storing them, instead of having categories as main nodes and their items as subnodes, as you (sort of) would in MySQL - but how? Splitting them by date? And then, how fast would a query (for example) to get the latest items from category X be? And who knows what more gotchas are down the road?

I don't know how much sense I'm making here, but I hope you get the idea. To summarise, I guess it's the fear of so much unknown. I know fairly well how an RDBMS works, I know how to lay out data to fit my purposes, and I know how to read those EXPLAIN results and restructure data/add indexes. I know how to do backups and table repairs. But I know nothing about this "new" kid in town.

I'm not saying that you should first try to educate people on these common tasks/questions - there are probably many resources out there on the topic - I'm just trying to explain my concerns about adopting the CMF.

Oh, and speaking about performance, another turn-off is that when running the sandbox of the CMF installed "manually" (not in a VM through Vagrant), Symfony reports a ~600ms generation time for the first page, whereas an out-of-the-box Standard Edition of Symfony takes less than 20ms (I'm mentioning it just to say that my machine is not horribly slow). And some medium projects running with a MySQL backend generate in about 300ms. I have a feeling it might be because of assetic's use_controller, but I didn't have time to play with it.

@Gnth: Our goal is to have decent performance with the DBAL up to 50k nodes. That being said, we are not there yet by a long shot; we will have to work on secondary indexes to get there. But technically there is no reason why our DBAL based implementation could not be made at least as fast as what you see from any of the established PHP CMS.

But yes, I think once you are talking about going beyond 50k nodes, then even in the long run I expect Jackrabbit to be the go-to solution (and I am very excited about the next major version, which we will likely start integrating with towards the end of the year). Now, in general I have found Java solutions to be less well documented than PHP solutions: fewer blog posts and fewer or even no IRC channels. For Jackrabbit there is a fairly decent mailing list, but even there I am used to getting feedback quicker on PHP related technologies.

That being said, we here at Liip have slowly gotten our foot in the door of the community. One of the founders of Liip now also has commit access, and we know who to ask, if necessary poking people via Twitter. Note that we do not have any Java experts at Liip, but poking around in the source is quite doable even for someone who knows PHP best. Also, we have learned quite a bit about making fault tolerant and scalable solutions and about some of the pitfalls to sidestep, and we have been blogging about them.

Now, getting back to some of the concrete things you have mentioned. Indeed, the Jackrabbit guys say that at 10k *direct* subnodes on a node, performance starts to degrade. Indeed, the solution is to partition them, and using the date is quite a sensible approach there. Thanks to the ability to dynamically generate parent nodes with a "callback" mechanism on flush with PHPCR ODM, this can be automated quite easily. Thinking about it, having that many direct subnodes doesn't make much sense anyway. PHPCR (and JCR) basically define a very flexible file system, and one of the great features is that content administrators can actually browse the content. Browsing a node with 10k children really makes no sense, so it's quite helpful to partition nodes by some sensible key.
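The date-based partitioning idea can be sketched as a plain helper that maps an item's creation date to a nested repository path, so no node ever accumulates tens of thousands of direct children. The function name and the `/<base>/<year>/<month>/<slug>` layout are my own for illustration, not part of PHPCR or the CMF:

```php
<?php
// Hypothetical helper: derive a partitioned PHPCR path from a creation
// date, e.g. '/posts' + 2012-08-20 + 'cmf-vision' -> '/posts/2012/08/cmf-vision'.
// With PHPCR ODM a similar mapping could live in a repository path callback,
// so the intermediate year/month parent nodes get created on flush.
function partitionedPath(string $basePath, \DateTime $created, string $slug): string
{
    return sprintf(
        '%s/%s/%s/%s',
        rtrim($basePath, '/'),
        $created->format('Y'),  // year partition, e.g. "2012"
        $created->format('m'),  // month partition, e.g. "08"
        $slug
    );
}
```

This keeps each node's direct child count small while the path stays predictable, and as a side effect content administrators get a browsable year/month hierarchy for free.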

What key to pick is sort of part of the realization that the entire NoSQL world has been feeding the world a lie with the "schema free" message. NoSQL just gives you the ability to mix node types within a single collection; there is of course still a schema. And contrary to RDBMS, where there are clearly defined rules for designing the schema (i.e. normal forms), there are no such rules for NoSQL, and therefore the process of figuring out the right schema is indeed a much bigger challenge.

As for performance in the sandbox, there are a few things to note. First up, Jackrabbit uses a cache that is configurable in size, where the most recently used nodes are kept in memory, so repeat requests should show better performance. Next up, we are currently doing no caching on the PHP side, and on a common website things like the menu structure do not need to be read on every page request. This is a prime candidate for ESI, and most CMS will be caching this somehow. If assetic is the culprit on your machine, then you could confirm that quite easily by testing with the prod front controller. Since we do have a fair number of assets thanks to VIE and friends, I would not be surprised if this shaved a few ms off the request time, but I would also still expect things to be slower than with the SE.

To finish up, we have done a large production site using PHPCR ODM, but without the routing component. There we ran benchmarks with 100GB of data, generating a fairly complex page from cleanly restarted separate servers (a Symfony2 frontend and a Symfony2 REST backend talking to Jackrabbit), and came out at ~300ms. I think we could have pushed this down to ~200ms if we had invested some more time; in fact, only ~50ms of that was spent on personalized content, which didn't even have to go through any of the PHPCR stuff, and the rest was all perfectly cacheable.

Thanks for sharing your thoughts. IMO most of the things you replied with would be fit for another blog post, so other developers can read them, especially the success story you mention in the last paragraph, detailing more about the decisions you've made, the architecture and so on. Indeed, that's the feeling I get about Jackrabbit: their community is either not as developed as the PHP community, or they're just not as talkative. I'm aware of (most of) the blog posts on rocketlab; they provide a lot of valuable info, and I think it will only help adoption of PHPCR if you keep them coming.