Content Management Forum

I intend to use WP for a project that, over time, will likely encompass 1000s of articles. Since I'm about to get started I realized it may be a good time to ask if anyone has experienced "the limits" of WordPress. :)

Do problems with article serving speed, the "blog search function" or anything else arise - in any unexpected way - as the number of articles or authors increases?

Has anyone hit the "Oh, crap!" level, where you realized that WP wasn't the content solution you thought it might be?

It's well beyond my ability to judge so can someone tell me: Is there evidence that the latest releases of WordPress are being designed with speed and scaling in mind?

Is there a source for information about optimizing the speed and function of WordPress? Are there any well known tips and tricks?

There are a lot of things that affect this. WordPress.com has over 100,000 new posts added every day [wordpress.com], and it runs on WordPress MU, which is basically a version of WordPress with a wrapper to manage it for multiple users.

Having thousands of articles is not really the issue. Searching through a giant text file will scale roughly linearly as size increases. But when you retrieve data via simple queries ("get me article 10523") from a database, with searches on properly indexed columns, load will increase quite slowly as a function of the size of the data set. Complex queries are another matter, but most queries that WP issues are fairly straightforward.
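To make the indexed-lookup point concrete, here is a rough sketch in Python using the built-in sqlite3 module (not MySQL, and not WordPress's real schema; the `posts` table and its columns are made up for illustration). It asks the database how it plans to run a "get me article 10523" query versus a text search across every body.

```python
import sqlite3

# In-memory database with a hypothetical "posts" table (not WordPress's schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO posts (id, title, body) VALUES (?, ?, ?)",
    ((i, f"Post {i}", f"Body text for article {i}") for i in range(1, 20001)),
)


def query_plan(sql, params=()):
    """Return the human-readable plan lines SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The plan description is the last column of each row.
    return [row[-1] for row in rows]


# Lookup by primary key: the database can jump straight to the row.
by_id = query_plan("SELECT title FROM posts WHERE id = ?", (10523,))

# Substring search over the body: every row has to be examined.
by_text = query_plan(
    "SELECT title FROM posts WHERE body LIKE ?", ("%article 10523%",)
)

print(by_id)
print(by_text)
```

The exact wording of the plan varies between SQLite versions, but the first should be reported as a SEARCH on the primary key and the second as a SCAN of the whole table. That's the difference between load that barely moves as the table grows and load that grows with it.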

It's having zillions of page requests and scads of plugins that are more likely to bog things down (some add significant load, some do not, but you can test that with XDebug and Cachegrind).

How well it scales for traffic is affected by things like

- hardware (obviously)

- plugins - do they add DB queries? Do they add css and js files that get downloaded separately (remember, a browser only pulls 1-2 streams at a time)?

- theme - how many different pieces does the theme need to serve up a page? (i.e. same considerations as a non-dynamic page)

- caching - at the WP level as well as PHP accelerators (ioncube and whatnot)

So I don't have any WP sites with lots of pages or lots of traffic, let alone both, but with other CMS I have seen how something that runs like lightning can be brought to a halt by a couple of really load-heavy modules or one bad PHP function, and my limited experience (emphasize that!) says that WP can be tuned for speed or tuned to lug.

Another question to ask yourself about scaling is how well does it scale to building many sites off one code base. There are options with WP, but say drupal (and joomla?) have that capability designed into the bones. If you have, just hypothetically, say several thousand domains that ended in "directory" and wanted to build some sort of directory site for each one, how easy would it be to manage it in WP? My gut reaction is that without building some custom scripts, it could be a hassle (I'm in the process of combining a handful of WP sites onto one codebase, which is easy, but getting settings to propagate automatically is harder - and yes, I know about WordpressMU but don't want to use it).

Caching is the huge one. By caching a page with a really inefficient piece on a Drupal site, I brought the load down from almost taking out the server (and getting me kicked off a shared account) to almost unnoticeable. I think with good caching you would see something similar with WP.

Generally, a fully cached page can be served up with one DB query rather than dozens. That's a big savings right there, before you even get into processing that data.

Gentlemen, thank you for taking the time to help guide and enlighten me. Much appreciated. I feel somewhat more assured that I'm not planning for my "ruin by design" (by using WP).

What I'm currently working on will use WordPress as a sub-section CMS. Other "evergreen sub-directories/pages" within the website will be flat files. I'd rather take this approach than use "WP pages", unless someone thinks not doing so is incredibly . . stupid? Misguided? :-/

The website I'm referring to in this thread is not going to include a "topical directory". For such needs, for a few other directory type domains, I've licensed a fairly robust directory script.

Is Joomla, by design, better able to scale or process file/data requests faster than WP? If there's a "simple explanation" (Is there ever?) for the speed or capacity difference what is it?

Drupal and Wordpress, though, are both top notch for what they are and I think Joomla has probably gotten straightened out too, but it used to have just nasty code underneath. In general, the bar has risen considerably on open source PHP applications in the last couple of years and I haven't looked at Joomla since just after the Mambo/joomla rename.

So what explains speed differences? In a nutshell, complexity more than inefficiency per se. As I say:

* How you set them up is probably as significant as, or more significant than, the inherent differences in a default install. Lots of plugins means more files to load, more queries to make.

* Drupal (and probably any similar CMS) is just a larger, more complex app than Wordpress. If you want the power, you deal with the load.

* Caching. It's always about caching. If you cache, the load can be cut by 5x or 10x, and while Drupal installs with caching on by default, with Wordpress it's deployed via a third-party add-on. So out of the box, with no plugins, Drupal probably scales better.

More importantly, do you expect huge traffic right away? I ask because I subscribe to Joel ____ (Joel on Software guy) principle, which is to get it built and worry about scaling when you see if you're successful. People spend all kinds of time worrying about scalability and the site ends up pulling in 3,000 visitors per day (meaning it would run just fine on a shared server with any of these CMS). Often times people just get paralyzed worrying about minor performance issues, which of course aren't a problem at all if your site never gets built. If you start to bog, you cache more, you might drop a really "expensive" plugin or pay to have it rewritten.

Build a beta version, stress test it with apache bench and Xdebug/Cachegrind, find the egregious bottlenecks and get rid of them and you're good to go (and you can do that with any of these).

flat files

If you mean thousands of straight, flat, static HTML... personally I would not do that. If you want it to be totally fast and light, I would do this:

- put the "data" pieces (the part that is page-unique) in one file.

- use the PHP auto-prepend functionality (the auto_prepend_file setting) to run it through a simple templating system.

That's just one minimal, ultra-fast method for dynamic templates that could be done with no DB queries.
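Roughly what I mean by keeping the page-unique "data" apart from one shared template, sketched in Python rather than PHP just to show the shape. The layout, the nav links and the `render_page` helper are all made up for the example; in PHP the prepended file would play the role of `render_page`.

```python
from string import Template

# One shared layout for the whole site. Fix a nav typo here and every page
# picks it up. (Hypothetical markup, purely for illustration.)
LAYOUT = Template("""<html>
<head><title>$title</title></head>
<body>
<ul id="nav">$nav</ul>
<h1>$title</h1>
$body
</body>
</html>""")

NAV_LINKS = [("/", "Home"), ("/articles/", "Articles")]


def render_page(page):
    """Wrap a page's unique data (title + body) in the shared layout."""
    nav = "".join(
        f'<li><a href="{href}">{label}</a></li>' for href, label in NAV_LINKS
    )
    return LAYOUT.substitute(title=page["title"], body=page["body"], nav=nav)


# The "data" piece for one page: only what is unique to that page.
about = {"title": "About", "body": "<p>Who we are.</p>"}
html = render_page(about)
print(html)
```

No database involved, and the per-page files hold nothing but their own content, so a site-wide change is a one-file edit.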

Why? Because if you ever need to make site-wide changes, it's going to be a hassle if you just have static pages. You can write scripts to do maintenance or even sometimes get by with good use of things like sed (a "stream editor" on *nix that lets you edit using regular expressions). And of course you can build the static pages with Dreamweaver templates, but if you find a typo in your navigation, you have to upload every single page again.

So if you have the sort of page volume where you're worried about Wordpress or Drupal being able to handle that many pages, to me it is sort of crazy to have static pages that don't have some rudimentary templating.

In actual practice, though, *I* (emphasize *I*, me ergophobe who decidedly does NOT run high-traffic sites and is NOT *you* and does not know your needs) would just use the CMS for all of it. I ran into a problem with the home page bogging things down on a Drupal site. I just made an aggressive cache that regenerates as needed. It basically caches the entire page, so it only takes one DB query to find the cached version and serve it up, which is pretty damn close to the load of a straight static page. You can set it up to regenerate whenever there's a change made or, if there are lots of changes, just have a cron job that refreshes the cache every minute.

I can't answer your question about database format and queries because I don't know enough to compare them. I take a more practical approach. WP was designed as blog software, where the latest post or few latest posts get hit hard and old posts get a lot less traffic (assuming it's popular). Joomla and Drupal were designed to support websites and have people designing and coding with the intent of supporting websites. This results in a lot more functionality for websites, as opposed to blogs, in these CMSs, especially in plugins/extensions.

So I build small/low traffic websites with WP or WP Mu for simplicity, Drupal for good user management (social sites etc) and Joomla for other medium/high traffic sites.

So reprint, it sounds like you have recent experience with Drupal and Joomla and you're pretty neutral (right tool for the job). Would you care to tackle this question in a bit more detail: your perspective on the pluses and minuses of drupal versus joomla?

My experience a couple of years ago was that Joomla was easier to make a site right out of the box, but if you wanted to really customize, in the long run drupal was easier to extend and use as a platform for a broader range of applications (not just social, but anything that doesn't fit the standard Joomla CMS paradigm).

Still true? Other differences from the philosophical to the particular but significant?

Joomla I find easier to work with and easier to extend functionality. Especially with the new version, it is much easier to use right out of the box. Add-ons are simpler and easier to use. The community is oriented to less technical users than the Drupal community. Drupal add-ons require more knowledge and often you have to combine multiple add-ons to get the functionality you want.

Drupal add-ons are better-written code on average, however, probably reflecting the more technical user group. Many Joomla add-ons have code that does not validate, and they insert badly into the Joomla code, sometimes even breaking it. I have had to hack quite a few of them.

User management is much better with Drupal than Joomla. Multiple site management is far better with Drupal also.

So yes, in general I agree with your assessment. If you are building a lot of custom features then Drupal is for you, and you should have good technical expertise. Joomla provides what most people want in a website and doesn't tax you too much on the technical side. WordPress provides a blog that can be adapted to a simple and easily managed website. WordPress MU is great if you want to develop and manage multiple domains, with each being a relatively simple website. So I generally kick the tyres on my domains with WordPress or WordPress MU and then move the best domains to Joomla or Drupal as appropriate.

I'd rather take this approach than use "WP pages", unless someone thinks not doing so is incredibly . . stupid? Misguided? :-/

It depends. This might be a bad idea because of the fact that then if you change the look of the wordpress part of the site you'll have to change the look of the flat files, etc. Also, it's nice to be able to edit the pages right in WP, etc. If it's only a few (1-5) files though and you really need the extra flexibility and speed offered by having them as separate html files then maybe it's a good idea.

I haven't used WP enough to give a specific answer, but since it uses a fairly structured URL scheme and serves content from only 2 or 3 PHP interfaces, it should be pretty easy to implement a very effective caching system without being dependent on a plugin or anything like that.

Basically, you should cache full pages based on URL params after they are generated, and on every subsequent page call check first whether the corresponding URL is in the cache (with a global timeout of, say, 3 minutes). If so, serve that file and quit -> no DB queries at all.

and in the back office, add a function to purge cache when content is changed
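A bare-bones sketch of that scheme, in Python rather than PHP and with an in-memory dict standing in for cache files on disk. `PageCache`, `serve` and `slow_render` are names invented for the example, not anything from WP: pages are keyed by URL, expire after 3 minutes, and the back office calls `purge` when content changes.

```python
import time


class PageCache:
    """Full-page cache keyed by URL, with a global timeout."""

    def __init__(self, ttl_seconds=180, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # url -> (time stored, html)

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None
        stored_at, html = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[url]  # too old, treat it as a miss
            return None
        return html

    def put(self, url, html):
        self._store[url] = (self.clock(), html)

    def purge(self, url=None):
        """Call from the back office when content changes."""
        if url is None:
            self._store.clear()
        else:
            self._store.pop(url, None)


def serve(url, cache, render):
    """Serve from the cache if possible; only render (hit the DB) on a miss."""
    html = cache.get(url)
    if html is None:
        html = render(url)
        cache.put(url, html)
    return html


# Stand-in for the expensive, DB-backed page build.
render_calls = []


def slow_render(url):
    render_calls.append(url)
    return f"<html>page for {url}</html>"


cache = PageCache(ttl_seconds=180)
serve("/?p=10523", cache, slow_render)  # miss: page is built and stored
serve("/?p=10523", cache, slow_render)  # hit: no rendering at all
cache.purge("/?p=10523")                # an editor changed the article
serve("/?p=10523", cache, slow_render)  # miss again, rebuilt once
print(len(render_calls))
```

A real version would also have to skip the cache for logged-in users and for POST requests, which this sketch ignores.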

We've used this non-intrusive caching mechanism on a number of heavily loaded websites, and load was very low after that.

If you're planning on organizing the thousands of articles into a book type structure, or have individual articles with a book type structure, then you should investigate Drupal. They seem to have the best out of the box control for that type of document structure.

I have a wordpress site with over 1,000 articles and as long as you have a decent mysql cache and are using a php opcode cache (eaccelerator, xcache, etc) there is very little overhead. Dynamic pages are created in less than 0.150 second on a VPS. The search feature on wordpress leaves a great deal to be desired but not because of speed - it's because of lack of features. Fortunately there are third party plugins that can solve that. The new tag (vs just category) feature built in to versions after 2.3 is quite powerful for organizing things too.

Using wp-super-cache for wordpress (just google it) you can have thousands of simultaneous connections to your server without impossible loads. Super-cache (the successor to wp-cache2) compiles the pages literally to static html files (transparently) that are then served directly via htaccess internal redirects without php or mysql being used at all. Then you have the best of both worlds, a free, open-source CMS-like program with thousands of plugins available to customize it however you'd like, yet virtually no overhead for visitors (note, that logged in users don't get cached pages).

If you use an apache replacement like litespeed or lighttpd you can also double the capacity of any server over apache1/apache2. But that step may not even be necessary unless you are building some kind of site that is going to be constantly slashdotted/dugg.

You should also worry about how many SQL calls your hosting allows (assuming you're not running your own dedicated DB). The problem is really not WP scaling; it's the DB that it runs on.

As many people have already mentioned, caching is essential. You may want to implement some custom caching, especially if your page content doesn't change much once it's entered. It may make sense to cache the entire page and rewrite it every X hours with a newer copy. I haven't really used any of the plugins that are used for scaling WP, but having thought about the problem, there are a few places where it's fairly simple to modify the DB access layer to implement better caching.

I should note that my day job involves worrying about scaling a PHP based service so what's simple to me may be complex for others. There are some 'hacks' and there are some elegant solutions to what you can do. Your basic goal is to minimize the amount of calls to the DB, and serve as much static content as possible. The reason I have thought about WP scaling lately is because a friend of mine runs a blog on WP and he kept getting MySQL errors but it turned out that the issue was query per day limiting on his shared host, so I didn't dig too deep into it after figuring that out.

Looking at premade solutions, "WP-Cache is a WordPress plugin that reduces the need to make many requests on the database by caching page output that has not changed. Fundamentally, when a visitor requests a page from your WordPress site, WP-Cache serves a stored, static version of the requested page. If the page does not exist, it is generated and stored in the cache as part of the request. If a page changes due to editing or a posted comment, the cache for that page is destroyed." [dev.wp-plugins.org...]

This would be exactly what I recommend a caching layer would do. I perhaps would cache it at the DB level, but this is just as good. If that plugin doesn't work for you then the idea should be...

Find where WP makes SQL calls (wp-db.php, for example). Then write a handler class that you call at the top of every function that interacts with the DB. Its purpose would be to store the output of the function in a flat file for X minutes/hours, using serialize to store the array. Then if the flat file is fresh, return its contents; otherwise let the function execute and then store the output. You'd also key the cache on an MD5 hash of the query to make sure you're not overwriting the cache with different calls... So the code should look like:

Assuming we wrote a caching class.

$array_check = $cache->getCache($query); which would return either an array of data or false if the cache doesn't exist.

If we get false we let the entire function run, and at the end you have another call

$cache->writeCache($query, $results); to store the results for the next time.
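Spelled out a bit more, here is a minimal sketch of that caching class. It's in Python rather than PHP, with pickle standing in for serialize, and the names (`QueryCache`, `cached_query`, `fake_db`) plus the sample query are made up for illustration. This isn't WP code, just the shape of the idea: one flat file per query, named by the MD5 hash of the SQL, reused while it's fresh.

```python
import hashlib
import os
import pickle
import tempfile
import time


class QueryCache:
    """Stores query results in flat files named by the MD5 of the query."""

    def __init__(self, directory, max_age_seconds=600):
        self.directory = directory
        self.max_age = max_age_seconds
        os.makedirs(directory, exist_ok=True)

    def _path(self, query):
        digest = hashlib.md5(query.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".cache")

    def get_cache(self, query):
        """Return the cached rows, or False if missing or stale."""
        path = self._path(query)
        try:
            if time.time() - os.path.getmtime(path) > self.max_age:
                return False
            with open(path, "rb") as fh:
                return pickle.load(fh)
        except OSError:
            return False  # no cache file yet

    def write_cache(self, query, rows):
        with open(self._path(query), "wb") as fh:
            pickle.dump(rows, fh)


def cached_query(cache, query, run_query):
    """Wrap a DB call: use the flat file if fresh, otherwise run and store."""
    rows = cache.get_cache(query)
    if rows is False:
        rows = run_query(query)
        cache.write_cache(query, rows)
    return rows


# Stand-in for the real database call; counts how often it is hit.
db_hits = []


def fake_db(query):
    db_hits.append(query)
    return [{"ID": 10523, "post_title": "Hello"}]


cache = QueryCache(tempfile.mkdtemp(), max_age_seconds=600)
sql = "SELECT ID, post_title FROM wp_posts WHERE ID = 10523"
first = cached_query(cache, sql, fake_db)   # goes to the "database"
second = cached_query(cache, sql, fake_db)  # served from the flat file
print(len(db_hits))
```

Note it only makes sense for read queries, and anything that writes to the DB needs to clear the affected files or you'll serve stale results until they expire.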

I can elaborate more if needed, but I feel like the plugin should be good enough unless you're scaling to NYTimes size.

Edit Update: I know that WP has some built-in caching functions, but they are not implemented universally, and this is where I would recommend putting the custom caching in (again, if the plugin doesn't work).

Thank you, mikhaill, for taking the time to respond and thanks to everyone else for contributing to my understanding of WP's ability to handle growth and for suggesting other solutions. Thank you very much.

The only way to really judge is to compare traffic with the # of servers ... wordpress has tons of database calls that could probably be cleaned up.

Yes, of course Wordpress.com isn't serving all that up off a single shared server, and so that's where the question of caching, DB calls, load and all that comes in. But again, how much time will Webwork spend building out a custom solution that is super fast, but instead of launching tomorrow, he launches in 2009?

Jeff - build your damn site in whatever is easiest to implement (I say that as a fellow-traveller who is easily paralyzed by overanalysis). It's just data. That's the beauty of something that is dynamic instead of using static pages. I just transferred a site with a few hundred pages from something else to WP and it wouldn't be that hard to transfer it again to a custom solution if WP was taxing the server.

might be a bad idea because of the fact that then if you change the look of the wordpress part of the site you'll have to change the look of the flat files, etc.

The first site I managed was a hybrid static/dynamic and it was absolutely a nightmare to maintain. This was before Wordpress or Drupal and it was this maintenance nightmare that prompted me to dust off my old programming skills. I swore that I would never again do anything with a website that required me to edit every page just to change the look of the site. Separation of content and presentation is absolutely essential to your sanity IMO. I know people manage sites with a few dozen pages using Dreamweaver and such, but when you're talking thousands of pages... UGH!

If you want to keep server cost absolutely minimal, you can eschew WP for this and go with the old standby

By the way, the original version of the Palm was a block of wood with the interface drawn on it in pen. Once the designer got that dialed in, then he started actually using electronics. That's always worth keeping in mind when thinking about how far you need to get before you can test an idea.

Just my opinion but I think Drupal is a better way to go for ANYTHING beyond just a blog. WP is the best for blogging but after that it gets a bit clunky and unwieldy with 3rd party modules stuck on.

Drupal can start as a blog site and scale to anything you want. Some of the largest commercial websites on the internet run Drupal. I run both and I can tell you that Drupal rocks when it comes to SEO. Google luvs drupal!

That's pretty much how I do it - WP for a blog, drupal for anything else. That said, I'm building a site now that will only have a handful of non-blog pages and then will have a blog at some point and WP is, I think, a little easier for the totally non-tech writer to tap into. I could be wrong though (that's why I used WP - I thought that if the person in question even saw the drupal admin interface it would freak him out).