Tuesday, August 16, 2011

There is much ado and plenty of misinformation all over the Internet about how to really profile PHP or analyze its performance.

Rebuttals to much of that misinformation are all over the Internet, too.

How to actually set yourself up to properly profile PHP? Not so much...

So, here's an attempt to provide a step-by-step process, at least for RHEL 5.7.

If you run something else, you'll have to improvise a bit.

It's not like I really picked RHEL, so that probably deserves some background...

I'm mostly an old-school compile-from-source guy for Apache, MySQL and PHP, so I get what I want.

I used to use Red Hat/Fedora for the base Linux / desktop, and then download source for Apache, PHP, MySQL.

I switched to Gentoo a few years ago, and admit I pretty much fail to update very often, but whatever.

I only have to maintain a handful of boxes, and update when there is a real need.

In my workplace, the sysadmin is a stock RHEL guy, but then he has to maintain hundreds of servers with Puppet and monitor them with Nagios and...

Well, let's just say I completely understand that need.

And he HAS opened up to Dag's repos and a UIC or ULC or somesuch to get something I needed.

Alas, when you want to do serious debugging, profiling, or performance analysis, that's not good enough.

So I recently embarked on a task to build a "debug" desktop with the following goals:

have debug builds of PHP, Apache, and MySQL

have Xdebug

have KCachegrind / valgrind / callgrind to visualize bottlenecks

this box is used for debug build, nothing else, no other users but me

stray no further from our stock build than I have to

Of course, just installing the GUI strays pretty far, and I let RHEL push me into GNOME rather than KDE. Not sure that was wise, but there it is.

Still, I tried to have minimal impact on the build.

Testing Experts/Pundits:
Note that I'll exit X windows once I'm ready to really test, and then fire it up again to examine the results. If having the X binaries on the box screws up my tests, so be it. I'm already replacing the REAL binaries of the code with these debug-enhanced ones anyway, so at that level, it's not "the same" anyway. Having a separate desktop with all these tools on it just didn't seem practical in my case. Your needs may vary.

I first attempted to get the debug builds of at least PHP, and hopefully Apache and MySQL, just in case.

Everything seemed to work with yum, but I spun my wheels for hours checking the phpinfo() configure line for --disable-debug and trusting what it said about the debug build.

It turns out that the RHEL debug builds are not built into the same binary files the way ./configure --enable-debug would do it. Instead, the debugging symbols are installed as separate files under /usr/lib/debug, and tools like gdb and valgrind just "know" to look in there and load those debug symbols on top of the non-debug binaries in the usual places. That covers the all-important (to me) "httpd -X" whenever I run it under one of those tools.

That kind of blew my mind, but whatever, you know?

And why the httpd startup scripts and PHP don't know to look there is beyond me, but I suppose if I really need to force them to do it, I could find a way... I doubt I'll need that, as you'll see shortly.
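
If you want to repeat the check that had me spinning my wheels, here's a minimal sketch. It assumes the php command-line binary is installed (I believe that comes from the php53-cli package, but treat that as an assumption) and that it was built the same way as the Apache module. The function name debug_build_flag is made up for this example; it just pulls the "Debug Build" row out of the "Name => value" text that php -i prints.

```shell
# Hypothetical helper: read `php -i` style text on stdin and print the
# value of the "Debug Build" row (CLI phpinfo lines look like "Name => value").
debug_build_flag() {
    awk -F' => ' '$1 == "Debug Build" { print $2; exit }'
}
```

Run it like this:
php -i | debug_build_flag

On the stock packages it should keep answering no, with or without the debuginfo packages, because those packages never touch the binary itself.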

Anyway, in order to actually get these debug builds, you have to do something like this:
yum install --enablerepo=rhel-debuginfo httpd-debuginfo mysql-debuginfo php53-debuginfo

Note that this installs only the debugging symbol binaries into /usr/lib/debug. If you actually want to install the software itself, you still need to do these if you didn't already:
yum install httpd mysql php53
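
To see whether the symbols really landed, something like the sketch below may help. It assumes the usual layout where the symbols for /some/binary end up in /usr/lib/debug/some/binary.debug. Both function names are made up, and the optional second argument (an alternate root directory) exists only so the check can be tried out against a scratch directory.

```shell
# Hypothetical helper: map an installed binary to the place its separate
# debug symbols are assumed to live (/usr/lib/debug/<original path>.debug).
debuginfo_path() {
    printf '/usr/lib/debug%s.debug\n' "$1"
}

# Hypothetical helper: succeed if that symbol file exists.
# $1 = binary path, $2 = optional alternate root (for dry runs only).
has_debuginfo() {
    [ -f "${2:-}$(debuginfo_path "$1")" ]
}
```

For example:
has_debuginfo /usr/sbin/httpd && echo "httpd symbols are installed"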

You also want to install valgrind, which basically allows you to run your binaries under debugging conditions with a whole suite of tools. We'll look at just one tool, callgrind, because a) it's the only one I know so far and b) it's the one that makes pretty pictures that make it blindingly obvious where your bottlenecks are in your code.
yum install valgrind

Just to be sure you installed Apache and PHP, set up your usual PHP script for testing purposes:
echo "<?php phpinfo();?>" > /var/www/html/index.php

(Or edit /var/www/html/index.php with your favorite editor.)

Surf to http://localhost/ and see the PHP status page with lots of blue tables telling you everything you ever needed to know about your PHP install.
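
If you'd rather check from the shell than from a browser, here's a rough sketch. It assumes curl is installed and that the phpinfo() page contains the text "PHP Version", which it does on every install I've seen. The function name is_phpinfo_page is made up. As a bonus, it fails if Apache is handing back the raw PHP source instead of running it.

```shell
# Hypothetical helper: read a page on stdin and succeed only if it looks
# like rendered phpinfo() output rather than unparsed PHP source.
is_phpinfo_page() {
    grep -q 'PHP Version'
}
```

For example:
curl -s http://localhost/ | is_phpinfo_page && echo "PHP is being executed"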

Now to prove you can generate a callgrind output file, which is the raw material for a visual representation of where your code spends all its time:
/etc/init.d/httpd stop
valgrind --tool=callgrind --dump-instr=yes -v /usr/sbin/httpd -X

(Someday, RHEL will catch up to the rest of the world and that httpd will change to apache or apache2...)

Note that you are now running a single httpd instance/child/thread with debugging on full blast, instead of your usual high-performance httpd process.

You definitely do not want to do this on a production box, or even a shared dev box.

You need to be doing this on a sandbox all your own.
Really.
If you can't figure out why, stop reading now and go do something else.

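Load the page in your browser a time or two so there is something to measure, then shut httpd down gently. If you kill the process outright after it has built up a HUGE pile of data, it just dies before it writes that data out. Don't do that.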

Back in the shell where you did that valgrind command, it will have exited. It will also have dumped a file whose name starts with callgrind.out. and ends with the PID (process ID) of the httpd process that was started.

You can just safely assume it's a random number, if you are not familiar with process IDs. Actually, if you're not familiar with process IDs, this article is way too advanced for you. Oh well.

You can open that file up in an editor if you like.

It won't make much sense, really, but it's kind of cool to see it.

Don't change anything in that file, for goodness' sake. We've worked very hard to build it.
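
Since the PID in the file name changes on every run, a small helper to grab the newest dump can save some squinting. This is only a sketch: the function name latest_callgrind_out is made up, and it assumes the dumps sit in the directory you pass in (or the current directory if you pass nothing).

```shell
# Hypothetical helper: print the most recently modified callgrind.out.* file
# in the given directory (default: current directory); print nothing if none.
latest_callgrind_out() {
    # ls -t sorts newest first; errors are hidden when nothing matches.
    ls -t "${1:-.}"/callgrind.out.* 2>/dev/null | head -n 1
}
```

The callgrind_annotate tool that ships with valgrind will then give you a plain text summary, no GUI required:
callgrind_annotate "$(latest_callgrind_out)" | head -n 30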

Now at this point, with EPEL installed, I pretty much opened myself up to straying very far from RHEL, but all I want is kghostview so KCachegrind can export pretty images:
yum install kghostview

Woof.

I'm pondering if I should now uninstall the EPEL stuff, just to stay on the straight and narrow, and let kghostview lag, or try to remember to only use EPEL in dire circumstances...

Well, anyway, you probably don't care about that bit.

You now should have a very powerful tool suite to really find the bottlenecks in your PHP code, instead of trying some Voodoo Analysis to "optimize" things like changing all "" to '' or using a single (un-indexed) query instead of two (indexed) queries, because running more queries is slower, or...

I know, I promised not to go there in the first paragraph, so I'll shut up now.