I’m in love with GitHub and I don’t mind paying a few bucks a month
to host private code repositories there. It’s not without its issues though,
and I’ve often had trouble getting others to collaborate with me via GitHub
for one reason or another. Desiring more control, I was thrilled when
GitHub:FI was announced. Unfortunately, the licensing fees are staggering
and put the service far out of my reach. Recently, my buddy Marcus Whitney
had been messing around with Redmine and my interest was piqued by his results.
I decided to jump in head first and try to build a reliable GitHub:FI
alternative using an Ubuntu VPS, Git, Gitosis, Gmail (or Google Apps
for Domains), and Bitnami’s Redmine Stack.

The latest stable release of Redmine even includes the much anticipated Git branch
support. With a couple of plugins, Redmine can perform even more neat
tricks, like sending outgoing email through Google’s SMTP servers, and
managing your Gitosis repositories and public keys automatically. I’ve
only just begun to scratch the surface of functionality and I can already
sense the power and flexibility of Redmine is going to dramatically increase
my productivity and calm sense of well-being.

There’s no reason even a fairly complex system like a Redmine installation
can’t play nice with other system services, although it takes some
planning. Much misinformation exists on the Web about running Redmine in
particular, so we must tread very lightly to sidestep the pitfalls of a
shaky system architecture. I’m of the mind that you should fully
understand the intricacies of a piece of software before you attempt
to run it – or at least understand clearly that you do not need to
worry yourself with certain details – and hopefully when this setup
is complete we’ll understand not only what we’ve done, but why
we’ve done it. That’s the key to building a solid system and is of
the highest priority.

Different elements of the system should be reused where appropriate as
well. We want to only use the system-wide versions of things like
Apache2, since they’ll probably be used for other services too and it
makes no sense to double your load when you don’t have to. By the same
token, we don’t need to install services which are easily outsourced
to existing services or the cloud (like Gmail). In reality,
our Redmine server might be moonlighting as a DNS server, firewall,
development box – whatever – and we need to respect that. A general
policy of software and user isolation along with well controlled shared
resource management will ensure we aren’t wasting our time on a
server that will need to be rebuilt later to play some additional role.

A “stack” is a complete software system packaged so that, once installed,
it operates in a self-contained environment that will not affect, or be
affected by, the rest of the host system. You generally don’t have to know
much more than how to run an installer script and you can be the proud admin
of your very own LAMP/Rails/etc server!

Stacks as a concept are almost prehistoric yet they remain an effective means
of reliable package installation. The more complex the package, the more
appealing a stack distribution becomes. The author may finely tune the most
fragile system and distribute it as a stack without having to worry about
it breaking in most cases.

A Redmine stack is a particularly appealing alternative to fighting Ubuntu’s
awkward Rails packages that are notoriously difficult to use and maintain.
Bitnami’s Redmine stack was chosen almost randomly, and I’m sure there are
other options out there. Bitnami has a great reputation though, and their
conventions are very clean and logical in my opinion.

To be honest, I’m a total Ruby-phobe. Not bringing that up would be lying
through omission, and it’s a major reason I’ve decided to go with a stack
– it’s a turn-key solution to a problem I (care to) know extremely little
about otherwise. If you’re not sold on the idea by my humility, feel
free to substitute your own Redmine installation process. Most of the
general setup steps will be similar anyway.

While this setup procedure should be fairly universal, there are a couple of
important things that must be in place before continuing. Firstly it’s assumed
that you have an Ubuntu server somewhere and a local development machine, which
is capable of reaching the Ubuntu server using a fully qualified domain name
(FQDN). You don’t have to have working DNS entries necessarily – you can get
away with using your /etc/hosts files. On both of these machines you must
have access to a non-root user, and that user must have sudo access on the
server.

If for some reason the output of hostname does not give a FQDN, you might
have to adjust the contents of your /etc/hosts file and/or your
/etc/hostname file. Your host must also be accessible to itself, so a quick
ping is in order.
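A rough way to run that check from a shell is sketched below. The is_fqdn helper is a hypothetical name introduced here, and it only looks for a dot inside the name, which is a crude stand-in for a real DNS lookup.

```shell
# is_fqdn is a hypothetical helper: it only checks that the name contains a
# dot with something on both sides, a rough proxy for "fully qualified".
is_fqdn() {
    case "$1" in
        ?*.?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Fall back to a placeholder if the hostname command is unavailable.
name="$(hostname 2>/dev/null || echo unknown)"

if is_fqdn "$name"; then
    echo "$name looks fully qualified"
else
    echo "$name is not fully qualified; check /etc/hosts and /etc/hostname"
fi
```

If the name passes, following up with something like ping -c 1 "$name" confirms the host can actually reach itself.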

You may need to tell the server to take note of the changes in /etc/hostname
by running sudo hostname -F /etc/hostname. It’s pretty important to get
your FQDN set up correctly before you continue the installation, so double check
everything now.

At this point you should also be able to ssh into the server from your local
development machine. If you can’t ssh yet but are logged in via the console,
make sure the openssh-server package is installed by running sudo apt-get
install openssh-server. If you’re using a FQDN that doesn’t have real-world
DNS entries, add the FQDN to your development machine’s /etc/hosts file:

The tee command is used because root access is required to change the
/etc/hosts file. Output redirection is performed by your own unprivileged
shell rather than by sudo, so redirecting to a protected file will still
result in a permissions error. The tee command simply copies STDIN to a
file, appending the contents of STDIN if the -a option is passed. By default,
the contents are also dumped to STDOUT, so we redirect that to /dev/null to
get rid of it.
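The behaviour of tee is easy to try on a scratch file before touching the real thing. This is only a sketch: the IP address and host name are placeholders, and the temporary file stands in for /etc/hosts.

```shell
# Work on a scratch copy so nothing on the real system is touched.
hosts_file="$(mktemp)"
echo "127.0.0.1 localhost" > "$hosts_file"

# 203.0.113.10 and redmine.example.com are placeholder values.
entry="203.0.113.10 redmine.example.com"

# tee -a appends its STDIN to the file; its copy on STDOUT is discarded.
echo "$entry" | tee -a "$hosts_file" > /dev/null

# Against the real file, only tee needs root, so it is the part run with sudo:
#   echo "$entry" | sudo tee -a /etc/hosts > /dev/null

cat "$hosts_file"
```

The existing line is preserved and the new entry lands at the end of the file, which is exactly what we want for /etc/hosts.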

Finally, you must ensure that your non-root user on the server can gain access
to other non-root user accounts. Some Ubuntu virtual images come with more
restrictive sudoers configurations that only let your user sudo to root.
As a simple test, try any random command as a random system user (who has a
shell defined). For example, the www-data user is a decent test case:

If access is denied, you must change the sudoers file using the command
sudo visudo. A common fix is to append %admin ALL=(ALL) ALL to the end
of the file, and then add your non-root user to the admin group if not
already a member.

The whole idea behind using Ubuntu is to take advantage of its conveniences,
like the package system. The Redmine stack comes with its own version of
many standard system services which we’re going to disable in favor of the
system versions. That way, we don’t have multiple instances of the same
service consuming precious resources, and we move further towards a generic
system that doesn’t rely on the stack for anything but the most specialized
of services.

The standard Ubuntu services are easy to install using the various package
management tools. Chances are, you’ve got many of these packages already
installed, but there are so many different configurations out there, it’s
easiest to just try to install them all in one go. Don’t forget to update
your Apt sources:

Access Control Lists (ACLs) are an often overlooked security feature that
allow additional, more specific access permissions to be applied to a file
or directory. They can control automatic file ownership settings inherited
from parent paths as well, which is crucial in situations where permissions
must be maintained regardless of which user is operating on a shared directory.
ACL support in Ubuntu is dependent upon the acl package which we installed
earlier.

Simply installing the acl package does not enable ACLs for all
disks. Hard disk partitions must be configured in /etc/fstab to load the
ACL option and then must be remounted for the changes to take effect. Assuming
/opt is on the same partition as the root filesystem, add the acl
option to its /etc/fstab entry (using sudo):
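As a sketch of that edit, the snippet below adds the option to a scratch copy rather than the live file. The device name, filesystem type, and existing options are made up for illustration, so compare against your own /etc/fstab before changing anything.

```shell
# Scratch copy with one made-up root entry; a real /etc/fstab will differ.
fstab="$(mktemp)"
echo "UUID=placeholder / ext3 errors=remount-ro 0 1" > "$fstab"

# Append ",acl" to the options (4th field) of the entry mounted at /,
# unless the option is already present.
awk '$2 == "/" && $4 !~ /(^|,)acl(,|$)/ { $4 = $4 ",acl" } { print }' \
    "$fstab" > "$fstab.new"

cat "$fstab.new"
```

Note that awk rewrites the modified line with single spaces between fields, so on a real file it is often simpler to make the same one-word change by hand in an editor opened with sudo.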

A reboot is required if ACLs were not already enabled and /opt is indeed
on the root filesystem. If /opt is on a different (non-root) partition,
you may get away with remounting it using mount -o remount,acl /opt or
similar. Once you’ve got it working, the following command should give similar
output:

ACLs aren’t absolutely required by this setup, but it’s good practice and
will probably save us a few headaches down the road, so just bite the bullet
and load it up now. If you didn’t already see this coming, we’ll be using ACL
features to let Redmine talk to Git repositories without a file ownership
nightmare a little later.

We installed the python-setuptools package mere moments ago, which
installs the very popular easy_install Python package installer. While
it’s a great little dude to have around, I much prefer a newcomer to
the Python package tool scene, Pip. It’s one Python package I’m comfortable
installing globally, so we’ll use easy_install only once, to fetch its own
successor and install it system-wide:

Now we can use the pip command to install any Python package we want,
which is a much more well-advised notion than attempting to use the Ubuntu
package manager. Interestingly, the only thing the Apt system is really bad at
is Python and Ruby package management, and this project deals with both.

On the subject of Python, it would be a great time to familiarize yourself
with the Virtualenv Python package, and how it’s used to create self contained
Python environments. Essentially, the virtualenv <path> command sets up
a tiny Python distribution with its own package repository and <path>/bin
directory containing special binaries. The special binaries will automatically
bootstrap the intended Python environment upon execution, doing away with a
lot of path and environment variable nonsense that used to be required when
attempting advanced deployments. It really is a life saver if you run more
than a single Python app, and the idea of not impacting the system-wide Python
installation is in line with the goals we’ve stated.

Install Virtualenv globally, which will be used when installing Gitosis:

Bitnami really makes it simple to get up and running with Redmine. If we didn’t
care about running “two of everything”, it could be installed and configured in
less than an hour with no sweat. Most of the extra steps we’re going to perform
will actually disable many parts of the carefully architected Bitnami system,
pointing the Redmine install to our standard system services instead.

A dedicated user is not technically required to run the Bitnami Redmine stack.
In fact, I first ran it as www-data and it worked like a charm. Problems
arose when trying to integrate Gitosis, and it became obvious that it’s just
easier and cleaner to go ahead and set up a new system user for the redmine
application. This configuration assumes the Redmine stack, and thus, the
dedicated Redmine user’s home directory, will reside at /opt/redmine. Feel
free to adjust this path at your whim. The obvious choice for the user name
was redmine, but you can change that if you must as well:

The Bitnami Stack system is super easy to get running. All you have to do is
download and run the installer and away you go. Before we get started,
ensure that the system Apache2 and MySQL services have been stopped.
That way the default values will be used by the Bitnami installer, which
are closer to the values we’ll ultimately want.

Download and run the Bitnami Redmine Stack installer. Choose to install into
/opt/redmine, or whichever folder you chose for the redmine user’s home.
Do not choose to set up an SMTP server at this time, because we will handle that
with Gmail later. Be sure to use sudo -u redmine to run the installer, so
the file ownership will be correct.

Bitnami provides a completely self contained Redmine stack with tightly coupled
components that require a very specific (but minimal) shell environment to work
together properly. For that reason, Bitnami includes a use_redmine script,
which sets up the correct environment and runs a shell that is guaranteed
to work with the Bitnami system. The use_redmine script should be run once
before any Redmine management task. If you invoke a management script from
an environment other than the one set up by use_redmine, you run the risk
of corrupting your system. All Redmine management should also be done as the
redmine user for obvious reasons. The use_redmine script is convenient in
this regard, because you only have to run sudo once per administration
session:

The Redmine stack services can be managed using the ~redmine/ctlscript.sh
script. It follows the familiar ctlscript.sh <start|stop|restart|status>
<service> syntax and, of course, must be run after use_redmine.

Firstly, stop all Bitnami stack services that are currently running. We’re
going to completely reconfigure Redmine anyway, and we won’t be needing any
of the other services from this point on.

Warning

These and all management commands must be run from within a Bitnami
management shell as the redmine user. Use the following command if you
don’t understand and want to be safe: sudo -H -u redmine
/opt/redmine/use_redmine.

Redmine uses YAML configuration files, which is a fairly common Rails thing
from what I understand. We need to point the database configuration towards
the system MySQL service with the default settings. Edit the file
~redmine/apps/redmine/config/database.yml and alter the
production section to reflect the system settings:
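A production section along these lines is a reasonable starting point. Treat it as a sketch: the database name, user, and password are placeholders, and the socket path is the usual Ubuntu MySQL location rather than something confirmed for your server. It is written to a scratch file here so it can be reviewed before being copied into place.

```shell
# Write a sample production section to a scratch file for review.
# Database name, user, and password are placeholders, and the socket path is
# the usual Ubuntu MySQL location; confirm all of them on your own system.
db_yml="$(mktemp)"
cat > "$db_yml" <<'EOF'
production:
  adapter: mysql
  database: redmine_production
  host: localhost
  username: redmine
  password: change-me
  socket: /var/run/mysqld/mysqld.sock
EOF

cat "$db_yml"
```

The database and user named in the file must already exist on the system MySQL server before the migration will succeed.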

Now we can migrate the database, installing the Redmine application data on the
system MySQL server. Rails apps use the rake command for these types of
tasks, which I’ll admit is a pretty cool built-in tool. Rails also supports
the concept of deployment environments, but we can ignore that for this
setup and only pay attention to the production environment.

Note

All rake commands must be run from within the application’s root
folder, which is ~redmine/apps/redmine/ in this case.

The Ubuntu Apache2 server has more restrictive proxy security settings than
is assumed by the Bitnami Redmine stack. We need to edit
~redmine/apps/redmine/conf/redmine.conf and add the Order and
Allow setting keys to permit Apache2 to access the Mongrel cluster:

You don’t need to know much about Mongrel or clusters, which is another perk
of an application stack. The default Bitnami setup runs two instances of the
application and can load balance between them, which is totally fine for most
use-cases. Just change the permissions and forget the word “mongrel” forever.

The Apache2 server only requires a couple of additional modules to be enabled
in order to play nice with Redmine. They’re all of the mod_proxy family
and come in the default Apache2 package. The Redmine Apache2 configuration
file we just edited will need to be “included” in the main Apache2 config,
which we’ll accomplish by creating a simple file in /etc/apache2/conf.d/:

Log into Redmine at http://<server-fqdn>/redmine/ using the administrative
user which the installer created, in my case, admin. You will be greeted with
a screen prompting you to load the Redmine default configuration. This is highly
recommended, so do it now. You should probably at least change the URL setting
for the site as well.

Redmine occasionally needs to send users email to alert them to project changes
and other events. We chose not to set up an SMTP server when installing so that
we could hopefully interface with Gmail, avoiding yet another redundant service
running on our machine. It’s assumed that we have a Gmail or Google Apps for
Domains account specifically reserved for Redmine – ideally
redmine@<server-fqdn>.

Google’s SMTP servers require connections to be encrypted with TLS. The
Ruby libraries on which Redmine is built do not support this
unfortunately, so we’ll need a Redmine plugin to get it working.
Specifically, the ActionMailer class will be extended to support an optional
TLS mode. The plugin that provides this extension is (creatively)
called action_mailer_optional_tls.

Redmine (or is it Rails?) comes with a plugin installation script that makes
it trivial to try out new extensions. Just like rake commands,
the script/plugin script must be run from within the application’s root
folder – ~redmine/apps/redmine/ in our case. It accepts an install
sub-command and can install plugins directly from Git repositories:

Since we opted out of the SMTP configuration during installation, we probably
won’t have a ~redmine/apps/redmine/config/email.yml file, which we’ll need
to create. It’s another YAML configuration file that should be fairly self
explanatory. Create ~redmine/apps/redmine/config/email.yml (as the
redmine user) and add in your own Gmail settings:
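The file might look roughly like the following. This is a sketch under assumptions: the account name and password are placeholders, and the port and authentication values are the ones commonly used with Gmail rather than anything taken from the installer. It is written to a scratch file so it can be checked before being copied to config/email.yml.

```shell
# Write a sample email configuration to a scratch file for review.
# The account and password are placeholders; port 587 and :plain are the
# values commonly used with Gmail and should be confirmed for your account.
mail_yml="$(mktemp)"
cat > "$mail_yml" <<'EOF'
production:
  delivery_method: :smtp
  smtp_settings:
    tls: true
    address: smtp.gmail.com
    port: 587
    domain: gmail.com
    authentication: :plain
    user_name: redmine@gmail.com
    password: change-me
EOF

cat "$mail_yml"
```

Since the file holds a password, it is worth keeping it readable only by the redmine user once it is in place.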

The tls: true setting is only allowed after the action_mailer_optional_tls
plugin has been installed. If you’re using Google Apps for Domains, you would
substitute your domain for gmail.com in the domain and user_name
settings.

Gitosis is a great little piece of software that manages Git repository
access, magically storing its own configuration in a Git repository as well.
Technically, Gitosis relies on a single system user for access and manages
more fine grained access policies using public key management. It’s a very
solid system, from which GitHub itself is derived, and doesn’t clutter the
system up with unnecessary users or daemons.

We’ll be using a Redmine plugin to manage our Gitosis access directly through
the Redmine interface. Redmine users may manage their own public keys and they
are automatically added to Gitosis’s access control system for project
repositories of which they are members. This is similar to the way public
keys work in GitHub so it should sound vaguely familiar.

Gitosis absolutely requires a system user – specifically one with a real
home directory somewhere. Disabling the password makes sure only those with
the Gitosis private key we’ll create will be able to gain access to the box
and change Gitosis settings. We’ll start by creating this user, just as we
did for the redmine user. Feel free to select a different home directory
or name for the user, but note these differences. Also note that Bash is the
preferred shell, since it gives us a little more control over the git
user’s environment. Without it, the Gitosis commands would not be found on
that user’s path.

Under the hood, Gitosis access is controlled with the use of private/public key
pairs. Each developer will create his own key pair, offering the public key to
the Gitosis system through SSH when a Git operation is requested. Gitosis will
check the public key against its known key list and grant or deny access
appropriately.

Naturally, the key management system must be accessible in some regard in order
to actually add developer public keys. Gitosis uses a single master key pair
in this case, which will always be allowed to manage keys and the Gitosis
configuration. It’s a good idea to generate this key from within a secure shell
on the server, and never transfer it over any network. Strict file
permissions are applied by default and should not be changed or the SSH system
may reject a perfectly good key out of (justified) paranoia.

To generate a master key pair, use the ssh-keygen command. We will be
using DSA keys, but any common scheme will work. Be sure to generate the key
as the git user so as not to totally blow up the file permissions:

Gitosis is “just Python”, so it can easily be installed using Pip. However,
it’s a good idea to sequester the Gitosis installation to minimize the risk
of impacting our host system’s Python setup. We’ll use Virtualenv to create
a standalone Python environment solely used by Gitosis.

Now any time a Python script is run from within /opt/gitosis/virtualenv/bin,
our Gitosis Python environment will be bootstrapped automatically. That means
calling ~git/virtualenv/bin/pip install <package> will install a Python
package to the Gitosis virtual environment only. With that in mind, we can
install Gitosis straight from its Git repository:

To actually force the virtual environment to take precedence over the system
environment, we can load the ~git/virtualenv/bin/activate script. It sets
up the user’s PATH environment variable to prepend the special Python
scripts, automatically overriding the system versions. Loading the script from
a ~/.bashrc file will set up the virtual environment upon login, and in our
case, immediately before looking for the Gitosis commands. We’ll need to add a
simple ~/.bashrc for the git user to get it all working:
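The effect of the activate script on command lookup can be seen with a small stand-in. In this sketch a scratch directory plays the part of ~git/virtualenv/bin and a dummy script plays the part of a Gitosis command; neither is part of the real installation.

```shell
# Scratch directory standing in for ~git/virtualenv/bin.
venv_bin="$(mktemp -d)"
printf '#!/bin/sh\necho "gitosis-serve (virtualenv copy)"\n' > "$venv_bin/gitosis-serve"
chmod +x "$venv_bin/gitosis-serve"

# This is the essence of what the activate script does to PATH.
PATH="$venv_bin:$PATH"
export PATH

# The shell now resolves the command to the prepended directory first.
command -v gitosis-serve
```

In the real setup the git user’s ~/.bashrc only needs to source the activate script, which performs the same PATH change for every login.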

Many sources would suggest using a command similar to sudo -H -u git
~git/virtualenv/bin/gitosis-init < ~git/.ssh/id_dsa.pub. If your file
permissions are correct, this will not work. This is the same IO
redirection problem that required the use of tee earlier, and the
cat command must be used with sudo to successfully read and output
the git user’s public key.

Gitosis is now set up to manage keys and configuration through the special
gitosis-admin repository. Assuming they have loaded the master private key,
any local or remote user should now be able to clone your Gitosis repository
using git clone git@<server-fqdn>:gitosis-admin.git.

Public key management and automatic project-to-repository linking are
provided by the redmine-gitosis plugin. Just like before, plugin installation
must be done from within a Redmine management shell. The redmine-gitosis
plugin uses the lockfile and inifile Gems, so we’ll need to install
those as well:

Unfortunately, the redmine-gitosis plugin is a definite work in progress (read:
hack) and needs to be slightly modified to work in any case. In the future
I’d love to have a fork available that will automatically work in this setup,
but that presents a sort of chicken-vs-egg problem, doesn’t it?

Firstly, the plugin is actually configured by editing a Ruby source file
rather than a YAML configuration file. We must edit
~redmine/apps/redmine/vendor/plugins/redmine-gitosis/lib/gitosis.rb and
change the GITOSIS_URI and GITOSIS_BASE_PATH values to reflect our
setup. The GIT_SSH environment variable needs to be slightly tweaked as
well:

Bitnami stacks are great! The only thing I can think of to make them even
better would be if they linked their software differently at compile time.
It’s pretty technical (and very boring) but Bitnami could either statically
link the more common libraries or at least specify an rpath for the
resulting dynamically linked binaries. That way the infamous
LD_LIBRARY_PATH environment variable could be avoided entirely, promoting
responsible Linux software packaging. In the meantime, we’ll have to edit
~redmine/apps/redmine/vendor/plugins/redmine-gitosis/extra/ssh_with_identity_file.sh
to change the LD_LIBRARY_PATH temporarily before calling ssh to add the
private key. This prevents a linking error that would otherwise cause all
public keys to be rejected. The modified ssh_with_identity_file.sh should
look like the following:

SSH maintains a list of “known hosts” for each user, which that user has
determined should be trusted. We should create a quick SSH session to the
FQDN of the server as the redmine user, which will prompt SSH to add
the hostname to the list of known hosts. Redmine would not be able
to connect regardless of private key if the host is not previously known
to SSH, specifically for the redmine user. After the password prompt is
displayed, simply press Ctrl-C to cancel the session:

bash-4.0$ ssh <server-fqdn>
The authenticity of host '<server-fqdn> (127.0.1.1)' can't be established.
RSA key fingerprint is d1:fb:2a:0e:53:4f:27:64:63:66:a5:09:28:bd:20:86.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '<server-fqdn>' (RSA) to the list of known hosts.
redmine@<server-fqdn>'s password:
bash-4.0$

Unlike the action_mailer_optional_tls plugin, redmine-gitosis does use the
database internally. For this reason we need to “migrate” the plugin before
attempting to use it. The rake command is used again, just like when
migrating the initial Redmine application database. After the migration is
complete, it’s safe to go ahead and restart Redmine since we’re done altering
its configuration:

It was mentioned that the master Gitosis key pair is used to manage the Gitosis
configuration and public keys. Redmine will need access to the private key
in order to manage the public keys for us, but we have been very careful to
prevent access to the master key from anyone except the git user! This is
a perfect use-case for an ACL.

For some reason, the redmine-gitosis plugin comes with a default private key,
which should be removed:

We need to install an ACL that will allow the redmine user to read the
private key. ACLs are complicated but essentially we will add a rule that
only grants the redmine user read-only access to the actual key file
and the parent directory, which is also required. The mask parameter
must also be modified so the effective ACL permissions will be what we
are expecting:

The redmine user now has access to the Gitosis private key, but there’s
one last ACL we need to implement before we’re done. The redmine-gitosis
plugin occasionally manipulates the Gitosis repositories directly, using
the path defined as GITOSIS_BASE_PATH above. To ensure the plugin can
successfully access the repositories, we’ve got to add a “default” ACL to
the repositories directory. In the future, any repositories that are created
will inherit the “default” ACL, granting the plugin the required access:
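Default ACL inheritance can be tried out safely on a scratch directory first. This sketch uses the current user as a stand-in for the redmine user and a temporary directory in place of the repositories directory, and it skips itself if setfacl is missing or the filesystem does not support ACLs.

```shell
# Scratch directory standing in for the Gitosis repositories directory.
repos="$(mktemp -d)"
me="$(id -u)"   # the current user stands in for the redmine user

if command -v setfacl >/dev/null 2>&1 \
    && setfacl -m "u:$me:rx" "$repos" 2>/dev/null \
    && setfacl -d -m "u:$me:rx" "$repos" 2>/dev/null; then
    # A directory created afterwards picks up the default entries.
    mkdir "$repos/example.git"
    getfacl "$repos/example.git" 2>/dev/null
    acl_demo="applied"
else
    echo "setfacl unavailable or ACLs unsupported on this filesystem"
    acl_demo="skipped"
fi
```

When the ACLs apply, the getfacl listing for the new subdirectory shows a user entry for the stand-in user that was never set on it directly, which is the inheritance the plugin relies on. On the server the same two setfacl calls would name the redmine user and the real repositories path, run with sudo.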

The redmine-gitosis plugin should now have all the permissions it needs
to access the Gitosis repositories, without unnecessarily exposing them to
other users. Log into Redmine using the administrator user and verify that
the redmine-gitosis plugin is active at
http://<server-fqdn>/redmine/admin/plugins.

Believe it or not, our GitHub clone is finished! Interacting with Redmine
should be very familiar if you’ve used GitHub. To get started, we’ll need
a Redmine project and at least one project member. Create a project at
http://<server-fqdn>/redmine/projects/new/, noting the unique identifier.

The redmine-gitosis plugin will use the project identifier to determine the
path to each project’s repository. A project whose ID is example will be
available at git@<server-fqdn>:example.git. To actually create the
project’s repository, a Redmine user (with a valid key pair) must push a
branch to the Gitosis server. Create a developer user at
http://<server-fqdn>/redmine/users/new who will later create our repo.

Add your new developer user as a member of the project at
http://<server-fqdn>/redmine/projects/<project-id>/settings/members/,
granting Git access to the project’s repository.

Now we can log out of the administrator account, since it’s really only
useful for project and user creation from now on. To enable your developer user
to push to the Gitosis repositories, a public key must be assigned. Log in
to Redmine as your new developer user and navigate to
http://<server-fqdn>/redmine/my/public_keys to add a public key.

To generate a key for use from a development machine, run ssh-keygen for
your local user exactly as we did when setting up Gitosis, except without the
sudo. We need to paste the public key into the Redmine interface, naming
the key appropriately. The general naming convention is
<local-user>@<local-fqdn>. When successfully added, the new public key shows
up in the key list.

Now our local development user should be able to create the project’s Git
repository from a development machine. It’s exactly the same process used
by GitHub, which is super convenient since I’m not sure I have the brain
space left to remember another command sequence. Once we push a branch to
the Gitosis server, the “Repository” tab in Redmine will be populated with
our brand new Git repo.
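For anyone who has not done it in a while, the sequence looks roughly like this. The sketch runs entirely in a scratch directory: a local bare repository stands in for git@<server-fqdn>:example.git, and the commit identity is a placeholder.

```shell
# Scratch area; the bare repository stands in for git@<server-fqdn>:example.git.
work="$(mktemp -d)"
remote="$work/example.git"
git init --quiet --bare "$remote"

(
    mkdir "$work/example" && cd "$work/example" || exit 1
    git init --quiet
    echo "example project" > README
    git add README
    # Identity is passed inline so the sketch does not depend on global config.
    git -c user.name="Dev" -c user.email="dev@example.com" \
        commit --quiet -m "Initial commit"
    git remote add origin "$remote"
    # Push whatever the local branch is called to master on the remote.
    git push --quiet origin HEAD:refs/heads/master
)

# Print the commit that master now points to on the "server" side.
git --git-dir="$remote" rev-parse --verify --quiet refs/heads/master
```

Against the real server, the only change is the remote: use git remote add origin git@<server-fqdn>:example.git with the project’s own identifier, then push as usual.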

Redmine already includes great SCM tools like diff views and tracker
integration, just like GitHub. Git commit users are mapped to Redmine users
automatically as well so the interface is often even more intuitive than
GitHub’s when you’ve accidentally borked a commit or merge.

It’s pretty damn close, but – even with Gitosis integration – Redmine is
not GitHub. The feature that’s most likely to be missed is the per-user
repository configuration. It might be a sizable undertaking, but this could
presumably be implemented in another Redmine plugin. Pull requests, hook
management, and pretty graphs are probably next on the list, but should be
less complicated to build.

Regardless, Redmine is obviously exciting and it’s my hope that more people
get involved in the development. I’ve already brushed up on my Ruby basics
to tweak a few things so – who knows – I might even try my hand at putting
together a plugin or two of my own. In the meantime, I’ve got more than
enough functionality (and the promise of flexibility) to start using Redmine
full-time in my development work.

I have a check written out to me from a local Nashville company, drawn on their Bank of America account. I no longer have a personal BOA account (thankfully) but I’ve never had a problem cashing a BOA check at a BOA branch, since that’s a pretty standard transaction. Regions is notorious in my mind for charging $5 per-check if you don’t have a Regions account, which they upped to $7 about a month ago. This whole concept of charging a fee to cash checks drawn on their customers’ accounts is fairly new, and many bank customers don’t understand that their own checks are potentially costing their payee money. I for one would be enraged if I found out my bank was stealing money intended for the person to whom I wrote a check, but that’s just me.

Back to my check situation: I walked into a BOA branch trying to cash this paycheck. This wasn’t my first rodeo so I immediately dropped my right thumbprint on the check front and slid my license under the glass. The teller asked if I had an account and I said no, but mentioned that I was thinking of getting an account somewhere soon (which was the case at that moment.) His eyes lit up at the opportunity to nab a new-account commission as he walked me through the amazing benefits of having my very own BOA accounts – personal AND business. After all, why should I be operating as a measly independent contractor/DBA/sole-proprietorship when I could become "a real business"? I thanked him for the advice, accepting the business card on which he had written "BUSINESS CHECKING FREE!"

What was I there for again? Oh yeah – cashing checks. The teller punched some keys, waited, more keys, then let out a sigh. "We’re going to have to charge you $6 for non-relationship transactions" he regretted, obviously not understanding that I do in fact have a very long, rocky "relationship" with BOA. "We just started doing that, I’m sorry. It doesn’t seem right really." "No kidding" I added, "but I know Regions has been doing that for at least a few months." "We all do it now," he confirmed. In a hurry and not wanting to press my luck, I conceded, "… just take it out of the check."

More key punching and waiting went by, then he calmly walked over to a doorman’s bell sitting on a filing cabinet and rang it loudly. "What did I win?", I laughed. However slightly amused, he straightened his face and told me he needed to ask his superior about a problem with the signature. Apparently the signee (one of the partners of the business who wrote me the check) hadn’t signed any checks recently so they couldn’t compare the signature to anything. "He’s on the signatory list, but we can’t verify the signature so I can’t cash it." "Um… want me to get him on the phone?" I asked. "That won’t be necessary, my superior can override it." I stood there waiting, looking through my iPhone for the email messages from both partners (and signatories) of the company which clearly describe the details of the check transaction since I thought those might help eventually (spoiler alert: they didn’t.) Assuming everything was going well, the teller literally started counting out my money when we both seemed to notice the woman approaching him. They spoke briefly (and quietly) before she left and he turned back to me. "She denied the override," the teller lamented. "You see, we’re on a brand new system, and the requirements for this type of transaction are more strict now. But I do have an idea," he added. I was listening. His plan was to try to cash the check on the one remaining terminal which still used the legacy system. He handed me off to another woman who ran the check through her machine. "Ooooh, yeah I should probably call this in since we can’t verify the…" I cut her off, "SIGNATURE – yeah I know. Look I’ve got these emails right here from both signatories talking about this very check, can’t you just read those real fast and not have to bother these dudes?" She looked confused. "Nevermind, call him: (646) 262…" She stopped me and said they had to call the number on file. "Fine."

The phone rang, she explained that she was from BOA and there was someone trying to cash a check they had written. No surprise there! She asked the guy to verify the amount (noting that telling him the amount would be a major breach of security, right?) After she was satisfied with the legitimacy of the transaction, she began apologizing for having to call him, framing it as some kind of account security feature – not just a bullshit hassle or another example of the breakdown of trust between banks and actual people. She might as well have ended the conversation with "You’re welcome!"

Now that we were VERY SURE that I’m not a criminal trying to commit check fraud, they could fork over my cash – or so I thought. "Do you have a second form of identification on you?" she asked. I cringed. "A credit card or ATM card will be fine." "I don’t have those," I told her, mentioning the now-ironic bank account tips I had just received from the other guy. I offered my PADI scuba certification card (a photo ID which is NOT issued by a financial institution) but was rejected. Kroger plus cards are also not valid identification apparently. I let my passport expire last year after swearing off foreign travel for the foreseeable future because of a bad European trip, so basically I was shit out of luck. "Yeah, we’re just going to need to go ahead and get a second form of ID for this transaction. Our new policy is to get those for checks over $1000," she said, doing her best Bill Lumbergh impression. Realizing then that I really wasn’t going to leave there with any cash in hand, I deadpanned "No it’s my fault – I’ll earn less money next time." "Well, we wouldn’t want to do that would we?" she laughed. "I’m actually really damn hungry right now lady, so yeah – we would want to do that. Give me my check back and I’ll just get my girlfriend to deposit it in her account." I snatched the check and walked out – one full hour and zero dollars later.

Mimicking Python Descriptors in PHP

http://xdissent.com/2010/01/15/mimicking-python-descriptors-in-php/
Sat, 16 Jan 2010 00:55:56 +0000

So you’ve been working in Python for a year and then along comes a PHP gig
that you just can’t pass up. All of a sudden you’re realizing just how much
you miss some of the familiar Pythonisms you’ve come to rely on. One such
feature that is lacking in PHP is the concept of descriptors. Fortunately,
it’s possible to pretty closely simulate their behaviour in PHP with just
a few special interfaces and classes.

Mind Blowing

For the uninitiated, there are a few great articles online that go
into great detail about what Python descriptors are, and what they can do.
Descriptors are a notoriously mind-boggling topic that has left many a
curious developer cross-eyed and drooling, and usually more confused than
when he started. But they’re simple to understand, really: They’re just object
instances posing as dynamic properties which are defined on a class,
evaluated at runtime by a method on the descriptor object itself, that are
accessible through either an object instance, or statically through a
property on the class itself!

Yeah. "Simple."

To be honest, most of the difficulty in understanding descriptors stems
from the fact that (especially in PHP) classes aren’t really thought of as
"objects" much. Classes in general are a little bit "quirky" in PHP, being
referenced as a quoted class name string half the time, an unquoted global
class name the other half, and as a special keyword the other-other half!
Only now are the more advanced class-related tools even appearing in the PHP
world, and they’re still not complete (which will really screw us later in this
article). Plus there’s just not that much information about down and dirty
class-level manipulation for PHP out there. Where the official documentation
is lacking, it’s only supplemented by helpless user comments flailing wildly
trying to understand the murky examples. It’s a real shame because ultimately,
we as developers fall victim to the catch-22 of PHP’s evolution as a language:
"Features will be added after widespread adoption of the feature." Progress
happens very slowly because developers lack the understanding (or perhaps
the vision) to take advantage of some of these features – much less
successfully lobby for improvements in the language itself. Regardless,
I’ll admit I never really even thought of classes in such a dynamic way
until I switched to Python. So, if you’ve never been exposed to Python and
its very object-like class syntax, then it’s going to be an even longer
road to enlightenment! No worries though, let’s just break it down slowly
and see what descriptors really look like: Descriptors are…

object instances…

Ok, so that means we’re dealing with an instance of a class, which will
obviously require a class definition. That’s a pretty simple concept –
define a class and create an instance of it. Somewhere along the line,
we’re going to have to define some kind of "descriptor" class, and
instantiate it. No big deal. With that in mind, let’s move on: Descriptors
are object instances…

posing as dynamic properties…

Dynamic properties are a slightly advanced concept, and usually make use of
the __get magic method (and friends), which PHP automatically calls
when you try to access a nonexistent property (member variable) on an object
instance. For example, in the following code, $obj will be an instance of
TheClass and the_property will be a normal, non-dynamic instance
property:
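A minimal sketch of such a class, using the names from the paragraph above (the default value is an assumption, picked only so there is something to print):

```php
<?php
// Define a class with one ordinary, declared instance property.
class TheClass
{
    // A normal, non-dynamic instance property (the default is illustrative).
    public $the_property = 'a plain old value';
}

// Create the object instance.
$obj = new TheClass();

// Access the declared property directly; no magic methods are involved.
echo 'The property: ' . $obj->the_property;
```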

The property is accessed as $obj->the_property. The following call will
issue a Notice level error by default, since we attempt to access
an undeclared property on $obj:

<?php
// Access a nonexistent property.
echo $obj->bogus_property;

However, if the class defines an instance method named __get, it can
intercept requests for undefined properties, and return a dynamic value.
That means we can calculate a value at runtime for what the world will
think is a normal instance property. The end product looks something like this:

<?php
// Define the class.
class DynamicClass
{
    /**
     * Returns the value of a dynamic property.
     *
     * This method will issue a 'Notice' level error if a nonexistent
     * property is requested.
     *
     * @param string $name The name of the property to get.
     *
     * @return mixed
     */
    public function __get($name)
    {
        // Check the name of the requested property.
        if ($name == 'the_property') {
            // Return the dynamic value for 'the_property' property.
            return time();
        }

        // Trigger an error if no property was found with the name.
        trigger_error(sprintf('Undefined property: %s::$%s', get_class($this), $name));
    }
}

// Create the object instance.
$obj = new DynamicClass();

// Access the object instance's property.
echo 'The property: ' . $obj->the_property;

Now every time we access the $obj->the_property property, the __get
method will be evaluated, and – specifically – the current time will be
returned.

That’s really all there is to the concept of dynamic properties. They’re
incredibly convenient though, and indispensable when you want to define
a clean and easy to understand API or standard interface to some functionality.
There are in fact even more magic methods, allowing you to control property
access in contexts other than "getting." We’ll deal with a few of them later.
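As a rough sketch of those other contexts, the hypothetical Bag class below (it is not part of the descriptor code in this article) routes setting, isset() checks and unset() calls through __set, __isset and __unset, keeping its values in a private array:

```php
<?php
// A hypothetical class whose properties all live in a private array.
class Bag
{
    private $values = array();

    // Called when reading an undeclared property.
    public function __get($name)
    {
        return array_key_exists($name, $this->values) ? $this->values[$name] : null;
    }

    // Called when writing to an undeclared property.
    public function __set($name, $value)
    {
        $this->values[$name] = $value;
    }

    // Called by isset() for undeclared properties.
    public function __isset($name)
    {
        return isset($this->values[$name]);
    }

    // Called by unset() for undeclared properties.
    public function __unset($name)
    {
        unset($this->values[$name]);
    }
}

$bag = new Bag();
$bag->color = 'green';         // Runs __set('color', 'green').
echo $bag->color;              // Runs __get('color').
var_dump(isset($bag->color));  // Runs __isset('color'): bool(true)
unset($bag->color);            // Runs __unset('color').
var_dump(isset($bag->color));  // Runs __isset('color') again: bool(false)
```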

Let’s get back to unraveling the mystery of descriptors. We know that they’re
object instances (duh), and that they themselves somehow pose as dynamic
properties. That means that, similarly to the __get method we’ve just
seen, descriptors handle access requests for undefined properties.
It’s not clear how an object instance actually does this amazing feat yet,
but let’s take it one step at a time: Descriptors are object instances
posing as dynamic properties…

which are defined on a class…

Holy crap, say what? Defined on a class? How do you define a property on
a class? Well, in the biz, we call a property defined on a class, a static
property. "Class members", "class properties", "static properties" – it’s all
the same. In fact, the PHP documentation can’t even make up its mind what
to call them. The premise is simple though. Remember that in PHP, classes are
defined globally. It might help to think of the class definition itself as
just a block of source code that creates a single global instance of a "class"
type object the first time it is run. These global "class" object instances
have their own properties and methods, plus they know how to do the
voodoo (that they do) to create a local instance of the class they represent
(so well). To declare a static class property, you just use the static
keyword in the class definition, and that property will now only be accessible
through the class itself and not through instances. The scope resolution
operator can be used to get the static property’s value:

<?php
class StaticPropertyClass
{
    // Declare the static property.
    public static $the_static_property;

    // Declare the property.
    public $the_property;
}

// Create the object instance.
$obj = new StaticPropertyClass();

// Access the object instance's property.
echo 'The property: ' . $obj->the_property;

// Access the static class property.
echo 'The class property: ' . StaticPropertyClass::$the_static_property;

// This will produce an error.
echo $obj->the_static_property;

// And so will this.
echo StaticPropertyClass::$the_property;

As you can see, class properties are completely separate from (and oblivious
to) instance properties. They represent a unique global value, associated
with the theoretical "global instance" of the "class" type object. Got it? Good.
To review, a descriptor is really just an object that happens to be stored
in a class property on some random class. It also will intercept and
specially handle requests for some sort of dynamic property, which we haven’t
really talked about yet. We do know that the value for the hypothetical
property will be…

evaluated at runtime by a method on the descriptor object itself…

That means that every time we try to get the value of the dynamic property,
the descriptor instance will run an instance method to determine the returned
value. So a descriptor has to define some kind of processing method, which
is loosely equivalent to the __get method we used earlier. The real
difference is that the PHP magic methods only automatically handle access
to dynamic properties on the exact instance on which the property is accessed.
The descriptor differs in that (somehow) it intercepts property access for
a different object, not itself.

Now our picture of a descriptor is complete, even though we don’t really
know how it’s going to work or what it’s for yet. According to what we
just discussed, we’re simply dealing with an instance of a class, with
a method that will calculate and return the dynamic value of the property
which the descriptor instance represents. And it goes a little something
like this:

<?php
// Define the descriptor class.
class DescriptorClass
{
    /**
     * Calculates and returns the dynamic value for the property
     * associated with the descriptor instance.
     *
     * @return mixed
     */
    public function getDescriptor()
    {
        // Do some special calculations.
        return time();
    }
}

// Create a descriptor.
$the_descriptor = new DescriptorClass();

// Add the descriptor to the class.
StaticPropertyClass::$the_static_property = $the_descriptor;

That’s it! Now $the_descriptor object fits all of the criteria of a
descriptor so far. It’s stored in a static class property
(StaticPropertyClass::$the_static_property), it’s an instance
of a class and it has an instance method called getDescriptor that
will return a dynamic value for some property somewhere. Ah yes – we have
to figure out how the descriptor relates to the property. Recall that
descriptors intercept requests for dynamic properties…

that are accessible through either an object instance…

Hold it right there! Forget the "either" for right now and let’s focus on
this part. The dynamic properties we’re going to be using with descriptors
are accessible through an object instance. So really, when we’ve got our
descriptor working correctly, we should be able to access a dynamic instance
property just like we did using __get earlier. The important thing to
note is that the name of the static class property in which the descriptor
is stored should be the name of the instance property we use to get the
dynamic value from the descriptor instance. I’ll say that again – store
a descriptor instance in a class property, use the name of that class
property to access the descriptor in an instance of that class. For
example, using the descriptor we stored in
StaticPropertyClass::$the_static_property, we would access the
descriptor’s calculated property value using $obj->the_static_property:
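Spelled out as a sketch, here is that access next to the explicit call we want it to stand in for. The two classes are repeated from above so the snippet runs on its own, and the instance access is left commented out because nothing forwards it to the descriptor yet:

```php
<?php
// The descriptor class from earlier, repeated so this snippet is self-contained.
class DescriptorClass
{
    public function getDescriptor()
    {
        return time();
    }
}

// The class that will carry the descriptor in a static property.
class StaticPropertyClass
{
    public static $the_static_property;
}

// Store the descriptor in the class property.
StaticPropertyClass::$the_static_property = new DescriptorClass();

// Create the object instance.
$obj = new StaticPropertyClass();

// The access we are aiming for (no hand-off exists yet, so it stays commented out):
// echo $obj->the_static_property;

// What that access should end up doing behind the scenes:
echo StaticPropertyClass::$the_static_property->getDescriptor();
```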

Every time we access $obj->the_static_property, we need
StaticPropertyClass::$the_static_property->getDescriptor() to run. Now
that we’re clear, we can actually go about implementing some interfaces and
a class that will take care of the actual property request handling. It’s
not as hard as it sounds! Let’s start by defining a DescriptorInterface
interface, which will need to be implemented by each descriptor class we
create.

<?php
// An interface that all descriptors will implement.
interface DescriptorInterface
{
    /**
     * Returns the value of the descriptor when accessed.
     *
     * @param object $instance The object instance on which the descriptor was
     *                         accessed. 'null' will be passed if the
     *                         descriptor was accessed statically.
     * @param string $owner    The name of the class which owns the descriptor.
     *
     * @return mixed
     */
    public function getDescriptor($instance, $owner);
}

Notice that we added parameters to the getDescriptor method. These are
the parameters specified by the Python descriptor interface and they’ll
really come in handy later. For now just understand that the first
parameter is always the object instance whose descriptor property is being
accessed and the second parameter will always be the name of the class on
which the descriptor is defined. For example:

<?php
// Define a descriptor.
class ChattyDescriptor implements DescriptorInterface
{
    // Calculate the dynamic value.
    public function getDescriptor($instance, $owner)
    {
        return sprintf("Hey %s, you're a %s.", $instance->name, $owner);
    }
}

// Define a simple class.
class Dude
{
    public $name;
    public static $descriptor;

    // Assign the name at creation.
    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Add the descriptor instance to the class.
Dude::$descriptor = new ChattyDescriptor();

// Create an object instance.
$obj = new Dude('Broseph');

// Access the descriptor.
echo $obj->descriptor;

The previous example will output Hey Broseph, you're a Dude. if
everything goes according to plan – which it won’t, until we sprinkle some
magic code dust on the Dude class. What we need Dude to take care
of is the hand-off from the dynamic property request to the descriptor’s
getDescriptor method. Since we’re definitely not going to want to add
extra code to every single class we use a descriptor with, let’s agree
that we’ll need a single base class from which all descriptor-bearing classes
will be extended. In this class, we want to intercept requests for undefined
instance properties, check for a descriptor with the same name in the static
class properties of the instance’s class, and then call the class property’s
getDescriptor method if we do in fact find a descriptor object. Here’s
one implementation of just such a base class:

<?php
// Define a base class for objects that will use descriptors.
abstract class Descriptable
{
    // Intercepts requests for nonexistent properties.
    public function __get($name)
    {
        // Get the name of this class.
        $class = get_class($this);

        // Get an introspection reflection for the class.
        $class_ref = new ReflectionClass($class);

        // Check for a static class property with the given name.
        $attr = $class_ref->getStaticPropertyValue($name, null);

        // If the class property is an object, check for a descriptor.
        if (is_object($attr)) {
            // Get an introspection reflection for the property.
            $attr_ref = new ReflectionClass(get_class($attr));

            // Check to see if the static property is a descriptor.
            if ($attr_ref->implementsInterface('DescriptorInterface')) {
                // Call the descriptor method and return the value.
                return $attr->getDescriptor($this, $class);
            }
        }

        // Trigger an error if no descriptor was found.
        trigger_error(sprintf('Undefined property: %s::$%s', $class, $name));
    }
}

Subclasses of Descriptable will now correctly handle descriptors with no
further fiddling. PHP’s new reflection API (speaking of undocumented
features) is used to inspect the subclass and the descriptor instance to
determine whether descriptor handling should take place. Here’s a final,
working example that will correctly implement a descriptor according to the
Python rules we’ve been over so far:

<?php
// Define a simple class.
class WorkingDude extends Descriptable
{
    public $name;
    public static $descriptor;

    // Assign the name at creation.
    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Add the descriptor instance to the class.
WorkingDude::$descriptor = new ChattyDescriptor();

// Create an object instance.
$obj = new WorkingDude('Brosephus');

// Access the descriptor (works!)
echo $obj->descriptor;

Brilliant! We’ve made a descriptor! The only thing we need to remember about
using the Descriptable class is that if we override the __get method,
we must call the __get method inherited from Descriptable. The point at
which we call it in the overridden method affects the resolution
order of dynamic properties, so do tread lightly in these cases.
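Here’s a hedged sketch of such an override. The interface and base class are trimmed-down stand-ins so the snippet runs on its own (the stand-in reads the static properties with getStaticProperties() purely for brevity), and LoudDude, GreetingDescriptor and the shout property are names invented for this illustration:

```php
<?php
// Trimmed-down stand-ins for the interface and base class described above.
interface DescriptorInterface
{
    public function getDescriptor($instance, $owner);
}

abstract class Descriptable
{
    public function __get($name)
    {
        $class = get_class($this);
        $class_ref = new ReflectionClass($class);

        // Look for a descriptor stored in a static property with this name.
        $statics = $class_ref->getStaticProperties();
        $attr = isset($statics[$name]) ? $statics[$name] : null;
        if ($attr instanceof DescriptorInterface) {
            return $attr->getDescriptor($this, $class);
        }

        trigger_error(sprintf('Undefined property: %s::$%s', $class, $name));
    }
}

// Hypothetical descriptor used only for this example.
class GreetingDescriptor implements DescriptorInterface
{
    public function getDescriptor($instance, $owner)
    {
        return sprintf('Hello, %s.', $instance->name);
    }
}

// Hypothetical subclass that adds its own dynamic property.
class LoudDude extends Descriptable
{
    public $name;
    public static $greeting;

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function __get($name)
    {
        // This class's own dynamic property is resolved first...
        if ($name == 'shout') {
            return strtoupper($this->name) . '!';
        }

        // ...and everything else is handed to Descriptable.
        return parent::__get($name);
    }
}

LoudDude::$greeting = new GreetingDescriptor();
$dude = new LoudDude('Broseph');

// Handled by LoudDude::__get itself.
echo $dude->shout;

// Falls through to Descriptable::__get. Depending on the PHP version and
// error_reporting level, this access may also raise an "Accessing static
// property ... as non static" notice.
echo $dude->greeting;
```

Because the subclass checks its own names before calling parent::__get, a dynamic property named like a descriptor would shadow that descriptor. Calling the parent first would flip the priority.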

So far we’ve built a descriptor class that emulates Python’s descriptors in
all cases but one. Python descriptor properties may be accessed either
through an object instance (as we’ve just shown)…

or statically through a property on the class itself!

And here’s where we hit the proverbial brick wall. Basically what we need is
a way to trigger a descriptor’s dynamic processing when accessing a static
class property. The really confusing part is that generally, the static
property will actually contain the descriptor instance itself! Mind Blow
part II, anyone? So every time we access a static class property that
contains a descriptor instance, we want to receive the dynamic value of
the descriptor, not the descriptor instance itself. If we could get that to
work, it would look like this:

<?php
// Define a simple class.
class UnemployedDude extends Descriptable
{
    public $name;
    public static $descriptor;

    // Assign the name at creation.
    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Add the descriptor instance to the class.
UnemployedDude::$descriptor = new ChattyDescriptor();

// Access the descriptor on the class (doesn't work!)
echo UnemployedDude::$descriptor . " You don't even exist, you lazy bum!";

Unfortunately, PHP has a problem. It provides no __get equivalent magic
method for static access. Try as we might, we’ll never get a dynamic value
for a class property (defined or otherwise). There has been great
discussion, a legendary bug ticket and even an RFC dealing with the
addition of a __getStatic magic method to the PHP core, but PHP 5.3
shipped without even a hint of progress in this area. Quel bummer, man! So
all static class properties must be declared in the class definition,
period. In fact, our descriptor implementation subtly relies on this
quirk. When __getStatic is finally available our descriptor class will
require that the class property be undefined in the class definition, and
assigned later. The assignment would take place in the __setStatic magic
method, which would be tasked with keeping track of added descriptors, most
likely in an array keyed off the property name for each descriptor. Yes, it
will be a brave new world for sure! Oh well, it looks like we were just
wasting our time on this descriptor pipe dream.

Not so fast! Don’t throw your keyboard in the trashcan yet – there is hope.
First, let’s examine precisely how big of a deal this one restriction on
our PHP descriptors is. Why would you even want static access to a dynamic
descriptor property? Coincidentally, you probably wouldn’t
want access to the dynamic value at all! In practice, descriptors rarely
define special handling for the class itself, rather they focus on manipulating
object instances. So the use cases are few in which you will even be dealing
with a descriptor value statically. What you’ll see far more often is
a descriptor returning its own instance instead of a value when it is
accessed as a static class property. If you think about it, this makes
total sense. How else are you supposed to ever get at the descriptor instance?
If UnemployedDude::$descriptor returned a value, there
would be no way to get at the descriptor instance at all, since that’s the
only way we know how to refer to the damn thing. This is just how PHP works
(which we’re stuck with) and it happens to correlate with the most likely
use case for a descriptor (luckily) so our descriptor class is still very
faithful to the Python equivalent.

One quick thing to note is that the dynamic accessor method required by the
descriptor interface accepts an instance parameter which will contain the
object instance whose descriptor property was requested. In a class descriptor
context, this parameter would always be null since there is no instance
in a static context.
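One way a descriptor could cope with that is sketched below, assuming static access someday reaches getDescriptor with a null instance. The SelfReturningDescriptor name is made up for this illustration, and the method is called by hand since PHP offers no hook to trigger it:

```php
<?php
// Hypothetical descriptor that hands back itself when there is no instance.
class SelfReturningDescriptor
{
    public function getDescriptor($instance, $owner)
    {
        // Static-style access: no instance, so return the descriptor itself.
        if (is_null($instance)) {
            return $this;
        }

        // Instance access: compute a value from the instance.
        return sprintf('%s belongs to %s', $instance->name, $owner);
    }
}

$descriptor = new SelfReturningDescriptor();

// Simulate instance access with a throwaway object.
$dude = new stdClass();
$dude->name = 'Broseph';
echo $descriptor->getDescriptor($dude, 'Dude');

// Simulate static access by passing null for the instance.
var_dump($descriptor->getDescriptor(null, 'Dude') === $descriptor);
```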

Meet the Family

Up to this point, we’ve ignored a huge detail. In reality, "getting" isn’t
the only access method supported by Python descriptors. There is also support
for dynamic value "setting" and "deleting." These actions roughly correspond
to the PHP magic methods __set and __unset, and should be fairly
self-explanatory. Descriptors intercept property "setting" and "unsetting" just as
they do for property "getting" currently. To fill out our descriptor
implementation, we should define some new interfaces to define how these new
methods will be called by the descriptable class.

<?php
// The descriptor interface for "set" access.
interface DescriptorSet
{
    /**
     * Sets the value of the descriptor.
     *
     * @param object $instance The object instance on which the descriptor was
     *                         accessed.
     * @param mixed  $value    The value that was given to the descriptor.
     *
     * @return null
     */
    public function setDescriptor($instance, $value);
}

// The descriptor interface for "unset" access.
interface DescriptorUnset
{
    /**
     * Unsets the descriptor's value.
     *
     * @param object $instance The object instance on which the descriptor was
     *                         accessed.
     *
     * @return null
     */
    public function unsetDescriptor($instance);
}

The setDescriptor method will be called when setting the value of a
descriptor property like $obj->descriptor = 'new value'; and will receive
the new property value in the $value parameter. Note that the
unsetDescriptor method only receives the object instance as a parameter.
If you’re wondering what happened to the $owner parameter that we used
in getDescriptor, good catch! It turns out, Python descriptors do
not allow static access for "setting" and "deleting (unsetting)" descriptor
properties. The reasoning is simple: if these methods were allowed for
static property access, you could never remove the descriptor instance from
the class. A full restart of the program would be required to empty the
static class property. That just isn’t practical, so the Python authors left
the feature out completely to prevent confusion. This is an extra bonus for
us, since we can’t use static class property access at all with descriptors
in PHP. That means we’re really only missing out on the one single use-case.
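To make the two new methods a little more concrete, here’s a sketch of a descriptor that implements both. NicknameDescriptor and its nicknameFor helper are invented for this example, the interfaces are repeated without their docblocks so the snippet runs on its own, and the methods are called directly because the base class that would route $obj->nickname to them hasn’t been written yet:

```php
<?php
// The two interfaces from above, repeated so the sketch runs on its own.
interface DescriptorSet
{
    public function setDescriptor($instance, $value);
}

interface DescriptorUnset
{
    public function unsetDescriptor($instance);
}

// Hypothetical descriptor that keeps one trimmed nickname per object.
class NicknameDescriptor implements DescriptorSet, DescriptorUnset
{
    // Nicknames keyed by spl_object_hash() of the owning instance.
    protected $_nicknames = array();

    // Stores a cleaned-up value for the given instance.
    public function setDescriptor($instance, $value)
    {
        $this->_nicknames[spl_object_hash($instance)] = trim($value);
    }

    // Forgets the value stored for the given instance.
    public function unsetDescriptor($instance)
    {
        unset($this->_nicknames[spl_object_hash($instance)]);
    }

    // Helper for this illustration only; not part of either interface.
    public function nicknameFor($instance)
    {
        $key = spl_object_hash($instance);
        return isset($this->_nicknames[$key]) ? $this->_nicknames[$key] : null;
    }
}

$descriptor = new NicknameDescriptor();
$dude = new stdClass();

// What `$dude->nickname = '  Bro  ';` would be routed to.
$descriptor->setDescriptor($dude, '  Bro  ');
var_dump($descriptor->nicknameFor($dude)); // string(3) "Bro"

// What `unset($dude->nickname);` would be routed to.
$descriptor->unsetDescriptor($dude);
var_dump($descriptor->nicknameFor($dude)); // NULL
```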

There is one other detail to descriptors that’s specific to PHP. Many PHP
developers use isset() to evaluate whether a property exists and is
valid. Unfortunately isset() will return false for any undeclared
property, even if we’ve overridden the __get method to return a value.
To accurately simulate a real property, we need to override the magic
__isset method to return true if a property is evaluated dynamically.
With this last piece of the puzzle, we are able to construct a robust
descriptable class, completing our PHP descriptor support:
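The isset() quirk is easy to see in isolation. In this small sketch (GetOnly and GetAndIsset are throwaway names, unrelated to the descriptor classes), the only difference between the two classes is the __isset method:

```php
<?php
// Hypothetical class with only __get defined.
class GetOnly
{
    public function __get($name)
    {
        return 'dynamic value';
    }
}

// Same behaviour, plus an __isset that vouches for every dynamic property.
class GetAndIsset extends GetOnly
{
    public function __isset($name)
    {
        return true;
    }
}

$a = new GetOnly();
$b = new GetAndIsset();

// isset() does not consult __get on its own.
var_dump(isset($a->anything)); // bool(false)

// With __isset defined, isset() reports the property as present.
var_dump(isset($b->anything)); // bool(true)
```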

<?php
// The base descriptor interface.
interface Descriptor
{
}

// The descriptor interface for "get" access.
interface DescriptorGet extends Descriptor
{
    /**
     * Returns the value of the descriptor when accessed.
     *
     * @param object $instance The object instance on which the descriptor was
     *                         accessed. 'null' will be passed if the
     *                         descriptor was accessed statically.
     * @param string $owner    The name of the class which owns the descriptor.
     *
     * @return mixed
     */
    public function getDescriptor($instance, $owner);
}

// The descriptor interface for "set" access.
interface DescriptorSet extends Descriptor
{
    /**
     * Sets the value of the descriptor.
     *
     * @param object $instance The object instance on which the descriptor was
     *                         accessed.
     * @param mixed  $value    The value that was given to the descriptor.
     *
     * @return null
     */
    public function setDescriptor($instance, $value);
}

// The descriptor interface for "unset" access.
interface DescriptorUnset extends Descriptor
{
    /**
     * Unsets the descriptor's value.
     *
     * @param object $instance The object instance on which the descriptor was
     *                         accessed.
     *
     * @return null
     */
    public function unsetDescriptor($instance);
}

// The base class for all descriptor-bearing subclasses.
abstract class Descriptable
{
    /**
     * Finds and returns a descriptor instance for the class.
     *
     * This method will return null if either a descriptor was not found in
     * the class property with the specified name, or if a descriptor was
     * found but does not implement the requested interface. Passing the
     * default interface name 'Descriptor' will return any type of descriptor
     * as long as it is stored in the correct class property.
     *
     * @param string $name  The name of the descriptor property being accessed.
     * @param string $iface The name of the descriptor interface which must
     *                      be supported by the descriptor instance.
     *
     * @return mixed
     */
    protected function _descriptorInstance($name, $iface = 'Descriptor')
    {
        // Get an introspection reflection for the class.
        $class = get_class($this);
        $class_ref = new ReflectionClass($class);

        // Get the static class property with the given name.
        $attr = $class_ref->getStaticPropertyValue($name, null);

        // Check to see if the property is a descriptor instance.
        if (is_object($attr)) {
            // Get an introspection reflection of the property.
            $attr_ref = new ReflectionClass(get_class($attr));

            // Check to see if the static property has the right interface.
            if ($attr_ref->implementsInterface($iface)) {
                // Return the found descriptor.
                return $attr;
            }
        }

        // Return null since we didn't find a matching descriptor.
        return null;
    }

    /**
     * Finds and runs a descriptor method for the class.
     *
     * This method will run the specified descriptor method of the descriptor
     * with the provided name if it exists. The method may be one of 'get',
     * 'set', or 'unset'. Any arguments provided in the '$args' array parameter
     * will be passed to the descriptor method. This method will return 'null'
     * if no matching descriptor instance was found.
     *
     * @param string $method The name of the descriptor method to run.
     * @param string $name   The descriptor property name.
     * @param array  $args   An array of arguments to pass to the descriptor
     *                       method or 'null'.
     *
     * @return mixed
     */
    protected function _descriptorAccess($method, $name, $args = null)
    {
        // Initialize descriptor method arguments.
        if (is_null($args)) {
            $args = array();
        }

        // Get the name of the required descriptor interface.
        $iface = 'Descriptor' . ucfirst($method);

        // Retrieve the descriptor instance matching the name and interface.
        $attr = $this->_descriptorInstance($name, $iface);

        // Check for a valid descriptor instance.
        if (!is_null($attr)) {
            // Call the descriptor instance method with the passed arguments.
            return call_user_func_array(array($attr, $method . 'Descriptor'), $args);
        }

        // Trigger an error if the appropriate descriptor wasn't found.
        trigger_error(sprintf('Undefined property: %s::$%s', get_class($this), $name));
        return null;
    }

    /**
     * Gets a dynamic instance property's value.
     *
     * @param string $name The name of the instance property being accessed.
     *
     * @return mixed
     */
    public function __get($name)
    {
        $args = array($this, get_class($this));
        return $this->_descriptorAccess('get', $name, $args);
    }

    /**
     * Sets a dynamic instance property's value.
     *
     * @param string $name  The name of the instance property being set.
     * @param mixed  $value The new value to use for the property.
     *
     * @return null
     */
    public function __set($name, $value)
    {
        $args = array($this, $value);
        return $this->_descriptorAccess('set', $name, $args);
    }

    /**
     * Unsets a dynamic instance property.
     *
     * @param string $name The name of the instance property being unset.
     *
     * @return null
     */
    public function __unset($name)
    {
        $args = array($this);
        return $this->_descriptorAccess('unset', $name, $args);
    }

    /**
     * Returns true if a descriptor instance exists in the named class property.
     *
     * @param string $name The name of the instance property being accessed.
     *
     * @return boolean
     */
    public function __isset($name)
    {
        // Simply test to see if we have any descriptor with that name.
        return !is_null($this->_descriptorInstance($name));
    }
}

The Gotchas

There are some tiny differences between our PHP descriptors and those in
Python, aside from the static access restriction discussed previously. Python’s
property resolution automatically searches for a class property if an instance
does not contain a property with the requested name. PHP does not do this, and
treats static properties very differently. This is unlikely to become an issue
in practice, and is a very specific edge case. Another similar detail is that
Python will short circuit its property resolution if a descriptor defines a
"setting" access method, always using the descriptor even if the instance
defines its own property with the same name. Descriptors which only handle
"getting" of properties give way to an instance property of the same name.
This is very confusing
and really just an intricacy of Python that doesn’t relate to PHP since we
don’t have the concept of an object’s "data dict." In all except the wildest
of edge cases, these differences may be ignored.

Another small gotcha can pop up when developers erroneously re-include a
source file containing a class definition. Since descriptor instances must
be assigned to a class property and PHP doesn’t allow evaluated variables
as property values in the class definition, we’re forced to add the descriptor
instance to the class after it has been defined. This isn’t something that
should be feared and it’s certainly not "wrong" according to how PHP works,
but it is a little off-putting to some developers. Regardless, it’s important
to not re-include a file if it adds a descriptor to a class. Otherwise, a
new instance of the descriptor may suddenly appear in all of the existing
instances of that class. The fix is simple: use require_once like you
should be doing in the first place when importing class definitions.

The Payoff

At this point you’re probably very angry (and cross-eyed) after having read
this exhausting tome, and yet we still haven’t explored why descriptors
are useful in the first place. Let’s go over a few benefits descriptors have
over other types of dynamic properties.

Easy (Class-wide) Caching

A classic use case for dynamic properties is value caching. If the value
of a property is expensive to calculate, it’s trivial to set up a dynamic
property that calculates the value once, stores it locally in the object
instance and returns the calculated value upon subsequent access. The down
side to simple caching (probably using the __get method) is that the
cache is local to the object. That means in naive implementations the
value is calculated and stored each and every time you create an object
instance and try to access the dynamic property. This is unnecessary
when the calculated value will not differ between object instances. Of course
this problem only gets larger the more instances you actually create.
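A quick sketch of that naive approach makes the waste visible. NaiveGame, its $lookups counter and the hard-coded score are all stand-ins invented for this example; the counter simply records how often the "expensive" work runs:

```php
<?php
// Hypothetical class that caches an expensive value per object via __get.
class NaiveGame
{
    // Counts how many times the expensive lookup actually ran.
    public static $lookups = 0;

    // Per-instance cache of computed values.
    protected $_cache = array();

    public function __get($name)
    {
        if ($name == 'high_score') {
            // Compute the value only if this instance hasn't cached it yet.
            if (!array_key_exists($name, $this->_cache)) {
                $this->_cache[$name] = $this->_expensiveLookup();
            }
            return $this->_cache[$name];
        }

        trigger_error(sprintf('Undefined property: %s::$%s', get_class($this), $name));
    }

    // Stand-in for a slow operation such as a database query.
    protected function _expensiveLookup()
    {
        self::$lookups++;
        return 9000;
    }
}

$first = new NaiveGame();
$second = new NaiveGame();

echo $first->high_score;   // Lookup runs.
echo $first->high_score;   // Served from the cache on $first.
echo $second->high_score;  // Lookup runs again: $second has its own cache.

// Two instances, two lookups.
echo NaiveGame::$lookups;  // 2
```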

Descriptors open up the possibility of per-class caching, rather than
per-object caching since descriptor instances are defined on a class. Even
when accessed on different object instances, a single descriptor property
instance is handling all of the requests. This allows the descriptor to
be used as a class-wide cache with very little effort. Using a descriptor
in this way can be a lifesaver for intensive operations like database
access:

<?php
// Define a descriptor to manage retrieving the high score.
class HighScoreDescriptor implements DescriptorGet, DescriptorSet
{
    protected $_high_score;

    // Returns the current high score.
    public function getDescriptor($instance, $owner)
    {
        // Check to see if we've already retrieved the high score.
        if (is_null($this->_high_score)) {
            // Fetch and cache the high score from the db.
            $this->_high_score = get_from_db('high_score');
        }
        // Return the cached high score.
        return $this->_high_score;
    }

    // Sets the current high score.
    public function setDescriptor($instance, $value)
    {
        // Update the cached high score.
        $this->_high_score = $value;
        // Save the new high score to the db.
        update_db('high_score', $this->_high_score);
    }
}

// Define a scored game class.
class ScoredGame extends Descriptable
{
    // Define a class property to hold the high score descriptor.
    public static $high_score;
}

// Add a high score descriptor to the class.
ScoredGame::$high_score = new HighScoreDescriptor();

// Create a game instance.
$game = new ScoredGame();

// Retrieve the high score for the game (db queried).
echo 'High Score: ' . $game->high_score;

// Set the high score, cheater.
$game->high_score = 31337;

// Retrieve the high score again (db *not* queried).
echo 'New High Score: ' . $game->high_score;

// Create another game instance.
$another_game = new ScoredGame();

// Retrieve the high score for the new game (db *still* not queried).
echo 'Same High Score: ' . $another_game->high_score;
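
For comparison, here is a rough sketch of the same class-wide cache written with Python’s native descriptor protocol, the feature these PHP interfaces are mimicking. The fetch_high_score function and the db_queries list are made-up stand-ins for a real database call, and the starting score of 9000 is arbitrary.

```python
class HighScoreDescriptor:
    """Class-wide cache: one descriptor instance serves every ScoredGame."""

    def __init__(self, fetch):
        self._fetch = fetch        # hypothetical stand-in for a db query
        self._high_score = None    # shared cache, stored on the descriptor

    def __get__(self, instance, owner):
        # Only hit the (pretend) database on the first access.
        if self._high_score is None:
            self._high_score = self._fetch()
        return self._high_score

    def __set__(self, instance, value):
        # Update the shared cache; a real version would also write to the db.
        self._high_score = value


db_queries = []  # records each simulated database hit


def fetch_high_score():
    """Pretend database lookup (an assumption for this sketch)."""
    db_queries.append("high_score")
    return 9000


class ScoredGame:
    # Python lets us attach the descriptor right in the class body.
    high_score = HighScoreDescriptor(fetch_high_score)


game = ScoredGame()
first = game.high_score            # simulated db queried here
game.high_score = 31337            # cache updated for every instance
another_game = ScoredGame()
second = another_game.high_score   # served from the shared cache
```

Because the cached value lives on the single descriptor instance rather than on each game, the second object never triggers a lookup of its own.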

Memory Footprint

It’s obvious that caching values with descriptors saves you from executing
expensive operations multiple times, but what about the amount of memory
used? When you cache a variable locally per-instance, you’re storing that
information once for each instance that requests it. Descriptors use far
less memory in this case by only storing one copy. This idea can be extended
into realms other than simple caching, and will always reap the rewards of
leaner memory usage.

Cleaner Than the Alternatives

Implementing a dynamic property with the __get magic method requires
checking the name of the requested property to determine whether or not
it should be dynamically handled. Once that is determined, the __get
method must then figure out what to actually do to calculate the dynamic
value. Innumerable approaches to this problem exist in the wild, and that’s
a problem in itself. There exists no standard way of determining what
should be called, when, and in which context with traditional dynamic
variables. Descriptors provide a standardized interface to these concepts,
and don’t require hacky switch statements with string checking, or a gazillion
setSomething and getSomething stop-gap methods. Descriptors are
cleaner and easier.

True Object Oriented Design

Whether or not you’re an OOP lover (or even a hater) doesn’t matter. You have
to agree that this confusing middle ground in which PHP has been living
sucks. Descriptors promote the concept of PHP classes as objects themselves,
which is what they really are. This is a good move for the language if it
is ever going to try to compete with the Pythons and Rubys of tomorrow. If
functionality is class-wide, move it up to the class logic level where it
belongs!

Conclusion

Descriptors are fun, they’re cool, and they’re just an all around good tool
to have in your toolbox. They’re the ideal solution to some very complex,
real world problems that have plagued developers for years. I long for the
day when PHP releases support for dynamic class properties as well so we
can get our hands on a drop-in replacement for Python’s descriptors. Until
then, the descriptor interfaces and descriptable object class we’ve just
designed will fit the bill just fine.

Leverage Twitter's Distraction Value to Stay Focused

http://xdissent.com/2009/07/03/leverage-twitters-distraction-value-to-stay-focused/
Fri, 03 Jul 2009 23:14:50 +0000

Twitter by nature is a stream of distractions. I personally never keep Twitter open and visible while I’m working because I know I’ll be too easily derailed from whatever I’m doing. If Life Hacker reported that a single email arriving in your inbox can cost you over a minute of mental recovery time, Twitter’s rapid-fire updates could prove to be literally stupefying. With that in mind, it’s probably a good idea to set aside a particular time of day to catch up on your tweets to prevent wasting too much time.

Once you’re committed to the idea that Twitter is a serious time waster, you can then use it to save a few more precious moments of your day that would otherwise be lost to internet distractions. For example, instead of subscribing to other distracting feeds and sites in your normal feed reader, downgrade them to Twitter feeds and consolidate your slacking. Most such sites mirror their feed on Twitter anyway, so catch up on all your time-wasters at once, when you have the time to waste. Your reader can then be used for more serious items that might actually require your attention. Plus you’ll have the added bonus of missing a few posts during the day due to the sheer traffic on Twitter, which means less time wasted, period. And you wouldn’t really miss another lolcat, would you?

uTidyLib Fix

http://xdissent.com/2009/03/15/utidylib-fix/
Sun, 15 Mar 2009 19:41:41 +0000

uTidyLib is a Python wrapper for the HTML
Tidy Library. It’s a pretty handy library and is dead
simple to use, but unfortunately it does not compile on Leopard out of the
box. I wrote a quick patch to fix it, and will maintain a vendor branch on GitHub since
development seems to have been abandoned years ago.

The other day I wrote a tiny Django middleware that alerts me if a response
contains markup errors (debug only of course). uTidyLib looked promising, but
I couldn’t get pip to successfully install the damn thing. My deployment
environment needs to be able to use pip so I had to patch setup.py to be
setuptools friendly.

The message error: option --single-version-externally-managed not recognized
means setup.py is subclassing distutils.command.install.install, which
sucks because we want to use setuptools. The fix is easy though: just
change the import statement in setup.py to
from setuptools.command.install import install. I got it all patched up and
into my vendor SVN repository, but ran across another problem when trying to
use pip.

uTidyLib looks for libtidy in a few predefined locations, and Leopard’s
libtidy is not in one of those places. The author doesn’t take into
consideration the possibility of a library having the extension dylib
so I patched my vendor repository to correctly find libtidy. The source
is available at
http://github.com/xdissent/utidylib/.
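
The lookup idea can be sketched roughly like this. This is not the actual patch from the vendor repository; it assumes the library is loaded through ctypes, and candidate_names and load_first are hypothetical helpers that only show what "also try the dylib extension" means in practice.

```python
import ctypes
import sys


def candidate_names(base="tidy"):
    """Return shared-library file names to try, likeliest first."""
    names = ["lib%s.so" % base, "lib%s.dylib" % base, "%s.dll" % base]
    if sys.platform == "darwin":
        # On OS X shared libraries normally end in .dylib, so try that first.
        names.insert(0, names.pop(1))
    return names


def load_first(names):
    """Load the first library that ctypes can open, or return None."""
    for name in names:
        try:
            return ctypes.CDLL(name)
        except OSError:
            # Not found under this name; move on to the next candidate.
            continue
    return None


libtidy = load_first(candidate_names())  # None if libtidy isn't installed
```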

TextMate reStructuredText Bundle

http://xdissent.com/2009/02/28/textmate-restructuredtext-bundle/
Sun, 01 Mar 2009 05:43:51 +0000

TextMate is great for general editing of most text formats. By default,
reStructuredText is not one
of them. This document describes how to fix TextMate so that it is a better
fit for the tools I work with. We want reStructuredText rendered previews
and HTML export functions in TextMate.

Update Monday July 6, 2009: TextMate’s bundle repository has moved. This
article has been updated to reflect this change.

Install reStructuredText TextMate Bundle

TextMate has a bundle repository at
http://svn.textmate.org/trunk/Bundles with a reStructuredText
bundle already built. We need to download that to a place where TextMate will
actually find the bundle and load it. Typically, that’s
~/Library/Application Support/TextMate/Bundles.

Make reStructuredText Bundle use virtualenv

Now that we have the bundle, we need to let it know that it needs to be
called from inside our virtualenv. The Preview command uses rst2html.py
to generate the preview, and luckily provides a way to customize the path
to rst2html.py. All we need to do is define a TM_RST2HTML variable
with the correct path in our .bash_profile and TextMate will find the
right one.

Bask in the glory

We’re done. Now we can preview or export reStructuredText easily. Open a
reStructuredText file and go wild. It’s incredibly convenient to preview
documents as you go, and saving them to HTML makes it simple to blog from
TextMate.

(TextMate)stewart:~ xdissent$ mate ~/Documents/TextMate\ Fixes.rst

Colour My World

If you’re like me, you tend to include source code or console sessions in
your articles. Pygments is a Python package that
can add syntax highlighting to these snippets, which comes in handy for
previewing in TextMate. Installation is slightly more advanced, but well
worth setting up.

Install Mercurial

We want to hack on our Pygments package, so we’re going to want to install
it in editable mode. Plus, the latest stable release doesn’t have the
BashSessionLexer that I use quite a bit. Pygments uses Mercurial for source control, but I don’t have
a system-wide copy of hg, so I needed to grab one real fast.

Install the reStructuredText sourcecode directive

Activating syntax highlighting in a document requires the sourcecode
directive which highlights the following block of code. We can switch out
different Pygments lexers using an option after the directive:

The sourcecode directive isn’t installed automatically, so we’ll need to
add it ourselves. Directives are simply Python functions that must be
registered with docutils before any parsing is done. That means we can’t use
rst2html.py with highlighting yet because the directive won’t be loaded.
Pygments does come with the directive already in its source tree, but we’re
going to have to let virtualenv know where it is, and then modify rst2html.py
to register the directive before it begins parsing.

Colorize the Preview command

What good is all this work if we can’t see the damn colors? As it stands now,
Pygments is marking up our code to be colorized, but the CSS to actually make
the colors show up isn’t being loaded into our preview window. We have to
enhance the Preview command in the reStructuredText bundle to include Pygments
stylesheets. You can control which Pygments theme will be used by defining
a TM_PYGMENTIZE_STYLE environment variable. I’ve created a patch that we
can apply to update our bundle.

Tell TextMate how to use pygmentize

The patch for the Preview command requires an environment variable called
TM_PYGMENTIZE that contains the path to the pygmentize script. We
need to add that to our .bash_profile along with any custom style we want
to use for highlighting.

Inform TextMate of the updated Preview command

Patch Pygments for virtualenv

Everything should be working perfectly now in full color, but I went one step
further in my setup to tweak the BashSessionLexer in Pygments. Because I
use virtualenv all the time, my bash prompt usually has a virtualenv name
at the beginning, surrounded by parentheses. If I were to use the console
option for sourcecode with a virtualenv active in the session, the lexer
wouldn’t recognize my prompt and it wouldn’t get highlighted. I put together
a quick patch for Pygments that correctly highlights virtualenvwrapper-style
prompts.
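
To show the kind of matching involved, here is a simplified sketch of a prompt pattern that tolerates a parenthesized virtualenv name. It is not the regular expression Pygments uses and not the patch itself; PROMPT and split_prompt are hypothetical names, and the pattern will misfire on output lines that happen to contain a prompt-like character followed by a space.

```python
import re

# Optional "(envname)" prefix, then everything up to the first $, # or %,
# a single space, and finally the command that was typed.
PROMPT = re.compile(r"^(?:\(([^()\s]+)\))?(\S[^$#%\n]*[$#%])\s(.*)$")


def split_prompt(line):
    """Return (virtualenv, prompt, command), or None for plain output."""
    match = PROMPT.match(line)
    if match is None:
        return None
    return match.groups()


session_line = "(TextMate)stewart:~ xdissent$ mate ~/Documents/TextMate\\ Fixes.rst"
parts = split_prompt(session_line)
```

A line without the virtualenv prefix still matches, with the first element set to None, so ordinary prompts keep working.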

Chump and a Hoagie

Safari 4 Beta and Coda Tabs Joy

http://xdissent.com/2009/02/28/safari-4-beta-and-coda-tabs-joy/
Sat, 28 Feb 2009 18:32:50 +0000

So the Safari beta 4 has landed and opinions are all over the place
about the new features. The most consistent target for criticism has
been the changes to the tab interface. Although the differences were a
little disconcerting at first, I’ve come to love the new tabs and I even
took a cue from Safari and decided to improve one of my most precious
tools: Panic‘s Coda.

Beta Blues

I’ve been kind of bummed reading all the negative responses to Safari’s
update because I’m so impressed by the damn thing. It seems like most of
the gripers are claiming Apple ripped off Chrome‘s "Tabs-on-Top" layout
and "Top Sites" feature or that Safari 4 isn’t as many times faster than
every competitor as Apple claims. So… when Apple notices good design
they’re not supposed to act on improving their product? Should Apple have
included tabs at all, since they certainly were not the first to develop
the idea? Never mind the fact that Chrome wouldn’t even exist if
Apple hadn’t shared WebKit with the world, since Chrome uses it exclusively
for rendering. So what’s the problem with a few "good as new" bad-ass
features, dudes? And if the absence of a handful of outdated Safari 3
behaviors is driving you nuts, you can easily bring them back.

Silver Bindings

The one cool thing that really put a smile on my face when trying out
Safari 4 was discovering the new tab-switching keyboard shortcuts. It seems
like every app in the world has tabs these days, and moving around between
them is the most common (yet also most application-specific) task you will
perform. Since the dawn of tab, developers haven’t been able to agree on a
global shortcut to switch tabs and my mouse has suffered great wear as the
battle rages on. I would guess this problem is an NIH-related artifact from
a time when tabs were the killer new feature to have, and no application had
yet proven to be the ubiquitous tab-interaction authority. I’ve always found
Safari 3’s built-in tab navigation key combo about as awkward as this
sentence. Firefox certainly paved the way with some very intuitive shortcuts,
but I only use Firefox for UI testing with Firebug, so rarely do I find myself
treading water in a sea of tabs. Also, Safari 4’s new developer tools have
pretty much guaranteed a serious decline in my Firebug time anyway.

With Safari 4, a great tab usage barrier has been shattered as the Firefox
control-tab shortcut replaces Safari’s original loathsome key combo.
I was so inspired by this move towards consistency (or additional Apple
rip-off attempt as one could claim) that I decided to add it to another
tab-defying application that I love so dearly: Coda.

The Leopard Way

Mac OS X provides a System Preferences pane that allows you to add, edit and
remove global and application-specific keyboard shortcuts. It’s a handy
little tool found under the "Keyboard & Mouse" preferences that somehow I’ve
only used once… to change tab behaviors. A while back, I actually had the
(impressively moronic) idea to change Coda’s tab shortcuts to command-tab
so switching between tabs would be "as easy as switching applications." Now
that’s taking Coda’s "one window development" tagline to the eXtR3m#.

Luckily for me, Leopard doesn’t allow you to use the tab key when
defining new shortcuts, so I settled for command-~, which is the standard
"switch windows within application" key command. "Mostly one window
development" turned out to be a real drag though, because I frequently work
on two or three Coda Sites at once, and found myself with no way to switch between
Sites. Bummer.

It was looking like Apple was going to deny me my tab nirvana.

The Working Way

After poking around on Google, I discovered System Preferences stores
application shortcuts in your preferences plist. It uses a dictionary
value named NSUserKeyEquivalents to map the commands to the shortcuts,
which can be changed easily with defaults write. I also found that you
can use the ^ character to represent the control key when defining
your shortcuts, but what about tab? Brute force experimentation landed me
in tab heaven on my first try. Here is how to make control-tab and
control-shift-tab map to "Select Next Tab" and "Select Previous Tab"
respectively in Coda:

Fixing Paver

http://xdissent.com/2009/02/24/fixing-paver/
Tue, 24 Feb 2009 17:23:04 +0000

I’ve been a fan of Paver since the first time I read about it. It gives
me all of the control I want with just enough of an implied structure to keep
things sane and easy. The problem was, I couldn’t get the latest version,
which had many shiny new features. You know how I feel about shiny, so
I obviously needed to find a way to run Paver’s trunk, lest I invalidate
my work by targeting the almost-irrelevant current version.

Paver is such a flexible tool, it can present a chicken-and-egg packaging
dilemma by relying heavily on itself to build… itself. The packaged
1.0a1 was having trouble installing via easy_install and pip so I
had to figure out a way to build my own package. Unfortunately Paver’s
trunk has neither paver-minilib.zip nor setup.py which means to
get things going, you have to use the current version of Paver. Of course,
the new alpha is so different, that the current release can’t handle it
and the build fails. I finally figured out a way to get a clean trunk
checkout installed and set out to squash the bug that got 1.0a1 recalled
in the first place.

After poking around for a while, I believe Paver’s option parsing methods
are confusing distutils commands. It has to do with the way that options
are translated from the command line to Paver’s options. I updated
a relevant issue on Paver’s project page and attached a patch that fixes the
problem and allows Paver to build itself cleanly.

It becomes clear that cmdopts definitions will never override default
options for this task. We can’t use foo-bar as an option name in our
Bunch, nor could we access the runtime value using
options.show_opts.foo-bar. It turns out distutils and setuptools
already noticed this problem and convert hyphens to underscores when
determining the storage variable’s name for an option. I’ve attached a
small patch for Paver SVN r37 that automatically makes this conversion.
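
The conversion itself is tiny. Here is a minimal sketch of the idea, not the actual patch against r37: storage_name and parse_long_options are hypothetical helpers, and this Bunch is a bare-bones stand-in rather than Paver’s own class. The reason the hyphen has to go is that Python reads options.show_opts.foo-bar as a subtraction (options.show_opts.foo minus bar), while foo_bar is a perfectly good attribute name.

```python
def storage_name(long_option):
    """Map a command-line option name to a valid attribute name."""
    return long_option.replace("-", "_")


class Bunch(dict):
    """Tiny dict with attribute access (a stand-in, not Paver's Bunch)."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


def parse_long_options(argv):
    """Collect --name=value pairs under their storage names."""
    parsed = Bunch()
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            parsed[storage_name(name)] = value
    return parsed


opts = parse_long_options(["--foo-bar=baz"])
value = opts.foo_bar  # reachable because the hyphen became an underscore
```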