The naming of things: Package names and namespaces

Over the past year or two, the transition from "products" to "eggs" has taken hold. Almost no new third party products (and very few new releases of old products) are released as product tarballs, and it seems almost everyone is using buildout and eggs to distribute their code. This is good, and helps bring Plone's packaging story in line with that of other frameworks.

With eggs come namespace packages: the ability to organise code into namespaces. We see this in Plone itself, with namespaces like plone.* (packages like plone.memoize or plone.browserlayer), and nested namespaces like plone.app.* (plone.app.portlets, plone.app.contentrules) or plone.portlet.* (plone.portlet.collection, plone.portlet.static). A number of other "shared" namespaces have also popped up, such as collective.*, five.*, z3c.*, and so on, as well as namespaces used by specific organisations (jarn.*, slc.*).

Unfortunately, the rules for the use of namespaces are a bit fuzzy, and sometimes cause confusion. It's also difficult (and inadvisable) to change a package name after the package has been released and there are uses "in the wild", so getting the name right the first time is important. I'll try to articulate some of the unwritten rules here.

Egg name vs. Package name

First of all, a small, but important point: For Plone packages, we tend to have a one-to-one mapping between package names and egg names, but this is by no means required.

For the avoidance of doubt, the egg name is the thing that is registered on PyPI and listed in the buildout.cfg eggs option or a setup.py install_requires line. An egg is an archive of Python code and other resources (like documentation or images). The Python modules included in the egg may or may not be organised into packages, and those packages may or may not be namespace packages.

A namespace package is like any other Python package (i.e. a directory with an __init__.py file in it), except that when a package is declared as a namespace package (which involves a couple of lines of boilerplate in the __init__.py), it is possible to have multiple eggs provide modules and sub-packages within that package.

As an example, Plone depends on packages such as plone.theme and plone.browserlayer. Both of these put sub-packages into the plone namespace package. If plone were not a top level namespace package, the two eggs would clash (the package(s) from one egg would effectively hide all others using the same top level package name).
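The clash described above can be reproduced with a small, self-contained sketch. It builds two throwaway "eggs" (plain directories on sys.path) that share a top level package, first without and then with namespace boilerplate. The names demo_plain and demo_ns are made up for the demonstration, and the sketch uses the standard library's pkgutil-style declaration so that it runs without setuptools; the form commonly seen in eggs is shown in a comment.

```python
import importlib
import os
import sys
import tempfile

# pkgutil-style namespace declaration from the standard library.
# The setuptools form commonly seen in eggs is:
#     __import__('pkg_resources').declare_namespace(__name__)
NAMESPACE_BOILERPLATE = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_fake_egg(root, top_level, sub_package, init_source):
    """Create <root>/<top_level>.<sub_package>/<top_level>/<sub_package>/."""
    egg_dir = os.path.join(root, top_level + "." + sub_package)
    top_dir = os.path.join(egg_dir, top_level)
    sub_dir = os.path.join(top_dir, sub_package)
    os.makedirs(sub_dir)
    # __init__.py of the shared top level package (empty or boilerplate).
    with open(os.path.join(top_dir, "__init__.py"), "w") as f:
        f.write(init_source)
    # __init__.py of the sub-package that this egg actually provides.
    with open(os.path.join(sub_dir, "__init__.py"), "w") as f:
        f.write("NAME = %r\n" % (top_level + "." + sub_package))
    return egg_dir


def importable(top_level, init_source):
    """Put two fake eggs on sys.path and report which sub-packages import."""
    root = tempfile.mkdtemp()
    for sub in ("theme", "browserlayer"):
        sys.path.append(make_fake_egg(root, top_level, sub, init_source))
    importlib.invalidate_caches()
    results = {}
    for sub in ("theme", "browserlayer"):
        try:
            importlib.import_module(top_level + "." + sub)
            results[sub] = True
        except ImportError:
            results[sub] = False
    return results


# Ordinary package: the first egg on sys.path hides the second one.
clash = importable("demo_plain", "")
# Namespace package: both eggs contribute sub-packages.
shared = importable("demo_ns", NAMESPACE_BOILERPLATE)
print(clash)
print(shared)
```

With an empty __init__.py only the sub-package from the first egg on the path can be imported; with the boilerplate both can.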

The convention of naming the egg after the top level package is useful because it makes it easier to find code: if you are looking for the code behind the module plone.theme.interfaces, you know to look in the egg called plone.theme. It's also one less thing to worry about naming. However, this is not enforced. If you install the Paste egg, for example, you will get a number of packages in the paste.* namespace, all distributed in a single egg. Other eggs, like PasteScript, provide further packages in the same top level namespace.
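To make the difference between the two styles concrete, here is a rough sketch that describes two eggs using the kind of keyword arguments a setup.py might pass to setuptools.setup(), and checks whether each follows the "egg named after its package" convention. The package lists are assumptions made for illustration (they are not copied from the real distributions), and own_packages and follows_convention are hypothetical helpers, not part of any library.

```python
# Illustrative metadata only: the package lists below are assumptions
# for this sketch, not the real contents of these distributions.
PLONE_THEME = {
    "name": "plone.theme",                  # egg name, as registered on PyPI
    "packages": ["plone", "plone.theme"],   # Python packages shipped in the egg
    "namespace_packages": ["plone"],        # packages shared with other eggs
}

PASTE = {
    "name": "Paste",
    "packages": ["paste", "paste.auth", "paste.util"],
    "namespace_packages": ["paste"],
}


def own_packages(metadata):
    """Return the packages an egg ships, excluding shared namespace packages."""
    shared = set(metadata.get("namespace_packages", []))
    return [p for p in metadata["packages"] if p not in shared]


def follows_convention(metadata):
    """True if the egg is named after the single top-most package it owns."""
    owned = own_packages(metadata)
    # Keep only packages that are not nested inside another owned package.
    roots = [p for p in owned
             if not any(p.startswith(other + ".") for other in owned)]
    return roots == [metadata["name"]]


print(follows_convention(PLONE_THEME))
print(follows_convention(PASTE))
```

Both layouts are valid; the check only says whether the egg name tells you where to find a given module.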

Rules of thumb

There are no hard and fast rules for package naming, but there are some rules of thumb based on accepted practices from Plone and other projects.

A namespace is just that: a way to organise your code. Packages in the same namespace should be conceptually related.

Top level namespaces usually relate to code ownership. This helps avoid clashes: if both Plone and Pinax released a package to PyPI called portlets, they would clash, and it would be difficult to co-ordinate package naming. By using a plone.* namespace, we can be reasonably sure that we won't have clashes.

Understand the purpose of a namespace before you use it. If in doubt, ask on the Plone product-developers mailing list or in the IRC chat room.

Your last chance to get the namespace right is when you make the first public release. Changing a package name after it has been used in the wild is a major pain and should be avoided if at all possible. Everyone who has ever tried to rename a released package has come to regret it. Changing the egg name is slightly less harmful, but will also cause a lot of confusion.

For internal/customer projects, use your company name as the namespace. For example, the code written for the fictional "Optilux" chain of cinemas in the Professional Plone Development Book is all contained in the optilux.* namespace.

There is no shame in releasing a package as open source even if it has an "internal" name. Witness things like jarn.mkrelease or zest.releaser. Think of it as a bit of free marketing.

Only use a "shared" namespace if you really intend the code to be community owned. This is a corollary to the previous three points.

Avoid using namespace packages as a way to separate the words in a multi-word package name. It may look cool to have a package called plone.multimedia.tools (fictional), but that's not what namespace packages are for (incidentally, you could use an egg name like PloneMultimediaTools and still use a sane package structure, but as discussed above, the convention in Plone land is to use the package name as the egg name).

Avoid deep nesting - two levels is almost always enough. Some programmers have an urge to define everything in deeply nested hierarchies and end up with packages like collective.generic.skel.common (apologies - I'm pretty sure this is a worthwhile package with an unfortunate name). This type of name is both verbose and cumbersome (e.g. if you have many imports from the package). Furthermore, big hierarchies tend to break down over time as the boundaries between different packages blur. These days, the consensus is that two levels of nesting are preferred. For example, we have plone.principalsource instead of plone.source.principal or something like that. The name is shorter, the package structure is simpler, and there would be very little to gain from having three levels of nesting here. It would be impractical to try to put all "core Plone" sources (a source is a kind of vocabulary) into the plone.source.* namespace, in part because some sources are part of other packages, and in part because sources already exist in other places. Had we made a new namespace, it would have been used inconsistently from the start.

Pick meaningful names. Speaking of collective.generic.skel.common - it's difficult to guess what that package may do looking at the name. Two of the four names are generic (erm...) and near synonymous. Ask yourself "how would I describe in one sentence what this namespace is for?", and then "could anyone have guessed that by looking at the name?".

If in doubt, ask. The Plone product-developers mailing list is a good place to ask if you want advice on what to call things, as is the #plone IRC chatroom. See plone.org/support.

The common namespaces

There are several namespace packages in use today. Below are some of the most commonly used "community" ones and their purpose.

collective.*

This name comes from the Collective subversion repository. A collective.* package is a Plone community package not intended for the Plone core. It should live in the Collective repository and be released under the GPL or another open source license. The namespace implies community ownership and invites anyone with Collective commit access (i.e. anyone who wants it) to collaborate and help maintain it. If you are writing a Plone package and you want it to be community-owned, this is probably the first namespace that should come to mind. Examples include collective.xdv and collective.flowplayer.

plonetheme.*

This name is used by default in the Plone theme paster template. It should be used for Plone theme products that are released as open source.

plone.*

This is the main top level package for Plone core packages. A plone.* package should live in the main Plone repository, be GPL licensed (unless the Plone Foundation Board has explicitly given permission for an alternative license, normally BSD), and thus be covered by the Plone Contributor Agreement. The name implies that the package is, or is intended one day to be, part of Plone core. If in doubt, it's best to ask on the plone-developers mailing list before releasing a package under this namespace.

Examples include plone.portlets, plone.z3cform and plone.memoize.

plone.app.*

This is a sub-namespace of the plone.* namespace and the same rules apply. The plone.app.* namespace is used for packages that are part of "Plone-the-application", i.e. strongly dependent on Plone's user interface, default content types and so on. The difference between plone.* and plone.app.* is that packages directly under the plone.* namespace should work independently of Plone as a whole, and so be re-usable by other users of Python, Zope and/or CMF. A plone.* package is not allowed to import from any plone.app.* package, Products.CMFPlone, Products.ATContentTypes or other Plone-the-application-specific packages. You will sometimes see pairs of packages - like plone.portlets and plone.app.portlets - where the plone.* package is re-usable outside Plone and the plone.app.* package provides Plone-specific integration and additions.
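The import rule for plone.* packages lends itself to an automated check. The following is only a sketch of how one might scan a module's source for offending imports using the standard library ast module; the forbidden_imports helper and the prefix list are assumptions for illustration (the list covers only the packages named above), and this is not an existing Plone tool.

```python
import ast

# Prefixes a plone.* (non-app) package should not import from,
# taken from the rule described in the text. Not an exhaustive list.
FORBIDDEN_PREFIXES = (
    "plone.app",
    "Products.CMFPlone",
    "Products.ATContentTypes",
)


def imported_modules(source):
    """Yield the dotted module names imported by a piece of Python source."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                yield alias.name
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Known limitation: "from plone import app" is reported as "plone"
            # and is therefore not flagged by this simple check.
            yield node.module


def forbidden_imports(source):
    """Return the sorted, de-duplicated imports that break the rule."""
    return sorted(
        name for name in set(imported_modules(source))
        if any(name == prefix or name.startswith(prefix + ".")
               for prefix in FORBIDDEN_PREFIXES)
    )


sample = (
    "import plone.memoize\n"
    "from plone.app.portlets import portlets\n"
    "from Products.CMFPlone import utils\n"
    "from zope.interface import Interface\n"
)
print(forbidden_imports(sample))
```

A check like this could be run over every module of a plone.* package before release; it only looks at static import statements, so dynamic imports would slip through.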

plone.portlet.*

This namespace is used for standard portlets that ship with Plone. You will also see collective.portlet.*. The rules for plone.portlet.* are the same as those for plone.app.*. The rules for collective.portlet.* are the same as those for collective.*.

zope.*

This namespace is used for core Zope Toolkit / Zope 3 packages. Do not use it unless you have written a proposal to the zope-dev mailing list and this proposal has been accepted. zope.* packages must live in the Zope repository, be licensed under the ZPL and be covered by the Zope Contributor Agreement.

z3c.*

z3c stands for "Zope 3 community". This is the Zope community equivalent to the collective.* namespace. These packages are normally ZPL licensed and found in the Zope repository.

megrok.*

This namespace is used by the Grok project for community add-ons.

five.*

This namespace is used for certain packages that provide Zope 2 integration for Zope Toolkit / Zope 3 packages. Examples include five.grok and five.customerize. These packages tend to live in the Zope repository.

mr.*

A tongue-in-cheek namespace used for development and debugging tools such as mr.developer, mr.bent, mr.freeze and mr.git. These tend to live in the Collective repository.

Great explanation of some of the conventions found in the wild, Martin. I'll repeat and amplify your admonishments to choose [to use] namespaces carefully:

* One of my pet peeves is the "collective" namespace, whose meaning isn't well-defined. Some say it means "I'm doing one release of this and not maintaining it further". Others, including yourself, interpret it as "Patches welcome". Since it conveys only a fuzzy meaning, I encourage folks to avoid it. Used as a "default" namespace, it also has the side effect of making everything in svn start with "collective" (406 packages so far!), which just defeats alphabetization and makes navigation trickier.

* Developers new to Plone or Python should realize that using a namespace is by no means required. If you're making up a namespace just for the sake of having one, you probably don't need one at all. Remember, namespaces were originally intended as a way of splitting up monolithic legacy packages without breaking imports, not as a way of organizing PyPI. I'll extend your "two levels is almost always enough" rule to "one level is often enough": if you choose names well, avoiding overreaching names such as "portlets", even a flat namespace is really, really big.

Thanks for a good post, Martin; I'll point people here for explanations of what some of the more confounding namespaces mean. How one can start a mr namespace without a contained "fusion" package, for instance, is beyond my comprehension. ;-)