1 Overview

Java applications, although isolated by the JVM and the standard
library from most OS details, need to interact with their environment
to be useful. We need to configure the app, which is often file based,
and we need to store data and log files somewhere.

The danger is that this introduces subtle dependencies on the
underlying operating system, causing friction when the app is used in
other environments like development and testing.

Conversely, certain OSes make their own assumptions about how things
are handled. Debian, for instance, has strong opinions on what should
go where.

2 Patterns for External Files

2.1 Standalone Directory

Most Java apps seem to have chosen the standalone directory pattern
as the basis of their deployment.

This has many advantages. All references to files can be done using
relative paths from the root folder, which minimizes the number of
assumptions that need to be made about the underlying OS.

In the case of the Apache folks, the remaining differences are handled
in platform-specific startup scripts, and the app itself hardly sees
any of it. These startup scripts are quite complicated, but they are
reused and refined over many projects.

For our own projects we do not need such complicated startup scripts,
but this is the right place to put the glue between the OS and the
app. If it is buried in the app, it is just as complicated, and
impossible for sysadmins to modify when the deployment conditions
change.

Many frameworks support this way of working by exposing the root
directory in a configuration variable, allowing easy configuration
relative to the root.

2.2 Resource Loading Files

Many Java products load resources from the classpath instead of
opening files directly.

This makes it easy to provide sensible defaults in the jar
files. Maven also has special support for this: the resources folder
is added to the classpath before the classes and the jars. For
testing, src/test/resources is added before that. On deployment, the
${appname}/conf folder is added before the jars.

By putting the right config file in the right location for the
defaults (src/main/resources), test (src/test/resources) and
deployment (${appname}/conf), the app is properly configured without
the need for any smarts in the app.

2.3 Separate Internal Configuration from External Configuration

This applies especially to Spring (where it would be suicide to do
otherwise), but it is in fact generally applicable. The point is that
some configuration is intended to be changed by the sysadmins and
some is not.

Writing modular, loosely coupled software is great and good
practice. Gluing the pieces together using some form of configuration
file is just as great. Part of this configuration is real product
design, and changing it would make it a different product; this
includes how the core pieces are wired together. This part should be
internal and separated from the external config.

Other configuration concerns details which do not alter the purpose
but fill in changing specifics: IP addresses, names, email addresses,
database connections, … These we will find in the external
config files.

Note that significant parts of the app can be provided by plugging in
components. Of course these components need to be externally
configured too. So these are in external config files.

Import the external files and the internal files in a way that the
external files can override the internal ones.

Copy the default external config files to the ${appname}/conf folder
so the admin who has to manage them can immediately see the
defaults. Also take care to comment them so that the person editing
them does not need to go digging for the manual.

Please keep the configuration files small. The ideal application is a
zero-configuration app which auto-detects its settings from existing
resources, not an app where every feature can be tweaked and
customized. Every configuration parameter needs to be coded,
documented, deployed, managed, reviewed, adjusted and corrected
(usually several times), so it ends up being very expensive.

External configuration is poison; use it in medicinal quantities (not
necessarily homeopathic quantities: if it is needed, it is needed).

2.4 Logging and Monitoring

Since both these things are essentially non-functional requirements,
they should be pushed down to the platform and out of the app.

All logging frameworks are essentially pluggable. They collect the
log messages in a back-end independent way and send them to an actual
logging implementation, an appender, usually writing to a file, but
this could just as well be an email, a JMS message, an SNMP trap, …

Of course, where those messages end up largely depends on the
organization supporting the app and should be decided by them. So the
final loggers should be treated as externally configurable components.

So the app should not get involved in the details of logging; just
ship a default configuration with some sensible defaults (size-based
rotating log files, so the dev and test machines do not run out of
disk space) in an external config file. Please add a comprehensive
set of commented log targets so the admins can easily change the log
levels in a granular way to support the app effectively.

Similarly, the app may rely on an external monitoring system being
available which watches the error logs for critical errors. Document
these in the Operations Manual under the monitoring section.

Also make sure that the app behaves consistently with the protocols
it uses. A website which has an error should return a 5xx status,
referencing a non-existent entity in a REST API should return 404,
… , whatever the norm is. This makes monitoring with tools like
Nagios a breeze, as no parsing of the page needs to be done.
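A sketch of the payoff: the whole check collapses to looking at the status code. The classification below is illustrative, and the curl line is shown commented out because the URL is deployment-specific:

```shell
# Nagios-style check logic: only the HTTP status code matters, no page
# parsing. A 2xx or 3xx answer is healthy; anything else is critical.
check_status() {
  case $1 in
    2??|3??) echo "OK";       return 0 ;;
    *)       echo "CRITICAL"; return 2 ;;
  esac
}

# In a real probe the code would come from something like:
#   status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
check_status 200
check_status 503 || true  # the non-zero exit is what Nagios keys on
```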

2.4.1 What if the business asks for Special Monitoring

Tough question. In principle it is now a functional business
requirement and there should be a story for it. The risk is that this
requirement silently breaks after a config change during routine
maintenance.

The best way to deal with it is to make it part of the application,
but still push it as far down in the framework/libraries as possible.

For example, if the logging framework can be leveraged, then the
internal configuration could include a predefined appender for the
business notifications, separate from the external appenders.

In practice, deal with them on a case by case basis. Maybe you can
talk the business out of it, or rely on Nagios configuration managed
by Ops? Talk to the stakeholders.

2.4.2 Real-time monitoring and administration

Allowing access to internal values, parameters and admin functions
through some standard management framework like JMX is another
interesting pattern often seen.

Implementing this is straightforward, and the information will be
exposed by a plethora of tools providing a UI for managing it, so the
code can focus on the business value instead of on building
management screens. Just do not forget to document it
(self-documentation is best, of course), and give instructions in the
Ops manual to control access to this functionality.

Some projects notify developers and stakeholders immediately when
exceptions or other things happen. Another great pattern, but try to
push it out of the app using standard features of frameworks like the
logging framework, Camel, …

Copying classes from other projects is definitely not recommended;
that is a library shouting to get out. Refactor it into a separate
module and ask to make it part of the company foundation so it is
just there when needed. Just look and ask around first whether this
wheel has not already been invented.

2.5 Complying with OS Rules through Packaging

The assumption above of storing everything under one folder runs
against the grain of the Linux standards (although it is actually the
Mac and Windows way of working).

I’ll treat the case of Debian-based distros here, but the same is
possible for Red Hat and other distros.

In short, use symbolic links to move the folders to the locations
where Linux is happy and keep them visible in the local folder for
the JVM. Everyone happy.

2.5.1 Main Deploy folder

All read-only stuff, which is the real application stuff, is expected
somewhere beneath /usr (but not /usr/local, which is reserved for
locally compiled packages, which we never do).

I recommend creating the app home folder in /usr/share/${appname}
and copying all libraries, binaries, scripts, static resources, etc.
into it.

2.5.2 Config files

Config files in Debian are expected under the /etc folder, and the
package manager will automatically flag files deployed there as
config files, so this does not need to be done separately (unless you
want to change the defaults, of course).

Just move the default config files to /etc/${appname} and create a
symbolic link:

cd /usr/share/${appname}
ln -s /etc/${appname} conf

Well, I guess the debhelpers have better tools for this, so use
whatever is customary for the build tool you use.

2.5.3 Data files

Stored data should end up somewhere under /var. I recommend using a
folder under /var/lib/${appname} and creating folders there which
you link back to the main deploy folder. If you only need one data
folder, you do not need to create subfolders, of course.

2.5.4 Log Files

Log files are expected beneath /var/log.

Create a folder /var/log/${appname} and link this to
${apphome}/logs. Make sure the folder is owned by the user the app
will be running as.
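The wiring of sections 2.5.1 through 2.5.4 can be rehearsed in a scratch tree before it goes into the package scripts. Everything here is illustrative: the app name, the ROOT prefix (set it empty on a real box, and run as root) and the owner:

```shell
# Stage the Debian-friendly layout: real data under /var, symlinks
# keeping everything visible beneath the app home for the JVM.
ROOT=${ROOT:-/tmp/stage-myapp}   # scratch prefix; empty on a real box
APP=myapp
mkdir -p "$ROOT/var/lib/$APP" "$ROOT/var/log/$APP" "$ROOT/usr/share/$APP"
ln -sfn "$ROOT/var/lib/$APP" "$ROOT/usr/share/$APP/data"
ln -sfn "$ROOT/var/log/$APP" "$ROOT/usr/share/$APP/logs"
# On a real install, also hand the log folder to the runtime user:
#   chown appuser: /var/log/$APP
```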

2.5.5 Dotfiles

Now we get into the hard cases. Normally this is only needed for
desktop apps; server apps should never use personal dotfiles.
However, this is one of those cases where you should never say never.

For desktop apps, use the Java preferences support
(java.util.prefs). This uses personal dotfiles on Unixy OSes and the
registry on Windows. Easy-peasy for greenfield apps. Problem solved.

For 3rd party apps or libs, we have to play the hand we’re dealt. A
typical example is .netrc, which is used to store passwords outside
the app. Good practice, but a major headache.

For server apps, try to avoid it. Before you know it, you can no
longer do a ‘git clone …; mvn install’ to build it. Keeping build
dependencies down is critical for long term support and easy
onboarding.

In any case, they are no deployment issue other than making sure they
are documented and some samples are available for the complicated
files.

2.6 Apps deployed on a runtime platform

Many java apps, components, webapps, … are deployed on some kind of
runtime, be it a servlet container, appserver, OSGi container, …

Great. Leverage it. Push all this stuff down into the container, so
you can surf on the work done by the container packager.

For instance, the JBoss server has a folder …/conf in the instance
being started, which is on the classpath. Just dump your external
config files there with cfengine or whatever you use for deploying.

Log files are also taken care of, as that is a service the container
should be offering. Just document the important categories and log
levels as usual; the rest is the concern of the container admin.

In general if you deploy on a controlled environment, expect that
your external dependencies are provided by the container. Work with
the container owner to find the sweet spot.

For testing this is no issue, as Maven will do the right thing in
unit and integration testing.

3 Conclusion

In order to focus on the value of apps, we must separate the business
code and stuff like data files, config files and log files as far
from each other as possible. It is often already difficult enough
(read: expensive) to fix bugs without having all that cruft sprinkled
through the codebase and the essential configs.

Most of the requirements posed by the details of connecting the app
code and the external stuff fall in the realm of the non-functional,
and should be moved as much as possible out of the programmed code,
into frameworks and runtime containers, and into the hands of the
admins.

The best way to deal with those external dependencies is to push
them away from the app code and ignore them for the rest. With the
guidelines above this can be realized to a large extent in a
straightforward way.

Both configuration code and configuration parameters are poison over
time. Use them in medicinal doses.

Ubuntu contains a nice notification system to inform the user about noteworthy events. However, when the message disappears, by default it is gone. Often I am busy with something else when the notification pops up and I pay little notice. Then somewhere in my brain something gets triggered by a word, and by the time I focus on the message to really read it, it disappears. Some of these appear after booting or logging in, so it is not trivial to redisplay them. So I really need a system that logs notification messages so I can read what I missed.

Ideally there would be a scroll-back buffer in the notification feature. Maybe there is, but I didn’t find it.

Standard Notification Logging

The developers provided a log in *~/.cache/notify-osd.log* (see the
NotifyOSD page on the Ubuntu wiki,
https://wiki.ubuntu.com/NotifyOSD) for logging notification messages.
So this used to be a moot point, as a quick

$ tail ~/.cache/notify-osd.log

gave me what I needed.

However, somewhere between 12.04 and 13.04 the default behavior was
changed to no longer log to this file; my last entry was from
December 2012. It was considered a developer feature and was disabled
for the general public, but it is still controllable using an
environment variable: LOG=1.

To enable this globally, edit */etc/environment* and add

LOG=1

However, since this is such a vague variable name, I guess it will
enable logging in more than just *notify-osd*. I am fine with this
(at least until my disk runs out).

The wiki, which actually contains a design document rather than
documentation, makes no mention of it. It is also not very logical as
the file is reset when logging in, so it should never pose a threat
to the disk space. In any case, I dropped a small note in the wiki to
point users in the right direction.

Logging Notifications with an App

There is also an app for that: the package indicator-notifications
keeps track of the notifications you receive. You can install it
with:

$ sudo apt-get install indicator-notifications

Preparing Plone to start the WebDAV service and setting the permissions to allow users to make use of it is only half the battle; actually using it, especially from automated systems like build servers, is another struggle.

Using the Cadaver WebDAV client

Although WebDAV is currently well integrated in modern desktop environments, a CLI alternative is useful for automation, like Jenkins build scripts.

Automatic Login

Ideally we want password-less operation from scripts, both from a usability and from a security standpoint.

Cadaver supports automatic login to servers requiring authentication via a .netrc file, like the ftp client does. The syntax is such that the file can be shared between both tools.

The file ~/.netrc may be used to automatically login to a server requiring authentication. The following tokens (separated by spaces, tabs or newlines) may be used:

machine host

Identify a remote machine host which is compared with the hostname given on the command line or as an argument to the open command. Any subsequent tokens up to the end of file or the next machine or default token are associated with this entry.

default

This is equivalent to the machine token but matches any hostname. Only one default token may be used and it must be after all machine tokens.

login username

Specifies the username to use when logging in to the remote machine.

password secret

Specifies the password to use when logging in to the remote machine. (Alternatively the keyword passwd may be used)

Example ~/.netrc file

default
login jenkins
passwd secret

Example Session

Troubleshooting

Error 409 Conflict

This can mean a lot of things, like a real conflict. However, most of the time it means that the folder the files are being uploaded to does not exist. Creating the missing collection first (for example with cadaver’s mkcol command) resolves it.

Automating Cadaver with davpush.pl

Cadaver uses an ftp-like command language to interact with the WebDAV server. This is very flexible, but impractical when a large number of files and folders must be uploaded, which happens often when the documentation for a new release must replace the previous version.

Cadaver accepts its input on the stdin stream, which allows us to pipe a script of commands to it. Since it is non-trivial to create and maintain such a script by hand, a script generator is needed. The generator presented here is meant to be simple and easy to use and modify. No attempt was made to add advanced syncing (like removing deleted files), handle exceptions gracefully or ‘do the right thing’.

With that in mind, organize the docs in such a way that it is easy to delete the target folder and push a fresh copy to clean everything up. This is common (and good) practice anyway in order to effectively use relative links within a subsite.

The principle is to cd to the root directory of the documentation, run the script there and point it to the target.

Usage

davpush.pl dav://hostname:port/upload_path

Uploads all files and folders recursively to the WebDAV folder passed in the url.

Code Notes

The standard perl File::Find module traverses the folder tree in the right order to make sure all folders are created before other files or folders are created in them. Default behavior is to chdir to the directory, but then we lose the nice paths relative from the root, which would require additional administration entering and leaving the directory. Setting the no_chdir flag in the options keeps the paths like we want them in the script. (Look at the preprocess and postprocess options to help with the directory admin, but I think the added complexity will outweigh the gains for small to moderate trees)

For every file or folder, the wanted subroutine is called. For files we just add a mput command to copy the file over, because it keeps the path intact. If there is a file already (and the permissions are not screwed up) then it is overwritten. When we enter a new folder then we create the folder. If the folder already exists we get a (harmless) 405 Method Not Allowed error. Here we make another offer to the God of Simplicity, and ignore it.

After walking the tree, we have the script in the $script variable. It is unceremoniously piped as input for cadaver. We add the bye command to close the session, and we’re done. The output of cadaver appears on the stdout for easy verification using a MkI Eyeball check or by piping it to grep.

Ubuntu has a Network Proxy chooser which allows you to select a location (à la Mac OS X). This works well enough, except that the UI is a bit counter-intuitive (in my humble opinion), which causes me to regularly nuke some predefined setting inadvertently. This is not a big deal though.

However, for the update manager (and several other tools) to pick up the new proxy settings, you need to push the settings down to the system level. This takes typing your password two times. Now, this IS a big deal.

When I go back and forth between work and home I have to change this at least two times per day. It also irks me that a detail setting like the proxy is not auto-detected and that I need to log in to change this ‘system’ setting. My laptop is essentially a single-user system and I do not see switching the proxy as a serious security issue, even with 3 kids running around the home.

To come back to auto-detection: while this works fine at work, it fails to figure out that at home there is a direct connection to the Internet. I could probably fix this by replacing my aging wireless router with my Time Capsule as the Internet gateway router, but I prefer to have the Time Capsule close to my desk.

In any case, the Network Proxy tool shows the authentication dialog box two times. A particularly nice feature (is this new in Natty?) is that the dialog shows for which DBUS setting access is being asked.

The first dialog asks access to com.ubuntu.systemservice.setProxy. This response is configured in the file /usr/share/polkit-1/actions/com.ubuntu.systemservice.policy, a very readable XML file which contains a section for the setProxy action. I feel no reservation in allowing unchecked access to setProxy: although this might make a man-in-the-middle attack easier, someone with the sophistication to pull that off does not need to doctor my PC to do it.

Note that the action was configured with auth_admin_keep, which according to the docs means we should stay authenticated for some time, so I would not expect the second authentication I am getting. Must be a subtlety which escapes me at the moment.

The second action is more problematic, since set-system on the system gconf settings is much less fine-grained than setProxy and can potentially cause more damage to the system.

‘Dotfiles’ is the term for folders and files starting with a ‘.’, so that they do not show up when using a plain ls.

The Mac has a cool key code to toggle the visibility of dotfiles in the File Open/Save dialog, but for one reason or another this does not work in the Finder.

In practice this meant I had to deal with dotfiles and dot-directories. I found on the net some incantation to force the Finder setting to show or hide the dotfiles; upon restarting the Finder, the windows reopen with the updated setting.

I found some snippets of sh script on the net (but I forgot where and cannot immediately retrieve it), and I immediately dumped them in my ~/bin folder.

Overview and Goals

We build our solutions mostly on Ubuntu Natty and we deploy to Debian (currently lenny). One problem we face is that Debian has a slow release cycle and the packages are dated. Before a new release is approved and deployed to our target servers it can take many more months, forcing us to use up to 3 year old technology.

So we often have to ‘backport’ packages or debianize existing packages if we want to use the current releases.

In the past we had different build servers for the target architectures. However this is a heavy solution and scales poorly. It also makes upgrading to the next release that much heavier.

The pbuilder program creates a clean-room environment from a freshly installed, empty Debian or Ubuntu distro, chroots into it and starts building based on the project metadata, mostly from the debian/control file.

It does this by unpacking a preconfigured base image of the selectable target, installing the build dependencies, building the package in the clean room, moving the artifacts to the hosting machine and cleaning everything up again. And it does this surprisingly fast. This clearly satisfies goals 1 and 2 (and half of 3, if we assume a developer has full control over his laptop).

pbuilder is configured through command-line options, which are clear and friendly enough, but you end up with command lines several lines long which are impossible to type in a shell and a maintenance nightmare in build scripts (clearly conflicting with point 5). Also, in an ideal world we would be able to retarget a build without touching the checked-out files, e.g. with environment variables (see goals 3 and 4).

Configuring pbuilder

On the Pbuilder Tricks page I found a big, smart shell script to use as the pbuilder configuration file ~/.pbuilderrc.

I just updated the distribution names to the current situation and added the directory where the packages are collected as a repository so subsequent builds can use these packages as dependencies. I also specified the keyrings to use for Debian and Ubuntu and made sure the expected folders are created to mount them in the clean room.

I created this in my account on my development laptop and added a symbolic link from ~root/.pbuilderrc to this file, so I can update it from my desktop environment and do not have to get my brain all twisted up trying to remember which configuration I am busy with in my shell, sudo, su –, …

The way the script works is that the configuration adapts itself to the content of the DIST and ARCH environment variables. So to configure lenny-amd64 as the target, it is sufficient to do

~ > export DIST=lenny
~ > export ARCH=amd64

This approach is also perfect for letting Jenkins or Hudson determine the build target from the checked-out sources, since it can be specified in the build recipe. (This satisfies goals 3b, 4 and 5.)

Since we have to run these programs using sudo, we must make sure the environment variables are passed through by sudo. We can do this on the Defaults line of the /etc/sudoers file with the env_keep instruction.

Add the DIST and ARCH variables there. I also included the environment variables for proxying, so I can easily switch between environments on my laptop and have the changes propagate to sudo (which is also useful for plain apt-get, by the way).

I also added a line to show how to make the tools available to a user without having to give their password. This is not needed for interactive work, but very much so for the user the CI server runs as (in our case jenkins). Note that this definition should come after the group definitions, otherwise those take precedence and jenkins has to provide its password (read: the build hangs).
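A sketch of the corresponding /etc/sudoers lines (the variable list and tool paths are illustrative; edit the file with visudo):

```
# Let sudo pass the build-target and proxy settings through:
Defaults env_keep += "DIST ARCH http_proxy https_proxy no_proxy"

# After the group definitions: let the CI user run the build tools
# without a password.
jenkins ALL = NOPASSWD: /usr/sbin/pbuilder, /usr/bin/pdebuild
```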

Creating the target base images

The heavy lifting is now done. Let’s create a base.tgz for
lenny-amd64; with the configuration above, the exported DIST and ARCH
select the target:

~ > sudo pbuilder create

Build the debianized package against this image (for example with
pdebuild from the package source tree), and you should get a nice set
of Debian packages in /var/cache/pbuilder/lenny-amd64.

In practice you will often end up with errors like:

... snip ...
The following packages have unmet dependencies:
pbuilder-satisfydepends-dummy: Depends: xulrunner-dev (>= 2.0~) but it is not installable
The following actions will resolve these dependencies:
Remove the following packages:
pbuilder-satisfydepends-dummy
Score is -9850
Writing extended state information... Done
... snip ...
I: cleaning the build env
I: removing directory /var/cache/pbuilder/build//6279 and its subdirectories

In these cases you have to walk the dependency tree till you find the leaves, and then walk back up the branches to the trunk. Note that, unless you target machines which serve only a very specific purpose, chances are you will end up with packages which are uninstallable, since you pull the rug out from under other installed packages. However, we have the principle of using one virtual host to deliver one service, hence very few packages are deployed to them and nothing complicated like desktop environments.

Sometimes the dependencies break on the version of the debhelpers. This version requirement is generated by the dh* scripts and is often overly conservative; many packages build just fine with older versions of the debhelpers.

Setting up automated build

To set this up on the build server we have to replicate the steps above.

Org Mode can use idle time to correct time-tracking entries. On the Mac this works on the idle time of the computer; on other platforms it uses the idle time of Emacs. Hence, if you do a significant task in another program, org-mode will consider that time idle.

There is a little program delivered with the org-mode sources to estimate the “real” idle time based on the information used by screensavers. Unfortunately it comes as source code and needs to be compiled first. Everything is actually well documented, just scattered.

Now we can compile x11idle. We need to link against the X11 and X Screensaver (Xss) libraries:

~ > gcc -o x11idle x11idle.c -lX11 -lXss

I immediately place the resulting executable in ~/bin/x11idle. Alternatively you could use the /usr/local/bin folder, but then first compile to a temporary location like /tmp and move it there: sudo mv /tmp/x11idle /usr/local/bin.

The Mac has a great utility called Divvy to easily map windows to locations on the screen using the keyboard. Fiddling with the mouse to get multiple windows in the right location is a productivity killer and a pain in the neck (or shoulder, or elbow, or …).

Ubuntu (and of course any other Linux distro with compiz) has a similar feature built in as a plugin for compiz. Type Alt-F2 and enter ccsm + Return in the command prompt to launch the CompizConfig Settings manager. Select the Window Management from the left menu and enable the Grid plugin. Click on it so you can look at the key bindings in the Bindings tab.

If you are on a desktop or a big laptop with a separate numeric keypad, you are set. As you can see, the window locations are by default mapped logically to the arrow keys on the numeric keypad.

However, my laptop does not have a separate numeric keypad, and enabling the embedded one before typing the key code is a pain. Remapping is easy: click on the button with the key code, and a window appears with a Grab key code button. Click it and type the new key combination you want to assign. If there is a conflict, you will get a window explaining the conflict and asking how to resolve it.

My first attempt was to remap using Super plus the laptop’s numeric keys. This conflicted with the Unity launcher, since the top row 7-8-9 maps to the apps in the launcher.

To avoid conflicts, I now use Control-Super with the keys around the j key (which is the home key for the right hand).

Also, autokey (a text macro expander) is mapped to Super-K.

Control-Super-j : 100% (maximize)

Control-Super-h : 50% left

Control-Super-k : 50% right

Control-Super-u : 50% top

Control-Super-m : 50% bottom

Control-Super-y : 25% top-left

Control-Super-i : 25% top-right

Control-Super-n : 25% bottom-left

Control-Super-, : 25% bottom-right

If Control-Super-j is used to maximize a window, pressing one of the other combinations will first restore it to its original size and position, and only map it to its place on the second press. I consider this a feature, but you are free to interpret it as a bug.

As a result, this is now a super practical way to divide the windows on my screen.

Recently our Sonar installation on our Hudson CI choked again, this time with an error I had not seen before. It was just before the release of an important milestone for the team, so not being able to publish a new version on the test server could not have come at a worse time.

OK, apparently something returns 2 results where there should be only one, and it stands to reason that it is coming from the database. Pity there is no indication of which table is affected.

Some googling retrieved a post which points to the snapshots table, and more specifically to records with the field islast set to true.

A quick check in …/conf/sonar.properties revealed the database connection parameters.

OK, this is a good time to check that you have an up-to-date backup of your database (or make one if you’re not sure).

Using your SQL query tool of choice (I used the built-in Data Sources tool in IntelliJ), connect to the database.

The snapshots table contains a project_id which, confusingly, does not really reference projects, but all kinds of assets, artefacts, files, however you like to call them. Unfortunately there are many thousands of them. Even when I limited the query to the ones with islast=1, there were close to 1000 records.

Narrowing further down with:

select project_id, cnt
from (select project_id, count(*) as cnt
      from snapshots
      where islast = 1
      group by project_id) as cntsnap
where cnt > 1

gave me the project_id I had been looking for, in my case 2089.

Now a quick

select * from snapshots where project_id=2089

gave the 2 offending rows. A quick glance showed that one of them was very suspicious: the parent_id was the same as its own project_id, and there were lots of null columns.

I deleted the row based on its id, retriggered Hudson to rebuild the project, and the build succeeded; Sonar seems to be happy again.

I hope there will be no other repercussions. For our purposes the Sonar data is not really critical and we could also restart from scratch without any impact. If your Sonar data is more critical, a better choice would be to restore a backup from before the builds started failing.