An Apology

This page is here to codify the development procedures we will employ
for PySIT. I apologize in advance if anything here seems overly restrictive
or annoying. A set policy is necessary on many fronts to enable smooth,
quality development with multiple collaborators. I put this together, not
to be bossy, but so that this project can succeed. We all have different
backgrounds and coding styles, but it is best to present the software with a
united front. If there are any suggestions for modifications to these
standards and practices, by all means bring them up. There is no 'one true
right way,' and even if there is, I don't profess to know it.

The notes below are compiled from experience developing numerical code
in a mixed Python and C++ environment, extensive reading on best practices
(and reading arguments about best practices) in Python software
development, an understanding of the conventions within the Python and
numerical Python software communities, and ideas borrowed (and modified)
from existing projects. By no means do I claim that this is the
be-all and end-all, best (or only) way to do things, but it should work. If
there are any problems or flaws, let's fix things!

Source Control

We will be using the Mercurial (hg) package for source code revision
control. It does not appear that hg is available by default in the math
department computing environment, so you will have to install it to your
home directory. Download the
package and follow the installation directions. Make sure the hg
binary is in your path.

Mercurial is a distributed version control system (DVCS; similar to
git, if that is familiar). If you have used CVS or SVN in the past, a
DVCS has the same core goal, but the philosophy is different. In either
case, if you are new to hg, a great place to start understanding the
software is Hg Init. There are plenty
of other tutorials on the web, as well. Also, feel free
to ask me any questions.

We will use a centralized repository to maintain a common version of the
code. The central repository is hosted on BitBucket.org. The general idea is that
each of us will have our own 'fork' of the central
repository in which we will do our development. Once a feature or bug
fix is complete, we will then issue a 'pull request' so that the change can be reviewed and merged into the central
repository.

A major difference between classical version control (like SVN) and
Mercurial is that in SVN, a commit is immediately pushed to
the central repository while in hg, a commit is a local
operation. Commits/changes in hg are kept in the local repository until
they are explicitly pushed to the central repository. This means that
you can roll back local changes easily, without ever worrying about
breaking the main code. You can even remove all of your changes by
deleting your local clone, with no adverse effects on the central
repository. You can also have multiple clones, each to work on
different features. This last option may be useful if you are working
on two large but independent additions to PySIT, though 'branching' is the preferred approach.

Work on a change or bug fix in your local repository. If it is a significant change, first create a named branch with hg branch NEW_BRANCH_NAME.

Commit changes to your clone frequently and on a per-change
basis. This means that if you change multiple files for the same
purpose (e.g., changing a common variable name globally within the
project), you should commit all of these files at once.

Periodically, you can push your changes up to your fork (for backup purposes) using hg push.

Once you are satisfied that the bug fix or feature is complete
and that the new code passes the unit tests (more on this later),
you are ready to submit the changes to the central repository.

First, run hg pull upstream default to get the latest changes from the central repository. This picks up anything that others have changed in the main repository since you last pulled from it.

If there have been changes and there are conflicts, follow the
instructions for merging to resolve the conflicts.

Once all conflicts have been resolved, go to your fork on BitBucket and issue a pull request to the default branch of the main repository.

From there, we can all comment and review the new code and I can merge in completed code.

Continue developing your local version, and repeat the steps above
as necessary.
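The command-line part of this workflow can be sketched as a short Python script. This is a rough illustration only: the helper names workflow_commands and run_workflow, the branch name, and the commit message are all hypothetical, and the hg pull line is copied verbatim from the step above. By default the script only prints the commands; asking it to execute them assumes hg is on your path and that you are inside a repository.

```python
import shlex
import subprocess


def workflow_commands(branch_name, commit_message):
    """Return the hg commands for one pass through the workflow, in order."""
    return [
        ["hg", "branch", branch_name],           # only for a significant change
        ["hg", "commit", "-m", commit_message],  # commit locally, per change
        ["hg", "push"],                          # back up to your fork
        ["hg", "pull", "upstream", "default"],   # as written in the steps above
    ]


def run_workflow(commands, dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for command in commands:
        print(" ".join(shlex.quote(part) for part in command))
        if not dry_run:
            subprocess.run(command, check=True)


# Hypothetical branch name and commit message, for illustration only.
commands = workflow_commands("fix-solver-tolerance", "Tighten solver tolerance")
run_workflow(commands)
```

Resolving conflicts and issuing the pull request on BitBucket remain manual steps, so they are not part of the sketch.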

Note: For the pull from upstream to work, you must add the line (under the [paths] heading)

upstream = https://rhewett@bitbucket.org/rhewett/pysit

to your .hg/hgrc file in the repository directory.
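Editing the file by hand is perfectly fine. For anyone who would rather script it, here is a minimal sketch that adds the entry with Python's configparser module. It assumes the hgrc file is simple enough for configparser to parse, the helper name add_upstream_path is hypothetical, and the fork URL in the demonstration is a placeholder. The demonstration runs on a throwaway directory, not a real repository.

```python
import configparser
import os
import tempfile


def add_upstream_path(hgrc_path, url):
    """Add or update the `upstream` entry under [paths] in an hgrc file."""
    # Assumption: the file is plain INI that configparser can read.
    config = configparser.ConfigParser(interpolation=None)
    config.read(hgrc_path)  # a missing file is silently skipped
    if not config.has_section("paths"):
        config.add_section("paths")
    config.set("paths", "upstream", url)
    with open(hgrc_path, "w") as handle:
        config.write(handle)  # note: rewriting drops any comments in the file


# Demonstrate on a temporary directory that mimics a repository layout.
with tempfile.TemporaryDirectory() as repo_dir:
    os.mkdir(os.path.join(repo_dir, ".hg"))
    hgrc_path = os.path.join(repo_dir, ".hg", "hgrc")
    with open(hgrc_path, "w") as handle:
        # Placeholder fork URL; substitute your own.
        handle.write("[paths]\ndefault = https://bitbucket.org/YOUR_USERNAME/pysit\n")
    add_upstream_path(hgrc_path, "https://rhewett@bitbucket.org/rhewett/pysit")
    with open(hgrc_path) as handle:
        updated_hgrc = handle.read()

print(updated_hgrc)
```

The existing default entry is preserved and the upstream line is added beneath it in the same [paths] section.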

We will deal with the procedure for releasing the software (i.e., major
version releases) when we get there. No need to burn that bridge until
we are standing on it.

Documentation

Global Project Documentation

We will be using the Sphinx Python package for automagic generation of
documentation. Follow the example docstrings in the current version of
PySIT for an idea of how we are documenting classes and function calls.
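For readers without the PySIT source at hand, here is a sketch of a documented function. The layout shown is the NumPy docstring style, which is common in the numerical Python community and which Sphinx can render through an extension; whether PySIT uses exactly this layout is an assumption, and the function relative_error is a hypothetical example, not part of PySIT.

```python
def relative_error(approx, exact):
    """Return the relative error of an approximation.

    Parameters
    ----------
    approx : float
        The computed value.
    exact : float
        The reference value. Must be nonzero.

    Returns
    -------
    float
        ``abs(approx - exact) / abs(exact)``.

    Raises
    ------
    ValueError
        If `exact` is zero.
    """
    if exact == 0:
        raise ValueError("exact must be nonzero")
    return abs(approx - exact) / abs(exact)


# The docstring is the text that documentation tools pick up;
# help(relative_error) shows the same text in an interactive session.
print(relative_error.__doc__)
```

Note that the docstring records the signature, the return value, and the error mode, which are exactly the things that must be updated when the function changes.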

More details on this will be included here as the project evolves and we
have a better idea of what we need. As a rule, it is best to start
documenting early and to keep up with it as you go. It is much more
difficult to handle documentation after code is written. For this
reason, when you make a change to existing code (e.g., a function
signature), be sure to update its global documentation as well, and be
sure to do this before you push your changes to the central repository.

Local Source Code Documentation

In addition to the global documentation, let's be consistent in our
local code documentation too.

Use common sense and document code as it is written.

Avoid clutter; really obvious code should not need long
explanations.

Using descriptive variable and function names goes a long way
toward making code easier to read, and cuts down on the number of
comments required.

Clever coding and numerical tricks should be documented so other
developers (and users!) are not left guessing at what tricky code is doing.
Explain the logic of the code if necessary.
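As an illustration of these points, here is a small sketch of a numerical trick with the kind of comment that keeps it from being mysterious. The function stable_norm is a hypothetical example, not PySIT code.

```python
import math


def stable_norm(values):
    """Euclidean norm of a sequence of floats, robust to overflow."""
    # Numerical trick: squaring very large entries can overflow even when
    # the norm itself is representable. Dividing by the largest magnitude
    # first keeps every squared term in the interval [0, 1].
    scale = max((abs(v) for v in values), default=0.0)
    if scale == 0.0:
        return 0.0  # all zeros (or empty input): avoid dividing by zero
    return scale * math.sqrt(sum((v / scale) ** 2 for v in values))


large_norm = stable_norm([3e200, 4e200])
print(large_norm)
```

The obvious lines carry no comments at all; the comment is spent on the one step a reader would otherwise have to reverse-engineer.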

Unit Tests (aka, Test Driven Development)

We will use the Python package nose for unit testing.
Briefly, the idea behind unit testing is that every bit of functionality
(within reason, of course) should have a test that guarantees that it
works correctly.

For example, if you wrote an iterative linear solver, you might have
unit tests to confirm that the output vector has the correct dimensions
(you would be surprised how easy this is to screw up), to check on a
series of small examples that the solution is within the specified
tolerance, and to check that any error modes (e.g., if the linear
system should be SPD but it is not) are properly accounted for.
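A minimal sketch of what such tests could look like follows. The solver conjugate_gradient is a hypothetical pure-Python stand-in written for this example, not PySIT code, and its SPD checks are deliberately cheap. The tests are plain functions named test_*, a naming pattern that test runners such as nose are expected to collect.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive definite matrix A."""
    n = len(b)
    # Error mode: the method is only valid for symmetric matrices.
    for i in range(n):
        for j in range(i):
            if abs(A[i][j] - A[j][i]) > 1e-12:
                raise ValueError("matrix is not symmetric")
    x = [0.0] * n
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 <= tol:
            break
        Ap = matvec(A, p)
        curvature = dot(p, Ap)
        # Error mode: non-positive curvature means A is not positive definite.
        if curvature <= 0.0:
            raise ValueError("matrix is not positive definite")
        alpha = rr / curvature
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def test_output_has_correct_dimension():
    x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
    assert len(x) == 2


def test_solution_within_tolerance():
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    x = conjugate_gradient(A, b, tol=1e-10)
    residual = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    assert dot(residual, residual) ** 0.5 <= 1e-8


def test_non_spd_matrix_is_rejected():
    not_symmetric = [[1.0, 2.0], [0.0, 1.0]]
    not_positive = [[-1.0, 0.0], [0.0, -1.0]]
    for bad in (not_symmetric, not_positive):
        try:
            conjugate_gradient(bad, [1.0, 1.0])
        except ValueError:
            continue
        raise AssertionError("expected ValueError for a non-SPD matrix")
```

Each test covers one of the three concerns named above: dimensions, accuracy on a small example, and error modes.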

With this development model, if I make a change to a separate piece of
code, I can run the unit tests to verify that my change did not break
something elsewhere. Such bugs can be hard to track down otherwise.
Before a change is pushed to the central repository, it (likely) should
have its own unit test and (definitely) should not cause any existing
unit tests to fail.

As development of PySIT proceeds, examples of unit tests will be
available. This aspect of the project we can develop as we go.

Conventions

For consistency and ease of development, Python code should be written
using the following conventions:

Python uses indentation to delineate code blocks. This has
caused a bit of a holy war within the Python community. Some prefer
to use a single tab character for indentation, some prefer a certain
number of spaces (e.g., two or four). The former is preferred by
some because different coders prefer different tab display width
settings. The latter is preferred because it is easier to align
code and it is more consistent across editors. Inconsistent
indentation breaks Python scripts.

We will use a hybrid approach that resolves both of these issues.
Tabs will be used for block indentation; spaces
will be used for alignment. For example (where <tab>
is a tab and ~ is a space) an if block should look
like

if (x == 3):
<tab>print(x)

and inside a block, a function call with many arguments might
look like
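One way to see the convention on a multi-argument call is to render the whitespace explicitly. The sketch below is a hypothetical illustration: the helper show_whitespace and the call to a function named solve are made up for this example, and the sample source is only a string that is never executed. It prints leading tabs as <tab> and alignment spaces as ~, matching the notation above.

```python
def show_whitespace(line):
    """Render leading tabs as <tab> and leading spaces as ~."""
    body = line.lstrip("\t ")
    lead = line[: len(line) - len(body)]
    return lead.replace("\t", "<tab>").replace(" ", "~") + body


# One tab marks the block level; spaces then align the continuation
# lines under the first argument of the call.
alignment = " " * len("result = solve(")
source_lines = [
    "if (x == 3):",
    "\tresult = solve(matrix,",
    "\t" + alignment + "rhs,",
    "\t" + alignment + "tolerance)",
]

rendered = [show_whitespace(line) for line in source_lines]
for line in rendered:
    print(line)
```

Because the tab carries only the block level and the spaces carry only the alignment, the arguments stay lined up no matter what tab width an editor displays.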

Other variables, functions, and class members should be
lowercase_with_underscores, with underscores separating whole words.

Use intelligent abbreviations if the variable name (or on
occasion, the method name) is too long or if there is a well
known abbreviation (e.g., PML). Good, descriptive variable
names make for more readable code. Excessively long variable
names hinder development.
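A short sketch of these naming guidelines in practice follows. The function pml_damping_profile and its quadratic ramp are hypothetical, chosen only to show a well-known abbreviation (PML) inside a lowercase_with_underscores name.

```python
def pml_damping_profile(n_cells, max_damping):
    """Quadratic damping ramp across a PML that is n_cells grid cells wide."""
    return [max_damping * ((i + 1) / n_cells) ** 2 for i in range(n_cells)]


# Harder to work with: a name too terse to carry meaning ...
#     def pdp(n, m): ...
# ... or one too long to type and scan comfortably.
#     def perfectly_matched_layer_quadratic_damping_profile(n, m): ...

damping_profile = pml_damping_profile(n_cells=4, max_damping=2.0)
print(damping_profile)
```

The abbreviation keeps the name short without hiding what the function computes, and the argument names remove the need for a comment at the call site.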

More as I think of them...

For consistency and ease of development, C/C++ code should be written
using the following conventions: [defined later].