Friday, December 30, 2005

I've been looking around for other blogs that are primarily (or at least regularly) devoted to the subject of Software CM and/or Version Control. I did some searching thru blogsearch.google.com but mostly my own surfing turned up good results. I chose to omit blogs that don't seem to be updated anymore (like Brian White's Team Foundation blog - especially since Brian left Microsoft).

Anyway, here is what I found. If you know of others, please drop me a line.

Charles Betz' ERP4IT blog is chock full of great stuff that is directly relevant to CM even when it doesn't directly discuss CM! It also has some great blog entries on Model-Driven CM, CMDB's and much more.

Lord, please grant me ... the serenity to accept that I can't read everything, the time to read and understand everything that I can, the wisdom to know the difference [so I won't have to leave my estate to Amazon.com], and a sufficiently well-read network of friends [to tell me all about the books they've read].

We thought 2005 was a pretty gosh darn great year for Agile and Software CM alike. We wanted to share what we feel are some of the timeless classics that we have most looked to throughout the year, as well as the new books in the last year that we have been most impressed with.

Those of you reading this are encouraged to read the article to see what we had to say about some of the following books (as well as several others):

Happy Holidays and Hopeful New Years!

A Very Happy Merry ChristmaHannaValiRamaKwanzaakah (or non-denominational solstice celebration) to all in 2005! And looking forward to what 2006 will bring to all of us in the coming year!

Sunday, December 18, 2005

In my last blog-entry I wondered if the interface segregation principle (ISP) translated into something about baselines/configuration, or codelines, or workspaces, or build-management. Then I asked if it might possibly relate to all of them.

Thus far, the SCM principles I've "mapped" from the object-oriented domain revolve around baselines and configurations, tho I did have one foray into codeline packaging. What if each "view" defined a handful of object-types that we want to minimize and manage dependencies for? And what if those principles manifested themselves differently in each of the different SCM/ALM subdomains of:

What might the principles translate into in each of those views, and how would the interplay between those principles give rise to the patterns already captured today regarding recurring best-practices for the use of baselines, codelines, workspaces, repositories, sites, change requests & tasks, etc.

The "short version" of ISP in the initial article states that: => "Clients should NOT be forced to depend on interfaces that they do not use."

The summary of ISP in Uncle Bob's website says it differently: => "Make fine grained interfaces that are client specific."
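For readers who haven't seen ISP in code, here is a minimal Python sketch of the object-oriented version of the principle. All of the names (`Printer`, `Scanner`, `SimplePrinter`, `MultiFunctionDevice`, `send_report`) are hypothetical and chosen only for illustration; they don't come from the original ISP article.

```python
from typing import Protocol


class Printer(Protocol):
    """Fine-grained interface for clients that only print."""

    def print_document(self, text: str) -> str: ...


class Scanner(Protocol):
    """Fine-grained interface for clients that only scan."""

    def scan_document(self) -> str: ...


class SimplePrinter:
    """Satisfies Printer only; it has no scanning method to stub out."""

    def print_document(self, text: str) -> str:
        return f"printed: {text}"


class MultiFunctionDevice:
    """Satisfies both Printer and Scanner."""

    def print_document(self, text: str) -> str:
        return f"printed: {text}"

    def scan_document(self) -> str:
        return "scanned page"


def send_report(printer: Printer, report: str) -> str:
    # This client depends on the printing interface alone, so it works
    # with any object that can print, whether or not it can also scan.
    return printer.print_document(report)
```

The point of the sketch is that `send_report` never sees `scan_document`, so a change to the scanning interface can't force a change on it.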

In previous blog-entries, I've wondered how this might correctly translate into an SCM principle (if at all).

In Change-Packaging Principles, I wondered if maybe it corresponds to change-segregation or incremental integration: Make fine-grained incremental changes that are behavior-specific. (i.e., partition your task into separately verifiable/testable yet minimal increments of behavior.)

On the scm-patterns list I wondered if maybe it corresponds to composite baselines: composing baselines of other, more fine-grained baselines

Now I'm thinking maybe it corresponds to promotion lifecycle modeling and defining the promotion-levels in a promotion-lifecycle of a configuration-item (e.g., a build).

Why am I thinking this?

I guess I'm trying to go back to the basis of my means of comparison: configurations (and hence baselines) as "objects." If a configuration is an object, then what is an interface of a configuration, and what is a fine-grained interface (or "service")?

If I am thinking in terms of configuration building, then the interface for building the object (configuration) is the equivalent of Make/ANT "methods" and targets for a given item: (e.g., standard make targets like "clean", "all", "doc", "dist", and certain standard conventions for makeflags). That is certainly a plausible translation.

But if I am thinking in terms of baselining and supporting CM-mandated needs for things like reproducibility, repeatability, traceability, from the perspective of the folks who "consume" the baseline (its clients), then maybe the different consumers of a baseline need different interfaces.

If those consumers end up each "consuming" the baseline at different times in the development lifecycle (e.g., coding, building, testing, etc.) then perhaps that defines what the promotion model and promotion levels should be for that configuration.

What if they aren't at different times in the lifecycle? What if they are at the same time?

Then I guess it matters if the different consumers are interested in the same elements of the baseline. If they're not, maybe that identifies a need for composite baseline.

What if they aren't at different times and aren't for different elements, but rather the same sets of elements?

Then maybe that identifies different purposes (and services) needed by different consumers for the same configuration at the same time. Building -versus- Coding might be one such example. Would branching -versus- labeling be another? (i.e. "services" provided by a configuration as represented by a "label" as opposed to by a "codeline", or a "workspace"?)

What if no one of these is the "right" interpretation? What if it's ALL of them?

Then that would be very interesting indeed. If the result encompassed the interfaces/services provided by different Promotion-Levels, Make/ANT-targets, branch -vs- label -vs- workspace, then I don't even know what I would call such a principle. I might have to call it something like the Configuration ISP, or the Representation separation principle, or the manifestation segregation principle, or ....

What, if anything, do YOU think the ISP might mean when applied to Software CM and software configurations as represented by a build/label/codeline/workspace?

"In classic software development tool environments, many different point solutions are used for software life-cycle management. There are requirements management tools, bug trackers, change management, version and configuration management tools, audit and metrics engines, etc. The problem: your development artifacts are scattered, making it difficult to derive useful, timely management information. POLARION® ... keeps all artifacts of the entire software life-cycle in one single place ... gives organizations both tools (for requirements, tasks, change requests, etc.) AND project transparency through real-time aggregated management information ... combines all tools and information along the Software lifecycle in one platform. No tool islands, no interface problems, no difficult, potentially fragile integrations anymore."

However, it does NOT appear to be opensource.

I'd LOVE to see a mixed commercial offering of, say, AccuRev, Jira and Confluence be able to provide this all in one package (just as I described in the blog-entry). [And with AccuRev's and Atlassian's roots in and commitment to opensource (the folks at AccuRev had previously developed the open-source CM system "ODE" for the OSF), they might even consider making it freely available for opensource projects (like Atlassian currently does for both Jira and Confluence)]

Friday, December 09, 2005

For those CVS users who don't already know about Subversion I urge you to take a look. Subversion was designed to be a next-generation replacement for CVS that has a lot of the same basic syntax and development model while fixing or updating most of its well known shortcomings.

Another spiffy open-source project that integrates with both CVS and Subversion is Trac, which provides simple but powerful defect/issue/enhancement tracking (DIET) using a Wiki-web interface. It adds collaborative, low-friction request/activity tracking to your version control, and can be used to track change-sets in the version control tool and associate them with change-tasks/requests in the tracking tool.

Using Trac with Subversion can help "subtract" a lot of the tedium of traceability from your day-to-day work and give more "traction" to your development efforts. So, in a way, Subversion plus Trac gives SubTraction :-)

So I wanted to be the first to try and coin the phrase "Agile Six Sigma" - except I'm not real fond of the resulting acronym, plus someone else might have come up with it already (if only in passing). So I wanted to embellish it a bit to create an even better acronym before I commence the marketing madness for my new "cash cow" idea. Thus I have decided upon:

"Agile Six Sigma - Holistic, Organic, Lean, Emergent."

Seriously tho! I actually think there is a lot of GREAT stuff in and synergies between Agile, Lean, TOC, and Systems Thinking. I think DFSS has some useful tools in its toolbox. I'm less sure of the overall methodology for SixSigma being compatible with Agile methods -- tho I admit David J. Anderson has some GREAT articles that seem to show a connection, particularly the one on Variation in Software Engineering.

I am getting weary of lots of hype that simply throws these buzzwords together (hence my marketing slogan and acronym above :-) but I think they have a lot to offer, and I would be interested in applying them to CM.

I'm particularly curious about using the Lean tools of value-stream mapping along with TOC in analyzing anti-patterns and bottlenecks that often occur in building, baselining and branching & merging (since there seems to be a fairly direct correlation to "code streams" of "change flows" and a "value stream" or "value chain"). Has anyone already done this for CM? (I wonder if something like this could better substantiate the "goodness" of the Mainline pattern.)

Tuesday, November 29, 2005

John was probably best known as one of the "Gang of Four" who authored the book Design Patterns, which was the seminal work on the subject of patterns if not on all of O-O software design, and one of the best selling computer-science books of all time. A wiki-page created in John’s memory is available for all to read, and to contribute to for those who remember him or have been influenced by him. I'll be posting the following memory there in a couple of days...

My first encounter with John was in 1995 on the "patterns" and "patterns-discussion" mailing lists. I was just a lurker on those lists at the time, and didn't feel "weighty" or "worthy" enough to post anything to them.

Then after having lunch (Pizza actually) with Robert Martin ("Uncle Bob") who encouraged me to do so, I ventured a posting to the patterns-list and described the Pizza Inversion pattern. I was actually quite nervous about it - me being a complete unknown and "daring" to post something that poked a little fun at patterns. John and Richard Gabriel were among the first to respond, and the response was very positive. I felt I had been officially "warmly welcomed" into the software patterns community.

A couple years later I attended the PLoP'97 conference and got to meet John in person for the first time at one of the lunches. Like many others, I was in awe of how unpretentious and humble he was. Again he made me feel very welcome amidst himself and others at the table of "rock star status" in the patterns community: he apparently recognized my name and included me in the running conversation, mentioning that when he first read my Pizza Inversion pattern, he "thought it was brilliant!"

Later, at PLoP'98 and PLoP'99, John encouraged me to get together with Steve Berczuk and write a book on Software CM Patterns for the Addison-Wesley Software Patterns Series of books, for which he was the series editor. And during 1999 I actually became editor for the Patterns++ section of the C++ Report, including John's "Pattern Hatching" column and Jim Coplien's "Column Without a Name."

It was both an exciting and humbling experience for me to serve as editor for the contributions of two people so famous and revered in the patterns and object-oriented design communities. They both mentored and taught me so much (as did Bob Martin) during the "hey day" of patterns and OOD.

During the years between 1998 and 2002, John personally shared with me a great deal of insight and sage advice about writing, authoring and editing, as well as lending loads of encouragement and support. I truly feel like I have lost one of my mentors in the software engineering community. John's humor, insight, humility and clarity will be sorely missed.

Saturday, November 19, 2005

I have a whole bunch of reviewer-copies of books that I've been intending to review for several months. So I'll be doing a number of book reviews throughout the remainder of this year, particularly titles from The Pragmatic Programmers and from Addison-Wesley Professional (who were nice enough to give me copies of the books).

Today however I'll be posting about a review of a book from a different publisher. I did a review of the book JUnit Recipes for StickyMinds a few months ago. My summary of my review was:

JUnit Recipes should probably be mandatory reading for anyone using Java, J2EE and JUnit in the real-world. This comprehensive and eminently pragmatic guide not only conveys a great deal of highly practical wisdom but also clearly demonstrates and explains the code to accomplish and apply the techniques it describes.

I defined Integrity as a triplet of related properties: {Correctness, Consistency, Completeness}. Integrity is a property of a deliverable item such as a feature, a configuration or a configuration item. So a feature or item has "integrity" if it is correct, consistent and complete.

I also defined Simplicity as a triplet of related properties: {Clarity, Cohesiveness (Coherency), Conciseness}. So a feature, item, or logical entity is "simple" if it is clear, cohesive and concise.

I then asked the question:

What about "form, fit and function"? Are "form" and "fit" also components of perishable value?

What I've been thinking since then is that the perishable form of value is the extrinsic value that it is given by the customer. From the end-consumer's perspective, what they perceive as the form, fit, and function of the deliverable is what makes it valuable or not. We might call this type of value "Commodity" or "Marketability". [Note: There are several things I both like and dislike about both those possible names, so please comment if you have a preference for one over the other (or for something else) and let me know why.]

Commodity is customer-desired Form, Fit and Function. ... Commodity has to do with what requirements are most valued by the customer at a given time. I think maybe those requirements are in terms of "Form, Fit, and Function". Which requirements those are and how much they are valued is most definitely time-sensitive. When I add "commodity"-based value to a codebase, I am adding time-sensitive perishable value that can depreciate or greatly fluctuate over time....

[from Ron Jeffries]:

A thing, to me, has integrity and simplicity but is a commodity.

I thought about this. And I completely agree - that probably is the main thing that makes the word "commodity" stand out from the other two like "one of those things that just doesn't belong" with them.

Then I think about it some more, and I think, maybe the thing that makes it seem so "wrong" when listed with the other two is perhaps what is so "right" about it after all. Maybe it's a good thing to think that a feature (or "story") is a commodity.

Maybe that's what it is first and foremost (a commodity) that we should always keep in mind, and where the most direct value to the customer is perceived. And maybe those other two things (integrity and simplicity) are the "secret sauce" that make all the difference in how we do it:

Maybe the integrity is the "first derivative" that gives us velocity AND continuity at a sustainable pace.

And maybe when we throw in simplicity, that is the second derivative of value, and it may be harder for the customer to see directly, but when we do it right, that gives us more than just continuity+sustainability, it also gives us the acceleration to adaptiveness and responsiveness and "agility" to overcome that cost-of-change curve.

...

[follow-up from Ron Jeffries]:

However, a bit further insight (or what I use in place of insight) for why it troubles me. A "commodity" is a kind of product with value, but it is a fungible one. A commodity is a product usually sold in bulk at a price per item or per carload. One potato is like every other potato. A story/feature, in an important sense, isn't like every other story/feature.

Thanks Ron for all the thoughtful feedback. You are spot-on of course. And that notion of a commodity as a bulk shipment or mass purchase of units definitely "kills" the notion of value I'm trying to get at.

I'm still at a loss for a word/term that I like better. Marketability perhaps? It's more syllables than I'd like, although there is a precedent set for it in the book Software By Numbers in its use of an "Incremental Funding Method" (IFM) with "Minimal, Marketable Features" (MMFs).

So to my readers that have read this far ... what is your take on all of this talk about commodity/marketability and "perishable value"? Are commodity, integrity, and simplicity each just different perspectives of form, fit, and function, where:

"commodity/marketability" would be the customer view

"integrity" would be the view of requirements analysts/engineers, V&V/QA, and CM

These two lifecycles models are very similar. The main difference is that the 'V' model makes a deliberate attempt to "engage" stakeholders located on the back-end of the 'V' during the corresponding front-end phase:

During Requirements/Analysis, system testers are engaged to not only review the requirements (which they will have to test against), but also to begin developing the tests.

During architectural and high-level design, integrators and integration testers are engaged to review the design and the interface control specs, as well as to begin developing plans and test-cases for integration and integration-testing

At this point, hopefully you get the main idea ... at a given phase where deliverables are produced, the folks who are responsible for validating conformance to those specs are engaged to review the result and to begin development of their V&V plans and artifacts.

When used in conjunction with Test-Driven Development (TDD), and especially with a lean focus on minimizing intermediate artifacts, the agile lifecycle in a very real sense makes the two sides of the 'V' converge to create almost a single line (instead of two lines forming a 'V'):

TDD attempts to use tests as the requirements themselves to the greatest extent possible

Emphasis on lean, readable/maintainable code often leads to a literate programming style (e.g., JavaDocs) and/or a verbose naming convention style such that detailed design and source code are one and the same.

Use of iterative development with short iterations makes the 'V' (re)start and then converge over and over again throughout the development of a release.

The result: using cross-lifecycle collaboration in combination with tests as requirements and self-documenting code as detailed design and writing tests before the code makes the ends of the 'V' model converge together so that each end practically collapses against the other in a thick, almost single line. Plus successive short iterations serve to increase the frequency of this trend.

The agile lifecycle tries to eliminate (or at least create a tesseract for) the distance between the symmetric points at each end of the V-model by making the stakeholders come together and collaborate on the same artifacts (rather than separate ones) while also working in many small vertical slices on a feature-by-feature (or story-by-story) basis. There are no separately opposing streams of workflow: just a single stream of work and workers that collaborate to deliver business value down this single stream as lean + agile as possible.

Among the differences between streams and project-oriented branches were that project-oriented branches were still only the changes that took place on that branch, whereas streams gave me a dynamically evolving "current configuration" of the entire item (not just the changes). And in many cases "streams" are first-class entities which can have other attributes as well.

Streams are, in a sense, giving a view of a codeline that is similar to a web portal. They are a "code portal" that pulls the right sets of elements and their versions into the "view" of the stream and eases the burden of configuration specification and selection by providing us this nice "portal."

So what might be next in the evolution of branches and branching after this notion of "code portal"?

Will it be in the area of distribution across multiple sites and teams?

Will it be in the area of coordination, collaboration and workflow?

Will it be in the area of increasing scale? What would a "stream of streams" look like?

Maybe it will be all three! Maybe a stream of streams is a composite stream where the parent stream gives a virtual view across several (possibly remotely distributed) streams and repositories, but via a dynamic reference (rather than a copy), so that the current configuration is a view of the combined current configuration of each constituent stream (somewhat reminiscent of how composite baselines work in ClearCase/UCM)?
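To make the "dynamic reference rather than a copy" idea a bit more tangible, here is a rough Python sketch of what I have in mind. The `Stream` and `CompositeStream` classes, their methods, and the path/version representation are all invented for illustration; this is not how ClearCase/UCM or any other tool actually models streams.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Stream:
    """A stream whose current configuration evolves with each commit."""

    name: str
    versions: Dict[str, int] = field(default_factory=dict)

    def commit(self, path: str) -> None:
        # Each commit creates the next version of the element on this stream.
        self.versions[path] = self.versions.get(path, 0) + 1

    def current_configuration(self) -> Dict[str, int]:
        return dict(self.versions)


@dataclass
class CompositeStream:
    """A "stream of streams" that holds references to its members, not copies."""

    name: str
    members: List = field(default_factory=list)  # Stream or CompositeStream

    def current_configuration(self) -> Dict[str, int]:
        # Computed on demand, so the view always reflects whatever each
        # constituent stream looks like at the moment of the call.
        combined: Dict[str, int] = {}
        for member in self.members:
            for path, version in member.current_configuration().items():
                combined[f"{member.name}/{path}"] = version
        return combined
```

Because the parent only asks its members for their configuration when it is queried, a commit on a constituent stream shows up in the composite view without any copy or re-baseline step. Since a `CompositeStream` can itself be a member, the same sketch nests to any depth.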

What do you think will be the next steps in the evolution of branching beyond "streams" and what do you think are the trends that will fuel the move in that direction?

Saturday, October 22, 2005

Here's something I've run into on agile and non-agile projects alike: the blurring of distinction between bugs and enhancement requests. To me a bug is erroneous operation of the software based on the customer's requirements. That's fine when both sides agree to what the requirements are. Sometimes a bug can also be caused by a misunderstanding of the requirements by the team, however, and yes I'll still call this a bug. Often, however, customers will dub "missing" functionality (which was never discussed initially) or "nice-to-have" features, shortcuts and so on as "bugs"....

When I have tried to make the distinction between bugs and enhancements clearer to the PO or customer, sometimes through a SM, the customer thinks we are nit-picking, or trying to "play the blame game", rather than properly categorize and identify their feedback. One approach is to keep trying to educate and convince them anyways (on a case by case basis, if necessary). Another approach is just to let them call anything they want a "bug". Of course this can screw up your metrics (incidence of bugs) - something we are interested in at my current job (i.e. reducing the rate of new bugs and fixing bugs in the backlog).

Any words from the wise out in the trenches on how to best approach this? Obviously, with unit testing and other XP practices there is a claim that bug rates will be low. But if anything can be declared a bug, it becomes more difficult to make management and the customer believe the claims you make about your software development process and practices. And when this happens, the typical response is to revert to "old ways" (heavy-handed, waterfall-type approaches with formal QA).

-- Stephen

I've actually had a lot of personal experience in this for the past several years. Here are some of the things I have learned...

1. DON'T ASSUME ALL DEFECTS ARE BUGS!

The term "bug" and the term "defect" don't always mean the same thing:

Bug tends to refer to something "wrong" in the code (either due to nonconformance with design or requirements).

Defect often means something that is "wrong" in any work-product (including the requirements).

Hence, many consider ALL of incorrect, inconsistent, incomplete, or unclear requirements to be "defects": if they believe a requirement is "missing" or incorrectly interpreted, it's still a "bug" in their eyes.

I've also seen some folks define "bug" as: anything that requires changing ONLY the code to make it work "as expected". If it requires a change to docs, they consider it a "change request" (and the issue of whether or not it is still a "defect" isn't really addressed).

If so, then be prepared to battle over the differences. Very often, the difference between them is just a matter of opinion, and the resolution will almost always boil down to a matter of which process (the bugfix process or the enhancement process) is most strongly desired for the particular issue, or else will become an SLA/contractual dispute. Then you can bid farewell to the validity of your defect metrics.

If your development process/practice is to treat "bugs" differently than "enhancements" (particularly if there is some contractual agreement/SLA on how soon/fast "bugs" are to be fixed and whether or not enhancements cost more $$$ but bugfixes are "free"), then definitions of what a bug/defect is will matter only to the extent outlined in the contract/SLA, and it will be in the customer's interest to regard any unmet expectation as a "bug".

If, on the other hand, you treat all customer reported "bugs" and "enhancements" sufficiently similarly, then you will find many of the previous battles you used to have over what is a "bug" and what isn't will go away, and won't be as big of an issue. And you can instead focus on getting appropriate prioritization and scheduling of all such issues using the same methods.

If the customer learns that the way to get the thing they want when they want it is a matter of prioritization by them, and if the "cost" for enhancements versus bugfixes is the same or else isn't an issue, then they will learn that in order to get what they want, they don't have to claim it's a bug, they just need to tell you how important it is to them with respect to everything else they have to prioritize for you.

3. IT'S ALL ABOUT SETTING AND MANAGING EXPECTATIONS!

None of the above (or any other) dickering over definitions is what really matters. What really matters is managing and meeting expectations. Sometimes business/organizational conditions mandate some contractual definition of defects versus enhancements and how each must be treated and their associated costs. If your project is under such conditions, then you may need to clearly define "bug" and "enhancement" and the expectations for each, as well as any agreed upon areas of "latitude".

Other times, we don't have to have such formal contractual definitions. And in such cases, maybe you can treat enhancements and defects/bugs the same way (as noted earlier above).

Lastly, and most important of all, never forget that ...

4. EVERYONE JUST WANTS TO FEEL HEARD, UNDERSTOOD, AND VALUED!

If you can truly listen empathically and non-defensively (which isn't always easy), connecting with their needs at an emotional as well as intellectual level, and demonstrate that it is important to you, then EVERYONE becomes a whole lot easier to work with and that makes everything a whole lot easier to do.

Then it's no longer about what's a bug or what's an enhancement; and not even a matter of treating bugs all that differently from enhancements ... it simply becomes a matter of hearing, heeding and attending to their needs in a win-win fashion.

I'm sure there are lots of other lessons learned. Those were the ones that stuck with me the most. I've become pretty good at the first two, and have become competent at the third. I still need a LOT of work on that fourth one!!!

One thing that occurs to me that might actually make traceability be easier for agile methods is that some agile methods work in extremely fine-grained functional increments. I'm talking about more than just iterations or features. I mean individually testable behaviors/requirements:

If one is following TDD, or its recent offshoot Behavior-Driven Development (BDD), then one starts developing a feature by taking the smallest possible requirement/behavior that can be tested, writing a test for it, then making the code pass the test, then refactoring, then going on to develop the next testable behavior etc., until the feature is done.

This means, with TDD/BDD, a single engineering task takes a single requirement through the entire lifecycle: specification (writing the test for the behavior), implementation (coding the behavior), verification (passing the test for the behavior), and design.
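As a small illustration of one such pass through the cycle, here is a sketch in Python using the standard `unittest` module. The behavior being developed (`next_label`, which bumps the last numeric field of a release label) and the label format are made up for the example. The tests appear first because they would be written first, and the function shown is the simplest code I could come up with that passes them.

```python
import unittest


class NextLabelTests(unittest.TestCase):
    """Written first: each test specifies one small, testable behavior."""

    def test_increments_last_numeric_field(self):
        self.assertEqual(next_label("REL_1.4"), "REL_1.5")

    def test_rolls_past_single_digits(self):
        self.assertEqual(next_label("REL_1.9"), "REL_1.10")


def next_label(label: str) -> str:
    """Written second: just enough code to make the tests above pass."""
    # Split on the last dot so only the final numeric field is changed.
    prefix, _, last_field = label.rpartition(".")
    return f"{prefix}.{int(last_field) + 1}"
```

Saved to a file, the tests can be run with `python -m unittest`. The requirement, its specification (the test), its implementation and its verification all live in one small unit of work, which is the property the traceability argument depends on.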

That doesn't happen with waterfall or V-model development lifecycles. With the waterfall and V models, I do much of the requirements up front. By the time I do design for a particular requirement it might be months later and many tasks and engineers later. Ditto for when the code for the requirement actually gets written.

So traceability for a single requirement thru to specs, design, code, and test seems much harder to establish and maintain if those things are all splintered and fragmented across many disjointed tasks and engineers over many weeks or months.

But if the same engineering task focused on taking just that one single requirement thru its full lifecycle, and if I am doing task-based development in my version control tool, then ...

The change-set that I commit to the repository at the end of my change-task represents all of that work across the entire lifecycle of the realization of just that one requirement, and the ID of that one task or requirement can be associated with the change-set as a result of the commit operation/event taking place.

And voila! I've automatically taken care of much of the traceability burden for that requirement!
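Here is a minimal Python sketch of that association, assuming the task or requirement ID is written into the commit message. The `REQ-7`/`TASK-12` ID format, the change-set identifiers and the `trace_links` function are hypothetical; a real tool would more likely record the link through its own commit hooks or task-tracking integration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed ID convention for this sketch: REQ-<number> or TASK-<number>.
TASK_ID = re.compile(r"\b(?:TASK|REQ)-\d+\b")


def trace_links(commits: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each task/requirement ID to the change-sets that mention it.

    `commits` is a sequence of (changeset_id, commit_message) pairs.
    """
    links: Dict[str, List[str]] = defaultdict(list)
    for changeset_id, message in commits:
        for task_id in TASK_ID.findall(message):
            links[task_id].append(changeset_id)
    return dict(links)


history = [
    ("r101", "REQ-7: test and code for login timeout"),
    ("r102", "tidy whitespace"),
    ("r103", "REQ-7 TASK-12: refactor timeout handling"),
]
print(trace_links(history))
```

A commit that names no task (like `r102` here) simply produces no link, which in practice is also a cheap way to spot work that bypassed task-based development.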

The IDE could easily know what kind of artifact I'm working on (requirement, design, code, test).

Operations in the IDE and the version-control tool would be able to broadcast "events" that know my current context (my task, my artifact type, my operation) and could automatically create a "traceability link" in the appropriate place.
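A bare-bones Python sketch of that broadcast idea follows. Everything in it (`ToolEvent`, `EventBus`, `TraceabilityLog`, and the choice to record links only on "commit" operations) is an assumption made for illustration; it is not the design of ToolTalk, SoftBench, or any published Event-Based Traceability implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ToolEvent:
    """The context a tool broadcasts: my task, artifact type, and operation."""

    task_id: str
    artifact_type: str  # e.g. "requirement", "design", "code", "test"
    artifact: str
    operation: str  # e.g. "edit", "commit"


class EventBus:
    """Delivers every published event to every subscribed handler."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[ToolEvent], None]] = []

    def subscribe(self, handler: Callable[[ToolEvent], None]) -> None:
        self._handlers.append(handler)

    def publish(self, event: ToolEvent) -> None:
        for handler in self._handlers:
            handler(event)


class TraceabilityLog:
    """Subscriber that turns commit events into traceability links."""

    def __init__(self) -> None:
        self.links: List[Tuple[str, str, str]] = []

    def record(self, event: ToolEvent) -> None:
        # Only commits create a link in this sketch; edits are ignored.
        if event.operation == "commit":
            self.links.append((event.task_id, event.artifact_type, event.artifact))
```

In use, the IDE and the version-control tool would each call `bus.publish(...)` as I work, and the log (subscribed with `bus.subscribe(log.record)`) would accumulate the links without my doing anything extra.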

I realize things like CASE tools and protocols like Sun's ToolTalk and HP's SoftBench tried to do this over a decade ago, but we didn't have agile methods quite so formalized then and weren't necessarily working in a TDD/TBD fashion. I think this is what Event-Based Traceability (EBT) is trying to help achieve.

Tuesday, October 11, 2005

I believe Extreme Programming (XP) and other Agile Methods are indeed a strong counter-reaction to some prevailing management and industry trends from around 1985-1995. [Note I said counter-reaction rather than over-reaction.]

I think the issue ultimately revolves around empowerment and control. During 1985-1995 two very significant things became very trendy and management and organizations bought into their ideas: The SEI Software Capability Maturity Model (CMM), and Computer-Aided Software Engineering.

During this same time, programming and design methods were all caught up in the hype of object-oriented programming+design, and iterative+incremental development.

Many a large organization (and small ones too) tried to latch-on to one or more of these things as a "silver bullet." Many misinterpreted and misimplemented CMM and CASE as a magic formula for creating successful software with plug-and-play replaceable developers/engineers:

Lots of process documentation was created

Lots of procedures and CASE tools were deployed with lots of constraints regarding what they may and may not do

and "compliance/conformance" to documented process was audited against.

Many felt that the importance of "the people factor" had been dismissed, and that creativity and innovation were stifled by such things. And many felt disempowered from being able to do their best work and do the things that they knew were required to be successful, because "big process" and "big tools" were getting in their way and being forced upon them.

(Some would liken this to the classic debate between Hamiltonian and Jeffersonian philosophies of "big government" and highly regulated versus "that government is best which governs least".)

I think this is the "crucible" in which Agile methods like XP were forged. They wanted to free themselves from the ball and chain of restrictive processes and disabling tools.

So of course, what do we do when the pendulum swings so far out of balance in a particular direction that it really makes us say "we're mad as h-ll and we're not gonna take it any more!" ??

Answer: we do what we always do, we react with so much countering force that instead of putting the pendulum back in the middle where it belongs and is "balanced", we kick it as far as we can in the other direction. And we keep kicking as hard as we can until we feel "empowered" and "in control of our own destiny" again.

Then we don't look back and see when the pendulum (or the industry) starts self-correcting about every 10 years or so and starts to swing back and bite us again :)

XP started around 1995 and this year marks its 10th anniversary. Agile methods were officially embraced by industry buzz somewhere around 2002, and for the last couple of years there has been some work on how to balance agility with large organizations and sophisticated technology.

Among the main things coming out of it that are generating a goodly dose of much deserved attention are:

testing and integration/building are getting emphasized much earlier in the lifecycle, and by development (not just testers and builders)

the "people factor" and teaming and communication are getting "equal time"

iterative development is being heavily emphasized up the management hierarchy - and not just iterative but HIGHLY iterative (e.g., weeks instead of months)

These are all good things!

There are some folks out there who never forgot them to begin with. They never treated CASE or CMM as a silver bullet and took a balanced approach from the start. And they didn't treat "agile" as yet another silver bullet either. And they have been quietly delivering successful systems without a lot of noise - and we didn't hear much about them because they weren't being noisy.

Unfortunately some other things may seem like they are "babies" being "thrown out with the bathwater". Agile puts so much emphasis on the development team and the project - that practitioners of some of the methods seem to do so at the expense of other important disciplines and roles across the organization (including, and perhaps even especially, SCM)

Saturday, October 08, 2005

We had a recent (and interesting) discussion on the scm-patterns YahooGroup about the notion of "value" and Frank Schophuizen got me thinking about what is the "value" associated with a configuration or a codeline: how does value increase or decrease when a configuration is "promoted" or when/if the codeline is branched/split?

Agile methods often talk about business value. They work on features in order of the most business-value. They eschew activities and artifacts that don't directly contribute to delivering business value. etc...

David Anderson, in several of his articles and blogs at agilemanagement.net, notes that the value of a feature (or other "piece" of functionality) is not dependent upon the cost to produce it, but upon what a customer is willing to pay for it. Therefore the value of a feature is perishable and depreciates over time:

The longer it takes to receive delivery of a feature, the less a customer may begin to value it.

If it doesn't get shipped in the appropriate market-window of opportunity, the value may be significantly lost.

If the lead-time to market for the feature is too long, then competitive advantage may be lost and your competitor may be able to offer it to them sooner than you can, resulting in possible price competition, loss of sale or business

So business value is depreciable; and the value of a feature is a perishable commodity.

Might there be certain aspects to business value that are not perishable? Might there be certain aspects that are of durable value? Is it only the functionality associated with the feature that is of perishable value? Might the associated "quality" be of more durable value?

I've seen the argument arise in Agile/XP forums about whether or not one should "commit" one's changes every time the code passes the tests, or if one should wait until after refactoring, or even until more functionality is implemented (to make it "worth" the time/effort to update/rebase, reconcile merge conflicts and then commit).

Granted, I can always use the Private Versions pattern to checkin my changes at any time (certainly any time they are correct+consistent) without also committing them to the codeline for the rest of the team to see and use. So, assuming that the issue is not merely having it secured in the repository (private versions), when is it appropriate to commit my changes to the codeline for the rest of the team to (re)use?

If refactoring is a "behavior preserving transformation" of the structure of the code, and if it improves the design and makes it "simpler", then is "good design" or "simplicity" something that adds durable value to the implementation of a running, tested feature? Kent Beck's initial criteria for "simple code" (and how to know when you are done refactoring your latest change) was described in an XPMagazine article by Ron Jeffries as the following, in order of importance:

it expresses every thought we intended it to convey about the program (i.e. reveals all our intent, and intends all that it reveals)

it minimizes the size and number of classes and methods

If I squint a little when I read thru the above, it almost looks like it's saying the same thing that writing-instructors and editors say about good writing! It should be: correct, consistent, complete, clear and concise!

I have often heard "correct, consistent and complete" used as a definition of product integrity. So maybe integrity is an aspect of durable value! And I have sometimes heard simplicity defined as "clear and concise" or "clear, concise and coherent/cohesive" (where "concise" would be interpreted as having very ruthlessly rooted out all unnecessary/extraneous or repeated verbiage and thoughts). So maybe simplicity is another aspect of durable value.

And maybe integrity is not enough, and simplicity is needed too! That could possibly explain why it might make more sense to wait until after a small change has been refactored (simplified) before committing it instead of waiting only until it is correct+consistent+complete.

Perhaps the question "when should I commit my changes?" might be answered by saying "whenever I can assure that I am adding more value than I might otherwise be subtracting by introducing a change into a 'stable' configuration/codeline!"

If my functionality isn't even working, then it's subtracting a lot of value, even if I did get it into the customer's hands sooner. It causes problems (and costs) for my organization and team to fix it, has less value to the customer if it doesn't work, and can damage the trust I've built (or am attempting to build) in my relationship with that customer

if my functionality is working, but the code isn't sufficiently simple, the resulting lack of clarity, presence of redundancy or unnecessary dependency can make it a lot harder (and more costly) for my teammates to add their changes on top of mine

if I wait too long, and/or don't decompose my features into small enough working, testable increments of change, then the business value of the functionality I am waiting to commit is depreciating!

Now I just have to figure out some easy and objective means of figuring out the "amount" of value I have added or subtracted :-)

So are "integrity" (correct + consistent + complete) and "simplicity" (clear + concise + coherent/cohesive) components of durable value? Is functionality the only form of perishable value?

What about "form, fit and function"? Are "form" and "fit" also components of perishable value? Am I onto something or just spinning around in circles?

The baseline identification principle said that I need to be able to identify what I have to be able to reproduce. The baseline immutability principle said that the definition of a baselined configuration needs to be timesafe: once baselined, the identified set of elements and versions associated with that baseline must always be the same set of elements and versions, no matter how that baseline evolves in the form of subsequent changes and their resulting configurations.

Maybe somewhere in between the baseline identification principle and the baseline immutability principle should be the single configuration principle:

The Single Configuration Principle would say that a baseline should correspond to one, and only one, configuration.

Of course the baseline itself might be an assembly of other baselined configurations, but then it still corresponds to the one configuration that represents that assembly of configurations. So the same baseline "identification" shouldn't be trying to represent multiple configurations; just one configuration.

What does that mean? It means don't try to make a tag or label serve "double-duty" for more than one configuration. This could have several ramifications:

maybe it implies that "floating" or "dynamic" configurations, that are merely "references", should have a separate identifier, even when they reference the same configuration as what was just labeled. So maybe identifiers like "LATEST" or "LAST_GOOD_BUILD" should be different from the one that identifies the current latest build-label (e.g., "PROD-BUILD-x.y.z-a.b")

maybe it might also imply that when we use a single label to capture a combination of component versions, that we really want true "composite" labeling support. This would literally let me define "PROD_V1.2" as "Component-One_V1.1" and "Component-Two_V1.0" without requiring the label to explicitly tag all the same elements already tagged by the component labels

maybe it implies something similar for the notion of a "composite current configuration" or even a "composite codeline" where a product-wide "virtual" codeline could be defined in terms of multiple component codelines
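Here is a rough sketch in Python of what composite labels and floating references might look like if each name is only ever allowed to stand for one configuration. The three tables, the file paths and the revision numbers are all made up for illustration (only the label names come from the examples above), and no real version control tool is being modeled.

```python
# Plain labels: each name tags an explicit set of elements and revisions.
PLAIN = {
    "Component-One_V1.1": {"one/a.c": "1.4", "one/b.c": "1.2"},
    "Component-Two_V1.0": {"two/x.c": "1.1"},
}

# Composite labels: defined purely in terms of other labels, without
# re-tagging the elements the component labels already tagged.
COMPOSITE = {
    "PROD_V1.2": ["Component-One_V1.1", "Component-Two_V1.0"],
}

# Floating references: separate identifiers that merely point at a label.
FLOATING = {
    "LATEST": "PROD_V1.2",
}


def resolve(name):
    """Expand any identifier into the single configuration it denotes."""
    if name in FLOATING:
        return resolve(FLOATING[name])
    if name in COMPOSITE:
        merged = {}
        for part in COMPOSITE[name]:
            for element, revision in resolve(part).items():
                # Two parts disagreeing about one element would mean the
                # composite no longer denotes exactly one configuration.
                if merged.get(element, revision) != revision:
                    raise ValueError(f"{name}: conflicting revisions of {element}")
                merged[element] = revision
        return merged
    return dict(PLAIN[name])


latest = resolve("LATEST")
```

When a newer build is labeled, only the FLOATING entry is repointed; "PROD_V1.2" itself keeps denoting the same configuration it always did.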

What do you think? Is the single configuration principle a "keeper" or not?

Saturday, September 24, 2005

Just a random synapse firing in my brain ... I remember back in my high school days being enthralled with physics and the latest grand-unified theories (GUTS), and how gravity was always the "odd ball" in trying to unify the four fundamental forces of nature into a single, simple, consistent and coherent theory:

Quantum mechanics could unify all but gravity. It was great, and incredibly accurate at explaining all the rich and myriad interactions of things at the molecular, atomic and subatomic levels.

But throw in celestial bodies and large distances, and the thing called "gravity" rears its ugly head and makes things complicated. In theory it's nowhere near as strong as the other forces, and yet any time you had to scale up to things large enough and far enough away to need a telescope instead of a microscope, it made everything fall apart.

Sometimes I think Agile "theory" and large projects and organizations are the same dichotomy.

The "Agile" stuff seems great in small teams and projects that can be highly collaborative and iterative over short (collocated) distances with small "lightweight" teams and processes.

But throw it into a large project or organization, and "gravity" sets in, adding weight and mass and friction to processes and communication, and yet necessarily so, in order to scale to a larger living system of systems of systems.

So we are left with quantum agility and organizational gravity and trying to reconcile the two. What's an Agile SCMer to do about all that?

Saturday, September 17, 2005

One of the things I spend a lot of time dealing with is integration between application lifecycle management tools and their corresponding process areas: requirement management, configuration management, test management, document management, content management, change management, defect management, etc.

So I deal with process+tool architecture integration for a user community of several thousand, and the requirements, version control, change-tracking, and test management tools almost always each have their own separate repositories. Occasionally the change-tracking and version-control are integrated, but the other two are still separate.

And then if there is a design modeling tool, it too often tries to be a "world unto itself" by being not merely a modeling environment but attempting to store each model or set of models as a "version archive" with its own checkin/checkout, which makes it that much more of a pain in the you-know-what to get it versioned and labeled/baselined together with the code, particularly if code-generation is involved and needs to be part of the build process.

And what really gets to me is that, other than the version control tool, the other tools for requirements and test management, and typically change management usually have little or no capability to deal with branching (much less merging). So heaven forbid one has to support multiple concurrent versions of more than just the code if you use one of the other tools.

The amount of additional effort for tool customization and configuration and synchronization and administration to make these other tools be able to deal with what is such a basic fundamental version-control capability is enormous (not to mention issues of architectural platforms and application server farms for a large user base). So much so that it makes me wonder sometimes if the benefit gained by using all these separate tools is worth the extra integration effort. What if I simply managed them all as text files in the version control system?

At least then I get my easy branching and merging back. Plus I can give them structure with XML (and then some), and could easily use something like Eclipse to create a nice convenient GUI for manipulating their contents in a palatable fashion.

And all the data and metadata would be in the same database (or at least one single "virtual" database). No more having to sync with logically related but physically disparate data in foreign repositories and dealing with platform integration issues, just one big (possibly virtual) repository for all my requirements, designs, code, tests, even change-requests, without all the performance overhead and data redundancy and synchronization issues.

It could all be plain structured text with XML and Eclipse letting each artifact-type retain its own "personality" without having to be a separate tool in order to do it.
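As a small illustration of how little machinery that would take, here is a sketch in Python of a requirement kept as a plain XML text file and read back with the standard library. The element and attribute names (requirement, trace, kind, ref) and the identifiers are a made-up schema, not any existing tool's format.

```python
import xml.etree.ElementTree as ET

# What a single requirement file in the repository might contain (hypothetical schema).
REQUIREMENT_XML = """\
<requirement id="REQ-12" status="approved">
  <title>Export report as PDF</title>
  <text>The user can export any report as a PDF file.</text>
  <trace kind="verified-by" ref="TEST-40"/>
  <trace kind="implemented-by" ref="src/report/export.py"/>
</requirement>
"""


def trace_links(xml_text):
    """Return (requirement id, link kind, target) for each trace element, in file order."""
    root = ET.fromstring(xml_text)
    return [(root.get("id"), trace.get("kind"), trace.get("ref"))
            for trace in root.findall("trace")]


links = trace_links(REQUIREMENT_XML)
```

Because the file is ordinary text, it branches, merges, diffs and gets labeled exactly like the code it traces to.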

Why can't someone make that tool? What is so blasted difficult about it!!!

I think the reason we don't have it is because we are used to disconnected development as "the rule" rather than as the exception. Companies that shell out the big bucks for all of those different tools usually have separate departments of people for each of requirements (systems/requirements engineers), design (software architects), source-code ("programmers"), test (testers), and change-management.

It's a waterfall-based way of organizing large projects and it seems to be the norm. So we make separate tools for each "discipline" to help each stay separate and disconnected, and those of us doing EA/EAI or full lifecycle management of software products have to deal with all the mess of wires and plumbing of integration and platforms and workflow.

and a collaborative knowledge/content management system like Confluence

and roll them together into a single integrated system with a single integrated repository.

Notice I didn't mention any specific tools for requirements-management or test-management. Not that I don't like any of the ones available, I do, but I think it's time for a change in how we do those things with such tools:

they basically allow storing structured data, often in a hierarchical fashion with traceability linkages, and a way of viewing and manipulating the objects as a structured collection, while being able to attach all sorts of metadata, event-triggers, and queries/reports

I think a great wiki + CMS like Confluence and Jira can do all that if integrated together; just add another "skin" or two to give a view of requirements and tests both individually and as collections (both annotated and plain).

The same database/repository could give me both individual and hierarchical collection-based views of my requirements, designs, code, tests and all their various "linkages." Plus linking things in the same database is a whole lot easier to automate, especially thru the same basic IDE framework like Eclipse.

the requirements "skin" gives me a structured view of the requirements, and collaborative editing of individual requirements and structured collections of them;

ditto for the test "skin";

and almost "ditto" for the "change-management" skin (but with admittedly more workflow involved)

the design tool gives me a logical (e.g., UML-based) view of the architecture

the IDE gives me a file/code/build-based view of my architecture

And once MS-Office comes out with the standard XML-based versions, then maybe it will be pretty trivial to do for documents too (and to integrate XML-based Word/Office/PPT "documents" with structured requirements and tests in a database)

Oh why oh why can't I have a tool like that! Pretty please can I have it?

I think that didn't work too well. I still think "promotion" corresponds to "release", but "reuse" corresponds to something else. I'm going to try translating "reuse" to "integration". If I integrate (e.g., merge) someone else's changes into my workspace, I am quite literally reusing their work. If I commit my own change to the codeline, then I am submitting my work for reuse by the rest of the team that is using the codeline (particularly the "tip" of the codeline) as the basis of their subsequent changes.

So if I equate "release" with "promotion", and "reuse" with "integration" I think the result is the following:

The Promotion-Integration Equivalency Principle -- The granule of integration is the granule of promotion. (So it's not just the change content, but also the context – the entire configuration – that we end up committing to the codeline/workstream.)

The Change Closure Principle -- Elements that must be changed together are promoted together (implies task-level commit).

The Change Promotion Principle -- Elements that must be integrated together are promoted together (implies doing workspace update prior to task-level commit)

These "work" for me much better than the previous translation attempt. Note that the "change closure principle" didn't change much from before - it was just clarified a bit to indicate the dependency between elements.

This also makes me think I've stumbled onto the proper meaning for translating the Interface Segregation Principle (ISP): ISP states "Make fine-grained interfaces that are client-specific." If "integration" is reuse, then each atom/granule of change is an interface or "container" of the smallest possible unit of reuse.

The smallest possible unit of logical change that I can "commit" that doesn't break the build/codeline would be a very specific, individually testable, piece of behavior. Granted, sometimes it might not be run-time behavior ... it could be build-time behavior, or behavior exhibited at some other binding time.

This would yield the following translation of the ISP into the version-control domain:

I'm not thrilled about the name (please feel free to suggest a better one -- for example ... how about "segmentation" instead of "separation"?) but I think the above translation "works" quite well, and also speaks to "right-sizing" the amount of change that is committed to the codeline as an individual "transaction" of change. The way it's worded seems like it's talking exclusively about "code", but I think it really applies to more than just code, so long as we aren't constraining ourselves to execution-time "behavior."

Let me know what you think about these 4 additions to the family of SCM principles!

I frequently save intermediate drafts of my blog entries before I publish them. I had been working on my most recent draft for a couple hours. I'd been finalizing many of the sentences and paragraphs, making sure they flowed, checking the word usage, spellchecking, adding and verifying links, and then ... when I was finally ready to publish, I hit the publish button on the blogger compose window, and it asked me to login again. When I did, my final edits were GONE! I'd just lost two hours' worth of work.

My first thought was ARRRRRRRGGGGGHHHHH! My next thought was "no freakin' WAY did that just happen to ME!" Then much profanity ensued (at least in my own silent frustration) and I tried my darndest to look thru any and all temp files and saved files on my system and on blogger.com, all for naught. I had indeed fallen victim to one of the most basic things that CM is supposed to help me prevent. How infuriating! How frustrating! How embarrassing. I was most upset not about the lost text, but about the lost time!

I figure there must be a lesson in there somewhere to pass along. Ostensibly, the most obvious lesson would be to use the Private Versions pattern as outlined in my book. The thing is ... I had been doing just that! It was in the very act of saving my in-progress draft (before publishing it) that my changes were lost.

What I could (and possibly should) have done instead was not use blogger's composer to compose my drafts. I could have done it locally instead, on my own machine (and my own spellchecker). And perhaps I will do that a bit more from now on. Still, it's pretty convenient to compose it with blogger because:

I get rapid feedback as to what it will actually look like, and ...

I can access it from any machine (not just the one I use late at night)

I later realized why it happened. I was trying to do two things at once:

In one window I was composing my blog entry.

In another browser window I was visiting webpages I wanted to hyperlink to from my entry and verifying the links.

Okay - so there's nothing wrong with that. I mean I was doing two things at the same time, but I wasn't really trying to multi-task because I was still trying to work on my blog-entry.

The real culprit wasn't that I had two windows open at the same time, it was that one of the webpages I wanted to hyperlink to was also a blogger.com hosted blog-entry. And since I was posing a question in my entry that referred to this one, I also wanted to create a comment in the referred-to entry that asked the question and referenced back to my own blog.

Posting that comment caused me to have to enter my blogger id and password, and that essentially forced a new login - which made it look like my current login (where I was composing my entry) either ended, or had something unusual going on that warranted blogger wanting me to re-authenticate myself. And when it did, I lost my changes! OUCH!

Actually, I hadn't even posted the comment - I had only previewed it (saving it as a draft). Anyway - I was too upset (and it was too late at night) to try and recreate my changes then. So I waited another day before doing it. I have to say I'm not as happy with the result. I had really painstakingly satisfied myself with my wording and phrasing before I lost my changes. I wasn't as thorough the second time around because I wanted to be done with it!

So what was my big mistake? I was using private versions, and I wasn't trying to multi-task. I was in some sense trying to simultaneously perform "commits" of two different things at the same time, but they were to different "sections" of the same repository, so that really shouldn't have been such a terrible thing.

My big mistake wasn't so much a lack of good CM as it was a lack of good "agility": I let too much time lapse in between saving my drafts. I wasn't working in small enough batch-sizes (increments/iterations)!

Granted, I don't want to interrupt my flow of thought mid-sentence or mid-paragraph to do a commit. But certainly every time I was about to visit and verify another hyperlink in my other browser window, I should have at least saved my current draft before doing so. And I probably should have made sure I did so at least every 15-20 minutes. (You can be darn sure that's what I did this time around :-)

This sort of relates to how frequently someone should commit their changes in a version control system. Some of the SCM principles that I haven't described yet will relate to this. Uncle Bob's Principles of Object-Oriented Design have a subset that are about "package cohesion" and granularity:

REP: The Release Reuse Equivalency Principle -- The granule of reuse is the granule of release.

CCP: The Common Closure Principle -- Classes that change together are packaged together.

CRP: The Common Reuse Principle -- Classes that are used together are packaged together.

In the context of version control, these "packages of classes" would probably correspond to "packages of changes" that make up a single logical "change transaction" or "commit" operation. If that is a valid analogy, then I need to decide what "reuse" and "release" mean in this context:

I think "release" would mean to "promote" or "commit" my changes so they are visible to others using the same codeline.

I think "reuse" would mean ... hmmn that's a tough one! It could be many things. I think that if a change is to be reusable, it must be testable/tested. Other things come to mind too, but that's the first one that sticks.

So let's see what happens if I equate "release" with "commit", equate "reuse" with "test" and see if the result is coherent and valid. This would give me the following:

The Commit/Test Equivalency Principle -- The granule of test is the granule of commit.

The Change Closure Principle -- Files that change together are committed together.

The Test Closure Principle -- Files that are tested together are committed together (including the tests).

Comments? Thoughts? What do these mean to you? Does it mean anything more than using a task-level commit rather than individual file checkin? Should these always "hold true" in your experience? When shouldn't they? (and why?)

Oh - and feel free to suggest better names if you don't like the ones I used. I'm not going to supply abbreviations for these, or name any blog-entries after them, just yet because I'm not yet certain if they are even valid.

The OCP means I should have a way of being able to extend a thing without changing the thing itself. Instead I should be able to create some new "thing" of my own that reuses the existing thing and somehow combines that with just my additions, resulting in an operational "extension" of the original thing. The OCP is the basis for letting me reuse rather than reinvent when I need to create something that is "like" an existing thing but which still requires some additional stuff.

If applied for baselined configurations (a.k.a. baselines) the OCP would read "A baseline should be open for extension but closed for modification." That means if I want to create a "new" configuration that extends the previously baselined configuration, I should do so by creating a new configuration that is the baseline PLUS my changes. The result is not a "changed" baseline - the baselined configuration stays the same as it was before my change. We don't actually ever "change" a baseline. What we do is request/apply one or more changes against/to a baseline; and the result is a new configuration, possibly resulting in a new baseline.

According to the Baseline Immutability Principle ...

If a baseline is to be reproducible, and if it needs to be identifiable, then the name that identifies the baseline with its corresponding configuration must always refer to the exact same configuration: the one that was released/baselined.

For example, suppose I have release 1.2 of my product and I apply a label/tag of "REL-1.2" to everything that was used to make 1.2 (not just the code, but ALL of it: requirements, designs, tests, make/ANT files, etc.). Suppose that version 1.2.3.4 of element FUBAR was one of the file revisions that was labeled. Now suppose that during the following month, "REL-1.2" is moved/reapplied to version 1.2.3.5 of FUBAR.

In this example, I have just violated the baseline immutability principle. If a customer needs me to be able to reproduce Release 1.2, and if Release 1.2 contained v1.2.3.4 of FUBAR, then if I use "REL-1.2" to recreate the state of the codebase for Release 1.2, I just got the wrong result, because the version of FUBAR in Release 1.2 is different from the version that is tagged with the "REL-1.2" label.

Notice that I am not saying that we can't make changes against a baseline. We most certainly can. And the result is a new configuration!

When we make a change to a baseline, we aren't really changing the configuration that was baselined and then trying to use the same name for the result. Our changed result is a new configuration that took the current baseline and added our changes to it. And if we chose to name this new configuration, we give it a new name (one that is different from the name of any previously baselined configuration).
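A minimal sketch of that rule in Python, assuming a toy in-memory registry that maps each baseline name to its set of elements and revisions. BaselineRegistry and its methods are hypothetical names invented for this example ("REL-1.3" is too); a real version control tool would enforce this with locked labels or similar.

```python
from types import MappingProxyType


class BaselineRegistry:
    """Maps each baseline name to exactly one configuration, forever."""

    def __init__(self):
        self._baselines = {}

    def baseline(self, name, configuration):
        # Closed for modification: a name that is already taken cannot be
        # moved or reapplied to a different set of revisions.
        if name in self._baselines:
            raise ValueError(f"{name} already identifies a baselined configuration")
        # A read-only view over a private copy, so callers cannot edit it either.
        self._baselines[name] = MappingProxyType(dict(configuration))

    def configuration(self, name):
        return self._baselines[name]

    def extend(self, base_name, new_name, changes):
        # Open for extension: baseline PLUS changes, recorded under a NEW name.
        new_configuration = dict(self._baselines[base_name])
        new_configuration.update(changes)
        self.baseline(new_name, new_configuration)


registry = BaselineRegistry()
registry.baseline("REL-1.2", {"FUBAR": "1.2.3.4"})
registry.extend("REL-1.2", "REL-1.3", {"FUBAR": "1.2.3.5"})
```

After this runs, "REL-1.2" still yields version 1.2.3.4 of FUBAR, and trying to call baseline("REL-1.2", ...) a second time raises an error instead of silently moving the label.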

So a baseline name and the configuration it references are married: once the configuration is baselined, that name must forever after be faithfully monogamous to that configuration for better or for worse, for richer or for poorer, in sickness and in health for as long as they both shall live.

Always and forever? What about a divorce, or an annulment?

An "annulment" in this case is when I didn't get it right the first time. Either I "blessed" a configuration as "baselined" that didn't really meet the criteria to be called a "baseline." Or else I incorrectly identified the corresponding configuration: I might have labeled the wrong version of a file, or I forgot to label some file (e.g., people often forget to label their makefiles), or I labeled something I shouldn't have.

Correcting a baseline's labeled-set so that it accurately identifies ("tags") the baselined configuration isn't really changing the baseline; it's merely correcting the identification of it (because it was wrong up until then).

What about a "divorce"? We all know that a divorce can be quite expensive, and require making payments for a long time thereafter. Retiring (and trying to reuse) a baseline name can have significant business impact. Retiring the baseline often means no longer providing support for that version of the product. Trying to then reuse the same baseline name of the same product for a new configuration can create lots of costly confusion and can even be downright misleading.

Note that the term "a baseline" should not be confused with the term "the baseline":

The term "the baseline" really means the latest/current baseline. It is a reference!

This means that "the baseline" is really just shorthand for "the latest baseline." And when we "change the baseline", we are changing the designation of which baseline is considered "latest": we are changing the reference named "latest baseline" to point to a newer configuration.

So The Baseline Immutability Principle states that once a configuration is baselined, the identification of the baseline name with its corresponding configuration is immutable: The set of elements (e.g., files and revisions) referenced by the baseline name must always be the same set. And that set must always correspond to the set that was used to produce the version of the product that was baselined.

Sunday, August 21, 2005

Yesterday (actually just a few hours ago) was my 40th birthday. I had a really nice celebration with my wife and kids at a picnic in the park. I really don't feel like I'm 40. My body thinks I am 50 - at least that's how it seems to be acting. My mind still isn't used to the fact that I'm now more than just a little bit older than all those leading men and leading ladies on TV and movies. (Guess I can no longer identify them as part of my historical "baseline" :-)

If the ability to reproduce a baseline is fundamental to SCM, then it stands to reason that the ability to identify a baseline that I must be able to reproduce should also be pretty fundamental. If I have to be able to "show it", then I must first be able to "know it." If I can't uniquely identify a baseline, then it's pretty hard to reproduce it, because I can't be sure what I'm trying to reproduce.

So the baseline reproducibility principle gives rise to The Baseline Identification Principle: a baseline must be identified by a unique name that can be used to derive all the constituent elements of the baseline. In other words, we have to have a name, and a way of associating that name with all the objects (e.g., files) and their revisions that participate in the baseline.

How do we identify a baseline? By defining a name (or a naming system) to use, and using that name to reference the set of elements that were used to build/create the baselined version of the product.

A "label" or "tag" is one common way that a version control tool allows us to identify the sources of a baseline. This lets us associate a name with a specific set of repository elements and their corresponding revisions. Or it lets us associate a name with an existing configuration or event from which the set of elements and versions may be derived.
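The second kind of association (a name tied to an event, from which the elements are derived) can be sketched roughly like this. Everything here is hypothetical: the history table, the baseline name, and the rule "take the newest revision checked in at or before the event time" are assumptions for illustration, not how any specific tool resolves a label.

```python
from datetime import datetime

# Hypothetical revision history: path -> [(revision, check-in time)], oldest first.
HISTORY = {
    "src/main.c": [
        ("1.1", datetime(2005, 6, 1)),
        ("1.2", datetime(2005, 7, 15)),
        ("1.3", datetime(2005, 8, 20)),
    ],
    "docs/spec.txt": [
        ("1.1", datetime(2005, 6, 10)),
        ("1.2", datetime(2005, 8, 1)),
    ],
    "src/new.c": [
        ("1.1", datetime(2005, 8, 18)),
    ],
}

# Hypothetical baseline name bound to an event (a point in time), not to a file list.
BASELINE_EVENTS = {"BUILD_2005_08_15": datetime(2005, 8, 15)}


def derive_configuration(name):
    """Derive path -> revision for a baseline from the event its name refers to."""
    cutoff = BASELINE_EVENTS[name]
    configuration = {}
    for path, revisions in HISTORY.items():
        # Keep only revisions that existed at the time of the event.
        eligible = [rev for rev, when in revisions if when <= cutoff]
        if eligible:
            configuration[path] = eligible[-1]  # newest eligible revision
    return configuration


config = derive_configuration("BUILD_2005_08_15")
```

With this sample data, `config` selects revision 1.2 of both `src/main.c` and `docs/spec.txt`, and leaves out `src/new.c` because it did not exist yet at the time of the event. The name is still unique and still derives the full set of elements; it just gets there indirectly.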

Sometimes tagging all the "essential" files and revisions in the repository is sufficient. Sometimes I need more information. I can always take any files or information that weren't previously in the version control repository, and put them in the repository:

I can put additional information in a text file and check in the file

I can export a database or binary object into some appropriate format (e.g., XML, or other formatted text)
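As one possible illustration of the export option, here is a rough sketch that dumps a small database to plain SQL text, which could then be checked in and tagged like any other file. The use of SQLite, the `settings` table, and its contents are all assumptions made up for this example; the same idea would apply to an XML export.

```python
import sqlite3


def export_database_as_text(connection):
    """Return the database contents as SQL text that can be checked in."""
    # iterdump() yields the SQL statements needed to recreate the database.
    return "\n".join(connection.iterdump()) + "\n"


# Hypothetical configuration database that ships with the product.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE settings (key TEXT, value TEXT)")
source.executemany(
    "INSERT INTO settings VALUES (?, ?)",
    [("timeout", "30"), ("mode", "strict")],
)
source.commit()

dump_text = export_database_as_text(source)

# Round trip: rebuild a database from the text to confirm nothing was lost.
restored = sqlite3.connect(":memory:")
restored.executescript(dump_text)
rows = restored.execute("SELECT key, value FROM settings ORDER BY key").fetchall()
```

The point of the round trip is that the text form is only useful as part of a baseline if the original object can be rebuilt from it.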

If you currently have to label or tag more than just source-code and manually created text-files, then tell me about the other kinds of things you check in and tag, and what special things you do to ensure they are identified as part of a baseline.

Monday, August 15, 2005

Getting back to my earlier topic of The Principles of SCM, I think probably the first and most fundamental principle would be the requirement to be able to reproduce any baselined/released version of the software.

I'll call this The Baseline Reproducibility Principle: a baseline must be reproducible. We must be able to reproduce the "configuration" and content of all the elements that are necessary to reproduce a "released" version of the product.

By "released" I really mean "baselined" - it doesn't have to be a release to a customer. It could be a hand-off to any other stakeholder outside of development (like a test group, or a CM group, or QA, etc.). There is some basic vocabulary we need, like the terms "baseline" and "configuration." Damon Poole has started a vocabulary/glossary for SCM. Damon defines configuration but doesn't yet define a baseline.

A baseline is really shorthand for a "baselined configuration." And a baselined configuration is basically "a configuration with an attitude!" The fact that it's been "baselined" makes it special, and more important than other configurations that aren't baselined. We baseline a configuration when we need to promote/release it to another team/organization. By "baselining" it, we are saying it has achieved some consensually agreed upon level of "blessedness" regarding what we said it would contain and do, and what it actually contains and does.

Why do we need to be able to reproduce a baselined version of the product we produce and deliver? For several reasons:

Sometimes we want to be able to reproduce a reported problem. It helps to be able to reproduce the exact versions of the source code that made up the version of the product that the customer is using.

In general, when we hand off a version of the product to anyone that may report problems or request enhancements, it is useful to be able to reproduce the versions of the files that make up that version of the system to verify or confirm their observations and expectations.

When a "fix" is needed, customers are not always ready/willing to deploy our latest version (containing new functionality plus the fix). Even if they are, sometimes our business is not - it wants to "give" them the fix, but make more money on any new functionality. So we must provide a "patch" to their existing version.

When a baseline is a version of the product, it includes the specs and the executable software. Configuration auditing requires us to know the differences between the current product+specs versus their actual+planned functionality at the time that the product was released to the customer.

Those are just a few reasons. There are many more I'm sure.

What does it mean to reproduce a baseline? At the very least it means being able to reproduce the exact set of files/objects and their corresponding versions that were used to produce/generate the delivered version of the product. (That includes the specs that may be audited against, as well as the code).

Sometimes being able to reproduce the source files for the code+docs (and build scripts) is enough. Often we need to be able to do more than that. Sometimes it may be necessary to reproduce one or more of the following as well:

The version of the compilers/linkers or other tools used to create that version of the product

The version of any third-party libraries, code/interfaces/headers used to build the product

Any other "significant" aspect of the computing environment/network utilized during the creation of the delivered version of the product
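One way to capture items like these is a small text manifest that gets checked in and tagged along with the sources. The sketch below is just that, a sketch: the function names, the section names, and every tool and library version shown are hypothetical values chosen for illustration.

```python
import json


def render_manifest(tools, libraries, environment):
    """Render build-environment details as stable, diff-friendly text."""
    manifest = {"tools": tools, "libraries": libraries, "environment": environment}
    # Sorted keys keep the text identical whenever the content is identical.
    return json.dumps(manifest, indent=2, sort_keys=True) + "\n"


def manifest_differences(recorded_text, current_text):
    """List (section, item, recorded, current) for every entry that differs."""
    recorded = json.loads(recorded_text)
    current = json.loads(current_text)
    differences = []
    for section in sorted(set(recorded) | set(current)):
        old = recorded.get(section, {})
        new = current.get(section, {})
        for item in sorted(set(old) | set(new)):
            if old.get(item) != new.get(item):
                differences.append((section, item, old.get(item), new.get(item)))
    return differences


# Hypothetical values recorded when the baseline was created...
recorded = render_manifest(
    tools={"gcc": "3.4.4", "make": "3.80"},
    libraries={"zlib": "1.2.3"},
    environment={"os": "linux"},
)
# ...and the environment someone is trying to rebuild it in today.
current = render_manifest(
    tools={"gcc": "4.0.1", "make": "3.80"},
    libraries={"zlib": "1.2.3"},
    environment={"os": "linux"},
)
drift = manifest_differences(recorded, current)
```

Here `drift` reports a single difference: the compiler version recorded with the baseline is not the one in the current environment. How much of the environment belongs in such a manifest is exactly the judgment call about what is essential to reproduce.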

It is all too easy to spend more effort than necessary ensuring the reproducibility of more than is absolutely essential. What is essential to reproduce may depend upon many business and technical factors (including some possible contractual factors regarding deployment/upgrade, operational usage and support).

The ability to reproduce a baseline is so basic to SCM; I can't believe it hasn't been a "named" principle before. I know others have certainly written about it as a principle, I'm just not recalling if any of them gave the principle a name.

I think names are powerful things. Part of what makes software patterns so powerful is that they give a name to an important and useful solution to a recurring problem in a particular context. The pattern name becomes an element of the vocabulary of subsequent discussion on the subject. So I can use the terms "Private Workspace" or "Task Branch" in an SCM-related conversation instead of having to describe what they are over and over again.

This is why I'd like to develop a set of named principles for SCM. I think lots of folks have documented SCM principles, but didn't give them names. And they might "stick" better if we gave them names. If you know of any examples of SCM principles that are already well known and have a name, please let me know! (Please include a reference or citation if possible)