A tool that manages and tracks different versions of software or other content is referred to generically as a version control system (VCS), a source code manager (SCM), a revision control system (RCS), and several other permutations of the words “revision,” “version,” “code,” “content,” “control,” “management,” and “system.” Although the authors and users of each tool might debate esoterics, each system addresses the same issue: develop and maintain a repository of content, provide access to historical editions of each datum, and record all changes in a log. In this book, the term version control system (VCS) is used to refer generically to any form of revision control system.
This book covers Git, a particularly powerful, flexible, and low-overhead version control tool that makes collaborative development a pleasure.

…

Version Control with Git
Jon Loeliger
Matthew McCullough
Published by O’Reilly Media
Beijing ⋅ Cambridge ⋅ Farnham ⋅ Köln ⋅ Sebastopol ⋅ Tokyo
Preface
Audience
Although some familiarity with revision control systems will be good background material, a reader who is not familiar with any other system will still be able to learn enough about basic Git operations to be productive in a short while. More advanced readers should be able to gain insight into some of Git’s internal design and thus master some of its more powerful techniques.
The main intended audience of this book should be familiar and comfortable with the Unix shell, basic shell commands, and general programming concepts.
Assumed Framework
Almost all examples and discussions in this book assume the reader has a Unix-like system with a command-line interface.

…

In order to support the sheer volume of update operations that would be made on the Linux kernel alone, he knew that both individual update operations and network transfer operations would have to be very fast. To save space and thus transfer time, compression and “delta” techniques would be needed. Using a distributed model instead of a centralized model also ensured that network latency would not hinder daily development.
Maintain Integrity and Trust
Because Git is a distributed revision control system, it is vital to obtain absolute assurance that data integrity is maintained and is not somehow being altered. How do you know the data hasn’t been altered in transition from one developer to the next? Or from one repository to the next? Or, for that matter, that the data in a Git repository is even what it purports to be?
Git uses a common cryptographic hash function, the Secure Hash Algorithm (SHA-1), to name and identify objects within its database.
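As a sketch of how that naming works: an object ID is the SHA-1 of a short header ("blob", the content length in bytes, a NUL byte) followed by the content itself, which can be reproduced with ordinary shell tools:

```shell
# Reproduce Git's object ID for the 6-byte blob "hello\n":
# the hash covers the header "blob 6\0" plus the content.
printf 'blob 6\0hello\n' | sha1sum
# → ce013625030ba8dba906f756967f9e9ca394464a  -
```

Because every repository computes the same ID from the same bytes, any alteration of the content in transit changes the name, which is how Git detects tampering.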

The analysis system has a base of formulas to process. The alerting system needs to know who to alert, how, and who to escalate to. The visualization system needs to know which graphs to generate and how to do so. The storage system needs to know how to store and access the data.
These configurations should be treated like any other software source code: kept under revision control, tested using both unit tests and system tests, and so on. Revision control tracks changes of a file over time, enabling one to see what a file looked like at any point in its history. A unit test framework would take as input time-series data for one or more metrics and output whether the alert would trigger. This permits one to validate alert formulas.
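A minimal sketch of such a unit test, assuming a hypothetical alert formula that triggers when the mean of the samples exceeds a threshold:

```shell
# Hypothetical alert formula: trigger when the mean of the time-series
# samples exceeds a threshold. A unit test feeds it synthetic data and
# asserts on the result, without touching a live monitoring system.
check_alert() {
  # usage: check_alert THRESHOLD "v1 v2 ... vN"
  echo "$2" | awk -v t="$1" '{
    for (i = 1; i <= NF; i++) s += $i
    print ((s / NF > t) ? "ALERT" : "OK")
  }'
}

check_alert 80 "10 20 30"      # mean 20, prints OK
check_alert 80 "90 95 100"     # mean 95, prints ALERT
```

The formula and its threshold live in a configuration file under revision control, so the test doubles as a regression check whenever the formula is edited.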
In distributed monitoring systems, each component may be separated out and perhaps replicated or sharded.

…

• Managed Configuration and Automation: The configuration files of all applications that are required for the service are kept in source code control and are subject to the same change management as the rest of the code. The same is true for all automation scripts.
• Infrastructure as Code: With a software-defined datacenter (i.e., virtual machines), you can keep a description of the entire infrastructure as code that can be maintained under revision control. Infrastructure as code is further described in Section 10.6.
• Automated Provisioning and Deployment: Every step of the deployment process is automated and/or scripted so that one can trigger a build that will go all the way through self-test to a deployment, or can trigger a deployment via a separate build command.
• Artifact-Scripted Database Changes: Rather than manual manipulation of database schema, changes to databases are also treated as code.

…

This includes machines with an operating system loaded and properly configured, the software packages installed and configured, plus network, storage, and other resources available. While infrastructure can be set up manually, it is best to automate that process. Virtualization enables this kind of automation because virtual machines can be manipulated via software, as can software-defined networks. The more we treat infrastructure as code, the more we can benefit from software development techniques such as revision control and testing. This automation should be treated like any other software product and put through the same service delivery flow as any application.
When service delivery is done right, it provides confidence, speed, and continuous improvement. Building, testing, and deployment of both an application and virtual infrastructure can be done in a completely automated fashion, which is streamlined, consistent, and efficient.

[URL 32] Perl Power Tools
⇒ www.perl.com/pub/language/ppt/
A project to reimplement the classic Unix command set in Perl, making the commands available on all platforms that support Perl (and that's a lot of platforms).
Source Code Control Tools
[URL 33] RCS—Revision Control System
⇒ www.cs.purdue.edu/homes/trinkle/RCS/
GNU source code control system for Unix and Windows NT.
[URL 34] CVS—Concurrent Versions System
⇒ www.cvshome.com
Freely available source code control system for Unix and Windows NT. Extends RCS by supporting a client-server model and concurrent access to files.
[URL 35] Aegis Transaction-Based Configuration Management
⇒ http://www.canb.auug.org.au/~millerp/aegis.html
A process-oriented revision control tool that imposes project standards (such as verifying that checked-in code passes tests).
[URL 36] ClearCase
⇒ www.rational.com
Version control, workspace and build management, process control.

This is not a trivial task. For example, every programmer needs to use a revision control system to track changes and easily branch and merge versions of code. The best-regarded revision control system today is Git, created by Linus Torvalds (and named, incidentally, after his famous cantankerousness).21 Git’s interface is command-line driven and famously UNIX-y and complex, and for the newbie its inner workings are mysterious. In response to a blog post titled “Git Is Simpler Than You Think,” an irritated Reddit commenter remarked, “Yes, also a nuclear submarine is simpler than you think … once you learn how it works.”22 I myself made three separate attempts to learn how Git worked, gave up, was frustrated enough by other revision control systems to return, and finally had to read a 265-page book to acquire enough competence to use the thing.

In fact, the most reliable method
for converting a CVS repository to Git is to convert it to a Subversion
repository first, using the cvs2svn command.2
Git ships with its own tool for importing CVS repositories, called git cvsimport,
but because the cvs2svn route to a Git repository is much more stable,
we'll focus on it.
To start with, you need to get revision control system (RCS) files from
your CVS repository. Once you have those, you need to convert them to
an SVN dump file using the cvs2svn tool:
prompt> cd /path/to/cvs-rcs-files
prompt> cvs2svn --dumpfile=svndump
Creating the dump file allows you to filter out any unnecessary data
by using tools such as svndumpfilter.3 With your new svndump prepared,
now you need to create a Subversion repository to import it into.

Appendix A
A typical IDE, such as MS Visual Studio®, basically consists
of the following things:
• An editor with automatic indentation, syntax coloring, and autocompletion
• An integrated compiler that makes it possible to jump directly to compile errors in the code
• An integrated debugger that makes it possible to step through the code
• A file explorer such that you can look through the files to add to the project
• A project browser to look through the files included in the project
• A tag browser to look through the tags (definitions, functions, methods, classes)
• An easy way to jump between files, definitions, and so on
• Maybe integration with some version / revision control system
So, are all of these things possible in Vim at all? Let's now go through the
items, one by one, and see how we can get that functionality in Vim.
The first item is obvious: Vim is an editor, so it does just that.
The second item in the list is the integration with a compiler. Vim is often used for
programming, so it is built with support for compiler integration. For most common
programming languages, the settings for compiler integration in Vim are already set.

…

You can find the script and information about
how to install it at http://vim-taglist.sourceforge.net.
When it comes to moving around in the Vim editor window itself, Vim already has
the shortcuts gf and gd, which take you to the file or declaration of the tag under
the cursor. This makes it very fast to jump around in the files.
Finally, there is the integration with version / revision control systems such as CVS,
SVN, and Perforce. As with all the other functionality that you need to construct a
Vim IDE, this integration is of course also available via scripts. I would recommend
the following scripts for the mentioned systems:
• CVS and SVN, which are located at http://vim.sourceforge.net/scripts/script.php?script_id=90
• Perforce, which is located at http://vim.sourceforge.net/scripts/script.php?

For example:
<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>properties-maven-plugin</artifactId>
      <version>1.0-alpha-2</version>
      <executions>
        <execution>
          <phase>initialize</phase>
          <goals>
            <goal>read-project-properties</goal>
          </goals>
          <configuration>
            <files>
              <file>${fullpath.to.properties}</file>
            </files>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
If you use a relative path to the properties file, then the file can reside in your revision control system. If you use a full path, then the property file can be stored on the Jenkins server. The second option is preferable if sensitive passwords, such as for database connections, are included.
Jenkins has the ability to ask for variables when you run a Job manually. This is called a Parameterized build (https://wiki.jenkins-ci.org/display/JENKINS/Parameterized+Build). At build time, you can choose your property files by selecting from a choice of property file locations.

…

The test phase occurs before the generation of a JAR file, and thus avoids creating a warning about an empty JAR. As the name suggests, this phase is useful for testing before packaging.
The second example script highlights the strength of combining Groovy with Ant. The SCP task (http://Ant.apache.org/manual/Tasks/scp.html) is run a large number of times across many servers. The script first asks for the USERNAME and password, avoiding storage on your file system or your revision control system. The Groovy script expects you to inject the variables host: full_path_to_location and myfile.
Notice the similarity between the Ant SCP task and the way it is expressed in the pom_ant_contrib.xml file.
There's more...
Creating custom property files on the fly allows you to pass on information from one Jenkins Job to another.
You can create property files through AntBuilder using the echo task.

To a developer, it is clear that a defect lives in the source code
Tools such as Bugs Everywhere§ operate on the purist concept that, since bugs ultimately reside
in the source code, defect tracking should live alongside the code and use the same revision
control system (e.g., CVS, SVN, Mercurial). This does have significant advantages, especially
for small systems where the entire code base can live within one revision control system. When
a defect is fixed, both the patched source code and updated defect report are checked into the
revision control system, and the correspondence between the defect state and the defect report
state is correctly and automatically propagated to branches of the source tree without
requiring the developer or software QA person to maintain an ad-hoc set of cross references
in the source tree and the state in the defect database.

If you're working on a program with other people and want to see the changes they have made since the last time you worked on the code, a comparator tool such as Diff will make a comparison of the current version with the last version of the code you worked on and show the differences. If you discover a new defect that you don't remember encountering in an older version of a program, rather than seeing a neurologist about amnesia, you can use a comparator to compare current and old versions of the source code, determine exactly what changed, and find the source of the problem. This functionality is often built into revision-control tools.
Merge Tools
One style of revision control locks source files so that only one person can modify a file at a time. Another style allows multiple people to work on files simultaneously and handles merging changes at check-in time. In this working mode, tools that merge changes are critical. These tools typically perform simple merges automatically and query the user for merges that conflict with other merges or that are more involved.
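The automatic case can be sketched with the standard diff3 tool: given a common ancestor and two edited copies (the file names here are illustrative), non-overlapping changes merge cleanly:

```shell
printf 'a\nb\nc\n' > base     # common ancestor
printf 'A\nb\nc\n' > mine     # one person changed the first line
printf 'a\nb\nC\n' > yours    # another changed the last line
diff3 -m mine base yours      # prints the merged result: A, b, C
```

If both copies had changed the same line, diff3 would instead emit conflict markers and a nonzero exit status, which is the point at which a merge tool queries the user.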

…

Will programmers write unit tests for their code regardless of whether they write them first or last?
Will programmers step through their code in the debugger before they check it in?
Will programmers integration-test their code before they check it in?
Will programmers review or inspect each other's code?
Cross-Reference
For more details on quality assurance, see Chapter 20, "The Software-Quality Landscape."
Tools
Have you selected a revision control tool?
Have you selected a language and language version or compiler version?
Have you selected a framework such as J2EE or Microsoft .NET or explicitly decided not to use a framework?
Have you decided whether to allow use of nonstandard language features?
Have you identified and acquired other tools you'll be using: editor, refactoring tool, debugger, test framework, syntax checker, and so on?

…

A few simple guidelines can prevent refactoring missteps.
Opening up a working system is more like opening up a human brain and replacing a nerve than opening up a sink and replacing a washer. Would maintenance be easier if it was called "Software Brain Surgery?"
Gerald Weinberg
Save the code you start with Before you begin refactoring, make sure you can get back to the code you started with. Save a version in your revision control system, or copy the correct files to a backup directory.
Keep refactorings small Some refactorings are larger than others, and exactly what constitutes "one refactoring" can be a little fuzzy. Keep the refactorings small so that you fully understand all the impacts of the changes you make. The detailed refactorings described in Refactoring (Fowler 1999) provide many good examples of how to do this.

The company claims that inbound leads convert twice as often as outbound leads because they are of a higher quality in the first place. This is the continued justification for the emphasis on free content creation that caters to the point of view of the potential customer.
GitHub
GitHub, a web-based software development environment, launched in 2008, initially focused on facilitating projects that used the Git system for revision control.
The idea for the site was developed on a whim over a weekend. In 3 years and 8 months, however, it grew to become a site with a million code repositories. In December 2013, that number reached a staggering 10 million.
At its core, GitHub is a success because it solves a problem. The Git version-control system developed by Linus Torvalds in 2005 made collaboration possible for Linux kernel development, but it was far from an “easy” solution.

We love working with them; we still use them every day; but usually, we prefer Haskell.
Haskell in Industry and Open Source
Here are just a few examples of large software systems that have been created in Haskell. Some of these are open source, while others are proprietary products:
ASIC and FPGA design software (Lava, products from Bluespec, Inc.)
Music composition software (Haskore)
Compilers and compiler-related tools (most notably GHC)
Distributed revision control (Darcs)
Web middleware (HAppS, products from Galois, Inc.)
The following is a sample of some of the companies using Haskell in late 2008, taken from the Haskell wiki:
ABN AMRO
An international bank. It uses Haskell in investment banking, in order to measure the counterparty risk on portfolios of financial derivatives.
Anygma
A startup company. It develops multimedia content creation tools using Haskell.

…

It can only look at the name of a directory entry; it cannot, for example, find out whether it’s a file or a directory. This means that our attempt to use simpleFind will list directories ending in .c as well as files with the same extension.
The second problem is that simpleFind gives us no control over how it traverses the filesystem. To see why this is significant, consider the problem of searching for a source file in a tree managed by the Subversion revision control system. Subversion maintains a private .svn directory in every directory that it manages; each one contains many subdirectories and files that are of no interest to us. While we can easily filter out any path containing .svn, it’s more efficient to simply avoid traversing these directories in the first place. For example, one of us has a Subversion source tree containing 45,000 files, 30,000 of which are stored in 1,200 different .svn directories.
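The same pruning idea is easy to see at the shell level, where find can be told to skip .svn directories outright instead of filtering matches afterward (the layout below is a made-up miniature of such a tree):

```shell
mkdir -p src/.svn
printf '' > src/main.c
printf '' > src/.svn/main.c   # Subversion's private copy, not ours
# -prune stops find from descending into .svn at all:
find src -name .svn -prune -o -type f -name '*.c' -print
# → src/main.c
```

Pruning skips the entire subtree, so the 1,200 .svn directories in the example above would never even be opened.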

…

, Introducing Local Variables–The Offside Rule and Whitespace in an Expression, Local Functions, Global Variables
global, Local Functions, Global Variables
local, Introducing Local Variables–The Offside Rule and Whitespace in an Expression
W
Wadler, Philip, Further Reading
waitFor function, Finding the Status of a Thread
-Wall GHC option, Compilation Options and Interfacing to C
weak head normal form (WHNF), Normal Form and Head Normal Form, Separating Algorithm from Evaluation, Strictness and Tail Recursion
web client programming, Extended Example: Web Client Programming–Main Program
well typed rules, Strong Types
where clause, The where Clause, The Anatomy of a Haskell Module
whitespace in expressions, The Offside Rule and Whitespace in an Expression–The case Expression, A Note About Tabs Versus Spaces
vs. tab characters, A Note About Tabs Versus Spaces
WHNF (weak head normal form), Normal Form and Head Normal Form, Separating Algorithm from Evaluation, Strictness and Tail Recursion
widgets (GUI programming), Glade Concepts
wild card patterns, The Wild Card Pattern
Windows, installing GHC/Haskell libraries, Windows
withForeignPtr function, Pattern Matching with Substrings
withTransaction function, Transactions
Word type, Numeric Types
Word16 type, Numeric Types
Word32 type, Numeric Types
Word64 type, Numeric Types
Word8 type, Numeric Types
writeChan function, Chan Is Unbounded
writeFile function, readFile and writeFile
Writer monad, The Writer Monad and Lists
WriterT monad transformer, Motivation: Boilerplate Avoidance
X
x86_64 assembly, Tuning the Generated Assembly
XML, Extended Example: Web Client Programming, Glade Concepts
widget descriptions saved as, Glade Concepts
xor function, Numeric Types
Z
zero-width escape sequences, The Zero-Width Escape Sequence
zip function, Working with Several Lists at Once
zipWith function, Working with Several Lists at Once
About the Authors
Bryan O'Sullivan is an Irish hacker and writer who likes distributed systems, open source software, and programming languages. He was a member of the initial design team for the Jini network service architecture (subsequently open sourced as Apache River). He has made significant contributions to, and written a book about, the popular Mercurial revision control system. He lives in San Francisco with his wife and sons. Whenever he can, he runs off to climb rocks.
John Goerzen is an American hacker and author. He has written a number of real-world Haskell libraries and applications, including the HDBC database interface, the ConfigFile configuration file interface, a podcast downloader, and various other libraries relating to networks, parsing, logging, and POSIX code.

This fragment comes from Listing 10-2 and sets up a one-way data binding on the
todos.length property, which I explain in the following section.
Some developers don’t like the attribute approach, and—surprisingly often—attributes cause problems in
development tool chains. Some JavaScript libraries make assumptions about attribute names, and some restrictive
revision control systems won’t let HTML content be committed with nonstandard attributes. (I encounter this most
often in large corporations where the revision control system is managed by a central group that lags far behind the
needs of the development teams it supports.) If you can’t or won’t use custom attributes, then you can configure
directives using the standard class attribute, as follows:
...
There are <span class="ng-bind: todos.length"></span> items
...
The value of the class attribute is the name of the directive, followed by a colon, followed by the configuration
for the directive.

…

The controller I am going to create will be used throughout the application—something I refer to as
the top-level controller, although this is a term of my own invention—and I define this controller in its own file. Later,
I’ll start to group multiple related controllers in a file, but I put the top-level controller in its own file. Listing 6-2 shows
the contents of the controllers/sportsStore.js file, which I created for this purpose.
■■Tip The reason I keep the top-level controller in a separate file is so that I can keep an eye on it when it changes in a revision
control system. The top-level controller tends to change a lot during the early stages of development, when the application is
taking shape, and I don’t want the avalanche of change notifications to mask when other controllers are being altered. Later in the
project, when the main functionality is complete, the top-level controller changes infrequently, but when it does change, there is a
potential for breaking pretty much everything else in the application.

Shell Modes: Interactive and Embedded
The Erlang shell is started by default in what we call interactive mode. This means that
at startup, only the modules the runtime system needs are loaded. Other code is loaded
dynamically when a fully qualified call is made to a function in that module.
In embedded mode, all the modules listed in a binary boot file are loaded at startup.
After startup, calls to modules that have not been loaded result in a runtime error.
Embedded mode enforces strict revision control, as it requires that all modules are
available at startup. Lastly, it does not impact the soft real-time aspects of the system
by stopping, searching, and loading a module during a time-critical operation, as might
be the case for interactive mode. You can choose your mode by starting Erlang with
the erl -mode Mode directive, where Mode is either embedded or interactive.
Purging Modules
The code server can get rid of, or purge, old versions of a module by calling
code:purge(Module), but if any processes are running that code, they are first terminated, after which the old version of the code is deleted.

…

In large systems, you would expect to find quite
a few of them.
In a huge and complex system, without any knowledge of the module in which the
entry was corrupted, you would have to find the tuple {error, unknown_msg}, potentially having to search through millions of lines of code. Once you found the tuple,
inserting an io:format/2 statement that prints the error and process information would
not solve the problem, as in live systems, processes come and go and millions of entries
are inserted and deleted from ETS tables each hour. In addition, because this is a live
system with strict revision control, code changes must be tested and approved before
being deployed. This option, even if it’s tempting, would result in a slow turnaround
time. Don’t get wound up on release procedures and slow turnaround times from the
quality assurance team, however, as you can do better!
In Erlang, the first thing a developer or support engineer would consider doing is to
turn on the trace facility for all calls to the ets:insert/2 function.

Version Control
If you’re developing any kind of software at all, and you’re not using version control,
you're missing out. Version control systems keep a complete revision history of your project, enabling you to rewind your code to any point in time (for example, to the hour
before you made that innocent-looking change that broke something you didn’t notice
for a week).
Older version control systems such as SCCS (Source Code Control System) and RCS
(Revision Control System) were simple, either maintaining the original versions of files
plus deltas (minor changes to files between versions) or vice versa—keeping the latest
editions of files and applying “backward deltas.” One limitation of such systems was that the
controlled files lived on, and were modified on, the same server.
As software development has progressed, especially in the open source community, it
became clear that the existing systems were awkward for distributed group development,
leading to more modern version control systems such as CVS (Concurrent Versions
System), an improved, client-server offshoot of RCS, and later, the Subversion project,
which was meant to be “a better CVS” and a compelling replacement for it.

Usually it's in a place like /usr/local/lib/python2.7/dist-packages/ (in Ubuntu 12.04).
Installing the “master” Version
The latest and greatest Django development version is referred to as master, and it’s available from Django’s git repository. You should consider installing this version if you want to work on the bleeding edge, or if you want to contribute code to Django itself.
Git is a free, open source distributed revision-control system, and the Django team uses it to manage changes to the Django codebase. You can use a Git client to grab the very latest Django source code and, at any given time, you can update your local version of the Django code, known as your local checkout, to get the latest changes and improvements made by Django developers.
When using master, keep in mind there’s no guarantee things won’t be broken at any given moment.

…

Launch the Python interactive shell with manage.py shell and verify that the new field was added properly by importing the model and selecting from the table (e.g., MyModel.objects.all()[:5]). If you updated the database correctly, the statement should work without errors.
Then on the production server perform these steps:
Start your database’s interactive shell.
Execute the ALTER TABLE statement you used in step 3 of the development environment steps.
Add the field to your model. If you’re using source-code revision control and you checked in your change in development environment step 1, now is the time to update the code (e.g., svn update, with Subversion) on the production server.
Restart the Web server for the code changes to take effect.
For example, let’s walk through what we’d do if we added a num_pages field to the Book model from Chapter 5. First, we’d alter the model in our development environment to look like this:
class Book(models.Model):
    title = models.CharField(max_length=100)
    authors = models.ManyToManyField(Author)
    publisher = models.ForeignKey(Publisher)
    publication_date = models.DateField()
    num_pages = models.IntegerField(blank=True, null=True)

    def __unicode__(self):
        return self.title
(Note: Read the section “Making Fields Optional” in Chapter 6, plus the sidebar “Adding NOT NULL Columns” below for important details on why we included blank=True and null=True.)

All of these examples represent one way of implementing Puppet in your environment.
The examples are designed to give you an idea of how a production Puppet implementation
could work. Some of the examples represent some best practice guidelines, but others are
simply ideas about how you might go about using Puppet to articulate your configuration.
We’ll also look at how to manage and store your Puppet configuration including using a
revision control system, file serving, and modules. By the end of this chapter, you should
have a strong understanding of what Puppet can do to manage your nodes and the best way
to articulate, store, and administer that configuration.
Note ➡ All the configuration examples in this chapter are available as a source code download from the
Apress site.
Our Example Environment
We’re going to configure an example environment with Puppet.

‡ LISP programmers are always happy to tell you just how much better LISP was than anything before or since,
how much better it was as a development language, and how all the advances that have occurred since then
are just attempts at recreating what they had 30 years ago. Even if they are right, they need to let it go.
This is not to say that everything is perfect with these tools. I have had, well…interesting
experiences with the way IDEs interface with revision control systems. The view of
what code is appropriate may strike some as a bit fascistic. And the user interface—
well, let’s just say this is the sort of interface that could only have been produced by
programmers on an open source project. Everything can be configured and changed,
which is the good news. And most everything needs to be configured and changed,
which is the bad news.
Still, these are great tools, and their rapid evolution and improvement over time is a
testament to the open source development model they have adopted.

As far as the association was concerned, humanity had not left the industrial age, yet their members were about to enter the information age. Doug called the system he had built the oN-Line System (NLS), and in the 100-minute demonstration that would follow he planned to introduce the world to (in the words of Engelbart’s biographer Thierry Bardini) “windows, hypertext, graphics, efficient navigation, command input, videoconferencing, the computer mouse, word processing, dynamic file linking, revision control, and a collaborative real-time editor.” But for the moment no one was sure that what would later be called the Mother of All Demos would work. Doug had told someone at NASA earlier in the week that he was going to show the system publicly—“Maybe it’s a better idea you don’t tell us, just in case it crashes,” the NASA employee advised him. Doug’s chief engineer, Bill English, had been a theatrical stage manager and knew that the demonstration had to be ready as soon as the audience showed up.

Most of software development goes on in your head anyway. I think having worked with that simpler system imposes a kind of disciplined way of thinking. If you haven't got a directory system and you have to put all the files in one directory, you have to be fairly disciplined. If you haven't got a revision control system, you have to be fairly disciplined. Given that you apply that discipline to what you're doing it doesn't seem to me to be any better to have hierarchical file systems and revision control. They don't solve the fundamental problem of solving your problem. They probably make it easier for groups of people to work together. For individuals I don't see any difference.
Also, I think today we're kind of overburdened by choice. I mean, I just had Fortran. I don't think we even had shell scripts.

If the DNS team’s test was blocked on the Machine Database team’s configuration of a new cluster, as soon as the cluster appeared in the database, the DNS team’s tests and fixes would start working.
Take the test shown in Figure 7-2 as an example. If TestDnsMonitoringConfigExists fails, as shown, we can call FixDnsMonitoringCreateConfig, which scrapes configuration from a database, then checks a skeleton configuration file into our revision control system. Then TestDnsMonitoringConfigExists passes on retry, and the TestDnsMonitoringConfigPushed test can be attempted. If the test fails, the FixDnsMonitoringPushConfig step runs. If a fix fails multiple times, the automation assumes that the fix failed and stops, notifying the user.
Armed with these scripts, a small group of engineers could ensure that we could go from “The network works, and machines are listed in the database” to “Serving 1% of websearch and ads traffic” in a matter of a week or two.
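The test-and-fix chaining described above can be sketched as a small driver loop. This is an illustrative reconstruction, not Google's actual tooling; the step names and the retry limit below are stand-ins for the real scripts:

```python
# Sketch of the test-then-fix pattern: each test has a paired fix;
# a fix is retried a bounded number of times, and automation stops
# (notifying a human) if the test still cannot be made to pass.

MAX_FIX_ATTEMPTS = 3  # assumed limit, not from the original text

def run_chain(steps):
    """steps: list of (test, fix) pairs; a test returns True on success."""
    for test, fix in steps:
        attempts = 0
        while not test():
            if attempts >= MAX_FIX_ATTEMPTS:
                raise RuntimeError(
                    f"fix for {test.__name__} keeps failing; stopping")
            fix()  # e.g. scrape config from a database and check it in
            attempts += 1

# Hypothetical stand-ins for the DNS monitoring steps in the text:
state = {"config_exists": False, "config_pushed": False}
def test_config_exists(): return state["config_exists"]
def fix_create_config(): state["config_exists"] = True
def test_config_pushed(): return state["config_pushed"]
def fix_push_config(): state["config_pushed"] = True

run_chain([(test_config_exists, fix_create_config),
           (test_config_pushed, fix_push_config)])
```

Because each fix only runs when its test fails, the chain is idempotent: rerunning it against an already-healthy cluster does nothing.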

…

You’d better hope those tests for future compatibility (which are running as monitoring probes) had good API coverage.
Fake Backend Versions
When implementing release tests, the fake backend is often maintained by the peer service’s engineering team and merely referenced as a build dependency. The hermetic test that is executed by the testing infrastructure always combines the fake backend and the test frontend at the same build point in the revision control history.
That build dependency may provide a runnable hermetic binary and, ideally, the engineering team maintaining it cuts a release of that fake backend binary at the same time they cut their main backend application and their probes. If that backend release is available, it might be worthwhile to include hermetic frontend release tests (with the fake backend binary) in the frontend release package.

Local Version Control Systems
Many people’s version-control method of choice is to copy files into another directory (perhaps a time-stamped directory, if they’re clever). This approach is very common because it is so simple, but it is also incredibly error prone. It is easy to forget which directory you’re in and accidentally write to the wrong file or copy over files you don’t mean to.
To deal with this issue, programmers long ago developed local VCSs that had a simple database that kept all the changes to files under revision control (see Figure 1–1).
Figure 1–1. Local version control diagram.
One of the more popular VCS tools was a system called RCS, which is still distributed with many computers today. Even the popular Mac OS X operating system includes the rcs command when you install the Developer Tools. This tool basically works by keeping patch sets (that is, the differences between files) from one change to another in a special format on disk; it can then re-create what any file looked like at any point in time by adding up all the patches.
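The patch-set idea is easy to sketch. The toy class below is not RCS itself (RCS stores reverse deltas in `,v` files); it just illustrates the principle of storing a base version plus line-level deltas and rebuilding any revision by replaying them:

```python
import difflib

class TinyRCS:
    """Toy local VCS: keeps the first revision in full, then only
    line-level deltas, and rebuilds any revision by replaying them."""

    def __init__(self, initial_lines):
        self.base = list(initial_lines)
        self.deltas = []          # one opcode list per later revision
        self._latest = list(initial_lines)

    def commit(self, new_lines):
        sm = difflib.SequenceMatcher(a=self._latest, b=new_lines)
        # store replacement text only where lines actually changed
        ops = [(tag, i1, i2, None if tag == "equal" else new_lines[j1:j2])
               for tag, i1, i2, j1, j2 in sm.get_opcodes()]
        self.deltas.append(ops)
        self._latest = list(new_lines)

    def checkout(self, rev):
        """rev 0 is the base; rev n replays the first n deltas."""
        lines = list(self.base)
        for ops in self.deltas[:rev]:
            out = []
            for tag, i1, i2, repl in ops:
                if tag == "equal":
                    out.extend(lines[i1:i2])  # reuse unchanged lines
                else:
                    out.extend(repl or [])    # insert/replace; delete adds nothing
            lines = out
        return lines
```

Checking out revision n really does "add up all the patches": it starts from the base and applies each stored delta in order.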

Many of these generators are simple, small, and written in scripting languages such as Python, Perl, and Ruby; so if you are familiar with these languages, you’ll be able to easily extend them to customize their behavior. This process is arguably much simpler than learning how to customize a large system such as WordPress.
Other points in favor of Jekyll and similar static generators are the ability to keep your blog under revision control with tools like Git or Subversion, the simplicity of deploying the output site pretty much anywhere, and the positive performance implications. In fact, since your blog ends up being a static site, its performance should be very good, even on commodity hardware.
The major disadvantage is that you are on your own. There are very few premade add-ons that aid you in accomplishing even a small percentage of what you can do with software like WordPress.

This one was by far the easiest.) The company hopes that the ease and quality of its distribution (not to mention its price) will drive many more individual computer users to use Ubuntu Linux.
Canonical aims to profit from the community-driven and community-developed Ubuntu. Its vision is inspired by Shuttleworth, who says he has been “fascinated by this phenomenon of collaboration around a common digital good with strong revision control.”7 That collaboration is done through a community. Canonical intends to “differentiate ourselves by having the best community. Being the easiest to work with, being the group where sensible things happen first and happen fastest.” “Community,” Shuttleworth said to me, “is the absolute essence of what we do.” “Thousands” now collaborate in the Canonical project.
To make this collaboration work, as Shuttleworth describes, at least three things must be true about the community.

Android tools preprocess these definitions and turn them into highly optimized representations and the Java source through which application code refers to them. The autogenerated code, along with code created for AIDL objects (see AIDL and Remote Procedure Calls), is put into the gen directory. The compiler compiles the code from both directories to produce the contents of bin. The full structure of a project was described in detail in Chapter 3.
Note
When you add your project to a revision control system like Git, Subversion, or Perforce, be sure to exclude the bin and gen directories!
Your application source code goes in the src directory. As noted in Chapter 2, you should put all your code into a package whose name is derived from the domain name of the owner of the code. Suppose, for instance, that you are a developer at large, doing business as awesome-android.net. You are under contract to develop a weather-prediction application for voracious-carrier.com.
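Since everything in those directories is regenerated by the build, the usual way to follow this advice with Git is an ignore file at the project root. A minimal sketch (the exact directory names depend on your build setup):

```gitignore
# generated by the Android build tools -- do not track
bin/
gen/
```

Subversion and Perforce have equivalent mechanisms (the svn:ignore property and P4IGNORE, respectively).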

We’ll see in a minute how the res directory is particularly important for making application data accessible using a Context object.
Organizing Java Source
Your application source code goes in the src directory. As noted in Chapter 2, you should put all your code into a package whose name is derived from the domain name of the owner of the code. Suppose, for instance, that you are a developer at large, doing business as awesome-android.net.

Ultimately, we’re economic creatures, and the sense that “we own this, and our work can never be used against us” makes it much easier for people to invest in an open source project like ØMQ. And it can’t be just a feeling, it has to be real. There are a number of aspects to making collective ownership work; we’ll see these one by one as we go through C4.
Preliminaries
The project SHALL use the Git distributed revision control system.
Git has its faults. Its command-line API is horribly inconsistent, and it has a complex, messy internal model that it shoves in your face at the slightest provocation. But despite doing its best to make its users feel stupid, Git does its job really, really well. More pragmatically, I’ve found that if you stay away from certain areas (branches!), people learn Git rapidly and don’t make many mistakes.

In [5]: code
Out[5]: output
At times, for clarity, multiple code examples will be shown side by side. These should be read left to right and executed separately.
In [5]: code     In [6]: code2
Out[5]: output   Out[6]: output2
Data for Examples
Data sets for the examples in each chapter are hosted in a repository on GitHub: http://github.com/pydata/pydata-book. You can download this data either by using the git revision control command-line program or by downloading a zip file of the repository from the website.
I have made every effort to ensure that it contains everything necessary to reproduce the examples, but I may have made some mistakes or omissions. If so, please send me an e-mail: wesmckinn@gmail.com.
Import Conventions
The Python community has adopted a number of naming conventions for commonly used modules:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
This means that when you see np.arange, this is a reference to the arange function in NumPy.

This is the equivalent of having the MS-DOS .bat file interpreter and command.com functionality available at the command line. Experienced users use this feature to write small programs from the command line to accomplish simple tasks.
For instance, suppose you give a friend a set of source code, and a few weeks later he returns it with a bunch of tweaks and fixes. You may want to go through all the source code to see what has been changed in each file before you commit the changes to your revision control system. Using a UNIX-style shell, you can compare all files in one directory to files with the same name in another directory and place the results in a file with a DIF suffix. The commands at the Bourne, Korn, or Bash shell prompt are as follows:
$ for i in *.c
do
diff $i ../otherDir/$i >$i.DIF
done
You can use tclsh to gain this power in the Windows and Macintosh worlds. The code to accomplish the same task under Windows using a tclsh shell is
% foreach i [glob *.c] {
    exec fc $i ..\\otherDir\\$i > $i.DIF
}

Consider the code a petri dish for your own experiments.
Building the code requires a few auxiliary command-line tools:
Java
HBase is written in Java, so you do need to have Java set up for it to work. The “Java” section has the details on how this affects the installation. For the examples, you also need Java on the workstation you are using to run them.
Git
The repository is hosted by GitHub, an online service that supports Git—a distributed revision control system, created originally for the Linux kernel development.[3] There are many binary packages that can be used on all major operating systems to install the Git command-line tools required.
Alternatively, you can download a static snapshot of the entire archive using the GitHub download link.
Maven
The build system for the book’s repository is Apache Maven.[4] It uses the so-called Project Object Model (POM) to describe what is needed to build a software project.
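As a hedged illustration of what a POM looks like (this is not the book repository's actual pom.xml; the coordinates and dependency below are placeholders), a minimal POM has this shape:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- coordinates identifying this project; values are placeholders -->
  <groupId>com.example</groupId>
  <artifactId>hbase-book-examples</artifactId>
  <version>1.0-SNAPSHOT</version>
  <dependencies>
    <!-- each dependency is likewise declared by its coordinates -->
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
      <version>2.5.5</version>
    </dependency>
  </dependencies>
</project>
```

Running `mvn package` in the directory containing this file resolves the declared dependencies and builds the project.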

Changes are published by issuing a commit command, which will save the changes from the working directory to the repository.
12.1.1. Centralized Version Control
The first version control system was the Source Code Control System, SCCS, first described in 1975. It was mostly a way of saving deltas to single files that was more efficient than just keeping around copies, and didn't help with publishing these changes to others. It was followed in 1982 by the Revision Control System, RCS, which was a more evolved and free alternative to SCCS (and which is still being maintained by the GNU project).
After RCS came CVS, the Concurrent Versions System, first released in 1986 as a set of scripts to manipulate RCS revision files in groups. The big innovation in CVS is the notion that multiple users can edit simultaneously, with merges being done after the fact (concurrent edits).

Look-ahead bias can also occur if a rule uses data series that are reported with a lag, such as mutual fund cash statistics, or that are subject to revisions, such as government economic statistics. When this is the case, lagged values must be used that take into account reporting delays or revisions.
The case study avoided look-ahead bias by assuming entries and exits on the open price of the day following a position-reversal signal. Moreover, none of the data series used were subject to reporting lags or revisions.
Controlled for Data-Mining Bias. Few rule studies in popular TA apply significance tests of any sort. Thus, they do not address the possibility that rule profits may be due to ordinary sampling error. This is a serious omission, which is easily corrected by applying ordinary hypothesis tests.
However, ordinary tests of significance are only appropriate when only one rule has been back tested. When many rules have been tested and a best is selected, the ordinary hypothesis test will make the best rule appear more statistically significant than it really is (false rejection of the null hypothesis).
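A small simulation (mine, not the book's) makes the selection effect concrete: back test many rules that are pure noise, keep the best one, and an ordinary one-rule significance test will tend to flag it anyway:

```python
# Toy illustration of data-mining bias: every "rule" below has zero
# true edge (its daily returns are mean-zero noise), yet the best
# t-statistic among 200 such rules typically exceeds the ~1.65
# one-sided 5% critical value a single-rule test would use.
import random
import statistics

random.seed(1)
N_RULES, N_DAYS = 200, 250  # arbitrary illustrative sizes

def t_stat(returns):
    m = statistics.mean(returns)
    s = statistics.stdev(returns)
    return m / (s / len(returns) ** 0.5)

best = max(t_stat([random.gauss(0, 0.01) for _ in range(N_DAYS)])
           for _ in range(N_RULES))
print(best)
```

The right correction is a multiple-comparisons-aware test over the whole family of rules tried, not a naive test on the selected winner.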