BioImplement

Thursday, January 15, 2015

I do kinda feel like my head is
full!
My context switching penalty is high and
my process isolation is not what it used to be.

-Elon Musk, Reddit AMA, Jan 5, 2015

Cognitive load is a term applied to the overall
effort used in working memory for an individual performing a task. Faced with
any technology choice, we tend to concoct an approximation in our minds of the
cost of effort, compared to the benefit of change. The cost that has been on my mind recently is
that of cognitive load. Even thinking about the irony of that statement adds to
my cognitive load.

I moved to Singapore in 2007, with roads and
driver’s seats opposite those I learned on in Canada. When driving there, conversations
with passengers were halting, stressful and mentally draining. I could feel my
brain fighting to avoid old reflexes, which seemed to conspire against my
progress. Switching contexts between driving and navigating was a chore. I
circled and doubled back quite often.

This experience left me sensitized to my own
cognitive load. I began to notice how I reacted to context switching between
computing technology and my own scientific domain, biology. Like Elon Musk, my
context switching comes with a penalty best described as a “mental lag”: a
period of time in which I can remember nothing about what I know. This lag is a
brief moment of stupidity, lasting seconds to minutes. It is as though my brain
needs time and more clues to rebuild the branching needed to recall those
things that do indeed reside deep in my memory. It seems that the path into
deep memory gets displaced by whatever I was last doing. The more cognitive
load my last task used, the longer the lag seems. The discomfort of switching
contexts seems to drive me to try to reduce my cognitive load.

Educators design instructional material to
reduce cognitive load in a few ways.

- Physical integration of information. (Think Wikipedia as our savior for any trivia question.)

- Eliminating unnecessary redundancy. (We Canadians fill out government forms in one or the other official language, never both, no matter how fluently bilingual we are.)

- Worked examples.

- Open-ended exercises.

So my hypothesis here is that technology stack components
that are successful – ones that entice people to switch to them – seem to reduce
cognitive load in ways that approach the list above. At the same time they have
a low switching cost. It is as though knowledge
workers carry this approximation in their heads, balancing the real and
cognitive costs and benefits of switching to new technologies, while at the
same time watching network behavior so as not to get stranded on the wrong
side of emerging successes.

Over the last 3 years, I have changed my complete
computing stack, including infrastructure, operating systems, databases, and
language. I attribute this big changeover to my fundamental need to reduce my
cognitive load, and I am pleased to say it has.

Switching from physical
infrastructure to a hybrid IaaS system has made life much easier. Aside from
the usual number-of-cores-on-the-head-of-a-pin, or CapEx-vs-OpEx arguments, I
would argue that the popularity of IaaS and PaaS cloud computing relates directly
to reducing the cognitive load of developers. The cloud-based web developer doesn't
need to carry around the mental baggage of physical systems knowledge. Maybe
this is obvious to everyone.

Cloud computers are invisible computers, and “out
of sight – out of mind” seems to me to be a key to minimizing cognitive load. Cloud
VMs remove the burden of needing to know much about the idiosyncrasies of
physical server hardware installations: power, security, firewalls, networking,
cabling, storage formatting, remote console booting, and OS driver updating. By
using cloud computing, developers get a free chunk of cognitive load back to
use for other fun software development stuff.

Another example is the rise of server-side
JavaScript. Through Node.js, JavaScript
has become a general purpose computing language, adding to its original role as
the common language in web browsers. This is a clear case of two birds, one stone. Developers work more efficiently in a single
language that spans the back-end and front-end of modern web applications.
Efficiency improvements come from re-using code on both the server and browser
side, but also from no longer carrying the cognitive load of coding in two separate
languages, and from avoiding context switching penalties.

The popular rise of Docker also seems to fit
the pattern of a reduced cognitive load – as it chunks software deployments,
abstracts Linux system calls, and promises to make it easier to deploy across
heterogeneous cloud platforms. Packaging a Docker image consolidates familiar
Git-based builds and package-manager-based installations, carrying with it
all the application dependencies. The
Docker ecosystem promises to democratize the compute image itself, making it
vendor neutral, and, at the same time, reducing the worry over cloud vendor lock-in.
Docker embodies the concept of the physical integration of computing information,
which was the first bullet point from the list above. But it will take at least another year of
ecosystem development before Docker is widely used in production systems.

While these three are arguably successful and
appear to reduce cognitive load, other new systems are eating your newly-freed cognitive
capacity for lunch.

There has been a sprawl of database and storage
technologies supporting unstructured data and Big Data. These aspire to provide
solutions that scale by changing fundamental compute paradigms of databases and
storage. If you believe Big Data
listicles, our fate seems sealed, and we are driving towards a NoSQL MongoDB, Hadoop/HDFS
world for Data Science. But Hadoop, in particular, has not won many converts among
my kindred spirits in Bioinformatics. The Bioinformatics stack depends on general
purpose Unix/Linux computing and POSIX file systems. Somehow the cost to convert
tools to the Hadoop/HDFS world is not yet justified to my peers.

So - what happens when general purpose
computing systems catch up with these diverse and specialized Big Data systems?
Consider two new open-source systems, PostgreSQL
9.4 and Manta, both released as open source in late 2014. Both of these systems
offer consolidations that enable unstructured data and Big Data computing
within existing computing paradigms.

In December 2014 PostgreSQL 9.4 was released
with the JSONB data type and extensions for JSON indexing and querying. These
extensions allow any SQL column containing JSON data to work as a fast, SQL indexed
and query-able JSON store. In practice, this makes NoSQL/JSON databases
functionally redundant and embeds the functionality in an SQL system. One can
query arbitrary JSON structures with SQL syntax extensions, and importantly,
connect rich SQL-based visualization and computing tools like Tableau, and R, to
the database. The specialization movement towards NoSQL/JSON databases, as
implemented in MongoDB, has now lost its raison d'être. SQL has caught up in functionality. PostgreSQL
could be a winner in the long-term and start edging out NoSQL and MySQL systems
in a couple of short years.
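To make the JSONB point concrete, here is a minimal Python sketch of what "SQL indexed and query-able JSON" can look like. The table name `samples`, the column `doc`, the index name, and the helper `find_samples` are all hypothetical, invented for illustration. The `jsonb` type, the `->>` and `@>` operators, and GIN indexing are PostgreSQL 9.4 features. The `%s` placeholder style assumes a DB-API driver such as psycopg2, which is not imported here.

```python
import json

# Hypothetical schema: one JSONB column holding arbitrary documents.
CREATE_TABLE = "CREATE TABLE samples (id serial PRIMARY KEY, doc jsonb)"

# A GIN index lets containment queries (@>) on the JSONB column use an index.
CREATE_INDEX = "CREATE INDEX samples_doc_idx ON samples USING GIN (doc)"

# doc @> '{"k": "v"}' keeps rows whose document contains those key/value pairs;
# doc->>'organism' extracts one field as text for ordinary SQL tools to consume.
FIND_SQL = "SELECT id, doc->>'organism' FROM samples WHERE doc @> %s::jsonb"


def find_samples(conn, **fields):
    """Return (id, organism) rows whose JSON document contains `fields`.

    `conn` is assumed to be an open DB-API connection to PostgreSQL 9.4+,
    for example one returned by psycopg2.connect(...).
    """
    cur = conn.cursor()
    cur.execute(FIND_SQL, (json.dumps(fields),))
    return cur.fetchall()


# Usage, assuming `conn` is a live connection and the table is populated:
#   rows = find_samples(conn, organism="E. coli", platform="Illumina")
```

The result is an ordinary SQL result set, so it can be joined against relational tables or handed to any tool that already speaks SQL.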

Manta consolidates this fragmentation of
storage by providing a persistent object store with general purpose Map/Reduce
computing using the CPU cores of commodity storage nodes. It provides a
strongly consistent, hierarchical object store built on ZFS, a POSIX-compliant
copy-on-write file system. It allows secure
container-based compute tasks to be moved to storage, rather than moving
storage to computing nodes. This eliminates AWS S3 – EC2 – HDFS transfer time, duplicated
data costs, and saves us from having to optimize HDFS by packing small files into a larger SequenceFile via a peculiar re-invention of tar.

On Manta, a user can build complex Map/Reduce
pipelines involving any runtime language
or compiled special purpose software (e.g. ffmpeg, OpenCV, BLAST, CRAM), and
run Map/Reduce in a secure multi-tenant environment. Manta functions without the
drag of Hadoop’s Java-based process control code and HDFS management code
running on top of the operating system and storage. Manta ZFS storage is LZ4
compressed at the operating system level, a unique feature that makes better
use of commodity disk space. Open-source Manta software development kits (SDKs)
allow Map/Reduce systems to be rapidly constructed with one-line Bash or
PowerScript scripts, as well as in
Node.js, Python, Go, R, Java and Ruby.

Despite the elimination of real storage redundancies
and cognitive load redundancies imposed by maintaining three separate storage
paradigms, Manta has a switching penalty – it is not a Linux based system. It runs
on the x86 illumos based distro – SmartOS, a server operating system forked
from OpenSolaris. SmartOS itself is specialized in running secure operating
system containers (Zones), as well as the KVM hypervisor which can run Linux or
Windows virtual machines.

Released as a cloud service in 2013, Manta had
four distinct cognitive costs that gave adopters pause. Now that it is open
source, as of Nov 2014, there are three left. First, it has to be installed on
physical or hosted commodity computing infrastructure. Someone has to rig the
required networking and learn how to administer the system. Second is the cognitive
cost of switching your operating system command memory from Linux to SmartOS (i.e. reading the docs, cheat sheets,
and list servers).

Third, and arguably the biggest sticking point,
is the cognitive cost in cross-compiling Linux specialized code to SmartOS. Although
over ten thousand packages have been ported to SmartOS’s package manager, writing
software that is cross-platform between Linux and Unix systems is, in many
cases, a dying art form. It would seem that the success of Linux has allowed
developers to reduce their cognitive load and stop caring about cross-platform software
compatibility.

But this last cognitive load barrier is falling
fast. In 2014 Joyent’s engineers began updating the lx-branded zone, based on the old OpenSolaris Linux operating system call emulator. Now in open
beta, lx-branded zones are containers that run current versions of CentOS and Ubuntu, and a
majority of current 32- and 64-bit Linux software. Extensive community testing
is helping find bugs, which are being eliminated fast and furiously.

The lx zone provides a way for Linux
software to run on SmartOS at bare metal
speeds without cross compiling code, and gives die-hard Linux users their
cherished apt and yum package managers.

A key piece of Linux software targeted for the lx zone is the Docker daemon itself. For Manta this is most significant
as it will allow Docker images to form the building blocks of Big Data
computing on storage. When Joyent succeeds at this effort, it will tip the
balance. The actual cost of maintaining data and moving it between three
separate storage paradigms, POSIX, S3 and HDFS, will outweigh the cognitive cost
of switching to Manta.

Then the Big Data/Data Science stack may be ready for serious
disruption.

Disclaimer: I no longer work for Joyent, and have no competing financial interests.

Tuesday, April 30, 2013

1 May 2013- Singapore. This is an Open Science post intended to spark ideas and conversations with other scientists.

I was thinking about some of the assembly problems that have been described in large metagenomic (e.g. soil) samples by @ctitusbrown, @phylogenomics and many others.

To start with - I am not an “NGS” metagenomics expert. I am a “FGS” metagenomics guy. We published log-odds scores for detecting over 100 species by amino acid composition in 2002 (http://www.biomedcentral.com/1471-2105/3/39). I know a thing or two about large scale bioinformatics pipelines and optimizations.

So. Here are my current thoughts about inverting the metagenomics assembly pipeline.

First - the "small database effect".
What is the small database effect? It is the inverse effect of the large database problem. Google "cancer" and you get (today) 644 million hits. The larger the database, the more hits you get. A smaller database gives you correspondingly fewer hits.

Consider that the number of protein functional classes (PSSMs or HMMs) is smaller than the number of possible species. Organisms are built on common protein building blocks, and there are more ways to organize the blocks and encode them (i.e. species) than there are blocks themselves.
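One way to put rough numbers on the small database effect, assuming BLAST-style Karlin-Altschul statistics, is the expected number of chance hits E = m * n * 2^(-S'), where m is the query length, n the database length and S' the normalized bit score. The sketch below is only illustrative: the database sizes are made up, the function name is hypothetical, and effective-length corrections are ignored.

```python
def expected_chance_hits(bit_score, query_len, db_len):
    """Karlin-Altschul E-value for a normalized bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical sizes, chosen only for illustration.
QUERY_LEN = 10              # a short translated read, in residues
SMALL_DB = 10_000_000       # stand-in for a compact functional-class collection
LARGE_DB = 1_000_000_000    # stand-in for a large per-species sequence collection

for name, size in (("small", SMALL_DB), ("large", LARGE_DB)):
    e_value = expected_chance_hits(20.0, QUERY_LEN, size)
    print(f"{name} database: E = {e_value:.1f}")
```

For the same weak alignment score, the expected number of chance hits scales linearly with database size, so the hundredfold smaller database yields a hundredfold smaller E-value. That is why a permissive threshold is more tolerable against a small, curated collection.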

In principle, searching for the function of a short read should outperform a search for the taxon of a short read, in terms of both hit quality and number. Can this work in practice? I think so. Here’s why. (I have tried out steps 1-3, and they work from command-line RPS-BLAST, which is what got me started on this idea.)

1. RPS-BLAST can find PSSMs that match sequences of 9-12 amino acids. Change the E-value threshold to around 100. Multiple PSSMs may be hit in the process, and some of these may be false positives. See point 3 about reducing the number of hits by up to an order of magnitude, and point 6 about false positive hits.

2. Short reads on the order of 30 bp can be input directly into RPS-BLAST as nucleotides. Select the appropriate genetic code. Assume everything encodes protein and close your eyes for the moment.

3. CDDs are not independent; rather, they are curated. The PSSMs are organized into a hierarchy of evolutionary families and superfamilies. Most hits are in fact within a branch of the hierarchy. The collection of over 40,000 PSSMs is itself redundant, and a last-common family or superfamily can be readily found by traversing the ancestral hierarchy. This can collapse the number of candidate PSSM hits to the short sequence considerably, in some cases by an order of magnitude. For example, 50 PSSM hits can collapse into 2-5 superfamily CDD PSSMs.

4. With the assignment of lowest common ancestor CDD PSSMs to the short read sequence, one can use the resulting PSSM / CDD identifier as a database key for binning. A short read sequence can be assigned to multiple PSSM bins without harm at this point. By grouping together all the short reads into PSSM-based bins, one can subdivide the metagenomics dataset into potentially related protein functional units by putative peptide sequence.

5. Once the reads are binned, one can assemble the sequences in each PSSM bin into contigs independently, by feeding the binned read set into a conventional assembler.

6. One can remove erroneous hybrid contigs from the bin by a second RPS-BLAST scan against the parent PSSM or CDD PSSM branch hierarchy. This is a tiny fraction of the entire PSSM set. At this stage the successful contigs should represent species fragments with coding regions based on nucleotide and reading frame overlap, with minimal interference from random overlap. The resulting set of contigs is functionally related to the PSSM, so the assignment of function is already in hand.

7. Digital normalization could be applied to the contigs within the PSSM bin to remove redundant information for further genome assembly. There will also be a pile of un-binned reads that do not map to any PSSMs. These could be assembled/digitally normalized separately from the binned sequences, and merged for the final assembly pass.

8. The PSSM itself could be used as a scaffold for sub-assembly. That will require some code, but it is probably not necessary.

9. The processed contigs can be sent to another system in the pipeline for phylogenetic assignment (e.g. BLASTX). Final assembly can be performed assuming species-specific sequences are grouped better by the preceding steps.

10. PSSMs are divisible and thus can be cut up and subdivided into smaller PSSMs. If there are unmatched portions of the original PSSMs in a set of contigs, they can be chopped into mini-PSSMs, related in the existing CDD hierarchy to the parent PSSM we cut them from. This now changes the search statistics. A new custom RPS-BLAST database can then be compiled out of these leftovers. The unassigned or discarded short reads can be searched against this. This mini-PSSM database will be much smaller in size than the original. This could further extend the previously found contigs by assigning new, weaker hit contigs into the existing PSSM bins up the hierarchy.

Feel free to substitute HMM for PSSM in most of this. However, I have no idea whether implementations of protein family HMM searching can use short nt sequences as input (points 1 and 2), or how to achieve the hierarchical reduction in number of hits without an evolutionary model for the entire set, as curated by CDD (point 3), or whether an HMM can be simply divisible (point 10) without a lot of re-processing.
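The hierarchy collapse and binning described above can be sketched in a few lines of Python. This is only a toy under stated assumptions, not code from the experiments mentioned earlier. The `PARENT` map is a hypothetical child-to-parent slice standing in for the CDD family/superfamily hierarchy, and the read and PSSM identifiers are invented. Parsing real RPS-BLAST output is left out.

```python
from collections import defaultdict

# Hypothetical child -> parent links standing in for a curated hierarchy.
PARENT = {
    "pssm_A1": "fam_A",
    "pssm_A2": "fam_A",
    "fam_A": "superfam_X",
    "fam_B": "superfam_X",
    "pssm_C1": "superfam_Y",
}


def lineage(node, parent):
    """Return [node, parent, grandparent, ...] ending at the hierarchy root."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def lowest_common_ancestor(hits, parent):
    """Return the deepest node shared by the lineages of all hits, or None."""
    paths = [lineage(hit, parent) for hit in hits]
    common = set(paths[0]).intersection(*paths[1:])
    for node in paths[0]:          # walk upwards from the first hit
        if node in common:
            return node
    return None


def collapse_hits(hits, parent):
    """Collapse a read's hits to one lowest common ancestor per hierarchy root."""
    by_root = defaultdict(list)
    for hit in hits:
        by_root[lineage(hit, parent)[-1]].append(hit)
    return sorted(lowest_common_ancestor(group, parent) for group in by_root.values())


def bin_reads(read_hits, parent):
    """Use collapsed identifiers as bin keys; a read may land in several bins."""
    bins = defaultdict(list)
    for read_id, hits in read_hits.items():
        for key in collapse_hits(hits, parent):
            bins[key].append(read_id)
    return dict(bins)


# Invented example: three reads and the PSSMs each one hit.
read_hits = {
    "read1": ["pssm_A1", "pssm_A2"],
    "read2": ["pssm_A1", "fam_B", "pssm_C1"],
    "read3": ["pssm_C1"],
}
bins = bin_reads(read_hits, PARENT)
```

Each bin's read list is what would then be handed to a conventional assembler. Reads with no hits never enter a bin and form the un-binned pile.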

Tuesday, October 23, 2012

Tracing the historical provenance of the mouse trap's unique design back to 1847 reveals its inventor, Job Johnson, and shows that the design is reducible to a functioning single-part animal trap: the fish hook.

Figure 1. A modern Victor Brand mouse trap with bait-pedal up, showing the vestigial profile of the fishhook, from which it originated.

In
his book Darwin’s Black Box [1],
and its follow-up The Edge of Evolution [2],
biochemist and Intelligent Design instigator Michael J. Behe uses the mouse
trap as the defining example of a device that is irreducibly complex. He
explains how it cannot function without all of its parts, and that none of its
parts, alone or in various combinations, can do the function of the entire
trap. And he explains how the trap could not have been created by a
succession of small modifications to some simpler precursor that performed the same
function of mouse trapping. Niall Shanks,
in his critique of Intelligent Design [3], made an effort to address
the historical origin of the mouse trap, but could not get past what seems like
a popular myth, when in fact there is much older supporting evidence for the
origin of the mouse trap in both the Patent Office and in antique mouse trap
collections.

The
origin story for the mouse trap is important because it maps the progress of a
complex idea. It is the first moment of invention of the mouse trap that marks
the start of the idea of its complexity. It is in the original trap that the human
intelligent design work was performed. The complex mouse traps designed
thereafter are a cascade of copycat follow-ups with incremental changes to the
original design. The idea of a snap-style mouse trap is an idea that, once it
started, became hugely popular.

If
you could collect one example of each mouse trap produced every year after its
original invention and place each trap on a long table side by side in
chronological order, how do you think these mouse traps would appear to have changed
over time? How far back in time would we have to go to find the original snap-style
mouse trap? Would the original snap-style mouse trap even look the same? The
long line of mouse traps we see before us would be a record of the legacy of
small improvements to the original snap-style mouse trap design.

While we have
no table of mouse traps handy, there is an
excellent record in the U.S. Patent Office that documents, illustrates and
patents each important improvement in its evolution. For example, today some Victor
brand mouse traps [4,5,6] are sold “Pre-Baited” with yellow plastic
bait parts shaped like little slices of Swiss cheese impregnated with a chemical
scent to attract mice, a high-tech improvement eliminating the need to supply
and load bait. But this chemical innovation is really just a small incremental
change to the previous stamped metal bait tray [7]. Everything else is
the same. If the plastic were soft enough to gnaw on and could be further
impregnated with a mouse-specific poison, perhaps the rest of the mechanical
parts of the mouse trap would become dispensable, converting it from a trap to
a poison. So minor innovations, like the new “Pre-Baited” trap, can accumulate
and obscure the details of the original mouse trap design.

Figure 2. Patent drawings of the pre-baited cheese-shaped bait pedal.

This is precisely
why we must go back in time and conduct some research to try to find the
original invention and an authentic artifact or illustration to comprehend its
design. The
original inventor of the mouse trap had no mouse trap to look at or think
about, just a problem to solve: how to trap and kill a wild animal busily
gnawing away at his food stores in the cold of winter. Would Behe’s questions
about irreducible complexity and the nature of the parts of the mouse trap hold
true for the original first mouse trap? Or would the original device betray
Behe with answers that would prove to be the undoing of the most fundamental
straw-man of the Intelligent Design argument?

The
snap-style mouse trap design is what Richard Dawkins [8] calls a meme, which
he defines as an original thought or first of a kind idea that has been copied
and repeated many times since its origin. The mouse trap idea began with the
very first such trap, and it is at the time of origin of this idea that the
analysis of the irreducibly complex features must be properly considered, not in the
countless copies that follow or their incremental changes to the original
design.

Let’s
look at the history of the mouse trap to understand this. Now, the modern mouse
trap is a very common item, simple enough to be understood mechanically, and
very easy to illustrate. Definitive statements can be made about today’s mouse
trap mechanism and its complexity. In his argument, Behe asks his readers to
imagine how well the modern version of a mouse trap would work missing one, two
or even more parts. But we know nothing of earlier designs, so this puts the
reader in the position of having to imagine a design history for the modern mouse
trap. For us to conclude that there could be no simpler form of the mouse trap,
based on the very sparse imaginary history we fabricate in our minds, is a
flawed way of thinking. Like the many versions of the Robin Hood story we have
seen, starring a broad range of stars, from an animated fox to Russell Crowe, there
can be a big difference between an imaginary history and a real history.

Since
old mouse trap designs are obscure, we must look up some of the mouse trap
patents and find old mouse traps in collections in order to follow them back to
their origin. There are thousands of illustrations of mouse traps filed with
the U.S. Patent Office and other patent offices worldwide. By looking back
through old patent documents, we can find many forms of mouse traps, some with
recognizable snap-style trap features, but also many others with springs in
strange places or iron parts where wood is expected. So it is better to
consider the information in the patent records and in actual artifacts as original
sources of information.

Who invented the mouse trap? A crowd
of inventors might say “I did!” each raising his hand. With over 4,400 U.S.
patents on mouse traps, many can claim to have invented a mouse trap. But who
was first? The mouse trap’s invention story is remarkably
convoluted and obscured by thousands of inventors all trying to “build a better
mouse trap”. More problematic is that the most successful American mouse trap
company, Woodstream, has perpetuated, in the popular media, a myth about the
invention of the mouse trap, attributing the modern mouse trap design mostly to
its founder, John Mast.

In fact there seem to be two separate
corporate mouse trap origin myths in publications, one American and the other
British. The good folk of Lititz, PA would point to their long-running company, Woodstream,
makers of the famous Victor brand mouse trap, and credit company founder John
Mast with the invention in 1899, which was patented in 1903 [9]. Although
some popular articles [10] correctly place the year of invention of the
Victor Brand mouse trap in 1894, that is not the date of the Mast patent. In an
interesting parallel, folks more familiar with the British “Little Nipper” mouse
trap would credit James H. Atkinson with the invention in 1897. His British patent
(GB 13277 of 1899) was sold in
1913 to a company named Proctor for 1000 British Pounds. Both Proctor and
Woodstream have been making mouse traps for a very long time, so it is no
surprise that each takes credit for the invention, though separated by vastly different
markets and an ocean.

Now, while Mast and Atkinson are the
inventor-founders of the two major mouse trap companies, the race to perfect
and market a cheap and reliable mouse trap was not new in the late 1890s, the
period in which they started their respective companies. Their own patent
documents show that they borrowed many design ideas from other trap inventors.
One earlier patented mouse trap design [11] is more similar to Woodstream’s
modern Victor brand design than any other, even closer to it than Mast’s own 1903
patent. It was the 1894 design of William C. Hooker of Abingdon, Illinois. Hooker founded
The Animal Trap Co. of Abingdon, Illinois with his invention. According to mouse
trap collector and expert Rick Cicciarelli, The Animal Trap Co. first marketed
their unmistakably modern looking mouse trap as the “Out O’ Sight” mouse and
rat trap in two different sizes.

This is important because the
snap-style mouse trap and the rat trap share an identical design; they are just scaled
copies of the same design, one mouse-sized and one rat-sized. Cicciarelli
tells us that the company that Mast had founded acquired Hooker’s design in 1905.
This explains why sometimes the Victor mouse trap is credited with being
invented in 1894 [11], the date of the Hooker patent, which became
their intellectual property.

So it was Hooker’s design that became the
modern mouse trap, but even his design was predated by earlier traps and
patents. And while Mast was indeed a master of turning the idea into a low-cost
product, the essence of the snap-trap was not his idea, nor even Hooker’s idea,
for that matter. The origin is found further back in time, back when trapping animals
for food was far more important than trapping pests like mice and rats.

To find the original mouse trap design
idea, let us try to define the essence of the mouse trap. How do we describe the
unmistakable part of the mouse trap that can be recognized by its structure as the
invention? The snap-style mouse trap is distinguished by a U-shaped bar that
travels 180 degrees from one side of a thin rectangular platform of wood, to
the other. It is powered by a coiled spring under torque, not by compression or
stretch. And it is triggered by the mouse moving a bait platform, releasing a
catch bar, and allowing the torque spring to move the U-shaped bar forward with
deadly force. A rat trap is just a bigger version of the mouse trap. So we must
also consider that the very first design could have been either in the form of
a mouse or a rat trap. So let us restate the “Who invented the mouse trap?”
question to reflect the details of its successful design idea: “Who invented
the snap-style mouse/rat trap with wooden base, bait trigger mechanism,
perpendicular torque-spring and U-shaped kill bar traveling over 180 degrees?”
Hooker’s patent satisfies this description. Are there older ones?

After examining snap-style mouse trap
patents, what varies most in the design, generally speaking, is the trigger
mechanism and bait tray. This is the part that holds the cheese that the mouse
or rat gnaws on that triggers the trap to snap shut. The bait tray has been changed
and refined over time by many inventors, including the modern fake plastic
chemical-scented Swiss cheese version. You can see the difference in triggers
on the Victor mouse traps for sale today [4,7] as compared to Hooker’s
patent [11]. The modern bait tray and trigger is much more sensitive
than the original design and stamped from a single piece of metal. We can see a
number of small improvements in trigger sensitivity, as Hooker’s design worked
only when pressed down. Hooker’s bait tray and trigger was made from three
separate pieces of metal, but the improved single-part bait tray and trigger
senses chewing motion in any direction, whereas earlier ones could be defeated
by clever mice who knew how to chew in the right direction. I imagine that sideways-gnawing
mice may have escaped with their lives and a cheek full of bait cheese, and
gone on to breed even more sideways-gnawing mice, were it not for these
numerous slight modifications to the bait pedal and trigger mechanism over time.

Now let us go back further, as several
patents predate Hooker’s 1894 patent. The C.B. Trumble patent of 1892 [12]
and the W. H. Castle patent of 1888 [13] are key examples of what
patent lawyers call “prior art”. These older designs retain the essence of the
snap trap design as I defined it above, but with variations that include
different triggers and cast iron bases instead of wooden bases, and double
torque springs instead of a single torque spring, in the case of the Castle
patent. John Mast referred to both of these earlier patents in the text of his
own 1903 patent [9]. In addition, Mast referred to two patents from 1855
that were granted to Lucien B. Bradley for rat trap designs [14,15], but
he avoided mention of the Hooker patent [11] in his filing, instead
drawing attention to the older designs, which helps our quest.

The 1855
Bradley rat trap patents [14,15] describe traps powered by springs that were compressed
and released and were oriented in a straight line to the bait, rather than in
the perpendicular torque spring arrangement. This may have been an effort to
alter an earlier torque spring design as a patent workaround. With the 1855
patents of Bradley, we are close to the origin point, and nearly half a century
before the date of the Mast patent which is so commonly mistaken for the
original.

Rick Cicciarelli
is an avid collector and authority on the history of antique mouse and rat
traps, and he has read through all the US patents in his quest as a
collector. Cicciarelli owned a prized snap-style rat trap marked “JOB JOHNSON
BROOKLYN NY PATENTED 1847”, which he bought at an antique store as a boy for
only $30. He believes it is the earliest flat snap-trap design. In his correspondence
to me, Rick says, “Generally speaking, that flat snap trap design is attributed
to Hooker. However, I have a flat snap rat trap which was patented in 1847, and
I believe THIS trap to be the earliest flat snap trap design.”

Figure 5. Rat trap, marked “JOB JOHNSON BROOKLYN NY PATENTED 1847”. Photo courtesy Rick Cicciarelli.

Upon
examination, the antique trap of Job Johnson does fully satisfy our question of
the essence of the mouse trap design idea, with wooden base, U-shaped kill bar,
and torque spring. But the patent of 1847 to Job Johnson [16], assigned
the very early U.S. Pat. Number 5,256, was not the design of a rat trap. Rather,
it was one of the first three novel designs for a spring-loaded fish hook.
Johnson’s fish hooks are gloriously illustrated in photographs in a collector’s
book on spring-loaded fishing tackle and fish traps by William Blauser and
Timothy Mierzwa [17]. The text of patent number 5,256, being a sworn and
witnessed statement to the U.S. government, tells us part of Job Johnson’s
story:

Be it known that I, JOB JOHNSON, of
the city of Brooklyn, State of New York, fish hook manufacturer, a native of
England, having been resident more than one year next preceding the date hereof
in the United States, and having duly declared my intention to become a citizen
thereof, have invented and made and applied to use certain new and useful
improvements in the constructive application, arrangement, and combination of
mechanical means whereby the bite of a fish at the bait on a hook causes a
crooked barb-dart to strike into and hold the nose, head or gills of the fish,
independently both of the line and of the person holding the line, and the
general arrangement of which, when of a proper size, may be applied to the capture of any kind of fish or of any
destructive or ferocious animal, and for which improvement I seek Letters
Patent of the United States;16

Job Johnson,
as it turns out, was a prodigious inventor. His legacy was nearly lost to
history but it has been recently rediscovered by Dr. Todd Larson, a historian
at Xavier University. Larson is an expert on Job
Johnson’s inventive legacy, which can be found in Larson’s book, The History of the Fish hook in America18.
According to Larson, Job Johnson was a prolific American inventor with 38 patents
ranging from fishing tackle to elevated railways, demonstrating his very broad
creative capabilities. Johnson got rich from a thriving automatic fish hook
manufacturing business. This success obviously put him in a position to explore
commercial designs for traps for other animals. His experience with springs and
wire, and his workshop, filled with springs and triggers from his work on the
spring-loaded fish hook, would have been the perfect place to experiment with
other forms of traps.

We know,
thanks to Dr. Larson’s research, that American farmers needed rat traps and
thought about using fish traps to catch rats, as is mentioned in an article in
the 1847 edition of The Prairie Farmer.
Entitled “A New Fish hook”19, the article described Job Johnson’s fish
hook and concluded, “Those who wish to catch rats have got the right machine
here.” So how did Johnson make the move from spring-loaded fish hooks to the
rat trap? According to Cicciarelli it is pretty obvious to the naked eye that
Job Johnson used his patented spring-loaded fish hook trigger design, stuffed the
spring-loaded fish hook mechanism in a hole in the wooden base, and rigged the
torque-spring with a vicious serrated U-shaped striking bar. In Cicciarelli’s
words:

You can't really see the mechanism of
the trap very well from the photo, but the bait hook is actually a long fish
hook, the end of which goes through the base and comes out at the rear to hook
into the jaw when the jaw is pulled back into the set position. It works just
like the spring hook.

So the Job
Johnson rat trap actually contained a copy of his spring-loaded fish hook as a bait
platform and trigger mechanism. As such, it was fully covered by the wording
and considerations in U.S. Patent 5,256 and so the rat trap was duly stamped
“PATENTED 1847”.

Johnson was
already a well-known fish hook maker, having started the first American effort
in their fabrication in 1843. While now banned as unsportsmanlike, spring-loaded
fish hooks were, in their day, an important way for working fishermen to
maximize their catches. Todd Larson devotes an entire chapter to Job Johnson’s
inventions and says that he was a man “whose hooks were so good they inspired
poetry.” Truly, Job Johnson’s fish hooks are mentioned repeatedly in the
400-line ballad “The Legend of the Great Tautog”, written by an anonymous author and
published on 23 October 1852 in The Spirit of
the Times. According to Dr. Larson, Job Johnson’s fish traps were
known to be capable of catching small game, including rats, simply by hanging
them above the ground by a string and baiting them.

Larson’s
careful research shows that Johnson’s patent 5,256 for the spring-loaded fish hook
was, in fact, an improvement on the first spring-loaded fish hook, invented in 1845
by a 16-year-old boy named George Washington Griswold18. Griswold’s
design was assigned to, and patented in 1846 by, entrepreneur Englebrecht and lawyer
Skiff17,19, and given U.S. Patent number 4,670. It was the first U.S. patent
involving a device to catch a fish17.

Griswold used
a flat spring to cause two hooks to close on the mouth of a nibbling fish. Job
Johnson improved the Griswold design with a more powerful contractile helical
spring, driving the two hooks together. Job Johnson was a natural at spring-making,
probably from his early training with iron wire fabrication for fish hooks. So
we know Johnson had the skills to design and assemble the parts of the first
rat trap. And we know that Johnson contemplated that other animals could be
trapped with his spring-loaded fish hook by the words in the patent text itself16.
Finally, we know that Cicciarelli’s prized artifact shows that Job Johnson fabricated
rat traps containing the exact same spring-loaded fish hook mechanism, with the
added torque spring and kill bar, and mounted on a flat piece of wood.

Well, this is
where we come to the end of the line of commercialized mouse and rat traps, where
we run out of artifacts and patents. Older inventions may have existed, but
they were not spread as ideas. There was a wave of innovation in the mid-1840s
involving a proliferation of patents in fish traps, popularized by many
articles in the early editions of Scientific
American18, one of the carriers of design ideas in its day.
The starting point of all this innovation was the 16-year-old Griswold’s
first spring-loaded fish trap.

As I
mentioned, collectors Blauser and Mierzwa have a wonderfully illustrated book
showing these early spring-loaded fish hook designs17. Their photographs
show that Johnson’s first design was a spring-loaded fish hook from 1846
fabricated with three metal parts, three rivets and one spring (page 16), but no
corresponding 1846 patent was found matching this artifact. Interestingly, his
1847 patented device was altered to be made with four metal parts, four rivets and
one spring (page 21). So Job Johnson likely took the Griswold design and made
at least two successive, slight modifications to create a superior
spring-loaded fish hook design that would continue to be sold into the 1900s.

There is little doubt that both Johnson and Griswold had other prototypes made in
between these that failed to work. While most of the failed prototypes of the
earliest fish traps and rat traps are lost and long forgotten, the additional
1846 fish trap of Job Johnson is evidence of a prototyping process prior to the
broad spread of the idea and more inventions in spring-loaded fish traps, mouse
and rat traps.

Thanks to Cicciarelli,
we can conclude that Job Johnson was the earliest known inventor and original
spreader of the snap-style rat and mouse trap idea. Yet it was a branch of an
idea started by the young Griswold, a simple idea about how to trap a fish, which
was modified into a snap-style rodent-killing machine. Like a viral predator, spring-loaded
fish traps crossed the species barrier to become rat traps and mouse traps,
providing two independent and successful lineages of traps, one for trapping food
and the other for getting rid of destructive pests.

So finally, now we can go
back to the context of Behe’s steps1 to determine irreducible
complexity and apply them to the Job Johnson rat trap. According to Behe, the
first step is to specify the function of the system and all the system
components. The second step is to ask if all the components are required for
the system function. Well, the system function of the rat trap is to lure in
unsuspecting rats and immobilize them so they can no longer wreak havoc on
stored food supplies. So the system function is unchanged; it is just adapted
for a larger rodent.

Now, we also
know that the spring-loaded fish hook used inside the Job Johnson rat trap was
itself a standalone animal trap, so we can state conclusively that not all the
components are required for the system function. Importantly, this is where the
answer changes from yes to no. Not all parts are required for the entire system
to function. And in fact we can dispose of all but a single part and still
retain system function. The simple one-part animal trap, the fish hook, is
clearly visible as the bait holder in the Job Johnson rat trap. So one part of
the Job Johnson rat trap is a physical precursor that remains largely unchanged
throughout its design: the fish hook.

Griswold
started off with a design that duplicated the fish hook into a grabbing mechanism
in a configuration like the gripping talons of a bird of prey. Johnson’s intermediate
work shows increased part counts in his first two spring-loaded fish hook
designs. He added one part and one rivet, a step that increased complexity by a
small increment, to provide better leverage for the trigger mechanism. It is
clear from the historical accounts that people used spring-loaded fish hooks by
themselves to catch small animals. So the piece of wood is dispensable, as are
the torque spring and kill bar. It is just as likely that one could catch and
kill a rat with a baited barbed fish hook as one could a fish, provided one
could stay awake long enough to wait for a rat to bite the hook in the middle
of the night.

So we have now
uncovered that the irreducibly complex mouse trap is a conclusion made by a convenient
omission of a forgotten history. The original invention is reducible to a functioning
single-part animal trap, the fish hook, retaining the same system function
throughout the transition. The historical lineage of the snap-style mouse trap
comprises evidence showing that the mouse trap is an example of reducible
complexity, thereby disproving the notion that it was irreducibly complex
when it was originally created. The mouse trap voyage through time takes us
back to the fish hook. And as I show in Figure 1, on the current Victor mouse trap7, the
profile of the metal bait pedestal is, remarkably, still shaped with the same
curve as a vestigial fish hook.

Job Johnson is
the most likely inventor of the mouse trap design. Why don’t we know more about
Johnson? Dr. Larson’s book tells us how Johnson’s last few inventions and
investments were considered the work of a crackpot, as he suffered from
senility in his 80s18. His inventive reputation suffered greatly as
these failures mounted. Johnson’s senility left a poor impression on the
historical record of his later years, and his earlier successes were overlooked
as things like spring-loaded fish hooks fell out of popularity for being
unsportsmanlike. And so the story of the invention of the mouse trap may be
obscured by Mr. Johnson’s own illnesses later in life. Yet his highly
successful 1847 fish hook design continued well after the patent expired, sold
by the Sears and Roebuck and Montgomery Ward catalogues into the early 1900s. Only
two examples are known of the Job Johnson rat trap, and there are also very few
examples of the original Job Johnson spring-loaded fish hooks in the hands of
collectors. Oddly, the spring-loaded fish hook line of inventions starting with
Griswold was destined for extinction, while the early diverging line of rat and
mouse traps is still successful, and still being used today.

Recall that long
table and line of old mouse traps we were going to set up? Well, we know it goes
back to 1847 and we know that next to the first Job Johnson rat trap we must
put the first three spring-loaded fish hook designs, two of Johnson’s, and the
first from Griswold. The mouse trap line is a branch from another long line of
inventions – the spring-loaded fish hooks. Men invented machines to trap pests
at the same time they were inventing machines to trap food. At the very
beginning of the two lines of traps lies a single-part animal trap called the fish
hook, the manufacture of which was Johnson’s trade, a set of skills passed from
father to son.

So let me more boldly ask whether any other device of complexity
is, historically speaking, going to withstand the kind of scrutiny we just gave
to the mouse trap design? Of course I will not suggest the mouse trap evolved
without human intelligence. After all, it is a product of human design. The key
point of the historical study is this: when you stop and identify intelligent
human designers as individuals and
look carefully at how they design things, you see that they always apply
their intelligence in small doses of creativity in the context of prior
knowledge. Incremental additions to prior designs and prior knowledge are the
way humans achieve creative new designs. This is a principle recognized by the
Patent Office and the patent process. Small changes are how intelligent human beings
get to complexity. The true nature of human intelligent design is that humans
design complex systems in a step-wise approach, not in an all-at-once magical fashion.
Simple designs, such as the mouse trap bait pedestal/trigger part, are made up
from small improvements over time, borrowing from earlier ideas. The patent for
the silly Swiss-cheese-shaped and scented bait pedal5, invented in 1981 and made
ornamental in 19896, makes reference to an earlier cartoonish mouse trap with decorative
holes from a design patented in 194821. Again, simple ideas combine into
a minor modification4 to an existing mouse trap design7.

Inventions
accumulate new parts with variations on existing design memes just as evolving
creatures accumulate new genes based on variations of prior ones. The key is that
the knowledge is stored, either in memories, in writing or in the form of the
artifact itself. In living creatures the memory of the prior prototype is
simply the DNA, the gene which encodes the part. Each gene is an accumulated
store of information about the successful small innovations in the mechanical
parts, the proteins, within a cell. No intelligence is required in evolution
because the memory of prior prototypes does not require an intelligent being to
extract the information out and copy it and make small modifications to it.

Now, in the
case of human design, many other examples exist, and the patent
record holds many of the forgotten details. I invite you to scrutinize the
patent history of any object put forward as irreducibly complex. So far there are
no examples of spontaneous intelligent designs of inherent complexity that
withstand proper scrutiny, and if there are any that appear to, it is simply
because the historical evidence of prototypes and trial and error has not
been preserved to tell the story. Even the modern design complexity research
community itself acknowledges that complex product design is a process of small
step-by-step improvements to prior designs22, and that this is the
preferred way our own most intelligent engineering teams pool their efforts to
approach complex design tasks.

My conclusion is that human intelligent design is itself a
process more similar to the evolutionary process than it is different. Only a
few design ideas are successful over the long term. Even successful ones, like
the spring-loaded fish hook, can become obsolete and disappear in a process
rather like extinction, while another related design, the mouse trap, thrives. One
of the benefits of having the Patent Office and its process is that these design
ideas are captured for all to examine and modify and reproduce. Patents are the
collective DNA of our human innovative genius and the genome of the industrial
and technological revolutions.

18. Todd E.A. Larson. The History of the Fish hook in America: An Illustrated Overview of the Origins, Development, and Manufacture of the American Fish hook. Volume I. 2007, Whitefish Press, Duluth, MN, USA.

19. A New Fish hook, in the 1847 edition of The Prairie Farmer, op cit18.