Glossary

A

absolute path:
A path that refers to a
particular location in a file system. Absolute paths are usually
written with respect to the file system's root directory, and begin with
either “/” (on Unix) or “\” (on Microsoft
Windows).
See also:
relative path.

absolute reference:
A spreadsheet cell reference that
is not automatically adjusted when a formula is moved from one
location to another. Absolute references are created by putting
"$" in front of the row and/or column designation, as in
$C$4.
See also:
relative reference.

abstract data type (ADT):
A specification of a set of values, and the operations that can
be performed on them. The term “abstract” means that
the implementation of the ADT is hidden from other code.

abstract syntax tree (AST):
A data structure that represents the structure of a program or
program fragment. Its leaves are literals, such as numbers and
variable names, while its internal nodes represent higher-level
structures, such as loops and expressions.

access control:
A way to specify who has permission to view, edit, delete, run,
or otherwise interact with something, by explicitly listing what
rights each individual or group has. This is in contrast with the
standard Unix authorization mechanism, which only
allows a fixed set of privileges to be listed for owner, one
group, and everyone else.

access control list (ACL):
A list that explicitly describes who can do what to a file,
directory, or other entity. ACLs permit finer control over a
computer's resources than Unix's classic user/group/all system,
but are more complicated to administer.

ACID:
An acronym for atomic, consistent, isolated, and durable, which
are the properties that a database
transaction must guarantee.

aggregate:
To create a single value by combining multiple values, e.g. by
adding or averaging.

algorithmic complexity:
The rate at which the work performed by an algorithm grows as a
function of problem size, ignoring constant factors. Algorithmic
complexity is usually expressed using O-notation; for
example, the time required to compare each value in a list to each
other value is O(N²).
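As a rough illustration, the sketch below counts the comparisons needed to compare each value in a list to each other value. The function name count_comparisons is made up for this example. Doubling the list length roughly quadruples the count, which is the quadratic growth described above.

```python
def count_comparisons(values):
    """Count how many times one value is compared with another value."""
    count = 0
    for i, left in enumerate(values):
        for j, right in enumerate(values):
            if i != j:
                _ = left == right   # the comparison itself
                count += 1
    return count

small = count_comparisons(list(range(10)))   # 10 * 9 comparisons
large = count_comparisons(list(range(20)))   # 20 * 19 comparisons
```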

alias:
A second (or subsequent) reference to a single piece of data.
Aliasing can make programs more difficult to understand, since
changes made through one reference “magically” affect
the other.
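A minimal Python sketch of aliasing, using made-up variable names: two names refer to one list, so a change made through either name is visible through both, while a copy is unaffected.

```python
# Two names bound to the same list object: 'alias' is an alias of 'original'.
original = [1, 2, 3]
alias = original
alias.append(4)                 # the change is visible through 'original' too

# A copy is a separate object, so changing it leaves 'original' alone.
independent = list(original)
independent.append(5)
```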

analysis and estimation (A&E):
The step in a software development process in which developers
figure out how they're going to implement the desired features,
and how long they expect it will take. The term is also applied
to the summary documents this process produces.

anchor:
An element of a regular
expression that matches a location, rather than a sequence
of characters. «^» matches the beginning of a line,
«\b» matches the break between word and non-word
characters, and «$» matches the end of a line.
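The three anchors can be tried with Python's re module. This is only a sketch; the sample text is invented, and re.MULTILINE is needed so that «^» and «$» match at every line rather than only at the ends of the whole string.

```python
import re

text = "cat concatenate\ncat"

# ^ matches at the start of each line.
line_starts = re.findall(r"^cat", text, flags=re.MULTILINE)

# \b limits the match to whole words, so "concatenate" is skipped.
whole_words = re.findall(r"\bcat\b", text)

# $ matches at the end of each line.
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)
```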

arc:
A connection between two nodes in a
graph. Arcs may be directed (i.e.,
unidirectional) or undirected (i.e., bidirectional).

assertion:
An expression which is supposed to be true at a particular
point in a program. Programmers typically put assertions in their
code to check for errors; if the assertion fails (i.e., if the
expression evaluates as false), the program halts and produces an
error message.
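For example, a Python function might assert that its input is non-empty before using it. The function below is a hypothetical illustration; note that Python skips assert statements when run with the -O option.

```python
def mean(values):
    # The assertion checks an assumption the rest of the function relies on.
    assert len(values) > 0, "mean() requires at least one value"
    return sum(values) / len(values)

average = mean([1, 2, 3])
```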

B

basic authentication:
A simple username/password authentication mechanism that is part
of the HTTP standard. It sends passwords as cleartext (actually,
as base-64 encoded text), so it should never be used.
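The sketch below shows why base-64 encoding offers no protection: anyone who sees the header can decode it. The username and password are invented for the example.

```python
import base64

credentials = "alice:secret"   # hypothetical username and password

# What the client sends in the Authorization header.
encoded = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
header_value = "Basic " + encoded

# What an eavesdropper can recover without knowing any key.
recovered = base64.b64decode(header_value.split()[1]).decode("utf-8")
```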

Big Design Up Front (BDUF):
A somewhat pejorative term applied to development processes that
rely on careful up-front analysis and design to prevent errors
from occurring.

big-endian:
Having the most significant byte in the memory location with
the lowest address. In a big-endian system, the integer
0x12345678 is stored as [0x12, 0x34, 0x56, 0x78].
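Byte order can be checked in Python with int.to_bytes, which lays out an integer in either order. This is a small sketch, with the little-endian layout shown for contrast.

```python
value = 0x12345678

# Most significant byte first (lowest address).
big = list(value.to_bytes(4, byteorder="big"))

# Least significant byte first (lowest address).
little = list(value.to_bytes(4, byteorder="little"))
```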
See also:
little-endian.

binary data:
Non-textual data. All data is “binary”, in the
sense that it's represented as 1's and 0's, but many tools
distinguish between 1's and 0's that represent printable
characters, and 1's and 0's that don't.

binary mode:
Python (and some other programming languages) automatically
convert Windows-style line endings (carriage return followed by
newline) to Unix-style line endings (newline only) when reading
and writing files. This is appropriate for textual data, but not
for binary data, such as images.
If the file is in binary mode, this conversion is not done.
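A sketch of the difference in Python 3, using a temporary file whose name is chosen only for this example: binary mode returns the stored bytes unchanged, while the default text mode translates line endings as it reads.

```python
import os
import tempfile

# Write Windows-style line endings as raw bytes.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "wb") as handle:
    handle.write(b"first\r\nsecond\r\n")

# Binary mode returns the bytes exactly as stored.
with open(path, "rb") as handle:
    raw = handle.read()

# Text mode (the default) translates "\r\n" to "\n" while reading.
with open(path, "r", encoding="ascii") as handle:
    text = handle.read()
```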

binary search:
A search technique which divides the values being searched in
half at each step, just as a person would go to the middle of a
phonebook, then the middle of either the upper or lower half, and
so on when looking for a name. The algorithmic complexity of
binary search is O(log₂ N).
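One way to write this in Python is sketched below. The function name and the choice to return -1 for a missing value are assumptions made for the example, and the input list must already be sorted.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2          # middle of the remaining range
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            low = mid + 1                # discard the lower half
        else:
            high = mid - 1               # discard the upper half
    return -1

found = binary_search([1, 3, 5, 7, 9], 7)
missing = binary_search([1, 3, 5, 7, 9], 4)
```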

bitwise operations:
Operations that act at the level of the bits making up a value,
rather than on what those bits mean. The four most common bitwise
operations are and, or, xor, and not.
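In Python these four operations are written &, |, ^, and ~. The sketch below applies them to two four-bit values; the result of ~ is masked because Python integers are not limited to a fixed width.

```python
a, b = 0b1100, 0b1010

and_result = a & b           # bits set in both values
or_result = a | b            # bits set in either value
xor_result = a ^ b           # bits set in exactly one value
not_result = ~a & 0b1111     # bits of a flipped, kept to four bits
```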

blacklist:
A list of addresses from which email will not be accepted, which
is part of an “allow unless forbidden” authorization policy.
See also:
whitelist.

blog:
Short for weblog; an on-line diary or
forum to which authors append new content. Unlike mailing lists, blogs use a publish-subscribe model: readers
pull content when they want it, rather than having it sent to
them.

boilerplate:
The standardized parts of a family of programs that don't change
from instance to instance.

branch:
A separate line of development managed by a version control system.
Branches help projects manage incompatible sets of changes that
are being made concurrently.
See also:
merge.

breakpoint:
A marker put in a program by a debugger that causes it to pause so that the
program's internal state can be inspected (and possibly
modified).

buffer:
A block of memory used to store values temporarily in order to
“smooth out” communication.

buffer overflow attack:
A method used to attack programs (primarily those written in C and
C++) that injects code by writing past the end of a buffer.

build tool:
A piece of software, such as Make, whose main
purpose is to rebuild software, documentation, web sites, and
other things after changes have been made.

burn rate:
The rate at which project tasks are actually being completed.
Comparing a project's actual burn rate with its schedule tells
developers when to start scaling back their plans, and/or moving
their deadlines.

C

cache:
A data structure, or portion of a disk, that stores temporary
copies of values. Caches are normally used when fetching items is
expensive: keeping copies of values that are likely to be needed
again close at hand can make a program much faster, at the cost of
requiring extra synchronization effort.
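As one illustration, Python's functools.lru_cache keeps the results of earlier calls in memory. The function below is a hypothetical stand-in for an expensive fetch or computation.

```python
import functools

@functools.lru_cache(maxsize=None)
def expensive_square(n):
    # Stand-in for a slow computation or a remote fetch.
    return n * n

first = expensive_square(12)     # computed, then stored in the cache
second = expensive_square(12)    # served from the cache
stats = expensive_square.cache_info()
```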

call stack:
A data structure used to keep track of functions that are
currently being executed. Each time a function is called, a new
stack frame is put on the top of
the stack to hold that function's local variables. When the
function returns, the stack frame is discarded.
See also:
heap,
static space.

cell range:
An expression specifying a contiguous block of cells in a spreadsheet. The cell range
C4:E5, for example, includes all the cells in the rectangle
bounded by C4 (upper left corner) and E5 (lower right corner).

chain:
A sequence of method calls, each of which uses the result of
the previous one, as in "x".upper().center(5).
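The chain above is equivalent to calling the methods one at a time, as this short sketch shows (the intermediate variable names are made up):

```python
# Step by step: each call uses the result of the previous one.
upper = "x".upper()          # "X"
centered = upper.center(5)   # "X" padded to width 5

# The same thing written as a chain.
chained = "x".upper().center(5)
```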

code review:
The act of inspecting code to find errors, violations of style
guidelines, etc. While labor intensive, code reviews are a good
way to transfer knowledge between project members, and can be more
effective at finding bugs than testing.

collision:
A situation in which two or more values are mapped to the same
location by a hash function.
Hash tables typically handle this by
storing colliding values in a sublist.
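A minimal sketch of the sublist approach, with hypothetical put and get helpers. It assumes CPython, where a small integer hashes to itself, so the keys 1 and 5 both land in bucket 1 of a four-bucket table.

```python
def put(buckets, key, value):
    """Store value under key, replacing any existing entry for key."""
    bucket = buckets[hash(key) % len(buckets)]
    for i, (existing_key, _) in enumerate(bucket):
        if existing_key == key:
            bucket[i] = (key, value)
            return
    bucket.append((key, value))      # colliding keys share the sublist

def get(buckets, key):
    """Return the value stored under key, or raise KeyError."""
    for existing_key, value in buckets[hash(key) % len(buckets)]:
        if existing_key == key:
            return value
    raise KeyError(key)

buckets = [[] for _ in range(4)]
put(buckets, 1, "one")
put(buckets, 5, "five")              # collides with key 1 (assumes CPython)
```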

conditional breakpoint:
A breakpoint that only causes the
program to pause under certain conditions. For example, a debugger may specify that the program is to
pause only when a certain function parameter is an empty string,
or when a loop index is greater than a specified value.

comma-separated values:
A format for representing tabular data. Each row in the table
is represented by a line of text; the values in that row are
separated by commas.
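Python's csv module reads this format. The sketch below parses a small invented table held in a string; note that every value comes back as text.

```python
import csv
import io

text = "name,age\nAda,36\nAlan,41\n"

# Each line of text becomes one row; commas separate the values.
rows = list(csv.reader(io.StringIO(text)))
```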

command-line flag:
A terse way to specify an option or setting to a command-line
program. By convention, Unix applications use a dash followed by
a single letter, such as -v, or two dashes followed by a
word, such as --verbose, while DOS applications use a
slash, such as /V. Depending on the application, a flag
may be followed by a single argument, as in -o
/tmp/output.txt.

component object model:
A software architecture that specifies how components communicate
with each other, without specifying how they are implemented.
Microsoft's COM is the most widely used desktop example; modern
web services are increasingly emulating its most important
features.

core dump:
A file containing a byte-for-byte representation of the contents
of a program's memory. On some operating systems, programs produce
core dumps whenever they terminate abnormally (e.g., try to divide
by zero, or access memory that is out of bounds). Core dumps are
often used as the basis for post
mortem debugging.

current working directory:
The directory that relative
paths are calculated from; equivalently, the place
where files referenced by name only are searched for. Every
process has a current working
directory. The current working directory is usually referred to
using the shorthand notation . (pronounced
“dot”).

cursor:
A pointer into a database that keeps track of outstanding
transactions and other operations.

D

data scrubbing:
Reformatting, rescaling, or otherwise cleaning up data to make it
easier to process.

dead code:
A block of code which can never be reached, such as the body of
an if statement whose conditional expression is always
false. Dead code often occurs when programmers modifying an
inherited program leave something in because they're not sure it's
safe to take out.

deadlock:
Any situation in which no one can proceed unless someone else
does first (analogous to having two locked boxes, each of which
holds the key to the other).
See also:
race condition.

debugger:
A computer program that is used to control and inspect another
program (called the target
program). Most debuggers are symbolic debuggers
that show the target program's state in terms of the variables
that the programmer created, rather than showing the raw contents
of memory.

declarative:
A programming system in which the relationships between values are
stated, rather than the algorithm used to compute or update those
values. All widely-used programming languages are imperative, but build tools like Make,
spreadsheets, and some high-level
programming languages, are declarative.

decorator:
An advanced programming construct in Python that allows one
function to wrap or modify another.
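A small sketch of a decorator that counts how often the wrapped function is called. The names count_calls and square are invented for the example.

```python
import functools

def count_calls(func):
    """Wrap func so that each call is counted."""
    @functools.wraps(func)           # keep the wrapped function's name
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls
def square(x):
    return x * x

results = [square(3), square(4)]
```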

denial of service (DoS):
An attack designed to overwhelm a system so that it cannot service
legitimate requests. DoS does not destroy data, or reveal
secrets, but making a site or service unavailable can be just as
damaging.

dependency:
In a build system, a file whose state some other file depends
on. If any of a file's dependencies are newer than the file
itself, the file must be updated. A file's dependencies are also
called its prerequisites.
See also:
action,
target.

directed graph:
A graph whose arcs have a direction, i.e., if an arc connects two nodes A and B, then it is possible to reach
B from A, but not necessarily possible to reach A from B.
Directed graphs are often used to visualize dependencies in build systems.

directory tree:
File system directories are normally organized hierarchically:
each directory except the root has a single parent, and each
may have zero or more children. This means that directories may
be viewed as a tree. Since files may not contain directories or
other files, they are always leaf nodes of this tree.

docstring:
Short for “documentation string”, this refers to
textual documentation embedded in Python programs. Unlike
comments, docstrings are preserved in the running program, and can
be examined in interactive sessions.
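For instance, a function's docstring is available at run time through its __doc__ attribute (and through help() in an interactive session). The function here is hypothetical.

```python
def area(width, height):
    """Return the area of a rectangle."""
    return width * height

documentation = area.__doc__
```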

document:
A well-formed instance of XML. Documents
can be represented as trees (using DOM), stored as files on disk,
etc.

drive:
A disk drive is a piece of computer hardware used to store data
on a rotating disk. In older operating systems, each drive was a
separate file system;
modern versions of Microsoft Windows still use this notion, placing
one or more file systems on
each physical drive, and giving each a separate drive letter (such
as the familiar C:).

driver:
A software module designed to communicate with an external device
or software package. A device driver is a piece of software that
can control a piece of hardware; a database driver is one that
knows how to open connections and send commands to a particular
database manager.

duck typing:
An informal name for dynamic type systems that rely on
objects being able to do the same things, rather than on inheritance or formal specification of
properties. The term comes from the saying, “If it walks
like a duck, and quacks like a duck, it's a duck.”
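In the sketch below, two unrelated classes (both invented for the example) can be passed to the same function because each provides a speak method; no shared base class is involved.

```python
class Duck:
    def speak(self):
        return "Quack"

class Robot:
    def speak(self):
        return "Beep"

def make_it_speak(thing):
    # Works for any object that has a speak() method.
    return thing.speak()

sounds = [make_it_speak(Duck()), make_it_speak(Robot())]
```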

E

environment variable:
A named value associated with a running process by the
operating system. Typical environment variables include
HOME (the user's home directory) and PWD (the
process's present working directory). Environment variables are
typically used to specify things that many applications may want
to know, or to provide default configuration values.
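Python exposes the current process's environment variables through os.environ. In this sketch, GLOSSARY_DEMO is a made-up variable, and a fallback is given for HOME because it is not set on every system.

```python
import os

# HOME may be missing on some systems, so supply a fallback.
home = os.environ.get("HOME", "(not set)")

# GLOSSARY_DEMO is a hypothetical variable used only for this sketch.
os.environ["GLOSSARY_DEMO"] = "enabled"
demo = os.environ.get("GLOSSARY_DEMO")
```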

epoch:
The moment from which times are measured. On Unix, the epoch
is midnight, January 1, 1970; on Windows, the epoch is January 1,
1601 (further proof that Microsoft takes backward compatibility
very seriously).

escape sequence:
A sequence of characters that represents some other character
or special entity. "\t" and "\n" are escape sequences
in normal Python strings that represent tab and newline characters
respectively; "&lt;" and "&amp;" are escape
sequences in HTML and XML that represent the less than sign and
ampersand.
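Both kinds of escape sequence can be seen from Python. This sketch uses the standard html module for the HTML escapes; the sample strings are invented.

```python
import html

# "\t" is one tab character, not a backslash followed by "t".
tabbed = "a\tb"

# Special characters are replaced by escape sequences, and back again.
escaped = html.escape("a < b & c")
unescaped = html.unescape("&lt;tag&gt; &amp; more")
```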

event-driven programming:
A style of programming in which a framework triggers events in the user's
program. Event-driven programming is used by most graphical user
interfaces, and by CGI programs.

exception:
An object that represents an error condition. As a program
executes, it creates a stack of exception handlers. When an
exception is raised, the
program searches this stack for the top-most handler, which catches and handles the exception.
Exceptions typically contain information such as the file and line
where the error occurred, the type of the error, and an error
message.
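A short Python sketch with hypothetical function names: the error is raised inside parse_age, and the nearest enclosing handler, in safe_parse_age, catches it.

```python
def parse_age(text):
    # int() raises ValueError when text is not a valid integer.
    return int(text)

def safe_parse_age(text):
    try:
        return parse_age(text)
    except ValueError as error:
        # The nearest enclosing handler catches the exception.
        return f"could not parse age: {error}"

good = safe_parse_age("42")
bad = safe_parse_age("forty-two")
```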

F

feature creep:
Changes in the aims or scope of a project over time. The usual
result is that everyone spends so much time rewriting code that
the project never moves closer to completion.

file system:
A set of files, directories, and I/O devices (such as
keyboards, screens, printers, and so on). A file system may be
spread across many physical devices, or many file systems may be
stored on a single physical device. The operating system will only allow
some file operations (such as copying, or creating symbolic links
or shortcuts) within a file system.

filename extension:
The portion of a file's name that comes after the final
“.” character. By convention, this identifies the
file's type: .txt means “text file”,
.png means “Portable Network Graphics file”,
and so on. These conventions are not enforced by most
operating systems: it is perfectly possible to name an MP3 sound
file homepage.html. Since many applications use filename
extensions to identify the MIME
type of the file, misnaming files may cause those
applications to fail.
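In Python, os.path.splitext separates a name at its final “.” character, as sketched here; only the last suffix counts as the extension.

```python
import os.path

stem, extension = os.path.splitext("homepage.html")

# Only the part after the final "." is treated as the extension.
_, last_only = os.path.splitext("archive.tar.gz")
```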

filter:
A program that transforms a stream of data. Many Unix
command-line tools are written as filters: they read data from
standard input, process
it, and write the result to standard output. Image
processing applications are often constructed by connecting
filters to one another.

fixture:
The particular configuration of a system that is the subject of
a unit test. It is a good practice
to create a fresh fixture for each test, so that the actions and
outcomes of early tests cannot affect later ones.

framework:
A library, or set of libraries, that implements the generic
parts of a family of applications. Developers customize the
framework for a particular application by replacing generic
placeholders with more specific code.

G

gold plating:
Adding more features to the system than it needs, or making parts
of it much more elaborate than is required.

graph:
A mathematical structure that consists of nodes connected by arcs.
Graphs may be directed (i.e., the arcs are unidirectional) or
undirected (i.e., the arcs are bidirectional), and are used to
represent everything from program structure to bus routes.

H

hash code:
The output of a hash
function. Typically, a hash code is a seemingly-random
integer, which is then used to determine where to put or look for an
object in a hash table.

hash function:
A function which takes an object as its input, and produces an
integer value as its output. Good hash functions produce outputs
that are as random as possible, i.e., they have the property that
different inputs are likely to produce different outputs.

hash table:
A data structure which allows programs to look up objects by
value, rather than by location. Hash tables do this by using
a hash function to calculate
seemingly-random identifiers for values, and using those as
indices into an array. Under normal conditions, it takes
constant time to find a value in a hash table.

heap:
An area of memory out of which a program can dynamically
allocate blocks of various sizes in order to store values.
See also:
call stack,
static space.

heisenbug:
A bug that hides when you are looking for it. Heisenbugs can arise in
sequential programs (for example, adding a printf call to a
C program may move things around in memory so that the bug is no
longer triggered), but are much more common in concurrent programs.

hexadecimal:
A base-16 numeric representation in which the letters A-F (or
a-f) are used to represent the “digits” 10-15. The
decimal integer 61 is 3D in hexadecimal:
3×16¹ + D(=13)×16⁰.
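The conversion can be checked in Python, as in this brief sketch:

```python
from_hex = int("3D", 16)              # parse a hexadecimal string
to_hex = format(61, "X")              # format an integer as hexadecimal
by_hand = 3 * 16**1 + 13 * 16**0      # the sum written out
```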

hijack:
To take control of a connection between a user and a web
application after the user has authenticated, e.g., to impersonate
a user after he or she logs in.

HTTP header:
A name/value pair at the start of an HTTP request or response.
Unlike dictionary keys, names are not required to be unique.

I

idiom:
A manner of expression commonly used by native speakers of a
language. A programming language's idioms are the ways that most
programmers habitually express their ideas.

immutable:
Unchangeable. The value of immutable data cannot be altered
after it has been created.
See also:
mutable.

imperative:
A programming system in which the steps taken to calculate values
are specified explicitly. All widely-used programming languages
are imperative.
See also:
declarative.

in-place operator:
An operator such as += that provides a shorthand
notation for the common case in which the variable being assigned
to is also an operand on the right hand side of the assignment.
The statement x += 3 means the same thing as x = x +
3.

inner join:
A join in which rows are combined only
where values in corresponding columns satisfy some condition
(usually equality).

invariant:
An expression whose value doesn't change during the execution
of a program. For example, an invariant property of a loop
indexed by a variable i might be that the value of the
variable M is always greater than or equal to the values of
the array elements whose indices are less than i.
See also:
pre-condition,
post-condition.

instruction pointer:
A register that points at either
the instruction the program is currently executing, or the one
that it is to execute next (depending on the computer). When a
function is called, the instruction pointer's value is copied onto
the call stack, along with the
values of the function's parameters, so that the program can
return to the point of the call when the function finishes.

Integrated Development Environment (IDE):
A program that combines several software development tools into
one. Typically, an IDE contains a “smart” editor
(that automatically indents and colorizes code), a build system
(for languages that need to be compiled), a class browser, a debugger, and a GUI designer.

invert:
To invert a dictionary is to
swap its keys and values; in mathematical terms, this is the same
as inverting the discrete function that the dictionary
represents. Any inversion algorithm must deal with the fact that
values are not guaranteed to be unique.
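One possible way to handle repeated values, sketched below, is to map each value to a list of all the keys that had it. The function name is an assumption of the example, and other policies (such as raising an error on duplicates) are equally valid.

```python
def invert(mapping):
    """Map each value to the list of keys that referred to it."""
    inverted = {}
    for key, value in mapping.items():
        inverted.setdefault(value, []).append(key)
    return inverted

result = invert({"a": 1, "b": 2, "c": 1})
```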

issue tracker:
A tool that keeps track of a project's outstanding work items, or
tickets; a to-do list for the project.
Issue trackers are sometimes called bug
trackers, since many of the items they record are bugs.

J

Java Server Page (JSP):
A Java-based template system, in which
programmers mix HTML and Java code in a single file. The file is
automatically translated to create a pure Java program that prints
pure HTML.

K

key:
The data that is used to index a particular entry in a dictionary. In a phone book, for example,
people's names are keys.

L

literate programming:
The practice of writing computer programs using a mix of
natural language, mathematics, and code, in order to make them
easier for human beings to read. Tools are used to translate
literate programs into code (for compilation and execution) and
documentation (for human consumption).

little-endian:
Having the least significant byte in the memory location with
the lowest address. In a little-endian system, the integer
0x12345678 is stored as [0x78, 0x56, 0x34, 0x12].
See also:
big-endian.

logging:
The act of recording program events in a systematic way so that
they can be examined later; a morally-defensible refinement of the
practice of using print statements to debug.

long integer:
An integer whose value takes up as many words of computer
memory as necessary. Most programming languages use 32 bits to
represent integers, which permits values in the range
-2³¹…2³¹-1 (or -2147483648 to
2147483647). In contrast, a language will allocate as many words
of computer memory to a long integer's value as that value needs.
The advantage is that very large values can be represented and
manipulated; the disadvantage is that operating on such values is
much slower than operating on native ones.
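Python 3 integers behave this way: a value simply grows past the 32-bit range instead of overflowing. A brief sketch:

```python
largest_32_bit = 2**31 - 1       # 2147483647
smallest_32_bit = -(2**31)       # -2147483648

beyond = largest_32_bit + 1      # no overflow: the integer just gets longer
huge = 2**100                    # needs 101 bits
```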

lookup table:
In a spreadsheet, a pair of rows or
columns in which the first is used to select a value from the
second.

M

metadata:
Literally, “data about data”, i.e., data such as a
format descriptor, which describes other data.

method:
In object-oriented programming, a function which is tied to a
particular object. Typically, each of
an object's methods implements one of the things it can do, or one
of the questions it can answer.

milestone:
A date by which some work has to be completed. Milestones are
usually given symbolic names, such as “First Beta
Release”, to accommodate date changes.

module:
A set of functions and variables that are grouped together to
make them more manageable. In Python, every source file is
automatically a module; in other languages, source files may
contain many modules, or a single module may span several
files.

multi-valued assignment:
An assignment statement which changes several values at once.
For example, a,b = 2,3 sets a to 2 and b to
3, while a,b = b,a swaps those variables' values.

Multipurpose Internet Mail Extensions (MIME):
An Internet standard for the format of email that also
specifies which filename suffixes should be used to identify
particular types of content (such as ".png" for a PNG-format
image).

mock object:
A stand-in for a real object that mimics behavior using a fixed
set of preprogrammed responses. Mock objects are used in testing
in order to isolate components, and/or improve performance.
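A sketch using Python's unittest.mock module. The sensor object and the describe_temperature function are hypothetical; the mock replaces a real sensor with one preprogrammed reading.

```python
from unittest.mock import Mock

def describe_temperature(sensor):
    # Code under test: it only needs something with a read() method.
    return "hot" if sensor.read() > 30 else "mild"

# The mock stands in for a real sensor and always reports 35.
fake_sensor = Mock()
fake_sensor.read.return_value = 35

result = describe_temperature(fake_sensor)
```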

mutable:
Changeable. The value of mutable data can be updated in
place.
See also:
immutable.

N

nimble language:
A language designed to facilitate rapid development, rather
than high performance or static safety checks. Nimble languages
are often called “scripting” or “agile”
languages, and include Python, Perl, Ruby, Tcl, Rexx, and
Scheme.
See also:
sturdy language.

node:
An element in a graph that may be
connected to other nodes by arcs.

normal form:
One of the conditions a database must satisfy to conform with best
practices.

O

object:
A combination of data and functions (called methods) that are meant to work together.
In most programming languages, objects are instances of classes; each object represents one
“thing” that the program can operate on.

operating system:
The software responsible for managing a computer's hardware and
other processes. Operating systems are also responsible for
making different computers present the same interface to other
programs, so that applications like word processors and compilers
don't have to be re-written each time a new generation of chips
comes out. Popular desktop operating systems include Microsoft
Windows, Linux, and Mac OS X.

operator overloading:
Redefining the behavior of a built-in operator, such as +,
by overriding a specially-named
method. C++ and Python permit it; Java does not.

P

parent directory:
The directory “above” a particular directory;
equivalently, the directory that “contains” the one in
question. Every directory in a file system except the root must have a unique parent. A
directory's parent is usually referred to using the shorthand
notation .. (pronounced “dot dot”).

path:
A non-empty string specifying a single file or directory.
Paths consist of zero or more directory names, optionally followed
by a filename. Directory and file names are separated by
“/” (on Unix) or “\” (on Microsoft
Windows). If the path begins with this character, it is an
absolute path; otherwise,
it is a relative path.
On Microsoft Windows, a path may optionally begin with a drive letter.

pattern rule:
In Make, a rule that specifies a general way to
manage an entire class of files. For example, a pattern rule
might specify how to compile any C file, rather than just a
particular C file. Pattern rules typically make use of automatic variables.

prerequisite:
In a build system, a file whose state some other file depends
on. If any of a file's prerequisites are newer than the file
itself, the file must be updated. A file's prerequisites are also
called its dependencies.
See also:
action,
target.

process:
A running instance of a program, containing code, variable
values, open files and network connections, and so on. Processes
are the “actors” that the operating system manages;
typically, the OS runs each process for a few milliseconds at a
time to give the impression that they are executing
simultaneously.

program slice:
The subset of a program's statements which can affect the value
of a particular variable at some point in a program.

publish-subscribe:
A technique for sharing content, in which an author makes the
material available, and readers download it when they want it
(rather than having it sent to them automatically).
Publish-subscribe is sometimes called “content pull”,
to distinguish it from the “content push” model of
mailing lists.

Q

query:
A database operation that reads values, but does not modify
anything. Queries are expressed in a special-purpose language
called SQL.

R

race condition:
A situation in which the final state of a system depends on the
order in which two or more competing processes modify that state.
For example, if two people make changes to a shared file,
the final contents of the file depend on who saves their changes
last. Race conditions are usually bugs, and are notoriously hard
to track down.

refactor:
To rewrite or reorganize software in order to improve its
structure or readability [Fowler 1999].

reference counting:
Keeping track of the number of references to an object while a
program is running, so that it can automatically be destroyed when
it is no longer in use. Reference counting is an easy way to do
garbage collection, but
isn't guaranteed to collect all objects: if A and B refer to each
other, but nothing else refers to them, their reference counts
will not be zero, and they will not be recycled.

referential integrity:
The internal consistency of values in a database. If an entry in
one table contains a foreign key,
but the record that key is supposed to
identify doesn't exist, referential integrity has been violated.

reflection:
Having a program treat itself as data, i.e., examine or manipulate
its own state.

register:
A small amount (typically only 4 or 8 bytes in size) of very
fast memory that is built into a microprocessor. Most modern
computer architectures only operate on values in registers; data
must be moved from memory into registers, and results moved the
other way. The term is also used to refer to variables in virtual machines that play a similar
role.

regression test:
A test that checks whether things that used to work are still
working; equivalently, a test that checks whether errors that had
been eliminated have been reintroduced.
See also:
integration test,
unit test.

replay attack:
An attack in which messages are recorded, then played back at a
later date. For example, an attacker might record the signal that
means “open the vault”, then use it to fool the system
into opening the vault door several hours later.

role:
A description of what some class of users can and cannot do to a
system. A role is typically described by listing the actions its
members can perform; roles simplify administration by making it
possible to redefine the capabilities of an entire group in a
single step.

RSS:
An XML data format used for syndicating content, such as blogs: the acronym stands for Rich Site Summary,
RDF Site Summary, or Really Simple Syndication. Someone who
wishes to publish a blog creates an RSS file (typically using
off-the-shelf software) and places it on their web server.
Blogreaders can then periodically check for updates, and, if there
are any, download and display the associated articles.

S

screen scraping:
Using a program to extract information from an HTML page intended
for human viewing. Screen scraping is a quick way to solve simple
problems, but breaks down when the pages are complex, or their
format changes frequently.
See also:
web services,
web spider.

search path:
The list of directories that the operating system searches when
the user asks to run a program. The search path is usually stored
in the user's PATH environment variable. On
Unix, entries are separated by “:”, while on Windows,
they are separated by “;”.

sequence:
A set of objects arranged in a dense, linear fashion, so that
they may be referred to by their index. In Python, strings,
lists, and tuples are built-in sequence types, since the elements
of each may be referred to as s[0], s[1], and so on
up to s[N-1], where N is the sequence's length.

shared library:
A compiled library that is loaded into memory at most once, and
whose contents are shared by all running programs that reference
it. Shared libraries are implemented on Windows by .dll
files, and on Linux by .so files.

shell:
A command-line user interface program, such as Bash (the
Bourne-Again Shell) or the Microsoft Windows DOS shell. Shells
commonly execute a read-evaluate-print cycle: when the user enters
a command in response to a prompt, the shell either executes the
command itself, or runs the program that the command has
specified. In either case, output is sent to the shell window,
and the user is prompted to enter another command. Most shells
include commands for looping, conditionals, and defining
functions, so that small (and sometimes large) programs can be
written by putting a sequence of shell commands in a file.

short-circuit evaluation:
Evaluation of an expression from left to right that stops as
soon as the expression's final value is known. For example, if
x is false, the computer does not call the function
f in the expression x and f(x). Similarly, if
x is true, f does not have to be called in x or
f(x).
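
The behaviour can be observed directly in Python. In this sketch, f
records each call in a list (the list name calls is an illustrative
choice), so an empty list afterwards shows that f was never invoked.

```python
calls = []


def f(x):
    """Record that we were called, then return True."""
    calls.append(x)
    return True


x = False
result_and = x and f(x)  # x is false, so f(x) is never evaluated

x = True
result_or = x or f(x)    # x is true, so f(x) is never evaluated
```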

silver bullet:
A tool or technique that purports to solve a hard problem, but
which is too good to be true. The term comes from the myth that
only silver bullets can kill werewolves (in fact, ruthenium,
rhodium, and palladium are equally effective).

slice:
A regular subsequence of a larger sequence, such as the first five elements,
or every second element.
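
In Python, for instance, both of the subsequences mentioned above can be
written with slice notation; the list letters is just sample data.

```python
letters = list("abcdefghij")

first_five = letters[:5]     # elements at indices 0 through 4
every_second = letters[::2]  # elements at indices 0, 2, 4, ...
```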

social engineering:
An attack based on deceiving users, or on relying on social
conventions. Posing as an old lady in distress, or as a bank
official who is checking information, are both attacks of this
kind.

sparse:
Being mostly empty. A sparse vector or matrix is one in which
most values are zero.
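
One common way to exploit sparsity, sketched here under the assumption
that a plain dictionary is an acceptable representation, is to store
only the non-zero entries keyed by index. The helper names to_sparse and
sparse_dot are hypothetical.

```python
def to_sparse(dense):
    """Keep only the non-zero entries of a dense vector, keyed by index."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: value} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the vector with fewer stored entries
    return sum(v * b[i] for i, v in a.items() if i in b)


sparse_a = to_sparse([0, 0, 3, 0, 0, 0, 5, 0])
sparse_b = to_sparse([0, 2, 4, 0, 0, 0, 1, 0])
product = sparse_dot(sparse_a, sparse_b)
```

Only indices 2 and 6 are non-zero in both vectors, so only those two
products contribute to the result.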

special method:
In Python, a method which has a
special meaning to the interpreter. By convention, these
methods' names begin and end with two underscore characters.
For example, if an object has a __str__ method, Python
automatically calls it whenever it needs a text representation of
the object.
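
A small sketch of the __str__ case: the class Point is invented for
illustration, and both str() and string formatting end up calling its
__str__ method.

```python
class Point:
    """A hypothetical 2D point used only to demonstrate __str__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # Called by str(), print(), and string formatting.
        return f"({self.x}, {self.y})"


p = Point(3, 4)
text = str(p)
formatted = f"p = {p}"
```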

specification:
A formal or semi-formal description of what a piece of software
is supposed to do. Specifications may include everything from
English prose (“The system must be able to handle at least
100 requests per second”) to algebra so complex that neither
customers nor developers really understand it.

spiral model:
A software development process which creates successively larger
prototypes on the way to delivering the final application. The
development of each prototype goes through the steps of the waterfall model.

spreadsheet:
A program for manipulating tabular numeric data, or the data
manipulated in that way. Microsoft Excel is the most widely used
spreadsheet in the world, but many others (such as Gnumeric) also exist.

SQL:
A special-purpose language for describing operations on relational databases. SQL is
not actually an acronym for “Structured Query
Language”.

stack frame:
A data structure that provides storage for a function's local
variables. Each time a function is called, a new stack frame is
created and put on the top of the call
stack. When the function returns, the stack frame is
discarded.

standard input:
A process's default input stream. In interactive command-line
applications, it is typically connected to the keyboard; in a
pipeline, it receives data from
the standard output of
the preceding process.

standard output:
A process's default output stream. In interactive command-line
applications, data sent to standard output is displayed on the
screen; in a pipeline, it is
passed to the standard
input of the next process.
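
The pipeline idea can be sketched with a filter that reads lines from one
stream and writes to another. The function name shout is hypothetical,
and in-memory streams stand in for the real ones here; a command-line
script would instead call shout(sys.stdin, sys.stdout).

```python
import io


def shout(instream, outstream):
    """Copy every line from instream to outstream in upper case."""
    for line in instream:
        outstream.write(line.upper())


# In-memory stand-ins for standard input and standard output.
fake_input = io.StringIO("hello\nworld\n")
fake_output = io.StringIO()
shout(fake_input, fake_output)
captured = fake_output.getvalue()
```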

starvation:
A situation in which a process never completes a task because
other processes are continually being given access to a resource
that the starving process needs. Starvation is not the same as
deadlock, although the
symptoms are similar.

stateless protocol:
A communication protocol in which each basic operation is
independent of every other. HTTP is the
best-known example: servers do not
remember anything about clients between
requests.

static space:
A portion of a program's memory reserved for storing values
that are allocated even before the program starts to run, such as
constant strings.
See also:
call stack,
heap.

status code:
An integer value returned to the operating system by a program
when that program terminates, which indicates whether the program
terminated normally or abnormally. By convention, 0 is used to
indicate normal termination (“zero errors”), while
non-zero values indicate specific problems (e.g., 1 for
“file not found”, 2 for “no permission”,
etc.).
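
As a sketch of how a parent program sees a status code, the Python below
starts a child Python process that exits with a chosen value and then
reads that value back. The helper name run_and_report is hypothetical;
subprocess.run and its returncode attribute are part of the standard
library.

```python
import subprocess
import sys


def run_and_report(exit_value):
    """Run a child Python process that exits with exit_value; return its status code."""
    completed = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({exit_value})"]
    )
    return completed.returncode


ok = run_and_report(0)      # child terminated normally
failed = run_and_report(3)  # child reported a problem with a non-zero code
```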

stored procedure:
A function or program that has been compiled and stored in a
database for more efficient execution.

stub:
A temporary placeholder for a function or method that hasn't been
written yet. Stubs typically return the same value on every call,
or (less often) a random value.
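
A minimal example, with invented names: fetch_temperature is a stub that
always returns the same value, which lets describe be written and tested
before the real lookup exists.

```python
def fetch_temperature(city):
    """Stub: a real version might query a weather service (not written yet)."""
    return 20.0  # same value on every call


def describe(city):
    """Code that depends on the stub can already be exercised."""
    return f"{city}: {fetch_temperature(city):.1f} degrees"


report = describe("Oslo")
```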

sturdy language:
A language that separates compilation from execution in order
to maximize performance, check safety conditions, or both. Sturdy
languages typically have longer turnaround times than nimble languages, but scale up to
very large problems better. Sturdy languages include C/C++,
Fortran, Java, and C#.

submodule:
A module that is contained inside
another module. Large software libraries are divided into
submodules for the same reason that large programs are divided
into functions.

T

tag (in version control):
A symbolic label in a version control system that uniquely
identifies a particular state of the repository.
See also:
branch.

tag (in XML):
A textual representation of an XML element. Tags come in matched opening and
closing pairs, such as <x> and </x>; if
the element the tag pair represents does not contain text or other
elements, the short form <x/> may be used.
See also:
element.

target:
In a build system, a thing that may be created or updated.
Targets typically have prerequisites that must be up to
date before the target itself can be updated. Targets may also be
symbolic, i.e., there may be targets that do not correspond to
files or other objects. In this case, the target is simply a
symbolic name for a set of actions.
See also:
action,
default target,
dependency,
phony target.

template:
An outline of a web page, which a program then fills in with
specific content. In older systems, templates contain a mix of
program code and HTML; newer systems try to keep the two separate
in order to simplify maintenance.
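
A very small sketch of the idea using Python's string.Template, with the
HTML outline kept apart from the code that fills it in. The placeholder
names title and body are illustrative, and a real system would also need
to escape user-supplied content, which this sketch does not do.

```python
from string import Template

# The outline of the page, with named placeholders.
page = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")

# The program supplies the specific content.
html = page.substitute(title="Glossary", body="Terms and definitions.")
```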

test suite:
A collection of unit tests. Tests
are grouped into test suites in order to make them easier to
manage, and so that developers can easily re-run
logically-connected sets of tests.
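
With Python's unittest module, for example, the tests in a class can be
collected into a suite and run together. The function mean and the class
TestMean are made up for this sketch.

```python
import io
import unittest


def mean(values):
    """Function under test (hypothetical)."""
    return sum(values) / len(values)


class TestMean(unittest.TestCase):
    def test_single_value(self):
        self.assertEqual(mean([4]), 4)

    def test_several_values(self):
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_empty_raises(self):
        with self.assertRaises(ZeroDivisionError):
            mean([])


def build_suite():
    """Group all of the tests in TestMean into one suite."""
    return unittest.defaultTestLoader.loadTestsFromTestCase(TestMean)


# Run the whole suite, capturing the report instead of printing it.
runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
result = runner.run(build_suite())
```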

test-driven development (TDD):
The practice of writing unit tests before writing
application code. TDD is a core practice in Extreme Programming, but has
been around since at least the 1970s. Its main advantage is that
it helps programmers clarify their ideas about what their code is
supposed to do before they have become emotionally attached to
that code. It also increases the odds of some tests actually
being written, and gives programmers a finish line to aim for:
when all the tests pass, the code must be done.

text:
The non-element content of an XMLdocument; in an
HTML page, the text is what is displayed, while the tags control its formatting.

three-tier architecture:
An architecture in which data is stored in a database (tier 1),
which is manipulated by a server (tier
2), and viewed in a web browser (tier 3).

ticket:
A single work item in an issue
tracker. A ticket may describe a bug that needs to be
fixed, an enhancement that is to be added, a question that needs
to be answered, or any other task.
See also:
ticket, closed,
ticket, open.

triage:
The process of sorting, prioritizing, and assigning tickets. As the project deadline approaches,
triage is done more frequently, in order to keep the team focused
on things that actually need to be done.

trigger:
A procedure which is automatically invoked when a database table
is modified. The term is also applied to code that runs whenever
the contents of a version control repository are updated.

two's complement:
A way to represent signed integers in computer memory. The most
significant bit in positive integers is 0; the other bits are used
to store magnitude. Negative values “wrap around”,
like a car's odometer, so that -1 is the bit string 111…111,
-2 is 111…110, and so on, all the way to the most negative
number, which is 100…000. Note that two's complement is
asymmetric: since zero counts as a positive number, the absolute
value of the most negative number is one greater than the absolute
value of the most positive number. Put another way, N bits can
represent values from -2^(N-1) to
2^(N-1) - 1. Thus, three bits can represent the
integers -4…3.
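
The three-bit case can be checked with a short Python sketch. The
function name twos_complement is hypothetical; masking with 2^bits - 1
produces the wrap-around described above.

```python
def twos_complement(value, bits):
    """Return the two's complement bit string of value using the given width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking maps negative values onto the upper half of the unsigned range.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


# Every value representable in three bits, from -4 to 3.
table = {v: twos_complement(v, 3) for v in range(-4, 4)}
```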

U

Unicode:
An international standard for representing characters and other
symbols. Each symbol is assigned a unique number; those numbers
are then encoded in any of several standard ways (such as UTF-8).
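
The two steps (a number per symbol, then an encoding of those numbers)
can be seen in Python; the sample string is arbitrary and is written
with escapes so that the example does not depend on the file's own
encoding.

```python
text = "na\u00efve \u20ac"  # "naive" with a diaeresis on the i, a space, and a euro sign

# Step 1: each symbol has a unique number (its code point).
code_points = [ord(ch) for ch in text]

# Step 2: those numbers are encoded as bytes, here using UTF-8.
utf8_bytes = text.encode("utf-8")
```

The seven characters become ten bytes because the two non-ASCII symbols
need two and three bytes respectively in UTF-8.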

unit test:
A test that exercises a single basic element of a program, such
as a particular function or method.
See also:
integration test.

unpack:
To take data that has been packed into
a contiguous block of memory, and put it back in the language's
native storage format; also called unmarshalling.
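
In Python, one way to do this is with the standard struct module. The
record layout below (a little-endian 32-bit integer followed by a 64-bit
float) is an assumption chosen for illustration.

```python
import struct

RECORD_FORMAT = "<id"  # little-endian: 4-byte int, then 8-byte double

# Pack two native Python values into a contiguous block of bytes...
packed = struct.pack(RECORD_FORMAT, 42, 2.5)

# ...and unpack that block back into native Python values.
count, reading = struct.unpack(RECORD_FORMAT, packed)
```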

V

validate:
To check that input data is of the right type, in range, etc.
Failing to validate data is a common source of security problems.
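
A minimal sketch of type and range checking, assuming the input is an
age supplied as text; the function name parse_age and the limits 0 and
150 are illustrative choices, not fixed rules.

```python
def parse_age(raw):
    """Convert raw input to an integer age, rejecting bad types and ranges."""
    try:
        age = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"age must be an integer, got {raw!r}") from None
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return age


valid_age = parse_age("42")
```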

verifiable deliverable:
A project task (such as implementation of a particular feature)
whose completion can be checked by an independent observer. Where
possible, features should be described in terms of verifiable
deliverables, so that there is some way to tell what's actually
done at any time.

version control system:
A tool for managing changes to a set of files. Each set of
changes creates a new revision
of the files; the version control system allows users to recover
old revisions reliably, and
helps manage conflicting changes made by different users.

virtual machine (VM):
A program that makes a computer behave as if it were some other
type of computer. Many modern programming languages, such as Java
and Python, run on virtual machines, rather than directly on the
computer's hardware. The main advantages are portability (once
the VM has been ported to a new machine, all of the programs
running on it will also run on that machine) and security (the VM
can enforce much more complicated safety rules than today's
hardware). The main disadvantage is speed: since the VM may have
to execute several physical instructions to simulate a single
logical instruction, programs running on a VM may be many times
slower than programs running natively.

W

watchpoint:
A breakpoint that is associated
with a variable, or a region of memory, rather than with a
location in the program's source code. The program suspends
execution whenever any of the data associated with a watchpoint is
modified.
See also:
conditional breakpoint.

waterfall model:
A software development process in which requirements analysis,
design, implementation, and testing are done strictly in that
order. The waterfall model is almost never used in practice;
instead, its main reason for existing is to give software
engineering professors something to critique.

web services:
Software applications that exchange data with one another by sending
XML data via HTTP. Many web services encode
data using the SOAP standard.
See also:
screen scraping.