The JACKDAW database package
Challis, M.F. University of Cambridge, Computer Laboratory, 1974-10. UCAM-CL-TR-1, ISSN 1476-2986.
This report describes a general database package which has been
implemented in BCPL on an IBM 370/165 at the University of
Cambridge. One current application is the provision of an
administrative database for the Computing Service.
Entries within a database may include (in addition to primitive
fields such as ‘salary’ and ‘address’) links to other entries: each
link represents a relationship between two entries and is always
two-way.
Generality is achieved by including within each database the class
definitions which define the structure of its entries; these
definitions may be interrogated by program.
The major part of the package presents a procedural interface
between an application program and an existing database, enabling
entries and their fields to be created, interrogated, updated and
deleted. The creation of a new database (or modification of an
existing one) by specifying the class definitions is handled by a
separate program.
The first part of the report describes the database structure and
this is followed by an illustration of the procedural interface.
Finally, some of the implementation techniques used to ensure
integrity of the database are described.
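The two-way links are the structurally interesting part of the package. As a rough sketch of the invariant involved (invented names; the abstract does not give the package's actual procedural interface), every link creation and deletion must update both entries:

```ocaml
(* A minimal sketch, assuming an entry holds primitive fields and a
   mutable list of linked entries; names are illustrative only. *)
type entry = {
  id : string;
  mutable fields : (string * string) list;  (* e.g. ("salary", "1200") *)
  mutable links : entry list;               (* related entries *)
}

(* Each link represents a relationship between two entries and is
   always two-way, so creating one updates both ends. *)
let link (a : entry) (b : entry) : unit =
  a.links <- b :: a.links;
  b.links <- a :: b.links

(* Deletion must likewise remove both directions to keep the
   database consistent. *)
let unlink (a : entry) (b : entry) : unit =
  a.links <- List.filter (fun e -> e.id <> b.id) a.links;
  b.links <- List.filter (fun e -> e.id <> a.id) b.links
```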

Scheduling for a share of the machine
Larmouth, J. University of Cambridge, Computer Laboratory, 1974-10. UCAM-CL-TR-2, ISSN 1476-2986.
This paper describes the mechanism used to schedule jobs and control
machine use on the IBM 370/165 at Cambridge University, England. The
same algorithm is currently being used in part at the University of
Bradford and implementations are in progress or under study for a
number of other British Universities.
The system provides computer management with a simple tool for
controlling machine use. The managerial decision allocates a share
of the total machine resources to each user of the system, either
directly, or via a hierarchical allocation scheme. The system then
undertakes to vary the turnaround of user jobs to ensure that those
decisions are effective, no matter what sort of work the user is
doing.
At the user end, the system gives the user great flexibility in the
way he uses the resources he has received, allowing him to get
a rapid turnaround for those (large or small) jobs which require it,
and a slower turnaround for other jobs. Provided he does not work at
a rate exceeding that appropriate to his share of the machine, he
can request, for every job he submits, the ‘deadline’ by which he
wants it running, and the system will usually succeed in running his
job at about the requested time – rarely later, and only
occasionally sooner.
Every job in the machine has its own ‘deadline’, and the machine is
not underloaded. Within limits, each user can request his jobs back
when he wants them, and the system keeps his use to within the share
of the machine he has been given. The approach is believed to be an
original one and to have a number of advantages over more
conventional scheduling and controlling algorithms.

A replacement for the OS/360 disc space management routines
Stoneley, A.J.M. University of Cambridge, Computer Laboratory, 1975-04. UCAM-CL-TR-3, ISSN 1476-2986.
In the interest of efficiency, the IBM disc space management
routines (Dadsm) have been completely replaced in the Cambridge
370/165.
A large reduction in the disc traffic has been achieved by keeping
the lists of free tracks in a more compact form and by keeping lists
of free VTOC blocks. The real time taken in a typical transaction
has been reduced by a factor of twenty.
By writing the code in a more appropriate form than the original,
the size has been decreased by a factor of five, thus making it more
reasonable to keep it permanently resident. The cpu requirement has
decreased from 5% to 0.5% of the total time during normal service.
The new system is very much safer than the old in the face of total
system crashes. The old system gave little attention to the
consequences of being stopped in mid-flight, and it was common to
discover an area of disc allocated to two files. This no longer
happens.
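The abstract credits the reduced disc traffic to a more compact form for the free-track lists. One plausible compact representation (an assumption for illustration; the report's actual on-disc layout is not described here) is to hold free space as (start, length) extents rather than one entry per track:

```ocaml
(* A sketch: free tracks as sorted (start, length) extents, so a run
   of 1000 free tracks costs one record rather than 1000 entries. *)
type extent = { start : int; length : int }

(* First-fit allocation of n contiguous tracks; returns the start
   track and the updated free list. *)
let rec allocate (n : int) (free : extent list) : (int * extent list) option =
  match free with
  | [] -> None
  | e :: rest when e.length >= n ->
      let remainder =
        if e.length = n then rest
        else { start = e.start + n; length = e.length - n } :: rest
      in
      Some (e.start, remainder)
  | e :: rest ->
      (match allocate n rest with
       | Some (s, rest') -> Some (s, e :: rest')
       | None -> None)
```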

The dynamic creation of I/O paths under OS/360-MVT
Stoneley, A.J.M. University of Cambridge, Computer Laboratory, 1975-04. UCAM-CL-TR-4, ISSN 1476-2986.
In a large computer it is often desirable and convenient for an
ordinary program to be able to establish for itself a logical
connection to a peripheral device. This ability is normally provided
through a routine within the operating system which may be called by
any user program at any time. OS/360 lacks such a routine. For the
batch job, peripheral connections can only be made through the job
control language and this cannot be done dynamically at run-time. In
the restricted context of TSO (IBM’s terminal system) a routine for
establishing peripheral connections does exist, but it is extremely
inefficient and difficult to use.
This paper describes how a suitable routine was written and grafted
into the operating system of the Cambridge 370/165.

Parrot – A replacement for TCAM
Hazel, P.; Stoneley, A.J.M. University of Cambridge, Computer Laboratory, 1976-04. UCAM-CL-TR-5, ISSN 1476-2986.
The terminal driving software and hardware for the Cambridge TSO
(Phoenix) system is described. TCAM and the IBM communications
controller were replaced by a locally written software system and a
PDP-11 complex. This provided greater flexibility, reliability,
efficiency and a better “end-user” interface than was possible under
a standard IBM system.

System programming in a high level language
Birrell, Andrew D. University of Cambridge, Computer Laboratory. UCAM-CL-TR-6, ISSN 1476-2986.

Local area computer communications network
Hopper, Andrew. University of Cambridge, Computer Laboratory. UCAM-CL-TR-7, ISSN 1476-2986.

Evaluation of a protection system
Cook, Douglas John. University of Cambridge, Computer Laboratory. UCAM-CL-TR-9, ISSN 1476-2986.

Prediction oriented description of database systems
Pezarro, Mark Theodore. University of Cambridge, Computer Laboratory. UCAM-CL-TR-10, ISSN 1476-2986.

Automatic resolution of linguistic ambiguities
Boguraev, Branimir Konstantinov. University of Cambridge, Computer Laboratory. UCAM-CL-TR-11, ISSN 1476-2986.
The thesis describes the design, implementation and testing of a
natural language analysis system capable of performing the task of
generating paraphrases in a highly ambiguous environment. The
emphasis is on incorporating strong semantic judgement in an
augmented transition network grammar: the system provides a
framework for examining the relationship between syntax and
semantics in the process of text analysis, especially while treating
the related phenomena of lexical and structural ambiguity.
Word-sense selection is based on global analysis of context within a
semantically well-formed unit, with primary emphasis on the verb
choice. In building structures representing text meaning, the
analyser relies not on screening through many alternative structures
– intermediate, syntactic or partial semantic – but on dynamically
constructing only the valid ones. The two tasks of sense selection
and structure building are procedurally linked by the application of
semantic routines derived from Y. Wilks’ preference semantics, which
are invoked at certain well chosen points of the syntactic
constituent analysis – this delimits the scope of their action and
provides context for a particular disambiguation technique. The
hierarchical process of sentence analysis is reflected in the
hierarchical organisation of application of these semantic routines
– this allows the efficient coordination of various disambiguation
techniques, and the reduction of syntactic backtracking,
non-determinism in the grammar, and semantic parallelism. The final
result of the analysis process is a dependency structure providing a
meaning representation of the input text with labelled components
centred on the main verb element, each characterised in terms of
semantic primitives and expressing both the meaning of a constituent
and its function in the overall textual unit. The representation
serves as an input to the generator, organised around the same
underlying principle as the analyser – the verb is central to the
clause. Currently the generator works in paraphrase mode, but is
specifically designed so that with minimum effort and virtually no
change in the program control structure and code it could be
switched over to perform translation.
The thesis discusses the rationale for the approach adopted,
comparing it with others, describes the system and its machine
implementation, and presents experimental results.

HASP “IBM 1130” multileaving remote job entry protocol with extensions as used on the University of Cambridge IBM 370/165
Oakley, M.R.A.; Hazel, P. University of Cambridge, Computer Laboratory, 1979-09. UCAM-CL-TR-12, ISSN 1476-2986.

Resource allocation and job scheduling
Hazel, Philip. University of Cambridge, Computer Laboratory. UCAM-CL-TR-13, ISSN 1476-2986.
The mechanisms for sharing the resources of the Cambridge IBM
370/165 computer system among many individual users are described.
File store is treated separately from other resources such as
central processor and channel time. In both cases, flexible systems
that provide incentives to thrifty behaviour are used. The method of
allocating resources directly to users rather than in a hierarchical
manner via faculties and departments is described, and its social
acceptability is discussed.

Store to store swapping for TSO under OS/MVT
Powers, J.S. University of Cambridge, Computer Laboratory, 1980-06. UCAM-CL-TR-14, ISSN 1476-2986.
A system of store-to-store swapping incorporated into TSO on the
Cambridge IBM 370/165 is described. Unoccupied store in the dynamic
area is used as the first stage of a two-stage backing store for
swapping time-sharing sessions; a fixed-head disc provides the
second stage. The performance and costs of the system are evaluated.

The implementation of BCPL on a Z80 based microcomputer
Wilson, I.D. University of Cambridge, Computer Laboratory. UCAM-CL-TR-15, ISSN 1476-2986.
The main aim of this project was to achieve as full an
implementation as possible of BCPL on a floppy disc based
microcomputer, running CP/M or CDOS (the two being essentially
compatible). On the face of it there seemed to be so many limiting
factors that, when the project was started, it was not at all clear
which one (if any) would become a final stumbling block. As it
happened, the major problems that cropped up could be programmed
round, or altered in such a way as to make them soluble.
The main body of the work splits comfortably into three sections,
and the writer hopes, by covering each section separately, to show
how the whole project fits together into the finished
implementation.

Reliable storage in a local network
Dion, Jeremy. University of Cambridge, Computer Laboratory. UCAM-CL-TR-16, ISSN 1476-2986.

Three papers on parsing
Boguraev, B.K.; Spärck Jones, K.; Tait, J.I. University of Cambridge, Computer Laboratory, 1982. UCAM-CL-TR-17, ISSN 1476-2986.

Automatic mesh generation of 2 & 3 dimensional curvilinear manifolds
Wördenweber, Burkard. University of Cambridge, Computer Laboratory, 1981-11. UCAM-CL-TR-18, ISSN 1476-2986.

Analysis and inference for English
Cater, Arthur William Sebright. University of Cambridge, Computer Laboratory, 1981-09. UCAM-CL-TR-19, ISSN 1476-2986.

On using Edinburgh LCF to prove the correctness of a parsing algorithm
Cohn, Avra; Milner, Robin. University of Cambridge, Computer Laboratory, 1982-02. UCAM-CL-TR-20, ISSN 1476-2986.
The methodology of Edinburgh LCF, a mechanized interactive proof
system, is illustrated through a problem suggested by Gloess – the
proof of a simple parsing algorithm. The paper is self-contained,
giving only the relevant details of the LCF proof system. It is
shown how tactics may be composed in LCF to yield a strategy which
is appropriate for the parser problem but which is also of a
generally useful form. Also illustrated is a general mechanized
method of deriving structural induction rules within the system.

The correctness of a precedence parsing algorithm in LCF
Cohn, A. University of Cambridge, Computer Laboratory, 1982-04. UCAM-CL-TR-21, ISSN 1476-2986.
This paper describes the proof in the LCF system of a correctness
property of a precedence parsing algorithm. The work is an extension
of a simpler parser and proof by Cohn and Milner (Cohn & Milner
1982). Relevant aspects of the LCF system are presented as needed.
In this paper, we emphasize (i) that although the current proof is
much more complex than the earlier one, many of the same
metalanguage strategies and aids developed for the first proof are
used in this proof, and (ii) that (in both cases) a general strategy
for doing some limited forward search is incorporated neatly into
the overall goal-oriented proof framework.

Constraints in CODD
Robson, M. University of Cambridge, Computer Laboratory. UCAM-CL-TR-22, ISSN 1476-2986.
The paper describes the implementation of the data structuring
concepts of domains, intra-tuple constraints and referential
constraints in the relational DBMS CODD. All of these constraints
capture some of the semantics of the database’s application.
Each class of constraint is described briefly and it is shown how
each of them is specified. The constraints are stored in the
database giving a centralised data model, which contains
descriptions of procedures as well as of static structures. Some
extensions to the notion of referential constraint are proposed and
it is shown how generalisation hierarchies can be expressed as sets
of referential constraints. It is shown how the stored data model is
used in enforcement of the constraints.

Two papers about the Scrabble summarising system
Tait, J.I. University of Cambridge, Computer Laboratory. UCAM-CL-TR-23, ISSN 1476-2986.
This report contains two papers which describe parts of the Scrabble
English summarizing system. The first, “Topic identification
techniques for predictive language analyzers” has been accepted as a
short communication for the 9th International Conference on
Computational Linguistics, in Prague. The second, “General summaries
using a predictive language analyser” is an extended version of a
discussion paper which will be presented at the European Conference
on Artificial Intelligence in Paris. Both conferences will take
place during July 1982.
The [second] paper describes a computer system capable of producing
coherent summaries of English texts even when they contain sections
which the system has not understood completely. The system employs
an analysis phase which is not dissimilar to a script applier
together with a rather more sophisticated summariser than previous
systems. Some deficiencies of earlier systems are pointed out, and
ways in which the current implementation overcomes them are
discussed.

Steps towards natural language to data language translation using general semantic information
Boguraev, B.K.; Spärck Jones, K. University of Cambridge, Computer Laboratory, 1982-03. UCAM-CL-TR-24, ISSN 1476-2986.

A clustering technique for semantic network processing
Alshawi, Hiyan. University of Cambridge, Computer Laboratory, 1982-05. UCAM-CL-TR-25, ISSN 1476-2986.

Portable system software for personal computers on a network
Knight, Brian James. University of Cambridge, Computer Laboratory. UCAM-CL-TR-26, ISSN 1476-2986.

Exception handling in domain based systems
Johnson, Martyn Alan. University of Cambridge, Computer Laboratory. UCAM-CL-TR-27, ISSN 1476-2986.

Poly report
Matthews, D.C.J. University of Cambridge, Computer Laboratory, 1982-08. UCAM-CL-TR-28, ISSN 1476-2986.
Poly was designed to provide a programming system with the same
flexibility as a dynamically typed language but without the run-time
overheads. The type system, based on that of Russell, allows
polymorphic operations to be used to manipulate abstract objects,
but with all the type checking being done at compile-time. Types may
be passed explicitly or by inference as parameters to procedures,
and may be returned from procedures. Overloading of names and
generic types can be simulated by using the general procedure
mechanism. Despite the generality of the language, or perhaps
because of it, the type system is very simple, consisting of only
three classes of object. There is an exception mechanism, similar to
that of CLU, and the exceptions raised in a procedure are considered
as part of its ‘type’. The construction of abstract objects and
hiding of internal details of the representation come naturally out
of the type system.

Introduction to Poly
Matthews, D.C.J. University of Cambridge, Computer Laboratory, 1982-05. UCAM-CL-TR-29, ISSN 1476-2986.
This report is a tutorial introduction to the programming language
Poly. It describes how to write and run programs in Poly using the
VAX/UNIX implementation. Examples given include polymorphic list
functions, a double precision integer package and a subrange type
constructor.

A portable BCPL library
Wilkes, John. University of Cambridge, Computer Laboratory, 1982-10. UCAM-CL-TR-30, ISSN 1476-2986.

Ponder and its type system
Fairbairn, J. University of Cambridge, Computer Laboratory, 1982-11. UCAM-CL-TR-31, ISSN 1476-2986.
This note describes the programming language “Ponder”, which is
designed according to the principles of referential transparency and
“orthogonality” as in [vWijngaarden 75]. Ponder is designed to be
simple, being functional with normal order semantics. It is intended
for writing large programmes, and to be easily tailored to a
particular application. It has a simple but powerful polymorphic
type system.
The main objective of this note is to describe the type system of
Ponder. As with the whole of the language design, the smallest
possible number of primitives is built into the type system. Hence,
for example, unions and pairs are not built in, but can be
constructed from other primitives.
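To give a flavour of what “not built in, but can be constructed” means, here is the classic λ-calculus encoding of pairs, rendered in OCaml as an illustration of the general technique (this is not Ponder syntax):

```ocaml
(* Pairs from functions alone: a pair is a function awaiting a
   selector, so no primitive pair type is needed. *)
let pair x y = fun select -> select x y
let first p = p (fun x _ -> x)
let second p = p (fun _ y -> y)

let () =
  (* eta-expanded so that OCaml generalises p's type *)
  let p select = pair 3 "three" select in
  assert (first p = 3);
  assert (second p = "three")
```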

How to drive a database front end using general semantic information
Boguraev, B.K.; Spärck Jones, K. University of Cambridge, Computer Laboratory, 1982-11. UCAM-CL-TR-32, ISSN 1476-2986.

An island parsing interpreter for Augmented Transition Networks
Carroll, John A. University of Cambridge, Computer Laboratory, 1982-10. UCAM-CL-TR-33, ISSN 1476-2986.
This paper describes the implementation of an ‘island parsing’
interpreter for an Augmented Transition Network (ATN). The
interpreter provides more complete coverage of Woods’ original ATN
formalism than his later island parsing implementation; it is
written in LISP and has been modestly tested.

Recent developments in LCF: examples of structural induction
Paulson, Larry. University of Cambridge, Computer Laboratory, 1983-01. UCAM-CL-TR-34, ISSN 1476-2986.

Rewriting in Cambridge LCF
Paulson, Larry. University of Cambridge, Computer Laboratory, 1983-02. UCAM-CL-TR-35, ISSN 1476-2986.
Many automatic theorem-provers rely on rewriting. Using theorems as
rewrite rules helps to simplify the subgoals that arise during a
proof.
LCF is an interactive theorem-prover intended for reasoning about
computation. Its implementation of rewriting is presented in detail.
LCF provides a family of rewriting functions, and operators to
combine them. A succession of functions is described, from pattern
matching primitives to the rewriting tool that performs most
inferences in LCF proofs.
The design is highly modular. Each function performs a basic,
specific task, such as recognizing a certain form of tautology. Each
operator implements one method of building a rewriting function from
simpler ones. These pieces can be put together in numerous ways,
yielding a variety of rewriting strategies.
The approach involves programming with higher-order functions.
Rewriting functions are data values, produced by computation on
other rewriting functions. The code is in daily use at Cambridge,
demonstrating the practical use of functional programming.
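The combinator style described above can be sketched as follows (a toy reconstruction: it treats a rewriting function as a partial function on terms, whereas the real LCF functions operate on theorems and carry different names):

```ocaml
(* Rewriting functions as first-class values, combined by operators. *)
type term = Var of string | Const of string | App of term * term

type rewriter = term -> term option   (* None = rule does not apply *)

(* One method of building a rewriter from simpler ones: try r1,
   fall back to r2. *)
let orelse (r1 : rewriter) (r2 : rewriter) : rewriter =
  fun t -> match r1 t with Some _ as res -> res | None -> r2 t

(* A single top-down pass applying r throughout a term; repetition
   until no rule applies would be yet another combinator. *)
let rec everywhere (r : rewriter) (t : term) : term =
  let t = match r t with Some t' -> t' | None -> t in
  match t with
  | App (f, x) -> App (everywhere r f, everywhere r x)
  | leaf -> leaf

(* An example rule, playing the role of a theorem used as a rewrite:
   id x --> x *)
let unfold_id : rewriter = function
  | App (Const "id", x) -> Some x
  | _ -> None
```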

The revised logic PPLAMBDA: A reference manual
Paulson, Lawrence. University of Cambridge, Computer Laboratory, 1983-03. UCAM-CL-TR-36, ISSN 1476-2986.
PPLAMBDA is the logic used in the Cambridge LCF proof assistant. It
allows Natural Deduction proofs about computation, in Scott’s theory
of partial orderings. The logic’s syntax, axioms, primitive
inference rules, derived inference rules and standard lemmas are
described as are the LCF functions for building and taking apart
PPLAMBDA formulas.
PPLAMBDA’s rule of fixed-point induction admits a wide class of
inductions, particularly where flat or finite types are involved.
The user can express and prove these type properties in PPLAMBDA.
The induction rule accepts a list of theorems, stating type
properties to consider when deciding to admit an induction.
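For orientation, the fixed-point induction rule referred to has the standard Scott-style shape below; the PPLAMBDA-specific contribution described above is the machinery for deciding when the induction may be admitted:

```latex
% Fixed-point induction: if P holds at bottom and is preserved by f,
% it holds at the least fixed point of f, provided P is admissible
% (closed under limits of chains).
\[
\frac{P(\bot) \qquad \forall x.\; P(x) \Rightarrow P(f\,x)}
     {P(\mathit{fix}\,f)}
\qquad \text{($P$ admissible)}
\]
```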

Representation and authentication on computer networks
Girling, Christopher Gray. University of Cambridge, Computer Laboratory. UCAM-CL-TR-37, ISSN 1476-2986.

Views and imprecise information in databases
Gray, Mike. University of Cambridge, Computer Laboratory. UCAM-CL-TR-38, ISSN 1476-2986.

Tactics and tacticals in Cambridge LCF
Paulson, Lawrence. University of Cambridge, Computer Laboratory, 1983-07. UCAM-CL-TR-39, ISSN 1476-2986.

The SKIM microprogrammer’s guide
Stoye, W. University of Cambridge, Computer Laboratory, 1983-10. UCAM-CL-TR-40, ISSN 1476-2986.

LCF_LSM, A system for specifying and verifying hardware
Gordon, Mike. University of Cambridge, Computer Laboratory. UCAM-CL-TR-41, ISSN 1476-2986.

Proving a computer correct with the LCF_LSM hardware verification system
Gordon, Mike. University of Cambridge, Computer Laboratory. UCAM-CL-TR-42, ISSN 1476-2986.

Extending the local area network
Leslie, Ian Malcolm. University of Cambridge, Computer Laboratory, 1983-02. UCAM-CL-TR-43, ISSN 1476-2986.
This dissertation is concerned with the development of a large
computer network which has many properties associated with local
area computer networks, including high bandwidth and lower error
rates. The network is made up of component local area networks,
specifically Cambridge rings, which are connected either through
local ring-ring bridges or through a high capacity satellite link.
In order to take advantage of the characteristics of the resulting
network, the protocols used are the same simple protocols as those
used on a single Cambridge ring. This in turn allows many
applications, which might have been thought of as local area network
applications, to run on the larger network.
Much of this work is concerned with an interconnection strategy
which allows hosts of different component networks to communicate in
a flexible manner without building an extra internetwork layer into
the protocol hierarchy. The strategy arrived at is neither a datagram
approach nor a system of concatenated error and flow controlled
virtual circuits. Rather, it is a lightweight virtual circuit
approach which preserves the order of blocks sent on a circuit, but
which makes no other guarantees about the delivery of these blocks.
An extra internetwork protocol layer is avoided by modifying the
system used on a single Cambridge ring which binds service names to
addresses so that it now binds service names to routes across the
network.
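The change of binding can be expressed as a change of type in the name server. The sketch below uses invented identifiers and is only a schematic rendering of the idea, not the actual Cambridge ring software:

```ocaml
(* On one ring, a name server maps service names to station
   addresses; across bridged rings the same lookup can return a
   route, avoiding a separate internetwork protocol layer. *)
type address = { ring : int; station : int }
type hop = Bridge of int | Satellite
type route = { hops : hop list; destination : address }

let lookup (table : (string * 'a) list) (service : string) : 'a option =
  List.assoc_opt service table

(* single-ring binding: name -> address *)
let local_names : (string * address) list =
  [ ("fileserver", { ring = 1; station = 12 }) ]

(* internetwork binding: name -> route, same lookup protocol *)
let network_names : (string * route) list =
  [ ("fileserver",
     { hops = [ Bridge 3; Satellite ];
       destination = { ring = 7; station = 4 } }) ]
```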

Structural induction in LCF
Paulson, Lawrence. University of Cambridge, Computer Laboratory, 1983-11. UCAM-CL-TR-44, ISSN 1476-2986.

Compound noun interpretation problems
Spärck Jones, Karen. University of Cambridge, Computer Laboratory, 1983-07. UCAM-CL-TR-45, ISSN 1476-2986.
This paper discusses the problems of compound noun interpretation in
the context of automatic language processing. Given that compound
processing implies identifying the senses of the words involved,
determining their bracketing, and establishing their underlying
semantic relations, the paper illustrates the need, even in
comparatively favourable cases, for inference using pragmatic
information. This has consequences for language processor
architectures and, even more, for speech processors.

Intelligent network interfaces
Garnett, Nicholas Henry. University of Cambridge, Computer Laboratory. UCAM-CL-TR-46, ISSN 1476-2986.

Automatic summarising of English texts
Tait, John Irving. University of Cambridge, Computer Laboratory. UCAM-CL-TR-47, ISSN 1476-2986.
This thesis describes a computer program called Scrabble which can
summarise short English texts. It uses large bodies of predictions
about the likely contents of texts about particular topics to
identify the commonplace material in an input text. Pre-specified
summary templates, each associated with a different topic are used
to condense the commonplace material in the input. Filled-in summary
templates are then used to form a framework into which unexpected
material in the input may be fitted, allowing unexpected material to
appear in output summary texts in an essentially unreduced form. The
system’s summaries are in English.
The program is based on technology not dissimilar to a script
applier. However, Scrabble represents a significant advance over
previous script-based summarising systems. It is much less likely to
produce misleading summaries of an input text than some previous
systems and can operate with less information about the subject
domain of the input than others.
These improvements are achieved by the use of three main novel
ideas. First, the system incorporates a new method for identifying
the topic or topics of an input text. Second, it allows a section of
text to have more than one topic at a time, or at least a composite
topic which may be dealt with by the computer program simultaneously
applying the text predictions associated with more than one simple
topic. Third, Scrabble incorporates new mechanisms for the
incorporation of unexpected material in the input into its output
summary texts. The incorporation of such material in the output
summary is motivated by the view that it is precisely unexpected
material which is likely to form the most salient matter in the
input text.
The performance of the system is illustrated by means of a number of
example input texts and their Scrabble summaries.

A mechanism for the accumulation and application of context in text processing
Alshawi, Hiyan. University of Cambridge, Computer Laboratory, 1983-11. UCAM-CL-TR-48, ISSN 1476-2986.
The paper describes a mechanism for the representation and
application of context information for automatic natural language
processing systems. Context information is gathered gradually during
the reading of the text, and the mechanism gives a way of combining
the effect of several different types of context factors. Context
factors can be managed independently, while still allowing efficient
access to entities in focus. The mechanism is claimed to be more
general than the global focus mechanism used by Grosz for discourse
understanding. Context affects the interpretation process by
choosing the results, and restricting the processing, of a number of
important language interpretation operations, including lexical
disambiguation and reference resolution. The types of context
factors that have been implemented in an experimental system are
described, and examples of the application of context are given.

Programming language design with polymorphism
Matthews, David Charles James. University of Cambridge, Computer Laboratory. UCAM-CL-TR-49, ISSN 1476-2986.

Verifying the unification algorithm in LCF
Paulson, Lawrence. University of Cambridge, Computer Laboratory, 1984-03. UCAM-CL-TR-50, ISSN 1476-2986.
Manna and Waldinger’s theory of substitutions and unification has
been verified using the Cambridge LCF theorem prover. A proof of the
monotonicity of substitution is presented in detail, as an example
of interaction with LCF. Translating the theory into LCF’s
domain-theoretic logic is largely straightforward. Well-founded
induction on a complex ordering is translated into nested structural
inductions. Correctness of unification is expressed using predicates
for such properties as idempotence and most-generality. The
verification is presented as a series of lemmas. The LCF proofs are
compared with the original ones, and with other approaches. It
appears difficult to find a logic that is both simple and flexible,
especially for proving termination.
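The properties named (idempotence, most-generality) are the standard correctness conditions for unification; stated in the usual notation (not copied from the report, and with the usual caveat that composition conventions vary):

```latex
% For a substitution \sigma returned on terms t and u:
\[
\begin{aligned}
&\text{Unification:}     && t\sigma = u\sigma \\
&\text{Idempotence:}     && \sigma \circ \sigma = \sigma \\
&\text{Most-generality:} && \forall \theta.\; t\theta = u\theta
    \;\Rightarrow\; \exists \delta.\; \theta = \delta \circ \sigma
\end{aligned}
\]
```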

Using information systems to solve recursive domain equations effectively
Winskel, Glynn; Larsen, Kim Guldstrand. University of Cambridge, Computer Laboratory. UCAM-CL-TR-51, ISSN 1476-2986.

The design of a ring communication network
Temple, Steven. University of Cambridge, Computer Laboratory. UCAM-CL-TR-52, ISSN 1476-2986.
This dissertation describes the design of a high speed local area
network. Local networks have been in use now for over a decade and
there is a proliferation of different systems, experimental ones
which are not widely used and commercial ones installed in hundreds
of locations. For a new network design to be of interest from the
research point of view it must have a feature or features which set
it apart from existing networks and make it an improvement over
existing systems. In the case of the network described, the research
was started to produce a network which was considerably faster than
current designs, but which retained a high degree of generality.
As the research progressed, other features were considered, such as
ways to reduce the cost of the network and the ability to carry data
traffic of many different types. The emphasis on high speed is still
present but other aspects were considered and are discussed in the
dissertation. The network has been named the Cambridge Fast Ring,
and the network hardware is currently being implemented as an
integrated circuit at the University of Cambridge Computer
Laboratory.
The aim of the dissertation is to describe the background to the
design and the decisions which were made during the design process,
as well as the design itself. The dissertation starts with a survey
of the uses of local area networks and examines some established
networks in detail. It then proceeds by examining the
characteristics of a current network installation to assess what is
required of the network in that and similar applications. The major
design considerations for a high speed network controller are then
discussed and a design is presented. Finally, the design of computer
interfaces and protocols for the network is discussed.

A new type-checker for a functional language
Fairbairn, Jon. University of Cambridge, Computer Laboratory. UCAM-CL-TR-53, ISSN 1476-2986.

Lessons learned from LCF
Paulson, Lawrence. University of Cambridge, Computer Laboratory, 1984-08. UCAM-CL-TR-54, ISSN 1476-2986.

Executing temporal logic programs
Moszkowski, Ben. University of Cambridge, Computer Laboratory, 1984-08. UCAM-CL-TR-55, ISSN 1476-2986.

A new scheme for writing functional operating systems
Stoye, William. University of Cambridge, Computer Laboratory. UCAM-CL-TR-56, ISSN 1476-2986.

Constructing recursion operators in intuitionistic type theory
Paulson, Lawrence C. University of Cambridge, Computer Laboratory, 1984-10. UCAM-CL-TR-57, ISSN 1476-2986.
Martin-Löf’s Intuitionistic Theory of Types is becoming popular for
formal reasoning about computer programs. To handle recursion
schemes other than primitive recursion, a theory of well-founded
relations is presented. Using primitive recursion over higher types,
induction and recursion are formally derived for a large class of
well-founded relations. Included are < on natural numbers, and
relations formed by inverse images, addition, multiplication, and
exponentiation of other relations. The constructions are given in
full detail to allow their use in theorem provers for Type Theory,
such as Nuprl. The theory is compared with work in the field of
ordinal recursion over higher types.
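The induction principle being derived is the familiar one for a well-founded relation; its standard statement is below (the report's contribution is deriving it inside Type Theory from primitive recursion over higher types):

```latex
% Well-founded induction over \prec (no infinite descending chains):
% prove P at each x assuming P at all \prec-predecessors of x.
\[
\frac{\forall x.\; \bigl(\forall y.\; y \prec x \Rightarrow P(y)\bigr)
      \Rightarrow P(x)}
     {\forall x.\; P(x)}
\qquad (\prec\ \text{well-founded})
\]
```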

Categories of models for concurrency
Winskel, Glynn. University of Cambridge, Computer Laboratory. UCAM-CL-TR-58, ISSN 1476-2986.

On the composition and decomposition of assertions
Winskel, Glynn. University of Cambridge, Computer Laboratory. UCAM-CL-TR-59, ISSN 1476-2986.

Memory and context mechanisms for automatic text processing
Alshawi, Hiyan. University of Cambridge, Computer Laboratory. UCAM-CL-TR-60, ISSN 1476-2986.

User models and expert systems
Spärck Jones, Karen. University of Cambridge, Computer Laboratory, 1984-12. UCAM-CL-TR-61, ISSN 1476-2986.

Constraint enforcement in a relational database management system
Robson, Michael. University of Cambridge, Computer Laboratory. UCAM-CL-TR-62, ISSN 1476-2986.

Poly manual
Matthews, David C.J. University of Cambridge, Computer Laboratory, 1985-02. UCAM-CL-TR-63, ISSN 1476-2986.

A framework for inference in natural language front ends to databases
Boguraev, Branimir K.; Spärck Jones, Karen. University of Cambridge, Computer Laboratory, 1985-02. UCAM-CL-TR-64, ISSN 1476-2986.

Introduction to the programming language “Ponder”
Tillotson, Mark. University of Cambridge, Computer Laboratory, 1985-05. UCAM-CL-TR-65, ISSN 1476-2986.

A formal hardware verification methodology and its application to a network interface chip
Gordon, M.J.C.; Herbert, J. University of Cambridge, Computer Laboratory. UCAM-CL-TR-66, ISSN 1476-2986.

Natural deduction theorem proving via higher-order resolution
Paulson, Lawrence C. University of Cambridge, Computer Laboratory, 1985-05. UCAM-CL-TR-67, ISSN 1476-2986.

HOL: A machine oriented formulation of higher order logic
Gordon, Mike. University of Cambridge, Computer Laboratory, 1985-07. UCAM-CL-TR-68, ISSN 1476-2986.

Proving termination of normalization functions for conditional expressions
Paulson, Lawrence C. University of Cambridge, Computer Laboratory, 1985-06. UCAM-CL-TR-69, ISSN 1476-2986.
Boyer and Moore have discussed a recursive function that puts
conditional expressions into normal form. It is difficult to prove
that this function terminates on all inputs. Three termination
proofs are compared: (1) using a measure function, (2) in domain
theory using LCF, (3) showing that its “recursion relation”, defined
by the pattern of recursive calls, is well-founded. The last two
proofs are essentially the same though conducted in markedly
different logical frameworks. An obviously total variant of the
normalize function is presented as the ‘computational meaning’ of
those two proofs.
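The function in question is the well-known conditional-expression normaliser; below is a reconstruction in OCaml of its standard shape (Boyer and Moore's own definition lives in their Lisp-based logic):

```ocaml
(* Conditional expressions over atomic tests, and the normaliser
   that rotates nested tests out of the condition position. *)
type cond = Atom of string | If of cond * cond * cond

let rec norm (e : cond) : cond =
  match e with
  | Atom _ -> e
  | If (Atom a, t, f) -> If (Atom a, norm t, norm f)
  | If (If (a, b, c), t, f) ->
      (* This argument is not structurally smaller, which is why
         termination needs an explicit argument; one measure that
         works is m(Atom) = 1, m(If(a,b,c)) = m(a)*(1 + m(b) + m(c)). *)
      norm (If (a, If (b, t, f), If (c, t, f)))
```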
A related function makes nested recursive calls. The three
termination proofs become more complex: termination and correctness
must be proved simultaneously. The recursion relation approach seems
flexible enough to handle subtle termination proofs where previously
domain theory seemed essential.

A remote procedure call system
Hamilton, Kenneth Graham. University of Cambridge, Computer Laboratory. UCAM-CL-TR-70, ISSN 1476-2986.

Executing temporal logic programs
Moszkowski, Ben. University of Cambridge, Computer Laboratory, 1985-08. UCAM-CL-TR-71, ISSN 1476-2986.

Logic programming and the specification of circuits
Clocksin, W.F. University of Cambridge, Computer Laboratory. UCAM-CL-TR-72, ISSN 1476-2986.

Resource management in a distributed computing system
Craft, Daniel Hammond. University of Cambridge, Computer Laboratory. UCAM-CL-TR-73, ISSN 1476-2986.

Hardware verification by formal proof
Gordon, Mike. University of Cambridge, Computer Laboratory. UCAM-CL-TR-74, ISSN 1476-2986.

Design and implementation of a simple typed language based on the lambda-calculus
Fairbairn, Jon. University of Cambridge, Computer Laboratory, 1985-05. UCAM-CL-TR-75, ISSN 1476-2986.
Despite the work of Landin and others as long ago as 1966, almost
all recent programming languages are large and difficult to
understand. This thesis is a re-examination of the possibility of
designing and implementing a small but practical language based on
very few primitive constructs.
The text records the syntax and informal semantics of a new language
called Ponder. The most notable features of the work are a powerful
type-system and an efficient implementation of normal order
reduction.
In contrast to Landin’s ISWIM, Ponder is statically typed, an
expedient that increases the simplicity of the language by removing
the requirement that operations must be defined for incorrect
arguments. The type system is a powerful extension of Milner’s
polymorphic type system for ML in that it allows local
quantification of types. This extension has the advantage that types
that would otherwise need to be primitive may be defined.
The criteria for the well-typedness of Ponder programmes are
presented in the form of a natural deduction system in terms of a
relation of generality between types. A new type checking algorithm
derived from these rules is proposed.
Ponder is built on the λ-calculus without the need for additional
computation rules. In spite of this abstract foundation an efficient
implementation based on Hughes’ super-combinator approach is
described. Some evidence of the speed of Ponder programmes is
included.
The same strictures have been applied to the design of the syntax of
Ponder, which, rather than having many pre-defined clauses, allows
the addition of new constructs by the use of a simple extension
mechanism.

Preserving abstraction in concurrent programming
Cooper, R.C.B.; Hamilton, K.G. University of Cambridge, Computer Laboratory. UCAM-CL-TR-76, ISSN 1476-2986.

Why higher-order logic is a good formalisation for specifying and verifying hardware
Gordon, Mike. University of Cambridge, Computer Laboratory. UCAM-CL-TR-77, ISSN 1476-2986.

A complete proof system for SCCS with model assertions
Winskel, Glynn. University of Cambridge, Computer Laboratory. UCAM-CL-TR-78, ISSN 1476-2986.

Petri nets, algebras and morphisms
Winskel, Glynn. University of Cambridge, Computer Laboratory. UCAM-CL-TR-79, ISSN 1476-2986.
It is shown how a category of Petri nets can be viewed as a
subcategory of two sorted algebras over multisets. This casts Petri
nets in a familiar framework and provides a useful idea of morphism
on nets different from the conventional definition – the morphisms
here respect the behaviour of nets. The categorical constructions
which result provide a useful way to synthesise nets and reason about
nets in terms of their components; for example various forms of
parallel composition of Petri nets arise naturally from the product
in the category. This abstract setting makes plain a useful functor
from the category of Petri nets to a category of spaces of
invariants and provides insight into the generalisations of the
basic definition of Petri nets – for instance the coloured and
higher level nets of Kurt Jensen arise through a simple
modification of the sorts of the algebras underlying nets. Further it
provides a smooth formal relation with other models of concurrency
such as Milner’s Calculus of Communicating Systems (CCS) and Hoare’s
Communicating Sequential Processes (CSP).

Interactive theorem proving with Cambridge LCF: A user's manual
Paulson, Lawrence C. University of Cambridge, Computer Laboratory, 1985-11. UCAM-CL-TR-80, ISSN 1476-2986.

The implementation of functional languages using custom hardware
Stoye, William Robert. University of Cambridge, Computer Laboratory, 1985-12. UCAM-CL-TR-81, ISSN 1476-2986.
In recent years functional programmers have produced a great many
good ideas but few results. While the use of functional languages
has been enthusiastically advocated, few real application areas have
been tackled and so the functional programmer's views and ideas are
met with suspicion.
The prime cause of this state of affairs is the lack of widely
available, solid implementations of functional languages. This in
turn stems from two major causes: (1) Our understanding of
implementation techniques was very poor only a few years ago, and so
any implementation that is “mature” is also likely to be unusably
slow. (2) While functional languages are excellent for expressing
algorithms, there is still considerable debate in the functional
programming community over the way in which input and output
operations should be represented to the programmer. Without clear
guiding principles implementors have tended to produce ad-hoc,
inadequate solutions.
My research is concerned with strengthening the case for functional
programming. To this end I constructed a specialised processor,
called SKIM, which could evaluate functional programs quickly. This
allowed experimentation with various implementation methods, and
provided a high performance implementation with which to experiment
with writing large functional programs.
This thesis describes the resulting work and includes the following
new results: (1) Details of a practical Turner-style combinator
reduction implementation featuring greatly improved storage use
compared with previous methods. (2) An implementation of Kennaway’s
director string idea that further enhances performance and increases
understanding of a variety of reduction strategies. (3)
Comprehensive suggestions concerning the representation of input,
output, and nondeterministic tasks using functional languages, and
the writing of operating systems. Details of the implementation of
these suggestions developed on SKIM. (4) A number of observations
concerning functional programming in general, based on considerable
practical experience.
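For readers unfamiliar with the kind of combinator reduction SKIM executes, a minimal normal-order evaluator for the S, K and I combinators looks like this (an illustrative tree-rewriting sketch; SKIM's microcoded graph reducer and Turner's full combinator set are far richer):

```ocaml
(* Reduction rules:  I x -> x;  K x y -> x;  S f g x -> f x (g x). *)
type comb = S | K | I | App of comb * comb

let rec reduce (t : comb) : comb =
  match t with
  | App (I, x) -> reduce x
  | App (App (K, x), _) -> reduce x
  | App (App (App (S, f), g), x) ->
      reduce (App (App (f, x), App (g, x)))
  | App (f, x) ->
      let f' = reduce f in
      if f' = f then App (f', reduce x)   (* head already normal *)
      else reduce (App (f', x))           (* new redex may have appeared *)
  | atom -> atom

(* Example: S K K behaves as the identity, so S K K I reduces to I. *)
let () = assert (reduce (App (App (App (S, K), K), I)) = I)
```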

Natural deduction proof as higher-order resolution
Paulson, Lawrence C. University of Cambridge, Computer Laboratory, 1985-12. UCAM-CL-TR-82, ISSN 1476-2986.
An interactive theorem prover, Isabelle, is under development. In
LCF, each inference rule is represented by one function for forwards
proof and another (a tactic) for backwards proof. In Isabelle, each
inference rule is represented by a Horn clause. Resolution gives
both forwards and backwards proof, supporting a large class of
logics. Isabelle has been used to prove theorems in Martin-Löf’s
Constructive Type Theory.
Quantifiers pose several difficulties: substitution, bound
variables, Skolemization. Isabelle’s representation of logical
syntax is the typed lambda-calculus, requiring higher-order
unification. It may have potential for logic programming.
Depth-first search using inference rules constitutes a higher-order
Prolog.

Operating system design for large personal workstations
Wilson, Ian David. University of Cambridge, Computer Laboratory. UCAM-CL-TR-83, ISSN 1476-2986.

BSPL: a language for describing the behaviour of synchronous hardware
Richards, Martin. University of Cambridge, Computer Laboratory, 1986-04. UCAM-CL-TR-84, ISSN 1476-2986.

Category theory and models for parallel computation
Winskel, Glynn. University of Cambridge, Computer Laboratory, 1986-04. UCAM-CL-TR-85, ISSN 1476-2986.
This report will illustrate two uses of category theory. The first
is using category theory to define semantics in a particular model:
how semantic constructions can often be seen as categorical ones
and, in particular, how parallel compositions are derived from a
categorical product and a non-deterministic sum. These categorical
notions can provide a basis for reasoning about computations and
will be illustrated for the model of Petri nets.
Secondly, the use of category theory to relate different semantics
will be examined; specifically, how the relations between various
concrete models like Petri nets, event structures, trees and state
machines are expressed as adjunctions. This will be illustrated by
showing the coreflection between safe Petri nets and trees.

The Entity System: an object based filing system
Crawley, Stephen Christopher. University of Cambridge, Computer Laboratory, 1986-04. UCAM-CL-TR-86, ISSN 1476-2986.

Computer-aided type face design
Carter, Kathleen Anne. University of Cambridge, Computer Laboratory, 1986-05. UCAM-CL-TR-87, ISSN 1476-2986.

A shallow processing approach to anaphor resolution
Carter, David Maclean. University of Cambridge, Computer Laboratory, 1986-05. UCAM-CL-TR-88, ISSN 1476-2986.

Making form follow function: An exercise in functional programming style
Fairbairn, Jon. University of Cambridge, Computer Laboratory, 1986-06. UCAM-CL-TR-89, ISSN 1476-2986.
The combined use of user-defined infix operators and higher order
functions allows the programmer to invent new control structures
tailored to a particular problem area.
This paper suggests that such a combination has beneficial
effects on the ease of both writing and reading programmes, and
hence can increase programmer productivity. As an example, a parser
for a simple language is presented in this style.
It is hoped that the presentation will be palatable to people
unfamiliar with the concepts of functional programming.
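The style being advocated is what is now called parser combinators; here is a small OCaml rendering of the idea (the operator names are invented, and the paper's own examples are written in a functional language of its own):

```ocaml
(* Higher-order functions plus user-defined infix operators yield a
   control structure tailored to parsing. *)
type 'a parser = char list -> ('a * char list) option

let item (c : char) : char parser = function
  | x :: rest when x = c -> Some (c, rest)
  | _ -> None

(* sequencing: run p, then q on the remaining input *)
let ( &&& ) (p : 'a parser) (q : 'b parser) : ('a * 'b) parser =
  fun input ->
    match p input with
    | None -> None
    | Some (a, rest) ->
        (match q rest with
         | None -> None
         | Some (b, rest') -> Some ((a, b), rest'))

(* alternation: try p, fall back to q *)
let ( ||| ) (p : 'a parser) (q : 'a parser) : 'a parser =
  fun input -> match p input with None -> q input | some -> some

(* "ab" or "ac", written almost like the grammar rule itself *)
let ab_or_ac = (item 'a' &&& item 'b') ||| (item 'a' &&& item 'c')
```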

The Cambridge Fast Ring networking system (CFR)
Hopper, Andy; Needham, Roger M. University of Cambridge, Computer Laboratory, 1986-06. UCAM-CL-TR-90, ISSN 1476-2986.

Hardware verification using higher-order logic
Camilleri, Albert; Gordon, Mike; Melham, Tom. University of Cambridge, Computer Laboratory, 1986-09. UCAM-CL-TR-91, ISSN 1476-2986.
The Hardware Verification Group at the University of Cambridge is
investigating how various kinds of digital systems can be verified
by mechanised formal proof. This paper explains our approach to
representing behaviour and structure using higher order logic.
Several examples are described including a ripple carry adder and a
sequential device for computing the factorial function. The dangers
of inaccurate models are illustrated with a CMOS exclusive-or gate.
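The representation style mentioned models a device as a predicate on its port signals, composes parts by conjunction and hides internal wires by existential quantification. Schematically (the general Cambridge HOL style; the particular formulas here are illustrative, not taken from the paper):

```latex
% A full adder as a predicate on boolean ports:
\[
\mathit{FA}(a,b,c_{in},s,c_{out}) \;=\;
  \bigl(s = (a \oplus b) \oplus c_{in}\bigr) \;\wedge\;
  \bigl(c_{out} = (a \wedge b) \vee (c_{in} \wedge (a \oplus b))\bigr)
\]
% Structure: conjoin the parts and hide the internal wire w.
\[
\mathit{DEVICE}(x,y,z) \;=\;
  \exists w.\; \mathit{PART}_1(x,w) \wedge \mathit{PART}_2(w,y,z)
\]
```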

Implementation and programming techniques for functional languages
Wray, Stuart Charles. University of Cambridge, Computer Laboratory, 1986-06. UCAM-CL-TR-92, ISSN 1476-2986.

Automated design of an instruction set for BCPL
Bennett, J.P. University of Cambridge, Computer Laboratory, 1986-06. UCAM-CL-TR-93, ISSN 1476-2986.

A mechanized proof of correctness of a simple counter
Cohn, Avra; Gordon, Mike. University of Cambridge, Computer Laboratory, 1986-06. UCAM-CL-TR-94, ISSN 1476-2986.

Event structures: Lecture notes for the Advanced Course on Petri Nets
Winskel, Glynn. University of Cambridge, Computer Laboratory, 1986-07. UCAM-CL-TR-95, ISSN 1476-2986.
Event structures are a model of computational processes. They
represent a process as a set of event occurrences with relations to
express how events causally depend on others. This paper introduces
event structures, shows their relationship to Scott domains and
Petri nets, and surveys their role in denotational semantics, both
for modelling languages like CCS and CSP and languages with higher
types.
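For orientation, the basic notion can be stated compactly; this is the standard definition of a prime event structure (not a quotation from the notes):

```latex
% An event structure is (E, \le, \#): a set of events E, a partial
% order \le of causal dependency, and an irreflexive symmetric
% conflict relation \#, such that every event has a finite cause set
% and conflict is inherited along causality:
\[
\{e' \in E \mid e' \le e\}\ \text{finite for all } e \in E,
\qquad
e \,\#\, e' \;\wedge\; e' \le e'' \;\Rightarrow\; e \,\#\, e''.
\]
```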

Models and logic of MOS circuits: Lectures for the Marktoberdorf Summerschool, August 1986
Winskel, Glynn. University of Cambridge, Computer Laboratory, 1986-10. UCAM-CL-TR-96, ISSN 1476-2986.

A study on abstract interpretation and “validating microcode algebraically”
Mycroft, Alan. University of Cambridge, Computer Laboratory, 1986-10. UCAM-CL-TR-97, ISSN 1476-2986.

Power-domains, modalities and the Vietoris monad
Robinson, E. University of Cambridge, Computer Laboratory, 1986-10. UCAM-CL-TR-98, ISSN 1476-2986.
It is possible to divide the syntax-directed approaches to
programming language semantics into two classes, “denotational”, and
“proof-theoretic”. This paper argues for a different approach which
also has the effect of linking the two methods. Drawing on recent
work on locales as formal spaces we show that this provides a way in
which we can hope to use a proof-theoretical semantics to give us a
denotational one. This paper reviews aspects of the general theory,
before developing a modal construction on locales and discussing the
view of power-domains as free non-deterministic algebras. Finally,
the relationship between the present work and that of Winskel is
examined.

An overview of the Poly programming language
Matthews, David C.J. University of Cambridge, Computer Laboratory, 1986-08. UCAM-CL-TR-99, ISSN 1476-2986.

Proving a computer correct in higher order logic
Joyce, Jeff; Birtwistle, Graham; Gordon, Mike. University of Cambridge, Computer Laboratory, 1986-12. UCAM-CL-TR-100, ISSN 1476-2986.

Binary routing networks
Milway, David Russel. University of Cambridge, Computer Laboratory, 1986-12. UCAM-CL-TR-101, ISSN 1476-2986.

A persistent storage system for Poly and ML
Matthews, David C.J. University of Cambridge, Computer Laboratory, 1987-01. UCAM-CL-TR-102, ISSN 1476-2986.

HOL: A proof generating system for higher-order logic
Gordon, Mike. University of Cambridge, Computer Laboratory, 1987-01. UCAM-CL-TR-103, ISSN 1476-2986.

A proof of correctness of the Viper microprocessor: the first level
Cohn, Avra. University of Cambridge, Computer Laboratory, 1987-01. UCAM-CL-TR-104, ISSN 1476-2986.
The Viper microprocessor designed at the Royal Signals and Radar
Establishment (RSRE) is one of the first commercially produced
computers to have been developed using modern formal methods. Viper
is specified in a sequence of decreasingly abstract levels. In this
paper a mechanical proof of the equivalence of the first two of
these levels is described. The proof was generated using a version
of Robin Milner’s LCF system.

A compositional model of MOS circuits
Winskel, Glynn. University of Cambridge, Computer Laboratory, 1987-04. UCAM-CL-TR-105, ISSN 1476-2986.

Abstraction mechanisms for hardware verification
Melham, Thomas F. University of Cambridge, Computer Laboratory, 1987-05. UCAM-CL-TR-106, ISSN 1476-2986.

DI-domains as a model of polymorphism
Coquand, Thierry; Gunter, Carl; Winskel, Glynn. University of Cambridge, Computer Laboratory, 1987-05. UCAM-CL-TR-107, ISSN 1476-2986.

Workstation design for distributed computing
Wilkes, Andrew John. University of Cambridge, Computer Laboratory, 1987-06. UCAM-CL-TR-108, ISSN 1476-2986.
This thesis discusses some aspects of the design of computer systems
for local area networks (LANs), with particular emphasis on the way
such systems present themselves to their users. Too little attention
to this issue frequently results in computing environments that
cannot be extended gracefully to accommodate new hardware or
software and do not present consistent, uniform interfaces to either
their human users or their programmatic clients. Before computer
systems can become truly ubiquitous tools, these problems of
extensibility and accessibility must be solved. This dissertation
therefore seeks to examine one possible approach, emphasising
support for program development on LAN based systems.

Hardware verification of VLSI regular structures
Joyce, Jeffrey. University of Cambridge, Computer Laboratory, 1987-07. UCAM-CL-TR-109, ISSN 1476-2986.

Relating two models of hardware
Winskel, Glynn. University of Cambridge, Computer Laboratory, 1987-07. UCAM-CL-TR-110, ISSN 1476-2986.

Realism about user modelling
Spärck Jones, K. University of Cambridge, Computer Laboratory, 1987-06. UCAM-CL-TR-111, ISSN 1476-2986.
This paper reformulates the framework for user modelling presented
in an earlier technical report, ‘User Models and Expert Systems’,
and considers the implications of the real limitations on the
knowledge likely to be available to a system for the value and
application of user models.

Reducing thrashing by adaptive backtracking
Wolfram, D.A. University of Cambridge, Computer Laboratory, 1987-08. UCAM-CL-TR-112, ISSN 1476-2986.

The representation of logics in higher-order logic
Paulson, Lawrence C. University of Cambridge, Computer Laboratory, 1987-08. UCAM-CL-TR-113, ISSN 1476-2986.

An architecture for integrated services on the local area network
Ades, Stephen. University of Cambridge, Computer Laboratory, 1987-09. UCAM-CL-TR-114, ISSN 1476-2986.
This dissertation concerns the provision of integrated services in a
local area context, e.g. on business premises. The term integrated
services can be understood at several levels. At the lowest, one
network may be used to carry traffic of several media—voice, data,
images etc. Above that, the telephone exchange may be replaced by a
more versatile switching system, incorporating facilities such as
stored voice messages. Its facilities may be accessible to the user
through the interface of the workstation rather than a telephone. At
a higher level still, new services such as multi-media document
manipulation may be added to the capabilities of a workstation.
Most of the work to date has been at the lowest of these levels,
under the auspices of the Integrated Services Digital Network
(ISDN), which mainly concerns wide area communications systems. The
thesis presented here is that all of the above levels are important
in a local area context. In an office environment, sophisticated
data processing facilities in a workstation can usefully be combined
with highly available telecommunications facilities such as the
telephone, to offer the user new services which make the working day
more pleasant and productive. That these facilities should be
provided across one integrated network, rather than by several
parallel single medium networks is an important organisational
convenience to the system builder.
The work described in this dissertation is relevant principally in a
local area context—in the wide area, economics and traffic balance
dictate that the emphasis will be on only the network level of
integration for some time to come. The work can be split into three
parts:
i) the use of a packet network to carry mixed media. This has
entailed design of packet voice protocols which produce delays low
enough for the network to interwork with national telephone
networks. The system has also been designed for minimal cost per
telephone—packet-switched telephone systems have traditionally been
more expensive than circuit-switched types. The network used as a
foundation for this work has been the Cambridge Fast Ring.
ii) use of techniques well established in distributed computing
systems to build an ‘integrated services PABX (Private Automatic
Branch Exchange)’. Current PABX designs have a very short life
expectancy and an alarmingly high proportion of their costs is due
to software. The ideas presented here can help with both of these
problems, produce an extensible system and provide a basis for new
multi-media services.
iii) development of new user level Integrated Services. Work has
been done in three areas. The first is multi-media documents. A
voice editing interface is described along with the system structure
required to support it. Secondly a workstation display has been
built to support a variety of services based upon image manipulation
and transmission. Finally techniques have been demonstrated by which
a better interface to telephony functions can be provided to the
user, using methods of control typical of workstation interfaces.

Formal validation of an integrated circuit design style
Dhingra, I.S. University of Cambridge, Computer Laboratory, 1987-08. UCAM-CL-TR-115, ISSN 1476-2986.

Domain theoretic models of polymorphism
Coquand, Thierry; Gunter, Carl; Winskel, Glynn. University of Cambridge, Computer Laboratory, 1987-09. UCAM-CL-TR-116, ISSN 1476-2986.

Distributed computing with RPC: the Cambridge approach
Bacon, J.M.; Hamilton, K.G. University of Cambridge, Computer Laboratory, 1987-10. UCAM-CL-TR-117, ISSN 1476-2986.
The Cambridge Distributed Computing System (CDCS) is described and
its evolution outlined. The Mayflower project allowed CDCS
infrastructure, services and applications to be programmed in a high
level, object oriented, language, Concurrent CLU. The Concurrent CLU
RPC facility is described in detail. It is a non-transparent, type
checked, type safe system which employs dynamic binding and passes
objects of arbitrary graph structure. Recent extensions accommodate a
number of languages and transport protocols. A comparison with other
RPC schemes is given.

Material concerning a study of cases
Boguraev, B.K.; Spärck Jones, K. University of Cambridge, Computer Laboratory, 1987-05. UCAM-CL-TR-118, ISSN 1476-2986.

Pilgrim: a debugger for distributed systems
Cooper, Robert. University of Cambridge, Computer Laboratory, 1987-07. UCAM-CL-TR-119, ISSN 1476-2986.

Block encryption
Wheeler, D. University of Cambridge, Computer Laboratory, 1987-11. UCAM-CL-TR-120, ISSN 1476-2986.
A fast and simple way of encrypting computer data is needed. The
UNIX crypt is a good way of doing this although the method is not
cryptographically sound for text. The method suggested here is
applied to larger blocks than the DES method which uses 64 bit
blocks, so that the speed of encyphering is reasonable. The
algorithm is designed for software rather than hardware. This
forgoes two advantages of the crypt algorithm, namely that each
character can be encoded and decoded independently of other
characters and that the identical process is used both for
encryption and decryption. However this method is better for coding
blocks directly.

A high-level Petri net specification of the Cambridge Fast Ring M-access service
Billington, Jonathan. University of Cambridge, Computer Laboratory, 1987-12. UCAM-CL-TR-121, ISSN 1476-2986.

Temporal abstraction of digital designs
Herbert, John. University of Cambridge, Computer Laboratory, 1988-02. UCAM-CL-TR-122, ISSN 1476-2986.

Case study of the Cambridge Fast Ring ECL chip using HOL
Herbert, John. University of Cambridge, Computer Laboratory, 1988-02. UCAM-CL-TR-123, ISSN 1476-2986.

Formal verification of basic memory devices
Herbert, John. University of Cambridge, Computer Laboratory, 1988-02. UCAM-CL-TR-124, ISSN 1476-2986.

An operational semantics for Occam
Camilleri, Juanito. University of Cambridge, Computer Laboratory, 1988-02. UCAM-CL-TR-125, ISSN 1476-2986.

Reasoning about the function and timing of integrated circuits with Prolog and temporal logic
Leeser, M.E. University of Cambridge, Computer Laboratory, 1988-02. UCAM-CL-TR-126, ISSN 1476-2986.

A development environment for large natural language grammars
Carroll, John; Boguraev, Bran; Grover, Claire; Briscoe, Ted. University of Cambridge, Computer Laboratory, 1988-02. UCAM-CL-TR-127, ISSN 1476-2986.

Debugging concurrent and distributed programs
Cooper, Robert Charles Beaumont. University of Cambridge, Computer Laboratory, 1988-02. UCAM-CL-TR-128, ISSN 1476-2986.

A methodology for automated design of computer instruction sets
Bennett, Jeremy Peter. University of Cambridge, Computer Laboratory, 1988-03. UCAM-CL-TR-129, ISSN 1476-2986.
With semiconductor technology providing scope for increasingly
complex computer architectures, there is a need more than ever to
rationalise the methodology behind computer design. In the 1970’s,
byte stream architectures offered a rationalisation of computer
design well suited to microcoded hardware. In the 1980’s, RISC
technology has emerged to simplify computer design and permit full
advantage to be taken of very large scale integration. However, such
approaches achieve their aims by simplifying the problem to a level
where it is within the comprehension of a single human being. Such
simplification alone is not sufficient. There is a need to provide a
methodology that takes the burden of design detail away from the
human designer, leaving him free to cope with the underlying
principles involved.
In this dissertation I present a methodology for the design of
computer instruction sets that is capable of automation in large
part, removing the drudgery of individual instruction selection. The
methodology does not remove the need for the designer’s skill, but
rather allows precise refinement of his ideas to obtain an optimal
instruction set.
In developing this methodology a number of pieces of software have
been designed and implemented. Compilers have been written to
generate trial instruction sets. An instruction set generator
program has been written and the instruction set it proposes
evaluated. Finally a prototype language for instruction set design
has been devised and implemented.
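[Illustrative sketch only: the report's compilers and instruction set
generator are far more elaborate, but the evaluation loop they imply
can be pictured as scoring candidate instruction sets over a
benchmark. The candidate encodings, byte costs and the greedy use of
a fused instruction below are all invented.]

BENCHMARK = ["load", "load", "add", "store", "load", "add", "add", "store"]

CANDIDATES = {
    # instruction set name -> cost in bytes of each operation
    "three-address":     {"load": 4, "add": 4, "store": 4},
    "accumulator":       {"load": 2, "add": 2, "store": 2},
    "load-store+fused":  {"load": 2, "add": 2, "store": 2, "add-store": 3},
}

def code_size(isa, program):
    # Static code size of the benchmark compiled onto one candidate.
    size, i = 0, 0
    while i < len(program):
        # Greedily use a fused instruction where the ISA provides one.
        if i + 1 < len(program) and f"{program[i]}-{program[i+1]}" in isa:
            size += isa[f"{program[i]}-{program[i+1]}"]
            i += 2
        else:
            size += isa[program[i]]
            i += 1
    return size

for name, isa in CANDIDATES.items():
    print(name, code_size(isa, BENCHMARK))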
The foundation of a generic theorem proverPaulson, Lawrence CUniversity of Cambridge, Computer Laboratory1988-03enTextUCAM-CL-TR-130ISSN 1476-2986
Isabelle is an interactive theorem prover that supports a variety of
logics. It represents rules as propositions (not as functions) and
builds proofs by combining rules. These operations constitute a
meta-logic (or ‘logical framework’) in which the object-logics are
formalized. Isabelle is now based on higher-order logic – a precise
and well-understood foundation.
Examples illustrate use of this meta-logic to formalize logics and
proofs. Axioms for first-order logic are shown sound and complete.
Backwards proof is formalized by meta-reasoning about object-level
entailment.
Higher-order logic has several practical advantages over other
meta-logics. Many proof techniques are known, such as Huet’s
higher-order unification procedure.
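[Orientation, in our words: in Isabelle's meta-logic a rule is itself
a proposition. Writing \Longrightarrow for meta-implication,
object-level conjunction introduction becomes]

\[
P \;\Longrightarrow\; (Q \;\Longrightarrow\; P \wedge Q)
\]

[Proofs are then built by resolving the conclusion of one such
proposition against a premise of another, with higher-order
unification doing the matching.]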
Architecture problems in the construction of expert systems
for document retrievalSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1986-12enTextUCAM-CL-TR-131ISSN 1476-2986Reasoning about the function and timing of integrated
circuits with Prolog and temporal logicLeeser, Miriam EllenUniversity of Cambridge, Computer Laboratory1988-04enTextUCAM-CL-TR-132ISSN 1476-2986A preliminary users manual for IsabellePaulson, Lawrence C.University of Cambridge, Computer Laboratory1988-05enTextUCAM-CL-TR-133ISSN 1476-2986
This is an early report on the theorem prover Isabelle and several
of its object-logics. It describes Isabelle’s operations, commands,
data structures, and organization. This information is fairly
low-level, but could benefit Isabelle users and implementors of
other systems.
Correctness properties of the Viper block model: the second
levelCohn, AvraUniversity of Cambridge, Computer Laboratory1988-05enTextUCAM-CL-TR-134ISSN 1476-2986Using recursive types to reason about hardware in higher
order logicMelham, Thomas F.University of Cambridge, Computer Laboratory1988-05enTextUCAM-CL-TR-135ISSN 1476-2986Formal specification and verification of asynchronous
processes in higher-order logicJoyce, Jeffrey J.University of Cambridge, Computer Laboratory1988-06enTextUCAM-CL-TR-136ISSN 1476-2986
We model the interaction of a synchronous process with an
asynchronous memory process using a four-phase “handshaking”
protocol. This example demonstrates the use of higher-order logic to
reason about the behaviour of synchronous systems such as
microprocessors which communicate requests to asynchronous devices
and then wait for unpredictably long periods until these requests
are answered. We also describe how our model could be revised to
include some of the detailed timing requirements found in real
systems such as the M68000 microprocessor. One enhancement uses
non-determinism to model minimum setup times for asynchronous
inputs. Experience with this example suggests that higher-order
logic may also be a suitable formalism for reasoning about more
abstract forms of concurrency.
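[A minimal sketch of the four-phase discipline referred to above, as
two state machines stepped together. The signal names (req, ack) and
the random response delay are our illustrative assumptions; the
report's treatment is relational, in higher-order logic, rather than
simulation code.]

import random

def run_handshake(max_steps=1000):
    req = ack = False
    phase = 0                          # completed handshake phases
    delay = random.randint(1, 20)      # unpredictably slow device
    for step in range(max_steps):
        # Requester: raise req when idle; drop it once ack is seen.
        if phase == 0:
            req, phase = True, 1                  # phase 1: req rises
        elif phase == 1 and ack:
            req, phase = False, 2                 # phase 3: req falls
        # Responder: raise ack some steps after seeing req; drop it
        # once req has fallen.
        if req and not ack:
            delay -= 1
            if delay == 0:
                ack = True                        # phase 2: ack rises
        elif not req and ack and phase == 2:
            ack, phase = False, 3                 # phase 4: ack falls
        if phase == 3:
            return step                           # transfer complete
    raise RuntimeError("handshake did not complete")

print("completed after", run_handshake(), "steps")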
Mass terms and plurals : From linguistic theory to natural
language processingHasle, F.V.University of Cambridge, Computer Laboratory1988-06enTextUCAM-CL-TR-137ISSN 1476-2986Authentication: a practical study in belief and
actionBurrows, MichaelAbadi, MartínNeedham, RogerUniversity of Cambridge, Computer Laboratory1988-06enTextUCAM-CL-TR-138ISSN 1476-2986Petri net theory: a surveyManson, Paul R.University of Cambridge, Computer Laboratory1988-06enTextUCAM-CL-TR-139ISSN 1476-2986
The intense interest in concurrent (or “parallel”) computation over
the past decade has given rise to a large number of languages for
concurrent programming, representing many conflicting views of
concurrency.
The discovery that concurrent programming is significantly more
difficult than sequential programming has prompted considerable
research into determining a tractable and flexible theory of
concurrency, with the aim of making concurrent processing more
accessible. Indeed, the wide variety of concurrent languages merely
reflects the many different models of concurrency which have been
developed.
This report therefore introduces Petri nets, discussing their
behaviour, interpretation and relationship to other models of
concurrency. It defines and discusses several restrictions and
extensions of the Petri net model, showing how they relate to basic
Petri nets, while explaining why they have been of historical
importance. Finally it presents a survey of the analysis methods
applied to Petri nets in general and for some of the net models
introduced here.
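[A minimal sketch of the basic place/transition model the survey
starts from: a marking is a multiset of tokens over places, and a
transition fires by consuming its pre-multiset and adding its
post-multiset. The producer/consumer net is our invented example.]

from collections import Counter

TRANSITIONS = {
    # name: (pre-multiset, post-multiset) over place names
    "produce": (Counter(), Counter({"buffer": 1})),
    "consume": (Counter({"buffer": 1}), Counter({"done": 1})),
}

def enabled(marking, t):
    pre, _ = TRANSITIONS[t]
    return all(marking[p] >= n for p, n in pre.items())

def fire(marking, t):
    # Successor marking: remove the pre-multiset, add the post-multiset.
    pre, post = TRANSITIONS[t]
    assert enabled(marking, t), f"{t} not enabled"
    return marking - pre + post

m = fire(fire(Counter(), "produce"), "consume")
print(m)                                   # Counter({'done': 1})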
Executing behavioural definitions in higher-order
logicCamilleri, Albert JohnUniversity of Cambridge, Computer Laboratory1988-07enTextUCAM-CL-TR-140ISSN 1476-2986
Over the past few years, computer scientists have been using formal
verification techniques to show the correctness of digital systems.
The verification process, however, is complicated and expensive.
Even proofs of simple circuits can involve thousands of logical
steps. Often it can be extremely difficult to find correct device
specifications, and it is desirable that one sets off to prove a
correct specification from the start, rather than repeatedly
backtrack from the verification process to modify the original
definitions after discovering they were incorrect or inadequate.
The main idea presented in the thesis is to amalgamate the
techniques of simulation and verification, rather than have the
latter replace the former. The result is that behavioural
definitions can be simulated until one is reasonably sure that the
specification is correct. Furthermore, proving the correctness with
respect to these simulated specifications avoids the inadequacies of
simulation where it may not be computationally feasible to
demonstrate correctness by exhaustive testing. Simulation here has a
different purpose: to get specifications correct as early as
possible in the verification process. Its purpose is not to
demonstrate the correctness of the implementation – this is done in
the verification stage when the very same specifications that were
simulated are proved correct.
The thesis discusses the implementation of an executable subset of
the HOL logic, the version of Higher Order Logic embedded in the HOL
theorem prover. It is shown that hardware can be effectively
described using both relations and functions; relations being
suitable for abstract specification and functions being suitable for
execution. The differences between relational and functional
specifications are discussed and illustrated by the verification of
an n-bit adder. Techniques for executing functional specifications
are presented and various optimisation strategies are shown which
make the execution of the logic efficient. It is further shown that
the process of generating optimised functional definitions from
relational definitions can be automated. Example simulations of
three hardware devices (a factorial machine, a small computer and a
communications chip) are presented.
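[A sketch, in Python rather than HOL, of the contrast drawn above
between relational and functional specifications; the n-bit adder and
the port names are illustrative.]

def adder_fun(a, b, n):
    # Functional (executable) specification: compute sum and carry-out.
    s = a + b
    return s % 2**n, s >> n

def adder_rel(a, b, n, s, cout):
    # Relational specification: a predicate over all ports, suited to
    # abstract reasoning but not directly executable as a simulator.
    return a + b == s + cout * 2**n and 0 <= s < 2**n and cout in (0, 1)

# The executable form supports simulation; proving that it satisfies
# the relation mirrors verifying the very specification simulated.
for a, b in [(3, 5), (7, 7)]:
    s, c = adder_fun(a, b, 3)
    assert adder_rel(a, b, 3, s, c)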
Reliable management of voice in a distributed
systemWant, RoyUniversity of Cambridge, Computer Laboratory1988-07enTextUCAM-CL-TR-141ISSN 1476-2986
The ubiquitous personal computer has found its way into most office
environments. As a result, widespread use of the Local Area Network
(LAN) for the purposes of sharing distributed computing resources
has become common. Another technology, the Private Automatic Branch
Exchange (PABX), has benefited from substantial research and development
by the telephone companies. As a consequence, it is cost effective
and has widely infiltrated the office world. Its primary purpose is
to switch digitised voice but, with the growing need for
communication between computers, it is also being adapted to switch
data. However, PABXs are generally designed around a centralised
switch in which bandwidth is permanently divided between its
subscribers. Computing requirements need much larger bandwidths and
the ability to connect to several services at once, thus making the
conventional PABX unsuitable for this application.
Some LAN technologies are suitable for switching voice and data. The
additional requirement for voice is that point to point delay for
network packets should have a low upper-bound. The 10 Mb/s Cambridge
Ring is an example of this type of network, but its relatively low
bandwidth gives it limited application in this area. Networks with
larger bandwidths (up to 100 Mb/s) are now becoming available
commercially and could support a realistic population of clients
requiring voice and data communication.
Transporting voice and data in the same network has two main
advantages. Firstly, from a practical point of view, wiring is
minimised. Secondly, applications which integrate both media are
made possible, and hence digitised voice may be controlled by client
programs in new and interesting ways.
In addition to the new applications, the original telephony
facilities must also be available. They should, at least by default,
appear to work in an identical way to our tried and trusted
impression of a telephone. However, the control and management of a
network telephone is now in the domain of distributed computing. The
voice connections between telephones are virtual circuits. Control
and data information can be freely mixed with voice at a network
interface. The new problems that result are the management issues
related to the distributed control of real-time media.
This thesis describes the issues as a distributed computing problem
and proposes solutions, many of which have been demonstrated in a
real implementation. Particular attention has been paid to the
quality of service provided by the solutions. This amounts to the
design of helpful operator interfaces, flexible schemes for the
control of voice from personal workstations and, in particular, a
high reliability factor for the backbone telephony service. This
work demonstrates the advantages and the practicality of integrating
voice and data services within the Local Area Network.
A fast packet switch for the integrated services backbone
networkNewman, PeterUniversity of Cambridge, Computer Laboratory1988-07enTextUCAM-CL-TR-142ISSN 1476-2986Experience with Isabelle : A generic theorem
proverPaulson, Lawrence C.University of Cambridge, Computer Laboratory1988-08enTextUCAM-CL-TR-143ISSN 1476-2986
The theorem prover Isabelle is described briefly and informally. Its
historical development is traced from Edinburgh LCF to the present
day. The main issues are unification, quantifiers, and the
representation of inference rules. The Edinburgh Logical Framework
is also described, for a comparison with Isabelle. An appendix
presents several Isabelle logics, including set theory and
Constructive Type Theory, with examples of theorems.
An operational semantics for occamCamilleri, JuanitoUniversity of Cambridge, Computer Laboratory1988-08enTextUCAM-CL-TR-144ISSN 1476-2986Mechanizing programming logics in higher order
logicGordon, Michael J.C.University of Cambridge, Computer Laboratory1988-09enTextUCAM-CL-TR-145ISSN 1476-2986Automating recursive type definitions in higher order
logicMelham, Thomas F.University of Cambridge, Computer Laboratory1988-09enTextUCAM-CL-TR-146ISSN 1476-2986Formal specification and verification of microprocessor
systemsJoyce, JeffreyUniversity of Cambridge, Computer Laboratory1988-09enTextUCAM-CL-TR-147ISSN 1476-2986Extending coloured petri netsBillington, JonathanUniversity of Cambridge, Computer Laboratory1988-09enTextUCAM-CL-TR-148ISSN 1476-2986
Jensen’s Coloured Petri Nets (CP-nets) are taken as the starting
point for the development of a specification technique for complex
concurrent systems. To increase its expressive power CP-nets are
extended by including capacity and inhibitor functions. A class of
extended CP-nets, known as P-nets, is defined that includes the
capacity function and the threshold inhibitor extension. The
inhibitor extension is defined in a totally symmetrical way to that
of the usual pre place map (or incidence function). Thus the
inhibitor and pre place maps may be equated by allowing a marking to
be purged by a single transition occurrence, useful when specifying
the abortion of various procedures. A chapter is devoted to
developing the theory and notation for the purging of a place’s
marking or part of its marking.
Two transformations from P-nets to CP-nets are presented and it is
proved that they preserve interleaving behaviour. These are based on
the notion of complementary places defined for PT-nets and involve
the definition and proof of a new extended complementary place
invariant for CP-nets.
The graphical form of P-nets, known as a P-Graph, is presented
formally and draws upon the theories developed for algebraic
specification. Arc inscriptions are multisets of tuples of terms
generated by a many-sorted signature. Transition conditions are
Boolean expressions derived from the same signature. An
interpretation of the P-Graph is given in terms of a corresponding
P-net. The work is similar to that of Vautherin but includes the
inhibitor and capacity extension and a number of significant
differences. In the P-Graph, concrete sets are associated with
places, rather than sorts, and likewise there are concrete initial
marking and capacity functions. Vautherin associates equations with
transitions rather than the more general Boolean expressions.
P-Graphs are useful for specification at a concrete level. Classes
of the P-Graph, known as Many-sorted Algebraic Nets and Many-sorted
Predicate/Transition nets, are defined and illustrated by a number
of examples. An extended place capacity notation is developed to
allow for the convenient representation of resource bounds in the
graphical form.
Some communications-oriented examples are presented including queues
and the Demon Game of international standards fame.
The report concludes with a discussion of future work. In
particular, an abstract P-Graph is defined that is very similar to
Vautherin’s Petri net-like schema, but including the capacity and
inhibitor extensions and associating Boolean expressions with
transitions. This will be useful for more abstract specifications
(e.g. classes of communications protocols) and for their analysis.
It is believed that this is the first coherent and formal
presentation of these extensions in the literature.
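[One plausible reading of the extended enabling rule, sketched in
Python: a transition is enabled iff the marking covers its
pre-multiset, lies strictly below every threshold-inhibitor bound,
and firing would not exceed any place capacity. The concrete numbers
are invented.]

from collections import Counter

def enabled(marking, pre, post, inhibitor, capacity):
    if any(marking[p] < n for p, n in pre.items()):
        return False                  # pre places not covered
    if any(marking[p] >= n for p, n in inhibitor.items()):
        return False                  # an inhibitor threshold reached
    after = marking - pre + post
    return all(after[p] <= cap for p, cap in capacity.items())

m = Counter({"buf": 2})
print(enabled(m, pre=Counter({"buf": 1}), post=Counter({"out": 1}),
              inhibitor=Counter({"out": 3}), capacity={"buf": 4, "out": 4}))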
Improving security and performance of capability
systemsKarger, Paul AshleyUniversity of Cambridge, Computer Laboratory1988-10enTextUCAM-CL-TR-149ISSN 1476-2986
This dissertation examines two major limitations of capability
systems: an inability to support security policies that enforce
confinement and a reputation for relatively poor performance when
compared with non-capability systems.
The dissertation examines why conventional capability systems cannot
enforce confinement and proposes a new secure capability
architecture, called SCAP, in which confinement can be enforced.
SCAP is based on the earlier Cambridge Capability System, CAP. The
dissertation shows how a non-discretionary security policy can be
implemented on the new architecture, and how the new architecture
can also be used to improve traceability of access and revocation of
access.
The dissertation also examines how capability systems are vulnerable
to discretionary Trojan horse attacks and proposes a defence based
on rules built into the command-language interpreter. System-wide
garbage collection, commonly used in most capability systems, is
examined in the light of the non-discretionary security policies and
found to be fundamentally insecure. The dissertation proposes
alternative approaches to storage management to provide at least
some of the benefits of system-wide garbage collection, but without
the accompanying security problems.
Performance of capability systems is improved by two major
techniques. First, the doctrine of programming generality is
addressed as one major cause of poor performance. Protection domains
should be allocated only for genuine security reasons, rather than
at every subroutine boundary. Compilers can better enforce
modularity and good programming style without adding the expense of
security enforcement to every subroutine call. Second, the ideas of
reduced instruction set computers (RISC) can be applied to
capability systems to simplify the operations required. The
dissertation identifies a minimum set of hardware functions needed
to obtain good performance for a capability system. This set is much
smaller than previous research had indicated necessary.
A prototype implementation of some of the capability features is
described. The prototype was implemented on a re-microprogrammed
VAX-11/730 computer. The dissertation examines the performance and
software compatibility implications of the new capability
architecture, both in the context of conventional computers, such as
the VAX, and in the context of RISC processors.
Simulation as an aid to verification using the HOL theorem
proverCamilleri, Albert JohnUniversity of Cambridge, Computer Laboratory1988-10enTextUCAM-CL-TR-150ISSN 1476-2986
The HOL theorem proving system, developed by Mike Gordon at the
University of Cambridge, is a mechanisation of higher order logic,
primarily intended for conducting formal proofs of digital system
designs. In this paper we show that hardware specifications written
in HOL logic can be executed to enable simulation as a means of
supporting formal proof. Specifications of a small microprocessor
are described, showing how HOL logic sentences can be transformed
into executable code with minimum risk of introducing
inconsistencies. A clean and effective optimisation strategy is
recommended to make the executable specifications practical.
Formalising an integrated circuit design style in higher
order logicDhingra, Inderpreet-SinghUniversity of Cambridge, Computer Laboratory1988-11enTextUCAM-CL-TR-151ISSN 1476-2986
If the activities of an integrated circuit designer are examined, we
find that rather than keeping track of all the details, he uses
simple rules of thumb which have been refined from experience. These
rules of thumb are guidelines for deciding which blocks to use and
how they are to be connected. This thesis gives a formal foundation,
in higher order logic, to the design rules of a dynamic CMOS
integrated circuit design style.
Correctness statements for the library of basic elements are
formulated. These statements are based on a small number of
definitions which define the behaviour of transistors and capacitors
and the necessary axiomatisation of the four-valued algebra for
signals. The correctness statements of large and complex circuits
are then derived from the library of previously proved correctness
statements, using logical inference rules instead of rules of thumb.
For example, one gate from the library can drive another only if its
output constraints are satisfied by the input constraints of the
gate that it drives. In formalising the design rules, these
constraints are captured as predicates and are part of the
correctness statements of these gates. So when two gates are to be
connected, it is only necessary to check that the predicates match.
These ideas are fairly general and widely applicable for formalising
the rules of many systems.
A number of worked examples are presented based on these formal
techniques. Proofs are presented at various stages of development to
show how the correctness statement for a device evolves and how the
proof is constructed. In particular it is demonstrated how such
formal techniques can help improve and sharpen the final
specifications.
As a major case study to test all these techniques, a new design for
a digital phase-locked loop is presented. This has been designed
down to the gate level using the above dynamic design style, and has
been described and simulated using ELLA. Some of the subcomponents
have been formally verified down to the detailed circuit level while
others have merely been specified without formal proofs of
correctness. An informal proof of correctness of this device is also
presented based on the formal specifications of the various
submodules.
Motion development for computer animationPullen, Andrew MarkUniversity of Cambridge, Computer Laboratory1988-11enTextUCAM-CL-TR-152ISSN 1476-2986Efficient data sharingBurrows, MichaelUniversity of Cambridge, Computer Laboratory1988-12enTextUCAM-CL-TR-153ISSN 1476-2986
As distributed computing systems become widespread, the sharing of
data between people using a large number of computers becomes more
important. One of the most popular ways to facilitate this sharing
is to provide a common file system, accessible by all the machines
on the network. This approach is simple and reasonably effective,
but the performance of the system can degrade significantly if the
number of machines is increased. By using a hierarchical network,
and arranging that machines typically access files stored in the
same section of the network, it is possible to build very large
systems. However, there is still a limit on the number of machines
that can share a single file server and a single network
effectively.
A good way to decrease network and server load is to cache file data
on client machines, so that data need not be fetched from the
centralized server each time it is accessed. This technique can
improve the performance of a distributed file system and is used in
a number of working systems. However, caching brings with it the
overhead of maintaining consistency, or cache coherence. That is,
each machine in the network must see the same data in its cache,
even though one machine may be modifying the data as others are
reading it. The problem is to maintain consistency without
dramatically increasing the number of messages that must be passed
between machines on the network.
Some existing file systems take a probabilistic approach to
consistency, some explicitly prevent the activities that can cause
inconsistency, while others provide consistency only at some
cost in functionality or performance. In this dissertation, I
examine how distributed file systems are typically used, and the
degree to which caching might be expected to improve performance. I
then describe a new file system that attempts to cache significantly
more data than other systems, provides strong consistency
guarantees, yet requires few additional messages for cache
management.
This new file system provides fine-grain sharing of a file
concurrently open on multiple machines on the network, at the
granularity of a single byte. It uses a simple system of
multiple-reader, single-writer locks held in a centralized server to
ensure cache consistency. The problems of maintaining client state in
a centralized server are solved by using efficient data structures
and crash recovery techniques.
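[A minimal sketch of the centralized multiple-reader, single-writer
byte-range locking described above; the flat lock table and method
names are our illustration, not the system's data structures.]

class ByteRangeLocks:
    def __init__(self):
        self.locks = []    # (client, start, end, mode), mode 'r' or 'w'

    @staticmethod
    def _overlap(s1, e1, s2, e2):
        return s1 < e2 and s2 < e1

    def acquire(self, client, start, end, mode):
        # Two locks conflict iff they overlap byte ranges, belong to
        # different clients, and at least one of them is a write lock.
        for c, s, e, m in self.locks:
            if c != client and self._overlap(start, end, s, e) \
                   and "w" in (mode, m):
                return False
        self.locks.append((client, start, end, mode))
        return True

    def release(self, client):
        self.locks = [l for l in self.locks if l[0] != client]

mgr = ByteRangeLocks()
assert mgr.acquire("A", 0, 100, "r")
assert mgr.acquire("B", 50, 60, "r")       # readers may share bytes
assert not mgr.acquire("B", 50, 60, "w")   # a writer must wait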
A natural language interface to an intelligent planning
systemCrabtree, I.B.Crouch, R.S.Moffat, D.C.Pirie, N.J.Pulman, S.G.Ritchie, G.D.Tate, B.A.University of Cambridge, Computer Laboratory1989-01enTextUCAM-CL-TR-154ISSN 1476-2986Computational morphology of EnglishPulman, S.G.Russell, G.J.Ritchie, G.D.Black, A.W.University of Cambridge, Computer Laboratory1989-01enTextUCAM-CL-TR-155ISSN 1476-2986
This paper describes an implemented computer program which uses
various kinds of linguistic knowledge to analyse existing or novel
word forms in terms of their components. Three main types of
knowledge are required (for English): knowledge about spelling or
phonological changes consequent upon affixation (notice we are only
dealing with isolated word forms); knowledge about the syntactic or
semantic properties of affixation (i.e. inflexional and derivational
morphology); and knowledge about the properties of the stored base
forms of words (which in our case are always themselves words,
rather than more abstract entities). These three types of
information are stored as data files, represented in exactly the
form a linguist might employ. These data files are then compiled by
the system to produce a run-time program which will analyse
arbitrary word forms presented to it in a way consistent with the
original linguistic description.
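[A toy sketch of the idea: affix rules and a lexicon held as data and
"compiled" into an analyser of surface word forms. The two spelling
rules and the lexical entries are invented, and the real system's
rule notation is a linguist's, not Python.]

LEXICON = {"move": "verb", "walk": "verb"}
SUFFIXES = [
    # (surface suffix, undo the spelling change, grammatical feature)
    ("ing", lambda stem: stem + "e", "present participle"),  # moving
    ("ing", lambda stem: stem,       "present participle"),  # walking
    ("s",   lambda stem: stem,       "3rd person singular"),
]

def analyse(form):
    # Return (base, category, feature) analyses of a surface form.
    results = [(form, LEXICON[form], None)] if form in LEXICON else []
    for suffix, undo, feature in SUFFIXES:
        if form.endswith(suffix):
            base = undo(form[: -len(suffix)])
            if base in LEXICON:
                results.append((base, LEXICON[base], feature))
    return results

print(analyse("moving"))   # [('move', 'verb', 'present participle')]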
Events and VP modifiersPulman, SteveUniversity of Cambridge, Computer Laboratory1989-01enTextUCAM-CL-TR-156ISSN 1476-2986Introducing a priority operator to CCSCamilleri, JuanitoUniversity of Cambridge, Computer Laboratory1989-01enTextUCAM-CL-TR-157ISSN 1476-2986Tailoring output to the user: What does user modelling in
generation mean?Spärck Jones, KarenUniversity of Cambridge, Computer Laboratory1988-08enTextUCAM-CL-TR-158ISSN 1476-2986
This paper examines the implications, for linguistic output
generation tailored to the interactive system user, of earlier
analyses of the components of user modelling and of the constraints
realism imposes on modelling. Using a range of detailed examples it
argues that tailoring based only on the actual dialogue and on the
decision model required for the system task is quite adequate, and
that more ambitious modelling is both dangerous and unnecessary.
Non-trivial power types can’t be subtypes of polymorphic
typesPitts, Andrew M.University of Cambridge, Computer Laboratory1989-01enTextUCAM-CL-TR-159ISSN 1476-2986PFL+: A Kernel Scheme for Functional I/OGordon, AndrewUniversity of Cambridge, Computer Laboratory1989-02enTextUCAM-CL-TR-160ISSN 1476-2986Papers on Poly/MLMatthews, D.C.J.University of Cambridge, Computer Laboratory1989-02enTextUCAM-CL-TR-161ISSN 1476-2986The Alvey natural language tools grammar (2nd
Release)Glover, ClaireBriscoe, TedCarroll, JohnBoguraev, BranUniversity of Cambridge, Computer Laboratory1989-04enTextUCAM-CL-TR-162ISSN 1476-2986Inference in a natural language front end for
databasesCopestake, AnnSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1989-02enTextUCAM-CL-TR-163ISSN 1476-2986
This report describes the implementation and initial testing of
knowledge representation and inference capabilities within a modular
database front end designed for transportability.
A matrix key distribution systemGong, LiWheeler, David J.University of Cambridge, Computer Laboratory1988-10enTextUCAM-CL-TR-164ISSN 1476-2986
A new key distribution scheme is presented. It is based on the
distinctive idea that each node has a set of keys of which it
shares a distinct subset with every other node. This has the
advantage that the numbers of keys that must be distributed and
maintained are reduced by a square root factor; moreover, two nodes
can start conversation with virtually no delay. Two versions of the
scheme are given. Their performance and security analysis shows it
is a practical solution to some key distribution problems.
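[A sketch of the square-root idea under our reading of the abstract:
with n = m*m nodes placed on an m-by-m grid, the node at (r, c) holds
the keys of row r and column c of a key matrix, so any two nodes
already share the two keys at their crossing points and each stores
O(sqrt(n)) keys rather than n-1. The grid layout and the XOR
combination are illustrative choices, not the report's scheme.]

import os

m = 4                                             # n = 16 nodes
K = [[os.urandom(16) for _ in range(m)] for _ in range(m)]

def node_keys(r, c):
    # Keys held by the node at grid position (r, c): its row and column.
    return {("row", r, j): K[r][j] for j in range(m)} | \
           {("col", i, c): K[i][c] for i in range(m)}

def pairwise_key(a, b):
    # Both endpoints hold the two crossing-point keys, so each can
    # derive the same pairwise key with no prior exchange.
    (r1, c1), (r2, c2) = a, b
    return bytes(x ^ y for x, y in zip(K[r1][c2], K[r2][c1]))

a, b = (0, 1), (2, 3)
assert K[0][3] in node_keys(*a).values()          # a holds one crossing key
assert K[2][1] in node_keys(*a).values()          # ...and the other
assert pairwise_key(a, b) == pairwise_key(b, a)   # no start-up delay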
Fast packet switching for integrated servicesNewman, PeterUniversity of Cambridge, Computer Laboratory1989-03enTextUCAM-CL-TR-165ISSN 1476-2986Evolution of operating system structuresBacon, JeanUniversity of Cambridge, Computer Laboratory1989-03enTextUCAM-CL-TR-166ISSN 1476-2986A verified compiler for a verified microprocessorJoyce, Jeffrey J.University of Cambridge, Computer Laboratory1989-03enTextUCAM-CL-TR-167ISSN 1476-2986Distributed computing with a processor bankBacon, J.M.Leslie, I.M.Needham, R.M.University of Cambridge, Computer Laboratory1989-04enTextUCAM-CL-TR-168ISSN 1476-2986Filing in a heterogeneous networkSeaborne, Andrew FranklinUniversity of Cambridge, Computer Laboratory1989-04enTextUCAM-CL-TR-169ISSN 1476-2986Ordered rewriting and confluenceMartin, UrsulaNipkow, TobiasUniversity of Cambridge, Computer Laboratory1989-05enTextUCAM-CL-TR-170ISSN 1476-2986Some types with inclusion properties in ∀, →, μFairbairn, JonUniversity of Cambridge, Computer Laboratory1989-06enTextUCAM-CL-TR-171ISSN 1476-2986
This paper concerns the ∀, →, μ type system used in the non-strict
functional programming language Ponder. While the type system is
akin to the types of Second Order Lambda-calculus, the absence of
type application makes it possible to construct types with useful
inclusion relationships between them.
To illustrate this, the paper contains definitions of a natural
numbers type with many definable subtypes, and of a record type with
inheritance.
A theoretical framework for computer models of cooperative
dialogue, acknowledging multi-agent conflictGalliers, Julia RoseUniversity of Cambridge, Computer Laboratory1989-07enTextUCAM-CL-TR-172ISSN 1476-2986Programming in temporal logicHale, Roger William StephenUniversity of Cambridge, Computer Laboratory1989-07enTextUCAM-CL-TR-173ISSN 1476-2986General theory relating to the implementation of concurrent
symbolic computationClarke, James Thomas WoodchurchUniversity of Cambridge, Computer Laboratory1989-08enTextUCAM-CL-TR-174ISSN 1476-2986A formulation of the simple theory of types (for
Isabelle)Paulson, Lawrence C.University of Cambridge, Computer Laboratory1989-08enTextUCAM-CL-TR-175ISSN 1476-2986
Simple type theory is formulated for use with the generic theorem
prover Isabelle. This requires explicit type inference rules. There
are function, product, and subset types, which may be empty.
Descriptions (the eta-operator) introduce the Axiom of Choice.
Higher-order logic is obtained through reflection between formulae
and terms of type bool. Recursive types and functions can be
formally constructed.
Isabelle proof procedures are described. The logic appears suitable
for general mathematics as well as computational problems.
Implementing aggregates in parallel functional
languagesClarke, T.J.W.University of Cambridge, Computer Laboratory1989-08enTextUCAM-CL-TR-176ISSN 1476-2986Experimenting with Isabelle in ZF Set TheoryNoel, P.A.J.University of Cambridge, Computer Laboratory1989-09enTextUCAM-CL-TR-177ISSN 1476-2986Totally verified systems: linking verified software to
verified hardwareJoyce, Jeffrey J.University of Cambridge, Computer Laboratory1989-09enTextUCAM-CL-TR-178ISSN 1476-2986
We describe exploratory efforts to design and verify a compiler for
a formally verified microprocessor as one aspect of the eventual
goal of building totally verified systems. Together with a formal
proof of correctness for the microprocessor this yields a precise
and rigorously established link between the semantics of the source
language and the execution of compiled code by the fabricated
microchip. We describe in particular: (1) how the limitations of
real hardware influenced this proof; and (2) how the general
framework provided by higher order logic was used to formalize the
compiler correctness problem for a hierarchically structured
language.
Automating SquiggolMartin, UrsulaNipkow, TobiasUniversity of Cambridge, Computer Laboratory1989-09enTextUCAM-CL-TR-179ISSN 1476-2986Formal verification of data type refinement : Theory and
practiceNipkow, TobiasUniversity of Cambridge, Computer Laboratory1989-09enTextUCAM-CL-TR-180ISSN 1476-2986Proof transformations for equational theoriesNipkow, TobiasUniversity of Cambridge, Computer Laboratory1989-09enTextUCAM-CL-TR-181ISSN 1476-2986The theory and implementation of a bidirectional question
answering systemLevine, John M.Fedder, LeeUniversity of Cambridge, Computer Laboratory1989-10enTextUCAM-CL-TR-182ISSN 1476-2986The specification and verification of sliding window
protocols in higher order logicCardell-Oliver, RachelUniversity of Cambridge, Computer Laboratory1989-10enTextUCAM-CL-TR-183ISSN 1476-2986Site interconnection and the exchange
architectureTennenhouse, David LawrenceUniversity of Cambridge, Computer Laboratory1989-10enTextUCAM-CL-TR-184ISSN 1476-2986Logics of DomainsZhang, Guo QiangUniversity of Cambridge, Computer Laboratory1989-12enTextUCAM-CL-TR-185ISSN 1476-2986Protocol design for high speed networksMcAuley, Derek RobertUniversity of Cambridge, Computer Laboratory1990-01enTextUCAM-CL-TR-186ISSN 1476-2986
Improvements in fibre optic communication and in VLSI for network
switching components have led to the consideration of building
digital switched networks capable of providing point to point
communication in the gigabit per second range. Provision of
bandwidths of this magnitude allows the consideration of a whole new
range of telecommunications services, integrating video, voice,
image and text. These multi-service networks have a range of
requirements not met by traditional network architectures designed
for digital telephony or computer applications. This dissertation
describes the design, and an implementation, of the Multi-Service
Network architecture and protocol family, which is aimed at
supporting these services.
Asynchronous transfer mode networks provide the basic support
required for these integrated services, and the Multi-Service
Network architecture is designed primarily for these types of
networks. The aim of the Multi-Service protocol family is to provide
a complete architecture which allows use of the full facilities of
asynchronous transfer mode networks by multi-media applications. To
maintain comparable performance with the underlying media, certain
elements of the MSN protocol stack are designed with implementation
in hardware in mind. The interconnection of heterogeneous networks,
and networks belonging to different security and administrative
domains, is considered vital, so the MSN architecture takes an
internetworking approach.
Natural language interfaces to databasesCopestake, AnnSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1989-09enTextUCAM-CL-TR-187ISSN 1476-2986Specification of computer architectures: a survey and
annotated bibliographyLeonard, Timothy E.University of Cambridge, Computer Laboratory1990-01enTextUCAM-CL-TR-188ISSN 1476-2986Isabelle tutorial and user’s manualPaulson, Lawrence C.Nipkow, TobiasUniversity of Cambridge, Computer Laboratory1990-01enTextUCAM-CL-TR-189ISSN 1476-2986
This (obsolete!) manual describes how to use the theorem prover
Isabelle. For beginners, it explains how to perform simple
single-step proofs in the built-in logics. These include first-order
logic, a classical sequent calculus, ZF set theory, Constructive Type
Theory, and higher-order logic. Each of these logics is described.
The manual then explains how to develop advanced tactics and
tacticals and how to derive rules. Finally, it describes how to
define new logics within Isabelle.
Some notes on mass terms and pluralsCopestake, AnnUniversity of Cambridge, Computer Laboratory1990-01enTextUCAM-CL-TR-190ISSN 1476-2986An architecture for real-time multimedia communications
systemsNicolaou, CosmosUniversity of Cambridge, Computer Laboratory1990-02enTextUCAM-CL-TR-191ISSN 1476-2986
An architecture for real-time multimedia communications systems is
presented. A multimedia communication system includes both the
communication protocols used to transport the real-time data and
also the Distributed Computing System (DCS) within which any
applications using these protocols must execute. The architecture
presented attempts to integrate these protocols with the DCS in a
smooth fashion in order to ease the writing of multimedia
applications. Two issues are identified as being essential to the
success of this integration: namely the synchronisation of related
real-time data streams, and the management of heterogeneous
multimedia hardware. The synchronisation problem is tackled by
defining explicit synchronisation properties at the presentation
level and by providing control and synchronisation operations within
the DCS which operate in terms of these properties. The
heterogeneity problems are addressed by separating the data
transport semantics (protocols themselves) from the control
semantics (protocol interfaces). The control semantics are
implemented using a distributed, typed interface scheme within the
DCS (i.e. above the presentation layer), whilst the protocols
themselves are implemented within the communication subsystem. The
interface between the DCS and communications subsystem is referred
to as the orchestration interface and can be considered to lie in
the presentation and session layers.
A conforming prototype implementation is currently under
construction.
Designing a theorem proverPaulson, Lawrence C.University of Cambridge, Computer Laboratory1990-05enTextUCAM-CL-TR-192ISSN 1476-2986
The methods and principles of theorem prover design are presented
through an extended example. Starting with a sequent calculus for
first-order logic, an automatic prover (called Folderol) is
developed. Folderol can prove quite a few complicated theorems,
although its search strategy is crude and limited. Folderol is coded
in Standard ML and consists largely of pure functions. Its complete
listing is included.
The report concludes with a survey of other research in theorem
proving: the Boyer/Moore theorem prover, Automath, LCF, and
Isabelle.
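[A toy prover in the spirit of the report's extended example, cut
down from Folderol's first-order sequent calculus to propositional
logic and transcribed into Python; the formula encoding and the code
are ours, not the report's Standard ML listing.]

# Formulas: an atom is a string; compound formulas are tuples
# ('not', p), ('and', p, q), ('or', p, q) or ('imp', p, q).

def prove(left, right):
    # Is the sequent  left |- right  provable? Axiom, else apply a rule.
    if set(left) & set(right):
        return True
    for i, f in enumerate(left):
        if isinstance(f, tuple):
            rest, op = left[:i] + left[i+1:], f[0]
            if op == "not": return prove(rest, right + [f[1]])
            if op == "and": return prove(rest + [f[1], f[2]], right)
            if op == "or":  return (prove(rest + [f[1]], right) and
                                    prove(rest + [f[2]], right))
            if op == "imp": return (prove(rest, right + [f[1]]) and
                                    prove(rest + [f[2]], right))
    for i, f in enumerate(right):
        if isinstance(f, tuple):
            rest, op = right[:i] + right[i+1:], f[0]
            if op == "not": return prove(left + [f[1]], rest)
            if op == "and": return (prove(left, rest + [f[1]]) and
                                    prove(left, rest + [f[2]]))
            if op == "or":  return prove(left, rest + [f[1], f[2]])
            if op == "imp": return prove(left + [f[1]], rest + [f[2]])
    return False   # only atoms remain and no axiom applies

# Peirce's law, ((P -> Q) -> P) -> P, is classically valid:
assert prove([], [("imp", ("imp", ("imp", "P", "Q"), "P"), "P")])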
Belief revision and a theory of communicationGalliers, Julia RoseUniversity of Cambridge, Computer Laboratory1990-05enTextUCAM-CL-TR-193ISSN 1476-2986Proceedings of the First Belief Representation and Agent
Architectures WorkshopGalliers, Julia RoseUniversity of Cambridge, Computer Laboratory1990-03enTextUCAM-CL-TR-194ISSN 1476-2986Multi-level verification of microprocessor-based
systemsJoyce, Jeffrey J.University of Cambridge, Computer Laboratory1990-05enTextUCAM-CL-TR-195ISSN 1476-2986The semantics of VHDL with Val and Hol: towards practical
verification toolsVan Tassell, John PeterUniversity of Cambridge, Computer Laboratory1990-06enTextUCAM-CL-TR-196ISSN 1476-2986The semantics and implementation of aggregates : or : how to
express concurrency without destroying determinismClarke, ThomasUniversity of Cambridge, Computer Laboratory1990-07enTextUCAM-CL-TR-197ISSN 1476-2986Evaluation LogicPitts, Andrew M.University of Cambridge, Computer Laboratory1990-08enTextUCAM-CL-TR-198ISSN 1476-2986The HOL verification of ELLA designsBoulton, RichardGordon, MikeHerbert, JohnVan Tassel, JohnUniversity of Cambridge, Computer Laboratory1990-08enTextUCAM-CL-TR-199ISSN 1476-2986
HOL is a public domain system for generating proofs in higher order
predicate calculus. It has been in experimental and commercial use
in several countries for a number of years.
ELLA is a hardware design language developed at the Royal Signals
and Radar Establishment (RSRE) and marketed by Computer General
Electronic Design. It supports simulation models at a variety of
different abstraction levels.
A preliminary methodology for reasoning about ELLA designs using HOL
is described. Our approach is to semantically embed a subset of the
ELLA language in higher order logic, and then to make this embedding
convenient to use with parsers and pretty-printers. There are a
number of semantic issues that may affect the ease of verification.
We discuss some of these briefly. We also give a simple example to
illustrate the methodology.
Type classes and overloading resolution via order-sorted
unificationNipkow, TobiasSnelting, GregorUniversity of Cambridge, Computer Laboratory1990-08enTextUCAM-CL-TR-200ISSN 1476-2986Formalizing abstraction mechanisms for hardware verification
in higher order logicMelham, Thomas FrederickUniversity of Cambridge, Computer Laboratory1990-08enTextUCAM-CL-TR-201ISSN 1476-2986
Recent advances in microelectronics have given designers of digital
hardware the potential to build devices of remarkable size and
complexity. Along with this however, it becomes increasingly
difficult to ensure that such systems are free from design errors,
where complete simulation of even moderately sized circuits is
impossible. One solution to these problems is that of hardware
verification, where the functional behaviour of the hardware is
described mathematically and formal proof is used to show that the
design meets rigorous specifications of the intended operation.
This dissertation therefore seeks to develop this, showing how
reasoning about the correctness of hardware using formal proof can
be achieved using fundamental abstraction mechanisms to relate
specifications of hardware at different levels. Therefore a
systematic method is described for defining any instance of a wide
class of concrete data types in higher order logic. This process has
been automated in the HOL theorem prover, and provides a firm
logical basis for representing data in formal specifications.
Further, these abstractions have been developed into a new technique
for modelling the behaviour of entire classes of hardware designs.
This is based on a formal representation in logic for the structure
of circuit designs using the recursive types defined by the above
method. Two detailed examples are presented showing how this work
can be applied in practice.
Finally, some techniques for temporal abstraction are explained, and
the means for asserting the correctness of a model containing
time-dependent behaviour is described. This work is then illustrated
using a case study; the formal verification on HOL of a simple ring
communication network.
[Abstract by Nicholas Cutler (librarian), as none was submitted with
the report.]
Three-dimensional integrated circuit layoutHarter, Andrew CharlesUniversity of Cambridge, Computer Laboratory1990-08enTextUCAM-CL-TR-202ISSN 1476-2986Subtyping in Ponder (preliminary report)de Paiva, Valeria C.V.University of Cambridge, Computer Laboratory1990-08enTextUCAM-CL-TR-203ISSN 1476-2986
This note starts the formal study of the type system of the
functional language Ponder. Some of the problems of proving
soundness and completeness are discussed, and some preliminary
results about fragments of the type system are shown.
It consists of 6 sections. In section 1 we review briefly Ponder’s
syntax and describe its typing system. In section 2 we consider a
very restricted fragment of the language for which we can prove
soundness of the type inference mechanism, but not completeness.
Section 3 describes possible models of this fragment and some
related work. Section 4 describes the type-inference algorithm for a
larger fragment of Ponder and in section 5 we come up against some
problematic examples. Section 6 is a summary of further work.
New foundations for fixpoint computations:
FIX-hyperdoctrines and the FIX-logicCrole, Roy L.Pitts, Andrew M.University of Cambridge, Computer Laboratory1990-08enTextUCAM-CL-TR-204ISSN 1476-2986Logic programming, functional programming and inductive
definitionsPaulson, Lawrence C.Smith, Andrew W.University of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-205ISSN 1476-2986
This paper reports an attempt to combine logic and functional
programming. It also questions the traditional view that logic
programming is a form of first-order logic, arguing instead that the
essential nature of a logic program is an inductive definition. This
revised view of logic programming suggests the design of a combined
logic/functional language. A slow but working prototype is
described.
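[Our illustration of the paper's thesis: read as an inductive
definition, the Prolog program for append – append([], Ys, Ys) and
append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs) – defines the least
relation closed under two rules:]

\[
\frac{}{\mathit{append}(\mathit{nil},\, ys,\, ys)}
\qquad
\frac{\mathit{append}(xs,\, ys,\, zs)}{\mathit{append}(x{:}xs,\, ys,\, x{:}zs)}
\]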
Formal verification of real-time protocols using higher
order logicCardell-Oliver, RachelUniversity of Cambridge, Computer Laboratory1990-08enTextUCAM-CL-TR-206ISSN 1476-2986Video replay in computer animationHawkins, Stuart PhilipUniversity of Cambridge, Computer Laboratory1990-10enTextUCAM-CL-TR-207ISSN 1476-2986Categorical combinators for the calculus of
constructionsRitter, EikeUniversity of Cambridge, Computer Laboratory1990-10enTextUCAM-CL-TR-208ISSN 1476-2986Efficient memory-based learning for robot controlMoore, Andrew WilliamUniversity of Cambridge, Computer Laboratory1990-11enTextUCAM-CL-TR-209ISSN 1476-2986
This dissertation is about the application of machine learning to
robot control. A system which has no initial model of the
robot/world dynamics should be able to construct such a model using
data received through its sensors—an approach which is formalized
here as the SAB (State-Action-Behaviour) control cycle. A method of
learning is presented in which all the experiences in the lifetime
of the robot are explicitly remembered. The experiences are stored
in a manner which permits fast recall of the closest previous
experience to any new situation, thus permitting very quick
predictions of the effects of proposed actions and, given a goal
behaviour, permitting fast generation of a candidate action. The
learning can take place in high-dimensional non-linear control
spaces with real-valued ranges of variables. Furthermore, the method
avoids a number of shortcomings of earlier learning methods in which
the controller can become trapped in inadequate performance which
does not improve. Also considered is how the system is made
resistant to noisy inputs and how it adapts to environmental
changes. A well founded mechanism for choosing actions is introduced
which solves the experiment/perform dilemma for this domain with
adequate computational efficiency, and with fast convergence to the
goal behaviour. The dissertation explains in detail how the SAB
control cycle can be integrated into both low and high complexity
tasks. The methods and algorithms are evaluated with numerous
experiments using both real and simulated robot domains. The final
experiment also illustrates how a compound learning task can be
structured into a hierarchy of simple learning tasks.
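[A sketch of the memory-based idea behind the SAB cycle: store every
(state, action, behaviour) experience and answer queries by nearest
neighbour. The brute-force search and the one-dimensional domain are
illustrative; the dissertation's point is precisely that recall can
be made fast.]

import math

class SABMemory:
    def __init__(self):
        self.experiences = []          # (state, action, behaviour)

    def remember(self, state, action, behaviour):
        self.experiences.append((state, action, behaviour))

    def predict(self, state, action):
        # Predict behaviour by copying the closest stored experience.
        return min(self.experiences,
                   key=lambda e: math.dist(e[:2], (state, action)))[2]

    def choose_action(self, state, goal):
        # Pick the remembered action whose outcome best matched goal.
        return min(self.experiences,
                   key=lambda e: abs(e[0] - state) + abs(e[2] - goal))[1]

mem = SABMemory()
for a in (0.0, 0.5, 1.0):              # exploratory pushes of force a
    mem.remember(state=0.0, action=a, behaviour=2.0 * a)
print(mem.choose_action(state=0.0, goal=1.0))    # -> 0.5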
Higher-order unification, polymorphism, and
subsortsNipkow, TobiasUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-210ISSN 1476-2986The role of artificial intelligence in information
retrievalSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1990-11enTextUCAM-CL-TR-211ISSN 1476-2986A distributed and-or parallel Prolog networkWrench, K.L.University of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-212ISSN 1476-2986The Dialectica categoriesde Paiva, Valeria Correa VazUniversity of Cambridge, Computer Laboratory1991-01enTextUCAM-CL-TR-213ISSN 1476-2986
This work consists of two main parts. The first one, which gives it
its name, presents an internal categorical version of Gödel’s
“Dialectica interpretation” of higher-order arithmetic. The idea is
to analyse the Dialectica interpretation using a category DC where
objects are relations on objects of a basic category C and maps are
pairs of maps of C satisfying a pullback condition. If C is finitely
complete, DC exists and has a very natural symmetric monoidal
structure. If C is locally cartesian closed then DC is symmetric
monoidal closed. If we assume C with stable and disjoint coproducts,
DC has cartesian products and weak-coproducts and satisfies a weak
form of distributivity. Using the structure above, DC is a
categorical model for intuitionistic linear logic.
Moreover if C has free monoids then DC has cofree comonoids and the
corresponding comonad “!” on DC, which has some special properties,
can be used to model the exponential “of course!” in Intuitionistic
Linear Logic. The category of “!”-coalgebras is isomorphic to the
category of comonoids in DC and, if we assume commutative monoids in
C, the “!”-Kleisli category, which is cartesian closed, corresponds
to the Diller-Nahm variant of the Dialectica interpretation.
The second part introduces the categories GC. The objects of GC are
the same objects of DC, but morphisms are easier to handle, since
they are maps in C in opposite directions. If C is finitely
complete, the category GC exists. If C is cartesian closed, we can
define a symmetric monoidal structure and if C is locally cartesian
closed as well, we can define internal homs in GC that make it a
symmetric monoidal closed category. Supposing C with stable and
disjoint coproducts, we can define cartesian products and coproducts
in GC and, more interestingly, we can define a dual operation to the
tensor product bifunctor, called “par”. The operation “par” is a
bifunctor and has a unit “⊥”, which is a dualising object. Using the
internal hom and ⊥ we define a contravariant functor “(−)⊥” which
behaves like negation and thus it is used to model linear negation.
We show that the category GC, with all the structure above, is a
categorical model for Linear Logic, but not exactly the classical
one.
In the last chapter a comonad and a monad are defined to model the
exponentials “!” and “?”. To define these endofunctors, we use
Beck’s distributive laws in an interesting way. Finally, we show
that the Kleisli category GC! is cartesian closed and that the
categories DC and GC are related by a Kleisli construction.
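[For orientation, in our notation for the set-based case: an object
of GC is a relation \alpha on A \times X, and a morphism is a pair of
maps in opposite directions satisfying a semi-commutation condition:]

\[
(f, F)\colon (A, X, \alpha) \to (B, Y, \beta),
\qquad f\colon A \to B,\quad F\colon Y \to X,
\]
\[
\text{such that}\quad a \,\alpha\, F(y) \;\Rightarrow\; f(a) \,\beta\, y
\quad\text{for all } a \in A,\ y \in Y.
\]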
Integrating knowledge of purpose and knowledge of structure
for design evaluationBradshaw, J.A.Young, R.M.University of Cambridge, Computer Laboratory1991-02enTextUCAM-CL-TR-214ISSN 1476-2986A structured approach to the verification of low level
microcodeCurzon, PaulUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-215ISSN 1476-2986
Errors in microprograms are especially serious since all higher
level programs on the machine depend on the microcode. Formal
verification presents one avenue which may be used to discover such
errors. Previous systems which have been used for formally verifying
microcode may be categorised by the form in which the microcode is
supplied. Some demand that it be written in a high level
microprogramming language. Conventional software verification
techniques are then applied. Other methods allow the microcode to be
supplied in the form of a memory image. It is treated as data to an
interpreter modelling the behaviour of the microarchitecture. The
proof is then performed by symbolic execution. A third solution is
for the code to be supplied in an assembly language and modelled at
that level. The assembler instructions are converted to commands in
a modelling language. The resulting program is verified using
traditional software verification techniques.
In this dissertation I present a new universal microprogram
verification system. It achieves many of the advantages of the other
kinds of systems by adopting a hybrid approach. The microcode is
supplied as a memory image, but it is transformed by the system to a
high level program which may be verified using standard software
verification techniques. The structure of the high level program is
obtained from user supplied documentation. I show that this allows
microcode to be split into small, independently validatable portions
even when it was not written in that way. I also demonstrate that
the techniques allow the complexity of detail due to the underlying
microarchitecture to be controlled at an early stage in the
validation process. I suggest that the system described would
combine well with other validation tools and provide help throughout
the firmware development cycle. Two case studies are given. The
first describes the verification of Gordon’s computer. This example,
being fairly simple, provides a good illustration of the techniques
used by the system. The second case study is concerned with the High
Level Hardware Orion computer which is a commercially produced
machine with a fairly complex microarchitecture. This example shows
that the techniques scale well to production microarchitectures.
Exploiting OR-parallelism in Prolog using multiple
sequential machinesKlein, Carole SusanUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-216ISSN 1476-2986Dynamic bandwidth managementHarita, Bhaskar RamanathanUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-217ISSN 1476-2986Higher-order critical pairsNipkow, TobiasUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-218ISSN 1476-2986Fairisle project working documents : Snapshot 1Leslie, Ian M.McAuley, Derek M.Hayter, MarkBlack, RichardBeller, RetoNewman, PeterDoar, MatthewUniversity of Cambridge, Computer Laboratory1991-03enTextUCAM-CL-TR-219ISSN 1476-2986A distributed architecture for multimedia communication
systemsNicolaou, Cosmos AndreaUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-220ISSN 1476-2986Transforming axioms for data types into sequential
programsMilne, RobertUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-221ISSN 1476-2986
A process is proposed for refining specifications of abstract data
types into efficient sequential implementations. The process needs
little manual intervention. It is split into three stages, not all
of which need always be carried out. The three stages entail
interpreting equalities as behavioural equivalences, converting
functions into procedures and replacing axioms by programs. The
stages can be performed as automatic transformations which are
certain to produce results that meet the specifications, provided
that simple conditions hold. These conditions describe the adequacy
of the specifications, the freedom from interference between the
procedures, and the mode of construction of the procedures.
Sufficient versions of these conditions can be checked
automatically. Varying the conditions could produce implementations
for different classes of specification. Though the transformations
could be automated, the intermediate results, in styles of
specification which cover both functions and procedures, have
interest in their own right and may be particularly appropriate to
object-oriented design.
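[A sketch of the flavour of the second and third stages: a function
such as push, specified equationally by top(push(s, x)) = x and
pop(push(s, x)) = s, is re-cast as a procedure acting on mutable
state. The stack example and the behavioural check are ours.]

class Stack:
    def __init__(self):
        self._items = []
    def push(self, x):
        # push : Stack x Item -> Stack, recast as a state-changing procedure
        self._items.append(x)
    def pop(self):
        self._items.pop()
    def top(self):
        return self._items[-1]

# Equalities become behavioural equivalences, checked on traces:
s = Stack()
s.push(1); s.push(2)
assert s.top() == 2      # top(push(s, x)) = x
s.pop()
assert s.top() == 1      # pop(push(s, x)) = s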
Extensions to coloured petri nets and their application to
protocolsBillington, JonathanUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-222ISSN 1476-2986Shallow processing and automatic summarising: a first
studyGladwin, PhilipPulman, StephenSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1991-05enTextUCAM-CL-TR-223ISSN 1476-2986Generalised probabilistic LR parsing of natural language
(corpora) with unification-based grammarsBriscoe, TedCarroll, JohnUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-224ISSN 1476-2986Categorical multirelations, linear logic and petri nets
(draft)de Paiva, ValeriaUniversity of Cambridge, Computer Laboratory1991-05enTextUCAM-CL-TR-225ISSN 1476-2986
This note presents a categorical treatment of multirelations, which
is, in a loose sense, a generalisation of both our previous work on
the categories GC, and of Chu’s construction A_NC [Barr’79]. The
main motivation for writing this note was the utilisation of the
category GC by Brown and Gurr [BG90] to model Petri nets. We wanted
to extend their work to deal with multirelations, as Petri nets are
usually modelled using multirelations pre and post. That proved easy
enough, and people interested mainly in concurrency theory should
refer to our joint work [BGdP’91]; this note deals with the
mathematics underlying [BGdP’91]. The upshot of this work is that we
build a model of Intuitionistic Linear Logic (without modalities)
over any symmetric monoidal category C with a distinguished object
(N, ≤, ∘, e, −∘) – a closed poset. Moreover, if the category C is
cartesian closed with free monoids, we build a model of
Intuitionistic Linear Logic with a non-trivial modality ‘!’ over it.
A new approach for improving system availabilityLam, Kwok-yanUniversity of Cambridge, Computer Laboratory1991-06enTextUCAM-CL-TR-226ISSN 1476-2986Priority in process calculiCamilleri, Juanito AlbertUniversity of Cambridge, Computer Laboratory1991-06enTextUCAM-CL-TR-227ISSN 1476-2986The desk area networkHayter, MarkMcAuley, DerekUniversity of Cambridge, Computer Laboratory1991-05enTextUCAM-CL-TR-228ISSN 1476-2986
A novel architecture for use within an end computing system is
described. This attempts to extend the concepts used in modern high
speed networks into computer system design. A multimedia workstation
is being built based on this concept to evaluate the approach.
Abstraction of image and pixel : The thistle display
systemBrown, David J.University of Cambridge, Computer Laboratory1991-08enTextUCAM-CL-TR-229ISSN 1476-2986Proceedings of the second belief representation and agent
architectures workshop (BRAA ’91)Galliers, J.University of Cambridge, Computer Laboratory1991-08enTextUCAM-CL-TR-230ISSN 1476-2986Managing the order of transactions in widely-distributed
data systemsYahalom, RaphaelUniversity of Cambridge, Computer Laboratory1991-08enTextUCAM-CL-TR-231ISSN 1476-2986Mechanising set theoryCorella, FranciscoUniversity of Cambridge, Computer Laboratory1991-07enTextUCAM-CL-TR-232ISSN 1476-2986
Set theory is today the standard foundation of mathematics, but most
proof development systems (PDS) are based on type theory rather than
set theory. This is due in part to the difficulty of reducing the
rich mathematical vocabulary to the economical vocabulary of set
theory. It is known how to do this in principle, but traditional
explanations of mathematical notations in set theoretic terms do not
lend themselves easily to mechanical treatment.
We advocate the representation of mathematical notations in a formal
system consisting of the axioms of any version of ordinary set
theory, such as ZF, but within the framework of higher-order logic
with λ-conversion (H.O.L.) rather than first-order logic (F.O.L.).
In this system each notation can be represented by a constant, which
has a higher-order type when the notation binds variables. The
meaning of the notation is given by an axiom which defines the
representing constant, and the correspondence between the ordinary
syntax of the notation and its representation in the formal language
is specified by a rewrite rule. The collection of rewrite rules
comprises a rewriting system of a kind which is computationally well
behaved.
The formal system is justified by the fact that set theory within
H.O.L. is a conservative extension of set theory within F.O.L.
Besides facilitating the representation of notations, the formal
system is of interest because it permits the use of mathematical
methods which do not seem to be available in set theory within
F.O.L.
A PDS, called Watson, has been built to demonstrate this approach to
the mechanization of mathematics. Watson embodies a methodology for
interactive proof which provides both flexibility of use and a
relative guarantee of correctness. Results and proofs can be saved,
and can be perused and modified with an ordinary text editor. The
user can specify his own notations as rewrite rules and adapt the
mix of notations to suit the problem at hand; it is easy to switch
from one set of notations to another. As a case study, Watson has
been used to prove the correctness of a latch implemented as two
cross-coupled nor-gates, with an approximation of time as a
continuum.
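As a concrete instance of the representation scheme sketched above
(our example, not taken from the report), bounded quantification can
be captured by a higher-order constant Ball, whose defining axiom
gives its meaning and whose rewrite rule links the ordinary syntax to
the formal language:

  Ball(A, P)  ↔  ∀x. x ∈ A → P(x)         (defining axiom)
  ∀x∈A. P(x)  ⇝  Ball(A, λx. P(x))        (rewrite rule for the syntax)

The constant has a higher-order type precisely because the notation
binds the variable x.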
A development environment for large natural language
grammarsCarroll, JohnBriscoe, TedGrover, ClaireUniversity of Cambridge, Computer Laboratory1991-07enTextUCAM-CL-TR-233ISSN 1476-2986Two tutorial papers: Information retrieval &
ThesaurusSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1991-08enTextUCAM-CL-TR-234ISSN 1476-2986
The first paper describes the characteristics of information
retrieval from documents or texts, the development and status of
automatic indexing and retrieval, and the actual and potential
relations between information retrieval and artificial intelligence.
The second paper discusses the properties, construction and actual
and potential uses of thesauri, as semantic classifications or
terminological knowledge bases, in information retrieval and natural
language processing.
Modelling and image generationWang, HengUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-235ISSN 1476-2986Using knowledge of purpose and knowledge of structure as a
basis for evaluating the behaviour of mechanical systemsBradshaw, John AnthonyUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-236ISSN 1476-2986Computing presuppositions in an incremental language
processing systemBridge, Derek G.University of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-237ISSN 1476-2986Proceedings of the ACQUILEX workshop on default inheritance
in the lexiconBriscoe, TedCopestake, Annde Paiva, ValeriaUniversity of Cambridge, Computer Laboratory1991-10enTextUCAM-CL-TR-238ISSN 1476-2986Planning multisentential English text using communicative
actsMaybury, Mark ThomasUniversity of Cambridge, Computer Laboratory1991-12enTextUCAM-CL-TR-239ISSN 1476-2986
The goal of this research is to develop explanation presentation
mechanisms for knowledge based systems which enable them to define
domain terminology and concepts, narrate events, elucidate plans,
processes, or propositions and argue to support a claim or advocate
action. This requires the development of devices which select,
structure, order and then linguistically realize explanation content
as coherent and cohesive English text.
With the goal of identifying generic explanation presentation
strategies, a wide range of naturally occurring texts were analyzed
with respect to their communicative structure, function, content and
intended effects on the reader. This motivated an integrated theory
of communicative acts which characterizes text at the level of
rhetorical acts (e.g. describe, define, narrate), illocutionary acts
(e.g. inform, request), and locutionary acts (ask, command). Taken
as a whole, the identified communicative acts characterize the
structure, content and intended effects of four types of text:
description, narration, exposition, argument. These text types have
distinct effects such as getting the reader to know about entities,
to know about events, to understand plans, processes, or
propositions, or to believe propositions or want to perform actions.
In addition to identifying the communicative function and effect of
text at multiple levels of abstraction, this dissertation details a
tripartite theory of focus of attention (discourse focus, temporal
focus and spatial focus) which constrains the planning and
linguistic realization of text.
To test the integrated theory of communicative acts and tripartite
theory of focus of attention, a text generation system TEXPLAN
(Textual EXplanation PLANner) was implemented that plans and
linguistically realizes multisentential and multiparagraph
explanations from knowledge based systems. The communicative acts
identified during text analysis were formalized as over sixty
compositional and (in some cases) recursive plan operators in the
library of a hierarchical planner. Discourse, temporal and spatial
models were implemented to track and use attentional information to
guide the organization and realization of text. Because the plan
operators distinguish between the communicative function (e.g. argue
for a proposition) and the expected effect (e.g. the reader believes
the proposition) of communicative acts, the system is able to
construct a discourse model of the structure and function of its
textual responses as well as a user model of the expected effects of
its responses on the reader’s knowledge, beliefs, and desires. The
system uses both the discourse model and user model to guide
subsequent utterances. To test its generality, the system was
interfaced to a variety of domain applications including a
neuropsychological diagnosis system, a mission planning system, and
a knowledge based mission simulator. The system produces
descriptions, narratives, expositions and arguments from these
applications, thus exhibiting a broader range of rhetorical
coverage than previous text generation systems.
Symbolic compilation and execution of programs by proof: a
case study in HOLCamilleri, JuanitoUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-240ISSN 1476-2986Learning in large state spaces with an application to biped
robot walkingVogel, Thomas UlrichUniversity of Cambridge, Computer Laboratory1991-12enTextUCAM-CL-TR-241ISSN 1476-2986An object oriented approach to virtual memory
managementMapp, Glenford EzraUniversity of Cambridge, Computer Laboratory1992-01enTextUCAM-CL-TR-242ISSN 1476-2986
Advances in computer technology are being pooled together to form a
new computing environment which is characterised by powerful
workstations with vast amounts of memory connected to high speed
networks. This environment will provide a large number of diverse
services such as multimedia communications, expert systems and
object-oriented databases. In order to develop these complex
applications in an efficient manner, new interfaces are required
which are simple, fast and flexible and allow the programmer to use
an object-oriented approach throughout the design and implementation
of an application. Virtual memory techniques are increasingly being
used to build these new facilities.
In addition, since CPU speeds continue to increase faster than disk
speeds, an I/O bottleneck may develop in which the CPU is idle
for long periods waiting for paging requests to be satisfied. To
overcome this problem it is necessary to develop new paging
algorithms that better reflect how different objects are used. Thus
a facility to page objects on a per-object basis is required and a
testbed is also needed to obtain experimental data on the paging
activity of different objects.
Virtual memory techniques, previously only used in mainframe and
minicomputer architectures, are being employed in the memory
management units of modern microprocessors. With very large address
spaces becoming a standard feature of most systems, the use of
memory mapping is seen as an effective way of providing greater
flexibility as well as improved system efficiency.
This thesis presents an object-oriented interface for memory mapped
objects. Each object has a designated object type. Handles are
associated with different object types and the interface allows
users to define and manage new object types. Moving data between the
object and its backing store is done by user-level processes called
object managers. Object managers interact with the kernel via a
specified interface thus allowing users to build their own object
managers. A framework to compare different algorithms was also
developed and an experimental testbed was designed to gather and
analyse data on the paging activity of various programs. Using the
testbed, conventional paging algorithms were applied to different
types of objects and the results were compared. New paging
algorithms were designed and implemented for objects that are
accessed in a highly sequential manner.
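The interface style described might be sketched as a per-object-type
pager contract between the kernel and user-level object managers
(purely illustrative; all names here are invented):

  import Data.Word (Word8)

  newtype ObjectId = ObjectId Int
  newtype PageNo   = PageNo Int
  type Page        = [Word8]    -- stand-in for one page of bytes

  -- Each object type supplies its own manager; the kernel calls it
  -- to move data between an object and its backing store.
  class ObjectManager m where
    pageIn  :: m -> ObjectId -> PageNo -> IO Page         -- on a page fault
    pageOut :: m -> ObjectId -> PageNo -> Page -> IO ()   -- on eviction

Because the kernel interface is fixed while the manager is a
user-level process, a new paging policy (for, say, highly sequential
objects) is just another instance of the contract.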
Automating the librarian: a fundamental approach using
belief revisionCawsey, AlisonGalliers, JuliaReece, StenevSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1992-01enTextUCAM-CL-TR-243ISSN 1476-2986A mechanized theory of the π-calculus in HOLMelham, T.F.University of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-244ISSN 1476-2986System support for multi-service trafficDixon, Michael J.University of Cambridge, Computer Laboratory1992-01enTextUCAM-CL-TR-245ISSN 1476-2986
Digital network technology is now capable of supporting the
bandwidth requirements of diverse applications such as voice, video
and data (so called multi-service traffic). Some media, for example
voice, have specific transmission requirements regarding the maximum
packet delay and loss which they can tolerate. Problems arise when
attempting to multiplex such traffic over a single channel.
Traditional digital networks based on the Packet- (PTM) and
Synchronous- (STM) Transfer Modes prove unsuitable due to their
media access contention and inflexible bandwidth allocation
properties respectively. The Asynchronous Transfer Mode (ATM) has
been proposed as a compromise between the PTM and STM techniques.
The current state of multimedia research suggests that a significant
amount of multi-service traffic will be handled by computer
operating systems. Unfortunately conventional operating systems are
largely unsuited to such a task. This dissertation is concerned with
the system organisation necessary in order to extend the benefits of
ATM networking through the endpoint operating system and up to the
application level. A locally developed micro-kernel, with ATM
network protocol support, has been used as a testbed for the ideas
presented. Practical results over prototype ATM networks, including
the 512 MHz Cambridge Backbone Network, are presented.
A relevance-based utterance processing systemPoznański, VictorUniversity of Cambridge, Computer Laboratory1992-02enTextUCAM-CL-TR-246ISSN 1476-2986
This thesis presents a computational interpretation of Sperber and
Wilson’s relevance theory, based on the use of non-monotonic logic
supported by a reason maintenance system, and shows how the theory,
when given a specific form in this way, can provide a unique and
interesting account of discourse processing.
Relevance theory is a radical theory of natural language pragmatics
which attempts to explain the whole of human cognition using a
single maxim: the Principle of Optimal Relevance. The theory is seen
by its originators as a computationally more adequate alternative to
Gricean pragmatics. Much as it claims to offer the advantage of a
unified approach to utterance comprehension, Relevance Theory is
hard to evaluate because Sperber and Wilson only provide vague,
high-level descriptions of vital aspects of their theory. For
example, the fundamental idea behind the whole theory is that, in
trying to understand an utterance, we attempt to maximise
significant new information obtained from the utterance whilst
consuming as little cognitive effort as possible. However, Sperber
and Wilson do not make the nature of information and effort
sufficiently clear.
Relevance theory is attractive as a general theory of human language
communication and as a potential framework for computational
language processing systems. The thesis seeks to clarify and flesh
out the problem areas in order to develop a computational
implementation which is used to evaluate the theory.
The early chapters examine and criticise the important aspects of
the theory, emerging with a schema for an ideal relevance-based
system. Crystal, a computational implementation of an utterance
processing system based on this schema is then described. Crystal
performs certain types of utterance disambiguation and reference
resolution, and computes implicatures according to relevance theory.
An adequate reasoning apparatus is a key component of a relevance
based discourse processor, so a suitable knowledge representation
and inference engine are required. Various candidate formalisms are
considered, and a knowledge representation and inference engine
based on autoepistemic logic is found to be the most suitable. It is
then shown how this representation can be used to meet particular
discourse processing requirements, and how it provides a convenient
interface to a separate abduction system that supplies
non-demonstrative inferences according to relevance theory. Crystal’s
powers are illustrated with examples, and the thesis shows how the
design not only implements the less precise areas of Sperber and
Wilson’s theory, but overcomes problems with the theory itself.
Crystal uses rather crude heuristics to model notions such as
salience and degrees of belief. The thesis therefore presents a
proposal and outline for a new kind of reason maintenance system
that supports a non-monotonic logic whose formulae are labelled with
upper/lower probability ranges intended to represent strength of
belief. This system should facilitate measurements of change in
semantic information and shed some light on notions such as expected
utility and salience.
The thesis concludes that the design and implementation of Crystal
provide evidence that relevance theory, as a generic theory of
language processing, is a viable alternative theory of pragmatics.
It therefore merits a greater level of investigation than has been
applied to it to date.
Programming metalogics with a fixpoint typeCrole, Roy LuisUniversity of Cambridge, Computer Laboratory1992-02enTextUCAM-CL-TR-247ISSN 1476-2986On efficiency in theorem provers which fully expand proofs
into primitive inferencesBoulton, Richard J.University of Cambridge, Computer Laboratory1992-02enTextUCAM-CL-TR-248ISSN 1476-2986
Theorem Provers which fully expand proofs into applications of
primitive inference rules can be made highly secure, but have been
criticized for being orders of magnitude slower than many other
theorem provers. We argue that much of this relative inefficiency is
due to the way proof procedures are typically written and not all is
inherent in the way the systems work. We support this claim by
considering a proof procedure for linear arithmetic. We show that
straightforward techniques can be used to significantly cut down the
computation required. An order of magnitude improvement in the
performance is shown by an implementation of these techniques.
A formalisation of the VHDL simulation cycleVan Tassel, John P.University of Cambridge, Computer Laboratory1992-03enTextUCAM-CL-TR-249ISSN 1476-2986
The VHSIC Hardware Description Language (VHDL) has been gaining wide
acceptance as a unifying HDL. It is, however, still a language in
which the only way of validating a design is by careful simulation.
With the aim of better understanding VHDL’s particular simulation
process and eventually reasoning about it, we have developed a
formalisation of VHDL’s simulation cycle for a subset of the
language. It has also been possible to embed our semantics in the
Cambridge Higher-Order Logic (HOL) system and derive interesting
properties about specific VHDL programs.
TouringMachines: autonomous agents with attitudesFerguson, Innes A.University of Cambridge, Computer Laboratory1992-04enTextUCAM-CL-TR-250ISSN 1476-2986Multipoint digital video communicationJiang, XiaofengUniversity of Cambridge, Computer Laboratory1992-04enTextUCAM-CL-TR-251ISSN 1476-2986A co-induction principle for recursively defined
domainsPitts, Andrew M.University of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-252ISSN 1476-2986The (other) Cambridge ACQUILEX papersSanfilippo, AntonioUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-253ISSN 1476-2986A HOL semantics for a subset of ELLABoulton, Richard J.University of Cambridge, Computer Laboratory1992-04enTextUCAM-CL-TR-254ISSN 1476-2986
Formal verification is an important tool in the design of computer
systems, especially when the systems are safety or security
critical. However, the formal techniques currently available are not
well integrated into the set of tools more traditionally used by
designers. This work is aimed at improving the integration by
providing a formal semantics for a subset of the hardware
description language ELLA, and by supporting this semantics in the
HOL theorem proving system, which has been used extensively for
hardware verification.
A semantics for a subset of ELLA is described, and an outline of a
proof of the equivalence of parallel and recursive implementations
of an n-bit adder is given as an illustration of the semantics. The
proof has been performed in an extension of the HOL system. Some
proof tools written to support the verification are also described.
The formal verification of hard real-time systemsCardell-Oliver, Rachel MaryUniversity of Cambridge, Computer Laboratory1992enTextUCAM-CL-TR-255ISSN 1476-2986MCPL programming manualRichards, MartinUniversity of Cambridge, Computer Laboratory1992-05enTextUCAM-CL-TR-256ISSN 1476-2986Cut-free sequent and tableau systems for propositional
normal modal logicsGoré, Rajeev PrabhakarUniversity of Cambridge, Computer Laboratory1992-05enTextUCAM-CL-TR-257ISSN 1476-2986
We present a unified treatment of tableau, sequent and axiomatic
formulations for many propositional normal modal logics, thus
unifying and extending the work of Hanson, Segerberg, Zeman, Mints,
Fitting, Rautenberg and Shvarts. The primary emphasis is on tableau
systems as the completeness proofs are easier in this setting. Each
tableau system has a natural sequent analogue defining a finitary
provability relation for each axiomatically formulated logic L.
Consequently, any tableau proof can be converted into a sequent
proof which can be read downwards to obtain an axiomatic proof. In
particular, we present cut-free sequent systems for the logics S4.3,
S4.3.1 and S4.14. These three logics have important temporal
interpretations and the sequent systems appear to be new.
All systems are sound and (weakly) complete with respect to their
known finite frame Kripke semantics. By concentrating almost
exclusively on finite tree frames we obtain finer characterisation
results, particularly for the logics with natural temporal
interpretations. In particular, all proofs of tableau completeness
are constructive and yield the finite model property and
decidability for each logic.
Most of these systems are cut-free giving a Gentzen cut-elimination
theorem for the logic in question. But even when the cut rule is
required, all uses of it remain analytic. Some systems do not
possess the subformula property. But in all such cases the class of
“superformulae” remains bounded, giving an analytic superformula
property. Thus all systems remain totally amenable to computer
implementation and immediately serve as nondeterministic decision
procedures for the logics they formulate. Furthermore, the
constructive completeness proofs yield deterministic decision
procedures for all the logics concerned.
In obtaining these systems we demonstrate that the subformula
property can be broken in a systematic and analytic way while still
retaining decidability. This should not be surprising since it is
known that modal logic is a form of second order logic and that the
subformula property does not hold for higher order logics.
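For orientation, the characteristic cut-free sequent rules for K and
S4 (standard textbook formulations, given here for flavour rather
than quoted from the report) show the style of system involved, where
□Γ abbreviates { □B : B ∈ Γ }:

  Γ ⊢ A                  □Γ ⊢ A
  ────────  (K)          ────────  (S4)
  □Γ ⊢ □A                □Γ ⊢ □A

Read upwards, such rules drive a tableau procedure; read downwards,
they yield the axiomatic proofs mentioned above.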
Private ATM networksGreaves, David J.McAuley, DerekUniversity of Cambridge, Computer Laboratory1992-05enTextUCAM-CL-TR-258ISSN 1476-2986Full abstraction in the Lazy Lambda CalculusAbramsky, SamsonOng, C.-H. LukeUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-259ISSN 1476-2986Local computation of alternating fixed-pointsAnderson, Henrik ReifUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-260ISSN 1476-2986Image resamplingDodgson, Neil AnthonyUniversity of Cambridge, Computer Laboratory1992-08enTextUCAM-CL-TR-261ISSN 1476-2986
Image resampling is the process of geometrically transforming
digital images. This report considers several aspects of the
process.
We begin by decomposing the resampling process into three simpler
sub-processes: reconstruction of a continuous intensity surface from
a discrete image, transformation of that continuous surface, and
sampling of the transformed surface to produce a new discrete image.
We then consider the sampling process, and the subsidiary problem of
intensity quantisation. Both these are well understood, and we
present a summary of existing work, laying a foundation for the
central body of the report where the sub-process of reconstruction
is studied.
The work on reconstruction divides into four parts, two general and
two specific:
1. Piecewise local polynomials: the most studied group of
reconstructors. We examine these, and the criteria used in their
design. One new derivation is of two piecewise local quadratic
reconstructors.
2. Infinite extent reconstructors: we consider these and their local
approximations, the problem of finite image size, the resulting edge
effects, and the solutions to these problems. Amongst the
reconstructors discussed are the interpolating cubic B-spline and
the interpolating Bezier cubic. We derive the filter kernels for
both of these, and prove that they are the same. Given this kernel
we demonstrate how the interpolating cubic B-spline can be extended
from a one-dimensional to a two-dimensional reconstructor, providing
a considerable speed improvement over the existing method of
extension.
3. Fast Fourier transform reconstruction: it has long been known
that the fast Fourier transform (FFT) can be used to generate an
approximation to perfect scaling of a sample set. Donald Fraser (in
1987) took this result and generated a hybrid FFT reconstructor
which can be used for general transformations, not just scaling. We
modify Fraser’s method to tackle two major problems: its large time
and storage requirements, and the edge effects it causes in the
reconstructed intensity surface.
4. A priori knowledge reconstruction: first considering what can be
done if we know how the original image was sampled, and then
considering what can be done with one particular class of image
coupled with one particular type of sampling. In this latter case we
find that exact reconstruction of the image is possible. This is a
surprising result as this class of images cannot be exactly
reconstructed using classical sampling theory.
The final section of the report draws all of the strands together to
discuss transformations and the resampling process as a whole. Of
particular note here is work on how the quality of different
reconstruction and resampling methods can be assessed.
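As a flavour of piecewise local polynomial reconstruction, here is a
one-dimensional sketch using the standard Catmull-Rom cubic kernel (a
well-known member of the family studied; not the report's new
quadratic derivations):

  -- Catmull-Rom cubic kernel: a common piecewise local polynomial
  -- reconstructor with support [-2, 2].
  cubic :: Double -> Double
  cubic t
    | x <= 1    =  1.5*x**3 - 2.5*x**2 + 1
    | x <= 2    = -0.5*x**3 + 2.5*x**2 - 4*x + 2
    | otherwise = 0
    where x = abs t

  -- Reconstruct a continuous intensity at position p from discrete
  -- samples; out-of-range samples are taken as 0, one crude answer
  -- to the edge effects discussed above.
  reconstruct :: [Double] -> Double -> Double
  reconstruct samples p =
    sum [ sample i * cubic (p - fromIntegral i)
        | i <- [floor p - 1 .. floor p + 2] ]
    where sample i | i >= 0 && i < length samples = samples !! i
                   | otherwise                    = 0

Resampling then amounts to evaluating reconstruct at the transformed
sample positions of the output image.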
Term assignment for intuitionistic linear logic (preliminary
report)Benton, NickBierman, Gavinde Paiva, ValeriaUniversity of Cambridge, Computer Laboratory1992-08enTextUCAM-CL-TR-262ISSN 1476-2986The Lazy Lambda Calculus: an investigation into the
foundations of functional programmingOng, C.-H. LukeUniversity of Cambridge, Computer Laboratory1992-08enTextUCAM-CL-TR-263ISSN 1476-2986CCS with environmental guardsCamilleri, JuanitoUniversity of Cambridge, Computer Laboratory1992-08enTextUCAM-CL-TR-264ISSN 1476-2986Reasoning with inductively defined relations in the HOL
theorem proverCamilleri, JuanitoMelham, TomUniversity of Cambridge, Computer Laboratory1992-08enTextUCAM-CL-TR-265ISSN 1476-2986Automatic exploitation of OR-parallelism in
PrologKlein, CaroleUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-266ISSN 1476-2986Untyped strictness analysisErnoult, ChristineMycroft, AlanUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-267ISSN 1476-2986Network file server design for continuous mediaJardetzky, Paul W.University of Cambridge, Computer Laboratory1992-10enTextUCAM-CL-TR-268ISSN 1476-2986
This dissertation concentrates on issues related to the provision of
a network based storage facility for digital audio and video data.
The goal is to demonstrate that a distributed file service in
support of these media may be built without special purpose
hardware. The main objective is to identify those parameters that
affect file system performance and provide the criteria for making
desirable design decisions.
Optimising compilationMycroft, AlanNorman, ArthurUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-269ISSN 1476-2986Designing a universal name serviceMa, ChaoyingUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-270ISSN 1476-2986
Generally speaking, naming in computing systems deals with the
creation of object identifiers at all levels of system architecture
and the mapping among them. Two of the main purposes of having names
in computer systems are (a) to identify objects; (b) to accomplish
sharing. Without naming, no computer system design can be done.
The rapid development in the technology of personal workstations and
computer communication networks has placed a great number of demands
on designing large computer naming systems. In this dissertation,
issues of naming in large distributed computing systems are
addressed. Technical aspects as well as system architecture are
examined. A design of a Universal Name Service (UNS) is proposed and
its prototype implementation is described. Three major issues on
designing a global naming system are studied. Firstly, it is
observed that none of the existing name services provides enough
flexibility in restructuring name spaces, and more research has to
be done. Secondly, it is observed that although using stale naming data
(hints) at the application level is acceptable in most cases as long
as it is detectable and recoverable, stronger naming data integrity
should be maintained to provide a better guarantee of finding
objects, especially when a high degree of availability is required.
Finally, configuring the name service is usually done in an ad hoc
manner, leading to unexpected interruptions or a great deal of human
intervention when the system is reconfigured. It is necessary to
make a systematic study of automatic configuration and
reconfiguration of name services.
This research is based on a distributed computing model, in which a
number of computers work cooperatively to provide the service. The
contributions include: (a) the construction of a Globally Unique
Directory Identifier (GUDI) name space. Flexible name space
restructuring is supported by allowing directories to be added to or
removed from the GUDI name space. (b) The definition of a two class
name service infrastructure which exploits the semantics of naming.
It makes the UNS replication control more robust and reliable, as
well as highly available. (c) The identification of two aspects in the
name service configuration: one is concerned with the replication
configuration, and the other is concerned with the server
configuration. It is notable that previous work only studied these
two aspects individually but not in combination. A distinguishing
feature of the UNS is that both issues are considered at the design
stage and novel methods are used to allow dynamic service
configuration to be done automatically and safely.
Set theory as a computational logic: I. from foundations to
functionsPaulson, Lawrence C.University of Cambridge, Computer Laboratory1992-11enTextUCAM-CL-TR-271ISSN 1476-2986
A logic for specification and verification is derived from the
axioms of Zermelo-Fraenkel set theory. The proofs are performed
using the proof assistant Isabelle. Isabelle is generic, supporting
several different logics, and has the flexibility to adapt to
variants of set theory. Its higher-order syntax supports the
definition of new binding operators. Unknowns in subgoals can be
instantiated incrementally. The paper describes the derivation of
rules for descriptions, relations and functions, and discusses
interactive proofs of Cantor’s Theorem, the Composition of
Homomorphisms challenge, and Ramsey’s Theorem. A generic proof
assistant can stand up against provers dedicated to particular
logics.
Interactive program derivationCoen, Martin DavidUniversity of Cambridge, Computer Laboratory1992-11enTextUCAM-CL-TR-272ISSN 1476-2986
As computer programs are increasingly used in safety critical
applications, program correctness is becoming more important; as the
size and complexity of programs increases, the traditional approach
of testing is becoming inadequate. Proving the correctness of
programs written in imperative languages is awkward; functional
programming languages, however, offer more hope. Their logical
structure is cleaner, and it is practical to reason about
terminating functional programs in an internal logic.
This dissertation describes the development of a logical theory
called TPT for reasoning about the correctness of terminating
functional programs, its implementation using the theorem prover
Isabelle, and its use in proving formal correctness. The theory
draws both from Martin-Löf’s work in type theory and Manna and
Waldinger’s work in program synthesis. It is based on classical
first-order logic, and it contains terms that represent classes of
behaviourally equivalent programs, types that denote sets of
terminating programs and well-founded orderings. Well-founded
induction is used to reason about general recursion in a natural way
and to separate conditions for termination from those for
correctness.
The theory is implemented using the generic theorem prover Isabelle,
which allows correctness proofs to be checked by machine and
partially automated using tactics. In particular, tactics for type
checking use the structure of programs to direct proofs. Type
checking allows both the verification and derivation of programs,
reducing specifications of correctness to sets of correctness
conditions. These conditions can be proved in typed first-order
logic, using well-known techniques of reasoning by induction and
rewriting, and then lifted up to TPT. Examples of program
termination are asserted and proved, using simple types. Behavioural
specifications are expressed using dependent types, and the
correctness of programs asserted and then proved. As a non-trivial
example, a unification algorithm is specified and proved correct by
machine.
The work in this dissertation clearly shows how a classical theory
can be used to reason about program correctness, how general
recursion can be reasoned about, and how programs can direct proofs
of correctness.
TouringMachines: an architecture for dynamic, rational,
mobile agentsFerguson, Innes A.University of Cambridge, Computer Laboratory1992-11enTextUCAM-CL-TR-273ISSN 1476-2986
It is becoming widely accepted that neither purely reactive nor
purely deliberative control techniques are capable of producing the
range of behaviours required of intelligent computational or robotic
agents in dynamic, unpredictable, multi-agent worlds. We present a
new architecture for controlling autonomous, mobile agents –
building on previous work addressing reactive and deliberative
control methods. The proposed multi-layered control architecture
allows a resource-bounded, goal-directed agent to react promptly to
unexpected changes in its environment; at the same time it enables
the agent to reason predictively about potential conflicts by
constructing and projecting causal models or theories which
hypothesise other agents’ goals and intentions.
The line of research adopted is very much a pragmatic one. A single,
common architecture has been implemented which, being extensively
parametrized, allows an experimenter to study functionally- and
behaviourally-diverse agent configurations. A principal aim of this
research is to understand the role different functional capabilities
play in constraining an agent’s behaviour under varying
environmental conditions. To this end, we have constructed an
experimental testbed comprising a simulated multi-agent world in
which a variety of agent configurations and behaviours have been
investigated. Experience with the new control architecture is
described.
Of what use is a verified compiler specification?Curzon, PaulUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-274ISSN 1476-2986Exploratory learning in the game of GOPell, BarneyUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-275ISSN 1476-2986
This paper considers the importance of exploration to game-playing
programs which learn by playing against opponents. The central
question is whether a learning program should play the move which
offers the best chance of winning the present game, or if it should
play the move which has the best chance of providing useful
information for future games. An approach to addressing this
question is developed using probability theory, and then implemented
in two different learning methods. Initial experiments in the game
of Go suggest that a program which takes exploration into account
can learn better against a knowledgeable opponent than a program
which does not.
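The trade-off can be made concrete with a toy scoring rule (our
invention, not the paper's probabilistic formulation): weight each
candidate move's estimated chance of winning the present game against
its estimated information value for future games.

  import Data.List (maximumBy)
  import Data.Ord  (comparing)

  -- Each candidate move carries an estimated probability of winning
  -- the current game and an estimated information gain for later
  -- games; w in [0,1] sets how much exploration matters.
  chooseMove :: Double -> [(m, Double, Double)] -> m
  chooseMove w cands = mv
    where (mv, _, _) = maximumBy (comparing score) cands
          score (_, pWin, info) = (1 - w) * pWin + w * info

Setting w = 0 recovers a purely competitive player; larger w trades
expected result in the present game for learning.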
METAGAME: a new challenge for games and learningPell, BarneyUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-276ISSN 1476-2986
In most current approaches to Computer Game-Playing, including those
employing some form of machine learning, the game analysis is mainly
performed by humans. Thus, we largely sidestep the
interesting (and difficult) questions. Human analysis also makes it
difficult to evaluate the generality and applicability of different
approaches.
To address these problems, we introduce a new challenge: Metagame.
The idea is to write programs which take as input the rules of a set
of new games within a pre-specified class, generated by a program
which is publicly available. The programs compete against each other
in many matches on each new game, and they can then be evaluated
based on their overall performance and improvement through
experience.
This paper discusses the goals, research areas, and general concerns
for the idea of Metagame.
METAGAME in symmetric chess-like gamesPell, BarneyUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-277ISSN 1476-2986
I have implemented a game generator that generates games from a wide
but still restricted class. This class is general enough to include
most aspects of many standard games, including Chess, Shogi, Chinese
Chess, Checkers, Draughts, and many variants of Fairy Chess. The
generator, implemented in Prolog, is transparent and publicly
available, and generates games using probability distributions for
parameters such as piece complexity, types of movement, board size,
and locality.
The generator is illustrated by means of a new game it produced,
which is then subjected to a simple strategic analysis. This form of
analysis suggests that programs to play Metagame well will either
learn or apply very general game-playing principles. But because the
class is still restricted, it may be possible to develop a naive but
fast program which can outplay more sophisticated opponents.
Performance in a tournament between programs is the deciding
criterion.
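The kind of parameterisation mentioned can be pictured as follows (a
hypothetical rendering; the actual generator is a Prolog program and
its parameters differ in detail):

  -- Invented names: distributions over such parameters drive the
  -- generation of a new symmetric chess-like game.
  data Movement = Leaper | Rider | Hopper
    deriving Show

  data GameParams = GameParams
    { boardSize       :: (Int, Int)
    , pieceComplexity :: Double      -- expected complexity of piece rules
    , movements       :: [Movement]  -- kinds of movement allowed
    , localityBias    :: Double      -- preference for short-range moves
    } deriving Show

Sampling such a record and elaborating it into full rules yields a
game no human has analysed in advance, which is the point of the
Metagame setting.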
A formalization of the process algebra CCS in high order
logicNesi, MonicaUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-278ISSN 1476-2986
This paper describes a mechanization in higher order logic of the
theory for a subset of Milner’s CCS. The aim is to build a sound and
effective tool to support verification and reasoning about process
algebra specifications. To achieve this goal, the formal theory for
pure CCS (no value passing) is defined in the interactive theorem
prover HOL, and a set of proof tools, based on the algebraic
presentation of CCS, is provided.
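The shape of such a deep embedding can be suggested in a few lines
(here as a Haskell datatype rather than the paper's HOL definition;
value passing and recursion are omitted for brevity):

  -- Pure CCS syntax: nil, action prefix, summation, parallel
  -- composition, restriction and relabelling.
  data Act = Tau | In String | Out String

  data CCS = Nil
           | Prefix Act CCS                  -- a.P
           | Sum CCS CCS                     -- P + Q
           | Par CCS CCS                     -- P | Q
           | Restrict CCS [String]           -- P \ L
           | Relabel CCS [(String, String)]  -- P[f]

Once the syntax is a datatype, the algebraic laws of CCS become
theorems about it, which is what the proof tools mentioned above
manipulate.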
The transition assertions specification methodCarreño, Victor A.University of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-279ISSN 1476-2986Introduction to IsabellePaulson, Lawrence C.University of Cambridge, Computer Laboratory1993-01enTextUCAM-CL-TR-280ISSN 1476-2986
Isabelle is a generic theorem prover, supporting formal proof in a
variety of logics. Through a variety of examples, this paper
explains the basic theory and demonstrates the most important
commands.
It serves as the introduction to other Isabelle documentation.
Pegasus project descriptionMullender, Sape J.Leslie, Ian M.McAuley, DerekUniversity of Cambridge, Computer Laboratory1992-09enTextUCAM-CL-TR-281ISSN 1476-2986Pegasus – Operating system support for distributed
multimedia systemsLeslie, Ian M.McAuley, DerekMullender, Sape J.University of Cambridge, Computer Laboratory1992-12enTextUCAM-CL-TR-282ISSN 1476-2986The Isabelle reference manualPaulson, Lawrence C.University of Cambridge, Computer Laboratory1993-02enTextUCAM-CL-TR-283ISSN 1476-2986
This manual is a comprehensive description of Isabelle, including
all commands, functions and packages. It is intended for reference
rather than for reading through, and is certainly not a tutorial.
The manual assumes familiarity with the basic concepts explained in
Introduction to Isabelle. Functions are organized by their purpose,
by their operands (subgoals, tactics, theorems), and by their
usefulness. In each section, basic functions appear first, then
advanced functions, and finally esoteric functions.
The Alvey Natural Language Tools grammar (4th
Release)Grover, ClaireCarroll, JohnBriscoe, TedUniversity of Cambridge, Computer Laboratory1993-01enTextUCAM-CL-TR-284ISSN 1476-2986Functional programming and input/outputGordon, Andrew DonaldUniversity of Cambridge, Computer Laboratory1993-02enTextUCAM-CL-TR-285ISSN 1476-2986Isabelle’s object-logicsPaulson, Lawrence C.University of Cambridge, Computer Laboratory1993-02enTextUCAM-CL-TR-286ISSN 1476-2986
Several logics come with Isabelle. Many of them are sufficiently
developed to serve as comfortable reasoning environments. They are
also good starting points for defining new logics. Each logic is
distributed with sample proofs, some of which are presented in the
paper. The logics described include first-order logic,
Zermelo-Fraenkel set theory, higher-order logic, constructive type
theory, and the classical sequent calculus LK. A final chapter
explains the fine points of defining logics in Isabelle.
A mechanised definition of Silage in HOLGordon, Andrew D.University of Cambridge, Computer Laboratory1993-02enTextUCAM-CL-TR-287ISSN 1476-2986
If formal methods of hardware verification are to have any impact on
the practices of working engineers, connections must be made between
the languages used in practice to design circuits, and those used
for research into hardware verification. Silage is a simple dataflow
language marketed for specifying digital signal processing circuits.
Higher Order Logic (HOL) is extensively used for research into
hardware verification. This paper presents a formal definition of a
substantial subset of Silage, by mapping Silage declarations into
HOL predicates. The definition has been mechanised in the HOL
theorem prover to support the transformational design of Silage
circuits as theorem proving in HOL.
Cut-free sequent and tableau systems for propositional
Diodorean modal logicsGoré, RajeevUniversity of Cambridge, Computer Laboratory1993-02enTextUCAM-CL-TR-288ISSN 1476-2986The semantics of noun phrase anaphoraElworthy, David Alan HowardUniversity of Cambridge, Computer Laboratory1993-02enTextUCAM-CL-TR-289ISSN 1476-2986Discourse modelling for automatic summarisingSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1993-02enTextUCAM-CL-TR-290ISSN 1476-2986Evaluating natural language processing systemsGalliers, J.R.Spärck Jones, K.University of Cambridge, Computer Laboratory1993-02enTextUCAM-CL-TR-291ISSN 1476-2986
This report presents a detailed analysis and review of NLP
evaluation, in principle and in practice. Part 1 examines evaluation
concepts and establishes a framework for NLP system evaluation. This
makes use of experience in the related area of information retrieval
and the analysis also refers to evaluation in speech processing.
Part 2 surveys significant evaluation work done so far, for instance
in machine translation, and discusses the particular problems of
generic system evaluation. The conclusion is that evaluation
strategies and techniques for NLP need much more development, in
particular to take proper account of the influence of system tasks
and settings. Part 3 develops a general approach to NLP evaluation,
aimed at methodologically-sound strategies for test and evaluation
motivated by comprehensive performance factor identification. The
analysis throughout the report is supported by extensive
illustrative examples.
Synchronisation services for digital continuous
mediaSreenan, Cormac JohnUniversity of Cambridge, Computer Laboratory1993-03enTextUCAM-CL-TR-292ISSN 1476-2986
The development of broadband ATM networking makes it attractive to
use computer communication networks for the transport of digital
audio and motion video. Coupled with advances in workstation
technology, this creates the opportunity to integrate these
continuous information media within a distributed computing system.
Continuous media have an inherent temporal dimension, resulting in a
set of synchronisation requirements which have real-time
constraints. This dissertation identifies the role and position of
synchronisation, in terms of the support which is necessary in an
integrated distributed system. This work is supported by a set of
experiments which were performed in an ATM inter-network using
multi-media workstations, each equipped with an Olivetti Pandora
Box.
Objects and transactions for modelling distributed
applications: concurrency control and commitmentBacon, JeanMoody, KenUniversity of Cambridge, Computer Laboratory1993-04enTextUCAM-CL-TR-293ISSN 1476-2986OPERA : Storage, programming and display of multimedia
objectsMoody, KenBacon, JeanAdly, NohaAfshar, MohamadBates, JohnFeng, HuangHayton, RichardLo, Sai LaiSchwiderski, ScarletSultana, RobertWu, ZhixueUniversity of Cambridge, Computer Laboratory1993-04enTextUCAM-CL-TR-294ISSN 1476-2986OPERA : Storage and presentation support for multimedia
applications in a distributed, ATM network environmentBacon, JeanBates, JohnLo, Sai LaiMoody, KenUniversity of Cambridge, Computer Laboratory1993-04enTextUCAM-CL-TR-295ISSN 1476-2986A persistent programming language for multimedia databases
in the OPERA projectWu, Z.Moody, K.Bacon, J.University of Cambridge, Computer Laboratory1993-04enTextUCAM-CL-TR-296ISSN 1476-2986Categorical abstract machines for higher-order lambda
calculiRitter, EikeUniversity of Cambridge, Computer Laboratory1993-04enTextUCAM-CL-TR-297ISSN 1476-2986Multicast in the asynchronous transfer mode
environmentDoar, John Matthew SimonUniversity of Cambridge, Computer Laboratory1993-04enTextUCAM-CL-TR-298ISSN 1476-2986
In future multimedia communication networks, the ability to
multicast information will be useful for many new and existing
services. This dissertation considers the design of multicast
switches for Asynchronous Transfer Mode (ATM) networks and proposes
one design based upon a slotted ring. Analysis and simulation
studies of this design are presented and details of its
implementation for an experimental ATM network (Project Fairisle)
are described, together with the modifications to the existing
multi-service protocol architecture necessary to provide multicast
connections. Finally, a short study of the problem of multicast
routing is presented, together with some simulations of the
long-term effect upon the routing efficiency of modifying the number
of destinations within a multicast group.
Pragmatic reasoning in bridgeGambäck, BjörnRayner, MannyPell, BarneyUniversity of Cambridge, Computer Laboratory1993-04enTextUCAM-CL-TR-299ISSN 1476-2986
In this paper we argue that bidding in the game of Contract Bridge
can profitably be regarded as a micro-world suitable for
experimenting with pragmatics. We sketch an analysis in which a
“bidding system” is treated as the semantics of an artificial
language, and show how this “language”, despite its apparent
simplicity, is capable of supporting a wide variety of common speech
acts parallel to those in natural languages; we also argue that the
reason for the relatively unsuccessful nature of previous attempts
to write strong Bridge playing programs has been their failure to
address the need to reason explicitly about knowledge, pragmatics,
probabilities and plans. We give an overview of Pragma, a system
currently under development, which embodies these ideas in concrete
form, using a combination of rule-based inference, stochastic
simulation, and “neural-net” learning. Examples are given
illustrating the functionality of the system in its current form.
Formal verification of VIPER’s ALUWong, WaiUniversity of Cambridge, Computer Laboratory1993-04enTextUCAM-CL-TR-300ISSN 1476-2986The dual-level validation concurrency control
methodWu, ZhixueMoody, KenBacon, JeanUniversity of Cambridge, Computer Laboratory1993-06enTextUCAM-CL-TR-301ISSN 1476-2986Logic programming for general game-playingPell, BarneyUniversity of Cambridge, Computer Laboratory1993-06enTextUCAM-CL-TR-302ISSN 1476-2986
Meta-Game Playing is a new approach to games in Artificial
Intelligence, where we construct programs to play new games in a
well-defined class, which are output by an automatic game generator.
As the specific games to be played are not known in advance, a
degree of human bias is eliminated, and playing programs are
required to perform any game-specific optimisations without human
assistance.
The attempt to construct a general game-playing program is made
difficult by the opposing goals of generality and efficiency. This
paper shows how application of standard techniques in
logic-programming (abstract interpretation and partial evaluation)
makes it possible to achieve both of these goals. Using these
techniques, we can represent the semantics of a large class of games
in a general and declarative way, but then have the program
transform this representation into a more efficient version once it
is presented with the rules of a new game. This process can be
viewed as moving some of the responsibility for game analysis (that
concerned with efficiency) from the researcher to the program
itself.
Drawing trees — a case study in functional
programmingKennedy, AndrewUniversity of Cambridge, Computer Laboratory1993-06enTextUCAM-CL-TR-303ISSN 1476-2986Co-induction and co-recursion in higher-order
logicPaulson, Lawrence C.University of Cambridge, Computer Laboratory1993-07enTextUCAM-CL-TR-304ISSN 1476-2986
A theory of recursive and corecursive definitions has been developed
in higher-order logic (HOL) and mechanised using Isabelle. Least
fixedpoints express inductive data types such as strict lists;
greatest fixedpoints express co-inductive data types, such as lazy
lists. Well-founded recursion expresses recursive functions over
inductive data types; co-recursion expresses functions that yield
elements of co-inductive data types. The theory rests on a
traditional formalization of infinite trees. The theory is intended
for use in specification and verification. It supports reasoning
about a wide range of computable functions, but it does not
formalize their operational semantics, and it can also express
noncomputable functions. The theory is demonstrated using lists and lazy
lists as examples. The emphasis is on using co-recursion to define
lazy list functions, and on using co-induction to reason about them.
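Haskell's lazy lists give an executable feel for these ideas (our
example; the report's development is in Isabelle/HOL): co-recursion
defines productive infinite values, and equations between them are
proved by co-induction, i.e. by exhibiting a bisimulation.

  -- A co-recursive (productive, not well-founded) definition:
  nats :: [Integer]
  nats = 0 : map (+ 1) nats

  -- A classic property proved by co-induction rather than induction,
  -- since both sides denote infinite lists:
  --   map f (iterate f x) = iterate f (f x)

No induction on list length is available here, because the lists have
no finite length; the bisimulation plays the role the induction
hypothesis plays for strict lists.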
Strong normalisation for the linear term calculusBenton, P.N.University of Cambridge, Computer Laboratory1993-07enTextUCAM-CL-TR-305ISSN 1476-2986Recording HOL proofsWong, WaiUniversity of Cambridge, Computer Laboratory1993-07enTextUCAM-CL-TR-306ISSN 1476-2986Natural language processing for information
retrievalLewis, David D.Spärck Jones, KarenUniversity of Cambridge, Computer Laboratory1993-07enTextUCAM-CL-TR-307ISSN 1476-2986
The paper summarizes the essential properties of document retrieval
and reviews both conventional practice and research findings, the
latter suggesting that simple statistical techniques can be
effective. It then considers the new opportunities and challenges
presented by the ability to search full text directly (rather than
e.g. titles and abstracts), and suggests appropriate approaches to
doing this, with a focus on the role of natural language processing.
The paper also comments on possible connections with data and
knowledge retrieval, and concludes by emphasizing the importance of
rigorous performance testing.
A case study of co-induction in Isabelle HOLFrost, JacobUniversity of Cambridge, Computer Laboratory1993-08enTextUCAM-CL-TR-308ISSN 1476-2986
The consistency of the dynamic and static semantics for a small
functional programming language was informally proved by R. Milner
and M. Tofte. The notions of co-inductive definitions and the
associated principle of co-induction played a pivotal role in the
proof. With emphasis on co-induction, the work presented here deals
with the formalisation of this result in the higher-order logic of
the generic theorem prover Isabelle.
Strictness analysis of lazy functional programsBenton, Peter NicholasUniversity of Cambridge, Computer Laboratory1993-08enTextUCAM-CL-TR-309ISSN 1476-2986HARP: a hierarchical asynchronous replication protocol for
massively replicated systemsAdly, NohaUniversity of Cambridge, Computer Laboratory1993-08enTextUCAM-CL-TR-310ISSN 1476-2986A verified Vista implementationCurzon, PaulUniversity of Cambridge, Computer Laboratory1993-09enTextUCAM-CL-TR-311ISSN 1476-2986Set theory for verification: II : Induction and
recursionPaulson, Lawrence C.University of Cambridge, Computer Laboratory1993-09enTextUCAM-CL-TR-312ISSN 1476-2986
A theory of recursive definitions has been mechanized in Isabelle’s
Zermelo-Fraenkel (ZF) set theory. The objective is to support the
formalization of particular recursive definitions for use in
verification, semantics proofs and other computational reasoning.
Inductively defined sets are expressed as least fixedpoints,
applying the Knaster-Tarski Theorem over a suitable set. Recursive
functions are defined by well-founded recursion and its derivatives,
such as transfinite recursion. Recursive data structures are
expressed by applying the Knaster-Tarski Theorem to a set that is
closed under Cartesian product and disjoint sum.
Worked examples include the transitive closure of a relation, lists,
variable-branching trees and mutually recursive trees and forests.
The Schröder-Bernstein Theorem and the soundness of propositional
logic are proved in Isabelle sessions.
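For reference (standard statement, our wording), if h is a monotone
operator on the subsets of a set D with h(D) ⊆ D, the least and
greatest fixedpoints used above can be written

  lfp(h) = ⋂ { X ⊆ D | h(X) ⊆ X }       (least fixedpoint: inductive sets)
  gfp(h) = ⋃ { X ⊆ D | X ⊆ h(X) }       (greatest fixedpoint: co-inductive sets)

The closure of D under Cartesian product and disjoint sum is what
guarantees a suitable bounding set for recursive data structures.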
Proof by pointingBertot, YvesKahn, GillesThéry, LaurentUniversity of Cambridge, Computer Laboratory1993-10enTextUCAM-CL-TR-313ISSN 1476-2986Practical unification-based parsing of natural
languageCarroll, John AndrewUniversity of Cambridge, Computer LaboratoryenTextUCAM-CL-TR-314ISSN 1476-2986
The thesis describes novel techniques and algorithms for the
practical parsing of realistic Natural Language (NL) texts with a
wide-coverage unification-based grammar of English. The thesis
tackles two of the major problems in this area: firstly, the fact
that parsing realistic inputs with such grammars can be
computationally very expensive, and secondly, the observation that
many analyses are often assigned to an input, only one of which
usually forms the basis of the correct interpretation.
The thesis starts by presenting a new unification algorithm,
justifies why it is well-suited to practical NL parsing, and
describes a bottom-up active chart parser which employs this
unification algorithm together with several other novel processing
and optimisation techniques. Empirical results demonstrate that an
implementation of this parser has significantly better practical
performance than a comparable, state-of-the-art unification-based
parser. Next, techniques for computing an LR table for a large
unification grammar are described, a context free non-deterministic
LR parsing algorithm is presented which has better time complexity
than any previously reported using the same approach, and a
unification-based version is derived. In experiments, the
performance of an implementation of the latter is shown to exceed
both the chart parser and also that of another efficient LR-like
algorithm recently proposed.
Building on these methods, a system for parsing text taken from a
given corpus is described which uses probabilistic techniques to
identify the most plausible syntactic analyses for an input from the
often large number licensed by the grammar. New techniques
implemented include an incremental approach to semi-supervised
training, a context-sensitive method of scoring sub-analyses, the
accurate manipulation of probabilities during parsing, and the
identification of the highest ranked analyses without exhaustive
search. The system attains a similar success rate to approaches
based on context-free grammar, but produces analyses which are more
suitable for semantic processing.
The thesis includes detailed analyses of the worst-case space and
time complexities of all the main algorithms described, and
discusses the practical impact of the theoretical complexity
results.
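To indicate what sits at the computational core of such parsers, here
is a deliberately naive first-order unification sketch (ours; the
thesis's algorithm operates on feature structures and is far more
refined):

  import Control.Monad (foldM)
  import qualified Data.Map as M

  data Term  = Var String | Fun String [Term] deriving (Eq, Show)
  type Subst = M.Map String Term

  -- Follow variable bindings to a representative term.
  walk :: Subst -> Term -> Term
  walk s (Var v) = maybe (Var v) (walk s) (M.lookup v s)
  walk _ t       = t

  -- Naive unification; no occurs check, for brevity.
  unify :: Subst -> Term -> Term -> Maybe Subst
  unify s a b = case (walk s a, walk s b) of
    (Var v, t) | t == Var v -> Just s
               | otherwise  -> Just (M.insert v t s)
    (t, Var v)              -> Just (M.insert v t s)
    (Fun f as, Fun g bs)
      | f == g && length as == length bs
                            -> foldM (\s' (x, y) -> unify s' x y) s (zip as bs)
    _                       -> Nothing

Since a chart parser calls unification on every edge combination, the
cost of exactly this operation dominates, which is why the thesis's
opening contribution is a better unification algorithm.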
Strategy generation and evaluation for meta-game
playingPell, Barney DarrylUniversity of Cambridge, Computer Laboratory1993-11enTextUCAM-CL-TR-315ISSN 1476-2986
Meta-Game Playing (METAGAME) is a new paradigm for research in
game-playing in which we design programs to take in the rules of
unknown games and play those games without human assistance. Strong
performance in this new paradigm is evidence that the program,
instead of its human designer, has performed the analysis of each
specific game.
SCL-METAGAME is a concrete METAGAME research problem based around
the class of symmetric chess-like games. The class includes the
games of chess, checkers, noughts and crosses, Chinese-chess, and
Shogi. An implemented game generator produces new games in this
class, some of which are objects of interest in their own right.
METAGAMER is a program that plays SCL-METAGAME. The program takes as
input the rules of a specific game and analyses those rules to
construct for that game an efficient representation and an
evaluation function, both for use with a generic search engine. The
strategic analysis performed by the program relates a set of general
knowledge sources to the details of the particular game. Among other
properties, this analysis determines the relative value of the
different pieces in a given game. Although METAGAMER does not learn
from experience, the values resulting from its analysis are
qualitatively similar to values used by experts on known games, and
are sufficient to produce competitive performance the first time the
program actually plays each game it is given. This appears to be the
first program to have derived useful piece values directly from
analysis of the rules of different games.
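(METAGAMER's actual analysis is not given in this abstract. As a toy illustration of how piece values can fall out of the rules alone, one heuristic is to score each piece by its average mobility on an empty board; the Python below, including the piece definitions, is my own illustrative assumption, not the program's method.)

def mobility(dirs, sliding, size=8):
    """Average number of squares reachable in one move, over all squares."""
    total = 0
    for x in range(size):
        for y in range(size):
            for dx, dy in dirs:
                step = 1
                while 0 <= x + dx * step < size and 0 <= y + dy * step < size:
                    total += 1
                    if not sliding:
                        break            # leapers move one step only
                    step += 1
    return total / (size * size)

knight = ([(1, 2), (2, 1), (-1, 2), (-2, 1),
           (1, -2), (2, -1), (-1, -2), (-2, -1)], False)
rook   = ([(1, 0), (-1, 0), (0, 1), (0, -1)], True)
for name, (dirs, slide) in {"knight": knight, "rook": rook}.items():
    print(name, round(mobility(dirs, slide), 2))   # the rook scores higher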
Experiments show that the knowledge implemented in METAGAMER is
useful on games unknown to its programmer in advance of the
competition and make it seem likely that future programs which
incorporate learning and more sophisticated active-analysis
techniques will have a demonstrable competitive advantage on this
new problem. When playing the known games of chess and checkers
against humans and specialised programs, METAGAMER has derived from
more general principles some strategies which are familiar to
players of those games and which are hard-wired in many
game-specific programs.
The Compleat LKBCopestake, AnnUniversity of Cambridge, Computer Laboratory1993-08enTextUCAM-CL-TR-316ISSN 1476-2986Femto-VHDL: the semantics of a subset of VHDL and its
embedding in the HOL proof assistantVan Tassel, John PeterUniversity of Cambridge, Computer Laboratory1993-11enTextUCAM-CL-TR-317ISSN 1476-2986A method of program refinementGrundy, JimUniversity of Cambridge, Computer Laboratory1993-11enTextUCAM-CL-TR-318ISSN 1476-2986
A method of specifying the desired behaviour of a computer program,
and of refining such specifications into imperative programs is
proposed. The refinement method has been designed with the intention
of being amenable to tool support, and of being applicable to
real-world refinement problems.
Part of the refinement method proposed involves the use of a style
of transformational reasoning called ‘window inference’. Window
inference is particularly powerful because it allows the information
inherent in the context of a subexpression to be used in its
transformation. If the notion of transformational reasoning is
generalised to include transformations that preserve relationships
weaker than equality, then program refinement can be regarded as a
special case of transformational reasoning. A generalisation of
window inference is described that allows non-equivalence preserving
transformations. Window inference was originally proposed
independently from, and as an alternative to, traditional styles of
reasoning. A correspondence between the generalised version of
window inference and natural deduction is described. This
correspondence forms the basis of a window inference tool that has
been built on top of the HOL theorem proving system.
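(Schematically, and in notation of my own rather than the dissertation's, the generalised window rule can be pictured as follows: a window opened on a subexpression e may assume facts H_C contributed by its context C, and a transformation preserving a relation ⊑ lifts back through any context monotonic with respect to ⊑:

\[
  \frac{\Gamma,\; H_C \;\vdash\; e \sqsubseteq e'}
       {\Gamma \;\vdash\; C[e] \sqsubseteq C[e']}
\]

Ordinary equational window inference is the special case in which ⊑ is equality.)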
This dissertation adopts a uniform treatment of specifications and
programs as predicates. A survey of the existing approaches to the
treatment of programs as predicates is presented. A new approach is
then developed based on using predicates of a three-valued logic.
This new approach can distinguish more easily between specifications
of terminating and nonterminating behaviour than can the existing
approaches.
A method of program refinement is then described by combining the
unified treatment of specifications and programs as three-valued
predicates with the window inference style of transformational
reasoning. The result is a simple method of refinement that is well
suited to the provision of tool support.
The method of refinement includes a technique for developing
recursive programs. The proof of such developments is usually
complicated because little can be assumed about the form and
termination properties of a partially developed program. These
difficulties are side-stepped by using a simplified meaning for
recursion that compels the development of terminating programs. Once
the development of a program is complete, the simplified meaning for
recursion is refined into the true meaning.
The dissertation concludes with a case study which presents the
specification and development of a simple line-editor. The case
study demonstrates the applicability of the refinement method to
real-world problems. The line editor is a nontrivial example that
contains features characteristic of large developments, including
complex data structures and the use of data abstraction. Examination
of the case study shows that window inference offers a convenient
way of structuring large developments.
A workstation architecture to support multimediaHayter, Mark DavidUniversity of Cambridge, Computer Laboratory1993-11enTextUCAM-CL-TR-319ISSN 1476-2986
The advent of high speed networks in the wide and local area enables
multimedia traffic to be easily carried between workstation class
machines. The dissertation considers an architecture for a
workstation to support such traffic effectively. In addition to
presenting the information to a human user the architecture allows
processing to be done on continuous media streams.
The proposed workstation architecture, known as the Desk Area
Network (DAN), extends ideas from Asynchronous Transfer Mode (ATM)
networks into the end-system. All processors and devices are
connected to an ATM interconnect. The architecture is shown to be
capable of supporting both multimedia data streams and more
traditional CPU cache line traffic. The advocated extension of the
CPU cache which allows caching of multimedia data streams is shown
to provide a natural programming abstraction and a mechanism for
synchronising the processor with the stream.
A prototype DAN workstation has been built. Experiments have been
done to demonstrate the features of the architecture. In particular
the use of the DAN as a processor-to-memory interconnect is closely
studied to show the practicality of using ATM for cache line traffic
in a real machine. Simple demonstrations of the stream cache ideas
are used to show its utility in future applications.
A fixedpoint approach to implementing (co)inductive
definitions (updated version)Paulson, Lawrence C.University of Cambridge, Computer Laboratory1995-07enTextUCAM-CL-TR-320ISSN 1476-2986
Several theorem provers provide commands for formalizing recursive
datatypes or inductively defined sets. This paper presents a new
approach, based on fixedpoint definitions. It is unusually general:
it admits all monotone inductive definitions. It is conceptually
simple, which has allowed the easy implementation of mutual
recursion and other conveniences. It also handles coinductive
definitions: simply replace the least fixedpoint by a greatest
fixedpoint. This represents the first automated support for
coinductive definitions.
The method has been implemented in Isabelle’s formalization of ZF
set theory. It should be applicable to any logic in which the
Knaster-Tarski Theorem can be proved. The paper briefly describes a
method of formalizing non-well-founded data structures in standard
ZF set theory.
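(As a minimal sketch of the fixedpoint view — mine, not the paper's Isabelle/ZF code: for a monotone operator on subsets of a finite universe, the Knaster-Tarski least and greatest fixedpoints are reached by plain iteration from below and from above.)

def lfp(F):
    s = frozenset()                  # start below: the empty set
    while True:
        t = F(s)
        if t == s:
            return s
        s = t

def gfp(F, universe):
    s = frozenset(universe)          # start above: the whole universe
    while True:
        t = F(s)
        if t == s:
            return s
        s = t

# Example: even numbers below 100 as an inductive set: 0 is even, and
# n+2 is even when n is. The operator is monotone, so lfp exists.
U = range(100)
F = lambda s: frozenset({0} | {n + 2 for n in s if n + 2 < 100})
print(sorted(lfp(F))[:5])            # [0, 2, 4, 6, 8]
print(sorted(gfp(F, U))[:5])         # here lfp and gfp happen to coincide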
Examples include lists of n elements, the accessible part of a
relation and the set of primitive recursive functions. One example
of a coinductive definition is bisimulations for lazy lists.
Recursive datatypes are examined in detail, as well as one example
of a “codatatype”: lazy lists. The appendices are simple user’s
manuals for this Isabelle/ZF package.
Relational properties of domainsPitts, Andrew M.University of Cambridge, Computer Laboratory1993-12enTextUCAM-CL-TR-321ISSN 1476-2986
New tools are presented for reasoning about properties of
recursively defined domains. We work within a general,
category-theoretic framework for various notions of ‘relation’ on
domains and for actions of domain constructors on relations. Freyd’s
analysis of recursive types in terms of a property of mixed
initiality/finality is transferred to a corresponding property of
invariant relations. The existence of invariant relations is proved
under completeness assumptions about the notion of relation. We show
how this leads to simpler proofs of the computational adequacy of
denotational semantics for functional programming languages with
user-declared datatypes. We show how the initiality/finality
property of invariant relations can be specialized to yield an
induction principle for admissible subsets of recursively defined
domains, generalizing the principle of structural induction for
inductively defined sets. We also show how the initiality/finality
property gives rise to the co-induction principle studied by the
author (in UCAM-CL-TR-252), by which equalities between elements of
recursively defined domains may be proved via an appropriate notion
of ‘bisimulation’.
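(Schematically, in notation of my own: to prove two elements of a recursively defined domain equal, it suffices to exhibit a relation R relating them that is a bisimulation, i.e. a post-fixedpoint of the relational action Φ of the domain constructor:

\[
  \frac{x \mathrel{R} y \qquad R \subseteq \Phi(R)}{x = y}
\])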
Supporting distributed realtime computingLi, GuangxingUniversity of Cambridge, Computer Laboratory1993-12enTextUCAM-CL-TR-322ISSN 1476-2986Representing higher-order logic proofs in HOLvon Wright, J.University of Cambridge, Computer Laboratory1994-01enTextUCAM-CL-TR-323ISSN 1476-2986Verifying modular programs in HOLvon Wright, J.University of Cambridge, Computer Laboratory1994-01enTextUCAM-CL-TR-324ISSN 1476-2986The temporal properties of English conditionals and
modalsCrouch, RichardUniversity of Cambridge, Computer Laboratory1994-01enTextUCAM-CL-TR-325ISSN 1476-2986
This thesis deals with the patterns of temporal reference exhibited
by conditional and modal sentences in English, and specifically with
the way that past and present tenses can undergo deictic shift in
these contexts. This shifting behaviour has consequences both for
the semantics of tense and for the semantics of conditionals and
modality.
Asymmetries in the behaviour of the past and present tenses under
deictic shift are explained by positing a primary and secondary
deictic centre for tenses. The two deictic centres, the assertion
time and the verification time, are given independent motivation
through an information based view of tense. This holds that the
tense system not only serves to describe the way that the world
changes over time, but also the way that information about the world
changes. Information change takes place in two stages. First, it is
asserted that some fact holds. And then, either at the same time or
later, it is verified that this assertion is correct.
Typically, assertion and verification occur simultaneously, and most
sentences convey verified information. Modals and conditionals allow
delayed assertion and verification. “If A, then B” means roughly:
suppose you were now to assert A; if and when A is verified, you
will be in a position to assert B, and in due course this assertion
will also be verified. Since A and B will both be tensed clauses,
the shifting of the primary and secondary deictic centres leads to
shifted interpretations of the two clauses.
The thesis presents a range of temporal properties of indicative and
subjunctive conditionals that have not previously been discussed,
and shows how they can be explained. A logic is presented for
indicative conditionals, based around an extension of intuitionistic
logic to allow for both verified and unverified assertions. This
logic naturally gives rise to three forms of epistemic modality,
corresponding to “must”, “may” and “will”.
A modular and extensible network storage
architectureLo, Sai-LaiUniversity of Cambridge, Computer Laboratory1994-01enTextUCAM-CL-TR-326ISSN 1476-2986
Most contemporary distributed file systems are not designed to be
extensible. This work asserts that the lack of extensibility is a
problem because:
– New data types, such as continuous-medium data and structured
data, are significantly different from conventional unstructured
data, such as text and binary, that contemporary distributed file
systems are built to support.
– Value-adding clients can provide functional enhancements, such as
convenient and reliable persistent programming and automatic and
transparent file indexing, but cannot be integrated smoothly with
contemporary distributed file systems.
– New media technologies, such as the optical jukebox and RAID disk,
can extend the scale and performance of a storage service but
contemporary distributed file systems do not have a clear framework
to incorporate these new technologies and to provide the necessary
user level transparency.
Motivated by these observations, the new multi-service storage
architecture (MSSA) presented in this dissertation is designed to
be extensible. Design modularity is taken as the key to achieving
service extensibility. This dissertation examines a number of issues
related to the design of the architecture. New ideas are introduced,
such as a flexible access control mechanism based on temporary
capabilities, a low-level storage substrate that uses non-volatile
memory to provide atomic update semantics at high performance, and a
concept of sessions to differentiate the performance requirements of
different data types. Prototype implementations of the key components
are evaluated.
A new application for explanation-based generalisation
within automated deductionBaker, Siani L.University of Cambridge, Computer Laboratory1994-02enTextUCAM-CL-TR-327ISSN 1476-2986The formal verification of the Fairisle ATM switching
element: an overviewCurzon, PaulUniversity of Cambridge, Computer Laboratory1994-03enTextUCAM-CL-TR-328ISSN 1476-2986The formal verification of the Fairisle ATM switching
elementCurzon, PaulUniversity of Cambridge, Computer Laboratory1994-03enTextUCAM-CL-TR-329ISSN 1476-2986Interacting with paper on the DigitalDeskWellner, Pierre DavidUniversity of Cambridge, Computer Laboratory1994-03enTextUCAM-CL-TR-330ISSN 1476-2986
In the 1970’s Xerox PARC developed the “desktop metaphor,” which
made computers easy to use by making them look and act like ordinary
desks and paper. This led visionaries to predict the “paperless
office” would dominate within a few years, but the trouble with this
prediction is that people like paper too much. It is portable,
tactile, universally accepted, and easier to read than a screen.
Today, we continue to use paper, and computers produce more of it
than they replace.
Instead of trying to use computers to replace paper, the DigitalDesk
takes the opposite approach. It keeps the paper, but uses computers
to make it more powerful. It provides a Computer Augmented
Environment for paper.
The DigitalDesk is built around an ordinary physical desk and can be
used as such, but it has extra capabilities. A video camera is
mounted above the desk, pointing down at the work surface. This
camera’s output is fed through a system that can detect where the
user is pointing, and it can read documents that are placed on the
desk. A computer-driven electronic projector is also mounted above
the desk, allowing the system to project electronic objects onto the
work surface and onto real paper documents — something that can’t be
done with flat display panels or rear-projection. The system is
called DigitalDesk because it allows pointing with the fingers.
Several applications have been prototyped on the DigitalDesk. The
first was a calculator where a sheet of paper such as an annual
report can be placed on the desk allowing the user to point at
numbers with a finger or pen. The camera reads the numbers off the
paper, recognizes them, and enters them into the display for further
calculations. Another is a translation system which allows users to
point at unfamiliar French words to get their English definitions
projected down next to the paper. A third is a paper-based paint
program (PaperPaint) that allows users to sketch on paper using
traditional tools, but also be able to select and paste these
sketches with the camera and projector to create merged paper and
electronic documents. A fourth application is the DoubleDigitalDesk,
which allows remote colleagues to “share” their desks, look at each
other’s paper documents and sketch on them remotely.
This dissertation introduces the concept of Computer Augmented
Environments, describes the DigitalDesk and applications for it, and
discusses some of the key implementation issues that need to be
addressed to make this system work. It describes a toolkit for
building DigitalDesk applications, and it concludes with some more
ideas for future work.
HPP: a hierarchical propagation protocol for large scale
replication in wide area networksAdly, NohaKumar, AkhilUniversity of Cambridge, Computer Laboratory1994-03enTextUCAM-CL-TR-331ISSN 1476-2986Distributed computing with objectsEvers, David MartinUniversity of Cambridge, Computer Laboratory1994-03enTextUCAM-CL-TR-332ISSN 1476-2986What is a categorical model of intuitionistic linear
logic?Bierman, G.M.University of Cambridge, Computer Laboratory1994-04enTextUCAM-CL-TR-333ISSN 1476-2986A concrete final coalgebra theorem for ZF set
theoryPaulson, Lawrence C.University of Cambridge, Computer Laboratory1994-05enTextUCAM-CL-TR-334ISSN 1476-2986
A special final coalgebra theorem, in the style of Aczel (1988), is
proved within standard Zermelo-Fraenkel set theory. Aczel’s
Anti-Foundation Axiom is replaced by a variant definition of
function that admits non-well-founded constructions. Variant ordered
pairs and tuples, of possibly infinite length, are special cases of
variant functions. Analogues of Aczel’s Solution and Substitution
Lemmas are proved in the style of Rutten and Turi (1993).
The approach is less general than Aczel’s; non-well-founded objects
can be modelled only using the variant tuples and functions. But the
treatment of non-well-founded objects is simple and concrete. The
final coalgebra of a functor is its greatest fixedpoint. The theory
is intended for machine implementation and a simple case of it is
already implemented using the theorem prover Isabelle.
Video mail retrieval using voice: report on keyword
definition and data collection (deliverable report on VMR task
No. 1)Jones, G.J.F.Foote, J.T.Spärck Jones, K.Young, S.J.University of Cambridge, Computer Laboratory1994-04enTextUCAM-CL-TR-335ISSN 1476-2986
This report describes the rationale, design, collection and basic
statistics of the initial training and test database for the
Cambridge Video Mail Retrieval (VMR) project. This database is
intended to support both training for the wordspotting processes and
testing for the document searching methods using these that are
being developed for the project’s message retrieval task.
Towards a proof theory of rewriting: the simply-typed 2-λ
calculusHilken, Barnaby P.University of Cambridge, Computer Laboratory1994-05enTextUCAM-CL-TR-336ISSN 1476-2986Efficiency in a fully-expansive theorem proverBoulton, Richard JohnUniversity of Cambridge, Computer Laboratory1994-05enTextUCAM-CL-TR-337ISSN 1476-2986
The HOL system is a fully-expansive theorem prover: Proofs generated
in the system are composed of applications of the primitive
inference rules of the underlying logic. This has two main
advantages. First, the soundness of the system depends only on the
implementations of the primitive rules. Second, users can be given
the freedom to write their own proof procedures without the risk of
making the system unsound. A full functional programming language is
provided for this purpose. The disadvantage with the approach is
that performance is compromised. This is partly due to the inherent
cost of fully expanding a proof but, as demonstrated in this thesis,
much of the observed inefficiency is due to the way the derived
proof procedures are written.
This thesis seeks to identify sources of non-inherent inefficiency
in the HOL system and proposes some general-purpose and some
specialised techniques for eliminating it. One area that seems to be
particularly amenable to optimisation is equational reasoning. This
is significant because equational reasoning constitutes large
portions of many proofs. A number of techniques are proposed that
transparently optimise equational reasoning. Existing programs in
the HOL system require little or no modification to work faster.
The other major contribution of this thesis is a framework in which
part of the computation involved in HOL proofs can be postponed.
This enables users to make better use of their time. The technique
exploits a form of lazy evaluation. The critical feature is the
separation of the code that generates the structure of a theorem
from the code that justifies it logically. Delaying the
justification allows some non-local optimisations to be performed in
equational reasoning. None of the techniques sacrifice the security
of the fully-expansive approach.
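(The following toy Python sketch — the names and representation are mine, not HOL's ML data structures — conveys the separation: the statement of a theorem is built eagerly and cheaply, while the justification is a thunk replayed only on demand.)

class LazyTheorem:
    def __init__(self, statement, justify):
        self.statement = statement    # structure, built immediately
        self._justify = justify       # thunk replaying primitive inferences
        self._checked = False

    def force(self):
        """Run the postponed justification exactly once."""
        if not self._checked:
            assert self._justify(self.statement), "justification failed"
            self._checked = True
        return self.statement

# A derived rule constructs the result term at once and defers its proof.
thm = LazyTheorem("|- 2 + 2 = 4", lambda s: 2 + 2 == 4)
print(thm.statement)   # available immediately
print(thm.force())     # justification replayed on demand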
A decision procedure for a subset of the theory of linear arithmetic
is used to illustrate many of the techniques. Decision procedures
for this theory are commonplace in theorem provers due to the
importance of arithmetic reasoning. The techniques described in the
thesis have been implemented and execution times are given. The
implementation of the arithmetic procedure is a major contribution
in itself. For the first time, users of the HOL system are able to
prove many arithmetic lemmas automatically in a practical amount of
time (typically a second or two).
The applicability of the techniques to other fully-expansive theorem
provers and possible extensions of the ideas are considered.
A new approach to implementing atomic data typesWu, ZhixueUniversity of Cambridge, Computer Laboratory1994-05enTextUCAM-CL-TR-338ISSN 1476-2986Belief revision and dialogue management in information
retrievalLogan, BrianReece, StevenCawsey, AlisonGalliers, JuliaSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1994-05enTextUCAM-CL-TR-339ISSN 1476-2986
This report describes research to evaluate a theory of belief
revision proposed by Galliers in the context of information-seeking
interaction as modelled by Belkin, Brooks and Daniels and
illustrated by user-librarian dialogues. The work covered the
detailed assessment and development, and computational
implementation and testing, of both the belief revision theory and
the information retrieval model. Some features of the belief theory
presented problems, and the original ‘multiple expert’ retrieval
model had to be drastically modified to support rational dialogue
management. But the experimental results showed that the
characteristics of literature seeking interaction could be
successfully captured by the belief theory, exploiting important
elements of the retrieval model. Thus, though the system’s knowledge
and dialogue performance were very limited, it provides a useful
base for further research. The report presents all aspects of the
research in detail, with particular emphasis on the implementation
of belief and intention revision, and the integration of revision
with domain reasoning and dialogue interaction.
Operating system support for quality of serviceHyden, Eoin AndrewUniversity of Cambridge, Computer Laboratory1994-06enTextUCAM-CL-TR-340ISSN 1476-2986
The deployment of high speed, multiservice networks within the local
area has meant that it has become possible to deliver continuous
media data to a general purpose workstation. This, in conjunction
with the increasing speed of modern microprocessors, means that it
is now possible to write application programs which manipulate
continuous media in real-time. Unfortunately, current operating
systems do not provide the resource management facilities which are
required to ensure the timely execution of such applications.
This dissertation presents a flexible resource management paradigm,
based on the notion of Quality of Service, with which it is possible
to provide the scheduling support required by continuous media
applications. The mechanisms which are required within an operating
system to support this paradigm are described, and the design and
implementation of a prototypical kernel which implements them is
presented.
It is shown that, by augmenting the interface between an application
and the operating system, the application can be informed of varying
resource availabilities, and can make use of this information to
vary the quality of its results. In particular an example decoder
application is presented, which makes use of such information and
exploits some of the fundamental properties of continuous media data
to trade video image quality for the amount of processor time which
it receives.
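(A hypothetical sketch of the feedback loop described in the last paragraph; all names and thresholds are illustrative, not the dissertation's.)

def pick_quality(cpu_share):
    """Map the processor share reported by the OS to a decode strategy."""
    if cpu_share >= 0.50:
        return "full frames, full rate"
    if cpu_share >= 0.25:
        return "full frames, half rate"    # drop alternate frames
    return "thumbnail frames"              # decode at reduced resolution

for share in (0.6, 0.3, 0.1):
    print(f"share={share:.2f}: {pick_quality(share)}")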
Presentation support for distributed multimedia
applicationsBates, JohnUniversity of Cambridge, Computer Laboratory1994-06enTextUCAM-CL-TR-341ISSN 1476-2986An architecture for distributed user interfacesFreeman, Stephen Martin GuyUniversity of Cambridge, Computer Laboratory1994-07enTextUCAM-CL-TR-342ISSN 1476-2986
Computing systems have changed rapidly since the first graphical
user interfaces were developed. Hardware has become faster and
software architectures have become more flexible and more open; a
modern computing system consists of many communicating machines
rather than a central host. Understanding of human-computer
interaction has also become more sophisticated and places new
demands on interactive software; these include, in particular,
support for multi-user applications, continuous media, and
‘ubiquitous’ computing. The layer which binds user requirements and
computing systems together, the user interface, has not changed as
quickly; few user interface architectures can easily support the new
requirements placed on them and few take advantage of the facilities
offered by advanced computing systems.
Experience of implementing systems with unusual user interfaces has
shown that current window system models are only a special case of
possible user interface architectures. These window systems are too
strongly tied to assumptions about how users and computers interact
to provide a suitable platform for further evolution. Users and
application builders may reasonably expect to be able to use
multiple input and output devices as their needs arise. Experimental
applications show that flexible user interface architectures, which
support multiple devices and users, can be built without excessive
implementation and processing costs.
This dissertation describes Gemma, a model for a new generation of
interactive systems that are not confined to virtual terminals but
allow collections of independent devices to be bound together for
the task at hand. It provides mediated shared access to basic
devices and higher-level virtual devices so that people can share
computational facilities in the real world, rather than in a virtual
world. An example window system shows how these features may be
exploited to provide a flexible, collaborative and mobile
interactive environment.
The contour tree image encoding technique and file
formatTurner, Martin JohnUniversity of Cambridge, Computer Laboratory1994-07enTextUCAM-CL-TR-344ISSN 1476-2986A proof environment for arithmetic with the Omega
ruleBaker, Siani L.University of Cambridge, Computer Laboratory1994-08enTextUCAM-CL-TR-345ISSN 1476-2986On intuitionistic linear logicBierman, G.M.University of Cambridge, Computer Laboratory1994-08enTextUCAM-CL-TR-346ISSN 1476-2986
In this thesis we carry out a detailed study of the (propositional)
intuitionistic fragment of Girard’s linear logic (ILL). Firstly we
give sequent calculus, natural deduction and axiomatic formulations
of ILL. In particular our natural deduction is different from others
and has important properties, such as closure under substitution,
which others lack. We also study the process of reduction in all
three local formulations, including a detailed proof of cut
elimination. Finally, we consider translations between
Intuitionistic Logic (IL) and ILL.
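(For orientation, two representative ILL sequent rules — standard formulations, not quoted from the thesis — are the right-introduction of the multiplicative conjunction ⊗ and the promotion rule for the exponential !:

\[
  \frac{\Gamma \vdash A \qquad \Delta \vdash B}
       {\Gamma,\, \Delta \vdash A \otimes B}\;(\otimes\text{-R})
  \qquad
  \frac{!\Gamma \vdash A}{!\Gamma \vdash\; !A}\;(\text{promotion})
\])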
We then consider the linear term calculus, which arises from
applying the Curry-Howard correspondence to the natural deduction
formulation. We show how the various proof theoretic formulations
suggest reductions at the level of terms. The properties of strong
normalization and confluence are proved for these reduction rules.
We also consider mappings between the extended λ-calculus and the
linear term calculus.
Next we consider a categorical model for ILL. We show how by
considering the linear term calculus as an equational logic, we can
derive a model: a linear category. We consider two alternative
models: firstly, one due to Seely and then one due to Lafont.
Surprisingly, we find that Seely’s model is not sound, in that equal
terms are not modelled with equal morphisms. We show how after
adapting Seely’s model (by viewing it in a more abstract setting) it
becomes a particular instance of a linear category. We show how
Lafont’s model can also be seen as another particular instance of a
linear category. Finally we consider various categories of
coalgebras, whose construction can be seen as a categorical
equivalent of the translation of IL into ILL.
Reflections on TRECSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory1994-07enTextUCAM-CL-TR-347ISSN 1476-2986
This paper discusses the Text REtrieval Conferences (TREC) programme
as a major enterprise in information retrieval research. It reviews
its structure as an evaluation exercise, characterises the methods
of indexing and retrieval being tested within it in terms of the
approaches to system performance factors these represent; analyses
the test results for solid, overall conclusions that can be drawn
from them; and, in the light of the particular features of the test
data, assesses TREC both for generally-applicable findings that
emerge from it and for directions it offers for future research.
Integrated sound synchronisation for computer
animationHunter, Jane LouiseUniversity of Cambridge, Computer Laboratory1994-08enTextUCAM-CL-TR-348ISSN 1476-2986A HOL interpretation of NodenGraham, BrianUniversity of Cambridge, Computer Laboratory1994-09enTextUCAM-CL-TR-349ISSN 1476-2986Ten commandments of formal methodsBowen, Jonathan P.Hinchey, Michael G.University of Cambridge, Computer Laboratory1994-09enTextUCAM-CL-TR-350ISSN 1476-2986Handling realtime traffic in mobile networksBiswas, Subir KumarUniversity of Cambridge, Computer Laboratory1994-09enTextUCAM-CL-TR-351ISSN 1476-2986
The rapidly advancing technology of cellular communication and
wireless LANs makes ubiquitous computing feasible, in which mobile
users have access to location-independent information and computing
resources. Multimedia networking is another emerging technological
trend of the 1990s, and there is an increasing demand for supporting
continuous media traffic in wireless personal communication
environments. In order to guarantee the strict performance
requirements of realtime traffic, connection-oriented approaches are
proving more efficient than conventional datagram-based networking.
This dissertation deals with a network architecture and its design
issues for implementing connection-oriented services in a mobile
radio environment.
The wired backbone of the proposed wireless LAN comprises high-speed
ATM switching elements, connected in a modular fashion, where new
switches and user devices can be dynamically added and reconnected
to maintain a desired topology. A dynamic
reconfiguration protocol, which can cope with these changing network
topologies, is proposed for the present network architecture. The
details about a prototype implementation of the protocol and a
simulation model for its performance evaluation are presented.
CSMA/AED, a single-frequency, carrier-sensing protocol, is
proposed for the radio medium access operations. A simulation model
is developed in order to investigate the feasibility of this
statistical and reliable access scheme for the proposed radio
network architecture. The effectiveness of a per-connection
window-based flow control mechanism for the proposed radio LAN is
also investigated. A hybrid technique is used, where the medium access
and the radio data-link layers are modelled using the mentioned
simulator; an upper layer end-to-end queueing model, involving flow
dependent servers, is solved using an approximate Mean Value
Analysis technique which is augmented for faster iterative
convergence.
A distributed location server, for managing mobile users’ location
information and for aiding the mobile connection management tasks,
is proposed. In order to hide the effects of mobility from the
non-mobile network entities, the concept of a per-mobile software
entity, known as a “representative”, is introduced. A mobile
connection management scheme is also proposed for handling the
end-to-end network layer connections in the present mobile
environment. The scheme uses the representatives and a novel
connection caching technique for providing the necessary realtime
traffic support functionalities.
A prototype system, comprising the proposed location and
connection managers, has been built to demonstrate the
feasibility of the presented architecture for transporting
continuous media traffic. A set of experiments have been carried out
in order to investigate the impacts of various design decisions and
to identify the performance-critical parts of the design.
A mixed linear and non-linear logic: proofs, terms and
modelsBenton, P.N.University of Cambridge, Computer Laboratory1994-10enTextUCAM-CL-TR-352ISSN 1476-2986Merging HOL with set theoryGordon, MikeUniversity of Cambridge, Computer Laboratory1994-11enTextUCAM-CL-TR-353ISSN 1476-2986
Set theory is the standard foundation for mathematics, but the
majority of general purpose mechanized proof assistants support
versions of type theory (higher order logic). Examples include Alf,
Automath, Coq, Ehdm, HOL, IMPS, Lambda, LEGO, Nuprl, PVS and
Veritas. For many applications type theory works well and provides
for specification the benefits of type-checking that are well known
in programming. However, there are areas where types get in the way
or seem unmotivated. Furthermore, most people with a scientific or
engineering background already know set theory, whereas type theory
may appear inaccessible and so be an obstacle to the uptake of proof
assistants based on it. This paper describes some experiments (using
HOL) in combining set theory and type theory; the aim is to get the
best of both worlds in a single system. Three approaches have been
tried, all based on an axiomatically specified type V of ZF-like
sets: (i) HOL is used without any additions besides V; (ii) an
embedding of the HOL logic into V is provided; (iii) HOL axiomatic
theories are automatically translated into set-theoretic
definitional theories. These approaches are illustrated with two
examples: the construction of lists and a simple lemma in group
theory.
Formalising a model of the λ-calculus in HOL-STAgerholm, StenUniversity of Cambridge, Computer Laboratory1994-10enTextUCAM-CL-TR-354ISSN 1476-2986Two cryptographic notesWheeler, DavidNeedham, RogerUniversity of Cambridge, Computer Laboratory1994-11enTextUCAM-CL-TR-355ISSN 1476-2986
A large block DES-like algorithm
DES was designed to be slow in software. We give here a DES type of
code which applies directly to single blocks comprising two or more
words of 32 bits. It is thought to be at least as secure as
performing DES separately on two word blocks, and has the added
advantage of not requiring chaining etc. It is about 8m/(12+2m)
times as fast as DES for an m word block and has a greater gain for
Feistel codes where the number of rounds is greater. We use the name
GDES for the codes we discuss. The principle can be used on any
Feistel code.
TEA, a Tiny Encryption Algorithm
We design a short program which will run on most machines and
encipher safely. It uses a large number of iterations rather than a
complicated program. It is hoped that it can easily be translated
into most languages in a compatible way. The first program is given
below. It uses little set-up time and performs a weak non-linear
iteration for enough rounds to make it secure. There are no preset
tables or long set-up times. It assumes 32-bit words.
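(The program itself is omitted from this abstract. For orientation, the published TEA encipher round structure transcribes into Python as follows; the original is a short C routine, and the explicit 32-bit masking here replaces C's unsigned overflow.)

MASK = 0xFFFFFFFF
DELTA = 0x9E3779B9          # key schedule constant, derived from the golden ratio

def tea_encipher(v, k, rounds=32):
    """Encipher the two 32-bit words in v under the four 32-bit words in k."""
    y, z = v
    s = 0
    for _ in range(rounds):
        s = (s + DELTA) & MASK
        y = (y + ((((z << 4) & MASK) + k[0]) ^ (z + s) ^ ((z >> 5) + k[1]))) & MASK
        z = (z + ((((y << 4) & MASK) + k[2]) ^ (y + s) ^ ((y >> 5) + k[3]))) & MASK
    return y, z

print([hex(w) for w in tea_encipher((0, 0), (0, 1, 2, 3))])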
Simple, proven approaches to text retrievalRobertson, S.E.Spärck Jones, K.University of Cambridge, Computer Laboratory1994-12enTextUCAM-CL-TR-356ISSN 1476-2986
This technical note describes straightforward techniques for
document indexing and retrieval that have been solidly established
through extensive testing and are easy to apply. They are useful for
many different types of text material, are viable for very large
files, and have the advantage that they do not require special
skills or training for searching, but are easy for end users.
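(The note's exact weighting formulae are not reproduced here. As a minimal sketch of the style of technique it advocates — within-document term frequency combined with inverse document frequency — the toy data and constants below are mine.)

import math
from collections import Counter

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs"]
index = [Counter(d.split()) for d in docs]
N = len(docs)

def idf(term):
    n = sum(1 for d in index if term in d)   # document frequency
    return math.log((N + 0.5) / (n + 0.5))   # rare terms weigh more

def score(query, d):
    return sum(d[t] * idf(t) for t in query.split())

query = "cat sat"
best = max(range(N), key=lambda i: score(query, index[i]))
print(best, docs[best])     # document 0 matches best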
Seven more myths of formal methodsBowen, Jonathan P.Hinchey, Michael G.University of Cambridge, Computer Laboratory1994-12enTextUCAM-CL-TR-357ISSN 1476-2986Multithreaded processor designMoore, Simon WilliamUniversity of Cambridge, Computer Laboratory1995-02enTextUCAM-CL-TR-358ISSN 1476-2986A case study of co-induction in IsabelleFrost, JacobUniversity of Cambridge, Computer Laboratory1995-02enTextUCAM-CL-TR-359ISSN 1476-2986
The consistency of the dynamic and static semantics for a small
functional programming language was informally proved by R. Milner
and M. Tofte. The notions of co-inductive definitions and the
associated principle of co-induction played a pivotal role in the
proof. With emphasis on co-induction, the work presented here deals
with the formalisation of this result in the generic theorem prover
Isabelle.
On the calculation of explicit polymetresClocksin, W.F.University of Cambridge, Computer Laboratory1995-03enTextUCAM-CL-TR-360ISSN 1476-2986
Computer scientists take an interest in objects or events which can
be counted, grouped, timed and synchronised. The computational
problems involved with the interpretation and notation of musical
rhythm are therefore of particular interest, as the most complex
time-stamped structures yet devised by humankind are to be found in
music notation. These problems are brought into focus when
considering explicit polymetric notation, which is the concurrent
use of different time signatures in music notation. While not in
common use, the notation can be used to specify complicated
cross-rhythms, simple versus compound metres, and unequal note
values without the need for tuplet notation. From a computational
point of view, explicit polymetric notation is a means of specifying
synchronisation relationships amongst multiple time-stamped streams.
Human readers of explicit polymetric notation use the time signatures
together with the layout of barlines and musical events as clues to
determine the performance. However, if the aim is to lay out the
notation (such as might be required by an automatic music notation
processor), the location of barlines and musical events will be
unknown, and it is necessary to calculate them given only the
information conveyed by the time signatures. Similar problems arise
when trying to perform the notation (i.e. animate the specification)
in real-time. Some problems in the interpretation of explicit
polymetric notation are identified and a solution is proposed. Two
different interpretations are distinguished, and methods for their
automatic calculation are given. The solution given may be applied
to problems which involve the synchronisation or phase adjustment of
multiple independent threads of time-stamped objects.
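(As one hedged illustration of the kind of calculation involved — not the paper's algorithm: under the interpretation in which all parts share a common underlying beat, the span after which every part's barlines realign is a least common multiple of the bar lengths.)

from fractions import Fraction
from math import lcm

def bar_length(num, den):
    return Fraction(num, den)          # e.g. 3/4 -> three crotchets

def realignment(signatures):
    """Span, in whole notes, after which all barlines coincide."""
    bars = [bar_length(n, d) for n, d in signatures]
    common_den = lcm(*(b.denominator for b in bars))
    ticks = [b.numerator * (common_den // b.denominator) for b in bars]
    return Fraction(lcm(*ticks), common_den)

# 3/4 against 4/4: barlines coincide every 3 whole notes (12 crotchets).
print(realignment([(3, 4), (4, 4)]))   # 3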
Explicit network schedulingBlack, Richard JohnUniversity of Cambridge, Computer Laboratory1995-04enTextUCAM-CL-TR-361ISSN 1476-2986
This dissertation considers various problems which the scheduling
and network I/O organisation found in conventional operating systems
pose for the effective support of multimedia applications which
require Quality of Service.
A solution for these problems is proposed in a micro-kernel
structure. The pivotal features of the proposed design are that the
processing of device interrupts is performed by user-space processes
which are scheduled by the system like any other, that events are
used for both inter- and intra-process synchronisation, and that a
specially developed high-performance I/O buffer management system is
used.
An evaluation of an experimental implementation is included. In
addition to solving the scheduling and networking problems
addressed, the prototype is shown to out-perform the Wanda system (a
locally developed micro-kernel) on the same platform.
This dissertation concludes that it is possible to construct an
operating system where the kernel provides only the fundamental job
of fine grain sharing of the CPU between processes, and hence
synchronisation between those processes. This enables processes to
perform task specific optimisations; as a result system performance
is enhanced, both with respect to throughput and the meeting of soft
real-time guarantees.
W-learning: competition among selfish Q-learnersHumphrys, MarkUniversity of Cambridge, Computer Laboratory1995-04enTextUCAM-CL-TR-362ISSN 1476-2986
W-learning is a self-organising action-selection scheme for systems
with multiple parallel goals, such as autonomous mobile robots. It
uses ideas drawn from the subsumption architecture for mobile robots
(Brooks), implementing them with the Q-learning algorithm from
reinforcement learning (Watkins). Brooks explores the idea of
multiple sensing-and-acting agents within a single robot, more than
one of which is capable of controlling the robot on its own if
allowed. I introduce a model where the agents are not only
autonomous, but are in fact engaged in direct competition with each
other for control of the robot. Interesting robots are ones where no
agent achieves total victory, but rather the state-space is
fragmented among different agents. Having the agents operate by
Q-learning proves to be a way to implement this, leading to a local,
incremental algorithm (W-learning) to resolve competition. I present
a sketch proof that this algorithm converges when the world is a
discrete, finite Markov decision process. For each state,
competition is resolved with the most likely winner of the state
being the agent that is most likely to suffer the most if it does
not win. In this way, W-learning can be viewed as ‘fair’ resolution
of competition. In the empirical section, I show how W-learning may
be used to define spaces of agent-collections whose action selection
is learnt rather than hand-designed. This is the kind of
solution-space that may be searched with a genetic algorithm.
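(A schematic Python sketch of the competition mechanism as described above; the toy state, learning rate and update rule are my illustrative assumptions, not the report's exact algorithm.)

ACTIONS = ["left", "right"]

class Agent:
    def __init__(self):
        self.Q = {}          # (state, action) -> value, learnt by Q-learning
        self.W = {}          # state -> predicted loss if this agent is disobeyed

    def q(self, s, a):
        return self.Q.get((s, a), 0.0)

    def best(self, s):
        return max(ACTIONS, key=lambda a: self.q(s, a))

def select(agents, s):
    """Competition: the agent with the highest W value wins the state."""
    return max(agents, key=lambda ag: ag.W.get(s, 0.0))

def w_update(agent, s, reward, alpha=0.1):
    """Raise W by the loss suffered when another agent's action was taken:
    the agent's preferred Q value minus what it actually received."""
    loss = agent.q(s, agent.best(s)) - reward
    agent.W[s] = (1 - alpha) * agent.W.get(s, 0.0) + alpha * loss

a1, a2 = Agent(), Agent()
a1.Q[("s0", "left")] = 5.0       # a1 cares strongly about state s0
a2.Q[("s0", "right")] = 1.0
for ag in (a1, a2):
    w_update(ag, "s0", reward=0.0)
print(select([a1, a2], "s0") is a1)   # True: a1 has the most to lose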
Names and higher-order functionsStark, IanUniversity of Cambridge, Computer Laboratory1995-04enTextUCAM-CL-TR-363ISSN 1476-2986
Many functional programming languages rely on the elimination of
‘impure’ features: assignment to variables, exceptions and even
input/output. But some of these are genuinely useful, and it is of
real interest to establish how they can be reintroduced in a
controlled way. This dissertation looks in detail at one example of
this: the addition to a functional language of dynamically generated
“names”. Names are created fresh, they can be compared with each
other and passed around, but that is all. As a very basic example of
“state”, they capture the graduation between private and public,
local and global, by their interaction with higher-order functions.
The vehicle for this study is the “nu-calculus”, an extension of the
simply-typed lambda-calculus. The nu-calculus is equivalent to a
certain fragment of Standard ML, omitting side-effects, exceptions,
datatypes and recursion. Even without all these features, the
interaction of name creation with higher-order functions can be
complex and subtle.
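(A rough Python analogy for the behaviour of names — illustrative only, since Python's state is far richer than the nu-calculus: each object() is a fresh name, equal only to itself, and a name captured in a closure remains private.)

def nu():
    """Create a fresh name (cf. nu x. ...)."""
    return object()

n, m = nu(), nu()
print(n == n, n == m)    # True False: a name equals only itself

def make_checker():
    secret = nu()        # private name, never escapes directly
    return lambda x: x == secret

check = make_checker()
print(check(nu()))       # False: fresh names never match the secret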
Various operational and denotational methods for reasoning about the
nu-calculus are developed. These include a computational
metalanguage in the style of Moggi, which distinguishes in the type
system between values and computations. This leads to categorical
models that use a strong monad, and examples are devised based on
functor categories.
The idea of “logical relations” is used to derive powerful reasoning
methods that capture some of the distinction between private and
public names. These techniques are shown to be complete for
establishing contextual equivalence between first-order expressions;
they are also used to construct a correspondingly abstract
categorical model.
All the work with the nu-calculus extends cleanly to Reduced ML, a
larger language that introduces integer references: mutable storage
cells that are dynamically allocated. It turns out that the step up
is quite simple, and both the computational metalanguage and the
sample categorical models can be reused.
The Church-Rosser theorem in Isabelle: a proof porting
experimentRasmussen, OleUniversity of Cambridge, Computer Laboratory1995-04enTextUCAM-CL-TR-364ISSN 1476-2986
This paper describes a proof of the Church-Rosser theorem for the
pure lambda-calculus formalised in the Isabelle theorem prover. The
initial version of the proof is ported from a similar proof done in
the Coq proof assistant by Gérard Huet, but a number of
optimisations have been performed. The development involves the
introduction of several inductive and recursive definitions and thus
gives a good presentation of the inductive package of Isabelle.
Computational types from a logical perspective IBenton, P.N.Bierman, G.M.de Paiva, V.C.V.University of Cambridge, Computer Laboratory1995-05enTextUCAM-CL-TR-365ISSN 1476-2986Retrieving spoken documents: VMR Project
experimentsSpärck Jones, K.Jones, G.J.F.Foote, J.T.Young, S.J.University of Cambridge, Computer Laboratory1995-05enTextUCAM-CL-TR-366ISSN 1476-2986Categorical logicPitts, Andrew M.University of Cambridge, Computer Laboratory1995-05enTextUCAM-CL-TR-367ISSN 1476-2986
This document provides an introduction to the interaction between
category theory and mathematical logic which is slanted towards
computer scientists. It will be a chapter in the forthcoming Volume
VI of: S. Abramsky, D. M. Gabbay, and T. S. E. Maibaum (eds),
“Handbook of Logic in Computer Science”, Oxford University Press.
CogPiT – configuration of protocols in TIPStiller, BurkhardUniversity of Cambridge, Computer Laboratory1995-06enTextUCAM-CL-TR-368ISSN 1476-2986
The variety of upcoming applications in terms of their performance
and Quality-of-Service (QoS) requirements is increasing. Besides
well-known applications such as teleconferencing and audio and video
transmission, more contemporary ones, such as medical imaging,
Video-on-Demand, and interactive tutoring systems, are being
introduced and applied to existing networks. In contrast,
traditional data-oriented applications, such as file transfer and
remote login, are considerably different in terms of their QoS
requirements. The consequences of this evolution therefore affect
the architectures of end-systems, e.g., workstations that have to be
capable of handling all different kinds of multimedia data, and of
intermediate systems.
Therefore, a configuration approach for communication protocols has
been developed to support this variety of applications. This
approach offers the possibility of configuring communication
protocols automatically, depending on the application requirements
expressed in
various QoS parameters. The result, an application-tailored
communication protocol, matches the requested application
requirements as far as possible. Additionally, network and system
resources (NSR) are taken into account for a well-suited
configuration.
The Configuration of Protocols in TIP is called CogPiT and is part
of the Transport and Internetworking Package (TIP). As an example,
in the TIP environment the transport protocol TEMPO is used for
configuration purposes.
A comparison of HOL-ST and Isabelle/ZFAgerholm, StenUniversity of Cambridge, Computer Laboratory1995-07enTextUCAM-CL-TR-369ISSN 1476-2986
The use of higher order logic (simple type theory) is often limited
by its restrictive type system. Set theory allows many constructions
on sets that are not possible on types in higher order logic. This
paper presents a comparison of two theorem provers supporting set
theory, namely HOL-ST and Isabelle/ZF, based on a formalization of
the inverse limit construction of domain theory; this construction
cannot be formalized in higher order logic directly. We argue that
whilst the combination of higher order logic and set theory in
HOL-ST has advantages over the first order set theory in
Isabelle/ZF, the proof infrastructure of Isabelle/ZF has better
support for set theory proofs than HOL-ST. Proofs in Isabelle/ZF are
both considerably shorter and easier to write.
A package for non-primitive recursive function definitions
in HOLAgerholm, StenUniversity of Cambridge, Computer Laboratory1995-07enTextUCAM-CL-TR-370ISSN 1476-2986LIMINF convergence in Ω-categoriesWagner, Kim RitterUniversity of Cambridge, Computer Laboratory1995-06enTextUCAM-CL-TR-371ISSN 1476-2986A brief history of mobile telephonyHild, Stefan G.University of Cambridge, Computer Laboratory1995-01enTextUCAM-CL-TR-372ISSN 1476-2986
Mobile telephony has gone through a decade of tremendous change and
progress. Today, mobile phones are an indispensable tool to many
professionals, and have great potential to become vital components
in mobile data communication applications. In this survey we will
attempt to present some of the milestones from the route which
mobile telephony has taken over the past decades while developing
from an experimental system with limited capabilities with to a
mature technology (section 1), followd by a more detailed
introduction into the modern pan-European GSM standard (section 2).
Section 3 is devoted to the data communication services, covering
two packet-oriented data only networks as well as data services
planned for the GSM system. Section 4 covers some security issues
and section 5 gives an insight into the realities today with details
of some networks available in the UK. Finally, section 6 concludes
this overview with a brief look into the future.
Natural-language processing and requirements
specificationsMacías, BenjamínPulman, Stephen G.University of Cambridge, Computer Laboratory1995-07enTextUCAM-CL-TR-373ISSN 1476-2986A framework for QoS updates in a networking
environmentStiller, BurkhardUniversity of Cambridge, Computer Laboratory1995-07enTextUCAM-CL-TR-374ISSN 1476-2986
The support of sufficient Quality-of-Service (QoS) for applications
residing in a distributed environment and running on top of high
performance networks is a demanding issue. Currently, the areas to
provide this support adequately include communication protocols,
operating systems support, and offered network services. A
configurable approach to communication protocols offers the needed
protocol flexibility to react appropriately to a variety of
different requirements.
Communication protocols and operating systems have to be
parametrized using internal configuration parameters, such as window
sizes, retry counters, or scheduling mechanisms, that rely closely
on requested application-oriented or network-dependent QoS, such as
bandwidth or delay. Moreover, these internal parameters have to be
recalculated from time to time due to network changes (such as
congestion or line break-down) or due to application-specific
alterations (such as enhanced bandwidth requirements or increased
reliability) to adjust a temporary or semi-permanent “out-of-tune”
service behavior.
Therefore, a rule-based evaluation and QoS updating framework for
configuration parameters in a networking environment has been
developed. The resulting “rulework” can be used within highly
dynamic environments in a communication subsystem that offers the
possibility to specify for every QoS parameter both a bounding
interval of values and an average value. As an example, the
framework has been integrated in the Function-based Communication
Subsystem (F-CSS). In particular, an enhanced application service
interface is offered, allowing for the specification of various
QoS-parameters that are used to configure a sufficient
application-tailored communication protocol.
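(A hedged sketch of what one such rule might look like: each QoS parameter carries a bounding interval and an average, and a rule recalculates an internal protocol parameter when measurements drift. The window-size rule, the parameter layout and all names below are my illustrative assumptions, not F-CSS's.)

qos = {"bandwidth_kbps": {"min": 64, "avg": 256, "max": 512},
       "delay_ms":       {"min": 5,  "avg": 20,  "max": 50}}

def window_size_rule(qos):
    """Bandwidth-delay product, in bytes, as a sender-window heuristic."""
    bw = qos["bandwidth_kbps"]["avg"] * 1000 / 8     # bytes per second
    rtt = 2 * qos["delay_ms"]["avg"] / 1000          # seconds
    return int(bw * rtt)

print(window_size_rule(qos))   # recalculated whenever measurements change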
Restructuring virtual memory to support distributed
computing environmentsHuang, FengUniversity of Cambridge, Computer Laboratory1995-07enTextUCAM-CL-TR-375ISSN 1476-2986The structure of a multi-service operating systemRoscoe, TimothyUniversity of Cambridge, Computer Laboratory1995-08enTextUCAM-CL-TR-376ISSN 1476-2986
Increases in processor speed and network bandwidth have led to
workstations being used to process multimedia data in real time.
These applications have requirements not met by existing operating
systems, primarily in the area of resource control: there is a need
to reserve resources, in particular the processor, at a fine
granularity. Furthermore, guarantees need to be dynamically
renegotiated to allow users to reassign resources when the machine
is heavily loaded. There have been few attempts to provide the
necessary facilities in traditional operating systems, and the
internal structure of such systems makes the implementation of
useful resource control difficult.
This dissertation presents a way of structuring an operating system
to reduce crosstalk between applications sharing the machine, and
enable useful resource guarantees to be made: instead of system
services being located in the kernel or server processes, they are
placed as much as possible in client protection domains and
scheduled as part of the client, with communication between domains
only occurring when necessary to enforce protection and concurrency
control. This amounts to multiplexing the service at as low a level
of abstraction as possible. A mechanism for sharing processor time
between resources is also described. The prototype Nemesis operating
system is used to demonstrate the ideas in use in a practical
system, and to illustrate solutions to several implementation
problems that arise.
Firstly, structuring tools in the form of typed interfaces within a
single address space are used to reduce the complexity of the system
from the programmer’s viewpoint and enable rich sharing of text and
data between applications.
Secondly, a scheduler is presented which delivers useful Quality of
Service guarantees to applications in a highly efficient manner.
Integrated with the scheduler is an inter-domain communication
system which has minimal impact on resource guarantees, and a method
of decoupling hardware interrupts from the execution of device
drivers.
Finally, a framework for high-level inter-domain and inter-machine
communication is described, which goes beyond object-based RPC
systems to permit both Quality of Service negotiation when a
communication binding is established, and services to be implemented
straddling protection domain boundaries as well as locally and in
remote processes.
Mechanising set theory: cardinal arithmetic and the axiom of
choicePaulson, LarryGrabczewski, KrzysztofUniversity of Cambridge, Computer Laboratory1995-07enTextUCAM-CL-TR-377ISSN 1476-2986
Fairly deep results of Zermelo-Fraenkel (ZF) set theory have been
mechanised using the proof assistant Isabelle. The results concern
cardinal arithmetic and the Axiom of Choice (AC). A key result about
cardinal multiplication is K*K=K, where K is any infinite cardinal.
Proving this result required developing theories of orders,
order-isomorphisms, order types, ordinal arithmetic, cardinals,
etc.; this covers most of Kunen, Set Theory, Chapter I. Furthermore,
we have proved the equivalence of 7 formulations of the
Well-ordering Theorem and 20 formulations of AC; this covers the
first two chapters of Rubin and Rubin, Equivalents of the Axiom of
Choice. The definitions used in the proofs are largely faithful in
style to the original mathematics.
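(For orientation, the headline result and a standard ZF corollary — stated here in conventional notation, not quoted from the paper — are:

\[
  \kappa \otimes \kappa = \kappa,
  \qquad\text{hence}\qquad
  \kappa \oplus \lambda = \kappa \otimes \lambda = \max(\kappa, \lambda)
\]

for infinite cardinals \(\kappa\) and \(0 < \lambda \le \kappa\).)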
Performance evaluation of HARP: a hierarchical asynchronous
replication protocol for large scale systemsAdly, NohaUniversity of Cambridge, Computer Laboratory1995-08enTextUCAM-CL-TR-378ISSN 1476-2986Proceedings of the First Isabelle Users WorkshopPaulson, LawrenceUniversity of Cambridge, Computer Laboratory1995-09enTextUCAM-CL-TR-379ISSN 1476-2986Quality-of-Service issues in networking
environmentsStiller, BurkhardUniversity of Cambridge, Computer Laboratory1995-09enTextUCAM-CL-TR-380ISSN 1476-2986
Quality-of-Service (QoS) issues in networking environments cover
various separate areas and topics. They include at least the
specification of applications requirements, the definition of
network services, QoS models, resource reservation methods,
negotiation and transformation methods for QoS, and operating system
support for guaranteed services. An embracing approach for handling,
dealing with, and supporting QoS in different scenarios and
technical set-ups is required to manage forthcoming communication
and networking tasks adequately. Modern telecommunication systems
require an integrated architecture for applications, communication
subsystems, and network perspectives to overcome drawbacks of
traditional communication architectures, such as redundant protocol
functionality, weakly designed interfaces between the end-system and
a network adapter, or the inability to specify and guarantee QoS
parameters.
This work contains the discussion of a number of interconnected QoS
issues, e.g., QoS mapping, QoS negotiation, QoS-based configuration
of communication protocols, or QoS aspects in Asynchronous Transfer
Mode (ATM) signaling protocols, which have been dealt with during a
one-year research fellowship. This report is not intended to be a
complete description of every technical detail, but tries to provide
a brief overall picture of the emerging and explosively developing
QoS issues in telecommunication systems. Additionally, some of these
issues are investigated in closer detail, focussing mainly on QoS
mapping, negotiation, and updating in the communication protocol
area.
Rendering for free form deformationsNimscheck, Uwe MichaelUniversity of Cambridge, Computer Laboratory1995-10enTextUCAM-CL-TR-381ISSN 1476-2986Synthetic image generation for a multiple-view autostereo
displayCastle, Oliver M.University of Cambridge, Computer Laboratory1995-10enTextUCAM-CL-TR-382ISSN 1476-2986Management of replicated data in large scale
systemsAdly, NohaUniversity of Cambridge, Computer Laboratory1995-11enTextUCAM-CL-TR-383ISSN 1476-2986Securing ATM networksChuang, Shaw-ChengUniversity of Cambridge, Computer Laboratory1995-01enTextUCAM-CL-TR-384ISSN 1476-2986
This is an interim report on the investigations into securing
Asynchronous Transfer Mode (ATM) networks. We look at the challenges
in providing such a secure ATM network and identify the important
issues in achieving this goal. In this paper, we discuss the issues
and problems involved and outline some techniques for solving them.
The network environment is first examined, and we also consider the
correct placement of security mechanisms in such an environment.
Following the analysis of the security requirements, we
introduce and describe a key agile cryptographic device for ATM. The
protection of the ATM data plane is extremely important to provide
data confidentiality and data integrity. Techniques for providing
synchronisation, dynamic key change, dynamic initialisation vector
change and Message Authentication Codes on ATM data are also
considered. Next, we discuss the corresponding control functions. A
few key exchange protocols are given as possible candidates for the
establishment of the session key. The impact of such key exchange
protocols on the design of an ATM signalling protocol has also been
examined, and security extensions to an existing signalling protocol
are discussed. We also consider securing other control plane
functions such as NNI routing, Inter-Domain Policy Routing,
authorisation and auditing, firewalls and intrusion detection, and
Byzantine robustness. Management plane functions are also examined,
with discussions of bootstrapping, authenticated neighbour
discovery, ILMI security, PVC security, VPI security and the ATM
Forum management model.
Performance evaluation of the Delphi machineSaraswat, SanjayUniversity of Cambridge, Computer Laboratory1995-12enTextUCAM-CL-TR-385ISSN 1476-2986Bisimilarity for a first-order calculus of objects with
subtypingGordon, Andrew D.Rees, Gareth D.University of Cambridge, Computer Laboratory1996-01enTextUCAM-CL-TR-386ISSN 1476-2986Monitoring composite events in distributed
systemsSchwiderski, ScarletHerbert, AndrewMoody, KenUniversity of Cambridge, Computer Laboratory1996-02enTextUCAM-CL-TR-387ISSN 1476-2986A unified approach to strictness analysis and optimising
transformationsBenton, P.N.University of Cambridge, Computer Laboratory1996-02enTextUCAM-CL-TR-388ISSN 1476-2986A proof checker for HOLWong, WaiUniversity of Cambridge, Computer Laboratory1996-03enTextUCAM-CL-TR-389ISSN 1476-2986Syn: a single language for specifying abstract syntax
trees, lexical analysis, parsing and pretty-printingBoulton, Richard J.University of Cambridge, Computer Laboratory1996-03enTextUCAM-CL-TR-390ISSN 1476-2986
A language called Syn is described in which all aspects of
context-free syntax can be specified without redundancy. The
language is essentially an extended BNF grammar. Unusual features
include high-level constructs for specifying lexical aspects of a
language and specification of precedence by textual order. A system
has been implemented for generating lexers, parsers, pretty-printers
and abstract syntax tree representations from a Syn specification.
Programming languages and dimensionsKennedy, Andrew JohnUniversity of Cambridge, Computer Laboratory1996-04enTextUCAM-CL-TR-391ISSN 1476-2986
Scientists and engineers must ensure that the equations and formulae
which they use are dimensionally consistent, but existing
programming languages treat all numeric values as dimensionless.
This thesis investigates the extension of programming languages to
support the notion of physical dimension.
A type system is presented similar to that of the programming
language ML but extended with polymorphic dimension types. An
algorithm which infers most general dimension types automatically is
then described and proved correct.
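To give a concrete feel for what dimension checking rules out, here is a minimal dynamic sketch in Python; the representation and names are invented for illustration, and the thesis itself develops a static, polymorphic ML-style type system rather than a runtime check.

```python
# Dimensions as maps from base dimensions to integer exponents,
# e.g. velocity = m^1 s^-1. Multiplication adds exponents; addition
# demands equal dimensions, which is the consistency property at stake.
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    dim: tuple  # sorted tuple of (base dimension, exponent) pairs

    def __mul__(self, other):
        d = dict(self.dim)
        for base, exp in other.dim:
            d[base] = d.get(base, 0) + exp
        return Quantity(self.value * other.value,
                        tuple(sorted((b, e) for b, e in d.items() if e)))

    def __add__(self, other):
        if self.dim != other.dim:
            raise TypeError(f"dimension mismatch: {self.dim} vs {other.dim}")
        return Quantity(self.value + other.value, self.dim)

metre  = Quantity(1.0, (("m", 1),))
second = Quantity(1.0, (("s", 1),))
speed  = Quantity(3.0, (("m", 1), ("s", -1)))   # 3 m/s
print((speed * second).dim)   # (('m', 1),): multiplying restores metres
# speed + metre               # would raise TypeError: dimension mismatch
```

The static system described in the thesis catches the commented-out error at compile time, with dimension variables playing the role that type variables play in ML.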
The semantics of the language is given by a translation into an
explicitly-typed language in which dimensions are passed as arguments
to functions. The operational semantics of this language is
specified in the usual way by an evaluation relation defined by a
set of rules. This is used to show that if a program is well-typed
then no dimension errors can occur during its evaluation.
More abstract properties of the language are investigated using a
denotational semantics: these include a notion of invariance under
changes in the units of measure used, analogous to parametricity in
the polymorphic lambda calculus. Finally the dissertation is
summarised and many possible directions for future research in
dimension types and related type systems are described.
Decoding choice encodingsNestmann, UwePierce, Benjamin C.University of Cambridge, Computer Laboratory1996-04enTextUCAM-CL-TR-392ISSN 1476-2986Performance management in ATM networksCrosby, Simon AndrewUniversity of Cambridge, Computer Laboratory1996-04enTextUCAM-CL-TR-393ISSN 1476-2986
The Asynchronous Transfer Mode (ATM) has been identified as the
technology of choice amongst high speed communication networks for
its potential to integrate services with disparate resource needs
and timing constraints. Before it can successfully deliver
integrated services, however, significant problems remain to be
solved. They centre around two major issues. First, there is a need
for a simple, powerful network service interface capable of meeting
the communications needs of new applications. Second, within the
network there is a need to dynamically control a mix of diverse
traffic types to ensure that they meet their performance criteria.
Addressing the first concern, this dissertation argues that a simple
network control interface offers significant advantages over the
traditional, heavyweight approach of the telecommunications
industry. A network control architecture based on a distributed
systems approach is presented which locates both the network control
functions and its services outside the network. The network service
interface uses the Remote Procedure Call (RPC) paradigm and enables
more complicated service offerings to be built from the basic
primitives. A formal specification and verification of the
user-network signalling protocol is presented. Implementations of
the architecture, both on Unix and the Wanda micro-kernel, used on
the Fairisle ATM switch, are described. The implementations
demonstrate the feasibility of the architecture, and feature a high
degree of experimental flexibility. This is exploited in the balance
of the dissertation, which presents the results of a practical study
of network performance under a range of dynamic control mechanisms.
Addressing the second concern, results are presented from a study of
the cell delay variation suffered by ATM connections when
multiplexed with real ATM traffic in an uncontrolled network, and
from an investigation of the expansion of bursts of ATM traffic as a
result of multiplexing. The results are compared with those of
analytical models. Finally, results from a study of the performance
delivered to delay sensitive traffic by priority and rate based cell
scheduling algorithms, and the loss experienced by different types
of traffic under several buffer allocation strategies are presented.
A simple formalization and proof for the mutilated chess
boardPaulson, Lawrence C.University of Cambridge, Computer Laboratory1996-04enTextUCAM-CL-TR-394ISSN 1476-2986
The impossibility of tiling the mutilated chess board has been
formalized and verified using Isabelle. The formalization is concise
because it is expressed using inductive definitions. The proofs are
straightforward except for some lemmas concerning finite
cardinalities. This exercise is an object lesson in choosing a good
formalization that is applicable in a variety of domains.
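The mathematical content being formalised is the classical colouring argument: removing two opposite corners removes two squares of the same colour, so 31 dominoes would have to cover 31 squares of each colour, but the mutilated board has 32 of one colour and only 30 of the other:

\[
(31,\, 31) \neq (32,\, 30),
\]

and hence no tiling of the 62 remaining squares exists.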
Cut-elimination for full intuitionistic linear
logicBräuner, Torbende Paiva, ValeriaUniversity of Cambridge, Computer Laboratory1996-05enTextUCAM-CL-TR-395ISSN 1476-2986Generic automatic proof toolsPaulson, Lawrence C.University of Cambridge, Computer Laboratory1996-05enTextUCAM-CL-TR-396ISSN 1476-2986
This paper explores a synthesis between two distinct traditions in
automated reasoning: resolution and interaction. In particular it
discusses Isabelle, an interactive theorem prover based upon a form
of resolution. It aims to demonstrate the value of proof tools that,
compared with traditional resolution systems, seem absurdly limited.
Isabelle’s classical reasoner searches for proofs using a tableau
approach. The reasoner is generic: it accepts rules proved in
applied theories, involving defined connectives. New constants are
not reduced to first-order logic; the reasoner
Optimal routing in 2-jump circulant networksRobič, BorutUniversity of Cambridge, Computer Laboratory1996-06enTextUCAM-CL-TR-397ISSN 1476-2986
An algorithm for routing a message along the shortest path between a
pair of processors in a 2-jump circulant (undirected double fixed
step) network is given. The algorithm requires O(d) time for
preprocessing, and l = O(d) routing steps, where l is the distance
between the processors and d is the diameter of the network.
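For comparison, a brute-force breadth-first router over the same topology, not the report's constant-time method, can be sketched in Python; the function name and interface are invented here.

```python
# Shortest routing in the 2-jump circulant network C(n; 1, s), where
# each node i is linked to i+1, i-1, i+s and i-s (mod n).
from collections import deque

def route(n: int, s: int, src: int, dst: int) -> list[int]:
    """Return a shortest path of node labels from src to dst (BFS)."""
    parent = {src: None}
    frontier = deque([src])
    while frontier:
        u = frontier.popleft()
        if u == dst:
            break
        for step in (1, -1, s, -s):
            v = (u + step) % n
            if v not in parent:
                parent[v] = u
                frontier.append(v)
    path = [dst]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

print(route(16, 5, 0, 9))   # one of the length-3 shortest paths
```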
Design and implementation of an autostereoscopic camera
systemDodgson, N.A.Moore, J.R.University of Cambridge, Computer Laboratory1996-06enTextUCAM-CL-TR-398ISSN 1476-2986
An autostereoscopic display provides the viewer with a
three-dimensional image without the need for special glasses, and
allows the user to look around objects in the image by moving the
head left-right. The time-multiplexed autostereo display developed
at the University of Cambridge has been in operation since late
1991.
An autostereoscopic camera system has been designed and implemented.
It is capable of taking video input from up to sixteen cameras, and
multiplexing these into a video output stream with a pixel rate an
order of magnitude faster than the individual input streams. Testing
of the system with eight cameras and a Cambridge Autostereo Display
has produced excellent live autostereoscopic video.
This report describes the design of this camera system which has
been successfully implemented and demonstrated. Problems which arose
during this process are discussed, and a comparison with similar
systems made.
OASIS : An open architecture for secure interworking
servicesHayton, RichardUniversity of Cambridge, Computer Laboratory1996-06enTextUCAM-CL-TR-399ISSN 1476-2986
An emerging requirement is for applications and distributed services
to cooperate or inter-operate. Mechanisms have been devised to hide
the heterogeneity of the host operating systems and abstract the
issues of distribution and object location. However, in order for
systems to inter-operate securely there must also be mechanisms to
hide differences in security policy, or at least negotiate between
them.
This would suggest that a uniform model of access control is
required. Such a model must be extremely flexible with respect to
the specification of policy, as different applications have
radically different needs. In a widely distributed environment this
situation is exacerbated by the differing requirements of different
organisations, and in an open environment there is a need to
interwork with organisations using alternative security mechanisms.
Other proposals for the interworking of security mechanisms have
concentrated on the enforcement of access policy, and neglected the
concerns of freedom of expression of this policy. For example it is
common to associate each request with a user identity, and to use
this as the only parameter when performing access control. This work
describes an architectural approach to security. By reconsidering
the role of the client and the server, we may reformulate access
control issues in terms of client naming.
We think of a client as obtaining a name issued by a service; either
based on credentials already held by the client, or by delegation
from another client. A grammar has been devised that allows the
conditions under which a client may assume a name to be specified,
and the conditions under which use of the name will be revoked. This
allows complex security policies to be specified that define how
clients of a service may interact with each other (through election,
delegation and revocation), how clients interact with a service (by
invoking operations or receiving events) and how clients and
services may inter-operate. (For example, a client of a Login
service may become a client of a file service.)
This approach allows great flexibility when integrating a number of
services, and reduces the mismatch of policies common in
heterogeneous systems. A flexible security definition is meaningless
if not backed by a robust and efficient implementation. In this
thesis we present a systems architecture that can be implemented
efficiently, but that allows individual services to ‘fine tune’ the
trade-offs between security, efficiency and freedom of policy
expression. The architecture is inherently distributed and scalable,
and includes mechanisms for rapid and selective revocation of
privileges which may cascade between services and organisations.
Monitoring the behaviour of distributed systemsSchwiderski, ScarletUniversity of Cambridge, Computer Laboratory1996-07enTextUCAM-CL-TR-400ISSN 1476-2986
Monitoring the behaviour of computing systems is an important task.
In active database systems, a detected system behaviour leads to the
triggering of an ECA (event-condition-action) rule. ECA rules are
employed for supporting database management system functions as well
as external applications. Although distributed database systems are
becoming more commonplace, active database research has to date
focussed on centralised systems. In distributed debugging systems, a
detected system behaviour is compared with the expected system
behaviour. Differences illustrate erroneous behaviour. In both
application areas, system behaviours are specified in terms of
events: primitive events represent elementary occurrences and
composite events represent complex occurrence patterns. At system
runtime, specified primitive and composite events are monitored and
event occurrences are detected. However, in active database systems
events are monitored in terms of physical time and in distributed
debugging systems events are monitored in terms of logical time. The
notion of physical time is difficult in distributed systems because
of their special characteristics: no global time, network delays,
etc.
This dissertation is concerned with monitoring the behaviour of
distributed systems in terms of physical time, i.e. the syntax, the
semantics, the detection, and the implementation of events are
considered.
The syntax of primitive and composite events is derived from the
work of both active database systems and distributed debugging
systems; differences and necessities are highlighted.
The semantics of primitive and composite events establishes when and
where an event occurs; the semantics depends largely on the notion
of physical time in distributed systems. Based on the model for an
approximated global time base, the ordering of events in distributed
systems is considered, and the structure and handling of timestamps
are illustrated. In specific applications, a simplified version of
the semantics can be applied which is easier and therefore more
efficient to implement.
Algorithms for the detection of composite events at system runtime
are developed; event detectors are distributed to arbitrary sites
and composite events are evaluated concurrently. Two different
evaluation policies are examined: asynchronous evaluation and
synchronous evaluation. Asynchronous evaluation is characterised by
the ad hoc consumption of signalled event occurrences. However,
since the signalling of events involves variable delays, the events
may not be evaluated in the system-wide order of their occurrence.
On the other hand, synchronous evaluation enforces events to be
evaluated in the system-wide order of their occurrence. But, due to
site failures and network congestion, the evaluation may block on a
fairly long-term basis.
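The trade-off can be sketched schematically in Python; the API below is invented for illustration and uses scalar timestamps where the dissertation uses interval timestamps from an approximated global time base.

```python
# Contrast of the two evaluation policies for a composite event "A;B"
# (A followed by B), with timestamped occurrences arriving from several
# sites, possibly out of timestamp order.

def detect_sequence(occurrences):
    """Yield (t_A, t_B) whenever a B occurrence follows a pending A."""
    pending_a = None
    for name, ts in occurrences:
        if name == "A":
            pending_a = ts
        elif name == "B" and pending_a is not None and pending_a < ts:
            yield (pending_a, ts)
            pending_a = None

def asynchronous(arrivals):
    # Ad hoc consumption in arrival order: fast, but a delayed A can be
    # seen after the B that follows it, and the pair is then missed.
    return list(detect_sequence(evt for evt, _site in arrivals))

def synchronous(arrivals, watermarks):
    # Consume only occurrences at or below the minimum watermark over
    # all sites, sorted by timestamp: system-wide order is respected,
    # but a slow or failed site holds everything back.
    safe = min(watermarks.values())
    ordered = sorted((evt for evt, _site in arrivals), key=lambda e: e[1])
    return list(detect_sequence(e for e in ordered if e[1] <= safe))

arrivals = [(("B", 5), "site2"), (("A", 3), "site1")]   # B overtakes A
print(asynchronous(arrivals))                           # [] : pair missed
print(synchronous(arrivals, {"site1": 9, "site2": 9}))  # [(3, 5)]
```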
The prototype implementation realises the algorithms for the
detection of composite events with both asynchronous and synchronous
evaluation. For the purpose of testing, primitive event occurrences
are simulated by distributed event simulators. Several tests are
performed illustrating the differences between asynchronous and
synchronous evaluation: the former is ‘fast and unreliable’ whereas
the latter is ‘slow and reliable’.
A classical linear λ-calculusBierman, GavinUniversity of Cambridge, Computer Laboratory1996-07enTextUCAM-CL-TR-401ISSN 1476-2986Video mail retrieval using voice: report on collection of
naturalistic requests and relevance assessmentsJones, G.J.F.Foote, J.T.Spärck Jones, K.Young, S.J.University of Cambridge, Computer Laboratory1996-09enTextUCAM-CL-TR-402ISSN 1476-2986Devices in a multi-service operating systemBarham, Paul RonaldUniversity of Cambridge, Computer Laboratory1996-10enTextUCAM-CL-TR-403ISSN 1476-2986
Increases in processor speed and network and device bandwidth have
led to general purpose workstations being called upon to process
continuous media data in real time. Conventional operating systems
are unable to cope with the high loads and strict timing constraints
introduced when such applications form part of a multi-tasking
workload. There is a need for the operating system to provide
fine-grained reservation of processor, memory and I/O resources and
the ability to redistribute these resources dynamically. A small
group of operating systems researchers have recently proposed a
“vertically-structured” architecture where the operating system
kernel provides minimal functionality and the majority of operating
system code executes within the application itself. This structure
greatly simplifies the task of accounting for processor usage by
applications. The prototype Nemesis operating system embodies these
principles and is used as the platform for this work.
This dissertation extends the provision of Quality of Service
guarantees to the I/O system by presenting an architecture for
device drivers which minimises crosstalk between applications. This
is achieved by clearly separating the data-path operations, which
require careful accounting and scheduling, and the infrequent
control-path operations, which require protection and concurrency
control. The approach taken is to abstract and multiplex the I/O
data-path at the lowest level possible so as to simplify accounting,
policing and scheduling of I/O resources and enable
application-specific use of I/O devices.
The architecture is applied to several representative classes of
device including network interfaces, network connected peripherals,
disk drives and framestores. Of these, disks and framestores are of
particular interest since they must be shared at a very fine
granularity but have traditionally been presented to the application
via a window system or file-system with a high-level and
coarse-grained interface.
A device driver for the framestore is presented which abstracts the
device at a low level and is therefore able to provide each client
with guaranteed bandwidth to the framebuffer. The design and
implementation of a novel client-rendering window system is then
presented which uses this driver to enable rendering code to be
safely migrated into a shared library within the client.
A low-level abstraction of a standard disk drive is also described
which efficiently supports a wide variety of file systems and other
applications requiring persistent storage, whilst providing
guaranteed rates of I/O to individual clients. An extent-based file
system is presented which can provide guaranteed rate file access
and enables clients to optimise for application-specific access
patterns.
Adaptive parallelism for computing on heterogeneous
clustersShum, Kam HongUniversity of Cambridge, Computer Laboratory1996-11enTextUCAM-CL-TR-404ISSN 1476-2986A tool to support formal reasoning about computer
languagesBoulton, Richard J.University of Cambridge, Computer Laboratory1996-11enTextUCAM-CL-TR-405ISSN 1476-2986
A tool to support formal reasoning about computer languages and
specific language texts is described. The intention is to provide a
tool that can build a formal reasoning system in a mechanical
theorem prover from two specifications, one for the syntax of the
language and one for the semantics. A parser, pretty-printer and
internal representations are generated from the former. Logical
representations of syntax and semantics, and associated theorem
proving tools, are generated from the combination of the two
specifications. The main aim is to eliminate tedious work from the
task of prototyping a reasoning tool for a computer language, but
the abstract specifications of the language also assist the
automation of proof.
Tool support for logics of programsPaulson, Lawrence C.University of Cambridge, Computer Laboratory1996-11enTextUCAM-CL-TR-406ISSN 1476-2986
Proof tools must be well designed if they are to be more effective
than pen and paper. Isabelle supports a range of formalisms, two of
which are described (higher-order logic and set theory). Isabelle’s
representation of logic is influenced by logic programming: its
“logical variables” can be used to implement step-wise refinement.
Its automatic proof procedures are based on search primitives that
are directly available to users. While emphasizing basic concepts,
the article also discusses applications such as an approach to the
analysis of security protocols.
The L4 microkernel on Alpha : Design and
implementationSchoenberg, SebastianUniversity of Cambridge, Computer Laboratory1996-09enTextUCAM-CL-TR-407ISSN 1476-2986
The purpose of a microkernel is to cover the lowest level of the
hardware and to provide a more general platform to operating systems
and applications than the hardware itself. This has made microkernel
development increasingly interesting. Different types of
microkernels have been developed, ranging from kernels which merely
deal with the hardware interface (Windows NT HAL), kernels
especially for embedded systems (RTEMS), to kernels for multimedia
streams and real time support (Nemesis) and general purpose kernels
(L4, Mach).
The common opinion that microkernels lead to deterioration in system
performance has been disproved by recent research. L4 is an example
of a fast and small, multi address space, message-based microkernel,
developed originally for Intel systems only. Based on the L4
interface, which should be as similar as possible on different
platforms, the L4 Alpha version has been developed.
This work describes design decisions, implementation and interfaces
of the L4 version for 64-bit Alpha processors.
Theorem proving with the real numbersHarrison, John RobertUniversity of Cambridge, Computer Laboratory1996-11enTextUCAM-CL-TR-408ISSN 1476-2986
This thesis discusses the use of the real numbers in theorem
proving. Typically, theorem provers only support a few ‘discrete’
datatypes such as the natural numbers. However the availability of
the real numbers opens up many interesting and important application
areas, such as the verification of floating point hardware and
hybrid systems. It also allows the formalization of many more
branches of classical mathematics, which is particularly relevant
for attempts to inject more rigour into computer algebra systems.
Our work is conducted in a version of the HOL theorem prover. We
describe the rigorous definitional construction of the real numbers,
using a new version of Cantor’s method, and the formalization of a
significant portion of real analysis. We also describe an advanced
derived decision procedure for the ‘Tarski subset’ of real algebra
as well as some more modest but practically useful tools for
automating explicit calculations and routine linear arithmetic
reasoning.
Finally, we consider in more detail two interesting application
areas. We discuss the desirability of combining the rigour of
theorem provers with the power and convenience of computer algebra
systems, and explain a method we have used in practice to achieve
this. We then move on to the verification of floating point
hardware. After a careful discussion of possible correctness
specifications, we report on two case studies, one involving a
transcendental function.
We aim to show that a theory of real numbers is useful in practice
and interesting in theory, and that the ‘LCF style’ of theorem
proving is well suited to the kind of work we describe. We hope also
to convince the reader that the kind of mathematics needed for
applications is well within the abilities of current theorem proving
technology.
Proving properties of security protocols by
inductionPaulson, Lawrence C.University of Cambridge, Computer Laboratory1996-12enTextUCAM-CL-TR-409ISSN 1476-2986
Security protocols are formally specified in terms of traces, which
may involve many interleaved protocol runs. Traces are defined
inductively. Protocol descriptions model accidental key losses as
well as attacks. The model spy can send spoof messages made up of
components decrypted from previous traffic.
Correctness properties are verified using the proof tool
Isabelle/HOL. Several symmetric-key protocols have been studied,
including Needham-Schroeder, Yahalom and Otway-Rees. A new attack
has been discovered in a variant of Otway-Rees (already broken by
Mao and Boyd). Assertions concerning secrecy and authenticity have
been proved.
The approach rests on a common theory of messages, with three
operators. The operator “parts” denotes the components of a set of
messages. The operator “analz” denotes those parts that can be
decrypted with known keys. The operator “synth” denotes those
messages that can be expressed in terms of given components. The
three operators enjoy many algebraic laws that are invaluable in
proofs.
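For illustration, laws of this kind, paraphrased here from published accounts of the inductive method rather than quoted from this report, include idempotence and the interaction of analysis with synthesis:

\begin{align*}
\mathrm{parts}(\mathrm{parts}\,H) &= \mathrm{parts}\,H &
\mathrm{analz}(\mathrm{analz}\,H) &= \mathrm{analz}\,H \\
\mathrm{parts}(\mathrm{analz}\,H) &= \mathrm{parts}\,H &
\mathrm{analz}(\mathrm{synth}\,H) &= \mathrm{analz}\,H \cup \mathrm{synth}\,H
\end{align*}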
Proof styleHarrison, JohnUniversity of Cambridge, Computer Laboratory1997-01enTextUCAM-CL-TR-410ISSN 1476-2986
We are concerned with how to communicate a mathematical proof to a
computer theorem prover. This can be done in many ways, while
allowing the machine to generate a completely formal proof object.
The most obvious dimension is the amount of guidance required from
the user or, from the machine’s perspective, the degree of
automation provided. But another important consideration, which we
consider
particularly significant, is the bias towards a ‘procedural’ or
‘declarative’ proof style. We will explore this choice in depth, and
discuss the strengths and weaknesses of declarative and procedural
styles for proofs in pure mathematics and for verification
applications. We conclude with a brief summary of our own
experiments in trying to combine both approaches.
Formalising process calculi in Higher Order LogicNesi, MonicaUniversity of Cambridge, Computer Laboratory1997-01enTextUCAM-CL-TR-411ISSN 1476-2986Observations on a linear PCF (preliminary report)Bierman, G.M.University of Cambridge, Computer Laboratory1997-01enTextUCAM-CL-TR-412ISSN 1476-2986Mechanized proofs of security protocols: Needham-Schroeder
with public keysPaulson, Lawrence C.University of Cambridge, Computer Laboratory1997-01enTextUCAM-CL-TR-413ISSN 1476-2986
The inductive approach to verifying security protocols, previously
applied to shared-key encryption, is here applied to the public key
version of the Needham-Schroeder protocol. As before, mechanized
proofs are performed using Isabelle/HOL. Both the original, flawed
version and Lowe’s improved version are studied; the properties
proved highlight the distinctions between the two versions. The
results are compared with previous analyses of the same protocol.
The analysis reported below required only 30 hours of the author’s
time. The proof scripts execute in under three minutes.
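For reference, the three-message core of the public-key protocol, with Lowe's well-known correction to message 2 shown in brackets, is:

\begin{align*}
1.\quad & A \to B:\ \{N_A, A\}_{K_B}\\
2.\quad & B \to A:\ \{N_A, N_B\}_{K_A} \qquad [\text{Lowe: } \{N_A, N_B, B\}_{K_A}]\\
3.\quad & A \to B:\ \{N_B\}_{K_B}
\end{align*}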
A calculus for cryptographic protocols : The SPI
calculusAbadi, MartínGordon, Andrew D.University of Cambridge, Computer Laboratory1997-01enTextUCAM-CL-TR-414ISSN 1476-2986
We introduce the spi calculus, an extension of the pi calculus
designed for the description and analysis of cryptographic
protocols. We show how to use the spi calculus, particularly for
studying authentication protocols. The pi calculus (without
extension) suffices for some abstract protocols; the spi calculus
enables us to consider cryptographic issues in more detail. We
represent protocols as processes in the spi calculus and state their
security properties in terms of coarse-grained notions of protocol
equivalence.
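The flavour of the calculus can be conveyed by a one-message protocol in roughly the authors' notation (reproduced from memory, so the details are approximate): A sends M encrypted under a key K shared with B, and the key's scope is captured by restriction:

\begin{align*}
A(M) &\triangleq \overline{c}\langle \{M\}_K \rangle &
B &\triangleq c(x).\ \mathsf{case}\ x\ \mathsf{of}\ \{y\}_K\ \mathsf{in}\ F(y)\\
\mathit{Inst}(M) &\triangleq (\nu K)\,(A(M) \mid B)
\end{align*}

Secrecy of M can then be stated as a coarse-grained equivalence: Inst(M) and Inst(M′) are indistinguishable to any environment, provided F itself does not leak its argument.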
Application support for mobile computingPope, Steven LeslieUniversity of Cambridge, Computer Laboratory1997-02enTextUCAM-CL-TR-415ISSN 1476-2986DECLARE: a prototype declarative proof system for higher
order logicSyme, DonaldUniversity of Cambridge, Computer Laboratory1997-02enTextUCAM-CL-TR-416ISSN 1476-2986Selective mesh refinement for interactive terrain
renderingBrown, Peter J.C.University of Cambridge, Computer Laboratory1997-02enTextUCAM-CL-TR-417ISSN 1476-2986
Terrain surfaces are often approximated by geometric meshes to
permit efficient rendering. This paper describes how the complexity
of an approximating irregular mesh can be varied across its domain
in order to minimise the number of displayed facets while ensuring
that the rendered surface meets pre-determined resolution
requirements. We first present a generalised scheme to represent a
mesh over a continuous range of resolutions using the output from
conventional single-resolution approximation methods. We then
describe an algorithm which extracts a surface from this
representation such that the resolution of the surface is enhanced
only in specific areas of interest. We prove that the extracted
surface is complete, minimal, satisfies the given resolution
constraints and meets the Delaunay triangulation criterion if
possible. In addition, we present a method of performing smooth
visual transitions between selectively-refined meshes to permit
efficient animation of a terrain scene.
An HTML version of this report is available at
http://www.cl.cam.ac.uk/research/rainbow/publications/pjcb/tr417/
Mechanized proofs for a recursive authentication
protocolPaulson, Lawrence C.University of Cambridge, Computer Laboratory1997-03enTextUCAM-CL-TR-418ISSN 1476-2986
A novel protocol has been formally analyzed using the prover
Isabelle/HOL, following the inductive approach described in earlier
work. There is no limit on the length of a run, the nesting of
messages or the number of agents involved. A single run of the
protocol delivers session keys for all the agents, allowing
neighbours to perform mutual authentication. The basic security
theorem states that session keys are correctly delivered to adjacent
pairs of honest agents, regardless of whether other agents in the
chain are compromised. The protocol’s complexity caused some
difficulties in the specification and proofs, but its symmetry
reduced the number of theorems to prove.
Video-augmented environmentsStafford-Fraser, James QuentinUniversity of Cambridge, Computer Laboratory1997-04enTextUCAM-CL-TR-419ISSN 1476-2986
In the future, the computer will be thought of more as an assistant
than as a tool, and users will increasingly expect machines to make
decisions on their behalf. As with a human assistant, a machine’s
ability to make informed choices will often depend on the extent of
its knowledge of activities in the world around it. Equipping
personal computers with a large number of sensors for monitoring
their environment is, however, expensive and inconvenient, and a
preferable solution would involve a small number of input devices
with a broad scope of application. Video cameras are ideally suited
to many real-world monitoring applications for this reason. In
addition, recent reductions in the manufacturing costs of simple
cameras will soon make their widespread deployment in the home and
office economically viable. The use of video as an input device also
allows the creation of new types of user-interface, more suitable in
some circumstances than those afforded by the conventional keyboard
and mouse.
This thesis examines some examples of these ‘Video-Augmented
Environments’ and related work, and then describes two applications
in detail. The first, a ‘software cameraman’, uses the analysis of
one video stream to control the display of another. The second,
‘BrightBoard’, allows a user to control a computer by making marks
on a conventional whiteboard, thus ‘augmenting’ the board with many
of the facilities common to electronic documents, including the
ability to fax, save, print and email the image of the board. The
techniques which were found to be useful in the construction of
these applications are common to many systems which monitor
real-world video, and so they were combined in a toolkit called
‘Vicar’. This provides an architecture for ‘video plumbing’, which
allows standard video-processing components to be connected together
under the control of a scripting language. It is a single
application which can be programmed to create a variety of simple
Video-Augmented Environments, such as those described above, without
the need for any recompilation, and so should simplify the
construction of such applications in the future. Finally,
opportunities for further exploration on this theme are discussed.
Managing complex models for computer graphicsSewell, Jonathan MarkUniversity of Cambridge, Computer Laboratory1997-04enTextUCAM-CL-TR-420ISSN 1476-2986
Three-dimensional computer graphics is becoming more common as
increasing computational power becomes more readily available.
Although the images that can be produced are becoming more complex,
users’ expectations continue to grow. This dissertation examines the
changes in computer graphics software that will be needed to support
continuing growth in complexity, and proposes techniques for
tackling the problems that emerge.
Increasingly complex models will involve longer rendering times,
higher memory requirements, longer data transfer periods and larger
storage capacities. Furthermore, even greater demands will be placed
on the constructors of such models. This dissertation aims to
describe how to construct scalable systems which can be used to
visualise models of any size without requiring dedicated hardware.
This is achieved by controlling the quality of the results, and
hence the costs incurred. In addition, the use of quality controls
can become a tool to help users handle the large volume of
information arising from complex models.
The underlying approach is to separate the model from the graphics
application which uses it, so that the model exists independently.
By doing this, an application is free to access only the data which
is required at any given time. For the application to function in
this manner, the data must be in an appropriate form. To achieve
this, approximation hierarchies are defined as a suitable new model
structure. These utilise multiple representations of both objects
and groups of objects at all levels in the model.
In order to support such a structure, a novel method is proposed for
rapidly constructing simplified representations of groups of complex
objects. By calculating a few geometrical attributes, it is possible
to generate replacement objects that preserve important aspects of
the originals. Such objects, once placed into an approximation
hierarchy, allow rapid loading and rendering of large portions of a
model. Extensions to rendering algorithms are described that take
advantage of this structure.
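A minimal sketch of how such a hierarchy might be traversed at render time follows; the node structure and the screen-size test are invented here, and the dissertation's actual attributes and quality controls are richer.

```python
# Each node carries a cheap replacement object standing in for its whole
# subtree; a node is expanded only while its projected size exceeds a
# quality tolerance, so distant groups render as single simple objects.
from dataclasses import dataclass, field

@dataclass
class Node:
    replacement: str            # simplified stand-in for the subtree
    size: float                 # world-space extent of the subtree
    children: list = field(default_factory=list)

def render(node, distance, tolerance, out):
    projected = node.size / max(distance, 1e-9)   # crude screen-size proxy
    if projected < tolerance or not node.children:
        out.append(node.replacement)              # draw the cheap version
    else:
        for child in node.children:               # refine where it matters
            render(child, distance, tolerance, out)

scene = Node("city block", 100.0,
             [Node("house 1", 10.0), Node("house 2", 10.0)])
drawn = []
render(scene, distance=2000.0, tolerance=0.1, out=drawn)
print(drawn)   # ['city block'] from afar; houses appear as we approach
```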
The use of multiple representations encompasses not only different
quality levels, but also different storage formats and types of
objects. It provides a framework within which such aspects are
hidden from the user, facilitating the sharing and re-use of
objects. A model manager is proposed as a means of encapsulating
these mechanisms. This software gives, as far as possible, the
illusion of direct access to the whole complex model, while at the
same time making the best use of the limited resources available.
An abstract dynamic semantics for CNorrish, MichaelUniversity of Cambridge, Computer Laboratory1997-05enTextUCAM-CL-TR-421ISSN 1476-2986
This report is a presentation of a formal semantics for the C
programming language. The semantics has been defined operationally
in a structural operational semantics style and covers the bulk of the core of
the language. The semantics has been developed in a theorem prover
(HOL), where some expected consequences of the language definition
have been proved.
Using the BONITA primitives: a case studyRowstron, AntonyUniversity of Cambridge, Computer Laboratory1997-05enTextUCAM-CL-TR-422ISSN 1476-2986Symbol grounding : Learning categorical and sensorimotor
predictions for coordination in autonomous robotsMacDorman, Karl F.University of Cambridge, Computer Laboratory1997-05enTextUCAM-CL-TR-423ISSN 1476-2986Simplification with renaming: a general proof technique for
tableau and sequent-based proversMassacci, FabioUniversity of Cambridge, Computer Laboratory1997-05enTextUCAM-CL-TR-424ISSN 1476-2986Should your specification language be typed?Lamport, LesliePaulson, Lawrence C.University of Cambridge, Computer Laboratory1997-05enTextUCAM-CL-TR-425ISSN 1476-2986
Most specification languages have a type system. Type systems are
hard to get right, and getting them wrong can lead to
inconsistencies. Set theory can serve as the basis for a
specification language without types. This possibility, which has
been widely overlooked, offers many advantages. Untyped set theory
is simple and is more flexible than any simple typed formalism.
Polymorphism, overloading, and subtyping can make a type system more
powerful, but at the cost of increased complexity, and such
refinements can never attain the flexibility of having no types at
all. Typed formalisms have advantages too, stemming from the power
of mechanical type checking. While types serve little purpose in
hand proofs, they do help with mechanized proofs. In the absence of
verification, type checking can catch errors in specifications. It
may be possible to have the best of both worlds by adding typing
annotations to an untyped specification language.
We consider only specification languages, not programming languages.
Action selection methods using reinforcement
learningHumphrys, MarkUniversity of Cambridge, Computer Laboratory1997-06enTextUCAM-CL-TR-426ISSN 1476-2986
The Action Selection problem is the problem of run-time choice
between conflicting and heterogeneous goals, a central problem in the
simulation of whole creatures (as opposed to the solution of
isolated uninterrupted tasks). This thesis argues that Reinforcement
Learning has been overlooked in the solution of the Action Selection
problem. Considering a decentralised model of mind, with internal
tension and competition between selfish behaviors, this thesis
introduces an algorithm called “W-learning”, whereby different parts
of the mind modify their behavior based on whether or not they are
succeeding in getting the body to execute their actions. This thesis
sets W-learning in context among the different ways of exploiting
Reinforcement Learning numbers for the purposes of Action Selection.
It is a ‘Minimize the Worst Unhappiness’ strategy. The different
methods are tested and their strengths and weaknesses analysed in an
artificial world.
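A schematic sketch of the winner-take-all loop follows; the helper names and update constants are invented here, and the thesis itself explores several variants of the W update.

```python
# Each selfish agent keeps Q-values for its own goal plus a weight W(x)
# recording how much it suffers when it loses the competition in state x.
from collections import defaultdict

class Agent:
    def __init__(self, q):
        self.Q = defaultdict(float, q)    # (state, action) -> value
        self.W = defaultdict(float)       # state -> strength of protest

    def policy(self, x):
        acts = [a for (s, a) in self.Q if s == x]
        return max(acts, key=lambda a: self.Q[(x, a)]) if acts else None

def step(agents, x, agent_reward, alpha=0.1, gamma=0.9):
    # Winner-take-all: the agent that would suffer most if ignored acts.
    winner = max(agents, key=lambda ag: ag.W[x])
    a = winner.policy(x)
    for ag in agents:
        if ag is winner:
            continue
        # W-learning: each loser nudges W towards the value it predicted
        # for its own choice minus what the winner's action earned it.
        r, y = agent_reward(ag, x, a)     # agent-specific reward and state
        best_next = max((ag.Q[(s, b)] for (s, b) in ag.Q if s == y),
                        default=0.0)
        suffered = ag.Q[(x, ag.policy(x))] - (r + gamma * best_next)
        ag.W[x] = (1 - alpha) * ag.W[x] + alpha * suffered
    return a
```

Agents that repeatedly lose little stay quiet; an agent for which obedience is costly sees its W grow until it wins, which is what makes the scheme a ‘Minimize the Worst Unhappiness’ strategy.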
Proving Java type soundnessSyme, DonUniversity of Cambridge, Computer Laboratory1997-06enTextUCAM-CL-TR-427ISSN 1476-2986Floating point verification in HOL Light: the exponential
functionHarrison, JohnUniversity of Cambridge, Computer Laboratory1997-06enTextUCAM-CL-TR-428ISSN 1476-2986
In that they often embody compact but mathematically sophisticated
algorithms, operations for computing the common transcendental
functions in floating point arithmetic seem good targets for formal
verification using a mechanical theorem prover. We discuss some of
the general issues that arise in verifications of this class, and
then present a machine-checked verification of an algorithm for
computing the exponential function in IEEE-754 standard binary
floating point arithmetic. We confirm (indeed strengthen) the main
result of a previously published error analysis, though we uncover a
minor error in the hand proof and are forced to confront several
subtle issues that might easily be overlooked informally.
Our main theorem connects the floating point exponential to its
abstract mathematical counterpart. The specification we prove is
that the function has the correct overflow behaviour and, in the
absence of overflow, the error in the result is less than 0.54 units
in the last place (0.77 if the answer is denormalized) compared
against the exact mathematical exponential function. The algorithm
is expressed in a simple formalized programming language, intended
to be a subset of real programming and hardware description
languages. It uses underlying floating point operations (addition,
multiplication etc.) that are assumed to conform to the IEEE-754
standard for binary floating point arithmetic.
The development described here includes, apart from the proof
itself, a formalization of IEEE arithmetic, a mathematical semantics
for the programming language in which the algorithm is expressed,
and the body of pure mathematics needed. All this is developed
logically from first principles using the HOL Light prover, which
guarantees strict adherence to simple rules of inference while
allowing the user to perform proofs using higher-level derived
rules. We first present the main ideas and conclusions, and then
collect some technical details about the prover and the underlying
mathematical theories in appendices.
Compilation and equivalence of imperative objectsGordon, Andrew D.Hankin, Paul D.Lassen, Søren B.University of Cambridge, Computer Laboratory1997-06enTextUCAM-CL-TR-429ISSN 1476-2986
We adopt the untyped imperative object calculus of Abadi and
Cardelli as a minimal setting in which to study problems of
compilation and program equivalence that arise when compiling
object-oriented languages. We present both a big-step and a
small-step substitution-based operational semantics for the
calculus. Our first two results are theorems asserting the
equivalence of our substitution-based semantics with a closure-based
semantics like that given by Abadi and Cardelli. Our third result is
a direct proof of the correctness of compilation to a stack-based
abstract machine via a small-step decompilation algorithm. Our
fourth result is that contextual equivalence of objects coincides
with a form of Mason and Talcott’s CIU equivalence; the latter
provides a tractable means of establishing operational equivalences.
Finally, we prove correct an algorithm, used in our prototype
compiler, for statically resolving method offsets. This is the first
study of correctness of an object-oriented abstract machine, and of
operational equivalence for the imperative object calculus.
Video mail retrieval using voice : Report on topic
spottingJones, G.J.F.et al.University of Cambridge, Computer Laboratory1997-07enTextUCAM-CL-TR-430ISSN 1476-2986The MCPL programming manual and user guideRichards, MartinUniversity of Cambridge, Computer Laboratory1997-07enTextUCAM-CL-TR-431ISSN 1476-2986On two formal analyses of the Yahalom protocolPaulson, Lawrence C.University of Cambridge, Computer Laboratory1997-07enTextUCAM-CL-TR-432ISSN 1476-2986
The Yahalom protocol is one of those analyzed by Burrows et al. in
the BAN paper. Based upon their analysis, they have proposed
modifications to make the protocol easier to understand and analyze.
Both versions of Yahalom have now been proved, using Isabelle/HOL,
to satisfy strong security goals. The mathematical reasoning behind
these machine proofs is presented informally.
The new proofs do not rely on a belief logic; they use an entirely
different formal model, the inductive method. They confirm the BAN
analysis and the advantages of the proposed modifications. The new
proof methods detect more flaws than BAN and analyze protocols in
finer detail, while remaining broadly consistent with the BAN
principles. In particular, the proofs confirm the explicitness
principle of Abadi and Needham.
Backtracking algorithms in MCPL using bit patterns and
recursionRichards, MartinUniversity of Cambridge, Computer Laboratory1997-07enTextUCAM-CL-TR-433ISSN 1476-2986Demonstration programs for CTL and μ-calculus symbolic model
checkingRichards, MartinUniversity of Cambridge, Computer Laboratory1997-08enTextUCAM-CL-TR-434ISSN 1476-2986Global/local subtyping for a distributed
π-calculusSewell, PeterUniversity of Cambridge, Computer Laboratory1997-08enTextUCAM-CL-TR-435ISSN 1476-2986A new method for estimating optical flowClocksin, W.F.University of Cambridge, Computer Laboratory1997-11enTextUCAM-CL-TR-436ISSN 1476-2986
Accurate and high density estimation of optical flow vectors in an
image sequence is accomplished by a method that estimates the
velocity distribution function for small overlapping regions of the
image. Because the distribution is multimodal, the method can
accurately estimate the change in velocity near motion contrast
borders. Large spatiotemporal support without sacrificing spatial
resolution is a feature of the method, so it is not necessary to
smooth the resulting flow vectors in a subsequent operation, and
there is a certain degree of resistance to aperture and aliasing
effects. Spatial support also provides for the accurate estimation
of long-range displacements, and subpixel accuracy is achieved by a
simple weighted mean near the mode of the velocity distribution
function.
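The subpixel step can be illustrated with a toy Python fragment; the numbers and function name are invented for illustration.

```python
# Refine the mode of a sampled velocity distribution by a weighted mean
# over the neighbouring bins, giving a subpixel velocity estimate.
def subpixel_mode(bins, weights, radius=1):
    """bins[i] is a candidate velocity, weights[i] its support."""
    m = max(range(len(weights)), key=weights.__getitem__)
    lo, hi = max(0, m - radius), min(len(bins), m + radius + 1)
    total = sum(weights[lo:hi])
    return sum(b * w for b, w in zip(bins[lo:hi], weights[lo:hi])) / total

velocities = [0.0, 0.5, 1.0, 1.5, 2.0]     # pixels per frame
support    = [0.1, 0.7, 1.0, 0.6, 0.1]     # e.g. matching evidence
print(subpixel_mode(velocities, support))  # ~0.98, slightly below the mode
```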
The method is demonstrated using image sequences obtained from the
analysis of ceramic and metal materials under stress. The
performance of the system under degenerate conditions is also
analysed to provide insight into the behaviour of optical flow
methods in general.
Trusting in computer systemsHarbison, William S.University of Cambridge, Computer Laboratory1997-12enTextUCAM-CL-TR-437ISSN 1476-2986
We need to be able to reason about large systems, and not just about
their components. For this we need new conceptual tools, and this
dissertation therefore indicates the need for a new methodology
which will allow us to better identify areas of possible conflict or
lack of knowledge in a system.
In particular, it examines the concept of trust, and how this can
help us to understand the basic security aspects of a system. The
main proposal of this present work is that systems are viewed in a
manner which analyses the conditions under which they have been
designed to perform, and the circumstances under which they have
been implemented, and then compares the two. This problem is then
examined from the point of what is being trusted in a system, or
what it is being trusted for.
Starting from an approach developed in a military context, we
demonstrate how this can lead to unanticipated risks when applied
inappropriately. We further suggest that ‘trust’ be considered a
relative concept, in contrast to the more usual usage, and that it is
not the result of knowledge but a substitute for it. The utility of
these concepts is in their ability to quantify the risks associated
with a specific participant, whether these are explicitly accepted
by them, or not.
We finally propose a distinction between ‘trust’ and ‘trustworthy’
and demonstrate that most current uses of the term ‘trust’ are more
appropriately viewed as statements of ‘trustworthiness’. Ultimately,
therefore, we suggest that the traditional “Orange Book” concept of
trust resulting from knowledge can violate the security policy of a
system.
An architecture for scalable and deterministic video
serversShi, FengUniversity of Cambridge, Computer Laboratory1997-11enTextUCAM-CL-TR-438ISSN 1476-2986
A video server is a storage system that can provide a repository for
continuous media (CM) data and sustain CM stream delivery (playback
or recording) through networks. The voluminous nature of CM data
demands a video server to be scalable in order to serve a large
number of concurrent client requests. In addition, deterministic
services can be provided by a video server for playback because the
characteristics of variable bit rate (VBR) video can be analysed in
advance and used in run-time admission control (AC) and data
retrieval.
Recent research has made gigabit switches a reality, and the
cost/performance ratio of microprocessors and standard PCs is
dropping steadily. It would be more cost effective and flexible to
use off-the-shelf components inside a video server with a scalable
switched network as the primary interconnect than to make a special
purpose or massively parallel multiprocessor based video server.
This work advocates and assumes such a scalable video server
structure in which data is striped to multiple peripherals attached
directly to a switched network.
However, most contemporary distributed file systems do not support
data distribution across multiple networked nodes, let alone
providing quality of service (QoS) to CM applications at the same
time. It is the observation of this dissertation that the software
system framework for network striped video servers is as important
as the scalable hardware architecture itself. This leads to the
development of a new system architecture, which is scalable,
flexible and QoS aware, for scalable and deterministic video
servers. The resulting architecture is called Cadmus, from sCAlable
And Deterministic MUltimedia Servers.
Cadmus also provides integrated solutions to AC and actual QoS
enforcement in storage nodes. This is achieved by considering
resources such as CPU, buffer, disk, and network simultaneously
rather than independently, and by including both real-time (RT) and
non-real-time (NRT) activities. In addition, the potential to smooth
the variability of VBR videos using read-ahead under client buffer
constraints is identified. A new smoothing algorithm is presented,
analysed, and incorporated into the Cadmus architecture.
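The generic idea of read-ahead smoothing, not the dissertation's specific algorithm, can be sketched as follows; the function names and the constant-rate policy are invented for illustration.

```python
# Find the smallest constant rate that avoids client underflow, then
# send ahead of consumption by at most `buffer` bytes per client.
def min_rate(frames):
    demand, rate = 0.0, 0.0
    for t, size in enumerate(frames, start=1):
        demand += size                  # bytes consumed by end of round t
        rate = max(rate, demand / t)    # need rate*t >= demand for all t
    return rate

def schedule(frames, buffer):
    rate = min_rate(frames)
    sent, played, plan = 0.0, 0.0, []
    for size in frames:
        played += size
        sent = min(sent + rate, played + buffer)   # cap: client buffer
        sent = max(sent, played)                   # floor: no underflow
        plan.append(sent)
    return rate, plan

print(schedule([2, 9, 3, 1, 5], buffer=4))
# (5.5, [5.5, 11.0, 16.5, 19.0, 24.0]): the 9-byte peak is pre-fetched
```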
A prototype implementation of Cadmus has been constructed based on
distributed object computing and hardware modules directly connected
to an Asynchronous Transfer Mode (ATM) network. Experiments were
performed to evaluate the implementation and demonstrate the utility
and feasibility of the architecture and its AC criteria.
Applying mobile code to distributed systemsHalls, David A.University of Cambridge, Computer Laboratory1997-12enTextUCAM-CL-TR-439ISSN 1476-2986Inductive analysis of the internet protocol TLSPaulson, Lawrence C.University of Cambridge, Computer Laboratory1997-12enTextUCAM-CL-TR-440ISSN 1476-2986
Internet browsers use security protocols to protect confidential
messages. An inductive analysis of TLS (a descendant of SSL 3.0) has
been performed using the theorem prover Isabelle. Proofs are based
on higher-order logic and make no assumptions concerning beliefs or
finiteness. All the obvious security goals can be proved; session
resumption appears to be secure even if old session keys have been
compromised. The analysis suggests modest changes to simplify the
protocol.
TLS, even at an abstract level, is much more complicated than most
protocols that researchers have verified. Session keys are
negotiated rather than distributed, and the protocol has many
optional parts. Nevertheless, the resources needed to verify TLS are
modest. The inductive approach scales up.
A generic tableau prover and its integration with
IsabellePaulson, Lawrence C.University of Cambridge, Computer Laboratory1998-01enTextUCAM-CL-TR-441ISSN 1476-2986
A generic tableau prover has been implemented and integrated with
Isabelle. It is based on leanTAP but is much more complicated, with
numerous modifications to allow it to reason with any supplied set
of tableau rules. It has a higher-order syntax in order to support
the binding operators of set theory; unification is first-order
(extended for bound variables in obvious ways) instead of
higher-order, for simplicity.
When a proof is found, it is returned to Isabelle as a list of
tactics. Because Isabelle verifies the proof, the prover can cut
corners for efficiency’s sake without compromising soundness. For
example, it knows almost nothing about types.
A combination of nonstandard analysis and geometry theorem
proving, with application to Newton’s PrincipiaFleuriot, JacquesPaulson, Lawrence C.University of Cambridge, Computer Laboratory1998-01enTextUCAM-CL-TR-442ISSN 1476-2986
The theorem prover Isabelle is used to formalise and reproduce some
of the styles of reasoning used by Newton in his Principia. The
Principia’s reasoning is resolutely geometric in nature but contains
“infinitesimal” elements and the presence of motion, which take it
beyond the traditional boundaries of Euclidean Geometry. These
present difficulties that prevent Newton’s proofs from being
mechanised using only the existing geometry theorem proving (GTP)
techniques.
Using concepts from Robinson’s Nonstandard Analysis (NSA) and a
powerful geometric theory, we introduce the concept of an
infinitesimal geometry in which quantities can be infinitely small
or infinitesimal. We reveal and prove new properties of this
geometry that only hold because infinitesimal elements are allowed
and use them to prove lemmas and theorems from the Principia.
The inductive approach to verifying cryptographic
protocolsPaulson, Lawrence C.University of Cambridge, Computer Laboratory1998-02enTextUCAM-CL-TR-443ISSN 1476-2986
Informal arguments that cryptographic protocols are secure can be
made rigorous using inductive definitions. The approach is based on
ordinary predicate calculus and copes with infinite-state systems.
Proofs are generated using Isabelle/HOL. The human effort required
to analyze a protocol can be as little as a week or two, yielding a
proof script that takes a few minutes to run.
Protocols are inductively defined as sets of traces. A trace is a
list of communication events, perhaps comprising many interleaved
protocol runs. Protocol descriptions incorporate attacks and
accidental losses. The model spy knows some private keys and can
forge messages using components decrypted from previous traffic.
Three protocols are analyzed below: Otway-Rees (which uses
shared-key encryption), Needham-Schroeder (which uses public-key
encryption), and a recursive protocol (which is of variable length).
One can prove that event ev always precedes event ev′ or that
property P holds provided X remains secret. Properties can be proved
from the viewpoint of the various principals: say, if A receives a
final message from B then the session key it conveys is good.
From rewrite rules to bisimulation congruencesSewell, PeterUniversity of Cambridge, Computer Laboratory1998-05enTextUCAM-CL-TR-444ISSN 1476-2986Secure sessions from weak secretsRoe, MichaelChristianson, BruceWheeler, DavidUniversity of Cambridge, Computer Laboratory1998-07enTextUCAM-CL-TR-445ISSN 1476-2986
Sometimes two parties who share a weak secret k (such as a password)
wish to share a strong secret s (such as a session key) without
revealing information about k to a (possibly active) attacker. We
assume that both parties can generate strong random numbers and
forget secrets, and present three protocols for secure strong secret
sharing, based on RSA, Diffie-Hellman and El-Gamal. As well as being
simpler and quicker than their predecessors, our protocols also have
slightly stronger security properties: in particular, they make no
cryptographic use of s and so impose no subtle restrictions upon the
use which is made of s by other protocols.
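The underlying Diffie-Hellman idea can be caricatured in Python (a toy
sketch only: the parameters are far too small, the masking is
simplified, and this is not the authors’ exact protocol):

    # Both parties mask their Diffie-Hellman values under the weak secret k,
    # so an eavesdropper cannot confirm guesses of k offline; the strong
    # secret s is never itself used cryptographically here.
    import hashlib, secrets

    p, g = 2**64 - 59, 5               # toy prime; real use needs a large one

    def mask(value, k, label):         # XOR with a pad derived from k
        pad = hashlib.sha256((k + label).encode()).digest()[:8]
        return value ^ int.from_bytes(pad, 'big')

    k = 'correct horse'                # the shared weak secret (a password)
    a = secrets.randbelow(p - 2) + 1
    b = secrets.randbelow(p - 2) + 1
    msg_A = mask(pow(g, a, p), k, 'A') # A -> B
    msg_B = mask(pow(g, b, p), k, 'B') # B -> A
    s_A = pow(mask(msg_B, k, 'B'), a, p)
    s_B = pow(mask(msg_A, k, 'A'), b, p)
    assert s_A == s_B                  # both ends now hold the strong secret s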
A probabilistic model of information retrieval:
development and statusSpärck Jones, K.Walker, S.Robertson, S.E.University of Cambridge, Computer Laboratory1998-08enTextUCAM-CL-TR-446ISSN 1476-2986
The paper combines a comprehensive account of the probabilistic
model of retrieval with new systematic experiments on TREC Programme
material. It presents the model from its foundations through its
logical development to cover more aspects of retrieval data and a
wider range of system functions. Each step in the argument is
matched by comparative retrieval tests, to provide a single coherent
account of a major line of research. The experiments demonstrate,
for a large test collection, that the probabilistic model is
effective and robust, and that it responds appropriately, with major
improvements in performance, to key features of retrieval
situations.
Are timestamps worth the effort? A formal
treatmentBella, GiampaoloPaulson, Lawrence C.University of Cambridge, Computer Laboratory1998-09enTextUCAM-CL-TR-447ISSN 1476-2986
Theorem proving provides formal and detailed support to the claim
that timestamps can give better freshness guarantees than nonces do,
and can simplify the design of crypto-protocols. However, since they
rely on synchronised clocks, their benefits are still debatable. The
debate should gain from our formal analysis, which is achieved
through the comparison of a nonce-based crypto-protocol,
Needham-Schroeder, with its natural modification by timestamps,
Kerberos.
A computational interpretation of the λμ calculusBierman, G.M.University of Cambridge, Computer Laboratory1998-09enTextUCAM-CL-TR-448ISSN 1476-2986Locales : A sectioning concept for IsabelleKammüller, FlorianWenzel, MarkusUniversity of Cambridge, Computer Laboratory1998-10enTextUCAM-CL-TR-449ISSN 1476-2986Open service support for ATMvan der Merwe, Jacobus ErasmusUniversity of Cambridge, Computer Laboratory1998-11enTextUCAM-CL-TR-450ISSN 1476-2986The structure of open ATM control architecturesRooney, SeanUniversity of Cambridge, Computer Laboratory1998-11enTextUCAM-CL-TR-451ISSN 1476-2986A formal proof of Sylow’s theorem : An experiment in
abstract algebra with Isabelle HOLKammüller, FlorianPaulson, Lawrence C.University of Cambridge, Computer Laboratory1998-11enTextUCAM-CL-TR-452ISSN 1476-2986
The theorem of Sylow is proved in Isabelle HOL. We follow the proof
by Wielandt that is more general than the original and uses a
non-trivial combinatorial identity. The mathematical proof is
explained in some detail leading on to the mechanization of group
theory and the necessary combinatorics in Isabelle. We present the
mechanization of the proof in detail giving reference to theorems
contained in an appendix. Some weak points of the experiment with
respect to a natural treatment of abstract algebraic reasoning give
rise to a discussion of the use of module systems to represent
abstract algebra in theorem provers. Drawing from that, we present
tentative ideas for further research into a section concept for
Isabelle.
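For orientation, here is the theorem in textbook notation rather than
Isabelle syntax, together with the combinatorial identity on which
Wielandt’s argument turns:

    % Sylow's first theorem
    \text{If } |G| = p^a m \text{ and } p \nmid m,
    \text{ then } G \text{ contains a subgroup of order } p^a.
    % the non-trivial binomial identity used by Wielandt
    \binom{p^a m}{p^a} \equiv m \pmod{p}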
C formalised in HOLNorrish, MichaelUniversity of Cambridge, Computer Laboratory1998-12enTextUCAM-CL-TR-453ISSN 1476-2986
We present a formal semantics of the C programming language,
covering both the type system and the dynamic behaviour of programs.
The semantics is wide-ranging, covering most of the language, with
its most significant omission being the C library. Using a
structural operational semantics we specify transition relations for
C’s expressions, statements and declarations in higher order logic.
The consistency of our definition is assured by its specification in
the HOL theorem prover. With the theorem prover, we have used the
semantics as the basis for a set of proofs of interesting theorems
about C. We investigate properties of expressions and statements
separately.
In our chapter of results about expressions, we begin with two
results about the interaction between the type system and the
dynamic semantics. We have both type preservation, that the values
produced by expressions conform to the type predicted for them; and
type safety, that typed expressions will not block, but will either
evaluate to a value, or cause undefined behaviour. We then also show
that two broad classes of expression are deterministic. This last
result is of considerable practical value as it makes later
verification proofs significantly easier.
In our chapter of results about statements, we prove a series of
derived rules that provide C with Floyd-Hoare style “axiomatic”
rules for verifying properties of programs. These rules are
consequences of the original semantics, not independently stated
axioms, so we can be sure of their soundness. This chapter also
proves the correctness of an automatic tool for constructing
post-conditions for loops with break and return statements.
Finally, we perform some simple verification case studies, going
some way towards demonstrating practical utility for the semantics
and accompanying tools.
This technical report is substantially the same as the PhD thesis I
submitted in August 1998. The minor differences between that
document and this are principally improvements suggested by my
examiners Andy Gordon and Tom Melham, whom I thank for their help
and careful reading.
Parametric polymorphism and operational
equivalencePitts, Andrew M.University of Cambridge, Computer Laboratory1998-12enTextUCAM-CL-TR-454ISSN 1476-2986Multiple modalitiesBierman, G.M.University of Cambridge, Computer Laboratory1998-12enTextUCAM-CL-TR-455ISSN 1476-2986An evaluation based approach to process calculiRoss, Joshua Robert XavierUniversity of Cambridge, Computer Laboratory1999-01enTextUCAM-CL-TR-456ISSN 1476-2986A concurrent object calculus: reduction and
typingGordon, Andrew D.Hankin, Paul D.University of Cambridge, Computer Laboratory1999-02enTextUCAM-CL-TR-457ISSN 1476-2986Final coalgebras as greatest fixed points in ZF set
theoryPaulson, Lawrence C.University of Cambridge, Computer Laboratory1999-03enTextUCAM-CL-TR-458ISSN 1476-2986
A special final coalgebra theorem, in the style of Aczel (1988), is
proved within standard Zermelo-Fraenkel set theory. Aczel’s
Anti-Foundation Axiom is replaced by a variant definition of
function that admits non-well-founded constructions. Variant ordered
pairs and tuples, of possibly infinite length, are special cases of
variant functions. Analogues of Aczel’s solution and substitution
lemmas are proved in the style of Rutten and Turi (1993). The
approach is less general than Aczel’s, but the treatment of
non-well-founded objects is simple and concrete. The final coalgebra
of a functor is its greatest fixedpoint. Compared with previous work
(Paulson, 1995a), iterated substitutions and solutions are
considered, as well as final coalgebras defined with respect to
parameters. The disjoint sum construction is replaced by a smoother
treatment of urelements that simplifies many of the derivations. The
theory facilitates machine implementation of recursive definitions
by letting both inductive and coinductive definitions be represented
as fixedpoints. It has already been applied to the theorem prover
Isabelle (Paulson, 1994).
An open parallel architecture for data-intensive
applicationsAfshar, MohamadUniversity of Cambridge, Computer Laboratory1999-07enTextUCAM-CL-TR-459ISSN 1476-2986
Data-intensive applications consist of both declarative
data-processing parts and imperative computational parts. For
applications such as climate modelling, scale hits both the
computational aspects which are typically handled in a procedural
programming language, and the data-processing aspects which are
handled in a database query language. Although parallelism has been
successfully exploited in the data-processing parts by parallel
evaluation of database queries associated with the application,
current database query languages are poor at expressing the
computational aspects, which are also subject to scale.
This thesis proposes an open architecture that delivers parallelism
shared between the database, system and application, thus enabling
the integration of the conventionally separated query and non-query
components of a data-intensive application. The architecture is
data-model independent and can be used in a variety of different
application areas including decision-support applications, which are
query based, and complex applications, which comprise procedural
language statements with embedded queries. The architecture
encompasses a unified model of parallelism and the realisation of
this model in the form of a language within which it is possible to
describe both the query and non-query components of data-intensive
applications. The language enables the construction of parallel
applications by the hierarchical composition of platform-independent
parallel forms, each of which implements a form of task or data
parallelism. These forms may be used to determine both query and
non-query actions.
Queries are expressed in a declarative language based on “monoid
comprehensions”. The approach of using monoids to model data types
and monoid homomorphisms to iterate over collection types enables
mathematically provable compile-time optimisations whilst also
facilitating multiple collection types and data type extensibility.
Monoid comprehension programs are automatically transformed into
parallel programs composed of applications of the parallel forms,
one of which is the “monoid homomorphism”. This process involves
identifying the parts of a query where task and data parallelism are
available and mapping that parallelism onto the most suitable form.
Data parallelism in queries is mapped onto a form that implements
combining tree parallelism for query evaluation and dividing tree
parallelism to realise data partitioning. Task parallelism is mapped
onto two separate forms that implement pipeline and independent
parallelism. This translation process is applied to all
comprehension queries including those in complex applications. The
result is a skeleton program in which both the query and non-query
parts are expressed within a single language. Expressions in this
language are amenable to the application of optimising skeleton
rewrite rules.
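The monoid-homomorphism idea is easily sketched in Python (invented
names, not the thesis’s language): a query folds a per-element function
into a monoid, so partitions can be evaluated independently and their
results combined:

    from functools import reduce

    def hom(unit, combine, f, xs):
        """Map f over xs and fold with the monoid (unit, combine)."""
        return reduce(combine, (f(x) for x in xs), unit)

    # a sum over salaries, expressed as a homomorphism into the (0, +) monoid
    salaries = [1200, 950, 2100]
    add = lambda x, y: x + y
    total = hom(0, add, lambda x: x, salaries)

    # data parallelism: evaluate partitions independently, then combine
    part1, part2 = salaries[:2], salaries[2:]
    assert total == add(hom(0, add, lambda x: x, part1),
                        hom(0, add, lambda x: x, part2))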
A complete prototype of the decision-support architecture has been
constructed on a 128-cell MIMD parallel computer. A demonstration of
the utility of the query framework is performed by modelling some of
OQL and a substantial subset of SQL. The system is evaluated for
query speedup with a number of hardware configurations using a large
music catalogue database. The results obtained show that the
implementation delivers the performance gains expected while
offering a convenient definition of the parallel environment.
Message reception in the inductive approachBella, GiampaoloUniversity of Cambridge, Computer Laboratory1999-03enTextUCAM-CL-TR-460ISSN 1476-2986
Cryptographic protocols can be formally analysed in great detail by
means of Paulson’s Inductive Approach, which is mechanised by the
theorem prover Isabelle. The approach only relied on message sending
(and noting) in order to keep the models simple. We introduce a new
event, message reception, and show that the price paid in terms of
runtime is negligible because old proofs can be reused. On the other
hand, the new event enhances the global expressiveness, and makes it
possible to define an accurate notion of agents’ knowledge, which
extends and replaces Paulson’s notion of spy’s knowledge. We have
designed new guarantees to assure each agent that the peer does not
know the crucial message items of the session. This work thus
extends the scope of the Inductive approach. Finally, we provide
general guidance on updating the protocols analysed so far, and give
examples for some cases.
Integrating Gandalf and HOLHurd, JoeUniversity of Cambridge, Computer Laboratory1999-03enTextUCAM-CL-TR-461ISSN 1476-2986
Gandalf is a first-order resolution theorem-prover, optimized for
speed and specializing in manipulations of large clauses. In this
paper I describe GANDALF TAC, a HOL tactic that proves goals by
calling Gandalf and mirroring the resulting proofs in HOL. This call
can occur over a network, and a Gandalf server may be set up
servicing multiple HOL clients. In addition, the translation of the
Gandalf proof into HOL fits in with the LCF model and guarantees
logical consistency.
Location-independent communication for mobile agents: a
two-level architectureSewell, PeterWojciechowski, Paweł T.Pierce, Benjamin C.University of Cambridge, Computer Laboratory1999-04enTextUCAM-CL-TR-462ISSN 1476-2986Secure composition of insecure componentsSewell, PeterVitek, JanUniversity of Cambridge, Computer Laboratory1999-04enTextUCAM-CL-TR-463ISSN 1476-2986Feature representation for the automatic analysis of
fluorescence in-situ hybridization imagesLerner, BoazClocksin, WilliamDhanjal, SeemaHultén, MajBishop, ChristopherUniversity of Cambridge, Computer Laboratory1999-05enTextUCAM-CL-TR-464ISSN 1476-2986Gelfish – graphical environment for labelling FISH
imagesLerner, BoazDhanjal, SeemaHultén, MajUniversity of Cambridge, Computer Laboratory1999-05enTextUCAM-CL-TR-465ISSN 1476-2986Automatic signal classification in fluorescence in-situ
hybridization imagesLerner, BoazClocksin, WilliamDhanjal, SeemaHultén, MajBishop, ChristopherUniversity of Cambridge, Computer Laboratory1999-05enTextUCAM-CL-TR-466ISSN 1476-2986Mechanizing UNITY in IsabellePaulson, Lawrence C.University of Cambridge, Computer Laboratory1999-06enTextUCAM-CL-TR-467ISSN 1476-2986
UNITY is an abstract formalism for proving properties of concurrent
systems, which typically are expressed using guarded assignments
[Chandy and Misra 1988]. UNITY has been mechanized in higher-order
logic using Isabelle, a proof assistant. Safety and progress
primitives, their weak forms (for the substitution axiom) and the
program composition operator (union) have been formalized. To give a
feel for the concrete syntax, the paper presents a few extracts from
the Isabelle definitions and proofs. It discusses a small example,
two-process mutual exclusion. A mechanical theory of unions of
programs supports a degree of compositional reasoning. Original work
on extending program states is presented and then illustrated
through a simple example involving an array of processes.
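The flavour of the formalism can be conveyed by a small Python analogue
(not Isabelle/UNITY syntax, and a deliberately naive turn-based
program): a UNITY program is a set of guarded assignments, and a safety
property is an invariant of every reachable state:

    # state: (pc0, pc1, turn); each enabled guard yields one successor state
    def acts(s):
        pc0, pc1, turn = s
        out = []
        if pc0 == 'idle':              out.append(('try', pc1, turn))
        if pc0 == 'try' and turn == 0: out.append(('crit', pc1, turn))
        if pc0 == 'crit':              out.append(('idle', pc1, 1))
        if pc1 == 'idle':              out.append((pc0, 'try', turn))
        if pc1 == 'try' and turn == 1: out.append((pc0, 'crit', turn))
        if pc1 == 'crit':              out.append((pc0, 'idle', 0))
        return out or [s]              # stutter when nothing is enabled

    reachable, frontier = set(), [('idle', 'idle', 0)]
    while frontier:
        s = frontier.pop()
        if s not in reachable:
            reachable.add(s)
            frontier.extend(acts(s))

    # mutual exclusion holds as an invariant of the reachable state space
    assert all(not (p0 == 'crit' and p1 == 'crit') for p0, p1, _ in reachable)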
Synthesis of asynchronous circuitsWilcox, Stephen PaulUniversity of Cambridge, Computer Laboratory1999-07enTextUCAM-CL-TR-468ISSN 1476-2986
The majority of integrated circuits today are synchronous: every
part of the chip times its operation with reference to a single
global clock. As circuits become larger and faster, it becomes
progressively more difficult to coordinate all actions of the chip
to the clock. Asynchronous circuits do not suffer from this problem,
because they do not require global synchronization; they also offer
other benefits, such as modularity, lower power and automatic
adaptation to physical conditions.
The main disadvantage of asynchronous circuits is that there are few
tools to help with design. This thesis describes a new synthesis
tool for asynchronous modules, which combines a number of novel
ideas with existing methods for finite state machine synthesis.
Connections between modules are assumed to have unbounded finite
delays on all wires, but fundamental mode is used inside modules,
rather than the pessimistic speed-independent or
quasi-delay-insensitive models. Accurate technology-specific
verification is performed to check that circuits work correctly.
Circuits are described using a language based upon the Signal
Transition Graph, which is a well-known method for specifying
asynchronous circuits. Concurrency reduction techniques are used to
produce a large number of circuits that conform to a given
specification. Circuits are verified using a simulation algorithm
derived from the work of Brzozowski and Seger, and then performance
estimations are obtained by a gate-level simulator utilising a new
estimation of waveform slopes. Circuits can be ranked in terms of
high speed, low power dissipation or small size, and then the best
circuit for a particular task chosen.
Results are presented that show significant improvements over most
circuits produced by other synthesis tools. Some circuits are twice
as fast and dissipate half the power of equivalent speed-independent
circuits. Specification examples are provided which show that the
front-end specification is easier to use than current specification
approaches. The price that must be paid for the improved performance
is decreased reliability and technology dependence of the circuits
produced; the proposed tool can also take a very long time to produce
a result.
A combination of geometry theorem proving and nonstandard
analysis, with application to Newton’s PrincipiaFleuriot, Jacques DésiréUniversity of Cambridge, Computer Laboratory1999-08enTextUCAM-CL-TR-469ISSN 1476-2986Modular reasoning in IsabelleKammüller, FlorianUniversity of Cambridge, Computer Laboratory1999-08enTextUCAM-CL-TR-470ISSN 1476-2986Murphy’s law, the fitness of evolving species, and the
limits of software reliabilityBrady, Robert M.Anderson, Ross J.Ball, Robin C.University of Cambridge, Computer Laboratory1999-09enTextUCAM-CL-TR-471ISSN 1476-2986
We tackle two problems of interest to the software assurance
community. Firstly, existing models of software development (such as
the waterfall and spiral models) are oriented towards one-off
software development projects, while the growth of mass market
computing has led to a world in which most software consists of
packages which follow an evolutionary development model. This leads
us to ask whether anything interesting and useful may be said about
evolutionary development. We answer in the affirmative. Secondly,
existing reliability growth models emphasise the Poisson
distribution of individual software bugs, while the empirically
observed reliability growth for large systems is asymptotically
slower than this. We provide a rigorous explanation of this
phenomenon. Our reliability growth model is inspired by statistical
thermodynamics, but also applies to biological evolution. It is in
close agreement with experimental measurements of the fitness of an
evolving species and the reliability of commercial software
products. However, it shows that there are significant differences
between the evolution of software and the evolution of species. In
particular, we establish maximisation properties corresponding to
Murphy’s law which work to the advantage of a biological species,
but to the detriment of software reliability.
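The flavour of the result can be seen in a toy simulation (illustrative
only, not the paper’s analytic model): when bug failure rates are
spread over several orders of magnitude, the residual failure rate
after t hours of debugging decays roughly as 1/t, far slower than any
single-bug exponential:

    import math, random

    random.seed(1)
    # 5000 bugs with failure rates spread log-uniformly over four decades
    rates = [10 ** random.uniform(-4, 0) for _ in range(5000)]
    for t in (10, 100, 1000):
        # a bug of rate r survives t hours of testing with probability e^(-rt)
        residual = sum(r * math.exp(-r * t) for r in rates)
        print(t, residual)             # falls off roughly like 1/t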
Simulating music learning with autonomous listening agents:
entropy, ambiguity and contextReis, Ben Y.University of Cambridge, Computer Laboratory1999-09enTextUCAM-CL-TR-472ISSN 1476-2986Computer algebra and theorem provingBallarin, ClemensUniversity of Cambridge, Computer Laboratory1999-10enTextUCAM-CL-TR-473ISSN 1476-2986A Bayesian methodology and probability density estimation
for fluorescence in-situ hybridization signal
classificationLerner, BoazUniversity of Cambridge, Computer Laboratory1999-10enTextUCAM-CL-TR-474ISSN 1476-2986A comparison of state-of-the-art classification techniques
with application to cytogeneticsLerner, BoazLawrence, Neil D.University of Cambridge, Computer Laboratory1999-10enTextUCAM-CL-TR-475ISSN 1476-2986Linking ACL2 and HOLStaples, MarkUniversity of Cambridge, Computer Laboratory1999-11enTextUCAM-CL-TR-476ISSN 1476-2986Presheaf models for CCS-like languagesCattani, Gian LucaWinskel, GlynnUniversity of Cambridge, Computer Laboratory1999-11enTextUCAM-CL-TR-477ISSN 1476-2986Secure composition of untrusted code: wrappers and causality
typesSewell, PeterVitek, JanUniversity of Cambridge, Computer Laboratory1999-11enTextUCAM-CL-TR-478ISSN 1476-2986The interaction between fault tolerance and
securityPrice, GeraintUniversity of Cambridge, Computer Laboratory1999-12enTextUCAM-CL-TR-479ISSN 1476-2986
This dissertation studies the effects on system design when
including fault tolerance design principles within security
services.
We start by looking at the changes made to the trust model within
protocol design, and how moving away from trusted server design
principles affects the structure of the protocol. Taking the primary
results from this work, we move on to study how control in protocol
execution can be used to increase assurances in the actions of
legitimate participants. We study some examples, defining two new
classes of attack, and note that by increasing client control in
areas of protocol execution, it is possible to overcome certain
vulnerabilities.
We then look at different models in fault tolerance, and how their
adoption into a secure environment can change the design principles
and assumptions made when applying the models.
We next look at the application of timing checks in protocols. There
are some classes of timing attack that are difficult to thwart using
existing techniques, because of the inherent unreliability of
networked communication. We develop a method of adapting the Quality of
Service mechanisms built into ATM networks to provide another layer of
protection against timing attacks.
We then study the use of primary-backup mechanisms within server
design, as previous work on server replication in security centres
on the use of the state machine approach for replication, which
provides a higher degree of assurance in system design, but adds
complexity.
We then provide a design for a server to reliably and securely store
objects across a loosely coupled, distributed environment. The main
goal behind this design was to realise the ability for a client to
exert control over the fault tolerance inherent in the service.
The main conclusions we draw from our research are that fault
tolerance has a wider application within security than current
practices, which are primarily based on replicating servers, and
clients can exert control over the protocols and mechanisms to
achieve resilience against differing classes of attack. We promote
some new ideas on how, by challenging the prevailing model for
client-server architectures in a secure environment, legitimate
clients can have greater control over the services they use. We
believe this to be a useful goal, given that the client stands to
lose if the security of the server is undermined.
Programming combinations of deduction and BDD-based symbolic
calculationGordon, MikeUniversity of Cambridge, Computer Laboratory1999-12enTextUCAM-CL-TR-480ISSN 1476-2986Combining the Hol98 proof assistant with the BuDDy BDD
packageGordon, MikeLarsen, Ken FriisUniversity of Cambridge, Computer Laboratory1999-12enTextUCAM-CL-TR-481ISSN 1476-2986Biometric decision landscapesDaugman, JohnUniversity of Cambridge, Computer Laboratory2000-01enTextUCAM-CL-TR-482ISSN 1476-2986
This report investigates the “decision landscapes” that characterize
several forms of biometric decision making. The issues discussed
include: (i) Estimating the degrees-of-freedom associated with
different biometrics, as a way of measuring the randomness and
complexity (and therefore the uniqueness) of their templates. (ii)
The consequences of combining more than one biometric test to arrive
at a decision. (iii) The requirements for performing identification
by large-scale exhaustive database search, as opposed to mere
verification by comparison against a single template. (iv) Scenarios
for Biometric Key Cryptography (the use of biometrics for encryption
of messages). These issues are considered here in abstract form, but
where appropriate, the particular example of iris recognition is
used as an illustration. A unifying theme of all four sets of issues
is the role of combinatorial complexity, and its measurement, in
determining the potential decisiveness of biometric decision making.
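The degrees-of-freedom idea can be made concrete with a short sketch
(the parameter values are illustrative, not the report’s): if impostor
comparisons behave like N independent coin tosses, the false-match rate
at a given Hamming-distance threshold is a binomial tail:

    from math import comb

    def false_match_rate(n_dof, threshold):
        """P(fraction of disagreeing bits <= threshold), Binomial(n_dof, 1/2)."""
        cutoff = int(threshold * n_dof)
        return sum(comb(n_dof, k) for k in range(cutoff + 1)) / 2 ** n_dof

    print(false_match_rate(249, 0.32))   # vanishingly small for many DOF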
Elastic network controlBos, Hendrik JaapUniversity of Cambridge, Computer Laboratory2000-01enTextUCAM-CL-TR-483ISSN 1476-2986Automatic summarising and the CLASP systemTucker, RichardUniversity of Cambridge, Computer Laboratory2000-01enTextUCAM-CL-TR-484ISSN 1476-2986
This dissertation discusses summarisers and summarising in general,
and presents CLASP, a new summarising system that uses a shallow
semantic representation of the source text called a “predication
cohesion graph”.
Nodes in the graph are “simple predications” corresponding to
events, states and entities mentioned in the text; edges indicate
related or similar nodes. Summary content is chosen by selecting
some of these predications according to criteria of “importance”,
“representativeness” and “cohesiveness”. These criteria are
expressed as functions on the nodes of a weighted graph. Summary
text is produced either by extracting whole sentences from the
source text, or by generating short, indicative “summary phrases”
from the selected predications.
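Content selection over such a graph can be sketched in a few lines of
Python (the scoring rule is invented, not CLASP’s actual criteria):

    # score each predication node by the total weight of its incident edges
    # (a crude "cohesiveness" measure) and take the top k as summary content
    def select_content(nodes, edges, k):
        score = {n: 0.0 for n in nodes}
        for (a, b), w in edges.items():
            score[a] += w
            score[b] += w
        return sorted(nodes, key=lambda n: -score[n])[:k]

    nodes = ['profits fell', 'CEO resigned', 'office repainted']
    edges = {('profits fell', 'CEO resigned'): 0.9,
             ('profits fell', 'office repainted'): 0.1}
    print(select_content(nodes, edges, 2))   # the two well-connected nodes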
CLASP uses linguistic processing but no domain knowledge, and
therefore does not restrict the subject matter of the source text.
It is intended to deal robustly with complex texts that it cannot
analyse completely accurately or in full. Experiments in summarising
stories from the Wall Street Journal suggest there may be a benefit
in identifying important material in a semantic representation
rather than a surface one, but that, despite the robustness of the
source representation, inaccuracies in CLASP’s linguistic analysis
can dramatically affect the readability of its summaries. I discuss
ways in which this and other problems might be overcome.
Three notes on the interpretation of VerilogStewart, DarylVanInwegen, MyraUniversity of Cambridge, Computer Laboratory2000-01enTextUCAM-CL-TR-485ISSN 1476-2986Stretching a point: aspect and temporal discourseThomas, James RichardUniversity of Cambridge, Computer Laboratory2000-02enTextUCAM-CL-TR-486ISSN 1476-2986Sequential program composition in UNITYVos, TanjaSwierstra, DoaitseUniversity of Cambridge, Computer Laboratory2000-03enTextUCAM-CL-TR-487ISSN 1476-2986Formal verification of card-holder registration in
SETBella, GiampaoloMassacci, FabioPaulson, LawrenceTramontano, PieroUniversity of Cambridge, Computer Laboratory2000-03enTextUCAM-CL-TR-488ISSN 1476-2986Designing a reliable publishing frameworkLee, Jong-HyeonUniversity of Cambridge, Computer Laboratory2000-04enTextUCAM-CL-TR-489ISSN 1476-2986
Due to the growth of the Internet and the widespread adoption of
easy-to-use web browsers, the web provides a new environment for
conventional as well as new businesses. Publishing on the web is a
fundamental and important means of supporting various activities on
the Internet such as commercial transactions, personal home page
publishing, medical information distribution, public key
certification and academic scholarly publishing. Along with the
dramatic growth of the web, the number of reported frauds is
increasing sharply. Since the Internet was not originally designed
for web publishing, it has some weaknesses that undermine its
reliability.
How can we rely on web publishing? In order to resolve this
question, we need to examine what makes people confident when
reading conventional publications printed on paper, to investigate
what attacks can erode confidence in web publishing, and to
understand the nature of publishing in general.
In this dissertation, we examine security properties and policy
models, and their applicability to publishing. We then investigate
the nature of publishing so that we can extract its technical
requirements. To help us understand the practical mechanisms which
might satisfy these requirements, some applications of electronic
publishing are discussed and some example mechanisms are presented.
We conclude that guaranteed integrity, verifiable authenticity and
persistent availability of publications are required to make web
publishing more reliable. Hence we design a framework that can
support these properties. To analyse the framework, we define a
security policy for web publishing that focuses on the guaranteed
integrity and authenticity of web publications, and then describe
some technical primitives that enable us to achieve our
requirements. Finally, the Jikzi publishing system—an implementation
of our framework—is presented with descriptions of its architecture
and possible applications.
Selective mesh refinement for renderingBrown, Peter John CameronUniversity of Cambridge, Computer Laboratory2000-04enTextUCAM-CL-TR-490ISSN 1476-2986
A key task in computer graphics is the rendering of complex models.
As a result, there exist a large number of schemes for improving the
speed of the rendering process, many of which involve displaying
only a simplified version of a model. When such a simplification is
generated selectively, i.e. detail is only removed in specific
regions of a model, we term this selective mesh refinement.
Selective mesh refinement can potentially produce a model
approximation which can be displayed at greatly reduced cost while
remaining perceptually equivalent to a rendering of the original.
For this reason, the field of selective mesh refinement has been the
subject of dramatically increased interest recently. The resulting
selective refinement methods, though, are restricted in both the
types of model which they can handle and the form of output meshes
which they can generate.
Our primary thesis is that a selectively refined mesh can be
produced by combining fragments of approximations to a model without
regard to the underlying approximation method. Thus we can utilise
existing approximation techniques to produce selectively refined
meshes in n-dimensions. This means that the capabilities and
characteristics of standard approximation methods can be retained in
our selectively refined models.
We also show that a selectively refined approximation produced in
this manner can be smoothly geometrically morphed into another
selective refinement in order to satisfy modified refinement
criteria. This geometric morphing is necessary to ensure that detail
can be added and removed from models which are selectively refined
with respect to their impact on the current view frustum. For
example, if a model is selectively refined in this manner and the
viewer approaches the model then more detail may have to be
introduced to the displayed mesh in order to ensure that it
satisfies the new refinement criteria. By geometrically morphing
this introduction of detail we can ensure that the viewer is not
distracted by “popping” artifacts.
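The morphing step itself reduces to interpolation of vertex positions;
a minimal sketch in Python (plain linear blending, a simplification of
the framework described here):

    # blend newly introduced vertices from coarse to refined positions;
    # alpha runs from 0 to 1 as the extra detail is brought in
    def geomorph(coarse, refined, alpha):
        return [tuple(c + alpha * (r - c) for c, r in zip(cv, rv))
                for cv, rv in zip(coarse, refined)]

    coarse = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    refined = [(0.0, 0.2, 0.0), (1.0, 0.1, 0.0)]
    for alpha in (0.0, 0.5, 1.0):
        print(geomorph(coarse, refined, alpha))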
We have developed a novel framework within which these proposals
have been verified. This framework consists of a generalised
resolution-based model representation, a means of specifying
refinement criteria and algorithms which can perform the selective
refinement and geometric morphing tasks. The framework has allowed
us to demonstrate that these twin tasks can be performed both on the
output of existing approximation techniques and with respect to a
variety of refinement criteria.
An HTML version of this thesis is at
http://www.cl.cam.ac.uk/research/rainbow/publications/pjcb/thesis/
Is hypothesis testing useful for subcategorization
acquisition?Korhonen, AnnaGorrell, GenevieveMcCarthy, DianaUniversity of Cambridge, Computer Laboratory2000-05enTextUCAM-CL-TR-491ISSN 1476-2986Nomadic Pict: language and infrastructure design for mobile
computationWojciechowski, Paweł TomaszUniversity of Cambridge, Computer Laboratory2000-06enTextUCAM-CL-TR-492ISSN 1476-2986
Mobile agents – units of executing computation that can migrate
between machines – are likely to become an important enabling
technology for future distributed systems. We study the distributed
infrastructures required for location-independent communication
between migrating agents. These infrastructures are problematic: the
choice or design of an infrastructure must be somewhat
application-specific – any given algorithm will only have
satisfactory performance for some range of migration and
communication behaviour; the algorithms must be matched to the
expected properties (and robustness demands) of applications and the
failure characteristic of the communication medium. To study this
problem we introduce an agent programming language – Nomadic Pict.
It is designed to allow infrastructure algorithms to be expressed
clearly, as translations from a high-level language to a lower
level. The levels are based on rigorously-defined process calculi,
which provide sharp levels of abstraction. In this dissertation we
describe the language and use it to develop a distributed
infrastructure for an example application. The language and examples
have been implemented; we conclude with a description of the
compiler and runtime system.
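The simplest infrastructure of this kind, a central daemon mapping
agent names to their current sites, can be caricatured in Python
(illustrative only; Nomadic Pict expresses such algorithms as
translations between process calculi):

    class Daemon:
        def __init__(self):
            self.site_of = {}                  # agent name -> current site

        def register(self, agent, site):
            self.site_of[agent] = site

        def migrate(self, agent, new_site):    # agents synchronise with the
            self.site_of[agent] = new_site     # daemon before moving

        def send(self, agent, msg):            # location-independent delivery
            print(f'deliver {msg!r} to {agent} at {self.site_of[agent]}')

    d = Daemon()
    d.register('agent1', 'siteA')
    d.send('agent1', 'hello')                  # routed to siteA
    d.migrate('agent1', 'siteB')
    d.send('agent1', 'hello again')            # now routed to siteB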
Inductive verification of cryptographic protocolsBella, GiampaoloUniversity of Cambridge, Computer Laboratory2000-07enTextUCAM-CL-TR-493ISSN 1476-2986
The dissertation aims at tailoring Paulson’s Inductive Approach for
the analysis of classical cryptographic protocols towards real-world
protocols. The aim is pursued by extending the approach with new
elements (e.g. timestamps and smart cards), new network events (e.g.
message reception) and more expressive functions (e.g. agents’
knowledge). Hence, the aim is achieved by analysing large protocols
(Kerberos IV and Shoup-Rubin), and by studying how to specify and
verify their goals.
More precisely, the modelling of timestamps and of a discrete time
are first developed on BAN Kerberos, while comparing the outcomes
with those of the BAN logic. The machinery is then applied to
Kerberos IV, whose complicated use of session keys requires a
dedicated treatment. Three new guarantees limiting the spy’s
abilities in case of compromise of a specific session key are
established. Also, it is discovered that Kerberos IV is subject to
an attack due to the weak guarantees of confidentiality for the
protocol responder.
We develop general strategies to investigate the goals of
authenticity, key distribution and non-injective agreement, which is
a strong form of authentication. These strategies require
formalising the agents’ knowledge of messages. Two approaches are
implemented. If an agent creates a message, then he knows all
components of the message, including the cryptographic key that
encrypts it. Alternatively, a broad definition of agents’ knowledge
can be developed if a new network event, message reception, is
formalised.
The concept of smart card as a secure device that can store
long-term secrets and perform easy computations is introduced. The
model cards can be stolen and/or cloned by the spy. The kernel of
their built-in algorithm works correctly, so the spy cannot acquire
unlimited knowledge from their use. However, their functional
interface is unreliable, so they send correct outputs in an
unspecified order. The provably secure protocol based on smart cards
designed by Shoup & Rubin is mechanised. Some design weaknesses
(unknown to the authors’ treatment by Bellare & Rogaway’s
approach) are unveiled, while feasible corrections are suggested and
verified.
We realise that the evidence that a protocol achieves its goals must
be available to the peers. In consequence, we develop a new
principle of prudent protocol design, goal availability, which holds
of a protocol when suitable guarantees confirming its goals exist on
assumptions that both peers can verify. Failure to observe our
principle raises the risk of attacks, as is the case, for example,
of the attack on Kerberos IV.
An architecture for the notification, storage and retrieval
of eventsSpiteri, Mark DavidUniversity of Cambridge, Computer Laboratory2000-07enTextUCAM-CL-TR-494ISSN 1476-2986Automatic recognition of words in Arabic
manuscriptsKhorsheed, Mohammad S.M.University of Cambridge, Computer Laboratory2000-07enTextUCAM-CL-TR-495ISSN 1476-2986
The need to transliterate large numbers of historic Arabic documents
into machine-readable form has motivated new work on offline
recognition of Arabic script. Arabic script presents two challenges:
orthography is cursive and letter shape is context sensitive.
This dissertation presents two techniques to achieve high word
recognition rates: the segmentation-free technique and the
segmentation-based technique. The segmentation-free technique treats
the word as a whole. The word image is first transformed into a
normalised polar image. The two-dimensional Fourier transform is
then applied to the polar image. This results in a Fourier spectrum
that is invariant to dilation, translation, and rotation. The
Fourier spectrum is used to form the word template, or train the
word model in the template-based and the multiple hidden Markov
model (HMM) recognition systems, respectively. The recognition of an
input word image is based on the minimum distance measure from the
word templates and the maximum likelihood probability for the word
models.
The segmentation-based technique uses a single hidden Markov model,
which is composed of multiple character-models. The technique
implements the analytic approach in which words are segmented into
smaller units, not necessarily characters. The word skeleton is
decomposed into a number of links in orthographic order; it is then
transferred into a sequence of discrete symbols using vector
quantisation. The training of each character-model is performed
using either: state assignment in the lexicon-driven configuration
or the Baum-Welch method in the lexicon-free configuration. The
observation sequence of the input word is given to the hidden Markov
model and the Viterbi algorithm is applied to provide an ordered
list of the candidate recognitions.
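A generic Viterbi decoder can be sketched in Python with toy parameters
(the thesis’s character-models and codebooks are trained from data; the
names below are invented):

    def viterbi(obs, states, start_p, trans_p, emit_p):
        """Most likely state path for an observation sequence under an HMM."""
        V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
        for o in obs[1:]:
            V.append({})
            for s in states:
                prob, path = max(
                    (V[-2][r][0] * trans_p[r][s] * emit_p[s][o],
                     V[-2][r][1] + [s]) for r in states)
                V[-1][s] = (prob, path)
        return max(V[-1].values())

    states = ('c1', 'c2')              # two hypothetical character-models
    start_p = {'c1': 0.6, 'c2': 0.4}
    trans_p = {'c1': {'c1': 0.7, 'c2': 0.3}, 'c2': {'c1': 0.4, 'c2': 0.6}}
    emit_p = {'c1': {'x': 0.5, 'y': 0.5}, 'c2': {'x': 0.1, 'y': 0.9}}
    print(viterbi(('x', 'y', 'y'), states, start_p, trans_p, emit_p))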
Contexts and embeddings for closed shallow action
graphsCattani, Gian LucaLeifer, James J.Milner, RobinUniversity of Cambridge, Computer Laboratory2000-07enTextUCAM-CL-TR-496ISSN 1476-2986Towards a formal type system for ODMG OQLBierman, G.M.Trigoni, A.University of Cambridge, Computer Laboratory2000-09enTextUCAM-CL-TR-497ISSN 1476-2986Applied π – a brief tutorialSewell, PeterUniversity of Cambridge, Computer Laboratory2000-07enTextUCAM-CL-TR-498ISSN 1476-2986
This note provides a brief introduction to π-calculi and their
application to concurrent and distributed programming. Chapter 1
introduces a simple π-calculus and discusses the choice of
primitives, operational semantics (in terms of reductions and of
indexed early labelled transitions), operational equivalences,
Pict-style programming and typing. Chapter 2 goes on to discuss the
application of these ideas to distributed systems, looking
informally at the design of distributed π-calculi with grouping and
interaction primitives. Chapter 3 returns to typing, giving precise
definitions for a simple type system and soundness results for the
labelled transition semantics. Finally, Chapters 4 and 5 provide a
model development of the metatheory, giving first an outline and
then detailed proofs of the results stated earlier. The note can be
read in the partial order 1.(2+3+4.5).
Enhancing spatial deformation for virtual
sculptingGain, James EdwardUniversity of Cambridge, Computer Laboratory2000-08enTextUCAM-CL-TR-499ISSN 1476-2986
The task of computer-based free-form shape design is fraught with
practical and conceptual difficulties. Incorporating elements of
traditional clay sculpting has long been recognised as a means of
shielding a user from the complexities inherent in this form of
modelling. The premise is to deform a mathematically-defined solid
in a fashion that loosely simulates the physical moulding of an
inelastic substance, such as modelling clay or silicone putty.
Virtual sculpting combines this emulation of clay sculpting with
interactive feedback.
Spatial deformations are a class of powerful modelling techniques
well suited to virtual sculpting. They indirectly reshape an object
by warping the surrounding space. This is analogous to embedding a
flexible shape within a lump of jelly and then causing distortions
by flexing the jelly. The user controls spatial deformations by
manipulating points, curves or a volumetric hyperpatch. Directly
Manipulated Free-Form Deformation (DMFFD), in particular, merges the
hyperpatch- and point-based approaches and allows the user to pick
and drag object points directly.
This thesis embodies four enhancements to the versatility and
validity of spatial deformation:
1. We enable users to specify deformations by manipulating the
normal vector and tangent plane at a point. A first derivative frame
can be tilted, twisted and scaled to cause a corresponding
distortion in both the ambient space and inset object. This enhanced
control is accomplished by extending previous work on bivariate
surfaces to trivariate hyperpatches.
2. We extend DMFFD to enable curve manipulation by exploiting
functional composition and degree reduction. Although the resulting
curve-composed DMFFD introduces some modest and bounded
approximation, it is superior to previous curve-based schemes in
other respects. Our technique combines all three forms of spatial
deformation (hyperpatch, point and curve), can maintain any desired
degree of derivative continuity, is amenable to the automatic
detection and prevention of self-intersection, and achieves
interactive update rates over the entire deformation cycle.
3. The approximation quality of a polygon-mesh object frequently
degrades under spatial deformation to become either oversaturated or
undersaturated with polygons. We have devised an efficient adaptive
mesh refinement and decimation scheme. Our novel contributions
include: incorporating fully symmetrical decimation, reducing the
computation cost of the refinement/decimation trigger, catering for
boundary and crease edges, and dealing with sampling problems.
4. The potential self-intersection of an object is a serious
weakness in spatial deformation. We have developed a variant of
DMFFD which guards against self-intersection by subdividing
manipulations into injective (one-to-one) mappings. This depends on
three novel contributions: analytic conditions for identifying
self-intersection, and two injectivity tests (one exact but
computationally costly and the other approximate but efficient).
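The flavour of spatial deformation is conveyed by classic trivariate
Bezier free-form deformation (Sederberg and Parry’s FFD, a much simpler
ancestor of the DMFFD variants above); a Python sketch:

    from math import comb

    def bernstein(n, i, t):
        return comb(n, i) * t**i * (1 - t)**(n - i)

    def ffd(point, lattice, n=2):
        """Warp a point in [0,1]^3 through an (n+1)^3 lattice of control points."""
        u, v, w = point
        out = [0.0, 0.0, 0.0]
        for i in range(n + 1):
            for j in range(n + 1):
                for k in range(n + 1):
                    b = (bernstein(n, i, u) * bernstein(n, j, v)
                         * bernstein(n, k, w))
                    for axis in range(3):
                        out[axis] += b * lattice[i][j][k][axis]
        return tuple(out)

    # an identity lattice, then one displaced control point makes a local bulge
    n = 2
    lattice = [[[(i/n, j/n, k/n) for k in range(n+1)] for j in range(n+1)]
               for i in range(n+1)]
    lattice[1][1][1] = (0.5, 0.5, 0.9)
    print(ffd((0.5, 0.5, 0.5), lattice))   # the embedded point moves too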
The memorability and security of passwords – some empirical
resultsYan, JianxinBlackwell, AlanAnderson, RossGrant, AlasdairUniversity of Cambridge, Computer Laboratory2000-09enTextUCAM-CL-TR-500ISSN 1476-2986
There are many things that are ‘well known’ about passwords, such as
that users can’t remember strong passwords and that the passwords
they can remember are easy to guess. However, there seems to be a
distinct lack of research on the subject that would pass muster by
the standards of applied psychology.
Here we report a controlled trial in which, of four sample groups of
about 100 first-year students, three were recruited to a formal
experiment and of these two were given specific advice about
password selection. The incidence of weak passwords was determined
by cracking the password file, and the number of password resets was
measured from system logs. We observed a number of phenomena which
run counter to the established wisdom. For example, passwords based
on mnemonic phrases are just as hard to crack as random passwords
yet just as easy to remember as naive user selections.
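The mnemonic-phrase advice is easily illustrated (a toy example, not
the study’s exact instructions to subjects):

    # take the initial characters of a memorable sentence; the password is
    # close to random text, yet recalled via the phrase
    phrase = "My dog's first name is Rex!"
    password = ''.join(word[0] for word in phrase.split())
    print(password)                    # 'MdfniR'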
Integrated quality of service managementIngram, DavidUniversity of Cambridge, Computer Laboratory2000-09enTextUCAM-CL-TR-501ISSN 1476-2986Formalizing basic number theoryRasmussen, Thomas MarthedalUniversity of Cambridge, Computer Laboratory2000-09enTextUCAM-CL-TR-502ISSN 1476-2986Hardware/software co-design using functional
languagesMycroft, AlanSharp, RichardUniversity of Cambridge, Computer Laboratory2000-09enTextUCAM-CL-TR-503ISSN 1476-2986
In previous work we have developed and prototyped a silicon compiler
which translates a functional language (SAFL) into hardware. Here we
present a SAFL-level program transformation which: (i) partitions a
specification into hardware and software parts and (ii) generates a
specialised architecture to execute the software part. The
architecture consists of a number of interconnected heterogeneous
processors. Our method allows a large design space to be explored by
systematically transforming a single SAFL specification to
investigate different points on the area-time spectrum.
Word sense selection in texts: an integrated
modelKwong, Oi YeeUniversity of Cambridge, Computer Laboratory2000-09enTextUCAM-CL-TR-504ISSN 1476-2986
Early systems for word sense disambiguation (WSD) often depended on
individual tailor-made lexical resources, hand-coded with as much
lexical information as needed, but of severely limited vocabulary
size. Recent studies tend to extract lexical information from a
variety of existing resources (e.g. machine-readable dictionaries,
corpora) for broad coverage. However, this raises the issue of how
to combine the information from different resources.
Thus while different types of resource could make different
contributions to WSD, studies to date have not shown what
contribution they make, how they should be combined, and whether
they are equally relevant to all words to be disambiguated. This
thesis proposes an Integrated Model as a framework to study the
inter-relatedness of three major parameters in WSD: Lexical
Resource, Contextual Information, and Nature of Target Words. We
argue that it is their interaction which shapes the effectiveness of
any WSD system.
A generalised, structurally-based sense-mapping algorithm was
designed to combine various types of lexical resource. This enables
information from these resources to be used simultaneously and
compatibly, while respecting their distinctive structures. In
studying the effect of context on WSD, different semantic relations
available from the combined resources were used, and a recursive
filtering algorithm was designed to overcome combinatorial
explosion. We then investigated, from two directions, how the target
words themselves could affect the usefulness of different types of
knowledge. In particular, we modelled WSD with the cloze test
format, i.e. as texts with blanks and all senses for one specific
word as alternative choices for filling the blank.
A full-scale combination of WordNet and Roget’s Thesaurus was done,
linking more than 30,000 senses. Using these two resources in
combination, a range of disambiguation tests was done on more than
60,000 noun instances from corpus texts of different types, and 60
blanks from real cloze texts. Results show that combining resources
is useful for enriching lexical information, and hence for making WSD
more effective, though not completely so. Also, different target words
make different demands on contextual information, and this
interaction is closely related to text types. Future work is
suggested for expanding the analysis on target nature and making the
combination of disambiguation evidence sensitive to the requirements
of the word being disambiguated.
Models for name-passing processes: interleaving and
causalCattani, Gian LucaSewell, PeterUniversity of Cambridge, Computer Laboratory2000-09enTextUCAM-CL-TR-505ISSN 1476-2986
We study syntax-free models for name-passing processes. For
interleaving semantics, we identify the indexing structure required
of an early labelled transition system to support the usual
π-calculus operations, defining Indexed Labelled Transition Systems.
For noninterleaving causal semantics we define Indexed Labelled
Asynchronous Transition Systems, smoothly generalizing both our
interleaving model and the standard Asynchronous Transition Systems
model for CCS-like calculi. In each case we relate a denotational
semantics to an operational view, for bisimulation and causal
bisimulation respectively. We establish completeness properties of,
and adjunctions between, categories of the two models. Alternative
indexing structures and possible applications are also discussed.
These are first steps towards a uniform understanding of the
semantics and operations of name-passing calculi.
Modules, abstract types, and distributed
versioningSewell, PeterUniversity of Cambridge, Computer Laboratory2000-09enTextUCAM-CL-TR-506ISSN 1476-2986
In a wide-area distributed system it is often impractical to
synchronise software updates, so one must deal with many coexisting
versions. We study static typing support for modular wide-area
programming, modelling separate compilation/linking and execution of
programs that interact along typed channels. Interaction may involve
communication of values of abstract types; we provide the developer
with fine-grain versioning control of these types to support
interoperation of old and new code. The system makes use of a
second-class module system with singleton kinds; we give a novel
operational semantics for separate compilation/linking and execution
and prove soundness.
Mechanizing a theory of program composition for
UNITYPaulson, LawrenceUniversity of Cambridge, Computer Laboratory2000-11enTextUCAM-CL-TR-507ISSN 1476-2986
Compositional reasoning must be better understood if non-trivial
concurrent programs are to be verified. Chandy and Sanders [2000]
have proposed a new approach to reasoning about composition, which
Charpentier and Chandy [1999] have illustrated by developing a large
example in the UNITY formalism. The present paper describes
extensive experiments on mechanizing the compositionality theory and
the example, using the proof tool Isabelle. Broader issues are
discussed, in particular, the formalization of program states. The
usual representation based upon maps from variables to values is
contrasted with the alternatives, such as a signature of typed
variables. Properties need to be transferred from one program
component’s signature to the common signature of the system. Safety
properties can be so transferred, but progress properties cannot be.
Using polymorphism, this problem can be circumvented by making
signatures sufficiently flexible. Finally the proof of the example
itself is outlined.
Shallow linear action graphs and their embeddingsLeifer, JamesMilner, RobinUniversity of Cambridge, Computer Laboratory2000-10enTextUCAM-CL-TR-508ISSN 1476-2986Proximity visualisation of abstract dataBasalaj, WojciechUniversity of Cambridge, Computer Laboratory2001-01enTextUCAM-CL-TR-509ISSN 1476-2986
Data visualisation is an established technique for exploration,
analysis and presentation of data. A graphical presentation is
generated from the data content, and viewed by an observer, engaging
vision – the human sense with the greatest bandwidth, and the
ability to recognise patterns subconsciously. For instance, a
correlation present between two variables can be elucidated with a
scatter plot. An effective visualisation can be difficult to achieve
for an abstract collection of objects, e.g. a database table with
many attributes, or a set of multimedia documents, since there is no
immediately obvious way of arranging the objects based on their
content. Thankfully, similarity between pairs of elements of such a
collection can be measured, and a good overview picture should
respect this proximity information, by positioning similar elements
close to one another, and far from dissimilar objects. The resulting
proximity visualisation is a topology preserving map of the
underlying data collection, and this work investigates various
methods for generating such maps. A number of algorithms are
devised, evaluated quantitatively by means of statistical inference,
and qualitatively in a case study for each type of data collection.
Other graphical representations for abstract data are surveyed and
compared to proximity visualisation.
A standard method for modelling proximity relations is
multidimensional scaling (MDS) analysis. The result is usually a
two- or three-dimensional configuration of points – each
representing a single element from a collection, with inter-point
distances approximating the corresponding proximities. The quality
of this approximation can be expressed as a loss function, and the
optimal arrangement can be found by minimising it numerically – a
procedure known as least-squares metric MDS. This work presents a
number of algorithmic instances of this problem, using established
function optimisation heuristics: Newton-Raphson, Tabu Search,
Genetic Algorithm, Iterative Majorization, and Simulated Annealing.
Their effectiveness at minimising the loss function is measured for
a representative sample of data collections, and the relative
ranking established. The popular classical scaling method serves as
a benchmark for this study.
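A bare-bones instance of least-squares metric MDS in Python (plain
gradient descent standing in for the heuristics compared in this work):

    import math, random

    def mds(d, dim=2, iters=2000, lr=0.01):
        """Minimise the squared error between distances and proximities."""
        n = len(d)
        x = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
        for _ in range(iters):
            for i in range(n):
                for j in range(n):
                    if i == j:
                        continue
                    diff = [x[i][a] - x[j][a] for a in range(dim)]
                    dist = math.sqrt(sum(c * c for c in diff)) or 1e-9
                    g = 2 * (dist - d[i][j]) / dist   # chain rule via dist
                    for a in range(dim):
                        x[i][a] -= lr * g * diff[a]
        return x

    # three objects whose proximities form a 3-4-5 triangle
    d = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
    pts = mds(d)
    print(math.dist(pts[0], pts[1]))   # approximately 3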
The computational cost of conventional MDS makes it unsuitable for
visualising a large data collection. Incremental multidimensional
scaling solves this problem by considering only a carefully chosen
subset of all pairwise proximities. Elements that make up cluster
diameters at a certain level of the single link cluster hierarchy
are identified, and are subject to standard MDS, in order to
establish the overall shape of the configuration. The remaining
elements are positioned independently of one another with respect to
this skeleton configuration. For very large collections the skeleton
configuration can itself be built up incrementally. The incremental
method is analysed for the compromise between solution quality and
the proportion of proximities used, and compared to Principal
Components Analysis on a number of large database tables.
In some applications it is convenient to represent individual
objects by compact icons of fixed size, for example the use of
thumbnails when visualising a set of images. Because the MDS
analysis only takes the position of icons into account, and not
their size, its direct use for visualisation may lead to partial or
complete overlap of icons. Proximity grid – an analogue of MDS in a
discrete domain – is proposed to overcome this deficiency. Each
element of an abstract data collection is represented within a
single cell of the grid, and thus considerable detail can be shown
without overlap. The proximity relationships are preserved by
clustering similar elements in the grid, and keeping dissimilar ones
apart. Algorithms for generating such an arrangement are presented
and compared in terms of output quality to one another as well as
standard MDS.
Switchlets and resource-assured MPLS networksMortier, RichardIsaacs, RebeccaFraser, KeirUniversity of Cambridge, Computer Laboratory2000-05enTextUCAM-CL-TR-510ISSN 1476-2986
MPLS (Multi-Protocol Label Switching) is a technology with the
potential to support multiple control systems, each with guaranteed
QoS (Quality of Service), on connectionless best-effort networks.
However, it does not provide all the capabilities required of a
multi-service network. In particular, although resource-assured VPNs
(Virtual Private Networks) can be created, there is no provision for
inter-VPN resource management. Control flexibility is limited
because resources must be pinned down to be guaranteed, and
best-effort flows in different VPNs compete for the same resources,
leading to QoS crosstalk.
The contribution of this paper is an implementation on MPLS of a
network control framework that supports inter-VPN resource
management. Using resource partitions known as switchlets, it allows
the creation of multiple VPNs with guaranteed resource allocations,
and maintains isolation between these VPNs. Devolved control
techniques permit each VPN a customised control system.
We motivate our work by discussing related efforts and example
scenarios of effective deployment of our system. The implementation
is described and evaluated, and we address interoperability with
external IP control systems, in addition to interoperability of data
across different layer 2 technologies.
Software visualization in PrologGrant, CalumUniversity of Cambridge, Computer Laboratory1999-12enTextUCAM-CL-TR-511ISSN 1476-2986
Software visualization (SV) uses computer graphics to communicate
the structure and behaviour of complex software and algorithms. One
of the important issues in this field is how to specify SV, because
existing systems are very cumbersome to specify and implement, which
limits their effectiveness and hinders SV from being integrated into
professional software development tools.
In this dissertation the visualization process is decomposed into a
series of formal mappings, which provides a formal foundation, and
allows separate aspects of visualization to be specified
independently. The first mapping specifies the information content
of each view. The second mapping specifies a graphical
representation of the information, and a third mapping specifies the
graphical components that make up the graphical representation. By
combining different mappings, completely different views can be
generated.
The approach has been implemented in Prolog to provide a very high
level specification language for information visualization, and a
knowledge engineering environment that allows data queries to tailor
the information in a view. The output is generated by a graphical
constraint solver that assembles the graphical components into a
scene.
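As a rough illustration of the three-mapping decomposition (in Python
rather than Prolog, and with invented data structures), each stage
below can be replaced independently to yield a different view:

```python
def information_mapping(program):
    # First mapping: select the information content of the view;
    # here, caller/callee pairs from a toy program model.
    return [(f, g) for f, callees in program.items() for g in callees]

def representation_mapping(facts):
    # Second mapping: choose a graphical representation of that
    # information; call pairs become a directed graph.
    nodes = sorted({n for pair in facts for n in pair})
    return {"nodes": nodes, "edges": facts}

def component_mapping(rep):
    # Third mapping: expand the representation into primitive graphical
    # components for a constraint solver to assemble into a scene.
    return ([("box", n) for n in rep["nodes"]] +
            [("arrow", a, b) for a, b in rep["edges"]])

program = {"main": ["parse", "render"], "parse": ["lex"]}
scene = component_mapping(representation_mapping(information_mapping(program)))
```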
This system provides a framework for SV called Vmax. Source code and
run-time data are analyzed by Prolog to provide access to
information about the program structure and run-time data for a wide
range of highly interconnected browsable views. Different views and
means of visualization can be selected from menus. An automatic
legend describes each view, and can be interactively modified to
customize how data is presented. A text window for editing source
code is synchronized with the graphical view. Vmax is a complete
Java development environment and end user SV system.
Vmax compares favourably with existing SV systems on many taxonomic
criteria, including automation, scope, information content,
graphical output form, specification, tailorability, navigation,
granularity and elision control. The performance and scalability of
the new approach are very reasonable.
We conclude that Prolog provides a formal and high level
specification language that is suitable for specifying all aspects
of an SV system.
An algebraic framework for modelling and verifying
microprocessors using HOLFox, AnthonyUniversity of Cambridge, Computer Laboratory2001-03enTextUCAM-CL-TR-512ISSN 1476-2986
This report describes an algebraic approach to the specification and
verification of microprocessor designs. Key results are expressed
and verified using the HOL proof tool. Particular attention is paid
to the models of time and temporal abstraction, culminating in a
number of one-step theorems. This work is then explained with a
small but complete case study, which verifies the correctness of a
datapath with microprogram control.
Generic summaries for indexing in information retrieval –
Detailed test resultsSakai, TetsuyaSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory2001-05enTextUCAM-CL-TR-513ISSN 1476-2986
This paper examines the use of generic summaries for indexing in
information retrieval. Our main observations are that:
– With or without pseudo-relevance feedback, a summary index may be
as effective as the corresponding fulltext index for
precision-oriented search of highly relevant documents. But a
reasonably sophisticated summarizer, using a compression ratio of
10–30%, is desirable for this purpose.
– In pseudo-relevance feedback, using a summary index at initial
search and a fulltext index at final search is possibly effective
for precision-oriented search, regardless of relevance levels. This
strategy is significantly more effective than the one using the
summary index only and probably more effective than using summaries
as mere term selection filters. For this strategy, the summary
quality is probably not a critical factor, and a compression ratio
of 5–10% appears best.
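A minimal sketch of the second strategy, assuming bag-of-words indexes
and a simple term-overlap score (both stand-ins for the paper's actual
retrieval machinery):

```python
from collections import Counter

def search(index, query, k=None):
    # `index` maps document id -> Counter of terms.
    scores = {d: sum(terms[t] for t in query) for d, terms in index.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k] if k else ranked

def summary_then_fulltext(summary_index, fulltext_index, query,
                          fb_docs=5, fb_terms=10):
    # Initial search against the summary index...
    top = search(summary_index, query, k=fb_docs)
    # ...expand the query with frequent terms from the top summaries
    # (pseudo-relevance feedback)...
    pool = Counter()
    for d in top:
        pool.update(summary_index[d])
    expanded = set(query) | {t for t, _ in pool.most_common(fb_terms)}
    # ...then run the final search against the fulltext index.
    return search(fulltext_index, expanded)
```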
Nomadic π-calculi: Expressing and verifying communication
infrastructure for mobile computationUnyapoth, AsisUniversity of Cambridge, Computer Laboratory2001-06enTextUCAM-CL-TR-514ISSN 1476-2986
This thesis addresses the problem of verifying distributed
infrastructure for mobile computation. In particular, we study
language primitives for communication between mobile agents. They
can be classified into two groups. At a low level there are
“location dependent” primitives that require a programmer to know
the current site of a mobile agent in order to communicate with it.
At a high level there are “location independent” primitives that
allow communication with a mobile agent irrespective of any
migrations. Implementation of the high level requires delicate
distributed infrastructure algorithms. In earlier work of Sewell,
Wojciechowski and Pierce, the two levels were made precise as
process calculi, allowing such algorithms to be expressed as
encodings of the high level into the low level; a distributed
programming language “Nomadic Pict” has been built for experimenting
with such encodings.
This thesis turns to semantics, giving a definition of the core
language (with a type system) and proving correctness of an example
infrastructure. This involves extending the standard semantics and
proof techniques of process calculi to deal with the new notions of
sites and agents. The techniques adopted include labelled transition
semantics, operational equivalences and preorders (e.g., expansion
and coupled simulation), “up to” equivalences, and uniform
receptiveness. We also develop two novel proof techniques for
capturing the design intuitions regarding mobile agents: we consider
“translocating” versions of operational equivalences that take
migration into account, allowing compositional reasoning; and
“temporary immobility”, which captures the intuition that while an
agent is waiting for a lock somewhere in the system, it will not
migrate.
The correctness proof of an example infrastructure is non-trivial.
It involves analysing the possible reachable states of the encoding
applied to an arbitrary high-level source program. We introduce an
intermediate language for factoring out as many ‘house-keeping’
reduction steps as possible, and focusing on the partially-committed
steps.
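For flavour, a toy central-server infrastructure of the kind such
encodings formalise (a Python sketch, not Nomadic Pict): a
location-independent send reduces to a site lookup followed by a
location-dependent delivery, and an agent that has asked to migrate
stays put until the server acknowledges, which is the "temporary
immobility" intuition.

```python
class Site:
    def __init__(self, name):
        self.name, self.mailbox = name, {}     # agent -> pending messages

class CentralServer:
    """Toy central-forwarding-server: tracks each agent's current site so
    that location-independent communication can be implemented with
    location-dependent primitives. Illustrative only."""
    def __init__(self):
        self.site_of = {}

    def register(self, agent, site):
        self.site_of[agent] = site

    def migrate(self, agent, new_site):
        # The agent blocks until this returns: while waiting for the
        # server's acknowledgement it cannot migrate again.
        self.site_of[agent] = new_site

    def send(self, agent, msg):
        site = self.site_of[agent]                       # lookup, then
        site.mailbox.setdefault(agent, []).append(msg)   # LD delivery
```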
The UDP calculus: rigorous semantics for real
networkingSerjantov, AndreiSewell, PeterWansbrough, KeithUniversity of Cambridge, Computer Laboratory2001-07enTextUCAM-CL-TR-515ISSN 1476-2986
Dynamic provisioning of resource-assured and programmable
virtual private networksIsaacs, RebeccaUniversity of Cambridge, Computer Laboratory2001-09enTextUCAM-CL-TR-516ISSN 1476-2986
Virtual Private Networks (VPNs) provide dedicated connectivity to a
closed group of users on a shared network. VPNs have traditionally
been deployed for reasons of economy of scale, but have either been
statically defined, requiring manual configuration, or else unable
to offer any quality of service (QoS) guarantees.
This dissertation describes VServ, a service offering dynamic and
resource-assured VPNs that can be acquired and modified on demand.
In VServ, a VPN comprises both a subset of physical resources, such
as bandwidth and label space, and the means to perform fine-grained
management of those resources. This network
programmability, combined with QoS guarantees, enables the
multiservice network – a single universal network that can support
all types of service and thus be efficient, cost-effective and
flexible.
VServ is deployed over a network control framework known as Tempest.
The Tempest explicitly distinguishes between inter- and intra-VPN
resource management mechanisms. This makes the dynamic resource
reallocation capabilities of VServ viable, whilst handling highly
dynamic VPNs or a large number of VPNs. Extensions to the original
implementation of the Tempest to support dynamically reconfigurable
QoS are detailed.
A key part of a dynamic and responsive VPN service is fully
automated VPN provisioning. A notation for VPN specification is
described, together with mechanisms for incorporating policies of
the service provider and the current resource availability in the
network into the design process. The search for a suitable VPN
topology can be expressed as an optimisation problem that is not
computationally tractable except for very small networks. This
dissertation describes how the search is made practical by tailoring
it according to the characteristics of the desired VPN.
Availability of VServ is addressed with a proposal for distributed
VPN creation. A resource revocation protocol exploits the dynamic
resource management capabilities of VServ to allow adaptation in the
control plane on a per-VPN basis. Managed resource revocation
supports highly flexible resource allocation and reallocation
policies, allowing VServ to efficiently provision for short-lived or
highly dynamic VPNs.
The Cambridge Multimedia Document Retrieval Project: summary
of experimentsSpärck Jones, KarenJourlin, P.Johnson, S.E.Woodland, P.C.University of Cambridge, Computer Laboratory2001-07enTextUCAM-CL-TR-517ISSN 1476-2986
This report summarises the experimental work done under the
Multimedia Document Retrieval (MDR) project at Cambridge from
1997-2000, with selected illustrations. The focus is primarily on
retrieval studies, and on speech tests directly related to
retrieval, not on speech recognition itself. The report draws on the
many and varied tests done during the project, but also presents a
new series of results designed to compare strategies across as many
different data sets as possible by using consistent system parameter
settings.
The project tests demonstrate that retrieval from files of audio
news material transcribed using a state of the art speech
recognition system can match the reference level defined by human
transcriptions; and that expansion techniques, especially when
applied to queries, can be a very effective means of improving basic
search performance.
An attack on a traitor tracing schemeYan, Jeff JianxinWu, YongdongUniversity of Cambridge, Computer Laboratory2001-07enTextUCAM-CL-TR-518ISSN 1476-2986
In Crypto’99, Boneh and Franklin proposed a public key traitor
tracing scheme, which was believed to be able to catch all traitors
while not accusing any innocent users (i.e., full-tracing and
error-free). Assuming that the Decision Diffie-Hellman problem is
unsolvable in Gq, Boneh and Franklin proved that a decoder cannot
distinguish valid ciphertexts from invalid ones that are used for
tracing. However, our novel pirate decoder P3 manages to make some
invalid ciphertexts distinguishable without violating their
assumption, and it can also frame innocent user coalitions to fool
the tracer. Neither the single-key nor arbitrary pirate tracing
algorithm presented in [1] can identify all keys used by P3 as
claimed. Instead, it is possible for both algorithms to catch none
of the traitors. We believe that the construction of our novel
pirate also demonstrates a simple way to defeat some other black-box
traitor tracing schemes in general.
Local evidence in document retrievalChoquette, MartinUniversity of Cambridge, Computer Laboratory2001-08enTextUCAM-CL-TR-519ISSN 1476-2986
Ternary and three-point univariate subdivision
schemesHassan, MohamedDodgson, Neil A.University of Cambridge, Computer Laboratory2001-09enTextUCAM-CL-TR-520ISSN 1476-2986
The generating function formalism is used to analyze the continuity
properties of univariate ternary subdivision schemes. These are
compared with their binary counterparts.
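In outline, and in one standard notation that may differ from the
report's: if the level-k control points have generating function
P^[k](z) and the ternary scheme has mask a(z), then refinement and the
usual smoothing-factor decomposition read

```latex
\[
P^{[k+1]}(z) = a(z)\,P^{[k]}(z^{3}),
\qquad
a(z) = \left(\frac{1+z+z^{2}}{3}\right)^{m} b(z),
\]
```

after which continuity is investigated via the scheme associated with
b(z), just as the factor (1+z)/2 plays this role in the binary case.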
Operational congruences for reactive systemsLeifer, JamesUniversity of Cambridge, Computer Laboratory2001-09enTextUCAM-CL-TR-521ISSN 1476-2986
Practical behavioural animation based on vision and
attentionGillies, Mark F.P. University of Cambridge, Computer Laboratory2001-09enTextUCAM-CL-TR-522ISSN 1476-2986
The animation of human-like characters is a vital aspect of computer
animation; most animations rely heavily on characters of some sort
or other. One important aspect of computer animation research is
therefore to improve the animation of these characters, both by
making animations easier to produce and by improving the quality of
the animation produced. One approach is to simulate the behaviour of
the characters, so that the characters are animated automatically.
The dissertation investigates the simulation of behaviour in
practical applications. In particular it focuses on models of visual
perception for use in simulating human behaviour. A simulation of
perception is vital for any character that interacts with its
surroundings. Two main aspects of the simulation of perception are
investigated:
– The use of psychology for designing visual algorithms.
– The simulation of attention in order to produce both behaviour and
gaze patterns.
Psychological theories are a useful starting point for designing
algorithms for simulating visual perception. The dissertation
investigates their use and presents some algorithms based on
psychological theories.
Attention is the focusing of a person’s perception on a particular
object. The dissertation presents a simulation of what a character
is attending to (looking at). This is used to simulate behaviour and
for animating eye movements.
The algorithms for the simulation of vision and attention are
applied to two tasks in the simulation of behaviour. The first is a
method for designing generic behaviour patterns from simple pieces
of motion. The second is a behaviour pattern for navigating a
cluttered environment. The simulation of vision and attention gives
advantages over existing work on both problems. The approaches to
the simulation of perception will be evaluated in the context of
these examples.
Bigraphical reactive systems: basic theoryMilner, RobinUniversity of Cambridge, Computer Laboratory2001-09enTextUCAM-CL-TR-523ISSN 1476-2986
A notion of bigraph is proposed as the basis for a model of mobile
interaction. A bigraph consists of two independent structures: a
topograph representing locality and a monograph representing
connectivity. Bigraphs are equipped with reaction rules to form
bigraphical reactive systems (BRSs), which include versions of the
π-calculus and the ambient calculus. Bigraphs are shown to be a
special case of a more abstract notion, wide reactive systems
(WRSs), not assuming any particular graphical or other structure but
equipped with a notion of width, which expresses that agents,
contexts and reactions may all be widely distributed entities.
A behavioural theory is established for WRSs using the categorical
notion of relative pushout; it allows labelled transition systems to
be derived uniformly, in such a way that familiar behavioural
preorders and equivalences, in particular bisimilarity, are
congruential under certain conditions. Then the theory of bigraphs
is developed, and they are shown to meet these conditions. It is
shown that, using certain functors, other WRSs which meet the
conditions may also be derived; these may, for example, be forms of
BRS with additional structure.
Simple examples of bigraphical systems are discussed; the theory is
developed in a number of ways in preparation for deeper application
studies.
Verifying the SET purchase protocolsBella, GiampaoloMassacci, FabioPaulson, Lawrence C.University of Cambridge, Computer Laboratory2001-11enTextUCAM-CL-TR-524ISSN 1476-2986
The Secure Electronic Transaction (SET) protocol has been proposed
by a consortium of credit card companies and software corporations
to guarantee the authenticity of e-commerce transactions and the
confidentiality of data. When the customer makes a purchase, the SET
dual signature keeps his account details secret from the merchant
and his choice of goods secret from the bank. This paper reports
verification results for the purchase step of SET, using the
inductive method. The credit card details do remain confidential.
The customer, merchant and bank can confirm most details of a
transaction even when some of those details are kept from them. The
usage of dual signatures requires repetition in protocol messages,
making proofs more difficult but still feasible. The formal analysis
has revealed a significant defect. The dual signature lacks
explicitness, giving rise to potential vulnerabilities.
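For reference, the dual-signature construction itself is simple to
state. The sketch below uses a modern hash as a stand-in for SET's
actual primitives and takes abstract `sign`/`verify` functions as
parameters; it shows why each party can verify the signature without
seeing the other party's half of the data.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()   # stand-in for SET's hash

def dual_signature(sign, payment_info: bytes, order_info: bytes):
    # The customer signs the hash of the two concatenated digests.
    # The bank receives PI plus h(OI); the merchant receives OI plus h(PI).
    pi_d, oi_d = h(payment_info), h(order_info)
    return sign(h(pi_d + oi_d)), pi_d, oi_d

def merchant_checks(verify, ds, order_info: bytes, pi_digest: bytes):
    # The merchant never sees the account details, only their digest.
    return verify(ds, h(pi_digest + h(order_info)))
```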
Extensible virtual machinesHarris, Timothy L.University of Cambridge, Computer Laboratory2001-12enTextUCAM-CL-TR-525ISSN 1476-2986
Virtual machines (VMs) have enjoyed a resurgence as a way of
allowing the same application program to be used across a range of
computer systems. This flexibility comes from the abstraction that
the VM provides over the native interface of a particular computer.
However, this also means that the application is prevented from
taking the features of particular physical machines into account in
its implementation.
This dissertation addresses the question of why, where and how it is
useful, possible and practicable to provide an application with
access to lower-level interfaces. It argues that many aspects of
implementation can be devolved safely to untrusted applications and
demonstrates this through a prototype which allows control over
run-time compilation, object placement within the heap and thread
scheduling. The proposed architecture separates these
application-specific policy implementations from the application
itself. This allows one application to be used with different
policies on different systems and also allows naïve or premature
optimizations to be removed.
Extending lossless image compressionPenrose, Andrew J.University of Cambridge, Computer Laboratory2001-12enTextUCAM-CL-TR-526ISSN 1476-2986
“It is my thesis that worthwhile improvements can be made to
lossless image compression schemes, by considering the correlations
between the spectral, temporal and inter-view aspects of image data,
in extension to the spatial correlations that are traditionally
exploited.”
Images are an important part of today’s digital world. However, due
to the large quantity of data needed to represent modern imagery the
storage of such data can be expensive. Thus, work on efficient image
storage (image compression) has the potential to reduce storage
costs and enable new applications.
Many image compression schemes are lossy; that is, they sacrifice
image information to achieve very compact storage. Although this is
acceptable for many applications, some environments require that
compression not alter the image data. This lossless image
compression has uses in medical, scientific and professional video
processing applications.
Most of the work on lossless image compression has focused on
monochrome images and has made use of the spatial smoothness of
image data. Only recently have researchers begun to look
specifically at the lossless compression of colour images and video.
By extending compression schemes for colour images and video, the
storage requirements for these important classes of image data can
be further reduced.
Much of the previous research into lossless colour image and video
compression has been exploratory. This dissertation studies the
problem in a structured way. Spatial, spectral and temporal
correlations are all considered to facilitate improved compression.
This has led to a greater data reduction than many existing schemes
for lossless colour image and colour video compression.
Furthermore, this work has considered the application of extended
lossless image coding to more recent image types, such as multiview
imagery. Thus, systems that use multiple views of the same scene to
provide 3D viewing have been provided with a completely novel
solution for the compression of multiview colour video.
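As a minimal illustration of combining spatial and spectral
correlation (not the dissertation's actual predictors), one can
predict one colour plane spatially and the others spectrally, then
losslessly entropy-code only the residuals:

```python
import numpy as np

def spatio_spectral_residuals(img):
    """`img` is an (H, W, 3) uint8 RGB array. Predict green from its
    left neighbour (spatial), then red and blue from co-located green
    (spectral); the int16 residuals are what an entropy coder would
    store. A decoder inverts the same predictions exactly."""
    img = img.astype(np.int16)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    g_pred = np.zeros_like(g)
    g_pred[:, 1:] = g[:, :-1]
    return g - g_pred, r - g, b - g
```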
Architectures for ubiquitous systemsSaif, UmarUniversity of Cambridge, Computer Laboratory2002-01enTextUCAM-CL-TR-527ISSN 1476-2986
Advances in digital electronics over the last decade have made
computers faster, cheaper and smaller. This coupled with the
revolution in communication technology has led to the development of
sophisticated networked appliances and handheld devices. “Computers”
are no longer boxes sitting on a desk, they are all around us,
embedded in every nook and corner of our environment. This
increasing complexity in our environment leads to the desire to
design a system that could allow this pervasive functionality to
disappear into the infrastructure, automatically carrying out the
everyday tasks of its users.
Such a system would enable devices embedded in the environment to
cooperate with one another to make a wide range of new and useful
applications possible, not originally conceived by the manufacturer,
to achieve greater functionality, flexibility and utility.
The compelling question then becomes “what software needs to be
embedded in these devices to enable them to participate in such a
ubiquitous system”? This is the question addressed by the
dissertation.
Based on the experience with home automation systems, as part of the
AutoHAN project, the dissertation presents two compatible but
different architectures; one to enable dumb devices to be controlled
by the system and the other to enable intelligent devices to
control, extend and program the system.
Control commands for dumb devices are managed using an HTTP-based
publish/subscribe/notify architecture; devices publish their control
commands to the system as XML-typed discrete messages, applications
discover and subscribe interest in these events to send and receive
control commands from these devices, as typed messages, to control
their behavior. The architecture handles mobility and failure of
devices by using soft-state, redundant subscriptions and “care-of”
nodes. The system is programmed with event scripts that encode
automation rules as condition-action bindings. Finally, the use of
XML and HTTP allows devices to be controlled by a simple Internet
browser.
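A minimal sketch of such an event broker (illustrative Python; the
actual system exchanges XML-typed messages over HTTP and ages
registrations as soft state):

```python
class EventBroker:
    """Toy publish/subscribe/notify core: devices publish typed events,
    applications subscribe to an event type and are notified. Soft-state
    expiry, redundant subscriptions and care-of nodes are omitted."""
    def __init__(self):
        self.subscribers = {}              # event type -> callbacks

    def subscribe(self, event_type, callback):
        self.subscribers.setdefault(event_type, []).append(callback)

    def publish(self, event_type, payload):
        for notify in self.subscribers.get(event_type, []):
            notify(payload)

broker = EventBroker()
# An automation rule expressed as a condition-action binding on an event:
broker.subscribe("lamp.switch", lambda e: print("lamp ->", e["state"]))
broker.publish("lamp.switch", {"state": "on"})
```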
While publish/subscribe/notify defines a simple architecture to
enable interoperability of limited-capability devices, intelligent
devices can afford more complexity that can be utilized to support
user applications and services to control, manage and program the
system. However, the operating system embedded in these devices
needs to address the heterogeneity, longevity, mobility and dynamism
of the system.
The dissertation presents the architecture of an embedded
distributed operating system that lends itself to safe
context-driven adaptation. The operating system is instrumented with
four artifacts to address the challenges posed by a ubiquitous
system. 1) An XML-based directory service captures and notifies the
applications and services about changes in the device context, as
resources move, fail, leave or join the system, to allow
context-driven adaptation. 2) A Java-based mobile agent system
allows new software to be injected in the system and moved and
replicated with the changing characteristics of the system to define
a self-organizing system. 3) A subscribe/notify interface allows
context-specific extensions to be dynamically added to the operating
system to enable it to efficiently interoperate in its current
context according to application requirements. 4) Finally, a
Dispatcher module serves as the context-aware system call interface
for the operating system; when requested to invoke a service, the
Dispatcher invokes the resource that best satisfies the requirements
given the characteristics of the system.
Definition alone is not sufficient to prove the validity of an
architecture. The dissertation therefore describes a prototype
implementation of the operating system and presents both a
quantitative comparison of its performance with related systems and
its qualitative merit by describing new applications made possible
by its novel architecture.
Measurement-based management of network resourcesMoore, Andrew WilliamUniversity of Cambridge, Computer Laboratory2002-04enTextUCAM-CL-TR-528ISSN 1476-2986
Measurement-Based Estimators are able to characterise data flows,
enabling improvements to existing management techniques and giving
access to management techniques that were previously impossible. It
is the thesis of this dissertation that, in addition to enabling
practical adaptive management schemes, measurement-based estimators
can themselves be practical within current resource limitations.
Examples of network management include the characterisation of
current utilisation for explicit admission control and the
configuration of a scheduler to divide link-capacity among competing
traffic classes. Without measurements, these management techniques
have relied upon the accurate characterisation of traffic – without
accurate traffic characterisation, network resources may be under or
over utilised.
Embracing Measurement-Based Estimation in admission control,
Measurement-Based Admission Control (MBAC) algorithms have allowed
characterisation of new traffic flows while adapting to changing
flow requirements. However, there have been many MBAC algorithms
proposed, often with no clear differentiation between them. This has
motivated the need for a realistic, implementation-based comparison
in order to identify an ideal MBAC algorithm.
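As an example of the genre (one simple member of the family compared,
not a specific algorithm from the dissertation), a measured-sum
scheme admits a flow when the measured load plus the flow's declared
rate fits under a utilisation target:

```python
class MeasuredSumMBAC:
    """Toy measured-sum admission control. Note the single EWMA weight:
    it ties the estimator to one measurement timescale, which is exactly
    the kind of limitation the comparison examines."""
    def __init__(self, capacity, target=0.9, alpha=0.1):
        self.capacity, self.target, self.alpha = capacity, target, alpha
        self.load = 0.0

    def measure(self, sample):
        # Exponentially weighted moving average over periodic load samples.
        self.load += self.alpha * (sample - self.load)

    def admit(self, declared_rate):
        return self.load + declared_rate <= self.target * self.capacity
```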
This dissertation reports on an implementation-based comparison of
MBAC algorithms conducted using a purpose built test environment.
The use of an implementation-based comparison has allowed the MBAC
algorithms to be tested under realistic conditions of traffic load
and realistic limitations on memory, computational resources and
measurements. Alongside this comparison is a decomposition of a
group of MBAC algorithms, illustrating the relationship among MBAC
algorithm components, as well as highlighting common elements among
different MBAC algorithms.
The MBAC algorithm comparison reveals that, while no single
algorithm is ideal, the specific resource demands, such as
computation overheads, can dramatically impact on the MBAC
algorithm’s performance. Further, due to the multiple timescales
present in both traffic and management, the estimator of a robust
MBAC algorithm must base its estimate on measurements made over a
wide range of timescales. Finally, a reliable estimator must account
for the error resulting from random properties of measurements.
Having further identified that the estimator components used in MBAC
algorithms need not be tied to the admission control problem, one of
the estimators (originally constructed as part of an MBAC algorithm)
is used to continuously characterise resource requirements for a
number of classes of traffic. Continuous characterisation of
traffic, whether requiring similar or orthogonal resources, leads to
the construction and demonstration of a network switch that is able
to provide differentiated service while being adaptive to the
demands of each traffic class. The dynamic allocation of resources
is an approach unique to a measurement-based technique that would
not be possible if resources were based upon static declarations of
requirement.
The triVM intermediate language reference manualJohnson, NeilUniversity of Cambridge, Computer Laboratory2002-02enTextUCAM-CL-TR-529ISSN 1476-2986
The triVM intermediate language has been developed as part of a
research programme concentrating on code space optimization. The
primary aim in developing triVM is to provide a language that
removes the complexity of high-level languages, such as C or ML,
while maintaining sufficient detail, at as simple a level as
possible, to support research and experimentation into code size
optimization. The basic structure of triVM is a notional Static
Single Assignment-based three-address machine. A secondary aim is to
develop an intermediate language that supports graph-based
translation, using graph rewrite rules, in a textual, human-readable
format. Experience has shown that text-format intermediate files are
much easier to use for experimentation, while the penalty in
translating this human-readable form to the internal data structures
used by the software is negligible. Another aim is to provide a
flexible language in which features and innovations can be
evaluated; for example, this is one of the first intermediate
languages directly based on the Static Single Assignment technique,
and which explicitly exposes the condition codes as a result of
arithmetic operations. While this paper is concerned solely with the
description of triVM, we present a brief summary of other
research-orientated intermediate languages.
Subcategorization acquisitionKorhonen, AnnaUniversity of Cambridge, Computer Laboratory2002-02enTextUCAM-CL-TR-530ISSN 1476-2986
Manual development of large subcategorised lexicons has proved
difficult because predicates change behaviour between sublanguages,
domains and over time. Yet access to a comprehensive
subcategorization lexicon is vital for successful parsing capable of
recovering predicate-argument relations, and probabilistic parsers
would greatly benefit from accurate information concerning the
relative likelihood of different subcategorisation frames (SCFs) of a
given predicate. Acquisition of subcategorization lexicons from
textual corpora has recently become increasingly popular. Although
this work has met with some success, resulting lexicons indicate a
need for greater accuracy. One significant source of error lies in
the statistical filtering used for hypothesis selection, i.e. for
removing noise from automatically acquired SCFs.
This thesis builds on earlier work in verbal subcategorization
acquisition, taking as a starting point the problem with statistical
filtering. Our investigation shows that statistical filters tend to
work poorly because not only is the underlying distribution Zipfian,
but there is also very little correlation between conditional
distribution of SCFs specific to a verb and unconditional
distribution regardless of the verb. More accurate back-off
estimates are needed for SCF acquisition than those provided by
unconditional distribution.
We explore whether more accurate estimates could be obtained by
basing them on linguistic verb classes. Experiments are reported
which show that in terms of SCF distributions, individual verbs
correlate more closely with syntactically similar verbs and even
more closely with semantically similar verbs, than with all verbs in
general. On the basis of this result, we suggest classifying verbs
according to their semantic classes and obtaining back-off estimates
specific to these classes.
We propose a method for obtaining such semantically based back-off
estimates, and a novel approach to hypothesis selection which makes
use of these estimates. This approach involves automatically
identifying the semantic class of a predicate, using
subcategorization acquisition machinery to hypothesise conditional
SCF distribution for the predicate, smoothing the conditional
distribution with the back-off estimates of the respective semantic
verb class, and employing a simple method for filtering, which uses
a threshold on the estimates from smoothing. Adopting Briscoe and
Carroll’s (1997) system as a framework, we demonstrate that this
semantically-driven approach to hypothesis selection can
significantly improve the accuracy of large-scale subcategorization
acquisition.
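The smoothing-and-filtering step can be pictured as follows (a Python
sketch with illustrative interpolation weight and threshold, not the
thesis's actual values or estimation machinery):

```python
def smooth_and_filter(verb_scf_dist, class_backoff, lam=0.7, thresh=0.02):
    """Interpolate a verb's hypothesised conditional SCF distribution
    with the back-off estimates of its semantic verb class, then keep
    only frames whose smoothed probability clears a threshold."""
    frames = set(verb_scf_dist) | set(class_backoff)
    smoothed = {f: lam * verb_scf_dist.get(f, 0.0)
                   + (1.0 - lam) * class_backoff.get(f, 0.0)
                for f in frames}
    return {f: p for f, p in smoothed.items() if p >= thresh}
```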
Verifying the SET registration protocolsBella, GiampaoloMassacci, FabioPaulson, Lawrence C.University of Cambridge, Computer Laboratory2002-03enTextUCAM-CL-TR-531ISSN 1476-2986
SET (Secure Electronic Transaction) is an immense e-commerce
protocol designed to improve the security of credit card purchases.
In this paper we focus on the initial bootstrapping phases of SET,
whose objective is the registration of customers and merchants with
a SET certification authority. The aim of registration is twofold:
getting the approval of the cardholder’s or merchant’s bank, and
replacing traditional credit card numbers with electronic
credentials that customers can present to the merchant, so that
their privacy is protected. These registration sub-protocols present
a number of challenges to current formal verification methods.
First, they do not assume that each agent knows the public keys of
the other agents. Key distribution is one of the protocols’ tasks.
Second, SET uses complex encryption primitives (digital envelopes)
which introduce dependency chains: the loss of one secret key can
lead to potentially unlimited losses. Building upon our previous
work, we have been able to model and formally verify SET’s
registration with the inductive method in Isabelle/HOL solving its
challenges with very general techniques.
Internet traffic engineeringMortier, RichardUniversity of Cambridge, Computer Laboratory2002-04enTextUCAM-CL-TR-532ISSN 1476-2986
Due to the dramatically increasing popularity of the services
provided over the public Internet, problems with current mechanisms
for control and management of the Internet are becoming apparent. In
particular, it is increasingly clear that the Internet and other
networks built on the Internet protocol suite do not provide
sufficient support for the efficient control and management of
traffic, i.e. for Traffic Engineering.
This dissertation addresses the problem of traffic engineering in
the Internet. It argues that traffic management techniques should be
applied at multiple timescales, and not just at data timescales as
is currently the case. It presents and evaluates mechanisms for
traffic engineering in the Internet at two further timescales: flow
admission control and control of per-flow packet marking, enabling
control timescale traffic engineering; and support for load-based
inter-domain routeing in the Internet, enabling management timescale
traffic engineering.
This dissertation also discusses suitable policies for the
application of the proposed mechanisms. It argues that the proposed
mechanisms are able to support a wide range of policies useful to
both users and operators. Finally, in a network of the size of the
Internet consideration must also be given to the deployment of
proposed solutions. Consequently, arguments for and against the
deployment of these mechanisms are presented and the conclusion
drawn that there are a number of feasible paths toward deployment.
The work presented argues the following: firstly, it is possible to
implement mechanisms within the Internet framework that enable
traffic engineering to be carried out by operators; secondly, that
applying these mechanisms with suitable policies can ease the
management problems faced by operators and at the same time improve
the efficiency with which the network can be run; thirdly, that
these improvements can correspond to increased network performance
as viewed by the user; and finally, that not only the resulting
deployment but also the deployment process itself are feasible.
The acquisition of a unification-based generalised
categorial grammarVillavicencio, AlineUniversity of Cambridge, Computer Laboratory2002-04enTextUCAM-CL-TR-533ISSN 1476-2986
The purpose of this work is to investigate the process of
grammatical acquisition from data. In order to do that, a
computational learning system is used, composed of a Universal
Grammar with associated parameters, and a learning algorithm,
following the Principles and Parameters Theory. The Universal
Grammar is implemented as a Unification-Based Generalised Categorial
Grammar, embedded in a default inheritance network of lexical types.
The learning algorithm receives input from a corpus of spontaneous
child-directed transcribed speech annotated with logical forms and
sets the parameters based on this input. This framework is used as a
basis to investigate several aspects of language acquisition. In
this thesis I concentrate on the acquisition of subcategorisation
frames and word order information from data. The data to which the
learner is exposed can be noisy and ambiguous, and I investigate how
these factors affect the learning process. The results obtained show
a robust learner converging towards the target grammar given the
input data available. They also show how the amount of noise present
in the input data affects the speed of convergence of the learner
towards the target grammar. Future work is suggested for
investigating the developmental stages of language acquisition as
predicted by the learning model, with a thorough comparison with the
developmental stages of a child. This is primarily a cognitive
computational model of language learning that can be used to
investigate and gain a better understanding of human language
acquisition, and can potentially be relevant to the development of
more adaptive NLP technology.
Resource control in network elementsDonnelly, AustinUniversity of Cambridge, Computer Laboratory2002-04enTextUCAM-CL-TR-534ISSN 1476-2986
Increasingly, substantial data path processing is happening on
devices within the network. At or near the edges of the network,
data rates are low enough that commodity workstations may be used to
process packet flows. However, the operating systems such machines
use are not suited to the needs of data-driven processing. This
dissertation shows why this is a problem, how current work fails to
address it, and proposes a new approach.
The principal problem is that crosstalk occurs in the processing of
different data flows when they contend for a shared resource and
their accesses to this resource are not scheduled appropriately;
typically the shared resource is located in a server process.
Previous work on vertically structured operating systems reduces the
need for such shared servers by making applications responsible for
performing as much of their own processing as possible, protecting
and multiplexing devices at the lowest level consistent with
allowing untrusted user access.
However, shared servers remain on the data path in two
circumstances: firstly, dumb network adaptors need non-trivial
processing to allow safe access by untrusted user applications.
Secondly, shared servers are needed wherever trusted code must be
executed for security reasons.
This dissertation presents the design and implementation of Expert,
an operating system which avoids crosstalk by removing the need for
such servers.
This dissertation describes how Expert handles dumb network adaptors
to enable applications to access them via a low-level interface
which is cheap to implement in the kernel, and retains application
responsibility for the work involved in running a network stack.
Expert further reduces the need for application-level shared servers
by introducing paths which can trap into protected modules of code
to perform actions which would otherwise have to be implemented
within a server.
Expert allows traditional compute-bound tasks to be freely mixed
with these I/O-driven paths in a single system, and schedules them
in a unified manner. This allows the processing performed in a
network element to be resource controlled, both for background
processing tasks such as statistics gathering, and for data path
processing such as encryption.
Designs, disputes and strategiesFaggian, ClaudiaHyland, MartinUniversity of Cambridge, Computer Laboratory2002-05enTextUCAM-CL-TR-535ISSN 1476-2986
Important progress in logic is leading to interactive and dynamical
models. Geometry of Interaction and Games Semantics are
two major examples. Ludics, initiated by Girard, is a further step
in this direction.
The objects of Ludics which correspond to proofs are designs. A
design can be described as the skeleton of a sequent calculus
derivation, where we do not manipulate formulas, but their location
(the address where the formula is stored). To study the traces of
the interactions between designs as primitive leads to an
alternative presentation, which is to describe a design as the set
of its possible interactions, called disputes. This presentation has
the advantage of making precise the correspondence between the basic
notions of Ludics (designs, disputes and chronicles) and the basic
notions of Games semantics (strategies, plays and views).
Low temperature data remanence in static RAMSkorobogatov, SergeiUniversity of Cambridge, Computer Laboratory2002-06enTextUCAM-CL-TR-536ISSN 1476-2986
Security processors typically store secret key material in static
RAM, from which power is removed if the device is tampered with. It
is commonly believed that, at temperatures below −20 °C, the
contents of SRAM can be ‘frozen’; therefore, many devices treat
temperatures below this threshold as tampering events. We have done
some experiments to establish the temperature dependency of data
retention time in modern SRAM devices. Our experiments show that the
conventional wisdom no longer holds.
Parallel systems in symbolic and algebraic
computationMatooane, MantsikaUniversity of Cambridge, Computer Laboratory2002-06enTextUCAM-CL-TR-537ISSN 1476-2986
This report describes techniques to exploit distributed memory
massively parallel supercomputers to satisfy the peak memory demands
of some very large computer algebra problems (over 10 GB). The
memory balancing is based on a randomized hashing algorithm for
dynamic data distribution. Fine-grained partitioning is used to
provide flexibility in the memory allocation, at the cost of higher
communication overhead. The main problem areas are multivariate
polynomial algebra, and linear algebra with polynomial matrices. The
system was implemented and tested on a Hitachi SR2201 supercomputer.
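The heart of the memory balancing can be sketched in a few lines
(illustrative Python; the key format and choice of hash are
assumptions, not details from the report):

```python
import hashlib

def owner(term_key: str, n_processors: int) -> int:
    """Randomized hashing for dynamic data distribution: each polynomial
    term, keyed e.g. by its exponent vector, is hashed to the processor
    that stores it, spreading a huge polynomial's terms evenly across
    distributed memory at the cost of extra communication."""
    digest = hashlib.md5(term_key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_processors

# The term 3*x^2*y^5 might be keyed by its exponent vector:
print(owner("x:2,y:5", n_processors=256))
```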
The Escritoire: A personal projected display for interacting
with documentsAshdown, MarkRobinson, PeterUniversity of Cambridge, Computer Laboratory2002-06enTextUCAM-CL-TR-538ISSN 1476-2986
The Escritoire is a horizontal desk interface that uses two
projectors to create a foveal display. Items such as images,
documents, and the interactive displays of other conventional
computers, can be manipulated on the desk using pens in both hands.
The periphery covers the desk, providing ample space for laying out
the objects relevant to a task, allowing them to be identified at a
glance and exploiting human spatial memory for rapid retrieval. The
fovea is a high resolution focal area that can be used to view any
item in detail. The projected images are continuously warped with
commodity graphics hardware before display, to reverse the effects
of misaligned projectors and ensure registration between fovea and
periphery. The software is divided into a hardware-specific client
driving the display, and a platform-independent server imposing
control.
Towards a ternary interpolating subdivision scheme for the
triangular meshDodgson, N.A.Sabin, M.A.Barthe, L.Hassan, M.F.University of Cambridge, Computer Laboratory2002-07enTextUCAM-CL-TR-539ISSN 1476-2986
We derive a ternary interpolating subdivision scheme which works on
the regular triangular mesh. It has quadratic precision and fulfils
the standard necessary conditions for C2 continuity. Further
analysis is required to determine its actual continuity class and to
define its behaviour around extraordinary points.
The use of computer graphics rendering software in the
analysis of a novel autostereoscopic display designDodgson, N.A.Moore, J.R.University of Cambridge, Computer Laboratory2002-08enTextUCAM-CL-TR-540ISSN 1476-2986
Computer graphics ‘ray tracing’ software has been used in the design
and evaluation of a new autostereoscopic 3D display. This software
complements the conventional optical design software and provides a
cost-effective method of simulating what is actually seen by a
viewer of the display. It may prove a useful tool in similar design
problems.
Different applications of two-dimensional potential fields
for volume modelingBarthe, L.Dodgson, N.A.Sabin, M.A.Wyvill, B.Gaildrat, V.University of Cambridge, Computer Laboratory2002-08enTextUCAM-CL-TR-541ISSN 1476-2986
Current methods for building models using implicit volume techniques
present problems defining accurate and controllable blend shapes
between implicit primitives. We present new methods to extend the
freedom and controllability of implicit volume modeling. The main
idea is to use a free-form curve to define the profile of the blend
region between implicit primitives.
The use of a free-form implicit curve, controlled point-by-point in
the Euclidean user space, allows us to group Boolean composition
operators with sharp transitions or smooth free-form transitions in
a single modeling metaphor. This idea is generalized for the
creation, sculpting and manipulation of volume objects, while
providing the user with simplicity, controllability and freedom in
volume modeling.
Bounded volume objects, known as “Soft objects” or “Metaballs”, have
specific properties. We also present binary Boolean composition
operators that give more control over the form of the transition when
these objects are blended.
To finish, we show how our free-form implicit curves can be used to
build implicit sweep objects.
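For orientation, the sketch below shows sharp versus smooth union of
two signed-distance-style fields with a fixed quadratic blend profile;
the paper's contribution is to let the user draw that profile as a
free-form curve rather than bake it in, so this is a stand-in, not the
paper's method.

```python
import math

def sphere(cx, cy, cz, r):
    # Signed-distance-style field: negative inside, zero on the surface.
    return lambda x, y, z: math.dist((x, y, z), (cx, cy, cz)) - r

def union_sharp(f, g):
    # Classic Boolean union: a sharp crease along the intersection.
    return lambda x, y, z: min(f(x, y, z), g(x, y, z))

def union_smooth(f, g, radius):
    # A stock polynomial smooth-minimum; `radius` controls the width
    # of the blend region between the two primitives.
    def h(x, y, z):
        a, b = f(x, y, z), g(x, y, z)
        k = max(radius - abs(a - b), 0.0) / radius
        return min(a, b) - k * k * radius * 0.25
    return h
```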
A generative classification of mesh refinement rules with
lattice transformationsIvrissimtzis, I.P.Dodgson, N.A.Sabin, M.A.University of Cambridge, Computer Laboratory2002-09enTextUCAM-CL-TR-542ISSN 1476-2986
We give a classification of the subdivision refinement rules using
sequences of similar lattices. Our work expands and unifies recent
results in the classification of primal triangular subdivision
[Alexa, 2001], and results on the refinement of quadrilateral
lattices [Sloan, 1994, 1989]. In the examples we concentrate on the
cases with low ratio of similarity and find new univariate and
bivariate refinement rules with the lowest possible such ratio,
showing that this very low ratio usually comes at the expense of
symmetry.
Evaluating similarity-based visualisations as interfaces for
image browsingRodden, KerryUniversity of Cambridge, Computer Laboratory2002-09enTextUCAM-CL-TR-543ISSN 1476-2986
Large collections of digital images are becoming more and more
common, and the users of these collections need computer-based
systems to help them find the images they require. Digital images
are easy to shrink to thumbnail size, allowing a large number of
them to be presented to the user simultaneously. Generally, current
image browsing interfaces display thumbnails in a two-dimensional
grid, in some default order, and there has been little exploration
of possible alternatives to this model.
With textual document collections, information visualisation
techniques have been used to produce representations where the
documents appear to be clustered according to their mutual
similarity, which is based on the words they have in common. The
same techniques can be applied to images, to arrange a set of
thumbnails according to a defined measure of similarity. In many
collections, the images are manually annotated with descriptive
text, allowing their similarity to be measured in an analogous way
to textual documents. Alternatively, research in content-based image
retrieval has made it possible to measure similarity based on
low-level visual features, such as colour.
The primary goal of this research was to investigate the usefulness
of such similarity-based visualisations as interfaces for image
browsing. We concentrated on visual similarity, because it is
applicable to any image collection, regardless of the availability
of annotations. Initially, we used conventional information
retrieval evaluation methods to compare the relative performance of
a number of different visual similarity measures, both for retrieval
and for creating visualisations.
Thereafter, our approach to evaluation was influenced more by
human-computer interaction: we carried out a series of user
experiments where arrangements based on visual similarity were
compared to random arrangements, for different image browsing tasks.
These included finding a given target image, finding a group of
images matching a generic requirement, and choosing subjectively
suitable images for a particular purpose (from a shortlisted set).
As expected, we found that similarity-based arrangements are
generally more helpful than random arrangements, especially when the
user already has some idea of the type of image she is looking for.
Images are used in many different application domains; the ones we
chose to study were stock photography and personal photography. We
investigated the organisation and browsing of personal photographs
in some depth, because of the inevitable future growth in usage of
digital cameras, and a lack of previous research in this area.
On the support of recursive subdivisionIvrissimtzis, I.P.Sabin, M.A.Dodgson, N.A.University of Cambridge, Computer Laboratory2002-09enTextUCAM-CL-TR-544ISSN 1476-2986
We study the support of subdivision schemes, that is, the area of
the subdivision surface that will be affected by the displacement of
a single control point. Our main results cover the regular case,
where the mesh induces a regular Euclidean tessellation of the
parameter space. If n is the ratio of similarity between the
tessellation at step k and step k−1 of the subdivision, we show that
this number determines whether the support is polygonal or fractal.
In particular, if n=2, as it is in most schemes, the support is a
polygon whose vertices can be easily determined. If n is not equal
to two, as for example in the square root of three scheme, the
support is usually fractal and on its boundary we can identify sets
like the classic ternary Cantor set.
A HOL specification of the ARM instruction set
architectureFox, Anthony C.J.University of Cambridge, Computer Laboratory2001-06enTextUCAM-CL-TR-545ISSN 1476-2986
This report gives details of a HOL specification of the ARM
instruction set architecture. It is shown that the HOL proof tool
provides a suitable environment in which to model the architecture.
The specification is used to execute fragments of ARM code generated
by an assembler. The specification is based primarily around the
third version of the ARM architecture, and the intent is to provide
a target semantics for future microprocessor verifications.
Depth perception in computer graphicsPfautz, Jonathan DavidUniversity of Cambridge, Computer Laboratory2002-09enTextUCAM-CL-TR-546ISSN 1476-2986
With advances in computing and visual display technology, the
interface between man and machine has become increasingly complex.
The usability of a modern interactive system depends on the design
of the visual display. This dissertation aims to improve the design
process by examining the relationship between human perception of
depth and three-dimensional computer-generated imagery (3D CGI).
Depth is perceived when the human visual system combines various
different sources of information about a scene. In Computer
Graphics, linear perspective is a common depth cue, and systems
utilising binocular disparity cues are of increasing interest. When
these cues are inaccurately and inconsistently presented, the
effectiveness of a display will be limited. Images generated with
computers are sampled, meaning they are discrete in both time and
space. This thesis describes the sampling artefacts that occur in 3D
CGI and their effects on the perception of depth. Traditionally,
sampling artefacts are treated as a Signal Processing problem. The
approach here is to evaluate artefacts using Human Factors and
Ergonomics methodology; sampling artefacts are assessed via
performance on relevant visual tasks.
A series of formal and informal experiments were performed on human
subjects to evaluate the effects of spatial and temporal sampling on
the presentation of depth in CGI. In static images with perspective
information, the relative size of an object can be inconsistently
presented across depth. This inconsistency prevented subjects from
making accurate relative depth judgements. In moving images, these
distortions were most visible when the object was moving slowly,
pixel size was large, the object was located close to the line of
sight and/or the object was located a large virtual distance from
the viewer. When stereo images are presented with perspective cues,
the sampling artefacts found in each cue interact. Inconsistencies
in both size and disparity can occur as the result of spatial and
temporal sampling. As a result, disparity can vary inconsistently
across an object. Subjects judged relative depth less accurately
when these inconsistencies were present. An experiment demonstrated
that stereo cues dominated in conflict situations for static images.
In moving imagery, the number of samples in stereo cues is limited.
Perspective information dominated the perception of depth for
unambiguous (i.e., constant in direction and velocity) movement.
Based on the experimental results, a novel method was developed that
ensures the size, shape and disparity of an object are consistent as
it moves in depth. This algorithm manipulates the edges of an object
(at the expense of positional accuracy) to enforce consistent size,
shape and disparity. In a time-to-contact task using only stereo and
perspective depth cues, velocity was judged more accurately using
this method. A second method manipulated the location and
orientation of the viewpoint to maximise the number of samples of
perspective and stereo depth in a scene. This algorithm was tested
in a simulated air traffic control task. The experiment demonstrated
that knowledge about where the viewpoint is located dominates any
benefit gained in reducing sampling artefacts.
This dissertation provides valuable information for the visual
display designer in the form of task-specific experimental results
and computationally inexpensive methods for reducing the effects of
sampling.
Semantic optimization of OQL queriesTrigoni, AgathonikiUniversity of Cambridge, Computer Laboratory2002-10enTextUCAM-CL-TR-547ISSN 1476-2986
This work explores all the phases of developing a query processor
for OQL, the Object Query Language proposed by the Object Data
Management Group (ODMG 3.0). There has been a lot of research on the
execution of relational queries and their optimization using
syntactic or semantic transformations. However, there is no context
that has integrated and tested all the phases of processing an
object query language, including the use of semantic optimization
heuristics. This research is motivated by the need for query
execution tools that combine two valuable properties: i) the
expressive power to encompass all the features of the
object-oriented paradigm and ii) the flexibility to benefit from the
experience gained with relational systems, such as the use of
semantic knowledge to speed up query execution.
The contribution of this work is twofold. First, it establishes a
rigorous basis for OQL by defining a type inference model for OQL
queries and proposing a complete framework for their translation
into calculus and algebraic representations. Second, in order to
enhance query execution it provides algorithms for applying two
semantic optimization heuristics: constraint introduction and
constraint elimination techniques. By taking into consideration a
set of association rules with exceptions, it is possible to add or
remove predicates from an OQL query, thus transforming it to a more
efficient form.
We have implemented this framework, which enables us to measure the
benefits and the cost of exploiting semantic knowledge during query
execution. The experiments showed significant benefits, especially
in the application of the constraint introduction technique. In
contexts where queries are optimized once and are then executed
repeatedly, we can ignore the cost of optimization, and it is always
worth carrying out the proposed transformation. In the context of
ad hoc queries the cost of the optimization becomes an important
consideration. We have developed heuristics to estimate the cost as
well as the benefits of optimization. The optimizer will carry out a
semantic transformation only when the overhead is less than the
expected benefit. Thus transformations are performed safely even
with ad hoc queries. The framework can often speed up the execution
of an OQL query to a considerable extent.
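The constraint-introduction heuristic, stripped to its core, is a
closure computation over the rule set (a Python sketch that ignores
the exception handling and the cost model described above; the
example rule is hypothetical):

```python
def introduce_constraints(query_preds, rules):
    """`rules` is a list of (premise, conclusion) association rules,
    each premise a frozenset of predicates. Any conclusion implied by
    the query can be added as an extra predicate without changing the
    answer; the optimizer keeps those that make execution cheaper,
    e.g. by enabling an index scan."""
    preds, changed = set(query_preds), True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise <= preds and conclusion not in preds:
                preds.add(conclusion)
                changed = True
    return preds

rules = [(frozenset({"dept = 'R&D'"}), "salary > 30000")]
print(introduce_constraints({"dept = 'R&D'"}, rules))
```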
Formal verification of the ARM6
micro-architectureFox, AnthonyUniversity of Cambridge, Computer Laboratory2002-11enTextUCAM-CL-TR-548ISSN 1476-2986
This report describes the formal verification of the ARM6
micro-architecture using the HOL theorem prover. The correctness of
the microprocessor design compares the micro-architecture with an
abstract, target instruction set semantics. Data and temporal
abstraction maps are used to formally relate the state spaces and to
capture the timing behaviour of the processor. The verification is
carried out in HOL and one-step theorems are used to provide the
framework for the proof of correctness. This report also describes
the formal specification of the ARM6’s three stage pipelined
micro-architecture.
Two remarks on public key cryptologyAnderson, RossUniversity of Cambridge, Computer Laboratory2002-12enTextUCAM-CL-TR-549ISSN 1476-2986
In some talks I gave in 1997-98, I put forward two observations on
public-key cryptology, concerning forward-secure signatures and
compatible weak keys. I did not publish a paper on either of them as
they appeared to be rather minor footnotes to public key cryptology.
But the work has occasionally been cited, and I’ve been asked to
write a permanent record.
Computer security – a layperson’s guide, from the bottom
upSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory2002-06enTextUCAM-CL-TR-550ISSN 1476-2986
Computer security as a technical matter is complex, and opaque for
those who are not themselves computer professionals but who
encounter, or are ultimately responsible for, computer systems. This
paper presents the essentials of computer security in non-technical
terms, with the aim of helping people affected by computer systems
to understand what security is about and to withstand the blinding
with science mantras that too often obscure the real issues.
The relative consistency of the axiom of choice — mechanized
using Isabelle/ZFPaulson, Lawrence C.University of Cambridge, Computer Laboratory2002-12enTextUCAM-CL-TR-551ISSN 1476-2986
The proof of the relative consistency of the axiom of choice has
been mechanized using Isabelle/ZF. The proof builds upon a previous
mechanization of the reflection theorem. The heavy reliance on
metatheory in the original proof makes the formalization unusually
long, and not entirely satisfactory: two parts of the proof do not
fit together. It seems impossible to solve these problems without
formalizing the metatheory. However, the present development follows
a standard textbook, Kunen’s “Set Theory”, and could support the
formalization of further material from that book. It also serves as
an example of what to expect when deep mathematics is formalized.
The Xenoserver computing infrastructureFraser, Keir A.Hand, Steven M.Harris, Timothy L.Leslie, Ian M.Pratt, Ian A.University of Cambridge, Computer Laboratory2003-01enTextUCAM-CL-TR-552ISSN 1476-2986
The XenoServer project will build a public infrastructure for
wide-area distributed computing. We envisage a world in which
XenoServer execution platforms will be scattered across the globe
and available for any member of the public to submit code for
execution. Crucially, the code’s sponsor will be billed for all the
resources used or reserved during its execution. This will encourage
load balancing, limit congestion, and make the platform
self-financing.
Such a global infrastructure is essential to address the fundamental
problem of communication latency. By enabling principals to run
programs at points throughout the network they can ensure that their
code executes close to the entities with which it interacts. As well
as reducing latency this can be used to avoid network bottlenecks,
to reduce long-haul network charges and to provide a network
presence for transiently-connected mobile devices.
This project will build and deploy a global XenoServer test-bed and
make it available to authenticated external users; initially members
of the scientific community and ultimately of the general public. In
this environment accurate resource accounting and pricing is
critical – whether in an actual currency or one that is fictitious.
As with our existing work on OS resource management, pricing
provides the feedback necessary for applications that can adapt, and
prevents over-use by applications that cannot.
Xen 2002Barham, Paul R.Dragovic, BorisFraser, Keir A.Hand, Steven M.Harris, Timothy L.Ho, Alex C.Kotsovinos, EvangelosMadhavapeddy, Anil V.S.Neugebauer, RolfPratt, Ian A.Warfield, Andrew K.University of Cambridge, Computer Laboratory2003-01enTextUCAM-CL-TR-553ISSN 1476-2986
This report describes the design of Xen, the hypervisor developed as
part of the XenoServer wide-area computing project. Xen enables the
hardware resources of a machine to be virtualized and dynamically
partitioned so as to allow multiple different ‘guest’ operating
system images to be run simultaneously.
Virtualizing the machine in this manner provides flexibility,
allowing different users to choose their preferred operating system
(Windows, Linux, NetBSD), and also enables use of the platform as a
testbed for operating systems research. Furthermore, Xen provides
secure partitioning between these ‘domains’, and enables better
resource accounting and QoS isolation than can be achieved within a
conventional operating system. We show these benefits can be
achieved at negligible performance cost.
We outline the design of Xen’s main sub-systems, and the interface
exported to guest operating systems. Initial performance results are
presented for our most mature guest operating system port, Linux
2.4. This report covers the initial design of Xen, leading up to our
first public release which we plan to make available for download in
April 2003. Further reports will update the design as our work
progresses and present the implementation in more detail.
Towards a field theory for networksCrowcroft, JonUniversity of Cambridge, Computer Laboratory2003-01enTextUCAM-CL-TR-554ISSN 1476-2986
It is often claimed that Internet traffic patterns are interesting
because the Internet puts few constraints on sources. This leads to
innovation. It also makes the study of Internet traffic, what we
might call the search for the Internet Erlang, very difficult. At the
same time, traffic control (congestion control) and engineering are
both hot topics.
What if “flash crowds” (a.k.a. slashdot), cascades, epidemics and so
on are the norm? What if the trend continues for network link
capacity to become flatter, with more equal capacity in the access
and core, or even more capacity in the access than the core (as in
the early 1980s with 10Mbps LANs versus Kbps links in the ARPANET)?
How could we cope?
This is a paper about the use of field equations (e.g.
gravitational, electrical, magnetic, strong and weak atomic and so
forth) as a future model for managing network traffic. We believe
that in the future, one could move from this model to a very general
prescriptive technique for designing network control on different
timescales, including traffic engineering and the set of admission
and congestion control laws. We also speculate about the use of the
same idea in wireless networks.
BOURSE – Broadband Organisation of Unregulated Radio Systems
through EconomicsCrowcroft, JonGibbens, RichardHailes, StephenUniversity of Cambridge, Computer Laboratory2003-01enTextUCAM-CL-TR-555ISSN 1476-2986
This is a technical report about an idea for research in the
intersection of active nets, cognitive radio and power laws of
network topologies.
Turing Switches – Turing machines for all-optical Internet
routingCrowcroft, JonUniversity of Cambridge, Computer Laboratory2003-01enTextUCAM-CL-TR-556ISSN 1476-2986
This is a technical report outlining an idea for basic long-term
research into the architectures for programmable all-optical
Internet routers.
We are revisiting some of the fundamental tenets of computer science
to carry out this work, and so it is necessarily highly speculative.
Currently, the processing elements in all-electronic routers are
typically fairly conventional von-Neumann architecture computers
with processors that have large, complex instruction sets (even RISC
is relatively complex compared with the actual requirements for
packet processing) and Random Access Memory.
As the need for speed increases, first this architecture, and then
the classical computing hardware components, and finally,
electronics cease to be able to keep up.
At this time, optical device technology is making great strides, and
we see the availability of gates, as well as a plethora of invention
in providing buffering mechanisms.
However, a critical problem we foresee is the ability to re-program
devices for different packet processing functions such as
classification and scheduling. This proposal is aimed at researching
one direction for adding optical domain programmability.
Iota: A concurrent XML scripting language with applications
to Home Area NetworkingBierman, G.M.Sewell, P.University of Cambridge, Computer Laboratory2003-01enTextUCAM-CL-TR-557ISSN 1476-2986
Iota is a small and simple concurrent language that provides native
support for functional XML computation and for typed channel-based
communication. It has been designed as a domain-specific language to
express device behaviour within the context of Home Area Networking.
In this paper we describe Iota, explaining its novel treatment of
XML and describing its type system and operational semantics. We
give a number of examples including Iota code to program Universal
Plug ’n’ Play (UPnP) devices.
A role and context based security modelBeresnevichiene, YolantaUniversity of Cambridge, Computer Laboratory2003-01enTextUCAM-CL-TR-558ISSN 1476-2986
Security requirements approached at the enterprise level initiate
the need for models that capture the organisational and distributed
aspects of information usage. Such models have to express
organisation-specific security policies and internal controls aiming
to protect information against unauthorised access and modification,
and against usage of information for unintended purposes. This
technical report describes a systematic approach to modelling the
security requirements from the perspective of job functions and
tasks performed in an organisation. It deals with the design,
analysis, and management of security abstractions and mechanisms in
a unified framework.
The basis of access control policy in this framework is formulated
around a semantic construct of a role. Roles are granted permissions
according to the job functions that exist in an organisation, and
then users are assigned to roles on the basis of their specific job
responsibilities. In order to ensure that permissions included in
the roles are used by users only for purposes corresponding to the
organisation’s present business needs, a novel approach of “active”
context-based access control is proposed. The usage of role
permissions in this approach is controlled according to the emerging
context associated with progress of various tasks in the
organisation.
The work explores formally the security properties of the
established model, in particular, support for separation of duty and
least privilege principles that are important requirements in many
commercial systems. Results have implications for understanding
different variations of separation of duty policy that are currently
used in the role-based access control.
Finally, a design architecture of the defined security model is
presented detailing the components and processing phases required
for successful application of the model to distributed computer
environments. The model provides opportunities for the implementers,
based on application requirements, to choose between several
alternative design approaches.
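A minimal sketch of the idea of context-gated role permissions, with
invented role, permission and task names (the report's model is far
richer than this):

    # Roles grant permissions; a permission is usable only while the
    # task context associated with it enables its use.
    role_permissions = {
        "clerk":   {"open_account", "record_deposit"},
        "manager": {"approve_loan"},
    }
    user_roles = {"alice": {"clerk"}, "bob": {"manager"}}

    # Context emerging from task progress: permission -> predicate on task state.
    context_conditions = {
        "approve_loan": lambda task: task.get("application_complete", False),
    }

    def access_allowed(user, permission, task):
        has_role = any(permission in role_permissions[r] for r in user_roles[user])
        enabled = context_conditions.get(permission, lambda t: True)(task)
        return has_role and enabled

    print(access_allowed("bob", "approve_loan", {"application_complete": False}))  # False
    print(access_allowed("bob", "approve_loan", {"application_complete": True}))   # True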
Pronto: MobileGateway with publish-subscribe paradigm over
wireless networkYoneki, EikoBacon, JeanUniversity of Cambridge, Computer Laboratory2003-02enTextUCAM-CL-TR-559ISSN 1476-2986
This paper presents the design, implementation, and evaluation of
Pronto, a middleware system for mobile applications with messaging
as a basis. It provides a solution for mobile application specific
problems such as resource constraints, network characteristics, and
data optimization. Pronto consists of three main functions: 1)
MobileJMS Client, a lightweight client of Message Oriented
Middleware (MOM) based on Java Message Service (JMS), 2) Gateway for
reliable and efficient transmission between mobile devices and a
server with pluggable components, and 3) Serverless JMS based on IP
multicast. The publish-subscribe paradigm is ideal for mobile
applications, as mobile devices are commonly used for data
collection under conditions of frequent disconnection and changing
numbers of recipients. This paradigm provides greater flexibility
due to the decoupling of publisher and subscriber. Adding a gateway
as a message hub to transmit information in real-time or with
store-and-forward messaging provides powerful optimization and data
transformation. Caching is an essential function of the gateway, and
SmartCaching is designed for generic caching in an N-tier
architecture. Serverless JMS aims at a decentralized messaging
model, which supports an ad-hoc network, as well as creating a
high-speed messaging BUS. Pronto is an intelligent MobileGateway,
providing a useful MOM intermediary between a server and mobile
devices over a wireless network.
Decimalisation table attacks for PIN crackingBond, MikeZieliński, PiotrUniversity of Cambridge, Computer Laboratory2003-02enTextUCAM-CL-TR-560ISSN 1476-2986
We present an attack on hardware security modules used by retail
banks for the secure storage and verification of customer PINs in
ATM (cash machine) infrastructures. By adaptively choosing
decimalisation tables, an attacker learns the maximum amount of
information about the true PIN from each guess. It takes an average of 15
guesses to determine a four digit PIN using this technique, instead
of the 5000 guesses intended. In a single 30 minute lunch-break, an
attacker can thus discover approximately 7000 PINs rather than 24
with the brute force method. With a £300 withdrawal limit per card,
the potential bounty is raised from £7200 to £2.1 million and a
single motivated attacker could withdraw £30–50 thousand of this
each day. This attack thus presents a serious threat to bank
security.
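The following toy simulation sketches the first phase of such an attack
(the account-number encryption is replaced by a fixed hex string, and
only the digit-presence phase is shown; locating the digits' positions
takes the few further guesses that give the roughly 15-guess average):

    SECRET_HEX = "A47B"   # stands in for hex digits of the encrypted account number
    ORIG_TABLE = dict(zip("0123456789ABCDEF", "0123456789012345"))

    def natural_pin(table):
        return "".join(table[h] for h in SECRET_HEX)

    def hsm_verify(trial_pin, table):
        """Models the HSM: decimalise with the supplied table and compare."""
        return natural_pin(table) == trial_pin

    present = set()
    for digit in "0123456789":
        # Map hex values that normally decimalise to `digit` to 1, the rest to 0.
        probe = {h: ("1" if d == digit else "0") for h, d in ORIG_TABLE.items()}
        if not hsm_verify("0000", probe):   # mismatch => the PIN contains `digit`
            present.add(digit)

    print(sorted(present))   # the digits of the true PIN, found in ten trials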
Resource control of untrusted code in an open network
environmentMenage, Paul B.University of Cambridge, Computer Laboratory2003-03enTextUCAM-CL-TR-561ISSN 1476-2986
Current research into Active Networks, Open Signalling and other
forms of mobile code have made use of the ability to execute
user-supplied code at locations within the network infrastructure,
in order to avoid the inherent latency associated with wide area
networks or to avoid sending excessive amounts of data across
bottleneck links or nodes. Existing research has addressed the
design and evaluation of programming environments, and testbeds have
been implemented on traditional operating systems. Such work has
deferred issues regarding resource control; this has been
reasonable, since this research has been conducted in a closed
environment.
In an open environment, which is required for widespread deployment
of such technologies, the code supplied to the network nodes may not
be from a trusted source. Thus, it cannot be assumed that such code
will behave non-maliciously, nor that it will avoid consuming more
than its fair share of the available system resources.
The computing resources consumed by end-users on programmable nodes
within a network are not free, and must ultimately be paid for in
some way. Programmable networks allow users substantially greater
complexity in the way that they may consume network resources. This
dissertation argues that, due to this complexity, it is essential to
be able to control and account for the resources used by untrusted
user-supplied code if such technology is to be deployed effectively
in a wide-area open environment.
The Resource Controlled Active Node Environment (RCANE) is presented
to facilitate the control of untrusted code. RCANE supports the
allocation, scheduling and accounting of the resources available on
a node, including CPU and network I/O scheduling, memory allocation,
and garbage collection overhead.
Fast Marching farthest point samplingMoenning, CarstenDodgson, Neil A.University of Cambridge, Computer Laboratory2003-04enTextUCAM-CL-TR-562ISSN 1476-2986
Using Fast Marching for the incremental computation of distance maps
across the sampling domain, we obtain an efficient farthest point
sampling technique (FastFPS). The method is based on that of Eldar
et al. (1992, 1997) but extends more naturally to the case of
non-uniform sampling and is more widely applicable. Furthermore, it
can be applied to both planar domains and curved manifolds and
allows for weighted domains in which different cost is associated
with different points on the surface. We conclude with considering
the extension of FastFPS to the sampling of point clouds without the
need for prior surface reconstruction.
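The selection principle can be sketched with a brute-force version in
the plane (the report's contribution is to replace the quadratic
distance updates below with incrementally computed Fast Marching
distance maps, and to handle curved and weighted domains):

    import numpy as np

    def farthest_point_sample(points, k, seed=0):
        rng = np.random.default_rng(seed)
        chosen = [int(rng.integers(len(points)))]    # arbitrary first sample
        dist = np.linalg.norm(points - points[chosen[0]], axis=1)
        for _ in range(k - 1):
            nxt = int(np.argmax(dist))               # farthest from all samples so far
            chosen.append(nxt)
            # Maintain each point's distance to its nearest chosen sample.
            dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
        return points[chosen]

    pts = np.random.default_rng(1).random((1000, 2))
    print(farthest_point_sample(pts, 5))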
MJ: An imperative core calculus for Java and Java with
effectsBierman, G.M.Parkinson, M.J.Pitts, A.M.University of Cambridge, Computer Laboratory2003-04enTextUCAM-CL-TR-563ISSN 1476-2986
In order to study rigorously object-oriented languages such as Java
or C#, a common practice is to define lightweight fragments, or
calculi, which are sufficiently small to facilitate formal proofs of
key properties. However many of the current proposals for calculi
lack important language features. In this paper we propose
Middleweight Java, MJ, as a contender for a minimal imperative core
calculus for Java. Whilst compact, MJ models features such as object
identity, field assignment, constructor methods and block structure.
We define the syntax, type system and operational semantics of MJ,
and give a proof of type safety. In order to demonstrate the
usefulness of MJ to reason about operational features, we consider a
recent proposal of Greenhouse and Boyland to extend Java with an
effects system. This effects system is intended to delimit the scope
of computational effects within a Java program. We define an
extension of MJ with a similar effects system and instrument the
operational semantics. We then prove the correctness of the effects
system; a question left open by Greenhouse and Boyland. We also
consider the question of effect inference for our extended calculus,
detail an algorithm for inferring effects information and give a
proof of correctness.
Access policies for middlewareLang, UlrichUniversity of Cambridge, Computer Laboratory2003-05enTextUCAM-CL-TR-564ISSN 1476-2986
This dissertation examines how the architectural layering of
middleware constrains the design of a middleware security
architecture, and analyses the complications that arise from that.
First, we define a precise notion of middleware that includes its
architecture and features. Our definition is based on the Common
Object Request Broker Architecture (CORBA), which is used throughout
this dissertation both as a reference technology and as a basis for
a proof of concept implementation. In several steps, we construct a
security model that fits to the described middleware architecture.
The model facilitates conceptual reasoning about security. The
results of our analysis indicate that the cryptographic identities
available on the lower layers of the security model are only of
limited use for expressing fine-grained security policies, because
they are separated from the application layer entities by the
middleware layer. To express individual application layer entities
in access policies, additional more fine-grained descriptors are
required. To solve this problem for the target side (i.e., the
receiving side of an invocation), we propose an improved middleware
security model that supports individual access policies on a
per-target basis. The model is based on so-called “resource
descriptors”, which are used in addition to cryptographic identities
to describe application layer entities in access policies. To be
useful, descriptors need to fulfil a number of properties, such as
local uniqueness and persistency. Next, we examine the information
available at the middleware layer for its usefulness as resource
descriptors, in particular the interface name and the instance
information inside the object reference. Unfortunately neither
fulfils all required properties. However, it is possible to obtain
resource descriptors on the target side through a mapping process
that links target instance information to an externally provided
descriptor. We describe both the mapping configuration when the
target is instantiated and the mapping process at invocation time. A
proof of concept implementation, which contains a number of
technical improvements over earlier attempts to solve this problem,
shows that this approach is useable in practice, even for complex
architectures, such as CORBA and CORBASec (the security services
specified for CORBA). Finally, we examine the security approaches of
several related middleware technologies that have emerged since the
specification of CORBA and CORBASec, and show the applicability of
the resource descriptor mapping.
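In outline (all names below are invented for illustration), the mapping
amounts to binding instance information to an externally supplied
descriptor when the target is instantiated, and resolving it again on
each invocation, so that access policies refer to the descriptor rather
than to middleware-layer identities:

    registry = {}   # (interface name, instance information) -> resource descriptor

    def register_target(interface_name, instance_id, descriptor):
        # Mapping configuration when the target is instantiated.
        registry[(interface_name, instance_id)] = descriptor

    def descriptor_for(interface_name, instance_id):
        # Mapping process at invocation time.
        return registry[(interface_name, instance_id)]

    register_target("Bank::Account", "0x1f3a", "accounts/savings/alice")
    policy = {"accounts/savings/alice": {"alice": {"read", "withdraw"}}}
    desc = descriptor_for("Bank::Account", "0x1f3a")
    print("withdraw" in policy[desc]["alice"])   # True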
Fast Marching farthest point sampling for point clouds and
implicit surfacesMoenning, CarstenDodgson, Neil A.University of Cambridge, Computer Laboratory2003-05enTextUCAM-CL-TR-565ISSN 1476-2986
In a recent paper (Moenning and Dodgson, 2003), the Fast Marching
farthest point sampling strategy (FastFPS) for planar domains and
curved manifolds was introduced. The version of FastFPS for curved
manifolds discussed in the paper deals with surface domains in
triangulated form only. Due to a restriction of the underlying Fast
Marching method, the algorithm further requires the splitting of any
obtuse triangles into acute ones to ensure the consistency of the Fast
Marching approximation. In this paper, we overcome these
restrictions by using Memoli and Sapiro’s (2001, 2002) extension of
the Fast Marching method to the handling of
implicit surfaces and point clouds. We find that the extended
FastFPS algorithm can be applied to surfaces in implicit or point
cloud form without the loss of the original algorithm’s
computational optimality and without the need for any preprocessing.
Formal verification of probabilistic algorithmsHurd, JoeUniversity of Cambridge, Computer Laboratory2003-05enTextUCAM-CL-TR-566ISSN 1476-2986
This thesis shows how probabilistic algorithms can be formally
verified using a mechanical theorem prover.
We begin with an extensive foundational development of probability,
creating a higher-order logic formalization of mathematical measure
theory. This allows the definition of the probability space we use
to model a random bit generator, which informally is a stream of
coin-flips, or technically an infinite sequence of IID
Bernoulli(1/2) random variables.
Probabilistic programs are modelled using the state-transformer
monad familiar from functional programming, where the random bit
generator is passed around in the computation. Functions remove
random bits from the generator to perform their calculation, and
then pass back the changed random bit generator with the result.
Our probability space modelling the random bit generator allows us
to give precise probabilistic specifications of such programs, and
then verify them in the theorem prover.
We also develop technical support designed to expedite verification:
probabilistic quantifiers; a compositional property subsuming
measurability and independence; a probabilistic while loop together
with a formal concept of termination with probability 1. We also
introduce a technique for reducing properties of a probabilistic
while loop to properties of programs that are guaranteed to
terminate: these can then be established using induction and
standard methods of program correctness.
We demonstrate the formal framework with some example probabilistic
programs: sampling algorithms for four probability distributions;
some optimal procedures for generating dice rolls from coin flips;
the symmetric simple random walk. In addition, we verify the
Miller-Rabin primality test, a well-known and commercially used
probabilistic algorithm. Our fundamental perspective allows us to
define a version with strong properties, which we can execute in the
logic to prove compositeness of numbers.
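The state-transformer idea translates directly into executable form; a
small Python analogue follows (the report's development is in
higher-order logic, and its dice procedures are the optimal ones,
whereas this rejection sampler is merely the simplest):

    import random

    def coin(bits):
        """Consume one bit; return (value, remaining stream)."""
        return next(bits), bits

    def die(bits):
        """A fair die roll from coin flips, by rejection."""
        while True:
            b2, bits = coin(bits)
            b1, bits = coin(bits)
            b0, bits = coin(bits)
            n = 4 * b2 + 2 * b1 + b0
            if n < 6:
                return n + 1, bits   # pass back the changed generator

    bits = iter(lambda: random.randint(0, 1), 2)   # an endless coin-flip stream
    roll, bits = die(bits)
    print(roll)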
Using inequalities as term ordering constraintsHurd, JoeUniversity of Cambridge, Computer Laboratory2003-06enTextUCAM-CL-TR-567ISSN 1476-2986
In this paper we show how linear inequalities can be used to
approximate Knuth-Bendix term ordering constraints, and how term
operations such as substitution can be carried out on systems of
inequalities. Using this representation allows an off-the-shelf
linear arithmetic decision procedure to check the satisfiability of
a set of ordering constraints. We present a formal description of a
resolution calculus where systems of inequalities are used to
constrain clauses, and implement this using the Omega test as a
satisfiability checker. We give the results of an experiment over
problems in the TPTP archive, comparing the practical performance of
the resolution calculus with and without inherited inequality
constraints.
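As a toy rendering of the encoding (with scipy's LP solver standing in
for the Omega test, and an invented pair of terms), the weight
comparison weight(t1) > weight(t2) becomes a linear inequality over
unknown symbol weights whose satisfiability an off-the-shelf procedure
can check:

    from scipy.optimize import linprog

    symbols = ["f", "g", "VAR"]
    count_t1 = {"f": 1, "g": 1, "VAR": 2}   # t1 = f(x, g(y))
    count_t2 = {"f": 0, "g": 2, "VAR": 1}   # t2 = g(g(x))

    # weight(t1) - weight(t2) >= 1, written as -(c1 - c2) . w <= -1.
    diff = [count_t1[s] - count_t2[s] for s in symbols]
    res = linprog(c=[0, 0, 0],
                  A_ub=[[-d for d in diff]], b_ub=[-1],
                  bounds=[(1, None)] * 3)   # every weight at least 1
    print("satisfiable:", res.success)      # True, e.g. with all weights 1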
Dynamic rebinding for marshalling and update, with
destruct-time λBierman, GavinHicks, MichaelSewell, PeterStoyle, GarethWansbrough, KeithUniversity of Cambridge, Computer Laboratory2004-02enTextUCAM-CL-TR-568ISSN 1476-2986
Most programming languages adopt static binding, but for distributed
programming an exclusive reliance on static binding is too
restrictive: dynamic binding is required in various guises, for
example when a marshalled value is received from the network,
containing identifiers that must be rebound to local resources.
Typically it is provided only by ad-hoc mechanisms that lack clean
semantics.
In this paper we adopt a foundational approach, developing core
dynamic rebinding mechanisms as extensions to the simply-typed
call-by-value λ-calculus. To do so we must first explore refinements
of the call-by-value reduction strategy that delay instantiation, to
ensure computations make use of the most recent versions of rebound
definitions. We introduce redex-time and destruct-time strategies.
The latter forms the basis for a λ-marsh calculus that supports
dynamic rebinding of marshalled values, while remaining as far as
possible statically-typed. We sketch an extension of λ-marsh with
concurrency and communication, giving examples showing how wrappers
for encapsulating untrusted code can be expressed. Finally, we show
that a high-level semantics for dynamic updating can also be based
on the destruct-time strategy, defining a λ-update calculus with
simple primitives to provide type-safe updating of running code. We
thereby establish primitives and a common semantic foundation for a
variety of real-world dynamic rebinding requirements.
Global abstraction-safe marshalling with hash
typesLeifer, James J.Peskine, GillesSewell, PeterWansbrough, KeithUniversity of Cambridge, Computer Laboratory2003-06enTextUCAM-CL-TR-569ISSN 1476-2986
Type abstraction is a key feature of ML-like languages for writing
large programs. Marshalling is necessary for writing distributed
programs, exchanging values via network byte-streams or persistent
stores. In this paper we combine the two, developing compile-time
and run-time semantics for marshalling, that guarantee
abstraction-safety between separately-built programs.
We obtain a namespace for abstract types that is global, i.e.
meaningful between programs, by hashing module declarations. We
examine the scenarios in which values of abstract types are
communicated from one program to another, and ensure, by
constructing hashes appropriately, that the dynamic and static
notions of type equality mirror each other. We use singleton kinds
to express abstraction in the static semantics; abstraction is
tracked in the dynamic semantics by coloured brackets. These allow
us to prove preservation, erasure, and coincidence results. We argue
that our proposal is a good basis for extensions to existing ML-like
languages, pragmatically straightforward for language users and for
implementors.
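A rough Python analogy of the naming scheme (the report hashes ML
module declarations and tracks abstraction with singleton kinds and
coloured brackets; this sketch keeps only the hashing idea):

    import hashlib

    def type_hash(module_decl: str) -> str:
        # Two programs compute the same name exactly when they were built
        # against the same abstract-type declaration.
        return hashlib.sha1(module_decl.encode()).hexdigest()[:16]

    decl_a = "module Counter :> sig type t val zero : t val inc : t -> t end = ..."
    decl_b = decl_a   # a separately built but identical module
    print(type_hash(decl_a) == type_hash(decl_b))   # True: values may be exchanged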
Bigraphs and mobile processesJensen, Ole HøghMilner, RobinUniversity of Cambridge, Computer Laboratory2003-07enTextUCAM-CL-TR-570ISSN 1476-2986
A bigraphical reactive system (BRS) involves bigraphs, in which the
nesting of nodes represents locality, independently of the edges
connecting them; it also allows bigraphs to reconfigure themselves.
BRSs aim to provide a uniform way to model spatially distributed
systems that both compute and communicate. In this memorandum we
develop their static and dynamic theory.
In Part I we illustrate bigraphs in action, and show how they
correspond to process calculi. We then develop the abstract
(non-graphical) notion of wide reactive system (WRS), of which BRSs
are an instance. Starting from reaction rules —often called
rewriting rules— we use the RPO theory of Leifer and Milner to
derive (labelled) transition systems for WRSs, in a way that leads
automatically to behavioural congruences.
In Part II we develop bigraphs and BRSs formally. The theory is
based directly on graphs, not on syntax. Key results in the static
theory are that sufficient RPOs exist (enabling the results of
Part I to be applied), that parallel combinators familiar from
process calculi may be defined, and that a complete algebraic theory
exists at least for pure bigraphs (those without binding). Key
aspects in the dynamic theory —the BRSs— are the definition of
parametric reaction rules that may replicate or discard parameters,
and the full application of the behavioural theory of Part I.
In Part III we introduce a special class: the simple BRSs. These
admit encodings of many process calculi, including the π-calculus
and the ambient calculus. A still narrower class, the basic BRSs,
admits an easy characterisation of our derived transition systems.
We exploit this in a case study for an asynchronous π-calculus. We
show that structural congruence of process terms corresponds to
equality of the representing bigraphs, and that classical strong
bisimilarity corresponds to bisimilarity of bigraphs. At the end, we
explore several directions for further work.
Multi-layer network monitoring and analysisHall, JamesUniversity of Cambridge, Computer Laboratory2003-07enTextUCAM-CL-TR-571ISSN 1476-2986
Passive network monitoring offers the possibility of gathering a
wealth of data about the traffic traversing the network and the
communicating processes generating that traffic. Significant
advantages include the non-intrusive nature of data capture and the
range and diversity of the traffic and driving applications which
may be observed. Conversely there are also associated practical
difficulties which have restricted the usefulness of the technique:
increasing network bandwidths can challenge the capacity of monitors
to keep pace with passing traffic without data loss, and the bulk of
data recorded may become unmanageable.
Much research based upon passive monitoring has in consequence been
limited to that using a sub-set of the data potentially available,
typically TCP/IP packet headers gathered using Tcpdump or similar
monitoring tools. The bulk of data collected is thereby minimised,
and, with the possible exception of packet filtering, the monitor’s
processing power is devoted to the task of collection
and storage. As the data available for analysis is drawn from only a
small section of the network protocol stack, detailed study is
largely confined to the associated functionality and dynamics in
isolation from activity at other levels. Such lack of context
severely restricts examination of the interaction between protocols
which may in turn lead to inaccurate or erroneous conclusions.
The work described in this report attempts to address some of these
limitations. A new passive monitoring architecture — Nprobe — is
presented, based upon ‘off the shelf’ components which, by using
clusters of probes, scales to keep pace with current high
bandwidth networks without data loss. Monitored packets are fully
captured, but are subject to the minimum processing in real time
needed to identify and associate data of interest across the target
set of protocols. Only this data is extracted and stored. The data
reduction ratio thus achieved allows examination of a wider range of
encapsulated protocols without straining the probe’s storage
capacity.
Full analysis of the data harvested from the network is performed
off-line. The activity of interest within each protocol is examined
and is integrated across the range of protocols, allowing their
interaction to be studied. The activity at higher levels informs
study of the lower levels, and that at lower levels infers detail of
the higher. A technique for dynamically modelling TCP connections is
presented, which, by using data from both the transport and higher
levels of the protocol stack, differentiates between the effects of
network and end-process activity.
The balance of the report presents a study of Web traffic using
Nprobe. Data collected from the IP, TCP, HTTP and HTML levels of the
stack is integrated to identify the patterns of network activity
involved in downloading whole Web pages: by using the links
contained in HTML documents observed by the monitor, together with
data extracted from the HTTP headers of downloaded contained
objects, the set of TCP connections used, and the way in which
browsers use them, are studied as a whole. An analysis of the degree
and distribution of delay is presented and contributes to the
understanding of performance as perceived by the user. The effects
of packet loss on whole page download times are examined,
particularly those losses occurring early in the lifetime of
connections before reliable estimations of round trip times are
established. The implications of such early packet losses for page
downloads using persistent connections are also examined by
simulations using the detailed data available.
Design choices for language-based transactionsHarris, TimUniversity of Cambridge, Computer Laboratory2003-08enTextUCAM-CL-TR-572ISSN 1476-2986
This report discusses two design choices which arose in our recent
work on introducing a new ‘atomic’ keyword as an extension to the
Java programming language. We discuss the extent to which programs
using atomic blocks should be provided with an explicit ‘abort’
operation to roll-back the effects of the current block. We also
discuss mechanisms for supporting blocks that perform I/O operations
or external database transactions.
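One way to picture the explicit-abort choice is as buffered state
changes that are discarded when the block aborts; a hypothetical Python
analogue of the semantics (the actual work extends Java):

    class Abort(Exception):
        pass

    class atomic:
        def __init__(self, state):
            self.state = state
        def __enter__(self):
            self.snapshot = dict(self.state)   # remember the pre-block state
            return self.state
        def __exit__(self, exc_type, exc, tb):
            if exc_type is Abort:
                self.state.clear()
                self.state.update(self.snapshot)   # roll back all effects
                return True                        # the abort is not an error
            return False

    account = {"balance": 100}
    with atomic(account) as s:
        s["balance"] -= 150
        if s["balance"] < 0:
            raise Abort()          # explicit roll-back requested by the program
    print(account)                 # {'balance': 100}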
Mechanizing compositional reasoning for concurrent systems:
some lessonsPaulson, Lawrence C.University of Cambridge, Computer Laboratory2003-08enTextUCAM-CL-TR-573ISSN 1476-2986
The paper reports on experiences of mechanizing various proposals
for compositional reasoning in concurrent systems. The work uses the
UNITY formalism and the Isabelle proof tool. The proposals
investigated include existential/universal properties, guarantees
properties and progress sets. The paper mentions some alternative
proposals that are also worthy of investigation. The conclusions are
that many of these methods work and are suitable candidates for
further development.
Sketchpad: A man-machine graphical communication
systemSutherland, Ivan EdwardUniversity of Cambridge, Computer LaboratoryBlackwell, AlanRodden, Kerry2003-09enTextUCAM-CL-TR-574ISSN 1476-2986
The Sketchpad system uses drawing as a novel communication medium
for a computer. The system contains input, output, and computation
programs which enable it to interpret information drawn directly on
a computer display. It has been used to draw electrical, mechanical,
scientific, mathematical, and animated drawings; it is a general
purpose system. Sketchpad has shown the most usefulness as an aid to
the understanding of processes, such as the notion of linkages,
which can be described with pictures. Sketchpad also makes it easy
to draw highly repetitive or highly accurate drawings and to change
drawings previously drawn with it. The many drawings in this thesis
were all made with Sketchpad.
A Sketchpad user sketches directly on a computer display with a
“light pen.” The light pen is used both to position parts of the
drawing on the display and to point to them to change them. A set of
push buttons controls the changes to be made such as “erase,” or
“move.” Except for legends, no written language is used.
Information sketched can include straight line segments and circle
arcs. Arbitrary symbols may be defined from any collection of line
segments, circle arcs, and previously defined symbols. A user may
define and use as many symbols as he wishes. Any change in the
definition of a symbol is at once seen wherever that symbol appears.
Sketchpad stores explicit information about the topology of a
drawing. If the user moves one vertex of a polygon, both adjacent
sides will be moved. If the user moves a symbol, all lines attached
to that symbol will automatically move to stay attached to it. The
topological connections of the drawing are automatically indicated
by the user as he sketches. Since Sketchpad is able to accept
topological information from a human being in a picture language
perfectly natural to the human, it can be used as an input program
for computation programs which require topological data, e.g.,
circuit simulators.
Sketchpad itself is able to move parts of the drawing around to meet
new conditions which the user may apply to them. The user indicates
conditions with the light pen and push buttons. For example, to make
two lines parallel, he successively points to the lines with the
light pen and presses a button. The conditions themselves are
displayed on the drawing so that they may be erased or changed with
the light pen language. Any combination of conditions can be defined
as a composite condition and applied in one step.
It is easy to add entirely new types of conditions to Sketchpad’s
vocabulary. Since the conditions can involve anything computable,
Sketchpad can be used for a very wide range of problems. For
example, Sketchpad has been used to find the distribution of forces
in the members of truss bridges drawn with it.
Sketchpad drawings are stored in the computer in a specially
designed “ring” structure. The ring structure features rapid
processing of topological information with no searching at all. The
basic operations used in Sketchpad for manipulating the ring
structure are described.
Reconfigurable wavelength-switched optical networks for the
Internet coreGranger, TimUniversity of Cambridge, Computer Laboratory2003-11enTextUCAM-CL-TR-575ISSN 1476-2986
With the quantity of data traffic carried on the Internet doubling
each year, there is no let up in the demand for ever increasing
network capacity. Optical fibres have a theoretical capacity of many
tens of terabits per second. Currently six terabits per second has
been achieved using Dense Wavelength Division Multiplexing: multiple
signals at different wavelengths carried on the same fibre.
This large available bandwidth moves the performance bottlenecks to
the processing required at each network node to receive, buffer,
route, and transmit each individual packet. For the last 10 years
the speed of the electronic routers has been, in relative terms,
increasing more slowly than optical capacity. The space required and
power consumed by these routers is also becoming a significant
limitation.
One solution examined in this dissertation is to create a virtual
topology in the optical layer by using all-optical switches to
create lightpaths across the network. In this way nodes that are not
directly connected can appear to be a single virtual hop away, and
no per-packet processing is required at the intermediate nodes. With
advances in optical switches it is now possible for the network to
reconfigure lightpaths dynamically. This allows the network to share
the resources available between the different traffic streams
flowing across the network, and track changes in traffic volumes by
allocating bandwidth on demand.
This solution is inherently a circuit-switched approach, but it takes
into account the characteristics of optical switching, in particular
waveband switching (where a contiguous range of wavelengths is
switched as a single unit) and the latency required to achieve
non-disruptive switching.
This dissertation quantifies the potential gain from such a system
and how that gain is related to the frequency of reconfiguration. It
outlines possible network architectures which allow reconfiguration
and, through simulation, measures the performance of these
architectures. It then discusses the possible interactions between a
reconfiguring optical layer and higher-level network layers.
This dissertation argues that the optical layer should be distinct
from higher network layers, maintaining stable full-mesh
connectivity, and dynamically reconfiguring the sizes and physical
routes of the virtual paths to take advantage of changing traffic
levels.
An implementation of a coordinate based location
systemSpence, David R.University of Cambridge, Computer Laboratory2003-11enTextUCAM-CL-TR-576ISSN 1476-2986
This paper explains the co-ordinate based location system built for
XenoSearch, a resource discovery system in the XenoServer Open
Platform. The system builds on the work of GNP, Lighthouse and
many more recent schemes. We also present results from various
combinations of algorithms to perform the actual co-ordinate
calculation based on GNP, Lighthouse and spring based systems and
show that our implementations of the various algorithms give similar
prediction errors.
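A bare-bones spring relaxation of the kind alluded to (constants and
network size are illustrative only): each node is pulled along the
error gradient until embedded distances approximate the measured
latencies.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    true_pos = rng.random((n, 2)) * 100
    latency = np.linalg.norm(true_pos[:, None] - true_pos[None, :], axis=2)

    coords = rng.random((n, 2)) * 100        # random initial co-ordinates
    for _ in range(2000):
        for i in range(n):
            force = np.zeros(2)
            for j in range(n):
                if i == j:
                    continue
                vec = coords[i] - coords[j]
                d = np.linalg.norm(vec) + 1e-9
                # A spring of rest length latency[i, j] between nodes i and j.
                force += (latency[i, j] - d) * vec / d
            coords[i] += 0.01 * force

    est = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
    print(abs(est - latency).mean())         # mean prediction error, near zero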
Compromising emanations: eavesdropping risks of computer
displaysKuhn, Markus G.University of Cambridge, Computer Laboratory2003-12enTextUCAM-CL-TR-577ISSN 1476-2986
Electronic equipment can emit unintentional signals from which
eavesdroppers may reconstruct processed data at some distance. This
has been a concern for military hardware for over half a century.
The civilian computer-security community became aware of the risk
through the work of van Eck in 1985. Military “Tempest” shielding
test standards remain secret and no civilian equivalents are
available at present. The topic is still largely neglected in
security textbooks due to a lack of published experimental data.
This report documents eavesdropping experiments on contemporary
computer displays. It discusses the nature and properties of
compromising emanations for both cathode-ray tube and liquid-crystal
monitors. The detection equipment used matches the capabilities to
be expected from well-funded professional eavesdroppers. All
experiments were carried out in a normal unshielded office
environment. They therefore focus on emanations from display refresh
signals, where periodic averaging can be used to obtain reproducible
results in spite of varying environmental noise.
Additional experiments described in this report demonstrate how to
make information emitted via the video signal more easily
receivable, how to recover plaintext from emanations via
radio-character recognition, how to estimate remotely precise
video-timing parameters, and how to protect displayed text from
radio-frequency eavesdroppers by using specialized screen drivers
with a carefully selected video card. Furthermore, a proposal for a
civilian radio-frequency emission-security standard is outlined,
based on path-loss estimates and published data about radio noise
levels.
Finally, a new optical eavesdropping technique is demonstrated that
reads CRT displays at a distance. It observes high-frequency
variations of the light emitted, even after diffuse reflection.
Experiments with a typical monitor show that enough video signal
remains in the light to permit the reconstruction of readable text
from signals detected with a fast photosensor. Shot-noise
calculations provide an upper bound for this risk.
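The value of periodic averaging is easy to demonstrate numerically:
averaging N repetitions of a periodic signal leaves the signal
unchanged while attenuating uncorrelated noise by a factor of sqrt(N)
(the figures below are illustrative, not measured):

    import numpy as np

    rng = np.random.default_rng(0)
    frame = np.sin(np.linspace(0, 20 * np.pi, 1000))   # stands in for one refresh period
    noisy = frame + rng.normal(0, 5.0, size=(256, 1000))   # signal buried in noise

    averaged = noisy.mean(axis=0)
    print(np.std(noisy[0] - frame))    # about 5.0: noise in a single period
    print(np.std(averaged - frame))    # about 5.0 / 16 after averaging 256 periods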
Linear types for packet processing (extended
version)Ennals, RobertSharp, RichardMycroft, AlanUniversity of Cambridge, Computer Laboratory2004-01enTextUCAM-CL-TR-578ISSN 1476-2986
We present PacLang: an imperative, concurrent, linearly-typed
language designed for expressing packet processing applications.
PacLang’s linear type system ensures that no packet is referenced by
more than one thread, but allows multiple references to a packet
within a thread. We argue (i) that this property greatly simplifies
compilation of high-level programs to the distributed memory
architectures of modern Network Processors; and (ii) that PacLang’s
type system captures that style in which imperative packet
processing programs are already written. Claim (ii) is justified by
means of a case-study: we describe a PacLang implementation of the
IPv4 unicast packet forwarding algorithm.
PacLang is formalised by means of an operational semantics, and a
Unique Ownership theorem formalises its correctness with respect to
the type system.
Practical lock-freedomFraser, KeirUniversity of Cambridge, Computer Laboratory2004-02enTextUCAM-CL-TR-579ISSN 1476-2986
Mutual-exclusion locks are currently the most popular mechanism for
interprocess synchronisation, largely due to their apparent
simplicity and ease of implementation. In the parallel-computing
environments that are increasingly commonplace in high-performance
applications, this simplicity is deceptive: mutual exclusion does
not scale well with large numbers of locks and many concurrent
threads of execution. Highly-concurrent access to shared data
demands a sophisticated ‘fine-grained’ locking strategy to avoid
serialising non-conflicting operations. Such strategies are hard to
design correctly and with good performance because they can harbour
problems such as deadlock, priority inversion and convoying. Lock
manipulations may also degrade the performance of cache-coherent
multiprocessor systems by causing coherency conflicts and increased
interconnect traffic, even when the lock protects read-only data.
In looking for solutions to these problems, interest has developed
in lock-free data structures. By eschewing mutual exclusion it is
hoped that more efficient and robust systems can be built.
Unfortunately the current reality is that most lock-free algorithms
are complex, slow and impractical. In this dissertation I address
these concerns by introducing and evaluating practical abstractions
and data structures that facilitate the development of large-scale
lock-free systems.
Firstly, I present an implementation of two useful abstractions that
make it easier to develop arbitrary lock-free data structures.
Although these abstractions have been described in previous work, my
designs are the first that can be practically implemented on current
multiprocessor systems.
Secondly, I present a suite of novel lock-free search structures.
This is interesting not only because of the fundamental importance
of searching in computer science and its wide use in real systems,
but also because it demonstrates the implementation issues that
arise when using the practical abstractions I have developed.
Finally, I evaluate each of my designs and compare them with
existing lock-based and lock-free alternatives. To ensure the
strongest possible competition, several of the lock-based
alternatives are significant improvements on the best-known
solutions in the literature. These results demonstrate that it is
possible to build useful data structures with all the perceived
benefits of lock-freedom and with performance better than
sophisticated lock-based designs. Furthermore, and contrary to
popular belief, this work shows that existing hardware primitives
are sufficient to build practical lock-free implementations of
complex data structures.
Bigraphs and mobile processes (revised)Jensen, Ole HøghMilner, RobinUniversity of Cambridge, Computer Laboratory2004-02enTextUCAM-CL-TR-580ISSN 1476-2986
A bigraphical reactive system (BRS) involves bigraphs, in which the
nesting of nodes represents locality, independently of the edges
connecting them; it also allows bigraphs to reconfigure themselves.
BRSs aim to provide a uniform way to model spatially distributed
systems that both compute and communicate. In this memorandum we
develop their static and dynamic theory.
In Part I we illustrate bigraphs in action, and show how they
correspond to process calculi. We then develop the abstract
(non-graphical) notion of wide reactive system (WRS), of which BRSs
are an instance. Starting from reaction rules —often called
rewriting rules— we use the RPO theory of Leifer and Milner to
derive (labelled) transition systems for WRSs, in a way that leads
automatically to behavioural congruences.
In Part II we develop bigraphs and BRSs formally. The theory is
based directly on graphs, not on syntax. Key results in the static
theory are that sufficient RPOs exist (enabling the results of
Part I to be applied), that parallel combinators familiar from
process calculi may be defined, and that a complete algebraic theory
exists at least for pure bigraphs (those without binding). Key
aspects in the dynamic theory —the BRSs— are the definition of
parametric reaction rules that may replicate or discard parameters,
and the full application of the behavioural theory of Part I.
In Part III we introduce a special class: the simple BRSs. These
admit encodings of many process calculi, including the π-calculus
and the ambient calculus. A still narrower class, the basic BRSs,
admits an easy characterisation of our derived transition systems.
We exploit this in a case study for an asynchronous π-calculus. We
show that structural congruence of process terms corresponds to
equality of the representing bigraphs, and that classical strong
bisimilarity corresponds to bisimilarity of bigraphs. At the end, we
explore several directions for further work.
Axioms for bigraphical structureMilner, RobinUniversity of Cambridge, Computer Laboratory2004-02enTextUCAM-CL-TR-581ISSN 1476-2986
This paper axiomatises the structure of bigraphs, and proves that
the resulting theory is complete. Bigraphs are graphs with double
structure, representing locality and connectivity. They have been
shown to represent dynamic theories for the π-calculus, mobile
ambients and Petri nets, in a way that is faithful to each of those
models of discrete behaviour. While the main purpose of bigraphs is
to understand mobile systems, a prerequisite for this understanding
is a well-behaved theory of the structure of states in such systems.
The algebra of bigraph structure is surprisingly simple, as the
paper demonstrates; this is because bigraphs treat locality and
connectivity orthogonally.
Latency-optimal Uniform Atomic Broadcast
algorithmZieliński, PiotrUniversity of Cambridge, Computer Laboratory2004-02enTextUCAM-CL-TR-582ISSN 1476-2986
We present a new asynchronous Uniform Atomic Broadcast algorithm
with a delivery latency of two communication steps in optimistic
settings, which is faster than any other known algorithm and matches
the proven lower bound. It also has the weakest possible
liveness requirements (the Ω failure detector and a majority of
correct processes) and achieves three new lower bounds presented in
this paper. Finally, we introduce a new notation and several new
abstractions, which are used to construct and present the algorithm
in a clear and modular way.
Subdivision as a sequence of sampled Cp surfaces and
conditions for tuning schemesGérot, CédricBarthe, LoïcDodgson, Neil A.Sabin, Malcolm A.University of Cambridge, Computer Laboratory2004-03enTextUCAM-CL-TR-583ISSN 1476-2986
We deal with practical conditions for tuning a subdivision scheme in
order to control its artifacts in the vicinity of a mark point. To
do so, we look for good behaviour of the limit vertices rather than
good mathematical properties of the limit surface. The good
behaviour of the limit vertices is characterised with the definition
of C2-convergence of a scheme. We propose necessary explicit
conditions for C2-convergence of a scheme in the vicinity of any
mark point being a vertex of valency n or the centre of an n-sided
face with n greater than or equal to three. These necessary conditions
concern the eigenvalues and eigenvectors of subdivision matrices in
the frequency domain. The components of these matrices may be
complex. If we could guarantee that they were real, this would
simplify numerical analysis of the eigenstructure of the matrices,
especially in the context of scheme tuning where we manipulate
symbolic terms. In this paper we show that an appropriate choice of
the parameter space combined with a substitution of vertices lets us
transform these matrices into purely real ones. The substitution
consists of replacing some vertices by linear combinations of
themselves. Finally, we explain how to derive conditions on the
eigenelements of the real matrices which are necessary for the
C2-convergence of the scheme.
Concise texture editingBrooks, StephenUniversity of Cambridge, Computer Laboratory2004-03enTextUCAM-CL-TR-584ISSN 1476-2986
Many computer graphics applications remain in the domain of the
specialist. They are typically characterized by complex
user-directed tasks, often requiring proficiency in design, colour
spaces, computer interaction and file management. Furthermore, the
demands of this skill set are often exacerbated by an equally
complex collection of image or object manipulation commands embedded
in a variety of interface components. The complexity of these
graphic editing tools often requires that the user possess a
correspondingly high level of expertise.
Concise Texture Editing is aimed at addressing the over-complexity
of modern graphics tools and is based on the intuitive notion that
the human user is skilled at high level decision making while the
computer is proficient at rapid computation. This thesis has focused
on the development of interactive editing tools for 2D texture
images and has led to the development of a novel texture
manipulation system that allows:
– the concise painting of a texture;
– the concise cloning of textures;
– the concise alteration of texture element size.
The system allows complex operations to be performed on images with
minimal user interaction. When applied to the domain of image
editing, this implies that the user can instruct the system to
perform complex changes to digital images without having to specify
copious amounts of detail. In order to reduce the user’s workload,
the inherent self-similarity of textures is exploited to
interactively replicate editing operations globally over an image.
This unique image-editing system thereby reduces the user’s workload through
semi-automation, resulting in an acutely concise user interface.
Personal projected displaysAshdown, Mark S. D.University of Cambridge, Computer Laboratory2004-03enTextUCAM-CL-TR-585ISSN 1476-2986
Since the inception of the personal computer, the interface
presented to users has been defined by the monitor screen, keyboard,
and mouse, and by the framework of the desktop metaphor. It is very
different from a physical desktop which has a large horizontal
surface, allows paper documents to be arranged, browsed, and
annotated, and is controlled via continuous movements with both
hands. The desktop metaphor will not scale to such a large display;
the continuing profusion of paper, which is used as much as ever,
attests to its unsurpassed affordances as a medium for manipulating
documents; and despite its proven manual and cognitive benefits,
two-handed input is still not used in computer interfaces.
I present a system called the Escritoire that uses a novel
configuration of overlapping projectors to create a large desk
display that fills the area of a conventional desk and also has a
high resolution region in front of the user for precise work. The
projectors need not be positioned exactly—the projected imagery is
warped using standard 3D video hardware to compensate for rough
projector positioning and oblique projection. Calibration involves
computing planar homographies between the 2D co-ordinate spaces of
the warped textures, projector framebuffers, desk, and input
devices.
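For concreteness, a planar homography between two such 2D co-ordinate
spaces can be recovered from four or more point correspondences with
the standard direct linear transform (a generic sketch with made-up
correspondences, not the Escritoire's code):

    import numpy as np

    def homography(src, dst):
        """Solve dst ~ H @ src for 2D point correspondences (at least four)."""
        rows = []
        for (x, y), (u, v) in zip(src, dst):
            rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
            rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
        _, _, vt = np.linalg.svd(np.array(rows, float))
        return vt[-1].reshape(3, 3)        # null vector = flattened H

    def apply(H, p):
        q = H @ [p[0], p[1], 1.0]
        return q[:2] / q[2]                # back from homogeneous co-ordinates

    desk = [(0, 0), (1, 0), (1, 1), (0, 1)]                      # desk corners
    framebuffer = [(30, 12), (610, 40), (600, 470), (25, 450)]   # as projected
    H = homography(desk, framebuffer)
    print(apply(H, (0.5, 0.5)))            # desk centre mapped into the framebuffer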
The video hardware can easily perform the necessary warping and
achieves 30 frames per second for the dual-projector display.
Oblique projection has proved to be a solution to the problem of
occlusion common to front-projection systems. The combination of an
electromagnetic digitizer and an ultrasonic pen allows simultaneous
input with two hands. The pen for the non-dominant hand is simpler
and coarser than that for the dominant hand, reflecting the
differing roles of the hands in bimanual manipulation. I give a new
algorithm for calibrating a pen, that uses piecewise linear
interpolation between control points. I also give an algorithm to
calibrate a wall display at distance using a device whose position
and orientation are tracked in three dimensions.
The Escritoire software is divided into a client that exploits the
video hardware and handles the input devices, and a server that
processes events and stores all of the system state. Multiple
clients can connect to a single server to support collaboration.
Sheets of virtual paper on the Escritoire can be put in piles which
can be browsed and reordered. As with physical paper this allows
items to be arranged quickly and informally, avoiding the premature
work required to add an item to a hierarchical file system. Another
interface feature is pen traces, which allow remote users to gesture
to each other. I report the results of tests with individuals and
with pairs collaborating remotely. Collaborating participants found
an audio channel and the shared desk surface much more useful than a
video channel showing their faces.
The Escritoire is constructed from commodity components, and unlike
multi-projector display walls its cost is feasible for an individual
user and it fits into a normal office setting. It demonstrates a
hardware configuration, calibration algorithm, graphics warping
process, set of interface features, and distributed architecture
that can make personal projected displays a reality.
Role-based access control policy administrationBelokosztolszki, AndrásUniversity of Cambridge, Computer Laboratory2004-03enTextUCAM-CL-TR-586ISSN 1476-2986
The wide proliferation of the Internet has set new requirements for
access control policy specification. Due to the demand for ad-hoc
cooperation between organisations, applications are no longer
isolated from each other; consequently, access control policies face
a large, heterogeneous, and dynamic environment. Policies, while
maintaining their main functionality, go through many minor
adaptations, evolving as the environment changes.
In this thesis we investigate the long-term administration of
role-based access control (RBAC) – in particular OASIS RBAC –
policies.
With the aim of encapsulating persistent goals of policies we
introduce extensions in the form of meta-policies. These
meta-policies, whose expected lifetime is longer than the lifetime
of individual policies, contain extra information and restrictions
about policies. It is expected that successive policy versions are
checked at policy specification time to ensure that they comply with
the requirements and guidelines set by meta-policies.
In the first of the three classes of meta-policies we group together
policy components by annotating them with context labels. Based on
this grouping and an information flow relation on context labels, we
limit the way in which policy components may be connected to other
component groups. We use this to partition conceptually disparate
portions of policies, and reference these coherent portions to
specify policy restrictions and policy enforcement behaviour.
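A toy rendering of this first class of meta-policy (all names are
hypothetical) might look as follows: components carry context
labels, and an explicit flow relation over labels decides which
connections between component groups are permitted.

    # Each policy component carries a context label, and an
    # information-flow relation over labels states which groups of
    # components may be connected.
    label = {"ward_nurse": "clinical", "consultant": "clinical",
             "payroll_clerk": "admin"}

    # Pairs (src, dst) of labels between which connections (e.g. role
    # activation dependencies) are permitted.
    flow = {("clinical", "clinical"), ("admin", "admin"),
            ("clinical", "admin")}

    def may_connect(src, dst):
        return (label[src], label[dst]) in flow

    assert may_connect("ward_nurse", "consultant")
    assert not may_connect("payroll_clerk", "ward_nurse")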
In our second class of meta-policies – compliance policies – we
specify requirements on an abstract policy model. We then use this
for static policy checking. As compliance tests are performed at
policy specification time, compliance policies may include
restrictions that either cannot be included in policies, or whose
inclusion would result in degraded policy enforcement performance.
We also indicate how to use compliance policies to provide
information about organisational policies without disclosing
sensitive information.
The final class of our meta-policies, called interface policies, is
used to help set up and maintain cooperation among organisations by
enabling them to use components from each other’s policies. Being
based on compliance policies, they use an abstract policy component
model, and can also specify requirements for both component
exporters and importers. Using such interface policies we can
resolve compatibility issues between cooperating parties
automatically.
Finally, building on our meta-policies, we consider policy evolution
and self-administration, according to which we treat RBAC policies
as distributed resources to which access is specified with the help
of RBAC itself. This enables environments where policies are
maintained by many administrators who have varying levels of
competence, trust, and jurisdiction.
We have tested all of these concepts in Desert, our proof-of-concept
implementation.
Verification of asynchronous circuitsCunningham, Paul AlexanderUniversity of Cambridge, Computer Laboratory2004-04enTextUCAM-CL-TR-587ISSN 1476-2986
The purpose of this thesis is to introduce proposition-oriented
behaviours and apply them to the verification of asynchronous
circuits. The major contribution of proposition-oriented behaviours
is their ability to extend existing formal notations to permit the
explicit use of both signal levels and transitions.
This thesis begins with the formalisation of proposition-oriented
behaviours in the context of gate networks, and with the
set-theoretic extension of both regular-expressions and
trace-expressions to reason over proposition-oriented behaviours. A
new trace-expression construct, referred to as biased composition,
is also introduced. Algorithmic realisation of these set-theoretic
extensions is documented using a special form of finite automata
called proposition automata. A verification procedure for
conformance of gate networks to a set of proposition automata is
described in which each proposition automaton may be viewed either
as a constraint or a specification. The implementation of this
procedure as an automated verification program called Veraci is
summarised, and a number of example Veraci programs are used to
demonstrate the contributions of proposition-oriented behaviour to
asynchronous circuit design. These contributions include level-event
unification, event abstraction, and relative timing assumptions
using biased composition. The performance of Veraci is also compared
to an existing event-oriented verification program called Versify,
the comparison showing a consistent performance gain for Veraci over
Versify.
This thesis concludes with the design and implementation of a 2048
bit dual-rail asynchronous Montgomery exponentiator, MOD_EXP, in a
0.18µm standard-cell process. The application of Veraci to the
design of MOD_EXP is summarised, and the practical benefits of
proposition-oriented verification are discussed.
MulTEP: A MultiThreaded Embedded ProcessorWatcharawitch, PanitUniversity of Cambridge, Computer Laboratory2004-05enTextUCAM-CL-TR-588ISSN 1476-2986
Conventional embedded microprocessors have traditionally followed
the footsteps of high-end processor design to achieve high
performance. Their underlying architectures prioritise tasks by
time-critical interrupts and rely on software to perform scheduling
tasks. Single threaded execution relies on instruction-based
probabilistic techniques, such as speculative execution and branch
prediction, which are unsuitable for embedded systems when real-time
performance guarantees need to be met. Multithreading appears to be
a feasible solution for embedded processors. Thread-level
parallelism has a potential to overcome the limitations of
insufficient instruction-level parallelism to hide the increasing
memory latencies. MulTEP is designed to provide high performance
thread-level parallelism, real-time characteristics, a flexible
number of threads and low incremental cost per thread for the
embedded system. In its architecture, a matching-store
synchronisation mechanism allows a thread to wait for multiple data
items. A tagged up/down dynamic-priority hardware scheduler is
provided for real-time scheduling. Pre-loading, pre-fetching and
colour-tagging techniques are implemented to allow context switches
without any overhead. The architecture provides four additional
multithreading instructions for programmers and advanced compilers to
create programs with low-overhead multithreaded operations.
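As a rough software analogue of the matching-store mechanism
(illustrative only, with a hypothetical API; MulTEP implements this
in hardware), a thread registers the tags of the data items it needs
and is released to the scheduler only when the last one arrives:

    class MatchingStore:
        """Software analogue of a matching-store: threads wait on
        sets of data items and become runnable only when every item
        has arrived."""
        def __init__(self):
            self.waiting = {}   # thread id -> set of missing item tags

        def wait_for(self, tid, tags):
            self.waiting[tid] = set(tags)

        def deliver(self, tag):
            ready = []
            for tid, need in list(self.waiting.items()):
                need.discard(tag)
                if not need:          # last outstanding item arrived
                    ready.append(tid)
                    del self.waiting[tid]
            return ready              # thread ids now runnable

    store = MatchingStore()
    store.wait_for("t1", ["a", "b"])
    assert store.deliver("a") == []        # t1 still waits for "b"
    assert store.deliver("b") == ["t1"]    # t1 becomes runnable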
Experimental results demonstrate that multithreading can be
effectively used to improve performance and system utilisation.
Long-latency operations that would otherwise stall the pipeline are
hidden by the execution of other threads. The hardware scheduler
provides priority scheduling, which is suitable for real-time
embedded applications.
new-HOPLA — a higher-order process language with name
generationWinskel, GlynnNardelli, Francesco ZappaUniversity of Cambridge, Computer Laboratory2004-05enTextUCAM-CL-TR-589ISSN 1476-2986
This paper introduces new-HOPLA, a concise but powerful language for
higher-order nondeterministic processes with name generation. Its
origins as a metalanguage for domain theory are sketched but for the
most part the paper concentrates on its operational semantics. The
language is typed, the type of a process describing the shape of the
computation paths it can perform. Its transition semantics,
bisimulation, congruence properties and expressive power are
explored. Encodings of π-calculus and HOπ are presented.
Hermes: A scalable event-based middlewarePietzuch, Peter R.University of Cambridge, Computer Laboratory2004-06enTextUCAM-CL-TR-590ISSN 1476-2986
Large-scale distributed systems require new middleware paradigms
that do not suffer from the limitations of traditional request/reply
middleware. These limitations include tight coupling between
components, a lack of information filtering capabilities, and
support for one-to-one communication semantics only. We argue that
event-based middleware is a scalable and powerful new type of
middleware for building large-scale distributed systems. However, it
is important that an event-based middleware platform includes all
the standard functionality that an application programmer expects
from middleware.
In this thesis we describe the design and implementation of Hermes,
a distributed, event-based middleware platform. The power and
flexibility of Hermes are illustrated throughout with two application
domains: Internet-wide news distribution and a sensor-rich, active
building. Hermes follows a type- and attribute-based
publish/subscribe model that places particular emphasis on
programming language integration by supporting type-checking of
event data and event type inheritance. To handle dynamic,
large-scale environments, Hermes uses peer-to-peer techniques for
autonomic management of its overlay network of event brokers and for
scalable event dissemination. Its routing algorithms, implemented on
top of a distributed hash table, use rendezvous nodes to reduce
routing state in the system, and include fault-tolerance features
for repairing event dissemination trees. All this is achieved
without compromising scalability and efficiency, as is shown by a
simulation-based evaluation of Hermes routing.
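The flavour of rendezvous-based routing can be sketched as follows
(a simplification with hypothetical broker identifiers, not the
Hermes implementation): the event type name is hashed into the
identifier space of the broker overlay, and the broker owning that
point becomes the rendezvous node at which advertisements and
subscriptions for the type meet.

    import hashlib

    # Hypothetical broker overlay: node identifiers on a 2**16 ring.
    brokers = [1200, 17003, 33414, 48000, 60111]

    def rendezvous(event_type):
        """Map an event type to its rendezvous broker: hash the type
        name into the identifier space and take the first broker
        clockwise, DHT-style."""
        point = int(hashlib.sha1(event_type.encode()).hexdigest(),
                    16) % 2**16
        for b in brokers:
            if b >= point:
                return b
        return brokers[0]       # wrap around the ring

    # Publishers and subscribers of one type meet at the same node.
    assert rendezvous("StockQuote") == rendezvous("StockQuote")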
The core functionality of an event-based middleware is extended with
three higher-level middleware services that address different
requirements in a distributed computing environment. We introduce a
novel congestion control service that avoids congestion in the
overlay broker network during normal operation and recovery after
failure, and therefore enables a resource-efficient deployment of
the middleware. The expressiveness of subscriptions in the
event-based middleware is enhanced with a composite event service
that performs the distributed detection of complex event patterns,
thus taking the burden away from clients. Finally, a security
service adds access control to Hermes according to a secure
publish/subscribe model. This model supports fine-grained access
control decisions so that separate trust domains can share the same
overlay broker network.
Conversion of notationsBrown, Silas S.University of Cambridge, Computer Laboratory2004-06enTextUCAM-CL-TR-591ISSN 1476-2986
Music, engineering, mathematics, and many other disciplines have
established notations for writing their documents. The effectiveness
of each of these notations can be hampered by the circumstances in
which it is being used, or by a user’s disability or cultural
background. Adjusting the notation can help, but the requirements of
different cases often conflict, meaning that each new document will
have to be transformed between many versions. Tools that support the
programming of such transformations can also assist by allowing the
creation of new notations on demand, which is an under-explored
option in the relief of educational difficulties.
This thesis reviews some programming tools that can be used to
manipulate the tree-like structure of a notation in order to
transform it into another. It then describes a system “4DML” that
allows the programmer to create a “model” of the desired result,
from which the transformation is derived. This is achieved by
representing the structure in a geometric space with many
dimensions, where the model acts as an alternative frame of
reference.
Example applications of 4DML include the transcription of songs and
musical scores into various notations, the production of
specially-customised notations to assist a sight-impaired person in
learning Chinese, an unusual way of re-organising personal notes, a
“website scraping” system for extracting data from on-line services
that provide only one presentation, and an aid to making mathematics
and diagrams accessible to people with severe print disabilities.
The benefits and drawbacks of the 4DML approach are evaluated, and
possible directions for future work are explored.
Unwrapping the ChrysalisBond, MikeCvrček, DanielMurdoch, Steven J.University of Cambridge, Computer Laboratory2004-06enTextUCAM-CL-TR-592ISSN 1476-2986
We describe our experiences reverse engineering the Chrysalis-ITS
Luna CA³ – a PKCS#11 compliant cryptographic token. Emissions
analysis and security API attacks are viewed by many to be simpler
and more efficient than a direct attack on an HSM. But how difficult
is it to actually “go in the front door”? We describe how we
unpicked the CA³ internal architecture and abused its low-level API
to impersonate a CA³ token in its cloning protocol – and extract
PKCS#11 private keys in the clear. We quantify the effort involved
in developing and applying the skills necessary for such a
reverse-engineering attack. In the process, we discover that the
Luna CA³ has far more undocumented code and functionality than is
revealed to the end-user.
Paxos at warZieliński, PiotrUniversity of Cambridge, Computer Laboratory2004-06enTextUCAM-CL-TR-593ISSN 1476-2986
The optimistic latency of Byzantine Paxos can be reduced from three
communication steps to two, without using public-key cryptography.
This is done by making a decision when more than (n+3f)/2 acceptors
report having received the same proposal from the leader, with n
being the total number of acceptors and f the number of faulty ones.
No further improvement in latency is possible, because every
Consensus algorithm must take at least two steps even in benign
settings. Moreover, if the leader is correct, our protocol achieves
the latency of at most three steps, even if some other processes
fail. These two properties make this the fastest Byzantine agreement
protocol proposed so far.
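The decision rule itself is simple to state in code; the sketch
below (illustrative, ignoring message authentication and the rest of
the protocol) counts matching reports and decides once the (n+3f)/2
threshold is strictly exceeded.

    from collections import Counter

    def can_decide(reports, n, f):
        """reports: (acceptor_id, proposal) pairs, one per acceptor
        heard from so far. Returns the proposal that may be decided
        after two steps, or None; the decision needs strictly more
        than (n + 3f)/2 matching reports."""
        counts = Counter(proposal for _, proposal in reports)
        if not counts:
            return None
        value, votes = counts.most_common(1)[0]
        return value if votes > (n + 3 * f) / 2 else None

    # n = 7 acceptors tolerating f = 1 Byzantine fault: the threshold
    # is (7 + 3)/2 = 5, so six matching reports suffice.
    assert can_decide([(i, "v") for i in range(6)], n=7, f=1) == "v"
    assert can_decide([(i, "v") for i in range(5)], n=7, f=1) is None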
By running many instances of this algorithm in parallel, we can
implement Vector Consensus and Byzantine Atomic Broadcast in two and
three steps, respectively, which is two steps faster than any other
known algorithm.
Designing and attacking anonymous communication
systemsDanezis, GeorgeUniversity of Cambridge, Computer Laboratory2004-07enTextUCAM-CL-TR-594ISSN 1476-2986
This report contributes to the field of anonymous communications
over widely deployed communication networks. It describes novel
schemes to protect anonymity; it also presents powerful new attacks
and new ways of analysing and understanding anonymity properties.
We present Mixminion, a new generation anonymous remailer, and
examine its security against all known passive and active
cryptographic attacks. We use the secure anonymous replies it
provides to describe a pseudonym server, as an example of the
anonymous protocols that Mixminion can support. The security of mix
systems is then assessed against a compulsion threat model, in which
an adversary can request the decryption of material from honest
nodes. A new construction, the fs-mix, is presented that makes
tracing messages by such an adversary extremely expensive.
Moving beyond the static security of anonymous communication
protocols, we define a metric based on information theory that can
be used to measure anonymity. The analysis of the pool mix serves as
an example of its use. We then create a framework within which we
compare the traffic analysis resistance provided by different mix
network topologies. A new topology, based on expander graphs, proves
to be efficient and secure. The rgb-mix is also presented; this
implements a strategy to detect flooding attacks against honest mix
nodes and neutralise them by the use of cover traffic.
Finally, a set of generic attacks is studied. Statistical disclosure
attacks model the whole anonymous system as a black box, and are
able to uncover the relationships between long-term correspondents.
Stream attacks trace streams of data travelling through anonymizing
networks, and uncover the communicating parties very quickly. They
both use statistical methods to drastically reduce the anonymity of
users. Other minor attacks are described against peer discovery and
route reconstruction in anonymous networks, as well as the naïve use
of anonymous replies.
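The intuition behind statistical disclosure can be sketched
numerically (synthetic data, not the report's experiments): averaging
the receiver distributions of rounds in which the target sends, and
subtracting the background average, makes the target's correspondents
stand out.

    import numpy as np

    def disclose(target_rounds, background_rounds):
        """Estimate the target's likely correspondents by subtracting
        the average background receiver distribution from the average
        distribution observed when the target sends."""
        t = np.mean(target_rounds, axis=0)
        b = np.mean(background_rounds, axis=0)
        return t - b            # peaks mark probable correspondents

    rng = np.random.default_rng(0)
    n_receivers = 20
    bg = rng.dirichlet(np.ones(n_receivers), size=500)
    # When the target sends, receiver 7 gets an extra share of mass.
    tgt = 0.7 * rng.dirichlet(np.ones(n_receivers), size=500)
    tgt[:, 7] += 0.3
    assert np.argmax(disclose(tgt, bg)) == 7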
Representations of quantum operations, with applications to
quantum cryptographyArrighi, Pablo J.University of Cambridge, Computer Laboratory2004-07enTextUCAM-CL-TR-595ISSN 1476-2986
Representations of quantum operations – We start by introducing a
geometrical representation (real vector space) of quantum states and
quantum operations. To do so we exploit an isomorphism from positive
matrices to a subcone of the Minkowski future light-cone. Pure
states map onto certain light-like vectors, whilst the axis of
revolution encodes the overall probability of occurrence for the
state. This extension of the Generalized Bloch Sphere enables us to
cater for non-trace-preserving quantum operations, and in particular
to view the per-outcome effects of generalized measurements. We show
that these consist of the product of an orthogonal transform about
the axis of the cone of revolution and a positive real symmetric
linear transform. In the case of a qubit the representation becomes
all the more interesting since it elegantly associates, to each
measurement element of a generalized measurement, a Lorentz
transformation in Minkowski space. We formalize explicitly this
correspondence between ‘observation of a quantum system’ and
‘special relativistic change of inertial frame’. To end this part we
review the state-operator correspondence, which was successfully
exploited by Choi to derive the operator-sum representation of
quantum operations. We go further and show that all of the important
theorems concerning quantum operations can in fact be derived as
simple corollaries of those concerning quantum states. Using this
methodology we derive novel composition laws upon quantum states and
quantum operations, Schmidt-type decompositions for bipartite pure
states and some powerful formulae relating to the correspondence.
Quantum cryptography – The key principle of quantum cryptography
could be summarized as follows. Honest parties communicate using
quantum states. To the eavesdropper these states are random and
non-orthogonal. In order to gather information she must measure
them, but this may cause irreversible damage. Honest parties seek to
detect her mischief by checking whether certain quantum states are
left intact. Thus the tradeoff between the eavesdropper’s information
gain and the disturbance she necessarily induces can be viewed as
the power engine behind quantum cryptographic protocols. We begin by
quantifying this tradeoff in the case of a measure distinguishing
two non-orthogonal equiprobable pure states. A formula for this
tradeoff was first obtained by Fuchs and Peres, but we provide a
shorter, geometrical derivation (within the framework of the above
mentioned conal representation). Next we proceed to analyze the
information gain versus disturbance tradeoff in a scenario where
Alice and Bob interleave, at random, pairwise superpositions of two
message words within their otherwise classical communications. This
work constitutes one of the few results currently available
regarding quantum cryptography with d-level systems, and seems to provide
a good general primitive for building such protocols. The proof
crucially relies on the state-operator correspondence formulae
derived in the first part, together with some methods by Banaszek.
Finally we make use of this analysis to prove the security of a
‘blind quantum computation’ protocol, whereby Alice gets Bob to
perform some quantum algorithm for her, but prevents him from
learning her input to this quantum algorithm.
Reconstructing I/OFraser, KeirHand, StevenNeugebauer, RolfPratt, IanWarfield, AndrewWilliamson, MarkUniversity of Cambridge, Computer Laboratory2004-08enTextUCAM-CL-TR-596ISSN 1476-2986
We present a next-generation architecture that addresses problems of
dependability, maintainability, and manageability of I/O devices and
their software drivers on the PC platform. Our architecture resolves
both hardware and software issues, exploiting emerging hardware
features to improve device safety. Our high-performance
implementation, based on the Xen virtual machine monitor, provides
an immediate transition opportunity for today’s systems.
Syntactic simplification and text cohesionSiddharthan, AdvaithUniversity of Cambridge, Computer Laboratory2004-08enTextUCAM-CL-TR-597ISSN 1476-2986
Syntactic simplification is the process of reducing the grammatical
complexity of a text, while retaining its information content and
meaning. The aim of syntactic simplification is to make text easier
to comprehend for human readers, or process by programs. In this
thesis, I describe how syntactic simplification can be achieved
using shallow robust analysis, a small set of hand-crafted
simplification rules and a detailed analysis of the discourse-level
aspects of syntactically rewriting text. I offer a treatment of
relative clauses, apposition, coordination and subordination.
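As a flavour of what a hand-crafted simplification rule does (a toy
regular-expression version, far shallower than the thesis's rules and
ignoring attachment and cohesion), consider splitting a
non-restrictive relative clause into its own sentence:

    import re

    def split_relative(sentence):
        """Toy rule (illustrative, not the thesis's implementation):
        rewrite 'NP, who VP, REST.' as 'NP REST. NP VP.'"""
        m = re.match(r"(?P<np>[A-Z][\w ]*?), who (?P<vp>[^,]+), "
                     r"(?P<rest>.+)\.$", sentence)
        if not m:
            return sentence
        subj, vp, rest = m.group("np", "vp", "rest")
        return f"{subj} {rest}. {subj} {vp}."

    print(split_relative("Mr Smith, who chairs the committee, "
                         "resigned today."))
    # -> Mr Smith resigned today. Mr Smith chairs the committee.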
I present novel techniques for relative clause and appositive
attachment. I argue that these attachment decisions are not purely
syntactic. My approaches rely on a shallow discourse model and on
animacy information obtained from a lexical knowledge base. I also
show how clause and appositive boundaries can be determined reliably
using a decision procedure based on local context, represented by
part-of-speech tags and noun chunks.
I then formalise the interactions that take place between syntax and
discourse during the simplification process. This is important
because the usefulness of syntactic simplification in making a text
accessible to a wider audience can be undermined if the rewritten
text lacks cohesion. I describe how various generation issues like
sentence ordering, cue-word selection, referring-expression
generation, determiner choice and pronominal use can be resolved so
as to preserve conjunctive and anaphoric cohesive-relations during
syntactic simplification.
In order to perform syntactic simplification, I have had to address
various natural language processing problems, including clause and
appositive identification and attachment, pronoun resolution and
referring-expression generation. I evaluate my approaches to solving
each problem individually, and also present a holistic evaluation of
my syntactic simplification system.
Transition systems, link graphs and Petri netsLeifer, James J.Milner, RobinUniversity of Cambridge, Computer Laboratory2004-08enTextUCAM-CL-TR-598ISSN 1476-2986
A framework is defined within which reactive systems can be studied
formally. The framework is based upon s-categories, a new variety of
categories, within which reactive systems can be set up in such a
way that labelled transition systems can be uniformly extracted.
These lead in turn to behavioural preorders and equivalences, such
as the failures preorder (treated elsewhere) and bisimilarity, which
are guaranteed to be congruential. The theory rests upon the notion
of relative pushout previously introduced by the authors. The
framework is applied to a particular graphical model known as link
graphs, which encompasses a variety of calculi for mobile
distributed processes. The specific theory of link graphs is
developed. It is then applied to an established calculus, namely
condition-event Petri nets. In particular, a labelled transition
system is derived for condition-event nets, corresponding to a
natural notion of observable actions in Petri net theory. The
transition system yields a congruential bisimilarity coinciding with
one derived directly from the observable actions. This yields a
calibration of the general theory of reactive systems and link
graphs against known specific theories.
Further analysis of ternary and 3-point univariate
subdivision schemesHassan, Mohamed F.University of Cambridge, Computer Laboratory2004-08enTextUCAM-CL-TR-599ISSN 1476-2986
The precision set, approximation order and Hölder exponent are
derived for each of the univariate subdivision schemes described in
Technical Report UCAM-CL-TR-520.
Trust for resource control: Self-enforcing automatic
rational contracts between computersShand, Brian NinhamUniversity of Cambridge, Computer Laboratory2004-08enTextUCAM-CL-TR-600ISSN 1476-2986
Computer systems need to control access to their resources, in order
to give precedence to urgent or important tasks. This is
increasingly important in networked applications, which need to
interact with other machines but may be subject to abuse unless
protected from attack. To do this effectively, they need an explicit
resource model, and a way to assess others’ actions in terms of it.
This dissertation shows how the actions can be represented using
resource-based computational contracts, together with a rich trust
model which monitors and enforces contract compliance.
Related research in the area has focused on individual aspects of
this problem, such as resource pricing and auctions, trust modelling
and reputation systems, or resource-constrained computing and
resource-aware middleware. These need to be integrated into a single
model, in order to provide a general framework for computing by
contract.
This work explores automatic computerized contracts for negotiating
and controlling resource usage in a distributed system. Contracts
express the terms under which client and server promise to exchange
resources, such as processor time in exchange for money, using a
constrained language which can be automatically interpreted. A
novel, distributed trust model is used to enforce these promises,
and this also supports trust delegation through cryptographic
certificates. The model is formally proved to have appropriate
properties of safety and liveness, which ensure that cheats cannot
systematically gain resources by deceit, and that mutually
profitable contracts continue to be supported.
The contract framework has many applications, in automating
distributed services and in limiting the disruptiveness of users’
programs. Applications such as resource-constrained sandboxes,
operating system multimedia support and automatic distribution of
personal address book entries can all treat the user’s time as a
scarce resource, to trade off computational costs against user
distraction. Similarly, commercial Grid services can prioritise
computations with contracts, while a cooperative service such as
distributed composite event detection can use contracts for detector
placement and load balancing. Thus the contract framework provides a
general purpose tool for managing distributed computation, allowing
participants to take calculated risks and rationally choose which
contracts to perform.
Combining model checking and theorem provingAmjad, HasanUniversity of Cambridge, Computer Laboratory2004-09enTextUCAM-CL-TR-601ISSN 1476-2986
We implement a model checker for the modal mu-calculus as a derived
rule in a fully expansive mechanical theorem prover, without causing
an unacceptable performance penalty.
We use a restricted form of a higher order logic representation
calculus for binary decision diagrams (BDDs) to interface the model
checker to a high-performance BDD engine. This is used with a
formalised theory of the modal mu-calculus (which we also develop)
for model checking in which all steps of the algorithm are justified
by fully expansive proof. This provides a fine-grained integration
of model checking and theorem proving using a mathematically
rigorous interface. The generality of our theories allows us to
perform much of the proof offline, in contrast with earlier work.
This substantially reduces the inevitable performance penalty of
doing model checking by proof.
To demonstrate the feasibility of our approach, optimisations to the
model checking algorithm are added. We add naive caching and also
perform advanced caching for nested non-alternating fixed-point
computations.
Finally, the usefulness of the work is demonstrated. We leverage our
theory by proving translations to simpler logics that are in more
widespread use. We then implement an executable theory for
counterexample-guided abstraction refinement that also uses a SAT
solver. We verify properties of a bus architecture in use in
industry as well as a pedagogical arithmetic and logic unit. The
benchmarks show an acceptable performance penalty, and the results
are correct by construction.
Model checking the AMBA protocol in HOLAmjad, HasanUniversity of Cambridge, Computer Laboratory2004-09enTextUCAM-CL-TR-602ISSN 1476-2986
The Advanced Microcontroller Bus Architecture (AMBA) is an open
System-on-Chip bus protocol for high-performance buses on low-power
devices. In this report we implement a simple model of AMBA and use
model checking and theorem proving to verify latency, arbitration,
coherence and deadlock freedom properties of the implementation.
Bigraphs whose names have multiple localityMilner, RobinUniversity of Cambridge, Computer Laboratory2004-09enTextUCAM-CL-TR-603ISSN 1476-2986
The previous definition of binding bigraphs is generalised so that
local names may be located in more than one region, allowing more
succinct and flexible presentation of bigraphical reactive systems.
This report defines the generalisation, verifies that it retains
relative pushouts, and introduces a new notion of bigraph extension;
this admits a wider class of parametric reaction rules. Extension is
shown to be well-behaved algebraically; one consequence is that, as
in the original definition of bigraphs, discrete parameters are
sufficient to generate all reactions.
On the anonymity of anonymity systemsSerjantov, AndreiUniversity of Cambridge, Computer Laboratory2004-10enTextUCAM-CL-TR-604ISSN 1476-2986
Anonymity on the Internet is a property commonly identified with
privacy of electronic communications. A number of different systems
exist which claim to provide anonymous email and web browsing, but
their effectiveness has hardly been evaluated in practice. In this
thesis we focus on the anonymity properties of such systems. First,
we show how the anonymity of anonymity systems can be quantified,
pointing out flaws with existing metrics and proposing our own. In
the process we distinguish the anonymity of a message and that of an
anonymity system.
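To illustrate the style of metric argued for (a sketch, not the
thesis's exact formulation): if the attacker assigns each principal a
probability of being the sender, the entropy of that distribution,
rather than the raw set size, measures the anonymity provided.

    import math

    def effective_anonymity_bits(probs):
        """Entropy of the attacker's sender distribution, in bits;
        2**entropy acts as an effective anonymity set size."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Both systems have a naive anonymity set of size 4, but the
    # second gives far less anonymity once probabilities are known.
    print(effective_anonymity_bits([0.25] * 4))            # 2.0 bits
    print(effective_anonymity_bits([0.7, 0.1, 0.1, 0.1]))  # ~1.36 bits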
Secondly, we focus on the properties of building blocks of mix-based
(email) anonymity systems, evaluating their resistance to powerful
blending attacks, their delay, their anonymity under normal
conditions and other properties. This leads us to methods of
computing anonymity for a particular class of mixes – timed mixes –
and a new binomial mix.
Next, we look at the anonymity of a message going through an entire
anonymity system based on a mix network architecture. We construct a
semantics of a network with threshold mixes, define the information
observable by an attacker, and give a principled definition of the
anonymity of a message going through such a network.
We then consider low latency connection-based anonymity systems,
giving concrete attacks and describing methods of protection against
them. In particular, we show that Peer-to-Peer anonymity systems
provide less anonymity against the global passive adversary than
ones based on a “classic” architecture.
Finally, we give an account of how anonymity can be used in
censorship resistant systems. These are designed to provide
availability of documents, while facing threats from a powerful
adversary. We show how anonymity can be used to hide the identity of
the servers where each of the documents is stored, thus making them
harder to remove from the system.
Acute: High-level programming language design for
distributed computation : Design rationale and language
definitionSewell, PeterLeifer, James J.Wansbrough, KeithAllen-Williams, MairNardelli, Francesco ZappaHabouzit, PierreVafeiadis, ViktorUniversity of Cambridge, Computer Laboratory2004-10enTextUCAM-CL-TR-605ISSN 1476-2986
This paper studies key issues for distributed programming in
high-level languages. We discuss the design space and describe an
experimental language, Acute, which we have defined and implemented.
Acute extends an OCaml core to support distributed development,
deployment, and execution, allowing type-safe interaction between
separately-built programs. It is expressive enough to enable a wide
variety of distributed infrastructure layers to be written as simple
library code above the byte-string network and persistent store
APIs, disentangling the language runtime from communication.
This requires a synthesis of novel and existing features:
(1) type-safe marshalling of values between programs;
(2) dynamic loading and controlled rebinding to local resources;
(3) modules and abstract types with abstraction boundaries that are
respected by interaction;
(4) global names, generated either freshly or based on module
hashes: at the type level, as runtime names for abstract types; and
at the term level, as channel names and other interaction handles;
(5) versions and version constraints, integrated with type identity;
(6) local concurrency and thread thunkification; and
(7) second-order polymorphism with a namecase construct.
We deal with the interplay among these features and the core, and
develop a semantic definition that tracks abstraction boundaries,
global names, and hashes throughout compilation and execution, but
which still admits an efficient implementation strategy.
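Feature (4) can be illustrated with a small sketch (Python standing
in for the compiler, with a hypothetical naming scheme, not Acute's
actual one): deriving a runtime name for an abstract type from a hash
of its defining module means that separately built programs agree on
type identity exactly when they compiled the same module.

    import hashlib

    def runtime_type_name(module_source, type_name):
        """Hash-generated global name for an abstract type: peers
        that built against byte-identical module source derive the
        same runtime name, so marshalled values can be checked for
        type identity across the network."""
        digest = hashlib.sha256(module_source.encode()).hexdigest()[:16]
        return f"{digest}.{type_name}"

    src = "module Counter = struct type t = int  let zero = 0 end"
    print(runtime_type_name(src, "t"))
    # A peer compiling different source derives a different name, so
    # unmarshalling at that type fails instead of breaking abstraction.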
Dynamic binary analysis and instrumentationNethercote, NicholasUniversity of Cambridge, Computer Laboratory2004-11enTextUCAM-CL-TR-606ISSN 1476-2986
Dynamic binary analysis (DBA) tools such as profilers and checkers
help programmers create better software. Dynamic binary
instrumentation (DBI) frameworks make it easy to build new DBA
tools. This dissertation advances the theory and practice of dynamic
binary analysis and instrumentation, with an emphasis on the
importance of the use and support of metadata.
The dissertation has three main parts.
The first part describes a DBI framework called Valgrind which
provides novel features to support heavyweight DBA tools that
maintain rich metadata, especially location metadata—the shadowing
of every register and memory location with a metavalue. Location
metadata is used in shadow computation, a kind of DBA where every
normal operation is shadowed by an abstract operation.
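A minimal sketch of location metadata and shadow computation
(illustrative only, not Valgrind's internals): every location is
shadowed by a metavalue, here a single definedness bit, and each
operation has a shadow counterpart that combines the metavalues of
its inputs.

    memory = {}     # address -> value
    shadow = {}     # address -> True if the value is fully defined

    def store(addr, value, defined=True):
        memory[addr] = value
        shadow[addr] = defined

    def shadow_add(dst, a, b):
        memory[dst] = memory[a] + memory[b]
        shadow[dst] = shadow[a] and shadow[b]   # the shadow operation

    store(0x10, 5)
    store(0x14, 0, defined=False)   # e.g. an uninitialised word
    shadow_add(0x18, 0x10, 0x14)
    assert shadow[0x18] is False    # a use of 0x18 would be flagged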
The second part describes three powerful DBA tools. The first tool
performs detailed cache profiling. The second tool does an old kind
of dynamic analysis—bounds-checking—in a new way. The third tool
produces dynamic data flow graphs, a novel visualisation that cuts
to the essence of a program’s execution. All three tools were built
with Valgrind, and rely on Valgrind’s support for heavyweight DBA
and rich metadata, and the latter two perform shadow computation.
The third part describes a novel system of semi-formal descriptions
of DBA tools. It gives many example descriptions, and also considers
in detail exactly what dynamic analysis is.
The dissertation makes six main contributions.
First, the descriptions show that metadata is the key component of
dynamic analysis; in particular, whereas static analysis predicts
approximations of a program’s future, dynamic analysis remembers
approximations of a program’s past, and these approximations are
exactly what metadata is.
Second, the example tools show that rich metadata and shadow
computation make for powerful and novel DBA tools that do more than
the traditional tracing and profiling.
Third, Valgrind and the example tools show that a DBI framework can
make it easy to build heavyweight DBA tools, by providing good
support for rich metadata and shadow computation.
Fourth, the descriptions are a precise and concise way of
characterising tools, provide a directed way of thinking about tools
that can lead to better implementations, and indicate the
theoretical upper limit of the power of DBA tools in general.
Fifth, the three example tools are interesting in their own right,
and the latter two are novel.
Finally, the entire dissertation provides many details, and
represents a great deal of condensed experience, about implementing
DBI frameworks and DBA tools.
Code size optimization for embedded processorsJohnson, Neil E.University of Cambridge, Computer Laboratory2004-11enTextUCAM-CL-TR-607ISSN 1476-2986
This thesis studies the problem of reducing code size produced by an
optimizing compiler. We develop the Value State Dependence Graph
(VSDG) as a powerful intermediate form. Nodes represent computation,
and edges represent value (data) and state (control) dependencies
between nodes. The edges specify a partial ordering of the
nodes—sufficient ordering to maintain the I/O semantics of the
source program, while allowing optimizers greater freedom to move
nodes within the program to achieve better (smaller) code.
Optimizations, both classical and new, transform the graph through
graph rewriting rules prior to code generation. Additional
(semantically inessential) state edges are added to transform the
VSDG into a Control Flow Graph, from which target code is generated.
We show how procedural abstraction can be advantageously applied to
the VSDG. Graph patterns are extracted from a program’s VSDG. We
then select repeated patterns giving the greatest size reduction,
generate new functions from these patterns, and replace all
occurrences of the patterns in the original VSDG with calls to these
abstracted functions. Several embedded processors have load- and
store-multiple instructions, representing several loads (or stores)
as one instruction. We present a method, benefiting from the VSDG
form, for using these instructions to reduce code size by
provisionally combining loads and stores before code generation. The
final contribution of this thesis is a combined register allocation
and code motion (RACM) algorithm. We show that our RACM algorithm
formulates these two previously antagonistic phases as one combined
pass over the VSDG, transforming the graph (moving or cloning nodes,
or spilling edges) to fit within the physical resources of the
target processor.
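The load-multiple idea, stripped of the VSDG machinery, can be
pictured as a peephole pass (an illustrative sketch with a made-up
instruction tuple format, not the thesis's implementation):

    def combine_loads(instrs):
        """Fold runs of loads from consecutive offsets off the same
        base register into one load-multiple instruction.
        Instructions are tuples: ('LDR', reg, base, offset)."""
        out, run = [], []

        def flush():
            if len(run) > 1:
                out.append(("LDM", [r for _, r, _, _ in run],
                            run[0][2], run[0][3]))
            else:
                out.extend(run)
            run.clear()

        for op in instrs:
            if (op[0] == "LDR" and run and op[2] == run[-1][2]
                    and op[3] == run[-1][3] + 4):
                run.append(op)
            else:
                flush()
                if op[0] == "LDR":
                    run.append(op)
                else:
                    out.append(op)
        flush()
        return out

    code = [("LDR", "r0", "sp", 0), ("LDR", "r1", "sp", 4),
            ("LDR", "r2", "sp", 8), ("ADD", "r3", "r0", "r1")]
    print(combine_loads(code))
    # [('LDM', ['r0', 'r1', 'r2'], 'sp', 0), ('ADD', 'r3', 'r0', 'r1')]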
We have implemented our ideas within a prototype C compiler and
suite of VSDG optimizers, generating code for the Thumb 32-bit
processor. Our results show improvements for each optimization and
that we can achieve code sizes comparable to, and in some cases
better than, that produced by commercial compilers with significant
investments in optimization technology.
Trust management for widely distributed systemsYao, WaltUniversity of Cambridge, Computer Laboratory2004-11enTextUCAM-CL-TR-608ISSN 1476-2986
In recent years, we have witnessed the evolutionary development of a
new breed of distributed systems. Systems of this type share a
number of characteristics – highly decentralized, of Internet-grade
scalability, and autonomous within their administrative domains.
Most importantly, they are expected to operate collaboratively
across both known and unknown domains. Prime examples include
peer-to-peer applications and open web services. Typically,
authorization in distributed systems is identity-based, e.g. access
control lists. However, approaches based on predefined identities
are unsuitable for the new breed of distributed systems because of
the need to deal with unknown users, i.e. strangers, and the need to
manage a potentially large number of users and/or resources.
Furthermore, effective administration and management of
authorization in such systems requires: (1) natural mapping of
organizational policies into security policies; (2) managing
collaboration of independently administered domains/organizations;
(3) decentralization of security policies and policy enforcement.
This thesis describes Fidelis, a trust management framework designed
to address the authorization needs for the next-generation
distributed systems. The term ‘trust management system’ was coined
to refer to a unified framework for the specification of security
policies, the representation of credentials, and the evaluation and
enforcement of policy compliance. Based on the concept of trust
conveyance and a generic abstraction for trusted information as
trust statements, Fidelis provides a generic platform for building
secure, trust-aware distributed applications. At the heart of the
Fidelis framework is a language for the specification of security
policies, the Fidelis Policy Language (FPL), and the inference model
for evaluating policies expressed in FPL. With the policy language
and its inference model, Fidelis is able to model
recommendation-style policies and policies with arbitrarily complex
chains of trust propagation.
Web services have rapidly been gaining significance both in industry
and research as a ubiquitous, next-generation middleware platform.
The second half of the thesis describes the design and
implementation of the Fidelis framework for the standard web service
platform. The goal of this work is twofold: first, to demonstrate
the practical feasibility of Fidelis, and second, to investigate the
use of a policy-driven trust management framework for Internet-scale
open systems. An important requirement in such systems is trust
negotiation that allows unfamiliar principals to establish mutual
trust and interact with confidence. Addressing this requirement, a
trust negotiation framework built on top of Fidelis is developed.
This thesis examines the application of Fidelis in three distinctive
domains: implementing generic role-based access control, trust
management in the World Wide Web, and an electronic marketplace
comprising unfamiliar and untrusted but collaborative organizations.
Using camera-phones to interact with context-aware mobile
servicesToye, EleanorMadhavapeddy, AnilSharp, RichardScott, DavidBlackwell, AlanUpton, EbenUniversity of Cambridge, Computer Laboratory2004-12enTextUCAM-CL-TR-609ISSN 1476-2986
We describe an interaction technique for controlling site-specific
mobile services using commercially available camera-phones, public
information displays and visual tags. We report results from an
experimental study validating this technique in terms of pointing
speed and accuracy. Our results show that even novices can use
camera-phones to “point-and-click” on visual tags quickly and
accurately. We have built a publicly available client/server
software framework for rapid development of applications that
exploit our interaction technique. We describe two prototype
applications that were implemented using this framework and present
findings from user-experience studies based on these applications.
Influence of syntax on prosodic boundary
predictionIngulfsen, TommyUniversity of Cambridge, Computer Laboratory2004-12enTextUCAM-CL-TR-610ISSN 1476-2986
In this thesis we compare the effectiveness of different syntactic
features and syntactic representations for prosodic boundary
prediction, setting out to clarify which representations are most
suitable for this task. The results of a series of experiments show
that it is not possible to conclude that a single representation is
superior to all others. Three representations give rise to similar
experimental results. One of these representations is composed only
of shallow features, which were originally thought to have less
predictive power than deep features. Conversely, one of the deep
representations that seemed to be best suited for our purposes
(syntactic chunks) turns out not to be among the three best.
An heuristic analysis of the classification of bivariate
subdivision schemesDodgson, Neil A.University of Cambridge, Computer Laboratory2004-12enTextUCAM-CL-TR-611ISSN 1476-2986
Alexa [*] and Ivrissimtzis et al. [+] have proposed a classification
mechanism for bivariate subdivision schemes. Alexa considers
triangular primal schemes; Ivrissimtzis et al. generalise this both
to quadrilateral and hexagonal meshes and to dual and mixed schemes.
I summarise this classification and then proceed to analyse it in
order to determine which classes of subdivision scheme are likely to
contain useful members. My aim is to ascertain whether there are any
potentially useful classes which have not yet been investigated or
whether we can say, with reasonable confidence, that all of the
useful classes have already been considered. I apply heuristics
related to the mappings of element types (vertices, face centres,
and mid-edges) to one another, to the preservation of symmetries, to
the alignment of meshes at different subdivision levels, and to the
size of the overall subdivision mask. My conclusion is that there
are only a small number of useful classes and that most of these
have already been investigated in terms of linear, stationary
subdivision schemes. There is some space for further work,
particularly in the investigation of whether there are useful
ternary linear, stationary subdivision schemes, but it appears that
future advances are more likely to lie elsewhere.
[*] M. Alexa. Refinement operators for triangle meshes. Computer
Aided Geometric Design, 19(3):169-172, 2002.
[+] I. P. Ivrissimtzis, N. A. Dodgson, and M. A. Sabin. A generative
classification of mesh refinement rules with lattice
transformations. Computer Aided Geometric Design, 22(1):99-109,
2004.
Location privacy in ubiquitous computingBeresford, Alastair R.University of Cambridge, Computer Laboratory2005-01enTextUCAM-CL-TR-612ISSN 1476-2986
The field of ubiquitous computing envisages an era when the average
consumer owns hundreds or thousands of mobile and embedded computing
devices. These devices will perform actions based on the context of
their users, and therefore ubiquitous systems will gather, collate
and distribute much more personal information about individuals than
computers do today. Much of this personal information will be
considered private, and therefore mechanisms which allow users to
control the dissemination of these data are vital. Location
information is a particularly useful form of context in ubiquitous
computing, yet its unconditional distribution can be very invasive.
This dissertation develops novel methods for providing location
privacy in ubiquitous computing. Much of the previous work in this
area uses access control to enable location privacy. This
dissertation takes a different approach and argues that many
location-aware applications can function with anonymised location
data and that, where this is possible, its use is preferable to that
of access control.
Suitable anonymisation of location data is not a trivial task: under
a realistic threat model, simply removing explicit identifiers does
not anonymise location information. This dissertation describes why
this is the case and develops two quantitative security models for
anonymising location data: the mix zone model and the variable
quality model.
A trusted third-party can use one, or both, models to ensure that
all location events given to untrusted applications are suitably
anonymised. The mix zone model supports untrusted applications which
require accurate location information about users in a set of
disjoint physical locations. In contrast, the variable quality model
reduces the temporal or spatial accuracy of location information to
maintain user anonymity at every location.
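A toy calculation in the spirit of the mix zone model (illustrative,
not the dissertation's formulation): the attacker weighs each
possible assignment of ingress pseudonyms to egress pseudonyms, and
the entropy of that distribution quantifies the anonymity the zone
provided.

    import itertools, math

    def mix_zone_entropy(weights):
        """weights[i][j]: attacker's likelihood that ingress i left
        as egress j (from timing or movement models, say). Entropy is
        taken over complete one-to-one assignments."""
        n = len(weights)
        raw = [math.prod(weights[i][perm[i]] for i in range(n))
               for perm in itertools.permutations(range(n))]
        total = sum(raw)
        probs = [w / total for w in raw if w > 0]
        return -sum(p * math.log2(p) for p in probs)

    # Two users crossing the zone: indistinguishable timings give the
    # full 1 bit; skewed likelihoods leave almost no anonymity.
    print(mix_zone_entropy([[0.5, 0.5], [0.5, 0.5]]))   # 1.0 bit
    print(mix_zone_entropy([[0.9, 0.1], [0.1, 0.9]]))   # ~0.1 bits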
Both models provide a quantitative measure of the level of anonymity
achieved; therefore any given situation can be analysed to determine
the amount of information an attacker can gain through analysis of
the anonymised data. The suitability of both these models is
demonstrated and the level of location privacy available to users of
real location-aware applications is measured.
Abstracting application-level security policy for ubiquitous
computingScott, David J.University of Cambridge, Computer Laboratory2005-01enTextUCAM-CL-TR-613ISSN 1476-2986
In the future world of Ubiquitous Computing, tiny embedded networked
computers will be found in everything from mobile phones to
microwave ovens. Thanks to improvements in technology and software
engineering, these computers will be capable of running
sophisticated new applications constructed from mobile agents.
Inevitably, many of these systems will contain application-level
vulnerabilities; errors caused by either unanticipated mobility or
interface behaviour. Unfortunately existing methods for applying
security policy – network firewalls – are inadequate to control and
protect the hordes of vulnerable mobile devices. As more and more
critical functions are handled by these systems, the potential for
disaster is increasing rapidly.
To counter these new threats, this report champions the approach of
using new application-level security policy languages in combination
to protect vulnerable applications. Policies are abstracted from
main application code, facilitating both analysis and future
maintenance. As well as protecting existing applications, such
policy systems can help as part of a security-aware design process
when building new applications from scratch.
Three new application-level policy languages are contributed each
addressing a different kind of vulnerability. Firstly, the policy
language MRPL allows the creation of Mobility Restriction Policies,
based on a unified spatial model which represents both physical
location of objects as well as virtual location of mobile code.
Secondly, the policy language SPDL-2 protects applications against a
large number of common errors by allowing the specification of
per-request/response validation and transformation rules. Thirdly,
the policy language SWIL allows interfaces to be described as
automata which may be analysed statically by a model-checker before
being checked dynamically in an application-level firewall. When
combined together, these three languages provide an effective means
for preventing otherwise critical application-level vulnerabilities.
Systems implementing these policy languages have been built; an
implementation framework is described and encouraging performance
results and analysis are presented.
Pure bigraphsMilner, RobinUniversity of Cambridge, Computer Laboratory2005-01enTextUCAM-CL-TR-614ISSN 1476-2986
Bigraphs are graphs whose nodes may be nested, representing
locality, independently of the edges connecting them. They may be
equipped with reaction rules, forming a bigraphical reactive system
(Brs) in which bigraphs can reconfigure themselves. Brss aim to
unify process calculi, and to model applications —such as pervasive
computing— where locality and mobility are prominent. The paper is
devoted to the theory of pure bigraphs, which underlie various more
refined forms. It begins by developing a more abstract structure, a
wide reactive system Wrs, of which a Brs is an instance; in this
context, labelled transitions are defined in such a way that the
induced bisimilarity is a congruence.
This work is then specialised to Brss, whose graphical structure
allows many refinements of the dynamic theory. Elsewhere it is shown
that behavioural analysis for Petri nets, π-calculus and mobile
ambients can all be recovered in the uniform framework of bigraphs.
The latter part of the paper emphasizes the parts of bigraphical
theory that are common to these applications, especially the
treatment of dynamics via labelled transitions. As a running
example, the theory is applied to finite pure CCS, whose resulting
transition system and bisimilarity are analysed in detail.
The paper also discusses briefly the use of bigraphs to model both
pervasive computing and biological systems.
Global public computingKotsovinos, EvangelosUniversity of Cambridge, Computer Laboratory2005-01enTextUCAM-CL-TR-615ISSN 1476-2986
High-bandwidth networking and cheap computing hardware are leading
to a world in which the resources of one machine are available to
groups of users beyond their immediate owner. This trend is visible
in many different settings. Distributed computing, where
applications are divided into parts that run on different machines
for load distribution, geographical dispersion, or robustness, has
recently found new fertile ground. Grid computing promises to
provide a common framework for scheduling scientific computation and
managing the associated large data sets. Proposals for utility
computing envision a world in which businesses rent computing
bandwidth in server farms on-demand instead of purchasing and
maintaining servers themselves.
All such architectures target particular user and application groups
or deployment scenarios, where simplifying assumptions can be made.
They expect centralised ownership of resources, cooperative users,
and applications that are well-behaved and compliant with a specific
API or middleware. Members of the public who are not involved in
Grid communities or wish to deploy out-of-the-box distributed
services, such as game servers, have no means to acquire resources
on large numbers of machines around the world to launch their tasks.
This dissertation proposes a new distributed computing paradigm,
termed global public computing, which allows any user to run any
code anywhere. Such platforms price computing resources, and
ultimately charge users for resources consumed. This dissertation
presents the design and implementation of the XenoServer Open
Platform, putting this vision into practice. The efficiency and
scalability of the developed mechanisms are demonstrated by
experimental evaluation; the prototype platform allows the
global-scale deployment of complex services in less than 45 seconds,
and could scale to millions of concurrent sessions without
presenting performance bottlenecks.
To facilitate global public computing, this work addresses several
research challenges. It introduces reusable mechanisms for
representing, advertising, and supporting the discovery of
resources. To allow flexible and federated control of resource
allocation by all stakeholders involved, it proposes a novel
role-based resource management framework for expressing and
combining distributed management policies. Furthermore, it
implements effective service deployment models for launching
distributed services on large numbers of machines around the world
easily, quickly, and efficiently. To keep track of resource
consumption and pass charges on to consumers, it devises an
accounting and charging infrastructure.
Dictionary characteristics in cross-language information
retrievalNic Gearailt, DonnlaUniversity of Cambridge, Computer Laboratory2005-02enTextUCAM-CL-TR-616ISSN 1476-2986
In the absence of resources such as a suitable MT system,
translation in Cross-Language Information Retrieval (CLIR) consists
primarily of mapping query terms to a semantically equivalent
representation in the target language. This can be accomplished by
looking up each term in a simple bilingual dictionary. The main
problem here is deciding which of the translations provided by the
dictionary for each query term should be included in the query
translation. We tackled this problem by examining different
characteristics of the system dictionary. We found that dictionary
properties such as scale (the average number of translations per
term), translation repetition (providing the same translation for a
term more than once in a dictionary entry, for example, for
different senses of a term), and dictionary coverage rate (the
percentage of query terms for which the dictionary provides a
translation) can have a profound effect on retrieval performance.
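These three characteristics are straightforward to compute for a
given system dictionary; the sketch below uses a toy bilingual
dictionary with hypothetical entries.

    dictionary = {
        "bank":  ["banc", "banque", "rive", "banque"],  # repeated sense
        "river": ["rivière", "fleuve"],
        "walk":  ["marcher"],
    }

    query = ["river", "bank", "erosion"]

    scale = sum(len(ts) for ts in dictionary.values()) / len(dictionary)
    repetition = sum(len(ts) - len(set(ts))
                     for ts in dictionary.values())
    coverage = sum(t in dictionary for t in query) / len(query)

    print(f"scale (avg translations/term): {scale:.2f}")   # 2.33
    print(f"repeated translations:        {repetition}")   # 1
    print(f"coverage rate for this query: {coverage:.0%}") # 67%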
Dictionary properties were explored in a series of carefully
controlled tests, designed to evaluate specific hypotheses. These
experiments showed that (a) contrary to expectation, smaller-scale
dictionaries resulted in better performance than large-scale ones,
and (b) when appropriately managed, e.g. through strategies to ensure
adequate translational coverage, dictionary-based CLIR could perform
as well as other CLIR methods discussed in the literature. Our
experiments showed that it is possible to implement an effective
CLIR system with no resources other than the system dictionary
itself, provided this dictionary is chosen with careful examination
of its characteristics, removing any dependency on outside
resources.
Pocket Switched Networks: Real-world mobility and its
consequences for opportunistic forwardingChaintreau, AugustinHui, PanCrowcroft, JonDiot, ChristopheGass, RichardScott, JamesUniversity of Cambridge, Computer Laboratory2005-02enTextUCAM-CL-TR-617ISSN 1476-2986
Opportunistic networks make use of human mobility and local
forwarding in order to distribute data. Information can be stored
and passed, taking advantage of the device mobility, or forwarded
over a wireless link when an appropriate contact is met. Such
networks fall into the fields of mobile ad-hoc networking and
delay-tolerant networking. In order to evaluate forwarding
algorithms for these networks, accurate data is needed on the
intermittency of connections.
In this paper, the inter-contact time between two transmission
opportunities is observed empirically using four distinct sets of
data, two having been specifically collected for this work, and two
provided by other research groups.
We discover that the distribution of inter-contact time follows an
approximate power law over a large time range in all data sets. This
observation is at odds with the exponential decay expected by many
currently used mobility models. We demonstrate that opportunistic
transmission schemes designed around these current models have poor
performance under approximate power-law conditions, but could be
significantly improved by using limited redundant transmissions.
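In terms of the complementary cumulative distribution function of the inter-contact time T (notation ours), the contrast the paper draws can be written as

    P[T > t] \approx c\,t^{-\alpha} \quad\text{(observed)} \qquad\text{versus}\qquad P[T > t] = e^{-\lambda t} \quad\text{(assumed)}.

The heavy tail of the former makes very long gaps between transmission opportunities far more common than exponential mobility models predict, which is why forwarding schemes tuned to those models underperform.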
The Fresh Approach: functional programming with names and
bindersShinwell, Mark R.University of Cambridge, Computer Laboratory2005-02enTextUCAM-CL-TR-618ISSN 1476-2986
This report concerns the development of a language called Fresh
Objective Caml, which is an extension of the Objective Caml language
providing facilities for the manipulation of data structures
representing syntax involving α-convertible names and binding
operations.
After an introductory chapter which includes a survey of related
work, we describe the Fresh Objective Caml language in detail. Next,
we proceed to formalise a small core language which captures the
essence of Fresh Objective Caml; we call this Mini-FreshML. We
provide two varieties of operational semantics for this language and
prove them equivalent. Then in order to prove correctness properties
of representations of syntax in the language we introduce a new
variety of domain theory called FM-domain theory, based on the
permutation model of name binding from Pitts and Gabbay. We show how
classical domain-theoretic constructions—including those for the
solution of recursive domain equations—fall naturally into this
setting, where they are augmented by new constructions to handle
name-binding.
After developing the necessary domain theory, we demonstrate how it
may be exploited to give a monadic denotational semantics to
Mini-FreshML. This semantics in itself is quite novel and
demonstrates how a simple monad of continuations is sufficient to
model dynamic allocation of names. We prove that our denotational
semantics is computationally adequate with respect to the
operational semantics—in other words, equality of denotation implies
observational equivalence. After this, we show how the denotational
semantics may be used to prove our desired correctness properties.
In the penultimate chapter, we examine the implementation of Fresh
Objective Caml, describing detailed issues in the compiler and
runtime systems. Then in the final chapter we close the report with
a discussion of future avenues of research and an assessment of the
work completed so far.
Operating system support for simultaneous multithreaded
processorsBulpin, James R.University of Cambridge, Computer Laboratory2005-02enTextUCAM-CL-TR-619ISSN 1476-2986
Simultaneous multithreaded (SMT) processors are able to execute
multiple application threads in parallel in order to improve the
utilisation of the processor’s execution resources. The improved
utilisation provides a higher processor-wide throughput at the
expense of the performance of each individual thread.
Simultaneous multithreading has recently been incorporated into the
Intel Pentium 4 processor family as “Hyper-Threading”. While there
is already basic support for it in popular operating systems, that
support does not take advantage of any knowledge about the
characteristics of SMT, and therefore does not fully exploit the
processor.
SMT presents a number of challenges to operating system designers.
The threads’ dynamic sharing of processor resources means that there
are complex performance interactions between threads. These
interactions are often unknown, poorly understood, or hard to avoid.
As a result such interactions tend to be ignored leading to a lower
processor throughput.
In this dissertation I start by describing simultaneous
multithreading and the hardware implementations of it. I discuss
areas of operating system support that are either necessary or
desirable.
I present a detailed study of a real SMT processor, the Intel
Hyper-Threaded Pentium 4, and describe the performance interactions
between threads. I analyse the results using information from the
processor’s performance monitoring hardware.
Building on the understanding of the processor’s operation gained
from the analysis, I present a design for an operating system
process scheduler that takes into account the characteristics of the
processor and the workloads in order to improve the system-wide
throughput. I evaluate designs exploiting various levels of
processor-specific knowledge.
I finish by discussing alternative ways to exploit SMT processors.
These include the partitioning of applications and hardware interrupt
handling onto separate simultaneous threads. I present preliminary
experiments to evaluate the effectiveness of this technique.
Middleware support for context-awareness in distributed
sensor-driven systemsKatsiri, EleftheriaUniversity of Cambridge, Computer Laboratory2005-02enTextUCAM-CL-TR-620ISSN 1476-2986
Context-awareness concerns the ability of computing devices to
detect, interpret and respond to aspects of the user’s local
environment. Sentient Computing is a sensor-driven programming
paradigm which maintains an event-based, dynamic model of the
environment which can be used by applications in order to drive
changes in their behaviour, thus achieving context-awareness.
However, primitive events, especially those arising from sensors,
e.g., that a user is at position {x,y,z}, are too low-level to be
meaningful to applications. Existing models for creating
higher-level, more meaningful events, from low-level events, are
insufficient to capture the user’s intuition about abstract system
state. Furthermore, there is a strong need for user-centred
application development, without undue programming overhead.
Applications need to be created dynamically and remain functional
independently of the distributed nature and heterogeneity of
sensor-driven systems, even while the user is mobile. Both issues
combined necessitate an alternative model for developing
applications in a real-time, distributed sensor-driven environment
such as Sentient Computing.
This dissertation describes the design and implementation of the
SCAFOS framework. SCAFOS has two novel aspects. Firstly, it provides
powerful tools for inferring abstract knowledge from low-level,
concrete knowledge, verifying its correctness and estimating its
likelihood. Such tools include Hidden Markov Models, a Bayesian
Classifier, Temporal First-Order Logic, the theorem prover SPASS and
the production system CLIPS. Secondly, SCAFOS provides support for
simple application development through the XML-based SCALA language.
By introducing the new concept of a generalised event, an abstract
event, defined as a notification of changes in abstract system
state, expressiveness compatible with human intuition is achieved
when using SCALA. The applications that are created through SCALA
are automatically integrated and operate seamlessly in the various
heterogeneous components of the context-aware environment even while
the user is mobile or when new entities or other applications are
added or removed in SCAFOS.
Fresh Objective Caml user manualShinwell, Mark R.Pitts, Andrew M.University of Cambridge, Computer Laboratory2005-02enTextUCAM-CL-TR-621ISSN 1476-2986
This technical report is the user manual for the Fresh Objective
Caml system, which implements a functional programming language
incorporating facilities for manipulating syntax involving names and
binding operations.
Cooperation and deviation in market-based resource
allocationLepler, Jörg H.University of Cambridge, Computer Laboratory2005-03enTextUCAM-CL-TR-622ISSN 1476-2986
This thesis investigates how business transactions are enhanced
through competing strategies for economically motivated cooperation.
To this end, it focuses on the setting of a distributed, bilateral
allocation protocol for electronic services and resources.
Cooperative efforts like these are often threatened by transaction
parties who aim to exploit their competitors by deviating from
so-called cooperative goals. We analyse this conflict between
cooperation and deviation by presenting the case of two novel market
systems which use economic incentives to solve the complications
that arise from cooperation.
The first of the two systems is a pricing model which is designed to
address the problematic resource market situation, where supply
exceeds demand and perfect competition can make prices collapse to
zero. This pricing model uses supply functions to determine the
optimal Nash equilibrium price. Moreover, in this model the
providers’ market estimations are updated with information about
each of their own transactions. Here, we implement the protocol in a
discrete event simulation, to show that the equilibrium prices are
above competitive levels, and to demonstrate that deviations from
the pricing model are not profitable.
The second of the two systems is a reputation aggregation model,
which (1) seeks the subgroup of raters that contains the largest
degree of overall agreement, and (2) derives the resulting reputation
scores from their comments. In order to seek agreement, this model
assumes that not all raters in the system are equally able to foster
an agreement. Based on the variances of the raters’ comments, the
system derives a notion of the reputation for each rater, which is
in turn fed back into the model’s recursive scoring algorithm. We
demonstrate the convergence of this algorithm, and show the
model’s effectiveness at discriminating between poor
and strong raters. Then with a series of threat models, we show how
resilient this model is in terms of finding agreement, despite large
collectives of malicious raters. Finally, in a practical example, we
apply the model to the academic peer review process in order to show
its versatility at establishing a ranking of rated objects.
Simple polymorphic usage analysisWansbrough, KeithUniversity of Cambridge, Computer Laboratory2005-03enTextUCAM-CL-TR-623ISSN 1476-2986
Implementations of lazy functional languages ensure that
computations are performed only when they are needed, and save the
results so that they are not repeated. This frees the programmer to
describe solutions at a high level, leaving details of control flow
to the compiler.
This freedom however places a heavy burden on the compiler;
measurements show that over 70% of these saved results are never
used again. A usage analysis that could statically detect values
used at most once would enable these wasted updates to be avoided,
and would be of great benefit. However, existing usage analyses
either give poor results or have been applied only to prototype
compilers or toy languages.
This thesis presents a sound, practical, type-based usage analysis
that copes with all the language features of a modern functional
language, including type polymorphism and user-defined algebraic
data types, and addresses a range of problems that have caused
difficulty for previous analyses, including poisoning, mutual
recursion, separate compilation, and partial application and usage
dependencies. In addition to well-typing rules, an inference
algorithm is developed, with proofs of soundness and a complexity
analysis.
In the process, the thesis develops simple polymorphism, a novel
approach to polymorphism in the presence of subtyping that attempts
to strike a balance between pragmatic concerns and expressive power.
This thesis may be considered an extended experiment into this
approach, worked out in some detail but not yet conclusive.
The analysis described was designed in parallel with a full
implementation in the Glasgow Haskell Compiler, leading to informed
design choices, thorough coverage of language features, and accurate
measurements of its potential and effectiveness when used on real
code. The latter demonstrate that the analysis yields moderate
benefit in practice.
TCP, UDP, and Sockets: rigorous and experimentally-validated
behavioural specification : Volume 1: OverviewBishop, SteveFairbairn, MatthewNorrish, MichaelSewell, PeterSmith, MichaelWansbrough, KeithUniversity of Cambridge, Computer Laboratory2005-03enTextUCAM-CL-TR-624ISSN 1476-2986
We have developed a mathematically rigorous and
experimentally-validated post-hoc specification of the behaviour of
TCP, UDP, and the Sockets API. It characterises the API and
network-interface interactions of a host, using operational
semantics in the higher-order logic of the HOL automated proof
assistant. The specification is detailed, covering almost all the
information of the real-world communications: it is in terms of
individual TCP segments and UDP datagrams, though it abstracts from
the internals of IP. It has broad coverage, dealing with arbitrary
API call sequences and incoming messages, not just some well-behaved
usage. It is also accurate, closely based on the de facto standard
of (three of) the widely-deployed implementations. To ensure this we
have adopted a novel experimental semantics approach, developing
test generation tools and symbolic higher-order-logic model checking
techniques that let us validate the specification directly against
several thousand traces captured from the implementations.
The resulting specification, which is annotated for the
non-HOL-specialist reader, may be useful as an informal reference
for TCP/IP stack implementors and Sockets API users, supplementing
the existing informal standards and texts. It can also provide a
basis for high-fidelity automated testing of future implementations,
and a basis for design and formal proof of higher-level
communication layers. More generally, the work demonstrates that it
is feasible to carry out similar rigorous specification work at
design-time for new protocols. We discuss how such a design-for-test
approach should influence protocol development, leading to protocol
specifications that are both unambiguous and clear, and to
high-quality implementations that can be tested directly against
those specifications.
This document (Volume 1) gives an overview of the project,
discussing the goals and techniques and giving an introduction to
the specification. The specification itself is given in the
companion Volume 2 (UCAM-CL-TR-625), which is automatically typeset
from the (extensively annotated) HOL source. As far as possible we
have tried to make the work accessible to four groups of intended
readers: workers in networking (implementors of TCP/IP stacks, and
designers of new protocols); in distributed systems (implementors of
software above the Sockets API); in distributed algorithms (for whom
this may make it possible to prove properties about executable
implementations of those algorithms); and in semantics and automated
reasoning.
TCP, UDP, and Sockets: rigorous and experimentally-validated
behavioural specification : Volume 2: The SpecificationBishop, SteveFairbairn, MatthewNorrish, MichaelSewell, PeterSmith, MichaelWansbrough, KeithUniversity of Cambridge, Computer Laboratory2005-03enTextUCAM-CL-TR-625ISSN 1476-2986
See Volume 1 (UCAM-CL-TR-624).
Landmark Guided Forwarding: A hybrid approach for Ad Hoc
routingLim, Meng HowGreenhalgh, AdamChesterfield, JulianCrowcroft, JonUniversity of Cambridge, Computer Laboratory2005-03enTextUCAM-CL-TR-626ISSN 1476-2986
Wireless Ad Hoc network routing presents some extremely challenging
research problems, trying to optimize parameters such as energy
conservation vs connectivity and global optimization vs routing
overhead scalability. In this paper we focus on the problems of
maintaining network connectivity in the presence of node mobility
whilst providing globally efficient and robust routing. The common
approach among existing wireless Ad Hoc routing solutions is to
establish a global optimal path between a source and a destination.
We argue that establishing a globally optimal path is both
unreliable and unsustainable as the network diameter, traffic
volume, and number of nodes all increase in the presence of moderate
node mobility. To address this we propose Landmark Guided Forwarding
(LGF), a protocol that provides a hybrid solution of topological and
geographical routing algorithms. We demonstrate that LGF is adaptive
to unstable connectivity and scalable to large networks. Our results
therefore indicate that Landmark Guided Forwarding converges much
faster, scales better and adapts well within a dynamic wireless Ad
Hoc environment in comparison to existing solutions.
Efficient computer interfaces using continuous gestures,
language models, and speechVertanen, KeithUniversity of Cambridge, Computer Laboratory2005-03enTextUCAM-CL-TR-627ISSN 1476-2986
Despite advances in speech recognition technology, users of
dictation systems still face a significant amount of work to correct
errors made by the recognizer. The goal of this work is to
investigate the use of a continuous gesture-based data entry
interface to provide an efficient and fun way for users to correct
recognition errors. Towards this goal, techniques are investigated
which expand a recognizer’s results to help cover recognition
errors. Additionally, models are developed which utilize a speech
recognizer’s n-best list to build letter-based language models.
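As a hypothetical illustration of this last step (ours, not the thesis's code), letter bigram counts can be pooled across the hypotheses of an n-best list:

    def letter_bigram_counts(nbest_hypotheses):
        # nbest_hypotheses: recogniser output strings, best first.
        counts = {}
        for hyp in nbest_hypotheses:
            for a, b in zip(hyp, hyp[1:]):
                counts[(a, b)] = counts.get((a, b), 0) + 1
        return counts  # normalise per first letter to obtain probabilities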
A formal security policy for an NHS electronic health record
serviceBecker, Moritz Y.University of Cambridge, Computer Laboratory2005-03enTextUCAM-CL-TR-628ISSN 1476-2986
The ongoing NHS project for the development of a UK-wide electronic
health records service, also known as the ‘Spine’, raises many
controversial issues and technical challenges concerning the
security and confidentiality of patient-identifiable clinical data.
As the system will need to be constantly adapted to comply with
evolving legal requirements and guidelines, the Spine’s
authorisation policy should not be hard-coded into the system but
rather be specified in a high-level, general-purpose,
machine-enforceable policy language.
We describe a complete authorisation policy for the Spine and
related services, written for the trust management system Cassandra,
and comprising 375 formal rules. The policy is based on the NHS’s
Output-based Specification (OBS) document and deals with all
requirements concerning access control of patient-identifiable data,
including legitimate relationships, patients restricting access,
authenticated express consent, third-party consent, and workgroup
management.
Hybrid routing: A pragmatic approach to mitigating position
uncertainty in geo-routingLim, Meng HowGreenhalgh, AdamChesterfield, JulianCrowcroft, JonUniversity of Cambridge, Computer Laboratory2005-04enTextUCAM-CL-TR-629ISSN 1476-2986
In recent years, research in wireless Ad Hoc routing seems to be
moving towards the approach of position based forwarding. Amongst
proposed algorithms, Greedy Perimeter Stateless Routing has gained
recognition for guaranteed delivery with modest network overheads.
Although this addresses the scaling limitations with topological
routing, it has limited tolerance for position inaccuracy or stale
state reported by a location service. Several researchers have
demonstrated that the inaccuracy of the positional system could have
a catastrophic effect on position based routing protocols. In this
paper, we evaluate how the negative effects of position inaccuracy
can be countered by extending position based forwarding with a
combination of restrictive topological state, adaptive route
advertisement and hybrid forwarding. Our results show that a hybrid
of the position and topology approaches used in Landmark Guided
Forwarding yields a high goodput and timely packet delivery, even
with 200 meters of position error.
Semi-invasive attacks – A new approach to hardware security
analysisSkorobogatov, Sergei P.University of Cambridge, Computer Laboratory2005-04enTextUCAM-CL-TR-630ISSN 1476-2986
Semiconductor chips are used today not only to control systems, but
also to protect them against security threats. A continuous battle
is waged between manufacturers who invent new security solutions,
learning their lessons from previous mistakes, and the hacker
community, constantly trying to break implemented protections. Some
chip manufacturers do not pay enough attention to the proper design
and testing of protection mechanisms. Even where they claim their
products are highly secure, they do not guarantee this and do not
take any responsibility if a device is compromised. In this
situation, it is crucial for the design engineer to have a
convenient and reliable method of testing secure chips.
This thesis presents a wide range of attacks on hardware security in
microcontrollers and smartcards. This includes already known
non-invasive attacks, such as power analysis and glitching, and
invasive attacks, such as reverse engineering and microprobing. A
new class of attacks – semi-invasive attacks – is introduced. Like
invasive attacks, they require depackaging the chip to get access to
its surface. But the passivation layer remains intact, as these
methods do not require electrical contact to internal lines.
Semi-invasive attacks stand between non-invasive and invasive
attacks. They represent a greater threat to hardware security, as
they are almost as effective as invasive attacks but can be low-cost
like non-invasive attacks.
This thesis’ contribution includes practical fault-injection attacks
to modify SRAM and EEPROM content, or change the state of any
individual CMOS transistor on a chip. This leads to almost unlimited
capabilities to control chip operation and circumvent protection
mechanisms. A second contribution consists of experiments on data
remanence, which show that it is feasible to extract information
from powered-off SRAM and erased EPROM, EEPROM and Flash memory
devices.
A brief introduction to copy protection in microcontrollers is
given. Hardware security evaluation techniques using semi-invasive
methods are introduced. They should help developers to make a proper
selection of components according to the required level of security.
Various defence technologies are discussed, from low-cost obscurity
methods to new approaches in silicon design.
MIRRORS: An integrated framework for capturing real world
behaviour for models of ad hoc networksHu, WenjunCrowcroft, JonUniversity of Cambridge, Computer Laboratory2005-04enTextUCAM-CL-TR-631ISSN 1476-2986
The simulation models used in mobile ad hoc network research have
been criticised for lack of realism. While credited with ease of
understanding and implementation, they are often based on
theoretical models, rather than real world observations. Criticisms
have centred on radio propagation or mobility models.
In this work, we take an integrated approach to modelling the real
world that underlies a mobile ad hoc network. While pointing out the
correlations between the space, radio propagation and mobility
models, we use mobility as a focal point to propose a new framework,
MIRRORS, that captures real world behaviour. We give the formulation
of a specific model within the framework and present simulation
results that reflect topology properties of the networks
synthesised. Compared with the existing models studied, our model
better represents real world topology properties and presents a wider
spectrum of variation in the metrics examined, due to the model
encapsulating more detailed dynamics. While the common approach is
to focus on performance evaluation of existing protocols using these
models, we discuss protocol design opportunities across layers in
view of the simulation results.
Between shallow and deep: an experiment in automatic
summarisingTucker, R.I.Spärck Jones, KarenUniversity of Cambridge, Computer Laboratory2005-04enTextUCAM-CL-TR-632ISSN 1476-2986
This paper describes an experiment in automatic summarising using a
general-purpose strategy based on a compromise between shallow and
deep processing. The method combines source text analysis into
simple logical forms with the use of a semantic graph for
representation and operations on the graph to identify summary
content.
The graph is based on predications extracted from the logical forms,
and the summary operations apply three criteria, namely importance,
representativeness, and cohesiveness, in choosing node sets to form
the content representation for the summary. This is used in
different ways for output summaries. The paper presents the
motivation for the strategy, details of the CLASP system, and the
results of initial testing and evaluation on news material.
On deadlock, livelock, and forward progressHo, AlexSmith, StevenHand, StevenUniversity of Cambridge, Computer Laboratory2005-05enTextUCAM-CL-TR-633ISSN 1476-2986
Deadlock and livelock can happen at many different levels in a
distributed system. We unify both around the concept of forward
progress and standstill. We describe a framework capable of
detecting the lack of forward progress in distributed systems. Our
prototype can easily solve traditional deadlock problems where
synchronization is via a custom network protocol; however, many
interesting research challenges remain.
Visualisation, interpretation and use of location-aware
interfacesRehman, KasimUniversity of Cambridge, Computer Laboratory2005-05enTextUCAM-CL-TR-634ISSN 1476-2986
Ubiquitous Computing (Ubicomp), a term coined by Mark Weiser in the
early 1990’s, is about transparently equipping the physical
environment and everyday objects in it with computational, sensing
and networking abilities. In contrast with traditional desktop
computing the “computer” moves into the background, unobtrusively
supporting users in their everyday life.
One of the instantiations of Ubicomp is location-aware computing.
Using location sensors, the “computer” reacts to changes in location
of users and everyday objects. Location changes are used to infer
user intent in order to give the user the most appropriate support
for the task she is performing. Such support can consist of
automatically providing information or configuring devices and
applications deemed adequate for the inferred user task.
Experience with these applications has uncovered a number of
usability problems that stem from the fact that the “computer” in
this paradigm has become unidentifiable for the user. More
specifically, these arise from lack of feedback from, loss of user
control over, and the inability to provide a conceptual model of the
“computer”.
Starting from the proven premise that feedback is indispensable for
smooth human-machine interaction, a system that uses Augmented
Reality in order to visually provide information about the state of
a location-aware environment and devices in it, is designed and
implemented.
Augmented Reality (AR) as it is understood for the purpose of this
research uses a see-through head-mounted display, trackers and
3-dimensional (3D) graphics in order to give users the illusion that
3-dimensional graphical objects specified and generated on a
computer are actually located in the real world.
The system described in this thesis can be called a Graphical User
Interface (GUI) for a physical environment. Properties of GUIs for
desktop environments are used as a valuable resource in designing a
software architecture that supports interactivity in a
location-aware environment, understanding how users might
conceptualise the “computer” and extracting design principles for
visualisation in a Ubicomp environment.
Most importantly this research offers a solution to fundamental
interaction problems in Ubicomp environments. In doing so this
research presents the next step from reactive environments to
interactive environments.
Results from 200 billion iris cross-comparisonsDaugman, JohnUniversity of Cambridge, Computer Laboratory2005-06enTextUCAM-CL-TR-635ISSN 1476-2986
Statistical results are presented for biometric recognition of
persons by their iris patterns, based on 200 billion
cross-comparisons between different eyes. The database consisted of
632,500 iris images acquired in the Middle East, in a national
border-crossing protection programme that uses the Daugman
algorithms for iris recognition. A total of 152 different
nationalities were represented in this database. The set of
exhaustive cross-comparisons between all possible pairings of irises
in the database shows that with reasonable acceptance thresholds,
the False Match rate is less than 1 in 200 billion. Recommendations
are given for the numerical decision threshold policy that would
enable reliable identification performance on a national scale in
the UK.
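The headline figure follows from the size of the database: the number of distinct pairings of 632,500 images is

    \binom{632500}{2} = \frac{632500 \times 632499}{2} \approx 2.0 \times 10^{11},

that is, roughly 200 billion comparisons (the comparatively few same-eye pairings are negligible at this scale).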
Mind-reading machines: automated inference of complex mental
statesel Kaliouby, Rana AymanUniversity of Cambridge, Computer Laboratory2005-07enTextUCAM-CL-TR-636ISSN 1476-2986
People express their mental states all the time, even when
interacting with machines. These mental states shape the decisions
that we make, govern how we communicate with others, and affect our
performance. The ability to attribute mental states to others from
their behaviour, and to use that knowledge to guide one’s own
actions and predict those of others is known as theory of mind or
mind-reading.
The principal contribution of this dissertation is the real time
inference of a wide range of mental states from head and facial
displays in a video stream. In particular, the focus is on the
inference of complex mental states: the affective and cognitive
states of mind that are not part of the set of basic emotions. The
automated mental state inference system is inspired by and draws on
the fundamental role of mind-reading in communication and
decision-making.
The dissertation describes the design, implementation and validation
of a computational model of mind-reading. The design is based on the
results of a number of experiments that I have undertaken to analyse
the facial signals and dynamics of complex mental states. The
resulting model is a multi-level probabilistic graphical model that
represents the facial events in a raw video stream at different
levels of spatial and temporal abstraction. Dynamic Bayesian
Networks model observable head and facial displays, and
corresponding hidden mental states over time.
The automated mind-reading system implements the model by combining
top-down predictions of mental state models with bottom-up
vision-based processing of the face. To support intelligent
human-computer interaction, the system meets three important
criteria. These are: full automation so that no manual preprocessing
or segmentation is required, real time execution, and the
categorization of mental states early enough after their onset to
ensure that the resulting knowledge is current and useful.
The system is evaluated in terms of recognition accuracy,
generalization and real time performance for six broad classes of
complex mental states—agreeing, concentrating, disagreeing,
interested, thinking and unsure—on two different corpora. The
system successfully classifies and generalizes to new examples of
these classes with an accuracy and speed that are comparable to those
of human recognition.
The research I present here significantly advances the nascent
ability of machines to infer cognitive-affective mental states in
real time from nonverbal expressions of people. By developing a real
time system for the inference of a wide range of mental states
beyond the basic emotions, I have widened the scope of
human-computer interaction scenarios in which this technology can be
integrated. This is an important step towards building socially and
emotionally intelligent machines.
The topology of covert conflictNagaraja, ShishirAnderson, RossUniversity of Cambridge, Computer Laboratory2005-07enTextUCAM-CL-TR-637ISSN 1476-2986
Often an attacker tries to disconnect a network by destroying nodes
or edges, while the defender counters using various resilience
mechanisms. Examples include a music industry body attempting to
close down a peer-to-peer file-sharing network; medics attempting to
halt the spread of an infectious disease by selective vaccination;
and a police agency trying to decapitate a terrorist organisation.
Albert, Jeong and Barabási famously analysed the static case, and
showed that vertex-order attacks are effective against scale-free
networks. We extend this work to the dynamic case by developing a
framework based on evolutionary game theory to explore the
interaction of attack and defence strategies. We show, first, that
naive defences don’t work against vertex-order attack; second, that
defences based on simple redundancy don’t work much better, but that
defences based on cliques work well; third, that attacks based on
centrality work better against clique defences than vertex-order
attacks do; and fourth, that defences based on complex strategies
such as delegation plus clique resist centrality attacks better than
simple clique defences. Our models thus build a bridge between
network analysis and evolutionary game theory, and provide a
framework for analysing defence and attack in networks where
topology matters. They suggest definitions of efficiency of attack
and defence, and may even explain the evolution of insurgent
organisations from networks of cells to a more virtual leadership
that facilitates operations rather than directing them. Finally, we
draw some conclusions and present possible directions for future
research.
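To fix ideas about the attack side, a vertex-order attack deletes the highest-degree nodes first; a minimal sketch (ours, not the paper's model, using the size of the largest surviving connected component as a crude resilience metric):

    def largest_component(adj, removed):
        # adj: node -> set of neighbours; removed: nodes deleted by the attack.
        seen, best = set(removed), 0
        for start in adj:
            if start in seen:
                continue
            stack, size = [start], 0
            seen.add(start)
            while stack:
                v = stack.pop()
                size += 1
                for w in adj[v]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            best = max(best, size)
        return best

    def vertex_order_attack(adj, k):
        # Delete the k highest-degree nodes, as in the static analysis,
        # then measure what survives.
        doomed = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]
        return largest_component(adj, doomed)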
Optimistic Generic BroadcastZieliński, PiotrUniversity of Cambridge, Computer Laboratory2005-07enTextUCAM-CL-TR-638ISSN 1476-2986
We consider an asynchronous system with the Ω failure detector, and
investigate the number of communication steps required by various
broadcast protocols in runs in which the leader does not change.
Atomic Broadcast, used for example in state machine replication,
requires three communication steps. Optimistic Atomic Broadcast
requires only two steps if all correct processes receive messages in
the same order. Generic Broadcast requires two steps if no messages
conflict. We present an algorithm that subsumes both of these
approaches and guarantees two-step delivery if all conflicting
messages are received in the same order, and three-step delivery
otherwise. Internally, our protocol uses two new algorithms. First,
a Consensus algorithm which decides in one communication step if all
proposals are the same, and needs two steps otherwise. Second, a
method that allows us to run infinitely many instances of a
distributed algorithm, provided that only finitely many of them are
different. We assume that fewer than a third of all processes are
faulty (n > 3f).
Non-blocking hashtables with open addressingPurcell, ChrisHarris, TimUniversity of Cambridge, Computer Laboratory2005-09enTextUCAM-CL-TR-639ISSN 1476-2986
We present the first non-blocking hashtable based on open addressing
that provides the following benefits: it combines good cache
locality, accessing a single cacheline if there are no collisions,
with short straight-line code; it needs no storage overhead for
pointers and memory allocator schemes, having instead an overhead of
two words per bucket; it does not need to periodically reorganise or
replicate the table; and it does not need garbage collection, even
with arbitrary-sized keys. Open problems include resizing the table
and replacing, rather than erasing, entries. The result is a
highly-concurrent set algorithm that approaches or outperforms the
best externally-chained implementations we tested, with fixed memory
costs and no need to select or fine-tune a garbage collector or
locking strategy.
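For readers unfamiliar with open addressing, a sequential sketch of the probing discipline follows (ours; in the report's algorithm each update is a non-blocking atomic operation on the two per-bucket words, which this sketch does not attempt to reproduce):

    def lookup(table, key):
        # table: fixed-size list of (key, value) pairs or None.
        n = len(table)
        i = hash(key) % n
        for step in range(n):
            slot = table[(i + step) % n]
            if slot is None:
                return None              # reached an empty bucket: key absent
            if slot[0] == key:
                return slot[1]           # no collisions: a single probe suffices
        return None

    def insert(table, key, value):
        n = len(table)
        i = hash(key) % n
        for step in range(n):
            j = (i + step) % n
            if table[j] is None or table[j][0] == key:
                table[j] = (key, value)  # a compare-and-swap in the real version
                return True
        return False                     # table full: resizing is an open problem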
Combining cryptography with biometrics
effectivelyHao, FengAnderson, RossDaugman, JohnUniversity of Cambridge, Computer Laboratory2005-07enTextUCAM-CL-TR-640ISSN 1476-2986
We propose the first practical and secure way to integrate the iris
biometric into cryptographic applications. A repeatable binary
string, which we call a biometric key, is generated reliably from
genuine iris codes. A well-known difficulty has been how to cope
with the 10 to 20% of error bits within an iris code and derive an
error-free key. To solve this problem, we carefully studied the
error patterns within iris codes, and devised a two-layer error
correction technique that combines Hadamard and Reed-Solomon codes.
The key is generated from a subject’s iris image with the aid of
auxiliary error-correction data, which do not reveal the key, and
can be saved in a tamper-resistant token such as a smart card. The
reproduction of the key depends on two factors: the iris biometric
and the token. The attacker has to procure both of them to
compromise the key. We evaluated our technique using iris samples
from 70 different eyes, with 10 samples from each eye. We found that
an error-free key can be reproduced reliably from genuine iris codes
with a 99.5% success rate. We can generate up to 140 bits of
biometric key, more than enough for 128-bit AES. The extraction of a
repeatable binary string from biometrics opens new possible
applications, where a strong binding is required between a person
and cryptographic operations. For example, it is possible to
identify individuals without maintaining a central database of
biometric templates, to which privacy objections might be raised.
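The shape of the construction can be sketched as follows (a sketch of ours, with a toy repetition code standing in for the paper's Hadamard and Reed-Solomon combination; none of these names come from the paper):

    def xor(a, b):
        return [x ^ y for x, y in zip(a, b)]

    def ecc_encode(bits, r=5):
        # Toy repetition code; the paper uses Hadamard + Reed-Solomon.
        return [b for b in bits for _ in range(r)]

    def ecc_decode(bits, r=5):
        # Majority vote over each block of r bits.
        return [int(2 * sum(bits[i:i + r]) > r) for i in range(0, len(bits), r)]

    def lock(key_bits, iris_code):
        # Auxiliary data: reveals neither the key nor the iris code alone.
        return xor(ecc_encode(key_bits), iris_code)

    def unlock(aux_data, fresh_iris_code):
        # A fresh code differs from the enrolment code in 10 to 20% of bits;
        # the error correction absorbs these errors and returns the exact key.
        return ecc_decode(xor(aux_data, fresh_iris_code))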
Cryptographic processors – a surveyAnderson, RossBond, MikeClulow, JolyonSkorobogatov, SergeiUniversity of Cambridge, Computer Laboratory2005-08enTextUCAM-CL-TR-641ISSN 1476-2986
Tamper-resistant cryptographic processors are becoming the standard
way to enforce data-usage policies. Their history began with
military cipher machines, and hardware security modules used to
encrypt the PINs that bank customers use to authenticate themselves
to ATMs. In both cases, the designers wanted to prevent abuse of
data and key material should a device fall into the wrong hands.
From these specialist beginnings, cryptoprocessors spread into
devices such as prepayment electricity meters, and the vending
machines that sell credit for them. In the 90s, tamper-resistant
smartcards became integral to GSM mobile phone identification and to
key management in pay-TV set-top boxes, while secure
microcontrollers were used in remote key entry devices for cars. In
the last five years, dedicated crypto chips have been embedded in
devices from games console accessories to printer ink cartridges, to
control product and accessory aftermarkets. The ‘Trusted Computing’
initiative will soon embed cryptoprocessors in PCs so that they can
identify each other remotely.
This paper surveys the range of applications of tamper-resistant
hardware, and the array of attack and defence mechanisms which have
evolved in the tamper-resistance arms race.
First-class relationships in an object-oriented
languageBierman, GavinWren, AlisdairUniversity of Cambridge, Computer Laboratory2005-08enTextUCAM-CL-TR-642ISSN 1476-2986
In this paper we investigate the addition of first-class
relationships to a prototypical object-oriented programming language
(a “middleweight” fragment of Java). We provide language-level
constructs to declare relationships between classes and to
manipulate relationship instances. We allow relationships to have
attributes and provide a novel notion of relationship inheritance.
We formalize our language giving both the type system and
operational semantics and prove certain key safety properties.
Using trust and risk for access control in Global
ComputingDimmock, Nathan E.University of Cambridge, Computer Laboratory2005-08enTextUCAM-CL-TR-643ISSN 1476-2986
Global Computing is a vision of a massively networked infrastructure
supporting a large population of diverse but cooperating entities.
Similar to ubiquitous computing, entities of global computing will
operate in environments that are dynamic and unpredictable,
requiring them to be capable of dealing with unexpected interactions
and previously unknown principals using an unreliable
infrastructure.
These properties will pose new security challenges that are not
adequately addressed by existing security models and mechanisms.
Traditionally privileges are statically encoded as security policy,
and while rôle-based access control introduces a layer of
abstraction between privilege and identity, rôles, privileges and
context must still be known in advance of any interaction taking
place.
Human society has developed the mechanism of trust to overcome
initial suspicion and gradually evolve privileges. Trust
successfully enables collaboration amongst human agents — a
computational model of trust ought to be able to enable the same in
computational agents. Existing research in this area has
concentrated on developing trust management systems that permit the
encoding of, and reasoning about, trust beliefs, but the
relationship between these and privilege is still hard-coded. These
systems also omit any explicit reasoning about risk and its
relationship to privilege, and do not permit the automated
evolution of trust over time.
This thesis examines the relationship between trust, risk and
privilege in an access control system. An outcome-based approach is
taken to risk modelling, using explicit costs and benefits to model
the relationship between risk and privilege. This is used to develop
a novel model of access control — trust-based access control (TBAC)
— firstly for the limited domain of collaboration between Personal
Digital Assistants (PDAs), and later for more general global
computing applications using the SECURE computational trust
framework.
This general access control model is also used to extend an existing
rôle-based access control system to explicitly reason about trust
and risk. A further refinement is the incorporation of the economic
theory of decision-making under uncertainty by expressing costs and
benefits as utility, or preference-scaling, functions. It is then
shown how Bayesian trust models can be used in the SECURE framework,
and how these models enable a better abstraction to be obtained in
the access control policy. It is also shown how the access control
model can be used to take such decisions as whether the cost of
seeking more information about a principal is justified by the risk
associated with granting the privilege, and to determine whether a
principal should respond to such requests upon receipt. The use of
game theory to help in the construction of policies is also briefly
considered.
Global computing has many applications, all of which require access
control to prevent abuse by malicious principals. This thesis
develops three in detail: an information sharing service for PDAs,
an identity-based spam detector and a peer-to-peer collaborative
spam detection network. Given the emerging nature of computational
trust systems, in order to evaluate the effectiveness of the TBAC
model, it was first necessary to develop an evaluation methodology.
This takes the approach of a threat-based analysis, considering
possible attacks at the component and system level, to ensure that
components are correctly integrated, and system-level assumptions
made by individual components are valid. Applying the methodology to
the implementation of the TBAC model demonstrates its effectiveness
in the scenarios chosen, with good promise for further, untested
scenarios.
Robbing the bank with a theorem proverYoun, PaulAdida, BenBond, MikeClulow, JolyonHerzog, JonathanLin, AmersonRivest, Ronald L.Anderson, RossUniversity of Cambridge, Computer Laboratory2005-08enTextUCAM-CL-TR-644ISSN 1476-2986
We present the first methodology for analysis and automated
detection of attacks on security application programming interfaces
(security APIs) – the interfaces to hardware cryptographic services
used by developers of critical security systems, such as banking
applications. Taking a cue from previous work on the formal analysis
of security protocols, we model APIs purely according to
specifications, under the assumption of ideal encryption primitives.
We use a theorem prover tool and adapt it to the security API
context. We develop specific formalization and automation techniques
that allow us to fully harness the power of a theorem prover. We
show how, using these techniques, we were able to automatically
re-discover all of the pure API attacks originally documented by
Bond and Anderson against banking payment networks, since their
discovery of this type of attack in 2000. We conclude with a note of
encouragement: the complexity and unintuitiveness of the modelled
attacks make a very strong case for continued focus on automated
formal analysis of cryptographic APIs.
RFID is X-ray visionStajano, FrankUniversity of Cambridge, Computer Laboratory2005-08enTextUCAM-CL-TR-645ISSN 1476-2986
Making RFID tags as ubiquitous as barcodes will enable machines to
see and recognize any tagged object in their vicinity, better than
they ever could with the smartest image processing algorithms. This
opens many opportunities for “sentient computing” applications.
However, in so far as this new capability has some of the properties
of X-ray vision, it opens the door to abuses. To promote discussion,
I won’t elaborate on low level technological solutions; I shall
instead discuss a simple security policy model that addresses most
of the privacy issues. Playing devil’s advocate, I shall also
indicate why it is currently unlikely that consumers will enjoy the
RFID privacy that some of them vociferously demand.
An agent architecture for simulation of end-users in
programming-like tasksStaton, SamUniversity of Cambridge, Computer Laboratory2005-10enTextUCAM-CL-TR-647ISSN 1476-2986
We present some motivation and technical details for a software
simulation of an end-user performing programming-like tasks. The
simulation uses an agent/agenda model by breaking tasks down into
subgoals, based on the work of A. Blackwell. This document was
distributed at the CHI 2002 workshop on Cognitive Models of
Programming-Like Processes.
Cassandra: flexible trust management and its application to
electronic health recordsBecker, Moritz Y.University of Cambridge, Computer Laboratory2005-10enTextUCAM-CL-TR-648ISSN 1476-2986
The emergence of distributed applications operating on large-scale,
heterogeneous and decentralised networks poses new and challenging
problems of concern to society as a whole, in particular for data
security, privacy and confidentiality. Trust management and
authorisation policy languages have been proposed to address access
control and authorisation in this context. Still, many key problems
have remained unsolved. Existing systems are often not expressive
enough, or are so expressive that access control becomes
undecidable; their semantics is not formally specified; and they
have not been shown to meet the requirements set by actual
real-world applications.
This dissertation addresses these problems. We present Cassandra, a
role-based language and system for expressing authorisation policy,
and the results of a substantial case study, a policy for a national
electronic health record (EHR) system, based on the requirements of
the UK National Health Service’s National Programme for Information
Technology (NPfIT).
Cassandra policies are expressed in a language derived from Datalog
with constraints. Cassandra supports credential-based authorisation
(e.g. between administrative domains), and rules can refer to remote
policies (for credential retrieval and trust negotiation). The
expressiveness of the language (and its computational complexity)
can be tuned by choosing an appropriate constraint domain. The
language is small and has a formal semantics for both query
evaluation and the access control engine.
There has been a lack of real-world examples of complex security
policies: our NPfIT case study fills this gap. The resulting
Cassandra policy (with 375 rules) demonstrates that the policy
language is expressive enough for a real-world application. We thus
demonstrate that a general-purpose trust management system can be
designed to be highly flexible, expressive, formally founded and
meet the complex requirements of real-world applications.
The decolorize algorithm for contrast enhancing, color to
grayscale conversionGrundland, MarkDodgson, Neil A.University of Cambridge, Computer Laboratory2005-10enTextUCAM-CL-TR-649ISSN 1476-2986
We present a new contrast enhancing color to grayscale conversion
algorithm which works in real-time. It incorporates novel techniques
for image sampling and dimensionality reduction, sampling color
differences by Gaussian pairing and analyzing color differences by
predominant component analysis. In addition to its speed and
simplicity, the algorithm has the advantages of continuous mapping,
global consistency, and grayscale preservation, as well as
predictable luminance, saturation, and hue ordering properties. We
give an extensive range of examples and compare our method with
other recently published algorithms.
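A greatly simplified sketch of the two core steps follows (ours; it omits the algorithm's luminance, saturation and grayscale-preservation machinery, and all names are our own):

    import numpy as np

    def decolorize_sketch(rgb, pairs=10000, sigma=0.05, seed=0):
        # rgb: H x W x 3 float array in [0, 1].
        rng = np.random.default_rng(seed)
        h, w, _ = rgb.shape
        # Gaussian pairing: compare each sampled pixel with a partner
        # displaced by a Gaussian-distributed offset.
        p = np.column_stack([rng.integers(0, h, pairs), rng.integers(0, w, pairs)])
        q = p + rng.normal(0, sigma * max(h, w), (pairs, 2)).astype(int)
        q = np.clip(q, [0, 0], [h - 1, w - 1])
        diffs = rgb[p[:, 0], p[:, 1]] - rgb[q[:, 0], q[:, 1]]
        # Predominant component analysis: project colours onto the
        # dominant direction of the sampled colour differences.
        _, _, vt = np.linalg.svd(diffs, full_matrices=False)
        gray = rgb @ vt[0]
        return (gray - gray.min()) / (np.ptp(gray) + 1e-12)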
Parallel iterative solution method for large sparse linear
equation systemsMehmood, RashidCrowcroft, JonUniversity of Cambridge, Computer Laboratory2005-10enTextUCAM-CL-TR-650ISSN 1476-2986
Solving sparse systems of linear equations is at the heart of
scientific computing. Large sparse systems often arise in science
and engineering problems. One such problem we consider in this paper
is the steady-state analysis of Continuous Time Markov Chains
(CTMCs). CTMCs are a widely used formalism for the performance
analysis of computer and communication systems. A large variety of
useful performance measures can be derived from a CTMC via the
computation of its steady-state probabilities. A CTMC may be
represented by a set of states and a transition rate matrix
containing state transition rates as coefficients, and can be
analysed using probabilistic model checking. However, CTMC models
for realistic systems are very large. We address this largeness
problem in this paper, by considering parallelisation of symbolic
methods. In particular, we consider Multi-Terminal Binary Decision
Diagrams (MTBDDs) to store CTMCs, and, using Jacobi iterative
method, present a parallel method for the CTMC steady-state
solution. Employing a 24-node processor bank, we report results for
sparse systems with over a billion equations and eighteen
billion nonzeros.
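Concretely, the steady-state vector π satisfies πQ = 0 with Σ_i π_i = 1, and a Jacobi sweep recomputes every component from the previous iterate; a dense, sequential sketch (ours; the report's contribution is the MTBDD-based, parallel version) is:

    def jacobi_steady_state(Q, iterations=1000):
        # Q: generator matrix as a list of rows; Q[i][j] is the rate i -> j,
        # Q[i][i] is negative, and every row sums to zero. Assumes every
        # state has at least one outgoing transition (Q[i][i] != 0).
        n = len(Q)
        x = [1.0 / n] * n
        for _ in range(iterations):
            # pi * Q = 0  =>  x_i = -(sum_{j != i} Q[j][i] x_j) / Q[i][i]
            x = [-sum(Q[j][i] * x[j] for j in range(n) if j != i) / Q[i][i]
                 for i in range(n)]
            s = sum(x)
            x = [v / s for v in x]   # renormalise to a probability vector
        return x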
End-user programming in multiple languagesHague, RobUniversity of Cambridge, Computer Laboratory2005-10enTextUCAM-CL-TR-651ISSN 1476-2986
Advances in user interface technology have removed the need for the
majority of users to program, but they do not allow the automation
of repetitive or indirect tasks. End-user programming facilities
solve this problem without requiring users to learn and use a
conventional programming language, but must be tailored to specific
types of end user. In situations where the user population is
particularly diverse, this presents a problem.
In addition, studies have shown that the performance of tasks based
on the manipulation and interpretation of data depends on the way in
which the data is represented. Different representations may
facilitate different tasks, and there is not necessarily a single,
optimal representation that is best for all tasks. In many cases,
the choice of representation is also constrained by other factors,
such as display size. It would be advantageous for an end-user
programming system to provide multiple, interchangeable
representations of programs.
This dissertation describes an architecture for providing end-user
programming facilities in the networked home, a context with a
diverse user population, and a wide variety of input and output
devices. The Media Cubes language, a novel end-user programming
language, is introduced as the context that led to the development
of the architecture. A framework for translation between languages
via a common intermediate form is then described, with particular
attention paid to the requirements of mappings between languages and
the intermediate form. The implementation of Lingua Franca, a system
realizing this framework in the given context, is described.
Finally, the system is evaluated by considering several end-user
programming languages implemented within this system. It is
concluded that translation between programming languages, via a
common intermediate form, is viable for systems within a limited
domain, and the wider applicability of the technique is discussed.
Discriminative training methods and their applications to
handwriting recognitionNopsuwanchai, RoongrojUniversity of Cambridge, Computer Laboratory2005-11enTextUCAM-CL-TR-652ISSN 1476-2986
This thesis aims to improve the performance of handwriting
recognition systems by introducing the use of discriminative
training methods. Discriminative training methods use data from all
competing classes when training the recogniser for each class. We
develop discriminative training methods for two popular classifiers:
Hidden Markov Models (HMMs) and a prototype-based classifier. At the
expense of additional computations in the training process,
discriminative training has demonstrated significant improvements in
recognition accuracies from the classifiers that are not
discriminatively optimised. Our studies focus on isolated character
recognition problems with an emphasis on, but not limited to,
off-line handwritten Thai characters.
The thesis is organised as follows. First, we develop an HMM-based
classifier that employs a Maximum Mutual Information (MMI)
discriminative training criterion. HMMs have an increasing number of
applications to character recognition in which they are usually
trained by Maximum Likelihood (ML) using the Baum-Welch algorithm.
However, ML training does not take into account the data of other
competing categories, and thus is considered non-discriminative. By
contrast, MMI provides an alternative training method with the aim
of maximising the mutual information between the data and their
correct categories. One of our studies highlights the efficiency of
MMI training that improves the recognition results from ML training,
despite being applied to a highly constrained system (tied-mixture
density HMMs). Various aspects of MMI training are investigated,
including its optimisation algorithms and a set of optimised
parameters that yields maximum discriminabilities.
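In its standard form (notation ours), the MMI criterion maximises the mutual information between the training observations O_r and their correct classes c_r, which for fixed priors amounts to maximising the log posterior of the correct class:

    \mathcal{F}_{\mathrm{MMI}}(\lambda) = \sum_{r=1}^{R} \log \frac{p_{\lambda}(O_r \mid c_r)\, P(c_r)}{\sum_{c'} p_{\lambda}(O_r \mid c')\, P(c')}.

The denominator is where the data of the competing categories enters, which is precisely what ML training omits.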
Second, a system for Thai handwriting recognition based on HMMs and
MMI training is introduced. In addition, novel feature extraction
methods using block-based PCA and composite images are proposed and
evaluated. A technique to improve generalisation of the MMI-trained
systems and the use of N-best lists to efficiently compute the
probabilities are described. By applying these techniques, the
results from extensive experiments are compelling, showing up to 65%
relative error reduction, compared to conventional ML training
without the proposed features. The best results are comparable to
those achieved by other high performance systems.
Finally, we focus on the Prototype-Based Minimum Error Classifier
(PBMEC), which uses a discriminative Minimum Classification Error
(MCE) training method to generate the prototypes. MCE tries to
minimise recognition errors during the training process using data
from all classes. Several key findings are revealed, including the
setting of smoothing parameters and a proposed clustering method
that are more suitable for PBMEC than the conventional
methods. These studies reinforce the effectiveness of discriminative
training and are essential as a foundation for its application to
the more difficult problem of cursive handwriting recognition.
Anonymity and traceability in cyberspaceClayton, RichardUniversity of Cambridge, Computer Laboratory2005-11enTextUCAM-CL-TR-653ISSN 1476-2986
Traceability is the ability to map events in cyberspace,
particularly on the Internet, back to real-world instigators, often
with a view to holding them accountable for their actions. Anonymity
is present when traceability fails.
I examine how traceability on the Internet actually works, looking
first at a classical approach from the late 1990s that emphasises
the rôle of activity logging and reporting on the failures that are
known to occur. Failures of traceability, with consequent
unintentional anonymity, have continued as the technology has
changed. I present an analysis that ascribes these failures to the
mechanisms at the edge of the network being inherently inadequate
for the burden that traceability places upon them. The underlying
reason for this continuing failure is a lack of economic incentives
for improvement. The lack of traceability at the edges is further
illustrated by a new method of stealing another person’s identity on
an Ethernet Local Area Network that existing tools and procedures
would entirely fail to detect.
Preserving activity logs is seen, especially by Governments, as
essential for the traceability of illegal cyberspace activity. I
present a new and efficient method of processing email server logs
to detect machines sending bulk unsolicited email “spam” or email
infected with “viruses”. This creates a clear business purpose for
creating logs, but the new detector is so effective that the logs
can be discarded within days, which may hamper general traceability.
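The detector itself is not reproduced here, but the general shape of
such log processing can be sketched; the field names and threshold
below are invented for illustration and are not Clayton's actual
heuristic.

    import collections

    # Flag client hosts that deliver to an unusually large number of
    # distinct recipients -- a typical symptom of bulk unsolicited email.
    def suspect_senders(log_entries, max_recipients=500):
        # log_entries: iterable of (client_ip, rcpt_to) pairs taken from
        # the email server log
        recipients = collections.defaultdict(set)
        for client_ip, rcpt_to in log_entries:
            recipients[client_ip].add(rcpt_to)
        return {ip for ip, rcpts in recipients.items()
                if len(rcpts) > max_recipients}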
Preventing spam would be far better than tracing its origin or
detecting its transmission. Many analyse spam in economic terms, and
wish to levy a small charge for sending each email. I consider an
oft-proposed approach using computational “proof-of-work” that is
elegant and anonymity preserving. I show that, in a world of high
profit margins and insecure end-user machines, it is impossible to
find a payment level that stops the spam without affecting
legitimate usage of email.
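The shape of this argument can be seen in a back-of-envelope
calculation; all numbers below are illustrative assumptions, not
figures from the dissertation.

    # Illustrative assumptions, not figures from the dissertation.
    seconds_per_day = 86_400
    work_per_mail = 60      # assumed proof-of-work cost per email (seconds)

    botnet = 10_000         # assumed compromised machines under a spammer's control
    print(botnet * seconds_per_day // work_per_mail)   # 14,400,000 spams/day

    print(seconds_per_day // work_per_mail)            # 1,440/day ceiling for one honest machine

    # Raising work_per_mail throttles the honest machine long before it
    # makes the botnet's output unprofitable -- the dilemma formalised in
    # the dissertation.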
Finally, I consider a content-blocking system with a hybrid design
that has been deployed by a UK Internet Service Provider to inhibit
access to child pornography. I demonstrate that the two-level design
can be circumvented at either level, that content providers can use
the first level to attack the second, and that the selectivity of
the first level can be used as an “oracle” to extract a list of the
sites being blocked. Although many of these attacks can be
countered, there is an underlying failure that cannot be fixed. The
system’s database holds details of the traceability of content, as
viewed from a single location at a single time. However, a blocking
system may be deployed at many sites and must track content as it
moves in space and time; functions which traceability, as currently
realised, cannot deliver.
Local reasoning for JavaParkinson, Matthew J.University of Cambridge, Computer Laboratory2005-11enTextUCAM-CL-TR-654ISSN 1476-2986
This thesis develops the local reasoning approach of separation
logic for common forms of modularity such as abstract datatypes and
objects. In particular, this thesis focuses on the modularity found
in the Java programming language.
We begin by developing a formal semantics for a core imperative
subset of Java, Middleweight Java (MJ), and then adapt separation
logic to reason about this subset. However, a naive adaption of
separation logic is unable to reason about encapsulation or
inheritance: it provides no support for modularity.
First, we address the issue of encapsulation with the novel concept
of an abstract predicate, which is the logical analogue of an
abstract datatype. We demonstrate how this method can encapsulate
state, and provide a mechanism for ownership transfer: the ability
to transfer state safely between a module and its client. We also
show how abstract predicates can be used to express the calling
protocol of a class.
However, the encapsulation provided by abstract predicates is too
restrictive for some applications. In particular, it cannot reason
about multiple datatypes that have shared read-access to state, for
example list iterators. To compensate, we alter the underlying model
to allow the logic to express properties about read-only references
to state. Additionally, we provide a model that allows both sharing
and disjointness to be expressed directly in the logic.
Finally, we address the second modularity issue: inheritance. We do
this by extending the concept of abstract predicates to abstract
predicate families. This extension allows a predicate to have
multiple definitions that are indexed by class, which allows
subclasses to have a different internal representation while
remaining behavioural subtypes. We demonstrate the usefulness of
this concept by verifying a use of the visitor design pattern.
Wearing proper combinationsSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory2005-11enTextUCAM-CL-TR-655ISSN 1476-2986
This paper discusses the proper treatment of multiple indexing
fields, representations, or streams, in document retrieval. Previous
experiments by Robertson and his colleagues have shown that, with a
widely used type of term weighting and fields that share keys,
document scores should be computed using term frequencies over
fields rather than by combining field scores. Here I examine a wide
range of document and query indexing situations, and consider their
implications for this approach to document scoring.
Seamless mobility in 4G systemsVidales, PabloUniversity of Cambridge, Computer Laboratory2005-11enTextUCAM-CL-TR-656ISSN 1476-2986
The proliferation of radio access technologies, wireless networking
devices, and mobile services has encouraged intensive nomadic
computing activity. When travelling, mobile users experience
connectivity disturbances, particularly when they handoff between
two access points that belong to the same wireless network and when
they change from one access technology to another. Nowadays, an
average mobile user might connect to many different wireless
networks in the course of a day to obtain diverse services, whilst
demanding transparent operation. Current protocols offer portability
and transparent mobility. However, they fail to cope with huge
delays caused by different link-layer characteristics when roaming
between independent disparate networks. In this dissertation, I
address this deficiency by introducing and evaluating practical
methods and solutions that minimise connection disruptions and
support transparent mobility in future communication systems.
Security protocol design by compositionChoi, Hyun-JinUniversity of Cambridge, Computer Laboratory2006-01enTextUCAM-CL-TR-657ISSN 1476-2986
The aim of this research is to present a new methodology for the
systematic design of compound protocols from their parts. Some
security properties can be made accumulative, i.e. can be put
together without interfering with one another, by carefully
selecting the mechanisms which implement them. Among them are
authentication, secrecy and non-repudiation. Based on this
observation, a set of accumulative protocol mechanisms called
protocol primitives are proposed and their correctness is verified.
These protocol primitives are obtained from common mechanisms found
in many security protocols such as challenge and response. They have
been carefully designed not to interfere with each other. This
feature makes them flexible building blocks in the proposed
methodology. Equipped with these protocol primitives, a scheme for
the systematic construction of a complicated protocol from simple
protocol primitives is presented, namely, design by composition.
This design scheme allows the combination of several simple protocol
parts into a complicated protocol without destroying the security
properties established by each independent part. In other words, the
composition framework permits the specification of a complex
protocol to be decomposed into the specifications of simpler
components, and thus makes the design and verification of the
protocol easier to handle. Benefits of this approach are similar to
those gained when using a modular approach to software development.
The applicability and practicality of the proposed methodology are
validated through many design examples of protocols found in many
different environments and with various initial assumptions. The
method is not intended to cover every design issue, but a
reasonable range of protocols is addressed.
Intrinsic point-based surface processingMoenning, CarstenUniversity of Cambridge, Computer Laboratory2006-01enTextUCAM-CL-TR-658ISSN 1476-2986
The need to process surface geometry is a
ubiquitous problem in computer graphics and related disciplines. It
arises in numerous important applications such as computer-aided
design, reverse engineering, rapid prototyping, medical imaging,
cultural heritage acquisition and preservation, video gaming and the
movie industry. Existing surface processing techniques predominantly
follow an extrinsic approach using combinatorial mesh data
structures in the embedding Euclidean space to represent, manipulate
and visualise the surfaces. This thesis advocates, firstly, the
intrinsic processing of surfaces, i.e. processing directly across
the surface rather than in its embedding space. Secondly, it
continues the trend towards the use of point primitives for the
processing and representation of surfaces.
The discussion starts with the design of an intrinsic point sampling
algorithm template for surfaces. This is followed by the
presentation of a module library of template instantiations for
surfaces in triangular mesh or point cloud form. The latter is at
the heart of the intrinsic meshless surface simplification algorithm
also put forward. This is followed by the introduction of intrinsic
meshless surface subdivision, the first intrinsic meshless surface
subdivision scheme and a new method for the computation of geodesic
centroids on manifolds. The meshless subdivision scheme uses an
intrinsic neighbourhood concept for point-sampled geometry also
presented in this thesis. The thesis’s main contributions can therefore be
summarised as follows:
– An intrinsic neighbourhood concept for point-sampled geometry.
– An intrinsic surface sampling algorithm template with sampling
density guarantee.
– A modular library of template instantiations for the sampling of
planar domains and surfaces in triangular mesh or point cloud form.
– A new method for the computation of geodesic centroids on
manifolds.
– An intrinsic meshless surface simplification algorithm.
– The introduction of the notion of intrinsic meshless surface
subdivision.
– The first intrinsic meshless surface subdivision scheme.
The overall result is a set of algorithms for the processing of
point-sampled geometry centering around a generic sampling template
for surfaces in the most widely-used forms of representation. The
intrinsic nature of these point-based algorithms helps to overcome
limitations associated with the more traditional extrinsic,
mesh-based processing of surfaces when dealing with highly complex
point-sampled geometry as is typically encountered today.
A safety proof of a lazy concurrent list-based set
implementationVafeiadis, ViktorHerlihy, MauriceHoare, TonyShapiro, MarcUniversity of Cambridge, Computer Laboratory2006-01enTextUCAM-CL-TR-659ISSN 1476-2986
We prove the safety of a practical concurrent list-based
implementation due to Heller et al. It exposes an interface of an
integer set with methods contains, add, and remove. The
implementation uses a combination of fine-grain locking, optimistic
and lazy synchronisation. Our proofs are hand-crafted. They use
rely-guarantee reasoning and thereby illustrate its power and
applicability, as well as some of its limitations. For each method,
we identify the linearisation point, and establish its validity.
Hence we show that the methods are safe, linearisable and implement
a high-level specification. This report is a companion document to
our PPoPP 2006 paper entitled “Proving correctness of
highly-concurrent linearisable objects”.
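For orientation, the key ingredients of Heller et al.’s algorithm – a
lazily-set mark bit, a lock-free contains, and validation under
per-node locks – can be sketched as follows (a simplification for
illustration, not the code that is verified in the report):

    import threading

    class Node:
        def __init__(self, key):
            self.key = key
            self.next = None
            self.marked = False          # logically deleted? (set lazily)
            self.lock = threading.Lock()

    def validate(pred, curr):
        # an optimistic traversal is re-checked under locks before updating
        return not pred.marked and not curr.marked and pred.next is curr

    def contains(head, key):
        # wait-free: traverses without locking, relying on the mark bit;
        # assumes sentinel head/tail nodes with -infinity/+infinity keys
        curr = head
        while curr.key < key:
            curr = curr.next
        return curr.key == key and not curr.marked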
Static program analysis based on virtual register
renamingSinger, JeremyUniversity of Cambridge, Computer Laboratory2006-02enTextUCAM-CL-TR-660ISSN 1476-2986
Static single assignment form (SSA) is a popular program
intermediate representation (IR) for static analysis. SSA programs
differ from equivalent control flow graph (CFG) programs only in the
names of virtual registers, which are systematically transformed to
comply with the naming convention of SSA. Static single information
form (SSI) is a recently proposed extension of SSA that enforces a
greater degree of systematic virtual register renaming than SSA.
This dissertation develops the principles, properties, and practice
of SSI construction and data flow analysis. Further, it shows that
SSA and SSI are two members of a larger family of related IRs, which
are termed virtual register renaming schemes (VRRSs). SSA and SSI
analyses can be generalized to operate on any VRRS family member.
Analysis properties such as accuracy and efficiency depend on the
underlying VRRS.
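As a small constructed example of the renaming involved (not drawn
from the dissertation):

    # Original CFG program:
    #     x = input()
    #     if x > 0: y = x
    #     else:     y = -x
    #     print(y)
    #
    # SSA gives each virtual register a single definition, merging values
    # at control-flow joins with phi-functions:
    #     x1 = input()
    #     if x1 > 0: y1 = x1
    #     else:      y2 = -x1
    #     y3 = phi(y1, y2); print(y3)
    #
    # SSI additionally renames at control-flow splits with sigma-functions,
    # so each branch arm sees its own name for x:
    #     x1 = input()
    #     (x2, x3) = sigma(x1)    # x2 live on the true arm, x3 on the false
    #     if x1 > 0: y1 = x2
    #     else:      y2 = -x3
    #     y3 = phi(y1, y2); print(y3)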
This dissertation makes four significant contributions to the field
of static analysis research.
First, it develops the SSI representation. Although SSI was
introduced five years ago, it has not yet received widespread
recognition as an interesting IR in its own right. This dissertation
presents a new SSI definition and an optimistic construction
algorithm. It also sets SSI in context among the broad range of IRs
for static analysis.
Second, it demonstrates how to reformulate existing data flow
analyses using new sparse SSI-based techniques. Examples include
liveness analysis, sparse type inference and program slicing. It
presents algorithms, together with empirical results of these
algorithms when implemented within a research compiler framework.
Third, it provides the only major comparative evaluation of the
merits of SSI for data flow analysis. Several qualitative and
quantitative studies in this dissertation compare SSI with other
similar IRs.
Last, it identifies the family of VRRSs, which are all CFGs with
different virtual register naming conventions. Many extant IRs are
classified as VRRSs. Several new IRs are presented, based on a
consideration of previously unspecified members of the VRRS family.
General analyses can operate on any family member. The required
level of accuracy or efficiency can be selected by working in terms
of the appropriate family member.
Compatible RMRS representations from RASP and the
ERGRitchie, AnnaUniversity of Cambridge, Computer Laboratory2006-03enTextUCAM-CL-TR-661ISSN 1476-2986
Various applications could potentially benefit from the integration
of deep and shallow processing techniques. A universal
representation, compatible between deep and shallow parsers, would
enable such integration, allowing the advantages of both to be
combined. This paper describes efforts to make RMRS such a
representation. This work was done as part of DeepThought, funded
under the 5th Framework Programme of the European Commission (contract
reference IST-2001-37836).
An introduction to tag sequence grammars and the RASP system
parserBriscoe, TedUniversity of Cambridge, Computer Laboratory2006-03enTextUCAM-CL-TR-662ISSN 1476-2986
This report describes the tag sequence grammars released as part of
the Robust Accurate Statistical Parsing (RASP) system. It is
intended to help users of RASP understand the linguistic and
engineering rationale behind the grammars and prepare them to
customise the system for their application. It also contains a
fairly exhaustive list of references to extant work utilising the
RASP parser.
Syntax-driven analysis of context-free languages with
respect to fuzzy relational semanticsBergmair, RichardUniversity of Cambridge, Computer Laboratory2006-03enTextUCAM-CL-TR-663ISSN 1476-2986
A grammatical framework is presented that augments context-free
production rules with semantic production rules that rely on fuzzy
relations as representations of fuzzy natural language concepts. It
is shown how the well-known technique of syntax-driven semantic
analysis can be used to infer from an expression in a language
defined in such a semantically augmented grammar a weak ordering on
the possible worlds it describes. Considering the application of
natural language query processing, we show how to order elements in
the domain of a relational database scheme according to the degree
to which they fulfill the intuition behind a given natural language
statement like “Carol lives in a small city near San Francisco”.
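A minimal sketch of this ordering, with invented membership functions
(the framework itself defines such concepts as fuzzy relations within
the grammar):

    # Invented membership functions for "small" and "near".
    def small(population):
        return max(0.0, min(1.0, (500_000 - population) / 400_000))

    def near(distance_km):
        return max(0.0, min(1.0, (100 - distance_km) / 80))

    def degree(city):
        # conjunction interpreted as minimum, as is conventional in fuzzy logic
        return min(small(city["pop"]), near(city["km_to_sf"]))

    cities = [{"name": "Berkeley", "pop": 120_000, "km_to_sf": 20},
              {"name": "San Jose", "pop": 1_000_000, "km_to_sf": 77}]
    for c in sorted(cities, key=degree, reverse=True):
        print(c["name"], round(degree(c), 2))   # Berkeley 0.95, San Jose 0.0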
Designing knowledge: An interdisciplinary experiment in
research infrastructure for shared descriptionBlackwell, Alan F.University of Cambridge, Computer Laboratory2006-04enTextUCAM-CL-TR-664ISSN 1476-2986
The report presents the experimental development, evaluation and
refinement of a method for doing adventurous design work, in
contexts where academics must work in collaboration with corporate
and public policy strategists and researchers. The intention has
been to do applied social science, in which a reflective research
process has resulted in a “new social form”, as expressed in the
title of the research grant that funded the project. The objective
in doing so is not simply to produce new theories, or to enjoy
interdisciplinary encounters (although both of those have been side
effects of this work). My purpose in doing the work and writing this
report is purely instrumental – working as a technologist among
social scientists, the outcome described in this report is intended
for adoption as a kind of social technology. I have given this
product a name: the “Blackwell-Leach Process” for interdisciplinary
design. The Blackwell-Leach process has since been applied and
proven useful in several novel situations, and I believe it is now
sufficiently mature to justify publication of the reports that
describe both the process and its development.
Security evaluation at design time for cryptographic
hardwareLi, HuiyunUniversity of Cambridge, Computer Laboratory2006-04enTextUCAM-CL-TR-665ISSN 1476-2986
Consumer security devices are becoming ubiquitous, from pay-TV
through mobile phones, PDAs and prepayment gas meters to smart cards.
There are many ongoing research efforts to keep these devices secure
from opponents who try to retrieve key information by observation or
manipulation of the chip’s components. In common industrial
practice, it is only after the chip has been manufactured that security
evaluation is performed. Due to design-time oversights, however,
weaknesses are often revealed in fabricated chips. Furthermore,
post-manufacture security evaluation is time-consuming, error-prone and
very expensive. This motivates the need for “design time security
evaluation” techniques in order to identify avoidable mistakes in
design.
This thesis proposes a set of “design time security evaluation”
methodologies covering the well-known non-invasive side-channel
analysis attacks, such as power analysis and electromagnetic
analysis attacks. The thesis also covers the recently published
semi-invasive optical fault injection attacks. These security
evaluation technologies examine the system under test by reproducing
attacks through simulation and observing its subsequent response.
The proposed “design time security evaluation” methodologies can be
easily implemented into the standard integrated circuit design flow,
requiring only commonly used EDA tools. It therefore adds little
non-recurring engineering (NRE) cost to the chip design, but helps
identify security weaknesses at an early stage, avoids costly
silicon re-spins, and improves the chances of passing industrial
evaluation, for faster time-to-market.
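To give a flavour of what such a simulation must reproduce, a toy
correlation power analysis – the textbook attack, not the report’s
framework – fits in a few lines:

    import numpy as np

    def hamming_weight(x):
        return bin(x).count("1")

    def cpa_best_key(traces, plaintexts, sbox):
        # traces: array (n_traces, n_samples) of simulated power values;
        # plaintexts: one known input byte per trace; sbox: the cipher's
        # substitution box. For each key guess, correlate the predicted
        # Hamming weights of sbox[p ^ k] against every sample point; the
        # correct key yields the highest correlation peak.
        t = traces - traces.mean(axis=0)
        t_norm = np.linalg.norm(t, axis=0) + 1e-12
        best_peak, best_key = -1.0, None
        for k in range(256):
            hw = np.array([hamming_weight(sbox[p ^ k]) for p in plaintexts],
                          dtype=float)
            hw -= hw.mean()
            corr = (hw @ t) / (np.linalg.norm(hw) * t_norm + 1e-12)
            if np.abs(corr).max() > best_peak:
                best_peak, best_key = np.abs(corr).max(), k
        return best_key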
A pact with the DevilBond, MikeDanezis, GeorgeUniversity of Cambridge, Computer Laboratory2006-06enTextUCAM-CL-TR-666ISSN 1476-2986
We study malware propagation strategies which exploit not the
incompetence or naivety of users, but instead their own greed,
malice and short-sightedness. We demonstrate that interactive
propagation strategies, for example bribery and blackmail of
computer users, are effective mechanisms for malware to survive and
entrench, and present an example employing these techniques. We
argue that in terms of propagation, there exists a continuum between
legitimate applications and pure malware, rather than a quantised
scale.
Minimizing latency of agreement protocolsZieliński, PiotrUniversity of Cambridge, Computer Laboratory2006-06enTextUCAM-CL-TR-667ISSN 1476-2986
Maintaining consistency of fault-tolerant distributed systems is
notoriously difficult to achieve. It often requires non-trivial
agreement abstractions, such as Consensus, Atomic Broadcast, or
Atomic Commitment. This thesis investigates implementations of such
abstractions in the asynchronous model, extended with unreliable
failure detectors or eventual synchrony. The main objective is to
develop protocols that minimize the number of communication steps
required in failure-free scenarios but remain correct if failures
occur. For several agreement problems and their numerous variants,
this thesis presents such low-latency algorithms and lower-bound
theorems proving their optimality.
The observation that many agreement protocols share the same
round-based structure helps to cope with a large number of agreement
problems in a uniform way. One of the main contributions of this
thesis is “Optimistically Terminating Consensus” (OTC) – a new
lightweight agreement abstraction that formalizes the notion of a
round. It is used to provide simple modular solutions to a large
variety of agreement problems, including Consensus, Atomic
Commitment, and Interactive Consistency. The OTC abstraction
tolerates malicious participants and has no latency overhead;
agreement protocols constructed in the OTC framework require no more
communication steps than their ad-hoc counterparts.
The attractiveness of this approach lies in the fact that the
correctness of OTC algorithms can be tested automatically. A theory
developed in this thesis allows us to quickly evaluate OTC algorithm
candidates without the time-consuming examination of their entire
state space. This technique is then used to scan the space of
possible solutions in order to automatically discover new
low-latency OTC algorithms. From these, one can now easily obtain
new implementations of Consensus and similar agreement problems such
as Atomic Commitment or Interactive Consistency.
Because of its continuous nature, Atomic Broadcast is considered
separately from other agreement abstractions. I first show that no
algorithm can guarantee a latency of less than three communication
steps in all failure-free scenarios. Then, I present new Atomic
Broadcast algorithms that achieve the two-step latency in some
special cases, while still guaranteeing three steps for other
failure-free scenarios. The special cases considered here are:
Optimistic Atomic Broadcast, (Optimistic) Generic Broadcast, and
closed-group Atomic Broadcast. For each of these, I present an
appropriate algorithm and prove its latency to be optimal.
Optimistically Terminating ConsensusZieliński, PiotrUniversity of Cambridge, Computer Laboratory2006-06enTextUCAM-CL-TR-668ISSN 1476-2986
Optimistically Terminating Consensus (OTC) is a variant of Consensus
that decides if all correct processes propose the same value. It is
surprisingly easy to implement: processes broadcast their proposals
and decide if sufficiently many processes report the same proposal.
This paper shows an OTC-based framework which can reconstruct all
major asynchronous Consensus algorithms, even in Byzantine settings,
with no overhead in latency or the required number of processes.
This result not only deepens our understanding of Consensus, but
also reduces the problem of designing new, modular distributed
agreement protocols to choosing the parameters of OTC.
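A minimal sketch of the decision rule for the crash-failure case (the
quorum size here is a simplifying assumption; the paper gives the
exact conditions, including for Byzantine settings):

    import collections

    def otc_decide(reports, n, f):
        # reports: process id -> proposal received in this round,
        # for n processes of which at most f may fail.
        counts = collections.Counter(reports.values())
        for value, count in counts.items():
            if count >= n - f:    # "sufficiently many" (assumed quorum)
                return value
        return None               # undecided: fall back to the next round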
Active privilege management for distributed access control
systemsEyers, David M.University of Cambridge, Computer Laboratory2006-06enTextUCAM-CL-TR-669ISSN 1476-2986
The last decade has seen the explosive uptake of technologies to
support true Internet-scale distributed systems, many of which will
require security.
The policy dictating authorisation and privilege restriction should
be decoupled from the services being protected: (1) policy can be
given its own independent language syntax and semantics, hopefully
in an application-independent way; (2) policy becomes portable – it
can be stored away from the services it protects; and (3) the
evolution of policy can be effected dynamically.
Management of dynamic privileges in wide-area distributed systems is
a challenging problem. Supporting fast credential revocation is a
simple example of dynamic privilege management. More complex
examples include policies that are sensitive to the current state of
a principal, such as dynamic separation of duties.
The Open Architecture for Secure Interworking Services (OASIS), an
expressive distributed role-based access control system, is traced
through to the development of the Clinical and Biomedical Computing
Limited (CBCL) OASIS implementation. Two OASIS deployments are discussed –
an Electronic Health Record framework, and an inter-organisational
distributed courseware system.
The Event-based Distributed Scalable Authorisation Control
architecture for the 21st century (EDSAC21, or just EDSAC) is then
presented along with its four design layers. It builds on OASIS,
adding support for the collaborative enforcement of distributed
dynamic constraints, and incorporating publish/subscribe messaging
to allow scalable and flexible deployment. The OASIS policy language
is extended to support delegation, dynamic separation of duties, and
obligation policies.
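Dynamic separation of duties, for example, is a constraint on role
activation; a minimal sketch of the check (role names invented; this
is not the OASIS policy syntax):

    # A principal may not hold two conflicting roles simultaneously.
    CONFLICTS = {frozenset({"payment-initiator", "payment-approver"})}

    def may_activate(active_roles, new_role):
        return all(frozenset({new_role, held}) not in CONFLICTS
                   for held in active_roles)

    assert may_activate({"clerk"}, "payment-initiator")
    assert not may_activate({"payment-initiator"}, "payment-approver")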
An EDSAC prototype is examined. We show that our architecture is
ideal for experiments performed into location-aware access control.
We then demonstrate how event-based features specific to EDSAC
facilitate integration of an ad hoc workflow monitor into an access
control system.
The EDSAC architecture is powerful, flexible and extensible. It is
intended to have widespread applicability as the basis for designing
next-generation security middleware and implementing distributed,
dynamic privilege management.
On the application of program analysis and transformation to
high reliability hardwareThompson, SarahUniversity of Cambridge, Computer Laboratory2006-07enTextUCAM-CL-TR-670ISSN 1476-2986
Safety- and mission-critical systems must be both correct and
reliable. Electronic systems must behave as intended and, where
possible, do so at the first attempt – the fabrication costs of
modern VLSI devices are such that the iterative design/code/test
methodology endemic to the software world is not financially
feasible. In aerospace applications it is also essential to
establish that systems will, with known probability, remain
operational for extended periods, despite being exposed to very low
or very high temperatures, high radiation, large G-forces, hard
vacuum and severe vibration.
Hardware designers have long understood the advantages of formal
mathematical techniques. Notably, model checking and automated
theorem proving both gained acceptance within the electronic design
community at an early stage, though more recently the research focus
in validation and verification has drifted toward software. As a
consequence, the newest and most powerful techniques have not been
significantly applied to hardware; this work seeks to make a modest
contribution toward redressing the imbalance.
An abstract interpretation-based formalism, transitional logic, is
introduced that supports formal reasoning about the dynamic
behaviour of combinational asynchronous circuits. The behaviour of
majority voting circuits with respect to single-event transients is
analysed, demonstrating that such circuits are not SET-immune. This
result is generalised to show that SET immunity is impossible for
all delay-insensitive circuits.
An experimental hardware partial evaluator, HarPE, is used to
demonstrate the 1st Futamura projection in hardware – a small CPU is
specialised with respect to a ROM image, yielding results that are
equivalent to compiling the program into hardware. HarPE is then
used alongside an experimental non-clausal SAT solver to implement
an automated transformation system that is capable of repairing
FPGAs that have suffered cosmic ray damage. This approach is
extended to support automated configuration, dynamic testing and
dynamic error recovery of reconfigurable spacecraft wiring
harnesses.
Low-latency Atomic Broadcast in the presence of
contentionZieliński, PiotrUniversity of Cambridge, Computer Laboratory2006-07enTextUCAM-CL-TR-671ISSN 1476-2986
The Atomic Broadcast algorithm described in this paper can deliver
messages in two communication steps, even if multiple processes
broadcast at the same time. It tags all broadcast messages with the
local real time, and delivers all messages in order of these
timestamps. The Ω-elected leader simulates processes it suspects to
have crashed (◇S). For fault-tolerance, it uses a new cheap Generic
Broadcast algorithm that requires only a majority of correct
processes (n > 2f) and, in failure-free runs, delivers all
non-conflicting messages in two steps. The main algorithm satisfies
several new lower bounds, which are proved in this paper.
Decomposing file data into discernible itemsPolicroniades-Borraz, CalicratesUniversity of Cambridge, Computer Laboratory2006-08enTextUCAM-CL-TR-672ISSN 1476-2986
The development of the different persistent data models shows a
constant pattern: the higher the level of abstraction a storage
system exposes, the greater the payoff for programmers. The file API
offers a simple storage model that is agnostic of any structure or
data types in file contents. As a result, developers employ
substantial programming effort in writing persistent code. At the
other extreme, orthogonally persistent programming languages reduce
the impedance mismatch between the volatile and the persistent data
spaces by exposing persistent data as conventional programming
objects. Consequently, developers spend considerably less effort in
developing persistent code.
This dissertation addresses the lack of ability in the file API to
exploit the advantages of gaining access to the logical composition
of file content. It argues that the trade-off between efficiency and
ease of programmability of persistent code in the context of the
file API is unbalanced. Accordingly, in this dissertation I present
and evaluate two practical strategies to disclose structure and type
in file data.
First, I investigate to what extent it is possible to identify
specific portions of file content in diverse data sets through the
implementation and evaluation of techniques for data redundancy
detection. This study is interesting not only because it
characterises redundancy levels in storage system content, but also
because redundant portions of data at a sub-file level can be an
indication of internal file data structure. Although these
techniques have been used by previous work, my analysis of data
redundancy is the first that makes an in-depth comparison of them
and highlights the trade-offs in their employment.
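One representative technique from this literature is content-defined
chunking, sketched below with a naive rolling sum (production systems
use Rabin fingerprints; the window size and cut pattern here are
arbitrary):

    import hashlib

    def chunks(data, window=48, modulus=1 << 13):
        # Cut wherever the rolling hash of the last `window` bytes hits a
        # fixed pattern, so identical content produces identical chunk
        # boundaries regardless of its offset within the file.
        start, rolling = 0, 0
        for i, byte in enumerate(data):
            rolling = (rolling + byte) % modulus
            if i >= window:
                rolling = (rolling - data[i - window]) % modulus
            if rolling == modulus - 1 and i + 1 - start >= window:
                yield data[start:i + 1]
                start = i + 1
        if start < len(data):
            yield data[start:]

    def fingerprints(data):
        # Redundancy across files shows up as repeated chunk digests.
        return [hashlib.sha1(c).hexdigest() for c in chunks(data)]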
Second, I introduce a novel storage system API, called Datom, that
departs from the view of file content as a monolithic object.
Through a minimal set of commonly-used abstract data types, it
discloses a judicious degree of structure and type in the logical
composition of files and makes the data access semantics of
applications explicit. The design of the Datom API weighs the
addition of advanced functionality against the overheads it
introduces, taking into account the requirements of the target
application domain. The implementation of the Datom API is evaluated
according to different criteria such as usability, impact at the
source-code level, and performance. The experimental results
demonstrate that the Datom API reduces work-effort and improves
software quality by providing a storage interface based on
high-level abstractions.
Probabilistic word sense disambiguation: Analysis and
techniques for combining knowledge sourcesPreiss, JuditaUniversity of Cambridge, Computer Laboratory2006-08enTextUCAM-CL-TR-673ISSN 1476-2986
This thesis shows that probabilistic word sense disambiguation
systems based on established statistical methods are strong
competitors to current state-of-the-art word sense disambiguation
(WSD) systems.
We begin with a survey of approaches to WSD, and examine their
performance in the systems submitted to the SENSEVAL-2 WSD
evaluation exercise. We discuss existing resources for WSD, and
investigate the amount of training data needed for effective
supervised WSD.
We then present the design of a new probabilistic WSD system. The
main feature of the design is that it combines multiple
probabilistic modules using both Dempster-Shafer theory and Bayes
Rule. Additionally, the use of Lidstone’s smoothing provides a
uniform mechanism for weighting modules based on their accuracy,
removing the need for an additional weighting scheme.
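For reference, the Lidstone-smoothed estimate of an outcome’s
probability from counts is the textbook formula

    P(o) = \frac{n_o + \lambda}{N + \lambda V}

where n_o is the outcome’s count, N the total count and V the number
of possible outcomes; λ = 1 recovers Laplace smoothing, while smaller
λ trusts the raw counts more.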
Lastly, we evaluate our probabilistic WSD system using traditional
evaluation methods, and introduce a novel task-based approach. When
evaluated on the gold standard used in the SENSEVAL-2 competition,
the performance of our system lies between the first and second
ranked WSD system submitted to the English all words task.
Task-based evaluations are becoming more popular in natural language
processing, being an absolute measure of a system’s performance on a
given task. We present a new evaluation method based on
subcategorization frame acquisition. Experiments with our
probabilistic WSD system give an extremely high correlation between
subcategorization frame acquisition performance and WSD performance,
thus demonstrating the suitability of SCF acquisition as a WSD
evaluation task.
Landmark Guided ForwardingLim, Meng HowUniversity of Cambridge, Computer Laboratory2006-10enTextUCAM-CL-TR-674ISSN 1476-2986
Wireless mobile ad hoc network routing presents some extremely
challenging research problems. While primarily trying to provide
connectivity, algorithms may also be designed to minimise resource
consumption such as power, or to trade off global optimisation
against the routing protocol overheads. In this thesis, we focus on
the problems of maintaining network connectivity in the presence of
node mobility whilst providing a balance between global efficiency
and robustness. The common design goal among existing wireless ad
hoc routing solutions is to search for an optimal topological path
between a source and a destination for some shortest path metric. We
argue that the goal of establishing an end-to-end globally optimal
path is unsustainable as the network diameter, traffic volume and
number of nodes all increase in the presence of moderate node
mobility.
Some researchers have proposed using geographic position-based
forwarding, rather than a topological-based approach. In
position-based forwarding, besides knowing about its own geographic
location, every node also acquires the geographic position of its
surrounding neighbours. Packet delivery in general is achieved by
first learning the destination position from a location service.
This is followed by addressing the packet with the destination
position before forwarding the packet on to a neighbour that,
amongst all other neighbours, is geographically nearest to the
destination. It is clear that in the ad hoc scenario, forwarding
only by geodesic position could result in situations that prevent
the packet from advancing further. To resolve this, some researchers
propose improving delivery guarantees by routing the packet along a
planar graph constructed from a Gabriel (GG) or a Relative Neighbour
Graph (RNG). This approach, however, has been shown to fail frequently
when position information is inherently inaccurate, or neighbourhood
state is stale, such as is the case in many plausible deployment
scenarios, e.g. due to relative mobility rates being higher than
location service update frequency.
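Greedy position-based forwarding, the baseline just described, fits
in a few lines; the failure mode arises when no neighbour is closer
to the destination than the current node (a local maximum), and stale
positions make such failures more common.

    import math

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def greedy_next_hop(self_pos, neighbours, dest_pos):
        # neighbours: {node_id: (x, y)} -- positions may be stale or
        # inaccurate, which is exactly the weakness LGF mitigates.
        best = min(neighbours,
                   key=lambda n: dist(neighbours[n], dest_pos),
                   default=None)
        if best is None or dist(neighbours[best], dest_pos) >= dist(self_pos, dest_pos):
            return None        # local maximum: greedy forwarding is stuck
        return best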
We propose Landmark Guided Forwarding (LGF), an algorithm that
harnesses the strengths of both topological and geographical routing
algorithms. LGF is a hybrid scheme that leverages the scaling
property of the geographic approach while using local topology
knowledge to mitigate location uncertainty. We demonstrate through
extensive simulations that LGF is suited both to situations where
there are high mobility rates, and deployment when there is
inherently less accurate position data. Our results show that
Landmark Guided Forwarding converges faster, scales better and is
more flexible in a range of plausible mobility scenarios than
representative protocols from the leading classes of existing
solutions, namely GPSR, AODV and DSDV.
Computational models for first language
acquisitionButtery, Paula J.University of Cambridge, Computer Laboratory2006-11enTextUCAM-CL-TR-675ISSN 1476-2986
This work investigates a computational model of first language
acquisition; the Categorial Grammar Learner or CGL. The model builds
on the work of Villavicencio, who created a parametric Categorial
Grammar learner that organises its parameters into an inheritance
hierarchy, and also on the work of Buszkowski and Kanazawa, who
demonstrated the learnability of a k-valued Classic Categorial
Grammar (which uses only the rules of function application) from
strings. The CGL is able to learn a k-valued General Categorial
Grammar (which uses the rules of function application, function
composition and Generalised Weak Permutation).
The novel concept of Sentence Objects (simple strings, augmented
strings, unlabelled structures and functor-argument structures) is
presented; each is a potential point from which learning may
commence. Augmented strings (which are strings augmented with some
basic syntactic information) are suggested as a sensible input to the
CGL as they are cognitively plausible objects and have greater
information content than strings alone. Building on the work of
Siskind, a method for constructing augmented strings from unordered
logic forms is detailed, and it is suggested that augmented strings
are simply a representation of the constraints placed on the space of
possible parses by a string’s associated semantic content.
The CGL makes crucial use of a statistical Memory Module (constructed
from a Type Memory and Word Order Memory) that is used both to
constrain hypotheses and to handle data which is noisy or
parametrically ambiguous. A consequence of the Memory Module is that
the CGL learns in an incremental fashion. This echoes real child
learning as documented in Brown’s Stages of Language Development and
also as alluded to by an included corpus study of child speech.
Furthermore, the CGL learns faster when initially presented with
simpler linguistic data; a further corpus study of child-directed
speech suggests that this echoes the input provided to children.
The CGL is demonstrated to learn from real data. It is evaluated
against previous parametric learners (the Triggering Learning
Algorithm of Gibson and Wexler and the Structural Triggers Learner of
Fodor and Sakas) and is found to be more efficient.
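For readers unfamiliar with categorial grammar, the two core
application rules can be captured in a few lines (an illustration,
not the CGL’s implementation):

    # Categories are atoms ("S", "NP") or triples (result, slash, argument).
    def fapp(left, right):
        # forward application:  X/Y  Y  =>  X
        if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
            return left[0]

    def bapp(left, right):
        # backward application:  Y  X\Y  =>  X
        if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
            return right[0]

    NP = "NP"
    TV = (("S", "\\", "NP"), "/", "NP")   # transitive verb: (S\NP)/NP
    vp = fapp(TV, NP)                     # "likes Mary"        =>  S\NP
    assert bapp(NP, vp) == "S"            # "John (likes Mary)" =>  S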
Road traffic analysis using MIDAS data: journey time
predictionGibbens, R.J.Saacti, Y.University of Cambridge, Computer Laboratory2006-12enTextUCAM-CL-TR-676ISSN 1476-2986
The project described in this report was undertaken within the
Department for Transport’s second call for proposals in the Horizons
research programme under the theme of “Investigating the handling of
large transport related datasets”. The project looked at the
variability of journey times across days in three day categories:
Mondays, midweek days and Fridays. Two estimators using real-time
data were considered: a simple-to-implement regression-based method
and a more computationally demanding k-nearest neighbour method. Our
example scenario of UK data was taken from the M25 London orbital
motorway during 2003 and the results compared in terms of the
root-mean-square prediction error. It was found that where the
variability was greatest (typically during rush-hour periods or
periods of flow breakdown), the regression and nearest neighbour
estimators reduced the prediction error substantially compared with
a naive estimator constructed from the historical mean journey time.
Only as the lag between the decision time and the journey start time
increased to beyond around 2 hours did the potential to improve upon
the historical mean estimator diminish. Thus, there is considerable
scope for prediction methods combined with access to real-time data
to improve the accuracy in journey time estimates. In so doing, they
reduce the uncertainty in estimating the generalized cost of travel.
The regression-based prediction estimator has a particularly low
computational overhead, in contrast to the nearest neighbour
estimator, which makes it entirely suitable for an online
implementation. Finally, the project demonstrates both the value of
preserving historical archives of transport related datasets as well
as provision of access to real-time measurements.
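The nearest neighbour estimator has roughly the following shape (a
sketch; the project’s feature choice and any distance weighting
differ):

    import numpy as np

    def knn_journey_time(history, current, k=10):
        # history: list of (feature_vector, observed_journey_time) pairs
        # from the archive; current: the real-time feature vector (e.g.
        # recent flows and speeds). Predict the mean journey time of the
        # k most similar historical observations.
        feats = np.array([h[0] for h in history], dtype=float)
        times = np.array([h[1] for h in history], dtype=float)
        d = np.linalg.norm(feats - np.asarray(current, dtype=float), axis=1)
        return times[np.argsort(d)[:k]].mean()

The regression estimator replaces this search of the archive with a
fitted model, which is why its online cost is so much lower.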
ECCO: Data centric asynchronous communicationYoneki, EikoUniversity of Cambridge, Computer Laboratory2006-12enTextUCAM-CL-TR-677ISSN 1476-2986
This dissertation deals with data centric networking in distributed
systems, which relies on content addressing instead of host
addressing for participating nodes, thus providing network
independence for applications. Publish/subscribe asynchronous group
communication realises the vision of data centric networking that is
particularly important for networks supporting mobile clients over
heterogeneous wireless networks. In such networks, client
applications prefer to receive specific data and require selective
data dissemination. Underlying mechanisms such as asynchronous
message passing, distributed message filtering and
query/subscription management are essential. Furthermore, recent
progress in wireless sensor networks brought a new dimension of data
processing in ubiquitous computing, where the sensors are used to
gather high volumes of different data types and to feed them as
contexts to a wide range of applications.
Particular emphasis has been placed on fundamental design of event
representation. Besides existing event attributes, event order, and
continuous context information such as time or geographic location
can be incorporated within an event description. Data representation
of events and queries will be even more important in future ubiquitous
computing, where events flow over heterogeneous networks. This
dissertation presents a multidimensional event representation (i.e.,
Hypercube structure in RTree) for efficient indexing, filtering,
matching, and scalability in publish/subscribe systems. The
hypercube event with a typed content-based publish/subscribe system
for wide-area networks is demonstrated for improving the event
filtering process.
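The essence of the representation is that a subscription occupies a
hypercube (one interval per attribute) and an event matches if it
falls inside; an R-tree over these hypercubes makes the filtering
step sublinear. A direct check of one subscription (attribute names
invented):

    def matches(subscription, event):
        # subscription: {attr: (low, high)}; event: {attr: value}.
        # A missing attribute yields NaN, which fails every comparison.
        return all(lo <= event.get(attr, float("nan")) <= hi
                   for attr, (lo, hi) in subscription.items())

    sub = {"temperature": (18.0, 25.0), "hour": (9.0, 17.0)}
    assert matches(sub, {"temperature": 21.5, "hour": 10.0})
    assert not matches(sub, {"temperature": 30.0, "hour": 10.0})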
As a primary focus, this dissertation investigates a structureless,
asynchronous group communication over wireless ad hoc networks named
‘ECCO Pervasive Publish/Subscribe’ (ECCO-PPS). ECCO-PPS uses
context-adaptive controlled flooding, which takes a cross-layer
approach between middleware and network layers and provides a
content-based publish/subscribe paradigm. Traditionally events have
been payload data within network layer components; the network layer
never touches the data contents. However, application data have more
influence on data dissemination in ubiquitous computing scenarios.
The state information of the local node may be the event forwarding
trigger. Thus, the model of publish/subscribe must become more
symmetric, with events being disseminated based on rules and
conditions defined by the events themselves. The event can thus
choose the destinations instead of relying on the potential
receivers’ decision. The publish/subscribe system offers a data
centric approach, where the destination address is not described
with any explicit network address. The symmetric publish/subscribe
paradigm brings another level to the data-centric paradigm, leading
to a fundamental change in functionality at the network level of
asynchronous group communication and membership maintenance.
To add a further dimension of event processing in global
computing, it is important to understand event aggregation,
filtering and correlation. Temporal ordering of events is essential
for event correlation over distributed systems. This dissertation
introduces generic composite event semantics with interval-based
semantics for event detection. This precisely defines complex timing
constraints among correlated event instances.
In conclusion, this dissertation provides advanced data-centric
asynchronous communication, which provides efficiency, reliability,
and robustness, while adapting to the underlying network
environments.
Compact forbidden-set routingTwigg, Andrew D.University of Cambridge, Computer Laboratory2006-12enTextUCAM-CL-TR-678ISSN 1476-2986
We study the compact forbidden-set routing problem. We describe the
first compact forbidden-set routing schemes that do not suffer from
non-convergence problems often associated with Bellman-Ford
iterative schemes such as the interdomain routing protocol, BGP. For
degree-d n-node graphs of treewidth t, our schemes use space O(t² d
polylog(n)) bits per node; a trivial scheme uses O(n²) and routing
trees use Ω(n) per node (these results have since been improved and
extended – see [Courcelle, Twigg, Compact forbidden-set routing,
24th Symposium on Theoretical Aspects of Computer Science, Aachen
2007]). We also show how to do forbidden-set routing on planar graphs
between nodes whose distance is less than a parameter l. We prove a
lower bound on the space requirements of forbidden-set routing for
general graphs, and show that the problem is related to constructing
an efficient distributed representation of all the separators of an
undirected graph. Finally, we consider routing while taking into
account path costs of intermediate nodes and show that this requires
large routing labels. We also study a novel way of approximating
forbidden-set routing using quotient graphs of low treewidth.
Automatic summarising: a review and discussion of the state
of the artSpärck Jones, KarenUniversity of Cambridge, Computer Laboratory2007-01enTextUCAM-CL-TR-679ISSN 1476-2986
This paper reviews research on automatic summarising over the last
decade. This period has seen a rapid growth of work in the area
stimulated by technology and by several system evaluation
programmes. The review makes use of several frameworks to organise
the material: for summarising, for systems, for the task factors
affecting summarising, and for evaluation design and practice.
The review considers the evaluation strategies that have been
applied to summarising and the issues they raise, and the major
summary evaluation programmes. It examines the input, purpose and
output factors that have been investigated in summarising research
in the last decade, and discusses the classes of strategy, both
extractive and non-extractive, that have been explored, illustrating
the range of systems that have been built. This analysis of
strategies is amplified by accounts of specific exemplar systems.
The conclusions drawn from the review are that automatic
summarisation research has made valuable progress in the last
decade, with some practically useful approaches, better evaluation,
and more understanding of the task. However as the review also makes
clear, summarising systems are often poorly motivated in relation to
the factors affecting summaries, and evaluation needs to be taken
significantly further so as to engage with the purposes for which
summaries are intended and the contexts in which they are used.
A reduced version of this report, entitled ‘Automatic summarising:
the state of the art’ will appear in Information Processing and
Management, 2007.
Haggle: Clean-slate networking for mobile devicesSu, JingScott, JamesHui, PanUpton, EbenLim, Meng HowDiot, ChristopheCrowcroft, JonGoel, Ashvinde Lara, EyalUniversity of Cambridge, Computer Laboratory2007-01enTextUCAM-CL-TR-680ISSN 1476-2986
Haggle is a layerless networking architecture for mobile devices. It
is motivated by the infrastructure dependence of applications such
as email and web browsing, even in situations where infrastructure
is not necessary to accomplish the end user goal, e.g. when the
destination is reachable by ad hoc neighbourhood communication. In
this paper we present details of Haggle’s architecture, and of the
prototype implementation which allows existing email and web
applications to become infrastructure-independent, as we show with
an experimental evaluation.
Indirect channels: a bandwidth-saving technique for
fault-tolerant protocolsZieliński, PiotrUniversity of Cambridge, Computer Laboratory2007-04enTextUCAM-CL-TR-681ISSN 1476-2986
Sending large messages known to the recipient is a waste of
bandwidth. Nevertheless, many fault-tolerant agreement protocols
send the same large message between each pair of participating
processes. This practical problem has recently been addressed in the
context of Atomic Broadcast by presenting a specialized algorithm.
This paper proposes a more general solution by providing virtual
indirect channels that physically transmit message ids instead of
full messages if possible. Indirect channels are transparent to the
application; they can be used with any distributed algorithm, even
with unreliable channels or malicious participants. At the same
time, they provide rigorous theoretical properties.
Indirect channels are conservative: they do not allow manipulating
message ids if full messages are not known. This paper also
investigates the consequences of relaxing this assumption on the
latency and correctness of Consensus and Atomic Broadcast
implementations: new algorithms and lower bounds are shown.
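A sketch of the send side of such a channel (method names invented;
the eager bookkeeping below is a simplification of the paper’s
acknowledgement handling):

    import hashlib

    class IndirectChannel:
        def __init__(self, transport):
            self.transport = transport    # underlying point-to-point channel
            self.peer_has = set()         # ids of messages the peer holds

        def send(self, message: bytes):
            mid = hashlib.sha1(message).hexdigest()
            if mid in self.peer_has:
                self.transport.send(("id", mid))          # short id only
            else:
                self.transport.send(("full", mid, message))
                # Simplification: assumes delivery succeeded; the real
                # protocol must track this conservatively.
                self.peer_has.add(mid)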
Translating HOL functions to hardwareIyoda, JulianoUniversity of Cambridge, Computer Laboratory2007-04enTextUCAM-CL-TR-682ISSN 1476-2986
Delivering error-free products is still a major challenge for
hardware and software engineers. Due to the ever-growing
complexity of computing systems, there is a demand for higher levels
of automation in formal verification.
This dissertation proposes an approach to generate formally verified
circuits automatically. The main outcome of our project is a
compiler implemented on top of the theorem prover HOL4 which
translates a subset of higher-order logic to circuits. The subset of
the logic is a first-order tail-recursive functional language. The
compiler takes a function f as argument and automatically produces
the theorem “⊢ C implements f” where C is a circuit and “implements”
is a correctness relation between a circuit and a function. We
achieve full mechanisation of proofs by defining theorems which are
composable. The correctness of a circuit can be mechanically
determined by the correctness of its sub-circuits. This technology
allows the designer to focus on higher levels of abstraction instead
of reasoning and verifying systems at the gate level.
A pretty-printer translates netlists described in higher-order logic
to structural Verilog. Our compiler is integrated with Altera tools
to run our circuits in FPGAs. Thus the theorem prover is used as an
environment for supporting the development process from formal
specification to implementation.
Our approach has been tested with fairly substantial case studies.
We describe the design and the verification of a multiplier and a
simple microcomputer which has shown us that the compiler supports
small and medium-sized applications. Although this approach does not
scale to industrial-sized applications yet, it is a first step
towards the implementation of a new technology that can raise the
level of mechanisation in formal verification.
Simulation of colliding constrained rigid bodiesKleppmann, MartinUniversity of Cambridge, Computer Laboratory2007-04enTextUCAM-CL-TR-683ISSN 1476-2986
I describe the development of a program to simulate the dynamic
behaviour of interacting rigid bodies. Such a simulation may be used
to generate animations of articulated characters in 3D graphics
applications. Bodies may have an arbitrary shape, defined by a
triangle mesh, and may be connected with a variety of different
joints. Joints are represented by constraint functions which are
solved at run-time using Lagrange multipliers. The simulation
performs collision detection and prevents penetration of rigid
bodies by applying impulses to colliding bodies and reaction forces
to bodies in resting contact.
The simulation is shown to be physically accurate and is tested on
several different scenes, including one of an articulated human
character falling down a flight of stairs.
An appendix describes how to derive arbitrary constraint functions
for the Lagrange multiplier method. Collisions and joints are both
represented as constraints, which allows them to be handled with a
unified algorithm. The report also includes some results relating to
the use of quaternions in dynamic simulations.
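For reference, the method can be stated compactly (a standard
derivation, not quoted from the report): with generalised coordinates
q, mass matrix M, external forces F_ext, and a constraint C(q) = 0
whose Jacobian is J = ∂C/∂q, differentiating the constraint twice and
substituting the equations of motion M\ddot{q} = F_ext + Jᵀλ gives

    J M^{-1} J^{\mathsf{T}} \lambda = -\dot{J}\dot{q} - J M^{-1} F_{\mathrm{ext}}

which is solved for the multipliers λ at each step; Jᵀλ is then the
force that keeps the constraint satisfied.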
Bubble Rap: Forwarding in small world DTNs in ever
decreasing circlesHui, PanCrowcroft, JonUniversity of Cambridge, Computer Laboratory2007-05enTextUCAM-CL-TR-684ISSN 1476-2986
In this paper we seek to improve understanding of the structure of
human mobility, and to use this in the design of forwarding
algorithms for Delay Tolerant Networks for the dissemination of data
amongst mobile users.
Cooperation binds but also divides human society into communities.
Members of the same community interact with each other
preferentially. There is structure in human society. Within society
and its communities, individuals have varying popularity. Some
people are more popular and interact with more people than others;
we may call them hubs. Popularity ranking is one facet of the
population. In many physical networks, some nodes are more highly
connected to each other than to the rest of the network. The set of
such nodes are usually called clusters, communities, cohesive groups
or modules. There is structure to social networking. Different
metrics can be used such as information flow, Freeman betweenness,
closeness and inference power, but for all of them, each node in the
network can be assigned a global centrality value.
What can be inferred about individual popularity, and the structure
of human society from measurements within a network? How can the
local and global characteristics of the network be used practically
for information dissemination? We present and evaluate a sequence of
designs for forwarding algorithms for Pocket Switched Networks,
culminating in Bubble, which exploit increasing levels of
information about mobility and interaction.
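Paraphrasing the paper’s description, the Bubble decision rule climbs
the global popularity ranking until the message reaches the
destination’s community, then climbs the local ranking inside it. A
sketch (attribute names invented):

    def should_forward(me, peer, dest):
        # each node carries: communities (a set of labels), global_rank
        # and local_rank (higher = more central), from contact traces
        if dest.communities & peer.communities:
            if not (dest.communities & me.communities):
                return True                         # hand over into the community
            return peer.local_rank > me.local_rank  # climb the local ranking
        if dest.communities & me.communities:
            return False                            # never leave the community
        return peer.global_rank > me.global_rank    # climb the global ranking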
Effect of severe image compression on iris recognition
performanceDaugman, JohnDowning, CathrynUniversity of Cambridge, Computer Laboratory2007-05enTextUCAM-CL-TR-685ISSN 1476-2986
We investigate three schemes for severe compression of iris images,
in order to assess what their impact would be on recognition
performance of the algorithms deployed today for identifying persons
by this biometric feature. Currently, standard iris images are 600
times larger than the IrisCode templates computed from them for
database storage and search; but it is administratively desired that
iris data should be stored, transmitted, and embedded in media in
the form of images rather than as templates computed with
proprietary algorithms. To reconcile that goal with its implications
for bandwidth and storage, we present schemes that combine
region-of-interest isolation with JPEG and JPEG2000 compression at
severe levels, and we test them using a publicly available
government database of iris images. We show that it is possible to
compress iris images to as little as 2 KB with minimal impact on
recognition performance. Only some 2% to 3% of the bits in the
IrisCode templates are changed by such severe image compression.
Standard performance metrics such as error trade-off curves document
very good recognition performance despite this reduction in data
size by a net factor of 150, approaching a convergence of image data
size and template size.
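The 2% to 3% figure is measured as a fractional Hamming distance
between templates, which is straightforward to compute (a sketch that
omits the rotation search of the deployed algorithm):

    import numpy as np

    def fractional_hamming(code_a, code_b, mask):
        # code_a, code_b, mask: boolean arrays; mask marks bits unobscured
        # by eyelids or reflections in both images.
        disagree = np.logical_xor(code_a, code_b) & mask
        return disagree.sum() / mask.sum()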
Dependable systems for Sentient ComputingRice, Andrew C.University of Cambridge, Computer Laboratory2007-05enTextUCAM-CL-TR-686ISSN 1476-2986
Computers and electronic devices are continuing to proliferate
throughout our lives. Sentient Computing systems aim to reduce the
time and effort required to interact with these devices by composing
them into systems which fade into the background of the user’s
perception. Failures are a significant problem in this scenario
because their occurrence will pull the system into the foreground as
the user attempts to discover and understand the fault. However,
attempting to exist and interact with users in a real,
unpredictable, physical environment rather than a well-constrained
virtual environment makes failures inevitable.
This dissertation describes a study of dependability. A dependable
system permits applications to discover the extent of failures and
to adapt accordingly such that their continued behaviour is
intuitive to users of the system.
Cantag, a reliable marker-based machine-vision system, has been
developed to aid the investigation of dependability. The description
of Cantag includes specific contributions for marker tracking such
as rotationally invariant coding schemes and reliable
back-projection for circular tags. An analysis of Cantag’s
theoretical performance is presented and compared to its real-world
behaviour. This analysis is used to develop optimised tag designs
and performance metrics. The use of validation is proposed to permit
runtime calculation of observable metrics and verification of system
components. Formal proof methods are combined with a logical
validation framework to show the validity of performance
optimisations.
A marriage of rely/guarantee and separation logicVafeiadis, ViktorParkinson, MatthewUniversity of Cambridge, Computer Laboratory2007-06enTextUCAM-CL-TR-687ISSN 1476-2986
In the quest for tractable methods for reasoning about concurrent
algorithms both rely/guarantee logic and separation logic have made
great advances. They both seek to tame, or control, the complexity
of concurrent interactions, but neither is the ultimate approach.
Rely-guarantee copes naturally with interference, but its
specifications are complex because they describe the entire state.
Conversely separation logic has difficulty dealing with
interference, but its specifications are simpler because they
describe only the relevant state that the program accesses.
We propose a combined system which marries the two approaches. We
can describe interference naturally (using a relation as in
rely/guarantee), and where there is no interference, we can reason
locally (as in separation logic). We demonstrate the advantages of
the combined approach by verifying a lock-coupling list algorithm,
which actually disposes/frees removed nodes.
Name-passing process calculi: operational models and
structural operational semanticsStaton, SamUniversity of Cambridge, Computer Laboratory2007-06enTextUCAM-CL-TR-688ISSN 1476-2986
This thesis is about the formal semantics of name-passing process
calculi. We study operational models by relating various different
notions of model, and we analyse structural operational semantics by
extracting a congruence rule format from a model theory. All aspects
of structural operational semantics are addressed: behaviour,
syntax, and rule-based inductive definitions.
A variety of models for name-passing behaviour are considered and
developed. We relate classes of indexed labelled transition systems,
proposed by Cattani and Sewell, with coalgebraic models proposed by
Fiore and Turi. A general notion of structured coalgebra is
introduced and developed, and a natural notion of structured
bisimulation is related to Sangiorgi’s open bisimulation for the
π-calculus. At first the state spaces are organised as presheaves,
but it is reasonable to constrain the models to sheaves in a
category known as the Schanuel topos. This sheaf topos is exhibited
as equivalent to a category of named-sets proposed by Montanari and
Pistore for efficient verification of name-passing systems.
Syntax for name-passing calculi involves variable binding and
substitution. Gabbay and Pitts proposed nominal sets as an elegant
model for syntax with binding, and we develop a framework for
substitution in this context. The category of nominal sets is
equivalent to the Schanuel topos, and so syntax and behaviour can be
studied within one universe.
An abstract account of structural operational semantics was
developed by Turi and Plotkin. They explained the inductive
specification of a system by rules in the GSOS format of Bloom et
al., in terms of initial algebra recursion for lifting a monad of
syntax to a category of behaviour. The congruence properties of
bisimilarity can be observed at this level of generality. We study
this theory in the general setting of structured coalgebras, and
then for the specific case of name-passing systems, based on
categories of nominal sets.
At the abstract level of category theory, classes of rules are
understood as natural transformations. In the concrete domain,
though, rules for name-passing systems are formulae in a suitable
logical framework. By imposing a format on rules in Pitts’s nominal
logic, we characterise a subclass of rules in the abstract domain.
Translating the abstract results, we conclude that, for a
name-passing process calculus defined by rules in this format, a
variant of open bisimilarity is a congruence.
Removing polar rendering artifacts in subdivision
surfacesAugsdörfer, Ursula H.Dodgson, Neil A.Sabin, Malcolm A.University of Cambridge, Computer Laboratory2007-06enTextUCAM-CL-TR-689ISSN 1476-2986
There is a belief that subdivision schemes require the subdominant
eigenvalue, λ, to be the same around extraordinary vertices as in
the regular regions of the mesh. This belief is due to the polar
rendering artifacts which occur around extraordinary points when λ
is significantly larger than in the regular regions. By constraining
the tuning of subdivision schemes to solutions which fulfill this
condition we may prevent ourselves from finding the optimal limit
surface. We show that the perceived problem is purely a rendering
artifact and that it does not reflect the quality of the underlying
limit surface. Using the bounded curvature Catmull-Clark scheme as
an example, we describe three practical methods by which this
rendering artifact can be removed, thereby allowing us to tune
subdivision schemes using any appropriate values of λ.
Cluster storage for commodity computationRoss, Russell GlenUniversity of Cambridge, Computer Laboratory2007-06enTextUCAM-CL-TR-690ISSN 1476-2986
Standards in the computer industry have made basic components and
entire architectures into commodities, and commodity hardware is
increasingly being used for the heavy lifting formerly reserved for
specialised platforms. Now software and services are following.
Modern updates to virtualization technology make it practical to
subdivide commodity servers and manage groups of heterogeneous
services using commodity operating systems and tools, so services
can be packaged and managed independently of the hardware on which
they run. Computation as a commodity is soon to follow, moving
beyond the specialised applications typical of today’s utility
computing.
In this dissertation, I argue for the adoption of service
clusters—clusters of commodity machines under central control, but
running services in virtual machines for arbitrary, untrusted
clients—as the basic building block for an economy of flexible
commodity computation. I outline the requirements this platform
imposes on its storage system and argue that they are necessary for
service clusters to be practical, but are not found in existing
systems.
Next I introduce Envoy, a distributed file system for service
clusters. In addition to meeting the needs of a new environment,
Envoy introduces a novel file distribution scheme that organises
metadata and cache management according to runtime demand. In
effect, the file system is partitioned and control of each part
given to the client that uses it the most; that client in turn acts
as a server with caching for other clients that require concurrent
access. Scalability is limited only by runtime contention, and
clients share a perfectly consistent cache distributed across the
cluster. As usage patterns change, the partition boundaries are
updated dynamically, with urgent changes made quickly and more minor
optimisations made over a longer period of time.
Experiments with the Envoy prototype demonstrate that service
clusters can support cheap and rapid deployment of services, from
isolated instances to groups of cooperating components with shared
storage demands.
Preconditions on geometrically sensitive subdivision
schemesDodgson, Neil A.Sabin, Malcolm A.Southern, RichardUniversity of Cambridge, Computer Laboratory2007-08enTextUCAM-CL-TR-691ISSN 1476-2986
Our objective is to create subdivision schemes with limit surfaces
which are surfaces useful in engineering (spheres, cylinders, cones
etc.) without resorting to special cases. The basic idea, explored by
us previously in the curve case, is that if the property that all
vertices lie on an object of the required class can be preserved
through subdivision refinement, it will also be preserved in the
limit surface. The next obvious step was to try a bivariate
example. We therefore identified the simplest possible scheme and
implemented it. However, this misbehaved quite dramatically. This
report, by doing the limit analysis, identifies why the misbehaviour
occurred, and draws conclusions about how the problems should be
avoided.
Toward an undergraduate programme in Interdisciplinary
DesignBlackwell, Alan F.University of Cambridge, Computer Laboratory2007-07enTextUCAM-CL-TR-692ISSN 1476-2986
This technical report describes an experimental syllabus proposal
that was developed for the Cambridge Computer Science Tripos (the
standard undergraduate degree programme in Computer Science at
Cambridge). The motivation for the proposal was to create an
innovative research-oriented taught course that would be compatible
with the broader policy goals of the Crucible network for research
in interdisciplinary design. As the course is not proceeding, the
syllabus is published here for use by educators and educational
researchers with interests in design teaching.
Automatic classification of eventual failure
detectorsZieliński, PiotrUniversity of Cambridge, Computer Laboratory2007-07enTextUCAM-CL-TR-693ISSN 1476-2986
Eventual failure detectors, such as Ω or ♢P, can make arbitrarily
many mistakes before they start providing correct information. This
paper shows that any detector implementable in a purely
asynchronous system can be implemented as a function of only the
order of most-recently heard-from processes. The finiteness of this
representation means that eventual failure detectors can be
enumerated and their relative strengths tested automatically. The
results for systems with two and three processes are presented.
Implementability can also be modelled as a game between Prover and
Disprover. This approach not only speeds up automatic
implementability testing, but also results in shorter and more
intuitive proofs. I use this technique to identify the new weakest
failure detector anti-Ω and prove its properties. Anti-Ω outputs
process ids and, while not necessarily stabilizing, it ensures that
some correct process is eventually never output.
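The finite representation at the heart of this result can be made concrete: maintain the processes in most-recently-heard-from order, and define a detector as a pure function of that order. The sketch below is a hypothetical illustration (the example detector is arbitrary and is not claimed to satisfy the anti-Ω properties):

    from collections import deque

    class RecencyOrder:
        # Track the order in which processes were most recently heard from.
        def __init__(self, processes):
            self.order = deque(processes)       # front = most recent

        def heard_from(self, p):
            self.order.remove(p)
            self.order.appendleft(p)

        def snapshot(self):
            return tuple(self.order)

    # An eventual detector is then just a function of the snapshot, which
    # makes the space of such detectors finite and hence enumerable.
    def example_detector(order):
        return order[-1]                        # least recently heard from

    r = RecencyOrder(["p1", "p2", "p3"])
    for sender in ["p2", "p3", "p2", "p2"]:
        r.heard_from(sender)
    print(r.snapshot(), "->", example_detector(r.snapshot()))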
Anti-Ω: the weakest failure detector for set
agreementZieliński, PiotrUniversity of Cambridge, Computer Laboratory2007-07enTextUCAM-CL-TR-694ISSN 1476-2986
In the set agreement problem, n processes have to decide on at most
n−1 of the proposed values. This paper shows that the anti-Ω failure
detector is both sufficient and necessary to implement set agreement
in an asynchronous shared-memory system equipped with registers.
Each query to anti-Ω returns a single process id; the specification
ensures that there is a correct process whose id is returned only
finitely many times.
Efficient maximum-likelihood decoding of spherical lattice
codesSu, KarenBerenguer, InakiWassell, Ian J.Wang, XiaodongUniversity of Cambridge, Computer Laboratory2007-07enTextUCAM-CL-TR-695ISSN 1476-2986
A new framework for efficient and exact Maximum-Likelihood (ML)
decoding of spherical lattice codes is developed. It employs a
double-tree structure: the first is that which underlies established
tree-search decoders; the second plays the crucial role of guiding
the primary search by specifying admissible candidates and is our
focus in this report. Lattice codes have long been of interest due
to their rich structure, leading to numerous decoding algorithms for
unbounded lattices, as well as those with axis-aligned rectangular
shaping regions. Recently, spherical Lattice Space-Time (LAST) codes
were proposed to realize the optimal diversity-multiplexing tradeoff
of MIMO channels. We address the so-called boundary control problem
arising from the spherical shaping region defining these codes. This
problem is complicated because of the varying number of candidates
potentially under consideration at each search stage; it is not
obvious how to address it effectively within the frameworks of
existing schemes. Our proposed strategy is compatible with all
sequential tree-search detectors, as well as auxiliary processing
such as the MMSE-GDFE and lattice reduction. We demonstrate the
superior performance and complexity profiles achieved when applying
the proposed boundary control in conjunction with two current
efficient ML detectors and show an improvement of 1dB over the
state-of-the-art at a comparable complexity.
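For readers unfamiliar with the underlying tree search, the following is a generic depth-first sphere-decoder skeleton (Python with numpy; all names are my own). The report's contribution would sit at the marked point, where boundary control determines which candidate symbols are admissible at each node of the spherical shaping region; here a fixed alphabet is used instead.

    import numpy as np

    def sphere_decode(R, y, symbols, radius):
        # Find x minimising ||y - R x|| by depth-first search with radius
        # pruning; R must be upper triangular (e.g. from a QR decomposition
        # of the channel matrix).
        n = R.shape[0]
        best, best_d = None, radius ** 2
        x = np.zeros(n)

        def descend(level, dist):
            nonlocal best, best_d
            if dist >= best_d:
                return                      # prune: outside current sphere
            if level < 0:
                best, best_d = x.copy(), dist
                return
            for s in symbols:               # <-- boundary control would
                x[level] = s                #     shrink this candidate set
                e = y[level] - R[level, level:] @ x[level:]
                descend(level - 1, dist + e * e)

        descend(n - 1, 0.0)
        return best

    R = np.array([[2.0, 0.3], [0.0, 1.5]])
    y = np.array([1.9, -1.4])
    print(sphere_decode(R, y, symbols=(-1.0, 1.0), radius=3.0))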
An introduction to inertial navigationWoodman, Oliver J.University of Cambridge, Computer Laboratory2007-08enTextUCAM-CL-TR-696ISSN 1476-2986
Until recently the weight and size of inertial sensors have
prohibited their use in domains such as human motion capture. Recent
improvements in the performance of small and lightweight
micro-machined electromechanical systems (MEMS) inertial sensors
have made the application of inertial techniques to such problems
possible. This has resulted in an increased interest in the topic of
inertial navigation; however, current introductions to the subject
fail to sufficiently describe the error characteristics of inertial
systems.
We introduce inertial navigation, focusing on strapdown systems
based on MEMS devices. A combination of measurement and simulation
is used to explore the error characteristics of such systems. For a
simple inertial navigation system (INS) based on the Xsens MTx
inertial measurement unit (IMU), we show that the average error in
position grows to over 150 m after 60 seconds of operation. The
propagation of orientation errors caused by noise perturbing
gyroscope signals is identified as the critical cause of such drift.
By simulation we examine the significance of individual noise
processes perturbing the gyroscope signals, identifying white noise
as the process which contributes most to the overall drift of the
system.
Sensor fusion and domain specific constraints can be used to reduce
drift in INSs. For an example INS we show that sensor fusion using
magnetometers can reduce the average error in position obtained by
the system after 60 seconds from over 150 m to around 5 m. We
conclude that whilst MEMS IMU technology is rapidly improving, it is
not yet possible to build a MEMS based INS which gives sub-meter
position accuracy for more than one minute of operation.
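The dominant error mechanism described above is easy to reproduce numerically: white noise on the gyroscope integrates into a random walk in orientation, the tilt error mis-projects gravity into the horizontal accelerometer channel, and double integration turns that into rapidly growing position error. A one-axis sketch follows (the noise magnitude is chosen for illustration, not taken from the MTx datasheet):

    import numpy as np

    rng = np.random.default_rng(1)
    dt, t_end, g = 0.01, 60.0, 9.81                   # 100 Hz over one minute
    n = int(t_end / dt)
    rate_noise = rng.normal(0.0, np.deg2rad(0.2), n)  # rad/s white noise

    theta = np.cumsum(rate_noise) * dt                # orientation random walk
    acc_err = g * theta                               # mis-projected gravity
    vel_err = np.cumsum(acc_err) * dt
    pos_err = np.cumsum(vel_err) * dt
    print(f"position error after {t_end:.0f} s: {abs(pos_err[-1]):.1f} m")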
Scaling Mount Concurrency: scalability and progress in
concurrent algorithmsPurcell, Chris J.University of Cambridge, Computer Laboratory2007-08enTextUCAM-CL-TR-697ISSN 1476-2986
As processor speeds plateau, chip manufacturers are turning to
multi-processor and multi-core designs to increase performance. As
the number of simultaneous threads grows, Amdahl’s Law means the
performance of programs becomes limited by the cost that does not
scale: communication, via the memory subsystem. Algorithm design is
critical in minimizing these costs.
In this dissertation, I first show that existing instruction set
architectures must be extended to allow general scalable algorithms
to be built. Since it is impractical to entirely abandon existing
hardware, I then present a reasonably scalable implementation of a
map built on the widely-available compare-and-swap primitive, which
outperforms existing algorithms for a range of usages.
Thirdly, I introduce a new primitive operation, and show that it
provides efficient and scalable solutions to several problems before
proving that it satisfies strong theoretical properties. Finally, I
outline possible hardware implementations of the primitive with
different properties and costs, and present results from a hardware
evaluation, demonstrating that the new primitive can provide good
practical performance.
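The retry pattern underlying compare-and-swap-based structures such as the map mentioned above looks roughly as follows. Python has no hardware CAS, so the sketch simulates one with a lock; the whole-map copy per update is deliberately naive (the dissertation's map is far more refined), and only the read-modify-CAS-retry shape is the point.

    import threading

    class AtomicRef:
        # Software stand-in for a hardware compare-and-swap word.
        def __init__(self, value):
            self._value, self._lock = value, threading.Lock()

        def get(self):
            return self._value

        def cas(self, expected, new):
            with self._lock:              # hardware does this atomically
                if self._value is expected:
                    self._value = new
                    return True
                return False

    shared = AtomicRef({})

    def insert(key, value):
        while True:                       # retry on contention
            snapshot = shared.get()
            updated = dict(snapshot)
            updated[key] = value
            if shared.cas(snapshot, updated):
                return

    insert("x", 1)
    insert("y", 2)
    print(shared.get())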
Pulse-based, on-chip interconnectHollis, Simon J.University of Cambridge, Computer Laboratory2007-09enTextUCAM-CL-TR-698ISSN 1476-2986
This thesis describes the development of an on-chip point-to-point
link, with particular emphasis on the reduction of its global metal
area footprint.
To reduce its metal footprint, the interconnect uses a serial
transmission approach. 8-bit data is sent using just two wires,
through a pulse-based technique, inspired by the GasP interconnect
from Sun Microsystems. Data and control signals are transmitted
bi-directionally on a wire using this double-edged, pulse-based
signalling protocol, and formatted using a variant of dual-rail
encoding. These choices enable a reduction in the number of wires
needed, an improvement in the acknowledgement overhead of the
asynchronous protocol, and the ability to cross clock domains
without synchronisation hazards.
New stateful repeaters are demonstrated, and results from SPICE
simulations of the system show that data can be transferred at over
1Gbit/s, over 1mm of minimum-sized, minimally-spaced metal 5 wiring,
on a 180nm (0.18um) technology. This reduces to only 926Mbit/s, when
10mm of wiring is considered, and represents a channel utilisation
of a very attractive 45% of theoretical capacity at this length.
Analysis of latencies, energy consumption, and area use are also
provided.
The point-to-point link is then expanded with the invention and
demonstration of a router and an arbitrated merge element, to
produce a Network-on-Chip (NoC) design, called RasP. The full system
is then evaluated, and peak throughput is shown to be 763Mbit/s for
1mm of wiring, reducing to 599Mbit/s for 10mm of the narrow metal 5
interconnect.
Finally, RasP is compared in performance with the Chain interconnect
from the University of Manchester. Results for the metrics of
throughput, latency, energy consumption and area footprint show that
the two systems perform very similarly — the maximum absolute
deviation is under 25% for throughput, latency and area; and the
energy-efficiency of RasP is approximately twice that of Chain.
Between the two systems, RasP has the smaller latency, energy and
area requirements and is shown to be a viable alternative NoC
design.
A smooth manifold based construction of approximating lofted
surfacesSouthern, RichardDodgson, Neil A.University of Cambridge, Computer Laboratory2007-10enTextUCAM-CL-TR-699ISSN 1476-2986
We present a new method for constructing a smooth manifold
approximating a curve network or control mesh. In our two-step
method, smooth vertex patches are initially defined by extrapolating
and then blending a univariate or bivariate surface representation.
Each face is then constructed by blending together the segments of
each vertex patch corresponding to the face corners. By
approximating the input curve network, rather than strictly
interpolating it, we have greater flexibility in controlling surface
behaviour and have local control. Additionally no initial control
mesh fitting or fairing needs to be performed, and no derivative
information is needed to ensure continuity at patch boundaries.
Context aware service compositionVuković, MajaUniversity of Cambridge, Computer Laboratory2007-10enTextUCAM-CL-TR-700ISSN 1476-2986
Context aware applications respond and adapt to changes in the
computing environment. For example, they may react when the location
of the user or the capabilities of the device used change. Despite
the increasing importance and popularity of such applications,
advances in application models to support their development have not
kept up. Legacy application design models, which embed contextual
dependencies in the form of if-then rules specifying how
applications should react to context changes, are still widely used.
Such models cannot practically accommodate the large variety of
possible, even unanticipated, context types and their values.
This dissertation proposes a new application model for building
context aware applications, considering them as dynamically composed
sequences of calls to services, software components that perform
well-defined computational operations and export open interfaces
through which they can be invoked. This work employs goal-oriented
inferencing from planning technologies for selecting the services
and assembling the sequence of their execution, allowing different
compositions to result from different context parameters such as
resources available, time constraints, and user location. Contextual
changes during the execution of the services may trigger further
re-composition causing the application to evolve dynamically.
An important challenge in providing a context aware service
composition facility is dealing with failures that may occur, for
instance as a result of context changes or missing service
descriptions. To handle composition failures, this dissertation
introduces GoalMorph, a system which transforms failed composition
requests into alternative ones that can be solved.
This dissertation describes the design and implementation of the
proposed framework for context aware service composition.
Experimental evaluation of a realistic infotainment application
demonstrates that the framework provides an efficient and scalable
solution. Furthermore, it shows that GoalMorph transforms goals
successfully, increasing the utility of achieved goals without
imposing a prohibitive composition time overhead.
By developing the proposed framework for fault-tolerant, context
aware service composition this work ultimately lowers the barrier
for building extensible applications that automatically adapt to the
user’s context. This represents a step towards a new paradigm for
developing adaptive software to accommodate the increasing
dynamicity of computing environments.
Vector microprocessors for cryptographyFournier, Jacques Jean-AlainUniversity of Cambridge, Computer Laboratory2007-10enTextUCAM-CL-TR-701ISSN 1476-2986
Embedded security devices like ‘Trusted Platforms’ require both
scalability (of power, performance and area) and flexibility (of
software and countermeasures). This thesis illustrates how data
parallel techniques can be used to implement scalable architectures
for cryptography. Vector processing is used to provide high
performance, power efficient and scalable processors. A programmable
vector 4-stage pipelined co-processor, controlled by a scalar MIPS
compatible processor, is described. The instruction set of the
co-processor is defined for cryptographic algorithms like AES and
Montgomery modular multiplication for RSA and ECC. The instructions
are assessed using an instruction set simulator based on the ArchC
tool. This instruction set simulator is used to see the impact of
varying the vector register depth (p) and the number of vector
processing units (r). Simulations indicate that for vector versions
of AES, RSA and ECC the performance improves as O(log(r)). A
cycle-accurate synthesisable Verilog model of the system (VeMICry)
is implemented in TSMC’s 90nm technology and used to show that the
best area/power/performance tradeoff is reached for r = (p/4). Also,
this highly scalable design allows area/power/performance trade-offs
to be made for a panorama of applications ranging from smart-cards
to servers. This thesis is, to the best of my knowledge, the first attempt
to implement embedded cryptography using vector processing
techniques.
Relationships for object-oriented programming
languagesWren, AlisdairUniversity of Cambridge, Computer Laboratory2007-11enTextUCAM-CL-TR-702ISSN 1476-2986
Object-oriented approaches to software design and implementation
have gained enormous popularity over the past two decades. However,
whilst models of software systems routinely allow software engineers
to express relationships between objects, object-oriented
programming languages lack this ability. Instead, relationships must
be encoded using complex reference structures. When the model cannot
be expressed directly in code, it becomes more difficult for
programmers to see a correspondence between design and
implementation – the model no longer faithfully documents the code.
As a result, programmer intuition is lost, and error becomes more
likely, particularly during maintenance of an unfamiliar software
system.
This thesis explores extensions to object-oriented languages so that
relationships may be expressed with the same ease as objects. Two
languages with relationships are specified: RelJ, which offers
relationships in a class-based language based on Java, and QSigma,
which is an object calculus with heap query.
In RelJ, relationship declarations exist at the same level as class
declarations: relationships are named, they may have fields and
methods, they may inherit from one another and their instances may
be referenced just like objects. Moving into the object-based world,
QSigma is based on the sigma-calculi of Abadi and Cardelli, extended
with the ability to query the heap. Heap query allows objects to
determine how they are referenced by other objects, such that single
references are sufficient for establishing an inter-object
relationship observable by all participants. Both RelJ and QSigma
are equipped with a formal type system and semantics to ensure type
safety in the presence of these extensions.
By giving formal models of relationships in both class- and
object-based settings, we can obtain general principles for
relationships in programming languages and, therefore, establish a
correspondence between implementation and design.
Lazy Susan: dumb waiting as proof of workCrowcroft, JonDeegan, TimKreibich, ChristianMortier, RichardWeaver, NicholasUniversity of Cambridge, Computer Laboratory2007-11enTextUCAM-CL-TR-703ISSN 1476-2986
The open nature of Internet services has been of great value to
users, enabling dramatic innovation and evolution of services.
However, this openness permits many abuses of open-access Internet
services such as web, email, and DNS. To counteract such abuses, a
number of so called proof-of-work schemes have been proposed. They
aim to prevent or limit such abuses by demanding potential clients
of the service to prove that they have carried out some amount of
work before they will be served. In this paper we show that existing
resource-based schemes have several problems, and instead propose
latency-based proof-of-work as a solution. We describe centralised
and distributed variants, introducing the problem class of
non-parallelisable shared secrets in the process. We also discuss
application of this technique at the network layer as a way to
prevent Internet distributed denial-of-service attacks.
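For contrast with the latency-based proposal, the resource-based schemes being criticised typically look like hashcash: the client burns CPU time finding a partial hash collision, which the server verifies cheaply. A minimal sketch (my own illustration, not code from the paper):

    import hashlib
    from itertools import count

    def prove(challenge, difficulty_bits):
        # Find a nonce whose SHA-256 hash has difficulty_bits leading zeros.
        # The cost is CPU time, which favours well-resourced clients such
        # as botnets -- one of the problems such schemes face.
        target = 1 << (256 - difficulty_bits)
        for nonce in count():
            digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
            if int.from_bytes(digest, "big") < target:
                return nonce

    def verify(challenge, nonce, difficulty_bits):
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

    nonce = prove(b"server-challenge", 16)
    print(nonce, verify(b"server-challenge", nonce, 16))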
Complexity and infinite games on finite graphsHunter, Paul WilliamUniversity of Cambridge, Computer Laboratory2007-11enTextUCAM-CL-TR-704ISSN 1476-2986
This dissertation investigates the interplay between complexity,
infinite games, and finite graphs. We present a general framework
for considering two-player games on finite graphs which may have an
infinite number of moves and we consider the computational
complexity of important related problems. Such games are becoming
increasingly important in the field of theoretical computer science,
particularly as a tool for formal verification of non-terminating
systems. The framework introduced enables us to simultaneously
consider problems on many types of games easily, and this is
demonstrated by establishing previously unknown complexity bounds on
several types of games.
We also present a general framework which uses infinite games to
define notions of structural complexity for directed graphs. Many
important graph parameters, from both a graph theoretic and
algorithmic perspective, can be defined in this system. By
considering natural generalizations of these games to directed
graphs, we obtain a novel feature of digraph complexity: directed
connectivity. We show that directed connectivity is an
algorithmically important measure of complexity by showing that when
it is limited, many intractable problems can be efficiently solved.
Whether it is structurally an important measure is yet to be seen,
however this dissertation makes a preliminary investigation in this
direction.
We conclude that infinite games on finite graphs play an important
role in the area of complexity in theoretical computer science.
Optimizing compilation with the Value State Dependence
GraphLawrence, Alan C.University of Cambridge, Computer Laboratory2007-12enTextUCAM-CL-TR-705ISSN 1476-2986
Most modern compilers are based on variants of the Control Flow
Graph. Developments on this representation—specifically, SSA form
and the Program Dependence Graph (PDG)—have focused on adding and
refining data dependence information, and these suggest the next
step is to use a purely data-dependence-based representation such as
the VDG (Ernst et al.) or VSDG (Johnson et al.).
This thesis studies such representations, identifying key
differences in the information carried by the VSDG and several
restricted forms of PDG, which relate to functional programming and
continuations. We unify these representations in a new framework for
specifying the sharing of resources across a computation.
We study the problems posed by using the VSDG, and argue that
existing techniques have not solved the sequentialization problem of
mapping VSDGs back to CFGs. We propose a new compiler architecture
breaking sequentialization into several stages which focus on
different characteristics of the input VSDG, and tend to be
concerned with different properties of the output and target
machine. The stages integrate a wide variety of important
optimizations, exploit opportunities offered by the VSDG to address
many common phase-order problems, and unify many operations
previously considered distinct.
Focusing on branch-intensive code, we demonstrate how effective
control flow—sometimes superior to that of the original source code,
and comparable to the best CFG optimization techniques—can be
reconstructed from just the dataflow information comprising the
VSDG. Further, a wide variety of more invasive optimizations
involving the duplication and specialization of program elements are
eased because the VSDG relaxes the CFG’s overspecification of
instruction and branch ordering. Specifically we identify the
optimization of nested branches as generalizing the problem of
minimizing boolean expressions.
We conclude that it is now practical to discard the control flow
information rather than maintain it in parallel as is done in many
previous approaches (e.g. the PDG).
Covert channel vulnerabilities in anonymity
systemsMurdoch, Steven J.University of Cambridge, Computer Laboratory2007-12enTextUCAM-CL-TR-706ISSN 1476-2986
The spread of wide-scale Internet surveillance has spurred interest
in anonymity systems that protect users’ privacy by restricting
unauthorised access to their identity. This requirement can be
considered as a flow control policy in the well established field of
multilevel secure systems. I apply previous research on covert
channels (unintended means to communicate in violation of a security
policy) to analyse several anonymity systems in an innovative way.
One application for anonymity systems is to prevent collusion in
competitions. I show how covert channels may be exploited to violate
these protections and construct defences against such attacks,
drawing from previous covert channel research and
collusion-resistant voting systems.
In the military context, for which multilevel secure systems were
designed, covert channels are increasingly eliminated by physical
separation of interconnected single-role computers. Prior work on
the remaining network covert channels has been solely based on
protocol specifications. I examine some protocol implementations and
show how the use of several covert channels can be detected and how
channels can be modified to resist detection.
I show how side channels (unintended information leakage) in
anonymity networks may reveal the behaviour of users. While drawing
on previous research on traffic analysis and covert channels, I
avoid the traditional assumption of an omnipotent adversary. Rather,
these attacks are feasible for an attacker with limited access to
the network. The effectiveness of these techniques is demonstrated
by experiments on a deployed anonymity network, Tor.
Finally, I introduce novel covert and side channels which exploit
thermal effects. Changes in temperature can be remotely induced
through CPU load and measured by their effects on crystal clock
skew. Experiments show this to be an effective attack against Tor.
This side channel may also be usable for geolocation and, as a
covert channel, can cross supposedly infallible air-gap security
boundaries.
This thesis demonstrates how theoretical models and generic
methodologies relating to covert channels may be applied to find
practical solutions to problems in real-world anonymity systems.
These findings confirm the existing hypothesis that covert channel
analysis, vulnerabilities and defences developed for multilevel
secure systems apply equally well to anonymity systems.
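The thermal channel above relies on estimating clock skew remotely from timestamps. The core estimation step is a linear fit of observed timestamp offset against local time; induced CPU load then shifts the fitted slope between measurement windows. A sketch with synthetic data (numpy assumed; all parameters are illustrative):

    import numpy as np

    # Synthetic probes of a remote clock running 50 ppm fast, with jitter.
    rng = np.random.default_rng(2)
    local_t = np.arange(0.0, 3600.0, 10.0)        # one probe every 10 s
    offsets = 50e-6 * local_t + rng.normal(0.0, 2e-3, local_t.size)

    # Skew is the slope of offset vs. time; repeating the fit over
    # successive windows would reveal temperature-induced changes.
    skew_ppm = np.polyfit(local_t, offsets, 1)[0] * 1e6
    print(f"estimated skew: {skew_ppm:.1f} ppm")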
Complexity-effective superscalar embedded processors using
instruction-level distributed processingCaulfield, IanUniversity of Cambridge, Computer Laboratory2007-12enTextUCAM-CL-TR-707ISSN 1476-2986
Modern trends in mobile and embedded devices require ever increasing
levels of performance, while maintaining low power consumption and
silicon area usage. This thesis presents a new architecture for a
high-performance embedded processor, based upon the
instruction-level distributed processing (ILDP) methodology. A
qualitative analysis of the complexity of an ILDP implementation as
compared to both a typical scalar RISC CPU and a superscalar design
is provided, which shows that the ILDP architecture eliminates or
greatly reduces the size of a number of structures present in a
superscalar architecture, allowing its complexity and power
consumption to compare favourably with a simple scalar design.
The performance of an implementation of the ILDP architecture is
compared to some typical processors used in high-performance
embedded systems. The effect on performance of a number of the
architectural parameters is analysed, showing that many of the
parallel structures used within the processor can be scaled to
provide less parallelism with little cost to the overall
performance. In particular, the size of the register file can be
greatly reduced with little average effect on performance – a size
of 32 registers, with 16 visible in the instruction set, is shown to
provide a good trade-off between area/power and performance.
Several novel developments to the ILDP architecture are then
described and analysed. Firstly, a scheme to halve the number of
processing elements and thus greatly reduce silicon area and power
consumption is outlined but proves to result in a 12–14% drop in
performance. Secondly, a method to reduce the area and power
requirements of the memory logic in the architecture is presented
which can achieve similar performance to the original architecture
with a large reduction in area and power requirements or, at an
increased area/power cost, can improve performance by approximately
24%. Finally, a new organisation for the register file is proposed,
which reduces the silicon area used by the register file by
approximately three-quarters and allows even greater power savings,
especially in the case where processing elements are power gated.
Overall, it is shown that the ILDP methodology is a viable approach
for future embedded system design, and several new variants on the
architecture are contributed. Several areas of useful future
research are highlighted, especially with respect to compiler design
for the ILDP paradigm.
IDRM: Inter-Domain Routing Protocol for Mobile Ad Hoc
NetworksChau, Chi-KinCrowcroft, JonLee, Kang-WonWong, Starsky H.Y.University of Cambridge, Computer Laboratory2008-01enTextUCAM-CL-TR-708ISSN 1476-2986
Inter-domain routing is an important component to allow
interoperation among heterogeneous network domains operated by
different organizations. Although inter-domain routing has been
extensively studied in the Internet, it remains relatively
unexplored in the Mobile Ad Hoc Networks (MANETs) space. In MANETs,
the inter-domain routing problem is challenged by: (1) dynamic
network topology, and (2) diverse intra-domain ad hoc routing
protocols. In this paper, we propose a networking protocol called
IDRM (Inter-Domain Routing Protocol for MANETs) to enable
interoperation among MANETs. IDRM can handle the dynamic nature of
MANETs and support policy-based routing similarly to BGP. We first
discuss the design challenges for inter-domain routing in MANETs,
and then present the design of IDRM with illustrative examples.
Finally, we present a simulation-based study to understand the
operational effectiveness of inter-domain routing and show that the
overhead of IDRM is moderate.
Protocols and technologies for security in pervasive
computing and communicationsWong, Ford LongUniversity of Cambridge, Computer Laboratory2008-01enTextUCAM-CL-TR-709ISSN 1476-2986
As the state-of-the-art edges towards Mark Weiser’s vision of
ubiquitous computing (ubicomp), we find that we have to revise some
previous assumptions about security engineering for this domain.
Ubicomp devices have to be networked together to be able to realize
their promise. To communicate securely amongst themselves, they have
to establish secret session keys, but this is a difficult problem
when this is done primarily over radio in an ad-hoc scenario, i.e.
without the aid of an infrastructure (such as a PKI), and when it is
assumed that the devices are resource-constrained and cannot perform
complex calculations. Secondly, when ubicomp devices are carried by
users as personal items, their permanent identifiers inadvertently
allow the users to be tracked, to the detriment of user privacy.
Unless there are deliberate improvements in designing for location
privacy, ubicomp devices can be trivially detected, and linked to
individual users, with discomfiting echoes of a surveillance
society. Our findings and contributions are as follows. In
considering session key establishment, we learnt that asymmetric
cryptography is not axiomatically infeasible, and may in fact be
essential, to counter possible attackers, for some of the more
computationally capable (and important) devices. We next found
existing attacker models to be inadequate, along with existing
models of bootstrapping security associations, in ubicomp. We
address the inadequacies with a contribution which we call:
‘multi-channel security protocols’, which leverage multiple
channels, with different properties, that exist in the ubicomp
environment. We came to appreciate that location
privacy is really a multi-layer problem, particularly so in ubicomp,
where an attacker often may have access to different layers. Our
contributions in this area are to advance the design for location
privacy by introducing a MAC-layer proposal with stronger
unlinkability, and a physical-layer proposal with stronger
unobservability.
Event structures with persistenceBrace-Evans, Lucy G.University of Cambridge, Computer Laboratory2008-02enTextUCAM-CL-TR-710ISSN 1476-2986
Increasingly, the style of computation is changing. Instead of one
machine running a program sequentially, we have systems with many
individual agents running in parallel. The need for mathematical
models of such computations is therefore ever greater.
There are many models of concurrent computations. Such models can,
for example, provide a semantics to process calculi and thereby
suggest behavioural equivalences between processes. They are also
key to the development of automated tools for reasoning about
concurrent systems. In this thesis we explore some applications and
generalisations of one particular model – event structures. We
describe a variety of kinds of morphism between event structures.
Each kind expresses a different sort of behavioural relationship. We
demonstrate the way in which event structures can model both
processes and types of processes by recalling a semantics for Affine
HOPLA, a higher order process language. This is given in terms of
asymmetric spans of event structures. We show that such spans
support a trace construction. This allows the modelling of feedback
and suggests a semantics for non-deterministic dataflow processes in
terms of spans. The semantics given is shown to be consistent with
Kahn’s fixed point construction when we consider spans modelling
deterministic processes.
A generalisation of event structures to include persistent events is
proposed. Based on previously described morphisms between classical
event structures, we define several categories of event structures
with persistence. We show that, unlike for the corresponding
categories of classical event structures, all are isomorphic to
Kleisli categories of monads on the most restricted category.
Amongst other things, this provides us with a way of understanding
the asymmetric spans mentioned previously as symmetric spans where
one morphism is modified by a monad. Thus we provide a general
setting for future investigations involving event structures.
Thinking inside the box: system-level failures of tamper
proofingDrimer, SaarMurdoch, Steven J.Anderson, RossUniversity of Cambridge, Computer Laboratory2008-02enTextUCAM-CL-TR-711ISSN 1476-2986
PIN entry devices (PEDs) are critical security components in EMV
smartcard payment systems as they receive a customer’s card and PIN.
Their approval is subject to an extensive suite of evaluation and
certification procedures. In this paper, we demonstrate that the
tamper proofing of PEDs is unsatisfactory, as is the certification
process. We have implemented practical low-cost attacks on two
certified, widely-deployed PEDs – the Ingenico i3300 and the Dione
Xtreme. By tapping inadequately protected smartcard communications,
an attacker with basic technical skills can expose card details and
PINs, leaving cardholders open to fraud. We analyze the
anti-tampering mechanisms of the two PEDs and show that, while the
specific protection measures mostly work as intended, critical
vulnerabilities arise because of the poor integration of
cryptographic, physical and procedural protection. As these
vulnerabilities illustrate a systematic failure in the design
process, we propose a methodology for doing it better in the future.
They also demonstrate a serious problem with the Common Criteria. We
discuss the incentive structures of the certification process, and
show how they can lead to problems of the kind we identified.
Finally, we recommend changes to the Common Criteria framework in
light of the lessons learned.
An abridged version of this paper is to appear at the IEEE Symposium
on Security and Privacy, May 2008, Oakland, CA, US.
Flash-exposure high dynamic range imaging: virtual
photography and depth-compensating flashRichardt, ChristianUniversity of Cambridge, Computer Laboratory2008-03enTextUCAM-CL-TR-712ISSN 1476-2986
I present a revised approach to flash-exposure high dynamic range
(HDR) imaging and demonstrate two applications of this image
representation. The first application enables the creation of
realistic ‘virtual photographs’ for arbitrary flash-exposure
settings, based on a single flash-exposure HDR image. The second
application is a novel tone mapping operator for flash-exposure HDR
images based on the idea of an ‘intelligent flash’. It compensates
for the depth-related brightness fall-off occurring in flash
photographs by taking the ambient illumination into account.
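To give a feel for the second application: because flash irradiance falls off roughly as the inverse square of depth, pixels that the flash barely reached can be detected by comparing the flash and ambient exposures and boosted accordingly. The following toy sketch is my own simplification on synthetic images; the operator in the report is considerably more principled.

    import numpy as np

    def compensate_falloff(flash, ambient, strength=1.0, eps=1e-4):
        # A low flash/ambient ratio suggests larger depth; boost such
        # pixels to counter the ~1/depth^2 brightness fall-off.
        ratio = (flash + eps) / (ambient + eps)
        gain = (ratio.mean() / ratio) ** strength
        return np.clip(flash * gain, 0.0, 1.0)

    rng = np.random.default_rng(3)
    ambient = rng.uniform(0.1, 0.4, (4, 4))
    depth = rng.uniform(1.0, 3.0, (4, 4))
    flash = np.clip(ambient + 0.8 / depth**2, 0.0, 1.0)
    print(compensate_falloff(flash, ambient).round(2))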
People are the network: experimental design and evaluation
of social-based forwarding algorithmsHui, PanUniversity of Cambridge, Computer Laboratory2008-03enTextUCAM-CL-TR-713ISSN 1476-2986
Cooperation binds but also divides human society into communities.
Members of the same community interact with each other
preferentially. There is structure in human society. Within society
and its communities, individuals have varying popularity. Some
people are more popular and interact with more people than others;
we may call them hubs. I develop methods to extract this kind of
social information from experimental traces and use it to choose the
next hop forwarders in Pocket Switched Networks (PSNs). I find that
by incorporating social information, forwarding efficiency can be
significantly improved. For practical reasons, I also develop
distributed algorithms for inferring communities.
Forwarding in Delay Tolerant Networks (DTNs), or more particularly
PSNs, is a challenging problem since human mobility is usually
difficult to predict. In this thesis, I aim to tackle this problem
using an experimental approach by studying real human mobility. I
perform six mobility experiments in different environments. The
resultant experimental datasets are valuable for the research
community. By analysing the experimental data, I find that the
inter-contact time of humans follows a power-law distribution with
coefficient smaller than 1 (over the range of 10 minutes to 1 day).
I study the limits of “oblivious” forwarding in the experimental
environment and also the impact of the power-law coefficient on
message delivery.
In order to study social-based forwarding, I develop methods to
infer human communities from the data and use these in the study of
social-aware forwarding. I propose several social-aware forwarding
schemes and evaluate them on different datasets. I find that by
combining community and centrality information, forwarding
efficiency can be significantly improved, and I call this scheme
BUBBLE forwarding with the analogy that each community is a BUBBLE
with big bubbles containing smaller bubbles. For practical
deployment of these algorithms, I propose distributed community
detection schemes, and also propose methods to approximate node
centrality in the system.
Besides the forwarding study, I also propose a layerless
data-centric architecture for the PSN scenario to address the
problem with the status quo in communication (e.g. an
infrastructure-dependent and synchronous API), which brings PSN one
step closer to real-world deployment.
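The combination of community and centrality information reduces, in essence, to a relay-selection rule: bubble the message up the global centrality hierarchy until it reaches a member of the destination's community, then climb the local (within-community) hierarchy. A heavily simplified sketch of that decision follows (the field names are my own, not the thesis's):

    def should_forward(carrier, candidate, dest_community):
        # BUBBLE-style relay choice, heavily simplified.
        carrier_in = dest_community in carrier["communities"]
        candidate_in = dest_community in candidate["communities"]
        if candidate_in and not carrier_in:
            return True                   # candidate enters the community
        if carrier_in:                    # climb the local hierarchy
            return candidate_in and \
                candidate["local_rank"] > carrier["local_rank"]
        return candidate["global_rank"] > carrier["global_rank"]

    carrier = {"communities": {"A"}, "global_rank": 3, "local_rank": 0}
    candidate = {"communities": {"A", "B"}, "global_rank": 5, "local_rank": 2}
    print(should_forward(carrier, candidate, dest_community="B"))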
A wide-area file system for migrating virtual
machinesMoreton, TimUniversity of Cambridge, Computer Laboratory2008-03enTextUCAM-CL-TR-714ISSN 1476-2986
Improvements in processing power and core bandwidth set against
fundamental constraints on wide-area latency increasingly emphasise
the position in the network at which services are deployed. The
XenoServer project is building a platform for distributed computing
that facilitates the migration of services between hosts to minimise
client latency and balance load in response to changing patterns of
demand. Applications run inside whole-system virtual machines,
allowing the secure multiplexing of host resources.
Since services are specified in terms of a complete root file system
and kernel image, a key component of this architecture is a
substrate that provides an abstraction akin to local disks for these
virtual machines, whether they are running, migrating or suspended.
However, the same combination of wide-area latency, constrained
bandwidth and global scale that motivates the XenoServer platform
itself impedes the location, management and rapid transfer of
storage between deployment sites. This dissertation describes Xest,
a novel wide-area file system that aims to address these challenges.
I examine Xest’s design, centred on the abstraction of virtual
disks, volumes that allow only a single writer yet are transparently
available despite migration. Virtual disks support the creation of
snapshots and may be rapidly forked into copies that can be modified
independently. This encourages an architectural separation into
node-local file system and global content distribution framework and
reduces the dependence of local operations on wide-area
interactions.
I then describe how Xest addresses the dual problem of latency and
scale by managing, caching, advertising and retrieving storage on
the basis of groups, sets of files that correspond to portions of
inferred working sets of client applications. Coarsening the
granularity of these interfaces further decouples local and global
activity: fewer units can lead to fewer interactions and the
maintenance of less addressing state. The precision of these
interfaces is retained by clustering according to observed access
patterns and, in response to evidence of poor clusterings,
selectively degrading groups into their constituent elements.
I evaluate a real deployment of Xest over a wide-area testbed. Doing
so entails developing new tools for capturing and replaying traces
to simulate virtual machine workloads. My results demonstrate the
practicality and high performance of my design and illustrate the
trade-offs involved in modifying the granularity of established
storage interfaces.
On using fuzzy data in security mechanismsHao, FengUniversity of Cambridge, Computer Laboratory2008-04enTextUCAM-CL-TR-715ISSN 1476-2986
Under the microscope, every physical object has unique features. It
is impossible to clone an object, reproducing exactly the same
physical traits. This unclonability principle has been applied in
many security applications. For example, the science of biometrics
is about measuring unique personal features. It can authenticate
individuals with a high level of assurance. Similarly, a paper
document can be identified by measuring its unique physical
properties, such as randomly-interleaving fiber structure.
Unfortunately, when physical measurements are involved, errors arise
inevitably and the obtained data are fuzzy by nature. This causes
two main problems: 1) fuzzy data cannot be used as a cryptographic
key, as cryptography demands the key be precise; 2) fuzzy data
cannot be sorted easily, which prevents efficient information
retrieval. In addition, biometric measurements create a strong
binding between a person and his unique features, which may conflict
with personal privacy. In this dissertation, we study these problems
in detail and propose solutions.
First, we propose a scheme to derive error-free keys from fuzzy
data, such as iris codes. There are two types of errors within iris
codes: background-noise errors and burst errors. Accordingly, we
devise a two-layer error correction technique, which first corrects
the background-noise errors using a Hadamard code, then the burst
errors using a Reed-Solomon code. Based on a database of 700 iris
images, we demonstrate that an error-free key of 140 bits can be
reliably reproduced from genuine iris codes with a 99.5% success
rate. In addition, despite the irrevocability of the underlying
biometric data, the keys produced using our technique can be easily
revoked or updated.
Second, we address the search problem for a large fuzzy database
that stores iris codes or data with a similar structure. Currently,
the algorithm used in all public deployments of iris recognition is
to search exhaustively through a database of iris codes, looking for
a match that is close enough. We propose a much more efficient
search algorithm: Beacon Guided Search (BGS). BGS works by indexing
iris codes, adopting a “multiple colliding segments principle” along
with an early termination strategy to reduce the search range
dramatically. We evaluate this algorithm using 632,500 real-world
iris codes, showing a substantial speed-up over exhaustive search
with a negligible loss of precision. In addition, we demonstrate
that our empirical findings match theoretical analysis.
Finally, we study the anonymous-veto problem, which is more commonly
known as the Dining Cryptographers problem: how to perform a secure
multiparty computation of the boolean-OR function, while preserving
the privacy of each input bit. The solution to this problem has
general applications in security going way beyond biometrics. Even
though there have been several solutions presented over the past 20
years, we propose a new solution called: Anonymous Veto Network
(AV-net). Compared with past work, the AV-net protocol provides the
strongest protection of each delegate’s privacy against collusion;
it requires only two rounds of broadcast, fewer than any other
solutions; the computational load and bandwidth usage are the lowest
among the available techniques; and our protocol does not require
any private channels or third parties. Overall, it seems unlikely
that, with the same underlying technology, there can be any other
solutions significantly more efficient than ours.
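The indexing idea behind Beacon Guided Search can be sketched in a few lines: slice each code into fixed-width "beacon" segments, index identities under each (position, value) pair, and stop a query as soon as enough segments collide. The sketch below uses toy 32-bit codes and a trivial collision threshold; the real algorithm operates on full iris codes and verifies candidates by Hamming distance.

    from collections import defaultdict

    SEG_BITS, MIN_COLLISIONS = 4, 2

    def segments(code, n_segments):
        # Split a bit-string (held as an int) into beacon values.
        mask = (1 << SEG_BITS) - 1
        return [(i, (code >> (i * SEG_BITS)) & mask)
                for i in range(n_segments)]

    class BeaconIndex:
        def __init__(self, n_segments):
            self.n_segments = n_segments
            self.index = defaultdict(list)   # (segment no., value) -> ids

        def add(self, ident, code):
            for beacon in segments(code, self.n_segments):
                self.index[beacon].append(ident)

        def search(self, probe):
            hits = defaultdict(int)
            for beacon in segments(probe, self.n_segments):
                for ident in self.index[beacon]:
                    hits[ident] += 1
                    if hits[ident] >= MIN_COLLISIONS:
                        return ident         # early termination
            return None

    db = BeaconIndex(n_segments=8)
    db.add("alice", 0xA5A51234)
    db.add("bob", 0x0F0FABCD)
    print(db.search(0xA5A51230))             # noisy probe still finds alice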
UpgradeJ: Incremental typechecking for class
upgradesBierman, Gavin M.Parkinson, Matthew J.Noble, JamesUniversity of Cambridge, Computer Laboratory2008-04enTextUCAM-CL-TR-716ISSN 1476-2986
One of the problems facing developers is the constant evolution of
components that are used to build applications. This evolution is
typical of any multi-person or multi-site software project. How can
we program in this environment? More precisely, how can language
design address such evolution? In this paper we attack two
significant issues that arise from constant component evolution: we
propose language-level extensions that permit multiple, co-existing
versions of classes and the ability to dynamically upgrade from one
version of a class to another, whilst still maintaining type safety
guarantees and requiring only lightweight extensions to the runtime
infrastructure. We show how our extensions, whilst intuitive,
provide a great deal of power by giving a number of examples. Given
the subtlety of the problem, we formalize a core fragment of our
language and prove a number of important safety properties.
Psychologically-based simulation of human
behaviourRymill, Stephen JulianUniversity of Cambridge, Computer Laboratory2008-06enTextUCAM-CL-TR-717ISSN 1476-2986
The simulation of human behaviour is a key area of computer graphics
as there is currently a great demand for animations consisting of
virtual human characters, ranging from film special effects to
building design. Currently, animated characters can either be
laboriously created by hand, or by using an automated system:
however, results from the latter may still look artificial and
require much further manual work.
The aim of this work is to improve the automated simulation of human
behaviour by making use of ideas from psychology research; the ways
in which this research has been used are made clear throughout this
thesis. It has influenced all aspects of the design:
• Collision avoidance techniques are based on observed practices.
• Actors have simulated vision and attention.
• Actors can be given a variety of moods and emotions to affect their behaviour.
This thesis discusses the benefits of the simulation of attention;
this technique recreates the eye movements of each actor, and allows
each actor to build up its own mental model of its surroundings. It
is this model that the actor then uses in its decisions on how to
behave: techniques for collision prediction and collision avoidance
are discussed. On top of this basic behaviour, variability is
introduced by allowing all actors to have different sets of moods
and emotions, which influence all aspects of their behaviour. The
real-time 3D simulation created to demonstrate the actors’ behaviour
is also described.
This thesis demonstrates that the use of techniques based on
psychology research leads to a qualitative and quantitative
improvement in the simulation of human behaviour; this is shown
through a variety of pictures and videos, and by results of
numerical experiments and user testing. Results are compared with
previous work in the field, and with real human behaviour.
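A standard building block for the collision prediction mentioned above is the time of closest approach between two actors assumed to move at constant velocity; if the predicted minimum separation is small, avoidance behaviour is triggered. A minimal sketch of that standard geometry (my own illustration, not the thesis's code):

    import numpy as np

    def time_to_closest_approach(p1, v1, p2, v2):
        # When will two actors moving at constant velocity be nearest?
        dp = np.asarray(p2, float) - np.asarray(p1, float)
        dv = np.asarray(v2, float) - np.asarray(v1, float)
        denom = dv @ dv
        t = 0.0 if denom == 0.0 else max(0.0, -(dp @ dv) / denom)
        return t, float(np.linalg.norm(dp + dv * t))

    t, d = time_to_closest_approach([0, 0], [1.2, 0], [10, 1], [-1.0, 0])
    print(f"closest approach in {t:.1f} s at separation {d:.1f} m")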
Cooperative attack and defense in distributed
networksMoore, TylerUniversity of Cambridge, Computer Laboratory2008-06enTextUCAM-CL-TR-718ISSN 1476-2986
The advance of computer networking has made cooperation essential to
both attackers and defenders. Increased decentralization of network
ownership requires devices to interact with entities beyond their
own realm of control. The distribution of intelligence forces
decisions to be taken at the edge. The exposure of devices makes
multiple, simultaneous attacker-chosen compromise a credible threat.
Motivation for this thesis derives from the observation that it is
often easier for attackers to cooperate than for defenders to do so.
I describe a number of attacks which exploit cooperation to
devastating effect. I also propose and evaluate defensive strategies
which require cooperation.
I first investigate the security of decentralized, or ‘ad-hoc’,
wireless networks. Many have proposed pre-loading symmetric keys
onto devices. I describe two practical attacks on these schemes.
First, attackers may compromise several devices and share the
pre-loaded secrets to impersonate legitimate users. Second, whenever
some keys are not pre-assigned but exchanged upon deployment, a
revoked attacker can rejoin the network.
I next consider defensive strategies where devices collectively
decide to remove a malicious device from the network. Existing
voting-based protocols are made resilient to the attacks I have
developed, and I propose alternative strategies that can be more
efficient and secure. First, I describe a reelection protocol which
relies on positive affirmation from peers to continue participation.
Then I describe a more radical alternative called suicide: a good
device removes a bad one unilaterally by declaring both devices
dead. Suicide offers significant improvements in speed and
efficiency compared to voting-based decision mechanisms. I then
apply suicide and voting to revocation in vehicular networks.
Next, I empirically investigate attack and defense in another
context: phishing attacks on the Internet. I have found evidence
that one group responsible for half of all phishing, the rock-phish
gang, cooperates by pooling hosting resources and by targeting many
banks simultaneously. These cooperative attacks are shown to be far
more effective.
I also study the behavior of defenders – banks and Internet service
providers – who must cooperate to remove malicious sites. I find
that phishing-website lifetimes follow a long-tailed lognormal
distribution. While many sites are removed quickly, others remain
much longer. I examine several feeds from professional ‘take-down’
companies and find that a lack of data sharing helps many phishing
sites evade removal for long time periods.
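A lognormal lifetime model of the kind reported here is easy to fit
and summarise; the sketch below uses invented lifetimes purely for
illustration, not data from the study.

    import numpy as np

    # Hypothetical phishing-site lifetimes in hours; a lognormal model
    # treats log-lifetimes as normally distributed.
    lifetimes = np.array([4.0, 6.5, 9.0, 12.0, 20.0, 55.0, 240.0, 1100.0])

    log_t = np.log(lifetimes)
    mu, sigma = log_t.mean(), log_t.std(ddof=1)

    median = np.exp(mu)                # the typical site dies quickly...
    mean = np.exp(mu + sigma**2 / 2)   # ...but the tail inflates the mean
    print(f"median {median:.1f} h, mean {mean:.1f} h")

The gap between median and mean is the signature of the long tail:
most sites are removed promptly, while the few long-lived survivors
dominate the total exposure.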
One anti-phishing organization has relied on volunteers to submit
and verify suspected phishing sites. I find its voting-based
decision mechanism to be slower and less comprehensive than
unilateral verification performed by companies. I also note that the
distribution of user participation is highly skewed, leaving the
scheme vulnerable to manipulation.
The Intelligent Book: technologies for intelligent and
adaptive textbooks, focussing on Discrete MathematicsBillingsley, William H.University of Cambridge, Computer Laboratory2008-06enTextUCAM-CL-TR-719ISSN 1476-2986
An “Intelligent Book” is a Web-based textbook that contains
exercises that are backed by computer models or reasoning systems.
Within the exercises, students work using appropriate graphical
notations and diagrams for the subject matter, and comments and
feedback from the book are related into the content model of the
book. The content model can be extended by its readers. This
dissertation examines the question of how to provide an Intelligent
Book that can support undergraduate questions in Number Theory, and
particularly questions that allow the student to write a proof as
the answer. Number Theory questions pose a challenge not only
because the student is working on an unfamiliar topic in an
unfamiliar syntax, but also because there is no straightforward
procedure for how to prove an arbitrary Number Theory problem.
The main contribution is a system for supporting student-written
proof exercises, backed by the Isabelle/HOL automated proof
assistant and a set of teaching scripts. Students write proofs using
MathsTiles: a graphical notation consisting of composable tiles,
each of which can contain an arbitrary piece of mathematics or logic
written by the teacher. These tiles resemble parts of the proof as
it might be written on paper, and are translated into
Isabelle/HOL’s Isar syntax on the server. Unlike traditional
syntax-directed editors, MathsTiles allow students to freely sketch
out parts of an answer and do not constrain the order in which an
answer is written. They also allow details of the language to change
between or even during questions.
A number of smaller contributions are also presented. By using the
dynamic nature of MathsTiles, a type of proof exercise is developed
where the student must search for the statements he or she wishes to
use. This allows questions to be supported by informal modelling,
making them much easier to write, but still ensures that the
interface does not act as a prop for the answer. The concept of
searching for statements is extended to develop “massively multiple
choice” questions: a mid-point between the multiple choice and short
answer formats. The question architecture that is presented is
applicable across different notational forms and different answer
analysis techniques. The content architecture uses an informal
ontology that enables students and untrained users to add and adapt
content within the book, including adding their own chapters, while
ensuring the content can also be referred to by the models and
systems that advise students during exercises.
A capability-based access control architecture for
multi-domain publish/subscribe systemsPesonen, Lauri I.W.University of Cambridge, Computer Laboratory2008-06enTextUCAM-CL-TR-720ISSN 1476-2986
Publish/subscribe is emerging as the favoured communication paradigm
for large-scale, wide-area distributed systems. The
publish/subscribe many-to-many interaction model together with
asynchronous messaging provides an efficient transport for highly
distributed systems in high latency environments with direct
peer-to-peer interactions amongst the participants.
Decentralised publish/subscribe systems implement the event service
as a network of event brokers. The broker network makes the system
more resilient to failures and allows it to scale up efficiently as
the number of event clients increases. In many cases such
distributed systems will only be feasible when implemented over the
Internet as a joint effort spanning multiple administrative domains.
The participating members will benefit from the federated event
broker networks both with respect to the size of the system as well
as its fault-tolerance.
Large-scale, multi-domain environments require access control; users
will have different privileges for sending and receiving instances
of different event types. Therefore, we argue that access control is
vital if decentralised publish/subscribe systems spanning multiple
independent administrative domains are ever to be deployed at large
scale.
This dissertation presents MAIA, an access control mechanism for
decentralised, type-based publish/subscribe systems. While the work
concentrates on type-based publish/subscribe the contributions are
equally applicable to both topic and content-based publish/subscribe
systems.
Access control in distributed publish/subscribe requires secure,
distributed naming, and mechanisms for enforcing access control
policies. The first contribution of this thesis is a mechanism for
names to be referenced unambiguously from policy without risk of
forgeries. The second contribution is a model describing how signed
capabilities can be used to grant domains and their members access
rights to event types in a scalable and expressive manner. The third
contribution is a model for enforcing access control in the
decentralised event service by encrypting event content.
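The flavour of the second contribution can be suggested with a
minimal capability check; the encoding, the cryptography (an HMAC
stands in for a real signature scheme) and all names below are
illustrative assumptions, not MAIA's actual design.

    import hmac, hashlib, json

    AUTHORITY_KEY = b"type-owner-secret"   # hypothetical signing key

    def make_capability(domain, event_type, rights):
        # A capability names a domain, an event type and granted rights.
        body = json.dumps({"domain": domain, "type": event_type,
                           "rights": sorted(rights)}).encode()
        tag = hmac.new(AUTHORITY_KEY, body, hashlib.sha256).hexdigest()
        return body, tag

    def may_publish(capability, domain, event_type):
        body, tag = capability
        genuine = hmac.compare_digest(
            tag, hmac.new(AUTHORITY_KEY, body, hashlib.sha256).hexdigest())
        claims = json.loads(body)
        return (genuine and claims["domain"] == domain
                and claims["type"] == event_type
                and "publish" in claims["rights"])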
We illustrate the design and implementation of MAIA with a running
example of the UK Police Information Technology Organisation and the
UK police forces.
Investigating classification for natural language processing
tasksMedlock, Ben W.University of Cambridge, Computer Laboratory2008-06enTextUCAM-CL-TR-721ISSN 1476-2986
This report investigates the application of classification
techniques to four natural language processing (NLP) tasks. The
classification paradigm falls within the family of statistical and
machine learning (ML) methods and consists of a framework within
which a mechanical ‘learner’ induces a functional mapping between
elements drawn from a particular sample space and a set of
designated target classes. It is applicable to a wide range of NLP
problems and has met with a great deal of success due to its
flexibility and firm theoretical foundations.
The first task we investigate, topic classification, is firmly
established within the NLP/ML communities as a benchmark application
for classification research. Our aim is to arrive at a deeper
understanding of how class granularity affects classification
accuracy and to assess the impact of representational issues on
different classification models. Our second task, content-based spam
filtering, is a highly topical application for classification
techniques due to the ever-worsening problem of unsolicited email.
We assemble a new corpus and formulate a state-of-the-art classifier
based on structured language model components. Thirdly, we introduce
the problem of anonymisation, which has received little attention to
date within the NLP community. We define the task in terms of
obfuscating potentially sensitive references to real world entities
and present a new publicly-available benchmark corpus. We explore
the implications of the subjective nature of the problem and present
an interactive model for anonymising large quantities of data based
on syntactic analysis and active learning. Finally, we investigate
the task of hedge classification, a relatively new application which
is currently of growing interest due to the expansion of research
into the application of NLP techniques to scientific literature for
information extraction. A high level of annotation agreement is
obtained using new guidelines and a new benchmark corpus is made
publicly available. As part of our investigation, we develop a
probabilistic model for training data acquisition within a
semi-supervised learning framework which is explored both
theoretically and experimentally.
Throughout the report, many common themes of fundamental importance
to classification for NLP are addressed, including sample
representation, performance evaluation, learning model selection,
linguistically-motivated feature engineering, corpus construction
and real-world application.
Energy-efficient sentient computingEyole-Monono, MbouUniversity of Cambridge, Computer Laboratory2008-07enTextUCAM-CL-TR-722ISSN 1476-2986
In a bid to improve the interaction between computers and humans, it
is becoming necessary to make increasingly larger deployments of
sensor networks. These clusters of small electronic devices can be
embedded in our surroundings and can detect and react to physical
changes. They will make computers more proactive in general by
gathering and interpreting useful information about the physical
environment through a combination of measurements. Increasing the
performance of these devices will mean more intelligence can be
embedded within the sensor network. However, most conventional ways
of increasing performance often come with the burden of increased
power dissipation which is not an option for energy-constrained
sensor networks. This thesis proposes, develops and tests a design
methodology for performing greater amounts of processing within a
sensor network while satisfying the requirement for low energy
consumption. The crux of the thesis is that there is a great deal of
concurrency present in sensor networks which, when combined with a
tightly-coupled group of small, fast, energy-conscious processors,
can result in a significantly more efficient network. The
construction of a multiprocessor system aimed at sensor networks is
described in detail. It is shown that a routine critical to sensor
networks can be sped up with the addition of a small set of
primitives. The need for a very fast inter-processor communication
mechanism is highlighted, and the hardware scheduler developed as
part of this effort forms the cornerstone of the new sentient
computing framework by facilitating thread operations and minimising
the time required for context-switching. The experimental results
also show that end-to-end latency can be reduced in a flexible way
through multiprocessing.
Animation manifolds for representing topological
alterationSouthern, RichardUniversity of Cambridge, Computer Laboratory2008-07enTextUCAM-CL-TR-723ISSN 1476-2986
An animation manifold encapsulates an animation sequence of surfaces
contained within a higher dimensional manifold with one dimension
being time. An iso-surface extracted from this structure is a frame
of the animation sequence.
In this dissertation I make an argument for the use of animation
manifolds as a representation of complex animation sequences. In
particular animation manifolds can represent transitions between
shapes with differing topological structure and polygonal density.
I introduce the animation manifold, and show how it can be
constructed from a keyframe animation sequence and rendered using
raytracing or graphics hardware. I then adapt three Laplacian
editing frameworks to the higher dimensional context. I derive new
boundary conditions for both primal and dual Laplacian methods, and
present a technique to adaptively regularise the sampling of a
deformed manifold after editing.
The animation manifold can be used to represent a morph sequence
between surfaces of arbitrary topology. I present a novel framework
for achieving this by connecting planar cross sections in a higher
dimension with a new constrained Delaunay triangulation. Topological
alteration is achieved by using the Voronoi skeleton, a novel
structure which provides a fast medial axis approximation.
Bayesian inference for latent variable modelsPaquet, UlrichUniversity of Cambridge, Computer Laboratory2008-07enTextUCAM-CL-TR-724ISSN 1476-2986
Bayes’ theorem is the cornerstone of statistical inference. It
provides the tools for dealing with knowledge in an uncertain world,
allowing us to explain observed phenomena through the refinement of
belief in model parameters. At the heart of this elegant framework
lie intractable integrals, whether in computing an average over some
posterior distribution, or in determining the normalizing constant
of a distribution. This thesis examines both deterministic and
stochastic methods in which these integrals can be treated. Of
particular interest shall be parametric models where the parameter
space can be extended with additional latent variables to get
distributions that are easier to handle algorithmically.
Deterministic methods approximate the posterior distribution with a
simpler distribution over which the required integrals become
tractable. We derive and examine a new generic α-divergence message
passing scheme for a multivariate mixture of Gaussians, a particular
modeling problem requiring latent variables. This algorithm
minimizes local α-divergences over a chosen posterior factorization,
and includes variational Bayes and expectation propagation as
special cases.
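For reference, the α-divergence family referred to here is commonly
written as follows (this is the standard textbook form, not quoted
from the thesis):

\[
D_\alpha(p \,\|\, q) \;=\; \frac{1}{\alpha(1-\alpha)}
\left( 1 - \int p(x)^{\alpha}\, q(x)^{1-\alpha}\, dx \right),
\]

whose limits α → 0 and α → 1 recover the two Kullback–Leibler
divergences, corresponding to the variational Bayes and expectation
propagation special cases mentioned above, while α = 1/2 yields a
symmetric divergence proportional to the squared Hellinger distance.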
Stochastic (or Monte Carlo) methods rely on a sample from the
posterior to simplify the integration tasks, giving exact estimates
in the limit of an infinite sample. Parallel tempering and
thermodynamic integration are introduced as ‘gold standard’ methods
to sample from multimodal posterior distributions and determine
normalizing constants. A parallel tempered approach to sampling from
a mixture of Gaussians posterior through Gibbs sampling is derived,
and novel methods are introduced to improve the numerical stability
of thermodynamic integration.
A full comparison with parallel tempering and thermodynamic
integration shows variational Bayes, expectation propagation, and
message passing with the Hellinger distance α = 1/2 to be perfectly
suitable for model selection, and for approximating the predictive
distribution with high accuracy.
Variational and stochastic methods are combined in a novel way to
design Markov chain Monte Carlo (MCMC) transition densities, giving
a variational transition kernel, which lower bounds an exact
transition kernel. We highlight the general need to mix variational
methods with other MCMC moves, by proving that the variational
kernel does not necessarily give a geometrically ergodic chain.
Beyond node degree: evaluating AS topology modelsHaddadi, HamedFay, DamienJamakovic, AlmerimaMaennel, OlafMoore, Andrew W.Mortier, RichardRio, MiguelUhlig, SteveUniversity of Cambridge, Computer Laboratory2008-07enTextUCAM-CL-TR-725ISSN 1476-2986
Many models have been proposed to generate Internet Autonomous
System (AS) topologies, most of which make structural assumptions
about the AS graph. In this paper we compare AS topology generation
models with several observed AS topologies. In contrast to most
previous works, we avoid making assumptions about which topological
properties are important to characterize the AS topology. Our
analysis shows that, although matching degree-based properties, the
existing AS topology generation models fail to capture the
complexity of the local interconnection structure between ASs.
Furthermore, we use BGP data from multiple vantage points to show
that additional measurement locations significantly affect local
structure properties, such as clustering and node centrality.
Degree-based properties, however, are not notably affected by
additional measurement locations. These observations are
particularly valid in the core. The shortcomings of AS topology
generation models stem from an underestimation of the complexity of
the connectivity in the core caused by inappropriate use of BGP
data.
Modular fine-grained concurrency verificationVafeiadis, ViktorUniversity of Cambridge, Computer Laboratory2008-07enTextUCAM-CL-TR-726ISSN 1476-2986
Traditionally, concurrent data structures are protected by a single
mutual exclusion lock so that only one thread may access the data
structure at any time. This coarse-grained approach makes it
relatively easy to reason about correctness, but it severely limits
parallelism. More advanced algorithms instead perform
synchronisation at a finer grain. They employ sophisticated
synchronisation schemes (both blocking and non-blocking) and are
usually written in low-level languages such as C.
This dissertation addresses the formal verification of such
algorithms. It proposes techniques that are modular (and hence
scalable), easy for programmers to use, and yet powerful enough to
verify complex algorithms. In doing so, it makes two theoretical and
two practical contributions to reasoning about fine-grained
concurrency.
First, building on rely/guarantee reasoning and separation logic, it
develops a new logic, RGSep, that subsumes these two logics and
enables simple, modular proofs of fine-grained concurrent algorithms
that use complex dynamically allocated data structures and may
explicitly deallocate memory. RGSep allows for ownership-based
reasoning and ownership transfer between threads, while maintaining
the expressiveness of binary relations to describe inter-thread
interference.
Second, it describes techniques for proving linearisability, the
standard correctness condition for fine-grained concurrent
algorithms. The main proof technique is to introduce auxiliary
single-assignment variables to capture the linearisation point and
to inline the abstract effect of the program at that point as
auxiliary code.
Third, it demonstrates this approach by proving linearisability of a
collection of concurrent list and stack algorithms, as well as
providing the first correctness proofs of the RDCSS and MCAS
implementations of Harris et al.
Finally, it describes a prototype safety checker, SmallfootRG, for
fine-grained concurrent programs that is based on RGSep. SmallfootRG
proves simple safety properties for a number of list and stack
algorithms and verifies the absence of memory leaks.
A novel auto-calibration system for wireless sensor
motesLiu, RuoshuiWassell, Ian J.University of Cambridge, Computer Laboratory2008-09enTextUCAM-CL-TR-727ISSN 1476-2986
In recent years, Wireless Sensor Networks (WSNs) research has
undergone a quiet revolution, providing a new paradigm for sensing
and disseminating information from various environments. In reality,
the wireless propagation channel in many harsh environments has a
significant impact on the coverage range and quality of the radio
links between the wireless nodes (motes). Therefore, the use of
diversity techniques (e.g., frequency diversity and spatial
diversity) must be considered to ameliorate the notoriously variable
and unpredictable point-to-point radio communication links. However,
in order to determine the space and frequency diversity
characteristics of the channel, accurate measurements need to be
made. The most representative and inexpensive solution is to use the
motes themselves; however, they suffer from poor accuracy owing to
their low cost and compromised radio frequency (RF) performance.
In this report we present a novel automated calibration system for
characterising mote RF performance. The proposed strategy provides
us with good knowledge of the actual mote transmit power, RSSI
characteristics and receive sensitivity by establishing calibration
tables for transmitting and receiving mote pairs over their
operating frequency range. The validated results show that our
automated calibration system can improve RSSI accuracy by ±1.5 dB.
In addition, measurements of the mote transmit
power show a significant difference from that claimed in the
manufacturer’s data sheet. The proposed calibration method can also
be easily applied to wireless sensor motes from virtually any
vendor, provided they have an RF connector.
A robust efficient algorithm for point location in
triangulationsBrown, Peter J.C.Faigle, Christopher T.University of Cambridge, Computer Laboratory1997-02enTextUCAM-CL-TR-728ISSN 1476-2986
This paper presents a robust alternative to previous approaches to
the problem of point location in triangulations represented using
the quadedge data structure. We generalise the reasons for the
failure of an earlier routine to terminate when applied to certain
non-Delaunay triangulations. This leads to our new deterministic
algorithm which we prove is guaranteed to terminate. We also present
a novel heuristic for choosing a starting edge for point location
queries and show that this greatly enhances the efficiency of point
location for the general case.
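A minimal sketch of an oriented walk conveys the setting; the
triangle and edge interfaces are invented, and the paper's actual
algorithm (on the quadedge structure, with its termination proof)
differs in its details.

    # Sketch of oriented-walk point location in a triangulation.
    def orient(a, b, p):
        # Twice the signed area of (a, b, p); > 0 means p lies left of ab.
        return (b[0]-a[0])*(p[1]-a[1]) - (b[1]-a[1])*(p[0]-a[0])

    def locate(start_triangle, p):
        tri, visited = start_triangle, set()
        while True:
            if tri in visited:       # naive guard: non-Delaunay inputs
                raise RuntimeError   # can make an unguarded walk cycle
            visited.add(tri)
            for edge in tri.edges:   # hypothetical interface
                if orient(edge.src, edge.dst, p) < 0:
                    tri = edge.neighbour   # step across the violated edge
                    break
            else:
                return tri   # no edge violated: p is in this triangle

The naive guard above merely detects cycling; the point of the
paper's deterministic algorithm is to choose the crossing edge so
that such cycles provably cannot arise.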
Weighted spectral distributionFay, DamienHaddadi, HamedUhlig, SteveMoore, Andrew W.Mortier, RichardJamakovic, AlmerimaUniversity of Cambridge, Computer Laboratory2008-09enTextUCAM-CL-TR-729ISSN 1476-2986
Comparison of graph structures is a frequently encountered problem
across a number of problem domains. Comparing graphs requires a
metric to discriminate which features of the graphs are considered
important. The spectrum of a graph is often claimed to contain all
the information within a graph, but the raw spectrum contains too
much information to be directly used as a useful metric. In this
paper we introduce a metric, the weighted spectral distribution,
that improves on the raw spectrum by discounting those eigenvalues
believed to be unimportant and emphasizing the contribution of those
believed to be important.
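The published definition takes broadly the following form
(paraphrased; the binning and the choice of N follow the authors'
paper and may differ in detail):

\[
d(G_1, G_2) \;=\; \sum_{k \in K} (1-k)^{N}
\bigl( f_1(\lambda = k) - f_2(\lambda = k) \bigr)^{2},
\]

where f_i is the binned eigenvalue distribution of the normalised
Laplacian of graph G_i (eigenvalues lie in [0, 2]), and the weight
(1−k)^N suppresses eigenvalues near 1, which carry little structural
information, while emphasising those towards the ends of the
spectrum; sums of (1−λ)^N are related to counts of N-step cycles in
the graph.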
We use this metric to optimize the selection of parameter values for
generating Internet topologies. Our metric leads to parameter
choices that appear sensible given prior knowledge of the problem
domain: the resulting choices are close to the default values of the
topology generators and, in the case of the AB generator, fall
within the expected region. This metric provides a means for
meaningfully optimizing parameter selection when generating
topologies intended to share structure with, but not match exactly,
measured graphs.
Adaptive evaluation of non-strict programsEnnals, Robert J.University of Cambridge, Computer Laboratory2008-08enTextUCAM-CL-TR-730ISSN 1476-2986
Most popular programming languages are strict. In a strict language,
the binding of a variable to an expression coincides with the
evaluation of the expression.
Non-strict languages attempt to make life easier for programmers by
decoupling expression binding and expression evaluation. In a
non-strict language, a variable can be bound to an unevaluated
expression, and such expressions can be passed around just like
values in a strict language. This separation allows the programmer
to declare a variable at the point that makes most logical sense,
rather than at the point at which its value is known to be needed.
Non-strict languages are usually evaluated using a technique called
Lazy Evaluation. Lazy Evaluation will only evaluate an expression
when its value is known to be needed. While Lazy Evaluation
minimises the total number of expressions evaluated, it imposes a
considerable bookkeeping overhead, and has unpredictable space
behaviour.
In this thesis, we present a new evaluation strategy which we call
Optimistic Evaluation. Optimistic Evaluation blends lazy and eager
evaluation under the guidance of an online profiler. The online
profiler observes the running program and decides which expressions
should be evaluated lazily, and which should be evaluated eagerly.
We show that the worst case performance of Optimistic Evaluation
relative to Lazy Evaluation can be bounded with an upper bound
chosen by the user. Increasing this upper bound allows the profiler
to take greater risks and potentially achieve better average
performance.
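A toy model in Python conveys the idea of blending the two
strategies; the real system operates on compiled Haskell with an
online profiler and an abort mechanism, and the verdict table below
is an invented stand-in for the profiler.

    # Toy contrast of lazy vs. optimistic binding.
    class Thunk:
        def __init__(self, expr):
            self.expr, self.value, self.forced = expr, None, False
        def force(self):
            if not self.forced:        # evaluate at most once, on demand
                self.value, self.forced = self.expr(), True
            return self.value

    SPECULATE = {"cheap": True, "expensive": False}  # profiler's verdict

    def bind(name, expr):
        # Optimistic evaluation: evaluate eagerly when the profiler has
        # judged this binding site cheap, otherwise build a lazy thunk.
        if SPECULATE.get(name, False):
            value = expr()             # eager: pay the cost now
            return Thunk(lambda: value)
        return Thunk(expr)             # lazy: defer until demanded

The user-chosen bound mentioned above caps how much eager work can be
wasted relative to pure laziness, which is what keeps the worst case
under control while the profiler chases better average behaviour.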
This thesis describes both the theory and practice of Optimistic
Evaluation. We start by giving an overview of Optimistic Evaluation.
We go on to present a formal model, which we use to justify our
design. We then detail how we have implemented Optimistic Evaluation
as part of an industrial-strength compiler. Finally, we provide
experimental results to back up our claims.
A new approach to Internet bankingJohnson, MatthewUniversity of Cambridge, Computer Laboratory2008-09enTextUCAM-CL-TR-731ISSN 1476-2986
This thesis investigates the protection landscape surrounding online
banking. First, electronic banking is analysed for vulnerabilities
and a survey of current attacks is carried out. This is represented
graphically as an attack tree describing the different ways in which
online transactions can be attacked.
The discussion then moves on to various defences which have been
developed, categorizing them and analyzing how successful they are
at protecting against the attacks given in the first chapter. This
covers everything from TLS encryption through phishing site
detection to two-factor authentication.
Having declared all current schemes for protecting online banking
lacking in some way, the key aspects of the problem are identified.
This is followed by a proposal for a more robust defence system
which uses a small security device to create a trusted path to the
customer, rather than depend upon trusting the customer’s computer.
The protocol for this system is described along with all the other
restrictions required for actual use. This is followed by a
description of a demonstration implementation of the system.
Extensions to the system are then proposed, designed to afford extra
protection for the consumer and also to support other types of
device. There is then a discussion of ways of managing keys in a
heterogeneous system, rather than one managed by a single entity.
The conclusion discusses the weaknesses of the proposed scheme and
evaluates how successful it is likely to be in practice and what
barriers there may be to adoption in the banking system.
Computing surfaces – a platform for scalable interactive
displaysRrustemi, AlbanUniversity of Cambridge, Computer Laboratory2008-11enTextUCAM-CL-TR-732ISSN 1476-2986
Recent progress in electronic, display and sensing technologies
makes possible a future with omnipresent, arbitrarily large
interactive display surfaces. Nonetheless, current methods of
designing display systems with multi-touch sensitivity do not scale.
This thesis presents computing surfaces as a viable platform for
resolving forthcoming scalability limitations.
Computing surfaces are composed of a homogeneous network of
physically adjoined, small sensitive displays with local computation
and communication capabilities. In this platform, inherent
scalability is provided by a distributed architecture. The regular
spatial distribution of resources presents new demands on the way
surface input and output information is managed and processed.
Direct user input with touch-based gestures needs to account for the
distributed architecture of computing surfaces. A scalable
middleware solution that conceals the tiled architecture is proposed
for reasoning with touch-based gestures. The validity of this
middleware is proven in a case study, where a fully distributed
algorithm for online recognition of unistrokes – a particular class
of touch-based gestures – is presented and evaluated.
Novel interaction techniques based around interactive display
surfaces involve direct manipulation with displayed digital objects.
In order to facilitate such interactions in computing surfaces, an
efficient distributed algorithm to perform 2D image transformations
is introduced and evaluated. The performance of these
transformations is heavily influenced by the arbitration policies of
the interconnection network. One approach for improving the
performance of these transformations in conventional network
architectures is proposed and evaluated.
More advanced applications in computing surfaces require the
presence of some notion of time. An efficient algorithm for internal
time synchronisation is presented and evaluated. A hardware solution
is adopted to minimise the delay uncertainty of special timestamp
messages. The proposed algorithm allows efficient, scalable time
synchronisation among clusters of tiles. A hardware reference
platform is constructed to demonstrate the basic principles and
features of computing surfaces. This platform and a complementary
simulation environment are used for extensive evaluation and
analysis.
Tangible user interfaces for peripheral
interactionEdge, DarrenUniversity of Cambridge, Computer Laboratory2008-12enTextUCAM-CL-TR-733ISSN 1476-2986
Since Mark Weiser’s vision of ubiquitous computing in 1988, many
research efforts have been made to move computation away from the
workstation and into the world. One such research area focuses on
“Tangible” User Interfaces or TUIs – those that provide both
physical representation and control of underlying digital
information.
This dissertation describes how TUIs can support a “peripheral”
style of interaction, in which users engage in short, dispersed
episodes of low-attention interaction with digitally-augmented
physical tokens. The application domain in which I develop this
concept is the office context, where physical tokens can represent
items of common interest to members of a team whose work is mutually
interrelated, but predominantly performed independently by
individuals at their desks.
An “analytic design process” is introduced as a way of developing
TUI designs appropriate for their intended contexts of use. This
process is then used to present the design of a bimanual desktop TUI
that complements the existing workstation, and encourages peripheral
interaction in parallel with workstation-intensive tasks.
Implementation of a prototype TUI is then described, comprising
“task” tokens for work-time management, “document” tokens for
face-to-face sharing of collaborative documents, and “contact”
tokens for awareness of other team members’ status and workload.
Finally, evaluation of this TUI is presented via description of its
extended deployment in a real office context.
The main empirically-grounded results of this work are a
categorisation of the different ways in which users can interact
with physical tokens, and an identification of the qualities of
peripheral interaction that differentiate it from other interaction
styles. The foremost benefits of peripheral interaction were found
to arise from the freedom with which tokens can be appropriated to
create meaningful information structures of both cognitive and
social significance, in the physical desktop environment and beyond.
Tabletop interfaces for remote collaborationTuddenham, PhilipUniversity of Cambridge, Computer Laboratory2008-12enTextUCAM-CL-TR-734ISSN 1476-2986
Effective support for synchronous remote collaboration has long
proved a desirable yet elusive goal for computer technology.
Although video views showing the remote participants have recently
improved, technologies providing a shared visual workspace of the
task still lack support for the visual cues and work practices of
co-located collaboration.
Researchers have recently demonstrated shared workspaces for remote
collaboration using large horizontal interactive surfaces. These
remote tabletop interfaces may afford the beneficial work practices
associated with co-located collaboration around tables. However,
there has been little investigation of remote tabletop interfaces
beyond limited demonstrations. There is currently little theoretical
basis for their design, and little empirical characterisation of
their support for collaboration. The construction of remote tabletop
applications also presents considerable technical challenges.
This dissertation addresses each of these areas. Firstly, a theory
of workspace awareness is applied to consider the design of remote
tabletop interfaces and the work practices that they may afford.
Secondly, two technical barriers to the rapid exploration of useful
remote tabletop applications are identified: the low resolution of
conventional tabletop displays; and the lack of support for existing
user interface components. Techniques from multi-projector display
walls are applied to address these problems. The resulting method is
evaluated empirically and used to create a number of novel tabletop
interfaces.
Thirdly, an empirical investigation compares remote and co-located
tabletop interfaces. The findings show how the design of remote
tabletop interfaces leads to collaborators having a high level of
awareness of each other’s actions in the workspace. This enables
smooth transitions between individual and group work, together with
anticipation and assistance, similar to co-located tabletop
collaboration. However, remote tabletop collaborators use different
coordination mechanisms from co-located collaborators. The results
have implications for the design and future study of these
interfaces.
Learning compound noun semanticsÓ Séaghdha, DiarmuidUniversity of Cambridge, Computer Laboratory2008-12enTextUCAM-CL-TR-735ISSN 1476-2986
This thesis investigates computational approaches for analysing the
semantic relations in compound nouns and other noun-noun
constructions. Compound nouns in particular have received a great
deal of attention in recent years due to the challenges they pose
for natural language processing systems. One reason for this is that
the semantic relation between the constituents of a compound is not
explicitly expressed and must be retrieved from other sources of
linguistic and world knowledge.
I present a new scheme for the semantic annotation of compounds,
describing in detail the motivation for the scheme and the
development process. This scheme is applied to create an annotated
dataset for use in compound interpretation experiments. The results
of a dual-annotator experiment indicate that good agreement can be
obtained with this scheme relative to previously reported results
and also provide insights into the challenging nature of the
annotation task.
I describe two corpus-driven paradigms for comparing pairs of nouns:
lexical similarity and relational similarity. Lexical similarity is
based on comparing each constituent of a noun pair to the
corresponding constituent of another pair. Relational similarity is
based on comparing the contexts in which both constituents of a noun
pair occur together with the corresponding contexts of another pair.
Using the flexible framework of kernel methods, I develop techniques
for implementing both similarity paradigms.
A standard approach to lexical similarity represents words by their
co-occurrence distributions. I describe a family of kernel functions
that are designed for the classification of probability
distributions. The appropriateness of these distributional kernels
for semantic tasks is suggested by their close connection to proven
measures of distributional lexical similarity. I demonstrate the
effectiveness of the lexical similarity model by applying it to two
classification tasks: compound noun interpretation and the 2007
SemEval task on classifying semantic relations between nominals.
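One simple kernel on probability distributions, closely related to
the Hellinger distance, is the Bhattacharyya kernel sketched below;
the thesis's distributional kernels may differ in their exact form,
and the distributions shown are toy values.

    import numpy as np

    def bhattacharyya_kernel(p, q):
        # p, q: co-occurrence distributions (non-negative, summing to 1).
        return np.sum(np.sqrt(p * q))   # equals 1 iff p and q coincide

    # Two words represented by toy co-occurrence distributions over
    # three context features:
    p = np.array([0.7, 0.2, 0.1])
    q = np.array([0.6, 0.3, 0.1])
    print(bhattacharyya_kernel(p, q))   # approx. 0.99

Because the kernel is a function of the two distributions' overlap,
plugging it into a standard kernel classifier directly reuses a
proven measure of distributional lexical similarity.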
To implement relational similarity I use kernels on strings and sets
of strings. I show that distributional set kernels based on a
multinomial probability model can be computed many times more
efficiently than previously proposed kernels, while still achieving
equal or better performance. Relational similarity does not perform
as well as lexical similarity in my experiments. However, combining
the two models brings an improvement over either model alone and
achieves state-of-the-art results on both the compound noun and
SemEval Task 4 datasets.
Deny-guarantee reasoningDodds, MikeFeng, XinyuParkinson, MatthewVafeiadis, ViktorUniversity of Cambridge, Computer Laboratory2009-01enTextUCAM-CL-TR-736ISSN 1476-2986
Rely-guarantee is a well-established approach to reasoning about
concurrent programs that use parallel composition. However, parallel
composition is not how concurrency is structured in real systems.
Instead, threads are started by ‘fork’ and collected with ‘join’
commands. This style of concurrency cannot be reasoned about using
rely-guarantee, as the life-time of a thread can be scoped
dynamically. With parallel composition the scope is static.
In this paper, we introduce deny-guarantee reasoning, a
reformulation of rely-guarantee that enables reasoning about
dynamically scoped concurrency. We build on ideas from separation
logic to allow interference to be dynamically split and recombined,
in a similar way that separation logic splits and joins heaps. To
allow this splitting, we use deny and guarantee permissions: a deny
permission specifies that the environment cannot do an action, and
guarantee permission allow us to do an action. We illustrate the use
of our proof system with examples, and show that it can encode all
the original rely-guarantee proofs. We also present the semantics
and soundness of the deny-guarantee method.
Static contract checking for HaskellXu, NaUniversity of Cambridge, Computer Laboratory2008-12enTextUCAM-CL-TR-737ISSN 1476-2986
Program errors are hard to detect and are costly, both to
programmers, who spend significant effort in debugging, and to
systems that are guarded by runtime checks. Static verification
techniques have been applied to imperative and object-oriented
languages, like Java and C#, for checking basic safety properties
such as memory leaks. In a pure functional language, many of these
basic properties are guaranteed by design, which suggests the
opportunity for verifying more sophisticated program properties.
Nevertheless, few automatic systems for doing so exist. In this
thesis, we show the challenges and solutions to verifying advanced
properties of a pure functional language, Haskell. We describe a
sound and automatic static verification framework for Haskell, that
is based on contracts and symbolic execution. Our approach gives
precise blame assignments at compile-time in the presence of
higher-order functions and laziness.
First, we give a formal definition of contract satisfaction which
can be viewed as a denotational semantics for contracts. We then
construct two contract checking wrappers, which are dual to each
other, for checking the contract satisfaction. We prove the
soundness and completeness of the construction of the contract
checking wrappers with respect to the definition of the contract
satisfaction. This part of my research shows that the two wrappers
are projections with respect to the partial ordering
‘crashes-more-often’ and, furthermore, that they form a projection pair and
a closure pair. These properties give contract checking a strong
theoretical foundation.
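The flavour of contract wrappers with blame can be imitated in an
eager untyped setting; this Python toy ignores laziness and the
formal ordering entirely, so it is an analogy rather than a model of
the thesis's construction.

    # Toy contract wrapper assigning blame at the violated boundary.
    class ContractViolation(Exception):
        pass

    def with_contract(pre, post, caller, callee):
        def wrap(f):
            def checked(x):
                if not pre(x):     # bad argument: blame the caller
                    raise ContractViolation(f"blame {caller}: {x!r}")
                y = f(x)
                if not post(y):    # bad result: blame the callee
                    raise ContractViolation(f"blame {callee}: {y!r}")
                return y
            return checked
        return wrap

    @with_contract(pre=lambda n: n >= 0, post=lambda r: r >= 1,
                   caller="caller of fact", callee="fact")
    def fact(n):
        return 1 if n == 0 else n * fact(n - 1)

In the thesis the analogous wrappers are constructed once and then
symbolically executed, so blame is discovered at compile time rather
than, as here, at run time.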
As the goal is to detect bugs during compile time, we symbolically
execute the code constructed by the contract checking wrappers and
prove the soundness of this approach. We also develop a technique
named counter-example-guided (CEG) unrolling which unrolls function
calls only on demand. This technique speeds up the checking
process.
Finally, our verification approach makes error tracing much easier
compared with the existing set-based analysis. Thus equipped, we are
able to tell programmers during compile-time which function to blame
and why if there is a bug in their program. This is a breakthrough
for lazy languages because it is known to be difficult to report
such informative messages either at compile-time or run-time.
High precision timing using self-timed circuitsFairbanks, ScottUniversity of Cambridge, Computer Laboratory2009-01enTextUCAM-CL-TR-738ISSN 1476-2986
Constraining the events that demarcate periods on a VLSI chip to
precise instances of time is the task undertaken in this thesis.
High speed sampling and clock distribution are two example
applications. Foundational to my approach is the use of self-timed
data control circuits.
Specially designed self-timed control circuits deliver high
frequency timing signals with precise phase relationships. The
frequency and the phase relationships are controlled by varying the
number of self-timed control stages and the number of tokens they
control.
The self-timed control circuits are constructed with simple digital
logic gates. The digital logic gates respond to a range of analog
values with a continuum of precise and controlled delays. The
control circuits implement their functionality efficiently. This
allows the gates to drive long wires and distribute the timing
signals over a large area. Also gate delays are short and few,
allowing for high frequencies.
The self-timed control circuits implement the functionality of a
FIFO that is then closed into a ring. Timing tokens ripple through
the rings. The FIFO stages use digital handshaking protocols to pass
the timing tokens between the stages. The FIFO control stage detects
the phase between the handshake signals on its inputs and produces a
signal that is sent back to the producers with a delay that is a
function of the phase relationship of the input signals.
The methods described are not bound to the same process and
systematic skew limitations of existing methods. For a given power
budget, timing signals are generated and distributed with
significantly less power by the approaches presented here than by
conventional methods.
State-based Publish/Subscribe for sensor systemsTaherian, SalmanUniversity of Cambridge, Computer Laboratory2009-01enTextUCAM-CL-TR-739ISSN 1476-2986
Recent technological advances have enabled the creation of networks
of sensor devices. These devices are typically equipped with basic
computational and communication capabilities. Systems based on these
devices can deduce high-level, meaningful information about the
environment that may be useful to applications. Due to their scale,
distributed nature, and the limited resources available to sensor
devices, these systems are inherently complex. Shielding
applications from this complexity is a challenging problem.
To address this challenge, I present a middleware called SPS
(State-based Publish/Subscribe). It is based on a combination of a
State-Centric data model and a Publish/Subscribe (Pub/Sub)
communication paradigm. I argue that a state-centric data model
allows applications to specify environmental situations of interest
in a more natural way than existing solutions. In addition, Pub/Sub
enables scalable many-to-many communication between sensors,
actuators, and applications.
This dissertation initially focuses on Resource-constrained Sensor
Networks (RSNs) and proposes State Filters (SFs), which are
lightweight, stateful, event filtering components. Their design is
motivated by the redundancy and correlation observed in sensor
readings produced close together in space and time. By performing
context-based data processing, SFs increase Pub/Sub expressiveness
and improve communication efficiency.
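A State Filter can be pictured as a few lines of stateful code
sitting between a sensor and the Pub/Sub layer; the interface below
is invented for illustration rather than taken from SPS.

    # Sketch of a State Filter: publish only meaningful state changes.
    class TemperatureStateFilter:
        def __init__(self, publish, threshold=0.5):
            self.publish = publish       # Pub/Sub publish callback
            self.threshold = threshold   # minimum change worth reporting
            self.last = None
        def on_reading(self, value):
            # Readings close together in space and time are highly
            # redundant, so suppress all but significant changes.
            if self.last is None or abs(value - self.last) >= self.threshold:
                self.last = value
                self.publish({"event": "temperature-change",
                              "value": value})

Because the filter holds state, a subscriber can express ‘tell me
when the temperature changes’ rather than receiving, and discarding,
every raw reading.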
Secondly, I propose State Maintenance Components (SMCs) for
capturing more expressive conditions in heterogeneous sensor
networks containing more resourceful devices. SMCs extend SFs with
data fusion and temporal and spatial data manipulation capabilities.
They can also be composed together (in a DAG) to deduce higher level
information. SMCs operate independently from each other and can
therefore be decomposed for distributed processing within the
network.
Finally, I present a Pub/Sub protocol called QPS (Quad-PubSub) for
location-aware Wireless Sensor Networks (WSNs). QPS is central to
the design of my framework as it facilitates messaging between
state-based components, applications, sensors, and actuators. In
contrast to existing data dissemination protocols, QPS has a layered
architecture. This allows for the transparent operation of routing
protocols that meet different Quality of Service (QoS) requirements.
Analysis of affective expression in speechSobol-Shikler, TalUniversity of Cambridge, Computer Laboratory2009-01enTextUCAM-CL-TR-740ISSN 1476-2986
This dissertation presents analysis of expressions in speech. It
describes a novel framework for dynamic recognition of acted and
naturally evoked expressions and its application to expression
mapping and to multi-modal analysis of human-computer interactions.
The focus of this research is on analysis of a wide range of
emotions and mental states from non-verbal expressions in speech. In
particular, on inference of complex mental states, beyond the set of
basic emotions, including naturally evoked subtle expressions and
mixtures of expressions.
This dissertation describes a bottom-up computational model for
processing of speech signals. It combines the application of signal
processing, machine learning and voting methods with novel
approaches to the design, implementation and validation. It is based
on a comprehensive framework that includes all the development
stages of a system. The model represents paralinguistic speech
events using temporal abstractions borrowed from various disciplines
such as musicology, engineering and linguistics. The model consists
of a flexible and expandable architecture. The validation of the
model extends its scope to different expressions, languages,
backgrounds, contexts and applications.
The work adopts the view that an utterance is not an isolated
entity but rather part of an interaction and should be analysed in
this context. The analysis in context includes relations to events
and other behavioural cues. Expressions of mental states are related
not only in time but also by their meaning and content. This work
demonstrates the relations between the lexical definitions of mental
states, taxonomies and theoretical conceptualisation of mental
states and their vocal correlates. It examines taxonomies and
theoretical conceptualisation of mental states in relation to their
vocal characteristics. The results show that a very wide range of
mental state concepts can be mapped, or described, using a
high-level abstraction in the form of a small sub-set of concepts
which are characterised by their vocal correlates.
This research is an important step towards comprehensive solutions
that incorporate social intelligence cues for a wide variety of
applications and for multi-disciplinary research.
Vehicular wireless communicationCottingham, David N.University of Cambridge, Computer Laboratory2009-01enTextUCAM-CL-TR-741ISSN 1476-2986
Transportation is vital in everyday life. As a consequence, vehicles
are increasingly equipped with onboard computing devices. Moreover,
the demand for connectivity to vehicles is growing rapidly, both
from business and consumers. Meanwhile, the number of wireless
networks available in an average city in the developed world is
several thousand. Whilst this theoretically provides near-ubiquitous
coverage, the technology type is not homogeneous.
This dissertation discusses how the diversity in communication
systems can be best used by vehicles. Focussing on road vehicles, it
first details the technologies available, the difficulties inherent
in the vehicular environment, and how intelligent handover
algorithms could enable seamless connectivity. In particular, it
identifies the need for a model of the coverage of wireless
networks.
In order to construct such a model, the use of vehicular sensor
networks is proposed. The Sentient Van, a platform for vehicular
sensing, is introduced, and details are given of experiments carried
out concerning the performance of IEEE 802.11x, specifically for
vehicles. Using the Sentient Van, a corpus of 10 million signal
strength readings was collected over three years. This data, and
further traces, are used in the remainder of the work described,
which is thus distinctive in its use of entirely real-world data.
Algorithms are adapted from the field of 2-D shape simplification to
the problem of processing thousands of signal strength readings. By
applying these to the data collected, coverage maps are generated
that contain extents. These represent how coverage varies between
two locations on a given road. The algorithms are first proven fit
for purpose using synthetic data, before being evaluated for
accuracy of representation and compactness of output using real
data.
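One representative 2-D simplification algorithm is the
Douglas–Peucker method, sketched below over (distance along road,
signal strength) points; the dissertation adapts algorithms of this
family, though not necessarily this exact one.

    # Douglas-Peucker-style simplification of a signal-strength trace.
    def simplify(points, tol):
        if len(points) <= 2:
            return points
        (x0, y0), (x1, y1) = points[0], points[-1]
        def dist(p):
            # Perpendicular distance of p from the chord between endpoints.
            px, py = p
            num = abs((y1-y0)*px - (x1-x0)*py + x1*y0 - y1*x0)
            den = ((y1-y0)**2 + (x1-x0)**2) ** 0.5
            return num / den if den else abs(px - x0)
        i, d = max(((i, dist(p)) for i, p in enumerate(points[1:-1], 1)),
                   key=lambda t: t[1])
        if d <= tol:
            return [points[0], points[-1]]   # interior points are noise
        return simplify(points[:i+1], tol)[:-1] + simplify(points[i:], tol)

Runs of readings that deviate from the chord by less than the
tolerance collapse to a single extent, which is what makes the
resulting coverage maps compact.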
The problem of how to select the optimal network to connect to is
then addressed. The coverage map representation is converted into a
multi-planar graph, where the coverage of all available wireless
networks is included. This novel representation also includes the
ability to hand over between networks, and the penalties so
incurred. This allows the benefits of connecting to a given network
to be traded off with the cost of handing over to it.
In order to use the multi-planar graph, shortest path routing is
used. The theory underpinning multi-criteria routing is overviewed,
and a family of routing metrics developed. These generate efficient
solutions to the problem of calculating the sequence of networks
that should be connected to over a given geographical route. The
system is evaluated using real traces, finding that in 75% of the
test cases proactive routing algorithms provide better QoS than a
reactive algorithm. Moreover, the system can also be run to generate
geographical routes that are QoS-aware.
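The routing step can be pictured as ordinary shortest-path search
over the layered graph; the sketch below uses a single scalar cost
for brevity, whereas the thesis develops genuinely multi-criteria
metrics, and the graph encoding is an assumption of this example.

    import heapq

    # Dijkstra over a multi-planar graph: nodes are (location, network)
    # pairs; inter-plane edges model handovers and carry their penalty.
    def best_network_sequence(graph, source, target):
        # graph: {(loc, net): [((loc2, net2), cost), ...]}
        dist, prev = {source: 0.0}, {}
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if u == target:
                break
            if d > dist.get(u, float("inf")):
                continue
            for v, w in graph.get(u, []):
                if d + w < dist.get(v, float("inf")):
                    dist[v], prev[v] = d + w, u
                    heapq.heappush(heap, (d + w, v))
        path, node = [], target
        while node != source:          # walk the predecessors back
            path.append(node)
            node = prev[node]
        return [source] + path[::-1]

Reading off the network component of each node in the returned path
gives the sequence of networks to connect to along the route, with
handovers appearing exactly where an inter-plane edge was traversed.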
This dissertation concludes by examining how coverage mapping can be
applied to other types of data, and avenues for future research are
proposed.
TCP, UDP, and Sockets: Volume 3: The Service-level
SpecificationRidge, ThomasNorrish, MichaelSewell, PeterUniversity of Cambridge, Computer Laboratory2009-02enTextUCAM-CL-TR-742ISSN 1476-2986
Despite more than 30 years of research on protocol specification,
the major protocols deployed in the Internet, such as TCP, are
described only in informal prose RFCs and executable code. In part
this is because the scale and complexity of these protocols makes
them challenging targets for formal descriptions, and because
techniques for mathematically rigorous (but appropriately loose)
specification are not in common use.
In this work we show how these difficulties can be addressed. We
develop a high-level specification for TCP and the Sockets API,
describing the byte-stream service that TCP provides to users,
expressed in the formalised mathematics of the HOL proof assistant.
This complements our previous low-level specification of the
protocol internals, and makes it possible for the first time to
state what it means for TCP to be correct: that the protocol
implements the service. We define a precise abstraction function
between the models and validate it by testing, using verified
testing infrastructure within HOL. Some errors may remain, of
course, especially as our resources for testing were limited, but it
would be straightforward to use the method on a larger scale. This
is a pragmatic alternative to full proof, providing reasonable
confidence at a relatively low entry cost.
Together with our previous validation of the low-level model, this
shows how one can rigorously tie together concrete implementations,
low-level protocol models, and specifications of the services they
claim to provide, dealing with the complexity of real-world
protocols throughout.
Similar techniques should be applicable, and even more valuable, in
the design of new protocols (as we illustrated elsewhere, for a MAC
protocol for the SWIFT optically switched network). For TCP and
Sockets, our specifications had to capture the historical
complexities, whereas for a new protocol design, such specification
and testing can identify unintended complexities at an early point
in the design.
Optimising the speed and accuracy of a Statistical GLR
ParserWatson, Rebecca F.University of Cambridge, Computer Laboratory2009-03enTextUCAM-CL-TR-743ISSN 1476-2986
The focus of this thesis is to develop techniques that optimise both
the speed and accuracy of a unification-based statistical GLR
parser. However, we can apply these methods within a broad range of
parsing frameworks. We first aim to optimise the level of tag
ambiguity resolved during parsing, given that we employ a front-end
PoS tagger. This work provides the first broad comparison of tag
models as we consider both tagging and parsing performance. A
dynamic model achieves the best accuracy and provides a means to
overcome the trade-off between tag error rates in single-tag per
word input and the increase in parse ambiguity over multiple-tag per
word input. The second line of research describes a novel
modification to the inside-outside algorithm, whereby multiple
inside and outside probabilities are assigned for elements within
the packed parse forest data structure. This algorithm enables us to
compute a set of ‘weighted GRs’ directly from this structure. Our
experiments demonstrate substantial increases in parser accuracy and
throughput for weighted GR output.
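For orientation, the standard inside–outside recurrences for a PCFG
in Chomsky normal form over words w_1 … w_n are shown below; the
modification described here generalises these by assigning multiple
inside and outside values to elements of the packed parse forest.

\[
\beta(A,i,i) = P(A \to w_i), \qquad
\beta(A,i,j) = \sum_{A \to B\,C} \; \sum_{k=i}^{j-1}
P(A \to B\,C)\, \beta(B,i,k)\, \beta(C,k+1,j),
\]
\[
\alpha(B,i,j) = \sum_{A \to B\,C} \sum_{k=j+1}^{n}
\alpha(A,i,k)\, P(A \to B\,C)\, \beta(C,j+1,k)
\; + \; \sum_{A \to C\,B} \sum_{k=1}^{i-1}
\alpha(A,k,j)\, P(A \to C\,B)\, \beta(C,k,i-1),
\]

with α(S,1,n) = 1; the product α(A,i,j)β(A,i,j), suitably
normalised, scores each constituent, and is the kind of quantity
from which weighted GRs are derived.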
Finally, we describe a novel confidence-based training framework
that can, in principle, be applied to any statistical parser whose
output is defined in terms of its consistency with a given level and
type of annotation. We demonstrate that a semi-supervised variant of
this framework outperforms both Expectation-Maximisation (when both
are constrained by unlabelled partial-bracketing) and the extant
(fully supervised) method. These novel training methods utilise data
automatically extracted from existing corpora. Consequently, they
require no manual effort on the part of the grammar writer,
facilitating grammar development.
Citation context analysis for information
retrievalRitchie, AnnaUniversity of Cambridge, Computer Laboratory2009-03enTextUCAM-CL-TR-744ISSN 1476-2986
This thesis investigates taking words from around citations to
scientific papers in order to create an enhanced document
representation for improved information retrieval. This method
parallels how anchor text is commonly used in Web retrieval. In
previous work, words from citing documents have been used as an
alternative representation of the cited document but no previous
experiment has combined them with a full-text document
representation and measured effectiveness in a large scale
evaluation.
The contributions of this thesis are twofold: firstly, we present a
novel document representation, along with experiments to measure its
effect on retrieval effectiveness, and, secondly, we document the
construction of a new, realistic test collection of scientific
research papers, with references (in the bibliography) and their
associated citations (in the running text of the paper)
automatically annotated. Our experiments show that the
citation-enhanced document representation increases retrieval
effectiveness across a range of standard retrieval models and
evaluation measures.
In Chapter 2, we give the background to our work, discussing the
various areas from which we draw together ideas: information
retrieval, particularly link structure analysis and anchor text
indexing, and bibliometrics, in particular citation analysis. We
show that there is a close relatedness of ideas between these areas
but that these ideas have not been fully explored experimentally.
Chapter 3 discusses the test collection paradigm for evaluation of
information retrieval systems and describes how and why we built our
test collection. In Chapter 4 we introduce the ACL Anthology, the
archive of computational linguistics papers that our test collection
is centred around. The archive contains the most prominent
publications since the beginning of the field in the early 1960s,
consisting of one journal plus conferences and workshops, resulting
in over 10,000 papers. Chapter 5 describes how the PDF papers are
prepared for our experiments, including identification of references
and citations in the papers, once converted to plain text, and
extraction of citation information to an XML database. Chapter 6
presents our experiments: we show that adding citation terms to the
full-text of the papers improves retrieval effectiveness by up to
7.4%, that weighting citation terms higher relative to paper terms
increases the improvement and that varying the context from which
citation terms are taken has a significant effect on retrieval
effectiveness. Our main hypothesis that citation terms enhance a
full-text representation of scientific papers is thus proven.
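The core of the document-representation idea fits in a few lines;
the weight and the notion of ‘context’ below are illustrative
stand-ins for the values the experiments actually tune.

    from collections import Counter

    CITATION_WEIGHT = 2.0   # hypothetical boost for citation terms

    def citation_enhanced_representation(fulltext_terms,
                                         citation_contexts):
        # Start from ordinary full-text term counts...
        rep = Counter(fulltext_terms)
        # ...then merge in words drawn from around each citation to the
        # paper, weighted higher, in the manner of Web anchor text.
        for context in citation_contexts:
            for term in context:
                rep[term] += CITATION_WEIGHT
        return rep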
There are some limitations to these experiments. The relevance
judgements in our test collection are incomplete but we have
experimentally verified that the test collection is, nevertheless, a
useful evaluation tool. Using the Lemur toolkit constrained the
method that we used to weight citation terms; we would like to
experiment with a more realistic implementation of term weighting.
Our experiments with different citation contexts did not conclude an
optimal citation context; we would like to extend the scope of our
investigation. Now that our test collection exists, we can address
these issues in our experiments and leave the door open for more
extensive experimentation.
A better x86 memory model: x86-TSO (extended
version)Owens, ScottSarkar, SusmitSewell, PeterUniversity of Cambridge, Computer Laboratory2009-03enTextUCAM-CL-TR-745ISSN 1476-2986
Real multiprocessors do not provide the sequentially consistent
memory that is assumed by most work on semantics and verification.
Instead, they have relaxed memory models, typically described in
ambiguous prose, which lead to widespread confusion. These are prime
targets for mechanized formalization. In previous work we produced a
rigorous x86-CC model, formalizing the Intel and AMD architecture
specifications of the time, but those turned out to be unsound with
respect to actual hardware, as well as arguably too weak to program
above. We discuss these issues and present a new x86-TSO model that
suffers from neither problem, formalized in HOL4. We believe it is
sound with respect to real processors, better reflects the vendors'
intentions, and is also better suited for programming. We give two
equivalent definitions of x86-TSO: an intuitive operational model
based on local write buffers, and an axiomatic total store ordering
model, similar to that of the SPARCv8. Both are adapted to handle
x86-specific features. We have implemented the axiomatic model in
our memevents tool, which calculates the set of all valid executions
of test programs, and, for greater confidence, verify the witnesses
of such executions directly, with code extracted from a third, more
algorithmic, equivalent version of the definition.
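The difference that x86-TSO captures is visible in the classic
store-buffering litmus test, mimicked below by a toy operational
model with per-thread FIFO write buffers; this conveys only the
flavour of the operational definition, whose HOL4 formalisation is
far more detailed.

    # Store-buffering litmus test under a toy TSO-style model.
    memory = {"x": 0, "y": 0}
    buffers = {0: [], 1: []}          # per-thread FIFO write buffers

    def write(tid, var, val):
        buffers[tid].append((var, val))

    def read(tid, var):
        # A thread sees its own buffered writes first (store forwarding).
        for v, val in reversed(buffers[tid]):
            if v == var:
                return val
        return memory[var]

    def flush(tid):
        for var, val in buffers[tid]:
            memory[var] = val
        buffers[tid].clear()

    write(0, "x", 1); r1 = read(0, "y")   # thread 0: x := 1; r1 := y
    write(1, "y", 1); r2 = read(1, "x")   # thread 1: y := 1; r2 := x
    flush(0); flush(1)
    print(r1, r2)   # 0 0 -- allowed under TSO, forbidden under SC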
The snooping dragon: social-malware surveillance of the
Tibetan movementNagaraja, ShishirAnderson, RossUniversity of Cambridge, Computer Laboratory2009-03enTextUCAM-CL-TR-746ISSN 1476-2986
In this note we document a case of malware-based electronic
surveillance of a political organisation by the agents of a nation
state. While malware attacks are not new, two aspects of this case
make it worth serious study. First, it was a targeted surveillance
attack designed to collect actionable intelligence for use by the
police and security services of a repressive state, with potentially
fatal consequences for those exposed. Second, the modus operandi
combined social phishing with high-grade malware. This combination
of well-written malware with well-designed email lures, which we
call social malware, is devastatingly effective. Few organisations
outside the defence and intelligence sector could withstand such an
attack, and although this particular case involved the agents of a
major power, the attack could in fact have been mounted by a capable
motivated individual. This report is therefore of importance not
just to companies who may attract the attention of government
agencies, but to all or