Moving Toward A Science Of Digital Forensic Evidence Examination

This paper summarizes a keynote speech on moving toward a science of digital forensic evidence examination. The basic premise is that we are not a normal science today, and to get to normal science, we need to make progress in many areas. The question at hand is where we should move as a community, and how we should get from here to there.

1. Background

Like almost every scientific endeavor, the examination of digital forensic evidence (DFE) started out somewhere between an art and a craft. People with special skills and knowledge leverage that skill set and knowledge base to put forth notions about the meaning of DFE in the context of legal matters. While the court system greatly appreciates science and its role through expert testimony in providing probative information, that appreciation is substantially challenged by the lack of a scientific base, in the form of adequate peer-reviewed publications associated with professional societies, a well-defined and well-understood body of knowledge, an underlying scientific methodology that the courts can understand, an experimental basis, and all of the other things that go with normal science. As the volume and criticality of DFE have increased, there is an increasing recognition of the limitations of DFE, and more importantly, the limitations of the underlying science.

In making progress in the science of digital forensic evidence examination, it may be helpful to look at the advancement of science in other areas. In most areas of science, a scientific methodology consists of four basic elements: (1) studying the past and current theories, methods, and their experimental basis; (2) identifying inconsistencies between current theories and repeatable experimental outcomes; (3) hypothesizing a new theory and performing experiments to test the new theory; and (4) publishing the results. However, in an area where there is no pre-existing scientific theory, a new methodology, theory, experimental basis, and perhaps even a new physics have to be built from scratch. In the case of DFE examination, only one attempt has been made to do this so far,[1] and this paper is substantially about that attempt. Where no citation appears in this paper, [1] should be assumed as the source.

1.1. The call for science in forensics

The US Supreme Court has spoken [2] and the National Research Council has concurred. [3] A rigorous scientific approach is needed for forensic evidence to warrant use in the courts in the United States, and much of the world is likely to follow that approach, if it isn't already following it.

failures of forensics. Recent failures have been quite dramatic. For example, in the Madrid bombing case, the US FBI declared that a fingerprint from the scene demonstrated the presence of an Oregon attorney. However, that attorney, after having been arrested, was clearly demonstrated to have been on the other side of the world at the time in question. The side effect is that fingerprints are now being challenged as valid scientific evidence across the land and around the world. [4] A similar situation exists in cases where forensic examiners have done a poor job and testified in numerous cases, typically for the prosecution. The inability to effectively challenge evidence from such supposed experts through a scientific methodology and inquiry process makes this sort of evidence extremely problematic, all the more so because of the limits of human integrity involved. In case after case, when the details are examined, forensic evidence seems to come up short under close scrutiny and competent challenge. The solution is simple: build and apply real science, and the truth will out.

2. A first attempt at proposing a science

This first attempt at proposing a science for DFE examination consists of the creation and enumeration of some elements of an epistemology and physics of digital information, a model of the DFE examination process within the context of the legal environment, and the interpretation of existing information, experimental results, and theory in the new model. As the first model to attempt to create a scientific structure for DFE examination, it is proposed under the name "the standard model". It will only become a standard to the extent that it is embraced by the community, and it will have to be adapted over time as the community comes to decide on and adapt to the realities of the scientific method.

2.1. An epistemology for digital forensics

Epistemology studies the nature of knowledge, its presuppositions, foundations, extent, and validity. In the case of DFE examination, some basics may be reasonably assumed for the purposes of creating a science. Here are some of the epistemological issues already identified.

Digital evidence consists entirely of sequences of binary values that we call bits. Thus, in this limited field, we do not deal with the physical nature of normal space, but we operate in a very different space.

The physics of DFE differs from that of matter and energy, and thus the normal assumptions that are made with respect to the way the world works do not apply, or don't apply in the same way, to DFE. Among the substantial differences are, without limitation:

DFE has observation without alteration and duplication without removal.

Computational complexity limits what can be done with what resources in what time frame - a different "speed of light".

Unlike most physical evidence, which is very often transfer evidence and sometimes trace evidence, DFE is always trace evidence, but essentially never transfer evidence.

DFE is normally latent in nature in that it can only be observed through the use of tools. This then implies a multitude of requirements surrounding those tools and their use.
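The first property in the list above, observation without alteration and duplication without removal, can be illustrated with a minimal Python sketch. The byte string and the `duplicate_and_verify` helper are hypothetical stand-ins for a real evidence image, and SHA-256 is an assumed choice of digest, not something the model prescribes.

```python
import hashlib


def duplicate_and_verify(original: bytes) -> tuple[bytes, bool]:
    """Copy a bit sequence and check that the copy and the original still match."""
    digest_before = hashlib.sha256(original).hexdigest()
    copy = bytes(bytearray(original))  # an independent copy of the same bits
    digest_after = hashlib.sha256(original).hexdigest()
    digest_copy = hashlib.sha256(copy).hexdigest()
    # Reading and copying left the original unchanged, and the copy is bit-identical.
    return copy, digest_before == digest_after == digest_copy


evidence = b"\x00\x01hypothetical evidence bytes\xff"  # illustrative data only
copy, unchanged = duplicate_and_verify(evidence)
print(unchanged)
```

In practice, whether observation really leaves the original unaltered depends on the tools used to observe it, which is why the latent nature of DFE places requirements on those tools.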

As a "scientific" approach, the theories are not casual theories, but "scientific theories". That means that:

They are constructs that are testable.

Refutation can destroy a theory, but finite confirmations cannot "prove" it. They can only confirm it.

Scientific theories change slowly, and once accepted, only change because of dramatic changes in understanding of the underlying physics. Those changes only relate to rare cases.

The "theories" of DFE lead to a form of a physics of digital information. Many of them are based on existing, widely accepted mathematical knowledge, but some remain, to some extent, conjectures from computer engineering, computer science, finite mathematics, and related areas.

2.2. A quick introduction to information physics

The physics of digital information is significantly different from the physics of the physical world we deal with on a day-to-day basis. There are many differences between these worlds, and many of them are described in more detail elsewhere,[1] but to get a sense of the sorts of differences we face, it is noted that many of the underlying assumptions of the physical world, such as smoothness, continuous space, the notion of transfer, continuous time, and even the speed of light, are all very different in the digital world, and in many cases, simply don't hold true. The implications of these differences are, in some sense, profound.

Input sequences to digital systems produce outputs and state changes as a function of the previous state. To the extent that the state or outputs produce stored and/or captured bit sequences, these form traces of the event sequences that caused them. Thus the definition of a trace may be stated as: "A set of bit sequences produced from the execution of a finite state machine."
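As a minimal sketch of this definition, the following Python fragment runs a tiny finite state machine (FSM) and records the bit sequence it stores. The transition table, the state names, and the `run_fsm` helper are hypothetical, chosen only for illustration.

```python
# Hypothetical Mealy-style machine: (state, input bit) -> (next state, output bit).
TRANSITIONS = {
    ("idle", 0): ("idle", 0),
    ("idle", 1): ("busy", 1),
    ("busy", 0): ("idle", 1),
    ("busy", 1): ("busy", 0),
}


def run_fsm(inputs, state="idle"):
    """Return the final state and the trace (stored output bits) for an input sequence."""
    trace = []
    for bit in inputs:
        state, out = TRANSITIONS[(state, bit)]
        trace.append(out)  # the stored output is what an examiner later observes
    return state, trace


final_state, trace = run_fsm([1, 1, 0, 0])
print(final_state, trace)
```

The examiner sees only `trace` (and perhaps `final_state`), not the inputs that produced them.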

We generally think of the physical space we live in as a space that diverges with time, with any given initial conditions in history producing a wide variety of
possible future outcomes. As a result, when looking at a physical trace, at least theoretically, we could identify a unique historical event sequence that produced such a
trace. But the digital space converges with time, so that instead of the one to many relation that we see in the physical world, we see a many to one relation in the
digital world. That means that a very large number of potentially very different input sequences and initial states may produce identical traces (i.e., from
subsequent states and sequences). Almost any digital trace we identify could be the result of a large number of different historical event sequences, and the number
of those sequences increases dramatically with the passage of time (i.e., execution of FSMs). Thus the traces from digital mechanisms are not, in general, unique as to the input sequences that produced them.
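The many-to-one relation can be made concrete by enumeration. The sketch below assumes a hypothetical machine whose only stored trace is a 2-bit saturating counter of the 1 inputs it has seen, and counts how many distinct input sequences of each length leave the same trace behind.

```python
from itertools import product


def final_trace(inputs):
    """Hypothetical FSM: a 2-bit counter of 1-inputs that saturates at 3."""
    count = 0
    for bit in inputs:
        count = min(count + bit, 3)
    return count


def histories_for(trace, length):
    """Count the input sequences of a given length that end in the given trace."""
    return sum(1 for seq in product((0, 1), repeat=length) if final_trace(seq) == trace)


for n in (3, 6, 9):
    print(n, histories_for(3, n))
```

For this toy machine, the number of histories consistent with a trace value of 3 grows from 1 at length 3 to 42 at length 6 and 466 at length 9, so longer execution leaves more candidate histories for the same trace.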

Another less mathematical sort of problem is the relationship between the unlimited granularity of the physical world in both time and space and the finite granularity of the digital world in both time and space. Because of this difference, at the interface between the physical and digital world there is a discontinuity, and quite minor differences are exaggerated near the discontinuity while major differences are ignored away from the discontinuity. The limited sensor and actuator capacity of the devices that convert between the digital and physical world also largely prevents the exchange of a wide variety of information that is potentially probative, as well as making a wide variety of forgeries at the interface far easier than they might otherwise be. This then also implies that input sequences do not directly demonstrate what non-digital event sequences may have produced them. As a result, additional effort is required to attribute traces to real-world causes, and forgery is potentially far easier in the digital space than in the physical space.
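The discontinuity can be sketched with a one-bit sensor. The 0.5 threshold and the sample levels below are assumptions chosen for illustration, not measurements.

```python
def digitize(level: float, threshold: float = 0.5) -> int:
    """Hypothetical one-bit sensor: map a physical level in [0, 1] to a bit."""
    return 1 if level >= threshold else 0


# Near the threshold, a tiny physical difference flips the recorded bit.
near = (digitize(0.499), digitize(0.501))
# Away from the threshold, a large physical difference leaves no trace at all.
far = (digitize(0.51), digitize(0.99))
print(near, far)
```

The recorded bit alone cannot distinguish a level of 0.51 from 0.99, which is one reason attribution of traces to physical causes requires additional effort.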

A great deal of time and effort could be spent discussing the rather large number of such differences in substantial detail, but many of them are already documented. The larger implication of these examples is that digital forensic evidence is the result of processing with FSMs, and that this inherently limits its potential utility for providing probative information regarding real-world events. DFE examiners must take these limitations into account when undertaking their examinations and when testifying about the results of those examinations. These limitations are due directly to the limits of DFE and the methodologies used to understand and work with it.

2.3. A quick introduction to the model

The model of DFE examination is related to an overarching model of digital forensics. [5] It can be codified in mathematical terms as follows:
