The project's approach to modeling
probable precontact archaeological resource locations uses Geographic Information
Systems (GIS) and statistical analysis to identify relationships between known
archaeological sites and their environments. The results are maps indicating
areas of low, medium, and high potential for archaeological sites. The field
and laboratory procedures adopted in the project were selected because of the
requirements of this kind of modeling approach (see Section
2.2). These requirements are: (1) archaeological data acquired through probabilistic
methods; (2) quantifiable data and variables formatted into a Geographic Information
System (GIS); (3) criteria to evaluate model accuracy; (4) procedures for using,
maintaining, and updating the model; and (5) an implementation program. Given
these requirements, the Mn/Model project was divided into three developmental
phases: (1) basic data accumulation and the creation of prototype models; (2)
formal model development; and (3) model refinement and implementation.

While there are many approaches
to the development of predictive locational models, all must choose between
various kinds of units of analysis, dependent and independent variables, types
of models and decision rules, and model evaluation criteria.

2.2.1 Units of Analysis

In archaeological modeling studies,
the unit of analysis is a parcel of land. For the purposes of this discussion,
a parcel is considered to be the smallest basic unit, represented by a single
cell in the raster GIS. Each land parcel has archaeological attributes (e.g.,
presence or absence of archaeological sites, site type) and environmental attributes
(e.g., elevation, slope, proximity to water, soil type). The purpose of these
studies is to determine the degree of association between the archaeological
attributes (the dependent variables) in these parcels and
their environmental characteristics (the independent
variables) (Carr 1985:125). Researchers then induce from this relationship
that other parcels with similar environmental characteristics will have similar
archaeological resources.

The size of land parcels in archaeological
modeling studies can vary widely, depending on the intent of the modeling project
(Kvamme 1990:268-269; Parker 1986:410-414). As a rule, parcels should be of equal
area to facilitate quantification and statistical analysis. Where a detailed level
of resolution is called for, parcel
size should be small; where the intent is to identify broad regions of higher
or lower resource density, parcel size can be quite large. In general, high resolution
models have parcel sizes equal to or less than 10 acres (4 ha or 40,469 m²),
while models of very low resolution have parcel sizes equal to or greater than
about 250 acres (101 ha or 1,011,715 m²). However,
each size range comes with a cost. For example, if parcel size is very large,
the probability that parcels will contain archaeological resources can be very
high, while the precision of the model is fairly low. On the other hand, if parcel
size is very small, the cost of mapping environmental and archaeological resources
can be very high.
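
The parcel-size figures above follow from a simple unit conversion, and the trade-off between resolution and mapping cost can be made concrete by counting how many equal-area parcels a study area requires. The sketch below is illustrative only: the function names are hypothetical, and the conversion constant is the international acre rather than a value taken from the project.

```python
import math

# Square metres in one international acre (assumed conversion constant).
SQ_M_PER_ACRE = 4046.8564224


def acres_to_sq_m(acres):
    """Convert an area in acres to square metres."""
    return acres * SQ_M_PER_ACRE


def acres_to_ha(acres):
    """Convert an area in acres to hectares (1 ha = 10,000 square metres)."""
    return acres_to_sq_m(acres) / 10_000.0


def parcels_needed(study_area_acres, parcel_acres):
    """Number of equal-area parcels needed to cover a study area, rounded up."""
    return math.ceil(study_area_acres / parcel_acres)


# A hypothetical 10,000-acre study area at the two resolutions discussed.
high_resolution_count = parcels_needed(10_000, 10)
low_resolution_count = parcels_needed(10_000, 250)
```

Under these assumptions the high-resolution grid needs 25 times as many parcels as the low-resolution grid, which is one way to see why mapping costs rise as parcel size falls.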

2.2.2 Dependent Variables: Archaeological Events

The definition of archaeological
events in modeling projects depends on the purpose of the project. In most academic
projects, the goal is to model the locational behavior of different functional,
chronological, and cultural types of occupations (components). By contrast,
most cultural resource management projects, which must meet the requirements
of federal legislation, aim to identify areas that contain archaeological sites
without regard to the actual site type. Given this goal, and the difficulty
involved in clearly identifying age and meaningful functional and cultural types
of occupations at most archaeological sites, the most frequently used dependent
variable in cultural resource management projects is the simple dichotomous
case of "archaeological resources are present (S)" and "archaeological
resources are not present (S’)" (e.g., Bradley et al. 1986; Kvamme 1984,
1986, 1990; Larralde and Chandler 1981; Parker 1985; Scholtz 1981; Stone 1984;
Tipps 1983; Warren et al. 1987). Because this approach lumps occupations of
various kinds together, it incorporates a great deal of locational variability
that may reduce the potential predictive power of the model (e.g., Judge 1973;
Roper 1979). However, the approach has the advantages of minimizing complexity,
focusing on non-ambiguous mutually exclusive events, and producing large sample
sizes. Many powerful predictive models have been built using this simple solution
to defining all the possible events that can occur in a land parcel (e.g., Kvamme
1988a, 1990; Kvamme and Jochim 1989; Limp et al. 1987; Parker 1985).

Other choices of dependent variables
in locational modeling in archaeology have included multiple site types (e.g.,
Kvamme 1985; Kvamme 1988a; Limp et al. 1987; Parker 1986; Scholtz 1981), counts
of artifact density (Green 1973; Nance et al. 1983; Zubrow and Harbaugh 1978),
and various measures of site significance (e.g., James and Knudson 1983; Woodward
Clyde Consultants 1978). Like the dichotomous archaeological resource present/absent
choice, these depend on the fact that there are common locational tendencies
that crosscut functional and cultural categories, such as proximity to water
and preference for level ground, and that many locations in a region are unsuitable
for most kinds of activities for similar environmental reasons, such as the
presence of swamps or very steep slopes (e.g., Kvamme 1985; Kvamme and Jochim
1989). The advantages and disadvantages of these choices are reviewed in Kohler
and Parker (1986), Judge and Sebastian (1988), and Kvamme (1990).

This project chose the archaeological
resource present/archaeological resource absent dependent variable as its basic
unit of analysis because of the requirements of the National Historic Preservation
Act of 1966 (as amended). A cultural resource must be over 50 years old and
have significance and integrity to be considered a historic property. A cultural
resource can be determined significant if it meets one or more of the four federal
eligibility requirements. In other words, the significance of the resource must
be analyzed. While a predictive model can assist agencies planning projects
by determining the probability that a cultural resource may be in the area of
potential effect, significance is generally analyzed through the development
of historic contexts. Therefore, the scope of the project was directed toward
the identification component of the legislation and 36 CFR 800 (Code of Federal
Regulations, Title 36, Part 800). The modeling of significance should be considered in the future
as part of Mn/Model’s evolution as outlined in the implementation plan.

A variety of independent variables
have been used in modeling archaeological site locations, including sociocultural
(Scholtz 1981; Zimmerman 1977) and radiometric characteristics (e.g., Custer
et al. 1986), and positional parameters (Kvamme 1988b; Limp et al. 1987; Parker
1985). However, most modeling projects in North America have focused on the
economic component of site location, as environmental factors are more easily
quantified and generally considered directly related to locational decisions
by hunter-gatherers without advanced transportation (e.g., Bettinger 1980; Jochim
1976; Wood 1978). The argument is (1) in these types of societies the most important
economic transactions are with the regional environment; and (2) people in these
societies tend to minimize the time and effort they expend in these transactions
(Kvamme 1990:271). The effect was to encourage location close to important environmental
resources. Since these assumptions are not applicable to more complex societies
engaged in a market economy, the model developed here is intended to apply only
to settlements in Minnesota before Euroamerican social and political factors
began controlling settlement locations. The end of that period is set here at
1837, the date established by the Minnesota State Historic Preservation Office
(SHPO) for the end of the contact period, when written records and historical
methods of investigation become a more effective avenue for determining site
location.

The focus on environmental or biophysical characteristics of land parcels, such as slope, soil type, elevation, plant
community type, and distance to water, is a practical one. These variables are
relatively easy to identify today through measurements or observations made
on maps, aerial photographs, remotely sensed data sets, and digital spatial
information sources, such as GIS. Environmentally-based predictive models work
by correlating the location of a sample of known sites with the environmental
attributes of their surroundings and predicting that other, unknown sites will
be present wherever similar sets of characteristics occur. The goal, then, is
to define the environmental attributes of parcels that have some bearing on
the distribution of archaeological resources in a study area.

Prior research and hunter-gatherer
settlement theory show that open-air site placements were most often a function
of a matrix of environmental factors (e.g., Jochim 1976; Larralde and Chandler
1981; Roper 1979; Scholtz 1981; Shermer and Tiffany 1985; Thomas and Bettinger
1976; Williams et al. 1973). As a rule, the variables chosen are those that
reflect stable landform characteristics through time, such as elevation, slope,
and aspect, to ensure that there is some correspondence between modern map-measured
data and the prehistoric-early historic environment. Some potentially important
variables, such as plant community composition and water table elevation, are
notoriously sensitive to climatic changes and, as a result, are difficult to
use without recourse to proxy measures (Kohler and Parker 1986:415). Nonetheless,
distance to nearest water can be measured from major drainage locations and
lakes, which are landform features. The assumption is that, though lake size
has changed through time, distance to a lake's edge today serves as a useful
proxy measure of distance to water in the past.

Since the environmental variables
most suited to a particular model depend in part on the physical nature of the
region under investigation, they cannot be determined without analysis. Consequently,
most modeling projects initially measure many landform, hydrological, soil,
and geologic characteristics, including slope, aspect, elevation, local relief,
landform type, horizontal distance to the nearest permanent water and stream
confluence, and distance to streams. When measurements of all these characteristics
must be obtained by GIS, they must be both detectable and quantifiable from
available data sources (e.g., Kvamme 1990). Variables with low predictive power
are filtered out throughout the model development process. Justification for
the adoption of these variables and procedures for operationalizing their measurement
can be found in Hasenstab (1983), Kvamme (1986), Kvamme and Kohler (1988), Limp
et al. (1987), Parker (1985, 1986), Roper (1979), Scholtz (1981), and Warren
et al. (1987), among other authors. General reviews of the issues involved are
provided by Judge and Sebastian (1988), Kohler and Parker (1986), and Kvamme
(1990).

Although this empiric-correlative
procedure has been effective in the formation of predictive locational models
using biophysical variables, the importance of social, political, and ideological
factors in the spatial location of settlements cannot be ignored (Sebastian
and Judge 1988:5-6; Kvamme 1997:1). Their identification will become an increasingly
important focus of the modeling process as more is learned about the archaeology
of specific regions.

2.2.4 Types of Models and Decision Rules

The modeling approach described
is based on the assumption that human behavior is nonrandom and, therefore,
that settlements and other activity places are nonrandomly distributed. This
means that significant regional patterning should exist in the distribution
of archaeological resources, an assumption supported by many studies of settlement
data (e.g., Brandt et al. 1992:269; Judge 1973; Kvamme 1985; Roper 1979; Thomas
and Bettinger 1976). Other key assumptions are that: (1) archaeological resource
locations are nonrandomly distributed with respect to identifiable environmental
variables; and that (2) the archaeological site samples are representative of
resource locations in this region. While the second assumption can be satisfied
by some form of random sampling design, the first requires a statistical analysis
that is capable of identifying patterns and differences between types of locations
that generally possess or do not possess open-air sites.

Both univariate and multivariate statistical models are used
to identify environmental variables on which the distributional differences
of dependent variables (archaeological resource presence/absence) are most pronounced.
Several non-parametric univariate statistical tests are performed on the individual
variables, and a logistic regression technique is most often used to explore
multivariate differences between parcels with and without resources. Many researchers
have adopted multiple logistic regression models because they:

(1) make no assumptions about the
form of the distribution of the data (i.e., they are a nonparametric technique);

(2) are robust classifiers no matter
what the distributional form (which is important in environmentally diverse
regions);

(3) can handle nominal, ordinal,
and interval level independent variables; and

(4) seem to produce better results
than other multivariate modeling strategies.

The results of these statistical
procedures are used, at least initially, to formulate decision rules that, when
applied to the environmental variables for each land parcel, indicate the potential
for presence or absence of archaeological resources. Decision rules may be based
on Boolean logic (if x is true, then a site is likely to be present) or mathematical
equations. For logistic regression, the decision rules take the form of an equation
that calculates a specific point on the continuous logistic transformation scale.
Since a fundamental issue in locational modeling is the size of the weight that
should be applied to each of the independent variables investigated, a model
will be selected by first calibrating the decision rule to sample data (Kvamme
1988). In empiric-correlative predictive models, such as those derived from
logistic regressions, variables are not weighted prior to analysis. Rather,
variables are automatically weighted by the statistical procedure through coefficients
(see Kvamme 1992:37-38). For a review of the wide variety of algorithms used
in archaeology to develop model decision rules, see Kohler and Parker (1986)
and Kvamme (1988a, 1990).
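
As a rough illustration of how a logistic regression decision rule turns a parcel's environmental values into a point on the logistic scale, the sketch below evaluates the standard logistic function for a single cell. The variable names, coefficient values, and 0.5 cutoff are hypothetical and chosen only for demonstration; they are not coefficients or thresholds from this project.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Evaluate p = 1 / (1 + exp(-z)), where z = intercept + sum(b_i * x_i)."""
    z = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=0.5):
    """Assign a parcel to S (resources present) or S' (not present).

    The cutoff is an assumption; in practice it is chosen during calibration.
    """
    return "S" if probability >= cutoff else "S'"


# Hypothetical weights: probability falls with distance to water and with slope.
coefficients = {"dist_water_km": -2.0, "slope_deg": -0.1}
intercept = 1.0

p_near = logistic_probability(intercept, coefficients,
                              {"dist_water_km": 0.1, "slope_deg": 2.0})
p_far = logistic_probability(intercept, coefficients,
                             {"dist_water_km": 2.0, "slope_deg": 10.0})
```

In a fitted model the coefficients play the role of the automatically derived weights described above; here they are simply supplied by hand.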

As expressed by Kvamme (1990:261),
a "predictive model simply is a decision rule that assigns a land parcel
location to one of the defined archaeological event classes on the basis of
other conditions or characteristics of the location." Once determined,
these decision rules can be applied to any unsurveyed parcel of land to decide
whether the model specifies that a site is likely to be present. Presumably,
the subsequent application of decision rules to the entire landscape captures
a pattern left behind in the archaeological record. This methodology facilitates
cultural resource management planning by delineating the environmental patterns
exhibited by surveyed parcels of land that do or do not contain archaeological
resources and mapping them across unsurveyed regions using GIS.

2.2.5 Model Evaluation

A predictive site model in archaeology
is simply a decision rule that assigns land parcels in a study area to one of
a number of mutually exclusive and exhaustive archaeological events (such as
potential for resource presence or absence) on the basis of environmental or
other non-archaeological characteristics of the locations. Presumably, the identification
of the key environmental attributes of the land parcels associated with the
presence or absence of an archaeological resource is a means of developing a
model that will be more effective than a by-chance model. Assessment of model
performance and accuracy is therefore necessary to determine whether a predictive
model meets this standard.

Model testing may involve a number
of procedures. One common procedure involves the determination of the a priori, or chance, probability of the occurrence of certain archaeological events; followed
by an independent test of the model’s effectiveness against this probability.
Another testing procedure involves evaluating how well a model predicts sites
in a testing data set that is separate from the data used to build the model.
Alternatively, one can test the correspondence between models developed from
separate datasets for the same region, then calculate the coefficient of agreement
(kappa) for the models.

An a priori probability is
the "pure chance" probability that a land parcel does or does not
contain archaeological resources. As a by-chance locational model, it provides
a baseline that helps define what other models must accomplish. In regional
studies, by-chance models can be calculated by determining the relative frequency
of sites in a random sample of surveyed land parcels (e.g., Hord and Brooner
1976; Kvamme 1983, 1988a; Parker 1985:187). For instance, if 50 land parcels
contain resources in a surveyed sample of 500 parcels, then the probability
that a land parcel contains resources by chance is 0.1. Since this probability
of correctly assigning a land parcel to high or low site potential is no better
than chance, these probabilities can be considered by-chance locational models.
A more refined technique of calculating a priori probabilities is to
divide the summed site areas by the actual areas surveyed (Kvamme 1988a:410;
1990:260).
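
The two ways of calculating an a priori probability described above can be written out as a short sketch. The function names are hypothetical; the count-based example uses the 50-in-500 figures from the text, while the area-based inputs are made-up values for illustration.

```python
def a_priori_by_count(parcels_with_resources, parcels_surveyed):
    """Relative frequency of parcels containing resources in a random sample."""
    return parcels_with_resources / parcels_surveyed


def a_priori_by_area(site_areas, surveyed_area):
    """Summed site area divided by the area actually surveyed (same units)."""
    return sum(site_areas) / surveyed_area


# The example from the text: 50 of 500 surveyed parcels contain resources.
p_count = a_priori_by_count(50, 500)

# Hypothetical areas in hectares: three sites within 200 ha of surveyed land.
p_area = a_priori_by_area([0.5, 1.0, 0.5], 200.0)
```

The area-based estimate is typically much smaller than the count-based one, because a site usually occupies only part of the parcel in which it is recorded.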

A second kind of model testing involves
the comparison of an independent random sample of known sites against model
assignments. This independent sample is referred to as the "testing data,"
whereas the database used to build the model is the "training data."
A completely independent group of site locations must be used, because using
the same sites as were used to develop the model results in test sample dependency,
producing overly optimistic assessments of model performance (e.g., Kohler and
Parker 1986; Kvamme 1988a, 1990; Mosteller and Tukey 1977). Other strategies
have been developed to surmount problems that result from testing a model on
data from which it was derived. These include split sampling (Limp et al. 1987;
Mosteller and Tukey 1977), jackknife methods (Kvamme 1988a; Mosteller and Tukey
1977), and sequential analysis methods (Limp and Lafferty 1981). All have severe
drawbacks. Therefore, a completely independent random sample survey is carried
out when feasible, for it makes it easier to assess the accuracy of models (e.g.,
Rose and Altschul 1988).

The predictive power of a model is determined by calculating its percentage of
correct predictions in the test sample and comparing this with the likelihood of
a correct prediction by chance alone. A "correct
prediction" means that an archaeological site in the test population falls
into an area where the model predicts sites should be present. The correct predictions,
as a percentage of the total population of sites in the test sample, determine
the model's specific percent of predictive accuracy (Kvamme 1992:32). As an arbitrary
but workable performance threshold, a good model should correctly predict
about 85 percent of open-air sites.
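
The percent of predictive accuracy can be computed directly from the test sample. The sketch below assumes each test site has already been overlaid on the model and flagged as falling inside or outside the area where sites are predicted; the function name and the sample of 20 test sites are hypothetical.

```python
def percent_correct(site_in_predicted_area):
    """Percent of test sites that fall where the model predicts sites.

    site_in_predicted_area: one boolean per test site, True when the site
    lies in an area classified as likely to contain resources.
    """
    hits = sum(1 for flag in site_in_predicted_area if flag)
    return 100.0 * hits / len(site_in_predicted_area)


# Hypothetical test sample: 17 of 20 independent test sites are predicted.
test_flags = [True] * 17 + [False] * 3
accuracy = percent_correct(test_flags)
meets_threshold = accuracy >= 85.0
```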

The main method of assessing model
performance in archaeology is usually some form of cross-tabulation that compares
the actual presence of sites in the test population and the model assigned probabilities
of resources in the same locations. A number of statistical tests can then be
used to determine the significance of these cross-tabulation frequencies (e.g.,
Congalton et al. 1983; Kvamme 1988a, 1990). In addition, the ratio between the
percentage of sites correctly predicted and the percent of the study area classified
as likely to have archaeological resources present can be used to calculate
a gain statistic (Kvamme 1988a:329). Higher values of this statistic indicate
a greater improvement of the model over by-chance models.
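
The text does not spell out the formula for the gain statistic. The sketch below assumes the form in which it is commonly cited, gain = 1 - (percent of area classified as site-likely / percent of sites correctly predicted). The function name and input values are hypothetical.

```python
def gain(pct_sites_correct, pct_area_site_likely):
    """Gain statistic in its commonly cited form (an assumption here).

    A value near 1 means most sites are captured in a small share of the
    study area; a value of 0 means the model does no better than chance.
    """
    return 1.0 - (pct_area_site_likely / pct_sites_correct)


# Hypothetical model: 85 percent of test sites fall in 30 percent of the area.
model_gain = gain(85.0, 30.0)

# A by-chance model captures sites in proportion to area, so its gain is 0.
chance_gain = gain(30.0, 30.0)
```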

Finally, if more than one model
can be built for a region, each using a different dataset, measuring the correspondence
between these models can help evaluate their stability. This is a technique
borrowed from remote sensing. Models with high levels of correspondence (the
majority of cells assigned the same probability classes in both models) are
likely to be more reliable and less likely to change when new data are incorporated.
Models with low levels of correspondence are indicative of variability between
datasets. Neither may be adequately representative of the distribution of sites
within a region, and new data would likely produce yet a different model. The kappa statistic (Bonham-Carter 1994: 245)
can be used to measure the amount of agreement between models and correct for
the expected amount of agreement.
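
A minimal sketch of the coefficient of agreement follows, assuming the standard form kappa = (observed agreement - expected agreement) / (1 - expected agreement) applied to two lists of per-cell probability classes. The function name and the four-cell example grids are hypothetical.

```python
from collections import Counter


def kappa(classes_a, classes_b):
    """Coefficient of agreement between two models' per-cell class lists."""
    n = len(classes_a)
    # Share of cells assigned the same class by both models.
    observed = sum(1 for a, b in zip(classes_a, classes_b) if a == b) / n
    # Agreement expected by chance, from each model's class frequencies.
    count_a, count_b = Counter(classes_a), Counter(classes_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both models assign every cell to one and the same class.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical flattened rasters from two models of the same region.
model_1 = ["high", "high", "low", "low"]
model_2 = ["high", "low", "high", "low"]
agreement = kappa(model_1, model_2)
```

In this example the two models agree on half the cells, which is exactly what chance would produce given their class frequencies, so kappa is zero even though raw agreement is 50 percent.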

2.2.6 The Role of Geographic Information Systems

The effective and efficient application
of the predictive archaeological models developed in the 1980s was severely
hampered by the labor effort required to manually measure map variables in large-scale
government-funded projects (Brown 1981; Judge and Sebastian 1988). In fact,
the application of high-resolution models was virtually impossible without restricting
sample size and the range of variables investigated. These limitations have
been overcome in recent years through the application of GIS technology, which
automates the entire process (e.g., Hasenstab 1983; Kvamme 1986, 1989, 1990;
Limp et al. 1987; Marozas and Zack 1987; Warren et al. 1987). Unlike traditional
database management systems, GIS has a spatial, or mappable, component that
allows the capture, efficient manipulation, analysis, and storage of geographical
information.

Two important properties of GIS
are its ability to encode information from diverse sources and to generate new
data. Geographically distributed information can be encoded from such diverse
sources as aerial photographs, remotely sensed digital imagery, and conventional
maps, like geologic, soils, and topographic maps. Since much of this information
is already being mapped into GIS systems by state and federal agencies, including
the location of known archaeological sites, this aspect of predictive modeling
is not as time consuming as it may seem. GIS is also able to generate secondary
data from other primary data sources. Digital elevation data at 1:24,000 scale,
for example, can be employed to obtain measures for other layers, such as slope,
aspect, land relief, and local terrain variability (Kvamme and Kohler 1988).
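
To illustrate how a secondary layer can be derived from elevation data, the sketch below computes slope for the interior cells of a small elevation grid using central differences. This is one simple approach among several and is not necessarily the algorithm used by any particular GIS package; the function name, the 30 m cell size, and the sample grid are assumptions.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope in degrees for interior cells of a square-celled elevation grid.

    dem: 2-D list of elevations; cell_size: cell width in the same units.
    Edge cells are left as None because they lack neighbours on one side.
    """
    n_rows, n_cols = len(dem), len(dem[0])
    slope = [[None] * n_cols for _ in range(n_rows)]
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            # Central-difference estimates of the elevation gradient.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2.0 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2.0 * cell_size)
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope


# Hypothetical plane rising 30 m per 30 m cell from west to east.
tilted = [[0.0, 30.0, 60.0],
          [0.0, 30.0, 60.0],
          [0.0, 30.0, 60.0]]
centre_slope = slope_degrees(tilted, 30.0)[1][1]
```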

GIS technology can be used in all
phases of predictive archaeological models, that is, in their development, testing,
and application. By using a cell-based system that represents land parcel locations,
GIS can store separate thematic map layers that contain values for environmental
variables, like proximity to water, aspect, slope, and elevation, for each cell
(e.g., Wansleeben 1988). From these, GIS can measure any number of environmental
variables across the entire surface of a study area and then apply the model
decision rule to this surface to determine the assignment of each parcel of
land within the area to one or another of the dependent archaeological event
classes (here "resources are present" or "resources are not present").
All of this is accomplished by a series of mathematical operations on each land
parcel, or cell, in the surface. Since the number of cells in a surface is a
function of its areal extent and resolution, developing high resolution multivariate
models of large areas is completely dependent on the capabilities of GIS.
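
The cell-by-cell application of a decision rule to stacked thematic layers can be sketched as follows. The layer names, the threshold values in the example rule, and the 2 x 2 grid are hypothetical; a real model would substitute its calibrated rule.

```python
def apply_decision_rule(layers, rule):
    """Apply `rule` to every cell of a set of co-registered raster layers.

    layers: dict mapping a variable name to a 2-D list of cell values.
    rule:   function taking one cell's values (a dict) and returning a class.
    """
    names = list(layers)
    first = layers[names[0]]
    n_rows, n_cols = len(first), len(first[0])
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            cell = {name: layers[name][r][c] for name in names}
            row.append(rule(cell))
        result.append(row)
    return result


def example_rule(cell):
    # Hypothetical Boolean rule with made-up thresholds, for illustration only.
    near_water = cell["dist_water_m"] <= 200
    gentle_slope = cell["slope_deg"] <= 5
    return "S" if (near_water and gentle_slope) else "S'"


layers = {
    "dist_water_m": [[50, 150], [400, 900]],
    "slope_deg": [[2, 12], [1, 3]],
}
surface = apply_decision_rule(layers, example_rule)
```

Because the work is one evaluation per cell, the cost grows with the number of cells, which in turn depends on areal extent and resolution as noted above.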

Large-scale, government-funded projects
in the 1980s were responsible for the development of sophisticated procedures
for developing and assessing the application of archaeological predictive models.
Recent innovations in GIS technology and increased availability of data in GIS
format have made the application of these models a realistic and sensible component
of cultural resource management.

During the first developmental phase
of the project: (1) basic data for both independent (environmental characteristics)
and dependent (archaeological events) variables were gathered through conversion
of existing data, new field surveys, and an examination of Minnesota Statewide
Archaeological Survey manuscripts on file at the SHPO office; and (2) prototype
models were developed (Figures 2.1a, 2.1b, and 2.1c).

2.3.1 Basic Data Accumulation

The most accurate cultural resource
predictive models are based on field information gathered through probability-based
search procedures. In a probability-based search procedure, a controlled chance
process is used to select areas for survey. Places surveyed may be completely
randomly distributed or distributed randomly within selected areas of interest.
Models based on such surveys are more precise because their use of randomly located
field units reduces bias. Unlike intuitive or expert systems models, they do not
assume that the distribution of known archaeological properties is necessarily
a reliable guide to the distribution of all similar resources. The model must
apply equally well to parcels of land that do not contain cultural resources as
well as to those that do. Consequently, survey results must also include information
on places that were surveyed, but where no archaeological resources were found. The
State of Minnesota was fortunate in this project to already have a very large,
if still untapped, database gathered using "probability-based" search
procedures to some extent. This database was a product of the late 1970s
LCMR-funded Minnesota Statewide Archaeological Survey (MnSAS), which investigated more
than twenty counties (Minnesota Historical Society 1981). The Phase 1 prototype
models were built on that database, on the results of Mn/Model field surveys in
1995 and 1996, and on qualified Cultural Resource Management (CRM)
surveys (Chapter 5). Data from CRM surveys
in the state were considered qualified if a probabilistic or 100 percent search
procedure was utilized (in a 100 percent search procedure, an entire project area
is surveyed and rigorous field methods are employed).

Although the original statewide
survey was able to provide much of the project’s archaeological data for developmental Phase 1, that survey, when compared to
current survey standards, was both procedurally and statistically flawed. There
were four problems in interpreting statewide survey data. First, a 50 m transect
and shovel-testing interval was used, which meant that fewer sites might have
been encountered than with the contemporary standard 15 m interval. Second,
possible statistical errors occurred in selecting quarter-quarter (40 acres)
sections for survey. Although the initial selection of quarter-quarter sections
was random, the procedures used to stratify some counties and to select random
quarter-quarter sections for survey might have been biased. Third, the absence
of a standard stratification procedure meant that most survey data were not
directly comparable either between counties or to the Mn/Model surveys. Finally,
although a probabilistic search procedure was used, the fact that it was stratified
on the basis of archaeologists’ expectations involved some circular reasoning,
for the stratification procedure was designed to support archaeologists’ expectations
by focusing survey on areas where sites were thought to occur.

Because of these problems, two small
but more rigorous seasons of field surveys were conducted as part of Mn/Model
Phase 1 in 1995 and 1996. Besides providing additional, rigorously collected
field data, the purpose of the Mn/Model surveys was to allow an assessment and
standardization of the original probabilistic database so that it could be interpreted
appropriately for this project. The MnSAS database was compared to the databases
of these two surveys by the project statistician to determine: (1) the probability
that smaller sites are underrepresented in the results of MnSAS because of
the transect width; and (2) whether the stratified sampling strategy adopted
in MnSAS produced a biased sample of all existing surficial sites (see Chapter
5).

2.3.1.1 Archaeological Field Surveys

The goals of the Phase 1 Mn/Model
archaeological field surveys were: (1) to critically assess the effectiveness
of the MnSAS in locating sites of all sizes; (2) to add new field data to the
archaeological database using probability-based field procedures; and (3) to
ensure that all landscape types selected within the sampling frame (see below)
were tested regardless of their suspected site potential. The third goal was
as important in this project as the first two, for a cultural resource
predictive model must state with equal confidence where sites will
most likely not be found and where they most likely will be located. In
addition, the model-building process proceeds by contrasting the environmental
associations of archaeological sites with non-site locations. This sampling
procedure ensured the location of land parcels lacking archaeological resources.

Predictive models of archaeological
resource location are based upon a variety of basic landscape, vegetative, and
hydrological variables that differ widely in scale and nature. Some of these
variables, such as slope and aspect, are continuous while others, such as landform
and vegetation type, are discrete. A primary goal of a modeling project like
this one is to determine which combination of these variables produces
the highest accuracy in predicting resource presence or absence. Effective combinations
may vary both locally, within different parts of the landscape, and regionally,
between different environmental zones. Thus, the state-wide model is composed
of a series of submodels, each specific to an identifiable environmental region.
Since an essential component of the modeling process was to assess the relative
importance of landscape variables in environmentally different regions, the
first archaeological field survey was carried out in three different archaeological
resource regions (see Chapter 3) for which
MnSAS data were available.

The first field surveys were conducted
in the Prairie Lakes, Central Lakes Deciduous, and Central Lakes Coniferous
archaeological resource regions. Three field crews conducted the survey during
the warm weather months of 1995. Portions of Nicollet County (Prairie Lakes
region), Stearns County (Central Lakes Deciduous region), and Beltrami
County (Central Lakes Coniferous region) were inventoried for archaeological
properties.

To ensure compatibility for comparison
with data derived from the MnSAS, an enhanced version of the stratified random
sampling procedure was adopted. MnSAS surveys were conducted by sampling various
landscape types. The most basic distinction between types was simply "adjacent
to" and "away from" water. Finer subdivisions included "lakeshore,"
"streamshore," "marsh/wetlands," and "lake inlet/outlet."
Since the landscape types in the three surveyed archaeological resource regions
vary slightly, a stratification scheme based upon the basic adjacent-to-water
and away-from-water division used in MnSAS was employed. Each of these basic
divisions was further subdivided in accordance with the terminology used in
MnSAS and the landscape types present in each survey region. For example, in
Stearns County, "adjacent-to-water" was divided into "Stream/River,"
"Lake," and "Wetland," and "Lake" into "Larger
than 40 acres" and "Smaller than 40 acres" (a Minnesota Statewide
Archaeological Survey division). Besides providing a more detailed sample of
regional landscape formations than in MnSAS, the strategy ensured comparability
with those earlier surveys, for the survey results could be compared at different
levels of landscape hierarchy.

Public Land Survey System (PLSS) quarter-quarter
sections (40 acres each) representing the complete range of landscape types were selected
from United States Geological Survey (USGS) topographic maps using a stratified
random sampling procedure. These were then field surveyed for the presence or
absence of archaeological resources. In this survey, the sampling frames in the three archaeological resource regions varied in size of unit surveyed,
in portion of county surveyed, and in number of sections surveyed. This was because
of differences in landscape variability between counties and differential ease
of survey (e.g., forested areas are more time-consuming to survey than are agricultural
fields). A detailed rationale and description of this sampling strategy, its results,
and field procedures were provided in the field report (see Appendices
C and D).
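
The stratified random selection described above can be sketched as follows. This is a minimal illustration only, not the project's actual procedure: the stratum labels, the section identifiers, and the number of units drawn per stratum are all hypothetical.

```python
import random

def stratified_sample(units_by_stratum, n_per_stratum, seed=0):
    """Draw a simple random sample of units within each landscape stratum.

    units_by_stratum: dict mapping a stratum label to a list of unit IDs
    n_per_stratum: dict mapping a stratum label to the number of units to draw
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    sample = {}
    for stratum, units in units_by_stratum.items():
        # Never ask for more units than the stratum contains.
        k = min(n_per_stratum.get(stratum, 0), len(units))
        sample[stratum] = sorted(rng.sample(units, k))
    return sample

# Hypothetical quarter-quarter section IDs grouped by landscape type.
frame = {
    "Stream/River": ["S01", "S02", "S03", "S04"],
    "Lake > 40 acres": ["L01", "L02", "L03"],
    "Wetland": ["W01", "W02", "W03", "W04", "W05"],
    "Away from water": ["A%02d" % i for i in range(1, 21)],
}
sizes = {"Stream/River": 2, "Lake > 40 acres": 2, "Wetland": 2, "Away from water": 4}
selected = stratified_sample(frame, sizes)
```

Because every stratum is sampled separately, rare landscape types are guaranteed representation, which is also why results can be compared at different levels of the landscape hierarchy.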

The second Mn/Model Phase 1 archaeological
field surveys, carried out during the warmer months of 1996, were originally
intended to focus on scattered landscape types identified by the earliest prototype
models and sampled using a probability-based procedure. The main goal was to
determine whether particular landscape types consistently contained archaeological
properties. They were to provide data to test the first models and to test the
integrity of Minnesota’s archaeological
resource regions (Figure 2.2). The surveys were to occur in the
northeast corner of the state (the Border Lakes, Lake Superior Shore, and eastern
Central Lakes Coniferous regions), in the Red River Valley and Southeast Riverine
regions, in the Twin Cities Metro area, and in MnDOT District 8 (Figure
2.2) in the southwestern portion of the state.

However, there were a number of
problems with this approach. First, MnSAS surveys were not conducted in many
of these areas, so probability-based data were not available to build the
prototype models for them. Second, continuing to use a landscape stratification
regime was likely to reinforce the sampling bias inherent in MnSAS and other
previous surveys. Using the MnSAS approach of sampling locations near water
(a relatively small portion of the landscape), in preference to locations away
from water, was almost certain to find that the majority of sites would be near
water. Likewise, building models based on such biased sampling, then further
designing surveys to specifically test these models, as was planned, would fail
to extend our understanding of archaeological site location in Minnesota.

The survey approach was altered
to better suit the project’s needs. The alternative focused on sampling selected
10- to 40-acre parcels of land identified by GIS using computer-generated random
points for three counties (simple random sample). The goals of this survey were
(1) to compare the results of a completely random survey to those of the geographically
stratified, and therefore potentially biased, survey adopted the first summer;
(2) to provide data that could be used to test any future predictive models;
and (3) to estimate the a priori probability of finding an archaeological site.
A priori or "by chance" probabilities can be used (but are not necessary)
to rate the performance of predictive models. They can also provide valuable
information about how often sites occur across the landscape. These surveys
took place in Wabasha County (Southeast Riverine archaeological resource region),
in Wright County (Central Lakes Deciduous archaeological resource region), and
in Cass County (Central Lakes Coniferous archaeological resource region). A
detailed rationale and description of this sampling strategy, its results, and
field procedures are provided in Appendices C and D.
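
A simple random sample of points and the resulting a priori probability might be computed along the following lines. This is a sketch under stated assumptions: the rectangular extent, the coordinates, and the survey outcomes are hypothetical, and a real county would require clipping points to its boundary.

```python
import random

def random_points(xmin, ymin, xmax, ymax, n, seed=0):
    """Generate n uniformly distributed points inside a bounding rectangle."""
    rng = random.Random(seed)
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

def a_priori_probability(survey_results):
    """Fraction of randomly selected parcels in which a site was found.

    survey_results: list of booleans, True where the parcel contained a site.
    """
    if not survey_results:
        raise ValueError("no surveyed parcels")
    return sum(survey_results) / len(survey_results)

# Hypothetical extent in UTM metres and hypothetical survey outcomes.
points = random_points(400000, 4900000, 440000, 4940000, n=50)
results = [True] * 4 + [False] * 46
p_chance = a_priori_probability(results)  # 4 sites in 50 parcels
```

The "by chance" probability is simply the share of randomly chosen parcels that contained a site, so a model that does no better than this value adds no predictive information.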

2.3.1.2 Geomorphic Methods

The two major contributions geologists
and geomorphologists can make to a project of this nature are to interpret:
(1) the age of strata and surfaces; and (2) the likelihood of preservation of
archaeological resources within strata. The first contribution assists in determining
which land surfaces people could have occupied in the past. The first people
in Minnesota probably arrived as long ago as 12,000 B.P.; therefore, older buried
surfaces can be ignored in surveys here. The second contribution helps archaeologists
determine the probable presence and remaining integrity of buried sites. Sites
in retransported sediments are disturbed and, therefore, less significant according
to National Register criteria than those in undisturbed sediments.

Geomorphologists prepared a detailed
high resolution map of landform sediment assemblages within seven major river
valleys and one ancient lake basin. This layer was converted to GIS format at
the end of Phase 2. Attributes include landform units and, for each of these,
general sediment sequences, age estimates, and surface and buried potential
for archaeological sites. Fieldwork was necessary to document the underlying
geology and to correlate it with surficial patterns recognized on soils maps,
vegetation maps, or aerial photographs for development of the GIS database.

Fieldwork included describing undisturbed
core samples collected from uplands to bottomlands along predetermined transects
within each of the eight MnDOT districts. MnDOT drill-rig teams were utilized
when cores were required within existing highway rights-of-way; otherwise, a
combination of a Giddings soil probe and contracted drillers was used on private
property. Selected organic materials were collected, identified (when possible),
and radiocarbon-dated to help determine the
absolute ages of buried horizons and bracket the ages of land surfaces. Geomorphology
was used to bracket the ages of strata and land surfaces based on crosscutting
relations (for example, a terrace inset beneath a higher terrace is younger
than the higher terrace because of the principle of down-cutting relations in
a stream system). Eighty radiocarbon dates were included in the geomorphic study
to build a time framework for the predictive model (see Chapter
12 and Appendix E).

2.3.1.3 The Archaeological
Database

Because Mn/Model is designed to
predict high, medium, and low probability zones for the location of precontact
and contact period archaeological sites, one of the early priorities of this
project was to establish the locations of known sites and areas where no sites
were identified. The information used to compile this archaeological database
was gleaned from several different sources (see Chapter
5). Chief among them was the recently digitized archaeological database
from the State Historic Preservation Office (SHPO), which contains all known
and verified sites in the state except more recently discovered sites on federal
lands, which are in the process of being added. Other sources of archaeological
site information were the Chippewa and Superior National Forest databases, the
National Park Service database, and data from the Mn/Model archaeological surveys
in 1995 and 1996.

All known sites were represented
in the database by the UTM coordinates of their centroids, as originally
recorded in archaeologists’ survey reports. There was no attempt to establish
or include information about site boundaries. It was felt that site boundaries
were often too nebulous to digitize as finite entities. Moreover, digitizing
site boundaries could not have been accomplished within the budget and time
frame of the project. In addition to the archaeological sites, negative survey
points (nonsites), which are necessary for building an accurate predictive model,
were collected from a variety of survey projects (MnSAS, Mn/Model, and CRM).
All points, positive and negative, were incorporated into a separate database
for each of Minnesota’s 87 counties. Because of other considerations, some counties
had more than one database. Chapter 5 discusses
the data collection, database assembly, and quality control methods in detail.

The predictive model
was intended to forecast the spatial locations of both precontact (pre-A.D.
1650) and contact (A.D. 1650-1837) period archaeological sites. Although these
periods together lasted for ca. 11,000 years, archaeological sites dating to
this time span share many features, including common locational factors, the
rarity of visible, standing structures, and the need for field search procedures
to locate them. No attempt was made in this project to model the location
of postcontact (historic) period sites. The reasons for their location are different
from those of the precontact and contact periods. In addition, their location
can usually be more easily and reliably accessed through archival records. Historic sites will be added as a thematic layer to the GIS database as
a later enhancement, but will never be used for modeling.

The total number
of sites employed in the Mn/Model archaeological database that contain a contact
period component but lack a precontact occupation (45) is less than one percent
of the total sites occupied during the precontact period (5,743). Twenty-eight
of these sites are placed within the functional types of trading post (n=9),
mortuary (n=10), basecamp (n=1), wild ricing (n=6), sugar
mapling (n=1), and quarries (n=1). The remaining 17 sites have
not been assigned a function. The 45 contact period sites lacking any precontact
components are spread out over 15 of the 20 Ecological Classification System
subsections in Minnesota (Figure 3.11), including
the Anoka Sand Plain (n=6), Aspen Parklands (n=1), Big Woods (n=1),
Border Lakes (n=5), Chippewa Plains (n=3), Glacial Lake Superior
Plain (n=1), Hardwood Hills (n=1), Mille Lacs Uplands (n=6),
Minnesota River Prairie (n=3), North Shore Highlands (n=1), Rochester
Plateau (n=2), Pine Moraines and Outwash Plains (n=8), St. Croix
Moraines and Outwash Plains (n=2), North Shore (n=1) and the Tamarack
Lowlands (n=4). Given the relatively small number of these contact period
sites, their inclusion has a minimal impact on the Phase 1, 2, and 3 models.
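
The counts quoted in the preceding paragraph can be cross-checked with a short tally. The figures are transcribed from the text; only the variable names are introduced here for illustration.

```python
# Contact-period-only sites by functional type, as listed in the text.
function_counts = {
    "trading post": 9, "mortuary": 10, "basecamp": 1,
    "wild ricing": 6, "sugar mapling": 1, "quarries": 1,
}
unassigned = 17  # sites with no assigned function

# The same sites by Ecological Classification System subsection.
subsection_counts = {
    "Anoka Sand Plain": 6, "Aspen Parklands": 1, "Big Woods": 1,
    "Border Lakes": 5, "Chippewa Plains": 3,
    "Glacial Lake Superior Plain": 1, "Hardwood Hills": 1,
    "Mille Lacs Uplands": 6, "Minnesota River Prairie": 3,
    "North Shore Highlands": 1, "Rochester Plateau": 2,
    "Pine Moraines and Outwash Plains": 8,
    "St. Croix Moraines and Outwash Plains": 2,
    "North Shore": 1, "Tamarack Lowlands": 4,
}

contact_only = sum(function_counts.values()) + unassigned
share = contact_only / 5743  # relative to sites with a precontact occupation
```

Both breakdowns total 45 sites across 15 subsections, and the share is a little under 0.8 percent, consistent with the "less than one percent" stated above.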

2.3.2 Developing
the Initial GIS Models

The end-products of this phase of
the project were prototype GIS models for five archaeological
resource regions. These models were based on the assumptions and principles
discussed earlier in this chapter. Among the steps involved in developing the
models were: (1) the selection of environmental attributes and variables; (2)
the conversion of these data to a standard GIS format; (3) the addition of the
locations of known precontact and contact period sites and negative survey points
as a cultural resources thematic layer; (4) the definition of measurement standards
for the independent and dependent variables; and (5) the statistical analyses
of these data to derive the prototype models. More than 40 environmental variables
were incorporated into the GIS model as thematic layers in Phase 1. Statistical
analyses eventually reduced this number to fewer than 12 for most models. Specifics
of these procedures are discussed in Chapters
4, 6, and 7,
and in Appendix B.

Following the first archaeological
field survey in 1995, the modeling tasks were carried out in a pilot project
for Nicollet County. The purpose of the pilot was to assess how these five components
of model building could be most efficiently executed. The first major deliverable
consisted of a conceptual model for Nicollet County. Methods developed in the
process of building this conceptual model were refined and applied to the development
of the Phase 1 prototype models.

By the end of Phase 1, information
from the two seasons of field survey and all counties included in the MnSAS
was incorporated into five regional models. The Phase 1 models were: (1) based
on a modest sample of precontact and very small sample of contact period archaeological
resources from probability-based surveys; (2) high resolution (30 m x 30 m cells);
and (3) delineated low, medium, and high probability zones for the presence
of archaeological resources. These Phase 1 models were applied to 27 counties
in five archaeological resource regions. These were the only counties in the
state in which probability-based surveys had been conducted.

The goals of the second phase of
the project were to add archaeological and environmental data
to the GIS database, to refine modeling techniques, and to extend modeling to
the entire state. An additional goal was to develop information on major river
valley sediments and the paleoclimate of Minnesota, so that the state’s paleoenvironments
could be better represented.

2.4.1 Refining
and Extending the Models

Archaeological and environmental
data from counties not considered in Phase 1 were incorporated into the
GIS in this phase. This allowed the model to be extended statewide.

Additional statewide environmental
databases incorporated into this modeling phase included historic vegetation
from the Minnesota DNR (based on Marschner, 1974) and low resolution Quaternary
geology from MGC100 (see Chapter 6). Statewide
environmental data are referred to as "basic" data and were used to
develop a set of consistent, statewide models. Additional layers that were not
available statewide were developed and tested in certain areas. These included
high resolution soils data (available for only 48 counties), bedrock geology
(for selected counties in southeastern Minnesota), and Trygg maps of historic
cultural features (digitized for 20 counties). These data were incorporated
into "enhanced" models for the areas they covered. Comparison of these
models to the basic models allowed evaluation of whether the enhancement data
sets contributed significantly to model improvement.

Lessons learned in developing the
Phase 1 prototype models were used to refine the original modeling techniques
(Chapter 7). Evaluation of Phase 1 prototype
models showed the locations of negative survey points to be closely associated
with site locations because of the stratified sampling procedure adopted and
SHPO survey recommendations based on intuitive site probability areas. To be
certain that characteristics of sites were compared with a representative sample
of the landscape, the Phase 2 modeling procedures used truly random points,
generated by the GIS, as nonsites. Although the random points were not surveyed
for the presence of sites, site density figures generated for regions of the
state (see Chapter 5) indicated that the
likelihood of a site falling on a random point was very low. It was felt that
the availability of large numbers (usually several thousand per subregion) of
true random points captured the environmental diversity of an area and more
than compensated for this risk.

Another important Phase 2 refinement
was in the use of S-Plus statistical software, rather than ARC/INFO
GRID, for the logistic regression (Chapter
7). S-Plus provided more rigorous methods for narrowing down the very large
number of environmental variables for inclusion into the models. This assured
that the best combinations of variables would be incorporated into the models.
The S-Plus output also provided information about the regression analysis that
aided in the comparison and evaluation of the models.
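
The idea of contrasting known sites with GIS-generated random points in a logistic regression can be illustrated with a small, self-contained sketch. It is not the project's S-Plus procedure: the single variable (distance to water), the synthetic data, and the gradient descent fitting routine are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent.

    X: list of feature vectors; y: list of 0/1 labels
    (1 = known site, 0 = random point used as a nonsite).
    Returns [intercept, coef_1, ..., coef_k].
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    """Raw model value (0 to 1) for one cell's feature vector."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Synthetic, hypothetical data: distance to water in km.
rng = random.Random(1)
sites = [[rng.uniform(0.0, 0.5)] for _ in range(50)]      # sites cluster near water
nonsites = [[rng.uniform(0.0, 3.0)] for _ in range(500)]  # random landscape points
X = sites + nonsites
y = [1] * len(sites) + [0] * len(nonsites)
w = fit_logistic(X, y)
```

With data like these the fitted coefficient for distance to water is negative, so cells near water receive higher raw model values than cells far from it. Because the random points sample the whole landscape, the contrast reflects where sites occur rather than where surveyors chose to look.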

Another change in Phase 2 procedures
was in the assignment of areas to high, medium, and low probability classes.
In Phase 1, raw model values were classified into high, medium, and low probability
areas that each occupied 33 percent of the landscape. In Phase 2, a weighted technique
was used that aimed to: (1) minimize the size of the areas classified as high
and medium probability; (2) capture a large proportion of known sites
in those areas; and (3) maximize the gain statistic for the model (Chapter
7).
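
The Phase 1 equal-area classification can be sketched as a ranking of cells by raw model value, cut into thirds. This is an illustrative reconstruction only; the function name and the sample values are hypothetical, and the Phase 2 weighted technique is not reproduced here.

```python
def equal_area_classes(values):
    """Label cells low/medium/high so each class holds about one third of cells."""
    n = len(values)
    # Indices of cells ordered from lowest to highest raw model value.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        if rank < n / 3:
            labels[i] = "low"
        elif rank < 2 * n / 3:
            labels[i] = "medium"
        else:
            labels[i] = "high"
    return labels

# Hypothetical raw model values for nine cells.
raw = [0.05, 0.80, 0.33, 0.10, 0.95, 0.47, 0.62, 0.21, 0.71]
classes = equal_area_classes(raw)
```

Equal-area cuts fix the size of each class in advance, regardless of where the known sites fall, which is the limitation the Phase 2 procedure was meant to address.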

Two types of models were developed
in Phase 2. They used exactly the same modeling procedures, but were based on
different databases. First, "basic" models were developed using only
the basic (statewide) data. Since all archaeological resource regions share
these data layers, these models are comparable with one another. After basic
models were developed, "enhancement" data were added to the database
for certain areas, and "enhanced" models were built. These enhanced
models could be compared with the basic models to determine the contribution
of the additional higher resolution data. Since the enhanced data layers seldom
overlapped, models enhanced by Trygg data were often in different parts of the
state than models enhanced by soils data.

Throughout the early stages of Mn/Model,
there was discussion as to the contribution of single artifacts and lithic scatters
to the archaeological database (see Section 6.4.1.1). It was determined early on that single artifacts could
be found almost anywhere in the landscape, so they were excluded from the analysis
in Phase 1. However, at the beginning of Phase 2, the value of lithic scatters
was still unclear. Consequently, two sets of Phase 2 basic and enhanced models
were developed. The first excluded single artifact find spots from the population
of known sites, as did the Phase 1 models. The second set also excluded lithic
scatters. This permitted evaluation of the contribution of lithic scatters to
the performance of the Phase 2 models (Section
8.6.2 and Table 8.6.4).

Together with the larger site populations
available for modeling, these model refinements resulted in significant improvements
over the Phase 1 models (see Section
8.6.2).

2.4.2 Developing
the Third and Fourth Dimensions of the Model

In nearly all large-scale archaeological
site location predictive models, modeling the location of deeply buried sites
is absent. These are sites often encountered in river valley projects, such
as bridge repairs or construction, and they are expensive to excavate. Development
of a three-dimensional predictive model of archaeological site location in such
areas was essential to this project.

Of equal importance in cultural resource management is understanding the fourth dimension of the age of
strata and surfaces in alluvial deposits. Not all deeply buried sites are of
equal National Register significance. Very old Paleoindian camp sites, for example,
tend to be small in area, with light artifact scatters. However, they are highly
significant because of their age and rarity in the state. Modeling the fourth,
or temporal, dimension of buried deposits is a means of providing information
to cultural resource managers that will help them assess site significance and
integrity. As mentioned earlier, 80 radiocarbon dates included in the geomorphic
study were used to build an alluvial valley temporal framework for the geomorphic
model. The geomorphic information gathered in Phase 1 was incorporated
into a GIS format in Phase 2 to add depth and time dimensions to the database.
These dimensions are represented in maps of the distribution, in major river
basins, of sediments that may or may not contain buried archaeological sites
with contextual integrity (Chapter
12).

As a companion to the geomorphically-derived
temporal dimension within river valleys, a paleoclimate team provided additional
temporal data to the GIS team in Phase 2. The paleoclimate team modeled the
climate (temperature/precipitation) and vegetation of Minnesota for the past
12,000 years, using a paleoclimate reconstruction tool developed by Dr. Reid
Bryson at the University of Wisconsin-Madison. Their paleoclimatic models were
linked to eight different types of pollen, from pollen profiles, to model climate/vegetation
shifts across the state at a temporal resolution of every 100 years.

Since it is assumed that precontact
people in Minnesota settled near critical resources, such as water and shelter,
it is important to determine, when developing a predictive model, how these
resources shifted through time. Raw output from the paleoclimate and paleovegetation
models was provided to the GIS team. Surface modeling techniques were used
to convert these raw data to maps (trend surfaces). However, evaluation of the
results determined that these data were too coarse to be useful for the purposes
of this project (see Chapter 6).

Major research tasks initially identified
for the third phase of the project were to: (1) develop a model of suitable
pre-Woodland habitats for predicting locations of very old or buried sites;
(2) test the geomorphic models with archaeological data; and (3) enhance the
statewide predictive model with additional data and refined methods, developing
a model of areas not adequately surveyed, and developing a layer of confidence
in the model. Simultaneously, the preliminary Phase 2 models were applied to
cultural resource management projects and, based on this experience, development
of an implementation plan was begun.

2.5.1 Modeling
Pre-Woodland Habitats

The intention was to use data from
the Minnesota Department of Natural Resources (DNR) statewide geomorphic mapping
project as the basis for modeling pre-Woodland habitats statewide. However,
these data were not ready in time, so this part of the project was abandoned.

2.5.2 Testing
Geomorphic Models with Archaeological Data

Geomorphic models of river valleys
and selected upland quads were developed in all phases of Mn/Model without reference
to the locations of archaeological sites. In Phase 3, these models were tested
using the archaeological database. Details of these tests are provided in Section
12.17.

2.5.3 Enhancing
the Statewide Predictive Model

Through Phase 2 of this project, the basic
models produced gain statistics ranging from 0.28 to
0.89. They classified between 8 percent and 63 percent of the landscape as high
and medium probability and correctly predicted 59 percent to 87 percent of the
known archaeological sites. A project goal was to develop models with a minimum
gain statistic of 0.61. This means no more than 33 percent of the landscape
is classified as high and medium probability, and at least 85 percent of the
known sites are correctly predicted in each region. The primary objective of
Phase 3 was to improve the models to achieve this project goal. Improved models
should have the following characteristics:

They should predict known sites
at least as well as the preliminary models.

Less land area should be occupied
by medium and high probability values.

The model should have fewer variables,
to minimize storage space and processing time and to facilitate model interpretation.
There are also reasons to believe that fewer variables will increase the model’s
ability to predict future data sets, as additional variables may simply reflect
noise or anomalies in the data set used to build the model.
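
The relationship between the project goal of 0.61 and the 33 percent and 85 percent figures given above can be checked with a short calculation. The sketch assumes the gain statistic is Kvamme's gain, defined as one minus the ratio of the percentage of area to the percentage of sites, which is consistent with the figures in the text; the function name is hypothetical.

```python
def gain(pct_area, pct_sites):
    """Kvamme's gain: 1 - (percent of landscape in the zone / percent of sites in the zone)."""
    if pct_sites <= 0:
        raise ValueError("pct_sites must be positive")
    return 1.0 - pct_area / pct_sites

# Project goal: at most 33 percent of the landscape classified as high and
# medium probability, capturing at least 85 percent of known sites.
goal = gain(33, 85)
```

Here `goal` evaluates to roughly 0.61. The statistic rises when the same share of sites is captured in a smaller area, and it approaches zero when the zone captures sites no better than its share of the landscape would by chance.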

A second modeling objective in Phase
3 was to develop a model of survey bias. Phase 1 and 2 models predicted negative
survey points almost as well as they predicted sites in some regions. This indicated
a degree of bias in the selection of places to survey. Knowing this posed an
interpretation problem: were areas classified as low probability because there
were no sites there, or because they had not been surveyed? By developing a
model of survey bias, undersurveyed areas can be distinguished from low probability
areas.

The third Phase 3 modeling objective
was to develop a layer of confidence in the model. Model results depend on the
quality of the data used. Data quality is not constant across the state. It
was determined that a GIS thematic layer indicating the degree of confidence
to place in the model, as a function of the quality of data input, would help
interpretation and implementation.

2.5.3.1 Improving the Model

Several approaches were taken to
improve model performance. These included: changing the model regionalization
scheme; incorporating new environmental data layers; increasing the number of
sites used to develop the models; and simplifying procedures for classifying
raw model values into high, medium, and low probability categories.

Phase 2 models exhibited patterns
that were artifacts of the boundaries of the archaeological resource regions
(Anfinson 1990) used for model development (Figure
3.10). These consisted of high probability cells clustered along regional
boundaries (Figure 8.4). Often, these boundaries
did not represent natural features. Also, some of the archaeological regions
were more environmentally homogeneous than others. The more homogeneous regions
tended to produce better models. At the beginning of Phase 3, the Ecological
Classification System (Hanson and Hargrave 1996) sections and subsections became
available as a GIS layer from Mn/DNR (Figure 3.11).
It was adopted as the Phase 3 regionalization scheme because its units were
based on combinations of environmental factors (geomorphology, hydrology, soils,
historic vegetation) and its boundaries followed natural features.

In Phase 3, two new datasets from
Mn/DNR were incorporated into the development of the models. These were the
major and minor watershed boundaries and the distribution of bearing trees from
the PLSS records. Watershed boundaries were used to determine locations of divides
(ridges) and watershed areas. Bearing tree data were substituted for the very
low resolution tree species distribution data used in Phases 1 and 2.

Phase 1 and 2 model performance
was clearly related to the numbers of archaeological sites used to build the
models. Two tactics increased the numbers of sites used to build the Phase 3
models. First, lithic scatters were always included in the training dataset.
Second, sites from all sources were included. This was a difficult decision
because it was a departure from using data only from probabilistically-based
surveys. However, site numbers were so low in parts of the state that it was
deemed necessary. Even with these concessions, models suffered from the paucity
of sites. Finally, after most regions had been modeled, the project team, in
consultation with the project advisors, decided to combine the training and
testing datasets and use the entire population of known sites (excluding single
artifacts) to build the last round of Phase 3 models.

In Phase 2 models, raw model values
were reclassified into high, medium, and low probabilities based on a complicated
system that was intended to maximize the gain statistic. However, the procedures
were difficult to use consistently and the results often maximized the gain
statistic at the expense of the numbers of sites predicted. After a re-consideration
of the project goals, it was determined that the number of sites predicted
was more important than the gain statistic or size of the area classified as
high and medium probability. Phase 3 procedures were simplified to emphasize
consistency in site predictions, with a specific objective of including as close
to 85 percent of the sites as was possible in the high and medium probability
areas. This sometimes came at the expense of the size of the area classified
as high and medium probability and of the gain statistic.
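
One way to picture the Phase 3 emphasis on capturing about 85 percent of sites is to pick the raw model value above which that share of known sites falls. The sketch below is an assumption about how such a cutoff could be found, not the documented procedure; it handles only the boundary between the low class and the combined high and medium classes, and the site scores are hypothetical.

```python
import math

def cutoff_for_site_capture(site_values, target=0.85):
    """Return the highest raw model value v such that at least `target`
    of the known sites have values greater than or equal to v."""
    if not site_values:
        raise ValueError("no site values")
    ranked = sorted(site_values, reverse=True)
    needed = math.ceil(target * len(ranked))  # sites that must be captured
    return ranked[needed - 1]

# Hypothetical raw model values at 20 known site locations.
site_scores = [0.91, 0.15, 0.78, 0.66, 0.40, 0.88, 0.52, 0.95, 0.30, 0.73,
               0.81, 0.60, 0.22, 0.69, 0.84, 0.57, 0.49, 0.90, 0.35, 0.76]
cut = cutoff_for_site_capture(site_scores)
captured = sum(v >= cut for v in site_scores) / len(site_scores)
```

Fixing the share of sites captured makes results consistent between regions, but the area above the cutoff, and therefore the gain statistic, is left to vary with how well each model separates sites from the rest of the landscape.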

2.5.3.2 Modeling Survey Bias

Evaluation of Phase 1 and 2 models
provided evidence of an already suspected bias in the locations of past surveys.
Extensive parts of the state not only contained no sites, they also contained
no negative survey points. It was suspected that the primary bias had been for
archaeologists to look for sites near water bodies. However, it seemed it should
be possible to model this bias using the same methods as were used to model
site potential. In Phase 3, models of survey probabilities were developed using
all surveyed places (negative survey points and sites of all types from all
sources) as the instance "resources present" and random points as
the instance "resources absent" (Chapter
7). High, medium, and low probabilities in these models indicate the likelihood
that places like those so classified have been adequately sampled. These models
were then combined with the site probability models to make the survey implementation
models, which indicate site probabilities weighted by the probability that an
area has been adequately surveyed. Where low survey probabilities and low site
potential intersect, the area is classified as "unknown." This is
designed to prevent misinterpretation of extensive areas as low site potential
when they have not been adequately surveyed, and it will help MnDOT justify surveys
in such places.
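
The one combination rule stated explicitly above, that low survey probability together with low site potential yields "unknown," can be written as a small lookup. Everything else in this sketch is a simplifying assumption: the function name is hypothetical, and all other combinations simply keep the site probability class, whereas the actual survey implementation models weight site probabilities in ways described in Chapter 7.

```python
def implementation_class(site_class, survey_class):
    """Combine a site probability class with a survey probability class.

    Both arguments are one of "low", "medium", "high". Only the rule stated
    in the text is implemented: low survey probability combined with low
    site potential gives "unknown". Other combinations return the site
    class unchanged (an assumption made for this illustration).
    """
    valid = {"low", "medium", "high"}
    if site_class not in valid or survey_class not in valid:
        raise ValueError("classes must be low, medium, or high")
    if site_class == "low" and survey_class == "low":
        return "unknown"
    return site_class

# A cell with low site potential in a poorly surveyed setting.
example = implementation_class("low", "low")
```

The effect is that a low probability rating is reported only where comparable places have actually been surveyed.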

2.5.3.3 Developing a Layer
of Confidence

Originally, the confidence layer
had been intended as an indicator of data quality. However, given the amount
of information available at the end of Phase 3 modeling, it became feasible
to develop two different confidence layers. The first was, as originally envisioned,
an indication of the varying quality of data across the state (Figure
4.1). Data quality is considered an important determinant of model results.
The second confidence layer was based on measures of the stability and performance
of the preliminary Phase 3 models (Figure 7.3).
Methods for developing these layers are discussed in Chapter
7.

2.5.4
Model Demonstration

While the Phase
3 models were being developed, MnDOT Cultural Resources Staff began using the
Phase 2 models to help scope/review projects. Models, as well as underlying
data layers, were viewed in ArcView. Consultants were asked to map their own
intuitive models for the project area before field work commenced. The consultant
was never shown Mn/Model. The MnDOT Cultural Resources Staff determined which
areas would be surveyed. These surveys were designed to test both Mn/Model and
the consultant's intuitive model. Refer to Chapter
9 for more information about these procedures.

2.5.5 Final
Implementation

Final implementation of Mn/Model
at MnDOT began in late 1999, following completion of the project documented
in this report. It involved a number of tasks, including data preparation and
distribution, application development, platform transfer, and completion of
documentation.

First, the data and models were
made accessible to MnDOT staff. All project deliverables were inventoried at
MnDOT and missing items were requested from the consultant. Then, the data
and models were organized and copied to the MnDOT GIS server. A user interface
was developed as an ArcView Extension to simplify access to these GIS layers.
Separate training sessions were prepared for the use of Mn/Model environmental
and archaeological data and predictive models and for use of the geomorphic
models. These training sessions were provided to selected MnDOT staff and representatives
from other government agencies.

Because Mn/Model was developed on
the UNIX platform, the ARC/INFO AMLs and S-Plus scripts used in Phases 1 through
3 contained UNIX system commands. During the course of the project, MnDOT changed
its standard GIS platform from UNIX to Windows NT. So that they will be useful
for future rounds of modeling, the Mn/Model AMLs and S-Plus scripts are being
edited to replace UNIX system commands with the equivalent NT commands. At the
same time, they are being thoroughly documented.

Project documentation was also being
improved and completed. This report was extensively edited and reformatted to
make it ready for publication on the Internet. A web site was designed, and
additional information about the project was prepared for inclusion. A manual
was prepared for use by future geomorphology consultants who may be hired to
perform further mapping of landform sediment assemblages. Metadata
were edited and updated. A process model was developed to facilitate transfer
of knowledge about the modeling process to future staff or to other states.
Data modeling was carried out alongside development of the process model, both
to document and to improve the Mn/Model database design.

Finally, the Mn/Model Implementation
Plan was drafted. It guides the use of Mn/Model by MnDOT and other agencies.
It defines both who can use the models and how the models are to be used.

During the implementation phase,
Cultural Resources staff continued using Mn/Model for planning, scoping, and
reviewing projects at MnDOT. Ongoing testing of the model with project data, combined
with periodic data and model updates, will ensure that Mn/Model's performance
improves over time.

An instructive lesson from large-scale
projects like Mn/Model is how much time it actually takes to complete the numerous
tasks that together make up the phases of the project. Because of the size
and complexity of the project, a long-range, coordinated strategy was established
to guide the development of the predictive models in a timely manner. Primary
project activities were identified and assigned a task number. All of the tasks
were organized within work plan flow charts that indicated the relationship
between tasks, their estimated dates of completion, and the level of effort
required for each task. Figure 2.3 contains the preliminary and final
project schedule. Figure 2.4 contains the
preliminary and final deliverable schedule.

Overall, the project extended for
more than a year beyond its originally intended end date. Conversion of environmental
data to GIS format was one reason for this extension. At the outset of the project,
the scope specified a model with a 100 m resolution. A number of statewide environmental
data layers were available at that cell size. However, in the early project
stages, team archaeologists began demanding a higher resolution model, with
a 10 m cell size. A compromise was reached for a 30 m cell size, the resolution
of the 1:24,000 USGS digital elevation models (DEMs).
However, very little data from 1:24,000 sources had been converted to GIS format.
The DEMs were not complete for the entire state. The National Wetlands Inventory
(NWI) had just been completed but was still being converted to
ARC/INFO format by the Land Management Information Center (LMIC). County soil
surveys, when available digitally, were in EPPL7 format.

State agencies were very cooperative
in providing data as quickly as they could. Mn/DNR had already converted the
available 1:24,000 DEMs to ARC/INFO GRID format and provided them to the project.
LMIC accelerated their schedule for conversion of NWI data to ARC/INFO coverages
and provided us with sets of coverages as soon as they were ready. LMIC also
acted as an intermediary in acquiring the available soil data. Still, a large
number of conversion tasks remained for the Mn/Model GIS team once data were
received. The statewide extent and high resolution of the environmental databases
required that conversion tasks be done by regions because of hard drive and
processor limitations.
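
The effect of cell size on data volume can be estimated with simple arithmetic:
the number of cells in a raster layer grows with the square of the ratio between
the old and new cell sizes. The sketch below is illustrative only. The statewide
area is an approximate figure assumed here, not a value taken from the project,
and the function names are hypothetical.

```python
# Approximate total area of Minnesota in square kilometers.
# This is an assumption for illustration, not a project figure.
STATE_AREA_KM2 = 225_000


def cell_count(area_km2, cell_size_m):
    """Number of square raster cells needed to cover area_km2."""
    cells_per_km2 = (1000 / cell_size_m) ** 2
    return area_km2 * cells_per_km2


def relative_volume(coarse_m, fine_m):
    """Factor by which the cell count grows when moving to the finer cell size."""
    return (coarse_m / fine_m) ** 2


if __name__ == "__main__":
    for size in (100, 30, 10):
        cells = cell_count(STATE_AREA_KM2, size)
        factor = relative_volume(100, size)
        print(f"{size:>3} m cells: {cells:,.0f} per layer ({factor:.1f}x the 100 m count)")
```

Under these assumptions, a 30 m layer holds roughly eleven times as many cells as
a 100 m layer of the same extent, and a 10 m layer would have held one hundred
times as many. This is consistent with the need to process the state by regions.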

When data conversion was completed
for one region, modeling of that region commenced, and data conversion for another
region began. Simultaneous data conversion and modeling provided some efficiencies.
However, the need to work out conversion and modeling procedures during this
phase militated against optimum progress. Consequently, Phase 1 was extended
eight months beyond its originally intended end date. Repeating the work
with procedures already developed would take much less time.

Another under-estimate was the
time required to map river valleys and upland landforms and convert these geomorphic
maps to GIS format. The project geomorphologists, who had no previous experience
with preparing data for GIS, did not anticipate the amount of time that would
be required for quality control and preparation of attribute data. Consequently,
these GIS layers were not ready in time to be incorporated into the
site probability models.

Another task slighted in the original
research plan was that of interpreting the models. This time-consuming task
was initiated towards the end of Phase 2. However, since Phase 2 models were
still considered preliminary and there was little time remaining, a rigorous
interpretation was attempted for only one region. This provided an opportunity
to work out interpretation procedures that could be applied to the Phase 3 models.
Still, interpretation of a large number of regional models takes time and cannot
be started until a full set of models has been completed for at least one region.
Moreover, in the careful examination of models that occurs in interpretation,
inconsistencies pointing to data or modeling errors sometimes become apparent.
Correcting these problems and creating new models should be an option at this
point. In future projects, several months at the end of the final model development
phase should be set aside for interpretation.

Finally, the original research plan
under-estimated the time required to adequately document such a complex project.
Originally, only two months were allotted for preparation of the reports for
Phases 1 and 2, and seven months for documenting the project at the
end of Phase 3. In practice, report preparation, review, and revision took six
months at the end of each of Phases 1 and 2. Although the additional work done
in those phases provided a solid foundation for much of the final project report,
no one anticipated the extent to which methods would change in Phase 3 or the
time it would take to interpret and document each subregional model. Moreover,
key personnel changes during the documentation phase delayed
the project by four months. In addition, the requirement to provide this
report in Web-ready format added time to its completion. Finally, the time required
to perform quality control on the deliverables, in particular this report, far
exceeded the original expectations.

Hammer, J.
1993 A New Predictive Site Location Model for Interior New York
State. Man in the Northeast
45:39-76.

Hanson, D.H. and B.C. Hargrave
1996 Development of a Multilevel Ecological Classification System
for the State of Minnesota.
Environmental Monitoring and Assessment
39:75-84. Kluwer Academic Publishers.
Netherlands.

Hasenstab, R. J.
1983 A Preliminary Cultural Resource Sensitivity Analysis for
the Proposed Flood Control
Facilities Construction in the Passaic
River Basin of New Jersey. Soil Systems, Inc.
Submitted to the Passaic River Basin Special
Studies Branch, Department of the Army. New
York District Army Corps of Engineers,
New York.

James, S.E., and R. Knudson
1983 Predicting Site Significance: Management Applications of High-Resolution
Modeling. Paper
presented at the 48th Annual Meeting of
the Society for American Archaeology, Pittsburgh.

Kvamme, K.L., and M.A. Jochim
1989 The Environmental Basis of Mesolithic Settlement. In The
Mesolithic in Europe: Papers
Presented at the Third International Symposium,
edited by C. Bonsall, pp. 1-12. John
Donald Publishers, Edinburgh.

Marschner, F. J.
1974 The Original Vegetation of Minnesota. Compiled from
U.S. General Land Office Survey
notes. North Central Forest Experiment
Station, Forest Service, U.S. Department of Agriculture.

Maynard, P.F., and A.H. Strahler
1981 The Logit Classifier, a General Maximum Likelihood Discriminant
for Remote Sensing
Applications. Proceedings of the Fifteenth
International Symposium on Remote Sensing of
Environment, pp. 213-222. International
Society of Electrical and Electronic Engineers, Ann
Arbor, Michigan.

Sebastian, L. and W.J. Judge
1988 Predicting the Past: Correlation, Explanation, and the use
of Archaeological Models. In Quantifying the Present and Predicting
the Past: Theory, Method, and Application of
Archaeological Predictive Modeling, edited by W.J. Judge and L. Sebastian, pp. 1-18. U.S.
Government Printing Office, Washington,
D.C.

Shermer, S.J., and J.A. Tiffany
1985 Environmental Variables as Factors in Site Location: An Example
From the Upper Midwest. Midcontinental Journal of Archaeology 10:215-240.

Stone, D.F.
1984 A Regional Synthesis and Archaeological Site Location Prediction
Model for South
Coastal Santa Barbara County, California.
Unpublished Master's thesis, Department of
Anthropology, University of California,
Santa Barbara.

Thomas, D.H., and R.L. Bettinger
1976 Prehistoric Piñon Ecotone Settlements of the Upper Reese
River Valley, Central
Nevada. Anthropological Papers of the
American Museum of Natural History 53(3). New
York.

Williams, L., P.H. Thomas, and C.L. Bettinger
1973 Notions to Numbers: Great Basin Settlements as Polythetic
Sets. In Research and Theory in
Current Archaeology, edited by C.L.
Redman, pp. 215-238. John Wiley, New York.