2 Data resources developed for biodemographic studies

Several data sets were collected specifically to study familial resemblance in human
longevity and to estimate the heritability of human life span. These data sets deserve
to be made available for further studies of familial clustering of human longevity.

2.1 Data on Finnish and Swedish middle class and noble families

Data comprising 12,768 cases were collected from various published Finnish and Swedish
nobility and middle-class genealogies by Dr. Eva Jalavisto from the University of Helsinki
(Finland) in 1951. This study was supported by the Finnish Cultural Foundation ("Suomen
Kulttuurirahasto"). The results of the data analysis were published [91], but the data were
not used later. We attempted to contact Dr. Jalavisto to request her database
and learned that, unfortunately, she died in 1966. We plan to contact the colleagues
of Dr. Jalavisto (Dr. P.C. Holmberg in particular) at the Department of Physiology,
University of Helsinki, in order to find out the fate of this data collection.

2.2 French data on the village of Arthez d'Asson

This data set is the reconstruction of families from 1744-1975 based on the parish
registers and civil registers (5196 births, 1538 marriages, 4365 deaths) for the village
of Arthez dAsson situated in the Ouzom Valley in Bearn (Department of Pyrenees
Atlantiques). The data were collected and analyzed by Dr. Jean-Pierre Bocquet-Appel
(Laboratoire dInformatique pour les Sciences delHomme, Centre National de la
Recherche Scientifique, Musee de l'Homme, 17 place du Trocadero, 75116, Paris, France) and
Dr. Lucienne Jakobi (CRA, Musee de lHomme, F-75116, Paris, France). They have
records for 542 paternal grandfathers (maximum life span of 98 years), 542 records for
maternal grandfathers (maximum life span of 92 years), 542 records for paternal
grandmothers (maximum life span of 97 years) and 542 records for maternal grandmothers
(maximum life span of 92 years). For parents they have 542 records for fathers (maximum
life span of 98 years) and 542 records for mothers (maximum life span of 102 years). For
children they have 674 records for sons (maximum life span 94 years) and 688 records for
daughters (maximum life span 102 years). A brief description of their database has been
published [24-25].

2.3 Hauge-Harvald Database on Elderly Danish Twins

This database is part of the Odense Archive of Population Data on Aging (Program
Director - Dr. James W. Vaupel).

The Hauge-Harvald Database on Elderly Danish Twins consists of
individual-level data on twin pairs born in Denmark between 1870 and 1930. For each twin
pair, dates of birth and death (if deceased), sex, and zygosity are available. Data
availability: All of the above data are available and will be sent to qualified
researchers on request.

This database was used in many studies on human longevity [42, 43, 82, 118].

2.4 Cambridge Group for the History of Population and Social Structure Research Projects

Data-Bases

English sources allow the construction of large-scale, individual data-sets for
demographic analysis of adult males from c1300 and for the general population from 1538.
These fall into two groups: those with full reproductive histories from family
reconstitutions or genealogies, and those for unlinked individuals.

For the general English population from 1538-1837 the researchers have family
reconstitutions of 110,000 marriages in 26 communities. Prof. J. Knodel has kindly
provided access to his family reconstitution study of 29,000 marriages in
14 German villages (1600-1950) [101]. Also, Dr. T.H.
Hollingsworth has kindly allowed the researchers to use his descendant genealogy of the
British Peerage from 1603 to 1959, which contains full life-histories for 28,000
individuals and partial information on their grand-children.

The unlinked individuals, suitable for mortality studies only, come from elite groups
and include a medieval database containing about 750 land-holders and 750 monks. In
addition, there are records for 14,000 Scottish church ministers (1530-1927), and the
researchers have access to data for 17,000 Members of Parliament (1500-1860), courtesy of
the History of Parliament Trust.

1st Project: Long-Run Changes in Adult Mortality

Researcher: J. Oeppen

The major findings of this project are that, with the exception of two early epidemic
periods, levels of adult survival were largely unchanged until c1700, when the modern rise
begins. From this date, females follow a pattern of steady improvement shared by other
European countries. For men of all social classes, this upward trend is checked by
urbanisation and industrialisation in the first half of the nineteenth century. There is
little evidence of differentials in life-expectancy by sex or social class before 1800,
although there is a strong marital status effect. Only amongst women aged 65 to 85 does
wealth seem to be an advantage.

2nd Project: Historical Bio-Demography

Researchers: J. Oeppen and R. Davies

Amongst the hypotheses being investigated, a number are of interest to analysts of
longevity, and exploit the fact that most of the data is for periods with high exogenous
stress over the life-course, and natural fertility. For example, it was found across all
the data that the reproductive history of a woman has no influence on her survival after
age 50, thus confirming a long-held view amongst demographers. It was also found that the
difference in mortality in the reproductive years between single and married women in the
Peerage is explained by the number of children born, multiplied by the risk of maternal
mortality. However, the gap between men and women at these ages is too large to be
explained by maternal mortality alone. Before 1900, maternal mortality is higher for
elite-group women than amongst the general population. See [128] for more
information.
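
The statement that the single-married mortality gap equals the number of children born multiplied by the risk of maternal mortality can be illustrated with a short calculation. The sketch below is not taken from the project itself: the function names and the input values (6 births, 1% risk per birth) are purely hypothetical, chosen only to show that the simple product is a close approximation to the exact cumulative risk when the per-birth risk is small.

```python
def cumulative_maternal_risk(births: int, risk_per_birth: float) -> float:
    """Exact probability of a maternal death over `births` deliveries,
    assuming an identical, independent risk at each delivery."""
    return 1.0 - (1.0 - risk_per_birth) ** births


def product_approximation(births: int, risk_per_birth: float) -> float:
    """The 'children born multiplied by risk' approximation from the text."""
    return births * risk_per_birth


# Hypothetical illustrative inputs, not values from the Peerage data.
births = 6
risk = 0.01

exact = cumulative_maternal_risk(births, risk)
approx = product_approximation(births, risk)
print(f"exact cumulative risk: {exact:.4f}")
print(f"product approximation: {approx:.4f}")
```

The product slightly overstates the exact figure, because a woman who dies at one delivery is no longer exposed at later ones.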

2.5 Historical population database of the Valserine valley

The Valserine register reconstructs the population of five villages located in a narrow
valley of the French Jura mountains, near the Swiss frontier, 10 km west of Geneva.
Although more than 4,000 inhabitants lived in this region at the beginning of the 19th
century, only about 1,000 live there today. The reconstruction of the population was
carried out through the analysis of all available parochial and civil registers. Data
on approximately 70,000 vital events (birth, baptism, marriage, death, burial)
from 1680 to 1980 have been recorded in two computer files of the SYGAP software. One
file contains 46,390 individual mentions, the other contains 14,115 union mentions.
Individuals in this database can be linked to each other through kinship networks. The
database was set up on the initiative of A. Bideau, G. Brunet and H. Plauchu
within the framework of a co-operation between historical demographers and geneticists
aimed at studying the transmission of Rendu-Osler disease, a rare benign autosomal
dominant vascular disorder.

From this database a set of familial biographies has been selected to study the
relationships between life spans of parents and children. The main focus in the study was
on late mortality, taking into account post-reproductive survival (above age 50) for both
generations. The first step of the study was to test for the existence of a familial
component of longevity. The results showed a non-negligible contribution of a familial
component to the variability of life span above age 50. The second step consists of
detecting particular patterns of inheritance associated with specific factors. In other
words, the goal is to test the influence of particular factors on the magnitude of the
"correlation" between parents' and children's longevity. A strong sex effect in
the transmission of longevity has been identified, suggesting the contribution of
sex-linked genetic traits. These results have led the researchers to develop and fit
survival models integrating sex-linked traits.
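
As a rough illustration of the first two steps, the sketch below computes a parent-child correlation of age at death separately for each parent-sex/child-sex combination, restricted to pairs in which both individuals survived to age 50. It is only a simplified stand-in: the researchers fitted survival models, whereas this uses a plain Pearson correlation and ignores censoring. The tuple layout, the function names, and the sample records are all hypothetical and invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; returns NaN if either variable has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def correlation_by_sex(pairs, min_age=50):
    """Group (parent_sex, child_sex, parent_age, child_age) tuples by sex
    combination, keep pairs where both ages at death are at least `min_age`,
    and return one correlation per combination with two or more pairs."""
    groups = {}
    for parent_sex, child_sex, parent_age, child_age in pairs:
        if parent_age >= min_age and child_age >= min_age:
            groups.setdefault((parent_sex, child_sex), []).append((parent_age, child_age))
    return {
        key: pearson([p for p, _ in ages], [c for _, c in ages])
        for key, ages in groups.items()
        if len(ages) >= 2
    }


# Invented records for illustration only (not Valserine data).
pairs = [
    ("F", "F", 60, 62), ("F", "F", 70, 68), ("F", "F", 80, 79),
    ("M", "M", 55, 70), ("M", "M", 65, 60), ("M", "M", 75, 66),
    ("M", "F", 40, 80),  # dropped: the parent died before age 50
]
result = correlation_by_sex(pairs)
for key, r in sorted(result.items()):
    print(key, round(r, 3))
```

Comparing the coefficients across the four sex combinations is the simplest way to look for the kind of sex effect described above. A real analysis would also have to handle individuals who are still alive or lost to follow-up.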

Studies on longevity are published in [44, 45]. The Valserine database
was also used in many other biodemographic studies [21, 49, 50, 83, 141].

2.6 Framingham Longevity Study

Family patterns for age at death were examined in a 40-year follow-up of 5209 men and
women (2900 deceased, 2309 living) in the Framingham Longevity Study [33].

2.7 Database on Six New England Families

A total of 13,656 records for members of six New England families (Bradford, Bulkeley,
Cushman, Denison, Pardee, Waterman) born 1650-1874 were collected and analyzed to study the
historical trend in resemblance between first-degree relatives for age at death [117]. The maximum
recorded life span in this sample was 109 years (a female born in the 18th century).

2.8 The Utah Population Database

The creation and development of the Utah Population Database (UPDB) became possible
in the early 1970s, when permission was granted to copy records maintained by
the Genealogical Society of Utah (GSU), an organization operated by the Church of Jesus
Christ of Latter-day Saints (LDS). As a religious mandate, members of the LDS church have
been encouraged to identify their ancestors [14]. The GSU, located in Salt Lake City,
Utah, is the world's largest repository of genealogical records. Each family group sheet
submitted to the GSU by the members of the LDS church represents data on three
generations: a husband and wife, their children, and their parents. The core of the Utah
Population Database is a set of genealogies with more than 1.2 million names linked
together in familial structures. These genealogies were selected for those records where
at least one family member was born or died in Utah or on the pioneer trail followed by
the members of the LDS church as they migrated from the midwestern states to Utah [14].

The second major set of records is the Utah Cancer Registry that was initiated in 1952
in large hospitals and which became statewide in 1966. The Utah Cancer Registry is
maintained as a separate database, but could be linked to genealogical data if necessary.

UPDB also includes 1880 census data for the state of Utah representing
approximately 143,000 individuals.

Another set of data is a file of death certificates for the period 1934 through 1981
that has been included in the UPDB. The entry of birth certificates is now also in progress.

The UPDB is under continuous development, with 20th-century death certificates and
birth certificates being entered and linked to the genealogical data.
Combined with census data, this database could provide a unique opportunity for longevity
studies, since many variables (e.g., social status data from the census, causes of death
from death certificates) could be taken into account in addition to the standard familial
variables (birth order, life span of parents and spouses, parity, etc.) drawn from
genealogies. For more information on the Utah Population Database see special publications
[11, 12, 14-17, 95, 112, 119, 157].

The Utah Population Database was used in several studies on life span inheritance. In
one study 20,682 familial longevity records for 9,719 families with twins and sibs were
collected and analyzed by Dr. Grace Wyshak (Department of Preventive and Social Medicine,
Harvard Medical School, Boston, MA) [182]. In another study,
an analysis of twin longevity in 2,242 sets of twins, extracted from the Mormon Genealogy
Data Base (currently the UPDB), was carried out by Dr. Dorit Carmelli [39-40].

2.9 The Laredo Epidemiological Project

The Laredo Epidemiology Project is a study of the patterns of degenerative disease,
particularly cancer, in the families of Laredo, Texas. This mostly Mexican-American
population is of manageable size and relatively culturally homogeneous. This project is
based on the family reconstitution using parish records on births, deaths and marriages
from all 12 Catholic parishes in Laredo, and also records from civil registries and the
one hospital in Laredo [38, 173]. The
genealogical history of Laredo was reconstituted by the grouping of 350,000 individual
church and civil vital event records into multigenerational families, with record linkage
based on matching names [38].

The Laredo population database was used in several studies of familial aggregation of
chronic diseases [38, 174, 175].
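
The name-based record linkage mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure actually used in the Laredo project: the record fields (`father`, `mother`, `child`, `event`), the normalization rules, and the sample names are all hypothetical. It groups vital event records into family units by a normalized pair of parental names, so that spelling variants with or without accents fall into the same family.

```python
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Lower-case a name, strip accents, and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def group_into_families(records):
    """Group event records by the normalized (father, mother) name pair."""
    families = defaultdict(list)
    for rec in records:
        key = (normalize(rec["father"]), normalize(rec["mother"]))
        families[key].append(rec)
    return dict(families)


# Invented example records; the names do not come from the Laredo data.
records = [
    {"event": "baptism", "child": "María", "father": "José Garza", "mother": "Ana  Treviño"},
    {"event": "burial", "child": "Maria", "father": "Jose Garza", "mother": "ana trevino"},
    {"event": "baptism", "child": "Luis", "father": "Pedro Salinas", "mother": "Rosa Benavides"},
]
families = group_into_families(records)
for key, events in families.items():
    print(key, [e["event"] for e in events])
```

Exact matching on normalized names is the simplest possible rule. It will wrongly merge unrelated couples who share common names and will miss genuine variants, so a real reconstitution would normally add dates, places, and some tolerance for spelling differences.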

2.10 Genealogical Data on European Royal and Noble Families

One of the best sources of genealogical data available is the famous German edition of
the "Genealogisches Handbuch des Adels" (Genealogical Handbook of Nobility),
the most reliable and complete data source on European royal and noble families [75-78]. This edition is
known worldwide as the 'Gotha Almanach' ('Old Gotha' published in Gotha in 1763-1944 [7], and 'New Gotha'
published in Marburg since 1951 [75-78]). Data from the
Gotha Almanach were often used in early biodemographic studies of fertility (see [85], pp. 199-224, for
references), although later this important source of genealogical data was
undeservedly forgotten.

Each volume of the New Gotha Almanach contains about 2,000 genealogical records
appropriate for analysis, with more than 100 volumes of this edition already published.
Thus, more than 200,000 genealogical records are available from this data source. The high
quality of information published in this edition is ensured by the fact that the primary
information is drawn from the German Noble Archive (Deutsches Adelsarchiv). The Director
of the German Noble Archive (Archivdirektor) is also the Editor of the New Gotha Almanach.

It was not until 1995 that the information from the "New Gotha" Almanachs was
partially computerized by the research team of Dr. Leonid A. Gavrilov and
Dr. Natalia S. Gavrilova (Center on Aging, NORC and University of Chicago). By the
end of 1998 the database contained information on over 20,000 adult persons (over
30 years of age) born in 1700-1900 with complete information on their parents (including
birth and death dates) and spouses. The studies of longevity using this database
demonstrated substantially non-linear sex-specific transmission of human longevity from
parents to offspring [66, 67, 71-74] and effects of
parental age at reproduction on the survival of adult offspring [57, 58, 65, 68-70].

2.11 The Database of Qing Nobility (China)

This database has been developed by Dr. James Lee (California Institute of Technology)
and his colleagues (Dr. Wang Feng and Dr. Cameron Campbell) using genealogies of the Qing
dynasty (1640-1911). By the mid-twentieth century the Imperial Lineage in its entirety
included almost 200,000 persons: 80,000 in the principal line, and 120,000 in collateral
lines. Dr. James Lee and colleagues transcribed and organized vital records for more than
80,000 individuals from the principal imperial line (most of whom lived in Beijing) using
data from the Chinese historical archives and the copy of principal-line genealogy
available through the Genealogical Society of Utah. One of the advantages of this database
is the virtually complete registration of females, with a sex ratio of 109 for children born
before the Opium War (1839-42), which is close to the sex ratio at birth of 105. After
1840, however, the quality of the data deteriorates significantly (a reflection of
dynastic decline) [106, 107, 170].
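
The comparison of the two sex ratios can be turned into a rough estimate of how complete female registration was. The sketch below assumes that males were fully registered, that the true sex ratio at birth was 105 males per 100 females, and that no sex-specific losses occurred before registration. These assumptions and the function name are ours, added for illustration, and are not part of the original study.

```python
def female_registration_completeness(observed_ratio: float,
                                     expected_ratio: float = 105.0) -> float:
    """Estimated share of female births that were registered.

    Both ratios are males per 100 females. With males fully registered,
    observed_ratio = expected_ratio / completeness, so
    completeness = expected_ratio / observed_ratio.
    """
    return expected_ratio / observed_ratio


completeness = female_registration_completeness(109.0)
print(f"estimated female registration completeness: {completeness:.1%}")
```

Under these assumptions an observed ratio of 109 corresponds to roughly 96% of female births being recorded, which is consistent with the description of registration as virtually complete.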

2.12 Wang Genealogical Database (China)

Chinese genealogies were collected and analyzed for the purpose of mortality studies by
Zhongwei Zhao. The main source in his study was the General Genealogy of Wang Clan or Wang
genealogy of Chinese upper class families. The Wang genealogy covers a very long period of
time, from 900 to 1800. The Chinese database contained records for about 30,000
individuals, although many records were incomplete [183]. Also, in most
Chinese genealogies no individual records were made for women. The life expectancy of
individuals in the database was very low for the entire historical period (e.g., life
expectancy at age 30 was about 30 years in 1700-1749) even for these upper class families [183].

2.13 Special populations

Data on special religious sects are often used by epidemiologists and geneticists in
their studies. These sects usually share a relatively uniform environment and have a
rather unique life-style that isolates them from the general population. In addition to
Mormon data (described earlier), Hutterite and Amish populations are well studied now.

A. Hutterite Population

The Hutterites are an Anabaptist sect that originated in Moravia in 1528. Between
1874 and 1877, approximately 900 members of the sect migrated from Russia to the United
States to the area which is now South Dakota. The Hutterites represent a closed population
with high levels of fertility and consanguinity. The group maintains a stable residence
pattern and keeps extensive genealogical records that could be used in studies of familial
aggregation of human longevity. To date, Hutterite data have been used in numerous
studies of fertility and inheritance of genetic disorders, including [86, 104, 127].

B. The Old Order Amish Genealogy Database

The unique genealogic registry of Lancaster County, Pennsylvania, Amish contains
information on 8,163 marriages, dating back to the time of the pioneer migrants in the
1700s and spanning more than 10 generations. This database also represents a closed
population with a high level of inbreeding. The individual records in this database,
however, are heavily truncated and the total number of persons in the database is not very
high [100]. An Old
Order Amish genealogy database is now maintained at the Department of Epidemiology, Johns
Hopkins University School of Hygiene and Public Health, Baltimore, Maryland. This database
was used in studies of consanguinity effects and infant mortality and in other studies in
genetic epidemiology [3, 51, 98, 100].