Linking Chan/Seon/Zen Figures and Their Texts: Problems and
Developments in the Construction of a Relational Database

**This article is a slightly revised version of a
presentation at the 2001 Electronic Buddhist Text
Institute (EBTI) Conference in Seoul. The proceedings of
the conference can be found in the Korean journal Jeon ja
bul jeon 3 (2001) published by EBTI (http://ebti.dongguk.ac.kr).

Abstract

Issues related to the construction of a database on Buddhist
historical figures and their written legacy are discussed in
the paper, which deliberately takes the researcher's point of
view, reviewing concrete examples rather than elaborating on
technical issues. One part of the IRIZ "Zen Knowledge Base"
project initiated by Urs App is to establish a unique ID
number for each Chan/Seon/Zen figure, thereby enabling each
author to be linked with the extant documents. The primary
stages of this project having now been completed, the paper
presents some initial results and working hypotheses [see endnote], and reflects
on wider issues related to the digitization of Buddhist
research materials.

1 Chan/Seon/Zen Figures

1.1 Case Study 1: Lineage Charts of the Zengaku daijiten

Everyone involved in Buddhist research is familiar with the
Zengaku daijiten 禪學大辭典 ["Large Dictionary of Zen
Studies"] compiled at Komazawa University. In many respects
this dictionary is far from satisfactory, but it remains a
major reference work. Among the materials included is a
sequence of charts showing the main lineages of the Chan,
Seon and Zen traditions (Zenshû hôkeifu 禪宗法系譜).
Although these charts are based in part on legend rather than
history, they nevertheless situate many of the major figures
who contributed to the development of the Zen schools in
China, Korea and Japan. Another distinctive feature is that
each page of these charts carries a line number to facilitate
the location of each figure listed in the index.

The presentation of these lineage charts actually suggests a
matrix, where each particular location can be mathematically
defined by its horizontal and vertical coordinates. This gave
Urs App and Christian Wittern the idea of adding a number
indicating the column. The digits for the page, line, and
column would thus represent an ID number pointing to each
Chan/Seon/Zen figure. The first section of these charts was
included in ZenBase CD 1. For example, the Sixth
Patriarch Caoxi Huineng 曹溪慧能 could be identified with the
sequence of characters "ZGD-C-04-01-01" where "ZGD" stands
for Zengaku daijiten and "C" for China, "04" for
page 4, "01" for line 1, and "01" for column 1.
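
The structure of such an ID can be illustrated with a short
Python sketch. It is only an illustration, not the project's
actual implementation: the function names are hypothetical, the
two-digit width of each field is inferred from the examples in
this article, and the region letters "J" and "K" are assumed to
stand for Japan and Korea by analogy with "C" for China.

```python
import re
from typing import NamedTuple

# Source, region, page, line, column, e.g. "ZGD-C-04-01-01".
# Two digits per numeric field is an assumption based on the article's examples.
CHART_ID = re.compile(r"(ZGD)-([CJK])-(\d{2})-(\d{2})-(\d{2})")


class ChartId(NamedTuple):
    source: str
    region: str
    page: int
    line: int
    column: int


def parse_chart_id(text: str) -> ChartId:
    """Split a lineage-chart ID into its parts; raise ValueError if malformed."""
    match = CHART_ID.fullmatch(text)
    if match is None:
        raise ValueError(f"not a lineage-chart ID: {text!r}")
    source, region, page, line, column = match.groups()
    return ChartId(source, region, int(page), int(line), int(column))


def format_chart_id(cid: ChartId) -> str:
    """Rebuild the ID string from its parts, zero-padding each number."""
    return f"{cid.source}-{cid.region}-{cid.page:02d}-{cid.line:02d}-{cid.column:02d}"
```

Parsing "ZGD-C-04-01-01" in this way yields page 4, line 1 and
column 1, and formatting the result gives back the same string.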

If we think in terms of a relational database, one of the
first requirements for linking different files (known as
"tables" in the jargon of relational databases) is that each
piece of information be identified by a unique set of
characters (called a "key" or "primary key" in the jargon of
relational databases). This unique set of characters serves
as a common denominator between different types of
information. A concrete example would be a database of proper
names linked to another database of bibliographic references.
To link them, a field (or "column") shared by both databases
is required, and it must be unique; that is, there should be
only one record (or "row") containing this information. This
functions as the "key" for establishing the relation between
the different databases. In the case of the Zengaku
daijiten charts, the attribution of a unique ID number
to each figure obviously comes very close to meeting this
requirement. There is only one minor obstacle: some figures
appear more than once on the charts. This obstacle can easily
be overcome by deciding in each case which occurrence will
serve as the "main ID" and which one will serve as the
"secondary ID", the latest occurrence usually being chosen as
the main one. This choice is arbitrary, but remember that the
objective here is not historical accuracy but the
establishment of a reference number that can function as a
basis for relational purposes.
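
A minimal sketch of the "main ID" and "secondary ID" arrangement,
using Python's built-in sqlite3 module, might look as follows.
The table and column names are hypothetical, and the second chart
position ("ZGD-C-99-01-01") is invented purely for illustration;
it does not claim that this figure actually appears twice on the
charts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE figures (
    main_id TEXT PRIMARY KEY,      -- the key used for all relations
    name    TEXT NOT NULL
);
CREATE TABLE chart_positions (
    chart_id TEXT PRIMARY KEY,     -- every position on the charts
    main_id  TEXT NOT NULL REFERENCES figures(main_id)
);
""")
conn.execute(
    "INSERT INTO figures VALUES (?, ?)",
    ("ZGD-C-04-01-01", "Caoxi Huineng 曹溪慧能"),
)
conn.executemany(
    "INSERT INTO chart_positions VALUES (?, ?)",
    [
        ("ZGD-C-04-01-01", "ZGD-C-04-01-01"),  # main occurrence
        ("ZGD-C-99-01-01", "ZGD-C-04-01-01"),  # invented secondary occurrence
    ],
)


def resolve(chart_id):
    """Return (main_id, name) for any chart position, or None if unknown."""
    return conn.execute(
        "SELECT f.main_id, f.name FROM chart_positions p "
        "JOIN figures f ON f.main_id = p.main_id WHERE p.chart_id = ?",
        (chart_id,),
    ).fetchone()
```

Whichever position is looked up, the same main ID comes back, so
other tables only ever need to store the main ID.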

It should be pointed out here that the requirement for a
unique "key" exemplifies the gap that remains between
information science and the humanities. Reality is always
more complex than the schemes that attempt to describe it.
For example, many texts include several layers of authorship,
and it would be inaccurate to attribute them to a single
author. The Biyanlu 碧巌録 ["The Blue Cliff Record"] is
a notorious illustration of such multiple authorship, with a
set of cases collected by Xuedou Zhongxian 雪竇重顯 (980-1052)
followed by capping phrases added by his disciple Yuanwu
Keqin 圜悟克勤 (1063-1135). This issue will have to be addressed
when dealing more specifically with bibliographic data.

The first results achieved by this method include a set of
Zen-related names from China, Korea and Japan, with each name
having a specific ID number. This first group of raw data
will help clarify the possibilities and limitations involved
in building a relational database. One of our first
conclusions drawn from use of these data is that the key
should be constructed on the basis of information related to
the author, because the human factor always comes first. This
excludes the category of anonymous works, however. The scheme
shown in Figure 1 might help in understanding this basic
structure.

Figure 1. Key allowing Zen figures to be linked to
their texts

An additional remark is necessary here, for some of the basic
assumptions of the key scheme are being partially revised.
Thanks to remarks by Fred Coulson of the Tibetan Buddhist
Resource Center (http://www.tbrc.org/) after
my EBTI presentation, I have begun considering how it would
be possible to establish a unique ID number or key, without
having this number associated with a particular source or a
particular meaning. In other words, the key should be
automatically generated as a random number, the only
condition being that the number is unique.

In the average case of one author having produced many works,
these works being in turn the object of modern secondary
scholarly literature, this approach has proved satisfactory,
especially for the secondary literature. In the application
used (FileMaker Pro 5.5), this implied the use of the
"Random" function, with the simple operation "Random*
1000000000000000" to obtain a 15 digit number, which serves
as the key for all publications linked to a specific Chan
text. All records of classical literature will thus have
their own "Classic ID" and a list of "random IDs" referring
to secondary literature. The only drawbacks of this simple
approach are:

Duplicating a record involves duplicating the ID, which
must then be manually changed.

When the input is done by different users on different
databases that are not connected through a LAN, there is no
guarantee that identical randomly-generated IDs will not
coexist. Checking for duplicates is possible, but it
involves reformatting all the links if duplicates are
found.
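
Outside FileMaker, the same idea can be sketched in Python. The
functions below are hypothetical and only illustrate one way to
address the first drawback: a new ID is checked against the IDs
already in use, and a duplicated record receives a fresh ID
automatically. The second drawback remains, since a purely local
check cannot see IDs created in an unconnected database.

```python
import secrets


def random_id(existing: set) -> str:
    """Return a 15-digit numeric ID that is not yet in `existing`."""
    while True:
        candidate = f"{secrets.randbelow(10**15):015d}"
        if candidate not in existing:
            existing.add(candidate)  # remember it so it is never reused locally
            return candidate


def duplicate_record(record: dict, existing: set) -> dict:
    """Copy a record but give the copy its own new ID."""
    copy = dict(record)
    copy["id"] = random_id(existing)
    return copy
```

For databases that are never connected, a much larger ID space
(for instance the 128-bit values produced by Python's
uuid.uuid4()) makes an accidental collision far less likely,
although it still does not replace a final check for duplicates
when the data are merged.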

1.2 Case Study 2: Biographies in the Kinsei zenrin sôbôden

Most Zen researchers are familiar with the Kinsei zenrin
sôbôden 近世禪林僧寶傳 (biographies of Tokugawa period Zen
monks), which remains a major resource in the study of
Tokugawa and Meiji Zen figures. Compiled by Dokuon Jôshu 獨園承珠
(Ogino 荻野, 1819-1895), who recognized
the need for material on masters coming after the period
covered in the Enpô dentôroku 延寶傳燈録 (the previous
comprehensive biography, published in 1706), the Kinsei
zenrin sôbôden is readily available in the indexed
facsimile edition issued by Shibunkaku (Kyoto 1973).

The original edition, printed in 1890, includes 119 main
figures. In 1938 Gyokugen Buntei 玉鉉文鼎 (Obata 1870-1945) wrote a sequel, the
Zoku kinsei zenrin sôbôden (Sequel to the Kinsei
zenrin sôbôden), which includes 417 figures. This work
too is included in the facsimile edition mentioned above.

Producing an electronic version of this text presented
several difficulties, one of them being its frequent use of
non-standard forms of Chinese characters (itaiji
異体字). Using this electronic text, I began incorporating data
into a database of proper names, aiming for the same results
as with the Zengaku daijiten lineage charts.

The linear character of the biographies listed in the
Kinsei zenrin sôbôden made it much more complicated
to produce a unique ID number. This led to two decisions:

when a figure appears in the Zengaku daijiten
lineage charts, this ID number takes precedence;

the attribution of a new ID number to Zen-related figures
is to follow a systematic procedure beginning with the
Zengaku daijiten lineage charts.

ID numbers for figures not included in those charts are
constructed on the basis of the page numbers of a
biographical source, if possible the earliest.

Consider one example, the biography of Kansô Zentei 乾叟禪貞
(1624-1680). Kansô Zentei appears in Kinsei zenrin
sôbôden Vol. 2, pages 10-12, but is missing from the
Zengaku daijiten lineage charts. We can thus
attribute the ID number "KZS-2-010-012" to Kansô, with "KZS"
indicating the Kinsei zenrin sôbôden, "2" the
volume number, and "010-012" the page numbers.

Once this ID number has been established, it is possible to
produce HTML filenames or HTML links automatically. A few
examples are posted on the IRIZ Web site (http://www.iijnet.or.jp/iriz/),
in the Japanese section.
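
The two decisions above, together with the automatic production
of filenames, can be summarized in a short Python sketch. The
function names are hypothetical, and the filename pattern (the ID
followed by ".html") is an assumption rather than the convention
actually used on the IRIZ site.

```python
from typing import Optional


def kzs_id(volume: int, first_page: int, last_page: int) -> str:
    """Build an ID from the volume and page range, e.g. 'KZS-2-010-012'."""
    return f"KZS-{volume}-{first_page:03d}-{last_page:03d}"


def assign_id(chart_id: Optional[str], volume: int,
              first_page: int, last_page: int) -> str:
    """A lineage-chart ID takes precedence; otherwise fall back on the source."""
    if chart_id:
        return chart_id
    return kzs_id(volume, first_page, last_page)


def html_filename(figure_id: str) -> str:
    """Derive a filename from the ID (assumed pattern: '<ID>.html')."""
    return f"{figure_id}.html"
```

For Kansô Zentei, who has no chart ID, assign_id(None, 2, 10, 12)
gives "KZS-2-010-012", and the corresponding file would be named
"KZS-2-010-012.html".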

1.3 Lunar Calendar and Solar Calendar: Date Calculation Issues

When dealing with the Kinsei zenrin sôbôden or other
traditional Chinese, Korean or Japanese sources, the dates of
birth and death are often difficult to calculate. First, most
dates are given according to imperial era names, which are
based on the lunar calendar. In the case of Japan, this
calendar remained in use until 1872 (Meiji 5), when the
3rd day of the 12th month was declared to be 1
January of Meiji 6 (1873) [1].

Until recently, accurate conversion between the two calendars
required careful calculations based on the Japanese
Chronological Tables, but many scholars simply
transposed the traditional dates into the Gregorian calendar
with no adjustments whatsoever. This is why so many
inaccuracies exist in Japanese reference works, beginning
with the Zengaku daijiten. Take the well-known
example of Hakuin Ekaku 白隱慧鶴. Hakuin was born in the second
year of the Jôkyô 貞享 era, on the twenty-fifth day of the twelfth month.
Generally speaking the year Jôkyô 2 corresponds to 1685, but
Hakuin's birth date is actually 19 January 1686, owing to the
gap between the lunar and solar calendars. The situation is
the same with the year of his death, which occurred on the
eleventh day of the twelfth month of Meiwa 明和 5. This
corresponds to 18 January 1769 (Katô 1985: pp. 39 and 248).
Further problems can emerge when giving someone's age during
any particular year. The ages given in Hakuin's biography,
for example, follow the traditional system, in which a person
is considered to be one year of age in the year of his or her
birth. Thus, Hakuin is said to have been one year old in
Jôkyô 2; if one automatically assumes Jôkyô 2 to correspond
to 1685 in the Western solar calendar system, one can easily
conclude that he was age one at a time when actually he had
yet to be born. In 1695 his biography gives his age as 11,
although, again, by the solar calendar he would have been at
most nine years old.

At Hanazono University, I have tried to point out this lack
of accuracy, without success. To make the point clearer, I
usually mention the case of Johann Sebastian Bach, whose year
of birth, 1685, would generally be given as Jôkyô 2 in the
traditional Japanese system. Who, then, was born first, Bach
or Hakuin? Even without knowing the birthday of Bach, one can
assert that he was born prior to Hakuin if one is aware that,
due to the discrepancies between the two calendars, Hakuin's
birth actually occurred only in 1686.

This story is just one illustration that care must be taken
even with details. When it comes to calculating traditional
dates, there are two points to remember. First, one should be
careful when dates of birth or death occur in the eleventh or
twelfth month of the lunar year, because there is a strong
probability that they occur in the following year according
to the solar calendar. Second, when only the person's age and
year of death are known, there are often two possibilities
for the year of birth. Since, as explained above, a person's
year of age is calculated in accordance with the traditional
system that gives one year of age at the time of birth, there
can be two possible birth years depending upon which month of
the lunar year that person's birth took place. There is no
way to avoid this problem if the month of birth is unknown.
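
The arithmetic behind these two points can be written out as a
small Python sketch. It does not convert lunar dates (that
requires calendar tables or a dedicated tool); it only shows how
the traditional age count leads to two candidate birth years. The
function names are hypothetical, and "year" here means the
Western year conventionally equated with the era year.

```python
def nominal_birth_year(year: int, traditional_age: int) -> int:
    """Year conventionally equated with the lunar year of birth.

    In the traditional count a person is aged one in the year of birth,
    so the age must be reduced by one before subtracting.
    """
    return year - traditional_age + 1


def possible_birth_years(year: int, traditional_age: int) -> tuple:
    """Both solar years in which the birth may have fallen.

    A birth late in the lunar year can fall in the following solar year.
    """
    nominal = nominal_birth_year(year, traditional_age)
    return (nominal, nominal + 1)


def max_western_age(year: int, birth_year: int) -> int:
    """Highest age by Western reckoning reached during `year`."""
    return year - birth_year
```

With the figures quoted above for Hakuin (age 11 in 1695), the
nominal birth year is 1685 and the candidates are 1685 and 1686;
given his actual birth in 1686, he was at most nine years old in
1695 by Western reckoning.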

Now, fortunately, there are convenient tools that spare the
researcher the trouble of manually calculating the solar
equivalents of traditional dates. A Japanese site provides an
excellent DOS utility called WHEN.EXE that converts traditional
dates into other systems. The conversion can also be done online.

1.4 Critical Assessment of Traditional Accounts

As in most fields that depend on historical sources, Buddhist
studies is always confronted with the need to evaluate the
reliability of written documents. In addition to dating
sources, the researcher must always question the explicit or
implicit agendas of those who composed these sources. This
need is especially great in the case of biographies.

For example, the Kinsei zenrin sôbôden reflects,
first of all, Dokuon Jôshu's and Gyokugen Buntei's criteria
for choosing which figures should be included; each
individual biography, furthermore, reveals which aspects of
that person's life the compilers considered instructive for
their readers. Such biographies therefore necessarily omit
less prominent figures and leave out aspects of the
biographies that appeared unimportant to the author. Although
we cannot avoid using these accounts (often they are the only
remaining documents), we should treat them essentially as
hagiographies and not take their contents at face value. This
methodological remark is aimed only at underlining the fact
that the digitization of Buddhist material implies a certain
amount of selection and evaluation, and cannot be considered
a value-free mechanical task to be delegated to computer
specialists.

2 Chan/Seon/Zen Texts

2.1 The Question of Sorting Sources

The first questions in sorting sources are "What is a
Chan/Seon/Zen text?" and "Does this particular category of a
Chan text have any relevance?"

The delimitation of Chan/Seon/Zen texts is less
straightforward than it might seem. One simplistic way to
define a Chan text would be to say that, since the prototype
of a "school" centered on Chan practice began to emerge
during the Tang dynasty, texts produced by representatives of
this school are "Chan texts." However, a serious examination
of which texts were actually used by representatives
of the Chan tradition shows that much importance was given to
scriptures usually ascribed to so-called "traditional
Buddhism." All sutras may to a certain extent fit into this
category, not to speak of non-Buddhist classics.

It seems therefore more productive to avoid establishing a
rigid sectarian label of "Chan/Seon/Zen text". One pragmatic
approach would be to say that every document used by
individuals who claim for themselves the designation "Chan
monk", "Chan nun", or "Chan lay practitioner" is a
Chan/Seon/Zen text.

If we accept this wide and unrestricted way of handling Chan
texts, is it still relevant to make a distinction between
Chan texts and Buddhist texts?

Although there are certain distinctively Chan genres of
literature, such as "recorded sayings" (yulu 語録)
and "lamp histories" (dengshi 燈史), our broad
definition clearly extends beyond these. If the premise is
that what makes a category meaningful is its distinctiveness,
then the question of whether a particular source material is
a "Chan text" should perhaps be considered secondary. It
might even be more productive to simply consider it as a
"document", without prejudging its origins or religious
setting.

An alternative way of considering the issue is to regard all
sectarian categories as fundamentally delusory, and as
tending to prevent our correct understanding of specific
phases in the "history of ideas". The discipline of religious
studies is nevertheless required to respect claims made by
the individuals who are the object of research; this means
that self-proclaimed affiliations with such and such a branch
of the Chan tradition can also be taken as valid information,
as an indication that this person claims for himself or
herself a link with a type of Buddhism that puts emphasis on
meditation.

We are now increasingly aware that there is no homogeneous
"Chan tradition" and that all serious research must take into
account the complex maze of reciprocal influences between
different schools of meditation such as Tiantai. The range of
sources can therefore be considered extendable depending on
the focus of the researcher and upon his or her willingness
to cope with a variety of "external documents".

Consider one example from my own experience in studying Tôrei Enji 東嶺圓慈 (1721-1792), an
important disciple of Hakuin affiliated with the Japanese
Rinzai tradition. Tôrei's broad erudition is conspicuous in
all of his writings, and he often quotes from what are
generally regarded as Shinto sources. For instance, he
frequently refers to the apocryphal Sendai kuji hongi
taiseikyô 先代舊事本紀大成經. In his Shûmon mujintôron
宗門無盡燈論, Tôrei also quotes from the Toyuke kôtaijin
gochinza hongi 豊受皇大神御鎭座本記 and the Jingû gokuhi
hôkihongi 神宮極祕寶基本紀, two texts belonging to the Five
Scriptures of Ise (Shintô gobusho 神道五部書). Tôrei's
familiarity with Shinto scriptures is, of course, rather
exceptional among Zen people. Nevertheless, if our
methodological rules are to be considered valid they must
also apply to unusual cases. Here I would say that the study
of Shinto scriptures is so important for the understanding of
Tôrei's thought that these documents can, in a loose sense,
be considered "Zen texts", not because of their content but
because a prominent Zen figure like Tôrei employed them in
his writings.

We are thus left with a very broad definition of
"Chan/Seon/Zen texts". Now the problem is how to sort these
texts in a way that would make them easily retrievable for
researchers.

Widely used collections, such as the Taishô shinshû
daizôkyô or the Manji zokuzôkyô provide a
simple means to classify texts according to the collections'
volume numbers. The case of the Manji zokuzôkyô is a
bit more complex due to its numerous printings, but the
database recently published on our institute's Web site (http://www.iijnet.or.jp/iriz/)
should help resolve this difficulty. Later I will give some
practical examples using these collections.

The purely bibliographical aspect of these different texts
will ultimately have to be resolved by librarians, but here I
would like to share a simple "tip" on how to search texts in
chronological order. The only requirement is to include date
information in the name of the file to be searched. Certain
texts cannot be accurately dated, but the completion dates of
the major "lamp histories" are known. I would therefore have
a folder including these texts, with filenames looking as
follows:

missing 0790 Lidaifabaoji 歴代法寶記.TXT
missing 0952 Zutangji 祖堂集.TXT
0961 Zongjinglu 宗鏡録.TXT
1004 Jingde chuandenglu 景徳傳燈録.TXT
1036 Tiansheng guangdenglu 天聖廣燈録.TXT
missing 1101 Jianzhong jingguo xudenglu 建中靖國續燈録.TXT
1107 Linjianlu 林間録.TXT
missing 1135 Zongmen tongyaoji 宗門統要集.TXT
missing 1204 Jiatai pudenglu 嘉泰普燈録.TXT
and so on...

The dates above rely on Yanagida Seizan's "Zenseki kaidai",
with "missing" indicating that there is no electronic version
yet and that the text should be searched manually. The
advantage is that we have thus a list that will report all
occurrences of a particular expression in chronological
order. This is much better than any dictionary, and even
enables the linguist to trace the evolution of a particular
expression. I use Matt Brunk's SpeedSearch (http://www.kagi.com/brunk)
on a Macintosh, but the same can be done with Fgrep and a DOS
batchfile (with shorter filenames) or with an editor
including Grep search options, such as Hidemaru (http://hidemaru.xaxon.co.jp/).
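
For readers without these tools, the same chronological search
can be sketched in a few lines of Python. This is only an
illustration: the function name is hypothetical, the files are
assumed to be encoded in UTF-8, and the filenames are assumed to
consist of a four-digit year, a space, the title, and the
extension ".TXT" (the "missing" marker is treated as a note in
the list above, not as part of a filename).

```python
import re
from pathlib import Path

# Four-digit year, a space, the title, then ".TXT" (upper or lower case).
DATED_NAME = re.compile(r"(\d{4}) (.+)\.TXT", re.IGNORECASE)


def search_chronologically(folder, expression, encoding="utf-8"):
    """Return (year, title, count) for each dated file containing `expression`.

    Because the years are zero-padded, sorting the filenames also sorts
    the texts by date.
    """
    hits = []
    for path in sorted(Path(folder).iterdir()):
        match = DATED_NAME.fullmatch(path.name)
        if match is None:
            continue  # skip files that do not follow the naming scheme
        count = path.read_text(encoding=encoding).count(expression)
        if count:
            hits.append((int(match.group(1)), match.group(2), count))
    return hits
```

The result is a list of hits ordered from the earliest text to
the latest, which is what makes it possible to follow an
expression through time.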

2.2 Basic Requirements for Philological Searches

This leads to a brief digression on a question no doubt
familiar to every reader: what makes data truly useful for
researchers? We now have a vast array of digitized Chinese,
Korean and Japanese texts, which have helped make textual
searches completely different from what they were even 10 years ago.
However, many of these digital texts are still far from
meeting the basic conditions for use as research tools. One
difficulty is in finding a "middle way" between "ready-made"
applications that depend on one system and one specific type
of software, and raw data that users with insufficient
computer literacy have difficulty using.

Among the data proliferating on the Web, many are not
accurately searchable simply because the inputters are
unaware that a line-feed character prevents accurate searches
for Chinese compounds. Even in the case of good quality data,
such as the electronic texts recently included in Tendai CD2,
one finds that the files have no headers and that the
characters missing in the JIS codes are uniformly replaced by
a "black star" (kuroboshi ★) character. The fact
that these problems, though identified many years ago, remain
unaddressed leads me to believe that the Electronic Buddhist
Text Institute (EBTI) should establish guidelines and adopt a
more active role in promoting them.

It should also be stressed again that data depending on one
platform (usually Windows) or one type of environment or
application are unacceptable. Such data are not only
unavailable to users of less-common systems, but are a
long-term preservation risk, since operating systems and even
character encodings change quickly. In other words, the
durability of such data is in question.

To be truly useful for researchers, data should therefore be
retrievable on any machine or any operating system, and
searchable with any search utility. In the case of Chinese
texts, each line of the text files should end with a
punctuation sign. Information about the sources used, the
stage of correction, and the people in charge of the editing
work, should be clearly listed in the headers of each file.
This seems obvious, yet apparently it is not so to many
researchers, since data released even by respected
institutions fail to meet any of these requirements.
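
The line-feed problem mentioned above is easy to demonstrate. The
following Python sketch is an illustration only: the function
names are hypothetical, and the set of punctuation marks is an
assumption that would need to be adapted to the conventions of
each text.

```python
# Assumed set of marks that may legitimately end a line.
PUNCTUATION = "。、，；：？！」』）"


def unpunctuated_line_ends(text: str) -> list:
    """Return the 1-based numbers of non-empty lines not ending in punctuation."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.strip() and line.rstrip()[-1] not in PUNCTUATION
    ]


def count_ignoring_line_breaks(text: str, compound: str) -> int:
    """Count `compound` after joining the lines, so breaks cannot hide it."""
    return "".join(text.splitlines()).count(compound)
```

If a file contains 景徳傳 at the end of one line and 燈録。 at the
start of the next, a plain search for 傳燈 finds nothing, whereas
the second function finds one occurrence and the first function
reports line 1 as ending without punctuation.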

2.3 How to Identify a Single Figure Having Various Names

One of the difficulties faced by researchers who deal with
Chan/Seon/Zen figures is the multiplicity of their names.
Since the direct mention of a cleric's personal name was
considered a lack of respect, it was already customary in
China to use a place name or a temple name to indicate the
identity of a monk or a nun.

For instance, in the case of Shishuang Chuyuan 石霜楚圓
(986-1039), "Shishuang" indicates the temple where he resided
as abbot, the Shishuangshan Chongsheng chanyuan 石霜山崇勝禪院 in
Hunan Prefecture, while "Chuyuan" is his ordination name
(hui 諱). He was also known by the name Ciming 慈明. Since he
resided at a number of temples at different times in his
career, he came to be known variously as Ciming Chanshi 慈明禪師,
Xinghua Ciming 興化慈明, Nanyuan Chuyuan 南源楚圓, and Xinghua
Chuyuan 興化楚圓. Inasmuch as he is a well-known figure in Chan
history, one of the simplest ways to identify him would be to
assign him the ID number ZGD-C-06-05-01, derived from the
Zengaku daijiten lineage charts.
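
How an ID resolves this multiplicity can be shown with a small
Python sketch. The data structure and function names are
hypothetical; the names and the ID are those given above. Each
name maps to a set of IDs rather than a single ID, on the
assumption that different figures may occasionally share a name.

```python
from collections import defaultdict

# All known names of a figure, listed under his ID.
NAMES_BY_ID = {
    "ZGD-C-06-05-01": [
        "Shishuang Chuyuan", "石霜楚圓",
        "Ciming Chanshi", "慈明禪師",
        "Xinghua Ciming", "興化慈明",
        "Nanyuan Chuyuan", "南源楚圓",
        "Xinghua Chuyuan", "興化楚圓",
    ],
}


def build_name_index(names_by_id: dict) -> dict:
    """Invert the table: each name points to the set of IDs that bear it."""
    index = defaultdict(set)
    for figure_id, names in names_by_id.items():
        for name in names:
            index[name].add(figure_id)
    return dict(index)


def lookup(index: dict, name: str) -> set:
    """Return every ID known under `name` (an empty set if there is none)."""
    return index.get(name, set())
```

Looking up either "慈明禪師" or "Xinghua Chuyuan" then leads to the
same ID, ZGD-C-06-05-01.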

In Japan and Korea the matter has become even more
complicated because of the widespread use of
shitsugô 室號, that is, names deriving from the Zen
interview rooms of the respective teachers. Even today many
Zen practitioners refer to their teacher by his
shitsugô.

If we then add pre-ordination family names and imperially
bestowed honorific names (shigô 諡號), it is not
uncommon for the same person to be referred to by 10
different names. Assigning IDs to these figures therefore
appears a priority, especially if we think in terms of
establishing links between these figures and other elements
of information, such as bibliographic data. Let us see how
this could be done.

2.4 Case Study 1: The Zenseki kaidai

The Zenseki kaidai, a reference work by Yanagida
Seizan (1976), remains a convenient introduction to 329
essential texts related to the Chan tradition. An early
electronic version was included in the ZenBase CD1
(App 1995), with a revised version in database format
recently added to the IRIZ Web site. Among the texts
included, 110 have at least one author or compiler whose ID
is included in the lineage charts of the Zengaku
daijiten and has already been input. This provides the
necessary key to establish a link between the texts and their
authors, enabling a researcher to go back and forth between
this bibliographical database and the lineage charts
described above. If one author has several works in the
Zenseki kaidai, all of these texts will be
accessible from the lineage charts.

One important feature in this case is that the relation is
bidirectional, but not equivalent. On the lineage-chart side
we have both individuals with texts and individuals without
texts, and on the Zenseki kaidai side we have texts
with multiple authors as well as texts whose authors are
unknown.
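
This asymmetry is usually handled with a separate linking table.
The sketch below uses Python's sqlite3 module; the table names
are hypothetical and the IDs and titles ("FIG-A", "TXT-1", and so
on) are placeholders invented for the example, not entries from
the actual database.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a tiny in-memory database with placeholder data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE figures (figure_id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE texts   (text_id   TEXT PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE authorship (
        figure_id TEXT NOT NULL REFERENCES figures(figure_id),
        text_id   TEXT NOT NULL REFERENCES texts(text_id),
        PRIMARY KEY (figure_id, text_id)
    );
    """)
    conn.executemany("INSERT INTO figures VALUES (?, ?)", [
        ("FIG-A", "Figure A"), ("FIG-B", "Figure B"),
        ("FIG-C", "Figure C"),                 # a figure without texts
    ])
    conn.executemany("INSERT INTO texts VALUES (?, ?)", [
        ("TXT-1", "Text with two authors"),
        ("TXT-2", "Anonymous text"),           # a text without known author
    ])
    conn.executemany("INSERT INTO authorship VALUES (?, ?)", [
        ("FIG-A", "TXT-1"), ("FIG-B", "TXT-1"),
    ])
    return conn


def texts_of(conn, figure_id):
    """Titles linked to a figure (possibly none)."""
    rows = conn.execute(
        "SELECT t.title FROM authorship a JOIN texts t ON t.text_id = a.text_id "
        "WHERE a.figure_id = ? ORDER BY t.text_id", (figure_id,))
    return [row[0] for row in rows]


def authors_of(conn, text_id):
    """Names linked to a text (possibly none, possibly several)."""
    rows = conn.execute(
        "SELECT f.name FROM authorship a JOIN figures f ON f.figure_id = a.figure_id "
        "WHERE a.text_id = ? ORDER BY f.figure_id", (text_id,))
    return [row[0] for row in rows]
```

Both directions can be queried, but neither side is obliged to
have a counterpart on the other.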

2.5 Case Study 2: The Taishô shinshû daizôkyô

The Taishô shinshû daizôkyô is probably the most
commonly used reference work in Sino-Japanese Buddhist
studies. Despite numerous defects in punctuation and other
areas, its status as a standard edition makes it fit for the
exchange of information between scholars. For example, even
if there are better editions of a text, it often remains more
convenient to identify it using the Taishô shinshû
daizôkyô volume and page number, allowing other
researchers to check the source.

The titles of the 3118 texts included in this collection
provide a good example of how it is possible to codify
classical texts and to attribute unique ID numbers. For
example, the Chinese translation of the Lotus Sutra, usually
identified as the Miaofa lianhuajing 妙法蓮華經 T. 9 No.
262, could be codified using the first letters of the title
Taishô shinshû daizôkyô followed by the volume
number and text number: TSD_09_0262. Since such arrays can be
produced automatically, this system is quite convenient for
naming HTML files or tagging.
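
The automatic production of such codes amounts to zero-padding
the two numbers. A minimal Python sketch follows; the function
name is hypothetical, and text numbers are assumed to be plain
integers.

```python
def taisho_code(volume: int, text_number: int) -> str:
    """Build a code such as 'TSD_09_0262' from volume and text number."""
    return f"TSD_{volume:02d}_{text_number:04d}"
```

For the Lotus Sutra (T. 9 No. 262), taisho_code(9, 262) returns
"TSD_09_0262".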

2.6 Case Study 3: The Works of Tôrei Enji

Let us now examine a more complicated case. Tôrei Enji has
already been mentioned above, in section
2.1. I am currently preparing the collected works of Tôrei
for publication, in electronic and/or printed form. Except
for two publications of his included in the Taishô
shinshû daizôkyô (T. 81 No. 2575 and No. 2576), most of
his works remain in the form either of manuscripts or
woodblock print documents. There was thus a need to codify
his writings in a way that would make them easily
identifiable even if unpublished.

Following the basic idea of categorizing bibliographic data
according to author, I started with the simple hypothesis
that even the most prolific author would hardly produce more
than 899 works in a lifetime. I thus added 100 to the record
number automatically produced when creating a new entry, so
that the result is always a number of three digits between
101 and 999. For example, record #1 for Tôrei is his Bumo
onnanpôkyô chûge 佛説父母恩難報經註解 [Annotated Commentary to
The Sutra on the Difficulty of Repaying One's Debt of
Gratitude to One's Parents], composed in 1770. Adding
100 to 1 gives the ID 101, which is appended to Tôrei's
lineage chart ID (ZGD-J-49-01-02). Thus for this text the
unique number is ZGD-J-49-01-02/101, which codifies the
meaning "Tôrei's text no. 1".
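
The rule just described can be expressed in a few lines of
Python. The function name is hypothetical; the limits follow the
hypothesis of at most 899 works per author.

```python
def work_id(author_id: str, record_number: int) -> str:
    """Append the three-digit work number (record number + 100) to the author ID."""
    if not 1 <= record_number <= 899:
        raise ValueError("record number must lie between 1 and 899")
    return f"{author_id}/{record_number + 100}"
```

Thus work_id("ZGD-J-49-01-02", 1) returns "ZGD-J-49-01-02/101".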

2.7 Creating the Equivalent of an ISBN Number for Classics

The procedure followed for Tôrei's works can easily be
extended to almost every writer of a Chan/Seon/Zen text, the
only requirement being the assignment of an ID number. For
example, the Korean monk Kihwa 己和 (or Hamho Tukt'ong 涵虚得通,
1376-1433) is found in our database under the name Deuk-tong
Gi-hwa 得通己和 or 득통기화, with the ID ZGD-K-23-04-20. His
Commentary on the Sutra of Perfect Enlightenment
圓覺經疏 could accordingly be identified as ZGD-K-23-04-20/101,
and this number could in turn serve as basis for linking this
text with Charles Muller's English translation (Muller 1999).

The advantages of assigning a unique ID number to important
Chan/Seon/Zen texts are obvious, and can be compared to the
usefulness of having an ISBN number for modern books. I
believe that building such a databank of ISBN numbers for
classics could be of tremendous value not only to scholars
but to general readers as well. However, care should be taken
to account for textual variants, which are often important
for their bearing on the contents of the work. For instance,
the Platform Sutra should have at least half a dozen
different numbers to reflect the variants. This means that
the codification cannot be done mechanically: specialists on
the scriptures must decide if different editions of a similar
text should be handled under one label or should bear
different identifications.

One alternative would be to leave space for subsets of
particular IDs. The Platform Sutra is an especially
intricate case, because its author(s) is/are difficult to
ascertain, with most scholars now believing that it contains
several layers with different authorship. Nevertheless, one
could assign a number for the purpose of categorization,
without any pretension to historical accuracy, that would
identify the work as the first text under the ID of Caoxi
Huineng. Thus we would have ZGD-C-04-01-01/101. Each variant
would then be indicated with supplementary letters or numbers
indicating its provenance. For example, we could choose
ZGD-C-04-01-01/101/DHG1 to indicate the Dunhuang edition
included in the Taishô shinshû daizôkyô (T. 48 No.
2007; Stein manuscript No. 5475). The final word on such
matters should, of course, be given to librarians and archive
specialists, such as the scholars involved in the Dunhuang
project; my aim here is only to underline the utility of such
encoding.
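
To show how such subsets could be handled by a program, here is a
minimal Python sketch. The names are hypothetical, and the only
assumption about the format is the one suggested above: author
ID, work number, and an optional variant label separated by
slashes.

```python
from typing import NamedTuple, Optional


class TextId(NamedTuple):
    author_id: str
    work_number: int
    variant: Optional[str]  # None when no variant label is given


def split_text_id(text_id: str) -> TextId:
    """Split 'author/work' or 'author/work/variant' into its parts."""
    parts = text_id.split("/")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected text ID: {text_id!r}")
    variant = parts[2] if len(parts) == 3 else None
    return TextId(parts[0], int(parts[1]), variant)


def variants_of(all_ids: list, work: str) -> list:
    """List the IDs that are variants of the given work ID."""
    return [text_id for text_id in all_ids if text_id.startswith(work + "/")]
```

Splitting "ZGD-C-04-01-01/101/DHG1" gives the author ID
"ZGD-C-04-01-01", the work number 101 and the variant label
"DHG1"; all variants of a work can then be gathered under its
shorter ID.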

Of course, another matter to consider is the need for
standardization. As with modern ISBN numbers, a single
institution would need to centralize and standardize the
information. Since this idea is presently limited to texts
belonging to the cultural area influenced by the Chinese
language, one institution in the CJK area could probably
assume the task, but it would need considerable resources.

2.8 The Need to Account for Nontextual Sources

After putting so much emphasis on the need to differentiate
between various texts, a remark on the limitations of textual
sources appears necessary. Thanks in particular to the
efforts of Bernard Faure (1991, 1993), the field of Buddhist
studies is now more conversant with the one-sidedness of
historical approaches, and is increasingly receptive to
methodological alternatives. In addition to structuralist or
hermeneutical approaches, the application of anthropological
or sociological methods often helps to perceive this field
from a larger perspective. This is even more significant in
the case of the different Chan/Seon/Zen traditions, with
their strong oral tradition based upon use of the koans.
Archaeological sources or documents highlighted by art history
frequently reveal information that is lacking from written
accounts. There is no need to overstress this aspect here,
but the work of Gregory Schopen (1997) is a telling example
of how archaeology or epigraphy can help overcome
preconceptions about Buddhist teachings and practice.

3 Linking Chan/Seon/Zen Figures and Their Texts

3.1 Simple Links and Their Limits

The above discussion of Chan/Seon/Zen figures and their texts
shows how complex the relationships involved are.
Disentangling this nexus of relationships to express the
relevant data in mathematical terms necessarily involves
simplification. What must be made clear is the distinction
between precise biographical or bibliographical research and
the simplified reconstructions of reality that serve as
tools.

Drawing sketches with arrows, writing manuscript annotations,
or dog-earing books are common ways to note connections that
occur in our brain and coincide, so to speak, with the
exchanges of electric current between our synapses.
Hyperlinks were devised to imitate this natural way of
synthesizing distinct pieces of information, but they remain
essentially uni- or bidirectional. Figure 2 tries to
represent what happens in our minds regarding, say, Hakuin's
Yasenkanna 夜船閑話 [Idle Talks in a Night Boat][2].

Figure 2. Example of mental associations with
Hakuin's Yasenkanna

Each reader would, of course, have different associations
depending on his or her background. The fact is, however,
that even the relatively simple associations illustrated in
Figure 2 would be extremely difficult to express either with
hyperlinks or with the relational features of a database. The
reasons are many, among them the vagueness of "Daoist
sources" or "Indian sutras." We are, in other words, in the
realm of "fuzzy logic" here, but this is the way the human
mind works.

The provisional conclusion I draw from this is that the very
attempt to codify every element and relation into a database
is an illusion. The scope of a database is limited,
and those limitations must be clearly expressed from the
outset.

3.2 Intertextuality as a Parameter of Religious Practice

When studying Chan/Seon/Zen figures, one soon notices the close
relationship between texts and meditation practice,
especially in schools using koans. The recent book on the
koan edited by Heine and Wright (2000) illustrates from
various perspectives the cardinal importance of
intertextuality. For instance, in the Japanese monastic
context a practitioner who reaches an insight into a specific
koan is asked to find a verse in the Zenrinkushû
禪林句集 [Anthology from the Zen Forest] that relates to his or
her understanding. This can be seen as a "literary exercise"
or a "pedagogical device", but it also demonstrates that
patterns of understanding follow certain tracks that are not
purely accidental, and that can to a certain extent be
mapped. Much about these patterns might be revealed by a
systematic analysis of the metaphors contained in collections
such as the Zenrinkushû. In this case too, proper
digitization of the text could provide a tool for in-depth
research.

3.3 Translations: How to Codify the Quality Criterion

A few decades ago there were so few modern-language
translations of Chan/Seon/Zen texts that researchers could
easily keep track of the quality of those that existed.
However, the recent proliferation of new translations has
created a growing need for some means of evaluating the
quality of the various renditions. The task of building a
database that includes translations of Chan/Seon/Zen texts
can thus be seen as establishing three items:

(1) an ID for the original text;

(2) a derived ID for the translation, including a code
representing the language;

(3) criteria to assist readers in choosing the appropriate
translation (the first obvious criterion being whether a
translation is partial or comprehensive)[3].

The third item runs the risk of becoming quite subjective if
it is limited to the kind of rating used to award stars to
hotels. Since the work of reading a translation and checking
it against the original is equivalent to writing a book
review, it might be best to establish links to the actual
book reviews.
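
By way of illustration, the three items and the proposed links to book reviews could be sketched as a small relational schema, here using the sqlite3 module of Python. Everything in this sketch is an assumption made for the sake of the example: the table and column names, the format of the derived ID (original ID, a colon, a two-letter language code and a serial number), and the placeholder rows, none of which describe actual translations or reviews.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE texts (
    text_id TEXT PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE translations (
    translation_id TEXT PRIMARY KEY,
    text_id        TEXT NOT NULL REFERENCES texts(text_id),
    language       TEXT NOT NULL,
    is_complete    INTEGER NOT NULL CHECK (is_complete IN (0, 1))
);
CREATE TABLE reviews (
    review_id      INTEGER PRIMARY KEY,
    translation_id TEXT NOT NULL REFERENCES translations(translation_id),
    citation       TEXT NOT NULL
);
"""


def derive_translation_id(text_id: str, language: str, serial: int) -> str:
    """Build a derived ID; the ':<language>-<serial>' suffix is an assumption."""
    return f"{text_id}:{language}-{serial}"


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

TEXT_ID = "ZGD-C-04-01-01/101"
conn.execute("INSERT INTO texts VALUES (?, ?)", (TEXT_ID, "placeholder title"))

# Placeholder rows: (language code, serial number, complete translation?)
for language, serial, is_complete in [("en", 1, 1), ("en", 2, 0), ("fr", 1, 1)]:
    conn.execute(
        "INSERT INTO translations VALUES (?, ?, ?, ?)",
        (derive_translation_id(TEXT_ID, language, serial),
         TEXT_ID, language, is_complete),
    )

# A review is stored as a link to a published review, not as a star rating.
conn.execute(
    "INSERT INTO reviews (translation_id, citation) VALUES (?, ?)",
    (derive_translation_id(TEXT_ID, "en", 1),
     "placeholder: citation of a published review"),
)

# Complete English translations of the text, with the number of linked reviews.
rows = conn.execute(
    """
    SELECT t.translation_id, COUNT(r.review_id)
    FROM translations AS t
    LEFT JOIN reviews AS r ON r.translation_id = t.translation_id
    WHERE t.text_id = ? AND t.language = 'en' AND t.is_complete = 1
    GROUP BY t.translation_id
    ORDER BY t.translation_id
    """,
    (TEXT_ID,),
).fetchall()
print(rows)   # [('ZGD-C-04-01-01/101:en-1', 1)]
```

Only the partial-or-complete criterion is codified as a field here; the judgment of quality itself stays in the reviews that the database points to.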

Researchers often spend far more time collecting and storing
information than they do analyzing it and using it to
formulate new hypotheses. In the humanities, figures indicate
that as much as 80 percent of our time is dedicated to
mechanical tasks, among these the classification of
documents. Noguchi Yukio 野口悠紀雄 argues that for the individual
researcher, classification is an endless and fruitless task
(1993, 1995, 1999, 2000), and proposes that library-type
classification by subject be discarded in favor of
chronological ordering (that is, ordering according to
which document was used most recently). His method basically
involves putting all material into A4 envelopes and placing
the most recently used envelope at the end of the row. Having
applied it to my own work for the past two years, I am
completely free of the "lost child syndrome" ("Now where did
I put that piece of paper!").
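
The ordering rule described above can be stated in a few lines of Python. This is only a sketch of the principle as summarized here, not a rendering of Noguchi's own formulation; the class name EnvelopeRow and the envelope labels are invented for the example.

```python
class EnvelopeRow:
    """A row of envelopes ordered solely by when each was last used."""

    def __init__(self):
        self._row = []  # labels; the last item is the most recently used

    def use(self, label):
        """File a new envelope, or re-file a used one, at the end of the row."""
        if label in self._row:
            self._row.remove(label)
        self._row.append(label)

    def recent_first(self):
        """Return the labels starting from the end of the row."""
        return list(reversed(self._row))


row = EnvelopeRow()
for label in ["lineage charts", "translation notes",
              "conference paper", "lineage charts"]:
    row.use(label)

print(row.recent_first())
# ['lineage charts', 'conference paper', 'translation notes']
```

No subject categories appear anywhere in the sketch: the only information recorded is the order of use, and a search simply starts from the end of the row.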

Noguchi's ideas are largely inspired by discoveries related
to the use of computers. He argues that although we have
entered the age of digital information, our thinking is still
largely conditioned by habits inherited from our long
dependence on paper. We have been led by force of habit to
believe that if information is not properly labeled or
classified then it will be impossible to find when needed.
Noguchi shows, however, that this is not necessarily the
case.

Nevertheless, when building a database there seems to be no
way to avoid using fields, which amounts to classifying.
Similarly, the entire process of tagging, be it in SGML or
XML formats, involves labeling items of knowledge, often for
commercial purposes. The digitization of data in itself does
not necessitate classifying, but the use of database
applications compels it to a certain extent. Once in use,
even the most sophisticated categories necessarily reflect
the limits of our vocabulary and conceptual horizon.

Studying the history of religions implies the willingness to
take on the viewpoint of the object of study. When the
objects of study are Chan/Seon/Zen figures, this may
sometimes demand that we, like Zen monks, impose silence upon
our discursive minds and employ our more holistic abilities
in order to grasp relationships which are difficult to
codify. This should not be misconstrued as a negation of
rational ways of thinking, but as an augmentation of them. In
Buddhism, after all, the logic of equality precedes the logic
of differentiation without invalidating it.

Muller, Charles A. (1999) The Sutra of Perfect
Enlightenment: Korean Buddhism's Guide to Meditation (with
Commentary by the Son Monk Kihwa). SUNY Series in Korean
Studies, edited by P. Sung Bae (New York: State
University of New York Press)

Schopen, Gregory (1997) Bones, Stones, and Buddhist Monks:
Collected Papers on the Archaeology, Epigraphy, and Texts
of Monastic Buddhism in India. Studies in the
Buddhist Tradition, edited by L. O. Gómez (Honolulu:
University of Hawai'i Press)

Endnotes

[abstract endnote]
There are many issues in the digitization of Chan/Seon/Zen
sources that should be covered in a more methodical way. In
an attempt to prevent the paper from becoming too tedious, I
did not include listings of the structures used in the
different databases, but I would be happy to communicate them
to interested readers. More could admittedly have been said
about the construction of a general data model
appropriate for this type of relational database. I also
neglected the whole area of information retrieval systems,
which might make it possible to avoid some
of the issues related to the construction of a database of
Chan/Seon/Zen figures. In many areas I would welcome the
chance to learn from database specialists, and I would be
grateful for any feedback on the matters discussed here.

[2] For those not
familiar with this text, see Mohr (1999), pp. 310-311.

[3] The first steps
toward listing translations of Chan texts were taken
in a little-known Chinese translation of
Yanagida's "Zenseki kaidai" with additions (Yanagida 1995,
1996 and 1998).