Proposal for the Taxonomy of CEN/TC304 - ICT - European
Localization Requirements

Source: Keld Simonsen

Date: 24 November 2000 (submitted 11 October 2000)

Note: Comments on this document are invited before 15th January 2001 (to the
editor, keld@dkuug.dk, with a copy to the TC304 secretary,
thorgeir@stri.is). These comments will be used to facilitate a revision of the
document. A decision on making it a standing TC304 document, as proposed in its
final clause (clause 5), is to be taken at the next TC304 plenary, currently
planned for 23rd February 2001.

A Taxonomy for CEN/TC304 - ICT - European Localization Requirements

Date: 2000-10-11

1 Introduction and scope

In order to approach standardization in a systematic way, a common approach
is to develop a way to classify the subject area, or a taxonomy. This helps in
two ways:

- a taxonomy helps to identify all aspects of the domain in question which
might be subject to standardization;

- a taxonomy helps to provide a logical structure for the standardization
activity.

A taxonomy has been developed of relevant concepts in the domain of
European localization requirements, based on user requirements for
functionality, as discussed in Clause 4 of Part I of the CEN/TC304/PT01
report on User requirements on IT.

By way of an application, all known current standards and standardization
activities have been grouped according to this taxonomy, thus forming another
type of taxonomy, that of the standards themselves.

2 A taxonomy for European localization requirements

The present classification of the concepts was made by identifying
commonalities, such as characters, sets, fonts and rules relating to
presentation. The analysis was based on a much wider view of "multi-cultural
support", and the taxonomy attempts to map some of the concepts of that view.
Areas relevant to this taxonomy were chosen and developed into the full
taxonomy shown in clause 3.2. The chosen areas comprise the technology which
relates to methods for specifying, and rules governing, the creation of unique
properties and codes which facilitate the presentation, storage and transmission
of individual characters.

The taxonomy in clause 3.2 was based on the references ISO/IEC TR 10000-1,
ISO TR 12382 and IEC 824, and on the activities of appropriate standardization
bodies, most notably the work of CEN/TC304 and ISO/IEC JTC 1.

3 Description of classification

3.1 Description

User requirements may be summed up in the single phrase "multi-cultural
support", being the need to accommodate all the requirements of different types
of users, whether they are racial, national, typographical, occupational or
individual. The primary choice was for text-based topics, in line with the
capability of computer technology to code, store and process individual
characters.

The taxonomy in clause 3.2 takes the classic form of a tree structure, where
two major classes are recognized: Locales and Characters. The former deals with
the cultural environment of the user, the latter with the smallest divisible
parts that make up the messages which are being electronically processed.
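The tree structure can be read directly from the codes used in clause 3.2. The
following is a minimal sketch in Python, not part of the proposal itself. It
assumes that each digit after the slash selects one child at the next level (so
that L/1113 lies under L/111) and that the two class entries L/ and C/ hang
under a root written here simply as "/". The function names parent and
ancestors are hypothetical and chosen for illustration only.

```python
from typing import List, Optional


def parent(code: str) -> Optional[str]:
    """Return the code one level up, or None for the root "/".

    Assumption: one digit per level, as the codes in clause 3.2 suggest.
    """
    if code == "/":
        return None
    prefix, _, digits = code.partition("/")
    if digits == "":
        # "L/" and "C/" are the two major classes directly under the root.
        return "/"
    return prefix + "/" + digits[:-1]


def ancestors(code: str) -> List[str]:
    """Return the chain of codes from the root down to the parent of `code`."""
    chain = []
    current = parent(code)
    while current is not None:
        chain.append(current)
        current = parent(current)
    return chain[::-1]


print(parent("L/1113"))     # L/111
print(ancestors("C/3132"))  # ['/', 'C/', 'C/3', 'C/31', 'C/313']
```

Under this reading the depth of an entry is simply the length of its ancestor
chain, so no separate level column is needed in the table.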

A taxonomy of any set of phenomena can be constructed in several ways,
depending on its purpose and the aspects applied. (For instance, a number of
persons may be grouped firstly according to age, then according to gender, then
according to place of residence -- or in precisely the reverse order, according
to need.) A taxonomy for standardization purposes naturally has to take into
account the most practical ways to group existing standards and standardization
projects as well as the logical connections between them and any conceptual
"holes" which may need to be filled in order to cover the full need for
standardization.

The following taxonomy is thus intended to provide a map for almost all of
the user requirements. Therefore the level of subordination in some cases goes
very deep -- this does not mean that the actual standardization projects need a
taxonomy of the same complexity. When a sub-level is empty of existing or future
standards, the entries in that sub-level are simply collapsed and only the level
above remains.
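The collapsing rule can be sketched as follows. This is one possible reading
only, in Python: a sub-level is taken to be the set of entries sharing the same
parent, and it is treated as empty when none of its entries, nor anything below
them, lists an activity (marked "-" as in clause 3.2). The names has_activity
and collapse are hypothetical, and the two entries below L/1213 are invented
purely to show a sub-level being collapsed; they do not occur in the taxonomy.

```python
def has_activity(code, table):
    """True if `code`, or any entry below it, lists an activity."""
    return any(
        activity != "-"
        for other, (_title, activity) in table.items()
        if other.startswith(code)  # one digit per level, so prefix = subtree
    )


def collapse(table):
    """Drop every sub-level in which no entry has any activity."""
    kept = {}
    for code, entry in table.items():
        # Entries of the same sub-level: same length, same parent prefix.
        siblings = [c for c in table
                    if len(c) == len(code) and c[:-1] == code[:-1]]
        if any(has_activity(s, table) for s in siblings):
            kept[code] = entry
    return kept


table = {
    "L/12": ("Cultural conventions", "ISO/IEC JTC1/SC22/WG20, TOG, CEN/TC304"),
    "L/121": ("Cultural elements", "ISO/IEC JTC1/SC22/WG20"),
    "L/1212": ("Measurement system", "ISO/IEC JTC1/SC22/WG20"),
    "L/1213": ("Layout styles", "-"),
    # Hypothetical entries, not in the taxonomy: an entirely empty sub-level.
    "L/12131": ("Hypothetical entry A", "-"),
    "L/12132": ("Hypothetical entry B", "-"),
}

for code in collapse(table):
    print(code)
```

In this sketch L/1213 itself is retained, because its sub-level also contains
L/1212, which has an activity; only the empty sub-level beneath it disappears.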

3.2 Taxonomy for CEN/TC304

What follows is a specification of the taxonomy and, for information, an
application of it to standardization and research projects currently under way.
The purpose is to illustrate one use of the taxonomy as well as to provide a map
of where the respective work is being carried out. The taxonomy has an
additional layer indicating where the specifications apply, such as world-wide,
Europe or a specific country.

Code       Title                           Current standardization or research activity
---------  ------------------------------  --------------------------------------------
/ (no id)  TAXONOMY                        CEN/TC304
L/         LOCALES                         ISO/IEC JTC1/SC22/WG20
L/1        Specifications                  ISO/IEC JTC1/SC22/WG20
L/11       Languages                       -
L/111      Natural languages               -
L/1111     Vocabulary                      ISO/TC 37, ISO/IEC JTC1
L/11111    Standard terminology            ISO/IEC JTC1/SC22/WG20
L/11112    Thesauri                        -
L/11113    Standard phrases                -
L/11114    Translation                     LRE
L/1112     Grammar                         ISO/IEC JTC1/SC22/WG20
L/1113     Orthography                     ISO/IEC JTC1/SC22/WG20
L/11131    Alphabet                        ISO/IEC JTC1/SC22/WG20
L/11132    Spelling                        ISO/IEC JTC1/SC22/WG20
L/11133    Use of special characters       ISO/IEC JTC1/SC22/WG20
L/11134    Capitalization                  ISO/IEC JTC1/SC22/WG20
L/11135    Hyphenation                     ISO/IEC JTC1/SC22/WG20
L/11136    Punctuation                     ISO/IEC JTC1/SC22/WG20
L/11137    Transcription                   ISO/IEC JTC1/SC22/WG20
L/11138    Ordering                        ISO/IEC JTC1/SC22/WG20, ISO/TC46, ISO/TC37, CEN/TC304
L/11139    Personal names and titles       ISO/IEC JTC1/SC22/WG20
L/1114     Speech                          LRE
L/12       Cultural conventions            ISO/IEC JTC1/SC22/WG20, TOG, CEN/TC304
L/121      Cultural elements               ISO/IEC JTC1/SC22/WG20
L/1211     Orthography                     ISO/IEC JTC1/SC22/WG20
L/12111    Date and time format            ISO/IEC JTC1/SC22/WG20
L/12112    Numeric separators              ISO/IEC JTC1/SC22/WG20
L/12113    Monetary format                 ISO/IEC JTC1/SC22/WG20
L/12114    Telephone number format         PTTs, CEPT, ISO/IEC JTC1/SC22/WG20
L/12115    Payment number format           ISO/IEC JTC1/SC22/WG20
L/12116    Mail address format             CEN/PC8, ISO/IEC JTC1/SC22/WG20
L/12117    National places                 ISO/IEC JTC1/SC22/WG20
L/1212     Measurement system              ISO/IEC JTC1/SC22/WG20
L/1213     Layout styles                   -
L/1214     Paper sizes                     ISO/TC6, CEN/TC172, ISO/IEC JTC1/SC22/WG20
L/13       Operating system dependency     ISO/IEC JTC1/SC22/WG15, IEEE, TOG
L/131      POSIX                           ISO/IEC JTC1/SC22/WG15, IEEE, TOG
L/132      Other TOG                       TOG
L/2        Registration                    ISO/IEC JTC1/SC22/WG20
L/21       Procedures                      ISO/IEC JTC1/SC22/WG20
L/211      Europe                          CEN/TC304
L/2111     National                        NBs
L/212      World-wide                      ISO/IEC JTC1/SC22/WG20
L/3        Implementation                  -
L/31       Fallback                        ISO/IEC JTC1/SC22/WG20, IETF
C/         CHARACTERS                      ISO/IEC JTC1/SC2
C/1        Character information           ISO/IEC JTC1/SC2, SC22
C/11       Identification                  ISO/IEC JTC1/SC2/WG2
C/111      Characters                      ISO/IEC JTC1/SC2
C/1111     Identifiers                     ISO/IEC JTC1/SC2/WG2, SC22/WG20
C/1112     Attributes                      ISO/IEC JTC1/SC22/WG20, Unicode
C/112      Repertoires                     ISO/IEC JTC1/SC2, SC22
C/1121     Graphic characters              ISO/IEC JTC1/SC2
C/11211    Natural language alphabets      ISO/IEC JTC1/SC22/WG20
C/112111   Europe                          CEN/TC304
C/1121111  General                         CEN/TC304
C/1121112  Elderly/disabled                ISO/TC173
C/112112   World-wide                      ISO/IEC JTC1/SC22/WG20
C/11212    Programming language alphabets  ISO/IEC JTC1/SC22/WG20
C/11213    Non-alphabetic symbols          ISO/IEC JTC1/SC22/WG20
C/112131   General                         ISO/IEC JTC1/SC22/WG20
C/112132   Disabled/elderly                TIDE
C/1122     Control functions               ISO/IEC JTC1/SC2/WG3
C/1123     Registration                    ISO/IEC JTC1/SC2/WG3
C/113      Glyphs                          ISO/IEC JTC1/SC34
C/1131     Registration                    UNICODE
C/1132     Character correspondence        UNICODE
C/114      Glyph repertoires               UNICODE
C/1141     Registration                    UNICODE
C/1142     Repertoire correspondence       UNICODE
C/12       Manipulation                    ISO/IEC JTC1/SC22/WG20
C/121      Transformation                  CEN/TC304, ISO/IEC JTC1/SC22/WG20
C/1211     Case conversion                 ISO/IEC JTC1/SC22/WG15, WG20
C/1212     Transliteration                 ISO TC46 (bibliographic)
C/1213     Fallback representation         CEN/TC304, ISO/IEC JTC1/SC22/WG20, IETF
C/2        Input/output                    ISO/IEC JTC1/SC22, SC35
C/21       Input                           ISO/IEC JTC1/SC35
C/211      Keyboard                        ISO/IEC JTC1/SC35, CEN/TC304
C/212      Other means                     ISO/IEC JTC1/SC35
C/22       Output                          ISO/IEC JTC1/SC22, SC35
C/221      Character repertoires           ISO/IEC JTC1/SC2, CEN/TC304
C/222      Character attributes            ISO/IEC JTC1/SC22/WG20, Unicode
C/3        Electronic processing           ISO/IEC JTC1/SC22/WG20
C/31       Processing of coding schemes    ISO/IEC JTC1/SC2, SC22; CEN/TC304
C/311      Encoding of graphic characters  ISO/IEC JTC1/SC34 (text layout)
C/312      Encoding of control functions   ISO/IEC JTC1/SC2
C/313      Code transformations            CEN/TC304, ISO/IEC JTC1/SC22/WG20
C/3131     UCS--UCS                        ISO/IEC JTC1/SC2/WG2
C/3132     UCS--other coding schemes       ISO/IEC JTC1/SC22/WG20, TOG
C/32       Interchange/communication       IETF
C/321      7-bit method                    IETF
C/322      8-bit method                    IETF
C/323      Multiple-octet method           IETF
C/33       Internationalization support    ISO/IEC JTC1/SC22/WG15 and WG20
C/331      Programming languages           ISO/IEC JTC1/SC22
C/3311     Language-dependent              ISO/IEC JTC1/SC22
C/3312     Language-independent            ISO/IEC JTC1/SC22/WG20
C/332      Operating systems               ISO/IEC JTC1/SC22/WG15
C/333      Communications                  IETF, W3C
C/3331     Directory services              CEN/ISSS WS-DIR
C/3332     Telematics                      IETF
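The map of where work is being carried out can also be read in the other
direction, giving for each group the taxonomy entries it is active in. The
following Python sketch shows one way this might be done for a few rows copied
from the table above. The function name index_by_group is hypothetical. The
sketch splits the activity column on commas only, so shorthand such as
"ISO/IEC JTC1/SC2, SC22" would need to be expanded by hand first; the sample
rows were chosen to avoid that case.

```python
from collections import defaultdict


def index_by_group(rows):
    """Map each group name to the list of taxonomy codes that name it."""
    index = defaultdict(list)
    for code, activity in rows.items():
        for group in activity.split(","):
            group = group.strip()
            if group and group != "-":  # "-" marks an entry with no activity
                index[group].append(code)
    return dict(index)


# A small excerpt of the table: code -> activity column.
rows = {
    "L/11138": "ISO/IEC JTC1/SC22/WG20, ISO/TC46, ISO/TC37, CEN/TC304",
    "L/1213": "-",
    "L/211": "CEN/TC304",
    "C/211": "ISO/IEC JTC1/SC35, CEN/TC304",
    "C/313": "CEN/TC304, ISO/IEC JTC1/SC22/WG20",
}

index = index_by_group(rows)
print(index["CEN/TC304"])  # ['L/11138', 'L/211', 'C/211', 'C/313']
```

Applied to the whole table, such an index would amount to the second kind of
taxonomy mentioned in clause 1, that of the standards activities themselves.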

4 List of groups

The following is the list of groups referenced above, with a web reference
where one is available.

5 Maintenance of the taxonomy

To allow widespread use of, and comment on, this taxonomy it is proposed that
it should be published as a CEN/TC304 standing document and given adequate
publicity as a freely accessible page on the CEN/TC304 web site. It is
recommended that the upkeep, development and maintenance of the taxonomy should
be the responsibility of CEN/TC304.