PowerPoint Slideshow about 'Shreve' - Anita

Digital libraries may contain resources in many languages. Accessible through the Internet, libraries may be consulted by individuals in other cultural/linguistic "locales" seeking resources in their own languages or searching across languages for resources in languages other than their own.

In order to enable the efficient and effective acquisition, storage and retrieval of cross-cultural and cross-linguistic resources, a digital library has to be designed from the outset to allow for heterogeneous linguistic and cultural content. The design process is called “internationalization.” The most effective internationalization strategies are standards-based.

Internationalizing a digital library involves three tasks: (1) determining the metadata elements, attributes, value spaces and values that are culturally and linguistically dependent and are to be rendered in multiple languages; (2) creating a mechanism for internationalization that provides administrative control, cross-language tool capability, and authority for keywords (terms), translations and translation equivalents; (3) providing an internationalization scheme that offers reusability and scalability and interfaces with relevant national and international standards.

Other issues are also important, such as the different writing systems and character sets of resources and different display preferences (interface, resources), but we do not deal with these in this paper.

Localization is the preparation of locale-specific versions of a digital library resource or collection. It consists of translating textual material into the language and textual conventions of the target locale, and adapting non-textual materials and delivery/display mechanisms to the cultural requirements of that locale.

[Figure labels: internationalization, localization, translation, adaptation]

Internationalization is an “upstream” engineering process that should precede localization. Its aim is to make subsequent localization/translation easier, more efficient, and less costly.

As discussed in my ASIST 2003 presentation, there are two I18N approaches to support localizing a DL. The first approach is inline parallel and involves providing multiple local versions of, for instance, a title or keyword data element in a resource record. The data elements are flagged as “local” versions via the lang attribute. This is the most common localization method. Note that “equivalence” is assumed via adjacency and no authority is provided.
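The inline parallel approach can be illustrated with a small sketch. It assumes an XML resource record in which each local version of the title carries an xml:lang attribute; the record layout, the element names and the function local_versions are hypothetical and are not taken from any particular schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource record: element names are illustrative only.
RECORD = """
<record>
  <title xml:lang="en">Digital libraries</title>
  <title xml:lang="de">Digitale Bibliotheken</title>
  <title xml:lang="fr">Bibliothèques numériques</title>
</record>
"""

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def local_versions(record_xml: str, element: str) -> dict[str, str]:
    """Map each language tag to the text of the matching parallel element."""
    root = ET.fromstring(record_xml)
    return {el.get(XML_LANG): el.text for el in root.findall(element)}


titles = local_versions(RECORD, "title")
print(titles["de"])
```

Nothing in the record says who established that the three titles are equivalent; they are simply adjacent, which is the weakness noted above.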

Translation memories and glossaries are the most common external localizing objects, but the growing use of statistically based corpus linguistics to create language resources will also make it possible to utilize other monolingual and multilingual resources in Digital Libraries. Standards for representing and storing some of these new language resources do not yet exist.

Internationalizing a DL involves more than providing and controlling translations of the content and the descriptive metadata elements.

Internationalizing a metadata schema also involves determining the elements and element attributes that could affect the schema’s ability to be used for classification, search, retrieval, and reuse of learning objects in multicultural and multilingual contexts.

An internationalization strategy begins with specifying all metadata elements that are culturally and linguistically dependent. Ideally, internationalization is a goal during initial schema development. Unfortunately, as with IEEE-LOM, internationalization may involve existing data elements in a pre-existing schema. Additions and modifications to the elements and element set may be necessary or recommended.

Some “universal” metadata elements have values that may be very culturally dependent. For instance, LOM 5.6 Educational.Context has a value space [school, higher education, training, other] that is not only extremely limited, but also derives from a single cultural context. Different countries have different educational systems. The LOM values are often not applicable or do not have a real correspondence.[1]

Although CEN has suggested simply “enlarging” the value space for such elements, true internationalization of these “system” dependent elements would involve providing a locale specification for the element so that a specific vocabulary could be retrieved.

<education locale="en-US">
  <context>value space</context>
</education>

Value space for the en-US locale: Kindergarten, Elementary School, Middle School, High School, …

The ISO 639 language codes and the ISO 3166 country codes do not allow for even more “local” localization.

In Germany, for instance, the Bavarian school system differs from the German “norm.”
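One way to picture locale-specific retrieval of a value space is a lookup that falls back from the most specific locale to more general ones. The sketch below is only an illustration: the table VALUE_SPACES, its placeholder values and the key "de-DE-BY" (a made-up identifier for Bavaria, not a standard locale code) are all assumptions.

```python
# Hypothetical vocabularies keyed by locale; the values are placeholders.
VALUE_SPACES = {
    "en-US": ["Kindergarten", "Elementary School", "Middle School", "High School"],
    "de-DE": ["Hauptschule", "Realschule", "Gymnasium", "Gesamtschule"],
    # "de-DE-BY" is a made-up key for Bavaria, not a standard locale code.
    "de-DE-BY": ["Hauptschule", "Realschule", "Gymnasium"],
}


def value_space(locale: str) -> list[str]:
    """Return the vocabulary for the most specific matching locale."""
    parts = locale.split("-")
    # Try the full tag first, then drop trailing subtags one at a time.
    while parts:
        key = "-".join(parts)
        if key in VALUE_SPACES:
            return VALUE_SPACES[key]
        parts.pop()
    raise KeyError(f"no value space for locale {locale!r}")


print(value_space("de-DE-BY"))
```

A request for a sub-national locale that has no vocabulary of its own falls back to the national one, so new local vocabularies can be added without changing the resource records.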

Some values may have one-to-one equivalence. Others do not. Middle school (junior high) may include one or more of Hauptschule / Realschule / Gymnasium / Gesamtschule. The values imply different age ranges, different educational objectives and values and different social structures.

Multilingual / multicultural restricted vocabularies should be concept-based. For two vocabulary items to be equivalent they should represent the same concept. The concepts should be documented in authoritative multilingual glossaries such as those specified in ISO 12620. Such glossaries provide one of the bases for external parallel metadata methods.

Concept objects are the core of terminology glossaries. They organize both monolingual and multilingual data. Organized into terminology glossary databases for computer-assisted translation, they are indispensable in today’s language industry.
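A concept object can be sketched as a record that holds one definition and the terms for that concept in each language. The classes Concept and Glossary below, together with the sample entry, are hypothetical; they do not reproduce the data categories of ISO 12620 or of any terminology exchange format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """One concept: a single definition and its terms grouped by language."""
    concept_id: str
    definition: str
    terms: dict[str, list[str]] = field(default_factory=dict)


class Glossary:
    """Concept-based glossary indexed by (language, term)."""

    def __init__(self) -> None:
        self._concepts: dict[str, Concept] = {}
        self._index: dict[tuple[str, str], str] = {}

    def add(self, concept: Concept) -> None:
        self._concepts[concept.concept_id] = concept
        for lang, terms in concept.terms.items():
            for term in terms:
                self._index[(lang, term.casefold())] = concept.concept_id

    def concept_of(self, lang: str, term: str) -> Optional[Concept]:
        concept_id = self._index.get((lang, term.casefold()))
        return self._concepts.get(concept_id) if concept_id else None

    def equivalent(self, lang_a: str, term_a: str, lang_b: str, term_b: str) -> bool:
        """Two terms are equivalent only if they name the same concept."""
        a = self.concept_of(lang_a, term_a)
        b = self.concept_of(lang_b, term_b)
        return a is not None and a is b


glossary = Glossary()
glossary.add(Concept(
    concept_id="c001",  # illustrative identifier
    definition="organized collection of digital resources",
    terms={"en": ["digital library"], "de": ["digitale Bibliothek"]},
))
print(glossary.equivalent("en", "digital library", "de", "digitale Bibliothek"))
```

This simplified index maps each (language, term) pair to a single concept, so it does not handle homonyms; a real glossary would need to.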

When concepts are documented in authoritative multilingual glossaries they can also provide the basis for KOS (knowledge organization systems) of use in concept-mediated monolingual and multilingual browsing and searching in DLs.

<descrip type="context">For solids with spatial discontinuities, such as bounded solids or those containing holes, crack, interfaces, etc., we need to satisfy some prescribed boundary conditions.</descrip>

Thomas Baker, in his discussion of the Dublin Core in multiple languages, laments the lack of “comprehensive dictionaries” for metadata labels and vocabularies.[2]

Many issues in multilingual, multicultural DL development revolve around cultural variation in concept description and concept systems (KOS) and establishing linguistic authority (access to authoritative terms, documentation of authority and availability of authoritative equivalents). What we really need to support DL metadata schemas is not a “dictionary,” but standards-based external internationalization strategies such as TMX translation memories and multilingual terminology glossaries as defined by ISO TC 37’s ISO 12620 and other standards.

A concept-based multilingual glossary can be implemented to support cross-language searching. A glossary can provide authority for keyword selection where multilingual equivalents are then included in “parallel” in the resource record. Alternatively, a glossary-based DL can make it unnecessary to include more than one local term in the resource record.
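The second option, in which each record carries only one local term and the glossary supplies the equivalents at search time, might look like the following sketch. The glossary entries, the records and the functions expand and search are illustrative assumptions, not part of any existing DL system.

```python
# Hypothetical concept-based glossary: concept ID -> {language: term}.
GLOSSARY = {
    "C1": {"en": "digital library", "de": "digitale Bibliothek"},
    "C2": {"en": "translation memory", "de": "Übersetzungsspeicher"},
}

# Hypothetical resource records, each carrying one local keyword only.
RECORDS = [
    {"id": "r1", "lang": "en", "keyword": "digital library"},
    {"id": "r2", "lang": "de", "keyword": "digitale Bibliothek"},
    {"id": "r3", "lang": "de", "keyword": "Übersetzungsspeicher"},
]


def expand(term: str) -> set[str]:
    """Return the query term plus every equivalent sharing its concept."""
    wanted = term.casefold()
    for terms in GLOSSARY.values():
        folded = {t.casefold() for t in terms.values()}
        if wanted in folded:
            return folded
    # Unknown terms are searched as-is, without expansion.
    return {wanted}


def search(term: str) -> list[str]:
    """Return the IDs of records whose keyword names the same concept."""
    equivalents = expand(term)
    return [r["id"] for r in RECORDS if r["keyword"].casefold() in equivalents]


print(search("digital library"))  # matches the English and the German record
```

Because equivalence lives in the glossary, adding a new language means adding terms to the glossary rather than editing every resource record.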

A glossary can also be implemented to provide localized labels for data element names. In the event there are “local” versions of a schema (a Dublin Core or IEEE-LOM not in English) that need to be equated for software exchange, or data elements that need to be explained (training, help files) or used in an interface (resource submission form) a glossary can provide authoritative multi-language labels for a canonical data element name and its attributes.
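As a rough sketch of localized element labels, the table below maps a canonical data element name to labels in several languages. The table LABELS, its sample translations and the functions label and canonical are assumptions made for illustration; authoritative labels would come from the glossary itself.

```python
from typing import Optional

# Hypothetical label glossary: canonical element name -> {language: label}.
# The translations are illustrative, not authoritative.
LABELS = {
    "title": {"en": "Title", "de": "Titel", "fr": "Titre"},
    "creator": {"en": "Creator", "de": "Urheber"},
}


def label(element: str, lang: str, default_lang: str = "en") -> str:
    """Return the label for an element, falling back to the default language."""
    by_lang = LABELS.get(element, {})
    return by_lang.get(lang) or by_lang.get(default_lang) or element


def canonical(label_text: str, lang: str) -> Optional[str]:
    """Map a local label back to its canonical element name, if known."""
    for element, by_lang in LABELS.items():
        if by_lang.get(lang, "").casefold() == label_text.casefold():
            return element
    return None


print(label("title", "fr"))
print(canonical("Titel", "de"))
```

The reverse lookup is what allows a local version of a schema to be equated with the canonical one for software exchange.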

Determining the metadata elements, attributes, value spaces and values that are culturally dependent and, if the display and interface are to be localized, those metadata elements that are to be rendered in multiple languages;

Providing external parallel strategies for localization;

The external parallel system is a more robust localization approach, providing control, administrative tools, authoritative terminology, and authority for translations and equivalents.

The external parallel system offers reusability and scalability, and leverages the strengths of international standards.