Contents

UNL is designed to establish a simple foundation for representing the most central aspects of information and meaning in a machine- and human-language-independent form. As a language-independent formalism, UNL aims to code, store, disseminate and retrieve information independently of the original language in which it was expressed. In this sense, UNL seeks to provide tools for overcoming the language barrier in a systematic way.

At first glance, UNL seems to be a kind of interlingua, into which source texts are converted before being translated into target languages. It can, in fact, be used for this purpose, and very efficiently, too. However, its real strength is knowledge representation and its primary objective is to provide an infrastructure for handling knowledge that already exists or can exist in any given language.

Nevertheless, it is important to note that at present it would be foolish to claim to represent the “full” meaning of any word, sentence, or text for any language. Subtleties of intention and interpretation make the “full meaning,” however we might conceive it, too variable and subjective for any systematic treatment. Thus UNL avoids the pitfalls of trying to represent the “full meaning” of sentences or texts, targeting instead the “core” or “consensual” meaning most often attributed to them. In this sense, much of the subtlety of poetry, metaphor, figurative language, innuendo, and other complex, indirect communicative behaviors is beyond the current scope and goals of UNL. Instead, UNL targets direct communicative behavior and literal meaning as a tangible, concrete basis for most human communication in practical, day-to-day settings.

In the UNL approach, information conveyed by natural language is represented sentence by sentence as a hypergraph composed of a set of directed binary labeled links (referred to as relations) between nodes or hypernodes (the Universal Words, or simply UWs), which stand for concepts. UWs can also be annotated with attributes representing context information.

As an example, the English sentence ‘The sky was blue?!’ can be represented in UNL as follows:

In the example above, "sky(icl>natural world)" and "blue(icl>color)", which represent individual concepts, are UWs; "aoj" (= attribute of an object) is a directed binary semantic relation linking the two UWs; and "@def", "@interrogative", "@past", "@exclamation" and "@entry" are attributes modifying UWs.

UWs are intended to represent universal concepts, but are expressed in English words or in any other natural language in order to be humanly readable. They consist of a "headword" (the UW root) and a "constraint list" (the UW suffix between parentheses), where the constraints are used to disambiguate the general concept conveyed by the headword. The set of UWs is organized in the UNL Ontology, in which high-level concepts are related to lower-level ones through the relations "icl" (= is a kind of), "iof" (= is an instance of) and "equ" (= is equal to).

Relations are intended to represent semantic links between words in every existing language. They can be ontological (such as "icl" and "iof," referred to above), logical (such as "and" and "or"), and thematic (such as "agt" = agent, "ins" = instrument, "tim" = time, "plc" = place, etc.). There are currently 46 relations in the UNL Specs. They jointly define the UNL syntax.

Within the UNL Program, the process of representing natural language sentences in UNL graphs is called UNLization, and the process of generating natural language sentences out of UNL graphs is called NLization. UNLization, which involves natural language analysis and understanding, is intended to be carried out semi-automatically (i.e., by humans with computer aids); and NLization is intended to be carried out fully automatically.

The UNL Programme started in 1996, as an initiative of the Institute of Advanced Studies of the United Nations University in Tokyo, Japan. In January 2001, the United Nations University set up an autonomous organization, the UNDL Foundation, to be responsible for the development and management of the UNL Programme. The Foundation, a non-profit international organisation, has an independent identity from the United Nations University, although it has special links with the UN. It inherited from the UNU/IAS the mandate of implementing the UNL Programme so that it can fulfil its mission.

The Programme has already crossed important milestones. The overall architecture of the UNL System has been developed with a set of basic software and tools necessary for its functioning. These are being tested and improved. A vast amount of linguistic resources from the various native languages already under development, as well as from the UNL expression, has been accumulated in the last few years. Moreover, the technical infrastructure for expanding these resources is already in place, thus facilitating the participation of many more languages in the UNL system from now on. A growing number of scientific papers and academic dissertations on the UNL are being published every year.

The most visible accomplishment so far is the recognition by the Patent Co-operation Treaty (PCT) of the innovative character and industrial applicability of the UNL, which was obtained in May 2002 through the World Intellectual Property Organisation (WIPO). Acquiring the patents (US patents 6,704,700 and 7,107,206) for the UNL is a completely novel achievement within the United Nations.