Summary

Motivation

The primary motivation is to improve Unicode support so that developers can
write sophisticated Unicode-enabled regular expressions on the Java platform.
This is important to keep the Java platform competitive with other languages
that already offer more complete support for Unicode regular expressions.

Description

Java Regular Expressions are derived from Perl Regular Expressions and are
intended to provide Java developers with most of the Perl-style regular
expression features. Perl Regular Expressions have evolved rapidly over the
past couple of years to follow Unicode Technical Standard #18 (UTS #18),
Unicode Regular Expressions. Java Regular Expressions claim conformance with
Level 1 of the same standard, which is the "lowest" level of conformance, plus
RL2.1 Canonical Equivalents.
Given that the Unicode Standard has been widely accepted as the de facto
standard for development platforms, and that Java uses Unicode as its internal
encoding scheme, higher-level Unicode support is desirable for developers
working on Unicode-aware applications. The following new constructs and
features are proposed to provide better Unicode support in Java Regular
Expressions: