Abstract

The XHTML Basic document type includes the minimal set of modules required to be an XHTML host language document type, and in addition it includes images, forms, basic tables, and object support. It is designed for Web clients that do not support the full set of XHTML features; for example, Web clients such as mobile phones, PDAs, pagers, and set-top boxes. The document type is rich enough for content authoring.

XHTML Basic is designed as a common base that may be extended. The goal of XHTML Basic is to serve as a common language supported by various kinds of user agents.

This revision, 1.1, supersedes version 1.0 as defined in http://www.w3.org/TR/2000/REC-xhtml-basic-20001219. In this revision, several new features have been incorporated into the language in order to better serve the small-device community that is this language's major user:

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is based upon the XHTML Basic 1.1 Candidate Recommendation of 13 July 2007. Feedback received during that review resulted only in minor changes. The Working Group believes that this specification addresses all Candidate Recommendation issues.

Publication as a Proposed Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

If this document is approved as a W3C Recommendation, it will supersede the 19 December 2000 version of the XHTML Basic Recommendation.

1. Introduction

1.1. XHTML for Small Information Appliances

HTML 4 is a powerful language for authoring Web content, but its design does not take into consideration issues pertinent to small devices, including the implementation cost (in power, memory, etc.) of the full feature set. Consumer devices with limited resources cannot generally afford to implement the full feature set of HTML 4. Requiring a full-fledged computer for access to the World Wide Web excludes a large portion of the population from accessing online information and services through consumer devices.

Because there are many ways to subset HTML, there are many almost identical subsets defined by organizations and companies. Without a common base set of features, developing applications for a wide range of Web clients is difficult.

The motivation for XHTML Basic is to provide an XHTML document type that can be shared across communities (e.g. desktop, TV, and mobile phones), and that is rich enough to be used for simple content authoring. New community-wide document types can be defined by extending XHTML Basic in such a way that XHTML Basic documents are in the set of valid documents of the new document type. Thus an XHTML Basic document can be presented on the maximum number of Web clients.

The document type definition for XHTML Basic is implemented based on the XHTML modules defined in XHTML Modularization [XHTMLMOD].

For information on best practices for mobile content, see [MOBILEBP].

1.2. Background and Requirements

Information appliances are targeted for particular uses. They support the features they need for the functions they are designed to fulfill. The following are examples of different information appliances:

Mobile phones

Televisions

PDAs

Vending machines

Pagers

Car navigation systems

Mobile game machines

Digital book readers

Smart watches

Existing subsets and variants of HTML for these clients include Compact HTML [CHTML], the Wireless Markup Language [WML], and the "HTML 4.0 Guidelines for Mobile Access" [GUIDELINES]. The common features found in these document types include:

Basic text (including headings, paragraphs, and lists)

Hyperlinks and links to related documents

Basic forms

Basic tables

Images

Meta information

This set of HTML features has been the starting point for the design of XHTML Basic. Since many content developers are familiar with these HTML features, they comprise a useful host language that may be combined with markup modules from other languages according to the methods described in "XHTML Modularization" [XHTMLMOD]. For example, XHTML Basic may be extended with a custom module to support richer markup semantics in specific environments.

It is not the intention of XHTML Basic to limit the functionality of future languages. But since the features in HTML 4 (frames, advanced tables, etc.) were developed for a desktop computer type of client, they have proved to be inappropriate for many non-desktop devices. XHTML Basic will be extended and built upon. Extending XHTML from a common and basic set of features, instead of almost identical subsets or the too-large set of functions in HTML 4, will be good for interoperability on the Web, as well as for scalability.

Compared to the rich functionality of HTML 4, XHTML Basic may look like one step back, but in fact, it is two steps forward for clients that do not need what is in HTML 4 and for content developers who get one XHTML subset instead of many.

1.3. Design Rationale

This section explains why certain HTML features are not part of XHTML Basic.

1.3.1. Presentation

Many simple Web clients cannot display fonts other than monospace. Bi-directional text, bold faced font, and other text extension elements are not supported.

It is recommended that style sheets be used to create a presentation that is appropriate for the device.

1.3.2. Tables

Basic XHTML tables ([XHTMLMOD], section 5.6.1) are supported, but tables can be difficult to display on small devices. It is recommended that content developers follow the Web Content Accessibility Guidelines 1.0 for creating accessible tables ([WCAG10], Guideline 5). Note that in the Basic Tables Module, nesting of tables is prohibited.

1.3.3. Frames

Frames are not supported. Frames depend on a screen interface and may not be applicable to some small appliances like phones, pagers, and watches.

2. Conformance

This section is normative.

2.1. Document Conformance

A Conforming XHTML Basic document is a document that requires only the facilities described as mandatory in this specification. Such a document must meet all of the following criteria:

The document must conform to the constraints expressed in Appendix B.

The root element of the document must be <html>.

The name of the default namespace on the root element must be the XHTML namespace name, http://www.w3.org/1999/xhtml.

There must be a DOCTYPE declaration in the document prior to the root element. If present, the public identifier included in the DOCTYPE declaration must reference the DTD found in Appendix B using its Formal Public Identifier. The system identifier may be modified appropriately.

The DTD subset must not be used to override any parameter entities in the DTD.
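Several of the criteria above can be checked mechanically. The following is a minimal sketch in Python, not a validator: it does not validate against the DTD in Appendix B, it cannot tell a default namespace declaration from a prefixed one, and it only recognizes double-quoted identifiers. The function name `check_basic_document` is hypothetical, and the Formal Public Identifier and system identifier shown are assumptions taken from the published XHTML Basic 1.1 DTD rather than from the text above.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumption: this Formal Public Identifier comes from the published
# XHTML Basic 1.1 DTD; the text above does not quote it.
BASIC11_FPI = "-//W3C//DTD XHTML Basic 1.1//EN"

# Matches <!DOCTYPE html PUBLIC "fpi" "system-id" [internal subset]>.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*(\[.*?\])?\s*>',
    re.DOTALL,
)


def check_basic_document(text):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []

    # Parsing enforces well-formedness, which also guarantees that any
    # DOCTYPE declaration appears before the root element.
    root = ET.fromstring(text)
    if root.tag != "{%s}html" % XHTML_NS:
        problems.append("root element is not html in the XHTML namespace")

    match = DOCTYPE_RE.search(text)
    if match is None:
        problems.append("no DOCTYPE declaration with a public identifier")
    else:
        if match.group(1) != BASIC11_FPI:
            problems.append("unexpected public identifier: " + match.group(1))
        # Any parameter entity declared in the internal subset is flagged,
        # since it could override one defined in the DTD.
        subset = match.group(3) or ""
        if re.search(r"<!ENTITY\s+%", subset):
            problems.append("internal subset declares a parameter entity")
    return problems


GOOD_DOC = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN"\n'
    '    "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Hello</title></head><body><p>Hello</p></body></html>"
)

print(check_basic_document(GOOD_DOC))
```

The parameter entity check is deliberately conservative: it flags every parameter entity declaration in the internal subset instead of trying to decide whether the declaration overrides one from the DTD.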

XHTML Basic 1.1 documents SHOULD be labeled with the Internet Media Type "application/xhtml+xml" as defined in [RFC3236]. For further information on using media types with XHTML, see the informative note [XHTMLMIME].

The target attribute is designed to be a general hook for binding to an external environment (such as Frames, multiple windows, or browser-tabbed windows); when there is no such external environment bound to the user agent, the user agent can ignore the target attribute. When there is an external environment bound, the conformance requirements for the target attribute are defined in each environment.

The content author needs to be aware that the user agent behavior for the target attribute depends on multiple factors such as the existence of an environment binding, restrictions of available resources, existence of other applications and user preferences (such as pop-up blockers), and implementation-dependent design decisions. When there is no external environmental conformance, it is recommended that authors do not depend on use of the target attribute.

It should be noted that any implementation-dependent use of the target attribute might impede interoperability.

4. How to Use XHTML Basic

Although XHTML Basic can be used as it is - a simple XHTML language with text, links, and images - the intention of its simple design is for use as a host language. A host language can contain a mix of vocabularies all rolled into one document type. It is natural that XHTML is the host language, since that is what most Web developers are used to.

When markup from other languages is added to XHTML Basic, the resulting document type will be an extension of XHTML Basic. Content developers can develop for XHTML Basic or take advantage of the extensions. The goal of XHTML Basic is to serve as a common language supported by various kinds of user agents.

5. XHTML inputmode Attribute Module

This section is normative.

This section was originally a component of XForms 1.0, and was written by Martin Duerst.

The inputmode Attribute Module defines the inputmode attribute.

inputmode = CDATA

This attribute provides a hint to the user agent about the input mode to select for the current element.

The following table shows additional attributes for elements defined elsewhere when the inputmode module is selected.

| Elements | Attributes | Notes |
|---|---|---|
| input& | inputmode (CDATA) | When the Basic Forms or Forms Module is selected. |
| textarea& | inputmode (CDATA) | When the Basic Forms or Forms Module is selected. |

The attribute inputmode provides a hint to the user agent to select an appropriate input mode for the text input expected in an associated form control. The input mode may be a keyboard configuration, an input method editor (also called front end processor) or any other setting affecting input on the device(s) used.

Using inputmode, the author can give hints to the agent that make form input easier for the user. Authors should provide inputmode attributes wherever possible, making sure that the values used cover a wide range of devices.

5.1 inputmode Attribute Value Syntax

The value of the inputmode attribute is a white space separated list of tokens. Tokens are either sequences of alphabetic letters or absolute URIs. The latter can be distinguished from the former by noting that absolute URIs contain a ':'. Tokens are case-sensitive. All the tokens consisting of alphabetic letters only are defined in this specification, in 5.3 List of Tokens (or a successor of this specification).
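The syntax rules above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not part of this specification: the function name `parse_inputmode` and the kind labels are hypothetical, "alphabetic letters" is read as ASCII letters, and Python's `str.split()` is used as an approximation of XML white space handling.

```python
def parse_inputmode(value):
    """Split an inputmode value into (token, kind) pairs.

    kind is "uri" for tokens containing ':', "name" for tokens made only
    of ASCII letters, and "invalid" for anything else.
    """
    pairs = []
    for token in value.split():
        if ":" in token:
            kind = "uri"
        elif token.isascii() and token.isalpha():
            kind = "name"
        else:
            kind = "invalid"
        # Tokens are case-sensitive, so no case folding is applied.
        pairs.append((token, kind))
    return pairs


print(parse_inputmode("latin upperCase http://example.org/modes#stylus"))
```

The URI in the usage line is a made-up placeholder; this specification defines no URI tokens of its own.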

This specification does not define any URIs for use as tokens, but allows others to define such URIs for extensibility. This may become necessary for devices with input modes that cannot be covered by the tokens provided here. The URI should dereference to a human-readable description of the input mode associated with the use of the URI as a token. This description should describe the input mode indicated by this token, and whether and how this token modifies other tokens or is modified by other tokens.

5.2 User Agent Behavior

Upon entering an empty form control with an inputmode attribute, the user agent should select the input mode indicated by the inputmode attribute value. User agents should not use the inputmode attribute to set the input mode when entering a form control with text already present. To set the appropriate input mode when entering a form control that already contains text, user agents should rely on platform-specific conventions.

User agents should make available all the input modes which are supported by the (operating) system/device(s) they run on/have access to, and which are installed for regular use by the user. This is typically only a small subset of the input modes that can be described with the tokens defined here.

Note: Additional guidelines for user agent implementation are found at [UAAG 1.0].

The following simple algorithm is used to define how user agents match the values of an inputmode attribute to the input modes they can provide. This algorithm does not have to be implemented directly; user agents just have to behave as if they used it. The algorithm is not designed to produce "obvious" or "desirable" results for every possible combination of tokens, but to produce correct behavior for frequent token combinations and predictable behavior in all cases.

First, each of the input modes available is represented by one or more lists of tokens. An input mode may correspond to more than one list of tokens; as an example, on a system set up for a Greek user, both "greek upperCase" and "user upperCase" would correspond to the same input mode. No two lists will be the same.

Second, the inputmode attribute is scanned from front to back. For each token t in the inputmode attribute, if among the remaining lists of tokens representing available input modes there is any list that contains t, then all lists of tokens representing available input modes that do not contain t are removed. If there is no remaining list of tokens that contains t, then t is ignored.

Third, if one or more lists of tokens are left, and they all correspond to the same input mode, then this input mode is chosen. If no list is left (meaning that there was none at the start) or if the remaining lists correspond to more than one input mode, then no input mode is chosen.

Example: Assume the list of lists of tokens representing the available input modes is: {"cyrillic upperCase", "cyrillic lowerCase", "cyrillic", "latin", "user upperCase", "user lowerCase"}. Then the following inputmode values select the following input modes: "cyrillic title" selects "cyrillic", "cyrillic lowerCase" selects "cyrillic lowerCase", "lowerCase cyrillic" selects "cyrillic lowerCase", "latin upperCase" selects "latin", but "upperCase latin" does select "cyrillic upperCase" or "user upperCase" if they correspond to the same input mode, and does not select any input mode if "cyrillic upperCase" and "user upperCase" do not correspond to the same input mode.
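The three steps can be written down directly. The Python sketch below is one literal reading of them, offered as an illustration only; user agents merely have to behave as if they used the algorithm. The function name `select_input_mode` and the mode labels (`CYR_UPPER` and so on) are hypothetical, and the table assumes a system where "user" and "cyrillic" denote the same modes.

```python
def select_input_mode(inputmode, available):
    """Pick an input mode for an inputmode value, or None.

    available is a list of (token_list, mode) pairs, where token_list is a
    white space separated string such as "cyrillic upperCase".
    """
    # Step 1: each available input mode is represented by lists of tokens.
    remaining = [(set(tokens.split()), mode) for tokens, mode in available]

    # Step 2: scan front to back; a token narrows the candidates only if
    # at least one remaining list contains it, otherwise it is ignored.
    for t in inputmode.split():
        if any(t in tokens for tokens, _ in remaining):
            remaining = [(tokens, mode) for tokens, mode in remaining
                         if t in tokens]

    # Step 3: choose a mode only if every remaining list agrees on it.
    modes = {mode for _, mode in remaining}
    return modes.pop() if len(modes) == 1 else None


# Hypothetical setup: "user" lists share their modes with "cyrillic" lists.
AVAILABLE = [
    ("cyrillic upperCase", "CYR_UPPER"),
    ("cyrillic lowerCase", "CYR_LOWER"),
    ("cyrillic", "CYR"),
    ("latin", "LATIN"),
    ("user upperCase", "CYR_UPPER"),
    ("user lowerCase", "CYR_LOWER"),
]

for value in ("cyrillic lowerCase", "lowerCase cyrillic",
              "latin upperCase", "upperCase latin"):
    print(value, "->", select_input_mode(value, AVAILABLE))
```

One caveat on this reading: for "cyrillic title", the steps as worded leave three candidate lists ("cyrillic upperCase", "cyrillic lowerCase", and "cyrillic") that map to different modes, so the sketch returns None, whereas the example above reports "cyrillic". The sketch does not guess at a tie-break rule to close that gap.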

5.3 List of Tokens

Tokens defined in this specification are separated into two categories: Script tokens and modifiers. In inputmode attributes, script tokens should always be listed before modifiers.
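An authoring tool could check the ordering recommendation with a few lines of Python. The sketch below is illustrative only: the function name `scripts_precede_modifiers` is hypothetical, and the `MODIFIERS` set is an assumption assembled from modifier names that appear in this text, so it may be incomplete. Every token not in that set is treated as a script token or URI.

```python
# Assumption: modifier names gathered from this text; possibly incomplete.
MODIFIERS = {
    "upperCase", "lowerCase", "digits", "symbols",
    "predictOn", "predictOff", "halfWidth",
}


def scripts_precede_modifiers(value):
    """Return True if no non-modifier token follows a modifier token."""
    seen_modifier = False
    for token in value.split():
        if token in MODIFIERS:
            seen_modifier = True
        elif seen_modifier:
            # A script token (or URI) appears after a modifier.
            return False
    return True


print(scripts_precede_modifiers("thai digits"))
print(scripts_precede_modifiers("digits thai"))
```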

5.3.1 Script Tokens

Script tokens provide a general indication of the set of characters that is covered by an input mode. In most cases, script tokens correspond directly to [Unicode Scripts]. Some tokens correspond to the block names in Java class java.lang.Character.UnicodeBlock ([Java Unicode Blocks]) or Unicode Block names. However, this neither means that an input mode has to allow input for all the characters in the script or block, nor that an input mode is limited to only characters from that specific script. As an example, a "latin" keyboard doesn't cover all the characters in the Latin script, and includes punctuation which is not assigned to the Latin script. The version of the Unicode Standard that these script names are taken from is 3.2.

| Input Mode Token | Comments |
|---|---|
| arabic | Unicode script name |
| armenian | Unicode script name |
| bengali | Unicode script name |
| bopomofo | Unicode script name |
| braille | used to input braille patterns (not to indicate a braille input device) |
| buhid | Unicode script name |
| canadianAboriginal | Unicode script name |
| cherokee | Unicode script name |
| cyrillic | Unicode script name |
| deseret | Unicode script name |
| devanagari | Unicode script name |
| ethiopic | Unicode script name |
| georgian | Unicode script name |
| greek | Unicode script name |
| gothic | Unicode script name |
| gujarati | Unicode script name |
| gurmukhi | Unicode script name |
| han | Unicode script name |
| hangul | Unicode script name |
| hanja | Subset of 'han' used in writing Korean |
| hanunoo | Unicode script name |
| hebrew | Unicode script name |
| hiragana | Unicode script name (may include other Japanese scripts produced by conversion from hiragana) |
| ipa | International Phonetic Alphabet |
| kanji | Subset of 'han' used in writing Japanese |
| kannada | Unicode script name |
| katakana | Unicode script name (full-width, not half-width) |
| khmer | Unicode script name |
| lao | Unicode script name |
| latin | Unicode script name |
| malayalam | Unicode script name |
| math | mathematical symbols and related characters |
| mongolian | Unicode script name |
| myanmar | Unicode script name |
| ogham | Unicode script name |
| oldItalic | Unicode script name |
| oriya | Unicode script name |
| runic | Unicode script name |
| simplifiedHanzi | Subset of 'han' used in writing Simplified Chinese |
| sinhala | Unicode script name |
| syriac | Unicode script name |
| tagalog | Unicode script name |
| tagbanwa | Unicode script name |
| tamil | Unicode script name |
| telugu | Unicode script name |
| thaana | Unicode script name |
| thai | Unicode script name |
| tibetan | Unicode script name |
| traditionalHanzi | Subset of 'han' used in writing Traditional Chinese |
| user | Special value denoting the 'native' input of the user (e.g. to input her name or text in her native language). |
| yi | Unicode script name |

5.3.2 Modifier Tokens

Modifier tokens can be added to the scripts they apply to in order to more closely specify the kind of characters expected in the form control. Traditional PC keyboards do not need most modifier tokens (indeed, users on such devices would be quite confused if the software decided to change case on its own; CAPS lock for upperCase may be an exception). However, modifier tokens can be very helpful to set input modes for small devices.

| Input Mode Token | Comments |
|---|---|
| | start input with one uppercase letter, then continue with lowercase letters |
| digits | digits of a particular script (e.g. inputmode='thai digits') |
| symbols | symbols, punctuation (suitable for a particular script) |
| predictOn | text prediction switched on (e.g. for running text) |
| predictOff | text prediction switched off (e.g. for passwords) |
| halfWidth | half-width compatibility forms (e.g. Katakana; deprecated) |

5.4 Relationship to XML Schema pattern facets

User agents may use information available in an XML Schema pattern facet to set the input mode. Note that a pattern facet is a hard restriction on the lexical value of an instance data node, and can specify different restrictions for different parts of the data item. Attribute inputmode is a soft hint about the kinds of characters that the user is most likely to start entering into the form control. Attribute inputmode is provided in addition to pattern facets for the following reasons:

The set of allowable characters specified in a pattern may be so wide that it is not possible to deduce a reasonable input mode setting. Nevertheless, there frequently is a kind of characters that will be input by the user with high probability. In such a case, inputmode allows the input mode to be set for the user's convenience.

In some cases, it would be possible to derive the input mode setting from the pattern because the set of characters allowed in the pattern closely corresponds to a set of characters covered by an inputmode attribute value. However, such a derivation would require a lot of data and calculations on the user agent.

Small devices may leave the checking of patterns to the server, but will easily be able to switch to those input modes that they support. Being able to make data entry for the user easier is of particular importance on small devices.

5.5 Examples

This is an example of a form for Japanese address input. It is shown in table form; it will be replaced by actual syntax in a later version of this specification.

| Caption | inputmode |
|---|---|
| Family name | hiragana |
| (in kana) | katakana |
| Given name | hiragana |
| (in kana) | katakana |
| Zip code | latin digits |
| Address | hiragana |
| (in kana) | katakana |
| Email | latin lowerCase |
| Telephone | latin digits |
| Comments | user predictOn |
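To give a sense of what such a form could look like in markup, the following Python sketch builds input elements carrying the inputmode values from the table and serializes them. It is not the syntax promised for a later version of this specification: the function name `build_form`, the field identifiers, the `action` value, and the choice to use an input element for every row are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# (caption, hypothetical field id, inputmode value) for each table row.
FIELDS = [
    ("Family name", "family", "hiragana"),
    ("(in kana)", "family-kana", "katakana"),
    ("Given name", "given", "hiragana"),
    ("(in kana)", "given-kana", "katakana"),
    ("Zip code", "zip", "latin digits"),
    ("Address", "address", "hiragana"),
    ("(in kana)", "address-kana", "katakana"),
    ("Email", "email", "latin lowerCase"),
    ("Telephone", "telephone", "latin digits"),
    ("Comments", "comments", "user predictOn"),
]


def build_form(fields):
    """Build a form element with one labelled text input per field."""
    form = ET.Element("form", {"action": "#", "method": "post"})
    for caption, field_id, mode in fields:
        row = ET.SubElement(form, "p")
        label = ET.SubElement(row, "label", {"for": field_id})
        label.text = caption
        ET.SubElement(row, "input", {
            "type": "text",
            "id": field_id,
            "name": field_id,
            "inputmode": mode,
        })
    return form


markup = ET.tostring(build_form(FIELDS), encoding="unicode")
print(markup)
```

In every row the script token comes first and the modifier second, in line with the ordering recommendation in 5.3.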

6. Acknowledgements

Version 1.0 of this specification was prepared by the W3C HTML Working Group. At the time of publication of the first edition, the members were:

B. XHTML Basic Document Type Definition

This appendix is normative.

The DTD Implementation of XHTML Basic 1.1 is contained in this appendix. There are direct links to the various files, and the files are also contained in the "Gzip'd TAR" and "Zip" archives linked to at the top of this document.

B.1. SGML Open Catalog Entry for XHTML Basic

This section contains the SGML Open Catalog-format definition of the public identifiers for XHTML Basic.

B.3. XHTML Basic Customizations

An XHTML Family Document Type (such as XHTML Basic) must define the content model that it uses. This is done through a separate content model module that is instantiated by the XHTML Modular Framework. The content model module and the XHTML Basic Driver (above) work together to customize the module implementations to the document type's specific requirements. The content model module for XHTML Basic is defined below: