Entire novels could be written about the security considerations that apply to HTML
documents. Many are listed in this document, to which the reader is referred for more details.
Some general concerns bear mentioning here, however:

HTML is a scripted language, and has a large number of APIs (some of which are described in
this document). Script can expose the user to potential risks of information leakage, credential
leakage, cross-site scripting attacks, cross-site request forgeries, and a host of other
problems. While the designs in this specification are intended to be safe if implemented
correctly, a full implementation is a massive undertaking and, as with any software, user agents
are likely to have security bugs.

Even without scripting, there are specific features in HTML that, for historical reasons,
are required for broad compatibility with legacy content but expose the user to unfortunate
security problems. In particular, the img element can be used in conjunction with
some other features as a way to effect a port scan from the user's location on the Internet.
This can expose local network topologies that the attacker would otherwise not be able to
determine.

HTML relies on a compartmentalization scheme sometimes known as the same-origin
policy. An origin in most cases consists of all the pages served from the same
host, on the same port, using the same protocol.

It is critical, therefore, to ensure that any untrusted content that forms part of a site be
hosted on a different origin than any sensitive content on that site. Untrusted
content can easily spoof any other page on the same origin, read data from that origin, cause
scripts in that origin to execute, submit forms to and from that origin even if they are
protected from cross-site request forgery attacks by unique tokens, and make use of any
third-party resources exposed to or rights granted to that origin.
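
The origin comparison described above can be sketched as follows. This is a minimal
illustration, not the algorithm user agents are required to implement: it covers only the
common case of a (protocol, host, port) tuple, and the names origin_of, same_origin and
DEFAULT_PORTS, as well as the example.com URLs, are assumptions made for this example.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports for the two schemes handled in this sketch (an assumption;
# real user agents know about more schemes and more special cases).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Tuple[str, str, Optional[int]]:
    """Return a simplified (protocol, host, port) tuple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit already lowercases the host name
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """Two URLs share an origin when protocol, host and port all match."""
    return origin_of(a) == origin_of(b)


# Same host, same protocol, explicit default port: same origin.
print(same_origin("https://example.com/a", "https://example.com:443/b"))
# Untrusted content served from a separate host: different origin.
print(same_origin("https://example.com/", "https://usercontent.example.com/"))
```

Under these assumptions, serving untrusted content from a separate host (or port, or
protocol) is enough to place it in a different origin than the sensitive content.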

Interoperability considerations:

Rules for processing both conforming and non-conforming content are
defined in the published specification.

Fragment identifiers used with text/html resources either refer to the
indicated part of the document or provide state information for in-page
scripts. Detailed processing for fragment identifiers is defined in the
"Navigating to a fragment identifier" section
(http://www.w3.org/TR/html/browsers.html#scroll-to-fragid).
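
As a small illustration of what a fragment identifier is, the sketch below separates the
fragment from the rest of a URL using Python's standard library. It shows only the
syntactic split; the processing of the fragment is defined in the section cited above.
The variable names are assumptions made for this example.

```python
from urllib.parse import urldefrag

# The URL cited in the text, used here purely as sample input.
full_url = "http://www.w3.org/TR/html/browsers.html#scroll-to-fragid"

# urldefrag splits a URL into the part before "#" and the fragment itself.
resource_url, fragment = urldefrag(full_url)

print(resource_url)  # http://www.w3.org/TR/html/browsers.html
print(fragment)      # scroll-to-fragid
```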

Restrictions on usage:

No restrictions apply.

Provisional registration? (standards tree only):

No.

Additional information:

Deprecated alias names for this type: N/A

Magic number(s): No sequence of bytes can uniquely identify an HTML document. More
information on detecting HTML documents is available in the MIME Sniffing specification.
[MIMESNIFF]

Labeling a resource with the application/xhtml+xml type asserts that the resource is an XML
document that likely has a root element from the HTML namespace. Thus, the relevant
specifications are the XML specification, the Namespaces in XML specification, and
http://www.w3.org/TR/html. [XML][XMLNS]

Magic number(s): No sequence of bytes can uniquely identify an XHTML document. More
information on detecting XHTML documents is available in the MIME Sniffing specification.
[MIMESNIFF]

File extension(s): "xhtml" and "xht" are sometimes used as extensions for XML resources that
have a root element from the HTML namespace.

Macintosh file type code: TEXT.

Object Identifiers: N/A

Person & email address to contact for further information:

Robin Berjon <robin@berjon.com>

Intended usage:

Common

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

12.4 application/x-www-form-urlencoded

This registration has been filed successfully with IANA.

Type name:

application

Subtype name:

x-www-form-urlencoded

Required parameters:

No parameters

Optional parameters:

No parameters

Encoding considerations:

7bit

Security considerations:

In isolation, an application/x-www-form-urlencoded payload poses no security
risks. However, as this type is usually used as part of a form submission, all the risks that
apply to HTML forms need to be considered in the context of this type.

Fragment identifiers have no meaning with the application/x-www-form-urlencoded type.

Restrictions on usage:

This type is only intended to be used to describe HTML form submission payloads.
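
For illustration, the sketch below builds and parses a payload of this kind with Python's
standard library. It is not the serialization algorithm that user agents follow; the field
names and values are invented for the example. Note that the resulting payload consists
only of ASCII characters, which is consistent with the 7bit encoding stated above.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical form fields: a space, reserved characters, and a non-ASCII letter.
fields = [("name", "Ada Lovelace"), ("q", "a&b=c"), ("city", "Zürich")]

# urlencode percent-encodes reserved and non-ASCII characters (as UTF-8)
# and writes spaces as "+".
payload = urlencode(fields)
print(payload)  # name=Ada+Lovelace&q=a%26b%3Dc&city=Z%C3%BCrich

# parse_qsl reverses the encoding and preserves field order.
decoded = parse_qsl(payload)
print(decoded == fields)  # True
```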

Provisional registration? (standards tree only):

No.

Additional information:

Deprecated alias names for this type: N/A

Magic number(s): There is no reliable mechanism for recognising such payloads.

File extension(s): N/A

Macintosh file type code: N/A.

Object Identifiers: N/A

Person & email address to contact for further information:

Robin Berjon <robin@berjon.com>

Intended usage:

Common

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

12.5 text/cache-manifest

This registration has been filed successfully with IANA.

Type name:

text

Subtype name:

cache-manifest

Required parameters:

No parameters

Optional parameters:

charset

The charset parameter may be provided. The parameter's value must be
"utf-8". This parameter serves no purpose; it is only allowed for
compatibility with legacy servers.
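
The parameter rule above could be checked along the following lines. This is a rough
sketch under stated assumptions: the function name manifest_content_type_ok is invented,
the header parsing is deliberately naive (it does not handle semicolons inside quoted
values), the charset value is compared case-insensitively, and any parameter other than
charset is rejected. None of these choices is mandated by the text above.

```python
def manifest_content_type_ok(header_value: str) -> bool:
    """Naively check a Content-Type value against the text/cache-manifest rules."""
    media_type, *raw_params = [part.strip() for part in header_value.split(";")]
    if media_type.lower() != "text/cache-manifest":
        return False
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing ";"
        name, _, value = raw.partition("=")
        name = name.strip().lower()
        value = value.strip().strip('"')
        # Assumption: only "charset" is accepted, and its value is matched
        # case-insensitively against "utf-8".
        if name != "charset" or value.lower() != "utf-8":
            return False
    return True


print(manifest_content_type_ok("text/cache-manifest"))
print(manifest_content_type_ok("text/cache-manifest; charset=utf-8"))
print(manifest_content_type_ok("text/cache-manifest; charset=iso-8859-1"))
```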

Encoding considerations:

8bit

Security considerations:

Cache manifests themselves contain no executable content and pose no immediate risk unless
sensitive information is included within the manifest.

Implementations, however, are required to follow specific rules when populating a cache based
on a cache manifest, to ensure that certain origin-based restrictions are honoured. Failure to
correctly implement these rules can result in information leakage, cross-site scripting
attacks, and the like.

Caching mechanisms are frequent targets of poisoning attacks, and the one that this file type
supports is no exception. The published specification includes steps intended to mitigate such
issues (notably non-malicious cache poisoning from captive portals), but implementers are
advised to exercise caution when caching.

12.6 web+ scheme prefix

This section describes a convention for use with the IANA URI scheme registry. It does not
itself register a specific scheme. [RFC4395]

URI scheme name:

Schemes starting with the four characters "web+" followed by one or more letters in the range
a-z.
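
The naming rule above can be expressed as a regular expression. The sketch below is one
possible check; the names WEB_PLUS_SCHEME and is_web_plus_scheme and the sample scheme
names are assumptions made for this example. It applies the rule literally, so uppercase
letters and digits after the prefix are rejected.

```python
import re

# "web+" followed by one or more letters in the range a-z, and nothing else.
WEB_PLUS_SCHEME = re.compile(r"web\+[a-z]+")


def is_web_plus_scheme(scheme: str) -> bool:
    """Return True if the scheme name follows the "web+" convention."""
    return WEB_PLUS_SCHEME.fullmatch(scheme) is not None


print(is_web_plus_scheme("web+chat"))   # True
print(is_web_plus_scheme("web+"))       # False: no letters after the prefix
print(is_web_plus_scheme("web+chat2"))  # False: digits are outside a-z
```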

Status:

permanent

URI scheme syntax:

Scheme-specific.

URI scheme semantics:

Scheme-specific.

Encoding considerations:

All "web+" schemes should use UTF-8 encodings where relevant.

Applications/protocols that use this URI scheme name:

Scheme-specific.

Interoperability considerations:

The scheme is expected to be used in the context of Web applications.

Security considerations:

Any Web page is able to register a handler for all "web+" schemes. As
such, these schemes must not be used for features intended to be core platform features (e.g.
network transfer protocols like HTTP or FTP). Similarly, such schemes must not store
confidential information in their URLs, such as usernames, passwords, personal information, or
confidential project names.