Entire novels have been written about the security
considerations that apply to HTML documents. Many are listed in
this document, to which the reader is referred for more
details. Some general concerns bear mentioning here, however:

HTML is a scripted language, and has a large number of APIs (some
of which are described in this document). Script can expose the
user to potential risks of information leakage, credential
leakage, cross-site scripting attacks, cross-site request
forgeries, and a host of other problems. While the designs in this
specification are intended to be safe if implemented correctly, a
full implementation is a massive undertaking and, as with any
software, user agents are likely to have security bugs.

Even without scripting, there are specific features in HTML
which, for historical reasons, are required for broad
compatibility with legacy content but that expose the user to
unfortunate security problems. In particular, the img
element can be used in conjunction with some other features as a
way to effect a port scan from the user's location on the
Internet. This can expose local network topologies that the
attacker would otherwise not be able to determine.

HTML relies on a compartmentalization scheme sometimes known as
the same-origin policy. An origin in most
cases consists of all the pages served from the same host, on the
same port, using the same protocol.
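
As an illustration only, the common case described above can be
sketched as a comparison of (protocol, host, port) tuples. The
function names origin_of and same_origin and the DEFAULT_PORTS
table are assumptions made for this sketch; they are not defined
by this specification, and the sketch does not cover the cases
where an origin is something other than such a tuple.

```python
from urllib.parse import urlsplit

# Assumed default ports, used only when a URL omits the port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return a (protocol, host, port) tuple for the common case."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share protocol, host, and port."""
    return origin_of(url_a) == origin_of(url_b)
```

Under these assumptions, http://example.com/a and
http://example.com:80/b compare as the same origin, while changing
the protocol, the host, or the port yields a different origin.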

It is critical, therefore, to ensure that any untrusted content
that forms part of a site be hosted on a different
origin than any sensitive content on that site.
Untrusted content can easily spoof any other page on the same
origin, read data from that origin, cause scripts in that origin
to execute, submit forms to and from that origin even if they are
protected from cross-site request forgery attacks by unique
tokens, and make use of any third-party resources exposed to or
rights granted to that origin.

Interoperability considerations:

Rules for processing both conforming and non-conforming content
are defined in this specification.

The purpose of the text/html-sandboxed MIME type
is to provide a way for content providers to indicate that they
want the file to be interpreted in a manner that does not give the
file's contents access to the rest of the site. This is achieved
by assigning the Document objects generated from
resources labeled as text/html-sandboxed unique
origins.
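
The effect of a unique origin can be modeled, very roughly, as an
opaque value that is equal only to itself. The following is a
conceptual sketch, not how any user agent is implemented: the
UniqueOrigin class and the document_origin function are
hypothetical names, and the URL-derived origin is simply passed in
as a tuple.

```python
class UniqueOrigin:
    """Opaque origin. Instances use identity comparison, so each
    one is equal only to itself."""


def document_origin(url_origin, mime_type):
    """Pick the origin for a document (hypothetical helper).

    url_origin is the origin the resource's URL would normally
    give, e.g. a (protocol, host, port) tuple.
    """
    if mime_type == "text/html-sandboxed":
        # Sandboxed resources never share an origin with the site,
        # or with each other.
        return UniqueOrigin()
    return url_origin
```

In this model, two documents labeled text/html-sandboxed and
served from the same host never compare as same-origin, whereas
two text/html documents from that host do.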

Labeling a resource with the application/xhtml+xml
type asserts that the resource is an XML document that likely has
a root element from the HTML namespace. As such, the
relevant specifications are the XML specification, the Namespaces
in XML specification, and this specification. [XML][XMLNS]

11.4 text/cache-manifest

This registration is for community review and will be submitted
to the IESG for review, approval, and registration with IANA.

Type name:

text

Subtype name:

cache-manifest

Required parameters:

No parameters

Optional parameters:

No parameters

Encoding considerations:

Always UTF-8.

Security considerations:

Cache manifests themselves pose no immediate risk unless
sensitive information is included within the
manifest. Implementations, however, are required to follow
specific rules when populating a cache based on a cache manifest,
to ensure that certain origin-based restrictions are
honored. Failure to correctly implement these rules can result in
information leakage, cross-site scripting attacks, and the
like.

Interoperability considerations:

Rules for processing both conforming and non-conforming content
are defined in this specification.

Published specification:

This document is the relevant specification.

Applications that use this media type:

Web browsers.

Additional information:

Magic number(s):

Cache manifests begin with the string "CACHE MANIFEST" (the two
words separated by a single U+0020 SPACE character), followed by
one of the following characters: U+0020 SPACE, U+0009 CHARACTER
TABULATION (tab), U+000A LINE FEED (LF), or U+000D CARRIAGE
RETURN (CR).
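
A check for this signature might look like the sketch below. It
follows the description above literally and handles nothing
beyond it (for example, it does not skip a leading byte order
mark). The function name looks_like_cache_manifest is an
assumption made for this example.

```python
MAGIC = b"CACHE MANIFEST"
# SPACE, tab, LF, CR: the characters allowed right after the magic string.
ALLOWED_FOLLOWERS = (b" ", b"\t", b"\n", b"\r")


def looks_like_cache_manifest(data: bytes) -> bool:
    """Return True if data starts with the cache manifest signature."""
    if not data.startswith(MAGIC):
        return False
    # Slice rather than index so that a missing byte yields b"".
    follower = data[len(MAGIC):len(MAGIC) + 1]
    return follower in ALLOWED_FOLLOWERS
```

Since the encoding is always UTF-8 and every character in the
signature is ASCII, comparing raw bytes is sufficient here.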