WARNING: This Standards-Track document is Experimental. Publication as an XMPP Extension Protocol does not imply approval of this proposal by the XMPP Standards Foundation. Implementation of the protocol described herein is encouraged in exploratory implementations, but production systems are advised to carefully consider whether it is appropriate to deploy implementations of this protocol before it advances to a status of Draft.

Historically, XMPP has had no system for simple text styling.
Instead, specifications like XHTML-IM (XEP-0071) [1] that require full layout engines have
been used, leading to numerous security issues with implementations.
Some entities have also performed their own styling based on identifiers in
the body.
While this has worked well in the past, it is not interoperable and leads to
entities each supporting their own informal styling languages.

This specification aims to provide a single, interoperable formatted text
syntax that can be used by entities that do not require full layout engines.

Many important terms used in this document are defined in Unicode [2].
The terms "left-to-right" (LTR) and "right-to-left" (RTL) are defined in
Unicode Standard Annex #9 [3].
The term "formatted text" is defined in RFC 7764 [4].

Block

Any chunk of text that can be parsed unambiguously in one pass.
Blocks may contain one or more children which may be other blocks or
spans.
For example:

A single line of text comprising one or more spans

A block quotation

A preformatted code block

Formal markup language

A structured markup language such as LaTeX, SGML, HTML, or XML that is
formally defined and may include metadata unrelated to formatting or
text style.

Plain text

Text that does not convey any particular formatting or interpretation of
the text by computer programs.

Span

A group of text that may be rendered inline alongside other spans.
Spans may be either plain text with no formatting applied, or may be
formatted text that is enclosed by two styling directives.
Spans are always children of blocks and may not escape from their
containing block.
Some spans may contain child spans.
The following all contain spans marked by parentheses:

(plain span)

(*strong span*)

(_emphasized span_)

(_emphasized span containing (*strong span*)_)

(span one )(*span two*)

Styling directive

A character or set of characters that indicates the beginning of a span
or block.
For example, in certain contexts the characters '*' (U+002A ASTERISK),
and '_' (U+005F LOW LINE) may be styling directives that indicate the
beginning of a strong or emphasis span and the string '```' (U+0060
GRAVE ACCENT) may be a styling directive that indicates the beginning of
a preformatted code block.

Whitespace character

Any Unicode scalar value which has the property "White_Space" or is in
category Z in the Unicode Character Database.
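The definition above can be checked mechanically. The following is a minimal sketch in Python, not part of the protocol: the name is_whitespace is hypothetical, and the hard-coded table is an assumption based on the code points that carry the White_Space property in current versions of the Unicode Character Database.

```python
import unicodedata

# Code points assumed to carry the Unicode "White_Space" property
# (taken from PropList.txt in recent Unicode versions).
_WHITE_SPACE = frozenset(
    [*range(0x0009, 0x000E), 0x0020, 0x0085, 0x00A0, 0x1680,
     *range(0x2000, 0x200B), 0x2028, 0x2029, 0x202F, 0x205F, 0x3000]
)


def is_whitespace(ch: str) -> bool:
    """Return True if the single character ch is a whitespace character.

    The general-category check (Zs, Zl, Zp) is kept alongside the table so
    the function follows both halves of the definition.
    """
    return ord(ch) in _WHITE_SPACE or unicodedata.category(ch).startswith("Z")
```

Under these assumptions, U+200B ZERO WIDTH SPACE is not treated as whitespace, because it has neither the White_Space property nor a category beginning with Z.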

Individual lines of text that are not inside of a preformatted text
block are considered a "plain" block.
Plain blocks are not bound by styling directives and do not imply
formatting themselves, but they may contain spans which imply
formatting.
Plain blocks may not contain child blocks.

Example 1. Plain block text

<body>
(There are three blocks in this body marked by parens,)
(but there is no *formatting)
(as spans* may not escape blocks.)
</body>

A preformatted text block is started by a line beginning with "```"
(U+0060 GRAVE ACCENT), and ended by a line containing only three grave
accents or the end of the parent block (whichever comes first).
Preformatted text blocks cannot contain child blocks or spans.
Text inside a preformatted block SHOULD be displayed in a monospace font.

Example 2. Preformatted block text

<body>
```ignored
(println &quot;Hello, world!&quot;)
```
This should show up as monospace, preformatted text ⤴
</body>

Example 3. No closing preformatted text sequence

<body>
&gt; ```
&gt; (println &quot;Hello, world!&quot;)
The entire blockquote is a preformatted text block, but this line
is plaintext!
</body>

A quotation is indicated by one or more consecutive lines, each
beginning with a '>' (U+003E GREATER-THAN SIGN).
Block quotes may contain any child block, including other quotations.
Lines inside the block quote MUST have leading spaces trimmed before
parsing the child block.
It is RECOMMENDED that text inside of a block quote be indented or
distinguished from the surrounding text in some other way.

Example 4. Quotation (LTR)

<body>
&gt; That that is, is.
Said the old hermit of Prague.
</body>

Example 5. Nested Quotation

<body>
&gt;&gt; That that is, is.
&gt; Said the old hermit of Prague.
Who?
</body>
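For a single line, the nesting level of a quotation can be found by repeatedly removing the leading '>' and trimming leading spaces. The following is a minimal sketch in Python; the name quote_depth is hypothetical, and the sketch deliberately ignores preformatted text blocks, inside which a '>' would be literal text.

```python
from typing import Tuple


def quote_depth(line: str) -> Tuple[int, str]:
    """Return (depth, text) for one line of a message body.

    depth counts how many quotation levels enclose the line; text is what
    remains once every leading '>' and the spaces after it are removed.
    """
    depth = 0
    while line.startswith(">"):
        depth += 1
        # Drop the '>' and trim leading spaces before looking at the child.
        line = line[1:].lstrip(" ")
    return depth, line
```

Applied to the three lines of the nested quotation example (with the XML entities already decoded), this yields depths 2, 1, and 0 respectively, which a renderer could map to indentation levels.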

Matches of spans between two styling directives MUST contain some text
between the two styling directives and the opening styling directive MUST
be located at the beginning of the line, or after a whitespace character.
The opening styling directive MUST NOT be followed by a whitespace
character and the closing styling directive MUST NOT be preceded by a
whitespace character.
Spans are always parsed from the beginning of the byte stream to the end
and are lazily matched.
Characters that would be styling directives but do not follow these rules
are not considered when matching and thus may be present between two other
styling directives.

For example, each of the following would be styled as indicated:

*strong*

plain *strong* plain

*strong* plain *strong*

*strong*plain*

* plain *strong*
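The matching rules above can be sketched for a single directive as follows. This is an illustrative Python sketch, not a reference implementation: the name split_spans is hypothetical, str.isspace() is used as an approximation of the whitespace definition in this document, and child spans and runs of adjacent directives are not handled.

```python
from typing import List, Tuple


def split_spans(line: str, directive: str = "*") -> List[Tuple[str, bool]]:
    """Split one line into (text, is_styled) pieces for a single directive."""
    pieces: List[Tuple[str, bool]] = []
    plain_start = 0
    i, n = 0, len(line)
    while i < n:
        opens = (
            line[i] == directive
            and (i == 0 or line[i - 1].isspace())       # start of line or after whitespace
            and i + 1 < n and not line[i + 1].isspace()  # not followed by whitespace
        )
        if opens:
            # Lazy match: accept the first later directive that is not
            # preceded by whitespace; starting at i + 2 guarantees that
            # some text lies between the two directives.
            j = i + 2
            while j < n and (line[j] != directive or line[j - 1].isspace()):
                j += 1
            if j < n:
                if plain_start < i:
                    pieces.append((line[plain_start:i], False))
                pieces.append((line[i + 1:j], True))
                i = plain_start = j + 1
                continue
        i += 1
    if plain_start < n:
        pieces.append((line[plain_start:], False))
    return pieces
```

For instance, split_spans("*strong*plain*") returns [("strong", True), ("plain*", False)], and split_spans("* plain *strong*") returns [("* plain ", False), ("strong", True)], matching the styled examples listed above.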

Nothing would be styled in the following messages (where "\n" represents a
new line):

Text enclosed by a '`' (U+0060 GRAVE ACCENT) is a preformatted span and
SHOULD be displayed inline in a monospace font.
A preformatted span may only contain a single plain span.
Inline formatting directives inside the preformatted span are not
rendered.
For example, the following all contain valid preformatted spans:

This document does not define a regular grammar and thus styling cannot be
matched by a regular expression.
Instead, a simple parser can be constructed by first parsing all text into
blocks and then recursively parsing the child-blocks inside block
quotations, the spans inside individual lines, and by returning the text
inside preformatted blocks without modification.
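The block-level pass of such a parser might look like the following Python sketch. The function name parse_blocks and the tuple shapes ("quote", children), ("pre", lines), and ("plain", line) are hypothetical choices made for illustration. The sketch also assumes that any text following the opening "```" on the same line is discarded, and that span parsing is applied afterwards to each plain node.

```python
def parse_blocks(lines):
    """Parse a list of lines into a tree of quote, pre, and plain nodes."""
    blocks = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">"):
            # Collect the whole quotation, strip one level of '>' plus
            # leading spaces, then parse the children recursively.
            inner = []
            while i < len(lines) and lines[i].startswith(">"):
                inner.append(lines[i][1:].lstrip(" "))
                i += 1
            blocks.append(("quote", parse_blocks(inner)))
        elif line.startswith("```"):
            # Preformatted text runs until a line that is exactly "```"
            # or until the parent block ends; it is returned unmodified.
            i += 1
            content = []
            while i < len(lines) and lines[i] != "```":
                content.append(lines[i])
                i += 1
            i += 1  # step over the closing line, if there was one
            blocks.append(("pre", content))
        else:
            blocks.append(("plain", line))
            i += 1
    return blocks
```

Because the quotation is parsed recursively on its own list of lines, a preformatted block opened inside a quotation ends where the quotation ends, even without a closing "```".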

It is RECOMMENDED that formatting characters be displayed and formatted in
the same manner as the text they apply to.
For example, in the string "*emphasis*" the asterisks would remain visible
and be rendered in the same strong style as the word "emphasis".

When displaying text with formatting, developers should take care to ensure
sufficient contrast exists between styled and unstyled text so that users
with vision deficiencies are able to distinguish between the two.

Formatted text may also be rendered poorly by screen readers.
When applying formatting it may be desirable to include directives to
exclude formatting characters from being read.

Appendix C: Legal Notices

Copyright

Permissions

Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.

Disclaimer of Warranty

## NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. ##

Limitation of Liability

In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.

IPR Conformance

This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which can be found at <https://xmpp.org/about/xsf/ipr-policy> or obtained by writing to XMPP Standards Foundation, P.O. Box 787, Parker, CO 80134 USA).

Appendix D: Relation to XMPP

The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 6120) and XMPP IM (RFC 6121) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.

Appendix E: Discussion Venue

The primary venue for discussion of XMPP Extension Protocols is the <standards@xmpp.org> discussion list.

Appendix F: Requirements Conformance

The following requirements keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".
