RTP MIDI: An RTP Payload Format for MIDI

RTP MIDI

Internet telephony and video-conferencing programs send audio and video
over the net using the Real-time Transport Protocol (RTP). RTP is an
Internet Engineering Task Force (IETF) standard, whose payload formats
are developed in the Audio-Video Transport payload working group
(payload).

We have worked within AVT-payload to standardize RTP MIDI, a payload
format to send MIDI over networks using RTP. MIDI is a standard for
coding the gestures of musical performance (pressing piano keys,
striking drum pads, moving faders, etc.).
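As a rough illustration of what "coding a gesture" means, the sketch below builds the three-byte MIDI 1.0 channel messages for pressing and releasing a piano key. The helper names (note_on, note_off) are hypothetical; they are not part of RTP MIDI or of any particular library.

```python
def _check(channel, note, velocity):
    # MIDI 1.0 channel messages use a 4-bit channel and 7-bit data bytes.
    if not (0 <= channel <= 15):
        raise ValueError("channel must be in 0..15")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("note and velocity must be in 0..127")


def note_on(channel, note, velocity):
    """Return a Note On message: status byte 0x9n, then note, then velocity."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])


def note_off(channel, note, velocity=64):
    """Return a Note Off message: status byte 0x8n, then note, then velocity."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])


# Pressing and then releasing middle C (note number 60) on the first channel.
press = note_on(0, 60, 100)
release = note_off(0, 60)
print(press.hex(), release.hex())  # prints "903c64 803c40"
```

An RTP MIDI packet carries a list of such commands, with timing information, as its payload.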

RTP MIDI is able to send MIDI over a "lossy" network (a network that
loses packets). To prevent "stuck notes" and other artifacts, RTP
MIDI uses a feed-forward resiliency system (the recovery
journal) to recover from packet loss.
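The idea behind feed-forward recovery can be sketched with a toy model. Here every packet carries, next to its new commands, a small summary of the notes that were held down just before those commands. A receiver that notices a gap in sequence numbers compares the summary with its own state and silences any note it missed the release for. This is only an assumption-laden illustration: the real recovery journal in RFC 6295 is a compact binary structure, organized per channel, that covers many MIDI command types and is coded relative to a checkpoint packet. The names ToySender and ToyReceiver are hypothetical.

```python
class ToySender:
    """Toy sender: attaches a note-state summary (the 'journal') to each packet."""

    def __init__(self):
        self.seq = 0
        self.held = set()  # notes currently held down

    def packet(self, commands):
        # The journal describes the state *before* this packet's commands.
        journal = frozenset(self.held)
        for kind, note in commands:
            if kind == "on":
                self.held.add(note)
            else:
                self.held.discard(note)
        self.seq += 1
        return {"seq": self.seq, "commands": list(commands), "journal": journal}


class ToyReceiver:
    """Toy receiver: repairs stuck notes when it detects a lost packet."""

    def __init__(self):
        self.last_seq = 0
        self.sounding = set()  # notes the receiver believes are sounding

    def receive(self, pkt):
        executed = []
        if pkt["seq"] > self.last_seq + 1:
            # Packet loss: release notes the sender no longer holds.
            for note in sorted(self.sounding - pkt["journal"]):
                executed.append(("off", note))
        executed.extend(pkt["commands"])
        for kind, note in executed:
            if kind == "on":
                self.sounding.add(note)
            else:
                self.sounding.discard(note)
        self.last_seq = pkt["seq"]
        return executed


sender, receiver = ToySender(), ToyReceiver()
p1 = sender.packet([("on", 60)])
p2 = sender.packet([("off", 60)])  # this packet is lost in transit
p3 = sender.packet([("on", 64)])

receiver.receive(p1)
executed = receiver.receive(p3)
print(executed)  # prints [('off', 60), ('on', 64)]
```

No retransmission is requested in this model: the next packet that arrives contains everything needed to repair the stuck note, which is the sense in which the recovery is feed-forward.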

We anticipate three major application areas for RTP MIDI:

MIDI over wired and wireless LANs.
RTP MIDI may be used to send real-time MIDI streams over wired and
wireless Local Area Networks (LANs). Apple uses RTP MIDI as the
transport layer for the MIDI Network Driver, included in OS X and in
iOS (the operating system for the iPhone, the iPad, and the iPod Touch).

Network Musical Performance.
VoIP and videoconferencing applications may add support for network
musical performance via RTP MIDI. In a network performance,
musicians located in different physical locations interact over a
network to perform as they would if located in the same room.

Content Streaming.
Content streams may begin to use MIDI for low-bitrate music coding,
perhaps in conjunction with normative sound synthesis methods such as
Structured
Audio. Applications include Internet broadcasting, multimedia
presentations, and telephony audicons and ring tones.

To Learn More

Implementors should refer to
RFC 6295 and
RFC 4696
for the final version of RTP MIDI.

RFC 6295 was
approved in 2011, and fixes many document errors in the first RTP
MIDI RFC (RFC
4695). See Section 12 of RFC 6295 for a complete change log.
Errors in RFC 4696
are documented on its
errata page.

This paper (Lazzaro and Wawrzynek, 2004), presented at the
117th AES convention, is a good introduction to how RTP MIDI works,
and how it fits into the IETF media protocol stack. The AES paper
discusses a protocol that is a snapshot of RTP MIDI as it existed in
October 2004.

In network musical performance applications, one cause of concern is
the latency between performers. This paper (Lazzaro and Wawrzynek,
2001), presented at the NOSSDAV 2001 conference, discusses
latency (and other issues) in network musical performances, in the
context of an application that uses a proto-version of RTP MIDI as the
network transport.

Tobias Erichsen
has created a MIDI Network Driver for Windows that can interoperate
with Apple's RTP MIDI implementation. His driver is free for private,
non-commercial use, and is available for download.

nmj is a Java
library that lets developers write Android apps that interoperate with
Apple's RTP MIDI implementation.
TouchDAW is an
example of an Android app that is based on nmj technology.

Jim Young has written an RTP MIDI stack for Windows 8 using the
WinRT API (video).

Wireshark now includes an
RTP MIDI dissector, written
by Tobias Erichsen, that
interoperates with Apple's RTP MIDI implementation.

MidiShare, a realtime
operating system for musical applications, includes an RTP MIDI library
in its development branch.

The (unofficial) reference implementation for RTP MIDI is the network
stack in sfront, an MPEG-4 Structured Audio
decoder.

Networking is no longer enabled in the sfront
distribution, because we no longer host the required network services.
However, the networking source code still ships in the
distribution. Developers wishing to examine the network code can download
sfront here, and
follow these instructions
for locating the network source code. Alternatively, we offer
a smaller distribution that contains only the
network source code (click here
to download). Note that the network code (and sfront
itself) is BSD-licensed.

John Lazzaro and John Wawrzynek (2004). An RTP Payload Format for MIDI.
The 117th Convention of the Audio Engineering Society,
October 28-31, 2004, San Francisco, CA.
[PDF].

John Lazzaro and John Wawrzynek (2001). A Case for Network Musical
Performance. The 11th International Workshop on Network and Operating
Systems Support for Digital Audio and Video (NOSSDAV 2001),
June 25-26, 2001, Port Jefferson, New York.
[PDF].