Network Working Group A. Bryan
Internet-Draft N. McNab
Intended status: Standards Track H. Nordstrom
Expires: July 24, 2011 T. Tsujikawa
P. Poeml
MirrorBrain
A. Ford
Roke Manor Research
January 20, 2011
Metalink/HTTP: Mirrors and Cryptographic Hashes in HTTP Header Fields
draft-bryan-metalinkhttp-19
Abstract
This document specifies Metalink/HTTP: Mirrors and Cryptographic
Hashes in HTTP header fields, a different way to get information that
is usually contained in the Metalink XML-based download description
format. Metalink/HTTP describes multiple download locations
(mirrors), Peer-to-Peer, cryptographic hashes, digital signatures,
and other information using existing standards for HTTP header
fields. Clients can use this information to make file transfers more
robust and reliable.
Editorial Note (To be removed by RFC Editor)
Discussion of this draft should take place on the HTTPBIS working
group mailing list (ietf-http-wg@w3.org), although this draft is not
a WG item.
The changes in this draft are summarized in Appendix C.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
Bryan, et al. Expires July 24, 2011 [Page 1]

Internet-Draft Metalink/HTTP: Mirrors and Hashes January 2011
This Internet-Draft will expire on July 24, 2011.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
1. Introduction
Metalink/HTTP is an alternative representation of Metalink
information, which is usually presented as an XML-based document
format [RFC5854]. Metalink/HTTP attempts to provide as much
functionality as the Metalink/XML format by using existing standards
such as Web Linking [RFC5988], Instance Digests in HTTP [RFC3230],
and Entity Tags (also known as ETags) [RFC2616]. Metalink/HTTP is
used to list information about a file to be downloaded. This can
include lists of multiple URIs (mirrors), Peer-to-Peer information,
cryptographic hashes, and digital signatures.
Identical copies of a file are frequently accessible in multiple
locations on the Internet over a variety of protocols (such as FTP,
HTTP, and Peer-to-Peer). In some cases, users are shown a list of
these multiple download locations (mirrors) and must manually select
a single one on the basis of geographical location, priority, or
bandwidth. This distributes the load across multiple servers, and
should also increase throughput and resilience. At times, however,
individual servers can be slow, outdated, or unreachable, but this
cannot be determined until the download has been initiated. Users
will rarely have sufficient information to choose the most
appropriate server, and will often choose the first in a list, which
might not be optimal for their needs and leads to a particular
server getting a disproportionate share of load. The use of
suboptimal mirrors can lead to the user canceling and restarting the
download to try to manually find a better source. During downloads,
errors in transmission can corrupt the file. There are no easy ways
to repair these files. For large downloads this can be extremely
troublesome. Any of the problems that can occur during a
download leads to frustration on the part of users.
Some popular sites automate the process of selecting mirrors using
DNS load balancing, both to approximately balance load between
servers, and to direct clients to nearby servers with the hope that
this improves throughput. Indeed, DNS load balancing can balance
long-term server load fairly effectively, but it is less effective at
delivering the best throughput to users when the bottleneck is not
the server but the network.
This document describes a mechanism by which the benefit of mirrors
can be automatically and more effectively realized. All the
information about a download, including mirrors, cryptographic
hashes, digital signatures, and more can be transferred in
coordinated HTTP header fields hereafter referred to as a Metalink.
This Metalink transfers the knowledge of the download server (and
mirror database) to the client. Clients can fall back to other
mirrors if the current one has an issue. With this knowledge, the
client is enabled to work its way to a successful download even under
adverse circumstances. All this can be done without complicated user
interaction and the download can be much more reliable and efficient.
In contrast, a traditional HTTP redirect to a mirror conveys only
extremely minimal information - one link to one server, and there is
no provision in the HTTP protocol to handle failures. Furthermore,
in order to provide better load distribution across servers and
potentially faster downloads to users, Metalink/HTTP facilitates
multi-source downloads, where portions of a file are downloaded from
multiple mirrors (and optionally, Peer-to-Peer) simultaneously.
1.1. Operation Overview
Detailed discussion of Metalink operation is covered in Section 2;
this section will present a very brief, high-level overview of how
Metalink achieves its goals.
Upon connection to a Metalink/HTTP server, a client will receive
information about other sources of the same resource and a
cryptographic hash of the whole resource. The client will then be
able to request chunks of the file from the various sources,
scheduling appropriately in order to maximise the download rate.
1.2. Examples
A brief Metalink server response with ETag, mirrors, .metalink,
OpenPGP signature, and a cryptographic hash of the whole file:
Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
Link: <http://www2.example.com/example.ext>; rel=duplicate
Link: <ftp://ftp.example.com/example.ext>; rel=duplicate
Link: <http://example.com/example.ext.torrent>; rel=describedby;
type="application/x-bittorrent"
Link: <http://example.com/example.ext.metalink>; rel=describedby;
type="application/metalink4+xml"
Link: <http://example.com/example.ext.asc>; rel=describedby;
type="application/pgp-signature"
Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==
1.3. Notational Conventions
This specification describes conformance of Metalink/HTTP.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, [RFC2119], as
scoped to those conformance targets.
2. Requirements
In this context, "Metalink" refers to Metalink/HTTP which consists of
mirrors and cryptographic hashes in HTTP header fields as described
in this document. "Metalink/XML" refers to the XML format described
in [RFC5854].
Metalink resources include Link header fields [RFC5988] to present a
list of mirrors in the response to a client request for the resource.
Metalink servers MUST include the cryptographic hash of a resource
via Instance Digests in HTTP [RFC3230]. Valid algorithms are found
in the IANA registry named "Hypertext Transfer Protocol (HTTP) Digest
Algorithm Values" at
<http://www.iana.org/assignments/http-dig-alg/http-dig-alg.xhtml>.
SHA-256 and SHA-512 were added by [RFC5843].
Metalink servers are HTTP servers with one or more Metalink
resources. Metalink servers MUST support the Link header fields for
listing mirrors and MUST support Instance Digests in HTTP [RFC3230].
Metalink servers MUST return the same Link header fields and Instance
Digests on HEAD requests. Metalink servers and their associated
mirror servers SHOULD all share the same ETag policy. To have the
same ETag policy means that ETags are synchronized across servers for
resources that are mirrored, i.e. byte-for-byte identical files will
have the same ETag on mirrors that they have on the Metalink server.
ETags could be based on the file contents (cryptographic hash) and
not server-unique filesystem metadata. The emitted ETag could be
implemented the same as the Instance Digest for simplicity. Metalink
servers can offer Metalink/XML documents that contain cryptographic
hashes of parts of the file and other information.
Mirror servers are typically FTP or HTTP servers that "mirror"
another server. That is, they provide identical copies of (at least
some) files that are also on the mirrored server. Mirror servers can
also be Metalink servers. Mirror servers SHOULD support serving
partial content. HTTP mirror servers SHOULD share the same ETag
policy as the originating Metalink server. HTTP Mirror servers
SHOULD support Instance Digests in HTTP [RFC3230].
Metalink clients use the mirrors provided by a Metalink server with
Link header fields [RFC5988]. Metalink clients MUST support HTTP and
SHOULD support FTP [RFC0959]. Metalink clients MAY support
BitTorrent [BITTORRENT], or other download methods. Metalink clients
SHOULD switch downloads from one mirror to another if a mirror
becomes unreachable. Metalink clients MAY support multi-source, or
parallel, downloads, where portions of a file can be downloaded from
multiple mirrors simultaneously (and optionally, from Peer-to-Peer
sources). Metalink clients MUST support Instance Digests in HTTP
[RFC3230] by requesting and verifying cryptographic hashes. Metalink
clients MAY make use of digital signatures if they are offered.
3. Mirrors / Multiple Download Locations
Mirrors are specified with the Link header fields [RFC5988] and a
relation type of "duplicate" as defined in Section 9.
A brief Metalink server response with two mirrors only:
Link: <http://www2.example.com/example.ext>; rel=duplicate;
pri=1; pref
Link: <ftp://ftp.example.com/example.ext>; rel=duplicate;
pri=2; geo=gb; depth=1
[[Some organizations have many mirrors. Only send a few mirrors, or
only use the Link header fields if Want-Digest is used?]]
It is up to the server to choose how many Link header fields to
send. Such a decision could be a hard-coded limit, a random
selection, based on file size, or based on server load.
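As an illustration of the mirror syntax above, the following Python
sketch extracts mirrors from Link header field values. It is not part
of this specification. The function names and the returned dictionary
layout are hypothetical, each value is assumed to carry a single link,
and the naive split on ";" does not handle quoted strings that contain
semicolons.

```python
import re

def parse_link_value(value):
    """Parse one Link header field value of the form '<uri>; param; param=value'."""
    match = re.match(r'\s*<([^>]*)>(.*)', value, re.S)
    if match is None:
        raise ValueError("not a Link header field value: %r" % value)
    uri, rest = match.group(1), match.group(2)
    params = {}
    for part in rest.split(';'):
        part = part.strip()
        if not part:
            continue
        name, sep, val = part.partition('=')
        # Valueless parameters such as "pref" are stored as True.
        params[name.strip().lower()] = val.strip().strip('"') if sep else True
    return uri, params

def mirrors_from_links(values):
    """Keep only links whose relation type includes "duplicate"."""
    mirrors = []
    for value in values:
        uri, params = parse_link_value(value)
        if 'duplicate' not in str(params.get('rel', '')).split():
            continue
        mirrors.append({
            'uri': uri,
            'pri': int(params['pri']) if 'pri' in params else None,
            'geo': params.get('geo'),
            'pref': params.get('pref') is True,
            'depth': int(params.get('depth', 0)),  # "depth=0" is the default
        })
    return mirrors
```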
3.1. Mirror Priority
Entries for mirror servers are listed in order of priority (from most
preferred to least) or have a "pri" value, where mirrors with lower
values are used first.
This is purely an expression of the server's preferences; it is up to
the client what it does with this information, particularly with
reference to how many servers to use at any one time.
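One possible client-side ordering is sketched below in Python. It
assumes each mirror is a dictionary with an optional "pri" key (a
hypothetical layout), and it assumes that mirrors without a "pri"
value sort after those with one, in the order received. This document
does not mandate that tie-breaking rule.

```python
def order_mirrors(mirrors):
    """Sort by ascending "pri"; mirrors lacking "pri" keep header order at the end."""
    def key(item):
        index, mirror = item
        pri = mirror.get('pri')
        return (pri if pri is not None else float('inf'), index)
    return [mirror for _, mirror in sorted(enumerate(mirrors), key=key)]
```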
3.2. Mirror Geographical Location
Entries for mirror servers can have a "geo" value, which is a
[ISO3166-1] alpha-2 two letter country code for the geographical
location of the physical server the URI is used to access. A client
can use this information to select a mirror, or set of mirrors, that
are geographically near (if the client has access to such
information), with the aim of reducing network load at inter-country
bottlenecks.
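A client that knows its own country code could, for example, move
matching mirrors to the front of its candidate list. The Python sketch
below assumes a hypothetical dictionary layout with an optional "geo"
key and compares codes case-insensitively. How the client learns its
own location is out of scope here.

```python
def prefer_nearby(mirrors, client_country):
    """Place mirrors whose "geo" matches the client's country code first."""
    wanted = client_country.lower()
    near = [m for m in mirrors if (m.get('geo') or '').lower() == wanted]
    far = [m for m in mirrors if (m.get('geo') or '').lower() != wanted]
    # Relative order inside each group is preserved.
    return near + far
```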
3.3. Coordinated Mirror Policies
There are two types of mirror servers: preferred and normal.
Preferred mirror servers are HTTP mirror servers that MUST share the
same ETag policy as the originating Metalink server. Preferred
mirrors make it possible to detect early on, before data is
transferred, if the file requested matches the desired file. Entries
for preferred HTTP mirror servers have a "pref" value. By default,
if unspecified then mirrors are considered "normal" and do not
necessarily share the same ETag policy. FTP mirrors, as they do not
emit ETags, are considered "normal". ([draft-ietf-ftpext2-hash]
allows for FTP mirrors to be coordinated and provide file hashes).
HTTP Mirror servers SHOULD support Instance Digests in HTTP
[RFC3230]. Optimally, mirror servers will share the same ETag policy
and support Instance Digests in HTTP.
3.4. Mirror Depth
Some mirrors can mirror single files, whole directories, or multiple
directories.
Entries for mirror servers can have a "depth" value, where "depth=0"
is the default. A value of 0 means ONLY that file is mirrored and
that other URI path segments are not. A value of 1 means that file
and all other files and URI path segments contained in the rightmost
URI path segment are mirrored. For a value of N, go up N-1 URI
path segments. A value of 2 means going up one URI path
segment above, and all files and URI path segments contained are
mirrored. For each higher value, another URI path segment closer to
the Host is mirrored.
A mirror with a depth value of 4:
Link: <http://www2.example.com/dir1/dir2/dir3/dir4/dir5/example.ext>;
rel=duplicate; pri=1; pref; depth=4
In the above example, 4 URI path segments up are mirrored, from
/dir2/ on down.
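The depth rule can be sketched in Python as follows. The function name
is hypothetical, and the behavior when the depth exceeds the number of
available path segments (clamping to "/") is an assumption that this
document does not address.

```python
from urllib.parse import urlsplit

def mirrored_prefix(uri, depth):
    """Return the URI path under which content is mirrored for a depth value."""
    path = urlsplit(uri).path
    if depth <= 0:
        return path  # depth=0: only the file itself is mirrored
    # Directory segments, excluding the final (file) segment.
    dirs = [segment for segment in path.split('/')[:-1] if segment]
    # depth=1 keeps every directory; each higher value drops one more.
    keep = max(len(dirs) - (depth - 1), 0)
    kept = dirs[:keep]
    return '/' + '/'.join(kept) + '/' if kept else '/'
```

With the example URI above and depth=4, this yields "/dir1/dir2/".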
4. Peer-to-Peer / Metainfo
Entries for metainfo files, which describe ways to download a file
over Peer-to-Peer networks or otherwise, are specified with the Link
header fields [RFC5988] and a relation type of "describedby" and a
type parameter that indicates the MIME type of the metadata available
at the URI. Since metainfo files can sometimes describe multiple
files, or the filename may not be the same on the Metalink server and
in the metainfo file but still have the same content, an optional
name parameter can be used.
A brief Metalink server response with .torrent and .metalink:
Link: <http://example.com/example.ext.torrent>; rel=describedby;
type="application/x-bittorrent"; name="differentname.ext"
Link: <http://example.com/example.ext.metalink>; rel=describedby;
type="application/metalink4+xml"
Metalink clients MAY support the use of metainfo files for
downloading files.
4.1. Metalink/XML Files
Full Metalink/XML files for a given resource can be specified as
shown in Section 4. This is particularly useful for providing
metadata such as cryptographic hashes of parts of a file, allowing a
client to recover from partial errors (see Section 7.1.2).
5. OpenPGP Signatures
OpenPGP signatures [RFC3156] are specified with the Link header
fields [RFC5988] and a relation type of "describedby" and a type
parameter of "application/pgp-signature".
A brief Metalink server response with OpenPGP signature only:
Link: <http://example.com/example.ext.asc>; rel=describedby;
type="application/pgp-signature"
Metalink clients MAY support the use of OpenPGP signatures.
6. Cryptographic Hashes of Whole Files
Metalink servers MUST provide Instance Digests in HTTP [RFC3230] for
files they describe with mirrors via Link header fields. Mirror
servers SHOULD as well. If Instance Digests are not provided by the
Metalink servers, the Link header fields MUST be ignored.
A brief Metalink server response with cryptographic hash:
Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==
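A minimal Python sketch of checking downloaded data against a Digest
header field value follows. It assumes the value is the base64
encoding of the raw SHA-256 output and that header line folding is
removed by stripping whitespace. The function names are hypothetical.

```python
import base64
import hashlib

def parse_digest(value):
    """Split a Digest header field value into {ALGORITHM: encoded-value}."""
    digests = {}
    for item in value.split(','):
        # Split on the first "=" only, so base64 padding is preserved.
        name, sep, encoded = item.strip().partition('=')
        if sep:
            digests[name.strip().upper()] = ''.join(encoded.split())
    return digests

def verify_sha256(data, digest_value):
    """Return True if the SHA-256 entry of digest_value matches data."""
    expected = parse_digest(digest_value).get('SHA-256')
    if expected is None:
        return False
    actual = base64.b64encode(hashlib.sha256(data).digest()).decode('ascii')
    return actual == expected
```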
7. Client / Server Multi-source Download Interaction
Metalink clients begin a download with a standard HTTP [RFC2616] GET
request to the Metalink server. A Range limit is optional, not
required. Alternatively, Metalink clients can begin with a HEAD
request to the Metalink server to discover mirrors via Link header
fields. After that, the client follows with a GET request to the
desired mirrors.
GET /distribution/example.ext HTTP/1.1
Host: www.example.com
The Metalink server responds with the data and these header fields:
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 14867603
Content-Type: application/x-cd-image
Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
Link: <http://www2.example.com/example.ext>; rel=duplicate; pref
Link: <ftp://ftp.example.com/example.ext>; rel=duplicate
Link: <http://example.com/example.ext.torrent>; rel=describedby;
type="application/x-bittorrent"
Link: <http://example.com/example.ext.metalink>; rel=describedby;
type="application/metalink4+xml"
Link: <http://example.com/example.ext.asc>; rel=describedby;
type="application/pgp-signature"
Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==
From the Metalink server response the client learns some or all of
the following metadata about the requested object, in addition to
also starting to receive the object:
o Object size.
o ETag.
o Mirror profile link, which can describe the mirror's priority,
whether it shares the ETag policy of the originating Metalink
server, geographical location, and mirror depth.
o Peer-to-peer information.
o Metalink/XML, which can include partial file cryptographic hashes
to repair a file.
o Digital signature.
o Instance Digest, which is the whole file cryptographic hash.
(Alternatively, the client could have requested a HEAD only, and then
skipped to making the following decisions on every available mirror
server found via the Link header fields.)
If the object is large and gets delivered slower than expected then
the Metalink client starts a number of parallel ranged downloads (one
per selected mirror server other than the first) using mirrors
provided by the Link header fields with "duplicate" relation type,
using the location of the original GET request in the "Referer"
header field. The size and number of ranges requested from each
server is for the client to decide, based upon the performance
observed from each server. Further discussion of performance
considerations is presented in Section 8.
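As one simple illustration, a client could divide the remaining bytes
evenly between the servers it has selected. The Python sketch below
does only that. An even split is an assumption made for brevity, and a
real client would likely size ranges from observed performance, as
described above.

```python
def split_ranges(start, size, parts):
    """Split bytes start..size-1 into up to `parts` inclusive (first, last) ranges."""
    if parts < 1 or start >= size:
        return []
    total = size - start
    parts = min(parts, total)  # never create empty ranges
    base, extra = divmod(total, parts)
    ranges = []
    first = start
    for i in range(parts):
        # The first `extra` ranges take one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((first, first + length - 1))
        first += length
    return ranges
```

For an object of 14867603 bytes and two servers, the second range
starts at byte 7433802.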
If no range limit was given in the original request then work from
the tail of the object (the first request is still running and will
eventually catch up), otherwise continue after the range requested in
the first request. If no Range was provided, the original connection
must be terminated once all parts of the resource have been
retrieved. It is recommended that a HEAD request is undertaken
first, so that the client can find out if there are any Link header
fields, and then Range-based requests are undertaken to the mirror
servers as well as on the original connection.
Preferred mirrors have coordinated ETags, as described in
Section 3.3, and If-Match conditions based on the ETag SHOULD be used
to quickly detect out-of-date mirrors by using the ETag from the
Metalink server response. If no indication of ETag synchronization
or knowledge is given, then If-Match should not be used. Optimally,
there will be an Instance Digest in the mirror response that the
client can use to detect a mismatch early; if not, a mismatch won't
be detected until the completed object is verified. Early file mismatch
detection is described in detail in Section 7.1.1.
One of the client requests to a mirror server:
GET /example.ext HTTP/1.1
Host: www2.example.com
Range: bytes=7433802-
If-Match: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
Referer: http://www.example.com/distribution/example.ext
The mirror servers respond with a 206 Partial Content HTTP status
code and appropriate "Content-Length" and "Content-Range" header
fields. The mirror server response, with data, to the above request:
HTTP/1.1 206 Partial Content
Accept-Ranges: bytes
Content-Length: 7433801
Content-Range: bytes 7433802-14867602/14867603
Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
DYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==
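The request and response above can be sketched in Python as follows.
Both function names are hypothetical. Header fields are assumed to be
held in a plain dictionary with the exact capitalization shown, and a
Content-Range with an unknown total length ("*") is treated as
unusable.

```python
import re

def mirror_request_headers(first, last, etag, referer, preferred):
    """Build header fields for a ranged GET to a mirror (last=None: open-ended)."""
    headers = {
        'Range': 'bytes=%d-%s' % (first, '' if last is None else last),
        'Referer': referer,
    }
    # If-Match is only meaningful for mirrors sharing the ETag policy.
    if preferred and etag:
        headers['If-Match'] = etag
    return headers

def check_partial_response(headers, first, total_size):
    """Check that a 206 response covers the expected range of the expected object."""
    content_range = headers.get('Content-Range', '').strip()
    match = re.fullmatch(r'bytes (\d+)-(\d+)/(\d+)', content_range)
    if match is None:
        return False
    start, end, total = (int(group) for group in match.groups())
    length = int(headers.get('Content-Length', -1))
    return start == first and total == total_size and length == end - start + 1
```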
If the first request was not Range limited then abort it by closing
the connection when it catches up with the other parallel downloads
of the same object.
Downloads from mirrors that do not have the same file size as the
Metalink server are considered unusable and the client can deal with
it as it sees fit.
If a Metalink client does not support certain download methods (such
as FTP or BitTorrent) that a file is available from, and there are no
available download methods that the client supports, then the
download will have no way to complete.
Once the download has completed, the Metalink client MUST verify the
cryptographic hash of the file. If the cryptographic hash offered by
the Metalink server with Instance Digests does not match the
cryptographic hash of the downloaded file, see Section 7.1.2 for a
possible way to repair errors.
If the download can not be repaired, it is considered corrupt. The
client can attempt to re-download the file.
7.1. Error Prevention, Detection, and Correction
Error prevention, or early file mismatch detection, is possible
before file transfers with the use of file sizes, ETags, and
cryptographic hashes. Error detection requires Instance Digests, or
cryptographic hashes, to determine after transfers if there has been
an error. Error correction, or download repair, is possible with
partial file cryptographic hashes.
Note that cryptographic hashes obtained from Instance Digests are in
base64 encoding, while those from Metalink/XML and FTP HASH are in
hexadecimal.
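Comparing hashes from these sources therefore requires a conversion.
A minimal Python sketch, with hypothetical function names, assuming
the base64 form encodes the raw digest bytes:

```python
import base64
import binascii

def digest_base64_to_hex(encoded):
    """Convert a base64-encoded digest to lowercase hexadecimal."""
    return binascii.hexlify(base64.b64decode(encoded)).decode('ascii')

def digest_hex_to_base64(hex_digest):
    """Convert a hexadecimal digest to base64."""
    return base64.b64encode(binascii.unhexlify(hex_digest)).decode('ascii')
```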
7.1.1. Error Prevention (Early File Mismatch Detection)
In HTTP terms, the requirement is that merging of ranges from
multiple responses must be verified with a strong validator, which in
this context is the same as either Instance Digest or a strong ETag.
In most cases it is sufficient that the Metalink server provides
mirrors and Instance Digest information, but operation will be more
robust and efficient if the mirror servers do implement a
synchronized ETag as well. In fact, the emitted ETag can be
implemented the same as the Instance Digest for simplicity, but there
is no need to specify how the ETag is generated, just that it needs
to be shared among the mirror servers. If the mirror server provides
neither a synchronized ETag nor an Instance Digest, then early detection of
mismatches is not possible unless file length also differs. Finally,
the error is still detectable, after the download has completed, when
the merged response is verified.
ETags cannot be used for verifying the integrity of the received
content. An ETag is, however, a guarantee issued by the Metalink
server that the content is correct for that ETag. If the ETag given by
the mirror server matches the ETag given by the master server, then we
have a chain of trust where the master server authorizes these
responses as valid for that object.
This guarantees that a mismatch will be detected by using only the
synchronized ETag from a master server and mirror server, even
alerted by the mirror servers themselves by responding with an error,
preventing accidental merges of ranges from different versions of
files with the same name. This even includes many malicious attacks
where the data on the mirror has been replaced by some other file,
but not all.
Synchronized ETag cannot strictly protect against malicious attacks
or server or network errors replacing content, but neither can
Instance Digest on the mirror servers as the attacker most certainly
can make the server seemingly respond with the expected Instance
Digest even if the file contents have been modified, just as he can
with ETag, and the same for various system failures also causing bad
data to be returned. The Metalink client has to rely on the Instance
Digest returned by the Metalink master server in the first response
for the verification of the downloaded object as a whole.
If the mirror servers do return an Instance Digest, then that is a
bonus, just as having them return the right set of Link header
fields is. The set of trusted mirrors doing that can be substituted
as master servers accepting the initial request if one likes.
The benefit of having slave mirror servers (those not trusted as
masters) return Instance Digest is that the client then can detect
mismatches early even if ETag is not used. Both ETag and slave
mirror Instance Digest do provide value, but just one is sufficient
for early detection of mismatches. If none is provided then early
detection of mismatches is not possible unless the file length also
differs, but the error is still detected when the merged response is
verified.
If FTP servers support the FTP HASH command [draft-ietf-ftpext2-hash]
and the same hash algorithm as the originating Metalink server, then
that information can be used for early file mismatch detection.
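The early detection logic of this section could be sketched in Python
as shown below. The dictionary keys ("length", "etag", "digest",
"pref") are hypothetical. ETags are compared only for mirrors marked
as preferred, and the digests are assumed to use the same algorithm
and encoding.

```python
def early_mismatch(master, mirror):
    """Return True on a detected mismatch, False if a validator agrees,
    or None if early detection is not possible."""
    lengths = (master.get('length'), mirror.get('length'))
    if None not in lengths and lengths[0] != lengths[1]:
        return True
    verdict = None
    # Only mirrors sharing the ETag policy have comparable ETags.
    keys = (['etag'] if mirror.get('pref') else []) + ['digest']
    for key in keys:
        ours, theirs = master.get(key), mirror.get(key)
        if ours and theirs:
            if ours != theirs:
                return True
            verdict = False
    return verdict
```

A result of None means the error, if any, is only found when the
merged response is verified against the whole file hash.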
7.1.2. Error Correction
Partial file cryptographic hashes can be used to detect errors during
the download. Metalink servers are not required to offer partial
file cryptographic hashes in Metalink/XML as specified in
Section 4.1, but they are encouraged to do so.
If the object cryptographic hash does not match the Instance Digest
then fetch the Metalink/XML if available, where partial file
cryptographic hashes can be found, allowing detection of which server
returned incorrect data. If the Instance Digest computation does not
match then the client needs to fetch the partial file cryptographic
hashes, if available, and from there figure out what of the
downloaded data can be recovered and what needs to be fetched again.
If no partial cryptographic hashes are available, then the client
MUST fetch the complete object from other mirrors.
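Once the partial file hashes have been obtained, locating the damaged
parts is straightforward. The Python sketch below assumes the piece
length and the hexadecimal piece hashes have already been extracted
from the Metalink/XML document (parsing is not shown), and that the
whole object is held in memory. The function name is hypothetical.

```python
import hashlib

def corrupt_pieces(data, piece_length, piece_hashes_hex, algorithm='sha256'):
    """Return the indexes of pieces whose hash does not match the expected one."""
    bad = []
    for index, expected in enumerate(piece_hashes_hex):
        first = index * piece_length
        piece = data[first:first + piece_length]
        if hashlib.new(algorithm, piece).hexdigest() != expected.lower():
            bad.append(index)
    return bad
```

Piece number i covers bytes i * piece_length through
(i + 1) * piece_length - 1, truncated at the end of the object, so
only those ranges need to be fetched again.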
8. Multi-server Performance
When opting to download simultaneously from multiple mirrors, there
are a number of factors (both within and outside the influence of the
client software) that are relevant to the performance achieved:
o The number of servers used simultaneously.
o The ability to pipeline sufficient or sufficiently large range
requests to each server so as to avoid connections going idle.
o The ability to pipeline sufficiently few or sufficiently small
range requests to servers so that all the servers finish their
final chunks simultaneously.
o The ability to switch between mirrors dynamically so as to use the
fastest mirrors at any moment in time
Obviously we do not want to use too many simultaneous connections, or
other traffic sharing a bottleneck link will be starved. But at the
same time, good performance requires that the client can
simultaneously download from at least one fast mirror while exploring
whether any other mirror is faster. Based on laboratory experiments,
we suggest a good default number of simultaneous connections is
probably four, with three of these being used for the best three
mirrors found so far, and one being used to evaluate whether any
other mirror might offer better performance.
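The suggested default can be sketched in Python as follows. The
function name and inputs are hypothetical: "throughput" maps each
mirror already tried to its measured rate, and "untested" lists
mirrors not yet tried. Using the spare slot for a fourth known mirror
when nothing is left to explore is an assumption.

```python
def choose_connections(throughput, untested, max_connections=4):
    """Pick the fastest known mirrors plus at most one mirror to evaluate."""
    explore = untested[:1]
    slots = max_connections - len(explore)
    # Fastest first; ties keep dictionary insertion order.
    best = sorted(throughput, key=throughput.get, reverse=True)[:slots]
    return best, explore
```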
The size of chunks chosen by the client should be sufficiently large
that the chunk request header fields and response header fields
represent negligible overhead, and sufficiently large that they can be
pipelined effectively without needing a very high rate of chunk
requests. At the same time, the amount of time wasted waiting for
the last chunk to download from the last server after all the other
servers have finished should be minimized. Note that Range requests
impose an overhead on servers and clients need to be aware of that
and not abuse them.
9. IANA Considerations
Accordingly, IANA will make the following registration to the Link
Relation Type registry.
o Relation Name: duplicate
o Description: Refers to a resource whose available representations
are byte-for-byte identical with the corresponding representations of
the context IRI.
o Reference: This specification.
o Notes: This relation is for static resources. That is, an HTTP GET
request on any duplicate will return the same representation. It
does not make sense for dynamic or POSTable resources and should not
be used for them.
10. Security Considerations
10.1. URIs and IRIs
Metalink clients handle URIs and IRIs. See Section 7 of [RFC3986]
and Section 8 of [RFC3987] for security considerations related to
their handling and use.
10.2. Spoofing
There is potential for spoofing attacks where the attacker publishes
Metalinks with false information. This could deceive unaware
downloaders into downloading a malicious or
worthless file. Also, malicious publishers could attempt a
distributed denial of service attack by inserting unrelated URIs into
Metalinks.
10.3. Cryptographic Hashes
Currently, some of the digest values defined in Instance Digests in
HTTP [RFC3230] are considered insecure. These include the whole
Message Digest family of algorithms which are not suitable for
cryptographically strong verification. Malicious people could
provide files that appear to be identical to another file because of
a collision, i.e. the weak cryptographic hashes of the intended file
and a substituted malicious file could match.
If a Metalink contains whole file hashes as described in Section 6,
it SHOULD include SHA-256, as specified in [FIPS-180-3], or stronger.
It MAY also include other hashes.
10.4. Signing
Metalinks should include digital signatures, as described in
Section 5.
Digital signatures provide authentication, message integrity, and
non-repudiation with proof of origin.
11. References
11.1. Normative References
[BITTORRENT]
Cohen, B., "The BitTorrent Protocol Specification",
BITTORRENT 11031, February 2008,
<http://www.bittorrent.org/beps/bep_0003.html>.
[FIPS-180-3]
National Institute of Standards and Technology (NIST),
"Secure Hash Standard (SHS)", FIPS PUB 180-3,
October 2008.
[ISO3166-1]
International Organization for Standardization, "ISO 3166-
1:2006. Codes for the representation of names of
countries and their subdivisions -- Part 1: Country
codes", November 2006.
[RFC0959] Postel, J. and J. Reynolds, "File Transfer Protocol",
STD 9, RFC 0959, October 1985.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
This draft, compared to the Metalink/XML format [RFC5854] :
o (+) Reuses existing HTTP standards without much new besides a Link
Relation Type. It's more of a collection/coordinated feature set.
o (?) The existing standards don't seem to be widely implemented.
o (+) No XML dependency, except for Metalink/XML for partial file
cryptographic hashes.
o (+) Existing Metalink/XML clients can be easily converted to
support this as well.
o (+) Coordination of mirror servers is preferred, but not required.
Coordination could be difficult or impossible unless you are in
control of all servers on the mirror network.
o (-) Requires software or configuration changes to originating
server.
o (-?) Tied to HTTP, not as generic. FTP/P2P clients won't be
using it unless they also support HTTP, unlike Metalink/XML.
o (-) Requires server-side support. Metalink/XML can be created by
user (or server, but server component/changes not required).
o (-) Also, Metalink/XML files are easily mirrored on all servers.
Even if usage in that case is not as transparent, this method
still gives access to all download information (with no changes
needed to servers) from all mirrors (FTP included).
o (-) Not portable/archivable/emailable. Metalink/XML is used to
import/export transfer queues. Not as easy for search engines to
index?
o (-) Not as rich metadata.
o (-) Not able to add multiple files to a download queue or create
directory structure.
Appendix C. Document History
[[ to be removed by the RFC editor before publication as an RFC. ]]
Known issues concerning this draft:
o Some organizations have many mirrors. Should all be sent, or only
a certain number? All should be included in the Metalink/XML, if
used.
o Using Metalink/XML for partial file cryptographic hashes. That
adds XML dependency to apps for an important feature. Is there a
better method?
-19 : January 20, 2011.
o Julian Reschke's review.
-18 : January 1, 2010.