RFC 7620

Scenarios with Host Identification Complications

Independent Submission M. Boucadair, Ed.
Request for Comments: 7620 B. Chatras
Category: Informational Orange
ISSN: 2070-1721 T. Reddy
Cisco Systems
B. Williams
Akamai, Inc.
B. Sarikaya
Huawei
August 2015
Abstract
This document describes a set of scenarios in which complications
when identifying which policy to apply for a host are encountered.
This problem is abstracted as "host identification". Describing
these scenarios allows commonalities between scenarios to be
identified, which is helpful during the solution design phase.
This document does not include any solution-specific discussions.
IESG Note
This document describes use cases where IP addresses are overloaded
with both location and identity properties. Such semantic
overloading is seen as a contributor to a variety of issues within
the routing system [RFC4984]. Additionally, these use cases may be
seen as a way to justify solutions that are not consistent with IETF
Best Current Practices on protecting privacy [BCP160] [BCP188].
Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This is a contribution to the RFC Series, independently of any other
RFC stream. The RFC Editor has chosen to publish this document at
its discretion and makes no statement about its value for
implementation or deployment. Documents approved for publication by
the RFC Editor are not a candidate for any level of Internet
Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc7620.

1. Introduction
The goal of this document is to enumerate scenarios that encounter
the issue of uniquely identifying a host among those sharing the same
IP address. Within this document, a host can be any device directly
connected to a network operated by a network provider, a Home
Gateway, or a roaming device located behind a Home Gateway.
An exhaustive list of the issues encountered in the Carrier-Grade NAT
(CGN), Address plus Port (A+P), and application proxy scenarios is
documented in [RFC6269]. In addition to those issues, some of the
scenarios described in this document suffer from additional issues
such as:
o Identifying which policy to enforce for a host (e.g., limit access
to the service based on some counters such as volume-based service
offerings); enforcing the policy will have an impact on all hosts
sharing the same IP address.
o Needing to correlate between the internal address:port and
external address:port to generate and therefore enforce policies.
o Querying a location server for the location of an emergency caller
based on the source IP address.
The goal of this document is to identify scenarios the authors are
aware of and that share the same complications in identifying which
policy to apply for a host. This problem is abstracted as the host
identification problem.
The analysis of the scenarios listed in this document indicates
several root causes for the host identification issue:
1. Presence of address sharing (CGN, A+P, application proxies,
etc.).
2. Use of tunnels between two administrative domains.
3. Combination of address sharing and presence of tunnels in the
path.
Even though these scenarios share the same root causes, describing
each scenario makes it possible to identify what the scenarios have in
common, which in turn helps during the solution design phase.
2. Scope
This document can be used as a tool to design a solution(s) that
mitigates the encountered issues. Note, [RFC6967] focuses only on
the CGN, A+P, and application proxies cases. The analysis in
[RFC6967] may not be accurate for some of the scenarios that do not
span multiple administrative domains (e.g., Section 10.1).

This document does not target means that would lead to exposing a
host beyond what the original packet, issued from that host, would
have already exposed. Such means are neither desirable nor required to
solve the issues encountered in the scenarios discussed in this
document. The focus is exclusively on means to restore the
information conveyed in the original packet issued by a given host.
These means are intended to help in identifying which policy to apply
for a given flow. These means may rely on some bits of the source IP
address and/or port number(s) used by the host to issue packets.
To prevent side effects and misuses of such means on privacy, a
solution specification document(s) should explain, in addition to
what is already documented in [RFC6967], the following:
o To what extent the solution can be used to nullify the effect of
using address sharing to preserve privacy (see, for example,
[EFFOpenWireless]). Note, this concern can be mitigated if the
address-sharing platform is under the responsibility of the host's
owner and the host does not leak information that would interfere
with the host's privacy protection tool.
o To what extent the solution can be used to expose privacy
information beyond what the original packet would have already
exposed. Note, the solutions discussed in [RFC6967] do not allow
extra information to be revealed other than what is conveyed in
the original packet.
This document covers both IPv4 and IPv6.
This document does not include any solution-specific discussions. In
particular, the document does not elaborate on whether explicit
authentication is enabled or not.
This document does not discuss whether specific information needs to
be leaked in packets, whether this is achieved out of band, etc.
Those considerations are out of scope.
3. Scenario 1: Carrier-Grade NAT (CGN)
Several flavors of stateful CGN have been defined. A non-exhaustive
list is provided below:
1. IPv4-to-IPv4 NAT (NAT44) [RFC6888] [STATELESS-NAT44]
2. DS-Lite NAT44 [RFC6333]
3. Network Address and Protocol Translation from IPv6 Clients to
IPv4 Servers (NAT64) [RFC6146]

4. IPv6-to-IPv6 Network Prefix Translation (NPTv6) [RFC6296]
As discussed in [RFC6967], remote servers are not able to distinguish
between hosts sharing the same IP address (Figure 1). As a reminder,
remote servers rely on the source IP address for various purposes
such as access control or abuse management. The loss of the host
identification will lead to issues discussed in [RFC6269].
+-----------+
|  HOST_1   |-----+
+-----------+     |       +--------------------+      +------------+
                  |       |                    |------|  Server 1  |
+-----------+  +-----+    |                    |      +------------+
|  HOST_2   |--| CGN |----|      INTERNET      |            ::
+-----------+  +-----+    |                    |      +------------+
                  |       |                    |------|  Server n  |
+-----------+     |       +--------------------+      +------------+
|  HOST_3   |-----+
+-----------+
Figure 1: CGN Reference Architecture
Some of the above-referenced CGN scenarios will be satisfied by
eventual completion of the transition to IPv6 across the Internet
(e.g., NAT64), but this is not true of all CGN scenarios (e.g., NPTv6
[RFC6296]) for which some of the issues discussed in [RFC6269] will
be encountered (e.g., impact on geolocation).
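The loss of host identification behind a stateful CGN can be
illustrated with a minimal Python sketch. The Cgn class, its
sequential port allocation, and the addresses (taken from
documentation ranges) are assumptions made for illustration only;
they are not drawn from any of the referenced specifications.

```python
import itertools


class Cgn:
    """Toy stateful NAPT: many internal hosts share one external address."""

    def __init__(self, external_ip, first_port=1024):
        self.external_ip = external_ip
        self._next_port = itertools.count(first_port)
        self.bindings = {}  # (internal_ip, internal_port) -> external_port

    def translate(self, internal_ip, internal_port):
        """Return the (source IP, source port) that the remote server sees."""
        key = (internal_ip, internal_port)
        if key not in self.bindings:
            self.bindings[key] = next(self._next_port)
        return self.external_ip, self.bindings[key]


cgn = Cgn("192.0.2.1")
seen_by_server = [
    cgn.translate("10.0.0.1", 40000),  # HOST_1
    cgn.translate("10.0.0.2", 40000),  # HOST_2
    cgn.translate("10.0.0.3", 40000),  # HOST_3
]
# Every flow arrives with the same source IP address.
source_ips = {ip for ip, _port in seen_by_server}
```

In this model, the three hosts collapse into a single source IP
address at the server; only the external port differs, and the
mapping back to a host exists solely in the CGN's binding table.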
Privacy-related considerations discussed in [RFC6967] apply for this
scenario.
4. Scenario 2: Address plus Port (A+P)
A+P [RFC6346] [RFC7596] [RFC7597] denotes a flavor of address-sharing
solutions that does not require any additional NAT function to be
enabled in the service provider's network. A+P assumes subscribers
are assigned with the same IPv4 address together with a port set.
Subscribers assigned with the same IPv4 address should be assigned
non-overlapping port sets. Devices connected to an A+P-enabled
network should be able to restrict the IPv4 source port to be within
a configured range of ports. To forward incoming packets to the
appropriate host, a dedicated entity called the Port-Range Router
(PRR) [RFC6346] is needed (Figure 2).
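The two properties described above (non-overlapping port sets and
port-based forwarding of incoming packets) can be sketched as follows.
This is a simplified model, not the PRR behavior specified in
[RFC6346]: the subscriber names, the contiguous port ranges, and the
function names are hypothetical.

```python
# Hypothetical port-set assignments for subscribers sharing one IPv4 address.
PORT_SETS = {
    "subscriber_a": range(1024, 2048),
    "subscriber_b": range(2048, 3072),
    "subscriber_c": range(3072, 4096),
}


def port_sets_overlap(port_sets):
    """Return True if any two assigned (contiguous) port sets share a port."""
    ranges = sorted(port_sets.values(), key=lambda r: r.start)
    return any(a.stop > b.start for a, b in zip(ranges, ranges[1:]))


def prr_lookup(port_sets, dst_port):
    """Model the forwarding decision: find the subscriber owning dst_port."""
    for subscriber, ports in port_sets.items():
        if dst_port in ports:
            return subscriber
    return None  # No subscriber owns this port.
```

For example, prr_lookup(PORT_SETS, 2500) selects "subscriber_b". A
remote server, which has no access to the port-set assignments, still
sees one shared IPv4 address for all three subscribers.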
Similar to the CGN case, remote servers rely on the source IP address
for various purposes such as access control or abuse management. The
loss of the host identification will lead to the issues discussed in

The administrator of the proxy may have many reasons for wanting to
proxy traffic - including caching, policy enforcement, malware
scanning, reporting on network or user behavior for compliance, or
security monitoring.
The same administrator may also wish to selectively hide or expose
the internal host identity to servers. He/she may wish to hide the
identity to protect end-user privacy or to reduce the ability of a
rogue agent to learn the internal structure of the network. He/she
may wish to allow upstream servers to identify hosts to enforce
access policies (for example, on documents or online databases), to
enable account identification (on subscription-based services) or to
prevent spurious misidentification of high-traffic patterns as a DoS
attack. Application-specific protocols exist for enabling such
forwarding on some plaintext protocols (e.g., Forwarded headers on
HTTP [RFC7239] or time-stamp-line headers in SMTP [RFC5321]).
Servers not receiving such notifications but wishing to perform host
or user-specific processing are obliged to use other application-
specific means of identification (e.g., cookies [RFC6265]).
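As an illustration of the HTTP case, a server could recover the
identity exposed by a proxy from the Forwarded header of [RFC7239]
roughly as follows. This is a simplified sketch: it assumes that no
comma or semicolon appears inside a quoted string, and the function
names are hypothetical.

```python
def parse_forwarded(header_value):
    """Parse a Forwarded header value into one dict per proxy hop.

    Simplified: assumes no ',' or ';' appears inside quoted strings.
    """
    hops = []
    for element in header_value.split(","):
        params = {}
        for pair in element.split(";"):
            if "=" not in pair:
                continue
            name, value = pair.split("=", 1)
            # Parameter names are case-insensitive; values may be quoted.
            params[name.strip().lower()] = value.strip().strip('"')
        if params:
            hops.append(params)
    return hops


def original_client(header_value):
    """Return the 'for' parameter of the first hop, or None if absent."""
    hops = parse_forwarded(header_value)
    return hops[0].get("for") if hops else None
```

A server would normally rely on this value only when the header was
added by a proxy it trusts, since a client can supply the header
itself.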
Packets/connections must be received by the proxy regardless of the
IP address family in use. The requirements of this scenario are not
satisfied by eventual completion of the transition to IPv6 across the
Internet. Complications will arise for both IPv4 and IPv6.
Privacy-related considerations discussed in [RFC6967] apply for this
scenario.
6. Scenario 4: Distributed Proxy Deployment
This scenario is similar to the proxy deployment scenario (Section 5)
with the same use cases. However, in this instance part of the
functionality of the application proxy is located in a remote site.
This may be desirable to reduce infrastructure and administration
costs or because the hosts in question are mobile or roaming hosts
tied to a particular administrative zone of control but not to a
particular network.
In some cases, a distributed proxy is required to identify a host on
whose behalf it is performing the caching, filtering, or other
desired service - for example, to know which policies to enforce.
Typically, IP addresses are used as a surrogate. However, in the
presence of CGN, this identification becomes difficult. Alternative
solutions include the use of cookies (which only work for HTTP
traffic), tunnels, or proprietary extensions to existing protocols.

packets to the receiver via one or more machines in the overlay
network, applying various performance enhancement methods.
               +------------------------------------+
               |                                    |
               |              INTERNET              |
               |                                    |
+-----------+  |  +------------+                    |
|  HOST_1   |-----| OVRLY_IN_1 |-----------+        |
+-----------+  |  +------------+           |        |
               |                           |        |
+-----------+  |  +------------+     +-----------+  |  +--------+
|  HOST_2   |-----| OVRLY_IN_2 |-----| OVRLY_OUT |-----| Server |
+-----------+  |  +------------+     +-----------+  |  +--------+
               |                           |        |
+-----------+  |  +------------+           |        |
|  HOST_3   |-----| OVRLY_IN_3 |-----------+        |
+-----------+  |  +------------+                    |
               |                                    |
               +------------------------------------+
Figure 6: Overlay Network Reference Architecture
Such overlay networks are used to improve the performance of content
delivery [IEEE1344002]. Overlay networks are also used for
peer-to-peer data transport [RFC5694], and they have been suggested
for use in both improved scalability for the Internet routing
infrastructure [RFC6179] and provisioning of security services
(intrusion detection, anti-virus software, etc.) over the public
Internet [IEEE101109].
In order for an overlay network to intercept packets and/or
connections transparently via base Internet connectivity
infrastructure, the overlay ingress and egress hosts (OVRLY_IN and
OVRLY_OUT in Figure 6) must be reliably in path in both directions
between the connection-initiating host and the server. When this is
not the case, packets may be routed around the overlay and sent
directly to the receiving host, presumably without invoking some of
the advanced service functions offered by the overlay.
For public overlay networks, where the ingress and/or egress hosts
are on the public Internet, packet interception commonly uses network
address translation for the source (SNAT) or destination (DNAT)
addresses in such a way that the public IP addresses of the true
endpoint hosts involved in the data transport are invisible to each
other (see Figure 7). For example, the actual sender and receiver
may use two completely different pairs of source and destination
addresses to identify the connection on the sending and receiving

networks in cases where both the ingress and egress hosts are on the
public Internet.
          IP hdr contains:               IP hdr contains:
SENDER -> src = sender   --> OVERLAY --> src = overlay2 --> RECEIVER
          dst = overlay1                 dst = receiver
Figure 7: NAT Operations in an Overlay Network
In this scenario, the remote server is not able to distinguish among
hosts using the overlay for transport. In addition, the remote
server is not able to determine the overlay ingress point being used
by the host, which can be useful for diagnosing host connectivity
issues.
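The address rewriting of Figure 7 can be modeled with a short Python
sketch. The Packet type, the overlay_forward function, and all
addresses (documentation ranges) are assumptions made for
illustration; a real overlay would also keep state to translate the
return traffic.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str


def overlay_forward(packet, overlay_egress_ip, receiver_ip):
    """Rewrite both addresses as the packet leaves the overlay (SNAT + DNAT)."""
    return replace(packet, src=overlay_egress_ip, dst=receiver_ip)


# Illustrative addresses: overlay ingress, overlay egress, and the receiver.
OVERLAY_IN, OVERLAY_OUT, RECEIVER = "198.51.100.1", "198.51.100.2", "203.0.113.10"

# Two different senders address their packets to the overlay ingress.
sent = [
    Packet("192.0.2.11", OVERLAY_IN, "req-1"),
    Packet("192.0.2.22", OVERLAY_IN, "req-2"),
]
received = [overlay_forward(p, OVERLAY_OUT, RECEIVER) for p in sent]
```

After the rewrite, both packets carry the overlay egress address as
their source, so neither the sender's address nor the ingress point
is visible to the receiver.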
In some of the above-referenced scenarios, IP packets traverse the
overlay network fundamentally unchanged, with the overlay network
functioning much like a CGN (Section 3). In other cases, connection-
oriented data flows (e.g., TCP) are terminated by the overlay in
order to perform object caching and other such transport and
application-layer optimizations, similar to the proxy scenario
(Section 5). In both cases, address sharing is a requirement for
packet/connection interception, which means that the requirements for
this scenario are not satisfied by the eventual completion of the
transition to IPv6 across the Internet.
More details about this scenario are provided in [OVERLAYPATH].
This scenario does not introduce privacy concerns since the
identification of the host is local to a single administrative domain
(i.e., Content Delivery Network (CDN) Overlay Network) or passed to a
remote server to help forward the response back to the appropriate
host. The host identification information is not publicly available,
nor can it be disclosed to other hosts connected to the Internet.
8. Scenario 6: Policy and Charging Control Architecture (PCC)
This issue is related to the PCC framework defined by 3GPP in
[TS23.203] when a NAT is located between the Policy and Charging
Enforcement Function (PCEF) and the Application Function (AF) as
shown in Figure 8.
The main issue is that the PCEF, the Policy and Charging Rule Function
(PCRF), and the AF all receive information bound to the same User
Equipment (UE) but are unable to correlate the pieces of data visible
to each entity. Concretely,

This scenario does not introduce privacy concerns since the
identification of the host is local to a single administrative domain
and is meant to help identify which policy to select for a UE.
9. Scenario 7: Emergency Calls
Voice Service Providers (VSPs) operating under certain jurisdictions
are required to route emergency calls from their subscribers and have
to include information about the caller's location in signaling
messages they send towards Public Safety Answering Points (PSAPs)
[RFC6443] via an Emergency Service Routing Proxy (ESRP) [RFC6443].
This information is used both for the determination of the correct
PSAP and to reveal the caller's location to the selected PSAP.
In many countries, regulation bodies require that this information be
provided by the network rather than the user equipment, in which case
the VSP needs to retrieve this information (by reference or by value)
from the access network where the caller is attached.
This requires the VSP call server receiving an emergency call request
to identify the relevant access network and to query a Location
Information Server (LIS) in this network using a suitable lookup key.
In the simplest case, the source IP address of the IP packet carrying
the call request is used both for identifying the access network
(thanks to a reverse DNS query) and as a lookup key to query the LIS.
Obviously, the user-id as known by the VSP (e.g., telephone number or
email-formatted URI) can't be used as it is not known by the access
network.
The above mechanism is broken when there is a NAT between the user
and the VSP and/or if the emergency call is established over a VPN
tunnel (e.g., an employee remotely connected to a company Voice over
IP (VoIP) server through a tunnel wishes to make an emergency call).
In such cases, the source IP address received by the VSP call server
will identify the NAT or the address assigned to the caller equipment
by the VSP (i.e., the address inside the tunnel). This is similar to
the CGN case (Section 3) and the overlay network case (Section 7) and
applies irrespective of the IP versions used on both sides of the NAT
and/or inside and outside the tunnel.
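The failure mode can be made concrete with a toy model in which the
source IP address serves both to select the access network and as the
LIS lookup key. The tables, addresses, and function name below are
hypothetical; a real deployment would use a reverse DNS query and a
LIS protocol rather than in-memory dictionaries.

```python
# Hypothetical mapping from source address to the access network owning it
# (standing in for the reverse DNS query).
ACCESS_NETWORK_OF = {
    "192.0.2.10": "access-net-a",   # address assigned to the caller's equipment
    "192.0.2.200": "access-net-a",  # address of a NAT in the same access network
}

# Hypothetical LIS content, keyed by the address the access network assigned.
LIS = {"access-net-a": {"192.0.2.10": "12 Example Street"}}


def locate_caller(source_ip):
    """Use the source IP both to select the LIS and as the lookup key."""
    network = ACCESS_NETWORK_OF.get(source_ip)
    if network is None:
        return None  # e.g., an address assigned inside a VPN tunnel
    return LIS[network].get(source_ip)
```

The lookup succeeds for "192.0.2.10" but returns nothing useful when
the call server sees the NAT's address ("192.0.2.200") or a
tunnel-internal address such as "10.8.0.5".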
Therefore, the VSP needs to receive an additional piece of
information that can be used to both identify the access network
where the caller is attached and query the LIS for his/her location.
This would require the NAT or the tunnel endpoint to insert this
extra information in the call requests delivered to the VSP call
servers. For example, this extra information could be a combination
of the local IP address assigned by the access network to the

caller's equipment with some form of identification of this access
network.
However, because it shall be possible to set up an emergency call
regardless of the actual call control protocol used between the user
and the VSP (e.g., SIP [RFC3261], Inter-Asterisk eXchange (IAX)
[RFC5456], tunneled over HTTP, or proprietary protocol, possibly
encrypted), this extra information has to be conveyed outside the
call request, in the header of lower-layer protocols.
Privacy-related considerations discussed in [RFC6967] apply for this
scenario.
10. Other Deployment Scenarios
This section lists deployment scenarios that are variants of
scenarios described in previous sections.
10.1. Open WLAN or Provider WLAN
In the context of Provider WLAN, a dedicated Service Set Identifier
(SSID) can be configured and advertised by the Residential Gateway
(RG) for visiting terminals. These visiting terminals can be mobile
terminals, PCs, etc.
Several deployment scenarios are envisaged:
1. Deploy a dedicated node in the service provider's network that
will be responsible for intercepting all the traffic issued from
visiting terminals (see Figure 11). This node may be co-located
with a CGN function if private IPv4 addresses are assigned to
visiting terminals. Similar to the CGN case discussed in
Section 3, remote servers may not be able to distinguish visiting
hosts sharing the same IP address (see [RFC6269]).
2. Unlike the previous deployment scenario, IPv4 addresses are
managed by the RG without requiring any additional NAT to be
deployed in the service provider's network for handling traffic
issued from visiting terminals. Concretely, a visiting terminal
is assigned with a private IPv4 address from the IPv4 address
pool managed by the RG. Packets issued from a visiting terminal
are translated using the public IP address assigned to the RG
(see Figure 12). This deployment scenario induces the following
identification concerns:

* The provider is not able to distinguish the traffic belonging
to the visiting terminal from the traffic of the subscriber
owning the RG. This is needed to identify which policies are
to be enforced such as: accounting, Differentiated Services
Code Point (DSCP) remarking, black list, etc.
* Similar to the CGN case (Section 3), a misbehaving visiting
terminal is likely to have some impact on the service
experienced by the subscriber owning the RG (e.g., some of the
issues are discussed in [RFC6269]).
+-------------+
|Local_HOST_1 |-----+
+-------------+     |
                    |    |
+-------------+  +-----+ |  +-----------+
|Local_HOST_2 |--| RG  |-|--|Border Node|
+-------------+  +-----+ |  +----NAT----+
                    |    |
+-------------+     |    |  Service Provider
|Visiting Host|-----+
+-------------+
Figure 11: NAT Enforced in a Service Provider's Node

+-------------+
|Local_HOST_1 |-----+
+-------------+     |
                    |    |
+-------------+  +-----+ |  +-----------+
|Local_HOST_2 |--| RG  |-|--|Border Node|
+-------------+  +-NAT-+ |  +-----------+
                    |    |
+-------------+     |    |  Service Provider
|Visiting Host|-----+
+-------------+
Figure 12: NAT Located in the RG
This scenario does not introduce privacy concerns since the
identification of the host is local to a single administrative domain
and is meant to help identify which policy to select for a visiting
UE.

10.2. Cellular Networks
Cellular operators allocate private IPv4 addresses to mobile
terminals and deploy a NAT44 function, generally co-located with
firewalls, to access public IP services. The NAT function is located
at the boundaries of the Public Land Mobile Network (PLMN).
An IPv6-only strategy, which consists of allocating only IPv6
prefixes to mobile terminals, is being considered by various
operators. A NAT64 function is also considered in order to preserve
IPv4 service continuity for these customers.
These NAT44 and NAT64 functions bring some issues that are very
similar to those of the CGN case (Section 3, Figure 1) and of the PCC
case (Section 8). These issues
are particularly encountered if policies are to be applied on the Gi
interface.
Note: 3GPP defines the Gi interface as the reference point between
the Gateway GPRS Support Node (GGSN) and an external Packet Domain
Network (PDN). This interface reference point is called SGi in 4G
networks (i.e., between the PDN Gateway and an external PDN).
Because private IP addresses are assigned to the mobile terminals,
there is no correlation between the internal IP address and the
external address:port assigned by the NAT function, etc.
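The correlation problem can be sketched as a lookup that is only
possible with access to the NAT's binding state. The Binding record,
its fields, and the find_internal function are hypothetical and only
illustrate why the external address:port alone does not identify a
terminal.

```python
from typing import NamedTuple, Optional


class Binding(NamedTuple):
    internal_ip: str
    internal_port: int
    external_ip: str
    external_port: int
    start: float  # binding creation time, in seconds
    end: float    # binding expiry time, in seconds


def find_internal(bindings, external_ip, external_port, when) -> Optional[Binding]:
    """Return the binding that owned external_ip:external_port at time `when`."""
    for b in bindings:
        same_endpoint = (b.external_ip, b.external_port) == (external_ip, external_port)
        if same_endpoint and b.start <= when < b.end:
            return b
    return None


# The same external address:port is reused by two terminals over time.
LOG = [
    Binding("10.1.0.5", 50000, "192.0.2.1", 2000, start=0.0, end=60.0),
    Binding("10.1.0.9", 41000, "192.0.2.1", 2000, start=60.0, end=120.0),
]
```

In this model, the external address:port "192.0.2.1":2000 maps to
different terminals at different times, so both the binding table and
a timestamp are needed to select the terminal to which a policy
applies.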
Privacy-related considerations discussed in [RFC6967] apply for this
scenario.
10.3. Femtocells
This scenario can be seen as a combination of the scenarios described
in Sections 8 and 10.1.
The reference architecture is shown in Figure 13.
A Femto Access Point (FAP) is defined as a home base station used to
graft a local (femto) cell within a user's home to a mobile network.

o determine the corresponding FAP's public IPv4 address associated
with the UE's inner IPv4 address that is assigned by the mobile
network to identify the mobile UE, which allows the PCRF to
retrieve the special UE's policy (e.g., QoS) to be passed onto the
Broadband Policy Control Function (BPCF) at the BBF network.
SeGW would have the complete knowledge of such mapping, but the
reasons for being unable to use SeGW for this purpose are explained
in Section 2 of [IKEv2-CP-EXT].
This scenario involves PCRF/BPCF, but it is valid in other deployment
scenarios making use of Authentication, Authorization, and Accounting
(AAA) servers.
The issue of correlating the internal IP address and the public IP
address is valid even if there is no NAT in the path.
This scenario does not introduce privacy concerns since the
identification of the host is local to a single administrative domain
and is meant to help identify which policy to select for a UE.
10.4. Traffic Detection Function (TDF)
Operators expect that the traffic subject to the packet inspection is
routed via the Traffic Detection Function (TDF) as per the
requirement specified in [TS29.212]; otherwise, the traffic may
bypass the TDF. This assumption only holds if it is possible to
identify individual UEs behind the Basic NAT or NAPT invoked in the
RG connected to the fixed broadband network, as shown in Figure 14.
As a result, additional mechanisms are needed to meet this
requirement.