Silicon photonic interconnects have been proposed as a solution to address chip I/O communication bottlenecks in multi-core architectures.
In this paper, we perform comprehensive design exploration of inter-chip photonic links and networking architectures. Because the energy
efficiencies of such architectures have been shown to be highly sensitive to link utilization, our design exploration covers designs in which
resources are shared: by means of shared buses and silicon photonic switches, link utilizations can be improved. To conduct this exploration, we
introduce a modeling methodology that captures not only the physical layer characteristics in terms of link capacity and energy efficiency
but also the network utilization of silicon photonic chip-to-chip designs. Our models show that silicon photonic interconnects can sustain
very high loads (over 100 Tb/s) with low energy costs (< 1 pJ/bit). On the other hand, resource-sharing architectures typically used to cope
with low and sporadic loads come at a relatively high energy cost.
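The sensitivity of energy efficiency to link utilization can be illustrated with a small back-of-envelope model. All numbers below are hypothetical assumptions for illustration, not figures from the paper:

```python
# Effective energy per bit of a link whose static power (e.g. laser,
# thermal tuning) is paid regardless of load, so the per-bit cost grows
# as utilization drops. All parameter values are illustrative only.

def effective_pj_per_bit(dynamic_pj_per_bit, static_power_mw,
                         link_capacity_gbps, utilization):
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    carried_gbps = link_capacity_gbps * utilization
    # mW divided by Gbps yields pJ/bit directly.
    static_pj_per_bit = static_power_mw / carried_gbps
    return dynamic_pj_per_bit + static_pj_per_bit

full = effective_pj_per_bit(0.3, 50, 100, 1.0)   # 0.8 pJ/bit at full load
light = effective_pj_per_bit(0.3, 50, 100, 0.1)  # 5.3 pJ/bit at 10% load
```

This is why sub-pJ/bit operation holds only at high sustained loads, and why resource sharing is attractive at low loads despite its own energy cost.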

Authors' affiliation: Columbia University, USA

R. Hendry, S. Rumley, D. Nikolova and K. Bergman

Accelerating Spark with RDMA for Big Data Processing: Early Experiences

Abstract

Apache Hadoop MapReduce has been highly successful in processing large-scale data-intensive batch applications on commodity clusters. However,
for low-latency interactive applications and iterative computations, Apache Spark, an emerging in-memory processing framework, has been stealing
the limelight. Recent studies have shown that current generation Big Data frameworks (like Hadoop) cannot efficiently leverage advanced features
(e.g. RDMA) on modern clusters with high-performance networks. One of the major bottlenecks is that this middleware is traditionally written
with sockets and does not deliver the best performance on modern HPC systems with RDMA-enabled high-performance interconnects. In this paper, we first
assess the opportunities of bringing the benefits of RDMA into the Spark framework. We further propose a high-performance RDMA-based design for
accelerating data shuffle in the latest Spark package on high-performance networks. Performance evaluation shows that our proposed design can
achieve 79-83% performance improvement for GroupBy, compared with the default Spark running with IP over InfiniBand (IPoIB FDR) on a 128-256
core cluster. We adopt a plug-in-based approach that allows our design to be easily integrated with newer Spark releases. To the best of our
knowledge, this is the first design for accelerating Spark with RDMA for Big Data processing.

Input/output (I/O) bus networks using Ethernet
transfer a PCI Express (PCIe) I/O packet between a host and an
I/O device by encapsulating it into an Ethernet frame. Because
the size of PCIe packets is generally small in such systems, the
overhead to individually encapsulate them into Ethernet frames
lowers the PCIe bandwidth provided by Ethernet connections.
This decrease in bandwidth directly degrades the performance
of I/O devices, which transmit high-throughput PCIe traffic.
Examples of such devices include PCIe solid-state drives and
graphics processing units. We propose a method of aggregating
multiple PCIe packets into a single Ethernet frame in an end-to-
end manner to provide high-throughput PCIe bandwidth.
The aggregation is performed at the bottleneck-link rate of the
Ethernet path through which those packets are transferred. The
number of aggregated PCIe packets is adaptively determined
by aggregating the packets residing in the transmission queue at
the moment it is scheduled to transmit. This enables low-latency
bandwidth enhancement because the proposed method does not
increase the transmission latency of PCIe packets by waiting
for more packets to aggregate. In addition, it does not require
individual configuration of operation parameters, such as the aggregation
threshold and timeout value, depending on the rate of
the PCIe traffic of I/O devices. We implemented our method in
a PCIe-to-Ethernet bridge prototype using a field-programmable
gate array. As a result, I/O performance improved up to 41%.
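The adaptive, wait-free aggregation policy can be sketched as follows. This is our own simplified illustration of the idea, not the authors' FPGA implementation; the constants and helper names are assumptions:

```python
# When the bridge is scheduled to transmit, it packs whatever PCIe
# packets are already queued into one Ethernet frame, up to the frame's
# payload budget, instead of waiting on a threshold or timeout.

from collections import deque

ETH_PAYLOAD_LIMIT = 1500   # assumed per-frame payload budget, bytes
PCIE_DELIMITER_BYTES = 8   # hypothetical per-packet framing overhead

def build_frame(queue: deque) -> list:
    """Drain queued PCIe packets into a single frame payload. The
    aggregation count is whatever fits right now, so no packet waits
    for later arrivals (hence no added latency)."""
    frame, used = [], 0
    while queue:
        need = len(queue[0]) + PCIE_DELIMITER_BYTES
        if used + need > ETH_PAYLOAD_LIMIT:
            break
        frame.append(queue.popleft())
        used += need
    return frame

q = deque([b"x" * 256, b"y" * 256, b"z" * 1400])
first = build_frame(q)   # packs the two 256-byte packets together
second = build_frame(q)  # the 1400-byte packet travels alone
```

Because the frame is built from packets already present, the aggregation degree adapts automatically to the traffic rate, with no per-device tuning.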

Authors' affiliation: NEC, Japan

J. Suzuki, Y. Hayashi, M. Kan, S. Miyakawa and T. Yoshikawa

12:00-13:10

Lunch

13:10-14:00

Switch Memory/TCAM

Session Chair: Ada Gavrilovska

Strategies for Mitigating TCAM Space Bottlenecks

Abstract

Transport networks satisfy requests to forward data in a given topology. At the level of a network element, forwarding decisions are defined by flows.
To implement desired data properties during forwarding, a network operator imposes economic models by applying policies to flows. In real applications,
the number of different policies is much smaller than the number of flows. In this work, we draw from our experience in classifier design for commercial
systems and demonstrate how to share classifiers that represent policies between flows while still implementing them per flow per policy state. The
resulting space saving is several orders of magnitude higher than that of any state-of-the-art method for reducing the space of classifier representations.

Data center networks demand high-bandwidth switches. These networks must also sustain common incast scenarios, which require large switch buffers. Therefore,
network and switch designers encounter a buffer-bandwidth tradeoff as follows. Large switch buffers allow absorbing larger incast workloads. However, higher
switch bandwidth allows both faster buffer draining and more link pausing, which reduces buffering demand for incast. As the two features compete for
silicon resources and device power budget, modeling their relative impact on the network is critical.

In this work our aim is to evaluate this buffer-bandwidth tradeoff. We analyze the worst-case incast scenario in a lossless network and find by
how much the buffer size can be reduced when the link bandwidth is increased, while maintaining the same network performance. In addition, we analyze
the multi-level incast cascade and support our findings by simulations. Our analysis shows that increasing bandwidth allows reducing the buffering
demand by at least the same ratio. In particular, we show that the switch buffers can be omitted if the link bandwidth is doubled.
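The direction of the tradeoff can be sanity-checked with a deliberately simplified, egress-bound model of our own; it is not the paper's analysis, and all numbers are illustrative:

```python
# Toy model: V bytes of incast must leave on one egress link of C Gb/s.
# In a lossless fabric, pausing ingress can replace buffering at the
# cost of a fixed pause-reaction overhead. If completion time is
# egress-bound, T = 8V / C, so doubling C more than compensates for
# removing the buffers.

def completion_us(volume_bytes, egress_gbps, pause_overhead_us=0.0):
    bits_per_us = egress_gbps * 1e3
    return volume_bytes * 8 / bits_per_us + pause_overhead_us

buffered = completion_us(1_500_000, 100)         # 120 us, large buffer
bufferless = completion_us(1_500_000, 200, 5.0)  # 65 us, no buffer
```

In this idealized setting the doubled link absorbs the pause overhead and still finishes sooner, consistent with the abstract's claim.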

Authors' affiliation: Mellanox Technologies, Israel

A. Shpiner, E. Zahavi and O. Rottenstreich

14:00-14:25

Session Chair: Madeleine Glick

Silicon Photonics for Bandwidth Scalable Solutions

Abstract

In this presentation we will review recent progress in the commercialization of silicon photonics to provide a scalable, high-bandwidth interconnectivity solution
for the growing needs in Datacenter and HPC applications.

M. Asghari, VP, Silicon Photonics R&D at Mellanox (Invited Talk)

Bio

Dr. Mehdi Asghari has around 20 years of research and product development experience within the silicon photonics industry. Currently, Dr. Asghari is the VP of Silicon
Photonics R&D at Mellanox. Prior to that, he was the CTO at Kotura Inc. Previously, Dr. Asghari served as VP of Research and Development at Bookham Inc. in the UK. Bookham was
the first company to commercialize silicon photonics.

Dr. Asghari holds a Ph.D. degree in Optoelectronics from the University of Bath, an M.Sc. degree in Optoelectronics and Laser Devices from the Heriot-Watt and St. Andrews
Universities, and an MA degree in Engineering from Cambridge University. He has authored or co-authored over 150 journal and conference publications and holds more than
15 active patents within the fields of silicon photonics and optoelectronics.

14:25-14:45

Afternoon Break

14:45-15:10

Session Chair: Cyriel Minkenberg

Data & Control Plane Interconnect solutions for SDN & NFV networks

Abstract

Software defined and functionally disaggregated network elements rely heavily on deterministic and secure data & control plane communication within
and across the network elements. In these environments, the scalability, reliability and performance of the whole network depend on the deterministic
behavior of this interconnect. In this presentation, Raghu Kondapalli will discuss various aspects of this data & control plane interconnect including its
functional requirements and solution components suitable for SDN/NFV environments.

Kondapalli brings rich experience and deep knowledge of the Data Center, Service Provider & Enterprise Networking business, specifically in envisioning
end-to-end solutions. Most recently, he was a founder and CTO of a cloud-based video collaboration company, Cloud Grapes Inc., where he was the chief architect
for its cloud-based video-as-a-service solution. Prior to Cloud Grapes, he led technology and system architecture teams at Marvell, Nokia & Nortel. Kondapalli
has about 35 patent applications in process and has been a thought leader behind many technologies at the companies he has worked for.

Kondapalli received a master's degree in Electrical Engineering from San Jose State University and a bachelor's degree in Electronics and Telecommunications
from Osmania University, India.

Virtualization of network elements reduces operation and capital expenses and provides the ability for operators to offer new network services faster and to scale
those services based on demand. Throughput, connection rate, low latency and low jitter are a few of the important challenges in the virtualization world. If not designed well,
processing power requirements go up, thereby reducing the cost benefits. This presentation discusses the performance challenges in VMMs (Virtual Machine Monitors)
and the opportunities to offload VMM packet processing. It also discusses vNF offload opportunities in various market segments. Finally, this presentation presents
OpenFlow's role in VMM and vNF offloads. It mainly discusses OpenFlow as a communication protocol between control/offload layers and the advantages of using the OF
pipeline to implement offload engines.

In his role at Freescale, Srini Addepalli focuses on architecture of fast path & data plane software utilizing acceleration technologies and Multicore processors.
Additionally, Srini is leading orchestration, NFV, SDN/OF and API security software initiatives in Freescale and also a contributor in ONF working groups.
Srini previously served as Chief Architect at Intoto and was responsible for the architecture of their Unified Threat Management software products for single and
multicore processors. He is a 20+-year veteran in networking and data communications and has worked at Intoto, Holontech, NEC and HP.

Yuval will talk about Facebook's recently unveiled "Wedge" top-of-rack network switch and a new Linux-based operating system for that switch, code-named "FBOSS."
These projects break down the hardware and software components of the network stack even further, to provide a new level of visibility, automation, and control
in the operation of the network. By combining the hardware and software modules together in new ways, "Wedge" and "FBOSS" depart from current networking design
paradigms to leverage our experience in operating hundreds of thousands of servers in our data centers.

Y. Bachar, Facebook (Invited Talk)

Bio

Yuval Bachar is a hardware network architect at Facebook, responsible for the current and
future networking hardware platform architecture and design, and a driving technologist for the
Open Networking initiative.
Prior to his role at Facebook, Yuval Bachar was a Senior Director of Engineering in the CTO
office in the Data Center Group at Cisco, which is responsible for the company's data center
product portfolio and next-generation innovation. In that capacity, he drove and supported
next-generation systems architecture, product market positioning, and cross-Cisco product
alignment. Yuval Bachar was also responsible for evangelizing new technologies around IoE
and their integration into the data center architecture.
Prior to his role in Cisco Mr. Bachar served as VP/CTO of High-End Systems Business Unit at
Juniper Networks. In this capacity, he drove Juniper's high-end systems architecture, market,
and technology positioning; including product, technology, and system level innovation and
alignment across the company.
Earlier, Mr. Bachar served in a range of roles at Cisco Systems for over 12 years. In his most
recent role he served as Principal Engineer/Technical Director in the Routing and Service
Provider Technology Group where he was responsible for system level architecture in the Core
Routing Business Unit. Previously, Mr. Bachar held management roles for groups ranging from
5 to 75 people, managing ASIC, hardware, software and test teams. Prior to that Mr. Bachar
worked in a technical leader role in the Desktop Switching Business Unit, responsible for the
first Cisco desktop modular switch as well as the architecture of the current generation of
Cisco desktop switching products (the Catalyst 3K family).
Before joining Cisco Mr. Bachar held various roles in Digital Equipment Corporation (DEC) in
the semiconductor group.
Mr. Bachar has been a contributor to the PCI standard and several IEEE standards. He holds
six approved US patents in the networking and system design areas and three Cisco pioneer
awards.
Mr. Bachar has a BSEE from the Technion in Haifa.

16:50-18:00

Executive Roundtable

Session Chair: Dan Pitt

The future of packet processing: ASICs vs. CPUs vs. GPUs

Moderator: Bob Wheeler, Linley Group

Bio

Bob Wheeler is the principal analyst for networking and has been part of The Linley Group since 2001. He has more than 25 years of experience in the networking, PC,
and semiconductor industries. Previously, he was the division marketing manager for AMD's Network Products Division before becoming an independent consultant in 1997.
Bob also has experience as a software engineer, engineering manager, and operations manager.

M. Kalkunte, Vice President, Broadcom

Bio

Mohan Kalkunte is the Vice President of the Network Switching Architecture group at Broadcom. As the head of the Architecture group, Mohan is responsible for the development of
multiple flagship Ethernet Switching chips that span Small, Medium, Enterprise, Data Center and Service provider markets. Mohan has over 25 years of experience in the field
of networking and semiconductors.

Mohan is a named inventor on over 100 issued patents. He was elected a Broadcom Fellow in 2009. Mohan is an IEEE Senior Member and has several publications. He holds a Bachelor's
degree in Mechanical Engineering from Bangalore University, India, a Master's degree from Syracuse University, and a Ph.D. in Industrial Engineering and Operations Research,
received from The Ohio State University in 1988.

H. Xu, Sr. Director, Cisco

Bio

Howie Xu manages the entire engineering team in Cisco's Cloud and Networking Services group. Before joining Cisco, Howie was VP of Engineering at Big Switch
Networks, where he built up the engineering team and the products from scratch. Before that, he was one of the first engineers for VMware's hypervisor product and
led VMware's networking R&D group for almost a decade. Howie is one of the co-inventors of hypervisor virtual switching, holds multiple key VMware network and IP
storage virtualization patents, and is a frequent speaker, panelist, and keynote speaker at numerous industry and analyst events on network virtualization, SDN, NFV,
data center networking, and cloud networking.

U. Elzur, SDN Architecture Director, Intel Data Center Group

Bio

Uri is responsible for creating long-term vision, technical strategy, architectures and products for server platforms, working with multiple groups and product divisions. Uri is a
networking specialist with more than 25 years of industry experience and a proven track record of creating innovative product architectures, strategies and intellectual property in
Networking, Security and related technologies. Prior to joining Intel, Uri held the position of Sr. Director at Broadcom, managing an architecture team with responsibility for
the company's NIC architecture and strategy. In that role Uri led multiple innovations in the areas of Virtualization, TCP Offload, RDMA, iSER, and iSCSI/FCoE.

Y. Bachar, Hardware Network Architect, Facebook


L. Dennison, Director Networking Research, NVIDIA

Bio

Dr. Dennison joined NVIDIA in September of 2013 and leads the Network Research Group. Prior to NVIDIA, he worked on software systems such as
high-performance distributed applications, database scaling for the cloud and software-defined networking. He also architected and led the
development of the ASIC chipset for the Avici Terabit Router which utilized a 3-D toroidal network. At BBN, Dr. Dennison was the principal
investigator for MicroPath, a wearable computer that connected to other wearables over a very low power RF network. Dr. Dennison
holds Ph.D., M.S., and B.S. degrees from the Massachusetts Institute of Technology.

Ethernet link rates historically have grown by factors of 10, from 10 Mb/s to most recently 100 Gb/s. This planned rate increase has happened
independently of the increase in SERDES rates. Originally this was not a problem, but now multiple SERDES are required to implement 40G and
100G Ethernet, sometimes leading to inefficient gearboxes to match SERDES rates to Ethernet rates. Now we are beginning to see data center
operators and chip makers step up to guide Ethernet towards a more efficient solution, and a direction that looks a lot more like PCI Express. We
are beginning to see channelized Ethernet.
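The SERDES/Ethernet rate mismatch behind the gearbox problem is easy to quantify. The helper below is an illustrative sketch of our own; the lane rates are examples, not an exhaustive list:

```python
import math

# Lanes needed to carry an Ethernet rate on SERDES of a given rate, and
# the capacity stranded when the rates do not divide evenly -- the case
# a gearbox must paper over.

def lanes_needed(eth_gbps, serdes_gbps):
    n = math.ceil(eth_gbps / serdes_gbps)
    stranded = n * serdes_gbps - eth_gbps
    return n, stranded

clean = lanes_needed(100, 25)     # (4, 0): four 25G lanes fit exactly
wasted = lanes_needed(40, 25)     # (2, 10): 10 Gb/s of SERDES capacity idle
```

Channelizing Ethernet around the native SERDES rate, as PCI Express does with its lanes, removes the stranded capacity by construction.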

Nathan Farrington, Rockley Photonics (Invited Talk)

Bio

Nathan Farrington is the Director of Software and System Architecture at Rockley Photonics, where he leads the software engineering team, designs next-generation data center networking
solutions, and tries to bridge the gap between the data center networking and optical communications communities. Previously, Nathan was the founder and CEO of Packetcounter, a
computer network software and services company. Before that he was a data center network engineer at Facebook developing the Open Compute Project top-of-rack switch. Before grad
school, Nathan worked for the US Navy in mobile robotics and situational awareness applications for the Department of Homeland Security.

Nathan graduated from the University of California, San Diego, with a PhD in Computer Science and Computer Engineering. He was advised by Amin Vahdat, now at Google, as well as George
Porter, George Papen, and Yeshaiahu "Shaya" Fainman. Nathan's dissertation topic was novel optical communications for data center networks. Nathan has served on the TPC
of OFC/NFOEC 2014 and 2015, and as a reviewer for numerous journals including IEEE/ACM Transactions on Networking, IEEE/OSA Journal of Lightwave Technology, IEEE Micro, and
ACM SIGCOMM Computer Communications Review. He is a member of the IEEE, the OSA, and the ACM.

Fat trees are a common topology used by High Performance Clusters and Data Center Networks. We present Quasi Fat Tree (QFT), a new flavor of fat
tree that has emerged in recent years as a dominant network topology. This topology is appealing to cluster designers because it offers better
performance for many concurrent small jobs which may fit in its extended 3-hop host groups. In this paper, we formulate the graph structure of
this new topology, and derive a closed-form and fault resilient contention-free routing algorithm for all global shift permutations. This routing
is important for optimizing the run-time of large computing jobs which utilize MPI collectives. The algorithm is verified by running its implementation
as an OpenSM routing engine on various sizes of QFT topologies.

Optical interconnects, which support the transport of large bandwidths over warehouse-scale distance, can help to further scale data-movement capabilities in
high performance computing (HPC) platforms. However, due to the circuit switching nature of optical systems and additional peculiarities, such as sensitivity
to temperature and the need for wavelength channel locking, optical links generally show longer link initialization delays. These delays are a major obstacle
in exploiting the high bandwidth of optics for application speedups, especially when low-latency remote direct memory access (RDMA) is required or small messages
are used.
These limitations can be overcome by maintaining a set of frequently used optical circuits based on the temporal locality of the application and
by maximizing the number of reuses to amortize initialization overheads. However, since circuits cannot be simultaneously maintained between all
source-destination pairs, the set of selected circuits must be carefully managed. This paper applies techniques inspired by cache optimizations to
intelligently manage circuit resources with the goal of maximizing the circuit 'hit' rate. We propose the concept of "circuit reuse
distance" and design circuit replacement policies based on this metric. We profile the reuse distance based on a group of representative HPC
applications with different communications patterns and show the potential to amortize circuit setup delay over multiple circuit requests. We
then develop a Markov transition matrix based reuse distance predictor and two circuit replacement policies. The proposed predictor provides
significantly higher accuracy than traditional maximum likelihood prediction and the two replacement policies are shown to effectively increase
the hit rate compared to the Least Recently Used policy. We further investigate the tradeoffs between the realized hit rate and energy consumption.
Finally, the feasibility of the proposed concept is experimentally demonstrated using silicon photonic devices in an FPGA-controlled network
testbed.
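As a rough illustration of a reuse-distance-driven replacement policy, consider the minimal sketch below. It is our own construction, not the paper's Markov predictor or its specific policies:

```python
# Minimal circuit cache: on a miss with a full table, evict the circuit
# with the largest last observed reuse distance, i.e. the one least
# likely to be requested again soon. Circuits never yet reused count as
# infinitely distant. A logical clock of requests stands in for time.

class CircuitCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.last_use = {}    # circuit -> logical time of last request
        self.reuse_dist = {}  # circuit -> last observed reuse distance
        self.clock = 0
        self.hits = self.misses = 0

    def request(self, circuit):
        self.clock += 1
        if circuit in self.last_use:
            self.hits += 1
            self.reuse_dist[circuit] = self.clock - self.last_use[circuit]
        else:
            self.misses += 1
            if len(self.last_use) >= self.capacity:
                victim = max(self.last_use, key=lambda c:
                             self.reuse_dist.get(c, float("inf")))
                del self.last_use[victim]
                self.reuse_dist.pop(victim, None)
        self.last_use[circuit] = self.clock

cache = CircuitCache(capacity=2)
for pair in ["A", "B", "A", "B", "C", "A"]:
    cache.request(pair)
```

With only two maintained circuits, the A/B locality in this trace still yields two hits; the one-off request for C evicts a frequently reused circuit, which is exactly the behavior a reuse-distance predictor aims to avoid.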

The Tofu Interconnect 2 (Tofu2) is a system interconnect designed for the successor model of the FUJITSU Supercomputer PRIMEHPC FX10. Tofu2 uses a
25 Gbps transmission technology that is about two times faster than that of existing HPC interconnects, and uses optical transceivers at an
extremely high ratio. Despite the major change of physical transport medium, Tofu2 inherits and enhances the features of Tofu1. This paper
describes the specifications of Tofu2, including its frame format, implementation, and preliminary evaluation results. The effective throughput of
Put transfer was evaluated to be 11.46 GB/s, which is about 92% link efficiency. The total throughput of simultaneous 4-way transfer was evaluated
to be 45.82 GB/s. Tofu2 reduced one-way communication latency by about 0.2 usec. The elimination of the host bus contributed about 60 nsec of the
reduction, and the rest was derived from the cache injection technique.
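The quoted figures are mutually consistent if one assumes, as we do here for illustration, a link of four 25 Gbps lanes, i.e. 12.5 GB/s raw per direction:

```python
# Cross-checking the abstract's numbers under the assumption of a
# 4-lane x 25 Gbps link (12.5 GB/s of raw bandwidth per direction).

raw_gb_per_s = 4 * 25 / 8          # 12.5 GB/s raw link rate
efficiency = 11.46 / raw_gb_per_s  # ~0.917, matching the quoted ~92%
four_way = 4 * 11.46               # 45.84 GB/s, close to the reported 45.82
```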

The Impact of ARM-based Solutions in Next Generation Cloud and Networking Infrastructure

Abstract

This session will focus on the major trends in cloud and networking infrastructure, with respect to software architecture and open source initiatives, as well
as how ARM-based solutions will factor into them. Software-Defined Networking concepts and the Network Function Virtualization (NFV) initiative under ETSI
are redefining the way in which network infrastructure software is being architected. We will overview the need for well-defined layers, with well-defined
interfaces and industry stakeholder collaboration required to realize the full potential of the vision for the future network platforms. We will give insight
into how the ARM ecosystem and open source initiatives add unique value in delivering the required platforms of the future.

Bob Monkman, ARM (Invited Talk)

Bio

Bob Monkman is part of the Enterprise Segment Marketing Team at ARM, located in San Jose, CA and focused on Enterprise Networking and Software Initiatives for the
Enterprise Segment like Linaro Networking Group, SDN and NFV. Bob has been in embedded computing for 25+ years, much of it in the commercial RTOS/Middleware world
focused on the network communications infrastructure space and did a stint in High Performance Clustered Computing. Bob's career roles have spanned largely Product
Management/Marketing, Strategic Alliances, and Business Development, but he started his career as a hardware/software engineer for Tellabs. Bob's technology experience
includes embedded RTOS, Linux, M2M/IoT, Systems Management Middleware strategy, and development tools, and he is now delving deeper into silicon designs. Bob was the Carrier-Grade
Linux PM at MontaVista from 2003-2005, where he was active in the Service Availability Forum and Carrier Grade Linux Working Group. Bob holds a BSEE from the
University of Illinois.

13:40-14:30

Network Traffic and Protocols

Session Chair: Ryan Grant

Traffic Optimization in Multi-Layered WANs using SDN

Abstract

Wide area networks (WANs) forward traffic through a mix of packet and optical data planes, composed of a variety of devices from different vendors.
Multiple forwarding technologies and encapsulation methods are used for each data plane (e.g. IP, MPLS, ATM, SONET, Wavelength Switching). Despite
the standards defined, the control planes of these devices are usually not interoperable, and different technologies are used to manage each forwarding
segment independently (e.g. OpenFlow, TL-1, GMPLS). The result is a lack of coordination between layers and inefficient resource usage. In this paper
we discuss the design and implementation of a system that uses unmodified OpenFlow to optimize network utilization across layers, enabling practical
bandwidth virtualization. We discuss strategies for scalable traffic monitoring and for minimizing losses on route updates across layers. We explore
two use cases that benefit from multi-layer bandwidth-on-demand provisioning. A prototype of the system was built using a traditional circuit
reservation application and an unmodified SDN controller, and its evaluation was performed on a multi-vendor testbed.

Modern distributed applications in high-performance
computing (HPC) fields often need to disseminate
data efficiently from one cluster to an arbitrary number of
others by using multicast techniques. InfiniBand, with its high throughput,
low latency and low overhead communications, has
been increasingly adopted as an HPC cluster interconnect.
Although InfiniBand hardware multicast is efficient and
scalable, it is based on Unreliable Datagrams (UD),
which cannot guarantee reliable data distribution. This
makes InfiniBand multicast a poor fit for modern
distributed applications. This paper presents the design and
implementation of a reliable multicast protocol for InfiniBand
(IBRMP). IBRMP is based on InfiniBand unreliable hardware
multicast, and utilizes InfiniBand Reliable Connection (RC)
to guarantee data delivery. According to our experiments,
IBRMP takes full advantage of InfiniBand multicast, which
reduces communication traffic significantly. In our testing
environment, using IBRMP is up to five times faster than
using only RC to disseminate data among a group of receivers.
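The division of labor between unreliable multicast and reliable repair can be sketched as a toy simulation. This is our own simplification of the idea, not the IBRMP wire protocol:

```python
# Each chunk is multicast once; receivers detect sequence-number gaps
# and fetch only the missing chunks over a reliable (RC-like) channel,
# so the bulk of the bytes travel on the single multicast transmission.

def disseminate(chunks, drops_per_receiver):
    """chunks: list of (seq, payload). drops_per_receiver: one set of
    dropped seq numbers per receiver. Returns the (rx, seq) repairs."""
    repairs = []
    for rx, dropped in enumerate(drops_per_receiver):
        got = {seq: data for seq, data in chunks if seq not in dropped}
        for seq, data in chunks:           # NACK/repair phase over RC
            if seq not in got:
                got[seq] = data
                repairs.append((rx, seq))
        assert sorted(got) == [seq for seq, _ in chunks]  # all delivered
    return repairs

chunks = [(0, b"a"), (1, b"b"), (2, b"c")]
repairs = disseminate(chunks, [set(), {1}, {0, 2}])
# Only 3 of the 9 per-receiver deliveries needed the reliable channel.
```

As the receiver count grows, the fraction of traffic carried point-to-point shrinks toward the loss rate, which is the source of the speedup over an RC-only fan-out.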

Michael Howard co-founded market research firm Infonetics Research in 1990, and today is recognized worldwide as one of the industry's leading experts in
emerging markets, network operator trends, and user buying patterns. Michael leverages over 40 years of communications industry experience, including 22 years
in market research, to author numerous works year-round, including quarterly vendor market share and forecast reports, service provider surveys, Continuous
Research Service (CRS) analyst notes, white papers, and custom research. He specializes in mobile backhaul, small cells, carrier Ethernet, edge and core routers,
IP/MPLS control planes, IP VPNs, cloud access, 40GE/100GE, software-defined networks (SDNs), OpenFlow, and packet-optical transport.

Matthew Palmer, Partner at Wiretap Ventures

Bio

Matt has 20+ years of software-defined networking (SDN), cloud computing, SaaS, & computer networking experience. Matt is currently a Partner at Wiretap Ventures,
a management, marketing, and product consulting organization for Cloud Service Providers and Software-defined networking companies. Matt was most recently co-founder
and CEO at Pareto Networks which was acquired by Aerohive Networks. At Pareto, Matt successfully incubated and launched the company, being recognized as Most Innovative
Cloud Computing Provider at UP-START 2010 and Hot Emerging Vendor by CRN Magazine. Matt was previously VP Enterprise Planning and Operations at Juniper Networks where he
led the Enterprise Business Team, responsible for managing Juniper's $500M enterprise business. He previously held various executive roles at Juniper and Netscreen
leading security and enterprise networking strategy and corporate development activities, including strategic partnerships, investment and M&A. Prior, Matt
led business development at Qualys. Matt received his Bachelor of Science in Business Management from Indiana University and has 7 issued and 6 pending
patents in software-defined networking, security and cloud computing.