About the 40- and 100-Gbps Ethernet MAC and PHY MegaCore Function

The Altera® 40- and 100-Gbps Ethernet (40GbE and 100GbE) media access
controller (MAC) and PHY MegaCore® functions implement the IEEE 802.3ba 40G and 100G Ethernet
Standard with an option to support the IEEE
802.3ap-2007 Backplane Ethernet Standard. This product is included in the Altera MegaCore® IP Library and available from the Quartus II IP Catalog.

Note: The full product name, 40- and 100-Gbps Ethernet MAC and PHY MegaCore
Function, is shortened to 40-100GbE IP core in this document. In addition,
although multiple variations are available from the parameter editor, this
document refers to this product as a single IP core, because all variations
are configurable from the same parameter editor.

As illustrated, on the MAC client side you can choose a wide, standard
Avalon® Streaming (Avalon-ST) interface, or a narrower, custom streaming
interface. Depending on the variation you choose, the MAC client side Avalon-ST
interface is either 256 or 512 bits of data mapped to either four or ten
10.3125 Gbps transceiver PHY links, depending on data rate, or to four
25.78125 Gbps transceiver PHY links.

The 40GbE (XLAUI) interface has 4x10.3125 Gbps links. The 100GbE (CAUI)
interface has 10x10.3125 Gbps links. Several additional
options are available. For Arria V GZ, Stratix IV, and Stratix V devices, you can
configure a lower-rate 40GbE option with 4x6.25 Gbps links. For Stratix V devices
only, you can configure a 40GbE 40GBASE-KR4 variation to support Backplane Ethernet.
For Stratix V GT devices only, you can configure a 100GbE CAUI-4 option, with
4x25.78125 Gbps links.
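
As a quick sanity check on the link options above, the following Python sketch computes the aggregate line rate of each option and the rate that remains after 64B/66B encoding. The option labels and the payload_rate_gbps helper are illustrative names, not part of the IP core, and the 64/66 factor assumes that every lane carries 64B/66B-encoded data.

```python
from fractions import Fraction

# (number of lanes, per-lane rate in Gbps) for each option listed above.
LINK_OPTIONS = {
    "40GbE XLAUI": (4, Fraction("10.3125")),
    "100GbE CAUI": (10, Fraction("10.3125")),
    "100GbE CAUI-4": (4, Fraction("25.78125")),
    "40GbE lower-rate": (4, Fraction("6.25")),
}

def payload_rate_gbps(lanes, lane_rate):
    """Aggregate line rate scaled by the 64B/66B coding efficiency (64/66)."""
    return lanes * lane_rate * Fraction(64, 66)

for name, (lanes, lane_rate) in LINK_OPTIONS.items():
    rate = payload_rate_gbps(lanes, lane_rate)
    print(f"{name}: {float(lanes * lane_rate):.3f} Gbps line rate, "
          f"{float(rate):.2f} Gbps after 64B/66B")
```

Under these assumptions the standard options work out to exactly 40 and 100 Gbps, and the lower-rate 4x6.25 Gbps option works out to roughly 24.24 Gbps.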

The FPGA serial transceivers are compliant with the IEEE 802.3ba standard
XLAUI, CAUI, and CAUI-4 specifications. The IP core configures the transceivers to
implement the relevant specification for your IP core variation. You can connect the
transceiver interfaces directly to an external physical medium dependent (PMD) optical
module or to another device.

You can configure and generate most configurations of the
40‑100GbE IP core in transmit (TX) only, receive (RX) only, or duplex mode. The
100GbE CAUI-4 option and the 40GBASE-KR4 options are available in duplex mode
only.

The IP core provides standard MAC and physical coding sublayer (PCS) functions
with a variety of configuration and status registers. You can exclude the statistics
registers. If you exclude these registers, you can monitor the statistics
counter increment vectors that the IP core provides at the client side interface and
maintain your own counters.
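
If you exclude the statistics registers, client logic can accumulate the increment vectors itself. The following Python sketch models that bookkeeping in software, assuming each bit of a vector signals one event in the current clock cycle. The counter names and bit positions are hypothetical; the actual vector layout is defined by the IP core.

```python
# Hypothetical bit assignments; the real increment-vector layout is
# defined by the IP core, not by this sketch.
COUNTER_NAMES = ["frames_ok", "fcs_error", "runt"]

def accumulate(counters, increment_vector):
    """Add 1 to each counter whose bit is set in this cycle's vector."""
    for bit, name in enumerate(COUNTER_NAMES):
        if (increment_vector >> bit) & 1:
            counters[name] += 1
    return counters

counters = {name: 0 for name in COUNTER_NAMES}
for vector in (0b001, 0b011, 0b100, 0b001):  # one sample per clock cycle
    accumulate(counters, vector)
print(counters)
```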

40- and 100-Gbps Ethernet MAC and PHY IP Core Supported Features

The 40- and 100-Gbps Ethernet MAC and PHY IP core offers the following
features:

Parameterizable through the IP Catalog available with the Quartus II software.

Compliant with the IEEE 802.3ba-2010 High Speed Ethernet Standard available
on the IEEE website (www.ieee.org).

Avalon-ST data path interface connects to client logic with the start of frame
in the most significant byte (MSB) when optional adapters are used. The
interface has a data width of 256 or 512 bits, depending on the data rate.

The 40-100GbE IP core can support full wire line speed with a 64-byte
frame length and back-to-back or mixed length traffic, up to a
programmable frame size greater than 9600 bytes, with no dropped packets.
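
To put full wire line speed at a 64-byte frame length in concrete terms, the following Python sketch estimates the frame rate the client interface must sustain. It assumes the usual Ethernet per-frame overhead of an eight-byte preamble and a 12-byte average inter-packet gap; the helper name is illustrative.

```python
PREAMBLE_BYTES = 8   # Start byte, six preamble bytes, and SFD
MIN_IPG_BYTES = 12   # assumed average inter-packet gap

def max_frames_per_second(rate_bps, frame_bytes):
    """Frames per second when frames are sent back to back at rate_bps."""
    bits_per_frame = (frame_bytes + PREAMBLE_BYTES + MIN_IPG_BYTES) * 8
    return rate_bps / bits_per_frame

for label, rate in (("40GbE", 40e9), ("100GbE", 100e9)):
    print(f"{label}: {max_frames_per_second(rate, 64) / 1e6:.2f} Mfps at 64 bytes")
```

With these assumptions, back-to-back 64-byte traffic corresponds to about 59.52 million frames per second at 40 Gbps and about 148.81 million frames per second at 100 Gbps.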

For a detailed specification of the Ethernet protocol refer to the
IEEE 802.3ba-2010 High Speed Ethernet Standard.

40-100GbE IP Core Device Family and Speed Grade Support

The following sections list the device family and device speed grade
support offered by the
40‑100GbE IP core:

Device Family Support

Table 1. Altera IP Core Device Support Levels

Device Support Level

Definition

Preliminary

The IP core is verified with preliminary timing models for this
device family. The IP core meets all functional requirements, but might still
be undergoing timing analysis for the device family. It can be used in
production designs with caution.

Final

The IP core is verified with final timing models for this device
family. The IP core meets all functional and timing requirements for the device
family and can be used in production designs.

Table 2. 40-100GbE IP Core Device Family Support. Shows the level of support
offered by the 40‑100GbE IP core for each Altera device family.

IP Core Verification

To ensure functional correctness of the 40‑100GbE IP core, Altera performs extensive validation through both
simulation and hardware testing. Before releasing a version of the 40- and 100-Gbps Ethernet MAC and PHY IP core,
Altera runs comprehensive regression tests in the current
version of the Quartus® II software.

Altera verifies that the current version of the Quartus II software
compiles the previous version of each IP core. Any exceptions to this verification are
reported in the Altera IP Release Notes. Altera does not verify
compilation with IP core versions older than the previous release.

Randomized error injection
tests that inject Frame Check Sequence (FCS) field errors, runt packets, and
corrupt control characters, and then check for the proper response from the IP
core

Assertion based tests to
confirm proper behavior of the IP core with respect to the specification

Extensive coverage of our
runtime configuration space and proper behavior in all possible modes of
operation

Hardware Testing

Altera performs hardware testing of the key functions of the
40-100GbE MAC and PHY IP core. The Altera hardware tests of the
40‑100GbE IP core also ensure reliable solution coverage for
hardware-related areas such as synchronization and reset recovery.
The IP core is tested with Stratix IV and Stratix V devices.

Performance and Resource Utilization

The following sections provide performance and resource utilization data
for the
40GbE and 100GbE IP cores.

Resource Utilization for 40GbE IP Cores

Resource utilization changes if the statistics counters are configured
in the IP core. You can specify whether to include or not include the
statistics counters in the 40-100GbE parameter editor, but you cannot change
the setting dynamically.

The 24.24 Gbps variations of the 40-100GbE IP core use the same
resources as the standard 40GbE IP core variations. The 40GBASE-KR4 variations
require more resources only for the PHY component.

Resource Utilization for 100GbE IP Cores

Resource utilization changes if the statistics counters are configured
in the IP core. You can specify whether to include or not include the
statistics counters in the 40-100GbE parameter editor, but you cannot change
the setting dynamically.

MAC with custom streaming client interface and with statistics counters    26900    42300    12
alt_e100_mac_rx:mac_rx                                                      9500     15600    12
alt_e100_mac_tx:mac_tx                                                      12600    18400    0
alt_e100_mac_csr:mac_csr without statistics counters                        1200     1900     0
alt_e100_mac_csr:mac_csr with statistics counters                           4900     8300     0
PHY                                                                         8600     9900     0
alt_e100_phy_pcs:phy_pcs                                                    2900     46900    0
alt_e100_pcs_rx:pcs_rx                                                      16700    28500    0
alt_e100_pcs_tx:pcs_tx                                                      11200    16600    0
alt_e100_phy_csr:phy_csr                                                    1100     1700     0
alt_e100_phy_pma_siv:pma                                                    600      500      0

In the standard 100GbE variations, as in the 40GbE variations, some
resource utilization numbers decrease when statistics counters are not
configured in the IP core. For example, compare the values for the MAC with
custom streaming client interface on a Stratix IV device with statistics
counters included or not included. When counters are included, the MAC requires
26600 ALMs, but when the counters are not included, the MAC requires 23000
ALMs. The difference is 3600 ALMs. In a Stratix V device, the difference is
2900 ALMs.

Resource Utilization for 100GbE CAUI–4 IP Cores

Resource utilization changes if the statistics counters are configured
in the IP core. You can specify whether to include or not include the
statistics counters in the 40-100GbE parameter editor, but you cannot change
the setting dynamically.

Table 8. 100GbE CAUI–4 IP Core FPGA Resource Utilization. Lists the resources
and expected performance for selected variations of the 100GbE CAUI-4 IP core
with statistics counters included or not included. The results were obtained
using the Quartus II software v13.1 for a Stratix V 5SGTMC7K2F40C2 device.

Top-level modules are in bold.

The numbers of ALMs and logic registers are rounded up to the nearest 100.

The numbers of ALMs, before rounding, are the ALMs needed numbers from the
Quartus II Fitter Report.

Getting Started

Installing and Licensing IP Cores

The Altera IP Library provides many useful IP core functions for
production use without purchasing an additional license.
You can evaluate
any Altera® IP core in simulation and compilation in the
Quartus® II software using the OpenCore® evaluation feature. Some Altera IP cores, such as MegaCore® functions, require that you purchase a
separate license for production use. You can use the OpenCore Plus feature to evaluate IP
that requires purchase of an additional license until you are satisfied with the
functionality and performance. After you purchase a license, visit the Self Service
Licensing Center to obtain a license number for any Altera product.

Figure 2. IP Core Installation Path

Note: The default IP installation directory on Windows is
<drive>:\altera\<version number>; on Linux it is
<home directory>/altera/<version number>.

OpenCore Plus IP Evaluation

Altera's free OpenCore® Plus feature allows you to evaluate
licensed MegaCore® IP cores in simulation and hardware before
purchase.
You need only purchase a license for MegaCore IP cores if you
decide to take your design to production. OpenCore Plus supports the following evaluations:

Simulate the behavior of a
licensed IP core in your system.

Verify the functionality,
size, and speed of the IP core quickly and easily.

In the IP Catalog, locate and double-click the name of the IP
core to customize. The New IP Variation window appears.

Specify a top-level name for your custom IP variation. The parameter editor
saves the IP variation settings in a file named <your_ip>.qip.

Click OK. The parameter
editor appears.

Specify the
parameters and options for your IP variation in the parameter editor, including
one or more of the following. Refer to your IP core user guide for information
about specific IP core parameters.

After you click Finish and optionally follow the additional step
to generate a simulation testbench and example project, if available for
your IP core variation, the parameter editor adds the
top-level .qip file to the current
project automatically. If you are prompted to manually add the .qip file to the project, click
Project > Add/Remove Files in Project to add the file.

IP Core Parameters

Files Generated for the 40-100GbE IP Core

The Quartus II software version 14.1 generates the following
output for your 40-100GbE IP core.

Figure 3. IP Core Generated Files

Simulating the IP Core

You can simulate your
40GbE or 100GbE IP core variation with the functional simulation
model and the testbench
or example design
generated with the IP core. The functional simulation model is a
cycle-accurate model that allows for fast functional simulation of your IP core
instance using industry-standard VHDL or Verilog HDL simulators. If your IP core variation does not generate a matching testbench,
you can create your own testbench to exercise the IP core functional simulation
model.

The functional simulation
model and testbench files are generated in project subdirectories. These
directories also include scripts to compile and run the example design.

Note: Use the simulation models only for simulation and not for synthesis
or any other purposes. Using these models for synthesis creates a nonfunctional
design.

In the top-level wrapper file for your simulation project, you can set the
FAST_SIMULATION parameter to enable simulation optimization. Parameters are set
through the IP core parameter editor. In general, you should not change them
manually. The only exception is the FAST_SIMULATION parameter. You should set
the FAST_SIMULATION parameter on the PHY blocks by adding the following line to
the top-level wrapper file:

defparam <dut instance>.FAST_SIMULATION = 1;

Note: You can use the example testbench as a guide for setting the
simulation parameters in your own simulation environment.
This line is already present in the Altera-provided
testbench that is generated with the IP core.

Integrating Your IP Core in Your Design

When you integrate your IP core instance in your design, you must pay attention
to the following items:

Pin Assignments

When you integrate your
40-100GbE IP core instance in your design, you must make appropriate
pin assignments. You can create a virtual pin to avoid making specific pin
assignments for top-level signals while you are simulating and not ready to map
the design to hardware.

When you configure the Altera Transceiver Reconfiguration Controller,
you must specify the number of reconfiguration interfaces. The number of
reconfiguration interfaces required for the 40GbE and 100GbE IP cores depends on the IP core variation.

Table 13. Number of Reconfiguration Interfaces. Lists the number of reconfiguration interfaces you should
specify for the Altera Transceiver Reconfiguration Controller for your Arria V GZ or Stratix V 40-100GbE IP core that includes a PHY component.

You can configure your reconfiguration controller with additional
interfaces if your design connects with multiple transceiver IP cores. You can leave
other options at the default settings or modify them for your preference.

You should connect the reconfig_to_xcvr and reconfig_from_xcvr ports of the 40-100GbE IP core to the
corresponding ports of the reconfiguration controller.

The CAUI–4 variations have four reconfiguration channels,
numbered consecutively from reconfig_to_xcvr0 and
reconfig_from_xcvr0 to reconfig_to_xcvr3 and reconfig_from_xcvr3. The CAUI–4 reconfiguration channels must be
connected to the four reconfiguration controller groupings. The reconfiguration
controller groupings include ch0_2_from_xcvr,
ch3_5_from_xcvr, ch6_8_from_xcvr, and ch9_11_from_xcvr.

You must also connect the mgmt_clk_clk and mgmt_rst_reset ports
of the Altera Transceiver Reconfiguration Controller. The mgmt_clk_clk port must have a clock setting in the range of
100–125 MHz; this setting can be shared with the 40-100GbE IP core clk_status port.
The mgmt_rst_reset port must be deasserted before,
or deasserted simultaneously with, the 40-100GbE IP core pma_arst_ST
port.

Refer to the example project for RTL that connects the Altera
transceiver reconfiguration controller to the IP core.

The CAUI-4 configuration requires 12 interfaces split into
four groups of three; the interface grouping should be set
to 3, 3, 3, 3.
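
The relationship between an interface grouping of 3, 3, 3, 3 and the grouping names ch0_2, ch3_5, ch6_8, and ch9_11 can be sketched in Python as follows. The group_names helper is illustrative, and the pairing of CAUI-4 channel n with the n-th grouping is an assumption; refer to the example project RTL for the actual connections.

```python
def group_names(grouping):
    """Return names such as 'ch0_2' for consecutive interface groups."""
    names, start = [], 0
    for size in grouping:
        names.append(f"ch{start}_{start + size - 1}")
        start += size
    return names

groups = group_names([3, 3, 3, 3])
# Assumed pairing: CAUI-4 channel n connects to the n-th grouping.
for channel, group in enumerate(groups):
    print(f"reconfig_from_xcvr{channel} -> {group}_from_xcvr")
```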

Placement Settings for the 40-100GbE IP Core

The Quartus II software provides the options to specify design partitions and
LogicLock™ regions for incremental compilation, to control
placement on the device. To achieve timing closure for your design, you might need to
provide floorplan guidelines using one or both of these features.

The appropriate floorplan is always design-specific, and depends on your
full design.

40-100GbE IP Core Testbenches

Altera provides a testbench and an example design with most variations of the
40-100GbE IP core. The testbench is available for simulation of your IP core,
and the example design targets a C2 speed grade device and can be run on
hardware. You can run the testbench to observe the IP core behavior on
the various interfaces in simulation.

Altera offers testbenches for the following configurations:

Non-40GBASE-KR4 IP core
variations that have all of the following properties:

Includes both MAC and
PHY components (Core options has the value of
MAC & PHY)

Full duplex
(Duplex mode has the value of
Full Duplex)

40GBASE-KR4 IP core
variations that have all of the following properties:

Includes both MAC and
PHY components (Core options has the value of
MAC & PHY)

With adapters
(MAC client interface has the value of
Avalon-ST interface)

Without Synchronous
Ethernet support (Enable SyncE support is turned off)

When you generate your IP core and turn on
Generate example design, the Quartus II software
generates the testbench and example design for your variation. If your IP core
variation does not meet the criteria for a testbench, the generation process
does not create a testbench. Turning on
Generate example design does not force the
software to generate a testbench if none is defined for your variation.

MAC-only, PHY-only, TX-only, and RX-only IP core variations
do not generate an example design and testbench. 40GBASE-KR4 IP core variations
with the custom streaming interface, without RX equalization enabled, with
Synchronous Ethernet support, or with the link training microprocessor
interface, do not generate a testbench. (However, 40GBASE-KR4 IP core
variations that conform to all the requirements with the exception of the
requirement of adapters, do generate an example design that runs in hardware).

Conceptually, the testbenches for the 40‑100GbE IP cores with
adapters (IP cores with an Avalon-ST client interface) and the testbenches for the
40‑100GbE IP cores without adapters (IP cores with the custom streaming client
interface) are identical, except for the bandwidth. The following sections first
describe the testbenches that include adapters and then describe the testbenches without
adapters.

You can simulate the testbench that you generate with your IP core variation.
The testbench illustrates packet traffic, in addition to providing information regarding
the transceiver PHY. The non-40GBASE-KR4 testbenches tie
off the reconfiguration control interface for your IP core, and do not exercise
transceiver reconfiguration. However, the 40GBASE-KR4 testbench exercises
auto-negotiation and link training, in addition to generating and checking packet
traffic.

Testbenches with Adapters

Figure 4. 40-100GbE IP Core Testbenches with Adapters. Illustrates the top-level modules of the non-40GBASE-KR4 40GbE
and 100GbE example testbenches that use adapters. In the file names, * denotes
40 for 40GbE IP cores and 100 for 100GbE IP cores.

Figure 5. 40GBASE-KR4 40GbE IP Core Testbench with Adapters. Illustrates the top-level modules of the 40GBASE-KR4 example
testbench that uses adapters. To support the simulation of auto-negotiation,
the testbench uses two instances of the IP core instead of configuring the IP
core in loopback mode.

The testbench wrapper file.
For non-KR4 variations, this file includes all of the testbench modules.

alt_e40_avalon_tb_packet_gen.v

The packet generator.
This file is present only for 40GBASE-KR4 variations.

alt_e40_avalon_tb_packet_gen_sanity_check.v

The packet checker.
This file is present only for 40GBASE-KR4 variations.

alt_e40_avalon_tb_sample_tx_rom.hex

The sample TX ROM.
This file is present only for 40GBASE-KR4 variations.

alt_e40_avalon_tb_sample_tx_rom.v

Lists the contents of the sample TX ROM
(alt_e40_avalon_tb_sample_tx_rom.hex).
This file is present only for 40GBASE-KR4 variations.

Testbench Scripts

run_vsim.do

The ModelSim script to run the testbench.

run_vcs.sh

The Synopsys VCS script to run the testbench.

run_ncsim.sh

The Cadence NCSim script to run the testbench.

Figure 6. Typical 40GbE Traffic on the Avalon-ST Interface Using the Four-
to Two-Word Adapters. Shows typical traffic from the simulation testbench created using
the
<instance_name>_example/alt_e40_e100/example_testbench/run_vsim.do
script in ModelSim.

Note: Client logic must maintain the
l4_tx_valid signal asserted while asserting SOP,
through the assertion of EOP. Client logic should not pull this signal low
during a packet transmission.

The markers in the figure show the following sequence of events:

At marker 1, the
application asserts
l4_tx_startofpacket, indicating the beginning of a
TX packet.

At marker 2, the
application asserts
l4_tx_endofpacket, indicating the end of the TX
packet. The value on
l4_tx_empty[4:0] indicates that the 2 least
significant bytes of the last data cycle are empty.

At marker 3, the IP core
asserts
l4_rx_startofpacket, indicating the beginning of an
RX packet. A second transfer has already started on the TX bus.

At marker 4, the 40GbE IP
core deasserts
l4_rx_valid, indicating that the IP core does not
have new valid data to send to the client on
l4_rx_data[255:0].
l4_rx_data[255:0] remains valid and unchanged for a
second cycle.

At marker 5, the 40GbE IP
core asserts
l4_rx_valid, indicating that it has valid data
to send to the client on
l4_rx_data[255:0].

At marker 6, the 40GbE IP
core deasserts
l4_rx_valid, indicating that it does not have new
valid data to send to the client on
l4_rx_data[255:0].
l4_rx_data[255:0] remains unchanged for a second
cycle.

At marker 7, the 40GbE IP
core asserts
l4_rx_valid, indicating that it has valid data
to send to the client on
l4_rx_data[255:0].

At marker 8, the 40GbE IP
core deasserts
l4_rx_valid, indicating that the 40GbE IP core does
not have new valid data to send to the client on
l4_rx_data[255:0].
l4_rx_data[255:0] remains unchanged for a second
cycle.

At marker 9, the IP core
asserts
l4_rx_endofpacket, indicating the end of the RX
packet.
l4_rx_empty[4:0]
has a value of 0x1D, indicating that the 29 least significant
bytes of the last cycle of the RX packet are empty.

Note: The ready latency on the TX client interface with adapters is 0.
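
The value on the empty signals follows from the packet length and the bus width. As a minimal sketch, assuming the 256-bit (32-byte) client data bus of the 40GbE IP core with adapters, the following Python helpers compute the number of data cycles and the number of empty bytes in the last cycle. The helper names and the 67-byte packet length are made up for illustration.

```python
def empty_bytes(packet_bytes, bus_bytes=32):
    """Unused bytes in the final cycle of a packet on the client bus."""
    return (bus_bytes - packet_bytes % bus_bytes) % bus_bytes

def cycles(packet_bytes, bus_bytes=32):
    """Number of valid data cycles needed to carry the packet."""
    return -(-packet_bytes // bus_bytes)  # ceiling division

print(hex(empty_bytes(67)), cycles(67))
```

A 67-byte packet occupies three data cycles and leaves 29 bytes empty in the last one, which corresponds to the 0x1D value described at marker 9.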

Testbenches without Adapters

Figure 7. 40-100GbE IP Core Testbench Without Adapters. Illustrates the top-level modules of the 40GbE and 100GbE
example testbenches that do not use adapters. In the file names, * denotes 40
for 40GbE IP cores and 100 for 100GbE IP cores.

Understanding the Testbench Behavior

The non-40GBASE-KR4 testbenches send traffic through the IP core in
transmit-to-receive loopback mode, exercising the transmit side and receive side of
the IP core in the same data flow. These testbenches send traffic to allow the
Ethernet lanes to lock, and then send packets to the transmit client data interface
and check the data as it returns through the receive client data interface.

The 40GBASE-KR4 testbench sends traffic through the two IP cores in each
direction, exercising the receive and transmit sides of both IP cores. This
testbench exercises auto-negotiation and link training, and then sends and checks
packets in data mode.

The 40-100GbE IP cores implement
virtual lanes as defined in the IEEE 802.3ba-2010 40G and 100G
Ethernet Standard. The 40GbE IP cores are fixed at four virtual lanes and
each lane is sent over a 10 Gbps physical lane. The 100GbE IP cores are fixed at 20
virtual lanes; the 20 virtual lanes are typically bit-interleaved over ten 10-Gbps
physical lanes. When the lanes arrive at the receiver the lane streams are in an
undefined order. Each lane carries a periodic PCS-VLANE alignment tag to restore the
original ordering. The simulation establishes a random permutation of the physical
lanes that is used for the remainder of the simulation.
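
The reordering step can be modeled in a few lines of Python. This sketch is a simplification: it attaches a single tag to each lane stream instead of a periodic alignment marker, and it ignores skew and bit interleaving. The function names are illustrative.

```python
import random

NUM_VIRTUAL_LANES = 20  # 100GbE; the 40GbE IP cores use four

def transmit(virtual_lanes, rng):
    """Tag each lane with its index, then deliver the lanes in random order."""
    tagged = list(enumerate(virtual_lanes))
    rng.shuffle(tagged)
    return tagged

def restore_order(received):
    """Use the alignment tag to put the lane streams back in order."""
    ordered = [None] * len(received)
    for tag, data in received:
        ordered[tag] = data
    return ordered

lanes = [f"lane{n}-data" for n in range(NUM_VIRTUAL_LANES)]
received = transmit(lanes, random.Random(0))
print(restore_order(received) == lanes)
```

Whatever permutation the link applies, the receiver recovers the original ordering because each stream identifies its own virtual lane.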

Within each virtual lane stream, the data is 64B/66B encoded. Each word has two
framing bits which are always either 01 or 10, never 00 or 11. The RX logic uses
this pattern to lock onto the correct word boundaries in each serial stream. The
process is probabilistic due to false locks on the pseudo-random scrambled stream.
To reduce hardware costs, the receiver does not test alignments in parallel;
consequently, the process can be somewhat time-consuming in simulation.
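
The word-boundary search can be illustrated with a small Python model. This is a sketch under simplifying assumptions: random bits stand in for the scrambled payload, the search requires a fixed number of consecutive valid framing-bit pairs, and candidate offsets are tried one at a time. The function names, the skew of 17 bits, and the check length of 64 blocks are illustrative choices.

```python
import random

BLOCK_BITS = 66

def make_stream(num_blocks, skew, rng):
    """Random bits: `skew` junk bits, then 66-bit blocks with 01/10 headers."""
    bits = [rng.getrandbits(1) for _ in range(skew)]
    for _ in range(num_blocks):
        first = rng.getrandbits(1)
        bits += [first, 1 - first]                       # framing bits 01 or 10
        bits += [rng.getrandbits(1) for _ in range(64)]  # stand-in for scrambled payload
    return bits

def find_word_boundary(bits, blocks_to_check):
    """Try one candidate offset at a time, as a serial (non-parallel) search."""
    for offset in range(BLOCK_BITS):
        if all(bits[offset + n * BLOCK_BITS] != bits[offset + n * BLOCK_BITS + 1]
               for n in range(blocks_to_check)):
            return offset
    return None

stream = make_stream(num_blocks=80, skew=17, rng=random.Random(1))
print(find_word_boundary(stream, blocks_to_check=64))
```

A wrong offset passes each individual check with probability one half, so a short check window can lock falsely. Requiring many consecutive passes makes a false lock very unlikely, at the cost of a longer search.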

In the 40GBASE-KR4 testbench, some register values are set to produce a shorter
runtime. For example, timeout counters and the number of steps used in link training
are set to smaller values than would be prudent in hardware. To override this
behavior and use the normal settings in simulation, add the following line to your
IP core variation top-level file or to the testbench top-level file,
alt_e40_avalon_kr4_tb.sv:

`define ALTERA_RESERVED_XCVR_FULL_KR_TIMERS

Both the word lock and the alignment marker lock implement hysteresis
as defined in the
IEEE 802.3ba-2010 40G and 100G Ethernet Standard. Multiple
successes are required to acquire lock and multiple failures are required to
lose lock. The “fully locked” messages in the simulation log indicate the point
at which a physical lane has successfully identified the word boundary and
virtual lane assignment.
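
The hysteresis idea can be sketched as a small Python state machine. The thresholds and the use of consecutive counts are illustrative simplifications, not the state machines defined in the standard.

```python
class LockHysteresis:
    """Acquire lock after several good checks; drop it after several bad ones."""

    def __init__(self, good_needed=4, bad_needed=4):  # illustrative thresholds
        self.good_needed, self.bad_needed = good_needed, bad_needed
        self.good = self.bad = 0
        self.locked = False

    def update(self, check_passed):
        """Record one check result and return the current lock state."""
        if check_passed:
            self.good += 1
            self.bad = 0
        else:
            self.bad += 1
            self.good = 0
        if not self.locked and self.good >= self.good_needed:
            self.locked = True
        elif self.locked and self.bad >= self.bad_needed:
            self.locked = False
        return self.locked

lock = LockHysteresis()
history = [lock.update(ok) for ok in [True] * 4 + [False] * 3 + [True] + [False] * 4]
print(history)
```

In this run, lock is acquired on the fourth success, survives three failures, and is lost only after four consecutive failures.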

In the event of a catastrophic error, the RX PCS automatically
attempts to reacquire alignment. The MAC properly identifies errors in the
datastream.

Simulating the 40‑100GbE IP Core With the Testbenches

You can simulate the 40‑100GbE IP
core using the Altera-supported versions of the Mentor Graphics ModelSim® SE, Cadence NCSim, and Synopsys VCS simulators for the
current version of the Quartus II
software.
The ModelSim-AE simulator does not have the capacity to simulate this IP core.

The example testbenches simulate packet traffic at the digital level.
The testbenches do not require special SystemVerilog class libraries.

The top-level testbench file for non-40GBASE-KR4 variations consists of a simple
packet generator and checker and one core in a loopback configuration. The packet generator skews and reorders its transmitter digital
output to emulate actual transceiver behavior and optical cabling lane permutations.

The top-level testbench file for 40GBASE-KR4 variations consists of a symmetric
arrangement with two IP cores and traffic between them. For each IP core there is a
packet generator to send traffic on the TX side of the IP core and a packet checker to
check the packets it receives from the other IP core. The two IP cores communicate with
each other through their Ethernet link, in which the testbench injects random skew. The
40GBASE-KR4 testbench connects each IP core to a reconfiguration bundle,
and exercises auto-negotiation, link training, and data mode.

The example testbenches contain the test files and run scripts for the
ModelSim, Cadence, and Synopsys simulators. The run scripts use the file lists
in the wrapper files. When you launch a simulation from the original directory,
the relative filenames in the wrapper files allow the run script to locate the
files correctly.
You can access design files from any location if your
directory structure matches the structure assumed in the run script path names.

The following
examples
provide directions for generating the testbench and running tests
with the ModelSim, Cadence, and Synopsys simulators.

Note: When prompted at the start of generation, you must turn on Generate
example design. Turning on Generate example design is the only process that
generates a functional testbench and a functional example design.

When the IP core is generated in <working directory>, the testbench and example project
are generated in <working directory>/<IP core variation>_example/alt_e40_e100.

The directory with the testbench and example project has two
subdirectories:

example, which contains the example design
project

example_testbench, which contains the
demonstration testbench

Simulating with the ModelSim Simulator

To run the simulation in the supported versions of the ModelSim simulation tool,
follow these steps:

Change directory to the <variation_name>_example/alt_e40_e100/example_testbench
directory.

In the command line, type:
vsim -c -do run_vsim.do

The example testbench will run and pass.

The ModelSim-AE simulator does not have the capacity to simulate this IP core.

Simulating with the NCSim Simulator

To run the simulation in the supported versions of the NCSim simulation tool,
follow these steps:

Change directory to the <variation_name>_example/alt_e40_e100/example_testbench
directory.

In the command line, type:
sh run_ncsim.sh

The example testbench will run and pass.

Simulating with the VCS Simulator

To run the simulation in the supported versions of the VCS simulation tool,
follow these steps:

Change directory to the <variation_name>_example/alt_e40_e100/example_testbench
directory.

In the command line, type:
sh run_vcs.sh

The example testbench will run and pass.

Testbench Output Example: 40GbE IP Core with Adapters

This section shows successful simulation using the 40GbE IP core with adapters testbench
(alt_40gbe_tb.sv). The testbench connects the Ethernet TX lanes to the
Ethernet RX lanes, so that the IP core is in an external loopback configuration. In
simulation, the testbench resets the IP core and waits for lane alignment and deskew to
complete successfully. The packet generator sends ten packets on the Ethernet TX lanes
and the packet checker checks the packets when the IP core receives them on the Ethernet
RX lanes.

Testbench Output Example: 100GbE IP Core with Adapters

This section shows successful simulation using the 100GbE IP core with adapters
testbench (alt_100gbe_tb.sv).
The testbench connects the Ethernet TX lanes to the Ethernet RX lanes, so that
the IP core is in an external loopback configuration. In simulation, the
testbench resets the IP core and waits for lane alignment and deskew to
complete successfully. The packet generator sends ten packets on the Ethernet
TX lanes and the packet checker checks the packets when the IP core receives
them on the Ethernet RX lanes.

Compiling the Full Design and Programming the FPGA

You can use the Start Compilation command on the
Processing menu in the Quartus II software to compile your design. After
successfully compiling your design, program the targeted Altera device with the
Programmer and verify the design in hardware.

Functional Description

This chapter provides a detailed description of the
40‑100GbE IP core. The chapter begins with a high-level overview of
typical Ethernet systems and then provides detailed descriptions of the MAC,
transmit (TX) and receive (RX) datapaths, signals, register descriptions, and
an Ethernet glossary. This chapter includes the following sections:

High Level System Overview

40-100GbE MAC and PHY Functional Description

The Altera 40‑100GbE IP core implements the 40‑100GbE MAC in
accordance with the IEEE 802.3ba-2010 40G and 100G Ethernet
Standard. This IP core handles the frame encapsulation and flow of data
between client logic and the Ethernet network via a 40‑100GbE PCS and
PMA (PHY).

In the transmit direction, the MAC accepts client frames, and inserts
inter-packet gap (IPG), preamble, start of frame delimiter (SFD), padding, and CRC bits
before passing them to the PHY. The PHY encodes the MAC frame as required for reliable
transmission over the media to the remote end.

In the receive direction, the PHY passes frames to the MAC. The MAC accepts
frames from the PHY, performs checks, updates statistics counters, strips out the CRC,
preamble, and SFD, and passes the rest of the frame to the client. In RX preamble
pass-through mode, the MAC passes on the preamble and SFD to the client instead of
stripping them out.

The MAC includes the following interfaces:

Datapath
client-interface–The following options are available:

40GbE with
adapters—Avalon‑ST, 256 bits

40GbE—Custom streaming,
128 bits

100GbE with
adapters—Avalon‑ST, 512 bits

100GbE—Custom streaming, 320 bits

Datapath PHY
side–The following options are available:

40GbE—XLAUI

100GbE—CAUI, CAUI–4

Management
interface—Avalon-MM host slave interface for MAC management. This interface has
a data width of 32 bits and an address width of 16 bits.

The PHY includes the following interfaces:

Datapath MAC–The
following options are available:

40GbE—XLAUI

100GbE—CAUI, CAUI–4

Datapath Ethernet
interface–The following options are available:

40GbE—Four 10.3125 Gbps
serial links

40GbE
24.24—Four 6.25 Gbps serial links

100GbE—Ten 10.3125 Gbps
serial links

100GbE CAUI–4—Four
25.78125 Gbps serial links

40-100GbE IP Core TX Datapath

The TX MAC module receives the client payload data with the destination and
source addresses and then adds, appends, or updates various header fields in accordance
with the configuration specified. The MAC does not modify the destination address or
the payload received from the client. However, the TX MAC module adds a preamble (if
the IP core is not programmed to receive the preamble from user logic), pads the
payload of frames greater than eight bytes to satisfy the minimum Ethernet frame
payload of 46 bytes, and calculates the CRC over the entire MAC frame. (If padding is
added, it is also included in the CRC calculation.) The TX MAC module can also modify
the source address, and always inserts IDLE bytes to maintain an average IPG.

Figure 9. Typical Client Frame at the Transmit Interface. Illustrates the changes that the TX MAC makes to the client frame.
This figure uses the following notational conventions:

<p> = payload size = 0–1500 bytes, or 9600 bytes for jumbo frames.

<s> = number of padding bytes = 0–46 bytes.

<l> = number of IPG bytes.

The following sections describe the functions that the TX module
performs:

Preamble, Start, and SFD Insertion

In the TX datapath, the MAC prepends an eight-byte preamble that begins with a
Start byte (0xFB) to the client frame.
This MAC module also incorporates the functions of the
reconciliation sublayer.

The source of the 6‑byte preamble and 1‑byte SFD
depends on whether you turn on the TX preamble pass-through feature by setting bit 1 of the
Preamble Pass-Through Configuration register at offset 0x125.

If the TX preamble pass-through feature is
turned on, the client provides the eight-byte preamble (including the Start byte and the final 1-byte SFD) on the data bus. However, the IP core overwrites the Start byte that the client provides,
and passes on only the intermediate six bytes and the SFD. The client is
responsible for providing
an appropriate SFD byte.

Address Insertion

The client provides the destination MAC address and the source address
of the local MAC. However, if enabled by bit [31] of the
MADDR_CTRL register at offset 0x140, the source MAC
address can be replaced by the source address contained in two 32-bit MAC
registers:
SRC_AD_LO and
SRC_AD_HI.

Length/Type Field Processing

This two-byte header represents either the length of the payload or the
type of MAC frame. When the value of this field is equal to or greater than
1536 (0x600) it indicates a type field. Otherwise, this field provides the
length of the payload data that ranges from 0–1500 bytes. The TX MAC does not
modify this field before forwarding it to the network.
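The Length/Type rule above can be sketched in a few lines of Python. This is only an illustration of the field interpretation, not part of the IP core; the function name and the "undefined" label for values between 1501 and 1535 are assumptions made for this example.

```python
TYPE_MIN = 0x0600    # 1536: values at or above this are type values
LENGTH_MAX = 1500    # largest payload length the field can express


def classify_length_type(value: int) -> str:
    """Interpret the two-byte Length/Type field of a MAC frame."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("Length/Type is a 16-bit field")
    if value >= TYPE_MIN:
        return "type"
    if value <= LENGTH_MAX:
        return "length"
    # 1501-1535 is neither a valid length nor a type value.
    return "undefined"


print(classify_length_type(0x0800))  # type
print(classify_length_type(46))      # length
```

Because the TX MAC forwards this field unchanged, a check like this belongs in client logic or in a testbench rather than in the MAC itself.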

Frame Padding

When the length of the client frame is less than 64 bytes (meaning the payload
is less than 46 bytes) and greater than eight bytes, the TX MAC module inserts pad bytes
(0x00) after the payload to create a frame length equal to the minimum size of 64
bytes.
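A minimal software model of the padding rule, assuming the client frame consists of the destination address, source address, Length/Type field, and payload, and that the 64-byte minimum includes the 4-byte FCS that the MAC appends afterwards. The function and constant names are hypothetical.

```python
MIN_FRAME_WITH_FCS = 64  # minimum Ethernet frame size, FCS included
FCS_BYTES = 4
HEADER_BYTES = 14        # destination (6) + source (6) + Length/Type (2)


def pad_client_frame(frame: bytes) -> bytes:
    """Zero-pad a client frame so that frame + FCS reaches 64 bytes."""
    if len(frame) <= 8:
        # The text only describes padding for frames longer than eight bytes.
        raise ValueError("frames of eight bytes or fewer are not covered by the padding rule")
    target = MIN_FRAME_WITH_FCS - FCS_BYTES  # 60 bytes before the FCS
    if len(frame) >= target:
        return frame
    return frame + bytes(target - len(frame))  # pad bytes are 0x00


short = bytes(HEADER_BYTES) + b"\x01" * 10   # 10-byte payload
print(len(pad_client_frame(short)))          # 60
```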

Frame Check Sequence (CRC-32) Insertion

The TX MAC computes and inserts a CRC32 checksum in the transmitted MAC frame.
The frame check sequence (FCS) field contains a 32-bit CRC value. The MAC computes the
CRC32 over the frame bytes that include the source address, destination address, length,
data, and pad (if applicable). The CRC checksum computation excludes the preamble, SFD,
and FCS. The encoding is defined by the following generating polynomial:

G(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
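For checking frames in a testbench, the same CRC-32 is available in the Python standard library as zlib.crc32. The sketch below appends an FCS to a frame (destination address through pad) and verifies it by the well-known CRC-32 residue. The helper names are hypothetical, and the code models only the arithmetic, not the IP core's datapath.

```python
import struct
import zlib

CRC32_RESIDUE = 0x2144DF1C  # zlib.crc32 over any frame followed by its correct FCS


def append_fcs(frame: bytes) -> bytes:
    """Append the 4-byte FCS; the CRC value is stored least significant byte first."""
    fcs = zlib.crc32(frame) & 0xFFFFFFFF
    return frame + struct.pack("<I", fcs)


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Return True when the trailing four bytes are the correct FCS."""
    return (zlib.crc32(frame_with_fcs) & 0xFFFFFFFF) == CRC32_RESIDUE


frame = bytes(60)                 # placeholder 60-byte frame, already padded
print(fcs_ok(append_fcs(frame)))  # True
```

The preamble and SFD are not passed to append_fcs, which matches the exclusion described above.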

Inter‑Packet Gap Generation and Insertion

The TX MAC maintains the minimum
inter‑packet gap (IPG) between transmitted frames required by the IEEE 802.3
Ethernet standard. The standard requires an average minimum IPG of 96 bit times
(or 12 byte times).
The deficit idle counter maintains the average IPG of 12 bytes.

The MAC adjusts the IPG to compensate for Alignment Marker
insertion by the PHY. You can program this adjustment using the
IPG_DEL_PERIOD and
IPG_DEL_ENABLE registers at offsets 0x126 and 0x127,
respectively. By default, the adjustment removes one Idle byte for every 16384
bytes. This removal rate corresponds to the bandwidth used by the Alignment
Marker that the PHY inserts in the outgoing Ethernet communication. You can
modify the value in the
IPG_DEL_PERIOD register to specify more or less frequent
removal of Idle bytes from the sequence.
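The default removal rate is easy to quantify. The following sketch assumes only the ratio stated above (one Idle byte per 16384 bytes) and does not model how the IPG_DEL_PERIOD register encodes it; the names are hypothetical.

```python
DEFAULT_PERIOD_BYTES = 16384  # default: one Idle byte removed per 16384 bytes


def idle_bytes_removed(bytes_sent: int, period: int = DEFAULT_PERIOD_BYTES) -> int:
    """Number of Idle bytes removed while bytes_sent bytes cross the MAC."""
    return bytes_sent // period


def removed_fraction(period: int = DEFAULT_PERIOD_BYTES) -> float:
    """Fraction of the byte stream given up to alignment markers."""
    return 1.0 / period


bytes_per_second_100g = 100_000_000_000 // 8       # 12.5e9 bytes per second
print(idle_bytes_removed(bytes_per_second_100g))   # 762939
print(f"{removed_fraction():.6%}")                 # 0.006104%
```

A smaller period removes Idle bytes more often, and a larger period removes them less often.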

40-100GbE IP Core TX Data Bus Interfaces

This section describes the TX data bus at the user interface and
includes the following topics:

40-100GbE IP Core TX Data Bus with Adapters (Avalon-ST Interface)

The
40-100GbE IP core TX datapath with adapters employs the Avalon-ST
protocol. The Avalon-ST protocol is a synchronous point-to-point,
unidirectional interface that connects the producer of a data stream (source)
to a consumer of data (sink). The key properties of this interface include:

Start of packet (SOP) and
end of packet (EOP) signals delimit frame transfers.

A valid signal qualifies
signals from source to sink.

The sink applies
backpressure to the source by using the ready signal. The source typically
responds to the deassertion of the ready signal from the sink by driving the
same data until the sink can accept it. The
readyLatency defines the relationship between
assertion and deassertion of the ready signal, and the cycles that are considered
to be
ready for data transfer. The
readyLatency on the TX client interface is zero
cycles.
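With a readyLatency of zero, a beat is transferred in exactly those cycles in which valid and ready are both asserted. The following is a cycle-by-cycle sketch of that rule, not a model of the IP core; the function name and the sample waveforms are made up for illustration.

```python
def transfer_cycles(valid, ready):
    """Return the cycle indices in which a beat is transferred (readyLatency = 0)."""
    return [i for i, (v, r) in enumerate(zip(valid, ready)) if v and r]


# The source holds valid (and the same data) while the sink deasserts ready.
valid = [1, 1, 1, 1, 0]
ready = [1, 0, 0, 1, 1]
print(transfer_cycles(valid, ready))  # [0, 3]
```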

Altera provides an Avalon-ST interface with adapters for both the 40GbE and 100GbE IP cores. The
Avalon-ST interface requires that the start of packet (SOP) always be in the MSB,
simplifying the interpretation and processing of incoming data.

The TX adapter for the 100GbE IP core increases the
client interface Avalon-ST bus width from 5 words (320 bits) to 8 words (512 bits).
The TX adapter for the 40GbE IP core increases the client interface Avalon-ST bus
width from 2 words (128 bits) to 4 words (256 bits). In both cases the client
interfaces operate at a frequency above 315 MHz in the standard IP core variations,
and at or above the frequency of 190.90 MHz in 24.24 Gbps variations.

The client acts as a source and the TX MAC acts as a sink in the
transmit direction.

Table 18. Signals of the TX Client Interface with Adapters. In the table,
<n> = 4 for the 40GbE IP core and
<n> = 8 for the 100GbE IP core.
<l> is log2(8*<n>).
All interface signals are clocked by the
clk_txmac clock.

Signal Name | Direction | Description
l<n>_tx_data[<n>*64-1:0] | Input | TX data. If the preamble pass-through feature is enabled, data begins with the preamble.
l<n>_tx_empty[<l>-1:0] | Input | Indicates the number of empty bytes on l<n>_tx_data when l<n>_tx_endofpacket is asserted.
l<n>_tx_startofpacket | Input | When asserted, indicates the start of a packet. The packet starts on the MSB.
l<n>_tx_endofpacket | Input | When asserted, indicates the end of packet.
l<n>_tx_ready | Output | When asserted, the MAC is ready to receive data. The l<n>_tx_ready signal acts as an acknowledge. The source drives l<n>_tx_valid and l<n>_tx_data[<n>*64-1:0], then waits for the sink to assert l<n>_tx_ready. The readyLatency is zero cycles, so that the IP core accepts valid data in the same cycle in which it asserts l<n>_tx_ready. The tx_ready signal indicates that the MAC is ready to receive data in normal operation. However, the tx_ready signal might not be an adequate indication following reset. To avoid sending packets before the Ethernet link is able to transmit them reliably, you should ensure that the application does not send packets on the TX client interface until after the lanes_deskewed signal is asserted.
l<n>_tx_valid | Input | When asserted, l<n>_tx_data is valid. This signal must be continuously asserted between the assertions of l<n>_tx_startofpacket and l<n>_tx_endofpacket for the same packet.
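As a worked example of the l<n>_tx_empty signal, the sketch below computes how many bus cycles (beats) a packet occupies on the Avalon-ST interface and how many bytes are empty in the final beat. It assumes the packet always starts at the MSB of the first beat, as the table states; the function name is hypothetical.

```python
def avalon_st_beats(packet_bytes: int, n_words: int):
    """Return (beats, empty) for a packet on an n_words x 64-bit Avalon-ST bus.

    n_words is 4 for the 40GbE IP core and 8 for the 100GbE IP core.
    """
    if packet_bytes < 1:
        raise ValueError("packet must contain at least one byte")
    bus_bytes = n_words * 8
    beats = -(-packet_bytes // bus_bytes)       # ceiling division
    empty = beats * bus_bytes - packet_bytes    # unused bytes in the EOP beat
    return beats, empty


print(avalon_st_beats(64, 4))   # (2, 0)
print(avalon_st_beats(65, 4))   # (3, 31)
print(avalon_st_beats(100, 8))  # (2, 28)
```

The largest empty value is one less than the bus width in bytes, which is why the signal is log2(8*<n>) bits wide.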

Figure 11. Traffic on the TX and RX Client Interface for 40GbE IP Core Using
the Four- to Two-Word Adapters. Shows typical traffic for the TX and RX Avalon-ST interface of the 40GbE IP core.
This example shows a part of a ModelSim simulation of the parallel testbench
provided with the IP core.

Figure 12. Traffic on the TX and RX Client Interface for 100GbE IP Core Using
the Eight- to Five-Word Adapters. Shows typical traffic for the TX and RX Avalon-ST interface of the 100GbE IP
core. This example shows a part of a ModelSim simulation of the parallel testbench
provided with the IP core.

Table 19. Signals of the TX Client Interface Without Adapters. In the table,
<w> = 2 for the 40GbE IP core and
<w> = 5 for the 100GbE IP core.

Signal Name | Direction | Description
din[<w>*64-1:0] | Input | Data bytes to send, in big-endian order.
din_start[<w>-1:0] | Input | Start of packet (SOP) location in the TX data bus. Only the most significant byte of each 64‑bit word may be a start of packet. The byte ending at bit 127 or bit 63 is a possible start position for 40GbE, and the byte ending at bit 319, 255, 191, 127, or 63 is a possible start position for 100GbE.
din_end_pos[<w>*8-1:0] | Input | End of packet. Any byte may be the last byte in a packet.
din_ack | Output | Indicates that input data was accepted by the IP core.
clk_txmac | Input | TX MAC clock. The minimum clock frequency is 315 MHz for the circuit to function correctly. The clk_txmac clock and the clk_rxmac clock, which clocks the RX datapath, are not related and their rates do not have to match.

The IP core reads the bytes in big endian order. A packet may start in
the most significant byte of any word. A packet may end on any byte.

To avoid sending packets before the IP core completes the reset sequence, you should
ensure that the application does not send packets on the TX client interface until
after the lanes_deskewed signal is asserted.

Bus Quantization Effects With Adapters

The TX custom streaming interface allows a packet to start at any of two
or five positions to maximize utilization of the link bandwidth. The TX
Avalon-ST interface only allows start of packet (SOP) to be placed at the most
significant position. If the SOP were restricted to the most significant
position in the client logic data bus in the custom streaming interface, bus
bandwidth would be reduced.

Figure 14. Reduced Bandwidth With Left-Aligned SOP Requirement. Illustrates the reduction of bandwidth that would be caused by
left‑aligning the SOP for the 100GbE IP core.

Example A shows the minimum-sized packet of eight words. Example B shows
an 11‑word packet which is the worst-case for bandwidth utilization. Assuming
another packet is waiting for transmission, the effective ingress bandwidth is
reduced by 20% and 26%, respectively. Running the MAC portion of the logic
slightly faster than is required can mitigate this loss of bandwidth.
Additional increases in the MAC frequency can provide further mitigation,
although doing so makes timing closure more difficult. The wider data bus for
the Avalon-ST interface also helps to compensate for the Avalon-ST left‑aligned
SOP requirement.
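The 20% and 26% figures follow from simple arithmetic on the 5-word bus of the 100GbE IP core: a packet that must start at the left edge occupies a whole number of bus cycles, and the unused words in its last cycle are lost when another packet is waiting. A small sketch, with a hypothetical function name:

```python
import math


def wasted_fraction(packet_words: int, bus_words: int = 5) -> float:
    """Fraction of bus bandwidth lost when every SOP must be left-aligned."""
    cycles = math.ceil(packet_words / bus_words)
    return 1 - packet_words / (cycles * bus_words)


print(f"{wasted_fraction(8):.1%}")   # 20.0%  (Example A: 8 of 10 words used)
print(f"{wasted_fraction(11):.1%}")  # 26.7%  (Example B: 11 of 15 words used)
```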

User Interface to Ethernet Transmission

The IP core reverses the bit stream for transmission per Ethernet
requirements. The transmitter handles the insertion of the inter‑packet gap,
frame delimiters, and padding with zeros as necessary. The transmitter also
handles FCS computation and insertion.

The Ethernet MAC and PHY transmit complete packets. After transmission
begins, it must complete with no IDLE insertions. Between the end of one packet
and the beginning of the next packet, the data input is not considered and the
transmitter sends IDLE characters. An unbounded number of IDLE characters can
be sent between packets.

40GbE IP Core Without Adapters

The following figures illustrate the transmission of a short packet when
preamble pass-through is turned off and when it is turned on.

Figure 15. Short Packet Example Without Preamble. Illustrates the transmission of a short packet when preamble
pass-through is turned off.

Bus Representation of a Short TX Packet Without Preamble

This example shows the Verilog HDL code that represents the simple
packet illustrated in the previous figure. Note that bit
din_end_pos[5] in the second cycle, corresponding to the
“Last data” in the figure, is asserted.

Figure 16. Short Packet Example With Preamble. Illustrates the transmission of a short packet when preamble
pass-through is turned on. In this example, the preamble starts in the MSB;
however, this need not be the case.

Bus Representation of a Short TX Packet With Preamble

This example shows the Verilog HDL code that represents the simple
packet illustrated in the previous figure. Note that bit
din_end_pos[5] in the second cycle, corresponding to the
“Last data” in the figure, is asserted.

Figure 17. Sample 40GbE IP Core TX Bus Activity. Illustrates the deassertion of the
din_ack signal. The data beginning with 0xe6e7 is
not immediately accepted. The
din bus must be held until
din_ack returns to one. At this point normal data
flow resumes.

100GbE IP Core Without Adapters

The following figures illustrate the transmission of a short packet when
preamble pass-through is turned off and when it is turned on.

Figure 18. Short Packet Example Without Preamble. Illustrates the transmission of a short packet for the 100GbE IP
core when preamble pass-through is turned off.

Bus Representation of a Short TX Packet Without Preamble

This example shows the Verilog HDL code that represents the simple
packet illustrated in the preceding figure. Note that bit
din_end_pos[13], corresponding to the “Last data” in the
figure, is asserted.

Figure 20. Sample 100GbE IP Core TX Bus Activity. Illustrates the deassertion of the
din_ack signal. The data beginning with 0x0202 is
not immediately accepted. The
din bus must be held until
din_ack returns to one. At this point normal data
flow resumes.

The TX logic supports packets of less than the usual length. However,
no more than two start-of-packets can occur in the same clock cycle.

For example,
din_start might be set to 5’b11000, indicating the
start of a new packet in two successive words. In this case,
din_end_pos could equal 40’h0101000000, indicating two
packets of eight bytes. Each 8‑byte packet is padded with zeros to create a
64-byte packet.
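The values in this example can be reproduced with a short sketch. It assumes the bit mapping implied by the example: bit k of din_start marks word k, bit b of din_end_pos marks byte b of din, and the most significant word and byte carry the highest indices. The function name is hypothetical, and the sketch handles only packets that start and end within one clock cycle.

```python
def encode_cycle(packets, words: int = 5):
    """Build (din_start, din_end_pos) for packets that start and end in one cycle.

    packets is a list of (start_word, length_bytes); word 0 is the least
    significant 64-bit word of din.
    """
    if len(packets) > 2:
        raise ValueError("no more than two start-of-packets per clock cycle")
    din_start = 0
    din_end_pos = 0
    for start_word, length in packets:
        first_byte = start_word * 8 + 7        # most significant byte of the word
        last_byte = first_byte - (length - 1)  # bytes run toward lower indices
        if not 0 <= start_word < words or length < 1 or last_byte < 0:
            raise ValueError("packet does not fit in this cycle")
        din_start |= 1 << start_word
        din_end_pos |= 1 << last_byte
    return din_start, din_end_pos


start, end = encode_cycle([(4, 8), (3, 8)])
print(f"5'b{start:05b} 40'h{end:010x}")  # 5'b11000 40'h0101000000
```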

Order of Transmission

The IP core transmits bytes on the Ethernet link starting with the
preamble and ending with the FCS in accordance with the IEEE 802.3 standard.
Transmit frames the IP core receives on the client interface are big‑endian.
Frames the MAC sends to the PHY on the XGMII/CGMII between the MAC and the PHY
are little‑endian; the MAC TX transmits frames on this interface beginning with
the least significant byte.

Figure 21. Byte Order on the Client Interface Lanes Without Preamble
Pass‑Through. Describes the byte order on the Avalon-ST interface when the
preamble pass-through feature is turned off. Destination Address[40] is the
broadcast/multicast bit (a type bit), and Destination Address[41] is a locally
administered address bit.

For example, suppose the destination MAC address consists of the six
octets AC-DE-48-00-00-80. The first octet transmitted (octet 0 of the MAC
address described in the 802.3 standard) is AC and the last octet transmitted
(octet 5 of the MAC address) is 80. The first bit transmitted is the low-order
bit of AC, a zero. The last bit transmitted is the high-order bit of 80, a one.

The preceding table and the following figure show that in this example,
0xAC is driven on
DA5(DA[47:40]) and 0x80 is driven on
DA0(DA[7:0]).
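The mapping in this example can be checked numerically. The sketch below treats the destination address as a 48-bit value DA[47:0] with the first transmitted octet in DA[47:40]; the helper name is an assumption made for illustration.

```python
def mac_to_da(mac: str) -> int:
    """Convert a MAC address string into the 48-bit value DA[47:0]."""
    octets = bytes.fromhex(mac.replace("-", ""))
    if len(octets) != 6:
        raise ValueError("a MAC address has six octets")
    return int.from_bytes(octets, "big")  # first transmitted octet lands in DA[47:40]


da = mac_to_da("AC-DE-48-00-00-80")
da5 = (da >> 40) & 0xFF      # DA5 = DA[47:40], first octet on the wire
da0 = da & 0xFF              # DA0 = DA[7:0], last octet on the wire
group_bit = (da >> 40) & 1   # DA[40], the broadcast/multicast bit
print(hex(da5), hex(da0), group_bit)  # 0xac 0x80 0
```

The low-order bit of 0xAC is zero, so this address is an individual (unicast) address.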

Figure 22. Octet Transmission on the Avalon-ST Signals Without Preamble
Pass-Through. Illustrates how the octets of the client frame are transferred
over the TX datapath when preamble pass-through is turned off.

Figure 23. Byte Order on the Avalon-ST Interface Lanes With Preamble
Pass‑Through. Describes the byte order on the Avalon-ST interface when the
preamble pass-through feature is turned on. Recall that the IP
core overwrites the Start byte you provide on the client interface, but does
not overwrite the SFD byte.

Figure 24. Octet Transmission on the Avalon-ST Signals With Preamble
Pass-Through. Illustrates how the octets of the client frame are transferred over
the TX datapath when preamble pass-through is turned on. The eight preamble
bytes precede the destination address bytes. The preamble bytes are reversed:
the application must drive the Start byte on
l8_tx_data[455:448] and the SFD byte on
l8_tx_data[511:504].

The destination address and source address bytes follow the preamble
pass-through in the same order as in the case without preamble pass-through.

40-100GbE IP Core RX Datapath

The
40-100GbE RX MAC receives Ethernet frames from the PHY and forwards
the payload with relevant header bytes to the client after performing some MAC
functions on header bytes.

Figure 25. Flow of Frame Through the MAC RX Without Preamble
Pass-Through. Illustrates the typical flow of frame through the MAC RX when the
preamble pass-through feature is turned off. In this figure,
<p> is payload size (0–1500 bytes), and
<s> is the number of pad bytes (0–46 bytes).

Figure 26. Flow of Frame Through the MAC RX With Preamble Pass-Through Turned
On. Illustrates the typical flow of frame through the MAC RX when the
preamble pass-through feature is turned on. In this figure,
<p> is payload size (0–1500 bytes), and
<s> is the number of pad bytes (0–46 bytes).

The following sections describe the functions performed by the RX MAC:

40-100GbE IP Core RX Filtering

The 40-100GbE IP core can operate in cut-through mode or in store and
forward mode. In cut‑through mode, the IP core does not buffer incoming
Ethernet packets for filtering. It can filter out incoming runt packets, but
cannot filter on any other criteria. The value in bit 0 of the
RX_FILTER_CTRL register at offset 0x103 determines the
mode, and the value in bit 3 determines whether the IP core filters runt
packets.

When the IP core is in cut‑through mode, it does not filter incoming
Ethernet packets based on destination address. Therefore, when in cut-through
mode, the IP core is in promiscuous receive mode. The Ethernet standard
definition of promiscuous receive mode requires that the IP core accept all
valid frames, regardless of destination address. The 40-100GbE IP core accepts
or rejects invalid frames based on the filtering criteria that are turned on.

In store and forward mode, you can enable oversized-frame handling. When
the maximum frame size is set to 9600 bytes, the IP core passes some of the
frames of 9601–9644 bytes in size, and drops frames of 9645 bytes or more.
For the 100GbE IP core, a frame that exceeds the specified maximum frame size
by 44 bytes or fewer may or may not be dropped, but a frame that exceeds it by
more than 44 bytes is always dropped. For the 40GbE IP core, a frame that
exceeds the specified maximum frame size by 20 bytes or fewer may or may not be
dropped, but a frame that exceeds it by more than 20 bytes is always dropped.
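The oversized-frame behavior can be summarized as a three-way classification. This sketch encodes only the thresholds given above (44 bytes for the 100GbE IP core, 20 bytes for the 40GbE IP core); the names and return strings are hypothetical.

```python
# Uncertainty window above the maximum frame size, in bytes.
OVERSIZE_WINDOW = {"40GbE": 20, "100GbE": 44}


def oversize_outcome(frame_bytes: int, max_frame: int, core: str) -> str:
    """Classify a received frame against the configured maximum frame size."""
    over = frame_bytes - max_frame
    if over <= 0:
        return "within limit"
    if over <= OVERSIZE_WINDOW[core]:
        return "may be passed or dropped"
    return "dropped"


print(oversize_outcome(9644, 9600, "100GbE"))  # may be passed or dropped
print(oversize_outcome(9645, 9600, "100GbE"))  # dropped
```

Client logic that needs a hard limit should therefore not rely on the IP core to drop frames inside the window.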

The 40-100GbE IP core supports the following filtering options:

Destination address
mismatch—refer to the descriptions of the
RX_FILTER_CTRL register and the
MADDR_CTRL register and the link below to the Address
Checking topic.

Runt frame—refer to the
description of the
dout_runt_last_data signal.

40-100GbE IP Core Preamble Processing

The MAC RX looks for the preamble sequence: Start, six preamble bytes, and
SFD. If this sequence is incorrect, the frame is ignored. The Start byte must
be on receive lane 0 (most significant byte). The IP core uses the SFD byte
(0xD5) to identify the last byte of the preamble.

By default, the MAC RX removes all Start, SFD, preamble, and IPG bytes
from accepted frames. However, if you turn on the RX preamble
pass-through feature by setting bit 0 of the
Preamble Pass-Through Configuration register at offset 0x125, the MAC RX does not remove
the eight-byte preamble sequence.

In the user interface, the EOP signal (l<n>_rx_endofpacket or dout_last_data)
indicates the end of the CRC bytes if CRC is not removed. When CRC is removed, the
EOP signal indicates the final byte of payload.

By default, the IP core asserts the FCS error signal
(l<n>_rx_fcs_error or
dout_fcs_error) and the EOP signal on the same clock
cycle if the current frame has an FCS error. However, if the IP core is in RX
automatic pad removal mode, the signals might not be asserted in the same clock
cycle.

40-100GbE IP Core CRC Checking

The 32-bit CRC field is received in the order X31, X30, ..., X1,
X0, where X31 is the most significant bit of the FCS field and occupies the
least significant bit position in the first FCS byte.

If a CRC32 error is detected, the RX MAC marks the frame invalid by asserting
the dout_fcs_error and
dout_fcs_valid
signals.

When operating in the cut-through or store and forward mode,
with Avalon–ST or the custom streaming client interface, the FCS result is
always preserved.

RX CRC Forwarding

The CRC-32 field is forwarded to the client interface after the final
byte of data, if the CRC removal option is not enabled.

RX Automatic Pad Removal Control

In the 40GbE and 100GbE MAC configurations, you can enable and disable
RX automatic pad removal at run time with a configuration register bit.

The following figures illustrate the normal format of received data at
the MAC RX interface.

Figure 27. Flow of Frame Through the MAC RX Without Preamble
Pass-Through. Illustrates the typical flow of frame through the MAC RX when the
preamble pass-through feature is turned off. In this figure,
<p> is payload size (0–1500 bytes), and
<s> is the number of pad bytes (0–46 bytes).

Figure 28. Flow of Frame Through the MAC RX With Preamble Pass-Through Turned
On. Illustrates the typical flow of frame through the MAC RX when the
preamble pass-through feature is turned on. In this figure,
<p> is payload size (0–1500 bytes), and
<s> is the number of pad bytes (0–46 bytes).

In these figures, normal packet data, highlighted in gray, lasts until
the end of the payload section. However, the IP core might pass along
additional padding, marked with PAD, to ensure that the frame length is at
least 64 bytes. The Ethernet standard requires padding insertion when the
payload length is less than 46 bytes. EOP at the RX interface is normally
marked after padding, but you can disable CRC removal to place the EOP at the
end of the CRC block.

When enabled, RX automatic pad removal moves the EOP marker to the end
of the payload as indicated in the length field, whether padding has been
inserted or not. If the length is 46 bytes or more, no padding has been
inserted and this feature has no effect. When you enable RX automatic pad
removal, the CRC is excluded from the EOP marker on packets that have stripped
padding, and enabling CRC retention has no effect on padded packets.

Note: Signals ending in
*_fcs_valid and
*_fcs_error are not shifted along with the new EOP
marker. Instead, they function as if pad removal were disabled. Do not rely on
these signals when operating in RX automatic pad removal mode.

The
PAD_CONFIG register controls RX automatic pad removal.
By default, pad removal is disabled. Statistics counting is not affected by RX
automatic pad removal; data reports as default, as if the padding were not
removed.

Address Checking

Unicast—Specifies a
destination address is a unicast (individual) address. Bit 0 is 0.

Multicast—Specifies a
destination address is a multicast or group address. Bit 0 is 1.

Broadcast—Specifies a
broadcast address, in which all 48 bits of the destination address are 1s,
48’hFFFF_FFFF_FFFF.

If destination address matching is enabled, IP core address checking
compares the address to the address programmed in the destination address
register, and accepts only the frames with a matching address. You must enable
filtering to discard mismatched destination addresses.

To enable address checking, you must ensure that your 40-100GbE IP core
has the following values in the specified register fields:

Bit 0 of the
RX_FILTER_CTRL register at offset 0x103 has the value
of 0.

Bit 0 of the
MADDR_CTRL register at offset 0x140 has the value of
1.

Bit 30 of the
MADDR_CTRL register has the value of 1.

The
MADDR_CTRL fields allow you to turn off destination
address checking but still enable the IP core to filter RX traffic based on
other criteria.

If bit 0 of the
RX_FILTER_CTRL register has the value of 1, the IP core
is in promiscuous receive mode. In this mode, the IP core omits address
checking and accepts all the Ethernet frames it receives, except possibly runt
frames.
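The register conditions above reduce to a small predicate. The sketch takes the raw 32-bit values of RX_FILTER_CTRL (offset 0x103) and MADDR_CTRL (offset 0x140) as read over the management interface; the function and constant names are hypothetical, and no register access routine is shown.

```python
RX_FILTER_CTRL_BIT0 = 1 << 0   # 1 = promiscuous receive mode
MADDR_CTRL_BIT0 = 1 << 0       # must be 1 for address checking
MADDR_CTRL_BIT30 = 1 << 30     # must be 1 for address checking


def address_checking_enabled(rx_filter_ctrl: int, maddr_ctrl: int) -> bool:
    """Return True when all three register fields select address checking."""
    return (
        (rx_filter_ctrl & RX_FILTER_CTRL_BIT0) == 0
        and (maddr_ctrl & MADDR_CTRL_BIT0) != 0
        and (maddr_ctrl & MADDR_CTRL_BIT30) != 0
    )


print(address_checking_enabled(0x0, 0x4000_0001))  # True
print(address_checking_enabled(0x1, 0x4000_0001))  # False (promiscuous mode)
```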

40-100GbE IP Core RX Data Bus with Adapters (Avalon-ST Interface)

The adapter for the RX interface of the 100GbE IP core
increases the bus width from 5 words (320 bits) to 8 words (512 bits). The
adapter for the RX interface of the 40GbE IP core increases the bus width from
2 words (128 bits) to 4 words (256 bits). The Avalon-ST interface always locates
the SOP at the MSB, simplifying the interpretation of incoming data.

The RX MAC acts as a source and the client acts as a sink in the
receive direction.

Figure 29. RX MAC to Client Interface
with Adapters. The Avalon-ST interface bus width varies with the IP core
variation. In the figure, <n> = 4 for the 40GbE IP
core and <n> = 8 for the 100GbE IP core. <l> is log2(8*<n>).

Table 21. Signals of the RX Client Interface with Adapters. In the table,
<n> = 4 for the 40GbE IP core and
<n> = 8 for the 100GbE IP core.
<l> is log2(8*<n>).

Name | Direction | Description
l<n>_rx_data[<n>*64-1:0] | Output | RX data.
l<n>_rx_empty[<l>-1:0] | Output | Indicates the number of empty bytes on l<n>_rx_data when l<n>_rx_endofpacket is asserted, starting from the least significant byte (LSB).
l<n>_rx_startofpacket | Output | When asserted, indicates the start of a packet. The packet starts on the MSB.
l<n>_rx_endofpacket | Output | When asserted, indicates the end of packet.
l<n>_rx_error | Output | When asserted, indicates an error condition.
l<n>_rx_valid | Output | When asserted, indicates that RX data is valid. Only valid between the l<n>_rx_startofpacket and l<n>_rx_endofpacket signals.
l<n>_rx_fcs_valid | Output | When asserted, indicates that FCS is valid.
l<n>_rx_fcs_error | Output | When asserted, indicates an FCS error condition. Runt frames always force an FCS error condition. However, if a packet is eight bytes or smaller, it is considered a decoding error and not a runt frame, and the IP core does not flag it as a runt.

Figure 30. Traffic on the TX and RX Client Interface for 40GbE IP Core Using
the Four- to Two-Word Adapters. Shows typical traffic for the TX and RX Avalon-ST interface of the 40GbE IP core.
This example shows a part of a ModelSim simulation of the parallel testbench
provided with the IP core.

Figure 31. Traffic on the TX and RX Client Interface for 100GbE IP Core Using
the Eight- to Five-Word Adapters. Shows typical traffic for the TX and RX Avalon-ST interface of the 100GbE IP
core. This example shows a part of a ModelSim simulation of the parallel testbench
provided with the IP core.

The RX bus without adapters consists of five 8-byte words, or 320
bits, operating at a frequency above 315 MHz for the 100GbE IP core or two
8‑byte words, or 128 bits, for the 40GbE IP core, nominally at 315 MHz. This
bus drives data from the RX MAC to the RX client.

Table 22. Signals of the RX Client Interface Without Adapters. In the table,
<w> = 2 for the 40GbE IP core and
<w> = 5 for the 100GbE IP core.

Signal Name | Direction | Description
dout_d[<w>*64-1:0] | Output | Received data and Idle bytes. In RX preamble pass-through mode, this bus also carries the preamble.
dout_c[<w>*8-1:0] | Output | Indicates control bytes on the data bus. Each bit of dout_c indicates whether the corresponding byte of dout_d is a control byte. A bit is asserted high if the corresponding byte on dout_d is an Idle byte or the Start byte, and has the value of zero if the corresponding byte is a data byte or, in preamble pass-through mode, a preamble or SFD byte.
dout_first_data[<w>-1:0] | Output | Indicates the first data word of a frame, in the current clk_rxmac cycle. In RX preamble pass-through mode, the first data word is the word that contains the preamble. When the RX preamble pass-through feature is turned off, the first data word is the first word of Ethernet data that follows the preamble.
dout_last_data[<w>*8-1:0] | Output | Indicates the final data byte of a frame, before the FCS, in the current clk_rxmac cycle.
dout_runt_last_data[<w>-1:0] | Output | Indicates that the last_data (the final data byte of the frame) is the final data byte of a runt frame (< 64 bytes). If a frame is eight bytes or smaller, it is considered a decoding error and not a runt frame, and the IP core does not flag it with this signal.
dout_payload[<w>-1:0] | Output | Word contains packet data (including destination and source addresses) as opposed to only containing Idle bytes. CRC and padding bits are considered data for this signal. When preamble pass-through is turned on, the preamble is also considered data for this signal.
dout_fcs_error | Output | The current or most recent last_data byte is part of a frame with an incorrect FCS (CRC-32) value. By default, the IP core asserts dout_fcs_error in the same cycle as the dout_last_data signal. However, in RX automatic pad removal mode, the dout_fcs_error signal might lag the dout_last_data signal for the frame.
dout_fcs_valid | Output | The FCS error bit is valid.
dout_dst_addr_match[<w>-1:0] | Output | The first data word in a frame that matches the destination address in the DST_AD0_LO and DST_AD0_HI registers. However, if bit 30 of the MADDR_CTRL register has the value of 0, the address is always considered to match. Otherwise, if bit 0 of the MADDR_CTRL register has the value of 0, the address is always considered to not match.
dout_valid | Output | The dout_d bus contents are valid. This signal is occasionally deasserted due to clock crossing.
clk_rxmac | Input | RX MAC clock. The minimum clock frequency is 315 MHz. The clk_rxmac clock and the clk_txmac clock (which clocks the TX datapath) are not related and their rates do not have to match.

The data bytes use 100 Gigabit Media Independent Interface
(CGMII‑like) encoding. For packet payload bytes, the
dout_c bit is set to 0 and the
dout_d byte is the packet data. You can use this
information to transmit out‑of-spec data such as customized preambles when
implementing non-standard variants of the
IEEE 802.3ba-2010 100G Ethernet Standard. If the
additional customized data is not required, simply ignore all words which are
marked with
dout_payload = 0 and discard the
dout_c bus.

In RX preamble pass-through mode,
dout_c has the value of 1 while the start byte of the
preamble is presented on the RX interface, and
dout_c has the value of 0 while the remainder of the
preamble sequence (six-byte preamble plus SFD byte) is presented on the RX
interface. While the preamble sequence is presented on the RX interface,
dout_payload has the value of 1.
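A testbench can check the preamble word against this description. The sketch below lists bytes most significant first and pairs each byte with its dout_c bit. The Start (0xFB) and SFD (0xD5) values come from this document; the 0x55 preamble byte is the standard Ethernet value; the function name is hypothetical.

```python
START, PREAMBLE, SFD = 0xFB, 0x55, 0xD5


def is_preamble_word(data: bytes, ctrl) -> bool:
    """Check one 8-byte word (MSB first) and its eight dout_c bits.

    Only the Start byte is flagged as a control byte; the six preamble
    bytes and the SFD are presented as data (dout_c = 0).
    """
    expected_data = bytes([START] + [PREAMBLE] * 6 + [SFD])
    expected_ctrl = [1] + [0] * 7
    return bytes(data) == expected_data and list(ctrl) == expected_ctrl


word = bytes.fromhex("FB555555555555D5")
print(is_preamble_word(word, [1, 0, 0, 0, 0, 0, 0, 0]))  # True
print(is_preamble_word(word, [0] * 8))                   # False
```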

100GbE IP Core RX Client Interface Examples

Example on RX Client Interface Without Preamble
Pass-Through

In the figure,
dout_last_data is asserted in the second cycle,
indicating the end of a packet. The
dout_d signal returns to 0 (5’h1c = 5’b11100). The
dout_c and
dout_d buses are set to 1’b1 and 8’h07, respectively,
to indicate idling. In the fourth cycle,
dout_first_data is asserted and a short packet begins.
This packet terminates successfully in the final cycle. Note that the first
packet has a CRC error (dout_fcs_error = 1 and
dout_fcs_valid = 1).

The
dout_first_data signal marks the first word of data.
The first byte of data is always the most significant byte of the word. There
are 5 legal starting positions for the 100GbE IP core and 2 legal starting
positions for the 40GbE IP core. The
dout_last_data signal marks the last data byte before
the FCS (CRC). It can occur at any byte position.

The
dout_runt_last_data signal indicates that the packet
ending in this word is less than the minimum legal length of 64 bytes from
first data to the last FCS byte inclusive. Runt frames of eight or fewer bytes
cannot be legally represented in CGMII and trigger a decoding error rather than
this flag.

The
dout_fcs_error and
dout_fcs_valid signals indicate the result of the CRC
checking logic.
dout_fcs_valid = 1 and
dout_fcs_error = 1 indicate a corrupted frame. Note
that the CRC checking network works correctly on runt frames of 40–63 bytes.
Runt frames of less than 40 bytes may be incorrectly determined to have a
proper CRC.

The
dout_payload signal marks words that contain frame
data payload. Words that contain data payload might also contain Idle or
sequence control information, or preamble/frame delimiters. However, if the
word contains any frame data, the
dout_payload bit that corresponds to that word has the
value of 1.

The
dout_valid signal exists for clock rate compensation.
This signal is almost always asserted. When
dout_valid is deasserted all other dout signals should
be ignored for one clock cycle.
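The signal descriptions above suggest how client logic might consume the RX custom streaming interface. The following Python sketch is a behavioral model only, not RTL. It assumes that each word is represented as a dictionary keyed by the signal names above (a hypothetical representation chosen for illustration) and that dout_payload is reduced to the single bit that corresponds to that word. The model skips words presented while dout_valid is deasserted and keeps only words whose dout_payload bit is 1.

```python
def payload_words(words):
    """Collect dout_d from words that are valid and carry frame data."""
    kept = []
    for word in words:
        if not word["dout_valid"]:
            continue  # clock rate compensation cycle: ignore all dout signals
        if word["dout_payload"]:
            kept.append(word["dout_d"])
    return kept


def frame_is_corrupt(dout_fcs_valid, dout_fcs_error):
    """A frame is flagged as corrupted only when both FCS signals are 1."""
    return bool(dout_fcs_valid and dout_fcs_error)


# Hypothetical trace: data word, idle word, invalid cycle, data word.
trace = [
    {"dout_valid": 1, "dout_payload": 1, "dout_d": 0x1122334455667788},
    {"dout_valid": 1, "dout_payload": 0, "dout_d": 0x0707070707070707},
    {"dout_valid": 0, "dout_payload": 1, "dout_d": 0xDEADBEEFDEADBEEF},
    {"dout_valid": 1, "dout_payload": 1, "dout_d": 0x99AABBCCDDEEFF00},
]
print([hex(w) for w in payload_words(trace)])
```

In this model the third entry is dropped even though its dout_payload bit is set, because no dout signal is meaningful while dout_valid is deasserted.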

Example on RX Client Interface With Preamble Pass-Through

Figure 34. Typical Traffic on the RX Client Interface for 100GbE IP Core
Without Adapters and With Preamble Pass-Through.

In the figure, the IP core is in preamble pass-through mode, and
places the preamble sequence on the data bus
dout_d. The
dout_first_data signal marks the first word of data,
including the full preamble sequence, beginning with the Start byte. The data
in each packet begins with the Start byte (0xFB), six-byte preamble
(0x555555555555), and SFD (0xD5), and must align with one of the legal starting
positions. The IP core asserts the
dout_c signal high for the Start byte, and holds
dout_c low for the remainder of the preamble, which is
considered to be data. The
dout_c signal is the only signal that distinguishes
any part of the preamble sequence from data. The second and third packets in
the example each begin in the first word of
dout_d, with the preamble sequence. However, they
could each legally begin at the start of any other word of
dout_d; the start of each word is a legal starting
position.

Error Conditions on the RX Datapath

The RX MAC indicates error conditions by asserting
l<n>_rx_error. The following error
conditions are defined:

Received frame terminated
early or with an error

Received frame has a CRC
error

Error characters received
from PHY

Received frame is too short
(less than 64 bytes) or too long (longer than the maximum specified length)

40GbE Lower Rate 24.24 Gbps MAC and PHY

The 40GbE MAC and PHY IP core configured on certain device speed grades can run
at 24.24 Gbps with a 4 x 6.25 Gbps external interface. To generate a 40GbE IP core that
runs at the 24.24 Gbps rate, the Quartus II software generates the PHY with a data rate
of 6250.0 Mbps, instead of the 10312.5 Mbps data rate for the normal 40GbE IP core
variations.
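The 24.24 Gbps figure can be reproduced from the lane rate. The sketch below assumes that the only overhead between the serial line rate and the MAC data rate is 64b/66b line coding, which is consistent with the 66-bit blocks this document refers to; the function name is illustrative.

```python
from fractions import Fraction


def mac_rate_gbps(lanes, lane_rate_gbps):
    """MAC data rate, assuming 64 payload bits per 66-bit block on every lane."""
    return lanes * Fraction(lane_rate_gbps) * Fraction(64, 66)


print(float(mac_rate_gbps(4, "10.3125")))           # standard 40GbE
print(round(float(mac_rate_gbps(4, "6.25")), 2))    # lower-rate 40GbE option
```

Under the same assumption, ten lanes at 10.3125 Gbps and four lanes at 25.78125 Gbps both yield the 100 Gbps MAC rate.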

Congestion and Flow Control Using Pause Frames

The 40-100GbE IP core provides flow control to
reduce congestion at the local or remote link partner. When either link partner
experiences congestion, the respective transmit control sends pause frames. The
pause frame instructs the remote transmitter to stop sending data for the duration
that the congested receiver specified in an incoming XOFF frame.

When the IP core receives the XOFF pause control frame, if the
following conditions all hold, the IP core stops transmitting frames for a period
equal to the pause quanta of the incoming pause frame. While paused, the IP core
does not transmit data but can transmit pause frames.

The
appropriate bit of the
RECEIVE_PAUSE_CONTROL register has the value of 1.

Address matching is
positive.

The pause quanta can be configured in the pause quanta register of the
device sending XOFF frames. If the pause frame is received in the middle of a
frame transmission, the transmitter finishes sending the current frame and then
suspends transmission for a period specified by the pause quanta. Data
transmission resumes when a pause frame with quanta of zero is received or when
the timer has expired. The pause quanta received overrides any counter
currently stored. When more than one pause quanta is sent, the value of the
pause is set to the last quanta received.
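The pause timer behavior described above can be sketched as a small behavioral model. This is an illustration under stated assumptions, not the IP core implementation: the class and method names are hypothetical, time advances in whole pause quanta, and one pause quantum is taken to be 512 bit times.

```python
class PauseTimer:
    """Behavioral model of the TX pause timer, counted in pause quanta."""

    def __init__(self):
        self.remaining = 0

    def receive_pause(self, quanta):
        # The most recently received quanta overrides any stored count.
        # A value of zero (XON) therefore ends the pause immediately.
        self.remaining = quanta

    def tick_quantum(self):
        # Advance time by one pause quantum.
        if self.remaining > 0:
            self.remaining -= 1

    @property
    def paused(self):
        return self.remaining > 0


def pause_seconds(quanta, line_rate_bps):
    """Duration of a pause, assuming one quantum is 512 bit times."""
    return quanta * 512 / line_rate_bps


timer = PauseTimer()
timer.receive_pause(5)
timer.receive_pause(2)      # overrides the earlier value
print(timer.remaining)
print(pause_seconds(0xFFFF, 100e9))
```

With these assumptions, the longest possible pause (0xFFFF quanta) lasts roughly 336 microseconds at 100 Gbps and roughly 839 microseconds at 40 Gbps.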

10 This is a multicast destination
address. You can use a unicast address if unicast addresses are
enabled in the pause register.

11 The bytes P1 and P2 are filled
with the value configured in the
pause_quanta
register.

Conditions Triggering XOFF Frame Transmission

The TX MAC transmits XOFF frames when one of the following conditions
occurs:

Client requests XOFF transmission—A
client can explicitly request that XOFF frames be sent using the pause control
interface signals. When pause_insert_tx is asserted and pause_insert_time is not
zero, an XOFF frame is sent to the Ethernet network when the current
frame transmission completes.

Host (software) requests XOFF
transmission—Setting the pause control
register
triggers a request that an XOFF frame be sent.

Both options are available
in the 40-100GbE IP core with or without adapters.

Conditions Triggering XON Frame Transmission

The TX MAC transmits XON frames when one of the following conditions
occurs:

Client requests XON transmission—A
client can explicitly request that XON frames be sent using the pause control
interface signals. If pause_insert_tx is asserted and pause_insert_time is zero, an XON frame is sent to the
Ethernet network when the current frame transmission completes.

Host (software) requests XON
transmission—Setting
the pause control
register
triggers
a request that an XON frame be sent.

Both options are available in the
40-100GbE IP core with or without adapters.
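The client-request rules for XOFF and XON differ only in the value of pause_insert_time. A minimal sketch of that decision, with a hypothetical function name and with the edge detection on pause_insert_tx assumed to happen elsewhere:

```python
def requested_pause_frame(pause_insert_tx_edge, pause_insert_time):
    """Return the frame type a client request produces, or None if no request."""
    if not pause_insert_tx_edge:
        return None
    # Nonzero pause time requests XOFF; zero pause time requests XON.
    return "XOFF" if pause_insert_time != 0 else "XON"


print(requested_pause_frame(True, 0x0100))
print(requested_pause_frame(True, 0))
```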

Pause Transmission Logic

Figure 36. Block Diagram of the Pause Transmission Logic.

Pause Control and Generation Interface

The pause control interface implements flow control as specified by
the IEEE 802.3ba-2010 100G Ethernet Standard. The pause logic, upon
receiving a pause packet, temporarily stops packet transmission, and can pass
the pause packets through as normal traffic or drop the pause control frames in
the RX direction.

Table 23. Pause Control and Generation Signals. Describes the signals that implement pause control. You can also access the pause functionality using the pause
registers for any variant of the 40-100GbE IP core.

Signal Name

Direction

Description

pause_insert_tx

Input

Edge triggered
signal which directs the IP core to insert a pause frame
on the Ethernet link.

pause_insert_time[15:0]

Input

Specifies the duration of the pause in pause quanta. The pause
control settings in the pause registers determine the duration of the pause
quanta (one pause quantum equals 512 bit times).

pause_insert_mcast

Input

When asserted, specifies that the IP core should generate a
pause packet with the well‑known multicast address of 01-80-C2-00-00-01. If
deasserted, the pause is generated with the specified MAC address
(pause_insert_dst), which is typically a unicast address.

pause_insert_dst[47:0]

Input

Specifies the MAC address for a unicast pause.

pause_insert_src[47:0]

Input

Specifies source address of the pause packet.

pause_match_from_rx

Output

Asserted to indicate an RX pause signal match.
Used only when RX configurations are instantiated.
The IP core asserts
this signal when it receives a pause request with an
address match, to signal the TX MAC to throttle its transmissions
on the Ethernet link.

pause_time_from_rx[15:0]

Output

Asserted for RX pause time. Used only when RX configurations
are instantiated.

pause_match_to_tx

Input

Asserted to indicate a TX pause signal match. Used only when
TX configurations are instantiated.

pause_time_to_tx[15:0]

Input

Asserted for TX pause time. Used only when TX configurations
are instantiated.
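To show how the pause_insert_* inputs relate to the frame on the wire, the following sketch assembles a pause frame in the standard IEEE 802.3 MAC Control layout (destination address, source address, type 0x8808, opcode 0x0001, two-byte pause time, zero padding). It is an illustration only: this document does not describe how the IP core assembles the frame internally, the function name is hypothetical, and the FCS is omitted.

```python
PAUSE_MCAST = bytes.fromhex("0180C2000001")  # well-known multicast address


def build_pause_frame(pause_insert_src, pause_insert_time,
                      pause_insert_mcast=True, pause_insert_dst=None):
    """Build a pause frame without FCS, padded to 60 bytes."""
    dst = PAUSE_MCAST if pause_insert_mcast else pause_insert_dst
    if dst is None or len(dst) != 6 or len(pause_insert_src) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    if not 0 <= pause_insert_time <= 0xFFFF:
        raise ValueError("pause time must fit in 16 bits")
    frame = (dst + pause_insert_src
             + (0x8808).to_bytes(2, "big")          # MAC Control type
             + (0x0001).to_bytes(2, "big")          # pause opcode
             + pause_insert_time.to_bytes(2, "big"))
    return frame.ljust(60, b"\x00")                 # pad to minimum size


frame = build_pause_frame(bytes.fromhex("001122334455"), 0x1234)
print(frame[:18].hex())
```

A pause time of zero produces an XON frame with the same layout.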

Pause Control Frame and Non‑Pause Control Frame Filtering and Forwarding

The 40GbE and 100GbE MAC IP cores can pass the pause packets through
as normal traffic or drop the pause control frames in the RX direction. You can
enable and disable pass-through with the following configuration control bits:

RX_FILTER_CTRL bit [4] enables and disables pause
filtering.

RX_FILTER_CTRL bit [5] enables and disables control
filtering.

By default, pass-through is disabled.

The following rules define pause control frames filtering control:

The
RX_FILTER_CTRL register contains options to filter
different packets types, such as runt packets, FCS error packets, address
mismatch packets, and so on, from the RX MAC. The
RX_FILTER_CTRL register contains one bit to enable
pause packet filtering and one bit to enable non-pause control packet
filtering. The reset state for both bits is 1, where filtering is enabled. The
bits are gated by
RX_FILTER_CTRL bit [0], which enables and disables
all filtering.

If you have enabled pause
packet filtering, the IP core drops packets that enter the RX MAC and match the
length and type of 0x8808 with an opcode of 0x1 (pause packets) and does not
process them or forward them to the client interface.

If you have enabled
non‑pause control packet filtering, the IP core drops packets that enter the RX
MAC and match the length and type of 0x8808 with an opcode other than 0x1
(non-pause control packets) and does not forward them to the client interface.

If you have disabled pause
packet filtering, the RX MAC forwards pause packets to the client interface
depending on their destination address. If destination address filtering is not
enabled, the RX MAC forwards all pause packets to the client. If destination
address filtering is enabled, the RX MAC forwards only pause packets with a
valid multicast address or a destination address matching the
40-100GbE IP core address.

Pause and control packet pass-through do not affect the pause
functionality in the TX or RX MAC.
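The filtering rules above can be summarized as a decision function. The sketch below is a reading of those rules, not a description of the RTL. It assumes that "gated by bit [0]" means each filter bit takes effect only when bit [0] is also 1, and it ignores destination address filtering, which can still drop a frame that this function reports as forwarded. The function name is hypothetical.

```python
MAC_CONTROL_TYPE = 0x8808
PAUSE_OPCODE = 0x0001


def rx_control_frame_action(rx_filter_ctrl, length_type, opcode):
    """Return 'drop' or 'forward' for a frame entering the RX MAC."""
    filter_enable = (rx_filter_ctrl >> 0) & 1   # gates all filtering
    pause_filter = (rx_filter_ctrl >> 4) & 1    # pause packet filtering
    control_filter = (rx_filter_ctrl >> 5) & 1  # non-pause control filtering
    if length_type != MAC_CONTROL_TYPE:
        return "forward"  # not a control frame; these two filters do not apply
    if opcode == PAUSE_OPCODE:
        return "drop" if filter_enable and pause_filter else "forward"
    return "drop" if filter_enable and control_filter else "forward"


print(rx_control_frame_action(0b110001, 0x8808, 0x0001))
print(rx_control_frame_action(0b100001, 0x8808, 0x0001))
```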

40-100GbE IP Core Modes of Operation

This section explains the cut-through, store and forward, and
promiscuous modes of the 40-100GbE IP core.

In the normal mode of operation, the 40-100GbE IP core MAC transmits and
receives data through a PHY to and from a remote link partner Ethernet MAC. You
can program IP core registers to control the way in which the IP core RX MAC
operates.

You can program the RX MAC to selectively filter incoming Ethernet
packets based on various criteria. For example, the RX MAC performs address
filtering, various header checking, and control frame termination according to
the IEEE 802.3 standard. You must enable filtering to discard mismatched
destination addresses.

If you choose to accept all incoming Ethernet packets, and not filter on
any criterion, except possibly to filter out runt packets, the IP core is
configured in cut‑through mode. If you filter based on any criterion other than
runt packets, the IP core is configured in store and forward mode, in which it
buffers the incoming packet for checking before processing in the MAC.

If the IP core is in cut‑through mode, it meets the criteria for
promiscuous receive mode, as defined in the Ethernet standard. This definition
specifies that the Ethernet implementation accept all valid frames, regardless
of destination address. In cut-through mode, the IP core accepts all Ethernet
frames that are sufficiently well-formed to be identified. Runt frames are
invalid frames, according to the Ethernet standard, and therefore their
acceptance or rejection is immaterial to the criteria for promiscuous receive
mode.

Link Fault Signaling Interface

The
40‑100GbE IP core provides link fault signaling as defined in
the IEEE 802.3ba-2010 100G Ethernet Standard. The
40GbE and 100GbE MAC include a Reconciliation Sublayer (RS)
located between the MAC and the XLGMII or CGMII to manage local and remote
faults. Link fault signaling on the Ethernet link is disabled by default but
can be enabled by
the
Enable Link Fault Sequence register. When
enabled, the local RS TX logic
can transmit
remote fault sequences in case of a local fault and
can transmit
IDLE control words in case of a remote fault. An
additional configuration register (MAC/RS link fault sequence
configuration) is provided to select the type of information to be
transmitted in case of a local or remote fault. Using the configuration bits,
you can send remote fault sequence ordered sets, IDLE control words, or regular
traffic in the case of a local or remote fault.

The RS RX logic sets
remote_fault_status or
local_fault_status to 1 when the RS RX block receives
remote fault or local fault sequence ordered sets. When valid data is received
in more than 127 columns, the RS RX logic resets the relevant fault status
(remote_fault_status or
local_fault_status) to 0.

Figure 37. Link Fault Signaling Example.

The IEEE standard specifies RS monitoring of RXC<7:0> and
RXD<63:0> for Sequence
ordered_sets. For more information, refer to
Figure 81–9—Link Fault Signaling state diagram
and
Table 81-5—Sequence
ordered_sets in the
IEEE 802.3ba-2010 100G Ethernet Standard. The variable
link_fault is set to indicate the value of an RX
Sequence
ordered_set when four
fault_sequences containing the same fault value are
received with fault sequences separated by less than 128 columns and with no
intervening
fault_sequences of different fault values. The
variable
link_fault is set to OK following any interval of 128 columns not
containing a remote fault or local fault Sequence
ordered_set.
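The link_fault rules in the preceding paragraph can be sketched as a simplified behavioral model. It is not a transcription of the state diagram in the standard: the class name and the "LF"/"RF" labels are illustrative, and each call represents one received column that either carries a fault Sequence ordered_set or does not.

```python
class LinkFaultMonitor:
    """Simplified model of the link_fault variable."""

    def __init__(self):
        self.link_fault = "OK"
        self._last_type = None
        self._seq_cnt = 0   # consecutive fault_sequences of the same type
        self._col_cnt = 0   # columns since the last fault_sequence

    def column(self, fault=None):
        """Process one column; fault is None, 'LF', or 'RF'."""
        if fault is None:
            self._col_cnt += 1
            if self._col_cnt >= 128:
                # 128 columns without a fault sequence: back to OK.
                self.link_fault = "OK"
                self._last_type = None
                self._seq_cnt = 0
            return self.link_fault
        if fault == self._last_type:
            self._seq_cnt += 1
        else:
            # A different fault value restarts the count.
            self._last_type = fault
            self._seq_cnt = 1
        self._col_cnt = 0
        if self._seq_cnt >= 4:
            self.link_fault = fault
        return self.link_fault


mon = LinkFaultMonitor()
print([mon.column("LF") for _ in range(4)])
```

In this model the fourth matching fault_sequence changes link_fault, and the value persists until 128 consecutive columns arrive without a fault sequence.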

Table 24. Signals of the Link Fault Signaling Interface

Signal Name

Direction

Description

remote_fault_from_rx

Output

Asserted when remote fault is detected in RX MAC. Available in RX-only variations.

If link fault signaling is disabled, this signal is present
but is tied low (always has the value of 0).

local_fault_from_rx

Output

Asserted when local fault is detected in RX MAC. Available in RX-only variations.

If link fault signaling is disabled, this signal is present
but is tied low (always has the value of 0).

remote_fault_to_tx

Input

Input to the TX MAC. Asserted when remote fault is detected.
Visible in TX-only variations and used internally in duplex IP core variations.

local_fault_to_tx

Input

Input to the TX MAC. Asserted when local fault is detected.
Visible in TX-only variations and used internally in duplex IP core variations.

remote_fault_status

Output

Asserted when remote fault is detected in RX MAC in a duplex IP core
variation. In duplex IP core variations, remote_fault_from_rx is connected internally to
remote_fault_to_tx, and this
signal is available externally as remote_fault_status.

local_fault_status

Output

Asserted when local fault is detected in RX MAC in a duplex
IP core variation. In duplex IP core variations,
local_fault_from_rx is connected internally to
local_fault_to_tx, and this signal is available
externally as
local_fault_status.

Statistics Counters Interface

The statistics counters module is a synthesis
option
that you
select in the 40-100GbE parameter editor.
However, the statistics status bit output vectors are provided whether you select the
statistics counters module option or not.

The increment vectors are brought to the top level as output ports. If you configure the statistics counters module,
the increment vectors also function
internally as input ports to the control and status registers (CSR).

Asserted for one cycle when an errored multicast TX frame,
excluding control frames, is transmitted.

tx_inc_mcast_data_ok

Output

Asserted for one cycle when a valid multicast TX frame,
excluding control frames, is transmitted.

tx_inc_bcast_data_err

Output

Asserted for one cycle when an errored broadcast TX frame,
excluding control frames, is transmitted.

tx_inc_bcast_data_ok

Output

Asserted for one cycle when a valid broadcast TX frame,
excluding control frames, is transmitted.

tx_inc_ucast_data_err

Output

Asserted for one cycle when an errored unicast TX frame,
excluding control frames, is transmitted.

tx_inc_ucast_data_ok

Output

Asserted for one cycle when a valid unicast TX frame, excluding
control frames, is transmitted.

tx_inc_mcast_ctrl

Output

Asserted for one cycle when a valid multicast TX frame is
transmitted.

tx_inc_bcast_ctrl

Output

Asserted for one cycle when a valid broadcast TX frame is
transmitted.

tx_inc_ucast_ctrl

Output

Asserted for one cycle when a valid unicast TX frame is
transmitted.

tx_inc_pause

Output

Asserted for one cycle when a valid pause TX frame is
transmitted.

tx_inc_fcs_err

Output

Asserted for one cycle when a TX packet with FCS errors is
transmitted.

tx_inc_fragment

Output

Asserted for one cycle when a TX frame less than 64 bytes and
reporting a CRC error is transmitted.

tx_inc_jabber

Output

Asserted for one cycle when an oversized TX frame reporting a
CRC error is transmitted.

tx_inc_sizeok_fcserr

Output

Asserted for one cycle when a valid TX frame with FCS errors is
transmitted.

RX Statistics Counter Increment Vectors

rx_inc_runt

Output

Asserted for one cycle when an RX runt packet is received.

rx_inc_64

Output

Asserted for one cycle when a 64-byte RX frame is received.

rx_inc_127

Output

Asserted for one cycle when a 65–127 byte RX frame is received.

rx_inc_255

Output

Asserted for one cycle when a 128–255 byte RX frame is received.

rx_inc_511

Output

Asserted for one cycle when a 256–511 byte RX frame is received.

rx_inc_1023

Output

Asserted for one cycle when a 512–1023 byte RX frame is
received.

rx_inc_1518

Output

Asserted for one cycle when a 1024–1518 byte RX frame is
received.

rx_inc_max

Output

Asserted for one cycle when a maximum-size RX frame is
received.

rx_inc_over

Output

Asserted for one cycle when an oversized RX frame is received.

rx_inc_mcast_data_err

Output

Asserted for one cycle when an errored multicast RX frame,
excluding control frames, is received.

rx_inc_mcast_data_ok

Output

Asserted for one cycle when a valid multicast RX frame,
excluding control frames, is received.

rx_inc_bcast_data_err

Output

Asserted for one cycle when an errored broadcast RX frame,
excluding control frames, is received.

rx_inc_bcast_data_ok

Output

Asserted for one cycle when a valid broadcast RX frame,
excluding control frames, is received.

rx_inc_ucast_data_err

Output

Asserted for one cycle when an errored unicast RX frame,
excluding control frames, is received.

rx_inc_ucast_data_ok

Output

Asserted for one cycle when a valid unicast RX frame, excluding
control frames, is received.

rx_inc_mcast_ctrl

Output

Asserted for one cycle when a valid multicast RX frame is
received.

rx_inc_bcast_ctrl

Output

Asserted for one cycle when a valid broadcast RX frame is
received.

rx_inc_ucast_ctrl

Output

Asserted for one cycle when a valid unicast RX frame is
received.

rx_inc_pause

Output

Asserted for one cycle when a valid RX pause frame is received.

rx_inc_fcs_err

Output

Asserted for one cycle when an RX packet with FCS errors is
received.

Assertion of this signal might be early or delayed
compared to assertion of the
dout_fcs_error signal on the RX custom streaming
interface, because the IP core asserts
rx_inc_fcs_err when the MAC sees the FCS error,
and asserts
dout_fcs_error when it presents the relevant
frame on the custom streaming client interface. Depending on the filtering
settings in the
RX_FILTER_CTRL register, the frame might not
appear at all on the client interface.

rx_inc_fragment

Output

Asserted for one cycle when an RX frame less than 64 bytes and
reporting a CRC error is received.

rx_inc_jabber

Output

Asserted for one cycle when an oversized RX frame reporting a
CRC error is received.

rx_inc_sizeok_fcserr

Output

Asserted for one cycle when a valid RX frame with FCS errors is
received.

MAC – PHY XLGMII or CGMII Interface

The PHY side of the MAC implements the XLGMII or CGMII protocol as
defined by the IEEE 802.3ba standard. The standard XLGMII or CGMII
implementation consists of a 64-bit-wide data bus. However, the Altera
implementation uses a wider bus interface in connecting a MAC to the internal
PHY. The width of this interface is 320 bits for the 100GbE IP core and 128
bits for the 40GbE IP core.

Table 26. XL/CGMII Permissible Encodings. Lists XL/CGMII permissible encodings. Memorizing a few of the
XL/CGMII encodings greatly facilitates understanding of Ethernet waveforms. The
XL/CGMII encodings are backwards compatible with older Ethernet and have
convenient mnemonics. The
DATAPATH_OPTION RTL parameter instantiates TX and RX
backwards compatibility and is set by default.

| Control | Data | Description |
| --- | --- | --- |
| 0 | xx | Packet data, including preamble and FCS bytes. |
| 1 | 07 | Idle. |
| 1 | fb | Start of Frame (fb = frame begin). |
| 1 | fd | End of Frame (fd = frame done). |
| 1 | fe | XL/CGMII Error. Typically a bit error which switched a 66-bit block between data and control, or corrupt control information (fe = frame error). |
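When reading waveforms, the encodings in Table 26 can be applied mechanically to each (control, data) pair. The following sketch does so for a short byte sequence; the function name and the sample sequence are illustrative, and control codes not listed in the table are reported generically.

```python
CONTROL_CODES = {
    0x07: "Idle",
    0xFB: "Start of Frame",
    0xFD: "End of Frame",
    0xFE: "XL/CGMII Error",
}


def describe(control, data):
    """Name one XL/CGMII byte from its control bit and data value."""
    if control == 0:
        return "Packet data"
    return CONTROL_CODES.get(data, "Other control code")


# Hypothetical byte sequence: idle, start, two data bytes, end, idle.
sequence = [(1, 0x07), (1, 0xFB), (0, 0x55), (0, 0xD5), (1, 0xFD), (1, 0x07)]
for control, data in sequence:
    print(f"c={control} d={data:02x}  {describe(control, data)}")
```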

MAC to PHY Connection Interface

Table 27. MAC to PHY and PHY to MAC TX and RX Signals.

The MAC–PHY connection interface is exposed in the 40‑100GbE
MAC-only and PHY-only IP core variations. In addition, the
tx_lanes_stable output signal from the PHY component
is available to provide status information to user logic in PHY-only IP core
variations and in MAC and PHY IP core variations.

In the table,
<w> = 2 for the 40GbE IP core and
<w> = 5 for the 100GbE IP core.

Lane to Lane Deskew Interface

The lane to lane deskew signal is included in the 40‑100GbE IP core
with and without adapters. When both MAC and PHY options are selected, the lane
to lane deskew input signal acts as an internal signal. The lane to lane deskew
output signal from the PHY component is available to provide status information
to user logic in both PHY-only and MAC and PHY IP core variations.

Table 28. Lane to Lane Deskew Interface Signals

| Signal Name | Direction | Description |
| --- | --- | --- |
| lanes_deskewed | Input | Indicates lane to lane skew is corrected. Available as an input to the 40-100GbE MAC IP core only. |
| lanes_deskewed | Output | Indicates lane to lane skew is corrected. Available as an output from the 40-100GbE PHY IP core and the 40-100GbE MAC and PHY IP core. |

PCS Test Pattern Generation and Test Pattern Check

The PCS can generate a test pattern and detect a scrambled idle test
pattern. PCS test-pattern mode is suitable for receiver tests and for certain
transmitter tests.

When bit 0 of the
TEST_MODE register at offset 0x019 has the value of 1, a
scrambled idle pattern is enabled. In this mode, the scrambler generates a test
pattern. The scrambler does not require seeding during test-pattern operation.
The input to the scrambler is a control block (blocktype=0x1E). The TX PCS adds
synchronous headers and alignment markers to the data stream, enabling the RX
PCS to align and deskew the PCS lanes.

The scrambled idle test-pattern checker utilizes the block lock state
diagram, the alignment marker state diagram, the PCS deskew state diagram, and
the descrambler; these blocks operate as if in normal data reception. The bit
error rate (BER) monitor state diagram is disabled during RX test-pattern mode.
When
align_status is true and the scrambled idle RX
test-pattern mode is active, the scrambled idle test-pattern checker observes
the synchronous header and the output from the descrambler. When the
synchronous header and the descrambler output match an idle pattern, a
match is detected. When operating in scrambled idle test-pattern mode, the
test-pattern error counter counts blocks with a mismatch. Any mismatch
indicates an error and increments the test-pattern error counter.

Transceiver PHY Serial Data Interface

The core uses a 40‑bit ×<n> lane digital
interface to send data to the TX high‑speed serial I/O pins operating at
10.3125 Gbps in the standard 40GbE and 100GbE variations, at
6.25 Gbps in the 24.24 Gbps variations, and at 25.78125 Gbps in the CAUI-4
variations. The
rx_serial and
tx_serial ports connect to the 10.3125 Gbps, 6.25 Gbps, or 25.78125 Gbps pins. The
protocol includes automatic reordering of serial lanes so that any ordering is
acceptable.
Virtual lanes 0 and 1 transmit data on
tx_serial[0].

PCS BER Monitor

The PCS implements bit error rate (BER) monitoring as specified by the
IEEE 802.3ba-2010 100G Ethernet Standard. When the PCS
deskews the data and aligns the lanes, the BER monitor checks the signal
quality and asserts
hi_ber if it detects excessive errors. When
align_status is asserted and
hi_ber is deasserted, the RX PCS continuously accepts
blocks and generates RXD <63:0> and RXC <7:0> on the XLGMII or
CGMII interface.

High BER occurs when 97 invalid 66-bit synchronous headers are detected
for 100GbE within 500 µs or detected for 40GbE within 1.25 ms. When fewer than
97 invalid 66-bit synchronous headers occur in the same window, the IP core
exits the high BER state.
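The high BER thresholds can be illustrated with a simple counting model. The sketch below assumes fixed, non-overlapping measurement windows of 500 µs (100GbE) or 1.25 ms (40GbE), which is a simplification; refer to the state diagram in the standard for the exact timing behavior. The function name and the timestamp representation are hypothetical.

```python
WINDOW_US = {"100GbE": 500.0, "40GbE": 1250.0}
INVALID_HEADER_THRESHOLD = 97


def hi_ber_by_window(invalid_header_times_us, variant, total_us):
    """Return one hi_ber value per complete window in the observation period."""
    window = WINDOW_US[variant]
    n_windows = int(total_us // window)
    counts = [0] * n_windows
    for t in invalid_header_times_us:
        idx = int(t // window)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    # hi_ber is set for a window that reaches the threshold, and
    # cleared for a window that stays below it.
    return [c >= INVALID_HEADER_THRESHOLD for c in counts]


burst = [float(i) for i in range(97)]   # 97 invalid headers in the first 97 us
print(hi_ber_by_window(burst, "100GbE", 1000.0))
```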

For more information, refer to
Figure 82–13—BER monitor state diagram
illustrated in the
IEEE 802.3ba-2010 100G Ethernet Standard.

40GBASE-KR4 IP Core Variations

The 40GBASE-KR4 IP core supports low-level control of analog
transceiver properties for link training and auto-negotiation in the absence of
a predetermined environment for the IP core. For example, an Ethernet IP core
in a backplane may have to communicate with different link partners at
different times. When it powers up, the environment parameters may be different
than when it ran previously. The environment can also change dynamically,
necessitating reset and renegotiation of the Ethernet link.

The 40-100GbE IP core 40GBASE-KR4 variations implement the IEEE
Backplane Ethernet Standard 802.3ap-2007. The 40-100GbE IP core provides
this reconfiguration functionality in Stratix V devices by configuring each
physical Ethernet lane with an Altera Backplane Ethernet 10GBASE-KR PHY IP core
if you turn on
Enable KR4 in the 40-100GbE parameter editor.
The parameter is available in variations parameterized with these values:

Device family: Stratix V

MAC configuration: 40GbE

Core option: "PHY" or "MAC
& PHY"

PHY configuration: 40Gbps
(4 x 10)

Duplex mode: Full Duplex

The PHY IP core includes the option to implement the following
features:

KR auto-negotiation
provides a process to explore coordination with a link partner on a variety of
different common features. The 40GBASE-KR4 variations of the 40-100GbE IP core
can auto-negotiate only to a 40GBASE-KR4 configuration. Turn on the
Enable KR4 Reconfiguration and
Enable Auto-Negotiation parameters to
configure support for auto-negotiation.

Link training provides a process for the IP core to train the link
to the data frequency of incoming data, while compensating for variations in
process, voltage, and temperature. Turn on the
Enable KR4 Reconfiguration and
Enable Link Training parameters to configure
support for TX link training.

To enable RX link training, you must also turn on the
Enable RX equalization parameter. Two
options are available for TX link training:

A built-in TX
adaptation algorithm.

A microprocessor
interface to support manual control of the link training process. Turn on the
Enable microprocessor interface parameter
to configure this support.

After the link is up and
running, forward error correction (FEC) provides an error detection and
correction mechanism to enable noisy channels to achieve the Ethernet-mandated
bit error rate (BER) of 10^-12. Turn on the
Include FEC sublayer parameter to configure support for
FEC.

The 40GBASE-KR4 IP core variations include separate link training and
FEC modules for each of the four Ethernet lanes, and a single auto-negotiation
module. You specify the master lane for performing auto-negotiation in the
parameter editor, and the IP core also provides register support to modify the
selection dynamically.

The 40GBASE-KR4 IP core is designed to connect to a reconfiguration
bundle, which includes the Altera Transceiver Reconfiguration Controller and
logic to assist in reconfiguring the transceivers into the different modes of
operation (AN, LT, and FEC data mode or non-FEC data mode). Altera provides the
testbench and example design to assist you in integrating your 40GBASE-KR4 IP
core into your complete design. The testbench and design example include the
reconfiguration bundle. You can examine the reconfiguration bundle for an
example of how to drive and connect the 40GBASE-KR4 IP core.

The 40GBASE-KR4 IP core variations provide two interfaces to control
these processes.

40GBASE-KR4 Reconfiguration Interface

The 40GBASE-KR4 reconfiguration interface supports low-level control
of analog transceiver properties for link training and auto-negotiation in the
absence of a predetermined environment for the IP core.

Table 29. 40GBASE-KR4 Reconfiguration Interface Signals.

Signals with a width of 4 x n are divided into fields of width n.
Bits [n-1:0] refer to Lane 0, bits [2n-1:n] refer to Lane 1, bits [3n-1:2n]
refer to Lane 2, and bits [4n-1:3n] refer to Lane 3. You can use these signals
to dynamically change between auto-negotiation, link training, and normal data
modes.
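The per-lane field layout described above amounts to a shift and a mask. A minimal sketch, with a hypothetical helper name, applied to a 12-bit tap_to_upd value (n = 3):

```python
def lane_field(value, n, lane):
    """Extract the n-bit field for one lane from a 4 x n bit signal.

    Lane 0 occupies bits [n-1:0], lane 1 bits [2n-1:n], and so on.
    """
    if not 0 <= lane <= 3:
        raise ValueError("lane must be 0..3")
    return (value >> (n * lane)) & ((1 << n) - 1)


# Example value: lane 3 = 3'b001, lane 2 = 3'b010, lanes 1 and 0 = 3'b100.
tap_to_upd = 0b001_010_100_100
for lane in range(4):
    print(lane, format(lane_field(tap_to_upd, 3, lane), "03b"))
```

The same helper applies to the other 4 x n signals, for example n = 6 for main_rc[23:0] and n = 2 for dfe_mode[7:0].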

Note that the regular Stratix V dynamic reconfiguration interface,
the
reconfig_from_xcvr,
reconfig_to_xcvr, and
reconfig_busy signals, are also available in
40GBASE-KR4 IP core variations. The reconfiguration bundle in the example
design includes the Altera Transceiver Reconfiguration Controller. For an
example of how to coordinate dynamic transceiver reconfiguration using these
two interfaces, the regular Stratix V transceiver reconfiguration interface and
the 40GBASE-KR4 specific interface, refer to the example design reconfiguration
bundle.

Signal Name

Direction

Description

rc_busy[3:0]

Input

When asserted, indicates that reconfiguration is in progress.

lt_start_rc[3:0]

Output

When asserted, starts the TX PMA equalization reconfiguration
on the corresponding lane. This signal is present only if link training is
enabled.

main_rc[23:0]

Output

The main TX equalization tap value, which is the same as
VOD. This signal is present only if link training is enabled.

post_rc[19:0]

Output

The post-cursor TX equalization tap value for the
corresponding lane. This signal is present only if link training is enabled.

pre_rc[15:0]

Output

The pre-cursor TX equalization tap value for the corresponding
lane. This signal is present only if link training is enabled.

tap_to_upd[11:0]

Output

Specifies the TX equalization tap to update to optimize signal
quality. Each lane's field has the following valid values:

3'b100: main tap

3'b010: post-tap

3'b001: pre-tap

This signal is present only if link training is enabled.

seq_start_rc[3:0]

Output

When a bit is asserted, starts PCS reconfiguration for the
corresponding lane.

dfe_start_rc[3:0]

Output

When a bit is asserted, starts RX DFE equalization for the
corresponding lane. This signal is present only if RX equalization is enabled.

dfe_mode[7:0]

Output

Specifies the DFE operation mode. Valid at the rising edge of
the
dfe_start_rc signal and held until the falling
edge of the
rc_busy signal. The following encodings are
defined for each lane:

2'b00: Disable DFE

2'b01: DFE
triggered mode (single shot)

2'b10 and 2'b11 are
reserved.

This signal is present only if RX equalization is enabled.

ctle_start_rc[3:0]

Output

When a bit is asserted, starts continuous time linear
equalization (CTLE) reconfiguration on the corresponding lane. This signal is
present only if RX equalization is enabled.

ctle_rc[15:0]

Output

RX CTLE value. This signal is valid at the rising edge of the
ctle_start_rc signal and held until the
falling edge of the
rc_busy signal. The valid range of values is
4'b0000–4'b1111. 4'b0000 indicates the RX CTLE is disabled and 4'b1111
indicates RX CTLE is at its maximum value. This signal is present only if RX
equalization is enabled.

ctle_mode[7:0]

Output

Specifies the CTLE mode. This signal is valid at the rising
edge of the
ctle_start_rc signal and held until the
falling edge of the
rc_busy signal. The only valid value of this
signal in the 40-100GbE IP core is 2'b0. This signal is present only if RX
equalization is enabled.

pcs_mode_rc[5:0]

Output

Specifies the PCS mode for reconfiguration. Has the following
valid values:

6'b000001:
auto-negotiation mode

6'b000010: link
training mode

6'b100000: data
mode (normal operation)

Other values are not valid for the 40GBASE-KR4 IP core.

en_lcl_rxeq[3:0]

Output

When a bit is asserted, it signals that an additional custom
RX equalization is enabled for the corresponding lane. The bits are identical
to the Link Trained status bits 0xD2 [0], [8], [16], and [24].

rxeq_done[3:0]

Input

When asserted, indicates that custom RX equalization is
complete. The PHY IP core ANDs each bit of this signal with
rx_trained from the Training State Diagram for
the corresponding lane.
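
The pcs_mode_rc encodings listed above can be checked in a testbench or in host-side debug code. The following Python sketch shows one way to decode a sampled value. The function name decode_pcs_mode and the idea of decoding the signal in software are assumptions for illustration; they are not part of the IP core.

```python
# Hypothetical helper: decode a sampled pcs_mode_rc[5:0] value.
# Only the three encodings listed in the signal table are accepted.
PCS_MODES = {
    0b000001: "auto-negotiation",
    0b000010: "link training",
    0b100000: "data",
}


def decode_pcs_mode(value: int) -> str:
    """Return the mode name, or raise ValueError for any other value."""
    if value not in PCS_MODES:
        raise ValueError(
            f"pcs_mode_rc value {value:#08b} is not valid for 40GBASE-KR4"
        )
    return PCS_MODES[value]
```

For example, decode_pcs_mode(0b100000) returns "data", and a value such as 0b000100 raises ValueError.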

40GBASE-KR4 Microprocessor Interface

The optional embedded processor interface signals allow you to use the
embedded processor mode of Link Training. This mode overrides the TX adaptation
algorithm and allows an embedded processor to initialize the link.

Table 30. 40GBASE-KR4 Microprocessor Interface Signals. Signals with a width of 4 x n are divided into fields of width n.
Bits [n-1:0] refer to Lane 0, bits [2n-1:n] refer to Lane 1, bits [3n-1:2n]
refer to Lane 2, and bits [4n-1:3n] refer to Lane 3. These signals are only
available if you turn on
Enable microprocessor interface.

Signal Name

Direction

Description

upi_mode_en[3:0]

Input

When a bit is asserted, enables embedded processor mode on the
corresponding lane.

upi_adj[7:0]

Input

Selects the active tap for the corresponding lane. Each lane's
field has the following valid values:

2'b01: main tap

2'b10: post-tap

2'b11: pre-tap

upi_inc[3:0]

Input

When a bit is asserted, sends the increment command for the
corresponding lane.

upi_dec[3:0]

Input

When a bit is asserted, sends the decrement command for the
corresponding lane.

upi_pre[3:0]

Input

When a bit is asserted, sends the preset command for the
corresponding lane.

upi_init[3:0]

Input

When a bit is asserted, sends the initialize command for the
corresponding lane.

upi_st_bert[3:0]

Input

When a bit is asserted, starts the BER timer for the
corresponding lane.

upi_train_err[3:0]

Input

When a bit is asserted, indicates a training error on the
corresponding lane.

upi_lock_err[3:0]

Input

When a bit is asserted, indicates a training frame lock error
on the corresponding lane.

upi_rx_trained[3:0]

Input

When a bit is asserted, the RX interface for the corresponding
lane is trained.

upo_enable[3:0]

Output

When a bit is asserted, indicates that the IP core is ready to
receive commands from the embedded processor for the corresponding lane.

upo_frame_lock[3:0]

Output

When a bit is asserted, indicates the receiver has achieved
training frame lock on the corresponding lane.

upo_cm_done[3:0]

Output

When a bit is asserted, indicates the master state machine
handshake for the corresponding lane is complete.

upo_bert_done[3:0]

Output

When a bit is asserted, indicates the BER timing for the
corresponding lane is at its maximum count.

Each four-bit field holds the current BER count for the
corresponding lane.

upo_ber_max[3:0]

Output

When a bit is asserted, the BER counter for the corresponding
lane has rolled over.

upo_coef_max[3:0]

Output

When a bit is asserted, indicates that the remote coefficients
for the corresponding lane are at their maximum or minimum values.
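
The lane packing described in the Table 30 caption (bits [n-1:0] for Lane 0 through bits [4n-1:3n] for Lane 3) can be sketched in Python as follows. The helper names lane_field and pack_lanes are hypothetical; the tap encodings are those listed for upi_adj.

```python
# Hypothetical helpers for the 4 x n lane packing described in Table 30.
NUM_LANES = 4

# upi_adj encodings from the table (2 bits per lane).
UPI_ADJ_TAPS = {0b01: "main tap", 0b10: "post-tap", 0b11: "pre-tap"}


def lane_field(bus_value: int, lane: int, n: int) -> int:
    """Extract the n-bit field for `lane` (0-3) from a 4*n-bit bus value."""
    if not 0 <= lane < NUM_LANES:
        raise ValueError("lane must be in the range 0..3")
    return (bus_value >> (lane * n)) & ((1 << n) - 1)


def pack_lanes(fields, n: int) -> int:
    """Pack four n-bit per-lane fields (Lane 0 first) into one bus value."""
    if len(fields) != NUM_LANES:
        raise ValueError("expected one field per lane")
    value = 0
    for lane, field in enumerate(fields):
        if not 0 <= field < (1 << n):
            raise ValueError(f"field for lane {lane} does not fit in {n} bits")
        value |= field << (lane * n)
    return value


# Example: main tap on lanes 0 and 3, post-tap on lane 1, pre-tap on lane 2.
upi_adj = pack_lanes([0b01, 0b10, 0b11, 0b01], n=2)
selected = [UPI_ADJ_TAPS[lane_field(upi_adj, lane, 2)] for lane in range(4)]
```

The same extraction applies to the single-bit-per-lane signals such as upi_inc[3:0] with n = 1.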

Control and Status Interface

The control and status interface provides an Avalon-MM interface to
the control and status registers. The Avalon-MM interface implements a standard
memory‑mapped protocol. You can connect an embedded processor or JTAG Avalon
master to this bus to access the control and status registers.

Table 31. Avalon-MM Control and Status Interface Signals. The
clk_status clocks the signals on the 40-100GbE IP core
control and status interface.

Signal Name

Direction

Description

status_addr[15:0]

Input

Address for reads and writes

status_read

Input

Read command

status_write

Input

Write command

status_writedata[31:0]

Input

Data to be written

status_readdata[31:0]

Output

Read data

status_readdata_valid

Output

Read data is ready for use

The status interface is designed to operate at low frequencies, typically 50 MHz for Stratix IV devices and 100 MHz for Stratix V devices, so that control and
status logic does not compete for resources with the surrounding high speed
datapath.
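
As a rough illustration of how a master uses the signals in Table 31, the following Python sketch models one write followed by one read, stepping one clk_status cycle at a time. It is a toy behavioral model, not the IP core's implementation: the class name, the two-cycle read latency, and the register address 0x0010 are all assumptions chosen for the example.

```python
class StatusSlaveModel:
    """Toy model of the register side of the interface.

    The read latency is an assumed value for illustration only.
    """

    def __init__(self, read_latency: int = 2):
        self.regs = {}
        self.read_latency = read_latency
        self._pending = []  # entries are [cycles_left, address]
        self.status_readdata = 0
        self.status_readdata_valid = 0

    def clock(self, status_addr=0, status_read=0, status_write=0,
              status_writedata=0):
        """Advance one clk_status cycle with the given input values."""
        self.status_readdata_valid = 0
        for entry in self._pending:
            entry[0] -= 1
        if self._pending and self._pending[0][0] == 0:
            _, addr = self._pending.pop(0)
            self.status_readdata = self.regs.get(addr, 0)
            self.status_readdata_valid = 1
        if status_write:
            self.regs[status_addr & 0xFFFF] = status_writedata & 0xFFFFFFFF
        if status_read:
            self._pending.append([self.read_latency, status_addr & 0xFFFF])


def read_register(slave, addr, timeout_cycles=100):
    """Issue a read, then wait until status_readdata_valid is asserted."""
    slave.clock(status_addr=addr, status_read=1)
    for _ in range(timeout_cycles):
        if slave.status_readdata_valid:
            return slave.status_readdata
        slave.clock()
    raise TimeoutError("status_readdata_valid was never asserted")


slave = StatusSlaveModel()
# 0x0010 is a hypothetical address, not a documented register.
slave.clock(status_addr=0x0010, status_write=1, status_writedata=0xCAFE)
value = read_register(slave, 0x0010)
```

The point of the sketch is that the master does not sample status_readdata until status_readdata_valid is asserted, rather than assuming a fixed latency.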

Clocks

You must set the transceiver reference clock (clk_ref) frequency to a value
that your IP core variation supports. For most variations, the 40-100GbE IP
core supports clk_ref frequencies of 644.53125 MHz ±100 ppm and
322.265625 MHz ±100 ppm. The ±100 ppm accuracy is required of any clock source
that provides the transceiver reference clock. For CAUI–4 variations, you must
set the frequency of clk_ref to 644.53125 MHz ±100 ppm. For 24.24 Gbps
variations, you must set the frequency of clk_ref either to 390.625 MHz
±100 ppm or to 195.3125 MHz ±100 ppm.
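
The ±100 ppm requirement translates into a small absolute window around each nominal frequency (about ±64.45 kHz at 644.53125 MHz). The following Python sketch checks a measured clk_ref frequency against the values listed above. The variation keys ("standard", "caui4", "24.24g") and the function names are labels invented for this example.

```python
# Nominal clk_ref frequencies in Hz, taken from the text above.
# The dictionary keys are hypothetical labels for the variations.
NOMINAL_CLK_REF_HZ = {
    "standard": (644_531_250.0, 322_265_625.0),
    "caui4": (644_531_250.0,),
    "24.24g": (390_625_000.0, 195_312_500.0),
}
PPM_TOLERANCE = 100.0


def ppm_offset(measured_hz: float, nominal_hz: float) -> float:
    """Return the deviation of measured_hz from nominal_hz in ppm."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


def clk_ref_ok(variation: str, measured_hz: float) -> bool:
    """True if measured_hz is within 100 ppm of a supported nominal value."""
    return any(
        abs(ppm_offset(measured_hz, nominal)) <= PPM_TOLERANCE
        for nominal in NOMINAL_CLK_REF_HZ[variation]
    )
```

For example, clk_ref_ok("caui4", 322_265_625.0) is False, because CAUI–4 variations accept only the 644.53125 MHz reference.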

Sync–E IP core variations are duplex IP core variations
for which you turn on
Enable SyncE support in the parameter editor.
These variations provide separate IP core input reference clock signals for the
TX and RX transceiver PLLs, and provide the RX recovered clock as a top-level
output signal.

The Synchronous Ethernet standard, described in the ITU-T
G.8261, G.8262, and G.8264 recommendations, requires that the TX clock be
filtered to maintain synchronization with the RX reference clock through a
sequence of nodes. The expected usage is that user logic drives the
tx_clk_ref signal with a filtered version of the RX
recovered clock signal, to ensure the receive and transmit functions remain
synchronized.

In a Sync–E IP core, the clk_ref frequency restrictions apply to each of
the
rx_clk_ref and
tx_clk_ref input clocks.

The minimum clock frequency for the IP core is 315 MHz.
The only exception is the 40GbE lower rate 24.24 Gbps MAC and PHY IP core,
which has a minimum clock frequency of 190.90 MHz.

Table 32. Clock Inputs. Describes the input clocks that you must provide.

Signal Name

Description

clk_status

A clock for
reconfiguration, offset cancellation, and housekeeping
functions. This clock is also used for clocking the control
and status interface.
The clock
quality and pin chosen are not critical. clk_status is expected to be a 37.5–50 MHz clock on Stratix IV devices and
a 100–125 MHz clock on
Stratix V devices.

clk_ref

clk_ref is the
reference clock for the transceiver TX PLL and the RX CDR
PLL. This input signal is
not available in Sync–E variations.

The frequency of this input clock must match the
value you specify for PHY reference
frequency in the IP core parameter editor.

For the regular 40GbE and 100GbE IP core
variations, this clock must have a frequency of 322.265625 or 644.53125 MHz with a
±100 ppm accuracy per the IEEE
802.3ba-2010 100G Ethernet Standard.

Although the 322.265625 MHz clock frequency appears to be available in
the 40-100GbE parameter editor, CAUI–4 variations do not support
it. For these variations, this clock
must have a frequency of 644.53125 MHz with a ±100 ppm accuracy.

For 24.24 Gbps IP core variations,
this clock must have a frequency of 390.625 or 195.3125 MHz with
a ±100 ppm accuracy.

In addition, clk_ref must meet the jitter specification of
the IEEE 802.3ba-2010 100G Ethernet
Standard.

The PLL and clock generation logic use this
reference clock to derive the transceiver and PCS clocks. The
input clock should be a high quality signal on the appropriate
dedicated clock pin.

tx_clk_ref

In Sync–E variations (IP core duplex variations
with the Sync–E option enabled), this clock replaces clk_ref as the reference clock for
the transceiver TX PLL.

The frequency of this input clock must match the
value you specify for PHY reference
frequency in the IP core parameter editor.

rx_clk_ref

In Sync–E variations (IP core duplex variations
with the Sync–E option enabled), this clock replaces clk_ref as the reference clock for
the transceiver CDR PLL.

The frequency of this input clock must match the
value you specify for PHY reference
frequency in the IP core parameter editor.

clk_txmac

The input TX clock for the IP
core with or without
adapters is clk_txmac.
The recommended TX MAC clock
frequency is 190.90 MHz for 24.24 Gbps variations, and
315 MHz for all other IP core variations.

clk_rxmac

The input
RX clock for the IP core with
or without adapters is clk_rxmac. The
recommended RX MAC clock frequency is 190.90 MHz for
24.24 Gbps variations, and 315 MHz for all other IP core
variations.

Figure 39. Clock Generation Circuitry. Provides a high-level view of the clock generation circuitry and clock
distribution to the transceiver. In Sync–E
variations, distinct clocks drive the TX PLL (tx_clk_ref) and the CDR block (rx_clk_ref), and the output clock from the CDR is brought out
to the top level.

Resets

The 40-100GbE IP core provides the following two independent reset
mechanisms:

Asynchronous reset
signals—A set of asynchronous reset signals you can assert to reset different
parts of the IP core. Use this method to initialize your IP core.

Reset registers—A set of
register bits you can write to reset different parts of the IP core. This
method is available for dynamic reset during operation.

Table 33. Asynchronous Reset Signals. The IP core provides five reset signals to allow independent reset
control for all configurations. The MAC and PHY asynchronous reset signals are
included in the 40‑100GbE IP Core with adapters and without adapters.

Signal Name

Direction

Description

mac_rx_arst_ST

Input

MAC RX asynchronous reset signal

mac_tx_arst_ST

Input

MAC TX asynchronous reset signal

pcs_rx_arst_ST

Input

PHY PCS RX asynchronous reset signal

pcs_tx_arst_ST

Input

PHY PCS TX asynchronous reset signal

pma_arst_ST

Input

PHY PMA asynchronous reset signal

Note: In any MAC and PHY variation, when you reset the TX MAC you must also reset the TX PCS to avoid transmitting corrupted packets.
Therefore, Altera recommends that you reset the IP core with the following conditions:

Reset the TX MAC and the TX PCS together (assert pcs_tx_arst_ST and mac_tx_arst_ST simultaneously).

Release pcs_tx_arst_ST and mac_tx_arst_ST simultaneously or release pcs_tx_arst_ST after you release mac_tx_arst_ST.

Altera recommends that you
release all parts of the 40-100GbE IP core from reset simultaneously.

Note: Each reset signal must be asserted for at least one
clk_status cycle.

You should not release any reset signal until after you observe that
the reference clock is stable. If the reference clock is generated from an
fPLL, wait until after the fPLL locks.
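
The TX reset recommendations above can be expressed as a simple check over the times at which user logic asserts and releases the resets. The following Python sketch is one possible way to do that in a testbench. It measures time in clk_status cycles, which simplifies the asynchronous signals, and the function name and dictionary-based inputs are assumptions for illustration.

```python
TX_RESETS = ("mac_tx_arst_ST", "pcs_tx_arst_ST")


def check_tx_reset_sequence(assert_cycle, release_cycle, ref_clk_stable_cycle):
    """Return a list of violated recommendations (empty if none).

    assert_cycle and release_cycle map each reset signal name to the
    clk_status cycle number at which it is asserted or released.
    ref_clk_stable_cycle is the cycle at which the reference clock is
    observed to be stable (for example, after the fPLL locks).
    """
    problems = []
    mac, pcs = TX_RESETS
    if assert_cycle[mac] != assert_cycle[pcs]:
        problems.append(
            "assert mac_tx_arst_ST and pcs_tx_arst_ST simultaneously")
    if release_cycle[pcs] < release_cycle[mac]:
        problems.append(
            "release pcs_tx_arst_ST together with or after mac_tx_arst_ST")
    for name in TX_RESETS:
        if release_cycle[name] - assert_cycle[name] < 1:
            problems.append(
                f"{name} must be asserted for at least one clk_status cycle")
        if release_cycle[name] <= ref_clk_stable_cycle:
            problems.append(
                f"{name} released before the reference clock is stable")
    return problems
```

A sequence that asserts both resets at cycle 0 and releases both at cycle 20, with the reference clock stable at cycle 10, returns an empty list.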

A table lists the signals
visible in the MAC-only variations that are not visible in the MAC and PHY
variations.

40-100GbE MAC-only IP core variations are the variations that do not
include a PHY. The signals in a MAC-only variation with and without adapters
are shown in the following figures in black, blue, or green. Signals shown in
purple and in dark gray are not available in the MAC-only variations.

Note: When simulating the full design, the
lanes_deskewed input comes from the
output of the RX PCS, indicating a fully locked status. To avoid confusion,
when simulating the
alt_e40_mac or
alt_e100_mac wrapper as the top level, drive the
lanes_deskewed input and the
tx_mii_ready input to '1'.