H04L45/00—Routing or path finding of packets in data switching networks

H04L45/60—Router architecture

H—ELECTRICITY

H04—ELECTRIC COMMUNICATION TECHNIQUE

H04J—MULTIPLEX COMMUNICATION

H04J3/00—Time-division multiplex systems

H04J3/02—Details

H04J3/06—Synchronising arrangements

H04J3/0635—Clock or time synchronisation in a network

H04J3/0685—Clock or time synchronisation in a node; Intranode synchronisation

H04J3/0697—Synchronisation in a packet node

Abstract

A receive-side client interface for a media access controller embedded in an integrated circuit having programmable circuitry is described. A media access controller core includes a receive engine. A receive-side datapath is coupled to the media access controller core. The receive-side datapath is configured to operate at two frequencies to accommodate the programmable circuitry in the integrated circuit.

Description

LIMITED COPYRIGHT WAIVER

A portion of the disclosure of this patent document contains material to which the claim of copyright protection is made. The copyright owner has no objection to the facsimile reproduction by any person of the patent document or the patent disclosure, as it appears in the U.S. Patent and Trademark Office file or records, but reserves all other rights whatsoever.

FIELD OF THE INVENTION

One or more aspects of the invention relate generally to a network interface and more particularly, to an Ethernet Media Access Controller (“EMAC”) embedded in an integrated circuit (IC).

Each programmable tile typically includes both programmable interconnect and programmable logic. The programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points (PIPs). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.

The programmable interconnect and programmable logic are typically programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured. The configuration data can be read from memory (e.g., from an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.

Another type of PLD is the Complex Programmable Logic Device, or CPLD. A CPLD includes two or more “function blocks” connected together and to input/output (I/O) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (PLAs) and Programmable Array Logic (PAL) devices. In some CPLDs, configuration data is stored on-chip in non-volatile memory. In other CPLDs, configuration data is stored on-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration sequence.

For all of these programmable logic devices (PLDs), the functionality of the device is controlled by data bits provided to the device for that purpose. The data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., FLASH memory, as in some CPLDs), or in any other type of memory cell.

The terms “PLD” and “programmable logic device” include but are not limited to these exemplary devices, as well as encompassing devices that are only partially programmable.

To enhance functionality of PLDs, embedded cores have been added. For example, FPGAs may include one or more hardwired microprocessors. However, an Ethernet Media Access Controller (“EMAC”) core for PLDs has only been available as a program core. For example, a program or “soft” implementation in FPGA programmable circuitry (“fabric”) of an EMAC is available from Xilinx, Inc. of San Jose, Calif., which is described in additional detail in “1-Gigabit Ethernet MAC Core with PCS/PMA Sublayers (1000BASE-X) or GMII v4.0” by Xilinx, Inc. [online] (Aug. 25, 2004) <URL:http://www.xilinx.com/ipcenter/catalog/logicore/docs/gig_eth_mac.pdf>, which is incorporated by reference herein in its entirety (hereinafter “soft EMAC core”).

Advantageously, having a soft EMAC core allows users to connect an FPGA to a network, such as an Ethernet. Unfortunately, the cost of the soft EMAC core implementation is significant with respect to use of configurable logic cells.

Accordingly, it would be desirable and useful to provide an EMAC core that uses fewer configurable logic cells than a soft EMAC core and provides the same functionality as, or greater functionality than, a soft EMAC core. Moreover, such an EMAC core may be substantially compatible with the Institute of Electrical and Electronics Engineers (“IEEE”) specification 802.3-2002. Furthermore, as PLDs may have any user instantiated design, such an EMAC core may be independent of user design.

SUMMARY OF THE INVENTION

The invention relates generally to a receive-side client interface to a media access controller embedded in a programmable logic device.

An aspect of the invention is a programmable logic device including: configurable logic having a first frequency of operation; and a media access controller integrated circuit embedded in the programmable logic device, where the media access controller integrated circuit has a second frequency of operation of at least approximately twice the first frequency of operation. The media access controller integrated circuit has a receive-side client interface having a selectable data input width and configurable for operation at any of a plurality of data rates, where the receive-side client interface is for communication with the configurable logic at the first frequency of operation and for communication outside of the programmable logic device at the second frequency of operation.

Another aspect of the invention is a receive-side client interface for a media access controller. A media access controller core includes a receive engine. A receive-side datapath is coupled to the media access controller core, where the receive-side datapath includes a first set of registers, a second set of registers and a third set of registers. The first set of registers is clocked responsive to a first clock signal. The second set of registers is clocked responsive to a second clock signal, and the third set of registers is clocked responsive to a third clock signal. The second clock signal is an undivided version of the first clock signal, and the third clock signal is a divided version of the first clock signal. The media access controller is embedded in an integrated circuit having programmable logic, where the programmable logic is capable of operating at the third clock signal frequency but not capable of operating at the first clock signal frequency.

BRIEF DESCRIPTION OF THE DRAWINGS

Accompanying drawing(s) show exemplary embodiment(s) in accordance with one or more aspects of the invention; however, the accompanying drawing(s) should not be taken to limit the invention to the embodiment(s) shown, but are for explanation and understanding only.

FIG. 1 is a high-level block diagram depicting an exemplary embodiment of a Field Programmable Gate Array (“FPGA”).

FIG. 2A is a high-level block diagram depicting an exemplary embodiment of an instantiation of an EMAC in configurable logic.

FIG. 2B is a high-level block diagram depicting an exemplary embodiment of an FPGA having an embedded EMAC system.

FIG. 2C is a high-level block/schematic diagram depicting an exemplary embodiment of a clock tree for an EMAC core.

FIG. 2D is a diagram depicting an exemplary embodiment of signal timing for signals of FIG. 2C.

FIG. 2E is a block/schematic diagram depicting an exemplary embodiment of a transmit clock generator.

FIG. 2F-1 is a schematic diagram depicting an exemplary embodiment of an on-chip global buffer multiplexer.

FIG. 2F-2 is a schematic diagram depicting an exemplary embodiment of a divider circuit.

FIG. 2G is a block/schematic diagram depicting an exemplary embodiment of a receive clock generator.

FIG. 3 is a block/schematic diagram depicting an exemplary embodiment of an FPGA configured for an overclocking mode.

FIG. 3A is a simplified block diagram depicting an exemplary embodiment of clock management for a Media Independent Interface.

FIG. 4-1 is a high-level block/schematic diagram depicting an exemplary embodiment of a host interface.

FIG. 4-2 is a block/schematic diagram depicting an exemplary embodiment of a host interface.

FIG. 4-3 is a state diagram depicting an exemplary embodiment of a state machine for an EMAC register read select logic block.

FIG. 4-4 is a state diagram depicting an exemplary embodiment of a state machine for address filter read logic block.

FIG. 4-5A is a block/schematic diagram depicting an exemplary embodiment of device control register (“DCR”) bridge.

FIG. 4-5B is a table diagram depicting an exemplary embodiment of DCR address and bit assignments for a DCR bridge.

FIG. 4-5C is a table diagram listing an exemplary embodiment of definitions for memory-mapped registers.

FIG. 4-6 is a state diagram depicting an exemplary embodiment of a state machine of a DCR acknowledgement generator.

FIG. 4-7 is a state diagram depicting an exemplary embodiment of a state machine of a DCR read bypass multiplexer enable generator 552.

FIG. 4-8 is a block diagram depicting exemplary embodiments of logic blocks of a control generator block for generating control signals for reading from or writing to a DCR bridge to a host bus.

FIG. 4-9 is a block diagram depicting exemplary embodiments of logic blocks of a control generator block for generating control signals for reading or writing data from or to a host bus into a DCR bridge.

FIG. 4-10 is a state diagram depicting an exemplary embodiment of a state machine of a configuration read/write bus controller.

FIG. 4-11 is a state diagram depicting an exemplary embodiment of a state machine of a MIIM read/write bus controller.

FIG. 4-12 is a state diagram depicting an exemplary embodiment of a state machine of a statistics read bus controller.

FIG. 4-13 is a state diagram depicting an exemplary embodiment of a state machine of an address filter read/write bus controller.

FIG. 4-14 is a state diagram depicting an exemplary embodiment of a state machine of an address filter content addressable memory read/write bus controller.

FIG. 4-15 is a state diagram depicting an exemplary embodiment of a state machine of a read data received controller.

FIG. 4-16 is a state diagram depicting an exemplary embodiment of a state machine of a configuration read/write controller.

FIG. 4-17 is a state diagram depicting an exemplary embodiment of a state machine of a statistics read controller.

FIG. 4-18 is a state diagram depicting an exemplary embodiment of a state machine of a MIIM read/write controller.

FIG. 4-19 is a state diagram depicting an exemplary embodiment of a state machine of an address filter read/write controller.

FIG. 4-20 is a state diagram depicting an exemplary embodiment of a state machine of a multicast address register read/write controller.

FIGS. 4-21A through 4-21C are timing diagrams for respective exemplary instances of generation of a sample cycle pulse.

FIGS. 9 and 10 are simplified block diagrams depicting respective exemplary embodiments of Field Programmable Gate Array architectures in which one or more aspects of the invention may be implemented.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.

DETAILED DESCRIPTION

In the following description, numerous specific details are set forth to provide a more thorough description of the specific embodiments of the invention. It should be apparent, however, to one skilled in the art, that the invention may be practiced without all the specific details given below. In other instances, well known features have not been described in detail so as not to obscure the invention. For ease of illustration, the same number labels are used in different diagrams to refer to the same items; however, in alternative embodiments the items may be different. Moreover, for purposes of clarity, a single signal or multiple signals may be referred to or illustratively shown as a signal to avoid encumbering the description with multiple signal lines. Moreover, along those same lines, a multiplexer or a register, among other circuit elements, may be referred to or illustratively shown as a single multiplexer or a single register though such reference or illustration may be representing multiples thereof. Furthermore, though particular signal bit widths, data rates and frequencies are described herein for purposes of clarity by way of example, it should be understood that the scope of the description is not limited to these particular numerical examples as other values may be used.

EMAC System

FIG. 1 is a high-level block diagram depicting an exemplary embodiment of an FPGA 100. FPGA 100 includes FPGA programmable configurable circuitry (“FPGA fabric”) 101 in which an area is reserved for an embedded processor, as well as other embedded circuitry (“hardwired”), namely, processor block 102. Notably, processor block 102 need not be for an embedded processor, but generally refers to any area on an FPGA die reserved for embedded circuitry, more generally Application Specific Integrated Circuitry (“ASIC”) block 102. FPGA fabric 101 may include configurable logic configured for interfacing to one or more interfaces, such as Physical Layer (“PHY”) interfaces 119, clock interface 115, host bus 118, statistics interfaces 116, and client interfaces 117, in this exemplary embodiment. Notably, the words “include” and “including”, and variations thereof, as used herein shall mean including without limitation.

Processor block 102 includes the following embedded, i.e., hardwired, circuitry: processor 103, Ethernet Media Access Controller 110 (“EMAC0”), EMAC 111 (“EMAC1”), and host interface 112. Embedded processor 103 may be a PowerPC 405 core from IBM, though other known processor cores may be used. In an alternative embodiment, embedded processor 103 is a hardwired form of the MicroBlaze or PicoBlaze softcore processor from Xilinx, Inc. EMAC 110, EMAC 111 and host interface 112 are collectively referred to as the top-level EMAC (“EMAC_top”) 104. EMACs 110 and 111 may be used for access to and from Ethernet 39 via PHY interface 119. Alternatively, rather than a PHY interface 119, a transceiver, such as a Multi-Gigabit Transceiver (“MGT”) or an external PHY integrated circuit may be used. Notably, though an EMAC is described herein for purposes of clarity by way of example, it should be understood that the scope is not limited to an Ethernet type of network. Accordingly, a media access controller for interfacing to any known network may be embedded in an integrated circuit having configurable logic for communication therewith.

Processor block 102 may include traces for busing. A Device Control Register (“DCR”) bus 114 is described herein. DCR bus 114 is a known DCR interface for a PowerPC 405 core (“PPC core”). Though a PPC core is described herein for purposes of clarity by way of example, it should be understood that other known processor cores may be used. Furthermore, it should be appreciated that though an embedded processor is described herein, an external host processor 10 may optionally be used instead of embedded processor 103. External host processor 10 may be any of a variety of known processors. Furthermore, it should be understood that host bus 118 may optionally be coupled to an internal embedded processor, such as embedded processor 103, or a processor 10A instantiated in configurable logic of FPGA fabric 101. Notably, configurable logic may be used to instantiate a bridge between host bus 118 and processor 10A. Moreover, it should be understood that EMACs 110 and 111 may be used without any host processor, as configuration vectors may be provided via tie-off pin inputs.

EMACs 110 and 111 share a single host interface 112. Either or both of the EMACs may be selected via host interface 112. Though two EMACs are shown, one or more than two EMACs may be used. Host interface 112 may be used to interface to a microprocessor or other known integrated circuit external to FPGA 100. Such access to an external integrated circuit may be via host bus 118. Notably, host bus 118 is a processor platform independent host bus. In an implementation, host interface 112 may use either an EMAC host bus, such as host bus 118, or a DCR bus 114 through a DCR bridge 113, which may or may not be part of host interface 112. In other words, only one of host bus 118 and DCR bus 114 is used at a time.

EMAC core 123 includes clock generator/management 124. Clock generator 124 may be used to provide a transmit clock signal and a receive clock signal, among other below-described clock signals for EMAC 110. EMAC core 133 includes clock generator 134. Clock generator 134 may be used to provide a transmit clock signal and a receive clock signal, among other below-described clock signals for EMAC 111.

As EMAC 110 and EMAC 111 are the same, only EMAC 110 will be described herein below for purposes of clarity.

FIG. 1A is a simplified block diagram depicting an exemplary embodiment of an EMAC core 123. With simultaneous reference to FIGS. 1 and 1A, EMAC 110 is further described. Again, EMAC cores 123 and 133 are the same, so only one is described for purposes of clarity.

In an exemplary implementation of client interface 117, an 8-bit or 16-bit wide mode may be selected. Client interface 117 is coupled to transmit engine 820 and to receive engine 850. Receive engine 850 may include an address filter 129. Transmit engine 820, which may be considered part of or coupled to transmit client interface 127, is coupled to flow control 105. Receive engine 850, which may be considered part of or coupled to receive client interface 128, is coupled to flow control 105. Transmit engine 820 and receive engine 850 are coupled to MII/GMII/RGMII interface 106, which in turn may be coupled to a physical layer interface 119. MII/GMII/RGMII interface 106 may be coupled to PCS/PMA Sublayer 107, which in turn may be coupled to an MGT and may provide an MDIO interface to a physical layer interface 119 along with MII management interface 108. Receive engine 850, MII management interface 108 and configuration registers may be coupled to host interface 112 via select circuitry 890, which circuitry alternatively may be part of host interface 112.

EMAC 110 may be a multiple mode EMAC. In an exemplary implementation, EMAC 110 may support data rates of approximately 10, 100, and 1000 megabits per second and be compliant with IEEE 802.3-2002 specifications. Though EMAC 110 may operate at a single data rate, such as either approximately 10, 100, or 1000 megabits per second, it may operate as a tri-mode EMAC switching as between data rates. Notably, other data rates may be used, such as other data rates greater than 100 megabits per second.
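The relationship between data rate, client interface width, and clock frequency can be illustrated with simple arithmetic. The following is a minimal sketch only, assuming one full word is transferred on every clock cycle at a single data rate; the function name `client_clock_mhz` is hypothetical, and the actual clocking scheme of EMAC 110 may differ from this idealized model.

```python
def client_clock_mhz(data_rate_mbps, width_bits):
    """Return the clock frequency in MHz needed to carry data_rate_mbps
    over a width_bits-wide bus, assuming one full word per clock cycle."""
    if width_bits <= 0:
        raise ValueError("width_bits must be positive")
    return data_rate_mbps / width_bits


if __name__ == "__main__":
    # Tabulate the idealized clock for each data rate and interface width.
    for rate in (10, 100, 1000):
        for width in (8, 16):
            freq = client_clock_mhz(rate, width)
            print(f"{rate} Mb/s over {width} bits: {freq} MHz")
```

Under these assumptions, doubling the interface width from 8 bits to 16 bits halves the clock frequency needed for a given data rate, which is one way a wider interface may ease timing in slower logic.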

In an exemplary implementation, EMAC 110 may support Reduced Gigabit Media Independent Interface (“RGMII”) protocol for use with double data rate (“DDR”) operation thereby reducing width of a data bus to an external physical layer interface, such as physical layer interface 119. A Physical Medium Attachment (“PMA”) sub-layer may be used with a Multi-Gigabit Transceiver (“MGT”) of FPGA 100 to provide an on-chip 1000BASE-X implementation. MGTs that may be used are shown in FIG. 10, for example.

An embedded EMAC may operate with a Media Independent Interface (“MII”), a Gigabit MII (“GMII”), or a PCS/PMA to an MGT. The input/output (“I/O”) pins for these physical layer (PHY) interfaces 119 cross the ASIC-FPGA boundary to the I/O cells or MGT driver cells located in FPGA 100. Notably, EMAC 110 may use one and only one set of PHY interface pins at a time, and thus only one physical layer interface 119 interfacing, such as for MII, GMII, or MGT, is done at a time.

Meanwhile, processor block 102 has a limited number of I/O pins available for EMAC 110 due to routing channel constraints in FPGA 100. Hence, PHY interface I/O pins are re-used for the different interfaces, such as for a RGMII, an MII, a GMII, a 1000BASE-X, and a Serial Gigabit Media Independent Interface (“SGMII”). In an exemplary implementation, a total reduction of approximately 78 I/O pins on a PHY interface 119 may be achieved, along with output pin reductions in a statistics interface 116, as described below in additional detail. In this exemplary implementation, this reduction in pin count facilitated adding another embedded EMAC, namely, EMAC 111, in a processor block 102 of a pre-existing dimension. Thus, for example, EMAC 110 may have approximately 57 to 61 pins for a physical layer interface 119.

EMAC 110 may be configured to generate statistics on data traffic. For example, at the end of each transmitted or received frame, EMAC 110 may output a statistics vector for a frame to logic, which may be instantiated in FPGA fabric 101, for accumulation. However, statistics accumulation may be independent of the transmitted or received frame provided that each accumulation completes before the next statistics output so that no statistics vector is missed.

Because processor block 102 has a limited number of I/O pins as mentioned above, statistics interface 116 may output a statistics vector in a number of bits at a time, which is smaller than the length of a statistics vector. This output may be done over several cycles instead of in one cycle to reduce the number of output pins used to provide statistics interface 116. For two EMACs 110 and 111, output pins for statistics interfaces 116 may be reduced by approximately 102 pins in an exemplary implementation. Along with PHY interface I/O pins reduction, statistics interface pin reduction may facilitate integration of more than one EMAC in an ASIC block 102.
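The idea of outputting a statistics vector a few bits at a time over several cycles can be sketched as follows. This is an illustrative model only: the vector length, the number of output pins, and the least-significant-word-first ordering are assumptions chosen for the example, and the names `serialize_vector` and `deserialize_vector` are hypothetical.

```python
from typing import List


def serialize_vector(vector: int, vector_bits: int, pins: int) -> List[int]:
    """Split a vector_bits-wide value into words of `pins` bits,
    least significant word first (ordering is an assumption)."""
    if pins <= 0 or vector_bits <= 0:
        raise ValueError("pins and vector_bits must be positive")
    cycles = -(-vector_bits // pins)  # ceiling division
    mask = (1 << pins) - 1
    return [(vector >> (i * pins)) & mask for i in range(cycles)]


def deserialize_vector(words: List[int], pins: int) -> int:
    """Reassemble the words, as accumulation logic in the fabric might."""
    value = 0
    for i, word in enumerate(words):
        value |= word << (i * pins)
    return value


if __name__ == "__main__":
    words = serialize_vector(0xABCD, vector_bits=16, pins=4)
    print([hex(w) for w in words], hex(deserialize_vector(words, 4)))
```

In this sketch, a 16-bit vector sent over 4 pins takes four cycles instead of one, trading output latency for a reduction from 16 pins to 4.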

On the physical layer interface side of EMAC 110, GMII and MII interfaces use standard input/outputs (“I/Os”) to access data and control signals to a network connection via physical layer interface 119, in which an additional PHY integrated circuit may be disposed between PHY interface 119 and the physical medium (i.e., Ethernet lines). In addition, EMAC 110 physical layer interface 119 can be configured for a Physical Coding Sublayer (PCS) and a PMA sub-layer (“PCS/PMA”) interface which may use a serializer-deserializer (“SERDES”) to access a data signal serially. A SERDES may be instantiated in programmable IOBs, such as IOBs 2904 of FIG. 10. An example of a SERDES that may be instantiated is described in additional detail in commonly-owned U.S. Pat. No. 7,091,890, issued on Aug. 15, 2006, entitled, “MULTI-PURPOSE SOURCE SYNCHRONOUS INTERFACE CIRCUITRY,” which is incorporated by reference herein in its entirety.

Flow control module 105 may be used to avoid or reduce congestion in EMAC 110 from communication traffic. MIIM interface may allow a processor access to control and status registers in the PCS layer when configured in a 1000BASE-X or Serial Gigabit Media Independent Interface (“SGMII”) mode.

Clock generator 124 enables EMAC 110 to operate in different modes, such as GMII, MII, RGMII, SGMII, and 1000BASE-X modes. Furthermore, clock generator 124 enables EMAC 110 to operate at one of three different speeds, such as 10, 100, or 1000 megabits per second, or at another high data rate for “overclocking.”

It should be understood that in contrast to an EMAC instantiated in configurable logic, with the embedding of EMAC 110 as dedicated circuitry there is an FPGA fabric 101/EMAC 110 boundary with which to contend. Notably, this boundary is different than interfacing to an embedded processor, which conventionally has a general purpose interface, as an embedded EMAC 110 has special purpose interfacing. However, as described below, EMAC 110 interfacing is configured in part to provide a general purpose communication client-side interface to FPGA fabric 101. For dynamically reconfigurable logic, such general purpose communication client-side interface facilitates coupling different user-design instantiations without redesign thereof to accommodate EMAC 110. In other words, the interface between EMAC 110 and a user-defined instantiated design in FPGA fabric 101 may be independent of one another.

Implementation of an embedded EMAC core 123 in processor block 102 facilitates use of processor 103 as a host processor. To accomplish this, a host interface 112, including a DCR bridge 113 and supporting logic, is provided. Notably, DCR bridge 113 may be external to host interface 112. In addition, host interface 112 allows for a host processor, embedded in or external to FPGA 100, to manage EMAC configuration registers using a host bus 118 supported by EMAC 110. This usage is in contrast to use of processor 103 via a DCR bridge 113 and DCR bus 114.

When DCR bus 114 is used as a host bus, DCR bridge 113 translates commands carried over DCR bus 114 into EMAC host bus signals. These signals are then input into at least one of EMAC 110 and 111. In an exemplary implementation, DCR bridge 113 includes four device control registers, two of which are used as data registers, such as respective 32 bit wide data registers. Another is used as a control register. The fourth device control register is used as a ready status register. A host processor, such as processor 103, polls this fourth device control register to determine access completion status. Bits in this fourth device control register are asserted when there is no access in progress. When an access is in progress, a bit corresponding to the type of access is automatically de-asserted. This bit is automatically re-asserted when the access is complete.
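The polling behavior of the ready status register can be modeled in software. The following is a toy behavioral sketch, not the register map of DCR bridge 113: the access type names, the bit positions, and the cycle counts are hypothetical and chosen only to show a bit being de-asserted while an access is in progress and re-asserted on completion.

```python
class ReadyStatusModel:
    """Toy model of a ready status register: one bit per access type,
    asserted (1) when no access of that type is in progress."""

    def __init__(self, access_types):
        self._bit = {name: i for i, name in enumerate(access_types)}
        self.value = (1 << len(access_types)) - 1  # all idle at start
        self._remaining = {}

    def start_access(self, name, cycles):
        """Begin an access; its ready bit is de-asserted."""
        self.value &= ~(1 << self._bit[name])
        self._remaining[name] = cycles

    def tick(self):
        """Advance one cycle; re-assert bits of completed accesses."""
        for name in list(self._remaining):
            self._remaining[name] -= 1
            if self._remaining[name] <= 0:
                self.value |= 1 << self._bit[name]
                del self._remaining[name]

    def is_ready(self, name):
        return bool((self.value >> self._bit[name]) & 1)


def poll_until_ready(model, name, max_polls=100):
    """Poll the ready bit, as a host processor might; return poll count."""
    for polls in range(max_polls):
        if model.is_ready(name):
            return polls
        model.tick()
    raise TimeoutError(f"access {name!r} did not complete")


if __name__ == "__main__":
    status = ReadyStatusModel(["config_read", "config_write"])
    status.start_access("config_read", cycles=3)
    print(poll_until_ready(status, "config_read"), bin(status.value))
```

The model keeps a separate bit for each access type so that one access in progress does not hide the idle state of the others, mirroring the per-type bits described above.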

Alternatively, host interface 112 may provide an interrupt request to inform a host processor, such as processor 103 or an external host processor 10, of an access completion. A user may select to use either polling or interrupting to inform a host processor of access status.

Notably, transmit client interface 127 and receive client interface 128 each may operate in respective clock domains. A processor, such as processor 103, associated with host interface 112 may operate in a separate clock domain as well. Notably, the term clock domain is not meant to imply that the frequencies are different, as they may be the same or different frequencies, but rather that clocking may be asynchronous with respect to separate clock domains.

Soft EMAC

FIG. 2A is a high-level block diagram depicting an exemplary embodiment of a “soft” EMAC (“EMAC_top”) 204S. EMAC_top 204S and interface logic 202S are a program core that may be instantiated in configurable logic of FPGA 100. Notably, EMAC_top 104 of FIG. 1 may be designed using a hardware description language, such as VHDL or Verilog, among others. Accordingly, EMAC_top 104 of FIG. 1 may be an ASIC conversion of hardware description language code, whereas EMAC_top 204S may be an FPGA program code conversion of part of such hardware description language code. Thus, EMAC_top 104 and interfaces thereto of FIG. 1 may be provided in part as a design listing for subsequent instantiation in configurable logic of a programmable logic device. However, EMAC_top 204S is not merely EMAC_top 104 in instantiated rather than embedded form, because clock generator 204, in contrast to clock generator 124, is external to the EMAC.

Logic interface 201 couples EMAC_top 204S to interface logic 202S. Interface logic 202S is instantiated in configurable logic of FPGA fabric 101. Interface logic 202S is a program core for instantiating interfaces, such as configurable logic versions of client interface 117 and physical layer interface 119 of FIG. 1, in configurable logic. In an exemplary implementation, logic 202S may be a relatively fast FIFO for holding transmit and receive data packets from a client interface of EMAC_top 204S.

With simultaneous reference to FIGS. 1, 1A and 2A, it should be understood that because clock generator 124 is inside EMAC core 123, there is an unknown clock-buffering propagation delay when the clock goes through a design instantiated in FPGA fabric 101. Thus, an unknown propagation delay of a clock signal going from an ASIC EMAC to a user instantiated design in FPGA fabric 101 may be handled by sending a clock out of such EMAC, buffering it, and sending it back into such EMAC; for EMAC core 123, a clock interface is provided as described below in additional detail. In contrast, when a clock generator 204 is instantiated in FPGA fabric 101, such as along with a user design, a known clocking relationship exists by using FPGA fabric 101 clock network 205 resources, so such clock signal need not go into and out of an EMAC instantiated in configurable logic.

Clock Interface

A clock network may introduce clock skew. With respect to a clock network in FPGA fabric 101, such skew may be unknown. Handling skew from clock signal distribution in FPGA fabric 101 is described below in additional detail.

For an EMAC system instantiated in configurable logic, all of the logic for EMAC_top 204S and interface logic 202S may be in FPGA fabric 101. Clock signals are routed to EMAC_top 204S and interface logic 202S with FPGA clock networks, such as clock network 205. As a result, due to known clock buffering and routings in FPGA 100, the clock skews between EMAC_top 204S and interface logic 202S may be controlled within a tolerance range.

However, in EMAC 110 of FIG. 1, a clock network includes a balanced clock tree with known and fixed delays throughout. In contrast, a clock network in FPGA fabric includes a clock driver and clock network routings. The delay of a clock signal through an FPGA fabric clock network is dependent on FPGA design implementation. As a result, there is an uncontrolled amount of clock skew between EMAC ASIC logic and FPGA configurable logic at the “ASIC-FPGA” interface. Furthermore, in an exemplary implementation, EMAC 110, including clock generator 124, is implemented in processor block 102 with standard cells, and the same standard cells are not used to implement configurable logic in FPGA fabric 101. In other words, clock tree routing in FPGA fabric 101 and processor block 102 are different.

By feeding back an FPGA fabric clock into EMAC 110 to account for design-specific clock delay in a user-instantiated design in FPGA configurable logic, and by using a delay cell in the input datapath in EMAC 110 as described herein, clock skew introduced by an EMAC clock tree may be compensated.

For purposes of clarity, only ASIC versions of EMAC_top 104 embedded in an FPGA 100 are described hereinafter, as a configurable logic instantiated version of EMAC_top 104 will be apparent from description of an ASIC version thereof.

FIG. 2B is a high-level block diagram depicting an exemplary embodiment of an FPGA 100 having an embedded EMAC system. More particularly shown is EMAC core 123 having a clock generator 124 and a clock-output tree 210. A clock signal provided from clock generator 124 is sent to clock-output tree 210 and separately to a clock tree 213, which is generally shown as a global buffer (“BUFG”) driver. Thus, the clock signal output from clock generator 124 may be provided external to processor block 102 but internal to FPGA 100 where it is driven by BUFG driver 213.

A clock signal output from BUFG driver 213 is routed through a clock network provided for in conventional FPGA clock routing, as previously described. However, in the ASIC implementation of EMAC core 123, clock-output tree 210 is used to drive a clock signal output from clock generator 124.

It should be appreciated that the output from clock-output tree 210 and the output from BUFG driver 213 may be skewed with respect to one another because a design implemented in FPGA fabric external to processor block 102 is user-dependent, and thus the amount of clock loading may not be known in advance when implementing clock-output tree 210. Though FIG. 2B only shows EMAC 110, the above description applies equally to EMAC 111.

Accordingly, when a clock signal goes through a clock network in FPGA fabric responsive to a user-instantiated design, generally indicated as BUFG driver 213, there will be some clock loading. Thus, output signal 231 may be out of phase with respect to client clock input signal 220.

Referring to FIG. 2D, there is shown an exemplary embodiment of signal timing for signals of FIG. 2C. With simultaneous reference to FIGS. 2C and 2D, signals of FIG. 2C are further described. A phase difference 233 between signals 220 and 231 will be less than the total delay of a delay cell of EMAC core 123 used to compensate for this known phase difference 233. The total delay of the delay cell is slightly larger than the total clock tree delay 233 due at least in part to setup time, which should be taken into account.

Client clock input signal 220 is used to clock flip-flops 226 and 225. Clock output signal 231 from clock-input tree 211 is used to clock flip-flops 229 and 224 of EMAC core 123. Data output of flip-flop 229, namely output signal 228, thus will be active for a period equivalent to a period of clock signal 231. Output of flip-flop 225, namely EMAC core input signal 227, will have an active (“high”) time equivalent to a period of clock signal 220.

EMAC core input signal 227 is provided to a buffer 223 of EMAC core 123 to provide a delay. Output signal 230 is a compensated delay output signal which may be used as a data input to flip-flop 224 driven by clock signal 231 to provide a data output 232 for clocking EMAC 110.

By feeding back FPGA fabric clock network loaded clock signal 220 to drive clock-input tree 211 of ASIC EMAC core 123, clock skew between ASIC EMAC core 123 and FPGA fabric 101 due to clock network loading of client clock input signal 220 may be taken into account. Output signal 231 of ASIC input-clock tree 211 is skewed by a known clock tree delay in an ASIC implementation. This clock skew is compensated by one or more delay cells, such as one or more buffers 223, for instances of data inputs from FPGA fabric 101 to EMAC core 123, such as via register (“flip-flop”) 225 clocked by FPGA fabric 101 loaded clock signal 220. For instances of data outputs from EMAC core 123 to FPGA fabric 101, such as via register 226 clocked responsive to loaded clock signal 220, output data 228 may be captured by register 226. Notably, registers 224 and 229 are merely representative of circuits in EMAC 110, and are not the actual circuits, which are described below in additional detail. Accordingly, registers 224 and 229 are generally representative of how the EMAC clock domain may work. Moreover, registers 226 and 225 represent a user-instantiated design, and thus may not be the actual circuits implemented by a user.
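By way of illustration only, the compensation described above can be sketched as a simple timing check. The sketch assumes that data launched on an edge of clock signal 220 passes through the delay cell and is captured one period later by a register clocked by signal 231, which lags signal 220 by the clock tree delay. The function name, the setup and hold parameters, and all numeric values are hypothetical and are not taken from this description.

```python
def capture_is_safe(tree_delay_ns, delay_cell_ns, period_ns,
                    setup_ns=0.0, hold_ns=0.0):
    """Check a delayed data path against a skewed capture clock.

    tree_delay_ns: delay of the ASIC clock-input tree (skew of 231 vs. 220).
    delay_cell_ns: delay inserted in the data path by the delay cell.
    period_ns:     clock period.
    setup_ns/hold_ns: assumed requirements of the capturing register.
    """
    # Data must not arrive before the skewed capture edge of the same cycle
    # has safely passed, so the delay cell must exceed the tree delay.
    hold_ok = delay_cell_ns >= tree_delay_ns + hold_ns
    # Data must still settle before the next skewed capture edge.
    setup_ok = delay_cell_ns + setup_ns <= tree_delay_ns + period_ns
    return hold_ok and setup_ok
```

With these hypothetical numbers, a delay cell of 1.2 ns covers a 1.0 ns tree delay at an 8 ns period, whereas a 0.5 ns delay cell does not.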

Embedded (“hardwired”) EMACs implemented in standard cells generally have more than twice the performance of supporting logic implemented in FPGA fabric 101. To take advantage of this increase in performance, in an exemplary implementation, client interface 117 at the ASIC-FPGA fabric boundary is configured to run EMAC 110 at approximately twice the clock frequency of the supporting logic programmed in configurable logic. To maintain throughput, client interface 117 datapath width is doubled to compensate for the supporting logic running at half the clock frequency. Client interface 117 also allows EMAC 110 to run at the same clock frequency as the supporting logic in the FPGA by using half the allocated datapath width.
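The throughput argument above reduces to width multiplied by clock frequency. The following sketch illustrates it under the assumption of an 8-bit EMAC-side datapath at 250 MHz and a 16-bit client-side datapath at 125 MHz; these widths and frequencies are example values chosen for illustration, and the function name is hypothetical.

```python
def throughput_mbps(width_bits, clock_mhz):
    """Raw datapath throughput in Mb/s: bits per cycle times cycles per microsecond."""
    return width_bits * clock_mhz

# Assumed example: EMAC side runs at twice the fabric clock with half the width.
emac_side = throughput_mbps(8, 250.0)     # 8 bits at 250 MHz
fabric_side = throughput_mbps(16, 125.0)  # 16 bits at 125 MHz
```

Both sides carry the same number of bits per unit time, which is why doubling the client interface width compensates for halving the client clock frequency.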

Responsive to EMAC 110 operating in an “overclocking” mode, such as a 16-bit overclocking mode, MII transmit clock input (“MII_TX_CLK”) signal 267 is not used; hence, the input clock pin for MII_TX_CLK signal 267 may be used to bring in a divided by two clock signal from a DCM, as described below in additional detail with reference to FIG. 3.

With continuing reference to FIG. 2E, clock signal 222 is provided to a counter 240, multiplexer 247, multiplexer 248, and multiplexer 251. Counter 240, as well as counter 241, may be Johnson counters for tracking a logic one. For purposes of clarity by way of example, it will be assumed that clock signal 222 has a frequency of approximately 125 MHz. All frequencies provided herein below are approximate, and actual frequency used depends upon implementation.

Counter 240 provides a divided by 5 clock signal (“CLK_25 MHz”) 256 at 25 MHz and a divided by 10 clock signal (“CLK_12_5 MHz”) 257 at 12.5 MHz. Counter 240 provides signal CLK_25 MHz 256 to a multiplexer 242 as input. CLK_25 MHz 256 is provided to a logic 1 input (“input I1”) of multiplexer 242. By a logic 1 (“logic high”) input of a multiplexer, it is meant that to select that input for output from the multiplexer, a control signal will be a logic 1. Counter 240 provides signal CLK_12_5 MHz 257 to a counter 241 and to a multiplexer 243 as input. CLK_12_5 MHz 257 is provided to a logic 1 (“logic high”) input (“input I1”) of multiplexer 243.

Clock signal 222 is provided to a logic 1 input of multiplexers 247, 248, and 251. Tx_GMII_MII_Clk_IN signal 265 is provided to a logic 0 input of multiplexer 251.

Counter 241 provides a divided by 5 clock (“CLK_2_5 MHz”) signal 258 at 2.5 MHz and a divided by 10 clock (“CLK_1_25 MHz”) signal 259 at 1.25 MHz. Counter 241 provides CLK_2_5 MHz signal 258 to a logic low input of multiplexer 242. Counter 241 provides CLK_1_25 MHz signal 259 to a logic 0 (“logic low”) input of multiplexer 243 as input. By a logic 0 (“logic low”) input of a multiplexer, it is meant that to select that input for output from the multiplexer, a control signal will be a logic 0.
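The divider chain formed by counters 240 and 241 can be checked arithmetically. The sketch below assumes the approximate 125 MHz input frequency used in this description; the function names, dictionary keys, and the `select` argument are hypothetical, since the control signal of multiplexers 242 and 243 is not named in this passage.

```python
def divider_chain(clk_mhz=125.0):
    """Frequencies produced by the two cascaded counters, in MHz."""
    clk_25 = clk_mhz / 5       # counter 240, divided by 5
    clk_12_5 = clk_mhz / 10    # counter 240, divided by 10
    clk_2_5 = clk_12_5 / 5     # counter 241 is fed by the 12.5 MHz clock
    clk_1_25 = clk_12_5 / 10
    return {"clk_25": clk_25, "clk_12_5": clk_12_5,
            "clk_2_5": clk_2_5, "clk_1_25": clk_1_25}

def mux2(select, i1, i0):
    """Two-input multiplexer: a logic 1 control selects I1, a logic 0 selects I0."""
    return i1 if select else i0

chain = divider_chain()
# Multiplexer 242 chooses between 25 MHz (I1) and 2.5 MHz (I0).
mux_242_high = mux2(1, chain["clk_25"], chain["clk_2_5"])
mux_242_low = mux2(0, chain["clk_25"], chain["clk_2_5"])
```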

A speed select (“SPEED_IS_1000”) signal 255 is provided to multiplexer 247 and to multiplexer 248 as control input. Responsive to speed select signal 255 being a logic 1, multiplexers 247 and 248 both select clock signal 222 for output. Responsive to speed select signal 255 being a logic 0, multiplexer 247 selects INT_MII_TX_CLK signal 278 for output and multiplexer 248 selects TX_CORE_CLK_SRGMII signal 266 for output. Alternatively, speed select signal 255 may be referred to as a SPEED_IS_10_100 with logic high and low inputs of multiplexers 247 and 248 reversed in FIG. 2E.

An overclocking mode select signal 270 is provided to multiplexer 250 as a control input. Responsive to overclocking mode select signal 270 being a logic 1, multiplexer 250 selects an MII transmit clock (“MII_TX_Clk”) signal 267 for output therefrom. Responsive to overclocking mode select signal 270 being a logic 0, multiplexer 250 selects a logic 0, tied to a logic low input of multiplexer 250, for output. Output of multiplexer 250 is a divided by two transmit client clock (“TX_CLIENT_DIV2_CLK”) signal 272, which may be disabled by selecting an input of multiplexer 250 tied to ground.

Notably, in an implementation of EMAC 110, EMAC 110 is a tri-mode MAC, namely, frequency of operation on the medium may be switched among approximately 1000, 100, and 10 Mb/s. This translates into switching the system clock. Host interface 112 handles this switching. To control switching of clocks to avoid creating an unwanted pulse, clocks are only switched during a low period of the clocks. For this switching in an exemplary implementation, multiplexers 242, 243, 245, 246, 247, 248, 250, and 251 may be what are known as on-chip global buffer multiplexers, an example of which is described with reference to FIG. 2F-1.

FIG. 2F-1 is a schematic diagram depicting an exemplary embodiment of an on-chip global buffer multiplexer 99. A select signal 21 is provided to inverter 11 and to an input of an AND gate 17. Inverter 11 provides an inverted version of signal 21 to an input of an AND gate 14. Another input to AND gate 14 is provided by an inverter 12. Another input to AND gate 17 is provided by an inverter 13.

AND gate 14 provides an input data A signal (“dataA_in”) 22 as output. AND gate 17 provides an input data B signal (“dataB_in”) 25 as output. Input data A signal 22 is provided to a data input of a register 15. Input data B signal 25 is provided to a data input of a register 18.

A clock A signal (“clockA”) 23 is provided to a clock input of register 15 and to an input of an AND gate 16. A clock B signal (“clockB”) 26 is provided to a clock input of register 18 and to an input of an AND gate 19. Register 15 provides a register A signal (“Areg”) 24 as output. Register 18 provides a register B signal (“Breg”) 27 as output. Areg signal 24 is provided to another input of AND gate 16. Breg signal 27 is provided to another input of AND gate 19.
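The following behavioral sketch illustrates one common way a multiplexer of this kind avoids unwanted pulses. Several details are assumptions not stated in this description: that inverters 12 and 13 are driven by the opposite register's output (Breg and Areg, respectively), that registers 15 and 18 update on the falling edge of their clocks, and that the outputs of AND gates 16 and 19 are ORed to form the multiplexer output. The function names, tick-based time model, and half-period values are hypothetical.

```python
def square_wave(t, half_period):
    """Clock that starts low, toggling every half_period ticks."""
    return (t // half_period) % 2

def simulate_clock_mux(select_at, half_a, half_b, ticks):
    """Return a list of (clk_out, areg, breg) per tick.

    select_at(t) returns 0 to request clock A or 1 to request clock B.
    """
    areg, breg = 1, 0                      # assume clock A is initially selected
    prev_a = square_wave(0, half_a)
    prev_b = square_wave(0, half_b)
    trace = []
    for t in range(ticks):
        clk_a = square_wave(t, half_a)
        clk_b = square_wave(t, half_b)
        sel = select_at(t)
        # AND gates 14 and 17: each side is enabled only when requested and
        # when the other side's register has already released (assumed wiring).
        data_a = (1 - sel) & (1 - breg)
        data_b = sel & (1 - areg)
        # Registers 15 and 18 update on a falling edge (assumed), so an
        # enable only changes while its own clock is low.
        if prev_a == 1 and clk_a == 0:
            areg = data_a
        if prev_b == 1 and clk_b == 0:
            breg = data_b
        # AND gates 16 and 19 gate each clock; the OR is an assumed output stage.
        trace.append(((clk_a & areg) | (clk_b & breg), areg, breg))
        prev_a, prev_b = clk_a, clk_b
    return trace

# Example: switch from clock A (half period 2) to clock B (half period 5) at tick 13.
trace = simulate_clock_mux(lambda t: 1 if t >= 13 else 0, 2, 5, 60)
```

In this model the output follows clock A until clock A next falls after the request, stays low until clock B next falls, and then follows clock B, so no shortened pulse is produced.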

FIG. 2F-2 is a schematic diagram depicting an exemplary embodiment of a divider circuit 98, which may be used in an implementation for divider 244 of FIG. 2E. A clock signal 41 is provided to a clock input of a register 32. A data input to register 32 is provided by an inverter 31. Register 32 provides a register output signal (“Reg1_Out”) 42 as output. Register output signal 42 is provided to inverter 31 and buffer 33 as input. Buffer 33 provides a divided by 2 clock signal (“CLK_DIV2”) 43 as output of divider 98.
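A register whose data input is its own inverted output toggles once per input clock edge, which halves the frequency. The sketch below models that behavior; the rising-edge trigger, the initial register value, and the function name are assumptions made for illustration.

```python
def divide_by_two(num_rising_edges, initial=0):
    """Value of the register output after each rising edge of the input clock."""
    q = initial
    outputs = []
    for _ in range(num_rising_edges):
        q = 1 - q          # the inverter feeds the complemented output back to D
        outputs.append(q)
    return outputs
```

The output completes one full cycle for every two input clock edges, which corresponds to a divided by 2 clock.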

FIG. 2G is a block/schematic diagram depicting an exemplary embodiment of a receive clock generator 124R. Receive clock generator 124R is a part of clock generator 124. Receive clock generator 124R includes multiplexers 285, 286, 287 and 288, divider 284, buffers 283 and 291, and OR gate 281. In an exemplary implementation, multiplexers 285, 286, 287, and 288 are on-chip global buffer multiplexers, an example of which is illustratively shown in FIG. 2F-1, and divider 284 may be a divider as illustratively shown in FIG. 2F-2.

Overclocking mode signal 270 is provided as an input to an OR gate 281. PCS/PMA mode signal 271 is provided as another input to OR gate 281. OR gate 281 outputs ORed overclocking mode signal 270 and PCS/PMA mode signal 271 as a mode select (“OVERCLOCKIN_OR_PCS_PMA”) signal 296. Select signal 296 is provided to a multiplexer 288 and to a multiplexer 286 as control input.

Accordingly, it should be appreciated that EMAC core 123 includes a clock generator 124 from which a clock signal is generated and a version of which generated clock signal is fed back to EMAC core 123 to account for clock signal distribution through FPGA fabric 101. Secondly, it should be appreciated that any of several modes, such as MII, GMII, SGMII, RGMII, and 1000BASE-X PCS/PMA, may be used where transmit clock generator 124T and a receive clock generator 124R portions of clock generator 124 are used for providing clock signals for transmission and reception for communicating via a network. Furthermore, clock signals for a PCS/PMA sublayer mode or an overclocking mode may be selected. Along these lines, clock generator 124 provides both EMAC core and client interface clock signals.

In an implementation, when EMAC 110 is configured for tri-mode operation or non-tri-mode operation, transmit clock speed is approximately 2.5, 25, and 125 MHz for 10, 100 and 1000 Mb/s approximate data rates, respectively. In an implementation, when EMAC 110 is configured for tri-mode operation or non-tri-mode operation, receive clock speed is approximately 2.5, 25, and 125 MHz for 10, 100 and 1000 Mb/s approximate data rates, respectively. It should be understood that embedded EMAC 110 may be capable of operating at a faster frequency than FPGA fabric 101.
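The clock speeds listed above are consistent with the nominal data rates if the datapath width at each speed is taken into account. The sketch below assumes a 4-bit datapath at 10 and 100 Mb/s (as in MII) and an 8-bit datapath at 1000 Mb/s (as in GMII); those widths come from the standard interfaces rather than from this passage, and the names used are hypothetical.

```python
# Approximate clock frequency in MHz for each data rate in Mb/s (from the text).
CLOCK_MHZ_BY_RATE = {10: 2.5, 100: 25.0, 1000: 125.0}

# Assumed datapath width in bits at each rate (MII nibble, GMII byte).
WIDTH_BITS_BY_RATE = {10: 4, 100: 4, 1000: 8}

def implied_rate_mbps(rate):
    """Data rate implied by clock frequency times datapath width."""
    return CLOCK_MHZ_BY_RATE[rate] * WIDTH_BITS_BY_RATE[rate]
```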

FIG. 3 is a block/schematic diagram depicting an exemplary embodiment of FPGA 100 configured for an overclocking mode. In this exemplary embodiment, a digital clock manager (“DCM”) 308 is coupled to clock-input trees 211 and 301 of EMAC core 123 and is configured to provide a divide by two clock signal 305. Clock signal 221 output from clock generator 124 is input to DCM 308. Output from DCM 308 is a divided by two clock signal 305 and an undivided or 1× clock signal 304 with respect to divided by two clock signal 305.

Responsive to EMAC 110 being in an overclocking mode, such as a 16-bit overclocking mode, DCM 308 in FPGA 100 is used to provide divided by two clock signal 305. Because a DCM is used, 1× clock signal 304 and divided by two clock signal 305 are phase aligned at the output of DCM 308. Clock signal 305 of FIG. 3 may be handled as was clock signal 221, described with reference to FIG. 2C, as the same principle in the above solution applies to the divided by two clock skew.

Clock signal 304 may be input to buffer 302, and clock signal 305 may be provided to buffer 303. Output of buffer 302 may be fed back as an input to DCM 308 and may be provided as clock signal 220 to clock-input tree 211. Output of buffer 303 may be provided as clock signal 306 to FPGA fabric 101 and to clock-input tree 301. Notably, separate clock trees may be used for handling clocks of different frequencies, for example where clock signal 304 is greater than or equal to 250 MHz and clock signal 305 is greater than or equal to 125 MHz. Recall that, for this example, clock signal 304 is twice the frequency of clock signal 305.

FIG. 3A is a simplified block diagram depicting an exemplary embodiment of clock management for an RGMII. Though an RGMII example is used, it should be understood that an MII or GMII may be used, depending on which mode of these three MII modes is selected. However, for compliance with an interface protocol, frequency of the output signal may be specified, as described below. EMAC 110 provides a client output transmit clock 221T for RGMII logic 106R and a buffered client input transmit clock 220T may be received. Clock output signal 2002 and clock input signal 2003, such as respective RGMII transmit and receive clock signals, may be any of a variety of frequencies as described in additional detail below herein. Client EMAC transmit and receive input and output clocks 2004 through 2007 may be provided to a user design 2001 instantiated in programmable logic. For MII, clock frequencies for clock signals 2002 and 2003 are likewise selectable. However, for a GMII, while clock frequency of clock signal 2003 is selectable, frequency of clock signal 2002 is set to that called out in the GMII specification, such as 125 MHz, for a physical layer interface.

Host Interface

With renewed reference to FIG. 1, in embedded EMAC top 104, a host bus 118 is configured for backward compatibility with a soft EMAC core host interface. This backward compatibility allows users who have been using the soft EMAC core to use an embedded EMAC 110 without having to redesign the host interface of the soft EMAC core, thereby facilitating user migration.

FIG. 4-1 is a high-level block/schematic diagram depicting an exemplary embodiment of a host interface 112. Host bus 118 allows for a host processor to be located in FPGA 100 or external to FPGA 100. In addition, in a PowerPC 405 (“PPC405”) processor core implementation, a DCR bridge 113 is implemented internal to host interface 112, so that PPC405 processor 103 residing in processor block 102 can act as a host in managing EMAC 110 configuration registers via DCR bus 114. Implementing DCR bridge 113 in processor block 102 with area-efficient standard cells facilitates making available configurable logic resources in FPGA 100 for customer applications. In addition, DCR bridge 113 in processor block 102 provides an efficient way for processor 103 in processor block 102 to act as a host processor to access host registers in EMAC core 123 through DCR bus 114. Notably, DCR bridge 113 may be internal or external to host interface 112.

In addition, a PPC405 implementation of processor 103, using DCR bridge 113, can read statistics registers implemented in FPGA fabric 101. When DCR bus 114 is not used, host interface 112 allows a user to manage EMAC host registers via a host bus 118. Additionally, host interface 112 includes logic for processor 103 to read, via DCR bus 114 or host bus 118, statistics registers, such as may be implemented in configurable logic for accumulation of statistics, located in FPGA fabric 101.

An input signal 406 to processor block 102 called “dcremacenable” is used to select the host bus type to use. Dcremacenable signal 406 is asserted to select DCR bus 114 for use as a host bus, and deasserted to select host bus 118 for use as a host bus. Dcremacenable signal 406 may be provided via a tie-off pin that can be tied to a logic value (high or low) when FPGA 100 is configured. Notably, it should be understood that if an embedded processor other than a PPC405 core were implemented, then DCR bridge 113 and DCR bus 114 may be replaced with a bridge or hub and associated busing thereof for the type of processor embedded. For example, a Northbridge may be used for interfacing to an embedded Pentium processor from Intel of Santa Clara, Calif. Furthermore, no embedded processor may be present in FPGA 100, as processor 103 is not necessary for operation of EMAC 110. Tie-off pins are provided with FPGA 100, such that a user may set values to avoid the need for a processor. Tie-off pins may be used to configure FPGA 100 as a network device, such as a router, bridge, hub, and the like for example.

Furthermore, processor 103 may be used as a host processor and host bus 118 may be used in addition thereto. For example, there may be peripheral functions to be associated with EMAC 110 which peripheral functions could be instantiated in configurable logic of FPGA 100. If such peripheral functions employ register access of EMAC 110, such register access may be had via host bus 118. An example of such a peripheral function would be processing of statistics on network transmission. Another example of such a peripheral function would be address filtering in addition to that already provided with EMAC 110.

Host bus 118 is used to provide signals 414 and 409 to host interface 112 and to receive signal 413 from host interface 112. Dcremacenable signal 406 may be provided as a select signal to multiplexers 401 and 402. Input to logic high inputs of multiplexers 401 and 402 may include DCR selection information as between selecting one or both of EMACs 110 and 111. Outputs from multiplexers 401 and 402 may be buffered via AND gates (“buffers”) 404 and 405 for providing to EMAC 110 and 111, respectively. However, only one EMAC 110 or EMAC 111 may communicate with a host device at a time, and thus outputs from EMAC 110 and 111 may be provided to multiplexer 403 for communicating via host bus 118. A select signal provided to multiplexer 403 may originate from the output of multiplexer 402.

FIG. 4-2 is a block/schematic diagram depicting an exemplary embodiment of host interface 112. Table 1 lists signal sets for FIG. 4-2. For purposes of clarity by way of example, bit lengths for an implementation are provided; however, the particular bit lengths need not be implemented, as other bit lengths may be used. Moreover, logic equations are described in Verilog Register Transfer Level (“RTL”).

TABLE 1

Signal Set   Signals
(1)          dcrClk, dcrABus[8:9], dcrWrite, dcrRead, dcrWrDBus[0:31], dcrAck, dcrRdDBus[0:31]
(2)          dcr_hostAddr[9:0], dcr_hostOpCode[1:0], dcr_hostMIIMsel, dcr_hostReq, dcr_hostWrData[31:0], dcr_AddrFilRd, dcr_AddrFilWr, dcr_AFcamWr, dcr_AFcamRd
(3)          HOST_ADDR[9:0], HOST_MIIM_SEL, HOST_OPCODE[1:0]
(4)          hostAddr[9:0], hostOpcode[1:0], hostMIIMsel, hostReq, hostWrData[31:0], hostAddrFilRd, hostAddrFilWr, hostAFcamRd
(5)          HOST_ADDRe0[9:0], HOST_OPCODEe0[1:0], HOST_MIIM_SELe0, HOST_REQe0, HOST_WR_DATAe0[31:0], HOST_AddrFilRdE0, HOST_AddrFilWrE0, host_AFcamRdE0
(6)          HOST_ADDRe1[9:0], HOST_OPCODEe1[1:0], HOST_MIIM_SELe1, HOST_REQe1, HOST_WR_DATAe1[31:0], HOST_AddrFilRdE1, HOST_AddrFilWrE1, host_AFcamRdE1
(7)          hostAddr[9:0], hostReq, hostMIIMsel, hostOpcode[1:0]
(8)          AFcfgRdEn, AFcfgWrEn, AFcfgCAMrdEn
(9)          {16'h0000, dcr_hostReq, dcr_hostOpcode[1:0], 2'b00, dcr_emac1Sel, dcr_hostAddr[9:0]}
(10)         (dcr_StatsRdEn & dcremacenable)

Host interface 112 uses two clock signals, namely, a DCR clock (“dcrClk”) signal 516 (shown in FIG. 4-5A) and a host clock (“HOST_CLK”) signal 440. The dcrClk signal 516 runs at the same clock frequency as the system clock for processor 103. DCR bridge 113 uses both dcrClk signal 516 and HOST_CLK signal 440. HOST_CLK signal 440 comes from a host device coupled to host bus 118 and is part of signal set (3). Signals 414 include signal set (3), HOST_CLK signal 440, a host request signal, and a host write data signal 438. HOST_CLK 440 is used to interface to host registers in EMAC core 123.

Signal set (1) is provided via DCR bus 114 to and from DCR bridge 113. From signal set (1), it should be understood that DCR bus 114 contains only two least significant address bits. This is because a central DCR address decoding unit is implemented in processor block 102 and DCR bridge 113 uses only four DCR registers in this exemplary implementation. The central DCR address decoding unit decodes the DCR address bus (“dcrABus[0:7]”) signal from processor 103 and in conjunction with DCR read and DCR write signals generates DCR write or DCR read signals if the address is targeted to DCR bridge 113.
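The two-level address decoding described above can be sketched as follows. The sketch assumes a 10-bit DCR address in which dcrABus[0:7] are the upper eight bits and dcrABus[8:9] are the two least significant bits (bit 0 being the most significant). The base value assigned to the bridge is hypothetical, as is the function name.

```python
# Hypothetical value of dcrABus[0:7] that the central decoder maps to the bridge.
BRIDGE_BASE_UPPER = 0x12

def decode_dcr_address(addr10):
    """Split a 10-bit DCR address into a bridge hit flag and a register index."""
    upper = (addr10 >> 2) & 0xFF   # dcrABus[0:7], decoded centrally
    reg_index = addr10 & 0x3       # dcrABus[8:9], selects one of four registers
    hit = upper == BRIDGE_BASE_UPPER
    return hit, reg_index
```

Because only four registers are addressed in the bridge, two address bits are sufficient once the central decoder has qualified the access.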

DCR bridge 113 converts the DCR commands in a dcrClk domain into host bus signals in a HOST_CLK domain for output, namely, DCR bridge output signals are dcr_emac1Sel 411 and signal set (2), generally referred to as signals 412. Dcremacenable signal 406 is provided as a control select input to multiplexers 401 and 402. Dcremacenable signal 406 is used to select which host bus to use, namely, either host bus 118 or DCR bus 114. The selected host bus signals are emac1Sel 411 and signal set (4), generally indicated as signals 412, namely, the outputs of multiplexers 402 and 401, respectively. Input to multiplexer 401 is signal set (2), which is also provided to bus 443. Other signals input to multiplexer 401 are signals 414. Input to multiplexer 402 is dcr_emac1sel signal 411 and Host_emac1Sel signal 409. Notably, there is a one-to-one correspondence of same signal inputs between inputs to multiplexers 401 and 402 from DCR bridge 113 and host interface 118.

Responsive to emac1Sel signal 400 being a logic 1, host bus signals 410 are directed to EMAC 111 and directed to EMAC 110 responsive to emac1Sel signal 400 being a logic 0. In an exemplary implementation, signal emac1Sel 400 may be address bit [10] of host bus 118. Output 410 from multiplexer 401 is provided as input to buffers 404 and 405. Output 400 from multiplexer 402 is provided as input to buffer 405 and logic block 429, and inverted then provided as input to buffer 404. Signal set (5) is host bus signals output from buffer 404 to EMAC 110, and signal set (6) is host bus signals output from buffer 405 to EMAC 111. Host bus 118 may be coupled to host interface logic, which logic is described in additional detail in the above-referenced soft EMAC core.
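The steering of host accesses between the two EMACs can be illustrated with a short sketch. It assumes an 11-bit host address in which bit [10] serves as emac1Sel and bits [9:0] form the register address, following the exemplary implementation mentioned above; the function name and the string labels are hypothetical.

```python
def steer_host_access(host_addr11):
    """Return (target EMAC, 10-bit register address) for an 11-bit host address."""
    emac1_sel = (host_addr11 >> 10) & 1   # address bit [10] used as emac1Sel
    reg_addr = host_addr11 & 0x3FF        # address bits [9:0]
    # Logic 1 gates the signals toward EMAC 111; logic 0 gates them toward EMAC 110.
    target = "EMAC 111" if emac1_sel else "EMAC 110"
    return target, reg_addr
```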

Logic block 421 contains address decoding for address filter host registers read and write enable and address filter content addressable memory (“CAM”) read enable. Notably, though the term CAM is used herein, an actual CAM may or may not be implemented. Storage for multicast addresses may be in the form of registers for example, namely, multicast address registers (“MARs”). Accordingly, the terms CAM and MAR should be considered interchangeable.

Signal set (3) includes inputs and signal set (8) includes outputs of logic block 421. Thus, only a portion of signals 414 are provided to logic block 421. The address filter CAM write signal is the same signal as the host registers write signal, but the address filter CAM read signal uses a separate signal, AFcfgCAMrdEn of signals 439, because the CAM read is an added function to Address Filter read logic 422. The address decode and read enable or write enable signals for host address bus 118 are provided via host interface 112 because DCR bridge 113 generates those read enable or write enable signals, and symmetry is used for signals between DCR bridge 113 and host bus 118 signals.

Below is a code listing for an exemplary embodiment of address decode logic equations for address filter host registers read or write and CAM read enable for logic block 421, where the logic equations are in Verilog RTL:

Logic block 429 uses decoded read command signals to generate emac1SelReg signal 469 to keep the read data return path open for the selected EMAC until another read command. This is used because each type of read returns data with different timing.

Below is a code listing for an exemplary embodiment of logic equations for logic block 429, where the logic equations are in Verilog RTL:

With continuing reference to FIG. 4-2, emacRegRdSel logic block 430 generates signals to steer read data to the proper datapath. When a HOST_RdDen signal of signals 464 is asserted, the read data is from a host register in an embedded EMAC, either EMAC 110 or 111 in this example. When the HOST_statsRdDen signal is asserted, the read data is from a statistics register implemented in FPGA fabric 101.

FIG. 4-3 is a state diagram depicting an exemplary embodiment of a state machine 457 for emacRegRdSel logic block 430. Responsive to reset signal 474 being asserted, state machine 457 goes to idle state 472.

Responsive to statistics read enable signal at a logic high state and configuration read enable signal and MIIM read enable signal being at a logical low state of signals 446, state machine 457 transitions from idle state 472 to state S1 475. All the states of state machine 457 for a host read data enable signal and a host statistics read data enable signal outputs 464 from emacRegRdSel logic block 430 are set forth below in Table 2.

State machine 457 transitions from idle state 472 to state C1 473 when statistics read enable signal and MIIM read enable signal are both at a logic low state and configuration read enable signal is at a logic high state. State machine 457 transitions from idle state 472 to state M1 470 responsive to statistics read enable signal and configuration read enable signal being at a logic low state; MIIM read enable signal may be at either a logic low or a logic high state for this transition to occur.

State machine 457 stays in state M1 470 responsive to host MIIM ready signal 408 not being asserted, and transitions from state M1 470 to state M2 471 responsive to host MIIM ready signal 408 being asserted. All other transitions occur responsive to host clock signal 440, namely, transitioning from state M2 471 to idle state 472, transitioning from state C1 473 to idle state 472, and transitions from state S1 475 to state S2 476 to state S3 477 to state S4 478 to state S5 479 to state S6 480 to state S7 481 and back to idle state 472.

In Table 2 are state machine 457 outputs for signals 464 for each of the states in FIG. 4-3.

Responsive to HOST_AFlswRdEn being asserted, a CAM read data [31:0] signal is output to a read bus to host bus 118, namely, HOST_RD_DATA[31:0] 445. In the next host clock 440 cycle, HOST_AFmswRdEn is asserted, and read data [47:32] is output to HOST_RD_DATA[15:0] of host read data bus 445. This outputting the least-significant-word first followed in the next host clock cycle by the most-significant-word is for consistency with reading statistic registers. For this, the read data from hostAddrFilRdD[47:32] 462 is registered for one host clock cycle delay in outputting. This may be done by providing an address filter configuration CAM read enable register signal 460 as a control select input to multiplexer 427 having host address filter read data signal [47:32] 462 as one set of logic high data inputs and feeding back address filter read data CAM most-significant-word register (“AFrdDcamMSWreg[15:0]”) signal 459 as a set of logic low data inputs to multiplexer 427. Output from multiplexer 427 is provided to register 426. Register 426 is clocked responsive to host clock signal 440. Output of register 426 is AFrdDcamMSWreg[15:0] signal 459.

FIG. 4-4 is a state diagram depicting an exemplary embodiment of a state machine 447 for address filter read logic block 422. State machine 447 transitions to idle state 483 responsive to reset signal 474. From idle state 483, state machine 447 transitions to address filter read state 482 responsive to address filter configuration read enable signal of signals 439 being asserted. From address filter read state 482, state machine 447 transitions back to idle state 483 responsive to the next host clock cycle. State machine 447 transitions from idle state 483 to address filter CAM state 1 484 responsive to address filter configuration CAM read enable signal of signals 439 being asserted. From address filter CAM state 1 484, state machine 447 transitions to address filter CAM state 2 485 responsive to a next host clock cycle. From address filter CAM state 2 485, state machine 447 transitions back to idle state 483 responsive to a subsequent host clock cycle. State machine 447 stays in idle state 483 if neither of signals 439 is asserted.

In Table 3 are state machine 447 outputs for signals 441 for each of the states in FIG. 4-4.

TABLE 3

State   HOST_AFlswRdEn   HOST_AFmswRdEn
IDLE    0                0
AFR     1                0
AFC1    1                0
AFC2    0                1
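State machine 447 and the outputs of Table 3 can be modeled behaviorally as shown below. This is a sketch only: the state labels follow Table 3, the two inputs stand for the AFcfgRdEn and AFcfgCAMrdEn enables of signals 439, and the priority given to the read enable when both inputs are asserted in the same cycle is an assumption, as that case is not addressed in this description. Function names are hypothetical.

```python
# Outputs per state as (HOST_AFlswRdEn, HOST_AFmswRdEn), taken from Table 3.
OUTPUTS = {"IDLE": (0, 0), "AFR": (1, 0), "AFC1": (1, 0), "AFC2": (0, 1)}

def next_state(state, af_rd_en, af_cam_rd_en, reset=False):
    """Next state of the address filter read state machine for one host clock."""
    if reset:
        return "IDLE"
    if state == "IDLE":
        if af_rd_en:            # assumed priority if both enables are asserted
            return "AFR"
        if af_cam_rd_en:
            return "AFC1"
        return "IDLE"
    if state == "AFC1":
        return "AFC2"           # second cycle of a CAM read
    return "IDLE"               # AFR and AFC2 both return to idle

def run(inputs, state="IDLE"):
    """Apply one (af_rd_en, af_cam_rd_en) pair per host clock cycle."""
    sequence = []
    for af_rd_en, af_cam_rd_en in inputs:
        state = next_state(state, af_rd_en, af_cam_rd_en)
        sequence.append((state, OUTPUTS[state]))
    return sequence
```

A CAM read therefore asserts the least-significant-word enable for one cycle and the most-significant-word enable for the following cycle, while an ordinary address filter read occupies a single cycle.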

Read data (“RdDe#[31:0]”) and address filter read data (“AddrFilRdDe#[47:0]”), where # is a 0 or 1 respectively for EMAC 110 and EMAC 111, are provided to multiplexer 428, along with MIIM read done signal (“HOST_MIIM_RDY#”), where # is a 0 or 1 respectively for EMAC 110 and EMAC 111. Multiplexer 428 output is selected responsive to emac1SelReg signal 469. Thus, RdDe0[31:0] contains the read data from the EMAC0 host registers and AddrFilRdDe0[47:0] contains the read data from the EMAC0 address filter 129, and RdDe1[31:0] contains the read data from the EMAC1 host registers and AddrFilRdDe1[47:0] contains the read data from the EMAC1 address filter.

Responsive to emac1SelReg signal 469 being at a logic high state, the read data set from EMAC1 is selected, and responsive to emac1SelReg signal 469 being at a logic low state, the read data set from EMAC0 is selected.

AFcfgCAMrdEnReg signal 460 is the registered version of the AFcfgCAMrdEn signal of signals 439. In an implementation, because read data bus 445 of host bus 118 is only 32 bits wide, hostAddrFilRdD[47:32] 462 is registered and output in the next host clock cycle. Again, for a data set, the least significant word is output first and immediately followed by the most significant word of the data set on the following host clock cycle so that the read timing for an address filter, such as address filter 129, is consistent with the read timing of statistics registers.
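The two-cycle readout of a 48-bit address filter value over a 32-bit read bus can be sketched as follows. The function name is hypothetical; the zero padding of the upper 16 bits in the second cycle reflects the grounded padding bits described for the read data path.

```python
def cam_read_beats(addr_fil_rd_data48):
    """Split a 48-bit value into the two 32-bit words placed on the read bus.

    The first word carries bits [31:0]; the second carries bits [47:32]
    in its lower 16 bits with the upper 16 bits zero.
    """
    lsw = addr_fil_rd_data48 & 0xFFFFFFFF
    msw = (addr_fil_rd_data48 >> 32) & 0xFFFF
    return [lsw, msw]
```

For example, the 48-bit value 0x123456789ABC would be returned as 0x56789ABC in the first host clock cycle and 0x00001234 in the next.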

When embedded processor 103 is used as a host processor, host bus 118 is not used for communicating with a host processor. Hence, host bus 118 I/O pins may be re-used in a different way to read statistics registers implemented in FPGA fabric 101. This re-use of I/O pins facilitates interfacing FPGA fabric 101 to ASIC and other embedded logic in processor block 102 using the limited number of I/O pins available in processor block 102.

Data signals 437 and 455 are input to multiplexer 454 along with signals from bus 457, namely, 16 bits of padding coupled to ground 458 or other fixed logic low value and 16 bits from signal 459. Output from multiplexer 454 is selected responsive to a three bit wide control select input from host read data enable, host address filter least significant word read enable and host address filter most significant word read enable signals 456. Host read data [31:0] signal 452 output from multiplexer 454 is input to a logic low port of multiplexer 423. Signal set (10) generally indicated as signal 448 is provided as a control select input to multiplexers 423 and 424. Signal 412 in addition to logic zero padding 444 provided to bus 443 is provided to a logic high input port of multiplexer 423. Input to a logic high port of multiplexer 424 is select signal 451, and input to a logic low port of multiplexer 424 is ready signal 408. Output from multiplexer 423 is host read data [31:0] signal 445, and output from multiplexer 424 is host MIIM ready signal 446. Outputs from multiplexers 423 and 424 may be bussed outputs 413 of host bus 118.

HOST_MIIM_SEL is used as HOST_MIIM_RDY.
Thus, it should be appreciated that pins for signals 445 and 446 are used for read busing of host configuration registers and for write busing of statistics registers. Notably, these pins for signals 445 and 446 need not be used only for statistics registers, but may be used to access any registers instantiated in FPGA fabric 101.

For example, by re-using processor block 102 I/O pins, PPC405 processor can act as a host processor to perform all the management functions as a host processor embedded in FPGA 100 or external to FPGA 100. When DCR bus 114 is not used as a host bus, host bus 118 may be used to access host registers in EMAC core 123. Again, host bus 118 allows a host processor to reside in FPGA 100 or be external to FPGA 100.

Host address filter read data signal 434, which in an implementation may be a 48-bit wide signal, and host read data signal 437, which in an implementation may be a 32-bit wide signal, are part of a host interface 118. A portion of bits of read data signal 434, such as bits [47:32], may be provided to bus 696, and other bits, such as 16 other bit lines coupled to ground 458, may be provided to bus 696 to provide padding for a bus width, such as a 32-bit width.

Bus 696 may be coupled to a logic high input port of multiplexer 490. Another portion of bits of read data signal 434, such as bits [31:0], may be input to a logic high input port of multiplexer 491. Read data signal 437 may be input to respective logic low input ports of multiplexers 490 and 491.

A DCR address filter CAM read select signal 511 may be provided as an input signal to multiplexer 490 to select as between inputs to provide read data MSW signal 513, which may be a 32-bit wide signal, as an output. Select signal 511 and a DCR address filter read select signal 512 may be logically ORed to provide a control select input to multiplexer 491 to provide a read data LSW signal 695, which may be a 32-bit wide data signal.
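The word split feeding multiplexers 490 and 491 can be illustrated with a short software model. This is a sketch only: the function name `split_addr_filter_read` is hypothetical, and it assumes the 16 padding bit lines are driven to logic zero as described for ground 458.

```python
from typing import Tuple


def split_addr_filter_read(read_data_48: int) -> Tuple[int, int]:
    """Split a 48-bit address filter read value into (LSW, MSW) 32-bit words."""
    if not 0 <= read_data_48 < (1 << 48):
        raise ValueError("expected a 48-bit value")
    lsw = read_data_48 & 0xFFFFFFFF      # bits [31:0]
    msw = (read_data_48 >> 32) & 0xFFFF  # bits [47:32]; upper 16 bits stay zero (padding)
    return lsw, msw


# Example: a 48-bit value split into its two 32-bit words.
lsw, msw = split_addr_filter_read(0x123456789ABC)
```

The MSW always has its upper 16 bits clear, which mirrors the grounded padding lines on bus 696.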

MSW output from multiplexer 490 may be input to a logic low port of multiplexer 493. LSW output from multiplexer 491 may be provided to a logic low input port of multiplexer 492. Input to a logic high port of multiplexer 492 may be a read data host interface (“IF”) register signal 539, which may be a 32-bit wide signal and which may be obtained from output of multiplexer 509. A host register read enable signal 517 may be provided as a control select signal to multiplexer 492 to provide an output therefrom to a logic low input port of multiplexer 494.

DCR write data bus 514, which may be a 32-bit wide data bus, may be provided to respective logic high input ports of multiplexers 493 and 494, and as an input to control register 500 and to a logic low input port of multiplexer 495. A logic high input port of multiplexer 495 may be coupled to ground 458, and a host register access start signal 519 may be provided as a control select input to multiplexer 495 to provide an output therefrom to a logic low input port of multiplexer 496. A logic high input port of multiplexer 496 may be coupled to a logic high voltage level 697, and a host register access done signal 520 may be provided as a control select input to multiplexer 496.

A MSW input write enable signal 515 may be input as a control select signal to multiplexer 493, and a LSW input write enable signal 518 may be input as a control select signal to multiplexer 494. Output from multiplexer 493 is input to MSW data register 497. Output from multiplexer 494 is input to LSW data register 498. Output from multiplexer 496 is input to ready status register 499.

Registers 497 through 500 may each be 32-bit wide registers clocked responsive to DCR clock signal 516. Outputs of registers 497 through 500 are provided to multiplexer 698, which is coupled to receive select signals 525, where select signals 525 include a data register MSW read enable signal, a data register LSW read enable signal, a ready status read enable signal, and a control register read enable signal for respectively selecting input from registers 497 through 500 for output from multiplexer 698. Output from multiplexer 698 is DCR read data signal 526, which may be a 32-bit wide signal.

To provide a bypass mode, read data signal 526 may be input to a logic high input port of multiplexer 507 and input to a logic low input port of multiplexer 507 may be DCR write data bus 514. A DCR read signal 528 and a DCR read output enable signal 529 may be ANDed by AND gate 306, the output from which may be provided as a control select signal, namely DCR read data bus enable signal 530, to multiplexer 507. Output of multiplexer 507 is DCR read data bus 531, which may be a 32-bit wide data bus.
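The bypass selection described above can be modeled as a two-input selection gated by an AND of two control signals. The following is a minimal behavioral sketch, not the RTL; the function and parameter names are hypothetical stand-ins for signals 528, 529, 526, 514 and 531.

```python
def dcr_read_data_bus(dcr_read: bool, dcr_rd_out_en: bool,
                      reg_read_data: int, dcr_write_data: int) -> int:
    """Model of multiplexer 507: select register read data or bypass the write bus."""
    enable = dcr_read and dcr_rd_out_en   # AND gate producing enable signal 530
    # Logic high port: read data 526; logic low port: DCR write data bus 514 (bypass).
    return reg_read_data if enable else dcr_write_data
```

When the enable is low, whatever arrives on the DCR write data bus is passed through unchanged, which is the bypass behavior.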

Output from LSW data register 498 is LSW data register signal 532, which may be 32-bits wide and which may be input to a logic high input port of multiplexer 502, MIIM write data register 541, and interrupt request enable register 537. Host register access done signal 533 may be input to a logic low input port of multiplexer 502, and an interrupt request status write enable signal 534 may be provided as a control select input to multiplexer 502. Output of multiplexer 502 is provided to interrupt request status register 536. Registers 536, 537, and 541 may each be 32-bits wide and clocked responsive to DCR clock signal 516.

Output from LSW data register 498 and MIIM write data register 541 are respectively provided to a logic high input port and a logic low input port of multiplexer 508. An MIIM write data select signal 540 is provided as a control select signal input to multiplexer 508 to provide DCR/host write data signal 542, which may be a 32-bit wide signal.

FIG. 4-5B is a table diagram depicting an exemplary embodiment of DCR address and bit assignments for DCR bridge 113. In this exemplary implementation, DCR bridge 113 uses four DCR registers 497 through 500 of FIGS. 4-5A occupying four consecutive DCR addresses 523. Default values 521 and read or write capability 524 of DCR registers 497 through 500 are also listed. In an implementation, each of registers 497 through 500 is clocked responsive to DCR clock signal 516, and each of registers 497 through 500 has a 32 bit wide [0:31] output.

With simultaneous reference to FIGS. 4-5A and 4-5B, DCR bridge 113 is further described. With respect to bits [0:15] of a ready status DCR register (“RDYstatus”) 499, this register is a read-only register, though it is possible to write to this register for functional verification. With respect to bit [21] of DCR control register (“cntlReg”) 500, in an exemplary implementation emac1Sel may be bit [10] of host bus 118 address bits, where a logic 0 is for EMAC0 and a logic 1 is for EMAC1.

DCR most-significant word data register (“dataRegMSW”) 497 is used in address filter register reads where return data contains a threshold number of bits, such as 48 bits for example. An example usage is a read of a unicast address register or one of the four multicast addresses in CAM. Again, CAM is not limited to memory, but may be registers such as MARs. In this exemplary implementation, dataRegMSW 497 receives the most significant read data bits [47:32] of host address filter read data [47:0] 434.

In this exemplary implementation, dataRegMSW 497 is used in reading of statistics registers because the statistics registers are 64 bits wide. The most significant word of the statistics register (e.g., bits [63:32]) may be stored in dataRegMSW 497. DataRegMSW 497 facilitates consistent software programming, namely, when PPC405 processor 103 issues a host register read command, host interface 112 deposits the read data to DCR data registers, and then PPC405 processor 103 may issue a DCR read command to dataRegMSW 497 to bring the read data into a general-purpose register (GPR) of processor 103.

A DCR least significant word data register (“dataRegLSW”) 498 contains the least significant word, such as for example 32 bits of read or write data. Write data goes through dataRegLSW 498, and in an exemplary implementation, dataRegLSW 498 is programmed with write data before cntlReg 500 is programmed with a write command.

Processor 103 commands for host register accesses may be written to cntlReg 500. Responsive to cntlReg 500 being programmed, host interface 112 may start to take action for a host register transaction. Hence, for a host register write, the sequence of programming in an implementation may be to put write data into dataRegLSW 498 first before programming cntlReg 500.

RDYstatus register 499 contains EMAC host register read or write transaction status. Processor 103 may poll RDYstatus register 499 to determine whether an EMAC host register read or write is complete before it issues another EMAC host register access command, as DCR bridge 113 in this exemplary implementation is configured not to accept another DCR command from PPC405 processor 103 until an EMAC host register read or write that is in progress completes. In the instance of MIIM host register read or write, it may take multiple HOST_CLK signal 440 cycles for the EMAC MII data input/output (“MDIO”) interface to serially shift in or out the read or write data. Furthermore, the MDIO clock (“MDC”) frequency may be a fraction of HOST_CLK signal 440 frequency. MDC frequency may be less than approximately 2.5 MHz.
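The programming order for a host register write (data first, then command, then poll for completion) can be sketched against a toy model of the bridge. Everything below is illustrative: `MockDcrBridge`, its methods, the boolean ready flag and the fixed busy duration are assumptions for the sketch, since the bit-level encodings of cntlReg 500 and RDYstatus 499 are given in the figures rather than here.

```python
class MockDcrBridge:
    """Toy stand-in for DCR bridge 113; timing and encodings are invented."""

    def __init__(self, busy_polls: int = 3):
        self.data_reg_lsw = 0
        self.cntl_reg = 0
        self._busy_polls = busy_polls
        self._remaining = 0
        self.log = []  # records the order of register writes

    def write_data_reg_lsw(self, value: int) -> None:
        self.data_reg_lsw = value & 0xFFFFFFFF
        self.log.append("dataRegLSW")

    def write_cntl_reg(self, command: int) -> None:
        # Assumption: the model rejects a command while an access is pending.
        if self._remaining:
            raise RuntimeError("previous host register access still in progress")
        self.cntl_reg = command & 0xFFFFFFFF
        self._remaining = self._busy_polls
        self.log.append("cntlReg")

    def read_rdy_status(self) -> bool:
        """Return True once the pending access has completed."""
        if self._remaining:
            self._remaining -= 1
        return self._remaining == 0


def host_register_write(bridge, command: int, data: int, max_polls: int = 1000) -> int:
    """Write data, then the command, then poll; return the number of polls used."""
    bridge.write_data_reg_lsw(data)   # step 1: write data into dataRegLSW
    bridge.write_cntl_reg(command)    # step 2: programming cntlReg starts the access
    for polls in range(1, max_polls + 1):
        if bridge.read_rdy_status():  # step 3: poll until the access completes
            return polls
    raise TimeoutError("host register access did not complete")
```

A bounded poll count is used in the sketch so that a stalled access surfaces as an error rather than an endless loop.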

PPC405 processor 103 is configured to time-out and simply execute another instruction if a DCR device does not assert a DCR acknowledge within 64 dcrClk signal 516 clock cycles. Hence, PPC405 processor 103 assumes that a DCR instruction has executed even though the instruction is still in progress or waiting. This can lead to an incorrect outcome when the result of the presumed-executed instruction is used.

In addition to DCR registers 497, 498, 499 and 500, host interface 112 may use memory-mapped registers to assist in EMAC host register read or write transfers, thereby avoiding the use of additional DCR registers.

Table 5 lists an exemplary embodiment of a memory map for host interface memory-mapped registers and EMAC embedded host registers. Groups of registers, addresses for each group, and a description for each address are listed. The memory map of host registers is for when DCR bus 114 is used as a host bus for host register access.

FIG. 4-5C is a table diagram listing an exemplary embodiment of definitions for memory-mapped registers. MIIMcntl register is not listed in FIG. 4-5C because it is not physically implemented; only its address is decoded to determine initiations of an MDIO register access.

Each of registers 536, 537 and 541 has a read and write function. Bit assignments 505 and default values are listed in FIG. 4-5C. Host interface registers, such as IRstatus register 536 and IRenable register 537, are implemented so that a user may alternately choose to use an interrupt as a means to inform processor 103 that a read or write 504 to an EMAC host register is completed.

When any bit 505 of IRstatus register 536 is set, DCR host completed interrupt request and DCR host done interrupt (“dcrhostdoneir”) signal 407 (shown in FIG. 4-2) is asserted to raise an interrupt to processor 103, such as when an EMAC register access has completed. This allows processor 103 to process instructions, other than EMAC host read or write instructions, following the interrupt without having to spend time polling RDYstatus register 499 to find out when an EMAC host register read or write completes. This may be useful in a read or write to MIIM registers because MDC frequency is conventionally low compared to the system clock frequency of processor 103, and approximately a hundred processor instructions may conventionally be executed in the time that it takes a MIIM register read or write to complete.

MIIM write data (“MIIMwrData”) register 541 is used to hold MIIM write data temporarily before it is output from EMAC core 123 for a MIIM register write. MIIMwrData register 541 allows DCR dataRegLSW 522 to be reused to reduce the number of DCR registers used and to facilitate software programming consistency.

In an implementation, an address filter block contains a four-entry CAM/MAR for multicast address matching. As described below in additional detail, host interface 112 does not directly read or write to the CAM or MARs. Instead, the CAM/MAR data, CAM/MAR address and read/write bit are written to address filter registers, namely, read configuration address table and write configuration address table, to read or write CAM/MAR entries.

FIG. 4-6 is a state diagram depicting an exemplary embodiment of a state machine 551S of dcrAckGen 551. State machine 551S is reset responsive to reset signal 474, which places state machine 551S in idle state 546. State machine 551S transitions from idle state 546 to write acknowledge state 545 responsive to DCR write enable acknowledge signal (“dcrWrEn_ack”) of signals 1506 being asserted. State machine 551S then transitions from write acknowledge state 545 back to idle state 546 at completion of an acknowledgment of a write to DCR registers.

Responsive to DCR read enable acknowledgment (“dcrRdEn_ack”) signal of signals 1506 being asserted, state machine 551S transitions from idle state 546 to read acknowledge state zero 547. From read acknowledge state zero 547, state machine 551S transitions to read acknowledge state one 548 responsive to a next clock cycle of DCR clock signal 516. From read acknowledge state one 548, state machine 551S transitions to idle state 546 responsive to a next clock cycle of DCR clock signal 516.

State machine 551S stays in idle state 546 if neither the DCR write enable acknowledgment signal nor the DCR read enable acknowledgment signal is asserted. Output of state machine 551S, namely, dcrAck signal 510, is a logic 0 while in idle state 546 or read acknowledge state zero 547. In read acknowledge state one 548 or write acknowledge state 545, dcrAck signal 510 is a logic 1.
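The behavior of state machine 551S can be summarized in a small behavioral model. This is a sketch under stated assumptions, not the RTL: the names are hypothetical, the write acknowledge state is assumed to last one DCR clock cycle, and a write is assumed to take priority if both acknowledge inputs are asserted at once, which the description does not specify.

```python
from enum import Enum


class AckState(Enum):
    IDLE = 546
    WRITE_ACK = 545
    READ_ACK0 = 547
    READ_ACK1 = 548


def next_ack_state(state, wr_en_ack, rd_en_ack):
    """Return the state after one DCR clock edge."""
    if state is AckState.IDLE:
        if wr_en_ack:                  # assumption: write wins if both are asserted
            return AckState.WRITE_ACK
        if rd_en_ack:
            return AckState.READ_ACK0
        return AckState.IDLE
    if state is AckState.READ_ACK0:
        return AckState.READ_ACK1      # unconditional advance on the next clock
    return AckState.IDLE               # WRITE_ACK and READ_ACK1 return to idle


def dcr_ack(state):
    """dcrAck is 1 only in the write acknowledge and read acknowledge one states."""
    return 1 if state in (AckState.WRITE_ACK, AckState.READ_ACK1) else 0


def ack_trace(inputs):
    """Apply (wr_en_ack, rd_en_ack) pairs clock by clock; return dcrAck per cycle."""
    state = AckState.IDLE
    trace = []
    for wr_en_ack, rd_en_ack in inputs:
        state = next_ack_state(state, wr_en_ack, rd_en_ack)
        trace.append(dcr_ack(state))
    return trace
```

In this model a read is acknowledged one clock later than a write, because the read path passes through read acknowledge state zero before dcrAck goes high.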

FIG. 4-7 is a state diagram depicting an exemplary embodiment of a state machine 552S of DCR read bypass multiplexer enable (“dcrRdBypassMuxEn”) generator 552. Input signals 1505 to dcrRdBypassMuxEn generator 552 are listed as signal set (15) in Table 4.

State machine 552S is reset responsive to reset signal 474, which puts state machine 552S in idle state 549. In idle state 549, state machine 552S outputs a logic zero for DCR read output enable signal 485. State machine 552S stays in idle state 549 while the DCR read enable acknowledgment signal is not asserted. When state machine 552S receives an asserted DCR read enable acknowledgment signal, it transitions to enable state 550. In enable state 550, state machine 552S outputs a logic one for DCR read output enable signal 485. State machine 552S stays in enable state 550 if DCR read enable inverted (“dcrRdEn_neg”) signal is not asserted. If, however, DCR read enable inverted signal is asserted, state machine 552S transitions from enable state 550 to idle state 549.
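A two-state model captures the behavior of state machine 552S. The sketch below is behavioral only, and the identifiers are hypothetical names chosen for illustration.

```python
BYPASS_IDLE = "idle_549"
BYPASS_ENABLE = "enable_550"


def next_bypass_state(state, dcr_rd_en_ack, dcr_rd_en_neg):
    """Return the next state of the read bypass enable generator."""
    if state == BYPASS_IDLE:
        # Leave idle only when the read enable acknowledgment is asserted.
        return BYPASS_ENABLE if dcr_rd_en_ack else BYPASS_IDLE
    # In the enable state, return to idle only when dcrRdEn_neg is asserted.
    return BYPASS_IDLE if dcr_rd_en_neg else BYPASS_ENABLE


def dcr_rd_out_en(state):
    """dcrRdOutEn is logic one only in the enable state."""
    return 1 if state == BYPASS_ENABLE else 0
```

The output depends only on the current state, so the read output enable stays high for as long as the machine remains in the enable state.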

Referring again to FIG. 4-5A, a control signal, namely, DCR read output enable (“dcrRdOutEn”) signal 485, is generated by dcrRdBypassMuxEn generator 552 responsive to signals 1505, namely, signal set (15) of Table 4, for reading from DCR registers. When dcrRdOutEn signal 485 is not asserted, input to DCR bus 114 is bypassed to output of DCR bus 114 in accordance with a DCR specification for a PPC405. Control generator (“cntlGen”) block 588 generates control signals 1507, namely, signal set (14) of Table 4, for reading and writing to DCR registers and host interface memory mapped registers responsive to input signals 1504, namely, signal set (13) of Table 4.

FIG. 4-8 is a block diagram depicting exemplary embodiments of logic blocks of control generator block 588 for generating control signals for reading from or writing to DCR bridge 113 to host bus 160 or 161. FIG. 4-9 is a block diagram depicting exemplary embodiments of logic blocks of control generator block 588 for generating control signals for reading or writing data from or to host bus 160 or 161 into DCR bridge 113. Simultaneous reference is made to FIGS. 4-5A, 4-8 and 4-9. Notably, due to differences in read or write timing, such as from or to a configuration register, MIIM register, statistics register, address filter configuration register or address filter CAM, separate controllers implemented with state machines may be used as indicated in this exemplary embodiment.

Control generator host interface logic block 421 includes a DCR address decoder and logic for qualifying read and write control signals. FIGS. 4-33A and 4-33B are a code listing depicting an exemplary embodiment of logic block 421, with logic equations in Verilog RTL. Logic block 421 provides output signals 1513, namely, signal set (4) of Table 1, in response to input signals 1512, namely, signal set (3) of Table 1. FIG. 4-34 is a code listing depicting an exemplary embodiment of main bus control (“busCntlMain”) block 553, with logic equations in Verilog RTL. Main bus control block 553 provides output signals 1511, namely, signals from signal set (2) of Table 1, in response to input signals 1510, namely, signals from signal set (1) of Table 1.

FIG. 4-10 is a state diagram depicting an exemplary embodiment of a state machine 554S of configuration read/write bus controller 554. State machine 554S is reset responsive to reset signal 474, which places state machine 554S in idle state 630. State machine 554S transitions from idle state 630 to configuration read (“ConfigRead”) state 631 responsive to configuration read signal 559 being asserted. After which, state machine 554S from configuration read state 631 transitions back to idle state 630 at a completion of a read to EMAC configuration registers.

Responsive to configuration write signal 560 being asserted, state machine 554S transitions from idle state 630 to configuration write (“ConfigWrite”) state 632. From configuration write state 632 state machine 554S transitions back to idle state 630 at a completion of a write to DCR registers.

FIG. 4-11 is a state diagram depicting an exemplary embodiment of a state machine 555S of MIIM read/write bus controller 555. State machine 555S is reset responsive to reset signal 474, which places state machine 555S in idle state 633. State machine 555S transitions from idle state 633 to MIIM read 1 state 634 responsive to MIIM read signal 576 being asserted. On a next clock cycle of host clock signal 440, state machine 555S transitions from MIIM read 1 state 634 to MIIM read 2 state 635 for a completion of a read to MIIM registers. State machine 555S transitions from MIIM read 2 state 635 back to idle state 633 responsive to MIIM ready signal 614 being asserted, namely, indicating completion of this read. State machine 555S stays in MIIM read 2 state 635 if MIIM ready signal 614 is not asserted.

Responsive to MIIM write signal 577 being asserted, state machine 555S transitions from idle state 633 to MIIM write 1 state 636. From MIIM write 1 state 636, state machine 555S transitions to MIIM write 2 state 637 on a next clock cycle of host clock signal 440. State machine 555S transitions from MIIM write 2 state 637 back to idle state 633 responsive to MIIM ready signal 614 being asserted, namely, indicating a completion of this write to MIIM registers. State machine 555S stays in MIIM write 2 state 637 if MIIM ready signal 614 is not asserted.

FIG. 4-12 is a state diagram depicting an exemplary embodiment of a state machine 556S of statistics read bus controller 556. State machine 556S is reset responsive to reset signal 474, which places state machine 556S in idle state 638. State machine 556S transitions from idle state 638 to statistics read (“StatsRead”) state 639 responsive to statistics read signal 561 being asserted. After which, state machine 556S from statistics read state 639 transitions back to idle state 638 at a completion of a read from external FPGA-based statistics registers.

FIG. 4-13 is a state diagram depicting an exemplary embodiment of a state machine 557S of address filter read/write bus controller 557. State machine 557S is reset responsive to reset signal 474, which places state machine 557S in idle state 640. State machine 557S transitions from idle state 640 to address filter read 1 (“AFread1”) state 641 responsive to address filter read signal 578 being asserted. On a next clock cycle of host clock signal 440, state machine 557S transitions from address filter read 1 state 641 to address filter read 2 (“AFread2”) state 642 for completion of a read to address filter registers. State machine 557S transitions from address filter read 2 state 642 back to idle state 640 at a completion of this read to address filter registers.

Responsive to address filter write signal 579 being asserted, state machine 557S transitions from idle state 640 to address filter write (“AFwrite”) state 643. After which, state machine 557S from address filter write state 643 transitions back to idle state 640 at a completion of a write to address filter registers.

FIG. 4-14 is a state diagram depicting an exemplary embodiment of a state machine 558S of address filter content addressable memory read/write bus controller 558. State machine 558S is reset responsive to reset signal 474, which places state machine 558S in idle state 644. State machine 558S transitions from idle state 644 to address filter content addressable memory read 1 (“AFcamRd1”) state 645 responsive to CAM read signal 562 being asserted. On a next clock cycle of host clock signal 440, state machine 558S from AFcam read 1 state 645 transitions to AFcam read 2 (“AFcamRd2”) state 646 for completion of a read to address filter CAM registers. State machine 558S transitions from AFcamRd2 state 646 back to idle state 644 at a completion of this read to address filter CAM registers.

Responsive to CAM write signal 563 being asserted, state machine 558S transitions from idle state 644 to AFcam write (“AFcamWr”) state 647. After which, state machine 558S from AFcamWr state 647 transitions back to idle state 644 at a completion of a write to address filter CAM registers.

FIG. 4-15 is a state diagram depicting an exemplary embodiment of a state machine 591S of read data received controller 591. State machine 591S is reset responsive to reset signal 474, which places state machine 591S in idle state 648.

State machine 591S transitions from idle state 648 to start read data received (“startRdDrecv”) state 649 responsive to a host register read signal of signals 1514 being asserted. After which, state machine 591S from start read data received state 649 transitions back to idle state 648 at a completion of initialization to receive read data from registers.

State machine 591S transitions from idle state 648 to start write done (“startWrDone”) state 650 responsive to either a host register write signal or a content addressable memory write signal of signals 1514 being asserted. After which, state machine 591S from start write done state 650 transitions back to idle state 648 at a completion of a write to host registers.

State machine 591S transitions from idle state 648 to start content addressable memory read data received (“startCAMrdDrecv”) state 651 responsive to a content addressable memory read signal of signals 1514 being asserted. After which, state machine 591S from start content addressable memory read data received state 651 transitions back to idle state 648 at a completion of initialization to receive read data from CAM registers.

Table 13 lists state machine 591S status of output signals for each of the states in FIG. 4-15.

TABLE 13

State/Output | Idle (648) | startRdDrecv (649) | startWrDone (650) | startCAMRdDrecv (651)
MIIMrdR | 0 | MIIMrdReg | 0 | 0
StatsRdR | 0 | StatsRdReg | 0 | 0
configRdR | 0 | configRdReg | 0 | 0
AddrFilRdR | 0 | AddrFilRdReg | 0 | 0
MIIMwrR | 0 | 0 | MIIMwrReg | 0
configWrR | 0 | 0 | configWrReg | 0
AddrFilWrR | 0 | 0 | AddrFilWrReg | 0
camRdR | 0 | 0 | 0 | camRdReg
camWrR | 0 | 0 | camWrReg | 0

In idle state 648, all outputs of state machine 591S are logic 0. In start read of data received state 649, status of MIIMwrR, configWrR, AddrFilWrR, camRdR, and camWrR output signals of state machine 591S are all logic zero, and status of MIIMrdR, StatsRdR, configRdR, and AddrFilRdR output signals of state machine 591S are respectively the status or content of their associated register, namely, MIIMrdReg, StatsRdReg, configRdReg, and AddrFilRdReg, respectively.

In start write done state 650, status of MIIMwrR, configWrR, AddrFilWrR, and camWrR output signals of state machine 591S are respectively the status or content of their associated register, namely, MIIMwrReg, configWrReg, AddrFilWrReg, and camWrReg, respectively, and status of MIIMrdR, StatsRdR, configRdR, AddrFilRdR, and camRdR output signals of state machine 591S are all logic 0. In start CAM read of data received state 651, all outputs of state machine 591S are logic 0, except for camRdR output which is the status or content of its respective register, namely, camRdReg.
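The output behavior of state machine 591S described above amounts to a per-state pass-through of selected registers, with all other outputs forced to logic 0. The following is an illustrative model only; the dictionary layout and the function name `outputs_591s` are hypothetical, and register contents are treated as plain integers.

```python
READ_SIGNALS = ("MIIMrd", "StatsRd", "configRd", "AddrFilRd")
WRITE_SIGNALS = ("MIIMwr", "configWr", "AddrFilWr", "camWr")
CAM_READ_SIGNALS = ("camRd",)
ALL_SIGNALS = READ_SIGNALS + WRITE_SIGNALS + CAM_READ_SIGNALS

# For each state, the outputs that follow their associated "...Reg" register.
PASS_THROUGH = {
    "Idle": (),
    "startRdDrecv": READ_SIGNALS,
    "startWrDone": WRITE_SIGNALS,
    "startCAMrdDrecv": CAM_READ_SIGNALS,
}


def outputs_591s(state, regs):
    """Return the "...R" outputs for a state, given the "...Reg" register values."""
    out = {name + "R": 0 for name in ALL_SIGNALS}   # default: every output is logic 0
    for name in PASS_THROUGH[state]:
        out[name + "R"] = regs[name + "Reg"]        # selected outputs follow their register
    return out


# Example: with every register set, only the write-related outputs are driven
# in the start write done state.
example_regs = {name + "Reg": 1 for name in ALL_SIGNALS}
example_out = outputs_591s("startWrDone", example_regs)
```

Keeping the state-to-signal mapping in one table makes it straightforward to compare the model against Table 13 row by row.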

FIG. 4-16 is a state diagram depicting an exemplary embodiment of a state machine 592S of configuration read/write controller 592. State machine 592S is reset responsive to reset signal 474, which places state machine 592S in idle state 652. State machine 592S transitions from idle state 652 to configuration read 1 (“ConfigRd1”) state 653 responsive to configuration read receive signal 597 being asserted. After which, state machine 592S from configuration read 1 state 653 transitions back to idle state 652 at a completion of this read to host configuration registers.

Responsive to configuration write reset signal 598 being asserted, state machine 592S transitions from idle state 652 to configuration write 1 (“ConfigWr1”) state 654. From configuration write 1 state 654 state machine 592S transitions back to idle state 652 at a completion of this write to host configuration registers.

Output signals of state machine 592S, namely, outputs 602 through 605, are 1-bit wide signals. States, namely, idle state 652, configuration read 1 state 653 and configuration write 1 state 654, are set forth below in Table 14. Table 14 lists state machine 592S status of output signals for each of the states in FIG. 4-16.

TABLE 14

State | dRegMSWwe_cfg | dRegLSWwe_cfg | configRdDone | configWrDone
IDLE | 0 | 0 | 0 | 0
ConfigRd1 | 0 | 1 | 1 | 0
ConfigWr1 | 0 | 0 | 0 | 1

FIG. 4-17 is a state diagram depicting an exemplary embodiment of a state machine 594S of statistics read controller 594. State machine 594S is reset responsive to reset signal 474, which places state machine 594S in idle state 655. State machine 594S stays in idle state 655 if statistics read ready signal 599 is not asserted.

State machine 594S transitions from idle state 655 to statistics read 1 (“Sr1”) state 656 responsive to statistics read ready signal 599 being asserted. For each clock cycle of host clock signal 440 after state machine 594S is in statistics read 1 state 656, state machine 594S transitions to a next state. For example, from statistics read 1 state 656, state machine 594S transitions to statistics read 2 (“Sr2”) state 657. From statistics read 2 state 657 state machine 594S transitions to statistics read 3 (“Sr3”) state 658. From statistics read 3 state 658 state machine 594S transitions to statistics read 4 (“Sr4”) state 659. From statistics read 4 state 659 state machine 594S transitions to statistics read 5 (“Sr5”) state 660. From statistics read 5 state 660 state machine 594S transitions to statistics read 6 (“Sr6”) state 661. From statistics read 6 state 661 state machine 594S transitions to statistics read 7 (“Sr7”) state 662 for completion of a read to statistics registers. State machine 594S from statistics read 7 state 662 transitions back to idle state 655 at a completion of this read to statistics registers.

Outputs 606 through 608 of state machine 594S are all 1-bit wide signals. States, namely, states 656 through 662, of state machine 594S for signal outputs 606 through 608 of statistics read controller 594 are set forth below in Table 15. Table 15 lists state machine 594S status of output signals for each of the states in FIG. 4-17.

TABLE 15

State | dRegMSWwe_Stats | dRegLSWwe_Stats | StatsRdDone
IDLE | 0 | 0 | 0
Sr1 | 0 | 0 | 0
Sr2 | 0 | 0 | 0
Sr3 | 0 | 0 | 0
Sr4 | 0 | 0 | 0
Sr5 | 0 | 0 | 0
Sr6 | 0 | 1 | 0
Sr7 | 1 | 0 | 1
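The statistics read sequence and the outputs listed in Table 15 above can be modeled as a simple counter-like state machine. This is a behavioral sketch with hypothetical names; it assumes one state transition per host clock cycle once the machine has left idle, as described for state machine 594S.

```python
STATS_STATES = ["IDLE", "Sr1", "Sr2", "Sr3", "Sr4", "Sr5", "Sr6", "Sr7"]

# (dRegMSWwe_Stats, dRegLSWwe_Stats, StatsRdDone) per state, following Table 15.
STATS_OUTPUTS = {name: (0, 0, 0) for name in STATS_STATES}
STATS_OUTPUTS["Sr6"] = (0, 1, 0)   # LSW write enable
STATS_OUTPUTS["Sr7"] = (1, 0, 1)   # MSW write enable and read done


def next_stats_state(state, stats_rd_ready):
    """Advance one host clock cycle."""
    if state == "IDLE":
        return "Sr1" if stats_rd_ready else "IDLE"
    index = STATS_STATES.index(state)
    # Sr1 through Sr6 step forward unconditionally; Sr7 returns to idle.
    return STATS_STATES[index + 1] if index < len(STATS_STATES) - 1 else "IDLE"


def stats_read_trace(cycles, ready_on_first_cycle=True):
    """Return the (state, outputs) pairs visited over a number of clock cycles."""
    state = "IDLE"
    trace = []
    for cycle in range(cycles):
        ready = ready_on_first_cycle and cycle == 0
        state = next_stats_state(state, ready)
        trace.append((state, STATS_OUTPUTS[state]))
    return trace
```

In this model the LSW write enable is asserted one cycle before the MSW write enable, which is consistent with the least significant word being output first and the most significant word on the following host clock cycle.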

FIG. 4-18 is a state diagram depicting an exemplary embodiment of a state machine 593S of MIIM read/write controller 593. State machine 593S is reset responsive to reset signal 474, which places state machine 593S in idle state 663. State machine 593S stays in idle state 663 if neither of MIIM read receive signal 613 nor MIIM write ready signal 590 is asserted.

State machine 593S transitions from idle state 663 to MIIM read 1 (“MIIMr1”) state 664 responsive to MIIM read receive signal 613 being asserted. State machine 593S stays in MIIM read 1 state 664 if MIIM ready signal 614 is not asserted. State machine 593S transitions from MIIM read 1 state 664 to MIIM read 2 (“MIIMr2”) state 665 responsive to MIIM ready signal 614 being asserted. State machine 593S transitions from MIIM read 2 state 665 back to idle state 663 for a completion of this read receive from MIIM registers.

State machine 593S transitions from idle state 663 to MIIM write 1 (“MIIMw1”) state 667 responsive to MIIM write ready signal 590 being asserted. State machine 593S stays in MIIM write 1 state 667 if MIIM ready signal 614 is not asserted. State machine 593S transitions from MIIM write 1 state 667 to MIIM write 2 (“MIIMw2”) state 668 responsive to MIIM ready signal 614 being asserted. State machine 593S transitions from MIIM write 2 state 668 back to idle state 663 for a completion of this write to MIIM registers.

Outputs 617 through 620 of state machine 593S are 1-bit wide signals. States of state machine 593S for signal outputs 617 through 620 of MIIM read/write controller 593 are set forth below in Table 15. Table 15 lists state machine 593S status of output signals for each of the states in FIG. 4-18.

TABLE 15

State | dRegMSWwe_miim | dRegLSWwe_miim | MIIMrdDone | MIIMwrDone
IDLE | 0 | 0 | 0 | 0
MIIMr1 | 0 | 0 | 0 | 0
MIIMr2 | 0 | 1 | 1 | 0
MIIMw1 | 0 | 0 | 0 | 0
MIIMw2 | 0 | 0 | 0 | 1
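The wait-for-ready behavior of state machine 593S, together with the outputs tabulated above, can be sketched as follows. The model is illustrative and its names are hypothetical; it also assumes that a read takes priority if the read receive and write ready inputs are asserted in the same cycle, which the description does not address.

```python
# (dRegMSWwe_miim, dRegLSWwe_miim, MIIMrdDone, MIIMwrDone) per state.
MIIM_OUTPUTS = {
    "IDLE":   (0, 0, 0, 0),
    "MIIMr1": (0, 0, 0, 0),
    "MIIMr2": (0, 1, 1, 0),
    "MIIMw1": (0, 0, 0, 0),
    "MIIMw2": (0, 0, 0, 1),
}


def next_miim_state(state, rd_recv, wr_ready, miim_ready):
    """Advance one host clock cycle."""
    if state == "IDLE":
        if rd_recv:                    # assumption: read wins if both are asserted
            return "MIIMr1"
        if wr_ready:
            return "MIIMw1"
        return "IDLE"
    if state == "MIIMr1":
        return "MIIMr2" if miim_ready else "MIIMr1"   # wait for MIIM ready
    if state == "MIIMw1":
        return "MIIMw2" if miim_ready else "MIIMw1"   # wait for MIIM ready
    return "IDLE"                      # MIIMr2 and MIIMw2 return to idle


def miim_trace(inputs):
    """Apply (rd_recv, wr_ready, miim_ready) tuples; return the states visited."""
    state = "IDLE"
    visited = []
    for rd_recv, wr_ready, miim_ready in inputs:
        state = next_miim_state(state, rd_recv, wr_ready, miim_ready)
        visited.append(state)
    return visited
```

Because MDIO transfers are serial and MDC is slow relative to the host clock, the machine may sit in MIIMr1 or MIIMw1 for many host clock cycles before the ready signal releases it.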

FIG. 4-19 is a state diagram depicting an exemplary embodiment of a state machine 595S of address filter read/write controller 595. State machine 595S is reset responsive to reset signal 474, which places state machine 595S in idle state 669. State machine 595S transitions from idle state 669 to address filter read 1 (“AddrFilRd1”) state 670 responsive to address filter read receive signal 615 being asserted. After which, state machine 595S from address filter read 1 state 670 transitions back to idle state 669 at a completion of a read to address filter registers.

Responsive to address filter write ready signal 616 being asserted, state machine 595S transitions from idle state 669 to address filter write 1 (“AddrFilWr1”) state 671. From address filter write 1 state 671 state machine 595S transitions back to idle state 669 at a completion of a write to address filter registers.

Outputs of state machine 595S, namely, outputs 621 through 624, are 1-bit wide signals. States, namely, idle state 669, address filter read 1 state 670 and address filter write 1 state 671, are set forth below in Table 16. Table 16 lists state machine 595S status of output signals for each of the states in FIG. 4-19.

TABLE 16

State | dRegMSWwe_AF | dRegLSWwe_AF | AddrFilRdDone | AddrFilWrDone
IDLE | 0 | 0 | 0 | 0
AddrFilRd1 | 0 | 1 | 1 | 0
AddrFilWr1 | 0 | 0 | 0 | 1

FIG. 4-20 is a state diagram depicting an exemplary embodiment of a state machine 596S of address filter CAM read/write controller 596. State machine 596S is reset responsive to reset signal 474, which places state machine 596S in idle state 672. State machine 596S transitions from idle state 672 to address filter content addressable memory read 1 (“AFcamRd1”) state 673 responsive to CAM read receive signal 600 being asserted. State machine 596S from AFcamRd1 state 673 transitions back to idle state 672 at a completion of this read to CAM registers.

State machine 596S transitions from idle state 672 to address filter content addressable memory write 1 (“AFcamWr1”) state 674 responsive to CAM write ready signal 601 being asserted. On the next clock cycle of host clock signal 440, state machine 596S transitions from AFcamWr1 state 674 to address filter content addressable memory write 2 (“AFcamWr2”) state 675 for a write to CAM registers. State machine 596S transitions from AFcamWr2 state 675 back to idle state 672 at a completion of this write to CAM registers.

Outputs of state machine 596S, namely, outputs 609 through 612 are 1-bit wide signals. States, namely, states 673 through 675, of state machine 596S for signal outputs 609 through 612 of address filter CAM read/write controller 596 are set forth below in Table 17. Table 17 lists state machine 596S status of output signals for each of the states in FIG. 4-20.

TABLE 17

State      dRegMSWwe_AFcam   dRegLSWwe_AFcam   AFcamRdDone   AFcamWrDone
IDLE       0                 0                 0             0
AFcamRd1   1                 1                 1             0
AFcamWr1   0                 0                 0             0
AFcamWr2   0                 0                 0             1
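The two-step write sequence of state machine 596S may likewise be traced in software. The following Python sketch is illustrative only: the names TABLE_17, step, and write_trace are hypothetical, and the model assumes each state lasts exactly one cycle of host clock signal 440.

```python
# Per-state outputs from Table 17, in the column order
# (dRegMSWwe_AFcam, dRegLSWwe_AFcam, AFcamRdDone, AFcamWrDone).
TABLE_17 = {
    "IDLE": (0, 0, 0, 0),
    "AFcamRd1": (1, 1, 1, 0),
    "AFcamWr1": (0, 0, 0, 0),
    "AFcamWr2": (0, 0, 0, 1),
}


def step(state, cam_read=False, cam_write=False):
    """Advance the model by one clock cycle."""
    if state == "IDLE":
        if cam_read:
            return "AFcamRd1"
        if cam_write:
            return "AFcamWr1"
        return "IDLE"
    if state == "AFcamWr1":
        return "AFcamWr2"
    # AFcamRd1 and AFcamWr2 both return to idle on completion.
    return "IDLE"


def write_trace():
    """Return (state, outputs) pairs visited by one CAM write from idle."""
    trace = []
    state = step("IDLE", cam_write=True)
    while state != "IDLE":
        trace.append((state, TABLE_17[state]))
        state = step(state)
    return trace
```

The trace shows that the write done output is asserted only in the second write state, consistent with Table 17.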

FIGS. 4-21A through 4-21C are timing diagrams for respective exemplary instances of generation of a sample cycle pulse 489. In FIG. 4-21A, host clock signal 440 has a period which is three times longer than the period of DCR clock signal 516. A sample cycle pulse 544A is generated responsive to each falling edge of host clock signal 440, or generally at one-half the period of host clock signal 440.

In FIG. 4-21B, host clock signal 440 has a period which is four times longer than the period of DCR clock signal 516. A sample cycle pulse 544B is generated responsive to the first falling edge of DCR clock signal 516 immediately following a falling edge of host clock signal 440.

In FIG. 4-21C, host clock signal 440 has a period which is five times longer than the period of DCR clock signal 516. A sample cycle pulse 544C is generated responsive to the first falling edge of DCR clock signal 516 immediately after a first rising edge of DCR clock signal 516 that immediately follows a falling edge of host clock signal 440.

Accordingly, it should be appreciated that by bridging DCR registers with a finite state machine, which may be broken up into several finite state machines, several DCR registers may be mapped to a significantly larger address space, such as control registers. Moreover, DCR bridge 113 emulates a set of signals, namely, platform independent host interface signals, for access to such control registers. In other words, DCR bridge 113 maps a small register address space, such as four DCR registers, to a significantly larger register address space, such as control and status registers of EMAC 110, and maps the significantly larger address space back to the small register address space.

Register Access

EMAC core host registers, address filter registers, statistic registers, and MIIM registers may be accessed. The description that follows is for EMAC0 host registers; however, the same description applies to EMAC1 host registers.

As mentioned above, an access to an EMAC host register may take several cycles of dcrClk signal 516, because a system clock of processor 103 may run at a higher frequency than host clock signal 440. As a result, polling or an interrupt may be used to ensure that an EMAC register access is completed before processor 103 issues another host interface access.

Some of the EMAC0 core host registers are Receive Configuration Word 0, Receive Configuration Word 1, Transmit Configuration, Flow Control Configuration and Management Configuration. From an exemplary flow for one of these registers, flows for all read/write access to EMAC0 core host registers will be understood. Additionally, continuing the example of PPC405 for processor 103, it will be assumed that access is to Device Control Registers (“DCRs”).

FIG. 4-22 is a flow diagram depicting an exemplary embodiment of a receive configuration word register read access flow 700. At 701, a read command and read address for EMAC0 core host register is set up. This may be for a DCR control register, such as control register 500 of FIG. 4-5A. A read bit may be registered, as well as the read address, for this set up. At 702, a DCR write to the DCR control register is done to instruct host interface 112 to execute the read command. In other words, the write to the DCR control register is done to read the EMAC0 core host register.

At 703, polling or waiting for an interrupt is done by processor 103 for confirmation of completion of the read. At 704, host interface 112 deposits the data read in a data register. In an implementation, the least significant word is read first, and thus read data is deposited into DCR dataRegLSW 498. A DCR read may be done from dataRegLSW 498 to retrieve the read data deposited.

FIG. 4-23 is a flow diagram depicting an exemplary embodiment of a receive configuration word register write access flow 710. At 711, write data to the EMAC core host register is set up. The write data may be set up for dataRegLSW 498.

At 712, the write command address for the EMAC core host register is set up. The write bit and write command address may be registered in Register File register for this set up. At 713, write data is put in a data register, such as DCR dataRegLSW 498. At 714, a write command is issued from processor 103 to host interface 112 to do a DCR write to control register 500 to instruct host interface 112 to write the data in dataRegLSW 498 into an EMAC core host register. Note that the write data may be written into dataRegLSW 498 before writing a command to control register 500. At 715, processor 103 polls or waits for an interrupt for another host interface instruction.
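For purposes of clarity by way of example and not limitation, read access flow 700 and write access flow 710 may be sketched in Python against a toy model of host interface 112. The DCR offsets, the position of the write bit, the width of the address field, the host register address used in the usage note, and all class and function names are hypothetical. The model completes each access immediately, whereas hardware may take several clock cycles.

```python
# Hypothetical DCR offsets and control-word layout, chosen for illustration.
DATA_REG_MSW, DATA_REG_LSW, RDY_STATUS, CNTL_REG = 0, 1, 2, 3
WRITE_ENABLE = 1 << 15  # assumed position of the write bit
ADDR_MASK = 0x3FF       # assumed width of the address field


class HostInterfaceModel:
    """Toy model: four DCR registers in front of a larger register space."""

    def __init__(self):
        self.host_regs = {}      # stands in for the EMAC core host registers
        self.dcr = [0, 0, 0, 0]  # dataRegMSW, dataRegLSW, RDYstatus, cntlReg

    def dcr_read(self, offset):
        return self.dcr[offset]

    def dcr_write(self, offset, value):
        self.dcr[offset] = value & 0xFFFFFFFF
        if offset == CNTL_REG:
            self._execute(value)

    def _execute(self, cntl):
        # Real hardware takes several cycles; the model finishes at once.
        self.dcr[RDY_STATUS] = 0
        addr = cntl & ADDR_MASK
        if cntl & WRITE_ENABLE:
            self.host_regs[addr] = self.dcr[DATA_REG_LSW]
        else:
            self.dcr[DATA_REG_LSW] = self.host_regs.get(addr, 0)
        self.dcr[RDY_STATUS] = 1


def host_write(hif, addr, data):
    hif.dcr_write(DATA_REG_LSW, data)             # write data first (713)
    hif.dcr_write(CNTL_REG, WRITE_ENABLE | addr)  # then the command (714)
    while not hif.dcr_read(RDY_STATUS):           # poll for completion (715)
        pass


def host_read(hif, addr):
    hif.dcr_write(CNTL_REG, addr)                 # read command (701, 702)
    while not hif.dcr_read(RDY_STATUS):           # poll for completion (703)
        pass
    return hif.dcr_read(DATA_REG_LSW)             # retrieve read data (704)
```

As a usage example, calling host_write(hif, 0x240, 0xDEADBEEF) on a HostInterfaceModel instance and then host_read(hif, 0x240) returns the value written, with only the four modeled DCR registers touched by the caller.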

Statistics registers may be implemented in the FPGA fabric 101. However, host interface 112 has logic as described above used to read the statistics registers. The statistics registers may be read only.

FIG. 4-24 is a flow diagram depicting an exemplary embodiment of a multicast frames received okay register read flow (“statistics register read flow”) 720. To read statistics registers, at 721 a set up for a read command and a statistics register address is done. This set up may be done for control register 500 by registering a read bit and a statistics register address.

At 722, a DCR write to cntlReg 500 is done to instruct host interface 112 to start a statistics register read. This may be done by issuing a read command from processor 103 to host interface 112.

At 723, polling or waiting for an interrupt may be done by processor 103 to determine if a read has been completed in response to the read command issued. Host interface 112 gets data read from the statistics register addressed and puts such data in data registers, such as in dataRegMSW 497 and dataRegLSW 498 in executing a read command. At 724, DCR bridge 113 reads from dataRegMSW 497 to obtain the most significant word of the read data, and at 725, DCR bridge 113 reads from dataRegLSW 498 to obtain the least significant word of the read data.
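The two reads at 724 and 725 yield one value split across two data registers. A minimal Python sketch of recombining them follows. It assumes each DCR data register is 32 bits wide, and the names WORD_BITS and combine_words are hypothetical.

```python
WORD_BITS = 32  # assumed width of each DCR data register


def combine_words(msw, lsw):
    """Join the most and least significant words into one counter value."""
    mask = (1 << WORD_BITS) - 1
    return ((msw & mask) << WORD_BITS) | (lsw & mask)
```

For example, combine_words(0x1, 0x2) gives 0x100000002 under the assumed 32-bit width.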

FIG. 4-25 is a flow diagram depicting an exemplary embodiment of a MIIM register read flow 730. At 731, a physical layer address and a register address are set up and written into a data register, such as into DCR dataRegLSW 498. At 732, a host register read enable bit is set, such as to a logic 0, and a MIIM control address is set up.

At 733, a DCR write to control register 500 is done thereby initiating a read of the MIIM register addressed. At 734, processor 103 polls the DCR RDYstatus register 499 or waits for an interrupt to determine whether the read has completed. When the data read completes, host interface 112 deposits the read data in dataRegLSW 498. Processor 103 may then do a DCR read on dataRegLSW 498 to get the MIIM register read data.
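The set up at 731 combines a physical layer address and a register address into one word for dataRegLSW 498. The following Python sketch shows one possible packing. The 5-bit field widths follow the MDIO management frame format of IEEE 802.3 Clause 22; the placement of the fields within the word and the function name are assumptions.

```python
PHY_ADDR_BITS = 5  # MDIO physical layer device address width
REG_ADDR_BITS = 5  # MDIO register address width


def pack_miim_address(phy_addr, reg_addr):
    """Pack PHY and register addresses into one word (assumed layout)."""
    if not 0 <= phy_addr < (1 << PHY_ADDR_BITS):
        raise ValueError("physical layer address out of range")
    if not 0 <= reg_addr < (1 << REG_ADDR_BITS):
        raise ValueError("register address out of range")
    # Assumption: register address in the low bits, PHY address above it.
    return (phy_addr << REG_ADDR_BITS) | reg_addr
```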

FIG. 4-26 is a flow diagram depicting an exemplary embodiment of a MIIM register write flow 740. At 741, MIIM write data is set up.

At 743, a Register File register is set up for writing thereto. A host write enable bit is set, such as to logic 1, and an address for MIIM write data register 541 is set.

At 744, MIIM data is written to a DCR data register, such as dataRegLSW 498. At 745, the data from dataRegLSW 498 is transferred to host memory-mapped MIIMwrData register 541 by doing a DCR write to cntlReg 500 with the host write enable bit and MIIMwrData register address set.

At 746, a physical layer device address and a register address are set up. At 747, the physical layer device address and the register address are written into DCR dataRegLSW 498. At 748, a write to DCR cntlReg 500 is done with host write enable bit set and MIIM control register address set to start a host interface write to MIIMwrData register 541. At 749, processor 103 polls the DCR RDYstatus register 499 or waits for an interrupt for another host interface instruction.

Reads and writes to address filter registers for a unicast address register and general configuration registers follow the same steps as reads and writes to EMAC core host registers, and thus the description of such reads and writes is not repeated. However, reads and writes to the address filter CAM are slightly different.

FIG. 4-27 is a flow diagram depicting an exemplary embodiment of a host interface CAM entry read flow 760. At 761, a CAM read/write bit is set, such as to a logic 1, and a CAM address is set. Also at 761, a CAM data field is cleared, such as to “0”. At 762, the CAM read/write and address bits are registered by a DCR write to dataRegLSW 498. At 763, a host register write enable bit is set, and a read configuration address table address is set.

At 764, a DCR write to cntlReg 500 is done. This write initiates a read of a host interface CAM entry, namely, a read of a register associated with the read configuration address table address. Responsive to the read initiated, host interface 112 deposits CAM entry read data, such as for CAM entry 1, upper bits in DCR dataRegMSW 497 and deposits CAM entry read data lower bits in DCR dataRegLSW 498. At 765, processor 103 polls the DCR RDYstatus register 499 or waits for an interrupt for completion of the read of the host interface CAM entry.

To obtain deposited read data, at 766 processor 103 issues a DCR read of dataRegMSW 497 and dataRegLSW 498 to get the CAM entry data. This may be done in two steps, where for example upper bits from dataRegMSW 497 are obtained first, and then lower bits from dataRegLSW 498 are obtained. Again, though the term CAM is used, it should be appreciated that it may be replaced with MAR throughout herein.

FIG. 4-28 is a flow diagram depicting an exemplary embodiment of a host interface CAM entry write flow 770. At 771, CAM data is set up and obtained. At 772, a DCR write to dataRegLSW 498 with the CAM data is done. At 773, a write to a write configuration address table is set up.

At 774, a host register write enable bit is set and an address to the write configuration address table is set up. At 775, a DCR write to cntlReg 500 is done with the EMAC host register write enable bit set, such as to a logic 1, and an address field set to the write configuration address table address. This commands host interface 112 to write CAM data from DCR dataRegLSW 498 into an address filter register, namely, the write configuration address table register associated with the write configuration address table address.

At 776, a CAM read/write bit is cleared, such as set to logic 0, and a CAM address field is set, such as to logic 1 for CAM entry 1. At 777, CAM data remaining is set up. At 778, a DCR write is done to place CAM write enable, CAM address and CAM data remaining into dataRegLSW 498.

At 779, a write to the read configuration address table is set up. At 780, the host register enable bit is set and a read configuration address table address is set up. At 781, a DCR write to cntlReg 500 is done with EMAC host register write enable bit set, such as to logic 1, and an address field set to the read configuration address table. This DCR write to cntlReg 500 commands host interface 112 to put the write data in dataRegLSW 498 into the CAM entry, such as CAM entry 1, which in this example may be a register associated with the read configuration address table address.
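By way of example and not limitation, the words placed in dataRegLSW 498 during CAM entry read flow 760 and CAM entry write flow 770 may be composed as in the following Python sketch. The bit positions of the CAM read/write bit and CAM address field, the split of a 48-bit entry into a 32-bit first word and a 16-bit remainder, and the function names are all assumptions made for illustration.

```python
CAM_RW_BIT = 1 << 23  # assumed position; 1 = read, 0 = write
CAM_ADDR_SHIFT = 16   # assumed position of the CAM address field
CAM_ADDR_MAX = 3      # four entries, as in the illustrated CAM


def cam_read_word(entry):
    """Word for a CAM read: read/write bit set, data field cleared."""
    if not 0 <= entry <= CAM_ADDR_MAX:
        raise ValueError("CAM entry out of range")
    return CAM_RW_BIT | (entry << CAM_ADDR_SHIFT)


def cam_write_words(entry, address48):
    """Split a 48-bit address into the two words written in flow 770."""
    if not 0 <= entry <= CAM_ADDR_MAX:
        raise ValueError("CAM entry out of range")
    if not 0 <= address48 < (1 << 48):
        raise ValueError("address does not fit in 48 bits")
    first_word = address48 & 0xFFFFFFFF  # CAM data written at 772
    remaining = address48 >> 32          # CAM data remaining, set up at 777
    # Read/write bit is left clear to indicate a write (776).
    second_word = (entry << CAM_ADDR_SHIFT) | remaining
    return first_word, second_word
```

Under these assumptions, cam_write_words(1, 0xAABBCCDDEEFF) yields the pair (0xCCDDEEFF, 0x1AABB), which would be written at 772 and 778 respectively.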

FIG. 4-29 is a block diagram depicting an exemplary embodiment of host interface CAM entry read flow 760. Host interface 112 includes DCR registers 751. Notably, in this exemplary implementation four DCR registers 497 through 500, as previously described, are used. A user's software program causes information to be loaded into dataRegLSW 498 as generally indicated by arrow 726. A user's program causes information to be loaded into cntlReg 500 as generally indicated by arrow 727. Once information is written into DCR registers 498 and 500, as previously described, hardware writes content of dataRegLSW 498 into register 718 of host interface registers 705. As previously mentioned, register 718 is a register associated with an address table configuration entry. Responsive to a CAM read/write bit and CAM address bits written into register 718, CAM 706 deposits CAM data 716 into dataRegMSW 497 and deposits CAM data 717 into dataRegLSW 498, as respectively generally indicated with arrows 728 and 729.

FIG. 4-30 is a block diagram depicting an exemplary embodiment of host interface CAM entry write flow 770. Information is written into dataRegLSW 498 by a user's software program, as generally indicated with arrow 737. Information is written into cntlReg 500 by a user's software program, as generally indicated with arrow 738. Contents of dataRegLSW 498 are written into register 750 of host interface registers 705, as generally indicated with arrow 739. Register 750 is associated with a configuration address table, as previously described.

Information is again written into dataRegLSW 498 by a user's software program, as generally indicated with arrow 737, and information is again written into cntlReg 500 by a user's software program, as generally indicated with arrow 738. Hardware writes dataRegLSW 498 content into register 718 of host interface registers 705, as generally indicated with arrow 767. Register 718 is associated with a configuration address table, as previously described. Responsive to a CAM read/write bit and CAM address bits written into register 718, hardware writes content from register 718 into location 716 of CAM 706, as generally indicated with arrow 768, and writes content from register 750 into location 717 of CAM 706, as generally indicated with arrow 769.

In both FIGS. 4-29 and 4-30, some numerical examples have been provided for purposes of clarity by way of example. For example, CAM 706 is illustratively shown as a 48 bit wide memory that is four entries deep; address locations for five host interface registers 705 are illustratively shown; and four DCR registers 751 having two-bit addresses are illustratively shown. However, it should be understood that other bit values, addresses, and numbers of registers/memory size may be used. Furthermore, though a CAM 706 is described, it should be understood that CAM functionality may be provided with circuits other than memory, such as registers and comparators.

FIG. 4-31 is a high-level block diagram depicting an exemplary embodiment of host interface 112 coupled to a physical layer device 119D. Physical layer device 119D includes MIIM registers 754. Data and control information are provided to data least significant word register 498 and control register 500 of DCR registers 751, as generally respectively indicated by arrows 755 and 756. This data and control information may be provided by a user software program. This may be done one time for initialization of host interface 112.

Data and control information are again provided to data least significant word register 498 and control register 500 of DCR registers 751, as generally respectively indicated by arrows 755 and 756. Again, this may be done by a user software program, though not for initialization this time.

After control information is passed to control register 500 in this second instance, host interface 112 captures MIIM read data in data register least significant word 498 responsive to physical layer device 119D asserting host MIIM ready signal 408. This capturing of data is generally indicated by arrow 758 where data from a register of MIIM registers 754 is transferred to data least significant word register 498. Notably, a physical layer device may be located internal or external to a programmable logic device in which an EMAC, such as EMAC 110, is located.

FIG. 4-32 is a high-level block diagram depicting an exemplary embodiment of interfacing between host interface 112 and physical layer device 119D for a write to an EMAC 110 or 111. Data and control information are provided to data least significant word register 498 and control register 500 as respectively indicated by arrows 792 and 793. This may be done once for initialization of host interface 112, and may be done by a user software program.

After control information is written to control register 500, as generally indicated by arrow 793, host interface 112 writes data from data least significant word register 498 to management configuration register 759 as generally indicated by arrow 791.

On a second iteration of writing data to data register least significant word 498 and control information to control register 500 by user software, host interface 112 initiates a write of data in data least significant word register 498 to MIIM write data register 752, as generally indicated with arrow 795.

On a third iteration of writing data to data register least significant word 498 and control information to control register 500 by user software, host interface 112 initiates a write of data in MIIM write data register 752 to a register of MIIM registers 754, as generally indicated with arrow 794. This write may be provided to physical layer device 119D via an MDIO interface.

Client Interface

Returning to FIG. 1, by providing an embedded EMAC, clock frequency may be increased to higher than approximately 125 MHz with an implementation employing standard cells in processor block 102. In an implementation, EMAC 110 may be clocked at approximately twice the clock frequency of supporting logic in FPGA 100 (“overclocking”), so that EMAC 110 is capable of approximately doubling its output data rate to enhance data throughput.

In an implementation, supporting logic in FPGA 100 is run at approximately 125 MHz because it is the clock frequency that FPGA fabric 101 supports, and thus existing supporting logic does not have to be extensively redesigned for migration to an FPGA 100 having one or more embedded EMACs. For purposes of clarity by way of example and not limitation, it will be assumed that supporting logic in FPGA 100 is operated at half the clock frequency of embedded EMAC 110, other than at the boundaries between EMAC 110 and FPGA fabric 101. Accordingly, width of client interfaces 127 and 128 is doubled, for example from an 8 bit width to a 16 bit width, to compensate for the slower clock frequency of FPGA fabric 101 to maintain enhanced data throughput of EMAC 110.
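The compensation described above is simple arithmetic and may be checked as in the following Python sketch. The 250 MHz figure is an assumption standing in for approximately twice 125 MHz, and the names used are hypothetical.

```python
def throughput_bps(width_bits, clock_hz):
    """Raw datapath throughput: bits per clock times clocks per second."""
    return width_bits * clock_hz


# EMAC side: 8 bit datapath at an assumed 250 MHz.
emac_side = throughput_bps(8, 250_000_000)
# Fabric side: 16 bit datapath at approximately 125 MHz.
fabric_side = throughput_bps(16, 125_000_000)
```

Both products equal 2,000,000,000 bits per second, which is why doubling the client interface width lets the fabric keep pace at half the clock frequency.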

To maintain backward compatibility, datapath widths of client interfaces 127 and 128 may be selectable, such as for example to be 8 bits, when both EMAC 110 and a user design instantiated in FPGA fabric 101 (“the client”) are running at the same clock frequency, such as for example approximately 125 MHz. Selection of datapath width of client interfaces 127 and 128 may be independently controlled by input pins to processor block 102 for each receive and transmit direction, because transmit and receive may run independently from one another and thus may operate at different frequencies.

Mode select signals may be provided to FPGA fabric 101 via input tie-off pins of processor block 102. Input tie-off pins could be tied to a particular value when an FPGA is configured or could be varied if they are controlled by FPGA logic.

It should be understood that configurable logic of FPGA 100 may operate even maximally at a frequency which is substantially less than that achievable by an EMAC 110 embedded in FPGA 100. For example, configurable logic of FPGA 100 may have a maximum frequency of operation of approximately 125 MHz, and embedded EMAC 110 may have a maximum frequency of operation of approximately twice or greater than that of FPGA 100. Thus, by having a wider data width for configurable logic, frequency of operation of embedded EMAC 110 may be greater than that of configurable logic. However, embedded EMAC 110 may also be used for communication to networks, backplanes or other media outside of FPGA 100. Embedded EMAC 110 may be capable of data rates greater than 1.25 Gigabits per second, which is greater than the current Ethernet standard, namely, approximately 1.0 Gigabits per second. Accordingly, for example, FPGA 100 may be coupled to another medium, such as a backplane, to operate at non-standard data rates, such as in excess of 1.25 Gigabits per second. Thus, transmit and receive client interfaces described herein should not be considered as only being coupled for communication with configurable logic of FPGA 100, but may be used for communication external to FPGA 100, including communication at non-standard data rates.

From Tx engine 820, transmit collision signal 829 and transmit retransmit signal 830 are provided to Tx client interface 127 (e.g., Tx DP 127D). Transmit client interface 127 is configured to provide transmit collision signal 837 and transmit retransmit signal 838 responsive to transmit collision signal 829 and transmit retransmit signal 830 respectively. A transmit collision signal is to indicate a collision on a medium, and a retransmit signal is to indicate a frame to retransmit owing to aborting transmission of the frame due to the collision.

Transmit client interface 127 is configured to provide transmit acknowledge output signal 832 responsive to transmit acknowledge signals 823 and 824. A transmit acknowledge signal is a handshake signal, which for example may be asserted after an EMAC accepts a first byte of data of a transmitted frame. Data, such as from FPGA fabric 101, to be transmitted, may be provided to transmit client interface 127 via transmit data input signal 833, transmit data valid signal most significant word input signal 834, and transmit data valid input signal 835. Additionally, transmit underrun signal 836 may be asserted by a client to force an EMAC to insert an error code to corrupt the then current frame and then fall back to an idle transmission state. For example, an aborted transfer can occur if a first-in, first-out buffer stack (“FIFO”) coupled to a client interface empties before a frame is completely transmitted.

In an implementation, transmit data input signal 833 may actually be 16 signals, namely a 16 bit wide input, where transmit client interface 127 is configured to relay such data to transmit engine 820 via transmit data signal 826 at a fraction of such input data width; for example, transmit data signal 826 may be an 8 bit wide signal. Accordingly, it should be appreciated that FPGA fabric 101 may operate at a slower frequency than EMAC 110, thereby allowing EMAC 110 to have a higher data throughput though processing data in a width that is less than the input data width.
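The width conversion may be pictured as serializing each 16 bit word into two octets. The following Python sketch is illustrative only. It assumes the lower byte, bits [7:0], is sent first, and it uses a hypothetical last_msw_valid flag to stand for the most significant word data valid input being deasserted on the final word of a frame having an odd number of octets.

```python
def words_to_octets(words, last_msw_valid=True):
    """Serialize 16 bit words into octets, lower byte first (assumed)."""
    octets = []
    for index, word in enumerate(words):
        octets.append(word & 0xFF)  # bits [7:0]
        is_last = index == len(words) - 1
        # The upper byte of the final word is skipped for an odd-length frame.
        if not is_last or last_msw_valid:
            octets.append((word >> 8) & 0xFF)  # bits [15:8]
    return octets
```

For example, the words 0x2211 and 0x4433 serialize to octets 0x11, 0x22, 0x33, 0x44, or to only the first three of these when last_msw_valid is false.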

A transmit inter-frame gap (“IFG”) delay signal 816 may be provided from FPGA fabric 101 to transmit engine 820 for adjustment of delay between frames. Such a signal may be a plurality of signals for a particular bit width, such as for example an 8 bit width.

Transmit engine 820 may be configured for a Media Independent Interface (“MII”) and in particular a Gigabit MII (“GMII”). For purposes of clarity by way of example and not limitation, it will be assumed that Tx engine 820 is configured for a GMII. A Gigabit transmit clock signal 811 is provided to transmit engine 820 from a user. Responsive to clock signal 811, transmit engine 820 is configured to provide GMII transmit clock signal 812. Transmit engine 820 is further configured to provide GMII transmit enable signal 813, GMII transmit data signal 814, and GMII transmit error signal 815. GMII transmit data signal 814 is responsive to transmit data signal 826, and in an implementation may have the same bit width, such as 8 signals for an 8 bit wide output.

Data input to register 881 is transmit divide by two clock signal 822. Data input to register 882 is transmit data valid most significant word input signal 834. Data input to register 883 is transmit data valid input signal 835. Data input to registers 893 and 895 is transmit data input signal 833, which for example may be a 16 bit wide input. Output of register 881 is transmit divide by two clock registered signal 864, which is provided as input to AND gate 886 and AND gate 891, and is provided as a control select input to multiplexer 889. Output of register 882 is data valid most significant word registered signal 862, which is provided as an input to datapath multiplexer controller 896 and to a logic high input of multiplexer 889. Output of register 883 is data valid registered signal 863, which is provided as an input to datapath multiplexer controller 896 and to AND gate 891.

Output of AND gate 891 is a control select signal input to multiplexer 892. Output of register 893 is registered data signal 865 which may be a 16 bit wide output. Data registered signal 865 is provided to a logic high input of multiplexer 892 and to multiplexer 897. Data registered signal 865 may be divided for inputting to multiplexer 897 in an implementation directed to specific designated binary input ports. Output of data register 895 is provided to a logic low level input of multiplexer 898 and is transmit data 8 bit mode registered signal 869. Output of multiplexer 892 is provided to register 894.

Output of register 894 is fed back to a logic low state input of multiplexer 892 and is data registered two signal 868, which may be a 16 bit wide data signal in an implementation. Data registered two signal 868 is provided as data input to multiplexer 897. In an implementation, data registered two signal 868 may be divided in half, with one half going to one binary logic designation of multiplexer 897 and the other half going to a different binary designation input of multiplexer 897. Output from datapath multiplexer controller 896 is select signals, such as select signals S0, S1, S2, S3, respectively referenced as signals 872 through 875, which are provided as input to multiplexer 897. Thus, for example, signal S3 may be used for selecting data bits [15:8] input to port 0001 of multiplexer 897. Continuing this exemplary implementation, each select signal 872 through 875 would be for selecting a different portion of either data registered signal 865 or 868 for output from multiplexer 897.

Output from multiplexer 897 is transmit data 16 bit mode to 8 bit mode signal 876. Accordingly, in an implementation, transmit data 16 bit mode to 8 bit mode signal 876 would be an 8 bit wide signal which may be provided to a logic high input of multiplexer 898. Select 16 bit mode signal 825 may be provided as a control select signal to multiplexer 898. The other input to multiplexer 898, namely transmit data 8 bit mode registered signal 869, is provided to a logic low input of multiplexer 898.

Either of inputs 876 or 869 may be selected responsive to select mode signal 825 to provide transmit data signal 826, which in an implementation would be an 8 bit wide data output. Another input to AND gate 886 is transmit acknowledge signal 824, and output from AND gate 886 is acknowledge at a logic high state of divide by two clock signal 866, which is provided as a control select input to multiplexer 887. Another input to multiplexer 887 may be tied to a logic high state. Output of multiplexer 887 is provided as an input, such as a logic zero input, to multiplexer 888. Another input to multiplexer 888 may be tied to a logic low state. Output of multiplexer 888 may be provided as data input to register 884. Output of multiplexer 889 is provided to data register 885, the output of which is data valid most significant word registered two signal 867, which is provided as an input to datapath multiplexer controller 896 and fed back as a data input to multiplexer 889.

Data valid inputs TX_DV_MSW_IN 834 and TX_DV_IN 835 and data input TX_DATA_IN[15:0] 833 from FPGA fabric 101 may be registered immediately responsive to TX_DIV2_CLK 822 in Tx client interface 810 to facilitate timing in a design instantiated in FPGA fabric 101. Sel16bMode signal 825 is for selecting a data width mode, for example such as whether a Tx client interface is operating in a 16-bit mode or 8-bit mode.

Tx client interface 810 may have to handle a number of instances of input data, for example four. In an instance, TX_ACK 824 is asserted while TX_DIV2_CLK 822 is at a logic high level, and transmit data is an even number of bytes. In another instance, TX_ACK 824 is asserted while TX_DIV2_CLK 822 is at a logic high level, and transmit data is an odd number of bytes. In yet another instance, TX_ACK 824 is asserted while TX_DIV2_CLK 822 is at a logic low level, and transmit data is an even number of bytes. And in still yet another instance, TX_ACK 824 is asserted while TX_DIV2_CLK 822 is at a logic low level, and transmit data is an odd number of bytes.

An acknowledge (“ACKatDiv2CkHiReg”) signal 871 is asserted when TX_DIV2_CLK signal 822 is registered at a logic high level. Acknowledge signal 871 may be used to determine when TX_ACK 824 is asserted with respect to phase of TX_DIV2_CLK 822.

FIG. 5D is a state diagram depicting an exemplary embodiment of a state machine 900 for dpMuxCntl block 896. State machine 900 is reset responsive to reset signal 874, which places state machine 900 in an idle state 907. State machine 900 stays in idle state 907 until a data valid register (“DVIdReg”) signal 863 is asserted.

State machine 900 transitions from idle state 907 to an odd octet transmission state A1 901 responsive to signal DVIdReg 863 being asserted. State machine 900 stays in state A1 901 until Tx_Ack signal 824 is asserted. If only one data octet is being transmitted, such that a data valid most significant word register (“DVIdMSWreg”) signal 862 is not asserted when Tx_Ack signal 824 is asserted, state machine 900 transitions from state A1 901 back to idle state 907. If two or more data octets are transmitted, DVIdMSWreg signal 862 may be maintained in an asserted state when Tx_Ack signal 824 is asserted, causing state machine 900 to transition from state A1 901 to an even octet transmission state A2 902. State machine 900 transitions from state A2 902 back to idle state 907 if DVIdReg signal 863 is deasserted.

If ACKatDiv2CkHiReg signal 871 is asserted while DVIdReg signal 863 is being asserted, state machine 900 transitions from state A2 902 to an odd octet transmission state A3 903. Notably, Tx_Ack signal 824 is asserted while Tx_Div2_Clk signal 822 is in a logic high state for this transition. If DVIdMSWreg signal 862 is deasserted at this juncture, meaning that the current transmission is done, state machine 900 transitions from state A3 903 back to idle state 907.

If DVIdMSWreg signal 862 is still asserted, providing an even number of data octets is being transmitted, state machine 900 transitions from state A3 903 to an even octet transmission state A4 904. If DVIdMSWreg signal 862 is then deasserted, state machine 900 transitions from state A4 904 back to odd octet transmission state A3 903.

If both DVIdReg signal 863 and ACKatDiv2CkHiReg signal 871 are not asserted while state machine 900 is in state A2 902, and Tx_Ack signal 824 is asserted while Tx_Div2_Clk signal 822 is in a logic low state, state machine 900 transitions from state A2 902 to an odd octet transmission state A5 905. If DVIdMSWreg signal 862 is deasserted, providing this transmission of an odd number of data octets is done, state machine 900 transitions from state A5 905 back to idle state 907. If DVIdMSWreg signal 862 is asserted for transmission of an even number of data octets, state machine 900 transitions from state A5 905 to an even octet transmission state A6 906.

If DVIdReg signal 863 is asserted for transmission of an odd number of data octets, state machine 900 transitions from state A6 906 back to state A5 905. If DVIdReg signal 863 is deasserted, providing this transmission of an even number of data octets is done, state machine 900 transitions from state A6 906 back to idle state 907.

All four outputs of state machine 900, namely outputs S0 through S3, are logic 0 in idle state 907. Output S0 is a logic 1 and outputs S1 through S3 are all logic 0 in both states A1 901 and A5 905. Output S1 is a logic 1 and outputs S0, S2, and S3 are all logic 0 in both states A2 902 and A6 906. Output S2 is a logic 1 and outputs S0, S1, and S3 are all logic 0 in state A3 903. Output S3 is a logic 1 and outputs S0 through S2 are all logic 0 in state A4 904.

Outputs of state machine 900, namely, outputs S0 through S3, are 1-bit wide signals. States, namely, states A1 901 through A6 906, of state machine 900 for signal outputs S0 through S3 of dpMuxCntl block 896 are set forth below in Table 18. Table 18 lists state machine 900 status for output signals for each of the states in FIG. 5D.

TABLE 18

State   S0   S1   S2   S3
IDLE    0    0    0    0
A1      1    0    0    0
A2      0    1    0    0
A3      0    0    1    0
A4      0    0    0    1
A5      1    0    0    0
A6      0    1    0    0
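The output decode of Table 18 can be sketched as a simple lookup. The following Python is a minimal illustration only, not the circuit of dpMuxCntl block 896; the names `STATE_900_OUTPUTS` and `outputs_900` are hypothetical.

```python
# Hypothetical model of the Table 18 output decode for state machine 900.
# Each entry maps a state name to the tuple (S0, S1, S2, S3).
STATE_900_OUTPUTS = {
    "IDLE": (0, 0, 0, 0),
    "A1": (1, 0, 0, 0),
    "A2": (0, 1, 0, 0),
    "A3": (0, 0, 1, 0),
    "A4": (0, 0, 0, 1),
    "A5": (1, 0, 0, 0),
    "A6": (0, 1, 0, 0),
}


def outputs_900(state):
    """Return (S0, S1, S2, S3) for the named state of state machine 900."""
    return STATE_900_OUTPUTS[state]
```

In this sketch every non-idle state drives exactly one output high, and states A5 and A6 repeat the outputs of states A1 and A2, respectively, as listed in Table 18.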

FIG. 5J-1 is a schematic diagram depicting an exemplary embodiment of a transmit data valid generator 1020. Transmit data valid generator 1020 receives data valid registered signal 863 to a logic high input of multiplexer 1021 and to data valid generator 1027. Output of multiplexer 1021 is provided to register 1022 which is clocked responsive to transmit clock signal 821. Output of register 1022 is data valid registered two signal 1034 which is provided to an input of inverter 1024 and to a logic low input of multiplexer 1021.

Multiplexer 1021 is provided a registered clock signal, namely transmit divide by two clock registered signal 864, as a control select signal input. Data valid registered signal 863 is provided to an input of AND gate 1025 along with the output of inverter 1024 to provide data valid start pulse signal 1033. Data valid start pulse signal 1033 is provided as an input to data valid generator 1027. Other inputs to data valid generator 1027 are transmit acknowledge signal 824, data valid most significant word registered signal 862, and acknowledge at a logic high state of divide by two clock signal 866.
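The start pulse logic described above may be sketched cycle by cycle as follows. This Python is a simplified behavioral model under stated assumptions: register 1022 is assumed to reset to logic 0, each list element represents one transmit clock cycle, and the function name `start_pulse_trace` is hypothetical.

```python
def start_pulse_trace(dv_reg, div2_clk_reg):
    """Model data valid start pulse signal 1033, one sample per Tx clock cycle.

    dv_reg       -- samples of data valid registered signal 863 (0 or 1)
    div2_clk_reg -- samples of divide by two clock registered signal 864
    """
    reg2 = 0  # register 1022 output (signal 1034); reset value is an assumption
    pulses = []
    for dv, sel in zip(dv_reg, div2_clk_reg):
        # AND gate 1025: signal 863 AND inverted signal 1034 (inverter 1024)
        pulses.append(dv & (1 - reg2))
        # Multiplexer 1021: load signal 863 when select is high, else hold
        reg2 = dv if sel else reg2
    return pulses
```

In this model the start pulse stays high until register 1022 captures the asserted data valid signal, which happens only on a cycle where the registered divide by two clock is high, so the pulse may last one or two transmit clock cycles.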

FIG. 5J-2 is a state diagram depicting an exemplary embodiment of a state machine 1040 for data valid generator 1027. State machine 1040 is reset responsive to reset signal 874, which places state machine 1040 in an idle state 1046. State machine 1040 stays in idle state 1046 until a data valid start pulse (“DVIdStart_p”) signal 1033 is asserted.

If DVIdStart_p signal 1033 is asserted, state machine 1040 transitions from idle state 1046 to a first data octet state B1 1041. State machine 1040 stays in state B1 1041 until TX_ACK signal 824 is asserted. If TX_ACK signal 824 is asserted while DVIdMSWreg signal 862 is not asserted for only one data octet being transmitted, state machine 1040 transitions from state B1 1041 back to idle state 1046.

If ACKatDiv2CkHi signal 866 is asserted, state machine 1040 transitions from state B1 1041 to state B2 1042. State machine 1040 stays in state B2 1042 if DVIdReg signal 863 and DVIdMSWreg signal 862 are both asserted for continued data input for the then current transmission. If DVIdReg signal 863 is deasserted for an even number of data octets being transmitted, state machine 1040 transitions from state B2 1042 back to idle state 1046. If DVIdMSWreg signal 862 is deasserted for an odd number of data octets being transmitted, state machine 1040 transitions from state B2 1042 to state B3 1043. From state B3 1043 state machine 1040 transitions back to idle state 1046 at a completion of the then current transmission.

If, while in state B1 1041, ACKatDiv2CkHi signal 866 is deasserted, state machine 1040 transitions from state B1 1041 to state B4 1044. State machine 1040 stays in state B4 1044 if DVIdReg signal 863 and DVIdMSWreg signal 862 are both asserted for continued data input for the then current transmission. If DVIdReg signal 863 and DVIdMSWreg signal 862 are both deasserted for an even number of data octets being transmitted, state machine 1040 transitions from state B4 1044 back to idle state 1046. If DVIdReg signal 863 is asserted while DVIdMSWreg signal 862 is deasserted for an odd number of data octets being transmitted, state machine 1040 transitions from state B4 1044 to state B5 1045. From state B5 1045 state machine 1040 transitions back to idle state 1046 at a completion of the then current transmission.

The output of state machine 1040, which is output TX_DATA_VALID signal 827, is a logic 0 in idle state 1046 and in states B3 1043, B4 1044, and B5 1045. Output TX_DATA_VALID signal 827 is a logic 1 or a logic 0, namely, the content of a data valid register (“DVIdReg”), in state B1 1041 and in state B2 1042.

The output of state machine 1040, namely, output TX_DATA_VALID signal 827, is a 1-bit wide signal. States, namely, states B1 1041 through B5 1045, of state machine 1040 for output TX_DATA_VALID signal 827 are set forth below in Table 19. Table 19 lists state machine 1040 status for TX_DATA_VALID signal 827 for each of the states in FIG. 5J-2.

TABLE 19

State   TX_DATA_VALID
IDLE    0
B1      DVIdReg
B2      DVIdReg
B3      0
B4      0
B5      0
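Table 19 amounts to passing the data valid register content through in two states and forcing a logic 0 otherwise. A minimal Python sketch of that decode follows; the function name `tx_data_valid` and the string state names are hypothetical.

```python
# Hypothetical model of the Table 19 output decode for state machine 1040.
PASS_THROUGH_STATES = ("B1", "B2")
ALL_STATES = ("IDLE", "B1", "B2", "B3", "B4", "B5")


def tx_data_valid(state, dvld_reg):
    """Return TX_DATA_VALID for the given state and DVIdReg content (0 or 1)."""
    if state not in ALL_STATES:
        raise ValueError("unknown state: %r" % (state,))
    # In B1 and B2 the output follows DVIdReg; in all other states it is 0.
    return dvld_reg if state in PASS_THROUGH_STATES else 0
```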

FIGS. 5E, 5F, 5G and 5H are respective output timing diagrams of exemplary embodiments of either even or odd transmit data byte lengths for when transmit client interface 127 is in a 16-bit mode. In FIG. 5E, TX_ACK signal 824 is asserted when TX_DIV2_CLK signal 822 is generally at a logic high level and TX_DATA[7:0] signal 826 has an even number of bytes. Notably, TX_DV_IN signal 835 and TX_DV_MSW_IN signal 834 are generally asserted, raised to a logic high level, at 930 and de-asserted, lowered to a logic low level, at 950.

In FIG. 5F, TX_ACK signal 824 is asserted when TX_DIV2_CLK signal 822 is generally at a logic high level and TX_DATA[7:0] signal 826 has an odd number of bytes. Notably, TX_DV_IN signal 835 and TX_DV_MSW_IN signal 834 are generally asserted at 930, but TX_DV_MSW_IN signal 834 is de-asserted at 970 prior to TX_DV_IN signal 835 which is de-asserted at 950.

In FIG. 5G, TX_ACK signal 824 is asserted when TX_DIV2_CLK signal 822 is generally at a logic low level and TX_DATA[7:0] signal 826 has an even number of bytes. Notably, TX_DV_IN signal 835 and TX_DV_MSW_IN signal 834 are generally asserted at 930 and de-asserted at 970.

In FIG. 5H, TX_ACK signal 824 is asserted when TX_DIV2_CLK signal 822 is generally at a logic low level and TX_DATA[7:0] signal 826 has an odd number of bytes. Notably, TX_DV_IN signal 835 and TX_DV_MSW_IN signal 834 are generally asserted at 930, but TX_DV_MSW_IN signal 834 is de-asserted at 990 prior to TX_DV_IN signal 835 which is de-asserted at 970.

FIG. 5I is an output timing diagram depicting an exemplary embodiment of a bypass mode for when transmit client interface 127 is in an 8-bit mode. In a bypass mode, TX_DV_MSW_IN signal 834 is maintained de-asserted, and TX_DV_IN signal 835 is asserted generally at 930 and de-asserted generally at 970. In this example, TX_DATA[7:0] signal 826 has an even number of bytes, though an odd number of bytes may be used in bypass mode.

Rx client interface 128 is configured to provide data valid most significant word output signal 853 and data valid output signal 854 responsive to Rx data signal 846 and Rx data valid signal 847. Rx client interface 128 is configured to provide Rx good frame output signal 855 and Rx bad frame output signal 856 responsive to Rx good frame signal 848 and Rx bad frame signal 849, respectively. Rx client interface 128 converts RX_GOOD_FRAME and RX_BAD_FRAME signals 848 and 849 from an RX_CLK signal 841 domain to an RxDiv2Clk signal 845 domain. A good frame signal may be asserted after the last byte of data is received to indicate reception of a compliant frame. A bad frame signal may be asserted after the last byte of data is received to indicate reception of a non-compliant frame.

Rx client interface 128 obtains RX_DATA_VALID 847 and RX_DATA[7:0] 846 from a physical layer interface, such as for an Ethernet. In an implementation, this data signaling may be at a frequency up to approximately 250 MHz when an overclocking or a 16-bit mode is used. By registering upon receipt RX_DATA_VALID 847 and RX_DATA[7:0] 846 in Rx client interface 128, design timing is simplified. In an implementation, Rx client interface 128 may assemble two data octets for output to FPGA fabric 101 in 16-bit increments so that FPGA fabric 101 can be run at half of the clock frequency of incoming data while maintaining data throughput.

FIG. 5K is a schematic diagram depicting an exemplary embodiment of an Rx client interface 128 (e.g., Rx DP 128D). Rx client interface 128 outputs two data valid signals, namely dataVldOut 854 and dataVldMSWout 853, to indicate validity of assembled data, such as the two data octets in the above example. Sel16bMode signal 851 indicates whether Rx client interface 128 is used in a particular mode, such as a 16-bit or an 8-bit mode. Rx client interface 128 processes instances of input data where RX_DATA_VALID 847 is asserted, such as for a received frame having an even or odd number of data octets. Example embodiments of receive output timing are described below.
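The assembly of received octets into 16-bit words with two valid flags may be sketched as follows. This Python is an illustration under stated assumptions, not the datapath of FIG. 5K: the byte order within a word (first octet in bits [7:0], second octet in bits [15:8]) is assumed, and the function name `assemble_words` is hypothetical.

```python
def assemble_words(octets):
    """Pair received octets into 16-bit words.

    Returns a list of (data16, data_vld_out, data_vld_msw_out) tuples.
    Assumption: the first octet of each pair occupies bits [7:0] and the
    second occupies bits [15:8]; the byte order is not taken from the text.
    """
    words = []
    for i in range(0, len(octets), 2):
        low = octets[i] & 0xFF
        has_high = i + 1 < len(octets)
        high = (octets[i + 1] & 0xFF) if has_high else 0
        # The MSW valid flag is 0 on the final word of an odd-length frame.
        words.append(((high << 8) | low, 1, 1 if has_high else 0))
    return words
```

For a frame with an odd number of octets, the last word in this sketch carries only one valid octet, which is the case the separate most significant word valid signal exists to flag.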

Output of multiplexer select register 1054 is provided as data input to multiplexer select register 1059. Output of receive data valid register 1055 is provided to an input port of AND gate 1057. Output of receive data register 1056 is provided to an input of AND gate 1058. Initially, select sixteen-bit mode signal 851 is provided as inputs to AND gates 1057 and 1058. Output of AND gate 1057 is data valid sixteen-bit mode signal 1087. Output of AND gate 1058 is data sixteen-bit mode signal 1092. Output of AND gate 1057 is provided as a data input to data valid register 1060 and to an input of AND gate 1066. Output of AND gate 1058 is provided to data register 1061 and to bus 1088.

Output of multiplexer select register 1059 is provided as data input to multiplexer select register two 1062. Output of data valid register 1060 is provided to data input of data valid register two 1063, to a logic low input port of multiplexer 1067, and to an input port of AND gate 1065. In addition to data valid sixteen-bit mode signal 1087 provided to an input port of AND gate 1066, multiplexer select register signal 1091 is provided as another input to AND gate 1066.

Output of data register 1061 is provided to data register two 1064 and to buses 1088 and 1086. Output of data register two 1064 is provided to bus 1086. Accordingly, data register output from register 1061, in combination with data sixteen-bit mode signal 1092, provides a sixteen-bit wide input bus to a logic low port of multiplexer 1069. Moreover, data register output from register 1064, in combination with the output from register 1061, provided to bus 1086 provides a sixteen-bit wide input bus to a logic high port of multiplexer 1069.

Multiplexer select register two 1062 output is provided as a control select signal input to multiplexers 1067, 1068, and 1069 to select between logic low and logic high input ports for output respectively from such multiplexers. Output of multiplexer select register two 1062 is multiplexer select register two signal 1090. Output from data valid register two 1063 is provided to a logic high input port of multiplexer 1067 and to an input of AND gate 1065. Output from AND gate 1065 is provided to a logic high input port of multiplexer 1068. Output of AND gate 1066 is provided to a logic low input port of multiplexer 1068.

Output of multiplexer 1067 is data sixteen-bit valid signal 1095, and is provided as data input to register 1070. Output from multiplexer 1068 is provided as data input to register 1071 and is data sixteen-bit valid most significant word signal 1096. Output from multiplexer 1069 is data sixteen-bit signal 1097, which is provided as a data input to register 1072.

FIG. 5L is a schematic diagram depicting an exemplary embodiment of a circuit implementation of multiplexer select register A 1051. Receive divide by two clock signal 845 is provided as a data input to register 1101. Notably, registers as described herein may be implemented with flip-flops. Moreover, such flip-flops may have resets, which reset signals are not shown for purposes of clarity. Register 1101 is clocked responsive to receive inverted clock signal 1083.

Output of register 1101 is receive divide by two clock register signal 1111, which is input to AND gate 1102. Receive data valid signal 847 and inverted receive data valid register A signal 1115 are provided as inputs to AND gate 1103. Notably, inverted receive data valid register A signal 1115 may be data output of receive data valid register A 1052. Output of AND gate 1103 is start pulse signal 1112, which is provided as an input to AND gate 1102. Another input to AND gate 1102 is select sixteen-bit mode signal 851. Output of AND gate 1102 is provided as a control select input to multiplexer 1105.

A logic high input port of multiplexer 1105 is coupled to a logic high bias voltage 1117. A logic low input port of multiplexer 1105 is coupled to receive a feedback output, namely multiplexer select register A output signal 1118. Output of multiplexer 1105 is provided to a logic low input port of multiplexer 1106. A logic high input port of multiplexer 1106 is coupled to a logic low bias, such as ground 1089.

An inverted receive data valid signal 1113 and receive data valid register A signal 1114, which may be data output of receive data valid register A 1052 of FIG. 5K, are input to AND gate 1104. Output of AND gate 1104 is end pulse signal 1116 which is provided as a control select input to multiplexer 1106. Output of multiplexer 1106 is provided as a data input to register 1107. Register 1107 is clocked responsive to receive clock signal 278. Output of register 1107 is multiplexer select register A signal 1118.
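The set and clear behavior of multiplexer select register A described for FIG. 5L can be summarized as a next-state function. The following Python is a minimal sketch, assuming all signals are single bits sampled at the same receive clock edge; the function and parameter names are hypothetical.

```python
def mux_sel_reg_a_next(current, rx_dv, rx_dv_reg_a, rx_div2_clk_reg, sel_16b_mode):
    """Return the next value of multiplexer select register A signal 1118.

    current         -- present output of register 1107
    rx_dv           -- receive data valid signal 847
    rx_dv_reg_a     -- output of receive data valid register A 1052
    rx_div2_clk_reg -- receive divide by two clock register signal 1111
    sel_16b_mode    -- select sixteen-bit mode signal 851
    """
    start_pulse = rx_dv & (1 - rx_dv_reg_a)                 # AND gate 1103
    end_pulse = (1 - rx_dv) & rx_dv_reg_a                   # AND gate 1104
    set_sel = rx_div2_clk_reg & start_pulse & sel_16b_mode  # AND gate 1102
    after_set = 1 if set_sel else current                   # multiplexer 1105
    return 0 if end_pulse else after_set                    # multiplexer 1106
```

In this sketch the register is set only when a frame starts while the registered divide by two clock is high in 16-bit mode, holds its value during the frame, and is cleared at the end pulse.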

FIGS. 5M, 5N, 5O and 5P are respective output timing diagrams of exemplary embodiments of either even or odd receive data byte lengths for when receive client interface 128 is in a 16-bit mode. In FIG. 5M, RX_DATA_VALID signal 847 is generally asserted when RxDiv2Clk signal 845 is generally at a logic high level and RX_DATA[7:0] signal 846 has an even number of bytes. Notably, RX_DATA_VALID signal 847 is generally asserted at 1010 and maintained at the logic high level for reception of all data bytes of RX_DATA[7:0] signal 846, after which RX_DATA_VALID signal 847 is de-asserted after/during reception of the last data byte generally at 1120.

In FIG. 5N, RX_DATA_VALID signal 847 is generally asserted when RxDiv2Clk signal 845 is generally at a logic high level and RX_DATA[7:0] signal 846 has an odd number of bytes. Notably, RX_DATA_VALID signal 847 is generally asserted at 1010 and maintained at the logic high level for reception of all data bytes of RX_DATA[7:0] signal 846, after which RX_DATA_VALID signal 847 is de-asserted after/during reception of the last data byte generally at 1140.

In FIG. 5O, RX_DATA_VALID signal 847 is asserted when RxDiv2Clk signal 845 is generally at a logic low level and RX_DATA[7:0] signal 846 has an even number of bytes. Notably, RX_DATA_VALID signal 847 is generally asserted at 1150 and maintained at the logic high level for reception of all data bytes of RX_DATA[7:0] signal 846, after which RX_DATA_VALID signal 847 is de-asserted after/during reception of the last data byte generally at 1170.

In FIG. 5P, RX_DATA_VALID signal 847 is asserted when RxDiv2Clk signal 845 is generally at a logic low level and RX_DATA[7:0] signal 846 has an odd number of bytes. Notably, RX_DATA_VALID signal 847 is generally asserted at 1150 and maintained at the logic high level for reception of all data bytes of RX_DATA[7:0] signal 846, after which RX_DATA_VALID signal 847 is de-asserted after/during reception of the last data byte generally at 1120.

FIG. 5Q is an output timing diagram depicting an exemplary embodiment of a bypass mode for when receive client interface 128 is in an 8-bit mode. In a bypass mode, RX_DATA_VALID signal 847 is generally asserted at 1150 and maintained at the logic high level for reception of all data bytes of RX_DATA[7:0] signal 846, after which RX_DATA_VALID signal 847 is de-asserted after/during reception of the last data byte generally at 1120. Both assertion and de-assertion of RX_DATA_VALID signal 847 generally occur while RxDiv2Clk signal 845 is either at a logic low level or at a logic high level. In this example, RX_DATA[7:0] signal 846 has an odd number of bytes, though an even number of bytes may be used in bypass mode. Furthermore, notably, signals 853, 1090, 1095, 1096, and 1097 are not used in this bypass mode, i.e., they are maintained de-asserted.

It should be understood that each transmit and receive data pathway may be configured for a bit width, such as for example 8 or 16 bits wide, with each such pathway being synchronous to a clock, respectively such as a TX_CLK or a RX_CLK, for independent full-duplex operation.

Physical Layer Interface

Returning to FIG. 1, EMAC 110 can be configured to interface to MII/GMII/MGT physical layer (“PHY”) interfaces. Because EMAC 110 uses one and only one PHY interface 119 in operation at a time, all I/O pins for each of PHY interfaces 119 are not used simultaneously. At the same time, processor block 102 has a finite number of I/O pins available due to routing channel requirements in FPGA 100, and thus I/O pins are shared between processor 103 and other functional blocks in processor block 102.

Because processor block 102 has a limited number of I/O pins available at the ASIC/processor block-FPGA fabric boundary, FPGA fabric 101 may use FPGA cells to interface to ASIC/processor block 102 for routing connectivity to ASIC/processor block 102. These FPGA cells are for interfacing processor block 102 to FPGA fabric 101, namely connecting processor block I/O ports to FPGA fabric routing. FPGA termination cell width determines the number of I/O pins possible. As a result, for EMAC 110 to support PHY interfaces 119, PHY I/O pins are re-used (“pin muxing”) for each PHY interface 119.

Pin muxing reduces the number of I/O pins employed by each EMAC 110 and EMAC 111 by 39 pins each. With a total reduction of 78 I/O pins in PHY interfaces 119, along with output pin reductions in statistics interfaces 116, two EMACs 110 and 111 may be implemented in processor block 102 where before there would only be room for one EMAC.

An implementation of this example may result in a reduction of 21 input pins.

Statistics Interface

FIG. 7A is a high-level block diagram depicting an exemplary embodiment of a transmit-side (“Tx”) of a statistics interface 1240, which forms a portion of statistics interface 116 of FIG. 1. From transmit engine 820, transmit statistics vector signal 1241 is provided to transmit statistics multiplexer 125. For purposes of clarity by way of example, it will be assumed that transmit statistics vector signal 1241 is a thirty-two-bit wide signal, though other bit widths may be used. A portion of transmit statistics vector 1241 may be siphoned off to provide transmit statistics byte valid signal 1243. For example, the thirtieth bit of transmit statistics vector 1241 may be used to provide transmit statistics byte valid signal 1243.

EMAC 110 generates statistics for data traffic responsive to transmitting and receiving each Ethernet frame. At the end of each such frame, EMAC 110 outputs associated transmit and receive statistics vectors to logic configured in FPGA fabric 101 for subsequent collection and processing of such data traffic. FIG. 6 shows an example implementation of logic instantiated in FPGA fabric for collection and processing of data traffic.

FIG. 6 is a high-level block diagram depicting an exemplary embodiment of EMAC 110 statistics registers, which may be read via a DCR bus. As previously mentioned, processor block 102 is a region isolated for embedded circuitry within FPGA fabric 101. However, processor block 102 may be external to FPGA fabric 101, though access to FPGA fabric 101 is part of implementing embedded circuitry within processor block 102. EMAC client transmit statistics valid signal 1212 is provided via statistics interface 116 from processor block 102. EMAC client Tx statistics valid signal 1212 may be for EMAC 110 or EMAC 111, where a number sign (“#”) as indicated in the drawing is for either a zero or a one to designate one of the two EMACs.

Returning to FIG. 7A, because processor block 102 has a limited number of I/O pins available, EMACs 110 and 111 output statistics vectors in a small number of bits for each transmit or receive clock cycle. For example, statistics interface 116 may output seven bits per receive clock cycle. The example of seven bits is selected as an inter-frame gap delay may be as small as four receive clock cycles, and thus a subsequently received frame may be substantially short, i.e., a packet that contains no data. Transmission of a receive statistics vector to logic instantiated in FPGA fabric 101 thus can be completed in four receive clock cycles to provide sufficient time for statistics processing units instantiated in FPGA fabric 101 to accumulate receive statistics provided via statistics interface 116.

For a transmit statistics vector, for example, statistics output may be one bit per transmit clock cycle. A rate of one bit per transmit clock cycle was selected as an example because the transmit side does not have the same restriction as the receive side. As mentioned with reference to FIG. 6, de-multiplexers are instantiated in FPGA fabric 101 to de-multiplex statistics bit output from statistics interface 116. Notably, multiplexing and de-multiplexing of statistics output introduces time delays before statistics vectors may be processed by a statistics collection unit instantiated in FPGA fabric 101. However, because such a statistics collection unit need not be synchronized to the received or transmitted frame that generated the statistics, statistics accumulation proceeds independently from a transmit or receive frame. Statistics processing is configured to complete before a next statistics output such that the next statistics output may be processed.

Continuing the above example, multiplexing of a statistics vector reduces transmit statistics interface from thirty-two output pins to one output pin and reduces receive statistics interface from twenty-seven output pins to seven output pins. Accordingly, a total reduction of fifty-one output pins in implementation may be obtained for each EMAC statistics interface 116. The I/O pin reduction for implementation of PHY interfaces 119 as well as the reduction of output pins for statistics interfaces 116 facilitates integrating more than one EMAC, such as EMACs 110 and 111, within processor block 102.
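The pin and cycle counts in this example can be checked with a few lines of arithmetic. The Python below is only a worked calculation using the widths stated in the example; the constant names are hypothetical.

```python
import math

# Widths taken from the example in the text.
TX_VECTOR_BITS = 32   # transmit statistics vector width
TX_OUTPUT_PINS = 1    # one bit per transmit clock cycle
RX_VECTOR_BITS = 27   # receive statistics vector width
RX_OUTPUT_PINS = 7    # seven bits per receive clock cycle

tx_pins_saved = TX_VECTOR_BITS - TX_OUTPUT_PINS        # 31
rx_pins_saved = RX_VECTOR_BITS - RX_OUTPUT_PINS        # 20
total_pins_saved = tx_pins_saved + rx_pins_saved       # 51 per EMAC

# Clock cycles needed to shift out one complete vector on each side.
tx_cycles = math.ceil(TX_VECTOR_BITS / TX_OUTPUT_PINS)  # 32
rx_cycles = math.ceil(RX_VECTOR_BITS / RX_OUTPUT_PINS)  # 4
```

The four receive cycles match the minimum inter-frame gap of four receive clock cycles mentioned above, which is why seven output bits per cycle suffice for a twenty-seven-bit vector.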

FIG. 7B is a high-level block diagram depicting an exemplary embodiment of a receive-side statistics interface 1260. Receive engine 850 provides receive statistics vector signal 1261 to receive statistics multiplexer 126. For purposes of clarity by way of example and not limitation, it will be assumed that receive statistics vector 1261 is a twenty-seven-bit wide signal which is provided to receive statistics multiplexer 126, though other bit widths may be used. A portion of receive statistics vector signal 1261 may be used to provide receive statistics byte valid signal 1263. For example, the twenty-second bit of receive statistics vector signal 1261 may be used to provide receive statistics byte valid signal 1263.

FIG. 7C is a block/schematic diagram depicting an exemplary embodiment of transmit statistics multiplexer 125. Transmit statistics vector signal 1241 is input to a logic high port of multiplexer 1282. Transmit statistics valid signal 1242 is provided as a control signal input to multiplexer 1282 to select as between high and low logic level ports, and is provided to transmit statistics multiplexer controller 1281. Other inputs to transmit statistics multiplexer controller 1281 are transmit reset signal 1285 and transmit clock signal 821.

One output from transmit statistics multiplexer controller 1281 is select signal 1289, which for example may be a five-bit wide select signal, and which is provided as a control select input to multiplexer 1284. Another output from transmit statistics multiplexer controller 1281 is transmit statistics valid output signal 1243.

Output of multiplexer 1282 is transmit statistics vector multiplex signal 1291, which in continuing the above example may be a thirty-two-bit wide output. Output of multiplexer 1282 is provided to register 1283 which is clocked responsive to transmit clock signal 821. Output of register 1283 is provided to bus 1287. Continuing the above example, bus 1287 may be a thirty-two-bit wide bus for providing respective inputs, such as thirty-two respective inputs to multiplexer 1284. Any of such thirty-two inputs to multiplexer 1284 may be selected responsive to a five-bit wide select signal 1289 for output as transmit statistics vector output signal 1244. All output of register 1283 is fed back to multiplexer 1282 for input on a logic low port thereof. For example, this feedback may be thought of as transmit statistics vector registered signal 1292, which may be a thirty-two bit wide signal.

FIG. 7D is a state diagram depicting an exemplary embodiment of a state machine 1300 for transmit statistics multiplexer controller 1281 of FIG. 7C. State machine 1300 is put into idle state 1305 responsive to transmit reset signal 1285 being asserted. State machine 1300 stays in idle state 1305 responsive to transmit statistics valid signal 1242 not being asserted. Notably, while in idle state 1305, state machine 1300 outputs, namely transmit statistics valid output signal 1243 and select signal 1289, are respectively logic 0 and logic 00000 in the above-described exemplary implementation. If, however, transmit statistics valid signal 1242 is asserted, state machine 1300 transitions from idle state 1305 to state S1 1301.

State machine 1300 in state S1 1301 has transmit statistics valid output signal 1243 equal to a logic 1 and select signal 1289 equal to 00000 in the above-described exemplary implementation. State machine 1300 transitions from state S1 to state S2 responsive to the next clock cycle of transmit clock 821. In state S2, transmit statistics valid output signal 1243 is equal to a logic 1 and select signal 1289 is equal to 00001 in the above-described exemplary implementation.

Accordingly, for each subsequent transmit clock signal 821 cycle, state machine 1300 proceeds to subsequent states incrementing select signal 1289. Skipping ahead to the last two states for the exemplary implementation, at state S31 1303, transmit statistics valid output signal 1243 will be a logic 1 and select signal 1289 will be a 10001. On the next clock cycle of transmit clock 821, state machine 1300 will transition from state S31 1303 to state S32 1304 where transmit statistics valid output signal 1243 will be logic 1 and select signal 1289 will be a 10000 for the above-described exemplary implementation. After all bits on bus 1287 of FIG. 7C have been incrementally selected for transmit statistics vector output signal 1244, on the next transmit clock signal 821 cycle, state machine 1300 will transition from state S32 1304 back to idle state 1305.
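The select values quoted above (00000 and 00001 for states S1 and S2, 10001 and 10000 for states S31 and S32) coincide with a reflected binary Gray code of the state index. That reading is an inference from those four values and is not stated in the description. The Python sketch below, with hypothetical function names, generates such a sequence for comparison.

```python
def gray(n):
    """Return the reflected binary Gray code of non-negative integer n."""
    return n ^ (n >> 1)


def tx_select_sequence(num_states=32):
    """Return 5-bit select strings for states S1..S32, assuming Gray coding.

    Element k of the list corresponds to state S(k+1). The Gray-code
    assumption is inferred from the quoted values, not taken from the text.
    """
    return [format(gray(k), "05b") for k in range(num_states)]
```

Under this assumption only one select bit changes between consecutive states, a common choice for reducing glitches on a multiplexer select bus.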

FIG. 7F is a block/schematic diagram depicting an exemplary embodiment of receive statistics multiplexer 126. Receive statistics valid signal 1262 is provided as a control select input to multiplexer 1351 and to receive statistics multiplexer controller 1354. Provided to a logic high input port of multiplexer 1351 is receive statistics vector signal 1261, which in the above exemplary implementation is a twenty-seven bit wide signal. Other inputs to receive statistics multiplexer controller 1354 are receive reset signal 1355 and receive clock signal 278. Outputs from receive statistics multiplexer controller 1354 include select signal 1358 and receive statistics valid output signal 1265. In an exemplary implementation, select signal 1358 is a two-bit wide signal for selecting one of four input port groupings of multiplexer 1353 for output.

Output of multiplexer 1351 is receive statistics vector multiplex signal 1356, which in an exemplary implementation is a twenty-seven-bit wide signal. Receive statistics vector multiplex signal 1356 is provided to register 1352, which registers are clocked via receive clock signal 278. Output of register 1352 is provided to bus 1322 and fed back to a logic low input port of multiplexer 1351. Output of register 1352 is receive statistics vector registered signal 1357, which in the exemplary implementation is a twenty-seven-bit wide signal. Continuing the above example, the twenty-seven bits output from register 1352 may be grouped as bits zero through six, seven through thirteen, fourteen through twenty, and twenty-one through twenty-six and then recycling of bit zero. Select signal 1358 selects one of these groupings for output from multiplexer 1353, and then select signal 1358 is incremented to select another group, and so on and so forth. In this manner, output of multiplexer 1353 is Rx statistics vector output signal 1264, which may be a seven-bit-wide output signal in an exemplary implementation.
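The grouping of the twenty-seven registered bits into four seven-bit outputs may be sketched as follows. This Python is an illustration only; the bit order within each group is an assumption, and the function name `rx_groups` is hypothetical.

```python
def rx_groups(vector_bits):
    """Split a 27-element bit list into four 7-bit groups.

    vector_bits[0] is bit zero of the receive statistics vector. The fourth
    group holds bits 21 through 26 followed by bit zero again, following the
    "recycling of bit zero" described in the text. Ordering inside each
    group is an assumption of this sketch.
    """
    if len(vector_bits) != 27:
        raise ValueError("expected 27 bits")
    padded = list(vector_bits) + [vector_bits[0]]
    return [padded[i:i + 7] for i in range(0, 28, 7)]
```

Passing the bit indices themselves, as in `rx_groups(list(range(27)))`, makes the grouping visible: the last group is `[21, 22, 23, 24, 25, 26, 0]`.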

FIG. 7G is a state diagram depicting an exemplary embodiment of a state machine 1370 for receive statistics multiplexer controller 1354 of FIG. 7F. Responsive to assertion of receive reset signal 1355, state machine 1370 is put in idle state 1375. State machine 1370 stays in idle state 1375 until receive statistics valid signal 1262 is asserted. In other words, state machine 1370 stays in idle state 1375 responsive to the non-assertion of receive statistics valid signal 1262.

Responsive to assertion of receive statistics valid signal 1262, state machine 1370 transitions from idle state 1375 to state S1 1371. Outputs of state machine 1370, namely receive statistics valid output signal 1265 and select signal 1358, in state S1 1371 respectively are a logic 1 and a logic 00. On a subsequent receive clock signal 278 cycle, state machine 1370 transitions from state S1 1371 to state S2 1372. Accordingly, receive statistics valid output signal 1265 is maintained at a logic 1 level, and select signal 1358 is incremented to a 01 to select the next grouping of seven bits for output from multiplexer 1353 for receive statistics vector output signal 1264. From state S2 1372, state machine 1370 responsive to the next receive clock signal 278 cycle transitions to state S3 1373.

In state S3 1373, state machine 1370's outputs are a logic 1 and a logic 11 for receive statistics valid output signal 1265 and select signal 1358, respectively. On the next receive clock signal 278 cycle, state machine 1370 transitions from state S3 1373 to state S4 1374. In state S4 1374, outputs of state machine 1370 are a logic 1 and a logic 10 for receive statistics valid output signal 1265 and select signal 1358, respectively. On the next receive clock signal 278 cycle, state machine 1370 transitions from state S4 1374 back to idle state 1375.
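The sequence of states and outputs described for state machine 1370 can be sketched as a small table-driven model. The Python below is a simplified illustration; the idle-state outputs (logic 0 and 00) are assumed by analogy with the transmit-side state machine, and the names `RX_OUTPUTS`, `step_1370`, and `run_1370` are hypothetical.

```python
# (receive statistics valid output, select) per state.
# The IDLE entry is an assumption; the other values follow the description.
RX_OUTPUTS = {
    "IDLE": (0, "00"),
    "S1": (1, "00"),
    "S2": (1, "01"),
    "S3": (1, "11"),
    "S4": (1, "10"),
}
RX_SEQUENCE = ["S1", "S2", "S3", "S4"]


def step_1370(state, reset, stats_valid):
    """Return the next state after one receive clock cycle."""
    if reset:
        return "IDLE"
    if state == "IDLE":
        return "S1" if stats_valid else "IDLE"
    idx = RX_SEQUENCE.index(state)
    return RX_SEQUENCE[idx + 1] if idx + 1 < len(RX_SEQUENCE) else "IDLE"


def run_1370(valid_inputs):
    """Simulate from idle; return (state, valid_out, select) per cycle."""
    state = "IDLE"
    trace = []
    for v in valid_inputs:
        state = step_1370(state, reset=0, stats_valid=v)
        trace.append((state,) + RX_OUTPUTS[state])
    return trace
```

A single-cycle pulse on the valid input walks the model through states S1 to S4 and back to idle, which corresponds to the four seven-bit output cycles of the receive statistics vector.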

FIG. 7H is a timing diagram depicting an exemplary embodiment of timing for receive statistics multiplexer 126 of FIG. 7B. Receive statistics valid signal 1262 is pulsed generally at 1381, and in response, receive statistics vector output signal 1264 will be passed data and receive statistics valid output signal 1265 will be held at a logic high state generally through 1383 for data 1384. Receive statistics vector signal 1261 generally at 1382 provides data for receive statistics vector output signal 1264. Notably, each portion of data 1384, of which there are four in this example, is provided on a respective clock cycle of receive clock signal 278 after pulsing receive statistics valid signal 1262.

Address Filter

FIG. 8 is a high-level block diagram depicting an exemplary embodiment of address filter 129 of FIG. 1. Receive data early signal 1421, which in an exemplary implementation may be an eight-bit wide signal, along with receive data valid early signal 1429, are provided to receive client interface 128. Output of receive client interface 128 is provided to CAM 1401, broadcast address module 1402, pause address module 1403, unicast address module 1404, and pause address module 1405. Pause address module 1403 is a factory setting, which may be hard-wired or programmed, whereas pause address module 1405 is configured for inputting an address by a user. Notably, a pause address may be asserted, for example by a client circuit instantiated in configurable logic of FPGA fabric 101, to transmit a pause frame.

Provided to unicast address module 1404 is TIE unicast address signal 1422, which in an exemplary implementation may be a forty-eight-bit wide address signal. Provided to pause address module 1405 is receive pause address signal 1423, which in an exemplary implementation may be a forty-eight-bit wide address signal. As mentioned above, CAM 1401 may be implemented as a plurality of registers with comparison logic, though a CAM may be used.

A mode that accepts any destination address, what is commonly known as “promiscuous mode”, may be invoked responsive to promiscuous mode signal 1412, which is provided from decode/control circuitry 1406 to OR tree 1408. Output of OR tree 1408 is address valid early signal 1428. Output from OR tree 1407 is provided as frame drop inverted signal 1426, which signal is provided as a data input to register 1409. Register 1409 is clocked responsive to receive clock signal 278. Data output of register 1409 is provided as an input to inverter 1411, the output of which is frame drop signal 1427.

Frame data is passed to a client through the Rx data/control client interface. If address filter 129 is activated, only frames having an address matching an address in address filter 129 are passed to the client. Frames with non-matching addresses are dropped, which dropping is indicated to the client via frame drop signal 1427 being asserted. Notably, when promiscuous mode is invoked, address filter 129 is disabled for filtering addresses though frame drop signal 1427 may still be asserted.
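The filtering behavior described above may be sketched as follows. This is a minimal model under stated assumptions: the function and parameter names are hypothetical; the broadcast and pause control values shown are the well-known Ethernet broadcast address and the IEEE 802.3 pause multicast address, assumed here to be the predefined values; and the treatment of promiscuous mode (it contributes to the address valid result but not to the frame drop result) is one reading of the two OR trees, not a statement of the actual circuit.

```python
BROADCAST_ADDR = 0xFFFFFFFFFFFF      # all-ones Ethernet broadcast address
PAUSE_CONTROL_ADDR = 0x0180C2000001  # IEEE 802.3 pause multicast address


def filter_address(dest, unicast, user_pause, cam_entries, promiscuous=False):
    """Return (address_valid, frame_drop) for a 48-bit destination address.

    dest, unicast, and user_pause are integers; cam_entries is a
    collection of integer addresses held in the CAM.
    """
    any_match = any((
        dest in cam_entries,         # CAM lookup
        dest == BROADCAST_ADDR,      # broadcast address module
        dest == PAUSE_CONTROL_ADDR,  # fixed pause address module
        dest == unicast,             # unicast address module
        dest == user_pause,          # user-provided pause address module
    ))
    # Assumption: promiscuous mode forces the address valid result only.
    address_valid = promiscuous or any_match
    # Assumption: frame drop reflects the comparison results alone, so it
    # may still be asserted while promiscuous mode is invoked.
    frame_drop = not any_match
    return address_valid, frame_drop
```

With a unicast address of `0x001122334455`, for example, a frame addressed to `0x0A0B0C0D0E0F` yields `(False, True)` with filtering active and `(True, True)` in promiscuous mode under these assumptions.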

Receive pause address signal 1423 may be obtained from embedded EMAC host registers. For example, the EMAC host registers may include a receive configuration word one register, which in an exemplary implementation may support storing a 32-bit wide address, and a receive configuration word zero register, which in an exemplary implementation may support storing a 16-bit wide address to determine if an incoming destination address matches a stored address for purposes of rejecting or accepting the incoming received frame.

Host bus 160 may include a version of host bus 118 signals and address filter access signals, as listed for example as signal set (5) in Table 1. For example, host bus 160 may include a host clock signal 440, a host address signal, a host write enable signal, a host read enable signal of signals 464, a host write data signal, a host address filter CAM read signal, an internal MGT host reset signal, and address filter read data signal 434 or 468, 487. In an exemplary implementation, a host address signal may be a ten-bit wide signal; a host read/write data signal may be a thirty-two-bit wide signal; and an address filter read data signal may be a forty-eight-bit wide signal.

Each hard core EMAC contains a receive address filter. Address filtering is for rejecting any incoming receive frame that does not have an acceptable destination address. When a packet is rejected, the packet's data is not passed to the client.

Address filter 129 can be programmed to promiscuous mode to accept all incoming frames. In an exemplary implementation, CAM 1401 contains four entries for multicast address matching though fewer or more entries may be used and CAM size may be adjusted accordingly. Notably, a broadcast address and a pause control address are fixed to respective predefined values for broadcast address module 1402 and pause address module 1403.

Having an embedded ASIC block 102 in an FPGA facilitates implementation of tie-off pins. A tie-off pin may be used to provide TIE address filter enable signal 1425 to activate or deactivate address filtering. TIE_addrFilEn signal 1425 can be programmed when the FPGA is configured. For example, when TIE_addrFilEn signal 1425 is tied to logic high, address filtering is active. A host processor may overwrite this tie-off value by programming a new value through host bus 118 (shown in FIG. 1) as coupled to host bus 160. Notably, processor 103 may change tie-off pin values via a write to address filter 129 via DCR bus 114 and host interface 112.

Additionally, tie-off pins may be used to provide a particular unicast address for address filter matching, such as via TIE unicast address signal 1422. This allows address filter 129 to start functioning without having to program a unicast address through host bus 160. To change a unicast address, a host processor can program in a new unicast address through host bus 160.

Again, tie-off pins may be set to a value when the FPGA is configured, and thus use of these tie-off pins allows address filter 129 to start functioning with a unicast address and with address filtering activated or deactivated without any management action from a host processor. Accordingly, address filter 129 may start functioning with a unicast address or in a promiscuous mode without the need for host processor intervention. In an exemplary implementation, address filter 129 may be implemented with standard layout cells in ASIC block 102 to provide an efficient implementation compared to implementation in FPGA programmable logic, thereby resulting in increasing FPGA resource availability for instantiation of a user design.
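The relationship between tie-off values and host writes may be sketched as follows. The class and method names are hypothetical, and the sketch assumes that deactivated address filtering corresponds to promiscuous mode; it models only the precedence of values, not the host bus protocol.

```python
class AddressFilterSettings:
    """Illustrative model of tie-off defaults with optional host overwrite."""

    def __init__(self, tie_addr_fil_en, tie_unicast_addr):
        # Values captured from tie-off pins when the device is configured,
        # so the filter can operate before any host processor action.
        self.filter_enabled = bool(tie_addr_fil_en)
        self.unicast_addr = tie_unicast_addr

    def host_write(self, filter_enabled=None, unicast_addr=None):
        # A later host write replaces the tie-off value for whichever
        # setting it supplies; settings left as None are unchanged.
        if filter_enabled is not None:
            self.filter_enabled = bool(filter_enabled)
        if unicast_addr is not None:
            self.unicast_addr = unicast_addr

    @property
    def promiscuous(self):
        # Assumption: filtering deactivated means all frames are accepted.
        return not self.filter_enabled
```

For example, `AddressFilterSettings(1, 0x001122334455)` begins with filtering active and a unicast address already present, and a subsequent `host_write(unicast_addr=0x665544332211)` changes only the unicast address.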

Address filter 129 makes use of the early versions of pipelined received data valid and data signals, namely RX_DATA_VALID_early 1429 and RX_DATA_early[7:0] 1421, so address filter 129 has time to compare a received destination address with an address stored in address filter 129 and so EMAC core 123 host registers, namely Receive Configuration Word 0[31:0] and Receive Configuration Word 1[15:0], have time to determine whether to accept or reject an incoming receive frame.
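As an illustration of how a forty-eight-bit address might be held across a thirty-two-bit field and a sixteen-bit field such as those named above, consider the following sketch. The bit assignment (lower thirty-two bits in word 0, upper sixteen bits in word 1) is an assumption made for illustration, and the function names are hypothetical.

```python
def split_address(addr48):
    """Split a 48-bit address into (word0[31:0], word1[15:0])."""
    if not 0 <= addr48 < (1 << 48):
        raise ValueError("address must fit in 48 bits")
    word0 = addr48 & 0xFFFFFFFF      # assumed: lower 32 bits
    word1 = (addr48 >> 32) & 0xFFFF  # assumed: upper 16 bits
    return word0, word1


def join_address(word0, word1):
    """Rebuild the 48-bit address from the two register fields."""
    return ((word1 & 0xFFFF) << 32) | (word0 & 0xFFFFFFFF)
```

Under this assumed layout, `split_address(0x0180C2000001)` gives `(0xC2000001, 0x0180)`, and `join_address` reverses the split.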

Below are some examples of FPGAs in which EMACs 110 and 111 may be implemented. FIG. 9 is a simplified illustration of an exemplary FPGA. The FPGA of FIG. 9 includes an array of configurable logic blocks (LBs 2801a-2801i) and programmable input/output blocks (I/Os 2802a-2802d). The LBs and I/O blocks are interconnected by a programmable interconnect structure that includes a large number of interconnect lines 2803 interconnected by programmable interconnect points (PIPs 2804, shown as small circles in FIG. 9). PIPs are often coupled into groups (e.g., group 2805) that implement multiplexer circuits selecting one of several interconnect lines to provide a signal to a destination interconnect line or logic block. Some FPGAs also include additional logic blocks with special purposes, e.g., DLLs, RAM, and so forth.

One such FPGA, the Xilinx Virtex® FPGA, is described in detail in pages 3-75 through 3-96 of the Xilinx 2000 Data Book entitled “The Programmable Logic Data Book 2000” (hereinafter referred to as “the Xilinx Data Book”), published April, 2000, available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124, which pages are incorporated herein by reference. (Xilinx, Inc., owner of the copyright, has no objection to copying these and other pages referenced herein but otherwise reserves all copyright rights whatsoever.) Young et al. further describe the interconnect structure of the Virtex FPGA in U.S. Pat. No. 5,914,616, issued Jun. 22, 1999 and entitled “FPGA Repeatable Interconnect Structure with Hierarchical Interconnect Lines”, which is incorporated herein by reference in its entirety.

One such FPGA, the Xilinx Virtex®-II FPGA, is described in detail in pages 33-75 of the “Virtex-II Platform FPGA Handbook”, published December, 2000, available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124, which pages are incorporated herein by reference.

One such FPGA, the Xilinx Virtex-II Pro™ FPGA, is described in detail in pages 19-71 of the “Virtex-II Pro Platform FPGA Handbook”, published Oct. 14, 2002 and available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124, which pages are incorporated herein by reference.

As FPGA designs increase in complexity, they reach a point at which the designer cannot deal with the entire design at the gate level. Where once a typical FPGA design comprised perhaps 5,000 gates, FPGA designs with over 100,000 gates are now common. To deal with this complexity, circuits are typically partitioned into smaller circuits that are more easily handled. Often, these smaller circuits are divided into yet smaller circuits, imposing on the design a multi-level hierarchy of logical blocks.

Libraries of predeveloped blocks of logic have been developed that can be included in an FPGA design. Such library modules include, for example, adders, multipliers, filters, and other arithmetic and DSP functions from which complex designs can be readily constructed. The use of predeveloped logic blocks permits faster design cycles, by eliminating the redesign of duplicated circuits. Further, such blocks are typically well tested, thereby making it easier to develop a reliable complex design.

Some FPGAs, such as the Virtex FPGA, can be programmed to incorporate blocks with pre-designed functionalities, i.e., “cores”. A core can include a predetermined set of configuration bits that program the FPGA to perform one or more functions. Alternatively, a core can include source code or schematics that describe the logic and connectivity of a design. Typical cores can provide, but are not limited to, digital signal processing functions, memories, storage elements, and math functions. Some cores include an optimally floorplanned layout targeted to a specific family of FPGAs. Cores can also be parameterizable, i.e., allowing the user to enter parameters to activate or change certain core functionality.

In some FPGAs, each programmable tile includes a programmable interconnect element (INT 2911) having standardized connections to and from a corresponding interconnect element in each adjacent tile. Therefore, the programmable interconnect elements taken together implement the programmable interconnect structure for the illustrated FPGA. The programmable interconnect element (INT 2911) also includes the connections to and from the programmable logic element within the same tile, as shown by the examples included at the top of FIG. 10.

For example, a CLB 2902 can include a configurable logic element (CLE 2912) that can be programmed to implement user logic plus a single programmable interconnect element (INT 2911). A BRAM 2903 can include a BRAM logic element (BRL 2913) in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured embodiment, a BRAM tile has the same height as four CLBs, but other numbers (e.g., five) can also be used. A DSP tile 2906 can include a DSP logic element (DSPL 2914) in addition to an appropriate number of programmable interconnect elements. An IOB 2904 can include, for example, two instances of an input/output logic element (IOL 2915) in addition to one instance of the programmable interconnect element (INT 2911). As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 2915 are manufactured using metal layered above the various illustrated logic blocks, and typically are not confined to the area of the input/output logic element 2915.

In the pictured embodiment, a columnar area near the center of the die (shown shaded in FIG. 10) is used for configuration, clock, and other control logic. Horizontal areas 2909 extending from this column are used to distribute the clocks and configuration signals across the breadth of the FPGA.

Some FPGAs utilizing the architecture illustrated in FIG. 10 include additional logic blocks that disrupt the regular columnar structure making up a large part of the FPGA. The additional logic blocks can be programmable blocks and/or dedicated logic. For example, the processor block PROC 2910 shown in FIG. 10 spans several columns of CLBs and BRAMs.

Note that FIG. 10 is intended to illustrate only an exemplary FPGA architecture. The numbers of logic blocks in a column, the relative widths of the columns, the number and order of columns, the types of logic blocks included in the columns, the relative sizes of the logic blocks, and the interconnect/logic implementations included at the top of FIG. 10 are purely exemplary. For example, in an actual FPGA more than one adjacent column of CLBs is typically included wherever the CLBs appear, to facilitate the efficient implementation of user logic.

While FPGA examples have been used to illustrate some embodiments of the present invention, the scope of the present invention is not limited to FPGAs. Other embodiments include other types of PLDs besides FPGAs. Further embodiments include an IC having programmable logic or programmable interconnections or both coupled to an embedded EMAC. Hence the IC, for some embodiments of the present invention, may not be what is called an FPGA, but may have circuits with some or all functions the same as or similar to an FPGA that are coupled to the embedded EMAC.

Notably, program(s) of the program product defines functions of embodiments in accordance with one or more aspects of the invention and can be contained on a variety of signal-bearing media, such as computer-readable media having code, which include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM or DVD-ROM disks readable by a CD-ROM drive or a DVD drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive or read/writable CD or read/writable DVD); or (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications. The latter embodiment specifically includes information downloaded from the Internet and other networks. Such signal-bearing media, when carrying computer-readable instructions that direct functions of one or more aspects of the invention, represent embodiments of the invention.

While the foregoing describes exemplary embodiment(s) in accordance with one or more aspects of the invention, other and further embodiment(s) in accordance with the one or more aspects of the invention may be devised without departing from the scope thereof, which is determined by the claim(s) that follow and equivalents thereof. Claim(s) listing steps do not imply any order of the steps. Trademarks are the property of their respective owners. Headings are provided merely for organizational clarity and are not intended in any way to limit the scope of the disclosure under them.

Claims (15)

1. A programmable device comprising:

configurable circuitry having a first frequency of operation, wherein the configurable circuitry is configurable to implement a user-specified circuit design;

a media access controller integrated circuit embedded in the programmable device, the media access controller integrated circuit having a second frequency of operation of at least approximately twice the first frequency of operation; and

the media access controller integrated circuit having a receive-side client interface having a selectable data input width and configurable for operation at any of a plurality of data rates, the receive-side client interface for communication with the configurable circuitry at the first frequency of operation and for communication outside of the programmable device at the second frequency of operation.

2. The programmable device, according to claim 1, wherein the second frequency of operation is for a data rate of the plurality of data rates in excess of a maximum standard Ethernet data rate.

3. The programmable device, according to claim 2, wherein the maximum standard Ethernet data rate is approximately 1.0 Gigabits per second.

4. The programmable device, according to claim 1, wherein the media access controller is a hardwired circuit embedded in the programmable device.

6. The programmable device, according to claim 1, wherein the data input width of the receive-side client interface coupled to the configurable circuitry is twice a width of a portion of the receive-side client interface coupled to the media access controller.

7. The programmable device, according to claim 1, wherein the media access controller integrated circuit further comprises a transmit-side client interface having a selectable data output width that is selectable independently of the data input width of the receive-side client interface.

8. The programmable device, according to claim 7, wherein the receive-side client interface and the transmit-side client interface each operate in separate clock domains that are asynchronous with respect to one another.

9. A method of communication of data comprising:

providing a programmable device including:

configurable circuitry having a first frequency of operation, wherein the configurable circuitry is configurable to implement a user-specified circuit design;

a media access controller integrated circuit embedded in the programmable device, the media access controller integrated circuit having a second frequency of operation of at least approximately twice the first frequency of operation;

the media access controller integrated circuit having a receive-side client interface having a selectable data input width and configurable for operation at any of a plurality of data rates, the receive-side client interface for communication with the configurable circuitry at the first frequency of operation and for communication outside of the programmable device at the second frequency of operation; and

establishing a receive-side communication link at a data rate in excess of an Ethernet data rate.

10. The method, according to claim 9, wherein the Ethernet data rate is approximately 1.25 Gigabits per second.

11. The method, according to claim 9, wherein the media access controller is a hardwired circuit embedded in the programmable device.

13. The method, according to claim 9, wherein the data input width of the receive-side client interface coupled to the configurable circuitry is twice a width of a portion of the receive-side client interface coupled to the media access controller.

14. The method, according to claim 9, wherein the media access controller integrated circuit further comprises a transmit-side client interface having a selectable data output width that is selectable independently of the data input width of the receive-side client interface.

15. The method, according to claim 14, wherein the receive-side client interface and the transmit-side client interface each operate in separate clock domains that are asynchronous with respect to one another.