When examining high-speed Ethernet interfaces, the apparent clock frequency requirements seem physically unrealistic at first glance. For 10GbE (10 Gigabit Ethernet), a naive implementation would suggest needing a 10GHz clock to handle one bit per cycle. Modern transceivers employ several architectural optimizations to make this feasible with practical clock speeds.
The fundamental technique is parallelization through multi-lane designs. A typical 10GBASE-R implementation uses:
```c
// Conceptual XGMII transmit data path (32-bit bus, 4 byte lanes)
struct xgmii_tx {
    uint8_t data[4]; // 4 lanes of 8 data bits each
    uint8_t ctrl;    // 1 control bit per lane (lower 4 bits used)
};
// The bus is clocked at 156.25 MHz DDR, i.e. 312.5 M transfers/s
```
This XGMII (10 Gigabit Media Independent Interface) divides the data path into four byte lanes transferring at 312.5 MT/s (312.5 M transfers/s × 4 lanes × 8 bits = 10 Gbps); physically it is a 32-bit bus clocked at 156.25 MHz DDR. On the line side, high-speed transceivers serialize/deserialize the data stream using:
- 64b/66b encoding (reduces framing overhead)
- Clock data recovery (CDR) circuits
- Differential signaling (NRZ at 25 Gbps per lane; PAM4 at 50 Gbps per lane and above)
A sketch of a common 100GbE configuration (QSFP28, 4 × 25G NRZ; the function names here are illustrative, not a real driver API):

```c
// QSFP28 module configuration (100GBASE-CR4/SR4 style)
void configure_100g_phy(void) {
    set_modulation(NRZ);           // 1 bit per symbol
    set_lanes(4);                  // 4 physical lanes
    set_symbol_rate_gbd(25.78125); // baud rate per lane
    // Effective rate: 25.78125 GBd x 4 lanes x 1 bit = 103.125 Gbps
    // (includes 64b/66b overhead; 100 Gbps of payload)
    // Newer 100G PHYs (e.g. 100GBASE-CR2/DR) use PAM4 at
    // 26.5625 or 53.125 GBd to cut the lane count.
}
```
Modern NIC designs implement multiple clock domains:
```c
// Typical clock domains in a 100GbE controller (representative values)
#define PCIE_CLK_HZ    250000000ULL  // 250 MHz      host interface
#define MAC_CLK_HZ     322265625ULL  // ~322.27 MHz  data-path clock
#define PHY_CLK_HZ     156250000ULL  // 156.25 MHz   reference clock
#define SERDES_BPS   25781250000ULL  // 25.78125 Gbps per-lane serial rate
```
The highest frequency signals only exist within the analog SerDes blocks, while digital logic operates at more manageable frequencies.
Current generation NICs use these techniques:
Standard | Data Rate | Lanes | Per-Lane Line Rate | Encoding |
---|---|---|---|---|
10GBASE-X (XAUI) | 10 Gbps | 4 | 3.125 Gbps | 8b/10b |
10GBASE-R | 10 Gbps | 1 | 10.3125 Gbps | 64b/66b |
100GBASE-CR4 | 100 Gbps | 4 | 25.78125 Gbps | 64b/66b |
100GBASE-SR4 | 100 Gbps | 4 | 25.78125 Gbps | 64b/66b + RS-FEC |
Modern FPGA implementations demonstrate how this works in practice:
```verilog
// Xilinx UltraScale+ 100G Ethernet core (illustrative instantiation;
// parameter and port names are simplified)
ethernet_100g #(
    .LANES(4),
    .LINE_RATE(25.78125),  // Gbps per lane
    .ENCODING("64B66B")
) phy_inst (
    .clk_322mhz(mac_clk),
    .clk_156mhz(phy_clk),
    .serdes_clk(serdes_refclk)
);
```
To tie these techniques together, it helps to walk the 10Gb and 100Gb data paths end to end and check the arithmetic at each stage.
Rather than processing a single bit stream at 10GHz, Ethernet cards use wide parallel buses internally:
```
// Typical 10G MAC-to-PHY interface (XAUI)
4 lanes × 3.125 Gbps = 12.5 Gbps on the wire (10 Gbps payload after 8b/10b)

// 100G attachment interfaces:
10 lanes × 10.3125 Gbps = 103.125 Gbps (CAUI-10)
or
4 lanes × 25.78125 Gbps = 103.125 Gbps (CAUI-4)
```
Line encoding reduces the actual clock requirements:
- 64B/66B encoding (10GbE): adds a 2-bit sync header to each 64-bit block, so the 10.3125 Gbps serial stream is just 10.3125 Gbps / 66 bits = 156.25 M blocks per second, letting block-parallel logic run at 156.25 MHz
- 256B/257B transcoding (100GbE): reduces overhead further, leaving room for RS-FEC parity within the same line rate
Here are typical clock frequencies used internally:
Standard | Serial Line Rate | Typical Internal Clock | Lanes |
---|---|---|---|
10GBASE-R | 10.3125 Gbps | 156.25 MHz | 1 |
100GBASE-CR4 | 4 × 25.78125 Gbps | ~322.27 MHz | 4 (CAUI-4) |
Modern FPGAs implement this using SERDES blocks. Here's a simplified Verilog example:
```verilog
// Simplified 10G transmit path. Note: a real XAUI link uses 8b/10b
// coding per lane; the 64b/66b encoding shown here is the 10GBASE-R
// scheme, kept for illustration.
module xaui_mac (
    input  wire        clk_156mhz,
    input  wire [63:0] tx_data,
    output wire [3:0]  xaui_tx
);
    // 64b/66b encoder: prepend the 2-bit sync header (2'b01 = data block)
    reg [65:0] encoded_data;
    always @(posedge clk_156mhz) begin
        encoded_data <= {2'b01, tx_data};
    end

    // Distribute the 64 payload bits across 4 serializers.
    // (A real design passes all 66 bits through a 66:64 gearbox
    // before serialization; that logic is omitted here.)
    genvar i;
    generate
        for (i = 0; i < 4; i = i + 1) begin : serdes
            serializer ser (
                .clk(clk_156mhz),
                .parallel_in(encoded_data[i*16 +: 16]),
                .serial_out(xaui_tx[i])
            );
        end
    endgenerate
endmodule
```