BIOS IT Blog
Intel® Omni-Path
Introduction
Omni-Path is the next-generation high-performance fabric from Intel®. As the successor to the already very successful True Scale Fabric, Omni-Path takes a large step towards closing the gap on the market-dominant Mellanox Technologies.
Intel® Omni-Path (OPA) brings a number of new technologies to the Technical Computing space, with an emphasis on High Performance Computing (HPC). Built on the foundations of Intel® True Scale Fabric and additional IP acquired from Cray, OPA is Intel®'s bid to dominate the HPC arena with a low-latency, high-bandwidth, cost-efficient fabric.
OPA is not InfiniBand. Despite being a primary competitor to the Mellanox EDR technology, Intel® decided to move away from the InfiniBand lock-in to a more functional fabric dedicated to HPC. InfiniBand was not originally designed for HPC; adoption in the early 2000s was slow after numerous setbacks, and only later was it gradually taken up as a clustering interconnect. Intel® took a different approach, using a technology called Performance Scaled Messaging (PSM), which optimised the InfiniBand stack to work more efficiently at smaller message sizes, the kind typically associated with HPC workloads and MPI traffic in particular. For OPA, Intel® have gone a step further: building on the original PSM architecture, they acquired proprietary technology from the Cray Aries interconnect to enhance the capabilities and performance of OPA at both the fabric and host level.
Key Features of the New Intel® Omni-Path Fabric
Some of the new technologies packaged into the Omni-Path Fabric include:
Enhanced Performance Scaled Messaging (PSM)
The application view of the fabric is derived heavily from the Intel® True Scale Fabric architecture and is application-level software compatible with it. It builds on True Scale's demonstrated scalability by using an enhanced, next-generation version of the Performance Scaled Messaging (PSM) library. Major deployments by the US Department of Energy and others have proven this scalability advantage. PSM is specifically designed for the Message Passing Interface (MPI) and is very lightweight, with one-tenth of the user-space code compared to using verbs. This leads to extremely high MPI and Partitioned Global Address Space (PGAS) message rates (short-message efficiency) compared to using InfiniBand verbs.
Upgrade Path to Intel® Omni-Path
Despite OPA not being truly InfiniBand, Intel® have managed to maintain compatibility with their previous-generation True Scale Fabric, meaning that applications that work well on True Scale can be easily migrated to OPA. OPA integrates support for both True Scale and InfiniBand APIs, ensuring backwards compatibility with previous-generation technologies and support for any standard HPC application.
Source: Intel® Corporation, 2015
Other features include:
- Adaptive Routing
- Dispersive Routing
- Traffic Flow Optimization
- Packet Integrity Protection
- Dynamic Lane Scaling
We will cover these features in more detail in our forthcoming whitepaper on Intel® Omni-Path Fabric technology.
Intel® Omni-Path Hardware
Host Fabric Interface Adapters (HFIs)
Intel® currently has two host fabric interface (HFI) adapters: a PCIe x8 adapter rated at 58Gbps and a PCIe x16 adapter rated at 100Gbps. Both are single-port adapters and both use the same silicon, so the x8 card offers the same latency and features as the high-end 100Gbps card.
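The lower figure for the x8 card is consistent with the host bus, rather than the fabric link, being the limit. The sketch below is a rough back-of-the-envelope check, not Intel's own calculation. It assumes both adapters sit on PCIe 3.0 (8 GT/s per lane with 128b/130b encoding), which the figures above imply but do not state, and the function name is purely illustrative.

```python
# Assumption: PCIe 3.0 signalling (8 GT/s per lane, 128b/130b encoding).
PCIE3_GT_PER_LANE = 8.0        # giga-transfers per second, per lane
PCIE3_ENCODING = 128 / 130     # line-code efficiency of 128b/130b


def pcie3_raw_gbps(lanes: int) -> float:
    """Line rate of a PCIe 3.0 link in Gb/s, before packet/protocol overhead."""
    return lanes * PCIE3_GT_PER_LANE * PCIE3_ENCODING


for lanes in (8, 16):
    print(f"PCIe 3.0 x{lanes}: {pcie3_raw_gbps(lanes):.1f} Gb/s raw")
```

Under that assumption an x8 slot tops out at roughly 63 Gb/s before PCIe packet overhead, so it cannot feed a 100Gbps link at full rate, while an x16 slot has headroom to spare.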
Along with the physical adapter cards, Supermicro will also be releasing a range of Super Servers with the Omni-Path fabric laid down on the motherboard. This will offer tighter integration and enable a more compact server design. To take this design even further, Intel® have announced that they will be integrating OPA onto future Intel® Xeon® processors, which will reduce latency further and increase overall application performance.
Some key features:
- Multi-core scaling – support for up to 160 contexts
- 16 Send DMA engines (M2IO usage)
- Efficiency – large MTU support (4 KB, 8 KB, and 10 KB) for reduced per-packet processing overheads, plus improved packet-level interfaces for better utilization of on-chip resources
- Receive DMA engine arrival notification
- Each HFI can map ~128 GB window at 64 byte granularity
- Up to 8 virtual lanes for differentiated QoS
- ASIC designed to scale up to 160M messages/second and 300M bidirectional messages/second
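To make the large-MTU point concrete, here is a minimal sketch of how the packet count for one message falls as the MTU grows. The 1 MiB message size is a hypothetical example, 1 KB is taken as 1024 bytes, and per-packet header bytes are ignored for simplicity.

```python
import math

# MTU sizes from the feature list, in bytes (assuming 1 KB = 1024 bytes).
MTU_BYTES = {"4 KB": 4096, "8 KB": 8192, "10 KB": 10240}


def packets_needed(message_bytes: int, mtu_bytes: int) -> int:
    """Packets required to carry one message, ignoring header overhead."""
    return math.ceil(message_bytes / mtu_bytes)


message = 1024 * 1024  # hypothetical 1 MiB message
for label, mtu in MTU_BYTES.items():
    print(f"MTU {label}: {packets_needed(message, mtu)} packets")
```

Fewer packets for the same payload means fewer per-packet processing steps on both the sending and receiving side, which is the overhead the large-MTU support is meant to reduce.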
SPECIFICATION | Intel® Omni-Path Host Fabric Adapter 100 Series 1 Port PCIe x16 | Intel® Omni-Path Host Fabric Adapter 100 Series 1 Port PCIe x8 |
---|---|---|
ADAPTER TYPE | Low Profile PCIe Card (PCIe x16) | Low Profile PCIe Card (PCIe x8) |
PORTS | Single | Single |
CONNECTOR | QSFP28 | QSFP28 |
LINK SPEED | 100Gb/s | ~58Gb/s on 100Gb/s Link |
POWER (TYP./MAX) | 7.4/11.7W (Copper), 10.6/14.9W (Optical) | 6.3/8.3W (Copper) |
THERMAL/TEMP. | Passive (55° C at 200 LFM) | Passive (55° C at 200 LFM) |
Source: Intel® Corporation, 2015
Intel® Omni-Path Edge and Director Class Switch 100 Series
The all-new Edge and Director switches for Omni-Path from Intel® offer a totally different design from traditional InfiniBand switches. Incorporating a new ASIC and custom front-panel layout, Intel® have been able to offer up to 48 ports at 100Gbps from a single 1U switch, 12 ports more than its nearest competitor. The higher switching density allows for some significant improvements within the data centre, including:
- Reduced switching cost due to needing fewer physical switches (over 30% reduction in switches for most configurations)
- Fewer fabric hops for reduced latency
- 100-110ns switch latency
- Support for fabric partitioning
- Support for both active and passive cabling
- Higher node count fabric: support for up to 27,648 nodes in a single fabric, nearly 2.3x that of traditional InfiniBand.
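The 27,648-node figure matches what a three-tier, full-bisection fat tree of 48-port switches would give. The sketch below works that out under two assumptions that are not stated in the text: that the figure refers to this topology, and that the "traditional" comparison is a 36-port switch (inferred from the 12-port difference mentioned above).

```python
def max_fat_tree_nodes(radix: int, tiers: int = 3) -> int:
    """Maximum hosts in a full-bisection fat tree built from radix-port switches.

    Every tier below the top uses half of a switch's ports downwards and half
    upwards, which gives 2 * (radix / 2) ** tiers hosts.
    """
    return 2 * (radix // 2) ** tiers


opa_nodes = max_fat_tree_nodes(48)     # 48-port Omni-Path edge switch
other_nodes = max_fat_tree_nodes(36)   # assumed 36-port competitor switch
print(opa_nodes, other_nodes, round(opa_nodes / other_nodes, 2))
```

With these assumptions the 48-port radix yields 27,648 nodes against 11,664 for a 36-port radix, a ratio of roughly 2.4, which is in the same region as the figure quoted above.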
SPECIFICATION | Intel® Omni-Path Edge Switch 100 Series: 48 Port | Intel® Omni-Path Edge Switch 100 Series: 24 Port |
---|---|---|
PORTS | 48 up to 100Gbps | 24 up to 100Gbps |
RACK SPACE | 1U (1.75”) | 1U (1.75”) |
CAPACITY | 9.6Tb/s | 4.8Tb/s |
PORT SPEED | 100Gb/s | 100Gb/s |
POWER (TYP./MAX), INPUT 100-240 VAC 50-60HZ | 189/238 W (Copper), 356/408 W (Optical) | 146/179 W (Copper), 231/264 W (Optical) |
INTERFACE | QSFP28 | QSFP28 |
FANS & AIRFLOW | N+1 (Speed Control) Forward/Reverse | N+1 (Speed Control) Forward/Reverse |
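The CAPACITY figures for the Edge switches are consistent with every port running at 100 Gb/s and both directions being counted. A minimal sketch of that arithmetic follows; the bidirectional counting is an assumption inferred from the numbers rather than something the table states.

```python
def switch_capacity_tbps(ports: int, port_speed_gbps: float = 100.0) -> float:
    """Aggregate switch capacity in Tb/s, counting both directions of each port."""
    return ports * port_speed_gbps * 2 / 1000


for ports in (48, 24):
    print(f"{ports} ports: {switch_capacity_tbps(ports):.1f} Tb/s")
```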
Intel’s Director switch range offers a very similar feature set to the Edge switches, with various chassis options as you might expect. Currently there are 20U and 7U variants available, supporting various Spine and Leaf modules.
SPECIFICATION | Intel® Omni-Path Director Class Switch 100 Series: 24 Slot | Intel® Omni-Path Director Class Switch 100 Series: 6 Slot |
---|---|---|
PORTS | 48 up to 100Gbps | 48 up to 100Gbps |
RACK SPACE | 20U (35”) | 7U (12.25”) |
CAPACITY | 19.2Tb/s | 4.8Tb/s |
MANAGEMENT MODULES | 1/2 | 1/2 |
LEAF MODULES (32 PORTS) | Up to 24 | Up to 6 |
SPINE MODULES | Up to 8 | Up to 3 |
POWER (TYP./MAX), INPUT 100-240 VAC 50-60HZ | 6.8/8.9 KW (Copper), 9.4/11.6 KW (Optical) | 1.8/2.3 KW (Copper), 2.4/3.0 KW (Optical) |
INTERFACE | QSFP28 | QSFP28 |
FANS & AIRFLOW | N+1 (Speed Control) Forward/Reverse | N+1 (Speed Control) Forward/Reverse |
Intel® Omni-Path Software Components
Intel® Omni-Path Architecture software comprises the Intel® OPA Host Software Stack and the Intel® Fabric Suite.
Intel® OPA Host Software
Intel’s host software strategy is to utilize the existing OpenFabrics Alliance interfaces, ensuring that today’s application software written to those interfaces runs on Intel® OPA with no code changes required. This immediately enables an ecosystem of applications to “just work.”
All of the Intel® Omni-Path host software is being open sourced.
As with previous generations, PSM provides a fast data path with an HPC-optimized lightweight software (SW) driver layer. In addition, standard I/O-focused protocols are supported via the standard verbs layer.
Intel® Fabric Suite
Provides comprehensive control of administrative functions using a mature Subnet Manager. With advanced routing algorithms, powerful diagnostic tools and full subnet manager failover, the Fabric Manager simplifies subnet, fabric, and individual component management, easing the deployment and optimization of large fabrics.
Intel® Fabric Manager GUI
Provides an intuitive, scalable dashboard and analysis tools for viewing and monitoring fabric status and configuration. The GUI may be run on a Linux or Windows desktop/laptop system with TCP/IP connectivity to the Fabric Manager.
Source: Intel® Corporation, 2015