Open Fabrics Enterprise Distribution (OFED) Version 4.17-1 Release Notes June 2019 =============================================================================== Table of Contents =============================================================================== 1. Overview, which includes: - OFED Distribution Rev 4.17-1 Contents - Supported Platforms and Operating Systems - Supported HCA and RNIC Adapter Cards and Firmware Versions - Tested Switch Platforms - Third party Test Packages - OFED sources, build process 2. Change log 3. Known Issues 4. Generic Notes =============================================================================== 1. Overview =============================================================================== These are the release notes of OpenFabrics Enterprise Distribution (OFED) release 4.17. The OFED software package is composed of several software modules, and is intended for use on a computer cluster constructed as an InfiniBand Fabric, an iWARP Network or a RoCE Fabric. Note: If you plan to upgrade the OFED package on your cluster, please upgrade all of its nodes to this new version. Note: If you are installing the OFED package on a system that contains the providers drivers in the initrd / initramfs you should update the initramfs to indicate the new driver locations either manually or using the command: dracut -f -v --force for a full rewrite (May differ per distro) Failing to do so may lead to failing to probe vendor specific infiniband drivers with a mismatch message, or running with inbox drivers instead of the drivers that are part of OFED. 1.1 OFED 4.17-1 Contents ----------------------- The OFED package contains the following components: - OpenFabrics core and ULPs: - IB HCA drivers (mthca, mlx4, mlx5, qib) - iWARP RNIC driver (cxgb3, cxgb4, i40iw, qedr) - RoCE drivers (mlx4, mlx5, qedr, bnxt_re) - IB core - Upper Layer Protocols: - IPoIB, SRP Initiator, iSER, uDAPL - NVMe-oF - OpenFabrics utilities: - OpenSM (OSM): InfiniBand Subnet Manager - Diagnostic tools - Performance tests - Extra packages: - libfabric - library that exports interfaces for fabric services to applications - Sources of all software modules (under conditions mentioned in the modules LICENSE files) - Documentation 1.2 Supported Platforms and Operating Systems --------------------------------------------- o CPU architectures: - x86_64 - x86 - ppc64 o Linux Operating Systems: - RedHat EL7.4 3.10.0-693.el7 - RedHat EL7.5 3.10.0-862.el7 - RedHat EL7.6 3.10.0-957.el7 - SLES12.3 4.4.73-5-default - SLES12.4 4.12.14-94.41-default - SLES15 4.12.14-23.1-default - kernel.org 4.17 1.3 HCAs and RNICs Supported ---------------------------- This release supports IB HCAs by Intel and Mellanox Technologies, iWARP RNICs by Chelsio Communications and Intel and RoCE adapters by IBM, Mellanox, Cavium, and Broadcom. InfiniBand Adapters o Intel (formerly QLogic) HCAs: - Intel(R) True Scale DDR PCIe x8 and x16 HCAs - Intel(R) True Scale QDR PCIe x8 Gen2 HCAs o Mellanox Technologies HCAs (FDR and FDR10 Modes are Supported): - ConnectX-3 (Rev 2.40.7000 and above) - ConnectX-3 Pro (Rev 2.34.5000 and above) - Connect-IB (Rev 10.16.1020 and above) o Mellanox Technologies HCAs (EDR, FDR, FDR10 Modes are Supported): - ConnectX-4 (Rev 12.22.1002 and above) - ConnectX-4 Lx (Rev 14.22.1002 and above) - ConnectX-5 (Rev 16.22.1002 and above) o Mellanox Technologies HCAs (HDR, EDR, FDR, FDR10 Modes are Supported): - ConnectX-6 (Rev xx.xx.x and above) For official firmware versions please see: http://www.mellanox.com/content/pages.php?pg=firmware_download iWARP Adapters o Chelsio RNICs: - S310/S320 10GbE Storage Accelerators - R310/R320 10GbE iWARP Adapters - T4: T420-CR, T440-CR, T422-CR, T404-BT, T440-LP-CR, T420-LL-CR, T420-CX - T5: T502-BT, T580-CR, T580-LP-CR, T520-LL-CR, T520-CR, T522-CR, T540-CR - T6: T62100-CR, T62100-LP-CR, T6225-CR o Intel RNICs: - i40iw 10Gb iWARP Adapter (kernel 4.17 only) o Cavium - QL4xxxx series of converged network adapters (iWARP) RoCE Adapters o Mellanox - ConnectX-2 EN (Rev 2.9.1200 and above) - ConnectX-3 EN (Rev 2.34.5000 and above) - ConnectX-4 EN (Rev 14.22.1002 and above) - ConnectX-5 EN (Rev 16.22.1002 and above) - ConnectX-6 EN (Rev x.xx.xxxx and above) o Cavium - QL4xxxx series of converged network adapters (RoCEv2) o Broadcom - NetXtreme Ethernet Network Adapters (RoCEv2) Other Drivers - VMware Paravirtual RDMA Driver (vmw_pvrdma) - Software RDMA over Ethernet (rdma_rxe) - NVMe-oF Host and Target (nvme, nvmet, nvme-rdma) 1.4 Switches Supported ---------------------- This release was tested with switches and gateways as follow: InfiniBand Switches o Flextronics - F-X430044 o Intel (formerly QLogic) - QL12200 o Mellanox - MLNX-OS MSX6036/SX6025 w/w MLNX-OS version 3.3.4304 - Grid Director 4036 w/w Grid Director version 3.9.2-992 - FabricIT EFM IS5035 w/w FabricIT EFM version 1.1.3000 - FabricIT BXM MBX5020 w/w FabricIT BXM version 2.1.2000 iWARP Switches o Arista o Fujitsu - XG2000C 10Gb Ethernet Switch RoCE Switches o Arista o BLADE Network Technologies (BNT) o Mellanox - SX1036 - SX1024 - SX1016 1.5 Third Party Packages ------------------------ The following third party packages have been tested with OFED 4.17: - Open MPI - 3.x - Intel MPI Library 2018/2019 1.6 OFED Sources ---------------- All sources are located under: git://git.openfabrics.org https://github.com/linux-rdma https://github.com/Mellanox/mstflint https://github.com/ofiwg/libfabric Linux: ------ URL: git://git.openfabrics.org/compat-rdma/linux-4.17.git Branch: master - Linux kernel sub-tree that includes files relevant for the OFED project only. Based on v4.17. Used to shorten git clone time. Note: the regular Linux git tree can be used as well. compat: ------- URL: git://git.openfabrics.org/compat-rdma/compat.git Branch: ofed - Based on compat project (https://github.com/mcgrof/compat). The compat module provides functionality introduced in newer kernels to older kernels through a set of header files and exported symbols. See https://github.com/mcgrof/compat/wiki for details. - Used to replace kernel_addons in the previous OFED kernel tree. compat-rdma: ------------ URL: git://git.openfabrics.org/compat-rdma/compat-rdma.git Branch: master User level Sources are downloaded from http://www.openfabrics.org/downloads/ as written in the BUILD_ID build process: -------------- The kernel sources are based on Linux 4.17 mainline kernel. Its patches are included in the OFED sources directory. OFED's rdma_core is based on 4.17. The distro's rdma_core will not be newer since we use the latest upstream kernel/rdma_core as a base and add support to the existing distros. OFED brings newer functionality to the existing distros by removing the in-box" user-space packages and installing newer rdma core (user and kernel) and other OFED packages. compat-rdma brings the new kernel drivers installed under /lib/modules/`uname -r`/kernel/updates directory which precede in the search path the in-box drivers so that OFED drivers can be loaded by the openibd service. Latest build process is available under: https://downloads.openfabrics.org/WorkGroups/ewg/Build%20Process/ The list of maintainers is available under: http://www.openfabrics.org/downloads/MAINTAINERS =============================================================================== 2. Change log =============================================================================== ------------------------------------------------------------------------------- OFED-4.17-1-rc2 Main Changes from OFED-4.17-1-rc1 ------------------------------------------------------------------------------- 1. compat-rdma - Module.supported: Added new Mellanox kernel modules 2. Updated packages - rdma-core v17.5 - perftest-4.4-0.6.gba4bf6d - opensm-3.3.22 ------------------------------------------------------------------------------- OFED-4.17-1-rc1 Main Changes from OFED-4.17 ------------------------------------------------------------------------------- 1. Add support for RHEL7.6 and SLES12 SP4 2. compat-rdma - bnxt_en: Fix bnxt_en compilation for RH 7.6 - Add vmw_pvrdma patch from linux-next - compat: Move bpf_prog_sub to be exported by compat 3. Updated packages - rdma-core v17.4 - perftest-4.4-0.3.g7a8e409 (Downgraded due to bug: 2698) - mstflint-4.11.0-4 - libfabric-1.7.1 - fabtests-1.7.1 4. ofed-scripts - install.pl: Updated rdma-core requirements - install.pl: Disable compat-rdma-firmware on SLES12SP4 due to conflict - install.pl: Added SLES12 SP4 support - install.pl: Added RHEL7.6 support - install.pl: Added openssl-devel requirement for mstflint ------------------------------------------------------------------------------- OFED-4.17 Main Changes from OFED-4.8-2 ------------------------------------------------------------------------------- 1. Add support for RHEL7.5 and SLES15 2. Remove support for RHEL7.0,7.1,7.2,7.3, and SLES12.0,12.1,12.2 3. compat-rdma - Based on kernel.org kernel 4.17 - Fixed cma_configfs backport - Fixed SRP backport - openibd: unload bnxt_en on restart/stop - i40iw: remove support for ib_get_vector_affinity - qed: Fix shmem structure inconsistency between driver and the mfw. 4. Updated packages - rdma-core v17.2 - perftest-4.4-0.5.g1ceab48 - opensm 3.3.21 - infiniband-diags-2.1.0 - mstflint-4.10.0 - ofed_scripts - libfabric-1.6.2 - fabtests-1.6.2 ------------------------------------------------------------------------------- OFED-4.8-2 Main Changes from OFED-4.8-1 ------------------------------------------------------------------------------- 1. compat-rdma - bnxt_re: Add speed defines for 50G and 100G adapters - IB/core: Adding IB_SPEED_HDR definition - bnxt_en: Adding device ids for BCM5880x devices - bnxt_re: Avoid Hard lockup during error CQE processing - BACKPORT qed: Enable RoCE parser searching on fp init - BACKPORT qed: fix dump of context data - qed: Backport Free RoCE ILT Memory on rmmod qedr - bnxt_re: Fix incorrect DB offset calculation - bnxt_re: Unconditionly fence non wire memory operations - bnxt_re: Fix race conditions during load/unload testing - bnxt_re: Fix memory leak if QP create fails - bnxt_re: Disable atomics support - bnxt_en: Fix Max MTU setting on SLES 12 SP3 - qede: SLE12SP3 Backport fix use core min max MTU check - cxgb: SLE12SP3 Backport fix: use net core MTU range checking - Remove vmw_pvrdma tech preview patches - Fixed mlx4 backport - qedr cherry pick: lower message verbosity - Update qib backport for SLES 12.3 - bnxt_en: backport for RH 7.0 and 7.1 - compat-rdma.spec: Added compat-rdma-firmware subpackage - bnxt_en: Backporting for RHEL 7.4 and SLES12SP3 - qedr: Add backports for RHEL 7.4 and SLES 12 SP3 - configure: Add parameters for RXE and RDMAVT - ib_iser: Added support for SLES12 SP3 - mlx5: Added support for SLES12 SP3 - mlx4: Added support for SLES12 SP3 - ib_core: Added support for SLES12 SP3 - mlx5: Added RHEL7.4 support - mlx4: Added RHEL7.4 support - configure/makefile: Rename CONFIG_INFINIBAND_RXE -> CONFIG_RDMA_RXE to match upstream 2. Updated packages - ofed-scripts: install.pl: Ignore firmware in compat-rdma-content. bug: 2677 add --with-rdmavt-mod for qib Add VMware Paravirtual RDMA Driver as an installable driver Remove vmw_pvrdma tech preview install option install.pl: Added rdma-core-devel dependency for the perftest build Revert "install.pl: Enable bnxt_re on supported OSes only" install.pl: Added compat-rdma-firmware package install.pl: Enable bnxt_re on supported OSes only install.pl: Updated rdma-core dependencies install.pl: Remove requirement on the specific cmake version install.pl: Added support for SLES12 SP3 install.pl: Added RHEL7.4 support - rdma-core-v16.3 - fabtests-1.5.3 - libfabric-1.5.3 - perftest-4.1-0.2.g770623f OFED-4.8-1 Main Changes from OFED-4.8 ------------------------------------------------------------------------------- 1. compat-rdma - openibd: rpcrdma added to the list of modules to unload - bnxt_en/bnxt_re: Support ulp_shutdown hook - bnxt_re: BZ 2655 Fix nfs client hang with a call trace on stress traffic - bnxt_re: BZ 2654 Fix traffic failure after changing ip address - bnxt_re: BZ 2656 fix a crash in qp error event processing - bnxt_en: BZ 2649 Enabling DCB parameter for bnxt_en - Fixes for i40iw which have been included in kernels 4.12 and 4.13 - qedr: fix BZ 2642 - parse VLAN ID and priority correctly - bnxt_en: Bug 2645 - Set default completion ring for async events - bnxt_re: bz 2646 - 0012-bnxt_re-Make-room-for-mapping-beyond-32-entries.patch - ib_srp: Fixed backport - issue 2635 - bnxt_en: Misc bug fixes - bnxt_re: Fix bnxt_en installation in the latest daily builds - bnxt_re: Adding bug fix patches - bnxt_re: Backports for SLES12SP0/SP1/SP2 and RH7.0/7.1 - NFS/RDMA backport patch for SLES12SP2 - NFS/RDMA backport patch to revert source files to 4.6 kernel in order to facilitate dependency on distro SUNRPC. Include fix to use correct ib_map_mr_sg signature from OFED4.8. - makefile: Added CONFIG_BNXT parameter to override kernel's default - compat-rdma.spec: Fixed QED firmware installation - configure: Fix typos introduced by b8520c5751cc6c40bb2f5ba8533a34dc260f86f2 - configure: Fix typo - Merge branch 'master' of https://github.com/selvintxavier/compat-rdma - Fixed SRP support on RHEL7.2 by using blk_iopoll API in ib_core - bnxt_re: Add bnxt_re backports - bnxt_re: Enable bnxt_re building - Removed previously applied patch 0057-qed-Fix-error-in-the-dcbx-app-meta-data.patch - Add qedr 2. Updated packages - rdma-core stable-v15 - libfabric-1.5.2 - fabtests-1.5.2 - mstflint-4.8.0 3. ofed_scripts - Added support for rdma-core new packaging format - install.pl: Added support for using source RPM per Distro - install.pl: Enable nfsrdma on RHEL7.3 and SLES12 SP2 - Fix syntax error after libbnxt_re-rdmav2 - Add bnxt_re support - Add qedr OFED-4.8 Main Changes from OFED 3.18-3 GA ------------------------------------------------------------------------------- 1. Added support for RHEL7.3 and SLES12 SP2 2. compat-rdma: - BACKPORT-ib_srp: synced with MLNX_OFED ib_srp backport - IB/srp: force reconnect_delay module param to exceed fast_io_fail_tmo - Final updates to ccl ABI between host and card to match OFED 4.8 and kernel 4.9. - Fix rstream timeouts on 4k_lat test - Support for Kernel 4.8 in QIB on OFED-4.8 - compat-rdma: Add i40iw to openibd - Interop related fixes for iWarp drivers i40iw and nes - configure: Fix typo - compat: Cleanup auto-generated defines - iw_cxgb4: Guard against null cm_id in dump_ep/qp - Xeon Phi updates - Fixes for i40iw which have been included in kernels > 4.8 - Updated 0003-add-the-ibp-client-and-server-drivers.patch 3. Updated packages: - compat-rdma-4.8 - based on linux-4.8 - infiniband-diags-2.0.0 - Removed libibmad (deprecated) - libfabric-1.4.2 - fabtests-1.4.2 - rdma-core v13 - libibscif-1.1.3 - rdma-core-12-1 - dapl-2.1.10 - fabtests-1.4.1rc1 - infiniband-diags-1.6.7 - libfabric-1.4.1rc1 - libibmad-1.3.13 - libibscif-1.1.2 - mstflint-4.5.0-1.17.g8a0c39d - perftest-3.4-0.9.g98a9a17 4. install.pl - Disable inband support for mstflint =============================================================================== 3. Known Issues =============================================================================== The following is a list of general limitations and known issues of the various components of the OFED 4.17-1 release. 01. When upgrading from an earlier OFED version, the installation script does not stop the earlier OFED version prior to uninstalling it. Workaround: Stop the old OFED stack (/etc/init.d/openibd stop) before upgrading or reboot the server after OFED installation. 02. Memory registration by the user is limited according to administrator setting. See "Pinning (Locking) User Memory Pages" in OFED_tips.txt for system configuration. 03. Fork support from kernel 2.6.12 and above is available provided that applications do not use threads. fork() is supported as long as the parent process does not run before the child exits or calls exec(). The former can be achieved by calling wait(childpid), and the latter can be achieved by application specific means. The Posix system() call is supported. 04. IPoIB: brctl utilities do not work on IPoIB interfaces. The reason for that is that these utilities support devices of type Ethernet only. 05. In case uninstall is failing, check the error log and remove the remaining RPMs manually using 'rpm -e '. 07. RDS is not supported. 08. Bug 2644 – qedr: Changing Ethernet MTU prevents future RoCE traffic. Changing the MTU size of the Ethernet interface will stop and prevent RoCE traffic on that interface. To overcome this reload the qedr driver: $ rmmod qedr $ modprobe qedr 09. Bug 2639 - ib_isert/ib_srpt report "Unknown symbol" after OFED installation: ib_srpt and ib_isert kernel modules are not included in OFED. So, after OFED installation the in-box ib_core being replaced by one coming with OFED and, therefore, in-box ib_srpt and ib_isert kernel modules fail to be loaded over OFED's ib_core. If one need to use iSER or SRP target then need to recompile iSERT/SRPT over the OFED. 10. Bug 2640 - openibd fail to start when system is coming up: The inbox kernel modules being loaded from initrd So, need to rebuild the initrd by: # dracut -f -v 11. Bug 2641 - IPoIB: Missing first ping after clearing ARP cache: This is a known issue by IPoIB design 12. Bug 2661 - rstream intermittent error: Connection refused (stale connection) 1. Two Q-Logic/Cavium NICs back to back 2. Command: rstream -S all -T a on the second connect in the test: Connection refused (stale connection ) 13. Bug 2694 - bnxt_re, SLES12.3 errors in dmesg in iSER target while running read traffic 14. Bug 2696 - bnxt_re, call trace observed while running perftest with tx queue size 50000 15. Bug 2697 - i40iw, NVMeoF fails with i40iw and Chelsio (T5) interop 16. Bug 2701 - perftest doesn't support inline data on new Broadcom netextreme adapter 17. Bug 2702 - cxgb4 driver: cannot set MTU more than 1500 on RH7.5 Note: See the release notes of each component for additional issues. =============================================================================== 4. Generic Notes =============================================================================== 1. install.pl - '--without-depcheck' parameter should be used in case where the required dependencies sutisfied using 'make install' command and not in RPM form, so, they cannot be verified by the install.pl script. Using '--without-depcheck' when the dependency is actually not present may lead to undesired behaviour both in compilation and run time. 2. ULP and Driver restrictions: - NVMe-oF (kernel 4.17 and SLES15 only) - Intel i40iw 10GbE iWARP Adapter (kernel 4.17 only) - Intel qib 40Gb IB Adapter (kernel 4.17 only) 3. Post installation note for RH7.4 and RH7.5 Redhat may start the rdma.service and attempt to load in-box ULP modules not built with OFED rdma_core components. As a result there may be some conflicts with in-box modules and the new rdma_core modules loaded with our openibd service. For example, dmesg may show unknown symbols similiar to the following: rpcrdma: Unknown symbol rdma_reject_msg To prevent in-box ULP modules from attempting to loading you can disable the rdma.service or modify the configuration files after you install OFED. The following is a sample post install script: dracut -f -v --force sed -i -e "s/yes/no/" /etc/rdma/rdma.conf sed -i -e "s/^svc/#&/;s/^xprt/#&/" /etc/rdma/modules/rdma.conf reboot