Tuesday, December 4, 2018
Fabric technologies are continuously evolving, and system administrators are under pressure to deploy new ones in a continuous quest to improve application efficiency and programmer productivity. Increasingly, that means integrating multiple, heterogeneous fabrics. This applies to both applications and overall system design (compute clusters, storage networks, etc.). When choosing among fabric technologies based on their features, system administrators have to consider the challenges that the underlying shared fabric stack code brings. Because of this common code base, challenges can arise when trying to use different fabrics on the same system at the same time, or when interchanging the technology on a system (namely InfiniBand and Omni-Path). Integrating heterogeneous fabrics into an environment requires knowledge of the network layer deployment as well as the associated system and kernel dependencies and effects. In this talk, system administrators can gain an understanding of some of the biggest challenges they regularly face.
- Educate attendees on, and understand, the functionality of the various fabrics
- Identify documentation needed to deploy and maintain these heterogeneous fabrics
- Understand other sysadmin communities and how to connect with them to share lessons learned and knowledge
- Standardization and support for these technologies to interoperate
Educate system administrators about the co-existence of heterogeneous fabrics, including the challenges and potential solutions.
System administrators of High Speed Fabrics
Speaker(s) / Moderator:
Jesse Martinez is the Technical Lead for the HPC Networking Team within the HPC Systems Group at Los Alamos National Laboratory. Jesse began his career at LANL as a student intern, where he attended the 2011 Computer System, Cluster, and Networking Summer Institute. Jesse received his B.S. in Computer Science from the New Mexico Institute of Mining and Technology in 2012. As an HPC Networking administrator, he manages the network infrastructure for LANL's production supercomputing capability, including Ethernet, InfiniBand, and Omni-Path high speed networks. Jesse has presented his work on high speed network monitoring at various conferences, including the Salishan HPC conference and the OpenFabrics Alliance (OFA) Workshop. Jesse has recently organized the HPC Networking Saturday Workshop at the Tapia 2017 and 2018 conferences.