Towards Modular Supercomputing with Slurm

Transcript:

Towards Modular Supercomputing with Slurm. 2017-09-25. Dorian Krause et al., Jülich Supercomputing Centre, Forschungszentrum Jülich.

Dual-Architecture Supercomputing Facility: a general-purpose cluster line (IBM Power 4+ JUMP, 9 TFlop/s; IBM Power 6 JUMP, 9 TFlop/s; JUROPA, 200 TFlop/s; HPC-FF, 100 TFlop/s; JURECA Cluster, 2015, 2.2 PFlop/s) and a highly scalable line (IBM Blue Gene/L JUBL, 45 TFlop/s; IBM Blue Gene/P JUGENE, 1 PFlop/s; IBM Blue Gene/Q JUQUEEN, 5.9 PFlop/s), both connected to the JUST file server.

Homogeneous Cluster: identical compute nodes connected by a single network.

Heterogeneous Cluster: compute nodes, some with attached GPUs, connected by a single network.

Cluster-Booster Architecture: a Cluster module and a Booster module coupled through a network.

The DEEP projects: DEEP, DEEP-ER, DEEP-EST (www.deep-projects.eu). EU Exascale projects with 27 partners, a total budget of 44 M€ and 30 M€ of EU funding, running from Nov 2011 to Jun 2020. All three combine hardware, software, and applications in a strong co-design.

DEEP Architecture: a Cluster of Intel Xeon nodes attached to a Booster of Intel Xeon Phi™ (KNC) nodes.

DEEP Prototype installed at JSC: 1.5 racks, 500 TFlop/s peak, 3.5 GFlop/s/W, water cooled; Cluster with 128 Xeon nodes, Booster with 384 Xeon Phi (KNC) nodes.

Application-Driven Approach. DEEP projects applications (15): brain simulation (EPFL + NMBU), space weather simulation (KU Leuven), climate simulation (Cyprus Institute), computational fluid engineering (CERFACS), high-temperature superconductivity (CINECA), seismic imaging (CGG + BSC), human exposure to electromagnetic fields (INRIA), geoscience (LRZ), radio astronomy (Astron), lattice QCD (University of Regensburg), molecular dynamics (NCSA), data analytics in Earth science (UoI), high-energy physics (CERN).

Programming Environment. On the Cluster side the interconnect is InfiniBand, on the Booster side it is Extoll; ParaStation MPI spans both fabrics. Cluster and Booster parts of an application are coupled via MPI_Comm_spawn, with the Cluster Booster Protocol carrying the traffic between the two networks. OmpSs on top of MPI provides pragmas to ease the offload process.
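The bridging relies on standard MPI dynamic process management. Below is a minimal sketch, in C, of how a Cluster-side application might spawn Booster processes with MPI_Comm_spawn; the worker binary name booster_worker, the process count, and the broadcast payload are illustrative assumptions rather than the projects' actual code, and the placement of the spawned processes on Booster nodes is left to ParaStation MPI and the resource manager.

/* Minimal sketch of the Cluster-side spawn (assumed binary name,
 * process count, and payload). */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Collectively spawn 4 worker processes (hypothetical count) running a
     * Booster-side binary; the call returns an intercommunicator that
     * connects the Cluster ranks with the newly created Booster ranks. */
    MPI_Comm booster_comm;
    MPI_Comm_spawn("booster_worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0 /* root rank */, MPI_COMM_WORLD,
                   &booster_comm, MPI_ERRCODES_IGNORE);

    /* Example exchange: broadcast a parameter from Cluster rank 0 to all
     * spawned Booster ranks across the intercommunicator. */
    double parameter = 42.0;
    MPI_Bcast(&parameter, 1, MPI_DOUBLE,
              rank == 0 ? MPI_ROOT : MPI_PROC_NULL, booster_comm);

    MPI_Comm_disconnect(&booster_comm);
    MPI_Finalize();
    return 0;
}

The spawned workers would obtain the same intercommunicator via MPI_Comm_get_parent() and call MPI_Bcast with root 0; the Cluster Booster Protocol carries this traffic between the two fabrics.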

Dual-Architecture Supercomputing Facility (revisited): the facility roadmap shown earlier gains a new system next to the JURECA Cluster (2015, 2.2 PFlop/s): the JURECA Booster (2017, 5 PFlop/s).

JURECA Cluster-Booster System. Cluster: 1884 nodes with dual-socket Xeon processors, Mellanox EDR InfiniBand, 2.2 PF/s peak. Booster: 1640 nodes with Xeon Phi 7250-F processors, Intel Omni-Path in a full fat-tree topology, 5 PF/s peak. 198 bridge nodes provide 20 Tb/s of bridging bandwidth between the two fabrics.
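As a plausibility check of the quoted Booster peak (assuming the Xeon Phi 7250's nominal 68 cores, 1.4 GHz base clock, and 32 double-precision FLOP per core and cycle from its two AVX-512 units):

\[ 68 \times 1.4\,\text{GHz} \times 32\,\tfrac{\text{FLOP}}{\text{cycle}} \approx 3.05\,\text{TFlop/s per node}, \qquad 1640 \times 3.05\,\text{TFlop/s} \approx 5.0\,\text{PFlop/s}. \]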

JURECA Cluster-Booster System building blocks: Cluster hardware, Cluster software, Booster hardware, Booster software, Cluster-Booster bridging for MPI, and workload management. Slurm requirements: support for heterogeneity and per-partition QoS, plus support for flexible heterogeneous jobs across partitions.

Next step: Modular Supercomputing. Diagram: modules built from general-purpose CPUs, many-core CPUs, FPGAs, memory, and NVRAM; different workloads (Workload 1, Workload 2) run on different combinations of modules.

Next step: Modular Supercomputing. The DEEP-EST project builds on Slurm: we plan to adapt Slurm to the needs of the Modular Supercomputing Architecture (scheduling, handling of non-standard resources).

Summary. The Cluster-Booster Architecture combines multi-core and many-core (or other accelerator) technologies in one system and enables flexible resource selection. JURECA is currently being extended to realize the Cluster-Booster Architecture in a 7.2 PF/s production system. Slurm is a key building block of our software stack; the community and SchedMD help us achieve our goal. We are working towards the Modular Supercomputing Architecture in DEEP-EST using Slurm and hope to be able to contribute back.