Energy Efficient High Performance Computing

Transcript:

Energy Efficient High Performance Computing
Michael Gerndt
Technische Universität München
gerndt@in.tum.de

Faculty of Informatics
- Founded in 1975
- 2500 students
- 30% foreign students
- 40 professors (22 chairs)

HPC Applications

TOP 500, November 2012

Name        Site          Architecture                                    Peak (PFlop/s)  Power (MW)
Titan       ORNL, USA     Cray, 560640 cores, Opteron + Nvidia K20        27.1            8.2
Sequoia     LLNL, USA     IBM BlueGene/Q, 1.6 M cores                     20.1            7.9
K Computer  RIKEN, Japan  Fujitsu, 705 K cores, SPARC64                   11.3            12.6
Mira        ANL, USA      IBM BlueGene/Q, 786 K cores                     10.1            3.9
JUQUEEN     FZJ, Germany  IBM BlueGene/Q, 393 K cores                     5.0             1.9
SuperMUC    LRZ, Germany  IBM iDataPlex, 147 K Intel Sandy Bridge cores   3.2             3.4

Energy Efficiency (Linpack Peak/W)

Name        Type           Architecture                                    GFlops/W
Titan       Heterogeneous  300 K Opteron cores + 18688 Nvidia K20          2.143
Sequoia     Homogeneous    IBM BlueGene/Q, 1.6 M cores                     2.069
K Computer  Homogeneous    Fujitsu, 705 K cores, SPARC64                   0.83
SuperMUC    Homogeneous    IBM iDataPlex, 147 K Intel Sandy Bridge cores   0.85

Pace to Exascale
- Exascale: 10^18 Flop/s
- Expected to be reached after 2018
- Massive parallelism and energy consumption

Name        Cores  Power (MW)
Titan       20 M   303
Sequoia     80 M   395
K Computer  64 M   1145
SuperMUC    49 M   1000
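The core counts and power figures in the exascale table are roughly what a proportional scale-up of each 2012 system to 1 EFlop/s would give. The slides do not say how the numbers were derived, so the sketch below rests on an assumption: cores and power grow linearly with peak performance. The function name `scale_to_exascale` is illustrative, and the inputs are the peak, core and power values from the TOP 500 slide.

```python
# Linear extrapolation of a 2012 system to 1 EFlop/s.
# Assumption (not stated in the slides): cores and power scale
# proportionally with peak performance.

EXAFLOP_IN_PFLOPS = 1000.0  # 10^18 Flop/s expressed in PFlop/s


def scale_to_exascale(peak_pflops, cores, power_mw):
    """Return (cores, power in MW) of a hypothetical exascale version."""
    factor = EXAFLOP_IN_PFLOPS / peak_pflops
    return cores * factor, power_mw * factor


# (peak PFlop/s, cores, power MW) as listed on the TOP 500 slide.
systems = {
    "Titan": (27.1, 560_640, 8.2),
    "Sequoia": (20.1, 1_600_000, 7.9),
    "K Computer": (11.3, 705_000, 12.6),
    "SuperMUC": (3.2, 147_000, 3.4),
}

for name, (peak, cores, power) in systems.items():
    exa_cores, exa_power = scale_to_exascale(peak, cores, power)
    print(f"{name:<11} {exa_cores / 1e6:6.1f} M cores {exa_power:7.0f} MW")
```

For Titan this gives about 20.7 M cores and 303 MW, in line with the table. The other rows come out close to the listed values but not identical, which suggests the slide used slightly different or rounded inputs.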

European Exascale Hardware Projects
- DEEP (2011-2014)
  - Cluster of accelerators
  - Intel Xeon Phi
  - Extoll network
  - Dynamic distribution of cluster nodes and accelerators
- Mont Blanc
  - Cortex-A15 SoCs
  - Blue Gene approach, but with a GPU on each SoC
  - Critical: network bandwidth

Energy Efficient HPC
- Energy efficient infrastructure
  - Reduce the power losses in the power supply chain
  - Exploit your possibilities for using compressorless cooling and use energy-efficient cooling technologies (e.g. direct liquid cooling)
  - Re-use waste heat of IT systems
- Energy efficient hardware
  - Use newest semiconductor technology
  - Use energy-saving processor and memory technologies
  - Consider using special hardware or accelerators tailored for solving specific scientific problems or numerical algorithms
- Energy aware software environment
  - Monitor the energy consumption of the compute systems and the cooling infrastructure
  - Use energy-aware system software to exploit the energy-saving features of your target platform
  - Monitor and optimize the performance of your scientific applications
- Energy efficient applications
  - Use most efficient algorithms
  - Use best libraries
  - Use most efficient programming paradigm

Leibniz Supercomputing Centre

SuperMUC, 3 Petaflop Supercomputer

SuperMUC at Leibniz Computing Centre

What's special about SuperMUC
- SuperMUC is the most powerful pure x86 ISA system in the world
  - General purpose, standard programming interface, easy to port to, future-safe for many applications
- SuperMUC is the most energy-efficient x86-based supercomputer in the world
  - Dark Center infrastructure at LRZ
  - Warm water cooling
  - Energy-aware scheduling with LoadLeveler
(Dagstuhl Perspectives Workshop, May 21-25, 2012)

SuperMUC Power Consumption

[Figure: SuperMUC HPL power consumption (infrastructure, machine room and PDU measurements). x-axis: time (clock), 16:30 to 13:30 the next day. Left y-axis: power (kW), 0 to 4000. Right y-axis: energy consumption (kWh), 0 to 50000. Series: power (machine room, kW), power (PDU, kW), power (infrastructure, kW), energy (machine room, kWh).]

Linpack HPL run, May 17, 2012: 2.582 PF
- Run start: 17.05.2012 20:56, 965.40 kW
- Run end: 18.05.2012 08:37, 711.02 kW
- Duration: 42045 s (11.68 hours)
- Average power: 2758.87 kW
- Energy: 32308.68 kWh
- Subsystems included in PDU measurements: computational nodes, interconnect network

Provided by Torsten Wilde, LRZ
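The energy figure of the HPL run is the time integral of the measured power. The slides do not describe how LRZ computed it, so the following is only a minimal sketch under stated assumptions: power is available as (time in seconds, power in kW) samples and is integrated with the trapezoidal rule. The function names are illustrative.

```python
def energy_kwh(samples):
    """Integrate (time_s, power_kw) samples with the trapezoidal rule."""
    total_kws = 0.0  # accumulated energy in kW*s
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total_kws += 0.5 * (p0 + p1) * (t1 - t0)
    return total_kws / 3600.0  # kW*s -> kWh


def average_power_kw(samples):
    """Average power over the sampled interval."""
    duration_s = samples[-1][0] - samples[0][0]
    return energy_kwh(samples) * 3600.0 / duration_s


# Sanity check with the reported run: constant average power over the
# reported duration (a simplification; the real trace is not constant).
flat_run = [(0, 2758.87), (42045, 2758.87)]
print(f"{energy_kwh(flat_run):.1f} kWh")
```

Using the reported average power and duration this yields roughly 32,200 kWh, within about 0.3% of the reported 32308.68 kWh. The slides do not explain the small difference.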

CPU frequency and RAM setting Provided by Andrey Semin, Sr. Engineer HPC, Intel

SuperMUC - Costs

High End System                                       2010-2014 (Phase 1)  2014-2016 (Phase 2)
Investment costs (hardware and software)              53 million           ~19 million
Operating costs (electricity and maintenance for
  hardware and software, some additional personnel)   32 million           ~29 million
Sum                                                   85 million           ~48 million

Extension buildings (construction and infrastructure): 49 million

Energy Capping in Contract with IBM
- New funding scheme: energy included
- Contract includes energy cost for 5 years
- Power consumption of the system varies between 1 and 3 MW, depending on the usage by the applications
- The contract is based on the energy consumed in a benchmark suite agreed between IBM and LRZ

IBM iDataPlex dx360 M4: Water cooling
- Heat flux > 90% to water; very low chilled water requirement
- Power advantage over air-cooled node: warm water cooled ~10% (cold water cooled ~15%), due to lower component temperatures and no fans
- Typical operating conditions: T_air = 25-35 °C, T_water = 18-45 °C

IBM System x iDataPlex Direct Water Cooled Rack

Warm Water Cooling: CoolMUC

Warm Water Cooling


FP7 AutoTune Project Goals
- Tackle complexity of HPC architectures
  - Multicore, multisocket, accelerators, DVFS
- Enable higher productivity via auto-tuning
- Focus on static tuning in pre-production phase
  - Produce tuning recommendations
- Leverage state-of-the-art performance analysis
- Implement an extensible environment
  - Support open and proprietary plugins

Periscope Tuning Framework
- Extension of Periscope
- Online tuning process
  - Application phase-based
- Extensible via tuning plugins
  - Single tuning aspect
  - Combining multiple tuning aspects
- Rich framework for plugin implementation
- Automatic and parallel experiment execution

[Figure: PTF architecture. Labelled components: PTF User Interface, PTF Frontend, Search Algorithms, Tuning Plugins, Static Program Info, Experiment Execution, Performance Analysis, Agent Network, Monitor (three instances), Application.]

Energy Efficiency Tuning in AutoTune
- Application tuning for energy efficiency
  - Performance optimization in general
  - Data transfer optimization
  - Concurrency throttling
  - Power gating
  - Dynamic voltage and frequency scaling
- Energy measurement
  - RAPL (Running Average Power Limit)
  - Node power supply

ENOPT library implemented by LRZ
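As an illustration of RAPL-based energy measurement, the sketch below reads the cumulative package energy counter that Linux exposes through the powercap sysfs interface. It is not the ENOPT library and makes no claim about how AutoTune reads RAPL. It assumes a Linux system with the `intel-rapl` powercap driver, that package 0 appears as `intel-rapl:0`, and that the files are readable (often root only). The helper names are illustrative.

```python
from pathlib import Path
import time

# Assumed location of the package-0 RAPL domain in the Linux powercap tree.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(name, base=RAPL_DIR):
    """Read an integer sysfs value such as energy_uj (microjoules)."""
    return int((base / name).read_text().strip())


def energy_delta_j(before_uj, after_uj, max_range_uj):
    """Joules consumed between two counter readings.

    The counter is cumulative and wraps around; a single wrap is
    compensated using the range reported in max_energy_range_uj.
    """
    delta = after_uj - before_uj
    if delta < 0:  # counter wrapped between the two readings
        delta += max_range_uj
    return delta / 1e6


def measure(workload, base=RAPL_DIR):
    """Run workload() and return (energy in J, elapsed time in s)."""
    max_range = read_uj("max_energy_range_uj", base)
    start_uj, start_t = read_uj("energy_uj", base), time.perf_counter()
    workload()
    end_uj, end_t = read_uj("energy_uj", base), time.perf_counter()
    return energy_delta_j(start_uj, end_uj, max_range), end_t - start_t
```

A call such as `measure(lambda: sum(i * i for i in range(10**7)))` would return the package energy and runtime of that loop on a machine where the counter is available. The wrap-around handling only covers one wrap, so long-running regions would need periodic sampling.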

Energy Properties
- Automatic search for application properties
  - Energy-inefficient regions (Flops/J)
  - Energy-tunable regions
  - Memory/compute-bound regions
  - Regions with load imbalance
- Tuning plugin
  - Inspects the energy properties
  - Identifies potentially good combinations of governor and frequency
  - Performs an online evaluation of the scenarios
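The plugin steps listed above (collect governor/frequency scenarios, evaluate each one, keep the best) can be sketched as a small search loop. This is not the AutoTune plugin's code. `best_scenario`, `run_experiment` and `max_slowdown` are hypothetical names, the selection rule (lowest energy within a runtime tolerance) is an assumption, and the measurement table contains made-up numbers purely to make the example runnable.

```python
def flops_per_joule(flop_count, energy_j):
    """Energy efficiency metric used for the 'energy inefficient' property."""
    return flop_count / energy_j


def best_scenario(scenarios, run_experiment, max_slowdown=1.10):
    """Return the lowest-energy scenario whose runtime stays within
    max_slowdown times the fastest observed runtime.

    run_experiment(governor, freq_ghz) must return (energy_j, time_s).
    """
    results = {s: run_experiment(*s) for s in scenarios}
    fastest = min(time_s for _, time_s in results.values())
    allowed = {s: r for s, r in results.items()
               if r[1] <= max_slowdown * fastest}
    return min(allowed, key=lambda s: allowed[s][0])


# Made-up (energy in J, time in s) values, NOT measurements from the slides.
fake_measurements = {
    ("ondemand", 2.7): (100.0, 10.0),
    ("userspace", 2.7): (98.0, 10.0),
    ("ondemand", 1.99): (90.0, 10.8),
    ("userspace", 1.2): (80.0, 16.0),
}


def run_fake(governor, freq_ghz):
    """Stand-in for a real online experiment on the application."""
    return fake_measurements[(governor, freq_ghz)]


print(best_scenario(list(fake_measurements), run_fake))
```

With the default 10% tolerance the slowest, lowest-energy setting is rejected; relaxing `max_slowdown` trades runtime for energy. A real plugin would replace `run_fake` with an experiment that sets the governor and frequency and measures the tuned region.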

Early results

[Figure: three bar charts (SIP, Matrix Multiply, SeiSol) showing energy (kJ) and time (10 s) for four governor/frequency settings: Ondemand 2.7, Userspace 2.7, Ondemand 1.99, Userspace 1.2.]

Summary
- Energy wall
- Energy reduction required on all levels
  - Technology
  - System
  - Application
- Auto-tuning can help in reducing energy consumption
  - Indirectly, through performance tuning
  - Directly, by tuning DVFS