HPC @ RZ.RWTH-AACHEN.DE
Center for Computing and Communication
Photos: Sun Fire T2000; Sun Fire E25K CPU and I/O boards; Sun Fire E25K cluster; FC-AL switch; Sun Fire E6900 cluster; Sun Fire V40z cluster
Contents
  Compute cluster of the Center for Computing and Communication (RZ)
  The Linpack benchmark
  The Top500 list of the fastest computers
Compute Cluster of RWTH Aachen, Feb 07
(#procs, #cores, #threads, clock and memory are per node; acc. = accumulated over all nodes of the row)

#nodes  model     processor type            #procs  #cores  #threads  clock [MHz]  memory [GB]  network                       acc. performance [TFLOPS]  acc. memory [TB]
4       SF E25K   UltraSPARC IV             72      144     144       1050         288          Gigabit Ethernet, Fire Link   1.21                       1.15
16      SF E6900  UltraSPARC IV             24      48      48        1200         96           Gigabit Ethernet, Fire Link   1.84                       1.54
8       SF E2900  UltraSPARC IV             12      24      24        1200         48           Gigabit Ethernet              0.46                       0.38
1       SF T2000  UltraSPARC T1             1       8       32        1000         8            Gigabit Ethernet              0.0001                     0.01
64      SF V40z   Opteron 848               4       4       4         2200         8            Gigabit Ethernet              1.13                       0.51
4       SF V40z   Opteron 875               4       8       8         2200         16           Gigabit Ethernet, Infiniband  0.14                       0.06
2       SF X4600  Opteron 885               8       16      16        2600         32           Gigabit Ethernet              0.17                       0.06
9       Dell 1950 Xeon 5160 (Woodcrest)     2       4       4         3000         8            Gigabit Ethernet, Infiniband  0.22                       0.07
108     sum                                 1042    1833    1857                                                                5.16                       3.79
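The accumulated columns of the cluster table can be recomputed from the per-node figures. The following is a minimal sketch, not the method used by the center: the function names are made up for illustration, the value of 2 floating-point results per core and clock cycle is an assumption that is not stated on the slide (it is chosen because it reproduces the listed numbers for the rows used here), and 1 TB is taken as 1000 GB.

```python
# ASSUMPTION: 2 floating-point results per core per clock cycle.
# This value is not given on the slide; it is used because it reproduces
# the accumulated performance listed for the rows below.
ASSUMED_FLOPS_PER_CYCLE = 2


def accumulated_peak_tflops(nodes, cores_per_node, clock_mhz,
                            flops_per_cycle=ASSUMED_FLOPS_PER_CYCLE):
    """Peak performance of all nodes of one table row, in TFLOPS."""
    flops_per_second = nodes * cores_per_node * clock_mhz * 1e6 * flops_per_cycle
    return flops_per_second / 1e12


def accumulated_memory_tb(nodes, memory_gb_per_node):
    """Total memory of all nodes of one table row, in TB (1 TB = 1000 GB assumed)."""
    return nodes * memory_gb_per_node / 1000


# (model, #nodes, #cores per node, clock [MHz], memory per node [GB]) from the table
rows = [
    ("SF E25K", 4, 144, 1050, 288),
    ("SF E6900", 16, 48, 1200, 96),
    ("SF V40z (Opteron 848)", 64, 4, 2200, 8),
]

for model, nodes, cores, clock_mhz, memory_gb in rows:
    tflops = accumulated_peak_tflops(nodes, cores, clock_mhz)
    memory_tb = accumulated_memory_tb(nodes, memory_gb)
    print(f"{model}: {tflops:.2f} TFLOPS, {memory_tb:.2f} TB")
```

For the three rows shown, the results round to the table entries (1.21/1.15, 1.84/1.54 and 1.13/0.51).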
System Management
Frontend nodes: interactive work, program development and testing, GUIs
  cluster.rz.rwth-aachen.de          (USIV-Solaris 9)
  cluster-sol10.rz.rwth-aachen.de    (USIV-Solaris 10)
  cluster-sun-x86.rz.rwth-aachen.de  (OPT-Solaris 10)
  cluster-linux.rz.rwth-aachen.de    (Opteron-Linux 64b)
  cluster-win.rz.rwth-aachen.de      (Intel-Windows 64b)
Batch system: Sun Grid Engine, for jobs running longer than 20 min
Overview of HPC Tools
Current program development environment for HPC on the Sun SPARC and Opteron systems at the RWTH
4 platforms:
  1. SPARC/Solaris 9/10, 64 bit
  2. Opteron/Solaris, 64 bit
  3. Opteron/Linux and Intel/Linux, 64 bit
  4. Opteron/Windows and Intel/Windows, 64 bit
Serial programming, shared memory parallelization, message passing
Compilers / MPI libraries, debugging tools, performance analysis tools
Programming Environment: Compilers + Debugging Tools
Platforms: US IV, Opteron and Xeon under Solaris, Lin and Win; an X marks availability

Company    Compiler version    Language       OpenMP support                 Autopar        Debugger        Runtime analysis                    Platform marks
Sun        Studio 11           F95/C/C++      F95/C++                        F95/C++        dbx, sunstudio  analyzer, collect, er_print, gprof  X X X
Sun        Studio 12 α         F95/C/C++      F95/C++                        F95/C++        dbx, sunstudio  analyzer, collect, er_print, gprof  X X X
Intel      V9.1                F95/C++        F95/C++, Threading Tools       F95/C++        idb             vtune                               X X vtune vtune
Intel      V10.0 beta          F95/C++        F95/C++, Threading Tools       F95/C++        idb             vtune                               X X vtune vtune
Intel      Guide V4.0          F90/C/C++      F77/F90/C/C++, Assure          F77/F90/C/C++  -               GuideView                           X
GNU        V4.0                F95/C++        -                              -              gdb             gprof                               X X X X
GNU        V4.2 beta           F95/C++        F95/C++                        -              gdb             gprof                               X X
PGI        V6.1                F77/F90/C/C++  F77/F90/C/C++                  F77/F90/C/C++  pgdbg           pgprof                              X X
Microsoft  Visual Studio 2003  C++            -                              -              Visual Studio   -                                   X X
Microsoft  Visual Studio 2005  C++            C++                            -              Visual Studio   -                                   X X
Etnus      TotalView 8.0       -              -                              -              -               -                                   X X X X
Programming Environment: MPI Libraries and Tools

Company       Version                               MPI2 support  Debugger                                  Runtime analysis                   Platform       Network
Sun           HPC ClusterTools 5                    yes           TotalView, Prism                          analyzer, mpprof                   Solaris 9      tcp, shm, FireLink
Sun           HPC ClusterTools 6                    yes           TotalView                                 analyzer, mpprof                   Solaris 10     tcp, shm
Sun           HPC ClusterTools 7, based on OpenMPI  yes           TotalView                                 analyzer                           Solaris 10     tcp, shm, Infiniband
Intel         based on mpich2.x                     yes           TotalView                                 Intel Trace Collector & Analyzer   Linux          tcp, shm, Infiniband
ANL           mpich 1.2.6                           no            TotalView                                 jumpshot                           Sol, Lin, Win  tcp, shm
ANL           mpich2 1.0.x                          yes           TotalView                                 -                                  Sol, Lin, Win  tcp, shm
Voltaire      based on mpich 1.2.5                  no            ?                                         -                                  Linux          tcp, shm, Infiniband
Microsoft     based on mpich2                       yes           Visual Studio w/ MS Compute Cluster Pack  -                                  Windows        tcp, shm, (Infiniband)
Univ Dresden  installation planned                  ?             -                                         Vampir-NG                          Sol, Lin       any
Application Software
  FEM:          Abaqus, Ansys, Hyperworks, Ls-Dyna, Marc/Mentat, Nastran/Patran
  Chemistry:    Gaussian, Turbomole
  CFD:          CFX/TASCFlow, Fluent, ICEM-CFD, StarCD
  Mathematics:  Maple, Mathematica
  Misc:         Matlab/Simulink, Tecplot
Legend: site license; parallel version available; no site license
License Conditions and Fees
Commercial application software
  the responsibility of each user
  no commercial usage
  fees covered by the institutions
Commercial development tools are provided by the Center for Computing and Communication
  Compilers
  MPI
  Debuggers
  Runtime analysis tools
Sun Grid Engine
Job submission:       qsub [options] scriptfile [- scriptarguments]
Job deletion:         qdel job_id
Job status inquiry:   qstat [options]
                        -help
                        -f          # all queues and jobs
                        -j job_id
                        -u user
Graphical interface:  qmon
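The commands above can also be assembled programmatically, for example when many similar jobs are submitted from a script. The following is a minimal sketch that only builds the command lines and does not run them; the helper names, `job.sh`, `input.dat`, the job id `4711` and the user name `ab123456` are hypothetical, and only the options listed on the slide are used.

```python
import shlex


def qsub_command(scriptfile, options=(), script_args=()):
    """Argument list for submitting a job script with optional script arguments."""
    return ["qsub", *options, scriptfile, *script_args]


def qdel_command(job_id):
    """Argument list for deleting a job."""
    return ["qdel", str(job_id)]


def qstat_command(job_id=None, user=None, all_queues=False):
    """Argument list for a status inquiry, using only -f, -j and -u."""
    command = ["qstat"]
    if all_queues:
        command.append("-f")  # all queues and jobs
    if job_id is not None:
        command += ["-j", str(job_id)]
    if user is not None:
        command += ["-u", user]
    return command


# Hypothetical script name, input file, job id and user name.
for command in (
    qsub_command("job.sh", script_args=["input.dat"]),
    qstat_command(user="ab123456"),
    qstat_command(job_id=4711),
    qdel_command(4711),
):
    print(shlex.join(command))
```

On a frontend node, each list could be passed to `subprocess.run`; keeping the arguments as a list avoids shell quoting problems with file names.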
www.rz.rwth-aachen.de/computing/hpc/statistics/load/
Some Web Links
  Web page of the SunHPC workshop with more links and information
    http://www.rz.rwth-aachen.de/sunhpc
  Information related to HPC at the RWTH
    http://www.rz.rwth-aachen.de/hpc/
  Information related to the RWTH Sun Fire SMP Cluster
    http://www.rz.rwth-aachen.de/hpc/sun/
  Sun Fire SMP Cluster Primer
    http://www.rz.rwth-aachen.de/hpc/primer
  Further reading: Batch system
    http://www.rz.rwth-aachen.de/hpc/batch
  Further reading: Software
    http://www.rz.rwth-aachen.de/computing/sw
Contents
  Compute cluster of the Center for Computing and Communication (RZ)
  The Linpack benchmark and the Top500 list of the fastest computers
A Measure of Computing Performance: The Linpack Benchmark
The theoretically possible performance (peak performance), which is defined by the clock rate and the number of arithmetic units, can never be reached.
Therefore the yardstick is the performance achieved when solving a dense linear system of equations: the LINPACK benchmark (www.top500.org)
  M[ega]flops = millions of floating-point operations per second
  G[iga]flops, T[era]flops, P[eta]flops (each a further factor of 10^3)
Current leader (28th list, Nov. 2006): IBM BlueGene/L with 131072 processors and 32 TB total memory at Lawrence Livermore National Laboratory (LLNL)
  Theoretical peak performance: 367 Tflops = 367000 Gflops
  Linpack: 280 Tflops, 76% of peak (matrix size: N ~ 1,769,471)
For comparison, the theoretical peak performance of a dual-core Intel Xeon 5160 (Woodcrest), 3 GHz:
  2 cores * 4 results/cycle (SSE) * 3 GHz = 24 Gflops
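The two calculations on this slide, peak performance from cores, results per cycle and clock rate, and Linpack efficiency as a fraction of peak, can be replayed in a few lines. This is a small illustrative sketch; the function names are made up here, and all input values are taken from the slide.

```python
def peak_gflops(cores, results_per_cycle, clock_ghz):
    """Theoretical peak performance in Gflops."""
    return cores * results_per_cycle * clock_ghz


def linpack_efficiency(linpack_performance, peak_performance):
    """Fraction of peak reached by Linpack (both values in the same unit)."""
    return linpack_performance / peak_performance


# Dual-core Intel Xeon 5160 (Woodcrest): 2 cores, 4 results/cycle (SSE), 3 GHz
xeon_peak = peak_gflops(cores=2, results_per_cycle=4, clock_ghz=3.0)
print(f"Xeon 5160 peak: {xeon_peak:.0f} Gflops")

# IBM BlueGene/L at LLNL: 280 Tflops Linpack, 367 Tflops peak
bluegene = linpack_efficiency(280, 367)
print(f"BlueGene/L Linpack efficiency: {bluegene:.0%}")
```

With the slide's figures this gives 24 Gflops for the Xeon and 76% of peak for BlueGene/L.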
The TOP500 List
[Figure: performance in Gflops (logarithmic axis from 1 to 1,000,000) over the years 1993 to 2006. Curves: RWTH peak performance (R peak), RWTH Linpack performance, TOP500 rank 1, rank 50, rank 200 and rank 500, PC technology, Moore's Law. Annotations: Fujitsu VPP300 at Aachen University; Sun Fire Cluster at Aachen University.]
The Current Top 20 of the Top500 List

Rank  Site                Computer                   Country      Procs   RMax [Gflops]  Nmax     Processor                Proc. Freq. [MHz]
1     LLNL                IBM Blue Gene              USA          131072  280600         1769471  PowerPC 440              700
2     Sandia Natl Lab     Cray Red Storm             USA          26544   101400         1700000  AMD x86_64 Opteron D     2400
3     IBM                 IBM Blue Gene              USA          40960   91290          983039   PowerPC 440              700
4     LLNL                IBM pseries p5 575         USA          12208   75760          1383600  POWER5                   1900
5     Barcelona           IBM BladeCenter PPC        Spain        10240   62630          0        PowerPC 970              2300
6     Sandia Natl Lab     Dell PowerEdge 1850        USA          9024    53000          0        Intel EM64T Xeon EM64    3600
7     CEA                 Bull NovaScale 5160        France       9968    52840          0        Intel IA-64 Itanium 2    1600
8     NASA Ames           SGI Altix                  USA          10160   51870          1290240  Intel IA-64 Itanium 2    1500
9     Titech              Sun Fire x4600 Cluster     Japan        11088   47380          1148160  AMD x86_64 Opteron D     2400
10    Oak Ridge Natl Lab  Cray XT3                   USA          10424   43480          0        AMD x86_64 Opteron D     2600
11    Maui (MHPCC)        Dell PowerEdge 1955        USA          5200    42390          0        Intel EM64T Xeon 51xx    3000
12    TACC                Dell PowerEdge 1955        USA          5200    41460          0        Intel EM64T Xeon 51xx    2667
13    FZJ                 IBM Blue Gene              Germany      16384   37330          663551   PowerPC 440              700
14    Earth Simulator     NEC Earth-Simulator        Japan        5120    35860          1075200  NEC                      1000
15    Atomic Weapons Est  Cray XT3                   UK           7812    32500          0        AMD x86_64 Opteron D     2600
16    Cray Inc.           Cray XT4                   USA          6696    27980          0        AMD x86_64 Opteron D     2600
17    ASTRON              IBM Blue Gene              Netherlands  12288   27450          516095   PowerPC 440              700
18    LRZ                 SGI Altix 4700             Germany      4096    24360          1139712  Intel IA-64 Itanium 2    1600
19    LLNL                Intel Itanium2 Tiger4 - Q  USA          4096    19940          975000   Intel IA-64 Itanium 2    1400
20    Cambridge Univ      Dell PowerEdge 1950        UK           2340    18270          713000   Intel EM64T Xeon 51xx    3000
Aachen on Rank 180 in June 2005
http://www.rz.rwth-aachen.de/hpc/sun/
Over 2 TeraFlop/s Linpack performance, April 2005.
The upgrade from UltraSPARC III to UltraSPARC IV, including an increase of the main memory capacity, more than doubled our Linpack performance! A linear system with 499,200 unknowns was solved in 11:12:48.8 hours at an average speed of 2054.4 billion floating point operations per second (GFlop/s). The program had a total memory footprint of 2 Terabyte. 1276 processor cores were kept busy with 82,930,000,000 million floating point operations.
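The figures quoted for the Aachen run are consistent with each other, which can be checked with a short calculation. This is a sketch under stated assumptions: only the leading term 2/3 N^3 of the operation count for solving a dense system by LU factorization is used (lower-order terms are neglected), the matrix is assumed to be stored in 8-byte double precision, and 1 TB is taken as 10^12 bytes. The function names are illustrative.

```python
def parse_hms(text):
    """Convert a run time given as 'H:MM:SS.s' into seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def linpack_operations(n):
    """Leading-order operation count (2/3 n^3) for a dense n x n solve.
    Lower-order terms are neglected in this sketch."""
    return 2.0 / 3.0 * n ** 3


def matrix_bytes(n, bytes_per_element=8):
    """Storage for a dense n x n matrix; 8-byte doubles are an assumption."""
    return n * n * bytes_per_element


N = 499_200                           # unknowns, from the slide
runtime_s = parse_hms("11:12:48.8")   # run time, from the slide

operations = linpack_operations(N)
gflops = operations / runtime_s / 1e9
memory_tb = matrix_bytes(N) / 1e12

print(f"operations: {operations / 1e6:,.0f} million")
print(f"average speed: {gflops:.1f} GFlop/s")
print(f"matrix storage: {memory_tb:.2f} TB")
```

The result is about 82.93 billion million operations, 2054.4 GFlop/s and just under 2 TB for the matrix alone, in line with the numbers on the slide.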