QCDOC

from Wikipedia, the free encyclopedia

The QCDOC (Quantum Chromodynamics On a Chip) is a supercomputer concept, implemented at several sites, that aims to use inexpensive and relatively simple but effective hardware to build a massively parallel supercomputer of small physical size, intended above all for calculations in quantum chromodynamics (QCD). It is thus at once a low-cost special-purpose computer and, thanks to its massive parallelism, a highly efficient supercomputer for QCD. Great attention was also paid to the energy efficiency of the machine.

Overview

The concept was initially designed as a joint project by several institutions: the University of Edinburgh (UKQCD), Columbia University (New York), the RHIC accelerator center at Brookhaven National Laboratory (NY), and IBM. The aim was to enable highly effective computer simulations in lattice QCD. The target was at least 10 Tflops sustained at a utilization of about 50%.

There are three installed QCDOC systems, each with the targeted peak performance of about 10 Tflops.

  • University of Edinburgh (Edinburgh Parallel Computing Centre, EPCC; in operation since 2005)
  • Brookhaven 1 (RHIC)
  • Brookhaven 2 (US Department of Energy; DOE high-energy program at the Brookhaven accelerator)

UKQCD comprises 23 staff scientists from seven UK universities, together with their postdocs and students. The costs were met by a Joint Infrastructure Fund award of £6.6 million. Personnel costs, including system support and the physicists and postdocs doing the programming, were approximately £1 million per year; other computing and operating costs were approximately £0.2 million per year.

QCDOC succeeds an earlier project, QCDSP, which achieved its computing power by connecting large numbers of digital signal processors in a similar fashion.

The QCDSP computer connected 12,288 nodes in a four-dimensional network and reached 1 Tflops for the first time in 1998.

QCDOC can be seen as a forerunner of the very successful IBM supercomputer Blue Gene/L (BG/L). The two computers have much in common that goes beyond coincidence: Blue Gene is also a massively parallel supercomputer built from a large number of cheap and relatively simple PowerPC 440 processors (system on a chip, SoC), and these node computers are likewise interconnected in a high-dimensional, high-bandwidth network. The computers differ, however, in that the processors in BG/L are more powerful and are interconnected in a faster and more effective network comprising several hundred thousand nodes.

Architecture

Node computer

Logic diagram of the QCDOC ASIC

The compute nodes are custom ASICs with approximately 50 million transistors each. They are manufactured by IBM, run at around 500 MHz, and are built around PowerPC 440 processor cores. Each node has a DIMM socket accepting between 128 and 2048 MB of memory operating at 333 MHz.

Each node delivers a peak performance of up to 1 GFLOPS in double precision.
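This peak figure is consistent with the clock rate under the assumption (not stated in the text) that the PowerPC 440 floating-point unit completes one fused multiply-add, i.e. two floating-point operations, per cycle. A minimal sketch of the arithmetic:

```python
# Back-of-the-envelope check of the per-node peak performance.
# Assumption: one fused multiply-add (= 2 flops) per clock cycle.
clock_hz = 500e6          # ~500 MHz core clock
flops_per_cycle = 2       # assumed FMA throughput

peak_gflops = clock_hz * flops_per_cycle / 1e9
print(peak_gflops)        # 1.0 (GFLOPS, double precision)
```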

Overall system

The node computers are mounted in pairs on a dual computer card, together with a DIMM socket and a 4:1 Ethernet node for communication with other nodes. The dual cards have two connectors: one for the links between the cards, and one for the power supply, the Ethernet, the clock, and other necessities.
Thirty-two such dual computer cards are mounted in two rows on a motherboard that provides the 800 Mbit/s Ethernet connection. Eight motherboards are housed in a kind of crate (a so-called "box" or "compartment"); each "compartment" contains 512 processor nodes and an interconnection network corresponding to a six-dimensional cube with 2⁶ vertices. A node computer consumes around 5 watts of power, and each "compartment" requires appropriate air and water cooling.
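The packaging numbers above fit together; a small consistency check (the per-node power of about 5 W is taken from the text):

```python
# Consistency check of the packaging hierarchy described above.
nodes_per_card = 2           # dual computer card
cards_per_motherboard = 32
motherboards_per_box = 8     # one "compartment" (box)

nodes_per_motherboard = nodes_per_card * cards_per_motherboard   # 64 = 2**6
nodes_per_box = nodes_per_motherboard * motherboards_per_box     # 512
box_power_watts = nodes_per_box * 5                              # ~5 W per node
print(nodes_per_motherboard, nodes_per_box, box_power_watts)
```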

A complete system can consist of any number of "compartments"; in total, the installed systems have up to several tens of thousands of nodes.
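Given roughly 1 GFLOPS peak per node and the 50% utilization target mentioned above, the 10 Tflops goal implies a machine on this scale (a sketch only; the exact node counts of the installed systems may differ):

```python
# Rough sizing sketch: nodes needed for 10 Tflops sustained at ~50%
# utilization, with 1 GFLOPS peak per node and 512 nodes per compartment.
peak_per_node_gflops = 1.0
utilization = 0.5
target_sustained_tflops = 10.0

nodes_needed = target_sustained_tflops * 1000 / (peak_per_node_gflops * utilization)
compartments_needed = nodes_needed / 512
print(int(nodes_needed), compartments_needed)   # 20000 nodes, ~39 compartments
```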

Communication between the nodes

Each node can send and receive data to and from its twelve nearest neighbors (in the associated six-dimensional grid) at 500 Mbit/s per link, giving a total bandwidth of 12 Gbit/s per node. The operating system communicates with the nodes via Ethernet, which is also used for error diagnostics, configuration, and communication with peripherals such as hard drives.
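The count of twelve neighbors follows from the six-dimensional torus topology: one neighbor in the + and − direction along each of the six axes. A minimal sketch (the 3×3×3×3×3×3 grid shape is purely illustrative, not the actual machine partitioning):

```python
# Neighbors of a node in a six-dimensional torus: +1 and -1 along each
# axis, with wrap-around. The grid shape here is illustrative only.
def torus_neighbors(coord, shape):
    neighbors = []
    for axis in range(len(shape)):
        for step in (+1, -1):
            n = list(coord)
            n[axis] = (n[axis] + step) % shape[axis]
            neighbors.append(tuple(n))
    return neighbors

shape = (3, 3, 3, 3, 3, 3)
nbrs = torus_neighbors((0, 0, 0, 0, 0, 0), shape)
print(len(nbrs))                 # 12 nearest neighbors
# 12 links x 500 Mbit/s in each direction = 12 Gbit/s aggregate per node
print(len(nbrs) * 0.5 * 2)       # 12.0 (Gbit/s)
```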

Operating system

A special operating system, QOS, runs on the QCDOC. Among other things, it handles booting the machine, runtime and monitoring processes, and simplifies the management of the numerous compute nodes. It uses a custom kernel and provides partial compatibility with POSIX ("unix-like") processes via the Cygnus library (newlib).

Successor

The QCDOC is now (as of 2010) outdated (see QPACE). At the University of Regensburg, for example, it now serves as a showcase piece in the entrance area of the physics faculty.

See also

References

  1. PPA roadmap.

External links