Parallel Linear Algebra for Scalable Multi-core Architectures

from Wikipedia, the free encyclopedia

Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) is a software library that provides interfaces for the programming languages C and Fortran. PLASMA can solve systems of linear equations, solve least squares problems, and perform associated computations such as matrix factorizations.

The software library is designed for shared-memory systems, in particular homogeneous multi-core and multi-socket machines. PLASMA is fully compatible with Level 3 of the Basic Linear Algebra Subprograms (BLAS) library and also offers matrix routines, for example for initialization or for forming the inverse. PLASMA aims to replace LAPACK, but in contrast to LAPACK it does not yet (as of July 2012) support singular value or eigenvalue problems, or routines for band matrices.

Background

PLASMA was developed by the Innovative Computing Laboratory (ICL), which provides software for solving standard problems in scientific computing. ICL is part of the Electrical Engineering and Computer Science Department in the College of Engineering at the University of Tennessee.

PLASMA is distributed under a modified, very permissive BSD license: redistribution of the source code or the binaries is permitted provided the copyright notice is retained.

The source code of version 1.0.0 was made available for download on January 1, 2009. The PLASMA project was presented in a publication for the first time in 2006.

Motivation compared with LAPACK

LAPACK and ScaLAPACK are the standard libraries for high-performance computations in linear algebra. They were developed for shared-memory and distributed-memory architectures respectively, but they delegate parallelization to the Basic Linear Algebra Subprograms (BLAS). The strength of BLAS lies in data reuse, which exploits the faster levels of the memory hierarchy as much as possible. Level 3 BLAS calls achieve a surface-to-volume effect: a cubic number of floating-point operations is performed on a quadratic amount of data, so communication stays cheap relative to computation. Block algorithms built on Level 3 BLAS calls therefore deliver high performance on systems with deep memory hierarchies. LAPACK routines, however, very often also rely on Level 2 BLAS calls, which scale very poorly on shared-memory systems.

References

  1. PLASMA license terms.
  2. A. Buttari, J. Langou, J. Kurzak, J. Dongarra: "Parallel tiled QR factorization for multicore architectures". In: Proceedings of the 7th International Conference on Parallel Processing and Applied Mathematics (PPAM'07), Springer-Verlag, Berlin/Heidelberg, 2008, pp. 639–648.