Parallel coordinates

from Wikipedia, the free encyclopedia
Parallel coordinate plot of flea beetle data, created with GGobi.

Parallel coordinates (also ||-coordinates; parallel coordinate plot, PCP) are a method for visualizing high-dimensional structures and multivariate data. In the figure, the vertical lines are the axes of the coordinate system. In contrast to a scatter plot, in which two coordinate axes are arranged at right angles to one another, here the axes run parallel to each other and are equally spaced. Each data point is drawn as a polyline running from left to right, with its vertices (corners) on the parallel axes. The position of the vertex on the i-th axis corresponds to the i-th coordinate of the point.
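The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `to_polyline`, the `axis_spacing` parameter, and the sample record are assumptions made for this example and do not come from any particular plotting library.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def to_polyline(record: Sequence[float], axis_spacing: float = 1.0) -> List[Point]:
    """Return the polyline vertices for one data point.

    The i-th axis is taken to be the vertical line x = i * axis_spacing,
    and the vertex on that axis sits at height record[i].
    """
    return [(i * axis_spacing, value) for i, value in enumerate(record)]


# A hypothetical 4-dimensional data point (values chosen for illustration).
record = [0.2, 0.9, 0.5, 0.1]
print(to_polyline(record))
# [(0.0, 0.2), (1.0, 0.9), (2.0, 0.5), (3.0, 0.1)]
```

Drawing one such polyline per data point, with any line-drawing routine, yields the parallel coordinate plot.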

History

The invention of parallel coordinates is often attributed to Maurice d'Ocagne in 1885. However, apart from the fact that the words appear in its title, that publication has nothing to do with the visualization technique of the same name; it only describes a transformation function for coordinate systems. Moreover, parallel coordinate representations demonstrably existed before 1885, for example in a chart by H. Gannett and F. W. Hewes from 1883 (see the references). Almost 80 years later, in 1959, the idea was taken up again by Alfred Inselberg, who systematically developed and popularized it from 1977 onward. Parallel coordinates are most frequently used in algorithms for collision avoidance in air traffic (1987), in data mining, in image analysis, in optimization, in process control, and in computer intrusion detection. Wegman's 1990 article Hyperdimensional Data Analysis Using Parallel Coordinates was decisive for the successful use of parallel coordinates.

Generalized parallel coordinates were proposed by Moustafa and Wegman in 2002 and 2006. Here, the Cartesian coordinate system is mapped into a parameter space using basis functions, and this parameter space is then mapped onto parallel coordinates. This establishes a connection between generalized parallel coordinates, the grand tour, and Andrews curves.

Advantages and disadvantages

Parallel coordinates have both advantages and disadvantages:

  • Increasing the dimension simply means adding more (vertical) axes.
  • Since parallel coordinates map a higher-dimensional space onto a two-dimensional space, information is lost. This loss can be measured with the help of the Parseval identity.
  • With practice, certain two-dimensional and higher-dimensional structures can be recognized easily in parallel coordinates. The figure below shows various two-dimensional structures (perfectly positively and negatively correlated data points, clusters, circles, and normally distributed data), once in a scatter plot (top) and once in parallel coordinates (bottom). Patterns in parallel coordinates are known for (hyper)planes, curves, several smooth (hyper)surfaces, similarities, convexity, and non-orientable surfaces. The point-line duality indicates that the mathematical foundations come from projective geometry.
Various two-dimensional structures in a scatter plot (top) and in parallel coordinates (bottom).
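The point-line duality can be checked numerically. If two adjacent axes are placed at horizontal positions 0 and d, every data point on the line x2 = m·x1 + b (with m ≠ 1) is drawn as a segment passing through the single point (d / (1 − m), b / (1 − m)); for m = 1 the segments are parallel and never meet. The sketch below verifies this for perfectly negatively correlated data. The function names and the sample values are assumptions made for this illustration.

```python
from fractions import Fraction
from typing import Optional, Tuple


def dual_point(m: Fraction, b: Fraction,
               d: Fraction = Fraction(1)) -> Optional[Tuple[Fraction, Fraction]]:
    """Point shared by all segments of data on the line x2 = m*x1 + b.

    The two axes are assumed to sit at horizontal positions 0 and d.
    Returns None for m == 1, where the segments are parallel.
    """
    if m == 1:
        return None
    return (d / (1 - m), b / (1 - m))


def height_at(x1: Fraction, x2: Fraction, t: Fraction,
              d: Fraction = Fraction(1)) -> Fraction:
    """Height at horizontal position t of the segment from (0, x1) to (d, x2)."""
    return x1 + (x2 - x1) * t / d


# Perfectly negatively correlated data: x2 = -x1 + 1.
m, b = Fraction(-1), Fraction(1)
t, y = dual_point(m, b)

# Every segment passes through the same height y at position t.
heights = {height_at(Fraction(x1), m * x1 + b, t) for x1 in range(5)}
print(t, y, heights == {y})
```

With m = −1 the common point lies midway between the two axes, which is the characteristic crossing pattern of negatively correlated data; with m = 1 the function returns None, matching the parallel segments seen for perfectly positively correlated data of slope one.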

When parallel coordinates are used to visualize high-dimensional data in statistics, three aspects need to be considered:

the arrangement of the axes
The arrangement of the axes is crucial when searching for structures in the data. In a typical data analysis, many arrangements are tried. Ordering heuristics have been developed that help reveal interesting structures.
the rotation of the axes (data)
Since the i-th coordinate is determined by the vertex (corner) on the i-th axis, a rotation of the axes (that is, a rotation of the data) can produce a different picture. The two plots on the left can be viewed as a rotation of the axes (or data) by 90 degrees. Although the underlying structure of the data is the same, the patterns in parallel coordinates differ.
the scaling of the axes
A parallel coordinate plot is essentially a series of line segments between pairs of adjacent coordinate axes. The variables should therefore be brought to a similar scale. Different scalings can also give interesting insights into the data.
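The three aspects can be illustrated with a small, self-contained sketch. It makes several assumptions that are not part of the article: min-max normalization is used as one possible scaling, a greedy ordering by absolute Pearson correlation stands in for the ordering heuristics mentioned above (it is not claimed to be one of the published heuristics), and all function names and sample values are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Rescale one variable to [0, 1] so that all axes share a similar scale."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant variable: place every vertex mid-axis
        return [0.5 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either variable is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def greedy_axis_order(columns: Sequence[Sequence[float]]) -> List[int]:
    """Assumed heuristic: start with axis 0, then repeatedly append the unused
    axis that is most strongly correlated (in absolute value) with the last one."""
    order = [0]
    remaining = set(range(1, len(columns)))
    while remaining:
        last = columns[order[-1]]
        best = max(sorted(remaining), key=lambda j: abs(pearson(last, columns[j])))
        order.append(best)
        remaining.remove(best)
    return order


def rotate_90(xs: Sequence[float], ys: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Rotate 2-D data by 90 degrees counter-clockwise: (x, y) -> (-y, x)."""
    return [-y for y in ys], list(xs)


# Hypothetical data set with three variables on very different scales.
a = [1.0, 2.0, 3.0, 4.0]
b = [3.0, 1.0, 4.0, 2.0]      # uncorrelated with a
c = [40.0, 30.0, 20.0, 10.0]  # perfectly negatively correlated with a

scaled = [min_max_scale(col) for col in (a, b, c)]
order = greedy_axis_order(scaled)
print(order)  # axis c is placed next to axis a, axis b last

# Rotation: positively correlated data becomes negatively correlated,
# so parallel segments turn into crossing segments.
rx, ry = rotate_90(a, a)
print(pearson(a, a), pearson(rx, ry))
```

In this sketch the rotation leaves the shape of the data unchanged but flips the sign of the correlation, which is why the same structure can look quite different in parallel coordinates.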

Literature

  • Alfred Inselberg: Parallel Coordinates: Visual Multidimensional Geometry and Its Applications. 1st edition. Springer, New York 2009, ISBN 978-0-387-21507-5.
  • Martin Graham, Jessie Kennedy: Using Curves to Enhance Parallel Coordinate Visualizations. Napier University, Edinburgh, UK (PDF, online; accessed 29 September 2011).
  • Rida E. Moustafa, Edward J. Wegman: On Some Generalization of Parallel Coordinate Plots. George Mason University, 2002 (technical report).

References

  1. Maurice d'Ocagne: Coordonnées Parallèles et Axiales: Méthode de transformation géométrique et procédé nouveau de calcul graphique déduits de la considération des coordonnées parallèles. Gauthier-Villars, Paris 1885.
  2. Henry Gannett: General Summary Showing the Rank of States by Ratios 1880. Retrieved February 5, 2015.
  3. Alfred Inselberg: The Plane with Parallel Coordinates. In: Visual Computer. Vol. 1, No. 4, 1985, pp. 69–91. doi:10.1007/BF01898350.
  4. Edward J. Wegman: Hyperdimensional Data Analysis Using Parallel Coordinates. In: Journal of the American Statistical Association. Vol. 85, No. 411, September 1990, pp. 664–675.
  5. R. Moustafa, E. Wegman: On Some Generalization to Parallel Coordinate Plot. In: Seeing a million, A Data Visualization Workshop, Rain am Lech (no.), Germany. 2002.
  6. R. Moustafa, E. Wegman: Multivariate continuous data – Parallel Coordinates. In: A. Unwin, M. Theus, H. Hofmann (Eds.): Graphics of Large Datasets: Visualizing a Million. Springer, 2006, pp. 143–156.
  7. A. Inselberg: Parallel Coordinates: Visual Multidimensional Geometry and its Applications. Springer, 2009.
  8. Interactive Hierarchical Dimension Ordering Spacing and Filtering for Exploration of High Dimensional Datasets. pp. 3–4 (PDF; 6.0 MB).