Geneva (software)

Geneva
Basic data

Maintainer: Rüdiger Berlich, Ariel Garcia
Developer: Gemfony scientific UG (limited liability) and other contributors
Current version: Geneva 1.10.0 (March 20, 2020)
Operating system: Linux
Programming language: C++
Category: Cloud computing, optimization
License: Apache License v2.0
Repository

Geneva (acronym for GRID-enabled evolutionary algorithms) is a program library implemented in C++ that provides combinable algorithms for the approximate, computer-aided solution of optimization problems. Beyond the eponymous evolutionary algorithms, further optimization algorithms are supported. An optimization problem prepared for Geneva comprises the definition of the input parameters, including their types, as well as a mapping rule that unambiguously assigns one or more numerical quality measures to these parameters. In addition to floating-point numbers, parameter sets can also include Boolean and integer values, and parameter types may be mixed within the description of a problem. Optimization then means the search for maxima or minima of a given program code or an external evaluation function as a function of the input parameters (single-criterion optimization), or the search for a group of satisfactory solutions in the case of multi-criteria optimization. The Geneva library is available as open source software under the Apache License v2.0.
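The required unambiguous mapping from a mixed parameter set to a quality value can be sketched as follows; the structure and function names are illustrative assumptions, not Geneva's actual API:

```cpp
#include <cmath>
#include <vector>

// Hypothetical mixed parameter set: floating-point, integer, and Boolean
// parameters may be combined in one problem description.
struct Candidate {
    std::vector<double> reals;   // floating-point parameters
    std::vector<int>    ints;    // integer parameters
    std::vector<bool>   flags;   // Boolean parameters
};

// Single-criterion evaluation: every candidate maps unambiguously to
// exactly one numerical quality value (here: to be minimized).
double evaluate(const Candidate& c) {
    double q = 0.0;
    for (double x : c.reals) q += x * x;
    for (int n : c.ints)     q += std::abs(n);
    for (bool b : c.flags)   q += b ? 0.0 : 1.0;  // penalize disabled flags
    return q;
}
```

Multi-criteria optimization would instead return several such quality values per candidate.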

Fields of application

Geneva is specifically designed for optimization problems that benefit from execution in distributed and parallel computing environments. The computational effort is driven on the one hand by a particularly "noisy" quality surface (a high frequency of local maxima and minima) and on the other hand by the cost of evaluating individual candidate solutions. A candidate solution is a set of parameters passed by the optimization algorithm to the evaluation function. There are no restrictions on the content of the problem modeled by the evaluation function, but it must be limited to the data types supported by Geneva and implement the unambiguous mapping described above. A usage scenario envisaged in the design of Geneva is the evaluation of candidate solutions through (possibly computationally intensive) simulations. However, simple mathematical functions implemented in C++ are equally possible and are used for benchmarking Geneva. Restrictions on the implementation of the evaluation function can, however, result from parallel execution in the Geneva context.

Functionality

Implemented algorithms

Figure 1: Parameter scan and optimization of the "Noisy Parabola" with Geneva in two dimensions

In addition to evolutionary algorithms, Geneva supports several other metaheuristic optimization methods (without being restricted exclusively to floating-point or Boolean parameters). These currently are:

  • Swarm algorithms
  • Simulated annealing (expanded to include the option of multiple simultaneous evaluations)
  • Simple gradient descents (not conjugate gradient). In each step the gradient is approximated from difference quotients, which yields an approximation of the direction of steepest descent on the quality surface
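The difference-quotient approach from the last bullet point can be sketched as follows; this is an illustrative example, not Geneva's actual implementation, and `gradient` and `descend` are hypothetical names:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Approximate the gradient of f at x via forward difference quotients.
Vec gradient(const std::function<double(const Vec&)>& f,
             const Vec& x, double h = 1e-6) {
    Vec g(x.size());
    const double fx = f(x);
    for (std::size_t i = 0; i < x.size(); ++i) {
        Vec xh = x;
        xh[i] += h;
        g[i] = (f(xh) - fx) / h;   // forward difference quotient
    }
    return g;
}

// One step of length `step` along the approximate steepest-descent direction.
Vec descend(const std::function<double(const Vec&)>& f, Vec x, double step) {
    const Vec g = gradient(f, x);
    for (std::size_t i = 0; i < x.size(); ++i) x[i] -= step * g[i];
    return x;
}
```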

In addition, parameter scans are supported, either on a regular grid or with the help of random candidate solutions uniformly distributed over the entire parameter space. A grid scan of the "Noisy Parabola" (a function with a very high number of local optima, used in Geneva for test purposes) is shown in Figure 1 as an example. Superimposed on the scan in red is an optimization with an evolution strategy using Geneva. The evolutionary algorithm submits only part of the parameter space to the evaluation function. Since the number of grid points to be scanned grows exponentially with the number of parameters, parametric optimization is the only feasible way to solve many high-dimensional problems. The Geneva library can also deal with problems with very high numbers of parameters (cf. also Film 1).
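The two scan modes can be sketched as follows for a two-dimensional problem; the code is illustrative and the function names are assumptions, not Geneva's API:

```cpp
#include <functional>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Fn  = std::function<double(const Vec&)>;

// Scan f on an n-per-axis regular grid over [lo,hi]^2; return the best point.
Vec gridScan2D(const Fn& f, double lo, double hi, int n) {
    Vec best{lo, lo};
    double bestQ = f(best);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            Vec p{lo + (hi - lo) * i / (n - 1), lo + (hi - lo) * j / (n - 1)};
            const double q = f(p);
            if (q < bestQ) { bestQ = q; best = p; }
        }
    return best;
}

// The same with uniformly distributed random candidate solutions.
Vec randomScan2D(const Fn& f, double lo, double hi, int samples,
                 unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> u(lo, hi);
    Vec best{u(gen), u(gen)};
    double bestQ = f(best);
    for (int k = 1; k < samples; ++k) {
        Vec p{u(gen), u(gen)};
        const double q = f(p);
        if (q < bestQ) { bestQ = q; best = p; }
    }
    return best;
}
```

The grid variant evaluates n^d points for d parameters, which illustrates the exponential growth mentioned above.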

Coupling of algorithms

Optimization algorithms can be coupled with each other, with the best solutions of one algorithm becoming the start values of the following algorithm. One example of use is a rough parameter scan to find suitable starting points, followed by an evolution strategy. The exchange of the best candidate solutions between different algorithms also makes it possible to divide an optimization run into coarse and fine optimization with different algorithms.
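The coupling described above can be sketched as follows; this is an illustrative example in which a simple stochastic hill climber stands in for an evolution strategy, not Geneva's actual API:

```cpp
#include <functional>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Fn  = std::function<double(const Vec&)>;

Vec coarseThenFine(const Fn& f, double lo, double hi, int gridN, int steps) {
    // Stage 1: coarse scan on a regular gridN x gridN grid over [lo,hi]^2.
    Vec best{lo, lo};
    for (int i = 0; i < gridN; ++i)
        for (int j = 0; j < gridN; ++j) {
            Vec p{lo + (hi - lo) * i / (gridN - 1),
                  lo + (hi - lo) * j / (gridN - 1)};
            if (f(p) < f(best)) best = p;
        }
    // Stage 2: the scan's best candidate becomes the start value of the
    // fine search (here: hill climbing with Gaussian mutations).
    std::mt19937 gen(1);  // fixed seed for reproducibility
    std::normal_distribution<double> mut(0.0, 0.05);
    for (int k = 0; k < steps; ++k) {
        Vec p{best[0] + mut(gen), best[1] + mut(gen)};
        if (f(p) < f(best)) best = p;
    }
    return best;
}
```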

Treatment of boundary conditions

Boundary conditions can greatly restrict the valid part of the parameter space. Optimization algorithms must avoid such regions as far as possible, on the one hand to reach a valid optimum more quickly, and on the other hand to avoid potentially incorrect return values of the evaluation function for invalid parameter values. The Geneva library supports, firstly, restricting the value range of individual parameters. It is also possible to handle boundary conditions in which the validity of individual parameters depends on the values of other variables, by mapping valid and invalid solutions onto a common quality surface. An example of such a problem would be two parameters x and y whose sum must not exceed a given constant. In principle, either variable can be varied in this example; which parameter values are valid depends on the current value of the other variable.

The procedure used by Geneva for this purpose first evaluates the validity of a parameter set. The ratings of valid solutions are transformed with a sigmoid function, so that valid parameter sets can never exceed an upper limit (when minimizing) or fall below a lower limit (when maximizing). If the user can specify numerically how serious each violated boundary condition is, Geneva computes a replacement rating (the "invalidity"), which takes the place of a call to the evaluation function. This happens, for example, by multiplying the product of all invalidities by the upper (when minimizing) or lower (when maximizing) limit of the sigmoid function. The resulting replacement quality surface slopes in the direction of valid value ranges, so the solutions found by the optimization algorithm are "pulled" toward valid values.
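The replacement-rating scheme can be sketched as follows; this is an illustrative reading of the description above, and the constants and function names are assumptions, not Geneva's API:

```cpp
#include <cmath>
#include <vector>

constexpr double kUpper = 1.0;   // upper sigmoid bound, used when minimizing

// Map a raw quality value into (0, kUpper) with a sigmoid, so that the
// rating of a valid solution can never exceed the bound.
double boundedQuality(double raw) {
    return kUpper / (1.0 + std::exp(-raw));
}

// Replacement rating for an invalid solution: the product of all
// invalidities (each >= 1, quantifying how badly a boundary condition is
// violated) times the sigmoid's upper limit. It always exceeds any valid
// rating and shrinks toward the valid region, "pulling" solutions there.
double replacementQuality(const std::vector<double>& invalidities) {
    double prod = 1.0;
    for (double v : invalidities) prod *= v;
    return prod * kUpper;
}
```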

Monitoring

Current parameters of the optimization algorithms, as well as the candidate solutions, can be extracted during an ongoing optimization and output in the form of an input script for the data analysis package ROOT. ROOT then takes care of the conversion into the desired image formats, such as PNG or PDF. The script can also be post-processed (e.g. annotated) outside of Geneva. Data can be collected either with algorithm-specific, predefined optimization monitors or with the help of lighter-weight, mostly user-defined "pluggable optimization monitors". Figure 1 was generated in this way, including post-processing.

Parallelization and size of the supported problems

Film 1: 150 superimposed, semi-transparent triangles are modified by an evolution strategy with Geneva so that together they resemble the Wikipedia "W". For this purpose, the algorithm must find suitable values for 1500 parameters in the range [0,1].

Metaheuristic optimization algorithms run in cycles, with each iteration building on the results of the previous one. Usually several evaluations are carried out per iteration, and many optimization algorithms require fewer iterations to reach a satisfactory solution for the same problem size when the number of candidate solutions per iteration increases. The optimization of problems with complex quality surfaces therefore benefits both from additional optimization cycles and, insofar as the selected optimization algorithm supports it, from evaluating as many candidate solutions as possible per iteration.

The useful number of iterations and of candidate solutions per iteration is limited in particular by the available computing time. Parallelization at the level of the evaluation of candidate solutions can accelerate the optimization. It helps that the individual evaluations of different candidate solutions are usually independent of one another; for large populations of candidate solutions, parametric optimization approaches an "embarrassingly parallel" problem with regard to its parallelizability.
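Evaluation-level parallelism of this kind can be sketched with standard C++ facilities; the example is illustrative only, as Geneva's broker-based machinery is considerably more elaborate:

```cpp
#include <functional>
#include <future>
#include <vector>

using Vec = std::vector<double>;

// Rate each candidate of an iteration in its own asynchronous task.
// This is valid because the evaluations are independent of one another.
std::vector<double> evaluatePopulation(
        const std::function<double(const Vec&)>& f,
        const std::vector<Vec>& population) {
    std::vector<std::future<double>> futures;
    futures.reserve(population.size());
    for (const Vec& candidate : population)
        futures.push_back(std::async(std::launch::async, f,
                                     std::cref(candidate)));
    std::vector<double> ratings;
    ratings.reserve(futures.size());
    for (auto& fut : futures) ratings.push_back(fut.get());
    return ratings;
}
```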

The evaluation step required for optimization can take place in Geneva in parallel in several threads, or distributed across clusters, grids, or clouds. Apart from local computability for given parameter sets, no requirements are imposed on the content of the evaluation functions used. However, the selected type of parallelization does create requirements for the implementation of the evaluation function. When running in a network, it should also be noted that, due to the restrictions described by Amdahl's law, very short evaluation functions lead to poor scalability.

Meta optimization

It is possible to optimize the configuration parameters of optimization algorithms themselves (currently implemented in Geneva only for evolutionary algorithms), either to minimize the number of calls of the evaluation function until a given quality is reached or, depending on the user's choice, to search more efficiently for the best possible optima. The procedure is very computationally intensive, since it requires many optimization runs, and is not suitable for computationally expensive evaluation functions. However, it can be used to determine suitable configuration parameters, even for very noisy quality surfaces, using quickly computable mathematical functions. Furthermore, the possibility of encapsulating optimization algorithms in individuals supports multi-populations: they allow optimization algorithms themselves to be made the subject of optimization.

Architecture and design

Geneva is a toolkit implemented in the C++14 standard. Only the gcc and Clang compilers are currently supported in the production version. Geneva uses components from the Boost library collection in many places. Other external libraries are only used for specific purposes, such as suitable OpenCL implementations for an optional GPGPU backend. In addition to the actual optimization functionality, implemented in the eponymous libgeneva library, Geneva contains two further central components.

Brokering and network communication

All functionality required for network communication and, more generally, for the distribution of computing jobs is implemented in the libcourtier library.

The network communication, currently based on the Boost.Asio and Boost.Beast libraries, implements a simple client-server architecture. On the client side, it only requires that the server be reachable. In the case of Asio, network connections are opened anew for each transfer of parameter sets or evaluation results and closed again after the data have been transferred successfully. Thanks to this design, the library needs no information about the duration of the evaluation function, which may run for many hours. However, depending on the evaluation function, execution in the network places demands on the server similar to those on a web server, owing to the high frequency of client-server connections, and may therefore require a configuration adapted to Geneva. In comparison, the Boost.Beast-based implementation uses WebSockets, so that connections are maintained for longer periods of time. The serialization of objects required for network communication is implemented via the Boost.Serialization library.

Computing jobs can be submitted by several producers to a broker, which uses plugins to distribute them to interested consumers and to return finished jobs to the producers. In addition to network communication, other consumers are possible: for example, the individual frames behind Film 1 were generated with a GPGPU backend based on OpenCL.

Generation of random numbers

Many stochastic optimization methods, such as evolutionary algorithms, require large quantities of random numbers, in the case of evolutionary algorithms usually normally distributed ones. Since the optimization proceeds in cycles, however, the computing power required varies periodically over time. The resulting periods of low utilization can be used to generate random numbers. The Geneva library exploits this with a "random number factory", which fills buffers with uniformly distributed random numbers in the background in several threads (up to a specified limit). "Proxy objects" in the consumers take packets of random numbers from a queue of the random number factory. To the consumers, the proxy objects look like ordinary random number generators; if necessary, the uniform numbers are converted into the random distributions the consumer requires. Continuously filling buffers with random numbers makes better use of times of low activity. Furthermore, there are no problems with seeding the generators, which would occur if each of the (possibly thousands of) consumers used its own generator. This functionality is implemented in the libhap library.
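The factory-and-proxy idea can be sketched as follows; the class names, packet sizes, and queue bounds are illustrative assumptions, not the libhap API:

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <random>
#include <thread>
#include <vector>

// A background thread fills packets of uniform random numbers into a
// bounded queue; consumer-side proxies hand them out one by one.
class RandomFactory {
public:
    RandomFactory(std::size_t packetSize, std::size_t maxPackets)
        : packetSize_(packetSize), maxPackets_(maxPackets),
          producer_([this] { produce(); }) {}
    ~RandomFactory() {
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_all();
        producer_.join();
    }
    // Hand a full packet of random numbers to a consumer-side proxy.
    std::vector<double> getPacket() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !packets_.empty(); });
        std::vector<double> p = std::move(packets_.front());
        packets_.pop();
        cv_.notify_all();   // wake the producer: there is room again
        return p;
    }
private:
    void produce() {
        std::mt19937 gen(std::random_device{}());
        std::uniform_real_distribution<double> u(0.0, 1.0);
        for (;;) {
            std::vector<double> p(packetSize_);
            for (double& x : p) x = u(gen);     // fill during idle time
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return stop_ || packets_.size() < maxPackets_; });
            if (stop_) return;
            packets_.push(std::move(p));
            cv_.notify_all();
        }
    }
    std::size_t packetSize_, maxPackets_;
    bool stop_ = false;
    std::queue<std::vector<double>> packets_;
    std::mutex m_;
    std::condition_variable cv_;
    std::thread producer_;   // declared last: starts after other members
};

// Proxy: looks like a random number generator to the consumer and
// transparently refills its buffer from the factory when empty.
class RandomProxy {
public:
    explicit RandomProxy(RandomFactory& f) : factory_(f) {}
    double operator()() {
        if (pos_ == buffer_.size()) { buffer_ = factory_.getPacket(); pos_ = 0; }
        return buffer_[pos_++];
    }
private:
    RandomFactory& factory_;
    std::vector<double> buffer_;
    std::size_t pos_ = 0;
};
```

A single seeded producer avoids the correlated-streams problem that per-consumer generators would raise.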

Examples

Figure 2: A Dalitz plot of Monte Carlo-generated decays, using a ComPWA model. Optimized parameters for the different resonances were determined with Geneva.

Film 1 shows an optimization carried out with the Geneva library, in which 150 random, semi-transparent triangles had to be arranged, in terms of their coordinates, color values and transparency, so that they resembled the Wikipedia "W" as closely as possible. Only the best solutions of each iteration are shown. Overall, this amounts to adjusting 1500 parameters in the value range [0,1]. The evaluation criterion was the quadratic sum of the deviations of the color channels of all pixels between the target image and a candidate image composed of the 150 triangles for a given parameter set. Since the evaluations of different candidate images are independent of one another, the optimization of such problems benefits greatly from the simultaneous use of many processors, for example in a multi-core processor. The example is also interesting insofar as, with metaheuristic optimization methods and many parameters, one can as a rule not say how close a solution is to the global optimum. In the present case, however, visual assessment suffices despite the very high number of parameters. The example is based on an idea by Roger Alsing (there with the Mona Lisa as the target image). Geneva is currently used, among other fields, in particle physics (see also Figure 2) and in the automotive industry to optimize the acoustics of exhaust systems.
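The evaluation criterion described here can be sketched as follows; rendering of the triangles is omitted, and the function name is an assumption:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Quadratic sum of per-channel deviations between a rendered candidate
// image and the target image (both as flat RGB byte arrays of equal size).
double imageDistance(const std::vector<std::uint8_t>& candidate,
                     const std::vector<std::uint8_t>& target) {
    double sum = 0.0;
    for (std::size_t i = 0; i < candidate.size(); ++i) {
        const double d = static_cast<double>(candidate[i]) - target[i];
        sum += d * d;   // squared deviation per color channel
    }
    return sum;         // 0 means a pixel-perfect match
}
```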

History

Forerunners of Geneva were developed in 1994 as part of a diploma thesis at the Institute for Experimental Physics I of the Ruhr University Bochum and used there for training feedforward neural networks to detect hadronic split-offs in the Crystal Barrel experiment at CERN in Geneva. The code was further developed from 2001 to 2004 as part of a doctoral thesis at the same institute, with a view to the distributed optimization of analyses in elementary particle physics. As part of a spin-off funding program at the Steinbuch Centre for Computing of the Karlsruhe Institute of Technology, Geneva was completely revised and handed over to the KIT spin-off Gemfony scientific UG (limited liability). In this context Geneva was released as open source. Version 1.10 of the Geneva library comprises around 130,000 lines of code and is under active development. Geneva version 1.10, released in March 2020, was placed under the Apache License version 2.0; until then the library had been under the GNU Affero General Public License version 3.0.

References

  1. Repository of the Geneva library
  2. Geneva Library Manual, version 1.6; R. Berlich, S. Gabriel, A. Garcia: Parametric Optimization with the Geneva Library Collection
  3. HadoOptimizer – Solving complex optimization problems, Christian Kumpe et al., SCC News (publication of the Steinbuch Centre for Computing at the Karlsruhe Institute of Technology), edition 2011/3, p. 24 ff.
  4. ComPWA: A common amplitude analysis framework for PANDA, M. Michel et al., 2014, J. Phys.: Conf. Ser. 513 022025
  5. Distributed Parametric Optimization with the Geneva Library, Rüdiger Berlich et al., in Data Driven e-Science, conference proceedings of ISGC 2010, Springer, Simon C. Lin and Eric Yen (editors), ISBN 978-1-4419-8013-7, p. 303 ff.
  6. Modeling of the Mona Lisa with triangles, Roger Alsing
  7. Model independent determination of the CKM phase using input from -mixing, Sam Harnew and Jonas Rademacker, Journal of High Energy Physics, March 2015, 2015:169
  8. Parametric optimization of the acoustics of exhaust systems (Memento of March 4, 2016 in the Internet Archive), Dominik Rödel and Rüdiger Berlich, 2nd International Engine Congress 2015, February 2015, Baden-Baden
  9. R. Berlich: Visualization of hadronic split-offs and their detection with neural networks, diploma thesis, Institute for Experimental Physics I of the Ruhr University Bochum, 1995
  10. R. Berlich: Application of Evolutionary Strategies to Automated Parametric Optimization Studies in Physics Research, doctoral thesis, Institute for Experimental Physics I of the Ruhr University Bochum, 2004
  11. R. Berlich, M. Kunze: Parametric optimization with evolutionary strategies in particle physics, Nuclear Instruments and Methods in Physics Research A 534 (2004) 147–151
  12. The best for last – optimization algorithms on distributed systems, Rüdiger Berlich, iX, Heise Zeitschriftenverlag, edition 12/2010, pp. 110–115
  13. Geneva – from research to economic application, Rüdiger Berlich, SCC News (publication of the Steinbuch Centre for Computing at the Karlsruhe Institute of Technology), edition 2009/2, p. 8 ff.
  14. Analysis of the Geneva code by Black Duck / Open Hub