# Calculus of variations

The calculus of variations is a branch of mathematics that was developed around the middle of the 18th century, in particular by Leonhard Euler and Joseph-Louis Lagrange.

The central element of the calculus of variations is the Euler-Lagrange equation

${\displaystyle \delta I(x,\delta x)=0}$,

which, precisely for the action ${\displaystyle I=\int {\mathcal {L}}\,\mathrm {d} t}$, becomes the Lagrange equation of classical mechanics.

## Basics

The calculus of variations deals with real-valued functions of functions, which are called functionals. Such functionals can be integrals over an unknown function and its derivatives. One is interested in stationary functions, i.e. those for which the functional assumes a maximum, a minimum (an extremum) or a saddle point. Some classic problems can be formulated elegantly with the help of functionals.

The key theorem of the calculus of variations is the Euler-Lagrange equation, more precisely the Euler-Lagrange differential equation. It describes the stationarity condition of a functional. As with the task of determining the maxima and minima of a function, it is derived from the analysis of small changes around the assumed solution. The Euler-Lagrange differential equation is only a necessary condition. Adrien-Marie Legendre and Alfred Clebsch as well as Carl Gustav Jacob Jacobi provided further necessary conditions for the existence of an extremal. A sufficient but not necessary condition comes from Karl Weierstrass.

The methods of the calculus of variations appear in Hilbert space techniques, Morse theory and symplectic geometry. The term variation is used for all extremal problems of functions. Geodesy and differential geometry are areas of mathematics in which variation plays a role. A lot of work has been done, in particular, on the problem of the minimal surfaces that occur in soap bubbles.

## Application areas

The calculus of variations is the mathematical basis of all physical extremal principles and is therefore particularly important in theoretical physics, for example in the Lagrangian formalism of classical mechanics and in orbit determination, in quantum mechanics via the principle of least action, and in statistical physics in the context of density functional theory. In mathematics, the calculus of variations was used, for example, in the Riemannian treatment of the Dirichlet principle for harmonic functions. The calculus of variations is also used in control theory when it comes to determining optimal controllers.

A typical application example is the brachistochrone problem: on which curve in a gravitational field from a point A to a point B, which lies below but not directly below A, does an object need the least time to traverse the curve? Of all the curves between A and B, one minimizes the expression describing the time needed to traverse the curve. This expression is an integral that contains the unknown, sought function describing the curve from A to B, together with its derivatives.
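The minimization can be made concrete numerically: starting from rest at A, energy conservation gives the speed $v=\sqrt{-2gy}$, so the traversal time of any sampled curve can be summed segment by segment. The following sketch is illustrative and not part of the classical problem statement; the endpoint, the discretization and the comparison curve (a straight line) are freely chosen, while the cycloid is the known minimizer.

```python
import math

def travel_time(points, g=9.81):
    """Approximate descent time along a polyline starting at rest at (0, 0).

    Speed from energy conservation: v = sqrt(-2 g y), with y negative below the start.
    """
    t = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ds = math.hypot(x1 - x0, y1 - y0)          # segment length
        y_mid = 0.5 * (y0 + y1)                    # midpoint height
        v = math.sqrt(max(-2 * g * y_mid, 1e-12))  # midpoint speed
        t += ds / v
    return t

n = 20000
r, th_e = 1 / math.pi, math.pi  # cycloid from A = (0, 0) to B = (1, -2/pi)
cycloid = [(r * (th - math.sin(th)), -r * (1 - math.cos(th)))
           for th in (k * th_e / n for k in range(n + 1))]
xe, ye = cycloid[-1]
line = [(xe * k / n, ye * k / n) for k in range(n + 1)]  # straight line A -> B

print(travel_time(cycloid))  # ~0.566 s, close to the exact value sqrt(pi/g)
print(travel_time(line))     # ~0.67 s: the straight line is slower
```

Any other curve through the same endpoints yields a larger value of the time functional than the cycloid.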

## A tool from the analysis of real functions of one real variable

In the following, an important technique of the calculus of variations is demonstrated, in which a necessary condition for a local minimum of a real function of a single real variable is transferred to a necessary condition for a local minimum of a functional. This condition can then often be used to set up descriptive equations for stationary functions of a functional.

Let a functional $I\colon X\to\mathbb{R}$ be given on a function space $X$ ($X$ must be at least a topological space), and suppose the functional has a local minimum at the point $x\in X$.

The following simple trick replaces the "difficult to handle" functional $I$ with a real function $F(\alpha)$ that depends only on one real parameter $\alpha$ and is correspondingly easier to handle.

For some $\epsilon > 0$, let $(x_\alpha)_{\alpha\in(-\epsilon,\epsilon)}$ be an arbitrary family of functions $x_\alpha \in X$ parameterized continuously by the real parameter $\alpha$. Let the function $x_0$ (i.e. for $\alpha = 0$) be equal to the stationary function $x$. In addition, let the function $F\colon(-\epsilon,\epsilon)\to\mathbb{R}$ defined by the equation

${\displaystyle F(\alpha ):=I(x_{\alpha })}$

be differentiable at the point $\alpha = 0$.

The continuous function $F$ then assumes a local minimum at $\alpha = 0$, since $x_0 = x$ is a local minimum of $I$.

From the analysis of real functions of one real variable it is known that then ${\displaystyle \left.{\frac {\mathrm {d} }{\mathrm {d} \alpha }}F(\alpha )\right|_{\alpha =0}=0}$ holds. Applied to the functional, this means

${\displaystyle \left.{\frac {\mathrm {d} }{\mathrm {d} \alpha }}I(x_{\alpha })\right|_{\alpha =0}=0.}$

When setting up the desired equations for stationary functions, one then exploits the fact that the above equation must hold for every ("well-behaved") family $(x_\alpha)_{\alpha\in(-\epsilon,\epsilon)}$ with $x_0 = x$.

This will be demonstrated in the next section using the Euler equation.
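A minimal numerical sketch of this trick, using an invented example functional (not one from the text): for $I(x)=\int_0^1 \dot x(t)^2\,\mathrm{d}t$ with $x(0)=0$ and $x(1)=1$, the stationary function is $x(t)=t$. With the family $x_\alpha(t)=t+\alpha\sin(\pi t)$, which respects the boundary values, the real function $F(\alpha)=I(x_\alpha)$ has vanishing derivative at $\alpha=0$.

```python
import math

def action(x_vals, dt):
    """Discretize I(x) = integral of x'(t)^2 dt with forward differences."""
    return sum(((b - a) / dt) ** 2 * dt for a, b in zip(x_vals, x_vals[1:]))

n = 1000
dt = 1.0 / n
ts = [k * dt for k in range(n + 1)]

def F(alpha):
    # family x_alpha(t) = t + alpha * sin(pi t); endpoints stay fixed
    return action([t + alpha * math.sin(math.pi * t) for t in ts], dt)

h = 1e-6
dF0 = (F(h) - F(-h)) / (2 * h)  # central difference of F at alpha = 0
print(dF0)              # ~0: the family passes through a stationary function
print(F(0.5) - F(0.0))  # > 0: deforming away from x(t) = t increases I
```

The same check with any other admissible family would again give $F'(0)=0$, which is exactly the statement used above.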

## Euler-Lagrange equation; variational derivative; further necessary or sufficient conditions

Given are two points in time $t_a, t_e \in \mathbb{R}$ with $t_e > t_a$ and a function that is twice continuously differentiable in all arguments, the Lagrangian

${\displaystyle {\mathcal {L}}\colon (t_{a},t_{e})\times G\to \mathbb {R} ,\quad G\subset \mathbb {R} ^{n}\times \mathbb {R} ^{n},\quad G{\text{ open}}}$.

For example, in the Lagrangian of the free relativistic particle with mass $m$ and $c = 1$,

${\displaystyle {\mathcal {L}}(t,x,v)=-m{\sqrt {1-v^{2}}}}$,

the domain $G$ is the Cartesian product of $\mathbb{R}^3$ and the interior of the unit sphere.

As the function space $X$, the set of all twice continuously differentiable functions

${\displaystyle x\colon [t_{a},t_{e}]\to \mathbb {R} ^{n}}$

is selected that assume the predefined locations $x_a$ and $x_e$ at the start time $t_a$ and the end time $t_e$, respectively:

${\displaystyle x(t_{a})=x_{a},\quad x(t_{e})=x_{e}}$

and whose values, together with the values of their derivatives, lie in $G$:

${\displaystyle \forall t\in [t_{a},t_{e}]\colon \left(x(t),{\frac {\mathrm {d} x}{\mathrm {d} t}}(t)\right)\in G}$.

With the Lagrangian $\mathcal{L}$, the functional $I\colon X\to\mathbb{R}$, the action, is defined by

${\displaystyle I(x):=\int _{t_{a}}^{t_{e}}{\mathcal {L}}(t,x(t),{\dot {x}}(t))\,\mathrm {d} t.}$

We are looking for the function $x \in X$ that minimizes the action $I$.

Following the technique presented in the previous section, we investigate all differentiable one-parameter families $(x_\alpha)_{\alpha\in(-\epsilon,\epsilon)} \subset X$ that pass through the stationary function $x$ of the functional at $\alpha = 0$ (so that $x_0 = x$). The equation derived in the last section is used:

${\displaystyle 0=\left.{\frac {\mathrm {d} }{\mathrm {d} \alpha }}I(x_{\alpha })\right|_{\alpha =0}=\left[{\frac {\mathrm {d} }{\mathrm {d} \alpha }}\int _{t_{a}}^{t_{e}}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\,\mathrm {d} t\right]_{\alpha =0}.}$

Carrying the differentiation with respect to the parameter $\alpha$ into the integral yields, with the chain rule,

${\displaystyle {\begin{aligned}0&=\left[\int _{t_{a}}^{t_{e}}\left(\partial _{2}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\,\partial _{\alpha }x_{\alpha }(t)+\partial _{3}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\,\partial _{\alpha }{\dot {x}}_{\alpha }(t)\right)\,\mathrm {d} t\right]_{\alpha =0}\\&=\left[\int _{t_{a}}^{t_{e}}\partial _{2}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\,\partial _{\alpha }x_{\alpha }(t)\,\mathrm {d} t+\int _{t_{a}}^{t_{e}}\partial _{3}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\,\partial _{\alpha }{\dot {x}}_{\alpha }(t)\,\mathrm {d} t\right]_{\alpha =0}.\end{aligned}}}$

Here $\partial_2, \partial_3$ stand for the partial derivatives with respect to the second and third argument, and $\partial_\alpha$ for the partial derivative with respect to the parameter $\alpha$.

It will later prove favorable if the second integral contains $\partial_\alpha x_\alpha(t)$, as the first integral does, instead of $\partial_\alpha {\dot{x}}_\alpha(t)$. This can be achieved by integration by parts:

${\displaystyle 0=\left[\int _{t_{a}}^{t_{e}}\partial _{2}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\,\partial _{\alpha }x_{\alpha }(t)\,\mathrm {d} t+\left[\partial _{3}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\,\partial _{\alpha }x_{\alpha }(t)\right]_{t=t_{a}}^{t_{e}}\right.}$
${\displaystyle \left.-\int _{t_{a}}^{t_{e}}{\frac {\mathrm {d} }{\mathrm {d} t}}\left(\partial _{3}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\right)\partial _{\alpha }x_{\alpha }(t)\,\mathrm {d} t\right]_{\alpha =0}}$

At the points $t = t_a$ and $t = t_e$, the conditions $x_\alpha(t_a) = x_a$ and $x_\alpha(t_e) = x_e$ hold independently of $\alpha$. Differentiating these two constants with respect to $\alpha$ yields $\partial_\alpha x_\alpha(t_a) = \partial_\alpha x_\alpha(t_e) = 0$. Therefore the boundary term $\left[\partial_3 \mathcal{L}(t, x_\alpha(t), \dot{x}_\alpha(t))\,\partial_\alpha x_\alpha(t)\right]_{t=t_a}^{t_e}$ vanishes, and after combining the integrals and factoring out $\partial_\alpha x_\alpha$ one obtains the equation

${\displaystyle 0=\left[\int _{t_{a}}^{t_{e}}\left(\partial _{2}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))-{\frac {\mathrm {d} }{\mathrm {d} t}}\partial _{3}{\mathcal {L}}(t,x_{\alpha }(t),{\dot {x}}_{\alpha }(t))\right)\,\partial _{\alpha }x_{\alpha }(t)\,\mathrm {d} t\right]_{\alpha =0}}$

and with $x_\alpha(t)|_{\alpha=0} = x(t)$

${\displaystyle 0=\int _{t_{a}}^{t_{e}}\left(\partial _{2}{\mathcal {L}}(t,x(t),{\dot {x}}(t))-{\frac {\mathrm {d} }{\mathrm {d} t}}\partial _{3}{\mathcal {L}}(t,x(t),{\dot {x}}(t))\right)\left[\partial _{\alpha }x_{\alpha }(t)\right]_{\alpha =0}\,\mathrm {d} t.}$

Except at the start and end times, the $x_\alpha(t)$ are subject to no restrictions. The time functions $t\mapsto\left[\partial_\alpha x_\alpha(t)\right]_{\alpha=0}$ are thus arbitrary twice continuously differentiable time functions, apart from the conditions $\partial_\alpha x_\alpha(t_a) = \partial_\alpha x_\alpha(t_e) = 0$. According to the fundamental lemma of the calculus of variations, the last equation can only be fulfilled for all admissible $\left[\partial_\alpha x_\alpha\right]_{\alpha=0}$ if the factor $\partial_2\mathcal{L}(t,x(t),\dot{x}(t)) - \frac{\mathrm{d}}{\mathrm{d}t}\partial_3\mathcal{L}(t,x(t),\dot{x}(t))$ is equal to zero on the entire integration interval (this is explained in more detail in the remarks). This gives the Euler-Lagrange equation for the stationary function $x$:

${\displaystyle \partial _{2}{\mathcal {L}}(t,x(t),{\dot {x}}(t))-{\frac {\mathrm {d} }{\mathrm {d} t}}\partial _{3}{\mathcal {L}}(t,x(t),{\dot {x}}(t))=0}$,

which must hold for all $t \in (t_a, t_e)$.

The quantity that is required to vanish is also called the Euler derivative of the Lagrangian $\mathcal{L}$,

${\displaystyle {\frac {{\hat {\partial }}{\mathcal {L}}}{{\hat {\partial }}x}}(t):=\left.{\frac {\partial {\mathcal {L}}(t,x,{\dot {x}})}{\partial x}}\right|_{(t,x(t),{\dot {x}}(t))}-{\frac {\mathrm {d} }{\mathrm {d} t}}\,\left(\left.{\frac {\partial {\mathcal {L}}(t,x,{\dot {x}})}{\partial {\dot {x}}}}\right|_{(t,x(t),{\dot {x}}(t))}\right)\,.}$

In physics books in particular, the derivative $\left.\partial_\alpha\right|_{\alpha=0}$ is referred to as variation. Then $\delta x = \left.\partial_\alpha x_\alpha\right|_{\alpha=0}$ is the variation of $x$. The variation of the action

${\displaystyle \delta I(x,\delta x)=\int \mathrm {d} t\,{\frac {\delta I}{\delta x(t)}}\delta x(t)}$

is, like a differential $\mathrm{d}f = \sum_i (\partial_i f)\,\mathrm{d}x^i$, a linear form in the variations of its arguments; its coefficients $\frac{\delta I}{\delta x(t)}$ are called the variational derivative of the functional $I$. In the case under consideration it is the Euler derivative of the Lagrangian:

${\displaystyle {\frac {\delta I}{\delta x(t)}}={\frac {{\hat {\partial }}{\mathcal {L}}}{{\hat {\partial }}x}}(t)}$.
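The link between the Euler derivative and the stationarity of the action can be checked numerically with a concrete Lagrangian (an illustrative choice, not from the text): for the harmonic oscillator $\mathcal{L}(t,x,v)=\tfrac12 v^2-\tfrac12 x^2$, the Euler-Lagrange equation is $\ddot x + x = 0$, solved by $x(t)=\sin t$. The gradient of the discretized action with respect to an interior node value plays the role of the Euler derivative: it vanishes (up to discretization error) on the solution, but not on an arbitrary path.

```python
import math

def action(xs, dt):
    """Discretized action for L(t, x, v) = v^2/2 - x^2/2 (harmonic oscillator)."""
    s = 0.0
    for a, b in zip(xs, xs[1:]):
        v = (b - a) / dt
        s += (0.5 * v * v - 0.5 * a * a) * dt
    return s

n = 2000
dt = math.pi / n  # time interval [0, pi]
i, h = n // 3, 1e-6

def euler_derivative(path):
    """Gradient of the action w.r.t. the node value path[i], per unit dt."""
    xp = path.copy(); xp[i] += h
    xm = path.copy(); xm[i] -= h
    return (action(xp, dt) - action(xm, dt)) / (2 * h) / dt

sol = [math.sin(k * dt) for k in range(n + 1)]        # solves x'' + x = 0
other = [math.sin(2 * k * dt) for k in range(n + 1)]  # does not

print(abs(euler_derivative(sol)))    # ~0: Euler derivative vanishes on the solution
print(abs(euler_derivative(other)))  # clearly nonzero on a non-stationary path
```
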

## Remarks

(Figure: the function $t\mapsto b(t)$ for $t_0 = 1$ and $\epsilon = 0.1$.)

When deriving the Euler-Lagrange equation, it was used that a continuous function $a$ which, for all functions $b$ that are at least twice continuously differentiable with $b(t_a) = b(t_e) = 0$, yields zero when integrated as

${\displaystyle \int _{t_{a}}^{t_{e}}a(t)b(t)\,\mathrm {d} t}$

must be identically zero.

This is easy to see if one takes into account that, for example,

${\displaystyle b(t):={\begin{cases}0&{\text{for }}t\leq t_{0}-\epsilon {\text{ or }}t\geq t_{0}+\epsilon \\(t-t_{0}+\epsilon )^{3}(t_{0}-t+\epsilon )^{3}&{\text{for }}t\in (t_{0}-\epsilon ,t_{0}+\epsilon )\end{cases}}}$

defines a twice continuously differentiable function that is positive in an $\epsilon$-neighborhood of an arbitrarily chosen time $t_0 \in (t_a, t_e)$ and zero otherwise. If there were a point $t_0$ at which the function $a$ were greater or less than zero, then by continuity it would also be greater or less than zero in an entire neighborhood $(t_0-\epsilon, t_0+\epsilon)$ of this point. With the function $b$ just defined, however, the integral $\int_{t_a}^{t_e} a(t)b(t)\,\mathrm{d}t$ would then, contrary to the requirement, also be greater or less than zero. The assumption that $a$ is nonzero at some point $t_0$ is therefore false; the function $a$ really is identically zero.
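A numerical sketch of this argument, with an arbitrarily chosen continuous function (not from the text): $a(t)=\sin t$ is positive near $t_0 = 1$, and pairing it with the bump $b$ above (for $\epsilon = 0.1$) makes the integral strictly positive, so this $a$ cannot satisfy the vanishing-integral condition for all admissible $b$.

```python
import math

def bump(t, t0=1.0, eps=0.1):
    """Twice continuously differentiable bump: positive on (t0-eps, t0+eps), zero elsewhere."""
    if abs(t - t0) >= eps:
        return 0.0
    return (t - t0 + eps) ** 3 * (t0 - t + eps) ** 3

def integral(f, lo, hi, n=10000):
    """Midpoint-rule quadrature."""
    dt = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * dt) for k in range(n)) * dt

a = math.sin  # continuous, positive in a neighborhood of t0 = 1
val = integral(lambda t: a(t) * bump(t), 0.0, 2.0)
print(val > 0)  # True: this a fails the condition, consistent with the lemma
```
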

If the function space $X$ is an affine space, the family is often defined in the literature as a sum $x_\alpha(t) := x(t) + \alpha h(t)$ with a freely selectable time function $h$ that must satisfy the condition $h(t_a) = h(t_e) = 0$. The derivative $\left.\partial_\alpha I(x_\alpha)\right|_{\alpha=0}$ is then precisely the Gateaux derivative $\left.\partial_\alpha I(x+\alpha h)\right|_{\alpha=0}$ of the functional $I$ at the point $x$ in the direction $h$. The version presented here appears to the author somewhat more favorable if the set of functions $X$ is no longer an affine space (for example, if it is restricted by a nonlinear constraint; see, for example, the Gaussian principle of least constraint). It is presented in more detail in the cited literature and is based on the definition of tangent vectors on manifolds.

In the case of an additional, restricting functional $J(x)=\int j(t,x,{\dot {x}},{\ddot {x}},\dots ,x^{(n)})\,d^{d}t$ that restricts the function space $X$ by requiring $J(x) = 0$, the Lagrange multiplier method can be applied in analogy to the real case:

${\displaystyle {\frac {\delta I}{\delta x_{i}}}=\lambda {\frac {\delta J}{\delta x_{i}}}}$

for all $i = 1, \dotsc, n$ and a fixed $\lambda \in \mathbb{R}$.

## Generalization to higher derivatives and dimensions

The above derivation by means of integration by parts can be transferred to variational problems of the type

${\displaystyle I(\varphi )=\int {\mathcal {L}}(\varphi (x),\partial _{1}\varphi (x),\dots ,\partial _{d}\varphi (x),\dots )\,\mathrm {d} ^{d}x,}$

in which the dependencies also include derivatives $D^\alpha\varphi(x)$ of higher order (see multi-index notation), for example up to order $\vert\alpha\vert \leq N$. In this case the Euler-Lagrange equation is

${\displaystyle \sum _{\vert \alpha \vert \leq N}(-1)^{\vert \alpha \vert }D^{\alpha }{\frac {\delta {\mathcal {L}}}{\delta (D^{\alpha }\varphi (x))}}=0}$,

where the Euler derivative is to be understood as

${\displaystyle {\frac {\delta {\mathcal {L}}}{\delta (D^{\alpha }\varphi (x))}}:=\left.{\frac {\partial {\mathcal {L}}}{\partial (D^{\alpha }\varphi )}}\right\vert _{\varphi =\varphi (x),\partial _{1}\varphi =\partial _{1}\varphi (x),\dots }}$

(here $D^\alpha\varphi$ symbolically represents the corresponding dependency of $\mathcal{L}$, while $D^\alpha\varphi(x)$ stands for the concrete value of the derivative of $\varphi$). In particular, the sum also runs over $\alpha = 0$.
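As a one-dimensional illustration with a second-order derivative (an invented example, not from the text): for $\mathcal{L}=\tfrac12(\varphi'')^2$ the generalized Euler-Lagrange equation reduces to $\varphi''''=0$, so cubic polynomials are stationary. A discretized check of stationarity at an interior node:

```python
def action(phis, dx):
    """Discretized I(phi) = integral of (phi'')^2 / 2 dx (second-order Lagrangian)."""
    s = 0.0
    for a, b, c in zip(phis, phis[1:], phis[2:]):
        s += 0.5 * ((c - 2 * b + a) / dx ** 2) ** 2 * dx  # central second difference
    return s

n = 400
dx = 1.0 / n
phis = [u ** 3 - u for u in (k * dx for k in range(n + 1))]  # cubic, phi'''' = 0

i, h = n // 2, 1e-6
xp = phis.copy(); xp[i] += h
xm = phis.copy(); xm[i] -= h
grad = (action(xp, dx) - action(xm, dx)) / (2 * h) / dx  # discrete Euler derivative
print(abs(grad))  # ~0: the cubic is stationary for this higher-order functional
```
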

## Literature

Older books:

• Friedrich Stegmann: Textbook of the calculus of variations and its application in studies of the maximum and minimum. Luckhardt, Kassel 1854.
• Oskar Bolza: Lectures on calculus of variations. B. G. Teubner, Leipzig et al. 1909 (digitized version).
• Paul Funk: Calculus of variations and its application in physics and technology (= The basic teachings of the mathematical sciences in individual representations. 94). 2nd edition. Springer, Berlin et al. 1970.
• Adolf Kneser: Calculus of variations. In: Encyclopedia of Mathematical Sciences, including its applications. Volume 2: Analysis. Part 1. B. G. Teubner, Leipzig 1898, pp. 571-625.
• Paul Stäckel (ed.): Treatises on the calculus of variations. 2 parts. Wilhelm Engelmann, Leipzig 1894;
• Part 1: Treatises by Joh. Bernoulli (1696), Jac. Bernoulli (1697) and Leonhard Euler (1744) (= Ostwald's classics of the exact sciences. 46). 1894 (digitized version);
• Part 2: Treatises by Lagrange (1762, 1770), Legendre (1786) and Jacobi (1837) (= Ostwald's classics of the exact sciences. 47). 1894 (digitized version).

## Individual evidence

1. Brachistochrone problem.
2. Vladimir I. Smirnow: Course of higher mathematics (= university books for mathematics. Vol. 5a). Part 4, 1 (14th edition, German-language edition of the 6th Russian edition). VEB Deutscher Verlag der Wissenschaften, Berlin 1988, ISBN 3-326-00366-8.
3. See also Helmut Fischer, Helmut Kaul: Mathematics for physicists. Volume 3: The calculus of variations, differential geometry, mathematical foundations of general relativity. 2nd, revised edition. Teubner, Stuttgart et al. 2006, ISBN 3-8351-0031-9.