Dirac delta function

Schematic representation of the Dirac delta function by a line surmounted by an arrow. The height of the arrow is usually meant to specify the value of any multiplicative constant, which will give the area under the function. The other convention is to write the area next to the arrowhead.

In science and mathematics, the Dirac delta function, or δ function, is a generalized function, or distribution, historically introduced by the physicist Paul Dirac to model the density of an idealized point mass or point charge: a function that is equal to zero everywhere except at zero and whose integral over the entire real line is equal to one.[1][2][3] Since no ordinary function has these properties, the computations done by theoretical physicists appeared to mathematicians as nonsense until Laurent Schwartz introduced the theory of distributions, which formalized and validated these computations mathematically. In particular, this amounts to defining the Dirac delta function as a linear functional that maps every function to its value at zero.[4][5] The Kronecker delta function, which is usually defined on a discrete domain and takes the values 0 and 1, is a discrete analog of the Dirac delta function.

In engineering and signal processing, the delta function, also known as the unit impulse symbol,[6] may be regarded, through its Laplace transform, as coming from the boundary values of a complex analytic function of a complex variable. The formal rules obeyed by this function are part of the operational calculus, a standard tool kit of physics and engineering. In many applications, the Dirac delta is regarded as a kind of limit (a weak limit) of a sequence of functions having a tall spike at the origin. The approximating functions of the sequence are thus "approximate" or "nascent" delta functions.

The graph of the delta function is usually thought of as following the whole x-axis and the positive y-axis. The Dirac delta is used to model a tall narrow spike function (an impulse), and other similar abstractions such as a point charge, point mass or electron point. For example, to calculate the dynamics of a billiard ball being struck, one can approximate the force of the impact by a delta function. In doing so, one not only simplifies the equations, but one also is able to calculate the motion of the ball by only considering the total impulse of the collision without a detailed model of all of the elastic energy transfer at subatomic levels (for instance).

To be specific, suppose that a billiard ball is at rest. At time t = 0 it is struck by another ball, imparting to it a momentum P, measured in kg⋅m/s. The exchange of momentum is not actually instantaneous, being mediated by elastic processes at the molecular and subatomic level, but for practical purposes it is convenient to regard the energy transfer as effectively instantaneous. The force therefore is Pδ(t); the units of δ(t) are s−1.

To model this situation more rigorously, suppose that the force instead is uniformly distributed over a small time interval Δt. That is,

FΔt(t) = P/Δt for 0 < t ≤ Δt, and FΔt(t) = 0 otherwise.

Here the functions FΔt are thought of as useful approximations to the idea of instantaneous transfer of momentum.

The delta function allows us to construct an idealized limit of these approximations. Unfortunately, the actual limit of the functions (in the sense of ordinary calculus) lim Δt→0 FΔt is zero everywhere but a single point, where it is infinite. To make proper sense of the delta function, we should instead insist that the property

∫ FΔt(t) dt = P,

which holds for all Δt > 0, should continue to hold in the limit. So, in the equation F(t) = Pδ(t) = lim Δt→0 FΔt(t), it is understood that the limit is always taken outside the integral.

In applied mathematics, as we have done here, the delta function is often manipulated as a kind of limit (a weak limit) of a sequence of functions, each member of which has a tall spike at the origin: for example, a sequence of Gaussian distributions centered at the origin with variance tending to zero.
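
As a quick numerical illustration of this weak limit, the minimal sketch below (Python with NumPy assumed; the test function f and the widths chosen here are arbitrary) integrates a smooth function against normalized Gaussians of shrinking width and watches the result approach f(0).

```python
import numpy as np

def nascent_delta(x, eps):
    """A normalized Gaussian of width eps: integral 1, concentrating at 0 as eps -> 0."""
    return np.exp(-x**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

f = lambda x: np.cos(x) + x**3      # an arbitrary smooth test function, f(0) = 1
x = np.linspace(-10, 10, 400_001)
dx = x[1] - x[0]

for eps in [1.0, 0.1, 0.01]:
    value = np.sum(f(x) * nascent_delta(x, eps)) * dx   # ~ integral of f times eta_eps
    print(eps, value)                                    # tends to f(0) = 1 as eps -> 0
```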

Despite its name, the delta function is not truly a function, at least not a usual one with range in real numbers. For example, the objects f(x) = δ(x) and g(x) = 0 are equal everywhere except at x = 0 yet have integrals that are different. According to Lebesgue integration theory, if f and g are functions such that f = g almost everywhere, then f is integrable if and only if g is integrable and the integrals of f and g are identical. A rigorous approach to regarding the Dirac delta function as a mathematical object in its own right requires measure theory or the theory of distributions.

A rigorous interpretation of the exponential form and the various limitations upon the function f necessary for its application extended over several centuries. The problems with a classical interpretation are explained as follows:[13]

The greatest drawback of the classical Fourier transformation is a rather narrow class of functions (originals) for which it can be effectively computed. Namely, it is necessary that these functions decrease sufficiently rapidly to zero (in the neighborhood of infinity) in order to ensure the existence of the Fourier integral. For example, the Fourier transform of such simple functions as polynomials does not exist in the classical sense. The extension of the classical Fourier transformation to distributions considerably enlarged the class of functions that could be transformed and this removed many obstacles.

Further developments included generalization of the Fourier integral, "beginning with Plancherel's pathbreaking L2-theory (1910), continuing with Wiener's and Bochner's works (around 1930) and culminating with the amalgamation into L. Schwartz's theory of distributions (1945) ...",[14] and leading to the formal development of the Dirac delta function.

This is merely a heuristic characterization. The Dirac delta is not a function in the traditional sense as no function defined on the real numbers has these properties.[17] The Dirac delta function can be rigorously defined either as a distribution or as a measure.

One way to rigorously capture the notion of the Dirac delta function is to define a measure, which accepts as an argument a subset A of the real line R, and returns δ(A) = 1 if 0 ∈ A, and δ(A) = 0 otherwise.[19] If the delta function is conceptualized as modeling an idealized point mass at 0, then δ(A) represents the mass contained in the set A. One may then define the integral against δ as the integral of a function against this mass distribution. Formally, the Lebesgue integral provides the necessary analytic device. The Lebesgue integral with respect to the measure δ satisfies

∫ f(x) dδ(x) = f(0)

for every compactly supported continuous function f.
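
A minimal sketch of this measure-theoretic viewpoint, written in Python for concreteness (the set A is represented by a membership predicate, and the names dirac_measure and integral_against_delta are illustrative, not a standard API):

```python
def dirac_measure(A):
    """delta(A): 1 if the set A (given as a membership predicate) contains 0, else 0."""
    return 1.0 if A(0.0) else 0.0

def integral_against_delta(f):
    """Lebesgue integral of f with respect to the Dirac measure: it simply returns f(0)."""
    return f(0.0)

print(dirac_measure(lambda x: -1 < x < 1))            # 1.0: the interval (-1, 1) contains 0
print(dirac_measure(lambda x: x > 2))                 # 0.0: the set (2, infinity) does not
print(integral_against_delta(lambda x: x**2 + 3.0))   # 3.0, the value of f at 0
```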

In the theory of distributions, a generalized function is considered not a function in itself but only in relation to how it affects other functions when "integrated" against them. In keeping with this philosophy, to define the delta function properly, it is enough to say what the "integral" of the delta function is against a sufficiently "good" test function [φ]. Test functions are also known as bump functions. If the delta function is already understood as a measure, then the Lebesgue integral of a test function against that measure supplies the necessary integral.

For δ to be properly a distribution, it must be continuous in a suitable topology on the space of test functions. In general, for a linear functional S on the space of test functions to define a distribution, it is necessary and sufficient that, for every positive integer N there is an integer MN and a constant CN such that for every test function φ, one has the inequality[24]

With the δ distribution, one has such an inequality (with CN = 1) with MN = 0 for all N. Thus δ is a distribution of order zero. It is, furthermore, a distribution with compact support (the support being {0}).
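
Spelled out for δ itself, the required estimate is just the evaluation bound |δ[φ]| = |φ(0)| ≤ sup_x |φ(x)|, which uses no derivatives of φ at all; this is the sense in which the inequality holds with CN = 1 and MN = 0.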

In the context of measure theory, the Dirac measure gives rise to a distribution by integration. Conversely, equation (1) defines a Daniell integral on the space of all compactly supported continuous functions φ which, by the Riesz representation theorem, can be represented as the Lebesgue integral of φ with respect to some Radon measure.

Generally, when the term "Dirac delta function" is used, it is in the sense of distributions rather than measures, the Dirac measure being among several terms for the corresponding notion in measure theory. Some sources may also use the term Dirac delta distribution.

for every compactly supported continuous function f. As a measure, the n-dimensional delta function is the product measure of the 1-dimensional delta functions in each variable separately. Thus, formally, with x = (x1, x2, ..., xn), one has[6]

δ(x) = δ(x1) δ(x2) ⋯ δ(xn).    (2)

The delta function can also be defined in the sense of distributions exactly as above in the one-dimensional case.[25] However, despite widespread use in engineering contexts, (2) should be manipulated with care, since the product of distributions can only be defined under quite narrow circumstances.[26]

The notion of a Dirac measure makes sense on any set.[19] Thus if X is a set, x0 ∈ X is a marked point, and Σ is any sigma algebra of subsets of X, then the measure defined on sets A ∈ Σ by δx0(A) = 1 if x0 ∈ A, and δx0(A) = 0 otherwise, is the Dirac measure concentrated at x0.

Another common generalization of the delta function is to a differentiable manifold where most of its properties as a distribution can also be exploited because of the differentiable structure. The delta function on a manifold M centered at the point x0 ∈ M is defined as the following distribution:

δx0[φ] = φ(x0)    (3)

for all compactly supported smooth real-valued functions φ on M.[27] A common special case of this construction is when M is an open set in the Euclidean space Rn.

On a locally compact Hausdorff space X, the Dirac delta measure concentrated at a point x is the Radon measure associated with the Daniell integral (3) on compactly supported continuous functions φ. At this level of generality, calculus as such is no longer possible; however, a variety of techniques from abstract analysis are available. For instance, the mapping x0 ↦ δx0 is a continuous embedding of X into the space of finite Radon measures on X, equipped with its vague topology. Moreover, the convex hull of the image of X under this embedding is dense in the space of probability measures on X.[28]

This holds under the precise condition that f be a tempered distribution (see the discussion of the Fourier transform below). As a special case, for instance, we have the identity (understood in the distribution sense)

provided that g is a continuously differentiable function with g′ nowhere zero.[32] That is, there is a unique way to assign meaning to the distribution δ ∘ g so that this identity holds for all compactly supported test functions f. Therefore, the domain must be broken up to exclude the g′ = 0 point. This distribution satisfies δ(g(x)) = 0 if g is nowhere zero, and otherwise if g has a real root at x0, then

δ(g(x)) = δ(x − x0) / |g′(x0)|.
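
A quick numerical sanity check of this scaling rule, applying it at each simple root of g, is sketched below (Python with NumPy assumed; the choices g(x) = x² − 1, with roots at ±1, and the test function f are arbitrary). A narrow Gaussian serves as the nascent delta.

```python
import numpy as np

f = lambda x: np.exp(-x) + x**2          # an arbitrary smooth test function
g = lambda x: x**2 - 1                   # simple roots at x = -1 and x = 1, g'(x) = 2x

x = np.linspace(-5, 5, 2_000_001)
dx = x[1] - x[0]
eps = 1e-3                               # width of the nascent delta
delta_eps = lambda u: np.exp(-u**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

lhs = np.sum(f(x) * delta_eps(g(x))) * dx           # ~ integral of f(x) * delta(g(x))
rhs = sum(f(r) / abs(2 * r) for r in (-1.0, 1.0))   # sum of f(x_i) / |g'(x_i)| over the roots
print(lhs, rhs)                                     # the two values agree as eps -> 0
```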

Using the coarea formula from geometric measure theory, one can also define the composition of the delta function with a submersion from one Euclidean space to another one of different dimension; the result is a type of current. In the special case of a continuously differentiable function g: Rn → R such that the gradient of g is nowhere zero, the following identity holds[34]

∫ f(x) δ(g(x)) dx = ∫ f(x)/|∇g(x)| dσ(x),

where the integral on the right is taken over the level set g−1(0) with respect to its surface measure σ.

More generally, on an open set U in the n-dimensional Euclidean space Rn, the Dirac delta distribution centered at a point a ∈ U is defined by[43]

δa[φ] = φ(a)

for all φ ∈ S(U), the space of all smooth compactly supported functions on U. If α = (α1, ..., αn) is any multi-index and ∂α denotes the associated mixed partial derivative operator, then the αth derivative ∂αδa of δa is given by[43]

(∂αδa)[φ] = (−1)^|α| (∂αφ)(a).

That is, the αth derivative of δa is the distribution whose value on any test function φ is the αth derivative of φ at a (with the appropriate positive or negative sign).

The first partial derivatives of the delta function are thought of as double layers along the coordinate planes. More generally, the normal derivative of a simple layer supported on a surface is a double layer supported on that surface, and represents a laminar magnetic monopole. Higher derivatives of the delta function are known in physics as multipoles.

Higher derivatives enter into mathematics naturally as the building blocks for the complete structure of distributions with point support. If S is any distribution on U supported on the set {a} consisting of a single point, then there is an integer m and coefficients cα such that[44]

S = Σ_{|α| ≤ m} cα ∂αδa.

for all continuous functions f having compact support, or that this limit holds for all smooth functions f with compact support. The difference between these two slightly different modes of weak convergence is often subtle: the former is convergence in the vague topology of measures, and the latter is convergence in the sense of distributions.

Then a simple change of variables shows that ηε also has integral 1. One may show that (5) holds for all continuous compactly supported functions f,[45] and so ηε converges weakly to δ in the sense of measures.

The ηε constructed in this way are known as an approximation to the identity.[46] The terminology stems from the fact that the space L1(R) of absolutely integrable functions is closed under the operation of convolution of functions: f ∗ g ∈ L1(R) whenever f and g are in L1(R). However, there is no identity in L1(R) for the convolution product: no element h such that f ∗ h = f for all f. Nevertheless, the sequence ηε does approximate such an identity in the sense that

This limit holds in the sense of mean convergence (convergence in L1). Further conditions on the ηε, for instance that it be a mollifier associated to a compactly supported function,[47] are needed to ensure pointwise convergence almost everywhere.

If the initial η = η1 is itself smooth and compactly supported then the sequence is called a mollifier. The standard mollifier is obtained by choosing η to be a suitably normalized bump function, for instance

η(x) = exp(−1 / (1 − |x|²)) if |x| < 1, and η(x) = 0 if |x| ≥ 1.
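
A short sketch of this construction (Python with NumPy assumed; the normalizing constant is computed numerically rather than in closed form, and the discontinuous target function is an arbitrary choice) shows the approximation-to-identity property in L1: the error ‖f ∗ ηε − f‖ in L1 shrinks as ε → 0.

```python
import numpy as np

def bump(x):
    """Unnormalized bump: exp(-1/(1 - |x|^2)) inside (-1, 1), identically zero outside."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1
    out[inside] = np.exp(-1.0 / (1.0 - x[inside]**2))
    return out

x = np.linspace(-3, 3, 12_001)
dx = x[1] - x[0]
f_vals = np.sign(x)                       # a discontinuous function in L1([-3, 3])

for eps in [0.5, 0.1, 0.02]:
    eta_eps = bump(x / eps)
    eta_eps /= np.sum(eta_eps) * dx       # normalize so the mollifier has integral 1
    smoothed = np.convolve(f_vals, eta_eps, mode="same") * dx   # ~ f convolved with eta_eps
    print(eps, np.sum(np.abs(smoothed - f_vals)) * dx)          # L1 error shrinks with eps
```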

In the context of probability theory, it is natural to impose the additional condition that the initial η1 in an approximation to the identity should be positive, as such a function then represents a probability distribution. Convolution with a probability distribution is sometimes favorable because it does not result in overshoot or undershoot, as the output is a convex combination of the input values, and thus falls between the maximum and minimum of the input function. Taking η1 to be any probability distribution at all, and letting ηε(x) = η1(x/ε)/ε as above will give rise to an approximation to the identity. In general this converges more rapidly to a delta function if, in addition, η has mean 0 and has small higher moments. For instance, if η1 is the uniform distribution on [−1/2, 1/2], also known as the rectangular function, then:[48]

for all ε, δ > 0. Convolution semigroups in L1 that form a nascent delta function are always an approximation to the identity in the above sense; however, the semigroup condition is quite a strong restriction.

represents the temperature in an infinite wire at time t > 0, if a unit of heat energy is stored at the origin of the wire at time t = 0. This semigroup evolves according to the one-dimensional heat equation:

is the fundamental solution of the Laplace equation in the upper half-plane.[49] It represents the electrostatic potential in a semi-infinite plate whose potential along the edge is held fixed at the delta function. The Poisson kernel is also closely related to the Cauchy distribution. This semigroup evolves according to the equation

Although it is easy to see, using the Fourier transform, that this generates a semigroup in some sense, it is not absolutely integrable and so cannot define a semigroup in the above strong sense. Many nascent delta functions constructed as oscillatory integrals only converge in the sense of distributions (an example is the Dirichlet kernel below), rather than in the sense of measures.

A standard way of solving a linear partial differential equation L[u] = f, where L is a differential operator on Rn, is to seek first a fundamental solution, which is a solution of the equation

L[u] = δ.

When L is particularly simple, this problem can often be resolved using the Fourier transform directly (as in the case of the Poisson kernel and heat kernel already mentioned). For more complicated operators, it is sometimes easier first to consider an equation of the form

L[u] = h, where h is a plane wave function, meaning that it has the form h = h(x · ξ)

for some vector ξ. Such an equation can be resolved (if the coefficients of L are analytic functions) by the Cauchy–Kovalevskaya theorem or (if the coefficients of L are constant) by quadrature. So, if the delta function can be decomposed into plane waves, then one can in principle solve linear partial differential equations.

Such a decomposition of the delta function into plane waves was part of a general technique first introduced essentially by Johann Radon, and then developed in this form by Fritz John (1955).[52] Choose k so that n + k is an even integer, and for a real number s, put

The result follows from the formula for the Newtonian potential (the fundamental solution of Poisson's equation). This is essentially a form of the inversion formula for the Radon transform, because it recovers the value of φ(x) from its integrals over hyperplanes. For instance, if n is odd and k = 1, then the integral on the right hand side is

In the study of Fourier series, a major question consists of determining whether and in what sense the Fourier series associated with a periodic function converges to the function. The Nth partial sum of the Fourier series of a function f of period 2π is defined by convolution (on the interval [−π, π]) with the Dirichlet kernel

DN(x) = Σ_{n=−N}^{N} e^{inx} = sin((N + 1/2)x) / sin(x/2).

In spite of this, the result does not hold for all compactly supported continuous functions: that is, DN does not converge weakly to δ in the sense of measures. The lack of convergence of the Fourier series has led to the introduction of a variety of summability methods in order to produce convergence. The method of Cesàro summation leads to the Fejér kernel[53]

FN(x) = (1/N) Σ_{k=0}^{N−1} Dk(x) = (1/N) (sin(Nx/2) / sin(x/2))².
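
The contrast can be checked numerically: the Fejér kernel is nonnegative with mean value 1 over a period, while the L1 norm of the Dirichlet kernel (the Lebesgue constant) grows without bound, which is why DN fails to be a nascent delta in the sense of measures. A minimal sketch in Python (NumPy assumed; the kernels are evaluated from their standard closed forms):

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 200_000)   # even point count, so x = 0 is not a grid point
dx = x[1] - x[0]

dirichlet = lambda N, x: np.sin((N + 0.5) * x) / np.sin(x / 2)
fejer = lambda N, x: (np.sin(N * x / 2) / np.sin(x / 2))**2 / N

for N in [5, 50, 500]:
    lebesgue_const = np.sum(np.abs(dirichlet(N, x))) * dx / (2 * np.pi)
    fejer_mass = np.sum(fejer(N, x)) * dx / (2 * np.pi)
    print(N, lebesgue_const, fejer_mass)
# The first column grows roughly like (4/pi^2) * log(N); the second stays equal to 1.
```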

Thus δ is a bounded linear functional on the Sobolev space H1. Equivalently δ is an element of the continuous dual space H−1 of H1. More generally, in n dimensions, one has δ ∈ H−s(Rn) provided s > n/2.
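
The underlying estimate is a Cauchy–Schwarz inequality on the Fourier side. With the convention φ̂(ξ) = ∫ φ(x) e^{−ix·ξ} dx one has φ(0) = (2π)^{−n} ∫ φ̂(ξ) dξ, so

|φ(0)| ≤ (2π)^{−n} ∫ (1 + |ξ|²)^{−s/2} (1 + |ξ|²)^{s/2} |φ̂(ξ)| dξ ≤ Cs ‖φ‖_{Hs},

where the constant Cs is proportional to (∫ (1 + |ξ|²)^{−s} dξ)^{1/2}; this integral over Rn is finite precisely when 2s > n, which is the source of the restriction s > n/2.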

Moreover, let H2(∂D) be the Hardy space consisting of the closure in L2(∂D) of all holomorphic functions in D continuous up to the boundary of D. Then functions in H2(∂D) uniquely extend to holomorphic functions in D, and the Cauchy integral formula continues to hold. In particular for z ∈ D, the delta function δz is a continuous linear functional on H2(∂D). This is a special case of the situation in several complex variables in which, for smooth domains D, the Szegő kernel plays the role of the Cauchy integral.

With a suitable rigged Hilbert space (Φ, L2(D), Φ*) where Φ ⊂ L2(D) contains all compactly supported smooth functions, this summation may converge in Φ*, depending on the properties of the basis φn. In most cases of practical interest, the orthonormal basis comes from an integral or differential operator, in which case the series converges in the distribution sense.[58]

Cauchy used an infinitesimal α to write down a unit impulse, infinitely tall and narrow Dirac-type delta function δα satisfying ∫ F(x) δα(x) = F(0) in a number of articles in 1827.[59] Cauchy defined an infinitesimal in Cours d'Analyse (1827) in terms of a sequence tending to zero. Namely, such a null sequence becomes an infinitesimal in Cauchy's and Lazare Carnot's terminology.

Non-standard analysis allows one to rigorously treat infinitesimals. The article by Yamashita (2007) contains a bibliography on modern Dirac delta functions in the context of an infinitesimal-enriched continuum provided by the hyperreals. Here the Dirac delta can be given by an actual function, having the property that for every real function F one has ∫ F(x) δα(x) = F(0), as anticipated by Fourier and Cauchy.

A Dirac comb is an infinite series of Dirac delta functions spaced at intervals of T

A so-called uniform "pulse train" of Dirac delta measures, which is known as a Dirac comb, or as the Shah distribution, creates a sampling function, often used in digital signal processing (DSP) and discrete-time signal analysis. The Dirac comb is given as the infinite sum, whose limit is understood in the distribution sense,

ШT(x) = Σ_{k=−∞}^{∞} δ(x − kT).

Up to an overall normalizing constant, the Dirac comb is equal to its own Fourier transform. This is significant because if f is any Schwartz function, then the periodization of f is given by the convolution

(f ∗ ШT)(x) = Σ_{n=−∞}^{∞} f(x − nT).
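
A discrete illustration of the periodization (Python with NumPy assumed; the Gaussian f, the period T = 1, and the truncation of the sum are arbitrary choices): summing the translates f(x − nT) produces a T-periodic function.

```python
import numpy as np

T = 1.0                                    # period of the comb (an arbitrary choice)
f = lambda x: np.exp(-8 * x**2)            # a rapidly decaying, Schwartz-like function

def periodization(f, x, n_terms=50):
    """(f * comb_T)(x) = sum over n of f(x - n*T), truncated to |n| <= n_terms."""
    return sum(f(x - n * T) for n in range(-n_terms, n_terms + 1))

x = np.linspace(0.0, 3.0, 7)
print(periodization(f, x))
print(periodization(f, x + T))             # identical values: the result has period T
```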

In probability theory and statistics, the Dirac delta function is often used to represent a discrete distribution, or a partially discrete, partially continuous distribution, using a probability density function (which is normally used to represent fully continuous distributions). For example, the probability density function f(x) of a discrete distribution consisting of points x = {x1, ..., xn}, with corresponding probabilities p1, ..., pn, can be written as

f(x) = Σ_{i=1}^{n} pi δ(x − xi).

As another example, consider a distribution that 6/10 of the time returns a sample from a standard normal distribution, and 4/10 of the time returns exactly the value 3.5 (i.e. a partly continuous, partly discrete mixture distribution). The density function of this distribution can be written as

f(x) = 0.6 · (1/√(2π)) e^{−x²/2} + 0.4 · δ(x − 3.5).
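
In sampling terms this density says: with probability 0.6 draw from a standard normal, and with probability 0.4 return exactly 3.5. A small Monte Carlo check (Python with NumPy assumed; the sample size and seed are arbitrary) that the empirical mean approaches 0.6·0 + 0.4·3.5 = 1.4:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
take_atom = rng.random(n) < 0.4                        # discrete part, probability 0.4
samples = np.where(take_atom, 3.5, rng.standard_normal(n))

print(samples.mean())            # ~ 0.6 * 0 + 0.4 * 3.5 = 1.4
print((samples == 3.5).mean())   # ~ 0.4: the delta term contributes a genuine point mass
```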

The delta function is expedient in quantum mechanics. The wave function of a particle gives the probability amplitude of finding a particle within a given region of space. Wave functions are assumed to be elements of the Hilbert space L2 of square-integrable functions, and the total probability of finding a particle within a given interval is the integral of the squared magnitude of the wave function over the interval. A set {φn} of wave functions is orthonormal if they are normalized by

⟨φn | φm⟩ = ∫ φn*(x) φm(x) dx = δnm,

where δ here refers to the Kronecker delta. A set of orthonormal wave functions is complete in the space of square-integrable functions if any wave function ψ can be expressed as a combination of the φn:

ψ = Σn cnφn,

with cn = ⟨φn|ψ⟩. Complete orthonormal systems of wave functions appear naturally in quantum mechanics as the eigenfunctions of the Hamiltonian (of a bound system), whose eigenvalues are the energy levels. The set of eigenvalues, in this case, is known as the spectrum of the Hamiltonian. In bra–ket notation, as above, this equality implies the resolution of the identity:

I = Σn |φn⟩⟨φn|.
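
A concrete numerical illustration of this expansion (Python with NumPy assumed; the sine basis on the interval [0, π] and the particular wave function are arbitrary choices): the coefficients cn = ⟨φn|ψ⟩ are computed by quadrature and the partial sums Σ cnφn converge back to ψ.

```python
import numpy as np

# Orthonormal basis of L^2(0, pi): phi_n(x) = sqrt(2/pi) * sin(n*x), n = 1, 2, ...
x = np.linspace(0.0, np.pi, 20_001)
dx = x[1] - x[0]
phi = lambda n: np.sqrt(2.0 / np.pi) * np.sin(n * x)

psi = x * (np.pi - x)                       # an arbitrary wave function vanishing at 0 and pi

for n_max in [2, 8, 32]:
    c = [np.sum(phi(n) * psi) * dx for n in range(1, n_max + 1)]    # c_n = <phi_n | psi>
    recon = sum(cn * phi(n) for n, cn in zip(range(1, n_max + 1), c))
    print(n_max, np.max(np.abs(recon - psi)))   # psi = sum of c_n phi_n; the error shrinks
```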

Here the eigenvalues are assumed to be discrete, but the set of eigenvalues of an observable may be continuous rather than discrete. An example is the position observable, Qψ(x) = xψ(x). The spectrum of the position (in one dimension) is the entire real line, and is called a continuous spectrum. However, unlike the Hamiltonian, the position operator lacks proper eigenfunctions. The conventional way to overcome this shortcoming is to widen the class of available functions by allowing distributions as well: that is, to replace the Hilbert space of quantum mechanics by an appropriate rigged Hilbert space.[63] In this context, the position operator has a complete set of eigen-distributions, labeled by the points y of the real line, given by

φy(x) = δ(x − y).

The eigenfunctions of position are denoted by φy = |y⟩ in Dirac notation, and are known as position eigenstates.

Similar considerations apply to the eigenstates of the momentum operator, or indeed any other self-adjoint unbounded operator P on the Hilbert space, provided the spectrum of P is continuous and there are no degenerate eigenvalues. In that case, there is a set Ω of real numbers (the spectrum), and a collection φy of distributions indexed by the elements of Ω, such that

Pφy = yφy.

That is, φy are the eigenvectors of P. If the eigenvectors are normalized so that

where the operator-valued integral is again understood in the weak sense. If the spectrum of P has both continuous and discrete parts, then the resolution of the identity involves a summation over the discrete spectrum and an integral over the continuous spectrum.

The delta function also has many more specialized applications in quantum mechanics, such as the delta potential models for a single and double potential well.

The delta function can be used in structural mechanics to describe transient loads or point loads acting on structures. The governing equation of a simple mass–spring system excited by a sudden force impulse I at time t = 0 can be written

m (d²u/dt²) + k u = I δ(t),

where u(t) is the displacement of the mass m from equilibrium and k is the spring stiffness.

Another example is the deflection of a statically loaded slender beam which, according to Euler–Bernoulli theory, is governed by

EI (d⁴w/dx⁴) = q(x),

where EI is the bending stiffness of the beam, w the deflection, x the spatial coordinate and q(x) the load distribution. If a beam is loaded by a point force F at x = x0, the load distribution is written

q(x) = Fδ(x − x0).

As integration of the delta function results in the Heaviside step function, it follows that the static deflection of a slender beam subject to multiple point loads is described by a set of piecewise polynomials.
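
As a concrete check of this statement, the sketch below (Python with NumPy assumed; the cantilever geometry, the load position a, and the unit values of EI, L and F are arbitrary choices) integrates EI w'''' = F δ(x − a) four times numerically, using a single grid cell as the nascent delta, and compares the result with the standard closed-form piecewise-cubic cantilever deflection.

```python
import numpy as np

# Illustrative parameters (not from the text): a cantilever of length L fixed at x = 0,
# free at x = L, with a point load F at x = a, modelled as q(x) = F * delta(x - a).
EI, L, F, a = 1.0, 1.0, 1.0, 0.6
n = 20_001
x = np.linspace(0.0, L, n)
dx = x[1] - x[0]

q = np.zeros_like(x)
q[np.argmin(np.abs(x - a))] = F / dx      # nascent delta: one grid cell carrying total load F

def cumint(y):
    """Cumulative integral from 0 to x (left Riemann sum on the grid)."""
    return np.concatenate(([0.0], np.cumsum(y[:-1]) * dx))

# Integrate EI w'''' = q with cantilever boundary conditions:
# w(0) = w'(0) = 0 at the fixed end, w''(L) = w'''(L) = 0 at the free end.
shear = cumint(q)
shear -= shear[-1]            # enforce w'''(L) = 0; the shear jumps by F across x = a
moment = cumint(shear)
moment -= moment[-1]          # enforce w''(L) = 0; the moment is piecewise linear
slope = cumint(moment)        # w'(0) = 0
w = cumint(slope) / EI        # w(0) = 0

# Standard closed-form cantilever deflection under a single point load, for comparison.
w_exact = np.where(x <= a,
                   F * x**2 * (3 * a - x) / (6 * EI),
                   F * a**2 * (3 * x - a) / (6 * EI))
print(np.max(np.abs(w - w_exact)))        # small, and shrinks further as the grid is refined
```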

A point moment acting on a beam can also be described by delta functions. Consider two opposing point forces F at a distance d apart. They then produce a moment M = Fd acting on the beam. Now, let the distance d approach the limit zero, while M is kept constant. The load distribution, assuming a clockwise moment acting at x = 0, is written

^ JB Fourier (1822). The Analytical Theory of Heat (English translation by Alexander Freeman, 1878 ed.). The University Press. p. 408; cf. pp. 449 and 546–551.

^ Driggers 2003, p. 2321. See also Bracewell 1986, Chapter 5 for a different interpretation. Other conventions for assigning the value of the Heaviside function at zero exist, and some of these are not consistent with what follows.