Abstract
This is the first of a series of articles dealing with the casual use of distributions made by physicists and engineers. This use is very economical but sometimes leads to embarrassing conclusions that are difficult to justify with rigorous theories of distributions.
Keywords
Dirac delta function; distributions
1. Motivation
One of the authors received an e-mail from a famous physicist about one of his articles [1. M. Amaku, F.A.B. Coutinho, O.J.P. Éboli and E. Massad, Some problems with the Dirac delta function: Divergent series in Physics, accepted for publication in Braz. J. Phys. (2021).]. The following is a transcription of the relevant part that motivated this article: “As to your paper, I see it as an appeal that people stop using the now-standard notation for the Dirac delta function, and switch to the notation of distributions. I don’t think this is practical, and I’m not too supportive of papers that insist on this anyway.” We have to respect this opinion because the physicists’ use of the Dirac delta function as if it were a function is very economical. However, in the mentioned paper, we were not “advocating stopping the now standard notation of the Dirac function”. In fact we responded: “I disagree with you that our paper proposes a new way of using the Dirac delta function. In fact we used methods that are becoming more and more common in physics and only mentioned in remarks that the procedures could be done consistently by using Schwartz or Temple distribution theories. I am sure that you know examples of inconsistencies when using the now standard methods to handle blindly the Dirac function.” Let us explain our position more clearly with a bit of history.
In the late 19th century and the beginning of the 20th century a number of special techniques called “improper function methods” or “symbolic methods” were being used by engineers and mathematical physicists [2. G. Kirchhoff, Sitz. d. K. Preuss Akad. Wiss (Berlin) 22, 641 (1882)., 3. O. Heaviside, Proc. Roy. Soc. A 52, 504 (1893)., 4. O. Heaviside, Proc. Roy. Soc. A 54, 105 (1894)., 5. R.H. Weber und R. Ganz, Repertorium Der Physik, 1, Band 2. Teil (Wiley, Berlin, 1916)., 6. P.A.M. Dirac, The Principles of Quantum Mechanics (Oxford University Press, Oxford, 1930).]. The Dirac delta function became very popular because of its extensive use in Dirac’s work. However, it had been introduced earlier by Kirchhoff and was widely used by Heaviside. These methods were difficult to justify until the work of Schwartz [7. L. Schwartz, Theory of Distributions (Hermann, Paris, 1950).]. The work of Dirac and Schwartz is praised in the dedication of the book by Lighthill [8. M.J. Lighthill, Introduction to Fourier Analysis and Generalised Functions (Cambridge University Press, Cambridge, 1964).] as follows: “To Paul Dirac, who saw that it must be true, Laurent Schwartz, who proved it, and George Temple, who showed how simple it could be made.” The last reference is to the work of George Temple [9. G. Temple, J. Lon. Math. Soc. 28, 175 (1953).]. It should be mentioned that the symbolic methods above can also be justified by an almost equivalent theory due to J. G. Mikusiński [10. J. Mikusinski, Fundamenta Mathematicae 35, 235 (1948)., 11. On generalized exponential functions see J. Mikusinski, Studia Math. 13, 48 (1951).]; see also [12. A. Erdélyi, Operational calculus and generalized functions (Holt, Rinehart and Winston, New York, 1962)., 13. J.P. Marchand, Distributions: An Outline (Dover Publications, New York, 2007).]. For further information on the history of distributions see [14. D.A.V. Tonidandel and A.E.A. Araújo, Rev. Bras. Ens. Fis. 37, 3306 (2015)., 15. M.G. Katz and D.A. Tall, Found. Sci. 18, 107 (2013).] and references therein.
Despite the efforts to put the symbolic methods mentioned above in a rigorous mathematical form, the resulting theories still do not explain them clearly. In fact, we believe that neither the Schwartz [7. L. Schwartz, Theory of Distributions (Hermann, Paris, 1950).] nor the Temple [16. G. Temple, Proc. Roy. Soc. A 228, 175 (1955).] distribution theory can easily clarify the formal methods commonly used by physicists. In this series of articles we try to show examples where this appears to be so, comment on the possible relevance of the results, and show cases where the theory, when properly used, greatly illuminates the physics involved. The first of these problems, presented in this article, has no practical consequences, but since it brings a lot of confusion to the literature we think it is a good idea to clarify it. More specifically, we focus on the value of the integral

$$\int_{0}^{\infty} \delta(x)\,dx = X. \tag{1}$$
Two possible choices for X are $\tfrac{1}{2}$ and 1, which are called, respectively, the weak and strong definitions of the Dirac delta. We discuss these two options in detail in Section 2.

2. The problem

The value of the integral in equation (1) involving the Dirac delta function is the cause of some perplexity in the literature. The integral

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1 \tag{6}$$

is well known, but the value of equation (1) is the subject of some discussion in the literature. According to G. Barton [18], page 33, the “strong definition of the delta function requires X = 1, … but some books choose X = 1/2”. We quote below some books and articles that choose this last value:

(1) The first reference is on page 29 of [19]. No explanation is given, but we can suppose that the reasoning is as follows. Starting from equation (6), we can write

$$1 = \int_{-\infty}^{\infty} \delta(x)\,dx = \int_{-\infty}^{0} \delta(x)\,dx + \int_{0}^{\infty} \delta(x)\,dx = 2\int_{0}^{\infty} \delta(x)\,dx \tag{7}$$

since $\delta(x)$ is an even function. Here is a similar argument. Since

$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,dk, \tag{8}$$

then

$$\int_{-\infty}^{\infty}\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,dk\right]dx = 1. \tag{9}$$

Since equation (8) is an even function of x, we have

$$\int_{0}^{\infty}\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,dk\right]dx = \frac{1}{2}. \tag{10}$$

(2) The second reference is on page 791 of [20]. The justification for using equation (1) with X = 1/2 is that the n-dimensional delta function is given by

$$\delta(\vec{r}) = \frac{1}{\omega_n}\,\frac{\delta(|\vec{r}|)}{|\vec{r}|^{\,n-1}}, \tag{11}$$

where $\omega_n$ is the surface area of the n-dimensional unit sphere. In order for the integral of $\delta(\vec{r})$ over the whole space to be one we must use equation (1) with X = 1/2.

Remark 1: Formula (14) is a simplified version of the argument given by Courant and Hilbert [20] described above in equation (11). Take for instance n = 3; then $\omega_3 = 4\pi$.

(3) The third reference is given by Blinder [21] in his equation (7). His justification for this is that “the factor 1/2 reflects the fact that the delta function is located in one of the limits, so that, only half of the delta function is within the range of integration”.
(4) The fourth reference is [22]. The authors of this article propose a new definition of the Dirac delta function. According to them, the usual definition of the delta function is that the integral

$$\int_{a}^{b} \delta(x)\,dx \tag{12}$$

equals 1 if the interval contains the origin in its interior, equals 0 if the interval does not contain the origin, and is indeterminate if a or b equals zero, that is, if one of the end points of the interval of integration coincides with zero. They propose, assuming that a < b, that the definition should be

$$\int_{a}^{b} \delta(x)\,dx = \begin{cases} 1 & \text{if } ab < 0, \\ \tfrac{1}{2} & \text{if } ab = 0, \\ 0 & \text{if } ab > 0. \end{cases} \tag{13}$$

This definition of the delta function was proposed earlier and independently by John von Neumann and David Hilbert [23]. This fact was pointed out by [24].

One consequence of the weak definition of the delta function is that for the three-dimensional delta function we should have

$$\delta^3(\vec{r}) = \frac{\delta(r)}{2\pi r^2} \tag{14}$$

with $r = |\vec{r}|$. However, if we use the strong definition of the delta function we have

$$\delta^3(\vec{r}) = \frac{\delta(r)}{4\pi r^2}. \tag{15}$$

In fact, consider $\delta^3(\vec{r})$ in spherical coordinates. Since the delta function is located at the origin it has no angular part and depends only on $\delta(r)$ [25]. Therefore, using the weak definition of the delta function and equation (14), we can verify the consistency of the results:

$$\int \delta^3(\vec{r})\,d^3\vec{r} = \int_{0}^{\infty} \frac{\delta(r)}{2\pi r^2}\,4\pi r^2\,dr = 1. \tag{16}$$

However, using equation (15) and the strong definition also leads to a consistent result:

$$\int \delta^3(\vec{r})\,d^3\vec{r} = \int_{0}^{\infty} \frac{\delta(r)}{4\pi r^2}\,4\pi r^2\,dr = 1. \tag{17}$$

At this point it is already evident that the choice of the weak or the strong definition of the Dirac delta function implies that derived results, like equations (14) and (15), must be chosen consistently. It is difficult to obtain the results (14) and (15) using distribution theory in the elementary form used by physicists.
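The difference between the two values of X can be made concrete with sequences of ordinary functions that approximate the delta. The following is a minimal numerical sketch, not taken from the original article: the names `even_delta`, `one_sided_delta` and `half_line_integral` are illustrative assumptions, as is the choice of a Gaussian and a one-sided exponential as approximating sequences. An even sequence puts half of its area on each side of the origin and reproduces X = 1/2, whereas a sequence supported on x ≥ 0 reproduces X = 1; in both cases the radial integrals of equations (16) and (17) come out equal to one.

```python
import math
from scipy.integrate import quad

def even_delta(x, n):
    """Even approximation to the delta: Gaussian of unit area on the real line."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def one_sided_delta(x, n):
    """Approximation supported on x >= 0: n*exp(-n*x), also of unit area."""
    return n * math.exp(-n * x) if x >= 0 else 0.0

def half_line_integral(delta_n, n):
    """Approximate the integral of delta_n over [0, infinity)."""
    # Both families are negligible beyond 50/n, so a finite upper limit suffices.
    value, _ = quad(delta_n, 0.0, 50.0 / n, args=(n,))
    return value

for n in (1, 10, 100):
    X_weak = half_line_integral(even_delta, n)
    X_strong = half_line_integral(one_sided_delta, n)
    # In equations (16) and (17) the r^2 factors cancel, leaving
    # 2*X for the weak form (14) and X for the strong form (15).
    print(n, X_weak, X_strong, 2.0 * X_weak, X_strong)
```

The sketch does not decide which definition is correct; it only illustrates that each value of X corresponds to a particular way of approaching the origin, which is why the derived formulas must be chosen consistently.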
In fact, consider the delta function in spherical coordinates, which is given by [18]

$$\delta^3(\vec{r} - \vec{r}_0) = \frac{\delta(r - r_0)\,\delta(\theta - \theta_0)\,\delta(\phi - \phi_0)}{r^2 \sin\theta} \tag{18}$$

or

$$\delta^3(\vec{r} - \vec{r}_0) = \frac{\delta(r - r_0)\,\delta(\cos\theta - \cos\theta_0)\,\delta(\phi - \phi_0)}{r^2}. \tag{19}$$

We now show that equation (18) satisfies one of the basic properties of the delta function and comment upon it. In fact, for $\vec{r}_0 \neq \vec{0}$ we have

$$\int \delta^3(\vec{r} - \vec{r}_0)\,d^3\vec{r} = \int_{0}^{\infty} r^2\,dr \int_{0}^{2\pi} d\phi \int_{0}^{\pi} \sin\theta\,d\theta\;\delta^3(\vec{r} - \vec{r}_0) = \int_{0}^{\infty} dr\,r^2\,\frac{\delta(r - r_0)}{r^2} \int_{0}^{2\pi} d\phi\,\delta(\phi - \phi_0) \int_{0}^{\pi} d\theta\,\sin\theta\,\frac{\delta(\theta - \theta_0)}{\sin\theta} = 1. \tag{20}$$

On the other hand, the limit $\vec{r}_0 \to \vec{0}$ is not well defined if we adopt the weak definition for the Dirac delta function. In order to see this, let us take the limit along a line $r_0 \to 0$ with $\phi_0$ and $\theta_0$ fixed. From equation (20) we obtain

$$\int \delta^3(\vec{r})\,d^3\vec{r} = \int_{0}^{\infty} \delta(r)\,dr \int_{0}^{2\pi} \delta(\phi - \phi_0)\,d\phi \int_{0}^{\pi} \delta(\theta - \theta_0)\,d\theta. \tag{21}$$

Clearly this last expression is problematic for the weak definition, since its value depends on $\phi_0$ and $\theta_0$ and, moreover, it is not equal to one! Notwithstanding, this limit is well defined if we adopt the strong definition, since the $\theta$ and $\phi$ integrals are then equal to one independently of the values of $\theta_0$ and $\phi_0$. We can circumvent this dilemma, as we shall show in Section 4.1.2 of this paper.

Due to the above controversy, Gabriel Barton [18] prefers to use the so-called strong definition of the delta function, that is, the one that takes X = 1. His book, strongly recommended, carefully studies the two cases.

We would now like to study this problem in the light of the Schwartz definition of generalized functions. As we have already mentioned, distributions is the name that Schwartz [7] used for these mathematical objects. Generalized functions is the name that G. Temple [9] prefers for the Schwartz distributions. We summarize the main points of the theories of distributions in Section 3.

3. Distributions according to Schwartz and Temple

In this section we present two approaches to distributions, namely those of Schwartz and of Temple. The latter is simpler than the former, which, however, is more general.

3.1. The Schwartz definition

In order to define a distribution, or generalized function, as it is also known, we recall the definition of a functional. A functional F is a mathematical object that acts on functions and produces a number. So if $\varphi(x)$ is a function and F is a functional we can write $F[\varphi] = \text{Number}$. A simple example of a functional is given by a function f(x) that acts on a “good function” $\varphi(x)$ as follows:

$$\int_{-\infty}^{\infty} f(x)\,\varphi(x)\,dx = \text{Number}. \tag{22}$$

The function $\varphi(x)$ should be a good function so that the integral is well defined. We say that the function f(x) generates the functional. Equation (22) can be generalized to three dimensions as follows:

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y,z)\,\varphi(x,y,z)\,d^3V = \text{Number}, \tag{23}$$

where $d^3V = dx\,dy\,dz$ and $\varphi(x,y,z)$ is a good function in three dimensions.

A distribution or generalized function is a linear and continuous functional acting upon the space of the “good functions” $\varphi(x)$. Distributions that are generated by functions as in equation (22) are called regular distributions. We shall denote the distribution generated by a function f(x) by $G_f[\varphi] = \int f(x)\,\varphi(x)\,dx$. Distributions that are not generated by functions are called irregular.

The Dirac delta function in one dimension is defined by the following functional:

$$\mathrm{DiracDelta}(x - y)[\varphi(x)] \equiv \varphi(y), \tag{24}$$

where $\varphi(y)$ is the value of $\varphi$ at the point y, that is, a number. The Dirac delta function is an irregular distribution, because there is no function that satisfies equation (26). Equation (24) is very often written as

$$\delta(x - y)[\varphi] = \varphi(y). \tag{25}$$

The last equation is also written as

$$\int_{-\infty}^{\infty} \delta(x - y)\,\varphi(x)\,dx = \varphi(y). \tag{26}$$
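The idea that a distribution is a rule assigning a number to each “good function” can be sketched in a few lines of code. The snippet below is only an illustration under stated assumptions: the helper names `bump`, `regular_distribution` and `dirac_delta` are hypothetical, the smooth bump with support in (−1, 1) is an arbitrary choice of “good function”, and the integral of equation (22) is approximated numerically over the support of the bump.

```python
import math
from scipy.integrate import quad

def bump(x, a=1.0):
    """Smooth 'good function' that vanishes for |x| >= a (illustrative choice)."""
    if abs(x) >= a:
        return 0.0
    return math.exp(-a**2 / (a**2 - x**2))

def regular_distribution(f, lo=-1.0, hi=1.0):
    """Functional generated by f as in equation (22), integrated over [lo, hi]."""
    def G(phi):
        value, _ = quad(lambda x: f(x) * phi(x), lo, hi)
        return value
    return G

def dirac_delta(y):
    """Functional of equation (25): it simply evaluates phi at the point y."""
    return lambda phi: phi(y)

G_cos = regular_distribution(math.cos)   # regular: generated by cos(x)
delta_0 = dirac_delta(0.0)               # irregular: no generating function

print(G_cos(bump))     # a number, as in equation (22)
print(delta_0(bump))   # the value bump(0)

# Linearity, checked on one example: G[2*phi] = 2*G[phi].
double_bump = lambda x: 2.0 * bump(x)
print(G_cos(double_bump), 2.0 * G_cos(bump))
```

Note that `dirac_delta` never performs an integral: the integral sign in equation (26) is notation for the evaluation rule, which is the point of calling the delta an irregular distribution.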
The function $\varphi(x)$ in the Schwartz theory is called a “test function”. It is a function that has support (the interval where the function is different from zero) in a finite interval of the real line. Moreover, it is continuous and infinitely differentiable. A classical example of a “test function” is given by

$$\varphi(x) = \begin{cases} \exp\left[-\dfrac{a^2}{a^2 - x^2}\right] & \text{if } |x| < a, \\ 0 & \text{if } |x| \geq a, \end{cases} \tag{27}$$

whose support is the open set (−a, a) and which is infinitely differentiable.

Physical magnitudes that have no singularities can simply be “promoted” to regular distributions by substituting them for f(x,y,z) in equation (23). So, consider the x component of the electric field, which gives rise to a regular distribution

$$G_{E_x}[\varphi] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} E_x(x,y,z)\,\varphi(x,y,z)\,d^3V = \text{Number}, \tag{28}$$

where $\varphi(x,y,z)$ is a three-dimensional test function.

It is common to hear that the derivative of a function that has a finite jump discontinuity has a delta multiplied by this jump. This should be clarified as follows. Suppose you have a function f(x). Promote this to a distribution, that is, consider that it generates a distribution. The distribution so generated is infinitely differentiable. For example, the first derivative $G_f'$ of a distribution generated by f, acting on a test function $\varphi(x)$, is by definition

$$G_f'[\varphi] \equiv G_f[-\varphi']. \tag{29}$$

We then recall the definition of the step (Heaviside) function

$$H(x - x_1) = \begin{cases} 1 & \text{for } x \geq x_1, \\ 0 & \text{for } x < x_1. \end{cases} \tag{30}$$

Consider now $H(x - x_1)$ as a distribution. The space of test functions $\varphi(x)$ is the set of infinitely differentiable functions defined in a finite interval of the real numbers containing the point $x_1$. So, the distribution is the functional

$$G_H[\varphi] = \int_{-\infty}^{\infty} H(x - x_1)\,\varphi(x)\,dx = \int_{x_1}^{\infty} \varphi(x)\,dx. \tag{31}$$

According to equation (29) the derivative of $G_H[\varphi]$ is

$$G_H'[\varphi] = G_H[-\varphi'] = -\int_{x_1}^{\infty} \frac{d\varphi}{dx}\,dx = -\varphi(\infty) + \varphi(x_1) = \varphi(x_1). \tag{32}$$

This functional, however, is the Dirac delta function, viz.

$$\delta(x - x_1)[\varphi] = \int_{-\infty}^{\infty} \delta(x - x_1)\,\varphi(x)\,dx = \varphi(x_1). \tag{33}$$

It is a very important result that if a function is continuous or has only finite jump discontinuities it can be “promoted” to a generalized function by just using it with suitable test functions. However, if a function has an infinite discontinuity more care is required, as discussed in Section 4 with respect to the function $1/|\vec{r}|$.

It is possible and very useful to use distributions defined with test functions with support in a finite interval (a, b) of the real axis. In fact, we are going to use this type of distribution in another article in this series.

Distributions have limitations. For example, in general they cannot be multiplied by one another. Let us illustrate this point. We have seen that, to discuss $\delta(x - y)$, it is formally convenient to write equation (26). Likewise, to discuss the meaning of $\int_{0}^{\infty} \delta(x)\,dx$, we should consider $\int_{-\infty}^{\infty} \delta(x)\,H(x)\,\varphi(x)\,dx$, where H(x) is the step function defined in equation (30), which renders $H(x)\varphi(x)$ discontinuous. Therefore, $\int_{-\infty}^{\infty} \delta(x)\,H(x)\,\varphi(x)\,dx$ does not make sense in the Schwartz distribution theory. This explains why it is difficult to determine the value of X in the expression $\int_{0}^{\infty} \delta(x)\,dx = X$ within Schwartz distribution theory. Also, trying to solve the more general problem of calculating equation (21) with test functions defined in the interval $[0, \infty)$ for r, $[0, 2\pi]$ for $\phi$ and $[0, \pi]$ for $\theta$ is of no help, because the test functions vanish at the end points.

So, we conclude that although

$$\int_{-\infty}^{\infty} \delta(x - x')\,F(x')\,dx' = F(x),$$

when the integral is over a finite interval,

$$\int_{a}^{b} \delta(x - x')\,F(x')\,dx',$$

and the point $x' = x$ coincides with one of the ends of the interval of integration, the integral is undefined or requires great care. As mentioned before, we can circumvent this, as we shall show in Section 4.1.2 of this paper.

3.2. The Temple definition

The Temple [16] approach to distributions or generalized functions is simpler but, perhaps, less general than the Schwartz approach. Two good books using this approach are [8, 26]. According to Temple, generalized functions are special limits of sequences of functions. These special limits, also called weak limits, are defined as follows: a sequence of differentiable functions $f_n(x)$ (n = 1, 2, …) converges weakly to f(x) if for any test function $\varphi(x)$ (see above) the limit

$$\lim_{n \to \infty} \int_{-\infty}^{\infty} f_n(x)\,\varphi(x)\,dx = \int_{-\infty}^{\infty} f(x)\,\varphi(x)\,dx \tag{34}$$

exists, even though classically $\lim_{n \to \infty} f_n(x)$ may not exist. Sometimes this is written as $\operatorname{weak\,lim}_{n \to \infty} f_n(x) = f(x)$. Here are two examples of such sequences:

$$G_n(x) = \frac{n}{\pi}\,\frac{1}{1 + n^2 x^2} \tag{35}$$

and

$$W_n(x) = \frac{\sin(nx)}{\pi x}. \tag{36}$$

These examples tend to the Dirac delta function, which is not a function classically. But, as we shall see, other functions can be “promoted” to distributions. Furthermore, the following sequence tends to the Heaviside distribution above:

$$H_n(x) = \frac{1}{2} + \frac{1}{\pi}\arctan(nx). \tag{37}$$

Differentiation in the Temple approach is as in the Schwartz definition,

$$\operatorname{weak\,lim}_{n \to \infty} \frac{d}{dx} f_n(x) = \frac{d}{dx} f(x) = f'(x), \tag{38}$$

so that

$$\int_{-\infty}^{\infty} f'\,\varphi(x)\,dx = -\int_{-\infty}^{\infty} f\,\varphi'(x)\,dx, \tag{39}$$

that is, the result given by equation (29) holds.

Remark 2: There are other definitions and alternative treatments of generalized functions. One that uses discontinuous test functions is due to Kurasov [27]. This theory may be used to solve some of the problems raised in [28].
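The weak limits of Section 3.2 can be observed numerically. The following sketch is an illustration only: the function `pair`, the truncation of the integrals to [−10, 10], and the use of a rapidly decaying smooth function in place of a compactly supported test function are assumptions made for convenience. It evaluates the pairing of the sequence (35) with a fixed function and compares it with minus the pairing of the sequence (37) with the derivative of that function, as equations (29) and (39) require.

```python
import math
from scipy.integrate import quad

def phi(x):
    """Smooth, rapidly decaying stand-in for a test function; phi(0) = 1."""
    return math.exp(-x**2) * (1.0 + x)

def dphi(x):
    """Derivative of phi, computed by hand."""
    return math.exp(-x**2) * (1.0 - 2.0 * x * (1.0 + x))

def G_n(x, n):
    """Sequence of equation (35), tending weakly to the delta."""
    return (n / math.pi) / (1.0 + (n * x) ** 2)

def H_n(x, n):
    """Sequence of equation (37), tending weakly to the Heaviside step."""
    return 0.5 + math.atan(n * x) / math.pi

def pair(f_n, g, n, L=10.0):
    """Integral of f_n(x)*g(x) over [-L, L], split at the origin where f_n varies fastest."""
    left, _ = quad(lambda x: f_n(x, n) * g(x), -L, 0.0, limit=200)
    right, _ = quad(lambda x: f_n(x, n) * g(x), 0.0, L, limit=200)
    return left + right

for n in (10, 100, 1000):
    val_delta = pair(G_n, phi, n)     # should approach phi(0) = 1
    val_deriv = -pair(H_n, dphi, n)   # derivative of H_n acting on phi, via (39)
    print(n, val_delta, val_deriv)
```

Because $G_n$ is the ordinary derivative of $H_n$, the two columns agree for every n up to quadrature error, and both drift towards $\varphi(0)$ as n grows; the convergence is slow here because the sequence (35) has long tails.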
Once a choice is made for the integral in equation (1), we must consistently work with the chosen value when we consider other related problems, as shown below. As an illustration, we present the relevance of this choice for physics in Section 4.

4. What are the values of $\Delta\frac{1}{r}$ and $\nabla\cdot\frac{\vec{r}}{r^3}$?

In classical electromagnetism, that is, prior to the invention of distributions, these formulae were not known. We shall discuss below several ways of obtaining them, and the advantages they have over the approach used prior to distribution theory.

4.1. Obtaining $\Delta\frac{1}{r} = -4\pi\delta^3(\vec{r})$

We begin by discussing the case of $\Delta\frac{1}{r}$, where $r = |\vec{r}| = \sqrt{x^2 + y^2 + z^2}$ is the modulus of the vector $\vec{r} = x\,\vec{i} + y\,\vec{j} + z\,\vec{k}$ and $\Delta$ is the Laplacian operator, whose expression in Cartesian coordinates is

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{40}$$

In spherical coordinates the Laplacian is

$$\Delta = \frac{1}{r}\frac{\partial^2}{\partial r^2}\,r + \dots \tag{41}$$

where we omitted the contributions from the angular part of the operator. The fact that

$$\Delta\frac{1}{r} = -4\pi\,\delta^3(\vec{r}) \tag{42}$$

is the object of much discussion in the literature. In fact, there are countless papers about this subject. Here is a selection of the ones we find most useful [21, 29, 30, 31, 32].

In order to calculate $\Delta\frac{1}{r}$ we have to “promote” the function 1/r to a distribution. As mentioned above and discussed below, this is not a simple task because the classical function 1/r is mathematically not well defined for r = 0. The physical significance of the magnitude 1/r we are considering is the inverse of the length of the segment from 0 to the point whose radial coordinate is r. As can be anticipated, the use of the strong or weak definition of the Dirac delta function in equation (42) requires that the definition of the distribution associated with 1/r be chosen carefully, as shown below.

4.1.1. The definition of the distribution $G^{B}_{1/r}[\varphi]$ used by S. M. Blinder

Our goal in this section is to prove equation (42) using the strong definition of the Dirac delta function consistently.
To this end, we analyze the definition of the distribution corresponding to the classical function 1/r that was used by S. M. Blinder in the second part of his article [21]. Let us follow Temple’s method to define a distribution that corresponds to the classical function 1/r:

$$G^{B}_{1/r} = \operatorname{weak\,lim}_{n \to \infty} \frac{1}{r}\,H\!\left(r - \frac{1}{n}\right), \tag{43}$$

that is, we are using a sequence of functions $H(r - \frac{1}{n})$ that tends to H(r) when n → ∞. A more careful definition of 1/r, in three dimensions, will be given in Section 4.1.3, where this calculation is repeated in more detail. For simplicity, we shall omit the weak limit in what follows. Then, using the radial part of the Laplacian given in equation (41),

$$\Delta\frac{1}{r} = \Delta G^{B}_{1/r} = \frac{1}{r}\frac{\partial^2}{\partial r^2} H(r) \tag{44}$$

or

$$\Delta\frac{1}{r} = \frac{1}{r^2}\left[r\frac{\partial}{\partial r}\delta(r)\right] = -\frac{1}{r^2}\,\delta(r), \tag{45}$$

where the last equality was obtained using

$$-r\frac{\partial}{\partial r}\delta(r) = \delta(r). \tag{46}$$

For the sake of continuity of the argument, let us postpone the proof of equation (46). Now, using the expression for $\delta^3(\vec{r})$ given in equation (15) for the strong definition of the delta function, we can rewrite equation (45) as

$$\Delta\frac{1}{r} = -4\pi\,\delta^3(\vec{r}).$$

Hence, we have demonstrated equation (42). Therefore, this result can be consistently obtained with the use of the strong definition of the delta function, provided equation (46) holds true for this choice. So, let us obtain this equation.

In order to demonstrate equation (46) using the strong definition of the delta function we use a test function of the form

$$\varphi(r) = \begin{cases} \dfrac{g(r)}{r} & \text{for } r > 0, \\ 0 & \text{for } r < 0, \end{cases} \tag{47}$$

with g(r) infinitely differentiable and satisfying

$$\lim_{r \to 0} r\,g(r) = 0. \tag{48}$$

Then, to demonstrate equation (46) we take its right-hand side divided by r, that is, $\delta(r)/r$, multiply it by the test function g(r)/r and integrate over the volume element $4\pi r^2\,dr$ to get

$$4\pi\int_{0}^{\infty} \frac{\delta(r)}{r}\,\frac{g(r)}{r}\,r^2\,dr = 4\pi\,g(0). \tag{49}$$
On the other hand, the left-hand side of equation (46), divided by r, multiplied by the test function (47), and integrated over the volume element $4\pi r^2\,dr$ gives

$$4\pi\int_{0}^{\infty} -\frac{\partial\delta(r)}{\partial r}\,\frac{g(r)}{r}\,r^2\,dr = 4\pi\,g(0), \tag{50}$$

when we integrate the left-hand side of the last equation by parts. Let us give the details of this last calculation, following page 34 of reference [18] with modifications:

$$\int_{0}^{\infty} -\frac{\partial\delta(r)}{\partial r}\,\frac{g(r)}{r}\,r^2\,dr = \int_{0}^{\infty} -\frac{\partial\delta(r)}{\partial r}\,[r\,g(r)]\,dr = \int_{-\infty}^{\infty} -\frac{\partial\delta(r)}{\partial r}\,[r\,g(r)]\,dr, \tag{51}$$

where in the last step we used that the test function vanishes for r < 0. Now, equation (47) allows us to write the last integral, after an integration by parts, as

$$\int_{-\infty}^{\infty} \delta(r)\,\frac{d}{dr}[r\,g(r)]\,dr = \int_{-\infty}^{\infty} \delta(r)\left[r\frac{dg(r)}{dr} + g(r)\right]dr = g(0). \tag{52}$$

This demonstrates the result.

4.1.2. The definition of the distribution $G^{KH}_{1/r}[\varphi]$ used by Ben Yu-Kuang Hu

Let us now see that, using a different definition of the distribution that corresponds to the classical function 1/r, we can demonstrate equation (42) using the weak definition of the delta function. We follow here the article by Ben Yu-Kuang Hu [33], but with some modifications. First we define new spherical coordinates with r ranging from −∞ to ∞ and the polar angle $\theta$ ranging from 0 to π/2. Then, we can define the distribution

$$G^{KH}_{1/r} = \frac{1}{r}\,\operatorname{sign}(r) = \frac{1}{r}\begin{cases} 1 & \text{for } r > 0, \\ -1 & \text{for } r < 0. \end{cases} \tag{53}$$

Once again, the magnitude r is physically the distance from the origin to a point with coordinate r. Now we calculate

$$\Delta\left(G^{KH}_{1/r}\right) = \frac{1}{r}\frac{\partial^2}{\partial r^2}\operatorname{sign}(r) = \frac{2}{r}\frac{\partial}{\partial r}\delta(r) = -\frac{2\,\delta(r)}{r^2} = -4\pi\,\delta^3(\vec{r}), \tag{54}$$

where we have now used the results of equations (46) and (14). Hence, we have demonstrated equation (42), but using in the last step the weak definition of the delta function, that is, equation (14). Once again we used the result given in equation (46), so let us derive it for the weak definition of the delta function.
To do so, we use a test function $\varphi(r)$ whose support contains the origin and integrate its product with the left-hand side of equation (46):

$$-\int_{-\infty}^{\infty} \varphi(r)\,r\,\frac{\partial\delta(r)}{\partial r}\,dr = \int_{-\infty}^{\infty} \frac{\partial\,(r\,\varphi(r))}{\partial r}\,\delta(r)\,dr \tag{55}$$

$$= \int_{-\infty}^{\infty} \left(r\frac{\partial\varphi(r)}{\partial r} + \varphi(r)\right)\delta(r)\,dr = \int_{-\infty}^{\infty} \delta(r)\,\varphi(r)\,dr, \tag{56}$$

where the last equality follows from the fact that the derivative of $\varphi$ is continuous, so that $r\,\partial\varphi/\partial r\,|_{r=0} = 0$. Therefore, equation (46) also holds for the weak definition of the Dirac delta function.

In brief, we have obtained the well-known result

$$\Delta\frac{1}{r} = -4\pi\,\delta^3(\vec{r})$$

using both the weak and the strong definitions of the delta function. This should be no surprise. The two results are the same, but they result from two different definitions of the distribution that corresponds to the classical function 1/r. Another, more careful definition of this distribution is going to be presented, from the point of view of Schwartz, in Section 4.1.3 below. This definition is much more comprehensive than the two we have just presented. The same distribution, from the point of view of Temple, is going to be presented in a separate subsection.

4.1.3. The definition of the distribution $G^{SW}_{1/r}[\varphi]$ used by Ray Skinner and John A. Weil

Another, clearer way to define the distribution 1/r is given by Ray Skinner and John A. Weil [17]. We now present this approach to show that equation (42) is true, making clear that this framework is more illuminating than the above demonstrations. We follow closely the presentation of [17]. The approach is to consider that both sides of equation (42) must be interpreted as distributions. That is, the classical operator $\Delta$ and the classical function 1/r must be “promoted” to a generalized operator acting on generalized functions and to a generalized function, respectively. Let us begin by “promoting” 1/r to a generalized function. The classical function 1/r and its derivatives $-1/r^2$, $2/r^3$, etc. “blow up” at r = 0.
Following Skinner and Weil we can define the generalized function corresponding to 1/r as

$$G^{SW}_{1/r}[\varphi] = \int_{S} \varphi(\vec{r})\,r\,dr\,\sin\theta\,d\theta\,d\phi = \int_{S} \varphi(\vec{r})\,r\,dr\,d\Omega, \tag{57}$$

where $\varphi$ is any test function and $d\Omega = \sin\theta\,d\theta\,d\phi$ is the element of solid angle, seen from the origin, of a volume element containing $\vec{r}$. Similarly, the classical function $1/r^2$ corresponds to the generalized function

$$G^{SW}_{1/r^2}[\varphi] = \int_{S} \varphi(\vec{r})\,dr\,d\Omega. \tag{58}$$

Equations (57) and (58), however, do not define uniquely, from the mathematical point of view, a generalized function corresponding to 1/r and $1/r^2$ respectively, as we shall see at the end of this section. Nevertheless, physics chooses the definitions (57) and (58) uniquely!

We can now calculate formula (42), which requires that we use for the generalized $\nabla$ operator a definition in agreement with equation (29). We start by calculating the generalized gradient of the generalized function corresponding to 1/r given by equation (57). We have

$$\nabla G^{SW}_{1/r}[\varphi] \equiv -\int_{S} [\nabla\varphi(\vec{r})]\,r\,dr\,d\Omega = -\int_{S} [\nabla\varphi(\vec{r})]\,\frac{1}{r}\,d^3V. \tag{59}$$

Next we evaluate the Laplacian of the generalized function $G^{SW}_{1/r}$ corresponding to 1/r. We have, once again using equation (29),

$$\begin{aligned} \Delta G^{SW}_{1/r}[\varphi] &= \nabla\cdot\nabla G^{SW}_{1/r}[\varphi] = \int_{S} \nabla\cdot[\nabla\varphi(\vec{r})]\,r\,dr\,d\Omega \\ &\overset{\text{parts}}{=} \int_{S} \nabla\varphi(\vec{r})\cdot\frac{\vec{r}}{r}\,dr\,d\Omega = \int_{S} \frac{\partial\varphi}{\partial r}(\vec{r})\,dr\,d\Omega \\ &= 4\pi\left[\lim_{r \to \infty}\varphi(\vec{r}) - \lim_{r \to 0}\varphi(\vec{r})\right] = -4\pi\,\varphi(\vec{0}), \end{aligned} \tag{60}$$

where we “integrated by parts” in the second line. The integral of the angular part in the last line is not straightforward. It can be evaluated by expanding $\varphi(\vec{r})$ in spherical harmonics, and we leave it to the careful reader to perform it; see reference [18], page 33, for further information. Hence, we obtain equation (42), that is,

$$\Delta\frac{1}{r} = -4\pi\,\delta^3(\vec{r}).$$

A very important point is to show that the definitions of 1/r and $1/r^2$, as distributions, are determined by physics.
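The chain of equalities in equation (60) can be checked for one concrete function. The sketch below is a numerical illustration under stated assumptions: it uses the radial Gaussian $\varphi(r) = e^{-r^2}$ as a stand-in for a test function (it is smooth and decays rapidly, but does not have compact support), the names `G_inv_r` and `laplacian_phi` are hypothetical, and the radial integral is cut off at r = 12. For a radial function the angular integration in equation (57) contributes a factor 4π, and the Laplacian is $\varphi'' + 2\varphi'/r = (4r^2 - 6)\,e^{-r^2}$.

```python
import math
from scipy.integrate import quad

def phi(r):
    """Radial Gaussian standing in for a test function; phi(0) = 1."""
    return math.exp(-r**2)

def laplacian_phi(r):
    """Laplacian of phi for a radial function: phi'' + (2/r) phi'."""
    return (4.0 * r**2 - 6.0) * math.exp(-r**2)

def G_inv_r(radial_f, R=12.0):
    """Equation (57) for a radial function: 4*pi times the integral of f(r)*r dr."""
    value, _ = quad(lambda r: radial_f(r) * r, 0.0, R)
    return 4.0 * math.pi * value

# Two applications of the rule (29) give (Delta G)[phi] = G[Delta phi].
lhs = G_inv_r(laplacian_phi)
rhs = -4.0 * math.pi * phi(0.0)
print(lhs, rhs)
```

Both numbers agree with $-4\pi$ to within the quadrature error, as equation (60) predicts. A single radial function does not, of course, replace the general argument, which also has to deal with the angular dependence of $\varphi$.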
The definitions (57) and (58) of the above quantities are unique only for r > 0. Take, for example, the definition given by equation (57) of the distribution corresponding to the classical function 1/r. We could add to it any linear combination of $\delta^3(\vec{r})$ and its derivatives and we would still represent 1/r for r > 0. However, the electrical potential defined by equation (57) is the one that satisfies the Maxwell equations in generalized form, as shown in [17]. So, the generalized form of the Maxwell equations defines the distributions we need in order to write them.

The conclusion of this section is that we can work with either value of X in equation (1), provided we do it consistently. Nevertheless, we must pay attention to how the magnitudes we are dealing with are defined as distributions. We summarize a few features of the different approaches presented above in Table 1.

Table 1. Comparison between the weak and strong definitions of the Dirac delta function.

| Quantity | Weak value | Strong value |
| --- | --- | --- |
| $\int_{-\infty}^{\infty} \delta(x)\,dx$ | 1 | 1 |
| $\int_{0}^{\infty} \delta(x)\,dx$ | $\tfrac{1}{2}$ | 1 |
| $\delta^3(\vec{r})$ | $\dfrac{\delta(r)}{2\pi r^2}$ | $\dfrac{\delta(r)}{4\pi r^2}$ |
| $\Delta\frac{1}{r}$ | $-4\pi\,\delta^3(\vec{r})$ | $-4\pi\,\delta^3(\vec{r})$ |

4.2. Obtaining $\nabla\cdot\frac{\vec{r}}{r^3} = 4\pi\delta^3(\vec{r})$

The potential of a unit charge placed at the origin is given by $\psi(\vec{r}) = \frac{1}{r}$. Its gradient is

$$\nabla\left(\frac{1}{r}\right) = \frac{\partial}{\partial r}\left(\frac{1}{r}\right)\frac{\vec{r}}{r} + \cdots = -\frac{\vec{r}}{r^3}. \tag{61}$$

Therefore, the Coulomb field associated with this charge is

$$\vec{E} = -\nabla\psi = \frac{\vec{r}}{r^3}. \tag{62}$$

Following Temple’s approach as in Section 4.1.1, we can define a distribution corresponding to this electric field as the limit of a sequence of vector functions that tend to it:

$$\operatorname{weak\,lim}_{n \to \infty} \vec{E}\,H\!\left(r - \frac{1}{n}\right) = \operatorname{weak\,lim}_{n \to \infty} \frac{1}{r^2}\,\frac{\vec{r}}{r}\,H\!\left(r - \frac{1}{n}\right). \tag{63}$$

The action of the electric field $\vec{E}$, as a distribution, on a test function $\varphi(\vec{r})$ is, using spherical coordinates,

$$G_{\vec{E}}[\varphi] = \int \vec{E}\,\varphi(\vec{r})\,d^3V = \lim_{n \to \infty} \int_{1/n}^{\infty}\int_{0}^{2\pi}\int_{0}^{\pi} \frac{\vec{e}_r}{r^2}\,\varphi(\vec{r})\,r^2\,dr\,\sin\theta\,d\theta\,d\phi, \tag{64}$$

where $\vec{e}_r = \vec{r}/r$ is the radial unit vector.
Now let us evaluate the divergence of the electric field in spherical coordinates,

$$\nabla\cdot\vec{E} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 E_r\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\left(E_\theta\sin\theta\right) + \frac{1}{r\sin\theta}\frac{\partial E_\phi}{\partial\phi}, \tag{65}$$

which vanishes for r > 0. Analogously to equation (29), the divergence of this vector distribution is

$$\nabla\cdot G_{\vec{E}}[\varphi] = G_{\nabla\cdot\vec{E}}[\varphi] = \operatorname{weak\,lim}_{n \to \infty} \int -\nabla\varphi(\vec{r})\cdot\vec{E}\,H\!\left(r - \frac{1}{n}\right)d^3V. \tag{66}$$

To proceed we integrate by parts, remembering that the surface term vanishes owing to the boundary condition on $\varphi$, and use the identity

$$\nabla\cdot(f\vec{E}) = \nabla f\cdot\vec{E} + f\,\nabla\cdot\vec{E} \tag{67}$$

to obtain

$$\begin{aligned} G_{\nabla\cdot\vec{E}}[\varphi] &= \lim_{n \to \infty} \int \left[\nabla H\!\left(r - \frac{1}{n}\right)\cdot\frac{\vec{e}_r}{r^2} + H\!\left(r - \frac{1}{n}\right)\nabla\cdot\left(\frac{\vec{e}_r}{r^2}\right)\right]\varphi(\vec{r})\,d^3V \\ &= \lim_{n \to \infty} \int_{0}^{\infty}\int_{0}^{2\pi}\int_{0}^{\pi} \delta\!\left(r - \frac{1}{n}\right)\frac{\vec{e}_r\cdot\vec{e}_r}{r^2}\,\varphi(\vec{r})\,r^2\,dr\,\sin\theta\,d\theta\,d\phi \\ &= \int 4\pi\,\delta^3(\vec{r})\,\varphi(\vec{r})\,d^3V = 4\pi\,\varphi(\vec{0}), \end{aligned} \tag{68}$$

where we employed equation (15) to go from the second to the third line. Remember that equation (15) is valid when we work with the strong definition of the Dirac delta function. Therefore, we have proved that

$$\nabla\cdot\left(\frac{\vec{r}}{r^3}\right) = 4\pi\,\delta^3(\vec{r}). \tag{69}$$

In close analogy with what was presented in Section 4.1.2, we can also show that this result is valid for the weak definition of the delta function. Once again, this section shows clearly that, using careful definitions of physical magnitudes, we can obtain well-known expressions within a sound mathematical formulation.

In Section 4 of this paper we describe and deduce two routinely used formulas in electromagnetism:
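(1) The Laplacian of the Coulomb potential, namely

$$\Delta\frac{1}{r} = -4\pi\,\delta^3(\vec{r}), \tag{2}$$

where r stands for $|\vec{r}|$.

(2) The divergence of the Coulomb field of a unit point charge, namely

$$\nabla\cdot\frac{\vec{r}}{r^3} = 4\pi\,\delta^3(\vec{r}). \tag{3}$$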
These two formulas were not known before the theory of generalized functions. We show how the theory modifies our understanding of electromagnetism and clarifies complex examples. A very good early paper on this is [17. R. Skinner and J.A. Weil, Am. J. Phys. 57, 777 (1989).]. The two formulas are well known to physics students, but we show that their derivation appears to depend on the value of the integral mentioned in the title.
As mentioned above, these formulas were unknown until the development of distribution theory, and without them some calculations are problematic. For instance, we know that

$$\Delta\frac{1}{r} = 0 \tag{4}$$

for $r \neq 0$ and that this Laplacian is not defined for $r = 0$. However,

$$\int \Delta\frac{1}{r}\,d^3V = -4\pi. \tag{5}$$
This was not considered a problem before the introduction of distributions. It was considered to be just the potential of a point charge, which, for the pioneers, didn’t exist anyway: it was an idealized case. Of course if we consider equation (2) the above result becomes so natural that the above calculation is considered as a proof of this equation. Note, however, that since the right side of this equations is the delta function, a distribution or generalized function as it is also called, the left side must also be a generalized function. Therefore, we shall return to this in Sections 3 3. Distributions according to Schwartz and Temple In this section we present two approaches to distributions, that is the Schwartz and Temple ones. The latter one is simpler than the Schwartz one, however, this is more general. 3.1. The Schwartz definition In order to define a distribution or a generalized function, as they are also known, we recall the definition of a functional. A functional F is a mathematical object that acts on functions and produces a number. So if φ(x) is a function and F is a functional we can write F[φ] = Number. A simple example of a functional is given by a function f(x) that acts on a “good function” φ(x) as follows: (22) ∫ - ∞ ∞ f ( x ) φ ( x ) d x = Number . The function φ(x) should be a good function so that the integral is well defined. We say that the function f(x) generates the functional. equation (22) can be generalized to three dimensions as follows (23) ∫ - ∞ ∞ ∫ - ∞ ∞ ∫ - ∞ ∞ f ( x , y , z ) φ ( x , y , z ) d 3 V = Number . where d3V = dxdydz and φ(x,y,z) is a good function in three dimensions. A distribution or generalized function is a linear and continuous functional acting upon the space of the “good functions” φ(x). Distributions that are generated by functions like in equation (22) are called regular distributions. We shall denote distributions generated by a function f(x) by G f = ∫ f ( x ) φ ( x ) d x . 
Distributions that are not generated by functions are called irregular. The Dirac delta function in one dimension is defined by the following functional:

$$\mathrm{DiracDelta}(x-y)[\varphi(x)] \equiv \varphi(y), \tag{24}$$

where $\varphi(y)$ is the value of $\varphi$ at the point $y$, that is, a number. The Dirac delta function is an irregular distribution, because there is no function that satisfies equation (26). Equation (24) is very often written as

$$\delta(x-y)[\varphi] = \varphi(y). \tag{25}$$

The last equation is also written as

$$\int_{-\infty}^{\infty} \delta(x-y)\,\varphi(x)\,dx = \varphi(y). \tag{26}$$

The function $\varphi(x)$ in the Schwartz theory is called a “test function” and is a function that has support (the interval where the function is different from zero) in a finite interval of the real line. Moreover, it is continuous and infinitely differentiable. A classical example of a “test function” is given by

$$\varphi(x) = \begin{cases} \exp\left[-\dfrac{a^2}{a^2-x^2}\right] & \text{if } |x| < a, \\ 0 & \text{if } |x| \ge a, \end{cases} \tag{27}$$

whose support is the open set $(-a,a)$ and which is infinitely differentiable.

Physical magnitudes that have no singularities can simply be “promoted” to regular distributions by substituting them for $f(x,y,z)$ in equation (23). So, consider the $x$ component of the electric field, which gives rise to a regular distribution

$$G_{E_x}[\varphi] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} E_x(x,y,z)\,\varphi(x,y,z)\,d^3V = \text{Number}, \tag{28}$$

where $\varphi(x,y,z)$ is a three-dimensional test function.

It is common to hear that the derivative of a function that has a finite jump discontinuity contains a delta function multiplied by this jump. This should be clarified as follows. Suppose you have a function $f(x)$. Promote it to a distribution, that is, consider that it generates a distribution. The distribution so generated is infinitely differentiable. For example, the first derivative $G_f'$ of a distribution generated by $f$, acting on a test function $\varphi(x)$, is by definition

$$G_f'[\varphi] \equiv G_f[-\varphi']. \tag{29}$$

We then recall the definition of the step (Heaviside) function

$$H(x-x_1) = \begin{cases} 1 & \text{for } x \ge x_1, \\ 0 & \text{for } x < x_1. \end{cases} \tag{30}$$
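Definition (29) can be tried out numerically on the step function (30). The following is a minimal sketch, not part of the original article: it assumes the test function (27) with $a = 1$, a hypothetical jump location `x1 = 0.3`, and a simple midpoint rule, and it evaluates $G_H[-\varphi']$ so that the result can be compared with $\varphi(x_1)$.

```python
import math

def bump(x, a=1.0):
    """Test function of equation (27): smooth, nonzero only for |x| < a."""
    if abs(x) >= a:
        return 0.0
    return math.exp(-a * a / (a * a - x * x))

def bump_prime(x, a=1.0):
    """Classical derivative of bump (chain rule applied to the exponent)."""
    if abs(x) >= a:
        return 0.0
    d = a * a - x * x
    return bump(x, a) * (-2.0 * a * a * x / (d * d))

def step_derivative_action(x1, a=1.0, steps=20000):
    """G_H'[phi] = G_H[-phi'] = -(integral of phi' from x1 to a), midpoint rule.

    The upper limit is a because phi' vanishes identically beyond the support.
    """
    h = (a - x1) / steps
    return -h * sum(bump_prime(x1 + (i + 0.5) * h, a) for i in range(steps))

x1 = 0.3  # hypothetical location of the jump, chosen inside the support of phi
value = step_derivative_action(x1)
print(value, bump(x1))  # the two numbers should agree closely
```

The agreement of the two printed numbers illustrates, for this one test function, that the derivative of the step distribution acts like the delta functional at $x_1$.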
Consider now $H(x-x_1)$ as a distribution. The space of test functions $\varphi(x)$ is the set of infinitely differentiable functions defined in a finite interval of the real numbers containing the point $x_1$. So, the distribution is the functional

$$G_H[\varphi] = \int_{-\infty}^{\infty} H(x-x_1)\,\varphi(x)\,dx = \int_{x_1}^{\infty} \varphi(x)\,dx. \tag{31}$$

According to equation (29), the derivative of $G_H[\varphi]$ is

$$G_H'[\varphi] = G_H[-\varphi'] = -\int_{x_1}^{\infty} \frac{d\varphi}{dx}\,dx = -\varphi(\infty) + \varphi(x_1) = \varphi(x_1). \tag{32}$$

However, this functional is the Dirac delta function, viz.

$$\delta(x-x_1)[\varphi] = \int_{-\infty}^{\infty} \delta(x-x_1)\,\varphi(x)\,dx = \varphi(x_1). \tag{33}$$

It is a very important result that if a function is continuous or has only finite jump discontinuities it can be “promoted” to a generalized function by just using it with suitable test functions. However, if a function has an infinite discontinuity more care is required, as discussed in Section 4 with respect to the function $1/|\vec{r}|$.

It is possible and very useful to use distributions defined with test functions with support in a finite interval $(a,b)$ of the real axis. In fact, we are going to use this type of distribution in another article in this series.

Distributions have limitations. For example, they cannot, in general, be multiplied. Let us illustrate this point. We have seen that, to discuss $\delta(x-y)$, it is formally convenient to write equation (26). Likewise, to discuss the meaning of $\int_0^{\infty}\delta(x)\,dx$, we should consider $\int_{-\infty}^{\infty}\delta(x)\,H(x)\,\varphi(x)\,dx$, where $H(x)$ is the step function defined in equation (30), which renders $H(x)\varphi(x)$ discontinuous. Therefore,

$$\int_{-\infty}^{\infty} \delta(x)\,H(x)\,\varphi(x)\,dx$$

does not make sense in the Schwartz distribution theory. This explains why it is difficult, within Schwartz distribution theory, to decide the value of $X$ in the expression

$$\int_0^{\infty} \delta(x)\,dx = X.$$
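The dependence of $X$ on how the delta function is approached can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not a construction taken from the article: the Gaussian sequences, the shift $n^{-1/2}$, and the Gaussian stand-in for a test function are all choices made here. A sequence centred on the end point gives a value near $1/2$, while a sequence whose peak approaches the end point from inside the interval gives a value near $1$.

```python
import math

def gaussian_delta(x, n, center=0.0):
    """Gaussian of unit area and width 1/n centred at `center` (assumed sequence)."""
    return n / math.sqrt(math.pi) * math.exp(-(n * (x - center)) ** 2)

def phi(x):
    """Smooth, rapidly decaying stand-in for a test function, with phi(0) = 1."""
    return math.exp(-x * x)

def half_line_action(n, center, upper=2.0, steps=40000):
    """Midpoint-rule estimate of the integral over [0, upper] of delta_n * phi."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += gaussian_delta(x, n, center) * phi(x)
    return h * total

results = {}
for n in (10, 100, 1000):
    symmetric = half_line_action(n, 0.0)      # peak sits exactly on the end point
    shifted = half_line_action(n, n ** -0.5)  # peak approaches 0 from inside
    results[n] = (symmetric, shifted)
    print(n, symmetric, shifted)
```

Both sequences act like $\delta(x)$ on the whole real line, so the difference on the half line comes only from the way the end point is approached.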
Also, trying to solve the more general problem of calculating equation (21) with test functions defined on the intervals $[0,\infty)$ for $r$, $[0,2\pi]$ for $\phi$ and $[0,\pi]$ for $\theta$ is of no help, because the test functions vanish at the end points. So, we conclude that although

$$\int_{-\infty}^{\infty} \delta(x-x')\,F(x')\,dx' = F(x),$$

when the integral is taken over a finite interval,

$$\int_a^b \delta(x-x')\,F(x')\,dx',$$

and the point $x$ coincides with one of the ends of the interval of integration, the integral is undefined or requires great care. As mentioned before, we can circumvent this, as we shall show in Subsection 4.1.2 of this paper.

3.2. The Temple definition

The Temple [16] approach to distributions or generalized functions is simpler, but perhaps less general, than the Schwartz approach. Two good books using this approach are [8, 26]. According to Temple, generalized functions are special limits of sequences of functions. These special limits, also called weak limits, are defined as follows. A sequence of differentiable functions $f_n(x)$ ($n = 1, 2, \dots$) converges weakly to $f(x)$ if, for any test function $\varphi(x)$ (see above), the limit

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} f_n(x)\,\varphi(x)\,dx = \int_{-\infty}^{\infty} f(x)\,\varphi(x)\,dx \tag{34}$$

exists, even though classically $\lim_{n\to\infty} f_n(x)$ does not exist. Sometimes this is written as

$$\underset{n\to\infty}{\text{weak lim}}\; f_n(x) = f(x).$$

Here are two examples of such sequences:

$$G_n(x) = \frac{n}{\pi}\,\frac{1}{1+n^2x^2} \tag{35}$$

and

$$W_n(x) = \frac{\sin(nx)}{\pi x}. \tag{36}$$

These examples tend to the Dirac delta function, which is not a function classically. But, as we shall see, other functions can be “promoted” to distributions. Furthermore, the following sequence tends to the Heaviside distribution defined above:

$$H_n(x) = \frac{1}{2} + \frac{1}{\pi}\arctan(nx). \tag{37}$$

Differentiation in the Temple approach is as in the Schwartz definition,

$$\underset{n\to\infty}{\text{weak lim}}\; \frac{d}{dx} f_n(x) = \frac{d}{dx} f(x) = f'(x), \tag{38}$$

so that

$$\int_{-\infty}^{\infty} f'\,\varphi(x)\,dx = -\int_{-\infty}^{\infty} f\,\varphi'(x)\,dx, \tag{39}$$

that is, the result given by equation (29) holds.
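The weak limit (34) and the differentiation rule (39) can be checked numerically for the sequences (35) and (37). The following is a minimal sketch under assumptions made here for illustration: the test function (27) with $a = 1$, a midpoint rule on $[-1, 1]$, and the reading of $G_n$ as the derivative of $H_n$, that is $G_n(x) = n/[\pi(1+n^2x^2)]$.

```python
import math

def bump(x):
    """Test function of equation (27) with a = 1."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Classical derivative of bump."""
    if abs(x) >= 1.0:
        return 0.0
    d = 1.0 - x * x
    return bump(x) * (-2.0 * x / (d * d))

def G(x, n):
    """Sequence (35), read here as the derivative of H_n."""
    return n / (math.pi * (1.0 + (n * x) ** 2))

def H(x, n):
    """Sequence (37), a smooth approximation of the step function."""
    return 0.5 + math.atan(n * x) / math.pi

def integrate(f, lo=-1.0, hi=1.0, steps=100000):
    """Midpoint rule on [lo, hi]; the test function vanishes outside [-1, 1]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

checks = {}
for n in (10, 100, 1000):
    lhs = integrate(lambda x: G(x, n) * bump(x))         # integral of G_n * phi
    rhs = -integrate(lambda x: H(x, n) * bump_prime(x))  # minus integral of H_n * phi'
    checks[n] = (lhs, rhs)
    print(n, lhs, rhs, bump(0.0))
```

As $n$ grows, both columns approach $\varphi(0)$, and they agree with each other for every $n$, which is the content of equation (39) for this sequence.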
Remark 2: There are other definitions and alternative treatments of generalized functions. One that uses discontinuous test functions is due to Kurasov [27]. This theory may be used to solve some of the problems raised in [28].

4. What are the values of $\Delta\frac{1}{r}$ and $\nabla\cdot\frac{\vec{r}}{r^3}$?

In classical electromagnetism, that is, prior to the invention of distributions, these formulae were not known. We shall discuss below several ways of obtaining them, and the advantages they have over the approach used prior to distribution theory.

4.1. Obtaining $\Delta\frac{1}{r} = -4\pi\delta^3(\vec{r})$

We begin by discussing the case of $\Delta\frac{1}{r}$, where $r = |\vec{r}| = \sqrt{x^2+y^2+z^2}$ is the modulus of the vector $\vec{r} = x\vec{i} + y\vec{j} + z\vec{k}$ and $\Delta$ is the Laplacian operator, whose expression in Cartesian coordinates is

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{40}$$

In spherical coordinates the Laplacian is

$$\Delta = \frac{1}{r}\frac{\partial^2}{\partial r^2}\,r + \dots, \tag{41}$$

where we omitted contributions from the angular part of the operator. The fact that

$$\Delta\frac{1}{r} = -4\pi\delta^3(\vec{r}) \tag{42}$$

is the object of much discussion in the literature. In fact, there are countless papers on this subject. Here is a selection of the ones we find most useful [21, 29, 30, 31, 32].

In order to calculate $\Delta\frac{1}{r}$ we have to “promote” the function $1/r$ to a distribution. As mentioned above and discussed below, this is not a simple task because the classical function $1/r$ is mathematically not well defined for $r = 0$. The physical significance of the magnitude $1/r$ we are considering is the inverse of the length of the segment from $0$ to the point whose radial coordinate is $r$. As can be anticipated, the use of the strong or weak definition of the Dirac delta function in equation (42) requires that the definition of the distribution associated with $1/r$ be chosen carefully, as shown below.

4.1.1. The definition of the distribution $G^{B}_{1/r}[\varphi]$ used by S. M. Blinder

Our goal in this section is to prove equation (42) using the strong definition of the Dirac delta function consistently.
To this end, we analyze the definition of the distribution corresponding to the classical function $1/r$ that was used by S. M. Blinder in the second part of his article [21]. Moreover, let us follow Temple’s method to define a distribution that corresponds to the classical function $1/r$:

$$G^{B}_{1/r} = \underset{n\to\infty}{\text{weak lim}}\; \frac{1}{r}\,H\!\left(r-\frac{1}{n}\right), \tag{43}$$

that is, we are using a sequence of functions $H(r-\frac{1}{n})$ that tends to $H(r)$ when $n\to\infty$. A more careful definition of $1/r$, in three dimensions, will be given in Subsection 4.1.3, where this calculation is repeated in more detail. We shall omit the weak $\lim_{n\to\infty}$ detail in what follows, for simplicity. Then,

$$\Delta\frac{1}{r} = \Delta G^{B}_{1/r} = \frac{1}{r}\frac{\partial^2}{\partial r^2}H(r) \tag{44}$$

or

$$\Delta\frac{1}{r} = \frac{1}{r^2}\left[r\frac{\partial}{\partial r}\delta(r)\right] = -\frac{1}{r^2}\,\delta(r), \tag{45}$$

where the last equality was obtained using

$$-r\frac{\partial}{\partial r}\delta(r) = \delta(r). \tag{46}$$

For the sake of continuity of the argument, let us postpone the proof of equation (46). Now, using the expression for $\delta^3(\vec{r})$ given in equation (15) for the strong definition of the delta function, we can rewrite equation (45) as

$$\Delta\frac{1}{r} = -4\pi\delta^3(\vec{r}).$$

Hence, we have demonstrated equation (42). Therefore, this result can be consistently obtained with the use of the strong definition of the delta function, provided equation (46) holds true for this choice. So, let us obtain this equation.

In order to demonstrate equation (46) using the strong definition of the delta function we use a test function of the form

$$\varphi(r) = \begin{cases} \dfrac{g(r)}{r} & \text{for } r > 0, \\ 0 & \text{for } r < 0, \end{cases} \tag{47}$$

with $g(r)$ infinitely differentiable and satisfying

$$\lim_{r\to 0} r\,g(r) = 0. \tag{48}$$

Then, to demonstrate equation (46), we integrate its right-hand side divided by $r$, that is $\delta(r)/r$, multiplied by the test function $g(r)/r$, over the volume element $4\pi r^2\,dr$ to get

$$4\pi\int_0^{\infty} \frac{\delta(r)}{r}\,\frac{g(r)}{r}\,r^2\,dr = 4\pi\,g(0). \tag{49}$$
On the other hand, the left-hand side of equation (46), likewise divided by $r$, multiplied by the test function (47) and integrated over the volume element $4\pi r^2\,dr$, gives

$$4\pi\int_0^{\infty} -\frac{\partial\delta(r)}{\partial r}\,\frac{g(r)}{r}\,r^2\,dr = 4\pi\,g(0), \tag{50}$$

when we integrate the left-hand side of the last equation by parts. Let us go through the details of this last calculation, following page 34 of reference [18] with modifications:

$$\int_0^{\infty} -\frac{\partial\delta(r)}{\partial r}\,\frac{g(r)}{r}\,r^2\,dr = \int_0^{\infty} -\frac{\partial\delta(r)}{\partial r}\,[r\,g(r)]\,dr = \int_{-\infty}^{\infty} -\frac{\partial\delta(r)}{\partial r}\,[r\,g(r)]\,dr, \tag{51}$$

where, in the last step, we used that the test function (47) vanishes for $r < 0$. Now, equation (47) allows us to write the last integral as

$$\int_{-\infty}^{\infty} \delta(r)\,\frac{d}{dr}[r\,g(r)]\,dr = \int_{-\infty}^{\infty} \delta(r)\left[r\frac{dg(r)}{dr} + g(r)\right]dr = g(0). \tag{52}$$

This demonstrates the result.

4.1.2. The definition of the distribution $G^{KH}_{1/r}[\varphi]$ used by Ben Yu-Kuang Hu

Let us now see that, using a different definition of the distribution that corresponds to the classical function $1/r$, we can demonstrate equation (42) using the weak definition of the delta function. We follow here the article by Ben Yu-Kuang Hu [33], but with some modifications. First we define new spherical coordinates with $r$ ranging from $-\infty$ to $\infty$ and the polar angle $\theta$ ranging from $0$ to $\pi/2$. Then, we can define the distribution

$$G^{KH}_{1/r} = \frac{1}{r}\,\mathrm{sign}(r) = \frac{1}{r}\begin{cases} 1 & \text{for } r > 0, \\ -1 & \text{for } r < 0. \end{cases} \tag{53}$$

Once again, the magnitude $r$ is physically the distance from the origin to a point with coordinate $r$. Now we calculate

$$\Delta\left(G^{KH}_{1/r}\right) = \frac{1}{r}\frac{\partial^2}{\partial r^2}\,\mathrm{sign}(r) = \frac{2}{r}\frac{\partial}{\partial r}\delta(r) = -\frac{2\,\delta(r)}{r^2} = -4\pi\delta^3(\vec{r}), \tag{54}$$

where now we have used the results of equations (46) and (14). Hence, we have demonstrated equation (42), but using in the last step the weak definition of the delta function, that is, equation (14). Once again we used the result given in equation (46), so let us derive it for the weak definition of the delta function.
To do so, we use a test function $\varphi(r)$ whose support contains the origin and integrate its product with the left-hand side of equation (46):

$$-\int_{-\infty}^{\infty} \varphi(r)\,r\,\frac{\partial\delta(r)}{\partial r}\,dr = \int_{-\infty}^{\infty} \frac{\partial\,(r\varphi(r))}{\partial r}\,\delta(r)\,dr \tag{55}$$

$$= \int_{-\infty}^{\infty} \left(r\frac{\partial\varphi(r)}{\partial r} + \varphi(r)\right)\delta(r)\,dr = \int_{-\infty}^{\infty} \delta(r)\,\varphi(r)\,dr, \tag{56}$$

where the last equality follows from the fact that the derivative of $\varphi$ is continuous, so that $r\,\partial\varphi/\partial r\,|_{r=0} = 0$. Therefore, equation (46) also holds for the weak definition of the Dirac delta function.

In brief, we have obtained the well-known result

$$\Delta\frac{1}{r} = -4\pi\delta^3(\vec{r})$$

using both the weak and the strong definitions of the delta function. This should be no surprise. The two results are the same, but they follow from two different definitions of the distribution that corresponds to the classical function $1/r$. Another, more careful definition of this distribution is going to be presented, from the point of view of Schwartz, in Section 4.1.3 below. This definition is much more comprehensive than the two we have just presented. The same distribution, from the point of view of Temple, is going to be presented in a later subsection.

4.1.3. The definition of the distribution $G^{SW}_{1/r}[\varphi]$ used by Ray Skinner and John A. Weil

Another, clearer way to define the distribution $1/r$ is given by Ray Skinner and John A. Weil [17]. We now present this approach to show that equation (42) is true, making clear that this framework is more illuminating than the above demonstrations. We follow closely the presentation of [17]. The approach is to consider that both sides of equation (42) must be interpreted as distributions. That is, the classical operator $\Delta$ and the classical function $1/r$ must be “promoted” to a generalized operator acting on generalized functions and to a generalized function, respectively. Let us begin by “promoting” $1/r$ to a generalized function. The classical function $1/r$ and its derivatives $-1/r^2$, $2/r^3$, etc. “blow up” at $r = 0$.
Following Skinner and Weil, we can define the generalized function corresponding to $1/r$ as

$$G^{SW}_{1/r}[\varphi] = \int_S \varphi(\vec{r})\,r\,dr\,\sin\theta\,d\theta\,d\phi = \int_S \varphi(\vec{r})\,r\,dr\,d\Omega, \tag{57}$$

where $\varphi$ is any test function and $d\Omega = \sin\theta\,d\theta\,d\phi$ is the solid angle subtended at the origin by a volume element at $\vec{r}$. Similarly, the classical function $1/r^2$ corresponds to the generalized function

$$G^{SW}_{1/r^2}[\varphi] = \int_S \varphi(\vec{r})\,dr\,d\Omega. \tag{58}$$

Equations (57) and (58), however, do not uniquely define, from the mathematical point of view, a generalized function corresponding to $1/r$ and $1/r^2$ respectively, as we shall see at the end of this section. Nevertheless, Physics chooses the definitions (57) and (58) uniquely!

We can now calculate formula (42), which requires that we use for the generalized $\nabla$ operator a definition in agreement with equation (29). We start by calculating the generalized gradient of the generalized function corresponding to $1/r$ given by equation (57). We have

$$\nabla G^{SW}_{1/r}[\varphi] \equiv -\int_S [\nabla\varphi(\vec{r})]\,r\,dr\,d\Omega = -\int_S [\nabla\varphi(\vec{r})]\,\frac{1}{r}\,d^3V. \tag{59}$$

Next we evaluate the Laplacian of the generalized function $G^{SW}_{1/r}$ corresponding to $1/r$. We have, once again using equation (29),

$$\begin{aligned} \Delta G^{SW}_{1/r}[\varphi] = \nabla\cdot\nabla G^{SW}_{1/r}[\varphi] &= \int_S \nabla\cdot[\nabla\varphi(\vec{r})]\,r\,dr\,d\Omega \\ &\overset{\text{parts}}{=} \int_S \nabla\varphi(\vec{r})\cdot\frac{\vec{r}}{r}\,dr\,d\Omega = \int_S \frac{\partial\varphi}{\partial r}(\vec{r})\,dr\,d\Omega \\ &= 4\pi\left[\lim_{r\to\infty}\varphi(\vec{r}) - \lim_{r\to 0}\varphi(\vec{r})\right] = -4\pi\,\varphi(\vec{0}), \end{aligned} \tag{60}$$

where we “integrated by parts” in the second line. The integral of the angular part in the last line above is not straightforward. It can be evaluated by expanding $\varphi(\vec{r})$ in spherical harmonics, and we leave it to the careful reader to perform; see reference [18], page 33, for further information. Hence, we obtain equation (42), that is,

$$\Delta\frac{1}{r} = -4\pi\delta^3(\vec{r}).$$

A very important point is to show that the definitions of $1/r$ and $1/r^2$, as distributions, are determined by Physics.
In fact, the definitions (57) and (58) of the above quantities are unique only for $r > 0$. Take, for example, the definition given by equation (57) of the distribution corresponding to the classical function $1/r$. We could add to it any linear combination of $\delta^3(\vec{r})$ and its derivatives and we would still represent $1/r$ for $r > 0$. However, the electric potential defined by equation (57) is the one that satisfies the Maxwell equations in generalized form, as shown in [17]. So, the generalized form of the Maxwell equations defines the distributions we need to write them.

The conclusion of this section is that we can work with either value of $X$ in equation (1), provided we do it consistently. Nevertheless, we have to pay attention to the definitions, as distributions, of the magnitudes we are dealing with. We summarize a few features of the different approaches presented above in Table 1.

Table 1. Comparison between the weak and strong definitions of the Dirac delta function.

| Quantity | Weak value | Strong value |
|---|---|---|
| $\int_{-\infty}^{\infty}\delta(x)\,dx$ | $1$ | $1$ |
| $\int_0^{\infty}\delta(x)\,dx$ | $\frac{1}{2}$ | $1$ |
| $\delta^3(\vec{r})$ | $\dfrac{\delta(r)}{2\pi r^2}$ | $\dfrac{\delta(r)}{4\pi r^2}$ |
| $\Delta\frac{1}{r}$ | $-4\pi\delta^3(\vec{r})$ | $-4\pi\delta^3(\vec{r})$ |

4.2. Obtaining $\nabla\cdot\frac{\vec{r}}{r^3} = 4\pi\delta^3(\vec{r})$

The potential of a unit charge placed at the origin is given by $\psi(\vec{r}) = \frac{1}{r}$. Its gradient is

$$\nabla\left(\frac{1}{r}\right) = \frac{\partial}{\partial r}\left(\frac{1}{r}\right)\frac{\vec{r}}{r} + \dots = -\frac{\vec{r}}{r^3}. \tag{61}$$

Therefore, the Coulomb field associated with this charge is

$$\vec{E} = -\nabla\psi = \frac{\vec{r}}{r^3}. \tag{62}$$

Following Temple’s approach as in Section 4.1.1, we can define a distribution corresponding to this electric field as the limit of a sequence of vector functions that tend to it:

$$\underset{n\to\infty}{\text{weak lim}}\; \vec{E}\,H\!\left(r-\frac{1}{n}\right) = \underset{n\to\infty}{\text{weak lim}}\; \frac{1}{r^2}\,\frac{\vec{r}}{r}\,H\!\left(r-\frac{1}{n}\right). \tag{63}$$

The action of the electric field $\vec{E}$, as a distribution, on a test function $\varphi(\vec{r})$ is, using spherical coordinates,

$$G_{\vec{E}}[\varphi] = \int \vec{E}\,\varphi(\vec{r})\,d^3V = \lim_{n\to\infty}\int_{1/n}^{\infty}\int_0^{2\pi}\int_0^{\pi} \frac{\vec{e}_r}{r^2}\,\varphi(\vec{r})\,r^2\,dr\,\sin\theta\,d\theta\,d\phi, \tag{64}$$

where $\vec{e}_r = \vec{r}/r$ is the radial unit vector.
Now let us evaluate the divergence of the electric field in spherical coordinates,

$$\nabla\cdot\vec{E} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2E_r\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\left(E_\theta\sin\theta\right) + \frac{1}{r\sin\theta}\frac{\partial E_\phi}{\partial\phi}, \tag{65}$$

which vanishes for $r > 0$. Analogously to equation (29), the divergence of this vector distribution is

$$\nabla\cdot G_{\vec{E}}[\varphi] = G_{\nabla\cdot\vec{E}}[\varphi] = \underset{n\to\infty}{\text{weak lim}} \int -\nabla\varphi(\vec{r})\cdot\vec{E}\,H\!\left(r-\frac{1}{n}\right)d^3V. \tag{66}$$

To proceed we integrate by parts, remembering that the surface term vanishes owing to the boundary condition on $\varphi$, and use the identity

$$\nabla\cdot(f\vec{E}) = \nabla f\cdot\vec{E} + f\,\nabla\cdot\vec{E} \tag{67}$$

to obtain

$$\begin{aligned} G_{\nabla\cdot\vec{E}}[\varphi] &= \lim_{n\to\infty}\int\left[\nabla H\!\left(r-\frac{1}{n}\right)\cdot\frac{\vec{e}_r}{r^2} + H\!\left(r-\frac{1}{n}\right)\nabla\cdot\left(\frac{\vec{e}_r}{r^2}\right)\right]\varphi(\vec{r})\,d^3V \\ &= \lim_{n\to\infty}\int_0^{\infty}\int_0^{2\pi}\int_0^{\pi} \delta\!\left(r-\frac{1}{n}\right)\frac{\vec{e}_r\cdot\vec{e}_r}{r^2}\,\varphi(\vec{r})\,r^2\,dr\,\sin\theta\,d\theta\,d\phi \\ &= \int 4\pi\delta^3(\vec{r})\,\varphi(\vec{r})\,d^3V = 4\pi\,\varphi(\vec{0}), \end{aligned} \tag{68}$$

where we employed equation (15) to go from the second to the third line. Remember that equation (15) is valid when we work with the strong definition of the Dirac delta function. Therefore, we have proved that

$$\nabla\cdot\left(\frac{\vec{r}}{r^3}\right) = 4\pi\delta^3(\vec{r}). \tag{69}$$

In close analogy with what was presented in Section 4.1.2, we can also show that this result is valid for the weak definition of the delta function. Once again, this section shows clearly that, using careful definitions of physical magnitudes, we can obtain well-known expressions using a sound mathematical formulation.

In Sections 3 and 4 of this article we explain how electromagnetic theory stood before the theory of distributions and the advantages gained after the electric field, magnetic field, electric potential, etc., are “promoted” to distributions, so that both sides of the equations are distributions. The end of Section 4 is especially important in this respect.
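The results (42) and (69) can be checked numerically in their weak form, in the spirit of equations (60) and (66). The sketch below is an illustration under assumptions made here, not the article's own calculation: the test function is taken to be the radial Gaussian $\varphi(r) = e^{-r^2}$ (rapidly decaying rather than of compact support), the region $r < 1/n$ is cut out as in equations (43) and (63), and the angular integration is replaced by its value $4\pi$ for a radial function.

```python
import math

def phi(r):
    """Radial stand-in for a test function, with phi(0) = 1 (an assumption)."""
    return math.exp(-r * r)

def dphi(r):
    """First radial derivative of phi."""
    return -2.0 * r * math.exp(-r * r)

def d2phi(r):
    """Second radial derivative of phi."""
    return (4.0 * r * r - 2.0) * math.exp(-r * r)

def radial_integral(f, lo, hi=8.0, steps=100000):
    """Midpoint rule on [lo, hi]; phi is negligible beyond r = 8."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def laplacian_action(n):
    """Integral of (1/r) * Laplacian(phi) over r > 1/n.

    For radial phi: Laplacian(phi) = phi'' + 2 phi'/r and d^3V = 4 pi r^2 dr,
    so the integrand reduces to 4 pi (r phi'' + 2 phi').
    """
    return 4.0 * math.pi * radial_integral(lambda r: r * d2phi(r) + 2.0 * dphi(r), 1.0 / n)

def divergence_action(n):
    """Minus the integral of grad(phi) . E over r > 1/n, with E = e_r / r^2.

    The factor 1/r^2 cancels the r^2 of the volume element, leaving -4 pi * integral of phi'.
    """
    return -4.0 * math.pi * radial_integral(dphi, 1.0 / n)

values = {}
for n in (10, 100, 1000):
    values[n] = (laplacian_action(n), divergence_action(n))
    print(n, values[n][0], values[n][1], 4.0 * math.pi * phi(0.0))
```

As $n$ grows, the first column approaches $-4\pi\varphi(0)$ and the second approaches $4\pi\varphi(0)$, in agreement with the right-hand sides of equations (42) and (69) acting on this test function.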
2. The problem
The value of the integral in equation (1) involving the Dirac delta function is the cause of some perplexity in the literature. The following integral,
$$\int_{-\infty}^{\infty}\delta(x)\,dx = 1,$$
is well known, but the value of equation (1) is the subject of some discussion in the literature.
According to G. Barton [18. G. Barton, Elements of Green’s Functions and Propagation (Oxford Science Publications, Oxford, 1989).], on page 33, the “strong definition of the delta function requires X = 1, … but some books choose $X = \frac{1}{2}$”. We quote below some books and articles that choose this last value:
(1) The first reference is on page 29 of [19]. No explanation for this is given, but we can think that the reasoning is as follows. We can write equation (1) as

$$\int_{0}^{\infty} \delta(x)\,dx = \frac{1}{2}\int_{-\infty}^{\infty} \delta(x)\,dx = \frac{1}{2}$$

since δ(x) is an even function. Here is a similar argument
then
Since equation (8) is an even function of x we have
(2) The second reference is given on page 791 of [20]. The justification for using equation (1) with X = 1/2 is that the n-dimensional delta function is given by

$$\delta^{n}(\vec r) = \frac{2\,\delta(r)}{\omega_n\, r^{\,n-1}} \tag{11}$$

where ω_n is the surface area of the n-dimensional unit sphere. In order for the integral of $\delta^{n}(\vec r)$ over the whole space to be one we must use equation (1) with X = 1/2.
Remark 1: Formula (14) is a simplified version of the argument given by Courant and Hilbert [20] described above in equation (11). Take for instance n = 3, then ω_3 = 4π.
(3) The third reference is given by Blinder [21] in his equation (7). His justification for this is that “the factor 1/2 reflects the fact that the delta function is located in one of the limits, so that, only half of the delta function is within the range of integration”.
(4) The fourth reference is given by [22]. The authors of this article propose a new definition of the Dirac delta function. According to them, the usual definition of the delta function is given by

$$\int_{a}^{b} \delta(x)\,dx = \begin{cases} 1 & \text{if } a < 0 < b, \\ 0 & \text{if the interval does not contain the origin,} \end{cases} \tag{12}$$

and is indeterminate if a or b equals zero, that is, if one of the end points of the interval of integration coincides with zero.

They propose, assuming that a < b, that the definition should be

$$\int_{a}^{b} \delta(x)\,dx = \begin{cases} 1 & \text{if } a < 0 < b, \\ 1/2 & \text{if } a = 0 \text{ or } b = 0, \\ 0 & \text{otherwise.} \end{cases} \tag{13}$$

This definition of the delta function was proposed earlier and independently by John von Neumann and David Hilbert [23]. This fact was pointed out by [24].
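The two values of X quoted above can be illustrated numerically. The following is only a sketch under stated assumptions, not an argument taken from the cited references: we approximate δ(x) by a sequence of narrow Gaussians and integrate over the half line. The names `gaussian_delta`, `one_sided_delta`, `midpoint_integral` and `half_line_value` are hypothetical, and the shift 3/n in the one-sided sequence is an arbitrary choice. An even sequence gives X = 1/2, whereas a sequence that approaches the origin from the right gives X close to 1, although both tend to δ(x) when tested against smooth functions.

```python
import math

def gaussian_delta(x, n):
    """Even nascent delta: (n / sqrt(pi)) * exp(-(n x)^2), centred at x = 0."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def one_sided_delta(x, n):
    """The same bump centred at x = 3/n, so it sits almost entirely at x > 0.
    The shift 3/n is an arbitrary illustrative choice; it vanishes as n grows."""
    return gaussian_delta(x - 3.0 / n, n)

def midpoint_integral(f, a, b, steps=100_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def half_line_value(delta_n, n, cutoff=10.0):
    """Approximate X_n, the integral of delta_n(x) over [0, cutoff]."""
    return midpoint_integral(lambda x: delta_n(x, n), 0.0, cutoff)

if __name__ == "__main__":
    for n in (10, 100):
        # First column tends to 1/2 (even sequence), second to 1 (one-sided).
        print(n, half_line_value(gaussian_delta, n), half_line_value(one_sided_delta, n))
```

In this picture the value of X is not a property of δ(x) alone but of how the approximating sequence is placed relative to the end point of the integration interval, which is consistent with the need for an explicit convention.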
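One consequence of the weak definition of the delta function is that, for the three-dimensional delta function, we should have

$$\delta^{3}(\vec r) = \frac{\delta(r)}{2\pi r^{2}} \tag{14}$$

with $r = |\vec r|$. However, if we use the strong definition of the delta function we have

$$\delta^{3}(\vec r) = \frac{\delta(r)}{4\pi r^{2}}. \tag{15}$$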
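In fact, consider $\delta^{3}(\vec r)$ in spherical coordinates. Since the delta function is at the origin it has no angular part and is in fact a function of r alone, proportional to δ(r) [25]. Therefore, using the weak definition of the delta function and equation (14), we can verify the consistency of the results

$$\int \delta^{3}(\vec r)\,d^{3}V = \int_{0}^{\infty} \frac{\delta(r)}{2\pi r^{2}}\,4\pi r^{2}\,dr = 2\int_{0}^{\infty}\delta(r)\,dr = 2\cdot\frac{1}{2} = 1. \tag{16}$$

However, using equation (15) and the strong definition also leads to a consistent result

$$\int \delta^{3}(\vec r)\,d^{3}V = \int_{0}^{\infty} \frac{\delta(r)}{4\pi r^{2}}\,4\pi r^{2}\,dr = \int_{0}^{\infty}\delta(r)\,dr = 1. \tag{17}$$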
At this point it is already evident that the use of the weak or the strong definitions of the Dirac delta function implies that derived results, like equations (14) and (15) must be consistently chosen.
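It is difficult to obtain the results (14) and (15) using distribution theory in the elementary form used by physicists. In fact, consider the delta function in spherical coordinates, which is given by [18]

$$\delta^{3}(\vec r - \vec r_0) = \frac{1}{r^{2}\sin\theta}\,\delta(r - r_0)\,\delta(\theta - \theta_0)\,\delta(\phi - \phi_0) \tag{18}$$

or

$$\delta^{3}(\vec r - \vec r_0) = \frac{1}{r^{2}}\,\delta(r - r_0)\,\delta(\cos\theta - \cos\theta_0)\,\delta(\phi - \phi_0). \tag{19}$$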
We now show that equation (18) satisfies one of the basic properties of the delta function and comment upon it. In fact, we have for
On the other hand, the limit is not well defined if we adopt the weak definition for the Dirac delta function. In order to see this fact, let us take the limit along a line r0 → 0 with ϕ0 and θ0 fixed. From equation (20) we obtain that
Clearly this last expression is problematic for the weak definition, since its value depends on ϕ0 and θ0 and, moreover, is not equal to one! Notwithstanding, this limit is well defined if we adopt the strong definition since the θ and ϕ are equal to zero independently of the values of θ0 and ϕ0. We can circumvent this dilemma, as we shall show in Section 4.1.2 of this paper.
Due to the above controversy, Gabriel Barton [18] prefers to use the so-called strong definition of the delta function, that is, the one that takes X = 1. His book, which we strongly recommend, carefully studies the two cases. We would now like to study this problem in the light of the Schwartz definition of generalized functions. As we have already mentioned, distributions is the name that Schwartz [7] used for these mathematical objects. Generalized functions is the name that G. Temple [9] prefers for the Schwartz distributions.
3. Distributions according to Schwartz and Temple
In this section we present two approaches to distributions: that of Schwartz and that of Temple. Temple's approach is the simpler of the two; Schwartz's, however, is more general.
3.1. The Schwartz definition
In order to define a distribution or a generalized function, as they are also known, we recall the definition of a functional. A functional F is a mathematical object that acts on functions and produces a number. So if φ(x) is a function and F is a functional we can write F[φ] = Number. A simple example of a functional is given by a function f(x) that acts on a “good function” φ(x) as follows:

$$F[\varphi] = \int_{-\infty}^{\infty} f(x)\,\varphi(x)\,dx. \tag{22}$$

The function φ(x) should be a good function so that the integral is well defined. We say that the function f(x) generates the functional.

Equation (22) can be generalized to three dimensions as follows

$$F[\varphi] = \int f(x,y,z)\,\varphi(x,y,z)\,d^{3}V \tag{23}$$

where d³V = dx dy dz and φ(x,y,z) is a good function in three dimensions.
A distribution or generalized function is a linear and continuous functional acting upon the space of the “good functions” φ(x). Distributions that are generated by functions as in equation (22) are called regular distributions. We shall denote distributions generated by a function f(x) by $G_f[\varphi]$.
Distributions that are not generated by functions are called irregular.
The Dirac delta function in one dimension is defined by the following functional

$$G_{\delta}[\varphi] = \varphi(y) \tag{24}$$

where φ(y) is the value of φ at the point y, that is, a number. The Dirac delta function is an irregular distribution, because there is no function that satisfies equation (26). Equation (24) is very often written as
The last equation is also written as

$$\int_{-\infty}^{\infty} \delta(x - y)\,\varphi(x)\,dx = \varphi(y). \tag{26}$$
The function φ(x) in the Schwartz theory is called a “test function”. It is a function whose support (the region where the function is different from zero) lies in a finite interval of the real line. Moreover, it is continuous and infinitely differentiable.
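A classical example of a “test function” is given by

$$\varphi(x) = \begin{cases} \exp\!\left(-\dfrac{a^{2}}{a^{2}-x^{2}}\right) & \text{for } |x| < a, \\ 0 & \text{for } |x| \ge a, \end{cases}$$

whose support is the open set (−a,a) and which is infinitely differentiable.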
Physical magnitudes that have no singularities can be simply “promoted” to regular distributions by inserting them in place of f(x,y,z) in equation (23). So, consider the x component of the electric field, which gives rise to the regular distribution

$$G_{E_x}[\varphi] = \int E_x(x,y,z)\,\varphi(x,y,z)\,d^{3}V$$

where φ(x,y,z) is a three-dimensional test function.
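It is common to hear that the derivative of a function with a finite jump discontinuity contains a delta function multiplied by this jump. This should be clarified as follows. Suppose you have a function f(x). Promote this to a distribution, that is, consider that it generates a distribution. The distribution so generated is infinitely differentiable. For example, the first derivative of a distribution generated by f acting on a test function φ(x) is by definition

$$\frac{d}{dx}G_f[\varphi] \equiv -\int_{-\infty}^{\infty} f(x)\,\frac{d\varphi(x)}{dx}\,dx. \tag{29}$$

We then recall the definition of the step (Heaviside) function

$$H(x) = \begin{cases} 1 & \text{for } x > 0, \\ 0 & \text{for } x < 0. \end{cases} \tag{30}$$

Consider now H(x−x1) as a distribution. The space of test functions φ(x) is the set of infinitely differentiable functions defined in a finite interval of the real numbers containing the point x1. So, the distribution is the functional

$$G_H[\varphi] = \int_{-\infty}^{\infty} H(x - x_1)\,\varphi(x)\,dx = \int_{x_1}^{\infty} \varphi(x)\,dx.$$

According to equation (29) the derivative of G_H[φ] is

$$\frac{d}{dx}G_H[\varphi] = -\int_{x_1}^{\infty} \frac{d\varphi(x)}{dx}\,dx = \varphi(x_1),$$

where the contribution from the upper limit vanishes because φ has bounded support. However, this functional is the Dirac delta function, viz.

$$\frac{d}{dx}G_H[\varphi] = \int_{-\infty}^{\infty} \delta(x - x_1)\,\varphi(x)\,dx = \varphi(x_1).$$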
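The statement that the derivative of the step distribution acts as a delta function can be checked numerically. The sketch below is illustrative only: it assumes a particular bump-shaped test function with support (−A, A), and the names `bump`, `bump_derivative` and `heaviside_derivative_action` are hypothetical. It evaluates minus the integral of H(x − x1) φ′(x), as prescribed by equation (29), and compares the result with φ(x1).

```python
import math

A = 1.0  # half-width of the support (-A, A); an arbitrary illustrative choice

def bump(x):
    """Test function exp(-A^2 / (A^2 - x^2)) on (-A, A), zero elsewhere."""
    if abs(x) >= A:
        return 0.0
    return math.exp(-A * A / (A * A - x * x))

def bump_derivative(x):
    """Analytic derivative of bump(x)."""
    if abs(x) >= A:
        return 0.0
    d = A * A - x * x
    return bump(x) * (-2.0 * A * A * x / (d * d))

def heaviside_derivative_action(x1, steps=100_000):
    """Evaluate -integral of H(x - x1) * phi'(x) dx, i.e. minus the integral
    of phi'(x) from x1 to A, with a composite midpoint rule."""
    h = (A - x1) / steps
    return -h * sum(bump_derivative(x1 + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    for x1 in (-0.5, 0.0, 0.3):
        # The two printed numbers should agree: the functional returns phi(x1).
        print(x1, heaviside_derivative_action(x1), bump(x1))
```

No delta function appears anywhere in the computation; the value φ(x1) emerges solely from moving the derivative onto the test function, which is the content of the definition.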
It is a very important result that if a function is continuous or has only finite jump discontinuities it can be “promoted” to a generalized function by just using it with suitable test functions. However, if a function has an infinite discontinuity more care is required, as discussed in Section 4 with respect to the function 1/r.
It is possible and very useful to use distributions defined with test functions with support in a finite interval (a,b) of the real axis. In fact, we are going to use this type of distributions in another article in this series.
Distributions have limitations. For example, they cannot, in general, be multiplied by one another. Let us illustrate this point. We have seen that, to discuss δ(x−y), it is formally convenient to write equation (26). Likewise, to discuss the meaning of δ(x)H(x), we should consider

$$\int_{-\infty}^{\infty} \delta(x)\,H(x)\,\varphi(x)\,dx$$

where H(x) is the step function defined in equation (30), which renders H(x)φ(x) discontinuous. Therefore, the product

$$\delta(x)\,H(x)$$

does not make sense in the Schwartz distribution theory. This explains why it is difficult to solve the problem of what is the value of X in the expression

$$\int_{0}^{\infty} \delta(x)\,dx = X$$

in Schwartz distribution theory.
Also, attempting to solve the more general problem of calculating equation (21) with test functions defined on the intervals [0, ∞) for r, [0,2π] for ϕ and [0,π] for θ is of no help, because the test functions vanish at the end points.
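So, we conclude that although

$$\int_{-\infty}^{\infty} \delta(x - x')\,\varphi(x)\,dx = \varphi(x'),$$

when the integral is over a finite interval

$$\int_{a}^{b} \delta(x - x')\,\varphi(x)\,dx$$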
and when the variable x′ coincides with one of the ends of the interval of integration, the above integral is undefined or requires great care. As mentioned before, we can circumvent this, as we shall show in Subsection 4.1.2 of this paper.
3.2. The Temple definition
The Temple [16] approach to distributions or generalized functions is simpler but, perhaps, less general than Schwartz's approach. Two good books using this approach are [8, 26].
According to Temple, generalized functions are special limits of sequences of functions. These special limits, also called weak limits, are defined as follows: a sequence of differentiable functions f_n(x) (n = 1,2, …) converges weakly to f(x) if for any test function φ(x) (see above) the limit

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} f_n(x)\,\varphi(x)\,dx$$

exists, even though classically lim_{n → ∞} f_n(x) may not exist. Sometimes this is written as

$$\underset{n\to\infty}{\text{weak lim}}\; f_n(x) = f(x).$$
Here are two examples of such sequences:
and
These examples tend to the Dirac delta function, which is not a function classically. But, as we shall see other functions can be “promoted” to distributions. Furthermore, the following sequence tends to the above Heaviside distribution
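Differentiation in the Temple approach is like in the Schwartz definition,

$$\int_{-\infty}^{\infty} \frac{df_n(x)}{dx}\,\varphi(x)\,dx = -\int_{-\infty}^{\infty} f_n(x)\,\frac{d\varphi(x)}{dx}\,dx$$

so that

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} \frac{df_n(x)}{dx}\,\varphi(x)\,dx = -\lim_{n\to\infty}\int_{-\infty}^{\infty} f_n(x)\,\frac{d\varphi(x)}{dx}\,dx,$$

that is, the result given by equation (29) holds.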
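Temple's weak limit can be made concrete with a short numerical sketch. The two sequences used here, a Gaussian and a Lorentzian, are standard choices that we assume purely for illustration; they are not necessarily the examples intended in the text, and all function names are hypothetical. For each n the code computes the integral of f_n(x) φ(x) for a bump test function supported on (−1, 1) and shows that it approaches φ(0), even though f_n(0) itself grows without bound.

```python
import math

def bump(x):
    """Test function exp(-1 / (1 - x^2)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def gaussian_sequence(x, n):
    """f_n(x) = (n / sqrt(pi)) * exp(-(n x)^2)."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def lorentzian_sequence(x, n):
    """f_n(x) = n / (pi * (1 + (n x)^2))."""
    return n / (math.pi * (1.0 + (n * x) ** 2))

def pairing(f_n, n, steps=200_000):
    """Midpoint-rule value of the integral of f_n(x) * bump(x) over (-1, 1)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        total += f_n(x, n) * bump(x)
    return total * h

if __name__ == "__main__":
    print("phi(0) =", bump(0.0))
    for n in (10, 100, 1000):
        # Both columns approach phi(0); the Lorentzian converges more slowly.
        print(n, pairing(gaussian_sequence, n), pairing(lorentzian_sequence, n))
```

The slower convergence of the Lorentzian sequence reflects its long tails, but the weak limit is the same, which is all that matters in Temple's definition.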
Remark 2: There are other definitions and alternative treatments of generalized functions. One that uses discontinuous test functions is by Kurasov [27]. This theory may be used to solve some of the problems raised in [28].
4. What are the values of $\Delta\frac{1}{r}$ and $\nabla\cdot\frac{\vec r}{r^{3}}$?
In classical electromagnetism, that is, prior to the invention of distributions these formulae were not known. We shall discuss below several ways of obtaining them, and the advantages they have over the approach used prior to distribution theory.
4.1. Obtaining $\Delta\frac{1}{r} = -4\pi\delta^{3}(\vec r)$
We begin by discussing the case of $\Delta\frac{1}{r}$, where $r = |\vec r| = \sqrt{x^{2}+y^{2}+z^{2}}$ is the modulus of the vector $\vec r = x\vec i + y\vec j + z\vec k$ and Δ is the Laplacian operator, whose expression in Cartesian coordinates is

$$\Delta = \frac{\partial^{2}}{\partial x^{2}} + \frac{\partial^{2}}{\partial y^{2}} + \frac{\partial^{2}}{\partial z^{2}}. \tag{40}$$

In spherical coordinates the Laplacian is

$$\Delta = \frac{1}{r}\frac{\partial^{2}}{\partial r^{2}}\,r + \dots \tag{41}$$

where we omitted contributions from the angular part of the operator.

The fact that

$$\Delta\frac{1}{r} = -4\pi\delta^{3}(\vec r) \tag{42}$$

is the object of much discussion in the literature. In fact, there are countless papers about this subject. Here is a selection of the ones we find most useful [21, 29, 30, 31, 32].
In order to calculate $\Delta\frac{1}{r}$ we have to “promote” the function 1/r to a distribution. As mentioned above and discussed below, this is not a simple task because the classical function 1/r is mathematically not well defined for r = 0. The physical significance of the magnitude 1/r we are considering is the inverse of the length of the segment from 0 to the point whose radial coordinate is r.
As can be anticipated, the use of the strong or weak definition of the Dirac delta function in equation (42) requires that the definition of the distribution associated with 1/r be chosen carefully, as shown below.
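Before entering into the different definitions, equation (42) can be checked numerically in its weak form: applying the rule of equation (29) twice moves the Laplacian onto the test function, so the claim is that the integral of (1/r) Δφ over all space equals −4πφ(0). The sketch below assumes, for illustration only, the radial test function φ(r) = exp(−r²), which is rapidly decaying rather than of bounded support; the function names are hypothetical.

```python
import math

def phi(r):
    """Radial test function exp(-r^2), with phi(0) = 1."""
    return math.exp(-r * r)

def laplacian_phi(r):
    """Radial Laplacian phi'' + (2/r) phi' of exp(-r^2), i.e. (4 r^2 - 6) exp(-r^2)."""
    return (4.0 * r * r - 6.0) * math.exp(-r * r)

def laplacian_of_inverse_r_action(cutoff=10.0, steps=100_000):
    """Midpoint-rule value of the integral of (1/r) * Laplacian(phi) over all
    space, using the volume element 4 pi r^2 dr."""
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += (1.0 / r) * laplacian_phi(r) * 4.0 * math.pi * r * r * h
    return total

if __name__ == "__main__":
    # The two printed numbers should agree.
    print(laplacian_of_inverse_r_action(), -4.0 * math.pi * phi(0.0))
```

The integrand is finite everywhere because the factor r² of the volume element cancels the singularity of 1/r, so no convention for the delta function at r = 0 is needed in this formulation.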
4.1.1. The definition of the distribution $G^{B}_{1/r}[\varphi]$ used by S. M. Blinder
Our goal in this section is to prove equation (42) using the strong definition of the Dirac delta function consistently. To this end, we analyze the definition of the distribution corresponding to the classical function 1/r that was used by S. M. Blinder in the second part of his article [21]. Moreover, let us follow Temple’s method to define a distribution that corresponds to the classical function 1/r:

$$G^{B}_{1/r} = \underset{n\to\infty}{\text{weak lim}}\; \frac{1}{r}\,H\!\left(r - \frac{1}{n}\right), \tag{43}$$

that is, we are using a sequence of functions $H\!\left(r - \frac{1}{n}\right)$ that tends to H(r) when n → ∞. A more careful definition of 1/r, in three dimensions, will be given in Subsection 4.1.3, where this calculation is repeated in more detail. We shall omit the weak lim_{n → ∞} detail in what follows, for simplicity. Then,
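or

$$\Delta \frac{1}{r} = \frac{1}{r^{2}}\left[ r \frac{\partial}{\partial r}\delta(r) \right] = -\frac{1}{r^{2}}\,\delta(r), \tag{45}$$

where the last equality was obtained using

$$- r \frac{\partial}{\partial r}\delta(r) = \delta(r). \tag{46}$$

For the sake of continuity of the argument, let us postpone the proof of equation (46). Now, using the expression for $\delta^{3}(\vec{r})$ given in equation (15) for the strong definition of the delta function we can rewrite equation (45) as

$$\Delta \frac{1}{r} = -4\pi\,\delta^{3}(\vec{r}).$$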
Hence, we have demonstrated equation (42). Therefore, this result can be consistently obtained with the use of the strong definition of the delta function provided equation (46) holds true for this choice. So, let us obtain this equation.
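In order to demonstrate equation (46) using the strong definition of the delta function we use a test function of the form

$$\varphi(r) = \begin{cases} \dfrac{g(r)}{r} & \text{for } r > 0 \\[4pt] 0 & \text{for } r < 0 \end{cases} \tag{47}$$

with g(r) infinitely differentiable and satisfying

$$\lim_{r \to 0} r\, g(r) = 0. \tag{48}$$

Then, to demonstrate equation (46) we integrate its right-hand side divided by r, that is $\delta(r)/r$, multiplied by the test function $g(r)/r$ in a volume $4\pi r^{2}\,dr$ to get

$$4\pi \int_{0}^{\infty} \frac{\delta(r)}{r}\, \frac{g(r)}{r}\, r^{2}\, dr = 4\pi\, g(0). \tag{49}$$

On the other hand, the left-hand side of equation (46), likewise divided by r, multiplied by the test function (47), and integrated in the volume $4\pi r^{2}\,dr$ gives

$$4\pi \int_{0}^{\infty} -\frac{\partial \delta(r)}{\partial r}\, \frac{g(r)}{r}\, r^{2}\, dr = 4\pi\, g(0), \tag{50}$$

when we integrate the left-hand side of the last equation by parts. Let us work out the details of this last calculation, following page 34 of reference [18] with modifications:

$$\int_{0}^{\infty} -\frac{\partial \delta(r)}{\partial r}\, \frac{g(r)}{r}\, r^{2}\, dr = \int_{0}^{\infty} -\frac{\partial \delta(r)}{\partial r}\, [\, r\, g(r)\, ]\, dr = \int_{-\infty}^{\infty} -\frac{\partial \delta(r)}{\partial r}\, [\, r\, g(r)\, ]\, dr \tag{51}$$

where we used that g(r) vanishes for r < 0 in the last step. Now, equation (47) allows us to write the last integral as

$$\int_{-\infty}^{\infty} \delta(r)\, \frac{d}{dr}[\, r\, g(r)\, ]\, dr = \int_{-\infty}^{\infty} \delta(r) \left[ r \frac{d g(r)}{dr} + g(r) \right] dr = g(0). \tag{52}$$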
This demonstrates the result.
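4.1.2. The definition of the distribution $G_{1/r}^{KH}[\varphi]$ used by Ben Yu-Kuang Hu
Let us now see that using a different definition of the distribution that corresponds to the classical function 1/r, we can demonstrate equation (42) using the weak definition of the delta function. We follow here the article by Ben Yu-Kuang Hu [33], but with some modifications.
First we define new spherical coordinates with r ranging from −∞ to ∞ and the polar angle θ ranging from 0 to π/2. Then, we can define the distribution

$$G_{1/r}^{KH} = \frac{1}{r}\,\operatorname{sign}(r) = \frac{1}{r} \begin{cases} 1 & \text{for } r > 0 \\ -1 & \text{for } r < 0 \end{cases}. \tag{53}$$

Once again, the magnitude r is physically the distance from the origin to a point with coordinate r.
Now we calculate

$$\Delta\!\left( G_{1/r}^{KH} \right) = \frac{1}{r}\frac{\partial^{2}}{\partial r^{2}}\operatorname{sign}(r) = \frac{2}{r}\frac{\partial}{\partial r}\delta(r) = -\frac{2\,\delta(r)}{r^{2}} = -4\pi\,\delta^{3}(\vec{r}), \tag{54}$$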
where now we have used the results of equations (46) and (14). Hence, we have demonstrated equation (42) but using in the last step the weak definition of the delta function, that is equation (14).
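Once again we used the result given in equation (46), so let us derive it for the weak definition of the delta function. To do so, we use a test function φ(r) whose support contains the origin and integrate its product with the left-hand side of equation (46):

\begin{align}
-\int_{-\infty}^{\infty} \varphi(r)\, r\, \frac{\partial \delta(r)}{\partial r}\, dr &= \int_{-\infty}^{\infty} \frac{\partial\, ( r \varphi(r) )}{\partial r}\, \delta(r)\, dr \tag{55} \\
&= \int_{-\infty}^{\infty} \left( r \frac{\partial \varphi(r)}{\partial r} + \varphi(r) \right) \delta(r)\, dr = \int_{-\infty}^{\infty} \delta(r)\, \varphi(r)\, dr \tag{56}
\end{align}

where the last equality follows from the fact that the derivative of φ is continuous, so that $r\,\partial\varphi/\partial r\,|_{r=0} = 0$. Therefore, equation (46) also holds for the weak definition of the Dirac delta function.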
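Equation (46) can also be checked numerically. The sketch below is only an illustration, not part of the original derivations. It replaces δ(r) by a narrow Gaussian of unit area (a "nascent" delta). As an assumption of this sketch, the weak definition is modelled by a Gaussian centred at the origin and the strong definition by a Gaussian shifted slightly into r > 0, in the spirit of the sequence H(r − 1/n) of equation (43). The names `nascent_delta`, `integrate` and `phi` are hypothetical.

```python
import math

def nascent_delta(x, n, center=0.0):
    """Gaussian of unit area and width ~1/n; tends to a delta as n grows."""
    return n / math.sqrt(math.pi) * math.exp(-(n * (x - center)) ** 2)

def nascent_delta_prime(x, n, center=0.0):
    """Derivative of nascent_delta with respect to x."""
    return -2.0 * n * n * (x - center) * nascent_delta(x, n, center)

def integrate(f, a, b, steps=100_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def phi(x):
    """Hypothetical smooth test function with phi(0) = 1."""
    return math.exp(-x * x) * (1.0 + 0.5 * x)

n = 1000

# Weak-type model: delta centred at the origin, integrals over the whole line.
lhs_weak = integrate(lambda x: -x * nascent_delta_prime(x, n) * phi(x), -1.0, 1.0)
rhs_weak = integrate(lambda x: nascent_delta(x, n) * phi(x), -1.0, 1.0)
mass_weak = integrate(lambda x: nascent_delta(x, n), 0.0, 1.0)

# Strong-type model (assumption): delta shifted to r = 5/n, integrals over r > 0.
c = 5.0 / n
lhs_strong = integrate(lambda x: -x * nascent_delta_prime(x, n, c) * phi(x), 0.0, 1.0)
rhs_strong = integrate(lambda x: nascent_delta(x, n, c) * phi(x), 0.0, 1.0)
mass_strong = integrate(lambda x: nascent_delta(x, n, c), 0.0, 1.0)

print("weak:  ", lhs_weak, rhs_weak, mass_weak)
print("strong:", lhs_strong, rhs_strong, mass_strong)
```

In both models the two sides of equation (46), paired with the test function, approach φ(0) as n grows, while the half-line integral of the nascent delta approaches 1/2 in the centred case and 1 in the shifted case.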
In brief, we have obtained the well-known result

$$\Delta \frac{1}{r} = -4\pi\,\delta^{3}(\vec{r})$$
using both the weak and the strong definitions of the delta function.
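A numerical cross-check of this result that does not depend on either definition of the one-dimensional delta function is sketched below. It uses a technique different from the ones in the text: the potential 1/r is replaced by the smooth function 1/√(r² + ε²), whose Laplacian is −3ε²/(r² + ε²)^(5/2) everywhere, and this Laplacian is integrated against a radial test function over all space. The regularization, the test function and all function names are assumptions made for illustration.

```python
import math

def laplacian_of_smoothed_potential(r, eps):
    """Laplacian of 1/sqrt(r^2 + eps^2), worked out analytically."""
    return -3.0 * eps ** 2 / (r * r + eps * eps) ** 2.5

def phi(r):
    """Hypothetical radial test function with phi(0) = 1."""
    return math.exp(-r * r)

def pairing(eps, r_max=10.0, steps=200_000):
    """Midpoint-rule integral of Laplacian * phi over space (4 pi r^2 dr)."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += laplacian_of_smoothed_potential(r, eps) * phi(r) * 4.0 * math.pi * r * r
    return total * h

# Ratio of the pairing to -4 pi phi(0); it should approach 1 as eps shrinks.
ratios = {eps: pairing(eps) / (-4.0 * math.pi * phi(0.0)) for eps in (0.1, 0.03, 0.01)}
for eps, ratio in ratios.items():
    print(eps, ratio)
```

As ε decreases the ratio tends to 1, which is the statement that the Laplacian of the regularized potential acts on test functions like −4πδ³.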
This should be no surprise. The two results are the same but they result from two different definitions of the distribution that correspond to the classical function 1/r. Another more careful definition of this distribution is going to be presented, from the point of view of Schwartz, in Section 4.1.3 below. This definition is much more comprehensive than the two we have just presented. The same distribution, from the point of view of Temple is going to be presented in the new subsection.
4.1.3. The definition of the distribution $G_{1/r}^{SW}[\varphi]$ used by Ray Skinner and John A. Weil
Another, clearer way to define the distribution 1/r is given by Ray Skinner and John A. Weil [17]. We now present this approach to show that equation (42) is true, making clear that this framework is more illuminating than the above demonstrations. We follow closely the presentation of [17].
The approach is to consider that both sides of equation (42) must be interpreted as distributions. That is, the classical operator Δ and the classical function 1/r must be “promoted” to a generalized operator acting on generalized functions and to a generalized function, respectively.
Let us begin by “promoting” 1/r to a generalized function. The classical function 1/r and its derivatives $-1/r^{2}$, $2/r^{3}$, etc. “blow up” at r = 0. Following Skinner and Weil we can define the generalized function corresponding to 1/r as

$$G_{1/r}^{SW}[\varphi] = \int_{S} \varphi(\vec{r})\, r\, dr\, \sin\theta\, d\theta\, d\phi = \int_{S} \varphi(\vec{r})\, r\, dr\, d\Omega \tag{57}$$

where φ is any test function and dΩ = sin θ dθ dϕ is the solid angle subtended from the origin by a volume element located at $\vec{r}$. Similarly, the classical function $1/r^{2}$ corresponds to the generalized function

$$G_{1/r^{2}}^{SW}[\varphi] = \int_{S} \varphi(\vec{r})\, dr\, d\Omega. \tag{58}$$

Equations (57) and (58), however, do not define univocally, from the mathematical point of view, a generalized function corresponding to 1/r and $1/r^{2}$ respectively, as we shall see at the end of this section. Nevertheless, Physics chooses the definitions (57) and (58) uniquely!
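We can now derive formula (42), which requires a definition of the generalized ∇ operator in agreement with equation (29). We start by calculating the generalized gradient of the generalized function corresponding to 1/r given by equation (57). We have

$$\nabla G_{1/r}^{SW}[\varphi] \equiv -\int_{S} [\nabla \varphi(\vec{r})]\, r\, dr\, d\Omega = -\int_{S} [\nabla \varphi(\vec{r})]\, \frac{1}{r}\, d^{3}V \tag{59}$$

Next we evaluate the Laplacian of the generalized function $G_{1/r}^{SW}$ corresponding to 1/r. We have, once again using equation (29),

\begin{align}
\Delta G_{1/r}^{SW}[\varphi] &= \nabla \cdot \nabla G_{1/r}^{SW}[\varphi] = \int_{S} \nabla \cdot [\nabla \varphi(\vec{r})]\, r\, dr\, d\Omega \notag \\
&\overset{\text{parts}}{=} \int_{S} \nabla \varphi(\vec{r}) \cdot \frac{\vec{r}}{r}\, dr\, d\Omega = \int_{S} \frac{\partial \varphi}{\partial r}(\vec{r})\, dr\, d\Omega \notag \\
&= 4\pi \left[ \lim_{r \to \infty} \varphi(\vec{r}) - \lim_{r \to 0} \varphi(\vec{r}) \right] = -4\pi\, \varphi(\vec{0}), \tag{60}
\end{align}

where we “integrated by parts” in the second line. The angular integral in the last line above is not straightforward. It can be evaluated by expanding $\varphi(\vec{r})$ in spherical harmonics, and we leave it to the careful reader; see reference [18], page 33, for further information. Hence, we obtain equation (42), that is

$$\Delta \frac{1}{r} = -4\pi\,\delta^{3}(\vec{r}).$$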
A very important point is to show that the definitions of 1/r and $1/r^{2}$, as distributions, are determined by Physics. The definitions (57) and (58) of the above quantities are unique only for r > 0. Take for example the definition given by equation (57) of the distribution corresponding to the classical function 1/r. We could add to it any linear combination of $\delta^{3}(\vec{r})$ and its derivatives and we would still represent 1/r for r > 0. However, the electrical potential defined by equation (57) is the one that satisfies Maxwell equations in generalized form as shown in [17]. So, the generalized form of the Maxwell equations defines the distributions we need to write them.
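The Laplacian computation of equation (60) can be reproduced numerically for a concrete test function. The sketch below is an illustration under stated assumptions: it takes the radial test function φ(r) = exp(−r²), which is not taken from the source, works out its Laplacian by hand, and evaluates definition (57) on Δφ, which is the first integral in equation (60). For a radial integrand the angular integration simply contributes a factor 4π. The function names are hypothetical.

```python
import math

def phi(r):
    """Hypothetical radial test function."""
    return math.exp(-r * r)

def laplacian_phi(r):
    """Radial Laplacian phi'' + (2/r) phi' of exp(-r^2), worked out by hand."""
    return (4.0 * r * r - 6.0) * math.exp(-r * r)

def G_one_over_r(f, r_max=12.0, steps=200_000):
    """Definition (57) for a radial function f: 4 pi times the integral of f(r) r dr."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h          # midpoint of the i-th subinterval
        total += f(r) * r
    return 4.0 * math.pi * total * h

laplacian_pairing = G_one_over_r(laplacian_phi)   # first integral in equation (60)
expected = -4.0 * math.pi * phi(0.0)              # right-hand side of equation (60)
print(laplacian_pairing, expected)
```

The two printed numbers agree to within the quadrature error, as equation (60) requires for this particular test function.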
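The conclusion of this section is that we can work with either value of X in equation (1), provided we do it consistently. Nevertheless, we have to pay attention to how the quantities we are dealing with are defined as distributions. We summarize a few features of the different approaches presented above in Table 1.

Table 1 Comparison between the weak and strong definitions of the Dirac delta function.

| Quantity | Weak value | Strong value |
| --- | --- | --- |
| $\int_{-\infty}^{\infty} \delta(x)\, dx$ | $1$ | $1$ |
| $\int_{0}^{\infty} \delta(x)\, dx$ | $\frac{1}{2}$ | $1$ |
| $\delta^{3}(\vec{r})$ | $\frac{\delta(r)}{2\pi r^{2}}$ | $\frac{\delta(r)}{4\pi r^{2}}$ |
| $\Delta \frac{1}{r}$ | $-4\pi\,\delta^{3}(\vec{r})$ | $-4\pi\,\delta^{3}(\vec{r})$ |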
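As a quick consistency check of the two radial expressions for $\delta^{3}(\vec{r})$ listed in Table 1, one can verify that each of them integrates to one over all space when it is combined with the corresponding value of the half-line integral of δ. The following lines are a worked sketch of that arithmetic only.

```latex
\begin{align*}
\text{weak:}\quad
\int \delta^{3}(\vec{r})\, d^{3}r
  &= \int_{0}^{\infty} \frac{\delta(r)}{2\pi r^{2}}\, 4\pi r^{2}\, dr
   = 2 \int_{0}^{\infty} \delta(r)\, dr
   = 2 \cdot \tfrac{1}{2} = 1, \\
\text{strong:}\quad
\int \delta^{3}(\vec{r})\, d^{3}r
  &= \int_{0}^{\infty} \frac{\delta(r)}{4\pi r^{2}}\, 4\pi r^{2}\, dr
   = \int_{0}^{\infty} \delta(r)\, dr
   = 1 .
\end{align*}
```

The factor of two between the weak and strong radial forms therefore compensates exactly for the different values of the half-line integral.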
4.2. Obtaining
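The potential of a unit charge placed at the origin is given by 1/r. Its gradient is, for r > 0,

$$\nabla \frac{1}{r} = -\frac{\hat{r}}{r^{2}}.$$

Therefore, the associated Coulomb field to this charge is

$$\vec{E} = -\nabla \frac{1}{r} = \frac{\hat{r}}{r^{2}}.$$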
Following Temple’s approach as in Section 4.1.1, we can define a distribution corresponding to this electric field as a limit of a set of vector functions that tend to it as
The action of the electric field as a distribution in a test function is, using spherical coordinates,
where $\hat{r}$ is the radial unit vector.
Now let us evaluate the divergence of the electric field in spherical coordinates
which vanishes for r > 0. Analogously to equation (29), the divergence of this vector distribution is
To proceed we integrate by parts, remembering that the surface term vanishes due to the boundary condition on φ, and use the identity
to obtain that
where we employed equation (15) to go from the second to the third above lines. Remember that equation (15) is valid when we work with the strong definition of the Dirac delta function. Therefore, we have proved that
In close analogy with what was presented in Section 4.1.2, we can also show that this result is valid for the weak definition of the delta function.
Once again, this section shows clearly that using careful definitions of physical magnitudes we can obtain well known expressions using a sound mathematical formulation.
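Gauss's theorem offers an independent consistency check of this section: if the divergence of the Coulomb field of a unit charge is concentrated at the origin with total weight 4π, then the flux of the field through any closed surface enclosing the origin must equal 4π, whatever the size of the surface. The sketch below uses a technique not found in the text. It computes the flux of r̂/r² through a cube of half-side a by a two-dimensional midpoint rule; the function name and the choice of a cube are assumptions made for illustration.

```python
import math

def flux_through_cube(a, steps=400):
    """Flux of r_hat / r^2 through the surface of the cube [-a, a]^3."""
    h = 2.0 * a / steps
    face = 0.0
    for i in range(steps):
        x = -a + (i + 0.5) * h
        for j in range(steps):
            y = -a + (j + 0.5) * h
            # On the face z = a the outward normal is z_hat, so E . n = a / r^3.
            face += a / (x * x + y * y + a * a) ** 1.5
    # The six faces contribute equally by symmetry.
    return 6.0 * face * h * h

fluxes = {a: flux_through_cube(a) for a in (0.1, 1.0, 10.0)}
for a, flux in fluxes.items():
    print(a, flux, flux / (4.0 * math.pi))
```

The flux does not depend on a, which is what one expects when the divergence vanishes for r > 0 and all of its weight sits at the origin.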
Acknowledgments
The authors thank Prof. Luiz Nunes de Oliveira for a critical reading of the manuscript. M.A., F.A.B.C. and O.J.P.E. are supported in part by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). O.J.P.E. is also supported by Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP).
References
1. M. Amaku, F.A.B. Coutinho, O.J.P. Éboli and E. Massad, Some problems with the Dirac delta function: Divergent series in Physics, accepted for publication in Braz. J. Phys. (2021).
2. G. Kirchhoff, Sitz. d. K. Preuss Akad. Wiss (Berlin) 22, 641 (1882).
3. O. Heaviside, Proc. Roy. Soc. A 52, 504 (1893).
4. O. Heaviside, Proc. Roy. Soc. A 54, 105 (1894).
5. R.H. Weber und R. Ganz, Repertorium Der Physik, 1, Band 2. Teil (Wiley, Berlin, 1916).
6. P.A.M. Dirac, The Principles of Quantum Mechanics (Oxford University Press, Oxford, 1930).
7. L. Schwartz, Theory of Distributions (Hermann, Paris, 1950).
8. M.J. Lighthill, Introduction to Fourier Analysis and Generalized Function (Cambridge University Press, Cambridge, 1964).
9. G. Temple, J. Lon. Math. Soc. 28, 175 (1953).
10. J. Mikusinski, Fundamenta Mathematicae 35, 235 (1948).
11. On generalized exponential functions see, J. Mikusinski, Studia Math. 13, 48 (1951).
12. A. Erdélyi, Operational calculus and generalized functions (Holt, Rinehart and Winston, New York, 1962).
13. J.P. Marchand, Distributions An Outline (Dover Publications, New York, 2007).
14. D.A.V. Tonidandel and A.E.A. Araújo, Rev. Bras. Ens. Fis. 37, 3306 (2015).
15. M.G. Katz and D.A. Tall, Found. Sci. 18, 107 (2013).
16. G. Temple, Proc. Roy. Soc. A 228, 175 (1955).
17. R. Skinner and J.A. Weil, Am. J. Phys. 57, 777 (1989).
18. G. Barton, Elements of Green’s Functions and Propagation (Oxford Science Publications, Oxford, 1989).
19. J.D. Jackson, Mathematical Methods for Quantum Mechanics (A. Benjamin, New York, 1962).
20. R. Courant and D. Hilbert, Methods of Mathematical Physics (John Wiley and Sons, New York, 1962) v. 2.
21. S.M. Blinder, Am. J. Phys. 71, 816 (2003).
22. D. Zhang, Y. Ding and T. Ma, Am. J. Phys. 57, 281 (1989).
23. J. Von Neumann, Collected Works (Pergamon, Oxford, 1994) v. 1 p. 111.
24. F.A. Muller, Am. J. Phys. 62, 11 (1994).
25. R.N. Bracewell, The Fourier Transform and Its Applications (McGraw-Hill, Boston, 2000).
26. T. Schucker, Distributions, Fourier Transforms, And Some of their Applications to Physics (World Scientific, Singapore, 1991).
27. P. Kurasov, J. Math. Anal. 201, 297 (1996).
28. F.A.B. Coutinho, Y. Nogami and F.M. Toyama, Rev. Bras. Ens. Fís. 31, 4302 (2009).
29. A. Gsponer, Eur. J. Phys. 28, 267 (2007).
30. V. Hnizdo, Eur. J. Phys. 32, 287 (2011).
31. C.P. Frahm, Am. J. Phys. 51, 826 (1983).
32. J. Franklin, Am. J. Phys. 78, 1225 (2010).
33. B. Yu-Kuang Hu, Am. J. Phys. 72, 409 (2004).
Publication Dates
Publication in this collection: 07 July 2021
Date of issue: 2021
History
Received: 06 Apr 2021
Reviewed: 03 June 2021
Accepted: 05 June 2021