Abstracts
In this paper, we show that the Central Limit Theorem is deeply ingrained in the mathematical and physical structure of Quantum Mechanics. We show, furthermore, that the Central Limit Theorem provides us with a clarification of the assumptions made by other quantization processes – in particular those using the notion of expansion of some variable up to second order.
Keywords: Central Limit Theorem; Mathematical derivation; Schrödinger equation; Teaching of Quantum Mechanics
1. Introduction
In the first two papers of this series [1, 2], we derived the Schrödinger equation from two quite different constructs: the characteristic function and Boltzmann’s entropy. Each approach was based upon only two axioms, and we were able to connect them mathematically. In so doing, we derived the phase space probability density function for Quantum Mechanics which, as shown in [2], gives the correct energy values and other quantities.
Other mathematical connections were found. We showed the intrinsic connection of the characteristic function derivation with Feynman’s path integral formalism, and showed how the Bohr-Sommerfeld rules are deeply carved into the formal structure of Quantum Mechanics. Using the formal developments of [2], we were able to address many elements of Bohmian Quantum Mechanics and to show that we can approach that interpretation from constructs quite different from the ones David Bohm had used [3]; such constructs, and all those used in the derivation made in [2], are simply those generally used in the field of Kinetic Theory [4, 5]. We have also shown that the phase space probability density function F(q, p; t), derived in [2] in connection with [1], is the only one that maximizes the entropy and minimizes the quantum mechanical energy of a physical system [6]. Using examples, we have also shown that the phase space probability density function obtained is the one that satisfies the first axiom of the characteristic function derivation, which states that the momentum Fourier transformed Liouville equation should be satisfied, if one considers it up to second order, that is
This “second order” expansion is a common feature of the characteristic function derivation and of all derivations formally equivalent to it, such as Feynman’s path integral approach (in which the second order imposition is made on the time variable [7]) and the entropy derivation made in [2], as well as of others found throughout the history of Physics [8], not only in Quantum Mechanics.
However, it remains for us to understand what this “second order” constraint represents mathematically and physically. We did have a clue: as we mentioned in [1], a second order expansion of a characteristic function may point to the validity of the Central Limit Theorem. This, of course, was just a hunch that must be mathematically proved. Proving it is the objective of this paper.
Thus, at this point, we have the situation shown in Figure 1, where all the interconnections (shown as traced lines) were already formally established. The formal connection we are interested in now is the one related to the Central Limit Theorem.
This is not a mere whim; it plays a crucial role in the context of all the derivations we present. Indeed, in [1] and [2] we assumed an expansion up to second order in the variable δq, but we also inverted the Fourier transform defining the characteristic function, which involves an integration in δq from −∞ to ∞. This must be justified, and it is the Central Limit Theorem that provides the justification.
The present approach also reinforces our argument that each derivation of the Schrödinger equation deepens our comprehension of Quantum Mechanics.
The paper is organized as follows. In section two, we present the kind of sampling we are making on phase space when using a momentum space characteristic function applied to a phase space probability density function. This paves the way for section three, where we present a derivation of the Central Limit Theorem for the kind of sampling we are using. In section four, we show the deep connections between the characteristic function derivation and the Central Limit Theorem. Section five presents the way our axioms should be mathematically and physically understood for a derivation of the Schrödinger equation, which becomes identical to the one presented in paper I (but now with qualifications). In section six we present our final considerations, including brief remarks on the potential application of the elements discussed in this paper to physics teaching.
2. Momentum Space Characteristic Function and Sampling
In [1], we have used the Characteristic Function, defined as
to mathematically derive the Schrödinger equation, where F(q, p; t) is the probability density defined upon phase space. In that derivation we expanded the characteristic function only up to second order in the parameter δq, and we pointed out that this should be connected to the Central Limit Theorem. In [2], on the other hand, we inverted the Fourier transform in (2) to get the explicit expression of the phase space probability density function.
If we express this probability density function in terms of the characteristic function momentum moments (see [2]), it is possible to get
where the average momentum is given, for each point of the configuration space, as
and the momentum variance at each q is
a result that corresponds exactly to what one would expect regarding the Central Limit Theorem [9], but one that deserves qualifications.
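As a numerical illustration of this statement, the following minimal sketch inverts a characteristic function kept only up to second order and compares the result with a Gaussian of the same mean and variance. It is a toy under stated assumptions, not the calculation of [2]: units with ħ = 1, the sign convention Z(θ) = ∫ F(p) exp(ipθ) dp, the function names, and the numerical values of the mean and variance are all chosen here for illustration only.

```python
import cmath
import math


def second_order_cf(theta, mean, var):
    """Characteristic function kept up to second order (assumed convention, hbar = 1)."""
    return cmath.exp(1j * mean * theta - 0.5 * var * theta * theta)


def invert_cf(p, mean, var, half_width=12.0, steps=4000):
    """Invert the Fourier transform: F(p) = (1/2pi) * integral of Z(theta) e^{-i p theta}."""
    limit = half_width / math.sqrt(var)  # the integrand is negligible beyond this
    d_theta = 2.0 * limit / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        theta = -limit + k * d_theta
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal rule end points
        total += weight * second_order_cf(theta, mean, var) * cmath.exp(-1j * p * theta)
    return (total * d_theta / (2.0 * math.pi)).real


def gaussian(p, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-((p - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


# Hypothetical values for the average momentum and momentum variance on one fiber.
mean, var = 0.7, 0.25
max_error = max(
    abs(invert_cf(p, mean, var) - gaussian(p, mean, var))
    for p in (-1.0, 0.0, 0.7, 1.5)
)
```

Within the accuracy of the quadrature, `max_error` should be negligible, which is the numerical counterpart of saying that a second order characteristic function inverts to a Gaussian on each fiber.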
Indeed, the definition (2) means that the coordinate q is being set as a label, since the characteristic function is defined upon the momentum space for each coordinate q. This means that the characteristic function is defined over fibers positioned at each coordinate point q, and so is the Gaussian function obtained in (3). Thus, we get, in general, different Gaussian functions for different points q of the configuration space – and this is what we mean by “fibers” in our previous comments.
These comments are important, since they clarify the kind of sampling that is being performed upon the phase space. As the physical system evolves on phase space by assuming the points (q(t), p(t)), the sampling is performed, for each region (q − δq/2, q + δq/2) centered on q, by recording each momentum p when the system passes over this fiber, as schematically shown in Figure 2. The mentioned region is what makes our probability function a probability density. Our variable of interest is then the sum of random variables
P(q) = p1(q) + p2(q) + · · · + pn(q),
where the pj(q) are the momentum random variables, while P(q) is their sum.
Figure 2: Schematic representation of the sampling made upon phase space over fiber regions given by (q − δq/2, q + δq/2), with δq → 0, to get a sequence of points pk.
We thus note that the phase space sampling does not accompany the dynamical evolution of the system. Nonetheless, it represents this evolution (at each instant of time t) by letting the physical system evolve and making the previously mentioned sampling. It is important to adequately comprehend this point: the evolution of the phase space probability density function means, in the present approach, the evolution of the sampling of sums of momentum values at each fiber region labeled by q on configuration space.
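The sampling described in this section can be sketched in a few lines. The example below is only a schematic illustration: the trajectory is a made-up curve in phase space (it is not a solution of any particular dynamics), and the names `sample_fiber` and `toy_trajectory`, as well as the values of the fiber center and width, are hypothetical.

```python
import math


def sample_fiber(trajectory, q_center, dq):
    """Collect the momenta recorded while the system is inside the fiber region."""
    lower, upper = q_center - dq / 2.0, q_center + dq / 2.0
    return [p for (q, p) in trajectory if lower < q < upper]


def toy_trajectory(steps=20000, dt=0.01):
    """Made-up phase space curve: an oscillation whose amplitude drifts slowly."""
    points = []
    for k in range(steps):
        t = k * dt
        amplitude = 1.0 + 0.3 * math.sin(0.05 * t)
        points.append((amplitude * math.cos(t), -amplitude * math.sin(t)))
    return points


trajectory = toy_trajectory()
# Momenta p_k(q) gathered on the fiber centered at q = 0.5 with width dq = 0.05.
momenta = sample_fiber(trajectory, q_center=0.5, dq=0.05)
# The variable of interest is the sum P(q) of the sampled momenta.
total_momentum = sum(momenta)
```

The sampling ignores the order in which the points were visited: it only records which momenta occurred while the system was over the fiber, which is the sense in which it does not follow the dynamical evolution.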
3. The Central Limit Theorem
We begin this section by presenting the Central Limit Theorem and proving it for our particular case.
Theorem: Consider, for a given fiber centered on q in the configuration space, a sequence of independent random variables p1, p2, . . . , pn, each with mean ⟨pi⟩ = μi(q; t) and finite variance, i = 1, 2, . . . If we put p = p1 + p2 + · · · + pn then, under very general conditions, the reduced variable¹
where
has approximately a Gaussian distribution with zero mean and unit variance. Thus, if Fn is the probability distribution function of the random variable p (the sum), for each fiber centered on q, whose probability density is ρ(q; t), then we have
where
and
Proof. (We will demonstrate the theorem only for situations in which the μk are all equal to m and the variances are all equal to s². However, the theorem has a much wider applicability [10].) Under these assumptions, we have μ^(n)(q; t) = nm and a variance of the sum equal to ns² (note that this means that s² must go to zero as n⁻¹ when we make n → ∞, the same being valid for m). Consider now the probability P(q, p; t)dp of being in some interval (p, p + dp) after n steps, each one within (pk, pk + dpk), with probability
(where we are assuming that all the w(*) are identical and the qk are in the interval (q − δq/2, q + δq/2)). We assume that ω(qk, pk; t) = ξ(qk; t)w(qk, pk; t), which is a condition for the statistical independence between the two random variables, since ξ(qk; t) is the probability of being in the vicinity of qk and w(qk, pk; t) is the conditional probability of being in the vicinity of pk assuming that the system is in the vicinity of qk.
In this case, we get
and, as a consequence, it is possible to get our probability function as
We then use [11]
and
Now, if instead of working with pk, we use the re-scaled random variable (pk − m)/s, such that
(note that z refers to each pk), then the properties of the Fourier transform give us
This last result means that the re-scaled characteristic function Z(q, θ; t) of the variable p must be given by
since we are taking the pk to be independent variables, as the Central Limit Theorem demands. Indeed, in this case, the characteristic function of their sum is just the product of the characteristic functions of each pk (this is, in fact, one of the most important features of characteristic functions).
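The product property invoked here can be written out explicitly. The lines below are a generic sketch in notation assumed for illustration (angle brackets denote the average over the momenta on the fiber, and z_k is the characteristic function of the k-th variable); the scaling of θ used in the proof is left out.

```latex
% Independence lets the average of the product factorize;
% identical distributions (z_k = z for all k) then give the n-th power.
\begin{align}
  Z(q,\theta;t)
    &= \Big\langle e^{\,i\theta \sum_{k=1}^{n} p_k} \Big\rangle
     = \prod_{k=1}^{n} \big\langle e^{\,i\theta p_k} \big\rangle
     = \prod_{k=1}^{n} z_k(q,\theta;t) \\
    &= \big[\, z(q,\theta;t) \,\big]^{n}
     \qquad \text{when all the } z_k \text{ are identical.}
\end{align}
```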
Thus,
and we can expand z(θ) in a Maclaurin series as
where all the derivatives are with respect to θ and R is the remainder of the expansion. Since the definition of z(q, θ; t) implies that
we end up with²
and, thus,
Note that is a finite quantity, while and are infinitesimal, because of the infinitesimal character of m and s² (in the sense that they go to zero as n⁻¹ when n → ∞). Note, however, that there is a factor n multiplying the logarithm. This means that we must work with the expanded expression of the logarithm.
Since we want results for n → ∞, we expand the logarithm in a power series to find
Note that the first two terms cancel out and we end up with
where Ωn(θ) depends upon n through some inverse power law of the type n^(−α) with α > 0. Thus, if we take the limit n → ∞ we get
which, upon inversion, gives the Gaussian probability function (with u = u^(∞))
which is the result we set out to show. ♢
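For orientation, the chain of steps in the proof can be summarized in the textbook normalization. This is a sketch under assumptions that differ slightly from the text above: the variables x_k = (p_k − m)/s are taken with zero mean and unit variance, the reduced variable is u = (x_1 + · · · + x_n)/√n, and the symbol R_n for the remainder is introduced here only for illustration.

```latex
% Standardized i.i.d. case: <x_k> = 0, <x_k^2> = 1, u = (x_1 + ... + x_n)/sqrt(n).
\begin{align}
  z\!\left(\tfrac{\theta}{\sqrt{n}}\right)
    &= 1 - \frac{\theta^{2}}{2n} + R_n(\theta),
    \qquad n\,R_n(\theta) \xrightarrow[n\to\infty]{} 0, \\
  \ln Z^{(n)}(\theta)
    &= n \ln z\!\left(\tfrac{\theta}{\sqrt{n}}\right)
     = -\frac{\theta^{2}}{2} + \Omega_n(\theta),
    \qquad \Omega_n(\theta) \xrightarrow[n\to\infty]{} 0, \\
  \lim_{n\to\infty} Z^{(n)}(\theta)
    &= e^{-\theta^{2}/2}, \\
  \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\theta^{2}/2}\, e^{-i u \theta}\, d\theta
    &= \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}.
\end{align}
```

In this normalization the mean has already been subtracted, so the linear term that cancels in the proof above never appears.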
This result is of enormous importance for the developments made so far. Indeed, in [2], we inverted the characteristic function in terms of δq to get expression (3). However, we also expanded the characteristic function only up to second order in δq. These two calculations would have been contradictory if the expansion of the characteristic function up to second order meant that δq is an “infinitesimal” quantity (the integration of δq from −∞ to ∞ that appears in the inversion of the Fourier transform would then have no meaning whatsoever). In the next section, we address this point in connection with the characteristic function derivation.
4. Connection with the Characteristic Function Derivation
In this section we will unravel the connections between the characteristic function derivation of [1] and the Central Limit Theorem. The clarifications to emerge will apply equally well to all other derivations already seen that also use the notion of an expansion up to second order in some parameter. This ultimate connection sheds the final light upon the correct interpretation of the quantum formalism, which we will address elsewhere.
The characteristic function derivation in [1] begins with the integrated Fourier transform of the Liouville equation and the definition of a characteristic function (at this point we write it as ζ(q, δq; t), since we do not yet know whether it should correspond to z(q, δq; t) or Z(q, δq; t))
imposing also that
and using
We then expand expression (20) up to second order as
which may be written, using
as
which is nothing but the expression for Z^(n)(q, δq; t), where we stress the appearance of the number n, since ζ = Z^(n)(q, δq; t) is the characteristic function for the sum of n random variables pk. Note that the term within brackets in (31) is just [z(q, δq; t)]^n. This means that the term O(δq³) is, in fact, a term depending on n^(−α), for some positive real number α. This term disappears as we make n → ∞.
Therefore, its disappearance is not due to some infinitesimal character of the variable δq. This variable can be of any magnitude, but the last term is such that O(δq³) → 0 as n → ∞, and the characteristic function must be expanded up to second order in δq.
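This point can be checked numerically in a simple case. The sketch below is an illustration under assumptions made here, not part of the derivation: each step is taken as a zero-mean uniform random variable with variance 1/n (so the sum has unit variance), θ plays the role of a dimensionless δq, and all function names are hypothetical. It evaluates the exact characteristic function of the sum at a value of θ that is deliberately not small and measures how fast everything beyond second order decays with n.

```python
import math


def cf_uniform_unit_variance(theta):
    """Exact characteristic function of a zero-mean, unit-variance uniform variable."""
    x = math.sqrt(3.0) * theta
    return 1.0 if x == 0.0 else math.sin(x) / x


def cf_of_sum(theta, n):
    """Characteristic function of a sum of n independent steps, each of variance 1/n."""
    return cf_uniform_unit_variance(theta / math.sqrt(n)) ** n


def remainder(theta, n):
    """Everything beyond second order: exact result minus the Gaussian limit."""
    return abs(cf_of_sum(theta, n) - math.exp(-0.5 * theta * theta))


theta = 3.0  # deliberately not small
errors = {n: remainder(theta, n) for n in (100, 1000, 10000)}
# Decay exponent estimated from one decade in n (remainder ~ n^(-alpha)).
alpha = math.log10(errors[1000] / errors[10000])
```

For this particular choice of distribution the estimated exponent comes out close to 1, so the higher order terms vanish because n grows, not because θ is small.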
5. The Central Limit Theorem Derivation of the Schrödinger Equation
The axioms of the theory coming from this derivation are the same already presented in [1], but with an important qualification:
Axiom 1: The Fourier transformed Liouville equation is valid for the description of any quantum system that can be described by the Schrödinger equation.
Axiom 2: The characteristic function Z^(n)(q, δq; t) of the random variable p can be written (in the limit n → ∞) as the product
for any quantum system, and Quantum Mechanics refers to the universality class defined by the Central Limit Theorem³.
The present approach thus furnishes us with the extra assumption that was lacking in the developments of [2]: the one that gives coherence to both the expansion of the characteristic function up to second order and the inversion of the Fourier transform. This extra assumption is that Quantum Mechanics refers to the universality class defined by the Central Limit Theorem, which is now fully clarified.
This result underpins, from a different perspective, our argument, already presented in [1], that it is important to make these derivations using different constructs, since they may clarify important points of each other. It is also worth noting that the characteristic function presented in (34) is nothing but the density matrix in its 1 × 1 version. Indeed, the approach can be easily extended, for example, to encompass the definition of the characteristic function as
to represent statistical mixtures, where M is some diagonal matrix M = diag(p1, . . . , pn). In this case, one has
where tr(*) is the trace operation. This reinforces the interpretation of the preceding calculations as a statistical approach without having to postulate it.
6. Final Considerations
We showed in this paper that the Central Limit Theorem is part of the formal and physical structure of Quantum Mechanics. Moreover, we have used this result to understand the “second order” expansion made in [1] regarding the characteristic function derivation of the Schrödinger equation. Indeed, the present result allows us to understand why we can expand the characteristic function in the variable δq and still invert the Fourier transformation related to it to get the phase space probability density function. In doing so, we have made intensive use of the results of papers I [1] and II [2]. This shows, in fact, how the quantization processes are interwoven, forming a complex web of results based on different constructs that are nonetheless connected with each other.
Thus, at this point we have the situation shown in Figure 3.
Figure 3: Current status of our approach to the derivation of the Schrödinger equation from different perspectives.
However, something is still missing. Indeed, we have not furnished the dynamical equations that furnish the points (q, p) on phase space, that is, the system’s trajectories. These dynamical equations are not Newton’s equations, and we must know their expression.
We leave this result for a future paper in which we mathematically show that these dynamical equations are the Langevin Equations of Quantum Mechanics. With that, we cover the last and greatly important node, shown in gray in Figure 3, and conclude our series of deductions of the Schrödinger equation from different, but complementary, perspectives.
Finally, we have already stressed the importance of bringing the discussion of these themes to the classroom when teaching Quantum Mechanics. It provides not only a deeper understanding of the meaning of the theory but also, and more importantly, a critical and careful general attitude toward science. Moreover, when each quantization process uses a different physical construct, it connects the learning of Quantum Mechanics with different areas of Physics, giving it a more synthetic comprehension, as we did in [12, 13, 14], as well as a critical and meaningful view of physics teaching, as done in [15, 16, 17, 18, 19]. Characteristic functions, Boltzmann’s entropy, maximization of entropy, minimization of energy, the Central Limit Theorem and many others already used in this and the previous papers show that Physics is an interwoven enterprise of concepts and theories, a view that is not generally reinforced in physics courses.
Acknowledgments
The National Council for Scientific and Technological Development (CNPq).
References
- [1] O.L. Silva Filho and M. Ferreira, Rev. Bras. Ens. Fís. 46, e20240183 (2024).
- [2] O.L. Silva Filho and M. Ferreira, Rev. Bras. Ens. Fís. 46, e20240219 (2024).
- [3] D. Bohm, Phys. Rev. 85, 166 (1952).
- [4] R.L. Liboff, Kinetic Theory (Prentice-Hall, Englewood Cliffs, 1990).
- [5] T. Takabayasi, Prog. Theoret. Phys. 11, 341 (1954).
- [6] R.G. Parr and W. Yang, Density-functional Theory of Atoms and Molecules (Oxford Academic, New York, 1986).
- [7] R.P. Feynman, Statistical Mechanics, a Set of Lectures (Addison-Wesley, Reading, 1998).
- [8] M. Born, Natural Philosophy of Cause and Chance (Oxford University Press, Oxford, 1949).
- [9] A.I. Khinchin, Mathematical Foundations of Statistical Mechanics (Dover, New York, 1949).
- [10] P. Levy, in: Oeuvres de Paul Levy (Ecole Polytechnique, Paris, 1976).
- [11] F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw-Hill, Singapore, 1965).
- [12] O.L. Silva Filho and M. Ferreira, Rev. Bras. Ens. Fís. 43, e20200508 (2021).
- [13] O.L. Silva Filho, M. Ferreira and R.G.G. Amorim, Rev. Bras. Ens. Fís. 44, e20220109 (2022).
- [14] O.L. Silva Filho and M. Ferreira, Rev. Bras. Ens. Fís. 45, e20230231 (2023).
- [15] M. Ferreira, O.L. Silva Filho, M.C. Batista, A. Abrão Filho, A. Strapasson and A.E. Santana, Rev. Bras. Ens. Fís. 45, e20230254 (2023).
- [16] M. Ferreira, O.L. Silva Filho, M.A. Moreira, G.B. Franz, K.O. Portugal and D.X.P. Nogueira, Rev. Bras. Ens. Fís. 42, e20200057 (2020).
- [17] M. Ferreira, R.V.L. Couto, O.L. Silva Filho, L. Paulucci and F.F. Monteiro, Rev. Bras. Ens. Fís. 43, e20210157 (2021).
- [18] A. Strapasson, M. Ferreira, D. Cruz-Cano, J. Woods, M.P.N.M. Soares and O.L. Silva Filho, International Journal of Educational Technology in Higher Education 19, 5 (2022).
- [19] M. Ferreira, O.L. Silva Filho, A.B.S. Nascimento and A.B. Strapasson, Humanities & Social Sciences Communications 10, 768 (2023).
¹ The index n simply indicates that we have summed only up to a finite number n of random variables.
² The remainder R will be a complicated expression of these quantities and powers of n.
³ Note that this statement already implies that the characteristic function should be developed up to second order in δq; these two statements are equivalent.
Publication Dates
- Publication in this collection: 23 Aug 2024
- Date of issue: 2024
History
- Received: 17 June 2024
- Reviewed: 20 July 2024
- Accepted: 21 July 2024