Derivation of the Schrödinger equation I: the characteristic function

Abstracts

In this paper, we present a mathematical derivation of the Schrödinger equation departing from only two axioms. We also show that, using this formal derivation process, it is possible to directly derive the Schrödinger equation in generalized curvilinear coordinate systems. This derivation is also shown to be equivalent to Feynman’s path integral approach, but goes further, allowing us to mathematically derive the Bohr-Sommerfeld quantization rules. The use of a small parameter, both in the present derivation, where it is δr, and Feynman’s derivation, where it is ϵ = δt, is also clarified in terms of the Central Limit Theorem. Therefore, the article makes a didactic transposition of the topic of quantization, allowing it to be addressed in the context of teaching Quantum Mechanics. The epistemological importance of axiomatic approaches for the mathematical derivation and the interpretation of the symbols of the theory is also considered.

Keywords:
Schrödinger equation; mathematical derivation; characteristic function



1. Introduction

The history of the birth of Quantum Mechanics based upon the Schrödinger equation is already well known. It came from a sequence of works by Erwin Schrödinger carried out at the end of 1925 and published in 1926. It is also known that the process by which this equation was obtained was based on the identification

$$p \to \hat{p} = -i\hbar\frac{\partial}{\partial x}, \qquad x \to \hat{x} = x, \tag{1}$$
in which the classical functions x and p are taken into the quantum mechanical operators $\hat{x}$ and $\hat{p}$, respectively, and used, together with the classical Hamiltonian H, to make the identification
$$H = \frac{p^2}{2m} + V(x) \;\to\; \hat{H} = \frac{\hat{p}^2}{2m} + V(x), \tag{2}$$
such that one gets the equation
$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right]\psi(x,t) = i\hbar\frac{\partial \psi(x,t)}{\partial t}, \tag{3}$$
in which there appears a new function ψ(x, t), named the wave function. Historically, the interpretation of the wave function immediately became the subject of many debates and discussions [1], up to the appearance of the first interpretation of Quantum Mechanics, called the Copenhagen Interpretation [2].

An early version of the Copenhagen Interpretation (an early version because nowadays there are many such interpretations, which assume slightly different constructs [3]) was presented at the Solvay Congress in 1927. Even after that, the interpretation of this function remained (and remains) a source of dispute among different interpretations of QM, as one can see, for example, from the Hidden Variables Interpretation, suggested by David Bohm [4, 5], the Statistical Interpretation, suggested by Leslie Ballentine [6], the Stochastic Interpretation [7, 8, 9, 10], presented by many authors, and the Many Worlds Interpretation, introduced by Hugh Everett [11, 12], to cite but a few.

We call the process of mathematical derivation of the Schrödinger equation, or some heuristic approach to get it, “quantization”, despite the fact that the term “quantization” may refer to a broader class of approaches. Some problems arise from the heuristic manner in which the equation was postulated by Schrödinger. Indeed, quantization in generalized coordinates, for instance, cannot be performed in a direct way. It is known that it is necessary to first perform the quantization in Cartesian coordinates and only then proceed to the passage (based on the operators) to the desired curvilinear coordinate system. This, obviously, is a relevant theoretical problem (despite not being a practical one), since a process as fundamental as quantization cannot depend upon the choice of the (arbitrary) coordinate system representation. This problem was tackled in the literature without, however, presenting cogent results [13, 14, 15]. The previous heuristic approach is the one presented in physics books, whether at a more didactic level or at an advanced one.

However, since it was proposed, the Schrödinger equation has received many mathematical derivations, from diverse perspectives. Important among these are the stochastic derivations [7], which saw a boom in publications in the 1950s and 1960s, and are still a lively field of investigation [16]. This is so because, when one derives the Schrödinger equation from a supposedly more fundamental principle and/or equation, important results regarding the interpretation of the whole theory may be suggested. In fact, as will become clear from the derivation we present in what follows, the very interpretation of the function ψ becomes quite direct and obvious, since it is inherited from the equations used in the axioms. This process of symbolic inheritance is, thus, of fundamental epistemological value.

Two such mathematical derivations of the Schrödinger equation were developed in [17, 18, 19], departing from first principles. These mathematical derivations have the peculiarity of being axiomatic (depending upon only two axioms), besides being mathematically very straightforward. Beyond the mathematical perspective, important by itself, these mathematical demonstrations introduce epistemological issues, given the way in which they are performed, allowing the teacher to address the field of Quantum Mechanics in a different way even in introductory courses. This possibility is normally not utilized in such courses, which end up being too mathematical and functioning as mere manuals [20] for the non-initiated.

Didactic transposition [21] is, in essence, an engineering process for creating a teaching object. Its operation presupposes the conversion of basic knowledge from a scientific modality (called “wise knowledge”) to that of clear academic incursion (admitted as “taught knowledge”), having as a mediator what is conventionally defined as “knowing how to teach”, that is, the knowledge that usually appears in books or teaching manuals. In a strict sense, it is an instructional design tool.

It is widely known in the community of physicists that Schrödinger’s equation is a model development (appropriately, an interpretation) of Quantum Mechanics for a system (atoms, molecules and subatomic particles, whether free, bound or localized) that admits electronic behavior as being of a wave nature. As a result, the solution of such a model is capable of producing a set of wave functions (or, in more pertinent terms, state functions), each associated with an electron binding energy level. It is, therefore, a linear partial differential equation that describes the variation of the quantum state of a physical system as a function of time [22, 23].

It can be seen, from this succinct (and, we could say, relatively crude) definition, that there are several pieces of knowledge (definitions, concepts and operators) that come from the field of production of scientific knowledge, admit a didactic conversion into a formal compilation of the theory and its developments, and ultimately enable transposition to didactic situations of interest. Here, succinctly, we derive Schrödinger’s equation from just two axioms, demonstrate that it is possible to do so directly in generalized curvilinear coordinate systems, establish the equivalence with Feynman’s path integral approach, and provide a mathematical demonstration of the Bohr-Sommerfeld quantization rules. The calibration of parameters, the algebraic manipulation and the use of differential and integral calculus attest to the seminal scientific character of the approach which, based on the mathematical operations and physical elucidations made, becomes viable as an object of didactic transposition, as intended.

The objective of this paper, thus, is to present one of these derivations, called the characteristic function derivation, and show the mathematical developments that can be achieved with it. This will be done in the next section. In section 3, we will show how the presented axiomatic approach can be used to make a direct quantization in any coordinate system, in a completely formal way (that is, by writing the axioms in the desired coordinate system and making the same mathematical operations used in the derivation in Cartesian coordinates). In the fourth section we will show how this mathematical derivation allows one to mathematically derive the Bohr-Sommerfeld quantization rules. The fifth section is devoted to showing how this quantization method is connected to Feynman’s path integral approach. In the sixth section we present our conclusions. In continuation papers, we aim at presenting other mathematical derivations of the Schrödinger equation and the relations that they have with the one developed here, showing that they are all mathematically interconnected.

2. Characteristic Function Derivation

We begin our axiomatization of Quantum Mechanics by presenting the axioms and showing that they, alone, allow us to mathematically derive the Schrödinger equation.

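Axiom 1. The characteristic function of the phase-space probability density function F(q, p; t), defined by

$$Z(q,\delta q;t) = \int_{-\infty}^{+\infty} F(q,p;t)\, e^{i p\,\delta q/\ell}\, dp, \tag{4}$$
where ℓ is a universal parameter with dimensions of angular momentum, is such that it can be written as
$$Z(q,\delta q;t) = \psi^{*}\!\left(q-\frac{\delta q}{2};t\right)\psi\!\left(q+\frac{\delta q}{2};t\right), \tag{5}$$
and should be expanded up to second order in the parameter δq.

Axiom 2. For an isolated system, the joint phase-space probability density function related to any Quantum Mechanical phenomenon obeys the Fourier transformed Liouville equation

$$\int_{-\infty}^{+\infty} \exp\!\left(\frac{i p\,\delta q}{\ell}\right)\frac{dF(q,p;t)}{dt}\, dp = 0. \tag{6}$$
The derivative in (6) may be written as
$$\frac{dF(q,p;t)}{dt} = \frac{\partial F}{\partial t} + \frac{p}{m}\frac{\partial F}{\partial q} - \frac{\partial V}{\partial q}\frac{\partial F}{\partial p} = 0, \tag{7}$$
where it is already assumed that the underlying forces may be written as the gradient of a potential function. We now apply transformation (4) to this last equation and use the fact that F(q, p; t) is a probability density function to put
$$\left. F(q,p;t)\, e^{i p\,\delta q/\ell}\right|_{p=-\infty}^{p=+\infty} = 0, \tag{8}$$
to arrive at the equation for the characteristic function (without the assumption (5))
$$-\frac{\ell^{2}}{m}\frac{\partial^{2} Z}{\partial q\,\partial(\delta q)} + \delta q\,\frac{\partial V}{\partial q}\, Z = i\ell\,\frac{\partial Z}{\partial t}, \tag{9}$$
which is the differential equation for the characteristic function Z(q, δq; t).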
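The passage from (7) to (9) can be spelled out term by term. The following is a sketch that uses only definition (4), one integration by parts, and the boundary condition (8):

```latex
\begin{align*}
\int_{-\infty}^{+\infty} \frac{\partial F}{\partial t}\, e^{i p\,\delta q/\ell}\, dp
  &= \frac{\partial Z}{\partial t}, \\
\int_{-\infty}^{+\infty} \frac{p}{m}\frac{\partial F}{\partial q}\, e^{i p\,\delta q/\ell}\, dp
  &= -\frac{i\ell}{m}\frac{\partial^{2} Z}{\partial q\,\partial(\delta q)}, \\
-\frac{\partial V}{\partial q}\int_{-\infty}^{+\infty} \frac{\partial F}{\partial p}\, e^{i p\,\delta q/\ell}\, dp
  &= \frac{i\,\delta q}{\ell}\frac{\partial V}{\partial q}\, Z .
\end{align*}
```

The second line uses $p\,e^{ip\,\delta q/\ell} = -i\ell\,\partial e^{ip\,\delta q/\ell}/\partial(\delta q)$, and the third line is the integration by parts whose boundary term vanishes by (8). Adding the three lines, setting the sum equal to zero and multiplying by $i\ell$ reproduces (9).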

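We can now use Axiom 1 and write the function Z(q, δq; t) in terms of the functions ψ(q; t) as in (5) and expand it up to second order. Thus, putting

$$\psi(q;t) = R(q;t)\, e^{iS(q;t)/\ell}, \tag{10}$$
since ψ(q; t) is, in general, a complex function, we get the result
$$Z(q,\delta q;t) = \left\{ R^{2} + \left(\frac{\delta q}{2}\right)^{2}\left[ R(q;t)\frac{\partial^{2} R}{\partial q^{2}} - \left(\frac{\partial R}{\partial q}\right)^{2}\right]\right\} \times \exp\!\left(\frac{i\,\delta q}{\ell}\frac{\partial S}{\partial q}\right). \tag{11}$$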
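The second-order expansion can be checked numerically. The sketch below compares the exact product of Eq. (5) with the truncated form of Eq. (11) for one illustrative choice of R and S; the Gaussian envelope, the cubic phase, the value ℓ = 1 and all function names are assumptions made for this example only, not part of the derivation.

```python
import cmath
import math

ELL = 1.0  # universal parameter of Axiom 1, set to 1 in illustrative units


def amplitude(q):
    """Illustrative R(q): a Gaussian envelope (an assumption, not from the paper)."""
    return math.exp(-q * q / 2.0)


def phase(q):
    """Illustrative S(q): a smooth cubic (an assumption, not from the paper)."""
    return 0.7 * q + 0.3 * q ** 3


def psi(q):
    """Polar form of Eq. (10)."""
    return amplitude(q) * cmath.exp(1j * phase(q) / ELL)


def z_exact(q, dq):
    """Product form of Eq. (5)."""
    return psi(q - dq / 2.0).conjugate() * psi(q + dq / 2.0)


def z_second_order(q, dq):
    """Second-order form of Eq. (11), using analytic derivatives of R and S."""
    r = amplitude(q)
    dr = -q * r                 # dR/dq for the Gaussian envelope
    d2r = (q * q - 1.0) * r     # d^2R/dq^2 for the Gaussian envelope
    ds = 0.7 + 0.9 * q * q      # dS/dq for the cubic phase
    modulus = r * r + (dq / 2.0) ** 2 * (r * d2r - dr * dr)
    return modulus * cmath.exp(1j * dq * ds / ELL)


def truncation_error(q, dq):
    """Distance between the exact product and its second-order expansion."""
    return abs(z_exact(q, dq) - z_second_order(q, dq))


if __name__ == "__main__":
    for dq in (0.1, 0.05, 0.025):
        print(dq, truncation_error(0.4, dq))
```

For this choice of functions the discrepancy shrinks faster than δq² when δq is halved, which is what one expects if all terms up to second order are captured by (11).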

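Now we put expression (11) into (9) and separate the real and imaginary terms to find equations

$$\frac{\partial R^{2}}{\partial t} + \frac{\partial}{\partial q}\left[\frac{R(q;t)^{2}}{m}\frac{\partial S(q,t)}{\partial q}\right] = 0 \tag{12}$$
and
$$\frac{i\,\delta q}{\ell}\frac{\partial}{\partial q}\left[\frac{\partial S}{\partial t} + \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^{2} + V(q) - \frac{\ell^{2}}{2mR}\frac{\partial^{2} R}{\partial q^{2}}\right] = 0. \tag{13}$$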

The first equation may be identified as a continuity equation, precisely by the kind of semantic inheritance to which we have already referred. Indeed, since we have written the characteristic function as in (5) and also put ψ as in (10), we immediately find that

$$R(q;t)^{2} = \lim_{\delta q\to 0} Z(q,\delta q;t) = \psi^{*}(q;t)\,\psi(q;t) = \int_{-\infty}^{+\infty} F(q,p;t)\, dp, \tag{14}$$
which must be a probability density function defined upon configuration space, since F(q, p; t) is a probability density function defined upon phase space; this means that ψ(q; t) must be a probability amplitude, an interpretation inherited from the axioms. It is also easy to show that
$$\frac{R(q;t)^{2}}{m}\frac{\partial S(q,t)}{\partial q} = -\frac{i\ell}{m}\lim_{\delta q\to 0}\frac{\partial Z(q,\delta q;t)}{\partial(\delta q)} = \int_{-\infty}^{+\infty}\frac{p}{m}\,F(q,p;t)\, dp, \tag{15}$$
which gives equation (12) its unambiguous interpretation as a continuity equation. The second equation is a total derivative with respect to q and thus may be written as
$$\frac{\partial S}{\partial t} + \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^{2} + V(q) - \frac{\ell^{2}}{2mR(q;t)}\frac{\partial^{2} R}{\partial q^{2}} = f(t), \tag{16}$$
in which the function f(t) is arbitrary. Since we can redefine S(q; t) through
$$S(q;t) = S'(q;t) + \int_{0}^{t} f(t')\, dt',$$
so that the new function S′(q; t) cancels out the right-hand side of the previous equation, we may just consider that f(t) = 0 without loss of generality.

Now, equations (16) with f(t) = 0 and (12) are fully equivalent to the Schrödinger equation

$$-\frac{\hbar^{2}}{2m}\frac{\partial^{2}\psi}{\partial q^{2}} + V(q)\,\psi(q;t) = i\hbar\,\frac{\partial \psi(q;t)}{\partial t}, \tag{17}$$
since, if we substitute the definition (10) into (17) and collect the real and imaginary terms (making ℓ = ℏ to “discover” the value of our universal parameter), we also arrive at the same results (12) and (16). One should remember that no mathematical derivation process can simply “find” the universal parameter of a theory; a similar situation can be found in Gravitation (where G is experimentally obtained) or in Electromagnetism.
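The substitution mentioned above can be written out explicitly. The following sketch uses the polar form (10) with ℓ = ℏ, primes for derivatives with respect to q, and a common factor $e^{iS/\hbar}$ divided out:

```latex
\begin{align*}
\frac{\partial \psi}{\partial t}
  &= \left(\frac{\partial R}{\partial t} + \frac{i}{\hbar} R\,\frac{\partial S}{\partial t}\right) e^{iS/\hbar}, \qquad
\frac{\partial^{2}\psi}{\partial q^{2}}
  = \left(R'' + \frac{2i}{\hbar} R' S' + \frac{i}{\hbar} R\, S'' - \frac{1}{\hbar^{2}} R\, S'^{2}\right) e^{iS/\hbar}, \\[4pt]
\text{real part:}\quad
  & \frac{\partial S}{\partial t} + \frac{S'^{2}}{2m} + V(q) - \frac{\hbar^{2}}{2mR}\, R'' = 0, \\[4pt]
\text{imaginary part:}\quad
  & \frac{\partial R}{\partial t} + \frac{1}{2m}\left(2 R' S' + R\, S''\right) = 0
  \;\;\Longrightarrow\;\;
  \frac{\partial R^{2}}{\partial t} + \frac{\partial}{\partial q}\left(\frac{R^{2}}{m}\, S'\right) = 0 .
\end{align*}
```

The real part is (16) with f(t) = 0, and the imaginary part, after multiplication by 2R, is the continuity equation (12).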

This is the complete derivation and it does not depend upon any kind of abstruse mathematics, although the nature of the expansion up to second order must still be clarified. Indeed, it is well-known that when one assumes that the characteristic function should be written up to second order, then the Central Limit Theorem is in place, which is something that can be shown using this approach [24]. Thus, this derivation implies that Quantum Mechanics obeys the Central Limit Theorem.

3. Quantization in Spherical Coordinates: An Example

It is rather unsatisfactory that, to quantize a system (that is, to write down its Schrödinger equation) in some orthogonal curvilinear coordinate system, one has to first write down its Schrödinger equation in Cartesian coordinates and only then change to the desired orthogonal system on the quantum mechanical “side”.

This would imply the embarrassing conclusion that all the formalism depends upon a coordinate system, which is preposterous. There have been attempts in the literature to overcome these difficulties [25, 26], but even these approaches are permeated with additional suppositions, as in [13, 14, 15], where the author has to postulate that the total quantum-mechanical momentum operator $p_{q_i}$ corresponding to the generalized coordinate $q_i$ is given by

$$p_{q_i} = -i\hbar\frac{\partial}{\partial q_i} \tag{18}$$
and where one also has to write the kinetic energy term of the classical Hamiltonian as
$$H = \frac{1}{2m}\sum_{i,k} p_{q_i}^{*}\, g^{ik}\, p_{q_k}. \tag{19}$$

These approaches seem rather unsatisfactory for we would like to derive our results using only first principles, without having to add more postulates to the theory.

On the other hand, if we do have an axiomatic approach, since every formal aspect of the theory should be contained in the axioms (or else they are not a complete set of axioms), the problem of quantization in generalized curvilinear orthogonal coordinate systems must also be contained in the axioms in such a way that, having written the axioms in the desired coordinate system, one must find the Schrödinger equation in this same coordinate system. This is an imposition (and in fact quite a strong one, as we will see) upon the set of axioms. For the sake of clarity and simplicity, we show in what follows an example of quantization in spherical coordinates that can be generalized to any coordinate system [17, 27, 28]. We begin by rewriting our two axioms in the appropriate coordinate system as:

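Axiom 1. The characteristic function, defined as

$$Z(\mathbf{r},\delta\mathbf{r};t) = \int_{-\infty}^{+\infty} F(r,\theta,\phi,p_r,p_\theta,p_\phi;t)\, e^{i\,\mathbf{p}\cdot\delta\mathbf{r}/\hbar}\, d^{3}p, \tag{20}$$
is such that it can be written as
$$Z(\mathbf{r},\delta\mathbf{r};t) = \psi^{*}\!\left(\mathbf{r}-\frac{\delta\mathbf{r}}{2};t\right)\psi\!\left(\mathbf{r}+\frac{\delta\mathbf{r}}{2};t\right), \tag{21}$$
and should be expanded up to second order in δ**r**.

Axiom 2. For an isolated system, the joint phase-space probability density function obeys the integrated Liouville equation

$$\int_{-\infty}^{+\infty} \exp\!\left(\frac{i\,\mathbf{p}\cdot\delta\mathbf{r}}{\hbar}\right)\frac{dF(r,\theta,\phi,p_r,p_\theta,p_\phi;t)}{dt}\, d^{3}p = 0. \tag{22}$$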

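From these axioms, all the previous calculations proceed in the same fashion, with the usual complications introduced by the non-Cartesian coordinate system. Thus, the classical Hamiltonian in spherical coordinates is given by

$$H = \frac{1}{2m}\left[p_r^{2} + \frac{p_\theta^{2}}{r^{2}} + \frac{p_\phi^{2}}{r^{2}\sin^{2}\theta}\right] + V(r), \tag{23}$$
and the Liouville equation becomes
$$\begin{aligned}
&\frac{\partial F}{\partial t} + \frac{p_r}{m}\frac{\partial F}{\partial r} + \frac{p_\theta}{m r^{2}}\frac{\partial F}{\partial \theta} + \frac{p_\phi}{m r^{2}\sin^{2}\theta}\frac{\partial F}{\partial \phi} - \left[\frac{\partial V}{\partial r} - \frac{p_\theta^{2}}{m r^{3}} - \frac{p_\phi^{2}}{m r^{3}\sin^{2}\theta}\right]\frac{\partial F}{\partial p_r} \\
&\qquad + \frac{p_\phi^{2}}{m r^{2}\sin^{2}\theta}\cot\theta\,\frac{\partial F}{\partial p_\theta} = 0.
\end{aligned} \tag{24}$$
Now, the Fourier transformation in (20) can be easily constructed (we provide, in the appendix, a Maple program that performs each important, and involved, calculation of this section). Note that we have
$$\delta\mathbf{r} = \delta r\,\hat{r} + r\,\delta\theta\,\hat{\theta} + r\sin\theta\,\delta\phi\,\hat{\phi}, \tag{25}$$
where $\hat{r}$, $\hat{\theta}$, $\hat{\phi}$ are the unit vectors, and
$$\mathbf{p} = p_r\,\hat{r} + \frac{p_\theta}{r}\,\hat{\theta} + \frac{p_\phi}{r\sin\theta}\,\hat{\phi}, \tag{26}$$
giving
$$\mathbf{p}\cdot\delta\mathbf{r} = p_r\,\delta r + p_\theta\,\delta\theta + p_\phi\,\delta\phi, \tag{27}$$
which is a general feature of what are called Mathieu’s transformations [29], which form a subset of the canonical transformations. Indeed, point transformations are a particular case of Mathieu’s transformations.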

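The relations between the momenta in Cartesian and spherical coordinates are given by

$$\begin{aligned}
p_x &= p_r\sin\theta\cos\phi + \frac{p_\theta}{r}\cos\theta\cos\phi - \frac{p_\phi}{r}\frac{\sin\phi}{\sin\theta}, \\
p_y &= p_r\sin\theta\sin\phi + \frac{p_\theta}{r}\cos\theta\sin\phi + \frac{p_\phi}{r}\frac{\cos\phi}{\sin\theta}, \\
p_z &= p_r\cos\theta - \frac{p_\theta}{r}\sin\theta,
\end{aligned} \tag{28}$$
and the Jacobian relating the two volume elements is given by
$$dp_x\,dp_y\,dp_z = J_p\, dp_r\,dp_\theta\,dp_\phi, \tag{29}$$
and thus, since point transformations are canonical and the phase space volume element does not change, we have
$$J_p = \frac{1}{r^{2}\sin\theta}. \tag{30}$$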
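Relations (27), (28) and (30) can be checked numerically at an arbitrary point. The sketch below is only an illustration with the standard library: the function names and the sample values of (r, θ, ϕ), of the momenta and of the displacements are assumptions chosen for the example.

```python
import math


def momentum_matrix(r, theta, phi):
    """Matrix M of Eq. (28): (p_x, p_y, p_z) = M (p_r, p_theta, p_phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [st * cp, ct * cp / r, -sp / (r * st)],
        [st * sp, ct * sp / r, cp / (r * st)],
        [ct, -st / r, 0.0],
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def cartesian_momentum(r, theta, phi, p_sph):
    """Apply Eq. (28) to the triple (p_r, p_theta, p_phi)."""
    m = momentum_matrix(r, theta, phi)
    return [sum(m[i][j] * p_sph[j] for j in range(3)) for i in range(3)]


def cartesian_displacement(r, theta, phi, dr, dtheta, dphi):
    """Cartesian components of the displacement of Eq. (25)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        st * cp * dr + r * ct * cp * dtheta - r * st * sp * dphi,
        st * sp * dr + r * ct * sp * dtheta + r * st * cp * dphi,
        ct * dr - r * st * dtheta,
    ]


if __name__ == "__main__":
    r, theta, phi = 1.3, 0.8, 2.1
    # Jacobian of Eq. (29) against the closed form of Eq. (30)
    print(det3(momentum_matrix(r, theta, phi)), 1.0 / (r * r * math.sin(theta)))
    # Scalar product of Eq. (27), computed in Cartesian and in spherical form
    p_sph, d_sph = [0.4, -1.1, 0.6], [0.01, -0.02, 0.03]
    p_cart = cartesian_momentum(r, theta, phi, p_sph)
    d_cart = cartesian_displacement(r, theta, phi, *d_sph)
    print(sum(a * b for a, b in zip(p_cart, d_cart)),
          sum(a * b for a, b in zip(p_sph, d_sph)))
```

Each printed pair should agree to rounding error, which is a quick sanity check on the signs in (28) before they are used in the longer calculation that follows.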

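With results (27) and (30) we may write

$$Z(\mathbf{r},\delta\mathbf{r};t) = \int F(\mathbf{r},\mathbf{p};t) \times \exp\!\left[\frac{i}{\hbar}\left(p_r\,\delta r + p_\theta\,\delta\theta + p_\phi\,\delta\phi\right)\right]\frac{dp_r\,dp_\theta\,dp_\phi}{r^{2}\sin\theta}. \tag{31}$$

We now impose this transformation upon the Liouville equation (24), as required by Axiom 2, to find the equation for the characteristic function as

$$\begin{aligned}
&-\frac{\hbar^{2}}{m}\left[\frac{1}{r^{2}}\frac{\partial}{\partial r}\left(r^{2}\frac{\partial Z}{\partial(\delta r)}\right) + \frac{1}{r^{2}\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial Z}{\partial(\delta\theta)}\right) + \frac{1}{r^{2}\sin^{2}\theta}\frac{\partial^{2} Z}{\partial\phi\,\partial(\delta\phi)}\right] \\
&\quad + \frac{\hbar^{2}}{m}\left[\frac{\delta r}{r^{3}}\frac{\partial^{2} Z}{\partial(\delta\theta)^{2}} + \frac{\delta r}{r^{3}\sin^{2}\theta}\frac{\partial^{2} Z}{\partial(\delta\phi)^{2}} + \frac{\delta\theta\,\cot\theta}{r^{2}\sin^{2}\theta}\frac{\partial^{2} Z}{\partial(\delta\phi)^{2}}\right] + \delta r\,\frac{\partial V}{\partial r}\, Z = i\hbar\,\frac{\partial Z}{\partial t}.
\end{aligned} \tag{32}$$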

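To proceed with the calculations we must now write Z in spherical coordinates. We know that it must be written in Cartesian coordinates as

$$Z(\mathbf{r},\delta\mathbf{r};t) = \left\{ R^{2} + \frac{R}{4}\sum_{i,j=1}^{3}\delta x_i\,\delta x_j\,\frac{\partial^{2} R}{\partial x_i\,\partial x_j} - \frac{1}{4}\sum_{i,j=1}^{3}\delta x_i\,\delta x_j\,\frac{\partial R}{\partial x_i}\frac{\partial R}{\partial x_j}\right\} \times \exp\!\left[\frac{i}{\hbar}\left(\delta x\,\frac{\partial S}{\partial x} + \delta y\,\frac{\partial S}{\partial y} + \delta z\,\frac{\partial S}{\partial z}\right)\right], \tag{33}$$
where we used the fact that ψ(**r**; t) can be written as in (10). We now use the fact that
$$\begin{aligned}
\partial_x &= \sin\theta\cos\phi\,\partial_r + \frac{1}{r}\cos\theta\cos\phi\,\partial_\theta - \frac{1}{r}\frac{\sin\phi}{\sin\theta}\,\partial_\phi, \\
\partial_y &= \sin\theta\sin\phi\,\partial_r + \frac{1}{r}\cos\theta\sin\phi\,\partial_\theta + \frac{1}{r}\frac{\cos\phi}{\sin\theta}\,\partial_\phi, \\
\partial_z &= \cos\theta\,\partial_r - \frac{1}{r}\sin\theta\,\partial_\theta,
\end{aligned} \tag{34}$$
where $\partial_u$ is an abbreviation for ∂/∂u. Thus, in spherical coordinates, the characteristic function becomes (up to second order in δr, δθ and δϕ)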
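$$\begin{aligned}
Z(\mathbf{r},\delta\mathbf{r};t) = \Bigg\{ R^{2} &+ \frac{R}{4}\Bigg[\delta r^{2}\frac{\partial^{2} R}{\partial r^{2}} + \delta\theta^{2}\left(\frac{\partial^{2} R}{\partial\theta^{2}} + r\frac{\partial R}{\partial r}\right) + \delta\phi^{2}\left(\frac{\partial^{2} R}{\partial\phi^{2}} + r\sin^{2}\theta\,\frac{\partial R}{\partial r} + \cos\theta\sin\theta\,\frac{\partial R}{\partial\theta}\right) \\
&\quad + 2\,\delta r\,\delta\theta\left(\frac{\partial^{2} R}{\partial r\,\partial\theta} - \frac{1}{r}\frac{\partial R}{\partial\theta}\right) + 2\,\delta r\,\delta\phi\left(\frac{\partial^{2} R}{\partial r\,\partial\phi} - \frac{1}{r}\frac{\partial R}{\partial\phi}\right) + 2\,\delta\theta\,\delta\phi\left(\frac{\partial^{2} R}{\partial\theta\,\partial\phi} - \cot\theta\,\frac{\partial R}{\partial\phi}\right)\Bigg] \\
&- \frac{1}{4}\Bigg[\delta r^{2}\left(\frac{\partial R}{\partial r}\right)^{2} + \delta\theta^{2}\left(\frac{\partial R}{\partial\theta}\right)^{2} + \delta\phi^{2}\left(\frac{\partial R}{\partial\phi}\right)^{2} + 2\,\delta r\,\delta\theta\,\frac{\partial R}{\partial r}\frac{\partial R}{\partial\theta} + 2\,\delta r\,\delta\phi\,\frac{\partial R}{\partial r}\frac{\partial R}{\partial\phi} + 2\,\delta\theta\,\delta\phi\,\frac{\partial R}{\partial\theta}\frac{\partial R}{\partial\phi}\Bigg]\Bigg\} \\
&\times \exp\!\left[\frac{i}{\hbar}\left(\delta r\,\frac{\partial S}{\partial r} + \delta\theta\,\frac{\partial S}{\partial\theta} + \delta\phi\,\frac{\partial S}{\partial\phi}\right)\right].
\end{aligned} \tag{35}$$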

Substituting this expression into (32) and collecting zeroth and first order terms in δr, δθ, and δϕ, we find the following two equations (four, if we consider their scalar counterparts)

$$\delta\mathbf{r}\cdot\nabla\left[\frac{\partial S}{\partial t} + \frac{1}{2m}\left(\nabla S\right)^{2} + V(r) - \frac{\hbar^{2}}{2mR}\nabla^{2} R\right] = 0 \tag{36}$$
and
$$\frac{\partial R^{2}}{\partial t} + \nabla\cdot\left(\frac{R^{2}}{m}\nabla S\right) = 0, \tag{37}$$
all written in spherical coordinates (the gradient, the divergence, and the Laplacian differential operators). It is then possible to show that we have the equivalence of these two last equations with the Schrödinger equation given by
$$-\frac{\hbar^{2}}{2m}\nabla^{2}\psi(\mathbf{r},t) + V(r)\,\psi(\mathbf{r},t) = i\hbar\,\frac{\partial\psi(\mathbf{r},t)}{\partial t}, \tag{38}$$
also written in spherical coordinates. To see this, one needs only to write the last equation in spherical coordinates, write
$$\psi = R(\mathbf{r},t)\exp\!\left[\frac{i}{\hbar}S(\mathbf{r},t)\right], \tag{39}$$
substitute this result in the previous equation and subtract the overall expression from the one coming from (36). This ends our derivation. This procedure, developed for spherical coordinates, can be generalized to any coordinate system [17, 18].

It is important to note here the non-trivial algebraic relations involved in the preceding derivation. Equation (32) is already very complicated and the substitution in it of the extremely complicated expression (35) turns the problem into a very long and intricate (although direct) algebraic problem (see appendix A).

It would be an extravagance to believe that the fact that the derivation was successful in this case (and all cases related to other coordinate systems) is simply a matter of coincidence. Our confidence in the derivation method and the axioms should increase with the success of this application. When we work out, in a future section, the connection of this derivation with Feynman’s path integral approach, it is expected that this confidence will also be increased.

4. Connection to Bohr-Sommerfeld Rules

We have the definition of the characteristic function as given by (4) and also the imposition that it must be written as a product of the type shown in (5). Since the characteristic function is a Fourier transformation of the probability density defined upon phase space, if it is a product, then the probability density must be a convolution; this is just the convolution theorem of Fourier transforms. We thus write

$$F(q,p;t) = \int_{-\infty}^{+\infty}\phi^{*}(q,2p-p';t)\,\phi(q,p';t)\, dp', \tag{40}$$
where ϕ is some phase-space probability amplitude. In this case it is easy to show that the integration in (4) leads to
$$Z(q,\delta q;t) = \mathcal{F}\{\phi^{*}(q,p;t)\}\;\mathcal{F}\{\phi(q,p;t)\},$$
as desired, where $\mathcal{F}\{\phi\}$ represents the Fourier transformation of ϕ with respect to p.

Now writing (note the factor two in the denominator)

$$\psi\!\left(q+\frac{\delta q}{2};t\right) = \mathcal{F}\{\phi(q,p;t)\} = \int \exp\!\left(\frac{i}{2\hbar}\,p\,\delta q\right)\phi(q,p;t)\, dp, \tag{41}$$
such that
$$\psi^{*}\!\left(q-\frac{\delta q}{2};t\right) = \mathcal{F}\{\phi^{*}(q,p;t)\} = \int \exp\!\left(\frac{i}{2\hbar}\,p\,\delta q\right)\phi^{*}(q,p;t)\, dp, \tag{42}$$
we reach the expression in (5). Thus, the constraint (5) is mathematically equivalent to assuming the previous form for F(q, p; t), and thus the mathematical form for ψ(q; t) in terms of δq/2.

From results (41) and (42) it is very easy to mathematically derive the Bohr-Sommerfeld rules. Consider that we are interested in translating the amplitude ψ(q; t) in configuration space from the point q to the point q + Δq by infinitesimal transformations. In expression (41) we can see that the kernel of the infinitesimal transformation is given by

$$K_{p(q)}(q+\delta q,\,q) = \exp\!\left[\frac{i}{\hbar}\,p(q)\,\delta q\right], \tag{43}$$
and we write explicitly the dependence of p(q) on the variable q to make it clear that we are on a trajectory of the system (if one is distressed by the notion of trajectory, see Feynman’s approach, to be considered in the next section). The finite transformation
$$\psi(q;t) \to \psi(q+\Delta q;t)$$
would imply the kernel
$$\psi(q+\Delta q;t) = \int K_{p}(q+\Delta q,\,q)\,\phi(q,p;t)\, dp,$$
such that (the arguments here are quite similar to those of Feynman in his path integral approach [30]):
$$K_{p}(q+\Delta q,\,q) = \lim_{N\to\infty}\prod_{n=1}^{N} K_{p(q+(n-1)\delta q)}\big(q+n\,\delta q,\; q+(n-1)\,\delta q\big),$$
where we put Nδq = Δq and take the limit N → ∞, since Δq is a finite interval and δq is infinitesimal. Using (43) we find that this last expression can be written as
$$K_{p}(q+\Delta q,\,q) = \exp\!\left[\frac{i}{\hbar}\lim_{N\to\infty}\sum_{n=0}^{N-1} p(q+n\,\delta q)\,\delta q\right].$$

The sum in the exponent is clearly an integral taken along the trajectory of the particle and we end up with

$$K_{p}(q+\Delta q,\,q) = \exp\!\left[\frac{i}{\hbar}\int_{q}^{q+\Delta q} p(q')\, dq'\right].$$
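The limit that turns the product of infinitesimal kernels into the exponential of an integral can be illustrated numerically. In the sketch below the function p(q) is an assumed example (a harmonic oscillator with m = ω = 1 and E = 1, in units where ℏ = 1), chosen because its integral has a closed form; none of the names come from the paper.

```python
import cmath
import math

HBAR = 1.0  # illustrative units


def momentum(q):
    """Assumed p(q) on a trajectory: oscillator with m = omega = 1 and E = 1."""
    return math.sqrt(2.0 - q * q)


def kernel_product(q0, span, n_steps):
    """Product of n_steps infinitesimal kernels exp(i p(q) dq / hbar), Eq. (43)."""
    dq = span / n_steps
    kernel = 1.0 + 0.0j
    for n in range(n_steps):
        kernel *= cmath.exp(1j * momentum(q0 + n * dq) * dq / HBAR)
    return kernel


def kernel_integral(q0, span):
    """Finite kernel exp(i/hbar times the integral of p dq), in closed form."""

    def antiderivative(q):
        # Integral of sqrt(2 - q^2) with respect to q
        return 0.5 * q * math.sqrt(2.0 - q * q) + math.asin(q / math.sqrt(2.0))

    action = antiderivative(q0 + span) - antiderivative(q0)
    return cmath.exp(1j * action / HBAR)


if __name__ == "__main__":
    target = kernel_integral(0.0, 1.0)
    for n_steps in (10, 100, 1000):
        print(n_steps, abs(kernel_product(0.0, 1.0, n_steps) - target))
```

The printed differences decrease as the number of steps grows, in line with the statement that the sum in the exponent becomes an integral along the trajectory.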

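If Δq expresses a symmetry of the problem (q + Δq can be equal to q for rotations by 2π, for instance) we must impose that

$$\psi(q+\Delta q;t) = \pm\,\psi(q;t), \tag{44}$$
where the ± sign comes from the fact that ψ(q; t) is an amplitude, and the physically important quantity is the density, which allows both signs. Since we can now write
$$\psi(q+\Delta q;t) = \int \exp\!\left[\frac{i}{\hbar}\int_{q}^{q+\Delta q} p(q')\, dq'\right]\phi(q,p;t)\, dp,$$
we obey (44) if we put
$$\exp\!\left[\frac{i}{\hbar}\int_{q}^{q+\Delta q} p(q')\, dq'\right] = \pm 1.$$
This last expression immediately implies that
$$\int_{q}^{q+\Delta q} p(q')\, dq' = \begin{cases} 2n\pi\hbar = nh & \text{if } K_p = +1, \\[4pt] (2n\pi + \pi)\hbar = \left(n+\tfrac{1}{2}\right)h & \text{if } K_p = -1, \end{cases} \tag{45}$$
which is the expression for the Bohr-Sommerfeld rules, with the difference that, from the mathematical derivation, we also find the possibility of half-integral numbers.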
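As an illustration of rule (45), consider a one-dimensional harmonic oscillator, reading the integral as taken over one full period of the motion (an assumption of this sketch, since the periodic orbit is the symmetry being used). For that system the integral of p dq over a period equals 2πE/ω, so the K_p = −1 branch gives E = (n + 1/2)ℏω. The code below checks this numerically; the parameter values and function names are illustrative assumptions.

```python
import math

HBAR = 1.0  # illustrative units; Planck's constant is h = 2*pi*HBAR


def action_integral(energy, mass, omega, steps=2000):
    """Integral of p dq over one full period of a harmonic oscillator.

    The orbit is parametrized as q = A sin(u), p = m*omega*A cos(u), which
    avoids the square-root singularity at the turning points.
    """
    amplitude = math.sqrt(2.0 * energy / (mass * omega ** 2))
    du = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * du                      # midpoint rule
        p = mass * omega * amplitude * math.cos(u)
        dq_du = amplitude * math.cos(u)
        total += p * dq_du * du
    return total


def quantized_energy(n, omega, half_integer):
    """Energy obtained by inverting rule (45) with the closed form 2*pi*E/omega."""
    shift = 0.5 if half_integer else 0.0
    return (n + shift) * HBAR * omega


if __name__ == "__main__":
    h = 2.0 * math.pi * HBAR
    for n in range(3):
        energy = quantized_energy(n, omega=1.7, half_integer=True)
        # Ratio of the action integral to h; rule (45) predicts n + 1/2
        print(n, action_integral(energy, mass=0.9, omega=1.7) / h)
```

The rule itself does not say which branch applies to a given system; the half-integral branch is selected here only because it is the one that reproduces the ℏω/2 term of the oscillator.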

Half-integral quantum numbers were never predicted in the historical development of the Bohr-Sommerfeld theory, and their absence is usually considered a flaw of that theory, since there are a number of situations in which half-integral quantum numbers are necessary [31]. Obviously, this theory can address neither results related to the probability amplitudes, such as those related to intensities, nor problems without symmetries, except in an approximate way.

However, given the derivation process, relations (45) cannot be considered a “mere approximation” for systems showing some kind of symmetry, although it may be assumed as a first approximation (semiclassical is the usual word) for systems in which there is no available symmetry. As with the Feynman approach (that we will soon present), the integrals (45) give the most or least probable trajectories of the system’s particles, and, thus, they also furnish the points at which one should expect maxima or minima for the probability density.

5. Connections with Feynman’s Path Integral Approach

In this section we are interested in showing that our approach towards the establishment of the Bohr-Sommerfeld conditions is fully equivalent to Feynman’s path integral method (only slight modifications would be necessary) [17, 27, 30].

On the other hand, Feynman’s approach already has a very nice interpretation in terms of displacements and it points in the direction of randomness (without making it explicit–we will return to this point in what follows). Moreover, in Feynman’s interpretation there appears also a quantity (usually written as ε) representing “infinitesimal amounts of time”, which is equivalent to disregarding second order terms in ε.

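We begin by writing

$$p\,\delta q = p\,\frac{\delta q}{\delta t}\,\delta t,$$
and we put $\dot{q} = \delta q/\delta t$. This means that $\dot{q}$ represents the velocity taken over the same trajectory, since in general we would have
$$\delta q = \Delta q + \dot{q}\,\delta t,$$
where Δq is the separation between two distinct trajectories (which we made equal to zero). We must have
$$\Delta\int_{t_1}^{t_2} p\,\dot{q}\, dt = \Delta\int_{q(t_1)}^{q(t_2)} p\, dq = 0,$$
which is an expression for the Principle of Least Action. We may now use the Legendre relation
$$\dot{q}\,p = L(q,\dot{q};t) + E,$$
where $L(q,\dot{q};t)$ is the classical Lagrangian function and E is the energy (supposed constant) of the system under consideration. Our expression (41) becomes
$$\psi\!\left(q\!\left(t+\frac{\delta t}{2}\right)\right) = \int \exp\!\left\{\frac{i}{2\hbar}\left[L(q,\dot{q};t) + E\right]\delta t\right\} \times \phi(q,p;t)\, J\!\left(\frac{p}{\dot{q}}\right) d\dot{q}, \tag{46}$$
where $J(p/\dot{q})$ is the Jacobian of the transformation $(q,p) \to (q,\dot{q})$.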
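The relation between the product of momentum and velocity, the Lagrangian and the energy follows from the Legendre transformation H = p q̇ − L with H = E constant. The sketch below checks its integrated form along a classical trajectory, using a harmonic oscillator as an assumed example; the parameter values and function names are illustrative only.

```python
import math

MASS, OMEGA, AMPLITUDE = 1.0, 2.0, 0.5  # illustrative oscillator parameters


def trajectory(t):
    """Classical solution q(t), qdot(t) of the harmonic oscillator."""
    q = AMPLITUDE * math.sin(OMEGA * t)
    qdot = AMPLITUDE * OMEGA * math.cos(OMEGA * t)
    return q, qdot


def lagrangian(q, qdot):
    """L = kinetic energy minus potential energy."""
    return 0.5 * MASS * qdot ** 2 - 0.5 * MASS * OMEGA ** 2 * q ** 2


def energy():
    """Conserved energy of the trajectory defined above."""
    return 0.5 * MASS * OMEGA ** 2 * AMPLITUDE ** 2


def integrate(f, ta, tb, steps=2000):
    """Midpoint rule for the integral of f over [ta, tb]."""
    dt = (tb - ta) / steps
    return sum(f(ta + (k + 0.5) * dt) for k in range(steps)) * dt


def momentum_action(ta, tb):
    """Integral of p*qdot dt along the trajectory, with p = m*qdot."""
    return integrate(lambda t: MASS * trajectory(t)[1] ** 2, ta, tb)


def classical_action(ta, tb):
    """Integral of the Lagrangian along the same trajectory."""
    return integrate(lambda t: lagrangian(*trajectory(t)), ta, tb)


if __name__ == "__main__":
    ta, tb = 0.0, 1.3
    print(momentum_action(ta, tb))
    print(classical_action(ta, tb) + energy() * (tb - ta))
```

The two printed numbers coincide to rounding error, which is the integrated statement that the integral of p dq equals the classical action plus E times the elapsed time.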

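The kernel of the infinitesimal (in time) transformation in (46) is given by

$$K_{\dot{q}(t)}(t+\delta t,\,t) = J\!\left(\frac{p(t)}{\dot{q}(t)}\right)\exp\!\left\{\frac{i}{\hbar}\left[L(q,\dot{q};t) + E\right]\delta t\right\},$$
such that the transformation between two different times $t_a = 0$ and $t_b = t$ may be written as
$$K_{\dot{q}(t)}(t_b,\,t_a) = \lim_{N\to\infty}\prod_{n=1}^{N} K_{\dot{q}(t+(n-1)\delta t)}\big(t+n\,\delta t,\; t+(n-1)\,\delta t\big),$$
where $N\delta t = t_b - t_a$, making it necessary to take the limit N → ∞, since δt is infinitesimal. We may thus write [$t_n = t + (n-1)\,\delta t$]
$$K_{\dot{q}(t)}(t_b,\,t_a) = \lim_{N\to\infty}\prod_{n=1}^{N} J\!\left(\frac{p(t_n)}{\dot{q}(t_n)}\right) \times \exp\!\left\{\frac{i}{\hbar}\lim_{N\to\infty}\sum_{n=1}^{N}\left[L\big(q(t_n),\dot{q}(t_n);t_n\big) + E\right]\delta t\right\}.$$

In the appropriate limit we get

$$K_{\dot{q}(t)}(t_b,\,t_a) = A \times \exp\!\left[\frac{i}{\hbar}\int_{t_a}^{t_b} L\big(q(t),\dot{q}(t);t\big)\, dt + \frac{i}{\hbar}E\,(t_b - t_a)\right],$$
where we put
$$A = \lim_{N\to\infty}\prod_{n=1}^{N} J\!\left(\frac{p(t_n)}{\dot{q}(t_n)}\right).$$

Since the classical action is given by

$$S_{cl}(t_b,\,t_a) = \int_{t_a}^{t_b} L(q,\dot{q};t)\, dt,$$
we finally get the desired result
$$K_{\dot{q}(t)}(t_b,\,t_a) = A\,\exp\!\left[\frac{i}{\hbar}S_{cl}(t_b,\,t_a) + \frac{i}{\hbar}E\,(t_b - t_a)\right],$$
which is the expression for the kernel of Feynman’s path integral approach, up to the constant phase associated with the energy E. Note that the Feynman approach furnishes the amplitudes and is an alternative approach to the Schrödinger equation.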

6. Final Considerations

In this paper we have presented a direct mathematical method to derive the Schrödinger equation based upon only two axioms. The non-trivial aspect of this derivation was confirmed by the example on the quantization in curvilinear coordinate systems, of which we presented the spherical one.

We have also used this derivation to show that the Bohr-Sommerfeld rules are its direct formal consequence. This should be used to improve our understanding of the role played by these rules in the quantum mechanical formalism, instead of considering it just a matter of coincidence (a rather impressive one, if we also consider their generalization to the relativistic realm, from which one gets, for instance, the correct fine structure spectral lines of the Hydrogen atom). The statement, usually made, that this approach cannot deal with the harmonic oscillator because it misses the ℏω/2 term is avoided in the present approach since, as we have shown, this term can be present (although the approach itself does not furnish a way to decide when it should be present). These results contradict the notion that there is an “old” quantum theory in opposition to a “new” one. In fact, they show that the Bohr-Sommerfeld rules (for systems with exact symmetries) are carved into the deep mathematical structure of the quantum mechanical formalism.

Finally, we have also shown that the characteristic function derivation is completely equivalent to Feynman’s path integral approach, with the difference that it is taken on the phase-space, with a Hamiltonian function, not upon the configuration space, using a Lagrangian function, as it is with Feynman’s.

The didactic relevance of the present approach is twofold. First, it is quite direct and straightforward (at least in the Cartesian coordinate system) and allows one to address important interpretation issues that were the source of many discussions when the Schrödinger equation was proposed. Secondly, it puts some usual statements easily found in the literature in a new and different perspective, based on sound mathematical reasons, and not only on superficial thinking. Indeed, this approach can be used in any introductory course on Quantum Mechanics. This is the basis of the perspective of didactic transposition that we announced in the theoretical construction of the research problem.

At this point we may ask ourselves if there are other derivations of the Schrödinger equation (quantization methods) that can be made equivalent to the present one. Being based upon different physical quantities, they could together form a sound basis for the interpretation of the theory, to cite but one important possibility. We leave this task to future papers on the subject.

Acknowledgments

The National Council for Scientific and Technological Development (CNPq).

Supplementary Material

The following online material is available for this article: Supplementary material, a Maple program to perform the derivation in spherical coordinates.

References

  • [1]
    J. Mehra, Foundations of Physics 17, 5 (1987).
  • [2]
    J. Faye, in: Stanford Encyclopedia of Philosophy, edited by E.N. Zalta (Stanford University, Stanford, 2019).
  • [3]
    D. Howard, in: The Oxford Handbook of the History of Quantum Interpretations, edited by O. Freire Jr. (Oxford University Press, Oxford, 2022).
  • [4]
    D. Bohm, Phys. Rev. 85, 166 (1952).
  • [5]
    D. Bohm and B.J. Hiley, The Undivided Universe (Routledge, London, 1983), v. 1.
  • [6]
    L.E. Ballentine, Rev. Mod. Phys. 42, 358 (1970).
  • [7]
    L. de la Peña-Auerbach, J. Math. Phys. 10, 1620 (1969).
  • [8]
    L.J. La Peña-Auerbach, Phys. Lett. A 31, 403 (1970).
  • [9]
    L. de la Peña-Auerbach, J. Math. Phys. 12, 453 (1971).
  • [10]
    L.J. La Peña-Auerbach, Found. Phys. 12, 1017 (1982).
  • [11]
    H. Everett, Reviews of Modern Physics 29, 3 (1957).
  • [12]
    H. Everett, J.A. Wheeler, B.S. DeWitt, L.N. Cooper, D. Van Vechten and N. Graham, in: Princeton Series in Physics, edited by B. DeWitt and R.N. Graham (Princeton University Press, New Jersey, 1973).
  • [13]
    G.R. Gruber, Foundations of Physics 1, 891 (1971).
  • [14]
    G.R. Gruber, Prog. Theor. Phys. 6, 31 (1972).
  • [15]
    G.R. Gruber, Foundations of Physics 5, 227 (1975).
  • [16]
    E. Santos, in: The Oxford Handbook of the History of Quantum Interpretations, edited by O. Freire Jr. (Oxford University Press, Oxford, 2022).
  • [17]
    L.S.F. Olavo, Physica A 262, 127 (1999a).
  • [18]
    L.A. Gribov, Journal of Applied Spectroscopy 86, 4 (2019).
  • [19]
    L.S.F. Olavo, Physica A 271, 260 (1999b).
  • [20]
    T. Kuhn, The Structure of Scientific Revolutions (University of Chicago Press, Chicago, 1962), v. 1.
  • [21]
    Y. Chevallard, La Transposition Didactique (La Pensée Sauvage, Grenoble, 1991), v. 1.
  • [22]
    L.S.F. Olavo, M. Ferreira and R.G.G. Amorim, Rev. Bras. Ens. Fís. 44, e20220109 (2022).
  • [23]
    L.S.F. Olavo and M. Ferreira, Rev. Bras. Ens. Fís. 43, e20200508 (2021).
  • [24]
    L.S.F. Olavo, Foundations of Physics 34, 891 (2004).
  • [25]
    J.M. Domingos and M.H. Caldeira, Foundations of Physics 14, 2 (1984).
  • [26]
    N.M. Witriol, Foundations of Physics 5, 4 (1975).
  • [27]
    L.S.F. Olavo, Quantum Mechanics: Principles, New Perspectives, Extensions, and Interpretation (Nova Publishers, New York, 2016).
  • [28]
    R.H. Kohler, Foundations of Physics 6, 2 (1976).
  • [29]
    C. Lanczos, The Variational Principles of Mechanics (Dover, New York, 1970), v. 1.
  • [30]
    R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965), v. 1.
  • [31]
    L. Pauling and E.B. Wilson, Introduction to Quantum Mechanics, with applications to chemistry (Dover, New York, 1963), v. 1.
  • 1
    Since one may say that nowadays there are many such interpretations, which assume slightly different constructs [3].
  • 2
    One should remember that no mathematical derivation process can simply “find” the universal parameter of a theory. A similar situation can be found in Gravitation (where G is experimentally obtained) or in Electromagnetism.
  • 3
    We provide, in the appendix, a Maple program that performs each important (and involved) calculation of this section.
  • 4
    If one is distressed by the notion of trajectory, one may look at Feynman’s approach, to be considered in the next section.

Publication Dates

  • Publication in this collection
    15 July 2024
  • Date of issue
    2024

History

  • Received
    22 May 2024
  • Accepted
    05 June 2024