Overview of mathematical modelling ideas in single particle quantum mechanics

Quantum mechanical models in three dimensions and involving many particles, etc., use a formalism which is largely based on some key mathematical modelling ideas pertaining to single particle systems. These mathematical modelling ideas are principally designed to tell us what sets of values we are `allowed’ to observe when we measure some aspect of a quantum system, and with what probabilities we can expect to observe particular values from these sets. For quick reference, I wanted to have a relatively brief and coherent overview of these key mathematical ideas and have written the present note for this purpose. Dirac notation is used throughout.

Quantum particle in a ring

To start with, suppose all we know is that a quantum particle in a closed ring of length $L$ has a given linear momentum $p$ when an observation of its position is made. The result of the observation is that the particle will be found at some point $x$ lying in a small interval of length $dx$ on the closed ring. We assume that the radius of the ring is sufficiently large to make angular momentum considerations negligible. De Broglie tells us that the linear momentum of the quantum particle must be associated with a wavelength $\lambda$ of some wave function $\langle x | p \rangle$ according to the rule

$p = \frac{h}{\lambda} \qquad \qquad \qquad \qquad \qquad (1)$

where $h$ is Planck’s constant. (There is also a fundamental equation $E = hf$ linking the energy of a quantum particle to the frequency of a wave function. This will be introduced later in connection with modelling time evolution in quantum mechanics). Somewhat surprisingly at first sight, this is all the information we need to deduce a precise functional form for the wave function $\langle x | p \rangle$ as well as the allowed spectrum of momenta for the particle. (Note that $|\langle x| p\rangle|^2 dx$ gives the probability that a particle with momentum $p$ will be found at a position $x$ lying in a small interval of length $dx$ on the closed ring. This probability distribution can be observed through experiment, while $\langle x | p \rangle$ itself is unobservable directly. We refer to $\langle x | p \rangle$ as a quantum mechanical probability amplitude in this connection).

First, we imagine what the position wave function $\langle x | p \rangle$ must look like as a Fourier series. Since only a single wavelength is involved in this particular setup, $\langle x | p \rangle$ must be a pure wave, i.e., the exponential form of its Fourier series must consist of a single term, so we must have

$\langle x | p \rangle = A \exp(i k x) \qquad \qquad \qquad \qquad \qquad (2)$

for some $A$ that still remains to be calculated, where $k$ is the wave number

$k = \frac{2 \pi}{\lambda} \qquad \qquad \qquad \qquad \qquad (3)$

Combining (1) and (3), we can write de Broglie’s linear momentum formula as

$p = \hbar k \qquad \qquad \qquad \qquad \qquad (4)$

where $\hbar = \frac{h}{2 \pi}$ is the reduced Planck’s constant. We can then rewrite the above wave function to explicitly incorporate momentum as

$\langle x | p \rangle = A \exp\big(\frac{i x p}{\hbar}\big) \qquad \qquad \qquad \qquad \qquad (5)$

An assumption called the certainty condition in quantum mechanics, or sometimes the square integrability condition or normalizability condition, is that the quantum particle must be found upon measurement to be located somewhere on the ring, so we must have

$\int_0^L |\langle x| p\rangle|^2 dx = 1 \qquad \qquad \qquad \qquad \qquad (6)$

(This requires quantum mechanics to take place in a Hilbert space of square-integrable functions, as discussed further below). Substituting our wave function so far into this then gives $A = \frac{1}{\sqrt{L}}$ , so we can write the full position wave function as

$\langle x | p \rangle = \frac{1}{\sqrt{L}} \exp\big(\frac{i x p}{\hbar}\big) \qquad \qquad \qquad \qquad \qquad (7)$

Second, to deduce the permitted spectrum of momenta, we use the fact that the position wave function for a quantum particle in a ring must be periodic with period $L$ , so we must have

$\langle x | p \rangle = \langle x + L | p \rangle \qquad \qquad \qquad \qquad \qquad (8)$

Therefore

$\frac{1}{\sqrt{L}} \exp\big(\frac{i x p}{\hbar}\big) = \frac{1}{\sqrt{L}} \exp\big(\frac{i (x + L) p}{\hbar}\big) \qquad \qquad \qquad \qquad \qquad (9)$

which implies

$\frac{L p}{\hbar} = 2 n \pi$

for $n \in \mathbb{N}$ , and therefore the permitted spectrum of momenta for this setup is given by

$p = \frac{2 n \pi \hbar}{L} \qquad \qquad \qquad \qquad \qquad (10)$

It is impossible for this quantum particle to have a momentum $p$ that does not belong to this particular discrete spectrum.

In our one-dimensional quantum particle in a ring system, a measurement result that is nondegenerate uniquely defines a state of the system, denoted by a ket $| \ \rangle$ in Dirac notation, e.g., $| p \rangle$ for a momentum measurement or $| x \rangle$ for a position measurement. Formally, as outlined below, these kets are vectors in a Hilbert space. In contrast, a degenerate measurement is a measurement that cannot uniquely define a state of the system, i.e., it is consistent with more than a single state and is thus regarded as a superposition of states. As an example, the energy $E$ of our particle in a ring system is degenerate because it is a quadratic function of momentum, $E = \frac{p^2}{2m}$ , and there is nothing in the model to prevent the momentum being negative. Thus, the same value of $E$ can correspond to a momentum state $| p \rangle$ or a momentum state $| -p \rangle$ , so the label $E$ cannot by itself be used to uniquely define a state $| E \rangle$ here.

An event is a measurement finding for a system when it is in a known, i.e., pre-prepared, state. For example, we can measure the position $x$ for our system when we know the quantum particle in the ring has been prepared to have momentum state $| p \rangle$ . In quantum mechanics, all such events have probability amplitudes with the property that their absolute value squared gives a probability density for the event, as mentioned above. In this case, as mentioned earlier, the probability amplitude for the position measurement is $\langle x | p \rangle$ in Dirac notation, with $\int_0^L |\langle x | p \rangle|^2 dx = 1$ . Note that there is an implicit assumption that the measurement of position occurs instantaneously after the quantum system has been prepared in its initial momentum state, i.e., there should not have been enough time for the system to evolve away from its pre-prepared momentum state before position is measured. This implicit assumption applies to all quantum events. Also note that the famous Dirac delta function in the continuous case (and the Kronecker delta function in the discrete case) arise by considering self-space events, i.e., performing the same measurement twice in rapid succession. For example, suppose the quantum particle is known to be in position state $| X \rangle$ and then a second measurement of position is quickly made. The measurement space in this case is continuous and the self-space probability amplitude $\langle x | X \rangle$ is given by the Dirac delta function

$\langle x | X \rangle = \delta(x - X)$

which is such that $\delta(x - X) = 0$ for $x \neq X$ , while $\delta(x - X) = \infty$ for $x = X$ in such a way that $\int_{-\infty}^{\infty} dx \ \delta(x - X) = 1$ , and which famously also has as its defining property the integral equation

$\int_{-\infty}^{\infty} dx \ \delta(x - X) f(x) = f(X)$

In the case of repeated measurements of momentum, which has a discrete measurement space for the quantum particle in a ring model, the corresponding self-space probability amplitude would be given by the Kronecker delta

$\langle p | P \rangle = \delta_p^P$

which is such that $\delta_p^P = 1$ if $p = P$ , and $\delta_p^P = 0$ if $p \neq P$ . The Dirac delta function can be thought of as the continuous measurement space analogue of the Kronecker delta for discrete spaces.

The abstract measurement space consisting of the spectrum of all possible results you can get from a measurement is a Hilbert space, i.e., a complex inner product space which is also a complete metric space with regard to the distance function induced by the inner product. The state of a physical system is treated as a vector in such a Hilbert space. There is a Hilbert space containing all the possible position and momentum states above, for example, and the probability amplitudes $\langle x | p \rangle$ are formally the inner products of the two state vectors $| x \rangle$ and $| p \rangle$ in this Hilbert space. A necessary and sufficient condition for these inner products to exist is that the functions involved are square-integrable. The completeness of these measurement spaces means that each can act as a basis, i.e., a ‘language’, in terms of which other states can be represented. Thus, we can regard the probability amplitude $\langle x | p \rangle$ as the projection of state $| p \rangle$ on $x$ -space, or the momentum state $| p \rangle$ expressed in the language of $x$ , or simply the state $| p \rangle$ expressed as a function of $x$ . Similarly, $\langle p | x \rangle$ can be interpreted as the position state $| x \rangle$ expressed in the language of $p$ . A key axiom of quantum mechanics is that $\langle x | p \rangle$ and $\langle p | x \rangle$ are related via complex conjugation:

$\langle p | x \rangle = \langle x | p \rangle^{\ast} \qquad \qquad \qquad \qquad \qquad (11)$

So, for example, given $\langle x | p \rangle$ as defined in (7) above, we can immediately deduce that

$\langle p | x \rangle = \frac{1}{\sqrt{L}} \exp\big(\frac{- i x p}{\hbar}\big) \qquad \qquad \qquad \qquad \qquad (12)$

(But remember that these probability amplitudes are never directly observable. Only the associated probability distributions are observable experimentally).

We can use this idea of measurement spaces acting as language spaces (or, more formally, basis sets) to express one state as a superposition of other states. For example, a hypothetical state $| \psi \rangle$ can be experimentally resolved into momentum state components, expressed mathematically as the superposition

$| \psi \rangle = \sum_p | p \rangle \langle p | \psi \rangle \qquad \qquad \qquad \qquad \qquad (13)$

This is a discrete sum because recall that, for the quantum particle in a ring model, the measurement space for momentum is discrete. On the right-hand side, each $| p \rangle$ is a pure momentum state, and the bra-ket $\langle p | \psi \rangle$ is the weight attached to this pure momentum state in the superposition. The same state $| \psi \rangle$ can equally well be perceived as a superposition of position states:

$| \psi \rangle = \sum_x | x \rangle \langle x | \psi \rangle \qquad \qquad \qquad \qquad \qquad (14)$

The summation here is to be implemented as an integral since the measurement space for position is continuous for the quantum particle in a ring model. Thus, we can write (14) in integral form as

$| \psi \rangle = \int_x dx \ \psi(x) | x \rangle$

Notice how the delta function definition of the self-space amplitude, $\langle X | x \rangle = \delta(X - x)$ , now arises in this integral when computing the coefficients of the expansion of $\psi$ in terms of the continuous position basis:

$\langle X| \psi \rangle = \int_x dx \ \psi(x) \langle X | x \rangle = \int_x dx \ \psi(x) \delta(X - x) = \psi(X)$

Thus, the position space wave functions $\psi(x)$ , evaluated at particular values of $x$ , are just the coefficients of a state $| \psi \rangle$ (belonging to Hilbert space) in a superposition using a continuous position basis.

It is useful to bear in mind that these superposition equations are analogous to how a periodic function $f(x)$ can be expanded in a Fourier series. We are using the same kinds of mathematical structures and arguments in defining the superpositions above.

Both of the superposition equations above are special cases of a general rule called the completeness theorem or superposition theorem, which can be expressed mathematically in Dirac notation as

$\sum_q | q \rangle \langle q | = 1 \qquad \qquad \qquad \qquad \qquad (15)$

The ket $| q \rangle$ on the left-hand side of this equation is a pure $q$ -state. The bra $\langle q |$ can be regarded here as an operator that, when it is applied to any state $| \psi \rangle$ , gives the weight attached to that pure $q$ -state in the superposition for $| \psi \rangle$ :

$| \psi \rangle = \sum_q | q \rangle \langle q | \psi \rangle \qquad \qquad \qquad \qquad \qquad (16)$

In this last equation, it is as if we are multiplying $| \psi \rangle$ on the right-hand side by 1 to get $| \psi \rangle$ again on the left-hand side, so we represent this situation as in equation (15) above.

Measuring an observable for our quantum system, assuming the system has been prepared so that it is in a known state when the measurement takes place, is modelled mathematically as applying a suitable operator to the prepared state ket. If an operator $\hat{\Omega}$ operates on a state ket $| \psi \rangle$ , another new state ket is produced denoted by $\hat{\Omega} | \psi \rangle$ . For each such operator $\hat{\Omega}$ , there must be a rule or instruction which tells us what is produced when the ket undergoes the operation. This rule must itself be given in some language, e.g., the representation of $\hat{\Omega}$ in $x$ -space is denoted $\langle x | \hat{\Omega}$ , and in $p$ -space is denoted $\langle p | \hat{\Omega}$ . An important example is the operator $\hat{D}$ whose rule in $x$ -space is take the derivative with respect to x. Therefore, in Dirac notation:

$\langle x | \hat{D} = \frac{d}{dx} \langle x | \qquad \qquad \qquad \qquad \qquad (17)$

This equation can be applied to any ket, so for example

$\langle x | \hat{D} | \psi \rangle = \frac{d}{dx} \langle x | \psi \rangle \qquad \qquad \qquad \qquad \qquad (18)$

In the same way that a state has an existence which is independent of the language used to represent it, so too may operators be expressed in different languages. For example, we can convert an instruction $\langle x | \hat{\Omega}$ in $x$ -language to an instruction $\langle p | \hat{\Omega}$ in $p$ -language by using a ket-bra sum $\sum_x | x \rangle \langle x |$ . We get

$\langle p | \hat{D} | \psi \rangle = \sum_x \langle p | x \rangle \langle x | \hat{D} | \psi \rangle \qquad \qquad \qquad \qquad \qquad (19)$

No specific state $| \psi \rangle$ is involved here, so we can write the general language-translation-rule as

$\langle p | \hat{D} = \sum_x \langle p | x \rangle \langle x | \hat{D} \qquad \qquad \qquad \qquad \qquad (20)$

To give an example of implementing this in practice, recall that for our quantum particle in a ring system we have

$\langle p | x \rangle = \frac{1}{\sqrt{L}} \exp\big(\frac{- i x p}{\hbar}\big)$

Therefore, the conversion represented by (19) can be written in conventional notation as

$\langle p | \hat{D} | \psi \rangle = \int_0^L dx \frac{1}{\sqrt{L}} \exp\big(\frac{- i x p}{\hbar}\big) \frac{d \psi}{d x} \qquad \qquad \qquad \qquad \qquad (21)$

Integrating the right-hand side by parts, we find it equals

$\frac{i p}{\hbar} \int_0^L dx \frac{1}{\sqrt{L}} \exp\big(\frac{- i x p}{\hbar}\big) \psi(x)$

$= \frac{i p}{\hbar} \sum_x \langle p | x \rangle \langle x | \psi \rangle$

$= \frac{i p}{\hbar} \langle p | \psi \rangle$

Therefore,

$\langle p | \hat{D} | \psi \rangle = \frac{i p}{\hbar} \langle p | \psi \rangle$

Since $\psi$ is arbitrary, the general rule is

$\langle p | \hat{D} = \frac{i p}{\hbar} \langle p | \qquad \qquad \qquad \qquad \qquad (22)$

The operators that represent measurement of observables in quantum mechanics are linear and hermitian. (A hermitian matrix is the complex analogue of a symmetric matrix. It is equal to its own conjugate transpose. Another type of matrix which arises frequently in quantum mechanics is a unitary matrix. This is the complex analogue of an orthogonal matrix, the latter having the property that the transpose matrix is the inverse. In the case of a unitary matrix, the complex conjugate transpose is the inverse. Hermitian operators are usually related to the measurement of observables in quantum mechanics, whereas unitary operators are related to carrying out processes such as translations of quantum states in space and time. We need to extend the symmetric and orthogonal properties of ordinary matrices to include complex conjugation in the definition of hermitian and unitary matrices to cater for the fact that quantum mechanics involves complex numbers, e.g., in the unitary matrix case, simply transposing a complex matrix and multiplying by the original matrix would in general produce another complex matrix rather than an identity matrix, without the complex conjugation step). As per Sturm-Liouville theory, hermitian operators have the property that their eigenvalues are real and form complete measurement spaces, i.e., they represent the entire spectrum of values that can be observed when a measurement is made. It is important to understand, however, that these eigenvalues are only revealed when the operator is applied to its particular home measurement space, formally called its eigenspace. Each operator is defined in terms of a particular instruction (telling us what the operator does) and in terms of the set of eigenvalues it gives rise to when it is applied to its particular eigenspace. The effect of applying an operator to a ket in its eigenspace will be to produce the same ket multiplied by an eigenvalue of the operator. The kets for which this happens are the eigenkets of the operator. For example, based on the result in (22) above, define a new operator $\hat{p}$ as

$\hat{p} = - i \hbar \hat{D} \qquad \qquad \qquad \qquad \qquad (23)$

Then on the basis of (22), we immediately deduce that

$\langle p | \hat{p} = p \langle p | \qquad \qquad \qquad \qquad \qquad (24)$

This is the bra form of the eigenvalue equation for $\hat{p}$ , meaning that in $p$ -language, which is the language of the eigenspace of $\hat{p}$ , the effect of $\hat{p}$ is to multiply any state it operates on by a number $p$ . We can equivalently state the eigenvalue equation in ket form:

$\hat{p} | p \rangle = p | p \rangle \qquad \qquad \qquad \qquad \qquad (25)$

This is saying that the effect of $\hat{p}$ on its eigenket is to produce a number times the same eigenket, no matter what language the instruction for $\hat{p}$ is expressed in. The two forms (24) and (25) are equivalent, and both mean that the operator $\hat{p}$ produces the spectrum of momenta for the quantum particle in a ring model, and the eigenkets of $\hat{p}$ are $| p \rangle$ . We say that the $p$ -space kets $| p \rangle$ are eigenstates of $\hat{p}$ with eigenvalues $p$ . Notice that when the operator $\hat{D}$ is expressed in $x$ -language instead of $p$ -language, we do not get a proper eigenvalue equation (see (18) above). Therefore, $x$ -space is not the eigenspace of $\hat{D}$ . However, when the operator $\hat{D}$ is expressed in $p$ -language, we do get a proper eigenvalue equation (equation (22) above), so $p$ -space is the eigenspace for $\hat{D}$ .

Similarly, since position is a measurable quantity for the quantum particle in a ring, there is a position operator $\hat{x}$ which generates a position measurement space, just as $\hat{p}$ generates momentum measurement space. The eigenvalue equation for $\hat{x}$ is simply

$\hat{x} | x \rangle = x | x \rangle \qquad \qquad \qquad \qquad \qquad (26)$

Note that each of $\hat{x}$ and $\hat{p}$ can be expressed in the language of the other’s space. Expressing $\hat{x}$ in $p$ -language, we find

$\langle p | \hat{x} = i \hbar \frac{d}{dp} \langle p| \qquad \qquad \qquad \qquad \qquad (27)$

(Proof: $\langle p | \hat{x} | \psi \rangle = \sum_x \langle p | x \rangle \langle x | \hat{x} | \psi \rangle = \sum_x \langle p | x \rangle x \langle x | \psi \rangle = i\hbar \frac{d}{dp} \sum_x \langle p | x \rangle \langle x | \psi \rangle = i\hbar \frac{d}{dp} \langle p | \psi \rangle$ ). Similarly, expressing $\hat{p}$ in $x$ -language we get

$\langle x | \hat{p} = - i \hbar \frac{d}{dx} \langle x| \qquad \qquad \qquad \qquad \qquad (28)$

(Proof: $\langle x | \hat{p} | \psi \rangle = \sum_p \langle x | p \rangle \langle p | \hat{p} | \psi \rangle = \sum_p \langle x | p \rangle p \langle p | \psi \rangle = -i\hbar \frac{d}{dx} \sum_p \langle x | p \rangle \langle p | \psi \rangle = -i\hbar \frac{d}{dx} \langle x | \psi \rangle$ ). In each case, we are not using the correct eigenspace for the operator, so the equations we get are not in the form of proper eigenvalue equations. However, we can check that (27) and (28) are correct by seeing if they do produce proper eigenvalue equations when applied to the correct kets. When applied to the ket $| x \rangle$ , equation (27) must produce the proper eigenvalue equation

$\langle p | \hat{x}| x \rangle = x \langle p| x \rangle \qquad \qquad \qquad \qquad \qquad (29)$

Therefore we need to confirm that

$i \hbar \frac{d}{dp} \langle p| x \rangle = x \langle p| x \rangle \qquad \qquad \qquad \qquad \qquad (30)$

but this result is immediate when we substitute (12) into each side of (30). Similarly, when applied to the ket $| p \rangle$ , equation (28) must produce the proper eigenvalue equation

$\langle x | \hat{p}| p \rangle = p \langle x | p \rangle \qquad \qquad \qquad \qquad \qquad (31)$

Therefore we need to confirm that

$- i \hbar \frac{d}{dx} \langle x | p \rangle = p \langle x | p \rangle \qquad \qquad \qquad \qquad \qquad (32)$

but this is again immediate when we substitute (7) into each side of (32).

Another interesting thing to observe here is that we can reverse this argument and derive the form of the probability amplitude $\langle x | p \rangle$ as a function of $x$ from (32), interpreted as a differential equation. We can regard (32) as the the eigenvalue equation in (25) above, but expressed in the language of $x$ (i.e., the differential equation in (32) is obtained by introducing the bra $\langle x|$ in (25) above, yielding $\langle x | \hat{p} | p \rangle = p \langle x | p \rangle$ ). The solution to (32) is precisely the form of $\langle x | p \rangle$ given in (7) above, with the constant coming from the normalization condition $\int_0^L |\langle x| p\rangle|^2 dx = 1$ as before.

In addition to position and momentum operators $\hat{x}$ and $\hat{p}$ , energy is also an observable, so it must have an operator that generates the spectrum of measurable energies. This is the Hamiltonian operator, which for a quantum particle in a ring model has the form

$\hat{H} = \frac{\hat{p}^2}{2m} \qquad \qquad \qquad \qquad \qquad (33)$

(There is no potential energy in our simple model system, so no function of $x$ in the Hamiltonian). Note that the form of the operator $\hat{H}$ is the same as the form of the classical kinetic energy function $\frac{p^2}{2m}$ . In the quantum case, the eigenvalues of $\hat{H}$ are precisely $\frac{p^2}{2m}$ . Remarkably, a simple rule that works for finding quantum mechanical operators corresponding to classical observables is to put hats on the observables in algebraic relations between classical observables. This is how we can get $\hat{H}$ from $\frac{p^2}{2m}$ . As another example, we could in principle obtain a quantum velocity operator as $\hat{v} = \frac{\hat{p}}{m}$ .

Having the form of the Hamiltonian operator $\hat{H}$ , we can now find its eigenvalues from the eigenvalue problem that generates them:

$\hat{H} | E \rangle = E | E \rangle \qquad \qquad \qquad \qquad \qquad (34)$

This is actually the general form of the famous time-independent Schrödinger equation. Casting this equation into $x$ -space, we can solve for the probability amplitudes $\langle x | E \rangle$ :

$-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \langle x | E \rangle = E \langle x | E \rangle \qquad \qquad \qquad \qquad \qquad (35)$

This is a simple second-order linear differential equation with general solution

$\langle x | E \rangle = A \exp\bigg(i x \sqrt{\frac{2mE}{\hbar^2}}\bigg) + B \exp\bigg(- i x \sqrt{\frac{2mE}{\hbar^2}}\bigg) \qquad \qquad \qquad \qquad \qquad (36)$

From the fact that this wave function must be single-valued for our quantum particle in a ring, we deduce

$E = \frac{n^2 h^2}{2 m L^2}$

for $n \in \mathbb{N}$ . Remember, however, that there is degeneracy in $E$ for the quantum particle in a ring model: there is nothing to prevent there being two possible $p$ -states (one positive, one negative) for each value of $E$ .

Quantum particle on an infinite line

Suppose we modify the previous model by allowing the radius of the ring, $\frac{L}{2 \pi}$ , to become infinitely large, so that the `track’ now effectively extends infinitely to the right and infinitely to the left of the starting point at zero. The quantum particle is then free to move anywhere on the real line, i.e., the position measurement space has become $x \in (-\infty, \infty)$ . For such a quantum particle, momentum is no longer quantized (intuitively, this is because the probability amplitude periodicity constraint $\langle x | p \rangle = \langle x + L | p \rangle$ is no longer relevant), so the momentum measurement space is now $p = \hbar k \in (-\infty, \infty)$ . In this context, with both position and momentum being continuous measurement spaces, Fourier transforms arise naturally because they enable any function of $x$ over the domain $x \in (-\infty, \infty)$ , such as $\psi(x) = \langle x | \psi \rangle$ , to be expanded as an integral in $k$ over the same domain:

$\psi(x) = \int_{-\infty}^{\infty} dk \exp(i k x) \phi(k) \qquad \qquad \qquad \qquad \qquad (37)$

where $\phi(k)$ is given by

$\phi(k) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} dx \exp(- i k x) \psi(x) \qquad \qquad \qquad \qquad \qquad (38)$

We can use this Fourier transform relationship to obtain physically-realistic mathematical models of a quantum particle on an infinite line. As is well known, in order to obtain a well-behaved wave function representation of a free particle on an infinite line, i.e., a wave function that is square-integrable, it is necessary to construct a superposition of pure waves incorporating a continuous range of wave numbers $k$ . This wave packet approach then naturally gives rise to discussions relating to Heisenberg’s uncertainty principle. Before considering this, however, it is interesting to note that we can also derive a pseudo-normalized pure wave form for the wave function $\langle x | p \rangle$ in the case of a quantum particle on an infinite line, somewhat analogous to the pure wave form in the previous section. To do this, we will first obtain a representation of the Dirac delta function using the above Fourier transform by letting $\psi(x) = \langle x | X = 0 \rangle = \delta(x)$ . Putting this into (38) gives

$\phi(k) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} dx \exp(- i k x) \delta(x) = \frac{1}{2 \pi} \exp(- i k 0) = \frac{1}{2 \pi}\qquad \qquad \qquad \qquad \qquad (39)$

Then, putting this result into (37) gives us the following useful representation:

$\delta(x) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} dk \exp(i k x) \qquad \qquad \qquad \qquad \qquad (40)$

We will also observe that for any $b \in \mathbb{R}-\{0\}$ , we have

$\delta(b x) = \frac{\delta(x)}{|b|} \qquad \qquad \qquad \qquad \qquad (41)$

(To see this, note first that $\delta(x) = \delta(-x)$ because $\int dx f(x) \delta(x) = \int dx f(x) \delta(-x) = f(0)$ . Therefore, only the absolute value of $b$ in $\delta(bx)$ is relevant. We need to show that $\int dz g(z) \delta(b z) = \frac{1}{|b|} \int dx f(x) \delta(x)$ . In the integral on the left-hand side, make the change of variable $x = |b| z$ . Then $z = \frac{x}{|b|}$ and $dz = \frac{dx}{|b|}$ , and noting that $g\big(\frac{x}{b}\big) = f(x)$ , i.e., both are functions of $x$ , we can write $\int dz g(z) \delta(b z) = \frac{1}{|b|} \int dx f(x) \delta(x)$ as required).

We now use this machinery to derive an explicit pure wave functional form for $\langle x | p \rangle$ . First, consider the self-space bra-ket

$\langle p | P \rangle = \delta(p - P) = \delta(\hbar(k - K)) = \frac{1}{\hbar} \delta(k - K)\qquad \qquad \qquad \qquad \qquad (42)$

where the final equality follows from (41). Using (40), reversing the roles of $x$ and $k$ there, we can write

$\langle p | P \rangle = \frac{1}{\hbar}\delta(k - K)$

$= \frac{1}{\hbar}\delta(K - k)$

$= \frac{1}{\hbar}\frac{1}{2 \pi} \int_{-\infty}^{\infty} dx \exp(i (K-k) x)$

$= \int_{-\infty}^{\infty} dx \big[\frac{1}{\sqrt{h}}\exp(-i k x)\big]\big[\frac{1}{\sqrt{h}} \exp(i K x) \big]\qquad \qquad \qquad \qquad \qquad (43)$

But we also have in Dirac notation

$\langle p | P \rangle = \sum_x \langle p | x \rangle \langle x | P \rangle \qquad \qquad \qquad \qquad \qquad (44)$

Comparing (43) and (44) we can immediately see that

$\langle x | p \rangle = \frac{1}{\sqrt{h}} \exp(i k x) = \frac{1}{\sqrt{h}} \exp\big(\frac{i x p}{\hbar}\big) \qquad \qquad \qquad \qquad \qquad (45)$

The only difference between $\langle x | p \rangle$ here and in (7) above for the quantum particle in a ring case is that the normalization constant here is $\frac{1}{\sqrt{h}}$ instead of $\frac{1}{\sqrt{L}}$ . It is easily checked that (45) is in fact not square-integrable for the case of the quantum particle on an infinite line, i.e., $\langle x | p \rangle$ is not an element of a Hilbert space in the conventional way here, since $\int_{-\infty}^{\infty} dx \ |\langle x | p \rangle|^2 = \infty$ . Instead, a kind of pseudo-normalization convention is being used in specifying the normalization constant in (45), called normalization to a Dirac delta function. From (43) (remembering that $k = \frac{p}{\hbar}$ ) we see that the normalization constant in (45) is giving us normalization to a Dirac delta function in the form

$\int_{-\infty}^{\infty} dx \ \langle p | x \rangle \langle x | P \rangle = \delta(p - P) \qquad \qquad \qquad \qquad \qquad (46)$

However, the non-square-integrable individual pure waves in (45) do not represent physically realizable states. To construct a square-integrable wave packet representation of a quantum particle on a line, we go back to the Fourier transform equation (37) above and regard it as a sum of pure waves over a continuous range of wave numbers $k$ , with the amplitudes of these pure waves being modulated by the function $\phi(k)$ . A standard example is a Gaussian wave packet which we can obtain here by specifying

$\phi(k) = \frac{1}{\sqrt{2 \pi}}\big(\frac{2 \alpha}{\pi}\big)^{1/4} e^{-\alpha(k-k_0)^2}$

where $k_0$ is some fixed wave number. Note that $|\sqrt{2\pi} \phi(k)|^2$ is the probability density function of a $\mathcal{N}(k_0, \frac{1}{4\alpha})$ variable, i.e., up to a normalizing constant we get a Gaussian probability distribution for the wave number $k$ centered on $k_0$ with standard deviation $\frac{1}{2\sqrt{\alpha}}$ . Inserting this expression for $\phi(k)$ in (37) gives

$\psi(x) = \frac{1}{\sqrt{2 \pi}} \big(\frac{2\alpha}{\pi}\big)^{1/4} \int_{-\infty}^{\infty} dk \ e^{-\alpha(k-k_0)^2} e^{i k x}$

The integral here can be solved using straightforward changes of variables to give the result

$\psi(x) = \big(\frac{1}{2 \pi \alpha}\big)^{1/4} e^{i k_0 x} e^{-x^2/4 \alpha}$

This is a localized wave function and is square-integrable since $|\psi(x)|^2$ is the probability density function of a $\mathcal{N}(0, \alpha)$ variable, i.e., we get a Gaussian probability distribution for the position $x$ centered at zero and with standard deviation $\sqrt{\alpha}$ . (As a check, note that substituting this expression for $\psi(x)$ into (38) must give us the expression we used for $\phi(k)$ . Using the the result

$-\big(\frac{1}{4\alpha}\big)x^2 - i(k - k_0)x = -\big(\frac{1}{4\alpha}\big)(x+2 \alpha i(k-k_0))^2 - \alpha(k-k_0)^2$

and the fact that $\int_{-\infty}^{\infty} dx \ e^{-a(x+b)^2} = \sqrt{\frac{\pi}{a}}$ , we can substitute the expression for $\psi(x)$ into (38) to get

$\phi(k) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} dx \ \big(\frac{1}{2 \pi \alpha}\big)^{1/4} e^{i k_0 x} e^{-x^2/4 \alpha} e^{-i k x}$

$= \frac{1}{2 \pi} \big(\frac{1}{2 \pi \alpha}\big)^{1/4} \int_{-\infty}^{\infty} dx \ e^{-i (k - k_0)x} e^{-x^2/4 \alpha}$

$= \frac{1}{2 \pi} \big(\frac{1}{2 \pi \alpha}\big)^{1/4} e^{-\alpha(k-k_0)^2} \int_{-\infty}^{\infty} dx \ e^{-(x + 2 \alpha i (k - k_0))^2/4 \alpha}$

$= \frac{1}{2 \pi} \big(\frac{1}{2 \pi \alpha}\big)^{1/4} e^{-\alpha(k-k_0)^2} \bigg(\frac{\pi}{\big(\frac{1}{4 \alpha}\big)}\bigg)^{1/2}$

$= \frac{1}{\sqrt{2 \pi}}\big(\frac{2 \alpha}{\pi}\big)^{1/4} e^{-\alpha(k-k_0)^2}$

as required). Denoting the standard deviation of $x$ by $\Delta x$ and the standard deviation of $k$ by $\Delta k$ , we have

$\Delta x \Delta k = \sqrt{\alpha} \cdot \frac{1}{2 \sqrt{\alpha}} = \frac{1}{2}$

But using $p = \hbar k$ , we can write this as

$\Delta x \Delta p = \frac{\hbar}{2}$

It can be shown that for anything other than a Gaussian wave packet we would find

$\Delta x \Delta p > \frac{\hbar}{2}$

and thus we obtain the famous Heisenberg uncertainty principle for position and momentum:

$\Delta x \Delta p \ge \frac{\hbar}{2}$

This wave packet model is a static one, i.e., the wave function $\psi(x)$ is time-independent. We can obtain a time-dependent wave function $\psi(x, t)$ by multiplying the integrand in (37) by a quantum mechanical time evolution operator (the time evolution operator will be discussed below). Then, using the same $\phi(k)$ as in the static case and manipulating the integral in a similar manner as before, we find that $\psi(x, t)$ is again a localized Gaussian wave packet but now with the property that the standard deviation of the associated Gaussian probability distribution increases with time (i.e., the bell-shaped Gaussian probability distribution spreads out over time).

Quantum harmonic oscillator

To extend the model further, we now consider a quantum harmonic oscillator in which the position still has domain $x \in (-\infty, \infty)$ . In this model, the momentum $p$ is again no longer quantized but energy $E$ remains quantized and is no longer a degenerate measurement. The approach is essentially to take the ideal classical harmonic oscillator’s Hamiltonian, convert it to a quantum mechanical Hamiltonian operator by `putting hats’ on the classical observables, and then use the operator to solve an eigenvalue problem.

Recall that, in the ideal classical harmonic oscillator model, a particle of mass $m$ attached to a spring with spring constant $K$ is always pulled towards an equilibrium position by a Hooke’s law force $F = -Kx$ . Newton’s second law $F = ma$ then gives us a second-order differential equation

$m \frac{d^2 x}{dt^2} = -Kx \qquad \qquad \qquad \qquad \qquad (47)$

with general solution

$x = A \cos\big(\sqrt{\frac{K}{m}} \cdot t\big) + B \sin \big(\sqrt{\frac{K}{m}} \cdot t\big) \qquad \qquad \qquad \qquad \qquad (48)$

The angular frequency (or, equivalently, angular velocity – radians per second) is thus

$\omega = \sqrt{\frac{K}{m}} \qquad \qquad \qquad \qquad \qquad (49)$

and is usually referred to as the natural oscillation frequency of the harmonic oscillator.

The particle’s kinetic energy is $\frac{p^2}{2m}$ where $p$ is the classical linear momentum, and, unlike the previous models, we now have a potential energy $V(x) = \frac{Kx^2}{2}$ . (Recall that potential energy is the negative of work, $W$ . For the classical harmonic oscillator, we have $W = \int F dx = -\int dx \ Kx = -\frac{1}{2}Kx^2$ , so the potential energy is the negative of this). The classical Hamiltonian is then

$H = \frac{p^2}{2m} + \frac{Kx^2}{2} \qquad \qquad \qquad \qquad \qquad (50)$

Using (49), we can write the corresponding quantum mechanical Hamiltonian operator as

$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{m \omega^2}{2} \hat{x}^2 \qquad \qquad \qquad \qquad \qquad (51)$

As before, the position spectrum is generated by $\hat{x}$ and the momentum spectrum is generated by $\hat{p}$ . The energy spectrum is generated by $\hat{H}$ via the time-independent Schrödinger equation, which is the following eigenvalue problem for $E$ :

$\hat{H} | E \rangle = E | E \rangle \qquad \qquad \qquad \qquad \qquad (52)$

In Dirac notation, this general eigenvalue problem statement does not yet specify a `language’ in which to express the solutions. We can cast the eigenvalue problem into $x$ -space by introducing the bra $\langle x |$ and expressing the operator $\hat{H}$ in $x$ -language. We get the time-independent Schrödinger equation as the second-order differential equation

$-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \langle x | E \rangle + \frac{m \omega^2}{2} x^2 \langle x | E \rangle = E \langle x | E \rangle \qquad \qquad \qquad \qquad \qquad (53)$

This is a well-known differential equation in disguise. The general form is

$-\frac{d^2}{d \xi^2} (y_n) + \xi^2 y_n = 2 \big(n + \frac{1}{2}\big) y_n \qquad \qquad \qquad \qquad \qquad (54)$

It is easily seen that (53) becomes (54) if we take

$E = \big(n + \frac{1}{2} \big) \hbar \omega \qquad \qquad \qquad \qquad \qquad (55)$

and

$x = \xi \sqrt{\frac{\hbar}{m \omega}} \qquad \qquad \qquad \qquad \qquad (56)$

The solutions of (54) are stated in terms of the Hermite polynomials, $H_0(\xi) = 1$ , $H_1(\xi) = 2 \xi$ , $H_3(\xi) = 8 \xi^3 - 12 \xi$ , $\ldots$ . The solutions in terms of $x$ are

$\langle x | E \rangle = i^n \frac{\exp\big(-\frac{m \omega x^2}{2 \hbar}\big) H_n \big(x \sqrt{\frac{m \omega}{\hbar}}\big)}{\sqrt{2^n n! \sqrt{\frac{\pi \hbar}{m \omega}}}} \qquad \qquad \qquad \qquad \qquad (57)$

where only solutions with nonnegative integer values of $n$ make $\langle x | E \rangle$ square-integrable, so only these solutions are allowable. We see that $E$ is indeed quantized, i.e., indexed by nonnegative integers $n$ , so if we wanted to we could write $\langle x | E \rangle$ as $\langle x | n \rangle$ to emphasise the relevant energy index. The $n + \frac{1}{2}$ sequence of energies means that $E$ can never be zero for the quantum harmonic oscillator, and it can only go up in discrete steps of $\hbar \omega$ from its minimum value of $\frac{1}{2} \hbar \omega$ . Also note that, as stated earlier, the ladder of $E$ values is not degenerate for the quantum harmonic oscillator (unlike the particle in a ring model) because only nonnegative integers $n$ yield valid solutions for this particular eigenvalue problem.

The quantum harmonic oscillator is also used to introduce additional quantum concepts such as quantum tunnelling (the probability amplitudes $\langle x | E \rangle$ are nonzero outside the quadratic potential well for this model, so there is a nonzero probability that the quantum particle can be found outside the potential well. This would be impossible for the classical harmonic oscillator). Another thing to observe here is that it is also possible to cast the eigenvalue problem in $p$ -language instead of $x$ -language (resulting in expressions for the probability amplitudes $\langle p | E \rangle$ which are isomorphic to those for $\langle x | E \rangle$ ), and in energy-level language, which introduces the ideas of creation and annihilation operators (also known as raising and lowering operators, respectively, or as ladder operators), as well as commutators. These are nonphysical operators, i.e., they do not correspond to observables, but rather are regarded as useful computational aids.

The annihilation operator $\hat{a}$ for the quantum harmonic oscillator can be defined in terms of $\hat{p}$ and $\hat{x}$ as

$\hat{a} \sqrt{\hbar \omega} = \frac{\hat{p}}{\sqrt{2m}} - i \sqrt{\frac{m \omega^2}{2}}\cdot \hat{x} \qquad \qquad \qquad \qquad \qquad (58)$

and the creation operator is its hermitian conjugate

$\hat{a}^{\dag} \sqrt{\hbar \omega} = \frac{\hat{p}}{\sqrt{2m}} + i \sqrt{\frac{m \omega^2}{2}}\cdot \hat{x} \qquad \qquad \qquad \qquad \qquad (59)$

These can be derived from the Hamiltonian operator by pseudo-factorising it. See my previous note about this. From these two equations, we can show that their commutator is unity, i.e.,

$[\hat{a}, \hat{a}^{\dag}] = \hat{a} \hat{a}^\dag - \hat{a}^{\dag} \hat{a} = 1 \qquad \qquad \qquad \qquad \qquad (60)$

and the Hamiltonian operator itself can now be re-expressed in terms of these ladder operators as

$\hat{H} = \hbar \omega \big(\hat{a}^{\dag} \hat{a} + \frac{1}{2}\big) \qquad \qquad \qquad \qquad \qquad (61)$

We also find

$[\hat{a}, \hat{H}] = \hbar \omega \hat{a} \qquad \qquad \qquad \qquad \qquad (62)$

$[\hat{a}^{\dag}, \hat{H}] = -\hbar \omega \hat{a}^{\dag} \qquad \qquad \qquad \qquad \qquad (63)$

These results allow us to avoid all mention of $\hat{p}$ and $\hat{x}$ when investigating the energy measurement space generated by the operator $\hat{H}$ in the eigenvalue problem given as equation (52) above. We have

$[\hat{a}, \hat{H}] | E \rangle = \hbar \omega \hat{a} | E \rangle$

and thus

$\hat{a} \hat{H} | E \rangle - \hat{H} \hat{a} | E \rangle = \hbar \omega \hat{a} | E \rangle$

$\hat{H} \hat{a} | E \rangle = (E - \hbar \omega) \hat{a} | E \rangle \qquad \qquad \qquad \qquad \qquad (64)$

Similarly, from

$[\hat{a}^{\dag}, \hat{H}] | E \rangle = -\hbar \omega \hat{a}^{\dag} | E \rangle$

we have

$\hat{a}^{\dag} \hat{H} | E \rangle - \hat{H} \hat{a}^{\dag} | E \rangle = -\hbar \omega \hat{a}^{\dag} | E \rangle$

and so

$\hat{H} \hat{a}^{\dag} | E \rangle = (E + \hbar \omega) \hat{a}^{\dag} | E \rangle \qquad \qquad \qquad \qquad \qquad (65)$

Therefore, $\hat{a}$ applied to $| E \rangle$ produces a new eigenket $\hat{a} | E \rangle$ which is lowered in energy by $\hbar \omega$ . Similarly, $\hat{a}^{\dag}$ produces an eigenket raised in energy by $\hbar \omega$ . As mentioned earlier, we can use the integer index $n$ to label the energy states instead of $E$ , i.e., we can write $| n \rangle$ instead of $| E \rangle$ . To make the ladder effect of the creation and annihilation operators even more obvious, we can then write

$\hat{a} | n \rangle = c_n | n-1 \rangle \qquad \qquad \qquad \qquad \qquad (66)$

$\hat{a}^{\dag} | n \rangle = c_{n+1} | n+1 \rangle \qquad \qquad \qquad \qquad \qquad (67)$

and try to work out the eigenvalues $c_n$ and $c_{n+1}$ . The value of $c_n$ can be deduced by noting that applying $\hat{H}$ in the form of (61) above to $| n \rangle$ and using (55) we get

$\hbar \omega \big(\hat{a}^{\dag} \hat{a} + \frac{1}{2}\big) | n \rangle = \big(n + \frac{1}{2} \big) \hbar \omega | n \rangle$

which reduces to

$\hat{a}^{\dag} \hat{a} | n \rangle = n | n \rangle$

But $\hat{a}^{\dag} \hat{a} | n \rangle = | c_n |^2 | n \rangle$ , so we must have $c_n = \sqrt{n}$ . Therefore (66) can be written explicitly as

$\hat{a} | n \rangle = \sqrt{n} | n -1 \rangle$

We now use this result to deduce the value of $c_{n+1}$ by writing it as

$\hat{a} | N \rangle = \sqrt{N} | N - 1 \rangle$

Adding a bra $\langle n |$ gives us a self-space bracket on the right-hand side:

$\langle n | \hat{a} | N \rangle = \sqrt{N} \langle n | N - 1 \rangle \qquad \qquad \qquad \qquad \qquad (68)$

But $\langle n | N - 1 \rangle = \delta_n^{N-1}$ , so for (68) to be nonzero we must have $n = N - 1$ , so $N = n+1$ , and notice that this must also be true for $\langle n+1 | N \rangle$ to be nonzero, so $\langle n | N-1 \rangle$ and $\langle n+1 | N \rangle$ must both represent the same function. With these insights, we can re-express (68) as

$\langle n | \hat{a} | N \rangle = \sqrt{n+1} \langle n+1 | N \rangle \qquad \qquad \qquad \qquad \qquad (69)$

Taking the hermitian conjugate of both sides of (69) gives

$\langle N | \hat{a}^{\dag} | n \rangle = \sqrt{n+1} \langle N | n+1 \rangle \qquad \qquad \qquad \qquad \qquad (70)$

from which we deduce that the explicit form of (67) must be

$\hat{a}^{\dag} | n \rangle = \sqrt{n+1} | n + 1 \rangle$

i.e., $c_{n+1} = \sqrt{n+1}$ . Thus, we have

$\hat{a} | n \rangle = \sqrt{n} | n-1 \rangle$

$\hat{a}^{\dag} | n \rangle = \sqrt{n+1} | n+1 \rangle$

Quantum particle on a spherical shell

We now extend the model further by introducing two-dimensional quantum states for a quantum particle on a spherical shell. The general approach to modelling is similar to the approach for the harmonic oscillator in a number of ways. In particular, we begin by setting up a classical model of a particle on a sphere, and then quantize it by considering the relevant quantum mechanical operators in a quantum version of the model.

In a classical model of a particle on a spherical shell, the particle is constrained to positions at a fixed distance $R$ from the origin, but can otherwise move anywhere on a spherical surface. In the previous quantum particle in a ring model, $R = \frac{L}{2 \pi}$ was assumed to be sufficiently large to enable angular momentum considerations to be ignored, and the focus was then on linear momentum $p$ . In the present particle on a spherical shell model, we no longer assume $R$ is large and in fact the particle now only has angular momentum. As discussed in a previous post involving angular momentum calculations, the classical particle has a vector angular momentum about the origin given by

$\mathbf{L} = \mathbf{r} \times \mathbf{p} \qquad \qquad \qquad \qquad \qquad (71)$

which has components

$L_x = yp_z - zp_y \qquad \qquad \qquad \qquad \qquad (72)$

$L_y = zp_x - xp_z \qquad \qquad \qquad \qquad \qquad (73)$

$L_z = xp_y - yp_x \qquad \qquad \qquad \qquad \qquad (74)$

There is no potential energy in this model, only kinetic energy, but the constraint that the particle must remain on the spherical shell means that $\mathbf{r}$ and $\mathbf{p}$ are always orthogonal, so

$\mathbf{r} \cdot \mathbf{p} = 0 \qquad \qquad \qquad \qquad \qquad (75)$

This linear constraint means that only two of the three linear momentum components are linearly independent, so the linear momentum space for this model is two-dimensional. In view of (71) and (75), the magnitudes of the linear and angular momentum are related by

$|\mathbf{L}| = |\mathbf{r}| \cdot |\mathbf{p}| \sin\big(\frac{\pi}{2}\big) = Rp$

$p = \frac{L}{R} \qquad \qquad \qquad \qquad \qquad (76)$

Therefore, the Hamiltonian for the classical particle in a shell model is

$H = \frac{p^2}{2m} = \frac{L^2}{2mR^2} \qquad \qquad \qquad \qquad \qquad (77)$

Using spherical coordinates, the classical particle can wander anywhere on the sphere, with any angular momentum in any direction, its position characterised by the polar angle $\theta$ and the azimuthal angle $\varphi$ . In the case of a quantum particle on a spherical shell, a position measurement defines a two-dimensional basis space and a position state would be denoted by the ket $| \theta, \varphi \rangle$ . There would be two quantum operators $\hat{\theta}$ and $\hat{\varphi}$ whose eigenvalues are the spectra of possible $\theta$ and $\varphi$ values respectively, and a single eigenstate $| \theta, \varphi \rangle$ is now regarded as a simultaneous solution of the two eigenvalue problems for these operators:

$\hat{\theta} | \theta, \varphi \rangle = \theta | \theta, \varphi \rangle \qquad \qquad \qquad \qquad \qquad (78)$

$\hat{\varphi} | \theta, \varphi \rangle = \varphi | \theta, \varphi \rangle \qquad \qquad \qquad \qquad \qquad (79)$

This is possible if and only if $\hat{\theta}$ and $\hat{\varphi}$ commute, i.e., $[\hat{\theta}, \hat{\varphi}] = 0$ . Operators which commute are called compatible operators and they must have simultaneous eigenstates. Conversely, two operators which have simultaneous eigenstates must commute. (An indication of why this is true is as follows. Suppose we are given the eigenvalue equation (78), so that $| \theta, \varphi \rangle$ is an eigenstate of $\hat{\theta}$ with eigenvalue $\theta$ . Then, given $[\hat{\theta}, \hat{\varphi}] = 0$ , we must have

$\hat{\theta} (\hat{\varphi} | \theta, \varphi \rangle) = \hat{\varphi} (\hat{\theta} | \theta, \varphi \rangle = \hat{\varphi} (\theta | \theta, \varphi \rangle) = \theta (\hat{\varphi} | \theta, \varphi \rangle)$

Therefore, both $\hat{\varphi} | \theta, \varphi \rangle$ and $| \theta, \varphi \rangle$ are eigenstates of $\hat{\theta}$ with the same eigenvalue $\theta$ , so they must represent the same eigenstate up to some proportionality constant. The fact that $\hat{\varphi} | \theta, \varphi \rangle$ and $| \theta, \varphi \rangle$ represent the same eigenstate up to a proportionality constant is exactly what (79) says). The fact that this condition holds means that the measurement of an eigenvalue of $\hat{\theta}$ does not destroy knowledge of an eigenvalue of $\hat{\varphi}$ , and vice versa. This is not possible with all pairs of measurements. For example, the operators $\hat{x}$ and $\hat{p}$ are incompatible and thus do not have simultaneous eigenstates, so no two-dimensional state such as $|x, p \rangle$ exists. Knowledge of $p$ would be destroyed by a measurement of $x$ , and vice versa. This is indicated by the fact that $\hat{x}$ and $\hat{p}$ do not commute, since $[\hat{x}, \hat{p}] = i \hbar \neq 0$ . Simultaneous eigenstates can only exist for operators which commute.

Note that this idea of compatible vs incompatible operators is intimately related to Heisenberg’s uncertainty principle. Given two operators $\hat{A}$ and $\hat{B}$ , with standard deviations $\Delta a$ and $\Delta b$ respectively, the generalised form of Heisenberg’s uncertainty principle says

$\Delta a \Delta b \ge \frac{1}{2} |\langle [\hat{A}, \hat{B}] \rangle|$

where $|\langle [\hat{A}, \hat{B}] \rangle|$ denotes the absolute value of the expectation of the commutator of $\hat{A}$ and $\hat{B}$ . (The concept of the expectation of an operator is discussed in a later section below). Thus, there is no uncertainty principle for compatible operators, and measurement of one will not affect measurement of the other. However, there is always an uncertainty principle for incompatible operators, i.e., those with non-zero commutators, and it is impossible to measure one without destroying a previous measurement of the other.

Recall that in the case of the harmonic oscillator we could solve the eigenvalue problem in $x$ -language, $p$ -language, or energy-level-language, and the energy-level-language approach gave rise to creation and annihilation operator approaches. Similar considerations apply to the quantum particle on a spherical shell model. We can solve the eigenvalue problem in the language of angular momentum components rather than in the language of spatial or linear momentum states, and this gives rise to ladder operators analogous to the creation and annihilation operators for the quantum harmonic oscillator. Since angular momentum components can be measured, there must be angular momentum state labels and angular momentum operators. The angular momentum operators are simply the classical angular momenta turned into operators by `putting hats on them’. Thus, we get the operators

$\hat{L}_x = \hat{y}\hat{p}_z - \hat{z}\hat{p}_y \qquad \qquad \qquad \qquad \qquad (80)$

$\hat{L}_y = \hat{z}\hat{p}_x - \hat{x}\hat{p}_z \qquad \qquad \qquad \qquad \qquad (81)$

$\hat{L}_z = \hat{x}\hat{p}_y - \hat{y}\hat{p}_x \qquad \qquad \qquad \qquad \qquad (82)$

$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 \qquad \qquad \qquad \qquad \qquad (83)$

The eigenvalues of $\hat{L}^2$ make up the measurement spectrum of the square of the total angular momentum.

We know that spatial states $| \theta, \varphi \rangle$ and linear momentum states are two-dimensional, so we must also find that the measurement space created by the angular momentum operators is two-dimensional since dimensionality is conserved, no matter what language is used to describe the system. Indeed, we find that no two of the three angular momentum operators commute with each other:

$[\hat{L}_x, \hat{L}_y] = i\hbar \hat{L}_z \qquad \qquad \qquad \qquad \qquad (84)$

$[\hat{L}_y, \hat{L}_z] = i\hbar \hat{L}_x \qquad \qquad \qquad \qquad \qquad (85)$

$[\hat{L}_z, \hat{L}_x] = i\hbar \hat{L}_y \qquad \qquad \qquad \qquad \qquad (86)$

Therefore, it is impossible to define eigenstates based on these components. However, all three components do commute with $\hat{L}^2$ :

$[\hat{L}_x, \hat{L}^2] = [\hat{L}_y, \hat{L}^2] = [\hat{L}_z, \hat{L}^2] = 0 \qquad \qquad \qquad \qquad \qquad (87)$

Therefore, it is possible to characterise an eigenstate using $\hat{L}^2$ and any one of the components. In what follows, we will use a basis corresponding to the simultaneous eigenstates of $\hat{L}^2$ and $\hat{L}_z$ . Suppose the eigenvalues of $\hat{L}$ are $\lambda \hbar^2$ and the eigenvalues of $\hat{L}_z$ are $\mu \hbar$ ( $\hbar$ has dimensions of angular momentum, so its presence here enables $\lambda$ and $\mu$ to be treated as dimensionless). We now seek to find $\lambda$ and $\mu$ as simultaneous solutions of the two eigenvalue problems

$\hat{L}^2 |\lambda, \mu \rangle = \lambda \hbar^2 |\lambda, \mu \rangle \qquad \qquad \qquad \qquad \qquad (88)$

$\hat{L}_z |\lambda, \mu \rangle = \mu \hbar |\lambda, \mu \rangle \qquad \qquad \qquad \qquad \qquad (89)$

To do this, we define two new operators, the lowering and raising operators $\hat{L}_{-}$ and $\hat{L}_{+}$ , which are the analogues of the annihilation and creation operators for the harmonic oscillator:

$\hat{L}_{-} = \hat{L}_x - i \hat{L}_y \qquad \qquad \qquad \qquad \qquad (90)$

$\hat{L}_{+} = (\hat{L}_{-})^{\dag} = \hat{L}_x + i \hat{L}_y \qquad \qquad \qquad \qquad \qquad (91)$

Note that since $\hat{L}^2$ commutes with $\hat{L}_x$ and $\hat{L}_y$ , it must also commute with $\hat{L}_{-}$ and $\hat{L}_{+}$ . We find that

$[\hat{L}_{-}, \hat{L}_z] = \hbar \hat{L}_{-} \qquad \qquad \qquad \qquad \qquad (92)$

$[\hat{L}_z, \hat{L}_{+}] = \hbar \hat{L}_{+} \qquad \qquad \qquad \qquad \qquad (93)$

and also

$[\hat{L}^2, \hat{L}_{-}] = [\hat{L}^2, \hat{L}_{+}] = 0 \qquad \qquad \qquad \qquad \qquad (94)$

Following a similar line of reasoning as with the annihilation and creation operators of the quantum harmonic oscillator, we find that

$\hat{L}_{-} |\lambda, \mu \rangle = c_{\mu} \hbar |\lambda, \mu - 1 \rangle \qquad \qquad \qquad \qquad \qquad (95)$

$\hat{L}_{+} |\lambda, \mu \rangle = c_{\mu+1} \hbar |\lambda, \mu + 1 \rangle \qquad \qquad \qquad \qquad \qquad (96)$

To find $c_{\mu}$ and $c_{\mu + 1}$ , we begin by noting that $(\mu \hbar)^2 \leq \lambda \hbar^2$ since $\hat{L}_z$ is just one component of $\hat{L}^2$ . Therefore, $\mu$ is bounded, so there must be a $\mu_{max}$ and a $\mu_{min}$ . Let $l = \mu_{max}$ . Next, we observe that the existence of the ladder operators in (95) and (96) implies that the eigenvalues $\mu$ are discrete, consisting of a ladder of unit steps. Thus, we have

$\mu_{max} = l = \mu_{min} + N \qquad \qquad \qquad \qquad \qquad (97)$

for some $N \in \mathbb{N}$ , and therefore

$\mu_{min} = l - N$

We next observe that

$\hat{L}_{-} |\lambda, \mu_{min} \rangle = 0 \qquad \qquad \qquad \qquad \qquad (98)$

$\hat{L}_{+} |\lambda, \mu_{max} \rangle = 0 \qquad \qquad \qquad \qquad \qquad (99)$

because the minimum state cannot be lowered further and the maximum state cannot be raised further. We also observe that $\hat{L}^2$ can be expressed in terms of $\hat{L}_{-}$ and $\hat{L}_{+}$ in two different ways:

$\hat{L}^2 = \hat{L}_{+} \hat{L}_{-} - \hbar \hat{L}_z + \hat{L}_z^2 \qquad \qquad \qquad \qquad \qquad (100)$

$\hat{L}^2 = \hat{L}_{-} \hat{L}_{+} + \hbar \hat{L}_z + \hat{L}_z^2 \qquad \qquad \qquad \qquad \qquad (101)$

We now apply (100) to the state $| \lambda, \mu_{min} \rangle = | \lambda, l-N \rangle$ , using (98), (88) and (89), and we get

$\lambda = (l - N)^2 - (l - N) \qquad \qquad \qquad \qquad \qquad (102)$

Similarly, applying (101) to the state $| \lambda, \mu_{max} \rangle = | \lambda, l\rangle$ using (99) we get

$\lambda = l^2 + l \qquad \qquad \qquad \qquad \qquad (103)$

From (102) and (103) we get

$(l - N)^2 - (l - N) - l^2 - l = (N - 2l)(N+1) = 0$

which implies

$l = \frac{N}{2} \qquad \qquad \qquad \qquad \qquad (104)$

We conclude from (103) and (104) that $\lambda = l(l+1)$ in (88), where the allowed spectrum of $l$ values are the nonnegative half-integers:

$l = 0, \frac{1}{2}, 1, \frac{3}{2}, \ldots \qquad \qquad \qquad \qquad \qquad (105)$

And $\mu$ runs from $\mu_{max} = l$ to $\mu_{min} = l - N = l - 2l = -l$ in integer steps. Using these results in (88) and (89), and using $l$ to label an eigenstate of $\hat{L}^2$ instead of $\lambda$ , we conclude

$\hat{L}^2 |l, \mu \rangle = l(l+1) \hbar^2 |l, \mu \rangle \qquad \qquad \qquad \qquad \qquad (106)$

$\hat{L}_z |l, \mu \rangle = \mu \hbar |l, \mu \rangle \qquad \qquad \qquad \qquad \qquad (107)$

where $2l$ is an integer and $\mu$ runs from $-l$ to $l$ in integer steps:

$\mu = -l, -l+1, -l+2, \ldots, l-1, l \qquad \qquad \qquad \qquad \qquad (108)$

Finally, to find $c_{\mu}$ in (95), we apply (100) to the state $|l, \mu \rangle$ to get

$l(l+1) \hbar^2 = |c_{\mu}|^2 \hbar^2 - \hbar^2 \mu + \hbar^2 \mu^2$

which implies

$|c_{\mu}|^2 = l(l+1) - \mu(\mu-1)$

Taking $c_{\mu}$ to be real, we get

$c_{\mu} = \sqrt{l(l+1) - \mu(\mu-1)}$

and

$c_{\mu+1} = \sqrt{l(l+1) - \mu(\mu+1)}$

Therefore, the effects of the lowering and raising operators are

$\hat{L}_{-} |l, \mu \rangle = \hbar \sqrt{l(l+1) - \mu(\mu-1)} |l, \mu - 1 \rangle \qquad \qquad \qquad \qquad \qquad (109)$

$\hat{L}_{+} |l, \mu \rangle = \hbar \sqrt{l(l+1) - \mu(\mu+1)} |l, \mu + 1 \rangle \qquad \qquad \qquad \qquad \qquad (110)$

So far, we have worked out the quantum particle in a spherical shell model using the language of angular momentum, $|l, \mu \rangle$ . We can also use the language of position, $|\theta, \varphi \rangle$ , and in fact we can work out the transformation matrix elements $\langle \theta, \varphi | l, \mu \rangle$ which also have the physical interpretation of being probability amplitudes for finding the quantum particle at position $\theta, \varphi$ when it is known to be in angular momentum state $l, \mu$ . This exercise will reveal the crucial fact that $l$ in (105) can only take integer values in the case of orbital angular momentum.

We first of all cast (106) and (107) in the language of $\theta, \varphi$ by introducing a bra $\langle \theta, \varphi|$ :

$\langle \theta, \varphi| \hat{L}^2 |l, \mu \rangle = l(l+1) \hbar^2 \langle \theta, \varphi |l, \mu \rangle \qquad \qquad \qquad \qquad \qquad (111)$

$\langle \theta, \varphi| \hat{L}_z |l, \mu \rangle = \mu \hbar \langle \theta, \varphi |l, \mu \rangle \qquad \qquad \qquad \qquad \qquad (112)$

To implement these equations, we need $\langle \theta, \varphi| \hat{L}^2$ and $\langle \theta, \varphi| \hat{L}_z$ , i.e., we need the operators $\hat{L}^2$ and $\hat{L}_z$ expressed in the language of $\theta, \varphi$ . I obtained these expressions as well as expressions for $\langle \theta, \varphi| \hat{L}_x$ and $\langle \theta, \varphi| \hat{L}_y$ in a previous post. We can also use these to get expressions for $\langle \theta, \varphi| \hat{L}_{-}$ and $\langle \theta, \varphi| \hat{L}_{+}$ . Using

$\langle \theta, \varphi| \hat{L}_z = -i \hbar \frac{\partial }{\partial \varphi}$

in (112) we get a differential equation

$-i \frac{\partial}{\partial \varphi} \langle \theta, \varphi | l, \mu \rangle = \mu \langle \theta, \varphi | l, \mu \rangle \qquad \qquad \qquad \qquad \qquad (113)$

And using

$\langle \theta, \varphi | \hat{L}^2 = - \hbar^2 \big( \frac{1}{\sin \theta}\frac{\partial}{\partial \theta} \big( \sin \theta \frac{\partial}{\partial \theta}\big) + \frac{1}{\sin^2 \theta}\frac{\partial^2}{\partial \varphi^2} \big)$

in (111) we get a second differential equation

$-\frac{1}{\sin \theta} \frac{\partial}{\partial \theta} \bigg(\sin \theta \frac{\partial}{\partial \theta} \langle \theta, \varphi | l, \mu \rangle \bigg) - \frac{1}{\sin^2 \theta} \frac{\partial^2}{\partial \varphi^2} \langle \theta, \varphi | l, \mu\rangle = l(l+1) \langle \theta, \varphi | l, \mu\rangle \qquad \qquad \qquad \qquad \qquad (114)$

The transformation matrix elements $\langle \theta, \varphi | l, \mu \rangle$ are the simultaneous solutions of these two differential equations. From (113) we deduce that

$\langle \theta, \varphi | l, \mu \rangle = e^{i \mu \varphi} \times f(\theta) \qquad \qquad \qquad \qquad \qquad (115)$

where $f(\theta)$ is some function of $\theta$ . This equation is crucial because it reveals that $\mu$ can only take integer values. This is a necessary condition for $\langle \theta, \varphi | l, \mu \rangle$ to be periodic with respect to $\varphi$ , which it must be. But by (108) this in turn implies that $l$ can only take integer values in the case of orbital angular momentum. The half-integer values in (105) must relate to something other than orbital angular momentum and will be found below to correspond to quantum mechanical spin angular momentum.

Using (115) in (114) we get a well-known differential equation:

$-\frac{1}{\sin \theta} \frac{\partial}{\partial \theta} \bigg(\sin \theta \frac{\partial}{\partial \theta} \langle \theta, \varphi | l, \mu \rangle \bigg) + \frac{\mu^2}{\sin^2 \theta} \langle \theta, \varphi | l, \mu\rangle = l(l+1) \langle \theta, \varphi | l, \mu\rangle \qquad \qquad \qquad \qquad \qquad (116)$

This is satisfied by the associated Legendre polynomials, $P_l^{\mu}(\cos \theta)$ . These are the functions $f(\theta)$ in (115). Using $P_l^{\mu}(\cos \theta)$ to construct properly normalized simultaneous solutions to (113) and (114) we obtain finally

$\langle \theta, \varphi | l, \mu \rangle = Y_{l, \ \mu}(\theta, \varphi) \qquad \qquad \qquad \qquad \qquad (117)$

where the functions $Y_{l, \ \mu}(\theta, \varphi)$ are well-known tabulated functions called the spherical harmonic functions. Interestingly, these functions characterise the normal mode standing waves of the vibrations of a spherical shell, i.e., the fundamental and all the harmonics of a spherical drumhead. They are the two-dimensional analogues on a sphere of one-dimensional sines and cosines (the harmonic functions for a line). The correct normalization constants for the probability amplitudes in (117) would be found using the certainty condition, which in spherical coordinates is

$\int_0^{2 \pi} \int_0^{\pi} \sin \theta d\theta d\varphi |\langle \theta, \varphi | l, \mu \rangle|^2 = 1$

Spin, Pauli spin matrices and matrix mechanics

The angular momentum of a circulating charged particle gives rise to a magnetic dipole moment. This concept is usually first encountered in elementary electromagnetism when considering the torque on a current loop immersed a magnetic field $\textbf{B}$ , which is given by

$\pmb{\tau} = \pmb{\mu} \times \textbf{B} \qquad \qquad \qquad \qquad \qquad (118)$

where

$\pmb{\mu} = (N i A)\cdot \textbf{n} \qquad \qquad \qquad \qquad \qquad (119)$

is the magnetic dipole moment. Here, $N$ is the number of turns in the current loop, $i = \frac{dq}{dt}$ is the conventional current flowing through the loop (i.e., the rate at which positive charge is flowing through a cross-section of the conducting loop), $A$ is the area enclosed by the current loop and $\textbf{n}$ is a unit vector which is normal to the plane of the area $A$ in the sense given by the right-hand rule. (The direction of the normal unit vector would be the reverse of that indicated by the right-hand rule if a negative charge flow was being considered).

In the case of a classical charged particle of mass $m$ and positive charge $e$ circulating around a spherical shell of radius $R$ with speed $v$ , we have $N=1$ , $i = e/dt$ , $dt = (2 \pi R)/v$ , and $A = \pi R^2$ , so the magnitude of the magnetic dipole moment is obtained as

$\mu = \frac{1}{2} e R \cdot v = \frac{e}{2m} R \cdot p = \frac{e}{2m} L \qquad \qquad \qquad \qquad \qquad (120)$

where $p$ and $L$ are the magnitudes of the linear and angular momenta of the orbiting charged particle respectively. In vector form, we have

$\pmb{\mu} = \frac{e}{2m} \textbf{L} \qquad \qquad \qquad \qquad \qquad (121)$

where the vectors point in a direction given by the right-hand rule. Thus, we see that the magnetic dipole moment is directly proportional to the angular momentum of the orbiting charged particle, with the constant of proportionality involving the charge-to-mass ratio of the particle.

Based on the link between magnetic dipole moment and angular momentum exhibited in (121), we would expect that the magnetic dipole moment of a quantum charged particle, e.g., an electron, say in the $z$ -direction of a coordinate system, would be related to the operators $\hat{L}^2$ and $\hat{L}_z$ in (106) and (107) above with their discrete spectra of eigenvalues. This is indeed the case. The iconic Stern-Gerlach experiment of 1922 was designed to measure the magnetic dipole moments of quantum particles and indeed found that these were quantized in accordance with the relationship suggested by (121) and the operators $\hat{L}^2$ and $\hat{L}_z$ discussed in the previous section. For example, the Stern-Gerlach apparatus produced discrete line spectra when measuring the $z-$ component of magnetic dipole moment, rather than continuous spectra which would have been expected classically, with the multiplicities of the discrete line spectra corresponding exactly to the multiplicities of the eigenvalues $\mu$ in (108) for given values of $l$ . In other words, for any given value of $l$ , the multiplicity of the discrete line spectra produced by the Stern-Gerlach apparatus was $2l+1$ in accordance with (108).

It was noted in the previous section that $l$ can only take integer values in the case of orbital angular momentum, so if magnetic dipole moments were due only to orbital angular momentum the multiplicities $2l + 1$ of the discrete line spectra produced by the Stern-Gerlach apparatus would always be odd numbers. However, this is not what was observed. Famously, a two-line spectrum was produced for the magnetic dipole moment of a beam of free electrons which had no orbital angular momentum at all. The two-line spectrum corresponded to eigenvalues $\mu$ in (108) of $\frac{1}{2}$ and $-\frac{1}{2}$ , and this is only possible if $l = \frac{1}{2}$ . Eventually, this result came to be understood as signifying that electrons have an intrinsic spin angular momentum which also produces its own magnetic dipole moment quite distinct from any magnetic moment due to orbital angular momentum. The difference between them is that while it is possible to have nonzero probability amplitudes $\langle \theta, \varphi | l, \mu \rangle$ for orbital angular momentum with integer values of $l$ , such probability amplitudes must always be zero for spin angular momentum with $l$ equal to a half-integer. Spin angular momentum simply has no representation in real physical space, so spin states cannot be cast into $\langle \theta, \varphi |$ language.

The convention is to use the letter $s$ rather than $l$ to denote spin parameter values, and we always have $s = \frac{1}{2}$ in the case of the electron (we say the electron spin quantum number is $\frac{1}{2}$ ). Apart from this, however, spin angular momentum operators obey exactly the same rules that were outlined for angular momentum in the previous section. We have a spin vector $\textbf{S}$ from whose components the operators $\hat{S}_x$ , $\hat{S}_y$ and $\hat{S}_z$ are built, exactly like $\textbf{L}$ and the operators $\hat{L}_x$ , $\hat{L}_y$ and $\hat{L}_z$ . Also, by (84)-(87) above we have

$[\hat{S}_x, \hat{S}_y] = i\hbar \hat{S}_z \qquad \qquad \qquad \qquad \qquad (122)$

$[\hat{S}_y, \hat{S}_z] = i\hbar \hat{S}_x \qquad \qquad \qquad \qquad \qquad (123)$

$[\hat{S}_z, \hat{S}_x] = i\hbar \hat{S}_y \qquad \qquad \qquad \qquad \qquad (124)$

and

$[\hat{S}_x, \hat{S}^2] = [\hat{S}_y, \hat{S}^2] = [\hat{S}_z, \hat{S}^2] = 0 \qquad \qquad \qquad \qquad \qquad (125)$

Focusing on the electron, and using (106)-(108) in the previous section, we have the results

$\hat{S}^2 |s=\frac{1}{2}, m \rangle = \frac{1}{2}\bigg(\frac{1}{2}+1 \bigg) \hbar^2 |s=\frac{1}{2}, m \rangle \qquad \qquad \qquad \qquad \qquad (126)$

$\hat{S}_z |s=\frac{1}{2}, m \rangle = m \hbar |s=\frac{1}{2}, m \rangle \qquad \qquad \qquad \qquad \qquad (127)$

where

$m = -\frac{1}{2}, \frac{1}{2} \qquad \qquad \qquad \qquad \qquad (128)$

We can also define spin raising and lowering operators $\hat{S}_{+}$ and $\hat{S}_{-}$ in terms of $\hat{S}_x$ and $\hat{S}_y$ using (90) and (91):

$\hat{S}_{+} = \hat{S}_x + i \hat{S}_y \qquad \qquad \qquad \qquad \qquad (129)$

$\hat{S}_{-} = \hat{S}_x - i \hat{S}_y \qquad \qquad \qquad \qquad \qquad (130)$

Applying these in the case of the electron using (109) and (110) in the previous section we get

$\hat{S}_{+} |s=\frac{1}{2}, m=-\frac{1}{2} \rangle$

$= \hbar \sqrt{\frac{1}{2}\bigg(\frac{1}{2}+1\bigg) - \bigg(-\frac{1}{2}\bigg)\bigg(-\frac{1}{2}+1\bigg)} |s=\frac{1}{2}, m= -\frac{1}{2} + 1\rangle$

$= \hbar |s=\frac{1}{2}, m= \frac{1}{2}\rangle \qquad \qquad \qquad \qquad \qquad (131)$

and similarly

$\hat{S}_{-} |s=\frac{1}{2}, m=\frac{1}{2} \rangle$

$= \hbar \sqrt{\frac{1}{2}\bigg(\frac{1}{2}+1\bigg) - \frac{1}{2}\bigg(\frac{1}{2}-1\bigg)} |s=\frac{1}{2}, m= \frac{1}{2} - 1\rangle$

$= \hbar |s=\frac{1}{2}, m= -\frac{1}{2}\rangle \qquad \qquad \qquad \qquad \qquad (132)$

Since the $+\frac{1}{2}$ value cannot be raised and the $-\frac{1}{2}$ value cannot be lowered, we also have

$\hat{S}_{+} |s=\frac{1}{2}, m=\frac{1}{2} \rangle = 0 \qquad \qquad \qquad \qquad \qquad (133)$

and

$\hat{S}_{-} |s=\frac{1}{2}, m=-\frac{1}{2} \rangle = 0 \qquad \qquad \qquad \qquad \qquad (134)$

The Pauli spin matrices and matrix mechanics are normally introduced at this stage by constructing transformation matrices between distinct spin states based on the results of a rotated Stern-Gerlach apparatus. For example, after a beam of electrons is split in two in the $z$ -direction by a Stern-Gerlach apparatus, one of the polarized beams can then be passed through a second Stern-Gerlach apparatus which is rotated by $90^{\circ}$ so that it is split in two in the $y$ -direction. The electrons in the incoming polarized beam will be in one of the eigenstates of the $\hat{S}_z$ operator, which will be denoted by $m$ or $M$ in what follows, whereas the electrons in an outgoing beam of the rotated apparatus will be in one of the eigenstates of the $\hat{S}_y$ operator, which will be denoted by $\mu$ in what follows. Thus, we have the eigenvalue equations

$\hat{S}_z |s=\frac{1}{2}, m \rangle = m\hbar |s=\frac{1}{2}, m \rangle \qquad \qquad \qquad \qquad \qquad (135)$

$\hat{S}_y |s=\frac{1}{2}, \mu \rangle = \mu \hbar |s=\frac{1}{2}, \mu \rangle \qquad \qquad \qquad \qquad \qquad (136)$

The operators $\hat{S}_z$ and $\hat{S}_y$ will both have the same measurement spectrum, namely $m = -\hbar/2, +\hbar/2$ for $\hat{S}_z$ and $\mu = -\hbar/2, +\hbar/2$ for $\hat{S}_y$ , but they will be referring to different and incompatible components of spin. We can now obtain transformation matrix elements $\langle m | \mu \rangle$ , i.e., probability amplitudes, for this experiment by casting the eigenvalue problem in (136) in the language of the incoming pre-prepared $z$ -state, and then solving it. Since all the bras and kets in this problem will have $s = \frac{1}{2}$ , we will drop this from the notation to reduce clutter. The eigenvalue problem to be solved can then be expressed as

$\langle m | \hat{S}_y | \mu \rangle = \mu \hbar \langle m | \mu \rangle \qquad \qquad \qquad \qquad \qquad (137)$

We can convert this problem into a matrix algebra problem by inserting an $M$ -space ket-bra sum at the second vertical on the left-hand side (this ket-bra sum will then be in the space of $\hat{S}_z$ eigenvalues):

$\sum_M \langle m | \hat{S}_y | M \rangle \langle M | \mu \rangle = \mu \hbar \langle m | \mu \rangle \qquad \qquad \qquad \qquad \qquad (138)$

The sum on the left can immediately be recognised as a prescription for matrix multiplication. We can deduce the matrix form of $\langle m | \hat{S}_y | M \rangle$ by using (129) and (130) to write

$\hat{S}_y = \frac{1}{2i} (\hat{S}_{+} - \hat{S}_{-}) \qquad \qquad \qquad \qquad \qquad (139)$

Now, using (131) and (133) and noting that $\langle m | M \rangle = \delta_m^M$ since it is a self-space bracket, we can write

$\langle m | \hat{S}_{+} | M \rangle = \begin{bmatrix} \langle m =+\frac{1}{2} | \hat{S}_{+} | M = +\frac{1}{2}\rangle & \langle m =+\frac{1}{2} | \hat{S}_{+} | M = -\frac{1}{2}\rangle \\ \\ \ \langle m =-\frac{1}{2} | \hat{S}_{+} | M = +\frac{1}{2}\rangle & \langle m =-\frac{1}{2} | \hat{S}_{+} | M = -\frac{1}{2}\rangle\end{bmatrix}$

$= \begin{bmatrix} 0 & \hbar \langle m =+\frac{1}{2} | M = +\frac{1}{2}\rangle \\ \\ \ 0 & \hbar \langle m =-\frac{1}{2} | M = +\frac{1}{2}\rangle \end{bmatrix} = \hbar \begin{bmatrix} 0 & 1 \\ \\ \ 0 & 0 \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (140)$

Similarly, using (132) and (134) we can write

$\langle m | \hat{S}_{-} | M \rangle = \begin{bmatrix} \langle m =+\frac{1}{2} | \hat{S}_{-} | M = +\frac{1}{2}\rangle & \langle m =+\frac{1}{2} | \hat{S}_{-} | M = -\frac{1}{2}\rangle \\ \\ \ \langle m =-\frac{1}{2} | \hat{S}_{-} | M = +\frac{1}{2}\rangle & \langle m =-\frac{1}{2} | \hat{S}_{-} | M = -\frac{1}{2}\rangle \end{bmatrix} = \hbar \begin{bmatrix} 0 & 0 \\ \\ \ 1 & 0 \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (141)$

Now using (139), (140) and (141) we can finally write

$\langle m | \hat{S}_y | M \rangle = \langle m | \frac{1}{2i} (\hat{S}_{+} - \hat{S}_{-}) | M \rangle = \frac{\hbar}{2} \begin{bmatrix} 0 & -i \\ \\ \ i & 0 \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (142)$

The matrix on the right-hand side here is one of the three famous Pauli spin matrices. The eigenvalue problem in (138) can then be written in matrix form as

$\frac{\hbar}{2} \begin{bmatrix} 0 & -i \\ \\ \ i & 0 \end{bmatrix} \begin{bmatrix} \langle M =+\frac{1}{2} | \mu = +\frac{1}{2}\rangle & \langle M =+\frac{1}{2} | \mu = -\frac{1}{2}\rangle \\ \\ \ \langle M =-\frac{1}{2} | \mu = +\frac{1}{2}\rangle & \langle M =-\frac{1}{2} | \mu = -\frac{1}{2}\rangle \end{bmatrix} = \begin{bmatrix} \langle m =+\frac{1}{2} | \mu = +\frac{1}{2}\rangle & \langle m =+\frac{1}{2} | \mu = -\frac{1}{2}\rangle \\ \\ \ \langle m =-\frac{1}{2} | \mu = +\frac{1}{2}\rangle & \langle m =-\frac{1}{2} | \mu = -\frac{1}{2}\rangle \end{bmatrix} \begin{bmatrix} \frac{\hbar}{2} & 0 \\ \\ \ 0 & -\frac{\hbar}{2} \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (143)$

where the matrices immediately to the left and right of the equals sign contain the same elements. We immediately recognise this as a simple matrix diagonalization problem. It is straightforward to show that the eigenvalues of the Pauli spin matrix in (142) are 1 and -1, so the right-most matrix in (143) has the eigenvalues of (142) in its diagonal. This means that the transformation matrix either side of the equals sign in (143) must be the matrix whose columns are the eigenvectors corresponding to the eigenvalues. Using elementary linear algebra, this matrix of eigenvectors, normalized so that the eigenvectors are orthonormal, is easily shown to be

$\begin{bmatrix} \langle m =+\frac{1}{2} | \mu = +\frac{1}{2}\rangle & \langle m =+\frac{1}{2} | \mu = -\frac{1}{2}\rangle \\ \\ \ \langle m =-\frac{1}{2} | \mu = +\frac{1}{2}\rangle & \langle m =-\frac{1}{2} | \mu = -\frac{1}{2}\rangle \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \\ \ \frac{i}{\sqrt{2}} & -\frac{i}{\sqrt{2}} \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (144)$

The absolute value squared of each element of this matrix gives the probability of an outcome of a corresponding rotated Stern-Gerlach experiment. The first column contains the probability amplitudes for the case when the pre-prepared $z$ -state is spin-up, the second column contains the probability amplitudes for the case when the pre-prepared $z$ -state is spin-down. Note that the eigenvectors need to be normalized to ensure that the relevant column probabilities sum to 1.

We can repeat this entire exercise for the case in which the Stern-Gerlach apparatus is rotated in the $x$ -direction instead of the $y$ -direction, so that the relevant operators are $\hat{S}_z$ and $\hat{S}_x$ . Using completely analogous arguments, we find that

$\langle m | \hat{S}_x | M \rangle = \langle m | \frac{1}{2} (\hat{S}_{+} + \hat{S}_{-}) | M \rangle = \frac{\hbar}{2} \begin{bmatrix} 0 & 1 \\ \\ \ 1 & 0 \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (145)$

where the matrix on the right-hand side is another one of the three Pauli spin matrices. By diagonalizing this matrix, we find that the transformation matrix containing the relevant probability amplitudes for this form of the rotated Stern-Gerlach experiment is

$\begin{bmatrix} \langle m =+\frac{1}{2} | \mu = +\frac{1}{2}\rangle & \langle m =+\frac{1}{2} | \mu = -\frac{1}{2}\rangle \\ \\ \ \langle m =-\frac{1}{2} | \mu = +\frac{1}{2}\rangle & \langle m =-\frac{1}{2} | \mu = -\frac{1}{2}\rangle \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \\ \ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (146)$

We can repeat the exercise again assuming that the Stern-Gerlach apparatus is not rotated, so that an electron beam in a pre-prepared $z$ -state is again passed through the apparatus oriented in the $z$ -direction. In this case, using (127), we find

$\langle m | \hat{S}_z | M \rangle = \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \\ \\ \ 0 & -1 \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (147)$

where the matrix on the right-hand side is the final one of the three Pauli spin matrices. This matrix is already diagonalized, i.e., the eigenvalues are the elements in the main diagonal. We find that the transformation matrix containing the relevant probability amplitudes for this form of the rotated Stern-Gerlach experiment is

$\begin{bmatrix} \langle m =+\frac{1}{2} | M = +\frac{1}{2}\rangle & \langle m =+\frac{1}{2} | M = -\frac{1}{2}\rangle \\ \\ \ \langle m =-\frac{1}{2} | M = +\frac{1}{2}\rangle & \langle m =-\frac{1}{2} | M = -\frac{1}{2}\rangle \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \\ \ 0 & 1 \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (148)$

i.e., the identity matrix, which is not surprising given that $\langle m | M \rangle$ is a self-space bracket equal to the Kronecker delta.

Finally, we can use (126) to deduce the matrix form of $\langle m | \hat{S}^2 | M \rangle$ . This is

$\langle m | \hat{S}^2 | M \rangle = \frac{3\hbar^2}{4}\begin{bmatrix} 1 & 0 \\ \\ \ 0 & 1 \end{bmatrix} \qquad \qquad \qquad \qquad \qquad (149)$

Note that each of the transformation matrices in (144), (146) and (148) is a unitary matrix, i.e., the hermitian conjugate of the matrix equals the inverse of the matrix. This is a manifestation of the fundamental axiom of quantum mechanics that $\langle m| \mu\rangle$ and $\langle \mu| m\rangle$ must be connected by complex conjugation, i.e., $\langle \mu| m\rangle = \langle m| \mu\rangle^{\ast}$ .

The time-dependent Schrödinger equation and the time evolution operator

We said earlier that an event in quantum mechanics is a measurement that takes place instantaneously after a system has been prepared in a particular state. For example, a measurement of position can be made instantaneously after the system has been prepared to have a particular linear momentum state, and the probability amplitude for the position measurement would then be given by a wave function $\langle x | p \rangle$ . Immediate measurement is needed because, in general, a quantum state is time-dependent, sometimes emphasised by including the argument $t$ in the ket, e.g., $| \psi(t) \rangle$ . If a measurement is not made instantaneously after a system is prepared in a particular state $|\psi(t) \rangle$ , it is a basic postulate of quantum mechanics that the system will evolve over time in accordance with the time-dependent Schrödinger equation

$i \hbar \frac{d}{dt} | \psi(t) \rangle = \hat{H}(t) | \psi(t) \rangle \qquad \qquad \qquad \qquad \qquad (150)$

where $\hat{H}(t)$ is the Hamiltonian operator associated with the total energy of the system. This linear first-order differential equation plays a role in quantum mechanics somewhat analogous to Newton’s second law in classical mechanics in the sense that, given an initial condition for the system, it deterministically evolves the state of the system over time from then on. It is important to note, however, that in the case of quantum mechanics this deterministic evolution only takes place in the absence of any measurement of the system. Unlike a classical system, the time evolution process ‘stops’ when a quantum mechanical system is measured. The quantum state is immediately reset to the measured value and the time evolution process then begins anew from this new state. The probabilistic aspect of quantum mechanics only arises when a measurement is made in the sense that the result of the measurement (i.e., what particular eigenvalue of the relevant Hermitian operator will end up being observed) is a random variable. The time evolution part of the process in the intervening period between measurements is what is governed by the time-dependent Schrödinger equation and is completely deterministic.

In order to perform calculations using the time-dependent Schrödinger equation, it is necessary to specify the relevant Hamiltonian for the system and to express the state $\psi(t)$ as a superposition of the eigenstates of the Hamiltonian. This can be achieved using a ket-bra sum. The emphasis is usually on conservative systems, i.e., systems for which the total energy is conserved over time so the Hamiltonian is time-independent, $\hat{H}(t) = \hat{H}$ . Non-conservative systems are less tractable and require more elaborate techniques. We will only consider the conservative case here and begin by writing the typical eigenvalue equation for the Hamiltonian, i.e., the time-independent Schrödinger equation, as

$\hat{H} | E \rangle = E | E \rangle \qquad \qquad \qquad \qquad \qquad (151)$

Given the specification of the Hamiltonian for the system, it would be necessary to solve this time-independent Schrödinger equation to find the eigenvalues and eigenstates. Using a ket-bra sum, we can express the state $|\psi(t)\rangle$ as a superposition of the energy eigenstates:

$|\psi(t) \rangle = \sum_E | E \rangle \langle E | \psi(t) \rangle \qquad \qquad \qquad \qquad \qquad (152)$

Note that the eigenstates $| E \rangle$ are time-independent so the time-dependence only enters via the overlap coefficients $\langle E | \psi(t) \rangle$ in the superposition. The problem, then, is to find an explicit formula showing how these overlap coefficients change over time. To this end, we can write the full time-dependent Schrödinger equation in the energy basis given by the eigenstates of the Hamiltonian by acting on both sides of (150) with the bra $\langle E |$ :

$\langle E | i \hbar \frac{d}{dt} | \psi(t) \rangle = \langle E | \hat{H} | \psi(t) \rangle \qquad \qquad \qquad \qquad \qquad (153)$

Using the time-independence of energy eigenstates on the left-hand side and the bra form of the eigenvalue equation for the Hamiltonian on the right-hand side, this reduces to

$i \hbar \frac{d}{dt} \langle E | \psi(t) \rangle = E \langle E | \psi(t) \rangle \qquad \qquad \qquad \qquad \qquad (154)$

This is a simple first-order differential equation in the overlap coefficient $\langle E | \psi(t) \rangle$ which can be rearranged to read

$\frac{d \langle E | \psi(t) \rangle}{\langle E | \psi(t) \rangle} = -\frac{i}{\hbar} E \ dt$

Integrating both sides from an initial time $t_0$ to a later time $t_1$ , and defining $\Delta t = t_1 - t_0$ , we get

$\langle E | \psi(t_1) \rangle = \langle E | \psi(t_0) \rangle e^{- i E \Delta t /\hbar} \qquad \qquad \qquad \qquad \qquad (155)$

This tells us that, for a conservative system, there is a simple way to time-evolve the overlap coefficients in (152), and hence the state $|\psi(t) \rangle$ as a whole, from an initial time $t_0$ to a subsequent time $t_1$ . Writing the superposition in (152) at time $t_1$ and using the result in (155) we get

$|\psi(t_1) \rangle = \sum_E | E \rangle \langle E | \psi(t_1) \rangle = \sum_E | E \rangle \langle E | \psi(t_0) \rangle e^{- i E \Delta t /\hbar} \qquad \qquad \qquad \qquad \qquad (156)$

We can define a forward-displacement-in-time-by- $\Delta t$ operator (time evolution operator, for short) as a function of the Hamiltonian operator by writing (156) as

$|\psi(t_1) \rangle = e^{- i \hat{H} \Delta t/ \hbar} | \psi(t_0) \rangle \qquad \qquad \qquad \qquad \qquad (157)$

and interpreting the right-hand side of (157) as

$e^{- i \hat{H} \Delta t / \hbar} | \psi(t_0) \rangle = \sum_E | E \rangle \langle E | \psi(t_0) \rangle e^{- i E \Delta t /\hbar} \qquad \qquad \qquad \qquad \qquad (158)$

Thus, in the case of a conservative system, the rule by which an initial state $| \psi(t_0) \rangle$ evolves over time into a subsequent state $| \psi(t_1) \rangle$ is governed by the law of quantum mechanics expressed in (157). The initial state is simply acted upon by a time evolution operator $e^{- i \hat{H} \Delta t / \hbar}$ , which is an exponential function of another operator, the Hamiltonian operator. Such functions of operators are to be understood as needing to be implemented via their power series expansion (another similar one is the translation in space operator, which is an exponential function of the momentum operator). Thus, we are ultimately dealing with powers of operators in these cases and we can use well-known results for manipulating powers of operators when implementing functions of operators and proving further results relating to them. For the time evolution operator, we have

$e^{- i \hat{H} \Delta t / \hbar} = \sum_{n=0}^{\infty} \frac{(- i \hat{H} \Delta t / \hbar)^n}{n!}$

$= \hat{1} - \frac{1}{\hbar} i \hat{H} \Delta t + \frac{1}{2!} (- i \hat{H} \Delta t / \hbar)^2 + \frac{1}{3!} (- i \hat{H} \Delta t / \hbar)^3 + \cdots \qquad \qquad \qquad \qquad \qquad (159)$

where $\hat{1}$ denotes the identity. It is straightforward to show that the right-hand side of (158) results when we apply the power series form of the time evolution operator in (159) to the eigenstates of the Hamiltonian in the superposition for $| \psi(t_0) \rangle$ . However, caution is needed when manipulating functions of operators such as these, as some basic algebraic relationships involving scalars do not generally hold for operators. For example, by multiplying the power series for $e^{\hat{A}}$ and $e^{\hat{B}}$ and comparing the result with the power series for $e^{\hat{A}+\hat{B}}$ , we find that it is not true that $e^{\hat{A}} e^{\hat{B}} = e^{\hat{A}+\hat{B}}$ unless $\hat{A}$ and $\hat{B}$ commute. The correct general formula for combining products of two exponential functions of operators in this way is called the Baker-Campbell-Hausdorff formula. It says that $e^{\hat{A}} e^{\hat{B}} = e^{\hat{C}}$ , where

$\hat{C} = \hat{A} + \hat{B} + \frac{1}{2} [\hat{A}, \hat{B}] + \frac{1}{12} ([\hat{A}, [\hat{A}, \hat{B}]] - [\hat{B}, [\hat{A}, \hat{B}]]) + \cdots \qquad \qquad \qquad \qquad \qquad (160)$

This reduces to $\hat{C} = \hat{A} + \hat{B}$ only when $[\hat{A}, \hat{B}] = 0$ . Since the Hamiltonian commutes with itself, we can see that the time evolution operator is a unitary operator using the basic index rule for multiplying exponential functions:

$(e^{- i \hat{H} \Delta t / \hbar})^{\dag} (e^{- i \hat{H} \Delta t / \hbar}) = (e^{i \hat{H} \Delta t / \hbar})(e^{- i \hat{H} \Delta t / \hbar}) = \hat{1}$

From the form of the hermitian conjugate $(e^{- i \hat{H} \Delta t / \hbar})^{\dag} = e^{i \hat{H} \Delta t / \hbar}$ (this is straightforward to prove by applying the algebraic rules for hermitian conjugates to the power series form of the exponential), we also note that the inverse of the time evolution operator $e^{- i \hat{H} \Delta t / \hbar}$ in (157) is obtained simply by reversing the direction of the arrow of time.

We obtained the time evolution operator for a conservative system here by integrating the time-dependent Schrödinger equation (150) to arrive at (157). We could just as well have started by stating (157) as a basic postulate and deriving (150) from it as its differential form. For example, taking $\Delta t$ to be infinitesimally small, and writing (157) as

$|\psi + \Delta \psi \rangle = e^{- i \hat{H} \Delta t / \hbar} | \psi \rangle$

we can cast this equation into $x$ -language by acting on both sides with the bra $\langle x|$ and use the linear approximation to the exponential to get

$\langle x | \psi + \Delta \psi \rangle = \big(\hat{1} - \frac{1}{\hbar}i \Delta t \hat{H}\big) \langle x | \psi \rangle$

$= \langle x | \psi \rangle - \frac{1}{\hbar}i \Delta t \hat{H} \langle x | \psi \rangle$

Therefore,

$i \hbar \frac{(\langle x | \psi + \Delta \psi \rangle - \langle x | \psi \rangle )}{\Delta t} = \hat{H} \langle x | \psi \rangle$

or, in function notation as $\Delta t \rightarrow \infty$ ,

$\hat{H} \psi = i \hbar \frac{d \psi}{dt}$

which is the wave function form of Schrödinger’s time-dependent equation.

The time evolution rule in (157) encapsulates two fundamental principles of physics, conservation of energy (a classical principle) and de Broglie’s equation $E = hf$ linking the energy of a quantum particle to the frequency of a wave function (a purely quantum mechanical principle). We obtained (157) for a conservative system with a time-independent Hamiltonian, meaning that the system is closed and the total energy is an invariant of the motion. In quantum mechanics, this is expressed by saying that energy eigenstates are stationary when the rule (157) is applied to them. If the ket $| \psi(t_0) \rangle$ on the right hand side of (157) happens to be an eigenstate of the Hamiltonian already, say $| \psi(t_0) \rangle = | E \rangle$ , then we get

$e^{- i \hat{H} \Delta t/ \hbar} | E \rangle = e^{- i E \Delta t/ \hbar} | E \rangle \qquad \qquad \qquad \qquad \qquad (161)$

and the ket on the right hand side of (161) is the same eigenket of $\hat{H}$ as $| E \rangle$ up to a constant of proportionality. It yields the same eigenvalue when acted upon by $\hat{H}$ as $| E \rangle$ does, and the probability of observing this eigenvalue in a measurement would remain unchanged. Thus, for a conservative system, energy eigenstates are stationary in the sense that earlier and later eigenstates of $\hat{H}$ can only differ through a phase factor so measurement probabilities remain unchanged. This will not be true in general for a ket $| \psi(t_0) \rangle$ on the right hand side of (157) which is not an energy eigenstate of the Hamiltonian. When this state is expressed as a superposition of energy eigenstates to obtain (158), each energy eigenstate in the superposition will have a different phase factor, i.e., the relative phases associated with each energy eigenstate change with time, so the later state $| \psi(t_1) \rangle$ will not generally be the same as the earlier state $| \psi(t_0) \rangle$ to within a phase factor. De Broglie’s equation comes into the story because we can rewrite $E = hf$ as $E = \hbar \omega$ , where $\omega = 2 \pi f = E/\hbar$ is the angular frequency of a wave associated with the energy $E$ . Thus, the phase factors arising from applying the time evolution operator for a conservative system have the form $e^{-i \omega \Delta t}$ . With every energy $E$ there is to be associated a unique single frequency $\omega = E/\hbar$ , and (158) shows that the eigenstates of energy evolve over time at this fixed frequency.

To give some applied examples of calculations involving the time evolution operator, consider a beam of neutrons entering a region of uniform magnetic field at time $t = 0$ . Although a neutron does not have charge, it does have a magnetic dipole moment and a spin angular momentum index $s = \frac{1}{2}$ (same as the electron). Suppose the beam of incoming neutrons is prepared with a known $y$ -direction spin component and travels in the $+x$ -direction. At $t= 0$ , it enters the region of uniform magnetic field $\textbf{B}$ pointing in the $+z$ -direction. In classical physics, the potential energy that would be associated in this situation with a magnetic dipole moment $\textbf{M}$ in a magnetic field $\textbf{B} = \textbf{e}_z B$ is $-\textbf{M} \cdot \textbf{B}$ . (To get an idea of why this is the case, recall from (118) that the torque experienced by the magnetic dipole moment in the magnetic field is $\pmb{\tau} = \textbf{M} \times \textbf{B}$ . The work done in aligning this with the magnetic field is $W = \int \tau d \theta = \int M B \sin \theta d \theta = M B \cos \theta = M \cdot B$ . Therefore, the potential energy is $-W = - M \cdot B$ ). There is no kinetic energy because the magnetic force is always perpendicular to the velocity. Thus, in the classical case, the Hamiltonian would be $H = -\textbf{M} \cdot \textbf{B} = -M_z B$ . By analogy, the Hamiltonian operator expressing the energy of the neutron magnetic dipole moment in the magnetic field pointing in the $+z$ -direction would be

$\hat{H} = -\hat{\textbf{M}} \cdot \textbf{B} = \Omega \hat{S}_z \qquad \qquad \qquad \qquad \qquad (162)$

where $\Omega$ represents a collection of parameters which together carry the dimensions of angular frequency. Suppose a neutron enters the magnetic field at time $t = 0$ in a spin state $| \mu = +\frac{1}{2} \rangle$ . This is the initial state of the system, i.e., the state $| \psi(t_0) \rangle$ in (157). It is an eigenstate of the spin angular momentum operator $\hat{S}_y$ with eigenvalue $\mu = \frac{1}{2}$ , as defined by (136). This initial neutron state will evolve over time in accordance with (157) into a later state $| \psi(t_1) \rangle$ given by

$| \psi(t_1) \rangle = \exp \big(\frac{- i \Delta t \Omega \hat{S}_z}{\hbar}\big) |\mu = +\frac{1}{2} \rangle \qquad \qquad \qquad \qquad \qquad (163)$

But the state $|\mu = +\frac{1}{2} \rangle$ is not an eigenstate of the Hamiltonian in (162). The eigenstates of the Hamiltonian are those of $\hat{S}_z$ , defined by (135). We now need to expand $|\mu = +\frac{1}{2} \rangle$ in a superposition of the Hamiltonian’s eigenstates using a ket-bra sum. We write

$|\mu = +\frac{1}{2} \rangle = \sum_m | m \rangle \langle m | \mu = +\frac{1}{2} \rangle$

$= | m = +\frac{1}{2} \rangle \langle m = +\frac{1}{2} | \mu = +\frac{1}{2} \rangle + | m = -\frac{1}{2} \rangle \langle m = -\frac{1}{2} | \mu = +\frac{1}{2} \rangle \qquad \qquad \qquad \qquad \qquad (164)$

where $m$ labels the eigenstates of $\hat{S}_z$ and thus of $\hat{H}$ . We need the overlap coefficients $\langle m | \mu = +\frac{1}{2} \rangle$ appearing in the superposition (164), but these are exactly the ones calculated in (144), in the first column. Thus, we can write the superposition as

$|\mu = +\frac{1}{2} \rangle = \frac{1}{\sqrt{2}} | m = +\frac{1}{2} \rangle + \frac{i}{\sqrt{2}} | m = -\frac{1}{2} \rangle \qquad \qquad \qquad \qquad \qquad (165)$

Applying the time evolution operator to this as in the right-hand side of (163) we obtain

$| \psi(t_1) \rangle = \frac{1}{\sqrt{2}} | m = +\frac{1}{2} \rangle \exp \big(\frac{- i \Delta t \Omega}{2}\big) + \frac{i}{\sqrt{2}} | m = -\frac{1}{2} \rangle \exp \big(\frac{i \Delta t \Omega}{2}\big) \rangle \qquad \qquad \qquad \qquad \qquad (166)$

We see from (166) that $| \psi(t_1) \rangle$ is not the same state as $| \psi(t_0) \rangle = | \mu = +\frac{1}{2} \rangle$ up to a phase factor, so this is an example of a nonstationary state. The relative phases associated with each energy eigenstate have changed with time.

Given that the system is initially in the $\hat{S}_y$ eigenstate $|\mu = +\frac{1}{2} \rangle$ at time $t_0$ , suppose we want to know the probability amplitude to find it in the $\hat{S}_z$ eigenstate $|m = +\frac{1}{2} \rangle$ at a later time $t_1$ . We can apply the bra $\langle m = +\frac{1}{2}|$ to the time-evolved ket in (166) to get

$\langle m = +\frac{1}{2} | \exp \big(\frac{- i \Delta t \Omega \hat{S}_z}{\hbar}\big) |\mu = +\frac{1}{2} \rangle = \frac{1}{\sqrt{2}} \exp \big(\frac{- i \Delta t \Omega}{2}\big) \qquad \qquad \qquad \qquad \qquad (167)$

We observe that the absolute value squared of (167) is the same as we would get if we had applied the bra $\langle m = +\frac{1}{2}|$ to the initial state in (165). Thus, the probability of finding the spin up state in the $+z$ -direction does not vary with time, i.e., it is a stationary state, confirming that this spin up state is an eigenstate of energy.

By way of contrast, suppose again that the system is initially in the $\hat{S}_y$ eigenstate $|\mu = +\frac{1}{2} \rangle$ at time $t_0$ , but now we want to know the probability amplitude to find it still in the $\hat{S}_y$ eigenstate $| \mu = +\frac{1}{2} \rangle$ at a later time $t_1$ . We can apply the bra $\langle \mu = +\frac{1}{2}|$ to the time-evolved ket in (166) to get

$\langle \mu = +\frac{1}{2} | \exp \big(\frac{- i \Delta t \Omega \hat{S}_z}{\hbar}\big) |\mu = +\frac{1}{2} \rangle$

$= \frac{1}{\sqrt{2}} \langle \mu = +\frac{1}{2}| m = +\frac{1}{2} \rangle \exp \big(\frac{- i \Delta t \Omega}{2}\big) + \frac{i}{\sqrt{2}} \langle \mu = +\frac{1}{2}| m = -\frac{1}{2} \rangle \exp \big(\frac{i \Delta t \Omega}{2}\big) \qquad \qquad \qquad \qquad \qquad (168)$

But the overlap coefficients in (168) are just the complex conjugates of the entries in the first column of (144), so we have

$\langle \mu = +\frac{1}{2} | \exp \big(\frac{- i \Delta t \Omega \hat{S}_z}{\hbar}\big) |\mu = +\frac{1}{2} \rangle$

$= \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}} \exp \big(\frac{- i \Delta t \Omega}{2}\big) - \frac{i}{\sqrt{2}} \frac{i}{\sqrt{2}} \exp \big(\frac{i \Delta t \Omega}{2}\big)$

$= \cos \big( \frac{\Omega \Delta t}{2}\big) \qquad \qquad \qquad \qquad \qquad (169)$

Therefore, the probability of finding the initial spin state again at time $t_1$ is

$|\langle \mu = +\frac{1}{2} | \exp \big(\frac{- i \Delta t \Omega \hat{S}_z}{\hbar}\big) |\mu = +\frac{1}{2} \rangle|^2 = \cos^2 \big( \frac{\Omega \Delta t}{2}\big) = \frac{1}{2} (1 + \cos(\Omega \Delta t)) \qquad \qquad \qquad \qquad \qquad (170)$

We find that the probability amplitude in (169) oscillates in time at frequency $\Omega/2$ , and the probability in (170) varies sinusoidally over time at frequency $\Omega$ . Thus, even if we end up at $t_1$ observing the same state $|\mu = +\frac{1}{2} \rangle$ that we started with at $t_0$ , the state $|\mu = +\frac{1}{2} \rangle$ is not stationary, since the passage of time causes variation in the relative phase differences among the energy eigenstates in the superposition and this causes the probability of this outcome to vary with time.

Another important mathematical idea in this area of quantum mechanics is that of calculating how the average value of an operator changes with time. This leads to the concepts of the classical limit of quantum mechanics and invariant operators of a quantum system which always give stationary measurement results. Consider an operator $\hat{A}$ which represents some measurement observable and has the eigenvalue equation $\hat{A} | A \rangle = A | A \rangle$ . Suppose a quantum system is prepared at time $t = t_0$ to be in some quantum state $| \alpha \rangle$ which is not an eigenstate of $\hat{A}$ . With a view to applying $\hat{A}$ to $| \alpha \rangle$ , we can expand the latter as a superposition of the eigenstates of $\hat{A}$ :

$| \alpha \rangle = \sum_A | A \rangle \langle A | \alpha \rangle \qquad \qquad \qquad \qquad \qquad (171)$

Then applying $\hat{A}$ to this gives us a new state expressed as a superposition:

$\hat{A} | \alpha \rangle = \sum_A \hat{A} | A \rangle \langle A | \alpha \rangle = \sum_A A | A \rangle \langle A | \alpha \rangle \qquad \qquad \qquad \qquad \qquad (172)$

To perform probability calculations using (172) we would need to cast it in a relevant language by applying a suitable bra to both sides. In particular, casting (172) in the language of $\alpha$ gives us the expected value of the operator $\hat{A}$ , denoted by $\langle A \rangle$ :

$\langle \alpha | \hat{A} | \alpha \rangle = \sum_A A \langle \alpha | A \rangle \langle A | \alpha \rangle = \sum_A A | \langle A | \alpha \rangle |^2 = \langle A \rangle \qquad \qquad \qquad \qquad \qquad (173)$

This corresponds to applying $\langle \alpha | \hat{A}$ to the state $| \alpha \rangle$ instantaneously after the latter is prepared, repeating this experiment a large number of times to obtain many results from the spectrum of possible $A$ -values, and then working out the average of $A$ from the experimental data. But suppose there is a time delay between preparing the state $| \alpha \rangle$ and making the repeated measurements of $\hat{A}$ , so that instead of making each measurement at $t = t_0$ we make it at a later time $t = t_1$ . Then by (157) the state $| \alpha \rangle$ will have evolved to a state $| \beta \rangle$ by the time each measurement is made, where $| \beta \rangle$ is given by

$| \beta \rangle = \exp (- i \Delta t \hat{H} /\hbar) | \alpha \rangle \qquad \qquad \qquad \qquad \qquad (174)$

Using the time-reversed operator $(e^{- i \hat{H} \Delta t / \hbar})^{\dag} = e^{i \hat{H} \Delta t / \hbar}$ , we can write the expected value of $\hat{A}$ at time $t_1$ as

$\langle \beta | \hat{A} | \beta \rangle = \langle \alpha | \exp (i \Delta t \hat{H} /\hbar) \hat{A} \exp (- i \Delta t \hat{H} /\hbar) | \alpha \rangle \qquad \qquad \qquad \qquad \qquad (175)$

Differentiating (175) with respect to time then allows us to study how the average value of an operator changes with time. Using the product rule, we get

$\frac{\partial}{\partial \Delta t} [\exp (i \Delta t \hat{H} /\hbar) \hat{A} \exp (- i \Delta t \hat{H} /\hbar)]$

$= \frac{i \hat{H}}{\hbar} \exp (i \Delta t \hat{H} /\hbar) \hat{A} \exp (- i \Delta t \hat{H} /\hbar)$

$+ \exp (i \Delta t \hat{H} /\hbar) \frac{\partial \hat{A}}{\partial t} \exp (- i \Delta t \hat{H} /\hbar)$

$+ \exp (i \Delta t \hat{H} /\hbar) \hat{A} \big(-\frac{i \hat{H}}{\hbar} \big) \exp (- i \Delta t \hat{H} /\hbar)$

$= \exp (i \Delta t \hat{H} /\hbar) \big(\frac{\partial \hat{A}}{\partial t} + \frac{1}{i \hbar} [\hat{A}, \hat{H}]\big) \exp (-i \Delta t \hat{H} /\hbar) \qquad \qquad \qquad \qquad \qquad (176)$

Using (176) with (175) we deduce that

$\frac{d}{dt} \langle \beta | \hat{A} | \beta \rangle = \langle \beta | \big(\frac{\partial \hat{A}}{\partial t} + \frac{1}{i \hbar} [\hat{A}, \hat{H}]\big) | \beta \rangle \qquad \qquad \qquad \qquad \qquad (177)$

What (177) is telling us is that the time derivative of $\langle \beta | \hat{A} | \beta \rangle$ is actually the average of a new operator bracketed between the same states. The new operator consists of two terms. The first term is the time derivative of the operator itself. This is often taken to be zero because it is unusual to have time-varying measurement apparatus in an experiment. In the case of the Stern-Gerlach experiments, for example, an operator with a non-zero time derivative would correspond to something like the apparatus continually revolving while the measurements were being made. In the following, we will always assume $\partial \hat{A}/ \partial t = 0$ . The second term of the new operator in (177) is the commutator of $\hat{A}$ with $\hat{H}$ . The key point here is that the average value of a fixed measurement observable $\hat{A}$ will change over time if $[\hat{A}, \hat{H}] \neq 0$ .

Something called the Correspondence Principle in quantum mechanics suggests that, in the case $[\hat{A}, \hat{H}] \neq 0$ , the average of $\hat{A}$ should evolve over time in accordance with a Newtonian-like equation of motion. The famous Ehrenfest Theorem, for example, is a specific case of this involving position and momentum operators. Consider a particle in a potential $V(x)$ for which the Hamiltonian is

$\hat{H} = \frac{\hat{p}^2}{2m} + V(\hat{x})$

Taking first $\hat{A} = \hat{x}$ and then $\hat{A} = \hat{p}$ in (177) with $\partial \hat{A}/ \partial t = 0$ , and using the results $\langle x | [\hat{p}, \hat{V}] = -i \hbar (dV/dx) \langle x |$ and $[\hat{x}, \hat{p}^2] = 2 i \hbar \hat{p}$ , we obtain the Newtonian-like equations of motion

$\frac{d}{dt} \langle \hat{x} \rangle = \frac{d}{dt} \langle x | \hat{x} | x \rangle = \langle x | \frac{1}{i\hbar} [\hat{x}, \hat{H}] | x \rangle$

$= \frac{1}{i\hbar} \langle x | \big[\hat{x}, \frac{\hat{p}^2}{2m} \big] | x \rangle = \frac{1}{m} \langle x | \hat{p} | x \rangle = \frac{1}{m} \langle \hat{p} \rangle \qquad \qquad \qquad \qquad \qquad (178)$

and

$\frac{d}{dt} \langle \hat{p} \rangle = \frac{d}{dt} \langle x | \hat{p} | x \rangle = \langle x | \frac{1}{i\hbar} [\hat{p}, \hat{H}] | x \rangle$

$= \frac{1}{i\hbar} \langle x | [\hat{p}, V(\hat{x}) ] | x \rangle = \langle x | -(dV/dx) | x \rangle = -\langle dV/dx \rangle \qquad \qquad \qquad \qquad \qquad (179)$

In the case $\partial \hat{A}/ \partial t = 0$ and $[\hat{A}, \hat{H}] = 0$ , we would have a stationary time average, i.e., $d \langle \hat{A} \rangle/dt = 0$ . Such operators represent conserved quantities, formally called invariants of the system. For example, the energy of a closed system is such an invariant, as we saw above. Another example of an invariant operator in the context of the neutron in a magnetic field is the case $\hat{A} = \hat{S}^2$ . The total spin angular momentum operator is not a function of time and it commutes with the Hamiltonian specified in (162) so it must be an invariant of the system. By (126), we have $\hat{S}^2 = 3 \hbar^2/4$ and this must be preserved in the magnetic field, fundamentally because the magnetic field cannot change the neutron’s spin parameter $s = \frac{1}{2}$ . In contrast, the operator $\hat{S}_y$ with the eigenvalue equation given in (136) does not commute with the Hamiltonian in (162) so it cannot be an invariant of the system. A neutron entering the system in a pre-prepared spin up state $|\mu = \frac{1}{2} \rangle$ can be found at the exit in a spin down state $| \mu = -\frac{1}{2} \rangle$ .