A mathematics and physics `scrapbook', with notes on miscellaneous things that catch my interest in a range of areas including: mathematical methods; number theory; probability theory; stochastic processes; mathematical economics; econometrics; quantum theory; relativity theory; cosmology; cloud physics; statistical mechanics; nonlinear dynamics; electronic engineering; graph and network theory; mathematics in Latin; computational mathematics using Python and other maths software.
I recently needed to approximate a logarithm of the form where is some large number. It was not possible to use the usual Maclaurin series approximation for because this only holds for . However, the following is a useful trick. We have
Therefore, suppose we replace with , where is any large number. Then
The approximation formula works surprisingly well, being more accurate the larger is . For example, to calculator-accuracy we have , while taking we get the approximation .
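As a quick numerical illustration of a trick of this kind, the short Python sketch below assumes the identity being used is ln x = N ln(x^(1/N)) ≈ N(x^(1/N) − 1), i.e. taking an N-th root to bring the argument close to 1 so that the Maclaurin series for ln(1 + y) applies; this particular identity and the sample values of x and N are illustrative assumptions rather than the figures quoted above.

```python
import math

def approx_log(x, N):
    """Approximate ln(x) as N*(x**(1/N) - 1).

    Assumed rationale: ln x = N*ln(x**(1/N)), and for large N the argument
    x**(1/N) is close to 1, so the Maclaurin series ln(1 + y) ~ y applies
    with y = x**(1/N) - 1.
    """
    return N * (x ** (1.0 / N) - 1.0)

if __name__ == "__main__":
    x = 1.0e12                       # an arbitrary large number for illustration
    for N in (10, 1_000, 1_000_000):
        print(N, approx_log(x, N), math.log(x))
    # The approximation improves as N grows, as described above.
```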
Zipf’s law refers to the phenomenon that many data sets in the social and exact sciences are observed to obey a power law of the form with the exponent approximately equal to 2. In the present note I want to set out a simple Yule-Simon process (similar to one first discussed in Simon, H. A., 1955, On a class of skew distribution functions, Biometrika, 42:425-440) which shows clearly how Zipf’s law can emerge from urn-type processes following a similar pattern.
The simple process discussed here involves the appearance of new species within families of closely related species called genera. New species appear within genera (through evolutionary processes) which usually remain quite close in their main characteristics to the pre-existing species. However, every so often, a new species will appear which is sufficiently different from all pre-existing ones to enable it to be regarded as having started a completely new genus. We can construct a simple Yule-Simon process as a stylised version of this. Suppose that species appear one at a time and that when the number of species reaches , the next new species will start a new genus. Therefore, when the first new genus appears, there are species in total. Species continue to appear one at a time, and when the number of species reaches (i.e., another species have appeared), the next new species will again start a new genus. Thus, when the second new genus appears, there are species in total. We assume this process continues indefinitely, so that when the -th new genus appears, there are species in total.
We further assume that between each genus and the next, the new species that appear will be distributed among the already existing genera in proportion to the number of species they already have (this gives rise to the characteristic feature of Zipf’s law when applied to wealth distribution in economics that `the rich get richer’). So at stage , the next species that appears will appear in the -th genus with probability
where is the number of species already in , and is simply the total number of species at stage , which is . There are opportunities for this to happen, so genus gains a new species with probability
Let denote the fraction of genera that have species when the total number of genera is . Then is the number of genera that have species when the total number of genera is , and the expected number of genera of size that gain a new species in this interval is
Now, when these genera gain the new species, they will move out of the class of genera with species, and into the class of genera with species, so the number of genera with species will fall by . Analogously, the expected number of genera with species that will gain a new species is
When these genera gain a new species, they will move into the class of genera with species, so the number of genera with species will rise by . Therefore we can write a master equation for the new number of genera with species at stage thus:
However, this master equation does not hold for genera of size 1. Instead, these genera obey
The second term on the right-hand side is 1 because, by definition, exactly one new genus appears at each step of the process, so there is only one entrant from the class of genera with zero species into the class of genera with one species.
We assume there is a steady state as , in which case we get the steady state equation for
and the steady state equation for
But using the steady state equation for above we observe that
and substituting this back into the steady state equation for above we get
Continuing the iteration in this way we get
and using the steady state expression for in this we get
Since with , we can write this as
Now, the gamma function is defined as
and the beta function for and is defined as
It is not too difficult to show that the two are related by the equation
and furthermore, for large we have
Comparing with the final expression for above in terms of the gamma function we see that
So we get a power law when the genera size is large, and we get Zipf’s law when the number of new entrant species is large.
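A short simulation makes it easy to see the power law emerging from this process. The sketch below is a minimal Python implementation of the stylised process described above; the value m = 3 for the number of ordinary species appearing between successive new genera is an arbitrary illustrative choice, and genus membership is tracked with one list entry per species so that a uniform random draw implements the `rich get richer' rule.

```python
import random
from collections import Counter

def yule_simon(num_genera=20000, m=3, seed=0):
    """Simulate the stylised species/genera process described above.

    Assumption: after every m ordinary species, the next new species founds
    a new genus; otherwise a new species joins an existing genus with
    probability proportional to that genus's current size.
    """
    rng = random.Random(seed)
    sizes = [1]      # genus sizes; start with a single genus of one species
    owners = [0]     # one entry per species holding its genus index, so a
                     # uniform draw over species implements preferential attachment
    while len(sizes) < num_genera:
        for _ in range(m):                     # m species join existing genera
            g = owners[rng.randrange(len(owners))]
            sizes[g] += 1
            owners.append(g)
        sizes.append(1)                        # the next species founds a new genus
        owners.append(len(sizes) - 1)
    return sizes

if __name__ == "__main__":
    sizes = yule_simon()
    counts = Counter(sizes)
    total = len(sizes)
    for n in sorted(counts)[:8]:
        print(f"fraction of genera of size {n}: {counts[n] / total:.4f}")
```

Plotting these fractions against genus size on log-log axes gives the roughly straight-line fall-off characteristic of a power law.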
When the parameters of some physical systems are precisely tuned, the systems can enter a phase transition in which the behaviour of observables changes dramatically. In particular, the systems can become scale-free in the sense of losing any relationship to scales of measurement, i.e., the systems suddenly switch to behaving the same irrespective of the scales of measurement being used. (For many examples of this, and a discussion of scale invariance arising from phase transitions, visit this website). Among the critical phenomena in the vicinity of these phase transitions we can then get power law behaviours, e.g., for probability distributions of observables in the system. In the present note, I want to record a simple proof that whenever a probability distribution is scale-free, it must in fact be a power law of the form .
The scale-free characteristic can be expressed as
so that multiplying the argument by a scale factor simply results in the same probability function multiplied by a scale factor , where is some other function. To show that any function having this scale-free characteristic must be a power law, begin by setting . Then and therefore
so the expression for above becomes
Since this is an identity in the scale factor , we can differentiate both sides with respect to to get
Setting in this we get
This is a separable first-order differential equation, so
and therefore
Setting we find , so
and thus we arrive at the power law
where
So the power law distribution is the only function satisfying the scale-free criterion . In the vicinity of the critical point of a continuous phase transition at which a physical system becomes scale-free, power law behaviour should be seen among the observables in the system.
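The forward direction of this result, namely that a power law really is scale-free while a typical non-power-law is not, is easy to check symbolically. Below is a minimal sketch using sympy, with the symbols C and alpha standing for the normalising constant and the exponent.

```python
import sympy as sp

x, b, C, alpha = sp.symbols('x b C alpha', positive=True)

# A power law p(x) = C * x**(-alpha) ...
p = C * x**(-alpha)

# ... is scale-free: p(b*x) factors as g(b) * p(x) with g(b) = b**(-alpha).
print(sp.simplify(p.subs(x, b * x) / p))          # b**(-alpha), independent of x

# By contrast, an exponential is not scale-free: the ratio still depends on x.
q = sp.exp(-x)
print(sp.simplify(q.subs(x, b * x) / q))          # exp(x - b*x)
```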
When you think of the classical harmonic oscillator, think of a mass connected to a spring oscillating at a natural frequency which is independent of the initial position or velocity of the mass. The natural frequency will depend only on the stiffness of the spring and the size of the mass.
When you think of the quantum harmonic oscillator, think of quasi-factorising the Hamiltonian operator in Schrödinger’s equation to get creation and annihilation operators, re-expressing the Hamiltonian in terms of these ladder operators, and operating on the system with these ladder operators to increase and decrease the energy of the system by multiples of discrete packets of energy (‘quanta’).
Recall that the classical harmonic oscillator model, say a mass bouncing up and down on a spring with spring constant aligned with the -axis, involves a restoring force on the mass whenever it is away from equilibrium at . Newton’s second law then gives the differential equation
Defining (this will be the natural frequency of the oscillations), we can write this as
which has the general solution
We can obtain particular solutions from this general solution by specifying initial conditions. For example, starting off with the mass at 10 cm below the equilibrium position and then releasing it gives the initial conditions
Applying these to the general solution we get the equations
Therefore the particular solution for this situation is
The work done by the spring force on the mass in opposing its motion from, say, the equilibrium position to a height above the equilibrium point is given by
This work can be viewed as transferring energy from the kinetic energy of the mass to the elastic potential energy of the spring (or more strictly speaking, the mass-spring system). The potential energy of the spring for this displacement from equilibrium is thus
(Also note that, as usual, the original spring force is recoverable as the negative of the first derivative of the potential energy).
If the maximum displacement of the spring before returning to equilibrium is , so that it momentarily stops there (maximum potential energy, zero kinetic energy), then the maximum speed of the mass, which occurs when it is back at the equilibrium point (zero potential energy, maximum kinetic energy), can be calculated using the conservation of total energy equation as
This maximum speed increases with the spring constant (i.e., it is larger for a stiffer spring) and decreases with the mass of the particle.
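A quick numerical illustration of these energy relations is given below; the mass, spring constant and amplitude are my own illustrative values rather than the figures used in the worked example above.

```python
import numpy as np

# Illustrative values: a 0.5 kg mass on a 20 N/m spring, released from rest
# 0.10 m below the equilibrium position.
m, k, A = 0.5, 20.0, 0.10
omega = np.sqrt(k / m)                 # natural frequency, independent of A

t = np.linspace(0.0, 2 * np.pi / omega, 2001)
x = -A * np.cos(omega * t)             # released from rest at x = -A
v = A * omega * np.sin(omega * t)

E = 0.5 * m * v**2 + 0.5 * k * x**2    # kinetic + elastic potential energy
print("total energy (min, max):", E.min(), E.max())        # constant, k*A**2/2
print("max speed on the grid:", v.max(), "vs A*sqrt(k/m):", A * np.sqrt(k / m))
```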
In the case of the quantum harmonic oscillator, we use the time-independent Schrödinger equation rather than Newton’s second law to get the relevant differential equation. Starting from the total energy equation , written as
where is linear momentum, we simply replace by the quantum mechanical momentum operator and by the quantum mechanical position operator to get the time-independent Schrödinger equation
The bracketed expression on the left-hand side is the Hamiltonian operator (corresponding to the total energy of the system, kinetic plus potential) acting on the wave function :
Inside the brackets we now have an operator sum of squares similar to the algebraic sum of squares which can be factorized as . We would therefore like to factorize by setting , , and writing
However, multiplying out this expression does not give us , because the operators and do not commute. Instead we get
where
is the quantum mechanical commutator of and . Therefore
where and are creation and annihilation operators respectively (also known as raising and lowering operators, and collectively as ladder operators) defined as
If we reversed the order of and we would find that
We can therefore write the Schrödinger equation as
However, the real usefulness of ladder operators becomes apparent when we apply the Hamiltonian written in this form to rather than to . We get
Therefore if is a solution to the quantum harmonic oscillator problem, is also a solution, i.e., we can apply the creation operator to the solution and get another solution with an energy eigenvalue instead of .
Using the same algebra, we find that if is a solution to the quantum harmonic oscillator problem with energy eigenvalue , then is another solution with energy eigenvalue .
We therefore call the ladder operator a raising operator because applying it to a quantum state results in a new quantum state whose energy is higher by a quantum of energy . The term creation operator arises because these quanta of energy actually behave like particles, so the addition of this extra quantum of energy can also be viewed as the creation of a new particle. Similarly, is a lowering operator because applying it to a quantum state results in a new quantum state whose energy is lower by a quantum of energy . It is also known as an annihilation operator because this process is like removing a particle.
Now, the Schrödinger equation for the quantum harmonic oscillator has a solution set consisting of eigenfunctions (expressed in terms of Hermite polynomials), each with a corresponding energy eigenvalue
for . The energy of a quantum harmonic oscillator is therefore indeed quantized in steps of . The lowest possible energy, namely the zero point energy corresponding to , is . The corresponding eigenfunction is . Since the energy level cannot fall below , we specify , so trying to apply the lowering operator to just gives the zero function. We could actually solve for the explicit form of by solving the condition as a simple first-order differential equation:
Normalizing, we get
so the explicit form of the zero point energy eigenfunction is
We can now get all the higher energy eigenfunctions by repeatedly applying the raising operator to this explicit form for (with some adjustments for normalization).
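A compact way to see the ladder-operator machinery in action is to represent the annihilation and creation operators as truncated matrices in the energy eigenbasis and check the commutator and the ladder of eigenvalues numerically. The sketch below works in units where the mass, the natural frequency and the reduced Planck constant are all 1, and assumes the standard matrix elements in which the annihilation operator sends the n-th state to sqrt(n) times the (n-1)-th state.

```python
import numpy as np

N = 12                                       # truncate the infinite Fock space at N levels
a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # annihilation (lowering) operator
adag = a.conj().T                            # creation (raising) operator

hbar = omega = 1.0                           # units with hbar = omega = m = 1
H = hbar * omega * (adag @ a + 0.5 * np.eye(N))

# [a, a^dagger] should be the identity; truncation spoils only the last diagonal entry.
comm = a @ adag - adag @ a
print(np.round(np.diag(comm), 6))

# The eigenvalues of H are (n + 1/2) * hbar * omega.
print(np.round(np.linalg.eigvalsh(H)[:6], 6))    # 0.5, 1.5, 2.5, ...
```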
In the present note I want to explore the decomposition of an arbitrary Lorentz transformation in the form , where and are orthogonal Lorentz matrices and is a simple Lorentz matrix (to be defined below). We will use throughout a metric tensor of the form .
Any matrix that preserves the quadratic form is called Lorentz. We have here
where is the temporal coordinate and are spatial coordinates. So, what this means is that if is a matrix, it is Lorentz if is such that
Therefore is Lorentz iff .
The set of all Lorentz matrices thus defined forms a group, the Lorentz group, under matrix multiplication. The identity element of the group is obviously the identity matrix . To prove closure under matrix multiplication, suppose and are two Lorentz matrices. Then
so the product is also Lorentz. To prove that the inverse of a Lorentz matrix is also Lorentz, suppose is Lorentz so that . Then since , left-multiplying both sides by gives
so the inverse of is (and the inverse of is , by right-multiplying). But then we have
so the inverse of is also Lorentz. Therefore the Lorentz matrices form a group as claimed.
For two inertial reference frames in standard configuration, the Lorentz transformation will be a matrix of the form
such that . Any matrix of this form is said to be simple Lorentz. The relative velocity in the physical situation modelled by is recovered as
Notice that since , we have
which is itself a simple Lorentz matrix corresponding to a reversal in the sign of .
Notice also that the transpose of a Lorentz matrix is Lorentz. To see this, if is Lorentz then . Pre-multiplying this by and post-multiplying by we get
as required.
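These properties are easy to verify numerically for a concrete simple Lorentz matrix. The sketch below assumes the metric diag(1, -1, -1, -1) and the defining relation that the transpose of the matrix times the metric times the matrix returns the metric; the chosen boost speed is arbitrary.

```python
import numpy as np

def boost_x(beta):
    """Simple Lorentz matrix for relative speed beta = v/c along the x-axis."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

g = np.diag([1.0, -1.0, -1.0, -1.0])   # assumed metric convention

L = boost_x(0.6)
print(np.allclose(L.T @ g @ L, g))                     # L is Lorentz
print(np.allclose(L @ g @ L.T, g))                     # so is its transpose
print(np.allclose(np.linalg.inv(L), boost_x(-0.6)))    # inverse = boost with -v
```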
We now look in detail at the decomposition result mentioned at the start of this note. This expresses in mathematical terms the possibility of simplifying an arbitrary Lorentz matrix by a suitable rotation of axes. The result says that an arbitrary Lorentz matrix has the representation
where is a simple Lorentz matrix with parameters (with ) and , and and are orthogonal Lorentz matrices defined by
and
where , , , , with and chosen so that is orthogonal.
A corollary of this result is that if connects two inertial frames, then the relative velocity between the frames is
To prove this decomposition result and its corollary, we begin by observing that
But from the first element of the product , which is
we get
Using this result in the expression for above we therefore have
Therefore the matrix
is orthogonal (i.e., its columns and rows are orthogonal unit vectors – recall that and are chosen so that this is true). Therefore the matrix has the form
Therefore
This is a orthogonal matrix which is also Lorentz. It is clearly orthogonal since . To confirm that is Lorentz we compute
using the fact that , and are unit vectors. Therefore , so is both orthogonal and Lorentz as claimed. Therefore we have
as claimed in the decomposition result above.
Now, since is a product of Lorentz matrices, itself must be a Lorentz matrix. To show that it is orthogonal, we can write out explicitly as
where the omitted rows are of the same form as the first row, but with , , , instead of , , , , for .
We now focus on proving that the top row and first column of this product are . If this is the case, then the remaining submatrix must be orthogonal since is Lorentz (the equation can only be satisfied if the submatrix is orthogonal) and this will imply that the entire matrix is orthogonal.
The first element in the top row of the product is
(since , , are the last three elements in the first row of multiplied by )
(since this is the same quadratic form as in the first element of derived above, which equals 1).
The next element in the top row is
(since ).
For the third element in the top row we have
since and are orthogonal.
Finally, for the fourth element in the top row we have
since and are orthogonal.
Next, we consider the first column of the product. For each , we have
since the quadratic form inside the bracket is zero when (it is an off-diagonal element of ).
Therefore the product matrix is of the form
where the submatrix must be orthogonal, since is Lorentz. This proves the main decomposition result.
To prove the corollary, note that in a simple Lorentz matrix the relative velocity is given by
An apparent paradox in Einstein’s Special Theory of Relativity, known as a Thomas precession rotation in atomic physics, has been verified experimentally in a number of ways. However, somewhat surprisingly, it has not yet been demonstrated algebraically in a straightforward manner using Lorentz matrix algebra. Authors in the past have resorted instead to computer verifications, or to overly complicated derivations, leaving undergraduate students in particular with the impression that this is a mysterious and mathematically inaccessible phenomenon. This is surprising because, as shown in the present note, it is possible to use a basic property of orthogonal Lorentz matrices and a judicious choice for the configuration of the relevant inertial frames to give a very transparent algebraic proof. It is pedagogically useful for physics students, particularly at undergraduate level, to explore this: it not only clarifies the nature of the paradox at an accessible mathematical level and sheds additional light on some mathematical properties of Lorentz matrices and relatively moving frames, but also illustrates the satisfaction that a clear mathematical understanding of a physics problem can bring, compared with uninspired computations or tortured derivations.
B-splines and collocation techniques have been applied to the solution of Schrödinger’s equation in quantum mechanics since the early 1970s, but one aspect that is noticeably missing from this literature is the use of Gaussian points (i.e., the zeros of Legendre polynomials) as the collocation points, which can significantly reduce approximation errors. Authors in the past have used equally spaced or nonlinearly distributed collocation points (noticing that the latter can increase approximation accuracy) but, strangely, have continued to avoid Gaussian collocation points so there are no published papers employing this approach. Using the methodology and computer routines provided by Carl de Boor’s book A Practical Guide to Splines as a `numerical laboratory’, the present dissertation examines how the use of Gaussian collocation points can interact with other features such as box size, mesh size and the order of polynomial approximants to affect the accuracy of approximations to Schrödinger’s bound state wave functions for the electron in the hydrogen atom. In particular, we explore whether or not, and under what circumstances, B-spline collocation at Gaussian points can produce more accurate approximations to Schrödinger’s wave functions than equally spaced and nonlinearly distributed collocation points. We also apply B-spline collocation at Gaussian points to a Schrödinger equation with cubic nonlinearity which has been used extensively in the past to study nonlinear phenomena. Our computer experiments show that in the case of the hydrogen atom, collocation at Gaussian points can be a highly successful approach, consistently superior to equally spaced collocation points and often superior to nonlinearly distributed collocation points. However, we do encounter some situations, typically when the mesh is quite coarse relative to the box size for the hydrogen atom, and also in the cubic Schrödinger equation case, in which nonlinearly distributed collocation points perform significantly better than Gaussian collocation points.
A Lie group is a group which is also a smooth differentiable manifold. Every Lie group has an associated tangent space called a Lie algebra. As a vector space, the Lie algebra is often easier to study than the associated Lie group and can reveal most of what we need to know about the group. This is one of the general motivations for Lie theory. A table of some common Lie groups and their associated Lie algebras can be found here. All matrix groups are Lie groups. An example of a matrix Lie group is the -dimensional rotation group . This group is linked to a set of antisymmetric matrices which form the associated Lie algebra, usually denoted by . Like all Lie algebras corresponding to Lie groups, the Lie algebra is characterised by a Lie bracket operation which here takes the form of commutation relations between the above-mentioned antisymmetric matrices, satisfying the formula
The link between and is provided by the matrix exponential map in the sense that each point in the Lie algebra is mapped to a corresponding point in the Lie group by matrix exponentiation. Furthermore, the exponential map defines parametric paths passing through the identity element in the Lie group. The tangent vectors obtained by differentiating these parametric paths and evaluating the derivatives at the identity are the elements of the Lie algebra, showing that the Lie algebra is the tangent space of the associated Lie group manifold.
In the rest of this note I will unpack some aspects of the above brief summary without going too much into highly technical details. The Lie theory of rotations is based on a simple symmetry/invariance consideration, namely that rotations leave the scalar products of vectors invariant. In particular, they leave the lengths of vectors invariant. The Lie theory approach is much more easily generalisable to higher dimensions than the elementary trigonometric approach using the familiar rotation matrices in two and three dimensions. Instead of obtaining the familiar trigonometric rotation matrices by analysing the trigonometric effects of rotations, we will see below that they arise in Lie theory from the exponential map linking the Lie algebra to the rotation group , in a kind of matrix analogue of Euler’s formula .
Begin by considering rotations in -dimensional Euclidean space as being implemented by multiplying vectors by a rotation matrix which is a continuous function of some parameter vector such that . In Lie theory we regard these rotations as being infinitesimally small, in the sense that they move us away from the identity by an infinitesimally small amount. If is the column vector of coordinate differentials, then the rotation embodied in is implemented as
Since we require lengths to remain unchanged after rotation, we have
which implies
In other words, the matrix must be orthogonal. Furthermore, since the determinant of a product is the product of the determinants, and the determinant of a transpose is the same as the original determinant, we can write
Therefore we must have
But we can exclude the case because the set of orthogonal matrices with negative determinants produces reflections. For example, the orthogonal matrix
has determinant and results in a reflection in the -axis when applied to a vector. Here we are only interested in rotations, which we can now define as having orthogonal transformation matrices such that . Matrices which have unit determinant are called special, so focusing purely on rotations means that we are dealing exclusively with the set of special orthogonal matrices of dimension , denoted by .
It is straightforward to verify that constitutes a group with the operation of matrix multiplication. It is closed, has an identity element , each element has an inverse (since the determinant is nonzero), and matrix multiplication is associative. Note that this means a rotation matrix times a rotation matrix must give another rotation matrix, so this is another property that the rotation matrices need to satisfy.
The fact that is also a differentiable manifold, and therefore a Lie group, follows in a technical way (which I will not delve into here) from the fact that is a closed subgroup of the set of all invertible real matrices, usually denoted by , and this itself is a manifold of dimension . The latter fact is demonstrated easily by noting that for , the determinant function is continuous, and is the inverse image under this function of the open set . Thus, is itself an open subset in the -dimensional linear space of all the real matrices, and thus a manifold of dimension . The matrix Lie group is a manifold of dimension , not . One way to appreciate this is to observe that the condition for every means that you only need to specify off-diagonal elements to specify each . In other words, there are elements in each but the condition means that there are equations linking them, so the number of `free’ elements in each is only . We will see shortly that is also the dimension of , which must be the case given that is to be the tangent space of the manifold (the dimension of a manifold is the dimension of its tangent space).
If we now Taylor-expand to first order about we get
where is an infinitesimal matrix of order and we will (for now) ignore terms like which are of second and higher order in . Now substituting into we get
Thus, the matrix must be antisymmetric. In fact, will be a linear combination of some elementary antisymmetric basis matrices which play a crucial role in the theory, so we will explore this further. Since a sum of antisymmetric matrices is antisymmetric, and a scalar multiple of an antisymmetric matrix is antisymmetric, the set of all antisymmetric matrices is a vector space. This vector space has a basis provided by some elementary antisymmetric matrices containing only two non-zero elements each, the two non-zero elements in each matrix appearing in corresponding positions either side of the main diagonal and having opposite signs (this is what makes the matrices antisymmetric). Since there are distinct pairs of possible off-diagonal positions for these two non-zero elements, the basis has dimension and, as will be seen shortly, this vector space in fact turns out to be the Lie algebra . The basis matrices will be written as where and identify the pair of corresponding off-diagonal positions in which the two non-zero elements will appear. We will let run through the numbers in order, and with each pair and fixed, the element in the -th row and -th column of each matrix is then given by the formula
To clarify this, we will consider the antisymmetric basis matrices for , and . In the case we have so there is a single antisymmetric matrix. Setting , , we get and so the antisymmetric matrix is
In the case we have antisymmetric basis matrices corresponding to the three possible pairs of off-diagonal positions for the two non-zero elements in each matrix. Following the same approach as in the previous case, these can be written as
Finally, in the case we have antisymmetric basis matrices corresponding to the six possible pairs of off-diagonal positions for the two non-zero elements in each matrix. These can be written as
So in the case of a general infinitesimal rotation in -dimensional space of the form , the antisymmetric matrix will be a linear combination of the antisymmetric basis matrices of the form
But note that using the standard matrix exponential series we have
This suggests
and in fact this relationship between rotations and the exponentials of antisymmetric matrices turns out to be exact, not just an approximation. To see this, observe that and commute since . This means that
(note that in matrix exponentiation only if and commute – see below). Since the diagonal elements of an antisymmetric matrix are always zero, we also have
Thus, is both special and orthogonal, so it must be an element of . Conversely, suppose . Then we must have
so is antisymmetric.
So we have a tight link between and via matrix exponentiation. We can do a couple of things with this. First, for any real parameter and antisymmetric basis matrix , we have and this defines a parametric path through which passes through its identity element at . Differentiating with respect to and evaluating the derivative at we find that
which indicates that the antisymmetric basis matrices are tangent vectors of the manifold at the identity, and that the set of antisymmetric basis matrices form the tangent space of . Another thing we can do with the matrix exponential map is quickly recover the elementary rotation matrix in the case . Noting that and separating the exponential series into even and odd terms in the usual way we find that
where the single real number here is the angle of rotation. This is the matrix analogue of Euler’s formula that was mentioned earlier.
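Here is a small numerical confirmation of this matrix analogue of Euler's formula: exponentiating the single antisymmetric basis matrix for n = 2 (with an assumed sign convention for that basis matrix) recovers the familiar rotation matrix.

```python
import numpy as np
from scipy.linalg import expm

# The single antisymmetric basis matrix for n = 2 (assumed sign convention).
L = np.array([[0.0, -1.0],
              [1.0,  0.0]])

theta = 0.7                        # an arbitrary rotation angle
R = expm(theta * L)                # exponential map from so(2) into SO(2)

expected = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
print(np.allclose(R, expected))                    # the familiar rotation matrix
print(np.isclose(np.linalg.det(R), 1.0))           # special ...
print(np.allclose(R.T @ R, np.eye(2)))             # ... and orthogonal
```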
To further elucidate how the antisymmetric basis matrices form a Lie algebra which is closely tied to the matrix Lie group , we will show that the commutation relation between them is closed (i.e., that the commutator of two antisymmetric basis matrices is itself antisymmetric), and that these commutators play a crucial role in ensuring the closure of the group (i.e., in ensuring that a rotation multiplied by a rotation produces another rotation). First, suppose that and are two distinct antisymmetric matrices. Then since the transpose of a product is the product of the transposes in reverse order we can write
This shows that the commutator of two antisymmetric matrices is itself antisymmetric, so the commutator can be written as a linear combination of the antisymmetric basis matrices . Furthermore, since we can write and , we have
so every commutator between antisymmetric matrices can be written in terms of the commutators of the antisymmetric basis matrices. Next, suppose we exponentiate the antisymmetric matrices and to obtain the rotations and . Since is closed, it must be the case that
where is another rotation and therefore is an antisymmetric matrix. To see the role of the commutator between antisymmetric matrices in ensuring this, we will expand both sides. For the left-hand side we get
For the right-hand side we get
Equating the two expansions we get
where the remaining terms on the right-hand side are of second and higher order in , and . A result known as the Baker-Campbell-Hausdorff formula shows that the remaining terms on the right-hand side of are in fact all nested commutators of and . The series for with a few additional terms expressed in this way is
This shows that unless and commute, since only in this case do all the commutator terms in the series for vanish. Since the commutator of two antisymmetric matrices is itself antisymmetric, this result also shows that is an antisymmetric matrix, and therefore must be a rotation.
Since every commutator between antisymmetric matrices can be written in terms of the commutators of the antisymmetric basis matrices, a general formula for the latter would seem to be useful. In fact, the formula given earlier, namely
completely characterises the Lie algebra . To conclude this note we will therefore derive this formula ab initio, starting from the formula
for the -th element of each matrix . We have
Focus on first. Using the Einstein summation convention, the product of the -th row of with the -th column of is
Now focus on . The product of the -th row of with the -th column of is
So the element in the -th row and -th column of is
But notice that
and similarly for the other Einstein summation terms. Thus, the above sum reduces to
But
Thus the element in the -th row and -th column of is
Extending this to the matrix as a whole gives the required formula:
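The commutation formula is also easy to verify by brute force. The sketch below builds the basis matrices from the convention in which the (a, b) entry is +1 and the (b, a) entry is -1 (an assumed sign convention), and checks the standard so(n) relation [L_ab, L_cd] = delta(b,c) L_ad - delta(a,c) L_bd - delta(b,d) L_ac + delta(a,d) L_bc over all index combinations for n = 4; if the note's conventions differ, the right-hand side changes only by signs.

```python
import itertools
import numpy as np

def L(a, b, n):
    """Antisymmetric basis matrix with +1 in position (a, b) and -1 in (b, a)
    (assumed sign convention); it vanishes when a == b."""
    M = np.zeros((n, n))
    M[a, b] += 1.0
    M[b, a] -= 1.0
    return M

def delta(i, j):
    return 1.0 if i == j else 0.0

n = 4
ok = True
for a, b, c, d in itertools.product(range(n), repeat=4):
    lhs = L(a, b, n) @ L(c, d, n) - L(c, d, n) @ L(a, b, n)
    rhs = (delta(b, c) * L(a, d, n) - delta(a, c) * L(b, d, n)
           - delta(b, d) * L(a, c, n) + delta(a, d) * L(b, c, n))
    ok = ok and np.allclose(lhs, rhs)
print(ok)   # True: the commutators close on the antisymmetric basis matrices
```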
Certain arithmetical functions, known as Dirichlet characters mod , are used extensively in analytic number theory. Given an arbitrary group , a character of is generally a complex-valued function with domain such that has the multiplicative property for all , and such that for some . Dirichlet characters mod are certain characters defined for a particular type of group , namely the group of reduced residue classes modulo a fixed positive integer . A reduced residue system modulo is a set of integers
which are incongruent modulo , and each of which is relatively prime to (the function is Euler’s totient function, which counts the number of positive integers not exceeding which are coprime with ). For each integer in this set, we define a residue class as the set of all integers which are congruent to modulo . For example, for , we have and one reduced residue system mod is . The reduced residue classes mod are then
What we are saying is that this set of reduced residue classes mod form a group, and the Dirichlet characters mod are certain characters defined for this group. In general, if we define multiplication of residue classes by
(i.e., the product of the residue classes of and is the residue class of the product ), then the set of reduced residue classes modulo forms a finite abelian group of order with this operation. The identity is the residue class . The inverse of in the group is the residue class such that mod . If we let be the group of reduced residue classes mod , with characters , then we define the Dirichlet characters mod as arithmetical functions of the form
There are distinct Dirichlet characters modulo , each of which is completely multiplicative and periodic with period . Each character value is a (complex) root of unity if whereas whenever . We also have for all Dirichlet characters. For each , there is one character, called the principal character, which is such that
These facts uniquely determine the Dirichlet character table for each . For reference purposes, I will set out the first ten Dirichlet character tables in the present note and demonstrate their calculation in detail.
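As a computational companion to the tables below, here is a short Python sketch that generates the full character table for any modulus k whose group of reduced residues is cyclic (i.e., has a primitive root), which covers every case below except k = 8. It follows the same idea as the hand calculations, assigning the phi(k)-th roots of unity to the powers of a generator; the particular generator the code finds may differ from the one used in the hand calculations, so individual characters may appear in a different order.

```python
import cmath
from math import gcd

def reduced_residues(k):
    """Least positive residues mod k that are coprime with k."""
    return [a for a in range(1, k + 1) if gcd(a, k) == 1]

def primitive_root(k):
    """A generator of the group of reduced residues mod k, or None if the
    group is not cyclic (as happens for k = 8)."""
    residues = reduced_residues(k)
    phi = len(residues)
    for g in residues:
        if len({pow(g, e, k) for e in range(1, phi + 1)}) == phi:
            return g
    return None

def character_table(k):
    """The phi(k) Dirichlet characters mod k, each as a dict {residue: value},
    built by sending the generator to successive phi(k)-th roots of unity."""
    residues = reduced_residues(k)
    phi = len(residues)
    g = primitive_root(k)
    if g is None:
        raise ValueError(f"(Z/{k}Z)* is not cyclic; treat it separately as in the note")
    dlog = {pow(g, a, k): a for a in range(phi)}    # discrete logs base g
    table = []
    for m in range(phi):                            # m = 0 gives the principal character
        chi = {r: cmath.exp(2j * cmath.pi * m * dlog[r] / phi) for r in residues}
        table.append(chi)
    return table

if __name__ == "__main__":
    for chi in character_table(5):
        print({r: complex(round(v.real, 3), round(v.imag, 3)) for r, v in chi.items()})
```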
k = 2
We have so there is only one Dirichlet character in this case (the principal one), with values and .
k = 3
We have so there are two Dirichlet characters in this case. One of them will be the principal character which takes the values , and . To work out the second Dirichlet character we consider the two roots of unity
and
Note that the set of least positive residues mod 3 is generated by 2:
2^1 = 2 mod(3)
2^2 = 1 mod(3)
Therefore the non-principal Dirichlet character will be completely determined by the values of . If we set
then
(though this calculation is superfluous here since anyway. This is a fundamental property of Dirichlet characters arising from the fact that they are completely multiplicative). We also have . This completes the second character. (From now on we will omit the statements of the zero values of the Dirichlet characters, which as stated earlier arise whenever ).
k = 4
We have so there are two Dirichlet characters in this case. One of them will be the principal character. (From now on we will always denote the principal character by ). To work out the second Dirichlet character we again consider the two roots of unity
and
Note that the set of least positive residues mod 4 is generated by 3:
3^1 = 3 mod(4)
3^2 = 1 mod(4)
Therefore the non-principal Dirichlet character will be completely determined by the values of . If we set
then
(though again this second calculation is superfluous since anyway). This completes the second character.
k = 5
We have so there are four Dirichlet characters in this case. We consider the four roots of unity
Note that the set of least positive residues mod is generated by :
mod()
mod()
mod()
mod()
Therefore the non-principal Dirichlet characters will be completely determined by the values of . If we set
then
(and we have ). This completes the second character.
To compute the third character we can set
then
(and we have ). This completes the third character.
To compute the fourth character we set
then
(and we have ). This completes the fourth character.
k = 6
We have so there are two Dirichlet characters in this case. We consider the two roots of unity
and
Note that the set of least positive residues mod 6 is generated by 5:
5^1 = 5 mod(6)
5^2 = 1 mod(6)
Therefore the non-principal Dirichlet character will be completely determined by the values of . If we set
then
(though again this second calculation is superfluous since anyway). This completes the second character.
k = 7
We have so there are six Dirichlet characters in this case. We consider the six roots of unity
Note that the set of least positive residues mod is generated by :
mod()
mod()
mod()
mod()
mod()
mod()
Therefore the non-principal Dirichlet characters will be completely determined by the values of . If we set
then
(and we have ). This completes the second character.
To compute the third character we can set
then
(and we have ). This completes the third character.
To compute the fourth character we can set
then
(and we have ). This completes the fourth character.
To compute the fifth character we can set
then
(and we have ). This completes the fifth character.
Finally, to compute the sixth character we set
then
(and we have ). This completes the sixth character.
k = 8
We have so there are four Dirichlet characters in this case. We consider the four roots of unity
In this case, none of the four elements of the set of least positive residues mod generates the entire set. However, the characters must satisfy the following relations, which restrict the choices:
Each character’s values must be chosen in such a way that these three relations hold.
To compute the second character, suppose we begin by trying to set
and
Then we must have
but then
so this does not work. If instead we try to set
then we must have
but then
so this does not work either. Computations like these show that cannot appear in any of the characters mod . All the characters must be formed from . (Fundamentally, this is due to the fact that the group of least positive residues mod can be subdivided into four cyclic subgroups of order 2, each of which has characters whose values are the two roots of unity, and ).
To compute the second character we can set
and
then we must have
and this works.
To compute the third character we can set
and
then we must have
and this works too.
Finally, to compute the fourth character we can set
and
then we must have
and this works too.
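As a small computational check of the k = 8 discussion, the sketch below (with function names of my own choosing) builds the four candidate characters from independent choices of plus or minus 1 at the residues 3 and 5, deduces the value at 7 from multiplicativity, and confirms that each of the four is completely multiplicative mod 8, consistent with the observation above that the value i cannot appear.

```python
from itertools import product

def characters_mod8():
    """The reduced residues mod 8 form a group isomorphic to Z2 x Z2, so every
    character value is +1 or -1.  Assign independent signs to 3 and 5 and
    deduce chi(7) from multiplicativity, since 7 = 3*5 mod 8."""
    chars = []
    for s3, s5 in product([1, -1], repeat=2):
        chars.append({1: 1, 3: s3, 5: s5, 7: s3 * s5})
    return chars

def is_multiplicative(chi, k=8):
    residues = list(chi)
    return all(chi[(a * b) % k] == chi[a] * chi[b] for a in residues for b in residues)

for chi in characters_mod8():
    print(chi, is_multiplicative(chi))   # all four are completely multiplicative
```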
k = 9
We have so there are six Dirichlet characters in this case. We consider the six roots of unity
Note that the set of least positive residues mod is generated by :
mod()
mod()
mod()
mod()
mod()
mod()
Therefore the non-principal Dirichlet characters will be completely determined by the values of . If we set
then
(and we have ). This completes the second character.
To compute the third character we can set
then
(and we have ). This completes the third character.
To compute the fourth character we can set
then
(and we have ). This completes the fourth character.
To compute the fifth character we can set
then
(and we have ). This completes the fifth character.
Finally, to compute the sixth character we can set
then
(and we have ). This completes the sixth character.
k = 10
We have so there are four Dirichlet characters in this case. We consider the four roots of unity
Note that the set of least positive residues mod is generated by :
mod()
mod()
mod()
mod()
Therefore the non-principal Dirichlet characters will be completely determined by the values of . If we set
then
(and we have ). This completes the second character.
To compute the third character we can set
then
(and we have ). This completes the third character.
Finally, to compute the fourth character we set
then
(and we have ). This completes the fourth character.
k = 11
We have so there are ten Dirichlet characters in this case. We consider the ten roots of unity
Note that the set of least positive residues mod is generated by :
mod()
mod()
mod()
mod()
mod()
mod()
mod()
mod()
mod()
mod()
Therefore the non-principal Dirichlet characters will be completely determined by the values of . If we set
then
(and we have ). This completes the second character.
To compute the third character we can set
then
(and we have ). This completes the third character.
To compute the fourth character we can set
then
(and we have ). This completes the fourth character.
To compute the fifth character we can set
then
(and we have ). This completes the fifth character.
To compute the sixth character we can set
then
(and we have ). This completes the sixth character.
To compute the seventh character we can set
then
(and we have ). This completes the seventh character.
To compute the eighth character we can set
then
(and we have ). This completes the eighth character.
To compute the ninth character we can set
then
(and we have ). This completes the ninth character.
Finally, to compute the tenth character we set
then
(and we have ). This completes the tenth character.
In a previous note I studied the mathematical setup of Noether’s Theorem and its proof. I briefly illustrated the mathematical machinery by considering invariance under translations in time, giving the law of conservation of energy, and invariance under translations in space, giving the law of conservation of linear momentum. I briefly mentioned that invariance under rotations in space would also yield the law of conservation of angular momentum but I did not work this out explicitly. I want to quickly do this in the present note.
We imagine a particle of unit mass moving freely in the absence of any potential field, and tracing out a path in the -plane of a three-dimensional Euclidean coordinate system between times and , with the -coordinate everywhere zero along this path. The angular momentum of the particle at time with respect to the origin of the coordinate system is given by
where is the vector product operation. Alternatively, we could have obtained this as
In terms of Lagrangian mechanics, the path followed by the particle will be a stationary path of the action functional
(in the absence of a potential field the total energy consists only of kinetic energy).
Now imagine that the entire path is rotated bodily anticlockwise in the -plane through an angle . This corresponds to a one-parameter transformation
which reduces to the identity when . We have
and therefore
so the action functional is invariant under this rotation since
Therefore Noether’s theorem applies. Let
Then Noether’s theorem in this case says
where
We have
Therefore Noether’s theorem gives us (remembering )
The expression on the left-hand side of this equation is the angular momentum of the particle (cf. the brief discussion of angular momentum at the start of this note), so this result is precisely the statement that the angular momentum is conserved. Noether’s theorem shows us that this is a direct consequence of the invariance of the action functional of the particle under rotations in space.
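Both steps of this argument, the rotational invariance of the kinetic-energy Lagrangian and the conservation of the angular momentum combination of coordinates and velocities along free-particle paths, can be checked symbolically with sympy:

```python
import sympy as sp

t, theta = sp.symbols('t theta')
x = sp.Function('x')(t)
y = sp.Function('y')(t)

# Kinetic-energy Lagrangian for a unit-mass free particle in the plane.
Lag = sp.Rational(1, 2) * (x.diff(t)**2 + y.diff(t)**2)

# Rotate the path bodily through angle theta in the plane.
xr = sp.cos(theta) * x - sp.sin(theta) * y
yr = sp.sin(theta) * x + sp.cos(theta) * y
Lag_rot = sp.Rational(1, 2) * (xr.diff(t)**2 + yr.diff(t)**2)
print(sp.simplify(Lag_rot - Lag))          # 0: the Lagrangian (and action) is invariant

# Along a free-particle solution (zero acceleration) the angular momentum
# x*ydot - y*xdot has zero time derivative.
J = x * y.diff(t) - y * x.diff(t)
dJdt = J.diff(t).subs({x.diff(t, 2): 0, y.diff(t, 2): 0})
print(sp.simplify(dJdt))                   # 0: angular momentum is conserved
```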