Decomposition of Lorentz transformations using orthogonal matrices

In the present note I want to explore the decomposition of an arbitrary Lorentz transformation L in the form L = R_1 \overline{L} R_2, where R_1 and R_2 are orthogonal Lorentz matrices and \overline{L} is a simple Lorentz matrix (to be defined below). Throughout we use a metric tensor of the form G = \text{diag}(1, -1, -1, -1).

Any 4 \times 4 matrix that preserves the quadratic form x^T G x is called a Lorentz matrix. Here

x = \begin{bmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{bmatrix}

where x^0 \equiv ct is the temporal coordinate and x^1, x^2, x^3 are the spatial coordinates. Thus a 4 \times 4 matrix A is Lorentz if y = Ax satisfies

y^T G y = x^T A^T G A x = x^T G x   

for all x. Since both A^T G A and G are symmetric, this holds for all x if and only if the matrices are equal. Therefore A is Lorentz if and only if A^T G A = G.

The set of all Lorentz matrices thus defined forms a group, the Lorentz group, under matrix multiplication. The identity element of the group is the 4 \times 4 identity matrix I_4. To prove closure under matrix multiplication, suppose A and B are two Lorentz matrices. Then

(AB)^T G (AB) = B^T (A^T G A) B = B^T G B = G

so the product is also Lorentz. To prove that the inverse of a Lorentz matrix is also Lorentz, suppose A is Lorentz so that A^T G A = G. Then since G^2 = I_4, left-multiplying both sides by G gives

(G A^T G) A = I_4

so the inverse of A is G A^T G (and the inverse of A^T is G A G, by right-multiplying). But then we have 

(G A^T G)^T G (G A^T G) = G A G G G A^T G = G A G A^T G = I_4 G = G

so the inverse of A is also Lorentz (the final step uses G A G A^T = I_4, which follows by transposing A A^{-1} = A G A^T G = I_4). Therefore the Lorentz matrices form a group as claimed.
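The group properties just proved can be checked numerically. The following is a minimal sketch (all helper names here are mine, not part of the note): it verifies that A^T G A = G holds for transformations between frames in standard configuration, that the condition survives matrix products, and that the inverse is G A^T G.

```python
import math

# Metric tensor G = diag(1, -1, -1, -1)
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(A, B):
    # 4x4 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def is_lorentz(A, tol=1e-12):
    # A is Lorentz iff A^T G A = G
    M = matmul(matmul(transpose(A), G), A)
    return all(abs(M[i][j] - G[i][j]) < tol for i in range(4) for j in range(4))

def boost(beta):
    # standard-configuration transformation with a = gamma, b = -gamma*beta,
    # so that a^2 - b^2 = 1
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0],
            [0, 0, 1, 0], [0, 0, 0, 1]]

A, B = boost(0.6), boost(-0.3)
assert is_lorentz(A) and is_lorentz(B)
assert is_lorentz(matmul(A, B))                 # closure under products
A_inv = matmul(matmul(G, transpose(A)), G)      # inverse formula G A^T G
assert is_lorentz(A_inv)
AAinv = matmul(A, A_inv)
assert all(abs(AAinv[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(4) for j in range(4))
```

The `is_lorentz` check is just the defining condition A^T G A = G applied entry by entry, with a small tolerance for floating-point roundoff.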

For two inertial reference frames in standard configuration, the Lorentz transformation will be a 4 \times 4 matrix of the form

\begin{bmatrix} a & b & 0 & 0 \\ b & a & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

such that a^2 - b^2 = 1. Any 4 \times 4 matrix of this form is said to be simple Lorentz. The relative velocity in the physical situation modelled by such a matrix A is recovered as

\beta = \frac{v}{c} = -\frac{b}{a}

Notice that since A^{-1} = G A^T G, we have 

A^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} a & b & 0 & 0 \\ b & a & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}

= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} a & -b & 0 & 0 \\ b & -a & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}

= \begin{bmatrix} a & -b & 0 & 0 \\ -b & a & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

which is itself a simple Lorentz matrix corresponding to a reversal in the sign of \beta.
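This sign reversal is easy to confirm numerically. The sketch below (helper names are mine) applies the formula G A^T G to a simple Lorentz matrix and checks that the result is the simple Lorentz matrix with b replaced by -b, i.e. with the opposite relative velocity.

```python
import math

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def simple_lorentz(a, b):
    # requires a^2 - b^2 = 1
    return [[a, b, 0, 0], [b, a, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

a = 1.25
b = -math.sqrt(a * a - 1.0)            # a^2 - b^2 = 1 by construction
A = simple_lorentz(a, b)
At = [list(row) for row in zip(*A)]
A_inv = matmul(matmul(G, At), G)       # inverse via G A^T G
expected = simple_lorentz(a, -b)       # same matrix with b -> -b
assert all(abs(A_inv[i][j] - expected[i][j]) < 1e-12
           for i in range(4) for j in range(4))

# the relative velocity beta = -b/a flips sign under inversion
beta, beta_inv = -b / a, b / a
assert abs(beta_inv + beta) < 1e-15
```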

Notice also that the transpose of a Lorentz matrix is Lorentz. To see this, if A is Lorentz then A^T G A = G. Pre-multiplying this by AG and post-multiplying by A^{-1} G we get

(AG)(A^T G A)(A^{-1} G) = A G G A^{-1} G

\iff (A G A^T)(G A A^{-1} G) = G

\iff A G A^T = G

as required. 
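To see that A G A^T = G is genuinely a different condition from A^T G A = G, it is worth testing it on a Lorentz matrix that is not symmetric. The sketch below (helper names assumed) builds such a matrix as a boost composed with a spatial rotation and checks the transpose condition.

```python
import math

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

g = 1.0 / math.sqrt(1.0 - 0.6 ** 2)     # gamma for beta = 0.6
boost = [[g, -0.6 * g, 0, 0], [-0.6 * g, g, 0, 0],
         [0, 0, 1, 0], [0, 0, 0, 1]]
c, s = math.cos(0.8), math.sin(0.8)
rot = [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

A = matmul(boost, rot)                   # Lorentz, but not symmetric
At = [list(row) for row in zip(*A)]
M = matmul(matmul(A, G), At)             # A G A^T
assert all(abs(M[i][j] - G[i][j]) < 1e-12
           for i in range(4) for j in range(4))
```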

We now look in detail at the decomposition result mentioned at the start of this note. This expresses in mathematical terms the possibility of simplifying an arbitrary Lorentz matrix by a suitable rotation of axes. The result says that an arbitrary Lorentz matrix L = (a^i_j) has the representation

L = R_1 \overline{L} R_2

where \overline{L} is a simple Lorentz matrix with parameters a = |a^0_0| = \epsilon a^0_0 (where \epsilon = \pm 1 is the sign of a^0_0) and b = - \sqrt{(a^0_0)^2 - 1}, and R_1 and R_2 are orthogonal Lorentz matrices defined by

R_1 = L R_2^T \overline{L}^{-1}

and 

R_2 = [e_1 \ r^{\prime} \ s^{\prime}  \ t^{\prime}]^T

where e_1 = (1, 0, 0, 0), r^{\prime} = (\epsilon/b)(0, a^0_1, a^0_2, a^0_3) \equiv (0, r), s^{\prime} = (0, s), t^{\prime} = (0, t), with s and t chosen so that [r \  s  \ t] is 3 \times 3 orthogonal. (We assume (a^0_0)^2 > 1 so that b \neq 0; if (a^0_0)^2 = 1 the first row and column of L have zero spatial part, so L is already of the form R_1 \overline{L} R_2 with \overline{L} = I_4.)

A corollary of this result is that if L = (a^i_j) connects two inertial frames, then the relative velocity between the frames is 

v = c \sqrt{1 - (a^0_0)^{-2}}

To prove this decomposition result and its corollary, we begin by observing that 

||r||^2 = \big(\frac{\epsilon}{b}\big)^2 ((a^0_1)^2 + (a^0_2)^2 + (a^0_3)^2)

But from the 00-element of the product L G L^T = G, where the product written out is

L G L^T = \begin{bmatrix} a^0_0 & a^0_1 & a^0_2 & a^0_3 \\ a^1_0 & a^1_1 & a^1_2 & a^1_3 \\ a^2_0 & a^2_1 & a^2_2 & a^2_3 \\ a^3_0 & a^3_1 & a^3_2 & a^3_3 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} a^0_0 & a^1_0 & a^2_0 & a^3_0 \\ a^0_1 & a^1_1 & a^2_1 & a^3_1 \\ a^0_2 & a^1_2 & a^2_2 & a^3_2 \\ a^0_3 & a^1_3 & a^2_3 & a^3_3 \end{bmatrix}

= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}

we get 

[a^0_0 \ -a^0_1 \ -a^0_2  \ -a^0_3] \begin{bmatrix} a^0_0 \\ a^0_1 \\ a^0_2 \\ a^0_3 \end{bmatrix} = 1

\iff (a^0_0)^2 - (a^0_1)^2 - (a^0_2)^2 - (a^0_3)^2 = 1

Using this result in the expression for ||r||^2 above we therefore have

||r||^2 = \big(\frac{1}{b}\big)^2 ((a^0_0)^2 - 1) = \frac{b^2}{b^2} = 1

using \epsilon^2 = 1 and b^2 = (a^0_0)^2 - 1.

Therefore the matrix 

[r \ s \ t] = \begin{bmatrix} r_1 & s_1 & t_1 \\ r_2 & s_2 & t_2 \\ r_3 & s_3 & t_3 \end{bmatrix}

is orthogonal (i.e., its columns and rows are orthogonal unit vectors – recall that s and t are chosen so that this is true). Therefore the matrix R_2 has the form

R_2 = [e_1 \ r^{\prime} \ s^{\prime} \ t^{\prime}]^T = \begin{bmatrix} e_1 \\ r^{\prime} \\ s^{\prime} \\ t^{\prime} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & r_1 & r_2 & r_3 \\  0 & s_1 & s_2 & s_3 \\ 0 & t_1 & t_2 & t_3 \end{bmatrix}

Therefore

R_2^T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & r_1 & s_1 & t_1 \\  0 & r_2 & s_2 & t_2 \\ 0 & r_3 & s_3 & t_3 \end{bmatrix}

This is a 4 \times 4 orthogonal matrix which is also Lorentz. It is orthogonal because its rows (and hence its columns) are orthonormal, so R_2^{-1} = R_2^T. To confirm that R_2^T is Lorentz we compute

R_2 G R_2^T = \begin{bmatrix} 1 & 0 & 0 & 0 \\0 & r_1 & r_2 & r_3 \\0 & s_1 & s_2 & s_3 \\0 & t_1 & t_2 & t_3 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\0 & -1 & 0 & 0 \\0 & 0 & -1 & 0 \\0 & 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0\\0 & r_1 & s_1 & t_1 \\0 & r_2 & s_2 & t_2 \\0 & r_3 & s_3 & t_3 \end{bmatrix}

= \begin{bmatrix} 1 & 0 & 0 & 0 \\0 & -r_1 & -r_2 & -r_3 \\0 & -s_1 & -s_2 & -s_3 \\0 & -t_1 & -t_2 & -t_3 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0\\0 & r_1 & s_1 & t_1 \\0 & r_2 & s_2 & t_2 \\0 & r_3 & s_3 & t_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\0 & -1 & 0 & 0 \\0 & 0 & -1 & 0 \\0 & 0 & 0 & -1 \end{bmatrix}

using the fact that r, s and t are unit vectors. Therefore R_2 G R_2^T = G, so R_2^T is both orthogonal and Lorentz as claimed. Therefore we have
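The construction of R_2 can be carried out numerically for a concrete non-simple L. In the sketch below (helper names are mine), s and t are produced by one Gram-Schmidt step and a cross product, which is one concrete way of making the choice "s and t such that [r s t] is orthogonal" from the statement of the result.

```python
import math

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# a general Lorentz L: boost composed with a spatial rotation
beta, th = 0.6, 0.7
g = 1.0 / math.sqrt(1.0 - beta * beta)
boost = [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0],
         [0, 0, 1, 0], [0, 0, 0, 1]]
c, s = math.cos(th), math.sin(th)
rot = [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]
L = matmul(boost, rot)

# r = (eps/b)(a^0_1, a^0_2, a^0_3), with b = -sqrt((a^0_0)^2 - 1)
a0 = L[0][0]
eps = 1.0 if a0 >= 0 else -1.0
b = -math.sqrt(a0 * a0 - 1.0)
r = [eps / b * L[0][k] for k in (1, 2, 3)]
assert abs(sum(x * x for x in r) - 1.0) < 1e-12   # r is a unit vector

# complete r to an orthonormal triad {r, s, t}
h = [1.0, 0.0, 0.0] if abs(r[0]) < 0.9 else [0.0, 1.0, 0.0]
proj = sum(h[j] * r[j] for j in range(3))
sv = [h[i] - proj * r[i] for i in range(3)]       # Gram-Schmidt step
ns = math.sqrt(sum(x * x for x in sv))
sv = [x / ns for x in sv]
tv = [r[1] * sv[2] - r[2] * sv[1],                # t = r x s
      r[2] * sv[0] - r[0] * sv[2],
      r[0] * sv[1] - r[1] * sv[0]]

R2 = [[1.0, 0.0, 0.0, 0.0], [0.0] + r, [0.0] + sv, [0.0] + tv]
R2t = [list(row) for row in zip(*R2)]
M = matmul(matmul(R2, G), R2t)                    # R_2 G R_2^T
assert all(abs(M[i][j] - G[i][j]) < 1e-12
           for i in range(4) for j in range(4))
```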

R_1 \overline{L} R_2 = (L R_2^T \overline{L}^{-1})(\overline{L})(R_2) = L R_2^T R_2 = L

as claimed in the decomposition result above.

Now, since R_1 = L R_2^T \overline{L}^{-1} is a product of Lorentz matrices, R_1 itself must be a Lorentz matrix. To show that it is also orthogonal, write the rows of L as (a_i, b_i, c_i, d_i) for i = 0, 1, 2, 3 (so that a_i = a^i_0, b_i = a^i_1, and so on), and write out L R_2^T \overline{L}^{-1} explicitly as

\begin{bmatrix} a_0 & b_0 & c_0 & d_0 \\a_1 & b_1 & c_1 & d_1 \\a_2 & b_2 & c_2 & d_2 \\a_3 & b_3 & c_3 & d_3 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0\\0 & r_1 & s_1 & t_1 \\0 & r_2 & s_2 & t_2 \\0 & r_3 & s_3 & t_3 \end{bmatrix} \begin{bmatrix} \epsilon a_0 & \sqrt{a_0^2 - 1} & 0 & 0 \\\sqrt{a_0^2 - 1} & \epsilon a_0 & 0 & 0 \\0 & 0 & 1 & 0 \\0 & 0 & 0 & 1 \end{bmatrix}

= \begin{bmatrix} (a_0) & (b_0 r_1 + c_0 r_2 + d_0 r_3) & (b_0 s_1 + c_0 s_2 + d_0 s_3) & (b_0 t_1 + c_0 t_2 + d_0 t_3) \\ \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot \end{bmatrix} \begin{bmatrix} \epsilon a_0 & \sqrt{a_0^2 - 1} & 0 & 0 \\\sqrt{a_0^2 - 1} & \epsilon a_0 & 0 & 0 \\0 & 0 & 1 & 0 \\0 & 0 & 0 & 1 \end{bmatrix}

where the omitted rows are of the same form as the first row, but with a_i, b_i, c_i, d_i instead of a_0, b_0, c_0, d_0, for i = 1, 2, 3.

We now focus on proving that the top row and first column of this product are (\pm 1, 0, 0, 0). If this is the case, then the remaining submatrix must be orthogonal since R_1 is Lorentz (the equation R_1^T G R_1 = G can only be satisfied if the submatrix is orthogonal) and this will imply that the entire R_1 matrix is orthogonal.

The 00-element of the product is

\epsilon a_0^2 + (\sqrt{a_0^2 - 1})(b_0 r_1 + c_0 r_2 + d_0 r_3)

= \epsilon a_0^2 + (\sqrt{a_0^2 - 1})(b_0^2 + c_0^2 + d_0^2) \cdot \frac{\epsilon}{-\sqrt{a_0^2 - 1}}

(since r_1, r_2, r_3 are the last three elements in the first row of L multiplied by \frac{\epsilon}{-\sqrt{a_0^2 - 1}})

= \epsilon (a_0^2 - b_0^2 - c_0^2 - d_0^2) = \epsilon

(since this is the same quadratic form as in the first element of L G L^T = G derived above, which equals 1).

The next element in the top row is

a_0 \sqrt{a_0^2 - 1} + \epsilon a_0 (b_0 r_1 + c_0 r_2 + d_0 r_3)

= a_0 \sqrt{a_0^2 - 1} + \epsilon a_0(b_0^2 + c_0^2 + d_0^2) \cdot \frac{\epsilon}{-\sqrt{a_0^2 - 1}}

= a_0 \sqrt{a_0^2 - 1} - a_0 \sqrt{a_0^2 - 1} = 0

(since b_0^2 + c_0^2 + d_0^2 = a_0^2 - 1).

For the third element in the top row we have

b_0 s_1 + c_0 s_2 + d_0 s_3 = -\frac{\sqrt{a_0^2 - 1}}{\epsilon}(r_1 s_1 + r_2 s_2 + r_3 s_3) = 0

since r and s are orthogonal.

Finally, for the fourth element in the top row we have

b_0 t_1 + c_0 t_2 + d_0 t_3 = -\frac{\sqrt{a_0^2 - 1}}{\epsilon}(r_1 t_1 + r_2 t_2 + r_3 t_3) = 0

since r and t are orthogonal.

Next, we consider the first column of the product. For each i = 1, 2, 3, we have

a_i \epsilon a_0 + \sqrt{a_0^2 - 1} (b_i r_1 + c_i r_2 + d_i r_3)

= a_i \epsilon a_0 + \sqrt{a_0^2 - 1} (b_i b_0 + c_i c_0 + d_i d_0)\cdot \frac{\epsilon}{-\sqrt{a_0^2 - 1}}

= a_i \epsilon a_0 - \epsilon (b_i b_0 + c_i c_0 + d_i d_0)

= \epsilon(a_i a_0 - b_i b_0 - c_i c_0 - d_i d_0) = 0

since the quadratic form inside the bracket is zero when i \neq 0 (it is an off-diagonal element of L G L^T = G).

Therefore the product matrix is of the form

R_1 = \begin{bmatrix} \epsilon & \mathbf{0}^T \\ \mathbf{0} & R \end{bmatrix}

where the 3 \times 3 submatrix R must be orthogonal, since R_1 is Lorentz. This proves the main decomposition result.
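The whole decomposition can now be exercised end to end. The sketch below (all helper names are mine) builds a non-trivial L, forms \overline{L} and R_2 as in the proof (with s and t obtained by Gram-Schmidt and a cross product, one admissible choice), sets R_1 = L R_2^T \overline{L}^{-1}, and checks that R_1 is an orthogonal Lorentz matrix and that R_1 \overline{L} R_2 recovers L.

```python
import math

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
I4 = [[float(i == j) for j in range(4)] for i in range(4)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-10):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

def rot_z(th):
    c, s = math.cos(th), math.sin(th)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def boost(beta):
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0],
            [0, 0, 1, 0], [0, 0, 0, 1]]

# a general Lorentz matrix: rotation * boost * rotation
L = matmul(rot_z(0.4), matmul(boost(0.6), rot_z(1.1)))

# Lbar from a = eps*a^0_0, b = -sqrt((a^0_0)^2 - 1)
a0 = L[0][0]
eps = 1.0 if a0 >= 0 else -1.0
a, b = eps * a0, -math.sqrt(a0 * a0 - 1.0)
Lbar = [[a, b, 0, 0], [b, a, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
Lbar_inv = [[a, -b, 0, 0], [-b, a, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# R_2 from r = (eps/b)(a^0_1, a^0_2, a^0_3) completed to a triad
r = [eps / b * L[0][k] for k in (1, 2, 3)]
h = [1.0, 0.0, 0.0] if abs(r[0]) < 0.9 else [0.0, 1.0, 0.0]
proj = sum(h[j] * r[j] for j in range(3))
sv = [h[i] - proj * r[i] for i in range(3)]
ns = math.sqrt(sum(x * x for x in sv))
sv = [x / ns for x in sv]
tv = [r[1] * sv[2] - r[2] * sv[1],
      r[2] * sv[0] - r[0] * sv[2],
      r[0] * sv[1] - r[1] * sv[0]]
R2 = [[1.0, 0.0, 0.0, 0.0], [0.0] + r, [0.0] + sv, [0.0] + tv]

R1 = matmul(L, matmul(transpose(R2), Lbar_inv))
assert close(matmul(R1, transpose(R1)), I4)               # R_1 orthogonal
assert close(matmul(matmul(transpose(R1), G), R1), G)     # R_1 Lorentz
assert abs(abs(R1[0][0]) - 1.0) < 1e-10                   # top-left is +-1
assert close(matmul(R1, matmul(Lbar, R2)), L)             # L = R_1 Lbar R_2
```

The final assertion is the decomposition itself; the first three confirm the properties of R_1 established in the proof above.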

To prove the corollary, note that in a simple Lorentz matrix the relative velocity is given by

\beta = \frac{v}{c} = -\frac{b}{a} = \frac{\sqrt{(a^0_0)^2 - 1}}{|a^0_0|} = \sqrt{1 - (a^0_0)^{-2}}

and since \overline{L} was constructed with a = \epsilon a^0_0 and b = -\sqrt{(a^0_0)^2 - 1} taken from L itself, this speed is determined by the single entry a^0_0 of L. The orthogonal factors R_1 and R_2 do not alter it, so v = c\sqrt{1 - (a^0_0)^{-2}} holds for L = R_1 \overline{L} R_2 as claimed.
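As a final numerical check of the corollary (helper names assumed as before): build L by sandwiching a boost of known speed \beta c between two spatial rotations, and recover \beta from the single entry a^0_0 of the product.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rot_z(th):
    c, s = math.cos(th), math.sin(th)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def boost(beta):
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0],
            [0, 0, 1, 0], [0, 0, 0, 1]]

beta = 0.42
L = matmul(rot_z(0.3), matmul(boost(beta), rot_z(1.1)))
a00 = L[0][0]
# corollary: v/c = sqrt(1 - (a^0_0)^{-2}) recovers the boost speed
assert abs(math.sqrt(1.0 - a00 ** -2) - beta) < 1e-12
```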

Published by Dr Christian P. H. Salas

Mathematics Lecturer
