I recently listened to a public lecture on electrodynamics at midnight on New Year's Eve. It was the first time I heard that James Clerk Maxwell himself originally described the Maxwell equation with the help of quaternions, a number system created by Sir William Rowan Hamilton in 1843. In today's textbooks, however, the Maxwell equations are taught in the language of vector analysis, and the concept of the quaternion has been completely eliminated. A brief review of this history can be found, for example, in the paper "Development of Vector Analysis from Quaternions".
Quaternion definition
As a generalization of the complex number $z=a+i\,b$ with $i^2=-1$, a quaternion is defined as \begin{equation} \mathcal{Q} = q_0+q_1\,\mathbf{i}+q_2\,\mathbf{j}+q_3\,\mathbf{k}\,,\tag{1}\end{equation} where $q_0, q_1, q_2,q_3\in \mathbb{R}$ and $\mathbf{i}, \mathbf{j},\mathbf{k}$ satisfy the multiplication rule \begin{equation}\mathbf{i}^2=\mathbf{j}^2=\mathbf{k}^2=-1\,,\quad \mathbf{i}\mathbf{j}=-\mathbf{j}\mathbf{i}=\mathbf{k}\,,\quad \mathbf{j}\mathbf{k}=-\mathbf{k}\mathbf{j}=\mathbf{i}\,,\quad \mathbf{k}\mathbf{i}=-\mathbf{i}\mathbf{k}=\mathbf{j}\,.\tag{2}\end{equation}
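The multiplication rule (2) fixes the product of any two quaternions once it is extended linearly. Here is a minimal sketch in plain Python, storing a quaternion as a 4-tuple; the names `qmul`, `neg`, `ONE`, `I`, `J`, `K` are chosen only for this illustration.

```python
# A quaternion q0 + q1 i + q2 j + q3 k is stored as the tuple (q0, q1, q2, q3).

def qmul(a, b):
    """Hamilton product, obtained by expanding a*b with the rule (2)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,  # scalar part
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,  # coefficient of i
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,  # coefficient of j
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,  # coefficient of k
    )

def neg(a):
    """Negate every component of a quaternion."""
    return tuple(-x for x in a)

ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

# The basis products reproduce rule (2); the product is not commutative.
print(qmul(I, J), qmul(J, I))
```

With integer components the products are exact, so the identities in (2) can be compared with `==` directly.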
Pauli matrix representation
For physics students, a quaternion as in Eq. (1) can be simply understood as a $2\times2$ matrix (with the scalar $1$ mapped to the identity matrix) via the correspondence \begin{equation}\mathbf{i}=-i\,\sigma_1\,,\quad \mathbf{j}=-i\,\sigma_2\,,\quad \mathbf{k}=-i\,\sigma_3\,,\tag{3}\end{equation} where $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices \begin{equation}\sigma_1=\left[\begin{array}{cc} 0 & 1 \\ 1 & 0\end{array}\right]\,,\quad \sigma_2=\left[\begin{array}{cc} 0 & -i \\ i & 0\end{array}\right]\,,\quad \sigma_3=\left[\begin{array}{cc} 1 & 0 \\ 0 & -1\end{array}\right]\,. \tag{4}\end{equation}
Indeed, with the algebra of the Pauli matrices \begin{equation}\sigma_m\,\sigma_n = \delta_{mn}+i\,\epsilon_{mnl}\,\sigma_l\,,\tag{5}\end{equation} it is easy to verify that the multiplication rule (2) is satisfied under the matrix representation (3) and (4).
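The verification can also be done by brute force. The sketch below builds the matrices (3) from the Pauli matrices (4) with plain nested tuples and multiplies them out; the helper names `matmul` and `scale` are only for this illustration.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )

def scale(s, a):
    """Multiply every entry of the matrix a by the number s."""
    return tuple(tuple(s * x for x in row) for row in a)

IDENTITY = ((1, 0), (0, 1))
SIGMA1 = ((0, 1), (1, 0))
SIGMA2 = ((0, -1j), (1j, 0))
SIGMA3 = ((1, 0), (0, -1))

# Correspondence (3): i = -i*sigma_1, j = -i*sigma_2, k = -i*sigma_3.
QI, QJ, QK = (scale(-1j, s) for s in (SIGMA1, SIGMA2, SIGMA3))

# Each square equals minus the identity, and QI*QJ equals QK.
print(matmul(QI, QI) == scale(-1, IDENTITY))
print(matmul(QI, QJ) == QK)
```

All entries are 0, ±1 or ±i, so the arithmetic is exact and the comparisons with `==` are safe.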
Scalar and vector
In modern textbooks on electrodynamics, a scalar is a number like $\phi$ and a vector is usually written in the form \begin{equation}\vec{q}\equiv q_1\,\hat{x}+q_2\,\hat{y}+q_3\,\hat{z}\,,\end{equation} where $\hat{x},\hat{y},\hat{z}$ are the unit vectors along the x, y, z axes of the Cartesian coordinate system.
Historically, however, scalars and vectors were first defined in terms of quaternions: in Eq. (1), the part $q_0$ is called the scalar and the remaining part \begin{equation}\mathbf{q}\equiv q_1\,\mathbf{i}+q_2\,\mathbf{j}+q_3\,\mathbf{k}\tag{6}\end{equation} is called the vector. Hamilton introduced the operators $\mathbb{S}$ and $\mathbb{V}$ that extract the scalar and the vector from a quaternion (1): \begin{equation}q_0 \equiv \mathbb{S}\,\mathcal{Q}\,,\quad q_1\,\mathbf{i}+q_2\,\mathbf{j}+q_3\,\mathbf{k}\equiv \mathbb{V}\mathcal{Q}\,. \tag{7}\end{equation}
Fundamentally, scalars and vectors are classified by their behavior under rotations in 3d space. To see that Eq. (6) is really a vector, we have to check how it transforms under rotations. Readers who are not interested in this argument can skip the following proof:
For physics students, a rotation in 3d space can be represented by an SU(2) group element \begin{equation}\mathbf{U}(\omega_1, \omega_2, \omega_3)=\exp\left[-i\sum_{l=1}^3\omega_l\frac{\sigma_l}{2}\right]\,.\end{equation} (BTW, $\mathbf{U}$ is a unit quaternion. Nowadays, the computer vision community still teaches its students quaternions to describe rotations.) A key relation, which we will not prove in this post, is that \begin{equation} \mathbf{U}\sigma_j \mathbf{U}^{\dagger} = \mathbf{R}_{ij}\,\sigma_i\,,\end{equation} where $\mathbf{R}$ is a $3\times 3$ matrix in the SO(3) group and the repeated index $i$ is summed over. Now consider rotating the quaternion (6) to a new quaternion $\mathbf{q}'=\mathbf{U}\mathbf{q}\mathbf{U}^{\dagger}$. With the above relation, we can verify explicitly that $q'_1, q'_2, q'_3$ in $\mathbf{q'}\equiv q'_1\,\mathbf{i}+q'_2\,\mathbf{j}+q'_3\,\mathbf{k}$ are related to $q_1, q_2, q_3$ in Eq. (6) by \begin{equation}q'_m=\sum_{n=1}^3\mathbf{R}_{mn}q_n\quad \text{for }m=1,2,3\,,\end{equation} which is indeed the transformation property of a vector under a rotation.
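As a concrete illustration (not a proof), take a rotation about the z axis. Under the correspondence (3), $\exp[-i\,\omega\,\sigma_3/2]$ becomes the unit quaternion $\cos(\omega/2)+\sin(\omega/2)\,\mathbf{k}$, and $\mathbf{U}^{\dagger}$ becomes the quaternion conjugate. The sketch below, with helper names chosen only for this example, compares $\mathbf{U}\mathbf{q}\mathbf{U}^{\dagger}$ with the usual $3\times3$ rotation matrix.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def qconj(a):
    """Quaternion conjugate, the counterpart of the Hermitian conjugate of U."""
    return (a[0], -a[1], -a[2], -a[3])

def rotate(u, q):
    """Return u q u^dagger for a unit quaternion u."""
    return qmul(qmul(u, q), qconj(u))

omega = math.pi / 2                      # rotation angle about the z axis
u = (math.cos(omega / 2), 0.0, 0.0, math.sin(omega / 2))
q = (0.0, 1.0, 2.0, 3.0)                 # the vector 1 i + 2 j + 3 k

q_rot = rotate(u, q)

# The same rotation written as an SO(3) matrix acting on (q1, q2, q3).
c, s = math.cos(omega), math.sin(omega)
R = ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))
expected = tuple(sum(R[m][n] * q[n + 1] for n in range(3)) for m in range(3))

print(q_rot)      # scalar part stays zero up to rounding
print(expected)   # agrees with the vector part of q_rot up to rounding
```

The scalar part of the result stays zero, so a vector quaternion is mapped to a vector quaternion, and its components mix exactly as the rotation matrix prescribes.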
Quaternion calculus
The core mathematics in electrodynamics is about the operator $\nabla$. In modern textbooks on electrodynamics, $\nabla$ is defined by \begin{equation}\nabla \equiv \hat{x}\frac{\partial}{\partial x}+ \hat{y}\frac{\partial}{\partial y} + \hat{z}\frac{\partial}{\partial z}\,,\end{equation} where $\hat{x},\hat{y},\hat{z}$ are the unit vectors along the x, y, z axes of the Cartesian coordinate system. When this $\nabla$ acts on a scalar $\phi$ or a vector $\vec{q}\equiv q_1\,\hat{x}+q_2\,\hat{y}+q_3\,\hat{z}$, there are THREE different operations named gradient, divergence and curl: \begin{eqnarray} \text{grad}\,\phi &\equiv&\hat{x}\frac{\partial \phi}{\partial x}+ \hat{y}\frac{\partial \phi}{\partial y} + \hat{z}\frac{\partial \phi}{\partial z}\,, \\ \text{div}\,\vec{q} &\equiv& \frac{\partial q_1}{\partial x}+\frac{\partial q_2}{\partial y}+\frac{\partial q_3}{\partial z}\,,\\ \text{curl}\,\vec{q}&\equiv&\left(\frac{\partial q_3}{\partial y} - \frac{\partial q_2}{\partial z }\right)\hat{x}+ \left(\frac{\partial q_1}{\partial z} - \frac{\partial q_3}{\partial x}\right)\hat{y}+\left(\frac{\partial q_2}{\partial x} - \frac{\partial q_1}{\partial y}\right)\hat{z}\,.\end{eqnarray}
Historically, however, the operator $\nabla$ was first introduced by Hamilton in the context of quaternions, \begin{equation}\nabla \equiv \mathbf{i}\frac{\partial}{\partial x}+ \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\,,\tag{8}\end{equation} where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are the same as those in the quaternion (1) and satisfy the multiplication rule (2).
Applying the operator (8) to the vector (6) from the left, we can compute explicitly that \begin{equation}\boxed{\nabla\,\mathbf{q}=- \text{Div}\,\mathbf{q} + \text{Curl}\,\mathbf{q}}\,,\tag{9}\end{equation}
where \begin{eqnarray} \text{Div}\,\mathbf{q}&\equiv& \frac{\partial q_1}{\partial x}+\frac{\partial q_2}{\partial y}+\frac{\partial q_3}{\partial z} = \text{div}\,\vec{q}\,,\\ \text{Curl}\,\mathbf{q}&\equiv&\left(\frac{\partial q_3}{\partial y} - \frac{\partial q_2}{\partial z }\right)\mathbf{i}+ \left(\frac{\partial q_1}{\partial z} - \frac{\partial q_3}{\partial x}\right)\mathbf{j}+\left(\frac{\partial q_2}{\partial x} - \frac{\partial q_1}{\partial y}\right)\mathbf{k}\\&=& \left(\text{curl}\,\vec{q}\right)_x\,\mathbf{i}+ \left(\text{curl}\,\vec{q}\right)_y\,\mathbf{j}+\left(\text{curl}\,\vec{q}\right)_z\,\mathbf{k} \,.\end{eqnarray}
Remarks:
- For quaternion vectors like (6), we can also define a divergence and a curl that are the same as those in vector analysis but with $\hat{x},\,\hat{y},\,\hat{z}$ replaced by $\mathbf{i},\,\mathbf{j},\,\mathbf{k}$.
- The "weird" definition of curl in vector analysis is the result of the multiplication rule (2) when computing $\nabla\,\mathbf{q}$.
- But more importantly, there is only ONE operation $\nabla\,\mathbf{q}$ in the language of quaternions. As seen on the right side of Eq. (9), divergence and curl always appear simultaneously, and they are simply (up to the minus sign on the divergence) the scalar and vector parts of the quaternion $\nabla\,\mathbf{q}$.
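Eq. (9) can be checked numerically on a sample field. The sketch below is only an illustration: the field `field` is an arbitrary quadratic example made up for the test, and the derivatives are taken by central differences, which are exact for quadratic polynomials up to rounding.

```python
def qmul(a, b):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def field(x, y, z):
    """Hypothetical test field (q1, q2, q3); any smooth field would do."""
    return (y * z, x * x, x * y + z * z)

def partial(f, point, axis, h=1e-3):
    """Central-difference derivative of a tuple-valued f along one axis."""
    lo, hi = list(point), list(point)
    lo[axis] -= h
    hi[axis] += h
    return tuple((a - b) / (2 * h) for a, b in zip(f(*hi), f(*lo)))

def nabla_q(f, point):
    """Quaternion product of the operator (8) with the vector field f."""
    total = (0.0, 0.0, 0.0, 0.0)
    for axis in range(3):
        basis = [0.0, 0.0, 0.0, 0.0]
        basis[axis + 1] = 1.0                  # i, j or k
        dq = (0.0,) + partial(f, point, axis)  # derivative as a pure vector
        term = qmul(tuple(basis), dq)
        total = tuple(s + t for s, t in zip(total, term))
    return total

# For this field, div = 2z and curl = (x, 0, 2x - z).
result = nabla_q(field, (1.0, 2.0, 3.0))
print(result)  # scalar part is -div, vector part is curl, up to rounding
```

At the point $(1,2,3)$ the divergence of this field is $6$ and its curl is $(1,0,-1)$, so the scalar part of the result should be close to $-6$ and the vector part close to $(1,0,-1)$.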
Maxwell's quaternion notation
Historically, Maxwell first wrote down his equations in components, about twenty equations in all. Later, he reduced the number of equations using Hamilton's quaternions.
Take one of his equations, $\text{div}\,\vec{D}=\rho$ in today's textbooks, as an example. We can first define the quaternion vector \begin{equation}\mathbf{D}\equiv D_x\,\mathbf{i}+D_y\,\mathbf{j}+D_z\,\mathbf{k}\,.\end{equation} Then by Eqs. (8), (9) and (7), we have \begin{equation}-\mathbb{S}\,\nabla \mathbf{D}=\rho\,.\tag{10}\end{equation} The other three equations can be written down in quaternion form using the same recipe.
Note:
A modern quaternion form of the Maxwell equation was written down around 2003. See this paper for details.
Maxwell equation in biquaternion
Recall that there is a more concise form of the Maxwell equation in terms of the four-potential $A^{\mu}=(\phi/c, \vec{A})$ and the field strength $F_{\mu\nu}=\partial_{\mu} A_{\nu}-\partial_{\nu}A_{\mu}$ in Minkowski spacetime, rather than vectors in 3d Euclidean space, because of the underlying Lorentz symmetry SO(1,3). This inspires us to also seek a relativistic form of the Maxwell equation in quaternions.
My first trial is to generalize the operator (8) to \begin{equation}\partial_{\text{trial}} \equiv \frac{1}{c}\frac{\partial}{\partial t}+\nabla \equiv \frac{1}{c}\frac{\partial}{\partial t}+ \mathbf{i}\frac{\partial}{\partial x}+ \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\end{equation} and write the potential in quaternion as \begin{equation} \mathcal{A}_{\text{trial}}\equiv \frac{\phi}{c} + \mathbf{A}\equiv \frac{\phi}{c}+A_x\,\mathbf{i}+A_y\,\mathbf{j}+A_z\,\mathbf{k}\,.\end{equation} But the result of \begin{equation}\partial_{\text{trial}}\,\mathcal{A}_{\text{trial}} = \left(\frac{1}{c^2}\frac{\partial \phi}{\partial t}-\text{Div}\,\mathbf{A}\right)+\left(\frac{1}{c}\nabla\,\phi+\frac{1}{c}\frac{\partial}{\partial t}\mathbf{A}+ \text{Curl}\,\mathbf{A}\right)\end{equation} is problematic: the scalar part differs from the
Lorentz gauge condition by a relative minus sign between its two terms; more importantly, the vector part yields only the combination $-\frac{1}{c}\mathbf{E}+\mathbf{B}$ instead of $\mathbf{E}$ and $\mathbf{B}$ individually.
The key insight is that the Lorentz algebra, once complexified, splits into two commuting copies of $su(2)$: $so(1,3)_{\mathbb{C}}\cong su(2)_{\mathbb{C}}\oplus su(2)_{\mathbb{C}}$. As a result, one quaternion, as in the above trial, is not enough. We need two sets of quaternions. The two sets can be combined using the imaginary unit $i$, leading to the so-called
biquaternion: \begin{eqnarray}\boldsymbol{\mathcal{Q}}&=&\mathcal{Q}+i\,\mathcal{P}=\left(q_0+q_1\,\mathbf{i}+q_2\,\mathbf{j}+q_3\,\mathbf{k}\right)+i\,\left(p_0+p_1\,\mathbf{i}+p_2\,\mathbf{j}+p_3\,\mathbf{k}\right)\\&=& (q_0+i\,p_0)+(q_1+i\,p_1)\,\mathbf{i}+(q_2+i\,p_2)\,\mathbf{j}+(q_3+i\,p_3)\,\mathbf{k}\,,\end{eqnarray} which shows that a biquaternion simply promotes the coefficients $q_0, q_1, q_2,q_3$ of the quaternion (1) from $\mathbb{R}$ to $\mathbb{C}$. Here the imaginary unit $i$ commutes with $\mathbf{i}, \mathbf{j}, \mathbf{k}$. Now we can generalize the operator (8) and write the potential in biquaternion form as \begin{eqnarray} \boldsymbol{\partial} &\equiv& \frac{1}{ic}\frac{\partial}{\partial t}+\nabla \equiv \frac{1}{ic}\frac{\partial}{\partial t}+ \mathbf{i}\frac{\partial}{\partial x}+ \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\,,\\ \boldsymbol{\mathcal{A}}&\equiv& \frac{\phi}{ic} + \mathbf{A}\equiv \frac{\phi}{ic}+A_x\,\mathbf{i}+A_y\,\mathbf{j}+A_z\,\mathbf{k}\,.\tag{11} \end{eqnarray} As a result, \begin{equation}\boldsymbol{\partial}\boldsymbol{\mathcal{A}}=-\left(\frac{1}{c^2}\frac{\partial \phi}{\partial t}+\text{Div}\,\mathbf{A}\right)+\frac{i}{c}\left(-\nabla\,\phi-\frac{\partial}{\partial t}\mathbf{A}\right)+ \text{Curl}\,\mathbf{A}\,,\end{equation} whose scalar part is zero under the
Lorentz gauge and whose vector part is the field strength \begin{equation}\boldsymbol{\mathcal{F}}\equiv\frac{i}{c}\mathbf{E}+\mathbf{B}=\boldsymbol{\partial}\boldsymbol{\mathcal{A}}\,.\tag{12}\end{equation} Finally, note that \begin{eqnarray}\boldsymbol{\partial}^*\boldsymbol{\mathcal{F}}&=&\left(-\frac{1}{ic}\frac{\partial}{\partial t}+\nabla\right)\left(\frac{i}{c}\mathbf{E}+\mathbf{B}\right)\\&=&-\text{Div}\,\mathbf{B}-\frac{i}{c}\text{Div}\,\mathbf{E}+\left(\text{Curl}\,\mathbf{B}-\frac{1}{c^2}\frac{\partial}{\partial t}\mathbf{E}\right)+\frac{i}{c}\left(\text{Curl}\,\mathbf{E}+\frac{\partial}{\partial t}\mathbf{B}\right)\,,\end{eqnarray} and let \begin{equation}\boldsymbol{\mathcal{J}}\equiv\frac{c\rho}{i}+\mathbf{J}\,.\end{equation} Then we can write the Maxwell equation in vacuum as \begin{equation}\boxed{\boldsymbol{\partial}^*\boldsymbol{\mathcal{F}}=\mu_0\,\boldsymbol{\mathcal{J}}}\,.\tag{13}\end{equation}
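As a sanity check of Eq. (13) without sources, one can feed it a plane wave travelling along z, for which $\boldsymbol{\partial}^*\boldsymbol{\mathcal{F}}$ should vanish. The sketch below is an illustration under stated assumptions: units with $c=1$, a hypothetical wave $\mathbf{E}=\hat{x}\cos(z-ct)$, $\mathbf{B}=\hat{y}\cos(z-ct)/c$, and central-difference derivatives. Python's built-in complex numbers play the role of the biquaternion coefficients.

```python
import math

C = 1.0  # assumption: units in which the speed of light is 1

def qmul(a, b):
    """Hamilton product; the components may be complex (biquaternions)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def F(t, x, y, z):
    """Field strength (12), F = (i/c) E + B, for the sample plane wave."""
    phase = math.cos(z - C * t)      # wave number 1, angular frequency c
    E = (phase, 0.0, 0.0)
    B = (0.0, phase / C, 0.0)
    return (0j,) + tuple(1j / C * e + b for e, b in zip(E, B))

def partial(f, point, axis, h=1e-4):
    """Central-difference derivative along t (axis 0) or x, y, z (1, 2, 3)."""
    lo, hi = list(point), list(point)
    lo[axis] -= h
    hi[axis] += h
    return tuple((a - b) / (2 * h) for a, b in zip(f(*hi), f(*lo)))

def d_star_F(point):
    """Left-hand side of (13): (-1/(ic) d/dt + nabla) F, with -1/(ic) = i/c."""
    total = tuple((1j / C) * d for d in partial(F, point, 0))
    for axis in (1, 2, 3):
        basis = [0.0, 0.0, 0.0, 0.0]
        basis[axis] = 1.0            # i, j or k
        term = qmul(tuple(basis), partial(F, point, axis))
        total = tuple(s + t for s, t in zip(total, term))
    return total

residual = d_star_F((0.3, 0.1, -0.2, 0.7))
print(max(abs(c) for c in residual))  # small: only finite-difference error remains
```

All four complex components of the residual vanish up to discretization error, which is the single biquaternion statement of the four source-free Maxwell equations for this wave.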
Technical notes:
- In Minkowski spacetime, we can define the four-potential either as $A^{\mu}=(\phi/c, \vec{A})$ for the Lorentzian metric $\eta_{\mu\nu}\equiv\text{diag}[-1,+1,+1,+1]$ or as $A^{\mu}=(\phi/ic, \vec{A})$ for the Euclidean metric $\text{diag}[+1,+1,+1,+1]$. The latter definition of $A^{\mu}$ is the inspiration for our definition (11), but the underlying motivations for using the imaginary unit $i$ are different: one is to make the metric signature Euclidean, while the other is to generalize from quaternions to biquaternions.
- The reason we construct the Maxwell equation (13) using $\boldsymbol{\partial}^*\boldsymbol{\mathcal{F}}$ instead of $\boldsymbol{\partial}\boldsymbol{\mathcal{F}}$ is that $\boldsymbol{\partial}^*\boldsymbol{\mathcal{F}}=\boldsymbol{\partial}^*\boldsymbol{\partial}\boldsymbol{\mathcal{A}}$ and \begin{equation}\boldsymbol{\partial}^*\boldsymbol{\partial}=\frac{1}{c^2}\frac{\partial^2}{\partial t^2} +\nabla \nabla =\frac{1}{c^2}\frac{\partial^2}{\partial t^2} -\text{Div}\, \nabla\end{equation} is the expected d'Alembert operator. The mixed terms cancel because $\frac{\partial}{\partial t}$ commutes with $\nabla$, and $\nabla\nabla$ has no vector part because partial derivatives commute.
Remarks:
- Besides the biquaternion form (12), one can also write the field strength using gamma matrices, leading to the so-called spacetime algebra (STA) formulation of the Maxwell equation. Like biquaternions, gamma matrices also contain two sets of Pauli matrices, complying with the requirement of Lorentz symmetry.
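- How many equations are there in the Maxwell equation? Four in vector analysis, two in terms of $F_{\mu\nu}$, and finally ONE in biquaternions. This is why we say Maxwell "equation" instead of "equations". Here is an interesting story shared by Prof. Xiao-Gang Wen: "To see how important the Maxwell equations are, let me tell a story. Every time I drive from Canada into the United States, the US officer at the border asks: 'What do you do?' I answer: 'Physics.' The officer: 'Do you know how many Maxwell equations there are?' I think: if the officer is at the graduate level, I should answer two. If the officer is at the undergraduate level, I should answer four. If the officer is at the high-school level, I have no idea how many to answer. In the end I ventured: 'Four.' And he let me through. (It seems I am not just pretending to be a physicist.)"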
A few words at the end
When Heaviside and Gibbs replaced Hamilton's quaternions with their vector analysis in electrodynamics, they believed that they had made a great simplification. But quaternions and similar mathematics appear again and again in quantum physics, such as the Pauli matrices in the Weyl equation or the gamma matrices in the Dirac equation. It looks like the workload saved when learning electrodynamics has to be paid back when learning later courses. Einstein is right: "Make everything as simple as possible, but not simpler."
excited!
Extending scalar part of the quaternion to be imaginary will solve the gauge, right?
Yes. See the scalar term in the equation below (11) and above (12).