Gauge Symmetries, Symmetry Breaking, and Gauge-Invariant Approaches

Philipp Berghofer; Jordan François; Simon Friederich; Henrique Gomes; Guy Hetzroni; Axel Maas; René Sondenheimer

doi:10.1017/9781009197236

1 Introduction

Gauge symmetry is a central concept in essentially all of modern fundamental physics. The framework of theories in which gauge symmetries play a central role – gauge theories – is very general, and many physicists expect that any future discoveries will be accommodated within it. However, there are unresolved issues in the foundations of gauge theories, notably concerning which features of gauge theories are descriptively redundant, and which are crucial for empirical adequacy. The aim of this Element is to present precisely what is known on gauge symmetries and the possibility of gauge symmetry breaking, stressing the relevance of foundational and philosophical issues to current scientific practice and open questions, and to further outline what we take to be the most promising avenues forward. This Element is thus an invitation to anyone interested in understanding the conceptual foundations of gauge theories, and a reflection upon how these features shape the way we think about elementary fields and particles.

Most results on gauge theories stem from approaches that make some drastic simplifications. Gauge theories with weak interactions are often treated using perturbative approximations. In these approximations, many of the geometric properties of non-Abelian gauge theories, like their nontrivial topological features, play little to no role. (Lattice) simulations, suitable especially for strongly interacting theories, can be formulated in such a way that the gauge symmetry plays essentially no role in practice. Thus, the conceptual questions concerning the gauge symmetries themselves usually do not arise as problems in practice.

However, even at this point a little conceptual reflection shows that the central implicit and explicit foundational assumptions on gauge symmetries are not always consistent with one another. Gauge-dependent objects depend on the choice of gauge fixing, which is made on pragmatic grounds, not dictated by any choice of gauge made on “nature’s” behalf. This is one among several reasons why it is commonly stated that gauge-dependent objects cannot directly correspond to anything physically real. This assertion, however, casts doubt on the physical reality of elementary particles such as electrons and quarks, together with the fields that represent them. This is in sharp tension with the common discourse and with aspects of the scientific practice in which these gauge-dependent fields are taken to be physically real in the same sense as, say, atoms are usually taken to be physically real. This tension already highlights why properly understanding gauge symmetries is important from an ontological point of view.

There are three standard ways to avoid gauge-dependent objects in the treatment of the gauge interactions that form part of the Standard Model of elementary particle physics (Maas, 2019): (i) in quantum electrodynamics (QED) a so-called photon cloud dressing reestablishes gauge invariance by including, in the description of the electron, what one might characterize as its “Coulomb tail” (Haag, 1992); (ii) in quantum chromodynamics (QCD), the resolution, or rather the irrelevance of gauge dependence, is due to confinement, which requires that only uncharged (with respect to the non-Abelian color charge) and thereby gauge-invariant objects appear at distances at or beyond the radius of hadrons (Lavelle & McMullan, 1997; Yndurain, 2006); and (iii) in the electroweak sector, though much less known, it is due to the Fröhlich–Morchio–Strocchi mechanism (Fröhlich, Morchio, and Strocchi, 1980, 1981). While these three mechanisms appear quite different at first sight, they eventually all boil down to canceling gauge dependency by either eliminating the gauge degrees of freedom, or, at least, ensuring they do not appear in the empirically accessible range.

However, the fact that gauge-dependent objects can be eliminated in an unobtrusive way in the gauge theories just mentioned seems to depend on features that are specific to the theories combined in the Standard Model and may not hold in extensions of it. A more systematic strategy for eliminating gauge dependence may be necessary for future progress in the search for physics beyond the Standard Model. A “literal interpretation” of gauge fields that regards different gauge symmetry-related field configurations as physically distinct, in contrast, may well be an obstacle to such progress. Thus it is necessary to establish whether a manifestly gauge-invariant approach to gauge theories, replacing the current way of thinking about elementary particles, is compelling or perhaps even necessary for further progress. This need, as we shall see, mirrors themes in the recent philosophical discourse on gauge symmetries.

The structure of this Element is as follows. It starts out with a review of general features of (gauge) symmetries in Section 2. Many conceptual and technical complications surrounding gauge dependence arise in connection with the spontaneous breaking of gauge symmetry. The understanding of gauge symmetry breaking is particularly central in the context of the Brout–Englert–Higgs (BEH) effect, and this is discussed in Section 3. Based on this discussion, we motivate the search for gauge-invariant approaches in Section 4, and their implementation, given at various levels of detail, in Sections 5 and 6. In Section 7 we conclude with some reflections about the ultimate consequences of the results presented, and which key steps would have to be taken to answer all substantive open questions about gauge symmetries.

2 State of the Art: The Interpretation of Gauge Symmetries

2.1 Introduction

2.1.1 Symmetries

It has almost become a cliché to emphasize that symmetry plays a central role in modern physics. Weyl declared that “[a]s far as I see, all a priori statements in physics have their origin in symmetry” (Weyl, 1952, 126), Yang argued that “symmetry dictates interactions” (Yang, 1980, 42), Weinberg famously said that “[s]ymmetry principles have moved to a new level of importance in this century and especially in the last few decades: there are symmetry principles that dictate the very existence of all the known forces of nature” (Weinberg, 1992, 142), and philosopher Christopher Martin called twentieth-century physics the “Century of Symmetry” (Martin, 2003). Cliché or not, it is simply a fact that our currently best physical theories exhibit symmetries. In some cases, symmetry considerations played an important heuristic role in formulating the respective physical theories (e.g., the theory of relativity). In other cases, symmetries were found retrospectively (e.g., classical electrodynamics). Due to the omnipresence of symmetries and the important heuristic role symmetry considerations play in modern physics, it is fair to say that understanding and interpreting symmetries is crucial for understanding our physical theories and what they tell us about the world. Accordingly, reflecting on the mechanism, nature, and heuristic significance of symmetries has become a central task of physicists and philosophers (see, e.g., Brading & Castellani, 2003; Dasgupta, 2022; Ismael, 2022; Weyl, 1952; Wigner, 1967b; Yang, 1996).

In particular, here we are interested in questions concerning the ontology of symmetries. Are certain symmetries mere mathematical artifacts or are they physically real transformations? Here we put a special focus on gauge symmetries, which are at the very heart of modern physics. Importantly, we do not only argue that understanding the nature of gauge symmetries helps us to better understand our physical theories. We also argue that critical conceptual reflection on the nature of gauge symmetries indicates that textbook accounts of the Brout–Englert–Higgs (BEH) mechanism are misleading.¹ As we will elaborate shortly, this line of reasoning is not unfamiliar in the philosophy of physics community. It is one of our aims to present it in its strongest form and show how such considerations naturally lead to gauge-invariant approaches to the BEH mechanism. In this context, as mentioned in Section 1, we investigate some available methods for reducing the theories to their gauge-invariant syntax in Sections 5 and 6.

Before we turn to gauge symmetries, we shall begin with some general remarks about symmetries in physics. Symmetries can concern physical objects and states or physical theories and laws. We say that an object or theory possesses a symmetry if there are transformations that leave certain features of the objects or the theory to which the transformations are applied preserved or unchanged. With respect to these features, the object or theory is invariant concerning the respective transformations. In this sense, one can say that “[s]ymmetry is invariance under transformation” (Kosso, 2000, 83). Transformations that leave certain aspects unchanged are referred to as symmetry transformations. Groups of such transformations are referred to as symmetry groups. Types of symmetries can be distinguished according to types of transformations. In physics, we often find the following distinctions: continuous versus discrete symmetries, external versus internal symmetries, and global versus local symmetries.

Continuous symmetry is invariance under continuous transformation. An example of a continuous transformation would be the rotation of a circle. Since the appearance of a circle does not change under continuous rotations, we say that circles possess a continuous symmetry. The appearance of snowflakes, on the other hand, remains unchanged by rotations of sixty degrees but not, say, fifty degrees. This would be an example of a discrete transformation and thus snowflakes possess a discrete symmetry. We are particularly concerned with continuous symmetries. Mathematically, continuous symmetries can be described by Lie groups.
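The contrast between a discrete and a continuous rotational symmetry can be checked numerically. The following is a minimal sketch, assuming that the six vertices of a regular hexagon may stand in for a snowflake; the helper names `rotate`, `same_point_set`, and `is_symmetric` are hypothetical and chosen only for illustration.

```python
import math

def rotate(point, degrees):
    """Rotate a 2D point about the origin by the given angle in degrees."""
    angle = math.radians(degrees)
    x, y = point
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))

def same_point_set(a, b, tol=1e-9):
    """True if every point of a lies within tol of some point of b, and vice versa."""
    def covered(src, dst):
        return all(any(math.dist(p, q) < tol for q in dst) for p in src)
    return covered(a, b) and covered(b, a)

def is_symmetric(points, degrees):
    """True if rotating the point set by the given angle maps it onto itself."""
    return same_point_set(points, [rotate(p, degrees) for p in points])

# Assumption: a regular hexagon's vertices serve as a stand-in for a snowflake.
hexagon = [(math.cos(math.radians(60 * k)), math.sin(math.radians(60 * k)))
           for k in range(6)]

print(is_symmetric(hexagon, 60))  # rotation by sixty degrees
print(is_symmetric(hexagon, 50))  # rotation by fifty degrees
```

A circle, by contrast, would pass the analogous test for every angle, which is what makes its symmetry continuous rather than discrete.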

External symmetry is invariance under transformations that involve a change of the space-time coordinates. Examples of external transformations are spatial rotations. Accordingly, circles possess a continuous external symmetry and snowflakes a discrete external symmetry. Internal symmetries are symmetries in which the respective transformations do not involve a change of the space-time coordinates. Examples of internal transformations would be permutations of particles or phase transformations. The symmetries we are interested in, namely, the gauge symmetries in quantum field theory, are internal symmetries.

For our purposes, the most important distinction is the one between global and local symmetries. It is standard (but slightly inaccurate) to characterize global transformations as those performed identically at each point in space-time. Similarly, local transformations are ones that are performed arbitrarily at each point in space-time. More accurately, we follow Brading and Brown in making a “distinction between symmetries that depend on constant parameters (global symmetries) and symmetries that depend on arbitrary smooth functions of space and time (local symmetries)” (Brading and Brown, 2004, 649).²

Let us exemplify the difference between a global and a local transformation by considering a field $Ψ$ undergoing the following phase transformation:

Ψ (x) \to Ψ^{'} (x) = e^{i θ} Ψ (x) .

This is a (continuous, internal) global transformation because the phase change $θ$ is independent of the space-time point ( $θ$ does not depend on $x$ ). Now consider the phase transformation

Ψ (x) \to Ψ^{'} (x) = e^{i θ (x)} Ψ (x) .

This is a (continuous, internal) local transformation because the phase change $θ (x)$ depends on the space-time point $x$ . The space of theories exhibiting local symmetries is much smaller – that is, more constrained – than the space of theories only exhibiting a global symmetry (since global symmetries, when they exist, are subgroups of local symmetries).
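One way to see why invariance under local transformations is the more demanding requirement is to compare how the two kinds of phase change act on quantities built from the field. The sketch below is illustrative only: the one-dimensional periodic lattice, the sample field, and the functions `density` and `gradient_term` are assumptions introduced here and are not taken from the text. A quantity that depends on the field at a single point is unchanged by either transformation, whereas a quantity comparing neighboring points is unchanged only by the global one.

```python
import cmath
import math

N = 64  # number of lattice sites (illustrative choice)

def field(n):
    """Sample field: a plane wave with a smooth envelope (hypothetical example)."""
    x = 2 * math.pi * n / N
    return (1.0 + 0.5 * math.cos(x)) * cmath.exp(1j * 3 * x)

psi = [field(n) for n in range(N)]

def density(f):
    """Sum of |Psi|^2 over all sites; depends on each site separately."""
    return sum(abs(v) ** 2 for v in f)

def gradient_term(f):
    """Sum of |Psi(n+1) - Psi(n)|^2 on a periodic lattice; compares neighbors."""
    return sum(abs(f[(n + 1) % N] - f[n]) ** 2 for n in range(N))

theta = 0.7  # constant phase: global transformation
global_psi = [cmath.exp(1j * theta) * v for v in psi]

# Site-dependent phase theta(x) = sin(x): local transformation
local_psi = [cmath.exp(1j * math.sin(2 * math.pi * n / N)) * v
             for n, v in enumerate(psi)]

for name, f in [("original", psi), ("global", global_psi), ("local", local_psi)]:
    print(name, density(f), gradient_term(f))
```

In this toy setting the gradient-like term changes under the local transformation, so a theory containing such a term is invariant only under the global symmetry unless further structure is added to compensate.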

A prominent example of an external global transformation is the Lorentz transformation of special relativity. Accordingly, special relativity is based on an external global symmetry. Prominent examples of external local transformations are the arbitrary differentiable coordinate transformations we find in general relativity. Due to its general covariance, namely, the invariance with respect to such transformations, general relativity is based on an external local symmetry.

The symmetries that underlie modern particle physics are internal local symmetries. The Standard Model of particle physics, describing three of the four known fundamental interactions, is a non-Abelian gauge theory with an internal local $U (1) \times S U (2) \times S U (3)$ symmetry group. Concerning our terminology, it is to be noted that we use the terms “local symmetry” and “gauge symmetry” synonymously. Accordingly, any field theory in which the Lagrangian remains invariant under local transformations is a gauge theory. This means that all four fundamental interactions are described by gauge theories (Standard Model + general relativity).

Because gauge symmetries play such a prominent role in modern physics, they more than deserve due conceptual reflection. The central topic of this Element is born out of such reflection, namely, the ontological status of gauge symmetries. Should they be interpreted as merely the mathematical structure of our descriptions of reality or do they represent the structure of reality? To approach this question, which is discussed in more detail in the following section, it is instructive to consider one of the finest examples of synergies between mathematics and physics: Noether’s results concerning the relationship between mathematical symmetries and physical theories.

The famous Noether theorem, also known as Noether’s first theorem, relates continuous global symmetries to conserved quantities. Stated informally, the theorem says that to every continuous global symmetry there corresponds a conservation law (see Brading and Brown, 2003). Conversely, every conserved quantity corresponds to a continuous global symmetry. Accordingly, global symmetries seem to be physical symmetries, symmetries of nature. And, indeed, there is some consensus that global symmetries are observable and that they have direct empirical significance (Brading & Brown, 2004; Friederich, 2015; Gomes, 2021; Healey, 2009; Kosso, 2000).
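As a concrete instance of the first theorem, consider the global phase transformation $Ψ \to e^{i θ} Ψ$ introduced earlier. The following worked example is a sketch under an assumption not made in the text: the field is taken to be a free complex scalar field with mass $m$ and the Lagrangian shown in the first line. The overall sign and normalization of the current are conventional.

```latex
\mathcal{L} = \partial_\mu \Psi^{*} \, \partial^\mu \Psi - m^{2} \Psi^{*} \Psi ,
\qquad
\delta \Psi = i \theta \Psi , \quad \delta \Psi^{*} = - i \theta \Psi^{*} ,

j^\mu
  = \frac{\partial \mathcal{L}}{\partial (\partial_\mu \Psi)} \frac{\delta \Psi}{\theta}
  + \frac{\partial \mathcal{L}}{\partial (\partial_\mu \Psi^{*})} \frac{\delta \Psi^{*}}{\theta}
  = i \left( \Psi \, \partial^\mu \Psi^{*} - \Psi^{*} \, \partial^\mu \Psi \right) ,

\partial_\mu j^\mu
  = i \left( \Psi \, \Box \Psi^{*} - \Psi^{*} \, \Box \Psi \right)
  = 0
  \quad \text{using} \quad (\Box + m^{2}) \Psi = 0 .
```

The conserved quantity associated with the symmetry is then the charge $Q = \int d^{3} x \, j^{0}$ , which is constant in time for fields that fall off sufficiently fast at spatial infinity.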

The situation is very different with respect to local symmetries. The difference can be illustrated by turning from Noether’s first theorem to what is sometimes called Noether’s second theorem (see Brading & Brown, 2003; Earman, 2002, 2004b; Rickles, 2008, 55f).

It has been pointed out that Noether’s results imply that local symmetries impose “powerful restrictions on the possible form a theory can take” (Brading & Brown, 2003, 105). Specifically, the second Noether theorem ensures, through a so-called Gauss law, that the dynamics of the force fields are compatible with the dynamics of the charges that are their sources (Gomes, Roberts, & Butterfield, 2021). More broadly, “Noether’s second theorem tells us that in any theory with a local Noether symmetry there is always a prima facie case of underdetermination: more unknowns than there are independent equations of motion” (Brading & Brown, 2003, 104). As will be discussed in detail shortly, this underdetermination inherent to gauge theories implies “an apparent violation of determinism” (Earman, 2002, 212).

2.1.2 Interpreting Gauge Symmetries: A First Look

Wigner (1967b) compared the gauge invariance of electromagnetism to a theoretical ghost:³

“This invariance is, of course, an artificial one, similar to that which we could obtain by introducing into our equations the location of a ghost. The equations then must be invariant with respect to changes of the coordinate of that ghost. One does not see, in fact, what good the introduction of the coordinate of the ghost does.”

This metaphor seems to describe the accepted view, which is reflected, at least officially and explicitly, in physics textbooks and is shared by prominent voices in the philosophy of physics community, who declare gauge theories to contain “surplus structure” (Redhead, 2002), “formal redundancy” (Martin, 2003), or “descriptive fluff” (Earman, 2004b, 1239). On one widespread understanding, the surplus structure of many theories is manifested in a multiplicity of mathematical representations for each physical state of affairs. Under this definition, surplus structure is ubiquitous in physics: the simple use of coordinates in space-time physics would count as surplus. In some cases, such a representational multiplicity may be simply understood as the capacity of the theory to accommodate the viewpoints of different observers, that is, what symmetries are often understood to be about (e.g., in special relativity). Therefore, a more useful definition of surplus structure should not encompass the kind of representational multiplicity that is epistemically unavoidable, or even a theoretical virtue.

Redhead (2002) provides a more precise characterization of surplus structure. Given the mathematical structure $M$ of a theory used to represent a physical structure $P$ , if $P$ actually maps isomorphically only onto a substructure $M^{'}$ of $M$ , then by definition the surplus structure of the theory is the complement of $M^{'}$ in $M$ . Situations in which the mathematical structure that is understood as correlating to a physical structure is embedded within a larger mathematical structure are fairly common in modern physics. It is therefore not hard to find examples that fit this description and present no special interpretive difficulties – the use of complex numbers in classical wave mechanics or in circuit analysis comes to mind.

But obviously there are instances in modern physics where capturing the notion of surplus structure is harder, as the boundary between surplus and essential theoretical structures is hard to draw and may evolve as more knowledge is gained. Generally, which of the mathematical structures of a given theory (if any) correlates to a physical structure is a matter of an interpretation of the theory, which would generally have to balance different theoretical virtues.

We take the relation of mathematical formalism to reality to have three layers: (i) the measurable, (ii) the ontological (or “real,” or physical), and (iii) the mathematical. With these layers in mind, we can stipulate three desiderata concerning interpretations of gauge theories:

(D1) To avoid ontological indeterminism.
(D2) To avoid ontological commitments to quantities that are not measurable even in principle.
(D3) To avoid surplus mathematical structure that has no direct ontological correspondence.

The first two desiderata motivate an interpretation of unobservable or underdetermined theoretical concepts as structure that has no bearing on the ontology and is in that sense “surplus.” On the other hand, considerations of locality⁴ and explanatory capacity would often push in the opposite direction, supporting the indispensability of the surplus mathematical structure. In the context of space-time theories, the desiderata (D2) and (D3) are intimately related to symmetry principles aiming to bring together the symmetries of space-time and those of the dynamics (Earman, 1989). These principles can be applied in interpreting physical theories as well as in constructing them. A possible way of applying analogous principles in the context of gauge theories is by requiring that the symmetries of the dynamics coincide with the kinematical symmetries, that is, with the automorphisms of the mathematical structure taken to represent the possible physical states (Hetzroni, 2021).

Broadly speaking, interpretations of gauge theories that take gauge transformations to be physically real – that is, to relate physically distinct objects – support a realist commitment to gauge-dependent quantities. Let us call these T1 interpretations. A T1 interpretation is clearly in tension with D1 and D2, but not in conflict with D3 because, for T1, gauge transformations have ontological correspondence. In contrast, interpretations that take gauge symmetries to be a manifestation of surplus mathematical structure may no longer be in conflict with either D1 or D2, since they restrict ontological commitments to gauge-invariant quantities whose evolution is deterministic (and which may even be measurable). Let us call these interpretations T2. According to T2, gauge theories, in their standard formulations, have mathematical surplus structure, in conflict with D3.⁵

And in fact, D3, as it stands, is a matter of degree: all physical theories harbor some amount of surplus structure that is not directly measurable. After all, physics is not formulated solely in terms of – as brackets containing long conjunctions and disjunctions of – directly observable phenomena, like positions of dials and so on. And even if we weaken “measurable” to “ontological,” some amount of surplus structure – for instance, a choice of coordinates, units, and so on – will usually remain in our description.

Thus, these issues do not pertain merely to the interpretation of gauge theories, they can also motivate the reformulation and extension of existing theories that on the one hand remove superfluous structure to obtain a more parsimonious representation, or on the other hand, promote what initially seems like surplus structure to physical structure.

Therefore, to proceed, we need to separate the wheat from the chaff with regard to surplus structure, and this requires a more refined notion of the term, identifying it with theoretical or formal features that can be excised from a theory without incurring any detriment to its explanatory and pragmatic virtues. Of course, such criteria still leave open what should be counted as explanatory and pragmatic virtues, an issue that depends on one’s viewpoint and goals. Yet, on occasion the criteria are rather clear-cut, and even in more complicated realistic situations we suggest that some consensus should be pursued so as to make the criteria effective.

Here we will exemplify this point by the usage of one more criterion: locality. Thus, redundancy of representation in a theory will be counted as surplus structure if eliminating it still allows us to describe physical states via locally determined quantities, or local variables. This will be the motivation for some of the approaches presented in Section 5. The issue at hand is therefore not limited to whether gauge symmetries manifest surplus structure or not; the refined question is how to distinguish between genuinely surplus structure and that having a physical signature related to nonlocality or nonseparability of gauge physics. Answering this refined question provides a hard-and-fast criterion for a theory to meet all the desiderata D1–D3.

2.2 The Development of Gauge Theories

2.2.1 Gauge Invariance in Classical Electromagnetism

The formal property known today as gauge invariance already appeared in Maxwell’s 1856 On Faraday’s Lines of Force, in which he showed, inter alia, that the magnetic vector potential, introduced a few years earlier by William Thomson, can give rise to a unified mathematical description of different phenomena described by Faraday. This gauge invariance later allowed for the elimination of the vector potential from the equations in the modern formulation of Maxwell’s equations by Hertz and Heaviside. In classical electromagnetism the equations of motion of the fields are Maxwell’s equations:

\begin{matrix} \nabla \cdot \vec{B} & = 0 \end{matrix}

(2.1)

\begin{matrix} \nabla \times \vec{E} + \frac{\partial \vec{B}}{\partial t} & = 0 \end{matrix}

(2.2)

\begin{matrix} \nabla \cdot \vec{E} & = ρ \end{matrix}

(2.3)

\begin{matrix} \nabla \times \vec{B} - \frac{\partial \vec{E}}{\partial t} & = \vec{j} . \end{matrix}

(2.4)

The equations of motion of the particles include the Lorentz force $q (\vec{E} + \vec{v} \times \vec{B})$ derived from the fields.

This description appears to have a straightforward interpretation: the electric field $\vec{E}$ and the magnetic field $\vec{B}$ constitute the basic field ontology, and Maxwell’s equations determine their behavior. The local values of the field can be found empirically based on the action of Lorentz force on particles. This interpretation seems to yield a local understanding of the interactions and a picture of a continuous flow of energy in space through electromagnetic radiation. Yet, this theory is not free of conceptual problems.⁶ During the first half of the twentieth century these problems raised the question of whether the electric and magnetic fields are real mediators of the interaction, or merely a mathematical tool that helps physicists to keep track of it. The central alternative to the field ontology was a picture of point particles directly interacting with each other at a distance. This kind of theory was famously advocated by Wheeler and Feynman (1949) based on earlier theories.

The gauge freedom of the theory is related to a third mathematical representation, based on the electric potential $V$ and the magnetic vector potential $\vec{A}$ . This representation is particularly convenient in various kinds of physical situations (such as those involving conductors). It also has the advantage of having (2.1) and (2.2) follow as identities from the kinematics rather than as additional dynamical equations. The potentials are defined such that the fields satisfy $\vec{E} = - \nabla V - \frac{\partial \vec{A}}{\partial t}$ and $\vec{B} = \nabla \times \vec{A}$ . Yet, the potentials are underdetermined by the fields: the same magnetic field can be represented by many mathematically distinct potentials. For given potentials $\vec{A}$ and $V$ , the potentials ${\vec{A}}^{'} = \vec{A} + \nabla f$ and $V^{'} = V - \frac{\partial f}{\partial t}$ represent the same values of the fields $\vec{E}$ and $\vec{B}$ for any arbitrary smooth function of space and time $f$ .

The following transformation is therefore considered to be the gauge transformation of the theory:

\begin{matrix} \vec{A} \to \vec{A} + \nabla f, \\ V \to V - \frac{\partial f}{\partial t} . \end{matrix}

(2.5)

Under this transformation the field values do not change, and Maxwell’s equations therefore remain invariant; Maxwell’s equations possess a gauge symmetry.
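The invariance of the fields under (2.5) can be verified directly from the definitions $\vec{E} = - \nabla V - \frac{\partial \vec{A}}{\partial t}$ and $\vec{B} = \nabla \times \vec{A}$ . The short derivation below uses only two standard facts: the curl of a gradient vanishes, and partial derivatives of a smooth function $f$ commute.

```latex
\vec{B}^{\,\prime}
  = \nabla \times \left( \vec{A} + \nabla f \right)
  = \nabla \times \vec{A} + \nabla \times \nabla f
  = \vec{B} ,

\vec{E}^{\,\prime}
  = - \nabla \left( V - \frac{\partial f}{\partial t} \right)
    - \frac{\partial}{\partial t} \left( \vec{A} + \nabla f \right)
  = - \nabla V - \frac{\partial \vec{A}}{\partial t}
    + \nabla \frac{\partial f}{\partial t} - \frac{\partial}{\partial t} \nabla f
  = \vec{E} .
```

Since Maxwell’s equations (2.1)–(2.4) involve the potentials only through $\vec{E}$ and $\vec{B}$ , their invariance follows at once.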

According to the field interpretation mentioned earlier, the electric and magnetic potentials are devoid of ontological importance. Gauge invariance is therefore naturally interpreted as a manifestation of a redundancy in the way the potentials represent the physical situation, rendering them mere mathematical auxiliaries. However, later developments in quantum theory and particle physics gave three reasons to think that things may be more complicated. The first is the formal indispensability of the potentials in the theory. The second is the Aharonov–Bohm effect. (These first two reasons are described in the next subsection.) The third reason is that the property of gauge invariance can be promoted to a gauge principle, using which the equations that govern the interaction can be derived without appealing to prior knowledge of the classical limit. This method is the basis for the “symmetry dictates interaction” conception of contemporary field theories (Subsection 2.2.3). This profound significance of gauge invariance appears to many to conflict with the view that regards this invariance as a mere matter of mathematical redundancy (Section 2.3.1). This tension is further sharpened in the context of spontaneous gauge symmetry breaking (see Sections 3 and 4).

2.2.2 The Aharonov–Bohm Effect

In classical electromagnetism the potentials are nonmeasurable quantities whose local values are not well defined. They do not form an essential part of the mathematical description of the dynamics, which can be expressed in terms of the fields using Maxwell’s equations and the Lorentz force equation. Yet, there is a significant place in which they become indispensable, at least from a formal point of view, and this is the Hamiltonian (and also the Lagrangian) formulation of the theory. In quantum mechanics the Hamiltonian formulation gains fundamental significance as the generator of the temporal dynamics. The theory does not include force as a fundamental entity, but only as a derived phenomenon at a classical limit. Accordingly, it is the electric and magnetic potentials, and not the fields, which appear in the Hamiltonian and thus in the Schrödinger equation.

Aharonov and Bohm were intrigued by the question of whether this theoretical difference between the quantum and the classical can make an observational difference, and proposed an experiment in which it does. The proposed experiment is an electron interference experiment, in which a beam is split into two branches that are brought back together to form an interference pattern (Fig. 2.1).⁷ The electric and magnetic fields along the possible trajectories are zero. In addition, a conducting solenoid is placed between the two branches in an area of space that is shielded from the beam (the wave function in the neighborhood of the solenoid is zero). The current in the solenoid induces a magnetic field inside it, but the shielding foil ensures that there is no overlap between the support of the wave function in space-time and the magnetic field. The surprising fact is that in this scenario the interference pattern would depend on the magnetic field.⁸

Figure 2.1 The Aharonov–Bohm experiment.

Figure taken from Aharonov and Bohm (1959).

The dependence of the interference pattern on the field can be expressed in terms of a phase difference between the two branches that is acquired in the process. The following calculation of the phase factor emphasizes the role of gauge invariance. Let us begin with the case $\vec{A} = 0$ everywhere and at all times, which, in a particular gauge, describes the experiment conducted with a zero magnetic field (we shall also use $ϕ = 0$ in all cases). Let us denote the solutions of the Schrödinger equation for the first branch ABF and for the second branch ACF by $ψ_{1}^{0} (\vec{r}, t)$ and $ψ_{2}^{0} (\vec{r}, t)$ respectively. An interference pattern is obtained in the interference region, the overlap of the support of these two wave functions, and in this case its form is given by ${|ψ_{1}^{0} (\vec{r}, t) + ψ_{2}^{0} (\vec{r}, t)|}^{2}$ .

The experiment is described by the Hamiltonian:

H = \frac{1}{2 m} {(\vec{p} - \frac{q}{c} \vec{A})}^{2} + q ϕ .

(2.6)

It is invariant under the gauge transformation:

\vec{A} (\vec{r}, t) \to \vec{A} (\vec{r}, t) + \nabla Λ (\vec{r}, t),

(2.7)

for arbitrary smooth $Λ$ . This invariance is helpful for the calculation of the effect of the magnetic field. Let us denote by ${\vec{A}}^{*} (\vec{r}, t)$ the magnetic vector potential that describes the situation with a given nonzero magnetic flux $Φ_{B}$ . For simplicity, the choice of gauge is such that ${\vec{A}}^{*}$ is time independent and that at point A we have ${\vec{A}}^{*} = 0$ . The magnetic field along the first branch is zero. That means that there is a certain gauge transformation $Λ_{1} (\vec{r})$ such that the potential ${\vec{A}}_{1} \equiv \vec{\nabla} Λ_{1}$ corresponds to a situation with zero magnetic field everywhere (i.e. $\vec{\nabla} \times {\vec{A}}_{1} = 0$ at all places), but along the first branch ${\vec{A}}_{1} = {\vec{A}}^{*}$ . Similarly, since at any point along the second branch the magnetic field is zero and $\vec{\nabla} \times {\vec{A}}^{*} = 0$ , there exists a gauge transformation $Λ_{2} (\vec{r})$ and a corresponding potential ${\vec{A}}_{2} \equiv \vec{\nabla} Λ_{2}$ such that $\vec{\nabla} \times {\vec{A}}_{2} = 0$ at all places, but along the second branch ${\vec{A}}_{2} = {\vec{A}}^{*}$ .

Now, any electromagnetic gauge transformation (2.7) is a symmetry of the Hamiltonian (2.6) when it is accompanied by the local phase transformation $ψ \to e^{i q Λ (\vec{r}, t) / ℏ c} ψ$ . As a consequence, if we start from $\vec{A} = 0$ everywhere and apply the gauge transformation defined by $Λ_{1} (\vec{r})$ that transforms the vector potential into ${\vec{A}}_{1}$ , the wave function $ψ_{1}^{0} (\vec{r}, t)$ would be transformed into $e^{i q Λ_{1} (\vec{r}) / ℏ c} ψ_{1}^{0} (\vec{r}, t)$ . But since ${\vec{A}}^{*} = {\vec{A}}_{1}$ along the first branch, the wave function $e^{i q Λ_{1} (\vec{r}) / ℏ c} ψ_{1}^{0} (\vec{r}, t)$ would also be the wave function of the wave-packet that travels along the first branch for the case of nonzero magnetic field. Similarly, if we start from $\vec{A} = 0$ everywhere and apply the gauge transformation defined by $Λ_{2} (\vec{r})$ , the wave function $ψ_{2}^{0} (\vec{r}, t)$ would be transformed into $e^{i q Λ_{2} (\vec{r}) / ℏ c} ψ_{2}^{0} (\vec{r}, t)$ . This expression would also hold for the wave function that travels along the second branch when the magnetic field is nonzero. Therefore, in this case the interference pattern would be given by ${| e^{i q Λ_{1} (\vec{r}) / ℏ c} ψ_{1}^{0} (\vec{r}, t) + e^{i q Λ_{2} (\vec{r}) / ℏ c} ψ_{2}^{0} (\vec{r}, t) |}^{2} = {| ψ_{1}^{0} (\vec{r}, t) + e^{i q (Λ_{2} (\vec{r}) - Λ_{1} (\vec{r})) / ℏ c} ψ_{2}^{0} (\vec{r}, t) |}^{2}$ .
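The statement that only the difference $Λ_{2} - Λ_{1}$ enters the interference pattern can be checked numerically. The following is a minimal sketch, not part of the original argument: the plane-wave amplitudes, the wave number `k`, and all function names are illustrative assumptions, and the phases are passed in directly (that is, already multiplied by $q / ℏ c$).

```python
import cmath
import math


def branch_amplitudes(x, k=1.0):
    """Toy zero-field amplitudes psi_1^0 and psi_2^0 at screen position x.

    Assumption: two equal-weight plane waves crossing at opposite angles;
    they stand in for the actual wave packets of the experiment.
    """
    norm = 1.0 / math.sqrt(2.0)
    return norm * cmath.exp(1j * k * x), norm * cmath.exp(-1j * k * x)


def intensity(psi1, psi2, phase1=0.0, phase2=0.0):
    """Return |e^{i phase1} psi1 + e^{i phase2} psi2|^2."""
    total = cmath.exp(1j * phase1) * psi1 + cmath.exp(1j * phase2) * psi2
    return abs(total) ** 2


def fringe_pattern(delta_phi, xs, k=1.0):
    """Intensity on the screen for a relative phase delta_phi between the branches."""
    return [intensity(*branch_amplitudes(x, k), phase2=delta_phi) for x in xs]


if __name__ == "__main__":
    xs = [0.1 * n for n in range(10)]
    print(fringe_pattern(0.0, xs))      # zero-flux pattern
    print(fringe_pattern(math.pi, xs))  # pattern with a relative phase of pi
```

For these toy amplitudes the intensity is $1 + \cos (2 k x - Δφ)$, so a nonzero relative phase shifts the fringes rigidly while a phase common to both branches drops out.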

The preceding definitions imply that at any point along the first branch $Λ_{1} (\vec{r}) = \int_{A}^{\vec{r}} {\vec{A}}_{1} ({\vec{r}}^{'}) \cdot d {\vec{r}}^{'}$ , where the integral is taken from point $A$ along the path of the particle in the first branch. But since ${\vec{A}}_{1}$ equals ${\vec{A}}^{*}$ in that region, we get $Λ_{1} (\vec{r}) = \int_{A}^{\vec{r}} {\vec{A}}^{*} ({\vec{r}}^{'}) \cdot d {\vec{r}}^{'}$ . Similarly, along the second branch, $Λ_{2} (\vec{r}) = \int_{A}^{\vec{r}} {\vec{A}}^{*} ({\vec{r}}^{'}) \cdot d {\vec{r}}^{'}$ , with the integral now taken along the path in the second branch. In the interference region, the phase difference that is responsible for the change in the interference pattern is given by the difference between the two phases $q Λ_{1} / ℏ c$ and $q Λ_{2} / ℏ c$ , which is exactly

Δ φ_{A B} = \frac{q}{ℏ c} \oint {\vec{A}}^{*} (\vec{r}) \cdot d \vec{r},

(2.8)

where the loop integral is over the closed loop ABFCA.

While this calculation was performed in a specific gauge, its final outcome is gauge independent. According to Stokes’ theorem, the loop integral equals the magnetic flux, such that the phase difference is $Δ φ_{A B} = q Φ_{B} / ℏ c$ .
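The gauge independence of the loop integral and its equality with the enclosed flux can be illustrated with a small numerical sketch. The assumptions here are not from the text: an idealized, infinitely thin solenoid on the $z$ axis, the field-free exterior potential $\vec{A} = Φ_{B} \, \hat{φ} / 2 π r$ , a square loop in place of ABFCA, and the single-valued gauge function $Λ = x y$ . The function names are hypothetical.

```python
import math


def vector_potential(x, y, flux, gauge_shift=False):
    """Vector potential outside an idealized thin solenoid on the z axis.

    Assumption: A = flux / (2 pi r) in the azimuthal direction, which is
    curl-free away from the origin. With gauge_shift=True the gradient of
    the (illustrative) gauge function Lambda = x * y is added.
    """
    r2 = x * x + y * y
    ax = -flux * y / (2.0 * math.pi * r2)
    ay = flux * x / (2.0 * math.pi * r2)
    if gauge_shift:
        ax += y  # d(x*y)/dx
        ay += x  # d(x*y)/dy
    return ax, ay


def loop_integral(vertices, flux, gauge_shift=False, steps_per_edge=2000):
    """Midpoint-rule approximation of the line integral of A around a closed polygon."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        dx = (x1 - x0) / steps_per_edge
        dy = (y1 - y0) / steps_per_edge
        for s in range(steps_per_edge):
            xm = x0 + (s + 0.5) * dx
            ym = y0 + (s + 0.5) * dy
            ax, ay = vector_potential(xm, ym, flux, gauge_shift)
            total += ax * dx + ay * dy
    return total


if __name__ == "__main__":
    around_solenoid = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
    away_from_solenoid = [(2.0, -1.0), (4.0, -1.0), (4.0, 1.0), (2.0, 1.0)]
    print(loop_integral(around_solenoid, flux=0.7))
    print(loop_integral(around_solenoid, flux=0.7, gauge_shift=True))
    print(loop_integral(away_from_solenoid, flux=0.7))
```

In this sketch the counterclockwise loop around the solenoid should reproduce the flux in either gauge, while a loop that does not enclose the solenoid gives a vanishing integral; multiplying by $q / ℏ c$ gives the phase difference.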

At its core, the effect seems to present a dilemma between local action of the gauge-dependent potential and nonlocal action of the field. Thus, the effect gave rise to new controversies concerning both the foundations of electromagnetism and those of quantum theory, and it also substantially outlines the interpretational possibilities in the context of gauge theories. One possible conclusion is that locality and gauge-invariant ontology cannot be reconciled. A related issue is the role of topology in physics, which is often presented as the main foundational issue associated with the Aharonov–Bohm effect. The point is that it is possible to gauge away the potential in the domain of each of the two paths ABF and ACF, but not in the union of the domains, due to the fact that the latter domain is not simply connected.Footnote ⁹ This dependence of the effect on the topology is described (e.g., by Ryder, 1996) as the simplest demonstration of the rich mathematical structure associated with the vacuum of field theories. Nounou (2003) portrays this description as underlying a distinct interpretational approach to the effect, an approach that can be extended to an interpretational approach toward gauge theories. We discuss these interpretational possibilities in more detail in Section 2.3.3.

2.2.3 The Gauge Principle

In the early twentieth century the gauge invariance of classical electromagnetism was not considered as bearing any fundamental significance. An original attempt to rethink the issue was made by Weyl (1918),Footnote ¹⁰ who aimed for a geometrical unification of gravitation and electromagnetism based on an extension of Einstein’s general theory of relativity, introducing the term “gauge transformations” for the first time.Footnote ¹¹ Weyl saw great importance in the notion of locality that is expressed in general relativity in the dependence of the metric on space-time points. However, the geometry of the theory does not fully manifest the desired form of locality, due to the invariance of the inner product under parallel transport, which defines a global standard of length. Relaxing this condition introduces (in a modern terminology) a connection $ϕ$ of the local scaling group, and a corresponding local scale factor $λ$ for the transformation of the metric $g (x) \to \tilde{g} (x) = λ (x) g (x)$ . Weyl then identified the components of the connection 1-form $ϕ$ with the components of the electromagnetic four-potential. Thus, the curvature of the scaling connection was identified with electromagnetism in the same way the curvature of the Levi–Civita connection is associated with gravity. The empirical adequacy of this identification was soon criticized by Einstein and others – see O’Raifeartaigh (1997).

In Weyl’s theory the length scale acquires a nonintegrable measure factor along a trajectory in space-time. Just after the introduction of quantum mechanics, London (1927) suggested reinterpreting Weyl’s theory based on the observation that the quantum phase factor can be seen as an imaginary version of Weyl’s measure factor.

This discrepancy, an imaginary phase factor in place of Weyl’s real scale factor, suggested that the relationship revealed through the concept of gauge between electromagnetism and gravity is that of an analogy rather than a straightforward unification. This analogy was the basis of Weyl’s gauge principle – see Weyl (1929a, 1929b). In these papers Weyl presented a formulation of general relativity using tetrads (which had recently been used in Einstein’s theory of distant parallelism, rejected by Weyl) and used this formulation to emphasize similarities between the structure of gravity and electromagnetism. Weyl noted that the metric does not fully determine the tetrad: there is a freedom of local Lorentz transformations. Weyl’s theory is based on 2-spinors, whose rotation (in internal space) can be regarded as a representation of the same Lorentz group. The tetrads that define distant-parallelism thus determine not only space-time curvature and the connection but also those of the spinors’ internal space. The important point is that the choice of tetrad does not fully determine the state of the spinors, as there remains a freedom of a phase factor. By analogy to the gravitational case, this freedom should be manifested as an invariance under local phase transformations.

The transformation of the $ψ$ induced by the rotation of the tetrad is determined only up to such a factor. In special relativity we must regard this gauge-factor as a constant because here we have only a single point-independent tetrad. Not so in general relativity; every point has its own tetrad and hence its own arbitrary gauge factor; because by the removal of a rigid connection between tetrads at different points the gauge-factor necessarily becomes an arbitrary function of position.Footnote ¹²

This desired local invariance motivated the introduction of a covariant derivative that includes a local quantity $f$ (the connection term in the covariant derivative) such that the action is invariant under the transformation:

ψ \to e^{i λ (x)} ψ, \qquad f_{p} \to f_{p} - \frac{\partial λ}{\partial x_{p}} .

(2.9)

Weyl then notes that the resultant $f$ term in the action is identical to “the manner $\dots$ that the electromagnetic potential interacts with matter according to experiment. This justifies the identification of the quantities $f_{p}$ introduced here with the electromagnetic potentials.” Weyl then identifies the electromagnetic field $f_{p q} = \frac{\partial f_{q}}{\partial x_{p}} - \frac{\partial f_{p}}{\partial x_{q}}$ . He further notes the connection of the transformation to conservation of electric charge and stresses the analogy to the connection between conservation of momentum and angular momentum in general relativity and the invariance under local Lorentz transformations (rotation of the tetrads in space-time). Weyl thus regards the gravitational interaction and the electromagnetic interaction as manifesting the same new principle of gauge invariance (see also Weyl, 1929b).

Weyl’s gauge terminology with respect to the electromagnetic interaction was embraced by Pauli (1941, 1980) in his influential writings on quantum physics. It allowed for a reconstruction of the electromagnetic interaction in a quantum context from simple principles that do not appeal to classical electromagnetism. For example, the electromagnetic interaction term is introduced into the free Dirac equation $i γ^{μ} \partial_{μ} ψ - m ψ = 0$ using the gauge principle by replacing the derivative $\partial_{μ}$ with a gauge covariant derivative $D_{μ} = \partial_{μ} + i e A_{μ}$ . The resulting equation $i γ^{μ} (\partial_{μ} + i e A_{μ}) ψ - m ψ = 0$ is invariant under local gauge transformations

ψ \to e^{- i λ (x)} ψ, \qquad A_{μ} \to A_{μ} + \frac{1}{e} \partial_{μ} λ (x) .

(2.10)
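The covariance property behind (2.10), namely that $D_{μ} ψ$ picks up the same local phase as $ψ$ while the plain derivative does not, can be verified numerically in one dimension. The sketch below uses central finite differences; the particular functions chosen for $ψ$ , $A$ and $λ$ , the coupling value, and all names are illustrative assumptions.

```python
import cmath
import math

E = 0.7  # illustrative value of the coupling e (an assumption)


def psi(x):
    """Toy matter field (assumption: any smooth complex function would do)."""
    return cmath.exp(-x * x) * (1 + 0.5j * x)


def potential(x):
    """Toy one-dimensional gauge potential A(x)."""
    return math.sin(x)


def lam(x):
    """Toy local gauge parameter lambda(x)."""
    return 0.3 * x * x


def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def covariant_derivative(field, a, x):
    """D psi = psi' + i e A psi, the 1D analogue of D_mu = d_mu + i e A_mu."""
    return derivative(field, x) + 1j * E * a(x) * field(x)


def psi_transformed(x):
    """psi -> exp(-i lambda) psi, as in (2.10)."""
    return cmath.exp(-1j * lam(x)) * psi(x)


def potential_transformed(x):
    """A -> A + (1/e) d lambda / dx, as in (2.10)."""
    return potential(x) + derivative(lam, x) / E


def covariance_defect(x):
    """|D'psi' - exp(-i lambda) D psi|; should vanish up to discretization error."""
    lhs = covariant_derivative(psi_transformed, potential_transformed, x)
    rhs = cmath.exp(-1j * lam(x)) * covariant_derivative(psi, potential, x)
    return abs(lhs - rhs)


def plain_derivative_defect(x):
    """Same comparison for the ordinary derivative; generally nonzero."""
    lhs = derivative(psi_transformed, x)
    rhs = cmath.exp(-1j * lam(x)) * derivative(psi, x)
    return abs(lhs - rhs)


if __name__ == "__main__":
    print(covariance_defect(0.8), plain_derivative_defect(0.8))
```

The nonvanishing defect of the plain derivative is the inhomogeneous term $- i (\partial λ) ψ$ , which the shift of $A_{μ}$ is designed to cancel.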

The analogy with the electromagnetic case motivated the successful attempt of Yang and Mills (1954) to localize the $S U (2)$ isospin symmetry. The theory involves two Dirac spinors of equal mass that can be described using the Lagrangian $ℒ = i \bar{ψ} γ^{μ} \partial_{μ} ψ - \bar{ψ} m ψ$ with $ψ \equiv \begin{pmatrix} ψ_{1} \\ ψ_{2} \end{pmatrix}$ . It is initially invariant under a global $S U (2)$ isospin symmetry. The Yang–Mills field $B_{μ}$ (a $2 \times 2$ matrix) is similarly introduced by replacing the derivative with a corresponding covariant derivative that renders the theory invariant under local $S U (2)$ transformations.

The significance of such gauge transformations is stressed through the fact that the invariance of the Lagrangian (or rather of the action) under some Lie group implies, via Noether’s famous 1918 theorems (Kosmann-Schwarzbach, 2022, 2011), the conservation of associated physical charges. This was actually very significant in fueling Weyl’s enthusiasm regarding gauge invariance, from the very inception of the concept in his 1918–19 papers.Footnote ¹³ This gives rise to an elegant symmetry argument “explaining” why the electric charge is conserved, in the same way that the energy-momentum tensor is conserved as a consequence of coordinate invariance.Footnote ¹⁴

Gauge theories of the Yang–Mills type were soon recognized as easy to renormalize. This provided a central motivation to pursue this line (Brown, 1993; Kibble, 2015). Thus, while Yang and Mills’s original theory failed to describe the strong nuclear interaction it was trying to account for, the method it presented soon became a template used in the construction of the theory of electroweak interactions (see Chapter 3) as a gauge theory of the group $S U (2) \times U (1)$ , and later also in the construction of the theory of the strong interaction. The development of the standard model was in this sense based on applying the same pattern of achieving local invariance by introducing gauge fields (Mills, 1989). A retrospective description of these developments is given by O’Raifeartaigh (1997, emphasis in original):

invariance with respect to the local symmetry forces the introduction of the vector fields $A_{μ} (x)$ and determines the manner in which these fields interact with themselves and with matter. The fields $A_{μ} (x)$ turn out to be just the well-known radiation fields of particle physics, namely, the gravitational field, the electromagnetic field, the massive vector meson fields $Z^{0}$ , $W^{\pm}$ of the weak interactions and the coloured gluon fields $A_{μ}^{c}$ of the strong interactions. Thus gauge symmetry introduces all the physical radiation fields in a natural way and determines the form of their interactions, up to a few coupling constants. It is remarkable that this variety of physical fields, which play such different roles at the phenomenological level, are all manifestations of the same simple principle and even more remarkable that the way in which they interact with matter is prescribed in advance. It is not surprising, therefore, to find that the covariant derivative has a deep geometrical significance. [ $\dots$ ] the [local gauge groups] $G (x)$ are identified as sections of principal fiber-bundles and the radiation fields $A_{μ} (x)$ are mathematical connections.

Gauge theories obtained their modern geometric formulation from the late 1950s and on. In this formulation gauge fields are represented by differential forms that define connections of principal bundles: These are spaces for which to each point of space-time is attached a copy of an “internal” homogeneous space (the fiber) whose group of motions is the symmetry Lie group $G$ of the gauge theory – for example, in Maxwell’s theory, the fiber at each point is a circle along which one moves with the group $U (1)$ of rotations. The concept of gauge is often understood as being beyond a heuristic one, as a basic framework for fundamental field theories. This terminology and its application to the formulation of gauge theories is presented in Appendix A.

The gauge heuristics can be described in this language in the following way. Usually, one starts from a theory of free matter fields (without mutual interactions), say $ψ$ , which is only invariant under the action of the rigid, or “global,” Lie group $G$ : that is, $ψ \mapsto ψ^{g} := ρ (g)^{- 1} ψ$ , for $g \in G$ . This allows one to use their simple exterior derivatives to build the kinetic terms, which indeed transform likewise, $d ψ \mapsto d ψ^{g} = ρ (g)^{- 1} d ψ$ : for example, the free Dirac Lagrangian $L_{Free} (ψ) = 〈 ψ, d ψ 〉 - m 〈 ψ, * ψ 〉$ . Noticing then that the latter is not invariant under the substitution $g \to γ \in \mathcal{G}$ (with $\mathcal{G}$ the group of local, i.e. space-time-dependent, $G$ -valued transformations), so that $L_{Free} (ψ)$ does not enjoy a local symmetry, one looks for the simplest modification of the free theory that does. This motivates the introduction of the potential $A$ minimally coupled to $ψ$ , together with its transformation (A-3), which compensates the inhomogeneous term arising from $d ψ^{γ} = d (ρ (γ)^{- 1} ψ)$ . The rule of thumb is then that, in the Lagrangian, one replaces exterior derivatives by covariant exterior derivatives $d \mapsto D := d + ρ_{*} (A)$ – thus named because $D ψ$ gauge transforms like $ψ$ : that is, $L_{Free} (ψ) \mapsto L_{Dirac} (ψ, A) = 〈 ψ, D ψ 〉 - m 〈 ψ, * ψ 〉$ . The result is a theory of coupled matter fields whose interactions are mediated by a gauge potential $A$ .

At this point, this gauge potential is still external as it has no dynamics of its own. To make it fully dynamical, one needs only to add a $G$ -invariant kinetic term involving a tensorial field built from $A$ : the most natural candidate is the field strength $F = d A + A A$ (as defined in Appendix A) and the simplest Lagrangian is the YM term in (A-8). The final theory $L_{Final} (ψ, A) = L_{YM} (A) + L_{Dirac} (ψ, A)$ is a $G$ -gauge theory of dynamical matter fields interacting via a dynamical gauge field.

It appears that by localizing the symmetry, $G \mapsto \mathcal{G}$ (from the rigid group $G$ to the group $\mathcal{G}$ of local transformations), we have switched on the interaction field and transformed a free theory into a coupled theory, $L_{Free} (ψ) \mapsto L_{Final} (ψ, A)$ . For $G = U (1)$ , one ends up with the Lagrangian for QED, $L_{Final} (ψ, A) = L_{QED} (ψ, A)$ , and for $G = S U (3)$ one ends up with the QCD Lagrangian $L_{QCD} (ψ, A)$ . For the physics community, the gold standard is the fact that the gauge field theories for the nongravitational interactions can be combined and quantized to give a renormalizable QFT (see Section 2.2.4), the Standard Model (SM), with gauge group $U (1) \times S U (2) \times S U (3)$ , whose quantitative predictions agree with experiment to an impressively high degree of precision.

In addition to the gauge symmetries in particle physics, the analogy with gravitation continued to play a role in various works aiming to go beyond general relativity in the description of gravity, or to provide a unified framework for all interactions. Ryoyu Utiyama (1956) fully articulated the modern understanding of the gauge argument in terms of “gauging” a global symmetry using a Lie group to obtain a gauge theory.Footnote ¹⁵ Utiyama applied his approach also to general relativity, showing for the first time that general relativity can be recovered from a gauge principle applied to the rigid Lorentz group.

The theory of Brans and Dicke (1961) generalized general relativity based on conformal transformations, similar to the ones presented in Weyl (1918). After the success of gauge theoretic approaches to the nuclear interactions in the mid to late 1960s and early 1970s, and after the inception of supersymmetry, explorations of gauge theories of gravity and supergravity blossomed in the second half of the 1970s (see Scholz, 2020 for a historical review and Blagojević, Hehl, & Kibble, 2013 for a detailed review of the theories). These theories describe gravity using different non-Riemannian geometries introduced through the process of gauging different groups larger than the Lorentz group. The aim is to bridge the language gap with the description of the other interactions so as to facilitate either their unification or the quantization of gravity.

In this framework, the gauge potentials of gravity were for a long time thought about in the same terms as Ehresmann connections on fiber bundles that describe nongravitational interactions. Only quite recently was it recognized that gauge gravity is better understood in terms of the geometry of Cartan connections on principal bundles; see, for example, Wise (2009, 2010). Cartan geometry is indeed the natural generalization of (pseudo) Riemannian geometry, as introduced by É. Cartan in the 1920s, and the direct precursor of the notion of connection introduced in the late 1940s by C. Ehresmann (who was a pupil of É. Cartan). See Cap and Slovak (2009) and Sharpe (1996) for modern introductions.

Interpretive issues surrounding gauge symmetries are thus pressing for gravitational and nongravitational theories alike. But we will focus on interpreting the gauge principle for nongravitational theories. In that respect, we will describe conceptual/philosophical aspects in Section 2.3.1.

2.2.4 Quantization

The advent of quantum field theory necessitated a substantial reshaping of the understanding of gauge symmetries. In particular, the role of gauge transformations moves from a transformation of solutions of the equations of motion to an integral part of the definition of the quantum theory. This also modifies, to some extent, how gauge transformations are perceived in a quantum field theory, as will be outlined here.

In principle, quantizing a gauge theory proceeds much as for nongauge theories, but for a few subtleties. For the purpose of illustration, this will be done here using a path-integral approach. We select the vector potential as the integration variable.Footnote ¹⁶

A way (Böhm, Denner, & Joos, 2001) to understand the origin of the ensuing subtleties when integrating over the vector potential is to note that any gauge transformation leaves the action invariant, and, as a shift, also does not influence the measure. Hence, there are flatFootnote ¹⁷ directions of the path integral, and thus the integral diverges when integrating along these directions.

There are only a few possibilities to deal with these divergences. One is to perform the quantization on a discrete space-time grid in a finite volume. In this way the divergences become controllable and can be removed before taking the limit to the original theory (Montvay & Münster, 1994).

Another one is to transform to manifestly gauge-invariant variables. This would be achieved by a variable transformation to the dressed fields of Section 5. However, in most cases, this has thus far been found to be practically impossible.

The third option is to remove the divergences by fixing a gauge (Böhm et al., 2001). This is achieved by sampling every gauge orbit only partially in such a way that the result is finite while gauge-invariant quantities are not altered. Even though gauge-variant information is removed, this is not equivalent to introducing a gauge-invariant formulation. Any gauge condition will define one distinct way of removing the superfluous degrees of freedom, but what is removed and what is left differs for every choice of gauge.

As gauge fixing plays a central role in contemporary particle physics, as well as in Section 6, it is worthwhile to detail it further. Gauge fixing proceeds by selecting a gauge condition $C_{Λ}$ , which may involve the gauge field as well as any other fields, and which can be parametrized by some quantity or function $Λ$ in an arbitrary way. A necessary requirement on this gauge condition is that every gauge orbit has at least one representative fulfilling it. We will furthermore assume, for reasons of practicality, that there is only one such representative. In non-Abelian gauge theories, the Gribov–Singer ambiguity makes this requirement very involved to implement and formulate (Gribov, 1978; Singer, 1978), even to the point of practical impossibility (Lavelle & McMullan, 1997; Maas, 2013; Vandersickel & Zwanziger, 2012). Conceptually, however, this will not be an issue for now.

The procedure for determining the amplitude of any gauge-invariant observable $f$ goes as follows:

\frac{1}{N} \int_{Ω} D A_{μ} f (A_{μ}) \exp (i S [A_{μ}])

(2.11)

= \frac{1}{N} \int_{Ω / Ω_{C}} D g \int_{Ω_{C}} D A_{μ} Δ [A_{μ}] δ (C_{Λ}) f (A_{μ}) \exp (i S [A_{μ}])

(2.12)

= \frac{1}{N'} \int_{Ω} D A_{μ} Δ [A_{μ}] f (A_{μ}) δ (C_{Λ}) \exp (i S [A_{μ}])

(2.13)

= \frac{1}{N''} \int_{Ω_{C}} D A_{μ} Δ [A_{μ}] f (A_{μ}) \exp (i S [A_{μ}]),

(2.14)

in which possible further fields are suppressed. The original expression (2.11) is an integral over the whole set $Ω$ of all gauge orbits and all representatives on every gauge orbit. In (2.12) this set is split into the set $Ω_{C}$ , which contains for all orbits only the representatives fulfilling $C_{Λ}$ , and the remainder $Ω / Ω_{C}$ . This allows us to write the integration along gauge orbits $g$ and over gauge orbits separately. This separation requires the introduction of a $δ$ -function on the gauge condition and an additional weight factor, the Faddeev–Popov determinant $Δ$ . The latter ensures that the weight of the representative is the same for all gauge orbits, and thus gauge-invariant, ensuring $⟨ 1 ⟩ = 1$ . Then the integrals along the gauge orbits are orbit-independent and can be absorbed in the normalization in (2.13). Finally, resolving the $δ$ -function in (2.14) yields the gauge-fixed path integral, at the cost of yet another normalization. This approach is standard for perturbation theory (Böhm et al., 2001).Footnote ¹⁸ As noted earlier, beyond perturbation theory the Gribov–Singer ambiguity makes this procedure cumbersome in practice. This manifests itself in the form and properties of both the Faddeev–Popov determinant $Δ$ and the gauge condition $C_{Λ}$ (Lavelle & McMullan, 1997; Maas, 2013; Vandersickel & Zwanziger, 2012).
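The logic of (2.11)–(2.14) can be illustrated with a finite-dimensional toy model; this is a sketch under stated assumptions and not the field-theoretic construction itself. The assumptions are: the plane $\mathbb{R}^{2}$ replaces the space of gauge fields, rotations about the origin play the role of gauge transformations (so the orbits are circles), a real Euclidean weight $e^{- r^{2}}$ replaces $\exp (i S)$ , the gauge condition selects the positive $x$ axis, and the Faddeev–Popov factor is the polar Jacobian $r$ . All function names are hypothetical.

```python
import math


def weight(r2):
    """Toy 'exp(-S)' depending only on the gauge-invariant radius squared."""
    return math.exp(-r2)


def full_average(observable, half_width=6.0, n=400):
    """<observable> from the unfixed integral over the whole plane (analogue of 2.11)."""
    h = 2.0 * half_width / n
    num = den = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            r2 = x * x + y * y
            w = weight(r2)
            num += observable(r2) * w
            den += w
    return num / den


def gauge_fixed_average(observable, use_fp_determinant=True, r_max=6.0, n=4000):
    """<observable> from the gauge-fixed radial integral (analogue of 2.14).

    The orbit volume 2*pi cancels between numerator and normalization.
    """
    h = r_max / n
    num = den = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        fp = r if use_fp_determinant else 1.0  # toy Faddeev-Popov factor
        w = fp * weight(r * r)
        num += observable(r * r) * w
        den += w
    return num / den


if __name__ == "__main__":
    radius_squared = lambda r2: r2  # a gauge-invariant observable
    print(full_average(radius_squared))
    print(gauge_fixed_average(radius_squared))
    print(gauge_fixed_average(radius_squared, use_fp_determinant=False))
```

In this toy model the gauge-fixed average with the factor $r$ should agree with the unfixed one, whereas dropping the factor weights the orbits incorrectly and changes the gauge-invariant result.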

In principle, the Faddeev–Popov determinant is a simple functional extension of the following argument: for a function with a single root, $f (x_{o}) = 0$ , the Dirac delta function obeys the following identity:

δ (f (x)) = \frac{δ (x - x_{o})}{| \det f' (x_{o}) |} .

(2.15)

Since the integral of $δ (x - x_{o})$ over $x$ gives unity, then

| \det f' (x_{o}) | \int d x \, δ (f (x)) = 1.

In the functional setting, the analogous identity reads

| \det F' (φ_{o}) | \int D φ \, δ (F (φ)) = 1.

(2.16)

Usually the field $φ$ is a gauge parameter, with (2.16) determining the factor for a gauge fixing.
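The one-dimensional identity (2.15) can be checked numerically by replacing the $δ$ -function with a narrow normalized Gaussian. In the sketch below, the cubic test function, the width `eps`, the integration window, and the function names are illustrative assumptions.

```python
import math


def narrow_gaussian(u, eps):
    """Normalized Gaussian of width eps, standing in for delta(u)."""
    return math.exp(-u * u / (2.0 * eps * eps)) / (eps * math.sqrt(2.0 * math.pi))


def f(x):
    """Test function with a single real root at x_o = 1 (an assumption)."""
    return x ** 3 + x - 2.0


def f_prime(x):
    """Derivative of the test function."""
    return 3.0 * x * x + 1.0


def integral_of_delta_of_f(eps=1e-3, lo=0.0, hi=2.0, n=200000):
    """Midpoint-rule approximation of the integral of delta(f(x)) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        total += narrow_gaussian(f(lo + (i + 0.5) * h), eps)
    return total * h


if __name__ == "__main__":
    x_root = 1.0
    # The product below is the left-hand side of the identity following (2.15).
    print(abs(f_prime(x_root)) * integral_of_delta_of_f())
```

The printed product should be close to unity, with deviations controlled by the width of the Gaussian and the grid spacing.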

Geometrically, the Faddeev–Popov determinant emerges as a functional Jacobian for a change of variables, since one now decomposes, as in Fubini’s theorem, a functional integration over all variables into an integration over a gauge-fixing section and an integration over the orbits (see e.g. Babelon & Viallet, 1979 and Mottola, 1995). Of course, such a decomposition is also vulnerable to the Gribov ambiguity, and thus would not be available nonperturbatively. Nonetheless, this interpretation leads straightforwardly to the Faddeev–Popov determinant as follows: for the transformation $A_{μ} \to A_{μ}^{C g}$ , where $A_{μ}^{C}$ is the gauge-fixed connection, we obtain the respective Jacobian for the measure $D A \to \det (J) D A^{C} D g$ .Footnote ¹⁹

It is an interesting question to ask what happens if one tries to calculate a gauge-dependent amplitude. In fact, if one inserts a gauge-dependent $f$ into (2.11), the answer is always zero, up to some $δ$ -functions at coinciding arguments. Of course, such a calculation requires a method like the lattice regularization. The reason is that for any gauge field configuration with some value $A_{μ} (x_{0})$ at the fixed position $x_{0}$ , there exists a gauge transformation, which is only nonvanishing at $x_{0}$ , such that the value of the gauge-transformed gauge field is $- A_{μ} (x_{0})$ . In this way, any integration over the full gauge group yields zero. The only exception can happen if arguments coincide, yielding squares of the fields. On the other hand, evaluating a gauge-dependent quantity $f$ using (2.14) yields very nontrivial results.

The reason for this apparent disagreement is the step from (2.12) to (2.13). Here, the integral $\int D g$ was absorbed in the normalization, because none of the remaining expressions depended on it, since all of them were gauge-invariant. This is no longer true, if $f$ is gauge-dependent. Then the integral over gauge transformations can no longer be separated as a factor and be removed.
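The contrast between the unfixed and the gauge-fixed treatment of a gauge-dependent quantity can again be made concrete in a toy model. The assumptions are the same kind as before and are not from the text: points of the plane stand in for field configurations, global rotations stand in for gauge transformations, the weight is $e^{- r^{2}}$ , the gauge slice is a ray at polar angle `theta_gauge` with Faddeev–Popov factor $r$ , and the coordinate $x$ serves as the gauge-dependent observable. The function names are hypothetical.

```python
import math


def gauge_fixed_average(observable, theta_gauge, r_max=6.0, n=2000):
    """<observable> on the gauge slice theta = theta_gauge (analogue of 2.14)."""
    h = r_max / n
    num = den = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        w = r * math.exp(-r * r)  # toy Faddeev-Popov factor times weight
        x = r * math.cos(theta_gauge)
        y = r * math.sin(theta_gauge)
        num += observable(x, y) * w
        den += w
    return num / den


def unfixed_average(observable, n_theta=72):
    """Average over all slices, i.e. also integrating along the orbits (analogue of 2.11)."""
    thetas = [2.0 * math.pi * k / n_theta for k in range(n_theta)]
    return sum(gauge_fixed_average(observable, t) for t in thetas) / n_theta


if __name__ == "__main__":
    x_coordinate = lambda x, y: x               # gauge-dependent
    radius_squared = lambda x, y: x * x + y * y  # gauge-invariant
    print(unfixed_average(x_coordinate))
    print(gauge_fixed_average(x_coordinate, 0.0))
    print(gauge_fixed_average(x_coordinate, math.pi))
    print(gauge_fixed_average(radius_squared, 0.0), gauge_fixed_average(radius_squared, 1.3))
```

In this sketch the gauge-dependent observable averages to zero without gauge fixing and takes different nonzero values on different slices, while the gauge-invariant observable is the same on every slice.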

Thus, from a purely mathematical point of view, the expressions (2.11)–(2.12) and (2.13)–(2.14) are distinct theories. From the point of view of physics, there is just an infinite number of equivalent quantum theories, the one without gauge fixing and the infinitely many choices of $Ω_{C}$ , or equivalently $C_{Λ}$ , which all yield the same gauge-invariant observables, but differ for gauge-dependent ones.

Alternatively, this can also be taken to imply that any choice of theory with the action $S' = S - i \ln Δ$ gives the same gauge-invariant quantities, provided they are integrated over the corresponding set $Ω_{C}$ , either directly implemented as integration range or by a $δ$ -function. In either way, this leads ultimately to the expression (2.14).

This infinite degeneracy of quantum theories is a consequence of working with gauge freedom. This problem of infinite degeneracy would vanish if a transformation to gauge-invariant variables could be performed, as is aimed at in the dressing-field approach in Section 5. Alternatively, an approach like the one in Section 6 will take all these theories to be equivalent by defining the (gauge-invariant) observables to be only the ones that are the same for all. This yields the same observable result as the lattice regularization.

2.3 Interpreting Gauge Theories

2.3.1 The Gauge Principle Meets Philosophy

In the late twentieth century, gauge symmetries began to attract the attention of philosophers. Many of them identified the gauge principle as a prominent example of a fundamental shift in the methodology of theoretical physics, intertwined with the elevation of abstract mathematics. Steiner (1989; 1998) raised the issue in the context of the question of the applicability of mathematics to natural science. He regarded the gauge argument (especially in the version of Yang and Mills) as a Pythagorean analogy, namely one that can only be expressed in a mathematical language and is not based on a physical similarity. Such mathematical analogies, according to Steiner, are motivated by human values (such as aesthetics) that guide the development of mathematics, and their repeated success in physics puts physics at odds with naturalism.

A closely related issue is the apparent conflict between the standard understanding of gauge symmetries as a matter of a choice of convention and their great methodological significance. Yang and Mills described gauge freedom in terms of local conventions:

The difference between a neutron and a proton is then a purely arbitrary process. As usually conceived, however, this arbitrariness is subject to the following limitation: once one chooses what to call a proton, what a neutron, at one space-time point, one is then not free to make any choices at other space-time points. It seems that this is not consistent with the localized field concept that underlies the usual physical theories.

(p. 192)

This view of gauge invariance became part of the received view, in part due to Reference WignerWigner (1967a), who emphasized the distinction between dynamical symmetries, such as gauge symmetries, and other symmetries such as Lorentz transformations. A philosophical view that adopts this approach was presented by Reference AuyangAuyang (1995) in her book that aimed to present quantum field theory in a Kantian categorical framework of objective knowledge. Reference TellerTeller (1997, Reference Teller2000) criticized this view, raising the question of “[h]ow can an apparently substantive conclusion follow from a fact about conventions?” (2000, p. 469).

This tension between the significance of gauge invariance as an essential part of the basis for the formulation of successful theories, and the appearance of gauge as essentially a manifestation of mathematical redundancy was dubbed by Reference Redhead, Kuhlmann, Lyre and WayneRedhead (2002) as “the most pressing problem in current philosophy of physics” (p. 299) and has been the subject of extensive philosophical inquiry. Many (Reference Martin, Brading and CastellaniMartin, 2003; Reference Norton, Brading and CastellaniNorton, 2003) have pointed out the similarity of the question to the controversial issue of the role of general covariance in general relativity (Reference NortonNorton, 1993), in which a mathematical constraint on the formulation of the theory is seen as fundamental to the construction of the theory.

One possible way to approach this question is to adopt a deflationary approach toward the gauge principle. Brown (1999) noted that neither the curvature of the fiber bundle nor the back-reaction of the matter on the gauge field can be explained by the principle. Martin (2003) noted that in the gauge argument, the gauge principle is applied in conjunction with other principles such as Lorentz invariance, renormalizability, and simplicity. The requirement for a local symmetry does not by itself determine the form of the interaction, nor does it dictate its existence. While this criticism does amend misconceptions that appear in some textbooks, it does not claim to fully resolve the foundational and philosophical questions.

A recent interpretation of the gauge principle focuses on its methodological use in relation to Noether’s theorem: in Gomes et al. (2021), it is argued that, when constructing theories containing charges that are sources of fields that have their own dynamics – as happens in all known field theories – we have no way of ensuring that the conservation of charges is compatible with the fields’ dynamics, other than by explicit computation and trial and error. The exception is the case in which the conservation of charges is associated with a rigid symmetry, through Noether’s first theorem. In that case, Gomes et al. (2021) show that converting that rigid symmetry into a malleable one enforces the compatibility between charge conservation and the dynamics of the corresponding fields. Thus, just as the GP guarantees that the equations governing the evolution of (for example) electromagnetic fields are compatible with the conservation of electric charge, so it would guarantee that the dynamics of any field are compatible with the conservation of its sources.

Other recent approaches seek answers in the differential geometry of fiber bundles that underlies (classical) gauge field theory. (The valiant reader will find a short and dense overview of bundle geometry in Appendix B.) In this framework, gauge transformations arise in a manner very much analogous to coordinate changes in general relativistic physics: gauge transformations are seen to be generalized coordinate changes for an “enriched” space-time (the fiber bundle) whose points have an internal structure (the fibers). So, the gauge principle could be understood, like the principle of general covariance, as a principle of democratic epistemic access to the intrinsic geometry of an enriched space-time whose points are not structureless, and whose internal structure can be “probed” by the fundamental fields it contains.Footnote ²⁰ Embracing this as a satisfying explanation of the heuristic power of the GP means taking the mathematics seriously enough to accept that it may refer to an actual physical entity, therefore entertaining some degree of commitment to the ontological character of the fiber bundle. See, for example, Catren (2022) for a recent defence of this view.

Other approaches aim to relate the gauge heuristics to a more explicit physical content. One way to do so is by appealing to a notion of direct observability of gauge symmetries, in which the symmetry is restricted to observations performed on a certain subsystem (Section 2.3.4). A different approach (Lyre, 2000, 2001, 2003; Mack, 1981) acknowledges gauge covariance as a formal requirement by itself and aims to account for its applicability by supporting it with an additional equivalence principle, formulated by analogy with Einstein’s presentation of general relativity based on general covariance and the equivalence principle. The principle roughly states that there is always a choice of gauge such that the gauge field vanishes at a given point.
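To make the last statement concrete in the simplest case, the following is a short worked sketch for an Abelian gauge field. It assumes that the gauge transformation takes the standard form $A_\mu \to A_\mu + \partial_\mu \chi$ (an assumption about sign conventions); the reference point $x_0$ and the gauge function $\chi$ are introduced here purely for illustration.

```latex
\begin{align}
  \chi(x) &= -A_\mu(x_0)\,(x - x_0)^\mu , \\
  \partial_\mu \chi(x) &= -A_\mu(x_0) , \\
  A'_\mu(x_0) &= A_\mu(x_0) + \partial_\mu \chi(x_0) = 0 .
\end{align}
```

The field strength $F_{\mu\nu}$ is unaffected by this choice, so only the gauge field, and not its curvature, can be made to vanish at $x_0$. This mirrors general relativity, where the Christoffel symbols, but not the Riemann curvature, can be made to vanish at a point.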

Applying a generalized version of the equivalence principle can be beneficial in understanding gauge not only at the interpretational level but also at the level of heuristics. The “methodological equivalence principle” prescribes the introduction of an interaction based on the way the dynamical law of a theory violates the invariance requirement. It provides a unified framework for understanding standard gauge theories, the use of general covariance in general relativity, and tangent-space symmetries in gauge theories of gravity (Hetzroni, 2020; Hetzroni & Read, 2023; Hetzroni & Stemeroff, 2023). What calls for an explanation is actually the local noninvariance of interaction-free theories (special relativity under general coordinate transformations, the Dirac equation under a change of local phase convention, etc.). The introduction of the gauge field (and similarly of the gravitational field) comes to provide a physical explanation. This approach is primarily motivated by existing evidence (supporting the locality of the preferred representations), but it can also motivate an interpretation of gauge theories as grounded in an ontology of relational quantities, in harmony with a recent account by Rovelli (2014), and stressing a structural similarity between the gauge argument and Mach’s principle (Hetzroni, 2021). The origin of the gap between gauge-independent physical phenomena and gauge-dependent theoretical representation is that while fundamental physical degrees of freedom amount to relations between pairs of physical entities, our theories use variables that refer to single objects, and not their relations.

2.3.2 Interpretation of Gauge Invariance

The appearance of a gauge symmetry in a given theory may have crucial implications for its possible interpretations and its ontology. The question of whether gauge symmetries reflect descriptive mathematical redundancy or some physical property is deeply entwined with issues of locality, separability, and determinism. Reconciling such theoretical virtues with each other (and in particular recall (D1)–(D3) introduced in Section 2.1.2) is a nontrivial matter.

Let us now introduce some terminology. We shall do so using the case study of electromagnetism (see Rickles, 2008: 48). Potentials related by the gauge transformation (2.5) (i.e., potentials that lead to the same fields) are gauge-equivalent. Gauge-equivalent configurations lie on the same gauge orbit. Accordingly, all potentials related by a gauge transformation are on the same gauge orbit. Our fields $\vec{E}$ and $\vec{B}$ remain unchanged under the gauge transformation (2.5), so they are gauge-invariant. The 4-vector potential $A^{\mu}$, namely the gauge field, is gauge-variant. Many mathematically distinct potentials lie on the same gauge orbit and lead to the same field. The fact that it does not matter which vector potential from a given gauge orbit we choose to express the fields is referred to as gauge freedom.
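The gauge invariance of the fields can be checked in a few lines. The sketch below assumes that the transformation (2.5) has the standard form in Gaussian units, with $V$ denoting the scalar potential and $\chi$ an arbitrary smooth function; these symbols and sign conventions are our assumptions and may differ from those used elsewhere in the text.

```latex
\begin{align}
  V' &= V - \frac{1}{c}\,\partial_t \chi , &
  \vec{A}' &= \vec{A} + \nabla \chi , \\
  \vec{B}' &= \nabla \times \vec{A}'
            = \nabla \times \vec{A} + \nabla \times \nabla \chi
            = \vec{B} , \\
  \vec{E}' &= -\nabla V' - \frac{1}{c}\,\partial_t \vec{A}'
            = \vec{E} + \frac{1}{c}\,\nabla \partial_t \chi - \frac{1}{c}\,\partial_t \nabla \chi
            = \vec{E} .
\end{align}
```

Every choice of $\chi$ thus labels a different point on the same gauge orbit, while $\vec{E}$ and $\vec{B}$ take the same values at every point of that orbit.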

Gauge freedom implies that our gauge theory is mathematically underdetermined: the values of the gauge field throughout space at a given moment of time, together with the Maxwell equations, do not determine the potentials at other times. In some interpretations, this underdetermination corresponds to a form of indeterminism. If we have a point $p_{0}$ on phase space as our starting point,Footnote ²¹ the system does not evolve into a state represented by a unique phase point $p_{t}$; instead “we now have an indeterministic time-evolution, with a unique $p_{t}$ replaced by a gauge orbit” (Redhead, 2002, 289). This is to say that if we take a gauge-variant quantity such as the vector potential to be physically real, and mathematically distinct vector potentials to be physically distinct, then not only is there no way to determine the state empirically, the state is also underdetermined by the equations of motion.
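The underdetermination can be illustrated with a standard construction, sketched here under the assumption that the field equations depend on the potential only through $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. The time $t_0$ and the gauge function $\chi$ are illustrative symbols, not taken from the text.

```latex
\begin{align}
  &\text{Let } A_\mu \text{ solve the field equations, and choose a smooth } \chi
    \text{ with } \chi(t, \vec{x}) = 0 \text{ for } t \le t_0 . \\
  &A'_\mu = A_\mu + \partial_\mu \chi
    \quad\Longrightarrow\quad
    F'_{\mu\nu} = F_{\mu\nu} + \partial_\mu \partial_\nu \chi - \partial_\nu \partial_\mu \chi
                = F_{\mu\nu} , \\
  &A'_\mu = A_\mu \ \text{ for } t \le t_0 ,
    \qquad
    A'_\mu \neq A_\mu \ \text{ wherever } \partial_\mu \chi \neq 0 \text{ at } t > t_0 .
\end{align}
```

Both potentials satisfy the same equations and agree up to $t_0$, yet they differ afterwards, while assigning the same $\vec{E}$ and $\vec{B}$ at all times. This is the sense in which the initial data fix a gauge orbit rather than a unique potential.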

Of course, this problem of indeterminism vanishes if one’s realism is constrained to gauge-invariant quantities, namely by identifying all gauge-equivalent states with one physical state, an assumption that is often referred to as Leibniz equivalence (see, for example, Saunders, 2003). We may therefore make a very broad distinction between two kinds of interpretations of gauge theories. On the one hand, there are interpretations that deny Leibniz equivalence, therefore taking gauge transformations to be physical transformations and allowing for realist commitments to gauge-variant quantities; on the other, there are interpretations that adopt Leibniz equivalence. The latter would usually regard gauge freedom as an expression of surplus mathematical structure and restrict realist commitments to gauge-invariant quantities.

Proponents of the former type of interpretations may argue for a one-to-one correspondence between points on phase space and physical states such that each point on phase space represents a distinct physical state. This would mean that different points within the same gauge orbit would correspond to physically distinct states. In our case of classical electrodynamics, this would mean that gauge-equivalent vector potentials would correspond to physically distinct states. These states would be empirically equivalent in the sense that no possible observation or measurement could reveal which gauge-equivalent potential is currently in effect. As Rickles puts it, “these physically distinct possibilities are qualitatively indistinguishable” (Rickles, 2008, 62). Accordingly, such interpretations that allow gauge-variant quantities to be physically real come with the great disadvantage of an awkward indeterminism that allows for physical quantities that are in principle unobservable. However, it is a virtue of such interpretations to avoid the existence of surplus mathematical structure that is typically considered a characteristic feature of gauge theories. Since mathematically distinct descriptions are related to physically distinct states, the problem of explaining surplus mathematical structure vanishes.

Proponents of the latter type of interpretations may argue for a many-to-one correspondence between points on phase space and physical states such that points within the same gauge orbit would correspond to one and the same physical state. In our case of classical electrodynamics, this would mean that vector potentials are unphysical and gauge-equivalent vector potentials would correspond to the same physical state. Accordingly, such interpretations that restrict realist commitments to gauge-invariant quantities come with the virtue of avoiding indeterminism and restricting realist commitments to quantities that are in principle measurable. However, such interpretations that opt for a many-to-one correspondence between mathematical descriptions and physical states still face the problem of dealing with all the mathematical redundancy. Many physicists and philosophers share the sentiment of Zee that “gauge theories are also deeply disturbing and unsatisfying in some sense: They are built on a redundancy of description” (Zee, 2010, 187). Instead, many would desire a more direct correspondence between the mathematics and the physics, such that every mathematical degree of freedom has observable effects.

2.3.3 Ontology of Gauge Theories

Interpretations of gauge theories are often presented as a response to the tension between gauge invariance and locality that is revealed in the Aharonov–Bohm effect (Section 2.2.2).

In their original paper, Aharonov and Bohm argued that

in quantum mechanics, the fundamental physical entities are the potentials, while the fields are derived from them by differentiations $\dots$ Of course, our discussion does not bring into question the gauge invariance of the theory. But it does show that in a theory involving only local interactions (e.g., Schrödinger’s or Dirac’s equation, and current quantum-mechanical field theories), the potentials must, in certain cases, be considered as physically effective, even when there are no fields acting on the charged particles. $\dots$ we are led to regard $A_{μ} (x)$ as a physical variable. This means that we must be able to define the physical difference between two quantum states which differ only by gauge transformation.

The radical suggestion that the actual physics behind electromagnetism is not gauge-invariant is supposed to maintain locality. (A straightforward way to interpret classical electromagnetism in this way is to revive the concept of the aether, and consider the vector potential as its velocity field, and the electric field as its acceleration. See Belot, 1998.)

The idea has several significant disadvantages. One problem is that the interpretation leads to the radical indeterminism discussed in the previous subsection. Such an interpretation is not necessarily local; it might also lead to a form of action at a distance that is more explicit than the nonlocality of the Aharonov–Bohm effect. An electric current that starts to flow, for example, can (depending on the actual gauge) immediately change the values of the vector potential throughout space. Furthermore, it is controversial whether the potentials can provide a truly local explanation for the Aharonov–Bohm effect, due to the issue of separability (Eynck, Lyre, & Rummell, 2001; Healey, 1997, 1999; Maudlin, 1998). The potentials approach was criticized by Aharonov himself, due to the unexplained gap it contains between the gauge-independent observed phenomena and their gauge-dependent theoretical description (Aharonov & Rohrlich, 2008).

In light of the evident downsides of an ontology of potentials, it may be worthwhile to reconsider the standard field interpretation of electromagnetism and see if it can be adapted to account for the Aharonov–Bohm effect. Such an approach was presented by DeWitt (1962), who concluded that

Nonrelativistic particle mechanics as well as relativistic quantum field theories with an externally imposed electromagnetic field can therefore be formulated solely in terms of field strengths, at the expense, however, of having the field strengths appear nonlocally in line integrals.

A different account of a nonlocal influence of fields is described by Aharonov and Rohrlich (2008) (mainly Chapters 4–5) in terms of modular variables. Nevertheless, even in this nonlocal approach the potentials are an essential part of the theoretical description of the interaction.Footnote ²²

A third explanation was suggested by Wu and Yang (1975). It closes the gap between the observable properties and the postulated physical properties in a straightforward way. Wu and Yang noted that the actual observable quantity is not the AB phase difference $\Delta\varphi_{AB}$ itself (Equation (2.8)), but rather the phase factor (or holonomy, or anholonomy):

$$\Phi_{AB} = \exp\left(i\,\Delta\varphi_{AB}\right) = \exp\left(i\,\frac{q}{\hbar c} \oint \vec{A}(\vec{r}) \cdot d\vec{r}\right). \tag{2.18}$$

Their approach, known today as the holonomies approach to the Aharonov–Bohm effect, is to promote this factor (defined for any nonintersecting closed curve in space-time) to a fundamental variable from which all other electromagnetic quantities can be derived. The description of electromagnetism based on holonomies is gauge-invariant and nonlocal. This kind of nonlocality is not a dynamical action at a distance, but a kinematic nonseparability. The electromagnetic field, Wu and Yang argue, “underdescribes electromagnetism,” while the phase (2.8) overdescribes it (p. 12). The phase factor (2.18) constitutes the variable that provides the complete description of electromagnetic phenomena. With the phase factors interpreted as a fundamental variable, the theory becomes nonseparable in a strong sense (Healey, 2007): the physical processes are not supervenient on the assignment of local properties to space-time points. This interpretation was advocated by Belot (1998) as a fruitful interpretation and by Eynck et al. (2001) because of the local action and measurability of the holonomies.
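Three short checks may help to show why the phase factor sits between under- and overdescription. They assume a gauge transformation of the form $\vec{A} \to \vec{A} + \nabla\chi$ with single-valued $\chi$, and a closed curve $C$ bounding a surface $S$; the symbols $C$, $S$, $\chi$, and $n$ are introduced here for illustration only.

```latex
\begin{align}
  \oint_C \vec{A}' \cdot d\vec{r}
    &= \oint_C \vec{A} \cdot d\vec{r} + \oint_C \nabla \chi \cdot d\vec{r}
     = \oint_C \vec{A} \cdot d\vec{r}
     \quad\Longrightarrow\quad \Phi'_{AB} = \Phi_{AB} , \\
  \oint_C \vec{A} \cdot d\vec{r}
    &= \int_S \vec{B} \cdot d\vec{S} , \\
  \exp\!\big( i\,(\Delta\varphi_{AB} + 2\pi n) \big)
    &= \exp\!\big( i\,\Delta\varphi_{AB} \big) , \qquad n \in \mathbb{Z} .
\end{align}
```

The first line shows that the phase factor is gauge-invariant. The second (Stokes’ theorem) shows that it can be nontrivial even when $\vec{B}$ vanishes everywhere along $C$, provided flux is enclosed, which is information that the field values on the path alone do not supply. The third shows that phases differing by integer multiples of $2\pi$ yield the same phase factor, which is the sense in which the phase carries more information than is observable.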

Forming an analogy with the debate over the ontology of space, Arntzenius (2014) refers to the holonomies interpretation as gauge relationism, which he contrasts with the fiber-bundle substantivalism he presents. The ontology in this approach is a literal reading of the fiber-bundle representation of gauge theories. It consists of a fiber bundle representing the possible states of a gauge field over a space-time manifold. The structure of the fiber (vector space, Abelian/non-Abelian group, etc.) determines the property of the interaction represented by the gauge fields, understood in terms of connections over the bundle. In the most literal reading, gauge transformations in these interpretations do reflect physical change (as in Maudlin, 1998). It is possible, however, to formulate a “sophisticated” version of this substantivalism, which denies the existence of possible worlds that are only connected by a gauge transformation. The main reason to favor substantivalist approaches to gauge, according to Arntzenius, is the availability and simplicity of theories formulated in terms of gauge-dependent variables, in contrast to the difficulties of constructing a dynamical theory in which holonomies are a fundamental variable.

2.3.4 Direct Empirical Significance

The debate about the ontological correlates of symmetries is sometimes linked to the question of which symmetries have “direct empirical significance” and which do not. The idea here is that if a symmetry is a mere descriptive redundancy, it cannot be said to have any direct empirical significance (though it may have indirect empirical significance: the very fact that the theory can be formulated in terms of these symmetries has specific further implications, e.g. conservation laws). In contrast, symmetries that connect states that are physically distinct yet empirically equivalent are said to have direct empirical significance.

This question is intimately related to the distinction between systems and subsystems. There is no physical point of view from which a state of the entire universe and a symmetry-related state could be distinguished empirically, and it is therefore common to regard symmetries of the entire universe as not carrying any empirical significance. For symmetries applied only to proper subsystems of the universe, the situation is different. Here the standard example is the thought experiment known as Galileo’s ship: the inertial state of motion of a ship is immaterial to how events unfold in the cabin but is registered in the values of relational quantities such as the distance and velocity of the ship relative to the shore.

The question at the center of the debate about direct empirical significance is whether the same holds for local symmetries as in gauge theories. The orthodox view, that local gauge symmetries do not carry empirical significance, has recently been argued to be derivable from three simple assumptions about direct empirical significance and physical identity of subsystem states (Friederich, 2015, 2017); see Murgueitio Ramírez (2021) for criticism of the assumptions used there. Recently, however, dissonant voices have become more prominent, for example, Greaves and Wallace (2014), Teh (2016), and Wallace (2022a, 2022b). The unorthodox view reproduces, for any subsystem, a common interpretation of asymptotic symmetries within physics: that any gauge transformation that preserves the state at the boundary, and yet is not asymptotically the identity, acquires empirical significance.

But this transposition of asymptotic conclusions to compact subsystems has come under attack from many directions. For instance, in Gomes (2021) counterexamples were shown to exist in vacuum electromagnetism; follow-up work gauge-fixes the local symmetry and shows that only global symmetries can carry empirical significance as subsystem transformations, and this occurs only when such global symmetries are associated with covariantly conserved quasi-local charges; see Gomes (2019b). Moreover, the common view of asymptotic symmetries can be recovered within such a gauge-fixed formalism (cf. Riello, 2020).

2.4 Artificial versus Substantial Gauge Symmetries

Depending on the physicist and the time at which they write, the appearance of gauge symmetries in our most successful theories of fundamental physics is regarded either as a deep insight into the structure of matter and fundamental interactions, or as a mere feature of convenience, reflecting a mathematical redundancy of the descriptive formal apparatus (see Section 2.3.1). Indeed, the physical importance of gauge symmetries (if only at a heuristic level) is attested by the widespread acceptance of the gauge principle – constraining the form of interacting theories that generalize the free ones possessing fewer symmetries – as a fundamental concept on par with the relativity or general covariance principle. Both are sometimes seen as symmetry principles uncovering deep structural aspects of physical reality. Such a position, and endorsement of the spirit of Yang’s already mentioned aphorism that “symmetry dictates interactions” (Yang, 1980), is prima facie in tension with the view of gauge symmetries as “surplus” theoretical structure.

Resolving this tension, we argued in Section 2.1.2, hinges on a refined understanding of what should count as “surplus structure” in gauge theory.

A first step in the right direction is to notice that not all gauge symmetries are necessarily on a par. Indeed, it happens that, for technical reasons, physicists are used to routinely “adding” gauge symmetries to theories that initially had none, using a variety of tools (whose forefather is perhaps the Stueckelberg trick; see Ruegg & Ruiz-Altaba, 2004). But if practically any field theory can be turned into a gauge theory, it is hard to understand how gauge symmetries could be fundamental.
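As a minimal illustration of how a gauge symmetry can be “added,” consider the Stueckelberg construction for a massive Abelian vector field. The sketch assumes a mostly-minus metric signature and standard normalizations; the labels $\mathcal{L}_{\text{Proca}}$ and $\mathcal{L}_{\text{St}}$, the auxiliary scalar $\theta$, and the gauge function $\chi$ are our notation rather than that of the text.

```latex
\begin{align}
  \mathcal{L}_{\text{Proca}}
    &= -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu} + \tfrac{1}{2} m^2 A_\mu A^\mu , \\
  \mathcal{L}_{\text{St}}
    &= -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu}
       + \tfrac{1}{2} \left( m A_\mu - \partial_\mu \theta \right)
                      \left( m A^\mu - \partial^\mu \theta \right) , \\
  A_\mu &\to A_\mu + \partial_\mu \chi , \qquad \theta \to \theta + m \chi
    \quad\Longrightarrow\quad
    m A_\mu - \partial_\mu \theta \ \text{ is unchanged} .
\end{align}
```

The mass term of $\mathcal{L}_{\text{Proca}}$ is not invariant under $A_\mu \to A_\mu + \partial_\mu \chi$, whereas $\mathcal{L}_{\text{St}}$ is invariant under the combined transformation. Fixing $\theta = 0$ returns the Proca Lagrangian; equivalently, the local field redefinition $B_\mu = A_\mu - \partial_\mu \theta / m$ is gauge-invariant, leaves $F_{\mu\nu}$ unchanged, and removes the symmetry altogether.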

This state of affairs is reminiscent of a well-known discussion within the philosophy of general relativistic physics, spurred by the Kretschmann objection. In 1917, E. Kretschmann criticized the foundational status that Einstein gave to the principle of general covariance for his general relativity, observing that any theory could in principle be recast – perhaps with some cleverness required on the part of the theoretician – in the language of tensor calculus, thus in a generally covariant way. This prompted a recognition that one should distinguish substantive general covariance, native to a theory, from artificial general covariance, which is formally forced onto a theory, and that the main question is then to specify a demarcation criterion between the two.Footnote ²³

Given the preceding observation, a generalized Kretschmann objection could similarly be levelled against the gauge principle, suggesting a provisional distinction between substantive and artificial gauge symmetries (Pitts, 2008, 2009). The question would then be: when presented with some gauge field theory, on what ground could we make the distinction? Presumably, one may look for physical signatures of substantive gauge symmetries that are lacking for artificial ones. As discussions in previous sections suggest, there is a feature that is usually recognized as characterizing “honest” gauge symmetries: a trade-off between locality and gauge invariance. It is indeed a fact that some gauge theories can be reformulated in a gauge-invariant way without sacrificing the locality of their basic variables (e.g. scalar electrodynamics and the Abelian Higgs model), while others cannot (e.g. spinorial electrodynamics and pure Yang–Mills theories) – see Section 5. This certainly does not exhaust the physical content of substantive gauge symmetries, but one may take this trade-off to be a robust physical signature: on the position that physical d.o.f. are gauge-invariant, this criterion would classify “true” gauge theories as ones that display a certain nonlocality (entirely distinct from quantum nonlocality).Footnote ²⁴ The famous Aharonov–Bohm effect, discussed in Section 2.2.2, seems to provide support to the claim.

Many – likely all – interpretive and conceptual issues raised by gauge symmetries and discussed in this Element concern the substantive kind. Artificial gauge symmetries pose no – or at least fewer – mysteries, as they can be redefined away, usually by some change of local field variables. Thus, presented with a gauge theory one could first seek to identify which, if any, of its gauge symmetries are artificial, and get rid of them so as to work with a minimal (“Ockhamized”) version of the theory displaying only substantive residual gauge symmetries, if any. We might call such a minimal, and still local, version of a gauge theory its “maximally invariant formulation.” The latter is certainly the proper candidate to be subjected to in-depth philosophical and theoretical analysis.Footnote ²⁵

According to the definition of “surplus structure” suggested in Section 2.1.2, artificial gauge symmetries are genuine surplus, while substantial gauge symmetries are not since they signal the nonlocality – or nonseparability – of the fundamental d.o.f. of gauge physics. Yet we should remark that, as also hinted in Section 2.1.2, it may be that features initially deemed surplus are promoted if it is eventually found that they have a direct physical signature, or one more indirect in the form of an important theoretical virtue. Thus, while the distinction between substantive and artificial gauge symmetries holds good for classical gauge field theories, one may be led to reassess this provisional demarcation in quantum gauge field theory.

Notably, a centrally attractive feature of gauge theories is their good renormalization properties under quantization. For a fundamental theory, renormalizability is often deemed an important virtue with deep physical relevance. Admitting this, if it appears that a gauge symmetry classically classified as artificial nonetheless plays a significant role in ensuring that the quantum theory is renormalizable, one would then be justified in promoting it to a substantial symmetry. Let us raise an intriguing possibility though: that on the contrary the distinction would extend to QFT, so that a gauge symmetry initially thought to be necessary to ensure good quantum properties, yet classically marked as artificial, would on second analysis turn out to be dispensable to the quantum theory.Footnote ²⁶ This we leave as an open question.

2.5 Dualities

When discussing the possibility of eliminating (gauge) symmetries, another concept that necessarily arises is that of dualities: the situation in which physical systems have two or more equivalent theoretical formulations. The different formulations may have different symmetries, even to the point that there exists a formulation with no symmetries at all. In general, if multiple theories exist that agree on all observables but have degrees of freedom that are not in a one-to-one relation with each other, they are called dual to each other.

Such relations are more than just variable transformations. A variable transformation is a mathematical process that is unambiguous and thus invertible. In theories with the same physical content but different levels of symmetries, there is necessarily an ambiguity in the relation between variables.

A relatively trivial example in classical mechanics is the problem of orbital movement in the usual gravitational potential. It is a superintegrable system, which means that out of the three independent degrees of freedom all but one can be eliminated, using the conservation of the Runge–Lenz vector and of orbital angular momentum. But one can choose to eliminate only one of the two degrees of freedom. This gives an intermediate theory with symmetries inequivalent to those of the one with only one degree of freedom, and connected to it by a noninvertible variable transformation.

There is a vast multitude of further examples. Perhaps one of the most extreme cases is two-dimensional Yang–Mills theory in flat space-time (Dosch & Muller, 1979). This theory is trivial in the sense that there are no physical excitations, and the only physical state is the (empty) vacuum. Nonetheless, there is a denumerably infinite number of associated gauge theories, Yang–Mills theory with an arbitrary Lie group, which all create this same single state. These theories all differ in terms of their gauge-dependent Green’s functions and thus in the way cancellations occur to create such a single state.

There are further theories that are known to be exactly related to each other and are nontrivial. However, these are either theories that are not gauge theories, like the low-temperature/high-temperature duality of the Ising model (Kogut, 1979), or have a gauge theory only on one side, like the gauged O(4) linear sigma model with punctured target space and the corresponding nongauge theory of massive scalars and vectors. (See Section 5.3 and Evertz et al., 1986; Fernandez, Fröhlich, & Sokal, 1992; and Maas, 2019.)

While theories with trivial dynamics appear to be quite irrelevant, there is a wide variety of nontrivial theories where similar features are, to some extent, established, motivated, or conjectured. Probably most notorious are dualities between weakly coupled theories and strongly coupled theories. These are especially prolific for supersymmetric gauge theories (Weinberg, 2000), but have also been conjectured for some so-called walking theories (see, e.g., Sannino, 2009). In such cases generalized electric and magnetic degrees of freedom are often exchanged. However, the celebrated AdS/CFT conjecture (Freedman & Van Proeyen, 2012) suggests even the possibility that a quantum gauge theory in a fixed space-time could be equivalent to a classical gravitational theory in a different space-time. Were this to be proven, especially for any applicable theory describing our universe, this would imply that even the fundamental structure of space-time is exchangeable.

Dualities extend the ideas of Kretschmann (Pitts, 2009), whose conjecture was that every theory can be covariantized. These considerations lead to a much stronger conjecture: that for every set of physical observables there exist multiple (gauge) theories that are not related to each other by unambiguously invertible variable transformations.

Finally, we note that very recently the topic of dualities has received some well-deserved attention in the philosophy of physics (see, e.g., Castellani & Rickles, 2017 and De Haro & Butterfield, 2021). Here we find some notable agreement that duality is closely related to symmetry. More precisely, the idea is “that a duality is like a ‘giant symmetry’: a symmetry between theories” (De Haro & Butterfield, 2021: 2974). In Section 2.1.1, we have discussed in what sense symmetry is to be understood as “invariance under transformation.” The idea now is that while symmetries typically relate different physical states, or, in the case of gauge symmetries, the same physical state to different mathematical descriptions, in the case of dualities theories are related such that different theories describe the same (observable) physics. What makes the interpretation of duality particularly interesting is that dual theories are often associated with radically different ontological commitments. For instance, as is the case with respect to the AdS/CFT conjecture, if one theory is formulated in D-dimensional space-time and the (allegedly) dual theory in (D–1)-dimensional space-time, what does this say about the world we live in? When there are two dual theories, describing the world in very different concepts, is one of them true and the other one false? Are both false, indicating that there is a true theory hidden beyond? Or is a unique mathematical description of the physical world impossible and do dual theories constitute different (and possibly incomplete) but equally valid perspectives?

Obviously, it would go beyond the scope of this Element to discuss this in detail. What we do note, however, is that in the literature dualities have been systematically compared to gauge symmetries (De Haro, Teh, & Butterfield, 2017; Rickles, 2017). This is because “dual theories can indeed ‘say the same thing in different words’—which is reminiscent of gauge symmetries” (De Haro et al., 2017, 68). Accordingly, we conclude this section on the interpretation of gauge symmetries by emphasizing the significance of this undertaking. We also note that the methods we discuss in Sections 5 and 6 can be regarded as examples of how to establish dualities between gauge theories and nongauge theories, and of the important role such dualities play in gauge-invariant formulations of the BEH mechanism. For example, the dressing field method provides a reformulation of the Abelian Higgs model or the Glashow–Weinberg–Salam model of electroweak interactions in terms of gauge-invariant fields at the classical level; see Section 5. In addition, the concept of dualities, relating the state space of different theories, allows for a potential reinterpretation of the BEH mechanism, which avoids the terminology of spontaneous gauge symmetry breaking, as conjectured in Sondenheimer (2020). The framework proposed by Fröhlich, Morchio, and Strocchi (FMS) to formulate observables of a BEH theory in a gauge-invariant manner provides a link between some gauge-invariant observables in one gauge theory and invariant observables with respect to another particular gauge group.Footnote ²⁷ Thus, the BEH mechanism, namely, the introduction of a scalar field with a Mexican hat–type interaction potential, does not involve a spontaneous breaking of the original gauge group but induces a duality between particular states of two gauge theories in a specific way that is described by the FMS formulation. The seemingly spontaneous breaking rather reflects the fact that this duality becomes most apparent in a gauge-fixed language that explicitly breaks the original gauge group. We further elaborate on this relation in Section 6.

3 Symmetry Breaking and the Brout–Englert–Higgs Mechanism

The term “symmetry breaking” refers to a collection of phenomena and theoretical notions generally characterized by a situation in which the state of a system does not respect the symmetry of the laws governing it. The concept became dominant in theoretical physics around the early 1960s, in an interplay between condensed-matter physics and particle physics. The crowning glory of this framework is widely considered to be the Brout–Englert–Higgs (BEH) mechanism. A fundamental part of the electroweak model in particle physics, the mechanism is commonly presented as a case of dynamical spontaneous symmetry breaking of gauge symmetry. This standard account, however, has been challenged on various theoretical and philosophical grounds.

The issue is clearly interconnected with that of the interpretation of gauge symmetries. In this section we shall briefly present the standard account of the mechanism and recent developments in its understanding. We shall then focus on motivating gauge-invariant approaches (which will be the subject of the following two sections) in light of problems with the standard account of symmetry breaking on the one hand and the philosophical discourse on gauge presented in the previous section on the other.

3.1 Symmetry Breaking at a Glance

Historically, one of the first physicists to recognize the systematic significance of the notion of symmetry breaking was Pierre Curie, with his 1894 paper “Sur la symétrie dans les phénomènes physiques” being his principal work on this topic. Symmetry breaking, he noted, is not some obscure, rare phenomenon but is commonplace and may be the reason for the occurrence of phenomena in the first place. According to Curie,

A phenomenon can exist in a medium which possesses its characteristic symmetry or that of one of the subgroups of its characteristic symmetry. In other words, certain symmetry elements can coexist with certain phenomena, but they are not necessary. What is necessary is that some symmetry elements be missing. Asymmetry is what creates a phenomenon.

(Curie, 2003, 312)

This idea that “[a]symmetry is what creates a phenomenon” underscores the significance of symmetry breaking and is a theme not unfamiliar in modern (philosophy of) physics. Castellani, for instance, declares:

Any symmetry we can perceive (albeit in an approximate way) is indeed the result of a higher order symmetry being broken. This can actually be said of any symmetry which is not the “absolute” one (i.e. including all possible symmetry transformations). But we can say even more: in a situation characterized by an absolute symmetry, nothing definite could exist, since absolute symmetry means total lack of differentiation.

(Castellani, 2003, 322)

The claim that phenomena exist due to broken symmetries (or the lack of symmetry) is related to, but not equivalent to, what has come to be known as Curie’s principle. The principle says: “When certain causes produce certain effects, the symmetry elements of the causes must be found in their effects” (Curie, 2003, 312). Conversely but equivalently, put in terms of asymmetry, the principle reads: “When certain effects show a certain asymmetry, this asymmetry must be found in the causes which gave rise to them” (Curie, 2003, 312). Partly due to its vagueness, Curie’s principle has met opposing reactions (see Earman, 2004a). Attempts to make it more precise have resulted in a formulation that, according to prominent voices, “makes it virtually analytic” (Earman, 2004a, 173). In slogan form, the principle says: “if no asymmetry goes in, then no asymmetry comes out” (Roberts, 2013, 580).Footnote ²⁸ This leads us to the following question: How is it possible that our world manifests asymmetries although this world, on a fundamental level, seems to be governed by symmetric laws and theories? How exactly does asymmetry enter the picture, and what does it mean for a symmetry to break down?

There are two forms of symmetry breaking relevant in modern physics: explicit and spontaneous.Footnote ²⁹ Explicit symmetry breaking is rather straightforward. A symmetry is broken explicitly when the equations of motion describing the system are not covariant under the symmetry transformations, as a consequence of the Lagrangian or Hamiltonian of the system not being invariant under the symmetry. This is the case, for instance, when a symmetry-breaking term (e.g. due to an external field) is added to otherwise covariant dynamics.

In the case of spontaneous symmetry breaking (SSB), the equations of motion describing the system are covariant under the respective symmetry transformations, but the system is in a state that is not invariant under all symmetry transformations. This means that a system that is governed by a symmetric Hamiltonian can evolve into an asymmetric state. In particular, SSB appears when the ground state is degenerate, that is, when there are multiple states with the lowest energy. The transformation that connects these states instantiates a symmetry of the dynamics, yet none of the states individually respects the symmetry, since the transformation maps it to one of the other states.

The preceding description relates to classical theories. At the classical level, the choice of a particular ground state can be fixed as part of the initial conditions. At the quantum level, this becomes more involved (Birman, Nazmitdinov, & Yukalov, 2013; Maas, 2019; Sartori, 1991). In quantum systems with finitely many degrees of freedom, the actual ground state always respects the symmetry. It is a superposition of nonsymmetric states that are connected via quantum tunneling. Thus, in a quantum theory SSB can only occur in systems with infinitely many degrees of freedom. This can be thought of in terms of the tunneling barriers becoming infinite.
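The statement about finitely many degrees of freedom can be illustrated with a deliberately crude sketch: a two-state model in which two asymmetric states, here called |L> and |R>, are connected by a tunneling amplitude. The model, the function name `two_state_spectrum`, and the numerical value of the amplitude are illustrative assumptions, not taken from the literature cited above.

```python
import numpy as np

def two_state_spectrum(tunneling):
    """Diagonalize a hypothetical two-state Hamiltonian in the basis (|L>, |R>).

    |L> and |R> stand for the two asymmetric, classically degenerate states;
    `tunneling` is an assumed real, positive tunneling amplitude.
    """
    hamiltonian = np.array([[0.0, -tunneling],
                            [-tunneling, 0.0]])
    energies, states = np.linalg.eigh(hamiltonian)  # energies in ascending order
    ground = states[:, 0]
    if ground[0] < 0:  # fix the arbitrary overall sign of the eigenvector
        ground = -ground
    return energies, ground

# The symmetry exchanges the two asymmetric states: |L> <-> |R>.
swap = np.array([[0.0, 1.0],
                 [1.0, 0.0]])

energies, ground = two_state_spectrum(tunneling=0.1)
gap = energies[1] - energies[0]                    # equals 2 * tunneling
is_symmetric = np.allclose(swap @ ground, ground)  # ground state is invariant
```

In this toy model the ground state is the symmetric superposition of |L> and |R> for any nonzero amplitude, and the gap to the antisymmetric state is twice the amplitude. The gap closes only when the amplitude vanishes, which is the two-state analogue of the tunneling barrier becoming infinite.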

In such quantum settings, phenomena associated with SSB can also be described in terms of explicit breaking. A system that shows SSB also exhibits well-defined behavior when the symmetry is broken explicitly by an external source. It will then show a broken (asymmetric) ground state even in the limit of a vanishing source.Footnote ³⁰
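The role of a vanishing source can be sketched with a classical toy model: a single real field value $ϕ$ with a double-well potential tilted by a source $h$, $V_{h} (ϕ) = - \frac{1}{2} μ^{2} ϕ^{2} + \frac{1}{4} λ ϕ^{4} - h ϕ$. This is only an illustration under stated assumptions; it ignores all fluctuations, and the function name `global_minimum` and the parameter values are hypothetical.

```python
import numpy as np

def global_minimum(h, mu2=1.0, lam=1.0):
    """Field value minimizing V_h(phi) = -mu2/2 phi^2 + lam/4 phi^4 - h phi."""
    # Stationary points solve the cubic lam * phi^3 - mu2 * phi - h = 0.
    roots = np.roots([lam, 0.0, -mu2, -h])
    real_roots = roots[np.abs(roots.imag) < 1e-6].real

    def potential(phi):
        return -0.5 * mu2 * phi**2 + 0.25 * lam * phi**4 - h * phi

    # Pick the stationary point with the lowest potential energy.
    return real_roots[np.argmin(potential(real_roots))]

# Approach a vanishing source from both sides.
phi_plus = global_minimum(1e-6)    # close to +sqrt(mu2 / lam)
phi_minus = global_minimum(-1e-6)  # close to -sqrt(mu2 / lam)
```

The minimum does not return to the symmetric point $ϕ = 0$ as the source is switched off. It tends to $+ μ / \sqrt{λ}$ or $- μ / \sqrt{λ}$ depending on the sign of the source, so the response is discontinuous at $h = 0$.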

For gauge theories, any such source would itself be gauge-dependent and thus break the gauge symmetry explicitly. The question thus naturally arises whether external sources are necessary to define spontaneous symmetry breaking. This question can already be addressed for global symmetries. In fact, because the path integral sums over all configurations, there is no way, without external sources, to spontaneously single out a direction and thus break a symmetry. Thus global symmetries are always intact without external sources (Fröhlich et al., 1981; Maas, 2019). Nonetheless, it is still possible to determine how the system would react to an infinitesimal external source using suitable observables (Caudy & Greensite, 2008; Maas, 2019). If it reacts in a nonanalytic way, the system is metastable, and this metastability is equivalent to the usual picture of SSB.

Thus, external sources are not necessary to identify SSB. In fact, they can even be misleading for gauge theories, as cases are known in which metastability is signaled in the absence of actual SSB (Maas & Zwanziger, 2014). Furthermore, Elitzur’s theorem even states that gauge symmetries can never experience SSB (Elitzur, 1975).Footnote ³¹ Hence, the whole concept of SSB cannot be transferred directly from global symmetries to gauge symmetries.

This leads to the central issue: What is meant when, colloquially, physicists speak of SSB of gauge symmetries? The most prominent example of this issue is the Brout–Englert–Higgs (BEH) effect, presented in the following subsection and discussed subsequently.

3.2 Gauge Symmetry Breaking and the BEH Mechanism

The standard textbook account of the BEH mechanism starts with spontaneous breaking of the global symmetry and then extends the analysis to the corresponding local gauge symmetry. In a spontaneous breaking of a global continuous symmetry, the different lowest-energy states are connected by a transformation of a degree of freedom that, in particle terms, can be interpreted as massless, spinless Goldstone bosons that interact with the matter fields. The introduction of the field associated with Goldstone bosons can be described using classical fields, by rewriting the Lagrangian, separating degrees of freedom that are transformed from one ground state to another and “radial” directions, associated with massive particles. In quantum field theories this reformulation gains further significance, as it corresponds to a perturbative analysis around a particular ground state.

The simplest example concerns two scalar fields $ϕ_{1}$, $ϕ_{2}$ with a $U (1)$ symmetry corresponding to rotation in the internal field space (i.e., continuously transforming the one into the other). The potential is taken to be $V (ϕ) = - \frac{1}{2} μ^{2} ϕ^{2} + \frac{1}{4} λ ϕ^{4}$, a function of the gauge-independent term $ϕ^{2} \equiv ϕ_{1}^{2} + ϕ_{2}^{2}$. Thus, all states satisfying $ϕ^{2} = μ^{2} / λ \equiv v^{2}$ have minimal energy. In this model, $ϕ$ can be treated as a radial direction, corresponding to a massive boson. Transformations along the tangential direction generate the symmetry and are associated with a Goldstone boson. This description is said to break the symmetry since perturbative analysis around any particular vacuum state (e.g. $ϕ_{1} = v$; $ϕ_{2} = 0$) leads to the Lagrangian $L = (\frac{1}{2} \partial_{ν} η \partial^{ν} η - μ^{2} η^{2}) + (\frac{1}{2} \partial_{ν} ξ \partial^{ν} ξ) + \text{[interaction terms]}$, with $η$ the radial perturbation and $ξ$ the tangential one ($η \equiv ϕ_{1} - v$ and $ξ \equiv ϕ_{2}$ for the aforementioned state).
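The quadratic terms of this expansion can be checked numerically. The following is a minimal sketch, assuming the vacuum $ϕ_{1} = v$, $ϕ_{2} = 0$ with $v^{2} = μ^{2} / λ$; the parameter values and the helper `hessian` are illustrative choices, not part of the model. It evaluates the matrix of second derivatives of $V$ at the vacuum by finite differences.

```python
import numpy as np

MU2, LAM = 1.0, 0.5  # hypothetical values of mu^2 and lambda

def potential(phi):
    """V = -mu^2/2 phi^2 + lambda/4 phi^4 with phi^2 = phi_1^2 + phi_2^2."""
    r2 = phi[0]**2 + phi[1]**2
    return -0.5 * MU2 * r2 + 0.25 * LAM * r2**2

def hessian(f, x, eps=1e-4):
    """Central finite-difference matrix of second derivatives of f at x."""
    n = len(x)
    h = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = eps
            ej[j] = eps
            h[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps**2)
    return h

v = np.sqrt(MU2 / LAM)        # radius of the circle of minima
vacuum = np.array([v, 0.0])   # one particular choice of vacuum state
# Eigenvalues in ascending order: tangential direction first, then radial.
curvatures = np.linalg.eigvalsh(hessian(potential, vacuum))
```

Up to finite-difference error, the two curvatures are $0$ along the tangential direction and $2 μ^{2}$ along the radial one. The latter reproduces the term $- μ^{2} η^{2} = - \frac{1}{2} (2 μ^{2}) η^{2}$ in the Lagrangian, while the vanishing curvature corresponds to the absence of a mass term for $ξ$.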

When applying the gauge principle, the global symmetry is localized, introducing a new massless bosonic field. In the preceding example the field $A^{ν}$ defines a connection over the $U (1)$ bundle. The central point in this standard account is that when the Lagrangian is expressed in terms of $ξ$ and $η$, the gauge field $A^{ν}$ obtains a nonvanishing mass term (equal to $\frac{1}{2} \frac{q μ}{ℏ c λ} A_{ν} A^{ν}$ in the preceding example). $A^{ν}$ is therefore interpreted as a massive bosonic field. The breaking of the local symmetry is associated with the choice of a particular gauge. In order to allow for the massive-boson interpretation, the gauge is chosen such that the terms corresponding to the peculiar and unobservable Goldstone bosons disappear. Thus, the mechanism, according to this standard presentation, allows for gauge bosons to obtain mass at the price of sacrificing gauge invariance.

Moving from this toy model to theories with more empirical relevance, the prototypical theory is initially invariant under some local internal symmetry group and would accordingly have some gauge fields $W_{μ}^{a}$, supplemented by one or more scalar fields $ϕ$. For a given representation of the gauge group, the dynamics are defined by the Lagrangian

\begin{matrix} L & = - \frac{1}{4} W_{μ ν}^{a} W_{a}^{μ ν} + (D_{μ α β} ϕ_{β})^{†} D_{α γ}^{μ} ϕ_{γ} - V (ϕ), \end{matrix}

(3.1)

\begin{matrix} W_{μ ν}^{a} & = \partial_{μ} W_{ν}^{a} - \partial_{ν} W_{μ}^{a} + g f^{a b c} W_{μ}^{b} W_{ν}^{c}, \end{matrix}

(3.2)

\begin{matrix} D_{α β}^{μ} & = \partial^{μ} δ_{α β} + g T_{a α β}^{R} W_{a}^{μ} . \end{matrix}

(3.3)

The $T$ are suitably normalized generators for the representation $R$ of the scalar field, $f^{a b c}$ are the corresponding structure constants, and $g$ is the newly introduced gauge coupling. The most prominent example is the standard model Higgs sector. In this case the scalar is in the fundamental representation of an SU(2) gauge group. In this special case the theory also possesses a global SU(2) symmetry acting only on the scalar field. The Abelian Higgs model is recovered for $f^{a b c} = 0$ and $T^{R} = 1$.

Since the potential is, by construction, a function of gauge-invariant quantities only, its minima are likewise given by gauge-invariant conditions,

\begin{matrix} ϕ^{†} ϕ & = v^{2}, \end{matrix}

(3.4)

where $v \neq 0$ minimizes the potential $V (ϕ)$ and is thus a function of the parameters of the potential. These minima are, for a polynomial potential, necessarily translationally invariant, and therefore the fields satisfying (3.4) need to be constant. The minima also need to be invariant under any intact global symmetry. The condition $v \neq 0$ follows from the structure and parameters of the potential.Footnote ³² This condition is at the root of the BEH effect, and thus of the so-called gauge SSB, and everything else follows from it.

The existence of such a nontrivial minimum is then exploited by imposing, as a gauge condition, that the length of the space-time averaged Higgs field equal $v$. To complete the gauge, it is customary to use a local condition

\begin{matrix} \partial^{μ} W_{μ}^{a} + i g ζ ϕ_{α} T_{α β}^{a} v_{β} + Λ^{a} = 0, \end{matrix}

(3.5)

where $v_{β}$ is a vector of length $v$ and $ζ$ is an arbitrary gauge parameter. The direction of $v_{β}$ is arbitrary, but fixed by this gauge choice. It is really the second term in (3.5) that enforces that one direction is made special, and thereby breaks the gauge symmetry in such a way as to ensure a vacuum expectation value for the Higgs field. It thereby establishes the BEH effect in the usual picture of a condensation of the Higgs field. Such a gauge choice would not be possible if the potential did not support a nontrivial minimum $v^{2} > 0$.

Returning for a second to the classical level, initial conditions for the equations of motion of the gauge-dependent fields would need to be selected to comply with this gauge choice and therefore implement the BEH effect. At the quantum level, there are no initial conditions. Hence, the gauge choice alone can enforce SSB and, in that sense, the BEH effect. After the gauge is implemented, calculations can be done, especially in the form of perturbation theory.

3.3 Foundational and Interpretational Issues

Already in the early literature on the BEH mechanism it was noted that the gauge condition is not necessary (B. Lee & Zinn-Justin, 1972). The vacuum expectation value (VEV) of the Higgs field, while sometimes identified with observable properties, is in fact gauge-dependent. Furthermore, gauges that do not fix it are possible, as well as gauges that set it to zero. This may seem surprising from the point of view of the standard perspective of the BEH mechanism as relying crucially on SSB, but it is actually not surprising in the light of Elitzur’s theorem (Elitzur, 1975), which states that local symmetries cannot be spontaneously broken, concluding that “breaking of local symmetry such as the Higgs phenomenon, for example, is always explicit, not spontaneous. The local symmetry must be broken first explicitly by a gauge-fixing term leaving only global symmetry. This remaining global symmetry can be broken spontaneously” (p. 3981).Footnote ³³

Gauges in which the VEV is set to zero disable perturbation theory, and for that reason they did not find widespread use; see B. Lee and Zinn-Justin (1972) and Maas (2019). However, their existence makes the answer to the question of whether the gauge symmetry has genuinely been broken itself appear to depend on the choice of gauge! This hard-coding of SSB by gauge-fixing raises the question of whether it is physical. At least when using a lattice regulator in the standard-model case, Osterwalder and Seiler (1978) and Fradkin and Shenker (1979) showed that there exists only a single phase, in the sense of an analytic free energy. Hence, the BEH effect cannot be a physical distinction. In fact, explicit calculations showed that the possibility of fixing the vacuum expectation value of the Higgs field itself is gauge-dependent and thus an unphysical distinction (Caudy & Greensite, 2008).Footnote ³⁴ As Friederich (2013) puts it, as it depends on the choice of gauge fixing, SSB of local symmetries does not qualify as a “natural phenomenon.” These issues, together with more basic considerations of gauge invariance raised by philosophers in this context (see Section 4), suggest that the standard account of the BEH mechanism as an instance of gauge symmetry breaking is misleading.

Nonetheless, ignoring these conceptual issues and identifying SSB after gauge fixing with the BEH mechanism may have played an important heuristic role in the acquisition of a wealth of experimentally confirmed results in particle physics (Böhm et al., 2001; Zyla et al., 2020). This is a baffling state of affairs. It strongly suggests the necessity for a reconciliation of both the formal aspects and the phenomenological successes. One way to achieve such a reconciliation will be presented and discussed in Section 6.

4 Motivating Gauge-Invariant Approaches

As stressed in Section 2.3, there is a priori a tension between the view of gauge symmetries as an insight into the inner workings of Nature – as acknowledged by the wide recognition of the heuristic value of the Gauge Principle in discovering empirically successful mathematical descriptions of the fundamental interactions – and the view of gauge symmetries as essentially a descriptive redundancy of our theoretical apparatus – as manifested by the near-universal acknowledgement that physical d.o.f. (observables) should be gauge-invariant. This is called the “profundity versus redundancy” conundrum (see Martin, 2003).

While both these views are expressed in the physics literature, their inconsistency doesn’t raise much interest or worry there, as it appears to be considered inconsequential to most practical or technical matters. Philosophers of physics on the other hand, whose job it is to worry about such things, began to seize on it when turning their attention to the foundations of gauge theories some twenty years ago. About this tension, Michael Redhead (2002) states: “In my view its elucidation is the most pressing problem in current philosophy of physics.”

The one area where philosophers’ and physicists’ preoccupations (should) converge, and where the problem manifests itself with unique acuity, is the notion of gauge SSB, especially given its implementation via the BEH mechanism in the Standard Model (SM), where it is widely seen as a concept pivotal to our understanding of the electroweak interaction. It is indeed a standard textbook presentation and the conventional wisdom that in the early history of the Universe, elementary particles interacting with the Higgs field (the weak bosons and most or all fermions) gained their masses when the latter spontaneously broke the fundamental $S U (2)$ gauge symmetry of the electroweak theory. But then how is this account compatible with the “redundancy” stance on gauge symmetries? John Earman expresses the tension particularly vividly:

As the semi-popular presentations put it, ‘Particles get their masses by eating the Higgs field.’ Readers of Scientific American can be satisfied with these just-so stories. But philosophers of science should not be. For a genuine property like mass cannot be gained by eating descriptive fluff, which is just what gauge is. Philosophers of science should be asking the Nozick question: What is the objective (i.e., gauge invariant) structure of the world corresponding to the gauge theory presented in the Higgs mechanism?

(Earman, 2004b, 1239)

This is why we say that there is a worrisome tension at the very heart of modern particle physics. On the one hand, gauge symmetries are considered an unphysical mathematical redundancy, but on the other hand the BEH mechanism is explained by gauge SSB. But how could the breaking of an unphysical mathematical redundancy have any physical impact on our world?Footnote ³⁵

The underlying idea of this joint Element is that, instead of employing the conceptually dubious notion of gauge SSB, one may look for manifestly gauge-invariant approaches to the BEH mechanism. For a while already, this has been the consensus view in the philosophy of physics community (see Friederich, 2013, 2014; Lyre, 2008; Smeenk, 2006; and Struyve, 2011). In the physics community, while a narrative about the BEH mechanism in terms of SSB remains common, dissenting views – such as Chernodub, Faddeev, and Niemi (2008), Ilderton, Lavelle, and McMullan (2010), Maas (2019), and Sondenheimer (2020) – indicate an emerging trend in particle physics where invariant formulations of the BEH mechanism are seen as a promising research endeavor.Footnote ³⁶ Some physicists familiar with the philosophy literature share this vision, one of us raising the worry that

[ $\dots$ ] Not acknowledging the insights of philosophers of physics would certainly lead to a long-lived misconception at the heart of particle physics to remain uncorrected for still some times, and important ensuing questions regarding the context of justification of the electroweak model to remain unasked, let alone answered.

(François, 2019, 475)

It was indeed a motivation of this joint Element to argue for a better awareness and appreciation of the general conceptual and interpretive issues surrounding gauge symmetries, especially as clearheadedness on these could actually have a concrete impact on the (re)assessment of the foundation of electroweak physics and, perhaps, on future research avenues in particle physics.

In Section 2.1.2, we stressed that a satisfying interpretation of gauge theories may fulfill three desiderata:

(D1) To avoid ontological indeterminism.
(D2) To avoid ontological commitments to quantities that are not measurable even in principle.
(D3) To avoid surplus mathematical structure that has no direct ontological correspondence.

We see that restricting one’s realist commitments to gauge-invariant quantities has the crucial advantage of avoiding a problematic indeterminism that allows for quantities that have no observable effects. This speaks in favor of the redundancy interpretation: gauge symmetries are unphysical, full stop. While this is certainly a viable view some of us might endorse, earlier we introduced a more subtle understanding that combines elements of both sides of the “profundity versus redundancy” conundrum. We suggested that “surplus” should be understood as any formal structure of a theory whose excision wouldn’t be detrimental to its physical content and interpretation or to its theoretical/pragmatic virtues. We hinted at the fact that for gauge theories, the (field-theoretic) notion of locality provides a robust criterion to detect such structures.Footnote ³⁷ Building on this notion, we elaborated in Section 2.4 the distinction between artificial and substantial gauge symmetries: only artificial gauge symmetries can be eliminated without sacrificing the locality of elementary field variables of a gauge theory, making them genuinely surplus. This distinction already goes some way toward alleviating the “profundity versus redundancy” tension: the gauge principle reveals its profundity when it points toward theories with substantial gauge symmetries. The general issue at hand is therefore not to assess whether gauge symmetries are surplus structure or not; the refined question is how to distinguish between genuinely surplus gauge structure and the nonsurplus gauge structure whose physical signature relates to the nonlocality or nonseparability of gauge d.o.f.

Gauge theories undergoing gauge SSB are no exception to this general discussion. Given a gauge theory with SSB, two possibilities arise. Either the gauge symmetry is substantial and its breaking correlates to (potential) physical observables, in which case gauge SSB is a genuine phenomenon. Or, the supposedly broken symmetry is artificial, in which case there is an invariant local reformulation of the theory in which SSB cannot occur and is revealed as a formal artifact of the original formulation stemming from an inadequate choice of local field variables.

To elucidate which of these obtains in any given theory, one needs to reformulate it in a maximally invariant way – in the sense of Section 2.4 – leaving only substantial gauge symmetries. In particular, such a gauge-invariant reformulation of the electroweak model is what is needed to answer Earman’s question, quoted earlier, and assess the real ontological status of electroweak SSB in the SM.

In the following two sections, we present the state of the art of gauge-invariant approaches to the BEH mechanism. In Section 5, we highlight a tool adapted to such goals, known as the dressing field method. We illustrate it via several instructive examples before the main application to the electroweak model in Section 5.4. As we will see then, the dressing field method gives arguments to the effect that the $S U (2)$ gauge symmetry of the model is artificial and that only the $U (1)$ symmetry is substantial, so that SSB is superfluous to the empirical success of the SM – this, we observe, is consistent with Elitzur’s theorem. Then, in Section 6, we describe a closely related gauge-invariant formulation of the electroweak model known as the FMS approach – for Fröhlich, Morchio, and Strocchi – which has been much further developed, especially in lattice simulations, and is closer to being a viable alternative to the standard account regarding confrontation with high-precision collider experiments.

5 The Dressing Field Method of Gauge Symmetry Reduction

Given a physical theory with certain gauge symmetries, the discussion in the previous section motivates an attempt to identify its physical degrees of freedom by replacing gauge-dependent variables with gauge-invariant ones. This step, when successful, results in a reduction of gauge symmetries, namely the formulation of a theory with fewer symmetries. In fact, gauge fixing and SSB, discussed in previous sections, similarly reduce gauge symmetries to achieve certain goals, such as to allow for massive gauge fields mediating weak interactions or to control the quantization of the classical theory. In the past few years, another item in the physicist’s symmetry reduction tool kit has been developed and is now known as the dressing field method (DFM) (Attard et al., 2018; Fournel et al., 2014; François, 2014). In a nutshell, this approach allows one to systematically build gauge-invariant field variables out of the initial gauge-variant fields of a theory, if a so-called dressing field can be identified within it. In contrast to SSB and gauge fixing, the DFM does not reduce the symmetry by replacing the gauge-invariant Lagrangian with one that has to be expressed in a specific gauge, but rather by removing gauge dependence already at the level of the field variables.

In the simplest of terms, a dressing field $u (x)$ is a group-valued field – ideally, already present in the original field content of the theory (or built from it) – with which one will perform a change of field variables that formally mimics a gauge transformation. To take the easiest example: If $ϕ (x)$ is a matter field, by definition its gauge transformation is $ϕ^{'} (x) = γ (x)^{- 1} ϕ (x)$, where $γ (x)$ is a (group-valued) gauge transformation element. Given the dressing field $u$, the dressing of $ϕ (x)$ is $ϕ^{u} (x) := u (x)^{- 1} ϕ (x)$. The point of this change of variable is that given the defining gauge transformation of a dressing field, $u^{'} (x) = γ (x)^{- 1} u (x)$, the dressed object $ϕ^{u} (x)$ is a gauge-invariant field. Similarly, one can produce gauge-invariant versions of any gauge variable, notably gauge potentials and field strength, by dressing them. The Lagrangian of an invariant gauge field theory can be rewritten in terms of these dressed variables.
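The invariance of the dressed field follows in one line, $(ϕ^{u})^{'} = (γ^{- 1} u)^{- 1} γ^{- 1} ϕ = u^{- 1} γ γ^{- 1} ϕ = ϕ^{u}$, and can also be checked numerically. The sketch below works at a single space-time point and assumes, purely for illustration, the group $S U (2)$ acting on a complex doublet; the helper names `random_su2` and `dress` and the random inputs are hypothetical.

```python
import numpy as np

def random_su2(rng):
    """Random SU(2) matrix built from a normalized quaternion (a, b, c, d)."""
    a, b, c, d = rng.normal(size=4)
    norm = np.sqrt(a * a + b * b + c * c + d * d)
    a, b, c, d = a / norm, b / norm, c / norm, d / norm
    return np.array([[a + 1j * b, c + 1j * d],
                     [-c + 1j * d, a - 1j * b]])

def dress(u, phi):
    """Dressed field phi^u = u^{-1} phi."""
    return np.linalg.inv(u) @ phi

rng = np.random.default_rng(0)
phi = rng.normal(size=2) + 1j * rng.normal(size=2)  # matter field at one point
u = random_su2(rng)                                 # dressing field at that point
gamma = random_su2(rng)                             # gauge transformation element

# Gauge transformations as defined in the text.
phi_transformed = np.linalg.inv(gamma) @ phi        # phi' = gamma^{-1} phi
u_transformed = np.linalg.inv(gamma) @ u            # u'   = gamma^{-1} u

dressed_is_invariant = np.allclose(dress(u, phi),
                                   dress(u_transformed, phi_transformed))
bare_field_changes = not np.allclose(phi, phi_transformed)
```

The check relies only on the transformation law of $u$, not on where $u$ comes from. Whether a suitable $u$ can be built locally from the field content of a given theory is a separate question.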

As it turns out, the DFM provides a general framework of which one finds various retrospective applications in gauge theory. The tetrad field in gauge reformulations of General Relativity (GR) is probably the first example of a dressing field in the physics literature (in the writings of Einstein and Weyl). The “Stueckelberg trick” (Ruegg & Ruiz-Altaba, 2004; Stueckelberg, 1938a, 1938b) is another notable early instance of the method, as is Dirac’s gauge-invariant formulation of QED (Dirac, 1955; Dirac, 1958, section 80).Footnote ³⁸ Let us also mention, for example, the study of anomalies in QFT (Garajeu, Grimm, & Lazzarini, 1995; Mañes, Stora, & Zumino, 1985; Stora, 1984), or in QCD the so-called “proton spin decomposition controversy” (François, Lazzarini, & Masson, 2015; Leader & Lorcé, 2014; Lorcé, 2013) and the issue of constructing physical quark states (Heinzl et al., 2008; Lavelle & McMullan, 1997). In recent years, the method has also proved relevant to the study of the covariant symplectic structure of gauge field theory and to proposals to elucidate the question of local subsystems in gauge field theory over bounded regions of space-time (Donnelly & Freidel, 2016; Donnelly & Giddings, 2016; François, 2021b; François, Parrini, & Boulanger, 2021; Geiller, 2017; Giddings & Weinberg, 2019; Gomes, 2019a; Gomes, Hopfmüller, & Riello, 2019; Gomes & Riello, 2018; Mathieu et al., 2020; Murgueitio Ramírez & Teh, 2020; Speranza, 2018) – see also Wallace (2022a, 2022b) on the latter issue. In this literature, dressing fields are often called edge modes.

Regarding the interpretive issues about gauge symmetries that we concern ourselves with, the main takeaway from the DFM is the following: a dressing field, and therefore the gauge-invariant variables constructed from it, can either be local (like the tetrad field in pure GR) or nonlocal (like holonomies in YM theory). In keeping with the nomenclature recalled in Section 4, if a local dressing field can be extracted, and thus a theory rewritten in a gauge-invariant yet local way, one may infer that the original gauge symmetry of the theory was an artificial one: it was genuine surplus structure, a mathematical artifact with no physical signature. When applied in particular to the electroweak model, the DFM suggests that the $S U (2)$ symmetry – allegedly broken according to the standard narrative – is actually artificial and can be removed in the new “dressed” formulation of the model by a local change of field variables.

Such a surprising conclusion is nonetheless consistent with a body of literature on invariant reformulations of theories undergoing SSB – see, for example, Maas (2019) for references – going all the way back to Higgs (1966) and Kibble (1967). Philosophers of physics, such as Lyre (2008), Smeenk (2006), Struyve (2011), and Friederich (2013, 2014), did not fail to appreciate that such reformulations, which they sometimes rediscovered for themselves, should shed new light on these theories and on the electroweak model in particular.

To flesh out this short preview, in this section we propose an introduction to the DFM. It is best formalized within the framework of differential geometry of fiber bundles, which is nowadays widely understood to be the mathematical foundation of classical gauge field theory. Yet, in the interest of facilitating the entry of a wider audience into this topic, we avoid the most geometric presentation,Footnote ³⁹ focusing instead on a field theoretic account.

We start by providing a nutshell presentation of the elementary notions of gauge field theory. We will use the language of differential forms throughout this section, first because it avoids index notations and allows one to focus on the conceptual points, and second because it is widespread in the theoretical physics literature on gauge field theory. We will here provide just enough material so that the reader can grasp our presentation of the dressing field method.Footnote ⁴⁰ We highlight how the latter can help in rewriting gauge theories in an invariant fashion, and why it can be a tool to detect genuine surplus structure – see also François (2019) for further discussion of this point. The general discussion is illustrated via several examples before we turn to the main application, the electroweak model, which is followed by final short comments on the merits and limits of its reformulation via DFM.

5.1 Classical Gauge Field Theory in a Nutshell

The set of field variables on space-time $M$ of a gauge field theory based on a Lie group $G$ , with Lie algebra $g$ , comprises matter fields $ϕ^{a} (x)$ and a gauge potential $A_{b, μ}^{a} (x)$ mediating the interactions, where $a = {0, \dots, N}$ with $N =$ dim $g$ being the gauge (color) index. The gauge potential takes values in $g$ , while the matter field takes values in a representation space $V$ of $G$ , namely a space supporting the action of $G$ and $g$ .Footnote ⁴¹ Quite often, $V = C^{N}$ and $G$ is in its defining matrix representation, so this action is a simple matrix multiplication: for $v = v^{a} \in V$ and $g \in G$ , we have $g_{b}^{a} v^{b} \in V$ . More generally, we would write $ρ (g) v \in V$ , where $ρ : G \to G L (V)$ is the representation map. The action of $X = X_{b}^{a} \in g$ would be $ρ_{*} (X) v$ , with $ρ_{*} : g \to g l (V)$ , or simply $X_{b}^{a} v^{b}$ .

In the language of differential forms, all indices are removed: $A$ is called a gauge potential 1-form, and the matter fields $ϕ$ are $0$-forms. The fields are the components of the forms. The minimal coupling between $A$ and $ϕ$ is represented by the covariant derivative: $D ϕ = d ϕ + ρ_{*} (A) ϕ$, where $d$ is the de Rham – or exterior – derivative acting on forms and raising their degree by $1$ (so $d ϕ$ is a 1-form). In components: $D_{μ} ϕ^{a} = \partial_{μ} ϕ^{a} + A_{b, μ}^{a} ϕ^{b}$. The field strength of $A$ is the 2-form $F = d A + A A$, whose components $F_{b, μ ν}^{a} (x)$ are an antisymmetric tensor on $M$, expressed in terms of the potential components as $F_{b, μ ν}^{a} = \partial_{μ} A_{b, ν}^{a} - \partial_{ν} A_{b, μ}^{a} + A_{c, μ}^{a} A_{b, ν}^{c} - A_{c, ν}^{a} A_{b, μ}^{c}$.

The gauge group $\mathcal{G}$ is defined as the set of $G$-valued functions $γ = γ_{b}^{a} : M \to G$, $x \mapsto γ (x) = γ_{b}^{a} (x)$, with pointwise group multiplication $(γ γ^{'}) (x) = γ (x) γ^{'} (x)$ – as such it is an infinite-dimensional group – whose elements also act on (transform) one another: given $η \in \mathcal{G}$, any other $γ \in \mathcal{G}$ acts on $η$ by group conjugation, $η \mapsto γ^{- 1} η γ =: η^{γ}$. The right-hand side of the equality is just a notation defined by the left-hand side and signifies the action of $γ \in \mathcal{G}$ on $η$ seen as a “field” on $M$. The gauge group is thus $\mathcal{G} := \{ γ, η : M \to G \mid η^{γ} = γ^{- 1} η γ \}$.

By definition of the gauge potential and matter fields, the gauge group $\mathcal{G}$ acts on them as follows:

\begin{matrix} A \mapsto A^{γ} := γ^{- 1} A γ + γ^{- 1} d γ \quad \text{and} \quad ϕ \mapsto ϕ^{γ} := ρ (γ)^{- 1} ϕ . \end{matrix}

(5.1)

Or in components: $A_{b, μ}^{a} \mapsto (A^{γ})_{b, μ}^{a} := (γ^{- 1})_{c}^{a} A_{d, μ}^{c} γ_{b}^{d} + (γ^{- 1})_{c}^{a} \partial_{μ} γ_{b}^{c}$ , and $ϕ^{a} \mapsto (ϕ^{γ})^{a} := (γ^{- 1})_{b}^{a} ϕ^{b}$ . These are the gauge transformations of the elementary field variables. The field strength gauge transforms as $F^{γ} = γ^{- 1} F γ$ . The covariant derivative gauge transforms as $(D ϕ)^{γ} := d ϕ^{γ} + ρ_{*} (A^{γ}) ϕ^{γ} = ρ (γ)^{- 1} D ϕ$ , that is, it transforms in the same way as the field $ϕ$ itself. Hence the name “covariant derivative” for $D$ , since it preserves the covariance of $ϕ$ .

Now, a physical theory is specified by its Lagrangian form $L$, which must be an $\mathbb{R}$-valued $n$-form on $M$ ($n = \dim M$), whose component is $\mathcal{L}$, the Lagrangian functional: $L = \mathcal{L} \, \mathrm{vol}_{n}$, with $\mathrm{vol}_{n} = \sqrt{\det g} \, d^{n} x$ the volume $n$-form on $M$.Footnote ⁴² In the case of a gauge theory, the Lagrangian is required to be gauge-invariant, that is, $L (A^{γ}, ϕ^{γ}) = L (A, ϕ)$ for any $γ$ in the gauge group. A prototypical Lagrangian for a gauge field $A$ coupled to both a scalar field $ϕ = φ$ and spinor (fermion) fields $ϕ = ψ$ is

\begin{matrix} L (A, φ, ψ) & = L_{YM} + L_{Dirac} + L_{Scalar}, \\ & = \frac{1}{2} \mathrm{Tr} (F \land * F) + ⟨ ψ, \not{D} ψ ⟩ - m ⟨ ψ, * ψ ⟩ + ⟨ D φ, * D φ ⟩ - μ^{2} ⟨ φ, * φ ⟩, \\ & = \left( - \frac{1}{4} F_{b, μ ν}^{a} F_{a}^{b, μ ν} + \bar{ψ} (\not{D} - m) ψ + (D_{μ} φ^{a})^{†} D^{μ} φ^{a} - μ^{2} (φ^{a})^{†} φ^{a} \right) \mathrm{vol}_{n}, \end{matrix}

(5.2)

where $φ$ has mass $μ$, and $ψ$ has mass $m$. In the first expression, $\land : Ω^{p} (M) \times Ω^{q} (M) \to Ω^{p + q} (M)$ is the wedge product of differential forms, while the Hodge dual $* : Ω^{p} (M) \to Ω^{n - p} (M)$ transforms $p$-forms into $(n - p)$-forms, so that, for example, for a $0$-form $φ$ we have $* φ = φ \, \mathrm{vol}_{n}$. The Dirac operator (written with a slash to distinguish it from the covariant derivative $D$) is $\not{D} ψ := γ \land * D ψ = γ^{μ} D_{μ} ψ \, \mathrm{vol}_{n}$, where $γ = γ_{μ} d x^{μ}$ is a $1$-form whose components are Dirac gamma matrices. With these, it can be checked that all the expressions proposed are indeed $n$-forms. Also, $\mathrm{Tr}$ and $⟨, ⟩$ are bilinear forms on $g$ and $V = C^{N}$ respectively, so that the expressions are $R$-valued.

This Lagrangian satisfies the Gauge Principle because it is indeed gauge-invariant, $L (A^{γ}, φ^{γ}, ψ^{γ}) = L (A, φ, ψ)$ . We remark that compliance with this principle forbids a mass term for the gauge potential $A$ , which would be $M^{2} T r (A \land * A) = M^{2} A_{b, μ}^{a} A_{a}^{b, μ}$ . Indeed, given Eq. (5.1) such a term is not invariant under $G$ . This means that, as it stands, the gauge potential is a massless field, and the interaction it mediates is a priori long-range.

As we’ve seen, the phenomenological successes of the gauge principle are impressive, as it predicts the number of mediating bosons, the form of their coupling to matter, and their masslessness (good for EM and GR). But on the other hand, it was not so clear at first that it could accommodate the phenomenology of the short-range nuclear interactions. Also, gauge symmetries make quantization a highly nontrivial matter. And, relatedly, they make it harder to isolate or distinguish the true physical d.o.f., as none of the basic gauge variables are true physical fields, since none are gauge-invariant objects!

The common theme in all these pragmatic challenges is the necessity to reduce gauge symmetries. A few options are available. Famously, as we have seen in Section 3, the SSB mechanism was devised to accommodate massive mediators of the weak interaction. It is well known that gauge fixing – which amounts to selecting particular representatives in the gauge orbits of field variables, thus to a “breaking by hand” of the gauge symmetry – is a key technical step for many quantization schemes (e.g. BRST quantization); see Section 2.2.4. The “dressing field approach” is another tool to achieve gauge symmetry reduction in gauge theories. We describe it in the next section and discuss some of its potential impact regarding philosophical issues previously raised.

5.2 Reduction of Gauge Symmetries via Dressing

We begin by defining the central object of the approach: the dressing field. Consider a gauge theory based on a gauge group $\mathcal{H}$ with rigid subgroup $H$. Suppose there is a subgroup $K \subseteq H$ to which corresponds a subgroup $\mathcal{K} \subset \mathcal{H}$ of the gauge group. Let us furthermore suppose there is a group $G$ such that either $G \subseteq H$ (it could be that $G = K$) or $G \supseteq H$.

Definition 1.

A $K$ -dressing field is a map $u : M \to G$ , $x \mapsto u (x) = u (x)_{b}^{a}$ (i.e. $G$ -valued field) defined by its $K$ -gauge transformation:

\begin{matrix} u^{κ} := κ^{- 1} u, \quad \text{for } κ \in K . \end{matrix}

(5.3)

In index notation this is $(u^{κ})_{b}^{a} := (κ^{- 1})_{c}^{a} u_{b}^{c}$ . We denote the space of such $G$ -valued $K$ -dressing fields by $D r [G, K]$ .

Given the existence of a $K$ -dressing field, we have the following,

Proposition 2.

Given the gauge potential $A$ and gauge-tensorial fields $a = \{ F, φ, D φ \}$ transforming via representations $ρ$ of $H$, one may define the following dressed fields:

\begin{matrix} A^{u} := u^{- 1} A u + u^{- 1} d u \quad \text{and} \quad a^{u} := ρ (u)^{- 1} a . \end{matrix}

(5.4)

These are $K$ -invariant, as is easily checked given (5.1) – specialized for $κ \in K$ – and (5.3).

In components, the dressed gauge potential is: $(A^{u})_{b, μ}^{a} := (u^{- 1})_{c}^{a} A_{d, μ}^{c} u_{b}^{d} + (u^{- 1})_{c}^{a} \partial_{μ} u_{b}^{c}$ . The dressed curvature is $F^{u} = u^{- 1} F u = d A^{u} + 1 / 2 [A^{u}, A^{u}]$ , i.e. in components $(F^{u})_{b, μ ν}^{a} := (u^{- 1})_{c}^{a} F_{d, μ ν}^{c} u_{b}^{d}$ . The dressed covariant derivative is $D^{A^{u}} = d + ρ_{*} (A^{u})$ . A dressed matter field is $ϕ^{u} := ρ (u)^{- 1} ϕ$ , or $(ϕ^{u})^{a} := (u^{- 1})_{b}^{a} ϕ^{b}$ , its coupling to the dressed potential is given by $D^{A^{u}} ϕ^{u} = ρ (u)^{- 1} D ϕ = d ϕ^{u} + ρ_{*} (A^{u}) ϕ^{u}$ .

A corollary of the above is that, in case $u$ is an $H$-dressing field, the dressed fields (5.4) are strictly $H$-invariant. Note also that for the dressings $a^{u}$ to make sense for $G \supset H$, we must assume that representations $ρ$ of $H$ extend to representations of $G$.

Let us emphasize an important fact: comparing the definition of the gauge group with the definition (5.3) of a dressing field shows clearly that $u \notin K$, since elements of the gauge group transform by conjugation whereas $u$ transforms by left multiplication. Therefore, (5.4) are not gauge transformations, despite their formal resemblance to (5.1). This means, for example, that the dressed field $A^{u}$ is no longer a gauge potential, and a fortiori is not a point in the gauge $K$-orbit $O_{K} [A]$ of $A$, so that $A^{u}$ must not be confused with a gauge fixing of $A$.
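The invariance claimed in Proposition 2 can be checked numerically at a single space-time point for the matter field. The following is only a minimal sketch under stated assumptions: the group is taken to be $S U (2)$ in its defining representation, the helper names (`su2`, `dress`, `random_su2`) are hypothetical, and the dressed potential is left out because it would require derivatives of the fields.

```python
import cmath
import math
import random

def su2(a, b, c):
    """SU(2) matrix [[alpha, beta], [-conj(beta), conj(alpha)]] from three angles
    (an illustrative parametrization, not taken from the text)."""
    alpha = cmath.exp(1j * a) * math.cos(c)
    beta = cmath.exp(1j * b) * math.sin(c)
    return [[alpha, beta], [-beta.conjugate(), alpha.conjugate()]]

def inverse(m):
    """For a unitary matrix the inverse is the conjugate transpose."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

def dress(u, phi):
    """Dressed matter field phi^u = u^{-1} phi, as in (5.4)."""
    return matvec(inverse(u), phi)

rng = random.Random(0)

def random_su2():
    return su2(*(rng.uniform(0.0, 2.0 * math.pi) for _ in range(3)))

u = random_su2()       # value of the dressing field at one point x
kappa = random_su2()   # value of a gauge transformation at the same point
phi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]

# Transform both ingredients: u -> kappa^{-1} u (5.3), phi -> kappa^{-1} phi (5.1).
u_kappa = matmul(inverse(kappa), u)
phi_kappa = matvec(inverse(kappa), phi)

# The dressed field should be unchanged, up to rounding error.
invariance_defect = max(
    abs(x - y) for x, y in zip(dress(u, phi), dress(u_kappa, phi_kappa))
)
print(invariance_defect)
```

The check works because the two factors of $κ$ cancel: $(κ^{- 1} u)^{- 1} κ^{- 1} ϕ = u^{- 1} ϕ$. Note that `u` is multiplied on the left only, which is the sense in which it does not transform like an element of the gauge group.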

5.2.1 Residual Gauge Transformations

Let us indulge in a brief digression that is also a transition. In the BRST framework, infinitesimal gauge transformations are encoded in the BRST bigraded (in form and ghost degrees) differential algebra:

\begin{matrix} s A = - D^{A} v, \quad s a = - ρ_{*} (v) a, \quad \text{and} \quad s v + \tfrac{1}{2} [v, v] = 0, \end{matrix}

(5.5)

where $v$ is the Lie $H$-valued ghost field – which plays the role of the infinitesimal gauge parameter $χ$ – and $s$ is the BRST differential increasing the ghost degree by 1. It satisfies $d s + s d = 0$ and $s^{2} \equiv 0$ – due to the third relation defining its action on the ghost field. This relation is why $s$ is best interpreted geometrically as the de Rham derivative on the gauge group $H$ and $v$ as its Maurer–Cartan form (Bonora & Cotta-Ramusino, 1983).

One shows that, at a purely formal level, the dressed variables satisfy a modified BRST $^{u}$ algebra:

\begin{matrix} s A^{u} = - D^{A^{u}} v^{u}, \quad s a^{u} = - ρ_{*} (v^{u}) a^{u}, \quad \text{and} \quad s v^{u} + \tfrac{1}{2} [v^{u}, v^{u}] = 0, \end{matrix}

(5.6)

where one defines the dressed ghost,

\begin{matrix} v^{u} : = u^{- 1} v u + u^{- 1} s u . \end{matrix}

(5.7)

In the special case where $u$ is an $H$-dressing, its defining gauge transformation translates as $s u = - v u$. Then, by (5.7), the dressed ghost is $v^{u} = u^{- 1} v u - u^{- 1} v u = 0$ and BRST $^{u}$ is trivial, $s A^{u} = 0$, and $s a^{u} = 0$. In the more general case of a $K$-dressing, $u$ achieving only partial gauge reduction, BRST $^{u}$ only makes sense if it encodes residual gauge transformations of the dressed fields (5.4).

To speak meaningfully about these, we must assume that $K$ is a normal subgroup, $K ◃ H$, so that $J := H / K$ is indeed a group, to which corresponds the (residual) gauge subgroup $J \subset H$. Now, the action of $J$ on the initial field variables $A$ and $a$ is known. Therefore, what will determine the $J$-residual gauge transformations of the dressed fields is the action of $J$ on the dressing field. In that regard, consider the following propositions:

Proposition 3.

If the $K$ -dressing field $u$ has $J$ -transformation given by

\begin{matrix} u^{η} = η^{- 1} u η \quad \text{for } η \in J, \end{matrix}

(5.8)

then the residual $J$ -gauge transformations of the dressed fields are

\begin{matrix} (A^{u})^{η} = η^{- 1} A^{u} η + η^{- 1} d η \quad \text{and} \quad (a^{u})^{η} = ρ (η)^{- 1} a^{u} . \end{matrix}

(5.9)

In particular, $(F^{u})^{η} = η^{- 1} F^{u} η$ and $(ϕ^{u})^{η} = ρ (η)^{- 1} ϕ^{u}$ . That is, in this case, the dressed fields are genuine $J$ -gauge fields.

In the BRST language, the normality of $K$ in $H$ implies $v = v_{K} + v_{J}$ , where $v_{K}$ and $v_{J}$ are respectively Lie $K$ - and Lie $J$ -valued, and in accordance $s = s_{K} + s_{J}$ . The defining $K$ -transformation of the dressing field translates as $s_{K} u = - v_{K} u$ , while its $J$ -transformation assumed in Proposition 3 is encoded as $s_{J} u = [u, v_{J}]$ . The dressed ghost field is thus

\begin{matrix} v^{u} & = u^{- 1} (v_{K} + v_{J}) u + u^{- 1} (s_{K} + s_{J}) u \\ & = u^{- 1} (v_{K} + v_{J}) u + u^{- 1} (- v_{K} u + [u, v_{J}]) = v_{J} . \end{matrix}

(5.10)

Therefore, the modified (actually reduced) BRST $^{u}$ algebra is now

\begin{matrix} s_{J} A^{u} = - D^{A^{u}} v_{J}, \quad s_{J} a^{u} = - ρ_{*} (v_{J}) a^{u}, \quad \text{and} \quad s v_{J} + \tfrac{1}{2} [v_{J}, v_{J}] = 0. \end{matrix}

(5.11)

As expected, it encodes the infinitesimal residual $J$ -gauge transformations of the dressed fields.

Consider now the Lagrangian $L (A, ϕ)$ of the initial $H$ -gauge theory, and suppose a $K$ -dressing field satisfying the preceding propositions is available. Then:

Proposition 4.

Due to the $H$ -invariance of the Lagrangian, which holds as a formal property of $L$ as a functional, and due to the formal similarity between a gauge transformation (5.1) and a dressing operation (5.4), it is a fact that

\begin{matrix} L (A, ϕ) = L (A^{u}, ϕ^{u}) . \end{matrix}

(5.12)

That is, the $H$ -gauge theory can be rewritten in terms of $K$ -invariant variables, which means that it becomes a $J$ -gauge theory: the $K$ -gauge symmetry sector has been neutralized.

As a corollary, in case an $H$-dressing field is available, the gauge symmetry of the theory $L$ is fully reduced. Note again that as $u \notin K \subset H$, the dressed fields $χ^{u} = (A^{u}, ϕ^{u})$ are not points in the gauge orbits $O [χ]$ of the gauge variables $χ = (A, ϕ)$. So, the dressed Lagrangian $L (A^{u}, ϕ^{u})$ is not a gauge-fixed version of $L (A, ϕ)$.

5.2.2 Ambiguity in Choosing a Dressing Field

The dressed fields may exhibit residual transformations of another kind resulting from a potential ambiguity in choosing the dressing field. A priori, two dressings $u, u^{'} \in D r [G, K]$ may be related by $u^{'} = u ξ$, where $ξ : M \to G$. And since by definition $u^{κ} = κ^{- 1} u$ and $(u^{'})^{κ} = κ^{- 1} u^{'}$, it must be that $ξ^{κ} = ξ$. Let us denote the group of such $K$-invariant functions by $G := \{ ξ : M \to G \mid ξ^{κ} = ξ \}$ and denote its action on a dressing field as $u^{ξ} := u ξ$.

By definition, $G$ has no action on the initial field space $Φ$ : note that $A^{ξ} = A$ and $ϕ^{ξ} = ϕ$ , so on all gauge-tensorial objects $a^{ξ} = a$ . On the other hand, it is clear how $G$ acts on dressed fields:

\begin{matrix} (A^{u})^{ξ} & := (A^{ξ})^{u^{ξ}} = A^{u ξ} = ξ^{- 1} A^{u} ξ + ξ^{- 1} d ξ, \quad \text{and} \\ (a^{u})^{ξ} & := (a^{ξ})^{u^{ξ}} = a^{u ξ} = ρ (ξ^{- 1}) a^{u} . \end{matrix}

(5.13)

In particular, $(F^{u})^{ξ} = ξ^{- 1} F^{u} ξ$ and $(ϕ^{u})^{ξ} = ρ (ξ)^{- 1} ϕ^{u}$. The new dressed fields $(A^{u})^{ξ}$ and $(a^{u})^{ξ}$ are also $K$-invariant. This means that the bijective correspondence between the $K$-dressed fields $(χ^{u})^{ξ}$, for $χ = (A, a)$, and their gauge $K$-orbits $O_{K} [χ]$ holds $\forall ξ \in G$. So, there is a $1 : 1$ correspondence $O_{K} [χ] \sim O_{G} [χ^{u}]$. What this tells us is that owing to the ambiguity in the choice of dressing, the reduced gauge symmetry is replaced with a local symmetry that is (at least) as big.
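The matter-field part of (5.13) can likewise be illustrated pointwise. This is a sketch under assumptions, not a general proof: the target group of the dressing is taken to be $S U (2)$, $ξ$ is a random group element standing for the value of a $K$-invariant function at one point, and all helper names are hypothetical.

```python
import cmath
import math
import random

def su2(a, b, c):
    """SU(2) matrix from three angles (illustrative parametrization)."""
    alpha = cmath.exp(1j * a) * math.cos(c)
    beta = cmath.exp(1j * b) * math.sin(c)
    return [[alpha, beta], [-beta.conjugate(), alpha.conjugate()]]

def inverse(m):
    """Conjugate transpose, which inverts a unitary matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

def max_abs_diff(v, w):
    return max(abs(x - y) for x, y in zip(v, w))

rng = random.Random(2)

def random_su2():
    return su2(*(rng.uniform(0.0, 2.0 * math.pi) for _ in range(3)))

u, xi, kappa = random_su2(), random_su2(), random_su2()
phi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]

u_prime = matmul(u, xi)                          # alternative dressing u' = u xi
dressed = matvec(inverse(u), phi)                # phi^u
dressed_prime = matvec(inverse(u_prime), phi)    # phi^{u xi}

# (5.13) for matter fields: phi^{u xi} = xi^{-1} phi^u.
ambiguity_defect = max_abs_diff(dressed_prime, matvec(inverse(xi), dressed))

# u' is still a K-dressing: since xi^kappa = xi, (u')^kappa = kappa^{-1} u xi,
# so the newly dressed field remains invariant under kappa.
u_prime_kappa = matmul(matmul(inverse(kappa), u), xi)
phi_kappa = matvec(inverse(kappa), phi)
invariance_defect = max_abs_diff(
    dressed_prime, matvec(inverse(u_prime_kappa), phi_kappa)
)
print(ambiguity_defect, invariance_defect)
```

Both defects are expected to sit at rounding-error level, while `dressed` and `dressed_prime` themselves differ: the choice between $u$ and $u ξ$ changes the dressed field by the $ξ$-transformation but never spoils its $K$-invariance.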

The only way in which a meaningful constraint on this arbitrariness could arise is if the dressing field is built from the initial gauge variables. We thus write $u$ as a field-dependent functional, $u : Φ \to D r [G, K]$ , $(A, ϕ) \mapsto u [A, ϕ]$ . Then, it may be that this constructive procedure is such that $G$ is reduced to a “small,” rigid/global, or perhaps even discrete subgroup. Even if it is not so, this $G$ -symmetry may be an interesting new gauge symmetry.

These situations are represented in most fruitful applications (Attard & François, 2017, 2018; Attard et al., 2018; François, 2019). Notably, in the context of the tetrad formulation of general relativity (without spinors), the cotetrad field is a full dressing for the Lorentz gauge group $H = S O (1, 3)$, and $G = G L (4, R)$ is the group of local coordinate changes.

5.2.3 A Connection Form on $A$

Here we comment on another very useful parallel between dressings and a standard geometric construction on $A$ that may shed some light on the arbitrariness of the dressing. In the language of Appendix B, the main idea of the functional connection form is to infinitesimally (i.e. perturbatively) define a right-equivariant horizontal distribution. Thus a functional connection form is a 1-form on the field-space $A$ that is valued in Lie $H$ , obeying (infinite-dimensional versions of the) equivariance and projection equations, as in (B-1)–(B-2). In practice, connection forms provide particular examples of dressings that can be heuristically associated to (perturbative) gauge fixings.

One of the main advantages of the connection form is that in theories that possess a kinetic term in the Lagrangian, one can whittle down the enormous space of possible connection forms by using this term to define a supermetric on field space and thereby define a connection form by orthogonality with respect to the gauge orbits. This choice has several pragmatic advantages (Gomes & Riello, 2021, section 2).

In more generality, in the fully Lorentz covariant framework, such a functional connection form can be seen as acting on linearized fields, separating that field into a “pure-gauge” (or vertical) and a “physical” (or horizontal) component. This decomposition is gauge-covariant: it meshes nicely with the action of the gauge group on the linearized fields.

Moreover, when the connection form is integrable, namely when it possesses no associated curvature, it also provides a complete gauge fixing, and so a dressing associated to that gauge fixing. For example, in the Abelian theory, if $A (s)$ is a path in $A$ , with $A (0) = 0$ and $A (1) = A$ , then we can integrate the connection form $ϖ$ along this path to obtain the Dirac dressing as

\begin{matrix} u (A) := \int_{0}^{1} ϖ \left( \frac{d}{d s} A (s) \right) d s . \end{matrix}

The integrability condition means that the resulting dressing is independent of the path chosen. Refer to Gomes and Riello (2021, section 5) and Gomes and Riello (2018) for more detail.

As to the physical interpretation, to the extent that (linearized) gauge fixings can be seen as an appeal to relational properties/observables, then so too can the resulting horizontally projected fields.

Lastly, one can also reproduce the BRST transformations (cf. Section 5.2.1) geometrically in this formalism; see Gomes and Riello (2017).

5.2.4 On Substantial versus Artificial Gauge Symmetries

As previous sections have explained, Section 2.4 in particular, there is a (usually) recognized trade-off between locality, as understood in a field-theoretic sense, and gauge invariance: a gauge theory is written either in terms of local gauge-variant variables or in terms of nonlocal gauge-invariant variables. A theory displaying such a trade-off would be said to have a substantial gauge symmetry, whereas a theory that does not and can be (re)written in terms of local gauge-invariant variables would be said to have an artificial gauge symmetry. From that viewpoint, a true/substantial gauge symmetry signals that physical degrees of freedom (d.o.f.) have a form of nonlocality to them. “Fake”/artificial gauge symmetries signal nothing of the sort and can be dispensed with at no physical cost: they are genuine surplus as defined in Section 2.1.2. The interpretive issues surrounding gauge symmetries and their physical relevance then apply to the substantial kind only.

Connecting to this discussion, the DFM may suggest a way to assess the nature of the gauge symmetry in a theory $L (A, ϕ)$. If one is able to find, or build, a (field theoretically) local dressing field $u (A, ϕ)$, then the theory can be rewritten as $L (A^{u}, ϕ^{u})$ in terms of the variables $A^{u}$ and $ϕ^{u}$ that are gauge-invariant and remain local, showing decisively that the gauge symmetry of the theory is artificial. In a theory with substantial gauge symmetry, any $u (A, ϕ)$ would be nonlocal and so would be the gauge-invariant variables $A^{u}$ and $ϕ^{u}$: the rewriting $L (A, ϕ) = L (A^{u}, ϕ^{u})$ would then be the formal expression of the trade-off alluded to previously. Of course, the failure to find a local dressing field may be attributed to a failure of imagination on the part of the theorist, or to a less-than-thorough search. So the strategy is asymmetric: finding a local dressing field is conclusive, but not finding one is not. Yet, for all practical purposes it is rather effective.

As a matter of interpretation, if $u = u (A)$ one could say that $ϕ^{u}$ represents the bare charged matter field shrouded in the gauge field it sources. Similarly, $A^{u}$ would be a self-enveloping charged gauge field acting as a source for itself (e.g. gluons in QCD). For abelian gauge fields, the latter interpretation is not available. If, on the other hand, $u = u (ϕ)$, one could see $A^{u}$ as the gauge field embedded in a pervasive “sea” generated by $ϕ$, and likewise for $ϕ^{u}$ itself (which is reminiscent of a Higgs-like interpretation, since, for this interpretation to hold, $ϕ$ must be nowhere vanishing).

Let us consider some examples before coming to the main case of interest, the electroweak model.

5.3 Examples

An early example of an (abelian) dressing field is the so-called Stueckelberg field, introduced in Stueckelberg (1938a, 1938b); see Ruegg and Ruiz-Altaba (2004) for a review. An abelian Stueckelberg-type model involves a potential $A \in Ω^{1} (U, Lie U (1))$ and a Stueckelberg field $B \in Ω^{0} (U, R)$, respectively transforming as $A^{γ} = A - d θ$ and $B^{γ} = B - μ θ$, with $γ = e^{i θ} \in U (1)$ and $μ$ some constant. A prototypical (minimal) Stueckelberg $U (1)$ model would be

\begin{matrix} L (A, B) = \frac{1}{2} F * F + μ^{2} (A - \frac{1}{μ} d B) * (A - \frac{1}{μ} d B) . \end{matrix}

(5.14)

As just said, $B$ is actually a local abelian dressing field: defining $u (B) : = e^{\frac{i}{μ} B}$ , it is clear that $u {(B)}^{γ} : = u (B^{γ}) = e^{\frac{i}{μ} (B - μ θ)} = γ^{- 1} u (B)$ . The associated $U (1)$ -invariant local dressed field is then $A^{u} := A + i u^{- 1} d u = A - \frac{1}{μ} d B$ , with field strength $F^{u} = F$ . So (5.14) is manifestly rewritten as

\begin{matrix} L (A, B) = L (A^{u}) = \frac{1}{2} F^{u} * F^{u} + μ^{2} A^{u} * A^{u}, \end{matrix}

(5.15)

which is a Proca Lagrangian for $A^{u}$ with no gauge symmetry. According to the DFM, as $u$ and $A^{u}$ are local, the original $U (1)$ symmetry of the model (5.14) is artificial.
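The invariance of the dressed Stueckelberg potential $A^{u} = A - \frac{1}{μ} d B$ can be illustrated on a one-dimensional grid. What follows is only a toy sketch: the field profiles, the grid, the value of `MU`, and the use of a forward finite difference in place of the exterior derivative are all assumptions made for illustration.

```python
import math

MU = 0.7    # hypothetical value of the constant mu
N = 200     # number of grid points
DX = 0.05   # grid spacing
xs = [i * DX for i in range(N)]

def derivative(f):
    """Forward finite difference, a stand-in for d on 0-forms (length N - 1)."""
    return [(f[i + 1] - f[i]) / DX for i in range(len(f) - 1)]

# Arbitrary smooth profiles for the single component of A, for B, and for theta.
A = [math.sin(x) for x in xs]
B = [math.cos(2.0 * x) for x in xs]
theta = [x * x * math.exp(-x) for x in xs]

def dressed_potential(potential, stueckelberg):
    """A^u = A - (1/mu) dB, evaluated where the finite difference is defined."""
    dB = derivative(stueckelberg)
    return [potential[i] - dB[i] / MU for i in range(len(dB))]

# Gauge transformation: A -> A - d(theta), B -> B - mu * theta.
dtheta = derivative(theta)
A_gauge = [A[i] - dtheta[i] for i in range(N - 1)]
B_gauge = [B[i] - MU * theta[i] for i in range(N)]

before = dressed_potential(A, B)
after = dressed_potential(A_gauge, B_gauge)

dressed_change = max(abs(a - b) for a, b in zip(before, after))
bare_change = max(abs(A[i] - A_gauge[i]) for i in range(N - 1))
print(bare_change, dressed_change)
```

The bare potential shifts by $d θ$, while the dressed combination changes only at rounding-error level, because the shift of $A$ is cancelled by the shift of $\frac{1}{μ} d B$. Since the finite difference is linear, the cancellation holds on the grid itself and does not depend on the grid being fine.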

Theories $L (A, ϕ)$ with an abelian gauge potential $A \in Ω^{1} (U, Lie U (1))$ coupled to a charged scalar field $ϕ \in Ω^{0} (U, C)$ provide another illustration. The gauge transformations of the potential $A$ and the $C$ -scalar field $ϕ$ are $A^{γ} = A + γ^{- 1} d γ$ and $ϕ^{γ} = γ^{- 1} ϕ$ , for $γ \in U (1)$ . Now, one can extract a $U (1)$ -dressing field from the scalar field by the polar decomposition $ϕ = u ρ$ with $ρ = \sqrt{ϕ^{*} ϕ}$ its modulus and $u$ its phase. Obviously $ρ$ is invariant while the phase carries the transformation $u^{γ} = γ^{- 1} u$ . The latter is indeed a local dressing field whose associated gauge-invariant local fields are $A^{u} = A + u^{- 1} d u$ and $ϕ^{u} = u^{- 1} ϕ = ρ$ . Any such theory can be rewritten in terms of these variables, $L (A, ϕ) = L (A^{u}, ϕ^{u})$ , which shows that the $U (1)$ -gauge symmetry is artificial.
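The polar decomposition argument can be checked at a single point with ordinary complex numbers. This is a minimal sketch with hypothetical names (`polar_dressing` and the sample values are not from the text); it also makes explicit that the phase, and hence the dressing field, is only defined where $ϕ$ does not vanish.

```python
import cmath
import math
import random

def polar_dressing(phi):
    """Split phi = u * rho into its phase u and modulus rho (phi must be nonzero)."""
    rho = abs(phi)
    if rho == 0.0:
        raise ValueError("the phase, hence the dressing field, is undefined where phi vanishes")
    return phi / rho, rho

rng = random.Random(1)
phi = complex(rng.gauss(0, 1), rng.gauss(0, 1))          # value of the scalar field at a point
gamma = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))  # U(1) gauge transformation at that point

u, rho = polar_dressing(phi)
u_gamma, rho_gamma = polar_dressing(phi / gamma)   # phi^gamma = gamma^{-1} phi

phase_defect = abs(u_gamma - u / gamma)   # tests u^gamma = gamma^{-1} u
modulus_defect = abs(rho_gamma - rho)     # tests that rho is gauge-invariant
dressed_defect = abs(phi / u - rho)       # tests phi^u = u^{-1} phi = rho
print(phase_defect, modulus_defect, dressed_defect)
```

All three defects are expected to be at rounding-error level. The `ValueError` branch reflects the caveat that extracting $u$ from $ϕ$ presupposes a nowhere-vanishing scalar field.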

In particular, the Aharonov–Bohm (AB) effect – see Section 2.2.2 – formulated in the framework of $C$-scalar EM loses its puzzling edge, as identified by Wallace (2014), since it can be interpreted as resulting from the local interaction of the gauge-invariant local fields $A^{u}$ and $ϕ^{u}$ outside the cylinder.

The abelian Higgs model also belongs to this framework. The Lagrangian of the theory would be

\begin{matrix} L (A, ϕ) & = \frac{1}{2} F * F + (D^{A} ϕ)^{*} * D^{A} ϕ - V (ϕ) \, \mathrm{vol}_{n} \\ \text{with} \quad V (ϕ) & = μ^{2} ϕ^{*} ϕ + λ (ϕ^{*} ϕ)^{2}, \end{matrix}

(5.16)

where ${v o l}_{n}$ is a volume $n$ -form on $U$ , $ϕ^{*}$ is the conjugate of $ϕ$ , and in the potential $V : C^{2} \to R$ we must have $λ > 0$ . Taking this formulation of the model at face value, one notices that for $μ^{2} > 0$ there is only one invariant vacuum solution $ϕ_{0} = 0$ minimizing $V$ , but for $μ^{2} < 0$ there is a whole $U (1)$ -orbit of vacua with modulus $| ϕ_{0} | = \sqrt{\frac{- μ^{2}}{2 λ}}$ . If and when $ϕ$ settles for one of these vacua – spontaneously breaking $U (1)$ – and fluctuates around it, $ϕ = ϕ_{0} + H$ , a mass term $m_{A} = g ϕ_{0}$ for $A$ appears in the Lagrangian via the minimal coupling term $D^{A} ϕ = d ϕ + g A ϕ$ , with $g$ a coupling constant. A mass for the gauge potential $A$ is thus generated, it seems, via spontaneous gauge symmetry breaking (SSB).

Yet as just seen, the model can be rewritten in a $U (1)$ -invariant way via dressing as

\begin{matrix} L (A^{u}, ρ) & = \frac{1}{2} F^{u} * F^{u} + (D^{A^{u}} ρ)^{*} * D^{A^{u}} ρ - V (ρ) \, \mathrm{vol}_{n} \\ \text{with} \quad V (ρ) & = μ^{2} ρ^{2} + λ ρ^{4} . \end{matrix}

(5.17)

Thus rewritten, there is no gauge symmetry to break. The potential is now $V : R^{+} \to R$ and has a unique vacuum configuration for either sign of $μ^{2}$: $ρ_{0} = 0$ for $μ^{2} > 0$ and $ρ_{0} = \sqrt{\frac{- μ^{2}}{2 λ}}$ for $μ^{2} < 0$. Writing $ρ = ρ_{0} + H$, we see that the theory still has a massless ($μ^{2} > 0$, $m_{A} = 0$) and a massive ($μ^{2} < 0$, $m_{A} = g ρ_{0}$) phase. The mass is generated via a vacuum phase transition, but it is not tied to an SSB. And indeed, as said earlier, as $A^{u}$ and $ρ$ are local gauge-invariant fields, the $U (1)$-symmetry of the initial model is artificial and plays no physical role.
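The two phases of the dressed model can be made concrete with a small numerical sketch. The function names and the sample values of $μ^{2}$, $λ$, and $g$ below are assumptions chosen for illustration; the closed-form vacuum follows from setting $V^{'} (ρ) = 2 μ^{2} ρ + 4 λ ρ^{3} = 0$ on $ρ \geq 0$, and it is compared against a brute-force grid search.

```python
import math

def potential(rho, mu2, lam):
    """Dressed potential V(rho) = mu^2 rho^2 + lambda rho^4 of (5.17)."""
    return mu2 * rho ** 2 + lam * rho ** 4

def vacuum(mu2, lam):
    """Unique minimizer of V on rho >= 0 (requires lambda > 0)."""
    if lam <= 0:
        raise ValueError("lambda must be positive for V to be bounded below")
    return 0.0 if mu2 >= 0 else math.sqrt(-mu2 / (2.0 * lam))

def gauge_boson_mass(mu2, lam, g):
    """Mass m_A = g * rho_0 read off from the minimal coupling term."""
    return g * vacuum(mu2, lam)

def grid_minimum(mu2, lam, rho_max=5.0, steps=50001):
    """Brute-force minimizer of V on a uniform grid over [0, rho_max]."""
    best = min(range(steps),
               key=lambda i: potential(rho_max * i / (steps - 1), mu2, lam))
    return rho_max * best / (steps - 1)

LAM, G = 0.5, 0.65   # hypothetical coupling values

rho_massless = vacuum(+1.0, LAM)   # mu^2 > 0
rho_massive = vacuum(-1.0, LAM)    # mu^2 < 0
mass_massless = gauge_boson_mass(+1.0, LAM, G)
mass_massive = gauge_boson_mass(-1.0, LAM, G)
grid_massless = grid_minimum(+1.0, LAM)
grid_massive = grid_minimum(-1.0, LAM)
print(rho_massless, rho_massive, mass_massless, mass_massive)
```

In both cases the grid search finds a single minimizer on $ρ \geq 0$ that agrees with the closed form, which illustrates the point of the text: the massive phase appears without any degenerate orbit of vacua to choose from.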

In line with the remarks made earlier in the general setting, despite a formal resemblance the preceding dressing should not be confused with the unitary gauge fixing of the model: the dressed model (5.17) is not a gauge fixing of (5.16).Footnote ⁴³

For pure $H$-gauge theories $L (A)$, there are not many options to work with to build a dressing field $u (A)$. One attempt that has been explored in relation to the “proton spin decomposition controversy” (Leader & Lorcé, 2014; Lorcé, 2013) is to split the potential as $A = A_{phys} + A_{pure}$. By assumption, only $A_{phys}$ contributes to the field strength, $F = F_{phys}$, and it transforms as $A_{phys}^{γ} = γ^{- 1} A_{phys} γ$ for $γ \in H$. So, $A_{pure}$ is pure gauge, $F_{pure} = 0$, which means that it can be written as $A_{pure} = u d u^{- 1}$ for some $H$-valued function $u : U \to H$. Since it must also transform as a connection, $A_{pure}^{γ} = γ^{- 1} A_{pure} γ + γ^{- 1} d γ$, it means that $u^{γ} = γ^{- 1} u$. In other words, $u = u (A)$ is a local $H$-dressing field. The dressed fields are then $A^{u} = A_{phys}^{u} := u^{- 1} A_{phys} u$ (the contribution $u^{- 1} A_{pure} u = - u^{- 1} d u$ cancels against the inhomogeneous term in (5.4)) and $F^{u} = F_{phys}^{u} = d A_{phys}^{u} + 1 / 2 [A_{phys}^{u}, A_{phys}^{u}]$, so that the theory is rewritten $L (A) = L (A_{phys}^{u})$. The same can be done for a theory including spinors, namely fermions, $L (A, ψ) = L (A_{phys}^{u}, ψ^{u})$.

This, however, is unsatisfactory. Indeed the ansatz decomposition $A = A_{phys} + A_{pure}$ reflects the affine nature of the connection space $C$, so that $A_{phys}$ is the local representative of a tensorial form, and $A_{pure}$ that of a flat connection. But then it can be shown that the existence of a global flat connection means the underlying bundle $P$ is trivial, which further implies that the ambiguity group in choosing $u$ is isomorphic to the initial gauge group, $G ≃ H$. In view of (5.13), $G$ is a (gauge) symmetry of $L (A_{phys}^{u}, ψ^{u})$. So, nothing has been really achieved by the ansatz, as the bare and dressed theories are entirely isomorphic. See François et al. (2015) for details.

As far as is known, in pure $H$ -gauge theories $L (A)$ any dressing field $u (A)$ is nonlocal, so that according to the DFM the initial $H_{loc}$ symmetry is substantial. The same seems likely for non-abelian gauge theories including spinor fields, $L (A, ψ)$ , as there is no polar decomposition of a spinor $ψ$ from which one could extract a local dressing field $u (ψ)$ . Applications of the DFM in the context of such theories, building nonlocal $u (A)$ ’s, provide in particular a geometric basis for Dirac’s gauge-invariant formulation of QED (Dirac, 1955; Dirac, 1958, section 80) – see François (2019) for a discussion – as well as for the construction of quark ( $ψ^{u}$ ) and gluon ( $A^{u}$ ) states in QCD such as Bagan, Lavelle, and McMullan (2000) and Lavelle and McMullan (1997).

We notice that it is when formulated in the context of spinorial EM that the AB effect retains its physical significance by displaying how EM properties are encoded nonlocally. Indeed, the effect cannot be explained by the local interaction of gauge-invariant fields outside the cylinder (or so it seems), as the gauge-invariant fields $A^{u}$ and $ψ^{u}$ are nonlocal.

5.4 Invariant Formulation of the Electroweak Model

The basic idea behind the DFM has featured repeatedly in reformulations of theories undergoing SSB. It is seen in the pioneering work of Higgs on abelian models (Higgs, 1966) and of Kibble on non-abelian models (Kibble, 1967). It resurfaced in the work of Banks and Rabinovici on the abelian Higgs model (Banks & Rabinovici, 1979), and shortly after in the work of Fröhlich, Morchio, and Strocchi (1980, 1981) on the invariant formulation of the electroweak model, which is still today a point of reference (known as the FMS approach; see Section 6). Since then, the idea has been found again in several works also concerned with invariant formulations of (aspects of) the electroweak model (Buchmüller, Fodor, & Hebecker, 1994; Chernodub et al., 2008; Faddeev, 2009; Grosse-Knetter & Kögerler, 1993; Ilderton et al., 2010; Kondo, 2018; Lavelle & McMullan, 1995; Masson & Wallet, 2011; Morris, 2000a, 2000b; Rosten, 2012). The recent review on Higgs physics (Maas, 2019) emphasizes the importance of gauge-invariant formulations, and flavors of the DFM can be recognized there.

In the past fifteen years, the fact that such reformulations may cast a new light on electroweak physics, and gauge physics more generally, has been appreciated by philosophers of physics such as Smeenk (2006), Lyre (2008), Struyve (2011), and Friederich (2013, 2014).

In the following we propose the most natural reformulation via DFM of a simplified electroweak model (considering only leptons and massless neutrinos). It may be compared to the FMS approach for this case.

The space of fields of the model is $χ = \{a, b, φ, ψ_{L}, ψ_{R}\}$ . The gauge potentials are $a \in Ω^{1} (U, Lie (U (1)))$ and $b \in Ω^{1} (U, Lie (S U (2)))$ , with field strengths $F$ and $G$ , respectively. We have a scalar field in the fundamental representation of $S U (2)$ , $φ = (φ_{1}, φ_{2})^{T} \in Ω^{0} (U, C^{2})$ , as well as a left-handed (Weyl) fermion doublet (leptons, say) $ψ_{L} = (ν_{L}, ℓ_{L})^{T}$ , and a right-handed fermion singlet $ψ_{R} = ℓ_{R}$ . The scalar and fermions couple minimally to the gauge potentials via the covariant derivatives

\begin{matrix} D φ & = d φ + (g b + g^{'} a) φ, \\ D ψ_{L} & = d ψ_{L} + (g b - g^{'} a) ψ_{L}, \\ D ψ_{R} & = d ψ_{R} - 2 g^{'} a ψ_{R}, \end{matrix}

with $g$ and $g^{'}$ coupling constants. The gauge group $H = U (1) \times S U (2)$ acts, for $α \in U (1)$ and $β \in S U (2)$ , as follows:

\begin{matrix} a^{α} & = a + \frac{1}{g^{'}} α^{- 1} d α, \quad b^{α} = b, \quad φ^{α} = α^{- 1} φ, \quad \text{and} \quad ψ_{L / R}^{α} = (\begin{matrix} α ψ_{L} \\ α^{2} ψ_{R} \end{matrix}), \\ a^{β} & = a, \quad b^{β} = β^{- 1} b β + \frac{1}{g} β^{- 1} d β, \quad φ^{β} = β^{- 1} φ, \quad \text{and} \quad ψ_{L / R}^{β} = (\begin{matrix} β^{- 1} ψ_{L} \\ ψ_{R} \end{matrix}) . \end{matrix}

(5.18)
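As a consistency check on the charges in (5.18), one can verify that each covariant derivative transforms like the field it acts on. A sketch for the right-handed singlet under $U (1)$ , reading its covariant derivative as $D ψ_{R} = d ψ_{R} - 2 g^{'} a ψ_{R}$ (the coefficient $2$ is an assumption about the intended normalization, matching the charge $α^{2}$ of $ψ_{R}$ in (5.18)):

\begin{matrix} (D ψ_{R})^{α} & = d (α^{2} ψ_{R}) - 2 g^{'} (a + \frac{1}{g^{'}} α^{- 1} d α) α^{2} ψ_{R} \\ & = 2 α d α ψ_{R} + α^{2} d ψ_{R} - 2 g^{'} a α^{2} ψ_{R} - 2 α d α ψ_{R} \\ & = α^{2} D ψ_{R} . \end{matrix}

The same cancellation of the inhomogeneous terms occurs for $D φ$ and $D ψ_{L}$ with charges $α^{- 1}$ and $α$ , respectively.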

The $H$ -invariant Lagrangian form of the theory is

\begin{matrix} L (a, b, φ, ψ_{L / R}) & = \frac{1}{2} \mathrm{Tr} (F \land * F) + \frac{1}{2} \mathrm{Tr} (G \land * G) + ⟨ D φ, * D φ ⟩ - V (φ) \\ & + ⟨ ψ_{L / R}, \not{D} ψ_{L / R} ⟩ + f_{ℓ} ⟨ ψ_{L}, * φ ⟩ ψ_{R} + f_{ℓ} \bar{ψ}_{R} ⟨ φ, * ψ_{L} ⟩ . \end{matrix}

(5.19)

The potential term is $V (φ) = μ^{2} ⟨ φ, * φ ⟩ + λ ⟨ φ, * φ ⟩^{2}$ with $μ^{2} \in R$ , $λ > 0$ , and $⟨, ⟩$ is a Hermitian form on $C^{2}$ . The Dirac operator is $\not{D} = γ \land * D$ , with $γ = γ_{μ} d x^{μ}$ the Dirac matrix-valued 1-form. The constants $f_{ℓ} \in R$ are Yukawa couplings specific to each type of lepton ( $ℓ = e, μ, τ$ ).

As the usual narrative goes (see e.g. Becchi and Ridolfi, 2006), if $μ^{2} < 0$ , the electroweak vacuum given by the minimum of $V (φ)$ seems degenerate, as it appears to be an $S U (2)$ -orbit of nonvanishing vacuum expectation values for $φ$ . When the latter settles randomly, spontaneously, on one of them, this breaks $S U (2)$ and generates mass terms for the fields to which it couples (minimally, or via Yukawa terms).

The dressing field method suggests an alternative proposition. Indeed it is not hard to find a dressing field in the electroweak model. Considering the polar decomposition in $C^{2}$ of the scalar field, $φ = u ρ$ , with

\begin{matrix} u (φ) & = \frac{1}{ρ} (\begin{matrix} φ_{2}^{*} & φ_{1} \\ - φ_{1}^{*} & φ_{2} \end{matrix}) \in S U (2) \quad \text{and} \quad ρ := (\begin{matrix} 0 \\ | | φ | | \end{matrix}) \in R^{+} \subset C^{2}, \\ \text{one has} \quad φ^{β} & = β^{- 1} φ \Rightarrow u^{β} = β^{- 1} u . \end{matrix}

(5.20)
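The properties claimed in (5.20) are easy to test numerically. The sketch below uses only the Python standard library; the helper names (`dressing`, `random_su2`, `check_dressing`) are hypothetical and introduced purely for illustration, and the scalar factor $1 / ρ$ is read as $1 / | | φ | |$ . It checks that $u (φ)$ is unitary with unit determinant, that $u ρ$ reproduces $φ$ , and that $u (β^{- 1} φ) = β^{- 1} u (φ)$ for a random $β \in S U (2)$ .

```python
import math
import random

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 complex matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def dressing(phi):
    """u(phi) of (5.20) for a nonzero doublet phi = (phi1, phi2)."""
    p1, p2 = phi
    norm = math.sqrt(abs(p1) ** 2 + abs(p2) ** 2)
    return [[p2.conjugate() / norm, p1 / norm],
            [-p1.conjugate() / norm, p2 / norm]]

def random_su2(rng):
    """Random SU(2) element [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(t * t for t in x))
    a, b = complex(x[0], x[1]) / n, complex(x[2], x[3]) / n
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def check_dressing(seed=0):
    """Return (unitary, unit_det, rebuilds_phi, covariant) for a random phi."""
    rng = random.Random(seed)
    phi = (complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)),
           complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))
    norm = math.sqrt(abs(phi[0]) ** 2 + abs(phi[1]) ** 2)
    u = dressing(phi)
    identity = [[1.0, 0.0], [0.0, 1.0]]
    unitary = close(matmul(dagger(u), u), identity)
    unit_det = abs(u[0][0] * u[1][1] - u[0][1] * u[1][0] - 1.0) < 1e-12
    # u acting on rho = (0, ||phi||)^T is the second column of u times ||phi||.
    rebuilds_phi = (abs(u[0][1] * norm - phi[0]) < 1e-12
                    and abs(u[1][1] * norm - phi[1]) < 1e-12)
    # SU(2) gauge transformation phi -> beta^{-1} phi, with beta^{-1} = beta^dagger.
    beta_inv = dagger(random_su2(rng))
    phi_beta = (beta_inv[0][0] * phi[0] + beta_inv[0][1] * phi[1],
                beta_inv[1][0] * phi[0] + beta_inv[1][1] * phi[1])
    covariant = close(dressing(phi_beta), matmul(beta_inv, u))
    return unitary, unit_det, rebuilds_phi, covariant
```

Calling `check_dressing()` with different seeds samples different field values; the check is undefined only at $φ = 0$ , where the polar decomposition itself breaks down.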

Thus, $u$ is an $S U (2)$ -dressing field that can be used to construct the $S U (2)$ -invariant composite fields:

\begin{matrix} b^{u} & = u^{- 1} b u + \frac{1}{g} u^{- 1} d u = : B, \quad \text{and} \quad G^{u} = u^{- 1} G u = d B + \frac{g}{2} [B, B], \end{matrix}

(5.21)

\begin{matrix} φ^{u} & = u^{- 1} φ = ρ, \quad \text{and} \quad (D φ)^{u} = D^{u} ρ = d ρ + (g B + g^{'} a) ρ, \end{matrix}

(5.22)

\begin{matrix} ψ_{L}^{u} & = u^{- 1} ψ_{L} = : (\begin{matrix} ν_{L}^{u} \\ ℓ_{L}^{u} \end{matrix}) \quad \text{and} \quad (D ψ_{L})^{u} = D^{u} ψ_{L}^{u} = d ψ_{L}^{u} + (g B - g^{'} a) ψ_{L}^{u} . \end{matrix}

(5.23)

Since $u$ is local, so are the preceding composite fields. Therefore, we might suggest that the $S U (2)$ -gauge symmetry of the model is artificial, so that the theory defined by the electroweak Lagrangian (5.19) is rewritten as the $U (1)$ -gauge theory,Footnote ⁴⁴

\begin{matrix} L (a, B, ρ, ψ_{L}^{u}, ψ_{R}) & = \frac{1}{2} \mathrm{Tr} (F \land * F) + \frac{1}{2} \mathrm{Tr} (G^{u} \land * G^{u}) + ⟨ D^{u} ρ, * D^{u} ρ ⟩ - V (ρ) \\ & + ⟨ ψ_{L}^{u}, \not{D}^{u} ψ_{L}^{u} ⟩ + ⟨ ψ_{R}, \not{D} ψ_{R} ⟩ + f_{ℓ} ⟨ ψ_{L}^{u}, * ρ ⟩ ψ_{R} + f_{ℓ} \bar{ψ}_{R} ⟨ ρ, * ψ_{L}^{u} ⟩ . \end{matrix}

(5.24)

The interpretation of the model in terms of SSB is here superfluous, and indeed impossible when expressed in the form (5.24). Analyzing the residual substantial $U (1)$ -gauge symmetry of the model allows us to go a step further in exhibiting the physical d.o.f.

Let us remark that the preceding dressed fields essentially reproduce the invariant variables used in Banks and Rabinovici (1979) and the seminal FMS approach (Fröhlich et al., 1980, 1981). In particular, it is easy to compare $ρ \sim {(φ_{1}^{*} φ_{1} + φ_{2}^{*} φ_{2})}^{\frac{1}{2}}$ and $(\begin{matrix} ν_{L}^{u} \\ ℓ_{L}^{u} \end{matrix}) = \frac{1}{ρ} (\begin{matrix} φ_{2} ν_{L} - φ_{1} ℓ_{L} \\ φ_{1}^{*} ν_{L} + φ_{2}^{*} ℓ_{L} \end{matrix})$ to, for example, eq. (6.1) of Fröhlich et al. (1981).

5.4.1 Residual $U (1)$ Symmetry

By its very definition $ρ^{β} = ρ^{α} = ρ$ , so it is already a fully $H$ -gauge invariant scalar field that then qualifies as a potential observable. As explained in Section 5.2, the $U (1)$ -residual gauge transformations of the $S U (2)$ -invariant composite fields depend on the $U (1)$ -gauge transformation of the dressing field $u$ : $(χ^{u})^{α} = (χ^{α})^{u^{α}}$ . One finds that

\begin{matrix} u (φ)^{α} : = u (φ^{α}) = u (φ) \tilde{α}, where \tilde{α} = (\begin{matrix} α & 0 \\ 0 & α^{- 1} \end{matrix}) . \end{matrix}

This is not the kind of residual transformation shown in Proposition 3, yet the general logic applies and we get $(χ^{u})^{α} = (χ^{α})^{u \tilde{α}}$ . So, using (5.18), we easily find

\begin{matrix} \begin{matrix} B^{α} & = {\tilde{α}}^{- 1} B \tilde{α} + \frac{1}{g} {\tilde{α}}^{- 1} d \tilde{α}, & (G^{u})^{α} & = {\tilde{α}}^{- 1} G^{u} \tilde{α}, \\ (ψ_{L}^{u})^{α} & = {\tilde{α}}^{- 1} α ψ_{L}^{u}, & (D^{u} ψ_{L}^{u})^{α} & = {\tilde{α}}^{- 1} α D^{u} ψ_{L}^{u}, \\ ρ^{α} & = (α \tilde{α})^{- 1} ρ = ρ, & (D^{u} ρ)^{α} & = (α \tilde{α})^{- 1} D^{u} ρ . \end{matrix} \end{matrix}

(5.25)
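For instance, the transformations of $ψ_{L}^{u}$ and $ρ$ in (5.25) follow directly from (5.18) and $u^{α} = u \tilde{α}$ . A short sketch, using only that the $U (1)$ phase $α$ commutes with the matrices $u$ and $\tilde{α}$ :

\begin{matrix} (ψ_{L}^{u})^{α} & = (u \tilde{α})^{- 1} ψ_{L}^{α} = {\tilde{α}}^{- 1} u^{- 1} α ψ_{L} = {\tilde{α}}^{- 1} α ψ_{L}^{u}, \\ ρ^{α} & = (u \tilde{α})^{- 1} φ^{α} = {\tilde{α}}^{- 1} u^{- 1} α^{- 1} φ = (α \tilde{α})^{- 1} ρ . \end{matrix}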

By a simple inspection of the matrices $α \tilde{α} = (\begin{matrix} α^{2} & 0 \\ 0 & 1 \end{matrix})$ and ${\tilde{α}}^{- 1} α = (\begin{matrix} 1 & 0 \\ 0 & α^{2} \end{matrix})$ , it is clear on the one hand that the top component $ν_{L}^{u}$ of $ψ_{L}^{u}$ is $U (1)$ -invariant,Footnote ⁴⁵ and on the other hand that $U (1)$ -invariant combinations of $a$ and (components of) $B$ are to be found in the covariant derivatives. And indeed, given the decomposition $B = B_{a} σ^{a}$ , where $σ^{a}$ are the hermitian Pauli matrices and $B_{a} \in i R$ , we have explicitly

\begin{matrix} B & = (\begin{matrix} B_{3} & B_{1} - i B_{2} \\ B_{1} + i B_{2} & - B_{3} \end{matrix}) = : (\begin{matrix} B_{3} & W^{-} \\ W^{+} & - B_{3} \end{matrix}), \\ \text{so that} \quad B^{α} & = (\begin{matrix} B_{3} + \frac{1}{g} α^{- 1} d α & α^{- 2} W^{-} \\ α^{2} W^{+} & - B_{3} - \frac{1}{g} α^{- 1} d α \end{matrix}) . \end{matrix}

(5.26)

The linear combination $g B_{3} - g^{'} a = : {(g^{2} + {g^{'}}^{2})}^{\frac{1}{2}} Z^{0}$ , obviously $U (1)$ -invariant, appears in both $D^{u} ρ$ and $D^{u} ψ_{L}^{u}$ . One may observe that the combination $A : = {(g^{2} + {g^{'}}^{2})}^{- \frac{1}{2}} (g^{'} B_{3} + g a)$ , $U (1)$ -transforms as $A^{α} = A + \frac{1}{e} α^{- 1} d α$ with $e = g g^{'} / \sqrt{g^{2} + {g^{'}}^{2}}$ . It would be natural to expect it to appear, together with $Z^{0}$ , in the bottom component of $(D^{u} ψ_{L}^{u})$ . Explicitly

\begin{matrix} D^{u} ρ & = & d ρ + (g^{'} a + g B) ρ = (\begin{matrix} g W^{-} ρ \\ d ρ + (g^{'} a - g B_{3}) ρ \end{matrix}) \\ = & (\begin{matrix} g W^{-} ρ \\ d ρ - {(g^{2} + {g^{'}}^{2})}^{\frac{1}{2}} Z^{0} ρ \end{matrix}) = (\begin{matrix} g W^{-} ρ \\ d ρ - \frac{e}{\cos θ_{W} \sin θ_{W}} Z^{0} ρ \end{matrix}), \end{matrix}

(5.27)

so,

\begin{matrix} (D^{u} ρ)^{α} = (\begin{matrix} α^{- 2} & 0 \\ 0 & 1 \end{matrix}) (\begin{matrix} g W^{-} ρ \\ d ρ - \frac{e}{\cos θ_{W} \sin θ_{W}} Z^{0} ρ \end{matrix}) \end{matrix}

by (5.25) or (5.26). And

\begin{matrix} D^{u} ψ_{L}^{u} & = d ψ_{L}^{u} + (g B - g^{'} a) ψ_{L}^{u} = (\begin{matrix} d ν_{L}^{u} + (g B_{3} - g^{'} a) ν_{L}^{u} + g W^{-} ℓ_{L}^{u} \\ d ℓ_{L}^{u} - (g B_{3} + g^{'} a) ℓ_{L}^{u} + g W^{+} ν_{L}^{u} \end{matrix}) \\ & = (\begin{matrix} d ν_{L}^{u} + {(g^{2} + {g^{'}}^{2})}^{\frac{1}{2}} Z^{0} ν_{L}^{u} + g W^{-} ℓ_{L}^{u} \\ d ℓ_{L}^{u} - 2 e A ℓ_{L}^{u} - \frac{g^{2} - {g^{'}}^{2}}{\sqrt{g^{2} + {g^{'}}^{2}}} Z^{0} ℓ_{L}^{u} + g W^{+} ν_{L}^{u} \end{matrix}) \\ & = (\begin{matrix} d ν_{L}^{u} + \frac{e}{\cos θ_{W} \sin θ_{W}} Z^{0} ν_{L}^{u} + g W^{-} ℓ_{L}^{u} \\ d ℓ_{L}^{u} - 2 e A ℓ_{L}^{u} - e (\frac{1}{\cos θ_{W} \sin θ_{W}} - 2 \frac{\sin θ_{W}}{\cos θ_{W}}) Z^{0} ℓ_{L}^{u} + g W^{+} ν_{L}^{u} \end{matrix}), \end{matrix}

(5.28)

so

\begin{matrix} (D^{u} ψ_{L}^{u})^{α} = (\begin{matrix} 1 & 0 \\ 0 & α^{2} \end{matrix}) (\begin{matrix} d ν_{L}^{u} + \frac{e}{\cos θ_{W} \sin θ_{W}} Z^{0} ν_{L}^{u} + g W^{-} ℓ_{L}^{u} \\ d ℓ_{L}^{u} - 2 e A ℓ_{L}^{u} - e (\frac{1}{\cos θ_{W} \sin θ_{W}} - 2 \frac{\sin θ_{W}}{\cos θ_{W}}) Z^{0} ℓ_{L}^{u} + g W^{+} ν_{L}^{u} \end{matrix}) \end{matrix}

by (5.25) or (5.26). The preceding calculations introduce the weak mixing (or Weinberg) angle $θ_{W}$ via $\cos θ_{W} : = g / \sqrt{g^{2} + {g^{'}}^{2}}$ and $\sin θ_{W} : = g^{'} / \sqrt{g^{2} + {g^{'}}^{2}}$ , so that the change of field variables $(a, B_{3}) \to (A, Z^{0})$ can be written as a rotation in field space,

\begin{matrix} (\begin{matrix} A \\ Z^{0} \end{matrix}) = (\begin{matrix} \cos θ_{W} & \sin θ_{W} \\ - \sin θ_{W} & \cos θ_{W} \end{matrix}) (\begin{matrix} a \\ B_{3} \end{matrix}) = (\begin{matrix} \cos θ_{W} a + \sin θ_{W} B_{3} \\ \cos θ_{W} B_{3} - \sin θ_{W} a \end{matrix}) . \end{matrix}
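The algebraic identities used in (5.27), (5.28), and the rotation above can be spot-checked numerically. The sketch below is illustrative only: the function names and the coupling values are arbitrary assumptions, the fields are replaced by plain numbers, and `delta` is a numerical stand-in for a component of $α^{- 1} d α$ . Since every identity involved is linear in the fields, this is enough to test the coefficients.

```python
import math

def weinberg(g, gp):
    """Return (cos theta_W, sin theta_W, e) for couplings g and g'."""
    root = math.hypot(g, gp)
    return g / root, gp / root, g * gp / root

def rotate(a, b3, g, gp):
    """Map (a, B_3) to (A, Z0) by the rotation through the weak mixing angle."""
    c, s, _ = weinberg(g, gp)
    return c * a + s * b3, c * b3 - s * a

def check_mixing(g=0.65, gp=0.35, a=0.3, b3=-1.1, delta=0.7, tol=1e-12):
    """Return (defs_ok, coupling_ok, split_ok, shift_ok)."""
    c, s, e = weinberg(g, gp)
    root = math.hypot(g, gp)
    big_a, z0 = rotate(a, b3, g, gp)
    # Definitions of Z0 and A in terms of a and B_3.
    defs_ok = (abs(root * z0 - (g * b3 - gp * a)) < tol
               and abs(root * big_a - (gp * b3 + g * a)) < tol)
    # e / (cos sin) equals sqrt(g^2 + g'^2), as used in (5.27).
    coupling_ok = abs(e / (c * s) - root) < tol
    # Bottom component of (5.28): g B_3 + g' a = 2 e A + (g^2 - g'^2)/sqrt(...) Z0.
    split_ok = abs((g * b3 + gp * a)
                   - (2 * e * big_a + (g * g - gp * gp) / root * z0)) < tol
    # Residual U(1): a -> a + delta/g', B_3 -> B_3 + delta/g.
    a_new, z_new = rotate(a + delta / gp, b3 + delta / g, g, gp)
    shift_ok = abs(z_new - z0) < tol and abs(a_new - (big_a + delta / e)) < tol
    return defs_ok, coupling_ok, split_ok, shift_ok
```

The last check illustrates why $Z^{0}$ counts as $U (1)$ -invariant while $A$ shifts by $\frac{1}{e} α^{- 1} d α$ : the two inhomogeneous terms cancel in one combination and add up in the other.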

The electroweak theory (5.24) is then expressed in terms of the $H$ -invariant fields $ρ, Z^{0}, ν_{L}^{u}$ and the $U (1)$ -gauge fields $W^{\pm}, A, ℓ_{L}^{u}, ℓ_{R}$ . Writing explicitly the parts of the Lagrangian relevant to the next point to be discussed, we have

\begin{matrix} L (A, W^{\pm}, Z^{0}, ρ, ℓ_{L}^{u}, ℓ_{R}, ν_{L}^{u}) & = \frac{1}{2} \mathrm{Tr} (F \land * F) + \frac{1}{2} \mathrm{Tr} (G^{u} \land * G^{u}) \\ & + d ρ \land * d ρ - g^{2} ρ^{2} W^{+} \land * W^{-} - (g^{2} + {g^{'}}^{2}) ρ^{2} Z^{0} \land * Z^{0} - (μ^{2} ρ^{2} + λ ρ^{4}) \mathrm{vol}_{n} \\ & + ⟨ ψ_{L}^{u}, \not{D}^{u} ψ_{L}^{u} ⟩ + ⟨ ψ_{R}, \not{D} ψ_{R} ⟩ + f_{ℓ} (\bar{ℓ}_{L}^{u} ρ ℓ_{R} + \bar{ℓ}_{R} ρ ℓ_{L}^{u}) \mathrm{vol}_{n} . \end{matrix}

(5.29)

One can expand the $R^{+}$ -valued scalar field $ρ$ around its unique ground state $ρ_{0}$ , given by the minimum of $V (ρ)$ , as $ρ = ρ_{0} + H$ , where $H$ is the gauge-invariant Higgs field. Then, in the phase $μ^{2} < 0$ of the theory, where $ρ_{0} = \sqrt{- μ^{2} / 2 λ}$ , mass terms $m_{Z^{0}} = ρ_{0} \sqrt{(g^{2} + {g^{'}}^{2})}$ and $m_{W^{\pm}} = ρ_{0} g$ for $Z^{0}, W^{\pm}$ appear from the couplings of the electroweak fields with $ρ$ ,Footnote ⁴⁶ and the latter’s self-interaction produces a mass $m_{H} = ρ_{0} \sqrt{2 λ}$ for $H$ , while mass terms $m_{ℓ} = ρ_{0} f_{ℓ}$ for the Dirac spinor leptons $ℓ = (ℓ_{L}^{u}, ℓ_{R})^{T}$ are produced by Yukawa couplings.
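The value of $ρ_{0}$ and the relation between the two vector masses can be made explicit. A brief sketch, assuming that the ground state is the stationary point with $ρ > 0$ of the potential $V (ρ) = μ^{2} ρ^{2} + λ ρ^{4}$ appearing in (5.29), and using the mass terms just quoted:

\begin{matrix} V^{'} (ρ_{0}) & = 2 μ^{2} ρ_{0} + 4 λ ρ_{0}^{3} = 0 \Rightarrow ρ_{0}^{2} = - \frac{μ^{2}}{2 λ}, \\ \frac{m_{W^{\pm}}}{m_{Z^{0}}} & = \frac{g}{\sqrt{g^{2} + {g^{'}}^{2}}} = \cos θ_{W} . \end{matrix}

For $μ^{2} > 0$ the only stationary point is $ρ = 0$ , which is the configuration excluded by the polar decomposition.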

Masses for gauge fields and leptons are obtained through a phase transition of the unique electroweak vacuum, but it is not congruent with a spontaneous gauge symmetry breaking, as the model is $S U (2)$ -invariant – and the physical d.o.f. are manifest – in both phases.Footnote ⁴⁷ The DFM approach to the electroweak model is consistent with Elitzur’s theorem (Elitzur, 1975) stating that in lattice gauge theory a gauge symmetry cannot be spontaneously broken.

5.4.2 Discussion

To reiterate a general remark in this context, (5.29) formally looks like the electroweak Lagrangian in the unitary gauge, yet it is conceptually different, as a dressing is not a gauge fixing.

Another noteworthy difference is that while the model (5.19) is defined for $φ \in C^{2}$ , the dressed version (5.24)/(5.29) is only defined for $φ \in C^{2} \setminus \{0\}$ because the polar decomposition (5.20) is not well defined at $φ = 0$ . Thus, the standard and dressed versions have different scalar field configuration topologies. In the massive phase ( $μ^{2} < 0$ ), this should be of little concern regarding the perturbative regime and appears also to be irrelevant nonperturbatively (Fernandez et al., 1992).

This is more troubling, however, in the phase $μ^{2} > 0$ , as this means that the absolute minimum $ρ_{0} = 0$ is not an available configuration, so that the mass terms are not vanishing, but vanishingly small. One could be tempted to retreat behind the fact that this phase of the theory is not realized in nature at present and is beyond experimental reach, yet the electroweak phase transition is believed to have occurred in the early universe and contributed to baryogenesis. So one cannot evade the necessity to assess the consequences (cosmological and otherwise) of not having strictly zero masses in the $μ^{2} > 0$ phase. It turns out there are arguments as to why this may finally be irrelevant, or at worst lead to relic monopoles (Fernandez et al., 1992).

Another question worth pursuing is the quantization of the dressed model. As it is formally similar to the unitary-gauge version of the theory, indications that the model can in principle be quantized in the unitary gauge (instead of the usual $R_{ξ}$ -gauge) (Irges & Koutroulis, 2017; Mainland & O’Raifeartaigh, 1975; Ross, 1973; Woodhouse, 1974) may speak in favor of the view that (5.29) lends itself well to perturbation theory. It is not obvious that the quantized version of (5.29) is exactly equivalent to the standard one, so it may be interesting to compare them to see if one has some theoretical edge over the other. This has been done in lattice simulations, and within systematic errors no deviations have been observed (Evertz et al., 1986; Philipsen, Teper, & Wittig, 1996).

The previously mentioned problem in the $μ^{2} > 0$ phase is avoided in the alternative invariant FMS approach to the model, which is also the one for which serious perturbative and lattice calculations have been done (e.g. Afferrante et al., 2021; Dudal et al., 2020; Dudal, van Egmond, et al., 2021; Maas, 2019; Maas & Sondenheimer, 2020; Maas & Törek, 2018; Sondenheimer, 2020), so it is most easily weighed against the standard literature. It also has the advantage of being easily generalizable to $S U (n)$ gauge theories.

6 The Fröhlich–Morchio–Strocchi Approach

So far, we have viewed and used the DFM as a direct reformulation of the degrees of freedom of the electroweak sector of the standard model to demonstrate that the local gauge structure is artificial and a change to a manifestly gauge-invariant formulation is possible. On practical grounds, in particular regarding quantization, an alternative viewpoint on the DFM is useful. Therefore, we keep the basic philosophy of the previous sections but slightly change the perspective.

In the following, we quantize the actual gauge symmetry based on the gauge-dependent elementary degrees of freedom but consider only $n$ -point functions of strictly gauge-invariant objects. From that perspective, we perform the analysis of physical observables in a gauge theory with BEH mechanism in a QCD-like fashion. In QCD, quarks and gluons are used to describe the microscopic degrees of freedom, but observable quantities are only gauge-invariant bound states, for example, the hadrons. For a general BEH theory, we can do the same. At first sight this seems to be at odds with the tremendous success of the common perturbative treatment to describe electroweak processes at current and past collider facilities. However, certain properties of these gauge-invariant objects can be mapped on properties of gauge-dependent objects within particular classes of gauges, which was first observed by Fröhlich, Morchio, and Strocchi (FMS) (Fröhlich et al., 1980, 1981).

6.1 The Fröhlich–Morchio–Strocchi Approach for the Electroweak Model

First of all, note that we can already reinterpret the dressed fields defined in Eq. (5.21) and Eq. (5.23) as gauge-invariant composite bound state operators. Ignoring the common treatment of the BEH mechanism, these are precisely some of the simplest possible gauge-invariant operators one would construct as observables of an $S U (2)$ gauge theory with fundamental scalar and fermion fields. For instance, one would construct an $S U (2)$ gauge-invariant combination of a left-handed fermion field with a scalar $φ^{†} ψ_{L}$ (cf. $ℓ_{L}^{u}$ in Eq. (5.23)) or the charge conjugate scalar ${\tilde{φ}}^{†} ψ_{L} = (ε φ^{*})^{†} ψ_{L}$ (cf. $ν_{L}^{u}$ in Eq. (5.23)) with $ε$ being the two-dimensional Levi–Civita tensor of $S U (2)$ . Similarly, we can construct gauge-invariant vector operators (with respect to the Lorentz group), for example, $φ^{†} D φ - {\tilde{φ}}^{†} D \tilde{φ}$ , ${\tilde{φ}}^{†} D φ$ , and $φ^{†} D \tilde{φ}$ (cf. $b^{u}$ in Eq. (5.21)).

Keeping this strategy, we can also define a strictly gauge-invariant scalar operator, $φ^{†} φ$ . Here, we do not rely on the polar decomposition of the complex scalar doublet $φ$ , which factors out the $S U (2)$ gauge-dependent contribution as in Eq. (5.22), but construct a gauge-invariant scalar object by dressing the elementary scalar field with its hermitian conjugate. An additional advantage of this viewpoint is that these types of bound state operators can be investigated for all forms of the scalar potential $V (φ)$ , regardless of whether it has only one minimum at vanishing field configuration or a multitude of different (possibly even gauge-inequivalent) minima. Thus, we now have a conceptually clean setup that can be used in all parameter regions of the model. However, the apparent disadvantage of these gauge-invariant formulations is that we have to compute properties of composite objects instead of using perturbative techniques for the elementary degrees of freedom.

In general, a bound state is a nontrivial object, and the computation of its properties from first principles requires nonperturbative techniques. Nevertheless, the $n$ -point functions of some potential bound state operators can be computed in a fairly simple way in a BEH model. In order to examine this, FMS proposed to gauge fix the field configurations, for example, via ’t Hooft gauge, such that the scalar field acquires a nontrivial VEV. In this case, we are able to perform the conventional split

\begin{matrix} φ (x) = \frac{v}{\sqrt{2}} φ_{0} + Δ φ (x), \end{matrix}

(6.1)

where $φ_{0}$ is a unit vector in gauge space denoting the direction of the VEV (e.g., $φ_{0} = (0, 1)^{T}$ is a common choice), $v$ is the modulus of the VEV, and $Δ φ$ denotes fluctuations around it. The latter contains the field that is usually identified with the Higgs boson as well as the three would-be Goldstone modes that mix with those gauge bosons that acquire a nonvanishing mass term due to the BEH mechanism. With the aid of $φ_{0}$ , we can extract these fields in a covariant but obviously not a gauge-invariant way. Therefore, they cannot belong to the physical spectrum of the model if the gauge structure is merely a redundancy in our description. The Higgs field $h = \sqrt{2} Re (φ_{0}^{†} Δ φ)$ is the radial component of the fluctuation field in the direction of the VEV, while the Goldstone modes are excitations in the remaining orthogonal directions, $Δ \overset{˘}{φ} = Δ φ - Re (φ_{0}^{†} Δ φ) φ_{0}$ .

By using such a gauge with nonvanishing VEV, we are able to rewrite the $n$ -point functions of the gauge-invariant bound state operator in terms of $n$ -point functions of gauge-variant objects. For instance, we obtain, for the connected part of the propagator,

\begin{matrix} ⟨ (φ^{†} φ) (x) (φ^{†} φ) (y) ⟩ & = v^{2} ⟨ h (x) h (y) ⟩ + 2 v ⟨ h (x) (Δ φ^{†} Δ φ) (y) ⟩ \\ + ⟨ (Δ φ^{†} Δ φ) (x) (Δ φ^{†} Δ φ) (y) ⟩ . \end{matrix}

(6.2)
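The operator identity behind (6.2) can be written out explicitly. As a sketch, inserting the split (6.1) into $φ^{†} φ$ and using $φ_{0}^{†} φ_{0} = 1$ together with $h = \sqrt{2} Re (φ_{0}^{†} Δ φ)$ gives

\begin{matrix} φ^{†} φ & = \frac{v^{2}}{2} + \frac{v}{\sqrt{2}} (φ_{0}^{†} Δ φ + Δ φ^{†} φ_{0}) + Δ φ^{†} Δ φ \\ & = \frac{v^{2}}{2} + \sqrt{2} v Re (φ_{0}^{†} Δ φ) + Δ φ^{†} Δ φ \\ & = \frac{v^{2}}{2} + v h + Δ φ^{†} Δ φ . \end{matrix}

The constant $v^{2} / 2$ drops out of the connected two-point function, which leaves the three types of terms collected in (6.2).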

We ordered the terms on the right-hand side according to the number of fluctuation fields $Δ φ$ appearing in the $n$ -point functions (note that $h$ is also a component of $Δ φ$ ). However, this FMS expansion of the bound state $φ^{†} φ$ should not merely be viewed as an expansion in small fluctuations around the VEV. The FMS expansion is finite by construction and rather an exact rewriting of the original gauge-invariant operator. Thus, Eq. (6.2) holds for any field amplitude $Δ φ$ even in the nonperturbative regime. Nevertheless, using the number of fluctuation fields as an ordering scheme is an efficient method to extract the main information of the FMS expansion, in particular in the weak coupling regime. The first term on the right-hand side, namely the leading order term with respect to the ordering parameter $Δ φ / v$ , is the propagator of the (gauge-variant) elementary Higgs field $h$ . Therefore, certain properties of the gauge-invariant bound state propagator can already be extracted from $⟨ h (x) h (y) ⟩$ .

For instance, let us consider the mass and decay width of the state generated by $φ^{†} φ$ . These properties are encoded in the pole structure of its propagator. Ignoring for a moment the higher-order terms of the FMS expansion, we obtain that the pole of the gauge-invariant bound state propagator coincides with the pole structure of the elementary Higgs propagator. In addition, it can be shown to all orders in a perturbative expansion of the $n$ -point functions that the higher-order terms of the FMS expansion do not alter the pole structure on the right-hand side (Maas & Sondenheimer, 2020). Therefore, the on-shell properties of $φ^{†} φ$ are well described by the propagator $⟨ h (x) h (y) ⟩$ . Of course, the pole of the bound state operator has to be gauge-invariant by construction. This translates at the level of the elementary $h$ field to the well-known Nielsen identities, which show that the pole of $⟨ h (x) h (y) ⟩$ is independent of the gauge-fixing parameter within ’t Hooft gauges (Grassi, Kniehl, & Sirlin, 2002; Nielsen, 1975).

However, the latter fact does not mean that the elementary Higgs field can be associated with the experimentally observed Higgs boson. The Nielsen identities merely show that certain gauge-invariant information of the model can be extracted from the field $h$ , but $h$ itself is still gauge-dependent. In particular, every single term on the right-hand side of the FMS expansion is gauge-dependent and can only be computed within the specifically chosen gauge. Without gauge fixing, any of these Green’s functions will vanish since the action as well as the path integral measure are gauge-invariant. The fact that they are nontrivial within the common treatment is merely due to the conventional gauge-fixing procedure. Choosing a gauge automatically implies an explicit breaking of the gauge symmetry. However, this is done by hand and should not be confused with spontaneous symmetry breaking. For instance, gauges can be constructed that induce a vanishing VEV of the scalar field even if the potential has a nontrivial global minimum. For these types of gauges, the mass parameters of the various elementary fields would be zero to any order in a perturbative expansion. Nonetheless, the properties of a gauge-invariant object such as the scalar bound state operator $φ^{†} φ$ are independent of the gauge. The FMS formulation basically reveals that in some gauges, namely those that are conventionally used in the particle physics community, some gauge-invariant information of the system can be computed in a convenient way as it is stored in the $n$ -point functions of elementary fields. Further, we have perturbative access to it in the weak coupling regime, as all terms on the right-hand side of Eq. (6.2) can be computed via perturbative techniques. Therefore, we have reduced the problem of calculating the properties of a complicated but strictly gauge-invariant bound state operator to computing $n$ -point functions of elementary fields and composites of elementary fields in a gauge-fixed setup.

That this nontrivial relation is indeed realized has been validated by nonperturbative lattice simulations for an $S U (2)$ Yang–Mills–Higgs theory (Maas & Mufti, 2014, 2015). The lattice formulation provides a clean setup for this check, as no gauge fixing is required to compute the properties of a gauge-invariant bound state. Furthermore, gauge-fixed configurations can be generated that allow for a nonperturbative investigation of the elementary $n$ -point functions. Thus, both sides of the relation can be investigated independently. By contrast, a perturbative analysis can only investigate the terms on the right-hand side due to the necessity to gauge fix. Investigating the spectrum in the scalar channel of the model, lattice simulations confirm that the mass of the gauge-invariant bound state operator coincides with the mass of the elementary Higgs field as dictated by the FMS relation. Considering the vector channel, one would expect three degenerate massive vector bosons due to the BEH mechanism from the conventional analysis. Constructing bound state operators, we are able to write down a gauge-invariant triplet of states that maps precisely on the elementary triplet of vector bosons via the FMS formulation in this model. This relation also has been confirmed by lattice investigations.

6.1.1 Gauge-Invariant Description of the Electroweak Particles

The remaining question is now if such a type of mapping between a gauge-invariant bound state operator and the elementary fields of the Lagrangian can be implemented for all fields of the electroweak sector. Before we discuss the FMS formulation of the full electroweak model, for the sake of simplicity we neglect the $U (1)$ hypercharge gauge group and the Yukawa couplings for a moment and focus on the non-Abelian $S U (2)$ part. To be specific, we consider the Lagrangian (5.19) in the limit $f_{ℓ} \to 0$ and put the Abelian gauge field $a$ to zero. Besides the local $S U (2)$ gauge structure given in the second line of Eq. (5.18), the model also obeys a less obvious, additional global $S U (2)_{R}$ symmetry. It solely acts on the scalar field but in a nonlinear way as it relates $φ$ with $\tilde{φ}$ ,

\begin{matrix} φ^{κ} = κ_{1} φ + κ_{2} \tilde{φ}, \quad {\tilde{φ}}^{κ} = - κ_{2}^{*} φ + κ_{1}^{*} \tilde{φ}, \\ \text{where} \quad κ_{1 / 2} \in C \quad \text{and} \quad | κ_{1} |^{2} + | κ_{2} |^{2} = 1 \end{matrix}

(6.3)

( $b^{κ} = b$ , $ψ_{L / R}^{κ} = ψ_{L / R}$ ). Note that this is a particularity of $S U (2)$ , as only for this group does the dual field of the fundamental scalar $φ$ transform under the fundamental representation as well. From the FMS perspective, we can now classify gauge-invariant bound state operators according to their transformation properties with respect to this global $S U (2)_{R}$ symmetry, namely as $S U (2)_{R}$ multiplets. Again, this is similar to pure QCD with $N_{f}$ fermion flavors where the physical spectrum is described in terms of $S U (3)$ -invariant meson, baryon, and more exotic bound states that form certain multiplets of the global flavor symmetry group. The additional global symmetry can be made more transparent by introducing a bidoublet, $Φ = (\begin{matrix} \tilde{φ} & φ \end{matrix})$ . The usual gauge transformations act on $Φ$ by multiplication from the left, $Φ^{β} = β Φ$ . The additional global (flavor-like) symmetry acts as multiplication from the right, $Φ^{κ} = Φ κ$ , $κ \in S U (2)_{R}$ . Note that this bidoublet is precisely what is used to construct the local dressing field $u = | | φ | |^{- 1} Φ$ . Further, we would like to emphasize that this global symmetry is broken via the BEH mechanism as well. Nonetheless, in case a gauge with nonvanishing VEV is chosen, a global diagonal subgroup of $S U (2) \times S U (2)_{R}$ remains such that the precise breaking pattern of the model reads $S U (2) \times S U (2)_{R} \to S U (2)_{diag}$ . This remaining symmetry also manifests itself in the elementary spectrum. For instance, the weak vector bosons receive the same mass term due to the BEH mechanism and transform as an $S U (2)_{diag}$ triplet after gauge fixing.Footnote ⁴⁸

In order to analyze the spectrum of the model, we characterize states according to their global quantum numbers. Therefore, we have an additional quantum number due to the global $S U (2)_{R}$ symmetry group. First, we consider the scalar channel, namely, operators that generate states that can be associated with scalar particles. The simplest gauge-invariant operators in this channel contain two elementary scalar fields. These can be combined such that we obtain two different irreducible $S U (2)_{R}$ multiplets, a singlet or a triplet. We have already discussed the scalar $S U (2)_{R}$ singlet $T r (Φ^{†} Φ) = φ^{†} φ + {\tilde{φ}}^{†} \tilde{φ} = 2 φ^{†} φ$ . If a gauge is chosen such that $φ$ acquires a nonvanishing VEV, this operator can be mapped on the elementary Higgs field as discussed previously. Technically, we can also construct a triplet state $T r (τ_{i_{R}} Φ^{†} Φ)$ where $τ_{i_{R}}$ denotes the generators of $S U (2)_{R}$ .Footnote ⁴⁹ However, this multiplet vanishes identically, $T r (τ_{i_{R}} Φ^{†} Φ) = 0$ , which becomes directly apparent in the $φ$ - $\tilde{φ}$ notation where the triplet is given by $(R e ({\tilde{φ}}^{†} φ), I m (φ^{†} \tilde{φ}), φ^{†} φ - {\tilde{φ}}^{†} \tilde{φ}) = (0, 0, 0)$ .
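The properties of the bidoublet used in this classification can be checked numerically. The following is a minimal sketch, assuming the convention $\tilde{φ} = i σ_{2} φ^{*}$ for the dual field and $τ_{i} = σ_{i} / 2$ for the generators (neither convention is fixed by the text above). The helper names `dual`, `bidoublet`, and `random_su2` are hypothetical and serve only this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pauli matrices; tau_i = sigma_i / 2 is an ASSUMED generator normalization.
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
tau = sigma / 2
i_sigma2 = np.array([[0, 1], [-1, 0]], dtype=complex)  # i * sigma_2


def dual(phi):
    """ASSUMED convention for the dual field: phi~ = i sigma_2 phi^*."""
    return i_sigma2 @ phi.conj()


def bidoublet(phi):
    """Phi = (phi~  phi): the two doublets are the columns of a 2x2 matrix."""
    return np.column_stack([dual(phi), phi])


def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    z = rng.normal(size=4)
    z /= np.linalg.norm(z)
    a, b = z[0] + 1j * z[1], z[2] + 1j * z[3]
    return np.array([[a, b], [-b.conjugate(), a.conjugate()]])


phi = rng.normal(size=2) + 1j * rng.normal(size=2)
Phi = bidoublet(phi)
gram = Phi.conj().T @ Phi  # Phi^dagger Phi

singlet = np.trace(gram).real                           # Tr(Phi^dagger Phi)
triplet = np.array([np.trace(t @ gram) for t in tau])   # Tr(tau_i Phi^dagger Phi)

# Left (gauge) action and right (global SU(2)_R) action.
beta = random_su2(rng)
kappa = random_su2(rng)
# Read off kappa_1, kappa_2 of Eq. (6.3) from the right-multiplication matrix.
kappa1, kappa2 = kappa[1, 1], kappa[0, 1]
phi_kappa = kappa1 * phi + kappa2 * dual(phi)

print("singlet - 2 phi^dag phi:", singlet - 2 * np.vdot(phi, phi).real)
print("triplet:", np.round(triplet, 12))
print("left action consistent: ", np.allclose(bidoublet(beta @ phi), beta @ Phi))
print("right action consistent:", np.allclose(bidoublet(phi_kappa), Phi @ kappa))
```

Under these assumptions $Φ^{†} Φ$ is proportional to the unit matrix, which is why the singlet equals $2 φ^{†} φ$ and every trace against a traceless generator vanishes.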

In the vector channel, we perform the same analysis. The global $S U (2)_{R}$ triplet, $T r (τ_{i_{R}} Φ^{†} D^{μ} Φ)$ , expands in leading order of the FMS mechanism to the triplet of massive $S U (2)_{diag}$ vector fields. Choosing $φ_{0} = (0, 1)^{T}$ , that is, $Φ = \frac{v}{\sqrt{2}} 1 + O (Δ φ)$ , we have

\begin{matrix} Tr (τ_{i_{R}} Φ^{†} D^{μ} Φ) = \frac{g v^{2}}{2} Tr (τ_{i_{R}} b^{μ}) + O (Δ φ) = \frac{g v^{2}}{4} δ_{i_{R} i_{D}} b_{i_{D}}^{μ} + O (Δ φ) . \end{matrix}

(6.4)

Therefore, we obtain a gauge-invariant vector operator transforming as an $S U (2)_{R}$ triplet that can be mapped on the triplet of massive elementary vector fields in a way similar to the $φ^{†} φ$ - $h$ mapping in the scalar sector. Of course, we are also able to construct a vector operator that transforms as an $S U (2)_{R}$ singlet, $T r (Φ^{†} D_{μ} Φ)$ . However, this operator does not provide a mapping on an elementary vector field, as the $O (Δ φ^{0})$ term vanishes due to the properties of the $S U (2)$ group. At $O (Δ φ^{1})$ we obtain a nontrivial term given by $\sqrt{2} v \partial_{μ} h$ . Thus, investigating the propagator of the vector singlet, we expect a pole at the mass of the elementary Higgs. Nevertheless, this pole does not give rise to a new vector particle with mass $m_{h}$ , as it appears only in the longitudinal part of the correlator. A proper particle interpretation requires not only the pole structure but also the correct Lorentz tensor structure. For a massive vector particle with spin 1, one would expect the structure $(g^{μ ν} - p^{μ} p^{ν} / m_{V}^{2}) / (p^{2} - m_{V}^{2})$ . However, we only obtain $p^{μ} p^{ν} / (p^{2} - m_{h}^{2})$ , which rather reflects the fact that a derivative acting on a scalar operator transforms as a vector and therefore mixes with operators in the vector channel. Therefore, we do not obtain a new vector particle from the gauge-invariant description, which thus remains consistent with the common perturbative treatment of the model.
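The difference between the two tensor structures can be made explicit with projectors. The following is a short worked sketch, assuming $p^{2} \neq 0$ and introducing the transverse and longitudinal projectors $P_{T}$ and $P_{L}$ , which are standard objects but are not defined in the text above.

```latex
% Transverse and longitudinal projectors (assumed notation), valid for p^2 \neq 0
P_{T}^{μ ν} = g^{μ ν} - \frac{p^{μ} p^{ν}}{p^{2}} , \qquad
P_{L}^{μ ν} = \frac{p^{μ} p^{ν}}{p^{2}}

% Singlet channel: the structure p^{μ} p^{ν} is purely longitudinal
P_{T}^{μ}{}_{ρ} \, p^{ρ} p^{ν} = p^{μ} p^{ν} - \frac{p^{μ} \, p^{2} \, p^{ν}}{p^{2}} = 0

% Massive spin-1 structure: decomposition of the numerator
g^{μ ν} - \frac{p^{μ} p^{ν}}{m_{V}^{2}} = P_{T}^{μ ν} + \left( 1 - \frac{p^{2}}{m_{V}^{2}} \right) P_{L}^{μ ν}

% Dividing by (p^{2} - m_{V}^{2}): the pole sits in the transverse part only
\frac{g^{μ ν} - p^{μ} p^{ν} / m_{V}^{2}}{p^{2} - m_{V}^{2}}
  = \frac{P_{T}^{μ ν}}{p^{2} - m_{V}^{2}} - \frac{P_{L}^{μ ν}}{m_{V}^{2}}
```

In this decomposition a massive spin-1 state has its pole in the transverse part, whereas the singlet correlator has no transverse part at all, so its pole at $m_{h}$ cannot be read as a vector particle.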

Furthermore, we have the fermionic sector of the model. Neglecting the hypercharge sector, the right-handed fermion fields are part of the physical spectrum, as they are already gauge-invariant; see Eq. (5.18).Footnote ⁵⁰ However, the left-handed flavors of quarks and leptons within one generation are actually weak gauge charges and thus unobservable due to their non-Abelian nature. Due to the global $S U (2)_{R}$ symmetry, we are able to construct $S U (2)$ gauge-invariant fermionic operators that are $S U (2)_{R}$ doublets. In leading order of the FMS expansion, these expand to the elementary left-handed fermionic fields,

\begin{matrix} Φ^{†} ψ_{L} = (\begin{matrix} {\tilde{φ}}^{†} ψ_{L} \\ φ^{†} ψ_{L} \end{matrix}) = \frac{v}{\sqrt{2}} (\begin{matrix} {\tilde{φ}}_{0}^{†} ψ_{L} \\ φ_{0}^{†} ψ_{L} \end{matrix}) + O (Δ φ) = \frac{v}{\sqrt{2}} (\begin{matrix} ν_{L} \\ ℓ_{L} \end{matrix}) + O (Δ φ) . \end{matrix}

(6.5)

Therefore, the different flavors of the left-handed components observed within one generation are actually not the weak gauge charges but rather the physically well-defined $S U (2)_{R}$ quantum numbers.
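The leading-order identification in Eq. (6.5) and the transformation behavior of $Φ^{†} ψ_{L}$ can be illustrated numerically. This is a minimal sketch under several assumptions: the dual field is taken as $\tilde{φ} = i σ_{2} φ^{*}$ , the components of $ψ_{L}$ are replaced by ordinary complex numbers (spinor and Grassmann structure are ignored), and $v = 246$ GeV is only an illustrative value. The helper names are hypothetical.

```python
import numpy as np

i_sigma2 = np.array([[0, 1], [-1, 0]], dtype=complex)  # i * sigma_2


def bidoublet(phi):
    """Phi = (phi~  phi) with the ASSUMED convention phi~ = i sigma_2 phi^*."""
    return np.column_stack([i_sigma2 @ phi.conj(), phi])


def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    z = rng.normal(size=4)
    z /= np.linalg.norm(z)
    a, b = z[0] + 1j * z[1], z[2] + 1j * z[3]
    return np.array([[a, b], [-b.conjugate(), a.conjugate()]])


rng = np.random.default_rng(1)
v = 246.0                                     # illustrative value in GeV
phi0 = np.array([0, 1], dtype=complex)        # VEV direction used in the text
psi_L = np.array([0.3 + 0.1j, -0.7 + 0.2j])   # stand-in numbers for (nu_L, l_L)

# Leading order of the FMS expansion: Phi -> (v / sqrt(2)) * (phi~_0  phi_0).
Phi_vev = v / np.sqrt(2) * bidoublet(phi0)
leading = Phi_vev.conj().T @ psi_L

# Full composite with an arbitrary fluctuation added to the VEV.
phi = v / np.sqrt(2) * phi0 + rng.normal(size=2) + 1j * rng.normal(size=2)
Phi = bidoublet(phi)
beta, kappa = random_su2(rng), random_su2(rng)

composite = Phi.conj().T @ psi_L
composite_gauge = (beta @ Phi).conj().T @ (beta @ psi_L)  # local SU(2) acts on both
composite_right = (Phi @ kappa).conj().T @ psi_L          # SU(2)_R acts on Phi only

print("leading order / (v/sqrt2):", leading / (v / np.sqrt(2)))
print("gauge invariant:", np.allclose(composite_gauge, composite))
print("SU(2)_R doublet:", np.allclose(composite_right, kappa.conj().T @ composite))
```

With these conventions the composite is unchanged by the gauge transformation and rotates with $κ^{†}$ under $S U (2)_{R}$ , which is the doublet behavior described above.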

So far, we have only discussed the spectrum within our reduced electroweak model, that is, neglecting Yukawa coupling and hypercharge contributions. Allowing for nonvanishing Yukawa and hypercharge couplings, we explicitly break the global $S U (2)_{R}$ symmetry. Therefore, the $S U (2)_{diag}$ symmetry of the gauge-fixed formulation is broken as well. Nonetheless, we are still able to investigate gauge-invariant operators that generate states of the aforementioned $S U (2)_{R}$ multiplets that map on the corresponding $S U (2)_{diag}$ multiplets. The only difference from the previous discussion is a splitting of the multiplet levels in the various quantum number channels due to the explicit breaking terms in the Lagrangian, which then results in the different observed mass terms for charged leptons and neutrinos or the mass splitting of the $W$ and $Z$ bosons. Furthermore, we have to incorporate a gauge-invariant treatment of the additional $U (1)$ gauge structure. As the hypercharge sector is an Abelian gauge theory, we might use common dressings via a Dirac phase factor as in QED. For more details see the end of Section 5.3 or Maas (2019).

6.1.2 Phenomenological Implications of the Fröhlich–Morchio–Strocchi Formulation

Besides nontrivial lattice checks of the FMS relation, perturbative investigations that include higher-order FMS terms also shed new light on the gauge-invariant definition of observables in a gauge theory with a BEH mechanism. Note that conventional investigations of, for example, the properties of the Higgs merely focus on the first term of the FMS expansion in Eq. (6.2). Although the on-shell properties of $h$ do not depend on the gauge-fixing parameter, the off-shell properties do. In order to examine this further, let us extract the spectral function $ρ_{h} (λ)$ of the Källén–Lehmann representation from the propagator $⟨ h (x) h (y) ⟩$ . In momentum space we have

\begin{matrix} ⟨ h (p) h (- p) ⟩ = \int_{0}^{\infty} d λ \frac{ρ_{h} (λ)}{p^{2} - λ} . \end{matrix}

(6.6)

We depict the spectral function for the elementary Higgs field computed via a one-loop approximation of the propagator in Fig. 6.1; see Maas and Sondenheimer (2020) for further details. The analysis was performed for different gauge-fixing parameters $ξ$ . The red dotted curve denotes the result for $ξ = 1$ (Feynman–’t Hooft gauge), the green dash-dotted curve represents the result for $ξ = 2$ , and the blue dashed line is $ξ = 10$ . First of all, we obtain a clear peak at $m_{h} = 125$ GeV independently of the chosen gauge. Also, the width of this peak is the same for all gauges. This is expected due to the Nielsen identities as the peak position (mass) and width (decay width) are determined by the pole of the propagator.

Figure 6.1 Spectral density of the elementary Higgs field for different values of the gauge-fixing parameter $ξ$ . We depict the spectral function for $ξ = 1$ (red dotted line), $ξ = 2$ (green dash-dotted line), and $ξ = 10$ (blue dashed line). Further, we depict the Higgs spectral function extracted from the pinch technique (purple line). The black solid line shows the spectral function of the gauge-invariant bound state $φ^{†} φ$ . The vertical gray dashed lines indicate the mass thresholds at $2 m_{W}$ , $2 m_{Z}$ , $2 m_{h}$ , and $2 m_{top}$ from left to right. For further details see Maas and Sondenheimer (2020).

Furthermore, we find several continuum thresholds that are associated with observed particle masses. These are indicated as vertical, thin, gray dashed lines. The first such line is the $2 m_{W}$ threshold starting at twice the mass of the $W$ boson. Going to higher energies, we also find the thresholds at $2 m_{Z}$ , where $m_{Z}$ is the mass of the $Z$ boson, $2 m_{h}$ , as well as $2 m_{top}$ . However, we also find unphysical thresholds that are not related to physical particles. These thresholds depend on the gauge-fixing parameter $ξ$ , as can easily be verified by varying $ξ$ . More precisely, these unphysical thresholds start at $2 \sqrt{ξ} m_{W}$ and $2 \sqrt{ξ} m_{Z}$ . Indeed, for $ξ = 10$ (blue curve), we have two additional spikes at $\approx 505$ GeV and $\approx 575$ GeV, while the additional thresholds appear at $\approx 226$ GeV and $\approx 257$ GeV for $ξ = 2$ (green line). For $ξ = 1$ (red line), we do not find additional structures in the spectral density, as the unphysical thresholds coincide with the physical $2 m_{W}$ and $2 m_{Z}$ thresholds, which leads to a nontrivial modification of the latter. At this point we would also like to emphasize that the spectral function becomes negative for some gauge conditions. This is also a clear hint that the elementary Higgs field $h$ cannot be identified with a physical observable, as the spectral function of such a quantity has to be nonnegative for a physical interpretation. Similar results can also be obtained for the Abelian Higgs model (Dudal, Peruzzo, & Sorella, 2021; Dudal et al., 2019, 2020; Dudal, van Egmond, et al., 2021).
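The positions of the gauge-dependent thresholds follow from simple arithmetic. The sketch below evaluates $2 \sqrt{ξ} m_{W}$ and $2 \sqrt{ξ} m_{Z}$ for the three gauges shown in Fig. 6.1. It assumes rounded masses $m_{W} \approx 80.4$ GeV and $m_{Z} \approx 91.2$ GeV, which are not necessarily the input values of the cited one-loop calculation, so the results agree with the quoted numbers only at the percent level. The function name `unphysical_thresholds` is hypothetical.

```python
import math

# ASSUMED rounded masses in GeV; the cited calculation may use different inputs.
M_W = 80.4
M_Z = 91.2


def unphysical_thresholds(xi, m_w=M_W, m_z=M_Z):
    """Return the xi-dependent thresholds (2*sqrt(xi)*m_W, 2*sqrt(xi)*m_Z) in GeV."""
    root = math.sqrt(xi)
    return 2 * root * m_w, 2 * root * m_z


for xi in (1, 2, 10):
    w_thr, z_thr = unphysical_thresholds(xi)
    print(f"xi = {xi:>2}: {w_thr:6.1f} GeV, {z_thr:6.1f} GeV")
```

For $ξ = 1$ the function returns exactly $2 m_{W}$ and $2 m_{Z}$ , which is why no additional structure appears in that gauge.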

Of course, the fact that the elementary Higgs propagator depends on $ξ$ has been known for a long time. However, the Higgs particle is unstable within the standard model and occurs only as an intermediate resonance in a physical process. When calculating physical $S$ -matrix elements, for example, scattering processes of stable particles, the gauge parameter dependence of the internal Higgs propagators is cancelled by propagator-like pieces from triangle and box diagrams (Papavassiliou and Pilaftsis, 1995, 1996a, 1996b). Taking these processes into account, a $ξ$ -independent definition of the propagator and thus of the spectral function can be introduced via the so-called pinch technique (Binosi & Papavassiliou, 2009; Papavassiliou & Pilaftsis, 1998). This pinch technique propagator is free of unphysical thresholds by definition. However, its spectral function still violates positivity, as can be seen in Fig. 6.1 (purple solid line).

By contrast, let us investigate the spectral function of the bound state operator $φ^{†} φ$ . For that, we include the other two terms of the FMS expansion in Eq. (6.2) on the same footing as the elementary Higgs propagator, that is, we perform a one-loop approximation as the simplest possible nontrivial approximation for these terms as well. The first important result of this calculation is given by the fact that all $ξ$ -dependent contributions to the leading-order term $⟨ h (x) h (y) ⟩$ get canceled by gauge-dependent contributions to the other two Green’s functions $⟨ h (x) (Δ φ^{†} Δ φ) (y) ⟩$ and $⟨ (Δ φ^{†} Δ φ) (x) (Δ φ^{†} Δ φ) (y) ⟩$ . Of course, this is not a surprise, as the sum of all terms on the right-hand side of the FMS expansion is gauge-invariant by construction. Thus, the unphysical thresholds are absent. The second important result is the positivity of the spectral function such that a physical particle interpretation is possible for the bound state (Maas & Sondenheimer, 2020).

Apart from these more advanced analyses for the Higgs boson, other interesting phenomenological implications have also been investigated in first exploratory studies. For instance, the potential influence on anomalous couplings and the size of the gauge-invariant $W$ -Higgs bound state have been studied in Maas, Raubitzek, & Törek (2019). Furthermore, the bound state formulation of observables in the electroweak sector also influences high-precision measurements of other sectors such as QCD. The necessity to describe hadrons as gauge-invariant objects not only with respect to the strong interaction but also with respect to the weak interaction once they are embedded in the larger standard model context implies that some of them contain additional scalar fields as constituents (Egger, Maas, & Sondenheimer, 2017). These effects can be addressed in a parton distribution function type of language (Fernbach et al., 2020). Also, predictions for potential future lepton colliders should be investigated in light of the FMS formulation, as off-shell properties of leptons will get altered, similar to the case of the Higgs boson. If some of these effects are not properly accounted for, they could easily be misinterpreted as signals for new physics while being only nontrivial effects of standard model physics.

6.2 The Fröhlich–Morchio–Strocchi Formulation for General Gauge Theories with a Brout–Englert–Higgs Mechanism

In previous sections, we discussed different strategies to construct a gauge-invariant formulation of the electroweak sector of the standard model. One of the central advantages of the FMS formulation is its direct generalization to arbitrary gauge groups with scalar fields in arbitrary representations. This is of particular importance, as recent lattice investigations also challenge the conventional interpretation of the spectrum of gauge theories with a BEH mechanism (Afferrante, Maas, & Törek, 2020a; Maas & Törek, 2017, 2018; Törek, Maas, & Sondenheimer, 2018). States that one would naively expect by the conventional analysis were not found by the nonperturbative lattice simulations of the models. This failure has far-reaching consequences for potential model building (Maas, Sondenheimer, & Törek, 2019; Sondenheimer, 2020). Using the FMS approach provides a coherent picture of currently observed phenomena of the lattice spectra.

The basic FMS ingredients, namely

(a) construct strict gauge-invariant operators with respect to the original gauge group and classify them according to the global symmetries of the model, and
(b) choose a gauge with nonvanishing VEV of the scalar field and investigate the FMS expansion,

can be used for any BEH model. In the following, let us consider a gauge theory with gauge group $H$ that breaks in the conventional treatment of the BEH mechanism to a subgroup $K \subset H$ . As discussed in detail in previous sections, this viewpoint has various philosophical and field theoretical inconsistencies. From a field theoretical perspective, the BEH mechanism should rather be considered as a duality relation between the spectra of an $H$ gauge theory and a $K$ gauge theory with specific field content (Sondenheimer, 2020). The FMS formalism reveals which of the potential states in both theories are related. This duality relation can be read in two ways. From a top-down perspective, the FMS mechanism shows which $H$ -invariant operators can be computed by potentially simpler objects in a $K$ gauge theory. From a bottom-up perspective, the FMS formalism explains which states of a $K$ gauge theory can be embedded into the spectrum of an $H$ gauge theory.

At first sight, one may be tempted to conclude that the FMS strategy provides a gauge-invariant description of all quantities that are usually considered from the perspective of gauge symmetry breaking, similar to the standard model case. As a simple example, consider the elementary scalar field that is proportional to the direction of the VEV and thus always transforms as a singlet with respect to the unbroken remaining gauge group $K$ of the BEH mechanism. We always find a strict $H$ -invariant operator that has precisely this particular gauge-dependent field as the nontrivial leading-order term of the FMS expansion. We always have $(ϕ^{a})^{*} ϕ^{a} = v h + \dots$ where $h$ is the elementary $K$ singlet, $ϕ$ a scalar field in an arbitrary representation of the gauge group $H$ whose potential has nontrivial minima, and $a$ is a multi-index characterizing the representation. Similar constructions of $H$ -invariant operators can also be done for all those elementary fields that transform as $K$ singlets. This has been confirmed in all models that have been investigated in lattice calculations so far (Afferrante, Maas, & Törek, 2020a, 2020b; Maas & Törek, 2017, 2018; Törek et al., 2018).

For instance, consider an $H = S U (3)$ gauge theory with a fundamental scalar field. Any nontrivial minimum of the potential has a $K = S U (2)$ subgroup as stabilizer. Therefore, we would expect a breaking $S U (3) \to S U (2)$ due to the BEH mechanism from the conventional perspective. On the level of the particle spectrum this translates into a formulation of $S U (2)$ -invariant objects instead of strict $S U (3)$ -invariant composite bound states. The constituents of the former can be extracted from the $S U (3)$ gauge and scalar field and arranged in multiplets of the remaining $S U (2)$ group. For the considered example, we can decompose the $S U (3)$ gauge field into three different $S U (2)$ multiplets. Five gauge bosons acquire a nonvanishing mass term. These can be subdivided into a field $A_{s}$ that transforms as a singlet with respect to the remaining $S U (2)$ gauge transformations while the other four components form a fundamental multiplet $A_{f}$ . The remaining three (massless) gauge bosons, which we denote by $A_{a}$ , form the pure Yang–Mills sector of the $S U (2)$ gauge theory. As the elementary $A_{s}$ is already invariant with respect to the non-Abelian $S U (2)$ gauge group, it belongs to the gauge-invariant spectrum of the $S U (2)$ gauge theory. As with the elementary Higgs field $h$ , we can also construct an $S U (3)$ -invariant vector operator that precisely maps on this particular vector field, $ϕ^{†} D^{μ} ϕ \sim A_{s}^{μ} + O (φ / v)$ , a fact that has been confirmed via lattice investigations (Maas & Törek, 2017, 2018).
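The counting of massive and massless gauge bosons in this example can be reproduced from the tree-level mass matrix of the conventional analysis. The following is a minimal sketch, assuming generators $T_{a} = λ_{a} / 2$ built from the Gell-Mann matrices, the VEV direction $ϕ_{0} = (0, 0, 1)^{T}$ , and a mass matrix proportional to $R e [(T_{a} ϕ_{0})^{†} (T_{b} ϕ_{0})]$ with the overall factor of $g^{2} v^{2}$ dropped. These normalizations and the helper name `gell_mann` are assumptions for illustration.

```python
import numpy as np


def gell_mann():
    """Return the eight Gell-Mann matrices as an array of shape (8, 3, 3)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0][0, 1] = lam[0][1, 0] = 1
    lam[1][0, 1], lam[1][1, 0] = -1j, 1j
    lam[2][0, 0], lam[2][1, 1] = 1, -1
    lam[3][0, 2] = lam[3][2, 0] = 1
    lam[4][0, 2], lam[4][2, 0] = -1j, 1j
    lam[5][1, 2] = lam[5][2, 1] = 1
    lam[6][1, 2], lam[6][2, 1] = -1j, 1j
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam


T = gell_mann() / 2                      # ASSUMED normalization T_a = lambda_a / 2
phi0 = np.array([0, 0, 1], dtype=complex)  # ASSUMED VEV direction

# Rows are the vectors T_a phi0; a generator with T_a phi0 = 0 is unbroken.
vecs = T @ phi0
unbroken = [a for a in range(8) if np.allclose(vecs[a], 0)]

# Mass matrix in units of g^2 v^2, up to a convention-dependent factor.
mass_matrix = (vecs.conj() @ vecs.T).real
eigenvalues = np.linalg.eigvalsh(mass_matrix)  # ascending order

n_massless = int(np.sum(np.isclose(eigenvalues, 0)))
n_massive = 8 - n_massless

print("unbroken generator indices:", unbroken)
print("massless / massive:", n_massless, "/", n_massive)
print("nonzero eigenvalues:", np.round(eigenvalues[n_massless:], 6))
```

Under these assumptions three directions remain massless (the $S U (2)$ Yang–Mills sector $A_{a}$ ), four massive directions are degenerate (the components of $A_{f}$ ), and one massive direction has a different eigenvalue (the singlet $A_{s}$ ).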

By contrast, there is no $S U (3)$ -invariant operator that maps on any other elementary vector field. This is not a surprise. In general, we start our investigation with a strict $H$ -invariant operator. All terms that can be extracted from such an object have to be invariant with respect to the remaining $K$ gauge transformations by construction such that we can only obtain $K$ singlets but we do not obtain a component of a $K$ charged multiplet ( $A_{f}$ or $A_{a}$ for our $H = S U (3)$ example). Of course, this does not imply that an object on the right-hand side cannot contain nontrivial $K$ multiplets. $K$ -invariant combinations of $K$ multiplets can be extracted from $H$ -invariant composite operators. For our current $S U (3)$ example, the $S U (3)$ -invariant glueball operator $T r (F^{2})$ can be decomposed into an $S U (2)$ glueball $T r (F_{a}^{2})$ with $F_{a} = d A_{a} + 1 / 2 [A_{a}, A_{a}]$ , an $S U (2)$ bound state formed by the fundamental vector fields $A_{f}^{†} A_{f}$ , as well as several other $S U (2)$ -invariant combinations of the elementary $S U (2)$ multiplets. Although we can extract these $S U (2)$ -invariant states from a strict $S U (3)$ -invariant operator in a gauge-fixed setup by decomposing the multiplets, the associated states have not been found on the lattice yet. Why this is the case is currently under investigation and an open problem. So far, we can identify two differences that distinguish, for instance, a $K$ glueball operator from the elementary Higgs field or the elementary singlet vector field for the $S U (2)$ case.

First, the latter operators not only can be obtained from $S U (3)$ -invariant operators via the standard multiplet decomposition but also appear in a unique way at nontrivial leading order of the FMS expansion of some $S U (3)$ -invariant operators. By contrast, no such mapping of an $H$ -invariant operator on a $K$ -invariant glueball operator exists via the split $ϕ = \frac{v}{\sqrt{2}} ϕ_{0} + Δ ϕ$ , as the constituents of the $K$ glueball operator are in a subspace orthogonal to the direction of the VEV $ϕ_{0}$ . Second, the $K$ glueball as well as various other operators are composites of elementary $K$ multiplets and form nontrivial bound states already from the perspective of the $K$ gauge theory. Whether the BEH duality extends to these objects needs further detailed investigation. In particular, it is open whether the FMS mappings can explain the spectra on purely group-theoretical grounds or whether dynamical effects of bound state formation, from either the $H$ or the $K$ perspective, play an important role.

The fact that only $K$ -invariant operators can be extracted from $H$ -invariant ones has far-reaching implications for model building beyond the standard model. As a simple toy model, let us consider an $S U (2)$ gauge theory with a scalar field in the adjoint representation. Performing the conventional analysis, the breaking pattern reads $S U (2) \to U (1)$ and the particle spectrum consists of a scalar particle described by an elementary scalar field that is a $U (1)$ singlet, a massive vector boson that is charged with respect to the remaining $U (1)$ symmetry and its corresponding antiparticle with opposite $U (1)$ charge, as well as a massless gauge boson being the force carrier of the $U (1)$ gauge group. Thus, a variety of potential states can be described by elementary fields from the conventional perspective of gauge symmetry breaking. From the FMS perspective, we have to construct $S U (2)$ -invariant states and investigate their FMS expansions. Indeed, it is straightforward to find strict $S U (2)$ -invariant operators that map on the elementary scalar boson as well as on the massless vector particle (Maas, Sondenheimer, and Törek, 2019). However, no $S U (2)$ -invariant operator exists that maps on an operator generating a $U (1)$ charged state (Sondenheimer, 2020), which is in accordance with lattice investigations (Afferrante et al., 2020a, 2020b; Lee & Shigemitsu, 1986). Again, we can only extract a $K$ $(= U (1))$ -invariant operator from any $H$ -invariant operator. Thus, it is not possible to embed the $U (1)$ charged states from the perspective of the $U (1)$ gauge theory into the spectrum of the $S U (2)$ gauge theory. This is a generic problem for any physical theory beyond the standard model (BSM) that tries to embed the $U (1)$ gauge group of the standard model into a larger gauge symmetry.
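The conventional spectrum of the adjoint toy model quoted above can be reproduced with a few lines of linear algebra. This is a minimal sketch, assuming adjoint generators $(T_{a})_{b c} = - i ε_{a b c}$ , a VEV along the third direction, and a mass matrix proportional to $R e [(T_{a} ϕ_{0})^{†} (T_{b} ϕ_{0})]$ with the overall coupling factor dropped. It only illustrates the gauge-fixed counting and says nothing about which of these states have gauge-invariant counterparts.

```python
import numpy as np

# Totally antisymmetric tensor eps_{abc}.
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c] = 1.0
    eps[a, c, b] = -1.0

T = -1j * eps                       # ASSUMED adjoint generators (T_a)_{bc} = -i eps_{abc}
phi0 = np.array([0.0, 0.0, 1.0])    # ASSUMED VEV direction in the adjoint space

# Rows are T_a phi0; the generator that annihilates phi0 spans the unbroken U(1).
vecs = T @ phi0
unbroken = [a for a in range(3) if np.allclose(vecs[a], 0)]

# Mass matrix up to a convention-dependent overall factor.
mass_matrix = (vecs.conj() @ vecs.T).real
masses = np.linalg.eigvalsh(mass_matrix)  # ascending order

# U(1) charges of the gauge fields: eigenvalues of the unbroken generator
# acting in the adjoint representation, restricted to the massive directions.
massive = [a for a in range(3) if a not in unbroken]
charge_block = T[unbroken[0]][np.ix_(massive, massive)]
charges = np.linalg.eigvalsh(charge_block)

print("unbroken generator index:", unbroken)
print("mass-matrix eigenvalues:", np.round(masses, 6))
print("U(1) charges of massive vectors:", np.round(charges, 6))
```

Under these assumptions one gauge boson stays massless and neutral, while the two massive directions combine into states of opposite $U (1)$ charge, matching the conventional particle content listed above.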

From that perspective, the standard model electroweak gauge group is special. First, it explicitly contains a $U (1)$ (hypercharge) group whose properties translate into the properties of the remaining $U (1)$ (electromagnetism) gauge group via the BEH mechanism. Second, the non-Abelian $S U (2)$ weak gauge sector has a global counterpart, which is also described by an $S U (2)$ structure that purely acts on the scalar fields. Therefore, a sufficiently large number of $S U (2)$ gauge-invariant operators can be constructed and classified according to global $S U (2)_{R}$ multiplets. For these reasons, we are able to construct the spectrum in a strict gauge-invariant way, and the FMS mapping provides a convenient description of it in terms of the conventional analysis via the elementary fields of the gauge-fixed Lagrangian. Similar constructions can be done for BSM models that fulfill the same requirements as the standard model, for example two-Higgs-doublet models (Maas and Pedro, 2016) as well as general $N$ -Higgs-doublet models, such that these models provide reliable BSM models that pass all FMS constraints.

7 Critical Assessment, Reflections, and Challenges

The success of gauge theories in particle physics opened the door to optimism concerning unification in physics based on the concept of symmetry (Yang, 1980). Understanding gauge symmetries as descriptive redundancies seems adequate in the light of reasonable conceptual desiderata such as determinism, parsimony of the posited unobservable ontology, and elimination of superfluous structure (Section 2). Yet, the fundamental significance of this success seems to be challenged by the apparent indispensability of gauge fixing and spontaneous symmetry breaking. In the context of the BEH mechanism, the very possibility of providing a gauge-invariant account (presented already in the unitary-gauge-fixed formulations by Higgs, 1966 and Kibble, 1967) can appear as providing a viewpoint benefiting from the best of both worlds (Struyve, 2011), reconciling gauge invariance and accounting for massive vector bosons at the same time. However, the foundational importance of such attempts remains questionable as long as they stand as mere reformulations of existing theories, achieving certain theoretical virtues at the price of sacrificing others. While gauge invariance may further resolve some technical issues that arise in the context of lattice theories (Section 3), it may seem far from being clear whether these advantages can compete with those of the established framework of spontaneous symmetry breaking.

The DFM (Section 5) and the FMS approach (Section 6) may each suggest that the gauge-invariant approaches can, in fact, open the door to a wider heuristic and conceptual framework. Both of these methods identify gauge-invariant field variables, thus achieving reduction of gauge symmetries without compromising on the theoretical virtues that motivate gauge invariance (see Section 4). Applied to the electroweak model, they converge on the conclusion that the spontaneous breaking of gauge symmetry is not a physical phenomenon in this case, and furthermore, at the classical level the results they provide coincide (Maas, 2019), giving rise to a local gauge-invariant description of the massive gauge bosons that renders the $S U (2)$ symmetry an artificial one. However, while neither the DFM nor the FMS is yet in a state of full maturity, we can already point out that despite the aforementioned similarities the two approaches seem to entail different research programs that face different challenges.

For example, the question of quantizing a dressed theory, with its invariant field variables, is still a programmatic endeavor. The FMS approach is more developed in that regard, as it treats invariant fields as composites of gauge-variant fields and borrows techniques from QCD, and it already shows great promise as it appears to mitigate problems appearing in the standard formulation based on SSB (as detailed at the end of Section 6).

If future research confirms that such local invariant formulations of the electroweak model have a theoretical edge over the usual approach (gauge fixing and SSB), then a web of interconnected questions arises: If $S U (2)$ is indeed artificial, and given that the gauge principle applied to the sole $U (1)$ substantial gauge symmetry is not enough to explain the structure of the model, presumably this reopens the question of its conceptual and theoretical foundations – or at least provides a new angle to reassess those foundations. This question itself is then nested within that of the underlying principle(s) explaining the structure of the full Standard Model, whose substantial gauge symmetry group would then be $U (1) \times S U (3)$ . A more fundamental theory giving the SM in the effective low-energy regime should then explain why it presents this mix of substantial and artificial symmetries. Following that thread could be another avenue toward physics beyond the SM.

Constraining the formalism to gauge-invariant field variables might come at the price of increased complexity and/or loss of manifest locality. This shows that while gauge symmetries are convenient, they are not always necessary in order to formulate the relevant physical theories. We conclude with some reflections on implications of these findings, and a (possible) future role of gauge-invariant approaches in physical practice.

Concerning physical practice, there is an overwhelming agreement among physicists that gauge-dependent quantities are not empirical. On the other hand, as far as day-to-day physical practice is concerned, this statement is often applied only within the narrow window of perturbative or effective treatments. Only a few, mainly mathematical, physicists have consistently pointed out that the perturbative approach is viable only for very specific theories. In matters of practice, only within the community of lattice theoreticians has nonperturbative gauge invariance become, by necessity, mandatory. Nonetheless, the subtleties in relation to the perturbative treatment, which were emphasized especially by Fröhlich, Morchio, and Strocchi, have not been widely appreciated (see Maas, 2019 for an overview of the developments). Assuming that, in accordance with the recommendations presented here, formulating theories in gauge-invariant ways from the start becomes accepted as a methodological guideline in the future, this leaves us with a number of puzzling insights and challenges.

On the one hand, it appears that many, or perhaps even all gauge theories can be reformulated as theories without gauge symmetries. However, these reformulations come with nontrivial features, like nonlocal contributions, non–power-countable Lagrangians, involved target spaces, or an infinite number of fields. Moreover, provided that dualities between different gauge theories hold, there could be multiple different gauge theories associated with the same set of gauge-invariant quantities.

On the other hand, peaceful coexistence between theoretical practice and gauge symmetries is definitely possible, as long as we maintain a commitment to express observable quantities in terms of gauge-invariant observables. That is, even if gauge symmetries are still a part of the theoretical framework, we assign physical relevance only to quantities that are gauge-independent; this should be contrasted with eliminative approaches that are strictly formulated using gauge-invariant variables. Though, as the example of QCD shows, if this commitment to gauge-invariant observable quantities is manifested by, for example, lattice QCD, this resolution may require only marginally less effort than eliminating the gauge symmetry altogether, as manifested by, for example, a reformulation of QCD in terms of Wilson lines. The enormous amount of computing time and person-years in development of algorithms for lattice QCD needs to be compared to the conceptual and technical complications of a reformulation of QCD in terms of Wilson lines.

But a number of conceptual challenges emerge in the eliminative approach, from the physics point of view: Can indeed every gauge theory be written in terms of a formalism without gauge-dependent quantities and hence without gauge symmetries? Do there exist gauge theories whose nongauge version is genuinely local, without the gauge symmetry being trivial? Is any such theory relevant to experiment? Do theories that are dual to experimentally relevant ones exist, which have different gauge symmetries? Answering these questions would tell us a lot about to what extent gauge symmetries are uniquely tied to the observables.

Even if the ease of use implies that gauge symmetry will find continued employment in actual calculations, consistently adopting the stance that gauge symmetries are conceptually redundant would have far-reaching implications. For this would imply that, as they are represented in the Lagrangian, each and every elementary particle in the current standard model of particle physicsFootnote ⁵¹ is not physical, as the fields corresponding to these particles are all gauge-dependent! The only physical degrees of freedom would be those that correspond to hadrons, the electroweak objects of the FMS approach, and photon-cloud dressed QED states. The conventional notions of quarks, electrons, and other “elementary” particles would need to be regarded as mere auxiliaries that are technically useful but do not have any physical reality. Given the role these objects play even at the level of school textbooks, this would be a fundamental shift in what is widely and popularly regarded as the furniture of reality and the basic building blocks of nature.Footnote ⁵² Thus, eliminating gauge-dependent objects as physical objects might well be the most consequential change in the way in which we portray nature since the advent of quantum field theory.

However, this leaves one stark observation: Every experimentally relevant theory can be written either in a local form using gauge symmetries or in a nonlocal form without gauge symmetries. This raises several questions: If we wanted to preserve locality, is the preservation of gauge symmetries our only option? Is a description of experiments without gauge symmetry only possible nonlocally? As such, are we guided correctly in assuming that the gauge principle is essential in more fundamental theories? Or does this unnecessarily narrow our perspective? Even when thinking about approaches like loop quantum gravity, gauge-invariant variables originally derive from a local formulation. Could and should a general nonlocal (or nongaugeable) approach be searched for? Without experimental guidance, this appears challenging at least. So, as a more pragmatic benchmark, we can ask: does eliminating gauge-dependent objects as an element of reality create progress?

These questions need to be answered. And it needs to be understood whether abandoning our current view in terms of quarks, electrons, and point-like particles in general is necessary, or at the very least advantageous.

As a concluding comment, we discussed here primarily the situation of ordinary relativistic quantum field theories in flat space-time. But the problem extends beyond those. Most notably, similar problems arise in (quantum) gravity theories, which can be considered to be gauge theories of translations and, in the presence of torsion, Lorentz symmetry (Hehl et al., 1995; Hehl et al., 1976). This does not even touch upon the possibilities in more extensive settings, for example string theory.

The problems encountered with gauge symmetries are then amplified in gauge theories of gravity, as the space-time structure itself, including the notion of time, becomes gauge-dependent. Likewise, similar approaches have been advocated to eliminate the problem of gauge dependence. Most notably, loop quantum gravity (Ashtekar & Singh, 2011) seeks an alternative quantization procedure by quantizing manifestly gauge-invariant quantities. More particle-physics-like approaches are also discussed, where the quantum theory remains a gauge theory. This leads to ideas similar to the dressing-field method (see e.g. Donnelly & Giddings, 2016; Giddings & Weinberg, 2019) or the FMS approach (Maas, 2020). However, the concept of locality in particular becomes far more involved. In this context, the concept of local observables is far less developed, and important questions, for example the role of affine parameters instead of space-time coordinates, are far from understood. This issue has also been observed in the philosophy of physics (Healey, 2007; Lyre, 2004). However, without a clear understanding of the role of gauge symmetries in particle physics, a full clarification in the quantum gravity setting appears unlikely.

Appendix A The Basics of Gauge Field Theory

We present here the formalization of gauge theories in the language of differential forms on space-time, which is very common in the physics literature. This material is fairly elementary, yet we can only present it rather than fully explain it. But we have made an effort to ensure that it is, if not self-contained, at least as logically developed as possible. The reader who feels the need to fill the gaps or achieve a deeper understanding may consult the pedagogical review (François, 2021a) or more complete treatments, for example the first chapter of Bertlmann (1996) or the book by Hamilton (2017). Our aim is for a motivated nonexpert to grasp the key technical and conceptual notions.

A1 The Field Space

A gauge field theory describes the dynamics and interactions of a set of fields $Φ = \{A, ϕ\}$ on an $n$ -dimensional space-time manifold $M$ based on a Lie symmetry group $G$ . Here $A$ is a 1-form on $M$ with values in the Lie algebra $\mathfrak{g}$ of $G$ – we denote $A \in Ω^{1} (M, \mathfrak{g})$ – which represents the gauge potential, and whose field strength is the 2-form $F := d A + \frac{1}{2} [A, A] \in Ω^{2} (M, \mathfrak{g})$ .Footnote ¹ Given a basis $\{e_{a}\}$ of $\mathfrak{g}$ , $a \in \{1, \dots, \dim \mathfrak{g}\}$ , we have $A = A^{a} e_{a}$ and $F = F^{a} e_{a}$ , with $A^{a}$ and $F^{a}$ scalar-valued 1- and 2-forms on $M$ . On any open $U \subset M$ with chosen coordinates $\{x^{μ}\}$ , $μ \in \{1, \dots, n\}$ , we have the coordinate representation $A = A_{μ} d x^{μ} = {A^{a}}_{μ} e_{a} \otimes d x^{μ}$ and $F = \frac{1}{2} F_{μ ν} d x^{μ} \land d x^{ν} = \frac{1}{2} F_{μ ν}^{a} e_{a} \otimes d x^{μ} \land d x^{ν}$ . The components ${A^{a}}_{μ} = {A^{a}}_{μ} (x)$ and $F_{μ ν}^{a} = F_{μ ν}^{a} (x)$ are the “physical” fields. For each $a$ , the components $F_{μ ν}^{a}$ of the field-strength 2-form form an antisymmetric covariant tensor on $M$ , expressed in terms of the potential components as

\begin{matrix} F_{μ ν}^{a} = \partial_{[μ} A_{ν]}^{a} + [A_{μ}, A_{ν}]^{a} = \partial_{μ} A_{ν}^{a} - \partial_{ν} A_{μ}^{a} + {f^{a}}_{b c} A_{μ}^{b} A_{ν}^{c}, \end{matrix}

(A-1)

where $[e_{b}, e_{c}] = {f^{a}}_{b c} e_{a}$ defines the structure constants of $\mathfrak{g}$ . This is the well-known expression of the Yang–Mills field strength (and of the Riemann tensor if $A$ is the spin connection of GR). Actually, in this expression we set an important physical parameter to unity: the coupling constant $g$ that should appear in the quadratic piece of the field strength, $g {f^{a}}_{b c} A_{μ}^{b} A_{ν}^{c}$ , indicating the strength of the self-coupling of the gauge potential. We will keep it at unity (except once) to maintain focus on the mathematical and geometrical nature of the fields involved.

The symbol $ϕ$ denotes a field (or a collection thereof) valued in some representation space(s) $V$ for $G$ : these represent various matter fields – and the Higgs field. This we denote $ϕ : M \to V$ , $x \mapsto ϕ (x)$ . There is a group morphism $ρ : G \to GL (V)$ , with corresponding Lie algebra morphism $ρ_{*} : \mathfrak{g} \to \mathfrak{gl} (V)$ , giving the action of $G$ and $\mathfrak{g}$ on $V$ , and therefore on $ϕ$ . For all practical purposes, $V$ will often be a linear (vector) space, either real or complex, supporting an action of the defining matrix representation of $G$ . So we may omit writing the representation explicitly in concrete component notation. Given a basis $\{b_{i}\}$ for $V$ , we have $ϕ = ϕ^{i} b_{i}$ , with $\{ϕ^{i}\}$ a collection of $\mathbb{K}$ -valued scalar fields ( $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$ ), which are just the component representation of $ϕ$ .

To be more precise, matter fields are fermions (with half-integer spin) represented by (Dirac) spinor fields valued in a $\mathbb{C}$ -representation space $S$ for the $Spin (1, n - 1)$ group of $M$ , which is the double cover of its (local) Lorentz group $SO (1, n - 1)$ . We denote $ψ : M \to S$ , $x \mapsto ψ (x)$ . On a basis $\{ε_{α}\}$ of $S$ , a spinor field decomposes as $ψ = ψ^{α} ε_{α}$ – with $α$ the spinor index “supporting” the action of the Lorentz group and Lie algebra. A gauge matter field is thus $ψ : M \to S \otimes V$ with components $ψ^{α, i}$ . This means that each $ψ^{i}$ is a Dirac spinor (rather than a $\mathbb{K}$ -scalar field), or that each component $ψ^{α}$ supports the action of $G$ . This slight complication due to the spinorial nature of matter fields will not be essential to the remainder of our presentation. So we will continue to commit the slight abuse of calling $ϕ$ a matter field, with the understanding that, unless stated otherwise, the spinor structure does not interfere with what is under discussion.

The interaction between the gauge potential and the matter fields is formalized by the (gauge) covariant derivative, $D ϕ := d ϕ + ρ_{*} (A) ϕ$ , which implements their minimal coupling via the term $ρ_{*} (A) ϕ$ .Footnote ² Here again we have set the coupling constant to unity: reestablished in the covariant derivative, $D ϕ := d ϕ + g ρ_{*} (A) ϕ$ , it indicates the strength of the minimal coupling between $A$ and $ϕ$ . In components, given the mapping of Lie algebra generators $ρ_{*} (e_{a}) = {(e_{a})^{i}}_{j}$ – where the latter are the matrices representing the generators in $\mathfrak{gl} (V)$ – and writing ${A^{i}}_{j, μ} := {A^{a}}_{μ} {(e_{a})^{i}}_{j}$ , the interaction terms are $g {A^{i}}_{j, μ} ϕ^{j}$ . We may notice that, contrary to the exterior derivative, which satisfies $d^{2} = 0$ , the covariant derivative satisfies $D^{2} = ρ_{*} (F)$ .
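
The identity $D^{2} = ρ_{*} (F)$ can be checked directly from the definitions above. The following is a short sketch, assuming only that $ρ_{*}$ is a Lie algebra morphism and that $d$ obeys the graded Leibniz rule:

\begin{matrix} D (D ϕ) & = d (d ϕ + ρ_{*} (A) ϕ) + ρ_{*} (A) \land (d ϕ + ρ_{*} (A) ϕ) \\ & = ρ_{*} (d A) ϕ - ρ_{*} (A) \land d ϕ + ρ_{*} (A) \land d ϕ + ρ_{*} (A) \land ρ_{*} (A) ϕ \\ & = ρ_{*} (d A + \frac{1}{2} [A, A]) ϕ = ρ_{*} (F) ϕ . \end{matrix}

The last step uses $ρ_{*} (A) \land ρ_{*} (A) = \frac{1}{2} [ρ_{*} (A), ρ_{*} (A)] = \frac{1}{2} ρ_{*} ([A, A])$ , which holds for any Lie algebra-valued 1-form.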

Remark:

The field strength is a $\mathfrak{g}$ -valued field whose target space supports the action of $g \in G$ by the adjoint representation, $ρ (g) X = Ad (g) X := g X g^{- 1}$ , and thus an action of $ℓ \in \mathfrak{g}$ via $ρ_{*} (ℓ) X = ad (ℓ) X := [ℓ, X]$ . So, the covariant derivative acts on it as $D F = d F + [A, F]$ . Furthermore, given its definition in terms of $A$ , it identically satisfies the Bianchi identity, $D F \equiv 0$ , which is thus a kinematical field equation – as opposed to dynamical field equations stemming from a choice of Lagrangian (see the following discussion). In the simple case of an Abelian gauge theory the bracket is trivial, so this is $d F \equiv 0$ : in electromagnetism (EM) this encodes the sourceless Maxwell equations, which are then kinematical/nondynamical.

A2 The Gauge Group

The preceding description is not enough to characterize the field space: its mathematical description is complete only once we have specified how the gauge group of the theory acts upon each field variable.

The gauge group $\mathcal{G}$ is defined as the set of $G$ -valued functions $γ : M \to G$ with pointwise group multiplication $(γ γ^{'}) (x) = γ (x) γ^{'} (x)$ – as such it is an infinite-dimensional group – whose elements are moreover defined to act on (transform) one another: given $η \in \mathcal{G}$ , any other $γ \in \mathcal{G}$ acts on $η$ by group conjugation, $η \mapsto γ^{- 1} η γ =: η^{γ}$ . The right-hand side of the equality is just a notation defined by the left-hand side and signifies the action of $γ \in \mathcal{G}$ on $η$ seen as a “field” on $M$ . The gauge group is thus

\begin{matrix} \mathcal{G} := \{ γ, η : M \to G \mid η^{γ} = γ^{- 1} η γ \} . \end{matrix}

(A-2)

By definition of the gauge potential and matter fields, $\mathcal{G}$ acts on them as

\begin{matrix} A \mapsto A^{γ} := γ^{- 1} A γ + γ^{- 1} d γ \quad \text{and} \quad ϕ \mapsto ϕ^{γ} := ρ (γ)^{- 1} ϕ . \end{matrix}

(A-3)

These are the gauge transformations of the gauge potential and matter fields. Given the definition of the field strength in terms of $A$ , it gauge transforms as $F \mapsto F^{γ} = γ^{- 1} F γ$ . Furthermore, given the definition of the covariant derivative of $ϕ$ , it gauge transforms as $D ϕ \mapsto (D ϕ)^{γ} := d ϕ^{γ} + ρ_{*} (A^{γ}) ϕ^{γ} = ρ (γ)^{- 1} D ϕ$ , namely it transforms in the same way as the field $ϕ$ itself. Hence the name “covariant derivative” for $D$ : it preserves the covariance of $ϕ$ . Thus, $ϕ$ , $D ϕ$ , and $F$ are gauge-covariant fields – gauge tensors, so to speak – while $A$ is not, given its inhomogeneous transformation law.Footnote ³ We remark, however, that the difference of two gauge potentials $B := A - A^{'}$ , given (A-3), transforms as $B \mapsto B^{γ} = γ^{- 1} B γ$ and is thus a gauge-covariant 1-form. This shows that the space of gauge potentials is an affine space modeled on the vector space of $\mathfrak{g}$ -valued ( $Ad$ -) covariant 1-forms.Footnote ⁴
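
The covariance of $D ϕ$ stated above follows in a few lines from (A-3). As a sketch, using the standard properties $ρ_{*} (γ^{- 1} A γ) = ρ (γ)^{- 1} ρ_{*} (A) ρ (γ)$ and $ρ_{*} (γ^{- 1} d γ) = ρ (γ)^{- 1} d ρ (γ)$ of a representation and its differential:

\begin{matrix} (D ϕ)^{γ} & = d (ρ (γ)^{- 1} ϕ) + ρ_{*} (γ^{- 1} A γ + γ^{- 1} d γ) ρ (γ)^{- 1} ϕ \\ & = d (ρ (γ)^{- 1}) ϕ + ρ (γ)^{- 1} d ϕ + ρ (γ)^{- 1} ρ_{*} (A) ϕ + ρ (γ)^{- 1} d ρ (γ) ρ (γ)^{- 1} ϕ \\ & = ρ (γ)^{- 1} (d ϕ + ρ_{*} (A) ϕ) = ρ (γ)^{- 1} D ϕ . \end{matrix}

The first and last terms of the second line cancel because $d (ρ (γ)^{- 1}) = - ρ (γ)^{- 1} d ρ (γ) ρ (γ)^{- 1}$ ; the inhomogeneous term in $A^{γ}$ is exactly what compensates the derivative of $ρ (γ)^{- 1}$ .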

The action of the Lie algebra of the gauge group, Lie $\mathcal{G}$ , provides the infinitesimal gauge transformations. For $χ \in$ Lie $\mathcal{G}$ we have

\begin{matrix} δ_{χ} A = d χ + [A, χ], \quad δ_{χ} ϕ = - ρ_{*} (χ) ϕ, \quad \text{and} \quad δ_{χ} F = [F, χ], \end{matrix}

(A-4)

which are just the linearization of (A-3) given $γ = e^{χ}$ . Notice that one may write $δ_{χ} A = D χ$ , as the infinitesimal gauge parameter is seen as a field $χ : M \to \mathfrak{g}$ supporting the adjoint action $ρ_{*} = ad$ . The infinitesimal gauge variation of the potential is thus a gauge-covariant tensor, in keeping with the affine structure of the space of gauge potentials just noted.
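
To see how (A-4) arises, one can expand (A-3) to first order in $χ$ . A minimal sketch for a matrix group, writing $γ = e^{χ} ≃ 1 + χ$ and $γ^{- 1} ≃ 1 - χ$ and dropping terms of order $χ^{2}$ :

\begin{matrix} A^{γ} & ≃ (1 - χ) A (1 + χ) + (1 - χ) d χ ≃ A + d χ + [A, χ], \\ ϕ^{γ} & ≃ (1 - ρ_{*} (χ)) ϕ = ϕ - ρ_{*} (χ) ϕ, \\ F^{γ} & ≃ (1 - χ) F (1 + χ) ≃ F + [F, χ] . \end{matrix}

Reading off the terms linear in $χ$ reproduces $δ_{χ} A$ , $δ_{χ} ϕ$ , and $δ_{χ} F$ as given in (A-4).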

As defined by (A-3), the action of $\mathcal{G}$ on $Φ$ is a right action. To illustrate that it is well defined, let us look at the iterated action of $\mathcal{G}$ on the matter field,

\begin{matrix} ϕ \mapsto ϕ^{η} \mapsto (ϕ^{η})^{γ} & := (ρ (η)^{- 1} ϕ)^{γ} \\ & := ρ (η^{γ})^{- 1} ϕ^{γ} = ρ (γ^{- 1} η γ)^{- 1} ρ (γ)^{- 1} ϕ = ρ (η γ)^{- 1} ϕ \\ & =: ϕ^{η γ} . \end{matrix}

(A-5)

At the beginning of the second line we use the fact that $\mathcal{G}$ acts on all field objects, by (A-2)–(A-3), and at the end we use the definition (A-3) to conclude that we have indeed a well-defined (right) action on the space of matter fields, $(ϕ^{η})^{γ} = ϕ^{η γ}$ . Showing the same on the space of gauge potentials is left as an exercise to the reader.
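
For the reader who wants to compare notes, here is one way the exercise can go. It is a sketch that mirrors (A-5), using $η^{γ} = γ^{- 1} η γ$ and $γ d (γ^{- 1}) = - d γ γ^{- 1}$ :

\begin{matrix} (A^{η})^{γ} & := (η^{γ})^{- 1} A^{γ} η^{γ} + (η^{γ})^{- 1} d η^{γ} \\ & = γ^{- 1} η^{- 1} γ (γ^{- 1} A γ + γ^{- 1} d γ) γ^{- 1} η γ + γ^{- 1} η^{- 1} γ d (γ^{- 1} η γ) \\ & = γ^{- 1} η^{- 1} A η γ + γ^{- 1} η^{- 1} d η γ + γ^{- 1} d γ \\ & = (η γ)^{- 1} A (η γ) + (η γ)^{- 1} d (η γ) =: A^{η γ} . \end{matrix}

In passing from the second to the third line, the terms $γ^{- 1} η^{- 1} d γ γ^{- 1} η γ$ and $γ^{- 1} η^{- 1} γ d (γ^{- 1}) η γ$ cancel.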

The main takeaway we argue for here is that, mathematically, the action of $\mathcal{G}$ on $Φ$ (and on itself) is part of the definition of the various fields considered. This is relevant to our discussion of the dressing field approach, and it is something to be mindful of in general.

A3 The Lagrangian of a Gauge Field Theory

Now that the kinematics are properly characterized, we may turn our attention to the dynamics. They are specified by choosing a Lagrangian, an ( $\mathbb{R}$ -valued) $n$ -form $L$ on $M$ , with action functional $S = \int_{M} L$ from which field equations are derived via the Variational Principle (classically), or from which a Lagrangian QFT is built (e.g. via a path integral formulation). In any case, the choice should, of course, be dictated by empirical adequacy. Yet, given the obvious impracticality of going back from empirical data (especially those yet to be found) to the Lagrangian, physicists must rely on well-motivated Symmetry Principles (epistemic and/or heuristic in nature) to constrain as much as possible the a priori choice of the Lagrangian, namely the space of admissible theories.Footnote ⁵

One such guiding principle is the idea that the expression of physical laws (Lagrangians and field equations) should be indifferent to choices of coordinates: this is (a version of) the generalized Relativity Principle, the Principle of General Covariance (PGC), which reflects a requirement of democratic epistemic access of all observers to the objective physical reality. In that respect, working with tensors and differential forms on $M$ to represent fields makes compliance with the PGC automatic and goes a long way toward implementing the core idea underlying the generalized Relativity Principle.Footnote ⁶

Another such guiding principle is of course the Gauge Principle (GP), a central theme of this book, discussed in Section 2 – in particular Sections 2.1.1 and 2.2.3. It requires that the admissible space of theories be described by Lagrangians for which the gauge group $\mathcal{G}$ is a (variational) symmetry, meaning those $L$ s.t.

\begin{matrix} L (A^{γ}, ϕ^{γ}) = L (A, ϕ), \quad \text{or} \quad δ_{χ} L (A, ϕ) = d b (χ, A, ϕ) \end{matrix}

(A-6)

\begin{matrix} \Rightarrow \quad S (A^{γ}, ϕ^{γ}) = S (A, ϕ) \quad \text{and/or} \quad δ_{χ} S = 0. \end{matrix}

(A-7)

The second condition is often referred to as quasi-invariance of the Lagrangian, that is, infinitesimal invariance up to $d$ -exact (or boundary) terms. By Stokes' theorem, $\int_{M} d α = \int_{\partial M} α$ , it is enough to guarantee the invariance of the action on boundaryless manifolds, $\partial M = \emptyset$ , or when boundary conditions are imposed on the fields to fix their values, so that $χ_{| \partial M} = 0$ . The Chern–Simons Lagrangian in 3D for a gauge field is of this type: $L_{CS} (A) = Tr (A d A + \frac{2}{3} A^{3})$ .
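
As an illustration of quasi-invariance, the boundary term for the Chern–Simons Lagrangian can be worked out explicitly. The following is a sketch under the assumptions that $Tr$ is graded-cyclic on matrix-valued forms and that $A^{3}$ stands for $A \land A \land A$ , so that $F = d A + A \land A$ . For an arbitrary variation $δ A$ one finds

\begin{matrix} δ L_{CS} = 2 Tr (δ A \land F) + d Tr (δ A \land A) . \end{matrix}

Inserting $δ_{χ} A = D χ$ and using the Bianchi identity $D F \equiv 0$ , together with $Tr ([A, χ] \land A) = - 2 Tr (χ A \land A)$ and $Tr (d χ \land A) = d Tr (χ A) - Tr (χ d A)$ , gives

\begin{matrix} δ_{χ} L_{CS} & = d [ 2 Tr (χ F) + Tr (d χ \land A) - 2 Tr (χ A \land A) ] \\ & = d [ Tr (χ d A) + d Tr (χ A) ] = d Tr (χ d A), \end{matrix}

so that in (A-6) one may take $b (χ, A) = Tr (χ d A)$ , up to a $d$ -closed term. In the Abelian case this reduces to $δ_{χ} (A \land d A) = d χ \land d A = d (χ d A)$ .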

The first condition is exact gauge invariance. The easiest way to achieve it is to construct $L$ from gauge covariant forms, such as $ϕ$ , $D ϕ$ and $F$ . The prototypical Lagrangian for a gauge field coupled to scalar and spinor fields is

\begin{matrix} L (A, ϕ, ψ) & = \frac{1}{2} Tr (F \land * F) + ⟨ ψ, \not{D} ψ ⟩ - m ⟨ ψ, * ψ ⟩ \end{matrix}

(A-8)

\begin{matrix} & + ⟨ D ϕ, * D ϕ ⟩ - μ^{2} ⟨ ϕ, * ϕ ⟩ . \end{matrix}

(A-9)

As $L$ should be an $\mathbb{R}$ -valued $n$ -form, we used several nontrivial (yet very natural) ingredients: First, the Hodge dual operator $* : Ω^{p} (M) \to Ω^{n - p} (M)$ , which transforms $p$ -forms into $(n - p)$ -forms, and in particular for any $0$ -form $φ$ we have $* φ = φ \, {vol}_{n}$ with ${vol}_{n}$ the volume $n$ -form on $M$ . Then, $Tr$ and $⟨ \, , ⟩$ , which are $G$ -invariant nondegenerate bilinear forms on the respective target spaces $\mathfrak{g}$ , $V$ and $S$ (or $V \otimes S$ )Footnote ⁷ of the elementary covariant fields/forms $F$ , $ϕ$ and $ψ$ respectively. Finally, the Dirac operator $\not{D} := γ \land * D$ , where $γ := γ_{μ} d x^{μ} = γ_{a} {e^{a}}_{μ} d x^{μ}$ is a Clifford algebra-valued 1-form, with $γ_{a}$ the Dirac gamma matrices – basis of the Clifford algebra on Minkowski space $(M, η)$ – satisfying $γ_{a} γ_{b} + γ_{b} γ_{a} = 2 η_{a b}$ , and ${e^{a}}_{μ}$ is the cotetrad field.Footnote ⁸ In components, $\not{D} ψ = γ^{μ} D_{μ} ψ \, {vol}_{n}$ : it is an $n$ -form.

The terms involving $ϕ$ , and $A$ via $D ϕ$ , are a coupled Klein–Gordon-type Lagrangian $L_{KG} (ϕ, A)$ , which describes the dynamics of the massive bosonic field $ϕ$ and its interaction with $A$ . From it, one derives the second-order massive Klein–Gordon field equation. In particular, the term quadratic in $ϕ$ is its mass term. Clearly, it is possible to add gauge-invariant terms of the form $λ ⟨ ϕ, * ϕ ⟩^{2 p} := λ ⟨ ϕ, ϕ ⟩^{2 p} \, {vol}_{n}$ (power taken on the $\mathbb{R}$ -value of the $n$ -form on the left-hand side): a potential term for $ϕ$ typically contains quadratic and quartic terms such as these.

The terms involving $ψ$ , and $A$ via the Dirac operator $\not{D} ψ$ , are a coupled Dirac Lagrangian $L_{Dirac} (ψ, A)$ , and describe the dynamics of the massive fermionic field $ψ$ and its interaction with the gauge field $A$ . Thanks to the Dirac operator $\not{D} ψ$ , which is an $n$ -form by itself, $L_{Dirac} (ψ, A)$ contains a single derivative, so it gives rise to the first-order Dirac equation (the historical raison d’être of the Dirac operator). Here again, the term quadratic in $ψ$ is its mass term.

The term quadratic in $F$ is the Yang–Mills Lagrangian, $L_{YM} (A)$ , which provides the dynamics and self-coupling of the gauge potential. From it, one derives the Yang–Mills equation $D * F = J (ϕ, ψ)$ , where $J$ is the current $(n - 1)$ -form built from the matter fields that acts as a source for the gauge field. In the Abelian case this is $d * F = J$ , the dynamical Maxwell equations. Notice that a mass term for the bosonic gauge potential, $m^{2} Tr (A \land * A)$ – or in components, $m^{2} {A^{a}}_{b, ν} {{A^{b}}_{a,}}^{ν}$ – is forbidden by the GP: indeed, by (A-3) such a term is not gauge-invariant. The gauge symmetry $\mathcal{G}$ implies that gauge interactions are mediated by massless fields, giving massless gauge bosons upon quantization, whose physical influence thus must propagate at the speed of light $c$ (which is an insightful retrodiction/explanation and prediction concerning QED and GR).
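
The failure of gauge invariance of the mass term can be made explicit at the infinitesimal level. A short sketch, using (A-4) and the cyclicity of the trace:

\begin{matrix} δ_{χ} Tr (A \land * A) & = 2 Tr (D χ \land * A) = 2 Tr (d χ \land * A) + 2 Tr ([A, χ] \land * A) \\ & = 2 Tr (d χ \land * A), \\ δ_{χ} Tr (F \land * F) & = 2 Tr ([F, χ] \land * F) = 0 . \end{matrix}

The commutator terms vanish because, in components, $Tr ([A_{μ}, χ] A^{μ}) = 0$ and $Tr ([F_{μ ν}, χ] F^{μ ν}) = 0$ . The remaining term $2 m^{2} Tr (d χ \land * A)$ is nonzero for generic $χ$ , whereas the Yang–Mills term is exactly invariant.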

Appendix B Fiber Bundle Geometry

In this section we go a layer deeper in the mathematical foundation of gauge field theory. We take the view, fairly consensual, that the differential geometry of fiber bundles is the geometrical underpinning of classical gauge field theories. The motivated yet unacquainted reader may consult the short pedagogical review (François, 2021a) before delving into more complete introductions, for example the nice book by Hamilton (2017). In the following, we simply provide a logical but dense articulation of the elementary notions of bundle geometry, showing how it underlies gauge field concepts seen in Appendix A. This is then put to work when we express the DFM more geometrically than our field-theoretic-focused account of Section 5 allowed for.

B1 Differential Geometry of Gauge Theories

The recipe for a gauge field theory consists in a series of geometric ingredients providing the “kinematics,” so to speak, and a physical one providing the dynamics: the Lagrangian.

Arguably the central ingredient is a principal bundle $P$ over space-time $M$ with structure group $H$ (the global/rigid symmetry group) and projection $π : P \to M$ , $p \mapsto π (p) = x$ . A fiber over $x \in M$ is the submanifold $π^{- 1} (x) = P_{| x} \subset P$ . Each fiber is an orbit of the right action of the structure group, $P \times H \to P$ , $(p, h) \mapsto p h =: R_{h} p$ , which is free and transitive on each fiber. The linearization of this action induces vectors tangent to the fibers: to each $X \in$ Lie $H$ corresponds a vertical vector $X_{| p}^{v}$ at $p$ . At $p \in P$ , the span of these vectors is a vector subspace $V_{p} P$ of the tangent space $T_{p} P$ . The collection of all such subspaces for all $p \in P$ is the canonical vertical subbundle $V P$ of the tangent bundle $T P$ . We denote by $Γ (V P)$ the space of vertical vector fields $X^{v} : P \to V P$ (i.e. sections of $V P$ ).

Given representations $(ρ_{i}, V_{i})$ of $H$ , one naturally builds associated bundles to $P$ , $E_{i} := P \times_{ρ_{i}} V_{i}$ with typical fiber $V_{i}$ , whose sections $s_{i} : M \to E_{i}$ , $s_{i} \in Γ (E_{i})$ , represent various kinds of matter fields. It is a standard result of bundle theory that sections are in 1:1 correspondence with representation-valued equivariant functions on $P$ , namely $Γ (E_{i}) ≃ Ω_{eq}^{0} (P, V_{i}) := \{φ : P \to V_{i} \mid R_{h}^{*} φ = ρ_{i} (h^{- 1}) φ\}$ , that is, $φ (p h) = ρ_{i} (h^{- 1}) φ (p)$ . More generally, one may define the space of representation-valued equivariant forms $Ω_{eq}^{∙} (P, V_{i}) := \{α \in Ω^{∙} (P, V_{i}) \mid R_{h}^{*} α = ρ_{i} (h^{- 1}) α\}$ , and the important subspace of tensorial forms $Ω_{tens}^{∙} (P, V_{i}) := \{α \in Ω_{eq}^{∙} (P, V_{i}) \mid α (X^{v}, \dots) = 0 \text{ for } X^{v} \in Γ (V P)\}$ . A form that satisfies this latter condition, without necessarily being equivariant, is called horizontal. Notice that $Ω_{eq}^{0} (P, V_{i}) = Ω_{tens}^{0} (P, V_{i})$ . Tensorial forms with trivial equivariance are called basic. The name is justified by the fact that basic forms induce, or come from, forms on the base $M$ . In fact, an alternative definition is $Ω_{basic}^{∙} (P, V_{i}) := \{α \in Ω^{∙} (P, V_{i}) \mid \exists β \in Ω^{∙} (M, V_{i}) \text{ s.t. } α = π^{*} β\}$ .

The exterior derivative $d$ on $P$ does not preserve the space of tensorial forms: in particular, although $d φ$ is still equivariant, it is not horizontal in general, so $d φ \notin Ω_{tens}^{1} (P, V_{i})$ . Hence the introduction on $P$ of a connection 1-form $ω \in Ω^{1} (P, Lie H)$ defined by

\begin{matrix} R_{h}^{*} ω & = {Ad}_{h^{- 1}} ω, \quad \text{i.e.} \quad ω \in Ω_{eq}^{1} (P, \text{Lie} H), \end{matrix}

(B-1)

\begin{matrix} ω_{p} (X_{| p}^{v}) & = X \in \text{Lie} H, \quad \forall X_{| p}^{v} \in V_{p} P . \end{matrix}

(B-2)

It follows from these properties that one can define a covariant derivative, $D^{ω} := d + ρ_{i *} (ω) : Ω_{tens}^{∙} (P, V_{i}) \to Ω_{tens}^{∙ + 1} (P, V_{i})$ , where $ρ_{i *}$ are representation maps for Lie $H$ . In particular, $D^{ω} φ = d φ + ρ_{i *} (ω) φ \in Ω_{tens}^{1} (P, V_{i})$ , that is, a connection allows for a good notion of differentiation on $Γ (E_{i})$ . The choice of a connection 1-form on $P$ is noncanonical. The space of all connections $\mathcal{C}$ is an affine space modeled on the vector space $Ω_{tens}^{1} (P, Lie H)$ , meaning that for $ω \in \mathcal{C}$ and $α \in Ω_{tens}^{1} (P, Lie H)$ , $ω^{'} := ω + α \in \mathcal{C}$ . Equivalently, as is clear from the preceding defining properties, for $ω, ω^{'} \in \mathcal{C}$ we have $ω + ω^{'} \notin \mathcal{C}$ but $ω^{'} - ω \in Ω_{tens}^{1} (P, Lie H)$ .

The curvature of a connection is given by $Ω = d ω + \frac{1}{2} [ω, ω]$ . One shows that it is a tensorial form, $Ω \in Ω_{tens}^{2} (P, Lie H)$ . The covariant derivative can thus act on it, and it does so trivially, which is the Bianchi identity: $D^{ω} Ω = 0$ . It is also easily proved that $D^{ω} \circ D^{ω} = ρ_{*} (Ω)$ .
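
The Bianchi identity can also be verified directly from the definition of $Ω$ . A sketch, assuming the graded bracket conventions $[ω, d ω] = - [d ω, ω]$ and $d [ω, ω] = [d ω, ω] - [ω, d ω]$ for the 1-form $ω$ :

\begin{matrix} D^{ω} Ω & = d Ω + [ω, Ω] = d (d ω + \frac{1}{2} [ω, ω]) + [ω, d ω + \frac{1}{2} [ω, ω]] \\ & = [d ω, ω] + [ω, d ω] + \frac{1}{2} [ω, [ω, ω]] = 0 . \end{matrix}

The first two terms cancel by graded antisymmetry, and $[ω, [ω, ω]] = 0$ is the Jacobi identity in Lie $H$ .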

The natural maximal group of transformations of $P$ is its group of automorphisms $Aut (P) := \{Ψ \in Diff (P) \mid Ψ (p h) = Ψ (p) h\}$ , namely the subgroup of $Diff (P)$ that respects the fibration structure by sending fibers to fibers, and thus projects to $Diff (M)$ . The subgroup of vertical automorphisms is ${Aut}_{v} (P) := \{Ψ \in Aut (P) \mid π \circ Ψ = π\}$ , that is, those automorphisms that induce the identity transformation on $M$ . The latter group is isomorphic to the gauge group of $P$ , $\mathcal{H} := \{γ : P \to H \mid R_{h}^{*} γ = h^{- 1} γ h\}$ (itself isomorphic to the space of sections of the bundle $P \times_{Conj} H$ ), the isomorphism being $Ψ (p) = p γ (p)$ .

The gauge transformation of a differential form $α$ on $P$ is defined by $α^{γ} := Ψ^{*} α$ , for $γ \in \mathcal{H}$ corresponding to $Ψ \in {Aut}_{v} (P)$ . Notably, the gauge transformations of tensorial forms are entirely controlled by their equivariance: for $α \in Ω_{tens}^{∙} (P, V_{i})$ , $α^{γ} = ρ_{i} (γ^{- 1}) α$ (hence the name for such forms). The $\mathcal{H}$ -transformation of a connection also assumes a simple form: for $ω \in \mathcal{C}$ , $ω^{γ} = γ^{- 1} ω γ + γ^{- 1} d γ$ . Transformations induced by the action of $\mathcal{H} ≃ {Aut}_{v} (P)$ are called active gauge transformations, as they are analogous to the action of $Diff (M)$ in General Relativity (GR).

Notice that as a special case of its action on tensorial forms, the gauge group acts on itself as $η^{γ} = γ^{- 1} η γ$ for $η, γ \in \mathcal{H}$ . This ensures that the action of $\mathcal{H}$ on $Ω_{tens}^{∙} (P, V_{i})$ and $\mathcal{C}$ is a right action.

A bundle $P$ is always locally trivial, meaning that for suitable open sets $U \subset M$ we have $P_{| U} ≃ U \times H$ . A trivializing section is a map $σ : U \to P_{| U}$ , $x \mapsto σ (x)$ . By its means, one can pull back objects of $P$ down to $M$ . In particular, the local representative of $ω \in \mathcal{C}$ is $A := σ^{*} ω \in Ω^{1} (U, Lie H)$ , which is a gauge (Yang–Mills) potential. The field strength of $A$ is the local representative of the curvature $F := σ^{*} Ω \in Ω^{2} (U, Lie H)$ . Then the local representatives $ϕ := σ^{*} φ \in Ω^{0} (U, V_{i})$ are various kinds of matter fields, and $D^{A} ϕ := σ^{*} (D^{ω} φ) \in Ω^{1} (U, V_{i})$ their minimal coupling to the gauge field. The last three are special cases of local representatives of tensorial forms: $a := σ^{*} α \in Ω^{∙} (U, V_{i})$ .

Consider $U$ and $U^{'}$ s.t. $U \cap U^{'} \neq \emptyset$ , with local sections $σ : U \to P_{| U}$ and $σ^{'} : U^{'} \to P_{| U^{'}}$ related on the overlap via $σ^{'} = σ g$ , where $g : U \cap U^{'} \to H$ , $x \mapsto g (x)$ , is a (well-named) transition function of $P$ . The local representatives on $U$ and $U^{'}$ obtained via $σ$ and $σ^{'}$ satisfy gluing properties on $U \cap U^{'}$ involving the $g$ ’s. For local representatives of a connection and of tensorial forms we have

\begin{matrix} A^{'} = g^{- 1} A g + g^{- 1} d g, \quad a^{'} = ρ_{i} (g^{- 1}) a . \end{matrix}

(B-3)

As special cases of the second equation, we have the gluings of the field strength and matter fields: $F^{'} = g^{- 1} F g$ and $ϕ^{'} = ρ_{i} (g^{- 1}) ϕ$ . Equations (B-3) are called passive gauge transformations, as they are entirely analogous to coordinate changes, or passive diffeomorphisms, in GR.

The latter are formally indistinguishable, yet conceptually different, from local active gauge transformations, namely the local representatives of the global $\mathcal{H}$ -transformations seen previously, which on $U$ would read

\begin{matrix} A^{γ} = γ^{- 1} A γ + γ^{- 1} d γ, \quad a^{γ} = ρ_{i} (γ^{- 1}) a, \quad \text{for } γ \in \mathcal{H}_{loc} \end{matrix}

(B-4)

and with the local gauge group over $U$ defined as $\mathcal{H}_{loc} := \{σ^{*} γ : U \to H \mid γ \in \mathcal{H}\}$ , whose elements we again denote by $γ$ and which act on one another by conjugation, $η^{γ} = γ^{- 1} η γ$ .Footnote ¹ In the next section, we will focus our discussion on local active gauge transformations.

With this ends our summary of the geometry underlying the kinematics of a gauge theory. Let us denote by $\mathcal{A}$ the space of gauge potentials (local connections) and, with slight abuse, by $Γ (E_{i})$ the spaces of matter fields. A gauge theory is specified by a Lagrangian functional $L : \mathcal{A} \times Γ (E_{i}) \to Ω^{n} (U, \mathbb{R})$ , $(A, ϕ) \mapsto L (A, ϕ)$ , with $n = \dim M$ . Requiring the passive gauge invariance of $L$ , $L (A^{'}, ϕ^{'}) = L (A, ϕ)$ , amounts to requiring that it has trivial gluings and is well defined across $M$ : $L (A, ϕ) \in Ω^{n} (M, \mathbb{R})$ . But this is formally indistinguishable from requiring its local active gauge invariance, $L (A^{γ}, ϕ^{γ}) = L (A, ϕ)$ , which implies that it comes from an $\mathcal{H}$ -invariant, or basic, form on $P$ : $\bar{L} (ω, φ) = π^{*} L (A, ϕ) \in Ω_{basic}^{n} (P, \mathbb{R})$ .

B2 Geometry of the Dressing Field Method

Here, we somewhat mimic the structure of the main text presentation (Section 5) so as to ease the comparison with the field-theoretic presentation. More complete presentations from the perspective of the geometry of field space $Φ$ can be found in François (2021b) and François et al. (2021). Let us begin by defining the central object of the DFM: consider an $H$ -gauge theory based on a bundle $P (M, H)$ .

Definition 5.

Suppose there exist a subgroup $K \subseteq H$ of the structure group, to which corresponds a subgroup $\mathcal{K} \subset \mathcal{H}$ of the gauge group, and a group $G$ s.t. $K \subseteq G \subseteq H$ . A $K$ -dressing field is a map $u : P \to G$ defined by its $K$ -equivariance $R_{k}^{*} u = k^{- 1} u$ . Denote the space of $G$ -valued $K$ -dressing fields on $P$ by $\mathcal{D}r [G, K]$ . It follows immediately that the $\mathcal{K}$ -gauge transformation of a dressing field is $u^{γ} = γ^{- 1} u$ , for $γ \in \mathcal{K}$ .

Given the existence of a $K$ -dressing field, we have the following:

Proposition 6.

From $ω \in \mathcal{C}$ and $α \in Ω_{tens}^{∙} (P, V_{i})$ , one defines the dressed fields

\begin{matrix} ω^{u} := u^{- 1} ω u + u^{- 1} d u \quad \text{and} \quad α^{u} := ρ_{i} (u)^{- 1} α, \end{matrix}

(B-5)

which have trivial $K$ -equivariance and are $K$ -horizontal, thus are $K$ -basic on $P$ . It follows that they are $\mathcal{K}$ -invariant: $(ω^{u})^{γ} = ω^{u}$ and $(α^{u})^{γ} = α^{u}$ , for $γ \in \mathcal{K}$ , as is easily checked. The dressed curvature is $Ω^{u} = d ω^{u} + \frac{1}{2} [ω^{u}, ω^{u}] = u^{- 1} Ω u$ and appears when squaring the dressed covariant derivative defined as $D^{ω^{u}} := d + ρ_{*} (ω^{u})$ . It satisfies the Bianchi identity $D^{ω^{u}} Ω^{u} = 0$ .

In case the equivariance group of $u$ is $K = H$ , $α^{u} \in Ω_{basic}^{∙} (P, V_{i})$ and $ω^{u} \in Ω_{basic}^{1} (P, Lie H)$ are $\mathcal{H}$ -invariant, and thus project as forms on $M$ . The preceding results for $α^{u}$ make sense for $G \supset H$ if we assume that representations $(V_{i}, ρ_{i})$ of $H$ extend to representations of $G$ .
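
The invariance claimed in Proposition 6 can be checked by a direct computation. A sketch, using only $ω^{γ} = γ^{- 1} ω γ + γ^{- 1} d γ$ , $α^{γ} = ρ_{i} (γ)^{- 1} α$ , and $u^{γ} = γ^{- 1} u$ for $γ$ in the gauge subgroup associated with $K$ :

\begin{matrix} (ω^{u})^{γ} & = (u^{γ})^{- 1} ω^{γ} u^{γ} + (u^{γ})^{- 1} d u^{γ} \\ & = u^{- 1} γ (γ^{- 1} ω γ + γ^{- 1} d γ) γ^{- 1} u + u^{- 1} γ d (γ^{- 1} u) \\ & = u^{- 1} ω u + u^{- 1} d γ γ^{- 1} u + u^{- 1} γ d (γ^{- 1}) u + u^{- 1} d u = ω^{u}, \\ (α^{u})^{γ} & = ρ_{i} (γ^{- 1} u)^{- 1} ρ_{i} (γ)^{- 1} α = ρ_{i} (u)^{- 1} ρ_{i} (γ) ρ_{i} (γ)^{- 1} α = α^{u} . \end{matrix}

The two middle terms of the third line cancel since $γ d (γ^{- 1}) = - d γ γ^{- 1}$ .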

Let us emphasize an important fact: it should be clear from its definition that $u \notin \mathcal{K}$ , so that (B-5) are not gauge transformations, despite the formal resemblance. This means, in particular, that the dressed connection is no longer an $H$ -connection, $ω^{u} \notin \mathcal{C}$ , and a fortiori is not a point in the gauge $\mathcal{K}$ -orbit $\mathcal{O}_{\mathcal{K}} [ω]$ of $ω$ , so that $ω^{u}$ must not be confused with a gauge fixing of $ω$ .

On $U \subset M$, a local $\mathcal{K}_{loc}$-dressing field $u = σ^{*} u : U \to G$, $u \in \mathcal{D}r[G, K]_{loc}$, will be defined (or recognized) by its gauge transformation property $u^{γ} = γ^{-1} u$ for $γ \in \mathcal{K}_{loc} \subseteq \mathcal{H}_{loc}$. The local version of Proposition 6 is none other than Proposition 2 of Section 5.2.

Residual Gauge Transformations

The only way to speak meaningfully about well-behaved residual gauge transformations is if $K$ is a normal subgroup, $K \triangleleft H$, so that the quotient $J := H / K$ is indeed a group, to which corresponds the residual gauge subgroup $\mathcal{J} \subset \mathcal{H}$. Now, the action of $\mathcal{J}$ on the initial variables $ω$ and $α$ is known. Therefore, what determines the $\mathcal{J}$-residual gauge transformations of the dressed fields is the action of $\mathcal{J}$ on the dressing field, and this in turn is determined by its $J$-equivariance. In that regard, consider the following proposition.

Proposition 7.

Suppose the dressing field $u$ has $J$-equivariance given by $R_{j}^{*} u = j^{-1} u j$. Then the dressing field has $\mathcal{J}$-gauge transformation $u^{η} = η^{-1} u η$ for $η \in \mathcal{J}$, and the residual gauge transformations of the dressed fields are $(ω^{u})^{η} = η^{-1} ω^{u} η + η^{-1} d η$ and $(α^{u})^{η} = ρ(η)^{-1} α^{u}$. So in particular $(Ω^{u})^{η} = η^{-1} Ω^{u} η$.

This means that the dressed connection $ω^{u}$ remains a good connection on the $J$-subbundle $P' \subset P$, with curvature $Ω^{u}$. The local version is none other than Proposition 3 of Section 5.2.1.
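
The residual transformation laws of Proposition 7 can be checked by substitution. The sketch below assumes that the residual gauge group acts on the undressed fields in the usual way, $ω^{η} = η^{-1} ω η + η^{-1} dη$ and $α^{η} = ρ(η)^{-1} α$, and uses $u^{η} = η^{-1} u η$, so that $(u^{η})^{-1} = η^{-1} u^{-1} η$:

$$\begin{aligned}
(ω^{u})^{η} &= (u^{η})^{-1} ω^{η} u^{η} + (u^{η})^{-1} d u^{η} \\
&= η^{-1} u^{-1} η \left( η^{-1} ω η + η^{-1} dη \right) η^{-1} u η + η^{-1} u^{-1} η \, d(η^{-1} u η) \\
&= η^{-1} \left( u^{-1} ω u + u^{-1} du \right) η + η^{-1} u^{-1} \left( dη\, η^{-1} + η\, dη^{-1} \right) u η + η^{-1} dη \\
&= η^{-1} ω^{u} η + η^{-1} dη, \\
(α^{u})^{η} &= ρ(u^{η})^{-1} α^{η} = ρ(η)^{-1} ρ(u)^{-1} ρ(η)\, ρ(η)^{-1} α = ρ(η)^{-1} α^{u}.
\end{aligned}$$

The bracket $dη\, η^{-1} + η\, dη^{-1}$ vanishes identically, since it equals $d(η η^{-1})$.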

Ambiguity in the Choice of Dressing Field

The dressed fields may exhibit residual transformations of another kind, resulting from a potential ambiguity in choosing the dressing field. A priori, two dressings $u, u' \in \mathcal{D}r[G, K]$ may be related by $u' = u ξ$, where $ξ : P \to G$. Since by definition $R_{k}^{*} u = k^{-1} u$ and $R_{k}^{*} u' = k^{-1} u'$, one has $R_{k}^{*} ξ = R_{k}^{*}(u^{-1} u') = (k^{-1} u)^{-1} k^{-1} u' = ξ$. Let us denote the group of such basic maps by $\mathcal{G} := \{ ξ : P \to G \mid R_{k}^{*} ξ = ξ \}$, and denote its action on a dressing field by $u^{ξ} = u ξ$. By definition, $\mathcal{G}$ has no action on the space of connections $\mathcal{C}$ or on $Ω_{tens}^{\bullet}(P, V_{i})$: note that $ω^{ξ} = ω$ and $α^{ξ} = α$. On the other hand, it is clear how $\mathcal{G}$ acts on dressed fields:

$$\begin{aligned} (ω^{u})^{ξ} &:= (ω^{ξ})^{u^{ξ}} = ω^{u ξ} = ξ^{-1} ω^{u} ξ + ξ^{-1} d ξ, \quad \text{and} \\ (α^{u})^{ξ} &:= (α^{ξ})^{u^{ξ}} = α^{u ξ} = ρ(ξ^{-1}) α^{u}. \end{aligned} \tag{B-6}$$

In particular, $(Ω^{u})^{ξ} = ξ^{-1} Ω^{u} ξ$. The new dressed fields $(ω^{u})^{ξ}$ and $(α^{u})^{ξ}$ are also $K$-basic, and therefore $\mathcal{K}$-invariant. This means that the bijective correspondence between the $K$-dressings $(χ^{u})^{ξ}$, for $χ = (ω, α)$, and their gauge $\mathcal{K}$-orbits $\mathcal{O}_{\mathcal{K}}[χ]$ holds $\forall ξ \in \mathcal{G}$. So, there is a $1:1$ correspondence $\mathcal{O}_{\mathcal{K}}[χ] \sim \mathcal{O}_{\mathcal{G}}[χ^{u}]$.
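
For completeness, the transformation laws (B-6) follow by substituting $u ξ$ for $u$ in (B-5). The sketch below assumes only the Leibniz rule $d(u ξ) = du\, ξ + u\, dξ$ and that the representation used for $α$ is defined on all of $G$:

$$\begin{aligned}
ω^{u ξ} &= (u ξ)^{-1} ω\, (u ξ) + (u ξ)^{-1} d(u ξ) \\
&= ξ^{-1} \left( u^{-1} ω u + u^{-1} du \right) ξ + ξ^{-1} dξ = ξ^{-1} ω^{u} ξ + ξ^{-1} dξ, \\
α^{u ξ} &= ρ(u ξ)^{-1} α = ρ(ξ)^{-1} ρ(u)^{-1} α = ρ(ξ^{-1})\, α^{u}.
\end{aligned}$$

Formally, the dressed connection transforms under $ξ$ as a connection would under a gauge transformation, even though $ξ$ acts trivially on the undressed fields $ω$ and $α$.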

The local counterpart of the preceding discussion reproduces the field-theoretic treatment of the main text, see (5.13) in Section 5.2.2, where the physical implications were discussed.

Acknowledgments

P. B. acknowledges support by the Austrian Science Fund (FWF) [P 31758]. During the completion of this work, J. F. has been supported by the Fonds de la Recherche Scientifique – FNRS under grant PDR No. T.0022.19 (“Fundamental issues in extended gravitational theories”) and by the FNRS grant MIS No. F.4503.20 (“HighSpinSymm”). R. S. acknowledges support by the DFG under Grant No. SO1777/1-1.

About the Series

Foundations in Contemporary Physics explores some of the most significant questions and discussions currently taking place in modern physics. The series is accessible to physicists and philosophers and historians of science, and has a strong focus on cutting-edge topics of research such as quantum information, cosmology, and big data.