connection
Note that this definition has come under new management and is still in the process of being edited and rewritten.
Intuitive geometric definition
The notions of connection, parallel transport, and covariant derivative are closely related so, to prevent confusion, we will begin by explaining these notions intuitively before presenting formal definitions. Moreover, it helps to have a good grasp of the geometric notions involved before studying the more formal definitions.
In elementary vector analysis, one takes it for granted that vectors can be moved about freely. As long as one takes care not to change the magnitude or the direction of a vector, one can move the basepoint of the vector to any arbitrary location.
When one graduates to the study of vectors on curved spaces, however, it becomes apparent that one can no longer take this freedom of moving vectors about for granted. As defined, vectors are confined to their basepoint and the basic operations with vectors are only defined for vectors based at the same point.
To move a vector from one point to another, one needs to specify how this is to be done. A connection is a prescription for moving vectors based at one point of a space to another point. Intuitively speaking, a connection consists of a set of linear transformations which transform vectors based at a particular point into vectors based at infinitesimally nearby points. Unlike in elementary vector analysis where there is only one right way of moving a vector from one point to another, in differential geometry there are many ways of moving vectors around, so one needs to specify which connection one is using before one can move vectors from point to point.
This act of moving a vector from point to point is called parallel transport in analogy with the operation of elementary vector analysis which it generalizes. Not only can one speak of transporting a vector to a nearby point using a connection, but one can parallel transport a vector along a curve. To see how that works, imagine a curve as a sequence of points. Using the connection, we can transport a vector based at one point of a curve to the next point on the curve. Then we can use the connection to transport it to the point after that, and so on until we have transported it from one end of the curve to the other.
At this point, a striking difference between differential geometry and elementary vector analysis shows up. Typically, if we connect two points $P$ and $Q$ by two or more curves and parallel transport a vector based at $P$ to $Q$, we find that the result depends upon which curve we transported the vector along. In fact, in differential geometry, the definition of a curved space is a space in which there exist two distinct curves with the same endpoints such that parallel transport along one curve is not the same as parallel transport along the other curve.
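A standard illustration, sketched here under assumptions that go beyond what has been introduced so far: take the unit sphere with its usual (Levi-Civita) parallel transport, and carry a tangent vector from the north pole $P$ to a point $Q$ on the equator along two different routes. The names $\gamma_{1}$, $\gamma_{2}$ and the angle $\alpha$ are introduced only for this example.

```latex
% gamma_1: the meridian of longitude 0, from the north pole P down to the equator at Q.
% gamma_2: the meridian of longitude \pi/2 down to the equator, then back along
%          the equator to Q.
% The two routes bound a geodesic triangle with three right angles, covering
% one eighth of the unit sphere.  The two transported vectors at Q differ by a
% rotation through an angle equal to the enclosed area (Gaussian curvature K = 1):
\alpha \;=\; \iint_{\text{triangle}} K\,dA \;=\; \frac{4\pi}{8} \;=\; \frac{\pi}{2}.
```

On a flat plane the same construction gives $\alpha=0$, in line with the freedom to move vectors about in elementary vector analysis.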
Finally, there is the notion of covariant derivative. Suppose that one is given not just a single vector based at a certain point, but a whole vector field, i.e. a vector for each point of the manifold. Then one can try to compute the derivative of this vector field. To compute a derivative of a function, one subtracts the value of the function at a point from the value at a nearby point. But this is not possible for the vector field because we are only allowed to subtract vectors stationed at the same base point. However, we can use our connection to parallel transport the vector at a point to the nearby point, then subtract. This generalization of differentiation involving parallel transport is known as covariant differentiation.
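In symbols, the recipe just described can be sketched as follows. The notation is introduced here for illustration and is not used elsewhere in this entry: $\gamma$ is a curve with $\gamma(0)=p$ and initial velocity $X$, and $\tau_{t}$ denotes parallel transport along $\gamma$ from $p$ to $\gamma(t)$.

```latex
% Covariant derivative of a vector field Y at p in the direction X,
% written as a limit of difference quotients.  Both terms in the numerator
% are based at p, so the subtraction makes sense.
(\nabla_{X}Y)(p) \;=\; \lim_{t\to 0}\frac{\tau_{t}^{-1}\,Y(\gamma(t)) - Y(p)}{t}
```

Here the value at the nearby point is carried back to $p$ before subtracting; carrying $Y(p)$ forward to $\gamma(t)$ and subtracting there is an equivalent way of organizing the same computation.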
Obviously, the above definitions leave much to be desired in the way of precision. They do not specify what a space is or how vectors are to be associated to the points of this space, and they rest on the vague notion of infinitesimally nearby points.
For the purpose of this article, we shall take our space to be a finite-dimensional manifold. To be sure, some of the definitions to be given apply to more general contexts, such as infinite-dimensional manifolds, so one may speak of connections on these spaces as well. However, we shall not pursue this topic here since this exposition is intended to be accessible to newcomers to differential geometry who may not have the necessary background in Hilbert space theory, point set topology, and other subjects.
The definition of vector bundle makes precise the idea of “attaching vectors to points”. The reader who is not comfortable with general vector bundles may take this bundle to be the tangent bundle.
There are about as many ways of framing a rigorous definition of connection as there are ways of formalizing differential geometry. Hence, under the headings below, we shall list various equivalent definitions.
Before proceeding to these definitions, a few words of warning may be in order. Since the notions of connection, parallel transport, and covariant derivative are so closely related, it is easy to translate propositions involving one of these terms into propositions involving a different one of the three terms. In particular, propositions about connections are easily rewritten as propositions about covariant derivatives. In some formalisms, it is easier to define covariant derivative than to define connection. This leads to an abuse of terminology: some authors say things like “the connection $\nabla$” instead of the more precise statement “the covariant differentiation operator $\nabla$”. This can be disconcerting to the uninitiated, but once the principle involved has been grasped, this practice is harmless.
Preliminaries.
Let $M$ be a smooth, $d$-dimensional differential manifold. Let $\mathcal{F}(M)$ denote the ring of smooth, real-valued functions on $M$, and let $\mathcal{X}(M)$ denote the real vector space of smooth vector fields. Let $B$ be a vector bundle over $M$ whose structure group is the finite-dimensional Lie group $G$ and whose fibers are isomorphic to the $n$-dimensional vector space $V$. Let $\mathcal{X}(B)$ denote the set of sections of $B$. Let $G^{M}$ denote the set of smooth maps from $M$ to $G$; it forms a group under pointwise multiplication. (If $f$ and $g$ are two functions from $M$ to $G$, then their product is the function $h$ defined by $h(x)=f(x)g(x)$.) Likewise, let $A^{M}$ denote the set of smooth maps from $M$ to $A$, where $A$ is the Lie algebra of $G$.
For simplicity, we shall assume that $G=GL(V)$ for the time being. After stating the definitions of connection in this case, we shall describe how they can be modified to cover the case where $G\subset GL(V)$.
Recall that $\mathcal{F}(M)$ both acts and is acted upon by $\mathcal{X}(M)$. Given a function $f\in\mathcal{F}(M)$ and a vector field $X\in\mathcal{X}(M)$ we write $fX\in\mathcal{X}(M)$ for the vector field obtained by pointwise multiplying values of $X$ by values of $f$, and write $X(f)\in\mathcal{F}(M)$ for the function obtained by taking the directional derivative of $f$ with respect to $X$.
Coordinate definition.
Let $x^{1},\ldots,x^{d}$ be a set of coordinates on some open subset of $M$. These may be extended to coordinates on a subset of the bundle by augmenting them with coordinates $x^{{d+1}},\ldots,x^{{d+n}}$ on the fiber. Since the fiber is a vector space, we will demand that the fiber coordinates be linear coordinates. (This means that the coordinates of the sum of two vectors are the sums of the coordinates of the two vectors, and the coordinates of a scalar multiple of a vector are obtained by multiplying the coordinates of the original vector by the scalar.) Adopt the convention that Latin indices run from $1$ to $d$ and that Greek indices run from $d+1$ to $d+n$.
In these coordinates, a connection will be represented by a three-index field
$C^{\mu}_{{\nu i}}(x^{1},\ldots,x^{d})$ 
on the manifold and the covariant derivative $\nabla_{i}S^{\mu}$ of a section $S^{\mu}\in\mathcal{X}(B)$ will be an element of $T^{*}(M)\otimes B$ given by the formula
$\nabla_{i}S^{\mu}=\partial_{i}S^{\mu}+C^{\mu}_{{\nu i}}S^{\nu}$ 
(Here $\partial_{i}$ is short for $\partial/\partial x^{i}$ and the summation convention is in force. It might also be worth mentioning that the covariant derivative is sometimes defined with a minus sign instead of a plus sign, and on rare occasions, mostly in high-energy physics, one even sees it defined with an imaginary unit $i$, so one needs to check which sign convention is in use.)
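To see the formula at work in the simplest possible setting, here is a small worked example, offered as a sketch rather than as part of the formal development. It assumes $d=1$ and $n=1$ (a line bundle over an interval with coordinate $x$), so that the connection has a single component, written $C(x)$ here; the base point $x_{0}$ is likewise notation introduced only for this illustration.

```latex
% Rank-one bundle over an interval: the covariant derivative of a section S(x)
\nabla S = \frac{dS}{dx} + C(x)\,S .
% A section is covariantly constant when \nabla S = 0, i.e.
\frac{dS}{dx} = -C(x)\,S
\quad\Longrightarrow\quad
S(x) = S(x_{0})\,\exp\!\Big(-\int_{x_{0}}^{x} C(s)\,ds\Big).
```

With the opposite sign convention the exponent changes sign, which is one practical reason the convention has to be checked.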
Before proceeding further, it might be helpful to present a warning. The notation $\nabla_{i}T^{\mu}$ can lead to some confusion, and this danger warrants an extra comment. The symbol $\nabla_{i}$, acting on a function, is customarily taken to mean the same thing as the corresponding partial derivative:
$\nabla_{i}f=\partial_{{x_{{i}}}}(f)=\frac{\partial f}{\partial x^{i}}.$ 
Thus, it is easy to make the mistake of thinking that $\nabla_{i}T^{\mu}$ is the result of applying an operator $\nabla_{i}$ to each component of $T^{\mu}$. As can be seen from the definition, this is not the case. Rather, one should think of $\nabla_{i}T^{\mu}$ as if it were $(\nabla T)_{i}^{\mu}$, which is to say it denotes the components of a new tensor which was derived from $T^{\mu}$ by the operation of covariant differentiation.
The relation of these formulas to the naive picture is as follows: A connection is supposed to be a collection of linear maps from one fiber to the neighboring fibers. Given a point $p\in M$ with coordinates $p^{i}$, any vector $w^{\mu}$ in the fiber of $B$ above $p$ is transformed into the vector $w^{\mu}-C^{\mu}_{{\nu i}}(p)w^{\nu}dx^{i}$ in the fiber above the nearby point $p^{i}+dx^{i}$; the minus sign here goes with the plus sign in the formula for $\nabla_{i}S^{\mu}$. (In this paragraph, I am using “$dx^{i}$” in its naive sense of “infinitesimal displacement” rather than as a differential form.) Likewise, subtracting the parallel-transported value of $S^{\mu}(p)$ from the value $S^{\mu}(p^{i}+dx^{i})$ and dividing by $dx$, one obtains the formula for the covariant derivative.
In order for a geometrical quantity to be defined properly by a coordinate expression, one must specify how the quantity transforms under change of coordinates. Under a change of coordinates
$y^{i}=f^{i}(x^{1},\ldots,x^{d})$ 
$y^{\mu}=\Lambda^{\mu}_{\nu}(x^{1},\ldots,x^{d})x^{\nu}$ 
the components of the connection transform as follows:
$C^{\mu}_{{\nu i}}(y)={\partial x^{j}\over\partial y^{i}}\left(\Lambda^{\mu}_{\sigma}(\Lambda^{{-1}})_{\nu}^{\tau}C^{\sigma}_{{\tau j}}-{\partial\Lambda^{\mu}_{\kappa}\over\partial x^{j}}(\Lambda^{{-1}})^{\kappa}_{\nu}\right)$ 
Note that these rules imply that the components of a connection do not transform like the components of a tensor — the term involving the derivatives of $\Lambda$ is not present in the transformation law of a tensor. However, if we have two connections on the same bundle, the difference of these connections will be a tensor because the extra terms cancel.
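The cancellation can be made explicit with a short computation. It is a sketch under the following labeled assumptions: $\tilde{C}$ denotes a second connection on the same bundle, $D=C-\tilde{C}$ their difference, and $I^{\mu}_{\nu i}$ stands for the inhomogeneous term of the transformation law, which is built from $\Lambda$ and the coordinate change alone and is therefore the same for both connections. None of these symbols appears elsewhere in the entry.

```latex
% Both connections transform with the same inhomogeneous term I:
C^{\mu}_{\nu i}(y)         = \frac{\partial x^{j}}{\partial y^{i}}\,
    \Lambda^{\mu}_{\sigma}(\Lambda^{-1})^{\tau}_{\nu}\,C^{\sigma}_{\tau j} + I^{\mu}_{\nu i},
\qquad
\tilde{C}^{\mu}_{\nu i}(y) = \frac{\partial x^{j}}{\partial y^{i}}\,
    \Lambda^{\mu}_{\sigma}(\Lambda^{-1})^{\tau}_{\nu}\,\tilde{C}^{\sigma}_{\tau j} + I^{\mu}_{\nu i}.
% Subtracting, I drops out and D = C - \tilde{C} obeys a homogeneous (tensorial) law:
D^{\mu}_{\nu i}(y) = \frac{\partial x^{j}}{\partial y^{i}}\,
    \Lambda^{\mu}_{\sigma}(\Lambda^{-1})^{\tau}_{\nu}\,D^{\sigma}_{\tau j}.
```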
The reason for defining the transformation law in this way is so that the covariant derivative $\nabla_{i}S^{\mu}$ of a section $S^{\mu}\in\mathcal{X}(B)$ will transform as an element of $T^{*}(M)\otimes B$ should. Furthermore, as one may check by transforming the various quantities that appear in the equation defining the covariant derivative, this is the only possible transformation law which will make $\nabla_{i}S^{\mu}$ transform properly. This property is the origin of the term “covariant derivative”: the covariant derivative maps tensor fields into quantities which transform in the same manner.
Alternative Notations
There are many different systems of notations in differential geometry. (Indeed one humorous definition of differential geometry is “The study of invariants under change of notation”!) This section will discuss several notations for connections and covariant derivatives.
It is traditional to represent the components of the covariant derivative like this
$Y^{\mu}_{{\;;j}}=\nabla_{j}Y^{\mu}$ 
using the semicolon to indicate that the extra index comes from covariant differentiation. Sometimes, as in the theory of embedded surfaces, there are two connections present, so a semicolon is used to indicate covariant derivatives with respect to one connection and a vertical bar or a colon is used to indicate covariant derivatives with respect to the other connection. It might also be worth noting that commas are likewise used to indicate partial derivatives with respect to a given coordinate system. Using this notation, one might write the formula for covariant derivative as
$T^{\mu}_{{;i}}=T^{\mu}_{{,i}}+C^{\mu}_{{\nu i}}T^{\nu}$ 
Also, there are different ways of packaging the information contained in the connection components. One may collect the connection components into $d$ matrices $A_{i}$:
$A_{i}=\begin{pmatrix}C^{{d+1}}_{{d+1\,i}}&\cdots&C^{{d+1}}_{{d+n\,i}}\\ \vdots&\ddots&\vdots\\ C^{{d+n}}_{{d+1\,i}}&\cdots&C^{{d+n}}_{{d+n\,i}}\end{pmatrix}$ 
Another common notational device is to collect the connection coefficients into the so-called “connection one-forms”
$A^{\mu}_{\nu}=C^{\mu}_{{\nu i}}dx^{i}$ 
When using this notation, the covariant derivative is written as a generalization $D$ of the exterior derivative $d$:
$DT^{\mu}=dT^{\mu}+A^{\mu}_{\nu}T^{\nu}$ 
By combining the two devices and collecting the connection one-forms into a matrix $A$, one may do away with indices altogether. If one also collects the components $T^{\mu}$ into a column vector $T$, one may write
$DT=dT+AT$ 
A quantity like $A$ is often referred to as a matrix-valued one-form.
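As a quick consistency check on this index-free notation, one can verify that $D$ obeys a product rule when $T$ is multiplied by a smooth function. The function $f$ below is an arbitrary element of $\mathcal{F}(M)$; the computation is only a sketch and uses nothing beyond $DT=dT+AT$ and the product rule for $d$.

```latex
% Product rule for D, using d(fT) = df\,T + f\,dT and the linearity of A:
D(fT) = d(fT) + A\,(fT)
      = df\,T + f\,dT + f\,A\,T
      = df\,T + f\,DT .
```

The same pattern reappears as property 4 in the axiomatic definition of covariant differentiation.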
Occasionally, one finds connection coefficients with only two indices instead of three. The reason is that the two indices referring to the bundle have been replaced by a single index referring to the Lie algebra. To relate this notation to the one discussed so far, we need to remember that the action of the structure group $G$ on $V$ defines a representation of the Lie algebra $A$ on $V$, i.e. a map
$\rho\colon A\to Hom(V,V)$ 
If we choose linear coordinates $y^{1},\ldots,y^{m}$ on the vector space $A$, this map may be expressed in components as
$(y^{1},\ldots,y^{m})\mapsto t^{\mu}_{{\nu I}}y^{I}$ 
(Extend our conventions by agreeing that capital Latin indices run from $1$ to $m$, where $m$ is the dimension of the Lie algebra. In the case we are considering, where $G=GL(V)$, we will have $m=n^{2}$.) To the two-index object $A^{I}_{i}$, we will associate the three-index object
$C^{\mu}_{{\nu i}}=A^{I}_{i}t^{\mu}_{{\nu I}}$ 
Therefore, one may also specify a connection in a coordinate system by giving an array indexed by an index referring to the Lie algebra and an index referring to the cotangent space of the manifold. This notation is useful in situations when one wants to emphasize the structure group rather than the manifold or when one is dealing with more than one bundle whose fibers are different representations of the same group.
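A small example may make the two-index notation concrete. It anticipates the case $G\subset GL(V)$ discussed later and is only a sketch: take $G=SO(2)$ acting on $V=\mathbb{R}^{2}$, so that $m=1$ and the index $I$ takes a single value. The choice of basis element for the Lie algebra below is an assumption made for the illustration, and the bundle indices are written as $1,2$ for readability.

```latex
% One generator of so(2), written as a matrix t^{\mu}{}_{\nu} (row \mu, column \nu):
t = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
% A connection is then specified by d functions A_{i}, one per coordinate direction,
% and the three-index components C^{\mu}_{\nu i} = A_{i}\, t^{\mu}{}_{\nu} form the matrices
\big(C^{\mu}_{\nu i}\big) = A_{i}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad
\nabla_{i}S^{1} = \partial_{i}S^{1} - A_{i}S^{2},
\quad
\nabla_{i}S^{2} = \partial_{i}S^{2} + A_{i}S^{1}.
```

Only $d$ functions are needed instead of the $4d$ components of a general connection on a rank-two bundle, which is the economy the two-index notation is meant to expose.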
Definition in terms of one-forms
It is worth noting that one can define the connection directly in terms of the connection one-forms. A noteworthy feature of such a definition is that it does not make explicit reference to coordinate systems on the manifold, although it does make use of local neighborhoods. After the discussion of the last section, the relation of this definition to the preceding definition should be clear.
As in the last section, let $\rho\colon G\to Hom(V,V)$ denote the action of $G$ on $V$.
Let $(U,\phi)$ be a local trivialization of the bundle $B$. Recall that $U$ is an open set of $M$ and that $\phi$ is a diffeomorphism between $\pi^{{-1}}(U)\subset B$ and $U\times V$. To every local trivialization, associate an element $A$ of $T^{*}(U)\otimes Hom(V,V)$. In order for these elements to define a connection, they must transform properly under changes of local trivialization. Two local trivializations over the same set $U$ are related by a transition function $g\colon U\to G$. The transformation law of an element $A\in T^{*}(U)\otimes Hom(V,V)$ is given by
$A^{{\prime}}=\rho(g)^{{-1}}A\rho(g)+\rho(g)^{{-1}}d\rho(g)$ 
For this definition to be consistent, it must agree with the cocycle condition. The reason for this is that, if it didn’t, one would obtain different answers by transforming from one local trivialization to another in two different ways. That it is consistent is easily verified. Using the notation of the entry on fibre bundles,
$\rho(g_{{ij}}g_{{jk}})^{{-1}}A\rho(g_{{ij}}g_{{jk}})+\rho(g_{{ij}}g_{{jk}})^{{-1}}d\rho(g_{{ij}}g_{{jk}})=$ 
$\rho(g_{{jk}})^{{-1}}\rho(g_{{ij}})^{{-1}}A\rho(g_{{ij}})\rho(g_{{jk}})+\rho(g_{{jk}})^{{-1}}\rho(g_{{ij}})^{{-1}}d\big(\rho(g_{{ij}})\rho(g_{{jk}})\big)=$ 
$\rho(g_{{jk}})^{{-1}}\big(\rho(g_{{ij}})^{{-1}}A\rho(g_{{ij}})+\rho(g_{{ij}})^{{-1}}d\rho(g_{{ij}})\big)\rho(g_{{jk}})+\rho(g_{{jk}})^{{-1}}d\rho(g_{{jk}})$ 
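As a small illustration of the transformation law, and only as a sketch, suppose that in one local trivialization the connection form vanishes, $A=0$. The symbols $g$ and $A^{\prime}$ are as above; the particular case is an assumption chosen for the example.

```latex
% Starting from A = 0, the homogeneous term drops out and only the
% inhomogeneous term survives in the new trivialization:
A^{\prime} = \rho(g)^{-1}\,0\,\rho(g) + \rho(g)^{-1}\,d\rho(g) = \rho(g)^{-1}\,d\rho(g).
% In particular A' vanishes as well exactly when d\rho(g) = 0,
% i.e. when \rho(g) is locally constant on U.
```

So a connection form can be zero in one trivialization and nonzero in another, which is another way of seeing that $A$ does not transform as a tensor.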
Axiomatic definition of covariant differentiation
In this definition, covariant differentiation is characterized axiomatically. As explained in the first section, it is not necessary to augment this with a separate definition of connection, since any statement about connections can be rephrased as a statement about covariant derivatives. An important feature of this definition which sets it apart from the previous two definitions is that it is global — there is no need to chop up the manifold or the bundle into patches, define the connection on each patch, then sew the patches back together again to make a complete manifold.
A covariant derivative $\nabla$ is a mapping
$\nabla\colon\mathcal{X}(M)\times\mathcal{X}(B)\rightarrow\mathcal{X}(B),\qquad(X,Y)\mapsto\nabla_{X}Y,\qquad X\in\mathcal{X}(M),\;Y\in\mathcal{X}(B)$ 
that for all $X,Y\in\mathcal{X}(M)$, all $Z,W\in\mathcal{X}(B)$, all $f\in\mathcal{F}(M)$, and all $\lambda\in A^{M}$ satisfies
1. $\nabla_{{X+Y}}Z=\nabla_{X}Z+\nabla_{Y}Z$
2. $\nabla_{X}(Z+W)=\nabla_{X}Z+\nabla_{X}W$
3. $\nabla_{{fX}}Z=f\,\nabla_{X}Z$
4. $\nabla_{X}(fZ)=X(f)Z+f\,\nabla_{X}Z$
Note that the lack of tensoriality in the second argument means that a connection is not a tensor field.
Also note that we can regard the connection as a mapping from $\mathcal{X}(M)$ to the space of type (1,1) tensor fields, i.e. for $Y\in\mathcal{X}(M)$ the object
$\nabla Y\colon\mathcal{X}(M)\rightarrow\mathcal{X}(M),\qquad X\mapsto\nabla_{{\!X}}Y,\quad X\in\mathcal{X}(M)$ 
is a type (1,1) tensor field called the covariant derivative of $Y$. In this capacity $\nabla$ is often called the covariant derivative operator.
Recall that once a system of coordinates is chosen, a given vector field $Y\in\mathcal{X}(M)$ is represented by means of its components $Y^{i}\in\mathcal{F}(U)$ according to
$Y=Y^{i}\partial_{{x_{{i}}}}.$ 
The formula for the components follows directly from the defining properties of a connection and the definition of the Christoffel symbols. To wit:
$Y^{i}_{{\;;j}}=Y^{i}_{{\;,j}}+\Gamma_{{jk}}{}^{i}\,Y^{k}$ 
where the symbol with the comma
$Y^{i}_{{\;,j}}=\partial_{{x_{{j}}}}(Y^{i})=\frac{\partial Y^{i}}{\partial x^{j}}$ 
denotes a derivative relative to the coordinate frame.
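For readers who want to see where the component formula comes from, here is a sketch of the derivation. It takes $B$ to be the tangent bundle and assumes that the Christoffel symbols are defined by $\nabla_{\partial_{x_{j}}}\partial_{x_{k}}=\Gamma_{jk}{}^{i}\,\partial_{x_{i}}$; this entry does not spell that definition out, so the index placement is an assumption chosen to match the formula for $Y^{i}_{\;;j}$.

```latex
% Expand Y = Y^{k}\partial_{x_k} and apply axioms 2 and 4 term by term:
\nabla_{\partial_{x_j}} Y
  = \nabla_{\partial_{x_j}}\big(Y^{k}\,\partial_{x_k}\big)
  = \partial_{x_j}(Y^{k})\,\partial_{x_k} + Y^{k}\,\nabla_{\partial_{x_j}}\partial_{x_k}
  = \big(Y^{i}{}_{,j} + \Gamma_{jk}{}^{i}\,Y^{k}\big)\,\partial_{x_i}.
% Axioms 1 and 3 then give the derivative along a general X = X^{j}\partial_{x_j}:
\nabla_{X} Y = X^{j}\,\big(Y^{i}{}_{,j} + \Gamma_{jk}{}^{i}\,Y^{k}\big)\,\partial_{x_i}
             = X^{j}\,Y^{i}{}_{;j}\,\partial_{x_i}.
```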
A related and frequently encountered notation is $\nabla_{i}$, which indicates a covariant derivative in the direction $\partial_{{x_{{i}}}}$, i.e.
$\nabla_{i}Y=\nabla_{{\!\partial_{{x_{{i}}}}}}Y,\quad Y\in\mathcal{X}(M).$ 
This notation jibes with the point of view that the covariant derivative is a certain generalization of the ordinary directional derivative. The partials $\partial_{{x_{{i}}}}$ are replaced by the covariant derivatives $\nabla_{i}$, and the general directional derivative $V^{i}\partial_{{x_{{i}}}}$ relative to a vector field $V$ is replaced by the covariant derivative operator $V^{i}\nabla_{i}.$
Group compatibility
So far, we have been labouring under the assumption that $G=GL(V)$. The time has now come to remove this restriction. To do so, we need to come to grips with the issue of group compatibility. As usual, we shall begin by discussing the problem in intuitive terms, then formalize our intuition in various formalisms.
The structure group transforms vectors located at a point into each other whilst the connection transforms vectors based at one point into vectors based at another point. To understand the problem of compatibility, let us focus attention on two nearby points $P$ and $Q$ of the manifold and the fibres above these points.
There are two ways to transform a vector $v\in V_{P}$. (Since it is crucial to remember that the fibers over different points are distinct vector spaces if one is to understand this discussion, we have indexed the copies of $V$ which serve as fibers of the bundle over various points of the manifold with their basepoints. Likewise, we shall index the symbol $\rho$ with a point of the manifold to indicate the action of the group on vectors based at that point.) The simplest way is to pick an element $g\in G$ and apply the transformation $\rho_{P}(g)$ to $v$. Alternatively, one could first parallel transport $v$ to $Tv\in V_{Q}$, apply the transform $\rho_{Q}(g)$ to the transported vector, then parallel transport the result back to $P$ to obtain $T^{{-1}}\rho_{Q}(g)Tv$.
If the transform $T^{{-1}}\rho_{Q}(g)T$ does not equal $\rho_{P}(g^{{\prime}})$ for any $g^{{\prime}}\in G$, we are in trouble. By using the connection, we could generate a transformation of the fiber which is not described by the structure group of the bundle. To avoid this difficulty, we need to demand that the connection be compatible with the group. Group compatibility is the condition that for every map $T:V_{P}\to V_{Q}$ which parallel transports a vector from a point $P$ to another point $Q$ and for every $g\in G$, there exists a $g^{{\prime}}\in G$ such that $\rho_{Q}(g)T=T\rho_{P}(g^{{\prime}})$. In the language of representation theory, we would say that $T$ intertwines the representations $\rho_{P}$ and $\rho_{Q}$ of $G$.
It is worth noting that, if we transport the vector $v$ from $P$ to $Q$ by first transporting it to an intermediate point $R$, it is enough to check that the transport from $P$ to $R$ and the transport from $R$ to $Q$ are group compatible since, if they are, it will automatically follow that the transport from $P$ to $Q$ is group compatible. To verify this assertion, let $T_{1}$ be the matrix which transports vectors from $V_{P}$ to $V_{R}$ and let $T_{2}$ be the matrix which transports vectors from $V_{R}$ to $V_{Q}$.
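The verification can be completed in one line. The following is a sketch: it writes compatibility in the form $T^{-1}\rho_{Q}(g)T=\rho_{P}(g^{\prime})$ used above, and the names $g^{\prime}$ and $g^{\prime\prime}$ for the group elements supplied by the two compatibility conditions are introduced here for the illustration.

```latex
% Compatibility of T_2 : V_R -> V_Q gives g' with  \rho_Q(g) T_2 = T_2 \rho_R(g').
% Compatibility of T_1 : V_P -> V_R gives g'' with \rho_R(g') T_1 = T_1 \rho_P(g'').
% Hence, for the composite transport T_2 T_1 : V_P -> V_Q,
\rho_{Q}(g)\,T_{2}T_{1}
  = T_{2}\,\rho_{R}(g^{\prime})\,T_{1}
  = T_{2}T_{1}\,\rho_{P}(g^{\prime\prime}),
\qquad\text{i.e.}\qquad
(T_{2}T_{1})^{-1}\rho_{Q}(g)\,T_{2}T_{1} = \rho_{P}(g^{\prime\prime}).
```

Since a curve can be broken into arbitrarily short pieces, this is why it suffices to impose compatibility on transport between infinitesimally nearby points.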
$\nabla_{X}(\lambda Z)-\lambda\nabla_{X}Z=\mu Z$ for some $\mu\in A^{M}$
Related Definitions.
The torsion of a connection $\nabla$ is a bilinear mapping
$T:\mathcal{X}(M)\times\mathcal{X}(M)\rightarrow\mathcal{X}(M)$ 
defined by
$T(X,Y)=\nabla_{X}(Y)-\nabla_{Y}(X)-[X,Y],$ 
where the last term denotes the Lie bracket of $X$ and $Y$.
The curvature of a connection is a trilinear mapping
$R:\mathcal{X}(M)\times\mathcal{X}(M)\times\mathcal{X}(M)\rightarrow\mathcal{X}(M)$ 
defined by
$R(X,Y,Z)=\nabla_{X}\nabla_{Y}Z-\nabla_{Y}\nabla_{X}Z-\nabla_{{[X,Y]}}Z,\quad X,Y,Z\in\mathcal{X}(M).$ 
We note the following facts:

The torsion and curvature are tensorial (i.e. $\mathcal{F}(M)$-linear) with respect to their arguments, and therefore define, respectively, a type (1,2) and a type (1,3) tensor field on $M$. This follows from the defining properties of a connection and the derivation property of the Lie bracket.

Both the torsion and the curvature are, quite evidently, antisymmetric in their first two arguments.
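The first fact can be checked directly in one representative case. The sketch below verifies $\mathcal{F}(M)$-linearity of the torsion in its first argument, using $T(X,Y)=\nabla_{X}Y-\nabla_{Y}X-[X,Y]$, properties 3 and 4 of the covariant derivative, and the identity $[fX,Y]=f[X,Y]-Y(f)X$ for the Lie bracket; the remaining cases follow the same pattern.

```latex
% Torsion is F(M)-linear in its first argument: the two Y(f)X terms cancel.
T(fX,Y) = \nabla_{fX}Y - \nabla_{Y}(fX) - [fX,Y]
        = f\,\nabla_{X}Y - \big(Y(f)\,X + f\,\nabla_{Y}X\big) - \big(f\,[X,Y] - Y(f)\,X\big)
        = f\,\big(\nabla_{X}Y - \nabla_{Y}X - [X,Y]\big)
        = f\,T(X,Y).
```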
A connection is called torsionless if the corresponding torsion tensor vanishes. If the corresponding curvature tensor vanishes, then the connection is called flat. A connection that is both torsionless and flat is locally Euclidean, meaning that there exist local coordinates for which all of the Christoffel symbols vanish.
Notes.
The notion of connection is intimately related to the notion of parallel transport, and indeed one can regard the former as the infinitesimal version of the latter. To put it another way, when we integrate a connection we get parallel transport, and when we take the derivative of parallel transport we get a connection. Much more on this in the parallel transport entry.
As far as I know, we have Elie Cartan to thank for the word connection. With some trepidation at putting words into the master’s mouth, my guess is that Cartan would lodge a protest against the definition of connection given above. To Cartan, a connection was first and foremost a geometric notion that has to do with various ways of connecting nearby tangent spaces of a manifold. Cartan might have preferred to refer to $\nabla$ as the covariant derivative operator, or at the very least to call $\nabla$ an affine connection, in deference to the fact that there exist other types of connections (e.g. projective ones). This is no longer the mainstream view, and these days, when one wants to speak of such matters, one is obliged to use the term Cartan connection.
Indeed, many authors call $\nabla$ an affine connection although they never explain the affine part. (Footnote: The silence is puzzling, and I must confess to wondering about the percentage of modern-day geometers who know exactly what is so affine about an affine connection. Has blind tradition taken over? Do we say “affine connection” because the previous person said “affine connection”? The meaning of “affine” is quite clearly explained by Cartan in his writings. There you go esteemed “everybody”: one more reason to go and read Cartan.) One can also define connections and parallel transport in terms of principal fiber bundles. This approach is due to Ehresmann. In this generalized setting an affine connection is just the type of connection that arises when working with a manifold’s frame bundle.
Bibliography.
[Exact references coming.]
Bishop and Goldberg (1968)
Cartan’s book on projective connection
Ehresmann’s seminal mid-century papers
Kobayashi and Nomizu’s books
Spivak (1965)
See also the bibliography for differential geometry.
Mathematics Subject Classification
53B05
Comments
connections.
Great text on connections : short, concise and extremely readable.
I think these references (as you've noted) by Elie Cartan could be mentioned:
1 "Leçons sur la géométrie des espaces de Riemann", E. Cartan. Gauthier-Villars, Paris, 1963.
This is a lecture course given by Cartan in 1925.
2 "Sur les variétés à connexion affine et la théorie de la relativité généralisée", E. Cartan, Annales scientifiques de l'ENS, 3ème série, tome 40 (1923), 325-412.
This article (and the following two) are available on line on "Numdam": www.numdam.org
or directly at
http://www.numdam.org/item?id=ASENS_1923_3_40__325_0
This is an extremely beautiful article.
Regards,
Re: connections.
Just wanted to say thanks! Numdam looks to be a fantastic resource. I will certainly take up your advice. Best wishes, RM
Re: connections.
I think you are definitely on the right track with your comments! Cartan's idea about what a connection is is nowadays formalized as a "Cartan connection". Let me take a stab at an informal explanation. But first a disclaimer: I can't pretend to be an expert on this. I've noticed the discrepancy, pondered this question, tried to talk to people about it, but the following interpretation is strictly my own synthesis based on reading Cartan. I have some confidence in it though, because Sharpe talks about similar ideas, albeit in terms of a different metaphor: "a meteor streaking through the principal bundle". It appears to be the same idea though.
The modern idea of connection is that it is the derivative of the parallel transport operation. So what is parallel transport, appropriately generalized and abstracted?
We start with a fibre bundle E -> B and say that "parallel transport" is a structure that gives us a unique way of lifting paths on B to paths on E. The motivation for this comes about when E is the bundle of linear frames. We have a path on the base manifold B, and when we lift it to E, we get an isomorphism between the space of frames at the start of the path and the space of frames at the end of the path. To get this kind of lifting we split the tangent bundle of E. We already have the natural vertical bundle VE, but now to add a connection we add a horizontal complement HE. The choice of HE is the connection.
The connection (which can also be described as an injection of TB into TE that splits the exact sequence VE -> TE -> TB) maps every tangent vector in TB uniquely to a tangent vector in TE. Now when we slide along a path on B, the connection maps the velocity in TB to a velocity in TE, and voila, we get a path in E. Usually, we have some kind of Lie group acting on the fibres so we also ask that the splitting TE=VE+HE be equivariant with respect to this action. That corresponds to the equivariance of the parallel transport with respect to the group action. Makes perfect sense. If we parallel transport a certain frame Fx, from point x\in B to a frame Fy at point y\in B along a fixed path, we should have enough information to parallel transport every other frame as well. Thus if g is any linear transformation, g*Fx should be parallel transported to g*Fy along the fixed path in question. To make this work infinitesimally, we need to make sure that the pushforward g* sends HE to HE.
OK, that's a sketch of the modern idea. Now Cartan thought differently. Let's start with Riemannian structure. Let's say you are encased in a rigid body (your vehicle) at some point of a Riemannian manifold. To be concrete let's say we are on a surface embedded in 3d Euclidean space. To move around on the surface one needs to exert 2 different kinds of forces: an external force directed along the normal to the surface which keeps you trapped in the surface, and an internal force that changes your speed and heading (motion with purely external acceleration = geodesic motion). However as one moves around, the orientation of one's vehicle is also undergoing changes. However, this change is only perceptible in the ambient 3D space. If one is a 2d creature trapped in Flatland, one cannot perceive the ambient 3D, and there is no way to compare the tangent spaces at different points. Thus, a 2D flatland being cannot internally perceive the rigid body rotation that one's vehicle undergoes as it moves around on the surface.
However, there is something that even a 2d flatlander can do and measure. Starting at a fixed point, one can take a sheet of paper (Euclidean plane), mark off an origin to represent one's current location, draw a compass rosette, and associate the directions on paper to actual physical directions in the surface. Then one can start driving/flying by choosing an acceleration relative to one's frame of reference. In the next instant one is somewhere else, but still the orientation of one's rigid body vehicle gives you a point of reference and you can keep on steering. Basically, you can draw a smooth path in your navigational chart (the Euclidean plane) and use that path to drive your vehicle around on the surface.
Now to do meaningful measurement in this setting one has to drive back to one's starting point, and this is where the first miracle happens. If one draws a closed loop in one's navigational chart and uses this loop to steer by around the surface, one will find that one returns back to one's starting point. This is called absence of torsion! However, when one returns back to one's point of origin, one will find that the orientation of the vehicle upon arrival is different from the orientation at the time of departure. This is due to the presence of curvature. The change in orientation is called holonomy.
So what Cartan did was to abstract this. To him a connection was a mathematical structure that allowed one to translate paths from a navigational chart (Euclidean space in the case of Riemannian structure, affine space in the case of affine structure, the sphere in the case of conformal structure, projective space in the case of projective structure, etc.) to the underlying manifold. Now for Euclidean and affine structure, this ends up being the same story as the standard parallel transport version of connections. However, when one's navigational chart is something like projective space, then parallel transport won't cut it and you need some other kind of structure. I won't go into the technicalities: these can be had from Sharpe's book and also from Kobayashi's "Transformation groups".
A good question along these lines is "what is the difference between an affine connection and a linear connection"? The answer, as far as I can tell, is that these are two different ways to encode the same information. With an affine connection we bundle torsion and connection into one big (n+1)x(n+1) matrix of 1-forms. For linear connections, we have just the nxn connection matrix of 1-forms. However, this connection matrix lives on the bundle of linear frames, where we have the R^n-valued canonical 1-form (the \theta^i as they are usually named) and these end up being exactly the extra bits from the Cartan connection. So the answer seems to be that "affine connection" = "linear connection", but we formalize/encode the same information differently.
Feedback would be appreciated.
Re: connections.
Sorry, it's taken me so long to reply. Your latest comment is steering the discussion in a direction with which I am not very familiar: constrained Lagrangian systems and nonholonomic connections.
It seems to me that one could define a "connection" without any kinds of equivariance built in. One could start with a fibre bundle and define the horizontal subspace by splitting the exact sequence VE -> TE -> TB.
So, for example, we could get a connection on the tangent space TM of a manifold. I have not really encountered a use for a nonequivariant connection, but maybe in mechanics this is exactly what is needed?
You give the example of B=J1(R,R) (first jets of maps from R to R)
This is indeed 3-dimensional, with a nonholonomic distribution, which we may consider to be the horizontal complement. So we can raise paths from R (the base) to B (this is usually called prolongation). How can we think of this as an affine connection though? The affine group (2-dimensional in this case, dilations+translations) acts on the 2d fibres of B -> R. The fibres contain the zeroth and the first jet info: translations change the zeroth order part and the dilations the first order part.
To get a connection we need an aff(2)-valued 1-form on B=J1. Now I am stuck, because I don't see how to do this. To get a Cartan connection, aka an absolute parallelism, we need a 3d group that contains aff(2), because dim B has to equal dim G. So I didn't quite follow your example (sorry for the rambling remarks, I was trying to figure things out).
In Kobayashi's "Transformation groups" he talks about a natural construction that lets one "thicken" the fibre bundle and turn a Cartan connection into an "ordinary" connection. Would this construction be helpful in your exposition? I thought that curvature=infinitesimal holonomy. You seem to be suggesting that torsion can also manifest as holonomy. Perhaps when one does Kobayashi's "thickening construction" that is exactly how torsion will manifest?