3 Answers

This is part of a notational convention that evolved with J.J. Sylvester (1853) in his study of algebraic invariant theory.

The distinctions represented by the use of superscripted or subscripted indices only become important when there is more than one coordinate system in play.

In fact: "when there is a fixed coordinate basis (or when not considering coordinate vectors), one may choose to use only subscripts"

In cases where multiple coordinate systems are under simultaneous consideration, there will typically be a set of reference axes (often given by the standard basis) and a set of coordinate axes (often given by a non-orthogonal basis, e.g. an oblique coordinate system).

The conventions for a vector (array of elements) and their indexing depend on how their values are obtained, as the following image (from Mathview's animation) illustrates:

CONTRAvariant components (superscripts) of a vector are obtained, for example, by parallel projection of the vector onto the coordinate axes. So in this example, contravariant components would be the familiar coordinate values. CONTRAvariant components, by definition, change as the coordinate axes change. They can be remembered as superscripted by considering the third letter 'n', whose rounded edge can be taken as pointing up.

COVariant components (subscripts) are obtained, for example, by orthogonal projection of the vector onto the coordinate axes when these are themselves taken as unit vectors given in the reference frame. This is the usual dot product of the vector with the unit vectors in the coordinate directions. The components can be remembered as being subscripted by considering the third letter 'v', whose point can be taken as pointing down. COVariant components, by their definition, change as the reference axes change, since that changes the values of the unit direction vectors of the coordinate axes against which the dot products are taken.

In abstract mathematical terminology, a covariant vector is a linear functional that lives in the space dual to that of the contravariant vectors and, when paired with a vector (from the primal space), produces a scalar. In physics, this is expressed in the famous Einstein notation convention, which represents length invariantly, regardless of coordinate system, by taking the inner product of a covariant and a contravariant vector (called contraction).
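
The two kinds of projection and their contraction can be checked numerically. The following is only a minimal sketch, assuming a 2D oblique basis of two unit vectors 60 degrees apart and a sample vector `v`; the basis, the vector, and the function names are illustrative choices, not taken from the animation.

```python
import math

def contravariant_components(v, e1, e2):
    """Return (c1, c2) with v = c1*e1 + c2*e2 (parallel projection), via Cramer's rule."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if abs(det) < 1e-12:
        raise ValueError("basis vectors are linearly dependent")
    c1 = (v[0] * e2[1] - v[1] * e2[0]) / det
    c2 = (e1[0] * v[1] - e1[1] * v[0]) / det
    return (c1, c2)

def covariant_components(v, e1, e2):
    """Return the dot products of v with e1 and e2 (orthogonal projection for unit vectors)."""
    return (v[0] * e1[0] + v[1] * e1[1], v[0] * e2[0] + v[1] * e2[1])

def contract(upper, lower):
    """Sum over a repeated upper/lower index pair."""
    return sum(u * l for u, l in zip(upper, lower))

# Assumed example data: unit basis vectors 60 degrees apart, expressed in the reference frame.
angle = math.radians(60)
e1 = (1.0, 0.0)
e2 = (math.cos(angle), math.sin(angle))
v = (2.0, 1.0)

upper = contravariant_components(v, e1, e2)  # superscripted components
lower = covariant_components(v, e1, e2)      # subscripted components
norm_squared = v[0] ** 2 + v[1] ** 2

print("contravariant:", upper)
print("covariant:    ", lower)
# The contraction should agree with |v|^2 up to floating-point rounding.
print("contraction:  ", contract(upper, lower), "vs |v|^2 =", norm_squared)
```

In this sketch the two component sets differ because the basis is oblique, yet their contraction reproduces the squared length computed in the reference frame. With an orthonormal basis the two sets coincide.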

Thanks for your great answer! That mathview video lecture you link to: it says one can find the components in an oblique basis by orthogonal projection onto the basis vectors, which I think is wrong. One would have to project onto $b_1$ along $b_2$, which is not the orthogonal projection. Do you agree?
– Eric Auld, Oct 14 '14 at 15:01

Hi, IMO the video is saying there are two ways to extract meaningful coordinates in an oblique coordinate system: orthogonal projection onto the oblique axes gives a covariant vector; parallel projection gives a contravariant vector. Neither is wrong or right by itself. The key is that the inner product of covariant and contravariant gives an invariant independent of the obliqueness of the coordinate system. This invariant is exactly the norm squared when in orthogonal coordinates, where covariant = contravariant. So both projections are part of understanding length. Does that help? @EricAuld
– Assad Ebrahim, Oct 14 '14 at 20:09

Usually one wants lower indices to denote vectors, probably coming from the common practice of denoting the standard basis of $\mathbb R^n$ as $e_1,\ldots,e_n$. Then, to use the Einstein summation convention, you would want the coordinate functions $x^i$ on $\mathbb R^n$ to have upper indices since they represent covectors: any vector on $\mathbb R^n$ is written as
$$
v = x^i(v)e_i.
$$
In $\frac{\partial}{\partial x^i}$, the $i$ is a lower index (consistent with vectors having lower indices) and arguably this is the better notation. The main advantage, in my opinion, of using $x_i$ instead of $x^i$ is that you don't confuse the index with an exponent and writing a power of a coordinate function is cleaner.
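
As a small worked illustration of the formula $v = x^i(v)e_i$ (the vector $v = 3e_1 + 5e_2$ in $\mathbb R^2$ is chosen here purely as an example), the coordinate functions are dual to the standard basis, so they simply read off components:
$$
x^i(e_j) = \delta^i_j, \qquad x^1(v) = 3, \quad x^2(v) = 5, \qquad v = x^i(v)e_i = 3e_1 + 5e_2.
$$
Each summed index appears once up and once down, which is what the upper index on $x^i$ is for.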

So a lower index in the denominator is actually an upper index? If so, why is this the convention?
– Eric Auld, Nov 5 '13 at 2:00

It's a good convention. Einstein summation is consistent with the fact that the basis of vectors $\left\{ \frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n} \right\}$ is dual to the basis of covectors $\left\{ dx^1, \ldots, dx^n \right\}$.
– Sammy Black, Nov 5 '13 at 3:03

The "lower index in the denominator is an upper index" (or the reverse) can be seen as logical by calculating the derivative of a function with respect to a coordinate $x^i$, which we expect to be a gradient: $\frac{\partial}{\partial x^i}f(x)=\frac{\partial f}{\partial x^j}\delta^j_i$. Sum over the $j$s (since they are upper and lower now), so the derivative is a 1-form (has lower $i$).
– levitopher, Nov 5 '13 at 3:56

I never really liked this convention myself, but if you ever have to do any heavy-duty differential-geometric calculations (such as in a course on general relativity), you have to keep track of a myriad of indices, some of which represent (tensor products of) vectors and others of which represent (tensor products of) covectors, and distinguishing between the two would be very difficult if upper indices were not used. However, in a pure mathematics course taking a more abstract, coordinate-free approach to differential geometry, I don't think the notation is that helpful, and I prefer to just stick to lower indices.