Is it foolish to distinguish between covariant and contravariant vectors?

A vector space is a set whose elements satisfy certain axioms. There are physical entities that satisfy these properties which may not be arrows. A co-ordinate transformation is a linear map from a vector space to itself with a change of basis. The transformation is an abstract concept: it is just a mapping. To compute it we need a basis and matrices, and how a transformation ends up looking depends only on the basis we choose; a transformation can look like a diagonal matrix if an eigenbasis is used, and so on. It has nothing to do with the vectors it is mapping; only the dimension of the vector space matters.

So it is foolish to distinguish vectors by the way their components change under a co-ordinate transformation, since that depends on the basis you used. There is actually no difference between a contravariant and a covariant vector; there is a difference between a contravariant and a covariant basis, as is shown here: http://arxiv.org/abs/1002.3217. An inner product is between elements of the same vector space, not between two vector spaces; that is not how it is defined.

Is this approach correct?

Along with the approach mentioned, we can view covectors as members of the dual space of the contravariant vector space. What advantage does this approach have over the former one mentioned in my post?

Addendum: So now there are contravariant vectors and their duals, called covariant vectors. But the duals are defined only once the contravariant vectors are set up, because they are the maps from the space of contravariant vectors to $\mathbb R$; thus it makes no sense to talk of covectors alone. Then what does it mean that the gradient is a covector? Saying that it transforms in a certain way now makes no sense.

In the context of manifolds, contravariant and covariant vectors "live" in different vector spaces: at a point, the tangent space and the cotangent space, respectively, so the corresponding "fields" live in different bundles. If the manifold is equipped with a well-behaved metric, then there is a natural isomorphism between the two, so to some extent one can pretend they are the same objects. But you will miss out on a lot of geometrical meaning and integration theory if you do, because the differential structure does not need a metric.


3 Answers

This is not really an answer to your question, essentially because there isn't (currently) a question in your post, but it is too long for a comment.

Your statement that

A co-ordinate transformation is a linear map from a vector space to itself with a change of basis.

is muddled and ultimately incorrect. Take some vector space $V$ and two bases $\beta$ and $\gamma$ for $V$. Each of these bases can be used to establish a representation map $r_\beta:\mathbb R^n\to V$, given by
$$r_\beta(v)=\sum_{j=1}^nv_j e_j$$
if $v=(v_1,\ldots,v_n)$ and $\beta=\{e_1,\ldots,e_n\}$. The coordinate transformation is not a linear map from $V$ to itself. Instead, it is the map
$$r_\gamma^{-1}\circ r_\beta:\mathbb R^n\to\mathbb R^n,\tag 1$$
and takes coordinates to coordinates.
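As a concrete check (the bases and numbers below are illustrative choices of mine, not taken from the post), the composite $r_\gamma^{-1}\circ r_\beta$ can be realized in numpy, with each basis stored as the columns of a matrix:

```python
import numpy as np

# Hypothetical 2-D example: the columns of B and G hold the basis vectors
# beta and gamma in some reference coordinates, so B and G play the role
# of the representation maps r_beta and r_gamma.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
G = np.array([[2.0, 0.0],
              [0.0, 1.0]])

v_beta = np.array([3.0, 1.0])       # coordinates of v in basis beta
v = B @ v_beta                      # the vector itself, via r_beta
v_gamma = np.linalg.solve(G, v)     # r_gamma^{-1}(v): coordinates in gamma

# The coordinate transformation r_gamma^{-1} o r_beta is the matrix G^{-1} B:
T = np.linalg.solve(G, B)
assert np.allclose(T @ v_beta, v_gamma)
```

Note that the transformation acts on coordinate columns in $\mathbb R^n$, never on the abstract vector itself.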

Now, to go to the heart of your confusion, it should be stressed that covectors are not members of $V$; as such, the representation maps do not apply to them directly in any way. Instead, they belong to the dual space $V^\ast$, which I'm hoping you're familiar with. (In general, I would strongly discourage you from reading texts that pretend to lay down the law on the distinction between vectors and covectors without talking at length about the dual space.)

The dual space is the vector space of all linear functionals from $V$ into its scalar field:
$$V^*=\{\varphi:V\to\mathbb R:\varphi\text{ is linear}\}.$$
This has the same dimension as $V$, and any basis $\beta$ has a unique dual basis $\beta^*=\{\varphi_1,\ldots,\varphi_n\}$ characterized by $\varphi_i(e_j)=\delta_{ij}$. Since it is a different basis to $\beta$, it is not surprising that the corresponding representation map is different.
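A quick hedged sketch, with a basis of my own choosing: if the columns of a matrix hold the basis vectors, the rows of its inverse hold the components of the dual basis functionals, since that is exactly the condition $\varphi_i(e_j)=\delta_{ij}$:

```python
import numpy as np

# Columns of B: a basis of R^2 (an arbitrary illustrative choice).
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Row i of B^{-1} holds the components of the dual functional phi_i,
# because (B^{-1} B)_{ij} = delta_{ij}, i.e. phi_i(e_j) = delta_ij.
dual = np.linalg.inv(B)

assert np.allclose(dual @ B, np.eye(2))
```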

To lift the representation map to the dual vector space, one needs the notion of the adjoint of a linear map. As it happens, there is in general no way to lift a linear map $L:V\to W$ to a map from $V^*$ to $W^*$; instead, one needs to reverse the arrow. Given such a map, a functional $f\in W^*$ and a vector $v\in V$, there is only one combination which makes sense, which is $f(L(v))$. The mapping $$v\mapsto f(L(v))$$ is a linear mapping from $V$ into $\mathbb R$, and it's therefore in $V^*$. It is denoted by $L^*(f)$, and defines the action of the adjoint $$L^*:W^*\to V^*.$$

If you apply this to the representation maps on $V$, you get the adjoints $r_\beta^*:V^*\to\mathbb R^{n,*}$, where the latter is canonically equivalent to $\mathbb R^n$ because it has a canonical basis. The inverse of this map, $(r_\beta^*)^{-1}$, is the representation map $r_{\beta^*}:\mathbb R^n\cong\mathbb R^{n,*}\to V^*$. This is the origin of the 'inverse transpose' rule for transforming covectors.

To get the transformation rule for covectors between two bases, you need to string two of these together:
$$
\left((r_\gamma^*)^{-1}\right)^{-1}\circ(r_\beta^*)^{-1}=r_\gamma^*\circ (r_\beta^*)^{-1}:\mathbb R^n\to \mathbb R^n,
$$
which is very different to the one for vectors, (1).
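To see the two rules diverge numerically, here is a hedged sketch (the bases and components are arbitrary choices of mine): vector components transform with $T$, covector components with the inverse transpose of $T$, and the pairing $\varphi_j v^j$ comes out basis-independent:

```python
import numpy as np

# Columns of B and G: two bases in reference coordinates (my own choice).
B = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[2.0, 0.0], [0.0, 1.0]])

T = np.linalg.solve(G, B)        # vector coordinate change: v_gamma = T v_beta

v_beta = np.array([3.0, 1.0])    # vector components in beta
f_beta = np.array([2.0, -1.0])   # covector components in the dual basis beta*

v_gamma = T @ v_beta
f_gamma = np.linalg.inv(T).T @ f_beta   # the 'inverse transpose' rule

# The pairing phi(v) = f_j v^j is independent of the basis:
assert np.isclose(f_beta @ v_beta, f_gamma @ v_gamma)
```

The inverse transpose is generally a different matrix from $T$, so components that happen to agree in one basis disagree in another: this is the whole content of the vector/covector distinction at the component level.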

Still think that vectors and covectors are the same thing?

Addendum

Let me, finally, address another misconception in your question:

An inner product is between elements of the same vector space and not between two vector spaces, it is not how it is defined.

Inner products are indeed defined by taking both inputs from the same vector space. Nevertheless, it is still perfectly possible to define a bilinear form $\langle \cdot,\cdot\rangle:V^*\times V\to\mathbb R$ which takes one covector and one vector and gives a scalar; it is simply the action of the former on the latter:
$$\langle\varphi,v\rangle=\varphi(v).$$
This bilinear form always exists and presupposes strictly less structure than an inner product. This is the 'inner product' which reads $\varphi_j v^j$ in Einstein notation.

Of course, this does relate to the inner product structure $ \langle \cdot,\cdot\rangle_\text{I.P.}$ on $V$ when there is one. Having such a structure enables one to identify vectors and covectors in a canonical way: given a vector $v$ in $V$, its corresponding covector is the linear functional
$$
\begin{align}
i(v)=\langle v,\cdot\rangle_\text{I.P.} : V&\longrightarrow\mathbb R \\
w&\mapsto \langle v,w\rangle_\text{I.P.}.
\end{align}
$$
By construction, both bilinear forms are canonically related, so that the 'inner product' $\langle\cdot,\cdot\rangle$ between $i(v)\in V^*$ and $w\in V$ is exactly the same as the inner product $\langle\cdot,\cdot\rangle_\text{I.P.}$ between $v\in V$ and $w\in V$. That use of language is perfectly justified.
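As a sketch (with an assumed metric matrix $M$ of my own choosing), 'lowering an index' with $M$ produces exactly the covector whose dual pairing reproduces the inner product:

```python
import numpy as np

# Assumed inner product <u, w>_I.P. = u^T M w on R^2, with M symmetric
# positive-definite (an illustrative choice, not from the post).
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
v = np.array([1.0, 2.0])
w = np.array([0.5, -1.0])

# The map i 'lowers the index': the components of the covector i(v) are M v.
i_v = M @ v

# The dual pairing <i(v), w> equals the inner product <v, w>_I.P.:
assert np.isclose(i_v @ w, v @ M @ w)
```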

Addendum 2, on your question about the gradient.

I should really try and convince you at this point that the transformation laws are in fact enough to show something is a covector. (The way the argument goes is that one can define a linear functional on $V$ via the form in $\mathbb R^{n*}$ given by the components, and the transformation laws ensure that this form in $V^*$ is independent of the basis; alternatively, given the components $f_\beta,f_\gamma\in\mathbb R^n$ with respect to two bases, the representation maps give the forms $r_{\beta^*}(f_\beta)=r_{\gamma^*}(f_\gamma)\in V^*$, and the two are equal because of the transformation laws.)

However, there is indeed a deeper reason for the fact that the gradient is a covector. Essentially, it is to do with the fact that the equation
$$df=\nabla f\cdot dx$$
does not actually need a dot product; instead, it relies on the simpler structure of the dual-primal bilinear form $\langle \cdot,\cdot\rangle$.

To make this precise, consider a differentiable function $T:\mathbb R^n\to\mathbb R^m$. The derivative of $T$ at $x_0$ is defined to be the (unique) linear map $dT_{x_0}:\mathbb R^n\to\mathbb R^m$ such that
$$
T(x)=T(x_0)+dT_{x_0}(x-x_0)+o(|x-x_0|),
$$
if it exists. For a scalar function ($m=1$), the gradient is exactly this map; it was born as a linear functional, whose coordinates over any basis are $\frac{\partial f}{\partial x_j}$, to ensure that the multi-dimensional chain rule,
$$
df=\sum_j \frac{\partial f}{\partial x_j}d x_j,
$$
is satisfied. To make things easier to understand to undergraduates who are fresh out of 1D calculus, this linear map is most often 'dressed up' as the corresponding vector, which is uniquely obtainable through the Euclidean structure, and whose action must therefore go back through that Euclidean structure to get to the original $df$.
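The chain rule above can be checked numerically; the function $f$ below is an arbitrary example of mine, and $df$ acts on a displacement $h$ purely through the dual-primal pairing:

```python
import numpy as np

def f(x):
    # an arbitrary smooth function R^2 -> R (illustrative choice)
    return x[0]**2 * x[1] + np.sin(x[1])

def grad_f(x):
    # its partial derivatives: the components of the covector df
    return np.array([2*x[0]*x[1], x[0]**2 + np.cos(x[1])])

x0 = np.array([1.0, 0.5])
h = np.array([1e-6, -2e-6])     # a small displacement

# df(h) = sum_j (df/dx_j) h_j -- only the dual-primal pairing is used,
# no Euclidean dot product structure.
df_h = grad_f(x0) @ h
assert abs(f(x0 + h) - f(x0) - df_h) < 1e-9
```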

But then we say that covectors transform in a certain way. Well, they can transform normally, like any other vector, too. The fact that they transform in the peculiar way they do is associated with the fact that the vectors of the normal space are transformed first and there is a corresponding transformation of the dual space, whereas it could easily happen the other way around, i.e. a covector is transformed first and then its corresponding contravariant vector transforms in the same peculiar way the covector did.

I can tell you, though, that there is indeed complete symmetry between vectors and covectors (which is why they're called duals), and that this extends to the transformation laws. The (co)vectors themselves don't transform. Their components do transform, and they do so together and in exactly the correct way so that the 'inner product' $\langle\cdot,\cdot\rangle$ will have the invariant form $\varphi_j v^j$ regardless of the basis.

Just to make this clear, I will no longer respond to comments on this thread; I have dedicated enough time and you are too convinced that you are right to seek explanations for your questions or indeed listen to anyone else at all. You're welcome to leave more comments in case anyone else wants to explain; I wish you the best of luck on that.

I can't really think of a good resource that has all of this material in this perspective. I learned my basic linear algebra from Friedberg's textbook, which is along these lines, but does not really go the whole way. If you want more of this then it's a good place to start; it will let you put on mathematician's goggles which you can then use to look at other material.

See, the cotangent space is also a vector space, and its elements can thus be represented in components as column vectors, just like in any abstract vector space; thus they will transform like any other vector does. ALL VECTORS ARE THE SAME IN TERMS OF HOW THEY TRANSFORM.


For any manifold, once you have a scalar field $f(x)$, you have at each point $x$ the tangent space, composed of contravariant vectors, and its dual, the cotangent space, composed of covariant vectors. The contravariant vectors $h$ are essentially infinitesimal changes in $x$ divided by an infinitesimal: if $x$ changes to $x+\epsilon h$ then $f(x)$ changes to $f(x+\epsilon h)=f(x)+ \epsilon \langle df(x) |h\rangle+o(\epsilon)$, which shows that the gradient $df(x)$ is a covector.

If the manifold is $\mathbb R^n$ then contravariant vectors are column vectors and covariant vectors are row vectors. They are clearly distinct objects. (But in multivariate calculus one generally prefers to work with column vectors only, and hence defines the gradient to be a column vector too, $\nabla f(x)=df(x)^T$.)
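In numpy terms (a minimal sketch of my own), the row/column distinction looks like this:

```python
import numpy as np

# On R^n the distinction is just rows vs columns.
h = np.array([[1.0], [2.0]])        # contravariant vector: a column
df = np.array([[3.0, -1.0]])        # covariant vector (gradient): a row

pairing = df @ h                    # (1x2)(2x1) -> 1x1 scalar; no metric used
assert pairing.shape == (1, 1)
assert np.isclose(pairing[0, 0], 1.0)

# The nabla f of multivariate calculus is the transpose, a column:
nabla = df.T
assert np.allclose(nabla, [[3.0], [-1.0]])
```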


I will say that the standard definition of vectors and one-forms is not the world's cleanest. A modern definition would say that a vector field is a linear mapping from the functions on the space to itself that satisfies the Leibniz rule (alternatively, the tangent space is the local linear approximation of the space). Then a one-form is a linear mapping from the vector fields to the functions on the space.
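Under this definition a vector field is a derivation; a hedged numerical sketch (the functions and points below are my own choices) checks the Leibniz rule for a directional derivative:

```python
import numpy as np

# "A vector (field) is a derivation": X acts on functions and satisfies
# the Leibniz rule X(fg) = X(f) g + f X(g).
def X(func, x, h, eps=1e-6):
    # directional derivative of func at x along h, by central differences
    return (func(x + eps*h) - func(x - eps*h)) / (2*eps)

f = lambda x: x[0]**2 + x[1]
g = lambda x: np.sin(x[0]) * x[1]
fg = lambda x: f(x) * g(x)

x = np.array([0.3, 1.2])
h = np.array([1.0, -0.5])

lhs = X(fg, x, h)
rhs = X(f, x, h) * g(x) + f(x) * X(g, x, h)
assert np.isclose(lhs, rhs, rtol=1e-5)
```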

This is a precise definition, but as stated it is wholly unintuitive and formal. Maybe a few words could illuminate what this definition really means; otherwise it is just "Bourbakise"... It reminds me a bit of Arnold's lecture pauli.uni-muenster.de/~munsteg/arnold.html
