Wednesday, October 1, 2014

Living in a vector

Vectors are present in all domains of fundamental physics, so if you want to understand physics, you will need them. You may think you know them, but the truth is that they appear in so many guises, that nobody really knows everything about them. But vectors are a gate that allows you to enter the Cathedral of physics, and once you are inside, they can guide you in all places. That is, special and general relativity, quantum mechanics, particle physics, gauge theory... all these places need vectors, and once you master the vectors, they become much simpler (if you don't know them and are interested, read this post).

The Cathedral has many gates, and vectors are just one of them. You can enter through groups, sets and relations, functions, categories, through all sorts of objects or structures from algebra, geometry, even logic. I decided to show you now the way of vectors, because I think it is fast and deep at the same time, but remember, this is a matter of choice. And vectors will lead us, inevitably, to the other gates too.

I will explain some elementary and not so elementary things about vectors, but you have to read and practice, because here I just give some guidelines, a big picture. The reason I am doing this is that when you study, you may get lost in details and miss the essential.

Very basic things

A vector can be understood in many ways. One way is to see it as a way to specify how to move from one point to another. A vector is like an arrow: if you place its tail at a point, its tip shows you the destination point. To find the new position of any point, just place the vector at that point, and the tip of the vector will show you the new position. You can compose several such arrows, tip to tail, and what you get is another vector, their sum. You can also subtract them: just place their origins at the same point, and the difference is the vector obtained by joining their tips with another arrow.

Once you fix a reference position, an origin, you can specify any position by the vector that tells you how to move from the origin to that position. You can see that vector as the difference between the destination and the starting position.

You can add and subtract vectors. You can multiply them by numbers. Those numbers are called scalars, and they are taken from a field $\mathbb{K}$, for example $\mathbb{K}=\mathbb{R}$ or $\mathbb{K}=\mathbb{C}$. A vector space is a set of vectors such that, no matter how you add them and scale them, the result is in the same set. The vector space is real (complex) if the scalars are real (complex) numbers. A sum of rescaled vectors is called a linear combination. You can always pick a basis, or a frame, that is, a set of vectors such that any vector can be written as a linear combination of the basis vectors in a unique way.
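To make the uniqueness of the components concrete, here is a minimal sketch in the plane. The helper names `components_in_basis` and `linear_combination` are made up for this illustration, and the method (Cramer's rule with exact fractions) is just one convenient choice for two dimensions.

```python
from fractions import Fraction

def components_in_basis(v, e1, e2):
    """Return (a, b) such that v = a*e1 + b*e2, for 2D vectors given as pairs."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det == 0:
        raise ValueError("e1 and e2 are collinear, so they do not form a basis")
    # Cramer's rule for the 2x2 linear system a*e1 + b*e2 = v
    a = Fraction(v[0] * e2[1] - v[1] * e2[0], det)
    b = Fraction(e1[0] * v[1] - e1[1] * v[0], det)
    return a, b

def linear_combination(a, e1, b, e2):
    """Rebuild the vector a*e1 + b*e2."""
    return (a * e1[0] + b * e2[0], a * e1[1] + b * e2[1])

v = (3, 1)
e1, e2 = (1, 1), (1, -1)            # a basis that is not the standard one
a, b = components_in_basis(v, e1, e2)
rebuilt = linear_combination(a, e1, b, e2)
```

Here the same arrow `v` has components $(3,1)$ in the standard basis and $(2,1)$ in the basis `(e1, e2)`, and `rebuilt` gives back `v`.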

Vectors and functions

Consider a vector $v$ in an $n$-dimensional space $V$, and suppose its components in a given basis are $(v^1,\ldots,v^n)$. You can represent any vector $v$ as a function $f:\{1,\ldots,n\}\to\mathbb{K}$ given by $f(i)=v^i$. Conversely, any such function defines a unique vector. In general, if $S$ is a set, then the set of functions $f:S\to\mathbb{K}$ forms a vector space, which we will denote by $\mathbb{K}^S$. When $S$ is finite, its cardinality gives the dimension of the vector space, so $\mathbb{K}^{\{1,\ldots,n\}}\cong\mathbb{K}^n$. If $S$ is an infinite set, we get an infinite dimensional vector space. For example, the scalar fields on a three dimensional space, that is, the functions $f:\mathbb{R}^3\to\mathbb{R}$, form an infinite dimensional vector space. Not only are vector spaces not limited to $2$ or $3$ dimensions, but infinite dimensional spaces are very natural too.
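As a small illustration of functions behaving like vectors, the sketch below stores a function $f:S\to\mathbb{K}$ on a finite set $S$ as a Python dictionary. The set of colors and the helper names `add`, `scale` and `delta` are assumptions made only for this example.

```python
def add(f, g):
    """Pointwise sum of two functions defined on the same finite set."""
    return {s: f[s] + g[s] for s in f}

def scale(alpha, f):
    """Pointwise multiplication of a function by a scalar."""
    return {s: alpha * f[s] for s in f}

def delta(S, t):
    """The function that is 1 at t and 0 elsewhere: a basis element of K^S."""
    return {s: 1 if s == t else 0 for s in S}

S = ("red", "green", "blue")
f = {"red": 2, "green": -1, "blue": 5}

# Rebuild f as a linear combination of the delta functions,
# with the values f(t) playing the role of components.
rebuilt = {s: 0 for s in S}
for t in S:
    rebuilt = add(rebuilt, scale(f[t], delta(S, t)))
```

The three delta functions form a basis, which matches the statement that a set with three elements gives a three dimensional space.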

Dual vectors

If $V$ is a $\mathbb{K}$-vector space, a linear function $f:V\to\mathbb{K}$ is a function satisfying $f(u+v)=f(u)+f(v)$, and $f(\alpha u)=\alpha f(u)$, for any $u,v\in V,\alpha\in\mathbb{K}$. The linear functions $f:V\to\mathbb{K}$ form a vector space $V^*$ named the dual space of $V$.
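A quick way to get a feeling for dual vectors is to build one from a list of coefficients and check the two linearity conditions on sample vectors. This is only a sketch for $V=\mathbb{R}^3$ with integer entries; the name `covector` and the chosen numbers are illustrative assumptions.

```python
def covector(coeffs):
    """Return the linear function v -> sum_i coeffs[i] * v[i]."""
    def f(v):
        return sum(c * x for c, x in zip(coeffs, v))
    return f

def vec_add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def vec_scale(alpha, u):
    return tuple(alpha * a for a in u)

f = covector((2, -1, 3))
g = covector((1, 1, 0))
u, v, alpha = (1, 0, 4), (0, 5, -2), 7

additive = f(vec_add(u, v)) == f(u) + f(v)
homogeneous = f(vec_scale(alpha, u)) == alpha * f(u)

# Dual vectors can themselves be added, which is why they form a vector space.
f_plus_g = covector((3, 0, 3))
sum_matches = f_plus_g(u) == f(u) + g(u)
```

Checking a few sample vectors is of course not a proof of linearity, only a sanity check.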

Tensors

Consider now two sets, $S$ and $S'$, and a field $\mathbb{K}$. The Cartesian product $S\times S'$ is defined as the set of pairs $(s,s')$, where $s\in S$ and $s'\in S'$. The functions defined on the Cartesian product, $f:S\times S'\to\mathbb{K}$, form a vector space $\mathbb{K}^{S\times S'}$, named the tensor product of $\mathbb{K}^{S}$ and $\mathbb{K}^{S'}$, $\mathbb{K}^{S\times S'}=\mathbb{K}^{S}\otimes\mathbb{K}^{S'}$. If $(e_i)$ and $(e'_j)$ are bases of $\mathbb{K}^{S}$ and $\mathbb{K}^{S'}$, then $(e_ie'_j)$, where $e_ie'_j(s,s')=e_i(s)e'_j(s')$, is a basis of $\mathbb{K}^{S\times S'}$. Any vector $v\in\mathbb{K}^{S\times S'}$ can be uniquely written as $v=\sum_i\sum_j \alpha_{ij} e_ie'_j$.
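The construction can be tried out on two small finite sets. In the sketch below, the sets, the sample function `F` and the helper names are assumptions chosen for illustration; the basis functions are the ones that equal 1 at a single element and 0 elsewhere.

```python
from itertools import product

S = (0, 1)
S2 = ("a", "b", "c")

def delta(t):
    """Basis function: 1 at t, 0 elsewhere."""
    return lambda s: 1 if s == t else 0

def tensor(e, e2):
    """The product function (s, s') -> e(s) * e2(s')."""
    return lambda s, s2: e(s) * e2(s2)

# An arbitrary function on the Cartesian product S x S'.
F = {(0, "a"): 1, (0, "b"): -2, (0, "c"): 0,
     (1, "a"): 4, (1, "b"): 3, (1, "c"): 7}

def reconstruct(s, s2):
    """Evaluate sum_ij alpha_ij e_i e'_j at (s, s'), with alpha_ij = F(i, j)."""
    return sum(F[(i, j)] * tensor(delta(i), delta(j))(s, s2)
               for i, j in product(S, S2))

matches = all(reconstruct(s, s2) == F[(s, s2)] for s, s2 in product(S, S2))
dimension = len(list(product(S, S2)))   # number of basis elements e_i e'_j
```

With these basis functions the coefficients $\alpha_{ij}$ are simply the values of the function, and the dimension of the tensor product is $2\cdot 3=6$.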

Also, the set of functions $f:S\to\mathbb{K}^{S'}$ is a vector space, which can be identified with the tensor product $\mathbb{K}^{S}\otimes(\mathbb{K}^{S'})^*$.

The vectors that belong to tensor products of vector spaces are named tensors. So, tensors are vectors with some extra structure.

The tensor product can be defined easily for any kind of vector spaces, because any vector space can be thought of as a space of functions. The tensor product is associative, so we can define it between multiple vector spaces. We denote the tensor product of $n>1$ copies of $V$ by $V^{\otimes n}$. We can check that for $m,n>1$, $V^{\otimes (m+n)}=V^{\otimes {m}}\otimes V^{\otimes {n}}$. This also works for $m,n\geq 0$, if we define $V^{\otimes 1}=V$ and $V^{\otimes 0}=\mathbb{K}$. So, vectors and scalars are just tensors.

Let $U$, $V$ be $\mathbb{K}$-vector spaces. A linear operator is a function $f:U\to V$ which satisfies $f(u+v)=f(u)+f(v)$, and $f(\alpha u)=\alpha f(u)$, for any $u,v\in U,\alpha\in\mathbb{K}$. The operator $f:U\to V$ is in fact a tensor from $U^*\otimes V$.
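One way to see why an operator is a tensor from $U^*\otimes V$ is to write a matrix as a sum of elementary pieces "dual vector times vector". The sketch below does this for finite dimensional spaces with the standard bases; all the function names are made up for this illustration.

```python
def basis_covector(i):
    """Dual basis element e^i: reads off the i-th component of a vector in U."""
    return lambda u: u[i]

def basis_vector(j, dim):
    """Basis element f_j of V."""
    return [1 if k == j else 0 for k in range(dim)]

def rank_one(phi, w):
    """The operator u -> phi(u) * w, that is, the elementary tensor phi (x) w."""
    return lambda u: [phi(u) * wk for wk in w]

def operator_from_tensor(A):
    """Build u -> sum_ji A[j][i] * e^i(u) * f_j from the components A[j][i]."""
    rows, cols = len(A), len(A[0])
    terms = [(A[j][i], rank_one(basis_covector(i), basis_vector(j, rows)))
             for j in range(rows) for i in range(cols)]
    def f(u):
        out = [0] * rows
        for coeff, term in terms:
            out = [o + coeff * t for o, t in zip(out, term(u))]
        return out
    return f

def matrix_times_vector(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[1, 2, 0],
     [0, -1, 3]]          # an operator from a 3-dimensional U to a 2-dimensional V
u = [4, 5, 6]
via_tensor = operator_from_tensor(A)(u)
via_matrix = matrix_times_vector(A, u)
```

Both computations give the same vector, so the matrix entries can be read as the components of a tensor in the basis $e^i\otimes f_j$.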

Inner products

Given a basis, any vector can be expressed as a set of numbers, the components of the vector. But the vector is independent of this numerical representation. The basis can be chosen in many ways, and in fact, any non-zero vector can have any components (provided not all are zero) in a well chosen basis. This shows that any two non-zero vectors play identical roles, which may be a surprise. This is a key point, since a common misconception about vectors is that they have definite intrinsic sizes and orientations, or that two of them make a definite angle. But in fact sizes and orientations are relative to the frame, or to the other vectors. Moreover, you can say that one of two vectors is larger than the other only if they are collinear. Otherwise, no matter how small one of them is, we can easily find a basis in which it becomes larger than the other. It makes no sense to speak about the size, or magnitude, or length of a vector as an intrinsic property.

But wait, one may say, there is a way to define the size of a vector! Consider a basis in a two-dimensional vector space, and a vector $v=(v^1,v^2)$. Then, the size of the vector is given by Pythagoras's theorem, by $\sqrt{(v^1)^2+(v^2)^2}$. The problem with this definition is that, if you change the basis, you will obtain different components, and a different size of the vector. To make sure that you obtain the same size, you should allow only certain bases. To speak about the size of a vector, and about the angle between two vectors, you need an additional object, which is called an inner product, or scalar product. Sometimes, for example in geometry and in relativity, it is called a metric.
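Here is a tiny numerical version of this objection. The same arrow is described in a basis $(e_1,e_2)$ and in the rescaled basis $(2e_1,2e_2)$, and Pythagoras's formula is applied blindly to the components. The names and numbers are assumptions chosen for the example.

```python
import math

def naive_size(components):
    """Pythagoras's formula applied directly to the components."""
    return math.sqrt(sum(c * c for c in components))

# The same vector v = 3*e1 + 4*e2 = 1.5*(2*e1) + 2*(2*e2)
components_in_first_basis = (3.0, 4.0)
components_in_rescaled_basis = (1.5, 2.0)

size_first = naive_size(components_in_first_basis)
size_rescaled = naive_size(components_in_rescaled_basis)
```

The vector did not change, but its "size" dropped from 5 to 2.5, so the formula only makes sense once we say which bases are allowed.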

Choosing a basis gives a default inner product. But the best way is to define the inner product, and not to pick a special basis. Once you have the inner product, you can define angles between vectors too. But size and angles are not intrinsic properties of vectors, they depend on the scalar product too.

The inner product between two vectors $u$ and $v$, defined by a basis, is $u\cdot v = u^1 v^1 + u^2 v^2 + \ldots + u^n v^n$. But in a different basis, it will have a general form $u\cdot v=\sum_i\sum_j g_{ij} u^i v^j$, where $g_{ij}=g_{ji}$ can be seen as the components of a symmetric matrix. These components change when we change the basis, they form the components of a tensor from $V^*\otimes V^*$. Einstein had the brilliant idea to omit the sum signs, so the inner product looks like $u\cdot v=g_{ij} u^i v^j$, where you know that since $i$ and $j$ appear both in upper and in lower positions, we make them run from $1$ to $n$ and sum. This is a thing that many geometers hate, but physicists find it very useful and compact in calculations, because the same summation convention appears in many different situations, which to geometers appear to be different, but in fact are very similar.
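The way the coefficients $g_{ij}$ change can be checked numerically. In the sketch below, the columns of the matrix `A` are the new basis vectors written in the old basis, so the new coefficients are $g'_{kl}=\sum_i\sum_j A^i{}_k\, g_{ij}\, A^j{}_l$. The function names and the sample numbers are assumptions for this illustration.

```python
def inner(g, u, v):
    """u . v = sum_ij g[i][j] * u[i] * v[j]"""
    n = len(u)
    return sum(g[i][j] * u[i] * v[j] for i in range(n) for j in range(n))

def change_metric(g, A):
    """Coefficients of the same inner product in the new basis."""
    n = len(g)
    return [[sum(A[i][k] * g[i][j] * A[j][l]
                 for i in range(n) for j in range(n))
             for l in range(n)] for k in range(n)]

def old_components(A, v_new):
    """Components in the old basis of a vector given in the new basis."""
    n = len(v_new)
    return [sum(A[i][k] * v_new[k] for k in range(n)) for i in range(n)]

g = [[1, 0],
     [0, 1]]              # the default inner product of the old basis
A = [[1, 1],
     [0, 2]]              # new basis: e'_1 = e_1, e'_2 = e_1 + 2 e_2
g_new = change_metric(g, A)

u_new, v_new = [2, 1], [-1, 3]
in_new_basis = inner(g_new, u_new, v_new)
in_old_basis = inner(g, old_components(A, u_new), old_components(A, v_new))
```

The coefficients become `[[1, 1], [1, 5]]`, still symmetric but no longer of the simple Pythagorean form, while the value of the inner product itself stays the same.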

Given a basis, we can define the inner product by choosing the coefficients $g_{ij}$. And we can always find another basis, in which $g_{ij}$ is diagonal, that is, it vanishes unless $i=j$. And we can rescale the basis so that the $g_{ii}$ are equal to $-1$, $1$, or $0$. Only if all the $g_{ii}$ are equal to $1$ in some basis is the size of the vector given by the usual Pythagoras's theorem. Otherwise, there will be some minus signs there, and some terms may even be missing (those corresponding to $g_{ii}=0$).

Quantum mechanics

Quantum particles are described by Schrödinger's equation. Its solutions are, for a single elementary particle, complex functions $|\psi\rangle:\mathbb{R}^3\to\mathbb{C}$, or more generally, $|\psi\rangle:\mathbb{R}^3\to\mathbb{C}^k$, named wavefunctions. They describe completely the states of the quantum particle. They form a vector space $H$ which also has a hermitian product (a complex scalar product such that $h_{ij}=\overline{h_{ji}}$), and is named the Hilbert space (because in the infinite dimensional case it also satisfies an additional property which we don't need here), or the state space. Linear transformations of $H$ which preserve the complex scalar product are named unitary transformations, and they are the complex analogues of rotations.

The wavefunctions are represented in a basis as functions of positions, $|\psi\rangle:\mathbb{R}^3\to\mathbb{C}^k$. The elements of the position basis represent point particles. But we can make a unitary transformation and obtain another basis, made of functions of the form $e^{i (k_x x + k_y y + k_z z)}$, which represent pure waves. Some observations use one of the bases, some the other, and this is why there is a duality between waves and point particles.

For several elementary particles, the state space is the tensor product of the state spaces of the individual particles. A tensor product of the form $|\psi\rangle\otimes|\psi'\rangle$ represents a separable state, whose parts can be observed independently. If the state can't be written like this, but only as a sum of such products, the particles are entangled. When we measure them, the outcomes are correlated.
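For the smallest possible case, two systems with two basis states each (an assumption made here to keep things finite), a state is a $2\times 2$ array of coefficients $\alpha_{ij}$, and it is a product $|\psi\rangle\otimes|\psi'\rangle$ exactly when the determinant $\alpha_{00}\alpha_{11}-\alpha_{01}\alpha_{10}$ vanishes. The sketch below uses this criterion; the names `tensor_product` and `is_separable` are illustrative.

```python
import math

def tensor_product(psi, phi):
    """Coefficients alpha_ij = psi_i * phi_j of the product state."""
    return [[a * b for b in phi] for a in psi]

def is_separable(alpha, tol=1e-12):
    """A 2x2 coefficient array is a product state iff its determinant is zero."""
    det = alpha[0][0] * alpha[1][1] - alpha[0][1] * alpha[1][0]
    return abs(det) < tol

s = 1 / math.sqrt(2)

# A product of two single-particle states.
product_state = tensor_product([1, 0], [s, 1j * s])

# A sum of two products that cannot be rewritten as a single product.
entangled_state = [[s, 0],
                   [0, s]]

product_is_separable = is_separable(product_state)
entangled_is_separable = is_separable(entangled_state)
```

The determinant test is special to this $2\times 2$ case; for larger systems one would check whether the coefficient array has rank one.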

The evolution of a quantum system is described by Schrödinger's equation. Basically, the state rotates, by a unitary transformation. Only such transformations conserve the probabilities associated to the wavefunction.
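To see what "conserving the probabilities" means in the simplest setting, the sketch below applies a unitary matrix and a non-unitary one to a two-component state and compares the total $\sum_i|\psi_i|^2$ before and after. The matrices and the state are arbitrary choices made for this example, not taken from any particular physical system.

```python
import math

def apply(M, psi):
    """Matrix times column vector."""
    return [sum(M[i][j] * psi[j] for j in range(len(psi))) for i in range(len(M))]

def norm_squared(psi):
    """Total probability associated with the state."""
    return sum(abs(c) ** 2 for c in psi)

s = 1 / math.sqrt(2)
U = [[s, 1j * s],
     [1j * s, s]]         # unitary: its rows are orthonormal for the hermitian product
N = [[1, 1],
     [0, 1]]              # a shear, linear but not unitary

psi = [0.6, 0.8j]         # normalized: 0.36 + 0.64 = 1

before = norm_squared(psi)
after_unitary = norm_squared(apply(U, psi))
after_shear = norm_squared(apply(N, psi))
```

The unitary matrix keeps the total equal to 1, while the shear changes it to 1.64.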

When you measure a quantum system, you need an observable. One can see an observable as defining a decomposition of the state space into perpendicular subspaces. After the observation, the state is found to be in one of the subspaces. We can only know the subspace, but not the actual state vector. This is strange, because the system can, in principle, be in any possible state, but the measurement finds it to be only in one of these subspaces (we say it collapsed). This is the measurement problem. Things become even stranger if we realize that, if we measure another property, the corresponding decomposition of the state space is different. In other words, if you look for a point particle, you find a point particle, and if you look for a wave, you find a wave. It seems as if the unitary evolution given by Schrödinger's equation is broken during observations. Perhaps the wavefunction remains intact, but to us, only one of the components continues to exist, corresponding to the subspace we obtained after the measurement. In the many worlds interpretation the universe splits, and all outcomes continue to exist, in newly created universes. So, not only does the state vector contain the universe, but it actually contains many universes.
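The dependence on the chosen decomposition can be illustrated in two dimensions, where each perpendicular subspace is spanned by one basis vector. The sketch assumes the standard Born rule, which is not spelled out above: the probability of finding the state along a unit vector $e$ is $|\langle e|\psi\rangle|^2$. The two bases and the state are made up for the example.

```python
import math

def hermitian_product(a, b):
    """<a|b>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def outcome_probabilities(psi, basis):
    """Probability of each outcome (assumed Born rule) for an orthonormal basis."""
    return [abs(hermitian_product(e, psi)) ** 2 for e in basis]

s = 1 / math.sqrt(2)
psi = [1, 0]

basis_A = [[1, 0], [0, 1]]        # one decomposition of the state space
basis_B = [[s, s], [s, -s]]       # a different, rotated decomposition

probs_A = outcome_probabilities(psi, basis_A)
probs_B = outcome_probabilities(psi, basis_B)
```

The same state gives a certain outcome for the first observable and two equally likely outcomes for the second.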

I have a proposed explanation for some strange quantum features, in [1, 2, 3], and in these videos:

Special relativity

An example where there is a minus sign in Pythagoras's theorem is given by the theory of relativity, where the squared size of a vector is $v\cdot v=-(v^t)^2+(v^x)^2+(v^y)^2+(v^z)^2$.

This inner product is named the Lorentz metric. Special relativity takes place in the Minkowski spacetime, which has four dimensions. A vector $v$ is named timelike if $v\cdot v < 0$, spacelike if $v\cdot v > 0$, and null or lightlike if $v\cdot v = 0$. A particle moving with the speed of light is described by a lightlike vector, and one moving with a lower speed, by a timelike vector. Spacelike vectors would describe faster than light particles, if they exist. Points in spacetime are named events. Events can be simultaneous, but this depends on the frame. Anyway, to be simultaneous in a frame, two events have to be separated by a spacelike interval. If they are separated by a lightlike or timelike interval, they can be connected causally, or joined by a particle with a speed equal to, respectively smaller than, the speed of light.
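The classification is easy to turn into a few lines of code. The sketch assumes units in which the speed of light is $1$ and components ordered as $(t,x,y,z)$; the function names are made up for this example.

```python
def minkowski_square(v):
    """v . v = -(v^t)^2 + (v^x)^2 + (v^y)^2 + (v^z)^2, in units with c = 1."""
    t, x, y, z = v
    return -t * t + x * x + y * y + z * z

def classify(v):
    """Label a vector by the sign of v . v (the zero vector is counted as null here)."""
    s = minkowski_square(v)
    if s < 0:
        return "timelike"
    if s > 0:
        return "spacelike"
    return "lightlike"

slow_particle = (2, 1, 0, 0)      # moves one unit of space in two units of time
light_ray = (1, 1, 0, 0)          # moves one unit of space in one unit of time
same_time_events = (0, 3, 4, 0)   # separation between two simultaneous events

labels = [classify(v) for v in (slow_particle, light_ray, same_time_events)]
```

The separation between two events that are simultaneous in this frame has $v^t=0$, so it is necessarily spacelike, as stated above.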

In Newtonian mechanics, the laws remain unchanged under translations and rotations in space, translations in time, and inertial movements of the frame - together they form the Galilei transformations. However, electromagnetism did not obey this symmetry. In fact, this was the motivation of the research of Einstein, Poincaré, Lorentz, and FitzGerald. Their work led to the discovery of special relativity, according to which the correct transformations are not those of Galilei, but those of Poincaré, which preserve the distances given by the Lorentz metric.
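The difference between the two kinds of transformations can be checked on a single vector. The sketch below compares a Lorentz boost along $x$, which is one of the Poincaré transformations, with the corresponding Galilean change of frame, in units with $c=1$. The velocity $0.6$ and the function names are assumptions for this illustration.

```python
import math

def minkowski_square(v):
    t, x, y, z = v
    return -t * t + x * x + y * y + z * z

def lorentz_boost_x(v, beta):
    """Boost along x with velocity beta (in units of c, |beta| < 1)."""
    gamma = 1 / math.sqrt(1 - beta * beta)
    t, x, y, z = v
    return (gamma * (t - beta * x), gamma * (x - beta * t), y, z)

def galilei_boost_x(v, beta):
    """Galilean change to a frame moving along x with velocity beta."""
    t, x, y, z = v
    return (t, x - beta * t, y, z)

v = (2.0, 1.0, 0.0, 0.0)
original = minkowski_square(v)
after_lorentz = minkowski_square(lorentz_boost_x(v, 0.6))
after_galilei = minkowski_square(galilei_boost_x(v, 0.6))
```

The Lorentz boost leaves $v\cdot v=-3$ unchanged, while the Galilean transformation gives $-3.96$.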

Curvilinear coordinates

A basis or a frame of vectors in the Minkowski spacetime allows us to construct Cartesian coordinates. However, if the observer's motion is accelerated (hence the observer is non-inertial), her frame will rotate in time, so Cartesian coordinates will have to be replaced with curvilinear coordinates. In curvilinear coordinates, the coefficients $g_{ij}$ depend on the position. But in special relativity they have to satisfy a flatness condition, otherwise spacetime will be curved, and this didn't make much sense back in 1905, when special relativity was discovered.
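A familiar example of position dependent coefficients, taken here from the ordinary Euclidean plane rather than from spacetime, is given by polar coordinates. With $x=r\cos\theta$ and $y=r\sin\theta$, the coordinate basis vectors are the columns of the Jacobian matrix, and the coefficients $g_{ij}$ are their mutual inner products. The function name `polar_metric` is made up for this sketch.

```python
import math

def polar_metric(r, theta):
    """Coefficients g_ij of the flat plane metric in polar coordinates (r, theta)."""
    # Columns of J are the coordinate basis vectors d/dr and d/dtheta,
    # written in Cartesian components.
    J = [[math.cos(theta), -r * math.sin(theta)],
         [math.sin(theta),  r * math.cos(theta)]]
    return [[sum(J[i][k] * J[i][l] for i in range(2)) for l in range(2)]
            for k in range(2)]

g_near = polar_metric(1.0, 0.7)
g_far = polar_metric(2.0, 0.7)
```

The result is $g_{rr}=1$, $g_{\theta\theta}=r^2$ and $g_{r\theta}=0$, so the coefficients change from point to point even though the plane is flat.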

General relativity

Einstein remarked that to a non-inertial observer, inertia looks similar to gravity. So he imagined that a proper choice of the metric $g_{ij}$ may generate gravity. This turned out indeed to be true, but the choice of $g_{ij}$ corresponds to a curved spacetime, and not a flat one.

One of the problems of general relativity is that it has singularities. Singularities are places where some of the components of $g_{ij}$ become infinite, or where $g_{ij}$ has, when diagonalized, some zero entries on the diagonal. For this reason, many physicists believe that general relativity should be replaced with some other theory, yet to be discovered. Maybe the problem will be solved when we replace it with a theory of quantum gravity, like string theory or loop quantum gravity. But until we know what the right theory of quantum gravity is, general relativity can actually deal with its own singularities (while the theories mentioned above have not solved this problem). I will not describe this here, but you can read my articles about this, and also this essay, and these posts about the black hole information paradox [1, 2, 3]. And watch this video

Vector bundles and forces

We call fields the functions defined on the space or the spacetime. We have seen that fields valued in vector spaces themselves form vector spaces. On a flat space $M$ which looks like a vector space, the fields valued in vector spaces can be thought of as being valued in the same vector space, for example $f:M\to V$. But if the space is curved, or if it has nontrivial topology, we are forced to consider that at each point there is another copy of $V$. So, such a field will be more like $f(x)\in V_x$, where $V_x$ is the copy of the vector space $V$ at the point $x$. Such fields still form a vector space. The union of all $V_x$ is called a vector bundle. The fields are also called sections, and $V_x$ is called the fiber at $x$.

Now, although the $V_x$ are copies of $V$ at each point, there is no invariant way to identify each $V_x$ with $V$. In other words, $V_x$ and $V$ can be identified, for each $x$, only up to a linear transformation of $V$. We need a way to move from $V_x$ to a neighboring $V_{x+d x}$. This can be done with a connection. Also, moving a vector from $V_x$ along a closed curve reveals that, when returning to $V_x$, the vector may come back rotated. This is explained by the presence of a curvature, which can be obtained easily from the connection.

Connections behave like potentials of force fields. And a force field corresponds to the curvature of the connection. This makes it very natural to use vector bundles to describe forces, and this is what gauge theory does.

Forces in the standard model of particles are described as follows. We assume that there is a typical complex vector space $V$ of dimension $n$, endowed with a hermitian scalar product. The connection is required to preserve this hermitian product when moving among the copies $V_x$. The set of linear transformations that preserve the scalar product is named the unitary group, and is denoted by $U(n)$. The subgroup of transformations having the determinant equal to $1$ is named the special unitary group, $SU(n)$. The electromagnetic force corresponds to $U(1)$, the weak force to $SU(2)$, and the strong force to $SU(3)$. Moreover, all particles turn out to correspond to vectors that appear in the representations of the gauge groups on vector spaces.
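The defining conditions of $U(n)$ and $SU(n)$ can be tested on small matrices. The sketch below checks $UU^\dagger=1$ numerically and computes the determinant in the $2\times 2$ case; the sample matrices and the helper names are assumptions chosen for illustration.

```python
import cmath

def dagger(U):
    """Conjugate transpose."""
    return [[U[j][i].conjugate() for j in range(len(U))] for i in range(len(U[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_unitary(U, tol=1e-12):
    """True if U times its conjugate transpose is the identity."""
    P = matmul(U, dagger(U))
    n = len(U)
    return all(abs(P[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

def det2(U):
    """Determinant of a 2x2 matrix."""
    return U[0][0] * U[1][1] - U[0][1] * U[1][0]

phase = [[cmath.exp(0.3j)]]       # an element of U(1): multiplication by a phase
su2 = [[0.6, 0.8j],
       [0.8j, 0.6]]               # unitary with determinant 1
u2_only = [[1j, 0],
           [0, 1]]                # unitary, but with determinant i

checks = [is_unitary(phase), is_unitary(su2), is_unitary(u2_only)]
in_su2 = [abs(det2(M) - 1) < 1e-12 for M in (su2, u2_only)]
```

All three matrices are unitary, but only the second one has determinant $1$ and therefore belongs to $SU(2)$.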

What's next?

Vectors are present everywhere in physics. We see that they help us understand quantum mechanics, special and general relativity, and the particles and forces. They seem to offer a unitary view of fundamental physics.

However, up to this point, we don't know how to unify:

unitary evolution and the collapse of the wavefunction

the quantum level with the mundane classical level

quantum mechanics and general relativity

the electroweak and strong forces (we know though how to combine the electromagnetic and weak forces, in the unitary group $U(2)$)