Quantum math: states as vectors, and apparatuses as operators

I actually wanted to write about the Hamiltonian matrix. However, I realize that, before I can serve the plat de résistance, we need to review or introduce some more concepts and ideas. It all revolves around the same theme: working with states is like working with vectors, but you need to know exactly how. Let’s go for it. 🙂

In my previous posts, I repeatedly said that a set of base states is like a coordinate system. A coordinate system allows us to describe (i.e. uniquely identify) vectors in an n-dimensional space: we associate a vector with a set of real numbers, like x, y and z, for example. Likewise, we can describe any state in terms of a set of complex numbers – amplitudes, really – once we’ve chosen a set of base states. We referred to this set of base states as a ‘representation’. For example, if our set of base states is +S, 0S and −S, then any state φ can be defined by the amplitudes C+ = 〈 +S | φ 〉, C0 = 〈 0S | φ 〉, and C− = 〈 −S | φ 〉.

We have to choose some representation (but we are free to choose which one) because, as I demonstrated when doing a practical example (see my description of muon decay in my post on how to work with amplitudes), we’ll usually want to calculate something like the amplitude to go from one state to another – which we denoted as 〈 χ | φ 〉 – and we’ll do that by breaking it up. To be precise, we’ll write that amplitude 〈 χ | φ 〉 – i.e. the amplitude to go from state φ to state χ (you have to read this thing from right to left, like Hebrew or Arabic) – as the following sum:

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 (sum over all i)

So that’s a sum over a complete set of base states (that’s why I write all i under the summation symbol ∑). We discussed this rule in our presentation of the ‘Laws’ of quantum math.

Now we can play with this. As χ can be defined in terms of the chosen set of base states too, it’s handy to know that 〈 χ | i 〉 and 〈 i | χ 〉 are each other’s complex conjugates – we write this as: 〈 χ | i 〉 = 〈 i | χ 〉* – so if we have one, we have the other (we can also write: 〈 i | χ 〉* = 〈 χ | i 〉). In other words, if we have all Ci = 〈 i | φ 〉 and all Di = 〈 i | χ 〉, i.e. the ‘components’ of both states in terms of our base states, then we can calculate 〈 χ | φ 〉 as:

〈 χ | φ 〉 = ∑ Di*Ci = ∑〈 χ | i 〉〈 i | φ 〉,

provided we make sure we do the summation over a complete set of base states. For example, if we’re looking at the angular momentum of a spin-1/2 particle, like an electron or a proton, then we’ll have two base states, +ħ/2 and −ħ/2, so then we’ll have only two terms in our sum. The spin number (j) of a cobalt nucleus, however, is 7/2, so if we were looking at the angular momentum of a cobalt nucleus, we’d have eight (2·j + 1) base states and, hence, eight terms when doing the sum. So it’s very much like working with vectors, indeed, and that’s why states are often referred to as state vectors. So now you know that term too. 🙂
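To make that sum concrete, here is a minimal Python sketch of the 〈 χ | φ 〉 = ∑ Di*Ci rule. The function name amplitude and the two spin-1/2 example states are hypothetical choices for illustration only (they don’t come from the earlier posts), and the ‘components’ are just plain Python lists of complex numbers.

```python
import math

def amplitude(chi, phi):
    """Return <chi|phi> = sum_i conj(D_i) * C_i.

    chi holds the D_i = <i|chi>, phi holds the C_i = <i|phi>,
    both written in the same complete set of base states.
    """
    if len(chi) != len(phi):
        raise ValueError("both states must use the same set of base states")
    return sum(d.conjugate() * c for d, c in zip(chi, phi))

# Hypothetical spin-1/2 example: two base states ('up', 'down'), so two terms.
s = 1 / math.sqrt(2)
phi = [s, s * 1j]     # C_i = <i|phi>
chi = [1 + 0j, 0j]    # D_i = <i|chi>: the pure 'up' state

amp = amplitude(chi, phi)   # amplitude to go from phi to chi
prob = abs(amp) ** 2        # the associated probability
print(amp, prob)
```

For a cobalt nucleus, the two lists would simply have eight entries each instead of two.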

However, the similarities run even deeper, and we’ll explore all of them in this post. You may or may not remember that your math teacher actually also defined ordinary vectors in three-dimensional space in terms of base vectors ei, defined as: e1 = [1, 0, 0], e2 = [0, 1, 0] and e3 = [0, 0, 1]. You may also remember that the units along the x, y and z-axis didn’t have to be the same – we could, for example, measure in cm along the x-axis, but in inches along the z-axis, even if that’s not very convenient to calculate stuff – but that it was very important to ensure that the base vectors were a set of orthogonal vectors. In any case, we’d choose our set of orthogonal base vectors and write all of our vectors as:

A = Ax·e1 + Ay·e2 + Az·e3, with Ax = A·e1, Ay = A·e2 and Az = A·e3

This actually allows us to re-write the vector dot product A·B in a way you probably haven’t seen before. Indeed, you’d usually calculate A·B as |A|∙|B|·cosθ = A∙B·cosθ (A and B are the magnitudes of the vectors A and B respectively) or, quite simply, as AxBx + AyBy + AzBz. However, using the dot products above, we can now also write it as:

B·A = ∑(B·ei)(ei·A) (sum over i = 1, 2, 3)

We deliberately wrote B·A instead of A∙B because, while the mathematical similarity with the

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉

equation is obvious, B·A = A·B but, in general, 〈 χ | φ 〉 ≠ 〈 φ | χ 〉. Indeed, 〈 χ | φ 〉 and 〈 φ | χ 〉 are complex conjugates – so 〈 χ | φ 〉 = 〈 φ | χ 〉* – but they’re not equal unless the amplitude happens to be a real number. So we’ll have to watch the order when working with those amplitudes. That’s because we’re working with complex numbers instead of real numbers. Indeed, it’s only because the A·B dot product involves real numbers, whose complex conjugate is the same, that we have that commutativity in the real vector space. Apart from that – so apart from having to carefully check the order of our products – the correspondence is complete.
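If you want to see the order issue with actual numbers, the little Python sketch below compares 〈 χ | φ 〉 with 〈 φ | χ 〉 for two made-up states, and B·A with A·B for two made-up real vectors. All of the numbers, and the function names, are hypothetical and chosen for illustration only.

```python
def amplitude(chi, phi):
    """<chi|phi> = sum_i conj(D_i) * C_i (complex components)."""
    return sum(d.conjugate() * c for d, c in zip(chi, phi))

def dot(b, a):
    """Ordinary dot product of two real vectors."""
    return sum(x * y for x, y in zip(b, a))

# Two hypothetical normalized states in a two-state representation.
phi = [0.6, 0.8j]
chi = [1j * 2 ** -0.5, 2 ** -0.5]

forward = amplitude(chi, phi)    # <chi|phi>
backward = amplitude(phi, chi)   # <phi|chi>

# Two hypothetical real vectors.
A = [1.0, 2.0, 3.0]
B = [4.0, 5.0, 6.0]

print(forward, backward)         # complex conjugates of each other, not equal
print(dot(B, A), dot(A, B))      # the same real number
```

Swapping the bra and the ket flips the sign of the imaginary part, while swapping the real vectors changes nothing.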

Let me mention another similarity here. As mentioned above, our base vectors ei had to be orthogonal. We can write this condition as:

ei·ej = δij, with δij = 0 if i ≠ j, and 1 if i = j.

Now, our first quantum-mechanical rule says the same:

〈 i | j 〉 = δij, with δij = 0 if i ≠ j, and 1 if i = j.

So our set of base states also has to be ‘orthogonal’, which is the term you’ll find in physics textbooks, although – as evidenced from our discussion on the base states for measuring angular momentum – one should not try to give any geometrical interpretation here: +ħ/2 and −ħ/2 (so that’s spin ‘up’ and ‘down’ respectively) are not ‘orthogonal’ in any geometric sense, indeed. It’s just that pure states, i.e. base states, are separate, which we write as: 〈 ‘up’ | ‘down’ 〉 = 〈 ‘down’ | ‘up’ 〉 = 0 and 〈 ‘up’ | ‘up’ 〉 = 〈 ‘down’ | ‘down’ 〉 = 1. It just means they are different base states, and so it’s one or the other. For our +S, 0S and −S example, we’d have nine such amplitudes, and we can organize them in a little matrix:

〈 i | j 〉 =
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]
(rows i and columns j in the order +S, 0S, −S)

In fact, just like we defined the base vectors ei as e1 = [1, 0, 0], e2 = [0, 1, 0] and e3 = [0, 0, 1] respectively, we may say that the matrix above, which states exactly the same as the 〈 i | j 〉 = δij rule, can serve as a definition of what base states actually are. [Having said that, it’s obvious we like to believe that base states are more than just mathematical constructs: we’re talking reality here. The angular momentum as measured in the x-, y- or z-direction, or in whatever direction, is more than just a number.]
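The nine 〈 i | j 〉 amplitudes can be tabulated in a few lines of Python. This is only a sketch: representing each base state by a list with a single 1 in it is an assumption that mirrors the e1, e2, e3 base vectors, and the names labels, base and table are hypothetical.

```python
labels = ["+S", "0S", "-S"]

# Hypothetical encoding: each base state has amplitude 1 for itself
# and amplitude 0 for the two other base states.
base = {name: [1 if k == i else 0 for k in range(len(labels))]
        for i, name in enumerate(labels)}

def amplitude(chi, phi):
    """<chi|phi> = sum_i conj(D_i) * C_i."""
    return sum(d.conjugate() * c for d, c in zip(chi, phi))

# The nine <i|j> amplitudes, organized as a 3-by-3 table.
table = [[amplitude(base[i], base[j]) for j in labels] for i in labels]

for name, row in zip(labels, table):
    print(name, row)
```

The table that comes out has ones on the diagonal and zeros elsewhere, which is just the δij rule again.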

OK. You get this. In fact, you’re probably getting impatient because this is too simple for you. So let’s take another step. We showed that the 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 and B·A = ∑(B·ei)(ei·A) are structurally equivalent – from a mathematical point of view, that is – but B and A are separate vectors, while 〈 χ | φ 〉 is just a complex number. Right?

Well… No. We can actually analyze the bra and the ket in the 〈 χ | φ 〉 bra-ket as separate pieces too. Moreover, we’ll show they are actually state vectors too, even if the bra, i.e. 〈 χ |, and the ket, i.e. | φ 〉, are ‘unfinished pieces’, so to speak. Let’s be bold. Let’s just cut the 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 by writing:

| φ 〉 = ∑ | i 〉〈 i | φ 〉

Huh?

Yes. That’s the power of Dirac’s bra-ket notation: we can just drop symbols left or right. It’s quite incredible. But, of course, the question is: so what does this actually mean? Well… Don’t rack your brain. I’ll tell you. We define | φ 〉 as a state vector because we define | i 〉 as a (base) state vector. Look at it this way: we wrote the 〈 +S | φ 〉, 〈 0S | φ 〉 and 〈 −S | φ 〉 amplitudes as C+, C0, C−, respectively, so we can write the equation above as:

| φ 〉 = ∑ | i 〉 Ci

So we’ve got a sum of products here, and it’s just like A = Ax·e1 + Ay·e2 + Az·e3. Just substitute the Ai coefficients for Ci and the ei base vectors for the | i 〉 base states. We get:

| φ 〉 = |+S〉 C+ + |0S〉 C0 + |−S〉 C−

Of course, you’ll wonder what those terms mean: what does it mean to ‘multiply’ C+ (remember: C+ is some complex number) by |+S〉? Be patient. Just wait. You’ll understand when we do some examples, so when you start working with this stuff. You’ll see it all makes sense—later. 🙂

Of course, we’ll have a similar equation for | χ 〉, and so if we write 〈 i | χ 〉 as Di, then we can write | χ 〉 = ∑ | i 〉〈 i | χ 〉 as | χ 〉 = ∑ | i 〉 Di.

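So what? Again: be patient. We know that 〈 χ | i 〉 = 〈 i | χ 〉*, so if we write the same thing for the bra 〈 χ | rather than for the ket | χ 〉, our second equation above becomes:

〈 χ | = ∑〈 χ | j 〉〈 j | = ∑ Dj*〈 j |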

You’ll have two questions now. The first is the same as the one above: what does it mean to ‘multiply’, let’s say, D0* (i.e. the complex conjugate of D0, so if D0 = a + ib, then D0* = a − ib) with 〈0S|? The answer is the same: be patient. 🙂 Your second question is: why do I use another symbol for the index here? Why j instead of i? Well… We’ll have to re-combine stuff, so it’s better to keep things separate by using another symbol for the same index. 🙂

In fact, let’s re-combine stuff right now, in exactly the same way as we took it apart: we just write the two things right next to each other. We get the following:

〈 χ | φ 〉 = ∑ Dj*〈 j | i 〉 Ci (sum over all i and j) = ∑ Di*Ci (sum over all i)

What? Is that it? So we went through all of this hocus-pocus just to find the same equation as we started out with?

Yes. I had to take you through this so you get used to juggling all those symbols, because that’s what we’ll do in the next post. Just think about it and give yourself some time. I know you’ve probably never ever handled such an exercise in symbols before – I haven’t, for sure! – but it all makes sense: we cut and paste. It’s all great! 🙂 [Oh… In case you wonder about the transition from the sum involving i and j to the sum involving i only, think about the Kronecker expression: 〈 j | i 〉 = δij, with δij = 0 if i ≠ j, and 1 if i = j, so most of the terms are zero.]

To summarize the whole discussion, note that the expression above is completely analogous to the B·A = BxAx + ByAy + BzAz formula. The only difference is that we’re talking complex numbers here, so we need to watch out. We have to watch the order of stuff, and we can’t use the Di numbers themselves: we have to use their complex conjugates Di*. But, for the rest, we’re all set! 🙂 If we’ve got a set of base states, then we can define any state in terms of a set of ‘coordinates’ or ‘coefficients’ – i.e. the Ci or Di numbers for the φ or χ example above – and we can then calculate the amplitude to go from one state to another as:

〈 χ | φ 〉 = ∑ Di*Ci

In case you’d get confused, just take the original equation:

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉

The two equations are fully equivalent.
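The step from the double sum over i and j to the single sum ∑ Di*Ci is easy to check numerically. Here is a small Python sketch of that check: the delta function stands for the 〈 j | i 〉 = δij rule, and the Ci and Di values are made-up numbers for a three-state example.

```python
def delta(j, i):
    """Kronecker delta: the amplitude <j|i> between two base states."""
    return 1 if i == j else 0

# Hypothetical components C_i = <i|phi> and D_i = <i|chi> (both normalized).
C = [0.5, 0.5j, 2 ** -0.5]
D = [2 ** -0.5, 0, -1j * 2 ** -0.5]
n = len(C)

# Double sum: every D_j* paired with every C_i, with <j|i> in the middle.
double_sum = sum(complex(D[j]).conjugate() * delta(j, i) * C[i]
                 for j in range(n) for i in range(n))

# Single sum: only the i = j terms survive.
single_sum = sum(complex(D[i]).conjugate() * C[i] for i in range(n))

print(double_sum, single_sum)
```

Six of the nine terms in the double sum are multiplied by zero, so both expressions give the same complex number.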

[…]

So we just went through all of the shit above so as to show that structural similarity with vector spaces?

Yes. It’s important. You just need to remember that we may have two, three, four, five,… or even an infinite number of base states depending on the situation we’re looking at, and what we’re trying to measure. I am sorry I had to take you through all of this. However, there’s more to come, and so you need this baggage. We’ll take the next step now, and that is to introduce the concept of an operator.

Look at the middle term in that expression above – let me copy it:

∑ Dj*〈 j | i 〉 Ci (sum over all i and j)

We’ve got three factors in each term of that double sum (a double sum is a sum involving two indices, which is what we have here: i and j). When we have two indices like that, one thinks of matrices. That’s easy to do here, because we represented that 〈 i | j 〉 = δij equation as a matrix too! To be precise, we presented it as the identity matrix, and a simple substitution allows us to re-write our equation above as a row vector times the identity matrix times a column vector:

〈 χ | φ 〉 = [ D+* D0* D−* ] · [ 1 0 0 ; 0 1 0 ; 0 0 1 ] · [ C+ ; C0 ; C− ]

I must assume you’re shaking your head in disbelief now: we’ve expanded a simple amplitude into a product of three matrices now. Couldn’t we just stick to that sum, i.e. that vector dot product ∑ Di*Ci? What’s next? Well… I am afraid there’s a lot more to come. For starters, we’ll take that idea of ‘putting something in the middle’ to the next level by going back to our Stern-Gerlach filters and whatever other apparatus we can think of. Let’s assume that, instead of some filter S or T, we’ve got something more complex now, which we’ll denote by A. [Don’t confuse it with our vectors: we’re talking an apparatus now, so you should imagine some beam of particles, polarized or not, entering it, going through, and coming out.]

We’ll stick to the symbols we used already, and so we’ll just assume a particle enters into the apparatus in some state φ, and that it comes out in some state χ. Continuing the example of spin-one particles, and assuming our beam has not been filtered – so, using lingo, we’d say it’s unpolarized – we’d say there’s a probability of 1/3 for being in each of the ‘plus’, ‘zero’, or ‘minus’ states with respect to whatever representation we’d happen to be working with, and the related amplitudes would be 1/√3. In other words, we’d say that φ is defined by C+ = 〈 +S | φ 〉, C0 = 〈 0S | φ 〉, and C− = 〈 −S | φ 〉, with C+ = C0 = C− = 1/√3. In fact, using that | φ 〉 = |+S〉 C+ + |0S〉 C0 + |−S〉 C− expression we invented above, we’d write: | φ 〉 = (1/√3)|+S〉 + (1/√3)|0S〉 + (1/√3)|−S〉 or, using ‘matrices’ – just a row and a column, really:

| φ 〉 = [ |+S〉 |0S〉 |−S〉 ] · [ 1/√3 ; 1/√3 ; 1/√3 ]

However, you don’t need to worry about that now. The new big thing is the following expression:

〈 χ | A | φ〉

It looks simple enough: φ to A to χ. Right? Well… Yes and no. The question is: what do you do with this? How would we take its complex conjugate, for example? And if we know how to do that, would it be equal to 〈 φ | A | χ〉?

You guessed it: we’ll have to take it apart, but how? We’ll do this using another fantastic abstraction. Remember how we took Dirac’s 〈 χ | φ 〉 bra-ket apart by writing | φ 〉 = ∑ | i 〉〈 i | φ 〉? We just dropped the 〈 χ left and right in our 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 expression. We can go one step further now, and drop the φ 〉 left and right in our | φ 〉 = ∑ | i 〉〈 i | φ 〉 expression. We get the following wonderful thing:

| = ∑ | i 〉〈 i | over all base states i

With characteristic humor, Feynman calls this ‘The Great Law of Quantum Mechanics’ and, frankly, there’s actually more than one grain of truth in this. 🙂
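One way to get a feel for the | = ∑ | i 〉〈 i | rule is to build the | i 〉〈 i | pieces as little matrices and add them up: the sum should come out as the identity matrix, i.e. as a ‘factor 1’. The Python sketch below does this for a two-state system. Note the assumptions: the outer function is a hypothetical helper, and the two base states are a made-up orthogonal pair written out in terms of an ‘up’/‘down’ representation.

```python
import math

def outer(ket, bra):
    """Matrix for |ket><bra|: entry (r, c) is ket[r] * conj(bra[c])."""
    return [[a * complex(b).conjugate() for b in bra] for a in ket]

# A hypothetical complete set of two base states (orthogonal and normalized).
s = 1 / math.sqrt(2)
base_states = [[s, s * 1j], [s, -s * 1j]]

# Add up the |i><i| matrices over all base states i.
n = 2
total = [[0j] * n for _ in range(n)]
for state in base_states:
    projector = outer(state, state)
    for r in range(n):
        for c in range(n):
            total[r][c] += projector[r][c]

for row in total:
    print(row)
```

The off-diagonal entries of the two | i 〉〈 i | matrices cancel, which is why we can insert the | bar anywhere without changing an amplitude.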

Now, if we apply this ‘Great Law’ to our 〈 χ | A | φ〉 expression – we should apply it twice, actually – we get:

〈 χ | A | φ 〉 = ∑〈 χ | i 〉〈 i | A | j 〉〈 j | φ 〉 (sum over all i and j)

As Feynman points out, it’s easy to add another apparatus in series. We just write:

〈 χ | BA | φ 〉 = ∑〈 χ | i 〉〈 i | B | j 〉〈 j | A | k 〉〈 k | φ 〉 (sum over all i, j and k)

Just put a | bar between B and A and apply the same trick. The | bar is really like a factor 1 in multiplication. However, that’s all great fun but it doesn’t solve our problem. Our ‘Great Law’ allows us to sort of ‘resolve’ our apparatus A in terms of base states, as we now have 〈 i | A | j 〉 in the middle, rather than 〈 χ | A | φ〉 but, again, how do we work with that?

Well… The answer will surprise you. Rather than trying to break this thing up, we’ll say that the apparatus A is actually being described, or defined, by the nine 〈 i | A | j 〉 amplitudes. [There are nine for this example, but four only for the example involving spin-1/2 particles, of course.] We’ll call those amplitudes, quite simply, the matrix of amplitudes, and we’ll often denote it by Aij.
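Once an apparatus is given as a matrix of amplitudes Aij, the 〈 χ | A | φ 〉 amplitude is just the double sum ∑ Di* Aij Cj, and two apparatuses in series amount to a matrix product. The Python sketch below shows both. The two matrices are hypothetical examples invented for illustration (they do not describe any real Stern-Gerlach setup), and the names sandwich and in_series are made up as well.

```python
def sandwich(chi, A, phi):
    """<chi|A|phi> = sum over i, j of conj(D_i) * A_ij * C_j."""
    n = len(phi)
    return sum(complex(chi[i]).conjugate() * A[i][j] * phi[j]
               for i in range(n) for j in range(n))

def in_series(B, A):
    """Matrix of amplitudes for A followed by B: (BA)_ik = sum_j B_ij * A_jk."""
    n = len(A)
    return [[sum(B[i][j] * A[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

r = 3 ** -0.5
phi = [r, r, r]    # C+ = C0 = C- = 1/sqrt(3)
chi = [1, 0, 0]    # the +S base state

# Hypothetical apparatuses, in the order +S, 0S, -S:
A = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # lets only the +S component through
B = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]   # swaps the +S and -S components

amp_A = sandwich(chi, A, phi)                  # <chi|A|phi>
amp_BA = sandwich(chi, in_series(B, A), phi)   # <chi|BA|phi>
print(amp_A, amp_BA)
```

With these made-up matrices, A alone gives an amplitude of 1/√3 to end up in +S, while A followed by B gives zero, because B moves whatever A lets through into −S.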

Now, I wanted to talk about operators here. The idea of an operator comes up when we’re creative again, and when we drop the 〈 χ | state from the 〈 χ | A | φ〉 expression. We write:

| ψ 〉 = A | φ 〉

So now we think of the particle entering the ‘apparatus’ A in the state φ and coming out of A in some state ψ (‘psi’). We can generalize this and think of it as an ‘operator’, which Feynman intuitively defines as follows:

“The symbol A is neither an amplitude, nor a vector; it is a new kind of thing called an operator. It is something which “operates on” a state to produce a new state.”

But… Wait a minute! | ψ 〉 is not the same as 〈 χ |. Why can we do that substitution? We can only do it because any state ψ and χ are related through that other ‘Law’ of quantum math:

Combining the two shows our ‘definition’ of an operator is OK. We should just note that it’s an ‘open’ equation until it is completed with a ‘bra’, i.e. a state like 〈 χ |, so as to give the 〈 χ | ψ〉 = 〈 χ | A | φ〉 type of amplitude that actually means something. In practical terms, that means our operator or our apparatus doesn’t mean much as long as we don’t measure what comes out, so then we choose some set of base states, i.e. a representation, which allows us to describe the final state, i.e. 〈 χ |.
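Here is what ‘operating on a state’ and then ‘closing’ the equation with a bra looks like in a small Python sketch. The 2-by-2 matrix A, the states φ and χ, and the function names are all hypothetical: the only point is that 〈 χ | ψ 〉, with | ψ 〉 = A | φ 〉, gives the same number as the 〈 χ | A | φ 〉 double sum.

```python
def apply_operator(A, phi):
    """Components of |psi> = A|phi>: psi_i = sum_j A_ij * C_j."""
    n = len(phi)
    return [sum(A[i][j] * phi[j] for j in range(n)) for i in range(n)]

def amplitude(chi, psi):
    """<chi|psi> = sum_i conj(D_i) * psi_i."""
    return sum(complex(d).conjugate() * p for d, p in zip(chi, psi))

# Hypothetical matrix of amplitudes and states in a two-state representation.
A = [[0, 1j], [-1j, 0]]
phi = [0.6, 0.8]
chi = [2 ** -0.5, 1j * 2 ** -0.5]

psi = apply_operator(A, phi)   # the 'open' equation: a new state
closed = amplitude(chi, psi)   # closing it with the bra <chi|

# The same amplitude, written directly as the double sum <chi|A|phi>.
direct = sum(complex(chi[i]).conjugate() * A[i][j] * phi[j]
             for i in range(2) for j in range(2))
print(psi, closed, direct)
```

The list psi by itself is not an amplitude yet: it only becomes one once we pick a final state and do the last sum.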

[…]

Well… Folks, that’s it. I know this was mighty abstract, but the next posts should bring things back to earth again. I realize it’s only by working examples and doing exercises that one can get some kind of ‘feel’ for this kind of stuff, so that’s what we’ll have to go through now. 🙂