Fundamental Quantum Probability

Let [itex]|\psi_n\rangle\in\mathcal{H}[/itex], where [itex]\mathcal{H}[/itex] is a Hilbert space, be orthonormal states forming a complete set, with [itex]n\in\mathbb{N}[/itex]. Let

[tex]|\Psi\rangle=\sum_{n=1}^N c_n|\psi_n\rangle,[/tex]

where the [itex]c_n[/itex] are coefficients normalized so that [itex]\sum_{n=1}^N|c_n|^2=1[/itex], and [itex]N[/itex] is either finite or infinite. Let [itex]m[/itex] be an eigenvalue of the observable [itex]\hat O[/itex] corresponding to the eigenket [itex]|\phi\rangle\in\mathcal{H}[/itex], where

According to the Copenhagen interpretation, there is no explanation for this.
According to hidden variable theories, it's due to our lack of knowledge about the initial conditions (like the probabilities in classical statistical physics).

What specifically is "occurrence(s) of measuring particular state(s)" for given [itex]c_n[/itex]s and [itex]N[/itex]?

It's true that it's an interpretational thing. I personally see it through a kind of subjective conditional probability view.

Imagine "a measure of", or equivalently "a way to count", the set of physically distinguishable possibilities, given a premise (seen as information); i.e., each observer encodes such a measure.

Then the quantum probability, defined "the way you imagine probability", is simply a count of the physically distinguishable possibilities that are consistent with the constraint implied by your prior premise, [itex]\{c_n\}_{n=1}^N;\ \mathcal{H}[/itex], which supposedly represents the observer's "information", on which the "probability" is conditional.

The mystery is that, given such a view, the counting computes/evaluates exactly like the structure of QM for a given operator; why it does so I personally see as a so far open problem. But the general idea is that a bounded observer can only encode/hold finite information, therefore his "measure" is incomplete, and therefore non-commutative observables can emerge. Given that, the next question could be the CHOICE of observables/information that defines the observer: given that the observer has limited capacity, WHAT information should be retained/encoded and what should be discarded?

I think all these questions need to be answered in order to give a somewhat satisfactory answer to your original question.

a bounded observer can only encode/hold finite information, therefore his "measure" is incomplete, and therefore non-commutative observables can emerge.

This incompleteness should, though, not be confused with the realist type of incompleteness of QM that Einstein was talking about. This incompleteness is not an incompleteness of our theory; it's an incompleteness intrinsic to the makeup of nature and the interactions in nature.

Let [itex]|\psi_n\rangle\in\mathcal{H}[/itex], where [itex]\mathcal{H}[/itex] is a Hilbert space, be orthonormal states forming a complete set, with [itex]n\in\mathbb{N}[/itex]. Let

[tex]|\Psi\rangle=\sum_{n=1}^N c_n|\psi_n\rangle,[/tex]

where the [itex]c_n[/itex] are coefficients normalized so that [itex]\sum_{n=1}^N|c_n|^2=1[/itex], and [itex]N[/itex] is either finite or infinite. Let [itex]m[/itex] be an eigenvalue of the observable [itex]\hat O[/itex] corresponding to the eigenket [itex]|\phi\rangle\in\mathcal{H}[/itex], where

which follows from the definition of probability. What specifically is "occurrence(s) of measuring particular state(s)" for given [itex]c_n[/itex]s and [itex]N[/itex]?

Most of this looks very strange to me. There certainly shouldn't be a * on that phi near the end. It's not clear what the [itex]\psi_n[/itex] are. Are they the members of an arbitrary orthonormal basis? In that case, they shouldn't appear at all in the final expression. Are they the eigenstates of a complete set of commuting observables? In that case, is [itex]\hat O[/itex] one of them? And why are the states labeled by n instead of by the eigenvalues? Also, you didn't actually say that [itex]\hat O[/itex] is the observable being measured.

I should have been clearer about that, too. I am leaving them unspecified. What if I were to say that the [itex]|\psi_n\rangle[/itex]s are eigenstates of [itex]\hat O[/itex]; how would that change things? I meant that they are the complete set of orthonormal states in all of Hilbert space, which can represent any arbitrary state [itex]|\phi\rangle[/itex]. Is this not possible?

What if I were to say that the [itex]|\psi_n\rangle[/itex]s are eigenstates of [itex]\hat O[/itex]; how would that change things?

I thought you were trying to say something similar to this: If the system is in state [itex]|\psi\rangle[/itex] just before we measure an observable [itex]\hat O[/itex] with eigenvectors [itex]|\phi_n\rangle[/itex], then [itex]|\langle\phi_m|\psi\rangle|^2[/itex] is the probability that the result will be the eigenvalue corresponding to [itex]|\phi_m\rangle[/itex]. (Note that there's no need to mention a basis.)

This is the simplest version of the probability rule. It only works when the space of eigenvectors corresponding to the eigenvalue we're interested in is 1-dimensional.
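To make the rule concrete, here is a minimal numerical sketch in Python/NumPy for the nondegenerate case just described; the observable and state are made-up examples, not anything from the thread:

[code]
import numpy as np

# A made-up Hermitian observable on a 3-dimensional Hilbert space.
O = np.array([[1.0, 0.5, 0.0],
              [0.5, 2.0, 0.3],
              [0.0, 0.3, 3.0]])

# Eigenvalues O_n and orthonormal eigenvectors |phi_n> (columns).
eigenvalues, eigenvectors = np.linalg.eigh(O)

# An arbitrary normalized state |psi> just before the measurement.
psi = np.array([1.0, 1.0j, -0.5])
psi /= np.linalg.norm(psi)

# Born rule: P(O_m) = |<phi_m|psi>|^2 for each (nondegenerate) eigenvalue.
probabilities = np.abs(eigenvectors.conj().T @ psi) ** 2

for O_m, p in zip(eigenvalues, probabilities):
    print(f"P(result = {O_m:+.4f}) = {p:.4f}")
print("sum =", probabilities.sum())  # 1.0, since the |phi_n> form a basis
[/code]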

The [itex]|\phi_n\rangle[/itex]s only appear implicitly in the last expression.

You haven't defined any [itex]|\phi_n\rangle[/itex]. I asked about [itex]|\psi_n\rangle[/itex], and they appear on the right. But I don't know what you meant by that right-hand side. Maybe something like this:

"Number of measurements with result Om" / "Total number of measurements"

...where [itex]O_m[/itex] is the eigenvalue corresponding to [itex]|\phi_m\rangle[/itex].

You seemed to be going for something else, something like

"number of [itex]|\psi_n\rangle[/itex] with a certain property" / "total number of [itex]|\psi_n\rangle[/itex]",

but this certainly doesn't make sense when the Hilbert space is infinite dimensional, and I don't think there's a way to make sense of it in the finite-dimensional case either.
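For what it's worth, the first ratio can at least be checked numerically. Here is a small Python/NumPy sketch (the two-level state is a made-up example); note that it generates the outcomes using the Born rule itself, so it illustrates the frequency connection rather than deriving it:

[code]
import numpy as np

rng = np.random.default_rng(0)

# Made-up qubit state |psi> = c_1|phi_1> + c_2|phi_2>, written in the
# eigenbasis of the measured observable; |c_1|^2 + |c_2|^2 = 1.
c = np.array([0.6, 0.8j])
born = np.abs(c) ** 2  # Born-rule probabilities [0.36, 0.64]

# N independent measurements on identically prepared systems; outcome m
# occurs with probability |<phi_m|psi>|^2.
for N in (10, 100, 10000):
    outcomes = rng.choice(len(c), size=N, p=born)
    freq = np.bincount(outcomes, minlength=len(c)) / N
    print(f"N = {N:5d}: relative frequencies = {freq}, Born rule = {born}")
[/code]

The relative frequencies approach the Born-rule values as N grows, which is exactly the finite-ensemble comparison discussed later in the thread.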

Maybe it was I who misunderstood his question, but I didn't interpret it as having much to do with these issues or typos.

I thought his question was, and it was what I responded to, simply how to interpret something that evaluates as a superposition, [itex]P(\phi|\psi_1 \vee \psi_2)[/itex], in terms of a frequentist probability; i.e., look at the expression, compare it with how classical logic would handle the same thing, and try to interpret it in terms of "frequency counts" of classical (non-quantum) logic.

If you see this as a manipulation of conditional probability, I thought the OP's question was simply a confusion over how to understand the superposition principle? But it's possible that I totally missed the point.
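To spell out the comparison for a two-term superposition (reading [itex]P(\psi_i)[/itex] as [itex]|c_i|^2[/itex], which is one assumption among several possible here): classical logic would add the probabilities of the mutually exclusive alternatives,

[tex]P_{\text{cl}}(\phi) = |c_1|^2\,|\langle\phi|\psi_1\rangle|^2 + |c_2|^2\,|\langle\phi|\psi_2\rangle|^2,[/tex]

whereas the quantum rule squares the sum of amplitudes, which produces an extra interference term:

[tex]P_{\text{QM}}(\phi) = \bigl|c_1\langle\phi|\psi_1\rangle + c_2\langle\phi|\psi_2\rangle\bigr|^2 = P_{\text{cl}}(\phi) + 2\,\mathrm{Re}\!\left[c_1 c_2^*\,\langle\phi|\psi_1\rangle\langle\psi_2|\phi\rangle\right].[/tex]

The interference term is what has no counterpart in a classical "frequency count".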

I thought his question was, and it was what I responded to, simply how to interpret something that evaluates as a superposition, [itex]P(\phi|\psi_1 \vee \psi_2)[/itex], in terms of a frequentist probability; i.e., look at the expression, compare it with how classical logic would handle the same thing, and try to interpret it in terms of "frequency counts" of classical (non-quantum) logic.

This is basically what I was trying to ask: how does [itex]|\langle\phi|\Psi\rangle|^2[/itex] relate to frequentist probability, as you call it?

This is basically what I was trying to ask: how does [itex]|\langle\phi|\Psi\rangle|^2[/itex] relate to frequentist probability, as you call it?

Yes, that's exactly what I thought you asked: you are trying to "understand" the superposition construction (quantum statistics), term by term, in terms of a "probability count" (classical statistics)?

It's what I tried to say in my first two posts. In the regular introduction to QM, there is no physical understanding of this. It's basically part of the postulated structure of QM, in one form or another.

This connection is, I think, still an open issue. I have my own private understanding of it, but there are also various published attempts at this.

I think that, depending on where you start, this is also related to understanding the Born rule.

My personally preferred view is that the "counting" that can restore a "count picture" counts not primary events, but rather distinguishable "information quanta" in a retained, and generally lossily compressed, time history of events. Superposition and non-commutativity are then a result of counting the information states of a memory record, rather than the actual past time history.

Then, depending on the compression and structure of this memory record, indeterminism and non-commutative measurements result. This is how I picture x and p: they represent different counting domains, and when one tries something like [itex]P(x \wedge p)[/itex] we have two information streams that compete for a fixed "information capacity"; this is why confidence in one stream reduces the confidence in the other.

I find this remotely related to Wetterich's idea.

I personally am not aware of a paper that describes this in a good way (in a way I think is good). Eventually I hope to get around to writing a paper on this, but I don't want to publish anything that's isolated from the big picture, since I am quite convinced that it would be misunderstood.

This is basically what I was trying to ask: how does [itex]|\langle\phi|\Psi\rangle|^2[/itex] relate to frequentist probability, as you call it?

The way I see it, the definition of science requires that we test each prediction (assignment of a probability to each possible result of an experiment) by performing a large number of equivalent experiments and comparing the predictions to the relative frequencies of the different possible results. If anyone claims otherwise, I'd like to know what definition of science they're using.

Note that the probabilities that appear in the theory are just numbers between 0 and 1, which the theory associates with results of experiments, and that science requires us to compare those numbers to relative frequencies in large but finite ensembles. There's no need, and it's actually quite irrational, to assume the existence of a limit that the relative frequencies tend to when the number of experiments goes to infinity.

A longer version of this appears in this thread, starting at post #20. (But I think I expressed my ideas more clearly in some of the posts later in the thread).

I just read that paper again, and I have to say, I still don't get it (or maybe I do and that's the problem). Hartle defines an operator [itex]f_\infty{}^k[/itex] and proves that an infinite tensor product [itex]|s\rangle\otimes|s\rangle\otimes\cdots[/itex] is an eigenvector of that operator with eigenvalue [itex]|\langle k|s\rangle|^2[/itex]. What I don't get is why that should be interpreted as a derivation of the probability rule.

Let's look at the frequency operator for finite ensembles, just to avoid the technical difficulties. It can be defined by specifying its action on a tensor product of eigenstates of the operator we're interested in:

[tex]f_N{}^k\,|i_1\rangle\otimes\cdots\otimes|i_N\rangle = \left(\frac{1}{N}\sum_{j=1}^N \delta_{i_j k}\right)|i_1\rangle\otimes\cdots\otimes|i_N\rangle.[/tex]

The eigenvalue is the fraction of the systems that are in state [itex]|k\rangle[/itex]. So [itex]f_N{}^k[/itex] has an obvious interpretation as a frequency operator when it acts on eigenstates of the relevant observable. But the interpretation is not so obvious when it acts on an arbitrary tensor product of states of identically prepared systems.

The expectation value of [itex]f_N{}^k[/itex] in the state [itex]|s\rangle\otimes\cdots\otimes|s\rangle[/itex] is

[tex]\langle f_N{}^k\rangle = \sum_I |\langle I|s\rangle|^2 \left(\frac{1}{N}\sum_{j=1}^N \delta_{i_j k}\right),[/tex]

where the sum is over all sequences [itex]I=(i_1,\dots,i_N)[/itex] of results. These are the probabilities that N measurements will yield a specific sequence of results, multiplied by the frequency of k in each sequence, and added up. So the expectation value is the expected frequency of k in a sequence of N measurements, i.e. it's close to what we would get if we performed the entire sequence of measurements M times, recorded the frequency of k in each sequence, and computed the average of those frequencies, assuming that M is a large enough number.

Does this justify the interpretation of [itex]f_N{}^k[/itex] as a frequency operator when it's acting on a tensor product of identical but arbitrary states? I would say that it does, but note that we used the probability rule to interpret [itex]|\langle I|s\rangle|^2[/itex] as a probability. We seem to need the result we're trying to prove in order to justify interpreting the result we get as the result we're trying to prove!
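Here is a small Python/NumPy sketch of this finite-N frequency operator (the dimension, ensemble size, and states are made-up illustrations); it checks both the eigenvalue on a product of eigenstates and the expectation value [itex]|\langle k|s\rangle|^2[/itex] on identically prepared systems:

[code]
import numpy as np
from functools import reduce

d, N, k = 2, 4, 0  # single-system dimension, ensemble size, outcome index

identity = np.eye(d)
P_k = np.zeros((d, d))
P_k[k, k] = 1.0  # projector |k><k| onto the eigenstate being counted

# f_N^k = (1/N) * sum_j  I x ... x |k><k| (slot j) x ... x I
def slot(j):
    return reduce(np.kron, [P_k if i == j else identity for i in range(N)])

f = sum(slot(j) for j in range(N)) / N

# On a product of eigenstates, the eigenvalue is the fraction of systems
# in state |k>: here |0>|1>|0>|0> gives 3/4 for k = 0.
basis = np.eye(d)
product = reduce(np.kron, [basis[0], basis[1], basis[0], basis[0]])
print(product @ f @ product)  # 0.75

# On |s> x ... x |s> for an arbitrary |s>, the expectation value of f_N^k
# is |<k|s>|^2, but the product state is NOT an eigenvector at finite N;
# Hartle's eigenvector statement only holds in the N -> infinity limit.
s = np.array([0.6, 0.8])
S = reduce(np.kron, [s] * N)
print(S @ f @ S, abs(s[k]) ** 2)  # both 0.36
[/code]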

I just read that paper again, and I have to say, I still don't get it (or maybe I do and that's the problem).
...
Does this justify the interpretation of [itex]f_N{}^k[/itex] as a frequency operator when it's acting on a tensor product of identical but arbitrary states? I would say that it does, but note that we used the probability rule to interpret [itex]|\langle I|s\rangle|^2[/itex] as a probability. We seem to need the result we're trying to prove in order to justify interpreting the result we get as the result we're trying to prove!

I think you are right. Indeed, it is often claimed in the literature that all attempts to explain the Born rule in MWI are circular - they "explain" it by tacitly assuming it in some place. A detailed discussion of this is given in the last chapter (written by N. Graham) of the book
The Many-Worlds Interpretation of Quantum Mechanics, edited by B. S. DeWitt and N. Graham (Princeton University Press, 1973). In this chapter Graham also attempts to avoid this problem. His attempt does not look convincing to me, but I cannot say that I have really understood all of this, so I would like to hear your opinion on Graham's arguments. If you don't have this book, see also Private Messages.

I think you are right. Indeed, it is often claimed in the literature that all attempts to explain the Born rule in MWI are circular - they "explain" it by tacitly assuming it in some place.

Yes, but I think Hartle had already done that by assuming that the Hilbert space of a system that consists of many subsystems is the tensor product of the Hilbert spaces of the subsystems. (That can also be justified by appealing to the Born rule). So it seems that in this case, the implicit use of the Born rule isn't enough. We also need to use it explicitly.

... by assuming that the Hilbert space of a system that consists of many subsystems is the tensor product of the Hilbert spaces of the subsystems. (That can also be justified by appealing to the Born rule).

The way I see it, the definition of science requires that we test each prediction (assignment of a probability to each possible result of an experiment) by performing a large number of equivalent experiments and comparing the predictions to the relative frequencies of the different possible results. If anyone claims otherwise, I'd like to know what definition of science they're using.

Note that the probabilities that appear in the theory are just numbers between 0 and 1, which the theory associates with results of experiments, and that science requires us to compare those numbers to relative frequencies in large but finite ensembles. There's no need, and it's actually quite irrational, to assume the existence of a limit that the relative frequencies tend to when the number of experiments goes to infinity.

How can you, from a conceptual consistency point of view, on one hand make use of the existence of limits in a mathematical sense, and at the same time say that whether these limits exist is irrelevant?

I can buy this if we just see QM as a "theory" that we are testing, without caring about how it's constructed, and mainly focus on falsification or corroboration. But then we ignore the most important step, which I think is the revision of a falsified hypothesis into a new one. Popper made the same mistake in his analysis.

In my view, the subjective conditional probability is defined as the inverse of the number of complexions in the observer's memory microstructure that are consistent with the condition. If the information capacity is bounded, I prefer to think of this measure as not exhausting the entire real segment [0,1]; it would rather be a subset of it.

The question is then whether the microstructure can naturally be given a Hilbert-like structure. It is then clear that, since this microstructure can in principle be informationally decomposed into substructures in many different ways, each substructure representing an observable, it is not possible, due to the information capacity constraint, to encode all the different substructures at once. Non-commutative substructures result. And the choice of structures is a result of the interaction history, as optimal data compression representations.

But in my view at least, the key to this conclusion is the limited information capacity. And without a careful assessment of the information-theoretic basis, such as the probability and limits or non-limits, this key is lost.

One can afterwards, without problems, scale up the complexity to the effective continuum limit, but then we have defined the physical limiting procedure. If we start at the wrong end, the limiting procedure needed to make sense of computations becomes, as usual, ambiguous, because we have no clue what the physical counting measure is.

I have a feeling - to jump a little - that what you call "irrationality" is the quest for improving hypothesis generation. This is exactly what Karl Popper also called irrational and dismissed as the psychology of the human brain. I think we can perform better than that.

How can you, from a conceptual consistency point of view, on one hand make use of the existence of limits in a mathematical sense, and at the same time say that whether these limits exist is irrelevant?

I have no idea why you think I've done that.

It doesn't make sense to assume that the average value after N measurements goes to some value as N→∞, for many different reasons, including that the universe will not support intelligent life (or machines) long enough for infinitely many experiments to be performed.

People who feel that we need that limit to exist must have failed to understand that only mathematical concepts have exact definitions, and that theories of science are already (and will always be) associating mathematical concepts that have exact definitions with real-world concepts that don't have an exact definition.

I can buy this if we just see QM as a "theory" that we are testing, without caring about how it's constructed, and mainly focus on falsification or corroboration. But then we ignore the most important step, which I think is the revision of a falsified hypothesis into a new one. Popper made the same mistake in his analysis.

QM is a theory that we're testing, and what does "revision of a falsified hypothesis to a new one" have to do with anything I've been saying? It's a completely different subject, so how could it be a mistake not to mention it?

In my view, the subjective conditional probability is defined as the inverse of the number of complexions in the observer's memory microstructure that are consistent with the condition.

Mathematical concepts should be defined mathematically, not in terms of real world concepts that don't have an exact definition.

QM seems to work fine even without observers. For example, the nuclear reactions in a star in a distant galaxy seem to work just fine without anyone thinking about the probabilities of those particular interactions.

The question is then whether the microstructure can naturally be given a Hilbert-like structure.

There are already at least three fully developed ways to get to quantum theory by associating a mathematical structure with something in the real world. (See e.g. the first paragraph in this post). I don't see a need for another one, especially not one that starts with an attempt to define probability in terms of psychology.

I have a feeling - to jump a little - that what you call "irrationality" is the quest for improving hypothesis generation. This is exactly what Karl Popper also called irrational and dismissed as the psychology of the human brain. I think we can perform better than that.

That's not a little jump. This stuff has nothing to do with anything I've said.