The (Epistemological) Foundations of Physics

Sunday, March 1, 2015

From page 82 onward (click the book on the right) the analysis turns its attention to the General Relativistic description of gravity. The very first paragraphs allude to the fact that in conventional physics, Relativity Theory is not compatible with Quantum Mechanics. The subtle problems between Special Relativity and Quantum Mechanics are sometimes dismissed as negligible semantic issues, but the deeper you get into relativistic gravity, the more concrete those problems become.

In the attempts to quantize General Relativity, there are significant conceptual problems in the way time is conceived as a dynamic metric of space-time. The notion of time is embedded in the dynamic space-time geometry, and in quantizing that geometry, the notion of time becomes self-referential to the dynamics of the geometry that describes time in the first place.

Click here for some commentary on the issue. Notice how typical attempts to quantize gravity involve quantizing general relativistic space-time itself. Notice how that leads to an ill-defined background for quantum fluctuations. If it is space-time that fluctuates, what is it fluctuating in relation to?

Notice also the comments about the problematic conceptualization of time: "Since time is essentially a geometrical concept [in General Relativity], its definition must be in terms of the metric. But the metric is also the dynamical variable, so the flow of time becomes intertwined with the flow of the dynamics of the system."

If we step back to Special Relativity for a bit, it's worth noting that space-time is usually described with the metric:

$$ds^2= dx^2 + dy^2 + dz^2 - (cdt)^2$$

The time component $t$ is taken as a coordinate axis, and its signature means the interval $s$ is exactly 0 in the case that $dx^2 + dy^2 + dz^2 = (cdt)^2$.

Meaning, anything moving at the speed of light is described as having a 0 length interval. It takes the same coordinates when it is at Earth as when it is at Alpha Centauri, in terms of its own coordinate system. Or to be more accurate, such a coordinate system is ill-defined; "light" is conceptually time-dilated to 0 ("it doesn't see time"), and thus it doesn't see the "t"-coordinate, insofar as one wishes to describe time as geometry. On the flip side of the same coin, the spatial distance between events is also exactly 0, or to say it another way, there's no way to describe time evolution from the perspective of light.
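
As a minimal numerical sketch of the zero-length interval described above, the metric can be evaluated directly for a displacement covered at the speed of light and, for contrast, for an object that stays put. The function name `interval_squared` and the sample displacements are illustrative assumptions and are not taken from the book.

```python
C = 299_792_458.0  # speed of light in m/s (SI defined value)


def interval_squared(dx, dy, dz, dt, c=C):
    """Return ds^2 = dx^2 + dy^2 + dz^2 - (c*dt)^2 for the metric above."""
    return dx**2 + dy**2 + dz**2 - (c * dt) ** 2


# A light signal covering one light-second along x in one second of coordinate time.
light_like = interval_squared(dx=C, dy=0.0, dz=0.0, dt=1.0)

# An object at rest for one second: no spatial displacement, only elapsed time.
at_rest = interval_squared(dx=0.0, dy=0.0, dz=0.0, dt=1.0)

print(light_like, at_rest)
```

The light-like displacement gives a vanishing interval, while the resting object gives a negative $ds^2$ under this sign convention.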

However, typical attempts to quantize gravity still tend to conceptualize time as a geometrical axis of the coordinate system where the data is displayed.

Despite these kinds of conventional difficulties, Richard's analysis at hand reproduces both quantum mechanical and relativistic relationships from the exact same underlying framework, and in doing so it does in fact give a quantized representation of relativistic gravity, without employing the concept of space-time at all. In other words, while the presented framework is not really a theory about reality per se, it does effectively give a fully general representation of each and every possible unifying theory, or so-called "Theory of Everything".

Naturally, unifying both relationships under the same logical framework raises the question: what exactly is the critical difference between Richard's framework and the conventional form of these theories?

While many definitions picked up from the analysis correspond almost identically to concepts also defined in conventional physics - e.g., objects, "energy", "momentum", "rest mass" (p. 56-58) - there is one very crucial difference in the way time is conceptualized. Think carefully about the following:

The parameter $t$, or "time", is not defined as a coordinate axis. It was defined explicitly as an ordering parameter; elements associated with the same $t$ are taken to belong to the same "circumstance". And if you follow the development of the general rules (p. 14-40), you can see that under this notation, elements must be seen as obeying contact interactions. Meaning, only the elements that come to occupy the same position at the same "time" can interact. Notice how that statement about "time" is not consistent with the idea that clocks measure "time"; what clocks measure is a different concept, and great care must be taken not to tacitly confuse the two at any point of an analysis.

However, $\tau$, or tau, was defined as a coordinate axis, originally having nothing to do with time. It was named tau from the get-go because in the derivation of Special Relativity, $\tau$ displacement turns out to correspond to exactly what any natural clock must measure; it is closely related to the relativistic concept of proper time, which is typically symbolized with $\tau$. It is important to understand the difference though; in Special Relativity, $\tau$ is NOT a coordinate axis! Under the specific paradigm that we call Special Relativity, it cannot be generalized as a coordinate axis. The associated definitions of the entire framework need to be built accordingly.

Remember, in the analysis, $\tau$ was an "imaginary" coordinate value from the get-go; an object's position in $\tau$ has no impact on the evaluation of the final probabilities; its value is to be seen as completely unknown. On the other hand, an object's momentum along $\tau$ corresponds to its mass (Eq. 3.22), which is essentially a fixed quantity, and thus treated as completely known.

Which simply means that $\tau$ position cannot play a role in the contact interactions of the elements. You may conceptualize it as if every object is infinitely long along $\tau$, if you wish. Either way, projecting $\tau$ position out of object interactions is the notion which is typically missing in most attempts to describe relativity with Euclidean geometries, and without it you tend to become unable to define interactions meaningfully.

Note how this corresponds exactly to the fact that the fundamental equation we are working with is a wave equation. Since it is a wave equation, it is also governed by the uncertainty principle from the bottom up.

Meaning mass $m$ and $\tau$ must be related with $\sigma_m \sigma_{\tau} \geq \frac{\hbar}{2}$. Since the momentum along $\tau$ (defined as mass) is completely known, the $\tau$ position must be completely unknown.

Note that under the conventional description of Relativity, while $t$ is a coordinate axis against which all data is plotted, there is no such thing as a clock that measures $t$; each clock measures its own proper time, $\tau$. We just cast $t$ from the $\tau$ measurement. Insofar as we are willing to approximate our clocks as stationary (which they never are), and approximate the effects of gravity and its unpredictable fluctuations, that casting is rather trivial. But when you involve gravity, you involve curving and wiggling world lines for all the elements, including the time reference devices, and the relationship between $\tau$ and the coordinate axis you wish to plot your data with ($t$) becomes much harder to handle. It seems quite rational to investigate a paradigm where $\tau$ is in fact, by its very definition, a coordinate axis.

I won't repeat the derivation in the book, but please post a comment if anything in the derivation seems unclear, so it can be clarified by the author.

A few comments are in order though. Note that the final result of the deduction (Eq. 4.23) is not completely identical to the Schwarzschild solution; it contains an additional term, whose impact is extremely small. This is rather interesting, because it implies a possibility for experimental verification. On the other hand, it may also be simply an error by the author. I think this part requires enough experienced people to walk through the derivation and see if they can find errors. With so many people looking for a way to describe quantum gravity, I would think there are interested parties out there.

Wednesday, January 14, 2015

It has become quite obvious to me that practically no one seems to comprehend what I have put forth in my book. I recently attended a Sigma Xi conference in Glendale, Arizona where I spoke to several people about my logical presentation. From their reactions, I think I have a somewhat better comprehension of the difficulty they perceive. The opening chapters of the book seem to emphasize the wrong issues.

The two first chapters seem to overcomplicate a rather simple issue. I suggest one might consider the following post to be a simpler replacement of those opening issues.

The underlying issue I was presenting was the fact that our knowledge, from which we deduce our beliefs, constitutes a finite body of facts. This is an issue modern scientists usually have little interest in thinking about. See Zeno's paradox. My analysis can be seen as effectively uncovering some deep implications of the fact that our knowledge is built on a finite basis.

The same issue can be applied to human communication. Note that every language spoken by mankind to express their beliefs is also a construct based on a finite number of concepts. What is important here is that the number of languages spoken by mankind is greater than one. This fact also has implications far beyond the common perception.

Normal scientific analysis of any problem invariably ignores some issues of learning (and understanding) the language in which the problem is expressed. Large collections of concepts are presumed to be understood by intuition or implicit meanings. One should comprehend that one cannot even begin to discuss the logic of such an analysis with a newborn baby. In fact I suspect a newborn cannot even have any meaningful thoughts before some concepts have been created to identify their experiences. Any concepts we use to understand any problem had to be mentally constructed. The fact that multiple languages exist implies that the creation of those concepts arises from early experiences and that the representation itself is, to some degree, an arbitrary construct.

The central issue of my deduction is the fact that once one has come up with a theoretical explanation of some phenomena (that is, created a mental model of their experiences) the number of concepts they use to think is finite (maybe quite large but must nonetheless be finite). It follows that, being a finite collection, a list of the relevant concepts can be created. (Think about the creation of libraries, museums and other intellectual properties together with an inventory log.)

Once one has that inventory log, numerical labels may be given to each and every log entry. Using those numerical labels, absolutely every conceivable circumstance which can be discussed may be represented by the notation $(x_1,x_2,\cdots,x_n)$. Note that learning a language is exactly the process of establishing the meaning of such a collection from your experiences expressed with specific collections of such circumstances: i.e., if you have at your disposal all of the circumstances you have experienced expressed in the form $(x_1,x_2,\cdots,x_n)$, you can use that data to reconstruct the meaning of each $x_i$, as that is actually the central issue of learning itself.
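
To make the notation concrete, here is a small sketch of the inventory-log idea. It treats individual words as stand-ins for "concepts", which is a deliberate simplification for illustration; the function names (`build_inventory`, `encode`, `decode`) and the sample experiences are hypothetical and do not come from the book.

```python
def build_inventory(concepts):
    """Assign an arbitrary numerical label to every distinct concept."""
    distinct = sorted(set(concepts))
    return {concept: label for label, concept in enumerate(distinct, start=1)}


def encode(circumstance, inventory):
    """Represent a circumstance (a sequence of concepts) as (x_1, ..., x_n)."""
    return tuple(inventory[concept] for concept in circumstance)


def decode(encoded, inventory):
    """Recover the concepts from their numerical labels."""
    reverse = {label: concept for concept, label in inventory.items()}
    return tuple(reverse[x] for x in encoded)


# Toy "experiences"; note that circumstances need not have the same length.
experiences = [
    ("dog", "chases", "cat"),
    ("cat", "chases", "mouse"),
    ("mouse", "sleeps"),
]

inventory = build_inventory(c for e in experiences for c in e)
encoded = [encode(e, inventory) for e in experiences]
print(inventory)
print(encoded)
```

The particular numbers carry no meaning of their own; only the pattern of which labels appear together in which circumstances does.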

I would like to point out that just because people think they are speaking the same language does not mean their concepts are semantically identical. Each of them possesses what they think is the meaning of each specified concept. What is important here is that "what they think those meanings are" was deduced from their experiences with communications; i.e., what they know is the sum total of their experiences (that finite body of facts referred to above).

But back to my book. The above circumstance leads to one very basic and undeniable fact. If one has solved the problem (created a mental model of their beliefs) then they can express those beliefs in a very simple form: $P(x_1,x_2,\cdots,x_n)$ which can be defined to be the probability that they believe the specific circumstance represented by $(x_1,x_2,\cdots,x_n)$ is true. In essence, if they had an opinion as to the truth of the represented circumstance, $P(x_1,x_2,\cdots,x_n)$ could be thought of as representing their explanation of the relevant circumstance $(x_1,x_2,\cdots,x_n)$.

It is at this point that a single, most significant, observation can be made. Those labels, $x_i$, are absolutely arbitrary. If any specific number is added to each and every numerical label $x_i$ in the entire defined log, nothing changes in the patterns of experiences from which the solution was deduced. In other words the following expression is absolutely valid for any possible solution representing any possible explanation (what is ordinarily referred to as one's belief in the nature of reality itself); i.e.,

What is important here is that, if this were a mathematical expression, it would be exactly the definition of the derivative of $P(x_1+a,x_2+a,\cdots,x_n+a)$ with respect to a.

If $P(x_1,x_2,\cdots,x_n)$ were a mathematical expression the above derivative would lead directly to the constraint that $$\sum_{i=1}^n\frac{\partial\;}{\partial x_i}P(x_1,x_2,\cdots,x_n)\equiv 0.$$ However, it should be evident to anyone trained in mathematics that the expression defined above does not satisfy the definition of a mathematical expression for a number of reasons.
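
For readers who want the intermediate step spelled out, here is a sketch of how that constraint would follow. It assumes, purely for illustration, that $P$ is a differentiable function and that the arbitrariness of the labels can be written as $P(x_1+a,x_2+a,\cdots,x_n+a)=P(x_1,x_2,\cdots,x_n)$ for every $a$. Writing $z_i = x_i + a$, so that $\partial z_i/\partial a = 1$, the chain rule gives

$$\frac{d}{da}P(x_1+a,x_2+a,\cdots,x_n+a)=\sum_{i=1}^n\frac{\partial P}{\partial z_i}\frac{\partial z_i}{\partial a}=\sum_{i=1}^n\frac{\partial P}{\partial z_i}=0.$$

The right-hand side vanishes because the left-hand side is the derivative of something that does not depend on $a$. Evaluating at $a=0$, where $z_i=x_i$, gives the sum of partial derivatives with respect to the $x_i$ stated in the constraint.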

The reader should comprehend that there are two very significant issues before even continuing the deduction. First, the numerical labels $x_i$ are not variables (they are fixed numerical labels) and second, the actual number of concepts labeled by those $x_i$ required to represent a specific circumstance of interest is not fixed in any way. (Consider representing a description of some circumstance in some language; the number of words required to express that circumstance can not be a fixed number for all possible circumstances.)

The remainder of chapter 2 is devoted to handling all the issues related to transforming the above derivative representation into a valid mathematical function. Any attempt to handle the two issues above will bring up additional issues which must be handled very carefully. The single most important point in that extension of the analysis is making sure that no possible explanation is omitted in the final representation: i.e., if there exist explanations which can not be represented by the transformed representation the representation is erroneous.

There is another important aspect of such a representation. Though the number of experiences standing behind the proposed expression $P(x_1,x_2,\cdots,x_n)$ is finite, the number of possibilities to be represented by the explanation must be infinite (the probability of truth must be representable for all conceivable circumstances $(x_1,x_2,\cdots,x_n)$).

I take care of the first issue by changing the representation from a list of numerical labels to a representation by patterns of points in a geometry. This would be something quite analogous to representation via printed words or geometric objects representing the relevant concepts. I handle the second issue by introducing the concept of hypothetical objects, a very common idea in any scientific explanation of most anything.

At this point another very serious issue arises. If the geometric representation is to represent all possible collections of concepts, that geometry must be Euclidean. This is required by the fact that all non-Euclidean geometries introduce constraints defining relationships between the represented variables. Only Euclidean geometry imposes absolutely no constraints on the relationships between the represented variables. This is an issue many theorists omit from their consideration.

I look forward to any issues which the reader considers to be errors in this presentation.

Sunday, September 7, 2014

So let's take a look at how Special Relativity falls out from the epistemological arguments. Do not mistake this for the full analytical defense of the argument; for that, it is best to follow the details presented in the book.

To understand the significance of this type of derivation of Special Relativity, just keep a close eye on the fact that none of the arguments presented depends on any speculative nature of the universe, or requires any experimental result of any type. Time to do some thinking!

The first significant fact is noted on page 69: equation (2.23), which represents the four universal epistemological constraints of an explanation as a single equation (see this post for some more comments about those constraints), is a linear wave equation of fixed velocities (apart from the interaction term, the one with the Dirac delta function). (Note: page 69 appears to contain a typo; it refers to equation (3.29) when it should be (2.23).)
Equation (2.23):
$$\left \{ \sum_i \vec{\alpha}_i \cdot \vec{\nabla}_i + \sum_{i \neq j} \beta_{ij} \delta(\vec{x}_i - \vec{x}_j) \right \}\Psi=\frac{\partial}{\partial t}\Psi = im\Psi$$

This equation represents fundamental constraints that any self-consistent explanation / world-view / theory must satisfy, but it does not in any way define what kinds of elements the explanation consists of. I.e. it doesn't imply any assumptions as to what the world is; any kinds of defined objects can be represented with the associated notation (defined in the opening chapters).

In practice this means any self-consistent explanation can be fully represented in such a way that it directly satisfies the above equation.

The fact that the equation happens to be a wave equation of fixed velocity simply means any self-consistent explanation can be fully represented in a form where all of its defined elements move at constant fixed velocities.

The second significant fact is that the equation (2.23) can be taken as valid only in one particular coordinate system; the "rest frame" of whatever is being represented. That is to say, you cannot represent your solution in terms of moving coordinate systems without employing some kind of transformation.

Third fact; if an explanation has generated object definitions in such a manner that the "rest of the universe" can be ignored when representing those objects, it implies the same equation (2.23) must also represent a valid constraint for representing a single object. To quote myself from the Schrödinger post;

"Note further that if it was not possible - via reasonable approximations or otherwise - to define microscopic and macroscopic "objects" independently from the rest of the universe, so that those objects can be seen as universes unto themselves, the alternative would be that any proposed theory would have to constantly represent state of the entire universe. I.e. the variables of the representation would have to include all represented coordinates of everything in the universe simultaneously."

In other words, if an explanation contains object definitions where objects can be represented independently from the rest of the universe, then there must also exist a valid transformation between the rest frames of those individual objects, in such a manner that the equation (2.23) preserves its form as it is.

Philosophically, if each object is truly independently represented, it is the same thing as saying that there is no meaningful way to define a universal rest frame; at least not in terms of the dynamic rules defined by the explanation.

And the fact that the equation (2.23) preserves its form means it can be seen as a wave equation of fixed velocity inside the rest frame of any individual object. This should start to sound familiar to those who understand Relativity; we are fast approaching the fact that the speed of information can be defined as isotropic across reference frames, because it is already guaranteed that a valid transformation mechanism exists, that gives you that option. Lorentz' transformation is exactly such a valid mechanism, and can be employed here directly.
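
As a quick numerical illustration of why the Lorentz transformation qualifies as such a mechanism, the sketch below applies the standard textbook boost along $x$ to a signal travelling at the fixed speed and checks that it still travels at that speed in the moving frame. This is the conventional form of the transformation, not a reproduction of the book's derivation; the function name `lorentz_boost` and the units with $c = 1$ are illustrative choices.

```python
import math


def lorentz_boost(x, t, v, c=1.0):
    """Transform the event (x, t) into a frame moving at velocity v along x."""
    if abs(v) >= c:
        raise ValueError("boost velocity must be smaller than c in magnitude")
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    x_prime = gamma * (x - v * t)
    t_prime = gamma * (t - v * x / c**2)
    return x_prime, t_prime


c = 1.0
t = 2.0
x = c * t  # a signal that left the origin at speed c has reached x = c*t

x_prime, t_prime = lorentz_boost(x, t, v=0.6, c=c)
speed_in_moving_frame = x_prime / t_prime
print(speed_in_moving_frame)
```

By contrast, a Galilean shift $x' = x - vt$, $t' = t$ would give the signal a speed of $c - v$ in the moving frame, so the fixed-velocity form of the equation would not be preserved.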

But what about this business of every element constantly moving at fixed speed? (Let's call it C)

Remember, the notation defined in the first chapters contained the concept of an imaginary $\tau$ (tau) axis, defined to ensure no information is lost in the presentation of an explanation. It is a feature of the notation and has got only epistemological meaning. By its definition of being an imaginary axis created for notational purposes, it is meaningless what position the objects take along $\tau$. Or to be more accurate, the probability of finding an object at a specific location along $\tau$ is not a meaningful concept. But the velocity along $\tau$ is meaningful, and plays a role in representing the dynamics of the explanation. It is certainly possible to represent explanations without this concept, but the equation (2.23) was defined under a terminology that requires it.

And since we are using it, it just means objects that are at rest in $(x, y, z)$ space (if we wish to use 3 dimensional representation of the world), will be moving at velocity C in $\tau$.

On the other hand, anything moving at velocity C in $(x, y, z)$ space has 0 velocity along $\tau$, which is rather interesting in light of the definition of "rest mass" on page 57, which is directly related to the velocity of the object along $\tau$. So during the Schrödinger deduction, we already reached a point where any defined object can be identified as having energy, momentum and mass exactly as they manifest themselves in terms of modern physics (including all of their relationships), simply via appropriately defining what we mean by those terms. And now we have reached a point where any object moving at velocity C in $(x, y, z)$ space cannot have any mass. Not because the world happens to be built that way, but because a meaningful definition of mass yields that result.

Note that this is in sharp contrast to common perspective, where mass is seen as a fundamental ontological thing that objects just have. Here, it is intimately tied to how all the associated concepts are defined. It simply means that anything moving at velocity C must have no mass, by the definition of mass, energy, momentum, C, and a host of other definitions that these definitions require.
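
The two limiting cases above (at rest in $(x, y, z)$ means velocity C along $\tau$; velocity C in $(x, y, z)$ means no velocity along $\tau$) can be sketched numerically. The interpolation between them used here assumes that the total speed in the Euclidean $(x, y, z, \tau)$ space is fixed at C, so the components combine by the Pythagorean rule; the function name `tau_velocity` and the choice of units with C = 1 are illustrative and not taken from the book.

```python
import math


def tau_velocity(vx, vy, vz, c=1.0):
    """Velocity along tau, assuming the total speed in (x, y, z, tau) is fixed at c."""
    spatial_speed_sq = vx**2 + vy**2 + vz**2
    if spatial_speed_sq > c**2:
        raise ValueError("spatial speed cannot exceed c when the total speed is fixed at c")
    return math.sqrt(c**2 - spatial_speed_sq)


at_rest = tau_velocity(0.0, 0.0, 0.0)       # all of the motion is along tau
light_like = tau_velocity(1.0, 0.0, 0.0)    # no motion is left along tau
intermediate = tau_velocity(0.6, 0.0, 0.0)  # part of the motion is along tau

print(at_rest, light_like, intermediate)
```

Since the text ties rest mass to the motion along $\tau$, the `light_like` case is the one that corresponds to an object with no mass.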

From page 71 onward the Special Relativistic time measurement relationships are demonstrated simply via defining how a photon oscillator operates (representing a clock, or the internal dynamics of any macroscopic object), under the terminology established thus far. It should not be very difficult to follow those arguments to their logical conclusion.

Just in case the drawn diagrams create some confusion, here are simple animated versions;

A stationary photon oscillator (a photon and two mirrors) is defined as:

All the elements are represented with "infinite length" along $\tau$ because the position in $\tau$ is not a meaningful concept. The velocity of all the elements is fixed to C, but in directions orthogonal to each other.

When the same construction is represented from a moving coordinate system, it looks like this;

The "self-coherence reasons" alluded to there are exactly the reasons why Lorentz transformation must be employed (see the beginning of this post).

So this is effectively a simple geometrical proof of time relationships being expected to have exactly the same form as they have in Special Relativity, but having absolutely nothing at all to do with any speculative ontology of the world (such as space-time or any ontological nature of simultaneity), or even with Maxwell's Equations per se.
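
The time relationship itself is easy to check numerically. The sketch below uses the conventional textbook light clock, in which the mirrors move perpendicular to their separation, rather than the book's $\tau$-based diagrams, so it should be read as an illustration of the resulting relationship and not of the book's construction. The function names and the units with C = 1 are assumptions made for the example.

```python
import math


def light_clock_tick(mirror_separation, v, c=1.0):
    """Round-trip time of the photon when the clock moves at speed v.

    The photon always travels at c. While the mirrors advance at v, only
    sqrt(c^2 - v^2) of that speed is available for crossing the gap.
    """
    if abs(v) >= c:
        raise ValueError("the clock must move slower than c")
    crossing_speed = math.sqrt(c**2 - v**2)
    return 2.0 * mirror_separation / crossing_speed


def lorentz_factor(v, c=1.0):
    """Standard time-dilation factor of Special Relativity."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


rest_tick = light_clock_tick(1.0, v=0.0)
moving_tick = light_clock_tick(1.0, v=0.6)
ratio = moving_tick / rest_tick
print(ratio, lorentz_factor(0.6))
```

The ratio of the two tick times matches the Lorentz factor, which is the same form the geometric argument arrives at.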

None of the arguments so far have made any comments about how the world is; everything is revolving around the ideas of what kind of representation can always be seen as valid, under appropriate definitions that are either forced upon us due to self-consistency requirements, or always available for us as arbitrary terminology choices.

So none of this could possibly tell you how the world really is. There is merely scientific neutrality in recognizing in what sense we really don't know how things are; we just know how things can be represented. It can also help us in recognizing that there really are all kinds of ways to generate the same predictions about the universe, which is to say there are different valid ways to understand the universe.

Next destination will be slightly more challenging to analyze; General Relativity...

Saturday, September 6, 2014

Alright, time to talk a little bit about how Special Relativity falls out from the analysis (see the book on the right). This one is in my opinion quite a bit easier to understand than the deduction of Schrödinger, at least if the reader is familiar enough with the logical mechanisms of Special Relativity.

The first signs of relativistic relationships arose in Chapter 3, during the deduction of Schrödinger Equation (see the discussion relating to the $mc^2$ term on page 55). Chapter 4 contains a rather detailed presentation of the explicit connection between Relativity, and general epistemological requirements. If you have followed the arguments through Schrödinger, Relativity falls out almost trivially. In fact, someone who is familiar enough with the logical mechanisms behind Special Relativity, might already guess how things play out after reading just the first couple of pages of the chapter.

Unfortunately most people have a rather naive perspective towards Relativity, revolving more around ontological beliefs about space-time constructions than the actual logical relationships that the theory is composed of. The space-time construction is just a specific interpretation of Special Relativity, and if the reader is under the impression that this interpretation is necessary, then the logical connection between fixed velocity wave functions and relativistic relationships might not be too obvious.

Because the space-time interpretation is so common, I think it's best to first point out a few facts that everyone having any interest in understanding Relativity should be aware of.

In my opinion, one of the best sources of information on Special Relativity is still Einstein's original paper from 1905, because unlike most modern presentations, that paper was still reasonably neutral in terms of any speculative ontologies. In the paper, the relativistic relationships are viewed more as requirements of Maxwell's equations of electromagnetism (for instance, the moving magnet and conductor paradox is mentioned in the very opening) under the fact that the one-way speed of light cannot be measured (something that was well understood at the time). Relativistic space-time is not mentioned anywhere, because that idea arose later, as Hermann Minkowski's interpretation of the paper. Later, space-time became effectively synonymous with Relativity because the later developments, especially General Relativity, happened to be developed under that interpretation.

The history of Relativity is usually explained so inaccurately, filled with ridiculous myths and misconceptions, that it seems almost magical how Einstein came up with such a radical idea. But once you understand the context in which Einstein wrote the paper, the steps that led to Special Relativity start to look quite a bit more obvious. I would go so far as to say pretty much inevitable.

When Einstein was writing the paper, it was well understood within the physics community that the finiteness of the propagation speed of information (the speed of light) led to the impossibility of measuring the simultaneity of any two spatially separated events. If the events look simultaneous, it is not possible to know whether they actually were simultaneous without knowing the correct time delays from the actual events. To know these delays, you would have to first know the speed of light.

But that leads directly to another fact which was also well understood back then: the one-way speed of light cannot possibly be measured. In order to measure the speed, you need two clocks that are synchronized, which is the same thing as making sure the clocks started timing simultaneously.

That is to say, you cannot synchronize the clocks without knowing the one-way speed of light, and you can't know the one-way speed of light, without synchronizing the clocks. Note that clocks are going to be by definition electromagnetic devices, thus there must be an expectation that moving them can affect their running rate. The strength of the effect is expected to depend on the one-way speed of light in their direction of motion. Which just means you can't even synchronize the clocks first and then move them.

So here we have run into a very trivial circular problem, arising directly from the finiteness of information speeds, that cannot be overcome without assumptions. It wasn't possible then, it's not possible now, and it will not be possible ever, by definition. This problem is not that difficult to understand, yet we still have people performing experiments where they attempt to measure the one-way speed of light, without realizing the trivial fact that in order to interpret the results, we must simply assume some one-way speed of light. That is exactly the same thing as measuring whether water runs downhill, after defining downhill to be the direction where water runs.

As simple as this problem is, it does not exist in the consciousness of most people today, because they keep hearing about all kinds of accurate values for the speed of light. It is almost never mentioned that these measurements actually refer either to average two-way measurements (which use a single clock), or to the Einstein convention of clock synchronization (the convention established in his paper; assume C to be isotropic in all inertial frames, and synchronize the clocks under that assumption).
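
The difference between a two-way measurement and a synchronization convention can be sketched numerically. The parametrization used below is Reichenbach's $\epsilon$-synchronization, which is not mentioned in this post and is brought in here only as a convenient, well-known way to write down "some assumed one-way speed of light"; $\epsilon = 1/2$ is the Einstein convention. The function names and the units with C = 1 are illustrative.

```python
def one_way_speeds(epsilon, c=1.0):
    """Outbound and return speeds of light under an assumed synchronization.

    epsilon = 0.5 is the Einstein convention (the same speed c both ways).
    Any 0 < epsilon < 1 leaves the two-way average speed equal to c.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    outbound = c / (2.0 * epsilon)
    inbound = c / (2.0 * (1.0 - epsilon))
    return outbound, inbound


def round_trip_time(distance, epsilon, c=1.0):
    """Time shown by a single clock for light to reach a mirror and come back."""
    outbound, inbound = one_way_speeds(epsilon, c)
    return distance / outbound + distance / inbound


times = {eps: round_trip_time(1.0, eps) for eps in (0.1, 0.25, 0.5, 0.9)}
print(times)
```

Every choice of $\epsilon$ yields the same round-trip time on the single clock, which is why a two-way measurement cannot distinguish between them; a one-way value appears only after a synchronization convention has been assumed.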

The next important issue to understand is how exactly Einstein's paper is related to the aether theories of the time. The physics community was working with a concept of an aether, because it yielded the most obvious interpretation of C in Maxwell's Equations, and implied some practical experiments. Long story short, the failure to produce the expected experimental results led Hendrik Lorentz (and George FitzGerald) to come up with an idea of length contraction affecting moving bodies (relating to variations in the propagation speeds of internal forces), designed to explain why natural observers could not measure aether speeds.

The significance of this development is that the transformation rules Lorentz came up with survive to this day as the central components of Einstein's theory of Special Relativity; that is why the relativistic transformation is called Lorentz transformation.

In terms of pure logic, Lorentz' theory and Special Relativity were effectively both valid. The difference was philosophical. For anyone who understands relativity, it is trivial to see that Lorentz' theory can be seen as completely equivalent to Einstein's in the logical sense; just choose arbitrarily any inertial frame, and treat that as the frame of a hypothetical aether. Now any object moving in that frame will follow the rules of Lorentz transformation, just like they do in Special Relativity, and all the other consequences follow similarly. For a natural observer, everything looks exactly the same.

When Einstein was thinking about the problem, he had Lorentz' theory in one hand, and the fact that one-way speed of light / simultaneity of events cannot be meaningfully measured on the other hand. It doesn't take an Einstein (although it did) to put those together into a valid theory. Since the universal reference frame could be, and would have to be, arbitrarily set - as far as any natural observer goes - it is almost trivial to set C to be isotropic across inertial frames, and let the rest of the definitions play out from there.

So getting to Special Relativity from that junction is literally just a matter of defining C to be isotropic, not because you must, but because you can. Setting C as isotropic across inertial frames is exactly what the entire paper about Special Relativity is about. Note that the language used in the paper is very much about how things are measured by observers, when their measurements are interpreted under the convention defined in the paper.

While Lorentz' theory was seen as a working theory, it was also seen as containing a redundant un-observable component; the aether. Thus Einstein's theory would be more scientifically neutral; producing the same observable results, but not employing a concept of something that cannot possibly be observed by its own definition.

And just judging from the original paper itself, this is certainly true. But there is great irony in that Einstein's theory would then get a redundant un-observable component tacked onto it within a few years of its introduction; the relativistic space-time.

A typical knee-jerk reaction at this point is to argue that relativistic space-time is a necessary component by the time we get to general relativity, but that is actually not true. General relativity can also be expressed via different ontological interpretations; some may be less convenient than others depending on purpose, but it is certainly possible.

Another reaction is to point out that different ontological interpretations of Relativity do not fall under the category of physics at all. This is true; but it is also true when it comes to relativistic space-time.

There really are some very solid epistemological reasons for the validity of relativistic time relationships, which have nothing to do with aether concepts, relativistic space-time concepts, or any other hypothetical structure for the universe. "Epistemological reasons" means purely logical reasons that have nothing to do with the structure of the universe, but everything to do with how we understand things, and those reasons are what Chapter 4 is all about.

I will write a post more directly related to the arguments of Chapter 4 very soon.

Wednesday, July 16, 2014

It can be a little bit confusing - especially through the first chapters - to grasp what exactly is being developed in the analysis (see the book on the right). It is easy to miss that what is being developed is not a physical theory, but an argument about physical theories; every argument exists an abstraction level above physics. The first chapters especially are highly abstract in themselves, as they meticulously detail an epistemological framework for an analysis (a framework that is most likely completely unfamiliar to the reader).

I find that people typically start to get a better perspective on the analysis when the Schrödinger equation suddenly appears in plain sight in a rather surprising manner.

So to provide a softer landing, I decided to write some commentary and thoughts on that issue. Don't mistake this for an analytical defense of the argument; for that it's better to follow the arguments in the book. I'm merely providing a quick overview, and pointing out some things worth thinking about.

In Richard's analysis this expression arises from radically different considerations than its historical twin, and the steps that were taken to get here offer some insight as to why Schrödinger's equation is a valid approximate general description of the entire universe.

To get some perspective on the issue, let me do a quick hand-wavy run through some of the history of the Schrödinger equation.

Usually the $\Psi$ in the Schrödinger equation is interpreted as representing a probability amplitude (whose absolute square is the probability of observation results). Erwin Schrödinger didn't arrive at this representation by trying to describe probability amplitudes per se. Rather, it arose from a series of assumptions and definitions that were simply necessary to explain some observations, under certain pre-existing physics definitions;

Max Planck created a concept of "discrete energy bundles" as part of his explanation for "black body radiation". Motivated by this, Albert Einstein suggested that electromagnetic radiation exists in spatially localized bundles - call them "photons" - while they would still preserve their wave-like properties where suitable. Consequently, Louis de Broglie showed that wave behavior would also explain observations related to electrons, and further hypothesized that all massive particles could be associated with waves describing their behavior.

Schrödinger set out to find a wave equation that would represent the behavior of an electron in 3 dimensions; in particular he set out to find a wave equation that would correctly reproduce the observed emission spectrum of a hydrogen atom.

Schrödinger arrived at a valid equation, which would "detail the behavior of wave function $\Psi$ but say nothing of its underlying nature". There is no underlying reasoning that would have directly led Schrödinger to this particular equation; there is just the simple fact that this is the form that yields valid probabilistic expectations for the observed measurement outcomes.

It was only after the fact that the desire to explain what in nature makes the Schrödinger equation valid would lead to a multitude of hypothetical answers (i.e. all the ontological QM interpretations).

Effectively what we know is that the equation does correctly represent the probabilities of measurement outcomes, but all the ideas as to why, are merely additional beliefs.

Just as a side-note, every step on the historical route towards the Schrödinger equation represents a small change in the pre-existing physics framework; every step contains an assumption that the pre-existing framework represents reality mostly correctly. Physical theories don't just spring out from radically new perspectives, but rather they tend to be sociologically constructed as a logical evolution from previous paradigms, generalized to explain a wider range of phenomena. These successful generalizations may cause revolutionary paradigm shifts, as was the case with the Schrödinger equation.

Alright, back to Richard's analysis. I will step backwards from equation (3.15) (the Schrödinger equation) to provide a quick but hand-wavy preview of what it is all about.

First note that here, too, $\Psi$ is related to the probability of a measurement outcome via $P = \Psi^{\dagger} \cdot \Psi$. But it has not been interpreted as such after the fact; rather it has been explicitly defined at the get-go as any unknown function that yields observed probabilities in a self-consistent fashion; any function that does so can be seen as a valid explanation of some data.

For more detail on this definition, see equation (2.13) on page 33 and the arguments leading up to it. Note especially that the definition (2.13) is specifically designed to not exclude any possible functions embedded inside $\Psi$. It is important that this move does not close out any possibilities pre-emptively; if it did, we would have just made an indefensible assumption about the universe. The definition (2.13) will have an impact on the mathematical appearance of any expressions, but this is most correctly seen as an abstract (and in many ways arbitrary) mathematical terminology choice. Its consequences should be viewed as purely epistemological (purely related to the mental terminology embedded in our explanation), not ontological (relating to nature in itself). E.g. the squaring comes from the definition of probability, and $\Psi$ being a complex function simply impacts the apparent form of its constraints (which play an important role in the form of the Schrödinger equation, as becomes evident a little later).
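
A quick illustration of the relationship $P = \Psi^{\dagger} \cdot \Psi$, for the simplest case where $\Psi$ takes a single complex value. The symbols $a$, $b$ and $\theta$ are introduced here only for illustration, with $a$ and $b$ real:

$$
\Psi = a + ib \quad \Rightarrow \quad P = \Psi^{\dagger} \cdot \Psi = (a - ib)(a + ib) = a^2 + b^2 \geq 0
$$

So $P$ comes out real and non-negative no matter what $\Psi$ is. Multiplying $\Psi$ by an overall phase factor also leaves $P$ untouched:

$$
\left(e^{i\theta}\Psi\right)^{\dagger} \cdot \left(e^{i\theta}\Psi\right) = e^{-i\theta} e^{i\theta} \, \Psi^{\dagger} \cdot \Psi = P
$$

In that sense many different functions $\Psi$ represent exactly the same probabilities, which fits the idea that the form of $\Psi$ is a choice of terminology.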

Let's jump to equation (3.14), which is logically equivalent to the Schrödinger equation; only a couple of simple algebraic steps stand between the two expressions. There is absolutely no reason to do these steps other than to point out that the equations do indeed represent the same constraint.

As is mentioned in the book, (3.14) was obtained as a starting point for a perturbation attack, to find the constraints on a $\Psi$ for a single element, in an unknown universe (under a few conditions, which I'll return to).

To get some idea of what that means, let me first point out that the underlying constraints embedded into the equation have a deceptively simple source. Equations tagged as (2.7) on page 26 are;

which simply means that, under any explanation of any kind of universe, the probability of an outcome of any measurement is always a function of some data points that provide a context (the meaning) for that measurement. But the probability is never a function of the assignment of labels (names) to those data points.

Alternatively you can interpret this statement in terms of an abstract coordinate system $(x, \tau)$ (a view also developed carefully in the book), in which case we could say, the probability of an outcome of a measurement is not a function of the location of the context inside the coordinate system. Effectively that is to say that the defined coordinate system does not carry any meaning with its locations. After all, it is explicitly established as a general notation capable of representing any kind of explanation.

Note that what the data points are, and what they mean, is a function of each and every possible explanation. Thus the only constraints that are meaningful to this analysis are those that would apply to any kind of assignment of labels.
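
The label-independence described above can be sketched in symbols as follows. This is only an illustration for the $x$ labels, using a shift parameter $a$ introduced here; it is not necessarily the exact form the book uses for (2.7). If shifting every label by the same amount changes nothing,

$$
P(x_1 + a, x_2 + a, \ldots, x_n + a) = P(x_1, x_2, \ldots, x_n)
$$

then differentiating with respect to $a$ and setting $a = 0$ gives

$$
\sum_{i=1}^{n} \frac{\partial}{\partial x_i} P(x_1, x_2, \ldots, x_n) = 0
$$

which is the general shape a constraint of this kind takes: a statement about derivatives that holds regardless of what the labels mean.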

Note that an exactly similar symmetry constraint is defined for the partial derivative with respect to $t$, the index for time-wise evolution. See (2.19), where it is expressed against $\Psi$.

The other type of underlying constraint is represented in equation (2.10) with a Dirac delta function, meaning at its core that different data points cannot be represented as the same data point by any self-consistent explanation; a rather simple epistemological fact.

The definition of $\Psi$ via $P = \Psi^{\dagger} \cdot \Psi$, together with some algebra, leads to a succinct expression of these universal epistemological constraints as a single equation (2.23)

which was somewhat useless for me to write down here as you need to view the associated definitions in the book anyway to understand what it means. You can see page 39 for definitions of the alpha and beta elements, and investigate the details of this expression better there. For those who want to just carry on for now, effectively this amounts to be a single expression representing exactly the above-mentioned constraints - without creating any additional constraints - on $\Psi$.

View this as a constraint that arises from the fact that any explanation of anything must establish a self-consistent terminology to refer to what-is-being-explained, and this is the constraint that any self-consistent terminology in itself will obey, regardless of what the underlying data is.

Chapter 3 describes the steps from this point onward, leading us straight into Schrödinger's expression. It is worth thinking about what those steps actually are.

The first steps are concerned with algebraic manipulations to separate the collection of elements into multiple sets under the common probability relationship P(set #1 and set #2) = P(set #1)P(set #2 given set #1) (page 45).

These steps lead us to equation (3.6), which is an exact constraint that a single element must obey in order to satisfy the underlying epistemological constraints. But this expression is still wonderfully useless, since we don't know anything about the impact of the rest of the universe (the $\Psi_r$).

From this point on, the moves that are made are approximations that cannot be defended from an ontological point of view, but their epistemological impact is philosophically significant.

The first move (on page 48) is the assumption that there is only negligible feedback between the rest of the universe and the element of interest. Effectively the universe is taken as stationary in time, and the element of interest is assumed to not have an impact on the rest of the universe.

Philosophically this can be seen in multiple ways. Personally I find it interesting to think about the fact that, if there exists a logical mechanism to create object definitions in a way where those objects have negligible feedback to the rest of the universe, then there are rather obvious benefits in simplicity for adopting exactly such object definitions, whether or not those definitions are real or merely a mental abstraction.

Note further that if it was not possible - via reasonable approximations or otherwise - to define microscopic and macroscopic "objects" independently from the rest of the universe, so that those objects can be seen as universes unto themselves, the alternative would be that any proposed theory would have to constantly represent the state of the entire universe. I.e. the variables of the representation would have to include all represented coordinates of everything in the universe simultaneously.

That is to say, whether or not reality was composed of complex feedback loops, any method of modeling probabilistic expectations with as few feedback mechanisms as possible would be desirable, and from our point of view such explanations would appear to be the simplest way to understand reality.

The next steps are just algebraic moves under the above assumption, leading to equation (3.12) on page 50. Following that equation, the third and final approximation is set as;

$$
\frac{\partial}{\partial t} \Psi \approx -iq\Psi
$$

which leads to an expression that is already effectively equivalent to Schrödinger's equation, simply implying that this approximation plays a role in the exact form of Schrödinger's Equation. See more commentary about this from page 53 onward.
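
To get a feel for what the approximation $\frac{\partial}{\partial t} \Psi \approx -iq\Psi$ says, it helps to treat it as an exact equation with $q$ a real constant. That is an assumption made here purely for illustration; see the book for what $q$ actually stands for. The solution is then a pure phase rotation:

$$
\frac{\partial}{\partial t} \Psi = -iq\Psi \quad \Rightarrow \quad \Psi(t) = \Psi(0) \, e^{-iqt}
$$

and the associated probability does not change in time:

$$
P(t) = \Psi^{\dagger}(t) \cdot \Psi(t) = e^{iqt} e^{-iqt} \, \Psi^{\dagger}(0) \cdot \Psi(0) = P(0)
$$

Under that simplified reading, the approximation isolates the part of the time dependence that has no effect on the probabilities themselves.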

And there it is; the equation that implies wave-particle duality for the entire universe, and that yielded a revolution in the concepts of modern physics, arises from entirely epistemological constraints, and a few assumptions that are forced upon us to remove overwhelming complexity from a representation of a universe.

The steps that got us here tell us exactly what makes Schrödinger Equation generally valid. When we create our world view, we define the elements (the mental terminology) with exactly the same epistemological constraints that would also yield Schrödinger Equation in Richard's analysis. The only difference between different representations (everyday, classical, or quantum mechanical) is that different approximations are made for simplicity's sake.

The steps that the field of physics took towards the Schrödinger equation were always dependent on the elements that had already been defined as a useful representation of reality. They were merely concerned with creating generalized expressions that would represent the behavior of those elements.

The so-called quantum mystery arises from additional beliefs about the nature of reality - redundant beliefs that the elements we define as part of our understanding of reality are also ontological elements in themselves. There exist many different beliefs (QM interpretations) that each yield a possible answer to the nature behind quantum mechanics, but scientifically speaking, there is no longer a need to explain the validity of the Schrödinger equation from any hypothetical ontological perspective.

So it appears the critical mistake is to assume that the particles we have defined are also in themselves real objects, from which our understanding of reality arises. Rather the converse appears to be true; a useful method of representing the propagation of probabilistic expectations between observations is driving what we have defined as objects, and consequently this epistemological issue critically affects how we view the universe meaningfully in the first place. After all, understanding reality is meaningful only in so far as we can correctly predict the future, and the only meaningful object definitions have to be directly related to that fact.

Thus, to avoid redundant beliefs in the subject matter, Schrödinger equation can be seen simply as a valid general representation of the propagation of our expectations (of finding a defined particle) between observations, governed by the same constraints that govern what we define as "objects". The exact reason why the particles "out there" appear to be reacting to our mere observations is that the pure observation X implies the existence of a particle Y only in our own mental classification of reality. That is why the particles that we have defined do not behave as particles would in-between observations.

To assume the ontological existence of associated particles as we have defined them, is not only redundant, but also introduces a rather unsolvable quantum mystery.

Saturday, May 10, 2014

"Thus it is that I propose using the common mathematical notation $(x_1,x_2,...,x_n)$ to represent any specific "circumstance" of interest."

It is common for people to raise questions and ask for examples about the intended meaning of the $x_i$ elements, because there appear to be a few too many ways to understand the intent of the author.

In this case the intent is specifically to not define any particular meaning for these elements; the possibilities must be kept completely open. What is being developed is a notation capable of representing any self-consistent explanation of reality in an abstract manner. At this junction, it is not important whether or not we know how to actually perform a translation from some existing explanation to the proposed notation. The analysis up ahead does not rely on any specific meanings of the labels; it only relies on self-consistency issues. What is important is to understand in what way the notation itself does not impose constraints on what could be represented in principle.

Any explanation of reality operates under the idea that some collection of (well defined) indivisible elements exists, and the existence of a particular collection of those elements, or particular states of some elements, leads into some expectations about the future as per our beliefs regarding how those elements behave.

Thus, to have a generic notation that does not speculate about the existence of anything in particular at the outset, there needs to be a completely generic way to write down what elements exist (in some relevant situation or circumstance) according to some explanation. Whether or not the explanation conceives reality as collections of elements, or collections of states to some elements, the relevant information can be represented as a collection of numerical $x_i$ labels, where the meaning of each label can only be understood if that specific explanation is understood.

Note; it is natural to the notation defined here, that multiple different understandings of the same collections of elements can potentially exist. That is exactly the issue facing us as we are developing explanations about reality. If we have an explanation for reality that holds any predictive power, then we also have defined some elemental, indivisible entities with more or less persisting identity to themselves. By definition, we cannot write down "noumenal" reality in itself, we can only write down something we have already defined to constitute an element.

It is important that the reader understands that what is being developed is not an argument about reality, but an argument about our explanations of reality.