A little while ago (or at least it was recently when I started planning this post out), the commenter Matthew posted a number of interesting comments which were certainly worthy of a response with a blog post rather than an inline reply. I responded to the first two comments a couple of posts ago. That post, however, grew too long both in length and writing time, and I decided to defer my discussion of the third comment to a subsequent post, which is this one. I note that Matthew has responded to my original post; I'll probably get back to that in a few posts' time.
The comment in question is this one, and I recommend that readers look at it before coming back here.
Before I begin, I should also warn any readers that my views on the topic are still developing (partly because responses such as Matthew's force me to rethink things, and consider the topic in more depth), and what I write here might not be fully consistent with what I have written before or elsewhere. I also ought to begin by outlining the underlying science, and the non-controversial aspects of the problem.
The idea behind Bell's theorem is straightforward. I simplify a bit here; there are two more precise explanations of Bell's theorem (from different points of view) described below. You have two particles with an entangled quantum state; that is to say, the states of the two particles are in some way linked together. When one measures a particular observable for one particle, then a measurement of the same observable for the other particle must give a complementary result. This is true no matter how far apart the particles are. If the result of the measurement is determined at the time of the measurement (as held by, for example, the Copenhagen interpretation), then there must be instantaneous communication between the two experiments, so the particle at the second detector knows what answer it has to give to be consistent with the result at the first. This instantaneous communication seems to violate the principles behind special relativity.
The second requirement to derive Bell's inequality, or something like it, is that there should be several observables which are correlated but where the operators representing those observables don't commute with each other. That means that if you are in an exact state for one observable, you are in an indeterminate state for the others. This is a feature unique to quantum physics; it is not to be found in either classical (Newtonian) physics or in Greek physics. This second requirement seems to imply either that results of measurement are only determined at the point of measurement, or that there are some underlying physical parameters, not directly observable, which determine the possible outcomes of all future measurements and which are set at the moment when the two particles depart from each other. What Bell's theorem attempts to show is that these hidden variable theories necessarily imply results which contradict experiment.
This leaves us with the first option: that the outcomes of at least some measurements are only determined at the point of measurement. This seems to imply that the axioms that either quantum physics or special relativity depend on are wrong (or at least incomplete), which is a problem because these are the two pillars of contemporary physics, and the two ideas which we are more certain are right than anything else.
I usually use the example of a spin 0 Boson decaying into two spin 1/2 Fermions. The phenomenon is also present in polarized light (and many of the experimental tests of Bell's and related inequalities have used light), but the important aspects of the underlying mathematics are the same in each case, so from a theoretical standpoint it doesn't matter too much which physical system is modelled. The case of particle decay is perhaps slightly simpler to grasp.
A spin 1/2 particle's spin can be measured along any particular axis, and in each case it can take two possible values: +1/2 or -1/2. The spin 0 particle can only have its spin measured as 0. Because spin is conserved along each axis in every physical interaction, if one of the two Fermions is measured as spin +1/2 along any axis, then the other one must have spin -1/2 along that axis. And this is true in each direction. And this holds in both classical and quantum physics; there is no quantum spookiness coming in yet. (OK, the concept of spin itself only arises naturally in relativistic quantum physics, but apart from that there is no quantum spookiness coming in as yet.)
But what if we measure the spins of the two Fermions along slightly different axes? Clearly, in this case, we need not get opposite results. We could measure the spin of both particles as being +1/2. But we might also expect that the spin states are in some way correlated for directions that are close to each other. I don't see that this necessarily has to be the case. One can think of a classical system where the spin in each direction is fully determined at the point of decay independently of what is happening in any other direction. But in practice, there is correlation between the observed values of the spins along neighbouring axes. If the spin of one of the Fermions is -1/2, the spin of the other one, if measured along an axis which is only slightly differently aligned, will be +1/2 almost all the time. And this is fortunate: it would otherwise be pretty much impossible to ensure that the two detectors were aligned in precisely the same way, so one would always get random results. (As noted, the concept of spin arises from relativistic quantum mechanics, and that is what we theoretically expect.) So if we say that the two axes of measurement differ by an angle θ, then the chance that the two measured spins fail to be opposite will be some function of θ. And that works in either classical or quantum physics.
But the mathematical details of how the spins will differ depend on the model used to calculate them, and in particular the assumptions behind those models. Some of these agree with the quantum calculation, and others don't.
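As a rough illustration of how the dependence on θ can vary between models, the sketch below compares two textbook cases. The first is a simple local hidden-variable model, chosen here purely for illustration (it is not a model discussed in this post): each pair carries a random direction fixed at the decay, and each detector reports the sign of its axis projected onto that direction. The second is the standard quantum result for a spin 0 pair. The function names are hypothetical.

```python
import math

def p_same_hidden_vector(theta):
    """Illustrative local model: a hidden direction lam is fixed at the decay.
    Fermion 1 reads sign(cos(axis1 - lam)); Fermion 2 reads the opposite sign
    along its own axis. For axes theta apart (0 <= theta <= pi) the two
    results agree for a fraction theta/pi of the hidden directions."""
    return theta / math.pi

def p_same_quantum(theta):
    """Standard quantum prediction for the spin 0 (singlet) pair."""
    return math.sin(theta / 2) ** 2

# Columns: angle in degrees, hidden-vector model, quantum prediction.
for degrees in (0, 30, 60, 90, 180):
    theta = math.radians(degrees)
    print(degrees,
          round(p_same_hidden_vector(theta), 4),
          round(p_same_quantum(theta), 4))
```

Both models give perfectly opposite results for aligned detectors and agree at 90 and 180 degrees, but they differ at the angles in between, which is where the experimental tests discriminate between them.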
The Copenhagen interpretation of quantum physics states that the state of the particle is indeterminate until it is measured. Since the two particles are linked together, as soon as you perform a measurement on one of them, it collapses into a well defined state (that's standard). But in that case, so does the entangled partner. This dual collapse happens simultaneously. This is a problem in relativistic theories where there is no clear notion of simultaneity for two events at different locations.
The standard sorts of interactions in QFT are all local. They involve particle decay, emission and absorption, and each event occurs at a single point. This is built into the theory; it cannot be removed without violating Lorentz symmetry. Wavefunction collapse of entangled particles, on the other hand, is non-local. That means that either the information in wavefunction collapse is communicated via some non-physical interaction, or the model of wavefunction collapse is incorrect.
Bell's theorem in particular targets interpretations of Quantum Physics which rely on hidden variables. These basically say that all the different possible experimental outcomes are encoded in some way in the particles at the time of their decay. Behind the wavefunction is another set of variables which control the experimental outcomes. The theorem states that if there are hidden variables and locality, then one gets different results from those expected from the quantum mechanical calculation.
The assumptions used to derive Bell's inequalities, as stated in this paper are:
- The outcome of a measurement is not determined at the point of measurement, but is uniquely determined by the properties of the particle before the measurement is performed. (This assumption is known as local realism or the hidden variables assumption.)
- Reality is single valued, i.e. each measurement has only one outcome. This is violated in Everett multiple world interpretations of quantum physics.
- The microstates (or hidden variable states) are determined by an underlying probability distribution, which is positive and normalizable to one. This could be violated if the detectors are inefficient (i.e. there is bias in which of the particles are detected), or if the detectors were in some way dependent on the hidden variables in the particles.
- There is no backwards causation; i.e. causal signals can't go backwards in time.
- Factorizability (or locality), i.e. what is happening at one detector is independent of what is happening at the other detector. The measurement results just depend on the hidden variables of the particles, and not on what happened at the other detector. This entails both that the settings on the two detectors don't interfere with each other, and that the outcomes of the two experiments don't depend on each other, but only on the hidden variables. This arises from the principle that only events in the past light cone of the measurement can influence the measurement. The probability of getting a particular value at one detector is conditional only on the events in the past light cone, and otherwise independent of what is happening at the other detector.
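In symbols, the probability-distribution and factorizability assumptions are often summarised as follows. This is a sketch in a commonly used notation, where ρ(λ) is the distribution of the hidden variables and a, b label the detector settings; these symbols are assumptions of the sketch rather than notation taken from the paper cited above.

```latex
\begin{align*}
\rho(\lambda) &\ge 0, \qquad \int \rho(\lambda)\, d\lambda = 1, \\
P(A \cap B \mid a, b, \lambda) &= P(A \mid a, \lambda)\, P(B \mid b, \lambda), \\
P(A \cap B \mid a, b) &= \int \rho(\lambda)\, P(A \mid a, \lambda)\, P(B \mid b, \lambda)\, d\lambda .
\end{align*}
```

The first line is the positive, normalizable distribution; the second is factorizability; the third combines them into a prediction for the ensemble, with ρ(λ) taken not to depend on the detector settings.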
Some of these axioms are related to each other. For example, the article that Matthew referenced in his comment suggests that there are only three assumptions: factorizability, no backwards causation, and hidden variables.
One other thing we should bear in mind is that the experimental tests of Bell's theorem are not performed using a single particle decay. They require numerous events to build up an ensemble. One calculates a probability, which one wishes to compare against a frequency distribution. But to reliably measure a frequency distribution, one needs a large enough sample of events. Each individual decay (unless the detectors are orientated along the same axis) might either have the spins in alignment or not. There is no way to predict the outcome, and the result by itself doesn't prove anything. It is only when we consider the ensemble after numerous decays that a frequency distribution emerges. The probability distributions we calculate are predictions of the frequency distributions for an ensemble of results. They do not describe what is happening with a single particle. Hidden variables, on the other hand, refer to the mechanics of each individual decay. If there are such hidden variables, then each individual event will depend on its own set of them. The ensemble distribution itself represents some sort of average over all the possible hidden variables.
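The relationship between a single unpredictable event and the ensemble frequency can be illustrated with a small simulation. The sketch below assumes the standard quantum probability sin²(θ/2) that the two detectors record the same spin, draws individual decays at random, and compares the observed frequency with that probability for increasing sample sizes. The function name and the fixed seed are choices made for this illustration.

```python
import math
import random

def observed_same_frequency(theta, n_events, rng):
    """Fraction of n_events simulated decays in which both detectors agree.
    Each decay is an independent random draw; no single draw is predictable."""
    p_same = math.sin(theta / 2) ** 2
    same = sum(1 for _ in range(n_events) if rng.random() < p_same)
    return same / n_events

rng = random.Random(0)          # fixed seed so the run is repeatable
theta = math.radians(60)
expected = math.sin(theta / 2) ** 2
for n in (10, 1000, 100000):
    print(n, observed_same_frequency(theta, n, rng), "expected", round(expected, 4))
```

With only a handful of events the observed fraction can be far from the calculated probability; it is only for large samples that the frequency settles close to it.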
Before progressing, I need to discuss the mathematics. Warning in advance for any non-mathematicians: this will get pretty intense. It should be understandable for anyone with A-level or IB mathematics or equivalent.
The mathematics -- derivation of Bell's inequality
First a bit of notation. We have a physical system which consists of a spin 0 particle which emits two spin-half Fermions in opposite directions (Fermion 1 and Fermion 2). We have detectors x, w, a and b which measure the spin of the two Fermions. w and a are positioned to observe Fermion 1, while x and b are set up to measure the spin of Fermion 2. We can choose to use either detector w or a, and similarly choose whether to use x or b. x and w are set up to measure the spin of the Fermions along the same axis. So w will always give the opposite result to x. a and b each measure the spin along their own axis. All three of these axes are in the same plane. The angle between the spin axis measured by a and that measured by w is . The similar angle between x and b is .
- A represents detector a giving a spin up result.
- A' represents detector a giving a spin down result.
- B represents detector b giving a spin up result.
- B' represents detector b giving a spin down result.
- X represents detector x giving a spin up result (and consequently w reports that Fermion 1 is spin down).
- X' represents detector x giving a spin down result (and consequently w reports that Fermion 1 is spin up).
We are interested in calculating the probabilities. Every probability is conditional on its premises, and those we calculate here will be conditional on λ, which represents the assumed hidden variables, and M, which contains information on the initial setup, the assumptions of the theoretical model, and so on. I will denote "the probability of A conditional on M" as P(A|M). ∩ represents the logical and or set intersection operator, while ∪ represents the logical or or set union operator.
We will need a few standard results from probability theory. Firstly, the probability of an outcome intersecting with the universal set U is the probability of the outcome by itself:
    P(A ∩ U | M) = P(A | M)
Secondly, the probability of an outcome intersecting with its complement is 0 (the law of non-contradiction):
    P(A ∩ A' | M) = 0
Thirdly, one of the axioms of probability, the rule for the probability of a union of two outcomes:
    P(A ∪ B | M) = P(A | M) + P(B | M) − P(A ∩ B | M)
Finally, that the probability that outcome A or outcome B occurs is greater than or equal to the probability that outcome B alone occurs:
    P(A ∪ B | M) ≥ P(B | M)
What we want to calculate is the probability that the detectors at a and b give the same result, conditional on the results we get when we perform the experiments at a and x, and when we compare b and w. Those initial results are found to be:
The factor of comes from the probability that we can either get the result X or X' given the initial experimental setup. For these equations to be useful in modelling reality, we need the assumption that the observational results depend solely on the hidden variables (plus experimental setup). Recall, we will eventually want to compare this probability with an experimentally observed frequency. That comparison only makes sense if the causality represented in the calculation of the probability exactly matches the real physical causes. We also use the assumption that the results of one measurement don't influence the other measurement throughout this calculation.
So the thing we want to calculate is
So let's start trundling through the algebra.
we can kill the last term in the expansion, and continue
We now make use of equations (1) and (2) to write
Repeating this for each of the terms in equation (8) gives
And substituting the numbers in from equations (3) and (4) gives
And this is the result we are after: Bell's inequality, as it applies to this particular experimental set up. If this inequality is shown to be violated in practice (which it is), then that shows that at least one of the assumptions behind the calculation is false.
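For readers who want to see the shape of such an argument written out, here is a sketch of one standard inequality of this type (often attributed to Wigner), using the events A, B, X and X' defined above. It relies only on the hidden-variable assumption that A, B and X all have definite values for every pair, and it is offered as an illustration rather than as the exact chain of steps in the derivation above.

```latex
\begin{align*}
P(A \cap B \mid M)
  &= P(A \cap B \cap X \mid M) + P(A \cap B \cap X' \mid M) \\
  &\le P(A \cap X \mid M) + P(B \cap X' \mid M).
\end{align*}
```

The first line uses the fact that exactly one of X and X' occurs; the second uses the fact that requiring fewer outcomes can only increase a probability. Each term in the final line involves one detector on each Fermion (a and b, a and x, b and w respectively, since X' means that w reports spin up), so each can be compared with a measured frequency.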
The mathematics -- the quantum calculation
So now it is time to look at the quantum mechanical calculation (which, of course, agrees with experiment).
So once again, we start with a spin 0 particle decaying into two spin half Fermions. We want to calculate the same probability as above. I will only do the calculation for detectors a and b -- one can use the same methods to calculate the comparisons between a and x, for example, which will lead to the same results in equations (3) and (4).
I will denote the Fermions emerging from the decay as YA, which heads towards detector a, and YB, which heads towards detector b. We need to express the possible observable values in terms of quantum states, so will represent the state corresponding to detector a measuring spin up, and so on.
We don't know whether YA is spin up or spin down, so we need to account for both possibilities in our amplitude. The amplitude that we destroy the spin up particle at detector b will be given by the overlap between the states, represented by . So what we are doing in our calculation is creating the particles, one spin down and one spin up, and then destroying them at detectors a and b. In terms of creation and annihilation operators, this is
To create the amplitude , we need to annihilate the creation operator for Y+B with the annihilation operator for B, which means first of all we need to place these operators next to each other. We can re-order the Fermion operators by swapping neighbouring operators, until they are in the order we want. However, Fermion operators anti-commute, so whenever we swap two neighbouring operators, we have to multiply the whole expression by minus one. That is the origin of the minus sign in the right hand side of equation (12).
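The sign rule for swapping Fermion operators can be checked numerically. The sketch below assumes a two-mode Jordan-Wigner matrix representation of the operators, which is a standard construction but is chosen here only for illustration; the names c1 and c2 are hypothetical and do not correspond to specific operators in the calculation above.

```python
def kron(p, q):
    """Kronecker product of two square matrices given as nested lists."""
    return [[pv * qv for pv in prow for qv in qrow] for prow in p for qrow in q]

def matmul(p, q):
    """Matrix product of two square matrices of the same size."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(pr, qr)] for pr, qr in zip(p, q)]

def dagger(p):
    """Hermitian conjugate; the matrices here are real, so this is a transpose."""
    return [list(col) for col in zip(*p)]

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
LOWER = [[0, 1], [0, 0]]        # single-mode annihilation operator

c1 = kron(LOWER, I2)            # annihilates a Fermion in mode 1
c2 = kron(Z, LOWER)             # annihilates a Fermion in mode 2

zero = [[0] * 4 for _ in range(4)]
identity4 = kron(I2, I2)

# Swapping two neighbouring creation operators costs a factor of minus one.
print(add(matmul(dagger(c1), dagger(c2)), matmul(dagger(c2), dagger(c1))) == zero)
# A creation operator and its own annihilation operator anti-commute to 1.
print(add(matmul(c1, dagger(c1)), matmul(dagger(c1), c1)) == identity4)
```

Both checks compare an anti-commutator with the expected matrix, so each swap of neighbouring operators in an expression like the one above introduces one factor of minus one.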
I'll use Q this time to denote the initial conditions of the experiment and modelling assumptions (since it is a different model to the classical case, I'll use a different symbol). So what we want to calculate are amplitudes such as
However, in the calculation below, I will use the Y states as an intermediary. But we don't know the spin states of Y, so we have to integrate over all of the possibilities:
YA is equally likely to be spin up or spin down, so we have one of the two amplitudes needed to solve this integral immediately.
I will spend the rest of this section calculating , the amplitude that a and b are both spin up given that YB is spin up and YA is spin down, and the other similar quantities I need.
I need to start by defining the spin operator, and calculating its eigenstates. I will use polar coordinates to convert (x,y,z) Cartesian coordinates into a pair of angles.
Then the matrix representation of the spin operator is
The spin up and spin down states for arbitrary angles are defined by the eigenvalue equations
These turn out to be,
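For reference, in one common convention (assumed here; the choice of axes and phases used in this post may differ), with ϑ the polar angle measured from the z-axis, φ the azimuthal angle, and units in which ħ = 1, the spin operator along the corresponding unit vector n and its eigenstates can be written as:

```latex
\begin{align*}
\hat{S}_{\mathbf{n}} &= \frac{1}{2}
  \begin{pmatrix}
    \cos\vartheta & e^{-i\varphi}\sin\vartheta \\
    e^{i\varphi}\sin\vartheta & -\cos\vartheta
  \end{pmatrix}, \\
|{+}\rangle &=
  \begin{pmatrix} \cos(\vartheta/2) \\ e^{i\varphi}\sin(\vartheta/2) \end{pmatrix},
\qquad
|{-}\rangle =
  \begin{pmatrix} -e^{-i\varphi}\sin(\vartheta/2) \\ \cos(\vartheta/2) \end{pmatrix}, \\
\hat{S}_{\mathbf{n}}\,|{\pm}\rangle &= \pm\tfrac{1}{2}\,|{\pm}\rangle .
\end{align*}
```

The overall phase of each eigenstate is arbitrary, so other sources may write these states differently without changing any probability.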
We have a degree of freedom in deciding the angles at which our detectors are orientated. The system has rotational symmetry, so we can pick the x-axis to be in whatever direction we choose. Only relative differences between angles will affect the final result. As such, I am free to pick detector a as being along the x-axis, and detector b as resting in the x-y plane. The Fermions themselves can emerge in any spin orientation. Thus we have,
This means that the states are,
We can now calculate the amplitudes. Firstly, the amplitude that both detectors measure the particles to be spin up. Don't forget that we need to complex conjugate the states for A and B.
Similarly, we can calculate the amplitude that detector a records spin down and detector b spin up
And the other possible measurement outcomes give
Note that these results don't depend on the orientation of the Fermions as they emerged from the decay. We have no way of knowing what that was. We have all the numbers we need, so we can now calculate the final amplitudes
Then we can modulus square the amplitudes (Born's rule) and combine them together to get the probabilities that the two detectors will record either the same spin state or opposite spin states,
Converting this back to the notation of the previous section gives
And this is the quantum mechanical prediction. It obviously differs from the classical result (equation (11)), and can violate the inequality.
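As a numerical cross-check, the sketch below computes the quantum probabilities directly from the spin 0 state (|up,down> − |down,up>)/√2 with all detector axes in the x-y plane, and then tests them against a hidden-variable bound of the Wigner type, P(A ∩ B) ≤ P(A ∩ X) + P(B ∩ X'), which follows because A ∩ B ∩ X is contained in A ∩ X and A ∩ B ∩ X' is contained in B ∩ X'. That bound is used here as a stand-in and may not be identical in form to equation (11). The phase conventions, the choice of 60 degree angles, and all function names are assumptions of this sketch.

```python
import cmath
import math

def spin_state(phi, up=True):
    """Spin-1/2 eigenstate along an axis in the x-y plane at azimuth phi."""
    phase = cmath.exp(1j * phi)
    second = phase if up else -phase
    return (1 / math.sqrt(2), second / math.sqrt(2))

def singlet_amplitude(state1, state2):
    """Overlap of |state1>|state2> with the spin 0 state
    (|up,down> - |down,up>)/sqrt(2); the detector states are conjugated."""
    u0, u1 = (z.conjugate() for z in state1)
    v0, v1 = (z.conjugate() for z in state2)
    return (u0 * v1 - u1 * v0) / math.sqrt(2)

def joint_probability(phi1, up1, phi2, up2):
    """Born's rule: probability that Fermion 1 gives up1 along phi1 and
    Fermion 2 gives up2 along phi2."""
    amp = singlet_amplitude(spin_state(phi1, up1), spin_state(phi2, up2))
    return abs(amp) ** 2

def p_same(phi1, phi2):
    """Probability that the two detectors record the same spin."""
    return (joint_probability(phi1, True, phi2, True)
            + joint_probability(phi1, False, phi2, False))

theta_a = theta_b = math.radians(60)
# Axes: w and x at azimuth 0, a rotated by -theta_a, b rotated by +theta_b.
lhs = joint_probability(-theta_a, True, theta_b, True)    # a up, b up
rhs = (joint_probability(-theta_a, True, 0.0, True)       # a up, x up
       + joint_probability(0.0, True, theta_b, True))     # w up, b up
print("P(same) for a and b:", round(p_same(-theta_a, theta_b), 4))
print("left side:", round(lhs, 4), "right side:", round(rhs, 4))
print("bound violated:", lhs > rhs)
```

The probability that the two detectors agree depends only on the angle between them, and for these angles the left side of the bound exceeds the right side, which no model satisfying the hidden-variable assumptions can reproduce.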
Models and explanations
I designed the calculation in the previous section to reflect my own philosophy. There are other ways of getting the same result, which would reflect other philosophies of physics. But my approach is to take the calculation above literally.
In summary, my approach is:
- God exists and actively and continuously sustains the universe. Physics is the description of that action (in the absence of any special circumstances, where God is free to act differently, leading to miracles). Ultimately, the various symmetry laws that constrain the laws of physics are a reflection of God's attributes. The indeterminacy of physics is due to God's free will.
- Substances can exist in one of a number of states (or potentia). One of these is actual at any given time, and changes either represent movement from one state to another, or the creation or annihilation of different particles. The states are always expressed in terms of a basis. There might be (for example) two allowed states in a given basis. There is a different basis for every possible observable. However (and this is where it gets harder to visualise and explain without the mathematics), unlike Newtonian or Greek physics, these bases are not orthogonal: if one is in an exact state in one basis, then one is necessarily in a superposition of states in another basis.
- I make a distinction between event and substance causality. Substance causality refers to which substance (or particle) preceded another substance. Or, more generally, which actualised state preceded another actualised state. There are a limited number of possible interactions (in the absence of miracles), which are denoted by the final causes of a particle. The efficient cause traces back the past history of a particle, from one state to another. Final causes point to substances which could possibly arise from that particle. Substance causality is always respected, even in quantum physics. It is also local, in the sense that all interactions occur at a single point in space and moment in time, and only substances in the past light cone of a being can be part of the chain of efficient causes for that being. Event causality, on the other hand, refers to the causes of events, which represent changes in states. The physical particles alone are insufficient to describe the event cause. Even if we had a complete knowledge of the state of the universe and all physical parameters, events would, with a few exceptions, be impossible to predict precisely. Ultimately, which events occur, which potentia are actualised, comes down to God's free decisions. It is not something we can predict; the best we can do is treat it stochastically, using the methods of quantum physics. While God is able to do whatever He pleases, in practice God's rational nature and constancy means that there is a degree of regularity in the actions. (God only acts differently if there is a good reason to do so.) Thus we can usefully assign amplitudes to each possible event, and also expect different events to be correlated to each other as a reflection of God's consistency and constancy.
- Uncertainty for an event in quantum physics is best parameterized by an amplitude rather than a probability. We can only convert to a probability after we have a large number of such events and want to compare against an experimental measurement of a frequency distribution.
To apply these principles to this particular example, the spin zero particle decays into two Fermions (an act of efficient causality). The two Fermions will be in an actual state, in a definite spin state (Y) and a particular basis. We don't know what that spin state is (because of God's freedom when actualising the decay), and we have no way of finding out. When we take a measurement, we project the particle into the basis associated with the detector (A or B). The amplitude for us to measure it either as spin up or spin down is given by the overlap between the detector state (A) and the decayed particle state (Y). Usually we will then need to integrate over all unknown variables (such as the precise orientation of Y) to get a final result which can be compared with experiment.
The assumptions behind this calculation are:
- The particle is emitted in a particular state in a particular basis. Only those observables which commute with that basis are determined prior to measurement. The observables that require non-commuting bases are undetermined until measurement.
- Reality is single valued.
- Uncertainty for single particles is most fundamentally parametrized by amplitudes rather than probabilities. If we later wish to convert to a probability to compare predictions for an ensemble of particles with an observed frequency distribution, we can do so via Born's rule.
- The system is factorizable and local, as far as substance causality is concerned.
- The measurement process is indeterminate. It proceeds by projecting the particle into the basis defined by the detector. The amplitudes for which of the two possible eigenstates it drops into are determined by the overlap between the particle state and the two eigenstates of the detector's basis. The measurement, forcing the particles into one of the two states, is an event, not determined by physical causes alone. The amplitudes correctly predict the likelihood of each possibility.
So I would disagree with the first and third of the assumptions behind the derivation of Bell's inequalities. There are some similarities with the Copenhagen interpretation, because the outcome of a measurement is (usually) not determined until the measurement itself. However, unlike the Copenhagen interpretation, where the particle remains as some ghostly non-physical wavefunction between measurements, I would maintain that it is in some definite state; we just don't know what that is.
In my interpretation, we don't know, and can't know even after measurement, what state the particle is in between emission and observation, but we do know that it is in some state. The wavefunction (in the respects that are relevant to this discussion) represents a parametrization of our uncertainty. In particular, it is used to predict the results of experiments (after we have enough statistics to determine a frequency distribution). There is a hidden reality behind it. But this does not lead to Bell's inequality; instead it duplicates the standard quantum mechanical calculation.
Of course, this still leaves one question. How does the detector at a know how to get the opposite result from the detector at b when the two detectors are aligned? The mathematics says that it must, but what lies behind the mathematics? The mathematics is used to make statistical predictions of the results of the indeterminate system. It is ultimately based on symmetry requirements, which in turn are drawn from the premise that whatever lies beyond the laws of physics is not bound by time and space: timeless and omnipresent, and therefore does not treat one moment in time or one location in space as any more or less important than any other. The results of the experiment are indeterminate, but using these symmetries we can still make statements about the likelihood of these results. We know that there is no physical communication between the two detectors -- if there were, it would be present in the action of the most fundamental theory (which I assume is the standard model adjusted to take into account gravity). But whatever is behind the laws of physics -- whatever is the reason why the particles obey them -- is not a physical particle. Now I regard the laws of physics as a description of God's actions in sustaining the universe (via secondary causation), under the assumption that God has no special interest in the events concerned. If, for example, God wanted to miraculously heal a blind man, then He would have a special interest in a particular time and location; the symmetries constraining QFT would in that moment be invalid, and therefore the assumptions behind the prediction would no longer hold. We would get an unexpected result. (The mechanism by which God produces miracles is, in this model, exactly the same mechanism by which God creates and destroys particles in the ordinary course of sustaining the universe.) But, in the case of this particular experiment, there is no good reason why God should favour one result ahead of another. 
The symmetries hold; the mathematical calculation of the amplitude is a correct description of the likelihood of God's actions, and the amplitude of zero for the detectors recording the same spin means that the two detectors will display the opposite spin.
Does this imperil the freedom of God? No. Firstly, God can still either make the particle at detector a spin-up or spin-down, so the result is not determined, and we certainly aren't led to the situation where God is forced into one particular act. Secondly, God could record the same spin on both detectors (a miraculous intervention), but would only do so if He had a reason to do so. And there is no special reason, so we are left with the default result.
People might, of course, dispute that the laws of physics are a description of God's sustaining of the universe. But, in this context, that affects little. What I needed are the attributes of timelessness, omnipresence, and omnipotence (that is, universal applicability): and I think even most atheists would admit that the laws of physics either represent the acts of something or somethings that possess these attributes, or, if they are in some way capable of action themselves rather than as a description or an intermediary, then the laws themselves possess those attributes. There will be those who disagree with this, of course, such as those who hold to a Humean view of causality, or otherwise object to the notion of objective (i.e. operating beyond humanity) and knowable laws of physics. But such people have far harder things to explain than mere quantum entanglement.
Why do I believe this to be the best interpretation?
Firstly, it takes the mathematics literally, and mirrors what the mathematics describes. In particular, it is consistent with QFT's description of matter and the creation and annihilation of particles. Every object that enters into the mathematical calculation has something corresponding to it in reality. Only those objects in the calculation exist in reality (at least, of those things relevant to the experiment). It preserves both reality (which is important from a philosophical perspective) and locality (which is important if physics is to be self consistent). To be more explicit: all physical interactions (or secondary causation) are local. However, God is not localised.
Granted, there is more than one way to formulate the mathematics. For example, I imagine that this is where those who advocate de Broglie/Bohm's interpretation will part from me. In the de Broglie/Bohm interpretation, the Schroedinger equation is split into two parts. Likewise reality is split into two parts: there are the particles (which we observe) and an underlying pilot wave (which is not observed, but which controls the motions of the particles). One part of the division of the Schroedinger equation governs the pilot wave evolution. The other governs the motion of the particle. The proponent of this interpretation would argue that it too is the natural interpretation of the mathematics, after de Broglie's reformulation.
The problem is that the equation governing the motion of the particle depends not only on the wavefunction at that point, but on the wavefunction at every other point. This violates the locality of physical interactions. This non-locality is what allows the de Broglie/Bohm interpretation to avoid the worst implications of Bell's theorem (a denial of realism). However, it creates problems when reconciling the interpretation with relativity. There have, of course, been attempts to do this -- see for example arXiv:1307.1714 and arXiv:quant-ph/0303156.
That last paper, in particular, looks quite similar to my own interpretation: particles travel down their own world-lines, with occasional spontaneous jumps corresponding to a creation/annihilation event. Like me, it prefers a particle rather than a field interpretation of QFT, and for much the same reasons. However, this interpretation also requires the physical but unobservable pilot wave to guide the particles. In my interpretation, this is redundant. Between the jumps in the de Broglie/Bohm interpretation, everything is deterministic, while in my interpretation that is not so. In my interpretation, the Hamiltonian operator allows us to calculate, given an initial state, the amplitude for each possible final state a moment later (we can talk about "moments later" without specifying a preferred reference frame and thus time axis because the Hamiltonian operator is entirely local: it makes no difference how we label space time points elsewhere in the universe). This final state could have the particle move to a neighbouring point in space, or stay where it is, or interact via a creation/annihilation event with one or more other particles. There is an amplitude for each option, which means that every option is possible. There is no deterministic evolution between creation or annihilation events. So even if the de Broglie/Bohm interpretation can reproduce the standard theoretical results (and I haven't seen an explicit calculation suggesting this), it still requires non-local physical interactions, redundant non-observable physical objects, and unexplained indeterminate jumps in an otherwise deterministic system. My interpretation avoids all of those.
What of the alternatives? The Everett interpretation requires a branching of reality at each moment of indeterminacy. There is no obvious reason why this should happen, or clue as to what mechanism causes it to happen. Nor can there be any experimental evidence for it. The Copenhagen interpretation denies reality between measurements, and also has that unpalatable mix between deterministic wavefunction evolution and indeterminate jumps (in this case on measurements). The ensemble interpretation is widely regarded -- including by me -- as being disproven by Bell's theorem (and related results). Other interpretations require information passing backwards in time, violating causality. All these other interpretations require adding something in addition to the bare mathematics, and often even after that don't really resolve the problem of mixing determinacy and indeterminacy. You might say I too am adding something beyond the mathematics, namely God. But the theist would respond that God is present behind the scenes in any interpretation of quantum physics, giving the abstract equations force in the real world.
Extensions to Bell's theorem
I have been discussing Bell's original theorem. There have been extensions and variations of this brought up since then. I will cite the Leggett-Garg inequality, the Kochen-Specker theorem, and the Pusey-Barrett-Rudolph theorem, which Matthew himself raised.
These all make slightly different assumptions from those of Bell's theorem. However, as far as I can tell, they all rely on the first and third of Bell's assumptions. As such, they are not applicable to my interpretation.
Now onto the comment which sparked this post. As with Matthew's other comments, it is well thought out, well informed, and challenging. I am very grateful for such comments, since they force me to think about things more deeply, and correct me on a number of issues.
I hope it is not too presumptuous of me to assert that an expert on quantum physics is wrong about quantum physics... but, I believe you are wrong about this:
"In the derivation of his inequalities, Bell assumed that uncertainty concerning the hidden real substratum of matter should be parametrized using classical probability; assuming that all the predicates of the particle have actual values in the hidden substratum"
The probabilities that appear in the derivation of Bell's theorem are probabilities for measurement outcomes predicted by a candidate physical theory, and they are conditional on measurement settings and whatever else the theory says is required to predict those probabilities. Bell does not actually assume anything about what is required to specify the state of the system; in fact, it could just be the quantum wavefunction. As long as it is coherent to say things like "there is a certain probability for this system to have this certain wavefunction" and "the probability that we get this measurement outcome is so-and-so, given that the system has this wavefunction," and these probabilities obey classical probability theory, I can't see any objectionable hidden assumption in the derivation of Bell's theorem. (And that is even if I am wrong in my comment on your earlier post about whether classical probability theory can coherently be said to fail for quantum systems, or whether it is more accurate to understand things some other way.)
I would discuss quantum states rather than wavefunctions, but I think what Matthew means is the same thing as my state. A state is a possible configuration of the particle. A wavefunction is a superposition of states, which parametrizes (as an amplitude) our uncertainty about which state the particle is in. So states are physical; wavefunctions are (primarily) epistemic. They are also things we calculate, and as such conditional on whatever premises we put into the calculation. As such, we should not discuss probabilities for a system to have a certain wavefunction, since systems don't have wavefunctions. They exist in particular states.
So is it coherent to say "There is a certain probability for this system to be in this particular state", as Matthew requires? I would say not. It is not coherent to discuss probabilities that the system is in a given state, for the most part. In my interpretation, our uncertainty concerning quantum states is parametrized in terms of an amplitude rather than a probability. So one can say "There is a certain amplitude for this system to be in this particular state." But one cannot use probabilities here. The reason for this is that when we convert from amplitudes to probabilities, we lose, or rather combine, information that distinguishes between different states. For example, we can picture the possible states of a system as being the different points along the circumference of a unit circle. In practice, we would have several such circles. Although the system itself must be on one circle or another, we don't know which one, so our amplitude and probability distributions are split between them. The amplitude would be a point somewhere within that circle. It contains information on both the angle and the radius. The probability, on the other hand, is the square of the radius, and can only be represented as such because of the definition of probability as a single number lying between zero and one. Thus the probability can't distinguish between the different states on the same circle. The conversion to probability loses the angular information. The best we can talk about is the probability that the particle is in one of a large number of states (namely all those which contribute to the same circle); the amplitude can distinguish between the individual states on the circle.
What if one did try to use something other than Born's rule to convert from an amplitude to a probability? Then you could maybe assign a probability to each individual state. But that would lead to incorrect experimental results. Probability theory assumes that the states are orthogonal, and this affects how you try to combine probabilities for different states. But the states of quantum physics are not like this. The problem is that it is precisely the interference effects, or non-orthogonality, between these different possible states -- which can be picked up by the amplitude but not the probability -- which lead to all the interesting quantum features. (This example is most closely related to interference; each circle represents a different location, and the angles represent a different complex phase; but similar considerations apply to particle spins or photon polarisation.)
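To make the circle picture concrete, here is a minimal sketch in Python. The function names and the choice of radius 0.5 are my own illustrative assumptions, not taken from any particular calculation. It shows that Born's rule maps every angle at a given radius to the same probability, while the angle still matters as soon as two amplitudes are added.

```python
import cmath
import math


def born(amplitude: complex) -> float:
    """Born's rule: the probability is the squared modulus of the amplitude."""
    return abs(amplitude) ** 2


def on_circle(radius: float, angle: float) -> complex:
    """An amplitude written as a point at a given radius and angle."""
    return radius * cmath.exp(1j * angle)


# Three amplitudes at the same radius but different angles (illustrative values).
angles = (0.0, math.pi / 2, math.pi)
single_state_probs = [born(on_circle(0.5, angle)) for angle in angles]

# Adding two amplitudes before applying Born's rule keeps the angular information.
in_phase = born(on_circle(0.5, 0.0) + on_circle(0.5, 0.0))
out_of_phase = born(on_circle(0.5, 0.0) + on_circle(0.5, math.pi))

# Adding the two probabilities instead gives the same answer in both cases.
sum_of_probs = born(on_circle(0.5, 0.0)) + born(on_circle(0.5, math.pi))

print(single_state_probs, in_phase, out_of_phase, sum_of_probs)
```

The three single-state probabilities agree with each other, whereas the in-phase and out-of-phase sums differ as much as they possibly can. That difference is the information which is discarded once each state has been reduced to a probability.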
There are, of course, caveats to make. Firstly, we could discuss probabilities if no information was lost when converting from the amplitude to the probability. For example, if each amplitude, instead of being represented by a circle, was represented by a line segment from 0 to 1. Here we would have no angular information; there would be a one to one relationship between the amplitude and probability. In this sort of system, it would make sense to associate probabilities with given states. However, I can't think of a quantum particle described by the standard model like this.
The second caveat is, of course, the measurement process itself. Here the system is forced by the detector into a given basis (or a given angle). There are now no interference effects between the amplitudes for different states, and again there is a one to one relationship between the amplitude and probability. But this only occurs at the point of measurement. Before it hits the detector, the particle could be in any basis, and therefore we have to parametrise our uncertainty using the amplitude. It is incoherent to say there is a probability that the particle is in a given state, because probability can't specify which basis the particles are in, and can't deal with non-orthogonal bases.
So as long as Bell assigns a probability to a quantum state (or hidden variable configuration) outside the measurement process, my criticism is valid. One doesn't have to read very far into his original paper to find where he does this:
Travis Norsen on Bell
Back to Matthew.
I take my understanding of the situation here from Travis Norsen's papers on Bell's theorem, such as https://arxiv.org/abs/0707.0401 and others. Norsen's work on Bell and Bell's theorem are really good, I highly recommend them to any physicist. I think he does a very good job of clearly presenting where Bell was coming from and what his theorem entails.
First of all, I would like to thank Matthew for pointing me to this paper. It is well worth a read. I'll firstly quickly summarise the contents, and then give my thoughts.
The main argument of the original EPR paper was to show that the theory behind QM, under the assumption that it satisfied local causality, is incomplete. Einstein and Bell each believed that a local hidden variable theory was the only hope for a locally causal quantum physics. Bell's contribution was to show that no locally causal theory could reproduce the correct empirical predictions.
Key to all this is Bell's notion of local causality, which is frequently misunderstood. In summary (I will be more precise later), Bell's definition means that an event can only be caused by events and substances in the past light cone. Many commentators state that Bell's theorem depends on local causality plus some other assumptions, but in practice everything that Bell needs is the assumption of local causality and those things which can be derived from it.
Central to Bell's formulation is the concept of the beables. These are those elements of a theory which are supposed to correspond to something physically real and independent of any observation. There are two separate questions: what are the beables in a particular interpretation of quantum physics? And what are the beables in reality? The first of these questions should be reasonably easy to answer; the second is harder. Bell is only concerned with the first question. He is interested in the issue of those interpretations which only contain beables which satisfy local causality. His goal is to show that such interpretations are inconsistent with quantum physics.
Bell's other concern is that terms should be defined in a rigorous way. For example, those interpretations which separate the microscopic (quantum) world from the macroscopic (classical) world are problematic because there is no clear dividing line between the microscopic and macroscopic. Beables should therefore be well defined in any interpretation.
Equally, beables should not be things which are only a matter of convention. For example, the mathematical form of the electromagnetic potential is gauge dependent. Thus this depends on one's choice of convention, and does not qualify as a beable. However, the existence of the photon itself is not gauge-dependent, and this could qualify (in some interpretations) as an objective physical fact, a beable.
Beables certainly exist. Our experimental equipment (and ourselves) have to be assumed to be physically real in any interpretation.
The next aspect of a locally causal theory is that it should be complete, i.e. every relevant factor is expressed in terms of beables in the past light cone of the event. For example, in the case of the Boson decaying into two Fermions, there is the assumption that the decay and its products are the only relevant factors influencing the later experimental outcomes. There is no spooky stuff which we don't detect and haven't included in the model happening alongside it. In particular, everything that is physically real needs to be accounted for in the theory. What is and isn't specified is again provided in a particular candidate interpretation.
The notions of cause and effect are difficult to define in a way that is sufficiently clean for mathematics. Bell's formulation of local causality, however, does not depend on any particular model of causality. It does not invoke any commitment to what physically exists and how it acts. The burden of explaining causality is shifted to the theoretical models and interpretations. Causality is more readily accounted for in a model than in reality. Thus we are interested in causality as it exists in a given model more than causality as it exists in reality. (The purpose of this whole study is to judge between different models.) The claim of Bell's theorem is that all candidate theories which respect local causality are inconsistent with experiment.
The notion of causality used by Bell is not intended to necessarily imply a determinative cause. It applies to both deterministic and stochastic theories.
So now we are ready for the formal definition of local causality:
A theory will be said to be locally causal if the probabilities attached to values of local beables in a space-time region 1 are unaltered by specification of values of local beables in a space-like separated region 2, when what happens in the backward light cone of 1 is already sufficiently specified, for example by a full specification of local beables in a space-time region 3. In a (local) stochastic theory, however, even a complete specification of relevant beables in the past (e.g., those in region 3 of Figure 2) may not determine the realized value of the beable in question (in region 1). Rather, the theory specifies only probabilities for the various possible values that might be realized for that beable.
This might be expressed more formally as a probability equation:
P(b1 | B3, b2) = P(b1 | B3),
where b1 represents one of the two events of interest, b2 the other, and B3 all the beables in the past light cone of b1. Implications of this include factorizability; namely that the joint probability for the outcomes A and B can be factorised into a probability for A and a probability for B.
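The factorisation step can be spelled out as a short derivation. This is a sketch in notation I am assuming here rather than quoting: A and B are the two outcomes, a and b are the settings of the two detectors, and λ is a complete specification of the relevant beables in the past.

```latex
\begin{align}
P(A, B \mid a, b, \lambda)
  &= P(A \mid a, b, B, \lambda)\, P(B \mid a, b, \lambda)
  && \text{(product rule)} \\
  &= P(A \mid a, \lambda)\, P(B \mid b, \lambda)
  && \text{(local causality, applied to each factor)}
\end{align}
```

The first line is ordinary probability theory. Only the second line uses local causality, which removes the distant setting and the distant outcome from each factor.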
The probabilities are not subjective representing someone's beliefs, but are seen as the fundamental output of some candidate (stochastic) physical theory. In other words, you take one particular model, put in the initial conditions, and the result of the calculation would be expressed as a probability. This is similar to how I would interpret probability.
Bell's formulation distinguishes between causation and correlation. His definition of local causality forbids faster-than-light causal influences, but may still entail correlations between space-like separated events. (For example, event 1, space-like separated from event 2, might not cause event 2 or be an effect of event 2, but still be correlated because they are both caused by something in each of their past light cones.) Any physical signalling process must involve causation.
The original EPR paper argued for the incompleteness of QM, and that it needs to be buttressed by an underlying locally causal theory. It attempts to show that completeness implies non-locality, while locality implies incompleteness. For example, to quote directly from the paper:
Thus, noting that for orthodox QM λ in Figure 4 is simply the quantum mechanical wave function, we have for example that but also that in violation of the probability equation above. Orthodox QM is not a locally causal theory.
Thus a locally causal explanation for the correlation predicted by QM requires a theory with more beables than just the quantum wave-function.
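A check along these lines can be sketched numerically. The sketch below assumes the standard textbook joint probability for a spin singlet measured along two axes in the same plane, P(A, B) = (1 - A B cos(θa - θb))/4 with A, B = ±1. The function names and the axis angle are hypothetical choices of mine. With λ taken to be just the singlet wavefunction, the probability for one outcome changes once the distant outcome is also specified.

```python
import math


def singlet_joint(A: int, B: int, angle_a: float, angle_b: float) -> float:
    """Joint probability for outcomes A, B (each +1 or -1) on a spin singlet,
    measured along axes at angle_a and angle_b in a common plane."""
    return 0.25 * (1 - A * B * math.cos(angle_a - angle_b))


def prob_A(A: int, angle_a: float, angle_b: float) -> float:
    """Probability of outcome A with the distant outcome left unspecified."""
    return sum(singlet_joint(A, B, angle_a, angle_b) for B in (+1, -1))


def prob_A_given_B(A: int, B: int, angle_a: float, angle_b: float) -> float:
    """Probability of outcome A once the distant outcome B is specified."""
    prob_B = sum(singlet_joint(A_alt, B, angle_a, angle_b) for A_alt in (+1, -1))
    return singlet_joint(A, B, angle_a, angle_b) / prob_B


axis = 0.3  # both detectors aligned along the same (arbitrary) axis

p_without_distant_outcome = prob_A(+1, axis, axis)
p_with_distant_outcome = prob_A_given_B(+1, -1, axis, axis)

print(p_without_distant_outcome, p_with_distant_outcome)
```

The first number is one half and the second is one, so specifying the space-like separated outcome alters the probability even though the wavefunction was already given. That is the sense in which the wavefunction alone does not satisfy the local causality condition.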
Local causality entails the factorisation of the joint probability for the outcomes once λ is specified. Since there are only two possible outcomes, each of these possibilities entails that the opposite outcome is pre-determined. The possible values of λ must therefore fall into two mutually exclusive and exhaustive categories. Since the measurement axis is arbitrarily chosen, the same argument will establish that λ must encode pre-determined outcomes for all possible measurement directions. Thus local causality requires deterministic hidden variable theories.
The paper goes on to derive the CHSH inequality to show that local causality conflicts with the predictions of QM. I need not discuss the details of that derivation here. This calculation assumes local causality as defined above, and that the setting of the detectors is independent of the hidden variables λ.
Are there any ways around this? The article closes by suggesting two. The first is non-Markovian causal influences (i.e. when an event at time t-2 influences events at time t directly, rather than indirectly via events at time t-1). For anyone who accepts locality, I take this to be just as unacceptable as action at a distance. Ultimately, Lorentz invariance, as it manifests itself in the standard model Hamiltonian, requires locality in time just as much as it does locality in space. The second solution offered by the paper is one where the beables themselves are not localised but spread out over a region of space. An example of this is the pilot wave of the de Broglie/Bohm interpretation.
As mentioned, this paper is a good read, and informative. There are two points I want to make in response.
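Although the derivation itself is not reproduced here, the size of the conflict is easy to illustrate. The following is a sketch under stated assumptions: the singlet correlation is taken to be the standard E(a, b) = -cos(a - b), the four detector angles are the usual textbook choice, and the function names are mine. The local side simply enumerates every deterministic assignment of ±1 outcomes. Any average over λ with settings independent of λ is a mixture of these assignments, so it cannot exceed their maximum.

```python
import itertools
import math


def singlet_correlation(angle_a: float, angle_b: float) -> float:
    """Standard quantum prediction for the spin correlation of a singlet."""
    return -math.cos(angle_a - angle_b)


def chsh(E, a1: float, a2: float, b1: float, b2: float) -> float:
    """The CHSH combination of four correlations."""
    return E(a1, b1) - E(a1, b2) + E(a2, b1) + E(a2, b2)


# Quantum value at the usual choice of detector angles.
quantum_S = chsh(singlet_correlation, 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)

# Largest value any pre-determined set of outcomes can reach:
# A1, A2 are the outcomes at one detector for its two settings, B1, B2 at the other.
local_bound = max(
    abs(A1 * B1 - A1 * B2 + A2 * B1 + A2 * B2)
    for A1, A2, B1, B2 in itertools.product((+1, -1), repeat=4)
)

print(abs(quantum_S), local_bound)
```

The quantum value has magnitude 2√2, while no pre-determined assignment exceeds 2.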
The first point is one which I have made before. It assumes that the output of the model is expressed as a probability. A probability -- and this is the definition that both this paper and I use -- is a model dependent expression of uncertainty that satisfies Kolmogorov's axioms, and makes certain assumptions about the possible outcomes of the system being studied. An amplitude is a model dependent expression of uncertainty that satisfies other axioms and makes different assumptions about the nature of the physical system. In particular, the various arguments that one uses to get from the premise of a model that satisfies local causality to various conclusions that contradict experiment make use of the axioms of probability. Now, of course, in both cases we are finally interested in comparing against a frequency distribution: directly in the case of a probability, or indirectly via Born's rule in the case of amplitudes. But the various models inspired by the different interpretations discuss single quantum events. Hidden variables (or beables) apply to single quantum events. To get from the theory for a single event to a frequency distribution to compare against an experiment we need to combine those events into an ensemble. There are two different ways in which this combination happens. On the experimental side, we simply re-run the experiment numerous times. On the theoretical side (to deduce a result which can be compared against experiment), the process of going from a single system to a frequency distribution will involve some sort of averaging process (or integration over various states; those states being parametrized by the beables); you assume that every possible combination of beables is sampled at some point in the ensemble. The final result of that averaging process is thus independent of the precise values of any beables for individual events.
It just depends on the initial conditions, the model derived from the interpretation, and the precise mechanics of how you do the averaging. The probability or amplitude will no longer depend on the beable data for individual events. And the way that process of combination works depends on whether your uncertainty is parametrized by a probability or an amplitude.
So we have two ways of thinking about this:
First route:
Model of single event →
Amplitude conditional on initial conditions and beables →
Born's rule to obtain event probability →
Combine to get result for ensemble (average over beables) →
Probability for ensemble →
Comparison against experimentally measured frequency.

Second route:
Model of single event →
Amplitude conditional on initial conditions and beables →
Combine to get result for ensemble (average over beables) →
Born's rule to obtain ensemble probability →
Probability for ensemble →
Comparison against experimentally measured frequency.
These two approaches yield different results. The second one is used in quantum calculations. But the equation above defining local causality is both expressed as a probability and contains beable data for an individual event. The only one of these two paths where it makes sense to discuss probability and beables in the same equation is the first route, where we convert to a probability before averaging over the beable information to get the result for the ensemble. In the second route, probability only comes into it after we have integrated over all the beable data; so the probability cannot be conditional upon the actual values of the beables. So Bell's formulation does not match what is done in the actual quantum mechanical calculations. This is why I keep making the point that if the interpretation leads to a model where outcomes are expressed as amplitudes then the mathematical proof of Bell's theorem and the related theorems is no longer applicable.
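The difference between the two routes can be seen in a toy calculation. This is only a sketch: I assume a system with two possible beable values, each contributing an amplitude of modulus 1/2 to the same final outcome, with a relative phase between them. The function names and the numbers are illustrative and do not come from any real model.

```python
import cmath
import math


def first_route(amplitudes) -> float:
    """Apply Born's rule to each beable value, then combine the probabilities."""
    return sum(abs(amplitude) ** 2 for amplitude in amplitudes)


def second_route(amplitudes) -> float:
    """Combine the amplitudes over the beable values, then apply Born's rule."""
    return abs(sum(amplitudes)) ** 2


def toy_amplitudes(relative_phase: float):
    """Amplitudes for two beable values leading to the same final outcome."""
    return [0.5, 0.5 * cmath.exp(1j * relative_phase)]


results = {
    phase: (first_route(toy_amplitudes(phase)), second_route(toy_amplitudes(phase)))
    for phase in (0.0, math.pi / 2, math.pi)
}

for phase, (route_1, route_2) in results.items():
    print(f"phase={phase:.3f}  first route={route_1:.3f}  second route={route_2:.3f}")
```

In this toy model the first route returns one half whatever the relative phase, while the second route runs from one down to zero as the phase goes from 0 to π. The two routes only agree when the cross terms between the amplitudes happen to cancel.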
Of course, that doesn't mean that Bell's conclusion is invalid for those models which parametrise individual events as amplitudes. But a different proof and mathematical definition of local causality is required. I haven't seen such a proof. That's not to say there isn't one (reality doesn't depend on my ignorance, and I would gladly accept correction if I am wrong), but all Bell-type arguments that I recall seeing make this same mistake. They don't apply to hidden variable theories which exclusively use amplitudes to parametrise the uncertainty of single events.
So onto my second observation. This is not one I have made before, or seen anyone else make (but, again, my ignorance of the literature on this topic is vast, so I would gladly accept correction). I am certainly not alone in emphasising that there have been, over the millennia, different definitions and interpretations of causality. In particular, I like to distinguish between substance causality (what substance did this substance emerge from?) and event causality (what was the cause of this event?). I have suggested that modern philosophy almost exclusively focuses on event causality (out of these two options; obviously other interpretations such as Hume's are also available); while when classical philosophers discuss efficient or final causality they generally refer to substance causality. It is often argued that the indeterminism of quantum physics is in conflict with causality. Some events, such as radioactive decay, happen spontaneously and seemingly without cause. I have responded that this objection just refers to event causality. Substance causality is respected by quantum physics. This is guaranteed by the conservation of (the quantum mechanical definition of) momentum, which arises from the locality of the QFT Hamiltonian, which in turn is mandated by Lorentz invariance and special relativity.
But it strikes me that Bell's definition of local causality, as expressed in Norsen's paper, makes use of event causality. He is interested in the cause of events. This is evident in the probability equation; he is discussing the probability of an event given certain factors; those factors parametrise the causes of the events.
So when I discuss causality, I almost exclusively focus on substance causality. This is what the standard model Hamiltonian, expressed in terms of its creation and annihilation operators, implies and focuses on. It describes possible decay channels: what substances a substance can decay into. When we can use perturbation theory (and with the usual caveats about renormalization etc.) each Feynman diagram represents one possible path to get from an initial to final state, and displays the possible processes of annihilation and emission. We don't know which sequence of events happens, but we can (if we are told which path happened in practice) trace out a sequence of substances. Special relativity, via Lorentz invariance, enters contemporary physics by constraining the possible forms of the Hamiltonian. In other words, the rules of special relativity, including no contact outside the light cone, only need apply to substance causality.
If there is event causality, then it is in part non-material. That is to say that material substances are not in themselves sufficient to explain any events. At least, if we are to accept that the fundamental physical substances are either those described by the standard model, or others of a similar nature. There are three ways I can think of around this. The first is to say that events are just uncaused. The second is to say that there is something physical of a different type to those described by the Hamiltonian which drives the events (such as a pilot wave). The third is to say that the missing data needed to specify the event causes is non-physical. That is to say something immaterial, or which can't be expressed by our usual methods of mathematical representation: something which exists outside time and space.
The first of these approaches is troubling, since it would imply that there is, at the heart of the way the universe works, an irrationality. A place where some conclusions don't have premises; where logic breaks down. But this isn't the troubling part. A wholly irrational universe I can imagine. But what we would have is a universe with a rational substance causality, and a partially rational event causality. Sometimes events are determined by their physical causes, such as in an absorption event. Always the possible events are constrained by what is allowed by the Hamiltonian. It is this mixture of rationality and irrationality which would be hard to grasp.
The third approach is to propose that in addition to the material substances described by physics, there is an immaterial substance, a timeless, spaceless, omnipotent and free agent which is directly involved in every process of determining which potential to actualise, while respecting the final causes of substances (unless He has a good reason not to). I call this God; you can call it the source of physical law if you prefer. This substance is in some sense simultaneously in contact with every point in the universe. We can then have a breakdown in local event causality while respecting local substance causality. This is the route that I take.
The second approach is in some respects similar. It proposes that the missing element in the event causes is something physical. For example, a pilot wave. But here we run into the problem of why this thing isn't represented in our most fundamental theory. If it is physical and interacts with particles (which it must do if it influences events), then we should be able to express it mathematically, and put it into the theory. This would require a re-writing of QFT, and in particular the full standard model, incorporating this new feature. The problem of combining Bohm's approach with QFT is well known. The proposed approaches I know of which might be viable (again, this might just be a reflection of my ignorance) still require stochastic jumps for each creation/annihilation event. In other words, they don't answer the problem of event causality, and ultimately must collapse into either the first or third approaches.
Entanglement is one of the major issues in interpreting quantum physics. I make no claims to have the definitive answer myself, and I doubt that anyone else can do so either. Bell's work (and others similar to it) is an important contribution. But we need to be careful. I still maintain that Bell's argument makes a mistake in treating the hidden variables in terms of a probability rather than an amplitude. It also fails to make the (in my view important) distinction between event and substance causality.
The question is primarily over the completeness of quantum physics. When we make the distinction between event and substance causality, a picture emerges. If we only focus on substance causality, then quantum physics is complete. Lorentz invariance and special relativity hold. Since Bell's notion of local causality specifically refers to event causality, he does not refute this. With regards to event causality, quantum physics is incomplete. Bell's argument comes into play here, and is an important piece of evidence. But we don't need Bell's argument to make this conclusion. The very stochastic nature of quantum physics alerts us to it. Bell does demonstrate that no classical deterministic theory, mediated by particle or field interactions, could underlie quantum physics (without violating special relativity). This, in turn, leads us to look at meta-physical (we might say supernatural, in the sense of beyond nature) event causes.
But I have discussed that many times before, and you all should know where I think that thought leads.