Firstly, I must apologise that it is such a long time since my last post. Partly it is because this post took longer to prepare than I had planned, but mainly because I got sidetracked with other things which kept me busy for at least a month. Still, I am back now.
Recently (or at least it was recently when I started writing this), the commenter Matthew has posted a number of interesting comments which are certainly worthy of a response with a blog post rather than an inline reply. The comments are
- On probability and amplitude.
- On the difference between quantum mechanics and field theory.
- On Bell's theorem.
In this post, I intend to look at the first two of these. I'll address the third at a later time, because this post is long enough already and it is already far too long since I last posted here.
Probability and Amplitudes
The first comment was raised in response to an article I wrote comparing the use of probability and amplitude in quantum physics. The article was intended to motivate a discussion on the foundations of quantum physics. My aim in that post was to distinguish the rules of classical probability (which we see with, for example, tennis balls) from those of quantum physics (which we use to calculate the behaviour of electrons). My post described the basic premises behind probability theory, in particular that probabilities are always additive. Probabilities are used to predict frequency distributions. Suppose that we had a path A which had a 20% chance of achieving outcome X. Then suppose that we have a path B which has a 30% chance of achieving outcome X. If we take path A 500 times, we expect on average 100 hits at outcome X. If we take path B 500 times, we expect on average 150 hits at outcome X. If we then fire 1000 tennis balls, half of which go through path A and half of which go through path B, then we expect 250 hits on outcome X.
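The tennis-ball arithmetic can be checked in a few lines. This is only a sketch of the additive rule; the numbers are the ones from the example, and the variable names are my own illustrative choices.

```python
# Classical, additive probabilities for the tennis-ball example.
# The numbers come from the text; the names are illustrative only.
p_hit_given_a = 0.20  # chance of outcome X via path A
p_hit_given_b = 0.30  # chance of outcome X via path B
balls_via_a = 500
balls_via_b = 500

# Expected hits are simply added, path by path.
expected_hits_a = balls_via_a * p_hit_given_a
expected_hits_b = balls_via_b * p_hit_given_b
expected_hits_total = expected_hits_a + expected_hits_b

# The same number via the law of total probability, with P(A) = P(B) = 0.5.
p_hit = 0.5 * p_hit_given_a + 0.5 * p_hit_given_b
total_balls = balls_via_a + balls_via_b

print(expected_hits_a, expected_hits_b, expected_hits_total)
print(p_hit * total_balls)
```

Both routes give about 250 expected hits, which is the additivity that the rest of the argument relies on.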
In that post I observed that this sort of reasoning doesn't work in quantum physics: it leads to results inconsistent with observation. Instead, we have to parametrise uncertainty by some other means; and the method of choice is to use amplitudes.
Of course, classical physics is deterministic, so why use probability? Probability, in classical physics, is used when we don't know all the premises for the calculation with certainty. For example, if we know the starting location for a particle was at exactly x=10.3456252123 in whatever units we use, and we know equally exactly its mass, starting velocity, and those of all the other particles interacting with it, and the correct laws of motion, then we can predict its future with complete certainty. But we can never know things to that precision. All we know is that it starts at a place 10.34 < x < 10.35, and similarly for all the other parameters. If it passes through a fluid or gas, all we know are the average pressure and a distribution of the velocities of the particles interacting with it, but not the microscopic details. So there is uncertainty in the initial conditions, and often in the interactions. So we cannot make perfect predictions; the best we can do is express our results as a range of possibilities. Probability is used to treat this uncertainty with a degree of mathematical rigour. We cannot tell which outcome will occur, but we can say how probable they all are. Of course, we can improve the resolution of our conclusions to an arbitrary precision by improving our knowledge of the initial conditions.
Here the use of probability is valid. Each possible starting position leads to just one final result. So all we have to do is add up how many different starting positions lead to a particular final result (if necessary adjusting for the count of how many times the system would start in that position), and that immediately gives us a probability.
That same sort of uncertainty is also present in quantum physics, but quantum physics adds some new features. The first is that it is indeterminate: there are many possible outcomes which can arise from a given initial state. But the main issue is the superposition or interference principle. If two different paths can lead to the same result, then the amplitudes can cancel each other out. The probability when we consider the effects of both paths is not just the sum of the probability for each path individually. This runs against the rules of probability, and is certainly counter-intuitive. My own solution to this is to say that the quantum amplitude and the probability are both measures of uncertainty, which obey different mathematical rules. And that last part is crucial. The mathematics behind probability is built from various axioms, which make various assumptions about the system being modelled. The mathematics behind amplitudes is built from another set of axioms, which make different assumptions about the system being modelled. Both work well as parametrisations of uncertainty. So which one we use depends on what system we want to model.
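To make the cancellation concrete, here is a minimal sketch with two paths whose amplitudes have the same size but an adjustable relative phase. The amplitude value of 0.5 and the function names are assumptions chosen purely for illustration; they do not come from any particular experiment.

```python
import cmath
import math

# Hypothetical amplitude for reaching the same outcome via path A.
AMP_A = 0.5 + 0j


def combined_probability(relative_phase):
    """Add the two path amplitudes first, then take the modulus squared."""
    amp_b = 0.5 * cmath.exp(1j * relative_phase)
    return abs(AMP_A + amp_b) ** 2


def sum_of_path_probabilities(relative_phase):
    """What additive probability would give: |a|^2 + |b|^2."""
    amp_b = 0.5 * cmath.exp(1j * relative_phase)
    return abs(AMP_A) ** 2 + abs(amp_b) ** 2


for phase in (0.0, math.pi / 2, math.pi):
    print(phase, combined_probability(phase), sum_of_path_probabilities(phase))
```

The sum of the individual probabilities stays at 0.5 whatever the phase, while the amplitude rule ranges from about 1 (in phase) down to about 0 (opposite phase).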
There are numerous different ways in which we can mathematically model uncertainty: it is not just restricted to probability and amplitudes. All of these models are mathematically and logically consistent. They are just built on different axioms. Which parametrisation we use depends solely on which one best fits the physical system we are trying to model. That should be the only criterion.
Next, I should say that it is relatively easy to convert from an amplitude to a probability. But the reverse is not true: we cannot take a probability, and convert it to an amplitude. That is because the conversion from amplitude to probability loses information. Once it is lost, we can never get it back. To convert from a probability to an amplitude would require adding new information -- data which we would have to pull out of thin air, because we certainly won't get it from experiment.
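The one-way nature of the conversion can be seen in a short sketch. The two amplitudes below are arbitrary illustrative values of my own; the only point is that they share a modulus and differ in phase.

```python
import cmath


def probability(amplitude):
    """Amplitude to probability: the modulus squared. The phase is discarded."""
    return abs(amplitude) ** 2


# Two different amplitudes: same modulus (0.6), different phase.
amp_1 = cmath.rect(0.6, 0.0)
amp_2 = cmath.rect(0.6, 2.0)

print(amp_1, probability(amp_1))
print(amp_2, probability(amp_2))
```

Both amplitudes map to a probability of about 0.36, so nothing in that number tells us which amplitude we started from. Taking the square root recovers the modulus 0.6, but the phase would have to be supplied from somewhere else.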
I have a couple more points to make as background. The first is the difference between frequentist and Bayesian understandings of probability.
Probability, as I said, makes certain assumptions about the system being modelled. These conditions, as it happens, are also satisfied by a frequency distribution. The frequentist interpretation of probability takes this link as fundamental: it sees probability as primarily a statement about frequency. The Bayesian, on the other hand, regards probability primarily as a measure of uncertainty, one that happens to be useful when making predictions about the frequencies you get when you have a large number of samples. From what I wrote above, you will find that I am very much (in this regard) on the side of the Bayesians.
Firstly, the Bayesian interpretation allows us to make a distinction between measurement (frequency) and theoretical calculation (probability): one which is lost if we adopt a literal interpretation of frequentism. Probability is something you calculate. Frequency is something you measure.
Secondly, a Bayesian probability is valid whether we have a large or small sample (or even a single sample) of data points, while measurements of frequency always require a large sample. The Bayesian interpretation is useful when discussing the probability of a single coin toss. The frequentist interpretation, not so much. But sometimes in physics we are interested in single events, or discussing probabilities of experiments which have not yet been performed.
Thirdly, treating probability (or amplitude; in this respect there is no difference between them) as uncertainty is consistent with what we do in physics. Every calculation starts from a premise, and leads to a conclusion. Every probability is conditional on those premises. And this is what allows probability to be objective. A probability is not based on what we (subjectively) know or don't know, but only on what we put into the calculation. One never ought to state "The probability of X is 0.7654" (a subjective statement, and a nonsensical one), but instead "The probability of X given these particular premises is 0.7654." In physics, we start from an initial state, and want to know something about the final state of the system, or some property of the system. The statement is never "The probability of a final state X is 0.7654" (meaningless and nonsensical), but "The probability of a final state X given these initial conditions is 0.7654." Never "There is observable X" (meaningless and nonsensical) but "This particular system has observable X viewed from this particular reference frame." Every physical statement is conditional. Thus inasmuch as we apply probability to this aspect of theoretical physics (estimating experimental error is a different question), it gels consistently with the logical Bayesian interpretation of probability, so that is what I will use.
Secondly, I want to mention the difference between empiricism and realism. Empiricists believe that all we can know about the world can only come from observation. Theoretical models (which go beyond observation) are of dubious validity. All we can observe are particulars, so all we can know about are particulars. All we have is the sensation of redness associated with a particular object. A theoretical model is a useful mental construct, but does not and cannot correspond to anything in reality.
A realist, on the other hand, says that theoretical models are both knowable and useful (I am using the word in a rather more general sense than its technical philosophical definition). There is an underlying objective reality which is both accessible to reason and which contains more than what is directly accessible to our senses; the sensual data can be deduced from this reality. For example, we can think of Plato's theory of forms and Aristotle's formal causality, two classic examples of realist philosophies, as being about an underlying structure to the universe, not directly accessible to our senses (only indirectly accessible through their effects), but accessible to reason. Today we would call this a theoretical understanding (as opposed to empirical data or a phenomenological understanding, extrapolated from the raw data). In particular, it is possible to understand universals, features which are shared between different objects. The universal of redness provides a theoretical understanding of why we see certain objects as red, and what they have in common. The realist accepts that there is something in reality which can be accurately represented by a theoretical model; and that model is both understandable and useful in explaining our observations. This model isn't merely something we invent but accurately represents a feature of the real world; something that is itself inherent within nature. Realists can be (and ought to be) thoroughly empirical, in that they use observation to help them understand. The difference with empiricists is they also consider bare logical analysis as a means towards genuine understanding. To understand the underlying universals (or form), we need both reason and observation (or experiment).
Once again, I most definitely take the side of the realists here. I think that the success of quantum physics, whose theoretical formulation is driven by mathematical concepts from geometry and group theory as much as it is from experiment, makes this clear. We have a theoretical understanding which does probe a deeper level than mere sensual impressions. But the theoretical is backed up by detailed and precise experiments.
Why discuss realism and empiricism? Firstly, because it relates to the frequentist-Bayesian debate. Frequencies are what we measure. So an empiricist will take frequencies as the fundamental data, and then link from those to probabilities. The amplitude then merely becomes a book-keeping tool, with no basis in reality. Empiricists thus most naturally tend towards frequentist thought. Realists, on the other hand, see the theoretical model as most fundamental (albeit that observation is required to determine which theoretical model is correct). Amplitude is linked to the theoretical model. From the amplitude we derive probability, which is used to predict frequency, which is then compared against measurement. Most realists would, then, I think be more sympathetic to the Bayesian interpretation of probability.
Secondly, it relates to how we interpret the internals of quantum physics. Quantum physics is interested in two different types of object: firstly, there are observables, which correspond to things that we measure. That is clear enough. Secondly, there are certain objects in the equations which we cannot directly observe, but which can be used to infer conclusions about the observables. For example, in quantum mechanics, we have the wavefunction. A realist might say that the wavefunction plays a key role in the equations because it corresponds to something in reality: perhaps a Platonic form or an Aristotelian potentia. The empiricist, on the other hand, need not concern himself with the microscopic details because to him only the observations have an understandable reality. To my mind, this raises the question of why use of the wavefunction is so successful. If it does correspond to something in reality, then it is clear that it would lead to successful predictions. If not, then those can only be happy accidents. (My own view is half-way between these; I take the wavefunction to be partly a representation of the state of the physical object, and partly a parametrisation of our uncertainty. Both of these have the same mathematical representation; and the quantum amplitude is powerful enough to do both jobs simultaneously.)
So, with that introduction out of the way, let's go to the actual comment. There are two thoughts, so I will discuss each in turn.
It seems to me that the conclusion that quantum mechanics fails to obey classical probability theory is premature. My first thought is that the failure of classical probability is not just unintuitive but may be incoherent: similar to how the idea that classical propositional logic must be replaced by "quantum logic," where the usual rules of inference fail, seems to me to be incoherent. One can formulate alternative logical systems, but classical logic is always more fundamental - I don't see how one can get around the laws of non-contradiction and excluded middle as necessary truths. Analogously, classical probability theory seems to be more fundamental than "quantum probability theory;" when we want to compute outcomes, we always translate back into something that is in the language of classical probability theory by taking the squared absolute values of the quantum amplitudes. Moreover, we can reason about quantum mechanics with classical probability. (e.g. there can be situations where there is a certain probability that an electron has this wavefunction and a certain probability that is has a different wavefunction - these are what "mixed states" are, no?)
Firstly, I should note that the question is not about what is more fundamental from the perspective of which comes first in rational development. The question is which most closely mirrors the objects (or our knowledge of the objects) studied by quantum physics. Just because we are discussing fundamental physics doesn't mean that it is the most fundamental from the mathematical perspective.
Secondly, I should say that I don't quite claim that quantum mechanics fails to obey classical probability theory. Clearly, one can get probabilities from the amplitudes used in quantum physics; and when comparing predictions against measurement we have to convert from amplitude to a probability. These probabilities do behave as we expect probabilities to do. My claim is not that probability has no role to play in quantum physics, but that amplitudes more accurately describe both what is happening in the reality that underlies the measurements and are a better way to parametrise our knowledge (or lack of it) of that reality.
I will start by discussing the law of non-contradiction and the law of the excluded middle. The law of non-contradiction states that something cannot be both p and not-p at the same time. The law of the excluded middle states that something must be either p or not-p. The issue when applying these directly to quantum physics is down to the principle of superposition. For example, suppose that we make a measurement in a quantum system with two possible results, A and B. The law of non-contradiction seems to suggest that the system can't be in both state A and state B at the same time. The law of the excluded middle seems to suggest that the system must be in either state A or state B. The problem is that quantum physics tells us that the system could be represented by a superposition, (A + B)/√2. One naive way of interpreting this is to say that the system is in some way both in state A and state B, violating the law of non-contradiction. Another naive way of interpreting this is to say that it is in neither state A nor state B, violating the law of the excluded middle.
Of course, that last point only violates the law of excluded middle if we assume that the two possible measurement outcomes represent the only two possible internal states of the system. Suppose instead that the allowed states of the system are given by (cos θ A + sin θ B) for all possible values of θ. In that case, the superposition is just one of the infinite number of possible options. The assumption that the only possible internal states of the system are the measurable states is only required if we assume empiricism. Reject empiricism, and there is no problem. But we need to generalise the logical principles to make it clear that the measurable states are not the only possible internal states.
Alternatively, we can think of superpositions as a rotation of the basis of the system. These different bases are just different ways of representing the data; all mathematically equivalent to each other, but expressed differently. An analogy is the distinction between Cartesian coordinates and polar coordinates. Both can be used to convey the same information, but do so via entirely different numerical representations. We can rotate to a new basis, A' and B', such that the superposition state is an exact state in the new basis. (For example, we can write A' = (cos θ A + sin θ B) and B' = (-sin θ A + cos θ B); the superposition state in the A/B basis is an exact state in the A'/B' basis). So everything in this interpretation would be in an exact state in some particular basis. If we adopt this approach, then we would need to generalise the laws of non-contradiction and excluded middle to take into account the possibility of expressing the system in different bases.
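As a small check of the rotation, the sketch below takes the superposition (A + B)/√2 and works out its components along A' and B' for θ = π/4. The function name and the restriction to real coefficients are simplifying assumptions of mine.

```python
import math


def components_in_rotated_basis(a, b, theta):
    """Components of the state a*A + b*B along
    A' = cos(theta) A + sin(theta) B and B' = -sin(theta) A + cos(theta) B.
    Real coefficients only, for simplicity."""
    along_a_prime = math.cos(theta) * a + math.sin(theta) * b
    along_b_prime = -math.sin(theta) * a + math.cos(theta) * b
    return along_a_prime, along_b_prime


# The superposition (A + B)/sqrt(2), written in the A/B basis.
a = 1 / math.sqrt(2)
b = 1 / math.sqrt(2)

along_a_prime, along_b_prime = components_in_rotated_basis(a, b, math.pi / 4)
print(along_a_prime, along_b_prime)
```

Up to floating-point rounding, the component along A' is 1 and the component along B' is 0: the state that looks like a superposition in the A/B basis is simply the state A' in the rotated basis.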
So the law of non-contradiction would be expanded to saying "A system cannot simultaneously be p and not-p for all not-p expressed in the same basis as p". The law of excluded middle is expanded to mean "A system must be either p or any not-p expressed in the same basis as p". Classically, people don't need those qualifications, because there is only one basis in which you express the outcomes. But when there are multiple bases, one needs to be explicit. This is not really a change to the laws of logic. The laws of logic are clear that they only apply to things in the same respect. For example it is no contradiction to say that a string is simultaneously hot and cold, if one end of the string is hot and the other cold. Nor is there contradiction in saying that something is observed as both red and blue; if the two observers are approaching it at different speeds. One adds qualifications to the law to counter cases such as these. Only when the statements are made in the same respect and context, or those contexts are known and can be mapped onto each other, can we think about whether they contradict or don't contradict each other. One needs to add similar qualifications to account for the possibility of superposition.
Or, alternatively, we can view the superposition as merely a statement about our knowledge of the system. In other words, that the system really is in either state A or B, but we don't know which, so need to parametrise our uncertainty by using the superposition. I don't think this will work by itself (there are conjugate observables, where an exact value for one observable means an indeterminate value for another). But it is, I believe, part of the story of what the amplitude represents. If we adopt this interpretation, we still need to combine it with at least one of the methods of the previous paragraphs (although saying that the amplitude is in part a statement of our knowledge of the system is, I think, necessary to explain entanglement).
So the quantum superposition doesn't violate the logical rules, but we have to make clear that to define a state p we need to both specify the state (A or B) and the basis used to parametrise the results, and that the logical rules only directly apply to states in the same basis. In classical physics, there is only one possible basis, so one only needs to specify the state. "Quantum probability" thus represents a generalisation of "classical probability". One can express every classical (single state) result in terms of quantum logic, but the opposite is not true: quantum systems can't be expressed in terms of a single pair of states in one basis.
The amplitude is not irrational. It obeys the basic laws of logic, such as the principle of non-contradiction, as does the probability. But both of them have additional axioms on top of those principles, which is where they differ. From the perspective of logical analysis alone (aside from any physical or empirical considerations), they are equally valid ways of doing a similar job. However, it turns out that the modulus square of the amplitude is consistent with the axioms of probability.
So what of the claim that classical probability is more fundamental than the quantum amplitude because we can always convert from the amplitude to the probability, but not vice versa? I personally see this as evidence for the opposite conclusion. The more fundamental system is the one which contains the most information. For example, we consider statistical mechanics as more fundamental than thermodynamics, since we can derive the results of thermodynamics (which deals with average quantities such as pressure only) from statistical mechanics (which is concerned with the motion of the individual particles). To get from statistical mechanics to thermodynamics involves a process of averaging, losing information. Equally we call quantum physics more fundamental than classical physics, since the principle of least action arises as a particular limit of quantum theory, while one cannot derive quantum theory from the principle of least action. The same should apply with the theoretical framework. The quantum amplitude contains more information than the classical probability. That extra information is not redundant (unless one is using the quantum formalism to describe a classical system), because it is necessary to understand the dynamics of the system. Equally, gauge symmetry, fundamental to modern quantum theory, only applies if the fundamental variables are complex amplitudes rather than real probabilities.
In other words, the amplitude is related to the underlying theoretical structure, while the probability is related to measurement outcomes. And this is where empiricists and realists might take different paths. Empiricists regard observation as primary, and would thus naturally consider the probability first. Realists regard the underlying theoretical structure as primary, and would therefore consider the amplitude (which is aligned to it) as more fundamental. Given the reasons outlined in the previous paragraph, I see this as further evidence for realism over empiricism.
Can we reason about quantum mechanics with classical probability? In some circumstances, yes. The example given is mixed states, which arise, for example, in entangled systems where you have a composite system with two independent Hilbert spaces. Here we treat the system as a statistical ensemble of single particle wavefunctions. But this only works when there is no interference between the two systems. Probability cannot describe interference effects, which are at the heart of the differences between classical and quantum physics. For this, we need an amplitude description. Of course, we have interference in classical mechanics, in wave systems; but there are differences. In a classical wave system, interference is between different objects of the same sort; you need to have a large number of particles, or a delocalised observable, to see the interference effect. In a quantum system, interference effects appear even when you have only one particle. The particle is only ever observed as being localised, and there is nothing for it to interfere with. In classical terms this doesn't make sense. My interpretation is that parametrising uncertainty by an amplitude is enough to explain this. Bohmian mechanics (where there is a delocalised wavefunction behind the scenes driving along a point-like particle) would also work, but has considerable issues in relativistic quantum physics, particularly quantum field theories.
So while some quantum systems make use of probability, even these "mixed states" are an ensemble of pure states. But to understand pure states, we need to use amplitudes to express our uncertainty.
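The difference between a pure superposition and a mixed state can be illustrated with a two-state system. In this sketch the "mixture" is a 50/50 classical ensemble of A and B, the pure state is (A + B)/√2, and both are measured in the A/B basis and in a basis rotated by 45 degrees. All names and the choice of states are illustrative assumptions of mine.

```python
import math

S = 1 / math.sqrt(2)
A = (1 + 0j, 0j)
B = (0j, 1 + 0j)
A_PRIME = (S + 0j, S + 0j)       # (A + B)/sqrt(2), a basis state after a 45 degree rotation
PURE = (S + 0j, S + 0j)          # the pure superposition (A + B)/sqrt(2)
MIXTURE = [(0.5, A), (0.5, B)]   # (classical weight, pure state) pairs


def pure_probability(state, outcome):
    """|<outcome|state>|^2: amplitudes are combined before squaring."""
    amplitude = sum(o.conjugate() * s for o, s in zip(outcome, state))
    return abs(amplitude) ** 2


def mixture_probability(mixture, outcome):
    """Classical weighted average over the pure states in the ensemble."""
    return sum(w * pure_probability(state, outcome) for w, state in mixture)


print("outcome A :", pure_probability(PURE, A), mixture_probability(MIXTURE, A))
print("outcome A':", pure_probability(PURE, A_PRIME), mixture_probability(MIXTURE, A_PRIME))
```

In the A/B basis the two are indistinguishable (both give about 0.5 for outcome A). In the rotated basis the pure state gives outcome A' with probability of about 1, while the mixture still gives about 0.5. The classical weights handle the ensemble, but each member of the ensemble still needs its amplitudes.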
So let us move onto the second thought. This is in reference to an argument I made about why classical probability doesn't work in a quantum interference experiment. I considered the two slit experiment, and stated that the distribution of particles hitting a detector screen at a particular position was not the sum of the distributions expected from the two slits. However, the amplitude at the final destination is the sum of the two amplitudes for the different paths. My argument was:
So let us consider the standard quantum experiment, the two slit experiment. We have a source S which emits a beam of particles, towards a barrier with two slits, A and B. The particles then hit a screen at position X. Our understanding of the initial conditions and the physics is denoted as p. It doesn’t matter for this example what that physics is. It could be Newton’s physics, Aristotle’s, Einstein’s, or something entirely different; the only stipulation is that it accurately describes the paths of the particles.
Now if the particle passes through slit A, there is a certain probability that it will hit the screen at position X, which we denote as P(X|Ap). This is expressed as a probability rather than a certainty; maybe we are uncertain about the initial conditions or the mass of the particle or the forces acting on it, so we are not certain about the final result X. Equally the probability that the particle will hit X after passing through slit B is P(X|Bp). The probability that it will pass through slit A is P(A|p) and the probability that it will pass through slit B is P(B|p).
And after a few calculations, I concluded that

P(X|p) = P(X|Ap)P(A|p) + P(X|Bp)P(B|p)

Which is the wrong result. I would subsequently argue that if you considered amplitudes rather than probabilities, then this argument would work. The amplitude for a particle hitting the detector at X given the initial conditions is the sum of the amplitudes for the particles travelling through both slits. So, onto the objection:
My second thought is that the argument you have made to infer that quantum mechanics does not obey classical probability theory ignores quantum contextuality. In your equation (1):
P(X|p) = P(X|Ap)P(A|p) + P(X|Bp)P(B|p)
The probabilities that you are adding up to get the theoretical result, (14), do not correspond to those written in (1). The probabilities that you are adding up are instead:
P(X|Ap')P(A|p') + P(X|Bp')P(B|p')
where p' is a different experimental setup than p, namely, one where we have a which-path detector set up to see where the electron goes, or where you have one of the slits blocked to ensure that it goes through the other one.
In the quantum case, the probabilities that 4 particles hit X when they go through slit A and 10 particles hit X when they go through slit B aren't applicable to the experiment where we don't know which path the particles take, and this isn't because classical probability doesn't apply here, but because quantum theory doesn't make a prediction about which path the particle takes when there is no which-path measurement being made. That sounds incredibly weird, but only because orthodox quantum theory is incredibly vague about what is going on at the microscopic level, speaking as if there is a particle travelling when really all that the mathematics talks about is a wavefunction.
This objection is certainly worth taking seriously. However, I think that there are a few reasons why it breaks down.
I will start by considering that final paragraph. Orthodox quantum theory is incredibly vague about what is going on at the microscopic level. This is, of course, where the different interpretations of quantum theory come in. Mathematically, what we have is a Fock state and a (time ordered) S-matrix describing how it evolves in time. But can we use that to say what is happening at a microscopic level? I tend to interpret the mathematics literally. To over-simplify (in practice there are complications, particularly around renormalisation, the convergence of the series and the vacuum state), I treat the Feynman diagram expansion fairly literally: each diagram represents one possible path from the initial to final state. In this interpretation, the theory isn't vague about what is happening at the microscopic level, except that we don't know which of the paths is taken in practice. Of course, in quantum mechanics, where Feynman diagrams play no role, we don't have this interpretation available. I would therefore say that quantum mechanics is vague about what happens at the microscopic level because it is only an approximation to the reality, where electromagnetic and other gauge fields are treated classically rather than as quantum objects, and particle creation and annihilation are ignored. Quantum mechanics thus does not correctly understand interactions, and cannot be expected to get the microscopic details correct. Field theory does, however, give us a natural picture of what is happening behind the scenes.
Quantum theory doesn't make a prediction about which path the particle takes when there is no which-path measurement being made. This depends on what is meant by prediction. If the word is meant in the sense "This particle will go down this path," then no. Quantum physics is indeterminate. But that is the same whether we view probability or amplitude as more fundamental. If the prediction concerns the likelihood for a particle to go down a particular path, then we can make a prediction, as long as that prediction is expressed in terms of an amplitude rather than a probability. We can calculate the amplitude for the particle to go through each slit, and this amplitude remains the same whether or not we actually make the measurement. We can, of course, convert that amplitude into a probability to compare against a measurement; but when we do so we destroy the phase information of the amplitude, and lose the interference data.
So we can say that in reality the particle goes through one of the slits, but we can't know which one in the absence of a measurement.
So let us move onto the main focus of the objection. Usually, if one wants to demonstrate the inadequacy of a classical hidden variables theory, one uses Bell's theorem. But since I was using the two slit experiment in my previous post, I will work with that. The statement is that adding a measurement about which slit the particle goes through changes the probability distribution. This is true, to a certain extent. Indeed, it affects the standard quantum physics calculation; only that calculation is based on amplitudes rather than probabilities. The reason it affects the measurement is that the measurement destroys the phase data of the amplitude. Since the probability contains no phase data, if uncertainty were fundamentally represented by a probability, then one could expect that there would be a way of making a measurement without affecting the dynamics of the system. Certainly this is true in classical physics.
Let us write the amplitude as Ψ. In this case, we are interested in the amplitude that the particle ends up at X given the initial conditions p and that it must go through either slit A or B (which for simplicity, I am assuming are point like).
Note that the two slits have a certain non-zero width. In practice, this means that we have two sources for the interference. Firstly, amplitudes for particle beams from one slit will interfere with amplitudes for particle beams from the other slit. This leads to an oscillation between points of maximal and minimal intensity. The spacing between these points depends on the spacing between the slits. Secondly, even within a single slit, we have interference between light coming from the top of the slit and the bottom. This leads to another sequence of maximal and minimal intensity, with the spacing between the minima now depending on the width of the slit. The intensity pattern for the two slit experiment without a measurement of which slit the particle went through will be the product of these two effects: the closely spaced two-slit fringes modulated by the broader single-slit envelope. The intensity pattern when we do measure which slit the particle went through will be the sum of the two single slit diffraction patterns. This will also oscillate in intensity, and we can arrange the experiment so that there are points of zero intensity when we perform the experiment with a measurement of which slit the particle went through.
The amplitude for a particle to hit a location X given the initial conditions and that it went through slit A is
The corresponding amplitude for slit B is
We add them together to get the total amplitude
Ψ(X|A,p) represents the amplitude that we measure result X contingent on A and p. [I am simplifying here to keep the equations more readable: in practice, one should sum over the amplitude calculated from every point in the slit A or B. If you prefer, you can mentally add a sum over A and B whenever I have repeated indices.] But unlike probabilities, these two contingent facts have a different status, which is why I have separated them with a comma in the expression. There is a difference between hypotheticals (which we don't know) and things which are the result of measurement. The first type of condition is represented by amplitudes, which contain phase data. But we measure frequencies, which don't contain phase data. The rule for amplitudes is that we combine those priors which are represented by amplitudes, but not those which represent facts. (This is my own notation; this distinction is related to the distinction between pure and mixed states.) Thus,
Performing the calculation to get the prediction for the final amplitude and thus observed frequency distribution is a simple matter of geometry and trigonometry, but I don't need the exact result here. All I will say is that it gives an interference pattern, with alternating fringes of high and low amplitude. When we take the modulus square of this expression to get a probability which we can compare against the experimentally measured frequency, we get
where the dagger indicates complex conjugation. The first of these terms represents the frequency you would get from passing through slit A alone, the second represents the frequency you would get from passing through slit B alone, and the third and fourth provide the interference effect.
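For readers who like to see numbers, here is a minimal sketch of the four terms, assuming point-like slits, equal weight for each slit, and an amplitude that is a pure phase set by the path length. The geometry values and the names `psi_A`, `psi_B` and `slit_amplitude` are illustrative assumptions of mine, not part of the calculation above.

```python
import numpy as np

# Illustrative geometry (assumed values, arbitrary units).
wavelength = 1.0                       # wavelength of the beam
d = 5.0                                # separation between the point-like slits
a = 1000.0                             # distance from the slits to the screen
x = np.linspace(-300.0, 300.0, 2001)   # positions X on the screen

k = 2 * np.pi / wavelength

def slit_amplitude(slit_position):
    """Amplitude to reach x via a point slit: modulus 1/sqrt(2), phase k * path length."""
    r = np.sqrt(a**2 + (x - slit_position)**2)
    return np.exp(1j * k * r) / np.sqrt(2)

psi_A = slit_amplitude(+d / 2)   # stands in for the slit A amplitude
psi_B = slit_amplitude(-d / 2)   # stands in for the slit B amplitude

# Amplitude rule: add first, then take the modulus square.
p_amplitude = np.abs(psi_A + psi_B)**2

# The four terms: two "single slit" terms and two cross (interference) terms.
term_A = (psi_A * psi_A.conj()).real
term_B = (psi_B * psi_B.conj()).real
cross = (psi_A * psi_B.conj() + psi_B * psi_A.conj()).real

# Probability rule: just add the two single-slit terms.
p_additive = term_A + term_B

print("amplitude rule: min %.3f, max %.3f" % (p_amplitude.min(), p_amplitude.max()))
print("additive rule:  min %.3f, max %.3f" % (p_additive.min(), p_additive.max()))
```

In this toy set-up the additive rule gives a flat distribution, while the amplitude rule swings between nearly zero and twice the flat value; the difference is carried entirely by the two cross terms.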
That's the case when we don't have measurements to determine which way we pass through the slit. But what if we do have such a measurement? I will assume that we have a device sitting on slit A which reads out whether or not the particle passes through that slit. This will be represented by the operator
Of course, we simultaneously gain information about whether or not the particle passes through slit B, so I will put in another operator to represent that. We are no longer calculating the amplitude for recording a particle at X given the initial conditions, but the amplitude for recording a particle at X given the initial conditions and whether or not our measurement records a particle passing through slit A. There are thus two amplitudes, depending on the result of the measurement at slit A:
mA and mB are results of measurements rather than a hypothetical slit. They don't represent something we are uncertain about (with the uncertainty parametrised by an amplitude), but something we know for certain. This means that we can't easily combine these two conditional amplitudes in the way I did above for Ψ(X|A,p) and Ψ(X|B,p). It is more convenient to express this in a vector notation, which means that we can keep everything on one line:
In this notation, the amplitude Ψ(X|pm) and the measurement operators are vector quantities, each component representing a possible result of the experiment. The measurement will also have the effect of randomising the phase of the particle going through slit A, which I have denoted by adding in the angle θ.
When we take the modulus square of this amplitude to compare against the experiment, we get no interference terms (because the two paths are on different components of the vector)
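A small numerical sketch of the same point, under assumptions of my own: the two complex numbers below are made-up stand-ins for the slit A and slit B amplitudes at a single screen position, and the evenly spaced grid of θ values stands in for a uniformly random phase.

```python
import numpy as np

# Hypothetical amplitudes at one screen position X (illustrative values only).
psi_A = 0.6 * np.exp(1j * 0.3)   # stands in for the slit A amplitude
psi_B = 0.5 * np.exp(1j * 2.1)   # stands in for the slit B amplitude

def prob_with_measurement(theta):
    """Vector amplitude: slit A (with phase kick theta) and slit B on separate components."""
    vec = np.array([np.exp(1j * theta) * psi_A, psi_B])
    return np.vdot(vec, vec).real   # sum of |component|^2, so no cross terms

def prob_without_measurement(theta=0.0):
    """Single amplitude: both paths are added before taking the modulus square."""
    return np.abs(np.exp(1j * theta) * psi_A + psi_B)**2

thetas = np.linspace(0.0, 2 * np.pi, 360, endpoint=False)
measured = np.array([prob_with_measurement(t) for t in thetas])
phase_averaged = prob_without_measurement(thetas).mean()
no_interference = abs(psi_A)**2 + abs(psi_B)**2

print("with measurement (any theta):", measured.min(), measured.max())
print("without measurement, theta = 0:", prob_without_measurement(0.0))
print("single amplitude averaged over theta:", phase_averaged)
```

The vector form gives the same number whatever θ is, because the two paths never meet in a cross term. Averaging the single-amplitude result over θ lands on the same value, which is one way of seeing why a randomised phase and a which-slit record both remove the fringes.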
I ought to make a comment here about locality, since that is going to be important below. Locality is the principle that there is no action at a distance: all interactions take place at the same point. This is of significant importance in contemporary physics (both Newton's theory of gravity and Coulomb's law of electric forces, which in classical physics are non-local, have been replaced by field theories which are explicitly local). Does my operator violate locality? It affects the amplitude for the particle passing through slit B based on a measurement performed on slit A. No, because the amplitude represents (in part) our knowledge of what's going on at the microscopic level. Nothing is physically interacting with that particle. It is just that our knowledge of the situation has changed because we know that if the particle didn't pass through slit A, it must have passed through slit B. The principle of locality only applies to particle interactions, not our knowledge of the system.
So the question is right that measuring which slit the particle passes through changes the conditions of the amplitudes.
That's the orthodox quantum mechanical calculation, which (once one puts all the numbers in) agrees well with experiment. One would have thought that because it is fundamentally based on amplitudes rather than probabilities, that is evidence supporting my thesis that uncertainty in quantum physics is best represented by an amplitude rather than a probability. True, one can represent mixed states through a combination of probabilities and amplitudes. But as my notation here shows, one doesn't have to: an amplitude representation alone can do the same work, with a mixed state merely represented by a vector of amplitudes. However, one does not represent the pure state (used in the case where there is no which-way measurement) by probabilities (until the very end of the calculation, when one wants to compare against experiment): it is all amplitude.
Matthew was thus correct that the presence of the measurement changes the priors of the amplitude, even in the quantum mechanical calculation, and it is important. I was mistaken in my original post to exclude this. However, it doesn't make any difference to the conclusion, as I shall now show by considering the corresponding calculation when we use probability.
Can we apply the same reasoning if we say that our uncertainty is fundamentally parametrised by a probability? Can the difference in priors lead to the presence or lack of an interference pattern? We can easily enough add these additional priors to the expression for the probability. The set-up is the same: we measure whether or not the particle goes through slit A. Since this is a classical system, I will denote the measurement by an upper case M rather than a lower case m. In the case where there is no measurement, we have,
While when there is a measurement, we have
Probability is always additive. We are measuring whether or not the particle goes through slit A. There is no experimental equipment at slit B. If we assume locality, then that means that P(X|BMp) will be unaffected by our measurement, and will just be the same as P(X|Bp). Thus we have
Similarly, we know what P(X|Ap) is, since the case without the measurement will have a symmetry between the two slits and thus (to the level of approximation that I'm working at), each slit will contribute roughly half of the overall result of P(X|p).
So let us put in the answer. I will write d as the separation between the centre points of the slits, and b as the width of the slits. x indicates the position on the detector screen, and a is the distance between the slits and the screen. λ is the wavelength of the light, and I assume that there is equal probability of the light passing through either slit. If we neglect terms quadratic in d/a, b/a and x/a, then we find,
N0 and N1 are normalisation factors. In practice N0 is slightly less than twice N1, so this is a reasonable expression of a probability.
So in this sense, Matthew is right. From the sense of logic alone, one can solve this problem by introducing a conditional probability associated with the measurement that alters the distribution.
The problem is that this equation doesn't make any physical sense. This probability describes the effect of the measurement process on the distribution. It tells us how the measurement process changes the way the particle travels from slit A to the screen; for example how it changes the momentum distribution of the particle. It should thus only depend on the mechanics of what is happening at slit A. As far as the measuring device is concerned, that there is a slit B is an irrelevance. This is again the principle that interactions between particles are local. The equation above is only dependent on what is happening at slit A. Thus it ought to be independent of the distance between the slits. Which it isn't.
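To make the dependence on the slit separation concrete, here is a rough numerical sketch. It rests on assumptions which I should state plainly: I use the textbook far-field forms (a sinc-squared envelope for a single slit, and that envelope times a cosine-squared fringe term for two slits), I give each slit weight one half, I take the with-measurement pattern to be the plain single-slit envelope, and I combine the locality and symmetry steps above as P(X|AMp) = 2 P(X|Mp) - P(X|p). The numerical values and function names are purely illustrative.

```python
import numpy as np

# Illustrative values (assumed, SI units).
wavelength = 0.5e-6                  # wavelength of the light
a = 1.0                              # distance between the slits and the screen
b = 20e-6                            # width of each slit
x = np.linspace(-0.05, 0.05, 4001)   # position on the detector screen
dx = x[1] - x[0]

def normalise(f):
    """Scale f so that it sums to one over the screen (simple Riemann sum)."""
    return f / (f.sum() * dx)

def envelope():
    """Single-slit pattern; np.sinc(u) is sin(pi u) / (pi u)."""
    return np.sinc(b * x / (wavelength * a))**2

def p_no_measurement(d):
    """Two-slit pattern with fringes, for centre-to-centre slit separation d."""
    return normalise(envelope() * np.cos(np.pi * d * x / (wavelength * a))**2)

def p_with_measurement():
    """Pattern when the slit is recorded: no fringes, just the envelope."""
    return normalise(envelope())

def implied_p_slit_A_measured(d):
    """What P(X|AMp) would have to be if probabilities simply add (assumed relation)."""
    return 2 * p_with_measurement() - p_no_measurement(d)

q_near = implied_p_slit_A_measured(100e-6)
q_far = implied_p_slit_A_measured(200e-6)
print("largest change when d is doubled:", np.max(np.abs(q_near - q_far)))
```

The implied distribution for a particle that went through slit A changes when the distance to slit B is changed, even though nothing at slit A has been touched. That is the non-local dependence I am objecting to.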
What this means is that we cannot have a theory which satisfies all these criteria:
- The underlying physical interactions are local.
- The wavefunction represents (in part) a parametrisation of our uncertainty.
- The results are consistent with experiment.
- There is an underlying theory where uncertainty is parametrised by a classical probability.
Locality is strongly supported (this means that there is no interaction between particles at a distance). Firstly, all our current best theories -- the standard model of particle physics and general relativity -- are explicitly local. Secondly, there is a theorem which shows that a non-local interaction in the spatial wavefunction will necessarily lead to a singularity in its Fourier transform. In other words, the particle could suddenly get infinite momentum, which disagrees with experiment. The second of these assumptions is shared by both models. The third we obviously need. This means that the final of these four conditions must be violated. In other words, if we treat the wavefunction as a representation of our uncertainty, uncertainty in quantum physics is not represented by a probability, while the amplitude representation of uncertainty works well.
Of course, there are interpretations of quantum mechanics (such as Bohm's) which resolve this by violating locality. These interpretations are, however, difficult (and perhaps impossible) to reconcile with relativistic quantum field theory. My own interpretation, where the wavefunction (in part) represents a parametrisation of our uncertainty, is consistent with field theory.
Is field theory a form of quantum mechanics?
Here is the second comment:
Everything else I have read about quantum field theory says that the view you dismiss (that QFT is just QM updated to handle fields) is correct, and that the differences between QFT and QM you state are actually just non-essential differences in the way it is formulated, rather than deep differences rendering QFT into a different category from QM. Please allow me to elaborate.
"The objects modelled in QFT are not classical fields; nor are they classical particles; they are something wholly different."
Well, the objects modelled in QM are not classical particles, either: they are quantum particles. The objects modelled in QFT are quantum fields, and you can formulate QFT in such a way as to makes it clear that quantum fields relate to classical fields as quantum particles relate to classical particles. (I've seen this claim supported in a couple of different places, but most recently in Ch. 4 of Sebens' thesis here: https://deepblue.lib.umich.edu/handle/2027.42/111422 ) This is in fact clear from the way that QFT is often introduced by first considering a set of coupled oscillators representing a discretized version of a field; QFT is the limit of such a system for infinite degrees of freedom.
"Quantum mechanics removes some assumptions of mechanism, but retains some others. ... The most important of these is the assumption that the fundamental building blocks of matter are indestructible."
Taking seriously the analogy that QM:particles as QFT:fields, QFT also holds that the building blocks of matter are indestructible. The number and type of fields in the universe does not change (though of course they can interact in such a way as to disguise themselves, e.g. as we have in electroweak symmetry breaking).
But taking a different tack, we have obvious analogues of creation and annihilation events in QM in the possibility of transitions of, say, a quantum harmonic oscillator between different energy states. This is just what the creation and annihilation of particles is in QFT: the transitions of the field between different states of energy and momentum.
Moreover, if you set things up right you actually can describe genuine creation and annihilation of particles in QM; see the toy example in section 2 of this paper: https://arxiv.org/abs/1506.00497 (the method there is of course intended for QFT, but as the toy example shows it can be used for finite degrees of freedom as well).
"A differential equation, such as the Schroedinger equation, also cannot cope with discontinuous events such as particle creation or annihilation. The true theory of nature thus cannot be reduced to any differential equation."
I quote from Sean Carroll's blog: "Every quantum system obeys a version of the Schrödinger equation; it’s completely general. In particular, there’s no problem talking about relativistic systems or field theories." (http://www.preposterousuniverse.com/blog/2016/08/15/you-should-love-or-at-least-respect-the-schrodinger-equation/) But the way of constructing QFT mentioned above makes this clear.
"The time evolution operator ... acts on the state vector, but there is no deterministic evolution of the state at all. Rather, any "path" to get from A to B is possible."
This is just a difference in formulation. QM can be formulated using the path-integral approach, or using the Schroedinger picture of the evolution of the wavefunction, or the Heisenberg picture of the evolution of the operators. These are equivalent formulations all of which can be used in QFT as well.
This means that there is a deterministically evolving quantum state in QFT just as there is in QM: working in the path-integral formulation or the Heisenberg picture or the interaction picture, rather than the Schroedinger picture, just changes how this evolving state is represented; it doesn't make that state go away. And this state is still related to the world we observe in the same way as it is in QM, through the measurement postulate.
So the measurement problem is still there in QFT, and for that matter so are empirical correlations that violate Bell's inequality, demonstrating non-locality in quantum mechanics. (And the psi-epistemic view you adopt here remains challenged by the PBR theorem; it does not seem to me that QFT changes that situation either.)
QFT and QM certainly suggest differences in the fundamental ontology of physics (fields versus particles), but I think many of the philosophical issues surrounding quantum physics are not so affected by the transition from QM to QFT as you suggest.
Quite a long comment with a number of points.
I should say first of all that I do believe that from a pragmatic point of view philosophers of physics should skip over QM and concentrate on QFT. The simple reason for this is that QFT is the more fundamental theory. Indeed, it is the theory behind the most fundamental confirmed physics. It is also no harder to understand the basic principles (it is harder to perform calculations, but that's a different matter). So if there is indeed no significant difference between the philosophy of QFT and QM, then you don't lose anything by concentrating on QFT. If, however, there is a difference, then you risk going down a blind alley. Why spend so much time on the assumption that there is no difference, when you are not going to know if that assumption is true until you think about both theories?
Fields or particles?
So let us start by considering the roles that fields and particles play in quantum field theory. The usual picture is the following: you have various fields which spread across all of space time. This concept is not unknown to physicists.
In classical physics, we have electromagnetic and gravitational fields, which mediate the forces. These are delocalised and represented by continuous numbers. The strength of the field is not the same everywhere, but gradually changes across space. Field excitations can be created and destroyed, they can come into existence and out of it. You also have matter particles, which are the constituents of everything around us. These are discrete (you can only have an integer number of particles), and localised (they exist only at one point in space time). Classical particles are indestructible: they can never be created or destroyed.
One analogy (which like all analogies has its flaws) is individual boats (representing particles) on an ocean (representing fields). There is only ever an integer number of boats, which are also at specific points. The field is, however, spread out everywhere. It is not perfectly flat: there are waves.
Both fields and particles carry energy and momentum. Particles just do: their mass is a form of energy, and momentum is what you observe when you see that rest mass in a different frame of reference. There are also other types of energy which can be stored in particles. Fields can carry energy and momentum through waves. This is where you have an excitation in the field which moves through space. A perfectly flat field (amplitude zero everywhere) has zero energy. (I will just accept as a simplifying assumption that the zero of the energy scale is defined to represent a flat field; in some quantum applications it might get a bit more complicated). If you excite the field, that is give it some shape, so one part of it has amplitude greater than zero and another part amplitude less than zero, then that corresponds to a positive energy. You can then move this excitation across the field in time, and in that way the field can carry energy and momentum.
Generally speaking in classical physics, fields interact with particles and can self-interact, while particles can only interact with fields. Each interaction involves the transferral of energy and momentum. So if one particle wants to move another one, it does this by generating a wave or excitation of the field, and shooting that out to the next particle. When the wave hits that next particle, it transfers the energy to it, and we have new motion.
In a classical electromagnetic interaction, this is basically what happens, except that it happens automatically and continuously. Every charged particle continuously excites the electromagnetic field around it. That excitation then propagates through space. The field strength will decrease as it spreads out away from the charge (as described by Maxwell's equations). Eventually it hits another charged particle (which, of course, is also generating its own excitation of the electric field), and energy is transferred to that particle, setting it in motion (remember: motion means change of state, which includes acceleration in classical physics).
In quantum physics, the notions of fields and particles are merged together. I'll leave gravity to one side, since that still isn't well understood. The things that we thought were particles in classical physics have become Fermions -- electrons, neutrinos and quarks. The things that we thought were fields in classical physics are Bosons: photons, carrying the electromagnetic interaction, and (unknown to the classical physicists) the weak carriers, gluons and Higgs. There is still a distinction. Fermions are spin 1/2 particles, and obey Fermi-Dirac statistics (i.e. you can't have two electrons with the same quantum numbers in the same place). The Bosons have integer spin (Higgs spin 0, the others spin 1), and obey Bose-Einstein statistics (you can fit as many of them as you like in the same place). Fermion fields have distinct anti-particles; gauge fields are mostly their own anti-particle, the W-Bosons being the exception. (I am being a little bit sloppy in my language here. Technically, the Higgs field is one of four components of the spin-0 standard model field rather than the field itself; I'll continue to write it as the Higgs field in this article since the standard model scalar field is a bit too much of a mouthful.) Fermion fields additionally come in chiral pairs (which we somewhat unfortunately call left and right-handed), while the Bosons don't (although the weak interaction fields only interact with left handed fermions).
One further important difference is gauge symmetry. Gauge symmetry is an important symmetry of the standard model. Basically, each fermion comes with a complex phase. You can think of this as a compass needle pointing in one direction (the analogy works very well for electromagnetism; less so for the weak and strong nuclear forces but the same underlying principle is still involved). But the direction the needle is pointing cannot be measured. We don't and can't know which way is North. What we can observe are changes or differences in this direction. When performing any calculation, we have to artificially define one direction as North. Once we have fixed the gauge in this fashion, then we can calculate.
But it is also worth comparing what happens when we choose a different gauge fixing. The mathematical process changing from one choice to another is known as a gauge transformation. It's basically just a rotation of the compass needle. We need not have the same gauge at each point in space, so we can rotate the needle by different amounts at different places. This is known as a local gauge transformation. The action, which determines pretty much everything in physics, is invariant under local gauge transformations. That is of significant importance to the physicist, in part because that assumption greatly restricts the possible laws that could describe a physical system.
Under a local gauge transformation, the representations of the fermion and Higgs field are changed by multiplying by a particular factor, and the representation of the spin 1 gauge Bosons is changed by adding something to them. And as long as we do this in a consistent way, everything, the action and every prediction made by the theory to be compared against experiment, remains the same after a local gauge transformation.
There are two conclusions we can draw from this. Firstly, our mathematical representation of the fundamental physical particles contains some redundancies. We are not perfectly representing the physical particles; we are adding stuff that doesn't need to be there. For example, three of the four components of the scalar particle in the standard model are seemingly redundant. They don't correspond to anything in real life. Only the fourth component can be observed -- the Higgs Boson recently found at CERN. Secondly, one can't observe a bare photon field, because a bare photon field is not gauge invariant, and (worse) it transforms additively under a gauge transformation. For a fermion field, every calculation of an observable or probability involves multiplying the operator with its complex conjugate, and the phases of each of them cancel. But that doesn't work with gauge fields. One can construct objects from gauge fields and the differential operator which are gauge invariant (such as the field strength tensor), or composites of photons and fermion fields, but not photons themselves. What about all that light hitting our eyes? Aren't we seeing photons then? No. What we are seeing is the object that generated or scattered the photon. The energy is transferred from that object to our retina by the photon, but we observe the source of the light.
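The multiplicative and additive transformation rules can be checked in a toy model. What follows is only a sketch under assumptions of my own: a one-dimensional periodic lattice, a single complex number per site standing in for a charged field, a real number per link standing in for the photon field, and an electromagnetism-like (U(1)) rotation. None of the names below come from the standard model itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16   # number of sites on a periodic one-dimensional toy lattice

psi = rng.normal(size=n) + 1j * rng.normal(size=n)   # charged field, one value per site
A = rng.normal(size=n)                               # gauge field on the link from x to x+1
alpha = rng.uniform(0.0, 2 * np.pi, size=n)          # a different rotation angle at each site

def gauge_transform(psi, A, alpha):
    """Local U(1) transformation: multiply the charged field, add to the gauge field."""
    psi_new = np.exp(1j * alpha) * psi
    A_new = A + alpha - np.roll(alpha, -1)   # np.roll(alpha, -1)[x] is alpha[x+1]
    return psi_new, A_new

def density(psi):
    """Field times its complex conjugate: the phases cancel."""
    return np.abs(psi)**2

def hopping(psi, A):
    """Composite of charged field and gauge field: psi*(x) exp(iA(x)) psi(x+1)."""
    return np.conj(psi) * np.exp(1j * A) * np.roll(psi, -1)

psi2, A2 = gauge_transform(psi, A, alpha)

print("density unchanged:        ", np.allclose(density(psi), density(psi2)))
print("composite unchanged:      ", np.allclose(hopping(psi, A), hopping(psi2, A2)))
print("bare gauge field unchanged:", np.allclose(A, A2))
print("closed loop sum unchanged:", np.isclose(A.sum(), A2.sum()))
```

The bare gauge field changes from one choice of gauge to the next, while the density, the composite of the charged field with the gauge field, and the sum of the gauge field around a closed loop do not. Only the latter sort of object can correspond to something observable.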
But apart from that, the treatment of "particles" and "fields" in QFT is very similar. At a very broad level, both photons and electrons are, rather than being two very different types of thing as in the classical picture, both the same type of thing which is something of a hybrid of the classical particle and the classical field.
One straightforward way of visualising it is to start with the classical field picture, and then bring in various properties of classical particles. Instead of saying that any excitation is possible, only certain fixed discrete lumps of energy are allowed. So we take the delocalised nature of classical fields, and combine it with the discreteness of classical particles. The excitations of the fields are known as particles as a shorthand, while the fields themselves are known as fields.
An alternative is to start from the classical particle picture as the basis, and then bring in certain features of classical fields. These particles are no longer indestructible. Like the original particles of classical mechanics (before fields were introduced), they only interact through direct contact. However, our knowledge of the particles is limited. Instead of expressing our uncertainty through a probability (related to counting frequency; natural for classical particles) we do so using an amplitude representation (more natural for classical fields); which leads to interference effects more common with fields.
Thus there are two different ways of interpreting QFT. The first is to say that the fundamental object of quantum physics is like a classical field but with some particle like properties. Particles are just an excitation of the field, and thus in some sense just a part of the wider field. This is the approach which Matthew adopts, and which I described in the previous but one paragraph. The second approach is to say that the fundamental objects of quantum physics are like classical particles, but with some field like properties. The theory is discussed solely in terms of generation and corruption of particles, and particle interactions. The word "field" is barely mentioned at all. This is how I usually approach the topic myself, and I described it in the previous paragraph. Note that this second particle only approach is not available to QM because in QM electromagnetic interactions are described by a classical field: there is no particle interpretation available.
I think that these two approaches are related to whether the wavefunction is ontological (i.e. a representation of the physical thing) or epistemic (a representation of our knowledge of the physical thing). In the field-picture, the field is the real object, and the wavefunction is a direct representation of the amplitude of the field at any given point. In the particle picture, the wavefunction is primarily a statement of our knowledge of the particle's location. The particle is in reality in one particular location, but we don't know precisely where that is. We can, however, represent how likely it is to be at any given point, given what we do know (such as the initial conditions), in terms of a wavefunction.
My own approach is something of a hybrid between the ontological and epistemic interpretations of the wavefunction. For some observables (such as location) I treat the wavefunction as wholly epistemic. For others (such as spin or polarisation states) I believe that the ontological representation of the particle's internal degrees of freedom needs to be a wavefunction; but we have our own uncertainty stuck on that. Thus the wavefunction here contains both ontic and epistemic information. (There is, however, a sense in which our representation doesn't do justice to reality. Our representation is stuck in a coordinate system. We take one particular direction and declare it to be the x-axis, and represent spin in that direction by a particular matrix. The choice of which direction to relate to which matrix is entirely arbitrary. But we can't visualise the fermion mathematically without making that choice. That choice of representation is not present in nature. We have added additional information in our representation; and that means that the way information about a particle's spin is "stored" internally within the physical particle isn't something we can easily express mathematically. So the ontic aspect of the wavefunction isn't a precise representation of the physical data.)
These two interpretations lead to the same mathematics. There is no difference in the predictions made by them, when each is formulated properly. So why might we view the particle picture as superior to the field picture?
- We only ever observe particles, and never fields. In particular, we never observe a superposition. This is easily explained if particles are fundamental. If fields are the fundamental object, then this is surprising. The usual explanation involves decoherence. So far, I have been discussing individual isolated particles. However, when a particle interacts with an environment containing numerous other particles, its quantumness (as it were) disappears. This is precisely what happens when we take a measurement. Thus the very act of taking a measurement effectively turns the field excitation into a classical particle. Thus all we observe are classical particles.
But this excuse doesn't quite convince me. There is more to wavefunction collapse than just decoherence. For example, one can make measurements without actually interacting with the particle, or forcing it into a classical state (for example by looking at an entangled partner). We still only ever observe particles.
- The theory is built up from creation and annihilation operators for particles. We could call these field excitations rather than particles, but nonetheless they are the objects which play the crucial role in the theory. They are the objects which convey energy and momentum. It is the interaction between these objects which we study. Fields on their own, apart from the excitations, play no necessary role in the theory. It is quite possible to construct a particle alone model of Quantum Field Theory, with no mention of fields (perhaps paradoxically given the name). However, if one adopts a field picture, one needs to discuss both the base fields plus the excitations. In other words, fields are ontologically redundant, while particles are not.
- The mathematical representation of quantum gauge fields is ill-defined because of gauge symmetry. In physics, we start with a physical object, and then create a one to one mapping between that object and a mathematical abstraction. We then perform calculations in the abstract representation, and map back to the physical or real world. As part of that construction of the mathematical representation, we introduce certain redundancies, such as a coordinate system to represent space-time. The theories are formulated so that these artificially introduced abstractions do not affect the final result when mapping back to space time.
The gauge is one of these artificial constructs, present in the representation but not reality. Every observable (related to particles) is gauge invariant, that is it doesn't matter which gauge you choose, you get the same result. However the symbols which are used to represent fields are gauge dependent. It is not possible to express them in a gauge independent way. It is possible to choose a gauge where all the fields in one particular direction are zero. On the other hand, in the particle picture, the operators used to describe particle creation and annihilation are present no matter which gauge you use. The act of creating a photon particle occurs regardless of gauge. The photon field excitation is gauge dependent, and thus unphysical.
- In the language of the field picture, excitations of one field disappear and excitations of another field appear in their place, for example when a photon decays into an electron-positron pair. So excitation number is not conserved in an individual field. And it is these sorts of events which are of primary interest to physicists. We are not interested in the underlying field, but only in the excitations, which are, of course, quantum particles.
- Metaphysically, the fields can be thought of as objects of pure potentiality. They do not play a role unless actualised/excited. They are thus a representation of the philosophical concept of prime matter. Form is (in part) the abstract description of the possible energy states of the particles. Aside from form "existing" in an intellect or in non-material beings (which are not the cases we are interested in), neither matter nor form can exist on its own. It is their union which forms an actual or existent substance. That substance is the object of study of physics, and that union is a quantum particle. So even in the field picture, the fields are not a substance, while particles are.
The difference between field theory and QM
So now we come to consider the difference between field theory and quantum mechanics. My claim is that this difference leads to important implications for philosophy. Matthew's claim is that, aside from the basic stuff of the universe being delocalised fields rather than localised particles, it doesn't. I agree with Matthew that this distinction, were it the only difference, would not be that significant.
The naive explanation is that field theory is a description of fields and quantum mechanics is a description of particles. But, as I outlined above, things are not that simple. Firstly, quantum field theory (despite the name) can be expressed as a theory of particles. Secondly, quantum mechanics (despite the name) contains fields. In quantum mechanics, the electromagnetic force is described by a classical field. One can also still think of the wavefunction as a field excitation. So this way of putting the distinction is too naive.
Indeed, it is only in QFT that a particle only description of nature is possible (since both QM and classical physics make use of classical fields). There may well be other interpretations of QFT. But if you think of the philosophy of QM and extrapolate that to QFT, you are going to miss the possible particle-only interpretations. As Matthew did, you are going to think "fields in QFT are what particles are in QM" and go to a field interpretation of QFT. But a particle interpretation of QFT is also possible; and extrapolating your QM interpretation to QFT is going to miss that.
The paper cited by Matthew is this one (https://deepblue.lib.umich.edu/handle/2027.42/111422).
I have a few technical issues with the paper. His construction of a relativistic Hamiltonian is not local, and I struggle to see how he could use his approach to quantise the gauge field. For example, he takes the square root of the Klein-Gordon spatial term to get a Hamiltonian operator. (We are trying to get a single power of the energy E from the equation $E^2 = (pc)^2 + m^2c^4$.) This is certainly non-local, and will cause problems. It is more usual to divide by the energy ($E = ((pc)^2 + m^2c^4)/E$), or more precisely normalise the field operators by the square root of the energy. Equally, I am not sure that his method will generalise to an interacting theory. He is treating an operator equation as though it were an algebraic equation. Multiplying exponentials of operators is not as simple as multiplying exponentials of numbers; and I didn't see that he accounted for this. His operators all commute with each other, so he gets away with it, but if he tried to extend his method to problems mixing Fermion and Boson fields, he wouldn't.
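One way to see the non-locality is to expand the square root in powers of the momentum. The following is only a sketch for a free particle, assuming the standard replacement of the momentum by $-i\hbar\nabla$ (my notation, not necessarily the paper's):

```latex
\begin{align}
\hat{H} &= \sqrt{m^2c^4 - \hbar^2c^2\nabla^2} \\
        &= mc^2 - \frac{\hbar^2}{2m}\nabla^2 - \frac{\hbar^4}{8m^3c^2}\nabla^4 + \dots
\end{align}
```

The series contains derivatives of every even order. An operator with infinitely many derivatives cannot be evaluated from the wavefunction in an arbitrarily small neighbourhood of a point, which is the usual sense in which it is called non-local.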
Section 4.6 of his work discusses QFT as a generalisation of classical field theory. He defines the field as a function describing the instantaneous state of a system of particles. He has a function which ascribes a real number to each point in space (this would provide us with a frequency distribution). For the quantum system, this number is complex (an amplitude). He discusses the lattice version of the field theory, which is a little more mathematically well-defined than the continuum version. He defines his space as one which contains all possible field configurations. His definition of the functional derivative and identity operator look sound.
Next he presents his Schroedinger equation describing the evolution of the fields. This looks quite similar to the Schroedinger Equation for quantum mechanics. Except there are important differences, which I will come to later.
His derivation from the SE to the path integral looks standard and sound, except for the first step, where he jumps from the SE to the exponential form of the solution, without (at this point) any time ordered integral over time. As before, this is the point in the argument where I have difficulties.
Now for the differences in the SE. In quantum mechanics, the objects within the Hamiltonian are algebraic and differential functions, and in particular interactions imply a non-local addition to the Hamiltonian. In this equation, the objects are creation and annihilation operators, and all interactions are explicitly local. That difference implies new features of QFT that are not found in QM: 1) the conservation of energy and momentum (which is tied to the locality of the interactions); 2) interactions imply a discrete jump in the field excitations, while in QM all changes to the wavefunction are continuous; 3) interactions involve the creation or annihilation of matter, while in QM they don't; 4) gauge fields are treated classically in QM (i.e. delocalised, continuous and algebraic) while in QFT they are treated as quantum particles (i.e. local excitations, discrete and operators). The gauge field operator can be written as an eigenstate of the location operator, just like the fermion operator. Thus if we could observe gauge fields directly (we can't because of the way gauge symmetry works), just like fermions, we would always find them in a single place at a given time, not spread out over the universe.
And this is where the most interesting philosophical details arise. None of these features are found in either classical field theory or quantum mechanics. It is often said (in some sense correctly) that energy and momentum are conserved in classical field theory, but we have to be a bit careful, because different definitions of the terms are used in each case. In classical physics, the conserved quantities are contained within the stress energy tensor, derived via Noether's theorem. But in quantum physics, energy and momentum are defined in terms of the eigenvalues of the time and space differential operators; a definition that has no direct analogue in classical physics.
In any case, the paper reduces the distinction between QM and QFT to two things:
- The presence of creation and annihilation operators in field theory.
- Field theory is relativistic.
I would challenge that last point. Not because there are relativistic formulations of quantum mechanics (one can argue that Dirac quantum mechanics is not self-consistent due to the Klein paradox and other factors), but because one can formulate non-relativistic field theory. This plays a crucial role in contemporary condensed matter physics (the theory describing solids). Admittedly, non-relativistic field theory is usually derived from taking the infinite speed of light limit of relativistic field theory, but the point remains: it is not necessary to invoke relativity to have quantum field theory. Non-relativistic field theory is definitely not Schroedinger quantum mechanics. But it gives more accurate results.
But the first of these two distinctions is certainly valid. QFT makes use of creation and annihilation operators to describe particle interactions. QM doesn't. Matthew of course anticipated this response by saying that:
- Some formulations of quantum mechanics use creation and annihilation operators, citing https://arxiv.org/abs/1506.00497.
- Raising and lowering operators used in quantum mechanics, for example, to describe different angular momentum states or (more pertinently) the quantum mechanical simple harmonic oscillator, have a close affinity to the creation and annihilation operators of quantum field theory.
With regards to the first point, I would disagree. The cited paper is a quantum field theory paper. Every standard approach to quantum mechanics conserves particle number. If it uses creation and annihilation operators to describe interactions, then it is no longer quantum mechanics, as originally formulated by Schroedinger, Heisenberg and Dirac. (And when I speak of Quantum mechanics, I mean a theory equivalent to Schroedinger's wave mechanics, or Heisenberg's matrix mechanics.)
The toy model in https://arxiv.org/abs/1506.00497 refers to a rather strange space. In section 2 (which doesn't discuss particle creation and annihilation), you have a two-dimensional space with a boundary (for example forbidding values for y < 0) plus a one-dimensional space (since I need a name to give this extra dimension, I will call it hyperspace). The Hamiltonians are coupled so that probability can flow from the two-dimensional space into the separate one-dimensional space. Section 3 (which is claimed to have particle creation and annihilation) uses a three-dimensional space coupled to a point-like hyperspace, which is coupled to the origin of the three-dimensional space, which is assumed to be the location of a static particle. The idea is that probability is still conserved, but can flow from the three-dimensional space (representing our universe) to the hyperspace and back again. So, in the context of the space alone, particles appear to be appearing and disappearing. In the context of the entire space and hyperspace system, you still have conservation of probability and no creation or destruction of matter. It is difficult to see how this can account for the full effect, particularly given the exclusion principle limiting how many fermions can escape into the point-like hyperspace. And it isn't creation and annihilation in the sense of QFT. The paper itself is restricted to non-relativistic QFT. I note that since then, they have extended to the Dirac equation, but it remains to be seen whether they will be able to extend to full QFT.
With regards to raising and lowering operators, I would agree that there is a close mathematical similarity between them and the creation and annihilation operators of field theory. Indeed, in my book, I use the quantum mechanical harmonic oscillator as a toy model on my way to describing field theory. But there is an important difference. Raising and lowering operators in quantum mechanics are used to map out the possible energy (and angular momentum) states of the system. They do not describe particle interactions. They do not describe a particle moving from one energy state to another. Neither do they actually involve the creation or destruction of any particle. In every interaction, particle number is conserved in quantum mechanics while it isn't in field theory. So while sharing certain similarities to the objects of field theory (and historically, of course, this formulation of QM helped inspire field theory), they do not play the same role as creation and annihilation operators in field theory.
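To make the role of the raising and lowering operators concrete, here is a minimal numerical sketch of a single quantum mechanical harmonic oscillator. The function name `ladder_operators`, the truncation to six levels and the choice of units (hbar times omega equal to 1) are all assumptions made for illustration.

```python
import numpy as np

def ladder_operators(n_levels):
    """Return truncated lowering and raising matrices in the energy basis."""
    # a|n> = sqrt(n)|n-1>, so sqrt(1), ..., sqrt(n_levels-1) sit on the
    # first superdiagonal of the lowering operator.
    a = np.diag(np.sqrt(np.arange(1, n_levels)), k=1)
    return a, a.conj().T

n_levels = 6  # hypothetical truncation of the infinite ladder of levels
a, a_dag = ladder_operators(n_levels)

# Hamiltonian of ONE oscillator, in units where hbar * omega = 1.
H = a_dag @ a + 0.5 * np.eye(n_levels)
energies = np.diag(H)

# Acting with the raising operator on the ground state gives the first
# excited state of the same system.
ground = np.zeros(n_levels)
ground[0] = 1.0
raised = a_dag @ ground

print(energies)
print(raised)
```

The state vector has the same length before and after the raising operator acts: the operator moves us along the ladder of energy levels of one system, and the matrices carry no label saying how many particles are present.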
So I would argue that the use of creation and annihilation operators to describe every particle interaction is one of the key distinguishing features between quantum mechanics and field theory. Is it the only one?
I would say not. One other important difference is related to the gauge fields, such as the electromagnetic interaction. In quantum mechanics, as I discussed above, these are described by a classical field. In quantum field theory, we replace this with another creation operator. This operator creates Bosonic particles (or field excitations) at a particular point in space-time. They are localised. So if we define a field as something delocalised and a particle as something localised, then one key difference between QM and QFT is that in QM fermions interact via classical fields, while in QFT they interact via quantum particles.
Another consequence of this is that in QFT all interactions are local, while in QM interactions can be non-local. The significance of this is related to the conservation of momentum. In classical field theory, the proof of conservation of momentum is derived, via Noether's theorem, from translation symmetry. But this proof relies on the classical equations of motion, which are no longer valid once we come to quantum physics, at least not until we consider expectation values of large ensembles of particles. As I stated above, another complication is that a different definition of energy and momentum is used in quantum physics. So Noether's theorem, even if it worked, would not lead to the conservation of the quantum mechanical definition of momentum. In quantum mechanics, observables are conserved if the operators representing them commute with the Hamiltonian (this follows directly from the Schroedinger equation). Such operators also define symmetries of the system. However, for a particle interacting with an external field, the momentum operator does not commute with the Hamiltonian. One can modify the momentum operator into something that does commute with the Hamiltonian, but then you are not measuring momentum. Conservation is once again associated with symmetry, but it is not clear that momentum is always conserved.
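For a single particle in one dimension the non-commutation can be written out explicitly. This is a standard textbook sketch, with a hypothetical potential V(x) standing in for the external field:

```latex
\begin{align}
\hat{H} &= \frac{\hat{p}^2}{2m} + V(\hat{x}), \qquad \hat{p} = -i\hbar\frac{\partial}{\partial x} \\
[\hat{p}, \hat{H}]\,\psi &= -i\hbar\frac{\partial}{\partial x}\bigl(V\psi\bigr) + i\hbar V\frac{\partial\psi}{\partial x}
  = -i\hbar\frac{dV}{dx}\,\psi \\
\frac{d\langle\hat{p}\rangle}{dt} &= \frac{i}{\hbar}\langle[\hat{H},\hat{p}]\rangle
  = -\left\langle\frac{dV}{dx}\right\rangle
\end{align}
```

The commutator vanishes only when the potential is constant in space, so the expectation value of the particle's momentum changes whenever the external field exerts a force on it.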
However, in quantum field theory, it is locality rather than symmetry that leads to the conservation of momentum. Momentum is conserved in QFT but not in QM.
So I would say that the three key differences between QFT and QM are:
- The replacement of (algebraic) wavefunctions with creation and annihilation operators (which have different commutation properties). Operators are an altogether different thing than functions.
- Electromagnetic and other interaction fields are treated classically in QM, while in QFT they are replaced by creation and annihilation operators.
- Evolution in time is given by the time ordered exponential of the Hamiltonian operator. In QM this is mathematically equivalent to a simple differential equation. In QFT, because the key objects don't commute, it isn't. So in QM you can either use the path integral approach or the "solve the differential equation" approach. In QFT, there are also two approaches: path integrals, and canonical quantisation. These are mathematically equivalent, but canonical quantisation is not the same as solving a differential equation.
Of course, the key question is do these differences have any effect on the philosophy. And I would suggest that they do.
Firstly, a key postulate of the mechanical philosophy is that the fundamental particles are indestructible. It is one of the few postulates that survives the transition from classical to quantum mechanics. That postulate is obviously undermined by QFT. Now, one could argue, as Matthew does, that the underlying objects of QFT are fields rather than particles, and that these fields are not created or destroyed. I have argued above that that is not necessarily correct, and we can (and perhaps should) still think of QFT as a theory of particles. But suppose that Matthew is right. That also has huge philosophical implications. Usually we think of each electron as its own independent thing. If this interpretation is right, they are not: they are different parts of the same thing. We usually think of each human person as his or her own independent thing. But if we are just different excitations of the same quantum fields, then we are not just fundamentally linked together, but part of the same thing. Thus while QM supports an atomist perception of reality (where each object has its own independent existence), QFT would (if this interpretation were correct) support a form of existential monism (where all objects are aspects of a single substance). Of course, the particle interpretation of QFT would allow for creation and destruction of matter. So whichever interpretation you make, there is a difference between the philosophy of QM and QFT.
QFT allows for a particle-only interpretation, with solely local interactions, while QM doesn't.
Equally, quantum field theory makes it clear that nominalism and conceptualism are untenable (in favour of realism). Possibly you can also see this in QM -- which still has the exclusion principle. But you could still formulate QM so that each individual particle is different, and there are no universals in nature. In QFT this is impossible: every electron is interchangeable with any other electron in the universe.
But my main focus has always been on Aristotle's four causes. The mechanistic philosophy, of course, in effect denies all of them, at least as classically conceived, although there are distortions of the efficient and material cause. Quantum mechanics re-introduces form. The energy levels of the Hamiltonian play the same role as form does in Aristotle's physics. Notions of actual and potential states also arise naturally in quantum mechanics. Wolfgang Smith has done important work along these lines, and I have also argued for this at length elsewhere.
Material causation is, I think, obvious. If physical substances can in part be represented abstractly, but not fully (no abstract representation of an electron is an electron), then there must be something more to the being than just what is captured by the abstract representation. Thus we have the idea of material cause, and I don't think that one needs even quantum physics to see this.
Final and efficient causation are, however, something different. We can think of efficient causality in terms of both substances and states. I will discuss substantial causes first. The efficient cause of a substance is whatever substance it emerged from (a unique object or set of objects). The final causes are whatever substances it might decay into (and there might be many possible sets of decays). This, of course, presupposes that particles are not indestructible, which is a key difference between quantum mechanics and field theory. Thus quantum field theory allows for efficient causation of substances, while quantum mechanics doesn't. Quantum field theory is also consistent with substantial final causality, while quantum mechanics isn't (or at least no more than classical physics).
So what about changes in state (and efficient causes in a state)? This is allowed in both quantum and classical mechanics, so we could think of the efficient cause as the state before the change, and the final cause of the substance as the state after its next change. We also have changes of state in field theory. In QM, we start with a particle described by a wavefunction in a given state or superposition of states. The states are related to the possible measured values of the energy and those other observables which can be measured simultaneously with the energy (eigenstates of the Energy or Hamiltonian operator, plus any operators which commute with the Hamiltonian). The Hamiltonian need not be constant in time, so even if a system starts in an exact energy state, it need not stay in one if the operator describing the energy changes. The basis of possible initial states is thus determined from the Hamiltonian at the initial time. The basis of possible final states is determined from the Hamiltonian at the final time.
So we start with an initial state, and want to calculate the likelihood that we will arrive at some final state at a later time. The states are described by a wavefunction. This wavefunction can be written as a vector, where each component of the vector is associated with one possible state, and its value is the amplitude that the particle will be found in that state. When we take a measurement, and convert the amplitudes to probabilities, we destroy most of this information. So I will consider wavefunction evolution only.
Let us say that H is the Hamiltonian describing the system during the initial and final states, and H + H0 is the evolution Hamiltonian, valid between those two times. We can either calculate the evolution of the wavefunction by solving a differential equation, or by rotating the initial state into the basis defined by the evolution Hamiltonian. Each eigenstate of that Hamiltonian evolves in a simple way, according to the complex exponential of the energy times the time. We then rotate back to the basis of the eigenstates of the original Hamiltonian, and that gives us our final wavefunction. If H0 is time dependent, then we can break up the evolution into small time steps, and perform this procedure at each time step. This ultimately leads to the path integral formulation of quantum mechanics. But whichever way we calculate the evolution of the wavefunction, it is deterministic and continuous. In quantum mechanics, the only indeterminate step (certainly if we adopt the Copenhagen model) is at the moment of measurement, where the system will jump from a superposition into a single state; with the probability of each state given by the square of the amplitude.
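The rotate, evolve and rotate back procedure can be sketched numerically. The two-level matrices below are hypothetical values chosen only for illustration, the function name `evolve` is mine, and hbar is set to 1.

```python
import numpy as np

def evolve(hamiltonian, state, time, hbar=1.0):
    """Evolve a state under a time-independent Hermitian Hamiltonian."""
    # Columns of `vectors` are the eigenstates of the evolution Hamiltonian.
    energies, vectors = np.linalg.eigh(hamiltonian)
    # Rotate the state into the eigenbasis of the evolution Hamiltonian.
    coefficients = vectors.conj().T @ state
    # Each eigenstate picks up the phase exp(-i E t / hbar).
    coefficients = np.exp(-1j * energies * time / hbar) * coefficients
    # Rotate back to the original basis.
    return vectors @ coefficients

# Hypothetical two-level system: H is diagonal, H0 couples the two levels.
H = np.array([[0.0, 0.0], [0.0, 1.0]])
H0 = np.array([[0.0, 0.3], [0.3, 0.0]])
initial = np.array([1.0, 0.0], dtype=complex)

# One step over the whole interval.
final = evolve(H + H0, initial, time=2.0)
probabilities = np.abs(final) ** 2

# The same interval broken into many small time steps.
n_steps = 200
stepped = initial
for _ in range(n_steps):
    stepped = evolve(H + H0, stepped, time=2.0 / n_steps)

print(probabilities, probabilities.sum())
```

Here H0 is constant, so the stepped and single-step results agree. For a time dependent H0 one would rebuild the Hamiltonian at each step. Either way, the same input always gives the same output, and the amplitudes change smoothly as the time is varied.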
So given all this, what do we ascribe to the efficient cause of the change in state? It is not a question that is usually asked, and with good reason because the answer is not clear. You have the initial particle, but that in itself is not sufficient to explain the final state. There is the process of measurement, but a process is not a substance and therefore can't be an efficient cause, and it can't explain the change in the wavefunction which is what we are most interested in. Then there is the interaction term in H0, which in practice drives the change. That might be driven by an electromagnetic field (in a realistic simulation it will be); but that is a classical field, and not the substance that we need to be an efficient cause.
Also the deterministic nature of the evolution undermines final causality in the same way that it is undermined in classical mechanics. You start with an electron wavefunction. You end with an electron wavefunction. There is only one possible outcome. Equally, as in classical mechanics, all particles continually generate an electromagnetic field, so there is the question of which is the cause and which is the effect. You can discuss efficient causality related to changes in state in QM, but, like classical mechanics, it is messy, and not wholly in line with Aristotle's original description of the efficient causes.
But now consider quantum field theory. Once again, we start with an initial state, which is again expressed in terms of the eigenstates of a given Hamiltonian. We end with a final state. But the evolution in the middle is calculated in a very different way. At least in perturbation theory for electrodynamics, this is done by writing down every possible sequence of photon decay and reabsorption (one path), calculating the amplitude for each one, and adding up all the amplitudes that lead to the same final state. (Of course, in practice we have to also worry about renormalisation and the topology of the vacuum; but I will leave that to one side here.) Each path represents one possible route from initial to final state. One of those paths occurs in practice. The others are only possibilities.
Even though we are discussing the path integral formulation in each case, there are several differences between QFT and QM.
Firstly, in QM, the changes in the wavefunction are continuous. In QFT, they are discontinuous. Every emission or absorption of a photon changes the energy of the electron discontinuously. This means that evolution of states in QFT cannot be described by a differential equation, because, unless there is an infinity floating around somewhere in the equation, differential equations always imply a continuous evolution.
Secondly, the indeterminacy is in the evolution, not just at the measurement stage. At any point an electron can emit a photon, and it is impossible to predict when this will be.
Thirdly, there is a clear notion of efficient causality, for states as well as particles. If an electron state emerged after a photon was absorbed by an electron, the efficient causes of that state are the preceding photon and electron in whatever states they were in. There will always be a definite sequence of efficient causes (even if we can't know what that sequence is).
Fourthly, the notion of final causality emerges again distinct from the efficient cause. Every particle has a range of possible decays; which is precisely what final causality describes.
The re-emergence of efficient and final causality is of considerable philosophical significance. For example, it puts the cosmological and teleological (by which I do not mean the argument from design) arguments back on the table. The indeterminacy of evolution in QFT (rather than just measurement) is also of significance in debates over free will in the philosophy of mind. The philosophy of physics is also affected. The last vestiges of mechanism are swept away. An Aristotelian philosophy of physics is back on the table, which it wasn't so much for QM, particularly with its difficulties with efficient and final causality.
I have written some disparaging things about the Schroedinger equation, and Matthew rightly calls me out on this. There is a sense in which it is retained in QFT. But the role it plays in QM and QFT is different, and there are differences in its construction. In QM, it is a simple functional differential equation. In QFT, the time evolution is described by an operator equation. We can call them both "The Schroedinger equation" if we like, but that disguises the differences between the two cases. My worry is that philosophers who discuss the Schroedinger equation might take things from how it is used in QM and apply them directly to QFT.
So first of all, what is the Schroedinger equation? In most undergraduate textbooks, you will find this called the Schroedinger equation,
$$i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2 \psi + V(x)\psi,$$
where $\psi$ represents the wavefunction, and V(x) the potential. This is usually placed in contrast to the relativistic Dirac and Klein-Gordon equations. This is of no use to us. It is not even relativistic. This equation has no role to play in quantum field theory. I provide this equation to highlight that there is some ambiguity over what is meant by the Schroedinger equation.
Matthew cites Sagan, who uses the following definition of the Schroedinger equation,
$$i\hbar \frac{\partial}{\partial t}|\psi\rangle = \hat{H}|\psi\rangle,$$
where $\hat{H}$ represents the Hamiltonian (or energy) operator. Now that is more like it. This is basically the statement that the energy operator is also the time evolution operator. There is still, however, an issue here. In quantum mechanics, $|\psi\rangle$ represents a wavefunction, a single object (or perhaps a vector of objects). In field theory, it represents a Fock state, which counts how many excitations we have in all the fields together. It does not represent the field itself. In QM, the Hamiltonian operator is an algebraic and differential function. In QFT it is constructed additionally from creation and annihilation operators, mixing together operators from different fields. Granted we can re-write the QM Hamiltonian in terms of raising and lowering operators, but those are still algebraic and differential functions. These differences are enough to ensure that any discussion of the QM Schroedinger equation need not necessarily carry forward to QFT; or there might be features in the QFT version of the equation not present in QM. This could be dangerous for a philosopher of physics who understands QM but not QFT.
But the biggest problem is that this is not the evolution equation of QFT. In QFT, we use,
$$|\psi(t + \delta t)\rangle = e^{-i\hat{H}\delta t/\hbar}|\psi(t)\rangle.$$
Now you might just say that I am grasping at straws here; this is just the solution of the Schroedinger equation as written above. Expand the exponential in a Taylor series, neglect higher order terms, rearrange and take the limit as the time-step goes to zero, and you recover the usual form of the Schroedinger equation. However, this is only valid if the Hamiltonian operator is continuous; or rather if it only implies continuous changes in the Fock state. As stated above, this is not the case when it is constructed from creation and annihilation operators. We can't ignore the higher order powers in the Hamiltonian: this is where QFT gives different predictions to QM (loop effects). The differential form of the equation (the Schroedinger equation) does not allow loop diagrams within the same time-step. The exponential form does.
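The expansion described above can be written out as follows. This is a sketch which assumes the exponential form with a single time step $\delta t$ and a Hamiltonian that does not change during that step:

```latex
\begin{align}
|\psi(t+\delta t)\rangle
  &= e^{-i\hat{H}\delta t/\hbar}\,|\psi(t)\rangle \\
  &= \left(1 - \frac{i\,\delta t}{\hbar}\hat{H}
       - \frac{\delta t^2}{2\hbar^2}\hat{H}^2 + \dots\right)|\psi(t)\rangle \\
i\hbar\,\frac{|\psi(t+\delta t)\rangle - |\psi(t)\rangle}{\delta t}
  &= \hat{H}\,|\psi(t)\rangle - \frac{i\,\delta t}{2\hbar}\hat{H}^2|\psi(t)\rangle + \dots
\end{align}
```

Dropping the terms proportional to $\delta t$ on the right-hand side and taking the limit gives the differential form. The terms dropped are exactly those containing two or more powers of the Hamiltonian within one time step.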
So my reasons for saying that philosophers of physics should not focus on the Schroedinger equation are:
- The same name is applied to two different equations in QM and QFT. The equations have the same basic form, but the symbols mean subtly different things. Just using the name Schroedinger equation without specifying which version of it could be confusing.
- The Schroedinger equation is of less significance in QFT compared to QM. In QM the Hamiltonian operator both defines the energy eigenstates, and is used directly in calculating the wavefunction evolution. In QFT, it defines the energy eigenstates. But we do not solve for the time evolution in the same way; rather than solving a differential equation, we have to consider the different possible decay paths, renormalise, and so on. The tools used to compute amplitudes are very different. A brief mention of the Schroedinger equation misses all this detail. Somebody just familiar with the QM Schroedinger equation might be misled.
- The differential form of the Schroedinger equation isn't used in QFT, while it is central to most presentations of QM. Yes, you can write QM in the path integral form, but that's not how most people learn the subject.
Coupled scalar operators
Matthew makes the point that the objects modelled in QM are not classical particles. I would in part agree. That is certainly true for fermions, and I would say that the ontological difference between fermions in QM and QFT is relatively unimportant. However, as discussed above, gauge fields in QM are treated as classical fields, and in QFT as quantum particles.
Matthew states that QFT is often introduced by first considering coupled oscillators, and then considering the limit of such a system for infinite degrees of freedom. I am not completely sure what Matthew's point is here, so my apologies if I misunderstand. I think what he is saying is that the single harmonic oscillator is a system well studied in quantum mechanics; two such oscillators is also a system well studied in quantum mechanics, n such oscillators is also equivalent to the QM theory, and we take the limit as n goes to infinity to get QFT. Thus QFT is basically just QM only for an infinite rather than finite number of degrees of freedom.
This idea is, as I understand it, usually introduced in the context of a scalar and non-interacting field theory. The Lagrangian of the scalar theory is the same as that of the harmonic oscillator in QM. If we derive equations of motion from the theory, then we uncover the quantum mechanical Klein-Gordon equation.
So what is wrong with this picture? It is based on an analogy of the mathematics. Especially in the momentum representation, the same underlying equations are used. But this does not mean that there is the same underlying physical picture. For example, both waves in water and electromagnetic waves are described by the same wave equation. Yet the water waves describe oscillating water molecules, while there is nothing physically oscillating in an electromagnetic wave. The same mathematics, but two different physical situations. We should always be careful when using an argument from analogy.
Firstly, it only works for a non-interacting scalar field. And there doesn't seem to be such a particle in nature. The closest analogue to it in the standard model might be the scalar field associated with the Higgs, but that is coupled to a gauge field. Going beyond the standard model, the inflaton might be a similar scalar field. But even here we have an interaction term. As soon as we introduce an interaction term, the analogy in the mathematics breaks down. But it is the interaction term that allows for the creation/destruction of particles, and the movement between QFT and QM. So in any real-world situation, the analogy fails.
Another issue is that in QM, the harmonic oscillator describes the different energy levels (or excitations) of a single particle. In QFT, the modes imply a creation of a new particle. Now, you might say that particles are just excitations of a field, and this proves your point. The difference is that in QM the particle is localised, and is in itself observable. It is not the excitations we observe, but the particle. In QFT, the excitations are each a localised lump, but each excitation need not be in the same area of space. We don't observe the underlying fields, but only the particles (or excitations). So if we are to take the QM analogy literally, it would effectively mean that we are putting a particle at each possible location in space, mysteriously adding the condition that we can't observe the particle in its ground (or lowest energy) state, and then calling that a quantum field. But a quantum field is not the same as a collection of quantum mechanical oscillators. It is not an infinite collection of particles. And that's obvious: if it were, then the mass of those particles (or ground state energy of the oscillators) would create a huge background gravitational field, much greater than what is observed.
Furthermore, the field in QFT is not a wavefunction, but an operator, which satisfies various commutation relations that the QM wavefunction doesn't.
So while using the quantum mechanical harmonic oscillator might be useful as a teaching tool (and I have used it myself in that way), because the mathematics is similar, it is not a useful picture when trying to understand what the quantum field is.
Path integrals in QM
Matthew also mentions that the path integral method can be used in quantum mechanics as well. This is true, and I need to admit that this omission was an error in my earlier post. I was taking a history-over-paths interpretation of QFT, and comparing it to the Copenhagen interpretation of QM. But there is, of course, also a history-over-paths interpretation of QM. Much of what I said about QFT would apply to that interpretation of QM.
But, of course, there are still differences. The QM Hamiltonian operator doesn't contain creation or annihilation operators. You don't get the same sense that it is destroying a state and then recreating it as you do in QFT. Instead we seem to have a continuous smooth evolution of the wavefunction. But most importantly, as before, is the question of the interaction term. In QM, this is vague, just a term in the Hamiltonian, and it is difficult to attribute it to any cause, any physical object. In QFT, interaction involves the creation or annihilation of a particle. It is easy to see causality at work.
The PBR theorem looks to be another variant of Bell's theorem. It is introduced in https://arxiv.org/abs/1111.3328v1 . The title of the paper is "The quantum state cannot be interpreted statistically," and Matthew's point is that since I interpret the quantum state statistically, I run afoul of the findings of this paper.
A bit of introduction first. One of the big questions when interpreting quantum physics is "What is the wavefunction?" (in quantum mechanics), or "What is the Fock state amplitude?" (in field theory). At one level, we know what it is: a complex number whose modulus squared gives the probability that you will get a particular result. Its time evolution is governed by various rules, depending precisely on the theory in question. It can be expressed in terms of a basis of orthogonal states, each basis related to the possible values of some observable, and it can also be used to calculate expectation values (or average values) of that observable. About all that, there is no disagreement.
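As a concrete illustration of the uncontroversial part, here is a minimal sketch of going from amplitudes to probabilities and an expectation value. The two-outcome state and the eigenvalues +1 and -1 are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
# Hypothetical state: complex amplitudes in the eigenbasis of some observable.
amplitudes = [complex(3, 0) / 5, complex(0, 4) / 5]
# Hypothetical eigenvalues of that observable, one per basis state.
eigenvalues = [+1.0, -1.0]


def born_probabilities(amps):
    """Modulus squared of each amplitude (Born's rule)."""
    return [abs(a) ** 2 for a in amps]


def expectation_value(amps, values):
    """Average measured value: sum over outcomes of probability times eigenvalue."""
    return sum(p * v for p, v in zip(born_probabilities(amps), values))


probabilities = born_probabilities(amplitudes)
average = expectation_value(amplitudes, eigenvalues)

print("probabilities:", probabilities)   # roughly 0.36 and 0.64
print("expectation value:", average)     # roughly -0.28
```

Note that the phase of the second amplitude (it is purely imaginary) drops out of both results. That phase only matters once amplitudes are added together, which is where the interpretational questions start.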
But that is describing what the wavefunction does. It does not explain what it is. And here, there are two general schools of thought. The first is that it is epistemic, i.e. represents our knowledge of the system. So there would be some underlying system with definite values of each parameter, but since we don't know what those values are, we have to express our uncertainty in some way, and we have to use the wavefunction to do that. The second option is that the wavefunction is seen as ontic, i.e. the wavefunction represents reality.
There are two key experiments which encapsulate quantum physics. The first is the double slit experiment. Here we have a source emitting some sort of quantum particles, a screen with two slits in it which the particles have to pass through, and then a wall of detectors after that measuring where the particles finally end up. When a particle is emitted from the source, one and only one of the detectors is triggered, characteristic of particle-like behaviour. But after we send numerous particles through the system, the number of hits on each detector follows the classical wave interference pattern. The second key experiment looks at Einstein-Podolsky-Rosen entanglement. Here a source emits two particles, which must have complementary values of one observable (e.g. one is spin up, and the other spin down). We don't know which value each has until we make a measurement. When we measure the spin of one particle, that immediately fixes the spin of the other particle, as we would expect for classical particles. But when we have the detectors slightly misaligned, the measurements on the two particles will be slightly out of sync. And the difference between them is what we would expect from classical waves rather than classical particles.
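To make the double slit statement concrete, here is a small sketch comparing the two ways of combining the paths: adding amplitudes and then squaring, versus adding the probabilities of each path. The geometry (wavelength, slit separation, screen distance) is made up for illustration, and each path is given a unit-modulus amplitude, which ignores the fall-off with distance.

```python
import cmath
import math

# Hypothetical geometry, in arbitrary units.
WAVELENGTH = 1.0
SLIT_SEPARATION = 5.0
SCREEN_DISTANCE = 100.0


def slit_amplitude(slit_y, detector_y):
    """Unit-modulus amplitude for the straight path from one slit to one detector."""
    path_length = math.hypot(SCREEN_DISTANCE, detector_y - slit_y)
    return cmath.exp(2j * math.pi * path_length / WAVELENGTH)


def amplitude_rule(detector_y):
    """Add the two amplitudes first, then take the modulus squared."""
    a1 = slit_amplitude(+SLIT_SEPARATION / 2, detector_y)
    a2 = slit_amplitude(-SLIT_SEPARATION / 2, detector_y)
    return abs(a1 + a2) ** 2


def probability_rule(detector_y):
    """Square each amplitude first, then add: no interference term survives."""
    a1 = slit_amplitude(+SLIT_SEPARATION / 2, detector_y)
    a2 = slit_amplitude(-SLIT_SEPARATION / 2, detector_y)
    return abs(a1) ** 2 + abs(a2) ** 2


for y in range(-20, 21, 4):
    print(f"y={y:+3d}  amplitudes: {amplitude_rule(y):5.3f}  probabilities: {probability_rule(y):5.3f}")
```

The probability rule gives the same flat value at every detector, while the amplitude rule swings between nearly zero and twice the flat value. Only the second matches the hit counts described above.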
In the double slit experiment, the ontic interpretation has no difficulty in explaining where the interference pattern comes from. The particle wavefunction is what is real, and that goes through both slits. But then it has problems explaining why, when we measure the interference pattern, only one detector in one particular place is triggered. The wavefunction ought to be spread out over the whole array of detectors. There must be instantaneous (i.e. faster than light) communication so that the wavefunction "knows" to only trigger the one detector. The epistemic interpretation, on the other hand, explains why only one detector is triggered easily enough. But then it has difficulties explaining the interference pattern, and particularly why it changes when one slit is closed or you measure which slit the particle went through (including measurements which don't interact with the particle, i.e. by studying an entangled partner).
For the EPR experiment, the ontic interpretation needs faster than light collapse of the wavefunction to explain why the two particles shift into the corresponding states when one of them is measured. The epistemic interpretation has no problem with this: it is only our knowledge being updated, rather than reality. However, the epistemic interpretation has difficulties explaining the differences between the measurements on the two particles when the detectors are misaligned, while the ontic interpretation has no problem with this. (This is known as Bell's theorem).
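The size of the mismatch can be illustrated with the CHSH combination of correlations. The sketch below compares the standard quantum prediction for a spin singlet, E(a, b) = -cos(a - b), with one toy hidden variable model in which each pair carries a definite hidden angle. The toy model and the detector angles are hypothetical choices for illustration; it is one example of a classical model, not a proof covering all of them.

```python
import math


def quantum_correlation(a, b):
    """Singlet-state spin correlation for detector angles a and b (radians)."""
    return -math.cos(a - b)


def sign(x):
    return 1.0 if x >= 0 else -1.0


def toy_hidden_variable_correlation(a, b, samples=3600):
    """Hypothetical local model: each pair carries a hidden angle lam, and each
    detector reports the sign of the projection of lam onto its own axis."""
    total = 0.0
    for k in range(samples):
        lam = 2 * math.pi * (k + 0.5) / samples   # evenly spaced hidden angles
        outcome_a = sign(math.cos(a - lam))
        outcome_b = -sign(math.cos(b - lam))      # partner is anti-aligned
        total += outcome_a * outcome_b
    return total / samples


def chsh(correlation):
    """CHSH combination for one standard choice of four detector angles."""
    a, a2 = 0.0, math.pi / 2
    b, b2 = math.pi / 4, 3 * math.pi / 4
    return (correlation(a, b) - correlation(a, b2)
            + correlation(a2, b) + correlation(a2, b2))


quantum_s = chsh(quantum_correlation)
toy_s = chsh(toy_hidden_variable_correlation)

print("quantum |S|:", abs(quantum_s))   # 2 * sqrt(2), about 2.83
print("toy model |S|:", abs(toy_s))     # 2, the classical bound
```

The toy model gives correlations that fall off linearly with the angle between the detectors, while the quantum prediction follows a cosine. The misaligned settings are exactly where the two disagree.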
So the ontic interpretation seems to require faster than light physical effects, which is inconsistent with special relativity. The epistemic interpretation seemingly struggles in explaining why there is an interference pattern (what is the particle interfering with?), and the precise details of the EPR experiment seem to rule out property realism, i.e. that the underlying objects have well defined properties, which is assumed by most epistemic interpretations.
I should say that my own interpretation (which has slightly evolved over time, so some of my earlier writing doesn't explain this well) is a bit of a mixture of the two interpretations.
So we need to step back a bit, and look at two fundamental assumptions:
- In physics, we assume that there is a one-to-one mapping between the physical particle (or more accurately, some of the aspects of the physical particle, but not all of them) and a mathematical representation. We start with the physical state. We map it to an initial state in the mathematical representation. We then let nature take its course for the physical particle, and manipulate the mathematical representation in some way. We then map back from the representation to make a prediction about the physical reality, which we can then compare against a new measurement. The mathematical representation is not reality, but it must contain all the required degrees of freedom to represent the different states or potentia inherent in reality. It must also respect the same symmetries as occur in reality. There are aspects of reality which are not present in the representation. Not every attribute or state of the particle can be measured or represented by a number. We might introduce additional degrees of freedom in the representation which are not part of reality -- this is OK as long as there is some symmetry preventing them from affecting the final result. However, in that case, the mathematical representation might mislead us when we try to apply it directly to reality, because there is more in the representation than there is in reality.
- (This is a bit more controversial.) Probability is an expression of our uncertainty when one or more of the causes is unknown. It is also objective rather than subjective. This is possible because all probabilities are conditional. We start from a series of premises (which under-determine the system), crank the mathematical handle, and express the conclusion about whether a particular outcome will occur as a number between zero and one. This is a prediction, which can be compared (though doesn't have to be) to a frequency distribution after an infinite number of samples. It is objective because everyone (who doesn't make a mistake) starting from the same premises will reach the same conclusion. Different people will have a different (subjective) knowledge of the system, so will disagree about what the premises should be if the calculation is to mirror reality, but that doesn't affect the calculation itself, since each of them will get the same result if they start from the same premises.
I should say that there are many different interpretations of probability. I don't want to discuss the merits (or otherwise) of each of them here. But I use this definition because it is the one which matches the calculations we do in physics (or at least those calculations we are doing here). In physics, we start with an initial state (which usually will correspond to a partial knowledge of the physical system), crank the mathematical handle, and make predictions concerning the final state.
This interpretation of probability leads most naturally to an epistemic interpretation of the wavefunction. But there is a catch. Probability theory assumes that the most fundamental states representing the outcome are orthogonal to each other, i.e. you can't be in two different states at the same time. It also requires that there is a single basis or representation of the outcomes. Neither of these is consistent with quantum states. So one reason we use amplitudes rather than probabilities to measure uncertainties in quantum physics is that it is impossible to represent the physical states with the sort of basis that is used in probability calculations. If we were to have a perfect knowledge of the quantum state, and chose to represent it mathematically, then that representation would resemble a quantum mechanical amplitude. In this sense, the wavefunction contains real information about the particle. But it also exists as a measure of our uncertainty.
Some observables of interest (such as location) can be expressed as a single orthogonal basis. Here the wavefunction just carries epistemic information. Others (such as the spin state, or the gauge of a particle) can't be. Here the wavefunction carries both epistemic and ontic information (albeit that reality doesn't have a concept of an absolute direction in, for example, spin states or polarisation).
Now an epistemic interpretation of the wavefunction assumes that there is some underlying reality. There are some "hidden variables" expressing the real state of the system. These variables will "determine" the outcome of the various measurements we make. The question is, how are these variables expressed? Is it just like a classical theory, where there is a single, unique basis and every observable property is precisely determined? Is it an ensemble theory, where there are in reality a large number of objects, and it is something like a water wave? Or a quantum theory, where there is no unique basis for the allowed states, and therefore there can be superpositions (the particle is in a unique state in one basis and therefore spread across the states in the other bases), and certain observables will be indeterminate? The answer is that it depends which observable we are looking at. In terms of location, at any given moment the particle is in a definite place (we just don't know what it is). In terms of spin state, the particle is in a definite state in one basis, which means that it is in an indefinite state in all the other bases. For these observables, the wavefunction contains information which is both ontic and epistemic.
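The statement about spin can be checked with a few lines of arithmetic. This is a minimal sketch for a spin-1/2 particle, with the measurement axis restricted to the x-z plane so that all the amplitudes are real; the function names are my own.

```python
import math


def spin_basis(theta):
    """Spin-up and spin-down states along an axis tilted by theta from the z-axis
    (within the x-z plane), written as components in the z basis."""
    up = [math.cos(theta / 2), math.sin(theta / 2)]
    down = [-math.sin(theta / 2), math.cos(theta / 2)]
    return up, down


def dot(u, v):
    """Inner product; plain multiplication suffices because the components are real."""
    return sum(x * y for x, y in zip(u, v))


def outcome_probabilities(state, theta):
    """Born-rule probabilities for the up and down results along the tilted axis."""
    up, down = spin_basis(theta)
    return dot(up, state) ** 2, dot(down, state) ** 2


# A particle in a definite state in the z basis: spin up along z.
state = spin_basis(0.0)[0]

along_z = outcome_probabilities(state, 0.0)
along_x = outcome_probabilities(state, math.pi / 2)

print("measured along z:", along_z)   # definite: (1, 0)
print("measured along x:", along_x)   # indefinite: roughly (0.5, 0.5)
```

The same state is completely definite in one basis and evenly spread in another, which is the sense in which no single basis carries all the information.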
So in the two slit experiment, the particle is really in one particular location, which explains why only one detector fires. But the phase of the particle, which leads to the interference pattern, is really present, so you get the wave-like data. Recall that the amplitude is used to make predictions of the distribution of an ensemble of particles. Each individual particle goes down a specified path regardless of what happens elsewhere. There is a certain amplitude that it ends up in a particular place. When we have many particles, each of them goes down its own path, and hits one particular point on the screen. But to compare to the observable, frequency, we need to convert from the amplitude measure of uncertainty to a probability, and this is where we destroy the phase data and have the "wavefunction collapse". The collapse occurs only because we are making a prediction about an observable which doesn't take into account the underlying phase data, and therefore need to average over it.
With regards to the EPR experiment, we can again say that the particles have a definite location and spin state. But we don't know what this is. And in this case, the observations don't help us know what it is. As soon as we make an observation, we project the particle into a particular basis, and lose all the original information. But again, we are interested in making predictions. The question is "If we measure this value for the first particle with the detector orientated in this particular way, what is the amplitude for a particular measurement-result on the other particle?" Because the spin state is represented by the amplitude at both the ontic and epistemic level, the calculation follows what you would expect for waves. No faster than light communication is required. The particles don't emerge from the source in a classical "single state = single observable" configuration as is used in a probability calculation. Instead they leave the source with a single state in a single basis, similar to how the QM wavefunction treats them, only that in reality the different basis states aren't quite as we represent them because they lack the "preferred direction" that we impose by sticking the representation in the coordinate system. We can't say that the physical particles are emitted with their spins aligned along the x-axis because reality doesn't know anything about an x-axis until we take a measurement.
The problem I have with all papers discussing some variation of Bell's theorem is that they assume that the wavefunction is either ontic or epistemic, but exclude the hybrid approach. They invariably introduce a hidden variable theory (which they will claim gives different results to those of quantum theory), and use probabilities to express uncertainty in those states. They discuss a different type of epistemic interpretation to the one I use; something more akin to an ensemble interpretation. This means that their hidden variable theory must be in the classical "one orthonormal state for one property" form. They would have different, independent, numbers for each observable. It is at this point I would disagree. Uncertainty in the hidden variable theories is described by an amplitude. It is possible to deny property realism while still accepting that there is an underlying physical state and that the wavefunction is (in part) epistemic. The chances for alignment or misalignment of the spin of a particle in neighbouring orientations are not governed by classical probability, as assumed by Bell's theorem and its variations. If any of these papers mentions the word "probability" in describing the different states of the hidden variable theory (rather than in discussing measurement outcomes) then they are restricting the scope of their argument to classical or ensemble hidden variable theories. But very few people who adopt an epistemic approach will accept this.
Early on in the PBR paper, we read:
Our main assumption is that after preparation, the quantum system has some set of physical properties. These may be completely described by quantum theory, but in order to be as general as possible, we allow that they are described by some other, perhaps undiscovered theory. Assume that a complete list of these physical properties corresponds to some mathematical object, λ.
So we are immediately off to a bad start. A list usually implies a set of things. Standard set theory assumes that these things can be reduced to a number of irreducible and orthogonal objects. That is what I am denying with regards to quantum states. If "physical properties" means observables, then that is immediately flawed. A quantum particle can't be defined by its observables, since in many states various observables are indeterminate. If it means states, then we might be OK, as long as the "list" is defined as a list of irreducible states which need not be orthogonal. But if that is the case, then you can't use λ to define any probabilities.
Let's read on. The article goes on to describe an experiment where there are two different methods of preparing a state. If method 1 is used, then you get a state |ϕ0>, and if method 2 is used then you get a state |ϕ1>.
The paper then defines the two interpretations. The first is that the quantum state is physical: λ is |ϕ0> or |ϕ1>, perhaps augmented by additional variables not described by quantum theory. The second is the statistical view, where the quantum state need not be completely determined by λ; some values of λ may be compatible with either state.
This can be understood via a classical analogy. Suppose there are two different methods of flipping a coin, each of which is biased. Method 1 gives heads with probability p0 > 0 and method 2 with probability 0 < p1, p1 ≠ p0. If the coin is flipped only once, there is no way to determine by observing only the coin which method was used. The outcome heads is compatible with both. The statistical view says something similar about the quantum system after preparation. The preparation method determines either |ϕ0> or |ϕ1> just as the flipping method determines probabilities for the coin. But a complete list of physical properties λ is analogous to a list of coin properties, such as position, momentum, etc. Just as “heads up” is compatible with either flipping method, a particular value of λ might be compatible with either preparation method.
We will show that the statistical view is not compatible with the predictions of quantum theory. We will begin
There are two differences between the "statistical" view of the PBR paper, and the position I hold. Firstly, the analogy with a classical system such as coin flipping invokes probability. My claim is that uncertainties should be expressed as amplitudes, until one has an ensemble of measurements and wants to compare the experimental results with a frequency distribution. At that point, one converts the amplitude into a probability via Born's rule, and then uses the probability to construct an expected frequency for testing against experiment. Secondly, there is the identification of λ with a list of coin properties, such as position, momentum and so on. I would not identify the hidden state with a list of properties, but with an underlying state. So it seems that I am closer to their first view than to their statistical view, even though my interpretation is partially epistemic.
But this is only an analogy, so maybe it's not fatal. Let's read on.
And later on, we read,
The simple argument is as follows. Choose a basis of the Hilbert space so that |ϕ0> = |0> and |ϕ1> = |+> = (|0> + |1>)/√2. In order to derive a contradiction, assume that there is some chance that the complete physical state, λ, of the system is compatible with either preparation method. Suppose that for either method, the probability of this happening is at least q > 0. (Of course it may be the case that, given λ, one method is more likely than the other – the only assumption here is that some fraction q of the time, λ does not determine with certainty which method was used.)
The two systems are brought together and measured. An important assumption for the argument now is that the behaviour of the measuring device – in particular the probabilities for different outcomes – is only determined by the complete physical state of the two systems at the time of measurement, along with the physical properties of the measuring device. Once these things are specified, there can be no remaining dependence on the quantum state of the two systems.
In their statistical interpretation, they are using probability to measure uncertainty of the system being in one state or another. This runs against my dictum that all uncertainties should be expressed as amplitudes until one compares with a frequency distribution. In other words, the PBR theorem is irrelevant for my work. The interpretation they disprove is not one that I hold.
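For readers who want to see the step that the quoted passage is setting up: as I read the paper, the two systems are measured jointly in an entangled basis of four states, each of which is orthogonal to one of the four possible product preparations. The sketch below only checks that orthogonality numerically. The variable names are mine, and the labelling of the outcomes by the preparation they exclude is my own bookkeeping.

```python
import math

S = 1 / math.sqrt(2)
ZERO, ONE = [1.0, 0.0], [0.0, 1.0]
PLUS, MINUS = [S, S], [S, -S]


def kron(u, v):
    """Tensor product of two single-system states."""
    return [x * y for x in u for y in v]


def superpose(u, v):
    """Equal-weight sum of two two-system states, with a 1/sqrt(2) factor."""
    return [S * (x + y) for x, y in zip(u, v)]


def dot(u, v):
    """Inner product; all components here are real."""
    return sum(x * y for x, y in zip(u, v))


# The four ways the two independently prepared systems can arrive.
preparations = {
    "0,0": kron(ZERO, ZERO),
    "0,+": kron(ZERO, PLUS),
    "+,0": kron(PLUS, ZERO),
    "+,+": kron(PLUS, PLUS),
}

# Joint measurement basis, each outcome labelled by the preparation it excludes.
measurement = {
    "0,0": superpose(kron(ZERO, ONE), kron(ONE, ZERO)),
    "0,+": superpose(kron(ZERO, MINUS), kron(ONE, PLUS)),
    "+,0": superpose(kron(PLUS, ONE), kron(MINUS, ZERO)),
    "+,+": superpose(kron(PLUS, MINUS), kron(MINUS, PLUS)),
}

for label in preparations:
    probability = dot(measurement[label], preparations[label]) ** 2
    print(f"outcome excluding {label}: probability {probability:.12f} for that preparation")
```

Each outcome has zero quantum probability for one of the preparations. The paper's argument is that a λ compatible with both preparation methods would then have no outcome it was allowed to produce. That step relies on assigning probabilities to the values of λ, which is the part I dispute above.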
Finally, I ought to nit-pick how the authors phrase the problem. In their opening paragraph, as well as their conclusion, the authors use the following language:
Nevertheless most physicists and chemists concerned with pragmatic applications successfully treat the quantum state as a real object encoding all properties of microscopic systems.
Regardless of whether one treats the wavefunction ontologically, epistemically, or something between the two, nobody ought to treat the quantum state as a real object. It is at best a partial representation of reality. Do not confuse the representation or model with reality.