Original by Matt McIrvin 1994.
An electromagnetic field wiggles in the same way when it possesses waves. Applying quantum mechanics to this oscillator reveals that it must also have discrete, evenly spaced energy levels. These energy levels are what we usually identify as different numbers of photons. The higher the energy level of a vibrational mode, the more photons there are. In this way, an electromagnetic wave acts as if it were made of particles. The electromagnetic field is a quantum field.
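For reference, the standard textbook result for a single vibrational mode of angular frequency ω can be written as follows (the symbols n, ω and ħ are introduced here and are not in the original text):

```latex
E_n = \left(n + \tfrac{1}{2}\right)\hbar\omega, \qquad n = 0, 1, 2, \dots
```

Adjacent levels are separated by the same amount, ħω, and the state labeled n is the one usually read as containing n photons in that mode.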
Electromagnetic fields can do things other than vibration. For instance, the electric field produces an attractive or repulsive force between charged objects, which varies as the inverse square of distance. The force can change the momenta of the objects.
Can this be understood in terms of photons as well? It turns out that, in a sense, it can. We can say that the particles exchange "virtual photons" which carry the transferred momentum. Here is a picture (a "Feynman diagram") of the exchange of one virtual photon.
    \                /
     \        <- p  /
      >~~~         /             ^ time
     /    ~~~~    /              |
    /         ~~~<               |
   /              \               ---> space
  /                \
The lines on the left and right represent two charged particles, and the wavy line (jagged because of the limitations of ASCII) is a virtual photon, which transfers momentum from one to the other. The particle that emits the virtual photon loses momentum p in the recoil, and the other particle gets the momentum.
This is a seemingly tidy explanation. Forces don't happen because of any sort of action at a distance; they happen because of virtual particles that spew out of things and hit other things, knocking them around. However, this is misleading. Virtual particles are really not just like classical bullets.
The most obvious problem with a simple, classical picture of virtual particles is that this sort of behavior can't possibly result in attractive forces. If I throw a ball at you, the recoil pushes me back; when you catch the ball, you are pushed away from me. How can this attract us to each other? The answer lies in Heisenberg's uncertainty principle.
Suppose that we are trying to calculate the probability (or, actually, the probability amplitude) that some amount of momentum, p, gets transferred between a couple of particles that are fairly well-localized. The uncertainty principle says that definite momentum is associated with a huge uncertainty in position. A virtual particle with momentum p corresponds to a plane wave filling all of space, with no definite position at all. It doesn't matter which way the momentum points; that just determines how the wavefronts are oriented. Since the wave is everywhere, the photon can be created by one particle and absorbed by the other, no matter where they are. If the momentum transferred by the wave points in the direction from the receiving particle to the emitting one, the effect is that of an attractive force.
The moral is that the lines in a Feynman diagram are not to be interpreted literally as the paths of classical particles. Usually, in fact, this interpretation applies to an even lesser extent than in my example, since in most Feynman diagrams the incoming and outgoing particles are not very well localized; they're supposed to be plane waves too.
The uncertainty principle opens up the possibility that a virtual photon could impart a momentum that corresponds to an attractive force as well as to a repulsive one. But you may well ask what makes the force repulsive for like charges and attractive for opposite charges! Does the virtual photon know what kind of particle it's going to hit?
It's hard even for particle physicists to see this using the Feynman diagram rules of QED, because they're usually formulated in a manner designed to answer a completely different question: that of the probability of particles in plane-wave states scattering off of each other at various angles. Here, though, we want to understand what nudges a couple of particles that are just sitting around some distance apart—to explain the experiment you may have done in high school, in which charged balls of aluminum foil repel each other when hanging from strings. We want to do this using virtual particles. It can be done.
In QED, as in quantum mechanics in general, there are wave functions with complex-number values whose magnitudes have to be squared to get probabilities. We want to see that the wave function changes so that the like charges, on average, are repelled from each other, and the unlike charges, on average, are attracted.
Suppose, for simplicity, that the charged particles' wave functions are initially Gaussians at rest, that is, normal bell-shaped, real-valued functions, and that they are lined up along the x axis. You can think of the wave functions, schematically, as looking like this:
             ____                         ____
            /    \                       /    \   x ->
          _/      \_                   _/      \_
0 _______/          \_________________/          \__________
where you are supposed to imagine that those ASCII stairs are actually continuous, smooth curves. Imagine, furthermore, that the distance between the two lumps in this diagram is much larger than the width of a lump. If you know some quantum mechanics, you know that in the absence of any forces, the lumps will just spread out symmetrically (well, if the particles are identical we have to worry about other details when they start to overlap substantially, but if the lumps are far apart that won't happen for a while). If there is an overall constant potential energy, that will give the wave functions an additional rotating phase, but we can always ignore that without affecting any physical quantities.
Concentrate on one of the particles, say, the one to the left. As well as the ordinary wave functions that are functions of position, I can also define wave functions in "momentum space": there is a probability amplitude for every momentum, which you square to find the probability density. If its wave function in space is Gaussian, then the wave function in "momentum space" is also Gaussian: as a function of the x component of momentum it is also bell-shaped. The narrower the position-space Gaussian is, the wider the momentum-space Gaussian is; that is Heisenberg's uncertainty principle!
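The width trade-off described here can be checked numerically. The following is a minimal sketch, assuming units with ħ = 1 and a packet of the form exp(-x²/4σ²); the function name `widths` and the grid parameters are illustrative choices, not anything from the original text.

```python
import numpy as np

def widths(sigma_x, n=4096, span=200.0):
    """Return (position width, momentum width) of a real Gaussian packet, with hbar = 1."""
    x = np.linspace(-span / 2, span / 2, n, endpoint=False)
    dx = x[1] - x[0]
    psi_x = np.exp(-x**2 / (4.0 * sigma_x**2))      # Gaussian at rest, centered on x = 0
    prob_x = psi_x**2
    prob_x /= np.sum(prob_x) * dx                    # normalize the position distribution

    # Momentum grid matching the FFT ordering (p = k because hbar = 1)
    p = 2.0 * np.pi * np.fft.fftshift(np.fft.fftfreq(n, d=dx))
    dp = p[1] - p[0]
    psi_p = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(psi_x)))
    prob_p = np.abs(psi_p)**2
    prob_p /= np.sum(prob_p) * dp                    # normalize the momentum distribution

    # Both distributions are centered on zero, so the width is the rms value
    width_x = float(np.sqrt(np.sum(x**2 * prob_x) * dx))
    width_p = float(np.sqrt(np.sum(p**2 * prob_p) * dp))
    return width_x, width_p

for s in (0.5, 1.0, 2.0):
    wx, wp = widths(s)
    print(s, wx, wp, wx * wp)   # narrower in position means wider in momentum
```

For a Gaussian the product of the two widths stays at the minimum the uncertainty principle allows, so halving one width doubles the other.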
                      _____________
                   __/      ^      \__
                __/         |         \__      p ->
           ____/            |            \____
0 ________/           zero momentum           \____________
In order to make this problem tractable, I should specify that the momentum-space wave function isn't so wide (in other words, the position-space wave function isn't so narrow) that consequent relativistic effects become large. (For electrons, this doesn't happen until the position wave functions are squeezed into a space much smaller than an atom; and if the particle is more massive you have to squeeze it even more.) Also, I will ignore the particles' magnetic moments, if they have them, because all I care about is the electrostatic force.
Now, consider a virtual photon that comes from the particle on the right and is absorbed by the particle on the left. Actually calculating the photon's wave function is a little hairy; I have to consider the possibility that the photon was emitted by the other particle at any prior time. (However, I can save myself a little effort later by automatically including the possibility that the photon actually comes from the particle on the left and is absorbed by the particle on the right, with the recoil nudging the left particle: all I have to do is include situations in which the photon is "emitted on the right" in the future and goes "backward in time," and take its momentum to be minus what it really is! As long as I remember what's really going on, this trick is formally OK and saves a lot of trouble; it was introduced by Richard Feynman.)
When I include all of these possibilities, it turns out that I can approximate the photon's momentum-space wave function usably well by the following: the wave function is a function proportional to the electric charge of the emitting particle (in a sense this defines what electric charge is), and it has a few big, narrow spikes in it. One spike is proportional to -i times the charge, and is to the left of the origin; the other spike is minus that and is to the right of the origin. (There is also a third spike at zero momentum that has a real amplitude, but it turns out not to do anything important at the end of the day—it provides a constant potential energy—so I'll ignore it.) The imaginary component of the photon wave function looks like this, if the emitting particle was a negatively charged electron:
              |
          +i  |       zero momentum        p -->
              |             |
              |             v
0 ____________|________________________________________
                                          |
                                          |
                                          |
                                      -i  |
                                          |
If the emitting particle was positively charged, this picture is upside down.
(A note for experts only: The somewhat QED-savvy may be puzzled by the total nonresemblance of this to any well-known photon propagator. That's because I'm not going into momentum space in every direction, just in the x direction. The more QED-savvy will notice that I am making some pretty monstrous oversimplifications here. Actually they are not so bad; what I'm doing is the equivalent of assuming that the potential can be locally approximated by a sinusoid! If the wave packet is small enough in position space, a Coulomb potential and a sinusoidal one are both effectively a constant-force potential, so I can do this. Neglecting all magnetic effects and taking the nonrelativistic limit, the amplitude for transfer of a given momentum by a single virtual photon (which is essentially what I am colorfully, and without much prevarication, labeling the "photon's momentum-space wave function") has to have an imaginary part odd in p_x because the potential is real, so in any case the qualitative effect will be the same as what I describe below, and for essentially the same reasons. It's just so much easier to convolve spikes. As for the single-particle "wave functions" of the charged particles, I can speak of them with fair correctness because the particles are far apart and slowly moving.)
The effect of a virtual photon hit on the charged particle's momentum-space wave function is, then, quite simple. The photon has a certain probability amplitude of knocking the charged particle to the left and a certain amplitude of knocking it to the right. The probability amplitude for each possibility is just proportional to i times the charge of the particle times the photon wave function times the time! (The other constants of proportionality depend on the system of units; we're not being terribly quantitative so don't worry about them.) We multiply the original charged particle's wave function—shifted to the right or left in momentum space, depending on which way it got knocked by the photon—times this amplitude, for each of the two possibilities, and then add the modified wave functions for the two possibilities together.
If both particles are negatively charged, or both are positively charged, then we're adding a right-side-up wave function, shifted to the left, to an upside-down wave function shifted to the right. The result is real-valued and looks something like this:
              _____
       +    _/     \_  zero momentum
         __/         \   |                         p -->
      __/             \  |
0 ___/                 \ v
                        \                   ________
                         \               __/
                          \           __/    -
                           \_       _/
                             \_____/
and it increases in size as time goes on, from zero at the start of the problem. If the particles have opposite charges, then you should flip that picture upside down. The result is proportional to the product of the two charges, because we multiplied in the other particle's charge when finding the photon wave function, and this particle's charge when the interaction happened.
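The bookkeeping of the last few paragraphs can be written schematically as follows. The notation is introduced here and is not in the original: ψ₀(p) is the unperturbed momentum-space wave function of the struck particle, q_a and q_e are the charges of the absorbing and emitting particles, k > 0 is the distance of each spike from the origin, t is the elapsed time, and c is a positive constant that depends on the system of units. The spikes are idealized as delta functions and the zero-momentum spike is dropped, as in the text.

```latex
\begin{aligned}
A(p) &= -i\,q_e\,\delta(p + k) \;+\; i\,q_e\,\delta(p - k) \\
\psi_1(p, t) &= i\,c\,q_a\,t\,\bigl[\,(-i\,q_e)\,\psi_0(p + k) + (i\,q_e)\,\psi_0(p - k)\,\bigr] \\
             &= c\,q_a q_e\,t\,\bigl[\,\psi_0(p + k) - \psi_0(p - k)\,\bigr]
\end{aligned}
```

Here ψ₀(p + k) is the original lump shifted to the left and ψ₀(p − k) is the lump shifted to the right. The result is real, grows linearly with t, and changes overall sign with the product q_a q_e.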
Now, by now you might be a little disturbed. We get probabilities by squaring amplitudes. The lump to the right of the origin goes down just as far as the lump to the left goes up. So isn't the probability that the photon knocked the particle's momentum toward the other one just as large as the probability that it knocked it away? No, because there is still some probability amplitude that no photon interaction occurred at all, and since we have no way of unambiguously telling one possibility from the other, we need to add the two wave functions together before squaring them! (There are also amplitudes for larger numbers of interactions, but for short times, we need not worry about those. Also, the "no-hit" wave function is not exactly the unmodified one, because of its own natural time evolution, but for short times all that does to the momentum-space wave function is give it a small imaginary part that we don't care about here.)
I said the unmodified wave function was positive, so the post-hit wave function will interfere with it constructively on the left side of the origin, and destructively on the right (or vice versa if the particles have opposite charges). So after a little time has passed, the wave function looks something like this in momentum space:
          ________    zero momentum
        _/        \___      |
       /              \__   v      p -->
     _/                  \__
   _/                       \______
0 /                                \________________
Squaring the wave function gives you a probability distribution whose hump is also shifted to the left. The momentum of the particle is skewed leftward—it is being repelled from the other particle! If the charges have opposite signs, the interference goes the other way and the particle's momentum is skewed rightward, resulting in a net attraction. The position-space wave function itself will tend to move leftward or rightward as it spreads out, as the case may be.
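The interference argument can be tried out numerically. What follows is a toy sketch under stated assumptions, not a QED calculation: the unperturbed wave function is a real Gaussian in momentum space, the one-photon term is a left-shifted copy minus a right-shifted copy, and the names `mean_momentum`, `eps` (a small stand-in for the unit-dependent constant times the elapsed time) and `k` (the size of the momentum kick) are all hypothetical.

```python
import numpy as np

def mean_momentum(charge_product, eps=0.05, k=0.5, sigma_p=1.0, include_no_hit=True):
    """Mean momentum of the left-hand particle after a short time (arbitrary units).

    charge_product: +1 for like charges, -1 for opposite charges.
    eps, k, sigma_p: illustrative values, not taken from the text.
    """
    p = np.linspace(-10.0, 10.0, 4001)
    dp = p[1] - p[0]

    def psi0(q):
        # Unperturbed real Gaussian centered on zero momentum
        return np.exp(-q**2 / (4.0 * sigma_p**2))

    # One-photon term: right-side-up lump shifted left, upside-down lump shifted right
    hit = charge_product * eps * (psi0(p + k) - psi0(p - k))
    # Add the "no hit" amplitude before squaring, unless asked to leave it out
    psi = psi0(p) + hit if include_no_hit else hit
    prob = psi**2
    prob /= np.sum(prob) * dp
    return float(np.sum(p * prob) * dp)

print(mean_momentum(+1))                         # negative: pushed away from the particle on the right
print(mean_momentum(-1))                         # positive: pulled toward the particle on the right
print(mean_momentum(+1, include_no_hit=False))   # essentially zero: no interference, no net push
```

The third call shows why the "no hit" amplitude matters: the one-photon term squared on its own is symmetric about zero momentum, so all of the net push comes from the cross term.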
You might wonder: What happens when the time gets late enough that the negative hump in the struck wave function more than cancels the original wave function? Well, at those times, my analysis here is not enough, because there is also a significant amplitude that two photons have hit the particle (and things get gnarlier, because they could have hit it in any order); for still longer times I need to consider three, and so on.
The important point is that the photon doesn't "know" that it's going to hit a particle of the same charge as the one that emitted it, or of the opposite charge. The distinction between attraction and repulsion actually arises when the effect of the virtual photon interferes with the unperturbed wave function! In general, the distinction comes from interference between the contributions from odd and even numbers of virtual photons traveling from one particle to the other. Each such photon multiplies a factor of the product of the two charges into its contribution to the wave function; so the odd processes will get a factor of −1 from this product (times other things, of course) if the charges are different and +1 if they are alike, whereas the even processes get a factor of +1 in either case. The interference between "odd" and "even" terms in the wave function yields effects which survive even upon squaring the amplitude to get a probability. In the discussion above, by limiting consideration to short times, I've been able to ignore everything but the no-photon and one-photon processes.
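Schematically, and in notation that is not in the original, let g be a real factor proportional to the product of the two charges, and let ψ_n be the contribution from n exchanged photons with that factor pulled out. Then:

```latex
\psi = \sum_{n \ge 0} g^{n}\,\psi_n ,
\qquad
|\psi|^2 = \sum_{m,\,n \ge 0} g^{\,m+n}\,\psi_m^{*}\,\psi_n .
```

Terms with m + n even are unchanged when g changes sign. Terms with m + n odd, which always pair an odd process with an even one, flip sign along with g. At short times the leading term of this kind is 2g Re(ψ₀* ψ₁), the interference between the no-photon and one-photon processes.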
This interference, with the amplitude for photon collision increasing smoothly with time, is also part of the reason why you can regard a stately and continuous thing like the evolution of a wave packet as the result of violent particle-collision events. As discordant as these phenomena may seem, they are actually two sides of the same coin. In the classical realm we don't see the spreading of the wave functions, but we do see this gradual net change in momentum, and it is what we call a force.
We are really using the quantum-mechanical approximation method known as perturbation theory. In perturbation theory, systems can go through intermediate "virtual states" that normally have energies different from that of the initial and final states. This is because of another uncertainty principle, which relates time and energy.
In the pictured example, we consider an intermediate state with a virtual photon in it. It isn't classically possible for a charged particle to just emit a photon and remain unchanged (except for recoil) itself. The state with the photon in it has too much energy, assuming conservation of momentum. However, since the intermediate state lasts only a short time, the state's energy becomes uncertain, and it can actually have the same energy as the initial and final states. This allows the system to pass through this state with some probability without violating energy conservation.
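In symbols, the heuristic time-energy relation being invoked here is usually written as follows, where Δt is the lifetime of the intermediate state and ΔE is the spread in its energy (notation added here, and meant only as an order-of-magnitude statement):

```latex
\Delta E \,\Delta t \gtrsim \hbar
```

The shorter the intermediate state lasts, the wider the range of energies it can have, which is what lets it overlap the energy of the initial and final states.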
Some descriptions of this phenomenon instead say that the energy of the system becomes uncertain for a short period of time, that energy is somehow "borrowed" for a brief interval. This is just another way of talking about the same mathematics. However, it obscures the fact that all this talk of virtual states is just an approximation to quantum mechanics, in which energy is conserved at all times. The way I've described it also corresponds to the usual way of talking about Feynman diagrams, in which energy is conserved, but virtual particles can carry amounts of energy not normally allowed by the laws of motion.
(General relativity creates a different set of problems for energy conservation; that's described elsewhere in the sci.physics FAQ.)
In section 2, the virtual photon's plane wave is seemingly created everywhere in space at once, and destroyed all at once. Therefore, the interaction can happen no matter how far the interacting particles are from each other. Quantum field theory is supposed to properly apply special relativity to quantum mechanics. Yet here we have something that, at least at first glance, isn't supposed to be possible in special relativity: the virtual photon can go from one interacting particle to the other faster than light! It turns out, if we sum up all possible momenta, that the amplitude for transmission drops as the virtual particle's final position gets farther and farther outside the light cone, but that's small consolation. This "superluminal" propagation had better not transmit any information if we are to retain the principle of causality.
I'll give a plausibility argument that it doesn't in the context of a thought experiment. Let's try to send information faster than light with a virtual particle.
Suppose that you and I make repeated measurements of a quantum field at distant locations. The electromagnetic field is sort of a complicated thing, so I'll use the example of a field with just one component, and call it F. To make things even simpler, we'll assume that there are no "charged" sources of the F field or real F particles initially. This means that our F measurements should fluctuate quantum-mechanically around an average value of zero. You measure F (really, an average value of F over some small region) at one place, and I measure it a little while later at a place far away. We do this over and over, and wait a long time between the repetitions, just to be safe.
                 .
                 .
                 .
                     ------X
               ------
        X------                       ^ time
                     ------X me       |
               ------                 |
  you   X------                        ---> space
After a large number of repeated field measurements we compare notes. We discover that our results are not independent; the F values are correlated with each other—even though each individual set of measurements just fluctuates around zero, the fluctuations are not completely independent. This is because of the propagation of virtual quanta of the F field, represented by the diagonal lines. It happens even if the virtual particle has to go faster than light.
However, this correlation transmits no information. Neither of us has any control over the results we get, and each set of results looks completely random until we compare notes (this is just like the resolution of the famous EPR "paradox").
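The statistical point, that two records can each look like pure noise and still be correlated with each other, can be illustrated with a purely classical toy model. This is only an analogy, not a field-theory calculation: the correlation strength `rho`, the unit variances, and the function name `simulate_measurements` are assumptions made for the illustration.

```python
import numpy as np

def simulate_measurements(n_trials=100_000, rho=0.3, seed=0):
    """Draw paired zero-mean readings with correlation rho (toy stand-in for F values)."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    samples = rng.multivariate_normal([0.0, 0.0], cov, size=n_trials)
    return samples[:, 0], samples[:, 1]

you, me = simulate_measurements()
print(you.mean(), me.mean())          # each record on its own just fluctuates around zero
print(np.corrcoef(you, me)[0, 1])     # the correlation shows up only when the records are compared
```

Neither record on its own reveals the value of `rho`, and neither experimenter chooses the numbers that come out, which is why the correlation cannot carry a message.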
You can do things to fields other than measure them. Might you still be able to send a signal? Suppose that you attempt, by some series of actions, to send information to me by means of the virtual particle. If we look at this from the perspective of someone moving to the right at a high enough speed, special relativity says that in that reference frame, the effect is going the other way:
                 .
                 .
                 .
        X------
               ------
                     ------X
  you   X------                       ^ time
               ------                 |
                     ------X me       |
                                       ---> space
Now it seems as if I'm affecting what happens to you rather than the other way around. (If the quanta of the F field are not the same as their antiparticles, then the transmission of a virtual F particle from you to me now looks like the transmission of its antiparticle from me to you.) If all this is to fit properly into special relativity, then it shouldn't matter which of these processes "really" happened; the two descriptions should be equally valid.
We know that all of this was derived from quantum mechanics, using perturbation theory. In quantum mechanics, the future quantum state of a system can be derived by applying the rules for time evolution to its present quantum state. No measurement I make when I "receive" the particle can tell me whether you've "sent" it or not, because in one frame that hasn't happened yet! Since my present state must be derivable from past events, if I have your message, I must have gotten it by other means. The virtual particle didn't "transmit" any information that I didn't have already; it is useless as a means of faster-than-light communication.
The order of events does not vary in different frames if the transmission is at the speed of light or slower. Then, the use of virtual particles as a communication channel is completely consistent with quantum mechanics and relativity. That's fortunate: since all particle interactions occur over a finite time interval, in a sense all particles are virtual to some extent.
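The frame dependence of time ordering used in this argument follows from the Lorentz transformation, and it is easy to check with numbers. This is a minimal sketch with c = 1 by default; the function name and the sample separations are illustrative.

```python
import math

def boosted_time_difference(dt, dx, v, c=1.0):
    """Time separation of two events as seen from a frame moving at velocity v along +x.

    dt, dx: time and space separation (second event minus first) in the original frame.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (dt - v * dx / c**2)

# Spacelike separation (dx > c*dt): a fast enough observer sees the order reversed
print(boosted_time_difference(1.0, 5.0, 0.1))   # positive: same order as before
print(boosted_time_difference(1.0, 5.0, 0.5))   # negative: order reversed

# Timelike separation (dx < c*dt): the order is the same for every allowed v
print([boosted_time_difference(5.0, 1.0, v) > 0 for v in (-0.99, 0.0, 0.99)])
```

For the spacelike pair the sign flips once v exceeds c²·dt/dx, which is below c only when the separation is spacelike. That is why signals at or below the speed of light never have their order reversed.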
You don't have to accept that gravity is a "force" in order to believe that gravitons might exist. According to QM, anything that behaves like a harmonic oscillator has discrete energy levels, as I said in part 1. General relativity allows gravitational waves, ripples in the geometry of spacetime which travel at the speed of light. Under a certain definition of gravitational energy (a tricky subject), the wave can be said to carry energy. If QM is ever successfully applied to GR, it seems sensible to expect that these oscillations will also possess discrete "gravitational energies," corresponding to different numbers of gravitons.
Quantum gravity is not yet a complete, established theory, so gravitons are still speculative. It is also unlikely that individual gravitons will be detected any time in the near future.
Furthermore, it is not at all clear that it will be useful to think of gravitational "forces," such as the one that sticks you to the earth's surface, as mediated by virtual gravitons. The notion of virtual particles mediating static forces comes from perturbation theory, and if there is one thing we know about quantum gravity, it's that the usual way of doing perturbation theory doesn't work.
Quantum field theory is plagued with infinities, which show up in diagrams in which virtual particles go in closed loops. Normally these infinities can be gotten rid of by "renormalization," in which infinite "counterterms" cancel the infinite parts of the diagrams, leaving finite results for experimentally observable quantities. Renormalization works for QED and the other field theories used to describe particle interactions, but it fails when applied to gravity. Graviton loops generate an infinite family of counterterms. The theory ends up with an infinite number of free parameters, and it's no theory at all. Other approaches to quantum gravity are needed, and they might not describe static fields with virtual gravitons.