arXiv:quant-ph/9708022v2 24 Sep 1997

Quantum computing

Andrew Steane

Department of Atomic and Laser Physics, University of Oxford

Clarendon Laboratory, Parks Road, Oxford, OX1 3PU, England.

July 1997


Abstract

The subject of quantum computing brings together ideas from classical information theory, computer science, and quantum physics. This review aims to summarise not just quantum computing, but the whole subject of quantum information theory. Information can be identified as the most general thing which must propagate from a cause to an effect. It therefore has a fundamentally important role in the science of physics. However, the mathematical treatment of information, especially information processing, is quite recent, dating from the mid-twentieth century. This has meant that the full significance of information as a basic concept in physics is only now being discovered. This is especially true in quantum mechanics. The theory of quantum information and computing puts this significance on a firm footing, and has led to some profound and exciting new insights into the natural world. Among these are the use of quantum states to permit the secure transmission of classical information (quantum cryptography), the use of quantum entanglement to permit reliable transmission of quantum states (teleportation), the possibility of preserving quantum coherence in the presence of irreversible noise processes (quantum error correction), and the use of controlled quantum evolution for efficient computation (quantum computation). The common theme of all these insights is the use of quantum entanglement as a computational resource.

It turns out that information theory and quantum mechanics fit together very well. In order to explain their relationship, this review begins with an introduction to classical information theory and computer science, including Shannon’s theorem, error correcting codes, Turing machines and computational complexity. The principles of quantum mechanics are then outlined, and the EPR experiment described. The EPR-Bell correlations, and quantum entanglement in general, form the essential new ingredient which distinguishes quantum from classical information theory, and, arguably, quantum from classical physics.

Basic quantum information ideas are next outlined, including qubits and data compression, quantum gates, the ‘no cloning’ property, and teleportation. Quantum cryptography is briefly sketched. The universal quantum computer is described, based on the Church-Turing Principle and a network model of computation. Algorithms for such a computer are discussed, especially those for finding the period of a function, and searching a random list. Such algorithms prove that a quantum computer of sufficiently precise construction is not only fundamentally different from any computer which can only manipulate classical information, but can compute a small class of functions with greater efficiency. This implies that some important computational tasks are impossible for any device apart from a quantum computer.

To build a universal quantum computer is well beyond the abilities of current technology. However, the principles of quantum information physics can be tested on smaller devices. The current experimental situation is reviewed, with emphasis on the linear ion trap, high-Q optical cavities, and nuclear magnetic resonance methods. These allow coherent control in a Hilbert space of eight dimensions (3 qubits), and should be extendable up to a thousand or more dimensions (10 qubits). Among other things, these systems will allow the feasibility of quantum computing to be assessed. In fact such experiments are so difficult that it seemed likely until recently that a practically useful quantum computer (requiring, say, 1000 qubits) was actually ruled out by considerations of experimental imprecision and the unavoidable coupling between any system and its environment. However, a further fundamental part of quantum information physics provides a solution to this impasse. This is quantum error correction (QEC).

An introduction to quantum error correction is provided. The evolution of the quantum computer is restricted to a carefully chosen sub-space of its Hilbert space. Errors are almost certain to cause a departure from this sub-space. QEC provides a means to detect and undo such departures without upsetting the quantum computation. This achieves the apparently impossible, since the computation preserves quantum coherence even though during its course all the qubits in the computer will have relaxed spontaneously many times.


The review concludes with an outline of the main features of quantum information physics, and avenues for future research.

PACS 03.65.Bz, 89.70.+c


Contents

1 Introduction

2 Classical information theory
2.1 Measures of information
2.2 Data compression
2.3 The binary symmetric channel
2.4 Error-correcting codes

3 Classical theory of computation
3.1 Universal computer; Turing machine
3.2 Computational complexity
3.3 Uncomputable functions

4 Quantum versus classical physics
4.1 EPR paradox, Bell’s inequality

5 Quantum Information
5.1 Qubits
5.2 Quantum gates
5.3 No cloning
5.4 Dense coding
5.5 Quantum teleportation
5.6 Quantum data compression
5.7 Quantum cryptography

6 The universal quantum computer
6.1 Universal gate
6.2 Church-Turing principle

7 Quantum algorithms
7.1 Simulation of physical systems
7.2 Period finding and Shor’s factorisation algorithm
7.3 Grover’s search algorithm

8 Experimental quantum information processors
8.1 Ion trap
8.2 Nuclear magnetic resonance
8.3 High-Q optical cavities

9 Quantum error correction

10 Discussion


1 Introduction

The science of physics seeks to ask, and find precise answers to, basic questions about why nature is as it is. Historically, the fundamental principles of physics have been concerned with questions such as “what are things made of?” and “why do things move as they do?” In his Principia, Newton gave very wide-ranging answers to some of these questions. By showing that the same mathematical equations could describe the motions of everyday objects and of planets, he showed that an everyday object such as a tea pot is made of essentially the same sort of stuff as a planet: the motions of both can be described in terms of their mass and the forces acting on them. Nowadays we would say that both move in such a way as to conserve energy and momentum. In this way, physics allows us to abstract from nature concepts such as energy or momentum which always obey fixed equations, although the same energy might be expressed in many different ways: for example, an electron in the large electron-positron collider at CERN, Geneva, can have the same kinetic energy as a slug on a lettuce leaf.

Another thing which can be expressed in many different ways is information. For example, the two statements “the quantum computer is very interesting” and “l’ordinateur quantique est très intéressant” have something in common, although they share no words. The thing they have in common is their information content. Essentially the same information could be expressed in many other ways, for example by substituting numbers for letters in a scheme such as a → 97, b → 98, c → 99 and so on, in which case the English version of the above statement becomes 116 104 101 32 113 117 97 110 116 117 109 . . . . It is very significant that information can be expressed in different ways without losing its essential nature, since this leads to the possibility of the automatic manipulation of information: a machine need only be able to manipulate quite simple things like integers in order to do surprisingly powerful information processing, from document preparation to differential calculus, even to translating between human languages. We are familiar with this now, because of the ubiquitous computer, but even fifty years ago such a widespread significance of automated information processing was not foreseen.
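The substitution scheme just described can be tried directly. The following is a minimal sketch, assuming that the scheme a → 97, b → 98 is the usual ASCII/Unicode code-point assignment (which also gives 32 for a space); the function names `encode` and `decode` are illustrative and not part of the review.

```python
def encode(text: str) -> list[int]:
    """Replace each character by its integer code point (a -> 97, b -> 98, ...)."""
    return [ord(ch) for ch in text]


def decode(codes: list[int]) -> str:
    """Invert encode(): turn the integers back into characters."""
    return "".join(chr(c) for c in codes)


codes = encode("the quantum")
print(codes)          # [116, 104, 101, 32, 113, 117, 97, 110, 116, 117, 109]
print(decode(codes))  # the quantum
```

The round trip loses nothing, which is the point of the paragraph: the integers carry the same information as the letters.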

However, there is one thing that all ways of expressing information must have in common: they all use real physical things to do the job. Spoken words are conveyed by air pressure fluctuations, written ones by arrangements of ink molecules on paper, even thoughts depend on neurons (Landauer 1991). The rallying cry of the information physicist is “no information without physical representation!” Conversely, the fact that information is insensitive to exactly how it is expressed, and can be freely translated from one form to another, makes it an obvious candidate for a fundamentally important role in physics, like energy and momentum and other such abstractions. However, until the second half of the twentieth century, the precise mathematical treatment of information, especially information processing, was undiscovered, so the significance of information in physics was only hinted at in concepts such as entropy in thermodynamics. It now appears that information may have a much deeper significance. Historically, much of fundamental physics has been concerned with discovering the fundamental particles of nature and the equations which describe their motions and interactions. It now appears that a different programme may be equally important: to discover the ways in which nature allows, and prevents, the expression and manipulation of information, rather than the motion of particles. For example, the best way to state exactly what can and cannot travel faster than light is to identify information as the speed-limited entity. In quantum mechanics, it is highly significant that the state vector must not contain, whether explicitly or implicitly, more information than can meaningfully be associated with a given system. Among other things this produces the wavefunction symmetry requirements which lead to Bose-Einstein and Fermi-Dirac statistics, the periodic structure of atoms, and so on.

The programme to re-investigate the fundamental principles of physics from the standpoint of information theory is still in its infancy. However, it already appears to be highly fruitful, and it is this ambitious programme that I aim to summarise.

Historically, the concept of information in physics does not have a clear-cut origin. An important thread can be traced if we consider the paradox of Maxwell’s demon of 1871 (fig. 1) (see also Brillouin 1956). Recall that Maxwell’s demon is a creature that opens and closes a trap door between two compartments of a chamber containing gas, and pursues the subversive policy of only opening the door when fast molecules approach it from the right, or slow ones from the left. In this way the demon establishes a temperature difference between the two compartments without doing any work, in violation of the second law of thermodynamics, and consequently permitting a host of contradictions.

A number of attempts were made to exorcise Maxwell’s demon (see Bennett 1987), such as arguments that the demon cannot gather information without doing work, or without disturbing (and thus heating) the gas, both of which are untrue. Some were tempted to propose that the 2nd law of thermodynamics could indeed be violated by the actions of an “intelligent being.” It was not until 1929 that Leo Szilard made progress by reducing the problem to its essential components, in which the demon need merely identify whether a single molecule is to the right or left of a sliding partition, and its action allows a simple heat engine, called Szilard’s engine, to be run. Szilard still had not solved the problem, since his analysis was unclear about whether or not the act of measurement, whereby the demon learns whether the molecule is to the left or the right, must involve an increase in entropy.

A definitive and clear answer was not forthcoming, surprisingly, until a further fifty years had passed. In the intermediate years digital computers were developed, and the physical implications of information gathering and processing were carefully considered. The thermodynamic costs of elementary information manipulations were analysed by Landauer and others during the 1960s (Landauer 1961, Keyes and Landauer 1970), and those of general computations by Bennett, Fredkin, Toffoli and others during the 1970s (Bennett 1973, Toffoli 1980, Fredkin and Toffoli 1982). It was found that almost anything can in principle be done in a reversible manner, i.e. with no entropy cost at all (Bennett and Landauer 1985). Bennett (1982) made explicit the relation between this work and Maxwell’s paradox by proposing that the demon can indeed learn where the molecule is in Szilard’s engine without doing any work or increasing any entropy in the environment, and so obtain useful work during one stroke of the engine. However, the information about the molecule’s location must then be present in the demon’s memory (fig. 1). As more and more strokes are performed, more and more information gathers in the demon’s memory.

To complete a thermodynamic cycle, the demon must erase its memory, and it is during this erasure operation that we identify an increase in entropy in the environment, as required by the 2nd law. This completes the essential physics of Maxwell’s demon; further subtleties are discussed by Zurek (1989), Caves (1990), and Caves, Unruh and Zurek (1990).
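To give a feeling for the size of the erasure cost, the sketch below evaluates the bound usually associated with Landauer's analysis. The review does not quote a formula at this point; the assumption made here is the standard statement that erasing one bit in an environment at temperature T dissipates at least k_B T ln 2 of heat. The function name and the choice of 300 K are illustrative.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def min_erasure_heat(bits: int, temperature_kelvin: float) -> float:
    """Assumed Landauer bound: minimum heat (in joules) dissipated when
    `bits` bits of memory are erased at the given temperature."""
    return bits * BOLTZMANN * temperature_kelvin * math.log(2)


# One bit of the demon's memory erased at room temperature (300 K is an assumption).
print(min_erasure_heat(1, 300.0))  # about 2.9e-21 J
```

The number is tiny on everyday scales, which is one reason the cost of erasure went unnoticed for so long, but it is exactly what is needed to balance the work extracted in one stroke of Szilard's engine.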

The thread we just followed was instructive, but to provide a complete history of ideas relevant to quantum computing is a formidable task. Our subject brings together what are arguably two of the greatest revolutions in twentieth-century science, namely quantum mechanics and information science (including computer science). The relationship between these two giants is illustrated in fig. 2.

Classical information theory is founded on the definition of information. A warning is in order here. Whereas the theory tries to capture much of the normal meaning of the term ‘information’, it can no more do justice to the full richness of that term in everyday language than particle physics can encapsulate the everyday meaning of ‘charm’. ‘Information’ for us will be an abstract term, defined in detail in section 2.1. Much of information theory dates back to the seminal work of Shannon in the 1940s (Slepian 1974). The observation that information can be translated from one form to another is encapsulated and quantified in Shannon’s noiseless coding theorem (1948), which quantifies the resources needed to store or transmit a given body of information. Shannon also considered the fundamentally important problem of communication in the presence of noise, and established Shannon’s main theorem (section 2.4) which is the central result of classical information theory. Error-free communication even in the presence of noise is achieved by means of ‘error-correcting codes’, and their study is a branch of mathematics in its own right. Indeed, the journal IEEE Transactions on Information Theory is almost totally taken up with the discovery and analysis of error-correction by coding. Pioneering work in this area was done by Golay (1949) and Hamming (1950).

The foundations of computer science were formulated at roughly the same time as Shannon’s information theory, and this is no coincidence. The father of computer science is arguably Alan Turing (1912-1954), and its prophet is Charles Babbage (1791-1871). Babbage conceived of most of the essential elements of a modern computer, though in his day there was not the technology available to implement his ideas. A century passed before Babbage’s Analytical Engine was improved upon when Turing described the Universal Turing Machine in the mid 1930s. Turing’s genius (see Hodges 1983) was to clarify exactly what a calculating machine might be capable of, and to emphasise the role of programming, i.e. software, even more than Babbage had done. The giants on whose shoulders Turing stood in order to get a better view were chiefly the mathematicians David Hilbert and Kurt Gödel. Hilbert had emphasised between the 1890s and 1930s the importance of asking fundamental questions about the nature of mathematics. Instead of asking “is this mathematical proposition true?” Hilbert wanted to ask “is it the case that every mathematical proposition can in principle be proved or disproved?” This was unknown, but Hilbert’s feeling, and that of most mathematicians, was that mathematics was indeed complete, so that conjectures such as Goldbach’s (that every even number greater than 2 can be written as the sum of two primes) could be proved or disproved somehow, although the logical steps might be as yet undiscovered.
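Goldbach's conjecture is a convenient illustration of the gap between checking and proving: any single even number is easily tested by machine, yet no finite amount of such testing settles the general statement. The sketch below is an illustration only; the function names and the simple trial-division primality test are choices made for the example.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def goldbach_pair(n: int):
    """Return primes (p, q) with p <= q and p + q == n, or None if none exist."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None


print(goldbach_pair(28))  # (5, 23)

# Every even number from 4 to 1000 passes, which is evidence but not a proof.
print(all(goldbach_pair(n) is not None for n in range(4, 1001, 2)))  # True
```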

Gödel destroyed this hope by establishing the existence of mathematical propositions which were undecidable, meaning that they could be neither proved nor disproved. The next interesting question was whether it would be easy to identify such propositions. Progress in mathematics had always relied on the use of creative imagination, yet with hindsight mathematical proofs appear to be automatic, each step following inevitably from the one before. Hilbert asked whether this ‘inevitable’ quality could be captured by a ‘mechanical’ process. In other words, was there a universal mathematical method, which would establish the truth or otherwise of every mathematical assertion? After Gödel, Hilbert’s problem was re-phrased into that of establishing decidability rather than truth, and this is what Turing sought to address.

In the words of Newman, Turing’s bold innovation was to introduce ‘paper tape’ into symbolic logic. In the search for an automatic process by which mathematical questions could be decided, Turing envisaged a thoroughly mechanical device, in fact a kind of glorified typewriter (fig. 7). The importance of the Turing machine (Turing 1936) arises from the fact that it is sufficiently complicated to address highly sophisticated mathematical questions, but sufficiently simple to be subject to detailed analysis. Turing used his machine as a theoretical construct to show that the assumed existence of a mechanical means to establish decidability leads to a contradiction (see section 3.3). In other words, he was initially concerned with quite abstract mathematics rather than practical computation. However, by seriously establishing the idea of automating abstract mathematical proofs rather than merely arithmetic, Turing greatly stimulated the development of general purpose information processing. This was in the days when a “computer” was a person doing mathematics.

Modern computers are neither Turing machines nor Babbage engines, though they are based on broadly similar principles, and their computational power is equivalent (in a technical sense) to that of a Turing machine. I will not trace their development here, since although this is a wonderful story, it would take too long to do justice to the many people involved. Let us just remark that all of this development represents a great improvement in speed and size, but does not involve any change in the essential idea of what a computer is, or how it operates. Quantum mechanics raises the possibility of such a change, however.

Quantum mechanics is the mathematical structure which embraces, in principle, the whole of physics. We will not be directly concerned with gravity, high velocities, or exotic elementary particles, so the standard non-relativistic quantum mechanics will suffice. The significant feature of quantum theory for our purpose is not the precise details of the equations of motion, but the fact that they treat quantum amplitudes, or state vectors in a Hilbert space, rather than classical variables. It is this that allows new types of information and computing.

There is a parallel between Hilbert’s questions about mathematics and the questions we seek to pose in quantum information theory. Before Hilbert, almost all mathematical work had been concerned with establishing or refuting particular hypotheses, but Hilbert wanted to ask what general type of hypothesis was even amenable to mathematical proof. Similarly, most research in quantum physics has been concerned with studying the evolution of specific physical systems, but we want to ask what general type of evolution is even conceivable under quantum mechanical rules.

The first deep insight into quantum information theory came with Bell’s 1964 analysis of the paradoxical thought-experiment proposed by Einstein, Podolsky and Rosen (EPR) in 1935. Bell’s inequality draws attention to the importance of correlations between separated quantum systems which have interacted (directly or indirectly) in the past, but which no longer influence one another. In essence his argument shows that the degree of correlation which can be present in such systems exceeds that which could be predicted on the basis of any law of physics which describes particles in terms of classical variables rather than quantum states. The EPR thought-experiment had been clarified by Bohm (1951, also Bohm and Aharonov 1957), and Bell’s argument was clarified by Clauser, Holt, Horne and Shimony (1969); experimental tests were carried out in the 1970s (see Clauser and Shimony (1978) and references therein). Improvements in such experiments are largely concerned with preventing the possibility of any interaction between the separated quantum systems, and a significant step forward was made in the experiment of Aspect, Dalibard and Roger (1982) (see also Aspect 1991), since in their work any purported interaction would have either to travel faster than light, or possess other almost equally implausible qualities.
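The gap between classical and quantum correlations can be checked numerically in the form of the inequality due to Clauser, Holt, Horne and Shimony. The sketch below is an illustration and is not taken from the review: the function names and the measurement angles are choices made for the example, and the correlation E(a, b) = -cos(a - b) is the standard quantum prediction, assumed here, for spin measurements along directions a and b on a pair of spins in the singlet state.

```python
import itertools
import math


def chsh(E, a, a2, b, b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)


def singlet_correlation(x, y):
    """Assumed quantum prediction for the spin singlet: E = -cos(x - y)."""
    return -math.cos(x - y)


# Classical variables: each side carries predetermined answers +1 or -1
# for both of its settings. Enumerate all 16 assignments.
classical_max = max(
    abs(A * B - A * B2 + A2 * B + A2 * B2)
    for A, A2, B, B2 in itertools.product((-1, 1), repeat=4)
)

# Quantum prediction at settings chosen to maximise |S|.
quantum = abs(chsh(singlet_correlation, 0.0, math.pi / 2,
                   math.pi / 4, 3 * math.pi / 4))

print(classical_max)  # 2
print(quantum)        # 2.828..., i.e. 2*sqrt(2)
```

Averaging over any probability distribution of the predetermined answers cannot raise |S| above the largest deterministic value, so the classical bound is 2, while the quantum prediction reaches 2√2.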

The next link between quantum mechanics and information theory came about when it was realised that simple properties of quantum systems, such as the unavoidable disturbance involved in measurement, could be put to practical use, in quantum cryptography (Wiesner 1983, Bennett et al. 1982, Bennett and Brassard 1984; for a recent review see Brassard and Crépeau 1996). Quantum cryptography covers several ideas, of which the most firmly established is quantum key distribution. This is an ingenious method in which transmitted quantum states are used to perform a very particular communication task: to establish at two separated locations a pair of identical, but otherwise random, sequences of binary digits, without allowing any third party to learn the sequence. This is very useful because such a random sequence can be used as a cryptographic key to permit secure communication. The significant feature is that the principles of quantum mechanics guarantee a type of conservation of quantum information, so that if the necessary quantum information arrives at the parties wishing to establish a random key, they can be sure it has not gone elsewhere, such as to a spy. Thus the whole problem of compromised keys, which fills the annals of espionage, is avoided by taking advantage of the structure of the natural world.

While quantum cryptography was being analysed and demonstrated, the quantum computer was undergoing a quiet birth. Since quantum mechanics underlies the behaviour of all systems, including those we call classical (“even a screwdriver is quantum mechanical”, Landauer (1995)), it was not obvious how to conceive of a distinctively quantum mechanical computer, i.e. one which did not merely reproduce the action of a classical Turing machine. Obviously it is not sufficient merely to identify a quantum mechanical system whose evolution could be interpreted as a computation; one must prove a much stronger result than this. Conversely, we know that classical computers can simulate, by their computations, the evolution of any quantum system ... with one reservation: no classical process will allow one to prepare separated systems whose correlations break the Bell inequality. It appears from this that the EPR-Bell correlations are the quintessential quantum-mechanical property (Feynman 1982).

In order to think about computation from a quantum-mechanical point of view, the first ideas involved converting the action of a Turing machine into an equivalent reversible process, and then inventing a Hamiltonian which would cause a quantum system to evolve in a way which mimicked a reversible Turing machine. This depended on the work of Bennett (1973; see also Lecerf 1963) who had shown that a universal classical computing machine (such as Turing’s) could be made reversible while retaining its simplicity. Benioff (1980, 1982) and others proposed such Turing-like Hamiltonians in the early 1980s. Although Benioff’s ideas did not allow the full analysis of quantum computation, they showed that unitary quantum evolution is at least as powerful computationally as a classical computer.

A different approach was taken by Feynman (1982, 1986) who considered the possibility not of universal computation, but of universal simulation, i.e. a purpose-built quantum system which could simulate the physical behaviour of any other. Clearly, such a simulator would be a universal computer too, since any computer must be a physical system. Feynman gave arguments which suggested that quantum evolution could be used to compute certain problems more efficiently than any classical computer, but his device was not sufficiently specified to be called a computer, since he assumed that any interaction between adjacent two-state systems could be ‘ordered’, without saying how.

In 1985 an important step forward was taken by Deutsch. Deutsch’s proposal is widely considered to represent the first blueprint for a quantum computer, in that it is sufficiently specific and simple to allow real machines to be contemplated, but sufficiently versatile to be a universal quantum simulator, though both points are debatable. Deutsch’s system is essentially a line of two-state systems, and looks more like a register machine than a Turing machine (both are universal classical computing machines). Deutsch proved that if the two-state systems could be made to evolve by means of a specific small set of simple operations, then any unitary evolution could be produced, and therefore the evolution could be made to simulate that of any physical system. He also discussed how to produce Turing-like behaviour using the same ideas.

Deutsch’s simple operations are now called quantum ‘gates’, since they play a role analogous to that of binary logic gates in classical computers. Various authors have investigated the minimal class of gates which are sufficient for quantum computation.

The two questionable aspects of Deutsch’s proposal are its efficiency and realisability. The question of efficiency is absolutely fundamental in computer science, and on it the concept of ‘universality’ turns. A universal computer is one that not only can reproduce (i.e. simulate) the action of any other, but can do so without running too slowly. The ‘too slowly’ here is defined in terms of the number of computational steps required: this number must not increase exponentially with the size of the input (the precise meaning will be explained in section 3.1). Deutsch’s simulator is not universal in this strict sense, though it was shown to be efficient for simulating a wide class of quantum systems by Lloyd (1996). However, Deutsch’s work has established the concepts of quantum networks (Deutsch 1989) and quantum logic gates, which are extremely important in that they allow us to think clearly about quantum computation.

In the early 1990s several authors (Deutsch and Jozsa 1992, Berthiaume and Brassard 1992, Bernstein and Vazirani 1993) sought computational tasks which could be solved by a quantum computer more efficiently than any classical computer. Such a quantum algorithm would play a conceptual role similar to that of Bell’s inequality, in defining something of the essential nature of quantum mechanics. Initially only very small differences in performance were found, in which quantum mechanics permitted an answer to be found with certainty, as long as the quantum system was noise-free, where a probabilistic classical computer could achieve an answer ‘only’ with high probability. An important advance was made by Simon (1994), who described an efficient quantum algorithm for a (somewhat abstract) problem for which no efficient solution was possible classically, even by probabilistic methods. This inspired Shor (1994) who astonished the community by describing an algorithm which was not only efficient on a quantum computer, but also addressed a central problem in computer science: that of factorising large integers.

Shor discussed both factorisation and discrete logarithms, making use of a quantum Fourier transform method discovered by Coppersmith (1994) and Deutsch. Further important quantum algorithms were discovered by Grover (1997) and Kitaev (1995).

Just as with classical computation and information theory, once theoretical ideas about computation had got under way, an effort was made to establish the essential nature of quantum information, the task analogous to Shannon’s work. The difficulty here can be seen by considering the simplest quantum system, a two-state system such as a spin half in a magnetic field. The quantum state of a spin is a continuous quantity defined by two real numbers, so in principle it can store an infinite amount of classical information. However, a measurement of a spin will only provide a single two-valued answer (spin up/spin down). There is no way to gain access to the infinite information which appears to be there, so it is incorrect to consider the information content in those terms. This is reminiscent of the renormalisation problem in quantum electrodynamics. How much information can a two-state quantum system store, then? The answer, provided by Jozsa and Schumacher (1994) and Schumacher (1995), is one two-state system’s worth! Of course Schumacher and Jozsa did more than propose this simple answer: they showed that the two-state system plays the role in quantum information theory analogous to that of the bit in classical information theory, in that the quantum information content of any quantum system can be meaningfully measured as the minimum number of two-state systems, now called quantum bits or qubits, which would be needed to store or transmit the system’s state with high accuracy.
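The contrast between a continuous state and a two-valued measurement outcome can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not part of the review: the two real numbers are taken to be the usual angles theta and phi of the parametrisation cos(theta/2)|up> + e^{i phi} sin(theta/2)|down>, the measurement is along the field direction, and the function name and the use of a seeded pseudo-random generator are choices made for the example.

```python
import math
import random


def measure_spin(theta: float, phi: float, rng: random.Random) -> str:
    """Simulate one measurement of spin up/down along the field axis.

    Assumed state: cos(theta/2)|up> + exp(i*phi) sin(theta/2)|down>.
    The outcome probability depends on theta only; phi is kept in the
    signature to show that it leaves no trace in this measurement.
    """
    p_up = math.cos(theta / 2) ** 2
    return "up" if rng.random() < p_up else "down"


rng = random.Random(0)
theta, phi = 1.2345678901234567, 0.9876543210987654  # many digits go in ...
print(measure_spin(theta, phi, rng))                  # ... one of two answers comes out

# Repeating the experiment needs a freshly prepared spin each time, and even
# then only the frequency of "up" (an estimate of cos^2(theta/2)) is learned.
shots = [measure_spin(theta, phi, rng) for _ in range(10000)]
print(shots.count("up") / len(shots), math.cos(theta / 2) ** 2)
```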

Let us return to the question of realisability of quantum computation. It is an elementary, but fundamentally important, observation that the quantum interference effects which permit algorithms such as Shor’s are extremely fragile: the quantum computer is ultra-sensitive to experimental noise and imprecision. It is not true that early workers were unaware of this difficulty, rather their first aim was to establish whether a quantum computer had any fundamental significance at all. Armed with Shor’s algorithm, it now appears that such a fundamental significance is established, by the following argument: either nature does allow a device to be run with sufficient precision to perform Shor’s algorithm for large integers (greater than, say, a googol, 10^100), or there are fundamental natural limits to precision in real systems. Both eventualities represent an important insight into the laws of nature.

At this point, ideas of quantum information and quantum computing come together. For, a quantum computer can be made much less sensitive to noise by means of a new idea which comes directly from the marriage of quantum mechanics with classical information theory, namely quantum error correction. Although the phrase ‘error correction’ is a natural one and was used with reference to quantum computers prior to 1996, it was only in that year that two important papers, of Calderbank and Shor, and independently Steane, established a general framework whereby quantum information processing can be used to combat a very wide class of noise processes in a properly designed quantum system. Much progress has since been made in generalising these ideas (Knill and Laflamme 1997, Ekert and Macchiavello 1996, Bennett et al. 1996b, Gottesman 1996, Calderbank et al. 1997). An important development was the demonstration by Shor (1996) and Kitaev (1996) that correction can be achieved even when the corrective operations are themselves imperfect. Such methods lead to a general concept of ‘fault tolerant’ computing, of which a helpful review is provided by Preskill (1997).

If, as seems almost certain, quantum computation will only work in conjunction with quantum error correction, it appears that the relationship between quantum information theory and quantum computers is even more intimate than that between Shannon’s information theory and classical computers. Error correction does not in itself guarantee accurate quantum computation, since it cannot combat all types of noise, but the fact that it is possible at all is a significant development.

A computer which only exists on paper will not actually perform any computations, and in the end the only way to resolve the issue of feasibility in quantum computer science is to build a quantum computer. To this end, a number of authors proposed computer designs based on Deutsch’s idea, but with the physical details more fully worked out (Teich et al. 1988, Lloyd 1993, Berman et al. 1994, DiVincenzo 1995b). The great challenge is to find a sufficiently complex system whose evolution is nevertheless both coherent (i.e. unitary) and controllable. It is not sufficient that only some aspects of a system should be quantum mechanical, as in solid-state ‘quantum dots’, or that there is an implicit assumption of unfeasible precision or cooling, which is often the case for proposals using solid-state devices. Cirac and Zoller (1995) proposed the use of a linear ion trap, which was a significant improvement in feasibility, since heroic efforts in the ion trapping community had already achieved the necessary precision and low temperature in experimental work, especially the group of Wineland who demonstrated cooling to the ground state of an ion trap in the same year (Diedrich et al. 1989, Monroe et al. 1995). More recently, Gershenfeld and Chuang (1997) and Cory et al. (1996, 1997) have shown that nuclear magnetic resonance (NMR) techniques can be adapted to fulfil the requirements of quantum computation, making this approach also very promising. Other recent proposals of Privman et al. (1997) and Loss and DiVincenzo (1997) may also be feasible.

As things stand, no quantum computer has been built, nor looks likely to be built in the author’s lifetime, if we measure it in terms of Shor’s algorithm, and ask for factoring of large numbers. However, if we ask instead for a device in which quantum information ideas can be explored, then only a few quantum bits are required, and this will certainly be achieved in the near future. Simple two-bit operations have been carried out in many physics experiments, notably magnetic resonance, and work with three to ten qubits now seems feasible. Notable recent experiments in this regard are those of Brune et al. (1994), Monroe et al. (1995b), Turchette et al. (1995) and Mattle et al. (1996).


2 Classical information theory

This and the next section will summarise the classical theory of information and computing. This is textbook material (Minsky 1967, Hamming 1986) but is included here since it forms a background to quantum information and computing, and the article is aimed at physicists to whom the ideas may be new.

2.1 Measures of information

The most basic problem in classical information theory is to obtain a measure of information, that is, of amount of information. Suppose I tell you the value of a number X. How much information have you gained? That will depend on what you already knew about X. For example, if you already knew X was equal to 2, you would learn nothing, no information, from my revelation. On the other hand, if previously your only knowledge was that X was given by the throw of a die, then to learn its value is to gain information. We have met here a basic paradoxical property, which is that information is often a measure of ignorance: the information content (or ‘self-information’) of X is defined to be the information you would gain if you learned the value of X.

If X is a random variable which has value x with probability p(x), then the information content of X is defined to be

$$S(\{p(x)\}) = -\sum_x p(x) \log_2 p(x). \tag{1}$$

Note that the logarithm is taken to base 2, and that S is never negative, since probabilities are bounded by p(x) ≤ 1. S is a function of the probability distribution of values of X. It is important to remember this, since in what follows we will adopt the standard practice of using the notation S(X) for S({p(x)}). It is understood that S(X) does not mean a function of X, but rather the information content of the variable X. The quantity S(X) is also referred to as an entropy, for obvious reasons.

If we already know that X = 2, then p(2) = 1 and there are no other terms in the sum, leading to S = 0, so X has no information content. If, on the other hand, X is given by the throw of a die, then p(x) = 1/6 for x ∈ {1, 2, 3, 4, 5, 6}, so $S = -\log_2(1/6) \simeq 2.58$. If X can take N different values, then the information content (or entropy) of X is maximised when the probability distribution p is flat, with every p(x) = 1/N (for example a fair die yields S ≃ 2.58, but a loaded die with p(6) = 1/2, p(1 · · · 5) = 1/10 yields S ≃ 2.16). This is consistent with the requirement that the information (what we would gain if we learned X) is maximum when our prior knowledge of X is minimum.
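The die examples can be checked numerically. The following is a minimal sketch of equation (1); the function name `entropy` and the variable names are illustrative choices, not notation from the text.

```python
from math import log2

def entropy(probs):
    """Information content S({p(x)}) in bits, following equation (1)."""
    # Outcomes with p(x) = 0 are skipped, since p log p -> 0 as p -> 0.
    return sum(-p * log2(p) for p in probs if p > 0)

certain = [1.0]                       # X already known: a single outcome
fair_die = [1 / 6] * 6                # flat distribution over six values
loaded_die = [1 / 10] * 5 + [1 / 2]   # p(1..5) = 1/10, p(6) = 1/2

print(round(entropy(fair_die), 2))    # 2.58
print(round(entropy(loaded_die), 2))  # 2.16
```

The flat distribution gives the larger value, in line with the statement that the entropy is maximised when every p(x) = 1/N.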

Thus the maximum information which could in principle be stored by a variable which can take on N different values is $\log_2(N)$. The logarithms are taken to base 2 rather than some other base by convention. The choice dictates the unit of information: S(X) = 1 when X can take two values with equal probability. A two-valued or binary variable thus can contain one unit of information. This unit is called a bit. The two values of a bit are typically written as the binary digits 0 and 1.

In the case of a binary variable, we can define p to be the probability that X = 1; then the probability that X = 0 is 1 − p and the information can be written as a function of p alone:

$$H(p) = -p \log_2 p - (1-p) \log_2 (1-p) \tag{2}$$

This function is called the entropy function; it satisfies 0 ≤ H(p) ≤ 1.

In what follows, the subscript 2 will be dropped on logarithms: it is assumed that all logarithms are to base 2 unless otherwise indicated.

The probability that Y = y given that X = x is written p(y|x). The conditional entropy S(Y|X) is defined by

$$S(Y|X) = -\sum_x p(x) \sum_y p(y|x) \log p(y|x) \tag{3}$$

$$\phantom{S(Y|X)} = -\sum_x \sum_y p(x,y) \log p(y|x) \tag{4}$$

where the second line is deduced using p(x, y) = p(x)p(y|x) (this is the probability that X = x and Y = y). By inspection of the definition, we see that S(Y|X) is a measure of how much information on average would remain in Y if we were to learn X. Note that S(Y|X) ≤ S(Y) always and S(Y|X) ≠ S(X|Y) usually.


The conditional entropy is important mainly as a stepping-stone to the next quantity, the mutual information, defined by

$$I(X:Y) = \sum_x \sum_y p(x,y) \log \frac{p(x,y)}{p(x)p(y)} \tag{5}$$

$$\phantom{I(X:Y)} = S(X) - S(X|Y) \tag{6}$$

From the definition, I(X : Y) is a measure of how much X and Y contain information about each other (see footnote 1). If X and Y are independent then p(x, y) = p(x)p(y) so I(X : Y) = 0. The relationships between the basic measures of information are indicated in fig. 3. The reader may like to prove as an exercise that S(X, Y), the information content of X and Y (the information we would gain if, initially knowing neither, we learned the value of both X and Y) satisfies S(X, Y) = S(X) + S(Y) − I(X : Y).
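The relations between these measures can be checked numerically for any small joint distribution. The sketch below is illustrative only: the table `joint` is a made-up example of p(x, y), and the function names are assumptions rather than notation from the text.

```python
from math import log2

def entropy(probs):
    """S in bits for a list of probabilities, equation (1)."""
    return sum(-p * log2(p) for p in probs if p > 0)

def marginals(joint):
    """Return (p(x), p(y)) from a table joint[x][y] = p(x, y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return px, py

def conditional_entropy(joint):
    """S(X|Y) = -sum_{x,y} p(x,y) log p(x|y), with p(x|y) = p(x,y)/p(y)."""
    _, py = marginals(joint)
    return sum(-pxy * log2(pxy / py[y])
               for row in joint for y, pxy in enumerate(row) if pxy > 0)

def mutual_information(joint):
    """I(X:Y) computed directly from equation (5)."""
    px, py = marginals(joint)
    return sum(pxy * log2(pxy / (px[x] * py[y]))
               for x, row in enumerate(joint)
               for y, pxy in enumerate(row) if pxy > 0)

joint = [[0.4, 0.1], [0.2, 0.3]]            # hypothetical p(x, y)
px, py = marginals(joint)
s_xy = entropy([pxy for row in joint for pxy in row])   # S(X, Y)

print(mutual_information(joint), entropy(px) - conditional_entropy(joint))
print(s_xy, entropy(px) + entropy(py) - mutual_information(joint))
```

Each printed pair should agree to rounding error, illustrating equation (6) and the exercise S(X, Y) = S(X) + S(Y) − I(X : Y) for this particular table; this is a numerical check, not a proof.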

Information can disappear, but it cannot spring spontaneously from nowhere. This important fact finds mathematical expression in the data processing inequality:

$$\text{if } X \to Y \to Z \text{ then } I(X:Z) \le I(X:Y). \tag{7}$$

The symbol X → Y → Z means that X, Y and Z form a process (a Markov chain) in which Z depends on Y but not directly on X: p(x, y, z) = p(x)p(y|x)p(z|y). The content of the data processing inequality is that the ‘data processor’ Y can pass on to Z no more information about X than it received.
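A small numerical illustration of the inequality (7): take X to be a fair bit, let Y be X passed through a channel that flips the bit with probability 0.1, and let Z be Y passed through a second channel that flips with probability 0.2. These particular numbers and the helper names are assumptions chosen for the sketch.

```python
from math import log2

def mutual_information(joint):
    """I(X:Y) from a table joint[x][y] = p(x, y), equation (5)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(pxy * log2(pxy / (px[x] * py[y]))
               for x, row in enumerate(joint)
               for y, pxy in enumerate(row) if pxy > 0)

def flip_channel(p):
    """Conditional probabilities channel[input][output] for flip probability p."""
    return [[1 - p, p], [p, 1 - p]]

def compose(first, second):
    """p(z|x) = sum_y p(y|x) p(z|y), valid because Z depends on X only via Y."""
    return [[sum(first[x][y] * second[y][z] for y in range(2))
             for z in range(2)] for x in range(2)]

def joint_from(px, channel):
    """p(x, out) = p(x) p(out|x)."""
    return [[px[x] * channel[x][out] for out in range(2)] for x in range(2)]

px = [0.5, 0.5]
x_to_y = flip_channel(0.1)
x_to_z = compose(x_to_y, flip_channel(0.2))

i_xy = mutual_information(joint_from(px, x_to_y))
i_xz = mutual_information(joint_from(px, x_to_z))
print(i_xy, i_xz)   # the second value should not exceed the first
```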

2.2 Data compression

Having pulled the definition of information content, equation (1), out of a hat, our aim is now to prove that this is a good measure of information. It is not obvious at first sight even how to think about such a task. One of the main contributions of classical information theory is to provide useful ways to think about information. We will describe a simple situation in order to illustrate the methods. Let us suppose one person, traditionally called Alice, knows the value of X, and she wishes to communicate it to Bob. We restrict ourselves to the simple case that X has only two possible values: either ‘yes’ or ‘no’. We say that Alice is a ‘source’ with an ‘alphabet’ of two symbols. Alice communicates by sending binary digits (noughts and ones) to Bob. We will measure the information content of X by counting how many bits Alice must send, on average, to allow Bob to learn X. Obviously, she could just send 0 for ‘no’ and 1 for ‘yes’, giving a ‘bit rate’ of one bit per X value communicated. However, what if X were an essentially random variable, except that it is more likely to be ‘no’ than ‘yes’? (Think of the output of decisions from a grant funding body, for example.) In this case, Alice can communicate more efficiently by adopting the following procedure.

Footnote 1: Many authors write I(X; Y) rather than I(X : Y). I prefer the latter since the symmetry of the colon reflects the fact that I(X : Y) = I(Y : X).

Let p be the probability that X = 1 and 1 − p be the probability that X = 0. Alice waits until n values of X are available to be sent, where n will be large. The mean number of ones in such a sequence of n values is np, and it is likely that the number of ones in any given sequence is close to this mean. Suppose np is an integer; then the probability of obtaining any given sequence containing np ones is

$$p^{np} (1-p)^{n-np} = 2^{-nH(p)}. \tag{8}$$

The reader should satisfy him or herself that the two sides of this equation are indeed equal: the right hand side hints at how the argument can be generalised. Such a sequence is called a typical sequence. To be specific, we define the set of typical sequences to be all sequences such that

$$2^{-n(H(p)+\epsilon)} \le p(\text{sequence}) \le 2^{-n(H(p)-\epsilon)} \tag{9}$$

Now, it can be shown that the probability that Alice’s n values actually form a typical sequence is greater than 1 − ε, for sufficiently large n, no matter how small ε is. This implies that Alice need not communicate n bits to Bob in order for him to learn n decisions. She need only tell Bob which typical sequence she has. They must agree together beforehand how the typical sequences are to be labelled: for example, they may agree to number them in order of increasing binary value. Alice just sends the label, not the sequence itself. To deduce how well this works, it can be shown that the typical sequences all have equal probability, and there are $2^{nH(p)}$ of them. To communicate one of $2^{nH(p)}$ possibilities, clearly Alice must send nH(p) bits. Also, Alice cannot do better than this (i.e. send fewer bits) since the typical sequences are equiprobable: there is nothing to be gained by further manipulating the information. Therefore, the information content of each value of X in the original sequence must be H(p), which proves (1).
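The quantities in this argument are easy to evaluate for a concrete case. The sketch below uses n = 1000 and p = 1/4, values chosen only for illustration, and a hypothetical window of ±0.05n around the mean; it checks equation (8), compares the number of sequences with exactly np ones to $2^{nH(p)}$, and estimates how likely the number of ones is to lie near np.

```python
from math import comb, log2

def H(p):
    """Entropy function of equation (2)."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

n, p, window = 1000, 0.25, 0.05   # illustrative values, not from the text
ones = round(n * p)               # np is an integer here (250)

# log2 of the left-hand side of equation (8): p^{np} (1-p)^{n-np}
log2_prob = ones * log2(p) + (n - ones) * log2(1 - p)

# log2 of the number of sequences containing exactly np ones
log2_count = log2(comb(n, ones))

# probability that the number of ones m satisfies |m - np| < n * window
prob_near_mean = sum(comb(n, m) * p**m * (1 - p)**(n - m)
                     for m in range(n + 1) if abs(m - n * p) < n * window)

print(log2_prob, -n * H(p))       # equal up to rounding
print(log2_count, n * H(p))       # close, differing by a term of order log n
print(prob_near_mean)             # close to 1
```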

The mathematical details skipped over in the above argument all stem from the law of large numbers, which states that, given arbitrarily small ε, δ

$$P(|m - np| < n\epsilon) > 1 - \delta \tag{10}$$

for sufficiently large n, where m is the number of ones obtained in a sequence of n values. For large enough n, the number of ones m will differ from the mean np by an amount arbitrarily small compared to n. For example, in our case the noughts and ones will be distributed according to the binomial distribution

$$P(n,m) = C(n,m)\, p^m (1-p)^{n-m} \tag{11}$$

$$\phantom{P(n,m)} \simeq \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(m-np)^2/2\sigma^2} \tag{12}$$

where the Gaussian form is obtained in the limit n, np → ∞, with the standard deviation $\sigma = \sqrt{np(1-p)}$, and $C(n,m) = n!/(m!\,(n-m)!)$.

The above argument has already yielded a significant practical result associated with (1). This is that to communicate n values of X, we need only send nS(X) ≤ n bits down a communication channel. This idea is referred to as data compression, and is also called Shannon’s noiseless coding theorem.

The typical sequences idea has given a means to calculate information content, but it is not the best way to compress information in practice, because Alice must wait for a large number of decisions to accumulate before she communicates anything to Bob. A better method is for Alice to accumulate a few decisions, say 4, and communicate this as a single ‘message’ as best she can. Huffman derived an optimal method whereby Alice sends short strings to communicate the most likely messages, and longer ones to communicate the least likely messages; see table 1 for an example. The translation process is referred to as ‘encoding’ and ‘decoding’ (fig. 4); this terminology does not imply any wish to keep information secret.

For the case p = 1/4 Shannon’s noiseless coding theorem tells us that the best possible data compression technique would communicate each message of four X values by sending on average 4H(1/4) ≃ 3.245 bits. The Huffman code in table 1 gives on average 3.273 bits per message. This is quite close to the minimum, showing that practical methods like Huffman’s are powerful.
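The two figures quoted here can be reproduced with a short program. The sketch below builds a binary Huffman code for the 16 possible four-value messages with p = 1/4 and computes the average codeword length. It is a generic Huffman construction, so the individual codewords need not coincide with those of table 1, although the average length of any Huffman code for this source is the same.

```python
import heapq
from itertools import product
from math import log2

def huffman_lengths(probs):
    """Codeword length of each symbol in a binary Huffman code."""
    # Heap entries: (probability, tie-breaker, symbols under this node).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # the two least likely nodes
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1               # each merged symbol gains one bit
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

p = 1 / 4                                  # probability that X = 1
messages = list(product([0, 1], repeat=4))
probs = [p ** sum(m) * (1 - p) ** (4 - sum(m)) for m in messages]

lengths = huffman_lengths(probs)
average = sum(q * l for q, l in zip(probs, lengths))
shannon = 4 * (-p * log2(p) - (1 - p) * log2(1 - p))

print(round(shannon, 3))   # 3.245
print(round(average, 3))   # 3.273
```

The most likely message, 0000, receives the shortest codeword, and the average exceeds the Shannon limit by less than one percent.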

Data compression is a concept of great practical importance. It is used in telecommunications, for example to compress the information required to convey television pictures, and data storage in computers. From the point of view of an engineer designing a communication channel, data compression can appear miraculous. Suppose we have set up a telephone link to a mountainous area, but the communication rate is not high enough to send, say, the pixels of a live video image. The old-style engineering option would be to replace the telephone link with a faster one, but information theory suggests instead the possibility of using the same link, but adding data processing at either end (data compression and decompression). It comes as a great surprise that the usefulness of a cable can thus be improved by tinkering with the information instead of the cable.

2.3 The binary symmetric channel

So far we have considered the case of communication down a perfect, i.e. noise-free channel. We have gained two main results of practical value: a measure of the best possible data compression (Shannon’s noiseless coding theorem), and a practical method to compress data (Huffman coding). We now turn to the important question of communication in the presence of noise. As in the last section, we will analyse the simplest case in order to illustrate principles which are in fact more general.

Suppose we have a binary channel, i.e. one which allows Alice to send noughts and ones to Bob. The noise-free channel conveys 0 → 0 and 1 → 1, but a noisy channel might sometimes cause 0 to become 1 and vice versa. There is an infinite variety of different types of noise. For example, the erroneous ‘bit flip’ 0 → 1 might be just as likely as 1 → 0, or the channel might have a tendency to ‘relax’ towards 0, in which case 1 → 0 happens but 0 → 1 does not. Also, such errors might occur independently from bit to bit, or occur in bursts.

A very important type of noise is one which affects different bits independently, and causes both 0 → 1 and 1 → 0 errors. This is important because it captures the essential features of many processes encountered in realistic situations. If the two errors 0 → 1 and 1 → 0 are equally likely, then the noisy channel is called a ‘binary symmetric channel’. The binary symmetric channel has a single parameter, p, which is the error probability per bit sent. Suppose the message sent into the channel by Alice is X, and the noisy message which Bob receives is Y. Bob is then faced with the task of deducing X as best he can from Y. If X consists of a single bit, then Bob will make use of the conditional probabilities

p(x = 0|y = 0) = p(x = 1|y = 1) = 1 − p

p(x = 0|y = 1) = p(x = 1|y = 0) = p

giving S(X|Y) = H(p) using equations (3) and (2). Therefore, from the definition (6) of mutual information, we have

I(X : Y ) = S(X) −H(p) (13)

Clearly, the presence of noise in the channel limits the information about Alice’s X contained in Bob’s received Y. Also, because of the data processing inequality, equation (7), Bob cannot increase his information about X by manipulating Y. However, (13) shows that Alice and Bob can communicate better if S(X) is large. The general insight is that the information communicated depends both on the source and the properties of the channel. It would be useful to have a measure of the channel alone, to tell us how well it conveys information. This quantity is called the capacity of the channel and it is defined to be the maximum possible mutual information I(X : Y) between the input and output of the channel, maximised over all possible sources:

$$\text{Channel capacity } C \equiv \max_{\{p(x)\}} I(X:Y) \tag{14}$$

Channel capacity is measured in units of ‘bits out per symbol in’ and for binary channels must lie between zero and one.

It is all very well to have a definition, but (14) does not allow us to compare channels very easily, since we have to perform the maximisation over input strategies, which is non-trivial. To establish the capacity C(p) of the binary symmetric channel is a basic problem in information theory, but fortunately this case is quite simple. From equations (13) and (14) one may see that the answer is

C(p) = 1 −H(p), (15)

obtained when S(X) = 1 (i.e. P (x = 0) = P (x = 1) =1/2).
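The maximisation in (14) can be carried out by brute force for the binary symmetric channel. In the sketch below, q denotes the probability that Alice sends a 1 (a symbol introduced here, not in the text), and the mutual information is evaluated in the equivalent form I(X : Y) = S(Y) − S(Y|X), which follows from the symmetry I(X : Y) = I(Y : X); for this channel S(Y|X) = H(p) for every source.

```python
from math import log2

def H(p):
    """Entropy function of equation (2), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_mutual_information(q, p):
    """I(X:Y) for source P(x=1) = q and bit-flip probability p."""
    r = q * (1 - p) + (1 - q) * p   # P(y = 1)
    return H(r) - H(p)              # S(Y) - S(Y|X)

def capacity_by_search(p, steps=1000):
    """Maximise I(X:Y) over a grid of sources; returns (capacity, best q)."""
    return max((bsc_mutual_information(k / steps, p), k / steps)
               for k in range(steps + 1))

best_i, best_q = capacity_by_search(0.1)
print(best_q)                 # 0.5, the equiprobable source
print(best_i, 1 - H(0.1))     # both about 0.531
```

The grid search is only a numerical check of equation (15) for one value of p, here the arbitrarily chosen p = 0.1.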

2.4 Error-correcting codes

So far we have investigated how much information gets through a noisy channel, and how much is lost. Alice cannot convey to Bob more information than C(p) per symbol communicated. However, suppose Bob is busy defusing a bomb and Alice is shouting from a distance which wire to cut: she will not say “the blue wire” just once, and hope that Bob heard correctly. She will repeat the message many times, and Bob will wait until he is sure to have got it right. Thus error-free communication can be achieved even over a noisy channel. In this example one obtains the benefit of reduced error rate at the sacrifice of reduced information rate. The next stage of our information theoretic programme is to identify more powerful techniques to circumvent noise (Hamming 1986, Hill 1986, Jones 1979, MacWilliams and Sloane 1977).

We will need the following concepts. The set {0, 1} is considered as a group (a Galois field GF(2)) where the operations +, −, ×, ÷ are carried out modulo 2 (thus, 1 + 1 = 0). An n-bit binary word is a vector of n components, for example 011 is the vector (0, 1, 1). A set of such vectors forms a vector space under addition, since for example 011 + 101 means (0, 1, 1) + (1, 0, 1) = (0 + 1, 1 + 0, 1 + 1) = (1, 1, 0) = 110 by the standard rules of vector addition. This is equivalent to the exclusive-or operation carried out bitwise between the two binary words.

The effect of noise on a word u can be expressed u → u′ = u + e, where the error vector e indicates which bits in u were flipped by the noise. For example, u = 1001101 → u′ = 1101110 can be expressed u′ = u + 0100011. An error correcting code C is a set of words such that

$$u + e \ne v + f \quad \forall u, v \in C\ (u \ne v),\ \forall e, f \in E \tag{16}$$

where E is the set of errors correctable by C, which includes the case of no error, e = 0. To use such a code, Alice and Bob agree on which codeword u corresponds to which message, and Alice only ever sends codewords down the channel. Since the channel is noisy, Bob receives not u but u + e. However, Bob can deduce u unambiguously from u + e since by condition (16), no other codeword v sent by Alice could have caused Bob to receive u + e.

An example error-correcting code is shown in the right-hand column of table 1. This is a [7, 4, 3] Hamming code, named after its discoverer. The notation [n, k, d] means that the codewords are n bits long, there are $2^k$ of them, and they all differ from each other in at least d places. Because of the latter feature, the condition (16) is satisfied for any error which affects at most one bit. In other words the set E of correctable errors is {0000000, 1000000, 0100000, 0010000, 0001000, 0000100, 0000010, 0000001}. Note that E can have at most $2^{n-k}$ members. The ratio k/n is called the rate of the code, since each block of n transmitted bits conveys k bits of information, thus k/n bits per bit.
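These properties can be verified by enumeration. The sketch below generates a [7, 4, 3] Hamming code from one standard choice of generator rows `G`; this choice is an assumption made for illustration and need not reproduce the exact codewords listed in table 1.

```python
from itertools import product

# One standard generator for a [7,4,3] Hamming code (assumed, see lead-in).
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]

def add(u, v):
    """Vector addition over GF(2), i.e. bitwise exclusive-or."""
    return tuple(a ^ b for a, b in zip(u, v))

def encode(message):
    """Sum of the generator rows selected by the 4 message bits."""
    word = (0,) * 7
    for bit, row in zip(message, G):
        if bit:
            word = add(word, row)
    return word

code = [encode(m) for m in product([0, 1], repeat=4)]

# Minimum number of places in which two distinct codewords differ.
distance = min(sum(add(u, v)) for u in code for v in code if u != v)

# Correctable errors: no error, or a flip of any single bit.
errors = [(0,) * 7] + [tuple(int(i == j) for i in range(7)) for j in range(7)]
received = {add(u, e) for u in code for e in errors}

print(len(code), distance)   # 16 3
print(len(received))         # 128: all u + e are distinct, as in (16)
```

Since 16 codewords times 8 errors gives 128 distinct received words, condition (16) holds, and E has exactly $2^{n-k} = 8$ members.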

The parameter d is called the ‘minimum distance’ of the code, and is important when encoding for noise which affects successive bits independently, as in the binary symmetric channel. For, a code of minimum distance d can correct all errors affecting fewer than d/2 bits of the transmitted codeword, and for independent noise this is the most likely set of errors. In fact, the probability that an n-bit word receives m errors is given by the binomial distribution (11), so if the code can correct more than the mean number of errors np, the correction is highly likely to succeed.

The central result of classical information theory is that powerful error correcting codes exist:

Shannon’s theorem: If the rate k/n < C(p) and n is sufficiently large, there exists a binary code allowing transmission with an arbitrarily small error probability.

The error probability here is the probability that an uncorrectable error occurs, causing Bob to misinterpret the received word. Shannon’s theorem is highly surprising, since it implies that it is not necessary to engineer very low-noise communication channels, an expensive and difficult task. Instead, we can compensate for noise by error correction coding and decoding, that is, by information processing! The meaning of Shannon’s theorem is illustrated by fig. 5.

The main problem of coding theory is to identify codes with large rate k/n and large distance d. These two conditions are mutually incompatible, so a compromise is needed. The problem is notoriously difficult and has no general solution. To make connection with quantum error correction, we will need to mention one important concept, that of the parity check matrix. An error correcting code is called linear if it is closed under addition, i.e. u + v ∈ C ∀u, v ∈ C. Such a code is completely specified by its parity check matrix H, which is a set of (n − k) linearly independent n-bit words satisfying H · u = 0 ∀u ∈ C. The important property is encapsulated by the following equation:

H · (u+ e) = (H · u) + (H · e) = H · e. (17)

This states that if Bob evaluates H · u′ for his noisy received word u′ = u + e, he will obtain the same answer H · e, no matter what word u Alice sent him! If this evaluation were done automatically, Bob could learn H · e, called the error syndrome, without learning u. If Bob can deduce the error e from H · e, which one can show is possible for all correctable errors, then he can correct the message (by subtracting e from it) without ever learning what it was! In quantum error correction, this is the origin of the reason one can correct a quantum state without disturbing it.
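Syndrome decoding can be sketched for a [7, 4, 3] Hamming code as follows. The parity check matrix `H` below is one standard choice, assumed for illustration (it is not taken from table 1), and `u`, `v` are two codewords of the corresponding code.

```python
# Assumed parity check matrix: three linearly independent 7-bit words.
H = [
    (1, 1, 0, 1, 1, 0, 0),
    (1, 0, 1, 1, 0, 1, 0),
    (0, 1, 1, 1, 0, 0, 1),
]

def syndrome(word):
    """Evaluate H . word over GF(2)."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def flip(word, position):
    """Return word with the bit at `position` flipped (word + e)."""
    return tuple(b ^ (i == position) for i, b in enumerate(word))

# Each single-bit error has its own non-zero syndrome H . e.
SYNDROME_TO_POSITION = {syndrome(flip((0,) * 7, j)): j for j in range(7)}

def correct(received):
    """Deduce e from the syndrome and remove it, without decoding u."""
    s = syndrome(received)
    if s == (0, 0, 0):
        return received
    return flip(received, SYNDROME_TO_POSITION[s])

u = (1, 0, 0, 0, 1, 1, 0)   # two codewords: H . u = H . v = 0
v = (0, 0, 0, 1, 1, 1, 1)

for j in range(7):
    # Equation (17): the syndrome depends on the error, not the codeword.
    assert syndrome(flip(u, j)) == syndrome(flip(v, j))
    assert correct(flip(u, j)) == u

print(syndrome(flip(u, 2)), syndrome(flip(v, 2)))   # identical syndromes
```

The lookup table plays the role of deducing e from H · e; note that `correct` never needs to know which codeword was sent.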

3 Classical theory of computation

We now turn to the theory of computation. This is mostly concerned with the questions “what is computable?” and “what resources are necessary?”

The fundamental resources required for computing are a means to store and to manipulate symbols. The important questions are such things as how complicated must the symbols be, how many will we need, how complicated must the manipulations be, and how many of them will we need?

The general insight is that computation is deemed hard or inefficient if the amount of resources required rises exponentially with a measure of the size of the problem to be addressed. The size of the problem is given by the amount of information required to specify the problem. Applying this idea at the most basic level, we find that a computer must be able to manipulate binary symbols, not just unary symbols (see footnote 2), otherwise the number of memory locations needed would grow exponentially with the amount of information to be manipulated. On the other hand, it is not necessary to work in decimal notation (10 symbols) or any other notation with an ‘alphabet’ of more than two symbols. This greatly simplifies computer design and analysis.

To manipulate n binary symbols, it is not necessary to manipulate them all at once, since it can be shown that any transformation can be brought about by manipulating the binary symbols one at a time or in pairs. A binary ‘logic gate’ takes two bits x, y as inputs, and calculates a function f(x, y). Since f can be 0 or 1, and there are four possible inputs, there are 16 possible functions f. This set of 16 different logic gates is called a ‘universal set’, since by combining such gates in series, any transformation of n bits can be carried out. Furthermore, the action of some of the 16 gates can be reproduced by combining others, so we do not need all 16, and in fact only one, the NAND gate, is necessary (NAND is NOT AND, for which the output is 0 if and only if both inputs are 1).
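The claim that the NAND gate alone suffices can be checked by enumeration. In the sketch below each two-input function f(x, y) is represented by its truth table, a tuple of four output bits; starting from the two input wires and repeatedly feeding pairs of known functions into a NAND gate should generate all 16 functions. The helper names are illustrative.

```python
from itertools import product

INPUTS = list(product([0, 1], repeat=2))   # (x, y) = 00, 01, 10, 11

def nand(a, b):
    """Output is 0 if and only if both inputs are 1."""
    return 1 - (a & b)

# A few familiar gates built from NAND alone.
def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

# Truth tables of the bare input wires x and y.
x = tuple(a for a, _ in INPUTS)
y = tuple(b for _, b in INPUTS)

known = {x, y}
while True:
    # Apply a NAND gate to every pair of functions found so far.
    new = {tuple(nand(f[i], g[i]) for i in range(4))
           for f in known for g in known} - known
    if not new:
        break
    known |= new

print(len(known))   # 16: every two-input function is reachable
```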

By concatenating logic gates, we can manipulate n-bit symbols (see fig. 6). This general approach is called the network model of computation, and is useful for our purposes because it suggests the model of quantum computation which is currently most feasible experimentally. In this model, the essential components of a computer are a set of bits, many copies of the universal logic gate, and connecting wires.

3.1 Universal computer; Turing machine

The word ‘universal’ has a further significance in relation to computers. Turing showed that it is possible to construct a universal computer, which can simulate the action of any other, in the following sense. Let us write T(x) for the output of a Turing machine T (fig. 7) acting on input tape x. Now, a Turing machine can be completely specified by writing down how it responds to 0 and 1 on the input tape, for every possible internal configuration of the machine (of which there are a finite number). This specification can itself be written as a binary number d[T]. Turing showed that there exists a machine U, called a universal Turing machine, with the properties

Footnote 2: Unary notation has a single symbol, 1. The positive integers are written 1, 11, 111, 1111, . . .

U(d[T ], x) = T (x) (18)

and the number of steps taken by U to simulate each step of T is only a polynomial (not exponential) function of the length of d[T]. In other words, if we provide U with an input tape containing both a description of T and the input x, then U will compute the same function as T would have done, for any machine T, without an exponential slow-down.
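The idea behind equation (18) can be conveyed by a toy simulator: a single fixed program that accepts the description of a machine together with its input. This is only a loose analogy, under stated assumptions: the description is a Python dictionary rather than a binary number d[T], the host is Python rather than a Turing machine, and the names `run` and `FLIP` are hypothetical.

```python
def run(description, tape, state="start", blank="_", max_steps=10_000):
    """Simulate the machine given by `description` on `tape`.

    description maps (state, symbol) -> (symbol to write, move, next state),
    where move is "L" or "R" and the state "halt" stops the machine.
    """
    cells = dict(enumerate(tape))   # tape cells indexed by position
    head = 0
    steps = 0
    while state != "halt":
        if steps == max_steps:
            raise RuntimeError("step limit reached; machine may not halt")
        symbol = cells.get(head, blank)
        write, move, state = description[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Example machine T: invert every bit, halting at the first blank cell.
FLIP = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run(FLIP, "0110"))   # 1001
```

The same `run` function simulates any machine whose table is supplied, and each simulated step costs one dictionary lookup, so there is no exponential slow-down in this toy setting.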

To complete the argument, it can be shown that other models of computation, such as the network model, are computationally equivalent to the Turing model: they permit the same functions to be computed, with the same computational efficiency (see next section). Thus the concept of the universal machine establishes that a certain finite degree of complexity of construction is sufficient to allow very general information processing. This is the fundamental result of computer science. Indeed, the power of the Turing machine and its cousins is so great that Church (1936) and Turing (1936) framed the “Church-Turing thesis,” to the effect that

Every function ‘which would naturally be regarded as computable’ can be computed by the universal Turing machine.

This thesis is unproven, but has survived many attempts to find a counterexample, making it a very powerful result. To it we owe the versatility of the modern general-purpose computer, since ‘computable functions’ include tasks such as word processing, process control, and so on. The quantum computer, to be described in section 6, will throw new light on this central thesis.


3.2 Computational complexity

Once we have established the idea of a universal computer, computational tasks can be classified in terms of their difficulty in the following manner. A given algorithm is deemed to address not just one instance of a problem, such as “find the square of 237,” but one class of problem, such as “given x, find its square.” The amount of information given to the computer in order to specify the problem is L = log x, i.e. the number of bits needed to store the value of x. The computational complexity of the problem is determined by the number of steps s a Turing machine must make in order to complete an algorithm which solves the problem. In the network model, the complexity is determined by the number of logic gates required. If an algorithm exists with s given by any polynomial function of L (e.g. $s \propto L^3 + L$) then the problem is deemed tractable and is placed in the complexity class “P”. If s rises exponentially with L (e.g. $s \propto 2^L = x$) then the problem is hard and is in another complexity class. It is often easier to verify a solution, that is, to test whether or not it is correct, than to find one. The class “NP” is the set of problems for which solutions can be verified in polynomial time. Obviously P ⊆ NP, and one would guess that there are problems in NP which are not in P (i.e. NP ≠ P), though surprisingly the latter has never been proved, since it is very hard to rule out the possible existence of as yet undiscovered algorithms. However, the important point is that the membership of these classes does not depend on the model of computation, i.e. the physical realisation of the computer, since the Turing machine can simulate any other computer with only a polynomial, rather than exponential, slow-down.

An important example of an intractable problem is that of factorisation: given a composite (i.e. non-prime) number x, the task is to find one of its factors. If x is even, or a multiple of any small number, then it is easy to find a factor. The interesting case is when the prime factors of x are all themselves large. In this case there is no known simple method. The best known method, the number field sieve (Menezes et al. 1997), requires a number of computational steps of order $s \sim \exp(2 L^{1/3} (\log L)^{2/3})$ where L = ln x. By devoting a substantial machine network to this task, one can today factor a number of 130 decimal digits (Crandall 1997), i.e. L ≃ 300, giving $s \sim 10^{18}$. This is time-consuming but possible (for example 42 days at $10^{12}$ operations per second). However, if we double L, s increases to $\sim 10^{25}$, so now the problem is intractable: it would take a million years with current technology, or would require computers running a million times faster than current ones. The lesson is an important one: a computationally ‘hard’ problem is one which in practice is not merely difficult but impossible to solve.
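The orders of magnitude quoted above follow directly from the scaling formula. The short sketch below evaluates it for L = 300 and L = 600, taking the logarithm in the exponent to be natural (consistent with L = ln x) and assuming the quoted rate of $10^{12}$ operations per second; the function name `nfs_steps` is an illustrative choice.

```python
from math import exp, log, log10

def nfs_steps(L):
    """Order-of-magnitude step count s ~ exp(2 L^(1/3) (log L)^(2/3))."""
    return exp(2 * L ** (1 / 3) * log(L) ** (2 / 3))

RATE = 1e12                    # operations per second, as quoted in the text
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

for L in (300, 600):
    s = nfs_steps(L)
    print(L, round(log10(s), 1), s / RATE / SECONDS_PER_DAY, "days")

print(nfs_steps(600) / RATE / SECONDS_PER_YEAR, "years")
```

For L = 300 this gives roughly 42 days, and for L = 600 several hundred thousand years, i.e. of order a million years.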

The factorisation problem has acquired great practical importance because it is at the heart of widely used cryptographic systems such as that of Rivest, Shamir and Adleman (1979) (see Hellman 1979). For, given a message M (in the form of a long binary number), it is easy to calculate an encrypted version $E = M^s \bmod c$ where s and c are well-chosen large integers which can be made public. To decrypt the message, the receiver calculates $E^t \bmod c$, which is equal to M for a value of t which can be quickly deduced from s and the factors of c (Schroeder 1984). In practice c = pq is chosen to be the product of two large primes p, q known only to the user who published c, so only that user can read the messages, unless someone manages to factorise c. It is a very useful feature that no secret keys need be distributed in such a system: the ‘key’ c, s allowing encryption is public knowledge.
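A toy numerical example may make the roles of s, t and c concrete. The primes below are far too small to be secure and are chosen only for illustration; the rule used to obtain t (the inverse of s modulo (p − 1)(q − 1)) is the usual textbook one and is an assumption here, since the text does not spell it out. The three-argument `pow` with exponent −1 requires Python 3.8 or later.

```python
# Toy parameters (insecure, illustrative only).
p, q = 61, 53
c = p * q                      # public modulus, c = 3233
s = 17                         # public exponent, coprime to (p-1)(q-1)

# Private exponent: s * t = 1 modulo (p-1)(q-1).
# Computing it requires knowing the factors p and q of c.
t = pow(s, -1, (p - 1) * (q - 1))

M = 65                         # message, a number smaller than c
E = pow(M, s, c)               # encryption: E = M^s mod c
decrypted = pow(E, t, c)       # decryption: E^t mod c

print(t)                       # 2753
print(decrypted == M)          # True
```

Anyone may encrypt using the public pair (c, s), but finding t without p and q amounts to factorising c.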

3.3 Uncomputable functions

There is an even stronger way in which a task may be impossible for a computer. In the quest to solve some problem, we could ‘live with’ a slow algorithm, but what if one does not exist at all? Such problems are termed uncomputable. The most important example is the “halting problem”, a rather beautiful result. A feature of computers familiar to programmers is that they may sometimes be thrown into a never-ending loop. Consider, for example, the instruction “while x > 2, divide x by 1” for x initially greater than 2. We can see that this algorithm will never halt, without actually running it. More interesting from a mathematical point of view is an algorithm such as “while x is equal to the sum of two primes, add 2 to x, otherwise print x and halt”, beginning at x = 8. The algorithm is certainly feasible since all pairs of primes less than x can be found and added systematically. Will such an algorithm ever halt? If so, then a counterexample to the Goldbach conjecture exists. Using such techniques, a vast section of mathematical and physical theory could be reduced to the question “would such and such an algorithm halt if we were to run it?” If we could find a general way to establish whether or not algorithms will halt, we would have an extremely powerful mathematical tool. In a certain sense, it would solve all of mathematics!
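The Goldbach loop described above can be written out directly. The sketch below is illustrative only: the function names are hypothetical, and a `limit` argument is added (an assumption, not part of the algorithm in the text) so that the search is guaranteed to stop. Whether the unbounded loop would ever halt is exactly the open question.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def is_sum_of_two_primes(x):
    """Check systematically all pairs p + (x - p) with p <= x/2."""
    return any(is_prime(p) and is_prime(x - p) for p in range(2, x // 2 + 1))

def goldbach_search(start=8, limit=10_000):
    """Run 'while x is a sum of two primes, add 2 to x' up to an artificial limit.

    Returns the first x that is not a sum of two primes, or None if the
    limit is reached first."""
    x = start
    while x < limit:
        if not is_sum_of_two_primes(x):
            return x
        x += 2
    return None
```

Removing the limit turns this into the algorithm of the text: it halts if and only if the Goldbach conjecture is false.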

Let us suppose that it is possible to find a general algorithm which will work out whether any Turing machine will halt on any input. Such an algorithm solves the problem “given x and d[T], would Turing machine T halt if it were fed x as input?”. Here d[T] is the description of T. If such an algorithm exists, then it is possible to make a Turing machine $T_H$ which halts if and only if T(d[T]) does not halt. Here $T_H$ takes as input d[T], which is sufficient to tell $T_H$ about both the Turing machine T and the input to T. Hence we have

$$T_H(d[T]) \text{ halts} \;\leftrightarrow\; T(d[T]) \text{ does not halt} \qquad (19)$$

So far everything is ok. However, what if we feed $T_H$ the description of itself, $d[T_H]$? Then

$$T_H(d[T_H]) \text{ halts} \;\leftrightarrow\; T_H(d[T_H]) \text{ does not halt} \qquad (20)$$

which is a contradiction. By this argument Turing showed that there is no automatic means to establish whether Turing machines will halt in general: the “halting problem” is uncomputable. This implies that mathematics, and information processing in general, is a rich body of different ideas which cannot all be summarised in one grand algorithm. This liberating observation is closely related to Gödel’s theorem.
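The structure of this argument can be mimicked in a few lines. This is only a loose illustration under stated assumptions: a Python function stands in for a Turing machine and serves as its own description, `claimed_halts` is any hypothetical candidate for the general halting test, and a `LoopsForever` exception stands in for genuine non-termination so that the demonstration itself finishes.

```python
class LoopsForever(Exception):
    """Stand-in for non-termination, so that this demonstration always finishes."""

def make_t_h(claimed_halts):
    """Build the machine T_H of equation (19) from a candidate halting test."""
    def t_h(description):
        # Do the opposite of what the test predicts for T(d[T]).
        if claimed_halts(description, description):
            raise LoopsForever()       # "does not halt"
        return "halted"
    return t_h

def refute(claimed_halts):
    """Feed T_H its own description and compare prediction with behaviour."""
    t_h = make_t_h(claimed_halts)
    predicted = claimed_halts(t_h, t_h)
    try:
        t_h(t_h)
        actual = True
    except LoopsForever:
        actual = False
    return predicted, actual
```

Whatever deterministic rule is supplied as `claimed_halts`, the two values returned by `refute` disagree, which is the contradiction of equation (20).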


4 Quantum versus classical physics

In order to think about quantum information theory, let us first state the principles of non-relativistic quantum mechanics, as follows (Shankar 1980).

1. The state of an isolated system Q is represented by a vector |ψ(t)〉 in a Hilbert space.

2. Variables such as position and momentum are termed observables and are represented by Hermitian operators. The position and momentum operators X, P have the following matrix elements in the eigenbasis of X:

$$\langle x|X|x'\rangle = x\,\delta(x - x')$$

$$\langle x|P|x'\rangle = -i\hbar\,\delta'(x - x')$$

3. The state vector obeys the Schrödinger equation

$$i\hbar \frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle \qquad (21)$$

where H is the quantum Hamiltonian operator.

4. Measurement postulate.

The fourth postulate, which has not been made explicit, is a subject of some debate, since quite different interpretive approaches lead to the same predictions, and the concept of ‘measurement’ is fraught with ambiguities in quantum mechanics (Wheeler and Zurek 1983, Bell 1987, Peres 1993). A statement which is valid for most practical purposes is that certain physical interactions are recognisably ‘measurements’, and their effect on the state vector |ψ〉 is to change it to an eigenstate |k〉 of the variable being measured, the value of k being randomly chosen with probability $P \propto |\langle k|\psi\rangle|^2$. The change |ψ〉 → |k〉 can be expressed by the projection operator (|k〉〈k|)/〈k|ψ〉.

Note that according to the above equations, the evolution of an isolated quantum system is always unitary, in other words |ψ(t)〉 = U(t)|ψ(0)〉 where $U(t) = \exp(-i\int H\,dt/\hbar)$ is a unitary operator, $UU^\dagger = I$. This is true, but there is a difficulty that there is no such thing as a truly isolated system (i.e. one which experiences no interactions with any other systems), except possibly the whole universe. Therefore there is always some approximation involved in using the Schrödinger equation to describe real systems.

One way to handle this approximation is to speak of the system Q and its environment T. The evolution of Q is primarily that given by its Schrödinger equation, but the interaction between Q and T has, in part, the character of a measurement of Q. This produces a non-unitary contribution to the evolution of Q (since projections are not unitary), and this ubiquitous phenomenon is called decoherence. I have underlined these elementary ideas because they are central in what follows.

We can now begin to bring together ideas of physics and of information processing. For, it is clear that much of the wonderful behaviour we see around us in Nature could be understood as a form of information processing, and conversely our computers are able to simulate, by their processing, many of the patterns of Nature. The obvious, if somewhat imprecise, questions are

1. “can Nature usefully be regarded as essentially an information processor?”

2. “could a computer simulate the whole of Nature?”

The principles of quantum mechanics suggest that the answer to the first question is yes (see footnote 3). For, the state vector |ψ〉 so central to quantum mechanics is a concept very much like those of information science: it is an abstract entity which contains exactly all the information about the system Q. The word ‘exactly’ here is a reminder that not only is |ψ〉 a complete description of Q, it is also one that does not contain any extraneous information which cannot meaningfully be associated with Q. The importance of this in quantum statistics of Fermi and Bose gases was mentioned in the introduction.

Footnote 3: This does not necessarily imply that such language captures everything that can be said about Nature, merely that this is a useful abstraction at the descriptive level of physics. I do not believe any physical ‘laws’ could be adequate to completely describe human behaviour, for example, since they are sufficiently approximate or non-prescriptive to leave us room for manoeuvre (Polkinghorne 1994).

The second question can be made more precise by converting the Church-Turing thesis into a principle of physics,

Every finitely realizable physical system can be simulated arbitrarily closely by a universal model computing machine operating by finite means.

This statement is based on that of Deutsch (1985). The idea is to propose that a principle like this is not derived from quantum mechanics, but rather underpins it, like other principles such as that of conservation of energy. The qualifications introduced by ‘finitely realizable’ and ‘finite means’ are important in order to state something useful.

The new version of the Church-Turing thesis (now called the ‘Church-Turing Principle’) does not refer to Turing machines. This is important because there are fundamental differences between the very nature of the Turing machine and the principles of quantum mechanics. One is described in terms of operations on classical bits, the other in terms of evolution of quantum states. Hence there is the possibility that the universal Turing machine, and hence all classical computers, might not be able to simulate some of the behaviour to be found in Nature. Conversely, it may be physically possible (i.e. not ruled out by the laws of Nature) to realise a new type of computation essentially different from that of classical computer science. This is the central aim of quantum computing.

4.1 EPR paradox, Bell’s inequality

In 1935 Einstein, Podolsky and Rosen (EPR) drew attention to an important feature of non-relativistic quantum mechanics. Their argument, and Bell’s analysis, can now be recognised as one of the seeds from which quantum information theory has grown. The EPR paradox should be familiar to any physics graduate, and I will not repeat the argument in detail. However, the main points will provide a useful way in to quantum information concepts.

The EPR thought-experiment can be reduced in essence to an experiment involving pairs of two-state quantum systems (Bohm 1951, Bohm and Aharonov 1957). Let us consider a pair of spin-half particles A and B, writing the ($m_z = +1/2$) spin ‘up’ state |↑〉 and the ($m_z = -1/2$) spin ‘down’ state |↓〉. The particles are prepared initially in the singlet state $(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle)/\sqrt{2}$, and they subsequently fly apart, propagating in opposite directions along the y-axis. Alice and Bob are widely separated, and they receive particle A and B respectively. EPR were concerned with whether quantum mechanics provides a complete description of the particles, or whether something was left out, some property of the spin angular momenta $s_A$, $s_B$ which quantum theory failed to describe. Such a property has since become known as a ‘hidden variable’. They argued that something was left out, because this experiment allows one to predict with certainty the result of measuring any component of $s_B$, without causing any disturbance of B. Therefore all the components of $s_B$ have definite values, say EPR, and the quantum theory only provides an incomplete description. To make the certain prediction without disturbing B, one chooses any axis η along which one wishes to know B’s angular momentum, and then measures not B but A, using a Stern-Gerlach apparatus aligned along η. Since the singlet state carries no net angular momentum, one can be sure that the corresponding measurement on B would yield the opposite result to the one obtained for A.

The EPR paper is important because it is carefully argued, and the fallacy is hard to unearth. The fallacy can be exposed in one of two ways: one can say either that Alice’s measurement does influence Bob’s particle, or (which I prefer) that the quantum state vector |φ〉 is not an intrinsic property of a quantum system, but an expression for the information content of a quantum variable. In a singlet state there is mutual information between A and B, so the information content of B changes when we learn something about A. So far there is no difference from the behaviour of classical information, so nothing surprising has occurred.

A more thorough analysis of the EPR experiment yields a big surprise. This was discovered by Bell (1964, 1966). Suppose Alice and Bob measure the spin component of A and B along different axes $\eta_A$ and $\eta_B$ in the x-z plane. Each measurement yields an answer + or −. Quantum theory and experiment agree that the probability for the two measurements to yield the same result is $\sin^2((\phi_A - \phi_B)/2)$, where $\phi_A$ ($\phi_B$) is the angle between $\eta_A$ ($\eta_B$) and the z axis. However, there is no way to assign local properties, that is properties of A and B independently, which lead to this high a correlation, in which the results are certain to be opposite when $\phi_A = \phi_B$, certain to be equal when $\phi_A = \phi_B + 180^\circ$, and also, for example, have a $\sin^2(60^\circ) = 3/4$ chance of being equal when $\phi_A - \phi_B = 120^\circ$. Feynman (1982) gives a particularly clear analysis. At $\phi_A - \phi_B = 120^\circ$ the highest correlation which local hidden variables could produce is 2/3.
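The gap between 3/4 and 2/3 can be checked by brute force. The sketch below makes assumptions that are not spelled out in the text: the measurement axes are restricted to 0°, 120° and 240°, each particle A carries a fixed ± answer for every axis, B carries the opposite answers (so that results are always opposite for equal axes), and the local probability is averaged over all ordered pairs of distinct axes. Function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import radians, sin

def quantum_p_same(phi_a_deg, phi_b_deg):
    """Quantum probability that the two results agree: sin^2((phi_A - phi_B)/2)."""
    return sin(radians(phi_a_deg - phi_b_deg) / 2) ** 2

def local_p_same(answers):
    """Probability of equal results for one local 'instruction set'.

    answers[i] is A's predetermined result (+1 or -1) on axis i;
    B's predetermined result on axis j is -answers[j]."""
    pairs = [(i, j) for i in range(3) for j in range(3) if i != j]
    same = sum(1 for i, j in pairs if answers[i] == -answers[j])
    return Fraction(same, len(pairs))

# Best that any deterministic local assignment can do for axes 120 degrees apart.
best_local = max(local_p_same(a) for a in product((+1, -1), repeat=3))
quantum = quantum_p_same(120, 0)
```

A random mixture of instruction sets cannot beat the best single one, so `best_local` bounds every local hidden-variable model of this restricted kind.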

The Bell-EPR argument allows us to identify a task which is physically possible, but which no classical computer could perform: when repeatedly given inputs $\phi_A$, $\phi_B$ at completely separated locations, respond quickly (i.e. too quickly to allow light-speed communication between the locations) with yes/no responses which are perfectly correlated when $\phi_A = \phi_B + 180^\circ$, anticorrelated when $\phi_A = \phi_B$, and more than ∼ 70% correlated when $\phi_A - \phi_B = 120^\circ$.

Experimental tests of Bell’s argument were carried out in the 1970s and 80s and the quantum theory was verified (Clauser and Shimony 1978, Aspect et al. 1982; for more recent work see Aspect (1991), Kwiat et al. 1995 and references therein). This was a significant new probe into the logical structure of quantum mechanics. The argument can be made even stronger by considering a more complicated system. In particular, for three spins prepared in a state such as $(|\uparrow\rangle|\uparrow\rangle|\uparrow\rangle + |\downarrow\rangle|\downarrow\rangle|\downarrow\rangle)/\sqrt{2}$, Greenberger, Horne and Zeilinger (1989) (GHZ) showed that a single measurement along a horizontal axis for two particles, and along a vertical axis for the third, will yield with certainty a result which is the exact opposite of what a local hidden-variable theory would predict. A wider discussion and references are provided by Greenberger et al. (1990), Mermin (1990).

The Bell-EPR correlations show that quantum mechanics permits at least one simple task which is beyond the capabilities of classical computers, and they hint at a new type of mutual information (Schumacher and Nielsen 1996). In order to pursue these ideas, we will need to construct a complete theory of quantum information.

5 Quantum Information

Just as in the discussion of classical information theory, quantum information ideas are best introduced by stating them, and then showing afterwards how they link together. Quantum communication is treated in a special issue of J. Mod. Opt., volume 41 (1994); reviews and references for quantum cryptography are given by Bennett et al. (1992); Hughes et al. (1995); Phoenix and Townsend (1995); Brassard and Crepeau (1996); Ekert (1997). Spiller (1996) reviews both communication and computing.

5.1 Qubits

The elementary unit of quantum information is the qubit (Schumacher 1995). A single qubit can be envisaged as a two-state system such as a spin-half or a two-level atom (see fig. 12), but when we measure quantum information in qubits we are really doing something more abstract: a quantum system is said to have n qubits if it has a Hilbert space of $2^n$ dimensions, and so has available $2^n$ mutually orthogonal quantum states (recall that n classical bits can represent up to $2^n$ different things). This definition of the qubit will be elaborated in section 5.6.

We will write two orthogonal states of a single qubit as {|0〉, |1〉}. More generally, $2^n$ mutually orthogonal states of n qubits can be written {|i〉}, where i is an n-bit binary number. For example, for three qubits we have {|000〉, |001〉, |010〉, |011〉, |100〉, |101〉, |110〉, |111〉}.

5.2 Quantum gates

Simple unitary operations on qubits are called quantum ‘logic gates’ (Deutsch 1985, 1989). For example, if a qubit evolves as |0〉 → |0〉, |1〉 → exp(iωt)|1〉, then after time t we may say that the operation, or ‘gate’

$$P(\theta) = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\theta} \end{pmatrix} \qquad (22)$$

has been applied to the qubit, where θ = ωt. This can also be written P(θ) = |0〉〈0| + exp(iθ)|1〉〈1|. Here are some other elementary quantum gates:

$$I \equiv |0\rangle\langle 0| + |1\rangle\langle 1| = \text{identity} \qquad (23)$$

$$X \equiv |0\rangle\langle 1| + |1\rangle\langle 0| = \text{not} \qquad (24)$$

$$Z \equiv P(\pi) \qquad (25)$$

$$Y \equiv XZ \qquad (26)$$

$$H \equiv \frac{1}{\sqrt{2}}\left[(|0\rangle + |1\rangle)\langle 0| + (|0\rangle - |1\rangle)\langle 1|\right] \qquad (27)$$

These all act on a single qubit, and can be achieved by the action of some Hamiltonian in Schrödinger’s equation, since they are all unitary operators (see footnote 4). There are an infinite number of single-qubit quantum gates, in contrast to classical information theory, where only two logic gates are possible for a single bit, namely the identity and the logical not operation. The quantum not gate carries |0〉 to |1〉 and vice versa, and so is analogous to a classical not. This gate is also called X since it is the Pauli $\sigma_x$ operator. Note that the set {I, X, Y, Z} is a group under multiplication, provided overall factors of −1 are ignored (for example ZX = −XZ = −Y).
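The gates of equations (22) to (27) are small enough to check numerically. The following is a minimal sketch using plain nested lists for 2×2 matrices; the helper names (`matmul`, `dagger`, `close`, `closed_up_to_sign`) are illustrative assumptions. It verifies unitarity and shows that products of I, X, Y, Z stay within the set only up to an overall sign.

```python
from cmath import exp, pi
from math import sqrt

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate: transpose and complex-conjugate."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Element-wise comparison allowing for rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def negate(a):
    return [[-x for x in row] for row in a]

def P(theta):
    """Phase gate of equation (22)."""
    return [[1, 0], [0, exp(1j * theta)]]

I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = P(pi)
Y = matmul(X, Z)
H = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]

def is_unitary(u):
    return close(matmul(u, dagger(u)), I)

def closed_up_to_sign(gates):
    """True if every pairwise product equals +g or -g for some g in gates."""
    return all(
        any(close(matmul(a, b), g) or close(matmul(a, b), negate(g)) for g in gates)
        for a in gates for b in gates
    )
```

For instance `matmul(H, H)` returns the identity to rounding accuracy, so the Hadamard gate is its own inverse.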

Of all the possible unitary operators acting on a pair of qubits, an interesting subset is those which can be written |0〉〈0| ⊗ I + |1〉〈1| ⊗ U, where I is the single-qubit identity operation, and U is some other single-qubit gate. Such a two-qubit gate is called a “controlled U” gate, since the action I or U on the second qubit is controlled by whether the first qubit is in the state |0〉 or |1〉. For example, the effect of controlled-not (“cnot”) is

$$\begin{aligned} |00\rangle &\to |00\rangle \\ |01\rangle &\to |01\rangle \\ |10\rangle &\to |11\rangle \\ |11\rangle &\to |10\rangle \end{aligned} \qquad (28)$$

Here the second qubit undergoes a not if and only if the first qubit is in the state |1〉. This list of state changes is the analogue of the truth table for a classical binary logic gate. The effect of controlled-not acting on a state |a〉|b〉 can be written a → a, b → a ⊕ b, where ⊕ signifies the exclusive or (xor) operation. For this reason, this gate is also called the xor gate.

Footnote 4: The letter H is adopted for the final gate here because its effect is a Hadamard transformation. This is not to be confused with the Hamiltonian H.

Other logical operations require further qubits. For example, the and operation is achieved by use of the 3-qubit “controlled-controlled-not” gate, in which the third qubit experiences not if and only if both the others are in the state |1〉. This gate is named a Toffoli gate, after Toffoli (1980) who showed that the classical version is universal for classical reversible computation. The effect on a state |a〉|b〉|0〉 is a → a, b → b, 0 → a · b. In other words if the third qubit is prepared in |0〉 then this gate computes the and of the first two qubits. The use of three qubits is necessary in order to permit the whole operation to be unitary, and thus allowed in quantum mechanical evolution.
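On computational basis states the xor and Toffoli gates act as ordinary reversible bit operations, which can be tabulated directly. This sketch treats only basis states (no superpositions), and the function names are illustrative.

```python
from itertools import product

def cnot(a, b):
    """Controlled-not on basis states: a -> a, b -> a XOR b (equation 28)."""
    return a, a ^ b

def toffoli(a, b, c):
    """Controlled-controlled-not: the third bit flips iff a and b are both 1."""
    return a, b, c ^ (a & b)

# Truth table of the xor gate, keyed by input (a, b).
CNOT_TABLE = {bits: cnot(*bits) for bits in product((0, 1), repeat=2)}

# With the third bit prepared in 0, the Toffoli gate writes (a AND b) into it.
AND_TABLE = {(a, b): toffoli(a, b, 0)[2] for a, b in product((0, 1), repeat=2)}

def is_permutation(gate, n):
    """A gate that permutes the 2^n basis states is reversible, hence unitary."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n)}
    return len(outputs) == 2 ** n
```

Both gates are their own inverses, so applying either twice restores the input. An and computed on only two bits would map several inputs to the same output and could not be undone, which is why the third qubit is needed.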

It is an amusing exercise to find the combinations of gates which perform elementary arithmetical operations such as binary addition and multiplication. Many basic constructions are given by Barenco et al. (1995b); further general design considerations are discussed by Vedral et al. (1996) and Beckman et al. (1996).

The action of a sequence of quantum gates can be written in operator notation, for example $X_1 H_2\,\mathrm{xor}_{1,3}|\phi\rangle$ where |φ〉 is some state of three qubits, and the subscripts on the operators indicate to which qubits they apply. However, once more than a few quantum gates are involved, this notation is rather obscure, and can usefully be replaced by a diagram known as a quantum network (see fig. 8). These diagrams will be used hereafter.

5.3 No cloning

No cloning theorem: An unknown quantum state cannot be cloned.

This states that it is impossible to generate copies of a quantum state reliably, unless the state is already known (i.e. unless there exists classical information which specifies it). Proof: to generate a copy of a quantum state |α〉, we must cause a pair of quantum systems to undergo the evolution U(|α〉|0〉) = |α〉|α〉 where U is the unitary evolution operator. If this is to work for any state, then U must not depend on α, and therefore U(|β〉|0〉) = |β〉|β〉 for |β〉 ≠ |α〉. However, if we consider the state $|\gamma\rangle = (|\alpha\rangle + |\beta\rangle)/\sqrt{2}$, we have $U(|\gamma\rangle|0\rangle) = (|\alpha\rangle|\alpha\rangle + |\beta\rangle|\beta\rangle)/\sqrt{2} \neq |\gamma\rangle|\gamma\rangle$ so the cloning operation fails. This argument applies to any purported cloning method (Wootters and Zurek 1982, Dieks 1982).

Note that any given ‘cloning’ operation U can work on some states (|α〉 and |β〉 in the above example), though since U is unitary, and therefore preserves inner products, two different clonable states must be orthogonal, 〈α|β〉 = 0. Unless we already know that the state to be copied is one of these states, we cannot guarantee that the chosen U will correctly clone it. This is in contrast to classical information, where machines like photocopiers can easily copy whatever classical information is sent to them. The controlled-not or xor operation of equation (28) is a copying operation for the states |0〉 and |1〉, but not for states such as $|+\rangle \equiv (|0\rangle + |1\rangle)/\sqrt{2}$ and $|-\rangle \equiv (|0\rangle - |1\rangle)/\sqrt{2}$.
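The behaviour of the xor gate as a partial copier can be verified with four-component state vectors. This is a small sketch under the assumption that amplitudes are ordered |00〉, |01〉, |10〉, |11〉 with the first qubit as control; `copy_fidelity` is a hypothetical helper that compares the gate's output with a perfect clone.

```python
from math import sqrt

def kron(u, v):
    """Tensor product of two single-qubit states, ordered |00>, |01>, |10>, |11>."""
    return [x * y for x in u for y in v]

def xor_gate(state):
    """Controlled-not of equation (28): swaps the |10> and |11> amplitudes."""
    s00, s01, s10, s11 = state
    return [s00, s01, s11, s10]

def overlap(u, v):
    """Squared magnitude of the inner product <u|v>."""
    return abs(sum(complex(x).conjugate() * y for x, y in zip(u, v))) ** 2

ZERO = [1, 0]
ONE = [0, 1]
PLUS = [1 / sqrt(2), 1 / sqrt(2)]

def copy_fidelity(psi):
    """Overlap between xor(|psi>|0>) and the ideal clone |psi>|psi>."""
    return overlap(xor_gate(kron(psi, ZERO)), kron(psi, psi))
```

For |+〉 the gate produces the entangled state (|00〉 + |11〉)/√2 rather than |+〉|+〉, which is why the overlap falls short of one.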

The no-cloning theorem and the EPR paradox together reveal a rather subtle way in which non-relativistic quantum mechanics is a consistent theory. For, if cloning were possible, then EPR correlations could be used to communicate faster than light, which leads to a contradiction (an effect preceding a cause) once the principles of special relativity are taken into account. To see this, observe that by generating many clones, and then measuring them in different bases, Bob could deduce unambiguously whether his member of an EPR pair is in a state of the basis {|0〉, |1〉} or of the basis {|+〉, |−〉}. Alice would communicate instantaneously by forcing the EPR pair into one basis or the other through her choice of measurement axis (Glauber 1986).

5.4 Dense coding

We will discuss the following statement:

Quantum entanglement is an information resource.

Qubits can be used to store and transmit classical information. To transmit a classical bit string 00101, for example, Alice can send 5 qubits prepared in the state |00101〉. The receiver Bob can extract the information by measuring each qubit in the basis {|0〉, |1〉} (i.e. these are the eigenstates of the measured observable). The measurement results yield the classical bit string with no ambiguity. No more than one classical bit can be communicated for each qubit sent.

Suppose now that Alice and Bob are in possession of an entangled pair of qubits, in the state |00〉 + |11〉 (we will usually drop normalisation factors such as $\sqrt{2}$ from now on, to keep the notation uncluttered). Alice and Bob need never have communicated: we imagine a mechanical central facility generating entangled pairs and sending one qubit to each of Alice and Bob, who store them (see fig. 9a). In this situation, Alice can communicate two classical bits by sending Bob only one qubit (namely her half of the entangled pair). This idea due to Wiesner (Bennett and Wiesner 1992) is called “dense coding”, since only one quantum bit travels from Alice to Bob in order to convey two classical bits. Two quantum bits are involved, but Alice only ever sees one of them. The method relies on the following fact: the four mutually orthogonal states |00〉 + |11〉, |00〉 − |11〉, |01〉 + |10〉, |01〉 − |10〉 can be generated from each other by operations on a single qubit. This set of states is called the Bell basis, since they exhibit the strongest possible Bell-EPR correlations (Braunstein et al. 1992). Starting from |00〉 + |11〉, Alice can generate any of the Bell basis states by operating on her qubit with one of the operators {I, X, Y, Z}. Since there are four possibilities, her choice of operation represents two bits of classical information. She then sends her qubit to Bob, who must deduce which Bell basis state the qubits are in. This he does by operating on the pair with the xor gate, and measuring the target bit, thus distinguishing |00〉 ± |11〉 from |01〉 ± |10〉. To find the sign in the superposition, he operates with H on the remaining qubit, and measures it. Hence Bob obtains two classical bits with no ambiguity.
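The dense coding steps can be followed through with explicit amplitudes. The sketch below assumes that Alice holds the first qubit, that the xor gate uses the first qubit as control and the second as target, and that amplitudes are ordered |00〉, |01〉, |10〉, |11〉; the function names are illustrative. Normalisation factors are kept here so that probabilities sum to one.

```python
from math import sqrt

R = 1 / sqrt(2)
I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
Y = [[0, -1], [1, 0]]            # Y = XZ, as in equation (26)
H = [[R, R], [R, -R]]

def on_first(gate, state):
    """Apply a single-qubit gate to the first qubit of a two-qubit state."""
    return [sum(gate[i][k] * state[2 * k + j] for k in range(2))
            for i in range(2) for j in range(2)]

def xor_gate(state):
    """Controlled-not with the first qubit as control (equation 28)."""
    s00, s01, s10, s11 = state
    return [s00, s01, s11, s10]

def alice_encode(gate):
    """Alice applies her chosen operator to her half of (|00> + |11>)/sqrt(2)."""
    return on_first(gate, [R, 0, 0, R])

def bob_decode(state):
    """Bob applies xor, then H on the first qubit, then reads both qubits."""
    state = on_first(H, xor_gate(state))
    probs = [abs(x) ** 2 for x in state]
    index = max(range(4), key=probs.__getitem__)
    if probs[index] < 1 - 1e-9:
        raise ValueError("outcome is not deterministic")
    return divmod(index, 2)      # (first qubit, second qubit)

DECODED = {name: bob_decode(alice_encode(g))
           for name, g in (("I", I), ("X", X), ("Z", Z), ("Y", Y))}
```

Each of Alice's four choices leads Bob to a different, certain pair of measurement results, so two classical bits are recovered from one transmitted qubit.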

Dense coding is difficult to implement, and so has no practical value merely as a standard communication method. However, it can permit secure communication: the qubit sent by Alice will only yield the two classical information bits to someone in possession of the entangled partner qubit. More generally, dense coding is an example of the statement which began this section. It reveals a relationship between classical information, qubits, and the information content of quantum entanglement (Barenco and Ekert 1995). A laboratory demonstration of the main features is described by Mattle et al. (1996); Weinfurter (1994) and Braunstein and Mann (1995) discuss some of the methods employed, based on a source of EPR photon pairs from parametric down-conversion.


5.5 Quantum teleportation

It is possible to transmit qubits without sending qubits!

Suppose Alice wishes to communicate to Bob a single qubit in the state |φ〉. If Alice already knows what state she has, for example |φ〉 = |0〉, she can communicate it to Bob by sending just classical information, e.g. “Dear Bob, I have the state |0〉. Regards, Alice.” However, if |φ〉 is unknown there is no way for Alice to learn it with certainty: any measurement she may perform may change the state, and she cannot clone it and measure the copies. Hence it appears that the only way to transmit |φ〉 to Bob is to send him the physical qubit (i.e. the electron or atom or whatever), or possibly to swap the state into another quantum system and send that. In either case a quantum system is transmitted.

Quantum teleportation (Bennett et al. 1993, Bennett 1995) permits a way around this limitation. As in dense coding, we will use quantum entanglement as an information resource. Suppose Alice and Bob possess an entangled pair in the state |00〉 + |11〉. Alice wishes to transmit to Bob a qubit in an unknown state |φ〉. Without loss of generality, we can write |φ〉 = a|0〉 + b|1〉 where a and b are unknown coefficients. Then the initial state of all three qubits is

a |000〉 + b |100〉 + a |011〉 + b |111〉 (29)

Alice now measures in the Bell basis the first two qubits, i.e. the unknown one and her member of the entangled pair. The network to do this is shown in fig. 9b. After Alice has applied the xor and Hadamard gates, and just before she measures her qubits, the state is

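$$\begin{aligned} &|00\rangle(a|0\rangle + b|1\rangle) + |01\rangle(a|1\rangle + b|0\rangle) \\ &+ |10\rangle(a|0\rangle - b|1\rangle) + |11\rangle(a|1\rangle - b|0\rangle). \end{aligned} \qquad (30)$$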

Alice’s measurements collapse the state onto one of four different possibilities, and yield two classical bits. The two bits are sent to Bob, who uses them to learn which of the operators {I, X, Z, Y} he must apply to his qubit in order to place it in the state a|0〉 + b|1〉 = |φ〉. Thus Bob ends up with the qubit (i.e. the quantum information, not the actual quantum system) which Alice wished to transmit.

Note that the quantum information can only arrive at Bob if it disappears from Alice (no cloning). Also, quantum information is complete information: |φ〉 is the complete description of Alice’s qubit. The use of the word ‘teleportation’ draws attention to these two facts. Teleportation becomes an especially important idea when we come to consider communication in the presence of noise, section 9.
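Equations (29) and (30) and Bob's corrections can be checked with an eight-component state vector. This is a sketch under stated assumptions: qubit 0 is the unknown qubit, qubit 1 is Alice's half of the pair, qubit 2 is Bob's, the amplitude index is 4·q0 + 2·q1 + q2, and the mapping from Alice's two bits to Bob's operator in `CORRECTION` is the one read off from equation (30). Function names are illustrative.

```python
from math import sqrt

R = 1 / sqrt(2)
I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
Y = [[0, -1], [1, 0]]            # Y = XZ

# Operator Bob applies for each pair of classical bits (m0, m1) from Alice.
CORRECTION = {(0, 0): I, (0, 1): X, (1, 0): Z, (1, 1): Y}

def initial_state(a, b):
    """(a|0> + b|1>) times (|00> + |11>)/sqrt(2), i.e. equation (29) normalised."""
    s = [0j] * 8
    s[0b000] = s[0b011] = a * R
    s[0b100] = s[0b111] = b * R
    return s

def xor_01(s):
    """Controlled-not with qubit 0 as control and qubit 1 as target."""
    t = list(s)
    for q2 in (0, 1):
        t[0b100 | q2], t[0b110 | q2] = s[0b110 | q2], s[0b100 | q2]
    return t

def hadamard_0(s):
    """Hadamard gate on qubit 0."""
    t = [0j] * 8
    for rest in range(4):
        t[rest] = R * (s[rest] + s[4 + rest])
        t[4 + rest] = R * (s[rest] - s[4 + rest])
    return t

def teleport(a, b):
    """Return {(m0, m1): (probability, fidelity of Bob's corrected qubit)}."""
    s = hadamard_0(xor_01(initial_state(a, b)))
    results = {}
    for (m0, m1), gate in CORRECTION.items():
        base = 4 * m0 + 2 * m1
        v = [s[base], s[base + 1]]                 # Bob's unnormalised qubit
        prob = abs(v[0]) ** 2 + abs(v[1]) ** 2
        w = [gate[0][0] * v[0] + gate[0][1] * v[1],
             gate[1][0] * v[0] + gate[1][1] * v[1]]
        inner = complex(a).conjugate() * w[0] + complex(b).conjugate() * w[1]
        results[(m0, m1)] = (prob, abs(inner) ** 2 / prob)
    return results
```

For any normalised a and b the four outcomes are equally likely and the corrected qubit matches |φ〉 up to an unobservable overall phase, so Alice's two bits carry no information about a and b by themselves.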

5.6 Quantum data compression

Having introduced the qubit, we now wish to show that it is a useful measure of quantum information content. The proof of this is due to Jozsa and Schumacher (1994) and Schumacher (1995), building on work of Kholevo (1973) and Levitin (1987). To begin the argument, we first need a quantity which expresses how much information you would gain if you were to learn the quantum state of some system Q. A suitable quantity is the Von Neumann entropy

$$S(\rho) = -\mathrm{Tr}\,\rho \log \rho \qquad (31)$$

where Tr is the trace operation, and ρ is the density operator describing an ensemble of states of the quantum system. This is to be compared with the classical Shannon entropy, equation (1). Suppose a classical random variable X has a probability distribution p(x). If a quantum system is prepared in a state |x〉 dictated by the value of X, then the density matrix is $\rho = \sum_x p(x)|x\rangle\langle x|$, where the states |x〉 need not be orthogonal. It can be shown (Kholevo 1973, Levitin 1987) that S(ρ) is an upper limit on the classical mutual information I(X : Y) between X and the result Y of a measurement on the system.

To make connection with qubits, we consider the resources needed to store or transmit the state of a quantum system q of density matrix ρ. The idea is to collect n ≫ 1 such systems, and transfer (‘encode’) the joint state into some smaller system. The smaller system is transmitted down the channel, and at the receiving end the joint state is ‘decoded’ into n systems q′ of the same type as q (see fig. 9c). The final density matrix of each q′ is ρ′, and the whole process is deemed successful if ρ′ is sufficiently close to ρ. The measure of the similarity between two density matrices is the fidelity defined by

$$f(\rho, \rho') = \left(\mathrm{Tr}\sqrt{\rho^{1/2}\rho'\rho^{1/2}}\right)^2 \qquad (32)$$

This can be interpreted as the probability that q′ passes a test which ascertained if it was in the state ρ. When ρ and ρ′ are both pure states, |φ〉〈φ| and |φ′〉〈φ′|, the fidelity is none other than the familiar overlap: $f = |\langle\phi|\phi'\rangle|^2$.

Our aim is to find the smallest transmitted system which permits f = 1 − ε for ε ≪ 1. The argument is analogous to the ‘typical sequences’ idea used in section 2.2. Restricting ourselves for simplicity to two-state systems, the total state of n systems is represented by a vector in a Hilbert space of $2^n$ dimensions. However, if the von Neumann entropy S(ρ) < 1 then it is highly likely (i.e. tends to certainty in the limit of large n) that, in any given realisation, the state vector actually falls in a typical sub-space of Hilbert space. Schumacher and Jozsa showed that the dimension of the typical sub-space is $2^{nS(\rho)}$. Hence only nS(ρ) qubits are required to represent the quantum information faithfully, and the qubit (i.e. the logarithm of the dimensionality of Hilbert space) is a useful measure of quantum information. Furthermore, the encoding and decoding operation is ‘blind’: it does not depend on knowledge of the exact states being transmitted.
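The entropy of equation (31) and the resulting qubit count can be evaluated for a simple ensemble. The sketch below is restricted to two-state systems and assumes the logarithm is taken to base 2, so that S is measured in qubits, consistent with the typical sub-space dimension $2^{nS(\rho)}$. The example ensemble (|0〉 and (|0〉 + |1〉)/√2 with equal probability) and the function names are illustrative assumptions.

```python
from math import ceil, log2, sqrt

def density_matrix(ensemble):
    """rho = sum_x p(x)|x><x| for a list of (probability, (amp0, amp1)) pairs."""
    rho = [[0j, 0j], [0j, 0j]]
    for p, psi in ensemble:
        for i in range(2):
            for j in range(2):
                rho[i][j] += p * psi[i] * complex(psi[j]).conjugate()
    return rho

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = sqrt(max(tr * tr - 4 * det, 0.0))
    return (tr + disc) / 2, (tr - disc) / 2

def von_neumann_entropy(rho):
    """S(rho) = -Tr rho log2 rho, computed from the eigenvalues of rho."""
    return -sum(lam * log2(lam) for lam in eigenvalues_2x2(rho) if lam > 1e-15)

def qubits_needed(n, rho):
    """Approximate size n*S(rho) of the compressed system for n copies."""
    return ceil(n * von_neumann_entropy(rho))

R = 1 / sqrt(2)
NON_ORTHOGONAL = density_matrix([(0.5, (1, 0)), (0.5, (R, R))])
ORTHOGONAL = density_matrix([(0.5, (1, 0)), (0.5, (0, 1))])
```

Because the two states in `NON_ORTHOGONAL` overlap, its entropy is below one, so a long block of such systems can in principle be stored in fewer qubits than there are systems. The orthogonal ensemble gives exactly one qubit per system, the classical result.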

Schumacher and Jozsa’s result is powerful because it is general: no assumptions are made about the exact nature of the quantum states involved. In particular, they need not be orthogonal. If the states to be transmitted were mutually orthogonal, the whole problem would reduce to one of classical information.

The ‘encoding’ and ‘decoding’ required to achieve such quantum data compression and decompression is technologically very demanding. It cannot at present be done at all using photons. However, it is the ultimate compression allowed by the laws of physics. The details of the required quantum networks have been deduced by Cleve and DiVincenzo (1996).

As well as the essential concept of information, other classical ideas such as Huffman coding have their quantum counterparts. Furthermore, Schumacher and Nielsen (1996) derive a quantity which they call ‘coherent information’ which is a measure of mutual information for quantum systems. It includes that part of the mutual information between entangled systems which cannot be accounted for classically. This is a helpful way to understand the Bell-EPR correlations.

5.7 Quantum cryptography

No overview of quantum information is complete without a mention of quantum cryptography. This area stems from an unpublished paper of Wiesner written around 1970 (Wiesner 1983). It includes various ideas whereby the properties of quantum systems are used to achieve useful cryptographic tasks, such as secure (i.e. secret) communication. The subject may be divided into quantum key distribution, and a collection of other ideas broadly related to bit commitment. Quantum key distribution will be outlined below. Bit commitment refers to the scenario in which Alice must make some decision, such as a vote, in such a way that Bob can be sure that Alice fixed her vote before a given time, but where Bob can only learn Alice’s vote at some later time which she chooses. A classical, cumbersome method to achieve bit commitment is for Alice to write down her vote and place it in a safe which she gives to Bob. When she wishes Bob, later, to learn the information, she gives him the key to the safe. A typical quantum protocol is a carefully constructed variation on the idea that Alice provides Bob with a prepared qubit, and only later tells him in what basis it was prepared.

The early contributions to the field of quantum cryptography were listed in the introduction; further references may be found in the reviews mentioned at the beginning of this section. Cryptography has the unusual feature that it is not possible to prove by experiment that a cryptographic procedure is secure: who knows whether a spy or cheating person managed to beat the system? Instead, the users’ confidence in the methods must rely on mathematical proofs of security, and it is here that much important work has been done. A concerted effort has enabled proofs to be established for the security of correctly implemented quantum key distribution. However, the bit commitment idea, long thought to be secure through quantum methods, was recently proved to be insecure (Mayers 1997, Lo and Chau 1997) because the participants can cheat by making use of quantum entanglement.

Quantum key distribution is a method in which quantum states are used to establish a random secret key for cryptography. The essential ideas are as follows: Alice and Bob are, as usual, widely separated and wish to communicate. Alice sends to Bob 2n qubits, each prepared in one of the states |0〉, |1〉, |+〉, |−〉, randomly chosen (see footnote 5). Bob measures his received qubits, choosing the measurement basis randomly between {|0〉, |1〉} and {|+〉, |−〉}. Next, Alice and Bob inform each other publicly (i.e. anyone can listen in) of the basis they used to prepare or measure each qubit. They find out on which occasions they by chance used the same basis, which happens on average half the time, and retain just those results. In the absence of errors or interference, they now share the same random string of n classical bits (they agree for example to associate |0〉 and |+〉 with 0; |1〉 and |−〉 with 1). This classical bit string is often called the raw quantum transmission, RQT.

So far nothing has been gained by using qubits. The important feature is, however, that it is impossible for anyone to learn Bob’s measurement results by observing the qubits en route, without leaving evidence of their presence. The crudest way for an eavesdropper Eve to attempt to discover the key would be for her to intercept the qubits and measure them, then pass them on to Bob. On average half the time Eve guesses Alice’s basis correctly and thus does not disturb the qubit. However, Eve’s correct guesses do not coincide with Bob’s, so Eve learns the state of half of the n qubits which Alice and Bob later decide to trust, and disturbs the other half, for example sending to Bob |+〉 for Alice’s |0〉. Half of those disturbed will be projected by Bob’s measurement back onto the original state sent by Alice, so overall Eve corrupts n/4 bits of the RQT.

Alice and Bob can now detect Eve’s presence simply by randomly choosing n/2 bits of the RQT and announcing publicly the values they have. If they agree on all these bits, then they can trust that no eavesdropper was present, since the probability that Eve was present and they happened to choose n/2 uncorrupted bits is $(3/4)^{n/2} \simeq 3 \times 10^{-63}$ for n = 1000. The n/2 undisclosed bits form the secret key.
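The intercept-and-resend argument can be checked with a small Monte Carlo sketch. This is only an illustration under stated assumptions: qubits are modelled classically as a (bit, basis) pair, a measurement in the wrong basis returns a uniformly random bit, and the function names `bb84_intercept_resend` and `prob_eve_undetected` are hypothetical, not part of any protocol specification.

```python
import random


def bb84_intercept_resend(num_qubits, eve_present, rng):
    """Simulate basis sifting; return (sifted_bits, disagreements)."""
    sifted = 0
    disagreements = 0
    for _ in range(num_qubits):
        alice_bit = rng.randrange(2)
        alice_basis = rng.randrange(2)  # 0: {|0>,|1>}, 1: {|+>,|->}
        bit, basis = alice_bit, alice_basis  # the qubit in flight

        if eve_present:
            eve_basis = rng.randrange(2)
            if eve_basis != basis:
                bit = rng.randrange(2)  # wrong basis: Eve's outcome is random
            basis = eve_basis  # Eve resends what she measured, in her basis

        bob_basis = rng.randrange(2)
        bob_bit = bit if bob_basis == basis else rng.randrange(2)

        if bob_basis == alice_basis:  # kept after the public basis comparison
            sifted += 1
            disagreements += int(bob_bit != alice_bit)
    return sifted, disagreements


def prob_eve_undetected(n):
    """Chance that n/2 disclosed bits all agree when a quarter are corrupted."""
    return 0.75 ** (n / 2)
```

Run with Eve present, about half the qubits survive sifting and roughly a quarter of the surviving bits disagree; with Eve absent (and no noise in this model) none disagree.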

5 Many other methods are possible; we adopt this one merely to illustrate the concepts.

In practice the protocol is more complicated since Eve might adopt other strategies (e.g. not intercept all the qubits), and noise will corrupt some of the qubits even in the absence of an eavesdropper. Instead of rejecting the key if many of the disclosed bits differ, Alice and Bob retain it as long as they find the error rate to be well below 25%. They then process the key in two steps. The first is to detect and remove errors, which is done by publicly comparing parity checks on publicly chosen random subsets of the bits, while discarding bits to prevent increasing Eve’s information. The second step is to decrease Eve’s knowledge of the key, by distilling from it a smaller key, composed of parity values calculated from the original key. In this way a key of around n/4 bits is obtained, of which Eve probably knows less than $10^{-6}$ of one bit (Bennett et al. 1992).
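The second step, distilling a shorter key from parities of the original key, can be sketched as follows. This is a toy illustration rather than the procedure of Bennett et al.: the subsets are drawn with Python's `random` module (an assumption; real protocols use carefully chosen hash families), and `parity` and `distil_key` are hypothetical names.

```python
import random


def parity(bits, subset):
    """XOR of the bits at the given positions."""
    value = 0
    for index in subset:
        value ^= bits[index]
    return value


def distil_key(bits, out_len, rng):
    """Return out_len parity bits of random subsets, plus the subsets used."""
    subsets = [
        [i for i in range(len(bits)) if rng.randrange(2)]
        for _ in range(out_len)
    ]
    return [parity(bits, s) for s in subsets], subsets
```

Alice would generate the subsets and announce them publicly; Bob then computes the same parities from his copy of the key. An eavesdropper who is missing even one bit of a subset learns nothing about that subset's parity.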

The protocol just described is not the only one possible. Another approach (Ekert 1991) involves the use of EPR pairs, which Alice and Bob measure along one of three different axes. To rule out eavesdropping they check for Bell-EPR correlations in their results.

The great thing about quantum key distribution is that it is feasible with current technology. A pioneering experiment (Bennett and Brassard 1989) demonstrated the principle, and much progress has been made since then. Hughes et al. (1995) and Phoenix and Townsend (1995) summarised the state of affairs two years ago, and recently Zbinden et al. (1997) have reported excellent key distribution through 23 km of standard telecom fibre under Lake Geneva. The qubits are stored in the polarisation states of laser pulses, i.e. coherent states of light, with on average 0.1 photons per pulse. This low light level is necessary so that pulses containing more than one photon are unlikely. Such pulses would provide duplicate qubits, and hence a means for an eavesdropper to go undetected. The system achieves a bit error rate of 1.35%, which is low enough to guarantee privacy in the full protocol. The data transmission rate is rather low: MHz as opposed to the GHz rates common in classical communications, but the system is very reliable.

Such spectacular experimental mastery is in contrast to the subject of the next section.

6 The universal quantum computer

We now have sufficient concepts to understand the jewel at the heart of quantum information theory, namely, the quantum computer (QC). Ekert and Jozsa (1996) and Barenco (1996) give introductory reviews concentrating on the quantum computer and factorisation; a review with emphasis on practicalities is provided by Spiller (1996). Introductory material is also provided by DiVincenzo (1995b) and Shor (1996).

The QC is first and foremost a machine which is a theoretical construct, like a thought-experiment, whose purpose is to allow quantum information processing to be formally analysed. In particular it establishes the Church-Turing Principle introduced in section 4.

Here is a prescription for a quantum computer, based on that of Deutsch (1985, 1989):

A quantum computer is a set of n qubits in which the following operations are experimentally feasible:

1. Each qubit can be prepared in some known state |0〉.

2. Each qubit can be measured in the basis {|0〉, |1〉}.

3. A universal quantum gate (or set of gates) can be applied at will to any fixed-size subset of the qubits.

4. The qubits do not evolve other than via the above transformations.

This prescription is incomplete in certain technical ways to be discussed, but it encompasses the main ideas. The model of computation we have in mind is a network model, in which logic gates are applied sequentially to a set of bits (here, quantum bits). In an electronic classical computer, logic gates are spread out in space on a circuit board, but in the QC we typically imagine the logic gates to be interactions turned on and off in time, with the qubits at fixed positions, as in a quantum network diagram (fig. 8, 12). Other models of quantum computation can be conceived, such as a cellular automaton model (Margolus 1990).

6.1 Universal gate

The universal quantum gate is the quantum equivalent of the classical universal gate, namely a gate which by its repeated use on different combinations of bits can generate the action of any other gate. What is the set of all possible quantum gates, however? To answer this, we appeal to the principles of quantum mechanics (Schrödinger’s equation), and answer that since all quantum evolution is unitary, it is sufficient to be able to generate all unitary transformations of the n qubits in the computer. This might seem a tall order, since we have a continuous and therefore infinite set. However, it turns out that quite simple quantum gates can be universal, as Deutsch showed in 1985.

The simplest way to think about universal gates is to consider the pair of gates V(θ, φ) and controlled-not (or xor), where V(θ, φ) is a general rotation of a single qubit, i.e.

$$V(\theta,\phi) = \begin{pmatrix} \cos(\theta/2) & -i e^{-i\phi}\sin(\theta/2) \\ -i e^{i\phi}\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}. \tag{33}$$

It can be shown that any n × n unitary matrix can be formed by composing 2-qubit xor gates and single-qubit rotations. Therefore, this pair of operations is universal for quantum computation. A purist may argue that V(θ, φ) is an infinite set of gates since the parameters θ and φ are continuous, but it suffices to choose two particular irrational angles for θ and φ, and the resulting single gate can generate all single-qubit rotations by repeated application; however, a practical system need not use such laborious methods. The xor and rotation operations can be combined to make a controlled rotation which is a single universal gate. Such universal quantum gates were discussed by Deutsch et al. (1995), Lloyd (1995), DiVincenzo (1995a) and Barenco (1995).
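As a concrete check on the two building blocks, the sketch below writes out V(θ, φ) from eq. (33) and the controlled-not as plain nested lists and verifies that both are unitary. The helper names `rotation_v`, `CNOT` and `is_unitary` are illustrative assumptions, as is the basis ordering used for the controlled-not.

```python
import cmath
import math


def rotation_v(theta, phi):
    """Single-qubit rotation V(theta, phi) of eq. (33) as a 2x2 nested list."""
    c = math.cos(theta / 2)
    s = math.sin(theta / 2)
    return [
        [c, -1j * cmath.exp(-1j * phi) * s],
        [-1j * cmath.exp(1j * phi) * s, c],
    ]


# Controlled-not (xor) in the basis |00>, |01>, |10>, |11>, first qubit as control.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]


def is_unitary(u, tol=1e-12):
    """Check that u times its conjugate transpose is the identity."""
    dim = len(u)
    for i in range(dim):
        for j in range(dim):
            entry = sum(u[i][k] * complex(u[j][k]).conjugate() for k in range(dim))
            if abs(entry - (1 if i == j else 0)) > tol:
                return False
    return True
```

For example, `rotation_v(math.pi, 0.0)` is the not operation up to a factor of −i, since the diagonal entries vanish and the off-diagonal entries equal −i.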

It is remarkable that 2-qubit gates are sufficient for quantum computation. This is why the quantum gate is a powerful and important concept.

6.2 Church-Turing principle

Having presented the QC, it is necessary to argue for its universality, i.e. that it fulfills the Church-Turing Principle as claimed. The two-step argument is very simple. First, the state of any finite quantum system is simply a vector in Hilbert space, and therefore can be represented to arbitrary precision by a finite number of qubits. Secondly, the evolution of any finite quantum system is a unitary transformation of the state, and therefore can be simulated on the QC, which can generate any unitary transformation with arbitrary precision.

A point of principle is raised by Myers (1997), who points out that there is a difficulty with computational tasks for which the number of steps for completion cannot be predicted. We cannot in general observe the QC to find out if it has halted, in contrast to a classical computer. However, we will only be concerned with tasks where either the number of steps is predictable, or the QC can signal completion by setting a dedicated qubit which is otherwise not involved in the computation (Deutsch 1985). This is a very broad class of problems. Nielsen and Chuang (1997) consider the use of a fixed quantum gate array, showing that there is no array which, operating on qubits representing both data and program, can perform any unitary transformation on the data. However, we consider a machine in which a classical computer controls the quantum gates applied to a quantum register, so any gate array can be ‘ordered’ by a classical program to the classical computer.

The QC is certainly an interesting theoretical tool. However, there hangs over it a large and important question-mark: what about imperfection? The prescription given above is written as if measurements and gates can be applied with arbitrary precision, which is unphysical, as is the fourth requirement (no extraneous evolution). The prescription can be made realistic by attaching to each of the four requirements a statement about the degree of allowable imprecision. This is a subject of on-going research, and we will take it up in section 9. Meanwhile, let us investigate more specifically what a sufficiently well-made quantum computer might do.

7 Quantum algorithms

It is well known that classical computers are able to calculate the behaviour of quantum systems, so we have not yet demonstrated that a quantum computer can do anything which a classical computer cannot. Indeed, since our theories of physics always involve equations which we can write down and manipulate, it seems highly unlikely that quantum mechanics, or any future physical theory, would permit computational problems to be addressed which are not in principle solvable on a large enough classical Turing machine. However, as we saw in section 3.2, those words ‘large enough’, and also ‘fast enough’, are centrally important in computer science. Problems which are computationally ‘hard’ can be impossible in practice. In technical language, while quantum computing does not enlarge the set of computational problems which can be addressed (compared to classical computing), it does introduce the possibility of new complexity classes. Put more simply, tasks for which classical computers are too slow may be solvable with quantum computers.

7.1 Simulation of physical systems

The first and most obvious application of a QC is that of simulating some other quantum system. To simulate a state vector in a $2^n$-dimensional Hilbert space, a classical computer needs to manipulate vectors containing of order $2^n$ complex numbers, whereas a quantum computer requires just n qubits, making it much more efficient in storage space. To simulate evolution, in general both the classical and quantum computers will be inefficient. A classical computer must manipulate matrices containing of order $2^{2n}$ elements, which requires a number of operations (multiplication, addition) exponentially large in n, while a quantum computer must build unitary operations in $2^n$-dimensional Hilbert space, which usually requires an exponentially large number of elementary quantum logic gates. Therefore the quantum computer is not guaranteed to simulate every physical system efficiently. However, it can be shown that it can simulate a large class of quantum systems efficiently, including many for which there is no efficient classical algorithm, such as many-body systems with local interactions (Lloyd 1996, Zalka 1996, Wiesner 1996, Meyer 1996, Lidar and Biam 1996, Abrams and Lloyd 1997, Boghosian and Taylor 1997).
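To give a feel for the storage argument, the following sketch counts the classical resources quoted above. The figure of 16 bytes per complex amplitude (two double-precision floats) is an assumption made for illustration, and the function names are hypothetical.

```python
def amplitudes_needed(n_qubits):
    """Number of complex amplitudes in a state vector of n qubits."""
    return 2 ** n_qubits


def state_vector_bytes(n_qubits, bytes_per_amplitude=16):
    """Classical memory for the state vector (assumed 16 bytes per amplitude)."""
    return amplitudes_needed(n_qubits) * bytes_per_amplitude


def evolution_matrix_elements(n_qubits):
    """Number of elements in a general evolution matrix on n qubits."""
    return 2 ** (2 * n_qubits)
```

Under this assumption `state_vector_bytes(30)` is 16 GiB, and each additional qubit doubles the requirement, whereas the quantum register grows by a single qubit.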

7.2 Period finding and Shor’s factorisation algorithm

So far we have discussed simulation of Nature, which is a rather restricted type of computation. We would like to let the QC loose on more general problems, but it has so far proved hard to find ones on which it performs better than classical computers. However, the fact that there exist such problems at all is a profound insight into physics, and has stimulated much of the recent interest in the field.

Currently one of the most important quantum algorithms is that for finding the period of a function. Suppose a function f(x) is periodic with period r, i.e. f(x) = f(x + r). Suppose further that f(x) can be efficiently computed from x, and all we know initially is that N/2 < r < N for some N. Assuming there is no analytic technique to deduce the period of f(x), the best we can do on a classical computer is to calculate f(x) for of order N/2 values of x, and find out when the function repeats itself (for well-behaved functions only $O(\sqrt{N})$ values may be needed on average). This is inefficient since the number of operations is exponential in the input size log N (the information required to specify N).
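The classical approach just described can be sketched as a direct search. This is a minimal illustration which assumes that f takes distinct values within one period (true for the modular exponential used later in this section); the name `find_period_classically` is hypothetical.

```python
def find_period_classically(f, upper_bound):
    """Return (period, evaluations), scanning x = 1, 2, ... until f repeats.

    Assumes f takes distinct values within one period, so that the first
    x > 0 with f(x) == f(0) is the period.
    """
    target = f(0)
    evaluations = 1
    for r in range(1, upper_bound):
        evaluations += 1
        if f(r) == target:
            return r, evaluations
    return None, evaluations
```

The number of evaluations grows in proportion to r itself, and hence exponentially in the number of digits needed to write r down.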

The task can be solved efficiently on a QC by the elegant method shown in fig. 10, due to Shor (1994), building on Simon (1994). The QC requires 2n qubits, plus a further O(n) for workspace, where $n = \lceil 2\log N \rceil$ (the notation $\lceil x \rceil$ means the smallest integer greater than or equal to x). These are divided into two ‘registers’, each of n qubits. They will be referred to as the x and y registers; both are initially prepared in the state |0〉 (i.e. all n qubits in states |0〉). Next, the operation H is applied to each qubit in the x register, making the total state

$$\frac{1}{\sqrt{w}} \sum_{x=0}^{w-1} |x\rangle\, |0\rangle \tag{34}$$

where $w = 2^n$. This operation is referred to as a Fourier transform in fig. 10, for reasons that will shortly become apparent. The notation |x〉 means a state such as |0011010〉, where 0011010 is the integer x in binary notation. In this context the basis {|0〉, |1〉} is referred to as the ‘computational basis.’ It is convenient (though not of course necessary) to use this basis when describing the computer.

Next, a network of logic gates is applied to both x and y registers, to perform the transformation $U_f |x\rangle |0\rangle = |x\rangle |f(x)\rangle$. Note that this transformation can be unitary because the input state |x〉 |0〉 is in one to one correspondence with the output state |x〉 |f(x)〉, so the process is reversible. Now, applying $U_f$ to the state given in eq. (34), we obtain

$$\frac{1}{\sqrt{w}} \sum_{x=0}^{w-1} |x\rangle\, |f(x)\rangle \tag{35}$$

This state is illustrated in fig. 11a. At this point something rather wonderful has taken place: the value of f(x) has been calculated for $w = 2^n$ values of x, all in one go! This feature is referred to as quantum parallelism and represents a huge parallelism because of the exponential dependence on n (imagine having $2^{100}$, i.e. a million times Avogadro’s number, of classical processors!)

Although the $2^n$ evaluations of f(x) are in some sense ‘present’ in the quantum state in eq. (35), unfortunately we cannot gain direct access to them. For, a measurement (in the computational basis) of the y register, which is the next step in the algorithm, will only reveal one value of f(x) (see footnote 6). Suppose the value obtained is f(x) = u. The y register state collapses onto |u〉, and the total state becomes

$$\frac{1}{\sqrt{M}} \sum_{j=0}^{M-1} |d_u + jr\rangle\, |u\rangle \tag{36}$$

where $d_u + jr$, for $j = 0, 1, 2 \ldots M-1$, are all the values of x for which f(x) = u. In other words the periodicity of f(x) means that the x register remains in a superposition of $M \simeq w/r$ states, at values of x separated by the period r. Note that the offset $d_u$ of the set of x values depends on the value u obtained in the measurement of the y register.

It now remains to extract the periodicity of the state in the x register. This is done by applying a Fourier transform, and then measuring the state. The discrete Fourier transform employed is the following unitary process:

$$U_{FT} |x\rangle = \frac{1}{\sqrt{w}} \sum_{k=0}^{w-1} e^{i 2\pi k x / w}\, |k\rangle \tag{37}$$

Note that eq. (34) is an example of this, operating on the initial state |0〉. The quantum network to apply $U_{FT}$ is based on the fast Fourier transform algorithm (see, e.g., Knuth (1981)). The quantum version was worked out by Coppersmith (1994) and Deutsch (1994) independently; a clear presentation may also be found in Ekert and Jozsa (1996) and Barenco (1996) (see footnote 7). Before applying $U_{FT}$ to eq. (36) we will make the simplifying assumption that r divides w exactly, so M = w/r. The essential ideas are not affected by this restriction; when it is relaxed some added complications must be taken into account (Shor 1994, 1995a; Ekert and Jozsa 1996).

6 It is not strictly necessary to measure the y register, but this simplifies the description.

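The y register no longer concerns us, so we will just consider the x state from eq. (36):

$$U_{FT} \frac{1}{\sqrt{w/r}} \sum_{j=0}^{w/r-1} |d_u + jr\rangle = \frac{1}{\sqrt{r}} \sum_{k} \tilde{f}(k)\, |k\rangle \tag{38}$$

where

$$|\tilde{f}(k)| = \begin{cases} 1 & \text{if } k \text{ is a multiple of } w/r \\ 0 & \text{otherwise} \end{cases} \tag{39}$$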

This state is illustrated in fig. 11b. The final state of the x register is now measured, and we see that the value obtained must be a multiple of w/r. It remains to deduce r from this. We have x = λw/r where λ is unknown. If λ and r have no common factors, then we cancel x/w down to an irreducible fraction and thus obtain λ and r. If λ and r have a common factor, which is unlikely for large r, then the algorithm fails. In this case, the whole algorithm must be repeated from the start. After a number of repetitions no greater than ∼ log r, and usually much less than this, the probability of success can be shown to be arbitrarily close to 1 (Ekert and Jozsa 1996).
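The last two steps, the Fourier transform of the periodic state and the reduction of x/w to lowest terms, can be imitated on a classical computer for small w. The sketch below is a brute-force state-vector calculation, not a quantum network; it assumes r divides w exactly, as in the text, and the function names are hypothetical.

```python
import cmath
import math
from fractions import Fraction


def x_register_after_y_measurement(w, r, offset):
    """Amplitudes of the x register: equal weight on offset, offset + r, ..."""
    m = w // r  # number of terms, assuming r divides w exactly
    amplitudes = [0j] * w
    for j in range(m):
        amplitudes[offset + j * r] = 1 / math.sqrt(m)
    return amplitudes


def discrete_fourier_transform(amplitudes):
    """Apply the unitary transform |x> -> w^(-1/2) sum_k exp(i 2 pi k x / w) |k>."""
    w = len(amplitudes)
    return [
        sum(a * cmath.exp(2j * math.pi * k * x / w) for x, a in enumerate(amplitudes))
        / math.sqrt(w)
        for k in range(w)
    ]


def period_candidate(measured, w):
    """Cancel measured / w to lowest terms lambda / r and return the denominator."""
    return Fraction(measured, w).denominator
```

With w = 64, r = 8 and any offset below 8, the transformed state has probability 1/8 on each multiple of 8 and zero elsewhere. A measured value of 24 gives 24/64 = 3/8 and hence r = 8, whereas a measured value of 16 gives 1/4 and the wrong candidate 4, which is the failure case where λ and r share a factor.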

The quantum period-finding algorithm we have described is efficient as long as $U_f$, the evaluation of f(x), is efficient. The total number of elementary logic gates required is a polynomial rather than exponential function of n. As was emphasised in section 3.2, this makes all the difference between tractable and intractable in practice, for sufficiently large n.

7 An exact quantum Fourier transform would require rotation operations of precision exponential in n, which raises a problem with the efficiency of Shor’s algorithm. However, an approximate version of the Fourier transform is sufficient (Barenco et al. 1996).

To add the icing on the cake, it can be remarked that the important factorisation problem mentioned in section 3.2 can be reduced to one of finding the period of a simple function. This and all the above ingredients were first brought together by Shor (1994), who thus showed that the factorisation problem is tractable on an ideal quantum computer. The function to be evaluated in this case is $f(x) = a^x \bmod N$ where N is the number to be factorised, and a < N is chosen randomly. One can show using elementary number theory (Ekert and Jozsa 1996) that for most choices of a, the period r is even and $a^{r/2} \pm 1$ shares a common factor with N. The common factor (which is of course a factor of N) can then be deduced rapidly using a classical algorithm due to Euclid (circa 300 BC; see, e.g. Hardy and Wright 1965).
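The reduction from factorisation to period finding can be tried out on small numbers. In the sketch below the period is found by brute force, standing in for the quantum step, and Euclid's algorithm is taken from Python's `math.gcd`; the function names are hypothetical.

```python
from math import gcd


def multiplicative_order(a, n):
    """Smallest r > 0 with a**r = 1 (mod n); requires gcd(a, n) == 1."""
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r


def factors_from_period(a, n):
    """Try to split n using the period of f(x) = a**x mod n.

    Returns a pair of non-trivial factors, or None for an unlucky choice of a.
    """
    if gcd(a, n) != 1:
        return gcd(a, n), n // gcd(a, n)  # a already shares a factor with n
    r = multiplicative_order(a, n)
    if r % 2 == 1:
        return None  # odd period: choose another a
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None  # a**(r/2) = -1 (mod n) yields only trivial factors
    return gcd(half - 1, n), gcd(half + 1, n)
```

For N = 15 and a = 7 the period is 4, so $a^{r/2} = 49 \equiv 4 \pmod{15}$, and the greatest common divisors of 3 and 5 with 15 give the factors 3 and 5. The choice a = 14 has period 2 with $a^{r/2} \equiv -1$ and must be rejected.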

To evaluate f(x) efficiently, repeated squaring (modulo N) is used, giving powers $((a^2)^2)^2 \ldots$. Selected such powers of a, corresponding to the binary expansion of x, are then multiplied together. Complete networks for the whole of Shor’s algorithm were described by Miquel et al. (1996), Vedral et al. (1996) and Beckman et al. (1996). They require of order $300(\log N)^3$ logic gates. Therefore, to factorise numbers of order $10^{130}$, i.e. at the limit of current classical methods, would require $\sim 2 \times 10^{10}$ gates per run, or 7 hours if the ‘switching rate’ is one megahertz (see footnote 8). Considering how difficult it is to make a quantum computer, this offers no advantage over classical computation. However, if we double the number of digits to 260 then the problem is intractable classically (see section 3.2), while the ideal quantum computer takes just 8 times longer than before. The existence of such a powerful method is an exciting and profound new insight into quantum theory.
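Both the repeated-squaring method and the gate-count arithmetic are easy to reproduce. The sketch below is classical code, not a quantum network; it assumes the logarithm in $300(\log N)^3$ is taken to base 2, and the function names are hypothetical.

```python
import math


def modexp_by_repeated_squaring(a, x, n):
    """Compute a**x mod n from the powers a, a^2, (a^2)^2, ... modulo n."""
    result = 1
    power = a % n  # holds a^(2^k) mod n at step k
    while x > 0:
        if x & 1:  # this binary digit of the exponent x is set
            result = (result * power) % n
        power = (power * power) % n
        x >>= 1
    return result


def shor_gate_estimate(decimal_digits, prefactor=300):
    """Estimate prefactor * (log2 N)^3 for an N with the given number of digits."""
    log2_n = decimal_digits * math.log2(10)
    return prefactor * log2_n ** 3
```

Under the base-2 assumption, `shor_gate_estimate(130)` is about $2.4 \times 10^{10}$, which at one gate per microsecond is a little under 7 hours, and doubling the number of digits multiplies the estimate by exactly $2^3 = 8$.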

8 The algorithm might need to be run log r ∼ 60 times to ensure at least one successful run, but the average number of runs required will be much less than this.

The period-finding algorithm appears at first sight like a conjuring trick: it is not quite clear how the quantum computer managed to produce the period like a rabbit out of a hat. Examining fig. 11 and equations (34) to (38), I would say that the most important features are contained in eq. (35). They are not only the quantum parallelism already mentioned, but also quantum entanglement, and, finally, quantum interference. Each value of f(x) retains a link with the value of x which produced it, through the entanglement of the x and y registers in eq. (35). The ‘magic’ happens when a measurement of the y register produces the special state |ψ〉 (eq. 36) in the x register, and it is quantum entanglement which permits this (see also Jozsa 1997a). The final Fourier transform can be regarded as an interference between the various superposed states in the x register (compare with the action of a diffraction grating).

Interference effects can be used for computational purposes with classical light fields, or water waves for that matter, so interference is not in itself the essentially quantum feature. Rather, the exponentially large number of interfering states, and the entanglement, are features which do not arise in classical systems.

7.3 Grover’s search algorithm

Despite considerable efforts in the quantum computing community, the number of useful quantum algorithms which have been discovered remains small. They consist mainly of variants on the period-finding algorithm presented above, and another quite different task: that of searching an unstructured list. Grover (1997) presented a quantum algorithm for the following problem: given an unstructured list of items $\{x_i\}$, find a particular item $x_j = t$. Think, for example, of looking for a particular telephone number in the telephone directory (for someone whose name you do not know). It is not hard to prove that classical algorithms can do no better than searching through the list, requiring on average N/2 steps, for a list of N items. Grover’s algorithm requires of order $\sqrt{N}$ steps. The task remains computationally hard: it is not transferred to a new complexity class, but it is remarkable that such a seemingly hopeless task can be speeded up at all. The ‘quantum speed-up’ $\sim \sqrt{N}/2$ is greater than that achieved by Shor’s factorisation algorithm ($\sim \exp(2(\ln N)^{1/3})$), and would be important for the huge sets ($N \simeq 10^{16}$) which can arise, for example, in code-breaking problems (Brassard 1997).

An important further point was proved by Bennett et al. (1997), namely that Grover’s algorithm is optimal: no quantum algorithm can do better than $O(\sqrt{N})$.

A brief sketch of Grover’s algorithm is as follows. Each item has a label i, and we must be able to test in a unitary way whether any item is the one we are seeking. In other words there must exist a unitary operator S such that S |i〉 = |i〉 if i ≠ j, and S |j〉 = − |j〉, where j is the label of the special item. For example, the test might establish whether i is the solution of some hard computational problem (see footnote 9). The method begins by placing a single quantum register in a superposition of all computational states, as in the period-finding algorithm (eq. (34)). Define

$$|\Psi(\theta)\rangle \equiv \sin\theta\, |j\rangle + \frac{\cos\theta}{\sqrt{N-1}} \sum_{i \neq j} |i\rangle \tag{40}$$

where j is the label of the element $t = x_j$ to be found. The initially prepared state is an equally-weighted superposition, $|\Psi(\theta_0)\rangle$ where $\sin\theta_0 = 1/\sqrt{N}$. Now apply S, which reverses the sign of the one special element of the superposition, then Fourier transform, change the sign of all components except |0〉, and Fourier transform back again. These operations represent a subtle interference effect which achieves the following transformation:

$$U_G\, |\Psi(\theta)\rangle = |\Psi(\theta + \phi)\rangle \tag{41}$$

where $\sin\phi = 2\sqrt{N-1}/N$. The coefficient of the special element is now slightly larger than that of all the other elements. The method proceeds simply by applying $U_G$ m times, where $m \simeq (\pi/4)\sqrt{N}$. The slow rotation brings θ very close to π/2, so the quantum state becomes almost precisely equal to |j〉. After the m iterations the state is measured and the value j obtained (with error probability O(1/N)). If $U_G$ is applied too many times, the success probability diminishes, so it is important to know m, which was deduced by Boyer et al. (1996). Kristen Fuchs compares the technique to cooking a soufflé. The state is placed in the ‘quantum oven’ and the desired answer rises slowly. You must open the oven at the right time, neither too soon nor too late, to guarantee success. Otherwise the soufflé will fall: the state collapses to the wrong answer.
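The rise and fall of the success probability can be seen in a small state-vector simulation. The sketch below is a classical calculation for modest N. It assumes that the Fourier transform, sign change and inverse transform described above act together as an ‘inversion about the mean’ of the amplitudes, which is the standard equivalent form, and the function names are hypothetical.

```python
import math


def grover_success_probability(n_items, target, iterations):
    """Probability of measuring the target after the given number of iterations."""
    amplitudes = [1 / math.sqrt(n_items)] * n_items  # equal superposition
    for _ in range(iterations):
        amplitudes[target] = -amplitudes[target]  # S: flip the special element
        mean = sum(amplitudes) / n_items
        amplitudes = [2 * mean - a for a in amplitudes]  # inversion about the mean
    return amplitudes[target] ** 2


def optimal_iterations(n_items):
    """Nearest integer to (pi / 4) * sqrt(N)."""
    return int(round(math.pi / 4 * math.sqrt(n_items)))
```

For N = 1024 the recommended number of iterations is 25, after which the target is found with probability above 0.99; after 50 iterations the probability has dropped below 0.01, which is the fallen soufflé.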

9 That is, an “np” problem for which finding a solution is hard, but testing a proposed solution is easy.

The two algorithms I have presented are the easiest to describe, and illustrate many of the methods of quantum computation. However, just what further methods may exist is an open question. Kitaev (1996) has shown how to solve the factorisation and related problems using a technique fundamentally different from Shor’s. His ideas have some similarities to Grover’s. Kitaev’s method is helpfully clarified by Jozsa (1997b) who also brings out the common features of several quantum algorithms based on Fourier transforms. The quantum programmer’s toolbox is thus slowly growing. It seems safe to predict, however, that the class of problems for which quantum computers out-perform classical ones is a special and therefore small class. On the other hand, any problem for which finding solutions is hard, but testing a candidate solution is easy, can as a last resort be solved by an exhaustive search, and here Grover’s algorithm may prove very useful.


8 Experimental quantum information processors

The most elementary quantum logical operations have been demonstrated in many physics experiments during the past 50 years. For example, the not operation (X) is no more than a stimulated transition between two energy levels |0〉 and |1〉. The important xor operation can also be identified as a driven transition in a four-level system. However, if we wish to contemplate a quantum computer it is necessary to find a system which is sufficiently controllable to allow quantum logic gates to be applied at will, and yet is sufficiently complicated to store many qubits of quantum information.

It is very hard to find such systems. One might hope to fabricate quantum devices on solid state microchips; this is the logical progression of the microfabrication techniques which have allowed classical computers to become so powerful. However, quantum computation relies on complicated interference effects and the great problem in realising it is the problem of noise. No quantum system is really isolated, and the coupling to the environment produces decoherence which destroys the quantum computation. In solid state devices the environment is the substrate, and the coupling to this environment is strong, producing typical decoherence times of the order of picoseconds. It is important to realise that it is not enough to have two different states |0〉 and |1〉 which are themselves stable (for example states of different current in a superconductor): we require also that superpositions such as |0〉 + |1〉 preserve their phase, and this is typically where the decoherence timescale is so short.

At present there are two candidate systems which should permit quantum computation on 10 to 40 qubits. These are the proposal of Cirac and Zoller (1995) using a line of singly charged atoms confined and cooled in vacuum in an ion trap, and the proposal of Gershenfeld and Chuang (1997), and simultaneously Cory et al. (1996), using the methods of bulk nuclear magnetic resonance. In both cases the proposals rely on the impressive efforts of a large community of researchers which developed the experimental techniques. Previous proposals for experimental quantum computation (Lloyd 1993, Berman et al. 1994, Barenco et al. 1995a, DiVincenzo 1995b) touched on some of the important methods but were not experimentally feasible. Further recent proposals (Privman et al. 1997, Loss and DiVincenzo 1997) may become feasible in the near future.

8.1 Ion trap

The ion trap method is illustrated in fig. 12, and described in detail by Steane (1997b). A string of ions is confined by a combination of oscillating and static electric fields in a linear ‘Paul trap’ in high vacuum ($10^{-8}$ Pa). A single laser beam is split by beam splitters and acousto-optic modulators into many beam pairs, one pair illuminating each ion. Each ion has two long-lived states, for example different levels of the ground state hyperfine structure (the lifetime of such states against spontaneous decay can exceed thousands of years). Let us refer to these two states as |g〉 and |e〉; they are orthogonal and so together represent one qubit. Each laser beam pair can drive coherent Raman transitions between the internal states of the relevant ion. This allows any single-qubit quantum gate to be applied to any ion, but not two-qubit gates. The latter requires an interaction between ions, and this is provided by their Coulomb repulsion. However, exactly how to use this interaction is far from obvious; it required the important insight of Cirac and Zoller.

Light carries not only energy but also momentum, so whenever a laser beam pair interacts with an ion, it exchanges momentum with the ion. In fact, the mutual repulsion of the ions means that the whole string of ions moves en masse when the motion is quantised (Mössbauer effect). The motion of the ion string is quantised because the ion string is confined in the potential provided by the Paul trap. The quantum states of motion correspond to the different degrees of excitation (‘phonons’) of the normal modes of vibration of the string. In particular we focus on the ground state of the motion |n = 0〉 and the lowest excited state |n = 1〉 of the fundamental mode. To achieve, for example, controlled-Z between ion x and ion y, we start with the motion in the ground state |n = 0〉. A pulse of the laser beams on ion x drives the transition $|n=0\rangle |g\rangle_x \to |n=0\rangle |g\rangle_x$, $|n=0\rangle |e\rangle_x \to |n=1\rangle |g\rangle_x$, so the ion finishes in the ground state, and the motion finishes in the initial state of the ion: this is a ‘swap’ operation. Next a pulse of the laser beams on ion y drives the transition

$$\begin{aligned}
|n=0\rangle |g\rangle_y &\to |n=0\rangle |g\rangle_y \\
|n=0\rangle |e\rangle_y &\to |n=0\rangle |e\rangle_y \\
|n=1\rangle |g\rangle_y &\to |n=1\rangle |g\rangle_y \\
|n=1\rangle |e\rangle_y &\to -|n=1\rangle |e\rangle_y
\end{aligned}$$

Finally, we repeat the initial pulse on ion x. The overall effect of the three pulses is

$$\begin{aligned}
|n=0\rangle |g\rangle_x |g\rangle_y &\to |n=0\rangle |g\rangle_x |g\rangle_y \\
|n=0\rangle |g\rangle_x |e\rangle_y &\to |n=0\rangle |g\rangle_x |e\rangle_y \\
|n=0\rangle |e\rangle_x |g\rangle_y &\to |n=0\rangle |e\rangle_x |g\rangle_y \\
|n=0\rangle |e\rangle_x |e\rangle_y &\to -|n=0\rangle |e\rangle_x |e\rangle_y
\end{aligned}$$

which is exactly a controlled-Z between x and y. Each laser pulse must have a precisely controlled frequency and duration. The controlled-Z gate and the single-qubit gates together provide a universal set, so we can perform arbitrary transformations of the joint state of all the ions!
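The three-pulse sequence can be traced through on the basis states with a few lines of code. The sketch below is a bookkeeping model only. It assumes that the ‘swap’ pulse also drives the reverse transition |n = 1〉 |g〉 → |n = 0〉 |e〉, which the repeated pulse on ion x requires, and it ignores the phase factors of physical laser pulses; the function names are hypothetical.

```python
def swap_pulse(state, ion):
    """Exchange |n=0>|e> and |n=1>|g> for the chosen ion; returns (sign, state)."""
    n, ions = state[0], list(state[1:])
    if n == 0 and ions[ion] == "e":
        n, ions[ion] = 1, "g"
    elif n == 1 and ions[ion] == "g":
        n, ions[ion] = 0, "e"  # assumed reverse transition
    return 1, (n, *ions)


def phase_pulse(state, ion):
    """Flip the sign of |n=1>|e> for the chosen ion; other states unchanged."""
    n, ions = state[0], state[1:]
    sign = -1 if (n == 1 and ions[ion] == "e") else 1
    return sign, state


def three_pulse_sequence(state, x=0, y=1):
    """Apply swap on x, phase on y, swap on x; return (overall sign, state)."""
    total = 1
    for pulse, ion in ((swap_pulse, x), (phase_pulse, y), (swap_pulse, x)):
        sign, state = pulse(state, ion)
        total *= sign
    return total, state
```

A state is written as a tuple such as `(0, "e", "g")`, meaning motion in |n = 0〉, ion x in |e〉 and ion y in |g〉. Every such state with n = 0 is returned unchanged, and only `(0, "e", "e")` acquires the sign −1.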

To complete the prescription for a quantum computer (section 6), we must be able to prepare the initial state and measure the final state. The first is possible through the methods of optical pumping and laser cooling, the second through the ‘quantum jump’ or ‘electron shelving’ measurement technique. All these are powerful techniques developed in the atomic physics community over the past twenty years. However, the combination of all the techniques at once has only been achieved in a single experiment, which demonstrated preparation, quantum gates, and measurement for just a single trapped ion (Monroe et al. 1995b).

The chief experimental difficulty in the ion trap method is to cool the string of ions to the ground state of the trap (a sub-microkelvin temperature), and the chief source of decoherence is the heating of this motion owing to the coupling between the charged ion string and noise voltages in the electrodes (Steane 1997, Wineland et al. 1997). It is unknown just how much the heating can be reduced. A conservative statement is that in the next few years 100 quantum gates could be applied to a few ions without losing coherence. In the longer term one may hope for an order of magnitude increase in both figures. It seems clear that an ion trap processor will never achieve sufficient storage capacity and coherence to permit factorisation of hundred-digit numbers. However, it would be fascinating to try a quantum algorithm on just a few qubits (4 to 10) and thus to observe the principles of quantum information processing at work. We will discuss in section 9 methods which should allow the number of coherent gate operations to be greatly increased.

8.2 Nuclear magnetic resonance

The proposal using nuclear magnetic resonance (NMR) is illustrated in fig. 13. The quantum processor in this case is a molecule containing a ‘backbone’ of about ten atoms, with other atoms such as hydrogen attached so as to use up all the chemical bonds. It is the nuclei which interest us. Each has a magnetic moment associated with the nuclear spin, and the spin states provide the qubits. The molecule is placed in a large magnetic field, and the spin states of the nuclei are manipulated by applying oscillating magnetic fields in pulses of controlled duration.

So far, so good. The problem is that the spin state of the nuclei of a single molecule can be neither prepared nor measured. To circumvent this problem, we use not a single molecule, but a cup of liquid containing some $10^{20}$ molecules! We then measure the average spin state, which can be achieved since the average oscillating magnetic moment of all the nuclei is large enough to produce a detectable magnetic field. Some subtleties enter at this point. Each of the molecules in the liquid has a very slightly different local magnetic field, influenced by other molecules in the vicinity, so each ‘quantum processor’ evolves slightly differently. This problem is circumvented by the spin-echo technique, a standard tool in NMR which allows the effects of free evolution of the spins to be reversed, without reversing the effect of the quantum gates. However, this increases the difficulty of applying long sequences of quantum gates.

The remaining problem is to prepare the initial state. The cup of liquid is in thermal equilibrium to begin with, so the different spin states have occupation probabilities given by the Boltzmann distribution. One makes use of the fact that spin states are close in energy, and so have nearly equal occupations initially. Thus the density matrix ρ of the $O(10^{20})$ nuclear spins is very close to the identity matrix I. It is the small difference ∆ = ρ − I which can be used to store quantum information. Although ∆ is not the density matrix of any quantum system, it nevertheless transforms under well-chosen field pulses in the same way as a density matrix would, and hence can be considered to represent an effective quantum computer. The reader is referred to Gershenfeld and Chuang (1997) for a detailed description, including the further subtlety that an effective pure state must be distilled out of ∆ by means of a pulse sequence which performs quantum data compression.
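The statement that ∆ transforms like a density matrix can be checked in one line. The following is only a sketch, written in the loose normalisation used above (ρ close to I, with overall factors suppressed); the symbol $U$, standing for the unitary operation produced by some sequence of field pulses, is introduced here for illustration.

```latex
\begin{align}
\rho \;\rightarrow\; U \rho\, U^{\dagger}
  &= U \left( I + \Delta \right) U^{\dagger} \\
  &= U U^{\dagger} + U \Delta\, U^{\dagger} \\
  &= I + U \Delta\, U^{\dagger} .
\end{align}
```

The identity part is left untouched by any unitary, so the deviation evolves as ∆ → U∆U†, which is the same rule that a genuine density matrix obeys.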

NMR experiments have for some years routinely achieved spin state manipulations and measurements equivalent in complexity to those required for quantum information processing on a few qubits, so the first few-qubit quantum processors will be NMR systems. The method does not scale very well as the number of qubits is increased, however. For example, with n qubits the measured signal scales as $2^{-n}$. Also, the ability to measure the state is limited, since only the average state of many processors is detectable. This restricts the ability to apply quantum error correction (section 9), and complicates the design of quantum algorithms.

8.3 High-Q optical cavities

Both systems we have described permit simple quantum information processing, but not quantum communication. However, in a very high-quality optical cavity, a strong coupling can be achieved between a single atom or ion and a single mode of the electromagnetic field. This coupling can be used to apply quantum gates between the field mode and the ion, thus opening the way to transferring quantum information between separated ion traps, via high-Q optical cavities and optical fibres (Cirac et al. 1997). Such experiments are now being contemplated. The required strong coupling between a cavity field and an atom has been demonstrated by Brune et al. (1994), and Turchette et al. (1995). An electromagnetic field mode can also be used to couple ions within a single trap, providing a faster alternative to the phonon method (Pellizzari et al. 1995).

9 Quantum error correction

In section 7 we discussed some beautiful quantum algorithms. Their power only rivals classical computers, however, on quite large problems, requiring thousands of qubits and billions of quantum gates (with the possible exception of algorithms for simulation of physical systems). In section 8 we examined some experimental systems, and found that we can only contemplate ‘computers’ of a few tens of qubits and perhaps some thousands of gates. Such systems are not ‘computers’ at all because they are not sufficiently versatile: they should at best be called modest quantum information processors. Whence came this huge disparity between the hope and the reality?

The problem is that the prescription for the universal quantum computer, section 6, is unphysical in its fourth requirement. There is no such thing as a perfect quantum gate, nor is there such a thing as an isolated system. One may hope that it is possible in principle to achieve any degree of perfection in a real device, but in practice this is an impossible dream. Gates such as xor rely on a coupling between separated qubits, but if qubits are coupled to each other, they will unavoidably be coupled to something else as well (Plenio and Knight 1996). A rough guide is that it is very hard to find a system in which the loss of coherence is smaller than one part in a million each time an xor gate is applied. This means the decoherence is roughly $10^7$ times too fast to allow factorisation of a 130 digit number! It is an open question whether the laws of physics offer any intrinsic lower limit to the decoherence rate, but it is safe to say that it would be simpler to speed up classical computation by a factor of $10^6$ than to achieve such low decoherence in a large quantum computer. Such arguments were eloquently put forward by Haroche and Raimond (1996). Their work, and that of others such as Landauer (1995, 1996), sounds a helpful note of caution. More detailed treatments of decoherence in quantum computers are given by Unruh (1995), Palma et al. (1996) and Chuang et al. (1995). Large numerical studies are described by Miquel et al. (1996) and Barenco et al. (1997).

Classical computers are reliable not because they are perfectly engineered, but because they are insensitive to noise. One way to understand this is to examine in detail a device such as a flip-flop, or even a humble mechanical switch. Their stability is based on a combination of amplification and dissipation: a small departure of a mechanical switch from ‘on’ or ‘off’ results in a large restoring force from the spring. Amplifiers do the corresponding job in a flip-flop. The restoring force is not sufficient alone, however: with a conservative force, the switch would oscillate between ‘on’ and ‘off’. It is important also to have damping, supplied by an inelastic collision which generates heat in the case of a mechanical switch, and by resistors in the electronic flip-flop. However, these methods are ruled out for a quantum computer by the fundamental principles of quantum mechanics. The no-cloning theorem means amplification of unknown quantum states is impossible, and dissipation is incompatible with unitary evolution.

Such fundamental considerations lead to the widely accepted belief that quantum mechanics rules out the possibility to stabilize a quantum computer against the effects of random noise. A repeated projection of the computer’s state by well-chosen measurements is not in itself sufficient (Berthiaume et al. 1994, Miquel et al. 1997). However, by careful application of information theory one can find a way around this impasse. The idea is to adapt the error correction methods of classical information theory to the quantum situation.

Quantum error correction (QEC) was established as an important and general method by Steane (1996b) and independently Calderbank and Shor (1996). Some of the ideas had been introduced previously by Shor (1995b) and Steane (1996a). They are related to the ‘entanglement purification’ introduced by Bennett et al. (1996a) and independently Deutsch et al. (1996). The theory of QEC was further advanced by Knill and Laflamme (1997), Ekert and Macchiavello (1996), Bennett et al. (1996b). The latter paper describes the optimal 5-qubit code also independently discovered by Laflamme et al. (1996). Gottesman (1996) and Calderbank et al. (1997) discovered a general group-theoretic framework, introducing the important concept of the stabilizer, which also enabled many more codes to be found (Calderbank et al. 1996, Steane 1996c,d). Quantum coding theory reached a further level of maturity with the discovery by Shor and Laflamme (1997) of a quantum analogue to the MacWilliams identities of classical coding theory.

QEC uses networks of quantum gates and measurements, and at first it was not clear whether these networks had themselves to be perfect in order for the method to work. An important step forward was taken by Shor (1996) and Kitaev (1996) who showed how to make error correcting networks tolerant of errors within the network. In other words, such ‘fault tolerant’ networks remove more noise than they introduce. Shor’s methods were generalised by DiVincenzo and Shor (1996) and made more efficient by Steane (1997a,c). Knill and Laflamme (1996) introduced the idea of ‘concatenated’ coding, which is a recursive coding method. It has the advantage of allowing arbitrarily long quantum computations as long as the noise per elementary operation is below a finite threshold, at the cost of inefficient use of quantum memory (so requiring a large computer). This threshold result was derived by several authors (Knill et al. 1996, Aharonov and Ben-Or 1996, Gottesman et al. 1996). Further fault tolerant methods are described by Knill et al. (1997), Gottesman (1997), Kitaev (1997).

The discovery of QEC was roughly simultaneous with that of a related idea which also permits noise-free transmission of quantum states over a noisy quantum channel. This is ‘entanglement purification’ (Bennett et al. 1996a, Deutsch et al. 1996). The central idea here is for Alice to generate many entangled pairs of qubits, sending one of each pair down the noisy channel to Bob. Bob and Alice store their qubits, and perform simple parity checking measurements: for example, Bob performs xor between a given qubit and the next he receives, then measures just the target qubit. Alice does the same on her qubits, and they compare results. If they agree, the unmeasured qubits are (by chance) closer than average to the desired state |00〉 + |11〉. If they disagree, the qubits are rejected. By recursive use of such checks, a few ‘good’ entangled pairs are distilled out of the many noisy ones. Once in possession of a good entangled state, Alice and Bob can communicate by teleportation. A thorough discussion is given by Bennett et al. (1996b).

Using similar ideas, with important improvements, van Enk et al. (1997) have recently shown how quantum information might be reliably transmitted between atoms in separated high-Q optical cavities via imperfect optical fibres, using imperfect gate operations.


I will now outline the main principles of QEC.

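Let us write down the worst possible thing which could happen to a single qubit: a completely general interaction between a qubit and its environment is

$$
|e_i\rangle \left( a |0\rangle + b |1\rangle \right) \rightarrow a \left( c_{00} |e_{00}\rangle |0\rangle + c_{01} |e_{01}\rangle |1\rangle \right) + b \left( c_{10} |e_{10}\rangle |1\rangle + c_{11} |e_{11}\rangle |0\rangle \right) \qquad (42)
$$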

where $|e_{\ldots}\rangle$ denotes states of the environment and $c_{\ldots}$ are coefficients depending on the noise. The first significant point is to notice that this general interaction can be written

$$
|e_i\rangle |\phi\rangle \rightarrow \left( |e_I\rangle I + |e_X\rangle X + |e_Y\rangle Y + |e_Z\rangle Z \right) |\phi\rangle \qquad (43)
$$

where $|\phi\rangle = a |0\rangle + b |1\rangle$ is the initial state of the qubit, and $|e_I\rangle = c_{00} |e_{00}\rangle + c_{10} |e_{10}\rangle$, $|e_X\rangle = c_{01} |e_{01}\rangle + c_{11} |e_{11}\rangle$, and so on (each up to an overall factor of 1/2). Note that these environment states are not necessarily normalised. Eq. (43) tells us that we have essentially three types of error to correct on each qubit: X, Y and Z errors. These are ‘bit flip’ (X) errors, phase errors (Z) or both (Y = XZ).
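The step from eq. (42) to eq. (43) rests on the fact that any operator on a single qubit is a linear combination of I, X, Y and Z. The sketch below checks this numerically for one arbitrary 2×2 matrix, using the convention Y = XZ of the text. The function names and the example matrix are hypothetical, chosen only for illustration.

```python
# Decompose an arbitrary 2x2 complex matrix M as
#   M = alpha*I + beta*X + gamma*Y + delta*Z,   with Y = XZ as in the text.

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


Y = matmul(X, Z)  # equals [[0, -1], [1, 0]]


def pauli_coefficients(m):
    """Return (alpha, beta, gamma, delta) for the 2x2 matrix m."""
    alpha = (m[0][0] + m[1][1]) / 2   # coefficient of I
    delta = (m[0][0] - m[1][1]) / 2   # coefficient of Z
    beta = (m[0][1] + m[1][0]) / 2    # coefficient of X
    gamma = (m[1][0] - m[0][1]) / 2   # coefficient of Y = XZ
    return alpha, beta, gamma, delta


def recombine(alpha, beta, gamma, delta):
    """Rebuild alpha*I + beta*X + gamma*Y + delta*Z."""
    return [[alpha * I2[r][c] + beta * X[r][c]
             + gamma * Y[r][c] + delta * Z[r][c]
             for c in range(2)] for r in range(2)]


# Hypothetical example operator: neither unitary nor Hermitian.
M = [[0.3 + 0.1j, -1.2j], [0.7, 2.0 - 0.5j]]
coeffs = pauli_coefficients(M)
rebuilt = recombine(*coeffs)
max_error = max(abs(rebuilt[r][c] - M[r][c])
                for r in range(2) for c in range(2))
print(coeffs, max_error)
```

The factors of 1/2 in `pauli_coefficients` are the same factors that appear when the environment states of eq. (43) are expressed through those of eq. (42).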

Suppose our computer q is to manipulate k qubits of quantum information. Let a general state of the k qubits be $|\phi\rangle$. We first make the computer larger, introducing a further $n - k$ qubits, initially in the state $|0\rangle$. Call the enlarged system qc. An ‘encoding’ operation is performed: $E(|\phi\rangle |0\rangle) = |\phi_E\rangle$. Now, let noise affect the n qubits of qc. Without loss of generality, the noise can be written as a sum of ‘error operators’ M, where each error operator is a tensor product of n operators (one for each qubit), taken from the set $\{I, X, Y, Z\}$. For example $M = I_1 X_2 I_3 Y_4 Z_5 X_6 I_7$ for the case n = 7. A general noisy state is

$$
\sum_s |e_s\rangle M_s |\phi_E\rangle \qquad (44)
$$

Now we introduce even more qubits: a further $n - k$, prepared in the state $|0\rangle_a$. This additional set is called an ‘ancilla’. For any given encoding E, there exists a syndrome extraction operation A, operating on the joint system of qc and a, whose effect is $A(M_s |\phi_E\rangle |0\rangle_a) = (M_s |\phi_E\rangle) |s\rangle_a \;\; \forall\, M_s \in S$. The set S is the set of correctable errors, which depends on the encoding. In the notation $|s\rangle_a$, s is just a binary number which indicates which error operator $M_s$ we are dealing with, so the states $|s\rangle_a$ are mutually orthogonal. Suppose for simplicity that the general noisy state (44) only contains $M_s \in S$, then the joint state of environment, qc and a after syndrome extraction is

$$
\sum_s |e_s\rangle \left( M_s |\phi_E\rangle \right) |s\rangle_a \qquad (45)
$$

We now measure the ancilla state, and something rather wonderful happens: the whole state collapses onto $|e_s\rangle (M_s |\phi_E\rangle) |s\rangle_a$, for some particular value of s. Now, instead of general noise, we have just one particular error operator $M_s$ to worry about. Furthermore, the measurement tells us the value s (the ‘error syndrome’) from which we can deduce which $M_s$ we have! Armed with this knowledge, we apply $M_s^{-1}$ to qc by means of a few quantum gates (X, Z or Y), thus producing the final state $|e_s\rangle |\phi_E\rangle |s\rangle_a$. In other words, we have recovered the noise-free state of qc! The final environment state is immaterial, and we can re-prepare the ancilla in $|0\rangle_a$ for further use.

The only assumption in the above was that the noise in eq. (44) only contains error operators in the correctable set S. In practice, the noise includes both members and non-members of S, and the important quantity is the probability that the state collapses onto a correctable one when the syndrome is extracted. It is here that the theory of error-correcting codes enters in: our task is to find encoding and extraction operations E, A such that the set S of correctable errors includes all the errors most likely to occur. This is a very difficult problem.

It is a general truth that to permit efficient stabilization against noise, we have to know something about the noise we wish to suppress. The most obvious quasi-realistic assumption is that of uncorrelated stochastic noise. That is, at a given time or place the noise might have any effect, but the effects on different qubits, or on the same qubit at different times, are uncorrelated. This is the quantum equivalent of the binary symmetric channel, section 2.3. By assuming uncorrelated stochastic noise we can place all possible error operators M in a hierarchy of probability: those affecting few qubits (i.e. only a few terms in the tensor product are different from I) are most likely, while those affecting many qubits at once are unlikely. Our aim will be to find quantum error correcting codes (QECCs) such that all errors affecting up to t qubits will be correctable. Such a QECC is termed a ‘t-error correcting code’.

The simplest code construction (that discovered by Calderbank and Shor and Steane) goes as follows. First we notice that a classical error correcting code, such as the Hamming code shown in table 1, can be used to correct X errors. The proof relies on eq. (17) which permits the syndrome extraction A to produce an ancilla state $|s\rangle$ which depends only on the error $M_s$ and not on the computer’s state $|\phi\rangle$. This suggests that we store k quantum bits by means of the $2^k$ mutually orthogonal n-qubit states $|i\rangle$, where the binary number i is a member of a classical error correcting code C, see section 2.4. This will not allow correction of Z errors, however. Observe that since Z = HXH, the correction of Z errors is equivalent to rotating the state of each qubit by H, correcting X errors, and rotating back again. This rotation is called a Hadamard transform; it is just a change in basis. The next ingredient is to notice the following special property (Steane 1996a):

$$
H \sum_{i \in C} |i\rangle = \frac{1}{\sqrt{2^k}} \sum_{j \in C^{\perp}} |j\rangle \qquad (46)
$$

where $H \equiv H_1 H_2 H_3 \cdots H_n$. In words, this says that if we make a quantum state by superposing all the members of a classical error correcting code C, then the Hadamard-transformed state is just a superposition of all the members of the dual code $C^{\perp}$. From this it follows, after some further steps, that it is possible to correct both X and Z errors (and therefore also Y errors) if we use quantum states of the form given in eq. (46), as long as both C and $C^{\perp}$ are good classical error correcting codes, i.e. both have good correction abilities.
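The property expressed by eq. (46) can be checked directly for a small code. The sketch below takes as C the span of three length-7 generator words (an illustrative choice; these eight words are the ones appearing in eq. (47)), applies a normalised Hadamard to every qubit of the unnormalised superposition of C, and inspects which basis states survive. Only the support and uniformity of the result are checked; the overall prefactor depends on normalisation conventions and is not compared with eq. (46). All names in the code are hypothetical.

```python
from itertools import product
from math import sqrt

# Generators of a small classical linear code C of length n = 7 (assumed
# here for illustration).
GENERATORS = ["1010101", "0110011", "0001111"]
N = 7


def span(generators):
    """All binary words (as ints) reachable by XOR-ing subsets of generators."""
    words = set()
    for picks in product([0, 1], repeat=len(generators)):
        word = 0
        for pick, g in zip(picks, generators):
            if pick:
                word ^= int(g, 2)
        words.add(word)
    return words


def dot(a, b):
    """Inner product modulo 2 of two bit strings stored as ints."""
    return bin(a & b).count("1") % 2


C = span(GENERATORS)
# Dual code: every word orthogonal (mod 2) to all members of C.
C_dual = {j for j in range(2 ** N) if all(dot(i, j) == 0 for i in C)}

# Amplitude of |j> after a normalised Hadamard on each qubit of sum_{i in C} |i>.
amplitude = {j: sum((-1) ** dot(i, j) for i in C) / sqrt(2 ** N)
             for j in range(2 ** N)}

support = {j for j, amp in amplitude.items() if abs(amp) > 1e-12}
values = {round(amplitude[j], 12) for j in support}
print(len(C), len(C_dual), support == C_dual, values)
```

The transformed state has equal amplitude on every member of the dual code and zero amplitude elsewhere, which is the content of eq. (46) up to normalisation.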

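The simplest QECC constructed by the above recipe requires n = 7 qubits to store a single (k = 1) qubit of useful quantum information. The two orthogonal states required to store the information are built from the Hamming code shown in table 1:

$$
\begin{aligned}
|0_E\rangle \equiv{} & |0000000\rangle + |1010101\rangle + |0110011\rangle + |1100110\rangle \\
& + |0001111\rangle + |1011010\rangle + |0111100\rangle + |1101001\rangle
\end{aligned} \qquad (47)
$$

$$
\begin{aligned}
|1_E\rangle \equiv{} & |1111111\rangle + |0101010\rangle + |1001100\rangle + |0011001\rangle \\
& + |1110000\rangle + |0100101\rangle + |1000011\rangle + |0010110\rangle
\end{aligned} \qquad (48)
$$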

Such a QECC has the following remarkable property. Imagine I store a general (unknown) state of a single qubit into a spin state $a |0_E\rangle + b |1_E\rangle$ of 7 spin-half particles. I then allow you to do anything at all to any one of the 7 spins. I could nevertheless extract my original qubit state exactly. Therefore the large perturbation you introduced did nothing at all to the stored quantum information!
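To make the role of the syndrome concrete for the 7-qubit code, the following sketch does the classical bookkeeping for X (bit-flip) errors only. It is not a quantum simulation: it simply checks that every basis word appearing in eqs. (47) and (48) yields the same syndrome for a given single bit flip, so that the syndrome identifies the error without revealing anything about a and b. The parity-check rows and all function names are assumptions made for this illustration. Z errors would be handled in the same way after the Hadamard rotation described above.

```python
from itertools import product

N = 7
# Parity-check rows (assumed): column j, counted from 1, spells j in binary.
CHECKS = ["1010101", "0110011", "0001111"]


def xor_words(a, b):
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def span(rows):
    """All words obtained by XOR-ing subsets of the given rows."""
    words = set()
    for picks in product([0, 1], repeat=len(rows)):
        word = "0" * N
        for pick, row in zip(picks, rows):
            if pick:
                word = xor_words(word, row)
        words.add(word)
    return words


def syndrome(word):
    """Parity of the word against each check row, read as a binary number."""
    bits = [sum(int(w) & int(c) for w, c in zip(word, row)) % 2
            for row in CHECKS]
    return bits[0] + 2 * bits[1] + 4 * bits[2]


def flip(word, position):
    """Apply an X (bit-flip) error to the qubit at a 1-based position."""
    k = position - 1
    return word[:k] + ("0" if word[k] == "1" else "1") + word[k + 1:]


zero_words = span(CHECKS)                                # terms of eq. (47)
one_words = {xor_words(w, "1" * N) for w in zero_words}  # terms of eq. (48)

for position in range(1, N + 1):
    for w in zero_words | one_words:
        noisy = flip(w, position)
        s = syndrome(noisy)
        assert s == position         # syndrome depends only on the error
        assert flip(noisy, s) == w   # flipping the indicated qubit corrects it
print("all single X errors located and corrected")
```

Because the syndrome is the same for every term of the superposition, measuring it does not disturb the stored amplitudes.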

More powerful QECCs can be obtained from more powerful classical codes, and there exist quantum code constructions more efficient than the one just outlined. Suppose we store k qubits into n. There are 3n ways for a single qubit to be in error, since the error might be one of X, Y or Z. The number of syndrome bits is $n - k$, so if every single-qubit error, and the error-free case, is to have a different syndrome, we require $2^{n-k} \geq 3n + 1$. For k = 1 this lower limit is filled exactly by n = 5 and indeed such a 5-qubit single-error correcting code exists (Laflamme et al. 1996, Bennett et al. 1996b).
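The counting argument is easy to tabulate. The sketch below searches for the smallest n satisfying $2^{n-k} \geq 3n + 1$ for a few values of k. The function name and the search limit are hypothetical. Note that the inequality is only a necessary condition under the assumption stated in the text (a distinct syndrome for every single-qubit error); of the values printed, only the k = 1, n = 5 case is confirmed above to correspond to an actual code.

```python
def smallest_n(k, max_n=64):
    """Smallest n with 2**(n - k) >= 3*n + 1, or None if none up to max_n."""
    for n in range(k + 1, max_n + 1):
        if 2 ** (n - k) >= 3 * n + 1:
            return n
    return None


for k in range(1, 5):
    n = smallest_n(k)
    # Columns: k, smallest n, number of syndromes, number of cases to label.
    print(k, n, 2 ** (n - k), 3 * n + 1)
```

For k = 1 the search returns n = 5, where $2^{4} = 16 = 3 \times 5 + 1$, so every syndrome is used.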

More generally, the remarkable fact is that for fixed k/n, codes exist for which t/n is bounded from below as $n \rightarrow \infty$ (Calderbank and Shor 1995, Steane 1996b, Calderbank et al. 1997). This leads to a quantum version of Shannon’s theorem (section 2.4), though an exact definition of the capacity of a quantum channel remains unclear (Schumacher and Nielsen 1996, Barnum et al. 1996, Lloyd 1997, Bennett et al. 1996b, Knill and Laflamme 1997a). For finite n, the probability that the noise produces uncorrectable errors scales roughly as $(n\epsilon)^{t+1}$, where $\epsilon \ll 1$ is the probability of an arbitrary error on each qubit. This represents an extremely powerful noise suppression. We need to be able to reduce $\epsilon$ to a sufficiently small value by passive means, and then QEC does the rest. For example, consider the case $\epsilon \simeq 0.001$. With n = 23 there exists a code correcting all t = 3-qubit errors (Golay 1949, Steane 1996c). The probability that uncorrectable noise occurs is $\sim 0.023^4 \simeq 3 \times 10^{-7}$, thus the noise is suppressed by more than three orders of magnitude.
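The numbers in this example can be reproduced with a few lines. The sketch evaluates the rough scaling $(n\epsilon)^{t+1}$ quoted above; it is an order-of-magnitude estimate, not an exact failure probability, and the function name is hypothetical.

```python
def uncorrectable_probability(n, epsilon, t):
    """Rough estimate (n * epsilon) ** (t + 1) for a t-error correcting code."""
    return (n * epsilon) ** (t + 1)


n, epsilon, t = 23, 0.001, 3
p_fail = uncorrectable_probability(n, epsilon, t)
suppression = epsilon / p_fail  # ratio of raw error rate to residual rate
print(f"{p_fail:.2e}", f"{suppression:.0f}")
```

With these values the residual probability is a little under $3 \times 10^{-7}$, a few thousand times smaller than the raw error probability of 0.001.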

So far I have described QEC as if the ancilla and the many quantum gates and measurements involved were themselves noise-free. Obviously we must drop this assumption if we want to form a realistic impression of what might be possible in quantum computing. Shor (1996) and Kitaev (1996) discovered ways in which all the required operations can be arranged so that the correction suppresses more noise than it introduces. The essential ideas are to verify states wherever possible, to restrict the propagation of errors by careful network design, and to repeat the syndrome extraction: for each group of qubits qc, the syndrome is extracted several times and qc is only corrected once t + 1 mutually consistent syndromes are obtained. Fig. 14 illustrates a fault-tolerant syndrome extraction network, i.e. one which restricts the propagation of errors. Note that a is verified before it is used, and each qubit in qc only interacts with one qubit in a.

In fault-tolerant computing, we cannot apply arbitrary rotations of a logical qubit, eq. (33), in a single step. However, particular rotations through irrational angles can be carried out, and thus general rotations are generated to an arbitrary degree of precision through repetition. Note that the set of computational gates is now discrete rather than continuous.

Recently the requirements for reliable quantum computing using fault-tolerant QEC have been estimated (Preskill 1997, Steane 1997c). They are formidable. For example, a computation beyond the capabilities of the best classical computers might require 1000 qubits and $10^{10}$ quantum gates. Without QEC, this would require a noise level of order $10^{-13}$ per qubit per gate, which we can rule out as impossible. With QEC, the computer would have to be made ten or perhaps one hundred times larger, and many thousands of gates would be involved in the correctors for each elementary step in the computation. However, much more noise could be tolerated: up to about $10^{-5}$ per qubit per gate (i.e. in any of the gates, including those in the correctors) (Steane 1997c). This is daunting but possible.
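The figure of $10^{-13}$ follows from a simple counting estimate. The sketch below assumes, as a simplification not spelled out above, that an error at any one of the qubit-gate locations spoils the computation and that errors at different locations are independent. The symbols $Q$ (number of qubits), $G$ (number of gates), $N_{\rm loc}$ and $p$ (error probability per qubit per gate) are introduced only for this estimate.

```latex
\begin{align}
N_{\rm loc} &= Q \times G = 10^{3} \times 10^{10} = 10^{13}, \\
P_{\rm success} &= (1 - p)^{N_{\rm loc}} \simeq e^{-p N_{\rm loc}}, \\
P_{\rm success} = O(1) \;&\Rightarrow\; p \lesssim \frac{1}{N_{\rm loc}} = 10^{-13}.
\end{align}
```

On the same counting, a tolerable noise level of $10^{-5}$ per qubit per gate corresponds to a relaxation of the requirement by eight orders of magnitude.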

The error correction methods briefly described here are not the only type possible. If we know more about the noise, then humbler methods requiring just a few qubits can be quite powerful. Such a method was proposed by Cirac et al. (1996) to deal with the principal noise source in an ion trap, which is changes of the motional state during gate operations. Also, some joint states of several qubits can have reduced noise if the environment affects all qubits together. For example the two states |01〉 ± |10〉 are unchanged by environmental coupling of the form |e0〉 I1I2 + |e1〉 X1X2 (Palma et al. 1996, Chuang and Yamamoto 1997). Such states offer a calm eye within the storm of decoherence, in which quantum information can be manipulated with relative impunity. A practical computer would probably use a combination of methods.
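The invariance of these two states can be verified directly. The following short derivation is a sketch using only the operators defined earlier in this section; it treats each of the two states separately.

```latex
\begin{align}
X_1 X_2 \left( |01\rangle \pm |10\rangle \right)
  &= |10\rangle \pm |01\rangle
   = \pm \left( |01\rangle \pm |10\rangle \right), \\
\left( |e_0\rangle I_1 I_2 + |e_1\rangle X_1 X_2 \right)
  \left( |01\rangle \pm |10\rangle \right)
  &= \left( |e_0\rangle \pm |e_1\rangle \right)
     \left( |01\rangle \pm |10\rangle \right).
\end{align}
```

In each case the result is a product of an environment state and the original two-qubit state, so the qubits do not become entangled with the environment and their state is left as it was.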

10 Discussion

The idea of ‘Quantum Computing’ has fired many imaginations simply because the words themselves suggest something strange but powerful, as if the physicists have come up with a second revolution in information processing to herald the next millennium. This is a false impression. Quantum computing will not replace classical computing for similar reasons that quantum physics does not replace classical physics: no one ever consulted Heisenberg in order to design a house, and no one takes their car to be mended by a quantum mechanic. If large quantum computers are ever made, they will be used to address just those special tasks which benefit from quantum information processing.

A more lasting reason to be excited about quantum computing is that it is a new and insightful way to think about the fundamental laws of physics. The quantum computing community remains fairly small at present, yet the pace of progress has been fast and accelerating in the last few years. The ideas of classical information theory seem to fit into quantum mechanics like a hand into a glove, giving us the feeling that we are uncovering something profound about Nature. Shannon’s noiseless coding theorem leads to Schumacher and Jozsa’s quantum coding theorem and the significance of the qubit as a useful measure of information. This enables us to keep track of quantum information, and to be confident that it is independent of the details of the system in which it is stored. This is necessary to underpin other concepts such as error correction and computing. The classical theory of error correction leads to the discovery of quantum error correction. This allows a physical process previously thought to be impossible, namely the almost perfect recovery of a general quantum state, undoing even irreversible processes such as relaxation by spontaneous emission. For example, during a long error-corrected quantum computation, using fault-tolerant methods, every qubit in the computer might decay a million times and yet the coherence of the quantum information be preserved.

Hilbert’s questions regarding the logical structure of mathematics encourage us to ask a new type of question about the laws of physics. In looking at Schrödinger’s equation, we can neglect whether it is describing an electron or a planet, and just ask about the state manipulations it permits. The language of information and computer science enables us to frame such questions. Even such a simple idea as the quantum gate, the cousin of the classical binary logic gate, turns out to be very useful, because it enables us to think clearly about quantum state manipulations which would otherwise seem extremely complicated or impractical. Such ideas open the way to the design of quantum algorithms such as those of Shor, Grover and Kitaev. These show that quantum mechanics allows information processing of a kind ruled out in classical physics. It relies on the propagation of a quantum state through a huge (exponentially large) number of dimensions of Hilbert space. The computation result arises from a controlled interference among many computational paths, which even after we have examined the mathematical description, still seems wonderful and surprising.

The intrinsic difficulty of quantum computation lies in the sensitivity of large-scale interference to noise and imprecision. A point often raised against the quantum computer is that it is essentially an analogue rather than a digital device, and has many limitations as a result. This is a misconception. It is true that any quantum system has a continuous state space, but so has any classical system, including the circuits of a digital computer. The fault-tolerant methods used to permit error correction in a quantum computer restrict the set of quantum gates to a discrete set, therefore the ‘legal’ states of the quantum computer are discrete, just as in a classical digital computer. The really important difference between analogue and digital computing is that to increase the precision of a result arrived at by analogue means, one must re-engineer the whole computer, whereas with digital methods one need merely increase the number of bits and operations. The fault-tolerant quantum computer has more in common with a digital than an analogue device.

Shor’s algorithm for the factorisation problem stimulated a lot of interest in part because of the connection with data encryption. However, I feel that the significance of Shor’s algorithm is not primarily in its possible use for factoring large integers in the distant future. Rather, it has acted as a stimulus to the field, proving the existence of a powerful new type of computing made possible by controlled quantum evolution, and exhibiting some of the new methods. At present, the most practically significant achievement in the general area of quantum information physics is not in computing at all, but in quantum key distribution.

The title ‘quantum computer’ will remain a misnomer for any experimental device realised in the next twenty years. It is an abuse of language to call even a pocket calculator a ‘computer’, because the word has come to be reserved for general-purpose machines which more or less realise Turing’s concept of the Universal Machine. The same ought to be true for quantum computers if we do not want to mislead people. However, small quantum information processors may serve useful roles. For example, concepts learned from quantum information theory may permit the discovery of useful new spectroscopic methods in nuclear magnetic resonance. Quantum key distribution could be made more secure, and made possible over larger distances, if small ‘relay stations’ could be built which applied purification or error correction methods. The relay station could be an ion trap combined with a high-Q cavity, which is realisable with current technology. It will surely not be long before a quantum state is teleported from one laboratory to another, a very exciting prospect.

The great intrinsic value of a large quantum computer is offset by the difficulty of making one. However, few would argue that this prize does not at least merit a lot of effort to find out just how unattainable, or hopefully attainable, it is. One of the chief uses of a processor which could manipulate a few quantum bits may be to help us better understand decoherence in quantum mechanics. This will be amenable to experimental investigation during the next few years: rather than waiting in hope, there is useful work to be done now.

On the theoretical side, there are two major open questions: the nature of quantum algorithms, and the limits on reliability of quantum computing. It is not yet clear what is the essential nature of quantum computing, and what general class of computational problem is amenable to efficient solution by quantum methods. Is there a whole mine of useful quantum algorithms waiting to be delved, or will the supply dry up with the few nuggets we have so far discovered? Can significant computational power be achieved with fewer than 100 qubits? This is by no means ruled out, since it is hard to simulate even 20 qubits by classical means. Concerning reliability, great progress has been made, so that we can now be cautiously optimistic that quantum computing is not an impossible dream. We can identify requirements sufficient to guarantee reliable computing, involving for example uncorrelated stochastic noise of order $10^{-5}$ per gate, and a quantum computer a hundred times larger than the logical machine embedded within it. However, can quantum decoherence be relied upon to have the properties assumed in such an estimate, and if not then can error correction methods still be found? Conversely, once we know more about the noise, it may be possible to identify considerably less taxing requirements for reliable computing.

To conclude, I would like to propose a more wide-ranging theoretical task: to arrive at a set of principles like energy and momentum conservation, but which apply to information, and from which much of quantum mechanics could be derived. Two tests of such ideas would be whether the EPR-Bell correlations thus became transparent, and whether they rendered obvious the proper use of terms such as ‘measurement’ and ‘knowledge’.

I hope that quantum information physics will be recognised as a valuable part of fundamental physics. The quest to bring together Turing machines, information, number theory and quantum physics is for me, and I hope will be for readers of this review, one of the most fascinating cultural endeavours one could have the good fortune to encounter.

I thank the Royal Society and St Edmund Hall, Oxford, for their support.


Abrams D S and Lloyd S 1997 Simulation of many-body Fermi systems on a universal quantum computer (preprint quant-ph/9703054)

Aharonov D and Ben-Or M 1996 Fault-tolerant quantum computation with constant error (preprint quant-ph/9611025)

Aspect A, Dalibard J and Roger G 1982 Experimental test of Bell’s inequalities using time-varying analysers, Phys. Rev. Lett. 49, 1804-1807

Aspect A 1991 Testing Bell’s inequalities, Europhys. News 22, 73-75

Barenco A 1995 A universal two-bit gate for quantum computation, Proc. R. Soc. Lond. A 449 679-683

Barenco A and Ekert A K 1995 Dense coding based on quantum entanglement, J. Mod. Opt. 42 1253-1259

Barenco A, Deutsch D, Ekert A and Jozsa R 1995a Conditional quantum dynamics and quantum gates, Phys. Rev. Lett. 74 4083-4086

Barenco A, Bennett C H, Cleve R, DiVincenzo D P, Margolus N, Shor P, Sleator T, Smolin J A and Weinfurter H 1995b Elementary gates for quantum computation, Phys. Rev. A 52, 3457-3467

Barenco A 1996 Quantum physics and computers, Contemp. Phys. 37 375-389

Barenco A, Ekert A, Suominen K A and Torma P 1996 Approximate quantum Fourier transform and decoherence, Phys. Rev. A 54, 139-146

Barenco A, Brun T A, Schack R and Spiller T P 1997 Effects of noise on quantum error correction algorithms, Phys. Rev. A 56 1177-1188

Barnum H, Fuchs C A, Jozsa R and Schumacher B 1996 A general fidelity limit for quantum channels, Phys. Rev. A 54 4707-4711

Beckman D, Chari A, Devabhaktuni S and Preskill J 1996 Efficient networks for quantum factoring, Phys. Rev. A 54, 1034-1063

Bell J S 1964 On the Einstein-Podolsky-Rosen paradox, Physics 1 195-200

Bell J S 1966 On the problem of hidden variables in quantum theory, Rev. Mod. Phys. 38 447-52

Bell J S 1987 Speakable and unspeakable in quantum mechanics (Cambridge University Press)

Benioff P, 1980 J. Stat. Phys. 22 563

Benioff P 1982a Quantum mechanical hamiltonian models of Turing machines, J. Stat. Phys. 29 515-546

Benioff P 1982b Quantum mechanical models of Turing machines that dissipate no energy, Phys. Rev. Lett. 48 1581-1585

Bennett C H 1973 Logical reversibility of computation, IBM J. Res. Develop. 17 525-532

Bennett C H 1982 Int. J. Theor. Phys. 21 905

Bennett C H, Brassard G, Breidbart S and Wiesner S 1982 Quantum cryptography, or unforgeable subway tokens, in Advances in Cryptology: Proceedings of Crypto ’82 (Plenum, New York) 267-275

Bennett C H and Brassard G 1984 Quantum cryptography: public key distribution and coin tossing, in Proc. IEEE Conf. on Computers, Syst. and Signal Process. 175-179

Bennett C H and Landauer R 1985 The fundamental physical limits of computation, Scientific American, July 38-46

Bennett C H 1987 Demons, engines and the second law, Scientific American vol 257 no. 5 (November) 88-96

Bennett C H and Brassard G 1989, SIGACT News 20, 78-82

Bennett C H and Wiesner S J 1992 Communication via one- and two-particle operations on Einstein-Podolsky-Rosen states, Phys. Rev. Lett. 69, 2881-2884

Bennett C H, Bessette F, Brassard G, Salvail L and Smolin J 1992 Experimental quantum cryptography, J. Cryptology 5, 3-28

Bennett C H, Brassard G, Crepeau C, Jozsa R, Peres A and Wootters W K 1993 Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels, Phys. Rev. Lett. 70 1895-1898

Bennett C H 1995 Quantum information and computation, Phys. Today 48 10 24-30

Bennett C H, Brassard G, Popescu S, Schumacher B, Smolin J A and Wootters W K 1996a Purification of noisy entanglement and faithful teleportation via noisy channels, Phys. Rev. Lett. 76 722-725

Bennett C H, DiVincenzo D P, Smolin J A and Wootters W K 1996b Mixed state entanglement and quantum error correction, Phys. Rev. A 54 3825

Bennett C H, Bernstein E, Brassard G and Vazirani U 1997 Strengths and weaknesses of quantum computing, (preprint quant-ph/9701001)

Berman G P, Doolen G D, Holm D D, Tsifrinovich V I 1994 Quantum computer on a class of one-dimensional Ising systems, Phys. Lett. 193 444-450

Bernstein E and Vazirani U 1993 Quantum complexity theory, in Proc. of the 25th Annual ACM Symposium on Theory of Computing (ACM, New York) 11-20

Berthiaume A, Deutsch D and Jozsa R 1994 The stabilisation of quantum computation, in Proceedings of the Workshop on Physics and Computation, PhysComp 94 60-62 Los Alamitos: IEEE Computer Society Press

Berthiaume A and Brassard G 1992a The quantum challenge to structural complexity theory, in Proc. of the Seventh Annual Structure in Complexity Theory Conference (IEEE Computer Society Press, Los Alamitos, CA) 132-137

Berthiaume A and Brassard G 1992b Oracle quantum computing, in Proc. of the Workshop on Physics of Computation: PhysComp ’92 (IEEE Computer Society Press, Los Alamitos, CA) 60-62

Boghosian B M and Taylor W 1997 Simulating quantum mechanics on a quantum computer (preprint quant-ph/9701019)

Bohm D 1951 Quantum Theory (Englewood Cliffs, N.J.)

Bohm D and Aharonov Y 1957 Phys. Rev. 108 1070

Boyer M, Brassard G, Hoyer P and Tapp A Tight bounds on quantum searching (preprint quant-ph/9605034)

Brassard G 1997 Searching a quantum phone book, Science 275 627-628

Brassard G and Crepeau C 1996 SIGACT News 27 13-24

Braunstein S L, Mann A and Revzen M 1992 Maximal violation of Bell inequalities for mixed states, Phys. Rev. Lett. 68, 3259-3261

Braunstein S L and Mann A 1995 Measurement of the Bell operator and quantum teleportation, Phys. Rev. A 51, R1727-R1730

Brillouin L 1956, Science and information theory (Academic Press, New York)

Brune M, Nussenzveig P, Schmidt-Kaler F, Bernardot F, Maali A, Raimond J M and Haroche S 1994 From Lamb shift to light shifts: vacuum and subphoton cavity fields measured by atomic phase sensitive detection, Phys. Rev. Lett. 72, 3339-3342

Calderbank A R and Shor P W 1996 Good quantum error-correcting codes exist, Phys. Rev. A 54 1098-1105

Calderbank A R, Rains E M, Shor P W and Sloane N J A 1996 Quantum error correction via codes over GF(4) (preprint quant-ph/9608006)

Calderbank A R, Rains E M, Shor P W and Sloane N J A 1997 Quantum error correction and orthogonal geometry, Phys. Rev. Lett. 78 405-408

Caves C M 1990 Quantitative limits on the ability of a Maxwell Demon to extract work from heat, Phys. Rev. Lett. 64 2111-2114

Caves C M, Unruh W G and Zurek W H 1990 comment, Phys. Rev. Lett. 65 1387

Chuang I L, Laflamme R, Shor P W and Zurek W H 1995 Quantum computers, factoring, and decoherence, Science 270 1633-1635

Chuang I L and Yamamoto 1997 Creation of a persistent qubit using error correction, Phys. Rev. A 55, 114-127

Church A 1936 An unsolvable problem of elementary number theory, Amer. J. Math. 58 345-363

Cirac J I and Zoller P 1995 Quantum computations with cold trapped ions, Phys. Rev. Lett. 74 4091-4094

Cirac J I, Pellizzari T and Zoller P 1996 Enforcing coherent evolution in dissipative quantum dynamics, Science 273, 1207

Cirac J I, Zoller P, Kimble H J and Mabuchi H 1997 Quantum state transfer and entanglement distribution among distant nodes of a quantum network, Phys. Rev. Lett. 78, 3221

Clauser J F, Holt R A, Horne M A and Shimony A 1969 Proposed experiment to test local hidden-variable theories, Phys. Rev. Lett. 23 880-884

Clauser J F and Shimony A 1978 Bell’s theorem: experimental tests and implications, Rep. Prog. Phys. 41 1881-1927

Cleve R and DiVincenzo D P 1996 Schumacher’s quantum data compression as a quantum computation, Phys. Rev. A 54 2636

Coppersmith D 1994 An approximate Fourier transform useful in quantum factoring, IBM Research Report RC 19642

Cory D G, Fahmy A F and Havel T F 1996 Nuclear magnetic resonance spectroscopy: an experimentally accessible paradigm for quantum computing, in Proc. of the 4th Workshop on Physics and Computation (Complex Systems Institute, Boston, New England)

Crandall R E 1997 The challenge of large numbers, Scientific American February 59-62

Deutsch D 1985 Quantum theory, the Church-Turing principle and the universal quantum computer, Proc. Roy. Soc. Lond. A 400 97-117

Deutsch D 1989 Quantum computational networks, Proc. Roy. Soc. Lond. A 425 73-90

Deutsch D and Jozsa R 1992 Rapid solution of problems by quantum computation, Proc. Roy. Soc. Lond. A 439 553-558

Deutsch D, Barenco A and Ekert A 1995 Universality in quantum computation, Proc. R. Soc. Lond. A 449 669-677

Deutsch D, Ekert A, Jozsa R, Macchiavello C, Popescu S and Sanpera A 1996 Quantum privacy amplification and the security of quantum cryptography over noisy channels, Phys. Rev. Lett. 77 2818

Diedrich F, Bergquist J C, Itano W M and Wineland D J 1989 Laser cooling to the zero-point energy of motion, Phys. Rev. Lett. 62 403

Dieks D 1982 Communication by EPR devices, Phys. Lett. A 92 271

DiVincenzo D P 1995a Two-bit gates are universal for quantum computation, Phys. Rev. A 51 1015-1022

DiVincenzo D P 1995b Quantum computation, Science 270 255-261

DiVincenzo D P and Shor P W 1996 Fault-tolerant error correction with efficient quantum codes, Phys. Rev. Lett. 77 3260-3263

Einstein A, Rosen N and Podolsky B 1935 Phys. Rev. 47, 777

Ekert A 1991 Quantum cryptography based on Bell’s theorem, Phys. Rev. Lett. 67, 661-663

Ekert A and Jozsa R 1996 Quantum computation and Shor’s factoring algorithm, Rev. Mod. Phys. 68 733

Ekert A and Macchiavello C 1996 Quantum error correction for communication, Phys. Rev. Lett. 77 2585-2588

Ekert A 1997 From quantum code-making to quantum code-breaking, (preprint quant-ph/9703035)

van Enk S J, Cirac J I and Zoller P 1997 Ideal communication over noisy channels: a quantum optical implementation, Phys. Rev. Lett. 78, 4293-4296

Feynman R P 1982 Simulating physics with computers, Int. J. Theor. Phys. 21 467-488

Feynman R P 1986 Quantum mechanical computers, Found. Phys. 16 507-531; see also Optics News February 1985, 11-20.

Fredkin E and Toffoli T 1982 Conservative logic, Int. J. Theor. Phys. 21 219-253

Gershenfeld N A and Chuang I L 1997 Bulk spin-resonance quantum computation, Science 275 350-356

Glauber R J 1986, in Frontiers in Quantum Optics, Pike E R and Sarker S, eds (Adam Hilger, Bristol)

Golay M J E 1949 Notes on digital coding, Proc. IEEE 37 657

Gottesman D 1996 Class of quantum error-correcting codes saturating the quantum Hamming bound, Phys. Rev. A 54, 1862-1868

Gottesman D 1997 A theory of fault-tolerant quantum computation (preprint quant-ph/9702029)

Gottesman D, Evslin J, Kakade S and Preskill J 1996 (to be published)

Greenberger D M, Horne M A and Zeilinger A 1989 Going beyond Bell’s theorem, in Bell’s theorem, quantum theory and conceptions of the universe, Kafatos M, ed, (Kluwer Academic, Dordrecht) 73-76

Greenberger D M, Horne M A, Shimony A and Zeilinger A 1990 Bell’s theorem without inequalities, Am. J. Phys. 58, 1131-1143

Grover L K 1997 Quantum mechanics helps in searching for a needle in a haystack, Phys. Rev. Lett. 79, 325-328

Hamming R W 1950 Error detecting and error correcting codes, Bell Syst. Tech. J. 29 147

Hamming R W 1986 Coding and information theory, 2nd ed, (Prentice-Hall, Englewood Cliffs)

Hardy G H and Wright E M 1979 An introduction to the theory of numbers (Clarendon Press, Oxford)

Haroche S and Raimond J-M 1996 Quantum computing: dream or nightmare? Phys. Today August 51-52

Hellman M E 1979 The mathematics of public-key cryptography, Scientific American 241 August 130-139

Hill R 1986 A first course in coding theory (Clarendon Press, Oxford)

Hodges A 1983 Alan Turing: the enigma (Vintage, London)

Hughes R J, Alde D M, Dyer P, Luther G G, Morgan G L and Schauer M 1995 Quantum cryptography, Contemp. Phys. 36 149-163

J. Mod. Opt. 41, no 12 1994 Special issue: quantum communication

Jones D S 1979 Elementary information theory (Clarendon Press, Oxford)

Jozsa R and Schumacher B 1994 A new proof of the quantum noiseless coding theorem, J. Mod. Optics 41 2343

Jozsa R 1997a Entanglement and quantum computation, appearing in Geometric issues in the foundations of science, Huggett S et al., eds, (Oxford University Press)

Jozsa R 1997b Quantum algorithms and the Fourier transform, submitted to Proc. Santa Barbara conference on quantum coherence and decoherence (preprint quant-ph/9707033)

Keyes R W and Landauer R 1970 IBM J. Res. Develop. 14, 152

Keyes R W 1970 Science 168, 796

Kholevo A S 1973 Probl. Peredachi Inf 9, 3; Probl. Inf. Transm. (USSR) 9, 177

Kitaev A Yu 1995 Quantum measurements and the Abelian stabilizer problem, (preprint quant-ph/9511026)

Kitaev A Yu 1996 Quantum error correction with imperfect gates (preprint)

Kitaev A Yu 1997 Fault-tolerant quantum computation by anyons (preprint quant-ph/9707021)

Knill E and Laflamme R 1996 Concatenated quantum codes (preprint quant-ph/9608012)

Knill E, Laflamme R and Zurek W H 1996 Accuracy threshold for quantum computation, (preprint quant-ph/9610011)

Knill E and Laflamme R 1997 A theory of quantum error-correcting codes, Phys. Rev. A 55 900-911

Knill E, Laflamme R and Zurek W H 1997 Resilient quantum computation: error models and thresholds (preprint quant-ph/9702058)

Knuth D E 1981 The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 2nd ed (Addison-Wesley).

Kwiat P G, Mattle K, Weinfurter H, Zeilinger A, Sergienko A and Shih Y 1995 New high-intensity source of polarisation-entangled photon pairs, Phys. Rev. Lett. 75, 4337-4341

Laflamme R, Miquel C, Paz J P and Zurek W H 1996 Perfect quantum error correcting code, Phys. Rev. Lett. 77, 198-201

Landauer R 1961 IBM J. Res. Dev. 5 183

Landauer R 1991 Information is physical, Phys. Today May 1991 23-29

Landauer R 1995 Is quantum mechanics useful? Philos. Trans. R. Soc. London Ser. A. 353 367-376

Landauer R 1996 The physical nature of information, Phys. Lett. A 217 188

Lecerf Y 1963 Machines de Turing reversibles. Recursive insolubilite en n ∈ N de l’equation u = θ^n u, ou θ est un isomorphisme de codes, C. R. Acad. Francaise Sci. 257, 2597-2600

Levitin L B 1987 in Information Complexity and Control in Quantum Physics, Blaquiere A, Diner S, Lochak G, eds (Springer, New York) 15-47

Lidar D A and Biham O 1996 Simulating Ising spin glasses on a quantum computer (preprint quant-ph/9611038)

Lloyd S 1993 A potentially realisable quantum computer, Science 261 1569; see also Science 263 695 (1994).

Lloyd S 1995 Almost any quantum logic gate is universal, Phys. Rev. Lett. 75, 346-349

Lloyd S 1996 Universal quantum simulators, Science 273 1073-1078

Lloyd S 1997 The capacity of a noisy quantum channel, Phys. Rev. A 55 1613-1622

Lo H-K and Chau H F 1997 Is quantum bit commitment really possible?, Phys. Rev. Lett. 78 3410-3413

Loss D and DiVincenzo D P 1997 Quantum Computation with Quantum Dots, submitted to Phys. Rev. A (preprint quant-ph/9701055)

MacWilliams F J and Sloane N J A 1977 The theory of error correcting codes, (Elsevier Science, Amsterdam)

Mattle K, Weinfurter H, Kwiat P G and Zeilinger A 1996 Dense coding in experimental quantum communication, Phys. Rev. Lett. 76, 4656-4659.

Margolus N 1986 Quantum computation, Ann. New York Acad. Sci. 480 487-497

Margolus N 1990 Parallel Quantum Computation, in Complexity, Entropy and the Physics of Information, Santa Fe Institute Studies in the Sciences of Complexity, vol VIII p. 273 ed Zurek W H (Addison-Wesley)

Maxwell J C 1871 Theory of heat (Longmans, Green and Co, London)

Mayers D 1997 Unconditionally secure quantum bit commitment is impossible, Phys. Rev. Lett. 78 3414-3417

Menezes A J, van Oorschot P C and Vanstone S A 1997 Handbook of applied cryptography (CRC Press, Boca Raton)

Mermin N D 1990 What’s wrong with these elements of reality? Phys. Today (June) 9-11

Meyer D A 1996 Quantum mechanics of lattice gas automata I: one particle plane waves and potentials, (preprint quant-ph/9611005)

Minsky M L 1967 Computation: Finite and Infinite Machines (Prentice-Hall, Inc., Englewood Cliffs, N. J.; also London 1972)

Miquel C, Paz J P and Perazzo 1996 Factoring in a dissipative quantum computer, Phys. Rev. A 54 2605-2613

Miquel C, Paz J P and Zurek W H 1997 Quantum computation with phase drift errors, Phys. Rev. Lett. 78 3971-3974

Monroe C, Meekhof D M, King B E, Jefferts S R, Itano W M, Wineland D J and Gould P 1995a Resolved-sideband Raman cooling of a bound atom to the 3D zero-point energy, Phys. Rev. Lett. 75 4011-4014

Monroe C, Meekhof D M, King B E, Itano W M and Wineland D J 1995b Demonstration of a universal quantum logic gate, Phys. Rev. Lett. 75 4714-4717

Myers J M 1997 Can a universal quantum computer be fully quantum? Phys. Rev. Lett. 78, 1823-1824

Nielsen M A and Chuang I L 1997 Programmable quantum gate arrays, Phys. Rev. Lett. 79, 321-324

Palma G M, Suominen K-A and Ekert A K 1996 Quantum computers and dissipation, Proc. Roy. Soc. Lond. A 452 567-584

Pellizzari T, Gardiner S A, Cirac J I and Zoller P 1995 Decoherence, continuous observation, and quantum computing: A cavity QED model, Phys. Rev. Lett. 75 3788-3791

Peres A 1993 Quantum theory: concepts and methods (Kluwer Academic Press, Dordrecht)

Phoenix S J D and Townsend P D 1995 Quantum cryptography: how to beat the code breakers using quantum mechanics, Contemp. Phys. 36, 165-195

Plenio M B and Knight P L 1996 Realistic lower bounds for the factorisation time of large numbers on a quantum computer, Phys. Rev. A 53, 2986-2990.

Polkinghorne J 1994 Quarks, chaos and christianity (Triangle, London)

Preskill J 1997 Reliable quantum computers, (preprint quant-ph/9705031)

Privman V, Vagner I D and Kventsel G 1997 Quantum computation in quantum-Hall systems, (preprint, quant-ph/9707017)

Rivest R, Shamir A and Adleman L 1979 On digital signatures and public-key cryptosystems, MIT Laboratory for Computer Science, Technical Report, MIT/LCS/TR-212

Schroeder M R 1984 Number theory in science and communication (Springer-Verlag, Berlin Heidelberg)

Schumacher B 1995 Quantum coding, Phys. Rev. A 51 2738-2747

Schumacher B W and Nielsen M A 1996 Quantum data processing and error correction, Phys. Rev. A 54, 2629

Shankar R 1980 Principles of quantum mechanics (Plenum Press, New York)

Shannon C E 1948 A mathematical theory of communication, Bell Syst. Tech. J. 27 379; also p. 623

Shor P W 1994 Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, in Proc. 35th Annual Symp. on Foundations of Computer Science, Santa Fe, IEEE Computer Society Press; revised version 1995a preprint quant-ph/9508027

Shor P W 1995b Scheme for reducing decoherence in quantum computer memory, Phys. Rev. A 52 R2493-R2496

Shor P W 1996 Fault tolerant quantum computation, in Proc. 37th Symp. on Foundations of Computer Science, to be published. (Preprint quant-ph/9605011).

Shor P W and Laflamme R 1997 Quantum analog of the MacWilliams identities for classical coding theory, Phys. Rev. Lett. 78 1600-1602

Simon D 1994 On the power of quantum computation, in Proc. 35th Annual Symposium on Foundations of Computer Science (IEEE Computer Society Press, Los Alamitos) 124-134

Slepian D 1974 ed, Key papers in the development of information theory (IEEE Press, New York)

Spiller T P 1996 Quantum information processing: cryptography, computation and teleportation, Proc. IEEE 84, 1719-1746

Steane A M 1996a Error correcting codes in quantum theory, Phys. Rev. Lett. 77 793-797

Steane A M 1996b Multiple particle interference and quantum error correction, Proc. Roy. Soc. Lond. A 452 2551-2577

Steane A M 1996c Simple quantum error-correcting codes, Phys. Rev. A 54, 4741-4751

Steane A M 1996d Quantum Reed-Muller codes, submitted to IEEE Trans. Inf. Theory (preprint quant-ph/9608026)

Steane A M 1997a Active stabilisation, quantum computation, and quantum state synthesis, Phys. Rev. Lett. 78, 2252-2255

Steane A M 1997b The ion trap quantum information processor, Appl. Phys. B 64 623-642

Steane A M 1997c Space, time, parallelism and noise requirements for reliable quantum computing (preprint quant-ph/9708021)

Szilard L 1929 Z. Phys. 53 840; translated in Wheeler and Zurek (1983).

Teich W G, Obermayer K and Mahler G 1988 Structural basis of multistationary quantum systems II. Effective few-particle dynamics, Phys. Rev. B 37 8111-8121

Toffoli T 1980 Reversible computing, in Automata, Languages and Programming, Seventh Colloquium, Lecture Notes in Computer Science, Vol. 84, de Bakker J W and van Leeuwen J, eds, (Springer) 632-644

Turchette Q A, Hood C J, Lange W, Mabuchi H and Kimble H J 1995 Measurement of conditional phase shifts for quantum logic, Phys. Rev. Lett. 75 4710-4713

Turing A M 1936 On computable numbers, with an application to the Entscheidungsproblem, Proc. Lond. Math. Soc. Ser. 2 42, 230; see also Proc. Lond. Math. Soc. Ser. 2 43, 544

Unruh W G 1995 Maintaining coherence in quantum computers, Phys. Rev. A 51 992-997

Vedral V, Barenco A and Ekert A 1996 Quantum networks for elementary arithmetic operations, Phys. Rev. A 54 147-153

Weinfurter H 1994 Experimental Bell-state analysis, Europhys. Lett. 25 559-564

Wheeler J A and Zurek W H, eds, 1983 Quantum theory and measurement (Princeton Univ. Press, Princeton, NJ)

Wiesner S 1983 Conjugate coding, SIGACT News 15 78-88

Wiesner S 1996 Simulations of many-body quantum systems by a quantum computer (preprint quant-ph/9603028)

Wineland D J, Monroe C, Itano W M, Leibfried D, King B, and Meekhof D M 1997 Experimental issues in coherent quantum-state manipulation of trapped atomic ions, preprint, submitted to Rev. Mod. Phys.

Wootters W K and Zurek W H 1982 A single quantum cannot be cloned, Nature 299, 802

Zalka C 1996 Efficient simulation of quantum systems by quantum computers, (preprint quant-ph/9603026)

Zbinden H, Gautier J D, Gisin N, Huttner B, Muller A, Tittel W 1997 Interferometry with Faraday mirrors for quantum cryptography, Elect. Lett. 33, 586-588

Zurek W H 1989 Thermodynamic cost of computation, algorithmic complexity and the information metric, Nature 341 119-124

Fig. 1. Maxwell’s demon. In this illustration the demon sets up a pressure difference by only raising the partition when more gas molecules approach it from the left than from the right. This can be done in a completely reversible manner, as long as the demon’s memory stores the random results of its observations of the molecules. The demon’s memory thus gets hotter. The irreversible step is not the acquisition of information, but the loss of information if the demon later clears its memory.

[Diagram labels: Information Theory; Statistical Mechanics; Quantum Mechanics; data compression; Shannon's theorem; cryptography; Computer (Turing); computational complexity; Error correcting codes; Maxwell's demon; Entanglement; Bell-EPR correlations; multiple particle interference; Measurement; Decoherence; Schrodinger's equation; Hilbert space; Quantum computer; Quantum error correction; Quantum key distribution; quantum algorithms.]

Fig. 2. Relationship between quantum mechanics and information theory. This diagram is not intended to be a definitive statement, the placing of entries being to some extent subjective, but it indicates many of the connections discussed in the article.

[Diagram labels: S(X), S(Y), S(X|Y), S(Y|X), I(X:Y), S(X,Y).]

Fig. 3. Relationship between various measures of classical information.
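
The regions of a diagram of this kind correspond to the standard identities below, written with the labels used in the figure (S for entropy, S(X|Y) for conditional entropy, I(X:Y) for mutual information). They are given as a reminder of textbook relations, on the assumption that the figure uses these symbols in their usual sense.

```latex
\begin{align}
S(X,Y) &= S(X) + S(Y|X) = S(Y) + S(X|Y), \\
I(X:Y) &= S(X) + S(Y) - S(X,Y) \\
       &= S(X) - S(X|Y) = S(Y) - S(Y|X).
\end{align}
```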

[Diagram: A → Encode → channel → Decode → B.]

Fig. 4. The standard communication channel (“the information theorist’s coat of arms”). The source (Alice) produces information which is manipulated (‘encoded’) and then sent over the channel. At the receiver (Bob) the received values are ‘decoded’ and the information thus extracted.

[Plot: P(success), from 0 to 1, against k/n, from 0 to 1.]

Fig. 5. Illustration of Shannon’s theorem. Alice sends n = 100 bits over a noisy channel, in order to communicate k bits of information to Bob. The figure shows the probability that Bob interprets the received data correctly, as a function of k/n, when the error probability per bit is p = 0.25. The channel capacity is C = 1 − H(0.25) ≃ 0.19. Dashed line: Alice sends each bit repeated n/k times. Full line: Alice uses the best linear error-correcting code of rate k/n. The dotted line gives the performance of error-correcting codes with larger n, to illustrate Shannon’s theorem.
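
The capacity quoted in the caption can be checked numerically. The following is a minimal Python sketch, assuming a binary symmetric channel with bit-flip probability p; the function names are illustrative and do not come from the article. It also evaluates the per-bit success probability of the simple repetition strategy, assuming majority-vote decoding over an odd number of repeats. That decoding rule is an assumption made here for illustration, not a statement of how the dashed line was computed.

```python
import math


def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(p):
    """Capacity C = 1 - H(p) of a binary symmetric channel."""
    return 1.0 - binary_entropy(p)


def repetition_bit_success(r, p):
    """Probability that majority voting over r copies (r odd) recovers one bit."""
    if r % 2 == 0:
        raise ValueError("this sketch assumes an odd number of repeats")
    # The bit is decoded correctly when at most (r - 1) / 2 copies are flipped.
    return sum(
        math.comb(r, j) * p**j * (1 - p) ** (r - j)
        for j in range((r - 1) // 2 + 1)
    )


if __name__ == "__main__":
    p = 0.25
    print(f"H({p}) = {binary_entropy(p):.4f}")
    print(f"C = {bsc_capacity(p):.4f}")
    for r in (1, 3, 5):
        print(f"r = {r}: per-bit success = {repetition_bit_success(r, p):.4f}")
```

For p = 0.25 the computed capacity rounds to 0.19, in agreement with the value in the caption.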

Fig. 6. A classical computer can be built from a network of logic gates.

[Diagram: a tape carrying a sequence of binary symbols 0 and 1.]

Fig. 7. The Turing Machine. This is a conceptual mechanical device which can be shown to be capable of efficiently simulating all classical computational methods. The machine has a finite set of internal states, and a fixed design. It reads one binary symbol at a time, supplied on a tape. The machine’s action on reading a given symbol s depends only on that symbol and the internal state G. The action consists in overwriting a new symbol s′ on the current tape location, changing state to G′, and moving the tape one place in direction d (left or right). The internal construction of the machine can therefore be specified by a finite fixed list of rules of the form (s, G → s′, G′, d). One special internal state is the ‘halt’ state: once in this state the machine ceases further activity. An input ‘programme’ on the tape is transformed by the machine into an output result printed on the tape.
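
The rule list described in the caption can be illustrated with a short simulator. This is a minimal Python sketch under stated assumptions: the rules are stored as a dictionary mapping (s, G) to (s′, G′, d), the read position moves instead of the tape (an equivalent convention), unwritten cells read as 0, and the example machine `UNARY_INCREMENT` is hypothetical, chosen only to show the mechanism.

```python
def run_turing_machine(rules, tape, state, halt_state="halt", max_steps=10_000):
    """Run a one-tape machine; rules maps (s, G) -> (s_new, G_new, d), d = +1 or -1."""
    cells = dict(enumerate(tape))  # tape contents; unwritten cells read as 0
    position = 0
    steps = 0
    while state != halt_state:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = cells.get(position, 0)
        # Look up the rule for the current symbol and internal state.
        new_symbol, state, direction = rules[(symbol, state)]
        cells[position] = new_symbol
        position += direction
        steps += 1
    if not cells:
        return []
    return [cells.get(i, 0) for i in range(min(cells), max(cells) + 1)]


# Hypothetical example machine: skip over a block of 1s, then write one more 1.
UNARY_INCREMENT = {
    (1, "scan"): (1, "scan", +1),
    (0, "scan"): (1, "halt", +1),
}

if __name__ == "__main__":
    print(run_turing_machine(UNARY_INCREMENT, [1, 1, 1], "scan"))
```

The step limit is a practical guard only; whether an arbitrary rule list ever reaches the halt state cannot be decided in general.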

[Diagram: quantum network applying X_1, H_2 and xor_{1,3} to |φ〉.]

Fig. 8. Example ‘quantum network.’ Each horizontal line represents one qubit evolving in time from left to right. A symbol on one line represents a single-qubit gate. Symbols on two qubits connected by a vertical line represent a two-qubit gate operating on those two qubits. The network shown carries out the operation X_1 H_2 xor_{1,3} |φ〉. The ⊕ symbol represents X (not), the encircled H is the H gate, the filled circle linked to ⊕ is controlled-not.

[Diagram: three quantum networks, panels a), b) and c); labels include H, u, v, E and D.]

Fig. 9. Basic quantum communication concepts. The figure gives quantum networks for (a) dense coding, (b) teleportation and (c) data compression. The spatial separation of Alice and Bob is in the vertical direction; time evolves from left to right in these diagrams. The boxes represent measurements, the dashed lines represent classical information.

[Diagram: registers x and y, each prepared in |0〉, with FT and Uf network elements.]

Fig. 10. Quantum network for Shor’s period-finding algorithm. Here each horizontal line is a quantum register rather than a single qubit. The circles at the left represent the preparation of the input state |0〉. The encircled FT represents the Fourier transform (see text), and the box linking the two registers represents a network to perform Uf. The algorithm finishes with a measurement of the x register.

[Diagram: panels a) and b), each a grid with axes x and y; tick labels |0〉, |16〉, ..., |112〉 on one axis and |0〉, |16〉, |32〉, |48〉 on the other.]

Fig. 11. Evolution of the quantum state in Shor’s algorithm. The quantum state is indicated schematically by identifying the non-zero contributions to the superposition. Thus a general state ∑ c_{x,y} |x〉 |y〉 is indicated by placing a filled square at all those coordinates (x, y) on the diagram for which c_{x,y} ≠ 0. (a) eq. (35). (b) eq. (38).

[Diagram: ions in a linear trap.]

Fig. 12. Ion trap quantum information processor. A string of singly-charged atoms is stored in a linear ion trap. The ions are separated by ∼ 20 µm by their mutual repulsion. Each ion is addressed by a pair of laser beams which coherently drive both Raman transitions in the ions, and also transitions in the state of motion of the string. The motional degree of freedom serves as a single-qubit ‘bus’ to transport quantum information among the ions. State preparation is by optical pumping and laser cooling; readout is by electron shelving and resonance fluorescence, which enables the state of each ion to be measured with high signal to noise ratio.

[Diagram: sample in a magnetic field B.]

Fig. 13. Bulk nuclear spin resonance quantum information processor. A liquid of ∼ 10^20 ‘designer’ molecules is placed in a sensitive magnetometer, which can both generate oscillating magnetic fields and also detect the precession of the mean magnetic moment of the liquid. The situation is somewhat like having 10^20 independent processors, but the initial state is one of thermal equilibrium, and only the average final state can be detected. The quantum information is stored and manipulated in the nuclear spin states. The spin state energy levels of a given nucleus are influenced by neighbouring nuclei in the molecule, which enables xor gates to be applied. They are little influenced by anything else, owing to the small size of a nuclear magnetic moment, which means the inevitable dephasing of the processors with respect to each other is relatively slow. This dephasing can be undone by ‘spin echo’ methods.

[Diagram: network of H and xor gates acting on qc and the ancilla a.]

Fig. 14. Fault tolerant syndrome extraction, for the QECC given in equations (47),(48). The upper 7 qubits are qc, the lower are the ancilla a. All gates, measurements and free evolution are assumed to be noisy. Only H and 2-qubit xor gates are used; when several xors have the same control or target bit they are shown superimposed (NB this is a non-standard notation). The first part of the network, up until the 7 H gates, prepares a in |0_E〉, and also verifies a: a small box represents a single-qubit measurement. If any measurement gives 1, the preparation is restarted. The H gates transform the state of a to |0_E〉 + |1_E〉. Finally, the 7 xor gates between qc and a carry out a single xor in the encoded basis {|0_E〉, |1_E〉}. This operation carries X errors from qc into a, and Z errors from a into qc. The X errors in qc can be deduced from the result of measuring a. A further network is needed to identify Z errors. Such correction never makes qc completely noise-free, but when applied between computational steps it reduces the accumulation of errors to an acceptable level.

Message  Huffman  Hamming
0000     10       0000000
0001     000      1010101
0010     001      0110011
0011     11000    1100110
0100     010      0001111
0101     11001    1011010
0110     11010    0111100
0111     1111000  1101001
1000     011      1111111
1001     11011    0101010
1010     11100    1001100
1011     111111   0011001
1100     11101    1110000
1101     111110   0100101
1110     111101   1000011
1111     1111001  0010110

Table 1: Huffman and Hamming codes. The left column shows the sixteen possible 4-bit messages, the other columns show the encoded version of each message. The Huffman code is for data compression: the most likely messages have the shortest encoded forms; the code is given for the case that each message bit is three times more likely to be zero than one. The Hamming code is an error correcting code: every codeword differs from all the others in at least 3 places, therefore any single error can be corrected. The Hamming code is also linear: all the words are given by linear combinations of 1010101, 0110011, 0001111, 1111111. They satisfy the parity checks 1010101, 0110011, 0001111.
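
The properties of the Hamming code stated in the caption can be verified directly. The sketch below, in Python, rebuilds the sixteen codewords from the four generators, checks the parity conditions and the minimum distance, and corrects a single bit flip. The function names, and the ordering of the generators against the message bits, are assumptions chosen to match the table. The correction rule reads the three parity results as the binary index of the flipped position, which works for this particular set of checks.

```python
from itertools import product

# Generators, ordered so that message bits map to them from left to right.
GENERATORS = ["1111111", "0001111", "0110011", "1010101"]
# Parity checks as listed in the caption of Table 1.
PARITY_CHECKS = ["1010101", "0110011", "0001111"]


def xor_words(a, b):
    """Bitwise sum modulo 2 of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def encode(message):
    """Encode a 4-bit message string as a linear combination of the generators."""
    codeword = "0000000"
    for bit, generator in zip(message, GENERATORS):
        if bit == "1":
            codeword = xor_words(codeword, generator)
    return codeword


def parity(word, check):
    """Parity (0 or 1) of the bits of word selected by check."""
    return sum(int(w) & int(c) for w, c in zip(word, check)) % 2


def distance(a, b):
    """Number of places in which two words differ."""
    return sum(x != y for x, y in zip(a, b))


def correct(word):
    """Correct at most one flipped bit in a received 7-bit word."""
    s1, s2, s3 = (parity(word, check) for check in PARITY_CHECKS)
    position = s1 + 2 * s2 + 4 * s3  # 1-based index of the error, 0 if none
    if position == 0:
        return word
    i = position - 1
    flipped = "1" if word[i] == "0" else "0"
    return word[:i] + flipped + word[i + 1:]


def build_codebook():
    """Map each of the sixteen 4-bit messages to its codeword."""
    return {"".join(m): encode("".join(m)) for m in product("01", repeat=4)}


if __name__ == "__main__":
    codebook = build_codebook()
    words = list(codebook.values())
    print(min(distance(a, b) for a in words for b in words if a != b))
    print(correct("1100010"))  # codeword 1100110 with its fifth bit flipped
```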