Statistical cosmology in retrospect

Peter Coles looks at the past, present and future association of astronomy and cosmology with statistical theory.

Abstract

Over the past decade unprecedented improvements in observational technology have revolutionized astronomy. The effects of this data explosion have been felt particularly strongly in the field of cosmology. Enormous new datasets, such as those recently produced by the WMAP satellite, have at last placed our understanding of the universe on a firm empirical footing. The new surveys have also spawned sophisticated statistical approaches that can do justice to the quantity and quality of the information. In this paper I look at some of the new developments in statistical cosmology in their historical context, and show that they are just the latest example of a deep connection between the fields of astronomy and statistics that goes back at least as far as Galileo.



It has become a bit of a cliché to refer to the modern era of observational cosmology as a “golden age”, but clichés are clichés because they are often perfectly apt. When I started my graduate studies in cosmology in 1985, the largest redshift survey in circulation was the CfA slice (De Lapparent, Geller and Huchra 1986), containing just over a thousand galaxy positions. Now we have the Anglo-Australian 2dF Galaxy Redshift Survey (2dFGRS), in which the combination of wide-field (two-degree) optics and multifibre spectroscopy has enabled redshifts of about a quarter of a million galaxies to be obtained in just a few years (Colless et al. 2001). Back in 1985, no experiment had been able to detect variations in the temperature of the cosmic microwave background (CMB) radiation across the sky predicted by theories for the formation of the structures seen in galaxy surveys. This changed in 1992 with the discovery, by the COBE team (Smoot et al. 1992), of large-scale “ripples” in the CMB. This, in turn, triggered immense theoretical and observational interest in the potential use of finer-scale structure in the CMB as a cosmological diagnostic. The stunning high-resolution maps produced by the WMAP satellite (Bennett et al. 2003) are to COBE what the 2dFGRS is to the CfA survey.

These huge new datasets, and the others that will soon follow them, have led to a recognition of the central role of statistics in the new era of cosmology. The new maps may be pretty to look at, but if they are to play a role in the empirical science that cosmology aspires to be, they must be analysed using objective techniques. This requires a statistical approach. Patterns in the distribution of galaxies, and in the distribution of hot and cold spots in the microwave sky, must be quantified using statistical descriptors that can be compared with theories. In their search for useful descriptors, cosmologists have borrowed ideas from diverse fields of mathematics, such as graph theory and differential topology, as well as adapting more traditional statistical devices to meet the particular demands of cosmological data (Martinez and Saar 2002). The power spectrum, for example, a technique usually applied to the analysis of periodicities in time series, has been deployed with great success in the analysis of both the 2dFGRS (Percival et al. 2001) and WMAP (Hinshaw et al. 2003). This urgent demand for a statistical approach to cosmology is not new. It is merely the latest manifestation of a fascinating and deep connection between the fields of astronomy and statistics.

Astronomy and the history of statistics

The attitude of many physicists to statistics is summarized by two quotations: “There are lies, damned lies, and statistics” (Benjamin Disraeli); and “If your experiment needs statistics, you ought to have done a better experiment” (Ernest Rutherford). It was certainly my attitude when I was an undergraduate physics student that statistics was something done by biologists and the like, not by real scientists. I skipped all the undergraduate lectures on statistics that were offered, and only saw the error of my ways when I started as a graduate student and realised how fundamentally important it is to understand statistics if you want to do science with observations that are necessarily imperfect. Astronomy is about using data to test hypotheses in the presence of error, it is about making inferences in the face of uncertainty, and it is about extracting model parameters from noisy data. In short, it is about statistics. This argument has even greater strength when applied to cosmology, which has all the peculiarities of astronomy and some of its own. In mainstream astronomy one can survey populations of similar objects, count frequencies and employ arguments based on large numbers. In cosmology one is trying to make inferences about a unique system, the universe. As we shall see, history has a lesson to teach us on this matter too.

The connection between astronomy and statistics is by no means a one-way street.



1: The huge improvement in angular resolution from degree to arcminute scales reveals unprecedented detail in the temperature structure of the CMB, but also poses challenges. Millions rather than thousands of pixels mean that even simple statistical covariances require more substantial number-crunching for WMAP (below) than COBE (above). More importantly, the high signal-to-noise of WMAP means that systematic errors must be understood even more carefully than in the COBE map.


The development of statistics from the 17th century to the modern era is a fascinating story, the details of which are described by Hald (1998). To cut a long story short, it makes sense to think of three revolutionary (but not necessarily chronological) stages. The first involved the formulation of the laws of probability from studies of gambling and games of chance. The second involved applying these basic notions of probability to topics in natural philosophy, including astronomy (particularly astrometry and celestial mechanics). The third, and most recent, stage saw the rise of statistical thinking in the life sciences, including sociology and anthropology. It is the second of these stages that bears the most telling witness to the crucial role astronomy has played in the development of mainstream statistics: an astonishing number of the most basic concepts in modern statistics were invented by astronomers. Here are just a few highlights:

Galileo Galilei (1632) was, as far as I am aware, the first to find a systematic way of taking errors into account when fitting observations, i.e. assigning lower weights to observations with larger errors. He used absolute errors, rather than the squared errors favoured by modern approaches.

Daniel Bernoulli (1735) performed a pioneering analysis of the inclinations of the planetary orbits, with the aim of determining whether they were consistent with a random distribution on the sky.

John Michell (1767), although probably more famous for his discussion of what are now known as black holes, engaged in a statistical analysis of the positions of the “fixed stars” on the celestial sphere to see if they were consistent with a random pattern.

Laplace (1776) studied the problem of determining the mean inclination of planetary orbits.

Carl Friedrich Gauss (1809) developed the principles of least squares in fitting observations.

Laplace (1810) proved a special case of the Central Limit Theorem, namely that the distribution of the sum of a large number of independent variables drawn from some distribution tends towards the Gaussian distribution regardless of the original distribution.

Friedrich Wilhelm Bessel (1838), best known for his functions, provided a general proof of the Central Limit Theorem.

John Herschel (1850) developed a theory of errors based on the “normal”, i.e. Gaussian, distribution. The phrase “normal distribution” is now standard.

Sir Harold Jeffreys (1939) published a book on The Theory of Probability and so rekindled a long-standing debate on the meaning of probability ignited by Laplace and Thomas Bayes, to which I shall return later on.
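Neither method is spelled out in the article, but the contrast between Galileo-style absolute-error weighting and Gauss-style least squares is easy to sketch. The toy straight-line fit below is purely my own illustration, with made-up data and a simple weighting in which each residual is divided by its quoted uncertainty:

```python
import numpy as np
from scipy.optimize import minimize

# Made-up measurements y at positions x, with per-point uncertainties sigma.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8, 11.3])
sigma = np.array([0.2, 0.2, 1.0, 0.2, 0.2, 0.2])   # the third point is much noisier

def weighted_residuals(params):
    a, b = params
    return (y - (a + b * x)) / sigma        # larger error means smaller weight

# "Galileo-style": minimise the sum of absolute weighted errors.
fit_abs = minimize(lambda p: np.abs(weighted_residuals(p)).sum(),
                   x0=[0.0, 1.0], method="Nelder-Mead")

# "Gauss-style": minimise the sum of squared weighted errors (least squares).
fit_sq = minimize(lambda p: (weighted_residuals(p) ** 2).sum(),
                  x0=[0.0, 1.0], method="Nelder-Mead")

print("absolute-error fit (a, b):", np.round(fit_abs.x, 3))
print("least-squares fit  (a, b):", np.round(fit_sq.x, 3))
```

The two criteria differ most when outliers are present: a single badly discrepant point pulls the squared-error fit around far more than the absolute-error one.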

All these pioneering studies influenced modern-day statistical methods and terminology, and all were done by scientists who were, first and foremost, astronomers. It is only comparatively recently that there has been significant traffic in the reverse direction. The Berkeley statisticians Jerzy Neyman and Elizabeth Scott (1952), for example, made an important contribution to the study of the spatial distribution of galaxy clustering.

A forensic connection

Before resuming the thread of my argument, I can’t resist taking a detour in order to visit another surprising point at which the history of statistics meets that of astronomy. When I give popular talks, I often try to draw the analogy between cosmology and forensic science. We have one universe, our one crime scene. We can collect trace evidence, look for fingerprints, establish alibis and so on. But we can’t study a population of similar crime scenes, nor can we perform a series of experimental crimes under slightly different physical conditions. What we have to do is make inferences and deductions within the framework of a hypothesis that we continually subject to empirical test. This process carries on until reasonable doubt is exhausted. Detectives call this “developing a suspect”; cosmologists call it “constraining a theory”. I admit that I have stretched this analogy to breaking point, but at least it provides an excuse for discussing another famous astronomer-cum-statistician.

Lambert Adolphe Jacques Quételet (1796–1874) was a Belgian astronomer whose main interest was in celestial mechanics. He was also an expert in statistics. As we have seen, this tendency was not unusual for astronomers in the 19th century, but Quételet was a particularly notable example. He has been called “the father of modern statistics” and, among other things, was responsible for organizing the first international conference on statistics in 1853. His fame as a statistician owed less to his astronomy, however, than to the fact that in 1835 he had written a book called On Man. Quételet had been struck not only by the regular motions displayed by heavenly bodies, but also by regularities in social phenomena, such as the occurrence of suicides and crime. His book was an attempt to apply statistical methods to the development of man’s physical and intellectual faculties. His later work Anthropometry, or the Measurement of Different Faculties in Man (1871) carried these ideas further. This foray into “social physics” was controversial, and led to a number of unsavoury developments in pseudoscience such as the eugenics movement, but it did inspire two important developments in forensics.

Inspired by Quételet’s work on anthropometry, Alphonse Bertillon hit upon the idea of solving crimes using a database of measurements of the physical characteristics of convicted criminals (length of head, width of head, length of fingers, and so on). On their own, none of these could possibly prove a given individual to be guilty, but it ought to be possible, using a large number of measurements, to establish identity with a high probability. “Bertillonage”, as the system he invented came to be known, was cumbersome but proved successful in several high-profile criminal cases in Paris. By 1892 Bertillon was famous, but nowadays bertillonage is a word found only in the more difficult Sunday newspaper crosswords. The reason for its demise was the development of fingerprinting. It’s a curious coincidence that the first fingerprinting scheme in practical use was implemented in India, by a civil servant called William Herschel. But that is another story.

2: Great astronomers who also developed statistical methods: Bernoulli, Bessel, Galileo, Gauss, Jeffreys, Laplace and Quételet.

The great debate

A particularly interesting aspect of modern statistical cosmology is the attention being given to Bayesian methods. This has reopened a debate that goes back as far as the early days of statistics, indeed to the first of the three revolutionary periods I referred to above. This debate concerns the very meaning of the probabilistic reasoning on which the field of statistics is based. We make probabilistic statements of various types in various contexts: “Arsenal will probably win the Premiership,” “There’s a one-in-six chance of rolling a six,” and so on. But what do we actually mean when we make such statements? Roughly speaking, there are two competing views. Probably the most common is the frequentist interpretation, which considers a large ensemble of repeated trials. The probability of an event or outcome is then given by the fraction of times that particular outcome arises in the ensemble. In other words, probability is identified with some kind of proportion. The alternative “Bayesian” view does not concern itself with ensembles and frequencies of events, but is instead a kind of generalization of the rules of logic. While ordinary logic, Boolean algebra, relates to propositions that are either true (with a value of 1) or false (value 0), Bayesian probability represents the intermediate state in which one does not have sufficient information to determine truth or falsehood with certainty. Notice that Bayesian probability is applied to logical propositions, rather than to events or outcomes, and it represents the degree of reasonable belief one can attach to them. Moreover, all probabilities in this case are dependent on the assumption of a particular model.

The name “Bayesian” derives from the Revd Thomas Bayes (1702–61), who found a way of “inverting” the probability involved in a particular problem involving the binomial distribution. The general result of this procedure, now known as Bayes’ theorem, was actually first obtained by Laplace. The idea is relatively simple. Suppose I have an urn containing N balls, n of which are white and N–n of which are black. Given values of n and N, one can write down the probability of drawing m white balls in a sequence of M draws. The problem Bayes solved was to find the probability of n given a particular value of m, thus inverting the probability P(m|n) into P(n|m). This particular problem is only of academic interest, but the general result is important.
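In modern notation the inversion can be written as follows; the binomial form of the likelihood assumes the draws are made with replacement, and the uniform prior on n is purely illustrative (neither assumption is stated in the text):

\[
P(n \mid m) \;=\; \frac{P(m \mid n)\,P(n)}{\sum_{n'=0}^{N} P(m \mid n')\,P(n')},
\qquad
P(m \mid n) \;=\; \binom{M}{m}\left(\frac{n}{N}\right)^{m}\left(1-\frac{n}{N}\right)^{M-m},
\qquad
P(n) \;=\; \frac{1}{N+1}.
\]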

To give an illustration, suppose we have a parameter A that we wish to extract from some set of data D that we assume is noisy. A standard frequentist approach would proceed via the construction of a likelihood function P(D|A), which gives the probability of obtaining the data D for each particular value of the parameter A, given the properties of the experimental noise. The maximum of this likelihood would be quoted as the “best” value of the parameter: it is the value of the parameter that makes the measured data “most likely”. The alternative, Bayesian, approach would be to “invert” the likelihood P(D|A) to obtain P(A|D) using Bayes’ theorem. The maximum of P(A|D) then gives the most probable value of A. Clearly these two approaches can be equivalent if P(A|D) and P(D|A) are identical, but in order to invert the likelihood correctly we need to multiply it by P(A), a “prior probability” for A. If this prior probability is taken to be constant then the two approaches are equivalent, but consistent Bayesian reasoning can involve non-uniform priors. To a frequentist, a prior probability means model-dependence. To a Bayesian, any statement about probability involves a model. It has to be admitted, however, that the issue of prior probabilities and how to assign them remains a challenge for Bayesians.
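As a concrete sketch of the difference (my own toy example, not the article’s): suppose the parameter A is the mean of Gaussian-distributed data D with known noise level sigma, and the prior on A is itself Gaussian with mean mu0 and width tau. Both estimates then have closed forms:

```python
import numpy as np

# Toy data: noisy measurements of a single parameter A (numbers are made up).
D = np.array([4.8, 5.3, 4.9, 5.6, 5.1])
sigma = 0.5                      # assumed, known measurement noise

# Frequentist: the likelihood P(D|A) for Gaussian noise peaks at the sample mean.
A_ml = D.mean()

# Bayesian: the posterior P(A|D) is proportional to P(D|A) P(A); with a Gaussian
# prior of mean mu0 and width tau it peaks at a precision-weighted average.
mu0, tau = 4.0, 1.0              # prior mean and width (chosen arbitrarily)
w_data = len(D) / sigma**2       # precision of the combined data
w_prior = 1.0 / tau**2           # precision of the prior
A_map = (w_data * A_ml + w_prior * mu0) / (w_data + w_prior)

print(f"maximum likelihood  : {A_ml:.3f}")
print(f"maximum a posteriori: {A_map:.3f} (pulled slightly towards the prior mean)")
```

Widening the prior (increasing tau) drives the maximum a posteriori estimate towards the maximum-likelihood value, which is the equivalence with a constant prior noted above.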

This is not a sterile debate about the meaning of words. The interpretation of probability is important for many reasons, not least because of the implications it holds for the scientific method. The frequentist interpretation, the one favoured by many of the classical statistical texts, sits most comfortably with an approach to the philosophy of science like that of Karl Popper. In such a view, data are there to refute theories, and repeated failure to refute a theory provides no reason for stronger belief in it. On the other hand, Bayesians advocate “inductive” reasoning: data can either support a model, making it more probable, or contradict it, making it less probable. Falsifiability is no longer the sole criterion by which theories are judged. There is not space here to go over the pros and cons of these two competing approaches, but it is interesting to remark that the early historical development of mathematical statistics owes much more to the Bayesian approach than to frequentist ones, although Bayesian methods later fell into disrepute. Most standard statistical texts, such as the compendious three-volume Kendall and Stuart (1977), are rigorously frequentist. When Harold Jeffreys (1939) advocated a Bayesian position he was regarded as something of a heretic. The book by Ed Jaynes (2003), published posthumously, is a must for anyone interested in these issues.

The Bayesian interpretation holds many attractions for cosmology. One is that the universe is, by definition, unique. It is not impossible to apply frequentist arguments in this arena by constructing an ensemble of universes, but one has to worry about whether such an ensemble is meant to be real or simply a construct to allow probabilities to become proportions. Bayesians, on the other hand, can deal with unique entities quite straightforwardly. Another reason for cosmologists to be interested in Bayesian reasoning is that they are quite used to the idea that things are model-dependent. For example, one of the important tasks set during the era of precision cosmology has been the determination of a host of cosmological parameters that can’t be predicted a priori in the standard Big Bang model. These parameters include the expansion rate (Hubble constant) H0, the spatial curvature k, and the mean cosmic density in various forms of material: Ωm (cold dark matter), Ων (neutrinos), Ωb (baryons), and so on. In all, it is possible to infer joint values of 13 or more such parameters from measurements of the fluctuations of the cosmic microwave background. But such parameters only make sense within the overall model within which they are defined. This massive inference problem also demonstrates another virtue of the Bayesian approach: the result (posterior probability) of one analysis can be fed into another as a prior probability.
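Schematically, and in my notation rather than the article’s: if two datasets D1 and D2 are independent given the parameters θ, the posterior from the first analysis simply becomes the prior for the second,

\[
P(\theta \mid D_1) \;\propto\; P(D_1 \mid \theta)\,P(\theta),
\qquad
P(\theta \mid D_1, D_2) \;\propto\; P(D_2 \mid \theta)\,P(\theta \mid D_1).
\]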

Probable worlds

As well as its practical implications for data analysis, a Bayesian approach also leads to useful insights into more esoteric topics; see Garrett and Coles (1993). Among these is the Anthropic Principle, which is, roughly speaking, the assertion that there is a connection between the existence of life in the universe and the fundamental physics that governs large-scale cosmological behaviour. This expression was first used by Carter (1974), who suggested adding the word “Anthropic” to the usual Cosmological Principle to stress that our universe is “special”, at least to the extent that it has permitted intelligent life to evolve. Barrow and Tipler (1986) give a complete discussion.

There are many otherwise viable cosmological models that are not compatible with the observation that human observers exist. For example, heavy elements such as carbon and oxygen are vital to the complex chemistry required for terrestrial life to have developed. And it takes around 10 billion years of stellar evolution for generations of stars to synthesize significant quantities of these elements from the primordial gas of hydrogen and helium that exists in the early stages of a Big Bang model. Therefore, we could not inhabit a universe younger than about 10 billion years old. Since the size of the universe is related to its age if it is expanding, this line of reasoning, originally due to Dicke (1961), sheds some light on the question of why the universe is as big as it is. It has to be big, because it has to be old if there has been time for us to evolve within it. Should we be surprised that the universe is so big? No, its size is not improbable if one takes into account the conditioning information.

This form of reasoning is usually called the Weak Anthropic Principle (WAP), and is essentially a corrective to the Copernican Principle that we do not inhabit a special place in the universe. According to the WAP, we should remember that we can only inhabit those parts of space-time compatible with human life, away from the centre of a massive black hole, for example. By the argument given above, we just could not exist at a much earlier epoch than we do. This kind of argument is relatively uncontroversial, and can lead to useful insights. From a Bayesian perspective, what the WAP means is that our existence in the universe in itself contains information about the universe, and it must be included when conditional probabilities are formed.

One example of a useful insight gleaned in this way relates to the Dirac Cosmology. Dirac (1937) was perplexed by a number of apparent coincidences between large dimensionless ratios of physical constants. He found no way to explain these coincidences using standard theories, so he decided that they had to be a consequence of a deep underlying principle. He therefore constructed an entire theoretical edifice of time-varying fundamental constants on the so-called Large Number Hypothesis. The simple argument by Dicke outlined above, however, dispenses with the need to explain these coincidences in this way. For example, the ratio between the present size of the cosmological horizon and the radius of an electron is roughly the same as the ratio between the strengths of the electromagnetic and gravitational forces binding protons and electrons. (Both ratios are huge: of order 10^40.) This does indeed seem like a coincidence, but remember that the size of the horizon depends on the time: it gets bigger as time goes on. And the lifetime of a star is determined by the interplay between electromagnetic and gravitational effects. It turns out that both these ratios reduce to the same value precisely because they both depend on the lifetime of stellar evolution: the former through our existence as observers, and the latter through the fundamental physics describing the structure of a star. The “coincidence” is therefore not as improbable as one might have thought when one takes into account all the available conditioning information.
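A rough numerical check of the two ratios, using round values for the physical constants and taking the horizon scale to be the speed of light times roughly 14 billion years (these particular choices are mine, not the article’s):

```python
# Rough check of the two "large number" ratios; all constants are round values.
G   = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
k_e = 8.988e9          # Coulomb constant, N m^2 C^-2
e   = 1.602e-19        # elementary charge, C
m_p = 1.673e-27        # proton mass, kg
m_e = 9.109e-31        # electron mass, kg
c   = 2.998e8          # speed of light, m s^-1
t_H = 14e9 * 3.156e7   # ~14 billion years in seconds (illustrative choice)
r_e = 2.818e-15        # classical electron radius, m

# Electromagnetic/gravitational force ratio for a proton-electron pair;
# the separation cancels because both forces fall off as 1/r^2.
force_ratio = k_e * e**2 / (G * m_p * m_e)

# Present horizon size (roughly c times the age of the universe) divided by
# the classical electron radius.
size_ratio = c * t_H / r_e

print(f"electromagnetic/gravitational force ratio: {force_ratio:.1e}")  # ~2e39
print(f"horizon size / electron radius           : {size_ratio:.1e}")   # ~5e40
```

The two numbers agree only to within an order of magnitude or so, but that is the sense in which they are “roughly the same”: both are of order 10^40.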

Some cosmologists, however, have sought to extend the Anthropic Principle into deeper waters. While the weak version applies to physical properties of our universe, such as its age, density or temperature, the “Strong” Anthropic Principle (SAP) is an argument about the laws of physics according to which these properties evolve. It appears that these fundamental laws are finely tuned to permit complex chemistry, which, in turn, permits the development of biology and ultimately human life. If the laws of electromagnetism and nuclear physics were only slightly different, chemistry and biology would be impossible. On the face of it, the fact that the laws of Nature do appear to be tuned in this way seems to be a coincidence, in that there is nothing in our present understanding of fundamental physics that requires the laws to be conducive to life in this way. This is therefore something we should seek to explain.

In some versions of the SAP, the reasoning is essentially teleological (i.e. an argument from design): the laws of physics are as they are because they must be like that for life to develop. This is tantamount to requiring that the existence of life is itself a law of Nature, and that the more familiar laws of physics are subordinate to it. This kind of reasoning appeals to some with a religious frame of mind, but its status among scientists is rightly controversial, as it suggests that the universe was designed specifically in order to accommodate human life. An alternative construction of the SAP involves the idea that our universe may consist of an ensemble of mini-universes, each one having different laws of physics from the others. We can only have evolved in one of the mini-universes compatible with the development of organic chemistry and biology, so we should not be surprised to be in one where the underlying laws of physics appear to have special properties. This provides some kind of explanation for the apparently surprising properties of the laws of Nature mentioned above. This latter form of the SAP is not an argument from design, since the laws of physics could vary haphazardly from mini-universe to mini-universe, and it is in some respects logically similar to the WAP, except with a frequentist slant.

Fingerprinting the universe

I hope I have demonstrated that the relationship between statistics and cosmology is an intimate and longstanding one. This does not mean that the modern era does not pose new challenges, some of them resulting from the sheer scale of the data flows being generated. To understand some of these new problems it is perhaps helpful to step back about 20 years and look at the sort of methods that could be applied to cosmological data and the sort of questions that were being asked.

In the 1980s there was no universally accepted “standard” cosmological model. The curvature and expansion rate of the universe had not yet been measured, and the average density of matter was uncertain by a factor of at least 10. Some basic ideas about the origin of cosmic structure had been established, such as that it formed by gravitational condensation from small initial fluctuations, perhaps generated during an inflationary epoch in the early universe, but the form of the initial fluctuation spectrum was uncertain. The largest redshift surveys covered a volume less than 100 Mpc deep and contained on the order of a thousand galaxies.

The only forms of statistical analysis that could be performed at this time had to be simple, in order to cope with the sparseness of the data. The most commonly used statistical analysis of galaxy clustering observations involved the two-point correlation function (e.g. Peebles 1980). This measures the statistical tendency for galaxies to occur in pairs rather than individually. The correlation function, usually written ξ(r), measures the excess number of pairs of galaxies found at a separation r compared to how many such pairs would be found if the distribution of galaxies in space were completely random. In other words, ξ(r) measures the excess probability of finding a galaxy at a distance r from a given galaxy, compared with what one would expect if galaxies were distributed independently throughout space. A positive value of ξ(r) indicates that there are more pairs of galaxies with a separation r; galaxies are then said to be clustered on the scale r. A negative value indicates that galaxies tend to avoid each other, so they are anticlustered. A completely random distribution, usually called a Poisson distribution, has ξ(r) = 0 for all r.
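In practice ξ(r) is estimated from pair counts. The sketch below uses the simplest estimator, DD/RR − 1, comparing pair counts in the data (DD) with those in a random catalogue filling the same volume (RR); it is applied here to a made-up, unclustered catalogue, so the estimate should just scatter around zero. Real analyses use more refined estimators and a proper treatment of the survey geometry, none of which is described in the article.

```python
import numpy as np

rng = np.random.default_rng(42)

def xi_estimate(data, n_random, edges, boxsize):
    """Crude two-point correlation function via the simplest estimator, DD/RR - 1."""
    randoms = rng.uniform(0.0, boxsize, size=(n_random, 3))

    def normalised_pair_counts(points):
        # Brute-force pair separations; fine for a few thousand points at most.
        diff = points[:, None, :] - points[None, :, :]
        r = np.sqrt((diff ** 2).sum(axis=-1))
        r = r[np.triu_indices(len(points), k=1)]          # each pair counted once
        counts, _ = np.histogram(r, bins=edges)
        return counts / (len(points) * (len(points) - 1) / 2.0)

    dd = normalised_pair_counts(data)
    rr = normalised_pair_counts(randoms)
    return dd / rr - 1.0

# Toy catalogue: unclustered (Poisson) points in a 100 Mpc box, so the
# estimated xi(r) should hover around zero in every bin.
data = rng.uniform(0.0, 100.0, size=(500, 3))
edges = np.linspace(2.0, 20.0, 10)                        # separation bins in Mpc
print(np.round(xi_estimate(data, n_random=1500, edges=edges, boxsize=100.0), 2))
```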

Estimates of the correlation function of galaxies indicate that ξ(r) is a power-law function of the form ξ(r) ≈ (r/r0)^−1.8, where the constant r0 is usually called the correlation length. The value of r0 depends on the type of galaxy used, but is around 5 Mpc for bright galaxies. On larger scales than this, however, ξ(r) becomes negative, indicating the presence of large voids. The correlation function is mathematically related to the power spectrum P(k) by a Fourier transformation; the power spectrum is in many ways a better descriptor of clustering on large scales than the correlation function, and has played a central role in the analysis of the 2dFGRS I referred to earlier. Since they form a transform pair, ξ(r) and P(k) contain the same information about the clustering pattern. A slightly different version of the power spectrum, based not on Fourier modes but on spherical harmonics, is the principal tool for the analysis of CMB fluctuations, especially for the purpose of extracting cosmological parameters.

3: The 2dFGRS cone, showing more than 200 000 galaxies with recession velocities up to 100 000 km/s, selected from the APM survey. The CfA redshift survey, state of the art in 1986, contained only two thousand galaxies with top recession velocity 15 000 km/s. The 2dFGRS is the closest to a fair sample of the galaxy distribution that has ever been produced. (Axes: right ascension, redshift, and distance in billions of light years.)
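In the conventions most commonly used (the article does not write them out), the correlation function and power spectrum form the transform pair below, and the CMB temperature pattern is expanded in spherical harmonics to give the angular power spectrum Cℓ:

\[
\xi(r) \;=\; \frac{1}{2\pi^{2}} \int_{0}^{\infty} P(k)\,\frac{\sin kr}{kr}\,k^{2}\,\mathrm{d}k,
\qquad
\frac{\Delta T}{T}(\hat{\mathbf{n}}) \;=\; \sum_{\ell,m} a_{\ell m}\,Y_{\ell m}(\hat{\mathbf{n}}),
\qquad
C_{\ell} \;=\; \bigl\langle |a_{\ell m}|^{2} \bigr\rangle .
\]

The combination ℓ(ℓ+1)Cℓ/2π is the quantity plotted on the vertical axis of figure 5.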

The correlation function and power spectrum are examples of what are called second-order clustering statistics (the first-order statistic is simply the mean density). Their success in pinning down cosmological parameters relies on the fact that, according to standard cosmological models, the initial fluctuations giving rise to large-scale structure are Gaussian in character, meaning that second-order quantities furnish a complete description of their spatial statistics. The microwave background provides a snapshot of these initial fluctuations, so the spherical harmonic spectrum is essentially a complete description of the statistics of the primordial universe. The universe of galaxies is not quite such a pristine reflection of the initial cosmic fluctuations, but the power spectrum is such a simple and robust measure of clumping that one can correct for nonlinear evolution, bias and distortions produced by peculiar motions in a relatively straightforward manner, at least on large scales (Percival et al. 2001). This is why second-order statistics have been so fruitful in the analysis of the latest surveys and in establishing the basic parameters of the standard cosmology.

One has to bear in mind, however, that second-order statistics are relatively blunt instruments. There is much more information in the 2dFGRS and WMAP than is revealed by their power spectra, but this information is not so much about cosmological parameters as about deviations from the standard model. The broad-brush description afforded by a handful of numbers will have to be replaced by measurements of the fine details left by complex astrophysical processes involved in structure formation and possible signatures of unknown physics. The strength of the second-order statistics is that they are insensitive to these properties, so their detection will require higher-order approaches tuned to probe particular departures from the standard model. Does the CMB display primordial non-Gaussianity (Komatsu et al. 2003), for example? What can one learn about galaxy formation from the clustering of galaxies of different types in different environments (Madgwick et al. 2003)? The recent stunning discoveries are not the end of the line for statistical cosmology, but are signposts pointing in new directions.

Peter Coles is Professor of Astrophysics at the University of Nottingham.

References

Barrow J D and Tipler F J 1986 The Anthropic Cosmological Principle (Oxford University Press, Oxford).

Bennett C L et al. 2003 ApJ submitted, astro-ph/0302207.

Colless M M et al. 2001 MNRAS 328 1039.

De Lapparent V, Geller M J and Huchra J P 1986 ApJ 302 L1.

Garrett A J M and Coles P 1993 Comments on Astrophys. 17 23.

Hald A 1998 A History of Mathematical Statistics from 1750 to 1930 (John Wiley & Sons).

Hinshaw G et al. 2003 ApJ submitted, astro-ph/0302217.

Jaynes E T 2003 Probability Theory (CUP, Cambridge).

Jeffreys H 1939 The Theory of Probability (OUP, Oxford).

Kendall M G and Stuart A 1977 The Advanced Theory of Statistics (3 volumes) (Charles Griffin & Co, London).

Komatsu E et al. 2003 ApJ submitted, astro-ph/0302223.

Madgwick D S et al. 2003 MNRAS submitted, astro-ph/0303668.

Martinez V J and Saar E 2002 The Statistics of the Galaxy Distribution (Chapman & Hall).

Neyman J and Scott E 1952 ApJ 116 144.

Peebles P J E 1980 The Large-scale Structure of the Universe (Princeton University Press, Princeton).

Percival W J et al. 2001 MNRAS 327 1297.

Smoot G F et al. 1992 ApJ 396 L1.


4: The 2dFGRS power spectrum. Note the small error bars resulting from the large sample size. There is some suggestion of “wiggles” in the spectrum that may be the remnants of the acoustic peaks in the WMAP angular spectrum (figure 5). The power spectrum is a useful diagnostic of clustering pattern; a complete statistical description of the filaments and voids in figure 3 will require more sophisticated measures. (2dFGRS Team/Will Percival.)

5: The CMB angular power spectrum for WMAP, COBE, ACBAR and CBI data. Note that the WMAP results are consistent with the other experiments and that the combined WMAP and CBI points constrain the angular power spectrum of our sky with great accuracy and precision. The agreement between the measured spectrum and the model shown not only suggests the basic cosmological framework is correct, but also allows parameters of the model to be estimated with unprecedented accuracy. The polarization spectrum shown below is less certain, but does show a perhaps surprising peak at very low spherical harmonics, usually attributed to the astrophysical effect of cosmic reionization. (WMAP Science Team images.)

(Figure 4 axes: power spectrum ∆²(k) and power spectrum/smooth model versus wave number k in h Mpc⁻¹. Figure 5 axes: ℓ(ℓ+1)Cℓ/2π in µK² versus multipole moment ℓ, equivalently angular scale from 90° down to 0.2°; panels show the TT and TE cross power spectra, with a Λ–CDM fit to all data and points from WMAP, CBI and ACBAR.)