The Probability Sampling Tradition in a period of crisis
Q2010 Keynote speech
Carl-Erik Särndal
Université de Montréal
The Probability Sampling Tradition
has governed surveys at National Statistical Institutes (NSI:s) for decades
Breaking a tradition : Not easy …
Background
The merits of probability sampling, also known as scientific sampling, are put in question by severe imperfections : non-sampling errors, economic pressures etc.
The problem not new – but more and more compelling
Background
The probability sampling process
• is expensive (through follow-ups);
• its theoretical merits are compromised (by nonresponse, etc.)
• “a few extra %” amount to very little
• alternative data collection methods exist
Yet probability sampling continues to be practiced. Wasteful ? Can we do without probability sampling?
My view
is a (Canadian) theoretician’s view
on (official) statistics production
To what extent guided by (statistical science) theory ?
Something we admire: Being able to predict facts about the world we live in by theoretical arguments and deduction
This is the predictive power of science
In statistics: Want precise statements, backed by convincing theory, of level of unemployment, of industrial production, and so on
Theory as a basis for science (knowledge)
Theory as a basis for science
Gérard Jorland : How is it possible that one can predict, merely by theoretical deductions, the existence of a new planet, or a new chemical element, or a new elementary particle?
Based only on a calculus, on a set of mathematical equations ... remarkable achievement of the human mind.
Famous example: Planet Neptune was “found” by mathematical prediction by Le Verrier 1846, then empirically observed by Galle, at the position given by Le Verrier
Many other examples come from physics, astronomy, chemistry
A hypothesis to test:
The sciences are predictive to the extent that they are mathematically formulated.
But that hypothesis is rejected : Today, Economics is highly mathematical and theoretical, but such arguments did not predict the current economic crisis, for example.
The contrast
Physics: Predictive power of formal theory very high
Economics: Predictive power of formal theory low
So “science formulated mathematically” does not guarantee “predictive power of theory”
Why then are Physics and Economics different? Both are theoretical (mathematical) .
Contrasts
Physics : the objects (planets, elementary particles, and so on) are inanimate ; predictive power very high
Economics : the objects and the participants (human beings) are unpredictable, relationships highly complex; predictive power very low
Theory as a guide in statistics production
Our ambition : Create knowledge (predictions) about our world through statistical surveys .
To what extent is this activity supported by theory ? To what extent scientific ?
Legitimate questions !
Some NSI:s take pride in “scientific principles”.
Sampling = Limiting attention to a small subset
To what extent scientific ?
We accept without hesitation that observing only n = 1,000 (or a few thousand) is enough - but provided the sample is “scientific”
What is a scientific sample ? The Roper Center, Univ. of Connecticut, says :
A scientific sample is a process in which respondents are chosen randomly by one of several methods. The key component in the scientific sample is that everyone within the designated group (sample frame) has a chance of being selected.
We may add : Such a sample is also known as a probability sample. It is not necessarily a representative sample in the sense “all have the same probability”.
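To make the distinction concrete: in a probability sample each unit's inclusion probability is known and positive, though not necessarily equal, and estimation weights each observation by the inverse of that probability. A minimal Python sketch of this inverse-probability (Horvitz-Thompson) weighting follows; the function name and the toy numbers are illustrative assumptions, not part of the talk.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total from a probability sample.

    Each sampled value is weighted by 1 / (its inclusion probability),
    so unequal selection probabilities are allowed as long as each one
    is known and strictly positive.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("need one inclusion probability per sampled value")
    if any(not (0 < p <= 1) for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical sample: two units drawn with different probabilities.
estimate = horvitz_thompson_total([10.0, 20.0], [0.5, 0.25])
print(estimate)
```

A unit selected with probability 0.25 "stands for" four population units, which is why an unequal-probability sample can still support valid estimation without being representative in the equal-chance sense.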
scientific sample
probability sample
representative sample
Around these terms, unfortunate ambiguity and confusion reign in the literature and in conversation
Ask, and you get a variety of responses
Sampling = Limiting attention to a small subset
Two contrasting examples:
Sampling trees in a forest - to predict volume
Sampling human beings in a country - to predict (assess) unemployment, or health conditions, or expenditures
Estimating volume of wood on a sample of trees
With classical probability sampling theory, we get not only a figure for the total volume of wood in the forest, but also a statement of its margin of error, free of any assumptions.
We can determine exactly the accuracy we want.
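The forest case can be sketched in a few lines of Python. This is a minimal illustration assuming simple random sampling without replacement and error-free measurement of each sampled tree; the function names are hypothetical, and the 1.96 multiplier relies on the usual large-sample normal approximation, which goes slightly beyond the pure design-based argument.

```python
import math
import statistics


def estimate_total_srs(sample_values, population_size, z=1.96):
    """Expansion estimate of a population total, with its margin of error,
    under simple random sampling without replacement (assumed design)."""
    n = len(sample_values)
    mean = statistics.fmean(sample_values)
    s2 = statistics.variance(sample_values)  # sample variance, divisor n - 1
    total_hat = population_size * mean
    # Design-based variance estimator, including the finite population correction.
    var_hat = population_size ** 2 * (1 - n / population_size) * s2 / n
    return total_hat, z * math.sqrt(var_hat)


def required_sample_size(s2_guess, population_size, target_margin, z=1.96):
    """Smallest n whose margin of error for the total is at most target_margin,
    given a prior guess s2_guess of the variance between units."""
    n0 = (z * population_size) ** 2 * s2_guess / target_margin ** 2
    return math.ceil(n0 / (1 + n0 / population_size))


# Hypothetical use: volumes (in cubic metres) of 5 sampled trees out of 1,000.
total, margin = estimate_total_srs([0.8, 1.1, 0.9, 1.4, 1.0], population_size=1000)
print(total, margin)
```

The second function reflects the claim that the desired accuracy can be fixed in advance: given a target margin, the design determines the sample size needed to reach it.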
Estimating unemployed on a sample of people
We get a figure from the Labour Force Survey (LFS), but we cannot quantify its margin of error. There is no objective declaration of numerical quality,
because several errors go unmeasured : nonresponse error, measurement error, frame error, recording and data handling error, and so on
The contrast
Trees are inanimate objects, like planets
Human beings, they are precisely that, human : inconsistent, emotional, prone to error
The contrast
Trees : Predictive power of probability sampling theory very high – objects do not “cause trouble”
People : Predictive power of sampling theory very low - the survey is complex; human beings are involved
A large scale statistical investigation (survey) :
“Unpredictable people are involved at so many points of this incredibly complex process”
so we will never have a theory that will allow precise measurement of total survey error
(Stanley McCarthy 2001)
Producing numbers is (relatively) easy ; by comparison, stating their accuracy is difficult
Article by Platek and Särndal : Can a statistician deliver ?
J. Official Statistics vol. 17 (2001), pp. 1 – 127
with 16 discussions
and a rejoinder by the authors
Can a statistician fulfill the promise (to society) ?
Upon rereading : Have we advanced any, in 10 years ?
The title : Can a statistician deliver ?
“Statistician” may denote
the head of a National Statistical Institute (NSI)
or
a person expert in the subject (labour market, or health issues, or manufacturing industry, etc.)
or
a person trained in statistical science (methodologist)
As expected, feelings conveyed were of two kinds:
high ranking NSI officials: “Keep the ship sailing”, despite difficult times
academics and researchers: Regret the absence of a more solid (theoretical) base for (national) statistics production
Three themes are prominent in the 16 discussions (summarized in the authors’ rejoinder) :
The role of theory
The scientific and professional credo of the NSI
The concept of quality in regard to the NSI’s activity
The uncertain future of the NSI
I. Fellegi (Statistics Canada) on survival of the NSI. “Survival beyond quality” depends on
• Respect for respondents, and
• Credibility of information; Accuracy is an important part, but so are Relevance, Transparency & others
The uncertain future of the NSI
I. Fellegi : A life and death question for the NSI is
credibility :
Information that is not believed will not be used, and the NSI has no function any more.
Can the NSI count on future high co-operation and truthful response ? -
More and more doubtful.
Believing numerical information
We have no objective measures of “margin of error”
But what about the Total Survey Error model ? (US Bureau of the Census, around 1950)
It recognizes total error as a sum of a number of components.
Can we not use these equations, this theory ?
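The additive idea can be written in a generic form. The notation below is an illustrative sketch, not the Census Bureau's original formulation: it assumes that the error sources act additively and that covariance terms between them can be neglected.

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathrm{Var}(\hat{\theta}) + B(\hat{\theta})^{2},
\qquad
B(\hat{\theta}) = B_{\text{frame}} + B_{\text{nonresponse}}
                + B_{\text{measurement}} + B_{\text{processing}},
\qquad
\mathrm{Var}(\hat{\theta}) = V_{\text{sampling}} + V_{\text{nonsampling}} .
```

Of these terms, typically only the sampling variance can be estimated routinely from the sample itself; the others call for separate evaluation studies.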
Believing information
The Total Survey Error model
• helped us to focus on specific components of total error
• disappointed us by failing to provide routine measures for the numerical quality of published statistics.
Believing information
Discussants of Can a statistician deliver ? deliver “a death sentence” on the TSE model :
“Unattainable and unrealistic ideal”
“Utopian project”
“Unrealistic utopian dream”
Theory is there, but it does not work
Some say: We choose not to use it
In question are the notions of “probability” and “probable error”
Statistics Canada Quality Guidelines (1998)
describes Survey Methodology as : “A collection of practices, backed by some theory and empirical evaluation, among which practitioners have to make sensible choices in the context of a particular application”
A patchwork of theories, one for questionnaire design, one for motivating response, one for data handling and editing, one for imputation, one for estimation in small areas, and so on
Fragmentation …
European Statistics Code of Practice (2005)
Sound methodology must underpin quality statistics. This requires adequate tools, procedures and expertise. The overall methodological framework of the statistical authority follows European and other international standards, guidelines, and good practices ... Survey designs, sample selections, and sample weights are well based and regularly reviewed, revised or updated …
(Emphasis is mine.) A “be-good” encouragement; what about “scientific underpinnings” ?
The stark reality
“Good practice” is the guide, not theory .
Numerical quality is not assured .
Large errors probably not infrequent; most go undetected .
So what ? - Other important professions are also guided by a bunch of “good practices”
The NSI’s situation
Its work is guided by “a collection of practices supported by some theory” plus requirement to keep response burden low
With this frail and fragmented base, the NSI must produce reliable Official Statistics, for the good of the nation, a solid basis for policy decisions
Not an enviable situation and a threat to NSI’s existence…
The Probability Sampling Tradition (born in 1930’s)
created the concept of Nonresponse Rate :
“the selected objects” (the probability sample) as compared with
“the data delivering objects” (the respondents)
We measure, steadfastly, sometimes misguidedly, the size ratio of those two sets
Our obsession with the Nonresponse Rate
When the NR rate was 2%, nobody worried
Now that the NR rate is around 50%, we worry
• Intuitively because nonresponse may be systematically related to target variable values
• Probabilistically because “making the observation” (getting the response) has an unknown probability; the theory capsizes
The believers in Probability Sampling regret that the theory cannot cope
The non-believers : Why worry about the NR rate ? Just collect some reasonably good data from a reasonably representative set of objects.
Our obsession with Nonresponse Rates
Why not (in the manner of some private survey institutes) just get data from “a reasonably representative set of co-operative objects”, and not bother with this stifling concept of the Nonresponse Rate ?
It is time that NSI:s deliver a strong endorsement of the Probability Sampling Tradition – if this is what they really believe in; otherwise, act accordingly
Our obsession with Nonresponse Rates
NR rate itself is a poor indicator of NR bias,
of “accuracy of estimates”
See for ex. Groves (2006), Schouten (2009)
Särndal and Lundström (2008)
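Why the rate alone says little can be seen from a standard approximation: the bias of the unadjusted respondent mean is roughly the population covariance between response propensity and the study variable, divided by the mean propensity. The Python sketch below illustrates this; the function name and the propensities are hypothetical, and in a real survey the propensities are unknown, which is precisely the difficulty.

```python
def respondent_mean_bias(y, propensities):
    """Approximate bias of the unadjusted respondent mean.

    Computed as Cov(propensity, y) / mean(propensity) over the population,
    which equals sum(p * y) / sum(p) - mean(y).
    """
    n = len(y)
    y_bar = sum(y) / n
    rho_bar = sum(propensities) / n
    cov = sum((p - rho_bar) * (v - y_bar) for p, v in zip(propensities, y)) / n
    return cov / rho_bar


# Toy population of four people; y = 1 means "unemployed" (made-up data).
y = [0, 0, 1, 1]

# 50% response rate, propensity unrelated to y.
print(respondent_mean_bias(y, [0.5, 0.5, 0.5, 0.5]))

# 90% response rate, but the unemployed respond less often.
print(respondent_mean_bias(y, [1.0, 1.0, 0.8, 0.8]))
```

In this toy case the survey with the lower response rate is unbiased while the one with the higher rate is not: the bias is driven by how propensity relates to the target variable, not by the rate itself.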
Conclusions
What options remain for the NSI today, to show their superior capacity to produce “serious numbers” amidst a deluge of “junk information” ?
The underpinnings may be just “a collection of practices”, but still, the NSI is the model of statistical competence in the nation - and it must demonstrate this !
Media criticism of the NSI sometimes harsh.
Conclusions
The NSI’s delicate balancing act
vis-à-vis
• The national government : fulfill the mandate
• The world of theory and learning : show “scientific credibility”
• The other (private) producers of statistics : tough competition
• The supra-agency (EuroStat) : dictates
Conclusions
A fact is that the quality component accuracy cannot be measured (probabilistically).
Yet this is what users want desperately to have measured.
When important numbers are proven wrong (by users), trust in the NSI suffers
Other numbers may be wrong, but go unnoticed - and may not matter much .
Conclusions
The Probability Sampling (Scientific Sampling) tradition is a reflection of an idyllic past - now we are in 2010, not 1950.
On what grounds is it still defensible, in our time?
It is a challenge to the NSI, and to the academics (the theoreticians), to provide the answers
Conclusions
The NSI vis-à-vis the scientific world : a sometimes hesitant relationship:
Most NSI:s have a scientific (academic) advisory board
NSI:s look to the learned world for support and acceptance
NSI:s own investment in research may (understandably) be limited.
Implementing new theory into the NSI's production has met with obstacles
Conclusions
Relationship of the NSI to the world of learning; an empirical investigation, see
Risto Lehtonen and Carl-Erik Särndal : Research and Development in Official Statistics and Scientific Co-operation with Universities: A Follow-Up Study , J. Official Statistics (2010)