The 6th Conference on Survey Sampling in Economic and Social Research
September 21-22, 2009, Katowice, Poland
Criticalities in Applying the Neyman’s Optimality in Business Surveys: a Comparison
of Selected Allocation Methods
Paola M. Chiodini a,d, Rita Lima c , Giancarlo Manzi b,d, Bianca Maria Martelli c,*, Flavio Verrecchia d
a. Department of Statistics, Università di Milano-Bicocca, Milan, Italy b. Department of Economics, Business and Statistics, Università degli Studi di Milano, Milan, Italy c. ISAE, Rome, Italy d. ESeC, Assago (MI), Italy
September, 21-22 2009, 6th Conference “Survey Sampling in Economic and Social Research “ , Katowice, Poland
AIM OF THE PAPER
DISCUSS POSSIBLE MORE EFFICIENT SAMPLE DESIGNS FOR THE ISAE BUSINESS TENDENCY SURVEY (BTS), CONSIDERING:
– BTS economic features
– BTS statistical features
– Operational bounds
TO MEET EVERYBODY’S NEEDS WHILE STRENGTHENING THE RELIABILITY OF OUTCOMES (INDUSTRIAL CONFIDENCE)
BTS ECONOMIC FEATURES
• Business Tendency Surveys (BTS) investigate the CONFIDENCE of economic agents
• CONFIDENCE can be defined as the (positive) attitude of economic agents toward both firms’ (internal) and country’s (external) variables
  – The corresponding real value in the universe is unknown
• To this purpose, BTS collect information on a wide range of variables selected for their capability, when analysed together, to give an overall picture of the industrial sector of the economy (OECD 2003)
• The survey asks entrepreneurs and managers for assessments of current trends and expectations for the near future, regarding both their own business and the general situation of the economy
• Business Tendency Surveys thus collect qualitative information, mainly on a three-option ordinal scale
CONFIDENCE
• Answers obtained from the survey are quantified in the form of “balances”, i.e. differences between the percentages of positive and negative answers
• The statistical series derived from business tendency surveys are particularly suitable for monitoring and forecasting business cycles
• The aggregation of selected series (order book level, production expectations and stocks) gives the confidence indicator
• Confidence indicators (and some single series too) often have leading capabilities and are widely used in the analysis of the economic cycle (recessions/expansions)
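As an illustration of how a balance is obtained from three-option answers, here is a minimal Python sketch. The function names and the toy answers are hypothetical. The aggregation rule (simple average of the three balances, with stocks entering with inverted sign) follows the usual European Commission convention and is an assumption here, since the slide does not spell out the formula; the two-stage weighting used in the actual survey is ignored.

```python
from collections import Counter

def balance(answers):
    """Balance = percentage of positive answers minus percentage of negative answers.

    `answers` is a list of "+", "=" or "-" (three-option ordinal scale).
    Unweighted sketch: the real survey applies sample and size weights.
    """
    counts = Counter(answers)
    return 100.0 * (counts["+"] - counts["-"]) / len(answers)

def confidence_indicator(order_books, production_expectations, stocks):
    """Average of the three balances; stocks enter with inverted sign.

    Assumption: EC-style aggregation, not confirmed by the slides.
    """
    return (order_books + production_expectations - stocks) / 3.0

# Toy example: 30% positive, 50% unchanged, 20% negative answers
answers = ["+"] * 30 + ["="] * 50 + ["-"] * 20
print(balance(answers))                      # 10.0
print(confidence_indicator(10.0, 4.0, 2.0))  # 4.0
```

A published index such as "2000=100" would require a further rescaling step that is not shown here.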
SHORT SURVEY HISTORY
• The manufacturing survey began in 1959 on a quarterly basis and became monthly in 1962, with a limited number of questions (purposive panel)
• Over the years the survey was substantially modified to meet emerging needs:
  – 1986: the sample was updated to provide information at the regional level as well, adopting a stratified (sector/region/size), partially random sample
  – 1998: Neyman’s optimal allocation of the reporting units to sample strata, based on workforce variance, was introduced (Cochran 1977)
  – 2003: data processing was upgraded by introducing a two-stage weighting system (sample weights and size weights) in line with OECD (2003), ensuring full comparability between local and national data
GDP AND CONFIDENCE
• Confidence fits the GDP shifts well
• In recent times (since April 2009) the survey has given positive signals (last available GDP figures, Q2 2009: very negative)
[Figure: GDP (% change on t-4, left-hand scale, -6 to 6) and Confidence (index, 2000=100, seasonally adjusted, right-hand scale, 60 to 105)]
EUROPEAN REFERENCE FRAME
• The survey is part of the Joint Harmonised Business and Consumer Survey (BCS) programme of the European Commission
• The project began in 1962 and ISAE (formerly ISCO) was one of the founding members
• The principle of harmonisation underlying the project aims to produce a set of comparable data for all European countries (EC 2007)
• To achieve this goal institutes have to:
  – Use the same harmonised questionnaire
  – Strictly respect the Commission timetable in carrying out the survey and transmitting the results
• Institutes are relatively free to define any other aspect of the entire process (apart from a minimum sample size)
BTS STATISTICAL FEATURES: SAMPLE DESIGN
• FRAME: ASIA archive of Italian active firms (last update 2006)
  + complete universe of firms
  – relatively late update
  → Keep ASIA as FRAME
• QUESTIONNAIRE: fixed by the Commission; can only be supplemented
• DATA COLLECTION MODE: CATI (Computer-Assisted Telephone Interviewing), partly integrated with fax (some CAWI foreseen)
  → MIXED MODE
OPERATIONAL CONSTRAINTS
– EC:
  • Recommended SAMPLE SIZE of about 4000 units (firms/kind-of-activity units), bound to the country population size
  • Very strict TIMING CONSTRAINTS:
    – MONTHLY FREQUENCY
    – 12 DAYS FOR DATA COLLECTION
    – 1 WEEK FOR PROCESSING RESULTS
– NATIONAL: LOCAL INFORMATION
  • Governmental priority
  • Possible revenues
– ISAE: PRESERVING “LOYAL” FIRMS
  • Research purposes of longitudinal analyses
  • Conflicts with sampling theory (panel rotation)
BTS STATISTICAL FEATURES
As the total sample size is predetermined (about 4000 units), precision can mainly be increased by working on:
– Strata definition (partially predetermined and bound to economic and administrative settings)
– Units’ allocation to strata
– Panel maintenance
– Non-response handling
– Weighting
STRATA DEFINITION
STRATA are defined according to:
• ECONOMIC SECTORS
  – 19, close to the EC requirements, adapted to the Italian economy
• AREAS (NUTS1)
  – 4, administrative classification, widely different in size
• FIRMS’ SIZE (by workforce)
  – Small (10-49), Medium (50-249), Large (>=250). The distribution is right (positively) skewed because of the presence of few “large” establishments and many “small” units
• Minimum threshold of 10 employees
  – About 80% of total workforce
FIRMS BY STRATA
(a dot marks an empty stratum)

| Sector | Nord Ovest 10-49 | Nord Ovest 50-249 | Nord Ovest 250+ | Nord Est 10-49 | Nord Est 50-249 | Nord Est 250+ | Centro 10-49 | Centro 50-249 | Centro 250+ | Sud e Isole 10-49 | Sud e Isole 50-249 | Sud e Isole 250+ | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10-12. Manufacture of food, beverages and tobacco products | 1496 | 233 | 56 | 1715 | 277 | 41 | 1082 | 90 | 12 | 1856 | 175 | 12 | 7045 |
| 13. Manufacture of textiles | 1584 | 342 | 55 | 575 | 88 | 10 | 1047 | 82 | 4 | 272 | 32 | 3 | 4094 |
| 14. Manufacture of wearing apparel | 1472 | 140 | 23 | 1856 | 151 | 23 | 1230 | 102 | 9 | 1298 | 96 | 6 | 6406 |
| 15. Manufacture of leather and related products | 318 | 46 | 1 | 893 | 125 | 12 | 2053 | 141 | 9 | 625 | 52 | 3 | 4278 |
| 16-17. Manufacture of wood and paper products | 1239 | 154 | 17 | 1404 | 168 | 15 | 860 | 89 | 10 | 750 | 44 | 3 | 4753 |
| 18. Printing and reproduction of recorded media | 966 | 75 | 10 | 740 | 60 | 5 | 505 | 33 | 1 | 317 | 21 | . | 2733 |
| 19. Manufacture of coke and refined petroleum products | 32 | 9 | 7 | 16 | 5 | . | 19 | 7 | 4 | 78 | 7 | 4 | 188 |
| 20-21. Manufacture of chemical and pharmaceutical products | 680 | 304 | 90 | 328 | 111 | 14 | 227 | 67 | 32 | 221 | 27 | 1 | 2102 |
| 22. Manufacture of rubber and plastic products | 1511 | 292 | 42 | 939 | 190 | 15 | 546 | 93 | 5 | 449 | 61 | 5 | 4148 |
| 23. Manufacture of other non-metallic mineral products | 938 | 126 | 18 | 1265 | 226 | 49 | 857 | 111 | 15 | 1204 | 104 | 1 | 4914 |
| 24. Manufacture of basic metals | 652 | 208 | 41 | 275 | 119 | 12 | 160 | 35 | 6 | 147 | 34 | 4 | 1693 |
| 25. Manufacture of fabricated metal products, except machinery and equipment | 6199 | 622 | 41 | 4528 | 438 | 28 | 1949 | 175 | 11 | 1941 | 219 | 11 | 16162 |
| 26. Manufacture of computer, electronic and optical products | 717 | 145 | 28 | 439 | 97 | 18 | 260 | 57 | 15 | 113 | 24 | 5 | 1918 |
| 27. Manufacture of electrical equipment | 1050 | 206 | 35 | 808 | 161 | 29 | 362 | 54 | 15 | 194 | 27 | 4 | 2945 |
| 28. Manufacture of machinery and equipment n.e.c. | 3243 | 692 | 82 | 2830 | 595 | 104 | 739 | 121 | 8 | 523 | 60 | 3 | 9000 |
| 29-30. Manufacture of transport vehicles | 699 | 183 | 84 | 396 | 95 | 30 | 313 | 70 | 13 | 237 | 85 | 20 | 2225 |
| 31. Manufacture of furniture | 927 | 95 | 4 | 1730 | 242 | 19 | 931 | 96 | 7 | 527 | 63 | 6 | 4647 |
| 32. Other manufacturing | 619 | 82 | 12 | 719 | 111 | 8 | 506 | 36 | 4 | 185 | 6 | 1 | 2289 |
| 33. Repair and installation of machinery and equipment | 1450 | 81 | 6 | 979 | 46 | 2 | 657 | 31 | 4 | 849 | 62 | 3 | 4170 |
| Total | 25792 | 4035 | 652 | 22435 | 3305 | 434 | 14303 | 1490 | 184 | 11786 | 1199 | 95 | 85710 |
TOTAL WORKFORCE BY STRATA
(a dot marks an empty stratum)

| Sector | Nord Ovest 10-49 | Nord Ovest 50-249 | Nord Ovest 250+ | Nord Est 10-49 | Nord Est 50-249 | Nord Est 250+ | Centro 10-49 | Centro 50-249 | Centro 250+ | Sud e Isole 10-49 | Sud e Isole 50-249 | Sud e Isole 250+ | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10-12. Manufacture of food, beverages and tobacco products | 27893 | 23432 | 40923 | 31525 | 28191 | 32670 | 18935 | 8486 | 7621 | 32959 | 16070 | 5503 | 274208 |
| 13. Manufacture of textiles | 32739 | 33020 | 26112 | 10710 | 7955 | 6052 | 18756 | 6527 | 1551 | 5062 | 2917 | 905 | 152306 |
| 14. Manufacture of wearing apparel | 25793 | 13141 | 16920 | 32975 | 14787 | 11170 | 22033 | 8594 | 3734 | 24100 | 7785 | 3805 | 184838 |
| 15. Manufacture of leather and related products | 5503 | 4124 | 386 | 17289 | 12460 | 4817 | 36840 | 10993 | 6341 | 11089 | 4621 | 1071 | 115532 |
| 16-17. Manufacture of wood and paper products | 22614 | 15000 | 13008 | 25842 | 16333 | 7002 | 15504 | 7683 | 5124 | 12912 | 4316 | 1876 | 147213 |
| 18. Printing and reproduction of recorded media | 17487 | 7378 | 4488 | 13377 | 5689 | 2794 | 8674 | 2984 | 2383 | 5279 | 1549 | . | 72081 |
| 19. Manufacture of coke and refined petroleum products | 707 | 1334 | 4508 | 372 | 510 | . | 465 | 746 | 2259 | 1360 | 473 | 3419 | 16152 |
| 20-21. Manufacture of chemical and pharmaceutical products | 14955 | 32746 | 58762 | 7149 | 11242 | 8071 | 4343 | 7667 | 32378 | 4106 | 2811 | 261 | 184492 |
| 22. Manufacture of rubber and plastic products | 30241 | 27601 | 28723 | 19022 | 18094 | 5443 | 10691 | 8381 | 2428 | 9049 | 6394 | 3247 | 169315 |
| 23. Manufacture of other non-metallic mineral products | 17815 | 12261 | 14026 | 24414 | 23525 | 28662 | 15779 | 10672 | 6612 | 21637 | 9113 | 2218 | 186734 |
| 24. Manufacture of basic metals | 13577 | 21680 | 46749 | 5770 | 13307 | 8195 | 3262 | 3637 | 5906 | 2986 | 3352 | 1818 | 130240 |
| 25. Manufacture of fabricated metal products, except machinery and equipment | 112440 | 55834 | 17454 | 84656 | 39819 | 12084 | 34463 | 15080 | 4608 | 35425 | 19366 | 3622 | 434851 |
| 26. Manufacture of computer, electronic and optical products | 14252 | 14888 | 30018 | 8887 | 9793 | 9102 | 4951 | 5539 | 13192 | 2110 | 2744 | 4238 | 119713 |
| 27. Manufacture of electrical equipment | 20542 | 20544 | 31418 | 16787 | 16037 | 22000 | 6837 | 5834 | 15590 | 3759 | 2206 | 2206 | 163760 |
| 28. Manufacture of machinery and equipment n.e.c. | 64091 | 68542 | 45279 | 57244 | 58170 | 62515 | 14439 | 11284 | 7747 | 10022 | 5096 | 1063 | 405491 |
| 29-30. Manufacture of transport vehicles | 14322 | 19757 | 113600 | 8376 | 9871 | 30848 | 6087 | 7599 | 10836 | 4722 | 8721 | 33591 | 268329 |
| 31. Manufacture of furniture | 16196 | 9189 | 1547 | 32435 | 21947 | 7379 | 16910 | 8348 | 3290 | 9853 | 5070 | 4874 | 137038 |
| 32. Other manufacturing | 11120 | 8104 | 3685 | 13714 | 10307 | 13520 | 8991 | 3137 | 1794 | 3199 | 612 | 326 | 78509 |
| 33. Repair and installation of machinery and equipment | 24766 | 7353 | 4053 | 16932 | 3820 | 723 | 11118 | 2579 | 5974 | 15474 | 5411 | 1436 | 99639 |
| Total | 487054 | 395926 | 501659 | 427476 | 321858 | 273047 | 259079 | 135768 | 139368 | 215102 | 108625 | 75479 | 3340442 |
FIRMS POPULATION BY SIZE
[Figure: firms population by size class (10-49, 50-249, 250 and over, Total)]
SIMULATION
UNIT ALLOCATION TO STRATA: SIMULATION SETTINGS
• REFERENCE POPULATION: ASIA INDUSTRIAL SECTOR
  – 85710 ENTERPRISES
  – 3340442 PERSONS EMPLOYED
• 3 DIMENSIONS
  – AREAS (NUTS1)
  – ECONOMIC SECTORS
  – FIRMS’ SIZE
• 226 STRATA
• 500 REPLICATES
• SIMULATION TECHNIQUE: SEQUENTIAL UNIT SELECTION
UNIT ALLOCATION TO STRATA: ALTERNATIVE ALLOCATION METHODS
• UNIFORM (21 units per stratum)
• PROPORTIONAL (sampling fraction fh = 4.4%)
• NEYMAN (x-optimal)
• ISAE (NEYMAN x-optimal on areas; winsorised 5%)
• AOSU(n1h): UNIFORM(n1h) + NEYMAN(n2h)
  – n1h = 1, 2, … , 21
  – n2h = nh - n1h
  – so that:
    • n1h = 0 then AOSU0 = NEYMAN
    • n1h = 21 then AOSU21 = UNIFORM
• APSU(n1h): UNIFORM(n1h) + PROPORTIONAL(n2h)
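The allocation rules listed above can be sketched as follows. This is a minimal Python sketch under assumptions not stated on the slide: Neyman shares take the textbook form n_h proportional to N_h·S_h (Cochran 1977), fractional allocations are rounded with a largest-remainder rule, and AOSU(n1) is read as "n1 units to every stratum, the remaining units by Neyman". All function names and the toy N and S values are hypothetical; the ISAE winsorisation and the cap n_h <= N_h are left out.

```python
from math import floor

def largest_remainder(shares, n):
    """Turn shares (summing to 1) into integer allocations that sum to n."""
    raw = [s * n for s in shares]
    alloc = [floor(r) for r in raw]
    # Leftover units go to the strata with the largest fractional parts
    order = sorted(range(len(raw)), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in order[: n - sum(alloc)]:
        alloc[i] += 1
    return alloc

def uniform(N, n):
    """Same number of units in every stratum (n assumed divisible by H)."""
    return [n // len(N)] * len(N)

def proportional(N, n):
    """n_h proportional to the stratum size N_h."""
    total = sum(N)
    return largest_remainder([N_h / total for N_h in N], n)

def neyman(N, S, n):
    """n_h proportional to N_h * S_h (S_h = stratum std of the x variable)."""
    weights = [N_h * S_h for N_h, S_h in zip(N, S)]
    total = sum(weights)
    return largest_remainder([w / total for w in weights], n)

def aosu(N, S, n, n1):
    """Assumed reading of AOSU(n1): n1 units per stratum, the rest by Neyman."""
    rest = neyman(N, S, n - len(N) * n1)
    return [n1 + r for r in rest]

# Toy population: three strata (many small, some medium, few large firms)
N = [1000, 200, 20]      # stratum sizes
S = [5.0, 40.0, 300.0]   # stratum standard deviations of workforce
n = 60                   # total sample size

print(uniform(N, n))
print(proportional(N, n))
print(neyman(N, S, n))
print(aosu(N, S, n, n1=10))
```

With n1 = 0 the sketch reduces to Neyman, and with n1 = n/H it reduces to uniform allocation, mirroring the two limiting cases given on the slide.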
UNIT ALLOCATION TO STRATA: SIMULATION METHOD
[Flowchart: START → RANDOM UNIT SELECTION (SEQUENTIALLY RANKED) → REPLICATION (loop back if replicate < 500; continue if replicate = 500) → Simulation DW → Allocation Methods (Neyman samples, ISAE samples, AOSU(n1) samples, …) → OUTPUT → INFERENCE (OVERALL STATS, DOMAIN STATS) → END]
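The replication scheme in the flowchart can be sketched roughly as below. This is a minimal Python sketch, not the authors' implementation: the population is synthetic, the names (`draw_sample`, `estimate_total`, `replicate`) are hypothetical, "sequential unit selection" is read as "give every unit a random rank and keep the first n_h", and the total workforce is estimated with the plain stratified expansion estimator.

```python
import random
from statistics import mean

def draw_sample(stratum_values, n_h, rng):
    """Rank units at random and keep the first n_h (assumed reading of the slide)."""
    ranked = sorted(stratum_values, key=lambda _: rng.random())
    return ranked[:n_h]

def estimate_total(population, allocation, rng):
    """Stratified expansion estimator: sum over strata of N_h * sample mean."""
    total = 0.0
    for stratum, values in population.items():
        n_h = allocation[stratum]
        if n_h == 0:
            continue  # uncovered stratum contributes nothing to the estimate
        total += len(values) * mean(draw_sample(values, n_h, rng))
    return total

def replicate(population, allocation, replicates=500, seed=0):
    """Repeat the draw `replicates` times and collect the estimated totals."""
    rng = random.Random(seed)
    return [estimate_total(population, allocation, rng) for _ in range(replicates)]

# Synthetic population: workforce of each firm, grouped by stratum
gen = random.Random(42)
population = {
    "small": [gen.randint(10, 49) for _ in range(400)],
    "medium": [gen.randint(50, 249) for _ in range(60)],
    "large": [gen.randint(250, 2000) for _ in range(10)],
}
allocation = {"small": 10, "medium": 10, "large": 5}

estimates = replicate(population, allocation)
true_total = sum(sum(values) for values in population.values())
print(true_total, round(mean(estimates)))
```

Running the same loop once per allocation method gives one set of 500 estimated totals per method, which is the input of the overall and domain statistics.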
DISTRIBUTION OF REPLICATION (Total workforce)
[Figure: distribution of the replications by allocation method: Neyman, ISAE, AOSU3, AOSU9, UNIFORM, PROP.]
OVERALL POPULATION
REPLICATION BOX PLOT (Total workforce)
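STATISTICS
• $\mathrm{Bias} = N\mu - N\mu_r$
• Total Error: $\mathrm{TE} = |\mathrm{Bias}| + N\sigma_r$
• Relative Total Error: $\mathrm{RTE} = \mathrm{TE} / (N\mu_r)$
• $\mathrm{Range} = \max(N\mu_r) - \min(N\mu_r)$
• Where:
  – $\mu$: population mean
  – $\mu_r$: replication mean
  – $\sigma_r$: replication STD
  – $N$: number of enterprises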
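One possible reading of these statistics, applied to a list of replicated total estimates, is sketched below in Python. The function name `overall_stats` is hypothetical, and two choices are assumptions: the replication STD is taken as the population-style standard deviation of the replicate estimates, and the range is taken over the individual replicate estimates.

```python
from statistics import mean, pstdev

def overall_stats(true_total, estimates):
    """Bias, STD, TE, RTE and range of replicated estimates of a total.

    true_total : population total (N times the population mean)
    estimates  : one estimated total per replicate
    """
    centre = mean(estimates)        # N times the replication mean
    spread = pstdev(estimates)      # N times the replication STD (assumed pstdev)
    bias = true_total - centre
    te = abs(bias) + spread         # total error
    return {
        "bias": bias,
        "std": spread,
        "te": te,
        "rte": te / centre,         # relative total error
        "range": max(estimates) - min(estimates),
    }

stats = overall_stats(102.0, [90.0, 100.0, 110.0])
print(stats)
```

As a check against the figures reported in the slides for isae: |Bias| 283 plus STD 22158 gives TE 22441, and dividing by a total workforce of about 3340442 gives roughly 0.0067, which matches the RTE reported for that method.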
UNIT ALLOCATION: STATISTICS

| Method | \|BIAS\| | STD | TE | RANGE |
|---|---|---|---|---|
| isae | 283 | 22158 | 22441 | 124361 |
| neyman | 135 | 21337 | 21472 | 114837 |
| aosu1 | 520 | 21648 | 22168 | 128626 |
| aosu3 | 922 | 22253 | 23176 | 126329 |
| aosu9 | 345 | 23568 | 23914 | 129781 |
| uniform | 141 | 60956 | 61096 | 326455 |
| apsu3 | 6370 | 123260 | 129630 | 648787 |
| proportional | 11093 | 177073 | 188166 | 1017149 |
REPLICATION BOUNDED BOX PLOT (Total workforce)
BOUNDED UNIT ALLOCATION: STATISTICS
Bounds:
• Maximum 50% allocation per stratum
• Minimum 3 units per stratum

| Method | \|BIAS\| | STD | TE | RANGE |
|---|---|---|---|---|
| aosu3 | 1513 | 46444 | 47957 | 256410 |
| aosu9 | 2724 | 46362 | 49086 | 269084 |
| aosu24 (i.e. uniform) | 803 | 59688 | 60491 | 362685 |
| apsu3 | 6462 | 123244 | 129706 | 644813 |
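The two bounds can be applied to any allocation with a simple clamp, sketched below in Python. The slides do not say how conflicts are resolved in very small strata (where 3 units exceed 50% of the stratum), nor whether the sample is rebalanced afterwards; here the minimum is assumed to take precedence, capped at the stratum size, and no rebalancing is done. The function name `apply_bounds` is hypothetical.

```python
from math import floor

def apply_bounds(alloc, N, min_units=3, max_share=0.5):
    """Clamp each n_h between a minimum count and a maximum share of N_h.

    Assumption: in strata too small for both bounds, the minimum wins,
    but never more than the N_h units actually available.
    """
    bounded = []
    for n_h, N_h in zip(alloc, N):
        lower = min(min_units, N_h)                  # cannot sample more than N_h
        upper = max(floor(max_share * N_h), lower)   # the minimum takes precedence
        bounded.append(min(max(n_h, lower), upper))
    return bounded

# Toy example: an uncovered stratum, an over-sampled one and a tiny one
print(apply_bounds([0, 40, 10], [100, 60, 4]))  # [3, 30, 3]
```

Because clamping changes the total sample size, a complete procedure would redistribute the difference among the unconstrained strata; that step is not shown.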
DOMAIN ANALYSIS
STRATA COVERAGE
• AOSU, UNIF, PROP: 0 strata with 0% allocation
• NEYMAN: 12 strata with 0% allocation
• ISAE: 7 strata with 0% allocation
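STRATA STATISTICS
• $\mathrm{rCV}_s = \sigma_{rs} / \mu_{rs}$
• $\mathrm{Bias}_s = \mu_s - \mu_{rs}$
• Total Error: $\mathrm{TE}_s = |\mathrm{Bias}_s| + \sigma_{rs}$
• Relative Total Error: $\mathrm{RTE}_s = \mathrm{TE}_s / \mu_{rs} = (|\mathrm{Bias}_s| / \mu_{rs}) + \mathrm{rCV}_s$
• Where:
  – $\mu_s$: stratum population mean
  – $\mu_{rs}$: stratum replication mean
  – $\sigma_{rs}$: stratum replication STD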
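The per-stratum statistics, and the worst-case summary taken over strata, can be sketched in Python as follows. The names (`stratum_stats`, `strata`) and the toy figures are hypothetical, and the replication STD is again assumed to be the population-style standard deviation of the replicate means.

```python
from statistics import mean, pstdev

def stratum_stats(true_mean, replicate_means):
    """rCV, bias, TE and RTE for one stratum."""
    m = mean(replicate_means)     # stratum replication mean
    s = pstdev(replicate_means)   # stratum replication STD (assumed pstdev)
    bias = true_mean - m
    te = abs(bias) + s
    return {"rcv": s / m, "bias": bias, "te": te, "rte": te / m}

# Toy input: stratum -> (population mean, replicate means)
strata = {
    "A": (20.0, [18.0, 20.0, 22.0]),
    "B": (100.0, [90.0, 100.0, 125.0]),
}
results = {name: stratum_stats(mu, reps) for name, (mu, reps) in strata.items()}

# Worst stratum by relative total error, as in a "Max(RTE_s)" summary
worst = max(results, key=lambda name: results[name]["rte"])
print(worst, round(results[worst]["rte"], 4))
```

The decomposition RTE_s = |Bias_s| / mean + rCV_s holds by construction, so a large Max(RTE_s) can be traced back either to bias or to replication variability in that stratum.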
STRATA BOX PLOT: |Bias| by strata ($|\mathrm{Bias}_s|$)
STRATA BOX PLOT: CV of replication means by strata ($\mathrm{rCV}_s$)
STRATA BOX PLOT: Relative Total Error by strata ($\mathrm{RTE}_s$)
UNIT ALLOCATION TO STRATA: STATISTICS

| Method | RTE | Max(\|BIAS_s\| / μ_rs) | Max(rCV_s) | Max(RTE_s) |
|---|---|---|---|---|
| isae | 0.0067 | 0.0315 | 0.5664 | 0.5979 |
| neyman | 0.0064 | 0.0315 | 0.5664 | 0.5979 |
| aosu1 | 0.0066 | 0.0244 | 0.4250 | 0.4491 |
| aosu3 | 0.0069 | 0.0202 | 0.2775 | 0.2778 |
| aosu9 | 0.0072 | 0.0141 | 0.1549 | 0.1624 |
| uniform | 0.0183 | 0.0226 | 0.4042 | 0.4152 |
| apsu3 | 0.0388 | 0.0582 | 1.0052 | 1.0094 |
| proportional | 0.0563 | 0.1033 | 1.6645 | 1.6713 |
CONCLUDING REMARKS AND OPEN QUESTIONS
Strata allocation: the best proposals seem to be:
• Overall population: Neyman
• Domain analysis: approach based on Neyman and strata representativeness constraints
  – The AOSU(n1) family
  – ISAE
They strike a balance between theory and practical constraints