Survey of Electronic Commerce and Technology: Past, Present and Future Challenges
Jason Raymond
Third International Conference on Establishment Surveys
June 2007
Outline
  Description of the survey
  Methodology
  Improvements to the sample design
  Weighted outliers
  Future challenges
Description of the survey
Annual survey in place since 1999
Cross-economy survey
  Some exceptions at sub-industry level
Domains of interest: NAICS, size (number of employees)
Description of the survey
Two-page questionnaire with questions on:
  Use of information and communications technologies (Internet, intranet, web site, …)
  Use of electronic commerce for the purchase and sale of goods and services
  Barriers to electronic commerce
Types of questions:
  Mostly categorical
  Some numerical
    Total sales over the Internet
    Percentages
Methodology
Sampling
  Universe
    Statistics Canada’s Business Register
    List of public units
  Target population
    Fixed thresholds of exclusion:
      $100,000 or $250,000 in gross business income, depending on industry
      Covers approximately 95% of income in each industry
    Around 700,000 businesses
Methodology
Sampling
  Stratification
    NAICS3, NAICS4
    Size:
      0 to 19 employees
      20 to 99 employees
      100 to 499 employees
      500 employees and more -> take-all stratum
    Public/private sector
Take-some strata
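The stratification above can be sketched as a small function that maps a business to its stratum. This is only an illustration: the field names (`naics`, `employees`, `public`) and the tuple layout are hypothetical, and the only rule taken from the slides is that businesses with 500 employees or more fall in the take-all stratum.

```python
def size_class(employees: int) -> str:
    """Map a number of employees to one of the four size classes on the slide."""
    if employees < 0:
        raise ValueError("employees must be non-negative")
    if employees <= 19:
        return "0-19"
    if employees <= 99:
        return "20-99"
    if employees <= 499:
        return "100-499"
    return "500+"


def stratum(naics: str, employees: int, public: bool) -> tuple:
    """Build a stratum key (hypothetical layout): industry, size, sector, take-all flag."""
    size = size_class(employees)
    sector = "public" if public else "private"
    take_all = size == "500+"  # 500 employees and more -> take-all stratum
    return (naics, size, sector, take_all)
```

For example, `stratum("5415", 12, False)` gives a private-sector, take-some key in the 0 to 19 size class.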
Methodology
Sampling
  Neyman allocation
  Sample selection
    Sample size: around 19,000 enterprises
    Maximum overlap between two consecutive years:
      Kish and Scott method (1971)
      Approximately 70% overlap
Outlier detection
  Variables:
    Sales over the Internet
    Year-over-year difference in sales over the Internet
  Method: variant of the sigma-gap method
    Distance measure between observations
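The slides do not say which variant of the sigma-gap method the survey uses. As a rough illustration only, the generic form can be sketched as follows: sort the observations, then, starting from a chosen point in the ordered data, flag everything beyond the first gap between consecutive values that exceeds a multiple of the standard deviation. The multiplier `k` and the starting point `start_quantile` are assumptions made for this sketch, not survey parameters.

```python
import statistics


def sigma_gap_outliers(values, k=1.0, start_quantile=0.5):
    """Return indices of upper-tail outliers found by a generic sigma-gap rule.

    k and start_quantile are illustrative assumptions, not survey settings.
    """
    n = len(values)
    if n < 3:
        return []
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return []
    # Indices of the observations in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    start = max(1, int(n * start_quantile))
    for pos in range(start, n):
        gap = values[order[pos]] - values[order[pos - 1]]
        if gap > k * sigma:
            # Everything from the first large gap upward is flagged.
            return sorted(order[pos:])
    return []
```

With `[10, 12, 11, 13, 12, 14, 500]`, only the last observation is flagged, because the gap between 14 and 500 is the first one to exceed one standard deviation.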
Methodology
Partial nonresponse (8.3%): imputation
  Deductive (1%)
  Historical (0.1%)
  Administrative (0.02%)
  Donor (7.2%)
Total nonresponse (31%): reweighting
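The slides do not describe how the reweighting for total nonresponse is carried out. A common approach, shown here purely as a sketch, inflates respondents' weights within an adjustment class so that they also represent the nonrespondents of that class. The record keys (`cls`, `weight`, `responded`) and the choice of adjustment classes are assumptions.

```python
from collections import defaultdict


def reweight_for_nonresponse(units):
    """Return respondents with weights inflated within each adjustment class.

    units: list of dicts with hypothetical keys 'cls', 'weight', 'responded'.
    """
    total = defaultdict(float)       # sum of weights, all sampled units
    responding = defaultdict(float)  # sum of weights, respondents only
    for u in units:
        total[u["cls"]] += u["weight"]
        if u["responded"]:
            responding[u["cls"]] += u["weight"]

    adjusted = []
    for u in units:
        if not u["responded"]:
            continue
        factor = total[u["cls"]] / responding[u["cls"]]
        adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted
```

After the adjustment, the respondents' weights in each class sum to the weight total of the full sample in that class.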
Methodology
Estimation using Statistics Canada’s Generalized Estimation System (GES)
Types of estimates
  Means
  Totals
  Proportions
  Ratios
Data quality measures based on CVs and imputation rates
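For reference, a coefficient of variation (CV) is the standard error of an estimate divided by the estimate itself. The sketch below computes it for an estimated total under stratified simple random sampling using the textbook variance formula. It is not a description of how GES computes its CVs, and the input layout (a list of `(N_h, sample_values)` pairs) is an assumption.

```python
import math
import statistics


def stratified_total_and_cv(strata):
    """Estimate a total and its CV under stratified simple random sampling.

    strata: list of (N_h, sample_values) pairs (hypothetical layout).
    """
    total = 0.0
    variance = 0.0
    for N_h, ys in strata:
        n_h = len(ys)
        total += N_h * statistics.fmean(ys)
        # Take-all strata (n_h == N_h) contribute no sampling variance.
        if 1 < n_h < N_h:
            s2 = statistics.variance(ys)
            variance += N_h ** 2 * (1 - n_h / N_h) * s2 / n_h
    if total == 0:
        return total, float("nan")
    return total, math.sqrt(variance) / abs(total)
```

A take-all stratum adds to the total but not to the variance, which is one reason the largest enterprises are sampled with certainty.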
Improvements to the sample design
When?
  Current sample design tested in 2004 in parallel with the original design and adopted in 2005
Why?
  Improve the comparability of estimates over time
  Need for estimates by size of enterprise
Target population
  Original sampling design:
    Units accounting for 95% of the total income
    Drawback: unstable population over time
  New sampling design
    Fixed thresholds of exclusion: $100,000 or $250,000 depending on the industry
Improvements to the sample design
Stratification and allocation
  Original sampling design
    NAICS3, NAICS4
    Lavallée-Hidiroglou: 2 take-some strata and 1 take-all stratum
    Auxiliary variable: gross business income
    Drawback: not efficient for estimates by size (number of employees)
Improvements to the sample design
New sampling design
  Stratification:
    NAICS3, NAICS4
    Size:
      0 to 19 employees
      20 to 99 employees
      100 to 499 employees
      500 employees and more -> take-all stratum
    Public/private
  Neyman allocation
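Neyman allocation distributes a fixed sample size across the take-some strata in proportion to N_h × S_h, the stratum population size times the stratum standard deviation of the variable of interest. A minimal sketch, assuming the S_h values are available from an earlier cycle or an auxiliary variable (the slides do not say which is used):

```python
def neyman_allocation(n, strata):
    """Allocate n sampled units across strata in proportion to N_h * S_h.

    strata: dict mapping a stratum name to (N_h, S_h). Returns unrounded
    allocations; rounding and capping at N_h are left to the caller.
    """
    products = {h: N_h * S_h for h, (N_h, S_h) in strata.items()}
    denominator = sum(products.values())
    if denominator == 0:
        raise ValueError("at least one stratum needs a positive N_h * S_h")
    return {h: n * p / denominator for h, p in products.items()}
```

For example, with strata `{"small": (6000, 1.0), "medium": (1000, 3.0), "large": (200, 5.0)}` and `n = 100`, the products are 6000, 3000 and 1000, so the allocation is 60, 30 and 10 units.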
Improvements to the sample design
Take-some strata
Weighted Outliers
Small proportions of firms sell over the Internet (8% of the private sector and 16% of the public sector)
Moderate values combined with large weights sometimes significantly influence estimates
Previously, outlier detection was applied only to unweighted values of sales over the Internet
Weighted Outliers
Weighted outlier detection and treatment implemented in 2006
  Same detection method as for unweighted values (variant of the sigma-gap method)
Treatment methods studied
  Hidiroglou/Srinath
  Winsorization
  Dalén and Tambay
  Promotion to own stratum
Hidiroglou/Srinath (1981)
  Weight reduction method
  Minimizes the MSE of the estimator of the total
  Requires population characteristics that are unknown and may not be reliably estimable
Weighted Outliers
Winsorization
  Reduces values larger than a certain cutoff to the cutoff itself (the cutoff depends on the outlier detection method)
  Modified into a weight reduction method
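As a sketch of the idea, Winsorization caps each value at the cutoff. The slide does not say how it was modified into a weight reduction method; one possible reading, assumed here, keeps the reported value and shrinks the weight so that the weighted contribution equals weight × cutoff. Function names are illustrative and a positive cutoff is assumed.

```python
def winsorize(values, cutoff):
    """Replace every value above the cutoff by the cutoff itself."""
    return [min(v, cutoff) for v in values]


def winsorized_weights(values, weights, cutoff):
    """Weight-reduction form (an assumed reading of the slide).

    For y > cutoff the weight becomes w * cutoff / y, so that
    new_weight * y == w * cutoff. Assumes cutoff > 0.
    """
    return [w * cutoff / y if y > cutoff else w
            for y, w in zip(values, weights)]
```

Both forms give the same weighted total: the first changes the value, the second changes the weight.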
Weighted Outliers
Dalén (1987) and Tambay (1988)
  Cross between Winsorization and weight reduction
  The cutoff for weighted outlier detection is determined for each stratum
  Outlier value is split into two parts:
    Portion less than the cutoff, which receives the same new weight as the non-outliers
    Portion greater than the cutoff, which is allocated a weight of 1
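The split described above can be sketched as follows for a single stratum. The sketch assumes that the cutoff is expressed on the scale of the reported value and that `weights` already holds the weight applied to non-outliers; both are assumptions, since the slides do not give these details.

```python
def dalen_tambay_total(values, weights, cutoff):
    """Estimate a stratum total with the two-part treatment of outliers.

    The part of each value up to the cutoff keeps the unit's weight;
    the part above the cutoff is counted with a weight of 1.
    """
    total = 0.0
    for y, w in zip(values, weights):
        below = min(y, cutoff)        # portion up to the cutoff
        above = max(y - cutoff, 0.0)  # portion beyond the cutoff
        total += w * below + 1.0 * above
    return total
```

With values 5, 20 and 100, weights of 10 and a cutoff of 50, the untreated total is 1,250, plain Winsorization gives 750, and this treatment gives 800, which illustrates why it sits between the two.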
Weighted Outliers
Promotion to own stratum
  Outliers assigned a weight of 1
  Remaining units in stratum have their weights adjusted
  Outlier represents only itself during estimation
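A minimal sketch of promotion to own stratum, assuming simple random sampling within a stratum of known population size `N_h` and ignoring any nonresponse adjustment. The formula used for the remaining weights, (N_h − k) / (n_h − k) with k outliers, is the usual post-stratification adjustment and is an assumption, as the slides do not state it.

```python
def promote_outliers(N_h, outlier_flags):
    """Return new weights after promoting flagged units to their own stratum.

    N_h: stratum population size; outlier_flags: one boolean per sampled unit.
    """
    n_h = len(outlier_flags)
    k = sum(outlier_flags)
    if k == n_h:
        # Every sampled unit is an outlier: each represents only itself.
        return [1.0] * n_h
    # Non-outliers share the representation of the remaining N_h - k units.
    w = (N_h - k) / (n_h - k)
    return [1.0 if flag else w for flag in outlier_flags]
```

With `N_h = 100` and one outlier among five sampled units, the outlier gets a weight of 1 and the other four get 24.75 each, so the weights still sum to 100.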
Implemented method: Dalén and Tambay
  Fewer assumptions
  Nice compromise
    Impact on the estimates is reduced
    Not as drastic as promotion to own stratum
  Method performed well using 2005 data
  Additional empirical studies to confirm effectiveness of the method (simulations?)
Future challenges
Response burden
  Maximising overlap = increased response burden?
    Minimal effect on response rates
    Conditioning effect?
  Sample rotation:
    Ease response burden
    Control sample overlap for longitudinal analysis
Statistics Canada’s Business Register redesign
  Sampling elements based on operating structure vs. statistical structure
  Certain modeled variables replaced by administrative data
For more information, please contact:
Jason Raymond
613-951-1917
www.statcan.ca