StEPS at EIA—Where We Are Now
Paula Weir and Sue Harris
Energy Information Administration, U.S. Department of Energy
ICES3 Topic Contributed Session: Generalized Survey Processing Systems—Part II
Presentation Outline
• Background and reason for implementing a generalized system
• First issue of getting data into StEPS
• StEPS modules: changes made and how they are used
• Modifications to other systems and creation of new systems needed
• Summary and future work
Background
• In 2002 EIA acquired StEPS from the U.S. Bureau of Census
• Low cost solution for replacing legacy systems
• Since then 18 surveys have been migrated to StEPS
• Differences in survey requirements and business process have resulted in significant customization
• Challenges faced in implementing a generalized system designed with other surveys and processes in mind
Getting Data into StEPS
• Process for creating batch data into the StEPS input format
• EIA developed the Data Collection Module (DCM)
• Originally intended to be a generalized system
• DCM provides the identification of the respondents expected to report
• All reported data are keyed or passed through the DCM
• The DCM performs multiple front-end tasks
• Process original responses and resubmissions.
• Generate reports for organizing and tracking information regarding survey processing (case control)
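The DCM tasks listed above (identifying expected respondents, taking in original responses and resubmissions, and reporting on case status) can be illustrated with a minimal sketch. This is not the DCM's actual design: the `CaseControl` class, its method names, and the respondent IDs are hypothetical and serve only to show the bookkeeping involved.

```python
from dataclasses import dataclass, field


@dataclass
class CaseControl:
    """Hypothetical case-control tracker for one survey period."""
    expected: set[str]  # respondents expected to report
    submissions: dict[str, list[dict]] = field(default_factory=dict)

    def check_in(self, respondent_id: str, data: dict) -> int:
        """Record a response; return its version (1 = original, 2+ = resubmission)."""
        if respondent_id not in self.expected:
            raise KeyError(f"{respondent_id} is not expected to report")
        versions = self.submissions.setdefault(respondent_id, [])
        versions.append(data)
        return len(versions)

    def outstanding(self) -> list[str]:
        """Expected respondents with no submission yet, for tracking reports."""
        return sorted(self.expected - set(self.submissions))


cc = CaseControl(expected={"A1", "B2"})
cc.check_in("A1", {"volume": 10})   # original response
cc.check_in("A1", {"volume": 12})   # resubmission
print(cc.outstanding())
```

Keeping every version of a submission, rather than overwriting, is one way to let later steps distinguish the original response from a resubmission.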
StEPS Modules
• This processing system includes: a) modules for specifying parameters for the specific users and survey; b) modules for data collection activities, including mailing, receipt, and check-in; and c) modules for post-collection activities such as editing, imputation, and estimation.
Customization of StEPS
• Evolutionary process
• Original vision was to remain compatible with future versions and upgrades from Census
• As a result of differences in survey and business processes, the vision changed
• Example: analysts work on multiple surveys and time periods, so EIA created a StEPS Menu Interface screen
• EIA Tools
Defining StEPS Edits
• StEPS has 7 edit rule types:
1) required data item test; 2) range test; 3) list directed test; 4) skip pattern validation test; 5) balance test; 6) survey rule test; and 7) negative test.
• The use of the edit types varied within the three groups of EIA surveys.
• Many of the edit rules are common to many of the data cells, so a “wildcard” feature was developed to apply the edit to cells selected through a drop down.
• Performance issues and resolution approaches
• Enhanced roster functionality
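A rough sketch of four of the edit rule types and of the wildcard idea follows. It is an illustration under assumptions, not StEPS code: the function names and the record layout are hypothetical, the negative test is assumed to flag values below zero, and a filename-style pattern stands in for the drop-down cell selection described above.

```python
from fnmatch import fnmatchcase


def required_test(record, item):
    """Required data item test: the item must be present and non-missing."""
    return record.get(item) is not None


def range_test(record, item, low, high):
    """Range test: a reported value must fall within [low, high]."""
    value = record.get(item)
    return value is None or low <= value <= high


def negative_test(record, item):
    """Negative test (assumed meaning): a reported value must not be below zero."""
    value = record.get(item)
    return value is None or value >= 0


def balance_test(record, parts, total, tolerance=0.0):
    """Balance test: detail items must sum to the reported total."""
    values = [record.get(p) for p in parts] + [record.get(total)]
    if any(v is None for v in values):
        return True  # missing items are left to the required test
    return abs(sum(values[:-1]) - values[-1]) <= tolerance


def apply_wildcard(record, pattern, test, **kwargs):
    """Run one single-item test on every cell matching the pattern.

    Returns the sorted names of the cells that fail.
    """
    return sorted(
        item for item in record
        if fnmatchcase(item, pattern) and not test(record, item, **kwargs)
    )


record = {"stock_jan": 120, "stock_feb": -5,
          "sales_a": 40, "sales_b": 60, "sales_total": 100}
print(apply_wildcard(record, "stock_*", negative_test))
print(balance_test(record, ["sales_a", "sales_b"], "sales_total"))
```

Defining a rule once and applying it to every matching cell avoids restating the same edit for each data cell, which is the motivation the slide gives for the wildcard feature.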
Imputation Definition
• Imputation implemented differently within the three survey groups
• Simple imputation vs. general imputation
• Calculation of impute values within StEPS vs. values brought in as an auxiliary file
• Imputation as a rollover function
• Major change initiative: Imputation in roster surveys
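The slides do not define the imputation options in detail, so the sketch below rests on two labeled assumptions: that "rollover" means carrying forward a respondent's prior-period value, and that an auxiliary file supplies precomputed impute values that take precedence. The `impute` function and its flag names are hypothetical.

```python
def impute(current, prior, auxiliary=None):
    """Fill missing items and flag where each value came from.

    current:   item -> reported value, or None if missing
    prior:     item -> prior-period value (assumed rollover source)
    auxiliary: item -> impute value calculated outside the system (assumed)
    """
    auxiliary = auxiliary or {}
    result = {}
    for item, value in current.items():
        if value is not None:
            result[item] = (value, "reported")
        elif item in auxiliary:
            result[item] = (auxiliary[item], "auxiliary")
        elif prior.get(item) is not None:
            result[item] = (prior[item], "rollover")
        else:
            result[item] = (None, "missing")
    return result


filled = impute(
    current={"a": 5, "b": None, "c": None, "d": None},
    prior={"b": 7, "c": 9},
    auxiliary={"c": 11},
)
for item, (value, source) in filled.items():
    print(item, value, source)
```

Carrying a source flag with each value keeps imputed data distinguishable from reported data during later review and correction.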
Review and Correction
• View reported data, edit flags and resolution, view notes
• Override flag created
• Failures highlighted in red
• Notes functionality enhanced
• Roster item view and correction screens created
• Mass correction function for control data
Control Information
• Master Control and Control Information
• Latest action
• Ghosting and successor ID
• Unghosting to preserve ID as of the time data are reported
• Relationship to other systems and processes
Other Existing Systems and Creation of New Systems
• Historical respondent level database--SIS
• Master Frame File; Integrated Frame System
• OHUB—temporary holding area of control and survey data, job status, other frame and sample information
Overview of Production Processes and Systems
[Diagram: production processes and systems, distinguishing systems previously in place from systems developed for the StEPS migration. Labeled components: FAX; unformatted email; XLS; PEDRO; EDES; EDX (Energy Data Exchange); DCM; Master Frame File System; SIS, PD only; PD Weekly System; NG Weekly System; StEPS NG; StEPS Reserves; StEPS PD; OHUB; OHUB jobs; Cubes; Aggregate Data Repository (ADR); Oil and Gas Information Resource System (OGIRS); CSAF; PC SAS; XLS (191); DQRS; Analyst Query; Data Warehouse; NGPS (Natural Gas Publication System); External Data; Graphical Interface; Dissemination.]
Summary and Future Work
• StEPS: one box in an overall process and flow
• Low cost alternative but did not replace all the functionality in the legacy system
• New systems and interfaces had to be built and other systems modified
• Main problems: lack of knowledge on effective implementation (training and resource issues); differences in processes and work flow (a language barrier); integration with other systems; independent treatment of separate but dependent surveys; dimensionality of EIA surveys
• Quick fixes and survey specific solutions vs integrated generalized solution for implementation
• Short run: 3 more surveys; Longer run: focus on edit/impute/estimate in StEPS, graphical interface and outlier detection, revisit process flow, integration of separate but related surveys, upcoming forms modifications, Red Hat?