Auto-Validating Your Data Store: A do-it-yourself approach to data integrity and anomaly detection. Evan Davies Office of Strategic Planning and Analysis The College of William and Mary in Virginia

Post on 25-Feb-2016

Page 1: Overview

Auto-Validating Your Data Store: A do-it-yourself approach to data integrity and anomaly detection.

Evan Davies
Office of Strategic Planning and Analysis
The College of William and Mary in Virginia

Page 2: Overview

Overview

• Institutional researchers increasingly rely on data marts, stores, and warehouses for management information.

• Given the perpetually ‘developing’ status of these environments, both commercial and institutional, you can spend significant time dealing with data that does not meet even shifting standards for table logic, variable conventions, and values.

• Such anomalies can stop production programs or lead to inaccurate information until detected.

Page 3: Overview

Does this ever happen to you?

• NOTE: Table WORK.ENROLL created, with 8241 rows and 19 columns.
• NOTE: Table WORK.PERSON created, with 8241 rows and 13 columns.
• NOTE: Table WORK.MAJOR created, with 8253 rows and 30 columns.
• NOTE: Table WORK.MAJOR_PERSON created, with 8264 rows and 30 columns.
• Who are these extra people? Why are they in here?

Page 4: Overview

Agenda

• This presentation demonstrates a simple yet sophisticated way of using SAS® Enterprise Guide to check tables, variables, and values automatically to find out if that data meets IR standards and premises before you start analytical work (and to let others know that things may need fixing).

Page 5: Overview

Things You Should Know…

• ‘Simple’ is a relative term, as is ‘sophisticated’
• This involves more coding than mouse clicking
• You should have at least some familiarity with SAS® coding, SQL, and relational databases
• To make this work back at your campus, you need to know (or find out) how to access your data
• If you have significant structured programming and SQL experience, please refrain from laughing out loud. Snickering is acceptable.

Page 6: Overview

The College of William & Mary

• The only royally chartered colonial institution, 1693, by King William III and Queen Mary II…
• Making it the second oldest college in the United States
• Phi Beta Kappa, the first Greek honor society, was founded here in 1776
• Became state-supported in 1906 and coeducational in 1918

• The Alma Mater of George Washington and Thomas Jefferson, as well as Jon Stewart and Secretary of Defense Robert Gates

• Named one of Intel's 50 Most Unwired College Campuses for our campus-wide wireless network

• The Colonial Campus section of the 1,200 acre campus is restored to its 18th-century appearance

Page 7: Overview

The Wren Building (1700)

The Oldest Academic Building in Continuous Use in the U.S.

Page 8: Overview

The College Today…

• 5,800 undergraduates and 1,950 graduate students from all 50 states and 30 foreign countries
• 22 percent are students of color
• 79 percent of freshmen graduated in the top ten percent of their class
• Highest SAT middle 50 percent range of all public institutions in Virginia
• 11:1 student-faculty ratio
• W&M has more recipients of the Commonwealth's Outstanding Faculty Award than any other institution
• 5 undergraduate and graduate schools: Arts & Sciences, Business, Education, Law, and Marine Science
• 36 undergraduate programs, 12 master's, doctoral, and professional degrees

• W&M is a Highly Selective Public Liberal Arts University

Page 9: Overview

Data History

We had many homegrown systems, some based on Information Associates® architecture, but extensively modified.

[Diagram: College Systems that Interface with Administrative Computing Systems. The chart shows HRS (Human Resource System), FRS (Financial Record System), and the old and new SIS (Student Info System) among dozens of interfacing campus systems and external entities, including state and federal agencies, banks, benefits vendors, and test-score sources.]

Page 10: Overview

Recent Data History

• After false starts, in 2003 we bought a large Banner enterprise system, and then added a datamart, and then a data store. We are now five years into our two year installation period.

• Or put another way, we are ‘current’ on version(s) of 8.2 and 3.1, with new version(s) around the corner.

• When the IR staff start to understand the relationships and foibles of a particular version, it is time to upgrade to a newer version in which some things are fixed… and some other things are broken, or rather, ‘differently enabled’.

Production: v.6, v.7, v.8…
DataMart: v.1, v.2…
DataStore: v.1, v.2, v.3…

Page 11: Overview

Things To Realize…

• SungardHE Banner® products are not a bad system. Nor are any other commercial vendor’s products.

• Any enterprise-level system with a data store is a permanently evolving, almost organic entity,

• with multiple input opportunities for breaking constraints and premises

• and for finding new ways to induce unexpected results through changing business rules, institutional decentralization, flexibility, and collegiality.

Page 12: Overview

Data Integrity in Pre- and Post-Enterprise Systems

• Old History: I.T. used to do a system General Edit, with “general” meaning edited for operational purposes, not analytical purposes. If the data got the payroll to run today or allowed you to admit a student, it was ‘valid’.

• New History: Data validity is still measured against operational standards.

Page 13: Overview

Data Integrity Post-Enterprise System

• And I.T. is now even busier than ever just making the production system ‘run’, without markedly larger numbers of staff to commit to the data warehousing activity. They actually do less general editing because more transactions are interactive rather than process-oriented.

• At the end of the day, data is off-loaded into the data store, in different forms and with different premises from how the data is held in the production system.

Page 14: Overview

Data Integrity Pre- and Post-Enterprise System

• This means that there is now an even bigger split between operational and analytical data integrity, especially in terms of the differing forms of the data.

• The data store is not as rigorously evaluated for data integrity or meaning precisely because it is not production. And since the operational offices of the institution are satisfied with their pieces of the data pie, everything is working fine.

Page 15: Overview

The Institutional Research Role

• IR has always had premises for data reporting that go beyond ‘general edit’.

• We analyze data relationships that make up the whole picture.

• We work in the aggregate, rather than by individual transaction.

• We are ideally poised to discover the anomalies that occur between multiple institutional sources -- between one office’s interpretation of a transaction and another office’s idea of the same information.

Page 16: Overview

A Data Store’s Added Task to IR

• IR now has to do the same comprehensive validation tasks that we used to do, plus identify and deal with the newer ‘introduced’ problem of tables that violate their own premises or have unexpected values due to complex dynamics among:

[Diagram: RULES, STORE, PEOPLE, CHANGES, PROD, UPGRADE]

Page 17: Overview

The Complex Dynamics

• imperfect translation of data between the production transaction system and the data store/warehouse,

• vendor maintenance or institutional business rules changes that intentionally induce changes in tables.

• generally caused by the inability to predict all of the systemic results of making unspecified or unimagined changes in a system, sometimes known as the Butterfly Effect.

• the continuous upgrade cycles applied to the system,

• and data that results from imperfectly recorded transactions in uncertain environments with less than adequate collaboration and training.

Page 18: Overview

So What Can We Do?

• Write a program to
  – check the logic premises of frequently used tables
  – check for missing or out-of-range values in data that affects IR
  – run the program frequently to uncover problems in time for the current census and prevent future term anomalies. Keep the results for documentation.
  – communicate findings promptly and efficiently in order to effect change
  – build in flexibility to test different things at different times in different ways

Page 19: Overview

Limitations in Place

• Do it on your own, since it is for IR purposes. Remember, the data already meets everybody else’s needs.

• Use existing resources. Keep it simple.

• Someone else is going to have to be notified to deal with the data anomaly once it is identified.

• Don’t weaponize the process. This is why we choose to call them anomalies, not mistakes.

Page 20: Overview

How To Accomplish?

• The use of SAS® EG on a PC platform allows for the remote submission of a premise and data-checking program during off-hours, when demand on the system is lessened.

• It also allows a convenient environment in which to store the program and results, access validation and history tables, and send automatic e-mails to interested parties such as admissions, registrar, IR and IT staff.

Page 21: Overview

• “SAS® Enterprise Guide®, a powerful Microsoft Windows client application that provides a guided mechanism to exploit the power of SAS and publish dynamic results throughout your institution. It’s the preferred interface to SAS for analysts, statisticians and programmers– SAS Enterprise Guide saves time by automatically generating computer code with an easy point-and-click interface.”

• Think of it as a graphical office environment for the SAS language. As Microsoft Outlook ties MS Office products together, Enterprise Guide has a similar role for SAS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Page 22: Overview

Still not clear on SAS/EG® ?

• It is an environment in which SAS 9.1 runs;
• It can bring together data views and any type of files from any network data servers, including Oracle.
• It contains SAS program(s), the larger project, notes, and a graphical description of how the project, processes, and programs relate to each other.
• It documents how any processes or programs have been run, and the results.
• It can generate code for procedures and datasteps.
• It inherits libraries, autoexecs, etc. from 9.1

Page 23: Overview

We Use Enterprise Guide® To…

• Schedule and launch the anomaly tracking job

• Provide a comprehensive project environment for IR staff to be able to visit independently and add new anomaly checks

• Be able to see and modify the associated tables and data in one place

• Provide our novice IR staff a more centralized and friendlier view of the process

Page 24: Overview

What Does It Look Like?

Page 25: Overview
Page 26: Overview
Page 27: Overview
Page 28: Overview
Page 29: Overview
Page 30: Overview
Page 31: Overview

Can You Just Use SAS® Itself?

• YES!

• Provided you schedule the job through MS Task Scheduler and have no desire for the previously mentioned features or facilities.

• Programmatically, all process features are part of base SAS v9.1 on an XP or Vista platform.

Page 32: Overview

Beginning Steps

• Survey yourself and other staff (IR and other offices) to make a preliminary list of tables and values that have issues

• Establish the names, emails, and hierarchy of those who will receive automated communications.

• Make a calendar of when you want to test certain items due to different functional cycles (admissions, registration, etc)

Page 33: Overview

Let’s Go Coding…

• I can’t show you the entire code for my anomaly program today.

• It would need to be modified to fit your data structures and needs anyway.

• Instead I will concentrate on imparting some:
  • Key Program Ideas
  • SQL techniques

• to enable you to develop your own program

Page 34: Overview

The Overall Design Concept

Set up a SAS program which launches automatically using SAS/EG

Construct some macros to help pass anomaly parameters

Acquire data for testing from your datastore tables

Use Proc SQL and other steps and procedures to test data

Page 35: Overview

The Overall Design Concept (2)

Test tables and variables based on the academic year and institutional work cycle

Anomalies ‘fixed’ are deleted from the master; new ones are added to it

The master table is subset and sent as email at various intervals and detail levels

A history set of anomaly transactions is kept for study

Page 36: Overview

Set up a SAS program which launches automatically using SAS/EG

Page 37: Overview

Set up a SAS program which launches automatically using SAS/EG

Page 38: Overview

%let term_start_limit=200325;

%let term_end_limit =201030;

%let highestssn='772';

%let es_validset='EL','MW','WD','WM','WW'; /*all valid statuses encountered by enrolled students AFTER enrollment/dropadd*/

%let es_bad='QW','WB','AW'; /*all bad or ineligible statuses not to be counted by census date*/

%let tooold='1900'; *out-of-range birthday year cutoff;

Start by setting up metadata necessary for testing

Page 39: Overview

Start by setting up metadata necessary for testing

Macro variable assignment increases flexibility as table names change.

%let table_enroll=enrollment;
%let table_academic_study=academic_study;
%let table_addcurrent=address_current;
%let table_prevedslot=previous_education_slot;
%let table_schedule=schedule_offering;
%let table_course_catalog=course_catalog;
%let table_course=student_course;

Page 40: Overview

Start by setting up metadata necessary for testing

*format table for anomalies;
proc format;
  value anom
    1   = 'withdrawn with classes'
    2   = 'duplicate person recs '
    3-4 = 'missing value '
    5   = 'ssn out-of-range '
    6   = 'Two ids, one ssn '
    7   = 'dup recs course_catalo '
    ……;
run;

Page 41: Overview

Construct some macros to help pass anomaly parameters

Page 42: Overview

proc sql;
create table todayterm as
select stvterm_code as studyterm,
       datepart(stvterm_start_date) as begd,
       datepart(stvterm_end_date) as endd
from (your data source)
where today() > datepart(stvterm_start_date)
  and today() < datepart(stvterm_end_date);
quit;

*set a global study term variable based on value in todayterm;
data _null_;
  set todayterm;
  call symput('study',studyterm);
run;

Construct some macros to help pass anomaly parameters
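The term lookup above can be tried outside SAS. The following is a minimal Python sketch using the standard-library sqlite3 module, not the presenter's code: the `terms` table, its column names, and the term dates are all invented for illustration. It keeps the slide's strict inequalities, so a job that runs on a term's first or last day, or between terms, gets no study term at all.

```python
import sqlite3
from datetime import date


def current_term(conn, today):
    """Return the term code whose start and end dates bracket `today`, else None."""
    day = today.isoformat()  # ISO dates compare correctly as text
    row = conn.execute(
        "SELECT term_code FROM terms WHERE ? > start_date AND ? < end_date",
        (day, day),
    ).fetchone()
    return row[0] if row else None


# Hypothetical term calendar; real dates would come from your own term table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE terms (term_code TEXT, start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO terms VALUES (?, ?, ?)",
    [
        ("200910", "2008-08-27", "2008-12-19"),
        ("200920", "2009-01-21", "2009-05-12"),
    ],
)

study = current_term(conn, date(2008, 10, 1))
between_terms = current_term(conn, date(2008, 12, 25))
```

A scheduled job probably wants an explicit branch for the `None` case, for example skipping term-based tests during breaks.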

Page 43: Overview

• Use up to 3 variables to show people what is anomalous about any particular situation. Name them A, B, and C. The source variables you pass into A, B, and C will differ with each problem.

• You will need to put both the value and the name of the variable into your anomaly report, plus some identifier(s) for the student or entity, plus some anomaly details.

Construct some macros to help pass anomaly parameters

Page 44: Overview

*Macro to help transfer of anomalies;
%macro keep ;
  (keep= id person_uid aval aval_desc bval bval_desc cval cval_desc
         anom studyterm first_anom_date suspend_date data_own)
%mend keep;

Construct some macros to help pass anomaly parameters

Page 45: Overview

%macro pass (aval=,bval=,cval=,dsn=,anom=,studyterm=,suspend_date=,data_own=) ;
  /* PASSES THE NAME OF THE VARIABLE */
  aval_desc = "&aval";
  bval_desc = "&bval";
  cval_desc = "&cval";
  /* PASSES char VALUE OF THE VARIABLE */
  aval = put(&aval,$25.);
  bval = put(&bval,$25.);
  cval = put(&cval,$25.);
  /* PASSES OTHER VALUES */
  anom=&anom;
  studyterm=put(&studyterm,$6.);
  first_anom_date=today();
  suspend_date=&suspend_date;
  data_own=&data_own;
%mend pass;

Construct some macros to help pass anomaly parameters
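To make the A/B/C idea concrete, here is a small Python sketch of the same record shape. It is an illustration only, not a translation of the macro: the `Anomaly` class, the `pass_anomaly` helper, and the sample person record are hypothetical names and data. Each slot stores the variable's name next to its value as text, cut to 25 characters in the spirit of the `$25.` format.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Anomaly:
    person_uid: int
    anom: int
    studyterm: str
    data_own: str
    aval_desc: str = ""
    aval: str = ""
    bval_desc: str = ""
    bval: str = ""
    cval_desc: str = ""
    cval: str = ""
    first_anom_date: Optional[date] = None
    suspend_date: Optional[date] = None


def pass_anomaly(record, fields, anom, studyterm, data_own, found_on):
    """Build an Anomaly from a source record and up to three field names."""
    if len(fields) > 3:
        raise ValueError("at most three fields (A, B, C) are supported")
    slots = {}
    padded = list(fields) + [None] * (3 - len(fields))
    for slot, name in zip("abc", padded):
        value = record.get(name) if name else None
        slots[slot + "val_desc"] = name or ""          # name of the variable
        slots[slot + "val"] = "" if value is None else str(value)[:25]  # its value as text
    return Anomaly(
        person_uid=record["person_uid"],
        anom=anom,
        studyterm=str(studyterm)[:6],
        data_own=data_own,
        first_anom_date=found_on,
        **slots,
    )


# Invented sample record with a missing citizenship type (anomaly 3).
person = {"person_uid": 1234, "citizenship_type": "", "full_name_lfmi": "Doe, Jane Q."}
found = pass_anomaly(
    person, ["citizenship_type", "full_name_lfmi"], 3, "200910", "reg", date(2008, 10, 1)
)
```

Unused slots stay empty here, so a report reader sees blank C columns rather than an error when a test only needs two variables.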

Page 46: Overview

proc sql;
connect to odbc as mydb (datasrc="&datasrc" user=&user password=&password);
create table addressc as
select * from connection to mydb
( select m.PERSON_UID, m.id, n.address_type, n.postal_code, n.city,
         n.county, n.state_province, n.nation
  from &table_enroll m
  inner join &table_addcurrent n
    on m.person_uid = n.entity_uid
  where m.ACADEMIC_PERIOD in (&study)
    and ((m.ENROLLED_IND='Y' and m.REGISTERED_IND='Y')
      or (m.ENROLLED_IND='Y' and m.ENROLLMENT_STATUS in (&es_set)))
    and n.address_type in ('IN', 'P1', 'MA')
);
disconnect from mydb;
quit;

Acquire data for testing from your datastore

Page 47: Overview

Acquire data for testing from your datastore

…
from &table_enroll m                          /* STUDENTS THIS TERM */
inner join &table_addcurrent n                /* ALL ADDRESSES */
  on m.person_uid = n.entity_uid
where m.ACADEMIC_PERIOD in (&study) and …
  and n.address_type in ('IN', 'P1', 'MA')    /* ADDRESSES TO CHECK */

Page 48: Overview

Acquire data for testing from your datastore

Let your server do the heavy data work

[Diagram: SAS sends the SQL call to Oracle®, and Oracle® returns the result set.]

Page 49: Overview

• The most useful technique for detecting a table that violates record premises is by joining the table back to a copy of itself that has been summarized by the number of expected rows, keeping only those records that don’t meet expectations.

• A SQL statement that uses a count() function, a ‘group by’ clause, and a ‘having’ clause does this job effectively.

Use Proc SQL and other procedures to test logic/values

Page 50: Overview

select l.*
from (select PERSON_UID, ID, ACADEMIC_PERIOD, PROGRAM,
             PRIMARY_PROGRAM_IND, ADMISSIONS_POPULATION
      from &table_academic_study
      where academic_period in (&study) ) l
inner join
     (select person_uid
      from &table_academic_study
      where academic_period in (&study)
      group by person_uid, program
      having count(person_uid) > 1 ) r
on l.person_uid = r.person_uid

Use Proc SQL and other procedures to test logic/values

Page 51: Overview

select person_uid
from &table_academic_study
where academic_period in (&study)
group by person_uid, program
having count(person_uid) > 1

Use Proc SQL and other procedures to test logic/values

Rows as stored:

PERSON_UID  PROGRAM  ACADEMIC_PERIOD  COUNT
1234        BA-GOVT  200910           1
1234        BA-GOVT  200910           1

After grouping by person_uid and program:

PERSON_UID  PROGRAM  ACADEMIC_PERIOD  COUNT
1234        BA-GOVT  200910           2
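The join-back technique can be exercised on a toy table before pointing it at a real data store. This is a hedged Python sketch using the standard-library sqlite3 module; the table contents are invented, and only three of the slide's columns are kept. One deliberate difference from the slide: the join here also matches on `program`, so a student with one duplicated program and one clean program returns only the duplicated rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE academic_study (person_uid INTEGER, program TEXT, academic_period TEXT)"
)
conn.executemany(
    "INSERT INTO academic_study VALUES (?, ?, ?)",
    [
        (1234, "BA-GOVT", "200910"),
        (1234, "BA-GOVT", "200910"),  # violates the one-row-per-program premise
        (5678, "BS-BIOL", "200910"),
        (9012, "BA-HIST", "200910"),
        (9012, "BA-ENGL", "200910"),  # two different programs: not a violation
    ],
)

# Join the table back to a summarized copy of itself, keeping only
# the rows whose (person_uid, program) group has more than one record.
violations = conn.execute(
    """
    SELECT l.person_uid, l.program, l.academic_period
    FROM academic_study l
    INNER JOIN (
        SELECT person_uid, program
        FROM academic_study
        WHERE academic_period = ?
        GROUP BY person_uid, program
        HAVING COUNT(*) > 1
    ) r
      ON l.person_uid = r.person_uid AND l.program = r.program
    WHERE l.academic_period = ?
    ORDER BY l.person_uid
    """,
    ("200910", "200910"),
).fetchall()
```

Changing the `GROUP BY` columns and the `HAVING` threshold is what adapts the check to a different premise, such as "exactly one primary program per student per term".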

Page 52: Overview

select l.person_uid, r.person_uid as other_uid,
       l.tax_id, r.tax_id as other_tax_id, l.full_name_lfmi
from person l
inner join person r
  on l.TAX_ID = r.TAX_ID
 AND l.person_uid <> r.person_uid
where l.tax_id is not null
  and r.tax_id is not null

This SQL will find two different university ids that share the same SSN. This generally occurs when the institution has issued two ids to the same person without an adequate search of records. The faster this is spotted, the better.

Use Proc SQL and other procedures to test logic/values
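The "two ids, one SSN" self-join can be sketched the same way. This Python example uses the standard-library sqlite3 module with invented people and invented tax ids. It swaps the slide's `<>` comparison for `<`, an assumption made here so that each offending pair is reported once rather than twice (once in each direction).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_uid INTEGER, tax_id TEXT, full_name_lfmi TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        (1, "123456789", "Doe, Jane Q."),
        (2, "123456789", "Doe, Jane"),    # same tax id issued under a second uid
        (3, "987654321", "Roe, Richard"),
        (4, None, "Poe, Pat"),            # missing tax ids must not match each other
        (5, None, "Moe, Max"),
    ],
)

# Self-join on tax_id; the "<" keeps one row per pair of conflicting uids.
pairs = conn.execute(
    """
    SELECT l.person_uid, r.person_uid AS other_uid, l.tax_id
    FROM person l
    INNER JOIN person r
      ON l.tax_id = r.tax_id AND l.person_uid < r.person_uid
    WHERE l.tax_id IS NOT NULL
    ORDER BY l.person_uid
    """
).fetchall()
```

If the downstream report should list the anomaly under both ids, keeping the slide's `<>` form is the simpler choice.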

Page 53: Overview

In addition to testing table logic, once you have the datasets established, any variety or combination of values can be tested. Here are four conditions to get you started thinking about what can be tested:

- missing values;
if citizenship_type = '' then do;

- withdrawn with classes;
if enrolled_ind = 'Y' and registered_ind = 'Y' and enrollment_status = 'WB' then do;

Use Proc SQL and other procedures to test logic/values

Page 54: Overview

- state/province inconsistent with nation;
if state_province in ('AA','AE','AP','PR','VI','AL','AK','AZ','AR','CA', …'WI','WY' ) and nation > ''
or
state_province in ('AB','BC','MB','NB','NL','NT','NS','NU','ON','PE','QC','SK','YT') and nation ^= 'CA'
or
nation = 'CA' and state_province not in ('AB','BC','MB','NB','NL','NT','NS','NU','ON','PE','QC','SK','YT')
or
state_province in ('FC' ,'HK', 'RQ', 'XX') then do;

Use Proc SQL and other procedures to test logic/values
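The state and nation rules above read naturally as a small predicate. The Python sketch below is an assumption-laden illustration: the function name is hypothetical, and `US_CODES` contains only the codes visible on the slide, whose list is elided, so it would need completing before real use. The Canadian and placeholder lists are copied from the slide.

```python
# Only the US codes visible on the slide; the original list is elided there.
US_CODES = frozenset(
    {"AA", "AE", "AP", "PR", "VI", "AL", "AK", "AZ", "AR", "CA", "WI", "WY"}
)
CA_PROVINCES = frozenset(
    {"AB", "BC", "MB", "NB", "NL", "NT", "NS", "NU", "ON", "PE", "QC", "SK", "YT"}
)
PLACEHOLDER_CODES = frozenset({"FC", "HK", "RQ", "XX"})


def address_is_anomalous(state_province, nation):
    """Apply the four address rules from the slide to one address record."""
    state = (state_province or "").strip()
    nation = (nation or "").strip()
    if state in US_CODES and nation != "":
        return True   # US addresses are expected to carry a blank nation
    if state in CA_PROVINCES and nation != "CA":
        return True   # Canadian province without the Canadian nation code
    if nation == "CA" and state not in CA_PROVINCES:
        return True   # Canadian nation code without a Canadian province
    if state in PLACEHOLDER_CODES:
        return True   # placeholder state codes are always flagged
    return False


checks = {
    "us_ok": address_is_anomalous("WY", ""),
    "us_with_nation": address_is_anomalous("WY", "US"),
    "canada_ok": address_is_anomalous("ON", "CA"),
    "province_no_nation": address_is_anomalous("ON", ""),
    "placeholder": address_is_anomalous("XX", ""),
}
```

Note that `CA` appears both as a US state code (California) and as the nation code for Canada; the rules keep the two apart because each is tested against a different field.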

Page 55: Overview

- ssn out of range;
if tax_id ^= '' then do;
  ssnverf = indexc(substr(tax_id,1,9),' ','-abcdefghijklmnopqrstuvwxyz','ABCDEFGHIJKLMNOPQRSTUVWXYZ') ;

  if substr(tax_id,6,4) = '0000' or substr(tax_id,1,3) < '001' or substr(tax_id,4,2) = '00' or substr(tax_id,1,3) > &highestssn or ssnverf > 0 then do;

Use Proc SQL and other procedures to test logic/values
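The SSN range test is easy to get subtly wrong, so a standalone version helps when checking edge cases. This Python sketch follows the slide's rules with two stated assumptions: the function name is hypothetical, and any non-digit character in the first nine positions is flagged, which is slightly stricter than the slide's check for blanks, hyphens, and letters. The default ceiling of '772' comes from the `highestssn` setting shown earlier.

```python
import re


def ssn_out_of_range(tax_id, highest_area="772"):
    """Return True when a non-missing tax id fails the slide's range rules."""
    if tax_id is None or tax_id.strip() == "":
        return False  # missing values are reported by a separate test
    ssn = tax_id[:9]
    if re.fullmatch(r"[0-9]{9}", ssn) is None:
        return True   # blanks, hyphens, letters, or fewer than nine digits
    area, group, serial = ssn[0:3], ssn[3:5], ssn[5:9]
    return (
        serial == "0000"
        or group == "00"
        or area < "001"          # string comparison works on fixed-width digits
        or area > highest_area
    )


# Invented sample values covering each rule.
results = {
    value: ssn_out_of_range(value)
    for value in [
        "123456789",    # passes every rule
        "000123456",    # area below 001
        "123006789",    # group 00
        "123450000",    # serial 0000
        "900123456",    # area above the ceiling
        "123-45-6789",  # hyphens in the first nine characters
        "",             # missing, handled elsewhere
    ]
}
```

Keeping the ceiling as a parameter mirrors the macro variable, so the cutoff can be revised in one place.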

Page 56: Overview

data T_person_v %keep ;     /* The KEEP macro in place */
  set person;

  *test3 - missing value;
  anom=0;
  if citizenship_type = '' then     /* The anomaly testing */
  do;
    /* The PASS macro in place */
    %pass(aval=citizenship_type, bval=full_name_lfmi, cval=, dsn=person, anom=3, studyterm=&study, suspend_date=., data_own='reg')
  end;
  if anom > 0;
run;

Pass anomalies found into a transaction table

Page 57: Overview

*test6 - two ids, one ssn;
data T_person_v3 %keep ;     /* The KEEP macro */
  set anom_person2;
  /* The PASS macro */
  %pass(aval=tax_id, bval=other_tax_id, cval=other_id, dsn=person, anom=6, studyterm=&study, suspend_date=.,data_own='reg')
run;

Pass anomalies found into a transaction table

Page 58: Overview

Nomenclature of Test Datasets: T_area_XN, where
  T = Test
  area = broad anomaly category
  X = v (variable-based) or a (assumption of table logic violated)
  N = incremental test set number

For example, dataset T_person_v3 is the third dataset testing the demographic (person) table for variable-based anomalies such as missing values or incompatible statuses.

Pass anomalies found into a transaction table

Page 59: Overview

*create temp work dataset for todays transactions;
data today;
  length aval_desc bval_desc cval_desc $25 ;
  set
    T_addressc_v1
    T_addressc_v2
    T_addressc_a
    T_acadstudy_a
    T_ccat_a
    T_person_a
    T_person_v1
    T_person_v2
    T_person_v3
    {all the transaction sets};
run;

Pass anomalies found into a transaction table

Page 60: Overview

Remember to de-duplicate all the duplicated records found! You only need one example record per anomaly.

proc sort NODUPKEY data=today;
  by person_uid anom aval bval cval studyterm;
run;

Failure to do so may multiply rows in the join in the next step, when you update the master dataset.

Pass anomalies found into a transaction table
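The effect of keeping one record per key can be shown in a few lines. This is a Python sketch with a hypothetical helper name and invented rows; unlike `proc sort NODUPKEY`, it does not sort, it simply keeps the first record seen for each key in input order.

```python
KEY_FIELDS = ("person_uid", "anom", "aval", "bval", "cval", "studyterm")


def nodupkey(rows, key_fields=KEY_FIELDS):
    """Keep the first row for each distinct combination of key fields."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Invented transactions: the first two describe the same anomaly.
today = [
    {"person_uid": 1234, "anom": 2, "aval": "BA-GOVT", "bval": "", "cval": "", "studyterm": "200910"},
    {"person_uid": 1234, "anom": 2, "aval": "BA-GOVT", "bval": "", "cval": "", "studyterm": "200910"},
    {"person_uid": 1234, "anom": 3, "aval": "", "bval": "Doe, Jane Q.", "cval": "", "studyterm": "200910"},
]
unique_today = nodupkey(today)
```

The key deliberately includes the A, B, and C values, so the same person can still carry several distinct anomalies of the same type.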

Page 61: Overview

Anomalies are added or deleted from the master

[Diagram: Today’s Set is combined with the Master Set]

Data master ;
  Update master today;

------------ or -------------

Proc SQL;
  Select * from
    oldmaster l
    right join
    today r
    on {criteria}

Effect of the join: Add New Records, Keep Matching Records, Delete Non-Matches

Page 62: Overview

proc sql;
create table newmaster as
select r.person_uid, r.id, r.anom, r.aval, r.aval_desc, {other variables},
       coalesce(l.first_anom_date, r.first_anom_date) as first_anom_date
from oldmaster l right join today r
  on l.person_uid = r.person_uid and l.anom = r.anom
 and l.aval = r.aval and l.bval = r.bval and l.cval = r.cval
 and l.studyterm = r.studyterm ;
quit;

Anomalies are added or deleted from the master
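The add/keep/delete behavior of the master update can be checked on toy data. This Python sketch uses the standard-library sqlite3 module with invented rows and a trimmed key (only `aval`, not `bval` and `cval`). It writes the join as `today LEFT JOIN oldmaster`, which returns the same rows as the slide's `oldmaster RIGHT JOIN today`; that rewrite is a choice made here for portability, not part of the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("oldmaster", "today"):
    conn.execute(
        f"CREATE TABLE {table} (person_uid INTEGER, anom INTEGER, aval TEXT, "
        "studyterm TEXT, first_anom_date TEXT)"
    )
conn.executemany(
    "INSERT INTO oldmaster VALUES (?, ?, ?, ?, ?)",
    [
        (1234, 3, "", "200910", "2008-09-01"),           # still anomalous today
        (5678, 5, "000123456", "200910", "2008-09-01"),  # fixed since the last run
    ],
)
conn.executemany(
    "INSERT INTO today VALUES (?, ?, ?, ?, ?)",
    [
        (1234, 3, "", "200910", "2008-10-01"),
        (9012, 6, "123456789", "200910", "2008-10-01"),  # newly detected
    ],
)

# Every row of today survives; matching rows keep their original first-seen date.
newmaster = conn.execute(
    """
    SELECT t.person_uid, t.anom, t.aval, t.studyterm,
           COALESCE(o.first_anom_date, t.first_anom_date) AS first_anom_date
    FROM today t
    LEFT JOIN oldmaster o
      ON o.person_uid = t.person_uid AND o.anom = t.anom
     AND o.aval = t.aval AND o.studyterm = t.studyterm
    ORDER BY t.person_uid
    """
).fetchall()
```

The fixed anomaly (5678) drops out because it has no row in today's set, while the persisting one (1234) keeps its September date, which is what lets the report show how long a problem has been open.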

Page 63: Overview

The master table is subset and sent as email

data reg ban ir adm grr soe bur law;
  set master;
  output ir; *for all records;
  if data_own='reg' then output reg;
  else if data_own = 'ban' then output ban;
  else if data_own = 'adm' then output adm;
  else if data_own = 'grr' then output grr;
  else if data_own = 'soe' then output soe;
  else if data_own = 'bur' then output bur;
  else if data_own = 'law' then output law;
run;
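The routing rule in this step is simple: IR receives every record, and each data owner receives only its own. A Python sketch of the same rule follows; the function name and the sample rows are invented, while the owner codes are the ones used on the slide.

```python
OWNERS = ("reg", "ban", "adm", "grr", "soe", "bur", "law")


def split_by_owner(master):
    """Route every record to 'ir' and also to its data owner's subset."""
    outputs = {"ir": []}
    outputs.update({owner: [] for owner in OWNERS})
    for row in master:
        outputs["ir"].append(row)              # IR sees all records
        if row["data_own"] in OWNERS:
            outputs[row["data_own"]].append(row)
    return outputs


# Invented master rows.
master = [
    {"person_uid": 1234, "anom": 3, "data_own": "reg"},
    {"person_uid": 5678, "anom": 5, "data_own": "adm"},
    {"person_uid": 9012, "anom": 6, "data_own": "reg"},
]
subsets = split_by_owner(master)
```

As in the SAS step, a record with an unrecognized owner code lands only in the IR subset, which makes IR the natural place to notice a mistyped code.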

Page 64: Overview

The master table is subset and sent as email

proc export data=reg dbms=excel2002 outfile= 'g:\temp\reg.xls' replace;

proc export data=ban dbms=excel2002 outfile= 'g:\temp\ban.xls' replace;

proc export data=ir dbms=excel2002 outfile= 'g:\temp\ir.xls' replace;

Page 65: Overview

The master table is subset and sent as email

filename reports email "[email protected]";
data _null_;
  file reports;
  set departments;
  put '!EM_TO! ' name;
  put '!EM_SUBJECT! Report for ' dept;
  put 'Hi ' fname '-';
  put 'Here is the latest report of anomalies for the ' dept '.';
  if dept='ban' then put '!EM_ATTACH! g:\temp\ban.xls';
  else if dept='reg' then put '!EM_ATTACH! g:\temp\reg.xls';
  else if dept='ir' then put '!EM_ATTACH! g:\temp\ir.xls';
  put '!EM_SEND!';
  put '!EM_NEWMSG!';
  put '!EM_ABORT!';
run;
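For readers who want to see the shape of one such message outside SAS, here is a Python sketch built on the standard-library `email` package. It only constructs the message and does not send anything. The function name, the example.edu addresses, the recipient name, and the placeholder attachment bytes are all invented for illustration.

```python
from email.message import EmailMessage


def build_report(sender, recipient, fname, dept, attachment_name, attachment_bytes):
    """Build one anomaly-report message with a spreadsheet attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Report for {dept}"
    msg.set_content(
        f"Hi {fname} -\nHere is the latest report of anomalies for the {dept}.\n"
    )
    msg.add_attachment(
        attachment_bytes,
        maintype="application",
        subtype="vnd.ms-excel",
        filename=attachment_name,
    )
    return msg


# Hypothetical addresses and placeholder file contents.
message = build_report(
    "ir@example.edu", "registrar@example.edu", "Pat", "reg", "reg.xls", b"placeholder"
)
attachments = list(message.iter_attachments())
```

Sending would then be a matter of handing the message to `smtplib.SMTP(...).send_message(message)` against your own mail host, which is left out here because the host and authentication are site-specific.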

Page 66: Overview

Success!

[Screenshot of the delivered e-mail, with callouts for the subject line, the attached xls, and the recipient's name and department.]

Page 67: Overview

Technical Hurdles Along The Way

• Don’t use your Outlook mailer. Specify your SMTP mail service as the mailer. You may have to pass your authentication through the SAS sasv9.cfg file to allow SMTP mailing.

• Depending on your network, you may have to use “Cscript” as the keyword in the scheduler to launch the program, rather than the implicit “Wscript”. ‘C’ stands for ‘console’.

Page 68: Overview

Lessons Learned

• Send all output to yourself for several days to review, before allowing it to be sent out automatically

• Send lower priority anomalies infrequently, and high priority ones weekly or daily

• Send notice of table violations infrequently to IT, once they have identified a problem and resolution path. (Don’t bug them if they, too, are waiting for a vendor patch or fix)

Page 69: Overview

Lessons Learned

• Be aware of the length of time the job takes to execute. You may need to adjust what and when you test if it starts taking too much time

• Bring your ‘users’ into the process by asking them what you can do differently to help them. Do they need another variable to help isolate problems? Alternate ID?

Page 70: Overview

Auto-Validating Your Data Store Evan Davies

Presentation available online at: http://web.wm.edu/ir/conferencepres.html

Page 71: Overview