

Upload: charles-potter

Post on 22-Aug-2016


P18 Strang National High Risk Breast Registry (NHRR) database: Systems architecture and data normalization for long-term research

82S Abstracts

P18 STRANG NATIONAL HIGH RISK BREAST REGISTRY (NHRR) DATABASE: SYSTEMS ARCHITECTURE AND DATA NORMALIZATION FOR LONG-TERM RESEARCH

Charles Potter, Vivian Wu, Danny Wu and George Wong
Strang Cancer Prevention Center
New York, New York

The Strang NHRR database for breast cancer began in 1987 as a hierarchical database of encoded family history data from 850 women at increased risk of developing breast cancer. At present, the registry population has increased to approximately 15,000 registrants with 180,000 affected relatives. In 1992 the original database and system architecture were re-engineered, normalized and ported to SQL Server running under Windows NTAS. A fax/scanner-based data entry system has been integrated to provide first-stage data entry and quality assurance. The data are reviewed and auto-normalized by the software into data structures suitable for long-term data management. Pedigrees are generated and stored as images in the database. The recent discovery of the BRCA1 gene has prompted the addition of data types to store genetic information and images associated with genetic testing. Pathology and radiology images may also be stored.
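The abstract does not give the schema, so the following is only a minimal sketch of how a hierarchical family-history record might be normalized into relational tables. The table names, column names, record layout and sample values are all illustrative assumptions, and SQLite stands in for SQL Server so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema: names are illustrative assumptions, not the NHRR design.
SCHEMA = """
CREATE TABLE registrant (
    registrant_id INTEGER PRIMARY KEY,
    enrolled_year INTEGER NOT NULL
);
CREATE TABLE relative (
    relative_id   INTEGER PRIMARY KEY,
    registrant_id INTEGER NOT NULL REFERENCES registrant(registrant_id),
    relationship  TEXT NOT NULL,
    diagnosis     TEXT,      -- NULL when the relative is unaffected
    age_at_onset  INTEGER
);
"""


def normalize(conn, record):
    """Split one nested (hierarchical) record into rows of the two tables."""
    conn.execute(
        "INSERT INTO registrant (registrant_id, enrolled_year) VALUES (?, ?)",
        (record["id"], record["enrolled_year"]),
    )
    conn.executemany(
        "INSERT INTO relative"
        " (registrant_id, relationship, diagnosis, age_at_onset)"
        " VALUES (?, ?, ?, ?)",
        [
            (record["id"], r["relationship"], r.get("diagnosis"), r.get("age_at_onset"))
            for r in record["relatives"]
        ],
    )


def affected_relative_count(conn, registrant_id):
    """Count relatives of one registrant that carry a recorded diagnosis."""
    row = conn.execute(
        "SELECT COUNT(*) FROM relative"
        " WHERE registrant_id = ? AND diagnosis IS NOT NULL",
        (registrant_id,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up record, for illustration only.
example = {
    "id": 1,
    "enrolled_year": 1987,
    "relatives": [
        {"relationship": "mother", "diagnosis": "breast cancer", "age_at_onset": 52},
        {"relationship": "sister", "diagnosis": "breast cancer", "age_at_onset": 44},
        {"relationship": "aunt"},
    ],
}
normalize(conn, example)
```

Keeping each relative in its own row means that a count such as `affected_relative_count(conn, 1)` becomes a plain query rather than a traversal of nested records.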

The long-term follow-up of research populations at risk for disease onset at some future time poses a variety of data management and data representation problems. A two-part strategy has been adopted for the NHRR database that minimizes the effects of structural changes to the data over time. Part one defines encoded family history data so that future modifications to the data set will not compromise the scientific validity of the registry data as a whole. Part two stores data in a form that is structurally invariant over time, as provided by imaging technologies. Structural invariance of the image data is particularly important when diagnostic classification, and perhaps even the natural history of the disease itself, may change over several decades. This two-part strategy for systems architecture and database structure will be generalized and applied to all long-term cancer registries at Strang.
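One possible reading of the two-part strategy can be sketched as follows. This is an assumption about how such a design could look, not the NHRR implementation: part one is modeled as a versioned code dictionary, so that later coding revisions never reinterpret earlier entries, and part two as image bytes stored once together with a SHA-256 digest. All table names, codes and values are hypothetical, and SQLite again stands in for SQL Server.

```python
import hashlib
import sqlite3

# Hypothetical schema: every name below is an illustrative assumption.
SCHEMA = """
CREATE TABLE code_dictionary (
    scheme_version INTEGER NOT NULL,
    code           TEXT NOT NULL,
    meaning        TEXT NOT NULL,
    PRIMARY KEY (scheme_version, code)
);
CREATE TABLE history_item (
    item_id        INTEGER PRIMARY KEY,
    registrant_id  INTEGER NOT NULL,
    scheme_version INTEGER NOT NULL,
    code           TEXT NOT NULL,
    FOREIGN KEY (scheme_version, code)
        REFERENCES code_dictionary (scheme_version, code)
);
CREATE TABLE stored_image (
    image_id      INTEGER PRIMARY KEY,
    registrant_id INTEGER NOT NULL,
    sha256        TEXT NOT NULL,
    content       BLOB NOT NULL
);
"""


def store_image(conn, registrant_id, content):
    """Part two: store the raw image bytes once, with a digest of the bytes."""
    digest = hashlib.sha256(content).hexdigest()
    cur = conn.execute(
        "INSERT INTO stored_image (registrant_id, sha256, content) VALUES (?, ?, ?)",
        (registrant_id, digest, content),
    )
    return cur.lastrowid


def image_is_unchanged(conn, image_id):
    """Recompute the digest and compare it with the one recorded at storage time."""
    sha256, content = conn.execute(
        "SELECT sha256, content FROM stored_image WHERE image_id = ?",
        (image_id,),
    ).fetchone()
    return hashlib.sha256(content).hexdigest() == sha256


def decode(conn, item_id):
    """Part one: resolve a code under the scheme version it was entered with."""
    row = conn.execute(
        "SELECT d.meaning FROM history_item h"
        " JOIN code_dictionary d"
        "   ON d.scheme_version = h.scheme_version AND d.code = h.code"
        " WHERE h.item_id = ?",
        (item_id,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

# Version 1 of a made-up coding scheme, and one entry recorded under it.
conn.execute("INSERT INTO code_dictionary VALUES (1, 'BC', 'breast cancer')")
item_id = conn.execute(
    "INSERT INTO history_item (registrant_id, scheme_version, code)"
    " VALUES (1, 1, 'BC')"
).lastrowid

# A later, finer-grained revision is added alongside version 1, not over it.
conn.executemany(
    "INSERT INTO code_dictionary VALUES (?, ?, ?)",
    [(2, "BC-INV", "invasive breast cancer"), (2, "BC-IS", "breast cancer in situ")],
)

image_id = store_image(conn, 1, b"placeholder pedigree image bytes")
```

In this sketch the entry recorded under version 1 still decodes to its original meaning after version 2 is introduced, and the stored image can be checked against its digest at any later date.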

P19 AN INTEGRATED SYSTEM FOR META-ANALYSES OF INDIVIDUAL PATIENT DATA: DATA DICTIONARY AND DATABASE STRUCTURE

Luc Bijnens, Albert Ivanov, Guido Hoctin Boes and Richard Sylvester
European Organization for Research and Treatment of Cancer
Brussels, Belgium

In a meta-analysis, several studies are combined in a statistically controlled manner to give an overview of the state of the art of the research in a particular area. In meta-analyses based on individual patient data, the amount of information that has to be processed is comparable to that of a multicenter trial. In comparison to a multicenter trial, a meta-analysis processes a relatively small amount of data for a larger number of patients. To be able to store, combine and process detailed information on both patients and trials, a proper database design using data dictionaries is necessary. At the EORTC Data Center a software package called Meta-Analysis Integrated System - Patient Case (MAIS-PC) is being developed with exactly these properties. It can handle the data coming from the individual trials and the trial-specific administrative information such as treatment comparisons, eligibility criteria, names and addresses of the investigators, references, etc. This information is linked to the statistical, reporting and mailing software. The system keeps track of the status of the individual trials