Experience and process for collaborating with an outsource company to create the define file. Ganesh Sankaran TAKE Solutions


Post on 18-Mar-2018


Page 1: Ganesh Sankaran - CDISCportal.cdisc.org/CDISC User Networks/North America/Atlantic... · •Typical work flow when sponsors create the SDTM / ... •Validating Define.xml & the typical

Experience and process for collaborating with an outsource company to create the define file.

Ganesh Sankaran TAKE Solutions


Agenda

• Typical workflow when sponsors create the SDTM / ADaM datasets in-house and collaborate with vendors for the Define files

• Define.xml sections

• Define.xml process: how do we extract the information from the data and documents provided?

• Validating Define.xml and the typical checks

• Common issues

• Conclusion: how soon should the sponsor start?


Typical workflow when collaborating with a vendor to create Define files

1. Sponsor provides the documents and draft data.

2. Vendor runs the compliance / structure checks on the data.

3. Vendor generates the draft Define.xml and runs the compliance checks.

4. Vendor summarizes the issues / findings and delivers the draft Define for review.

5. Sponsor reviews the findings and updates the specification / dataset / annotation.

6. Sponsor sends the updated annotations / specification / XPTs back to the vendor for a final delivery (Pass II).

7. Vendor runs the compliance checks and regenerates the final version of the Define (Pass II).


Inputs that are provided

• Annotated Case Report Form

• Mapping Specification documents

• SAS Datasets / XPTs

• Sponsor Controlled Terminology Documents, if applicable

• Protocol, if Trial Design domains are to be produced

• Data Guide / Supplemental Document


Define.XML Sections

• TOC – Metadata of Datasets

• blankcrf (annotated CRF)

• Variable Level Metadata

• Value Level Metadata

• Controlled Terminology

• Computational Algorithms

• Supplemental Data Definition Document


Define.XML Section (Not visible through the Style Sheet)

• xmlns - Identifies the default namespace for this document

• ODMVersion - Identifies the ODM version that underlies the schema for the Define-XML

• FileOID - Unique identifier for this file

• CreationDateTime - When the specific version of the define.xml file was created

• StudyName, StudyDescription, ProtocolName - Study-level information
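
Because these header items are hidden by the style sheet, it can help to read them straight from the file. The following is a minimal Python sketch, assuming an ODM 1.3 style header. The sample document, its namespace URI and all of its values are hypothetical and for illustration only, and `read_header` is not part of any CDISC tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed header; a real Define.xml carries far more content.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3"
     ODMVersion="1.3.2"
     FileOID="DEFINE.STUDY01"
     CreationDateTime="2018-03-01T10:00:00">
  <Study OID="STUDY01">
    <GlobalVariables>
      <StudyName>STUDY01</StudyName>
      <StudyDescription>Example study</StudyDescription>
      <ProtocolName>PROT-01</ProtocolName>
    </GlobalVariables>
  </Study>
</ODM>"""


def read_header(xml_text):
    """Collect the header items that the style sheet does not display."""
    root = ET.fromstring(xml_text)
    # ElementTree folds the default namespace into the tag as "{uri}ODM".
    namespace = root.tag[1:].split("}")[0] if root.tag.startswith("{") else ""
    ns = {"odm": namespace}
    header = {
        "xmlns": namespace,
        "ODMVersion": root.get("ODMVersion"),
        "FileOID": root.get("FileOID"),
        "CreationDateTime": root.get("CreationDateTime"),
    }
    # Study-level information sits under Study/GlobalVariables in ODM.
    for name in ("StudyName", "StudyDescription", "ProtocolName"):
        path = f"odm:Study/odm:GlobalVariables/odm:{name}"
        header[name] = root.findtext(path, namespaces=ns)
    return header


for key, value in read_header(SAMPLE).items():
    print(f"{key}: {value}")
```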


Define.XML components and how we generate them

• Metadata generation:

    • DOMAIN Level

    • VARIABLE Level

    • VALUE Level

• ORIGIN, CODELIST, Comments and Computational Algorithm

• blankcrf, Data Guide / Supplemental Docs

• Generate Define.xml

• Validate Define files


Define.XML process


Input Sheet for Define.XML Generation

• DOMAIN level input: a SAS-based macro utility creates the inputs for this sheet based on the datasets provided.

• VARIABLE metadata: the variable-level metadata input sheet is populated by reading the metadata of the SAS datasets provided.
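
The slides describe a SAS macro utility for this step. The sketch below is not that utility; it is a small Python illustration of the same idea, assuming the dataset has already been read into a list of row dictionaries and that labels are available as a separate mapping. The function name, the sheet column names and the sample data are all hypothetical.

```python
def variable_metadata(domain, columns, labels, rows):
    """Build one variable-level input row per dataset column.

    columns: variable names in dataset order
    labels:  mapping of variable name to label
    rows:    list of dicts, one per record
    """
    sheet = []
    for order, name in enumerate(columns, start=1):
        # Ignore missing values when inferring type and length.
        values = [row.get(name) for row in rows if row.get(name) not in (None, "")]
        numeric = bool(values) and all(isinstance(v, (int, float)) for v in values)
        length = max((len(str(v)) for v in values), default=0)
        sheet.append({
            "DOMAIN": domain,
            "VARIABLE": name,
            "LABEL": labels.get(name, ""),
            "TYPE": "Num" if numeric else "Char",
            "LENGTH": length,
            "ORDER": order,
        })
    return sheet


# Hypothetical two-record DM extract.
dm_rows = [
    {"USUBJID": "S-001", "AGE": 34},
    {"USUBJID": "S-0002", "AGE": None},
]
dm_labels = {"USUBJID": "Unique Subject Identifier", "AGE": "Age"}

for entry in variable_metadata("DM", ["USUBJID", "AGE"], dm_labels, dm_rows):
    print(entry)
```

In practice the type and length would come from the dataset's own metadata rather than being inferred from values; inference is used here only to keep the sketch free of external libraries.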


Input Sheet for Define.XML Generation

• ORIGIN information is extracted from the annotations and mapping specification provided.

• OIDs are assigned here for the variables for which a CODELIST, COMPUTATION ALGORITHM or VALUELIST needs to be populated.

• Based on the OIDs assigned in the VARIABLE LEVEL sheet, the VALUE LEVEL input sheet and the CODELIST input sheet are generated by reading the data and the associated codelist files.
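
As an illustration of the OID assignment step, here is a minimal Python sketch. The `IT.` / `CL.` / `MT.` / `VL.` prefixes and the flag names (`HAS_CODELIST`, `HAS_METHOD`, `HAS_VALUELIST`) are assumptions chosen for this example; the slides do not state which naming convention the vendor uses.

```python
def assign_oids(variables):
    """Add OID columns to each variable-level row.

    An OID is filled in only where the variable needs a codelist,
    a computation method or a value list; otherwise it stays blank.
    """
    result = []
    for var in variables:
        key = f'{var["DOMAIN"]}.{var["VARIABLE"]}'
        row = dict(var)  # copy so the input sheet is left untouched
        row["ITEM_OID"] = f"IT.{key}"
        row["CODELIST_OID"] = f"CL.{key}" if var.get("HAS_CODELIST") else ""
        row["METHOD_OID"] = f"MT.{key}" if var.get("HAS_METHOD") else ""
        row["VALUELIST_OID"] = f"VL.{key}" if var.get("HAS_VALUELIST") else ""
        result.append(row)
    return result


# Hypothetical variable-level rows.
variable_sheet = [
    {"DOMAIN": "DM", "VARIABLE": "SEX", "HAS_CODELIST": True},
    {"DOMAIN": "DM", "VARIABLE": "AGE", "HAS_METHOD": True},
    {"DOMAIN": "VS", "VARIABLE": "VSORRES", "HAS_VALUELIST": True},
]

for row in assign_oids(variable_sheet):
    print(row["ITEM_OID"], row["CODELIST_OID"], row["METHOD_OID"], row["VALUELIST_OID"])
```

The VALUE LEVEL and CODELIST input sheets can then be keyed on the non-blank `VALUELIST_OID` and `CODELIST_OID` values.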


Input Sheet for Define.XML Generation

• Value Level Input

• Codelist / Computation Methods Input


External Documents – blankcrf & Data Guide

• The annotated Case Report Form and supplemental documents such as the Data Guide are linked to the define.xml.

• The ORIGIN page numbers presented in the variable-level metadata must be hyperlinked to the corresponding CRF pages attached to the Define file.


Input Sheet for Define.XML Generation

• DEFINE.XML can be generated once the Domain level, Variable level, Value level and Codelist sheets are created, the external documents are linked, the ORIGIN, COMPUTATIONAL ALGORITHM and External Dictionary information is updated, and the inputs are reviewed.
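
To make the generation step concrete, the sketch below turns two tiny input sheets into an XML skeleton. It is a deliberately incomplete illustration, not the vendor's generator and not a schema-valid Define.xml: it omits GlobalVariables, the `def:` extensions, codelists, value lists and document links. The namespace URI, the OID prefixes and the sheet column names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"  # assumed ODM 1.3 namespace


def build_define(study_oid, domains, variables):
    """Return an XML skeleton built from domain-level and variable-level sheets."""
    # Setting "xmlns" as a plain attribute is a shortcut that places every
    # element in the default namespace when the output is parsed again.
    root = ET.Element("ODM", {"xmlns": ODM_NS, "FileOID": f"DEFINE.{study_oid}"})
    study = ET.SubElement(root, "Study", {"OID": study_oid})
    mdv = ET.SubElement(study, "MetaDataVersion",
                        {"OID": f"MDV.{study_oid}", "Name": study_oid})
    # One ItemGroupDef per dataset, with an ItemRef per variable.
    for dom in domains:
        group = ET.SubElement(mdv, "ItemGroupDef", {
            "OID": f"IG.{dom['DOMAIN']}",
            "Name": dom["DOMAIN"],
            "Repeating": dom["REPEATING"],
        })
        for var in variables:
            if var["DOMAIN"] == dom["DOMAIN"]:
                ET.SubElement(group, "ItemRef", {
                    "ItemOID": f"IT.{var['DOMAIN']}.{var['VARIABLE']}",
                    "OrderNumber": str(var["ORDER"]),
                    "Mandatory": var["MANDATORY"],
                })
    # One ItemDef per variable, after all ItemGroupDefs.
    for var in variables:
        ET.SubElement(mdv, "ItemDef", {
            "OID": f"IT.{var['DOMAIN']}.{var['VARIABLE']}",
            "Name": var["VARIABLE"],
            "DataType": var["DATATYPE"],
        })
    return ET.tostring(root, encoding="unicode")


# Hypothetical input sheets.
domain_sheet = [{"DOMAIN": "DM", "REPEATING": "No"}]
item_sheet = [
    {"DOMAIN": "DM", "VARIABLE": "USUBJID", "ORDER": 1, "MANDATORY": "Yes", "DATATYPE": "text"},
    {"DOMAIN": "DM", "VARIABLE": "AGE", "ORDER": 2, "MANDATORY": "No", "DATATYPE": "integer"},
]

print(build_define("STUDY01", domain_sheet, item_sheet))
```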


Validation Checks

• Structural checks (types of checks on the metadata):

1. Domain label mismatch

2. Variable label mismatch

3. Data type mismatch

4. Missing Expected & Required variables

5. Required / Expected variables with NULL values for all records

6. Non-standard SDTM variables

7. Variable names in lower case

8. Variable order mismatch

9. Variables with formats

10. Permissible variables present with NULL values for all records
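
A few of these structural checks (items 4 to 7 and 10) can be sketched in Python as follows. The required, expected and standard variable sets passed in are hypothetical stand-ins; in a real run they would come from the SDTM Implementation Guide metadata for the domain being checked.

```python
def structural_findings(columns, rows, required, expected, standard):
    """Return (variable, finding) pairs for a subset of the structural checks.

    columns:  variable names as they appear in the dataset
    rows:     list of dicts, one per record
    required, expected, standard: sets of upper-case variable names
    """
    findings = []
    present = {name.upper() for name in columns}
    for name in columns:
        if name != name.upper():
            findings.append((name, "variable name in lower case"))
        if name.upper() not in standard:
            findings.append((name, "non-standard SDTM variable"))
    for name in sorted((required | expected) - present):
        kind = "required" if name in required else "expected"
        findings.append((name, f"missing {kind} variable"))
    for name in columns:
        # A variable is flagged when every record holds a missing value.
        if rows and all(row.get(name) in (None, "") for row in rows):
            if name.upper() in required | expected:
                kind = "required / expected"
            else:
                kind = "permissible"
            findings.append((name, f"{kind} variable with NULL values for all records"))
    return findings


# Hypothetical VS extract and variable sets.
vs_columns = ["STUDYID", "USUBJID", "vsseq", "VSTESTCD", "VSXYZ", "VSORRES"]
vs_rows = [{"STUDYID": "S1", "USUBJID": "S1-001", "vsseq": 1,
            "VSTESTCD": "SYSBP", "VSXYZ": "x", "VSORRES": ""}]
vs_required = {"STUDYID", "USUBJID", "VSSEQ", "VSTESTCD", "VSTEST"}
vs_expected = {"VSORRES"}

for finding in structural_findings(vs_columns, vs_rows, vs_required,
                                   vs_expected, vs_required | vs_expected):
    print(finding)
```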


Validate Define.XML

• A valid Define.xml should be well formed, conform to the XML schemas, and reference the correct versions of the CDISC standards.

Sample Validation Checks

1. XML is well formed

2. All required elements are included and not empty

3. The OID attribute must be unique within a single MetaDataVersion: no duplicate def:leaf, def:ComputationMethod or def:ValueListDef elements

4. No duplicates in ItemGroupDef, ItemDef, ItemRef, Study, CodeList elements, etc.

5. Invalid data type value for CodeList elements

6. CodedValue must be unique within a single CodeList

7. Invalid codelist for variable (non-extensible CT)

8. Invalid data type value for ItemDef elements

9. Invalid ‘Filetype’, ‘MedDRA’ values

10. Invalid ‘Repeating’, ‘Mandatory’ values
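
Three of the sample checks (well-formedness, OID uniqueness, and coded-value uniqueness within a codelist) can be sketched with the Python standard library. This is an illustration only: it does not validate against the XML schemas, which the standard library cannot do, and the sample document and the `check_define` helper are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local(tag):
    """Strip any "{namespace}" prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def check_define(xml_text):
    """Return a list of findings for a few sample validation checks."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"XML is not well formed: {err}"]
    findings = []
    for mdv in root.iter():
        if local(mdv.tag) != "MetaDataVersion":
            continue
        # OIDs must not repeat for the same element type within one MetaDataVersion.
        oids = Counter((local(el.tag), el.get("OID"))
                       for el in mdv.iter() if el is not mdv and el.get("OID"))
        for (tag, oid), count in sorted(oids.items()):
            if count > 1:
                findings.append(f"Duplicate OID {oid} in {tag}")
        # Coded values must not repeat within a single CodeList.
        for codelist in mdv.iter():
            if local(codelist.tag) != "CodeList":
                continue
            coded = Counter(item.get("CodedValue") for item in codelist
                            if item.get("CodedValue") is not None)
            for value, count in sorted(coded.items()):
                if count > 1:
                    findings.append(
                        f"Duplicate CodedValue {value} in CodeList {codelist.get('OID')}")
    return findings


# Hypothetical fragment containing one duplicate OID and one duplicate coded value.
BAD_DEFINE = """<ODM><Study OID="S"><MetaDataVersion OID="MDV.1">
  <ItemDef OID="IT.DM.SEX" Name="SEX" DataType="text"/>
  <ItemDef OID="IT.DM.SEX" Name="SEX" DataType="text"/>
  <CodeList OID="CL.SEX" Name="Sex" DataType="text">
    <CodeListItem CodedValue="M"/>
    <CodeListItem CodedValue="M"/>
  </CodeList>
</MetaDataVersion></Study></ODM>"""

for finding in check_define(BAD_DEFINE):
    print(finding)
```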


Common Issues

• ORIGIN is ‘CRF’, but the variable is not annotated; or ORIGIN is ‘Derived’, but the variable is annotated in the CRF.

• Key variables not properly defined.

• Custom domains should follow the domain assumptions. Sometimes custom domains are derived without a TOPIC variable.

• Subjects collected as part of external data (LB / EG) but not populated in the DM domain. All subjects must be present in the DM domain.

• One-to-one relationship missing across some of the paired variables, such as TEST / TESTCD, PARAM / PARAMCD, VISIT / VISITNUM, AVISIT / AVISITN, TPT / TPTNUM, and TPT & TPTREF.

• Common variables across different domains having different ORIGIN derivations. If the derivation is the same across domains, the origin can be given as “Copied from ADSL.XX”.
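
Two of the issues above lend themselves to simple automated checks before the data goes to the vendor: paired variables that are not one-to-one, and subjects that appear in another domain but not in DM. The following Python sketch assumes each dataset is already available as a list of row dictionaries; the function names and sample records are hypothetical.

```python
from collections import defaultdict


def one_to_one_violations(rows, left, right):
    """Find values of `left` mapped to several `right` values, and vice versa."""
    forward, backward = defaultdict(set), defaultdict(set)
    for row in rows:
        forward[row[left]].add(row[right])
        backward[row[right]].add(row[left])
    left_issues = {k: sorted(v) for k, v in forward.items() if len(v) > 1}
    right_issues = {k: sorted(v) for k, v in backward.items() if len(v) > 1}
    return left_issues, right_issues


def subjects_missing_from_dm(dm_rows, other_rows):
    """List USUBJID values present in another domain but absent from DM."""
    dm_subjects = {row["USUBJID"] for row in dm_rows}
    return sorted({row["USUBJID"] for row in other_rows} - dm_subjects)


# Hypothetical records.
vs = [
    {"USUBJID": "S-001", "VSTESTCD": "SYSBP", "VSTEST": "Systolic Blood Pressure"},
    {"USUBJID": "S-002", "VSTESTCD": "SYSBP", "VSTEST": "Systolic BP"},
    {"USUBJID": "S-003", "VSTESTCD": "DIABP", "VSTEST": "Diastolic Blood Pressure"},
]
dm = [{"USUBJID": "S-001"}, {"USUBJID": "S-002"}]

print(one_to_one_violations(vs, "VSTESTCD", "VSTEST"))
print(subjects_missing_from_dm(dm, vs))
```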


Common Issues (contd)

• Generally, XPT files up to 1 GB in size are fine. If an XPT file exceeds 1 GB, it must be split into smaller datasets, none exceeding 1 GB (Study Data Specifications).

• Split files should have the same metadata structure so that concatenating / merging the split datasets is feasible. Both the smaller split files and the larger (non-split) file should be included.

• Split datasets and the method applied should be documented in the data guide.

• If a linear approach is not followed, ensure consistency between the ADaM and SDTM sources.
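
The splitting rule can be illustrated with a short Python sketch. It splits by record count as a stand-in for file size, since the size of the written XPT depends on the export tool, and it treats 1 GB as 1024³ bytes. Both choices are assumptions made for this example, and all names are hypothetical.

```python
ONE_GB = 1024 ** 3  # assumed interpretation of the 1 GB limit


def needs_split(size_bytes, limit_bytes=ONE_GB):
    """Return True when a file of this size exceeds the limit."""
    return size_bytes > limit_bytes


def split_rows(rows, max_rows):
    """Cut the records into consecutive parts of at most max_rows each."""
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]


def same_structure(parts):
    """Check that every non-empty part has the same variables in the same order."""
    layouts = {tuple(part[0].keys()) for part in parts if part}
    return len(layouts) <= 1


def rejoin(parts):
    """Concatenate the parts back into a single list of records."""
    return [row for part in parts for row in part]


# Hypothetical LB records.
lb = [{"USUBJID": f"S-{n:03d}", "LBSEQ": n} for n in range(1, 8)]
lb_parts = split_rows(lb, 3)

print([len(part) for part in lb_parts])
print(same_structure(lb_parts), rejoin(lb_parts) == lb)
```

Checking that the rejoined parts equal the original dataset is one way to confirm that the split files and the non-split file stay consistent before both are delivered.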


Common Issues (Contd)

• ADaM, when derived in a parallel stream, might require extra effort to ensure traceability and data lineage.


Conclusion

• Finalize the scope of the work being outsourced / to be performed by the vendor.

• Explain the process being followed and agree on a common form for the exchange of documents, which could expedite generation of the Define files.

• When working across a family of similar studies within the same indication, look for efficiency gains after a couple of iterations / studies.

• Identify the Vendor(s) at least three months before you expect the first Define.XML to be published. If possible, do a pilot or DEMO define.


Thank You