Analysis of Internal Validation Datasets Using Open-Source Software STR-validator

Sarah Riman, Erica L. Romsos, Lisa Borsuk, and Peter M. Vallone

National Institute of Standards and Technology Gaithersburg, Maryland, USA

Forensics @ NIST 2016, November 9, 2016

Disclaimer

Points of view in this presentation are mine and do not necessarily represent the official position of the National Institute of Standards and Technology or the U.S. Department of Commerce.

NIST Disclaimer: Certain commercial products are identified in order to specify experimental procedures as completely as possible. In no case does such an identification imply a recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that any of these products are necessarily the best available for the purpose.

Objectives

The focus of this workshop is to introduce the community to STR-validator, open-source software that can be used to analyze large internal validation data sets. STR-validator was created by Oskar Hansson at the Norwegian Institute of Public Health.

Participants will be trained to import data obtained from the internal validation experiments of PowerPlex Fusion 6C into STR-validator and to evaluate parameters such as analytical and stochastic thresholds, stutter percentage calculations, peak height ratios, base-pair sizing precision, and sensitivity.


Requirements

Personal computers

Installation of the R Software

Installation of the STR-validator Package

Workshop Schedule

Time: Topic

9:00 AM - 10:00 AM: Load the STR-validator package and launch its GUI; trim and slim .txt files; check precision; calculate stutter thresholds

10:00 AM - 10:10 AM: Break

10:10 AM - 11:00 AM: Calculate analytical thresholds; analyze peak height ratio

11:00 AM - 11:10 AM: Break

11:10 AM - 12:00 PM: Calculate stochastic thresholds; questions; feedback about the workshop (survey); workshop ends

Launch R

Launch R by clicking on the R icon.


Load STR-validator

In the R console, load the STR-validator package by typing library(strvalidator)

Press Enter

Launch STR-validator GUI

In the R console, launch the STR-validator Graphical User Interface (GUI) by typing: strvalidator()

Press Enter

The STR-validator main GUI
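The two console steps can also be collected in a short script. This is only a sketch: it assumes the package is installed from CRAN under the name strvalidator, and the availability check and the interactive() guard are additions for illustration, not part of the workshop material.

```r
# Check that the strvalidator package is available before loading it.
pkg_available <- requireNamespace("strvalidator", quietly = TRUE)

if (!pkg_available) {
  # Assumption: the package is installed from CRAN with install.packages().
  message("Package not found. Install it with install.packages(\"strvalidator\").")
} else if (interactive()) {
  library(strvalidator)  # load the package, as on the slide
  strvalidator()         # launch the main GUI, as on the slide
}
```

The guard keeps the GUI from being launched when the script is run non-interactively, for example from a batch job.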

What is STR-Validator?

A free and open source R-package

Intended for:
• Validating STR kits
• Processing controls
• Comparing methods and instrumentation

The STR-validator Graphical User Interface (GUI) is easy to use and can greatly increase the speed of validation

Should I be knowledgeable about programming? Not at all.


STR-Validator GUI Welcome Screen

Current Version

Remember Settings

Creator of STR-Validator

Online Resources

Help Page for each Function

Function Tabs


Analysis of Internal Validation Study of PowerPlex Fusion 6C Using STR-Validator

PowerPlex Fusion 6C

The largest commercial STR multiplex kit available for CE use.

Has a total of 27 loci, including the 20 CODIS core loci.

The 27 loci are in 6 dyes and include:
• SE33, Penta D, and Penta E
• 3 Y-STR markers (DYS391, DYS570, DYS576)

A single kit for both direct amplification and casework, with a 60 min PCR time capability.

It gives ~17 orders of magnitude of improvement using the NIST 1036 data set.

http://www.promega.com/products/pm/genetic-identity/powerplex-fusion/

Butler, J.M., Hill, C.R. and Coble, M.D. (2012) Variability of New STR Loci and Kits in US Population Groups. Profiles in DNA.

Plot Kit

Plot PowerPlex Fusion 6C

Save PowerPlex Fusion 6C in the workspace

Save the plot as an image

Select a Directory

The Workspace Tab: view your plot and save your project

Save Your Project


PowerPlex Fusion 6C

Reminder: STR-validator automatically detects the Fusion 6C kit. If the kit of interest is not in the software, you can add it to STR-validator through the DryLab tab.

What happens if R quits on you?

Alternatively, open a project from the Project tab

Remember to save your data before, during, and after analysis


How to Prepare the Data for Analysis?

Export (.txt), Import (.txt), Slim

Semi-wide table format = unstacked data

Semi-long, narrow table format = slim or stacked data

How to Manually Trim/Slim the Data for Analysis in STR-validator?

Trim: removing unwanted samples and/or columns

Slim: transforming files from GeneMapper format (semi-wide, unstacked) into STR-validator format (semi-long, stacked)

Open a New Workspace in the STR-validator GUI and save it as Name.RData (e.g. Trim_Slim_Analysis)

Import DataSet

Set1

Perform Manual Trimming

The Trim function removes unwanted samples and columns

Trim the Ladder

Keep or remove sample(s) in the Sample frame.

Keep or remove column(s) in the Column frame.

A pipe (|) is used for separation.

Double-click on the sample/column you wish to remove or keep.

Set1 is trimmed and saved as Set1_trim.
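Trimming can be pictured as pattern-based filtering of rows and columns. The sketch below uses base R on a toy table and is not STR-validator's own trim implementation; the sample names, column names, and patterns are illustrative assumptions. Words separated by a pipe behave as alternatives when the string is used as a regular expression.

```r
# Toy GeneMapper-style table (illustrative values only).
set1 <- data.frame(
  Sample.Name = c("Ladder", "S1", "S2", "NegCtrl"),
  Marker      = "TH01",
  Allele.1    = c("4", "6", "7", NA),
  Allele.2    = c("5", "9.3", "7", NA),
  Dye         = "G",
  stringsAsFactors = FALSE
)

# Words separated by a pipe (|) form an "or" pattern.
remove_samples <- "Ladder|NegCtrl"
keep_columns   <- "Sample.Name|Marker|Allele"

# Drop rows whose sample name matches; keep columns whose name matches.
drop_row  <- grepl(remove_samples, set1$Sample.Name)
keep_col  <- grepl(keep_columns, names(set1))
set1_trim <- set1[!drop_row, keep_col]
```

Here the ladder and the negative control are removed and the Dye column is dropped, leaving only the sample rows and the allele columns.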

View the Trimmed Dataset (Set1_trim)

Semi-wide table format = unstacked data

Ladder is removed

Perform Manual Slimming

The Slim function converts the GeneMapper semi-wide table format into the STR-validator semi-long data frame.


Slim a Dataset

Set1_trim is slimmed and saved as Set1_trim_slim.

View the Slimmed Dataset

Semi-long, narrow table format = slim or stacked data (STR-validator format)
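The wide-to-long conversion can be sketched in base R. This is a toy illustration of stacking, not STR-validator's slim function; the column names Allele.1, Allele.2, Height.1, and Height.2 and all values are assumptions.

```r
# Toy semi-wide table: one row per sample and marker, alleles side by side.
wide <- data.frame(
  Sample.Name = c("S1", "S1"),
  Marker      = c("D3S1358", "TH01"),
  Allele.1    = c("15", "7"),
  Allele.2    = c("16", NA),      # NA: no second allele called
  Height.1    = c(1480, 2950),
  Height.2    = c(1390, NA),
  stringsAsFactors = FALSE
)

# Stack each numbered Allele/Height pair under shared column names.
stack_peak <- function(i) {
  data.frame(
    Sample.Name = wide$Sample.Name,
    Marker      = wide$Marker,
    Allele      = wide[[paste0("Allele.", i)]],
    Height      = wide[[paste0("Height.", i)]],
    stringsAsFactors = FALSE
  )
}
long <- do.call(rbind, lapply(1:2, stack_peak))

# Drop empty slots and sort so each sample's peaks sit together.
long <- long[!is.na(long$Allele), ]
long <- long[order(long$Sample.Name, long$Marker), ]
rownames(long) <- NULL
```

The result has one row per observed peak, which is the stacked layout the slides call slim.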

Automatic Trimming and Slimming in STR-validator

Set1

Remember to slim your .txt files either manually or automatically in STR-validator

Remember to Save Your Workspace


Precision Analysis

Precision (SWGDAM Guidelines)

• Characterizes the degree of mutual agreement among a series of individual measurements/values and results.
• Depends only on the distribution of random errors and does not relate to the true value or specified value.
• Is usually expressed in terms of imprecision and computed as a standard deviation of the test results.

How to measure the precision of your instrument?

All measured alleles should fall within a ±0.5 bp window around the measured size for the corresponding allele in the allelic ladder.
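The ±0.5 bp criterion and the standard-deviation summary can be sketched in base R. The table below is toy data, the column name Ladder.Size is hypothetical, and this is not how STR-validator itself computes its precision statistics.

```r
# Toy sizing data: one designated ladder and two ladders run as samples.
obs <- data.frame(
  Sample.Name = rep(c("Ladder", "A", "B"), each = 2),
  Marker      = "TH01",
  Allele      = rep(c("6", "7"), times = 3),
  Size        = c(159.10, 163.05, 159.12, 163.10, 159.08, 163.00),
  stringsAsFactors = FALSE
)

# Reference sizes come from the designated ladder.
ref <- obs[obs$Sample.Name == "Ladder", c("Marker", "Allele", "Size")]
names(ref)[3] <- "Ladder.Size"

# Compare every sample allele with the matching ladder allele.
samples <- obs[obs$Sample.Name != "Ladder", ]
cmp <- merge(samples, ref, by = c("Marker", "Allele"))
cmp$Deviation <- cmp$Size - cmp$Ladder.Size
within_window <- all(abs(cmp$Deviation) <= 0.5)

# Standard deviation of the size per marker and allele, largest first.
size_sd <- aggregate(Size ~ Marker + Allele, data = obs, FUN = sd)
names(size_sd)[3] <- "Size.Sd"
size_sd <- size_sd[order(size_sd$Size.Sd, decreasing = TRUE), ]
```

Sorting by Size.Sd in descending order puts the least precise allele at the top, which is the row to inspect first.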

Experimental Procedure for Precision Analysis

One injection of 24 ladders performed

1 ladder assigned as the “ladder”

22 ladders assigned as samples (A-V)

Analyzed at your Analytical Threshold (AT)

Export __GenotypeTable.txt from GeneMapper with at least the following information: “Sample.Name”, “Marker”, “Allele”, and “Size”.

How to Plot Size Precision for the Allelic Ladders?

How to Get Summary Statistics for Precision?

Open a New Workspace in the STR-validator GUI and save it as Name.RData (e.g. Precision_Analysis)

Import Ladder DataSet

Precision Tab

Plot Precision

Size Precision Boxplot for the Allelic Ladders by Allele

Calculate Summary Statistics for Precision

Go to Summary Statistics for Precision and sort “Size.Sd” in descending order

Note that none of the intervals extend near the ±0.5 bp range

Remember to Save Your Workspace


Stutter

• Is a well-characterized PCR artifact.
• Appears as a minor peak one or more repeat units upstream or downstream from a true allele.
• Results from strand slippage during the amplification process.

Courtesy Dr. John M. Butler


Experimental Procedure for Stutter Ratio

95 single source samples at 1.0 ng of DNA input included in stutter ratio calculation

Analyzed at AT=1 in all dye channels with stutter filters turned off

Export __GenotypeTable.txt from GeneMapper with at least the following information: “Sample.Name”, “Marker”, “Allele”, and “Height”.

How are Stutters Calculated in STR-Validator?

$$\text{Stutter type} = \text{stutter peak designation} - \text{true allele designation} = 10 - 11 = -1$$

$$\text{Stutter ratio} = \frac{\text{stutter peak height}}{\text{true allele peak height}} = \frac{320}{5126}$$
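The same arithmetic can be written as a few lines of R. This is a minimal sketch with illustrative values and hypothetical variable names; it does not call STR-validator.

```r
# One true allele and its candidate stutter peak (illustrative values).
true_allele    <- 11
true_height    <- 5126
stutter_peak   <- 10
stutter_height <- 320

# Stutter type: repeat difference between the stutter peak and the true allele.
stutter_type <- stutter_peak - true_allele

# Stutter ratio: stutter height relative to the true allele height.
stutter_ratio <- stutter_height / true_height

# The same value expressed as a percentage.
stutter_pct <- 100 * stutter_ratio
```

A negative type means the stutter peak is shorter than the true allele (backward stutter); a positive type means it is longer (forward stutter).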

How to Plot Stutter Ratio as a Function of the True Allele?

How to Calculate Average Stutter Percentage at Each Locus?

Open a New Workspace, Name It, and Save It as Stutter_Analysis

Stutter Analysis

Import Data Set and Import Reference Set

Import Data

Import Reference


Reference set contains the known profiles for the dataset samples.

Reference set is used to extract the known alleles from the dataset.

Therefore, it is very important to work with a correct reference set.

Reference dataset requires the following information: “Sample.Name”, “Marker”, and “Allele”.
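The role of the reference set can be illustrated with a small base R example that flags which observed peaks are known alleles. The exact-match key on sample, marker, and allele is a simplifying assumption made here; the data are toy values, and this is not STR-validator's own matching logic.

```r
# Observed peaks for one sample at one marker (illustrative values).
data <- data.frame(
  Sample.Name = "S1",
  Marker      = "D3S1358",
  Allele      = c("14", "15", "16"),
  Height      = c(95, 1480, 1390),
  stringsAsFactors = FALSE
)

# Known profile of the same sample.
ref <- data.frame(
  Sample.Name = "S1",
  Marker      = "D3S1358",
  Allele      = c("15", "16"),
  stringsAsFactors = FALSE
)

# Build a combined key and flag the peaks listed in the reference.
key <- function(d) paste(d$Sample.Name, d$Marker, d$Allele, sep = "|")
data$Known <- key(data) %in% key(ref)

known <- data[data$Known, ]   # true alleles
other <- data[!data$Known, ]  # candidate stutter or other artifacts
```

If the reference profile is wrong, true alleles end up in the second table and are treated as artifacts, which is why a correct reference set matters.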

Calculate Stutter

Calculate Stutter Ratio

Check Subsetting

The naming convention for samples is very important.

To prevent errors, always test the subsetting.


Analysis Range of Stutter Ratio

Number of backward stutters = 2, i.e. a maximum repeat difference of 2 (n-2 repeats)

Number of forward stutters = 1, i.e. a maximum repeat difference of 1 (n+1 repeats)
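The analysis range translates into a set of positions searched around each true allele. The helper below is a hypothetical illustration for whole-repeat alleles only; microvariants such as 9.3 are not handled, and this is not STR-validator's code.

```r
# Candidate stutter positions around a true allele for a given analysis range.
stutter_positions <- function(allele, back = 2, forward = 1) {
  # Backward stutters are negative offsets, forward stutters positive.
  offsets <- c(-rev(seq_len(back)), seq_len(forward))
  data.frame(Type = offsets, Position = allele + offsets)
}

# Two backward stutters and one forward stutter around allele 14.
pos <- stutter_positions(14, back = 2, forward = 1)
```

For allele 14 this yields types -2, -1, and +1 at positions 12, 13, and 15.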

Level of Interference

[Figure: example profiles with alleles 14 and 16 and stutter positions marked X, illustrating the three interference levels]

• No overlap between stutters and alleles
• Stutter-stutter interference allowed
• Stutter-allele interference allowed

Hansson, O., P. Gill, and T. Egeland, STR-validator: an open source platform for validation and process control. Forensic Sci Int Genet, 2014. 13: p. 154-66.

Replace “False Stutters”

View the Results and Sort the Ratio Column

Plot Stutters

Stutter Ratio as a Function of Parent Allele

The stutter ratio increases as the number of repeats increases.

Plot Stutter Ratio as a Function of Peak Height

Stutter Ratio as a Function of Peak Height


Calculate Stutter Statistics by Stutter

View the Results and Sort the Perc.95 Column (decreasing)

The highest stutter ratio is observed for stutter type -1 at marker SE33.

Calculate Stutter Statistics by Locus

View the Results and Sort the Perc.95 Column (decreasing)

The highest stutter ratio is observed at marker D12S391.
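Per-locus statistics of this kind can be sketched in base R. The ratios below are invented for illustration, the percentile uses R's default quantile definition (which may differ from the one STR-validator applies), and only the column name Perc.95 is taken from the slides.

```r
# Toy stutter ratios (illustrative values only).
stutter <- data.frame(
  Marker = rep(c("SE33", "TH01"), each = 3),
  Type   = -1,
  Ratio  = c(0.10, 0.12, 0.14, 0.02, 0.03, 0.04),
  stringsAsFactors = FALSE
)

# Mean and 95th percentile of the ratio for each marker.
mean_ratio <- tapply(stutter$Ratio, stutter$Marker, mean)
perc_95    <- tapply(stutter$Ratio, stutter$Marker,
                     function(x) unname(quantile(x, probs = 0.95)))

stats <- data.frame(
  Marker  = names(perc_95),
  Mean    = as.numeric(mean_ratio),
  Perc.95 = as.numeric(perc_95),
  stringsAsFactors = FALSE
)

# Sort by the 95th percentile, largest first.
stats <- stats[order(stats$Perc.95, decreasing = TRUE), ]
```

Grouping by both Marker and Type instead of Marker alone would give the by-stutter view rather than the by-locus view.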

Remember to Save Your Workspace


The Analytical Threshold

Analytical Threshold

Peaks at and above this threshold can be reliably distinguished from background noise and are generally considered either artifacts or true alleles.

Source: SWGDAM Autosomal STR Interpretation Guidelines


The Analytical Threshold

Experimental Design

Sensitivity study data

Three mostly heterozygous samples selected

DNA input amounts: 2.0 ng, 1.0 ng, 0.5 ng, 0.25 ng, 0.125 ng, 0.0625 ng, and 0.031 ng

Amplified in triplicate with positive and negative controls

Analyzed at 1 RFU in all dye channels

Export the __SamplePlotSizingTable.txt from GeneMapper with at least the following information: “Dye/Sample Peak”, “Sample.File.Name”, “Marker”, “Allele”, “Height”, and “Data.Point”.


The Analytical Threshold

Different methods for analytical threshold calculations

Users can plot the analyzed data

Methods 1, 2, 4, and 7 are calculated simultaneously (method 6 is not included in this calculation)

Masked data used to estimate the AT can be exported for manual calculations to confirm the result

AT1, AT2, AT4, AT6

AT7

What do these AT methods mean?


Analysis of AT1, AT2, AT4, and AT7 in STR-validator (DNA dilution series data, values in RFU)

Dye channel   AT1   AT2   AT4   AT7
Blue           64   106    69    53
Green          68   137    73    50
Yellow         53    91    58    34
Red            57   107    61    40
Purple         55   107    60    36

The Analytical Threshold (Methods 1, 2, 4, 7)

[Figure panels: negative control samples; positive control samples]

The Analytical Threshold (Methods 1, 2, 4, and 7)

1. Create an Analysis Method with peak amplitude thresholds = 1 RFU in all dye channels
2. Import DNA sensitivity data into GeneMapper
3. Analyze the samples
4. Select all samples in the Samples table
5. Open the Samples Plot window
6. Select “Show all” to show all dyes
7. Show the Sizing Table
8. The Sizing Table must contain all the following columns: Dye/Sample Peak, Marker, Allele, Height, Data.Point
9. Export the Sizing Table
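Before importing the exported file, it can help to confirm that the required columns are present. The sketch below reads a tab-delimited table from an inline string that stands in for the exported file; the single data row is invented. With R's default name checking, spaces and slashes in headers become dots, so “Dye/Sample Peak” is read as Dye.Sample.Peak. Whether STR-validator's own import applies the same renaming is an assumption.

```r
# Inline stand-in for an exported sizing table (tab-delimited, toy values).
txt <- paste0(
  "Dye/Sample Peak\tSample File Name\tMarker\tAllele\tHeight\tData Point\n",
  "B,1\tS1.fsa\tD3S1358\t15\t850\t1150\n"
)
sizing <- read.delim(text = txt, stringsAsFactors = FALSE)

# Column names as they appear after R's default name cleaning.
required <- c("Dye.Sample.Peak", "Sample.File.Name", "Marker",
              "Allele", "Height", "Data.Point")

# Any required column that is absent from the table.
missing_cols <- setdiff(required, names(sizing))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```

For a real export, the `text = txt` argument would be replaced by the path to the .txt file.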

Page 97: Analysis of Internal Validation Datasets Using Open-Source … · 2017-12-20 · Analysis of Internal Validation Datasets Using Open-Source Software STR-validator Sarah Riman, Erica

Open a New Workspace in STR-validator GUI and save as Name.RData (e.g. Analytical_Threshold_Analysis)

Page 98: Analysis of Internal Validation Datasets Using Open-Source … · 2017-12-20 · Analysis of Internal Validation Datasets Using Open-Source Software STR-validator Sarah Riman, Erica

File_SamplePlotSizingTable.txt

Import DNA Dilution Sizing Table and Reference Data

Page 99: Analysis of Internal Validation Datasets Using Open-Source … · 2017-12-20 · Analysis of Internal Validation Datasets Using Open-Source Software STR-validator Sarah Riman, Erica

c c

Import Data Import Reference

Page 100: Analysis of Internal Validation Datasets Using Open-Source … · 2017-12-20 · Analysis of Internal Validation Datasets Using Open-Source Software STR-validator Sarah Riman, Erica

Check Your Workspace

Page 101: Analysis of Internal Validation Datasets Using Open-Source … · 2017-12-20 · Analysis of Internal Validation Datasets Using Open-Source Software STR-validator Sarah Riman, Erica

Calculate Analytical Thresholds

Page 102: Analysis of Internal Validation Datasets Using Open-Source … · 2017-12-20 · Analysis of Internal Validation Datasets Using Open-Source Software STR-validator Sarah Riman, Erica

Calculate ATs

Check Subsetting

Mask Peaks: high peaks, the area around sample alleles, and ILS peaks

Manually Inspect the Masking

Prepare and Mask, Choose a Sample, Save Plot

Saved in the Workspace

Output of Analysis is a List of Three Data Frames

1. Result for each sample and method
2. Percentile rank of noise used to calculate AT2 (Method 2)
3. Raw data = peaks included in the calculations + masked peaks
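To make the relationship between noise peaks and a threshold concrete, here is a minimal sketch in base R. It assumes two commonly described definitions: Method 1 as the mean noise plus k standard deviations, and Method 2 as the noise height at a chosen percentile rank. The function names, the values k = 3 and rank = 0.99, and the noise vector are illustrative assumptions, not STR-validator's internals or defaults.

```r
# Hypothetical noise peak heights (RFU) that were not masked.
noise <- c(4, 6, 5, 7, 9, 5, 6, 8, 4, 7, 10, 6)

# Method 1 (assumed form): mean noise plus k standard deviations.
at_method1 <- function(x, k = 3) mean(x) + k * sd(x)

# Method 2 (assumed form): noise height at a chosen percentile rank.
at_method2 <- function(x, rank = 0.99) unname(quantile(x, probs = rank))

at1 <- at_method1(noise)
at2 <- at_method2(noise)
print(round(c(AT1 = at1, AT2 = at2), 1))
```

Because Method 2 reads the threshold directly off the ranked noise, it can never exceed the tallest unmasked peak, which is one reason the masking step deserves a manual check.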

Results = AT Values for each Sample and Method

What do these columns represent?

AT Results for each sample and Method

The Analytical Threshold Results

AT for each method per sample

AT for each method per dye per sample

AT for each method globally across all samples

AT for each method globally across all samples per dye
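The four levels of summary (per sample, per dye per sample, global, and global per dye) differ only in how the noise peaks are grouped before the threshold is computed. A minimal sketch in base R, using hypothetical data and a mean + 3 SD threshold as an assumed stand-in for any of the AT methods:

```r
# Hypothetical noise peaks: sample name, dye channel, and height in RFU.
noise <- data.frame(
  Sample = rep(c("S1", "S2"), each = 6),
  Dye    = rep(c("B", "G", "Y"), times = 4),
  Height = c(5, 7, 4, 6, 8, 5, 9, 6, 7, 5, 10, 6)
)

# Illustrative threshold (assumption): mean + 3 SD of the grouped noise.
at <- function(x) mean(x) + 3 * sd(x)

per_sample     <- aggregate(Height ~ Sample, data = noise, FUN = at)
per_sample_dye <- aggregate(Height ~ Sample + Dye, data = noise, FUN = at)
global         <- at(noise$Height)
global_per_dye <- aggregate(Height ~ Dye, data = noise, FUN = at)

print(global_per_dye)
```

Finer grouping leaves fewer peaks per estimate, so per-dye-per-sample values are typically more variable than the global ones.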

Summary Statistics after Analyzing 66 Samples at Different DNA Inputs

DNA Dilution Series Data   AT1   AT2   AT4   AT7
Blue                        64   106    69    53
Green                       68   137    73    50
Yellow                      53    91    58    34
Red                         57   107    61    40
Purple                      55   107    60    36

Percentile Rank of Noise Used to Calculate AT2 (Method 2)

Masked Raw Data

How to Export Masked Data for Manual Checks and Calculations

Evaluate the Distribution of Noise

Extract peaks included in the calculation from the masked dataset

Discard Masked Data

Click Apply, but do not Save at this step
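For a manual check outside the GUI, discarding the masked peaks amounts to a row filter on the exported data. This is a minimal sketch in base R; the column name `Masked` and the example rows are hypothetical, so the exported file should be inspected for the actual flag column.

```r
# Hypothetical export: one row per peak, with a logical flag for masked peaks.
masked_data <- data.frame(
  Sample = c("S1", "S1", "S1", "S1"),
  Height = c(6, 1450, 8, 5),
  Masked = c(FALSE, TRUE, FALSE, FALSE)
)

# Keep only the peaks that were included in the AT calculation.
noise_only <- masked_data[!masked_data$Masked, ]
print(noise_only)
```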

Crop Data from ILS

The Result Tab

Check assumptions

Plot Gaussian (Normal) Distribution of Noise

Gaussian (Normal) Distribution of Noise Signal

Plot Natural Logarithm of Noise

Natural Logarithm of Noise Signals
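The comparison of the raw and log-transformed noise can also be sketched outside the GUI. The example below is a base R illustration on simulated data, not STR-validator output: the noise is log-normal by construction, and the Shapiro-Wilk test is an assumed choice of normality check that the workshop does not prescribe.

```r
set.seed(1)
# Simulated noise heights (RFU); log-normal by construction, for illustration.
noise <- rlnorm(300, meanlog = log(6), sdlog = 0.4)

# Shapiro-Wilk test: small p-values indicate departure from normality.
p_raw <- shapiro.test(noise)$p.value
p_log <- shapiro.test(log(noise))$p.value

# Side-by-side histograms of the raw and ln-transformed noise.
op <- par(mfrow = c(1, 2))
hist(noise, main = "Noise (RFU)", xlab = "Height")
hist(log(noise), main = "ln(Noise)", xlab = "ln(Height)")
par(op)
```

If the log-transformed noise looks closer to normal than the raw noise, a method that assumes log-normal noise may describe the baseline better than one that assumes a Gaussian.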

Remember to Save Your Workspace

The Analytical Threshold: Method 6

1. Analyze samples in GeneMapper at your AT

2. Export -GenotypeTable.txt from GeneMapper with at least the following columns: “Sample.Name”, “Marker”, “Allele”, and “Height”.

3. Import the data, from one or several batches of sensitivity studies

To calculate AT6, a kit must be specified.

However, kit is NOT an option in the calculateAT6_gui function.

Download the updated STR-validator development version “1.8.0.9002”.

(1) Install devtools by typing or pasting the following command into the R console:

    install.packages("devtools", dependencies=TRUE)

(2) Install the updated development version by typing this into the command window:

    devtools::install_github("oskarhansson/strvalidator")

Reference: https://github.com/OskarHansson/strvalidator/commit/55aa1e7cb7b257435350cda77b52e1b062c21596

Peak Balance

Peak Height Ratio (PHR)

The PHR establishes expectations for allele pairing when defining genotypes in mixed samples. It indicates which alleles may be heterozygous pairs.

To express the PHR as a percentage, divide the peak height of the allele with the lower relative fluorescence unit (RFU) value by the peak height of the allele with the higher RFU value, then multiply this value by 100.

Source: SWGDAM Guidelines
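The percentage calculation described above can be written as a one-line function. This is a minimal sketch in base R; the function name `phr_percent` and the example heights are hypothetical.

```r
# PHR as a percentage: lower peak height divided by higher, times 100.
# Works element-wise, so vectors of paired heights can be passed.
phr_percent <- function(height_a, height_b) {
  100 * pmin(height_a, height_b) / pmax(height_a, height_b)
}

# Hypothetical heterozygous pair with peak heights of 800 and 1000 RFU.
example_phr <- phr_percent(800, 1000)
print(example_phr)
```

Because the lower peak is always the numerator, the result does not depend on the order of the two alleles and never exceeds 100.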

Experimental Design for Peak Height Ratio Analysis

Sensitivity study data

Three mostly heterozygous samples selected

DNA input amounts: 2.0 ng, 1.0 ng, 0.5 ng, 0.25 ng, 0.125 ng, 0.0625 ng, and 0.031 ng

Amplified in triplicate with positive and negative controls

Analyzed at your AT

Export __GenotypeTable.txt from GeneMapper with at least the following columns: “Sample.Name”, “Marker”, “Height”, and “Allele”.

Plot Peak Height Ratio

Summarize Balance at Each Locus

Open a New Workspace in STR-validator GUI and save as Name.RData (e.g. PeakBalance_Analysis)

Import Data and Import Reference

Intra-locus Peak Balance

Example peak heights: D10S1248 has LMW = 550 and HMW = 465; CSF1PO has LMW = 431 and HMW = 529.

Definition of Hb                              D10S1248          CSF1PO
Peak Height HMW / Peak Height LMW             465/550 = 0.85    529/431 = 1.23
Peak Height LMW / Peak Height HMW             550/465 = 1.18    431/529 = 0.81
Peak Height smaller / Peak Height larger      465/550 = 0.85    431/529 = 0.81
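The three definitions of Hb can be compared side by side with a small function. This is a minimal sketch in base R, not STR-validator's own code: the function name `hb` and its method labels are hypothetical, and the peak heights are the ones used in the worked examples (D10S1248: LMW 550, HMW 465; CSF1PO: LMW 431, HMW 529).

```r
# Heterozygote balance under three definitions.
# lmw, hmw: peak heights of the low and high molecular weight alleles.
hb <- function(lmw, hmw,
               method = c("hmw_over_lmw", "lmw_over_hmw", "smaller_over_larger")) {
  method <- match.arg(method)
  switch(method,
    hmw_over_lmw        = hmw / lmw,
    lmw_over_hmw        = lmw / hmw,
    smaller_over_larger = pmin(lmw, hmw) / pmax(lmw, hmw)
  )
}

d10_hb <- c(
  hmw_over_lmw        = hb(550, 465, "hmw_over_lmw"),
  lmw_over_hmw        = hb(550, 465, "lmw_over_hmw"),
  smaller_over_larger = hb(550, 465, "smaller_over_larger")
)
csf_hb <- c(
  hmw_over_lmw        = hb(431, 529, "hmw_over_lmw"),
  lmw_over_hmw        = hb(431, 529, "lmw_over_hmw"),
  smaller_over_larger = hb(431, 529, "smaller_over_larger")
)
print(round(rbind(D10S1248 = d10_hb, CSF1PO = csf_hb), 2))
```

Only the smaller/larger definition is bounded at 1; the two directional definitions can exceed 1 and so preserve which allele was the taller one.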

Calculate Balance

Results of Hb Analysis

Plot Balance

Peak Height Ratio plotted by the mean peak height of the locus

Plot Balance

Peak Height Ratio plotted by Locus

Calculate Hb Summary Statistics

Plot Balance Dialogue

View the Results and Sort the Column of Perc.95 (Increasing)

The worst balance is observed for marker D3S1358
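Ranking markers by a low quantile of Hb is one way to find the locus with the worst balance. The sketch below uses base R with hypothetical Hb values; the 0.05 quantile is an illustrative assumption, and which quantile STR-validator reports in its Perc.95 column is not restated here.

```r
# Hypothetical Hb values (smaller/larger definition) for three markers.
hb_data <- data.frame(
  Marker = rep(c("D3S1358", "TH01", "FGA"), each = 4),
  Hb     = c(0.55, 0.62, 0.71, 0.90,
             0.80, 0.85, 0.91, 0.95,
             0.70, 0.78, 0.88, 0.93)
)

# Low quantile of Hb per marker (0.05 is an assumed, illustrative choice).
q05 <- sapply(split(hb_data$Hb, hb_data$Marker),
              function(x) unname(quantile(x, probs = 0.05)))

# Sort increasing so the marker with the worst balance comes first.
ranked <- sort(q05)
print(round(ranked, 3))
```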

Remember to Save Your Workspace

Workshop Schedule

Time                    Topic
9:00 AM - 10:00 AM      Load STR-validator package and launch the GUI; Check Precision; Calculate Stutter Thresholds
10:00 AM - 10:10 AM     Break
10:10 AM - 11:00 AM     Calculate Analytical Thresholds; Analyze Peak Height Ratio
11:00 AM - 11:10 AM     Break
11:10 AM - 12:00 PM     Calculate Stochastic Thresholds; Questions; Feedback about the workshop (survey); Workshop ends

Stochastic Threshold

The stochastic threshold:

Is the RFU value above which it is reasonable to assume that, at a given locus, allelic dropout of a sister allele has not occurred.

Minimizes the chance of wrongly calling a heterozygous locus homozygous.

Source: SWGDAM Autosomal STR Interpretation Guidelines

Calculating Stochastic Threshold

Experimental Design

Sensitivity study data

Three mostly heterozygous samples selected

DNA input amounts: 2.0 ng, 1.0 ng, 0.5 ng, 0.25 ng, 0.125 ng, 0.0625 ng, and 0.031 ng

Amplified in triplicate with positive and negative controls

Analyzed at your AT

Export __GenotypeTable.txt from GeneMapper with at least the following columns: “Sample.Name”, “Marker”, “Height”, and “Allele”.

Stochastic Threshold

Probability of drop-out modelled by logistic regression

Open a New Workspace in STR-validator GUI and save as Name.RData (e.g. StochasticThreshold_Analysis)

Instead of Import, click Open to load Stochastic_Threshold_Analysis.RData

Amount, Reference Set, Data Set

Calculate Dropouts for Set7

Four Methods to Score Drop-out Alleles

Drop-out = allele with a peak height lower than the limit of detection threshold (LDT).

LDT is not the AT. The lowest peak height in the dataset is automatically suggested in the ‘Limit of Detection Threshold’ field.

Drop out Scoring Result

Sort the “RFU” column (peak height of the surviving allele) in decreasing order

The tallest peak with drop-out of the sister allele is 239 RFU, observed at Penta E.

Drop-out Scoring Result

Dropout: 0 (no dropout), 1 (allele dropout), or 2 (locus dropout)
Rfu: height of the surviving allele
Heterozygous: 1 for heterozygous and 0 for homozygous
Average peak height (H) for each sample
Total peak height for each sample
Number of peaks
Number of expected peaks
Profile proportion
Drop-out is scored relative to a random allele (Method X); if the HMW allele is missing (Method 1); if the LMW allele is missing (Method 2); or if any of the alleles are missing (Method L).
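The 0/1/2 coding of drop-out at a single heterozygous locus can be sketched as follows. This is a simplified illustration in base R, not STR-validator's scoring routine: the function `score_dropout`, its arguments, and the example heights are hypothetical, and it does not distinguish between the four scoring methods.

```r
# Score one heterozygous locus against its reference genotype.
# expected: character vector with the two reference alleles.
# observed: named numeric vector of peak heights (RFU); names are alleles.
# ldt:      limit of detection threshold (RFU).
score_dropout <- function(expected, observed, ldt) {
  detected <- expected %in% names(observed)[observed >= ldt]
  n_found <- sum(detected)
  if (n_found == length(expected)) {
    list(Dropout = 0, Rfu = NA_real_)   # no drop-out
  } else if (n_found == 0) {
    list(Dropout = 2, Rfu = NA_real_)   # locus drop-out
  } else {
    # Allele drop-out: report the height of the surviving allele.
    list(Dropout = 1, Rfu = unname(max(observed[expected[detected]])))
  }
}

# Hypothetical locus: allele 16 falls below the LDT, allele 15 survives.
result <- score_dropout(c("15", "16"), c("15" = 240, "16" = 30), ldt = 50)
print(result)
```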

Model Drop-out

Plot Drop-out Prediction

Probability of drop-out modelled by logistic regression

Probability of drop-out is 5% at 160 RFU

A conservative threshold is 202 RFU
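How a threshold is read off a logistic regression can be sketched in base R. The example below uses simulated data and the untransformed peak height as the predictor, both of which are assumptions for illustration; STR-validator's model options and its conservative threshold (which accounts for the uncertainty of the fit) are not reproduced here.

```r
set.seed(42)
# Simulated data: peak height of the surviving allele (RFU) and whether the
# sister allele dropped out (1) or not (0). Purely illustrative values.
height  <- runif(400, min = 20, max = 400)
dropout <- rbinom(400, size = 1, prob = plogis(3 - 0.02 * height))

# Logistic regression of drop-out on peak height.
fit <- glm(dropout ~ height, family = binomial)
b <- coef(fit)

# Height at which the fitted drop-out probability equals the accepted risk:
# solve qlogis(risk) = b0 + b1 * height for height.
risk <- 0.05
threshold <- unname((qlogis(risk) - b["(Intercept)"]) / b["height"])
print(round(threshold))
```

Lowering the accepted risk moves the threshold to a higher peak height, since drop-out becomes less likely as the surviving allele gets taller.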

Plot Drop-Out Data

Dot-plot

Plot Drop out Data

Drop out Events by Marker

Plot Heat-map from the Drop-out Data

Heat-map arranged by DNA-input

Add Amount Information to Set7_Dropout Dataset

Plot Heat-map from the Drop-out Data

Heat-map Arranged by DNA-input

Plot Heat-map from the Drop-out Data by Sample Name

Drop out Events by Sample

Summary of Thresholds

Analysis of data based on Analytical Method: AT7

Drop-out scoring               Stochastic Threshold   Conservative Stochastic Threshold
Relative to the LMW allele     160                    202
Relative to the HMW allele     122                    157
Relative to a random allele    138                    182
Per locus                      193                    227

49 heterozygote alleles with drop-out of the sister allele

Acknowledgments:

NIST: Peter Vallone, Erica Romsos, Lisa Borsuk

Norwegian Institute of Public Health: Oskar Hansson

Contact Info:

[email protected]

(301) 975-4162
