White Paper
Recommendations for Use and Fit-for-Purpose Validation of Biomarker Multiplex Ligand Binding Assays in Drug Development
Darshana Jani,1,13 John Allinson,2 Flora Berisha,3 Kyra J. Cowan,4 Viswanath Devanarayan,5 Carol Gleason,6
Andreas Jeromin,7 Steve Keller,8 Masood U. Khan,9 Bill Nowatzke,10 Paul Rhyne,11 and Laurie Stephen12
Received 18 May 2015; accepted 12 August 2015; published online 16 September 2015
Abstract. Multiplex ligand binding assays (LBAs) are increasingly being used to support many stages of drug development. The complexity of multiplex assays creates many unique challenges in comparison to single-plexed assays, leading to various adjustments for validation and potentially during sample analysis to accommodate all of the analytes being measured. This often requires a compromise in decision making with respect to choosing final assay conditions and acceptance criteria of some key assay parameters, depending on the intended use of the assay. The critical parameters that are impacted due to the added challenges associated with multiplexing include the minimum required dilution (MRD), quality control samples that span the range of all analytes being measured, quantitative ranges which can be compromised for certain targets, achieving parallelism for all analytes of interest, cross-talk across assays, and freeze-thaw stability across analytes, among many others. Thus, these challenges also increase the complexity of validating the performance of the assay for its intended use. This paper describes the challenges encountered with multiplex LBAs, discusses the underlying causes, and provides solutions to help overcome these challenges. Finally, we provide recommendations on how to perform a fit-for-purpose-based validation, emphasizing issues that are unique to multiplex kit assays.
KEY WORDS: biomarker; diagnostics; ligand binding assay; multiplex; validation.
INTRODUCTION
Ligand binding assays (LBAs) are commonly used tools to measure many clinically relevant analytes, including biomarkers and new biologic-based drugs. Most LBAs rely on antibodies as critical reagents to capture and detect the analyte of interest in a biological matrix. Traditionally, these assays, which we refer to in this paper as single-plexed assays, are designed to detect the presence of a single analyte. Commonly used single-plexed assays rely on detection technologies that include enzyme-driven detection (ELISA), chemiluminescence (CL), radioactive isotopes, fluorescence (FL), and electrochemiluminescence (ECL). New technologies have been developed that allow multiple analytes to be measured simultaneously in a single sample, all in a single reaction vessel. Performing multiple assays together in this context is referred to as multiplexing. The demand for and usefulness of multiplex assays have resulted in a large number of premade commercially available multiplex assay kits. These kits offer a welcome increase in the number of tools that scientists have at their disposal, in addition to reduced laboratory analysis time, reduced sample volume requirements, and cost savings. However, multiplex assays also present unique challenges not encountered with single-plexed assays. Challenges such as different detection ranges, cross-reactivity with multiplex assay reagents, increased specificity issues and matrix interference, cross-talk across assays, and changes in sensitivity can generate misleading results if not carefully addressed.
The development of new therapeutic drugs requires many years of preclinical and clinical studies in order to obtain drug approval. The importance of the data generated
This manuscript is the result of a collaborative effort from the AAPS Biomarker Discussion group, part of the Ligand Binding Assay Bioanalytical Focus Group.
1 Pfizer Inc., One Burtt Road, Andover, Massachusetts 01810, USA.
2 LGC Ltd, Newmarket Road, Fordham, Cambridgeshire CB7 5WW, UK.
3 Kyowa-Kirin Pharmaceuticals, 212 Carnegie Center #101, Princeton, New Jersey 08540, USA.
4 Genentech, 1 DNA Way, South San Francisco, California 94080, USA.
5 AbbVie Inc., 1 North Waukegan Road North, Chicago, Illinois 60064, USA.
6 Bristol-Myers Squibb, Route 206 and Province Line Road, Princeton, New Jersey 08540, USA.
7 Quanterix Corporation, 113 Hartwell Avenue, Lexington, Massachusetts 02421, USA.
8 AbbVie Inc., 1500 Seaport Blvd, Redwood City, California 94063, USA.
9 KCAS Bioanalytical and Biomarker Services, 12400 Shawnee Mission Parkway, Shawnee, Kansas 66216, USA.
10 Radix Biosolutions, 111 Cooperative Way #120, Georgetown, Texas 78626, USA.
11 Quintiles Corporation, 1600 Terrell Mill Road Suite 100, Marietta, Georgia 30067, USA.
12 Ampersand Biosciences, LLC, 3 Main St., Saranac Lake, New York 12983, USA.
13 To whom correspondence should be addressed. (e-mail: [email protected])
The AAPS Journal, Vol. 18, No. 1, January 2016 (© 2015)
DOI: 10.1208/s12248-015-9820-y
1550-7416 © 2015 American Association of Pharmaceutical Scientists
in these studies mandates the use of well-characterized assays, both in terms of analytical performance and reliability of the data. Thus, many of these assays are validated prior to use to ensure that the performance and data integrity are suitable for the intended use. There are many excellent publications that provide guidance on how to validate these assays and how to ensure that the analytical performance is fit for its intended purpose (1–3), as well as the recent draft guidance for industry on bioanalytical method validation (4). However, these publications do not cover some aspects of validation that are unique to multiplex assays.
In this publication, we first provide a brief overview of the available multiplexing technologies along with the benefits and challenges of each technology. Throughout the paper, we highlight some of the basic principles of LBA method validation for new users in the context of multiplexing. We then present an overview of the unique challenges with commercial multiplex assay kits and recommend solutions to help overcome these challenges. Finally, we provide recommendations on how to perform a fit-for-purpose validation of multiplex kit assays, with an emphasis on the unique aspects of multiplex assays.
OVERVIEW OF MULTIPLEXING TECHNOLOGIES
The emerging multiplex LBA technologies are based on either micro-bead or solid-phase planar arrays, both requiring specialized instruments. The bead-based technologies typically distinguish LBAs from one another by assigning a unique bead set to each assay, in contrast to solid-phase planar arrays, which distinguish each LBA by a location on the solid-phase surface. The technologies differ in the mechanism used to detect and quantify the captured analyte, for example, FL, CL, or ECL. There are also notable differences in the benefits and limitations of each of these technologies. The intended use should be the driving force in selecting the appropriate multiplex assay. Several recent publications comprehensively review the latest multiplexing technologies (5–8).
Recommendations on How to Select a Suitable Multiplex Assay
Commercial multiplex kits are an economical way of measuring multiple analytes from a small volume of sample. They are generally designed to evaluate endogenous markers involved in a specific pathway or disease indication. A feasibility assessment of a commercially available multiplex assay kit for the intended study involves the evaluation of a number of key parameters, including the range of detection; the availability of reliable and representative reference or purified sources of the analyte; and the performance of the multiplex in sample matrix in the presence of endogenous levels of analyte, specificity, sensitivity, vendor support, and other technology considerations. The use of commercial assay kits can expedite sample analysis and conserve resources. However, the trade-off is that these kits may not have been developed for the intended purpose of the study of interest. In such circumstances, acceptable performance must be demonstrated, which may require feasibility testing. For example, methods commercialized to measure analytes in serum may be adapted to CSF collections. Likewise, vendor kits may require modification to meet regulatory requirements. Frequent modifications include expanding the number/concentration of calibrators and quality controls. Currently available multiplex assay vendors are listed in Table I. The following section provides an outline of the recommended key parameters that will guide the user toward finding the best multiplex assay:
1. Quantitative range of detection and sensitivity. Most vendors provide calibration curve information with the multiplex assay product or kit insert for each analyte to determine whether the assay will provide sufficient sensitivity and range of detection. However, the range of detection reported by the vendor may not reflect the range of detection that will be encountered in a specific sample matrix of interest. If the quantitative range is unknown or questionable, the user should determine the range of detection for each analyte in both healthy and disease-state samples to ensure that the sample analyte levels fall within the quantitative range of the assay for all analytes. This can be done by working directly with the vendor or by performing additional development experiments in the user's own laboratory.
2. Performance in sample matrix. The majority of commercially available assays are developed and tested using purified proteins or protein mixtures made in assay buffer and are only occasionally tested in a clinical matrix such as serum or plasma. When matrix samples are used, they are often a combination of native and spiked analytes, due to variable levels of each analyte present in the samples. Commercially available multiplex kits are often manufactured to be used for several applications. As part of the feasibility assessment for a specific study, it is important that the assay performance is assessed in its intended sample matrix. An increased number of samples may be needed to cover the ranges of all of the analytes in the multiplex. If the kit vendor or other reliable source cannot confirm performance of the assay in the intended sample matrix, the user will need to perform development experiments to test this parameter prior to kit selection for further use.
3. Specificity. The majority of multiplex assay vendors provide information on the specificity for each assay in the context of a multiplex environment. Ideally, the vendor should have performed both a single detection and single calibrator assay to ensure the specificity of each assay within the multiplex or compared sample results from the single-plexed assay with those from the multiplex. This can be challenging for the scientist to perform due to a lack of availability of single-plexed reagents. Data from the vendor can give confidence in the specificity of the multiplex. However, depending on the intended use of the data, the user may need to perform additional specificity experiments on the assay to confirm the parameter.
4. Vendor support. A healthy business relationship with the vendor plays a critical role in assessing and evaluating the multiplex assays (9). Critical examples of communication from the vendor include manufacturing or reagent changes, supporting information on the kit performance, any information on and availability of key reagents included in the kit and, most importantly, the availability of large lots of each kit (lot-to-lot variability is discussed in a later section). Ongoing timely technical support is essential to ensure that problems are quickly solved.
In summary, the selection criteria for the type of platform and its corresponding kits depend mainly on the four critical parameters above. It is recommended that all of this information be used to decide whether to pursue a type of multiplexing platform. If these key parameters fail to provide scientifically sound information, or if some of these key assay attributes are missing, scientists are encouraged to consider different technology platforms.
MULTIPLEX ASSAY CHALLENGES AND SOLUTIONS
Multiplex assays inherently suffer from numerous analytical challenges during development, validation, and assay maintenance and throughout sample analysis. A few principal challenges to consider and their corresponding solutions are described below. Although this paper discusses mainly the challenges and solutions for commercial kit-based assays, certain key aspects of multiplex development are also included. When employing commercial kit-based assays, vendors typically provide supporting data that address these challenges through characterization work during their kit development; however, it is recommended that every scientist vigorously examine the data prior to validation and sample testing. Examples of solutions during various stages of multiplex validation, sample testing, and assay maintenance are described in Table II.
Table I. Review of Some Current Multiplexing Platforms
| Technology | Surface | Detection | Benefits | Websites |
|---|---|---|---|---|
| Bioscale | Acoustic membrane particle technology | Sensor resonant frequency | Sensitivity | http://bioscale.com/vibe-workstation |
| Curiox | Magnetic beads (DropArray plates) | FL | Sensitivity; Automated; Kit menu; Custom assay development | http://www.curiox.com/applications.html |
| Quanterix | Magnetic beads (SiMOA) | FL | Sensitivity; Automated; Kit menu; Custom assay development; Regulatory path from RUO to IVD/CDx | www.quanterix.com |
| Luminex | Fluorescent beads | FL | Sensitivity; Kit menu; Custom assay development; Regulatory path from RUO to IVD/CDx; Flexibility to build your own assay | http://www.luminexcorp.com |
| Cyto beads | Magnetic beads | FL | Sensitivity; Kit menu | http://info.bio-rad.com |
| Meso Scale Discovery | Electrode carbon surface | ECL | Sensitivity; Kit menu; Custom assay development; Automation | http://www.mesoscale.com |
| Aushon | Quartz | DCD | Sensitivity; Kit menu; Custom assay development | http://www.aushon.com |
| Randox | Biochip | CL | Sensitivity; Kit menu; Custom assay development; Automation; Regulatory path from RUO to IVD/CDx | http://www.randox.com |
| Genalyte | Silicon photonic biosensors | Surface resonance units | Sensitivity; Kit menu; Custom assay development; Elimination of cross-talk | http://genalyte.com |
FL fluorescence, RUO research use only, IVD/CDx in vitro diagnostic/companion diagnostic, SiMOA single molecule array
Challenges with Quantitative Ranges and Optimal Sample Dilution
In a multiplex assay, sample dilution of a particular analyte must take into account the concentrations of the other analytes present in the sample. There are times when a sample will have low concentrations of some of the analytes and high concentrations of others, making the decision to dilute the sample difficult. The challenge presented by this situation is to decide on the appropriate sample dilution factor that ensures that all the analytes in the sample fall into their respective quantitative range. The compromise in sample dilution may not be the optimal dilution for every analyte being measured. The following example further clarifies the situation.
In the development of a multiplex panel to measure apolipoprotein (Apo) profiles associated with cardiovascular disease (Fig. 1), it was noted that the optimal dilution (middle of the curve) for Apo AII was 1:200,000, whereas the optimal dilution for Apo B and Apo E was 1:4000 and 1:1000, respectively. A compromise was made for Apo B and Apo E at 1:2000, with most of the samples falling within a good range of the curve at that dilution. For Apo AII, a dilution of 1:2000 resulted in samples falling above the upper limit of quantitation (ULOQ); therefore, the assay was redesigned as a competitive assay, which decreased the assay's sensitivity but brought the optimal sample dilution to 1:2000.
Another consideration for this quantitative range and sample dilution challenge is the matrix of the sample. LBAs typically perform differently in samples taken from normal healthy subjects versus disease-state patients, and the levels of analytes in different sample matrices often vary substantially. The sample matrix should therefore be carefully considered when searching for an optimal sample dilution. Although the minimum required dilution (MRD) is determined based on fundamental principles of quantitative measurement of analyte, the issue is compounded by the number of analytes in a multiplex assay. Similar to sample dilution, the MRD may also be impacted by matrix and level of analyte. Figure 2 illustrates the calculation of the MRD using six serum samples, clarifying stepwise how the MRD can be determined. If an acceptable sample dilution cannot be achieved, the user should consider removing the problematic analytes from the multiplex panel and running them separately.
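The search for a compromise dilution described in this section can be sketched in a few lines of Python. This is a minimal illustration, not a method from the paper: the function names, the neat concentrations, and the quantitative ranges are all hypothetical values chosen to mimic the apolipoprotein example, and a real assessment would use measured concentrations across many samples.

```python
# Hypothetical neat (undiluted) concentrations and quantitative ranges,
# in arbitrary units. None of these numbers come from the paper.
NEAT = {"Apo AII": 400000.0, "Apo B": 8000.0, "Apo E": 2000.0}
RANGES = {"Apo AII": (0.1, 10.0), "Apo B": (0.1, 10.0), "Apo E": (0.1, 10.0)}


def analytes_in_range(neat_conc, ranges, dilution_factor):
    """Return {analyte: bool}: does the diluted level fall within [LLOQ, ULOQ]?"""
    result = {}
    for analyte, conc in neat_conc.items():
        lloq, uloq = ranges[analyte]
        diluted = conc / dilution_factor
        result[analyte] = lloq <= diluted <= uloq
    return result


def common_dilutions(neat_conc, ranges, candidates):
    """Return the candidate dilution factors at which every analyte is in range."""
    return [
        d for d in candidates
        if all(analytes_in_range(neat_conc, ranges, d).values())
    ]


if __name__ == "__main__":
    candidates = [1000, 2000, 4000, 200000]
    # No single dilution suits all three analytes in this illustration.
    print(common_dilutions(NEAT, RANGES, candidates))
    # Removing the problematic analyte leaves several workable dilutions.
    sub_panel = {k: v for k, v in NEAT.items() if k != "Apo AII"}
    print(common_dilutions(sub_panel, RANGES, candidates))
```

When the list of common dilutions is empty, the options discussed in the text apply: redesign the problematic assay or run that analyte separately.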
Challenges with Cross-Reactivity (Specificity)
Cross-reactivity occurs when the capture or detection reagents in an LBA recognize similar epitopes on other analytes present in the sample. Epitopes that are located in conserved regions of related proteins, regions with similar secondary or tertiary structure, or similar amino acid sequences are often problematic. A multiplex assay creates an environment that is more susceptible to cross-reactivity issues than single-plexed assays. In the case of commercial kits, manufacturers typically test for cross-reactivity in their multiplex assays. Due to the complexity and expense of reproducing this work, researchers should investigate the possibility of obtaining authenticated copies of raw data from the manufacturer, describing the results of their cross-reactivity experiments. It can be challenging to obtain this information from vendors. If it is obtained, and depending upon the level of confidence in the data supplied, it may be possible to accept that information without having to repeat the tests separately. A second option is to compare the multiplex to well-characterized single-plexed assays (10–12). Although a single-plexed assay may or may not generate results comparable to the multiplex assay due to differences in reagents, this exercise could provide valuable information on assay performance. In the case where results are not comparable, a third approach to testing reagent cross-reactivity within a multiplexed assay is the "missing man" technique, where all analytes except one are added to the assay. Changes in the performance of the multiplex assays (positive or negative) would suggest that cross-reactivity is occurring with the reagents being tested. Likewise, the assay for the omitted analyte should generate a signal at background level. An example of this is shown below (which may vary depending on the platform), whereby a series of plates are prepared, varying capture antibody target (C), analyte (A), and detection antibody (D), for a 10-plex:
1. (C, A, D) = (+10, +10, +10)
2. (+10, +9, +10) = target cross-reactivity
3. (+10, +10, +9) = detection cross-reactivity
4. (+9, +10, +10) = capture cross-reactivity
As multiplex assays are far more susceptible to cross-reactivity due to the complexity of multiple capture and detection reagents in a single format, we recommend that the scientist evaluate cross-reactivity as needed, depending on the quality of the data available from the vendor.
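The four "missing man" plate conditions listed above can be enumerated programmatically for any panel. The sketch below is one possible way to lay out the design; the function name, the dictionary layout, and the placeholder analyte names are assumptions made for illustration and are not part of any vendor software.

```python
def missing_man_design(analytes, omit):
    """Build the four plate conditions for one omitted analyte.

    Each condition lists which capture antibodies, analytes, and detection
    antibodies are present. `analytes` and `omit` are illustrative names.
    """
    full = list(analytes)
    without = [a for a in analytes if a != omit]
    return {
        # (C, A, D) = (+N, +N, +N): complete panel, used as the reference
        "full panel": {"capture": full, "analyte": full, "detection": full},
        # (+N, +N-1, +N): analyte omitted
        "target cross-reactivity": {"capture": full, "analyte": without, "detection": full},
        # (+N, +N, +N-1): detection antibody omitted
        "detection cross-reactivity": {"capture": full, "analyte": full, "detection": without},
        # (+N-1, +N, +N): capture antibody omitted
        "capture cross-reactivity": {"capture": without, "analyte": full, "detection": full},
    }


if __name__ == "__main__":
    panel = [f"analyte_{i}" for i in range(1, 11)]  # placeholder 10-plex
    design = missing_man_design(panel, omit="analyte_3")
    for name, parts in design.items():
        counts = tuple(len(parts[k]) for k in ("capture", "analyte", "detection"))
        print(name, counts)
```

Repeating the call for each analyte in turn gives the full set of conditions for the panel; in every condition, the assay for the omitted component is expected to read at background.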
Challenges of Cross-Talk
Cross-talk is different from cross-reactivity. Cross-reactivity deals specifically with chemical interference between antibody pairs and their analytes. Cross-talk in multiplex assays is any case in which a signal from one analyte in isolation creates an unwanted effect on another. Cross-talk is sometimes described as well-to-well or spot-to-spot "carryover," "bleed-over," or "leaching," an event which compromises the quantitation of each analyte. Multiplex ligand-binding assays that rely on "spots" within a well or solid surface use spot location to distinguish each individual analyte. Vendors typically provide data supporting the lack of cross-talk. Nevertheless, it is recommended that cross-talk be evaluated in a feasibility test regardless of data availability from the vendor (13). One way to test for cross-talk is to vary the concentration of one analyte over the full dynamic range of the assay while keeping the other analytes in the multiplex at a low constant concentration. Blank samples should also be included in this test, which will reveal increased background signals caused by cross-talk. If preliminary experiments fail to confirm the manufacturer's claims, scientists are encouraged to assess alternative kits.
Table II. Overview of Challenges and Solutions: Multiplex Validation, Sample Analysis, and Assay Maintenance
| Method stage | Challenge | Solution |
|---|---|---|
| Method validation | Data handling | • Use built-in templates • Contract IT resources • Work with instrument/software vendors to assist with solutions |
| Method validation | All pass/procedure if an analyte system fails? | • Report data for analytes that pass, repeat run for analytes that do not pass (second run should mask results for passing analytes) • Analyte that does not meet validation acceptance criteria (if repeat analysis fails) should not be included in multiplex sample analysis • Demonstrate that removal of capture/detect/analyte for failed assay does not change the multiplex assay performance |
| Method validation | Lack of regulatory performance recommendations | Utilize approach based on bioanalytical methods for biotherapeutics, implementing acceptance criteria prior to in-study analysis, and using fit-for-purpose approach based on intended use of the assay |
| Sample analysis | Data handling (worklist preparation, data management, assay approval, data reporting) | • Possible use of macros, built-in templates • Contract IT resources • Work with instrument/software vendors to assist with solutions |
| Sample analysis | All pass: if you must rerun one analyte, what do you do with data for other passing analytes from first run? | Report data for analytes that pass, repeat run for analytes that do not pass (second run should mask results for passing analytes). Analyte that does not meet sample analysis acceptance criteria (if repeat analysis fails) should not be included in final analysis. |
| Sample analysis | Identical curve fitting | • Use best fit approach • It is acceptable to use different curve fits/weighting for different analytes; however, for a single analyte continue using same curve fitting after validation |
| Sample analysis | Lack of regulatory performance recommendations | • Utilize regulatory requirements that most closely meet the study requirements • FDA Draft Guidance on Bioanalytical Method Validation includes a discussion on biomarkers |
| Assay maintenance | Variability in reagent lots | • Implement a defined process to investigate lot-to-lot variability • Apply a correction factor • Screen multiple lots • Acquire sufficient volumes of an original lot with expiration dating that allows completion of the program |
| Assay maintenance | Availability of critical reagents | • Use surrogate molecules if needed • Initiate an analyte/antibody production program in anticipation starting the project • Contact vendors to assist with sourcing • Review research literature for possible academic sources |
Italicized points are unique to multiplexing
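The cross-talk titration test described in this section (one analyte varied across its dynamic range while the others are held at a low constant concentration) lends itself to a simple screen of the resulting signals. The sketch below is illustrative only: the function name, the 20% tolerance, and the signal values are assumptions, not criteria from this paper, and the tolerance should be set from the known precision of each assay.

```python
def flag_cross_talk(signals, titrated, tolerance=0.2):
    """Flag constant-level analytes whose signal drifts during the titration.

    signals:   {analyte: [signal at each titration level]}, where index 0 is
               the lowest level of the titrated analyte (assumed > 0).
    titrated:  name of the analyte whose concentration was varied.
    tolerance: maximum allowed fractional deviation from the first level
               (0.2 is an assumed placeholder, not a recommended limit).
    """
    flagged = []
    for analyte, values in signals.items():
        if analyte == titrated:
            continue
        baseline = values[0]
        max_dev = max(abs(v - baseline) / baseline for v in values)
        if max_dev > tolerance:
            flagged.append(analyte)
    return flagged


if __name__ == "__main__":
    # Hypothetical raw signals; analyte names are placeholders.
    example = {
        "analyte_A": [100.0, 1000.0, 10000.0, 50000.0],  # titrated analyte
        "analyte_B": [200.0, 205.0, 198.0, 210.0],       # stable
        "analyte_C": [150.0, 160.0, 300.0, 900.0],       # rises with A
    }
    print(flag_cross_talk(example, titrated="analyte_A"))
```

The same comparison can be applied to the blank wells, where any rise in background as the titrated analyte increases points to cross-talk.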
Selectivity Challenges
The ability of an assay to recognize only the analyte of interest in the presence of sample matrix is referred to as selectivity. Selectivity issues are amplified in multiplex assays due to the increased number of reagents and analytes being measured. Examples of interfering molecules that contribute to selectivity issues include soluble receptors, rheumatoid factor, and heterophilic antibodies (14). Similar to single-plexed assays, selectivity in a multiplex assay is typically tested by assessing recovery of the analyte from spiked samples containing the interference factor to be tested. Occasionally, selectivity issues may be present for some analytes but not others. Reproducibility of samples from one experiment to another may also be impacted. During study-specific feasibility experiments, it is recommended that a few individual samples, preferably target disease-state samples and the target matrix pool, are analyzed twice to evaluate whether the mean values from the two experiments agree within 30%. This exercise will help determine early on in the evaluation of a multiplex kit which analytes are likely to pass validation. In addition, reproducibility results will aid in establishing the MRD and in setting up the parallelism experiments in validation. Solutions for selectivity include increasing the sample dilution or the addition of blocking agents, detergents, or heterophilic antibody blockers. This is a key challenge for multiplex assays, as changes in buffers needed for one analyte may negatively impact one or more of the other analytes in the panel. The scientist should evaluate whether selectivity issues are critical enough to address and then determine if the affected assay should be run as a single-plexed assay or if it is worthwhile to spend time and effort finding a solution that is amenable for the entire multiplex panel.
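The repeat-analysis check recommended above can be expressed as a short calculation. The text does not specify how the 30% agreement is computed, so the sketch below assumes a percent difference relative to the average of the two experiment means; the function names are illustrative.

```python
def percent_difference(mean_run1, mean_run2):
    """Absolute difference between two means, as a percentage of their average.

    This definition is an assumption; other conventions (for example,
    difference relative to the first run) are also in use.
    """
    average = (mean_run1 + mean_run2) / 2.0
    return abs(mean_run1 - mean_run2) / average * 100.0


def reproducible(mean_run1, mean_run2, limit=30.0):
    """True if the two experiment means agree within `limit` percent."""
    return percent_difference(mean_run1, mean_run2) <= limit


if __name__ == "__main__":
    # Hypothetical means for one sample and one analyte from two experiments.
    print(round(percent_difference(100.0, 120.0), 1), reproducible(100.0, 120.0))
    print(round(percent_difference(100.0, 150.0), 1), reproducible(100.0, 150.0))
```

In a multiplex setting the check is run per analyte and per sample, which makes it easy to see early which analytes are unlikely to pass validation.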
Fig. 1. Multiplex curves and samples for apolipoproteins AII, B, and E. Panels: (a) Apo AII, samples at 1:200,000; (b) Apo AII, samples at 1:2000; (c) Apo B, samples at 1:2000; (d) Apo E, samples at 1:2000. A multiplex assay was developed for serum apolipoproteins (Apo) on the Luminex platform. For the Apo AII assay (a), the optimal dilution was 1:200,000; however, the optimal dilution for Apo B (c) and Apo E (d) is closer to the 1:2000 range, with samples falling below the LLOQ at 1:200,000. Conversion of the Apo AII assay to the competitive format (b) decreased the assay sensitivity to bring the optimal dilution down to 1:2000. The calibrators are represented by the blue circles, and patient samples (n=49) are represented by green squares
Challenges with Lot-to-Lot Variability
One of the most critical aspects of using commercial assays is lot-to-lot reproducibility (9). One strategy that is commonly used to overcome this limitation is to secure a large number of kits prior to the study. This strategy can help overcome difficulties such as the halting of assay production in the middle of a study and can prevent additional data variability associated with lot changes. A second part of this strategy is to analyze the samples in batches to minimize assay variability over time or with multiple kit lots. Finally, sparing a few kits from old lots enables the laboratory to bridge the old lot to new lots. More details on kit bridging are discussed later in this paper.
Fig. 2. Calculation of optimal minimum required dilution (MRD): results of parallelism assessment for six individual matrix samples for a single analyte. In this figure, non-parallelism is apparent over part of the dilution range. Where there is parallelism across neat and diluted samples, no dilution will be required (1). The first chart shows test results adjusted for the dilution factor for all dilutions plotted against the actual dilution. In this example, the results increase as the dilution increases (due to matrix interference of some type) until they level out. A consensus of results is observed once the interference effects have been sufficiently diluted out. In this chart, the dilutions of 1/8, 1/16, and 1/32 appear to have good consensus. This effectively indicates that an MRD of 1/8 is potentially needed (2). The second chart shows the same data calculated as % recovery using the neat sample result as the 100% target value. The red dashed lines are the acceptance limits for this particular case, and clearly, the results fall outside those, due to the matrix interference in the inadequately diluted samples (3). The next step to prove that the variance from 1/8 to 1/32 results is acceptable is to recalculate the % recovery but now using the 1/8 dilution results as the 100% target. Here, the variance of higher dilutions up to 1/32 meets the acceptance criteria and proves that the MRD is 1/8. It also shows that diluting samples up to 1/32 would achieve acceptable results. Parallelism is therefore shown between the 1/8 and the 1/32 dilutions. This example shows a single analyte across six individual matrix samples. It may be more appropriate to use a larger number of samples when conducting this test, although there is often difficulty in obtaining matrix with significant concentrations of the biomarker of interest to allow multiple dilutions to be assessed that result in levels within the analytical range. It is recommended that this experiment is conducted for every analyte in the multiplex panel. It is usually easier to assess data results in this way on an analyte-by-analyte basis rather than by presenting multiple analytes together from a single sample
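The stepwise MRD determination described in the Fig. 2 caption can be sketched as a small routine: take the dilution-adjusted results, use each dilution in turn as the 100% reference, and return the first one from which all higher dilutions recover within the acceptance limits. The function name, the 80 to 120% limits, and the example values below are assumptions for illustration; the caption does not state the limits used in the figure.

```python
def estimate_mrd(adjusted, low=80.0, high=120.0):
    """Estimate the MRD for one sample and one analyte.

    adjusted: {dilution_factor: dilution-adjusted concentration}, where a
              dilution factor of 1 means the neat sample.
    low/high: % recovery acceptance limits (assumed values).

    Returns the smallest dilution factor for which every higher dilution
    recovers within the limits relative to it, or None if there is none.
    """
    dilutions = sorted(adjusted)
    # The highest dilution cannot serve as a reference because no further
    # dilutions are available to confirm it.
    for i, reference in enumerate(dilutions[:-1]):
        recoveries = [
            100.0 * adjusted[d] / adjusted[reference]
            for d in dilutions[i + 1:]
        ]
        if all(low <= r <= high for r in recoveries):
            return reference
    return None


if __name__ == "__main__":
    # Hypothetical dilution-adjusted results that rise and then level out,
    # as in the matrix-interference pattern described for Fig. 2.
    sample = {1: 50.0, 2: 70.0, 4: 90.0, 8: 118.0, 16: 122.0, 32: 120.0}
    print(estimate_mrd(sample))
```

For a panel, the routine would be run for every analyte and every individual sample, and a conservative choice is to take the largest MRD observed across samples for each analyte.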
RECOMMENDED ADJUSTMENTS TO FIT-FOR-PURPOSE VALIDATION PRACTICES FOR MULTIPLEX KIT ASSAYS
There are several publications that describe how to perform a fit-for-purpose validation of commercial single-plexed LBAs (15,16). There are also several recent papers that have performed fit-for-purpose validation of commercial multiplex kits (17–20). Many of the recommendations for single-plexed assays also apply for performing a fit-for-purpose validation of multiplex assays. The reader is encouraged to review these key publications for performing a fit-for-purpose validation on those aspects that are not covered here.
Biomarker Work Plan
A biomarker work plan (BWP) is a formal written document that establishes the study objectives for a bioanalytical project and provides general expectations for method performance (9). The BWP also defines the rigor of validation work necessary and addresses other considerations necessary for a successful outcome. While not a regulatory requirement, a BWP is good business practice, particularly for multiplex methodologies where method feasibility experiments, validation, and sample testing are more complex than single-plexed assays. A flexible strategy often helps overcome many issues generally observed during multiplex validation. Similar to what has been described in earlier publications for single-plexed assay validation (1,2), depending on the utility of the data from the multiplex method (and for each analyte in the multiplex), the scientist may choose to perform prevalidation experiments as recommended in this paper only or carry out a fit-for-purpose level of validation where the robustness is also assessed. An understanding of the importance (including the intended purpose) of each biomarker included in the panel prior to implementing validation experiments will help generate appropriate target acceptance criteria. Common target acceptance criteria for prevalidation and validation experiments are summarized in Table III.
Number of Analytes in a Panel
Considerations for the number of total analytes in a multiplex panel include the intended use of the data and the required rigor of the validation acceptance criteria. In some cases, large multiplex panels with a lower level of performance may be acceptable for discovery-type exploratory uses where the scientist is often only looking for patterns or relative differences between the analytes. Some of the analytes in a panel may not need as high a level of validation as other analytes in the panel, which should be stated in the validation plan. For example, higher analytical variability may be acceptable for certain analytes in a panel even in a clinical trial setting such as for pharmacodynamic (PD) biomarkers, if the planned study size provides adequate power to detect the effect/response of such analytes. However, if all analytes in the multiplex assay are to be used to support critical decisions, a high level of validation for all the analytes would be recommended. Where time is a critical factor, a smaller number of analytes in the panel is more practical, to balance the quality of the data against the other benefits of multiplexing, such as savings on matrix volume, time, and cost. QC data from smaller panels are easier to interpret and therefore enable better decision making. In addition, the chances of failing the assay run acceptance criteria and having to retest samples increase considerably with the addition of more analytes in the panel.
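The growth in run-failure risk with panel size can be illustrated with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each analyte passes its run acceptance criteria independently and with the same probability; real analytes in a multiplex share reagents and plate conditions, so their failures are likely correlated and the numbers should be read as an intuition aid only.

```python
def prob_any_analyte_fails(p_pass, n_analytes):
    """Probability that at least one of n analytes fails a run.

    Assumes independent analytes with a common per-analyte pass
    probability p_pass (a simplifying assumption, not a claim from the paper).
    """
    return 1.0 - p_pass ** n_analytes


if __name__ == "__main__":
    # Hypothetical per-analyte pass probability of 95%.
    for n in (1, 5, 10, 30):
        print(n, round(prob_any_analyte_fails(0.95, n), 3))
```

Under these assumptions the chance of at least one failing analyte rises quickly as analytes are added, which is consistent with the preference for smaller panels when time is critical.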
Precision and Accuracy
In a multiplexing environment, the accuracy and precision of each assay will likely vary from the same assay run in single-plexed format. The user is encouraged to maintain a level of flexibility, based on the intended use, in setting up target prestudy and in-study validation acceptance criteria for each analyte that is part of a multiplex panel. Thus, this section focuses on the aspects of assay validation that are unique to multiplex assays. A set of validation samples (VS) for the analyte of interest at levels covering the target range of study samples is used to assess accuracy and precision. These VS may be used as the QCs when monitoring assay performance during sample analysis after validation is successfully completed. It is noted that some companies prefer to generate a separate set of QCs at different concentrations, within the range of quantitation. It is recommended that the decision is made based on quantity of samples and length of study, so that an uninterrupted supply of QCs is available to last through the study.
As with single-plexed assays, to monitor assay performance and precision in a multiplex assay, endogenous QCs are recommended for as many analytes as possible. If an endogenous QC is identified early during development studies, it should be included in validation as it can provide greater confidence of assay performance during sample testing over time; however, unless there is an orthogonal method for determining concentration, the endogenous QC cannot be used to assess accuracy but only precision. However, in order to work with spiked QCs, the nominal value may need to be corrected for the endogenous concentration. In addition, if the manufacturers provide QCs to use with commercial kit assays, these can be the most convenient source from which to prepare VS as well as in-study QCs. It is recommended that QCs are as representative of "in-study" samples as possible and that they are produced in-house so that they help determine whether the kits are performing as expected by the manufacturer and help bridge different lots of kits as necessary. This also helps to establish lab- and method-specific analyte levels, as well as statistically relevant acceptable ranges. For some kits, in cases where the sample matrix has high endogenous levels of some of the analytes, the VS or dilutional linearity samples used to assess precision and accuracy may need to be made using a surrogate matrix. For in-study run acceptance criteria, the 4-6-30 rule may be acceptable for LBA quantitative methods. Alternatively, we recommend that the acceptance criteria should be statistically aligned to method performance and be based on clinical understanding of the analyte, intended use, biological variability of the patient population, and expected physiological changes.
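The 4-6-30 rule mentioned above can be checked per analyte with a few lines of code. The sketch below assumes the common reading of the rule (a run passes when at least four of six QC results fall within 30% of their nominal values); the function name, data layout, analyte names, and numbers are all hypothetical, and any additional per-level requirement a laboratory applies is not modeled.

```python
def passes_4_6_x(qc_results, tolerance_pct=30.0, required=4, out_of=6):
    """Check a 4-6-X style run acceptance rule for one analyte.

    qc_results: list of (measured, nominal) concentration pairs.
    Assumption: the run passes when at least required/out_of of the QC
    results deviate from nominal by no more than tolerance_pct.
    """
    if not qc_results:
        raise ValueError("no QC results supplied")
    within = sum(
        1 for measured, nominal in qc_results
        if abs(measured - nominal) / nominal * 100.0 <= tolerance_pct
    )
    # Integer cross-multiplication avoids comparing floating-point fractions.
    return within * out_of >= required * len(qc_results)


# Hypothetical QC data for a two-analyte panel (low, mid, high in duplicate).
panel_qcs = {
    "analyte_A": [(9, 10), (11, 10), (95, 100), (140, 100), (480, 500), (700, 500)],
    "analyte_B": [(9, 10), (14, 10), (60, 100), (140, 100), (480, 500), (700, 500)],
}
run_status = {name: passes_4_6_x(qcs) for name, qcs in panel_qcs.items()}
print(run_status)
```

Evaluating each analyte separately reflects the point made in this section that criteria may differ by analyte; `tolerance_pct` can be set per analyte when a statistically derived limit is preferred over the fixed 30%.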
Table III. Parameters for Evaluation During Prevalidation and Validation Experiments, Including Target Acceptance Criteria

| Parameter to be measured | Number of plates for prevalidation | Number of plates for validation | Recommended target values (within quantitative range) |
| --- | --- | --- | --- |
| Standard curve accuracy and precision | Two plates | Six to ten plates | Mean %CV of back-calculated concentrations ≤20% (inter-assay). Mean %RE of back-calculated concentrations ≤20%. Note that precision during validation is tested using validation samples prepared at three to five levels. |
| Validation samples | N/A | Three to five levels (LLOQ, low, mid, high, ULOQ) in chosen matrix | Test in duplicate over 3–5 days with multiple scientists. Results for each analyte should be accurate (%RE ≤30%) and precise (%CV ≤30%). |
| Limit of quantitation | N/A | N/A | Calculated using multiple runs from validation for each analyte |
| Parallelism or endogenous dilutional linearity (if endogenous levels are high enough) | Five individuals, disease-state individuals if available | Ten individuals, disease-state individuals if possible | Results for each dilution when recalculated for the dilution factor should be 100% original (neat) result ± (inter-assay CV% × 3). This is equivalent to accepting ±3SD of the expected result (approximate confidence limit 99.5%). This means that an acceptance limit is statistically significant to the specific method's performance. |
| Matrix spike recovery | Five individual samples, unspiked and spiked at high QC and low QC levels, using disease-state individuals if available. | Ten individual samples, unspiked and spiked at high QC and low QC levels, using disease-state individuals if available. | For both low and high spike: 3/5 during prevalidation and 7/10 during validation yield 70–130% of expected concentrations (sum of endogenous level plus spike level); % Recovery = Measured Concentration / Expected Concentration × 100%. |
| Dilutional linearity of high-spiked recombinant protein | Five individual disease-state samples (if available), spiking analyte at high QC levels, and diluted through assay range. | Five individual disease-state samples (if possible), spiking analyte at high QC levels, and diluted through assay range. | Results for each dilution when recalculated for the dilution factor should be within ±3SD of the original (neat) result. |
| Stability | N/A | If possible, three levels of endogenous samples. | ≥24 h at room temperature, ≥5 cycles of freeze/thaw. The mean values for stressed matrix should be within ±3SD of the mean values for unstressed matrix at each level. |

Quality controls are used during sample testing. It is recommended to test three levels chosen based on prevalidation experiments spanning the standard curve. It is expected that results for each analyte should be accurate (%RE ≤30%) and precise (%CV ≤30%) following the general 4-6-30 rule; however, criteria may be flexible for certain analytes depending on the intended use, biological variability, etc.

N/A not applicable, %CV percent coefficient variation, %RE percent relative error, LLOQ lower limit of quantitation, ULOQ upper limit of quantitation
Limits of Quantitation
Determining the limits of quantitation for multiplex assays is essentially the same process as used in single-plexed assays. Each analyte should be prepared in target matrix at a range of concentrations. The accuracy and precision of each concentration should be calculated from the analysis of validation controls from multiple assay runs, and the lowest concentration that retains accuracy within ±30% and precision within an arbitrary 30% CV is referred to as the lower limit of quantitation (LLOQ). However, the main issue with some analytes is the presence of endogenous analyte in the sample matrix. In these cases, either the endogenous level can be determined or a surrogate matrix may be used to determine the LLOQ. The highest concentration of the analyte that retains accuracy within ±30% and precision within an arbitrary 30% CV is the ULOQ. Assessment of the ULOQ is typically easier to achieve given that sample matrix may be spiked with purified, recombinant analyte; thus, surrogate matrix is often not required, although it can be used with the QCs. It is important to understand that the LLOQ and ULOQ are measured in the presence of all the other assay reagents and analytes in the multiplex assay. The estimated LLOQ and ULOQ define the range of quantitation for each analyte in the multiplex assay.
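The selection of LLOQ and ULOQ described above can be illustrated with a short calculation. This is a minimal sketch, assuming one analyte, validation controls keyed by nominal concentration, and accuracy judged on the mean across runs; the function names and the example numbers are hypothetical. It simply reports the lowest and highest passing levels and does not check that every level in between also passes.

```python
import statistics


def level_stats(nominal, measured):
    """Return (%RE of the mean, inter-assay %CV) for one concentration level."""
    mean = statistics.mean(measured)
    pct_re = (mean - nominal) / nominal * 100.0
    pct_cv = statistics.stdev(measured) / mean * 100.0
    return pct_re, pct_cv


def quantitation_limits(levels, max_re=30.0, max_cv=30.0):
    """Return (LLOQ, ULOQ) as the lowest and highest nominal levels that
    meet both the accuracy and the precision target, or None if none do."""
    passing = []
    for nominal in sorted(levels):
        pct_re, pct_cv = level_stats(nominal, levels[nominal])
        if abs(pct_re) <= max_re and pct_cv <= max_cv:
            passing.append(nominal)
    if not passing:
        return None
    return passing[0], passing[-1]


# Hypothetical validation controls: nominal concentration -> results from four runs.
controls = {
    1.0: [1.6, 0.5, 1.9, 0.8],        # imprecise at the low end
    5.0: [4.6, 5.3, 5.1, 4.8],
    50.0: [48.0, 53.0, 51.0, 47.0],
    500.0: [510.0, 480.0, 495.0, 520.0],
    1000.0: [1500.0, 1400.0, 1450.0, 1380.0],  # biased high at the top end
}
limits = quantitation_limits(controls)
print(limits)
```

In a multiplex, the same function would be run once per analyte on data generated with all analytes and reagents present, since the limits can differ from those of the single-plexed assay.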
Dilutional Linearity and Parallelism
Dilutional linearity should be assessed with at least five to ten samples that contain high levels of recombinant analyte. In the case of multiplex assays, this may require multiple sets of samples to ensure that all the analytes are evaluated. It may be possible to create a single set of dilution test samples. This could be done by spiking the samples with a purified, recombinant source of the analyte. Once a set(s) of dilutional linearity samples is obtained, the assessment of dilutional linearity is done in the same manner as for single-plexed assays except that information is generated on multiple analytes in lieu of one. Results for each dilution when recalculated for the dilution factor should be 100% of the original (neat) result ±3SD (based on the inter-assay precision of each assay). The range of sample dilutions that meets these criteria dictates the acceptable dilution range of the assay and may be limited by analyte concentration in the samples tested. Equally, if there are consistent dilution-adjusted results that differ from the neat result, it is probable that the lowest dilution of the consistent results defines the MRD required in the assay to overcome matrix interferences (Fig. 2).
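As a worked illustration of the acceptance criterion above, the sketch below recalculates each diluted result for its dilution factor, expresses it as a percentage of the neat result, and compares the deviation with three times the inter-assay CV%. The function name, the data layout, and the example values (including the 7.5% inter-assay CV) are assumptions for illustration only.

```python
def dilution_recoveries(neat_result, diluted_results, inter_assay_cv_pct):
    """Return {dilution factor: (% recovery vs neat, within limit?)}.

    diluted_results maps dilution factor (e.g. 4 for 1/4) to the measured
    concentration before correcting for dilution.
    The acceptance limit is 100% +/- 3 x inter-assay CV%.
    """
    limit = 3.0 * inter_assay_cv_pct
    report = {}
    for factor, measured in sorted(diluted_results.items()):
        adjusted = measured * factor                 # dilution-adjusted result
        recovery = adjusted / neat_result * 100.0    # % of neat result
        report[factor] = (recovery, abs(recovery - 100.0) <= limit)
    return report


# Hypothetical sample: neat result 800 units, inter-assay CV of 7.5%.
results = dilution_recoveries(
    neat_result=800.0,
    diluted_results={2: 390.0, 4: 205.0, 8: 110.0, 16: 65.0},
    inter_assay_cv_pct=7.5,
)
acceptable_factors = [factor for factor, (_, ok) in results.items() if ok]
print(acceptable_factors)
```

The list of passing factors corresponds to the acceptable dilution range for that analyte; in a multiplex the calculation is repeated for each analyte with its own inter-assay CV.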
The parallelism test determines whether the recombinant protein is appropriate for the measurement of the endogenous analyte. It is recommended that the link between reproducibility and parallelism is evaluated prevalidation as described earlier for each analyte of the multiplex panel. This experiment cannot be done until the assay has been shown to have reliable performance with the kit standard. Parallelism is also assessed for as many analytes as possible using incurred individual subject samples that have high levels of the endogenous analytes or, if not available, commercial individual samples. The reproducibility experiments that are conducted during prevalidation indicate which analytes are likely to meet validation parallelism acceptance criteria and, therefore, which analytes will need to be taken out of the panel. Establishing parallelism for a multiplex assay follows the same guidelines as a single analyte assay. Assessing parallelism also assists in determining the minimum, and possibly maximum, sample dilution factor that can be used, as shown in Fig. 3 and as described above and in Fig. 2. If parallelism experiments can be completed with biological matrix containing sufficient endogenous concentrations of the analytes of interest, then these experiments prove the recovery of the true endogenous molecule. In circumstances where parallelism experiments are successful, spiked recovery and dilutional linearity experiments using exogenous molecules would add no value to the method performance details in relation to its ability to reproducibly quantify the endogenous biomarker.
Selectivity (Matrix Spike and Recovery)
Selectivity is the ability of a reagent or antibody to unequivocally recognize only the analyte of interest, typically recombinant, even in the presence of other components present in the matrix. For LBA multiplexing, it is important
[Figure 3 plot. Panel title: Parallelism - 6 matrix samples; dilution-adjusted analyte % recovery vs MRD (1/8). Y-axis: dilution-adjusted % recovery vs MRD result (30 to 130). X-axis: dilution factor (0 to 64). Legend: mean - 6 matrix samples; method 2 - MRD = 1/16.]
Fig. 3. Determining parallelism when sufficient endogenous marker is present in matrix. Red dashed line indicates assay acceptance limits (≤23%, equivalent to ±3× inter-assay CV% for these particular methods). This is a parallelism experiment covering two different analytes. The lines represent mean % recovery of six different matrix samples. It demonstrates that there is a matrix interference effect when using a dilution of less than 1/8 (analyte #1, blue) or 1/16 (analyte #2, brown). Consensus is achieved over the dilution range of 1/8 to 1/32 and 1/16 to 1/64, respectively. Hence, different analytes may demonstrate different MRDs, and so for multiplexed assays, the analytical ranges and sensitivities of all the analytes combined may be compromised due to one or more analytes that require different MRDs. In this example, a dilution of 1/16 minimum must be used to capture valid results from both analytes. Larger multiplexes may give more differences, but overall, the largest MRD required will need to be used unless certain analytes are withdrawn from the panel. This could be because the resulting analytical range or sensitivity limits of the analytes are unsatisfactory due to the particular requirements of the project.
to note that the selected matrix or surrogate matrix may not be optimal for good recovery for all analytes, and therefore, the accurate measurement of each analyte spike may be compromised. For these experiments, ten individual samples, either normal or (preferably) disease-state samples, are spiked with recombinant protein at high and low levels as predetermined for each analyte during prevalidation and are tested at the minimum assay dilution. It is recommended that 7/10 samples yield 70–130% of expected concentrations (sum of endogenous level plus spike level) calculated based on the formula: % Recovery = Measured Concentration / Expected Concentration × 100%. An alternative approach is to use a subtraction method using the formula: % Recovery = (Measured Concentration − Endogenous Level) / Spiked Level × 100%. This subtraction method has been supported by several publications (21–23). If an analyte fails to meet acceptance criteria, it should be removed from the multiplex and tested as a single assay. It should be noted here that if precision, parallelism, and dilutional linearity results prove to be acceptable, recovery is a less important component for relative quantitative assays; hence, the scientist is encouraged to make the decision regarding testing this parameter based on the risk level and considering the intended use of the multiplex assay. The spike recovery of a recombinant protein in different individual lots tests the effectiveness of the MRD to address non-specific interference as much as the potential for lack of reagent specificity due to cross-reactivity. In cases where literature or historical data indicate that the assay will not be measuring very high level samples, or if there are sufficient endogenous levels available for each analyte, additional parallelism assessments would be more important to demonstrate the ability to quantify endogenous analyte; this is the intended purpose of the assay and would provide the needed confidence in the assay. While not a multiplex-specific issue, this is important to address in any biomarker relative quantitative assay.
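The two recovery formulas and the 7/10 criterion above can be compared side by side. The following is a minimal sketch for a single analyte at a single spike level; the function names, the tuple layout (endogenous level, spiked level, measured concentration), and the ten example samples are hypothetical.

```python
def recovery_total(endogenous, spike, measured):
    """% recovery against the expected total (endogenous + spike)."""
    return measured / (endogenous + spike) * 100.0


def recovery_subtraction(endogenous, spike, measured):
    """% recovery of the spike after subtracting the endogenous level."""
    return (measured - endogenous) / spike * 100.0


def count_within(samples, recovery_fn, low=70.0, high=130.0):
    """Number of samples whose recovery falls within [low, high] percent."""
    return sum(1 for s in samples if low <= recovery_fn(*s) <= high)


# Hypothetical data: (endogenous level, spiked level, measured concentration).
samples = [
    (20, 100, 118), (5, 100, 100), (50, 100, 160), (10, 100, 70), (30, 100, 135),
    (15, 100, 120), (40, 100, 190), (25, 100, 115), (0, 100, 88), (60, 100, 150),
]
n_total = count_within(samples, recovery_total)
n_subtraction = count_within(samples, recovery_subtraction)
# Criterion from the text: at least 7 of 10 samples within 70-130%.
acceptable = n_total >= 7 and n_subtraction >= 7
print(n_total, n_subtraction, acceptable)
```

The two formulas diverge most when the endogenous level is large relative to the spike, which is one reason the choice of formula should be fixed in the validation plan before the data are generated.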
Stability
The assessment of stability is the same for multiplex assays as it is for single-plexed assays. However, the challenge with multiplex samples is determining the conditions that are amenable for all the analytes. Suggested experiments are included in Table III. It is recommended and critical that freeze-thaw, bench-top, and long-term stability studies are initiated with endogenous analytes in target matrix, since endogenous analyte may behave differently from spiked recombinant analyte. Stability will be limited by the least stable analyte. In cases where one analyte is much less stable than the others in the multiplex, thus causing sample-handling issues for all the other analytes, the analyte may need to be removed from the multiplex panel. Stability assessments with recombinant material would be the same for multiplex as for single-plexed assays.
Solving Challenges with Lot-to-Lot Differences
A key consideration early in the selection of a multiplex kit is the availability of a large lot that is sufficient to avoid lot changes during the course of the study. This includes sufficient kits for prevalidation experiments, in-study validation, and sample analysis. Kit manufacturers usually have their own processes to monitor and control the kit manufacturing processes to limit variation between lots. However, when kits are used in longitudinal studies, variations in the performance of kits across multiple lots remain a genuine concern. This is especially true for multiplex assays given the complexity involved (analytes, capture and detection reagents, standards, etc.). The types of issues causing lot-to-lot variations from a single vendor are often related to critical reagents for one or more analytes, e.g., the quality of recombinant proteins in the kits, the coupling procedures for capture or detection reagents, and a general lack of information on critical reagent identity and characterization. Further concerns are unannounced changes in critical kit components, overall poor kit quality, and expiration dates on kits or individual component stability associated with kit lots during the study. Multiplex assays are sometimes manufactured by mixing some of the reagents from multiple single-plexed assays together, which makes lot-to-lot consistency difficult to control. Thus, for multiplex assays, an early assessment of the number of kits required to cover validation and sample analysis is highly recommended. If the intended use of the assay is exploratory, it may be sufficient to justify extending the expiration date based on an assessment of assay performance using precision and accuracy. For more definitive assays, evaluation of expiration dates should be more rigorous, and the methods discussed for assessing lot-to-lot variability may be applicable. In unforeseen circumstances when lots are changed, the bridging can be fulfilled using mainly three statistical approaches:
1. Show equivalence using ratios and limit of agreement
Differences in sample results from an old to a new lot can be assessed using the approach recommended for incurred sample reanalysis (24) and PK method cross-validation (25). We refer the readers to these papers for details and only briefly state the approach here. We recommend testing 20–50 commercially purchased individual subject samples, with results that span the target range of study samples, with both lots side by side on the same day by the same scientist. Incurred clinical samples may be used with appropriate informed consent. For each sample, the ratio of new lot concentration to old lot concentration is calculated; from these, the mean ratio (MR), the 90% confidence interval of the mean ratio (MRL), and 67% limits of agreement (LA) are determined. The new lot is considered as similar to the old lot if MRL is within 0.8 to 1.25 and if LA is within 0.7 to 1.43. Even if the MRL is within 0.8 to 1.25 but does not contain the value 1, the bias between the old and new lots is considered as statistically significant. In this case, a correction factor based on the value of MR can be applied to the results from the new lot.
The importance of the use of a correction factor when lot-dependent performance changes are observed has also been discussed elsewhere, and the reader is referred to these papers (26,27). The use of a correction factor may be considered to compensate for content and proportional errors if experimental data have been generated and are available to support it. This exercise will require careful and proper
documentation during clinical sample testing. It is recommended that standard curves and controls be evaluated when results from multiple studies need to be linked, when the assay is transferred from one CRO to another, and when reagents from one kit to the next need to be bridged. As the PD data may be used for decision making, it is highly recommended that users of the biomarker data in the clinical team be kept informed of any performance variability. Due to the complexity of the multiplex assays, basing similarity on accuracy and precision performance may not be adequate. Application of a correction factor should be a risk-based approach: for example, the critical biomarker would require sufficiently accurate and precise data to determine a dose-effect relationship, while others may be additional confirmation of the biological effect. The scientist should remain aware that the application of a correction factor could compromise the data for critical biomarkers for decision making, while making corrections for non-critical exploratory markers could provide the additional information needed for the path forward.
2. Correction factor using slope coefficient
An alternate statistical approach for determining a correction factor is to fit linear regressions for the QC responses versus nominal concentrations for both the old and new lots and determine if the two lines are parallel and super-imposable (28). If these conditions hold, the results for the two lots are considered similar and no correction factor is needed. If the slopes are similar, but there is a significant difference in intercepts, the ratio of responses at each QC concentration can be calculated and averaged across QC levels to provide a correction factor.
Using incurred or purchased samples that cover the target range (n=25–50), predicted concentrations for samples assayed with the new lot (y-axis) regressed on the concentrations from the old lot (x-axis) should show strong agreement with the "identity line" (old lot = new lot; slope=1 and intercept=0) if there are no lot differences. If the 95% confidence interval for the slope includes "1" and the 95% confidence interval for the intercept includes "0," then there is strong agreement between lot results (results fall close to the identity line) and no correction is needed. If the intercept is not significantly different from 0 but the slope is significantly different from 1, the slope coefficient is the correction factor. The slope multiplied by the new-lot-predicted concentration provides the appropriate adjustment.
3. Correction factor using regression equation based on predicted values
Another approach is to prepare two aliquots of each sample to be assayed in two different experiments. In the first experiment, the samples are assayed with both lots, and the concentrations for the samples assayed with the new lot are regressed against those from the old lot. Using the regression equation, new predicted values for these samples are generated. The second set of samples is then assayed with the new lot, and the actual concentrations obtained for these samples are compared to the values predicted from the regression calculations. The acceptance of the two assay lots would be based on achieving results for the second set of data that were within the expected analytical performance of the original method. Interested scientists are encouraged to read Clinical Laboratory Improvement Amendments (CLIA) guidelines (29) to understand the basics of using a correction factor.

In conclusion, various statistical tools and experimental approaches are available in the field to compare two lots of kits. Based on the limited experience available to date with multiplex assays, we recommend a two-tiered approach to allow for a full assessment. Lots can be compared using option 1 based on the ratio calculation since this can be achieved using an Excel spreadsheet. If it is determined that the lots are equivalent using option 1, no correction factor is needed. If, however, the two lots are determined to be different, option 2 provides a better description of lot differences. Parallel slopes with a shift of intercepts suggest that a correction factor can be applied as described for option 2. Nonparallel slopes indicate that a correction factor cannot be applied since lot differences change with changes in concentration. The approaches taken must be applied to each of the analytes in the multiplex assay. It should be noted that in multiplex assays, scientists often face the situation where correction is required for some analytes and not for the others. A scientific decision should govern the path forward, e.g., either excluding analytes completely or continuing to use analytes without a correction factor. Correction factors can be applied to individual analytes as needed.
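The ratio calculation of option 1 can be sketched in a few lines. The text does not spell out the arithmetic, so this example makes several labeled assumptions: ratios are analyzed on the log scale (consistent with the multiplicatively symmetric limits 0.8 to 1.25 and 0.7 to 1.43), the mean ratio is reported as a geometric mean, and a normal approximation stands in for the t-distribution quantile, which slightly narrows the interval for 20 to 50 samples. The function name, dictionary keys, and data are hypothetical; the cited papers (24,25) should be consulted for the actual procedure.

```python
import math
import statistics
from statistics import NormalDist


def bridge_lots(old, new, ci_level=0.90, la_level=0.67):
    """Compare paired results from an old and a new kit lot for one analyte."""
    # Assumption: per-sample ratios (new/old) are analyzed on the log scale.
    logs = [math.log(n / o) for o, n in zip(old, new)]
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs)
    # Assumption: normal quantiles approximate the t-distribution quantiles.
    z_ci = NormalDist().inv_cdf(0.5 + ci_level / 2)
    z_la = NormalDist().inv_cdf(0.5 + la_level / 2)
    half_width = z_ci * sd_log / math.sqrt(len(logs))
    mrl = (math.exp(mean_log - half_width), math.exp(mean_log + half_width))
    la = (math.exp(mean_log - z_la * sd_log), math.exp(mean_log + z_la * sd_log))
    return {
        "mean_ratio": math.exp(mean_log),
        "mrl": mrl,
        "la": la,
        # Similarity limits quoted in the text.
        "similar": 0.8 <= mrl[0] and mrl[1] <= 1.25 and 0.7 <= la[0] and la[1] <= 1.43,
        # Bias is flagged when the confidence interval excludes 1.
        "significant_bias": not (mrl[0] <= 1.0 <= mrl[1]),
    }


# Hypothetical paired results for 20 samples spanning the target range.
old_lot = [10.0 * (i + 1) for i in range(20)]
unbiased_new = [o * (1.05 if i % 2 == 0 else 0.97) for i, o in enumerate(old_lot)]
shifted_new = [o * (1.10 if i % 2 == 0 else 1.20) for i, o in enumerate(old_lot)]

same_lot = bridge_lots(old_lot, unbiased_new)
shifted_lot = bridge_lots(old_lot, shifted_new)
print(same_lot["similar"], same_lot["significant_bias"])
print(shifted_lot["similar"], shifted_lot["significant_bias"], round(shifted_lot["mean_ratio"], 3))
```

The second data set illustrates the case described in option 1 where the interval lies within 0.8 to 1.25 but excludes 1, so a correction factor based on MR could be considered. In a multiplex, the function would be run once per analyte, and the decision to correct remains a risk-based scientific judgment.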
Considerations for Transitioning Multiplex Assays into Diagnostic Assays
Diagnostic assays are performed in CLIA-certified laboratories to assist health care providers in identifying disease and health risk factors and in guiding medical treatment. Homebrew or research-use-only bioanalytical methods may evolve to laboratory developed tests (performed as diagnostic assays in a single laboratory under the authority of the Centers for Medicare & Medicaid Services) (30,31) and eventually to FDA-approved in vitro diagnostic assays (IVDs). Companion diagnostics are IVDs that provide information that is essential for the safe and effective use of a corresponding therapeutic. There is an increased use of companion diagnostic assays to improve the design of clinical trials and the efficiency of patient selection for specific therapeutic interventions. The advantages of incorporating a multiplexed companion diagnostic assay into the development of a pharmaceutical program are similar to those previously discussed, including the ability to simultaneously screen multiple analytes from a small sample size. Commercially available diagnostic kits are available for indications such as autoimmune, oncology, and infectious disease.
Companion diagnostics must formally be approved by the FDA (CDRH) as in vitro diagnostics as category 3 high complexity devices (42 CFR 493.17). As such, companion diagnostics must undergo both an analytical and clinical validation (21 CFR 809.3). In August 2014, FDA released the final Guidance on In Vitro Companion Diagnostic Devices (32). Transforming an LBA into a companion diagnostic assay requires a tremendous amount of work and knowledge of diagnostic assays. It is out of the scope of this paper to describe the detailed parameters for LDTs and other diagnostic assays. Some key areas that need to be addressed for an assay to be converted and marketed as a companion diagnostic include ensuring that the proper diagnostic rights on all key reagents have been obtained, approval of the instrumentation and software, thorough review of the relevant patents, and an evaluation of already approved assays. Multiplex assays must meet all the requirements just as single-plexed assays do. However, one of the biggest challenges would be the GMP manufacturing of the multiplex assay, which would have to meet rigorous testing and ruggedness criteria. The manufacturing of the reagents, calibration standards, and reference standards would also have to be defined and characterized. An additional challenge with multiplex assays would be to define the contribution of each aspect of the multiplex panel, either individually or in combination. All these factors taken together will also impact "intended use", which is the basis of the approval of the multiplex assay. Thus, the transition of multiplex assays to a diagnostic assay would require a significant amount of investment in both time and money. The recommendation would be to form a close relationship with a partner that has the needed expertise to transition the assay into an approved diagnostic assay.
CONCLUSIONS
Multiplex assays are very powerful analytical tools that are becoming more widely used in the drug development process. There are many unique challenges associated with multiplex biomarker assays that are different from those encountered with single-plexed assays. We have identified these unique challenges and provided recommendations to overcome them. The challenges include the selection, characterization, and validation of multiplex assays within the context of fit-for-purpose biomarker assay development. Specifically unique are the challenges with different detection ranges, more complex specificity issues and matrix interference, cross-talk, and cross-reactivity between reagents. The guidelines that we have provided mainly apply to assays that are for research use only (RUO). It is up to the discretion of the scientist to choose the appropriate level of characterization depending on the intended use of the multiplex. In these recommendations, we have specified acceptance criteria for assessing assay parameters during validation and solutions to challenges associated with lot-to-lot variability with multiplex kits, in particular a statistical approach to applying a correction factor between lots if necessary. We have also highlighted how to handle data when one or more of the analytes fail during validation or during in-study sample analysis.
ACKNOWLEDGMENTS
Authors thank FDA members for their critical review of the manuscript. We also thank the members of The Ligand Binding Assay Bioanalytical Focus Group (LBAFG) of the BIOTEC Section, American Association of Pharmaceutical Scientists (AAPS) organization, for critical review of the manuscript.
Disclaimer The contents of this article reflect the personal opinions of the authors and may not represent the official perspectives of their affiliated organizations.
Conflict of Interest Andreas Jeromin is an advisor to Quanterix Corporation and owns stock options.
REFERENCES
1. Lee JW, Weiner RS, Sailstad JM, Bowsher RR, Knuth DW, O'Brien PJ, et al. Method validation and measurement of biomarkers in nonclinical and clinical samples in drug development: a conference report. Pharm Res. 2005;22(4):499–511.
2. Lee JW, Devanarayan V, Barrett YC, Weiner R, Allinson J, Fountain S, et al. Fit-for-purpose method development and validation for successful biomarker measurement. Pharm Res. 2006;23:312–28.
3. Chau C, Rixie O, McLeod H, Figg W. Validation of analytic method for biomarkers used in drug development. Clin Cancer Res. 2008;14(19).
4. Draft guidance for industry: bioanalytical method validation. U.S. Department of Health and Human Services, Food and Drug Administration, Center for Drug Evaluation and Research (CDER), Center for Veterinary Medicine (CVM), September 2013.
5. Tighe P, Negm O, Todd I, Fairclough L. Utility, reliability and reproducibility of immunoassay multiplex kits. Methods. 2013;61:23–9.
6. Ling MM, Ricks C, Lea P. Multiplexing molecular diagnostics and immunoassays using emerging microarray technologies. Expert Rev Mol Diagn. 2007;7(1):87–98.
7. Hsu HY, Joos To, Koga H. Multiplex microsphere-based flow cytometric platforms for protein analysis and their application in clinical proteomics: from assays to results. Electrophoresis. 30(23):4008–19.
8. Marquette CA, Corgier BP, Blum LJ. Recent advances in multiplex immunoassays. Bioanalysis. 2012;4(8):927–36.
9. Khan M, Bowsher RR, Cameron M, Devanarayan V, Keller S, King L, et al. Application of commercial kits for quantitative biomarker determinations in drug development: recommendations for method adaptation and validation. Bioanalysis. 2014;7(2):229–42.
10. Salvante KG, Brindle E, McConnell D, O'Connor K, Nepomnaschy PA. Validation of a new multiplex assay against individual immunoassays for the quantification of reproductive, stress, and energetic metabolism biomarkers in urine specimens. Am J Hum Biol. 2012;24(1):81–6.
11. Bastarache JA et al. Validation of a multiplex electrochemiluminescent immunoassay platform in human and mouse samples. J Immunol Methods. 2014;408:13–23.
12. Backen. The combination of circulating Ang1 and Tie2 levels predicts progression-free survival advantage in bevacizumab-treated patients with ovarian cancer. Clin Cancer Res. 2014;20(17):4549–58.
13. Marchese RD, Puchalski D, Miller P, Antonello J, Hammond O, Green T, et al. Optimization and validation of a multiplex, electrochemiluminescence-based detection assay for the quantitation of immunoglobulin G serotype-specific antipneumococcal antibodies in human serum. Clin Vaccine Immunol. 2009;16(3):387–96.
14. Todd DJ, Knowlton N, Amato M, Frank MB, Schur PH,Izmailova ES, et al. Erroneous augmentation of multiplex assaymeasurements in patients with rheumatoid arthritis due toheterophilic binding by serum rheumatoid factor. ArthritisRheum. 2011;63(4):894–903.
15. Fraser S, Soderstrom C. Due diligence in the characterization ofmatrix effects in a total IL-13 Singulex™ method. Bioanalysis.2014;6(8):1123–9.
16. Bowsher RR, Sailstad JM. Insights in the application of research-grade diagnostic kits for biomarker assessments in support ofclinical drug development: bioanalysis of circulating concentra-tions of soluble receptor activator of nuclear factor ligand. JPharm Biomed Anal. 2008;48:1282–9.
17. Belabani et al. A condensed performance-validation strategy formultiplex detection kits used in studies of human clinicalsamples. J Immunol Methods. 2013;387:1–10.
18. Backen AC et al. ‘Fit-for-purpose’ validation of SearchLightmultiplex ELISAs of angiogenesis for clinical trial use. JImmunol Methods. 2009;342:106–14.
19. Chowdhury F et al. Validation and comparison of twomultiplex technologies, Luminex® and Mesoscale Discovery,for human cytokine profiling. J Immunol Methods.2009;340:55–64.
20. Li et al. Validation of a multiplex immunoassay for serum angiogenic factors as biomarkers for aggressive prostate cancer. Clin Chim Acta. 2012;413:1506–11.
21. Stevenson L, Kelley M, Gorovits B, et al. Large molecule specific assay operations: recommendations for best practices and harmonization from the Global Bioanalysis Consortium Harmonization Team. AAPS J. 2014;16(1):83–8. See endogenous proteins section.
22. Bastarache JA, Koyama T, Wickersham NE, Mitchell DB, Mernaugh RL, Ware LB. Accuracy and reproducibility of a multiplex immunoassay platform: a validation study. J Immunol Methods. 2011;367:33–9.
23. Marcelletti JF, Evans CL, Saxena M, Lopez AE. Calculations for adjusting endogenous biomarker levels during analytical recovery assessments for ligand-binding assay bioanalytical method validation. AAPS J. 2015. doi:10.1208/s12248-015-9756-2.
24. Rocci ML, Devanarayan V, Haughey DB, Jardieu P. Confirmatory reanalysis of incurred bioanalytical samples. AAPS J. 2007;9(3):336–43 (Article 40).
25. Thway TM, Ma M, Lee J, Sloey B, Yu S, Wang YM, et al. Experimental and statistical approaches in method cross-validation to support pharmacokinetic decisions. J Pharm Biomed Anal. 2009;49(3):613–8.
26. Ezzelle J, Rodriguez-Chavez IR, et al. Guidelines on good clinical laboratory practice: bridging operations between research and clinical research laboratories. J Pharm Biomed Anal. 2008;46(1):18–29.
27. King LE, Farley E, Imazato M, et al. Ligand binding assay critical reagents and their stability: recommendations and best practices from the Global Bioanalysis Consortium Harmonization team. AAPS J. 2014;16(3):504–15.
28. Weisberg S. Applied linear regression. 3rd ed. Wiley/Interscience; 2005.
29. CLSI Guidelines: EP9A2IR: method comparison and bias estimation using patient samples: approved guideline 2nd edition: Interim Revision 2011.
30. FDA Draft Guidance. In vitro companion diagnostic devices draft guidance for industry, Food and Drug Administration staff, and clinical laboratories: framework for regulatory oversight of laboratory developed tests (LDTs) (October 2014).
31. FDA Final Guidance. Draft guidance for industry, Food and Drug Administration staff, and clinical laboratories: FDA notification and medical device reporting for laboratory developed tests (LDTs) (October 2014).
32. FDA Final Guidance. Guidance for industry and Food and Drug Administration staff: in vitro companion diagnostic devices (August 2014).