xpose 2.0 user's manual
TRANSCRIPT
Xpose2.0User’sManual
NiclasJonsson MatsKarlsson
Departmentof Pharmacy,UppsalaUniversity, Sweden
Draft 4Mar 12,1998
Contents
1 Intr oduction 4
1.1 Termsof usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 ChangesbetweenXpose1.1andXpose2.0 . . . . . . . . . . . . . . 5
1.3 About thismanual. . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 About thedistribution . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.5 Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Producinginput data for Xpose 6
2.1 Creatingthetablefile(s) for Xposeto read . . . . . . . . . . . . . . . 7
2.1.1 Thesdtab . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.2 Thepatab . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.3 Thecotab andcatab . . . . . . . . . . . . . . . . . . . . . 8
2.1.4 Themutab . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.5 Themytab . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Theextrafile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 TheXposedatabases. . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.4 How Xposereadsfiles . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.5 Xposedatavariables . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.5.1 Xposedatavariableswithoutdefaultsettings . . . . . . . . . 10
2.6 Usingonly onetablefile for all dataitems . . . . . . . . . . . . . . . 10
3 Starting and quitting Xpose 10
4 Moving around in Xpose 12
5 Redefiningand creatingvariables 13
5.1 Creatingand/ormodifyingdataitems . . . . . . . . . . . . . . . . . 15
6 Plotting conventions 16
6.1 Plottingsymbolsandindividualcurves . . . . . . . . . . . . . . . . . 16
6.2 Smoothes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
6.3 Visualreferencegrids . . . . . . . . . . . . . . . . . . . . . . . . . . 16
6.4 Trellis Graphics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
6.4.1 Multi-panelconditioning . . . . . . . . . . . . . . . . . . . . 17
1
6.4.2 Box andwhiskersplots . . . . . . . . . . . . . . . . . . . . . 18
6.5 Whatdatais plotted? . . . . . . . . . . . . . . . . . . . . . . . . . . 20
6.5.1 Handlingof missingdata . . . . . . . . . . . . . . . . . . . . 21
7 Documentingruns 21
7.1 Therunsummary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
7.2 Therun record. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
8 Checkinga data set 25
9 Comparing models 26
10 The GAM 27
10.1 Overview of theGAM . . . . . . . . . . . . . . . . . . . . . . . . . 28
10.2 RunningtheGAM . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
10.3 Displayingtheresultsof a GAM run . . . . . . . . . . . . . . . . . . 28
10.3.1 Numericalsummaryof a GAM fit . . . . . . . . . . . . . . . 29
10.3.2 DisplayingtheGAM resultsgraphically . . . . . . . . . . . . 29
10.4 Obtainingdiagnosticsfor a GAM run . . . . . . . . . . . . . . . . . 29
10.4.1 TheAkaikeplot . . . . . . . . . . . . . . . . . . . . . . . . . 30
10.4.2 Studentizedresidualsfor a GAM fit . . . . . . . . . . . . . . 31
10.4.3 Individual influenceon theGAM fit . . . . . . . . . . . . . . 31
10.4.4 Individual influenceon theGAM terms . . . . . . . . . . . . 32
10.5 Changingthedefaultbehavior of theGAM . . . . . . . . . . . . . . 32
10.5.1 List currentGAM settings . . . . . . . . . . . . . . . . . . . 33
10.5.2 Estimationof thedispersionfactor . . . . . . . . . . . . . . . 33
10.5.3 Stepwisesearchfor covariates . . . . . . . . . . . . . . . . . 33
10.5.4 Useall lines . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
10.5.5 Useweights. . . . . . . . . . . . . . . . . . . . . . . . . . . 34
10.5.6 Excludeindividualsfrom theGAM analysis. . . . . . . . . . 34
10.5.7 Specifyingastartingmodelfor theGAM . . . . . . . . . . . 34
10.5.8 Normalizethecovariatesto themedian . . . . . . . . . . . . 34
10.6 Settingthemodelscopefor theGAM . . . . . . . . . . . . . . . . . 34
10.6.1 TheGAM scopeandits default settingsin Xpose . . . . . . . 35
10.6.2 Smoothersin theGAM scope . . . . . . . . . . . . . . . . . 36
2
10.6.3 Settingthemaximumnumberof modelsin thescope . . . . . 36
10.6.4 Changingthedefaultsmoothersandtheirarguments . . . . . 36
10.6.5 Limit thescopefor specificcovariates . . . . . . . . . . . . . 37
10.6.6 Changemodelsfor specificcovariates . . . . . . . . . . . . . 38
10.6.7 Viewing thecurrentGAM scope. . . . . . . . . . . . . . . . 39
10.6.8 Re-settingthescope . . . . . . . . . . . . . . . . . . . . . . 39
11 The bootstrap of the GAM 39
11.1 Overview theBootstrapof theGAM . . . . . . . . . . . . . . . . . . 39
11.2 Runningthebootstrapof theGAM . . . . . . . . . . . . . . . . . . . 40
11.3 Summarizingabootstrapof theGAM run . . . . . . . . . . . . . . . 40
11.4 Plottingtheresults . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
11.4.1 Plottingtheinclusionfrequencies . . . . . . . . . . . . . . . 41
11.4.2 Plottingthemostcommoncovariatecombinations . . . . . . 42
11.4.3 Thedistributionof modelsizes. . . . . . . . . . . . . . . . . 42
11.4.4 Thestabilityof theinclusionfrequencies . . . . . . . . . . . 43
11.4.5 Inclusionindex of covariates. . . . . . . . . . . . . . . . . . 43
11.4.6 Inclusionindex of covariatesandindividuals . . . . . . . . . 44
11.5 Settingsfor thebootstrapof theGAM . . . . . . . . . . . . . . . . . 45
11.5.1 Specifyingthemaximumnumberof iterations. . . . . . . . . 46
11.5.2 Selectingtheconvergencealgorithm . . . . . . . . . . . . . . 46
11.5.3 Specifyinganiterationto startcheckingconvergence. . . . . 48
11.5.4 Specifyingtheinterval betweenconvergencechecks . . . . . 48
11.5.5 Specifyingaseednumber . . . . . . . . . . . . . . . . . . . 48
12 Treebasedmodeling 49
12.1 Fitting a treemodel . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
12.2 Plottinga tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
12.3 Pruningtrees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
13 Printing and exporting graphics 51
14 Altering the the way Xposedraws plots 53
15 Making plots of a subsetof the data 53
3
1 Intr oduction
Populationanalysisusingnon-linearmixed effectsmodelshasbecomean importanttool in theanalysisof pharmacokinetic/pharmacodynamic data[1] andthebenefitsofthistypeof analysisoverthetraditionalalternativeshavetosomelengthbeendiscussedin the literaturee.g.[2, 3]. Thepriceonehasto payfor thesebenefitsis an increasedcomplexity of themodelsthatarefitted to thedataandacorrespondingincreasein thenumberof assumptionsthat hasto be made[4]. Assumptionsof e.g. independenceof residualsandhomogeneityof residualvarianceis of coursealsoa part of modeldependentindividualanalysisbut thechallengeof non-linearmixedeffectsmodelsliesin thehierarchicalnatureof themodel,i.e. wedonotonly havethedifferencesbetweentheobservedandpredictedobservationsto take in to account,wealsohaveto take intoaccountthat individualsmaydiffer, which resultsin a secondtypeof residuals— thedifferencebetweenindividual parametervaluesandwhat the modelpredictsfor thecorrespondingtypical individual in thepopulation.This togetherwith thepossibilityto includethe relationshipsbetweendemographicdata,for exampleage,weight andclinical laboratorymeasurements,and parameters,which in a senseis a regressionwithin the regression,makesthe taskof assessingthegoodnessof fit of a non-linearmixedeffectsmodela lot harderthanfor anindividualspecificmodel.
Xposeis an S-PLUSbasedmodelbuilding aid for populationanalysisusingNON-MEM. It facilitatesdatasetcheckout,explorationandvisualization,modeldiagnostics,candidatecovariateidentificationandmodelcomparison.Datasetcheckout includesvisualizationof theobservedvariable(s),covariatesandplotsto revealerrorsin thedatafile. Modeldiagnosticplotsincludestheusualresidualplotsbut alsoplotsto checkthevalidity of assumptionsspecificto non-linearmixedeffectsmodels.Dataexplorationis alsodoneby variousplots but also includesauxiliary screeninganalysessuchasstepwisegeneralizedadditivemodeling(GAM) andtreebasedmodeling.Thestabilityof theGAM resultswith respectto covariatemodelselectionaswell asthe impactofinfluentialindividualsandcertaintypesof covariateinteractionscanbeexploredusinga bootstrapre-samplingprocedure.To facilitatedocumentation,Xposecanproducerunsummariesandrunrecords.Therunrecordsareonepagesummariesof a runcon-sistingof a combinationof goodnessof fit plots, parameterestimatesandthe modelfile. Runrecordsaretabulatedsummariesof many runsconsistingof usercomments,terminationmessagesandobjectivefunctionvalues.
The basicideais that the usershouldonly have to tell Xposewhat run that is to beprocessedto beableto produceanythingbetweensimplenumericalsummariesof, e.g.covariates,to sophisticatedbootstrapof theGAM analyses.Theonly otherthing theuserhasto do is to make NONMEM produceoneor more tablefiles that containstheitems,for exampleresidualsandpredictions,that is to bedisplayedor usedin theXposeanalyses.In otherwords,Xposetakescareof readingthe datainto S-PLUS,reformattingof thedatato suit eachplot andanalysisspecificrequirementsandto doanythingelsethatis necessaryto accomplishwhattheuserselectsfrom themenus.
1.1 Termsof usage
Any personwhowantsto useXposeshouldbeawareof thefollowing:
4
� Xposeis copyrighted by NiclasJonssonandMatsKarlsson
� Xpose must be obtained from the official distribution sitewww.biof.uu.se/Xpose andmaynot,without thepermissionfrom theau-thors,bedistributedin any otherway.
� NiclasJonssonandMatsKarlssoncannot beheld responsiblefor any harmfulconsequenceof bugsin theXposesourcecodeand/ormisuseof theprogram.
� In publications,presentations,reportsetc. which usestheoutputand/orresultsfrom Xposewe would be grateful if Xposeis referenced.At presentthe bestreferenceis: E.N.JonssonandM.O. Karlsson“Xpose- anS-PLUSbasedmodelbuilding aid for populationanalysiswith NONMEM”, in The populationap-proach:measuringandmanagingvariability in response,concentrationanddose,eds.L. Aarons,L.P. Balant,M. Danhofet al. EuropeanCommission,Brussels(1997),but this is likely to change.Checkwww.biof.uu.se/Xpose for anupdate.
The usersof Xposeare encouragedto report if they think that Xpose is behavingstrangelyor if they find a bug. Even if we can not take on any official supportre-sponsibilities,weareinterestedin having a usefulandbug freeprogram.If wearenotawareof thebugswecan’t doanythingaboutthem...We arealsointerestedin sugges-tionsandcommentsbut againwe cannotpromiseto implementthingsbecauseoneora few usersrequestsit.
ThereasonwewantpeopletoobtainXposefromhttp://www.biof.uu.se/Xposeis thatwe want to know, at leastroughly, how many usersof Xposethereis but it isalsofor your own sake. S-PLUSsourcecodeis very easyto alterandif your friendlycolleaguenext dooroffersyou a copy of Xposeyou cannot besurethat it is identicalto theoriginal code.Anyway, you mayuseXposefreely at your own risk andwe donot takeany responsibilityfor how Xposeis usedor misused.
1.2 ChangesbetweenXpose1.1and Xpose2.0
ThechangesbetweenXposeversion2.0andthepreviousversion(1.1)arenumerous.Themainuservisible differenceis thatXpose2.0 usestheTrellis GraphicsTM libraryinsteadof thecoregraphicsof S-PLUS.Anotherchangeis that it is no longerstrictlynecessarywith separatetablefiles for, e.g. covariatesandparameters.New is alsothepossibilityto saveplots,afteroptionalcustomizationof theplot title andaxislabels,toafile suitablefor inclusionin otherprogramssuchaswordprocessors.Therearealsoalargenumberof new plotsspecificallydesignedto checkcertainassumptionsmadeinnon-linearmixedeffectsmodelingaswell asdiagnosticsfor GAM models,includinga new Bootstrapof theGAM algorithm.Xpose2.0 is alsoa lot moreflexible thantheolder version. To make the transitioneasierfor usersof the previous version,majordifferences,thatmightbeconfusingto someoneexpectingXpose1.1behavior, will bepointedout in footnotes.
5
1.3 About this manual
Thismanualshouldbeviewedasabeginnersguideto Xposeandshould,togetherwiththeinstallationinstructionsprovidedwith thesourcecode,beenoughto getanew usergoing. It will beassumedthatXposeis installedon thesystemandthattheXpose2.0library is in theS-PLUSsearchpath(Seetheinstallationinstructionsfor details.)Wealsoassumethat the userknows how to interpretthe usualgoodnessof fit plots, i.e.wewill notexplainall theplotsavailablein Xpose.Insteadwewill concentrateon theareasthatmightnotbeintuitiveandwewill alsogivereferencesto theliteraturewhereappropriate.
1.4 About the distribution
The Xpose2.0 distribution containsthe sourcecode,the Xpose2.0 User’s Manual,installationinstructionsandexamplefiles. The installationof the sourcecodeis de-scribedin the installationinstructions. The User’s Manual is this documentand isdistributedin bothapostscriptandHTML version.Detailsaboutthedocumentationisgivenin theREADME file in theDoc sub-directory. Theexamplefiles containsthreeruns,completewith datafile, modelfiles,outputfilesandtablefiles. Thedetailsof theexamplesaregivenin theREADME file in theExamples sub-directory.
1.5 Acknowledgment
Wewouldliketo thankDr. Lewis Sheinerfor theimpetusto Xpose.Wehaveusedpartsof theS-PLUScodeprovidedin thecoursefolder for the ”IntermediateWorkshopinPopulationPharmacokineticDataAnalysis Using the NONMEM System”given byDr. Lewis SheinerandDr. StuartBeal. Dr. JanetWadewrotesomeof the in houseS-PLUSscriptsweusedbeforethedevelopmentXposeandonwhichsomeof theplotsin Xposearebased.We would alsolike to thankDr. StuartBealfor makingavailablea versionof NONMEM with someadvancedfeatures.(Pleasenotethatthis versionisnot yet generallyavailable,neitheris it requiredfor usingXpose.)We wouldalsoliketo thanktheparticipantsin theS-news mailing list, especiallyDr. Venables,for manyhelpful discussionsandcodeexamples.We arealsogratefulfor thehelpful commentsandsuggestionssentby Xpose1.1usersandXpose2.0betatesters.
2 Producing input data for Xpose
Xposemostoften usesthe contentsof oneor moreNONMEM tablefiles whenpro-ducingplots andanalyses.For the Run summaryandRun recordXposealsoneedstheNONMEM outputfiles andmodelfiles. To checka datasetthedatafile is needed.For moredetailsabouthow Xposereadsmodelfiles, output files anddatafiles seeSections7.1and8.
6
2.1 Creating the table file(s) for Xposeto read
Xposeneedsat leastoneNONMEM tablefile for input. If presentin thecurrentdirec-tory, Xposewill alsoreadothertablefiles. Thesetablefiles shouldbenamedin a spe-cific way. Thetablefile namesaredividedupin two parts.Thefirst partis thestem, forexamplethestandardtablefile hasthestemsdtab , andthesecondpartis therun num-ber. Thestandardtablefile for run numberoneshouldthereforebenamedsdtab1 .1
The other tablefiles that Xposereads,if presentin the currentdirectoryandif theymatchtherun number, arepatab , cotab , catab , mutab andmytab . Xposeex-pectsthesetablefiles to beproducedwith theNOPRINTandONEHEADERoptionsontheNONMEM $TABLErecordandwith thePOSTHOCoptionon the$ESTIMATIONrecord.For example:
$EST POSTHOC$TABLE ID TIME IPRED IWRES
NOPRINT ONEHEADERFILE=sdtab1
2.1.1 The sdtab
sdtab standsfor standardtable file. It shouldcontain items describingthe over-all goodnessof fit. The recommendedcolumn items and order for this table fileis ID , TIME2, IPRED and IWRES(in additionto theseNONMEM by default addsDV, PRED, RESandWRES). IPRED (individual predictions)and IWRES(individualweightedresiduals)arenotNONMEM defineditemsandhasto bedefinedby theuser.Thesecanbeobtainedby thefollowing codein the$ERRORblock3:
$ERRORIPRED = FW = ; Your choice: 1 = additive error model
; F = constant CV error model; F**THETA(.) = power error model; (F**2+THETA(.)**2)**0.5; = additive plus; proportional; error model
IRES = DV-IPREDIWRES = IRES/WY = IPRED + W*EPS(1)
NotethattheIWRES = IRES/W line will, in somecases,needreformulationto avoiddivisionby zero.
1It is possibleto alter the tablefile namestructureby the additionof a table file namesuffix, e.g. .txt,makingXposelook for tablefilesendingin .txt, for examplesdtab1.txt. Seetheinstallationinstructionsfordetails.
2Or whatever independentvariablethatis usedin theNONMEM analysis3The codeis taken from the notesof the NONMEM intermediateworkshopgiven by StuartBeal and
Lewis Sheiner. It is alsosomewhatdifferentfrom thecorrespondingcodein theXpose1.1manual.
7
2.1.2 The patab
patab standsfor parametertablefile. It shouldcontainthe parameterestimatesofthefit. Therecommendedcolumnorderis ID followedby theparametersof interest,for exampletheindividualCL estimates,the � sand/orthetypical individualparameterestimates.
2.1.3 The cotab andcatab
cotab andcatab standsfor continuousandcategorical tablefile respectively. Thetablefiles shouldcontainthecovariates,dividedup into continuous(cotab ) andcat-egorical (catab ) covariates.The reasonfor this division is that they aretreateddif-ferently in plotsandanalyses,e.g. theGAM. The recommendedcolumnorderis IDfollowedby thecovariates.
2.1.4 The mutab
mutab standsfor multipleresponsevariablestablefile. In asimultaneousfit of twodif-ferenttypesof measurementsit is usuallya goodideato make,for example,goodnessof fit plots for thetwo measurementsseparately. For Xposeto beableto differentiatebetweenthe two, it is necessarywith a flag variablethat will be, for example,1 forpharmacokineticobservationsand2 for pharmacodynamicobservations.Themutabwasin Xposeversion1.1 usedto provide sucha flag variable. In Xpose2.0 it is notnecessarytoprovidetheflagvariablein thisway(seeSection15)butmutab is keptforcompatibility reasons.Therecommendedcolumnorderis ID , FLAG, TIME, IPREDandIWRES.
2.1.5 The mytab
mytab is new to Xpose2.0 andcanbe usedfor itemsnot fitting into the tablefilesdescribedabove,for examplevariablescreatedwithin NONMEM. Sincethis tablefileis completelyuserdefinedthereis no recommendedcolumnorder.
2.2 The extra file
The tablefiles describedin theprevioussectionarelimited to thevariablesavailablewithin NONMEM. NONMEM hasa limit of 20 datacolumnsso if thedatasetto beanalyzedhasmorecolumnsthanthat it is not possibleto accessall, e.g.,covariatesinXpose. To solve this problem,Xposecanreadan extra file (if presentin the currentdirectory),calledextra followed by the run number(for exampleextra1 ). Theformatof this file shouldfollow the formatof NONMEM tablefiles, i.e. thecolumnheadersonthesecondline andnoalphabeticcharactersbelow this line, excepttheE ore in exponents(theperiod,. , thatcanbeusedin NM-TRAN datafiles doesnotwork,neitherdo datesor clock timeswith a colon separatinghour andminutes). It is alsoimportantthat thenumberof lines in theextra file is thesameasin theNONMEMtablefiles. Careshouldalsobetakenwhenselectingthecolumnnames,seeSection2.4.
8
2.3 The Xposedata bases
Whenthe userstartsXpose(seeSection3) it promptsfor a run numberto process.Thisnumberwill bethecurrentrunnumberuntil anotherrunnumberis specified(seeSection5). Eachrunnumberis associatedwith anXposedatabasecontainingthedatafrom the matchingtablefiles andextra file. This databaseis just a regular S-PLUSdataframewith a few add-ons(seeSection2.5) andis calledxpdb followedby therun number. TheXposedatabaseis storedin thecurrentS-PLUSworking directoryand is henceaccessiblein later S-PLUSor Xposesessions4. The Xposedatabasematchingthecurrentrunnumberis calledthecurrentdatabase.
Whenthedatais read(seenext section)theuserareaskedfor (optional)documentationof thecurrentdatabase.Thiscanbethoughtof asa remindernoteto identify acertainrunwithouthaving to consultany otherrundocumetation.
Type any documentation for the new data base and finish witha blank line:1:Test of documentation feature 2:
2.4 How Xposereadsfiles
Whentheuserspecifiesa run number, Xposefirst checksto seeif thereis a matchingXposedatabaseavailablein thecurrentS-PLUSworkingdirectory. If thereis, theuseris askedif he/shewantsto usethealreadyexistingdatabaseor if thedatabaseshouldbe recreatedfrom the tablefiles andextra file. If thereis no matchingdatabasetheuseris askedif he/shewantsto createit.
WhenXposeis told to createadatabase,therelevantfilesarereadin thefollowing or-der(if they existsin thecurrentdirectory):sdtab , mutab , patab , catab , cotab ,mytab andextra . Thedatafrom eachof thesefilesareput togetherin onelargedataframeandcolumnswith duplicatednamesaredeleted5. Whenthereareno duplicateditemsin thecurrentdatabase,Xposedeletesall lineswherethecolumnnamedWREShasanentryof zero(i.e. keepsall lineswith observations)unlesstold otherwise(seeSection5). Thenext stepis to determinewhateachcolumnis, or rathertheotherwayaround,Xposetries to definethe Xposedata variables. This processis describedindetail in thenext section.WhentheXposedatavariablesaredefinedthedataframeisgiventheS-PLUSclassattributexpose andis savedto disk.
2.5 Xposedata variables
WhenXposemakesa plot of, say, WRESvs PREDit takesthe columnin thecurrentdatabasedefinedto be thewres-variableandplots thatagainstthecolumndefinedtobe the pred-variable. The wres- andthe pred-variablesareexamplesof Xposedata
4This is amajordifferencefrom Xpose1.1werethetablefileswerereadeachtimeaplot or analyseswasselectedfrom theXposemenus
5Thefirst occuranceof a duplicatedcolumnis kept,whichmeansthatif theextra file containsa columnwith thesamenameasany of thecolumnsin thetablefiles, thatcolumnwill bedeletedregardlessof if theactualnumbersin thecolumnsarethesame.With the tablefiles this is not a problemsincein NONMEMtwo differentvariablescannothave thesamename.
9
variables. All plotsandanalysesavailablein Xposearedefinedin termsof xposedatavariables.Sinceit is possibleto redefinethesevariablesin Xposeit meansthat it isquite easyto changethe datathat is plotted,e.g. concentrateon oneparameterat atime,evenif weputall parametersin thepatab 6.
Thedefault Xposedatavariableassignmentis donewhenthetablefiles areread(seepreviousSection)andis basedon thepositionof thecolumnsin thetablefiles andthecolumnnamesasdescribedin Table1.
2.5.1 Xposedata variableswithout default settings
Therearea few Xposedatavariablesthatdo not have a default setting.Thesearethedefinitionsof typical parametervalues(tvpar), randomeffects(ranpar) andweights(weight).
As soonastherearecovariatesincludedin themixed-effectsmodelthetypical valuesof theparameterscanbedifferentfrom oneindividual to another. To assesstheappro-priatenessof the inter-individual error model(the � model)onecanplot the randomeffects(i.e. � s)vs thecorrespondingtypicalparametervalues[4]. To beableto do thatin Xposeit is necessaryto definethe typical parametervaluesandthecorrespondingrandomeffects.
If the weightvariableis definedit will be usedin the GAM andthe bootstrapof theGAM analysesasprior weights(seeSections10.5.5and11.5).
2.6 Usingonly onetable file for all data items
Thereasonfor the ratherstrict organizationof the tablefiles is that it helpsXposetodecidewhich columnis what. As indicatedpreviously it is not necessaryto usetherecommendedtablefiles nor the recommendedtablefile ordersinceit is possibleto(re)defineall the Xposedatavariableswithin Xpose. This is a greateasefor NON-MEM IV userswhichno longerneedsto usemultiple$PROBLEMstatementsto makeNONMEM producemorethanonetablefile in a singlerun. It is recommendedthateitherthesdtab , themutab or themytab is usedif only onetablefile is produced.SeealsoSection5.
3 Starting and quitting Xpose
StartS-PLUSin thedirectorywhereyou have theNONMEM tablefiles (seesection2.1andtheinstallationinstructions).At theS-PLUSprompttypexpose2() . Xposewill startanda graphicswindow will be opened.Type the numberof the run to beprocessed.If therealreadyis anXposedatabasepresentthatmatchestherunnumber,Xposewill askif youwantto useit or recreatetheXposedatabasefrom thetablefiles.
6This is oneof themajorchangesin Xposebetweenversion2.0 and1.1. In 1.1 we would have to lookatplotsof all parametersdefinedin thepatab andtheonly wayavailableto changewhatwasplottedwastore-arrangethetablefiles
10
Table1: Descriptionof thedefault Xposedatavariableassignments.Thevariablesappearintheorderthey aredefinedwhenXposecreatesadatabase.
Xposedata variable Xposeusage Default definition �id Usedto identify 1. A columnwith thenameID
individuals 2. Thefirst columnin thefirstexistingfile in inputfile order
�idlab Usedto labeldata Sameasthe id-variable
individualsdv Usein plotswith 1. A columnwith thenameDV
with thedependent 2. Thefourthcolumnfrom theendvariable in thefirst existingfile in input
file order�
res Not usedin the 1. A columnwith thenameRESdefault set-upof 2. Thesecondcolumnfrom theendplots in thefirst existingfile in input
file order�
wres Usedin plotswith 1. A columnwith thenameWRESweightedresiduals 2. Thelastcolumnin thefirst
existingfile in inputfile order�
pred Usedin plotswith 1. A columnwith thenamePREDpopulationpredictions 2. Thethird columnfrom theend
in thefirst existingfile in inputfile order
�idv Usedin plotswith the 1. A columnwith thenameTIME
independentvariable 2. Thesecondcolumnin sdtab3. Thethird columnin mutab
ipred Usedin plotswith the 1. A columnwith thenameIPREindividualpredictions 2. Thethird columnin sdtab
3. Thefourthcolumnin mutabiwres Usedin plotswith the 1. A columnwith thenameIWRE
individualweighted 2. Thefourthcolumnin sdtab orresiduals fifth columnin mutab
flag Usedto makeplotsof Thesecondcolumnin mutabsubsetsof thedata�
curflag Indicatesthelevel of theflag Not definedto beprocessed�
occ Not usedin the Not definedthedefaultset-upof plots
prameters Usedin plotsand Columnstwo from thebeginninganalysesof parameters to thefifth from the
endin patabcovariates Usedin plotsand Columnstwo from thebeginning
analysesof covariates to thefifth from theendin cotab andcatab
�miss Usedto indicate Default setto -99
missingdata�� Alternativearenumberedin theorderthey aretried�Theinputfile orderis sdtab , mutab , patab , catab , cotab , mytab andextra
� SeeSection15�Covariatesfoundin catab will becodedasfactorsin S-PLUS
� SeeSection6.5.1
11
Main menu(8)
Document-ation(4)
Create/modify items
menu(8)
Managedatabases
menu(26)
Datasetcheckout menu
(7)
Goodness of fitplots menu
(7)
Covariatemodel menu
(6)
Parameterplots menu
(7)
Quit Xpose
Tree menu(4)
Modelcomparison
menu(4)
Residual errormodel
diagnosticsmenu(11)
Structuralmodel
diagnosticsmenu(15)
Bootstrap ofthe GAM menu
(5)
GAM menu(9)
Settings for thebootstrap of
the GAM menu(13)
Plots for thebootstrap of
the GAM menu(10)
Scope menufor the GAM
(7)
Settings for theGAM menu
(11)
Figure1: The Xposeroadmap. Eachbox, except the Quit option, is a menu. Thenumberof optionsin eachmenuis indicatedby thenumberin parenthesis.
To quit Xposeselectoption 8 from the MAIN MENU(SeeSection4). Xposeasksifyou justwantto quit Xposeor to quit bothXposeandS-PLUS.
4 Moving around in Xpose
Theuserinterfaceis a systemof text basedmenuswith up to 4 levelsand167menuoption(Figure1).
Below is theMAIN MENU:
MAIN MENU
1: Documentation2: Manage databases3: Data checkout4: Goodness of fit plots5: Parameters6: Covariate model7: Model comparison8: Quit
12
Selection:
Typing a numberat theSelection: promptselectsthecorrespondingoption fromthemenu.Someoptionsleadsto new menuswhile othersleadto a plot or ananalysis.In all menusexcepttheMAIN MENU, optionnumberoneleadsto themenuabove inthesystemof menus.
5 Redefiningand creatingvariables
It is possibleto redefineXposedatavariablesaswell asto createnew dataitemswithinanXposesession.Thisis donefrom theMANAGEDATABASESMENU(option2 fromtheMAIN MENU).
DATABASEMANAGEMENTMENU
1: Return to the main menu2: Change run number/database3: List items in current database4: View the documentation for the current data base5: Change id variable6: Change independent variable7: Change dependent variable8: Change pred variable9: Change ipred variable10: Change wres variable11: Change iwres variable12: Change variable to label data points with13: Change flag variable14: Change current value of the flag variable15: Change occasion variable16: Change missing data variable17: Change parameter scope18: Change covariate scope19: Change typical parameter scope20: Map random effects to typical parameters21: Change name of a variable22: Change weight variable23: Copy variable definitions from another data base24: Create/modify items in the current data base25: Turn color printing on26: Keep all lines when reading table filesSelection:
Option2 will promptthe userfor a new run numberto process,similar to the initialquestionwhenXposeis started.
Option3 will list theitemsin thecurrentdatabase,includingtheXposedatavariabledefinitions:
The current run number is 4 .
13
The data base contains the following items:ID TIME IPRE IWRE DV PRED RES WRESVID CL V KA Q VSS ETA1ETA2 ETA3 ETA4 ETA5 VIST SEX SMOKCIGS AGE HT WT CMPL CRCL
The following variables are defined:
Id variable: IDLabel variable: IDIndependent variable: TIMEDependent variable: DVParameters: CLCovariates: SEX WT CRCL AGE(Continuous: WT CRCL AGE )(Categorical: SEX )Missing value label: -99Wres variable: WRESIwres variable: IWRERes variable: RESIpred variable: IPRE
This optiontells usthecurrentrun number(4 in this case),what itemswe have in thecurrentdatabaseandwhichof theXposedatavariablesthataredefined.
Option4 displaysthedocumentation,if any, for thecurrentdatabase.
Options5-20canbeusedto redefinethedefault settingsfor thexposedatavariables.
Option21is usedtochangevariablenames,i.e. thedefaultnamein plotswill bethethenewly definednameinsteadof thefour letternameobtainedfrom theNONMEM tablefile. Unfortunatelytherearesomelimitations in Xpose(or S-PLUSrather)regardingthenamesthatcanbegiventoavariable.Thenamesshouldnotcontainspaces7 orother“strange”characterslike parentheses.Notethatthisdoesonly applywhenrenamingavariablein thisway. In thespecificationof axislabelsandplot titles it worksperfectlyfinewith any ASCII string(seeSection13).
With option22it is possibleto defineprior weightsusedin theGAM andthebootstrapof theGAM analyses(seeSections10.5.5and11.5).
If only onetablefile is usedit is usuallynecessaryto manuallychangesomeof theXposedatavariabledefinitions. Having to do so for eachrun caneasilybecomete-dious. With option23 it is possibleto copy variabledefinitionsfrom oneXposedatabaseto another.
Option24 leadsto theCREATE/MODIFY DATA ITEMS MENU(SeeSection5.1.)
Option25 will toggletheusageof color whenprinting andexportingplots (seeSec-tion 13).
Option26affectsthewayXposereadstablefiles. By default,Xposewill not put tablefile lineswith WRESequal0 into thecurrentdatabase.Thereare,however, instanceswhenthis behavior is not desiredandby selectingthis optionbeforereadinga setoftablefiles,Xposewill keepall tablefile lines.
7Spacesdo actuallywork in someplots. If you wantto have spacesin your variablenames,just try it, itmightwork!
14
5.1 Creatingand/or modifying data items
CREATE/MODIFY DATA ITEMS MENU
1: Return to the previous menu2: Add a Time After Dose item to the current data base3: Add averaged covariates to the current data base4: Add exponentiated values of an item to the current data base5: Add the logarithm of an item to the current data base6: Add absolute values of an item to the current data base7: Change covariates categorical <-> continuous8: Copy data item from another data baseSelection:
This menu,which canbe reachedfrom the MANAGEDATABASESMENUcontainsoptionsthatcreatesor modifiesdataitemsin thecurrentdatabase.
Option 2 createsa time after dosedataitem (the Xposedatavariableis tad). Thisis useful if we aredealingwith a long and/orvariabledosinghistory, i.e. whenthetimesof theobservationsarerelative to thefirst doseratherthanthetimesincethelastdose.Selectingthis optionwill createa new item in thecurrentdatabasecalledTAD.Redefiningthe idv variableto TADwill make theplotsusingtheindependentvariable,e.g.thebasicgoodnessof fit plots,usetimeafterdoseinsteadof thetimesincethefirstdose. This option will try to readeitherof sdtab , mutab or mytab in that order.The columnin the readtablefile that matchesthe nameof the independentvariablein thecurrentdatabasewill beusedfor thecalculations.If no suchcolumnis found,Xposewill usethesecondcolumnfor thesdtab andmytab or thethird columnforthemutab .
Option 3 will addcolumnswith within individual averagesof continuouscovariatesthatvarieswithin oneor moreindividuals. These“new” covariateswill get thenameof the original covariatewith an m appended,e.g. the within individual averageofserumcreatininelevels(SECR) will beSECRm.
Option4 will adda columnwith theexponentiatedvaluesof a userspecifieditem inthe currentdatabase.The nameof the new columnwill be thenameof the originalcolumnprependedby exp , e.g.expPRED.
Option5 will adda columnwith the logarithmof a userspecifieditem in thecurrentdatabase. The nameof the new column will be the nameof the original columnprependedby log , e.g. logDV . This option will not work if therearevaluesin thecolumnthatarelessthanor equalto zero
Option 6 will adda columnwith the absolutevaluesof a userspecifieditem in thecurrentdatabase.Thenameof thenew columnwill bethenameof theoriginalcolumnsurroundedby verticalbars,e.g. |ETA1| .
With option 7 it is possiblechangecontinuouscovariatesinto categorical covariatesandviceversa.
With option 8 it is possibleto copy a dataitem from anotherxposedatabaseto thecurrentdatabase. The new column will be called the sameas the variablecopiedappendedby the run numberof the databaseit wascopiedfrom, e.g. CL1 (the CLcolumnof thedatabasefor runnumber1).
15
6 Plotting conventions
This sectiondescribessomedetailsaboutthe way that Xposedraws plots andalsoadescriptionof themoreunusualplot types.
6.1 Plotting symbolsand individual curves
Many of theplotsin Xposedonotusetheregulartypeof plottingsymbols,like◆ or ❑,but rathera number, whichby default is theID number(Figure2). This is to facilitatethe identificationof individualsandto make it possibleto follow individualsbetweenplots.Thedataitemusedastheplottingsymbolcanbespecifiedin thedatabaseman-agementmenu.It is, for example,perfectlypossibleto specifythevariableSEXasthevariableto label datapointswith, makingit directly visible if, e.g. malesin generalhave largerWRESthanfemales.It is alsopossibleto turn theusageof ID numbersoff(SeeSection14.)
In someplots, individual datapointsareconnectedwith a line (Figure2) to make iteasierto spottrendsin, e.g.goodnessof fit plots. To decidewhichpointsthatbelongsto acertainindividual,XposeusestheXposedatavariableid. This is defaultsetto theID , but canberedefinedin thedatabasemanagementmenuto besomethingelse.Thedataitem to beusedasplotting symbolandthedataitem identifying individualscanbespecifiedseparately(seeSection14).
6.2 Smoothes
One way to make it easierto discernunderlyingpatternsin a plot with many datapoints is to adda smoothnon-parametriccurve. The ideawith sucha curve is thatit shouldsmoothlyfollow the generaltrendin the datawithout makingassumptionsaboutthe functional form of the possiblerelationshipbetweenthe variablesplotted.Suchsmoothesarefrequentlyusedin the plots in Xpose(Figure2). The curvesareobtainedusingtheloess() functionin S-PLUS.loess useslocally weightedlinearor quadraticregressionin overlappingintervalsof thedata[5]. The loess curvesinXposeusea locally linearmodelandaninterval spanof 2/3of thedatarange.
6.3 Visual referencegrids
In someplots thereis a visual referencegrid (Figure3). Thesearenot intendedforreadingoff individual datavaluesbut ratherfor makingthecomparisonwith anotherplot or paneleasier.
6.4 Trellis Graphics
Trellis GraphicsTM is a new graphicslibrary in S-PLUS.It hasbeendevelopedbyWilliam S.Clevelandandcoworkersandis describedin thebookVisualizingData [6].ThetheorybehindTrellis GraphicsTM is basedonmodernresearchonexploratorydata
16
1
1
1
1
1
1
1
1
1
2
2
22
2
22
2
2
2
3
3
3
3
3
3
3
3
3
3 4
4
4
444
4
4
4
55
5
5
5
5
55
5
6
6
6
6666
6
6
7
7
7
7
77
7 888
88
88
88
9
999
99
9
9
10
10
10
10
10
12
12
12
1212
121212
12
13
13
131313
13
1313
13
13
14
1414
14
14
14
14
14
14
14
15
15
15
15
15
1515
15
15
15
16
16
16
16
16
1616
16
16
16
17
17
17
1717
17
171717
17
18
18
18
181818
18
1818
18
19
19
1919
19
19
1919
19
19
20
20
20
2020
20
20
20
20
21
21
21
21
21
21
2121
21
22
22
2222
222222
2323
23
232323
23
2323
23
24
2424
24
24
24
2424
24
24
25
25
25
25
25
25
25
25
25
25
26
26
26
2626
2626
26
26
2627
27
27
27
27
27
27
27
27
27 28
28
2828
28
282828
28
28
29
29
29
29
29
29
29
29
29
30
30
30
30
3030
30
30
30
30
31
31
31
31
31
31
31
31
3131
32
32
32
32
323232
32
32
32
3333
33
33
33
33
33
33
33
33
34
343434
34
3434
34
35
35
35
35
35
35
35
35
36
36
36
36
36
36
36
36
36
36
37
37
37
37
37
38
3838
38
38
39
39
39
39
39
39
39
3940
40
40
4040
4040
40
40
40
4141
41
41
41
41
41
41
41
41
42
42
42
42
42
42
43
43
43
43
43
43
43
43
43
43
44
44
44
44
44
44
44
44
44
44
45
45
45
45
46
46
464646
46
46
46
46
46
47
47
47
47
474848
48
484848
48
4848
48
48
48
49
49
49
49
4949
49
49
49
49
50
50
50
50
50
50
50
50
50
50
51
51
5151
5151
515151
52
52
52
52
52
52
52
52
52
52
53
53
53
53
53
53
53
53
53
53
54
5454
54
54
54
54
54
54
54
55
55
55
5555
55
55
5555
56
56
56
56
56
56
57
57
57
57
57
57
57
57
57
57
58
58
58
5858
58
58
59
59
59
59
59
59
59
59
60
60
60
6060
60
60
60
60
60
61
6161
61
6161
616161
6162
6262
62
62
62
62
62
62
63
6363
6363
6363
6363
64
64
64
64
64
64
646464
64
65
6565
65
65
65
65
65
65
65
-2
0
2
4
0 20 40 60 80
Predictions
WRE
SWRES vs Predictions�
Figure2: A plot of WRES vs predictions.The pointsarelabeledwith the ID num-berandeachindividualspointsareconnectedby a dashedline (alsobasedon the IDnumber).Addedto theplot is a smoothcurveto enhanceany underlyingpattern.
analysisandvisual perception[7, 8]. Xposedoesnot utilize the full power of Trel-lis GraphicsTM . To do so it would be necessarywith a far moreflexible graphicaluserinterfacethanthetext basedmenusystemthatis currentlyimplementedin Xpose.Nevertheless,Trellis GraphicsTM offersa lot of advantages(andimplementationdraw-backs!)over thecoreS-PLUSgraphicsandwill bethedefault graphicsmethodologyin S-PLUSfrom version4 andonward.
6.4.1 Multi-panel conditioning
Oneof the big advantagesof Trellis GraphicsTM is the easewith which plots of onevariablevs anothergiven a third canbe drawn. This is in the coreS-PLUSgraph-ics calleda coplot but is in the Trellis GraphicsTM terminologycalleda multi-panelconditioningplot [9]. The ideabehindthis type of plot is to extendthe regular twodimensionaldiagramsto threeor moredimensions(morethanthreedimensionsis pos-sibleby specifyingmorethanonegivenvariable,however, Xposeusesonly onegivenvariable)andthusbe ableto visualizerelationshipsthat would otherwisebe hardtosee.
Figure4 is anexampleof a multi-panelconditioningplot of CL vs HCTZ(a categori-
17
1
1
1
11
1
1
1
1
101
101 101
101
101
101
101
101
101
101
2
2
2
2
2
2
2
22
2
3
33
3 3 3 3 3 3 3
4
4�
4�
4
4
4�
4�
44�5
5
55
5 5 55 5
105105
105 105105 105 105
105 105
6
6
6
6
6
6
6
66
77 7
7
7
7
7
8 8
8
8
8
88
88
108108
108108
108108
108108 108
9
9
9
9
99
99
10
10
10 1010
110
110
110110
110
110
1212 12 12 12 12 12 12 12
1313 13 13 13 13 13 13 13 13
113
113113
113113 113 113 113 113 113
1414
14 1414 14 14 14 14 14
15
1515
1515
1515
15 1515
115
115 115 115
115115
115115
115115
1616
1616 16 16 16 16 16 16
17
17
1717
17
17
1717
1717
117
117 117
117
117
117
117
117
117
117
18
18
18
1818
1818 18
18 18
118 118118
118118
118118
118
19
1919
19
1919
1919
1919
119
119
119
119
119
119119
119
119
11920
20
20 2020
20 20 2020
2121 21 21 21 21 21 21 21
121
121121
121 121121 121 121 121 121
22
22
2222
2222
22
122
122122
122 122122
122 122 122 122
23
23
2323
2323
2323
2323
24
24
24
2424 24
2424
2424
124
124 124124
124
124 124124
12425
25
25 2525
25
25 25 2525
125
125
125 125125
125 125
125125 125
26
26 26
26
26
2626
2626
26
126 126
126
126126
126126 126
126 126
27
27
27
27 2727
27 27
2727
28
2828
28
28
28
28
28
28
28
29
2929
29 29
29 29
29
29
129
129129
129
129
30 30
30
3030
3030
30
30
30
31 31
31 31
31 31 31 3131 31
32
32
32
32
3232
32 3232
32
132
132132
132
132 132 132132
132132
3333
33
33
33
33
33
33
3333
133
133133
133133 133
133133
34 34 34 34 34 34 34 34
35
35
35
35 35
3535
3536
36
36
36
36
36
36
36
3636
37
3737
37
37
3838
3838 38
39 3939 39 39
39 39 3940� 40� 40
4040�
40�
40 40 40�
40
140
140 140140 140
140 140140
140
41� 41�
41 4141�
41�
4141 41
�41
141
141 141 141 141141
141141
141
42� 42�
4242
42�
42
142
142
142
142
142
142
142
142
142 142
43� 43�
43 43 43�
43�
43 43 43�
43143
143 143 143 143 143 143 143 143
44� 44�
44 4444�
44�
4444 44
�44
45� 45
45
45�
145
145
145145
46�
46�
46
46
46�
46�
46
46
46�
46�
146
146
146
146
47�
47
47�
47
47
48�
48�
4848
48�
48�
48 48 48�
48�
48�
4849�
49�
4949
49�
49�
4949
49�
49
149
149149
149149
149149 149
149149
50
5050
50
50
5050
50
5050
150
150
150
150
150
150150
150
150 150
51
51
51
5151
5151
51 51
151
151151
151151 151 151 151 151
5252
52
52
52
5252
52
525253 53
53 53 53 53 53 53 53 53
153153 153 153 153 153 153 153 153 153
54
54
54
54
5454
5454
5454
154
154
154154
154 154
55
5555
55
55
55
55
55
55
155
155 155
155
155
155
155
155
155155
56
5656
56 56 56
57
57 57
5757
57
5757 57 57
58
58 58 58 58 58 58
158 158 158 158158 158 158
158 158 158
59
59
59
59
59
5959
59
159
159159 159
159
159
159
159
159
60
60
60
60
6060
60 60
6060
61 61 61 61 61 61 61 61 61 61
161161 161 161 161 161 161 161 161
62
62
62 62
62
62
62
62
62
162
162
162
162
162
162162
162 162
6363
63 63 63 63 63 63 63
163 163 163163 163
163 163 163 163 16364
64
64
64
6464
64 64 64 64164
164164 164
164 164164
164164 164
65
65 65
65
65 6565
6565 65
0
50
100
150
Observations vs Time
2 4 6 8 10 12
1
1
1
1
1
1
1
1
1
101
101
101
101
101
101
101
101
101
101
2
2
2
2
2
2
2
2
2
23 3 3 3 3 3 3 3 3 3
4�
4�
4
4
4�
4�
4
4�
4
55
55
5 5 55 5
105105
105105
105 105 105 105 105
6
6
6
6
6
6
66
6
77
7
7
7
7
7
8 8
8
8
88
88
8
108108
108
108
108108
108108
108
9
9
9
99
9
99
1010 10
1010
110110
110
110
110
110
12 1212
12 12 12 12 12 12
1313
13 13 13 13 13 13 13 13
113113
113113
113 113 113 113 113 113
1414
14 14 14 14 14 14 14 14
15
15
15
1515
1515
1515
15
115
115 115115
115115
115115
115115
16 1616 16 16 16 16 16 16 16
17
17 1717
17
17
1717
17
17
117
117117
117
117
117
117
117
117
117
18
18
18
1818
1818
1818 18
118 118118
118118
118118
118
19
19 1919
1919
1919
1919
119
119119
119
119
119
119119
119
11920 20 20
20 20 20 2020 20
2121 21 21 21 21 21 21 21
121 121 121 121 121 121 121 121 121 121
2222
2222
2222
22
122
122
122122
122122 122 122 122 122
23
23 2323
2323
2323
2323
24
24
24
2424
2424
2424
24
124124 124
124124
124124
124124
2525 25 25
2525
25 25 25 25
125
125
125
125125
125125
125125 125
26
2626
26
26
2626
26
2626
126 126
126
126
126126
126126
126126
2727
2727
2727
2727
2727
28
2828
28
28
28
28
28
28
28
2929
29
29
29
2929
2929
129
129
129
129
129
30 30
30
30
30
30
30
30
30
30
3131
3131
3131 31 31 31 31
32 3232
3232
3232
3232 32
132
132
132132
132132
132 132132 132
3333
33
33
3333
3333
3333
133
133 133133
133133
133
13334 34 34 34 34 34 34 34
35
3535
35
35
35
35
35
36
36 3636
36
36
3636
3636
37
37 37
37
37
38 3838
38
38
39 39 39 39 39 39 39 39
40� 40 40 40
�40�
40 40 40�
40 40�140
140 140 140 140140 140
140140
41� 41 41
41�
41�
4141 41�
41 41�141
141 141 141 141141 141 141
141
42� 42
42
42�
42
42�
142
142
142
142
142
142
142
142
142
14243� 43 43 43
�43�
43 43 43�
43 43�143 143 143 143 143 143 143 143 143
44� 44 44 44
�44�
44 44 44�
44 44�
45� 45
�45�
45
145
145
145
145
46�
46
46
46�
46�
46
46
46�
46�
46
146
146
146
146
47� 47
�47�
47�
47�48
�48
4848�
48�
4848
48�
48�
48 48 48�
49� 49
49
49�
49�
4949
49�
4949�
149
149 149149
149149
149149
149149
50
5050
50
50
50
5050
5050
150
150
150
150
150
150
150150
150150
51
51
51
5151
5151
51 51
151
151151
151151
151 151 151151
52
52
52
52
52
52
52
52
52
5253 53 53 53 53 53 53 53 53 53
153 153 153 153 153 153 153 153 153 153
54
54
54
54
54
5454
5454
54
154
154
154154
154154
55
55
55
55
55
55
55
55
55
155
155
155
155
155
155
155
155
155
155
56
5656
56 56 56
57
57 5757
5757
5757
5757
58
5858 58 58 58 58
158158 158 158 158 158 158 158
158 158
59
59 5959
5959
59
59
159
159 159159
159
159
159
159
159
60
60
60
60
6060
6060
6060
61 61 61 61 61 61 61 61 61 61
161 161 161 161 161 161 161 161 161
62
6262
62
62
62
62
62
62
162
162 162
162
162
162162
162162
63 6363
6363
63 63 63 63
163 163 163163
163 163 163 163163 163
6464 64
6464
6464 64 64 64
164164 164 164
164164
164164
164164
65
65 6565
6565
65
6565
65
Individual predictions vs Time
2 4 6 8 10 12
Time
Con
cent
ratio
n
Observed and predicted concentrations vs Time
Figure3: A plotof theobservedandpredictedconcentrationsvstime. Theobservationsandpredictionsareplottedvs time in separatepanelsusingthesamescaleson thex-andy-axesrespectively. A visualreferencegrid hasbeenaddedto facilitatecomparisonof theplotsin thetwo panels.
cal covariateindicatingwhetherthepatientwasalsotreatedwith hydrochlorothiazide)givenSEX(thegivenvariable). In this casetheconditioningis basedon thelevels inthevariableSEX(1 or 2). It is alsopossibleto conditionon intervalsof a continuouscovariate(Figure5). The top panel,calledthegivenpanel,shows the intervals. Thex-axisof thegivenpanelgivestherangeof thegivenvariableandthehorizontalbarsshowstheintervals.Themulti-panelplot at thebottomis theactualplot, dividedup intheintervals. Thelowestbar in thegivenplot correspondsto thelower left panel,thesecondlowestbarcorrespondsto the lower right panelandsoon from bottomleft totop right.
In Xpose,thegivenintervalsareconstructedsothatthereareapproximatelythesamenumberof pointsin eachpanel,with approximatelyhalf of thepointsin commonbe-tweenthe intervals. The latter is to have a smoothtransitionbetweenthe panelssothat,e.g. a smooththroughthedatain eachpanelcanbefollowedfrom lower left totop right. Xposewill determinethenumberof intervalsfrom thetotal numberof datapoints.With fewer than50 datapointstherewill betwo intervals,with between50-75datapointstherewill bethreeintervals,with between75-100therewill be4 intervalsandwith morethan100pointstherewill besix intervals.
6.4.2 Box and whiskersplots
The plot in Figure4 is a box andwhiskersplot (in the coreS-PLUSgraphicscalleda boxplot). Theseplotsarein Trellis GraphicsTM plottedhorizontallyratherthanver-tically. The solid dots in the middle of the boxesare the medians,which measuresthecenter, or location,of thedistributions. The lengthof thebox, which is the inter-quartilerange(i.e. themiddle50%of thedataareenclosedin thebox), is a measure
18
1
0
SEX: 1
10 20 30 40
1
0
SEX: 2
CL
HCTZ
CL vs HCTZ given SEX
Figure 4: A multi-panelconditioningplot of the relationshipbetweenCL and hy-drochlorothiazide(HCTZ ) usagegivenSEX . Seetext for furtherdetails.
of thespreadof thedistribution. If theinter-quartilerangeis smallthemiddledataaretightly packedaroundthemedian.If the inter-quartilerangeis large,themiddledataspreadout far from themedian.Therelative distanceof theupperandlower quartilesfrom the mediangivesinformationaboutthe shapeof the distribution. Unequaldis-tancesindicatesthat the distribution is skewed. The whiskersencodethe upperandlower adjacentvalues.Theupperadjacentvalueis the largestobservationthat is lessthanor equalto theupperquartiletimes1.5.Theloweradjacentvaluefollow thesame,but opposite,calculations.Outsidevalues,whichareobservationsbeyondtheadjacentvalues,aregraphedindividually. Theadjacentandoutsidevaluesprovide summariesof thespreadandshapeat thetailsof thedistribution.
Thegroupsin eachof thepanelsof Figure4 areorderedsothattheHZTCgroupwiththelowestgrandmedian(i.e. regardlessof thedivisionin malesandfemales)is plottedfirst, thatis lowest,in eachpanel.This is calledmaineffectsorderingandis partof theTrellis GraphicsTM philosophy[6]. It alsomakesit easierto seesubgroupdeviationsfrom thegeneraltrend. This orderingis usedin almostall box andwhiskersplots inXposewith thenotableexceptionof theGAM plots(seeSection10).
19
10
20
30
40
WT
30 40 50 60 70
WT
WT
10
20
30
40
WT
30 40 50 60 70
AGE
CL
1
2
3
4
60 80 100 120 140
WT
Inte
rval
s
CL vs AGE given WT
Figure5: A multi-panelconditioningplot of CL vs AGE givenintervalsof WT . Seetext for furtherdetails.
6.5 What data is plotted?
In theprocessof readingtheinput tablefilesandtheextrafiles(seeSection2.4)Xposediscardsthe lines in the tablefiles with a WRESentry equalto zero. This is to getrid of the linescorrespondingto dosingevents,which do not containany informationaboutthefit. Theremainingdata,henceforthreferredto asall data,correspondsto theobservationeventsin theNONMEM datafile.
In many plotsweareinterestedin all data,for exampleplotsincludingWRES. In plotsandanalysesof the parameters,on the otherhand,we areusuallyonly interestedinonedatapoint per individual. In Xpose,this datapoint is taken from the first line ineachindividual. For example,if we want to plot CL vs CRCL(creatinineclearance),Xposewill plot the CL andCRCLvalueson the first line of eachindividual, regard-lessof whethertheCRCLchangewithin the individual. This behavior is fine aslongasthecovariatesdo not changewithin the individuals. If therearesmallwithin indi-vidual changesin the covariatesor if the changeis of no interestto the analysis,we
20
can let Xposecalculatewithin individual averagesof the continuouscovariates(seeSection2.4)andthenusetheaveragedcovariatesinsteadof thecorrespondingrealco-variateswhenmakingtheplots. An alternative strategy is to constructa new variablethatchangevalueevery time thewithin individual varyingcovariatechange(this canpresentlynot bedonewithin Xpose).Making theplot with theXposedatavariableidsetto thisnew variablewill displayall covariatevalues.
Thereare othermeansto influencethe way Xposeselectsthe datato be plottedoranalyzed.In theGAM we cantell Xposeto useall datainsteadof the first datalinefor eachindividual (seeSection10). It is alsopossibleto makeplotsandanalysesonasubgroupof thedata(seeSection15).
6.5.1 Handling of missingdata
If the Xposedatavariablemissis set(the default valueis -99 but canbe changedinthedatabasemanagementmenu),Xposewill treatthevalueof thisvariableasmissingdata. The way this works in practicedependson the plot or analysisrequested.Inplotsof onedataitem vs another, for examplea parametervs a covariate,individualswith a valueof eitherthecovariateor theparameterequalto themissvariable,will beexcludedfrom theplot. In plotsor analysesthathandlesblocksof data,e.g. theGAMor multi panelconditioningplots, individualswill be excludedthat have at leastonedataentryequalto themissvariable.For examplein theGAM, if an individual havea missingvalueof, say, CRCL, that individual will be excludedcompletelyfrom theGAM analysis.
7 Documentingruns
A new featurein Xpose2.0 is thepossibility to documentrunsin otherwaysthanintheform of plots. Rundocumentationcanin Xpose2.0bedonein two ways. First, asinglerun canbedocumentedusinga run summary. Second,a seriesof runscanbesummarizedin a run record. Thesedocumentationfeaturescanbe reachedfrom theDOCUMENTATIONMENUon theMAIN MENU.
DOCUMENTATIONMENU
1: Return to previous menu2: Change run number3: Summarize run4: Construct a run recordSelection:
7.1 The run summary
Therun summaryis a onepagesummaryof a run consistingof goodnessof fit plots,NONMEM terminationmessages,parameterestimatesand,optionally, themodelfile.An examplerun summaryis displayedin Figure68. Therun summaryis intendedas
8Thecodein thefigureworksonly for NONMEM V
21
a way to documentruns.It is, however, nota substitutefor theNONMEM outputfile,whichcontainsloadsof other, useful,information.
At the top of the run summaryarefour basicgoodnessof fit plots. Below the plots,in the left column, are the terminationmessagesfrom NONMEM, followed by thevalue of the objective function. Next follows a table with parameterestimates. inthe caseof the � , the estimatesare reportedas they comein the NONMEM outputfile. Thevarianceestimates( �� and �� ) areconvertedto standarddeviationsandthecovariances(off-diagonal � s and � s) areconvertedto correlations. The standarderrorestimatesarereportedasthecoefficientof variations(CVs)of thecorrespondingparameterestimates,i.e. thestandarderrordividedby theparameterestimate.For the�� s and �� s theCV refersto thevarianceestimateandnot to thestandarddeviation.Xposecanhandlea maximumof 40 � sand30 �� sand �� s.
Below the tablewith parameterestimatesis themodelfile, possiblycontinuingin theright column.Theinclusionof themodelfile is optional(seebelow). Therunsummaryis only onepagelongsoif themodelfile is longerthanwhatwill fit in theleft andrightcolumn,it will betruncated.
A run summarycan be requestedwith option 3 on the DOCUMENTATIONMENU.Whentheuserselectsthisoption,Xposefirst askstheuserif heor shewantsto sendtheoutputdirectly to theprinter9 (this is differentfrom all otheroptionswhereoption toprint theplot is availablefirst aftertheplot hasbeenshown onscreen,seeSection13).Whentheuserhasspecifiedwheretherun summaryis to besent,Xposepromptsforthenameof themodelfile. Pressingreturnwill acceptthedefault modelfile name10
andtyping 0 meansthatno modelfile shouldbe includedin therun summary. If thedefault file nameis not right the userhasto type the correctfile namefollowed byreturn. Thenext questionis aboutthe nameof the outputfile name.Pressingreturnacceptsthedefault outputfile name11 andtyping 0 will returnto theMAIN MENU. Ifthedefault file nameis not right theuserhasto typethecorrectfile namefollowedbyreturn.An exampleof how a runsummarycanberequestedis givenbelow:
DOCUMENTATIONMENU
1: Return to previous menu2: Change run number3: Summarize run4: Construct a run recordSelection: 3Do you want to send the summary output to the printer y(n)?
Type the name of the model file (0=skip model file,return=run4.mod)run4.modType the name of the output file (0=exit,return=run4.lst)run4.lst
9Therun summaryoutputis optimizedfor printingandwill consequentlynot look verygoodonscreen10Thedefaultmodelfile nameis constructedfrom thecombinationof run , therunnumberand.mod , for
examplerun1.mod .11Thedefault outputfile nameis constructedfrom thecombinationof run , therun numberand.lst , for
examplerun1.lst .
22
Summary of run 3
0 50 100 150
020
4060
80
11
1
1
1
1
1
1
1
22
2
2
2
2
22
223 333333333
44
4
44
44
44
555555555
66
6
6
6
6
66
6
77
7
7
7
7
7
88
8
88
88
88
99
9
99
9
99
1010
10
10
101212
12121212121212
13131313131313131313
14141414141414141414
151515
15151515151515
161616
16161616161616
1717
17
17
17
17
1717
1717
1818
18
1818
1818
181818
19 19
19
191919
1919
1919
202020202020202020
212121212121212121
22 2222
2222
2222
232323
2323
232323
2323
242424
242424
24242424
252525
25252525252525
26 2626
2626
262626
2626
2727
27
2727
2727
272727
2828
28
28
28
28
2828
2828
29
29
29
29
29
2929
2929
3030
30
30
30
30
3030
3030
31313131313131313131
323232
3232
323232
3232
3333
33
33
33
33
3333
33333434343434343434
35 35
35
3535
3535
35
3636
36
36
36
36
3636
3636
37 37
37
3737
3838
38
38
38
3939393939393939
404040
4040
4040404040
41414141
41414141
4141
4242
42
42
42
42
43434343434343434343
444444
4444
4444444444
45
45
4545
4646
46
46
46
46
4646
46
46
47
47
47
47
47
4848
48
484848
484848484848
49 49
49
4949
494949
4949
5050
50
5050
5050
5050
50
5151
51
5151
51515151
5252
52
5252
525252
525253535353535353535353
5454
54
5454
5454
5454
54
5555
55
55
55
55
55
55
5556
5656565656
57 57
57
5757
5757575757
58
5858585858
58
59 59
59
59
5959
59
59
6060
60
6060
606060
60606161
6161616161616161
6262
62
6262
6262
6262
63636363636363
6363
64 64
64
646464
64646464
65 65
65
656565
65656565
PRED DV
0 50 100 150
020
4060
8010
0
1
1
1
1
1
11
11
22
2
2
2
22
2
223 333333333
44
44
44
44
4
555555555
6
6
6
6
6
66
66
77
7
7
7
7
7
888
88
88
88
9
9
99
99
99
101010
1010121212121212121212
13131313131313131313
14141414141414141414
151515
1515
1515151515
16161616161616161616
17
1717171717
1717
1717
18
18
1818
181818181818
19191919
1919
191919
1920202020202020202021
2121212121212121
22 222222
222222
2323232323
2323232323
2424
2424
24242424
242425
252525252525252525
262626
2626
2626
262626
272727
2727
272727
2727
28
2828
28
28
28
2828
2828
2929
292929
2929
2929
3030
30
30
30
3030
30
3030
3131
3131
313131313131
323232
3232
3232323232
333333
3333
3333
3333
333434343434343434
3535
353535
3535
35
36
36 3636
3636
3636
3636
373737
37
37
3838383838
3939393939393939
40404040404040404040
41414141414141414141
42
4242
42
42
42
43434343434343434343
44444444444444444444
4545
4545
4646
46
46
46
46
4646
46
46
47
4747
47
47
484848
484848484848484848
4949
4949
4949
49494949
5050
505050
5050
5050
50
51
5151
515151
515151
52
52
52
52
52
525252
525253535353535353535353
54
54
54
5454
545454
5454
5555
55
55
55
5555
5555
56
5656565656
575757
5757
5757575757
58
585858585858
59595959
5959
59
59
606060
6060
606060
606061616161616161616161
62
6262
62
62
6262
6262
636363636363636363
64 646464646464646464
656565
656565
65656565
IPRE DV
•••
•
•••
••
••
•
••
•
•••
•
••
•
••
•
••
••
•
••
•
••
•
•
••
•••
•
••••
•••
••
••
•
•
•
•••
•••
•
•••
••
••• •••••
•
•
•
•
•
•
•••
•
•
•••
•
•• •
•
••••••
•
• •••
•
•••
•
•
•
••
•
••••
••
• ••••
••••••
•
•
•••
••
•••
•
••••••
•
•
•
•
•••
•••
•••
••
••••
•
•
• ••••
•••
•
•
•
•
•••••
•
•
•
••••
•••••
•••
•
••••
•
•
•
•
•
•
••
••
•
•
•
••••••• •••
•
••
••
••
•
•
•
••••••
•
•
•••
•
•
••• •
••
•
••••
••
•
•
••
•
•••
•
•••
•••••
••
•••
••
•
•
•
•
••••
••••
••
•
•
•
•
••
•
•
•
•
•
•
•
•
••
•
•
•
•
•••
••••••
•
•
••••
•
•
••••
•••
•
•
•••
••
•
•
•••
•••
•
••
•
••
•••
•
•
•
•
•
•••
•
•
•
•••
•
••
• ••••
••
••
••
•
•
••• •
••
••
•
•
••••
•
••
••
•
••
••
•
••
•
•
••
••
••
••
•
•
•••
••••••
••
•
••••
•
•••••
•••••
•
••
•••
•• •
•
•
••
••
•
••
•
••
•
• •
••
•
••
••
••
•
•
•••
••
•
••
•
•••
••
•
•
•
••••
••••
•
•
••
••
••
•
•••
••••
••
••••
•
•
••
•
••
•
•
•
••
•
•
•
••
•
•
••
•
•
•
0 20 40 60 80 120
0.0
0.2
0.4
0.6
| IWRE | vs IPRE
2 4 6 8 10 12
-20
24
6
11 1
1
1 1 11
1
2
2
2
2
22
2 2
22
3
3
3
3 33
3
3
334
4
4
4
44
4
4
4
5
5 55
5
5
5 5
56
66
6 6 66
6
67
7
7 7
7 7
78
8
88
8
8
8 88
99
9 9 9 99
9
10
10
10
10
10
12
1212
12 1212
1212
12
13
13
13 13 1313 13 13
1313
14
14 14
14
14 14 1414
1414
15
15
15
1515 15
15
15
15
15
1616
16
16
1616 16
16
16 16
17
1717
1717
17
1717 17
1718
1818
18 18 18 1818
18
18
19
1919
19
19
19
19
19
19
19
20
20
2020
20 20
20
20
20�21
2121 21
21
21
2121
21
22
22
2222
22 22 2223 23
23
23 2323
2323 23 23
24
2424 24
24
24
24
2424
2425
25
2525
25
25
25 25
25
25
26
26
26
26 2626
26
26 2626
27
2727
27
2727
27 27
27
27
28 28
28
2828 28
28 28
2828
29
2929
29
2929
29 292930
3030
30
30 3030 30
30
30
31
31
31
31
31
31 31
31
31
31
32
3232 32
32 32 3232
32
3233 33
33
33
33
33
33
33
3333
34
34 3434
34 34 34
3435
35
35
35
35
3535
35
36
36
36
36
36
36
36 36 36
36
37
3737
37
37
3838 38 38
38
39
39
39
3939
39
39
3940
40
40 40 40
4040 40
40
40
41 41
4141
41
41
4141 41
41
4242 42
42
4242
43
43 4343 43
43
43
43
43
43
44
4444 44 44
44
44
44
44
44
45
45
45
45�46 46 46 46
4646
4646
46
46�
47
4747
47
4748
48
4848
4848
48
4848
4848� 48
49
49
49 49
4949 49
4949
49
5050
50
50
50
5050 50
50
5051
5151 51
51 51 51 51 51
52 52 52
52
52
5252
52
52
525353
53 53
53
53
53
5353
535454
54
54 54
54 54 5454 5455
55
55
5555
55
5555
55�
56
56
56
56
56
56
57
57 5757
57 57
5757
57
5758
5858
58 58
5858
59 59
59
5959
59
59
59
60
60
60
60
6060 60
60
60
60
61 6161
6161 61
61 6161 6162
62
62
6262
6262
62
62
6363
63 63 6363
6363 63
64
64 6464
64
64
64 64 6464
65
65 65 65
65
6565
65
65
65
WRES vs TIME
MINIMIZATION SUCCESSFUL�NO. OF FUNCTION EVALUATIONS USED: 223
NO. OF SIG. DIGITS IN FINAL EST.: 3.1
Objective: 2186.855
TH1
TH2
TH3
TH4
OM1:1
20.7
85.1
1.75
0.916
0.37
OM2:2
OM3:3
SI1:1
0.43
1.2
0.17
$PROB Simulated data distributed with Xpose 2
;;
;; As run2 but with the covariance step and NM V tables
;;
;;
$INPUT ID AMT TIME DV SS II SEX AGE RACE HT SMOK CON1 CON2 CON3 WT SECR DOSE
$DATA data1.dta IGNORE=#
$SUBROUTINE ADVAN2 TRANS2
$PK
SXCL = 1
IF(SEX.EQ.2) SXCL = THETA(4)
TVCL = THETA(1)*SXCL
TVV = THETA(2)
TVKA = THETA(3)
CL = TVCL *EXP(ETA(1))
V = TVV *EXP(ETA(2))
KA = TVKA *EXP(ETA(3))
S2=V
$THETA (0,20) (0,87) (0,2) (0,0.5)
$OMEGA 0.12 0.14
$OMEGA 1.8
$ERROR
DEL = 0
IF(F.LE.0) DEL = 1
IPRED = F
IRES = DV - F
W = F + DEL
IWRES = IRES/W
Y = IPRED + W*EPS(1)
$SIGMA 0.23
$EST NOABORT POSTHOC
; Table statement for NONMEM IV. The data variables has to be defined
; within Xpose.
;$TABLE ID TIME IPRED IWRES CL V KA AGE HT WT SECR SEX RACE SMOK
; CON1 CON2 CON3
; NOPRINT ONEHEADER FILE=mytab3
; Table statements for NONMEM V
$TABLE ID TIME IPRED IWRES
NOPRINT ONEHEADER FILE=sdtab3
$TABLE ID CL V KA ETA1 ETA2 ETA3
NOPRINT ONEHEADER FILE=patab3
$TABLE ID AGE HT WT SECR
NOPRINT ONEHEADER FILE=cotab3
$TABLE ID SEX RACE SMOK CON1 CON2 CON3
NOPRINT ONEHEADER FILE=catab3
Figure6: An exampleof a runsummary.
23
7.2 The run record
A runrecordis basicallya tablewith summaryinformationaboutmany runs.It canbethoughtof asanindex to therunrecords.TheinformationthatXposeincludesin a runrecordis the run number, usersuppliedcomments,NONMEM terminationmessagesandobjectivefunctionvalues.
Therun numbersto beprocessedaresuppliedby theuser. Only continuousrangesofrunnumbersarepermitted(seebelow).
Theusersuppliedcommentsaretaken from themodelfiles. Thecommentcharacterin NM-TRAN modelfiles is thesemi-colon(; ). Usersuppliedcommentsfor therunrecordshouldbeprecededwith two semi-colons(;; —). In otherwords,all text aftertwo semi-colonspresentin the modelfile will be extractedand includedin the runrecord.Below is anexample:
$PROBLEMExample of user supplied comments;; This line will be extracted into the run record.;; As will this.; This line will not be included.$INPUT ...$DATA .......
It is usuallyagoodideato includethecommentsjustbelow the$PROBLEMstatement.
Theterminationmessagesandobjectivefunctionvaluesaretakenfrom theNONMEMoutputfiles.
The run recordis not displayedon screen. Instead,it is either sentdirectly to theprinteror to a file suitablefor insertioninto a word processor. The latteroption is tomake it possiblefor theuserto influencetheformattingof therun record,for examplewith pagenumbers.The itemsin the run recordfile aretab-separated,i.e. whentherun recordfile hasbeenimportedinto the word processorthe userhasto selecttheimportedtext andusea convert text to tableoption(presentin for exampleMicrosoftWord). Furthermore,separatelineswith in oneentry in therun record,e.g. theusersuppliedcommentsin theexampleaboveor theNONMEM terminationmessages,areseparatedby two hashcharacters(##). Thismeansthattheprocessof importinga runrecordfile into awordprocessorcanbesummarizedin thefollowing four steps:
1. Constructtherun recordfile (seebelow)
2. Import thetableinto awordprocessor
3. Convert theimportedfile to a table
4. Replaceall ## entriesin thetableto manualline feedsor equivalent.
5. Formatthetable
Below is anexamplesession:
24
DOCUMENTATIONMENU
1: Return to previous menu2: Change run number3: Summarize run4: Construct a run recordSelection: 4Please type the range of run numbers to create a run record forRange starts at run number (1):1Range ends at run number:3Do you want to export the documentation to a file?n(y)n
8 Checkinga data set
In the Data checkout menu(option 3 on the Main menu)there is an option calledCheck a data set 12 With this optionit is possibleto plot the ID numbervs thevaluesof eachcolumnin anNM-TRAN datafile to find any grosserrors.
Theformatof thedatafile hassomerestrictions.Thefirst columnof thedatafile will beplottedvs thevaluesin theothercolumns.Hence,to plot the ID vs theothercolumnsit is necessaryto havetheID -numberin thefirst column.Whenthedatafile is readbyS-PLUSit will regarddatesandcolonseparatedclock timesascharacterdata,whichcannotbeplotted.To avoid theerrorthesecolumnswouldgiveriseto if Xposetriestoplot them,it is necessaryto give themthenameXXXX, i.e. whenXposehave readthedatafile it avoidsall columnswith thisname.A similarproblemwill ariseif NULLs intheNM-TRAN datafile arecodedby a dot (. ). SinceNULLs arelegitimatein someof thecolumnsof greatestinterest(e.g.AMTandDV) they arebetterreplacedby zeros.
When the user selects the Check a data set option from theDATA CHECKOUTMENUhe or sheis first promptedfor the nameof the datafile.This should,obviously, be thenameof anexisting NM-TRAN datafile, followedbyreturn. Xposethenpromptsfor the line numberat which the columnheadersare13.Thedefault is 2, whichis thesameasaNONMEM tablefile wouldhave. It is howeverperfectlyfine to specifyanotherline number, makingit possibleto keepcommentsre-garding,e.g. thecontentsof thedatafile, at the top of thefile. Below is anexamplesession:
DATACHECKOUTMENU
1: Return to the main menu2: Numerically summarize the covariates3: Histograms of the covariates4: Histograms of the covariates given categorical covariates5: Scatterplot matrix of covariates6: Check a data set7: DV vs the independent variable
12A similar optionwasavailablein Xposeversion1.1. In thenew versionof Xposetheseplotsaremuchfaster, unfortunatelywith thelossof some,but notvery informative, information.
13This is new comparedto Xpose1.1. It is no longernecessaryto follow theNONMEM tablefile conven-tion with thecolumnheaderson thesecondline.
25
Selection: 6Please type the name of the data file you want to checkdata4Please type the line number of the line in the data filecontaining the column headers (2):4
Thecontentsof thedatafile is plottedusingdotplots,onefor eachcolumn(Figure7).On they-axisaretheID numbersandonthex-axisaretheAGEvalues.Eachentryforeachvalueof ID will getonedoton thecorrespondingline.
1
2
3
4
5
6
7
8
9
10
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
30 40 50 60 70
AGE�
.ID
Data set checkout plot
Figure7: A dotplot of ID vs AGE .
9 Comparing models
To discriminatebetweenrival modelsis an importantaspectof model building. InNONMEM it is possibleto usethe differencein the objective function valueto dis-criminatebetweenhierarchical,or nested,models.In Xposeit is possibleto visualizethe differencesbetweentwo model fits. Option 7 in the MAIN MENUleadsto theMODELCOMPARISONMENU:
MODELCOMPARISONMENU
1: Return to previous menu
26
2: Basic model comparisons3: Additional model comparisons4: Delta PRED/IPRED/WRES/IWRES vs covariatesSelection:
Theplotsin thismenuplots,for example,WRESfromthecurrentdatabasevstheWRESfrom another, reference,database. Whenoneof the plot optionsis selectedXposepromptsthe userfor a referencerun number. Similarly to changingthe run numberin theDATABASEMANAGEMENTMENU, Xposechecksif thereis anexisting xposedatabasematchingtherun number. If thereis, theusergetstheoptionto usethedatabaseor to recreateit from thetablefiles. If no matchingdatabaseis foundtheuserisaskedif heor shewantsto createit from thetablefiles. Notethatthesamedatahastobeusedin thetwo runsto make thecomparisonmeaningful.
10 The GAM
Direct inclusionof covariaterelationshipsin themodelis oneof the importantbene-fits with populationanalysisover individual specificanalysis.Identifying therelevantcovariatescan,however, be a time consumingtask. To speedthis up, Mandemaandcoworkers[10, 11] suggestedtheuseof astepwisegeneralizedadditivemodelingpro-cedure(GAM14 ).
The GAM canbe run in Xposeby selectingthe COVARIATE MODELMENUin theMAIN MENU(option6) andthentheGAMMENU(option4):
GAMMENU
1: Return to previous menu2: Run a GAM3: Summarize GAMfit4: Plot GAMresults5: Akaike plot6: Studentized residuals for GAMfit7: Individual influence on GAMfit8: Individual influence on GAMterms9: Settings for the GAMSelection:
TheGAM in Xpose2.0offer agreatdealof flexibility 15 andanumberof GAM specificplotsthatarenotdirectlyavailablein S-PLUS.
14WeusethetermGAM to referto thestepwisegeneralizedadditive modelingprocedurein contrastto ageneralizedadditive model,which is justsinglemodelwithoutany variableselection.
15Therearea lot of differencesbetweentheway theGAM is implementedin Xpose1.1 andXpose2.0,someareuservisible while others,perhapsthe majorpart of thedifferences,arenot. It is still possibletorun theGAM in thesameway, thatis, usingthesameoptions,asin Xpose1.1,andtheGAM results,givenacertaindataset,shouldbeidentical.
27
10.1 Overview of the GAM
This sectioncontainsa brief descriptionof how theGAM works in S-PLUS.It is in-tendedto give enoughbackgroundfor the following sectionsthat describeshow theGAM canbeusedin Xpose.For a moreformaldescriptionof this featurein S-PLUS,pleasereferto theS-PLUSmanuals[5] or to thebookStatisticalmodelsin Sby Cham-bersand Hastie[12]. For a technicaldescriptionof generalizedadditive modelsingeneral,thebookby HastieandTibshirani[13] is agoodstart.
TheGAM canbeusedto find a subsetof theavailableexplanatoryvariables(i.e. thecovariates)thataremostusefulin explainingthevariability in thedependentvariable(e.g. CL). This is doneby trying combinationof modelsof theexplanatoryvariablesin a stepwisefashion.Themodeldiscriminationis madeby comparisonof theAkaikeinformationcriteria (AIC). Theorderin which themodelsaretried is definedby themodelscope. For eachexplanatoryvariable,a hierarchyof possiblemodelsis defined,for example:not included,a linearmodelanda non-linearmodel.Theinitial modelisusuallya combinationof thefirst modelsin thescopebut canbeany combinationofthe modelsdefinedin the scope.In eachstep,for eachcovariate,the modelsup anddown in thehierarchyaretried andthemodelthatdecreasestheAIC statisticthemostareretainedto thenext step.Thesearchis terminatedwhennomodelcandecreasetheAIC any further.
10.2 Running the GAM
To run theGAM in Xposewe needto have a currentdatabasein which we have co-variatesandat leastoneparameterdefined.Thedefaultscope,for eachcovariate,is notincluded,a linearregressionmodelanda naturalcubicsplinewith oneinternalbreak-point (e.g. ns(AGE,df=2) ). For categoricalcovariatesthe third modelis omitted.Whenimplementingacovariatemodelin NONMEM it is usuallyagoodideato centerthe covariatesto the middleof the data(this makesit easierto get reliableestimatesof theinterceptin a linearmodel).In Xpose,continuouscovariatesareby defaultcen-teredaroundthemediancovariatevalue. Selectingoption2 in theGAM MenustartstheGAM analysis.
Xposesavesthe GAM objectscreatedby the GAM analysis.This makesit possibleto go backandlook at theresultsof a GAM run at a later time. Theseobjects,Xposegamobjectsaregivennameswhichareacombinationof gam.xpose. , theparametername and the run number, for examplegam.xpose.CL.1 . If theGAM is requestedfor a run numberandparameterwithanalreadyexisting Xposegamobject,theuserhasto confirmthatheor shewantstodeletetheold object.
10.3 Displaying the resultsof a GAM run
Theresultsof a GAM runcanbedisplayedeithernumericallyor graphically.
28
1101
23
4
5
105
67
8
108
910110
12
1311314
15115 16 1711718
118
19
119
2021
121
22122
23
24124
25125
26
126
27
2829
12930
3132132
33 13334
35
36
37 3839 40
140
41
14142
142
43
14344
45145
46
146
47
48
49149
50150
51
151
52
53153 54154 55
155
56
57
58158
59159
60
61
16162
162
6316364
164
65
1
2
-10 0 10 20
Residuals
SEX
Example GAM results
1
101 23
4
5
1056
7
8
1089
10
110
12
13
113
14
15115
16
17
117
18
118
19
119
2021
121
22
122
23
24
124
25
12526126
2728
29
129
3031
32132
33
133
34
35
36
37
38
39
40
140
41
141
42
142
43
143 44
45
145
46
146
47
48
49
149
50
150
51
151
52
53
153
54
154
55
155
56
5758
158
59
159
60
61
161
62
162
63
163
64
164
65
-10
0
10
20
-30 -20 -10 0 10
HT
Resid
uals
Example GAM results
1
101
2
3
45
105
6
7
8
108 910
110
12
13
113
14
15115
16
17
117
18
118
19
119
20
21
121
22
122
23
24
124
25
125
26126
27
28
29
129
3031
32132
33
133
34
35
36
37
38
39
40
140
41
141
42
142
43
143
44
45
145
46
146
47
48
49
149
50
150
51
151
52
53
153
54
154
55
155
56
57
58
158
59
159
60
61
161
62
162
63
163
64
164
65
-10
0
10
20
-30 -20 -10 0 10
AGE
Resid
uals
Example GAM results
1
101
2
34
5105
6
7
8
108 910110
12
13
113
14
15115
16
17117
18
118
1911920
21121
22122
23
24
124
25125
26
126
27
28 29129
30
31
32132
33 133
34
35
36
37
38
39
40140
41
141
42 142
43143
44
45
145
46
14647
48
4914950150
51151
52
53153
54154 55155
56 57
58
158
59159
60
61161
62
162
63163 64
16465
0
1
-10 0 10 20
Residuals
HCTZ
Example GAM results
Figure8: Plotof GAM results.Eachpanelshows therelationshipbetweena covariateandCL . Datapointsarelabeledby theID number
10.3.1 Numerical summary of a GAM fit
Selectingoption3 ontheGAMMENUproducesanumericalsummaryof anXposegamobjectin thecurrentS-PLUSworking directory. This summaryconsistsof theoutputfrom the S-PLUSfunction gam.summary plus informationaboutthe XposeGAMsettingsused(seeSection10.5)to producetheXposegamobject.
10.3.2 Displaying the GAM resultsgraphically
It is alsopossibleto view theGAM resultsgraphically. Option4 on theGAMMENU,will produceplotsof thefinalGAM model(Figure8). Individualdatapointsarelabeledwith theid variableandasmoothis drawn in theplotswith continuouscovariates.Thefitted valuesof relationshipswith categoricalcovariatesareindicatedby a solid blackdotandaverticalblueline.
10.4 Obtaining diagnosticsfor a GAM run
Usingstepwiseprocedures,suchastheGAM, is notwithoutcriticism(e.g.G.E.PBox,Technometrics(8) 1966625-629)andevenif theGAM hasprovento beusefulfor the
29
SEX + ns(HT, df = 2) + ns(AGE, df = 2) + HCTZ
SEX + ns(HT, df = 2) + ns(AGE, df = 2) + SMOK + HCTZ
SEX + ns(HT, df = 2) + ns(AGE, df = 2) + HCTZ + WT
ns(HT, df = 2) + ns(AGE, df = 2) + HCTZ
SEX + HT + ns(AGE, df = 2) + HCTZ
SEX + ns(HT, df = 2) + AGE + HCTZ
SEX + ns(HT, df = 2) + AGE + SMOK + HCTZ
ns(HT, df = 2) + AGE + HCTZ
SEX + ns(HT, df = 2) + AGE + HCTZ + WT
SEX + ns(HT, df = 2) + AGE + SMOK
SEX + ns(HT, df = 2) + AGE
SEX + HT + AGE + HCTZ
SEX + ns(HT, df = 2) + HCTZ
SEX + HT + AGE + SMOK
SEX + ns(HT, df = 2) + ns(AGE, df = 2)
SEX + HT + AGE
SEX + ns(HT, df = 2)
ns(HT, df = 2) + AGE
SEX + ns(HT, df = 2) + AGE + WT
SEX + HT + AGE + WT
SEX + HT + ns(AGE, df = 2)
SEX + HT + SMOK
SEX + HT + HCTZ
SEX + HT
ns(HT, df = 2)
HT + AGE
HT + HCTZ
HT + SMOK
SEX + HT + WT
HT
4700 4800 4900 5000
Akaike value�
Mod
elAkaike plot�
Figure9: Akaike plot of a GAM run. On the y-axis are the most importantmodelstestedby theGAM andonthex-axisarethecorrespondingAIC values.
identificationof importantcovariates,oneshouldbecarefulin theinterpretationof theresults.Oneof the thingsthatcaninfluencetheGAM resultsareoutlying individualdatapoints. Thesecanbothhideandcreatecovariaterelationships[14]. Xposehasanumberof optionsto checkthe final modelfor unduly influential individuals. Theseoptionsusesquite involvedcalculationsof approximatecase-deletiondiagnosticsforthefinal GAM model[15]. A detaileddescriptionof thewaytheseareimplementedinXposeis givenin JonssonandKarlsson[14] andin theAppendix(to bewritten).
For options6, 7 and8 it is necessaryto useparametricsmoothers,i.e. ns , bs andthehockey stickmodels16 (seeSection10.6).
10.4.1 The Akaik eplot
Option5 on theGAM menuwill producea dot plot of the30 mostimportantmodelstried in thestepwisesearchfor thefinal model(Figure9). Onthey-axisarethemodels(usingS-PLUSsyntax)andon thex-axisarethecorrespondingAIC value. With thisplot it is possibleto evaluatehow muchbetterthefinal modelis comparedto theothermodels.It canbeviewedasa measureof theuncertaintyin thefinal GAM model.
16Using the parametricmodelswill make the generalizedadditive modelsto generalizedlinear models,andthecalculationsinvolvedin option6, 7 and8, only holdsfor generalizedlinearmodels.
30
3625260
119354739
11515
2145
432724
1624244
1212056301828
153117142
193121
150101
6545
124159146
753
4143
54125
12129113
2646
11037502334
1265
51132
41155
32595838331629
15418
16455
1056
5714
149163
1710
108140
4836
122141
49161
63118
9133158
222540646113
151
-2 -1 0 1 2�
Studentized residual
IDStudentized residuals for a GAM run
Figure10: Studentizedresidualsfor a GAM run. On they-axisaretheID numberandonthex-axisaretheresiduals.TheID numbersareorderedaccordingto thesizeof theresiduals.
10.4.2 Studentizedresidualsfor a GAM fit
Option 6 on the GAM Menu will producea dot plot of the studentizedresidualsforeachindividual. A studentized,or jacknifed,residuals,is theresidualbetweenan in-dividual datapoint (parameter)andthe predictionfrom a fit with that particulardatapointomitted.Theseresidualsarenormalizedsothatthevarianceshouldbe1 andthemeanzero,makingthemsuitablefor thedetectionof influentialpoints.
An exampleof this plot is given in Figure 10. On the x-axis are the valuesof thestudentizedresidualsandonthey-axisaretheID numbersasgivenby the id-variable.TheID numbersareorderedaccordingto thesizeof theresidual.
10.4.3 Indi vidual influenceon the GAM fit
Option 7 will producea plot (Figure11) of the Cooksdistancevs leverage[14, 15].Cooksdistanceis ameasureof theinfluenceacertaindatapointhas,i.e. how muchthefit will changeif that datapoint is omittedfrom theanalysis.A high valueindicatesa high influence. The leverageis a measureof how a datapoint influencesthe cer-tainty with which thefit is obtained.A high valueindicatesa higherleverage.A highleveragedatapointneednotbeinfluentialandviceversa.A pointwith ahighvalueof
31
1101
2
3
45
105
6
7
8
108
9
10
11012
13
11314
15115
16
17117
18
118
19
119
20
21121
22
122
23
24
124
25
125 26126
27
2829129
30
31
3213233
133
34
35
36
3738
39
40
140
41
14142
142
43
143
44
45
145
46146
47
4849
149
50
150
51
151
52
53
153
5415455155
56
57
58
158
59159
60
61
161
62
162
63
163
64
164650.0
0.02
0.04
0.06
0.08
0.10
0.12
0.14
0.1 0.2 0.3 0.4 0.5
Leverage (h/(1-h))
Cook
s dist
ance
Individual influence on a GAM fit
Figure11: Cooksdistancevs leveragefor aGAM fit. Datapointsarelabeledby theIDnumber
Cooksdistanceandleverageis very importantto thefit andoftenaffectsthecovariateselection[14]. Thispropertyalsoseemto “travel” into theNONMEM model,i.e. if anindividual point hasa high leverageandinfluencein theGAM heor shewill alsobeimportantfor thecovariatemodelin NONMEM [14].
10.4.4 Indi vidual influenceon the GAM terms
Option8 will produceplotsof theCooksdistancefor eachcovariatetermin thefinalGAM model. This plot canbe usedto identify which of the covariatetermsthat areaffectedby theindividualsdetectedwith option7.
10.5 Changingthe default behavior of the GAM
Option9 leadsto theGAMSETTINGS MENU:
GAMSETTINGS MENU
1: Return to previous menu2: List current settings3: Estimate dispersion factor (on/off)
32
4: Stepwise search for covariates (on/off)5: Use all lines (on/off)6: Use weights (on/off)7: Exclude individuals8: Specify starting model9: Normalize to median (on/off)10: Set model scope
Theoptionsin thismenucanbeusedto changetheway theGAM analysisbehaves.
10.5.1 List curr ent GAM settings
Option 2 will list the currentGAM settings. Thesesettingswill be effectual untilchangedin the GAMSETTINGS MENUor until S-PLUSis quit. The settingswithwhich a certain Xpose GAM object was obtained can be listed using theSUMMARIZEA GAMRUNoptionin theGAMMENU(seeSection10.3).
10.5.2 Estimation of the dispersionfactor
This option is a toggle,e.g. if thesettingis not to estimatethedispersionfactor(thedefault) andtheoptionis selectedfrom themenu,theGAM will estimateit, andviceversa.
The dispersionfactor is usedin the definition of the AIC andis equalto the “cost”,dividedby 2, perdegreeof freedomincurredby addingor droppinga term from theGAM model. If this option is not set, the dispersionfactoris taken to be the scaledPearsonchi-squaredstatisticfor thestartingmodel17 (seeSection10.5.7).If theoptionis set,thedispersionfactoris estimatedaccordingto the following scheme:For eachcovariate,the modelswithout covariate,a linear function of the covariateand non-linearfunctionof thecovariate(anaturalcubicsplinewith oneinternalbreakpoint)arecompared.Thedispersionfactoris takenfrom thegamfit of themodelresultingfromthecombinationof thesignificantmodelsaccordingto thisunivariateanalysis.
10.5.3 Stepwisesearch for covariates
Option4 will turn thestepwisesearchon or of (a toggle). Thestepwisesearchis onby default. Whenit is off, theGAM will fit themodelspecifiedby thestartingmodel.This canbeusefulwhena covariatemodelhave beenimplementedin NONMEM andtheuserwantstoobtainapproximatecasedeletionstatisticsfor thatparticularcovariatemodel.Option8 (Specifystartingmodel)canbeusedto specifythemodelto befit.
10.5.4 Useall lines
Option5 will toggletheuseof only oneline per individual on or off. If thesettingisoff, all linesin theXposedatabasewill beused.
17In Xpose1.1thedispersionparameterwasnot estimated.
33
10.5.5 Useweights
Option6 will turn theuseof prior weightsonor off. If thesettingis on,Xposewill usetheXposedatavariableweight(seeSection2.5)asprior weightsin theGAM analysis.
10.5.6 Exclude individuals fr om the GAM analysis
If a certainindividual is identifiedashaving a largeinfluenceandleverageon thefinalGAM model, it is interestingto know whetherthe exclusion of the individual willchangethe covariateselectionby the GAM. Option 7 canbe usedto excludeoneormoreindividualsfrom theGAM analysis.Selectingtheoptionwill prompttheuserforthe ID numbersof theindividualsto beexcluded.The ID number(s)to specifyis thenumberappearingin theplotsof theGAM results.
10.5.7 Specifyinga starting model for the GAM
Thestartingmodelfor theGAM is themodelfromwhichthestepwisesearchis started.It is alsousedto definethe dispersionfactorusedin the search.Selectingoption 8will prompttheuserfor the individual modeltermsthatmakesup thestartingmodel.Thesetermsshouldbe the termsusedby S-PLUSto specifythe startingmodel. Forexample,if we want to usea linear relationshipwith AGEanda non-linearrelation-shipwith HT asthestartingmodel,thecorrespondingcommandin S-PLUSwould begam(CL˜AGE+ns(HT,df=2)) (assumingwe wantedthenaturalcubicsplinewithoneinternalbreakpoint asthe non-linearmodel). The termsthat shouldbegiven toXposewhenspecifyingthismodelareconsequentlyAGEandns(HT,df=2) .
Thisoptioncanalsobeusedto specifythemodelto fit if thestepwisesearchis turnedoff (seeSection10.5.3).
10.5.8 Normalize the covariatesto the median
Option9 will turn thenormalizationof continuouscovariatesto themedianon or off.Mediannormalizationis the default, i.e. the parameterestimatesof the final GAMmodel (seeSection10.3) can be usedas initial estimatesin the NONMEM model.Note that the mediannormalizationis doneafter individualswith missingcovariatevaluesareomitted(seeSection6.5.1).
10.6 Setting the model scopefor the GAM
For a continuouscovariate,the default modelscopein the GAM is: not included,alinearmodelandanaturalcubicsplinewith oneinternalbreakpoint18. For acategoricalcovariate,thethird, non-linear, modelis omitted.
Option 11 leadsto yet anothermenuin which the usercan alter the default modelscope.
18In Xpose1.1this is theonly availablemodelscope
34
ns(AGE,df=2)
Model 3 Model 4Model 2Model 1
AGE1Not used in
the default scope
Figure12: Pictureof thedefault scopefor a continuouscovariate
GAMSCOPEMENU
1: Return to previous menu2: Set maximum number of models3: Default smoothers and their arguments4: Limit model scope for specific covariates5: Change models for specific covariates6: Show current GAMscope7: Use default scopeSelection:
10.6.1 The GAM scopeand its default settingsin Xpose
The scopefor a given covariateis a hierarchyof modelsthat the GAM will test. Itis hierarchicin the sensethat the GAM will never look beyondonestepup or downfrom the currentmodel, i.e. for the default scope,which is depictedin Figure 12,theGAM will never try thenon-linearmodel(Model 3) unlessit first finds the linearmodel(Model 2). Thescopeshown in Figure12 is thedefault scopefor a continuouscovariate.For a categoricalcovariateit is not possibleto fit a non-linearmodel. A 1,asin Model 1 in Figure12, is usedto denotethat thecovariateis not includedin theGAM model,thenameof thecovariate,AGEin Figure12,denotea linearrelationshipandns(AGE,df=2) denotesanaturalcubicsplinewith oneinternalbreakpoint.Thelatteris anexampleof thenon-linearmodelsthatcanbedefinedin theGAM scope.
In Xposeit is possibleto alterthescopein oneor moreways:increasingor decreasingthe numberof levels in the modelhierarchy, changethe default smoother(i.e. ns ),excludeoneor moreof thecovariatesfrom oneor morelevelsin themodelhierarchyandchangethemodelusedfor aspecificcovariatein aspecificlevel of thehierarchy.
Notethatchangingthemodelscopecanalterthefinal GAM modelsubstantiallyanditis consequentlypossibleto “find” thecovariatemodelonewishesto find. An exampleof therationalfor giving theusertheincreasedflexibility is asfollows: Assumethatwehave a non-linearrelationshipbetweena parameteranda covariateandthat the(sub-stantial)curvatureis off center(seeSection10.6.2),i.e. theparametricsmootherswillhaveproblemsdetectingit. Ignoringthis fact,i.e. beingcontentwith a linearrelation-ship,is likely leadto structuresin theunexplainedvariability thatmight induceother,false,relationships,or, andworse,mighthidetruerelationships.In thiscaseit is desir-ableto beableto specifyanother, moreflexible, non-linearfunctionthatcapturesthenon-linearitywithout beingboundto thebreakpointsof theparametricsmoothers.Asimilar, but opposite,situationmightoccurif wehaveoutlyingindividualsthatinducesa non-linearrelationshipevenif thetruerelationshipis clearlylinear.
35
10.6.2 Smoothersin the GAM scope
Theideawith generalizedadditivemodelscomparedto generalizedlinearmodelsis tobe ableto usenon-parametricsmoothers,e.g. splinefunctionsthat usethe observeddatafor its definition. The default smoother, or non-linearrelationship,in the GAMasimplementedin Xpose,is a naturalcubic splinewith oneinternalbreakpoint,i.e.a parametricsmoother. Consequently, theway that theGAM in Xposeis definedbydefault makesit a generalizedlinearmodel(parametricsmoothersis a requisiteof theapproximatecase-deletiondiagnosticsavailable,seeSection10.4).
Therearebasicallyfivedifferentsmoothersthatcanbespecified.Two arenon-parametric,i.e. lo ands , andthreeareparametric(at leastin thesensethat it is possibleto pro-ducetheGAM diagnosticsif they areused),i.e. ns , bs andthehockey stick models(seeSection10.6.6). The last smoothercan only be specifiedusing one of the op-tions in theGAMSCOPEMENUandwill bedescribedtogetherwith thatoption (seeSection10.6.6).TheothersmoothersareregularS-PLUSsmoothersandtheS-PLUSmanualsgivesfurtherdetails.
Theparametricsmoothershaveto havethebreakpointsspecified,i.e. wherethecentersof the curvatureare. The default valueof thesebreakpointsarethe median(for onebreakpoint)andthequartiles(for morethanonebreakpoint).If thecurvatureis muchoff centerit might be a good idea to try oneof the non-parametricsmoothersor tospecifyotherbreakpoints
10.6.3 Settingthe maximum number of modelsin the scope
Thedefaultnumberof modelsfor acontinuouscovariateis 3. It is possibleto decreasethis to 2 or to increaseit to 4. Whenthe maximumnumberof modelsis changed,the default scopeis also changed. Reducingthe numberof modelsto 2 omits thenon-linearmodel(Figure12),increasingthenumberof modelsto 4 addsasecondnon-linearmodel(a naturalcubicsplinewith two internalbreakpoints,i.e. a moreflexiblemodelthanonebreakpointspline). Changingthe maximumnumberof modelsdoesnotaffect thescopeof thecategoricalcovariates.
10.6.4 Changingthe default smoothersand their arguments
It is possibleto changethedefault smoothersfor eachhierarchiclevel of thescopeforthecontinuouscovariates.Xposewill asktheuserfor a smootherto usefor eachlevelin scope.Thefollowing Xposeoutputchangesthedefaultsmootherfor thethird modelfrom ns to s :
GAMSCOPEMENU
1: Return to previous menu2: Set maximum number of models3: Default smoothers and their arguments4: Limit model scope for specific covariates5: Change models for specific covariates6: Show current GAMscope7: Use default scope
36
Selection: 3
Type the name of the smoother you want to use for the *first*covariate model in the GAMmodel scope (q=exit,d=default,0=no covariate effect,1=linear model):
Type the name of the smoother you want to use for the *second*covariate model in the GAMmodel scope (q=exit,d=default,0=no covariate effect,1=linear model):
Type the name of the smoother you want to use for the *third*covariate model in the GAMmodel scope (q=exit,d=default,0=no covariate effect,1=linear model):s
Type any *argument* to the smoother for the third covariate modelin the GAMmodel scope (multiple argument should be commaseparated, q to exit, d =default):
(Notethatthewordsmootherrefersto any modelof: not included,linearaswell astheactualsmoothers.)
For eachmodelthe usercanspecifythe default (d), no covariateeffect (0), a linearmodel(1) andthenameof thesmoother(e.g.s ). If a smootheris specifiedtheuserisalsopromptedfor theargumentsto thesmoother. For thes in theabove examplethedefaultargumentwasaccepted(df=4 ) but couldhavebeensetto, for example,df=3 .
10.6.5 Limit the scopefor specificcovariates
With option4 it is possibleto limit thescope,or in otherwords,excludea continuouscovariatefrom a specificscopelevel. For example,if we have two continuouscovari-atesAGEandWTandwe want to skip the linearmodelfor AGEin thedefault scope,theXposesessionmight look somethinglike this:
GAMSCOPEMENU
1: Return to previous menu2: Set maximum number of models3: Global smoothers and their arguments4: Limit model scope for specific covariates5: Change models for specific covariates6: Show current GAMscope7: Use default scopeSelection: 4The following continuous covariates are defined in thecurrent data baseAGE WT
Type the names of the covariates you want to excludefrom the first model in the model scope and finish with an emptyline:1:
37
Type the names of the covariates you want to excludefrom the second model in the model scope and finish with an emptyline:1: AGE2:
Type the names of the covariates you want to excludefrom the third model in the model scope and finish with an emptyline:1:
For eachlevel in themodelscopetheuseris askedfor thename(s)of thecovariate(s)that is to beomitted. Answeringwith a return(andno covariatename)will leave thescopeunaffectedfor all covariates.
10.6.6 Changemodelsfor specificcovariates
With option5 it is possibleto specifythescopemodelsfor individual continuousco-variates.Below is anexamplesessionfor thecovariateAGEandWT:
GAMSCOPEMENU
1: Return to previous menu2: Set maximum number of models3: Global smoothers and their arguments4: Limit model scope for specific covariates5: Change models for specific covariates6: Show current GAMscope7: Use default scopeSelection: 5The following continuous covariates are defined in thecurrent data baseAGE WTType the name of the covariate you want to specify a specialmodel scope for (q to quit): AGE
Type the name of the *smoother* you want to use in the first modelof the model scope (q to exit, d=default, 0= excludeand 1= linear model, R=hockey stick right, L= hockey stick left):
Type the name of the *smoother* you want to use for the secondmodel in the model scope (q to exit,d=default,0=exclude,1=linear model, R=hockey stick right, L= hockey stick left):
Type the name of the *smoother* you want to use for the thirdmodel in the model scope (q to exit,d=default,0=exclude,1=linear model, R=hockey stick right, L= hockey stick left):R
(New covariate)
Type the name of the covariate you want to specify a specialmodel scope for (q to quit): q
38
Thefirst theuserhasto do is to typethenameof thefirst covariateto specifymodelsfor, in theabovecaseAGE. Xposethenstepsthroughthelevelsof themodelscopeandpromptsthe userfor a model. The optionsarethe default (d), exclude(0), a linearmodel(1), hockey stick right or left (RandL respectively) or thenameof a smoother.In thelattercasetheuserwill bepromptedfor any argumentssimilar to option3 ontheGAMSCOPEMENU(seeSection10.6.4).Thehockey stick right andleft modelsfits aslopeto theleft or right handsideof themediancovariatevalue,leaving thefit on theoppositesidehorizontalat the“intercept” in themiddleof thedata.Thesemodelsarefairly inexpensive non-linearalternativesto thenormalsmoothersandcansometimescapturenon-linearitiesthatwouldotherwisenotbedetected.
WhenXposehassteppedthroughthe scopelevels for the first covariatethe userispromptedfor a new covariatename.Theanswershouldbeeitherthenameof thenextcovariateor q if nomorecovariatespecificadjustmentsof thescopeis needed.
10.6.7 Viewing the curr ent GAM scope
Option6 printsthecurrentGAM scopewith, if any, adjustmentsmadeby theuser.
10.6.8 Re-settingthe scope
Option7 resetsthescopeto thedefault.
11 The bootstrap of the GAM
11.1 Overview the Bootstrap of the GAM
The bootstrapis a proven statisticalmethodology, which reliesmoreon brutecom-putationalpower ratherthanclever analyticalsolutions. A goodstartingpoint to thebootstrapliteratureis Introductionto theBootstrapby EfronandTibshirani[16].
Thebootstrapof theGAM canbeusedto assesstheimportanceof thecovariatesandto obtainsimilar diagnosticsinformationto thatavailableto theGAM, thedifferencebeingthat the informationis obtainedfor all covariates,not only for the“significant”ones. In addition, the bootstrapof the GAM can also give information abouthowcovariatesinteractwith respectto inclusion/exclusionfrom thecovariatemodel.
In the bootstrapof the GAM, the GAM is run a large numberof timeson bootstraprealizationsof theoriginal dataset. (Thetermdatasetwill in this sectionbeusedtoreferto thedatasetof individualparametervaluesandcovariates,usuallyoneentry, orline, per individual.) Thebootstrappeddatasetsareconstructedby randomlysamplewith replacementfrom theoriginaldatasetandtheGAM is thenrun oneachof them.Thebasicresultsfrom a bootstrapof theGAM run is obtainedsimplyby countingthenumberof timeseachof the covariatesareselected,regardlessof functionalform ofthe relationship,aswell ashow many timesthecontinuouscovariatesareselectedasa non-linearfunction19. This “numberof times” arepresentedasrelative frequencies,
19Xpose1.1did notdifferentiatebetweenlinearandnon-linearmodels.
39
i.e. thenumberof timesacertaincovariateor covariatemodelwasselecteddividedbythenumberof bootstrappeddatasets.
Thebootstrapof theGAM, asit is implementedin Xpose,hasbeendescribedby Jon-ssonandKarlsson[14] andsimilarusesof thebootstraphavebeendescribedby MickandRatain[17], SaubereiandSchumacher[18] andEtte[19].
To usethebootstrapof theGAM selectoption6 in theMAIN MENUandthenoption5from the COVARIATE MODELMENU. This will lead to theBOOTSTRAPOF THE GAMMENU:
BOOTSTRAPOF THE GAMMENU
1: Return to the previous menu2: Run the bootstrap of the GAM3: Summarize a bootstrap of the GAMrun4: Plot menu5: Settings for the bootstrap of the GAMSelection:
11.2 Running the bootstrap of the GAM
It is necessaryto havecovariatesandat leastoneparameterdefinedin thecurrentdatabaseto run thebootstrapof theGAM.
Optiontwo will run thebootstrapof theGAM. Xposesavesthebootstrapof theGAMrunin Xposebootgamobjects. They arenamedbyacombinationof bootgam.xpose. ,theparameternameandtherun number, e.g. bootgam.xpose.CL.1 . If thereal-readyis anXposebootgamobjectin thecurrentdirectorythatmatchestheparameternameandrunnumber, theuserhasto confirmthatheor shewantsto overwriteit.
Oneof theimportantquestionsin thebootstrapof theGAM is thenumberof bootstrapiterationsto make, i.e. the numberof bootstrappeddatasetsto use. In Xpose2.0therearebasicallythreedifferentstrategies. Thefirst is to usea fixed,userspecified,numberof iterations20. Theothertwo strategiesusesalgorithmsto determinewhentheinclusionfrequenciesof thecovariatesarestable,i.e. Xposedeterminesthenumberofiterationsneeded.Section11.5describesthesealgorithmsin moredetailandalsogivesinformationabouthow to adjusttheirbehavior.
11.3 Summarizing a bootstrap of the GAM run
Bootstrapof theGAM runscanbeviewedbothnumericallyandgraphically. Option3on theBOOTSTRAPOF THE GAMMENUwill producea numericalsummaryof therun:
Convergence algorithm: Fluctuation ratio
Convergence criteria: 1.0368 (target= 1.04 )Number of iterations: 150
20In Xpose1.1theuseralwayshadto specifythenumberof iterationsto use.
40
Initial dispersion not estimated.No start model specified.Median normalization on.Seed number: 286
Model size:Min. 1st Qu. Median Mean 3rd Qu. Max.
1 3 3 3.18 4 5
Prob Nonlin FrNonlinHT 0.747 0.44 0.589
AGE 0.613 0.247 0.402HCTZ 0.6 0 0
SEX 0.493 0 0SMOK0.42 0 0
WT 0.307 0.12 0.391
Thefirst half or soof theoutputcontainsinformationaboutthesettingsusedin therunandthesecondhalf containsresults.First is descriptive statisticson thesizeof all thefinalGAM models.Below thatcomesatablewith threecolumnscontainingtherelativeinclusionfrequenciesof thecovariatesandthenon-linearcovariatemodels,followedby the latter divided by the former, i.e. a measureof how importantthe non-linearrelationshipwascomparedto thetotal numberof timesthecovariatewasfound.
11.4 Plotting the results
Theresultsfrom abootstrapof theGAM runcanalsobeviewedgraphically. Selectingoption 4 on the BOOTSTRAPOF THE GAMMENU leads to thePLOT MENUFOR THE BOOTSTRAPOF THE GAM:
PLOT MENUFOR THE BOOTSTRAPOF THE GAM
1: Return to previous menu2: Inclusion frequencies3: Most common covariate combinations4: Distribution of model size5: Inclusion stability - covariates6: Inclusion stability - non-linear7: Inclusion index of covariates8: Inclusion index of non-linear models9: Inclusion index of covariates/individuals10: Inclusion index of non-linear models/individualsSelection:
11.4.1 Plotting the inclusion fr equencies
Option2 will producea plot of therelative inclusionfrequenciesof thecovariatesandthenon-linearcovariaterelationships(Figure13).
Therelative inclusionfrequenciesplottedin thetop panelin Figure13 wereobtainedby countingthenumberof timeseachcovariatewasselectedby theGAM regardless
41
WT�
AGE�
HT
0.0 0.2 0.4 0.6 0.8 1.0
Inclusion frequency
Relative inclusion frequencies of non-linear models
WT�SMOK
SEX
HCTZ
AGE� HT
0.0 0.2 0.4 0.6 0.8 1.0
Inclusion frequency
Relative inclusion frequencies of covariates
Figure13: Barplot of the relative inclusionfrequenciesof the covariates(top panel)andthenon-linearcovariatemodels(lowerpanel).
of thefunctionalformof thecovariaterelationship. Therelative inclusionfrequenciesplottedin thelowerpanelwereobtainedbycountingthenumberof timeseachcovariatewasincludedin theGAM modelasa non-linearrelationship.
11.4.2 Plotting the mostcommoncovariate combinations
Option3 will plot themostcommonone,two, threeandfour covariatecombinationsin theGAM model(Figure14).
The frequenciesfor thesecombinationsareobtainedby countinghow many timesacertaincombinationof covariatesare found in the sameGAM model. The top leftpanelin Figure14 is thesameasthetoppanelin Figure13andtheotherpanelsshowsthefour mostcommoncombinationsof two, threeandfour covariatesrespectively.
11.4.3 The distribution of modelsizes
Anotherpieceof informationwecangetfrom abootstrapof theGAM runarethesizes,i.e. thenumberof covariates,of thefinal GAM modelsfor thebootstrappeddatasets.Option4 will displaythisdistribution(Figure15).
Theideahereis to learnsomethingaboutthetypicalnumberof covariatesnecessaryto
42
HT+AGE+SMOK+WT
SEX+HT+AGE+HCTZ
HT+AGE+HCTZ+WT
SEX+AGE+HCTZ+WT
0.0 0.4 0.8
Frequency
Most common fourcovariate combinations�
HT+AGE+SMOK
AGE+HCTZ+WT
SEX+AGE+HCTZ
HT+AGE+HCTZ
0.0 0.2 0.4 0.6 0.8 1.0
Frequency
Most common three covariate combinations
HT+SMOK
AGE+HCTZ
HT+HCTZ
HT+AGE
0.0 0.2 0.4 0.6 0.8 1.0
Frequency
Most common two covariate combinations
WT
SMOK
SEX
HCTZ
AGE
HT
0.0 0.2 0.4 0.6 0.8 1.0
Frequency
Covariate inclusion frequency
Most common covariate combinations
Figure14: Themostcommonone,two, threeandfour covariatecombinations.
explain thevariability in theparameter.
11.4.4 The stability of the inclusion fr equencies
Oneof theimportantquestionswhenrunningthebootstrapof theGAM is thenumberof bootstrappeddatasetto use(how to setthenumberof datasetsto useis describedin Section11.5). To assesswhetherthenumberof datasetsusedin a particularrun isenoughto make theinclusionprobabilitiesstable,option5 and6 canbeused.Option5 will plot the relative inclusion frequenciesvs the bootstrapiterations(Figure 16)andoption6 will make thethesameplot for theinclusionfrequency of thenon-linearcovariatemodels.In theseplotsthepanelsareorderedfrom lower left to thetop rightaccordingto thetotal inclusionfrequency of eachcovariate.
11.4.5 Inclusion index of covariates
Oneinterestingpossibilitywith the bootstrapof theGAM is to investigatethe inclu-sion/exclusioninteractionsbetweenthecovariates,i.e. will theinclusionof onecovari-atein theGAM modelleadto theinclusionor exclusionof anothercovariate.Option7 producesaplot of the inclusionindex of covariates(Figure17).
Theinclusionindex is givenEq.1.
43
0
10
20�
30�
40�
50
60
1 2�
3�
4�
5
Number of covariates in model
Perc
ent o
f Tot
alFinal model size distribution
Figure15: Thedistributionof thesizesof thefinal GAM models.
��� �"!$#&%('*),+ !.-0/"!0�213-4!.5067!.� 8:9<;=!.#?>@!08:AB!0�C13-4!.5067!.� 8:9!.#D>@!08:AB!0�E13-4!05067!$� 8,9 (1)
wherethe expectedfrequencyis the productof the relative inclusion frequenciesofthetwo covariatesin questionandtheobservedfrequencyis theobservedrelative fre-quency of thesetwo covariatesappearingin thefinal GAM modelstogether. The in-clusionindex will haveavaluegreaterthanoneif thecombinationof thetwo covariatearemorecommonthanexpectedandviceversa.
In Figure17eachpanelrefersto onecovariate.Theinclusionindicesof thatcovariatetogetherwith theothercovariatesareindicatedby +-es. Thepanelsareorderedfrombottomleft to top right accordingto thetotal inclusionfrequency for eachcovariateasarethecovariateson they-axes.
Option 8 will producea similar plot for the inclusion index of non-linearcovariatemodelsandthecovariates(Figure18).
11.4.6 Inclusion index of covariatesand individuals
Similar to the inclusionindex of covariateswe cancalculatethe inclusionindex forcovariates/non-linearcovariatemodelsandindividuals.Thecalculationof theindex isdonein thesamewayasfor thecovariateindex (Eq.1). Option9 will produceplotsof
44
0.0
0.2
0.4
0.6
0.8
1.0WT
0 20 40 60 80 100 120 140
SMOK
SEX
0.0
0.2
0.4
0.6
0.8
1.0HCTZ
0.0
0.2
0.4
0.6
0.8
1.0AGE HT
0 20 40 60 80 100 120 140
Number of bootstrap iterations
Freq
uenc
y of in
clusio
nStability of the relative inclusion frequencies
Figure16: Stability of relative inclusionfrequencies.Eachpanelshows theinclusionfrequency vs thenumberof iterationsfor thecovariateindicatedin thestripaboveit.
theinclusionindex of individualsandcovariates.
In analogywith the other inclusion indiceswe can also plot the inclusion index ofindividualsandnon-linearcovariatemodels.
11.5 Settingsfor the bootstrap of the GAM
Therearea numberof waysthat the usercaninfluencethe way the bootstrapof theGAM behaves.This is donein theBOOTSTRAPOF THE GAMSETTINGS MENU,whichcanbereachedbyselectingoption5 ontheBOOTSTRAPOF THE GAMMENU:
BOOTSTRAPOF THE GAMSETTINGS MENU
1: Return to previous menu2: List current settings3: Estimate dispersion factor (on/off)4: Specify starting model5: Exclude individuals6: Normalize to median (on/off)7: Set maximum number of bootstrap iterations8: Change convergence algorithm9: Specify iteration to start check convergence at
45
+
+
+
+
+
WT
SMOK
SEX
HCTZ
AGE
HT
WT
-0.2 0.0 0.2 0.4
+
+
+
+
+
SMOK
+
+
+
+
+WT
SMOK
SEX
HCTZ
AGE
HT
SEX
+
+
+
+
+
HCTZ
+
+
+
+
+WT
SMOK
SEX
HCTZ
AGE
HT
AGE
+
+
+
+
+
HT
-0.2 0.0 0.2 0.4
Index
Inclusion index of covariates
Figure17: Theinclusionindex (seetext) of covariates.Eachpanelreferto onecovari-ateasindicatedin thestripabove it. Thedashedhorizontalline indicatestheexpectedfrequency
10: Specify at what interval to check the convergence11: Set seed number12: Use weightsSelection:
Option2 is usedto list thecurrentsettings.Options3-7 and13 areequivalentto thesamesettingsin theGAMSETTINGS MENU(seeSection10.5)while options8-12arespecificto thebootstrapof theGAM. Thesettingsarein effectuntil S-PLUSis quit.
11.5.1 Specifyingthe maximum number of iterations
Option8 setsthemaximumnumberof bootstrapiterationto thevaluespecifiedby theuser. Thedefault is 150.Notethatthis is not thesameasspecifyinga fixednumberofiterations,seeSection11.5.3.
11.5.2 Selectingthe convergencealgorithm
Option 9 is usedto changeconvergencealgorithmand convergencecriteria. Thesealgorithmsareusedby Xposeto determinewhenthenumberof iterationsareenough
46
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
oo
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
1135F18
29145
41144839
13322
15061
154105140
3363
1102G43
17164
5453
946
10827
119282642
7H129
201230
1513740
12462
110
12519
142162117
654I64
14151131555
1593J115
1534536
1325850
1462521594938
101122
322331
143163
47121149
5256
835
126161
4416
6158155
57246034
118
-0.6 -0.2 0.2 0.4 0.6
Index
ID
WTK
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
43163
56121
961
145159117140
632G427H125
254814
1085F105
158605031
1433J10
1221817
15557
83033
1012851592119384965
4I153
45121654
119132
37364026
124146
4161
11353
12964
1542729
161118150
23162
2046
126133141
22151
5844
14952
14232552435
1154734396215
110164
13
-0.2 0.0 0.2
Index
ID
AGEL
o
o
o
o
oo
o
o
o
o
oo
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
3J132
3447
101311838222744531356
81016173615352524
14932
162158154125146150141122
21412033
161592657
137
10512
15955
1334965
1216246
64863
16461
12628
1199
108155
54110
5F4I153
1637H140
5852144550
1434229
11823
2G51
1243043
14264
129145
19117
40115
60151
39113
-0.2 0.0 0.1 0.2 0.3
Index
ID
HT
Inclusion index of non-linear models
Figure18: A plot of the inclusionindex of individualsandnon-linearmodels.On they-axis are the ID numbersandon the x-axis is the index. Eachpanelrefersto onecontinuouscovariate.
to satisfytheconvergencecriteria.Thelatterdependson thealgorithmused.
Thedefault algorithmin Xposeis thefluctuationratio. This algorithmterminatesthebootstrapof the GAM whenthe fluctuationsin the relative inclusionprobabilitiesofthe covariatesis lower thancertainvalue. The fluctuationof the inclusionfrequencyfor a specificcovariateis determinedover a certaininterval (thedefault is the last20iterations,seeSection11.5.4)and the measureof fluctuation, M , is the ratio of thehighestandlowestinclusionfrequency in thatinterval. Theoverallfluctuation,MONQPRN isobtainedby Eq.2:
MSNQPRNUTVXWY[ZO\ M YS]_^OYV`Wa ZO\ ^ a (2)
M Y is thefluctuationof therelativeinclusionfrequency in theintervalof the b thcovariateand
^cYis therelative inclusionfrequency of the b th covariateat theendof theinterval.
47
The valueof MSNQPRN , which will be 1 if thereareno fluctuations,arecomparedto thecritical value,by default1.04.
Thefluctuationratio will put moreweight to thecovariateswith a high inclusionfre-quency, whichis desirableif weareonly interestedin theoverallstabilityof thecovari-ates.If weareinterestedin theinfluencethatindividualsand/orothercovariatesexerts,it is necessaryto make surethatalsothecovariateswith low inclusionfrequenciesarefoundby theGAM enoughtimes.This is to make surethat theinclusionindices(e.g.Eq.1) do not rely on only a few occurrencesof thecovariatein thefinal GAM model.The secondterminationalgorithm, lowestabsolutejoint inclusionfrequency, makessurethattheindividual thathasbeenincluded(oneor moretimes)in thebootstrappeddatasetsthe leastnumberof times,is includedat leasta certainnumberof times( d –lowestabsolutejoint inclusionfrequency) togetherwith a hypotheticalcovariatewitha certain,hypothetical,inclusionfrequency ( e – lowest importantrelative inclusionfrequency). Let
>bethelowestrelative inclusionfrequency of all individualsat a cer-
tain iteration, f . (The relative inclusionfrequency of an individual is the numberoftimesan individual hasbeenincludedin any of thebootstrappeddatasetsdividedbythe numberof datasets.) The critical valuefor this algorithmis: e ] > ] f , which iscomparedto d . Thedefault valuesof d and e are25and0.2respectively.
Theconvergenceof thelowestabsolutejoint inclusionalgorithmis only dependentontherandomnumberpermutationusedin aspecificbootstrapof theGAM problem.ThismeansthatXposecouldactuallycalculatethenumberof iterationsneededfor conver-gence,However, the bootstrapof the GAM is not implementedthat way andXposecheckstheconvergenceat eachinterval asspecifiedby theconvergencecheckingin-terval (seeSection11.5.4).
11.5.3 Specifyingan iteration to start checkingconvergence
Option 10 is usedto specify the first interval at which the convergenceshouldbechecked.Thedefault is 30. If theuserwish to setthenumberof bootstrapiterationstoa specificnumber, it canbedoneby settingthefirst convergencecheckinginterval tothesameasthemaximumnumberof bootstrapiterations.
11.5.4 Specifyingthe interval betweenconvergencechecks
Option 11 is usedto specifythe interval betweenconvergencechecks(after the firstcheck).
11.5.5 Specifyinga seednumber
Theconstructionof bootstrappeddatasetsis arandomprocessand,consequently, eachbootstrapof theGAM run is unique.Option12 canbeusedto specifya seednumberfor therandomnumbergenerator. Theseednumberusedfor a bootstrapof theGAMrunis givenin thenumericalsummaryof theresults(seeSection11.3),this is trueevenif theseednumberwasnot suppliedby theuser. To repeata run exactly asa previousrun, it is necessaryto settheseednumberto thesameastheoneusedin theold run
48
12 Treebasedmodeling
Treebasedmodelsareanothermethodto find covariatesto includein theNONMEMmodel[20]. It hasso far not beenmuchusedin the areaof populationpharmacoki-netic/pharmacodynamicanalysis.It offerssome,at leasttheoretical,advantagesovertheGAM in thatthefit is invariantto transformationof theexplanatoryvariables(co-variates)andautomaticallyincludesthepossibilityof identifying interactionsbetweencovariates.For a moredetaileddescriptionof treebasedmodels,pleasereferto Statis-tical modelsin Sby ChambersandHastie[12] or Modernappliedstatisticswith S-plusby VenablesandRipley [21].
To fit atreemodelin Xposeit is necessaryto havecovariatesandat leastoneparameterdefinedin the currentdatabase.Selectoption 6 from the MAIN MENUfollowedbyoption6 in theCOVARIATE MODELMENU. Thiswill leadto theTREE MENU:
TREE MENU
1: Return to previous menu2: Fit tree3: Plot tree4: Find optimal tree sizeSelection:
12.1 Fitting a tr eemodel
To fit a treemodel,selectoption 2 from the TREE MENU. If thereis morethanoneparameterdefinedin thecurrentdatabase,Xposewill prompttheuserfor aparametername.
12.2 Plotting a tr ee
To plot the fitted tree selectoption 3 from the TREE MENU. Xposesaves the fit-ted treesso it is alsopossibleto specifyan old treefit. The Xposetreeobjectsarenamedwith a combinationof tree.xpose. , theparameterandtherunnumber, e.g.tree.xpose.CL.1 . Theuserwill bepromptedfor thesizeof thetreeto beplotted(seethenext section).Pressingreturnastheanswerto this promptwill producea treesimilar to Figure19.
12.3 Pruning tr ees
Thetreein Figure19 is anunprunedtree,i.e. it is a resultof growing thetreeuntil itis no longerpossibleto make any moresplits in theterminalbranches,a point whichis reachedwhentherearelessthan10 pointsin eachof the terminalbranches.Thisis likely to induceover fit andit is usuallynecessaryto prunethe tree21, i.e. theuserhasto specifywhichsizeof thetreethatis relevant(hencethepromptwhenplottinga
21In Xpose1.1 it wasnot at all possibleto prunethe fitted treesmakingthe treemodulein thatversionalmostuseless.
49
|HT<177.5
AGE<56.5
SEX:b
WT<94.235
HT<162.5
WT<81.875
SEX:b
WT<77.905 HT<171.5
HCTZ:b
HT<184 AGE<58.5
WT<94.46
14.62 18.2122.13
25.02 20.68
12.42 14.9620.07 15.63
18.93 29.73
28.68 34.81
26.78
Example fit of a tree model
Figure19: Plot of a treefit. The lengthof the vertical lines areproportionalto theimportanceof thesplit
fitted tree). Someguidanceto theimportanceof a specificsplit is givenby thelengthof the branchesin the treeplot but this can,however, not be usedasan indicatoroftherelevantsizeof a tree.Oneapproachto find therelevanttreesizeis to usea crossvalidationprocedure.First, the dataset is divided into x numberof groups,second,x numberof treesaregrown, eachwith oneof thex groupsomittedfrom thegrowthprocessandthird, thefitted treesareusedto predictthegroupof datanotusedto growthe tree. The improvementin thefit (measuredby thedeviance),at eachnodeof thetree,is averagedoverthex groupsandtheresultsareplotted.Thekey pointhereis thattheimprovementin fit will only continueupto acertainpointafterwhichthedeviancewill increase,i.e. the plot of the averageddeviancevs treesizewill often exhibit aminimum at somepoint, which is the suggestedoptimal tree size. This procedurewill not alwayswork but sinceit a randomprocessit is possibleto repeatit until asatisfactoryminimumis found. The last sentenceimplies a degreeof subjectivity inthe determinationof the optimal treesize. To minimize this, Xposewill repeatthecross-validationsix timesanddisplaytheresults.An exampleof cross-validationplotsareshown in Figure20andtheresultingprunedtree(size6) areshown in Figure21.
Admittedly, this is not satisfactorybut is betterthanusingthe unprunedtrees. Withthis implementationit seemsasif the the treemodelsaremostusefulfor exploratoryanalysis.
50
size
devia
nce
5800g 60
0062
0064
00h 6600
6800
2 4 6 8 10 12 14
1600 360 110 100 80 53 -Inf
size
devia
nce
5500
6000h
6500h
2 4 6 8 10 12 14
1600 360 110 100 80 53 -Inf
size
devia
nce
5600
5800g 60
0062
0064
00h 6600
6800
2 4 6 8 10 12 14
1600 360 110 100 80 53 -Inf
size
devia
nce
5600
5800g 60
00h 6200
6400
6600
2 4 6 8 10 12 14
1600 360 110 100 80 53 -Inf
size
devia
nce
5800
6000
6200
6400
6600
2 4 6 8 10 12 14
1600 360 110 100 80 53 -Inf
size
devia
nce
5400
5600
5800
6000
6200
6400
6600
2 4 6 8 10 12 14
1600 360 110 100 80 53 -Inf
Example of tree size exploration
Figure20: Resultsfrom exploring theoptimaltreesizeusingcross-validation.On they-axisis thedevianceandon thex-axisarethecorrespondingtreesizes.
13 Printing and exporting graphics
Theplotsin Xposearealwaysdisplayedonscreen22 (excepttherunsummaryandrunrecord).Whentheplot hasbeendrawn, theuseris askedif heor shewantsto print theplot or export to afile:
Do you want to print/export the graph? n(p=print,e=export)
Pressingn or returnacceptsthedefault– not to print or export theplot. Pressingp willsendtheplot to theprinterandpressinge will export theplot to a file.
Whena plot is printedor exportedthe userhasthe possibility to customizethe plottitle and/ortheaxis labels(thelatter is not availablewhentherearemultiple plotsperpage).To omit theplot title completely, typen at theprompt.
22In Xpose1.1 theuserhadselectif theplotsshouldbedisplayedon screenor sentto theprinterbeforemakingtheplot.
51
|HT<177.5
AGE<56.5
SEX:b
HCTZ:b
HT<184
17.96 22.72
15.26
18.93 29.73
29.49
Example of a pruned tree
Figure21: Prunedtreeof size6.
Do you want to print/export the graph? n(p=print,e=export)pType new title (return for default):Final modelType new x-axis title (return for default):Time (h)Type new y-axis title (return for default):Weighted residuals
Whenprintingaplot it is sentto thesystemsdefaultprinter. To changethatbehavior ona UNIX system it is necessary to use a command similar tops.options(command="lp -d my.printer") , pleaserefer to the S-PLUSmanualsfor furtherdetails.On a PCtheusercanspecifytheprinterin theusualWin-dowsway.
Theformatof anexportedplotdependsontheplatform.OnUNIX machinesthedefaultformat is postscriptand the plot will end up in a file called xp.ps in the currentdirectory. On PCsthe plots are exportedto the clipboard in windows metafileformat,i.e. whenaplot hasbeenexportedit is necessaryto switchto anotherWindows
52
application,e.g.MicrosoftWord,andpastetheplot in a document.Note,ona PCit isgenerallynotpossibleto exportmulti-pageplots.
As perdefaultplotsareexportedandprintedin blackandwhite. To usecolor, goto theDATABASEMANAGEMENTMENUandselectoption26. Thisoptionwill togglecolorprintingandexportingonandoff.
If the userwishesto changethe behavior of theprinting and/orexportationit canbedonein theXposefunctionsask.print andset.xp.print.dev .
14 Altering the the way Xposedraws plots
Section2.5describeshow theXposedatavariablescanbere-defined.For somevari-ablesit is alsopossibleto undefinethem. To undefinea variable,selectthe relevantoptionon the MANAGEDATABASESMENUandtypeNULL at theprompt(notethecapitalization).Only somedatavariablescansuccessfullybeundefined:idlab, wres,iwres, pred, ipredanddv.
Undefiningidlab will suppresstheusageof ID numbersasplottingsymbols.
Undefiningeither or both of pred and ipred will suppressthe plotting of the cor-respondingcurve in the Individual plots option in the GOODNESSOF FIT
PLOTS MENU. Thisalsoworksin thePredictions vs dependent variableoptionin thesamemenu.
Undefiningeitheror twoof pred, ipredanddvwill suppressthecorrespondingplot(s)inthePredictions vs independent variable in theSTRUCTURALMODELDIAGNOSTICS MENU.
Undefiningeitherof iwres or wres will suppressthecorrespondingplot in options5 and6 on theRESIDUAL ERRORMODELDIAGNOSTICS MENU
15 Making plots of a subsetof the data
Sometimesit is desirableto concentrateon a subsetof thedata,for examplethephar-macodynamic data from a fit of a simultaneous pharmacokinetic/pharmacodynamicmodel. This canbedonein Xposeby specifyingtwo Xposedatavariables:flag andcurflag (currentvalueof flag). Theflag variablecanbespecifiedintheDATABASEMANAGEMENTMENUor setto adefaultvalueby theuseof amutab(seeSection2.1.4).Theflag variabledetermineswhich dataitem to beusedto definethesubsetsof thedata. Theflag variablecan,however besetwithout makingXposeproduceplots for a certainsubgroup23. To usea certainsubgroupit is alsonecessaryto specifythecurrentvalueof theflag variable. (Seebelow for exceptions.)Thisdonewith option14 in theDATA BASE MANAGEMENTMENU:
Type the value to be used as the current flag value. Possible valuesare: 2 1 (q to leave it as is, n to unspecify):
23Usinga mutab in Xpose1.1 madeit possibleto selectplots from a certainmenuwith plots thatknewhow to handlemultiple responsevariables,i.e. it wasenoughto specifythe“flag” variablein themutab .
53
It is necessaryto specifytheflag variablebeforesettingthecurrentvalueof theflag.In theaboveexampletheflag variablehasbeensetto SEX. Whenselectingthecurrentvalueof theflag, Xposeshows thepossiblevaluesthecurrentvaluecanhave, in thiscase1 and2, andgivestheusertheoptionsto leave it asis, that is, not to changethecurrentvalueof theflag, to unspecifyit, i.e. to go backto look at all dataat thesametime,andto specifyany of thepossiblevalues.Note! Theflag variablewill betreatedasa factorvariable,meaningthatit is nota goodideato setit to a continuousvariablelikeAGE.
Whentheflag andcurflag variablesareset,all plotsandanalyseswill bemadebasedon thesubsetof thedataspecifiedby thesevariables.
If theflag is setandnot thecurflag thereareafew plotsthatoffersto usethatflag vari-ableastheconditioningvariable.Specifically, thisis thecoplotsontheSTRUCTURALMODELDIAGNOSTICS MENUandtheRESIDUAL ERRORMODELDIAGNOSTICS MENU(seeFig. 1) thatcorrespondsto theplotsin theBasic goodness of fit plotsandAdditional goodness of fit plots thatusesthecovariatesasthecon-ditioningvariable.Theseoptionswill offer to usetheflag variableastheconditioningvariableinsteadof thedataitemsdefinedby thecovariate variables.
54
References
[1] L. Aarons.Sparsedataanalysis.Eur. J. Metab. Pharmacokin., 18:97–100,1993.
[2] T. H. Grasela,E. J. Antal, R. J. Townsend,andR. B. Smith. An evaluationofpopulationpharmacokineticsin therapeutictrials.PartI. Comparisonof method-ologies.Clin. Pharmacol.Ther., 39(6):606–612,1986.
[3] L. B. SheinerandS. L. Beal. Evaluationof methodsfor estimatingpopulationpharmacokineticparametersII. Biexponentialmodel;experimentalpharmacoki-neticdata.J. Pharmacokin.Biopharm., 9(4),1983.
[4] M. O. Karlsson,E. N. Jonsson,C. Wiltse,andJ.R. Wade.Assumptiontestinginpopulationpharmacokineticmodels:Illustratedwithin ananalysisof moxonidinepk data.Submitted.
[5] StatSci,a division of Mathsoft, Inc, Seattle. S-PLUSguide to statistical andmathematicalanalysis,Version3.3, 1995.
[6] W. S. Cleveland. Visualizingdata. Hobartpress,Summit,New Jersey, USA,1993.
[7] W. S. Cleveland. Theelementsof graphingdata. Hobartpress,Summit,NewJersey, USA, 1985.
[8] J.W. Tukey. Exploratorydataanalysis. Addison-Wsley, Reading,Massachusetts,USA, 1977.
[9] R. A. BeckerandW. S.Cleveland.S-PLUSTrellis Graphicsuser’smanual. Seat-tle:MathSoft,Inc, MurrayHill: Bell Labs,1996.
[10] J. W. Mandema, D. Verotta, and L. B. Sheiner. Building populationpharmacokinetic-pharmacodynamic models.I. modelsfor covariateeffects. J.Pharmacokin.Biopharm., 20(5):511–528,1992.
[11] J.W. Mandema, D. Verotta, and L. B. Sheiner. Building populationpharmakokinetic-pharmacodynamic models. In D. Z. D’Ar genio, editor, Ad-vancedmethodsofpharmacokineticandpharmacodynamicsystemsanalysis,Vol-ume2. PlenumPress,1993.
[12] J. M. ChambersandT. J. Hastie,editors. Statisticalmodelsin S. Chapman&Hall, London,1993.
[13] T. J. HastieandR. J. Tibshirani. Generalizedadditivemodels. ChapmanandHall, New York, 1990.
[14] E. N. Jonssonamd M. O. Karlsson. Identification of factors that influ-ences the importanceof covariate relationshipsin population pharmacoki-netic/pharmacodynamicmodels.Submitted, 1997.
[15] A. C. Davison and E. J. Snell. Residualsand diagnostics. In D. V. Hinkley,N. Reid,andE. J.Snell,editors,Statisticaltheoryandmodelling:In honorof SirDavidCox. Chapman& Hall, London,1991.
55
[16] B. Efron andR. J. Tibsgirani. An introductionto the bootstrap. ChapmanandHall, New York, 1993.
[17] R. Mick and M. J Ratain. Bootstrapvalidation of pharmacodynamicmodelsdefinedvia stepwiselinear regression.Clin. Pharmacol.Ther., 56(2):217–222,1994.
[18] W. SaubereiandM. Schumacher. A bootstrapresamplingprocedurefor modelbuilding: Application to the Cox regressionmodel. Statisticsin medicine,11:2093–2109,1992.
[19] E. I. Ette. Stability andperformanceof a populationpharmacokineticmodel. J.Clin. Pharmacol., 37:486–495,1997.
[20] E. I. EtteandT. M. Ludden.Populationpharmacokineticmodelling:Theimpor-tanceof informativegraphics.Pharm.Res., 12(12):1845–1855, 1995.
[21] W. N. Venablesand B. D. Ripley. Modern applied statisticswith S-PLUS.Springer-Verlag,New York, 1994.
56