be boundless advancing data-intensive discovery in all fields...epic fail, global impact •2010...

Reproducibility: failures & futures David A. C. Beck Chemical Engineering & eScience Institute Advancing data-intensive discovery in all fields Knowledge and solutions for a changing world Be boundless

Post on 22-Sep-2020


Page 1:

Reproducibility: failures & futures

David A. C. Beck
Chemical Engineering & eScience Institute

Advancing data-intensive discovery in all fields

Knowledge and solutions for a changing world

Be boundless

Page 2:

Reproducibility

• Can an experimental result be reproduced?
• Reproducibility comes in different flavors
  – Same data, same analyses (Reproducibility)
  – Similar data, same analyses (Replicability)
  – Same data, similar analyses (Robustness)
  – Others?
  – Today I’ll use “Reproducibility” to cover all of these

Page 3:

Reproducibility

• Can an experimental result be reproduced?
  – Medical science
    • Drug trial: Does a drug provide a benefit? Is it harmful?
    • Is there a genetic association with a cancer?
  – Economics
    • Is austerity the best way to get a national economy out of recession?
    • Is a 2 billion dollar industrial plant a financially sensible investment?

Page 4:

Reproducibility

• Can an experimental result be reproduced?
  – Social science
    • Does an in-person conversation change views on marriage equality?
  – Engineering
    • Does a wastewater treatment strategy remove micro-pollutants down to a safe level?

Page 5:

Reproducibility

• Can an experimental result be reproduced?
  – The above examples all have data science components

It isn’t just academic science & engineering!

Page 6:

Reproducibility

• Can an experimental result be reproduced?
  – Marketing
    • Do loyalty programs alter buyer behavior?
    • Does removing fields from a registration form increase user completion?
    • Does a web page layout increase purchasing?
    • Sidebar:
      – To see some of how this works, check out this how-to:
        » https://webdesign.tutsplus.com/articles/split-testing-with-google-analytics-experiments--webdesign-7879
    • Other examples?

Page 7:

Epic fail Schadenfreude* parade

*a feeling of joy that comes from seeing or hearing about another person's troubles or failures. - Wikipedia

Page 8:

Epic fail

• In 2011, Bayer (pharmaceuticals) tried to replicate 67 important papers
  – Oncology
  – Women’s health
  – Cardiovascular medicine

Only about 21% were reproducible

Begley, C. G.; Ellis, L. M. (2012). "Drug development: Raise standards for preclinical cancer research". Nature 483 (7391): 531–533.

Page 9:

Epic fail, part 2

• In 2012, Amgen published a report in Nature
  – Examined 53 landmark studies in cancer

6 of 53 (11%) were reproducible

Begley, C. G.; Ellis, L. M. (2012). "Drug development: Raise standards for preclinical cancer research". Nature 483 (7391): 531–533.

Page 10:

Epic fail, part 3
Primer: microarrays

Miller, M. B. and Y. W. Tang (2009). "Basic concepts of microarrays and potential applications in clinical microbiology." Clin Microbiol Rev 22(4): 611-633.

Page 11:

Epic fail, part 3

Ioannidis, J. P. A. et al. Repeatability of published microarray gene expression analyses. Nat Genet 41(2), Feb 2009.

Attempt to reproduce tables and figures from 18 papers published in Nature Genetics using microarrays

Page 12:

Epic fails in medicine

• What are the repercussions of irreproducible results in medicine?
  – Biotech companies
  – Government
  – People?

Page 13:

Epic fail, global impact

• Grab your way-back hat and put it on!

Page 15:

Epic fail, global impact

• 2010 paper by Reinhart & Rogoff, “Growth in a Time of Debt”
  – …high debt/GDP levels (90 percent and above) are associated with notably lower growth outcomes.
  – Debt-to-GDP ratios over 90% have real GDP growth of -0.1%
  – Seldom do countries “grow” their way out of debts.

Reinhart, Carmen M., and Kenneth S. Rogoff. 2010. "Growth in a Time of Debt." American Economic Review, 100(2): 573-78.

Page 16:

Epic fail, global impact

• Paper was widely cited by
  – Political parties
  – Governments
  – International lending agencies
• To show that austerity was the solution to the global recession
• Even part of the 2012 US presidential election!

Reinhart, Carmen M., and Kenneth S. Rogoff. 2010. "Growth in a Time of Debt." American Economic Review, 100(2): 573-78.

Page 17:

Epic fail, global impact

• UMass Amherst graduate student Thomas Herndon
  – Tried to reproduce the results of the paper for a class: couldn’t
  – Requested the ‘code’ for the computations from R&R: got an Excel spreadsheet
  – Found multiple errors

Reinhart, Carmen M., and Kenneth S. Rogoff. 2010. "Growth in a Time of Debt." American Economic Review, 100(2): 573-78.
Thomas Herndon, Michael Ash & Robert Pollin, Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff

Page 18:

Epic fail, global impact

• UMass Amherst graduate student Thomas Herndon
  – Found multiple errors

Reinhart, Carmen M., and Kenneth S. Rogoff. 2010. "Growth in a Time of Debt." American Economic Review, 100(2): 573-78.
Thomas Herndon, Michael Ash & Robert Pollin, Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff

Coding errors, selective exclusion of available data, and unconventional weighting of summary statistics lead to serious errors that inaccurately represent the relationship between public debt and GDP growth.
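
Sidebar sketch: the slide does not say how the summary statistics were weighted, so the snippet below is only a toy illustration of why a weighting choice can move a headline number. The countries and growth rates are invented for the example and are not taken from either paper; the two function names are hypothetical.

```python
from statistics import mean

# Invented growth rates (%), one entry per year a country spent above
# some debt threshold. These are NOT data from Reinhart & Rogoff or Herndon et al.
growth_by_country = {
    "Country A": [2.5, 2.0, 3.0, 2.4, 2.6, 2.3, 2.7, 2.5],  # eight years
    "Country B": [-7.0],                                     # a single bad year
}


def country_weighted_mean(data):
    """Average each country first, then average the country means.

    Every country counts equally, no matter how many years it contributes.
    """
    return mean(mean(years) for years in data.values())


def country_year_weighted_mean(data):
    """Pool every country-year observation and average once.

    Every year counts equally, so long episodes carry more weight.
    """
    return mean(rate for years in data.values() for rate in years)


by_country = country_weighted_mean(growth_by_country)
by_country_year = country_year_weighted_mean(growth_by_country)
print(f"equal weight per country:      {by_country:+.2f}%")
print(f"equal weight per country-year: {by_country_year:+.2f}%")
```

In this made-up case the same nine observations give a negative average under one scheme and a positive one under the other, which is why the weighting has to be stated and the code shared.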

Page 19:

Epic fail, global impact

• Herndon fixed the errors and reexamined claims
• Original claims
  – Debt-to-GDP ratios over 90% have real GDP growth of -0.1%
  – In a recession: Austerity good, spending bad
• Modified claims
  – Debt-to-GDP ratios over 90% have real GDP growth of 2.2%
  – In a recession: Spending good

Reinhart, Carmen M., and Kenneth S. Rogoff. 2010. "Growth in a Time of Debt." American Economic Review, 100(2): 573-78.
Thomas Herndon, Michael Ash & Robert Pollin, Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff

Page 20:

Epic fail, global impact

• Grab your way-back hat and put it on!

Page 21:

Epic fail, global impact

• What effect did the incorrect R&R paper have?

Page 22:

Epic failure, part 4

http://www.nature.com/news/over-half-of-psychology-studies-fail-reproducibility-test-1.18248

Page 23:

Reproducibility

• Why do we care?

“Non-reproducible single occurrences are of no significance to science.”

– Karl Popper

Popper, K. R. 1959. The logic of scientific discovery. Hutchinson, London, United Kingdom.

Page 24:

Science in crisis?

Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452-454 (2016).

Page 25:

Reproducibility: Things are bad

Page 26:

Why is this happening?

• Social factors, e.g.
  – Fraud, misconduct
  – Pressure to publish
• p-hacking
• Poor experimental design
  – Small effect size
  – Small sample size
• Data not disclosed
• Methods not disclosed or properly described
  – Software not available

Important but not Data Science related.
WE ARE WORKING ON THESE!

Page 27:

p-hacking

• Do a study to test some hypothesis
  – E.g. an apple a day keeps the Dr. away
• Use a p-value threshold of 0.05
  – i.e. a 5% chance of seeing a difference at least as big as the one observed, by chance alone
• Perform 1000s of statistical tests
• What happens?

~50 significant results per 1,000 tests by chance alone

1. Simmons, J. P., L. D. Nelson, and U. Simonsohn. 2011. False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22(11): 1359-1366.
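
Sidebar sketch: the "~50 by chance alone" figure can be checked with a short simulation. It assumes that every null hypothesis is true and that the tests are independent, in which case each p-value behaves like a uniform draw on [0, 1]. The function name, the seed, and the choice of exactly 1,000 tests are illustrative assumptions.

```python
import random


def count_false_positives(n_tests=1000, alpha=0.05, seed=0):
    """Count 'significant' results when there is no real effect anywhere.

    Under a true null hypothesis a p-value is uniformly distributed on
    [0, 1], so each test is modeled here as a single uniform random draw.
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_tests)]
    return sum(p < alpha for p in p_values)


expected = 1000 * 0.05  # tests multiplied by the significance threshold
observed = count_false_positives()
print(f"expected about {expected:.0f} false positives, observed {observed} in this run")
```

Changing the seed changes the observed count slightly, but it stays close to the expected 50: reporting only those "hits" is what makes the practice misleading.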

Page 28:

p-hacking

• Test a very large number of hypotheses on a data set, searching for any statistically significant effect
• Goes by many names in different disciplines
  – Multiple comparisons (1950s, most statisticians),
  – File drawer problem (Rosenthal, 1979),
  – Significance questing (Rothman and Boice, 1979),
  – Data mining, dredging, torturing (Mills, 1993),
  – Data snooping (White, 2000),
  – Selective outcome reporting (Chan et al., 2004),
  – Bias (Ioannidis, 2005),
  – Hidden multiplicity (Berry, 2007),
  – Specification searching (Leamer, 1978), and
  – p-hacking (Simmons et al., 2011).

https://www.nap.edu/read/21915/chapter/4#43

Page 29:

p-hacking

• Is this intentionally evil?
• Why isn’t it misconduct?
• My opinion:
  – Most times, probably not
  – Reflects lack of understanding about hypothesis testing

Page 30:

p-hacking

• What is being done about it?
  – Register the study beforehand (“Preregistration”)
    • Let everyone know the precise hypothesis being tested before data are collected
  – Get free from the tyranny of the p-value
  – Better statistics education

Page 31:

Poor experimental design

• Want to test toxicity of my new fluorescent brown dye

Page 32:

Poor experimental design

• Want to test toxicity of my new fluorescent brown dye
  – Feed some to 10 people
  – Watch how long they live

10 subjects, day 0

Page 33:

Poor experimental design

• What are some problems with this experimental design?
  – Control group?

WHAT DO YOU MEAN YOU FORGOT THE CONTROL?

10 subjects, no dye

Similar demographics

Page 34:

Poor experimental design

• Is it toxic?

10 subjects, day 0
10 subjects, day 1

*Average lifespan in the US is 78 years with a standard deviation of 15 years

Page 35:

Poor experimental design

• Is it toxic?

10 subjects, day 0
10 subjects, 50 years

*Average lifespan in the US is 78 years with a standard deviation of 15 years

Page 37:

Poor experimental design

• What are some problems with this experimental design?
  – What is the effect size you want to be able to measure? E.g. how many years’ difference?
  – What is the sample size required to see that effect?
• A small sample can show an effect due to chance
  – Won’t be reproducible!
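
Sidebar sketch: the two questions above are linked by a standard sample-size formula for comparing two group means, n per group = 2 * ((z_alpha + z_power) * sd / effect)^2. The snippet uses the 15-year standard deviation quoted in the dye example; the normal approximation, the two-sided 0.05 threshold, the 80% power target, and the function name are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def subjects_per_group(effect_years, sd_years=15.0, alpha=0.05, power=0.80):
    """Approximate subjects needed in EACH group (dye and control).

    Normal-approximation formula for a two-sided comparison of two means
    with equal group sizes and a shared standard deviation.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_power = standard_normal.inv_cdf(power)          # desired power
    n = 2 * ((z_alpha + z_power) * sd_years / effect_years) ** 2
    return math.ceil(n)


for effect in (15, 5, 1):
    print(f"to detect a {effect}-year difference: "
          f"{subjects_per_group(effect)} subjects per group")
```

Because the required sample grows with the inverse square of the effect size, a 10-subject study can only hope to detect a very large difference in lifespan.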

Page 38:

Poor experimental design

• What is being done about it?
  – Better statistics education
  – Replicate significant results with small effect size with way more samples

SAMPLES

Page 39:

Data disclosure

• Data unavailable
  – Lost or destroyed
  – Streaming data too big to store
• Raw data not kept, only processed
• Data intentionally not shared
  – By law (FERPA, HIPAA)
  – Corporate data (e.g. Twitter, JSTOR)
  – Some jerk just won’t share

Page 41:

Data disclosure

• What is being done about it?
  – Federal funding agencies now require data sharing
  – Science journals require open data
  – Deposit raw data as soon as collected
    • Similar to preregistration
  – Open data badges for researchers
  – Data sharing repositories
    • National Center for Biotechnology Information
    • Dryad (20 GB limit, $100 per 10 GB beyond)

Page 42:

Methods

• Poorly written methods
  – Steps missing
• Intentional methods omissions
  – To protect a monopoly on an experimental procedure
• The fix:
  – Better peer review in science
  – Better communication skills education in business

Page 43:

Software

• Software unavailable
  – Why?
• What are some other software issues?
  – Un-runnable, i.e. broken
  – Not documented
  – Dependencies not known or given
  – Hardware constraints

Page 44:

Software

• What is being done about it?
  – Use open source software
  – Virtual environments
    • Use something that can FREEZE the state of the software and hardware
    • Docker images
    • Amazon Machine Images (AMI)
    • Virtual machines generally
  – Educating scientists in software engineering
    • Version control, documentation, testing, …
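
Sidebar sketch: short of a full Docker image or virtual machine, a script can at least record which software produced a result, so that "dependencies not known or given" stops being a problem. This does not freeze anything; it only writes down the state. The function name and the layout of the returned dictionary are illustrative assumptions.

```python
import json
import platform
from importlib import metadata


def environment_snapshot():
    """Describe the interpreter, platform, and installed packages.

    Returns a plain dictionary that can be saved next to the results.
    """
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip installs with unreadable metadata
    )
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": packages,
    }


if __name__ == "__main__":
    # Print the snapshot as JSON; redirect to a file to archive it.
    print(json.dumps(environment_snapshot(), indent=2))
```

Saving this output alongside the data and the code gives a later reader a starting point for rebuilding the environment, for example inside one of the image types listed above.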

Page 45:

Resources

• eScience Institute Reproducibility Group
  – http://uwescience.github.io/reproducible/
• Berkeley Institute for Data Science Repro Stuff
  – https://bids.berkeley.edu/working-groups/reproducibility-and-open-science
• Center for Open Science
  – https://cos.io
• Coursera from JHU
  – https://www.coursera.org/learn/reproducible-research
• Other links in this presentation

Page 46:

Thank you!

• See you next week for last seminar!
• CSE 491 folks:
  – Don’t forget to take the quiz!
  – Don’t forget to take the quiz!
  – Don’t forget to take the quiz!
  – Don’t forget to take the quiz!
  – Don’t forget to take the quiz!
  – Don’t forget to take the quiz!