OPEN SCIENCE MONITOR

DRAFT METHODOLOGICAL NOTE

Brussels, April 30th 2018

Consortium partners: the Lisbon Council, ESADE Business School, CWTS (Leiden University)

Subcontractor: Elsevier

1 Introduction
  1.1 Objectives
  1.2 Scope
2 Indicators and data sources
  2.1 Open access to publications
  2.2 Open research data
  2.3 Open collaboration
    2.3.1 Open code
    2.3.2 Open scientific hardware
    2.3.3 Citizen science
    2.3.4 Altmetrics
3 Next steps
Annex: Technical report on the identification of Open Access publishing

1 Introduction

Open science has recently emerged as a powerful trend in research policy. To be clear, openness has always been a core value of science, but it used to mean publishing the results of research in a journal article. Today, there is consensus that ensuring the widest possible access to, and reuse of, publications, data, code and other intermediate outputs increases scientific productivity, makes scientific misconduct rarer and accelerates discovery. Yet it is also clear that progress towards open science is slow, because it has to fit into a system that provides appropriate incentives to all parties. Of course Dr. Rossi can advance his research faster by having access to Dr. Svensson's data, but what is the rationale for Dr. Svensson to share her data if no one includes data citation metrics in the career assessment criteria?

The European Commission has recognized this challenge and, building on the initial 2012 recommendation on scientific information (C(2012) 4890), has moved forward with strong initiatives such as the Open Science Policy Platform and the European Open Science Cloud. Open access and open data are now the default option for grantees of H2020.

The Open Science Monitor (OSM) aims to provide the data and insight needed to support the implementation of these policies. It gathers the best available evidence on the evolution of open science, its drivers and impacts, drawing on multiple indicators as well as on a rich set of case studies. [1]

This monitoring exercise is challenging. Open science is a fast-evolving, multidimensional phenomenon. According to the OECD (2015), “open science encompasses unhindered access to scientific articles, access to data from public research, and collaborative research enabled by ICT tools and incentives”. This very definition confirms the relative fuzziness of the concept and the need for a clear definition of the "trends" that compose open science.

Precisely because of the fast evolution and novelty of these trends, in many cases it is not possible to find consolidated, widely recognized indicators. For more established trends, such as open access to publications, robust indicators are available through bibliometric analysis. For most others, such as open code and open hardware, there are no standardized metrics or data gathering techniques, and there is a need to identify the best available indicator that captures the evolution of the trend and shows its importance.

The present document illustrates the methodology behind the selected indicators for each trend. The purpose of the document is to ensure transparency and to gather feedback in order to improve the selected indicators, the data sources and the overall analysis.

The initial launch of the OSM contains a limited number of indicators, mainly updating the existing indicators from the previous Monitor (2017). New trends and new indicators will be added in the course of the OSM project, also based on the feedback to the present document.

[1] The OSM was published in 2017 as a pilot and re-launched by the European Commission in 2018 through a contract with a consortium composed of the Lisbon Council, ESADE Business School and CWTS of Leiden University (plus Elsevier as subcontractor). See https://ec.europa.eu/research/openscience/index.cfm?pg=home&section=monitor


1.1 Objectives

The OSM covers four tasks:

1. To provide metrics on the open science trends and their development.
2. To assess the drivers of (and barriers to) open science adoption.
3. To identify the impacts (both positive and negative) of open science.
4. To support evidence-based policy actions.

The indicators presented here focus mainly on the first two tasks: mapping the trends, and understanding the drivers of (and barriers to) open science implementation. The chart below provides an overview of the underlying conceptual model.

Figure 1: A conceptual model: an intervention logic approach

The central aspect of the model refers to the analysis of the open science trends and is articulated along three dimensions: supply, uptake and reuse of scientific outputs.

In the OSM framework, supply refers to the emergence of services such as data repositories. The number of data repositories (one of the existing indicators) is a supply indicator of the development of open science. On the demand (uptake) side, indicators include, for example, the amount of data stored in the repositories and the percentage of scientists sharing data. Finally, because of the nature of open science, the analysis will go beyond usage, since the reuse dimension is particularly important. In this case, relevant indicators include the number of scientists reusing data published by other scientists, or the number of papers using these data.


On the left side of the chart, the model identifies the key factors influencing the trends, both positively and negatively (i.e. drivers and barriers). Both drivers and barriers are particularly relevant for policy-makers, as this is the area where action can make the greatest difference, and they are therefore strongly related to policy recommendations. These include “policy drivers”, such as funders' mandates. It is important to assess not only policy drivers dedicated to open science, but also more general policy drivers that could have an impact on the uptake of open science. For instance, the increasing reliance on performance-based funding or the emphasis on market exploitation of research are general policy drivers that could actually slow down the uptake of open science.

The right side of the chart illustrates the impacts of open science: on research, or the scientific process itself; on industry, or the capacity to translate research into marketable products and services; and on society, or the capacity to address societal challenges.

1.2 Scope

By definition, open science concerns the entire cycle of the scientific process, not only open access to publications (Burgelman et al., 2010). Hence the macro-trends covered by the study include: open access to publications, open research data and open collaboration.

Table 1: Articulation of the trends to be monitored

Categories | Trends
Open access to publications | Open access policies (funders and journals); green and gold open access adoption (bibliometrics) [2]
Open research data | Open data policies (funders and journals); open data repositories; open data adoption and researchers' attitudes
Open collaboration | Open code; altmetrics; open hardware; citizen science

New trends within the open science framework will be identified through interaction with the stakeholder community, by monitoring discussion groups, associations (such as the Research Data Alliance, RDA), mailing lists, and conferences such as those organised by Force11 (www.force11.org).

[2] According to the EC, “‘Gold open access’ means that open access is provided immediately via the publisher when an article is published, i.e. where it is published in open access journals or in ‘hybrid’ journals combining subscription access and open access to individual articles. In gold open access, the payment of publication costs (‘article processing charges’) is shifted from readers' subscriptions to (generally one-off) payments by the author. […] ‘Green open access’ means that the published article or the final peer-reviewed manuscript is archived by the researcher (or a representative) in an online repository.” (Source: H2020 Model Grant Agreement)


The study covers all research disciplines, and aims to identify the differences in open science adoption and dynamics between diverse disciplines. Current evidence shows diversity in open science practices in different research fields, particularly in data-intensive research domains (e.g. life sciences) compared to others (e.g. humanities).

The geographic coverage of the study is the 28 Member States (MS) and the G8 countries, including the main international partners, with different degrees of granularity for the different variables. As far as possible, data have to be presented at country level.

Finally, the analysis focuses on the factors at play for different stakeholders, as mapped in the table below (Table 2). For each stakeholder category, the OSM will deliberately consider both traditional players (e.g. Thomson Reuters) and new players in research (e.g. F1000).

Table 2: Stakeholder types

Researchers | Professional and citizen researchers
Research institutions | Universities, other publicly funded research institutions, and informal groups
Publishers | Traditional publishers; new OA online players
Service providers | Bibliometrics and new players
Policy makers | At supranational, national and local level
Research funders | Private and public funding agencies

2 Indicators and data sources

Because of the fast-moving and multidimensional nature of open science, a wide variety of indicators have been used, depending on data availability:

- Bibliometrics: this is the case for the indicators on open access to publications, and partially for open data and altmetrics.
- Online repositories: there are many repositories dedicated to providing wide coverage of the trends, such as policies by funders and journals, APIs and open hardware.
- Surveys: surveys of researchers shed light on usage and drivers. Preference is given to multi-year surveys.
- Ad hoc analysis in scientific articles or reports: for instance, reviews of journals' policies with regard to open data and open code.
- Data from specific services: open science services often offer data on their uptake, as in the case of SciStarter or Mendeley. In this case, the data offer limited representativeness of the trend in general, but can still be useful to detect differences (e.g. by country or discipline). Where possible, in this case, we present data from multiple services.


At the time of publication of the new OSM (May 2018), only a subset of the indicators listed below has been published (indicated with a “*” sign in the tables). The others will be published at regular intervals.

2.1 Open access to publications

In addition to the long list of indicators below, the detailed methodology for calculating the percentage of OA publications is presented in the annex at the end of this document.

Indicator | Source
Number of funders with open access policies* | Sherpa Juliet [3]
Number of journals with open access policies* | Sherpa Romeo [4]
P - # Scopus publications that enter in the analysis* | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
P(oa) - # Scopus publications that are Open Access (CWTS method for OA identification)* | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
P(green oa) - # Scopus publications that are Green OA* | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
P(gold oa) - # Scopus publications that are Gold OA* | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
PP(oa) - Percentage of OA publications in total publications* | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
PP(green oa) - Percentage of green OA publications in total publications* | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
PP(gold oa) - Percentage of gold OA publications in total publications* | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
TCS - Total Citation Score. Sum of all citations received by P in Scopus | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
FWCI - Field-Weighted Citation Impact | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE
TP1/TP10 - Top 1 / Top 10 percentile highly cited publications | Scopus, DOAJ, ROAD, PubMed Central, CrossRef, OpenAIRE

[3] http://v2.sherpa.ac.uk/juliet/
[4] http://www.sherpa.ac.uk/romeo/index.php?la=en&fIDnum=|&mode=simple
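To illustrate how the counts and percentages in the table above relate to each other, the following minimal Python sketch derives P, P(oa), P(green oa), P(gold oa) and the corresponding PP shares from per-publication OA labels. The function name `oa_indicators`, the label values ("gold", "green", `None`) and the sample data are assumptions made for this example only; they are not taken from the CWTS implementation.

```python
from collections import Counter


def oa_indicators(labels):
    """Compute OA counts and shares from one label per publication.

    `labels` is a list whose entries are "gold", "green" or None
    (None meaning no OA evidence was found). Hypothetical input format.
    """
    counts = Counter(labels)
    p = len(labels)                 # P: publications entering the analysis
    p_gold = counts["gold"]         # P(gold oa)
    p_green = counts["green"]       # P(green oa)
    p_oa = p_gold + p_green         # P(oa): each publication has one label

    def share(n):
        # Percentage of total publications; 0.0 for an empty set
        return 100.0 * n / p if p else 0.0

    return {
        "P": p,
        "P(oa)": p_oa,
        "P(green oa)": p_green,
        "P(gold oa)": p_gold,
        "PP(oa)": share(p_oa),
        "PP(green oa)": share(p_green),
        "PP(gold oa)": share(p_gold),
    }


# Made-up sample: 8 publications, 1 gold, 2 green, 5 without OA evidence
sample = ["gold", "green", None, None, "green", None, None, None]
result = oa_indicators(sample)
```

Because each publication carries a single label in this sketch, PP(green oa) and PP(gold oa) add up to PP(oa).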


2.2 Open research data

Indicator | Source
Number of funders with policies on data sharing* | Sherpa Juliet
Number of journals with policies on data sharing* | Vasilevsky et al., 2017 [5]
Number of open data repositories* | Re3data
% of papers published with data | Bibliometrics: DataCite
Citations of data journals | Bibliometrics: DataCite
Attitude of researchers on data sharing* | Survey by Elsevier, follow-up of the 2017 report [6]

2.3 Open collaboration

Indicator | Source
Membership of social networks on science (Mendeley, ResearchGate, F1000) | Scientific social networks

2.3.1 Open code

Indicator | Source
Number of code projects with DOI | Mozilla Codemeta
Number of scientific APIs* | ProgrammableWeb
% of journals with open code policy* | Stodden et al., 2013 [7]
Number of scientific projects on GitHub | GitHub

2.3.2 Open scientific hardware

Indicator | Source
Number of projects on open hardware repository* | Open Hardware repository [8]
Number of projects using open hardware license* | Open Hardware repository

[5] Vasilevsky, Nicole A., Jessica Minnier, Melissa A. Haendel, and Robin E. Champieux. “Reproducible and Reusable Research: Are Journal Data Sharing Policies Meeting the Mark?” PeerJ 5 (April 25, 2017): e3208. doi:10.7717/peerj.3208.
[6] Berghmans, Stephane, Helena Cousijn, Gemma Deakin, Ingeborg Meijer, Adrian Mulligan, Andrew Plume, Sarah de Rijcke, et al. “Open Data: The Researcher Perspective,” 2017, 48 p. doi:10.17632/bwrnfb4bvh.1.
[7] Stodden, V., Guo, P. and Ma, Z. (2013), “Toward reproducible computational research: an empirical analysis of data and code policy adoption”, PLoS One, Vol. 8 No. 6, p. e67111. doi:10.1371/journal.pone.0067111.
[8] https://www.ohwr.org

2.3.3 Citizen science

Indicator | Source
N. projects in Zooniverse and SciStarter* | Zooniverse and SciStarter
N. participants in Zooniverse and SciStarter | Zooniverse and SciStarter

2.3.4 Altmetrics

Indicator | Source
P(tracked) - # Scopus publications that can be tracked by the different sources (typically only publications with a DOI, PMID, Scopus ID, etc. can be tracked) | Scopus & Plum Analytics
P(mendeley) - # Scopus publications with readership activity in Mendeley | Scopus, Mendeley & Plum Analytics
PP(mendeley) - Proportion of publications covered on Mendeley. P(mendeley)/P(tracked) | Scopus, Mendeley & Plum Analytics
TRS - Total Readership Score of Scopus publications. Sum of all Mendeley readership received by all P(tracked) | Scopus, Mendeley & Plum Analytics
TRS(academics) - Total Readership Score of Scopus publications from Mendeley academic users (PhDs, professors, postdocs, researchers, etc.) | Scopus, Mendeley & Plum Analytics
TRS(students) - Total Readership Score of Scopus publications from Mendeley student users (Master and Bachelor students) | Scopus, Mendeley & Plum Analytics
TRS(professionals) - Total Readership Score of Scopus publications from Mendeley professional users (librarians, other professionals, etc.) | Scopus, Mendeley & Plum Analytics
MRS - Mean Readership Score. TRS/P(tracked) | Scopus & Plum Analytics
MRS(academics) - TRS(academics)/P(tracked) | Scopus & Plum Analytics
MRS(students) - TRS(students)/P(tracked) | Scopus & Plum Analytics
MRS(professionals) - TRS(professionals)/P(tracked) | Scopus & Plum Analytics
P(twitter) - # Scopus publications that have been mentioned in at least one (re)tweet | Scopus & Plum Analytics
PP(twitter) - Proportion of publications mentioned on Twitter. P(twitter)/P(tracked) | Scopus & Plum Analytics
TTWS - Total Twitter Score. Sum of all tweet mentions received by all P(tracked) | Scopus & Plum Analytics
MTWS - Mean Twitter Score. TTWS/P(tracked) | Scopus & Plum Analytics
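The ratios defined in the altmetrics table (PP(mendeley), MRS, PP(twitter), MTWS) all share the denominator P(tracked). The short Python sketch below shows one way these could be computed from per-publication counts. The record layout (`tracked`, `readers`, `tweets`), the function name and the sample figures are hypothetical and serve only to make the definitions concrete.

```python
def altmetric_indicators(publications):
    """Compute the headline altmetric indicators.

    `publications` is a list of dicts with the hypothetical keys
    'tracked' (bool), 'readers' (Mendeley readership count) and
    'tweets' (number of tweet mentions).
    """
    tracked = [p for p in publications if p["tracked"]]
    p_tracked = len(tracked)                                   # P(tracked)
    p_mendeley = sum(1 for p in tracked if p["readers"] > 0)   # P(mendeley)
    p_twitter = sum(1 for p in tracked if p["tweets"] > 0)     # P(twitter)
    trs = sum(p["readers"] for p in tracked)                   # TRS
    ttws = sum(p["tweets"] for p in tracked)                   # TTWS

    def per_tracked(n):
        # All ratios use P(tracked) as denominator
        return n / p_tracked if p_tracked else 0.0

    return {
        "P(tracked)": p_tracked,
        "P(mendeley)": p_mendeley,
        "PP(mendeley)": per_tracked(p_mendeley),
        "TRS": trs,
        "MRS": per_tracked(trs),
        "P(twitter)": p_twitter,
        "PP(twitter)": per_tracked(p_twitter),
        "TTWS": ttws,
        "MTWS": per_tracked(ttws),
    }


# Made-up sample: four trackable publications and one without identifiers
sample = [
    {"tracked": True, "readers": 10, "tweets": 2},
    {"tracked": True, "readers": 0, "tweets": 0},
    {"tracked": True, "readers": 6, "tweets": 0},
    {"tracked": True, "readers": 0, "tweets": 1},
    {"tracked": False, "readers": 0, "tweets": 0},
]
indicators = altmetric_indicators(sample)
```

The group-specific scores (academics, students, professionals) would follow the same pattern, with the readership count restricted to the relevant Mendeley user group.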


3 Next steps

This methodological note is published for public feedback, to be gathered until July 30th 2018. The feedback sought is, in particular, concrete, actionable suggestions for improvement:

- By identifying new trends, not selected so far
- By proposing improved indicators for the selected trends
- By selecting new, improved data sources for the selected indicators.

In October 2018 the final methodology will be released, together with new indicators and data.


Annex: Technical report on the identification of Open Access publishing

Thed van Leeuwen & Rodrigo Costas
Centre for Science and Technology Studies (CWTS), Leiden University, the Netherlands

Introduction

In this document the approach for the identification and creation of the Open Access (OA) labels for the Open Science Monitor (hereafter referred to as OS Monitor) is presented. As stated in the Terms of Reference, CWTS is following the exact same method that has been developed over the last two years, and which was reported at the Paris 2017 STI Conference (van Leeuwen et al., 2017). In this method we strive for a high degree of reproducibility of our results, based upon data carrying OA labels that follow from the methodology we developed. Our initial developments were based on the Web of Science, but for the OS Monitor the exact same method will be based on Elsevier's Scopus data.

The methodological approach that we propose mainly focuses on adding different OA labels to the Scopus database, using various data sources to establish the OA status of scientific publications. It is important to highlight that two basic principles for this OA labeling are sustainability and legality. By sustainability we mean that it should, in principle, be possible to reproduce the OA labeling from the various sources used, repeatedly, in an open fashion, with a relatively limited risk of the sources used disappearing behind a paywall, and in particular of publications reported as OA changing their status to closed. The second aspect (legality) relates to the usage of data sources that represent legal OA evidence for publications, excluding rogue or illegal OA publications (i.e. we do not consider OA publications made freely available on platforms such as ResearchGate or Sci-Hub). While the former criterion is mainly oriented to a scientific requirement, namely that of reproducibility and durability over time, the latter criterion is particularly important for science policy, indicating that OA publishing aligns with policies and mandates.

Data sources used for establishing OA labels

As main data sources to identify evidence of Open Access for publications covered in the Scopus database for the years 2009 to 2016, we used:

• the DOAJ list (Directory of Open Access Journals) [https://doaj.org/],
• the ROAD list (Directory of Open Access scholarly Resources) [http://road.issn.org/],
• PMC (PubMed Central) [https://www.ncbi.nlm.nih.gov/pmc/],
• CrossRef [https://www.crossref.org/], and
• OpenAIRE [https://www.openaire.eu/].


These five sources serve to label the publications according to the terminology used in the OA development. The first two sources (DOAJ and ROAD) serve to identify and label Gold OA, while the last three sources (PMC, CrossRef and OpenAIRE) serve to identify and label Green OA. In cases where publications published in Gold OA journals were also identified in one of the other sources, we determine the status of the publication as Gold OA. So Gold OA takes precedence over Green OA, as Gold is a more deliberate choice of the authors, often driven by a mandate to publish in a journal that is fully OA.

All five sources fulfill the above-mentioned requirements, while other popular ‘apparent’ OA sources such as ResearchGate and Sci-Hub fail to meet these two principles. Thus, it is important to stress here that our approach takes a policy perspective rather than a utilitarian one (i.e. just identifying publications that are freely available). In other words, our approach aims to report the number and share of sustainable and legal OA publications (i.e. publications that have been published in OA journals or archived in official and legal repositories), instead of merely identifying publications whose full text can be retrieved online (regardless of the source or the legal status of the access to the publication). For a broader discussion of other types of OA, as well as other possibilities for identifying OA, we refer the reader to our recent paper Martín-Martín et al. (2018) [https://arxiv.org/abs/1803.06161].
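The precedence rule described above (Gold OA takes precedence over Green OA, and sources outside the five listed ones do not count) can be written down in a few lines. The following Python sketch is only an illustration of that rule; the function name `oa_label` and the representation of evidence as a set of source names are assumptions of this example.

```python
# Sources that provide Gold and Green OA evidence, as listed in this annex
GOLD_SOURCES = {"DOAJ", "ROAD"}
GREEN_SOURCES = {"PMC", "CrossRef", "OpenAIRE"}


def oa_label(sources):
    """Return "gold", "green" or None for one publication.

    `sources` is the collection of source names in which OA evidence
    for the publication was found (hypothetical input format).
    """
    found = set(sources)
    if found & GOLD_SOURCES:
        # Gold wins even when Green evidence is also present
        return "gold"
    if found & GREEN_SOURCES:
        return "green"
    # Evidence from other platforms (e.g. ResearchGate) is not counted
    return None
```

For example, `oa_label({"DOAJ", "PMC"})` yields `"gold"`, while `oa_label({"ResearchGate"})` yields `None`.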

Sources of Open Access evidence

The sources mentioned above were fully downloaded (as provided by the original sources) using their public Application Programming Interfaces (APIs). The metadata obtained have been parsed and incorporated into an SQL environment in the form of relational databases.
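As a rough illustration of what loading parsed metadata into a relational database can look like, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table name `oa_journal`, its columns and the ISSN values are invented for the example; the note does not describe the actual schema or database system used at CWTS.

```python
import sqlite3

# Parsed journal records; the values are made up for illustration
records = [
    {"source": "DOAJ", "issn": "1234-5678", "eissn": "8765-4321"},
    {"source": "ROAD", "issn": "1111-2222", "eissn": None},
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE oa_journal (source TEXT NOT NULL, issn TEXT, eissn TEXT)"
)
# Named placeholders let each dict be inserted directly
conn.executemany(
    "INSERT INTO oa_journal (source, issn, eissn) "
    "VALUES (:source, :issn, :eissn)",
    records,
)
conn.commit()

# Count the loaded journals per source
rows = conn.execute(
    "SELECT source, COUNT(*) FROM oa_journal GROUP BY source ORDER BY source"
).fetchall()
```

Keeping each source in relational form makes the later linking steps (by ISSN, eISSN, DOI or PMID) expressible as joins against the publication table.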

DOAJ

A first source we used was the DOAJ list of OA journals. This list was linked to the Scopus database on the basis of the regular ISSN code, as well as the eISSN code, both of which are available in the DOAJ list and in the Scopus database. This yielded 1,028,447 publications labeled in Scopus as OA via the regular ISSN code, while the eISSN code resulted in 95,162 additional publications.

ROAD

A next source used to add labels to the Scopus database is the ROAD list. ROAD has been developed with the support of UNESCO, and is related to the ISSN International Centre. The list provides access to a subset of the ISSN Register. This subset comprises bibliographic records which describe scholarly resources in OA identified by an ISSN: journals, monographic series, conference proceedings and academic repositories. The linking of the ROAD list is based upon the ISSN code, as well as the eISSN code, both of which are available in Scopus and in the ROAD list. This resulted in a total of 524,082 publications being labeled as OA via the ISSN code, while the eISSN code resulted in 938,787 additional publications.
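The two-step linking used for the DOAJ and ROAD lists (first by ISSN, then by eISSN for publications not yet matched) can be sketched as follows. This is a simplified illustration in Python under assumed record layouts (`id`, `issn`, `eissn`); it is not the SQL-based procedure actually run against Scopus.

```python
def label_by_issn(publications, oa_journals):
    """Split publications into ISSN matches and additional eISSN matches.

    Both arguments are lists of dicts; publications carry the hypothetical
    keys 'id', 'issn' and 'eissn', journals carry 'issn' and 'eissn'.
    """
    oa_issns = {j["issn"] for j in oa_journals if j.get("issn")}
    oa_eissns = {j["eissn"] for j in oa_journals if j.get("eissn")}

    by_issn, by_eissn_only = [], []
    for pub in publications:
        if pub.get("issn") in oa_issns:
            by_issn.append(pub["id"])
        elif pub.get("eissn") in oa_eissns:
            # Counted as "additional": not already found via the ISSN
            by_eissn_only.append(pub["id"])
    return by_issn, by_eissn_only


# Made-up identifiers for illustration
journals = [{"issn": "1234-5678", "eissn": "8765-4321"}]
pubs = [
    {"id": "p1", "issn": "1234-5678", "eissn": None},
    {"id": "p2", "issn": None, "eissn": "8765-4321"},
    {"id": "p3", "issn": "9999-0000", "eissn": None},
]
issn_hits, eissn_hits = label_by_issn(pubs, journals)
```

Separating the two lists mirrors the way the counts are reported above: a first figure for ISSN-based matches and a second figure for the publications added by the eISSN.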


CrossRef

A third source that was used to assign an Open Access label to Scopus publications was CrossRef, based upon the DOIs available in both systems. This led to a total of 37,119 publications being identified as licensed as OA according to CrossRef.

PubMed Central

A fourth source used is the PubMed Central database. The linking is done in two ways. The first is based upon the DOIs available in both the PMC database and the Scopus database; this resulted in a total of 1,974,941 publications being labeled as OA in the Scopus environment. The second approach is based upon the PMID code (where PMID stands for PubMed ID) available in both the PMC database and the Scopus database; this resulted in a total of 1,102,937 publications being labeled as OA in the Scopus database.
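Identifier-based linking of this kind amounts to intersecting sets of DOIs and PMIDs. The Python sketch below shows the idea with hypothetical record layouts (`id`, `doi`, `pmid`). DOIs are lowercased before comparison because DOI names are case-insensitive. Whether the two totals reported above overlap is not stated in this note, so the union returned at the end is only one plausible way of combining the two passes.

```python
def normalize_doi(doi):
    """Lowercase and trim a DOI; return None when it is missing."""
    return doi.strip().lower() if doi else None


def match_by_identifiers(scopus_pubs, pmc_records):
    """Return (DOI matches, PMID matches, union) as sets of Scopus ids."""
    pmc_dois = {normalize_doi(r.get("doi")) for r in pmc_records} - {None}
    pmc_pmids = {r.get("pmid") for r in pmc_records} - {None}

    doi_hits = {
        p["id"] for p in scopus_pubs
        if normalize_doi(p.get("doi")) in pmc_dois
    }
    pmid_hits = {
        p["id"] for p in scopus_pubs
        if p.get("pmid") in pmc_pmids
    }
    return doi_hits, pmid_hits, doi_hits | pmid_hits


# Made-up records for illustration
scopus = [
    {"id": "s1", "doi": "10.1000/ABC", "pmid": None},
    {"id": "s2", "doi": None, "pmid": "123"},
    {"id": "s3", "doi": "10.1000/xyz", "pmid": "999"},
]
pmc = [
    {"doi": "10.1000/abc", "pmid": "555"},
    {"doi": None, "pmid": "123"},
]
doi_hits, pmid_hits, all_hits = match_by_identifiers(scopus, pmc)
```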

OpenAIRE

A fifth and final data source used to add OA labels to the Scopus database is the OpenAIRE database. OpenAIRE is a European database that aggregates metadata on OA publications from multiple institutional repositories (mostly in Europe), also including thematic repositories such as arxiv.org. The matching is done in two different ways: first, by matching on the DOIs or PMIDs available in both OpenAIRE and Scopus (resulting in 2,326,442 publications); and second, by fuzzy matching of diverse bibliographic metadata in both Scopus and OpenAIRE, including articles' titles, publication years and other bibliographic characteristics (resulting in a total of 2,976,620 publications). The methodology is similar to the methodology for citation matching employed at CWTS (Olensky et al., 2016).
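To give a sense of what fuzzy matching on bibliographic metadata involves, the sketch below compares normalized titles with Python's standard `difflib.SequenceMatcher` and requires identical publication years. It is not the CWTS algorithm; the similarity threshold of 0.9, the normalization rules and the record layout (`title`, `year`) are assumptions chosen purely for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", title.lower())
    return " ".join(cleaned.split())


def is_fuzzy_match(a, b, threshold=0.9):
    """Decide whether two records likely describe the same publication.

    Records are dicts with hypothetical keys 'title' and 'year'.
    The threshold is an illustrative assumption, not a calibrated value.
    """
    if a["year"] != b["year"]:
        return False
    score = SequenceMatcher(
        None, normalize_title(a["title"]), normalize_title(b["title"])
    ).ratio()
    return score >= threshold


# Made-up records for illustration
scopus_rec = {"title": "Open Access: A Large-Scale Analysis", "year": 2016}
openaire_rec = {"title": "Open access - a large scale analysis.", "year": 2016}
other_rec = {"title": "Citizen science in astronomy", "year": 2016}

same = is_fuzzy_match(scopus_rec, openaire_rec)
different = is_fuzzy_match(scopus_rec, other_rec)
```

A production matcher would typically use further fields (authors, journal, volume, pages) and would need to handle non-ASCII characters, which this simple normalization discards.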

In comparison with the previous studies in which our methodology of labeling OA was applied to the Web of Science (WoS), the implementation of the methodology on the Scopus database offers, with respect to the DOAJ and ROAD lists, the advantage that Scopus also contains the eISSN codes, contrary to WoS. This results in a relatively larger number of publications covered by the methodology related to DOAJ and ROAD; hence the numbers of publications as well as the share of publications in Gold OA are higher compared to the results obtained for the WoS database.

The fuzzy matching algorithms underlying the linking of OpenAIRE to Scopus have been revised and made more accurate in comparison to the previous version of the algorithm, which probably leads to higher recall as well. Because the methodology was applied first to WoS and now to Scopus, with the two databases differing in coverage and also in time periods, it is impossible to state what the exact difference is.

A new source of Open Access evidence: Unpaywall data

More recently, a new source of OA evidence appeared on the scene: the former oaDOI, nowadays the Unpaywall database (https://unpaywall.org/). We have not yet integrated it into the current analysis, but plan to integrate the information stored in this system in a next run of the analysis, thereby extending the OS Monitor with (potentially) additional OA publishing information. For now we have conducted a few analyses, comparing our methodology, and the numbers of publications labeled with OA tags, with the Unpaywall data (see also Martín-Martín et al., 2018). A few immediate differences worth mentioning are:

- Our methodology includes the ROAD list, a source not covered by Unpaywall;
- Our methodology includes the OpenAIRE dataset, a source not covered by Unpaywall; this implies that our methodology has a somewhat better coverage in Europe (which is the scope of OpenAIRE), while Unpaywall seems to slightly better represent OA publishing in the US and other non-European countries;
- Unpaywall discloses hybrid OA publishing, of which a subset consists of publications tagged as Bronze OA.

In the immediate future we will start working on the possibilities of including Unpaywall data in the methodology that tags publications with OA labels. This requires conducting research to better understand what data Unpaywall actually discloses, whether all types of OA evidence actually fit our criteria for building OA evidence, and whether there are other potential conceptual issues related to some typologies of OA provided by Unpaywall (e.g. it is not totally clear whether the Bronze OA typology disclosed by Unpaywall can really be considered a sustainable form of OA, cf. Martín-Martín et al., 2018).


References:

van Leeuwen, T.N., Meijer, I., Yegros-Yegros, A., & Costas, R. (2017). Developing indicators on Open Access by combining evidence from diverse data sources. Proceedings of the 2017 STI Conference, 6-8 September, Paris, France (https://sti2017.paris/) (https://arxiv.org/abs/1802.02827).

Martín-Martín, A., Costas, R., van Leeuwen, T., & Delgado López-Cózar, E. (2018). Evidence of Open Access of scientific publications in Google Scholar: a large-scale analysis. SocArXiv papers. DOI: 10.17605/OSF.IO/K54UV

Olensky, M., Schmidt, M., & Van Eck, N.J. (2016). Evaluation of the Citation Matching Algorithms of CWTS and iFQ in Comparison to the Web of Science. Journal of the Association for Information Science and Technology, 67(10), 2550-2564. doi:10.1002/asi.23590.