Measuring Open Access - Current State of the Art
Post on 17-Aug-2015
Measuring Open Access - Current State of the Art
by Éric Archambault, DPhil, President and CEO, Science-Metrix and 1science
ESSS 2015 - Leuven
2
The OA revolution is firmly in motion
- Librarians can play a key role
  - Traditional role – percolation
  - New role – diffusion
- Researchers too – be fruitful and multiply
- OA in academic publications is a complex beast
- Understanding the OA universe is key to useful measurement
BACKGROUND
3
- Definitions
- Key vantage points
- Measuring OA
- Results
- Conclusions
SYNOPSIS
4
Budapest Open Access Initiative (2002):
“The literature that should be freely accessible online is that which scholars give to the world without expectation of payment. Primarily, this category encompasses their peer-reviewed journal articles, but it also includes any unreviewed preprints that they might wish to put online for comment or to alert colleagues to important research findings. There are many degrees and kinds of wider and easier access to this literature. By open access to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.”
DEFINITIONS
5
Green OA: The main idea behind Green is self-archiving. Archiving can be done in institutional and thematic repositories.
Gold OA: The main idea behind Gold is that journal publishers make papers available. There are Gold journals (cover-to-cover), but also Gold papers published in subscription-based journals (aka “hybrid journals”).
DEFINITIONS
6
Complexity of OA definition and measurement, notably due to:
- Embargoes
- Transiency
- Rights of all kinds (to self-archive, to crawl, to recombine, to use commercially, etc.)
- Discoverability
DEFINITIONS
7
Rules of involvement in OA
VANTAGE POINTS
8
- Directories/registries of repositories
- Directory of OA journals
VANTAGE POINTS
9
Free or open source repository software: DSpace, EPrints, Archimede, DAITSS, Dienst, Enterprise-Wide Digital Repository and Archive, ETD-db, eXtensible Text Framework, Fedora, Greenstone, Invenio, IRPlus, Keystone Digital Library Suite, MOAI, Omeka, OPUS, PubMan, WEKO, PeerLibrary
Source: http://oad.simmons.edu/oadwiki/Free_and_open-source_repository_software
VANTAGE POINTS
10
Key repositories:
- arXiv.org – the mothership
- PubMed Central
- Europe PubMed Central
Aggregators:
- OpenAIRE
- BASE
- CORE
A typical repository, hosted by the Umeå universitet Library
VANTAGE POINTS
11
Despite, or perhaps because of, all the sources of OA available, it is very difficult to measure the availability of OA. Here we are concerned with the availability of peer-reviewed articles published in scholarly journals. Why? This is what policies and mandates are preoccupied with.
BIBLIOMETRICS – PROPORTION OF OA
12
Bottom-up measurement: One would have to harvest all the sources available and de-duplicate results. The main problem is how to determine reliably that items:
(1) were published (as opposed to an un-submitted manuscript)
(2) are peer-reviewed
(3) answer the “so what” question (you found there were 14,325,678 papers, so what?)
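The de-duplication step of a bottom-up harvest can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any study cited in this deck: the record fields (`doi`, `title`, `year`) and the function names are hypothetical, and records are matched on DOI when present, otherwise on a normalised title plus year.

```python
import re
import unicodedata


def normalise_title(title: str) -> str:
    """Lower-case, strip accents and punctuation so near-identical titles compare equal."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def record_key(record: dict) -> tuple:
    """Prefer the DOI as identity; fall back to normalised title and year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalise_title(record.get("title", "")), record.get("year"))


def deduplicate(records: list) -> list:
    """Keep the first record seen for each key (hypothetical harvest output)."""
    seen = {}
    for record in records:
        seen.setdefault(record_key(record), record)
    return list(seen.values())


# Illustrative records only; the DOI and titles are made up.
harvested = [
    {"doi": "10.1000/ABC", "title": "A Paper", "year": 2012},
    {"doi": "10.1000/abc", "title": "A paper.", "year": 2012},
    {"title": "Open Access: A Study", "year": 2012},
    {"title": "open access – a study", "year": 2012},
]
unique = deduplicate(harvested)
```

A known limitation of this sketch: a copy that carries a DOI and a copy of the same paper without one are not merged, and nothing here establishes that an item was published or peer-reviewed, which is the harder problem named above.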
BIBLIOMETRICS – PROPORTION OF OA
13
Top-down measurement: One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers. Main problems:
(1) there is no such database
(2) it is extremely tedious to check all of them
(3) how do you actually do that?
BIBLIOMETRICS – PROPORTION OF OA
14
Top-down measurement - Sampling: Considering the enormous task at hand, most authors have resorted to using sampling and search engines.
- Harnad and team sampled articles from the Web of Science
- Björk and team sampled articles from Scopus
- Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines
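The sampling approach can be sketched in a few lines. This is only an illustration of simple random sampling from a bibliographic database: `estimate_oa_share` and the `is_freely_available` callback are hypothetical names, and the callback stands in for whatever lookup (search engine, harvester) a given study used.

```python
import random


def estimate_oa_share(record_ids, sample_size, is_freely_available, seed=0):
    """Draw a simple random sample without replacement and return the
    share of sampled records for which a free copy was found."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(list(record_ids), sample_size)
    hits = sum(1 for record_id in sample if is_freely_available(record_id))
    return hits / sample_size


# Illustrative use with a stand-in availability check (not real data).
share = estimate_oa_share(range(10_000), 500, lambda record_id: record_id % 2 == 0)
```

The result is the measured share only; it still carries the instrument error and the sampling error discussed on the metrology slides.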
BIBLIOMETRICS – PROPORTION OF OA
15
Dealing with search engines:
- Use user-friendly meta-search engines such as DuckDuckGo or DogPile
- Try to stay below the radar using mainstream search engines
Neither solution feels remotely comfortable.
The other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science).
BIBLIOMETRICS – PROPORTION OF OA
16
Divergence from the real measure is due to:
- Capacity to design an instrument that provides the true value (a function of recall and retrieval precision)
- Capacity to increase statistical significance through large samples
SAMPLING AND METROLOGY
17
A true positive (tp) in the present case is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. False positives and false negatives (fp and fn) are the converse: an fp is a paper flagged by the instrument that is not actually freely available, and an fn is a freely available paper that the instrument fails to find. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results, and is calculated as follows:
\[ \text{Retrieval precision} = \frac{tp}{tp + fp} \]
Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:
\[ \text{Recall} = \frac{tp}{tp + fn} \]
Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects the limits of the instrument. The adjustment made in the previous study is based on the following formula:
\[ \text{Adjustment} = \frac{tp + fn}{tp + fp} \]
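The three ratios described on this slide can be computed as below. The counts are illustrative only, not results from the study, and applying the adjustment by multiplying the measured share is an assumption made for this sketch (the adjustment is the ratio of papers that really are OA to papers the instrument flags as OA, which equals precision divided by recall).

```python
def retrieval_precision(tp: int, fp: int) -> float:
    """Share of papers flagged as OA by the instrument that really are OA."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of papers that really are OA which the instrument finds."""
    return tp / (tp + fn)


def adjustment(tp: int, fp: int, fn: int) -> float:
    """(tp + fn) / (tp + fp): actual positives over flagged positives."""
    return (tp + fn) / (tp + fp)


def adjusted_share(measured_share: float, tp: int, fp: int, fn: int) -> float:
    """Assumption for this sketch: the adjustment is applied multiplicatively."""
    return measured_share * adjustment(tp, fp, fn)


# Illustrative calibration counts (hypothetical, not from the study).
tp, fp, fn = 80, 5, 20
precision_value = retrieval_precision(tp, fp)
recall_value = recall(tp, fn)
corrected = adjusted_share(0.40, tp, fp, fn)
```

With an instrument that misses more papers than it wrongly flags (fn greater than fp), the adjustment is above 1 and the adjusted share sits above the measured one.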
SAMPLING AND METROLOGY
18
Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population (N) is finite and known (which is the case here, as the population from which we are sampling is the Scopus database) and is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (which will be set at 0.95 in the study), ME is calculated as follows:
\[ ME = Z \sqrt{\frac{p(1 - p)}{n} \cdot \frac{N - n}{N - 1}} + \frac{0.5}{n} \]
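The margin of error for a proportion, with a finite population correction (N - n)/(N - 1) and a continuity correction 0.5/n, can be computed as in the sketch below. One assumption is made loudly here: the slide's "0.95" is read as a two-sided 95% confidence level, for which the critical value of the standard normal distribution is about 1.96. The function name and the example figures are hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(p: float, n: int, N: int, confidence: float = 0.95) -> float:
    """ME = Z * sqrt(p(1-p)/n * (N-n)/(N-1)) + 0.5/n.

    p: measured proportion, n: sample size, N: population size.
    Assumption: `confidence` is a two-sided confidence level, converted
    to a critical value Z of the standard normal distribution.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 0.95
    finite_population_correction = (N - n) / (N - 1)
    standard_error = math.sqrt(p * (1 - p) / n * finite_population_correction)
    return z * standard_error + 0.5 / n


# Illustrative figures only: 1,000 papers sampled from a population of 1,000,000.
me = margin_of_error(p=0.5, n=1000, N=1_000_000)
```

When the sample covers the whole population (n equal to N), the square-root term vanishes and only the continuity correction remains.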
SAMPLING AND METROLOGY
19
The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.
MEASURING THE % OF OA PAPERS
20
For Gold Journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.
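The journal-matching step can be sketched as a lookup keyed on normalised ISSN and E-ISSN, with a year check so that a paper counts as Gold only if its journal was already Gold when the paper appeared. The data layout (`gold_since`, `issn`, `eissn`, `year`), the function names and the ISSNs below are all hypothetical; matching on journal names, which the slide also mentions, is left out of this sketch.

```python
def normalise_issn(issn: str) -> str:
    """Drop the hyphen and surrounding spaces; upper-case the check character."""
    return issn.replace("-", "").strip().upper()


def build_gold_index(gold_journals: list) -> dict:
    """Map every known ISSN and E-ISSN to the first year the journal was Gold."""
    index = {}
    for journal in gold_journals:
        for field in ("issn", "eissn"):
            if journal.get(field):
                index[normalise_issn(journal[field])] = journal["gold_since"]
    return index


def is_gold_paper(paper: dict, gold_index: dict) -> bool:
    """True if the paper's journal was Gold in the paper's publication year."""
    for field in ("issn", "eissn"):
        if paper.get(field):
            gold_since = gold_index.get(normalise_issn(paper[field]))
            if gold_since is not None and paper["year"] >= gold_since:
                return True
    return False


# Made-up journal list and sample records, for illustration only.
gold_index = build_gold_index(
    [{"issn": "1234-5678", "eissn": "8765-4321", "gold_since": 2005}]
)
sample = [
    {"issn": "12345678", "year": 2010},
    {"eissn": "8765-4321", "year": 2001},
    {"issn": "0000-0000", "year": 2010},
]
gold_share = sum(is_gold_paper(p, gold_index) for p in sample) / len(sample)
```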
MEASURING THE % OF OA PAPERS
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013
RESULTS
[Figure: % of papers available in OA (y-axis, 0 to 60) by year (x-axis, 1994 to 2014). Series: Adjusted OA April 2014; Adjusted OA April 2013; Measured OA April 2014; Measured OA April 2013.]
22
Translation of OA availability between April 2013 and April 2014
RESULTS
[Figure: % of papers available in OA (y-axis, 0 to 60) by year (x-axis, 2003 to 2012). Series: Adjusted OA April 2014; Adjusted OA April 2013. Exponential trendlines, in the order they appear on the slide: y = 2E-21 e^(0.0234x), R² = 0.976; y = 3E-17 e^(0.0186x), R² = 0.9473.]
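Trendlines of the form y = a·e^(bx), like those labelled on this chart, can be obtained by ordinary least squares on ln(y). The sketch below is an assumption about how such a fit could be computed, not a description of how the slide's trendlines were produced; `fit_exponential` and the synthetic data are hypothetical. Because x is the calendar year (around 2000), the fitted a is extremely small, which is why coefficients on the order of 1E-17 appear.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y); returns (a, b).

    All y values must be strictly positive.
    """
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    # Slope and intercept of the straight line ln(y) = ln(a) + b * x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b


# Synthetic points generated from a known curve, to check the fit recovers it.
years = list(range(2003, 2013))
values = [3e-17 * math.exp(0.0186 * year) for year in years]
a, b = fit_exponential(years, values)
```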
23
OA backfilling between April 2013 and April 2014 of papers published in 1996–2011
RESULTS
[Figure: Number of OA papers backfilled between April 2013 and April 2014 (y-axis, 0 to 120,000) by year (x-axis, 1994 to 2014). Exponential trendline: y = 2E-112 e^(0.1335x), R² = 0.9976.]
24
Growth of the number of papers available in OA as measured in April 2014, 1996–2013
RESULTS
[Figure: Number of papers in OA (y-axis, 0 to 1,000,000) by year (x-axis, 1994 to 2014). Series: Adjusted OA; Measured OA. Exponential trendline: y = 2E-73 e^(0.09x), R² = 0.9971.]
25
Scientific impact of OA and non-OA papers published in 1996–2011
RESULTS
[Figure: Average of relative citations (ARC; 1 = world average) (y-axis, 0.0 to 1.8) by publication year (x-axis, 1996 to 2011). Series: OA; All Papers; Not OA.]
26
Impact contest by OA type, by field, 2009–2011. NB: Here Gold refers to full-Gold journals, not to Gold papers in hybrid journals.
RESULTS
Field | 1st place (Type, ARC) | 2nd place (Type, ARC) | 3rd place (Type, ARC) | Least impact (Type, ARC)
Agriculture, Fisheries & Forestry | Green OA 1.57 | Other OA 1.32 | Not OA 0.88 | Gold OA 0.51
Biology | Other OA 1.37 | Green OA 1.30 | Not OA 0.69 | Gold OA 0.47
Biomedical Research | Other OA 1.23 | Green OA 1.10 | Gold OA 0.91 | Not OA 0.65
Built Environment & Design | Green OA 1.56 | Other OA 1.28 | Not OA 0.86 | Gold OA 0.19
Chemistry | Other OA 1.34 | Green OA 1.28 | Not OA 0.95 | Gold OA 0.34
Clinical Medicine | Other OA 1.56 | Green OA 1.08 | Gold OA 0.64 | Not OA 0.63
Communication & Textual Studies | Other OA 1.82 | Green OA 1.51 | Not OA 0.66 | Gold OA 0.73
Earth & Environmental Sciences | Green OA 1.46 | Other OA 1.26 | Gold OA 0.98 | Not OA 0.72
Economics & Business | Green OA 1.46 | Other OA 1.30 | Not OA 0.71 | Gold OA 0.22
Enabling & Strategic Technologies | Green OA 1.68 | Other OA 1.53 | Not OA 0.83 | Gold OA 0.52
Engineering | Green OA 1.84 | Other OA 1.38 | Not OA 0.83 | Gold OA 0.55
General Arts, Humanities & Social Sciences | Green OA 1.74 | Other OA 1.49 | Not OA 0.73 | Gold OA 0.13
General Science & Technology | Green OA 2.56 | Other OA 2.24 | Gold OA 0.69 | Not OA 0.11
Historical Studies | Green OA 2.37 | Other OA 1.61 | Not OA 0.76 | Gold OA 0.37
Information & Communication Technology | Green OA 1.62 | Other OA 1.36 | Gold OA 0.76 | Not OA 0.69
Mathematics & Statistics | Green OA 1.35 | Other OA 1.11 | Not OA 0.75 | Gold OA 0.67
Philosophy & Theology | Green OA 1.72 | Other OA 1.63 | Gold OA 0.86 | Not OA 0.72
Physics & Astronomy | Green OA 1.43 | Gold OA 1.18 | Other OA 1.04 | Not OA 0.73
Psychology & Cognitive Sciences | Other OA 1.35 | Green OA 1.31 | Not OA 0.66 | Gold OA 0.59
Public Health & Health Services | Other OA 1.38 | Green OA 1.30 | Not OA 0.76 | Gold OA 0.71
Social Sciences | Green OA 1.54 | Other OA 1.44 | Not OA 0.76 | Gold OA 0.52
Visual & Performing Arts | Green OA 2.16 | Other OA 1.86 | Not OA 0.77 | Gold OA 0.29
Total | Green OA 1.53 | Other OA 1.36 | Not OA 0.76 | Gold OA 0.61
27
- OA is a fast-moving phenomenon
- It is also quite complex to understand and to measure
- Uptake of OA is limited by heterogeneity and challenges in discovery
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects:
- Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available
- “Backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve
CONCLUSION
29
- On average, openly accessible papers have a decidedly greater impact
- In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, Gold journals surpass in impact publishing in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established
- It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do
CONCLUSION
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers
Visit Science-Metrix to learn about our evaluation and measurement activities
THANK YOU
2
The OA revolution is firmly in motion Librarians can play a key role
Traditional role ndash percolation New role ndash diffusion
Researchers too ndash be fruitful and multiply OA in academic publications complex beast Understanding the OA universe is key to useful measurement
BACKGROUND
3
Definitions Key vantage points Measuring OA Results Conclusions
SYNOPSIS
4
Budapest Open Access Initiative (2002) ldquoThe literature that should be freely accessible online is that which scholars give to the world without expectation of payment Primarily this category encompasses their peer-reviewed journal articles but it also includes any unreviewed preprints that they might wish to put online for comment or to alert colleagues to important research findings There are many degrees and kinds of wider and easier access to this literature By open access to this literature we mean its free availability on the public internet permitting any users to read download copy distribute print search or link to the full texts of these articles crawl them for indexing pass them as data to software or use them for any other lawful purpose without financial legal or technical barriers other than those inseparable from gaining access to the internet itself The only constraint on reproduction and distribution and the only role for copyright in this domain should be to give authors control over the integrity of their work and the right to be properly acknowledged and citedrdquo
DEFINITIONS
5
Green OA The main idea behind Green is self-archiving Archiving can be done in institutional and thematic repositories
Gold OA The main idea behind Gold is that journal publishers make papers available There are Gold journals (cover-to-cover) but also Gold papers published in subscription-based journals (aka ldquohybrid journalsrdquo)
DEFINITIONS
6
Complexity of OA definition and measurement notably due to
Embargoes Transiency Rights of all kind (to self-archive to crawl to recombine to use commercially etc) Discoverability
DEFINITIONS
7
Rules of involvement in OA
VANTAGE POINTS
8
Directoriesregistries of repositories Directory of OA journals
VANTAGE POINTS
9
Free or open source repository software DSpace EPrints Archimede DAITSS Dienst Enterprise-Wide Digital Repository and Archive ETD-db eXtensible Text Framework Fedora Greenstone Invenio IRPlus Keystone Digital Library Suite MOAI Omeka OPUS PubMan WEKO PeerLibrary
Source httpoadsimmonseduoadwikiFree_and_open-source_repository_software
VANTAGE POINTS
10
Key repositories arXivorg ndash the mothership PubMed Central Europe PubMed Central
Aggregators OpenAire BASE CORE
A typical repository hosted by the Umearing universitet Library
VANTAGE POINTS
11
Despite or perhaps because of all the sources of OA available it is very difficult to measure the availability of OA Here we are concerned about the availability of peer-reviewed articles published in scholarly journals Why ndash this is what policies and mandates are preoccupied with
BIBLIOMETRICS ndash PROPORTION OF OA
12
Bottom-up measurement One would have to harvest all the sources available and de-duplicate results The main problem is how to determine reliably that items
(1) were published (as opposed to an un-submitted manuscript) (2) are peer-reviewed (3) answer the ldquoso whatrdquo question (you found there were 14325678 papers so what)
BIBLIOMETRICS ndash PROPORTION OF OA
13
Top-down measurement One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers Main problems
(1) there is no such database (2) extremely tedious to check all of them (3) how do you actually do that
BIBLIOMETRICS ndash PROPORTION OF OA
14
Top-down measurement - Sampling Considering the enormous task at hand most authors have resorted to using sampling and search engines Harnad and team sampled articles from the Web of Science Bjoumlrk and team sampled articles from Scopus Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines
BIBLIOMETRICS ndash PROPORTION OF OA
15
Dealing with search engines Use user-friendly meta-search engines such as DuckDuckGo or DogPile Try to stay below the radar using mainstream search engines Neither solution feels remotely confortable
Other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science)
BIBLIOMETRICS ndash PROPORTION OF OA
16
Divergence from the real measure is due to Capacity to design instrument that provides true value (function of recall and retrieval precision) Capacity to increase statistical significance through large samples
SAMPLING AND METROLOGY
17
A true positive (tp) in the present case is a paper known to be available in OA which is found by the harvesting instrument developed for the current project A true negative (tn) is an article which is not available for free and is not found by the instrument False positives and negatives (fp and fn) are the converse of the later Retrieval precision also called positive predictive value provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows
Retrieval Precision = 119905119905119905119905+119891119905
Recall also called true positive rate or sensitivity is the capacity to correctly identify a large proportion of the positive records
Recall = 119905119905119905119905+119891119891
Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score which can then be applied to recalibrate the results to obtain a truer measure one that corrects the limits of the instrument The adjustment made in the previous study is based on the following formula
Adjustment = 119905119905+119891119891119905119905+119891119905
SAMPLING AND METROLOGY
18
Statistical precision can be assessed with the margin of error (ME) For a proportion (p) where the population is finite and known (which is the case here as the population from which we are sampling is the Scopus database) (N) is not systematically much larger than the sample size (n) and in which the values are discrete (for example papers are discrete as one does not publish one third of a paper) given a critical score Z (which will be set at 095 in the study) ME is calculated as follows
119872119872 = 119885 119905 1minus119905 119873minus119891119891 119873minus1
+ 05119891
SAMPLING AND METROLOGY
19
The harvesting engine developed by Science-Metrix searches specific sites including Scielo PubMed Central Research Gate and CiteSeerX It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR Finally and in addition a portion of the harvesting engine works in the cloud and searches for freely available papers
MEASURING THE OF OA PAPERS
20
For Gold Journal OA articles an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central This was done by matching journalsrsquo ISSN E-ISSN and names from Scopus to the relevant records in the sample
MEASURING THE OF OA PAPERS
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014 1996ndash2013
RESULTS
0
5
10
15
20
25
30
35
40
45
50
55
60
1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014
o
f pa
pers
ava
ilabl
e in
OA
Adjusted OA April 2014Adjusted OA April 2013Measured OA April 2014Measured OA April 2013
22
Translation of OA availability between April 2013 and April 2014
RESULTS
y = 2E-21e00234x
Rsup2 = 0976
y = 3E-17e00186x
Rsup2 = 09473
0
5
10
15
20
25
30
35
40
45
50
55
60
2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
o
f pa
pers
ava
ilabl
e in
OA
Adjusted OA April 2014
Adjusted OA April 2013
23
OA backfilling between April 2013 and April 2014 of papers published in 1996ndash2011
RESULTS
y = 2E-112e01335x
Rsup2 = 09976
0
20000
40000
60000
80000
100000
120000
1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014
Num
ber
of
OA p
aper
s ba
ckfil
led
betw
een
Apr
il 20
13 a
nd A
pril
2014
24
Growth of the number of papers available in OA as measured in April 2014 1996ndash2013
RESULTS
y = 2E-73e009x
Rsup2 = 09971
0
100000
200000
300000
400000
500000
600000
700000
800000
900000
1000000
1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014
Num
ber
of p
aper
s in
O
A
Adjusted OA
Measured OA
25
Scientific impact of OA and non-OA papers published in 1996ndash2011
RESULTS
00
02
04
06
08
10
12
14
16
18
1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
Ave
rage
of re
lativ
e ci
tatio
ns
(ARC 1
= w
orld
ave
rgae
)
OAAll PapersNot OA
26
Impact contest by OA type by field 2009ndash2011 NB Here Gold refers to full-Gold journals not to Gold papers in hybrid journals
RESULTS
1st place 2nd place 3rd place Least impact
Type ARC Type ARC Type ARC Type ARCAgriculture Fisheries amp Forestry Green OA 157 Other OA 132 Not OA 088 Gold OA 051Biology Other OA 137 Green OA 130 Not OA 069 Gold OA 047Biomedical Research Other OA 123 Green OA 110 Gold OA 091 Not OA 065Built Environment amp Design Green OA 156 Other OA 128 Not OA 086 Gold OA 019Chemistry Other OA 134 Green OA 128 Not OA 095 Gold OA 034Clinical Medicine Other OA 156 Green OA 108 Gold OA 064 Not OA 063Communication amp Textual Studies Other OA 182 Green OA 151 Not OA 066 Gold OA 073Earth amp Environmental Sciences Green OA 146 Other OA 126 Gold OA 098 Not OA 072Economics amp Business Green OA 146 Other OA 130 Not OA 071 Gold OA 022Enabling amp Strategic Technologies Green OA 168 Other OA 153 Not OA 083 Gold OA 052Engineering Green OA 184 Other OA 138 Not OA 083 Gold OA 055General Arts Humanities amp Social Sciences Green OA 174 Other OA 149 Not OA 073 Gold OA 013General Science amp Technology Green OA 256 Other OA 224 Gold OA 069 Not OA 011Historical Studies Green OA 237 Other OA 161 Not OA 076 Gold OA 037Information amp Communication Technology Green OA 162 Other OA 136 Gold OA 076 Not OA 069Mathematics amp Statistics Green OA 135 Other OA 111 Not OA 075 Gold OA 067Philosophy amp Theology Green OA 172 Other OA 163 Gold OA 086 Not OA 072Physics amp Astronomy Green OA 143 Gold OA 118 Other OA 104 Not OA 073Psychology amp Cognitive Sciences Other OA 135 Green OA 131 Not OA 066 Gold OA 059Public Health amp Health Services Other OA 138 Green OA 130 Not OA 076 Gold OA 071Social Sciences Green OA 154 Other OA 144 Not OA 076 Gold OA 052Visual amp Performing Arts Green OA 216 Other OA 186 Not OA 077 Gold OA 029Total Green OA 153 Other OA 136 Not OA 076 Gold OA 061
Field
27
OA is a fast-moving phenomenon It is also quite complex to understand and to measure Uptake of OA limited by heterogeneity and challenges in discovery
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects
Organic growth as more publishers researchers and librarians increasingly make freshly published papers freely available ldquoBackfillingrdquo of already published papers by researchers and librarians and dis-embargoing of previously locked papers by publishers contribute to a translation of the availability curve
CONCLUSION
29
On average openly accessible papers have a decidedly greater impact
In 7 fields publishing in subscription-based journals and not self-archiving is the worst possible strategy In these fields Gold journals surpass in impact publishing in subscription-based journals with no self-archiving even if these Gold journals are much younger and less established
No longer adequate to publish and forget papers One has to actively market papers and think of post-publishing communication strategies Considering the high value of the knowledge contained in papers and their high public cost working to maximise diffusion and uptake is the least one can do
CONCLUSION
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers
Visit Science-Metrix to learn about our evaluation and measurement activities
THANK YOU
- Slide Number 1
- Background
- SYNOPSIS
- Definitions
- Definitions
- Definitions
- Vantage Points
- Vantage Points
- Vantage Points
- Vantage Points
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Sampling and METROLOGY
- Sampling and METROLOGY
- Sampling and METROLOGY
- Measuring the of OA papers
- Measuring the of OA papers
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- Conclusion
- Conclusion
- Conclusion
- Thank you
-
3
Definitions Key vantage points Measuring OA Results Conclusions
SYNOPSIS
4
Budapest Open Access Initiative (2002) ldquoThe literature that should be freely accessible online is that which scholars give to the world without expectation of payment Primarily this category encompasses their peer-reviewed journal articles but it also includes any unreviewed preprints that they might wish to put online for comment or to alert colleagues to important research findings There are many degrees and kinds of wider and easier access to this literature By open access to this literature we mean its free availability on the public internet permitting any users to read download copy distribute print search or link to the full texts of these articles crawl them for indexing pass them as data to software or use them for any other lawful purpose without financial legal or technical barriers other than those inseparable from gaining access to the internet itself The only constraint on reproduction and distribution and the only role for copyright in this domain should be to give authors control over the integrity of their work and the right to be properly acknowledged and citedrdquo
DEFINITIONS
5
Green OA The main idea behind Green is self-archiving Archiving can be done in institutional and thematic repositories
Gold OA The main idea behind Gold is that journal publishers make papers available There are Gold journals (cover-to-cover) but also Gold papers published in subscription-based journals (aka ldquohybrid journalsrdquo)
DEFINITIONS
6
Complexity of OA definition and measurement notably due to
Embargoes Transiency Rights of all kind (to self-archive to crawl to recombine to use commercially etc) Discoverability
DEFINITIONS
7
Rules of involvement in OA
VANTAGE POINTS
8
Directoriesregistries of repositories Directory of OA journals
VANTAGE POINTS
9
Free or open source repository software DSpace EPrints Archimede DAITSS Dienst Enterprise-Wide Digital Repository and Archive ETD-db eXtensible Text Framework Fedora Greenstone Invenio IRPlus Keystone Digital Library Suite MOAI Omeka OPUS PubMan WEKO PeerLibrary
Source httpoadsimmonseduoadwikiFree_and_open-source_repository_software
VANTAGE POINTS
10
Key repositories arXivorg ndash the mothership PubMed Central Europe PubMed Central
Aggregators OpenAire BASE CORE
A typical repository hosted by the Umearing universitet Library
VANTAGE POINTS
11
Despite or perhaps because of all the sources of OA available it is very difficult to measure the availability of OA Here we are concerned about the availability of peer-reviewed articles published in scholarly journals Why ndash this is what policies and mandates are preoccupied with
BIBLIOMETRICS ndash PROPORTION OF OA
12
Bottom-up measurement One would have to harvest all the sources available and de-duplicate results The main problem is how to determine reliably that items
(1) were published (as opposed to an un-submitted manuscript) (2) are peer-reviewed (3) answer the ldquoso whatrdquo question (you found there were 14325678 papers so what)
BIBLIOMETRICS ndash PROPORTION OF OA
13
Top-down measurement One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers Main problems
(1) there is no such database (2) extremely tedious to check all of them (3) how do you actually do that
BIBLIOMETRICS ndash PROPORTION OF OA
14
Top-down measurement - Sampling Considering the enormous task at hand most authors have resorted to using sampling and search engines Harnad and team sampled articles from the Web of Science Bjoumlrk and team sampled articles from Scopus Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines
BIBLIOMETRICS ndash PROPORTION OF OA
15
Dealing with search engines Use user-friendly meta-search engines such as DuckDuckGo or DogPile Try to stay below the radar using mainstream search engines Neither solution feels remotely confortable
Other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science)
BIBLIOMETRICS ndash PROPORTION OF OA
16
Divergence from the real measure is due to Capacity to design instrument that provides true value (function of recall and retrieval precision) Capacity to increase statistical significance through large samples
SAMPLING AND METROLOGY
17
A true positive (tp) in the present case is a paper known to be available in OA which is found by the harvesting instrument developed for the current project A true negative (tn) is an article which is not available for free and is not found by the instrument False positives and negatives (fp and fn) are the converse of the later Retrieval precision also called positive predictive value provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows
Retrieval Precision = 119905119905119905119905+119891119905
Recall also called true positive rate or sensitivity is the capacity to correctly identify a large proportion of the positive records
Recall = 119905119905119905119905+119891119891
Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score which can then be applied to recalibrate the results to obtain a truer measure one that corrects the limits of the instrument The adjustment made in the previous study is based on the following formula
Adjustment = 119905119905+119891119891119905119905+119891119905
SAMPLING AND METROLOGY
18
Statistical precision can be assessed with the margin of error (ME). For a proportion (p) where the population is finite and known (which is the case here, as the population from which we are sampling is the Scopus database), where the population size (N) is not systematically much larger than the sample size (n), and in which the values are discrete (papers are discrete, as one does not publish one third of a paper), given a critical score Z (set for the 0.95 confidence level in the study), ME is calculated as follows:
$$ME = Z \sqrt{\frac{p(1 - p)}{n} \cdot \frac{N - n}{N - 1}} + \frac{0.5}{n}$$
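A short sketch of this calculation in Python: the standard error of a proportion, scaled by the finite population correction (N - n)/(N - 1), multiplied by Z, plus a continuity correction of 0.5/n for discrete values. It assumes that Z is the two-sided normal critical value for the chosen confidence level (about 1.96 at 0.95). The function name and the example numbers are illustrative only.

```python
import math
from statistics import NormalDist


def margin_of_error(p: float, n: int, N: int, confidence: float = 0.95) -> float:
    """Margin of error for a proportion p from a sample of n out of N items."""
    # Two-sided critical value of the standard normal (assumed meaning of Z).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Finite population correction: shrinks the error as n approaches N.
    fpc = (N - n) / (N - 1)
    # Continuity correction for discrete counts.
    continuity = 0.5 / n
    return z * math.sqrt(p * (1 - p) / n * fpc) + continuity


# Hypothetical example: half of a 1,000-paper sample is OA,
# drawn from a database of one million papers.
me = margin_of_error(p=0.5, n=1_000, N=1_000_000)
print(f"ME = {me:.4f}")
```

When the sample covers the whole population (n = N), the square-root term vanishes and only the continuity correction remains.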
SAMPLING AND METROLOGY
19
The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.
MEASURING THE % OF OA PAPERS
20
For Gold Journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.
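The matching step can be sketched as follows. This is not the study's code: the lookup table, the placeholder ISSNs and the function names are hypothetical, and matching on journal names is left out. The sketch normalises ISSNs and E-ISSNs, then flags a paper as Gold when any of its journal identifiers belongs to a journal that was already Gold in the paper's publication year.

```python
import re
from typing import Dict, Iterable, Optional


def normalise_issn(raw: str) -> Optional[str]:
    """Return the ISSN as 8 characters without hyphen, or None if malformed."""
    cleaned = re.sub(r"[^0-9Xx]", "", raw).upper()
    return cleaned if re.fullmatch(r"\d{7}[\dX]", cleaned) else None


# Hypothetical lookup built from a Gold journal list:
# normalised ISSN or E-ISSN -> first year the journal was Gold.
GOLD_SINCE: Dict[str, int] = {
    "12345679": 2005,  # placeholder ISSN
    "9999000X": 2010,  # placeholder E-ISSN
}


def is_gold_journal_paper(
    issns: Iterable[str],
    pub_year: int,
    gold_since: Dict[str, int] = GOLD_SINCE,
) -> bool:
    """True if any identifier matches a journal that was Gold in pub_year."""
    for raw in issns:
        issn = normalise_issn(raw)
        if issn is not None and issn in gold_since and pub_year >= gold_since[issn]:
            return True
    return False


print(is_gold_journal_paper(["1234-5679"], 2008))  # journal already Gold
print(is_gold_journal_paper(["1234-5679"], 2001))  # published before it was Gold
```

Keeping a start year per journal matters because a journal that converted to Gold should not make its earlier, subscription-era papers count as Gold.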
MEASURING THE % OF OA PAPERS
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013
RESULTS
[Figure: line chart. Y-axis: % of papers available in OA (0 to 60). X-axis: year (1994 to 2014). Series: Adjusted OA April 2014, Adjusted OA April 2013, Measured OA April 2014, Measured OA April 2013.]
22
Translation of OA availability between April 2013 and April 2014
RESULTS
[Figure: line chart with exponential trendlines. Y-axis: % of papers available in OA (0 to 60). X-axis: year (2003 to 2012). Series: Adjusted OA April 2014, Adjusted OA April 2013. Trendlines: y = 2E-21 e^(0.0234x), R² = 0.976; y = 3E-17 e^(0.0186x), R² = 0.9473.]
23
OA backfilling between April 2013 and April 2014 of papers published in 1996–2011
RESULTS
[Figure: chart with exponential trendline. Y-axis: number of OA papers backfilled between April 2013 and April 2014 (0 to 120,000). X-axis: year (1994 to 2014). Trendline: y = 2E-112 e^(0.1335x), R² = 0.9976.]
24
Growth of the number of papers available in OA as measured in April 2014, 1996–2013
RESULTS
[Figure: chart with exponential trendline. Y-axis: number of papers in OA (0 to 1,000,000). X-axis: year (1994 to 2014). Series: Adjusted OA, Measured OA. Trendline: y = 2E-73 e^(0.09x), R² = 0.9971.]
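The growth charts report exponential trendlines of the form y = a·e^(bx) together with an R² value. One common way to obtain such a fit is ordinary least squares on ln(y), sketched below in plain Python. The series is synthetic, not the study's data, and the presentation does not state which fitting method was used, so this is one plausible reading only.

```python
import math


def fit_exponential(xs, ys):
    """Least-squares fit of ln(y) = ln(a) + b*x; returns (a, b, r_squared)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx
    ln_a = mean_l - b * mean_x
    # R squared is computed on the log scale, where the model is linear.
    ss_res = sum((l - (ln_a + b * x)) ** 2 for x, l in zip(xs, logs))
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    return math.exp(ln_a), b, 1 - ss_res / ss_tot


years = list(range(1996, 2014))
# Synthetic series: 9% continuous growth with small alternating noise.
papers = [
    200_000 * math.exp(0.09 * (y - 1996)) * (1.02 if y % 2 else 0.98)
    for y in years
]

# x is counted from 1996 to keep the coefficient a on a readable scale.
a, b, r2 = fit_exponential([y - 1996 for y in years], papers)
doubling_time = math.log(2) / b

print(f"a = {a:,.0f}, b = {b:.4f}, R^2 = {r2:.4f}")
print(f"doubling time = {doubling_time:.1f} years")
```

The exponent b is the continuous growth rate, and ln(2)/b gives the doubling time. Using raw calendar years as x instead of years since 1996 leaves b unchanged but makes a extremely small, which would be consistent with the very small leading coefficients shown on the slides.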
25
Scientific impact of OA and non-OA papers published in 1996–2011
RESULTS
[Figure: line chart. Y-axis: average of relative citations (ARC, 1 = world average), 0.0 to 1.8. X-axis: year (1996 to 2011). Series: OA, All Papers, Not OA.]
26
Impact contest by OA type, by field, 2009–2011. NB: Here, Gold refers to full-Gold journals, not to Gold papers in hybrid journals.
RESULTS
Field | 1st place: Type (ARC) | 2nd place: Type (ARC) | 3rd place: Type (ARC) | Least impact: Type (ARC)
Agriculture, Fisheries & Forestry | Green OA (1.57) | Other OA (1.32) | Not OA (0.88) | Gold OA (0.51)
Biology | Other OA (1.37) | Green OA (1.30) | Not OA (0.69) | Gold OA (0.47)
Biomedical Research | Other OA (1.23) | Green OA (1.10) | Gold OA (0.91) | Not OA (0.65)
Built Environment & Design | Green OA (1.56) | Other OA (1.28) | Not OA (0.86) | Gold OA (0.19)
Chemistry | Other OA (1.34) | Green OA (1.28) | Not OA (0.95) | Gold OA (0.34)
Clinical Medicine | Other OA (1.56) | Green OA (1.08) | Gold OA (0.64) | Not OA (0.63)
Communication & Textual Studies | Other OA (1.82) | Green OA (1.51) | Not OA (0.66) | Gold OA (0.73)
Earth & Environmental Sciences | Green OA (1.46) | Other OA (1.26) | Gold OA (0.98) | Not OA (0.72)
Economics & Business | Green OA (1.46) | Other OA (1.30) | Not OA (0.71) | Gold OA (0.22)
Enabling & Strategic Technologies | Green OA (1.68) | Other OA (1.53) | Not OA (0.83) | Gold OA (0.52)
Engineering | Green OA (1.84) | Other OA (1.38) | Not OA (0.83) | Gold OA (0.55)
General Arts, Humanities & Social Sciences | Green OA (1.74) | Other OA (1.49) | Not OA (0.73) | Gold OA (0.13)
General Science & Technology | Green OA (2.56) | Other OA (2.24) | Gold OA (0.69) | Not OA (0.11)
Historical Studies | Green OA (2.37) | Other OA (1.61) | Not OA (0.76) | Gold OA (0.37)
Information & Communication Technology | Green OA (1.62) | Other OA (1.36) | Gold OA (0.76) | Not OA (0.69)
Mathematics & Statistics | Green OA (1.35) | Other OA (1.11) | Not OA (0.75) | Gold OA (0.67)
Philosophy & Theology | Green OA (1.72) | Other OA (1.63) | Gold OA (0.86) | Not OA (0.72)
Physics & Astronomy | Green OA (1.43) | Gold OA (1.18) | Other OA (1.04) | Not OA (0.73)
Psychology & Cognitive Sciences | Other OA (1.35) | Green OA (1.31) | Not OA (0.66) | Gold OA (0.59)
Public Health & Health Services | Other OA (1.38) | Green OA (1.30) | Not OA (0.76) | Gold OA (0.71)
Social Sciences | Green OA (1.54) | Other OA (1.44) | Not OA (0.76) | Gold OA (0.52)
Visual & Performing Arts | Green OA (2.16) | Other OA (1.86) | Not OA (0.77) | Gold OA (0.29)
Total | Green OA (1.53) | Other OA (1.36) | Not OA (0.76) | Gold OA (0.61)
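As a quick cross-check of the table, the sketch below tallies which OA type sits in the "Least impact" column for each of the 22 fields (the Total row is excluded). The dictionary is transcribed from the table above; the variable names are illustrative.

```python
from collections import Counter

# OA type in the "Least impact" column, per field, transcribed from the table.
LEAST_IMPACT = {
    "Agriculture, Fisheries & Forestry": "Gold OA",
    "Biology": "Gold OA",
    "Biomedical Research": "Not OA",
    "Built Environment & Design": "Gold OA",
    "Chemistry": "Gold OA",
    "Clinical Medicine": "Not OA",
    "Communication & Textual Studies": "Gold OA",
    "Earth & Environmental Sciences": "Not OA",
    "Economics & Business": "Gold OA",
    "Enabling & Strategic Technologies": "Gold OA",
    "Engineering": "Gold OA",
    "General Arts, Humanities & Social Sciences": "Gold OA",
    "General Science & Technology": "Not OA",
    "Historical Studies": "Gold OA",
    "Information & Communication Technology": "Not OA",
    "Mathematics & Statistics": "Gold OA",
    "Philosophy & Theology": "Not OA",
    "Physics & Astronomy": "Not OA",
    "Psychology & Cognitive Sciences": "Gold OA",
    "Public Health & Health Services": "Gold OA",
    "Social Sciences": "Gold OA",
    "Visual & Performing Arts": "Gold OA",
}

tally = Counter(LEAST_IMPACT.values())
not_oa_last = sorted(f for f, t in LEAST_IMPACT.items() if t == "Not OA")

print(f"Fields where Not OA has the least impact: {tally['Not OA']}")
print(f"Fields where Gold OA has the least impact: {tally['Gold OA']}")
for field in not_oa_last:
    print(" -", field)
```

Not OA ranks last in 7 fields and Gold OA in the remaining 15, while Green OA and Other OA never rank last.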
27
OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and challenges in discovery.
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects:
(1) Organic growth, as publishers, researchers and librarians increasingly make freshly published papers freely available. (2) “Backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which together contribute to a translation of the availability curve.
CONCLUSION
29
On average, openly accessible papers have a decidedly greater impact.
In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, papers in Gold journals have a greater impact than papers published in subscription-based journals without self-archiving, even though these Gold journals are much younger and less established.
It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.
CONCLUSION
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers
Visit Science-Metrix to learn about our evaluation and measurement activities
THANK YOU
6
Complexity of OA definition and measurement notably due to
Embargoes Transiency Rights of all kind (to self-archive to crawl to recombine to use commercially etc) Discoverability
DEFINITIONS
7
Rules of involvement in OA
VANTAGE POINTS
8
Directoriesregistries of repositories Directory of OA journals
VANTAGE POINTS
9
Free or open source repository software DSpace EPrints Archimede DAITSS Dienst Enterprise-Wide Digital Repository and Archive ETD-db eXtensible Text Framework Fedora Greenstone Invenio IRPlus Keystone Digital Library Suite MOAI Omeka OPUS PubMan WEKO PeerLibrary
Source httpoadsimmonseduoadwikiFree_and_open-source_repository_software
VANTAGE POINTS
10
Key repositories arXivorg ndash the mothership PubMed Central Europe PubMed Central
Aggregators OpenAire BASE CORE
A typical repository hosted by the Umearing universitet Library
VANTAGE POINTS
11
Despite or perhaps because of all the sources of OA available it is very difficult to measure the availability of OA Here we are concerned about the availability of peer-reviewed articles published in scholarly journals Why ndash this is what policies and mandates are preoccupied with
BIBLIOMETRICS ndash PROPORTION OF OA
12
Bottom-up measurement One would have to harvest all the sources available and de-duplicate results The main problem is how to determine reliably that items
(1) were published (as opposed to an un-submitted manuscript) (2) are peer-reviewed (3) answer the ldquoso whatrdquo question (you found there were 14325678 papers so what)
BIBLIOMETRICS ndash PROPORTION OF OA
13
Top-down measurement
One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers. Main problems:
(1) there is no such database
(2) it is extremely tedious to check all of them
(3) how do you actually do that?
BIBLIOMETRICS – PROPORTION OF OA
14
Top-down measurement - Sampling
Considering the enormous task at hand, most authors have resorted to using sampling and search engines:
- Harnad and team sampled articles from the Web of Science
- Björk and team sampled articles from Scopus
- Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines
BIBLIOMETRICS – PROPORTION OF OA
15
Dealing with search engines
- Use user-friendly meta-search engines such as DuckDuckGo or DogPile
- Try to stay below the radar using mainstream search engines
Neither solution feels remotely comfortable.
The other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science).
BIBLIOMETRICS – PROPORTION OF OA
16
Divergence from the real measure is due to:
- Capacity to design an instrument that provides the true value (a function of recall and retrieval precision)
- Capacity to increase statistical significance through large samples
SAMPLING AND METROLOGY
17
A true positive (tp) in the present case is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. False positives and false negatives (fp and fn) are the converse of the latter: a false positive is a paper the instrument reports as OA although it is not freely available, and a false negative is an OA paper the instrument misses. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results, and is calculated as follows:
$$\text{Retrieval Precision} = \frac{tp}{tp + fp}$$
Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:
$$\text{Recall} = \frac{tp}{tp + fn}$$
Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects the limits of the instrument. The adjustment made in the previous study is based on the following formula:
$$\text{Adjustment} = \frac{tp + fn}{tp + fp}$$
SAMPLING AND METROLOGY
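A minimal sketch of the three quantities defined on this slide may help. The counts are invented for illustration. Applying the adjustment by multiplying the measured share is an assumption about how the score is used, based on the fact that the measured positives number tp + fp while the positives that really exist number tp + fn.

```python
def retrieval_precision(tp, fp):
    """Share of the instrument's positive results that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of the truly OA papers that the instrument finds."""
    return tp / (tp + fn)

def adjustment(tp, fp, fn):
    """Ratio of real positives (tp + fn) to reported positives (tp + fp)."""
    return (tp + fn) / (tp + fp)

# Hypothetical validation counts from a manually checked subsample.
tp, fp, fn = 450, 10, 140

precision_value = retrieval_precision(tp, fp)
recall_value = recall(tp, fn)
adjustment_value = adjustment(tp, fp, fn)

# Hypothetical measured share of OA papers, recalibrated with the adjustment score.
measured_share = 0.40
adjusted_share = measured_share * adjustment_value

print(round(precision_value, 3), round(recall_value, 3), round(adjustment_value, 3))
print(round(adjusted_share, 3))
```

Note that the adjustment score equals retrieval precision divided by recall, so an instrument with high precision but modest recall yields a score above 1 and the adjusted share exceeds the measured one.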
18
Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population (N) is finite and known (which is the case here, as the population from which we are sampling is the Scopus database) and is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (which will be set at 0.95 in the study), ME is calculated as follows:
$$ME = Z \sqrt{\frac{p(1-p)}{n} \cdot \frac{N-n}{N-1}} + \frac{0.5}{n}$$
SAMPLING AND METROLOGY
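The margin of error above combines a finite population correction, (N - n)/(N - 1), with a continuity correction, 0.5/n. The sketch below evaluates it for invented values of p, n and N. The value z = 1.96 is an assumption: it is the usual normal critical value for a 95% confidence level, which is how the 0.95 on the slide is read here.

```python
import math

def margin_of_error(p, n, N, z):
    """Margin of error for a proportion p from a sample of n drawn from a population of N."""
    finite_population_correction = (N - n) / (N - 1)
    standard_error = math.sqrt(p * (1 - p) / n * finite_population_correction)
    continuity_correction = 0.5 / n
    return z * standard_error + continuity_correction

# Hypothetical inputs: 50% OA in a sample of 1,000 papers from a population of 1,000,000.
me = margin_of_error(p=0.5, n=1000, N=1_000_000, z=1.96)
print(round(me, 4))

# When the whole population is sampled, only the continuity correction remains.
me_census = margin_of_error(p=0.5, n=100, N=100, z=1.96)
print(me_census)
```

The finite population correction matters mostly for small strata (for example a small field in a single year), where the sample can be a sizeable fraction of the population.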
19
The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.
MEASURING THE % OF OA PAPERS
20
For Gold journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.
MEASURING THE % OF OA PAPERS
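The journal matching described on this slide can be sketched as follows. Everything concrete here is hypothetical: the field names (`issn`, `eissn`, `name`, `gold_since`, `journal`, `year`), the normalisation rules and the sample records are assumptions made for illustration, not the actual DOAJ, PubMed Central or Scopus schemas.

```python
def normalise_issn(issn):
    """Strip hyphens and spaces and upper-case the check character ('x' -> 'X')."""
    return issn.replace("-", "").replace(" ", "").upper() if issn else ""

def normalise_name(name):
    """Lower-case a journal name and collapse repeated whitespace."""
    return " ".join(name.lower().split())

def build_gold_index(gold_journals):
    """Map each normalised ISSN, E-ISSN and name to the year the journal became Gold."""
    index = {}
    for journal in gold_journals:
        keys = [
            normalise_issn(journal.get("issn")),
            normalise_issn(journal.get("eissn")),
            normalise_name(journal["name"]),
        ]
        for key in keys:
            if key:
                index[key] = journal["gold_since"]
    return index

def is_gold_paper(paper, index):
    """True if any identifier matches a Gold journal that was Gold in the paper's year."""
    keys = [
        normalise_issn(paper.get("issn")),
        normalise_issn(paper.get("eissn")),
        normalise_name(paper["journal"]),
    ]
    return any(key in index and paper["year"] >= index[key] for key in keys)

# Hypothetical Gold journal list and hypothetical sampled papers.
gold_journals = [
    {"name": "Journal of Open Results", "issn": "1234-5678", "eissn": "8765-432X", "gold_since": 2008},
]
sample = [
    {"journal": "Journal of Open Results", "issn": "12345678", "eissn": None, "year": 2010},
    {"journal": "J. Open Results", "issn": None, "eissn": "8765-432x", "year": 2005},
    {"journal": "Subscription Letters", "issn": "1111-2222", "eissn": None, "year": 2012},
]

index = build_gold_index(gold_journals)
gold_flags = [is_gold_paper(paper, index) for paper in sample]
gold_share = sum(gold_flags) / len(sample)
print(gold_flags, round(gold_share, 3))
```

The second paper matches on E-ISSN but predates the journal's hypothetical switch to Gold, so it is not counted. Matching on names alone is the weakest of the three keys, since abbreviations and title changes defeat a simple string comparison.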
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013
RESULTS
[Figure: x-axis 1994–2014; y-axis "% of papers available in OA" (0–60). Series: Adjusted OA April 2014, Adjusted OA April 2013, Measured OA April 2014, Measured OA April 2013.]
22
Translation of OA availability between April 2013 and April 2014
RESULTS
[Figure: x-axis 2003–2012; y-axis "% of papers available in OA" (0–60). Series: Adjusted OA April 2014, Adjusted OA April 2013. Fitted trendlines: $y = 2 \times 10^{-21} e^{0.0234x}$ ($R^2 = 0.976$) and $y = 3 \times 10^{-17} e^{0.0186x}$ ($R^2 = 0.9473$).]
23
OA backfilling between April 2013 and April 2014 of papers published in 1996–2011
RESULTS
[Figure: x-axis 1994–2014; y-axis "Number of OA papers backfilled between April 2013 and April 2014" (0–120,000). Fitted trendline: $y = 2 \times 10^{-112} e^{0.1335x}$ ($R^2 = 0.9976$).]
24
Growth of the number of papers available in OA as measured in April 2014, 1996–2013
RESULTS
[Figure: x-axis 1994–2014; y-axis "Number of papers in OA" (0–1,000,000). Series: Adjusted OA, Measured OA. Fitted trendline: $y = 2 \times 10^{-73} e^{0.09x}$ ($R^2 = 0.9971$).]
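The trendlines on the results slides have the form y = a·e^(bx), with x the publication year. One common way to fit such a curve is ordinary least squares on ln(y), and the exponent b translates directly into an annual growth rate, e^b - 1, and a doubling time, ln(2)/b. The sketch below shows both. The yearly counts are synthetic, generated from an assumed exponent, and how the trendlines on the slides were actually fitted is not stated in the talk.

```python
import math

def fit_exponential(xs, ys):
    """Least-squares fit of ln(y) = ln(a) + b*x; returns (a, b) for y = a * exp(b*x)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b

def annual_growth(b):
    """Year-on-year growth rate implied by the exponent b."""
    return math.exp(b) - 1

def doubling_time(b):
    """Number of years needed for y to double."""
    return math.log(2) / b

# Synthetic counts: 100,000 papers in 1996, growing with an assumed exponent of 0.09.
years = list(range(1996, 2014))
counts = [100_000 * math.exp(0.09 * (year - 1996)) for year in years]

a, b = fit_exponential(years, counts)
print(round(b, 4), round(annual_growth(b), 4), round(doubling_time(b), 2))
```

Read this way, an exponent of 0.09 (the value taken here from the trendline for the number of papers in OA) corresponds to growth of about 9.4% per year and a doubling time of about 7.7 years.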
25
Scientific impact of OA and non-OA papers published in 1996–2011
RESULTS
[Figure: x-axis 1996–2011; y-axis "Average of relative citations (ARC; 1 = world average)" (0.0–1.8). Series: OA, All Papers, Not OA.]
26
Impact contest by OA type, by field, 2009–2011. NB: Here, Gold refers to full-Gold journals, not to Gold papers in hybrid journals.
RESULTS
| Field | 1st place | ARC | 2nd place | ARC | 3rd place | ARC | Least impact | ARC |
|---|---|---|---|---|---|---|---|---|
| Agriculture, Fisheries & Forestry | Green OA | 1.57 | Other OA | 1.32 | Not OA | 0.88 | Gold OA | 0.51 |
| Biology | Other OA | 1.37 | Green OA | 1.30 | Not OA | 0.69 | Gold OA | 0.47 |
| Biomedical Research | Other OA | 1.23 | Green OA | 1.10 | Gold OA | 0.91 | Not OA | 0.65 |
| Built Environment & Design | Green OA | 1.56 | Other OA | 1.28 | Not OA | 0.86 | Gold OA | 0.19 |
| Chemistry | Other OA | 1.34 | Green OA | 1.28 | Not OA | 0.95 | Gold OA | 0.34 |
| Clinical Medicine | Other OA | 1.56 | Green OA | 1.08 | Gold OA | 0.64 | Not OA | 0.63 |
| Communication & Textual Studies | Other OA | 1.82 | Green OA | 1.51 | Not OA | 0.66 | Gold OA | 0.73 |
| Earth & Environmental Sciences | Green OA | 1.46 | Other OA | 1.26 | Gold OA | 0.98 | Not OA | 0.72 |
| Economics & Business | Green OA | 1.46 | Other OA | 1.30 | Not OA | 0.71 | Gold OA | 0.22 |
| Enabling & Strategic Technologies | Green OA | 1.68 | Other OA | 1.53 | Not OA | 0.83 | Gold OA | 0.52 |
| Engineering | Green OA | 1.84 | Other OA | 1.38 | Not OA | 0.83 | Gold OA | 0.55 |
| General Arts, Humanities & Social Sciences | Green OA | 1.74 | Other OA | 1.49 | Not OA | 0.73 | Gold OA | 0.13 |
| General Science & Technology | Green OA | 2.56 | Other OA | 2.24 | Gold OA | 0.69 | Not OA | 0.11 |
| Historical Studies | Green OA | 2.37 | Other OA | 1.61 | Not OA | 0.76 | Gold OA | 0.37 |
| Information & Communication Technology | Green OA | 1.62 | Other OA | 1.36 | Gold OA | 0.76 | Not OA | 0.69 |
| Mathematics & Statistics | Green OA | 1.35 | Other OA | 1.11 | Not OA | 0.75 | Gold OA | 0.67 |
| Philosophy & Theology | Green OA | 1.72 | Other OA | 1.63 | Gold OA | 0.86 | Not OA | 0.72 |
| Physics & Astronomy | Green OA | 1.43 | Gold OA | 1.18 | Other OA | 1.04 | Not OA | 0.73 |
| Psychology & Cognitive Sciences | Other OA | 1.35 | Green OA | 1.31 | Not OA | 0.66 | Gold OA | 0.59 |
| Public Health & Health Services | Other OA | 1.38 | Green OA | 1.30 | Not OA | 0.76 | Gold OA | 0.71 |
| Social Sciences | Green OA | 1.54 | Other OA | 1.44 | Not OA | 0.76 | Gold OA | 0.52 |
| Visual & Performing Arts | Green OA | 2.16 | Other OA | 1.86 | Not OA | 0.77 | Gold OA | 0.29 |
| Total | Green OA | 1.53 | Other OA | 1.36 | Not OA | 0.76 | Gold OA | 0.61 |
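As a quick check on the table on this slide, the "Least impact" column can be tallied by OA type. The dictionary below is transcribed by hand from that column (the Total row is left out). The tally itself is an illustration and not part of the original analysis.

```python
from collections import Counter

# "Least impact" OA type per field, transcribed from the table (Total row excluded).
least_impact = {
    "Agriculture, Fisheries & Forestry": "Gold OA",
    "Biology": "Gold OA",
    "Biomedical Research": "Not OA",
    "Built Environment & Design": "Gold OA",
    "Chemistry": "Gold OA",
    "Clinical Medicine": "Not OA",
    "Communication & Textual Studies": "Gold OA",
    "Earth & Environmental Sciences": "Not OA",
    "Economics & Business": "Gold OA",
    "Enabling & Strategic Technologies": "Gold OA",
    "Engineering": "Gold OA",
    "General Arts, Humanities & Social Sciences": "Gold OA",
    "General Science & Technology": "Not OA",
    "Historical Studies": "Gold OA",
    "Information & Communication Technology": "Not OA",
    "Mathematics & Statistics": "Gold OA",
    "Philosophy & Theology": "Not OA",
    "Physics & Astronomy": "Not OA",
    "Psychology & Cognitive Sciences": "Gold OA",
    "Public Health & Health Services": "Gold OA",
    "Social Sciences": "Gold OA",
    "Visual & Performing Arts": "Gold OA",
}

counts = Counter(least_impact.values())
not_oa_last = sorted(field for field, oa_type in least_impact.items() if oa_type == "Not OA")

print(counts)
print(not_oa_last)
```

Fields where "Not OA" sits in the last column are those where publishing in a subscription-based journal without self-archiving has the lowest ARC of the four options.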
27
- OA is a fast-moving phenomenon
- It is also quite complex to understand and to measure
- Uptake of OA is limited by heterogeneity and challenges in discovery
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects:
- Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available
- "Backfilling" of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve
CONCLUSION
29
On average, openly accessible papers have a decidedly greater impact.
In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, Gold journals surpass in impact publishing in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established.
It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.
CONCLUSION
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers
Visit Science-Metrix to learn about our evaluation and measurement activities
THANK YOU
- Slide Number 1
- Background
- SYNOPSIS
- Definitions
- Definitions
- Definitions
- Vantage Points
- Vantage Points
- Vantage Points
- Vantage Points
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Sampling and METROLOGY
- Sampling and METROLOGY
- Sampling and METROLOGY
- Measuring the of OA papers
- Measuring the of OA papers
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- Conclusion
- Conclusion
- Conclusion
- Thank you
-
9
Free or open source repository software DSpace EPrints Archimede DAITSS Dienst Enterprise-Wide Digital Repository and Archive ETD-db eXtensible Text Framework Fedora Greenstone Invenio IRPlus Keystone Digital Library Suite MOAI Omeka OPUS PubMan WEKO PeerLibrary
Source httpoadsimmonseduoadwikiFree_and_open-source_repository_software
VANTAGE POINTS
10
Key repositories arXivorg ndash the mothership PubMed Central Europe PubMed Central
Aggregators OpenAire BASE CORE
A typical repository hosted by the Umearing universitet Library
VANTAGE POINTS
11
Despite or perhaps because of all the sources of OA available it is very difficult to measure the availability of OA Here we are concerned about the availability of peer-reviewed articles published in scholarly journals Why ndash this is what policies and mandates are preoccupied with
BIBLIOMETRICS ndash PROPORTION OF OA
12
Bottom-up measurement: One would have to harvest all the sources available and de-duplicate the results. The main problem is how to determine reliably that items
(1) were published (as opposed to an un-submitted manuscript); (2) are peer-reviewed; (3) answer the “so what” question (you found there were 14,325,678 papers, so what?)
BIBLIOMETRICS – PROPORTION OF OA
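The de-duplication step of a bottom-up harvest can be sketched as follows. This is a minimal illustration, not the method used in the study: the record fields (`source`, `doi`, `title`, `year`), the repository names and the matching rule (DOI first, otherwise normalised title plus year) are all assumptions made for the example.

```python
import re

def normalise_doi(doi):
    """Lower-case a DOI and strip a resolver prefix, if any."""
    if not doi:
        return None
    doi = doi.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi) or None

def normalise_title(title):
    """Collapse case, punctuation and whitespace so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def deduplicate(records):
    """Keep the first record seen for each DOI (or title + year when no DOI)."""
    seen = set()
    unique = []
    for rec in records:
        doi = normalise_doi(rec.get("doi"))
        if doi:
            key = ("doi", doi)
        else:
            key = ("title", normalise_title(rec["title"]), rec.get("year"))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Hypothetical harvested records from four made-up sources.
harvested = [
    {"source": "repo_a", "doi": "10.1000/XYZ123", "title": "A Study of X", "year": 2012},
    {"source": "repo_b", "doi": "https://doi.org/10.1000/xyz123", "title": "A study of X.", "year": 2012},
    {"source": "repo_c", "doi": None, "title": "Another Paper", "year": 2011},
    {"source": "repo_d", "doi": None, "title": "another  paper", "year": 2011},
]
unique = deduplicate(harvested)
print(len(harvested), "harvested,", len(unique), "unique")
```

Note that this sketch does not reconcile a copy that carries a DOI with a copy of the same paper that lacks one, and it says nothing about whether an item was published or peer-reviewed, which is the harder problem named on the slide.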
13
Top-down measurement: One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers. Main problems:
(1) there is no such database; (2) it is extremely tedious to check all of them; (3) how do you actually do that?
BIBLIOMETRICS – PROPORTION OF OA
14
Top-down measurement - Sampling: Considering the enormous task at hand, most authors have resorted to using sampling and search engines. Harnad and team sampled articles from the Web of Science. Björk and team sampled articles from Scopus. Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines.
BIBLIOMETRICS – PROPORTION OF OA
15
Dealing with search engines: Use user-friendly meta-search engines such as DuckDuckGo or DogPile, or try to stay below the radar when using mainstream search engines. Neither solution feels remotely comfortable.
The other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science).
BIBLIOMETRICS – PROPORTION OF OA
16
Divergence from the real measure is due to: (1) the capacity to design an instrument that provides the true value (a function of recall and retrieval precision); (2) the capacity to increase statistical significance through large samples.
SAMPLING AND METROLOGY
17
A true positive (tp), in the present case, is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. False positives and false negatives (fp and fn) are the converse of the latter. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows:
$\text{Retrieval precision} = \frac{tp}{tp + fp}$
Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:
$\text{Recall} = \frac{tp}{tp + fn}$
Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects the limits of the instrument. The adjustment made in the previous study is based on the following formula:
$\text{Adjustment} = \frac{tp + fn}{tp + fp}$
SAMPLING AND METROLOGY
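The three quantities defined on this slide can be computed directly from a validation sample. The sketch below is illustrative only: the counts and the measured share are hypothetical, and multiplying the measured share by the adjustment score is an assumption about how the recalibration is applied (the slide only says the score is "applied to recalibrate the results").

```python
def retrieval_precision(tp, fp):
    """Share of items flagged as OA by the instrument that really are OA."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of truly OA items that the instrument manages to find."""
    return tp / (tp + fn)

def adjustment(tp, fp, fn):
    """Ratio of actual positives (tp + fn) to reported positives (tp + fp)."""
    return (tp + fn) / (tp + fp)

# Hypothetical validation counts, not taken from the study.
tp, fp, fn = 450, 10, 140

precision = retrieval_precision(tp, fp)
rec = recall(tp, fn)
adj = adjustment(tp, fp, fn)

# Assumed use of the score: scale the measured OA share by the adjustment.
measured_share = 0.40  # hypothetical measured proportion of OA papers
adjusted_share = measured_share * adj

print(f"precision={precision:.3f} recall={rec:.3f} adjustment={adj:.3f}")
print(f"measured={measured_share:.3f} adjusted={adjusted_share:.3f}")
```

The adjustment score equals precision divided by recall, so an instrument with high precision but modest recall yields a score above 1 and an adjusted share higher than the measured one.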
18
Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population (N) is finite and known (which is the case here, as the population from which we are sampling is the Scopus database) and is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (set in the study for a confidence level of 0.95, for which Z is approximately 1.96), ME is calculated as follows:
$ME = Z \sqrt{\frac{p(1 - p)}{n} \cdot \frac{N - n}{N - 1}} + \frac{0.5}{n}$
SAMPLING AND METROLOGY
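A minimal sketch of the margin-of-error calculation follows, reading the formula as a normal-approximation interval with a finite population correction, (N - n)/(N - 1), and a continuity correction, 0.5/n. The default Z of 1.96 assumes a two-sided 95% confidence level, and the sample and population sizes are hypothetical.

```python
import math

def margin_of_error(p, n, N, z=1.96):
    """Margin of error for a proportion p from a sample of n out of N items.

    Uses the finite population correction (N - n) / (N - 1) and adds the
    continuity correction 0.5 / n for discrete counts.
    """
    fpc = (N - n) / (N - 1)
    return z * math.sqrt(p * (1 - p) / n * fpc) + 0.5 / n

# Hypothetical figures: 20,000 papers sampled from a population of 1,000,000.
me_large = margin_of_error(p=0.5, n=20_000, N=1_000_000)
me_small = margin_of_error(p=0.5, n=2_000, N=1_000_000)
print(f"ME with n=20,000: {me_large:.4f}")
print(f"ME with n=2,000:  {me_small:.4f}")
```

When the sample covers the whole population (n = N), the square-root term vanishes and only the continuity correction remains, which is a quick sanity check on the implementation.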
19
The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.
MEASURING THE % OF OA PAPERS
20
For Gold Journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals’ ISSN, E-ISSN and names from Scopus to the relevant records in the sample.
MEASURING THE % OF OA PAPERS
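The journal-matching step can be sketched as a lookup on normalised ISSNs combined with a year check. Everything specific here is an assumption for illustration: the field names (`issn`, `eissn`, `gold_since`, `year`), the made-up journal and the sample papers. Matching on journal names, which the slide also mentions, is left out.

```python
def normalise_issn(issn):
    """Return an ISSN as eight upper-case characters without hyphen, or None."""
    if not issn:
        return None
    cleaned = issn.replace("-", "").strip().upper()
    return cleaned if len(cleaned) == 8 else None

def build_gold_index(gold_journals):
    """Map each ISSN and E-ISSN to the first year the journal counts as Gold."""
    index = {}
    for journal in gold_journals:
        for raw in (journal.get("issn"), journal.get("eissn")):
            issn = normalise_issn(raw)
            if issn:
                index[issn] = journal["gold_since"]
    return index

def is_gold(paper, index):
    """True if the paper's journal was Gold in the paper's publication year."""
    for raw in (paper.get("issn"), paper.get("eissn")):
        issn = normalise_issn(raw)
        if issn in index and paper["year"] >= index[issn]:
            return True
    return False

# Hypothetical journal list (standing in for DOAJ / PubMed Central records).
gold_journals = [
    {"name": "Hypothetical Journal A", "issn": "1234-5678",
     "eissn": "8765-4321", "gold_since": 2008},
]
# Hypothetical sample of papers (standing in for Scopus records).
sample = [
    {"issn": "1234-5678", "eissn": None, "year": 2010},   # Gold journal, Gold year
    {"issn": None, "eissn": "87654321", "year": 2007},    # before the journal went Gold
    {"issn": "1111-2222", "eissn": None, "year": 2012},   # journal not in the list
]

index = build_gold_index(gold_journals)
gold_share = sum(is_gold(p, index) for p in sample) / len(sample)
print(f"Gold journal share in sample: {gold_share:.2f}")
```

Keeping a per-journal start year is one way to handle journals that converted to Gold partway through the period, so that earlier papers are not counted as Gold.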
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013
RESULTS
[Figure: line chart. x-axis: year (1994–2014); y-axis: % of papers available in OA (0–60). Series: Adjusted OA April 2014; Adjusted OA April 2013; Measured OA April 2014; Measured OA April 2013]
22
Translation of OA availability between April 2013 and April 2014
RESULTS
[Figure: line chart with two exponential trend lines. x-axis: year (2003–2012); y-axis: % of papers available in OA (0–60). Series: Adjusted OA April 2014; Adjusted OA April 2013. Trend lines: $y = 2 \times 10^{-21} e^{0.0234x}$, $R^2 = 0.976$; $y = 3 \times 10^{-17} e^{0.0186x}$, $R^2 = 0.9473$]
23
OA backfilling between April 2013 and April 2014 of papers published in 1996–2011
RESULTS
[Figure: chart with exponential trend line. x-axis: year (1994–2014); y-axis: number of OA papers backfilled between April 2013 and April 2014 (0–120,000). Trend line: $y = 2 \times 10^{-112} e^{0.1335x}$, $R^2 = 0.9976$]
24
Growth of the number of papers available in OA as measured in April 2014, 1996–2013
RESULTS
[Figure: chart with exponential trend line. x-axis: year (1994–2014); y-axis: number of papers in OA (0–1,000,000). Series: Adjusted OA; Measured OA. Trend line: $y = 2 \times 10^{-73} e^{0.09x}$, $R^2 = 0.9971$]
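The trend line fitted to this figure has an exponent of 0.09 per year. As a worked illustration of what that rate implies, the sketch below converts a continuous exponential rate into a doubling time and an equivalent year-on-year growth rate. The function names are made up for the example, and the calculation assumes the fitted curve is a pure exponential in the publication year.

```python
import math

def doubling_time(rate):
    """Years needed for y = a * exp(rate * x) to double."""
    return math.log(2) / rate

def annual_growth(rate):
    """Year-on-year growth implied by a continuous exponential rate."""
    return math.exp(rate) - 1

rate = 0.09  # exponent of the trend line fitted to the number of OA papers
print(f"Doubling time: {doubling_time(rate):.1f} years")
print(f"Annual growth: {annual_growth(rate):.1%}")
```

The same two functions can be applied to the other fitted exponents in the deck, for example 0.1335 for the backfilling curve.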
25
Scientific impact of OA and non-OA papers published in 1996–2011
RESULTS
[Figure: line chart. x-axis: year (1996–2011); y-axis: average of relative citations (ARC; 1 = world average), 0.0–1.8. Series: OA; All Papers; Not OA]
26
Impact contest by OA type, by field, 2009–2011. NB: Here, Gold refers to full-Gold journals, not to Gold papers in hybrid journals.
RESULTS
| Field | 1st place: Type | ARC | 2nd place: Type | ARC | 3rd place: Type | ARC | Least impact: Type | ARC |
|---|---|---|---|---|---|---|---|---|
| Agriculture, Fisheries & Forestry | Green OA | 1.57 | Other OA | 1.32 | Not OA | 0.88 | Gold OA | 0.51 |
| Biology | Other OA | 1.37 | Green OA | 1.30 | Not OA | 0.69 | Gold OA | 0.47 |
| Biomedical Research | Other OA | 1.23 | Green OA | 1.10 | Gold OA | 0.91 | Not OA | 0.65 |
| Built Environment & Design | Green OA | 1.56 | Other OA | 1.28 | Not OA | 0.86 | Gold OA | 0.19 |
| Chemistry | Other OA | 1.34 | Green OA | 1.28 | Not OA | 0.95 | Gold OA | 0.34 |
| Clinical Medicine | Other OA | 1.56 | Green OA | 1.08 | Gold OA | 0.64 | Not OA | 0.63 |
| Communication & Textual Studies | Other OA | 1.82 | Green OA | 1.51 | Not OA | 0.66 | Gold OA | 0.73 |
| Earth & Environmental Sciences | Green OA | 1.46 | Other OA | 1.26 | Gold OA | 0.98 | Not OA | 0.72 |
| Economics & Business | Green OA | 1.46 | Other OA | 1.30 | Not OA | 0.71 | Gold OA | 0.22 |
| Enabling & Strategic Technologies | Green OA | 1.68 | Other OA | 1.53 | Not OA | 0.83 | Gold OA | 0.52 |
| Engineering | Green OA | 1.84 | Other OA | 1.38 | Not OA | 0.83 | Gold OA | 0.55 |
| General Arts, Humanities & Social Sciences | Green OA | 1.74 | Other OA | 1.49 | Not OA | 0.73 | Gold OA | 0.13 |
| General Science & Technology | Green OA | 2.56 | Other OA | 2.24 | Gold OA | 0.69 | Not OA | 0.11 |
| Historical Studies | Green OA | 2.37 | Other OA | 1.61 | Not OA | 0.76 | Gold OA | 0.37 |
| Information & Communication Technology | Green OA | 1.62 | Other OA | 1.36 | Gold OA | 0.76 | Not OA | 0.69 |
| Mathematics & Statistics | Green OA | 1.35 | Other OA | 1.11 | Not OA | 0.75 | Gold OA | 0.67 |
| Philosophy & Theology | Green OA | 1.72 | Other OA | 1.63 | Gold OA | 0.86 | Not OA | 0.72 |
| Physics & Astronomy | Green OA | 1.43 | Gold OA | 1.18 | Other OA | 1.04 | Not OA | 0.73 |
| Psychology & Cognitive Sciences | Other OA | 1.35 | Green OA | 1.31 | Not OA | 0.66 | Gold OA | 0.59 |
| Public Health & Health Services | Other OA | 1.38 | Green OA | 1.30 | Not OA | 0.76 | Gold OA | 0.71 |
| Social Sciences | Green OA | 1.54 | Other OA | 1.44 | Not OA | 0.76 | Gold OA | 0.52 |
| Visual & Performing Arts | Green OA | 2.16 | Other OA | 1.86 | Not OA | 0.77 | Gold OA | 0.29 |
| Total | Green OA | 1.53 | Other OA | 1.36 | Not OA | 0.76 | Gold OA | 0.61 |
27
OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and challenges in discovery.
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects:
Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available. “Backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve.
CONCLUSION
29
On average, openly accessible papers have a decidedly greater impact.
In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, papers in Gold journals have more impact than papers published in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established.
It is no longer adequate to publish papers and forget them. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.
CONCLUSION
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers
Visit Science-Metrix to learn about our evaluation and measurement activities
THANK YOU
- Slide Number 1
- Background
- SYNOPSIS
- Definitions
- Definitions
- Definitions
- Vantage Points
- Vantage Points
- Vantage Points
- Vantage Points
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Bibliometrics ndash Proportion of OA
- Sampling and METROLOGY
- Sampling and METROLOGY
- Sampling and METROLOGY
- Measuring the of OA papers
- Measuring the of OA papers
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- Conclusion
- Conclusion
- Conclusion
- Thank you
-
12
Bottom-up measurement One would have to harvest all the sources available and de-duplicate results The main problem is how to determine reliably that items
(1) were published (as opposed to an un-submitted manuscript) (2) are peer-reviewed (3) answer the ldquoso whatrdquo question (you found there were 14325678 papers so what)
BIBLIOMETRICS ndash PROPORTION OF OA
13
Top-down measurement One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers Main problems
(1) there is no such database (2) extremely tedious to check all of them (3) how do you actually do that
BIBLIOMETRICS ndash PROPORTION OF OA
14
Top-down measurement - Sampling Considering the enormous task at hand most authors have resorted to using sampling and search engines Harnad and team sampled articles from the Web of Science Bjoumlrk and team sampled articles from Scopus Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines
BIBLIOMETRICS ndash PROPORTION OF OA
15
Dealing with search engines Use user-friendly meta-search engines such as DuckDuckGo or DogPile Try to stay below the radar using mainstream search engines Neither solution feels remotely confortable
Other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science)
BIBLIOMETRICS ndash PROPORTION OF OA
16
Divergence from the real measure is due to Capacity to design instrument that provides true value (function of recall and retrieval precision) Capacity to increase statistical significance through large samples
SAMPLING AND METROLOGY
17
A true positive (tp), in the present case, is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. A false positive (fp) is a paper the instrument reports as OA although it is not freely available; a false negative (fn) is an OA paper the instrument fails to find. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows:
$$\text{Retrieval Precision} = \frac{tp}{tp + fp}$$
Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:
$$\text{Recall} = \frac{tp}{tp + fn}$$
Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects the limits of the instrument. The adjustment made in the previous study is based on the following formula:
$$\text{Adjustment} = \frac{tp + fn}{tp + fp}$$
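The three ratios above can be sketched in a few lines of Python. The counts and the measured share below are hypothetical, chosen only for illustration. Multiplying the measured share by the adjustment score is an assumption about how the recalibration is applied; the slides do not spell this step out.

```python
def retrieval_precision(tp: int, fp: int) -> float:
    """Share of the instrument's positive results that are truly OA."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of the truly OA papers that the instrument finds."""
    return tp / (tp + fn)


def adjustment(tp: int, fp: int, fn: int) -> float:
    """Ratio of actual positives (tp + fn) to reported positives (tp + fp)."""
    return (tp + fn) / (tp + fp)


# Hypothetical validation counts, not taken from the study.
tp, fp, fn = 80, 5, 20

# Hypothetical measured OA share; scaling by the adjustment score is assumed.
measured_share = 0.40
adjusted_share = measured_share * adjustment(tp, fp, fn)

print(f"precision={retrieval_precision(tp, fp):.3f}")
print(f"recall={recall(tp, fn):.3f}")
print(f"adjusted share={adjusted_share:.3f}")
```

Note that the adjustment score equals retrieval precision divided by recall, so an instrument that misses more papers than it wrongly reports yields a score above 1 and an adjusted value above the measured one.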
SAMPLING AND METROLOGY
18
Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population (N) is finite and known (which is the case here, as the population from which we are sampling is the Scopus database) and is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (which will be set at 0.95 in the study), ME is calculated as follows:
$$ME = Z \sqrt{\frac{p(1-p)}{n} \cdot \frac{N-n}{N-1}} + \frac{0.5}{n}$$
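A minimal Python sketch of this margin of error follows. It assumes the 0.95 mentioned above is a confidence level, so that Z is the corresponding two-sided normal critical value (about 1.96). The function name and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p: float, n: int, N: int, confidence: float = 0.95) -> float:
    """Margin of error for a proportion p from a sample of n out of N.

    Combines the finite population correction (N - n) / (N - 1) with the
    0.5 / n continuity correction for discrete values.
    """
    # Assumption: Z is derived from a two-sided confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    fpc = (N - n) / (N - 1)
    return z * sqrt(p * (1 - p) / n * fpc) + 0.5 / n


# Hypothetical example: 1,000 sampled papers out of a population of 1,000,000.
print(f"{margin_of_error(p=0.5, n=1_000, N=1_000_000):.4f}")
```

When the sample covers the whole population (n = N), the square-root term vanishes and only the continuity correction remains.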
SAMPLING AND METROLOGY
19
The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.
MEASURING THE % OF OA PAPERS
20
For Gold journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.
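The matching step can be illustrated with a short Python sketch. Everything in it is hypothetical: the record layout, the ISSNs, and the `gold_since` mapping from ISSN to the first year a journal is known to be Gold. It matches on ISSN and E-ISSN only; matching on journal names, which the study also used, is left out.

```python
from typing import Optional


def normalize_issn(issn: str) -> str:
    """Strip hyphens and whitespace so differently formatted ISSNs compare equal."""
    return issn.replace("-", "").strip().upper()


def is_gold(paper: dict, gold_since: dict[str, int]) -> bool:
    """True if either ISSN matches a Gold journal that was Gold in the paper's year."""
    for key in ("issn", "e_issn"):
        issn: Optional[str] = paper.get(key)
        if issn:
            start_year = gold_since.get(normalize_issn(issn))
            if start_year is not None and paper["year"] >= start_year:
                return True
    return False


# Hypothetical Gold journal list (for example, compiled from DOAJ and PubMed Central).
gold_since = {normalize_issn("1234-5678"): 2005}

# Hypothetical random sample of paper records.
sample = [
    {"issn": "1234-5678", "e_issn": None, "year": 2010},         # Gold
    {"issn": "1234-5678", "e_issn": None, "year": 2001},         # before the journal was Gold
    {"issn": "8765-4321", "e_issn": "1234-5678", "year": 2006},  # Gold via E-ISSN
    {"issn": "1111-2222", "e_issn": None, "year": 2012},         # not in the Gold list
]

gold_share = sum(is_gold(p, gold_since) for p in sample) / len(sample)
print(gold_share)
```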
MEASURING THE % OF OA PAPERS
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013
RESULTS
[Figure: line chart. x-axis: year, 1994–2014; y-axis: % of papers available in OA, 0–60. Series: Adjusted OA April 2014; Adjusted OA April 2013; Measured OA April 2014; Measured OA April 2013.]
22
Translation of OA availability between April 2013 and April 2014
RESULTS
[Figure: line chart with exponential trend lines. x-axis: year, 2003–2012; y-axis: % of papers available in OA, 0–60. Series: Adjusted OA April 2014; Adjusted OA April 2013. Trend lines: y = 2E-21 e^(0.0234x), R² = 0.976; y = 3E-17 e^(0.0186x), R² = 0.9473.]
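The trend lines on these charts have the exponential form y = a·e^(bx), with x the year. The slides do not state how they were fitted. The Python sketch below assumes ordinary least squares on ln(y), a common way to obtain such a curve and its R², and runs on synthetic data rather than on the study's results.

```python
from math import exp, log


def fit_exponential(xs: list[float], ys: list[float]) -> tuple[float, float, float]:
    """Fit y = a * exp(b * x) by least squares on ln(y); return (a, b, r_squared)."""
    n = len(xs)
    ln_ys = [log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_ln_y = sum(ln_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_ln_y) for x, ly in zip(xs, ln_ys))
    b = sxy / sxx
    ln_a = mean_ln_y - b * mean_x
    # R² is computed in log space, on the linearised model.
    ss_tot = sum((ly - mean_ln_y) ** 2 for ly in ln_ys)
    ss_res = sum((ly - (ln_a + b * x)) ** 2 for x, ly in zip(xs, ln_ys))
    return exp(ln_a), b, 1 - ss_res / ss_tot


# Synthetic series growing at 9% per year in log terms (not the study's data).
years = [float(y) for y in range(2003, 2013)]
counts = [1000.0 * exp(0.09 * (y - 2003)) for y in years]

a, b, r2 = fit_exponential(years, counts)
print(f"b={b:.4f}, R2={r2:.4f}")
```

Because x is a calendar year, the fitted coefficient a is extremely small, which is why the trend-line equations show values such as 2E-21.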
23
OA backfilling between April 2013 and April 2014 of papers published in 1996–2011
RESULTS
[Figure: chart with exponential trend line. x-axis: year, 1994–2014; y-axis: number of OA papers backfilled between April 2013 and April 2014, 0–120,000. Trend line: y = 2E-112 e^(0.1335x), R² = 0.9976.]
24
Growth of the number of papers available in OA as measured in April 2014, 1996–2013
RESULTS
[Figure: chart with exponential trend line. x-axis: year, 1994–2014; y-axis: number of papers in OA, 0–1,000,000. Series: Adjusted OA; Measured OA. Trend line: y = 2E-73 e^(0.09x), R² = 0.9971.]
25
Scientific impact of OA and non-OA papers published in 1996–2011
RESULTS
[Figure: line chart. x-axis: year, 1996–2011; y-axis: average of relative citations (ARC, 1 = world average), 0.0–1.8. Series: OA; All Papers; Not OA.]
26
Impact contest by OA type, by field, 2009–2011. NB: Here, Gold refers to full-Gold journals, not to Gold papers in hybrid journals.
RESULTS
Field | 1st place (Type, ARC) | 2nd place (Type, ARC) | 3rd place (Type, ARC) | Least impact (Type, ARC)
Agriculture, Fisheries & Forestry | Green OA 1.57 | Other OA 1.32 | Not OA 0.88 | Gold OA 0.51
Biology | Other OA 1.37 | Green OA 1.30 | Not OA 0.69 | Gold OA 0.47
Biomedical Research | Other OA 1.23 | Green OA 1.10 | Gold OA 0.91 | Not OA 0.65
Built Environment & Design | Green OA 1.56 | Other OA 1.28 | Not OA 0.86 | Gold OA 0.19
Chemistry | Other OA 1.34 | Green OA 1.28 | Not OA 0.95 | Gold OA 0.34
Clinical Medicine | Other OA 1.56 | Green OA 1.08 | Gold OA 0.64 | Not OA 0.63
Communication & Textual Studies | Other OA 1.82 | Green OA 1.51 | Not OA 0.66 | Gold OA 0.73
Earth & Environmental Sciences | Green OA 1.46 | Other OA 1.26 | Gold OA 0.98 | Not OA 0.72
Economics & Business | Green OA 1.46 | Other OA 1.30 | Not OA 0.71 | Gold OA 0.22
Enabling & Strategic Technologies | Green OA 1.68 | Other OA 1.53 | Not OA 0.83 | Gold OA 0.52
Engineering | Green OA 1.84 | Other OA 1.38 | Not OA 0.83 | Gold OA 0.55
General Arts, Humanities & Social Sciences | Green OA 1.74 | Other OA 1.49 | Not OA 0.73 | Gold OA 0.13
General Science & Technology | Green OA 2.56 | Other OA 2.24 | Gold OA 0.69 | Not OA 0.11
Historical Studies | Green OA 2.37 | Other OA 1.61 | Not OA 0.76 | Gold OA 0.37
Information & Communication Technology | Green OA 1.62 | Other OA 1.36 | Gold OA 0.76 | Not OA 0.69
Mathematics & Statistics | Green OA 1.35 | Other OA 1.11 | Not OA 0.75 | Gold OA 0.67
Philosophy & Theology | Green OA 1.72 | Other OA 1.63 | Gold OA 0.86 | Not OA 0.72
Physics & Astronomy | Green OA 1.43 | Gold OA 1.18 | Other OA 1.04 | Not OA 0.73
Psychology & Cognitive Sciences | Other OA 1.35 | Green OA 1.31 | Not OA 0.66 | Gold OA 0.59
Public Health & Health Services | Other OA 1.38 | Green OA 1.30 | Not OA 0.76 | Gold OA 0.71
Social Sciences | Green OA 1.54 | Other OA 1.44 | Not OA 0.76 | Gold OA 0.52
Visual & Performing Arts | Green OA 2.16 | Other OA 1.86 | Not OA 0.77 | Gold OA 0.29
Total | Green OA 1.53 | Other OA 1.36 | Not OA 0.76 | Gold OA 0.61
27
OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and challenges in discovery.
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects:
(1) Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available. (2) "Backfilling" of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve.
CONCLUSION
29
On average, openly accessible papers have a decidedly greater impact.
In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, Gold journals surpass in impact publishing in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established.
It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.
CONCLUSION
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers
Visit Science-Metrix to learn about our evaluation and measurement activities
THANK YOU
- Slide Number 1
- Background
- SYNOPSIS
- Definitions
- Definitions
- Definitions
- Vantage Points
- Vantage Points
- Vantage Points
- Vantage Points
- Bibliometrics – Proportion of OA
- Bibliometrics – Proportion of OA
- Bibliometrics – Proportion of OA
- Bibliometrics – Proportion of OA
- Bibliometrics – Proportion of OA
- Sampling and METROLOGY
- Sampling and METROLOGY
- Sampling and METROLOGY
- Measuring the % of OA papers
- Measuring the % of OA papers
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- Conclusion
- Conclusion
- Conclusion
- Thank you
15
Dealing with search engines Use user-friendly meta-search engines such as DuckDuckGo or DogPile Try to stay below the radar using mainstream search engines Neither solution feels remotely confortable
Other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science)
BIBLIOMETRICS ndash PROPORTION OF OA
16
Divergence from the real measure is due to Capacity to design instrument that provides true value (function of recall and retrieval precision) Capacity to increase statistical significance through large samples
SAMPLING AND METROLOGY
17
A true positive (tp) in the present case is a paper known to be available in OA which is found by the harvesting instrument developed for the current project A true negative (tn) is an article which is not available for free and is not found by the instrument False positives and negatives (fp and fn) are the converse of the later Retrieval precision also called positive predictive value provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows
Retrieval Precision = 119905119905119905119905+119891119905
Recall also called true positive rate or sensitivity is the capacity to correctly identify a large proportion of the positive records
Recall = 119905119905119905119905+119891119891
Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score which can then be applied to recalibrate the results to obtain a truer measure one that corrects the limits of the instrument The adjustment made in the previous study is based on the following formula
Adjustment = 119905119905+119891119891119905119905+119891119905
SAMPLING AND METROLOGY
18
Statistical precision can be assessed with the margin of error (ME) For a proportion (p) where the population is finite and known (which is the case here as the population from which we are sampling is the Scopus database) (N) is not systematically much larger than the sample size (n) and in which the values are discrete (for example papers are discrete as one does not publish one third of a paper) given a critical score Z (which will be set at 095 in the study) ME is calculated as follows
119872119872 = 119885 119905 1minus119905 119873minus119891119891 119873minus1
+ 05119891
SAMPLING AND METROLOGY
19
The harvesting engine developed by Science-Metrix searches specific sites including Scielo PubMed Central Research Gate and CiteSeerX It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR Finally and in addition a portion of the harvesting engine works in the cloud and searches for freely available papers
MEASURING THE OF OA PAPERS
20
For Gold Journal OA articles an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central This was done by matching journalsrsquo ISSN E-ISSN and names from Scopus to the relevant records in the sample
MEASURING THE OF OA PAPERS
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014 1996ndash2013
RESULTS
0
5
10
15
20
25
30
35
40
45
50
55
60
1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014
o
f pa
pers
ava
ilabl
e in
OA
Adjusted OA April 2014Adjusted OA April 2013Measured OA April 2014Measured OA April 2013
22
Translation of OA availability between April 2013 and April 2014
RESULTS
y = 2E-21e00234x
Rsup2 = 0976
y = 3E-17e00186x
Rsup2 = 09473
0
5
10
15
20
25
30
35
40
45
50
55
60
2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
o
f pa
pers
ava
ilabl
e in
OA
Adjusted OA April 2014
Adjusted OA April 2013
23
OA backfilling between April 2013 and April 2014 of papers published in 1996ndash2011
RESULTS
y = 2E-112e01335x
Rsup2 = 09976
0
20000
40000
60000
80000
100000
120000
1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014
Num
ber
of
OA p
aper
s ba
ckfil
led
betw
een
Apr
il 20
13 a
nd A
pril
2014
24
Growth of the number of papers available in OA as measured in April 2014 1996ndash2013
RESULTS
y = 2E-73e009x
Rsup2 = 09971
0
100000
200000
300000
400000
500000
600000
700000
800000
900000
1000000
1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014
Num
ber
of p
aper
s in
O
A
Adjusted OA
Measured OA
25
Scientific impact of OA and non-OA papers published in 1996ndash2011
RESULTS
00
02
04
06
08
10
12
14
16
18
1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
Ave
rage
of re
lativ
e ci
tatio
ns
(ARC 1
= w
orld
ave
rgae
)
OAAll PapersNot OA
26
Impact contest by OA type by field 2009ndash2011 NB Here Gold refers to full-Gold journals not to Gold papers in hybrid journals
RESULTS
Field | 1st place: Type, ARC | 2nd place: Type, ARC | 3rd place: Type, ARC | Least impact: Type, ARC
Agriculture, Fisheries & Forestry | Green OA 1.57 | Other OA 1.32 | Not OA 0.88 | Gold OA 0.51
Biology | Other OA 1.37 | Green OA 1.30 | Not OA 0.69 | Gold OA 0.47
Biomedical Research | Other OA 1.23 | Green OA 1.10 | Gold OA 0.91 | Not OA 0.65
Built Environment & Design | Green OA 1.56 | Other OA 1.28 | Not OA 0.86 | Gold OA 0.19
Chemistry | Other OA 1.34 | Green OA 1.28 | Not OA 0.95 | Gold OA 0.34
Clinical Medicine | Other OA 1.56 | Green OA 1.08 | Gold OA 0.64 | Not OA 0.63
Communication & Textual Studies | Other OA 1.82 | Green OA 1.51 | Not OA 0.66 | Gold OA 0.73
Earth & Environmental Sciences | Green OA 1.46 | Other OA 1.26 | Gold OA 0.98 | Not OA 0.72
Economics & Business | Green OA 1.46 | Other OA 1.30 | Not OA 0.71 | Gold OA 0.22
Enabling & Strategic Technologies | Green OA 1.68 | Other OA 1.53 | Not OA 0.83 | Gold OA 0.52
Engineering | Green OA 1.84 | Other OA 1.38 | Not OA 0.83 | Gold OA 0.55
General Arts, Humanities & Social Sciences | Green OA 1.74 | Other OA 1.49 | Not OA 0.73 | Gold OA 0.13
General Science & Technology | Green OA 2.56 | Other OA 2.24 | Gold OA 0.69 | Not OA 0.11
Historical Studies | Green OA 2.37 | Other OA 1.61 | Not OA 0.76 | Gold OA 0.37
Information & Communication Technology | Green OA 1.62 | Other OA 1.36 | Gold OA 0.76 | Not OA 0.69
Mathematics & Statistics | Green OA 1.35 | Other OA 1.11 | Not OA 0.75 | Gold OA 0.67
Philosophy & Theology | Green OA 1.72 | Other OA 1.63 | Gold OA 0.86 | Not OA 0.72
Physics & Astronomy | Green OA 1.43 | Gold OA 1.18 | Other OA 1.04 | Not OA 0.73
Psychology & Cognitive Sciences | Other OA 1.35 | Green OA 1.31 | Not OA 0.66 | Gold OA 0.59
Public Health & Health Services | Other OA 1.38 | Green OA 1.30 | Not OA 0.76 | Gold OA 0.71
Social Sciences | Green OA 1.54 | Other OA 1.44 | Not OA 0.76 | Gold OA 0.52
Visual & Performing Arts | Green OA 2.16 | Other OA 1.86 | Not OA 0.77 | Gold OA 0.29
Total | Green OA 1.53 | Other OA 1.36 | Not OA 0.76 | Gold OA 0.61
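The "Least impact" column of the table above can be tallied to see how often each OA type ranks last. The sketch below transcribes that column by hand for the 22 fields (the "Total" row is left out); the names `LEAST_IMPACT` and `tally_last_place` are illustrative and do not come from the presentation.

```python
from collections import Counter

# OA type listed in the "Least impact" column for each field,
# transcribed from the table above (the "Total" row is excluded).
LEAST_IMPACT = {
    "Agriculture, Fisheries & Forestry": "Gold OA",
    "Biology": "Gold OA",
    "Biomedical Research": "Not OA",
    "Built Environment & Design": "Gold OA",
    "Chemistry": "Gold OA",
    "Clinical Medicine": "Not OA",
    "Communication & Textual Studies": "Gold OA",
    "Earth & Environmental Sciences": "Not OA",
    "Economics & Business": "Gold OA",
    "Enabling & Strategic Technologies": "Gold OA",
    "Engineering": "Gold OA",
    "General Arts, Humanities & Social Sciences": "Gold OA",
    "General Science & Technology": "Not OA",
    "Historical Studies": "Gold OA",
    "Information & Communication Technology": "Not OA",
    "Mathematics & Statistics": "Gold OA",
    "Philosophy & Theology": "Not OA",
    "Physics & Astronomy": "Not OA",
    "Psychology & Cognitive Sciences": "Gold OA",
    "Public Health & Health Services": "Gold OA",
    "Social Sciences": "Gold OA",
    "Visual & Performing Arts": "Gold OA",
}


def tally_last_place(least_impact):
    """Count in how many fields each OA type is listed as least impactful."""
    return Counter(least_impact.values())


tally = tally_last_place(LEAST_IMPACT)
print(tally["Not OA"], tally["Gold OA"])
```

By this count, "Not OA" is listed last in 7 of the 22 fields and full-Gold journals in the remaining 15.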
27
OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and by challenges in discovery.
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects:
Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available.
“Backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, both of which contribute to a translation of the availability curve.
CONCLUSION
29
On average, openly accessible papers have a decidedly greater impact.
In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, papers in Gold journals have a greater impact than papers published in subscription-based journals with no self-archiving, even though these Gold journals are much younger and less established.
It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.
CONCLUSION
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers
Visit Science-Metrix to learn about our evaluation and measurement activities
THANK YOU
16
Divergence from the real measure depends on:
The capacity to design an instrument that provides the true value (a function of recall and retrieval precision).
The capacity to increase statistical significance through large samples.
SAMPLING AND METROLOGY
17
A true positive (tp) in the present case is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. False positives and false negatives (fp and fn) are the converse of the latter. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows:
$$\text{Retrieval Precision} = \frac{tp}{tp + fp}$$
Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:
$$\text{Recall} = \frac{tp}{tp + fn}$$
Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects the limits of the instrument. The adjustment made in the previous study is based on the following formula:
$$\text{Adjustment} = \frac{tp + fn}{tp + fp}$$
SAMPLING AND METROLOGY
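The three ratios on this slide can be sketched in a few lines. The counts below are invented for illustration, and applying the adjustment by multiplying the measured proportion is an assumption about how the score is "applied"; the function names are hypothetical as well.

```python
def retrieval_precision(tp, fp):
    """Share of papers flagged as OA by the instrument that really are OA."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of truly OA papers that the instrument manages to find."""
    return tp / (tp + fn)


def adjustment(tp, fp, fn):
    """Ratio of actual positives (tp + fn) to reported positives (tp + fp)."""
    return (tp + fn) / (tp + fp)


def adjusted_proportion(measured, tp, fp, fn):
    """Recalibrate a measured OA proportion (assumed: simple multiplication)."""
    return measured * adjustment(tp, fp, fn)


# Hypothetical validation counts, not taken from the study.
tp, fp, fn = 80, 5, 20
print(retrieval_precision(tp, fp))   # 80 / 85
print(recall(tp, fn))                # 80 / 100
print(adjusted_proportion(0.40, tp, fp, fn))
```

With these definitions the adjustment equals retrieval precision divided by recall, so an instrument that misses more papers than it wrongly flags yields a factor above 1.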
18
Statistical precision can be assessed with the margin of error (ME). For a proportion (p) where the population (N) is finite and known (which is the case here, as the population from which we are sampling is the Scopus database) and is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (set at 0.95 in the study), ME is calculated as follows:
$$ME = Z \sqrt{\frac{p(1 - p)}{n} \cdot \frac{N - n}{N - 1}} + \frac{0.5}{n}$$
SAMPLING AND METROLOGY
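A minimal sketch of the margin-of-error calculation, using the slide's symbols p, n, N and Z. The function name is hypothetical, and the value z = 1.96 in the usage line is an assumption (the conventional critical value for a 95% confidence level), not a number given in the presentation.

```python
import math


def margin_of_error(p, n, N, z):
    """Margin of error for a proportion p from a sample of n out of N items.

    Combines the finite population correction (N - n) / (N - 1) with the
    continuity correction 0.5 / n used for discrete values.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if N < 2 or not 0 < n <= N:
        raise ValueError("need 0 < n <= N and N >= 2")
    fpc = (N - n) / (N - 1)
    return z * math.sqrt(p * (1.0 - p) / n * fpc) + 0.5 / n


# Illustrative values only: 1,000 sampled papers from a population of 1,000,000.
me = margin_of_error(p=0.5, n=1000, N=1_000_000, z=1.96)
print(round(me, 4))
```

When the sample covers the whole population (n = N), the square-root term vanishes and only the continuity correction 0.5 / n remains.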
19
The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.
MEASURING THE % OF OA PAPERS
20
For Gold journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.
MEASURING THE % OF OA PAPERS
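The matching step described on this slide could look roughly like the sketch below. It is not the Science-Metrix implementation: the record layout, the `gold_since` field, the journal names and the ISSNs are all hypothetical, and the rule "a paper counts as Gold if it was published in or after the year its journal became Gold" is one possible reading of the year-matching the slide mentions.

```python
def normalize_issn(issn):
    """Return an ISSN as 8 upper-case characters without hyphen, else None."""
    if not issn:
        return None
    cleaned = issn.replace("-", "").strip().upper()
    return cleaned if len(cleaned) == 8 else None


def normalize_name(name):
    """Lower-case a journal name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def journal_keys(issn, eissn, name):
    """All usable lookup keys for a journal: ISSN, E-ISSN, then name."""
    keys = [normalize_issn(issn), normalize_issn(eissn), normalize_name(name)]
    return [key for key in keys if key]


def build_gold_index(gold_journals):
    """Map every key of every known Gold journal to its first Gold year."""
    index = {}
    for journal in gold_journals:
        for key in journal_keys(journal.get("issn"), journal.get("eissn"), journal["name"]):
            index[key] = journal["gold_since"]
    return index


def is_gold_paper(paper, index):
    """True if the paper's journal is in the index and was Gold that year."""
    for key in journal_keys(paper.get("issn"), paper.get("eissn"), paper["journal"]):
        if key in index:
            return paper["year"] >= index[key]
    return False


# Hypothetical journal list and sample records (made-up identifiers).
gold_journals = [
    {"name": "Journal of Examples", "issn": "1234-5678", "eissn": None, "gold_since": 2005},
]
sample = [
    {"journal": "Journal of Examples", "issn": "12345678", "eissn": None, "year": 2010},
    {"journal": "journal  of examples", "issn": None, "eissn": None, "year": 2003},
    {"journal": "Another Journal", "issn": "8765-4321", "eissn": None, "year": 2010},
]
index = build_gold_index(gold_journals)
gold_share = sum(is_gold_paper(p, index) for p in sample) / len(sample)
print(gold_share)
```

Normalising identifiers before comparison is the main design choice here, since the same journal may be written with or without a hyphen, or identified only by name in one of the sources.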
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013
RESULTS
[Figure: % of papers available in OA (scale 0–60) by publication year (axis 1994–2014). Series: Adjusted OA April 2014, Adjusted OA April 2013, Measured OA April 2014, Measured OA April 2013.]
22
Translation of OA availability between April 2013 and April 2014
RESULTS
[Figure: % of papers available in OA (scale 0–60) by publication year, 2003–2012. Series: Adjusted OA April 2014 and Adjusted OA April 2013. Exponential trend lines: y = 2E-21 e^(0.0234x), R² = 0.976; y = 3E-17 e^(0.0186x), R² = 0.9473.]
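Exponential trend lines of the form y = a·e^(bx), like those labelled on this slide, are commonly obtained by a least-squares fit on ln(y). The sketch below shows that technique on synthetic data; whether the slide's trend lines were produced this way is an assumption, and the function name and data points are illustrative, not the study's measurements.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by ordinary least squares on ln(y).

    Returns (a, b, r2), where r2 is computed on the log scale.
    """
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxy / sxx
    ln_a = mean_l - b * mean_x
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    ss_res = sum((l - (ln_a + b * x)) ** 2 for x, l in zip(xs, logs))
    return math.exp(ln_a), b, 1.0 - ss_res / ss_tot


# Synthetic series: a share of 0.30 in 2003 growing at rate 0.02 per year.
years = list(range(2003, 2013))
shares = [0.30 * math.exp(0.02 * (year - 2003)) for year in years]
a, b, r2 = fit_exponential(years, shares)
print(round(b, 4), round(r2, 4))
```

Because x is a calendar year, the fitted coefficient a is extremely small, which is why labels such as 2E-21 appear on such trend lines and why rounded coefficients should not be used to recompute values precisely.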
23
OA backfilling between April 2013 and April 2014 of papers published in 1996–2011
RESULTS
[Figure: number of OA papers backfilled between April 2013 and April 2014 (scale 0–120,000) by publication year (axis 1994–2014). Exponential trend line: y = 2E-112 e^(0.1335x), R² = 0.9976.]
24
Growth of the number of papers available in OA as measured in April 2014, 1996–2013
RESULTS
[Figure: number of papers in OA (scale 0–1,000,000) by publication year (axis 1994–2014). Series: Adjusted OA, Measured OA. Exponential trend line: y = 2E-73 e^(0.09x), R² = 0.9971.]
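As a rough reading aid, and assuming the trend line on this slide has the form y = a·e^(bx) with b = 0.09 (a reading of the exponent label, with x the publication year), the implied doubling time and year-on-year growth follow directly. This is a sketch based on a rounded, fitted coefficient, not a figure reported in the presentation.

```latex
\[
T_{2} = \frac{\ln 2}{b} \approx \frac{0.693}{0.09} \approx 7.7 \text{ years},
\qquad
e^{b} - 1 = e^{0.09} - 1 \approx 0.094 \quad (\text{about } 9.4\% \text{ per year})
\]
```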
25
Scientific impact of OA and non-OA papers published in 1996–2011
RESULTS
[Figure: average of relative citations (ARC; 1 = world average; scale 0.0–1.8) by publication year, 1996–2011. Series: OA, All Papers, Not OA.]
20
For Gold Journal OA articles an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central This was done by matching journalsrsquo ISSN E-ISSN and names from Scopus to the relevant records in the sample
MEASURING THE % OF OA PAPERS
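The matching step described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the record types, the `gold_since` field, the normalisation rules and the sample data are all hypothetical and are not taken from the slides, which do not give implementation details.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GoldJournal:
    name: str
    issn: Optional[str]
    eissn: Optional[str]
    gold_since: int  # hypothetical: first year the journal counts as Gold


@dataclass(frozen=True)
class Paper:
    journal_name: str
    issn: Optional[str]
    eissn: Optional[str]
    year: int


def normalize_issn(value):
    """Drop hyphens and whitespace so '1234-5678' and '12345678' compare equal."""
    if not value:
        return None
    cleaned = value.replace("-", "").strip().upper()
    return cleaned or None


def normalize_name(value):
    """Lower-case and collapse whitespace for a crude name comparison."""
    return " ".join(value.lower().split())


def build_index(journals):
    """Index the Gold journal list by ISSN/E-ISSN and by normalised name."""
    by_issn, by_name = {}, {}
    for journal in journals:
        for raw in (journal.issn, journal.eissn):
            key = normalize_issn(raw)
            if key:
                by_issn[key] = journal
        by_name[normalize_name(journal.name)] = journal
    return by_issn, by_name


def is_gold(paper, by_issn, by_name):
    """A paper counts as Gold if its journal matches and was Gold in its year."""
    journal = None
    for raw in (paper.issn, paper.eissn):
        key = normalize_issn(raw)
        if key in by_issn:
            journal = by_issn[key]
            break
    if journal is None:  # fall back to the journal name
        journal = by_name.get(normalize_name(paper.journal_name))
    return journal is not None and paper.year >= journal.gold_since


def gold_share(papers, journals):
    """Proportion of sampled papers published in Gold journals."""
    if not papers:
        return 0.0
    by_issn, by_name = build_index(journals)
    return sum(is_gold(p, by_issn, by_name) for p in papers) / len(papers)


# Hypothetical sample data, for illustration only.
journals = [GoldJournal("Example Open Journal", "1234-5678", None, 2005)]
papers = [
    Paper("Example Open Journal", "12345678", None, 2010),   # ISSN match, Gold
    Paper("EXAMPLE  open journal", None, None, 2003),        # name match, too early
    Paper("Some Subscription Journal", "8765-4321", None, 2010),
    Paper("Unknown title", None, "1234-5678", 2005),         # E-ISSN match, Gold
]
print(gold_share(papers, journals))  # 0.5
```

Matching on identifiers first and on names only as a fallback is a design choice of this sketch; name matching is more error-prone and would need stricter cleaning in practice.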
21
Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013
RESULTS
[Figure: % of papers available in OA (y-axis, 0 to 60) by year (x-axis, 1994 to 2014). Series: Adjusted OA April 2014, Adjusted OA April 2013, Measured OA April 2014, Measured OA April 2013.]
22
Translation of OA availability between April 2013 and April 2014
RESULTS
[Figure: % of papers available in OA (y-axis, 0 to 60) by year (x-axis, 2003 to 2012). Series: Adjusted OA April 2014, Adjusted OA April 2013. Exponential trend lines: $y = 2\times10^{-21}\,e^{0.0234x}$ ($R^2 = 0.976$) and $y = 3\times10^{-17}\,e^{0.0186x}$ ($R^2 = 0.9473$).]
23
OA backfilling between April 2013 and April 2014 of papers published in 1996–2011
RESULTS
[Figure: Number of OA papers backfilled between April 2013 and April 2014 (y-axis, 0 to 120,000) by year (x-axis, 1994 to 2014). Exponential trend line: $y = 2\times10^{-112}\,e^{0.1335x}$ ($R^2 = 0.9976$).]
24
Growth of the number of papers available in OA as measured in April 2014, 1996–2013
RESULTS
[Figure: Number of papers in OA (y-axis, 0 to 1,000,000) by year (x-axis, 1994 to 2014). Series: Adjusted OA, Measured OA. Exponential trend line: $y = 2\times10^{-73}\,e^{0.09x}$ ($R^2 = 0.9971$).]
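To make the exponential trend line on this slide easier to interpret, the sketch below converts its exponent into an annual growth rate and a doubling time. It assumes the fitted equation reads y = 2E-73 · e^(0.09x), with x the year; the decimal point in the exponent is a reading of the transcript, and the coefficient is rounded to one significant digit, so absolute values from the curve are only rough.

```python
import math

# Assumed reading of the trend line on this slide: y = A * exp(B * year).
A = 2e-73  # rounded coefficient as printed on the chart
B = 0.09   # exponent, assuming the decimal point was lost in the transcript


def fitted_oa_papers(year):
    """Number of OA papers implied by the trend line for a given year."""
    return A * math.exp(B * year)


# For y = A * exp(B * x), each extra year multiplies y by exp(B).
annual_growth = math.exp(B) - 1
# Doubling time solves exp(B * t) = 2.
doubling_time = math.log(2) / B

print(f"annual growth: {annual_growth:.1%}")
print(f"doubling time: {doubling_time:.1f} years")
```

Under this reading, the number of OA papers grows by a little over 9% per year and doubles roughly every 7.7 years. The same two formulas apply to the other trend lines in the results slides, with their own exponents.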
25
Scientific impact of OA and non-OA papers published in 1996–2011
RESULTS
[Figure: Average of relative citations (ARC; 1 = world average) (y-axis, 0.0 to 1.8) by year (x-axis, 1996 to 2011). Series: OA, All Papers, Not OA.]
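The slides do not spell out how the average of relative citations (ARC) is computed. The sketch below shows one common way to build such an indicator, under the assumption that each paper's citation count is divided by the average count of papers from the same field and year, so that 1 corresponds to the world average. The field names, keys and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def add_relative_citations(papers):
    """Attach 'rc' = citations / mean citations of the same field and year."""
    groups = defaultdict(list)
    for paper in papers:
        groups[(paper["field"], paper["year"])].append(paper["citations"])
    baseline = {key: mean(values) for key, values in groups.items()}

    scored = []
    for paper in papers:
        base = baseline[(paper["field"], paper["year"])]
        if base > 0:  # skip groups with no citations at all
            scored.append({**paper, "rc": paper["citations"] / base})
    return scored


def arc(scored, is_oa=None):
    """Average of relative citations, optionally restricted to OA or non-OA."""
    values = [p["rc"] for p in scored if is_oa is None or p["is_oa"] == is_oa]
    return mean(values) if values else None


# Hypothetical papers, for illustration only.
papers = [
    {"field": "Physics", "year": 2010, "citations": 6, "is_oa": True},
    {"field": "Physics", "year": 2010, "citations": 4, "is_oa": True},
    {"field": "Physics", "year": 2010, "citations": 2, "is_oa": False},
    {"field": "Physics", "year": 2010, "citations": 0, "is_oa": False},
]
scored = add_relative_citations(papers)
print(arc(scored), arc(scored, is_oa=True), arc(scored, is_oa=False))
```

By construction the ARC of all papers in a complete field-year group is 1, which is why the chart can be read against a world average of 1.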
26
Impact contest by OA type, by field, 2009–2011. NB: Here Gold refers to full-Gold journals, not to Gold papers in hybrid journals.
RESULTS
Field: 1st place; 2nd place; 3rd place; Least impact (each entry gives OA type and ARC)
Agriculture, Fisheries & Forestry: Green OA 1.57; Other OA 1.32; Not OA 0.88; Gold OA 0.51
Biology: Other OA 1.37; Green OA 1.30; Not OA 0.69; Gold OA 0.47
Biomedical Research: Other OA 1.23; Green OA 1.10; Gold OA 0.91; Not OA 0.65
Built Environment & Design: Green OA 1.56; Other OA 1.28; Not OA 0.86; Gold OA 0.19
Chemistry: Other OA 1.34; Green OA 1.28; Not OA 0.95; Gold OA 0.34
Clinical Medicine: Other OA 1.56; Green OA 1.08; Gold OA 0.64; Not OA 0.63
Communication & Textual Studies: Other OA 1.82; Green OA 1.51; Not OA 0.66; Gold OA 0.73
Earth & Environmental Sciences: Green OA 1.46; Other OA 1.26; Gold OA 0.98; Not OA 0.72
Economics & Business: Green OA 1.46; Other OA 1.30; Not OA 0.71; Gold OA 0.22
Enabling & Strategic Technologies: Green OA 1.68; Other OA 1.53; Not OA 0.83; Gold OA 0.52
Engineering: Green OA 1.84; Other OA 1.38; Not OA 0.83; Gold OA 0.55
General Arts, Humanities & Social Sciences: Green OA 1.74; Other OA 1.49; Not OA 0.73; Gold OA 0.13
General Science & Technology: Green OA 2.56; Other OA 2.24; Gold OA 0.69; Not OA 0.11
Historical Studies: Green OA 2.37; Other OA 1.61; Not OA 0.76; Gold OA 0.37
Information & Communication Technology: Green OA 1.62; Other OA 1.36; Gold OA 0.76; Not OA 0.69
Mathematics & Statistics: Green OA 1.35; Other OA 1.11; Not OA 0.75; Gold OA 0.67
Philosophy & Theology: Green OA 1.72; Other OA 1.63; Gold OA 0.86; Not OA 0.72
Physics & Astronomy: Green OA 1.43; Gold OA 1.18; Other OA 1.04; Not OA 0.73
Psychology & Cognitive Sciences: Other OA 1.35; Green OA 1.31; Not OA 0.66; Gold OA 0.59
Public Health & Health Services: Other OA 1.38; Green OA 1.30; Not OA 0.76; Gold OA 0.71
Social Sciences: Green OA 1.54; Other OA 1.44; Not OA 0.76; Gold OA 0.52
Visual & Performing Arts: Green OA 2.16; Other OA 1.86; Not OA 0.77; Gold OA 0.29
Total: Green OA 1.53; Other OA 1.36; Not OA 0.76; Gold OA 0.61
27
OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and challenges in discovery.
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects:
- Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available
- "Backfilling" of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve
CONCLUSION
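One way to picture backfilling is as the difference, for each publication year, between the OA counts seen at two measurement dates. The sketch below assumes two snapshots keyed by publication year; the function name and all numbers are hypothetical and are not the values behind the charts.

```python
def backfilled_by_year(earlier, later):
    """Papers per publication year that became OA between two snapshots.

    Only publication years present in both snapshots are compared.
    """
    return {
        year: later[year] - earlier[year]
        for year in sorted(earlier)
        if year in later
    }


# Hypothetical OA paper counts by publication year, for illustration only.
oa_april_2013 = {2009: 400, 2010: 420, 2011: 430}
oa_april_2014 = {2009: 450, 2010: 480, 2011: 500, 2012: 520}

print(backfilled_by_year(oa_april_2013, oa_april_2014))
# {2009: 50, 2010: 60, 2011: 70}
```

A positive difference for an older publication year means the availability curve for that year has shifted upward since the first measurement, which is the translation referred to above. Publication years that appear only in the later snapshot (2012 here) are left out of the comparison.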
29
On average, openly accessible papers have a decidedly greater impact.
In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, Gold journals surpass in impact publishing in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established.
It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.
CONCLUSION
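The count of 7 fields can be checked against the impact table on the preceding results slide. The sketch below records, for each field, which OA type sits in the "Least impact" column, as read from that table, and counts the fields where it is "Not OA". The dictionary is a manual transcription and the variable names are hypothetical.

```python
# Least-impact OA type per field, transcribed from the impact table (2009-2011).
least_impact = {
    "Agriculture, Fisheries & Forestry": "Gold OA",
    "Biology": "Gold OA",
    "Biomedical Research": "Not OA",
    "Built Environment & Design": "Gold OA",
    "Chemistry": "Gold OA",
    "Clinical Medicine": "Not OA",
    "Communication & Textual Studies": "Gold OA",
    "Earth & Environmental Sciences": "Not OA",
    "Economics & Business": "Gold OA",
    "Enabling & Strategic Technologies": "Gold OA",
    "Engineering": "Gold OA",
    "General Arts, Humanities & Social Sciences": "Gold OA",
    "General Science & Technology": "Not OA",
    "Historical Studies": "Gold OA",
    "Information & Communication Technology": "Not OA",
    "Mathematics & Statistics": "Gold OA",
    "Philosophy & Theology": "Not OA",
    "Physics & Astronomy": "Not OA",
    "Psychology & Cognitive Sciences": "Gold OA",
    "Public Health & Health Services": "Gold OA",
    "Social Sciences": "Gold OA",
    "Visual & Performing Arts": "Gold OA",
}

# Fields where subscription-only publishing (Not OA) ranks last.
not_oa_last = sorted(f for f, t in least_impact.items() if t == "Not OA")

print(len(not_oa_last))  # 7
for field in not_oa_last:
    print(field)
```

In the remaining 15 of the 22 fields, full-Gold journals occupy the last column instead.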
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers
Visit Science-Metrix to learn about our evaluation and measurement activities
THANK YOU
- Slide Number 1
- Background
- SYNOPSIS
- Definitions
- Definitions
- Definitions
- Vantage Points
- Vantage Points
- Vantage Points
- Vantage Points
- Bibliometrics – Proportion of OA
- Bibliometrics – Proportion of OA
- Bibliometrics – Proportion of OA
- Bibliometrics – Proportion of OA
- Bibliometrics – Proportion of OA
- Sampling and METROLOGY
- Sampling and METROLOGY
- Sampling and METROLOGY
- Measuring the % of OA papers
- Measuring the % of OA papers
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- RESULTS
- Conclusion
- Conclusion
- Conclusion
- Thank you
26
Impact contest by OA type by field 2009ndash2011 NB Here Gold refers to full-Gold journals not to Gold papers in hybrid journals
RESULTS
1st place 2nd place 3rd place Least impact
Type ARC Type ARC Type ARC Type ARCAgriculture Fisheries amp Forestry Green OA 157 Other OA 132 Not OA 088 Gold OA 051Biology Other OA 137 Green OA 130 Not OA 069 Gold OA 047Biomedical Research Other OA 123 Green OA 110 Gold OA 091 Not OA 065Built Environment amp Design Green OA 156 Other OA 128 Not OA 086 Gold OA 019Chemistry Other OA 134 Green OA 128 Not OA 095 Gold OA 034Clinical Medicine Other OA 156 Green OA 108 Gold OA 064 Not OA 063Communication amp Textual Studies Other OA 182 Green OA 151 Not OA 066 Gold OA 073Earth amp Environmental Sciences Green OA 146 Other OA 126 Gold OA 098 Not OA 072Economics amp Business Green OA 146 Other OA 130 Not OA 071 Gold OA 022Enabling amp Strategic Technologies Green OA 168 Other OA 153 Not OA 083 Gold OA 052Engineering Green OA 184 Other OA 138 Not OA 083 Gold OA 055General Arts Humanities amp Social Sciences Green OA 174 Other OA 149 Not OA 073 Gold OA 013General Science amp Technology Green OA 256 Other OA 224 Gold OA 069 Not OA 011Historical Studies Green OA 237 Other OA 161 Not OA 076 Gold OA 037Information amp Communication Technology Green OA 162 Other OA 136 Gold OA 076 Not OA 069Mathematics amp Statistics Green OA 135 Other OA 111 Not OA 075 Gold OA 067Philosophy amp Theology Green OA 172 Other OA 163 Gold OA 086 Not OA 072Physics amp Astronomy Green OA 143 Gold OA 118 Other OA 104 Not OA 073Psychology amp Cognitive Sciences Other OA 135 Green OA 131 Not OA 066 Gold OA 059Public Health amp Health Services Other OA 138 Green OA 130 Not OA 076 Gold OA 071Social Sciences Green OA 154 Other OA 144 Not OA 076 Gold OA 052Visual amp Performing Arts Green OA 216 Other OA 186 Not OA 077 Gold OA 029Total Green OA 153 Other OA 136 Not OA 076 Gold OA 061
Field
27
OA is a fast-moving phenomenon It is also quite complex to understand and to measure Uptake of OA limited by heterogeneity and challenges in discovery
CONCLUSION
28
Growth of OA should be understood to comprise two main aspects
Organic growth as more publishers researchers and librarians increasingly make freshly published papers freely available ldquoBackfillingrdquo of already published papers by researchers and librarians and dis-embargoing of previously locked papers by publishers contribute to a translation of the availability curve
CONCLUSION
29
On average, openly accessible papers have a decidedly greater impact.
In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, papers in Gold journals have a greater impact than papers in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established.
It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.
CONCLUSION
30
Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers.
Visit Science-Metrix to learn about our evaluation and measurement activities.
THANK YOU