
FP7 Grant Agreement 266632

Deliverable No and Title 5.3 Selection of Samples

Dissemination Level PU (public)

Work Package WP5. Bibliometric Indicators

Version 1.0

Release Date 6th August 2013

Author(s) Lorna Wildgaard, Birger Larsen, Jesper W Schneider

Project Website http://research-acumen.eu/

European Commission, 7th Framework Programme, SP4 - Capacities, Science in Society 2010, Grant Agreement: 266632


PART 1


Part 1. Preparing for the analysis. Sampling strategy and methodological considerations in developing bibliometric indicators of the performance and impact of individuals for use in the ACUMEN portfolio.

Work Package 5: New Bibliometric indicators

June 28th, 2013

Project partners: Department of Information Studies, Royal School of Library and Information Science; Department of Library and Information Science, Humboldt University Berlin

Executive Summary:

Based on the samples from the four research fields used in the other WPs, we identified 793 researchers with online publication lists. Publication data for these researchers were gathered and combined with demographic data from the survey. Bibliometric analyses of these publications were undertaken in WoS and Google Scholar using a set of indicators designed for assessment at the individual level. The sample of 64 indicators was previously identified in the review of 114 bibliometric indicators presented in Madrid in January 2013. The set of 64 indicators has been reduced to 40 using a number of selection criteria.

We decided to construct a decision tree (which in a reworked form could go into the portfolio) as the guiding principle when choosing and comparing indicators. Our basic pragmatic assumption is that indicators are already provided on many curricula vitae (CVs), though with great variation across fields; simplicity and the ease with which such indicators can be obtained and/or compiled are therefore the basis for our analyses and later recommendations. What sets the ACUMEN portfolio apart from the current use of indicators on CVs is the portfolio's potential to give the researcher guidelines that aid interpretation of the indicators and set them in a narrative that enriches the CV.

The main tasks are therefore: 1) to characterize types of indicators; 2) to examine (within the dataset) to what extent easily obtainable indicators correlate with more sophisticated indicators, as the latter would be close to impossible for individuals to obtain and provide in a CV; 3) to provide an annotated guideline for researchers on the use of individual indicators in their CVs, with special focus on gender, current career position and research field, as well as pitfalls and deficiencies (the important point here is that the perspective is the researcher's); and 4) to give an ethical perspective on the use of individual metrics (for example, ecological fallacies when journal indicators are used at the individual level). Finally, we will also provide a guideline, including the ethical perspective, for evaluators (i.e. from their point of view).

It is essential that our suggestions as to which type of indicators to use (and not use), are supported with guidelines - more explicit than ”read the fine print” – on their interpretation and limitations, and how to present such indicators on a CV.


Introduction

The ACUMEN portfolio is more than just a registry of CVs and publication lists. The portfolio aims to help researchers document their activities and connect these activities with their results and the effect of these on research spaces. In this sense the portfolio enables researchers to express the full richness of what they do. The idea is that through bibliometrics, bibliographic information can be linked to these research activities and their reception in the scientific and public communities. This is challenging, as these activities and their effects take the form of different types of publications, uses, values, applications, relationships, and roles in inspiring creativity and innovation; these in turn are measurable by the researcher only to the extent that their record is complete and accessible. Figure 1 illustrates interconnections in the research zone and thus the challenges we face in fitting indicators at the level of the individual. So apart from recommending bibliometric indicators, WP5 aims to develop standards and guidelines for implementation and interpretation, to help the researcher do meaningful bibliometric self-evaluation. But ultimately success depends on a fair amount of effort on the part of the researcher, which is why simplicity is key.

Fig. 1. Visualising the research zone (diagram labels: PERSON; output; event; project; economic dimension; social dimension; institutional dimension; cultural dimension: influence, expertise and skills in the academic, industrial and public sphere)

The informed use of bibliometrics will make it possible for researchers to disseminate their academic identity. Disseminating an identity is philosophically, socially and culturally challenging. To ease this, WP5 suggests that only the researcher who owns the CV can edit and append the created document and the bibliometric analyses. The identity researchers present through their ACUMEN portfolio is their academic profile, which should be validated by the consumer, i.e. those who have permission to view the CV, and not by ACUMEN. Hence, guidelines will also be tailored to the consumer, to guide interpretation of bibliometrically enriched CVs and to allow contextual judgements of performance and of the use of bibliometrics at the individual level.

Clearly trust is an issue, just as ethics is an issue. Self-evaluation presents the researcher with the opportunity to exploit the procedures for personal gain to the detriment of science (Cheung, 2008; Lawrence, 2008). The challenge for bibliometrics is to improve the representativeness of research output evaluations at the individual level. While it is not the ACUMEN portfolio's task to validate the bibliographic and bibliometric information the researcher provides on their portfolio CV, it is our task to provide appropriate bibliometrics that are designed for micro-level analysis, transparent in their application, and understandable, so that their use and limitations are clear. We must consider whether the effort it takes the researcher to do the analyses and contextualise the scientific activities reported on the CV is worth it: ethically speaking, how reliable is the outcome?

Reliability is trust-based and depends on the point of view: from the evaluator's point of view the main issue is whether individual-level bibliometric evaluation is ethically defensible at all, while from the individual researcher's point of view the issues could be more related to self-promotion. A core problem is that self-evaluation is subjective (Potočnik, 2005), and it is a common fear that instead of monitoring the research process, bibliometrics will be used in evaluations to monitor the researcher (Collini 2012; Bach 2011; Cheung 2008). Hopefully, encapsulating bibliometrics in a narrative will avoid fitting the indicators to the natural sciences' traditions of writing, publishing in journals and linking these publications to citations represented in WOS (Campbell 2008; Laloë & Mosseri, 2009; Bornmann et al, 2008). It should also reduce the pressure to publish preferably in journals with a high impact factor that are included in citation databases, rather than in journals that fit the writing talent of the author and the content of the paper. Such pressure can result in competitive and aggressive researchers being rewarded over modest or irregular publishers (Cheung, 2008).

Accordingly, the guidelines and contextualisation of results help researchers enrich the information on their CVs and help consumers understand the listed information. This is where the ACUMEN portfolio stands apart from other CV providers with bibliometric applications. Common to existing providers is the lack of “fine print” describing the limits of bibliometrics and their interpretation, or fine print placed so far from the CV that it is unintelligible, as in HEP Inspire, where the bibliometric results are presented as a box of statistics at the end of a publication list. ACUMEN supports a short narrative that briefly and explicitly presents the meaning of such statistics for the consumer. When used correctly, the informed use and informed interpretation of bibliometrics can bring objectivity into the process of individual evaluation (Bornmann et al, 2008). This avoids promoting “ready to use” amateur indicators, where the validity of the use of these measures can affect the validity of self-evaluation (Lundberg, 2009). As both the researcher and the evaluator are bound by professional codes of conduct that ensure professional reliability and accountability, we assume these also apply in an evaluation. To avoid the researcher or evaluator relying on the parsimony principle ‘one indicator is better than two’, as with the h-index (Zitt, 2008), we suggest developing a palette of robust and valid indicators to recommend to the researcher. The indicators must be easy to use and understand. Our codex accompanies these indicators to regulate ethical principles and rules of behaviour for bibliometric self-evaluation.

Aim

Our aim is to recommend bibliometric indicators, traditional and new, that researchers can use themselves to enrich their CVs. When combined with the other ACUMEN members' expertise, a portfolio of validated qualitative and quantitative measures will be available for researchers not only to document their publication activities, but also to contextualise these activities in narratives that showcase their expertise and influence in the context of their demographic information, specialty and academic seniority. The aim of the bibliometric indices is to document the core activities of output and the reception of the researcher's work. This is nothing new. However, investigated as a form of self-evaluation, new complex aspects are introduced, such as access to data, ethics, and the dependence of some indicators on complicated mathematics, software or complete datasets. The beauty of our study is that it is tested on real-life data that are flawed, incomplete and under-representative of certain academic groups and genders. But such is the demographic of the scientific community, and thus our dataset is highly representative of how science is practiced.

It is important to remember that bibliometric indicators are not limited to publication and citation counts, or to traditionally measurable forms of scientific communication in scientific journals. They are used in combination with the qualitative and quantitative indicators recommended in other work packages to document all of a researcher's activity. Thus, the combined indicators also support the researcher's creativity and work with perhaps low-prestige but highly relevant problems that are “published” in the broadest sense of the word, as a lot of communication takes place on the web, through popular media channels or in interactive installations. The following case study exemplifies our aim of enriching the CV with bibliometric indicators.


The publication list for Researcher A is presented as it appeared on the website. The font or layout has not been changed. Only part of it is shown here.

(This list is presented chronologically and includes all editions of books and compendiums. The list includes reviews, chronicles, popular science articles and textbooks.)

1.        Researcher A. (1979): XXX, Speciale i biologi ved Kbh. Universitet

2.        Researcher A. (1980): ”Article 1”

3.        Researcher A. (1981): “Article 2”, s. 96-151 i Niche: Nordisk tidsskrift for kritisk biologi. Årg. 2 nr. 2.


Short Narrative: addition to researcher A’s curriculum vitae

Bibliometrics

Output

My output is defined as the 112 published works from 1993-2013. This total is compared to three reference groups, with comparison values sourced in April 2013. The reference group on the Local Level consists of the median number of publications of associate professors at my institution; likewise the National Level consists of associate professors in my field at the Universities of Copenhagen, Aalborg and Roskilde, while the Expert Reference group consists of the publications of leading scholars in my field.

1993-2013 my output level is 112 publications; w.r.t the local level it is 32 (range 5-76); w.r.t. the national level 62 (range 28-214); w.r.t. the expert level 129 (31-414).

Generally, I do not co-author works: 93/112 works are single authored. I have been most comfortable working in repeated small collaborations; these works are authored by teams of 2 to 5 scholars, and a single workshop paper by 8 scholars. In terms of number of papers I certainly do better than the median person at the local and national level, and in terms of the expert group I am in the top 10 (rank 10/21). Fifty-five of my works, in 80 publications, have been published in 6 languages and are included in 362 academic library holdings.

Citations

It is interesting to know where my works are being cited. Even though citations to books and national language works are under-represented in citation indices, one can roughly see that I have influence in cybersemiotics, computer science, business and economics, linguistics, engineering, social sciences, library and information science, as well as philosophy. Citations to my works and those of the Expert reference group have been sourced in Google Scholar and Web of Science.

Parameter                    Myself    Expert (median scores)
Npapers                      112       129
Year of first publication    1993      1977
Works per year               5.6       3.5
H index                      16        11
M quotient                   0.8       0.47
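The derived rows of the table above can be reproduced with a few lines of code. The following is a minimal sketch, not the procedure used in the study. It assumes the commonly used definition of the h index (the largest h such that h papers have at least h citations each), takes the m quotient as the h index divided by the number of years since the first publication, and counts career length as last year minus first year, an assumption that reproduces the 5.6 and 0.8 reported for Researcher A. The function names and the citation list are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def career_years(first_year, last_year):
    # Assumption: career length is counted as last_year - first_year.
    years = last_year - first_year
    if years <= 0:
        raise ValueError("last_year must be later than first_year")
    return years


def works_per_year(n_papers, first_year, last_year):
    """Average number of works per year of publishing activity."""
    return n_papers / career_years(first_year, last_year)


def m_quotient(h, first_year, last_year):
    """h index divided by the number of years since the first publication."""
    return h / career_years(first_year, last_year)


# Values taken from the table for Researcher A (112 works, h index 16, 1993-2013).
print(round(works_per_year(112, 1993, 2013), 1))  # 5.6
print(round(m_quotient(16, 1993, 2013), 1))       # 0.8

# Hypothetical citation counts, only to illustrate the h index itself.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Because the expert column reports median scores across a group, its m quotient cannot be recomputed from the median h index and median start year in the same way.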

More recently, the use of the h-index (the number of papers that have received more citations than their rank in a list sorted according to number of citations; if the h-index is n then the person has n papers that have received


Sampling strategy

The sample of publication lists used for the bibliometric analyses was sourced from the shared dataset of 2,154 academic profiles collected by WP2. The shared dataset includes 4 subject areas (astronomy & astrophysics; public, environmental and occupational health; environmental engineering; and philosophy, including the history and philosophy of science) and 15 European countries (Bulgaria, Czech Republic, Denmark, Estonia, Finland, France, Germany, Hungary, Israel, Italy, The Netherlands, Poland, Slovenia, Spain, and the United Kingdom). Details of the method and rationale of how the shared dataset was collected can be found in the Progress Report (2): ACUMEN Web Presence Survey Results (WP2, 2012).

Briefly, WP2 formed the shared dataset by automatically extracting a list of email addresses from published research papers indexed in the Thomson Reuters Web of Science (WOS) during 2005-2011 in the four studied fields, which are based on WOS subject categories, for each European country. Because of the low coverage of Philosophy in WOS, the Scopus citation index was also used to obtain sufficient email addresses for this field. A large-scale survey in selected scientific fields and EU countries was conducted, resulting in information on online presence from 2,154 respondents. This information included URLs, online CVs, PDFs, PPT files, publication lists, links to repositories, journals, individual websites, group websites and group publication lists, as well as demographic data (gender, affiliation, discipline/specialty, and academic status).

We originally intended to use the entire sample of researchers (n = 2154), as our aim was to identify how much variation exists, or is estimated to exist, in the population in relation to the performance of the indicators. However, not all of these respondents had an online presence. The dataset was therefore reduced to the researchers who provided a link or links to any form of online material (Figure 2). From this set we extracted only the researchers who had the academic status of PhD Student, Post Doc, Assistant Professor, Associate Professor or Professor, resulting in a set of n = 1211 researchers. The professional titles were limited to these five seniorities so that we could investigate potential correlations or trends between academic life cycles and bibliometrics. Finally, all links were followed to verify how many actually led to a publication list. This led to a further reduction of the dataset, as the following were excluded: dead links, duplicates, links to material that was not an individual's publication list or a CV including a list of publications, researchers who did not hold one of our five academic statuses, and subjects that fall outside our four disciplines. Our resulting sample is 793 publication lists (Appendices 1 and 2).

Cleaning the base data, collecting publication and citation data, and validating bibliographical information is a time-consuming process, but it results in good data of high quality against which we can contextualize the bibliometric results and counts. We collected enough baseline data to capture an entire iteration (or cycle) of the researcher's life cycle. An iteration should account for the different types of variation seen within this process, such as cycles, trends, volume ranges, cycle time ranges etc.


Fig. 2. Flowchart of sampling strategy

2154 researchers in shared dataset
Excluded: no link to online resource
Excluded: other academic positions
PhD students, Post Docs, Assistant Professors, Associate Professors and Professors with a link to an online resource: n = 1211
Excluded: dead links n = 172; duplicates n = 12; not discipline n = 19; not publication list n = 214
Working link to online publication list: n = 793

Astronomy: PhD n = 15; Post Doc n = 49; Assis Prof n = 27
Environment: PhD n = 3; Post Doc n = 18; Assis Prof n = 42
Philosophy: PhD n = 9; Post Doc n = 23; Assis Prof n = 49
Public Health: PhD n = 9; Post Doc n = 14; Assis Prof n = 31
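The selection steps in the flowchart can be expressed as a simple filter over survey records. This is a minimal sketch under stated assumptions: the record fields (`person_id`, `link_works`, `is_publication_list` and so on), the order in which the later exclusion checks are applied, and the toy records are all hypothetical and are not taken from the WP2 dataset.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

INCLUDED_STATUSES = {
    "PhD Student", "Post Doc", "Assistant Professor",
    "Associate Professor", "Professor",
}
INCLUDED_FIELDS = {"Astronomy", "Environment", "Philosophy", "Public Health"}


@dataclass(frozen=True)
class Respondent:
    # Hypothetical record layout, for illustration only.
    person_id: str
    status: str
    field: str
    link: Optional[str] = None
    link_works: bool = False
    is_publication_list: bool = False


def select_sample(respondents):
    """Return (selected respondents, Counter of exclusion reasons)."""
    selected, excluded, seen = [], Counter(), set()
    for r in respondents:
        if not r.link:
            excluded["no link to online resource"] += 1
        elif r.status not in INCLUDED_STATUSES:
            excluded["other academic position"] += 1
        elif not r.link_works:
            excluded["dead link"] += 1
        elif r.person_id in seen:
            excluded["duplicate"] += 1
        elif r.field not in INCLUDED_FIELDS:
            excluded["not discipline"] += 1
        elif not r.is_publication_list:
            excluded["not publication list"] += 1
        else:
            seen.add(r.person_id)
            selected.append(r)
    return selected, excluded


# Toy records (hypothetical) covering each branch of the filter.
records = [
    Respondent("a", "Professor", "Astronomy", "http://example.org/a", True, True),
    Respondent("a", "Professor", "Astronomy", "http://example.org/a2", True, True),
    Respondent("b", "Post Doc", "Philosophy"),
    Respondent("c", "Librarian", "Philosophy", "http://example.org/c", True, True),
    Respondent("d", "PhD Student", "Environment", "http://example.org/d", False, True),
    Respondent("e", "Post Doc", "Chemistry", "http://example.org/e", True, True),
    Respondent("f", "Post Doc", "Public Health", "http://example.org/f", True, False),
]
sample, reasons = select_sample(records)
print([r.person_id for r in sample])
print(dict(reasons))
```

Keeping a count per exclusion reason makes it straightforward to report the same breakdown as the flowchart for any later re-run of the selection.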


Characteristics of sample

Gender and disciplinary representation

In our sample of 793 researchers, 182 (23%) are women. This is below the expected European percentage of women in science, which is between 30% and 44% depending on field, as reported in the SHE figures for 2012: http://ec.europa.eu/research/science-society/document_library/pdf_06/she-figures-2012_en.pdf.

Table 1. Gender ratio and disciplinary representation (women:men)

                      Astronomy   Environment   Philosophy   Public Health   Seniority ratio
Ph.D.                 1:4         0:3           1:2          1:3             1:2
Post Doc.             1:3         1:2           1:6          1:1             1:3
Assis. Prof.          1:3         1:3           1:5          1:3             1:4
Assoc. Prof.          1:5         1:5           1:3          1:2             1:4
Professor             1:19        1:6           1:5          1:2             1:5
Disciplinary ratio    1:5         1:4           1:4          1:2

Academic posts and disciplinary representation

The prime concern for the indicators is their stability and performance across different academic seniorities. For bibliometrics, this means their usability and ease of calculation on anything from small amounts of citation and publication data (as for PhD students with a three-year publishing history) to large amounts of data (professors with publishing histories spanning decades). The distribution of researchers across academic seniorities and disciplines is unequal, skewed in favour of senior researchers.

Table 2. Academic posts and disciplinary representation

                      Astronomy   Environment   Philosophy   Public Health   Seniority Total
Ph.D.                 15          3             9            9               36
Post Doc.             49          18            23           14              104
Assis. Prof.          27          42            49           31              149
Assoc. Prof.          72          85            82           53              292
Professor             40          55            87           30              212
Disciplinary Total    203         203           250          137             793
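The skew towards senior researchers can be read directly from Table 2. The short sketch below recomputes the row and column totals from the cell counts and derives the share of associate and full professors; the grouping of these two ranks as "senior" is our assumption for illustration, and the variable and function names are hypothetical.

```python
# Cell counts copied from Table 2 (seniority by discipline).
TABLE_2 = {
    "Ph.D.":        {"Astronomy": 15, "Environment": 3,  "Philosophy": 9,  "Public Health": 9},
    "Post Doc.":    {"Astronomy": 49, "Environment": 18, "Philosophy": 23, "Public Health": 14},
    "Assis. Prof.": {"Astronomy": 27, "Environment": 42, "Philosophy": 49, "Public Health": 31},
    "Assoc. Prof.": {"Astronomy": 72, "Environment": 85, "Philosophy": 82, "Public Health": 53},
    "Professor":    {"Astronomy": 40, "Environment": 55, "Philosophy": 87, "Public Health": 30},
}


def row_totals(table):
    """Total number of researchers per seniority."""
    return {row: sum(cells.values()) for row, cells in table.items()}


def column_totals(table):
    """Total number of researchers per discipline."""
    totals = {}
    for cells in table.values():
        for column, count in cells.items():
            totals[column] = totals.get(column, 0) + count
    return totals


seniority_totals = row_totals(TABLE_2)
discipline_totals = column_totals(TABLE_2)
sample_size = sum(seniority_totals.values())

# Assumption: "senior" means associate professors and professors.
senior_share = (seniority_totals["Assoc. Prof."] + seniority_totals["Professor"]) / sample_size

print(seniority_totals)
print(discipline_totals)
print(sample_size)             # 793
print(f"{senior_share:.1%}")   # 63.6%
```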

Disciplinary and linguistic representation

This demographic represents the disciplinary and linguistic representation of the departments to which the academics in our sample are affiliated. The linguistic heritage of the research centres in the sample is more indicative of disciplinary publication and citation traditions than the researcher's nationality or the centre's geographical location. Figure 3 illustrates how the sample is weighted towards the Romance (Italian, Spanish, French and Algerian), Germanic (German, Dutch, Yiddish and Swiss), and Anglo-Saxon (English, American and Australian) research and writing traditions. The corresponding table shows that the weighting differs from discipline to discipline. The categories are based on the Indo-European family of languages (Appendix 3).

Fig. 3. Linguistic representation of research centres in the entire sample: Germanic 24%; Slavic (west, east, south) 15%; Scandinavian 11%; Anglo-Saxon 20%; Romance (Italic, Latin) 29%; Asian 0%

Table 3. Disciplinary and linguistic distribution

                 Anglo-Saxon   Asian   Germanic   Romance   Scandinavian   Slavic   Total
Astronomy        37            3       59         62        7              35       203
Environment      25            -       32         60        33             53       203
Philosophy       71            -       56         83        20             20       250
Public Health    28            -       46         27        28             8        137


Limitations

Gender bias

Our sample has a strong male bias: the overall ratio of men to women is 3:1, which is, however, the same ratio as in the original shared dataset. The gender distribution at the disciplinary level differs from the shared dataset in two of the fields. In the shared dataset the ratio of women to men in Astronomy is 1:4, whereas in our sample it is 1:5; in Environment it is 1:3 in the shared dataset and 1:4 in our data. It is, however, a fact that women are outnumbered by men in math, science and engineering fields, which cover two of our four selected disciplines. Our data includes relatively few women in high-level faculty positions, which is also supported in the literature (RAISE, 2013). A study detailed in the journal Psychological Science (Murphy et al, 2007) claims to bring to light a new feature of gender bias that is important to remember when we contextualize our counts of scientific activity, write the guidelines and select the indicators included in the ACUMEN portfolio. The feature is that women are less likely to participate in science and engineering settings in which they are outnumbered by men. These “situational cues” have an important meaning and effect on the careers of women; they are the cultural and social factors that discourage women from a career in science. This includes socialization in which girls are taught, directly and indirectly, to steer clear of studies and jobs typically pursued by boys and men. In addition, past research has revealed an unconscious bias at universities, where evaluators rate resumes and journal articles lower on average for women than for men (footnote 1). The responsibilities of family caretaking still fall disproportionately on women, and so women often choose the stay-at-home-mum position, or their household responsibilities make it nearly impossible for them to meet the long hours required for a high-level faculty position.

Conversely, our sample also shows traces of the effect of female-dominated fields on men. In Public Health Policy, where the academic playing field is more evenly distributed, this could perhaps be attributed to a male sense of not belonging.

Ultimately, this means that our analyses of gender effects are limited, and as a result we will focus on academic status and research field. Gender will be treated in supplementary analyses where the amount of data allows sensible investigation.

1 An overview of sources is too extensive to list. Please refer to, amongst others, the Boston University Recruitment Guidelines and corresponding reference list, available at: http://www.bu.edu/apfd/recruitment/fsm/assumption_awareness/

Sampling bias

We used the shared dataset because it has been an aim of ACUMEN since the kick-off meeting in 2011 to connect the work packages through a shared dataset with real-world parameters. In this way the findings of the work packages complement and supplement each other, as the respondents and their bibliographic data are investigated through interviews, surveys, institutional documents, web presence and bibliometrics. For our work package this has meant that a sample has been drawn from the shared dataset; it is as such defined as “convenience” sampling, i.e. a type of nonprobability sampling in which the sample is drawn from the part of the population that is close to hand. Using such a sample means we cannot make scientific generalizations to the total population. This type of sampling is, however, useful for pilot testing and power analyses. Power analyses are used to calculate the minimum sample size required to detect an effect of a given size, and accordingly to determine how strong our results must be to count as statistically significant, even though we cannot test the significance of our results. As we have a convenience sample, several important matters must be considered in the design of the bibliometric analyses:

- The sample is weighted in favour of senior researchers.
- The academic seniorities are unevenly distributed across the disciplines.
- The disciplines are represented unevenly, ranging from 137 to 250 researchers. This affects the types of analysis we can implement, the statistics we can use and the strength of the conclusions we can draw.
- Can the purpose of our analyses be adequately served using a convenience sample, i.e. to characterize types of indicators, examine the correlation between simple and sophisticated indicators, provide guidelines for the application of indicators on CVs and address the ethical perspectives on the use of individual metrics?
- At the present time we are unaware of any controls within our analyses that can lessen the impact of our convenience sample and thereby make the results more representative of the population. But how can we be sure whether our convenience sample responds or behaves differently from a random sample from the same population?
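To make the idea of a power analysis concrete, the sketch below estimates the minimum sample size needed to detect a correlation between two indicators. It is an illustration only, not the analysis carried out in this work package. It assumes a two-sided test, the conventional (and here assumed) values of a 5% significance level and 80% power, and the Fisher z approximation n = ((z_alpha + z_power) / atanh(r))^2 + 3. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def min_sample_size_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect a correlation of size r (two-sided test).

    Uses the Fisher z approximation:
        n = ((z_alpha + z_power) / atanh(|r|)) ** 2 + 3
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = ((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3
    return math.ceil(n)


# Smaller correlations need larger samples.
for r in (0.1, 0.3, 0.5):
    print(r, min_sample_size_for_correlation(r))
```

Under these assumptions, detecting a correlation of 0.3 requires about 85 researchers, which is fewer than the smallest discipline in the sample (137) but more than the 36 PhD students, so subgroup analyses by seniority would only be able to detect stronger relationships.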

Sources used in data collection

A copy of each publication list was saved, as the internet is dynamic and links that work today could be dead tomorrow. Furthermore, a publication list is a living document that is updated, so our base data can potentially change. We used sources of citation data that are readily available to researchers in all disciplines. Four students from RSLIS were employed to extract the data in June 2013. Multiple IP addresses were generated to work around the aggressive blocking policy of Google Scholar. The process for finding and exporting publication data from WOS and GS is described in detail in the Work Task description (Appendix 4).

Publication lists and bibliographic and citation data were thus sourced in Web of Science (WOS) and Google Scholar (GS), with the aim of comparing the alignment and performance of a multidisciplinary structured citation index and a scholarly web search engine in which full-text information is collected and presented through a web crawler. Performance is defined as usefulness at the individual (disciplinary) level and the effect the choice of database has on the size of the researcher's indices. It was a tactical choice to use multidisciplinary databases rather than discipline-specific databases such as the Astrophysics Data System (ADS) or the High Energy Physics Literature Database (Inspire). Common to these systems is that they provide ready-to-use indices and, to some extent, “fine print” that defines the function of the bibliometric indicators and how to interpret them. However, none provides clear guidelines for implementation or states their limitations, and none attempts to contextualise the results. Instead the indices are presented as statistics beside a profile of the researcher. Likewise there are publication databases that attempt full discipline coverage, such as the Philosopher's Index or EconLit. Although these are more representative of a discipline's literature than WOS, citations are not indexed, and we do not have the necessary knowledge of a researcher's subject speciality and hence preferred database. Would the public health researchers in our sample prefer that we sourced their publications in PubMed, as all medical publications that are worth anything can be found there, or in CINAHL, as the research is nationally oriented and practice-based? Likewise, how can we guess whether an environmental scientist regards Inspec rather than the Energy Citation Database (ECD) as the database of choice? Instead, the discipline-specific indexes will be used in our case studies, as we are very aware of the importance of these databases and it is important to address their role in the ACUMEN portfolio. In the case studies we show how good the coverage of subject-specific databases is compared to WOS and GS, the quality of the data, and how difficult it is for the researcher to extract publication and citation information from these sources.

We did, however, experience some practical problems with our choice of citation sources because of the amount of data we extracted. These problems are described below, but they are not considered an issue at the individual level, as extracting citation information for one publication list at a time is vastly different from extracting 793 publication lists.

Google Scholar

Data are difficult and time-consuming to extract en masse from GS. Hence we used Harzing’s Publish or Perish (POP) software, version 4.0.12 (footnote 2), to identify publications and to retrieve, and to a limited extent analyse, academic citations in GS. We are aware that GS offers a personal citation service, “My Citations”, where the researcher can create a profile in GS that automatically harvests relevant publication and citation data. This service is easy to use, but the generated bibliometrics are limited to the h-index, total citations, citations over time and the i10-index. We instead recommend that the researcher uses POP to search GS, even though it requires effort to keep the citation counts up to date and to remove duplicates and publications that were not written by the researcher. Another thought behind this choice is that researchers who actively update their publication and citation lists will build an understanding of what bibliometric results are built on, and will not blindly trust ready-to-use indices presented out of context. Unlike GS, POP supports this rationale by presenting a range of indices that attempt to cover basic assessment considerations, such as adjusting for collaborative writing and length of publication history (amongst others number of citations, cites per year, cites per paper, h, g, hc, hI, AWCR, AW, e and hm-index). Publication data can easily be sorted in POP and citation results can easily be exported to Excel. At the individual level the amount of data cleaning would be minimal in comparison to our study.

In February 2013 GS reduced the maximum number of results per page from 100 to 20. This means that Publish or Perish now has to retrieve up to 5 times as many result pages per query in order to show the full results, which has the following effects on data extraction:

2 http://www.harzing.com/pop.htm

More page requests mean that POP hits the maximum number of requests that Google Scholar allows per hour sooner.

If the number of page requests exceeds the maximum that Google Scholar allows, the IP address will be temporarily blocked by Google Scholar. This block can last for up to 24 hours.

To avoid hitting the maximum allowable request limit, POP uses an adaptive request rate limiter. This limits the number of requests that are sent to Google Scholar within a given period, both short-term (during the last 60 seconds) and medium term (during the last hour).

It is no longer possible to limit to research field: Google Scholar has redesigned its interface and integrated the advanced search page in its general search page. In doing so it removed the option to select specific subject areas. As a result subject filtering is now no longer possible, neither in Google Scholar, nor in Publish or Perish.

By default, Google Scholar matches the name and initials anywhere in the list of authors, so CT Kulik would also be matched by P Kulik, CT Williamson. To match an author's initials only in combination with her or his own surname, use "quotes" around the author's name: "CT Kulik" will not match P Kulik, CT Williamson, but it will match CT Kulik and CTM Kulik, or any other name that contains both CT and Kulik. To exclude unwanted author names, these have to be found by sorting through the results list and entering them in the Exclude these names field. For example, to exclude CLC Kulik from the earlier example, enter "CLC Kulik" in the Exclude these names field. However, for both au id #9 (B Jansen) and #11 (S Ward) the result lists contained over 1000 entries even after unwanted names had been excluded, and the only remaining option was to manually remove publications not written by the researcher.

To achieve the required reduction in requests, Publish or Perish delays subsequent requests for a variable amount of time (up to 1 minute). The higher the recent request rate, the longer the delays.
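The author-name matching described in the list above can be imitated locally when cleaning an exported result list. The following is a minimal sketch and not the actual matching logic of Google Scholar or Publish or Perish: the function names, the "initials SURNAME" input format and the prefix rule for initials are assumptions made for illustration, and multi-word surnames are not handled.

```python
def parse_author(name):
    """Split a name such as 'CT Kulik' into (initials, surname)."""
    initials, _, surname = name.strip().rpartition(" ")
    return initials.upper(), surname.lower()


def loose_match(query, author_list):
    """Unquoted search: initials and surname may come from different authors."""
    q_init, q_sur = parse_author(query)
    parsed = [parse_author(a) for a in author_list]
    surname_found = any(sur == q_sur for _, sur in parsed)
    initials_found = any(init.startswith(q_init) for init, _ in parsed)
    return surname_found and initials_found


def quoted_match(query, author_list, exclude=()):
    """Quoted search: initials and surname must belong to the same author.

    Records that contain any name in `exclude` are dropped (assumed
    behaviour of an 'Exclude these names' style filter).
    """
    q_init, q_sur = parse_author(query)
    parsed = [parse_author(a) for a in author_list]
    excluded = {parse_author(e) for e in exclude}
    if any(p in excluded for p in parsed):
        return False
    return any(sur == q_sur and init.startswith(q_init) for init, sur in parsed)


if __name__ == "__main__":
    record = ["P Kulik", "CT Williamson"]
    print(loose_match("CT Kulik", record))    # matched although nobody is CT Kulik
    print(quoted_match("CT Kulik", record))   # rejected by the stricter rule
```

A filter of this kind only reduces noise. For very common names, as with au id #9 and #11, manual verification remains necessary.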

This meant that, for our study, the amount of data collected per session was limited and data extraction was slow. The alternative is being blocked by Google Scholar for up to 24 hours. As we are performing queries that yield many results (several hundred or more at the professor level) and issuing a large number of queries in short succession, the request rate limiter inserts progressively longer delays to keep the overall request rate within acceptable limits and warns us of an upcoming block from GS. To avoid being blocked or having to stop collection to stay within the required rate, we created 100 IP addresses, which we switched between when we received a warning.
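To illustrate why bulk extraction slows down, the sketch below models an adaptive request rate limiter with a short-term (60 second) and a medium-term (one hour) window, together with the page arithmetic behind the change from 100 to 20 results per page. It is not Publish or Perish's implementation: the class name, the per-minute and per-hour limits and the quadratic growth of the delay are all assumptions chosen for illustration. Only the two windows and the maximum delay of one minute come from the description above.

```python
from collections import deque


def result_pages(n_results, per_page):
    """Number of result pages needed to show n_results (ceiling division)."""
    return -(-n_results // per_page)


class AdaptiveRateLimiter:
    """Delay requests according to the load in the last minute and last hour."""

    def __init__(self, per_minute=10, per_hour=100, max_delay=60.0):
        # Both limits are illustrative assumptions, not published values.
        self.per_minute = per_minute
        self.per_hour = per_hour
        self.max_delay = max_delay
        self.sent = deque()  # timestamps (in seconds) of recorded requests

    def record(self, now):
        """Register that a request was sent at time `now`."""
        self.sent.append(now)

    def delay_before_next(self, now):
        """Seconds to wait before sending the next request."""
        # Forget requests that are older than one hour.
        while self.sent and now - self.sent[0] >= 3600:
            self.sent.popleft()
        short_load = sum(1 for t in self.sent if now - t < 60) / self.per_minute
        medium_load = len(self.sent) / self.per_hour
        load = max(short_load, medium_load)
        # The higher the recent request rate, the longer the delay (capped).
        return min(self.max_delay, self.max_delay * load ** 2)


if __name__ == "__main__":
    print(result_pages(1000, 100), result_pages(1000, 20))
    limiter = AdaptiveRateLimiter()
    for second in range(5):
        limiter.record(float(second))
    print(limiter.delay_before_next(5.0))
```

With these assumed limits, a query returning 1000 results needs five times as many page requests at 20 results per page as at 100, so the limiter reaches its maximum delay correspondingly sooner.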

Extended citation analysis of GS data

A drawback of using POP to analyse a large quantity of citation information is that it does not support export of details of the citing sources. Instead it links directly to the list of citing sources in GS. This lack of detail hampers our analysis of the foundations of the indicators. We are investigating the possibility of using the Online Citation Service (OCS) software (footnote 3) to retrieve details of citing sources, with the kind permission of the developers, Professor Erhard Rahm and Professor Stefan Endrullis from Leipzig University. Apart from the traditional search by author name and venue, OCS allows a list of publications to be uploaded and returns the results for that list. However, OCS has recently been affected by the GS interface changes and aggressive blocking policy. Knowing this, the advantages of OCS have to be revisited and other options discussed before we implement any extended analysis of the data.

3 http://dbs.uni-leipzig.de/ocs/

Web of Science

WOS is a highly valuable resource for researchers to discover prior work in their research areas, as its scope extends across multiple publishers’ lines. The use of WOS in the evaluation of academic performance, through the counting of individuals' publications and citations, often weighted by Journal Citation Reports (JCR) data as a proxy indicator of the quality of the publications, is more contentious in the bibliometric community.

This contention arises in part from the peer review process and publishing quota that have to be met before a journal is accepted. Critics of the database suggest that these barriers have resulted in a strong bias in favour of “long-established, commercial publishers (disciplines), and against recently-started publications, independent journals, and conferences” (Clarke & Pucihar 2012). Moreover, the declared policy of WOS is that only current and forthcoming issues are considered in the evaluation. Back issues are not accepted (TS 2013a), i.e. recognition of worth is not retrospective. The result of the WOS approach is that major journals of relevance to some disciplines could be missing, or have been taken up only from recent dates and without any retrospectivity. This means that for some senior researchers, the proportion of their publications that is indexed by WOS is low.

A further consideration is that journals are deleted from Web of Science throughout the year (TS 2013b). This represents historical revisionism, with publications and citations being effectively cleansed from the record (Clarke 2008). Publication and citation counts are therefore not cumulative: they change not only upwards, as new documents are published, but also downwards, as venues are deleted. Studies have also shown database bias towards international English-language journals (ref) and towards certain document types, primarily articles, and the citation culture in article-based disciplines.

Table 4. Overall ISI coverage by main field*

EXCELLENT (> 80%) | VERY GOOD (60-80%) | GOOD (40-60%) | MODERATE (< 40%)
Biochem & Mol Biol | Appl Phys & Chem | Mathematics | Other Soc Sci
Biol Sci – Humans | Biol Sci – Anim & Plants | Economics | Humanities & Arts
Chemistry | Psychol & Psychiat | Engineering |
Clin Medicine | Geosciences | |
Phys & Astron | Soc Sci ~ Medicine | |

*table reference: (Moed 2007)

In a preliminary randomised study of 20 researchers we confirmed the common conception that WOS under-represents the “softer” sciences and non-article-based disciplines, and that searches in GS produce a lot of noise that requires clean-up. We found that WOS under-represents philosophers, books and national-language or small publications, and that Google Scholar requires patience and tenacity to search (Table 5).

Table 5. Disciplinary representation in GS and WOS

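Author id | Discipline | Seniority | N publications | Found GS | Citations GS | Found WOS | Citations WOS
1 | Astronomy | Prof | 257 | 233 | 3614 | 148 | 7302
2 | Astronomy | Assoc Prof | 42 | 54 | 257 | 28 | 171
3 | Astronomy | Assis Prof | 89 | 143 | 1407 | 46 | 907
4 | Astronomy | Post Doc | 251 | 262 | 291 | 54 | 138
5 | Astronomy | Phd | 10 | 15 | 67 | 7 | 36
6 | Environment | Prof | 84 | 167 | 1459 | 41 | 282
7 | Environment | Assoc Prof | 63 | 74 | 3927 | 46 | 2066
8 | Environment | Assis Prof | 30 | 30 | 398 | 33 | 426
9 | Environment | Post Doc | 25 | - | - | 5 | 21
10 | Environment | Phd | 12 | 20 | 34 | 3 | 13
11 | Health | Prof | 415 | - | - | 441 | 8245
12 | Health | Assoc Prof | 90 | 200 | 3472 | 0 | 0
13 | Health | Assis Prof | 151 | 95 | 407 | 21 | 151
14 | Health | Post Doc | 49 | † | † | † | †
15 | Health | Phd | 24 | 17 | 138 | 19 | 211
16 | Philosophy | Prof | 41 | 22 | 43 | 13 | 12
17 | Philosophy | Assoc Prof | 36 | 27 | 36 | 4 | 0
18 | Philosophy | Assis Prof | 18 | 35 | 91 | 7 | 57
19 | Philosophy | Post Doc | 8 | 10 | 11 | 0 | 0
20 | Philosophy | Phd | 4 | 3 | 0 | 1 | 0

† For author id 14 the source gives only the values 13 and 327 after N publications; the columns they belong to cannot be recovered.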

The overlap between citations and publications sourced in Web of Science and Google Scholar was not investigated, as this is not an issue for us. We calculate indicators separately in each database and contextualise the results, as we would not expect the researcher to attempt an indicator using combined data from both sources, with the citation data cleaned for duplicates, in order to calculate a fully representative citation count. In the process of collecting data for the analyses, our main broad observation is that GS finds citations from national-language publications, books and book chapters, and local journals published in English, as well as citations from sources indexed in WOS. The question is whether there is a pattern in the type of publications we do not find, and whether this is problematic for what we want to do. What is the effect if we miss something highly cited, or many minor publications?

We accept there is an overlap, and acknowledge that the researcher would wish to write the highest resulting indicator on the CV. However, in the bibliometric analysis we did compare the results in GS and WOS and found that the score only varies by ± 1, depending on the discipline.

[Author comment, Lorna Elizabeth Wildgaard, 28/02/14: Adjust this when we get that far. This result is based on the test of PhD students.]

We are aware of potential ethical and validity problems here, which is why the guidelines stipulate that the researcher reports which database was used to calculate the indicator, and why we offer alternative indicators that account for database bias, such as the hmx-index (the median of the h-indices calculated in WOS, GS and Scopus).
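As an illustration of the hmx-index mentioned above, the following minimal sketch computes an h-index per database and takes the median. The function names and the citation counts are hypothetical and serve only to show the calculation.

```python
from statistics import median


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def hmx_index(citations_by_database):
    """Median of the h-indices calculated separately in each database."""
    return median(h_index(c) for c in citations_by_database.values())


if __name__ == "__main__":
    # Hypothetical citation counts for one researcher in three databases.
    example = {
        "WOS": [12, 7, 3, 1],
        "GS": [20, 11, 6, 4, 2],
        "Scopus": [14, 8, 4, 1],
    }
    print({db: h_index(c) for db, c in example.items()})
    print(hmx_index(example))
```

Reporting the per-database values alongside the median keeps the choice of database visible, in line with the stipulation that the researcher states which source was used.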

In summary, disciplinary (under)representation in WOS has been well documented in the literature (Clarke 2008; Salisbury 2009). However, there appears to be agreement that, even though other databases such as GS or Scopus cover a wider range of materials, WOS has much more complete coverage, with more articles indexed and more current citations. As with bibliometric analysis in any single database, publication counts are of limited value and citation analysis should always be placed in context, as the future of research assessment exercises lies in the intelligent combination of metrics and peer review (Moed 2007). This observation shapes the ACUMEN portfolio and sets it apart from any other CV enrichment application currently available.

Final observations in preparation for the bibliometric analyses

The exploratory study of 20 researchers also provided useful information for guiding the data collection and analysis. The results are listed below:

1. A publication list is not a publication list! It is a link to a webpage with selected publications, a short narrative, a link to a database, a list in PDF format, or a list on a website separated into article types, ordered chronologically, with each type accompanied by a short narrative.

2. Some authors publish more than one publication list, for example an institutional list and a full list on their personal website. Author id #3, table x, gave 4 publication lists: ADS (89 references), ArXiv (59 references), SPIRES (dead link) and Citebase (not a publication list).

3. Some lists are more complete than others. Some include only peer reviewed, published articles while others include everything: rapid responses, popular articles, encyclopedia, conference papers, letters, articles, book chapters and works in preparation.

4. Publication lists are not as a rule up to date. During data-collection we should expect to find more publications by an author than listed on the publication list.

5. Publications by authors with common names, such as au id# 9 & 11, are bordering on the impossible to verify in GS using Publish or Perish. We expect the sample to be reduced.

6. Au id #12 writes national language articles and publishes in books. Even though #12 is an accomplished author he or she is not represented in WOS. Further the publication list is written in Italian, and GS includes both Italian and English translations of the works. Even though this increases the publication list two-fold, we consider translated and original papers as two different works, attracting different readers and different citations.


Method of Bibliometric analysis

Characterization of types of indicators.

The indicators tested in our study were previously identified in a comprehensive literature review of 114 bibliometric indicators used in individual evaluation (Wildgaard et al. 2013). In the review we categorised the indicators by the main type of impact they purport to measure, be it outcome, output, quality, impact, sustainability, innovation and social benefits, or research infrastructure. The mathematical foundation of each indicator was rated on a scale of 1 to 5, where 1 is simple counting and 5 is extremely advanced mathematics. Likewise we studied how difficult it would be for the individual to access and collect the information needed to calculate the indicator. This rough complexity rating reduced the set from 114 to 64 indicators that were considered potentially useful for self-evaluation.

In preparation for the analyses of the indicators, we sorted and filtered the indicators, investigating in detail their applicability at the individual level. This resulted in separating the set into 37 indicators and 16 potentially useful reference standards (appendix 5). The applicability of this set was discussed during a meeting of WP5 in May 2013. Using the decision tree below, we identified and categorised the indicators, and discussed their function in the light of previous findings and disciplinary considerations, as well as the potential for correlative analyses. Disagreements were discussed until consensus was reached.

1. Is the indicator relevant for our 4 disciplines?
   No. Exclude indicator from study.
   Yes. Continue to next question.

2. Can the indicator be calculated in WOS and GS?
   No. Exclude indicator from study.
   Yes. Continue to next question.

3. Is the data needed to calculate the indicator available to the individual in WOS or GS?
   No. Exclude indicator from study.
   Yes, see appendix 6. Continue to next question.

4. Is there information redundancy between the indicators?
   No. Continue to next question.
   Yes. Does this overlap need investigating before we can responsibly exclude one of the indices from the set?
      Yes. Include the indicator in the study.
      No. Exclude the indicator from the study.
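The decision tree above can be expressed as a short function. This is a sketch only: the `Indicator` record, its field names and the example entries are hypothetical, and in the study the answers to each question were reached by discussion in WP5 rather than computed.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    name: str
    relevant_for_disciplines: bool       # question 1
    calculable_in_wos_and_gs: bool       # question 2
    data_available_to_individual: bool   # question 3
    redundant_with_other: bool = False   # question 4
    overlap_needs_investigation: bool = False


def include_in_study(ind):
    """Apply the four questions in order; return True if the indicator is kept."""
    if not ind.relevant_for_disciplines:
        return False
    if not ind.calculable_in_wos_and_gs:
        return False
    if not ind.data_available_to_individual:
        return False
    if ind.redundant_with_other:
        # A redundant indicator is kept only while the overlap is still
        # to be investigated.
        return ind.overlap_needs_investigation
    return True


if __name__ == "__main__":
    # Hypothetical answers, for illustration only.
    candidates = [
        Indicator("example A", True, True, True),
        Indicator("example B", True, True, True, True, True),
        Indicator("example C", True, False, True),
    ]
    print([c.name for c in candidates if include_in_study(c)])
```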

This resulted in 40 indicators that were then categorised as “simple” (n = 27) or “sophisticated” (n = 13). We wish to compare and correlate the performance of simple and sophisticated indicators. A research question that developed during our discussions is whether, at the individual level, simple and perhaps rougher indicators are just as useful as the sophisticated, refined (professional) indicators. The sophisticated indicators tend to be more complicated in design and calculation. Finally, the indicators were sorted into the ACUMEN sub-portfolio they best represent (Table 6).

Table 6. Bibliometric indicators included in the analysis; their description, the type of impact they purport to measure, complexity and sub-portfolio categorization.

ID | Indicator | Description | Type of impact | Complexity | Sub-portfolio*
1 | P | Count of production used in formal communication | Output | Simple | Output
2 | Pisi, Pgs | Publications indexed in WOS or GS | Output | Simple | Output
3 | Pts | Publications in sources defined as important by researcher’s affiliated institution or specialty | Output | Simple | Expertise
4 | Co-publications | Collaboration on a group, departmental, institutional, national or international level | Output | Simple | Output
5 | Categorised publication type | Distinction between document types | Output | Simple | Output
6 | C +sc | Citations including self-citations | Outcome | Simple | Influence
7 | CPP | Citations per paper | Outcome | Simple | Influence
8 | Number of significant papers | Top cited papers | Outcome | Simple | Influence
9 | Ptop | Publications among the top 20, 10, 5 or 1% most frequently cited papers in subject/field/world in a given year | Outcome | Sophisticated | Influence
10 | Age and productivity | Effects of academic age on productivity and impact | Outcome | Sophisticated | Output
11 | %Pnc | Share of publications that are not cited. Identify trends in type, subject etc | Outcome | Simple | Output
12 | Number of different co-authors | Growth of co-operation at group, departmental, institutional, national or international level | Research Infrastructure | Simple | Expertise
13 | Hi-index | Accounts for co-authorship effects | Research Infrastructure | Simple | Influence
14 | POP variation individual H index | Accounts for co-authorship effects | Research Infrastructure | Simple | Influence
15 | n-index | Accounts for co-authorship effects | Research Infrastructure | Simple | Influence
16 | Alternative h index | Accounts for co-authorship effects | Research Infrastructure | Simple (same as hi index) | Influence
17 | Hp | Accounts for co-authorship effects | Research Infrastructure | Sophisticated | Influence
18 | Diachronous IF | Development of impact over time of a set of papers | Impact | Simple | Influence
19 | Y Factor | Scientific impact defined as a combination of popularity and prestige | Impact | Sophisticated | Expertise
20 | NJI | Normalised journal impact | Impact | Sophisticated | Influence
21 | JFIS | Journal to field impact score | Impact | Sophisticated | Influence
22 | DIF | Discipline impact factor | Impact | Sophisticated | Influence
23 | IFmed | Median impact factor | Impact | Sophisticated | Influence
24 | NJP | Normalised journal position | Impact | Sophisticated | Influence
25 | FCS | Field Citation Score, number of citations expected for a paper of the same type within a field and year | Impact | Sophisticated | Influence
26 | CPP/JCSm | Normalised citation score (CS/NCS) | Impact | Sophisticated | Influence
27 | H | Cumulative achievement | Quality | Simple | Expertise
28 | hmx | Median h across multiple databases | Quality | Simple | Expertise
29 | g | Cumulative achievement, includes more information than h | Quality | Simple | Expertise
30 | H(2) | Weights most productive papers, but requires more citations to be included in index | Quality | Simple | Expertise
31 | A-index | Magnitude of citations to a researcher’s papers | Quality | Simple | Influence
32 | R-index | Improves sensitivity of A | Quality | Simple | Influence
33 | ħ-index | Structure of citations to papers | Quality | Simple | Influence
34 | M-quotient | Adjusts h for length of career | Quality | Simple | Influence
35 | E index | Includes ignored excess citations in h index | Quality | Simple | Influence
36 | Citation Age | The age of citations referring to a researcher’s work | Sustainability | Simple | Influence
37 | Aggregate Immediacy index | How quickly papers in a subject are cited | Sustainability | Sophisticated | Influence
38 | AWCR, AW & per author AWCR | Age-weighted citation rate | Sustainability | Simple | Influence
39 | WorldCat | Inclusion in academic libraries internationally | Innovation and social benefits | Simple | Expertise
40 | National and local Library Catalogues | Inclusion in national library catalogues and bibliographies that include press coverage | Innovation and social benefits | Simple | Expertise

*As we learn more about the indices during the tests, we expect to find that some measure activity better in a sub-portfolio other than the one to which they were originally assigned.

To fully understand how complicated even simple indices can be, and to ensure that this is the final list of indicators for the analyses, we examined whether the indicators are independent of or dependent on other indices, and whether their interpretation depends on the use of reference standards and weighting systems (appendix 7). No unexpected complications were discovered and no further indicators were excluded.

Table 7. Analysis of independence

ID | Indicator | Dependent on another index or on a reference standard
1 | P |
2 | Pisi, Pgs |
3 | Pts | Authority list
4 | Co-publications |
5 | Categorised publication type |
6 | C +sc |
7 | CPP |
8 | Number of significant papers |
9 | Ptop | Authority list
10 | Age and productivity | CPP
11 | %Pnc |
12 | Number of different co-authors |
13 | Hi-index | H dependent
14 | POP variation individual H index | H, authors per paper
15 | n-index | H, journal h
16 | Alternative h index | H, authors per paper
17 | Hp | H, authors per paper
18 | Diachronous IF |
19 | Y Factor | ISI JIF
20 | NJI | Citation average in subfield
21 | JFIS | 5 year field journal average
22 | DIF | Number of citable items in journal over time
23 | IFmed | Median IF of journals in subject category
24 | NJP | JCR category ranked by JIF
25 | FCS | Field citation score
26 | CPP/JCSm | Average citation rate of individuals in journal set
27 | H |
28 | hmx |
29 | g | H dependent
30 | H(2) | H dependent
31 | A-index | H, A dependent
32 | R-index |
33 | ħ-index | H dependent
34 | M-quotient | H dependent
35 | E index | H dependent
36 | Citation Age |
37 | Aggregate Immediacy index |
38 | AWCR, AW & author AWCR |
39 | WorldCat |
40 | National and local Library Catalogues |

Method of analysis

The forty indicators will enable the following analyses that will help us include stable and recommended indices in the portfolio:

1. The success of simple contra sophisticated indicators.
2. Correlation between simple and sophisticated indicators.
3. Correlation between the four disciplines and the indicators.
4. Correlation between the five seniorities and the indicators.
5. Correlation between gender and the indicators (where data allows sensible analyses).
6. Correlation between (gender) seniority, field and indicator.
7. Correlation between (gender) seniority, field and indicator categorised as simple or sophisticated.
8. The differences in performance between indicators of the same type of impact.
9. The effect of discipline on the success of the indicators.
10. The effect of seniority on the success of the indicators.
11. The effect of gender on the success of the indicators (if data allows sensible analyses).
12. The effect of data quantity on the indicators.
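Several of the analyses listed above are correlations between indicator values across researchers. The report does not name a correlation coefficient; as one possible choice, the sketch below computes Spearman's rank correlation, which suits skewed citation data because it uses ranks rather than raw values. The function names and the example values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical values of a simple and a sophisticated indicator
    # for the same five researchers.
    simple = [5.0, 2.1, 9.4, 0.8, 3.3]
    sophisticated = [1.2, 0.7, 1.9, 0.4, 1.0]
    print(spearman(simple, sophisticated))
```

A high rank correlation between a simple and a sophisticated indicator would suggest that, for ordering researchers within a group, the simpler measure carries much of the same information.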

Methodological considerations

Simple vs sophisticated

Lessons learnt from the test-case narrative taught us that simple indicators can give a lot of information, which in turn can be demanding to contextualise. We wish to understand whether they perform just as well as the sophisticated indicators that indicate more or less the same thing, how the two correlate, and how useful they are for each discipline and seniority. This is why the sophisticated indicators appear on the list, even though they would be too intricate and demanding for the researcher to calculate. All but one of the indices in the impact category are “sophisticated” and are traditional disciplinary benchmarks. This problem was already identified in the review, because good measures of impact depend on a high level of aggregation to be comparable to global performance standards. We are interested in whether other indicators, such as CPP, are as informative as these and could be used as a proxy for impact.

Indicators that account for co-authorship effects

The hi, POP variation, n, alternative h and hp indices overlap and are information-redundant if used together. We will rank these and discuss which are the most representative of each discipline at the individual level. The usefulness of identifying individual contribution depends on the field. Of course, bibliometrically it is interesting to provide a metric that accounts for the number of papers researchers would have written if they had worked alone, or that supports intra- or interdisciplinary analysis. But from the researcher's point of view it is debatable whether this is important. If multi-authoring papers is a disciplinary tradition, fractionalising the contribution would be detrimental to the individual and we would not recommend that the author uses fractionalisation schemes. However, if researchers in a multi-authoring discipline choose to write alone, it is important to provide fractional counting tools to emphasise their efforts.
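To show how two of the co-authorship adjustments differ in practice, the sketch below computes an individual h-index as h squared divided by the total number of authors in the h-core, and a variant that first divides each paper's citations by its number of authors and then computes h on the normalised counts. These follow the commonly published definitions of the hi-index and of the Publish or Perish variation, as far as we understand them. The function names and the example papers are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return max((rank for rank, c in enumerate(ranked, 1) if c >= rank), default=0)


def hi_index(papers):
    """h squared divided by the total number of authors in the h-core.

    papers: list of (citations, number_of_authors) tuples.
    """
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    h = h_index([c for c, _ in ranked])
    if h == 0:
        return 0.0
    core_authors = sum(authors for _, authors in ranked[:h])
    return h * h / core_authors


def hi_norm(papers):
    """h-index of per-author citation counts (citations / authors per paper)."""
    return h_index([c / authors for c, authors in papers])


if __name__ == "__main__":
    # Hypothetical publication list: (citations, authors).
    papers = [(10, 2), (8, 4), (5, 1), (4, 2), (3, 3), (0, 1)]
    print(h_index([c for c, _ in papers]), hi_index(papers), hi_norm(papers))
```

For this list the unadjusted h is 4, while both adjusted values are considerably lower, which illustrates why fractionalisation can be detrimental to individuals in disciplines where multi-authorship is the norm.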

Indicators of quality

The information redundancy between the h, g, n, H(2), A, R, ħ, M-quotient and e-indices will be investigated, as will the question of whether the indices favour an academic seniority or field. Further, the use of the h-index (or g-index) as a benchmark in different areas, for different seniorities or for gender will be investigated, for example the h-index of an author compared to the h-index of the seniority (within the specialty). We also wish to investigate whether CPP gives a better representation of quality than the h-index. Compared to h, CPP is more intuitive, as all citations and papers are included in the calculation rather than a “core” of papers. As h is acknowledged for its simplicity and is known in the research community, the guidelines for both the evaluator and the researcher will list the main pitfalls of the h-index, emphasising how unwise comparison across fields is.
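The redundancy between the h-type indices is easier to see when they are computed side by side from the same citation list. The sketch below uses the standard published definitions of h, g, h(2), A, R, e, the m-quotient and CPP. The function name, the example citation counts and the career length are hypothetical, and the ħ-index and n-index are left out.

```python
from math import sqrt


def quality_indices(citations, career_years):
    """Compute several h-type indices and CPP from one citation list."""
    ranked = sorted(citations, reverse=True)
    n = len(ranked)

    # h: largest rank whose paper has at least that many citations.
    h = max((r for r, c in enumerate(ranked, 1) if c >= r), default=0)
    # h(2): largest rank whose paper has at least rank squared citations.
    h2 = max((r for r, c in enumerate(ranked, 1) if c >= r * r), default=0)
    # g: largest rank g such that the top g papers together have >= g^2 citations.
    g, running = 0, 0
    for r, c in enumerate(ranked, 1):
        running += c
        if running >= r * r:
            g = r

    core = sum(ranked[:h])  # citations received by the h-core
    return {
        "h": h,
        "g": g,
        "h(2)": h2,
        "A": core / h if h else 0.0,   # mean citations in the h-core
        "R": sqrt(core),               # square root of h-core citations
        "e": sqrt(core - h * h),       # excess citations ignored by h
        "m": h / career_years,         # h adjusted for career length
        "CPP": sum(ranked) / n if n else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical researcher: six papers, eight years since first publication.
    for name, value in quality_indices([10, 8, 5, 4, 3, 0], 8).items():
        print(name, round(value, 3))
```

Note that A, R, e and the m-quotient are all derived from h and its core, which is the dependence recorded in Table 7, whereas CPP uses every paper and citation.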

Indicators of impact

Clearly there are more sophisticated indicators of impact in our study than simple ones. Note, though, that these are designed for a higher level of aggregation than the individual. However, researchers will undoubtedly want to draw attention to how successful they are within their field, especially if they have published in journals with a high impact factor and their papers have received a lot of citations throughout their career. We will test Y, NJI, JFIS, DIF, IFmed, NJP and CPP/JCSm to understand how they correlate with simpler impact indicators, and whether these simple indicators can be aggregated for use as local benchmarks (Table 8).

Table 8. Local benchmarks developed from simple indicators

Reference Standard | Indicator
Production of colleagues of same academic seniority within department or institution | P
Production of same academic seniority within field, national or international level | P
Production of experts in specialty | P
Citations to colleagues of same academic seniority within department or institution | C + sc
Citations to same academic seniority within field, national or international level | C + sc
Citations to experts in specialty | C + sc
H index at local, national or international level | H
M quotient at local, national or international level | M-quotient

The case narrative taught us that simple indicators can be aggregated into useful local performance benchmarks. However, indicators that are simple at the individual level become complex and time-consuming when used at a higher level of aggregation. The time and effort needed for calculation must be made clear in the guidelines, as this affects the practicality and usefulness of the standard, however relevant it may be. Other possible benchmarks, where the amount of data allows for sensible comparisons, could be found in disciplinary databases, such as the individual’s visibility and representation in ADS, Inspire, Inspec, Biomed, PubMed or the Philosophers Index. The challenge for us is to find an easy method, which the researcher can reproduce, for identifying the most highly cited papers in a researcher’s specialty rather than in an ISI-defined subject category. This will be extremely difficult in areas where citation activity is not high, and we need to analyse how publication types, years and citations correlate with sophisticated field-citation indicators.

Indicators of sustainability

Together with the indicator Age and Productivity, which is purported primarily to measure outcome, we will test which of the indicators in this category best reflects the researcher’s currency.
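One of the indicators in this category, AWCR, divides the citations to each paper by the paper's age and sums the results; AW is the square root of that sum. The minimal sketch below follows that definition. The convention that a paper published in the current year has age 1 is an assumption, and the function names and example data are hypothetical.

```python
from math import sqrt


def awcr(papers, current_year):
    """Age-weighted citation rate.

    papers: iterable of (publication_year, citations) tuples.
    """
    total = 0.0
    for year, cites in papers:
        # Assumption: a paper from the current year counts as one year old.
        age = current_year - year + 1
        total += cites / age
    return total


def aw_index(papers, current_year):
    """Square root of AWCR."""
    return sqrt(awcr(papers, current_year))


if __name__ == "__main__":
    # Hypothetical publication list: (year, citations).
    papers = [(2013, 4), (2010, 8), (2004, 20)]
    print(awcr(papers, 2013), round(aw_index(papers, 2013), 3))
```

In this example the ten-year-old paper with 20 citations contributes no more than the four-year-old paper with 8, which is how the indicator emphasises recent citation activity over accumulated counts.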

Indicators of innovation and social benefits

The success and informativeness of the indicators of innovation and social benefits depend on the completeness of the information on the researcher’s CV, and are also highly dependent on the culture, politics and economics of the country and/or domain in which the researcher is active. A self-evaluation questionnaire covering knowledge exchange, earning capacity, use in the public sphere, patent applications and the effects of publication is currently being tested in the HEFCE evaluations in the UK (Neiderkrontenhaler et al 2011; Wildgaard et al 2013). This form of evaluation falls outside our framework of bibliometrics. We recommend Neiderkrontenhaler’s questionnaire as useful in developing a checklist or guideline for reporting innovation and social benefit. In the narrative case study, we found WorldCat and the Danish bibliography accessed through bibliotek.dk to be useful sources for indicating the incorporation of published works in public libraries and appearance in the media. Presence in a public library catalogue is used as a proxy for dissemination in the social sphere, and appearance in the media is also assumed to be a measure of societal impact. The disciplinary usefulness of similar national library catalogues will be investigated.

Next steps

Status June 2013: data is still being collected and analysed.

The bibliometric analyses, results, conclusions and recommendations will be presented in the final report due August 2013. The thorough methodological preparations and preliminary studies described in this document have enabled us to design analyses that are targeted at our potential users within the four disciplines and that will result in useful information. Furthermore, we can already sketch a structure for the guidelines that will accompany the recommended bibliometric indicators:

For Researchers: Guidelines for using bibliometric indicators on your CV

- Coverage in databases. How to choose where to extract data?
- Gender
- Academic status
- Discipline
- Suggestions for benchmarks that are relevant to you
- Pitfalls
- Deficiencies
- Presentation techniques
- Good self-evaluation practice

For Evaluators: Guideline for Evaluators

- Interpreting bibliometric self-evaluation
- Ethics of self-evaluation

Reference list

Bach, J. (2011). On The Proper Use of Bibliometrics to Evaluate Individual Researchers. Rapport de l’Académie des sciences - 17 janvier 2011: Report presented on 17 January 2011 to the Minister of Higher Education and Research.

Bornmann, L., Mutz, R., Neuhaus, C., & Daniel, H.-D. (2008). Citation counts for research evaluation: standards of good practice for analyzing bibliometric data and presenting and interpreting results. Ethics in Science and Environmental Politics, 8, pp. 93-102.

Clarke, R. (2008). An Exploratory Study of Information Systems Researcher Impact. Commun. AIS, 22, 1 (January 2008). Preprint at http://www.rogerclarke.com/SOS/Cit-CAIS.html

Clarke, R., & Pucihar, A. (2012). The Web of Science Revisited: Is it a Tenable Source for the Information Systems Discipline or for eCommerce Researchers? Accessed 9 June 2013 at http://www.rogerclarke.com/SOS/WoSRev.html

Campbell, P. (2008). Escape from the Impact Factor. Ethics in Science and Environmental Politics, 8, pp. 5-7.

Cheung, W. (2008). The Economics of Post-Doc Publishing. Ethics in Science and Environmental Politics, 8, pp. 41-44.

Collini, S. (2012). Bibliometry. In What Are Universities For? London: Penguin.

Laloë, F., & Mosseri, R. (2009). Bibliometric Evaluation of Individual Researchers: not even right....not even wrong! Europhysics News, 5, pp. 26-29.

Lawrence, P. (2008). Lost in Publication: how measurement harms science. Ethics in Science and Environmental Politics, 8, pp. 9-11.

Lundberg, J. (2009). Lifting the crown: citation z-score. Journal of Informetrics, 1(2), pp. 145-154.

Moed, H. F. (2007). The use of bibliometric indicators in research evaluation and policy. PowerPoint lecture presented at: Colloque de l’Académie des sciences "Évolution des publications scientifiques - Le regard des chercheurs" des 14-15 mai 2007. Retrieved June 10, 2013 from: http://www.academie-sciences.fr/video/v140507.htm

Murphy, M., Steele, C. M., & Gross, J. J. (2007). Signaling Threat: How Situational Cues Affect Women in Math, Science, and Engineering Settings. Psychological Science, 18(10), pp. 879-885.

Niederkrotenthaler, T., Dorner, T. E., & Maier, M. (2011). Development of a practical tool to measure the impact of publications on the society based on focus group discussions with scientists. BMC Public Health, doi:10.1186/1471-2458-11-588.

Potočnik, J. (2005). The European Charter for Researchers: The Code of Conduct for the Recruitment of Researchers. Brussels: European Commission: Directorate-General for Research.

RAISE. (2013). Recognizing the Achievements of Women in Science, Technology, Engineering, Maths and Medicine, at http://www.raiseproject.org/index.php [accessed 10 June 2013]

Salisbury, L. (2009). Web of Science and Scopus: A Comparative Review of Content and Searching Capabilities. The Charleston Advisor (July), pp. 5-18.

TS (2013a) 'Journal Submission Process' Thomson Reuters, at http://ip-science.thomsonreuters.com/mjl/selection/ [accessed 10 June 2013]

TS (2013b) 'The Thomson Reuters Journal Selection Process' Thomson Reuters, May 2012, at http://thomsonreuters.com/products_services/science/free/essays/journal_selection_process/ [accessed 8 June 2013]

Wildgaard, L., Schneider, J., & Larsen, B. (2013). Quantitative Evaluation of the Individual Researcher: a review of the characteristics of 114 bibliometric indicators. Manuscript submitted for publication.

WP2. (2012). Progress Report (2): ACUMEN Web Presence Survey Results. Work Package 2: Assessing the Institutional Web presence of researchers. University of Wolverhampton, Statistical Cybermetrics Research Group. Retrieved June 10, 2013 from: https://www.dropbox.com/home/Project%20Managment/WP2%20work%20folder.

Zitt, M. B. (2008). Challenges for scientometric indicators: data demining, knowledge-flow measurements and diversity issues. Ethics in Science and Environmental Politics, 8, pp. 49-60.

List of Appendices

1. Sample, corrected for working links and duplicates removed

2. Researchers excluded from sample

3. Seniority, disciplinary and geographical distribution

4. Work task protocol

5. Reduction of 64 indicators for analysis to 37 plus 16 reference standards

6. Identification of the information needed to calculate the indicators and reference standards

7. The dependence of indicators on other indicators, reference standards and weighting systems

Appendix 1: Sample corrected for working links and duplicates

We have a sample of researchers (n = 1211) who provided links to a publication list. I have been through all the links to remove duplicates, researchers who do not belong in the discipline, dead links, and links to material other than a personal publication list, e.g. blogs, group websites and information about areas of research. This has resulted in a sample of 793 researchers with working links to publication list(s), distributed as follows:

In Astronomy we have 203 researchers, 17% women

Astronomy | Phd | Post Doc | Assis. Prof | Assoc. Prof | Prof
ACUMEN shared data set | 57 | 142 | 66 | 144 | 86
Provide link to web material | 18 | 71 | 37 | 93 | 63
Working link to publication list | 15 | 49 | 27 | 72 | 40
Men/women with working link | 12/3 | 37/12 | 20/7 | 61/11 | 38/2

In Environmental Science we have 203 researchers, 23% women

Environment | Phd | Post Doc | Assis. Prof | Assoc. Prof | Prof
ACUMEN shared data set | 31 | 65 | 92 | 200 | 126
Provide link to web material | 8 | 29 | 64 | 135 | 83
Working link to publication list | 3 | 18 | 42 | 85 | 55
Men/women with working link | 3/0 | 12/6 | 33/9 | 71/14 | 47/8

In Philosophy, we have 250 researchers, 19% women

Philosophy | Phd | Post Doc | Assis. Prof | Assoc. Prof | Prof
ACUMEN shared data set | 25 | 47 | 85 | 147 | 151
Provide link to web material | 14 | 34 | 67 | 124 | 129
Working link to publication list | 9 | 23 | 49 | 82 | 87
Men/women with working link | 6/3 | 20/3 | 41/8 | 64/18 | 72/15

In Public Health we have 137 researchers, 39% women

Health | Phd | Post Doc | Assis. Prof | Assoc. Prof | Prof
ACUMEN shared data set | 48 | 54 | 82 | 194 | 97
Provide link to web material | 17 | 21 | 49 | 97 | 58
Working link to publication list | 9 | 14 | 31 | 53 | 30
Men/women with working link | 2/7 | 7/7 | 36/13 | 36/17 | 20/10
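As a worked check of how the share of women is derived from a "Men/women with working link" row, the following minimal Python sketch recomputes the Astronomy figure quoted above. The function name and the "men/women" string format are assumptions made for illustration; they are not part of the project's tooling.

```python
def share_of_women(men_women_cells):
    """Sum 'men/women' cells and return (women, total, percentage of women)."""
    men = 0
    women = 0
    for cell in men_women_cells:
        m, w = cell.split("/")
        men += int(m)
        women += int(w)
    total = men + women
    return women, total, 100 * women / total


# Astronomy row: Phd, Post Doc, Assis. Prof, Assoc. Prof, Prof
women, total, pct = share_of_women(["12/3", "37/12", "20/7", "61/11", "38/2"])
print(women, total, round(pct))  # 35 203 17
```

The same function can be applied to the corresponding row of any of the other discipline tables.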

Overall, in our sample of 793 researchers, 182 are women (23%). This is below the expected European percentage of women in science, which is between 30% and 44% depending on field, as reported in the She Figures 2012: http://ec.europa.eu/research/science-society/document_library/pdf_06/she-figures-2012_en.pdf.

Appendix 2: Researchers excluded from sample

Astronomy (columns: Phd, Post Doc, Assis. Prof, Assoc. Prof, Prof, Total)
Dead link: 2 16 1 6 13 38
Not Discipline: 1 1
Duplicate: 1 1
Not publication list: 1 6 8 15 10 40
Not correct seniority:

Environment (columns: Phd, Post Doc, Assis. Prof, Assoc. Prof, Prof, Total)
Dead link: 2 6 7 25 11 49
Not Discipline: 1 2 3
Duplicate:
Not publication list: 2 4 15 25 15 61
Not correct seniority: 1 1

Philosophy (columns: Phd, Post Doc, Assis. Prof, Assoc. Prof, Prof, Total)
Dead link: 2 5 9 12 17 45
Not Discipline: 1 1 4 2 8
Duplicate: 1 1 5 3 10
Not publication list: 2 4 8 21 20 55
Not correct seniority:

Public Health (columns: Phd, Post Doc, Assis. Prof, Assoc. Prof, Prof, Total)
Dead link: 3 2 8 17 10 40
Not Discipline: 2 1 1 2 6
Duplicate: 1 1
Not publication list: 3 4 9 26 16 58
Not correct seniority:

Appendix 3: Seniority, disciplinary and geographical distribution

Countries: AU BG CH CN CZ DE DK DZ EE ES FI FR HU IL IN IT NL NO PL RU SK UK USA

Astro Phd: 1 3 2 4 2 2 1
Astro Post Doc: 1 1 16 1 5 1 3 1 6 1 10 3
Astro Assis Prof: 1 2 4 2 1 1 1 9 1 5
Astro Assoc Prof: 2 3 2 2 9 11 3 3 2 12 8 4 1 2 6 1
Astro Prof: 1 1 1 2 2 3 1 5 4 6 3 1 3 7
Total Astro.: 1 1 1 1 7 28 5 2 19 2 17 11 11 2 26 19 10 1 3 31 5

Enviro Phd: 1 1 1
Enviro Post Doc: 1 1 5 2 2 1 1 1 4
Enviro Assis Prof: 1 4 1 2 1 5 3 11 8 3 1 2
Enviro Assoc Prof: 5 1 13 7 5 3 3 4 7 15 4 1 4 1 12
Enviro Prof: 3 4 3 2 5 4 3 7 4 9 1 3 7
Total Enviro.: 1 14 7 23 10 17 9 6 15 11 37 14 1 11 2 25

Phil Phd: 1 2 1 1 1 3
Phil Post Doc: 2 6 1 1 3 2 1 1 2 4
Phil Assis Prof: 3 8 1 1 3 1 5 1 6 6 1 1 11 1
Phil Assoc Prof: 5 8 1 12 2 6 1 1 16 4 5 1 18 2
Phil Prof: 1 10 3 7 6 3 16 7 2 31 1
Total Phil.: 6 30 13 2 25 7 19 2 7 39 19 6 4 67 4

P. Health Phd: 2 1 1 4 1
P. Health Post Doc: 5 2 1 1 5
P. Health Assis Prof: 4 2 1 1 1 1 4 9 8
P. Health Assoc Prof: 4 11 1 1 1 3 1 3 3 7 7 2 9
P. Health Prof: 7 8 3 4 3 5
Total P. Health.: 20 25 1 2 6 3 4 4 3 16 23 2 28

Overall: 1 2 1 1 27 85 66 1 16 67 21 46 32 32 2 118 75 1 29 1 9 151 9

Appendix 4:

Work guideline: Extracting publications from Google Scholar and Web of Science.

Contents

ACUMEN Project description: What is ACUMEN?

Your Job: A brief outline and how to save your work

Method of Data Collection: Google Scholar, through Publish or Perish

How to export from POP to Excel

Tips to searching

Example of a step-by-step search strategy

Example of an author that is impossible to verify

Method of Data Collection: Web of Science

How to export from WOS to Excel

Before the next search

Contact Information

Lorna Wildgaard (project leader, Copenhagen) [email protected] tel: 32341460

Jesper W Schneider (project leader, Århus) [email protected]

Birger Larsen (project advisor, Copenhagen) [email protected]

Send your email address to Lorna to join the project’s Dropbox folder to share files, experiences and store completed work.

ACUMEN Project description: What is ACUMEN?

ACUMEN stands for Academic Careers Understood through Measurements and Norms. ACUMEN is a European research collaboration aimed at understanding the ways in which researchers are evaluated by their peers and by institutions, and at assessing how the science system can be improved and enhanced. This FP7 project is a cooperation among nine European research institutes with Professor Paul Wouters (CWTS – Leiden University) as principal investigator.

The aim? To use the ACUMEN members’ combined expertise to produce a portfolio of both traditional indicators and new (useful) qualitative indices and quantitative web-based and bibliometric measures. These measures will be presented to the researcher as an online enriched CV, which documents their research activities and supports assessments of their expertise, output and influence in the context of their demographic information and career path narratives. This visualization tool will support the core creativity of research in all disciplines and will not steer researchers towards publishing in high-JIF journals at the expense of working on low-prestige but relevant problems. Hence the indicators are not limited to publication and citation counts, or to traditionally measurable forms of scientific communication in journals, as a lot of communication nowadays takes place on the web or through popular media channels or interactive installations.

The philosophy behind the project is to address the gap between creating research, evaluating research and promoting excellence. There is a problem in current systems of research evaluation, and this problem is complicated. Researchers are people who are being evaluated within narrow frameworks and with limited technology. In these systems the societal role of their research is secondary, and the methods of evaluation, such as peer review, can be biased and subjective, give power to a scientific elite and reinforce the gender power structure. To understand the effect of evaluation, we need to be aware of differences between disciplines, genders and cultures. Thus, to obtain consistency between the mission of the researcher and the mission of evaluation, ACUMEN will also develop guidelines for Good Evaluation Practice, in the hope that evaluation will be implemented in a way that does not undermine the authority of the researcher in their process of quality, and that supports their craftsmanship without either giving them all the freedom or taking freedom away.

What difference will ACUMEN make? ACUMEN is investigating how evaluation plays out in relation to the diversity of the labour force and gender. This questions the neutrality of evaluation and how straightforward it is. In cooperation with the European Commission, ACUMEN will contribute to policies that get research evaluation on a better track. The goal is still to promote excellence and tools that can solve societal problems while keeping space for creativity. The connection between analysis of the individual’s career and evaluation, and the interaction between the evaluation process and career advancement, will be strengthened. The measures created will enrich CVs and point to activities in a systematic way that is acceptable to evaluators. The ACUMEN Portfolio is the link between knowledge evaluation and how this is embedded in the evaluation of research careers.

Your Job: A brief outline and how to save your work

Please send your email address to [email protected] (Lorna) and you will be invited to join the Dropbox folder “ACUMEN Data Extraction”. In this folder you will find a folder for each of the four disciplines. There is also a “Troubleshooting” folder where you will find tips on how to search Web of Science and Google Scholar. Feel free to upload your own tips to share with your project colleagues.

You will be allocated a master Excel sheet containing a list of authors and links to their online publication list(s). All text in the Excel sheets is to be written in English. The only information you alter in this sheet is the following:

Part 1

1.1) Follow the link to the author’s publication list.

1.2) Verify that the link is working. Mark in the Excel sheet, in the cell “link”, whether the link is:
- working and leads to a researcher within the discipline you have been assigned (w)
- dead (d)
- not a publication list (n)
- not the academic seniority you have been assigned (not seniority)
- a researcher who does not belong to the discipline (nd)
- a researcher who appears on the list more than once (duplicate)

1.3) If the link leads to a publication list (w) of a researcher within the discipline you have been assigned (duplicates removed), copy/paste the whole line of author information into the 2nd sheet, labeled “working links”.

1.4) Save this Excel sheet in the Dropbox folder ACUMEN Data Extraction, under the correct discipline and the correct academic seniority, as: Discipline_academic seniority_workinglinks_your initials

1.5) Save a copy of the publication list in the corresponding folder in our Dropbox. Save it as “Author surname_Bib ID number_your initials”, for example “Druckmullerova_8_LEW”.

What format to save in?
- If the publication list can be easily exported, export it into an Excel file, text file or Word document (whatever is easiest).
- If the publication list is a PDF, save it as a PDF.
- If the publication list is a list on a website that requires the references to be copy/pasted one by one, take a screenshot and save that. Ensure you have all the bibliographical information.
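For anyone who prefers to check the marking in steps 1.2 and 1.3 with a script rather than by eye, the following is a minimal Python sketch. It assumes the master sheet has been read into a list of dictionaries with a "link" key holding the status code; that layout and the function name are illustrative assumptions, not part of the protocol.

```python
from collections import Counter

# Status codes from step 1.2 of the protocol.
STATUS_CODES = {"w", "d", "n", "nd", "duplicate", "not seniority"}


def split_working_links(rows):
    """Return the rows marked 'w' plus a tally of all status codes."""
    tally = Counter()
    working = []
    for row in rows:
        status = row["link"].strip().lower()
        if status not in STATUS_CODES:
            raise ValueError(f"unknown link status: {row['link']!r}")
        tally[status] += 1
        if status == "w":
            working.append(row)  # these rows go to the "working links" sheet
    return working, tally
```

The tally gives the counts per exclusion reason, and the first return value is the content of the “working links” sheet.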

Part 2

2.1) Using the sheet “working links” as your master, start with the first author on the list. Follow the link and keep it open while you find the author’s publications in Web of Science and Google Scholar.

2.2) Add 3 more cells in the header of the “working links” sheet at the end of the author information: “number of publications on list”, “number of publications GS”, “number of publications WOS”.

2.3) If the author has links to more than one list, you’ll have to compare the lists for duplicates. Assess what the author writes about, the institutions they are affiliated to and the age range of the publications. This will help you verify the publications found in Web of Science and Google Scholar.

2.4) Note how many publications the author has listed, and write the number in the cell “number of publications on list”.

Part 3

3.1) For each author create a new Excel file “Discipline_seniority_author name_yourinitials” with 3 sheets: name the first “author name_GS”, the second “authorname_WOS”, and the third “authorname_duplicates”.

3.2) Search Google Scholar (GS) using Publish or Perish version 4 or newer for publications by the author and export to the sheet “author name_GS”.

3.3) Search Web of Science (WOS) for publications by the author and export to the sheet “authorname_WOS”.

3.4) Some researchers’ names are so common that they generate an enormous number of results in GS, and it is accordingly impossible to verify authorship. Mark in the author’s Excel sheet (“Discipline_seniority_author name_impossible_yourinitials”) that they were impossible and save this sheet to the Dropbox folder ACUMEN Data Extraction, Impossible.

3.5) Copy and paste the GS list into the third sheet “authorname_duplicates”. Highlight the list with a colour. Copy and paste the WOS list into the same sheet. Make sure the titles are in the same column. Select the entire list and sort it alphabetically by title. The colour makes it easy to see the duplicate publications, both between WOS and GS, and within GS.
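The duplicate check in the last step (combine the GS and WOS lists, sort by title, look for repeated titles) can be sketched in Python as follows. This is only an illustration under stated assumptions: titles are compared after lower-casing and collapsing punctuation, which is our own simplification, and the function names are hypothetical. The protocol itself relies on manual inspection in Excel.

```python
import re
from itertools import groupby


def normalise_title(title):
    """Lower-case a title and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def find_duplicates(gs_titles, wos_titles):
    """Combine both lists, sort by title and return groups sharing a title."""
    combined = [("GS", t) for t in gs_titles] + [("WOS", t) for t in wos_titles]
    # The sort is stable, so GS records stay ahead of WOS records within a title.
    combined.sort(key=lambda rec: normalise_title(rec[1]))
    groups = []
    for _, group in groupby(combined, key=lambda rec: normalise_title(rec[1])):
        group = list(group)
        if len(group) > 1:
            groups.append(group)
    return groups
```

The source label on each record plays the role of the colour in the Excel sheet: a group containing both "GS" and "WOS" is a duplicate between the databases, and a group containing "GS" twice is a duplicate within Google Scholar.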

If you make changes to the files you have saved in the Dropbox folder, please save with a revised number, such as Public Health_Professor_JSmith_LW02

For both GS and WOS: if the researcher has no publications, please write “No publications” in their corresponding Excel sheet.

Method of Data Collection: Google Scholar, through Publish or Perish

Download and install Publish or Perish: http://www.harzing.com/pop.htm

Search using the Author impact function. The Author impact analysis page allows you to perform a quick analysis of the impact of an author's publications. It contains the following panes:
- Author query pane
- Results pane

How to perform an Author impact analysis

To perform a basic impact analysis:

1. Enter the author's name in the Author's name field.
2. Click Lookup or press the Enter key.
3. The program will now contact Google Scholar to obtain the citations, process the list, and calculate the citation metrics, which are then displayed in the Results pane. The full list of results is also available for inspection or modification and can be exported in a variety of formats.

From the researcher’s publication list, see how the researcher writes their name in the author byline. Use this form to search the databases. For example, the author name below appears in the following forms, so you will have to search them all. Write each form in “quotes” with OR between the names.

“Piotr A Dybczynski” OR “PA Dybczynski” OR “Dybczynski, P”
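When an author has many name forms, the query string can be assembled mechanically. The short Python sketch below shows one way to do this; the function name is hypothetical and the output uses straight quotes, which is an assumption about what the search field accepts.

```python
def build_author_query(name_variants):
    """Quote each name variant and join the variants with OR."""
    quoted = [f'"{name.strip()}"' for name in name_variants if name.strip()]
    return " OR ".join(quoted)


query = build_author_query(["Piotr A Dybczynski", "PA Dybczynski", "Dybczynski, P"])
print(query)
```

The resulting string can be pasted into the author field as a single combined search.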

How to export from POP to excel:

Step 1: Copy > Copy statistics for Excel with header. Open an Excel sheet and press Ctrl+V.
Step 2: Copy > Copy results for Excel with header. Open an Excel sheet and press Ctrl+V.

Tips to searching

1. Always use "quotes" around the author’s name, e.g. "A Harzing".
2. PoP is not case sensitive: "A HARZING" gives the same result as "a harzing".
3. The order of search terms does not matter. "A Harzing" will give the same result as "Harzing A".
4. Use an author’s initials rather than their full given name, as not all journals publish author names in full.
5. If an author has consistently published with only one initial, you can exclude namesakes with 2nd and 3rd initials by using wildcards in the "exclude these names" field, e.g. when searching for "G Sewell", you can exclude "G* Sewell" "G** Sewell".
6. You cannot use "*G Sewell" to exclude "WG Sewell" or "AG Sewell". You need to exclude these authors manually by listing them in the "exclude these names" field. For example, to exclude CLC Kulik from a search for "C Kulik", enter "CLC Kulik" in the Exclude these names field. You can enter more than one exclusion: "CL Kulik" "CLC Kulik" would exclude both these combinations from the search.
7. If an author has published under two different names (e.g. maiden name and married name), use OR between search terms for a combined search: "WG Sewell" OR "W Sewell".
8. If an author has mostly published with two initials, but has incidental publications with one initial, a combined search with initials and full given name (e.g. "CT Kulik" OR "Carol Kulik") will usually capture all of their publications.
9. Do not try to use the AND keyword in an author search. Google Scholar does not recognize this keyword and will treat it as a normal search word. Instead, just enter multiple author names; this will behave as an "and" search by default.
10. If you are looking for an author whose name contains accented letters, it might help to include several variations of the name, both with and without accents, and also with the accented letters missing. For example, to search for someone with the surname Veríssimo (note the accent on the first 'i'), use the following names in the Author field: "Veríssimo" OR "Verissimo" OR "Verssimo".
11. If the list of results is fairly limited, you can manually include or exclude citations from the analysis by checking or clearing the boxes in the Results list.
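The three name variations in tip 10 (as written, with accents stripped, and with the accented letters missing) can be generated with Python's standard unicodedata module. This is a minimal sketch; the function names are hypothetical, and letters that have no Unicode decomposition (for example "ø") are left unchanged, so such names still need manual variants.

```python
import unicodedata


def accent_variants(surname):
    """Return the surname as written, with accents stripped, and with accented letters dropped."""
    composed = unicodedata.normalize("NFC", surname)
    # Decompose each letter into base letter + accent mark, then drop the marks.
    stripped = "".join(
        ch for ch in unicodedata.normalize("NFD", composed)
        if not unicodedata.combining(ch)
    )
    # Drop every letter that decomposes, i.e. every accented letter.
    dropped = "".join(
        ch for ch in composed
        if unicodedata.normalize("NFD", ch) == ch
    )
    # dict.fromkeys removes repeats while keeping the order.
    return list(dict.fromkeys([composed, stripped, dropped]))


def accent_query(surname):
    """Join the variants into a quoted OR query."""
    return " OR ".join(f'"{v}"' for v in accent_variants(surname))


print(accent_query("Veríssimo"))
```

For a surname without accents the function returns a single variant, so the query is just the quoted name.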

Limiting year

Before limiting the year range, always check whether an author has highly cited publications without a year listing. If you know that a certain author only published after (or before) a certain year, you can enter the start or end years in the Year of publication between ... and ... fields. You can also use these fields if you want to analyse the author's publications from a given period.

(De)Selecting and merging results

You can deselect publications not published by the target author. Simply remove the tick mark in the first column by clicking on it.

You can (de)select more than one publication at once by first selecting the relevant publications and then clicking the "(un)check selection" button.

If the results contain duplicate entries, you can merge them by dragging and dropping the duplicate entries onto the master record.

Selecting relevant publications for unchecking or merging can be made easier by first sorting the results by Cites, Authors, Title, Year, Publication, or Publisher. Sorting is done simply by clicking on the corresponding column heading. Click twice to reverse the sort order.

Here are some shortcuts:

1. The Check all button places check marks in all boxes.
2. The Uncheck all button clears all boxes.
3. When you use the keyboard to travel up and down in the Results list, pressing the space bar toggles the check mark on and off on the selected line.
4. You can also select a consecutive range of items in the list (left-click on the first item, then hold either Shift key and left-click on the last item) and use the Check selection/Uncheck selection buttons to check/uncheck all selected items and recalculate the citation statistics.

Example of a step-by-step search strategy

Search for the target academic’s name with his/her first initial and surname in quotes, e.g. "a harzing". Please note that Google Scholar matches the surname and initials anywhere in the initials+surname combination, so "C Kulik" would be matched by CT Kulik, CLC Kulik, but also by PC Kulik.

It is generally better to use fewer initials and then exclude the ones you don't want (see next point) instead of using more initials, because many citations (or authors) are sloppy with the initials they use. With too many initials in the Author's name field you run the risk of missing a substantial number of relevant articles.

To exclude certain names, enter them in the Exclude these names field. For example, to exclude CLC Kulik from the previous example, enter "CLC Kulik" in the Exclude these names field (and keep "C Kulik" in the Author's name field). You can enter more than one exclusion in Exclude these names: "CL Kulik" "CLC Kulik" would exclude both these combinations from the search.
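To make the effect of the name matching and of the exclusion field concrete, here is a small Python sketch that imitates the behaviour described above: a query such as "C Kulik" matches any author with the same surname whose initials contain "C". It is an illustration of the description only, not Google Scholar's or Publish or Perish's actual matching code, and the function names are hypothetical.

```python
def matches_author(query, candidate):
    """True if surnames agree and the candidate's initials contain the query initials."""
    q_initials, q_surname = query.lower().split(maxsplit=1)
    c_initials, c_surname = candidate.lower().split(maxsplit=1)
    return q_surname == c_surname and q_initials in c_initials


def filter_authors(query, candidates, exclude=()):
    """Keep matching candidates that are not listed in the exclusions."""
    excluded = {name.lower() for name in exclude}
    return [
        c for c in candidates
        if matches_author(query, c) and c.lower() not in excluded
    ]


names = ["C Kulik", "CT Kulik", "CLC Kulik", "PC Kulik", "C Smith"]
print(filter_authors("C Kulik", names))
print(filter_authors("C Kulik", names, exclude=["CL Kulik", "CLC Kulik"]))
```

The first call keeps all four Kulik entries, including PC Kulik; the second call drops CLC Kulik, which is the effect of listing it in the Exclude these names field.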

If the result includes publications not published by the target academic, deselect those publications (remove the tick mark in the first column by clicking on it). If the list is long, it might be easier to deselect all publications first and then only select the relevant publications. Please note that any titles with less than 5 citations usually have very little or no impact on the h-index, but might influence the g-index. Hence, if you are faced with a very long list and are only interested in the h-index, you might consider deselecting all and only reviewing titles with 5 or more citations.
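The remark about titles with few citations can be checked with a small worked example. The Python sketch below computes the h-index and the g-index for an invented citation list (the numbers are illustrative, not project data). The g-index is capped at the number of papers, which is one common convention and an assumption here.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def g_index(citations):
    """Largest g such that the top g papers together have at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


all_titles = [30, 12, 9, 7, 6, 6, 4, 3, 2, 1, 1, 0]  # invented citation counts
five_plus = [c for c in all_titles if c >= 5]

print(h_index(all_titles), h_index(five_plus))  # 6 6
print(g_index(all_titles), g_index(five_plus))  # 8 6
```

In this example, removing the titles with fewer than 5 citations leaves the h-index at 6 but lowers the g-index from 8 to 6, which is why the shortcut is only safe when the h-index is the sole indicator of interest.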

Selecting relevant publications might be easier by sorting the results by Cites, Authors, Title, Year, Publication, or Publisher. Sorting is done simply by clicking on the corresponding column heading.

Example of an author that is impossible to verify

Common names are time-consuming, but it is still quicker to use POP than to export by hand. I found that for common names a general search is quicker than an author search. Write the name of the author in quotes in the author field and then, in the “None of the words” field, write the author names you wish to exclude, again with quotes around each name.

Author’s name: B Jansen

None of the words: "BJ jansen" "BAJ Jansen" "BG Jansen" "KMb Jansen" "bsh Jansen" "bjp Jansen" "bes Jansen" "bmp Jansen" "bh jansen" "bd jansen" "hb jansen" "be jansen" "bjm jansen" "gb jansen" "br jansen" "rb jansen" "brj Jansen" "hwb Jansen" "bd jansen" "ba jansen" "jb jansen" "bgm jansen" "bc jansen" "mb jansen" "bjm jansen" "lb jansen" "bjh jansen" "bd jansen" "pb jansen" "bp jansen" "jansen-schulz"

Year of publication: 2001-2013

The search still returns over 1000 references. I am also being warned that Google will block me. When you find such an author, mark in your dataset that he/she is impossible. Copy all the author’s information from your master Excel sheet into the “Impossible” folder in the ACUMEN Data Extraction Dropbox folder.

Searching and making the results accurate is time-consuming because in February 2013 Google Scholar reduced the maximum number of results per page from 100 to 20. This means that Publish or Perish now has to retrieve up to 5 times as many result pages per query in order to show the full results, which has the following effects on data extraction:

More page requests mean that POP hits the maximum number of requests that Google Scholar allows per hour sooner.

If the number of page requests exceeds the maximum that Google Scholar allows, our IP address will be temporarily blocked by Google Scholar. This block can last for up to 24 hours.

To avoid hitting the maximum allowable request limit, POP uses an adaptive request rate limiter. This limits the number of requests that are sent to Google Scholar within a given period, both short-term (during the last 60 seconds) and medium term (during the last hour).

To achieve the required reduction in requests, Publish or Perish delays subsequent requests for a variable amount of time (up to 1 minute). The higher the recent request rate, the longer the delays.
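The arithmetic behind these points can be sketched in a few lines of Python. The first function counts the result pages needed at 100 and at 20 results per page. The second is a purely hypothetical illustration of a delay that grows with the recent request rate up to 1 minute: the linear ramp and the max_requests parameter are our assumptions, since the actual algorithm used by Publish or Perish is not documented here.

```python
import math


def pages_needed(n_results, per_page):
    """Number of result pages that must be requested."""
    return math.ceil(n_results / per_page)


def delay_seconds(recent_requests, max_requests, max_delay=60.0):
    """Hypothetical delay: grows linearly with the recent request rate, capped at max_delay."""
    load = min(max(recent_requests / max_requests, 0.0), 1.0)
    return max_delay * load


print(pages_needed(1000, 100))  # 10
print(pages_needed(1000, 20))   # 50
print(delay_seconds(50, 100))   # 30.0
```

A query returning 1000 references therefore needs 50 page requests instead of 10, and under the assumed ramp a session that has already used half of its allowance would wait 30 seconds before the next request.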

For us, this means that the amount of data collected per session is limited and that data extraction is slower than before. The alternative is being blocked by Google Scholar for up to 24 hours. As we are performing queries that yield many results (several hundred or more) and issue a number of queries in short succession, the request rate limiter will insert progressively longer delays to keep the overall request rate within acceptable limits. To avoid this, spread the queries over the day.
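The rate-limiting behaviour described above can be illustrated with a minimal Python sketch. It is not Publish or Perish's actual implementation: the class name, the thresholds `per_minute` and `per_hour`, and the linear delay rule are assumptions made purely for illustration. Only the two time windows (the last 60 seconds and the last hour), the one-minute cap on delays, and the drop from 100 to 20 results per page come from the text.

```python
import math
from collections import deque


def pages_needed(n_results, per_page):
    """Number of result pages that must be requested to show n_results hits."""
    return math.ceil(n_results / per_page)


class AdaptiveRateLimiter:
    """Hypothetical limiter: the busier the recent past, the longer the wait."""

    def __init__(self, per_minute=10, per_hour=100, max_delay=60.0):
        # Illustrative thresholds only; Google Scholar's real limits are not published here.
        self.per_minute = per_minute
        self.per_hour = per_hour
        self.max_delay = max_delay
        self.sent = deque()  # timestamps (in seconds) of requests already sent

    def record(self, now):
        """Register that a request was sent at time `now`."""
        self.sent.append(now)

    def delay(self, now):
        """Seconds to wait before the next request, capped at max_delay."""
        # Forget requests older than one hour (medium-term window).
        while self.sent and now - self.sent[0] > 3600:
            self.sent.popleft()
        last_minute = sum(1 for t in self.sent if now - t <= 60)
        last_hour = len(self.sent)
        # Load is the worse of the short-term and medium-term usage ratios.
        load = max(last_minute / self.per_minute, last_hour / self.per_hour)
        return min(self.max_delay, self.max_delay * load)


# 1000 hits needed 10 pages at 100 per page and need 50 pages at 20 per page.
print(pages_needed(1000, 100), pages_needed(1000, 20))
```

Under these assumed thresholds, five requests within a few seconds already produce a 30-second delay, which is why spreading queries over the day keeps extraction moving.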

Method of Data Collection: Web of Science

Open Web of Science (a citation database that is part of Web of Knowledge).

Enter the researcher’s surname and possible initials in the search box. Limit the field to “author”. Under Timespan, limit the search to start from the earliest publication year reported on the author’s publication list.

Press search.

First limit to author name: in the “Refine Results” column, click on “Author” and then on “more options”. Tick the surname and initial option(s) that are relevant and click “refine” to include just these variants.

Repeat for Web of Science categories. If there are just a few categories, click on those you wish to exclude and then click on “exclude”. If there are many options, select the relevant categories and click “refine”. Think broadly when using the categories and narrow the search slowly, continuously checking the results list. Philosophy can, for example, also be included in the mathematics, social studies, or management categories.

When you are satisfied with the list, tick the boxes beside the references and click the plus to add the articles to your marked list. You find the marked list at the top of the search page. Click on the number in parentheses to enter your marked list.

Marked list

Step 1: In the marked list check “All records in this list” and “Select All”

Step 2: Click on “Selected destination” and save as tab-delimited (Win or Mac, depending on your computer).

Save.

How to export from WOS to Excel

Save the file on your computer.

Open Excel and choose the “Data” tab from the navigation menu. Click on “From Text”. Choose the text file from the pop-up menu and click “Import”. The Text Import Wizard opens. Follow the wizard to import the text into the cells of the Excel sheet.
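As an alternative to the manual Excel import, the tab-delimited export can also be read with a short script. The following is a minimal sketch, assuming the export has a header row of field tags; the tags used here (`AU`, `TI`, `PY`, `TC`, `UT`) and the two sample records are invented for illustration and should be checked against the actual file.

```python
import csv
import io


def read_wos_export(handle):
    """Parse a tab-delimited export into a list of dicts keyed by the header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader]


# Invented two-record sample; a real export contains many more columns.
sample = io.StringIO(
    "AU\tTI\tPY\tTC\tUT\n"
    "Doe, J\tFirst example title\t2005\t12\tWOS:000000000000001\n"
    "Doe, J; Roe, A\tSecond example title\t2010\t3\tWOS:000000000000002\n"
)

records = read_wos_export(sample)
citations = [int(r["TC"]) for r in records]  # TC assumed to hold the citation count
print(len(records), sum(citations))
```

For a saved file, the same function can be given `open(path, newline="", encoding=...)`, where the encoding depends on the export option chosen (Win or Mac).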

Before the next search

Before you do a new search in WOS, remember to clear your marked list.

After you have typed in the next author name, check the year limits are correct.

Indicators of Output: Published or unpublished countable works

| ID nr. | Indicator | Description | WOS | GS | Astro. | Enviro. Sci. | Phil. | Health | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | P | Count of production used in formal communication | | | | | | | From author’s CV |
| 2 | Pisi | Used in the calculation of impact compared to world subfield citation average based on ISI citation data. | | | | | | | Also in GS. |
| 3 | Pts | Number of publications in selected sources defined as important by the researcher’s affiliated institution. | | | | | | | Exemplify with BFI for Denmark, possibly other countries’ authorized lists |
| 4 | Co-publications | Collaboration on departmental, institutional, inter- or national level & identify networks. | | | | | | | More relevant in some fields than others |
| 5 | Fractional counting on papers | Shared authorship of papers gives less weight to collaborative works than non-collaborative ones. | | | | | | | Fractional counting is not beneficial from the individual’s viewpoint. No one would want to reduce their score. |
| 6 | Proportional or arithmetic counting | Shared authorship of papers, weighting contribution of first author highest and last lowest. | | | | | | | Ditto |
| 7 | Geometric counting | Assumes that the rank of authors in the byline accurately reflects their contribution | | | | | | | Ditto |
| 8 | Harmonic counting | The 1st author gets twice as much credit as the 2nd, who gets 1.5 times as much credit as the 3rd, who gets 1.33 times as much as the 4th, etc. | | | | | | | Ditto |
| 9 | Noblesse oblige | Indicates the importance of the last author for the project behind the paper. | | | | | | | Ditto |
| 10 | FA First author counting | Credit given to first author only | | | | | | | Ditto |
| 11 | Weighted publication count | A reliable distinction between different document types. | | | | | | | Which weights should be applied? There are no standards. A table summary of type of work would be interesting. If the author does it themselves, a high level of detail is achievable; if we do it in GS/WOS it would be limited. |

Remember: The researcher has to be able to calculate these indicators themselves.
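To make the difference between the author-credit schemes in the output table concrete, the sketch below computes the share of one paper's credit that each byline position receives. The formulas are the ones commonly cited in the literature for fractional, proportional (arithmetic), geometric, harmonic and first-author counting; they are stated here as assumptions, since the table itself only describes the schemes in words. Function names are illustrative.

```python
from fractions import Fraction


def fractional(n):
    """Every one of n authors receives an equal share 1/n."""
    return [Fraction(1, n)] * n


def proportional(n):
    """Author at rank i receives (n + 1 - i) / (1 + 2 + ... + n)."""
    total = n * (n + 1) // 2
    return [Fraction(n + 1 - i, total) for i in range(1, n + 1)]


def geometric(n):
    """Author at rank i receives 2^(n - i) / (2^n - 1)."""
    total = 2 ** n - 1
    return [Fraction(2 ** (n - i), total) for i in range(1, n + 1)]


def harmonic(n):
    """Author at rank i receives (1/i) / (1 + 1/2 + ... + 1/n)."""
    total = sum(Fraction(1, j) for j in range(1, n + 1))
    return [Fraction(1, i) / total for i in range(1, n + 1)]


def first_author(n):
    """All credit goes to the first author."""
    return [Fraction(1)] + [Fraction(0)] * (n - 1)


for scheme in (fractional, proportional, geometric, harmonic, first_author):
    shares = scheme(4)
    print(scheme.__name__, [str(s) for s in shares])
```

For a four-author paper the harmonic shares are 12/25, 6/25, 4/25 and 3/25, which reproduces the ratios 2, 1.5 and 1.33 given in the table. In every scheme the shares sum to one, so a researcher's score can only stay the same or fall relative to whole counting.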

Indicators of Outcome: Use in scientific community, measured in citations

| ID nr. | Indicator | Description | WOS | GS | Astro. | Environ. Sci. | Phil. | Health | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | C + sc | Indication of all usage for whole period of analysis | | | | | | | |
| 2 | C | Recognised benchmark for analyses. Indication of usage by stakeholders for whole period of analysis | | | | | | | Do self-citations include cites from co-writers? This could be messy |
| 3 | Scimago Total Cites (STC) | Indication of usage by stakeholders for whole period of analysis | | | | | | | Citing info only available from after 1996. Access to Scopus can be limited because of the cost |
| 4 | C-sc | Measure of usage for whole period of analysis | | | | | | | |
| 5 | % SELFCIT | Share of citations to own publications | | | | | | | |
| 6 | CPP | Trend of how cites evolve over time | | | | | | | Very rough measure |
| 7 | Ptop | Identify if publications are among the top 20, 10, 5, 1% most frequently cited papers in subject/subfield/world in a given publication year. | | | | | | | Percentiles not affected by skewed distribution. Requires reference standard |
| 8 | Field top % citation reference value | World share of publications above citation threshold for n% most cited for same age, type and field | | | | | | | Ditto |
| 9 | E(Ptop) | Reference value: expected number of highly cited papers based on the number of papers published by the research unit. | | | | | | | More interesting on department level |
| 10 | A/E(Ptop) | Relative contribution to the top 20, 10, 5, 2 or 1% most frequently cited publications in the world relative to year, field and document type. | | | | | | | Ditto |
| 11 | Age of citations | If a large citation count is due to articles written a long time ago and no longer cited OR articles that continue to be cited. | | | | | | | |
| 12 | Number of significant papers | Gives idea of broad and sustained impact | | | | | | | Logical measure if individuals define own reference standard and compare to that |
| 13 | Age and productivity | Effects of academic age on productivity and impact. | | | | | | | Other effects could be more interesting such as effect of grant on productivity. |
| 14 | %Pnc | Share of publications never cited after certain time period, excluding self-citations | | | | | | | Useful in reflection and justifying why something is not cited, e.g. according to type: encyclopaedia, preface or schism between language & subject |

Possibly library holdings?
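Several of the outcome indicators in the table are simple functions of two numbers per publication: citations received and self-citations. The sketch below shows one way a researcher could compute them from a publication list. It rests on assumptions: CPP is taken as citations per publication excluding self-citations, a paper counts as "never cited" when it has no citations other than self-citations, and the input format is invented for illustration.

```python
def outcome_indicators(papers):
    """papers: list of (citations_including_self, self_citations), one pair per publication."""
    p = len(papers)
    c_plus_sc = sum(c for c, _ in papers)   # C + sc: all citations
    sc = sum(s for _, s in papers)          # self-citations only
    c_minus_sc = c_plus_sc - sc             # C - sc: citations without self-citations
    uncited = sum(1 for c, s in papers if c - s == 0)
    return {
        "C+sc": c_plus_sc,
        "C-sc": c_minus_sc,
        "%SELFCIT": 100 * sc / c_plus_sc if c_plus_sc else 0.0,
        "CPP": c_minus_sc / p if p else 0.0,
        "%Pnc": 100 * uncited / p if p else 0.0,
    }


# Invented example: four publications.
example = [(10, 2), (4, 0), (1, 1), (0, 0)]
print(outcome_indicators(example))
```

The open question raised in the table, whether citations from co-authors count as self-citations, changes only how the second number of each pair is obtained, not the calculation itself.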

Indicators of Research Infrastructure: Collaboration and to which extent these are citing the work

| ID nr. | Indicator | Description | WOS | GS | Astro. | Environ. Sci. | Phil. | Health | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Number of co-authors | Indicates cooperation and growth of cooperation at inter- and national level | | | | | | | General interest to see if author works in groups, alone, repeated collaborations |
| 2 | Co-citations | Thematic networks and influence and impact of researcher. | | | | | | | Not interesting for CV |
| 3 | Fractional counting on citations | Designed to remove the dependence of co-authorship (Egghe, 2008) | | | | | | | Not interesting for individual to reduce citation count |
| 4 | hi-index | Indicates number of papers with at least h citations scientist would have written if worked alone. | | | | | | | Useful in subjects with extreme co-authorship such as Astronomy. Not too much work for author as limited to h core |
| 5 | POP variation individual H-index | Accounts for co-authorship effects | | | | | | | Above is easier even though granularity is lost. |
| 6 | n-index | Enables comparison of researchers working in different fields | | | | | | | Based on a journal’s h (how will researcher get that? Comparison between fields is interesting to evaluator, not author) |
| 7 | Alternative H index | Indicates the number of papers a researcher would have written along his/her career if worked alone. | | | | | | | Same as hi-index |
| 8 | Pure h-index (Hp) | Corrects individual h-scores for number of co-authors | | | | | | | Based on fractional counting and place in author by-line. A lot of work. |
| 9 | Cognitive orientation | Identify how frequently a scientist publishes or is cited in various fields; indicates visibility/usage in the main subfields and peripheral subfields. | | | | | | | Interesting to see where work is published and cited (used). Graphically good addition to CV, easy to read. |
| 10 | Visual representation techniques | Based on bibliographic data graphical representations are generated of publishing, collaboration, citations, growth and activity in research field. | | | | | | | Sure. But which graphics/tools should be used in which fields? |
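The two co-authorship corrections of the h-index listed in the table (rows 4 and 5) can be sketched as follows. The definitions used are assumptions based on how these indices are commonly described: the hi-index divides h by the mean number of authors of the papers in the h-core, while the Publish or Perish variant first divides each paper's citations by its number of authors and then takes the h-index of those normalised counts. Function names and the sample data are illustrative.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


def hi_index(papers):
    """papers: list of (citations, n_authors). h divided by mean authors in the h-core."""
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    h = h_index([c for c, _ in ranked])
    if h == 0:
        return 0.0
    mean_authors = sum(n for _, n in ranked[:h]) / h
    return h / mean_authors


def hi_norm(papers):
    """h-index of per-author citation counts (citations / n_authors)."""
    return h_index([c / n for c, n in papers])


# Invented example: (citations, number of authors) for five papers.
example = [(10, 2), (8, 4), (5, 1), (4, 1), (3, 3)]
print(h_index([c for c, _ in example]), hi_index(example), hi_norm(example))
```

The first variant only needs author counts for the h-core papers, which is why the table notes that it is not too much work for the author. When several papers tie at the edge of the h-core, the choice of which one to include can change the mean author count slightly.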

Indicators of Impact: Visibility in the field. (Highlighted indicators were excluded in the review, as they measure the impact of the journal and not of the author.)

Even though these are indicators of journal performance, we have to establish a field norm. A field norm is used for comparison in the other categories (e.g. sustainability, quality) and as a general yardstick of what is expected. If researchers can document that they are performing better than a field standard, they will want to do so. Thus, the portfolio has to either present field norms that are up to date or present methods for researchers to define their own standard.

| ID nr. | Indicator | Description | WOS | GS | Astro. | Enviro. Sci. | Phil. | Health | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ISI JIF (SIF) Synchronous IF | Average number of citations a publication in a specific journal has received limited to ISI document types and subject fields. | | | | | | | Limited usefulness, but calculable by the individual. Measure of journal popularity and not designed for individual performance |
| 2 | Diachronous IF | Reflects the actual impact, and the development of impact over time, of a set of papers. | | | | | | | Possible in WOS, time consuming with GS. Better represents impact of researcher than ISI JIF. |
| 3 | Y Factor | Scientific impact defined as a combination of popularity and prestige | | | | | | | Based on JIF. Measure of journal impact. |
| 4 | Scimago Journal Rank (SJR) | Average per article PageRank based on Scopus citation data | | | | | | | Not GS or WOS |
| 5 | EigenFactor | Journal’s total importance to the scientific community | | | | | | | Not GS or WOS |
| 6 | Article influence score (AI) | Measure of average per-article citation influence of the journal | | | | | | | Not GS or WOS |
| 7 | Normalised journal impact | Mean impact value of all the normalized citation counts for publications in a specific journal | | | | | | | Measure of journal impact |
| 8 | Journal to field impact score (JFIS) | Journal to fields citation score that indicates relative impact of a journal | | | | | | | Measure of journal impact |
| 9 | Discipline Impact Factor (DIF) (Hirst, 1978) | Number of times a journal is cited by the core literature of a single subfield rather than a complete set of ISI journals. | | | | | | | Index loses detail as dependent on ISI Journal Citation Reports, i.e. it is affected by JCR field coverage and minimum cites inclusion criterion. |
| 10 | Median impact factor (IF med) | The aggregate Impact Factor for a subject category. Median value of all journal Impact Factors in the subject category. | | | | | | | Author can specify journals from websites (if they report IF). Aggregate impact factor for a subject category. Complements JIF |
| 11 | Normalised journal position (NJP) | Compare reputation of journals across fields | | | | | | | Based on JCR, used in across field comparisons. Not relevant for individual |
| 12 | Field citation score (FCS) | Represents the number of citations expected for a paper of the same type, published in all journals within a specific field in the same year, and document type. | | | | | | | ISI CI field categories are inadequate for some disciplines, providing a distorted picture |
| 13 | Field Citation Score Mean (FCSm) | Weighted average for comparison of impact in different subfields | | | | | | | Indicator on a higher level of aggregation than individual |
| 14 | JSCS or JRV Journal citation score (journal reference value) | World average of citations to publications according to type and age. Journal-based worldwide average impact as an international reference level for the university/institute/department/group/researcher etc. | | | | | | | How can the individual do this? |
| 15 | Normalised Journal Citation Score (JSCm) | Reference value accounting for type of paper and years in which papers were published. Mean citation rate of all articles published in the journals in which the individual has published. | | | | | | | More accurate for activity in subfields than FCSm, especially for developing and interdisciplinary fields. |
| 16 | JCSM/FCSm | Journal based worldwide average impact mean for an individual researcher compared to average citation score of the subfields | | | | | | | Favours senior researchers, as a minimum publication value of 50 is recommended for informative analysis. Dependent on calculation of JCS and FCS |
| 17 | Crown Indicator CPP/FCSm | Individual performance compared to world citation average to publications of same document types, ages, and subfields. | | | | | | | Limited to same document type as world citation average is based on. Dependent on calculation of FCSm. |
| 18 | Ptj | Performance of articles in journals important to (sub)field or institution. | | | | | | | |
| 19 | CPP/JCSm | Indicates if the individual’s performance is above or below the average citation rate of the journal set. | | | | | | | We can’t expect the individual to calculate the score of the journal set. These would have to be available standards, hence relation to individual is limited. Also limited in philosophy and public health (national interest) |
| 20 | JCSm/FCSm | Relative impact level of the journals compared to their subfields. | | | | | | | Measure of journal impact |
| 21 | C/FCSm | Applied impact score of each article/set of articles to the mean field average in which the researcher has published | | | | | | | Dependent on calculation of FCSm |

Indicators of quality: Level and performance of research

| ID nr. | Indicator | Description | WOS | GS | Astro. | Enviro. Sci. | Phil. | Health | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | h-index | Cumulative achievement | | | | | | | Recommend reference standard w.r.t. specialty. Guidelines on how to establish it at local (peers in dept), national and, if necessary, expert level (leaders in field). |
| 2 | g-index | The distinction between and order of scientists (Egghe, 2006; Harzing, 2008) | | | | | | | |
| 3 | Hg-index | Greater granularity in comparison between researchers with similar h- and g-indices. | | | | | | | |
| 4 | Normalized h-index | Normalizes h to compare scientists’ achievement across fields | | | | | | | Not relevant for this study |
| 5 | H(2) index | Weights most productive papers but requires a much higher level of citation attraction to be included in index. | | | | | | | Weights most productive papers but requires higher citation level. |
| 6 | A-index | Describes magnitude of each researcher’s hits, where a large A-index implies that some papers have received a large number of citations compared to the rest | | | | | | | Average number of citations in h core, to imply that some papers are more highly cited than others. Has information redundancy with h. |
| 7 | R-index | Citation intensity and improves sensitivity and differentiability of A index | | | | | | | Square root of the product of the h- and A-index. Pretty much the same as g, but easier to calculate |
| 8 | ħ-index | Comprehensive measure of the overall structure of citations to papers | | | | | | | Includes citations to all papers (square root of half of the total number of citations to all publications) |
| 9 | m-index | Impact of papers in the h-core (median nr of citations to papers in h core) | | | | | | | Too demanding to be used as reference standard, as detailed citation data required. M quotient better. |
| 10 | M-quotient | Adjusts for length of career | | | | | | | Simple |
| 11 | e-index | Complements the h-index for the ignored excess citations | | | | | | | Can only be used with h, as e accounts for the “more than h” citations, thus providing complete citation information |
| 12 | Hmx-index | Ranking of the academics using all citation databases together. | | | | | | | Maximum h across WOS, GS and Scopus (can compare with WOS, GS and database of choice, e.g. ADS in astronomy?) |
| 13 | w-index | The integrated impact of a researcher’s excellent papers. | | | | | | | Not as recognisable as h and, just like h, the cut-off point is arbitrary. |
| 14 | Index of Quality and Productivity | Quality reference value; judges the global number of citations a scholar’s work would receive if it were of average quality in its field. | | | | | | | Could be interesting but requires reference standards to field and academic seniority. I’ll look at it again to see if it is a researcher tool or a system/evaluator tool. |
| 15 | Q2 | Relates two different dimensions in a researcher’s productive core: the number and impact of papers | | | | | | | Dependent on calculation of m index and h index. |
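Most of the quality indicators in the table can be derived from a single list of citation counts. The sketch below follows the descriptions given in the table where they exist (A as the mean and m as the median of citations in the h-core, ħ as the square root of half of all citations, R as the square root of the product of h and A). The remaining formulas (g, h(2), e, hg, q2 and the m-quotient) follow their commonly cited definitions and should be read as assumptions. Here g is capped at the number of papers.

```python
import math
from statistics import median


def h_core_indices(citations):
    """Compute several h-type indices from a list of per-paper citation counts."""
    ranked = sorted(citations, reverse=True)
    h = sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)
    core = ranked[:h]
    core_sum = sum(core)

    # g: largest rank g whose top g papers together have at least g^2 citations.
    g, running = 0, 0
    for rank, c in enumerate(ranked, start=1):
        running += c
        if running >= rank ** 2:
            g = rank

    # h(2): largest k such that k papers have at least k^2 citations each.
    h2 = sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank ** 2)

    a = core_sum / h if h else 0.0
    m = median(core) if core else 0.0
    return {
        "h": h,
        "g": g,
        "h2": h2,
        "A": a,                                # mean citations in the h-core
        "R": math.sqrt(core_sum),              # equals sqrt(h * A)
        "e": math.sqrt(core_sum - h ** 2),     # excess citations in the h-core
        "hbar": math.sqrt(sum(ranked) / 2),    # sqrt of half of all citations
        "m": m,                                # median citations in the h-core
        "hg": math.sqrt(h * g),
        "q2": math.sqrt(h * m),
    }


def m_quotient(h, academic_age_years):
    """h divided by the number of years since the first publication."""
    return h / academic_age_years


# Invented example: five papers.
result = h_core_indices([10, 8, 5, 4, 3])
print(result["h"], result["g"], result["A"], result["m"])
```

For the invented list above, h is 4, g is 5, the A-index is 6.75 and the m-index is 6.5. Everything except g and ħ needs only the h-core papers, which keeps the data requirement for the individual researcher small.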

Indicators of Innovation & Social Benefits: Contribution to society’s social, economic and cultural capital

| ID nr. | Indicator | Description | WOS | GS | Astro. | Enviro. Sci. | Phil. | Health | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Knowledge exchange | Knowledge production, knowledge exchange, knowledge use and earning capacity | | | | | | | Information from CV, as this is a weighted count of keynote speeches, activity in agencies & organisations, public forums, committees, conferences & co-operation with companies. How to weight? |
| 2 | Dissemination in public sphere | Impact and use in public sphere (knowledge transfer) | | | | | | | Often not reported on CV. Count of contributions to, inc.: tv & radio programs, newspapers, non-peer reviewed journals, text books, public & professional websites and news forums. |
| 3 | Patent applications | Innovation | | | | | | | Count of patent applications. Quality or significance of patents is not on an equal level; citations in patents are more interesting. How can the researcher get these, and what are the reasons to cite: influence, legal or political? |
| 4 | Tool to measure societal relevance | Aims at evaluating the level of the effect of the publication, or at the level of its original aim | | | | | | | Questionnaire used as the (self-assessment) application form and the assessment form for the reviewer (Niederkrotenthaler, Dorner, & Maier, 2011) |

Indicators of Sustainability: Use or decline in use

| ID nr. | Indicator | Description | WOS | GS | Astro. | Enviro. Sci. | Phil. | Health | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Citation age c(t) | The age of citations referring to a researcher’s work. | | | | | | | |
| 3 | AR-index | AR is the square root of the sum of the average number of citations per year of articles included in the h-core. Accounts for citation intensity and age of publications in h core | | | | | | | Do not consider AR convincing as a ranking metric in research evaluation, as the decay of a publication is very steep and insensitive to disciplinary differences |
| 4 | Price index, PI (Price, 1970) | Percentage of references to documents not older than 5 years at the time of publication of the citing sources | | | | | | | Interesting bibliometrically, but not interesting for researcher |
| 5 | Immediacy index | Speed at which an average article in a journal is cited in the year it is published | | | | | | | |
| 6 | Aggregate Immediacy Index (AII) | How quickly articles in a subject are cited | | | | | | | If we can define a subject area and journals this could be a useful metric |
| 7 | Cited half-life (CHL) & Aggregate Cited Half-Life (ACHL) | A benchmark of the age of cited articles in a single journal | | | | | | | |
| 8 | Classification of durability | Durability of scientific literature on distribution of citations over time among different fields | | | | | | | Only tested in WOS using journal subject categories |
| 9 | Age-weighted citation rate (AWCR, AW & per-author AWCR) * | AWCR measures the number of citations to an entire body of work, adjusted for the age of each individual paper | | | | | | | Field norm has to be decided to account for field characteristics such as expected age of citations, “sleeping beauties”, and delayed recognition. |

*The AW-index is defined as the square root of the AWCR. It approximates the h-index if the mean citation rate remains constant over the years. The per-author age-weighted citation rate is similar to the plain AWCR, but is normalized to the number of authors for each paper.
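The age-weighted indicators defined above lend themselves to a short worked example. The sketch below follows the definitions in the text: AWCR sums citations divided by paper age, AW is its square root, the per-author variant additionally divides by the number of authors, and AR restricts the sum to the h-core before taking the square root. One assumption is added: a paper's age is counted as current year minus publication year plus one, so that a paper published this year has age 1. Names and data are illustrative.

```python
import math


def paper_age(pub_year, current_year):
    """Age in years, counting the publication year as year 1 (an assumption)."""
    return current_year - pub_year + 1


def awcr(papers, current_year):
    """papers: list of (citations, pub_year, n_authors). Sum of citations per year of age."""
    return sum(c / paper_age(y, current_year) for c, y, _ in papers)


def aw_index(papers, current_year):
    """Square root of the AWCR."""
    return math.sqrt(awcr(papers, current_year))


def per_author_awcr(papers, current_year):
    """AWCR with each paper's contribution divided by its number of authors."""
    return sum(c / paper_age(y, current_year) / n for c, y, n in papers)


def ar_index(papers, current_year):
    """Square root of the summed citations-per-year of the papers in the h-core."""
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    h = sum(1 for rank, p in enumerate(ranked, start=1) if p[0] >= rank)
    return math.sqrt(sum(c / paper_age(y, current_year) for c, y, _ in ranked[:h]))


# Invented example, evaluated in 2013: (citations, publication year, authors).
example = [(10, 2009, 2), (8, 2012, 1), (4, 2013, 4)]
print(awcr(example, 2013), per_author_awcr(example, 2013), ar_index(example, 2013))
```

In this example the oldest paper has the most citations but contributes the least per year, which illustrates why a field norm for the expected age of citations matters when interpreting these values.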

Appendix 6.

Identification of the data needed to calculate the indicators and reference standards

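Elements needed to calculate metric

| Output | Author name | Author byline | Full CV | Affiliation country | Publication list | Article id | Authority list | Ref. standard(s) | Weighting standard | Citation database | WOS only |
|---|---|---|---|---|---|---|---|---|---|---|---|
| P | | | | | | | | | | | |
| Pisi | | | | | | | | | | | |
| Pts | | | | | | | | | | | |
| Co-publications | | | | | | | | | | | |
| Weighted publication count | | | | | | | | | | | |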

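Elements needed to calculate metric

| Outcome | Author name | Author byline | Full CV | Affiliation country | Publication list | Article id | Authority list | Ref. standard(s) | Weighting standard | Citation database | WOS only |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CPP | | | | | | | | | | | |
| Ptop | | | | | | | | | | | |
| Age of citations | | | | | | | | | | | |
| Number of significant papers | | | | | | | | | | | |
| %Pnc | | | | | | | | | | | |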

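Elements needed to calculate metric

| Research Infrastructure | Author name | Author byline | Full CV | Affiliation country | Publication list | Article id | Authority list | Ref. standard(s) | Weighting standard | Citation database | WOS only |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Numbers of co-authors | | | | | | | | | | | |
| Hi-index | | | | | | | | | | | |
| Cognitive orientation | | | | | | | | | | | |
| Visual representation techniques | | | | | | | | | | | |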

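Elements needed to calculate metric

| Impact | Author name | Author byline | Full CV | Affiliation country | Publication list | Article id | Authority list | Ref. standard(s) | Weighting standard | Citation database | WOS only |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Diachronous IF | | | | | | | | | | | |
| Ptj | | | | | | | | | | | |
| CPP/JCSm | | | | | | | | | | | |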

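Elements needed to calculate metric

| Quality | Author name | Author byline | Full CV | Affiliation country | Publication list | Article id | Authority list | Ref. standard(s) | Weighting standard | Citation database | WOS only |
|---|---|---|---|---|---|---|---|---|---|---|---|
| h-index | | | | | | | | | | | |
| g-index | | | | | | | | | | | |
| H(2) index | | | | | | | | | | | |
| A-index | | | | | | | | | | | |
| R-index | | | | | | | | | | | |
| ħ-index | | | | | | | | | | | |
| m-index | | | | | | | | | | | |
| M-quotient | | | | | | | | | | | |
| e-index | | | | | | | | | | | |
| Hmx-index | | | | | | | | | | | |
| Q2 | | | | | | | | | | | |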

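| Innovation & social benefits | Author name | Author byline | Full CV | Affiliation country | Publication list | Article id | Authority list | Ref. standard(s) | Weighting standard | Citation database | WOS only |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Knowledge exchange | | | | | | | | | | | |
| Dissemination in public sphere | | | | | | | | | | | |
| Patent applications | | | | | | | | | | Possibly patent citation database | |
| Tool to measure societal relevance | | | | | | | | | | | |
| Library holdings (academic/community library) | | | | | | | | | | (WorldCat) | |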

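Elements needed to calculate metric

| Sustainability | Author name | Author byline | Full CV | Affiliation country | Publication list | Article id | Authority list | Ref. standard(s) | Weighting standard | Citation database | WOS only |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Citation age c(t) | | | | | | | | | | | |
| AR-index | | | | | | | | | | | |
| Classification of durability | | | | | | | | | | | |
| Age-weighted citation rate (AWCR, AW & per-author AWCR) | | | | | | | | | | | |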

Reference standards individual can calculate

| Reference standard | Author name | Author byline | Affiliation country | Publication list | Journal list w.r.t. subject(s) | Article id | Authority list | Weighting standard | Citation database | WOS only |
|---|---|---|---|---|---|---|---|---|---|---|
| ISI JIF synchronous IF | | | | | | | | | | |
| Y factor | | | | | | | | | | |
| Field citation score (FCS)/(FCSm) | | | | | | | | | | |
| JSCS or JRV Journal citation score (journal reference value) | | | | | | | | | | |
| Normalised Journal Citation Score (JSCm) | | | | | | | | | | |
| C/FCSm | | | | | | | | | | |
| Production of colleagues of same academic seniority at dept./institution | | | | | | | | | | |
| Production of same academic seniority within field, national level/international | | | | | | | | | | |
| Production of expert reference group | | | | | | | | | | |
| Citations to colleagues of same academic seniority at dept./institution | | | | | | | | | | |
| Citations/median citations to same academic seniority within field, national level/international | | | | | | | | | | |
| Citations/median citations to expert reference group | | | | | | | | | | |
| H index at local/national/expert level | | | | | | | | | | |
| M quotient at local/national/expert level | | | | | | | | | | |
| Index of Quality and Productivity | | | | | | | | | | |
| Aggregate Immediacy Index (AII) | | | | | | | | | | |

Appendix 7

Overview of the dependence of indicators on other indicators, reference standards and weighting systems.

37 indicators of individual performance. An overview

| Metric | Independent | Dependent on calculation of another index | Dependent on calculation of reference standard | Comments |
|---|---|---|---|---|
| P | | | | |
| Pisi | | | | |
| Pts | | | | |
| Co-publications | | | | |
| Weighted publication count | | | | |
| CPP | | | | |
| Ptop | | | | |
| Age of citations | | | | |
| Number of significant papers | | | | |
| %Pnc | | | | |
| Numbers of co-authors | | | | |
| Hi-index | | (h) | | Supplement to h |
| Cognitive orientation | | | | |
| Visual representation techniques | | | | |
| Diachronous IF | | | | |
| Ptj | | | | |
| CPP/JCSm | | | | |
| h-index | | | | |
| g-index | | | | |
| H(2) index | | (h) | | |
| A-index | | (h) | | Supplement to h |
| R-index | | (h) (a) | | Supplement to h |
| ħ-index | | | | |
| m-index | | (h) | | Supplement to h |
| M-quotient | | (h) | | |
| e-index | | (h) | | Supplement to h |
| Hmx-index | | (h) | | |
| Q2 | | (h) | | |
| Knowledge exchange | | | | |
| Dissemination in public sphere | | | | |
| Patent applications | | | | |
| Tool to measure societal relevance | | | | |
| Library holdings | | | | |
| Citation age c(t) | | | | |
| AR-index | | (h) | | Supplements h |
| Classification of durability | | | | |
| Age-weighted citation rate (AWCR, AW & per-author AWCR) | | | | |

16 Reference standards, suggested methods that can be calculated by the individual.

| Metric | Independent | Dependent on calculation of another metric | Dependent on weighting | Comments |
|---|---|---|---|---|
| ISI JIF synchronous IF | | | | |
| Y factor | | (ISI JIF) | | |
| Field citation score (FCS)/(FCSm) | | | | If both FCS and JSCS are calculated, then JSCSm/FCSm (impact mean for an individual researcher compared to average citation score of the subfields) |
| JSCS or JRV Journal citation score (journal reference value) | | | | Simpler than FCS, but a rougher measure |
| Normalised Journal Citation Score (JSCm) | | | | |
| C/FCSm | | (FCSm) | | |
| Production of colleagues of same academic seniority at dept./institution | | | | |
| Production of same academic seniority within field, national level/international | | | | |
| Production of expert reference group | | | | |
| Citations to colleagues of same academic seniority at dept./institution | | | | |
| Citations/median citations to same academic seniority within field, national level/international | | | | |
| Citations/median citations to expert reference group | | | | |
| H index at local/national/expert level | | | | |
| M quotient at local/national/expert level | | (h) | | |
| Index of Quality and Productivity | | | | Estimated rate w.r.t. citation count, productivity, academic age, field citation habits |
| Aggregate Immediacy Index (AII) | | | | |

Part 2

Part 2. Data-collection

Work Package 5: New Bibliometric indicators August 6th, 2013 Project partners: Department of Information Studies, Royal School of Library and Information Science; Department of Library and Information Science, Humboldt University Berlin

Abstract

This report summarizes observations from the collection of publication data of the 793 scholars identified in the WP5 sampling strategy dated 28th of June 2013: “Progress Report (draft to final report): Preparing for the analysis. Sampling strategy and methodological considerations in developing bibliometric indicators of the performance and impact of individuals for use in the ACUMEN portfolio”. The scholars’ publication lists were collected. Individual scholars’ lists of publications were then sourced in Web of Science and in Google Scholar, using Publish or Perish. The information on 750 scholars was successfully collected and an overview of this sample of scholars is presented in this report. This final WP5 sample is available for all consortium members to use and can be found in the ACUMEN dropbox. To evaluate the scholars’ performance in WOS bibliometrically, UT codes were collected and sent to CWTS, where simple and sophisticated bibliometric indicators are currently being calculated (a UT code is a unique article identifier used by Thomson Reuters that appears in databases in their Web of Knowledge service). The scholars’ performance in GS will be evaluated using Publish or Perish’s standard bibliometric indicators. Each scholar’s POP statistics were collected. Observations from the data collection that could have importance for the design of the ACUMEN portfolio are presented in this report.

Data-collection

793 working links to online publication lists across 4 disciplines and 5 seniorities were identified in the sampling strategy (see footnote 4). The publication lists of these 793 scholars were collected from the scholars’ homepages, and publication data was searched for in Web of Science and in Google Scholar, via Harzing’s Publish or Perish. Forty-three scholars were excluded for the following reasons: the scholar’s specialty fell outside the four disciplines investigated in preparation for the ACUMEN portfolio (15), no publication list was available (13), dead links (12), duplicates (1), the scholar was impossible to identify (1) and the scholar’s academic seniority is not considered in our study (1). This resulted in a dataset of 750 scholars: 193 in Astronomy, 195 in Environmental Studies, 229 in Philosophy and 133 in Public Health (Fig. 1). Data collection commenced on the 13th of June 2013 and was completed by the 10th of July 2013.

4 WP5 (June 2013) Progress Report (draft to final report): Preparing for the analysis. Sampling strategy and methodological considerations in developing bibliometric indicators of the performance and impact of individuals for use in the ACUMEN portfolio.


Fig. 1. Flowchart of data-collection

Data collection start date: 13th June 2013.

793 working links to online publication lists identified in sampling strategy across 4 disciplines and 5 seniorities:

Astronomy n203: PhD n15, Post Doc n49, Assis Prof n27, Assoc Prof n72, Prof n40

Environment n203: PhD n3, Post Doc n18, Assis Prof n42, Assoc Prof n85, Prof n55

Philosophy n250: PhD n9, Post Doc n23, Assis Prof n49, Assoc Prof n82, Prof n87

Public Health n137: PhD n9, Post Doc n14, Assis Prof n31, Assoc Prof n53, Prof n30

Publication lists and publication data of 793 scholars collected from Web of Science and Google Scholar, via Publish or Perish.

Excluded 43: Dead links n12, Not discipline n15, Duplicates n1, Not publication list n13, Not seniority n1, Impossible to find in POP n1

Publication data of 750 researchers retrieved:

Astronomy: PhD n15, Post Doc n48, Assis Prof n26, Assoc Prof n67, Prof n37

Environment: PhD n3, Post Doc n17, Assis Prof n39, Assoc Prof n85, Prof n51

Philosophy: PhD n9, Post Doc n22, Assis Prof n45, Assoc Prof n75, Prof n78

Public Health: PhD n9, Post Doc n14, Assis Prof n30, Assoc Prof n50, Prof n29

Data collection completed: July 10th 2013


Gender distribution in the Sample

In the sample of 750 researchers 585 are men and 165 are women (Table 1). Women make up 22% of the overall sample, a reduction of 1% from the potential sample identified in the sampling strategy, but still reflecting the European ratio of men to women in science, 3:1 (see footnote 5). Overall the data show that in the junior categories (PhD student, post doc and assistant professor) the ratio of men to women is 2:1, while in the senior categories (associate professor and professor) the ratio is 4:1. This trend reflects the SHE Figures 2012 on gender in research, confirming that our sample follows the pattern of the share of women employed in academia across Europe. Gender imbalance increases with age, and women represent only 20% of Grade A academic staff, who are associate professors and professors (see footnote 6).

It is important, however, to understand whether the exclusion of the 43 scholars has consequences for the ratio of men to women within disciplines and academic seniorities. The ratio of men to women in the astronomy, environment and public health disciplines remains unchanged. The largest share of the exclusions, 21 of 43, was in philosophy. This was partly due to a large number of dead links and partly due to scholars identified as not belonging to the discipline. The title “Doctor of Philosophy” does not necessarily mean that a scholar works as a philosopher or is affiliated with the history of science. In the context of academic degrees, the term “philosophy” does not refer solely to the field of philosophy, but is used in a broader sense in accordance with its original Greek meaning (love of wisdom), and the title is thus awarded to scholars in other specialties. This first became clear during data collection, as the publication lists and publishing patterns of these scholars did not match those of the other scholars in the discipline. The inclusion of these “false-positive” scholars in the dataset is a result of the automatic data-harvesting by the software used by WP2 to collect the original shared dataset from Web of Science. Manual filtering, that is reading the CVs and publication lists and consulting institutional webpages, was the only way to decide if the scholar’s specialty belonged to Philosophy or the History & Philosophy of Science.

5 Directorate-General for Research and Innovation, Unit B6 (2012) SHE Figures 2012: Gender in Research and Innovation. European Commission: Brussels. Retrieved from: http://ec.europa.eu/research/science-society/document_library/pdf_06/she-figures-2012_en.pdf

6 SHE Figures 2012.

Table 1. Distribution of seniorities and gender across the disciplines in the sample

                 PhD     Post Doc   Assis Prof   Assoc Prof   Prof     Total
Astronomy        15      48         26           67           37       193
Gender M/F       12:3    37:11      20:6         58:9         35:2     162:31
Environment      3       17         39           85           51       195
Gender M/F       3:0     11:6       30:9         72:13        44:7     160:35
Philosophy       9       22         45           75           78       229
Gender M/F       6:3     20:2       37:8         57:18        63:15    183:46
Public Health    9       14         31           50           29       133
Gender M/F       2:7     7:7        18:13        34:16        19:10    80:53
Total            36      101        140          277          195      750
Discipline M/F   23:13   75:26      105:36       221:56       161:34   585:165

The reduction has, however, had an overall positive effect on the demographics of the philosophy category, as the ratio of men to women has decreased in most seniorities. Comparing the potential sample with the collected data, the ratio of men to women in the PhD category remains the same at 2:1, the post doc category has increased from 6:1 to 10:1, the assistant professor category has decreased from 5:1 to 4:1, the associate professor category from 4:1 to 3:1 and the professor category from 5:1 to 4:1.

Observations from the data-collection

Forty-three scholars were excluded during the data-collection: 10 from astronomy, 8 from environment, 21 from philosophy and 5 from public health. In appendix 1 we show, in tables, from which discipline and seniority these scholars were excluded and what caused the exclusion.


Our disciplinary samples are of different sizes, which means that direct comparisons of the causes of exclusion are not possible. Percentages are therefore used in the following analysis to indicate trends in online behaviour that led to exclusion. The total numbers of excluded and included scholars within each discipline were added together and used as the denominator in the percentage calculations in Table 2 and figures 2 & 3.
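The percentage calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the counts are taken from the Environment table "after data collection" in appendix 1 together with the 195 included scholars. Treating those appendix counts as the "excluded" part of the denominator is our reading of the text, not a documented procedure.

```python
def exclusion_percentages(included, excluded_by_reason):
    """Percentage of scholars lost to each exclusion reason.

    The denominator is included + all excluded scholars in the discipline,
    as described in the text.
    """
    denominator = included + sum(excluded_by_reason.values())
    return {
        reason: 100.0 * count / denominator
        for reason, count in excluded_by_reason.items()
    }


# Counts assumed from appendix 1 (Environment, after data collection).
environment_excluded = {
    "Dead link": 54,
    "Not discipline": 4,
    "Duplicate": 0,
    "No publication list": 65,
    "Not correct seniority": 1,
    "Impossible to find in POP": 1,
}

environment = exclusion_percentages(195, environment_excluded)
for reason, pct in environment.items():
    print(f"{reason}: {pct:.1f}%")
```

Because each discipline gets its own denominator, the resulting percentages can be compared across disciplines of different sizes, which raw counts cannot.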

Table 2. Percentage exclusion per discipline

                % Dead links   % Not discipline   % Duplicates   % No publication list
Astronomy       14             0.3                0.3            16
Environment     17             1                  0              20
Philosophy      13             6                  3              15
Public Health   17             2                  0.4            25

Noticeably, the most common reason for exclusion is that the scholar’s online presence does not include a publication list. Often scholars write about their specialty, projects, activities and achievements to promote interest in themselves and their field of study but omit the publication list. This appears to be more prominent in public health and environment, where the norm seems to be to link to a repository like PubMed, Inspire or ADS. In these cases the “publication list” is a link to an author search in the chosen repository. For example scholar number 523, who is a professor in public health, links directly to his publications in PubMed with the simple author search: Reis S[Author]. This retrieves 523 references. These works are authored by Ries S, Reis SE, Ries SR, Ries Si etc. After exhaustive sorting we found that his real number of publications is only 62. We have interpreted this to mean that some scholars are either unaware of name ambiguity problems and of how databases “think”, or are uncritical of numbers pulled from databases. This could be a real problem even for simple indicators, as we had expected that scholars would at least know their own number of publications and would question such an inflated number. Perhaps the ACUMEN portfolio will have to encourage scholars to use Google Citation or a similar system, or stipulate that scholars have an ORCID id to be part of the portfolio, so that they can claim all their real publications and calculate impact indicators more easily. The data indicate that in our sample the more senior the scholar is, the more likely it is that the publication list was missing from their web profile (Fig. 2).

Fig. 2. Percentage “no publication list”: seniority within discipline. [Bar chart. X-axis: seniority (PhD, post doc, assis, assoc, prof); y-axis: percentage (0 to 14); series: Astronomy, Environment, Philosophy, Public Health.]

Dead links are the second major cause of exclusion and are fairly evenly distributed across the disciplines. The internet is a dynamic resource, with information being added and removed constantly, and the dead links in our sample do not appear to be more prominent in one discipline than another, which would have indicated disciplinary issues with site maintenance. It is, though, worth stressing that the sample we present here is a snapshot of the internet, and a different sample could be produced if the collection process were repeated at a later date.

Fig. 3. Percentage dead links: seniority within discipline. [Bar chart. X-axis: seniority (PhD, post doc, assis, assoc, prof); y-axis: percentage (0 to 9); series: Astronomy, Environment, Philosophy, Public Health.]

Scholars appear to leave homepages or profiles incomplete when a new type of online profile tool becomes available or when they move institutions. This has had a direct effect on our access to the publication lists of the scholars in our sample, especially senior scholars. In the short time between defining the sampling strategy and collecting the data, links to publication lists have died, persons have moved institutes or been promoted, and sites have closed down or are under construction, meaning that publication data could not be collected and verified. This was especially noticeable in Public Health and Environment, whose scholars have a very active web presence, often with 3 or more e-profiles of varying currency on, for example, LinkedIn, blogs, Google Citations, Inspire, Scopus ID, PURE, CURIS, ORCID, Mendeley, Facebook, Microsoft Academic Search, Academia.edu, Impact Story, institutional homepages, project websites, etc. As a result, sites are neglected or expire when a new profile is created, and were often under construction during our data collection window.

We observed that astrophysicists enjoy using online dissemination tools the most and take enormous pride in personalizing homepages with all manner of interactive communication techniques, animations and outlinks to other interesting pages on the internet. This was, however, challenging in the data collection process, as publication lists were “hidden” in solar systems or split up under different project pages or types of publication. The ACUMEN portfolio will have to encourage personalization to attract these scholars but also be simple enough that the information is easily findable by consumers of ACUMEN CVs. Further, some astrophysicists, as well as environmental scientists and public health scholars, already include metrics on their CVs. The use varies from the very competent, who contextualize the metrics in great detail, to scholars who list the impact factor of the journals they publish in; please see the examples in appendix 3. In ADS (see footnote 7) ready-to-use metrics are available, as they are in the database Inspire (see footnote 8), with little or no guidance on responsible use and interpretation of these statistics. The metrics are presented as a list of numbers, leaving the interpretation open to the consumer. The inclusion of metrics on CVs in our sample indicates that scholars in three of our disciplines are interested in bibliometrics enriching their publication lists, but this interest is noticeably absent in the fourth discipline, Philosophy. This is where the strength of the ACUMEN portfolio will lie, and how it will differ from other resources that solicit CVs using bibliometrics. ACUMEN presents the scholar with metrics that are not only beneficial to the hard sciences but relevant to the individual scholar, their seniority and their specialty, and gives the scholar tools to contextualize the metrics and present them to the consumer in a narrative that explains what the numbers mean and how the resulting “impact” has been interpreted.

7 http://adsabs.harvard.edu/tools/metrics/

8 http://inspirehep.net/author/G.Aad.1/


The performance of WOS and Publish or Perish (POP) during data collection

The students collecting the data were asked to keep a log book of their experiences searching WOS and POP. Two students did this and their log books can be found in appendix 2. The notes are written in a mixture of Danish and English, and are copy/pasted without grammatical correction from the students’ log books. The notes have however been anonymized and categorized into disciplines and seniorities. The main observations are reported in the next sections.

Publication lists

Publication lists are rarely complete and more often than not out of date. In the data collection our method was to search from the date of the first reported publication on the list to 2013, even if the publication list did not report publications up to this year. Google Scholar includes publication types such as reports, comments and teaching materials, which give a different publication/activity profile of the scholar than the profile in WOS, which is limited primarily to journal articles, reviews and conference papers. Some scholars boost their publication lists or activity by including publications by colleagues in their project group, while junior scholars link to lists by their department peers to increase their visibility and show their network. These publications were not included in our publication data.

Not all the publications found in WOS have a UT number, which means there will be a slight discrepancy between the descriptive statistics based on the actual number of publications found in WOS and the bibliometric results based on the WOS UT numbers, such as P, CPP.

Name ambiguity

As expected, finding a scholar with a common name such as “Fan” or “Li” and identifying their real publications was in some cases impossible in POP, for example:

Author name: ”ab logan” NOT ”ba logan” ”bb logan” ”bc logan” ”cb logan” ”db logan” ”bd logan” ”cb logan” ”bc logan” ”db logan” ”bd logan” ”eb logan” ”be logan” ”fb logan” ”bf logan” ”gb Logan” ”bg logan” ”hb logan” ”bh logan” ”ib logan” ”bi logan” ”jb logan” ”bj logan” ”kb Logan” ”bk logan” ”lb logan” ”bl logan” ”bm logan” ”mb logan” ”nb logan” ”bn logan” ”ob Logan” ”bo logan” ”pb logan” ”bp logan” ”qb logan” ”bq logan” ”rb logan” ”br logan” ”sb Logan” ”bs logan” ”tb logan” ”bt logan” ”ub logan” ”bu logan” ”vb logan” ”bv logan” ”bx Logan” ”xb logan” ”yb logan” ”by logan” ”zb logan” ”bz logan” "ahb logan" "elb logan" "lb Logan-fain"
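An exclusion list like the one above is tedious to type by hand. The following Python sketch shows one way such a list could be generated. The function names are hypothetical, and the quoted-name format simply copies the logged example; it is not a statement about how Publish or Perish parses its author field. The sketch covers only the two-initial variants, so three-initial and hyphenated forms such as "ahb logan" or "lb Logan-fain" would still have to be added manually.

```python
import string


def initial_variants(target_initials, shared_initial):
    """All two-letter initial pairs containing shared_initial, except the target."""
    variants = []
    for letter in string.ascii_lowercase:
        for initials in (letter + shared_initial, shared_initial + letter):
            # Skip the scholar we want to keep and avoid duplicates such as "bb".
            if initials != target_initials and initials not in variants:
                variants.append(initials)
    return variants


def build_author_query(target_initials, shared_initial, surname):
    """Build an 'include NOT exclude ...' author string in the logged format."""
    include = f'"{target_initials} {surname}"'
    exclude = " ".join(
        f'"{v} {surname}"' for v in initial_variants(target_initials, shared_initial)
    )
    return f"{include} NOT {exclude}"


query = build_author_query("ab", "b", "logan")
print(query)
```

Generating the list programmatically avoids accidental repeats and omissions, but it does not remove the underlying problem that the query may still hit the result limit.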

It is not possible to limit the search to a discipline, and POP stops the search when the limit of one thousand publications has been reached, eliminating what it considers to be less relevant publications than the ones returned. In terms of citations, these are usually articles with few (or no) citations. The omission may or may not be significant: most high-level citation metrics such as the h-index and g-index are fairly robust and are unlikely to be affected. However, as we were looking for specific results, these might be missing from the results list. It was not possible to search for publications individually and group them to generate the bibliometric statistics. In these cases, POP’s ready-to-use bibliometrics are not useful, as they do not reflect the true publication profile of the author and will give invalid information.

Homonyms are also a problem in POP. The students found that it was not uncommon for two or more authors to share the same surname and initial and be active within the same discipline, which made it difficult to attribute the correct publications to the author. In POP tenacity and creativity are required to identify the scholar. For example, the scholar Dvorak, who spells his name differently when publishing in English than when publishing in Hungarian, was searched for in POP as:

"peter dvorak" or "petr dvorak" or "p dvorak" or "petra dvořáka" or " p Dvořáka" Exclude: "pa dvorak" "pj dvorak" "pf dvorak" "lp dvorak"

Likewise, scholars use a formal name, “Samuel Clark”, for scientific articles and books and an informal name, “Sam Clark”, on popular science documents, blogs, reviews, newspaper articles, etc. This is an important distinction to be aware of when searching for publications on the internet.


In self-evaluation name ambiguity should not be a problem, as scholars will know the alternative names they have used on their publications. However, this must not be assumed, as we have already reported in this paper scholars’ unquestioning acceptance of search results.
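Part of the name-form problem described in this section is mechanical: the same name may appear with or without diacritics and in different letter case. A minimal Python sketch of folding such variants to a common key, using only the standard library, is given below. The function names are hypothetical and this was not part of the data collection. Folding does not resolve inflected forms ("dvorak" versus "dvoraka") or formal versus informal first names ("Samuel" versus "Sam"), which still need manual judgment.

```python
import unicodedata


def fold(name):
    """Lower-case a name, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def group_name_forms(names):
    """Group the name forms that fold to the same key."""
    groups = {}
    for name in names:
        groups.setdefault(fold(name), []).append(name)
    return groups


forms = ["Petr Dvořák", "petr dvorak", "P Dvořáka", " p dvoraka"]
for key, members in group_name_forms(forms).items():
    print(key, "<-", members)
```

A portfolio that lets the scholar list all name forms could store the folded key alongside each form to make later matching easier.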

National language challenges

Researchers publish in their national languages, which made it challenging to correctly couple the author to publications, especially in POP. In these cases the method was first to find the publications in WOS, as here the English-language publications are prominently indexed, and to use the abstracts and indexing terms to understand the subject area. Using the researcher’s publication list as a master, the publications in GS were compared to the publication list and the WOS list, and key title words were translated using Google Translate. In this way works with the same author name that were not on the publication list were identified, and foreign-language publications were attributed correctly. This was a painstakingly slow process, but by doing so non-English-language publications were systematically collected and hence well represented in the sample. We thus ensured that national language publications were not excluded due to our lack of knowledge of foreign languages.
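The comparison against a master publication list was done by hand in this study. As an illustration only, the comparison step could be partly supported by approximate title matching, sketched here in Python with the standard library `difflib` module. The function names, the example titles and the 0.85 cutoff are assumptions for the sketch, and titles in another language would still need translating before they could match.

```python
import difflib


def normalise(title):
    """Lower-case a title and collapse whitespace before comparison."""
    return " ".join(title.lower().split())


def match_to_master(candidate_titles, master_titles, cutoff=0.85):
    """Split candidate titles into those resembling a master-list title and the rest."""
    master_by_key = {normalise(t): t for t in master_titles}
    matched, unmatched = {}, []
    for candidate in candidate_titles:
        hits = difflib.get_close_matches(
            normalise(candidate), list(master_by_key), n=1, cutoff=cutoff
        )
        if hits:
            matched[candidate] = master_by_key[hits[0]]
        else:
            unmatched.append(candidate)  # needs manual checking
    return matched, unmatched


# Hypothetical titles, not taken from the sample.
master = ["Citation analysis of small research groups", "Indicators for the humanities"]
found_in_gs = [
    "Citation Analysis of Small Research Groups.",
    "A completely different paper on galaxies",
]
matched, unmatched = match_to_master(found_in_gs, master)
print(matched)
print(unmatched)
```

Unmatched titles are not necessarily wrong attributions; they are simply the ones that still need the manual checking described above.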

Disciplines

Many scholars in the sample work in multi-disciplinary specialties and publish in a wide range of formats and academic journals. Designing useful benchmarks against which scholars can contextualize their performance will be challenging. For example, statisticians in Public Health publish in the traditions of the medical specialty they are working with, and surgeons publish, for example, at a much higher rate than practitioners of emergency medicine. The same trait is apparent in Philosophy, where cosmic-philosophers’ publishing styles mimic those of astrophysicists, with a high number of multi-author publications, whereas philosophers of economics appear to single-author papers and publish more books than their cosmic-philosopher fellows.

Recommendations

Emphasize the importance of storing the online CV, publication list and online profile in one place and keeping it up-to-date. As a consumer it is difficult to gather a complete picture of the scholar when information is separated into personal homepages, institute homepages, pdfs and various profile tools.

We cannot expect the researcher to sort through two or more citation indices and remove duplicate citations to get a complete citation record. We do however encourage the researcher to explore different indices to understand their coverage in them and to be critical of what the ready-to-use metrics reported in these sources represent. The optimum would be for the scholar to present indicators on their ACUMEN CV, such as the number of citations per paper and the h-index, extracted from more than one database, and to present the range.

Describe name ambiguity problems and how these affect the usefulness of citation indices and ready-to-use metrics. Ensure the scholar has room to write all the names he or she publishes or has published under. Name forms will make it easier for the consumer of the CVs to track activities and validate information. Research funders, research organisations, publishers, integrators etc. will find this useful.

Require the scholar to have an ORCID id or Google Citation profile to ensure the scholar can easily claim their publications.

Ensure easy import of publications into the portfolio. It will take effort to start an ACUMEN CV. The portfolio must support import of existing publication lists in RIS, BibTeX and RefMan formats and from Scopus ID, Google My Citations, WOS, Mendeley, Excel etc. Possible support in a “search and link” wizard? Search and link metadata on books, manuscript submissions, patents etc.

Enable the researcher to set up an alert/search profile that can pull publications into the CV after the researcher has accepted the publication as theirs and not a duplicate.

Develop guides to calculation and interpretation of metrics, both for the scholar AND the consumer.

The portfolio must include a description of the problems with the representability of reference standards at the individual and specialty level. We must provide guidelines to how the scholar can establish local standards that reflect their specialty as a field, acknowledging their multi-disciplinary character.

Personalisation of the ACUMEN CV will encourage use. Ensure that scholars can link to their peers’ ACUMEN CVs, as on LinkedIn.


Provide a guide on how to present indices on the CV.
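The recommendation to present indicators from more than one database as a range can be illustrated with a short Python sketch. The h-index and citations-per-paper definitions are the standard ones; the function names and the citation counts are hypothetical and do not come from the sample.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def citations_per_paper(citations):
    """Mean number of citations per paper (0.0 for an empty list)."""
    return sum(citations) / len(citations) if citations else 0.0


def indicator_ranges(citations_by_database):
    """Report each indicator as a (lowest, highest) range across databases."""
    h_values = [h_index(c) for c in citations_by_database.values()]
    cpp_values = [citations_per_paper(c) for c in citations_by_database.values()]
    return {
        "h_index": (min(h_values), max(h_values)),
        "citations_per_paper": (min(cpp_values), max(cpp_values)),
    }


# Hypothetical per-paper citation counts for one scholar in two databases.
scholar = {
    "WOS": [25, 8, 5, 3, 3, 0],
    "GS": [40, 12, 9, 6, 4, 4, 1, 0],
}
ranges = indicator_ranges(scholar)
print(ranges)
```

Reporting a range rather than a single value makes the dependence on database coverage visible to the consumer of the CV without requiring the scholar to merge and de-duplicate the underlying records.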

Next steps

Data-analysis will continue with a description and trend analysis of the simple statistics from POP and later a correlation analysis of the simple and sophisticated indicators from CWTS, based on the WOS data. These analyses will enable us to decide if the indicators we recommend for the ACUMEN portfolio are a strong model of the disciplines and help us to identify which indicators are missing. Reference standards will be investigated as we are already aware of the difficulty the scholar will have in calculating useful peer comparisons. We will exemplify using performance standards supplied by CWTS that are based on a large level of aggregation and compare them with pseudo-h indices of the scholar’s peers and percentile citations at the article level. Are these simple indices a useful predictor of impact within a community?

We will be looking at indicators and gender, academic posts and disciplinary representation. Perhaps the indicators and data we have identified are data-driven and not researcher-driven. What consequences will this have for the usefulness of the metrics in the portfolio?


Appendices

1. Composition of disciplinary sample before and after data-collection

2. Log book from data-collection

3. Excerpts of CVs using bibliometrics


Appendix 1: Composition of disciplines before and after data-collection

Astronomy

Composition of discipline identified in sampling strategy

Astronomy Phd Post Doc Assis. Prof Assoc. Prof Prof Total
Dead link 2 16 1 6 13 38
Not Discipline 1 1
Duplicate 1 1
Not publication list 1 6 8 15 10 40
Not correct seniority

Composition of discipline after data collection

Astronomy Phd Post Doc Assis. Prof Assoc. Prof Prof Total
Dead link 2 17 1 7 14 41
Not Discipline 1 1
Duplicate 1 1
Not publication list 1 6 9 18 12 46
Not correct seniority

The set is reduced from 203 to 193 scholars, a reduction of 5%

Environment

Composition of discipline identified in sampling strategy

Environment Phd Post Doc Assis. Prof Assoc. Prof Prof Total
Dead link 2 6 7 25 11 49
Not Discipline 1 2 3
Duplicate
Not publication list 2 4 15 25 15 61
Not correct seniority 1 1

Composition of discipline after data collection

Environment Phd Post Doc Assis. Prof Assoc. Prof Prof Total
Dead link 2 6 9 25 12 54
Not Discipline 1 3 4
Duplicate
Not publication list 2 5 16 26 16 65
Not correct seniority 1 1
Impossible to find in POP 1 1

The set is reduced from 203 to 195 scholars, a reduction of 4%.

Philosophy

Composition of discipline identified in sampling strategy

Philosophy Phd Post Doc Assis. Prof Assoc. Prof Prof Total
Dead link 2 5 9 12 17 45
Not Discipline 1 1 4 2 8
Duplicate 1 1 5 3 10
Not publication list 2 4 8 21 20 55
Not correct seniority

Composition of discipline after data collection


Philosophy Phd Post Doc Assis. Prof Assoc. Prof Prof Total
Dead link 2 6 10 12 18 49
Not Discipline 1 3 10 8 22
Duplicate 1 1 6 3 11
Not publication list 2 4 8 21 20 55
Not correct seniority

The set is reduced from 250 to 229 scholars, a reduction of 8%.

Public Health

Composition of discipline identified in sampling strategy

Public Health Phd Post Doc Assis. Prof Assoc. Prof Prof Total
Dead link 3 2 8 17 10 40
Not Discipline 2 1 1 2 6
Duplicate 1 1
Not publication list 3 4 9 26 16 58
Not correct seniority

Composition of discipline after data collection

Public Health Phd Post Doc Assis. Prof Assoc. Prof Prof Total
Dead link 3 2 8 17 11 41
Not Discipline 2 1 1 2 6
Duplicate 1 1
Not publication list 3 4 9 29 16 61
Not correct seniority

The set is reduced from 137 to 132 plus one scholar moved from environment to public health (n133), a reduction of 3%.

Appendix 2: Log book from the data-collection.

These are observations by the students collecting the publication data in Web of Science and Google Scholar, via Publish or Perish. The students were asked to note any problems or challenges they had collecting data in these two indices. They were also encouraged to write down their thoughts about the performance or “usefulness” of WOS and POP in searching for a scholar’s publications. The notes are written in a mixture of Danish and English, and are copied without grammatical correction from the students’ log books. The notes have however been categorized into disciplines and seniorities.

Astronomy & Astrophysics

PhD students

The author Akos Kereszturi has published articles since 1994, which could indicate that he may not be a post doc. Kereszturi has 35 publications on his publication list, but also notes a good deal of popular science communication, which will presumably show up in GS. He is very keen on broad communication of physics, which may explain the high number of publications in GS, though 294 is perhaps rather high. They are all within astrophysics, and I opened the documents I was in doubt about and they were by Akos Kereszturi.

Michael Weidinger may be a problem in GS, as there is another physicist named Matthias Weidinger, publishing from the University of Wurzburg, who also publishes within astronomy and astrophysics. It will be difficult to tell the two apart in GS.

GS: ”erik bartoš” would exclude ”me bartoš”, but it turned out that GS had included his title as a first name, so it was him.

Assistant professors

Many duplicates in Publish or Perish.

Daphne Weihs, no. 77, is a biomedical scientist and does not work with astronomy or similar.


Msg_max_results

“Warning: results limit reached. The query returned <n> results, which is the maximum that Google Scholar allows. This may affect the query coverage. Click Help for more information.”

Indicates that your query returned the maximum number of results that Google Scholar allows (1000; sometimes a few less). Your query may have more matches, but the remainder are not available. As a result, some potential matches may be omitted from the list of results. Generally speaking, the missing results are deemed by Google Scholar to be less relevant than the ones that were returned. In terms of citations, these are usually articles with few (or no) citations. The omission may or may not be significant: most high-level citation metrics such as the h-index and g-index are fairly robust and are unlikely to be affected. However, if you are looking for one or more specific results, then these might be missing from the results list.

Professors

Professor Li

Google Scholar search: stopped after 1000 posts retrieved, the search is not representative of his work.
Search query: "cheng li" from 2001 to 2013: all
Query date: 2013-06-27
Papers: 47
Citations: 3491
Years: 13

Professor Varga

Google Scholar search: stopped after 1000 posts retrieved, the search is not representative of his work.
Search query: "p varga" from 1966 to 2013: all
Query date: 2013-06-27
Papers: 1000
Citations: 16203
Years: 48

The search “peter varga” resulted in 34 posts, mostly Hungarian, but they all belong to our professor. Hungarian posts were verified by looking up the titles in Google Translate and on his CV (which is out of date).


Environmental Science, studies & engineering

Assistant professors

255 Freni, G.
Publication count quadrupled in GS. Besides non-English-language literature, this was also due to a good deal of practice-oriented material (reports etc.).

Associate professors

280 Rajta, I.
Not within environmental science; publishes in physics.

281 Gendel, Y.
Links to another person's CV. His own CV cannot be found on the site, but a Google search shows that he is a PhD student and that the CV he links to is his supervisor's. He only received his PhD in 2011, so his status as assoc_prof is doubtful.
http://www.google.dk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=10&ved=0cgmqfjaj&url=http%3a%2f%2fwww.neaman.org.il%2fneaman2011%2fuserdata%2fsendfile.asp%3fdbid%3d1%26lngid%3d2%26gid%3d2344&ei=9zjpudpmoczzsgakk4dicg&usg=afqjcngnfabqimgdoss6mrglx0rguvtjiq&bvm=bv.48572450,d.yms


Philosophy and the History & Philosophy of Science

Post doctoral students

26/6 Gramelsberger, Gabriele (417)
No name clashes or other noise.

26/6 Lessmann, Ortrud (418)
Name clash: Lessmann, Olivier
Easy to separate, as their subject areas were very different.

26/6 Novotny, Daniel D. (419)
Name clashes: Novotny, Duan; Novotny, David
Searching on "novotny dd" filtered out most of the noise.

26/6 Dicken, Paul (420)
Name clash: Dicken, Peter
Differences in subject area made separation easy.

26/6 Malmqvist, Erik (421)
Name clash: Malmqvist, Ebba
The subject areas were very close, but she was clearly a practitioner, whereas he is very theoretically oriented. This made the sorting considerably easier.

26/6 Frega, Roberto (422)
Name clash: Frega, Romeo
The subject areas differed, so sorting was easy.

26/6 Marvan, Tomas (423)
No name clashes or other noise.

26/6 Eronen, Markus (424)
Several name clashes, but searching on "eronen mi" returned only relevant documents. Some documents may have been missed, but those I found were highly relevant.

26/6 Gerken, Mikkel (425) (impossible)
Many name clashes across many different subject areas. The clean-up, especially in GS, proved very time-consuming.

26/6 Herran, Néstor (426)
A few name clashes, but since his field has a very narrow focus, sorting was easy.

26/6 Shultziner, Doron (427)
No name clashes or other noise.


26/6 Hennig, Boris (428)
Many name clashes and a few subject-area overlaps. In GS especially it may be necessary to re-check the results, as some of those I judged relevant may have been by a namesake.

26/6 Backman, Jussi (429) (impossible)
A very high degree of name clashes, also at his own university. Subject-area overlap is less pronounced, but the amount of noise from namesakes makes sorting a huge task.

26/6 Roinila, Markku (430)
One name clash with an American researcher who wrote about the Finnish-Swedish immigrant minority in North America. Easy to tell apart.

26/6 Milne, Richard (431) (impossible)
High frequency of name clashes, also in related subject areas.

26/6 Buczek, Pawel (432)
Name clash: Buczek, Piotr
The subject areas are different enough to allow sorting.

26/6 Vagelpohl, Uwe (433)
No apparent name clashes or other noise.

26/6 Pieters, Wolter (434)
One name clash within a closely related subject area: "Pieters, Willem"
Sorting was somewhat awkward in GS as I do not understand Dutch, but it went relatively painlessly.

26/6 Lönnqvist, Jan-Erik (435)
Name clash with a chemist.

26/6 Stokes, Patrick (436)
Many name clashes; might be worth reviewing again.

26/6 Evers, Daan (437)
Very high frequency of name clashes. Hard to pin down in GS, as he was filtered out when I tried to exclude various extra initials.
Should possibly be reviewed again.

26/6 Sanchez Leon, Alberto (438)
Few name clashes, but those that occurred were also close in subject area. In GS especially it was hard to work out which documents belonged to him.
Should possibly be reviewed again.

Assistant professors

Dvorak 454 (impossible)
GS:
Author names: "peter dvorak" or "petr dvorak" or "p dvorak" or "petra dvořáka" or "p dvořáka"
Exclusion: "pa dvorak" "pj dvorak" "pf dvorak" "lp dvorak"
I found "petre dvoraka" on the author's own site by going one step back from the English page. The search above returns 940 records.
Dvorak can evidently be spelled in many ways, and as far as I have been able to see Peter Dvorak's name can also be spelled in several ways, so I do not know how else I could exclude.

Roy 455
GS
Author names: "oliver roy" or "o roy"
Exclusion: "oc roy" "jo roy" "ofa roy" "mo roy" "op roy" "po roy"
312 records
Ranked by publication and went through them. The relevant ones were usually grouped with other relevant ones because of the publication.


Ridge 458 (impossible)
GS
Author names: "steve ridge" or "s ridge"
Exclusion: "sgm ridge" "sa ridge" "se ridge" "sgk ridge"
No relevant results. Can that be right?

Simon 461 (impossible)
GS
Author names: "fabrizio simon" or "f simon"
Exclusion: "af simon" "fb simon" "fa simon" "fjg simón" "fx simon" "fg simon" "fr simon" "mf simon" "jf simon" "fjg simon" "lf simon" "df simon" "hf simon" "fp simon" "bf simon" "fm simon" "f simon-ritz" "f simon nieto"
Apparently many people are called F Simon, so there were over 1000 records even though I excluded a number of surnames. So this one is impossible.

Wilkinson 464 (impossible)
WoS
Author=(angus j wilkinson) or author=(wilkinson aj) or author=(wilkinson a)
Refined by: authors=( wilkinson a or wilkinson aj ) and [excluding] web of science categories=( biochemistry molecular biology or health policy services or surgery or medicine research experimental or transplantation or psychology or infectious diseases or pathology or microbiology or pediatrics or biochemical research methods or immunology or cell biology or medicine general internal or genetics heredity or hematology or physiology or zoology or psychology multidisciplinary or nursing or behavioral sciences or tropical medicine or psychology experimental or psychiatry or psychology biological or clinical neurology ) and research areas=( engineering or materials science or physics or metallurgy metallurgical engineering ) and authors=( wilkinson aj )
Timespan=1991-2013. Databases=sci-expanded, cpci-s.

GS
Author name: "angus j wilkinson" or "aj wilkinson"
561 records

Schäfer 471
WoS
Refined by the universities he has worked at, which gave 12 records out of the original 151. The question is whether there is material written by Schäfer that is not registered under the university.

Author=(mike s schaefer) or author=(schaefer ms) or author=(schaefer m)
Refined by: organizations-enhanced=( free university of berlin or university of hamburg )
Timespan=2002-2013. Databases=ssci, a&hci, cpci-ssh.

Clark 472
His name is Samuel Clark, but his own page gives it as Sam Clark. I only found his publications after searching on Samuel alone and not Sam. I tried searching for his other articles by title in WoS but found nothing. So he is catalogued only as Samuel, at least in WoS.
GS
"samuel clark" or "sam clark" "sj clark" "sl clark" "sr clark" "sa clark" "se clark" "st clark" "js clark"
Found 4

Moreno Muñoz 476
WoS
Using munoz in the search, I found nothing under his name. Using only moreno returned over a thousand records, but looking at the categories, nothing related to his field. So I recorded 0 results.

Associate professors

Chapman 489 (impossible)
GS
"siobhan chapman" or "s chapman" NOT "sc chapman" "cs chapman" "rs chapman" "ds chapman" "sj chapman" "ms chapman" "ds chapman" "ls chapman" "sw chapman" "fs chapman" "sk chapman" "sb chapman" "as chapman" "ks chapman" "bs chapman" "st chapman" "st chapman" "ss chapman" "ls chapman" "sr chapman" "sg chapman" "es chapman" "sp chapman" "js chapman" "ps chapman" "ns chapman" "sd chapman" "sg chapman" "st chapman"
1993-2013
Over 1000 records

Gonzales 498 (impossible)
WoS
She herself links to a researcherid.com page listing 91 publications. When I enter her author ID number in WoS I get only 10 records. Searching on her name returns far more, but restricting by the organisation they come from (University of Navarra) gives 11. Two of them are new, one of which is hers. Where the last one is remains a good question. So I get only 11 results.

GS
"ana marta gonzalez" or "am gonzalez" NOT "am gonzales-angulo" "am gonzales-paramas" "am gonzales-vadillo" "am gonzalez-rodriguez" "jm alvarez-suarez" "am gonzalez soca" "am gonzalez gonzalez" "jm alvarez-suarez" "am gonzalez-angulo" "am gonzalez-cameno"
Still over 1000 records.

Obrien 499 (impossible?)
I found no records in either GS or WoS.

Christensen 505
WoS
Author=(anne-marie soendergaard christensen) or author=(anne-marie sondergaard christensen) or author=(anne marie soendergaard christensen) or author=(anne marie sondergaard christensen) or author=(christensen ans) or author=(christensen as)
Timespan=2006-2013. Databases=sci-expanded, ssci, a&hci, cpci-s, cpci-ssh.
No records
GS
Only 6 records

Kuna 529 (impossible)
GS
My search produced error 13. The results that appeared were not relevant.

Logan 535 (impossible)
GS
Restriction: "ab logan" NOT "ba logan" "bb logan" "bc logan" "cb logan" "db logan" "bd logan" "cb logan" "bc logan" "db logan" "bd logan" "eb logan" "be logan" "fb logan" "bf logan" "gb logan" "bg logan" "hb logan" "bh logan" "ib logan" "bi logan" "jb logan" "bj logan" "kb logan" "bk logan" "lb logan" "bl logan" "bm logan" "mb logan" "nb logan" "bn logan" "ob logan" "bo logan" "pb logan" "bp logan" "qb logan" "bq logan" "rb logan" "br logan" "sb logan" "bs logan" "tb logan" "bt logan" "ub logan" "bu logan" "vb logan" "bv logan" "bx logan" "xb logan" "yb logan" "by logan" "zb logan" "bz logan" "ahb logan" "elb logan" "lb logan-fain"
There are still over 1000 records. Even when I exclude them they still appear. So I cannot see what I could do differently.
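An exclusion list like the one for Logan follows a regular pattern (every two-initial combination containing "b" except the wanted "ab"), so it could be generated rather than typed by hand. The sketch below is an illustration only: the function names are hypothetical, the query layout simply mirrors the log entry above, and it makes no claim about how Google Scholar or Publish or Perish interpret such a string.

```python
import string


def initial_exclusions(surname, keep, shared_initial):
    """Quoted two-initial variants that contain shared_initial, minus the wanted one."""
    variants = []
    for letter in string.ascii_lowercase:
        for initials in (letter + shared_initial, shared_initial + letter):
            if initials != keep and initials not in variants:
                variants.append(initials)
    return ['"{} {}"'.format(initials, surname) for initials in variants]


def build_query(surname, keep, shared_initial):
    """Wanted name followed by NOT and the generated exclusions."""
    excluded = initial_exclusions(surname, keep, shared_initial)
    return '"{} {}" NOT {}'.format(keep, surname, " ".join(excluded))


exclusions = initial_exclusions("logan", keep="ab", shared_initial="b")
print(len(exclusions))
print(build_query("logan", keep="ab", shared_initial="b")[:40])
```

Generating the list avoids the accidental duplicates and gaps that creep into a hand-typed one, but as the log entry notes, a complete exclusion list did not bring the result set under the 1000-record limit.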


Professor

27/6 Borgato, Maria Teresa (570)
No name clashes or other noise.

29/6 Osborne, Catherine (571) (impossible)
Name clashes within the same subject area, a particular problem in GS, where I became unsure whether I had marked the right author.

29/6 Klein-Braslavy, Sara (572)
No name clashes or other noise.

29/6 Lam, Alice (573) (impossible)
Name clashes, also within related subject areas.

29/6 Lorch, Marjorie Perlman (574)
A few name clashes, but the separation of subject areas and her focus on a very narrow topic made sorting easy.

29/6 Galavotti, Maria Carla (575)
No apparent name clashes or other noise.

29/6 Enslin, Penny (576)
Very few name clashes, mainly in GS; no apparent subject-area overlaps.

29/6 Unterhalter, Elaine (577)
No name clashes or other noise.


29/6 Galeotti, Anna Elisabetta (578)
No apparent name clashes or other noise.

29/6 Griffiths, Morwenna (579) (impossible)
Name clashes within closely related subject areas.

29/6 Frewer, Lynn J (580)
Name clash outside the subject area: Frewer, Lorna, who wrote about peacekeeping forces and military deployment.
A very large number of results, especially in GS; this may be due to duplicates.

29/6 Chemla, Karine (581)
No name clashes or other noise.

30/6 Verbrugge, Rineke (582)
A few name clashes, relatively easy to sort as there were no closely related subject areas.

30/6 Garcia-Encinas, Maria Jose (583)
No name clashes or other noise.

30/6 Campos Boralevi, Lea (584)
No name clashes or other noise.

30/6 Fernandez, Angel Nepomuceno (585)
Name clash within a related subject area, but it was still possible to separate them.


1/7 Chaline, Jean (586)
There was a very large number of extra records in GS. I am not sure whether these are duplicates or whether there are several authors in the same field, but I included all records that stayed within the topic.

1/7 Malo, Antinio (587) (impossible)
There were too many name clashes to produce a meaningful sorting without spending many hours on it.

1/7 D’Agostino, Marcello (588) (impossible)
Many name clashes, but not in closely related subject areas.
GS was a very large sorting task.

1/7 Buzzoni, Marco (589) (impossible)
The link to his own publication list was dead.


Public health and Public Health Policy

Assistant professors

17/6 Bode, Christina (703)
Many name clashes in both WoS and GS within related fields: Bode, Christoph; Bode, Carole
Solution
WoS: Search on the full first name and see which categories are attached to the search result, then use them. Exclude institutions and universities with which the researcher is not and has not been affiliated (organizations-enhanced -> exclude in more options). Review the titles to check that they match her research specialisation.
GS: Search on the full or partial first name. Exclude initials as per the guidelines.
Sources of error
May have excluded documents where she appears with only the first initial (bode, c).
WoS: May have excluded conference documents by excluding certain organisations.
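The WoS steps described for Bode (restrict by category, then by organisation, then check titles) and the noted risk of losing records when restricting by organisation can be sketched as a small triage routine. This is only an illustration under stated assumptions: the record layout (`title`, `categories`, `organisations`), the function name, and all sample values are hypothetical and are not a WoS export format or data from the sample.

```python
def triage(records, allowed_categories, known_orgs):
    """Split records into those kept and those set aside for manual title review."""
    kept, review = [], []
    for rec in records:
        # Step 1: drop records outside the categories that fit the researcher's field.
        if not set(rec["categories"]) & allowed_categories:
            continue
        orgs = set(rec["organisations"])
        if orgs & known_orgs:
            # Step 2: a known affiliation confirms the record.
            kept.append(rec)
        elif not orgs:
            # No affiliation recorded: check the title by hand instead of dropping it.
            review.append(rec)
    return kept, review


# Invented sample records for illustration.
records = [
    {"title": "Paper 1", "categories": ["public health"], "organisations": ["University A"]},
    {"title": "Paper 2", "categories": ["literature"], "organisations": ["University B"]},
    {"title": "Paper 3", "categories": ["public health"], "organisations": []},
    {"title": "Paper 4", "categories": ["public health"], "organisations": ["University C"]},
]
kept, review = triage(records, {"public health"}, {"University A"})
print([r["title"] for r in kept], [r["title"] for r in review])
```

Setting aside records without a recorded affiliation, rather than excluding them, is one way to limit the error source noted above for conference documents.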

18/6 Booth, Alison (704) (impossible)
She has the second initial M., which does not appear on her university website. Booth, AM according to WoS.
Many name clashes in both WoS and GS: Booth, Andy M.; Booth, Alexander; Booth, Al; Booth, AO
Solution
WoS: Search on the full first name plus initial: booth alison m
GS: It was not possible to get a usable result straight away.

18/6 Williams, John R (705) (impossible)
There are so many John R. Williams that it was impossible to construct a search that readily gave good results.

18/6 Huhtala, Heini (706)
Spot checks revealed no name clashes and also showed the same location. The WoS categories for the dataset all fall within medical or related categories.
The GS data were too extensive for more than a superficial review, but as in WoS they appear to fall within the medical field or related fields.

18/6 Gardner, Benjamin (707) (impossible)
By restricting on both organizations-expanded and WoS categories I arrived at the WoS results in the spreadsheet.
In GS it was close to impossible to restrict the search so that it hit the right author. I have included the results I arrived at, but in my judgement a major clean-up is needed.

18/6 Spilková, Jana (708)
Name clash: Spilkova, Jirina
Employed at the same university and has published in roughly the same period. In both GS and WoS I sorted on the basis that they wrote about different subject areas.


18/6 Andreucetti, Daniele (709)
No name clashes or other problems in either WoS or GS.
In GS there were a number of titles in Italian, but from what I could decipher they were all relevant.

18/6 Van Solinge, Hanna (710)
No problems with name clashes or the like.
In GS there were two articles in Spanish. With my knowledge of Spanish I could not readily judge their relevance, but at least one of them seemed to concern families and the elderly; they have therefore not been excluded from the dataset.

Associate professors

19/6 Hakkaart-van Roijen, Leona (711)
No problems with delimiting the search in either GS or WoS.

19/6 Baron-Epel, Orna (712)
No name clashes or other obvious noise in either GS or WoS.

19/6 Johnsen, Søren P. (713)
No apparent name clashes in WoS.
GS: The documents appeared relevant apart from a single record written in Cyrillic. I could not assess its content, but it is included in the dataset.


19/6 Reis, Shmuel (714) (impossible)
An enormous number of name clashes for reis, s****: many different first names go with the surname Reis.
WoS also included authors with compound names of the type reis-s****, e.g. Reis-Silva.

19/6 Jensen, Jesper Ole (715) (impossible)
Many name clashes. Tried to restrict in WoS with "countries/territories", choosing "denmark". A researcher at DTU named Jensen, Jens Oluf still dominated the list. In GS there is far too much noise to get a meaningful result straight away.

19/6 Nielsen, Claus Vinther (716)
Name clashes with researchers in other fields. The other researchers were in scientific fields that were markedly different.

19/6 Toft, Gunnar (717)
No problems with retrieval; no name clashes.

19/6 Hesse, Morten (718)
Many name clashes.
WoS: Restricting on "countries/territories", choosing "denmark", gave only articles by Morten Hesse as far as I could judge.
GS: Had to restrict the search to "hesse morten", as including "hesse m" gave over 1000 hits.

20/6 Ramlau-Hansen, Cecilia (719)
No name clashes or other problems.

20/6 Støvring, Henrik/Stovring, Henrik (720)
No name clashes or other noise.

20/6 Muth, Christiane (721) (impossible)
Has no publication list of her own as such; it was necessary to check how many of her institute's publications she (co-)authored.
In "muth christiane_721_mabr.pdf" I have highlighted the name "muth", as the list comprises 734 publications, of which she appears on only 53.
In WoS the search was restricted to the university she is affiliated with.
In GS it was impossible to get a usable result, as there were name clashes in both unrelated and related research areas.

20/6 Hougaard, Karen Sørig (722)
No name clashes or other noise.

20/6 Vehtari, Aki (723)
No name clashes or other noise.

20/6 Kabai, Péter (724)
No name clashes or other noise.

20/6 Bødker, Réne (725)
No name clashes or other noise.

20/6 Ansel, Pat (726) (impossible)
Name clash and research-area overlap: Ansell, Peter


20/6 Chin A Paw, Mai (727)
No name clashes or other noise. It was, however, necessary to search on both "chin a paw, m" and "chinapaw, m", as she appears under both names.
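Cases such as Chin A Paw/Chinapaw and Støvring/Stovring suggest listing the spelling variants of a surname before searching. The following is a minimal sketch, assuming only two kinds of variation: accented or special letters folded to plain ASCII, and internal spaces removed. The function names and the small replacement table are hypothetical, and German-style transliterations such as "schaefer" for Schäfer are deliberately not covered.

```python
import unicodedata

# Letters that Unicode NFKD normalisation does not split into base letter + accent.
SPECIAL = {"ø": "o", "Ø": "O", "æ": "ae", "Æ": "AE", "ß": "ss", "ł": "l", "Ł": "L"}


def ascii_fold(name):
    """Strip accents and replace a few special letters with ASCII equivalents."""
    replaced = "".join(SPECIAL.get(ch, ch) for ch in name)
    decomposed = unicodedata.normalize("NFKD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_variants(surname, initial):
    """Sorted 'surname, initial' search strings for the assumed spelling variants."""
    forms = {surname, ascii_fold(surname)}
    forms |= {form.replace(" ", "") for form in forms}
    return sorted("{}, {}".format(form.lower(), initial.lower()) for form in forms)


print(name_variants("Chin A Paw", "M"))
print(name_variants("Støvring", "H"))
```

Each returned string would still have to be searched and checked by hand; the sketch only helps to avoid overlooking a variant.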

20/6 De Bruyne, Martine (728) (impossible)

Professors

U Vogel 774 (impossible)
The publication list is a merger of two authors' lists.
GS: Restriction: "uf vogel" and "ub vogel" and "ur vogel"
Gave 243 results
WoS:
Au=(vogel u*) and (sh=(physical sciences or life sciences biomedicine) or wc=(multidisciplinary sciences))
Refined by: authors=( vogel u ) and organizations-enhanced=( university of wurzburg or natl reference ctr meningococci or hannover medical school ) and [excluding] publication years=( 1989 or 1990 )
Timespan=all years. Databases=sci-expanded, a&hci, ssci, cpci-ssh, cpci-s.

MJ Prince 775
GS
Filtered by looking through all of them. I removed everything to do with disability and Canada.


A Katalinic 776
WoS:
Au=("katalinic a") and (sh=(physical sciences or social sciences or life sciences biomedicine) or wc=(social sciences, interdisciplinary or multidisciplinary sciences))
Refined by: [excluding] web of science categories=( dentistry oral surgery medicine or food science technology or computer science artificial intelligence or telecommunications or computer science information systems )
Timespan=all years. Databases=sci-expanded, a&hci, ssci, cpci-ssh, cpci-s.

AD Grant 777 (impossible)
WoS:
846 results before refining with organizations-enhanced=( london school of hygiene tropical medicine ); 126 after. But I do not know whether she has worked elsewhere.
Author=(grant ad) or author=(grant a)
Refined by: [excluding] web of science categories=( physics particles fields or computer science theory methods or environmental sciences or engineering electrical electronic or astronomy astrophysics or food science technology or history or nuclear science technology or instruments instrumentation or telecommunications or marine freshwater biology or computer science information systems or agriculture dairy animal science or economics or education scientific disciplines or fisheries or engineering environmental or oceanography or business or meteorology atmospheric sciences or computer science interdisciplinary applications or veterinary sciences or imaging science photographic technology or dentistry oral surgery medicine or political science or substance abuse or zoology or engineering civil or literature british isles or sport sciences or linguistics or chemistry applied or management or language linguistics or materials science multidisciplinary ) and authors=( grant a or grant ad ) and organizations-enhanced=( london school of hygiene tropical medicine )
Timespan=1990-2013. Databases=sci-expanded, ssci, a&hci, cpci-s, cpci-ssh.


H Montgomery 784 (impossible)
GS
Problems with excluding the wrong author names:
"he montgomery" "jh montgomery" "hl montgomery" "gh montgomery" "hdb montgomery" "he montgomery-downs" "rh montgomery jr" "wh montgomery" "ah montgomer" "dh montgomery" "hj montgomery" "mh montgomer" "ah montgomery" "rh montgomery" "ch montgomery" "sh montgomery" "mh montgomery" "h montgomery-massingberd" "jl montgomery"
Some of these still ended up on the list.

FS Violante 786
His publication list contains only articles from 2004 to 2008. I included all other years as well, since it is unlikely that he became a professor within those years. I therefore searched all years in WoS and restricted according to the articles I obtained from PoP.
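For entries such as H Montgomery, where excluded name variants still appeared in the result list, one option is to filter the retrieved records afterwards instead of relying on the query alone. The sketch below assumes that each record carries an author string of the form "H Montgomery, A Example" (initials, then a single-word surname, comma-separated); that format, the field name `authors`, the function names, and the sample records are assumptions for illustration, not a documented export format.

```python
def has_target_author(authors, surname, allowed_initials):
    """True if one listed author is exactly '<allowed initials> <surname>'."""
    for author in authors.split(","):
        parts = author.strip().lower().split()
        if len(parts) == 2 and parts[1] == surname and parts[0] in allowed_initials:
            return True
    return False


def post_filter(records, surname, allowed_initials):
    """Keep only records that list the target author with an allowed set of initials."""
    return [r for r in records if has_target_author(r["authors"], surname, allowed_initials)]


# Invented sample records for illustration.
sample = [
    {"title": "A", "authors": "H Montgomery, A Example"},
    {"title": "B", "authors": "HE Montgomery"},
    {"title": "C", "authors": "B Example, H Montgomery-Massingberd"},
    {"title": "D", "authors": "RH Montgomery Jr"},
]
filtered = post_filter(sample, "montgomery", {"h"})
print([r["title"] for r in filtered])
```

This does not recover records lost to the 1000-result limit, and it would wrongly drop records where the target author appears with additional initials, so the output still needs a manual check.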

M Martinez 788 (impossible)


Appendix 3: Excerpts of CVs using bibliometrics

Excerpt 1: from Public Health


Excerpt 2: From Astrophysics
