Ilya Ponomarev¹, Pawel Sulima¹, Jodi Basner¹, Unni Jensen¹, Joshua Schnell¹, Karen Jo², and Nicole Moore²
A New Approach for Automated Author Discipline Categorization and Evaluation of Cross-Disciplinary Collaborations for Grant Programs
¹ Custom Analytics, Rockville, MD
² National Cancer Institute, Bethesda, MD
10/16/2013 5:30 PM
Why Cross-disciplinary Research?
“Interdisciplinary research can be one of the most productive and inspiring of human pursuits”
Facilitating Interdisciplinary Research, National Academy of Sciences, 2005
• Innovation increasingly occurs at the boundaries of disciplines
• Complex “puzzles” require diverse backgrounds
• The data avalanche from multiple sources requires fusion of information
• Convergent technologies require integration across disciplines
How to Measure Success of Cross-disciplinary Program?
THIS TALK:
1. To measure cross-disciplinarity, define disciplines as accurately as possible
2. A general approach for automatically assigning grant-specific categories to papers and people
3. Application to NCI PS-OC grant program classification
See also J. Basner, “Evaluating Collaboration and Outcomes of Health Research” Friday, 10/18/2013, 11:00am at Gunston East Rm
NCI Physical Sciences-Oncology Centers
12 centers, 250 Researchers
09/2009-Current
Institute Facilitate Generate
Evaluation: Bird's-Eye View
1. Use publications as a proxy for outcome
2006-2008: 3,367 pubs
2009-2012: 601 reported pubs
2. Compare the baseline data set (2006-2008) with the ongoing research data set (2009-2012)
Web of Science + Medline
166 active PS-OC investigators; 202,000 references; 4,199 journal titles
Productivity, impact, collaboration, field convergence
J. Basner, Friday, 10/18/2013
Evaluation: Bird's-Eye View
Approach:
PS-OC 2/3 broad categories: Oncology, Physical Sciences, Life Sciences
PS-OC 3 broad categories: Oncology, Physical Sciences, Life Sciences
266 Web of Science Journal Subject Categories
• Has an Oncology SC
• Multiple SCs per journal (up to 7)
• Multidisciplinary SC (meaningless, but includes “Science” and “Nature”)
• Some SCs are already interdisciplinary
• LS dominates after aggregation
22 ESI Subject Categories
• One SC per journal
• Does not have Oncology
• A Multidisciplinary SC also exists
• Clinical medicine?
• LS dominates after aggregation
Mapping. Challenges
Approach:
1. Intermediate map onto 6 extended broad categories
2. Paper-level SC assignment based on references
PS-OC 3 broad categories: Oncology, Physical Sciences, Life Sciences
Web of Science 266 Journal SCs:
• Has an Oncology SC
• Multiple SCs per journal
• Multidisciplinary SC
• Some SCs are interdisciplinary
• LS dominates after aggregation
Web of Science 22 broad ESI categories:
• One SC per journal
• Does not have Oncology
• A Multidisciplinary SC also exists
• Clinical medicine?
• LS dominates after aggregation
Step 1. Introduce 6 Intermediate PS-OC Categories for Better Selection:
PS – Physical Sciences
LS – Life Sciences
OC – Oncology
MED – Medicine
OTH – Others
MULT – Multidisciplinary
(very often MED journals are closer to OC than to LS)
Will be dropped at the final stage
Step 2. Map 265 WoS JSC to 6 PS-OC Categories:
Examples:
a) Obvious: Acoustics → PS; Chemistry, Analytical → PS; Oncology → OC; Management → OTH
b) Dominant: Biophysics → PS
c) Dominant: Physics, Multidisciplinary → PS
d) Meaningless: Multidisciplinary → MULT (usually published in “Nature”, “Science”, or “PNAS”)
Meaningless in terms of PS-OC category assignment: an article published in a MULT journal can be about PS, LS, or OC. Usually it is not an interdisciplinary article. The article's research field needs additional reclassification based on its references.
Step 3. Assign PS-OC Category Weights to Each Journal
(Journals in WoS can have 1, 2, 3, or even up to 7 SCs)
Example:
Journal “Radiation Research” has 3 SCs:
Biology → LS
Biophysics → PS
Radiology, NM → PS
Map → select distinct PS-OC categories (LS, PS) → count total (denominator = 2) → weights:
LS=1/2, PS=1/2, OC=0, MED=0, MULT=0, OTH=0
Each journal should be counted equally
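The journal-weight step can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `SC_TO_PSOC` and `journal_weights` are hypothetical, and the map holds only the subject categories named on these slides, not the full WoS list.

```python
PSOC_CATEGORIES = ("LS", "PS", "OC", "MED", "MULT", "OTH")

# Hypothetical excerpt of the SC -> PS-OC map (Step 2); only entries
# that appear as examples on the slides are included.
SC_TO_PSOC = {
    "Acoustics": "PS",
    "Chemistry, Analytical": "PS",
    "Oncology": "OC",
    "Management": "OTH",
    "Biophysics": "PS",
    "Physics, Multidisciplinary": "PS",
    "Multidisciplinary": "MULT",
    "Biology": "LS",
    "Radiology, NM": "PS",
}


def journal_weights(subject_categories):
    """Split a total weight of 1 equally among the distinct PS-OC
    categories that the journal's subject categories map to."""
    if not subject_categories:
        raise ValueError("a journal needs at least one subject category")
    distinct = {SC_TO_PSOC[sc] for sc in subject_categories}
    share = 1.0 / len(distinct)
    return {cat: (share if cat in distinct else 0.0) for cat in PSOC_CATEGORIES}


# "Radiation Research": three SCs, but only two distinct PS-OC categories.
weights = journal_weights(["Biology", "Biophysics", "Radiology, NM"])
print(weights)
```

Because weights are computed over distinct categories, a journal with two PS-mapped SCs and one LS-mapped SC still splits evenly between PS and LS, and every journal's weights sum to 1.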
Step 4. Calculate combined J-R weights for publications:
Example:
Coffey D., Getzenberg R., JAMA, 2006: 1 journal category (MED=1), 26 refs
Journal weights: LS=0, PS=0, MED=1, OC=0, MULT=0, OTH=0
Average reference weights: LS=0.23, PS=0.04, MED=0.17, OC=0.36, MULT=0.19, OTH=0
J-R weights = ½ (Journal + Refs): LS=0.12, PS=0.019, MED=0.58, OC=0.18, MULT=0.1, OTH=0
Better assignment of a paper's field, based on what the paper cites
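A minimal sketch of the J-R combination, assuming each reference has already been given journal weights in the same six categories. The function names are hypothetical. With the rounded inputs shown on the slide, the results agree with the slide's values to within rounding.

```python
CATEGORIES = ("LS", "PS", "OC", "MED", "MULT", "OTH")


def average_weights(weight_dicts):
    """Average a list of category-weight dicts (e.g. one per reference)."""
    if not weight_dicts:
        raise ValueError("need at least one weight dict")
    n = len(weight_dicts)
    return {cat: sum(w.get(cat, 0.0) for w in weight_dicts) / n
            for cat in CATEGORIES}


def combine_journal_and_refs(journal, refs_avg):
    """J-R weights: half the journal weights plus half the averaged
    reference weights."""
    return {cat: 0.5 * (journal.get(cat, 0.0) + refs_avg.get(cat, 0.0))
            for cat in CATEGORIES}


# Values taken from the JAMA example on the slide.
journal = {"MED": 1.0}
refs_avg = {"LS": 0.23, "PS": 0.04, "MED": 0.17, "OC": 0.36, "MULT": 0.19}

jr = combine_journal_and_refs(journal, refs_avg)
for cat in CATEGORIES:
    print(cat, round(jr[cat], 3))
```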
Step 5. Collect all publications for each investigator, calculate average weights, and rank PS-OC categories:
Example:
David A, 8 pubs
Averaged J-R weights: LS=0.32, PS=0.04, MED=0.23, OC=0.41, OTH=0.01
Ranks: LS=2, PS=4, MED=3, OC=1, OTH=5
Person inter-disciplinarity: 3
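The per-investigator averaging and ranking could look roughly like this. The helper names are hypothetical, and the sketch covers only the averaging and ranking; how the person inter-disciplinarity value is derived is not specified on the slide and is not modeled here.

```python
def average_weights(publication_weights):
    """Average J-R weights over all of an investigator's publications."""
    if not publication_weights:
        raise ValueError("investigator has no publications")
    categories = sorted({cat for w in publication_weights for cat in w})
    n = len(publication_weights)
    return {cat: sum(w.get(cat, 0.0) for w in publication_weights) / n
            for cat in categories}


def rank_categories(weights):
    """Rank categories by descending weight (1 = largest weight)."""
    ordered = sorted(weights, key=lambda cat: weights[cat], reverse=True)
    return {cat: position for position, cat in enumerate(ordered, start=1)}


# Averaged J-R weights from the slide's example investigator.
averaged = {"LS": 0.32, "PS": 0.04, "MED": 0.23, "OC": 0.41, "OTH": 0.01}
ranks = rank_categories(averaged)
print(ranks)
```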
Step 6. Redistribute MED and OTH weights between OC,LS, and PS
Before: LS=0.32, PS=0.04, MED=0.23, OC=0.41, OTH=0.01
After: LS=0.4, PS=0.05, OC=0.55
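The slides do not state the redistribution rule. As one possible reading, the sketch below assumes simple proportional renormalization over the three kept categories; the function name `redistribute` is hypothetical. For the example above this gives roughly LS=0.42, PS=0.05, OC=0.53, which is close to but not identical to the slide's values, so the rule actually used may differ (for instance, it may send more of the MED weight to OC).

```python
def redistribute(weights, keep=("LS", "PS", "OC")):
    """Drop all categories not in `keep` and rescale the kept weights
    so they sum to 1 (ASSUMED rule: proportional renormalization)."""
    kept_total = sum(weights.get(cat, 0.0) for cat in keep)
    if kept_total == 0:
        raise ValueError("no weight in the kept categories")
    return {cat: weights.get(cat, 0.0) / kept_total for cat in keep}


before = {"LS": 0.32, "PS": 0.04, "MED": 0.23, "OC": 0.41, "OTH": 0.01}
after = redistribute(before)
for cat, value in after.items():
    print(cat, round(value, 2))
```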
Validation
At the beginning of the program, investigators self-nominated as oncologists or physicists
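A precision-recall comparison against the self-nominations might be sketched as follows. Everything here is illustrative: the investigator IDs and labels are made up, the function name is hypothetical, and the automated label is assumed to be each investigator's top-weighted category.

```python
def precision_recall(self_nominated, assigned, label):
    """Precision and recall of the automated `assigned` labels for one
    discipline, treating self-nominations as ground truth."""
    true_pos = sum(1 for person, cat in assigned.items()
                   if cat == label and self_nominated.get(person) == label)
    predicted = sum(1 for cat in assigned.values() if cat == label)
    actual = sum(1 for cat in self_nominated.values() if cat == label)
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical toy data, NOT results from the PS-OC evaluation.
self_nominated = {"A": "OC", "B": "OC", "C": "PS", "D": "PS"}
assigned = {"A": "OC", "B": "PS", "C": "PS", "D": "PS"}

oc_precision, oc_recall = precision_recall(self_nominated, assigned, "OC")
ps_precision, ps_recall = precision_recall(self_nominated, assigned, "PS")
print(oc_precision, oc_recall, ps_precision, ps_recall)
```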
Future Development
[Figure legend: Physical Scientist; Oncologist; Life Scientist; PS-OC Network Investigators; Outside Network Co-authors]
Conclusions
• Automated approach for decomposing scientific publications into grant-specific discipline categories
• Multi-step method with intermediate mapping
• Weighted SC assignment based on the SCs of an article and of its references
• Precision-recall validation based on investigators' self-categorizations
• Oncologists within the NCI's PS-OC program are publishing more physical sciences research, and physical scientists are publishing more oncology or life sciences research, during their years of program participation.