
Empirical Realities of Scientific Misconduct in Publicly Funded Research

What can we learn from the ORI’s investigations of U.S. cases in the biomedical and behavioral sciences?

By

Andrea Pozzi* & Paul A. David**

*Stanford University: [email protected]

**University of Oxford and Stanford University: [email protected]

Presentation to the ESF-ORI First World Conference on Research Integrity: Fostering Responsible Research

An EU Portuguese Presidency and European Commission Event to be held at The Calouste Gulbenkian Foundation, Lisbon, Portugal, 16-19 September 2007

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor,

San Francisco, California, 94105, USA.

ABSTRACT

This paper presents the results of an exploratory statistical study of time-series and cross-section patterns in inquiries and findings of investigations of cases involving plagiarism, falsification, and fabrication by researchers in the biomedical and behavioral sciences.

Our analysis is based upon datasets constructed from information provided by the published reports of the Office of Research Integrity (ORI) in the U.S. Public Health Service regarding its activities during 1994-2006.

A descriptive approach is adopted, in view of the paucity of prior systematic empirical work and the limitations of the micro-level data available at this time. While no causal hypotheses are tested, some possible interpretations and implications of the findings point to the importance of more widespread, independent data collection and analysis as a foundation for public regulations and private initiatives that aim to address and control the phenomenon of scientific misconduct.

Acknowledgements: The authors are grateful to the Science and Society Programme of the European Commission (DG-Research), and Portugal’s Ministry of Science, Technology and Higher Education, for the support received for the research underlying this presentation and the writing of the paper upon which it draws. The results and views expressed here are those of the authors and do not reflect official positions of either of those organizations or of the ORI.

INTRODUCTION & ORGANIZATION

Purpose and Motivation

More comprehensive and systematic collection and analysis of quantitative information about the nature, incidence, and circumstances of research misconduct is needed from independent social science researchers, for numerous reasons, including:

• the central role of reliable scientific knowledge in the political, social, economic, and cultural affairs of modern societies
• growing public awareness of and concern regarding selectively reported instances of “scientific misconduct”
• regulatory and educational initiatives to address this “problem,” while acknowledging that its extent and causes in different contexts are not well understood
• the inherent difficulties of conducting scientific investigations of behaviors that one is also seeking to regulate

ORGANIZATION

Part 1. Evolution of ORI Case Management, 1994-2006: Aggregate Trends and Case Disposition Rates

Part 2. Dynamics of the Distribution of Investigations and ORI Findings of Misconduct, by Type

Part 3. The Costs of Misconduct Inquiries – a Partial Assessment derived from ORI Published Data

Part 4. Patterns in the Conditional Probabilities of Findings of Misconduct in Closed Investigations

Part 5. Problems of Interpretation, Conjectures and Conclusions


Part 1. Analysis of Aggregate ORI Case Management Flows and Case Disposition Rates during 1994-2006

Aggregate Flows in Allegations and Inquiries

Average Rates of Case Disposition for Inquiries and Investigations

Average Frequencies of Findings of Misconduct in Closed Investigations

Distributions of time-lags in “Misconduct Correction”

Statistical data for the study by Pozzi and David (2007) were extracted from the published Annual Reports of the U.S. Office of Research Integrity for the years 1994-2006.

ORI Annual Reports, then (1994) …and now (2006)

[Flow diagram: the total flow of allegations reaches the ORI and the research institutions; ORI and institutional inquiries lead to ORI and institutional investigations (with ORI oversight of both inquiries and investigations), ending in findings of misconduct, no findings of misconduct, dismissal, or referral to other agencies.]


ORI case processing: totals during 1994-2006

[Flow diagram with case totals: Allegations: 3,571; Allegations not pursued: 3,173; Inquiries: 398; Inquiries not pursued: 131; Investigations: 267; Findings of misconduct: 165; No findings of misconduct: 102.]

Source: Compiled from ORI Reports.

TESTS OF THE STABILITY OF MEAN ORI CASE DISPOSITION RATIOS: PRE- vs. POST-1999

                                               Mean (std. dev.) of annual ratios    P-value of difference
ORI case disposition ratio                     1994-1999          2000-2005         in mean ratios (a)

Ratios of the numbers below to the total number of allegations received by all PHS-related institutions:
  Allegations reported to ORI                  0.639 (.070)       0.622 (.083)      0.7037
  All cases opened by ORI                      0.136 (.025)       0.095 (.024)      0.0161
  All investigations opened by ORI             0.080 (.013)       0.062 (.014)      0.0501
  Investigations closed with ORI
    findings of misconduct                     0.054 (.013)       0.032 (.011)      0.0098

Sequential case disposition ratios:
  Allegations reported / Total allegations     0.639 (.070)       0.622 (.083)      0.7037
  All cases opened / Allegations reported      0.213 (.043)       0.154 (.039)      0.0322
  Investigations opened / All cases opened     0.596 (.090)       0.664 (.138)      0.3342
  Findings of misconduct / All investigations  0.704 (.202)       0.557 (.274)      0.3194

Notes and sources: See the sources of annual flows underlying Pozzi-David (2007): Chart 1.3. (a) P-value of a two-tailed t-test of the difference, based on pooled (unequal) sample variances. Data on the total number of allegations are drawn from the “Reports on potential misconduct” included in the Annual Reports with a one-year lag; therefore, even though the study relies on Annual Reports up to 2006, data on total allegations are available only through 2005.
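The stability comparisons above are two-sample t-tests on short annual series. As a minimal sketch of the mechanics (not the authors’ code; the ratio values below are hypothetical placeholders, and scipy’s equal_var=False option gives the Welch variant, which does not pool the variances):

```python
# Sketch: two-tailed t-test comparing mean annual case-disposition ratios,
# 1994-1999 vs. 2000-2005. The ratios below are hypothetical placeholders,
# NOT the ORI annual series underlying the table.
from scipy import stats

ratios_1994_1999 = [0.65, 0.55, 0.70, 0.62, 0.66, 0.60]  # six annual ratios (placeholder)
ratios_2000_2005 = [0.58, 0.70, 0.55, 0.68, 0.60, 0.62]  # six annual ratios (placeholder)

# equal_var=False selects Welch's test, which does not pool the two variances
t_stat, p_value = stats.ttest_ind(ratios_1994_1999, ratios_2000_2005, equal_var=False)
print(f"t = {t_stat:.3f}, two-tailed p = {p_value:.4f}")
```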

The ORI became (statistically) more selective post-1999 in opening inquiries, but the proportions of inquiries that led to investigations, and of investigations that ended in findings of misconduct, remained stable.

Volumes of total allegations received annually by the ORI show variations around a weak upward trend, while those of “closed investigations” show slightly lagging variations around a weak downward trend.

Source: Elaboration of ORI reports, 1994-2006. Note: The series are plotted in logs to make them comparable.

[Chart: log plot of the total number of allegations and of investigations closed, annually, 1994-2005.]


ORI decisions to open inquiries have declined as a proportion of the total number of allegations received, as the rate varies inversely with the volume of allegations; but there are substantial year-to-year variations.

Source: Elaboration of ORI reports, 1994-2006. Note: The black line represents the fitted values from the OLS regression: % of inquiries = 22.04 − 0.0362 × (total number of allegations), R² = 0.4601. The value for 1995 is the average of 1994 and 1995, to smooth any start-up variations in the operations of the ORI.

ORI findings of misconduct (all types) as a proportion of allegations received also tend to vary inversely with the volume of allegations, but the relative year-to-year variations in the proportion are smaller.

Source: Elaboration of ORI reports, 1994-2006. Note: The black line represents the fitted values from the OLS regression: % of inquiries = 22.04 − 0.0362 × (total number of allegations), R² = 0.4601. The value for 1995 is the average of 1994 and 1995, to smooth any initial-conditions problem.
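The fitted line in these charts is a simple bivariate OLS regression. A sketch of that specification in Python (the data are placeholders simulated around the reported coefficients, and statsmodels is an assumption, not the authors’ tool):

```python
# Sketch: OLS of the share of allegations that become inquiries on the total
# number of allegations, as in the chart note. Placeholder data, not ORI data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
allegations = np.array([220.0, 250, 260, 280, 290, 300, 310, 320, 330, 350, 360, 400])
# simulate % of inquiries around the reported line 22.04 - 0.0362 * allegations
pct_inquiries = 22.04 - 0.0362 * allegations + rng.normal(0, 1.5, allegations.size)

X = sm.add_constant(allegations)       # intercept plus slope on total allegations
fit = sm.OLS(pct_inquiries, X).fit()
print(fit.params, f"R^2 = {fit.rsquared:.4f}")   # near (22.04, -0.0362) by construction
```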

Closure rates for all ORI inquiries and investigations, and for investigations alone, are approximately the same (both averaged over successive pairs of years), with the relative volume of closed inquiries declining somewhat.

[Chart: ORI case management rate, all charges combined, 1994-2005; series: smoothed proportion of inquiries and investigations closed, and smoothed proportion of investigations closed.]

Source: Elaboration of ORI reports, 1994-2006. Note: Smoothed proportion of inquiries and investigations closed = total number of inquiries and investigations closed in a year over the sum of inquiries and investigations opened in the year and those carried to the next year. Smoothed proportion of investigations closed = total number of investigations closed in a year over the sum of investigations opened in the year and those carried to the next year. Both series are smoothed by a two-year moving average.
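A sketch of how the smoothed series defined in this note could be computed (the annual counts below are hypothetical placeholders):

```python
# Sketch: closure rate = closed / (opened + carried over), then a two-year
# moving average, per the note above. Counts are placeholders, not ORI data.
import pandas as pd

flows = pd.DataFrame(
    {"closed": [20, 25, 18, 22, 19, 24],    # investigations closed in the year
     "opened": [23, 24, 20, 21, 22, 25],    # investigations opened in the year
     "carried": [8, 7, 9, 8, 7, 6]},        # investigations carried to the next year
    index=range(1994, 2000),
)
rate = flows["closed"] / (flows["opened"] + flows["carried"])
print(rate.rolling(window=2).mean())        # two-year moving average
```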

Source: Elaboration of ORI reports, 1994-2006. Note: Investigations closed with findings = cumulative fraction of investigations closed with findings over the total of investigations opened and carried to the next year. Investigations opened = cumulative fraction of new investigations opened over the total of investigations opened and carried to the next year.

[Chart: ORI case management, evolution of investigation opening and closure, 1994-2006; series: investigations closed with findings, and investigations opened.]

The cumulative ratios of newly opened investigations to total open investigations, and the corresponding measure for investigations closed with findings of misconduct, exhibit great stability in ORI case management.


Part 2. Dynamics of the Distribution of ORI Investigations and Findings of Misconduct, by Type, 1994-2006

[Flow diagram: as in Part 1, but with investigations and findings of misconduct now broken down by charge type.]

Annual volumes of ORI closed investigations of falsifications and fabrications have declined since the 1990s, while those involving plagiarism remain few and show no trend.

Source: Elaboration of ORI reports, 1994-2006. Note: Number of cases investigated = number of investigations closed during the year.

[Chart: number of cases investigated, by type (plagiarism, falsification, fabrication), annually 1994-2005.]

The annual distributions of the three types of misconduct charges investigated by the ORI, and the corresponding distributions of misconduct findings, have remained quite stable and have involved relatively few plagiarism cases.

Source: Elaboration of ORI reports, 1994-2006.

[Charts: distribution of the number of investigations and the number of findings of misconduct, by type of charge (plagiarism, fabrication, falsification), as percentages, annually 1994-2006.]


Distribution of charges, by position, 1994-2006: investigations closed with findings of misconduct (other than plagiarism) are concentrated at the lower ranks of the research hierarchy.

                     Full   Assoc.  Asst.   Post-  Grad     Research  Undergrad  Research   Staff
                     prof.  prof.   prof.   doc    student  assistant student    scientist
Plagiarism           0.00   0.50    0.25    0.13   0.00     0.00      0.00       0.13       0.00
Falsification        0.08   0.08    0.10    0.19   0.19     0.00      0.00       0.15       0.21
Fabrication          0.07   0.00    0.00    0.07   0.33     0.11      0.04       0.15       0.22
Plagiarism and
  falsification      0.00   0.00    0.29    0.43   0.00     0.00      0.00       0.14       0.14
Falsification and
  fabrication        0.08   0.10    0.02    0.22   0.12     0.08      0.04       0.16       0.18
All three            0.00   0.00    0.00    0.00   0.00     0.00      0.00       1.00       0.00

Source: Elaboration of ORI annual reports.

Distribution of charges, by position, 1994-2006: investigations closed without findings of misconduct tend to be more concentrated in the upper ranks of the research hierarchy.

                     Full   Assoc.  Asst.   Post-  Grad     Research  Undergrad  Research   Staff
                     prof.  prof.   prof.   doc    student  assistant student    scientist
Plagiarism           0.00   0.33    0.17    0.00   0.33     0.00      0.00       0.17       0.00
Falsification        0.21   0.18    0.12    0.07   0.08     0.02      0.00       0.15       0.17
Fabrication          0.06   0.06    0.06    0.12   0.06     0.12      0.00       0.24       0.29
Plagiarism and
  falsification      0.29   0.29    0.14    0.00   0.00     0.14      0.00       0.14       0.00
Falsification and
  fabrication        0.30   0.13    0.09    0.04   0.04     0.04      0.00       0.26       0.09
All three            0.00   1.00    0.00    0.00   0.00     0.00      0.00       0.00       0.00

Source: Elaboration of ORI annual reports.

Trends in the mean lag in “corrections” of the public record of research -- a partial view from ORI misconduct findings

Is the publicly supported research system being overwhelmed by instances of scientific misconduct that go undetected and uncorrected?

One symptom of such a condition would be a tendency for the distribution of the time-lags between acts of misconduct, their detection, and their public correction to shift upwards and become more skewed.

More systematic data about this should be collected, based on published notices and/or retractions by scientific research journals.

But an analysis of the evolution of the distribution of the “correction lags” -- measured from ORI cases where misconduct was found -- can provide an approximate indicator of recent trends in the typical lags for biomedical and behavioral research publications.

The distribution of measured “correction lags” in cases where ORI found misconduct (all types combined) is concentrated at short lags with a long upper tail and a mode at 3 years, based on pooled observations for 1994-2006.

[Histogram: misconduct correction lag (in years), distribution for all charges combined, 1994-2006; lags range from under 1 year to 16 years.]


Distribution statistics for the measured “correction lags” (where ORI found misconduct) show the emergence of post-1990s stability around a constant mode, a lower mean, and a smaller relative variance, implying an approximate 3-year average “detection lag.”

Source: Elaboration of ORI reports, 1994-2006. Note: The “correction lag” is calculated only for those observations that involve publication of a paper related to the research under investigation. The date of publication of the paper is used as an estimate of the time at which the misconduct took place (as the act had already been committed prior to journal submission and, a fortiori, prior to publication). To obtain an approximation of the detection lag, we first find the mean correction lag by taking the “correction dates” as the minima of either the dates on which the investigation was closed (most cases are closed within the year they are opened) or of an antecedent voluntary retraction by the respondent; subtracting the journal publication date then gives the lag. Reducing the mean estimates of the correction lags by the typical one-year investigation duration yields an “efficient under-estimator” of the mean detection lag.

In this sample, the correction date coincides with the end of the investigation in 40 cases out of 65; we use the date of retraction statements in the other 25 cases.
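In code form, the measurement rule in this note amounts to the following (a sketch with hypothetical dates; the function name is ours):

```python
# Sketch of the "correction lag" rule: the correction date is the earlier of the
# investigation-closure date and any voluntary retraction; the lag is measured
# from the journal publication date. All dates here are hypothetical.
from datetime import date
from typing import Optional

def correction_lag_years(published: date, closed: date,
                         retracted: Optional[date] = None) -> float:
    """Years from publication to the earliest public 'correction' event."""
    correction = min(closed, retracted) if retracted else closed
    return (correction - published).days / 365.25

# a voluntary retraction (1999) precedes the closure of the investigation (2000)
print(f"{correction_lag_years(date(1996, 5, 1), date(2000, 3, 1), date(1999, 6, 1)):.1f} years")  # ~3.1
```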

Evolution of the average “misconduct correction lag” (in years), for all charges combined:

                 1994   1994-96  1994-98  1994-2000  1994-2002  1994-2004  1994-2006
Mean             5.60   4.82     4.04     3.75       3.35       3.33       3.65
Median           3.00   4.00     3.00     3.00       3.00       3.00       3.00
Std. dev.        6.11   4.35     4.01     3.68       3.24       3.06       3.32
Range            1-16   0-16     0-16     0-16       0-16       0-16       0-16
Std. dev./Mean   1.12   0.90     0.99     0.98       0.97       0.92       0.92

The distribution of measured “correction lags” in cases where ORI found falsification resembles that for all types of misconduct pooled, over the period 1994-2006.

[Histogram: misconduct correction lag (in years), distribution for all cases involving falsification, 1994-2006.]

The distribution statistics for the measured “correction lags” in cases where ORI found falsification broadly resemble those for all types of misconduct pooled, in successively longer sub-periods of 1994-2006.

Source: Elaboration of data from ORI reports, 1994-2006. Notes: as for the all-charges table above.

Evolution of the average “misconduct correction lag” (in years), for all cases involving falsification:

                 1994   1994-96  1994-98  1994-2000  1994-2002  1994-2004  1994-2006
Mean             5.60   5.00     4.26     4.10       3.52       3.54       3.91
Median           3.00   4.00     3.00     3.71       3.23       3.29       3.69
Std. dev.        6.11   4.42     4.05     3.71       3.29       3.33       3.74
Range            1-16   0-16     0-16     0-16       0-16       0-16       0-16
Std. dev./Mean   1.09   0.88     0.95     0.90       0.93       0.94       0.96

The distribution of measured “correction lags” in cases where ORI found misconduct (all types combined), after deletion of one case (with a 16-year lag, involving falsification) reported in 1994, retains its mode at 3 years, but with reduced skewness and variance.

[Histogram: misconduct correction lag (in years), trimmed distribution for all charges combined, 1994-2006; lags range from under 1 year to 13 years.]


Eliminating the 1994 “outlier” observation (one 16-year lag) reveals the essential time-stationarity of the distribution of the “correction lag” for all types of misconduct cases over the period 1994-2006

Source: Elaboration of ORI reports, 1994-2006. Notes: as for the untrimmed all-charges table above.

Evolution of the average “trimmed misconduct correction lag” (in years), for all charges combined:

                 1994   1994-96  1994-98  1994-2000  1994-2002  1994-2004  1994-2006
Mean             3.00   4.13     3.56     3.40       3.14       3.12       3.46
Median           2.50   3.50     3.00     3.00       3.00       3.00       3.00
Std. dev.        2.16   3.36     3.25     3.07       2.74       2.60       2.96
Range            1-6    0-12     0-12     0-12       0-12       0-12       0-13
Std. dev./Mean   0.72   0.82     0.91     0.90       0.87       0.84       0.86

Measuring the “Misconduct Correction Lags” from ORI Cases

Averaging the gap between the date of a journal publication and the date of the misconduct finding (or a prior retraction) yields the mean “correction lag”; subtracting the mean duration of inquiries & investigations yields an efficient lower-bound estimate of the mean detection lag of c. 3 years.

[Timeline diagrams for the two cases. Case 1 (no retraction issued): the measured “correction lag” runs from the journal publication, through the allegation and the inquiry & investigation, to the misconduct finding. Case 2 (retraction issued): it runs from the journal publication to the voluntary retraction. In both cases the actual “misconduct detection lag” runs from the act of misconduct to the allegation.]

Note: The “detection lag” (between the act of misconduct and the reported allegation) can be shorter or longer than the lag between journal publication and the opening of an inquiry (resulting in an investigation closed with misconduct findings). The measured “correction lag,” however, overstates the lag between a post-publication allegation and the opened inquiry, as it includes the period of the inquiry and investigation, which is put at roughly 0.65 years on average (see Pozzi-David, 2007: Addendum on Inquiry Durations).


Part 3. The Costs of Misconduct Inquiries – a Partial Assessment derived from ORI Published Data

In view of the need to involve scientific personnel in these proceedings, what does the experience of the ORI-supervised process tell us about the costs of the investigation and adjudication of cases arising from allegations of scientific misconduct, in toto and in terms of their outcomes?

Assumption 1: A representative panel member is convened, on average, 6 hours per month in every month elapsed between the opening of an inquiry that continues as an investigation and the investigation’s close, i.e., while the panel is dealing with the case.

Investigations closed with findings: evidence from Rhoades (2004)
Number of investigations closed with findings in 1994-2003 = 165
Average length of “inquiry + investigation” = 236 days = 7.9 months
Average size of the panel = 3.65 panelists

Hence: 7.9 months × 6 hours/month × 3.65 panelists per case × 165 cases = 28,472 hours of panelists’ time

Investigations closed without findings: evidence from Rhoades (2004)
Number of investigations closed without findings in 1994-2003 = 102
Average length of “inquiry + investigation” = 259 days = 8.6 months
Average size of the panel = 3.78 panelists

Hence: 8.6 months × 6 hours/month × 3.78 panelists per case × 102 cases = 19,972 hours of panelists’ time

Hours of panelists’ time for 267 closed investigations, 1994-2003 = 28,472 + 19,972 = 48,444

Reckoning the cost of scientific misconduct investigations – Step One: estimating the time spent by institutional panel members


Assumption 2: An academic working year consists of 9 months, or 184 eight-hour work days (≈ 1,470 working hours)

Inquiries not leading to investigations: evidence from Rhoades (2004)
Number of inquiries not leading to investigations in 1994-2003 = 70
Average length of inquiry = 59.93 days = 2 months
Average size of the panel = 2.96 panelists

Hence: 2 months × 6 hours/month × 2.96 panelists per case × 70 cases = 2,483 hours of panelists’ time

Salaried academic work-years used in inquiries not leading to investigations = 2,483 / 1,470 = 1.7 years

Salaried academic work-years used in all 267 closed investigations = 48,444 / 1,470 = 32.95 years

The estimated total number of salaried academic work-years used in panel meetings for all 337 cases during 1994-2003 was 32.95 + 1.7 = 34.7 years.

Question: Is that a big or a small cost?

Reckoning the cost of scientific misconduct investigations – Step One: estimating the time spent by institutional panel members (continued)
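The arithmetic of Step One reduces to a few lines. A sketch using the slides’ own figures (totals differ slightly from the slides’ because the day-to-month conversions here are rounded differently):

```python
# Sketch of the Step One reckoning: panel-hours = months * 6 h/month * panel
# size * number of cases, converted to academic work-years at ~1,470 h/year.
HOURS_PER_MONTH = 6             # Assumption 1
HOURS_PER_ACADEMIC_YEAR = 1470  # Assumption 2: 9 months of eight-hour days

def panel_hours(days: float, panel_size: float, n_cases: int) -> float:
    months = days / 30          # the slides quote durations in days and months
    return months * HOURS_PER_MONTH * panel_size * n_cases

hours = (panel_hours(236, 3.65, 165)      # closed with findings (slides: 28,472 h)
         + panel_hours(259, 3.78, 102)    # closed without findings (slides: 19,972 h)
         + panel_hours(59.93, 2.96, 70))  # inquiries not pursued (slides: 2,483 h)
print(f"{hours:,.0f} h = {hours / HOURS_PER_ACADEMIC_YEAR:.1f} work-years")  # ~34.6 (slides: 34.7)
```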

Empirical findings from the published ORI data:
(a) 337 cases during 1994-2003 were opened for ORI inquiries and terminated in 267 closed investigations.
(b) 34.7 “salaried academic work-years” of panelists were required to deal with the 337 inquiries.
(c) Mean probability that the allegations in ORI-supervised investigations will be substantiated: 0.49.
(d) There were 2.76 articles on average in cases where misconduct was found to have involved preceding journal publications, and such cases represented 19.1% of all cases.

Illustrative supposition: The 9-month academic-year salary for senior scientists is (a modest) $120,000.

Implications:
(1) Approximately 1 academic year of institutional panelists’ time is required to handle 10 cases.
(2) Correcting the false research records in the roughly 5 of those 10 cases in which misconduct was found would therefore have an “opportunity cost” (of panelists’ time) averaging $25,000 per case investigated.

A heuristic estimate of the cost of the scientific resources consumed in scientific misconduct investigations – Step Two: Elaborating the estimates of panelists’ time inputs
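A sketch of the Step Two valuation, reproducing the $25,000 headline figure from the quantities already given (the 0.49 substantiation rate and the $120,000 salary supposition):

```python
# Sketch: value 34.7 work-years of panel time at a $120,000 nine-month salary,
# spread over 337 cases, then over the ~49% of cases that are substantiated.
WORK_YEARS, CASES = 34.7, 337
SALARY_9_MONTHS = 120_000           # illustrative supposition from the slides
P_SUBSTANTIATED = 0.49              # mean probability of a misconduct finding

total_cost = WORK_YEARS * SALARY_9_MONTHS    # ~$4.16M of panelists' time
per_case = total_cost / CASES                # ~$12,400 per case handled
per_finding = per_case / P_SUBSTANTIATED     # ~$25,200 per substantiated case
print(f"${per_case:,.0f} per case; ${per_finding:,.0f} per substantiated case")
```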

$25,000 is a lower bound on the average per-case time-cost of the investigative process: it counts only the panelists’ time in panel meetings – not preparation time, nor the time of institutional administrators, the witnesses, and the respondents – for half of whom no allegations were sustained.

Is $25,000 per case a big number? It approximates the modal incremental cost for a U.S. university to file a patent on a faculty invention arising from federally funded research – many of which do not yield any licensing revenues – so it is appreciable but does not seem unaffordable.

Yet, for the cases where misconduct was found to affect prior publications and retraction notices were requested, the per-case average is $104,000 – or $65,750, if one counts the additional cases in which voluntary retractions were forthcoming during the investigation. That works out to between $1,019 and $644 per article “flagged” by retraction or correction notices, if the journals acted on all such requests.

But such “flags” cannot be “purchased” one at a time: the process is “lumpy” and imposes indivisible resource commitments. Setting up and maintaining a credible investigative and adjudication process (such as that supervised by the ORI, or the NSF Inspector General’s Office) would entail a big continuing commitment of available scientific research resources for all except the largest among the high-income countries.

A heuristic estimate of the cost of the scientific resources consumed in scientific misconduct investigations – Step Three: Assessing the meaning of the results

…and then there may be additional costs from unanticipated “perverse” system effects:

1) Expanding formal investigations as a “deterrent” may disrupt research.

2) Encouragement and protections for “whistle-blowing” may induce greater resource burdens in investigating allegations that cannot be substantiated.

3) Informal social control may be weakened by external formal procedures.


Part 4. Analysis of Effects on Conditional Probabilities of Findings of Misconduct in Closed ORI Investigations, 1994-2006

Remark: The previously demonstrated stability of ORI case management processes over time, as well as that of the distribution of cases investigated, justifies pooling all the available data for the “closed investigations” from the entire period 1994-2006, in order to perform the statistical analysis reported in this Part with the largest possible number of observations on individual cases and their characteristics.

[Flow diagram: the disposition of an individual allegation, through ORI or institutional inquiry and ORI or institutional investigation (with ORI oversight of both), ending in a finding of misconduct, no finding of misconduct, dismissal, or referral to other agencies. Data are observable for cases, by type of allegation, from ORI published reports.]

Frequencies and rank ordering of grantor institutes by frequency of closed investigations show no significant differences between sub-periods of 1994-2006.


For purposes of this analysis, the individual respondents in the ORI’s closed investigations were grouped into major categories defined according to their academic and non-academic employment “positions.”

Mean relative frequencies of misconduct findings in closed investigations are indicated by the slope of rays from the origin of the graph, and reveal marked differences between two sets of “respondent-positions.”

[Chart: rays from the origin at slopes 0.67, 0.50, and 0.33.]

The pattern in the relative frequencies of misconduct findings among cases of falsification resembles that for all misconduct cases combined.

A still more striking separation between the mean relative frequencies of findings of misconduct appears among the different respondent-position groups in cases involving plagiarism.


But the separation in the mean frequencies of findings of misconduct is less pronounced among the various respondent-position groups in cases involving plagiarism.

Grouping all ORI investigations of misconduct by the identities of the grantor institutes reveals some differences in the relative frequency of misconduct findings between two groups of National Institutes.

Probit regression analysis of the marginal effects of respondent’s “position” reveals statistically higher probabilities of misconduct findings for post-docs, grad students, and staff, each compared with full professors, in falsification cases, and for grad students vs. full professors in fabrication cases.

[Chart: levels of predicted conditional probabilities of misconduct findings, showing effects of respondent position and involvement of prior journal publications, for all closed ORI investigations, 1994-2006.]
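A minimal sketch of the probit specification described here (simulated placeholder data with a probit data-generating process; the variable names and coefficients are ours, not estimates from the ORI case file):

```python
# Sketch: probit of a misconduct-finding indicator on position dummies (full
# professor omitted) and a prior-publication control, with marginal effects.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 300
postdoc = rng.integers(0, 2, n)
grad = (1 - postdoc) * rng.integers(0, 2, n)     # mutually exclusive with postdoc
prior_pub = rng.integers(0, 2, n)                # prior journal publication involved

latent = -0.3 + 0.6 * postdoc + 0.8 * grad - 0.4 * prior_pub   # illustrative coefficients
finding = (rng.standard_normal(n) < latent).astype(int)        # probit link by construction

X = sm.add_constant(np.column_stack([postdoc, grad, prior_pub]))
fit = sm.Probit(finding, X).fit(disp=False)
print(fit.get_margeff().summary())               # marginal effects on P(finding)
```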


Estimates of the differentially lower probabilities of findings of falsification for full professors, compared with post-docs, grad students, and staff, are bigger and significant when controls are introduced for grantor institutes, multiple charges, and involvement of prior journal publications.

[Chart: levels of predicted conditional probabilities of a falsification finding, showing effects of respondent position, involvement of prior journal publications, and selected grantor institutes, for closed ORI investigations of falsification cases during 1998-2006.]

ORI investigations closed with findings of misconduct, for each type: by respondent’s position and gender, for sub-periods 1994-1999 and 2000-2006

                                      1994-1999                      2000-2006
                              Falsif.   Fabric.  Plagiar.    Falsif.   Fabric.  Plagiar.
Professor (full)      Total      6         3        0           6         4        1
                      (M,F)    (6,0)     (2,1)    (0,0)       (6,0)     (4,0)    (1,0)
Associate professor   Total      4         3        3           5         2        0
                      (M,F)    (4,0)     (3,0)    (3,0)       (5,0)     (2,0)    (0,0)
Assistant professor   Total      2         1        1           3         1        3
                      (M,F)    (1,1)     (1,0)    (1,0)       (3,0)     (1,0)    (3,0)
Post-doc              Total      5         5        1          15        10        2
                      (M,F)    (3,2)     (4,1)    (1,0)      (10,5)     (7,3)    (2,0)
Graduate student      Total      6         7        0           7         5        0
                      (M,F)    (4,2)     (6,1)    (0,0)       (4,3)     (2,3)    (0,0)
Research assistant    Total      2         2        0           5         5        0
                      (M,F)    (1,1)     (1,1)    (0,0)       (4,1)     (3,2)    (0,0)
Undergraduate student Total      0         0        0           0         1        0
                      (M,F)    (0,0)     (0,0)    (0,0)       (0,0)     (1,0)    (0,0)
Research scientist    Total     17        11        2           5         6        2
                      (M,F)   (13,4)     (5,6)    (2,0)       (1,4)     (1,5)    (2,0)
Staff                 Total     17        10        0           7         7        1

Source: Tabulation of information extracted from case summaries in ORI Annual Reports.

Frequency and mean probability of ORI investigations’ substantiation of allegations of scientific misconduct during 1994-2003, conditional on the research position of the allegation’s source (i.e., the “whistleblower’s” position)

Institutional position of the “whistleblower”    Number whose        Total number of   Mean probability    Std. dev. of the
in the case that was investigated                allegations were    whistleblowers    of substantiation   probability of
                                                 substantiated                                             substantiation
All cases (unconditional on position)            108                 216               0.500               0.497
Faculty appointment: all                          85                 165               0.515               0.500
  Tenured faculty ranks                           73                 137               0.533               0.499
Non-faculty researcher: all                       23                  51               0.451               0.498
  Research Assoc./Assist. & Students              14                  23               0.609               0.488
  Post-doctoral Fellows & Technicians              9                  28               0.321               0.467

Source: Elaboration of Rhoades (2004), Table 35. Note: A two-tailed t-test of the difference, based on pooled (unequal) sample variances, was performed to test the significance of the difference in means between the groups “Faculty appointment: all” and “Non-faculty researcher: all”; the difference in means is not statistically significant (P-

Tenured academics are almost three times as frequent as non-academics among all “whistle-blowers,” but not significantly different from any of the other groups in the mean unconditional probability of having their investigated allegations substantiated.


Source: ORI Annual Report for 2006, p. 30.

Remark: More detailed studies and statistical analysis of circumstances in cases of research misconduct by postdocs and graduate student research assistants are needed. The forthcoming ORI staff report focused on this topic is a welcome initiative.

Part 5. Problems of Interpretation, Conjectures and Conclusions

Note: This Part will be elaborated in the conference presentation

A priori theorizing has obvious limits in interpreting the data, where one can only observe detected cases of misconduct

Modern economists – starting with Gary Becker (1968) – have developed formal “rational actor” models of crime and punishment, in which the expected benefits of undetected deviant acts are weighed against the expected losses incurred when such behavior is detected.

Recently, the issue of scientific misconduct has been attracting similar and increasingly sophisticated game-theoretic analysis, e.g.:

But, as always, the predicted “equilibrium behavior” in these models depends strongly on what is assumed about the players’ knowledge and motives, including, in the case of researchers, the level of effective effort they devote to examining each other’s working papers and publications, and mutual expectations about the likelihood of detection.
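As a minimal formalization of this Becker-style calculus (the notation is ours, purely illustrative): let p be the researcher’s subjective probability that a deviant act is detected, G the expected career gain if it goes undetected, and L the expected loss (sanctions and forfeited reputation) if it is detected. A “rational actor” commits the act only if

    (1 − p)·G − p·L > 0,  i.e.,  G/L > p/(1 − p),

so the predicted behavior hinges entirely on the assumed p, G, and L; this is why the two opposing “star” arguments sketched below can both be derived from the same framework.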

A priori theorizing has obvious limits in interpreting the data, where one can only observe detected cases of misconduct – 2

As a simple illustration of theorizing in rational actor models, suppose:

• the career rewards of recognized priority in a radical research result are greater than those for an incremental result;
• a researcher with a strong reputation (a “star”) has more to lose than others, were they to engage in acts of misconduct that were detected – this depends on assumptions about the individual’s expected remaining “career life” and time-discount rate;
• a research “star” has less to lose than those with less peer-esteem if the current research project fails to obtain a radical result.

Then one could argue that:

• since “stars” would correctly expect that their claims of a radical result would be accepted on trust, they would undertake riskier projects, and the incidence of “star misconduct” would be comparatively understated by the statistics of detected cases;

OR, ALTERNATIVELY,

• since “stars’” projects were expected to aim at radical results, they would correctly expect to receive closer scrutiny, and would worry more about the greater penalties of detection than about those of failure, so that the incidence of non-star misconduct would be comparatively understated.


Reference for the Pozzi-David (2007) presentation:

http://siepr.stanford.edu/papers/pdf/Pozzi-David_Presentation.pdf

A preliminary draft of the background paper can be found in the chronological list of SIEPR/KNIIP 2006-07 Working Papers: Andrea Pozzi and Paul A. David, “Empirical Realities of Scientific Misconduct in Publicly Funded Research: What can be learned from the data published by the U.S. Public Health Service’s Office of Scientific Integrity?”

Discussion paper prepared for presentation at the ESF-ORI First World Conference on Research Integrity – Fostering Responsible Research, to be held at the Gulbenkian Foundation, Lisbon, Portugal, on 16-19 September 2007. Preliminary text and presentation slides (not for quotation or distribution without permission of the authors):

TEXT: http://siepr.stanford.edu/papers/pdf/Pozzi-David_Text.pdf
SLIDES: http://siepr.stanford.edu/papers/pdf/Pozzi-David_Presentation.pdf