The Distribution of References in Scientific Papers: an Analysis of the IMRaD Structure - ISSI 2013

Post on 07-May-2015

The Distribution of References in Scientific Papers:

an Analysis of the IMRaD Structure

ISSI 2013

Vienna, 16 July 2013

Marc Bertin, Iana Atanassova, Vincent Lariviere, Yves Gingras

Problem

Scientific papers usually follow a specific rhetorical structure: the IMRaD structure (Introduction, Methods, Results, and Discussion).

Questions:

• What relationships exist between cited references and the structure of the text?

• How does the IMRaD structure affect the distribution of references in scientific papers?

Method

• Corpus: 7 peer-reviewed academic journals:

• PLoS series (ONE, Biology, Computational Biology, Genetics, Medicine, Neglected Tropical Diseases, Pathogens)

• XML using Journal Article Tag Suite (JATS)

• More than 47,000 scientific articles

• Identify the section structure of the articles

• Identify cited references in the text

• Study the distribution of references according to the text progression and structure.
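The last step can be illustrated with a minimal sketch. It assumes each article has already been reduced to a list of per-sentence reference counts, and it divides the text into equal-width slices of its progression. The function names and the number of bins are hypothetical; the slides do not state how the distribution was actually computed.

```python
from typing import List


def progression_histogram(counts_per_sentence: List[int], bins: int = 10) -> List[float]:
    """Share of one article's references falling in each equal-width slice of the text."""
    n = len(counts_per_sentence)
    total = sum(counts_per_sentence)
    hist = [0.0] * bins
    if n == 0 or total == 0:
        return hist
    for i, count in enumerate(counts_per_sentence):
        # Sentence i of n is assigned to a bin by its relative position in the text.
        b = min(i * bins // n, bins - 1)
        hist[b] += count
    return [h / total for h in hist]


def mean_histogram(histograms: List[List[float]]) -> List[float]:
    """Bin-wise average over articles, so long and short articles weigh the same."""
    if not histograms:
        return []
    k = len(histograms[0])
    return [sum(h[j] for h in histograms) / len(histograms) for j in range(k)]
```

Normalizing by sentence position rather than by absolute sentence index is what makes articles of different lengths comparable in this sketch.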

Sections Identification

• Section titles can vary according to the article.

• e.g. "Method", "Methods", "Method and Model"

• Section titles were analyzed in order to match each section with one of the section types in the IMRaD structure.
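A minimal sketch of such a title-matching step is given below. The keyword patterns and the function name classify_section_title are assumptions made for illustration; the slides do not list the rules that were actually applied.

```python
import re
from typing import Optional

# Hypothetical keyword patterns, checked in order; not the authors' actual rules.
IMRAD_PATTERNS = [
    ("Introduction", re.compile(r"\bintroduction\b", re.I)),
    ("Methods", re.compile(r"\b(methods?|materials?)\b", re.I)),
    ("Results", re.compile(r"\bresults?\b", re.I)),
    ("Discussion", re.compile(r"\bdiscussion\b", re.I)),
]


def classify_section_title(title: str) -> Optional[str]:
    """Return the first IMRaD section type whose pattern matches, or None."""
    for label, pattern in IMRAD_PATTERNS:
        if pattern.search(title):
            return label
    return None
```

With these patterns, "Method", "Methods" and "Method and Model" all map to the same type. A combined title such as "Results and Discussion" is assigned to the first matching type only, which is a simplification of this sketch.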

Sentence Level Processing

• We use sentences as basic units to model text progression

• Sentence segmentation allows us to work with text elements that are smaller than paragraphs

• Analysis of the punctuation of the text following a set of typographic rules

• For each sentence, we count the number of references it contains and obtain their distribution along the text.
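The idea can be sketched as follows, under two loud assumptions: sentences end at ".", "!" or "?" followed by whitespace and an uppercase letter, and in-text citations appear as bracketed numbers such as [12]. Both rules are illustrative stand-ins for the typographic rules mentioned above, which the slides do not detail.

```python
import re
from typing import List

# Assumed boundary rule: end punctuation, whitespace, then an uppercase letter.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")
# Assumed citation form: a number in square brackets, e.g. [12].
CITATION = re.compile(r"\[\d+\]")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences with the naive boundary rule above."""
    return [s for s in SENTENCE_BOUNDARY.split(text.strip()) if s]


def references_per_sentence(text: str) -> List[int]:
    """Count bracketed citations in each sentence, in text order."""
    return [len(CITATION.findall(s)) for s in split_sentences(text)]
```

A rule this simple will split incorrectly after abbreviations that are followed by a capitalized word, so a real pipeline would need a richer rule set.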

Corpus

Cited References

• Cited references are present as separate elements in the XML structure

• Special cases needing specific processing: reference ranges
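One possible way to handle reference ranges is sketched below. It assumes that citations are JATS xref elements with ref-type="bibr", that each element's text carries a numeric label such as [3], and that a range is written as two such elements separated only by a hyphen or an en dash. The function name cited_numbers and this notion of a range are assumptions; the slides do not describe the actual processing.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

DASHES = {"-", "\u2013"}  # hyphen and en dash


def _is_bibr(node: ET.Element) -> bool:
    return node.tag == "xref" and node.get("ref-type") == "bibr"


def _label_number(xref: ET.Element) -> int:
    """Extract the numeric label from an xref such as <xref ...>[3]</xref>."""
    match = re.search(r"\d+", "".join(xref.itertext()))
    if match is None:
        raise ValueError("xref has no numeric label")
    return int(match.group())


def cited_numbers(paragraph: ET.Element) -> List[int]:
    """List cited reference numbers in a paragraph, expanding ranges like [1]-[4]."""
    children = list(paragraph)  # direct children only; nested xrefs are ignored here
    numbers: List[int] = []
    i = 0
    while i < len(children):
        node = children[i]
        if _is_bibr(node):
            start = _label_number(node)
            nxt = children[i + 1] if i + 1 < len(children) else None
            # A range: the text between two citation elements is only a dash.
            if nxt is not None and _is_bibr(nxt) and (node.tail or "").strip() in DASHES:
                end = _label_number(nxt)
                if end >= start:
                    numbers.extend(range(start, end + 1))
                    i += 2
                    continue
            numbers.append(start)
        i += 1
    return numbers
```

Expanding the range matters for counting: a citation written as [1]-[4] contains two tagged elements but refers to four references.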

Results

PLoS ONE & PLoS Computational Biology

PLoS Genetics, PLoS Pathogens & PLoS Biology

PLoS Medicine & PLoS Neglected Tropical Diseases

IMRaD Structure

Conclusion

• We have obtained the distribution of cited references in scientific papers.

• We have shown that this distribution seems quite stable, and maybe even invariant, if we take into account the changes that occur in some journals in the positions of the different sections in the text of the articles.

Thank you!
