Why Has Bradford's Law Been an Elusive Phenomenon So Far?



Letters to the Editor

Sir:

In Figure 1 of “Journal productivity distribution: Quantitative study of dynamic behavior” [JASIS 43(6):412-421, 1992], Vesna Oluić-Vuković presents data which provide unarguable evidence that, if a bibliography is continuously added to, then the distribution, as revealed by a Bradford plot of successive cumulations over time, shows a gradual yet distinct change in shape. However, this phenomenon and, consequently, the importance of dynamic, i.e., time-dependent, models are not new, even in the field of informetrics.

For instance, the University of Pittsburgh investigation of library book circulations showed a systematic change in the frequency-of-circulation distribution as the period of observation extended from one through to seven years [Fig. 9 of Kent et al. (1979, p. 41)]. More recently, Burrell (1991) used data on successive cumulations over a five-year period of a bibliography on “Regression” to reveal precisely the same sort of changing shape noted by Oluić-Vuković. (Note that, in this work, Burrell advocates a simple normalization of the axes to give a so-called standardized Bradford plot, which gives an even more striking visual impression of the shape change.)
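As a hedged illustration of what such a plot involves: a Bradford plot conventionally graphs the cumulative number of items against the logarithm of the rank of the sources, with sources ranked by decreasing productivity. The Python sketch below computes those points. The function names are hypothetical, and rescaling both axes to the unit interval is only one assumed reading of the "simple normalization" mentioned above, not necessarily the definition used by Burrell (1991).

```python
import math


def bradford_plot(counts):
    """Return (ln rank, cumulative items) points for a Bradford plot.

    counts: number of items (e.g. papers) contributed by each source
    (e.g. journal). Sources with no items are ignored; the rest are
    ranked from most to least productive.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    points = []
    cumulative = 0
    for rank, c in enumerate(ranked, start=1):
        cumulative += c
        points.append((math.log(rank), cumulative))
    return points


def standardized_bradford_plot(counts):
    """Rescale both axes to [0, 1].

    ASSUMPTION: the normalization divides ln(rank) by ln(number of
    productive sources) and cumulative items by the total item count.
    """
    points = bradford_plot(counts)
    if len(points) < 2:
        raise ValueError("need at least two productive sources")
    max_x = points[-1][0]  # ln(number of productive sources)
    max_y = points[-1][1]  # total number of items
    return [(x / max_x, y / max_y) for x, y in points]
```

Under this assumed normalization every cumulation ends at the point (1, 1), so plots for successive years can be overlaid and any change in shape is not masked by the growth in the number of sources and items.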

Turning from the empirical evidence to time-dependent informetric models, Oluić-Vuković suggests that

the shape of the distribution curve varies over time as a consequence of the changes in the internal structure of data that are, most probably, influenced by the intrinsic probabilistic processes.

This is exactly the view taken by several authors who have advocated a stochastic modeling approach.

Many of the contexts studied by informetricians can be described as a population of “sources” producing “items” over time, where the production process incorporates randomness in two distinct ways: first, for an individual source, items are produced at random over time; second, there is variation in the average rates of production of different sources. One simple mathematical model which incorporates both of these types is called a mixture of stochastic point processes [there are, of course, other related models, see, e.g., Glänzel & Schubert (1991)].
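The two kinds of randomness just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the specific model of any of the cited papers: each source is assumed to produce items as a Poisson process (exponential gaps between items), and the rates of the sources are assumed to follow a gamma distribution. The function names, the parameter values, and the choice of the gamma distribution are all hypothetical.

```python
import random


def simulate_cumulations(n_sources, years, shape=0.5, scale=2.0, seed=0):
    """Simulate yearly cumulations for a mixture of Poisson processes.

    Returns a list with one row per year; row y holds, for every source,
    the number of items produced during the first y + 1 years.

    ASSUMPTIONS (illustrative only): rates are gamma(shape, scale)
    distributed across sources; each source is a Poisson process.
    """
    rng = random.Random(seed)
    cumulations = [[0] * n_sources for _ in range(years)]
    for s in range(n_sources):
        # Second kind of randomness: sources differ in their mean rate.
        rate = rng.gammavariate(shape, scale)
        if rate <= 0.0:
            continue  # guard against numerical underflow of the rate
        # First kind of randomness: items arrive at random times.
        t = rng.expovariate(rate)
        while t < years:
            # An item produced in year int(t) is included in that
            # cumulation and in every later one.
            for y in range(int(t), years):
                cumulations[y][s] += 1
            t += rng.expovariate(rate)
    return cumulations


def zero_fraction(row):
    """Fraction of sources that have produced no items so far."""
    return sum(1 for c in row if c == 0) / len(row)
```

For example, `simulate_cumulations(n_sources=500, years=7)` yields seven rows of counts. Tabulating each row as a frequency distribution, or ranking it for a Bradford plot, gives one way to look at how the shape of the distribution shifts as the observation period lengthens.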

For library circulation distributions, Burrell (1982) gives graphical illustrations of the consequences of simple forms of this model for data collected over extending periods of time, similar to the situation reported by Kent et al. (1979). See also Burrell (1980, 1988b) and Burrell and Cane (1982) for further description of the model construction and some simple possible applications, and Gelman and Sichel (1987) for a slight variant. In the context of reference scattering over journals in a subject bibliography originally considered by Bradford (1934), the same sort of model has been considered by Sichel (1985, 1992), Burrell (1988a), and Burrell and Fenton (1992). The connection between changes in distributional shape and concentration noted by Oluić-Vuković has also been investigated using these models by Burrell (1985, 1992).

© 1993 John Wiley & Sons, Inc.

All of the foregoing serves to contradict those works which assume or assert that (i) there is stability in informetric distributions over extending periods of time, and/or (ii) deviations from the classical form of the Bradford plot (e.g., the Groos droop) are in some way due to a deficiency in the data (e.g., incompleteness of a bibliography). If Oluić-Vuković’s data [as well as those of Burrell (1991)] explode these as myths and we have models which reveal them as such, why, then, do these myths persist?

It seems to me that the error is either a belief in the universal truth of Bradford’s law, in which case (i) and (ii) are (almost) consequences, or a belief that they have been established as empirical facts, in which case they can be presumed. It must be stressed that in neither case can the fault be attributed to Bradford. When he wrote (Bradford, 1934) “ . . . to a considerable extent, the references are scattered throughout all periodicals with a frequency approximately related inversely to the scope . . . ” (italics added), it is clear that he was fully aware of the approximate, and to some extent speculative, nature of his “law of scattering.” Others have been less skeptical!

The scientific method demands that a hypothesis is tenable only so far as experimental evidence supports it and, when that support disappears, then the hypothesis itself must be amended, however uncomfortable or inconvenient that may be. Informetrics, as the mathematics of information science, must acknowledge this principle.

Even if all it leads to is the abandonment of the above-mentioned myths, then Oluić-Vuković’s article will have provided long-lasting benefits to informetrics.

Quentin Burrell
Department of Mathematics
University of Manchester
Oxford Road
Manchester M13 9PL
United Kingdom

References

Bradford, S.C. (1934). Sources of information on specific subjects. Engineering, 137, 85-86.

Burrell, Q. L. (1980). A simple stochastic model for library loans. Journal of Documentation, 36, 115-132.

Burrell, Q. L. (1982). Alternative models for library circulation data. Journal of Documentation, 38, 1-13.

Burrell, Q. L. (1985). The 80/20 rule: Library lore or statistical law? Journal of Documentation, 41, 24-39.

Burrell, Q. L. (1988a). Modelling the Bradford phenomenon. Journal of Documentation, 44, 1-18.

Burrell, Q. L. (1988b). Predictive aspects of some bibliometric processes. In L. Egghe & R. Rousseau (Eds.), Informetrics 87/88: Select proceedings of the First International Conference on Bibliometrics and Theoretical Aspects of Information Retrieval (pp. 43-63). Amsterdam: Elsevier.

Burrell, Q. L. (1991, August). The dynamic nature of bibliometric processes: A case study. Third International Conference on Informetrics, Bangalore: Indian Statistical Institute.

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE. 44(3):181-183, 1993 CCC 0002-8231/93/030181-03

Burrell, Q. L. (1992). The Gini index and the Leimkuhler curve for bibliometric processes. Information Processing and Management, 28, 19-33.

Burrell, Q. L., & Cane, V. R. (1982). The analysis of library data. Journal of the Royal Statistical Society, Series A, 145, 439-471.

Burrell, Q. L., & Fenton, M. R. (in press). Yes, the GIGP really does work - and is workable!

Glänzel, W., & Schubert, A. (1991, August). Predictive aspects of a stochastic model for citation processes. Third International Conference on Informetrics, Bangalore: Indian Statistical Institute.

Kent, A., et al. (1979). Use of library materials: The University of Pittsburgh study. New York: Marcel Dekker.

Oluić-Vuković, V. (1992). Journal productivity distribution: Quantitative study of dynamic behavior. Journal of the American Society for Information Science, 43, 412-421.

Sichel, H. S. (1985). A bibliometric distribution which really works. Journal of the American Society for Information Science, 36, 314-321.

Sichel, H. S. (1992). Anatomy of the generalized inverse Gaussian-Poisson distribution with special applications to bibliometric studies. Information Processing and Management, 28, 5-17.

Why Has Bradford’s Law Been an Elusive Phenomenon So Far?

Sir:

The letter by Quentin L. Burrell, especially his statement about the persisting myths of the Bradford Law, has stimulated me to highlight some points which, in my opinion, may account for this. The space available here, however, does not allow me to discuss all aspects relevant to this topic, so I shall confine myself to a more general level.

For this purpose, I cite Brookes (1977), who, in characterizing the conceptual framework within which the Bradford Law has been considered as “too static, too deterministic and too physical,” simultaneously identified the main source of the existing anomalies. Namely, one may easily deduce from Brookes’ statement that the dynamic and probabilistic aspects, as well as the role of time, have been put aside in the consideration of Bradford’s Law. Instead of “treating the situation as a process,” as suggested by Fairthorne (1969), the modeling approaches relied on the “synoptic views of consolidated data,” bringing into question the validity of the conclusions derived on this basis as well as their correspondence with the empirical data.

Despite the fact that, in the meantime, especially during the past ten years, a significant qualitative shift in concept formation in bibliometrics has been made (Bookstein, 1990, 1990; Burrell, 1980, 1982, 1988; Hubert, 1981; Schubert & Glänzel, 1984; Sichel, 1985; Yablonsky, 1980), the success achieved so far in the explanation and theoretical foundation of Bradford and related bibliometric phenomena may be evaluated as rather modest.

Without going into lengthy speculation about the possible causes, two simple reasons can be discerned which, in my opinion, have hindered qualitatively different approaches to Bradford’s Law.

First, some hypothetical statements related to Bradford’s Law have been accepted without ever being thoroughly investigated. One of these, which has so far escaped analysis and empirical verification, is the statement about the remarkable invariance and long-term stability of the distribution, which excludes any possibility that the distribution pattern will change over time. This is the reason why the role of the time component, with the exception of a few authors (Burrell, 1982; Sichel, 1985; Vlachy, 1976), has been avoided in both theoretical considerations and empirical examinations, or why so little knowledge has been gathered about the main structural features of data such as inequality, concentration/dispersal stratification, etc.

Another example, closely related to the first, is the statement about linearity of the distribution curve shape. In spite of the fact that there is neither strong empirical evidence nor theoretical rationale to support it, this statement has been widely accepted, inducing numerous and even highly controversial interpretations as to why empirical data so frequently differ from the expected linear form. This phenomenon prompted me to study actively the sources of uncertainty and misinterpretation on the one hand, and to find a solution which could be more generally accepted on the other (Oluić-Vuković, 1991, 1992). There is no doubt that the existing controversies about distribution curve shape are induced by the above hypothetical propositions which have, in fact, obscured the main research problem. Namely, what matters is not whether empirical data conform or do not conform with the “exact” linear curve shape, but under what conditions the curve would take a particular form.

Second, Bradford’s Law has only rarely been considered outside the narrow context within which it was regularly noted, although it was made evident almost 30 years ago (Fairthorne, 1969; Kendall, 1960) that its understanding and explanation require approaches which go beyond this narrow framework to a much broader class of probabilistic phenomena. The fact that a similar pattern of behavior occurs in a wide range of empirical data within and outside of bibliometrics, and that there is precise and comprehensive enough knowledge that may be helpful in the consideration of Bradford’s Law, have not had much effect until recently. This may explain in part why there are so few research works dealing with the apparent equivalence of Bradford’s and other related laws, or why the mechanism producing these laws has only recently been attracting more research interest.

The implications of all of the above for the theoretical consideration of Bradford’s Law, and their practical consequences, deserve, however, an independent treatment rather than a few lines. It can only be stated, from the more general point of view, that existing problems range from the lack of a common conceptual basis and comprehensive theoretical framework to some methodological aspects and pragmatic data conceptualization. With respect to this, it seems that the current trend of investigation inaugurated by informetrics is more promising. The contributions of Bookstein, Burrell, and Sichel speak in favor of this.

Finally, I am very grateful to Dr. Burrell for opening such a valuable discussion on the topic which, in my opinion, is of great importance to the fields of bibliometrics and informetrics.

Vesna Oluić-Vuković
Institute for Information Sciences
P.O. Box 327
41001 Zagreb
Republic of Croatia

References

Bookstein, A. (1990). Informetric distributions, part I: Unified overview. Journal of the American Society for Information Science, 41, 368-375.

Bookstein, A. (1990). Informetric distributions, part II: Resilience to ambiguity. Journal of the American Society for Information Science, 41, 376-386.

Brookes, B. C. (1977). Theory of the Bradford law. Journal of Documentation, 33, 180-209.

Burrell, Q. L. (1980). A simple stochastic model for library loans. Journal of Documentation, 36, 115-132.

182 JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE-April 1993

Burrell, Q. L. (1982). Alternative models for library circulation data. Journal of Documentation, 38, 1-13.

Burrell, Q. L. (1985). The 80/20 rule: Library lore or statistical law? Journal of Documentation, 41, 24-39.

Burrell, Q. L. (1988). Modelling the Bradford phenomenon. Journal of Documentation, 44, 1-18.

Fairthorne, R. A. (1969). Empirical hyperbolic distributions (Bradford-Zipf-Mandelbrot) for bibliometric description and prediction. Journal of Documentation, 25, 319-343.

Kendall, M. G. (1960). The bibliography of operational research. Operational Research Quarterly, 11, 31-36.

Oluić-Vuković, V. (1991). The shape of the distribution curve: An indication of changes in the journal productivity distribution pattern. Journal of Information Science, 17, 281-290.

Oluić-Vuković, V. (1992). Journal productivity distribution: Quantitative study of dynamic behavior. Journal of the American Society for Information Science, 43, 412-421.

Schubert, A., & Glänzel, W. (1984). A dynamic look at a class of skew distributions. A model with scientometric application. Scientometrics, 6, 149-167.

Sichel, H. S. (1985). A bibliometric distribution which really works. Journal of the American Society for Information Science, 36, 314-321.

Vlachy, J. (1976). Time factor in Lotka’s law. Probleme de Informare si Documentare, 10, 44-87.

Yablonsky, A. I. (1980). On fundamental regularities of the distribution of scientific productivity. Scientometrics, 2, 3-34.

The Impact of Proceedings upon Authors and Journals

Sir:

Technical journals, technical authors, conference proceedings, and copyright laws are on a collision course. The outcome will have an impact upon the archiving and retrieval of information. It will also have an impact upon the professional stature of authors as they have to make choices of publication media.

Let us examine the situation. The copyright laws say that an author owns the copyright to his work from the time ink hits paper, or else that the author’s institution owns it if it is a “work made for hire.” The publisher to whom the work is submitted insists that all rights to the copyright except derivative patents be assigned to said publisher. Suppose the author signs. Then he/she cannot use the manuscript text or figures at a different publisher without getting permission from the first publisher. Permission is granted pro forma, so this is no problem. The impasse arises at the second publisher. Almost invariably the second publisher is unwilling to reprint an earlier publication.

Who would want to publish an article which appeared previously? Well, a scientist or engineer who gave a paper at a conference and subsequently wrote it up for the Proceedings of the conference might want to publish the paper in a refereed, archival, and abstracted journal. After all, the Proceedings is abstracted sparsely or not at all, and the information is lost to posterity, unless it is reprinted in an archival journal. Only then will it be included fully in the great abstracting services and databases. This exclusionary principle is based upon the concept “refereed,” which separates journals from all other publication media.

But, seeing that the copyright is owned by some other publishing house, that the text would have to be surrounded by quotes as if one were quoting Shakespeare, and that every figure caption would have to include a thank-you note, the second proposed publisher (the journal) will not be interested in undertaking the publication. The author is left in “outer darkness” until he/she has amassed enough new material to convince the journal editor that a “new” manuscript is in hand.

Of late, the number of conference Proceedings published under titles which do not include the word “proceedings” and issued by for-profit publishing houses has proliferated. The authors are highly motivated to appear in these Proceedings just as they were highly motivated to present orally (or by poster) at the specialized and prestigious symposia being reported. The authors should be wary, however. Proceedings receive little or no abstracting. This assertion can be substantiated by library research. Consider two Proceedings with the word “nondestructive” in their titles. The reference department at the library at Iowa State University reports that the annual two-volume compendium, “Reviews of Progress in Quantitative Nondestructive Evaluation,” published by Plenum and now in its eleventh year, is abstracted only in Engineering Index, which is issued by Engineering Information, Inc., a private not-for-profit firm which abstracts about 1000 annual Proceedings as well as abstracting refereed publications. The compilation, “Nondestructive Characterization of Materials,” which is now in its fourth biennial edition, also by Plenum, is not abstracted at all. Neither of the two publications listed above is refereed. Materials Evaluation, a refereed journal, is abstracted by 28 abstracting services, including Chemical Abstracts, Current Contents, Engineering Index, Science Abstracts (a conglomerate including Physics), Science Citation Index, and a host of specialized services from Art and Archaeology Technical Abstracts through Zincscan. Through these abstracting services and the databases which computerize access to them, the authors in Materials Evaluation and other refereed archival journals will find their works being referenced for years to come. The author who publishes first in a copyrighted Proceedings and then finds he/she cannot use the same paper in a journal may not find a full spectrum of abstracting coverage.

Thus, the Proceedings author should desire to become published in a journal to extend his/her reputation through time and space. Yet, if the project is finished (at which point the journal should want the manuscript but does not because nothing novel will be forthcoming), what is he/she to do? The paper really should go into the Proceedings, of course, for completeness of that compendium. But then, how is the author to maintain his/her options? Some may “sandbag” to have left-over “new” material, but that course of action is questionable. What then?

A suggestion to eliminate the impasse is this: The publishing houses for Proceedings could contract for one-time rights to the manuscript. Then the second publication would be free of quotes and thank-you notes, the journals would be free and clear, and the authors would enhance their reputations.

Emmanuel P. Papadakis, Ph.D.
Associate Director, CNDE
Iowa State University
1915 Scholl Road, Building 2
Ames, IA 50011
