

Journal of Public Administration Research and Theory 264

Putting Big Data in Context

Mark H. Maier and Jennifer Imazeki. 2013. The Data Game: Controversies in Social Science Statistics, 4th ed. Armonk, NY: M.E. Sharpe. 309 pp.

Teaching introductory statistics and data analysis in public and nonprofit administration programs has the propensity to stagnate, to become a rote exercise of moving from descriptive to inferential, from simple to multiple. Though the approach to teaching such material has remained relatively consistent, data sources and availability have increased substantially. Namely, it has been posited that we are in an era of Big Data. This trend has been described as a philosophy du jour (Brooks 2013), the next frontier (Manyika et al. 2011), a revolution (Mayer-Schonberger and Cukier 2013), or a seismic shift (Gitelman and Jackson 2013). Characteristics of Big Data include unprecedented volumes and continuous generation of information, unwieldy complexity in processing, the presence of both structured and unstructured forms, increasing difficulty in storage, and the potential for every electronic movement you make to be counted and recorded.

For many, the era of Big Data has resulted in a virtual treasure trove of affordable and available data, the likes of which were previously cost-prohibitive or difficult to obtain. The availability of data and the ease with which technology allows us to “crunch the numbers” can set us free by illuminating errors in analysis, as in the debunking by Thomas Herndon (Evergreen State College ’07) and his colleagues (Herndon, Ash, and Pollin 2013) of a key study supporting the European austerity movement. And although it would be wonderful (or unsettling) to have access to every datum collected by Google and Facebook, the increasing availability of large government data sets has become a hallmark of Big Data times. Indeed, public administrators in data-collecting agencies are working toward making these large data sets available to the public, which can both increase agency transparency and fulfill Freedom of Information Act requests in a more efficient and timely fashion.

Private and nonprofit sector data collections should also be included in a conversation about Big Data. Much of the large data collected by private firms remains proprietary and costly, and controversy arises when such data are shared (e.g., “NSA (National Security Agency) collecting phone records of millions of Verizon customers daily”; Greenwald 2013, in The Guardian). The availability of information on the nonprofit sector, a somewhat amorphous label for various organizations that fit within a legal definition and that may or may not have federal tax exemption, has also increased over the past several years (e.g., through the Urban Institute, Charity Navigator, and GuideStar), allowing for more comprehensive analyses of the sector.

It is at this point that Maier and Imazeki’s book, The Data Game: Controversies in Social Science Statistics, 4th ed., comes into play as an effective bridge between (a) teaching the theory and mechanics of data analysis and (b) the applicability of these processes to a Big Data world. In this fourth edition, the authors recognize that the accessibility, prevalence, and use of large amounts of data have changed substantially since the previous edition was published in 1999. Thus, Maier and Imazeki provide significantly updated examples, citations, and resources, which are grounded in current controversies about data, for each subject area. For example, they detail the way crime data are collected and recorded for the various agencies in our complicated criminal justice system and how that has changed significantly during the past decade, with an increased emphasis on reporting “the numbers” in order to justify funding levels and to influence perceptions of the public in general and policy makers in particular. The authors begin the discussion of controversies surrounding the measurement of crime by highlighting that there is no one perfect data source, stipulating that researchers must “choose the data for which inaccuracies are least likely to affect the issue being studied” (p. 115).

The book chapters represent subject areas that are of interest to public and nonprofit administrators. Each subject area chapter is divided into two parts: a discussion of a set of accessible data sources often generated and used by the public sector, followed by a discussion of the controversies surrounding a given subject area when “objective” data analysis is used to support what are, in essence, value-laden political positions on an issue. Following the introductory chapter is a chapter on demography, which lays the foundation for numerous public policy decisions and thus provides a useful launching point for students as they continue with the remaining topics: housing, health, education, crime, the national economy, wealth, income, poverty, labor statistics, business statistics, government, and public opinion polling.

Maier and Imazeki take the time to briefly define key concepts applicable to the discussion at hand, such as the use of cost-benefit analysis when analyzing health data (p. 77), value-added measures in examining student test scores (p. 104), the ecological fallacy applied to perceptions of crime (p. 122), and relative versus absolute mobility when discussing wealth, income, and poverty (p. 176). Each chapter summary does a great job of drawing the reader back to the overall concerns a thoughtful researcher should have when working with data for each given subject area. The case study questions at the end of each chapter require the student to respond reflectively about the chapter material, rather than relying on a “hunt-and-peck” search through the chapter for answers to the questions. Finally, the references at the end of each chapter are divided by chapter subheadings and page numbers, making it easy for the reader to navigate between the text and the corresponding citations.

Maier and Imazeki state that the book is intended to be read in its entirety, and there is much merit to this assertion. However, because this review is about the usefulness of this text in the classroom, I would argue that each subject area chapter can stand on its own as a demonstration of controversies surrounding a particular area of study. For example, I teach introductory statistics in the evening, in the condensed summer session, to students entering the Master of Public Administration program, who attend class after working at their respective day jobs. Given these constraints, it is difficult to include anything beyond the primary “how to” statistics text. However, I believe that giving these budding public servants a broader context in which to see the course material in operation is possible without requiring a reading of the entire book by Maier and Imazeki. Instead, I recommend that students read the first two chapters (Introduction and Demography), a chapter of their choice, and the concluding chapter, and that they answer select case study questions. Reflecting the interdisciplinary nature of public and nonprofit administration, my students often comment that they could not choose just one chapter because of their varied interests, and so they end up reading more than what is required or take the book up again after the course is over.

The real strength of the book is the care that Maier and Imazeki take in their treatment of the controversies surrounding interpretations of data and of the reasons why extremely different interpretations of the same set of data exist. To this end, the authors make explicit the role that assumptions play in creating conflicting interpretations of data. Indeed, the overarching theme of Maier and Imazeki’s book is that there is no single best approach to interpreting data, big or other-sized ones. The latest edition of The Data Game is an excellent supplement to statistics and data analysis texts, but it also stands on its own for inclusion in courses on research methods. The book works both for budding researchers and for practitioners whose interactions with data may be primarily as consumers of research.

Doreen Swetkis
Evergreen State College

doi:10.1093/jopart/mut046

References

Brooks, David. 2013. The philosophy of data. The New York Times, February 4. www.nytimes.com (accessed May 17, 2013).

Gitelman, Lisa, and Virginia Jackson. 2013. Introduction. In Raw data is an oxymoron, ed. L. Gitelman, 1–14. Cambridge, MA: MIT Press.

Greenwald, Glenn. 2013. NSA collecting phone records of millions of Verizon customers daily. The Guardian. June 5. http://www.guardian.co.uk/world/2013/jun/06/nsa-phone-records-verizon-court-order (accessed June 25, 2013).

Herndon, Thomas, Michael Ash, and Robert Pollin. 2013. Does high public debt consistently stifle economic growth? A critique of Reinhart and Rogoff. Amherst, MA: Political Economy Research Institute. http://www.peri.umass.edu/236/hash/31e2ff374b6377b2ddec04deaa6388b1/publication/566/ (accessed July 7, 2013).

Manyika, James, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela H. Byers. 2011. Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute. http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation (accessed May 17, 2013).

Mayer-Schonberger, Viktor, and Kenneth Cukier. 2013. Big data: A revolution that will transform how we live, work, and think. New York, NY: Eamon Dolan/Houghton Mifflin Harcourt.
