
ISA 2013

Patterns of CBRN Use by Non-State Actors: Analyzing the Evidence

Ronald Breiger1

Paul Murray1

Lauren Pinson2

1University of Arizona, Tucson, AZ 85718

2National Consortium for the Study of Terrorism and Responses to Terrorism (START), University of Maryland, College Park, MD 20740

Paper prepared for presentation at the Annual Convention of the International Studies Association, San Francisco, April 4, 2013,

Panel on New Data for the Scientific Study of Conflict

This work was supported by the Defense Threat Reduction Agency, Basic Research Award # HDTRA1-10-1-0017. We thank Gary Ackerman, Victor Asal, David Melamed, H. Brinton Milward, Karl Rethemeyer, and Eric Schoon for helpful discussions.

In recent years, academics and policymakers have debated the growing potential for chemical, biological, radiological, and nuclear (CBRN) terrorist attacks. However, CBRN terrorism is often misrepresented by the media – due to limited first-hand information, conflicting reports, and varying sympathies – which creates unreliable events data.

This paper has two goals. The first is to introduce the reader to a new database for the scientific study of activities (plots, acquisitions, weaponization, attacks) of non-state actors seeking or using CBRN agents. Analysts have often attempted to mine CBRN databases and related datasets on terrorism by making use of standard regression models and their many generalizations, such as logistic regression for binary outcomes (e.g., Asal and Rethemeyer 2009, on CBRN use or pursuit by Islamist organizations), Poisson regression for counts (e.g., Asal and Rethemeyer 2008, modeling attack lethality), and many more. Our second goal therefore is to illustrate a set of newly developed techniques for, so to speak, turning these regression models “inside out” so that, instead of focusing on relations among variables, the analyst can use those relations to model a network of profile similarity among the cases (which for us are specific events). In this respect we join a number of other papers presented in this panel that seek to extend conventional quantitative modeling techniques in a variety of new directions. Philip Schrodt (2012, p. 556) has recently noted that event data is “perfectly suited for network analysis.” We intend to introduce an implementation of this idea that is quite non-standard, but one that we have found useful in working with databases for analyzing conflict situations. In illustrating the modeling of a new events database, we build on our previous modeling work (in particular Breiger et al. 2011; Breiger and Melamed 2013; Melamed et al. 2012, 2013).

The POICN Database

As Perliger and Pedahzur (2011) have noted, there has been “a striking increase in efforts and resources invested in data collection” on terrorist groups in recent years by academic and government agencies. Particularly notable in this respect have been the open-source, publicly available datasets maintained at the START Center at the University of Maryland, contributing to the present availability of “high-resolution” information (see also Hayden 2009).

In this paper we make use of a new relational database developed by the START Center, focusing on terrorist plots, acquisitions, and attacks relating to CBRN agents from 1990-2011. The Profiles of Incidents involving CBRN by Non-state actors (POICN) database distinguishes itself from other CBRN and terrorist attack databases by means of its transparent classification of source validity and inclusion of variables that rate the uncertainty that is sometimes present within and between sources. Explicitly incorporating and disclosing reliability and credibility levels allows for greater flexibility in tailoring the inclusion of cases for researchers’ specific analytical requirements. In addition, the inclusion of such measures in this and similar datasets can facilitate more robust and defensible analyses and thus ultimately strengthen the role that social science can play in guiding and improving policy choices, especially in highly charged political and security contexts. When POICN is released publicly within the next several months, it will be the most comprehensive quantitative database on non-state actor CBRN events that have transpired within the past two decades.

Transparent coding: source competence and objectivity; event credibility. In developing the POICN database, START researchers began with 499 potential cases from 1990-2008 (a period subsequently extended to 2011), drawn from existing databases. The researchers rejected the widely applied but often-mistaken assumption that a case’s inclusion in a dataset automatically equates to full validation of that case. A Source Evaluation Schema was developed, consisting of a set of operationalized variables and coding instructions. (See Ackerman and Pinson 2011 and Sawyer and Ackerman 2012 for more details.) Multiple variables focus on capturing the intentional and accidental distortion of information regarding CBRN activities.

For each event, each source was coded for its competence. (For example, source competence was coded as “questionable” for those institutional publishers or authors generally known for high quality output, but where the particular source document describing an event exhibited prima facie indications of lower quality. “Full” competence was reserved for documents with no evident internal flaws produced by authors and institutions that have proven or researched competence in the geographical and substantive domain on which they are reporting, taking into account all that is known about their history and reputation as sources.) Likewise, each source was coded for its objectivity with respect to each event. (A newspaper that is generally measured in its approach to reporting but is known on occasion to take a very pro-Israeli or pro-Palestinian stance on the Israeli-Palestinian issue is coded as evidencing only “potential” objectivity).

The variable of credibility, coded once for each case (event) in the database, provides a measure of whether the event actually took place and whether that event really constituted a CBRN attack, based on corroboration between multiple independent sources. Sources coded as not competent or not objective were excluded from the credibility measure, which evaluates the number of remaining sources for each event with reference to their degree of mutual independence. A source is regarded as independent of another source if it does not share the same original authorship and does not rely on the same original source material. Source documents deriving from the same institution (such as the Associated Press) do not count as multiple sources. As indicated in Figure 1, the lowest level of credibility (35% of the events in the database) was assigned to single-source events or events described by multiple non-independent sources. The intermediate level of credibility (14% of the events) was assigned to events reported by two independent sources not reflecting the same bias. The highest level of credibility (51% of
the events) was allocated to events for which there were three independent sources, or two independent sources with competing biases.
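The credibility rule just described can be sketched in code. What follows is a minimal illustration under stated assumptions, not the POICN coding procedure itself: the `Source` fields, the signed `bias` encoding (-1, 0, +1), the reading of "three independent sources" as "three or more," and the assignment of two sources sharing a bias to the lowest level are all hypothetical choices made here for concreteness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    origin: str             # hypothetical key: same origin = same authorship or source material
    bias: int = 0           # hypothetical encoding: -1 / +1 for opposing slants, 0 for none evident
    competent: bool = True
    objective: bool = True


def credibility_level(sources):
    """Return 1 (lowest), 2 (intermediate), or 3 (highest) for one event."""
    # Sources coded as not competent or not objective are excluded.
    usable = [s for s in sources if s.competent and s.objective]
    # Sources sharing an origin (e.g. the same wire service) count once.
    independent = list({s.origin: s for s in usable}.values())

    if len(independent) >= 3:          # assumption: "three" read as "three or more"
        return 3
    if len(independent) == 2:
        b1, b2 = (s.bias for s in independent)
        if b1 * b2 < 0:                # competing biases
            return 3
        if not (b1 == b2 != 0):        # not reflecting the same bias
            return 2
    return 1                           # single source, non-independent sources, or shared bias


wire_a = Source(origin="Associated Press")
wire_b = Source(origin="Associated Press")
paper = Source(origin="Daily X", bias=+1)    # hypothetical outlets
rival = Source(origin="Daily Y", bias=-1)

levels = [
    credibility_level([wire_a, wire_b]),     # same institution: one independent source
    credibility_level([wire_a, paper]),      # two independent sources, not the same bias
    credibility_level([paper, rival]),       # two independent sources, competing biases
]
```

Collapsing sources by `origin` before counting is what keeps two reports from the same institution from being treated as corroboration.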

INSERT FIGURE 1 ABOUT HERE

Scope of the POICN database. We have highlighted the variables discussed above because they are directly relevant to the empirical analyses we report in this paper. Researchers will nonetheless want to know that the concern for validity and reliability of the events in the POICN database extends to many more design and coding features. For example, at least 54 of the key variables are coded for the presence / absence of both doubt and discrepancy, allowing researchers to assess in fine-grained ways effects of the level of validity of the data on the substantive conclusions drawn. The database contains records on 458 cases (events) spanning 1990-2011, and all of these 458 cases have been double-coded. POICN records 91 different groups and 75 lone actors as perpetrators in these events. Geospatial, longitudinal, technological, and organizational variables are coded and included.

Basic descriptive analytics of POICN 1990-2011. The POICN dataset includes a spectrum of events ranging from protoplots (events in which a perpetrator is exploring CBRN pursuit but has not reached the level of plotting) to use of an agent (events in which a perpetrator employed or disseminated a CBRN substance in the commission of an attack). Figure 2 shows the breakdown of these event types. Fifty-five percent of the 458 events are below the level of an actual attack.

Due to the inherent uncertainty in reporting of CBRN events, two unique variables are coded: Event Uncertainty and Attack Uncertainty. Event Uncertainty is relevant when sources do not confirm whether or not a CBRN event actually occurred. Attack Uncertainty is relevant when sources do not confirm whether an event was actually an attack or was an accident or natural event. The great majority of events have no event or attack uncertainty (Figure 3).

Events in POICN are coded for multiple agent types. Figure 4 shows that 78% of the events in POICN involve the pursuit or use of chemical weapons.

INSERT FIGURES 2, 3, AND 4 ABOUT HERE

Context and Research Questions

Motivation. Given a data matrix (cases by variables), regression analysis as well as many of its generalizations may be thought of as the study of relations among the variables. With its typical assumption that the “cases” are a random sample representative of a population of interest, regression analysis makes the cases invisible, as Michael Shalev (2007) and other analysts of comparative politics have argued in their critiques of regression approaches.

But often the cases are of interest, and the goal of the analysis should be to use the variables to let the cases be seen. Shalev (2007) discusses analyses where the cases are countries, and the
research agenda is comparative analysis of types of welfare states. In the example of the present paper, the cases are CBRN events, and our research agenda is comparative analysis of types of such events (discovering the types and how variables interact differently within each type). Moreover, in neither Shalev’s examples nor those of the present paper could the analyst claim that the cases are a random sample. The POICN database aims at collecting all known cases of CBRN events within its date range, and there are surely dependencies among the events along multiple dimensions. (For example, two attacks attempted by the same group in adjacent months are likely not “independent” of each other. In addition, attacks using the same toxic agent by different groups but within the same country might well lack independence from one another. It seems quite limiting indeed to assume independence among all the cases.) We propose instead to discover regions of dependence among the cases on the basis of their attributes. Along with Shalev (2007), Charles Ragin (2008), and other researchers, we are willing to pay the costs of giving up our claim to “significance testing” in order to gain insight by more richly exploring the structuring internal to our dataset. Moreover, we show that we can do all this by deepening the framework within which regression analysis is conventionally understood.

Outcome variable. We will assess the effects of several variables on the type of CBRN event. For this purpose we use only the 175 events taking place in the 1998-2011 period that are coded at the highest credibility level (Figure 1). Previous research (Ackerman and Pinson 2011) has demonstrated that the significance of variables in predictions of event type is affected by credibility level, and in this analysis we seek to generalize from only the most credible events.

As mentioned above in discussion of Figure 2, the POICN database delineates eight different types of CBRN event, ranging from protoplots (e.g., knowledge that a terrorist group has hired a scientist with a CBRN specialty) all the way up to the use of an agent in an attack. We do not view these eight categories as forming an ordinal scale of intensity for CBRN activities. Rather, we separate the eight categories into two broad types:

Type A = seeking a CBRN weapon = protoplots, plots, attempted acquisition, and possession of a non-weaponized agent.

Type B = possessing a CBRN weapon = possession of a weapon, threat with possession, attempted use of weapon, and use of a CBRN substance in the commission of an attack.

Table 1 provides more detail on each of the eight categories of our two broad types. The basic distinction is between seeking a CBRN weapon (Type A) and possessing one (Type B).

TABLE 1 ABOUT HERE

We will briefly elaborate our warning against any assumption that events of Type A in Table 1 (such as plots and acquisition attempts) are less serious than those of Type B (e.g., possession of a weaponized agent or its use in an actual attack). The weapon possessed in a Type
B event might be relatively crude (such as a small amount of radioactive material intended to be left in a building), and the attack might have caused no harm even though harm was intended. On the other hand, an event of Type A might be consequential if, for example, a Bulgarian businessman is approached by a contact with ties to al Qaeda and asked about the possibility of acquiring spent nuclear fuel rods (Event ID 23). To pursue this point: one of the variables in the POICN database is “heightened interest,” which is manifested by an event if any of three criteria are met (at least five casualties; involvement of a CBRN agent that is classified as a warfare agent; use of, or a plot to create, weaponization of the agent in at least a moderately sophisticated manner). When heightened interest is cross-classified with event type (for the 175 high-credibility events we use in our analysis, of which 6 events had missing data on heightened interest), we have:

                            Event Type
Heightened Interest     Type A    Type B    Total
No                          39        40       79
Yes                         36        54       90
Total                       75        94      169

Thus, whether an event involves possession or use of a weaponized agent (Type B) versus some form of plot, attempted acquisition, or possession of a non-weaponized agent (Type A) is only moderately, and non-significantly, related to heightened interest in the event (Yule’s Q = .19, log(odds ratio) = .38; chi-square = 1.5 on 1 df). Both types of event (A and B) are potentially of great interest. As we seek to understand the qualities of these two broad event types (which partition the 175 events into two classes), we employ logistic regression.
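The three association statistics quoted above can be recomputed directly from the four cell counts of the cross-classification. The short sketch below uses only those counts; the variable names are ours, and the chi-square is the ordinary Pearson statistic without continuity correction (an assumption, though it reproduces the reported value).

```python
import math

# Cell counts from the cross-classification of heightened interest by event type.
a, b = 39, 40   # heightened interest = No:  Type A, Type B
c, d = 36, 54   # heightened interest = Yes: Type A, Type B
n = a + b + c + d

odds_ratio = (a * d) / (b * c)
log_or = math.log(odds_ratio)
yules_q = (a * d - b * c) / (a * d + b * c)

# Pearson chi-square for a 2x2 table (no continuity correction), 1 df.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

print(round(yules_q, 2), round(log_or, 2), round(chi2, 1))   # 0.19 0.38 1.5
```

Since the 5% critical value of chi-square on 1 df is 3.84, a statistic of about 1.5 is consistent with the description of the association as non-significant.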

Predictor variables. We examine the effect on event type of nine predictors, all of which are (like the outcome variable) binary. Three of these pertain to world region: Russia and the NIS countries; the Middle East and North Africa; and South Asia. (The seven other regional categories used in the database serve as the omitted category.1) Two pertain to weapon type: biological and chemical (with the omitted categories of radiological and nuclear providing a baseline). Four predictors pertain to the type of perpetrator: lone actors, religious extremists, cults, and ethnonationalist groups (with the three other classes of perpetrator providing the omitted-category baseline.2)

Descriptives are provided in Table 2. As seen there, 55% of the events pertain to the possession of a CBRN weapon (Type B events), while the remainder pertain to seeking such a weapon (Type A). Zero-order relationships and their calculation are illustrated at the bottom of Table 2. Concerning biological weapons, for example, the odds are 12 / 25 = 0.48 that a biological weapon was possessed (Type B) versus sought (Type A). However, for non-biological weapons the odds are 84 / 54 = 1.56, implying that the odds on a Type B (rather than Type A) event are lower for biological weapons than for those of other types. Indeed, the odds ratio is (12 / 25) / (84 / 54) = .3086, and the log of the odds ratio is –1.176 (see Table 2). Bivariate relations of each predictor with the outcome are given similarly in Table 2.

1 For 126 of the 175 events, a single country was listed. The remaining events were associated with multiple countries. For these, only the region of the first-listed country was coded.

2 Only 10 of the 175 events were coded as having multiple perpetrators, and only four events were coded as having multiple types of perpetrator. In these four cases only the first type was coded.

TABLE 2 ABOUT HERE

Research questions. In a conventional multivariate study we would pose our central research question as follows:

RQ1: What are the effects of the predictor variables on the outcome of possession and use of CBRN weapons (Type B events)?

Indeed we are interested in that question. In addition, however, we are also interested in how we can use the variables to learn about the cases (the CBRN events). We thus also formulate the following non-traditional questions:

RQ2: How can we compute the logistic regression coefficients as sums across the CBRN events? Therefore, how may we compute these same logistic regression coefficients as sums across clusters of events?

RQ3: How can we use the clustering of cases (events) to discover interactions among the variables? And how can we use a single variable to induce clustering among the cases?

RQ4: How can we define a network among the cases (CBRN events) such that that network yields the same predicted logits, $\log(\hat{p}/(1-\hat{p}))$, as are produced by the standard logistic regression model? And how can this network among the cases provide us with a deeper understanding of the field of variables than is provided by the logistic regression equation?

Logistic Regression Results

The logistic regression model is reported in Table 3, and coefficients are given in both metric and standardized form. Events involving chemical agents, and event locations in Russia and the NIS countries, the Middle East and North Africa, and (especially) South Asia are all positively and significantly associated with events pertaining to CBRN weapon possession (Type B events). Religious extremist and ethnonationalist groups (the latter in a switch from the zero-order relationship reported in the previous table) are significantly and negatively associated with Type B events or, to say the same thing, significantly and positively associated with Type A events: the seeking of CBRN weapons. Events involving biological agents, and lone perpetrators and cult groups, are also negatively associated with Type B (positively associated with Type A), but not significantly so.

TABLE 3 ABOUT HERE

Turning Regression Modeling “Inside Out”

We will illustrate with the example of logistic regression, focusing on new insights that can be applied to the model application just reviewed. The logistic regression model implies

$$\log\left(\frac{\hat{p}}{1-\hat{p}}\right) = X\hat{\beta}$$

where $\hat{p}$ is the modeled estimate of the probability that each event (in turn) is an event of Type B (possessing CBRN weapons), X is an events-by-variables design matrix for the predictor variables, and $\hat{\beta}$ is a vector of estimated logistic regression coefficients.

We will compute the singular value decomposition (SVD) of matrix X:

$$X = U S V^{T}$$

thus expressing the given events-by-predictor-variables dataset (X) in terms of a set of orthogonal dimensions for the events (U) and a dual set of orthogonal dimensions for the predictor variables (V), with S a diagonal matrix of weights (singular values) indicating the relative importance of each dimension.3 (The superscript T signifies matrix transposition.) We will also define a diagonal matrix pertaining to the outcome variable,

$$Y^{*} = \mathrm{diag}\left(\log\left(\frac{\hat{p}}{1-\hat{p}}\right)\right)$$

Then, the usual formula for computing logistic regression coefficients is identical to

$$\hat{\beta} = V S^{-1} U^{T} Y^{*} \mathbf{1}$$

where $\mathbf{1}$ is a vector of 1’s. The matrix product $V S^{-1} U^{T} Y^{*}$ is of dimension (predictor) variables by events, and summing each of its rows across the events yields the identical logistic regression coefficients that are produced by standard packages. The novel element, however, is that the equation above indicates that the same logistic regression coefficients produced by the standard packages can be alternatively defined as sums across the cases, which in our example are the 175 events that we have been working with but that have remained invisible in Tables 2 and 3.
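The identity can be checked numerically. The sketch below is an illustration on synthetic data, not the POICN events: the three binary predictors, the generating coefficients, the helper `fit_logit` (a plain Newton-Raphson fit), and the random binary partition `high` are all assumptions introduced here. Predictors are standardized as in footnote 3, and the intercept is fitted separately; because standardized columns are centered, the intercept drops out of the sums.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 175

# Synthetic stand-in data (not the POICN events): three binary predictors, binary outcome.
X = rng.integers(0, 2, size=(n, 3)).astype(float)
eta_true = -0.3 + X @ np.array([0.8, -0.6, 0.4])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

# Standardize each predictor (subtract its mean, divide by its standard deviation).
Z = (X - X.mean(axis=0)) / X.std(axis=0)


def fit_logit(D, y, iters=50):
    """Newton-Raphson estimates for a logistic regression with design matrix D."""
    beta = np.zeros(D.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(D @ beta)))
        w = p * (1.0 - p)
        beta = beta + np.linalg.solve(D.T @ (D * w[:, None]), D.T @ (y - p))
    return beta


D = np.column_stack([np.ones(n), Z])     # intercept plus standardized predictors
b = fit_logit(D, y)
beta_hat = b[1:]                         # slopes (standardized coefficients)
logit_hat = D @ b                        # fitted logits log(p/(1-p))

U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# Variables-by-events matrix V S^{-1} U^T Y*, where Y* = diag(logit_hat).
contrib = ((Vt.T / s) @ U.T) * logit_hat

beta_from_events = contrib.sum(axis=1)   # each row summed across all events

# Hypothetical binary partition of the events (a stand-in for two clusters).
high = rng.integers(0, 2, size=n).astype(bool)
beta_low = contrib[:, ~high].sum(axis=1)
beta_high = contrib[:, high].sum(axis=1)

print(np.allclose(beta_from_events, beta_hat))      # True
print(np.allclose(beta_low + beta_high, beta_hat))  # True
```

Each column of `contrib` is one event's contribution to every coefficient, so any partition of the events yields partial sums that add back to the full coefficient vector.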

In our previous work (e.g., Breiger et al. 2011, Melamed et al. 2013), we have examined regression coefficients as sums across clusters of the cases, where we have used inductive procedures (such as the k-means clustering algorithm) to identify the clusters. In this paper we take a different tack. We recognize that a binary variable may be simultaneously viewed as a clustering of cases into two clusters. Specifically, we distinguish less sophisticated events (such as a plot or an attack that involves leaving a radiological agent at a location, or contaminating drinking water with raw sewage) from highly sophisticated events (such as planning or constructing a containment system to protect perpetrators from the effects of the CBRN agent).4 We partition our 175 CBRN events into two clusters: those that are “low” in sophistication, and those that are “high.” Using the equation above, we can write the same regression coefficients that we have previously computed (see the “standardized” logistic regression coefficients in the right-most column of Table 3) as sums across the 175 events partitioned into the “low-sophistication” and the “high-sophistication” clusters of events, and we have done precisely this in Table 4.5

3 For convenience, we work not with X, the matrix of predictor variables, but with Z, in which each variable has been transformed to standard form (by subtracting its mean and dividing by its standard deviation).

TABLE 4 AND FIGURE 5 ABOUT HERE

Where do the numbers in Table 4 “come from”? This is illustrated in Figure 5 for the case of biological weapons. The logistic regression coefficient for bioweapons (in standardized form) is identical to the linear regression coefficient of the logit of $\hat{p}$ regressed on the residuals from bioweapons after regressing it on all the other predictor variables (see Figure 5). Moreover, the two clusters of cases have values on these variables ($\mathrm{logit}(\hat{p})$ and residuals of the bioweapons variable) that sum to the logistic multiple regression coefficient, –.0893 (given in standardized form).
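The residual-regression statement is an instance of the Frisch-Waugh-Lovell result for linear regression, which applies here because the fitted logits are an exact linear function of the predictors. The sketch below illustrates it on synthetic data. Because the identity is purely algebraic, we use a hypothetical intercept `b0` and coefficient vector `beta` in place of fitted values; these numbers, like the data, are assumptions and are unrelated to Table 4.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 175

X = rng.integers(0, 2, size=(n, 3)).astype(float)   # synthetic binary predictors
Z = (X - X.mean(axis=0)) / X.std(axis=0)            # standardized, as in footnote 3

# Hypothetical intercept and coefficients standing in for a fitted model.
b0, beta = 0.2, np.array([0.5, -0.3, 0.8])
logit_hat = b0 + Z @ beta

j = 0                                               # the predictor of interest
others = np.column_stack([np.ones(n), np.delete(Z, j, axis=1)])
coef, *_ = np.linalg.lstsq(others, Z[:, j], rcond=None)
resid = Z[:, j] - others @ coef                     # part of predictor j not explained by the rest

per_event = resid * logit_hat / (resid @ resid)     # one term per event
slope = per_event.sum()                             # slope of logit_hat regressed on resid

print(np.isclose(slope, beta[j]))                   # True
```

The vector `per_event` is what makes a cluster decomposition possible: summing its entries within each cluster gives that cluster's share of the coefficient.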

The astute reader might wonder whether the decomposition of the logistic regression coefficient (–.0893) in Table 4, into a number pertaining to the cluster of “low”-sophistication cases (–.1345) plus a number pertaining to the cluster of “high”-sophistication cases (.0451) implies that the latter numbers are the logistic regression coefficients estimated separately for the cases in each cluster. They are not, although the analogy is suggestive (as illustrated in Figure 5). We think of the latter two numbers instead as the “versions” of local logistic coefficients that are assumed by the logistic regression model that is applied across all the cases.

Discovering statistical interactions. Notice the arrows at the right margin of Table 4, pointing to the variables “chemical weapons” and “lone perpetrators.” We observe that these variables move in opposite directions across the clusters. The chemical weapons variable decreases from .5656 (in the “low” sophistication cluster) to .2470 (in the “high” sophistication cluster). The lone perpetrator variable moves in the opposite direction, increasing from –.2579 (in the “low” sophistication cluster) to .0576 (in the “high” sophistication cluster). We have shown that this opposite trending across clusters of cases can imply a statistical interaction among the implicated variables (Melamed et al. 2012, 2013).6 Indeed, when we add an interaction term (chemical weapons × lone perpetrator) to the logistic regression model of Table 3, we find a statistically significant improvement (Table 5).

4 To be precise, using the POICN variable for sophistication, we combine the categories of “low” and “medium” sophistication and contrast them with all other events.

5 There are two cases of rounding error in the fourth decimal place, but the two sets of regression coefficients (those in the right-most columns of Table 3 and Table 4) are otherwise identical.

6 The condition is that the cluster variable be significant when added to the (logistic) regression equation. In this case, the “sophistication” variable provides a statistically significant addition to the model of Table 3.

TABLE 5 ABOUT HERE

The interpretation of the discovered statistical interaction is as follows: an event involving a lone perpetrator and chemical agents is significantly more likely to entail seeking weapons (Type A) than possessing or using them (Type B). The example provided here illustrates how clusters of cases (CBRN events) can be used to discover relations (statistical interactions) among variables.
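One common way to check whether a product term improves a logistic regression is a likelihood-ratio comparison of nested models. The sketch below illustrates that comparison on synthetic data; it is not a reproduction of Table 5, and we do not know which test underlies that table. The variables `chem` and `lone`, the generating coefficients, and the helper `fit_logit` are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 175

# Synthetic stand-ins for two binary predictors and a binary outcome.
chem = rng.integers(0, 2, size=n).astype(float)
lone = rng.integers(0, 2, size=n).astype(float)
eta_true = 0.2 + 0.8 * chem + 0.3 * lone - 1.2 * chem * lone
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)


def fit_logit(D, y, iters=50):
    """Return (coefficients, maximized log-likelihood) via Newton-Raphson."""
    beta = np.zeros(D.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(D @ beta)))
        w = p * (1.0 - p)
        beta = beta + np.linalg.solve(D.T @ (D * w[:, None]), D.T @ (y - p))
    eta = D @ beta
    return beta, float(np.sum(y * eta - np.log1p(np.exp(eta))))


main = np.column_stack([np.ones(n), chem, lone])    # main effects only
full = np.column_stack([main, chem * lone])         # adds the product term

_, ll_main = fit_logit(main, y)
b_full, ll_full = fit_logit(full, y)

lr_stat = 2.0 * (ll_full - ll_main)                 # likelihood-ratio statistic, 1 df
improves = lr_stat > 3.84                           # 5% critical value of chi-square, 1 df
```

A negative coefficient on the product term (`b_full[3]`) would correspond to the interpretation given above: the joint presence of both attributes lowers the log-odds of a Type B event beyond what the two main effects imply.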

A network among events. Standard data-analysis packages produce estimates of $\log(\hat{p}/(1-\hat{p}))$; see the equation defining the logistic regression model above. The identical logits may be computed in an alternative way by making use of the singular value decomposition of X given above. Making use of it, we may write, employing matrix multiplication:

$$(U U^{T})\, z^{*} = \log\left(\frac{\hat{p}}{1-\hat{p}}\right)$$

where U is from the above-mentioned singular value decomposition, and, with y containing the 0’s and 1’s of the observed dependent variable,

$$z^{*} = X\hat{\beta} + (y - \hat{p})$$

The z* are analogous to the pseudo-values of logistic regression (compare Breiger et al. 2011, p. 29).

The first equation in this section is important because it employs a network among the cases, $UU^{T}$, to transform observed to fitted values of the outcome variable.7 In order to produce the fitted logits, logistic regression (in effect) operates on a network of profile similarity among the cases. This network deserves serious study.
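A numerical check of this relation is sketched below on synthetic data. The reason it holds is that, at the maximum-likelihood estimate, the score equations give $X^{T}(y - \hat{p}) = 0$, so projecting $z^{*}$ onto the column space of X removes the residual term and leaves the fitted logits. As an assumption of this sketch, X includes a column of ones for the intercept so that the relation is exact; the data and generating coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 175

# Synthetic stand-in data; X includes an intercept column (an assumption of this sketch).
X = np.column_stack([np.ones(n), rng.integers(0, 2, size=(n, 3))]).astype(float)
eta_true = X @ np.array([-0.2, 0.7, -0.5, 0.4])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

beta = np.zeros(X.shape[1])
for _ in range(50):                                  # Newton-Raphson to convergence
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    beta = beta + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))

logit_hat = X @ beta                                 # fitted logits
p_hat = 1.0 / (1.0 + np.exp(-logit_hat))
z_star = logit_hat + (y - p_hat)                     # z* = X beta_hat + (y - p_hat)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
P = U @ U.T                                          # events-by-events projection ("hat") matrix

max_gap = np.max(np.abs(P @ z_star - logit_hat))     # numerically zero at convergence
```

Note that `P` is built from the predictors alone; the outcome enters only through `z_star`.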

The network we propose to study is cos(U), defined to have elements $P_{ij} / \sqrt{P_{ii} P_{jj}}$, where $P = UU^{T}$. Thus, cos(U) is a network of cosines among all the rows of matrix U; it gives us cosines among all pairs of the 175 events.
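The construction of cos(U) and of a thresholded network can be sketched as follows, again on synthetic data. The four binary predictors are invented for illustration, and the cutoff of +.60 is the one the paper adopts for its figures; the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 175

X = rng.integers(0, 2, size=(n, 4)).astype(float)    # synthetic binary predictors
Z = (X - X.mean(axis=0)) / X.std(axis=0)             # standardized, as in footnote 3

U, s, Vt = np.linalg.svd(Z, full_matrices=False)
P = U @ U.T
d = np.sqrt(np.diag(P))
C = P / np.outer(d, d)                               # C[i, j] = P_ij / sqrt(P_ii * P_jj)

cutoff = 0.60                                        # keep only "strong" ties
rows, cols = np.where(np.triu(C > cutoff, k=1))      # each pair once, no self-ties
edges = list(zip(rows.tolist(), cols.tolist()))
```

With binary predictors, events that share an identical profile have identical rows in U and therefore a cosine of 1, so groups of like-profiled events appear as fully connected clusters in the thresholded network.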

In order to gain visual clarity, we will report only the “strong” relations among the 175 events, defined as those with cosines above an arbitrarily chosen cutoff of +.60. Each of Figures 6, 7, and 8 shows (the identical) “outcome” network in its top panel, distinguishing between events of seeking to acquire CBRN materials (Type A) and possessing them (Type B events). Please recall that, by construction of the singular value decomposition of X, no information on the outcome variable was taken into account in the formulation of network $UU^{T}$. It is therefore remarkable that the (identical) top panel of each of these figures distinguishes very sharply a substantial number of clusters that are fairly uniformly of one event type or the other.

7 For this reason, this network appears in textbooks of mathematical statistics under the name Projection Matrix. Data analysts concerned with regression diagnostics will know it as the Hat Matrix.

FIGURES 6, 7, 8 ABOUT HERE

We have numbered the clusters in Figures 6 through 8 in the same order, and here we provide some overall comments about the organization of the network among events that (in a very direct sense) yields the estimated logits produced by the logistic regression model. Recall that the outcome variable, event type, distinguishes between seeking CBRN weapons (Type A) and possessing or using them (Type B). “Other region” refers to those outside of Russia and the NIS countries, South Asia, and the Middle East / North Africa. As the same event might entail multiple types of C, B, R, and/or N materials, we coded each event into a single type by emphasizing the less frequent category.8 Inside this network, we may observe the following (compare Figs. 6-8):

Cluster Number in Figs. 6-8    Characterization
1                              Type A, 1998-2001, Russia, chemical
2                              Type B, 1998-2006, other region, radiological & nuclear
3                              Type A, 2002-11 [post-9/11], Middle East, chemical
5                              other region, chemical
6                              other region, biological
9                              Type A, 2007-11, South Asia, chemical
10                             Mixed type, 2002-11, other region, radiological
11                             Type A, other region, chemical
12                             Type B, 1998-2006, other region, biological

As summarized just above, there is a great deal of patterning evident in Figs. 6-8; it is a patterning that organizes the production of estimated logits by the standard logistic regression model; however, it is a patterning that is invisible to conventional analyses using that model, because we are observing a network that exhibits no “average” relation between the variables.

Cluster 10 is of particular interest because it provides a network “bridge” between two clusters of events in which chemicals are the weapon of choice, whereas Cluster 10 itself consists strongly of events organized around radiological weapons (Figure 8-b). The events of Cluster 10 exist in “other” regions (beyond those emphasized in the regression model of this paper; Fig. 7-b) and have largely taken place in the years 2002-11 (Fig. 6-b).

8 Thus we coded the event as biological if it was also chemical; as radiological if it was also biological or chemical; and as nuclear no matter what other materials might be involved.

Events located in Russia and the NIS countries (Cluster 1) are strongly associated with the earliest period covered in this database (1998-2001), are strongly characterized by seeking rather than possessing weapons, and are centered around chemical weapons.

Identification of Cluster 2 seems a worthwhile result of our network procedures. Events in Cluster 2 are uniquely centered on radiological and nuclear materials, focused on possession (Type B) rather than seeking, and concentrated in the period 1998-2006. These events are strongly located in regions other than the three emphasized in our logistic regression model.

Events centered in the Middle East and North Africa (Cluster 3) are strongly associated with the entire post-9/11 period, are focused on seeking (Type A events), and center on chemical weapons.

Events in South Asia (Cluster 9) are highly interrelated (“dense”), concentrated in the most recent period (post-2006), focused on seeking and on chemical weapons.

Clusters 6 and 12 are strikingly strong in focusing on biological events. Both clusters concentrate in regions other than those emphasized in the regression modeling, but Cluster 6 was of mixed event type whereas Cluster 12 events tell a story of possession and/or use (Type B).

Discussion

This paper has pursued at length a line of criticism of logistic regression modeling that is also taken up (albeit in a much different though fascinating way) in the paper of Bear Braumoeller written for the same panel. Specifically, a purely additive specification in a logit (or related) regression model will not capture the intuition that a causal effect will vary depending on the values of other variables (Braumoeller 2013, p. 5). Moreover, as Braumoeller also points out, interaction terms in logistic regression models are problematic (in part) because the number of interactions required typically renders interpretation difficult or (often) impossible (Braumoeller 2013, p. 5). And, in a very particular and partial way, this paper attempts to address this problem by having the courage to move quantitative approaches more toward description of what Philip Schrodt, another panel speaker, refers to as the new events datasets that seem to require a more qualitative orientation emphasizing “thick description” (Schrodt 2012, p. 551) as well as more attention from the point of view of network analysis (Schrodt 2012, p. 556). To cite one example: a large number of (potential) statistical interactions are examined simultaneously by the decomposition of regression coefficients in Table 4 of this paper.

The basic premise of the modeling effort in this paper is that although regression modeling begins with a data matrix of dimension cases × variables, all standard modeling proceeds quickly to ignore the cases in order to focus on the variables. Moreover, there is a useful “dual” to the standard regression approach, one in which the usual regression coefficients are computed as sums across cases (and, hence, clusters of cases), an alternative in which new binary variables can be analyzed directly as the clusters that they are and used to discover statistical interactions among the variables, and an approach in which a network of profile similarity among the cases is seen to be implicated every time an analyst hits the “compute regression” button on any favorite statistical package.
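The sense in which a case-by-case network is "implicated" in any regression can be sketched numerically. The example below is a minimal illustration on synthetic data, not the POICN events, and it uses ordinary least squares rather than the logistic model of Table 3. It assumes that U denotes the left singular vectors of the design matrix, which is how we read the cos(U) notation in the figure captions; the variable names and the data are hypothetical. The sketch shows that the cases × cases matrix U Uᵀ reproduces the fitted values, and that cosines among the rows of U can be thresholded (here at .60, the cutoff used in Figures 6 to 8) to form a network among cases.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 12, 3
# Synthetic design matrix (intercept plus p predictors) and synthetic outcome.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = rng.normal(size=n)

# Thin SVD: X = U diag(s) Vt, with U of shape (n, p + 1).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# The case-by-case matrix U U^T is the regression "hat" matrix:
# it maps the observed outcome directly to the fitted values.
H = U @ U.T
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted_from_cases = H @ y
fitted_from_variables = X @ beta

# Cosine similarity between rows (cases) of U, thresholded to form a network.
norms = np.linalg.norm(U, axis=1, keepdims=True)
cosine = (U / norms) @ (U / norms).T
adjacency = (cosine > 0.60) & ~np.eye(n, dtype=bool)  # drop self-ties

print(np.allclose(fitted_from_cases, fitted_from_variables))
```

The two routes to the fitted values agree, which is the point of the "dual" reading: the variables-based computation and the cases-based computation are the same regression seen from two sides.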

In pursuing these modeling ideas, we had the superbly good fortune to be working with the most serious open-source database on CBRN events that has ever been constructed. Among the newly opened possibilities, we want to reconsider the results reported here (confined to the cases of highest credibility) with respect to their robustness across a larger number of at least moderately credible events. In ways that this paper has just begun to tap, the POICN database will have a huge impact on future scientific analysis of CBRN as an arena for conflict.

References

Ackerman, Gary A., and Lauren Pinson. 2011. “Speaking Truth to Sources: Introducing a Method for the Quantitative Evaluation of Open-Sources in Event Data.” College Park, MD: National Consortium for the Study of Terrorism and Responses to Terrorism (START Center): working paper.

Asal, V.H., and R.K. Rethemeyer. 2009. "Islamist Use and Pursuit of CBRN Terrorism." Pp. 335-358 in Jihadists and Weapons of Mass Destruction, edited by G. Ackerman and J. Tamsett. Boca Raton, FL and London: CRC Press.

Asal, Victor, and R. Karl Rethemeyer. 2008. "The Nature of the Beast: Organizational Structures and the Lethality of Terrorist Attacks." Journal of Politics 70 (2):437-449.

Braumoeller, Bear F. 2013. “The Anna Karenina Principle in International Relations.” Paper prepared for presentation at the Annual Meeting of the International Studies Association, San Francisco (April).

Breiger, R.L., G.A. Ackerman, V. Asal, D. Melamed, H.B. Milward, R.K. Rethemeyer, and E. Schoon. 2011. "Application of a Profile Similarity Methodology for Identifying Terrorist Groups that Use or Pursue CBRN Weapons." Pp. 26-33 in Social Computing, Behavioral-Cultural Modeling and Prediction, edited by J. Salerno, S.J. Yang, D. Nau, and S. Chai. Berlin; Heidelberg: Springer.

Breiger, R.L., and David Melamed. 2013. “The Duality of Organizations and their Attributes: Turning Regression Modeling ‘Inside Out.’” Research in the Sociology of Organizations: in press.

Hayden, N.K. 2009. "Terrifying Landscapes: Understanding Motivations of Non-State Actors to Acquire and/or Use Weapons of Mass Destruction." Pp. 163-194 in Unconventional Weapons and International Terrorism: Challenges and New Approaches, edited by M. Ranstorp and M. Normark. London and New York: Routledge.

Melamed, David, Ronald L. Breiger, and Eric Schoon. 2013. "The Duality of Clusters and Statistical Interactions." Sociological Methods & Research 42 (1):41-59.

Melamed, D., E. Schoon, R. Breiger, V. Asal, and R.K. Rethemeyer. 2012. "Using Organizational Similarity to Identify Statistical Interactions for Improving Situational Awareness of CBRN Activities." Pp. 61-68 in Social Computing, Behavioral-Cultural Modeling, and Prediction (Lecture Notes in Computer Science 7227), edited by S.J. Yang, A.M. Greenberg, and M. Endsley. Berlin; Heidelberg: Springer-Verlag.

Perliger, Arie, and Ami Pedahzur. 2011. "Social Network Analysis in the Study of Terrorism and Political Violence." PS: Political Science & Politics 44 (01):45-50.

Ragin, Charles C. 2008. Redesigning Social Inquiry: Fuzzy Sets and Beyond. Chicago and London: University of Chicago Press.

Sawyer, John, and Gary Ackerman. 2012. “Promethean Journeys: Examining the Mechanisms by Which Terrorists Acquire New Technologies of Lethality.” Paper presented at the Annual Meeting of the International Studies Association, San Diego.

Schrodt, Philip A. 2012. "Precedents, Progress, and Prospects in Political Event Data." International Interactions 38 (4):546-569.

Shalev, M. 2007. "Limits and Alternatives to Multiple Regression in Comparative Research." Pp. 261-308 in Comparative Social Research (Symposium on Methodology in Comparative Research), edited by L. Mjøset, and T.H. Clausen. Elsevier.

Figure 1. Levels of Event Credibility

Level 1: single source or multiple non-independent sources

Level 2: two independent sources

Level 3: 3+ independent sources, or 2 with competing bias

Figure 2. CBRN Terrorist Events by Event Type, 1990-2011

[Pie chart. Legend categories: Use of Agent; Attempted Use; Threat with Possession; Acquisition of a Weapon; Acquisition of an Agent; Attempted Acquisition; Plot; Protoplot. Slice labels as extracted: 45%, 3%, 6%, 11%, 15%, 8%, 9%, 3%.]

Figure 3. Event Uncertainty (a) and Attack Uncertainty (b)

[Two pie charts. Legend categories: No Uncertainty; Some Uncertainty; High Uncertainty. Slice labels as extracted: (a) Event Uncertainty: 78%, 16%, 6%; (b) Attack Uncertainty: 85%, 11%, 4%.]

Figure 4. CBRN Events by Agent Type, 1990-2011

[Bar chart titled "CBRN Terrorist Events by Agent Type, 1990-2011". Categories: Chemical, Biological, Radiological, Nuclear. Axis ticks: 0, 100, 200, 300, 400. Bar labels as extracted: 78.0%, 16.3%, 10.8%, 3.7%.]

Figure 5. The X-axis is the residuals of the “biological weapon” variable regressed linearly against all other predictor variables. The Y-axis is $\log\left(\hat{p} / (1 - \hat{p})\right)$. The linear regression line shown in the Figure has slope –.0893, identical to the (standardized) logistic regression coefficient for bioweapons in Table 3. The 175 events are depicted as open red squares (“low sophistication”) or as closed blue circles (“high sophistication”). We may compute

$$\frac{\sum_{s} X_s Y_s}{\sum_{t} X_t^{2}}$$

where subscript t indexes all 175 cases, and subscript s indexes cases in a subset. When s indexes the low-sophistication cases, the above expression equals –.1345. When s indexes the high-sophistication cases, the expression equals .0451. When s indexes all 175 cases (when s = t), the expression equals –.0893. These are the three numbers given for bioweapons in Table 4.
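The decomposition described in the Figure 5 caption can be checked on synthetic data. The sketch below is illustrative only: the data, the coefficient values, and the function name `partial_slopes` are hypothetical, and the grouping variable merely stands in for the low/high sophistication split. It residualizes one predictor on all the others, then computes the caption's ratio separately for each subset of cases. Because the fitted log-odds are an exact linear function of the predictors, the subset contributions sum to that predictor's coefficient.

```python
import numpy as np

def partial_slopes(X, eta, j, groups):
    """Split the coefficient of column j into per-group contributions.

    X      : (n, p) design matrix whose first column is the intercept
    eta    : (n,) fitted log-odds (linear predictor) for each case
    j      : index of the predictor of interest
    groups : (n,) integer labels assigning each case to a subset
    """
    others = np.delete(X, j, axis=1)
    # Residuals of predictor j after regressing it on all other columns.
    coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    r = X[:, j] - others @ coef
    denom = np.sum(r ** 2)  # sum over all cases t of X_t^2
    return {int(g): float(np.sum(r[groups == g] * eta[groups == g]) / denom)
            for g in np.unique(groups)}

# Synthetic illustration (not the POICN data).
rng = np.random.default_rng(0)
n = 175
X = np.column_stack([np.ones(n), rng.integers(0, 2, size=(n, 3))]).astype(float)
beta = np.array([-0.25, 2.2, -0.24, 1.96])  # hypothetical coefficients
eta = X @ beta                              # fitted log-odds
groups = rng.integers(0, 2, size=n)         # hypothetical two-way split of cases

parts = partial_slopes(X, eta, j=2, groups=groups)
total = sum(parts.values())
print(parts, total)  # the parts sum to beta[2]
```

Any partition of the cases could be passed as `groups`; the identity holds for clusters found by network methods just as it does for a two-way split.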

a) cos(U) > .60; events of Type A (red circles) and Type B (blue squares)

b) cos(U) > .60, events by Year: 1998-2001 (yellow), 2002-06 (light green), 2007-11 (dark green); circles for Type A, squares for Type B

Figure 6. cos(U) > .60, showing Event Type (a) and Year of occurrence (b)

[Network diagrams. Cluster labels shown: 1, 2, 3, 9, 11, 12.]

b) cos(U) > .60; events in Russia (blue), South Asia (green), Middle East (red)

Figure 7. cos(U) > .60, showing Event Type (a) and Region of occurrence (b)

[Network diagrams. Cluster labels shown: 1, 2, 3, 5, 6, 9, 11, 12.]

b) cos(U) > .60; CBRN type Chem (tan), Bio (green), Rad (blue), Nuc (red)

Figure 8. cos(U) > .60, showing Event Type (a) and CBRN material (b)

[Network diagrams. Cluster labels shown: 1, 2, 3, 5, 6, 9, 10, 11, 12.]

Table 1. The Two Broad Types (Seeking and Possessing) of CBRN Event*

Event Type A: Seeking a CBRN Weapon

Protoplot

The sources do not present evidence of an actual plot, but rather mention events that may lay the groundwork for an actual plot. Example: Knowledge of a terrorist group hiring a scientist with a CBRN weapons specialty.

Plot

The perpetrator(s) seriously considered acquiring and using CBRN materials as a weapon, but did not make an attempt to acquire the agent.

Attempted acquisition

There is evidence to suggest that the perpetrator(s) attempted to acquire a CBRN substance for use as a weapon, but no evidence of success. Includes the attempted (but unsuccessful or abandoned) acquisition of raw materials or an intact CBRN weapon. If the sole terrorist organization involved in an event is the intended recipient of an agent/weapon that was intercepted en route, the event may be coded as an attempted acquisition with the terrorist organization as the perpetrator.

Possession of a non-weaponized agent

The perpetrator(s) succeed in possessing a CBRN agent but this agent does not constitute a weapon (e.g., it is not in a deliverable form, lacking an effective delivery mechanism for the intended attack).

Event Type B: Possession or Use of a Weaponized Agent

Possession of a weapon

The perpetrator(s) possessed both the agent and delivery mechanism in a form that constitutes a viable weapon or can easily be assembled into such a weapon at the time of reporting. The completed weapon may be crude (e.g., radioactive material the perpetrator plans to leave in a building) if there was evidence that the perpetrator intended to use the weapon in this crude form.

Threat with possession

The perpetrator(s) both threatened to use a CBRN substance and actually had the weapon in their possession at the time of the threat.

Attempted use of agent

The perpetrator(s) attempted to employ or disseminate a CBRN substance but no agent was actually released.

Use of agent

The perpetrator(s) employed or disseminated a CBRN substance in the commission of an attack. If a small amount of agent was used, even if no harm was caused, it is coded as "use of agent" unless there is proof the event was not meant to cause harm.

*Note: Type A comprises 45.1%, and Type B 54.9%, of the 175 events we analyze.

Table 2. One-variable marginals, 2-variable relations with the Outcome variable

Outcome: Possession of a CBRN weapon (Type B)

| Variable | Fraction of events | Log odds | Log odds ratio |
| --- | --- | --- | --- |
| Overall | 96/175 = 55% | 0.195 | |
| Russia, NIS | 12/15 = 80% | 1.386 | 1.286 |
| Middle East | 16/27 = 59% | 0.375 | 0.212 |
| South Asia | 16/18 = 89% | 2.079 | 2.041 |
| biological weapon | 12/37 = 32% | -0.734 | -1.176 |
| chemical weapon | 83/123 = 67% | 0.730 | 1.829 |
| lone actor | 19/38 = 50% | 0.000 | -0.249 |
| religious extremists | 33/72 = 46% | -0.167 | -0.621 |
| cults | 1/4 = 25% | -1.099 | -1.322 |
| ethnonationalist | 16/25 = 64% | 0.575 | 0.442 |

Example of calculations: For "biological weapon," we have:

| Bioweapon? | Possession: No | Possession: Yes | Total |
| --- | --- | --- | --- |
| No | 54 | 84 | 138 |
| Yes | 25 | 12 | 37 |
| Total | 79 | 96 | 175 |

12 / (25 + 12) = 32%

log(12/25) = –.734

log((12/25) / (84/54)) = –1.176
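The example calculation for "biological weapon" can be reproduced in a few lines. The sketch below uses only the four cell counts from the 2 × 2 table above; the dictionary names are ours, chosen for illustration, and natural logarithms are assumed (they reproduce the reported values).

```python
import math

# Cell counts from the 2x2 table (rows: bioweapon no/yes; columns: possession no/yes).
no_bio = {"not_possess": 54, "possess": 84}
bio = {"not_possess": 25, "possess": 12}

# Fraction of bioweapon events that are Type B (possession).
fraction = bio["possess"] / (bio["possess"] + bio["not_possess"])

# Log odds of possession among bioweapon events.
log_odds = math.log(bio["possess"] / bio["not_possess"])

# Log odds ratio: bioweapon events compared with all other events.
log_odds_ratio = math.log(
    (bio["possess"] / bio["not_possess"])
    / (no_bio["possess"] / no_bio["not_possess"])
)

print(f"{fraction:.0%}  {log_odds:.3f}  {log_odds_ratio:.3f}")
# prints: 32%  -0.734  -1.176
```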

Table 3. The Logistic Regression Model (Outcome is whether each event is of Type B). Both metric coefficients and standardized coefficients are reported.

| Variable | Estimate (unstandardized) | Std. Error | t value | Pr(>\|t\|) | Estimate (standardized) |
| --- | --- | --- | --- | --- | --- |
| (Intercept) | -0.2524 | 0.5523 | -0.457 | 0.647653 | 0.4607 |
| Russia, NIS | 2.2042 | 0.951 | 2.318 | 0.020455 * | 0.6568 * |
| Middle East | 1.3508 | 0.6039 | 2.237 | 0.025291 * | 0.4430 * |
| South Asia | 3.0657 | 0.8907 | 3.442 | 0.000578 *** | 0.8364 *** |
| biological weapon | -0.2425 | 0.5105 | -0.475 | 0.634852 | -0.0893 |
| chemical weapon | 1.9579 | 0.4989 | 3.924 | 0.00009 *** | 0.8126 *** |
| lone actor | -0.5628 | 0.5797 | -0.971 | 0.331579 | -0.2003 |
| religious extremists | -2.5801 | 0.6379 | -4.044 | 0.00005 *** | -1.1068 *** |
| cults | -2.4908 | 1.2786 | -1.948 | 0.051409 . | -0.7273 . |
| ethnonationalist | -2.0678 | 0.828 | -2.497 | 0.012509 * | -0.8613 * |

Assessment of model fit:

| | value | df |
| --- | --- | --- |
| Null deviance | 240.95 | 174 |
| Residual deviance | 175.63 | 165 |
| Reduction | 65.32 | 9 |

Table 4. The usual regression coefficients (see the right-most column in Table 3) are sums across clusters of events!

| Variable | Low sophistication | High sophistication | Sum |
| --- | --- | --- | --- |
| (Intercept) | 0.5987 | -0.138 | 0.4607 |
| Russia, NIS | 0.5071 | 0.1498 | 0.6569 |
| Middle East | 0.2339 | 0.2091 | 0.4430 |
| South Asia | 0.6401 | 0.1962 | 0.8363 |
| biological weapon | -0.1345 | 0.0451 | -0.0894 |
| chemical weapon | 0.5656 | 0.2470 | 0.8126 |
| lone actor | -0.2579 | 0.0576 | -0.2003 |
| religious extremists | -0.8061 | -0.3007 | -1.1068 |
| cults | -0.5475 | -0.1798 | -0.7273 |
| ethnonationalist | -0.6984 | -0.1629 | -0.8613 |
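As a quick arithmetic check, the low- and high-sophistication contributions in Table 4 can be added and compared with the standardized coefficients in the right-most column of Table 3. The sketch below simply transcribes the published values; the dictionary layout and variable names are ours. Agreement is expected only up to rounding in the fourth decimal place.

```python
# (low, high) from Table 4 and the standardized estimate from Table 3.
rows = {
    "(Intercept)":          (0.5987, -0.138,   0.4607),
    "Russia, NIS":          (0.5071,  0.1498,  0.6568),
    "Middle East":          (0.2339,  0.2091,  0.4430),
    "South Asia":           (0.6401,  0.1962,  0.8364),
    "biological weapon":    (-0.1345, 0.0451, -0.0893),
    "chemical weapon":      (0.5656,  0.2470,  0.8126),
    "lone actor":           (-0.2579, 0.0576, -0.2003),
    "religious extremists": (-0.8061, -0.3007, -1.1068),
    "cults":                (-0.5475, -0.1798, -0.7273),
    "ethnonationalist":     (-0.6984, -0.1629, -0.8613),
}

# Largest absolute gap between (low + high) and the Table 3 coefficient.
max_gap = max(abs(low + high - std) for low, high, std in rows.values())
print(f"largest discrepancy: {max_gap:.4f}")
```

The discrepancy never exceeds one unit in the fourth decimal, consistent with the cluster contributions having been rounded separately before printing.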

Table 5. The model of Table 3 with the addition of a (chemical weapons * lone actor) interaction

| Variable | Estimate (unstandardized) | Std. Error | t value | Pr(>\|t\|) | Estimate (standardized) |
| --- | --- | --- | --- | --- | --- |
| (Intercept) | -0.9648 | 0.6927 | -1.393 | 0.16367 | 0.7887 |
| Russia, NIS | 2.3841 | 1.0672 | 2.234 | 0.02548 * | 0.7104 * |
| Middle East | 1.2187 | 0.6449 | 1.890 | 0.05880 . | 0.3997 . |
| South Asia | 3.1592 | 1.0025 | 3.151 | 0.00163 ** | 0.8619 ** |
| biological weapon | -0.6994 | 0.5615 | -1.246 | 0.21289 | -0.2577 |
| chemical weapon | 3.4065 | 0.7909 | 4.307 | 0.00002 *** | 1.4139 *** |
| lone actor | 1.2867 | 0.8520 | 1.510 | 0.13097 | 0.4580 |
| religious extremists | -3.086 | 0.7870 | -3.921 | 0.00009 *** | -1.3238 *** |
| cults | -3.1589 | 1.3744 | -2.298 | 0.02154 * | -0.9224 * |
| ethnonationalist | -2.5767 | 0.9622 | -2.678 | 0.00741 ** | -1.0733 ** |
| chem * lone actor | -3.4664 | 1.0624 | -3.263 | 0.00110 ** | -3.4664 ** |

Assessment of model fit:

| | value | df |
| --- | --- | --- |
| Null deviance | 240.95 | 174 |
| Residual deviance | 163.54 | 164 |
| Reduction | 77.41 | 10 |
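One standard way to read the fit statistics of Tables 3 and 5 together is as a likelihood-ratio comparison of nested models: the drop in residual deviance from adding the interaction is referred to a chi-square distribution with one degree of freedom. The sketch below carries out that arithmetic from the published deviances; this comparison is our illustration and is not reported in the tables themselves. It also adds the lone-actor and interaction coefficients to give the lone-actor effect for events involving a chemical agent. Variable names are ours.

```python
import math

dev_additive = 175.63      # residual deviance, Table 3 (165 df)
dev_interaction = 163.54   # residual deviance, Table 5 (164 df)

# Likelihood-ratio statistic for the single added interaction term.
lr_stat = dev_additive - dev_interaction
df = 165 - 164

# For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(lr_stat / 2))

# Unstandardized lone-actor coefficient when a chemical agent is involved
# (main effect plus interaction, from Table 5).
lone_actor_given_chem = 1.2867 + (-3.4664)

print(f"LR = {lr_stat:.2f} on {df} df, p = {p_value:.4f}")
print(f"lone actor effect given chemical agent: {lone_actor_given_chem:.4f}")
```

On these numbers the deviance falls by 12.09 for one additional parameter, and the lone-actor coefficient changes sign (from 1.2867 to -2.1797) once a chemical agent is involved, which is the kind of conditional effect an additive specification cannot express.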
