single world intervention graphs...

82
Single World Intervention Graphs (SWIGs): Unifying the Counterfactual and Graphical Approaches to Causality Thomas Richardson Department of Statistics University of Washington Joint work with James Robins (Harvard School of Public Health) Therme Vals Causal Workshop 5 Aug 2013

Upload: others

Post on 19-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Single World Intervention Graphs (SWIGs):Unifying the Counterfactual and Graphical Approaches to Causality

Thomas Richardson

Department of Statistics

University of Washington

Joint work with James Robins (Harvard School of Public Health)

Therme Vals Causal Workshop

5 Aug 2013

Page 2: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Outline

Brief review of counterfactualsA new unification of graphs and counterfactuals vianode-splitting

I Factorization and ModularityI Contrast with Twin Network approachI Some Examples and ExtensionsI Sequentially Randomized Experiments / Time Dependent

ConfoundingI Dynamic Regimes

Experimental Testability and Independence of Errors inNPSEMs

Thomas Richardson Therme Vals Workshop Slide 1

Page 3: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Counterfactualsaka Potential Outcomes

Thomas Richardson Therme Vals Workshop Slide 2

Page 4: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

The potential outcomes framework: philosophy

Hume (1748) An Enquiry Concerning Human Understanding:

We may define a cause to be an object followed by another, andwhere all the objects, similar to the first, are followed by objectssimilar to the second, . . .

. . . where, if the first object had not been the second never hadexisted.

Thomas Richardson Therme Vals Workshop Slide 3

Page 5: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

The potential outcomes framework: crop trials

Jerzy Neyman (1923):To compare v varieties [on m plots] we will consider numbers:

U11 . . . U1m...

...Uv1 . . . Uvm

Here Uij is the crop yield that would be observed if variety i wereplanted in plot j.

Physical constraints only allow one variety to be planted in a givenplot in any given growning season.

Popularized by Rubin (1974); sometimes called the ‘Rubin causal model’.

Thomas Richardson Therme Vals Workshop Slide 4

Page 6: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Potential outcomes with binary treatment

For binary treatment X and response Y, we define two potentialoutcome variables:

Y(x = 0): the value of Y that would be observed for a givenunit if assigned X = 0;

Y(x = 1): the value of Y that would be observed for a givenunit if assigned X = 1;

WIll also write these as Y(x0) and Y(x1).Implicit here is the assumption that these outcomes arewell-defined. Specifically:

I Only one version of treatment X = xI No interference between units (SUTVA).

Thomas Richardson Therme Vals Workshop Slide 5

Page 7: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Potential Outcomes

Unit Potential Outcomes ObservedY(x = 0) Y(x = 1) X Y

1 0 12 0 13 0 04 1 15 1 0

Thomas Richardson Therme Vals Workshop Slide 6

Page 8: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Drug Response ‘Types’:

In the simplest case where Y is a binary outcome we have thefollowing 4 types:

Y(x0) Y(x1) Name0 0 Never Recover0 1 Helped1 0 Hurt1 1 Always Recover

Thomas Richardson Therme Vals Workshop Slide 7

Page 9: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Assignment to Treatments

Unit Potential Outcomes ObservedY(x = 0) Y(x = 1) X Y

1 0 1 12 0 1 03 0 0 14 1 1 15 1 0 0

Thomas Richardson Therme Vals Workshop Slide 8

Page 10: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Observed Outcomes from Potential Outcomes

Unit Potential Outcomes ObservedY(x = 0) Y(x = 1) X Y

1 0 1 1 12 0 1 0 03 0 0 1 04 1 1 1 15 1 0 0 1

Thomas Richardson Therme Vals Workshop Slide 9

Page 11: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Potential Outcomes and Missing Data

Unit Potential Outcomes ObservedY(x = 0) Y(x = 1) X Y

1 ? 1 1 12 0 ? 0 03 ? 0 1 04 ? 1 1 15 1 ? 0 1

Thomas Richardson Therme Vals Workshop Slide 10

Page 12: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Average Causal Effect (ACE) of X on Y

ACE(X→ Y) ≡ E[Y(x1) − Y(x0)]

= p(Helped) − p(Hurt) ∈ [−1, 1]

Thus ACE(X→ Y) is the difference in % recovery ifeveryone treated (X = 1) vs. if noone treated (X = 0).

Thomas Richardson Therme Vals Workshop Slide 11

Page 13: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Identification of the ACE under randomization

If X is assigned randomly then

X ⊥⊥ Y(x0) and X ⊥⊥ Y(x1) (1)

hence

E[Y(x1) − Y(x0)] = E[Y(x1)] − E[Y(x0)]

= E[Y(x1) | X = 1] − E[Y(x0) | X = 0]

= E[Y | X = 1] − E[Y | X = 0].

Thus if (1) holds then ACE(X→ Y) is identified from P(X, Y).

Thomas Richardson Therme Vals Workshop Slide 12

Page 14: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Inference for the ACE without randomizationSuppose that we do not know that X ⊥⊥ Y(x0) and X ⊥⊥ Y(x1).What can be inferred?

X = 0 X = 1Placebo Drug

Y = 0 200 600Y = 1 800 400

What is:

The largest number of people who could be Helped?400 + 200

The smallest number of people who could be Hurt? 0

⇒ Max value of ACE: (200 + 400)/2000 − 0 = 0.3

Similar logic:

⇒ Min value of ACE: 0 − (600 + 800)/2000 = −0.7

Thomas Richardson Therme Vals Workshop Slide 13

Page 15: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Inference for the ACE without randomization

Suppose that we do not know that X ⊥⊥ Y(x0) and X ⊥⊥ Y(x1).

General case:

−(P(x=0,y=1) + P(x=1,y=0)) 6 ACE(X→ Y)

ACE(X→ Y) 6 P(x=0,y=0) + P(x=1,y=1)

⇒ Bounds will always cross zero.

⇒ X ⊥⊥ Y(x0) and X ⊥⊥ Y(x1) essential for drawing non-trivialcausal inferences.

Thomas Richardson Therme Vals Workshop Slide 14

Page 16: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Summary of Counterfactual Approach

In our observed data, for each unit one outcome will be‘actual’; the others will be ‘counterfactual’.

The potential outcome framework allowsCausation to be ‘reduced’ to Missing Data⇒ Conceptual progress!

The ACE is identified if X ⊥⊥ Y(xi) for all values xi.

Randomization of treatment assignment implies X ⊥⊥ Y(xi).

Ideas are central to Fisher’s Exact Test; also many parts ofexperimental design.

The framework is the basis of many practical causal dataanalyses published in Biostatistics, Econometrics andEpidemiology.

Thomas Richardson Therme Vals Workshop Slide 15

Page 17: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Relating Counterfactuals and Structural Equations

Potential outcomes can be seen as a different notation forNon-Parametric Structural Equation Models (NPSEMs):

Example: X→ Y.

NPSEM formulation: Y = f(X, εY)

Potential outcome formulation: Y(x) = f(x, εY)

Two important caveats:

NPSEMs typically assume all variables are seen as beingsubject to well-defined interventions (not so with potentialoutcomes)

Pearl associates NPSEMs with Independent Errors(NPSEM-IEs) with DAGs (more on this later).

Thomas Richardson Therme Vals Workshop Slide 16

Page 18: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Relating Counterfactuals and ‘do’ notation

Expressions in terms of ‘do’ can be expressed in terms ofcounterfactuals:

P(Y(x) = y) ≡ P(Y = y | do(X = x))

but counterfactual notation is more general. Ex. Distribution ofoutcomes that would arise among those who took treatment(X = 1) had counter-to-fact they not received treatment:

P(Y(x = 0) = y | X = 1)

If treatment is randomized, so X ⊥⊥ Y(x = 0) then this equalsP(Y(x = 0) = y), but in an observational study these may bedifferent.

Thomas Richardson Therme Vals Workshop Slide 17

Page 19: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Graphs

Thomas Richardson Therme Vals Workshop Slide 18

Page 20: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Recap: Graphical Approach to Causality

X Y

No Confounding

X

H

Y

Confounding

Unobserved

Graph intended to represent direct causal relations.

Convention that confounding variables (e.g. H) are always includedon the graph.

Approach originates in the path diagrams introduced by SewallWright in the 1920s.

If X→ Y then X is said to be a parent of Y; Y is child of X.

Thomas Richardson Therme Vals Workshop Slide 19

Page 21: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Edges are directed, but are they causal?

X Y

P(X, Y) = P(X)P(Y | X)

No Confounding

X Y

P(X, Y) = P(Y)P(X | Y)

No Confounding

Neither factorization places any restriction on P(X, Y).

Thomas Richardson Therme Vals Workshop Slide 20

Page 22: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Linking the two approaches

X Y

X ⊥⊥ Y(x0) & X ⊥⊥ Y(x1)

X

H

Y

X 6⊥⊥ Y(x0) & X 6⊥⊥ Y(x1)

Unobserved

Elephant in the room:The variables Y(x0) and Y(x1) do not appear on thesegraphs!!

Thomas Richardson Therme Vals Workshop Slide 21

Page 23: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Node splitting: Setting X to 0

X Y

P(X= x, Y= y) = P(X= x)P(Y= y | X= x)

⇒ X x = 0 Y(x = 0)

Can now ‘read’ the independence: X ⊥⊥ Y(x=0).Also associate a new factorization:

P (X= x, Y(x=0)= y) = P(X= x)P (Y(x=0)= y)

where:P (Y(x=0)= y) = P(Y= y |X=0).

This last equation links a term in the original factorization to thenew factorization. We term this the ‘modularity assumption’.

Thomas Richardson Therme Vals Workshop Slide 22

Page 24: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Node splitting: Setting X to 1

X Y

P(X= x, Y= y) = P(X= x)P(Y= y | X= x)

⇒ X x = 1 Y(x = 1)

Can now ‘read’ the independence: X ⊥⊥ Y(x=1).Also associate a new factorization:

P (X= x, Y(x=1)= y) = P(X= x)P (Y(x=1)= y)

where:P (Y(x=1)=y) = P(Y=y |X=1).

Thomas Richardson Therme Vals Workshop Slide 23

Page 25: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Crucial point: Y(x=0) and Y(x=1) are never on the same graph.Although we have:

X ⊥⊥ Y(x=0) and X ⊥⊥ Y(x=1)

we do not haveX ⊥⊥ Y(x=0), Y(x=1)

Had we tried to construct a single graph containing both Y(x=0)and Y(x=1) this would have been impossible. (Why?)

⇒ Single-World Intervention Graphs (SWIGs).

Thomas Richardson Therme Vals Workshop Slide 24

Page 26: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Representing both graphs via a ‘template’

X Y

G

⇒ X x Y(x)

G(x)

Represent both graphs via a template:

Formally this is a ‘graph valued function’:

Takes as input a specific value x∗

Returns as output a SWIG G(x∗).

Each instantiation of the template is a SWIG G(x∗) that representsa different margin: P(X, Y(x∗)) with red nodes x∗ becomingconstants.

Thomas Richardson Therme Vals Workshop Slide 25

Page 27: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Intuition behind node splitting:(Robins, VanderWeele, Richardson 2007)

Q: How could we identify whether someone would choose to taketreatment, i.e. have X = 1, and at the same time find out whathappens to such a person if they don’t take treatment Y(x = 0)?

A: Consider an experiment in which, whenever a patient isobserved to swallow the drug have X = 1, we instantly interveneby administering a safe ‘emetic’ that causes the pill to beregurgitated before any drug can enter the bloodstream.Since we assume the emetic has no side effects, the patient’srecorded outcome is then Y(x = 0).

Thomas Richardson Therme Vals Workshop Slide 26

Page 28: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Harder Inferential problem

X0 Z

H X1

Y

Query: does this causal graph imply?

Y(x0, x1) ⊥⊥ X1(x0) | Z(x0),X0,

Thomas Richardson Therme Vals Workshop Slide 27

Page 29: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Simple solution

X0 Z

H X1

Y

X0

x0

Z(x0)

H

X1(x0)

x1

Y(x0, x1)

Query does this graph imply:

Y(x0, x1) ⊥⊥ X1(x0) | Z(x0),X0 ?

Answer: Yes – applying d-separation to the SWIG on the right wesee that there is no d-connecting path from Y(x0, x1) given Z(x0).More on this shortly...

Thomas Richardson Therme Vals Workshop Slide 28

Page 30: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Single World Intervention Template Construction (1)Given a graph G, a subset of vertices A = {A1, . . . ,Ak} to be intervenedon, we form G(a) in two steps:

(1) (Node splitting): For every A ∈ A split the node into a randomnode A and a fixed node a:

A

· · ·

· · ·

⇒ A

a

Splitting: Schematic Illustrating the Splitting of Node A

The random half inherits all edges directed into A in G;

The fixed half inherits all edges directed out of A in G.

Thomas Richardson Therme Vals Workshop Slide 29

Page 31: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Single World Intervention Template Construction (2)

(2) Relabel descendants of fixed nodes:

a ⇒A

B C

D

FE

X

T

Y

Z

· · ·

· · ·

· · ·

· · ·

a

A(. . .)

B(a, . . .) C(a, . . .)

D(a, . . .)

F(a, . . .)E(a, . . .)

X(. . .)

T(. . .)

Y(. . .)

Z(. . .)

· · ·

· · ·

· · ·

· · ·

Thomas Richardson Therme Vals Workshop Slide 30

Page 32: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Single World Intervention Graph

A Single World Intervention Graph (SWIG) G(a∗) is obtained fromthe Template G(a) by simply substituting specific values a∗ for thevariables a in G(a);

For example, we replace G(x) with G(x=0).

Changing the value of a fixed variable corresponds toconstructing a new graph and considering a differentpopulation, e.g. P(X, Y(x=0)) vs. P(X, Y(x=1))

It is only the instantiated graph G(x) that represents P(V(x)),not the template G(x).

Thomas Richardson Therme Vals Workshop Slide 31

Page 33: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Factorization and Modularity

Factorization: P(V(a)) over the counterfactual variables in G(a)factorizes with respect to G(a) (ignoring fixed nodes):

P (V(a)) =∏

Y(a)∈V(a)P(Y(a)

∣∣∣ paG(a)(Y(a)) \ a)

.

Modularity: P(V(a)) and P(V) are linked as follows:

P(Y(a)=y

∣∣∣ (paG(a)(Y(a)) \ a)= q

)= P

(Y=y

∣∣∣ (paG(Y) \A)= q,

(paG(Y) ∩A

)= apaG(Y)∩A

),

So the conditional density associated with Y(aY) in G(a) is just theconditional density associated with Y in G after substituting ai for anyAi ∈ A that is a parent of Y.

Thomas Richardson Therme Vals Workshop Slide 32

Page 34: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Applying d-separation to the graph G(a)

Counterfactual conditional independence relations may beobtained from the transformed graph by applying d-separation afteradding fixed nodes to the conditioning set:

Given disjoint subsets B(a), C(a) and D(a) of random vertices(where D(a) may be empty),

if B(a) is d-separated from C(a) given D(a) ∪ a in G(a) (2)

then B(a) ⊥⊥ C(a) | D(a) [P(V(a))].

In words, if in G(a) two subsets B(a) and C(a) of random nodes ared-separated by D(a) in conjunction with the fixed nodes a, then B(a) andC(a) are conditionally independent given D(a) in the associateddistribution P(V(a)).

Thomas Richardson Therme Vals Workshop Slide 33

Page 35: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Conditioning on fixed variables a

intuitive since these are fixed constants in the SWIG

Since vertices in a have no parents, no new paths d-connectdue to also conditioning on a.

⇒ If a d-separation holds in G(a) without conditioning on thefixed nodes, then it will continue to hold if we also condition onfixed nodes.

An alternative is simply to restrict attention to paths that donot contain fixed vertices,

e.g. remove fixed nodes from the graph before checkingd-separation.

Thomas Richardson Therme Vals Workshop Slide 34

Page 36: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Mediation graph (I)Intervention on Z alone.

X YZ X(z)Z z Y(z)⇒

factorization:

P(Z,X(z), Y(z)) = P(Z)P(X(z))P(Y(z) | X(z))

modularity:

P(X(z)=x) = P(X=x | Z= z),

P(Y(z)=y | X(z)=x) = P(Y=y | X=x,Z= z).

d-separation gives:Z ⊥⊥ X(z), Y(z).

Thomas Richardson Therme Vals Workshop Slide 35

Page 37: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Mediation graph (II)Intervention on Z and X:

X YZ X(z) xZ z Y(x, z)⇒

factorization:

P(Z,X(z), Y(x, z)) = P(Z)P(X(z))P(Y(x, z))

modularity:

P(X(z)=x) = P(X=x | Z= z),

P(Y(x, z)=y) = P(Y=y | X= x,Z= z).

d-separation gives:

Z ⊥⊥ X(z) ⊥⊥ Y(x, z)

Thomas Richardson Therme Vals Workshop Slide 36

Page 38: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

No direct effect graph

X YZ X(z) xZ z Y(x)⇒

factorization:

P(Z,X(z), Y(x)) = P(Z)P(X(z))P(Y(x))

modularity:

P(X(z)=x) = P(X=x | Z= z),

P(Y(x)=y) = P(Y=y | X= x).

d-separation gives:

Z ⊥⊥ X(z) ⊥⊥ Y(x)

Thomas Richardson Therme Vals Workshop Slide 37

Page 39: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Inferential Problem (II)

X0 Z

H X1

Y

X0

x0

Z(x0)

H

X1(x0)

x1

Y(x0, x1)

Pearl (2009), Ex. 11.3.3, claims the causal DAG above does not imply:

Y(x0, x1) ⊥⊥ X1 | Z,X0 = x0. (3)

The SWIG shows that (3) does hold; Pearl is incorrect.Specifically, we see from the SWIG:

Y(x0, x1) ⊥⊥ X1(x0) | Z(x0),X0, (4)

⇒ Y(x0, x1) ⊥⊥ X1(x0) | Z(x0),X0 = x0. (5)

This last condition is then equivalent to (3) via consistency.(Pearl infers a claim of Robins is false since if true then (3) would hold).

Thomas Richardson Therme Vals Workshop Slide 38

Page 40: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Pearl’s twin network for the same problem

X0

Z

H

X1

Y

X0

Z

H

X1

Y

x0

Z(x0, x1)

H(x0, x1)

x1

Y(x0, x1)

UZ

UH

UY

The twin network fails to reveal that Y(x0, x1) ⊥⊥ X1 | Z,X0 = x0.This ‘extra’ independence holds in spite of d-connection because (byconsistency) when X0 = x0, then Z = Z(x0) = Z(x0, x1).Note that Y(x0, x1) 6⊥⊥ X1 | Z,X0 6= x0.

Shpitser & Pearl (2008) introduce a pre-processing step to address this.

Thomas Richardson Therme Vals Workshop Slide 39

Page 41: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Confounding Revisited

X x Y(x)

H

Here we can read directly from the template that X 6⊥⊥ Y(x) sincethere is a path:

X← H→ Y(x).

Thomas Richardson Therme Vals Workshop Slide 40

Page 42: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Adjusting for confounding

X Y

L

X x Y(x)

L

Here we can read directly from the template that

X ⊥⊥ Y(x) | L.

It follows that:

P(Y(x)=y) =∑l

P(Y=y | L= l,X= x)P(L= l). (6)

Thomas Richardson Therme Vals Workshop Slide 41

Page 43: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Contrast with approach via removing edges

X Y

L

X x Y(x)

L

X Y

L

This ‘explains’ why L is sufficient to control confounding under thenull (where X has no effect on Y) but not under the alternative.

Thomas Richardson Therme Vals Workshop Slide 42

Page 44: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Adjusting for confounding

X Y

L

X x Y(x)

L

X ⊥⊥ Y(x) | L.

Proof of identification:

P[Y(x) = y] =∑l

P[Y(x) = y | L = l]P(L = l)

=∑l

P[Y(x) = y | L = l,X = x]P(L = l) indep

=∑l

P[Y = y | L = l,X = x]P(L = l) modularity

Thomas Richardson Therme Vals Workshop Slide 43

Page 45: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

More Examples (I)

X Y

L

H

(a-i)

X x Y(x)

L

H

(a-ii)

Here we can read directly from the template that

X ⊥⊥ Y(x) | L.

Thomas Richardson Therme Vals Workshop Slide 44

Page 46: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

More Examples (II)

X Y

L

H

(b-i)

X x Y(x)

L

H

(b-ii)

Here we can read directly from the template that

X ⊥⊥ Y(x) | L.

Thomas Richardson Therme Vals Workshop Slide 45

Page 47: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Sequentially randomized experiment (I)

A B C D

H

A and C are treatments;

H is unobserved;

B is a time varying confounder;

D is the final response;

Treatment C is assigned randomly conditional on the observedhistory, A and B;

Want to know P(D(a, c)).

Thomas Richardson Therme Vals Workshop Slide 46

Page 48: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Sequentially randomized experiment (I)

A B C D

H

If the following holds:

A ⊥⊥ D(a, c)

C(a) ⊥⊥ D(a, c) | B(a),A

General result of Robins (1986) then implies:

P(D(a, c)=d) =∑b

P(B=b | A= a)P(D=d | A= a,B=b,C= c).

Does it??

Thomas Richardson Therme Vals Workshop Slide 47

Page 49: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Sequentially randomized experiment (II)

A a B(a) C(a) c D(a, c)

H

d-separation:

A ⊥⊥ D(a, c)

C(a) ⊥⊥ D(a, c) | B(a),A

General result of Robins (1986) then implies:

P(D(a, c)=d) =∑b

P(B=b | A= a)P(D=d | A= a,B=b,C= c).

Thomas Richardson Therme Vals Workshop Slide 48

Page 50: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Multi-network approach

A B C D

H

UHUB

UC

UD

a B(a) C(a) D(a)

H(a)

a B(a, c) c D(a, c)

H(a, c)

Thomas Richardson Therme Vals Workshop Slide 49

Page 51: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Another example

A B C D

H2H1

A ⊥⊥ D(a, c)

C(a) ⊥⊥ D(a, c) | B(a),A

General result of Robins (1986) then implies:

P(D(a, c)=d) =∑b

P(B=b | A= a)P(D=d | A= a,B=b,C= c).

Does it??

Thomas Richardson Therme Vals Workshop Slide 50

Page 52: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Another example

A B C D

H2H1

A

aB(a)

C(a)

cD(a, c)

H1 H2

A ⊥⊥ D(a, c)

C(a) ⊥⊥ D(a, c) | B(a),A

General result of Robins (1986) then implies:

P(D(a, c)=d) =∑b

P(B=b | A= a)P(D=d | A= a,B=b,C= c).

Thomas Richardson Therme Vals Workshop Slide 51

Page 53: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

General result (Robins, 1986)Observed data:

O ≡ 〈L1,A1, . . . ,LK,AK, Y〉.

If the following holds for k = 1, . . . ,K

Y(a†) ⊥⊥ Ak(a†) | Lk(a

†),Ak−1(a†); (7)

then (under positivity):

P(Y(a†)=y |Lj(a†) = lj,Aj−1(a

†) = a†j−1)

=∑

lm+1,...,lK

p(y|lK, a†K)K∏

j=m+1

p(lj|lj−1, a†j−1). (8)

Here Aj−1(a†) ≡ 〈A1, . . . ,Aj−1(a

†j−2)〉, similarly for Lj−1(a

†).The RHS of (8) is referred to as the ‘g-formula’.

Thomas Richardson Therme Vals Workshop Slide 52

Page 54: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Dynamic regimes

A dynamic regime g is a policy that assigns treatment (usuallyat multiple time points) on the basis of past history;

Including conditional on the ‘natural’ value of treatment in theabsence of an intervention;

Exercise for as long as you would have done withoutintervention or twenty minutes, whichever is more.

See Young et al. (2012) for additional analysis.

Thomas Richardson Therme Vals Workshop Slide 53

Page 55: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Dynamic regimes

A1

a1

L(a1) A2(a1)

a2

Y(a1,a2)

H2H1

A1

A+1 (g)

L(g) A2(g)

A+2 (g)

Y(g)

H2H1

P(Y(g)) is identified.Thomas Richardson Therme Vals Workshop Slide 54

Page 56: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Dynamic regimes

A1

a1

L(a1) A2(a1)

a2

Y(a1,a2)

H2H1

A1

A+1 (g)

L(g) A2(g)

A+2 (g)

Y(g)

H2H1

P(Y(g)) is not identified.Thomas Richardson Therme Vals Workshop Slide 55

Page 57: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Joint Independence

We saw earlier that the causal DAG X→ Y implied:

X ⊥⊥ Y(x0) and X ⊥⊥ Y(x1)

However, joint independence relations such as:

X ⊥⊥ Y(x0), Y(x1)

never follow from our SWIG transformation:There is no way via node-splitting to construct a graph with bothY(x0), and Y(x1).This has important consequences for the identification of directeffects.

Thomas Richardson Therme Vals Workshop Slide 56

Page 58: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Assuming Independent Errors and

Cross-World Independence

Thomas Richardson Therme Vals Workshop Slide 57

Page 59: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Mediation graphIntervention on X and M:

M YX M(x) mX x Y(x, m)⇒

d-separation gives:

X ⊥⊥ M(x) ⊥⊥ Y(x, m)

Pearl associates additional independence relations with this graph

Y(x1,m) ⊥⊥ M(x0),X

Y(x0,m) ⊥⊥ M(x1),X

equivalent to assuming independent errors, εX ⊥⊥ εM ⊥⊥ εY .Thomas Richardson Therme Vals Workshop Slide 58

Page 60: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Pure Direct Effect

Pure (aka Natural) Direct Effect (PDE): Change in Y had X beendifferent, but M fixed at the value it would have taken had X notbeen changed:

PDE ≡ Y(x1,M(x0)) − Y(x0,M(x0)).

Legal motivation [from Pearl (2000)]:

“The central question in any employment-discrimination case is whetherthe employer would have taken the same action had the employee beenof a different race (age, sex, religion, national origin etc.) and everythingelse had been the same.” (Carson versus Bethlehem Steel Corp., 70FEP Cases 921, 7th Cir. (1996)).

Thomas Richardson Therme Vals Workshop Slide 59

Page 61: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Decomposition

PDE also allows non-parametric decomposition of Total Effect(ACE) into direct (PDE) and indirect (TIE) pieces.

PDE ≡ E [Y(1,M(0))] − E [Y(0)]

TIE ≡ E [Y (1,M(1)) − Y (1,M(0))]

TIE+ PDE ≡ E [Y(1)] − E [Y(0)] ≡ ACE(X→ Y)

Thomas Richardson Therme Vals Workshop Slide 60

Page 62: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Pearl’s identification claim

Pearl and others claim that under “no confounding” the PDE isidentified by the following mediation formula:

PDEmed(m) =∑m

[E[Y|x1,m] − E[Y|x0,m]]P(m|x0)

Thomas Richardson Therme Vals Workshop Slide 61

Page 63: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Critique of PDE: Hypothetical Case Study

Observational data on three variables:

X- treatment: cigarette cessation

M intermediate: blood pressure at 1 year, high or low

Y outcome: say CHD by 2 years

Observed data (X,M, Y) on each of n subjects.

All binary

X randomly assigned

Thomas Richardson Therme Vals Workshop Slide 62

Page 64: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Hypothetical Study (I): X randomized

Y = 0 Y = 1 Total P(Y=1 |m, x)

M = 0 1500 500 2000 0.25X = 0

M = 1 1200 800 2000 0.40

M = 0 948 252 1200 0.21X = 1

M = 1 1568 1232 2800 0.44

A researcher, Prof H wishes to apply the mediation formula toestimate the PDE.Prof H believes that there is no confounding, so that Pearl’sNPSEM-IE holds, but his post-doc, Dr L is skeptical.

Thomas Richardson Therme Vals Workshop Slide 63

Page 65: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Hypothetical Study (II): X and M Randomized

To try to address Dr L’s concerns, Prof H carries out animalintervention studies.

Y = 0 Y = 1 Total P(Y(m, x)=1)M = 0 750 250 1000 0.25

X = 0M = 1 600 400 1000 0.40

M = 0 790 210 1000 0.21X = 1

M = 1 560 440 1000 0.44

As we see: P(Y(m, x)=1) = P(Y=1 |m, x);Prof H is now convinced: ‘What other experiment could I do ?’

He applies the mediation formula, yielding PDEmed

= 0.Conclusion: No direct effect of X on Y.

Thomas Richardson Therme Vals Workshop Slide 64

Page 66: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Failure of the mediation formula

Under the true generating process, the true value of the PDE is:

PDE = 0.153 6= PDEmed

= 0

Prof H’s conclusion was completely wrong!

Thomas Richardson Therme Vals Workshop Slide 65

Page 67: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Why did the mediation formula go wrong?

Dr L was right – there was a confounder:

M YX

H

but. . . it had a special structure so that:

Y ⊥⊥ H |M,X = 0 and M ⊥⊥ H | X = 1

Thomas Richardson Therme Vals Workshop Slide 66

Page 68: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Why did the mediation formula go wrong?

Dr L was right – there was a confounder:

M YX

H

but. . . it had a special structure so that:

Y ⊥⊥ H |M,X = 0 and M ⊥⊥ H | X = 1

M Y

HX = 0

M Y

HX = 1

The confounding undetectable by any intervention on X and/or M.

Pearl: Onus is on the researcher to be sure there is no confounding.Causation should precede intervention.

Thomas Richardson Therme Vals Workshop Slide 67

Page 69: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

PDE identification cannot be checked via experimentIf our only interventions are on the variables X and M then wecannot do an experiment to learn the PDE.

We could learn E [Y{x = 1,M(x = 0)}] by intervention if we couldI intervene and set X to 0 and observe M(0),I then return each subject to their pre-intervention state,I finally intervene to set X to 1 and M to M(0) and observeY(1,M(0)).

Such an intervention strategy will usually not exist because notpossible in a real-world intervention (e.g., suppose the outcome Ywere death).

Because we cannot observe the same subject under both X = 1and X = 0 (i.e. ”across worlds”,) no intervention will allow us tolearn the distribution of mixed counterfactuals such asY{x = 1,M(x = 0)} :

(In the story Dr L had to introduce a new node on the graph in order tocheck the value of the PDE via an experiment.)

Thomas Richardson Therme Vals Workshop Slide 68

Page 70: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Summary of critique of Independent Error AssumptionThe independent error assumption cannot be checked by anyrandomized experiment on the variables in the graph.

⇒ Connection between experimental interventions and potentialoutcomes, established by Neyman has been severed;

⇒ Theories in Social and Medical sciences are not detailed enough tosupport the independent error assumption.

What about faithfulness and causal discovery procedures?

Such inferences are explicit that they rely on faithfulness, and aredesigned to guide hypothesis formation;

I Contrast: In Pearl’s NPSEM-IE approach the simple act ofusing a DAG is viewed as automatically committing you tomaking this untestable hypothesis.

Predictions (possibly derived assuming faithfulness) regardingintervention distributions P(Y(x)) = P(Y | do(x)) can be tested byrandomized experiments.

Thomas Richardson Therme Vals Workshop Slide 69

Page 71: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

How many experimentally untestable assumptions?Assumption of independent errors implies super-exponentiallymany ‘cross-world’ counterfactual independence assumptions:

No. Actual Vars. 2 3 4 K

Dim. P(V) 3 7 15 2K − 1No. Cnterfactual Vars. 3 7 15 2K − 1

Dim. Cnterfactual Dist. 7 127 32767 2(2K−1) − 1

Dim. SWIG 5 113 32697 (2(2K−1) − 1) −∑K−1

j=1 (4j − 2j)

Dim. NPSEM-IE 4 19 274∑K−1

j=0 (22j

− 1)

No. untestable indep. 1 94 32423 O(22K−2)constrnts in NPSEM-IE

Table: Dimensions of counterfactual models associated with completegraphs with binary variables.

Thomas Richardson Therme Vals Workshop Slide 70

Page 72: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

SWIG Completeness Conjecture

In an NPSEM we define a counterfactual independence to belogical if it holds regardless of the distribution overcounterfactuals (equivalently error terms) e.g. for binary X

Y(x0) ⊥⊥ Y(x1) | X, Y

Completeness Conjecture There exists a distribution overcounterfactuals that is experimentally indistinguishable fromthe NPSEM that assumes independent errors but in which theonly non-logical independencies are those that may bederived from the SWIG.

Thomas Richardson Therme Vals Workshop Slide 71

Page 73: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Summary

A simple approach to unifying graphs and counterfactuals vianode-splitting

The approach works via linking the factorizations associatedwith the two graphs

The approach provides a language that allows counterfactualand graphical people to communicate

The approach leads to many fewer untestable independenceassumptions than in the NPSEM-IE approach of Pearl.

The approach also provides a way to combine information onthe absence of individual and population level direct effects.

Thomas Richardson Therme Vals Workshop Slide 72

Page 74: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Thank You!

Thomas Richardson Therme Vals Workshop Slide 73

Page 75: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

ReferencesPearl, J. Causality (Second ed.). Cambridge, UK: CambridgeUniversity Press, 2009.

Richardson, TS, Robins, JM. Single World Intervention Graphs.CSSS Technical Report No. 128http://www.csss.washington.edu/Papers/wp128.pdf, 2013.

Robins, JM A new approach to causal inference in mortality studieswith sustained exposure periods applications to control of thehealthy worker survivor effect. Mathematical Modeling 7,1393–1512, 1986.

Robins, JM, VanderWeele, TJ, Richardson TS. Discussion of“Causal effects in the presence of non compliance a latent variableinterpretation by Forcina, A. Metron LXIV (3), 288–298, 2007.

Shpitser, I, Pearl, J. What counterfactuals can be tested. Journal ofMachine Learning Research 9, 1941–1979, 2008.

Spirtes, P, Glymour, C, Scheines R. Causation, Prediction andSearch. Lecture Notes in Statistics 81, Springer-Verlag.

Thomas Richardson Therme Vals Workshop Slide 74

Page 76: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Details on Pearl’s ErrorPearl correctly states that using his Twin Network method (next slide) itmay be shown that

Y(x0, x1) is not independent of X1, given Z and X0.

However, he then goes on to say (incorrectly):

In the twin network model there is a d-connected path fromX1 to Y(x0, x1). . . Therefore, [(3)] is not satisfied for Y(x0, x1)and X1.

[Ex. 11.3.3, p.353]This is actually incorrect in two ways:

Y(x0, x1) 6⊥⊥ X1 | Z,X0 does not imply Y(x0, x1) 6⊥⊥ X1 | Z,X0=x0

d-separation is not complete for Twin Networks so the presence of ad-connected path does not imply that an independence is notimplied.

Thomas Richardson Therme Vals Workshop Slide 75

Page 77: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

T

X

Z

Y

T

X

Z

Y

UT

UX

UZ

UY

T(z)

X(z)

z

Y(z)

T

X

Z

z

Y(z)

T and Y(z) are d-connected given X in the twin-network, but in spite ofthis T ⊥⊥ Y(z) | X under the associated NPSEM-IE because X(z) = X,and T and Y(z) are d-separated given X in the twin-network.

Thomas Richardson Therme Vals Workshop Slide 76

Page 78: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Mediation graph (I)Intervention on Z alone.

X YZ X(z)Z z Y(z)⇒

factorization:

P(Z,X(z), Y(z)) = P(Z)P(X(z))P(Y(z) | X(z))

modularity:

P(X(z)=x) = P(X=x | Z= z),

P(Y(z)=y | X(z)=x) = P(Y=y | X=x,Z= z).

d-separation gives:Z ⊥⊥ X(z), Y(z).

Thomas Richardson Therme Vals Workshop Slide 77

Page 79: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Mediation graph (II)Intervention on Z and X:

X YZ X(z) xZ z Y(x, z)⇒

factorization:

P(Z,X(z), Y(x, z)) = P(Z)P(X(z))P(Y(x, z))

modularity:

P(X(z)=x) = P(X=x | Z= z),

P(Y(x, z)=y) = P(Y=y | X= x,Z= z).

d-separation gives:

Z ⊥⊥ X(z) ⊥⊥ Y(x, z)

Thomas Richardson Therme Vals Workshop Slide 78

Page 80: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Importance of fixed nodesCompare:

X YZ X(z)Z z Y(z)⇒

P(Z,X(z), Y(z)) = P(Z)P(X(z))P(Y(z) | X(z))

P(Y(z)=y | X(z) = x) = P(Y=y | X=x,Z= z).

versus

X YZ X(z)Z z Y(z)⇒

P(Z,X(z), Y(z)) = P(Z)P(X(z))P(Y(z) | X(z)),

P(Y(z)=y | X(z) = x) = P(Y=y | X=x)

Thomas Richardson Therme Vals Workshop Slide 79

Page 81: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

Importance of fixed nodes: leaving them out causesproblems!

X YZ X(z)Z Y(z)⇒

P(Z,X(z), Y(z)) = P(Z)P(X(z))P(Y(z) | X(z))

P(Y(z)=y | X(z) = x) = P(Y=y | X=x,Z= z).

versus

X YZ X(z)Z Y(z)⇒

P(Z,X(z), Y(z)) = P(Z)P(X(z))P(Y(z) | X(z)),

P(Y(z)=y | X(z) = x) = P(Y=y | X=x)

Red nodes are needed in order to read off modularity property from G(a).

Thomas Richardson Therme Vals Workshop Slide 80

Page 82: Single World Intervention Graphs (SWIGs)people.tuebingen.mpg.de/p/causality-perspect/slides/Thomas_swig.pdf · Single World Intervention Graphs (SWIGs): Unifying the Counterfactual

No direct effect graph (I)

X YZ X(z)Z z Y(z)⇒

factorization:

P(Z,X(z), Y(z)) = P(Z)P(X(z))P(Y(z) | X(z))

modularity:

P(X(z)=x) = P(X=x | Z= z),

P(Y(z)=y | X(z) = x) = P(Y=y | X=x).

d-separation gives:Z ⊥⊥ X(z), Y(z)

Thomas Richardson Therme Vals Workshop Slide 81