BINF702 SPRING 2014- Chapter 3 Probability
BINF702 – SPRING 2014
Chapter 3 - Probability
3.1 Introduction - An Example to Hang Our Hat On
Example 3.1 - Cancer
One theory concerning the etiology of breast cancer states that women in a given age group who give birth to their first child relatively late in life (after 30) are at greater risk for eventually developing breast cancer over some time period t than are women who give birth to their first child early in life (before 20). Because women in the upper social classes tend to have children later, this theory has been used to explain why these women have a higher risk of developing breast cancer than women in the lower social classes. To test this hypothesis, we might identify 2000 women from a particular census tract who are currently ages 45-54 and have never had breast cancer, of whom 1000 had their first child before the age of 20 (call this group A) and 1000 after the age of 30 (group B). These 2000 women might be followed for 5 years to assess if they developed breast cancer during this period. Suppose there are 4 new cases of breast cancer in group A and 5 new cases in group B.
3.2 Definition of Probability
Def. 3.1 - The sample space is the set of all possible outcomes. In referring to probabilities of events, an event is any set of outcomes of interest. The probability of an event is the relative frequency of this set of outcomes over an indefinitely large (or infinite) number of trials.
Ex - Toss a fair coin and observe the uppermost side. Since we expect that heads is as likely to come up as tails, we conclude that the empirical probability distribution is
P(H) = 1/2, P(T) = 1/2.
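The relative-frequency idea in Def. 3.1 can be illustrated with a small simulation. The following Python sketch is not part of the original slides; the function name and the fixed seed are illustrative assumptions. It tosses a simulated fair coin and reports the fraction of heads, which should settle near 1/2 as the number of tosses grows.

```python
import random


def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses fair-coin tosses and return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses


if __name__ == "__main__":
    # The relative frequency should approach 0.5 as n grows.
    for n in (10, 100, 10_000, 100_000):
        print(n, relative_frequency_of_heads(n))
```

For small n the fraction can be far from 1/2; the definition refers to the limiting behavior over many trials.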
Can you provide another example?
Eq. 3.1
1. The probability of an event E, denoted by Pr(E), always satisfies $0 \le \Pr(E) \le 1$.
2. If outcomes A and B are two events that cannot both happen at the same time, then Pr(A or B occurs) = Pr(A) + Pr(B).
Can you provide an example of two events that can’t happen at the same time?
Def. 3.2 - Two events A and B are mutually exclusive if they cannot both happen at the same time.
[Figure: Venn diagrams contrasting events A and B that are mutually exclusive with events A and B that are not mutually exclusive.]
Section 3.3 - Some Useful Probabilistic Notation
Def. 3.3 - The symbol { } is used as shorthand for the phrase "the event."
Def. 3.4 - $A \cup B$ is the event that either A or B occurs, or they both occur.
Def. 3.5 - $A \cap B$ is the event that both A and B occur simultaneously.
Can you provide an example of two events and their intersection?
Def. 3.6 - The complement of A, denoted $\bar{A}$, is the event that A does not occur.
We note that $\Pr(\bar{A}) = 1 - \Pr(A)$.
Can you provide an example of an event and its complement?
Section 3.4 - The Multiplicative Law of Probability
Hypertension, Genetics - Suppose we are conducting a hypertension-screening program in the home. Consider all possible pairs of diastolic blood pressure (DBP) measurements of the mother and father within a given family, assuming that the mother and father are not genetically related. This sample space consists of all pairs of numbers of the form (X, Y), where X > 0 and Y > 0. Certain specific events might be of interest in this context. In particular, we might be interested in whether the mother or father is hypertensive, which is described, respectively, by the events A = {mother's DBP > 95} and B = {father's DBP > 95}. These events are diagrammed in Figure 3.4. Suppose we know that Pr(A) = .1 and Pr(B) = .2. What can we say about $\Pr(A \cap B)$ = Pr(mother's DBP > 95 and father's DBP > 95) = Pr(both mother and father are hypertensive)? We can say nothing unless we are willing to make certain assumptions.
Def. 3.7 - Two events A and B are called independent events if $\Pr(A \cap B) = \Pr(A) \times \Pr(B)$.
Def. 3.8 - Two events A, B are dependent if $\Pr(A \cap B) \ne \Pr(A) \times \Pr(B)$.
Eq. 3.2 Multiplication Law of Probability
If A1, …, Ak are mutually independent events, then $\Pr(A_1 \cap A_2 \cap \cdots \cap A_k) = \Pr(A_1) \times \Pr(A_2) \times \cdots \times \Pr(A_k)$.
Can you provide an example of two independent events?
Can you provide an example of two dependent events?
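One way to explore these questions is to enumerate a small sample space and test the product condition directly. The Python sketch below is an illustration only, not taken from the slides: the two-dice sample space and the three events are assumed for the example. It uses exact fractions so that the equality check is not affected by rounding.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: the 36 equally likely ordered outcomes of two fair dice.
SAMPLE_SPACE = list(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event, given as a predicate on an outcome (d1, d2)."""
    favorable = sum(1 for outcome in SAMPLE_SPACE if event(outcome))
    return Fraction(favorable, len(SAMPLE_SPACE))


def independent(event_a, event_b):
    """Check the product condition Pr(A and B) == Pr(A) * Pr(B)."""
    both = pr(lambda o: event_a(o) and event_b(o))
    return both == pr(event_a) * pr(event_b)


def first_even(outcome):   # A: first die shows an even number
    return outcome[0] % 2 == 0


def second_high(outcome):  # B: second die shows 5 or 6
    return outcome[1] >= 5


def sum_high(outcome):     # C: the two dice sum to 10 or more
    return outcome[0] + outcome[1] >= 10


if __name__ == "__main__":
    print("A, B independent:", independent(first_even, second_high))
    print("A, C independent:", independent(first_even, sum_high))
```

Here A and B concern different dice and satisfy the product condition, whereas A and C do not, because a large sum is more likely when the first die is even.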
Section 3.5 - The Addition Law of Probability
Eq. 3.3 - Addition Law of Probability
If A and B are any events, then
$\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$
What happens if A and B are mutually exclusive?
Eq. 3.4 - Addition Law of Probability for Independent Events
If two events A and B are independent, then
$\Pr(A \cup B) = \Pr(A) + \Pr(B) \times [1 - \Pr(A)]$
How does this come about?
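A short derivation, sketched here as one possible answer: start from the general addition law and substitute the product form of the intersection that holds for independent events.

```latex
\begin{align*}
\Pr(A \cup B) &= \Pr(A) + \Pr(B) - \Pr(A \cap B)
  && \text{(addition law for any events)} \\
&= \Pr(A) + \Pr(B) - \Pr(A) \times \Pr(B)
  && \text{(independence)} \\
&= \Pr(A) + \Pr(B) \times [1 - \Pr(A)]
\end{align*}
```

As an illustration with the hypertension values Pr(A) = .1 and Pr(B) = .2, and under the added assumption that the two events are independent, this gives Pr(A ∪ B) = .1 + .2 × .9 = .28.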
Section 3.6 - Conditional Probability
Def. 3.9 - The conditional probability of B given A, written Pr(B|A), is defined as $\Pr(B \mid A) = \Pr(A \cap B) / \Pr(A)$.
Eq. 3.5
1. If A and B are independent events, then $\Pr(B \mid A) = \Pr(B) = \Pr(B \mid \bar{A})$.
2. If two events A, B are dependent, then $\Pr(B \mid A) \ne \Pr(B) \ne \Pr(B \mid \bar{A})$ and $\Pr(A \cap B) \ne \Pr(A) \times \Pr(B)$.
Question: is $A \cap B = B \cap A$?
Def. 3.10 - The relative risk (RR) of B given A is given by $RR = \Pr(B \mid A) / \Pr(B \mid \bar{A})$.
N. B. - If A and B are independent, the RR is 1. The stronger the dependence between the two events, the further the relative risk is from 1.
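As a concrete illustration, the relative risk can be estimated from the counts in Example 3.1 (4 new cases among 1000 women with an early first birth, 5 among 1000 with a late first birth). The Python sketch below treats the observed 5-year proportions as estimates of the conditional probabilities, which is an assumption of the example; the function names are illustrative.

```python
def cumulative_incidence(new_cases, n_at_risk):
    """Observed proportion of initially disease-free people who become cases."""
    return new_cases / n_at_risk


def relative_risk(pr_b_given_a, pr_b_given_not_a):
    """RR of B given A: Pr(B | A) divided by Pr(B | not A)."""
    return pr_b_given_a / pr_b_given_not_a


# Counts from Example 3.1, used here as estimates of the true probabilities.
# Event of interest: first child after age 30 (late) versus before age 20 (early).
risk_late = cumulative_incidence(5, 1000)   # estimate of Pr(cancer | late)
risk_early = cumulative_incidence(4, 1000)  # estimate of Pr(cancer | early)

rr = relative_risk(risk_late, risk_early)

if __name__ == "__main__":
    print(f"Estimated RR (late vs. early first birth): {rr:.2f}")
```

The estimate is about 1.25. With so few cases, whether it differs meaningfully from 1 is a separate statistical question that this sketch does not address.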
Eq. 3.6 - Total-Probability Rule
For any events A and B, $\Pr(B) = \Pr(B \mid A) \times \Pr(A) + \Pr(B \mid \bar{A}) \times \Pr(\bar{A})$.
N. B. - This is a very useful rule.
Def. 3.11 - A set of events A1, …, Ak is exhaustive if at least one of the events must occur.
Eq. 3.7 - Total-Probability Rule
Let A1, …, Ak be mutually exclusive and exhaustive events. The unconditional probability of B, Pr(B), can be written as a weighted average of the conditional probabilities of B given Ai, Pr(B|Ai), as follows:
$\Pr(B) = \sum_{i=1}^{k} \Pr(B \mid A_i) \times \Pr(A_i)$
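The weighted-average form of the total-probability rule is easy to compute directly. The following Python sketch is illustrative only: the three strata and all of the numbers are hypothetical, not taken from the slides or the textbook.

```python
def total_probability(conditionals, weights):
    """Return Pr(B) as the sum over i of Pr(B | A_i) * Pr(A_i).

    conditionals: Pr(B | A_i) for each event A_i
    weights:      Pr(A_i); must sum to 1 because the A_i are assumed to be
                  mutually exclusive and exhaustive
    """
    if len(conditionals) != len(weights):
        raise ValueError("need one conditional probability per event A_i")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("Pr(A_i) must sum to 1 for exhaustive, exclusive events")
    return sum(c * w for c, w in zip(conditionals, weights))


# Hypothetical example: three age strata making up 50%, 30%, and 20% of a
# population, with disease probabilities 0.01, 0.02, and 0.05 within each.
pr_disease = total_probability([0.01, 0.02, 0.05], [0.5, 0.3, 0.2])

if __name__ == "__main__":
    print(f"Overall Pr(disease) = {pr_disease:.3f}")
```

With these made-up inputs the overall probability is 0.005 + 0.006 + 0.010 = 0.021, which lies between the smallest and largest stratum-specific values, as a weighted average must.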
Eq. 3.8 - Generalized Multiplicative Law of Probability
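If A1, …, Ak is an arbitrary set of events, then
$\Pr(A_1 \cap A_2 \cap \cdots \cap A_k) = \Pr(A_1) \times \Pr(A_2 \mid A_1) \times \Pr(A_3 \mid A_2 \cap A_1) \times \cdots \times \Pr(A_k \mid A_{k-1} \cap \cdots \cap A_2 \cap A_1)$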
Section 3.7 - Bayes’ Rule and Screening Tests
Def. 3.12 - The predictive value positive (PV+) of a screening test is the probability that a person has a disease given that the test is positive: PV+ = Pr(disease|test+).
The predictive value negative (PV-) of a screening test is the probability that a person does not have a disease given that the test is negative: PV- = Pr(no disease|test-).
Def. 3.13 - The sensitivity of a symptom (or set of symptoms or screening test) is the probability that the symptom is present given that the person has a disease.
Def. 3.14 - The specificity of a symptom (or set of symptoms or screening test) is the probability that the symptom is not present given that the person does not have a disease.
Def. 3.15 - A false negative is defined as a person who tests out as negative but who is actually positive. A false positive is defined as a person who tests out as positive but who is actually negative.
Eq. 3.9 Bayes’ Rule
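Let A = symptom and B = disease. Then
$PV^{+} = \Pr(B \mid A) = \dfrac{\Pr(A \mid B) \times \Pr(B)}{\Pr(A \mid B) \times \Pr(B) + \Pr(A \mid \bar{B}) \times \Pr(\bar{B})}$
In words, we can write this as
$PV^{+} = \dfrac{\text{sensitivity} \times x}{\text{sensitivity} \times x + (1 - \text{specificity}) \times (1 - x)}$
where x = Pr(B) = prevalence of disease in the reference population. Similarly,
$PV^{-} = \dfrac{\text{specificity} \times (1 - x)}{\text{specificity} \times (1 - x) + (1 - \text{sensitivity}) \times x}$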
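Bayes' rule for a screening test can be sketched in a few lines of Python. The function below computes PV+ and PV- from sensitivity, specificity, and prevalence. The function name and the input values are hypothetical, chosen only to show the calculation.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PV+, PV-) for a screening test via Bayes' rule."""
    x = prevalence
    true_pos = sensitivity * x                # Pr(test+ and disease)
    false_pos = (1 - specificity) * (1 - x)   # Pr(test+ and no disease)
    true_neg = specificity * (1 - x)          # Pr(test- and no disease)
    false_neg = (1 - sensitivity) * x         # Pr(test- and disease)
    pv_pos = true_pos / (true_pos + false_pos)
    pv_neg = true_neg / (true_neg + false_neg)
    return pv_pos, pv_neg


# Hypothetical test: sensitivity 0.80, specificity 0.90, prevalence 0.10.
pv_pos, pv_neg = predictive_values(0.80, 0.90, 0.10)

if __name__ == "__main__":
    print(f"PV+ = {pv_pos:.3f}, PV- = {pv_neg:.3f}")
```

With these made-up inputs PV+ is about 0.47 and PV- about 0.98: even a fairly accurate test yields many false positives when the disease is uncommon, which is why prevalence appears in both formulas.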
Eq. 3.10 - Generalized Bayes’ Rule
Let B1, B2, …, Bk be a set of mutually exclusive and exhaustive disease states; that is, at least one disease state must occur and no two disease states can occur at the same time. Let A represent the presence of a symptom or set of symptoms. Then
$\Pr(B_i \mid A) = \dfrac{\Pr(A \mid B_i) \times \Pr(B_i)}{\sum_{j=1}^{k} \Pr(A \mid B_j) \times \Pr(B_j)}$
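The generalized rule amounts to multiplying each prior by its likelihood and then normalizing. The Python sketch below shows one way to do this; the three disease states and all probabilities are hypothetical values invented for the illustration.

```python
def posterior_probabilities(priors, likelihoods):
    """Return Pr(B_i | A) for each disease state B_i.

    priors:      Pr(B_i), assumed mutually exclusive and exhaustive (sum to 1)
    likelihoods: Pr(A | B_i), the probability of the symptom in each state
    """
    if len(priors) != len(likelihoods):
        raise ValueError("need one likelihood per disease state")
    joint = [p * l for p, l in zip(priors, likelihoods)]  # Pr(A | B_i) * Pr(B_i)
    pr_symptom = sum(joint)                               # denominator: Pr(A)
    if pr_symptom == 0:
        raise ValueError("symptom has probability zero under every state")
    return [j / pr_symptom for j in joint]


# Hypothetical example with three disease states.
priors = [0.7, 0.2, 0.1]
likelihoods = [0.1, 0.5, 0.9]
posteriors = posterior_probabilities(priors, likelihoods)

if __name__ == "__main__":
    for i, post in enumerate(posteriors, start=1):
        print(f"Pr(B{i} | A) = {post:.3f}")
```

The denominator is the total-probability rule applied to the symptom A, so the posterior probabilities always sum to 1.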
Section 3.8 - Bayesian Inference
Def. 3.16 - The prior probability of an event is the best guess by the observer of an event’s probability in the absence of data. This prior probability may be a single number, or it may be a range of likely values for the probability, perhaps with weights attached to each possible value.
Def. 3.17 - The posterior probability of an event is the probability of an event after collecting some empirical data. It is obtained by integrating information from the prior probability with additional data related to the event in question.
Section 3.9 - ROC Curves
Def. 3.18 - A receiver operating characteristic (ROC) curve is a plot of the sensitivity versus (1 - specificity) of a screening test, where the different points on the curve correspond to different cutoff points used to designate a test as positive.
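The points of an ROC curve can be computed by sweeping the cutoff across the possible test scores. The Python sketch below assumes an ordinal test score, with higher scores more suggestive of disease, and a rule that calls the test positive when the score is at or above the cutoff. The counts are hypothetical, and plotting is left out so that the example needs only the standard library.

```python
def roc_points(diseased_counts, healthy_counts):
    """Return ROC points as (1 - specificity, sensitivity) pairs.

    Both arguments list the number of subjects in each score category,
    ordered from lowest to highest score. Cutoffs are swept from the
    strictest (only the top score is positive) to the most lenient
    (every score is positive); (0, 0) is the "never positive" point.
    """
    n_diseased = sum(diseased_counts)
    n_healthy = sum(healthy_counts)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for d, h in zip(reversed(diseased_counts), reversed(healthy_counts)):
        true_pos += d    # diseased subjects at or above this cutoff
        false_pos += h   # healthy subjects at or above this cutoff
        points.append((false_pos / n_healthy, true_pos / n_diseased))
    return points


# Hypothetical counts for scores 1 through 5 (50 subjects per group).
diseased = [2, 3, 5, 10, 30]
healthy = [30, 10, 5, 3, 2]
points = roc_points(diseased, healthy)

if __name__ == "__main__":
    for one_minus_spec, sens in points:
        print(f"1 - specificity = {one_minus_spec:.2f}, sensitivity = {sens:.2f}")
```

Plotting the second coordinate against the first, for example with a plotting library of your choice, gives the ROC curve. A curve that rises well above the diagonal indicates a test that separates the two groups.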
Section 3.10 - Prevalence and Incidence
Def. 3.19 - The prevalence of a disease is the probability of currently having the disease regardless of the duration of time one has had the disease. It is obtained by dividing the number of people who currently have the disease by the number of people in the study population.
Def. 3.20 - The cumulative incidence of a disease is the probability that a person with no prior disease will develop a new case of the disease over some specified period.
Occupational Health
Ex. 3.29 pg. 69
Occupational Health
Ex. 3.30
Genetics
Ex. 3.31
Pulmonary Disease
Ex. 3.52
Pulmonary Disease
Ex. 3.53
Pulmonary Disease
Ex. 3.54
Pulmonary Disease
Ex. 3.55
Pulmonary Disease
Ex. 3.57
Pulmonary Disease
Ex. 3.58
Pulmonary Disease
Ex. 3.59
Pulmonary Disease
Ex. 3.60
Pulmonary Disease
Ex. 3.61
Pulmonary Disease
Ex. 3.62
Pulmonary Disease
Ex. 3.70
Pulmonary Disease
Ex. 3.71
Pulmonary Disease
Ex. 3.73
Pulmonary Disease
Ex. 3.74
Hypertension
Ex. 3.120
Hypertension
Ex. 3.121
Hypertension
Ex. 3.122
Hypertension
Ex. 3.123
Orthopedics
Ex. 3.134
Orthopedics
Ex. 3.135
Orthopedics
Ex. 3.136
Can you write R code to produce an ROC curve from these data?
Orthopedics
Ex. 3.137
Homework Problems Chapter 3
3.13, 3.16, 3.20, 3.62, 3.63, 3.79, 3.80, 3.81, 3.82, 3.100, 3.101, 3.102, 3.103, 3.104, 3.105, 3.106