Module #19 – Probability
6.2 Probability Theory
Longin Jan Latecki, Temple University
Slides for a course based on the text Discrete Mathematics & Its Applications (6th Edition), Kenneth H. Rosen, based on slides by Michael P. Frank and Andrew W. Moore
Terminology
• A (stochastic) experiment is a procedure that yields one of a given set of possible outcomes.
• The sample space S of the experiment is the set of possible outcomes.
• An event is a subset of the sample space.
• A random variable is a function that assigns a real value to each outcome of an experiment.
Normally, a probability is related to an experiment or a trial. Take flipping a coin, for example: what are the possible outcomes? Heads or tails (the front or back side of the coin) will be shown upwards. After a sufficient number of tosses, we can "statistically" conclude that the probability of heads is 0.5. In rolling a die, there are 6 outcomes. Suppose we want to calculate the probability of the event that the die shows an odd number. What is that probability?
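The die question can be answered by direct enumeration. A minimal Python sketch (assuming a fair die, so all six faces are equally likely):

```python
from fractions import Fraction

# Laplacian rule for equally likely outcomes: p(E) = |E| / |S|.
S = {1, 2, 3, 4, 5, 6}              # sample space of one die roll
E = {s for s in S if s % 2 == 1}    # event: the die shows an odd number
p_odd = Fraction(len(E), len(S))
print(p_odd)  # 1/2
```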
Random Variables
• A "random variable" V is any variable whose value is unknown, or whose value depends on the precise situation.
– E.g., the number of students in class today
– Whether it will rain tonight (Boolean variable)
• The proposition V = vi may have an uncertain truth value, and may be assigned a probability.
Example 10
• A fair coin is flipped 3 times. Let S be the sample space of 8 possible outcomes, and let X be a random variable that assigns to an outcome the number of heads in this outcome.
• Random variable X is a function X: S → X(S), where X(S) = {0, 1, 2, 3} is the range of X, which is the number of heads, and S = { (TTT), (TTH), (THH), (HTT), (HHT), (HHH), (THT), (HTH) }
• X(TTT) = 0; X(TTH) = X(HTT) = X(THT) = 1; X(HHT) = X(THH) = X(HTH) = 2; X(HHH) = 3
• The probability distribution (pmf) of random variable X is given by P(X=3) = 1/8, P(X=2) = 3/8, P(X=1) = 3/8, P(X=0) = 1/8.
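This distribution can be checked by enumerating all 8 outcomes; a short illustrative Python sketch (not part of the original slides):

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# S: all 8 equally likely outcomes of three fair coin flips.
S = list(product("HT", repeat=3))

# X maps each outcome to its number of heads; count how often each value occurs.
counts = Counter(outcome.count("H") for outcome in S)
pmf = {x: Fraction(c, len(S)) for x, c in counts.items()}

for x in sorted(pmf):
    print(x, pmf[x])   # 0 1/8, then 1 3/8, 2 3/8, 3 1/8
```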
Experiments & Sample Spaces
• A (stochastic) experiment is any process by which a given random variable V gets assigned some particular value, and where this value is not necessarily known in advance.
– We call it the "actual" value of the variable, as determined by that particular experiment.
• The sample space S of the experiment is just the domain of the random variable, S = dom[V].
• The outcome of the experiment is the specific value vi of the random variable that is selected.
Events
• An event E is any set of possible outcomes in S…
– That is, E ⊆ S = dom[V].
• E.g., the event that "less than 50 people show up for our next class" is represented as the set {1, 2, …, 49} of values of the variable V = (# of people here next class).
• We say that event E occurs when the actual value of V is in E, which may be written V∈E.
– Note that V∈E denotes the proposition (of uncertain truth) asserting that the actual outcome (value of V) will be one of the outcomes in the set E.
Probabilities
• We write P(A) as "the fraction of possible worlds in which A is true".
• We could at this point spend 2 hours on the philosophy of this.
• But we won't.
Visualizing A
[Figure: the event space of all possible worlds, with total area 1. P(A) is the area of the reddish oval of worlds in which A is true; the area outside the oval is the worlds in which A is false.]
Probability
• The probability p = Pr[E] ∈ [0,1] of an event E is a real number representing our degree of certainty that E will occur.
– If Pr[E] = 1, then E is absolutely certain to occur,
• thus V∈E has the truth value True.
– If Pr[E] = 0, then E is absolutely certain not to occur,
• thus V∈E has the truth value False.
– If Pr[E] = ½, then we are maximally uncertain about whether E will occur; that is,
• V∈E and V∉E are considered equally likely.
– How do we interpret other values of p?
Note: We could also define probabilities for more general propositions, as well as events.
Four Definitions of Probability
• Several alternative definitions of probability are commonly encountered:
– Frequentist, Bayesian, Laplacian, Axiomatic
• They have different strengths & weaknesses, philosophically speaking.
– But fortunately, they coincide with each other and work well together, in the majority of cases that are typically encountered.
Probability: Frequentist Definition
• The probability of an event E is the limit, as n→∞, of the fraction of times that we find V∈E over the course of n independent repetitions of (different instances of) the same experiment.
• Some problems with this definition:
– It is only well-defined for experiments that can be independently repeated, infinitely many times!
• or at least, if the experiment can be repeated in principle, e.g., over some hypothetical ensemble of (say) alternate universes.
– It can never be measured exactly in finite time!
• Advantage: It's an objective, mathematical definition.
Pr[E] :≡ lim_{n→∞} n_{V∈E} / n, where n_{V∈E} is the number of the n trials in which V∈E.
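A small simulation illustrates the frequentist limit. This is an illustrative sketch: the helper name `frequentist_estimate` is ours, and the "experiment" is a simulated fair-coin flip with E = "heads":

```python
import random

# Frequentist sketch: estimate Pr[E] as the fraction of n trials where E occurs.
def frequentist_estimate(n, seed=0):
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n

# The estimates wander, but settle toward 0.5 as n grows:
for n in (100, 10_000, 1_000_000):
    print(n, frequentist_estimate(n))
```

Note that no finite n ever pins the probability down exactly, which is precisely the "finite time" objection on the slide.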
Probability: Bayesian Definition
• Suppose a rational, profit-maximizing entity R is offered a choice between two rewards:
– Winning $1 if and only if the event E actually occurs.
– Receiving p dollars (where p ∈ [0,1]) unconditionally.
• If R can honestly state that he is completely indifferent between these two rewards, then we say that R's probability for E is p, that is, Pr_R[E] :≡ p.
• Problem: It's a subjective definition; depends on the reasoner R, and his knowledge, beliefs, & rationality.
– The version above additionally assumes that the utility of money is linear.
• This assumption can be avoided by using "utils" (utility units) instead of dollars.
Probability: Laplacian Definition
• First, assume that all individual outcomes in the sample space are equally likely to each other…
– Note that this term still needs an operational definition!
• Then, the probability of any event E is given by Pr[E] = |E|/|S|. Very simple!
• Problems: Still needs a definition for equally likely, and depends on the existence of some finite sample space S in which all outcomes in S are, in fact, equally likely.
Probability: Axiomatic Definition
• Let p be any total function p: S → [0,1] such that ∑_{s∈S} p(s) = 1.
• Such a p is called a probability distribution.
• Then, the probability under p of any event E ⊆ S is just:
Pr[E] :≡ ∑_{s∈E} p(s)
• Advantage: Totally mathematically well-defined!
– This definition can even be extended to apply to infinite sample spaces, by changing ∑→∫, and calling p a probability density function or a probability measure.
• Problem: Leaves operational meaning unspecified.
The Axioms of Probability
• 0 <= P(A) <= 1
• P(True) = 1
• P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)
Interpreting the axioms
• 0 <= P(A) <= 1
• P(True) = 1
• P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)
The area of A can’t get any smaller than 0
And a zero area would mean no world could ever have A true
Interpreting the axioms
• 0 <= P(A) <= 1
• P(True) = 1
• P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)
The area of A can’t get any bigger than 1
And an area of 1 would mean all worlds will have A true
Interpreting the axioms
• 0 <= P(A) <= 1
• P(True) = 1
• P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)
[Venn diagram: overlapping events A and B; the overlap P(A and B) is counted twice by P(A) + P(B), hence subtracted once.]
These Axioms are Not to be Trifled With
• There have been attempts to develop different methodologies for uncertainty:
• Fuzzy logic
• Three-valued logic
• Dempster-Shafer
• Non-monotonic reasoning
• But the axioms of probability are the only system with this property:
If you gamble using them you can't be unfairly exploited by an opponent using some other system [de Finetti 1931]
Theorems from the Axioms
• 0 <= P(A) <= 1, P(True) = 1, P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)
From these we can prove:
P(not A) = P(~A) = 1 - P(A)
• How?
Another important theorem
• 0 <= P(A) <= 1, P(True) = 1, P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)
From these we can prove:
P(A) = P(A ^ B) + P(A ^ ~B)
• How?
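Both theorems can be sanity-checked by enumeration over a small, made-up distribution (the outcome space and the probabilities below are arbitrary choices for illustration, not from the slides):

```python
from fractions import Fraction

# A small made-up model: outcomes are bit pairs (a, b) with arbitrary
# probabilities that sum to 1.
p = {(0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
     (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10)}

def P(event):
    # Probability of an event = sum of the probabilities of its outcomes.
    return sum(pr for s, pr in p.items() if event(s))

A = lambda s: s[0] == 1   # event A: first bit is 1
B = lambda s: s[1] == 1   # event B: second bit is 1

# P(~A) = 1 - P(A)
assert P(lambda s: not A(s)) == 1 - P(A)
# P(A) = P(A ^ B) + P(A ^ ~B)
assert P(A) == P(lambda s: A(s) and B(s)) + P(lambda s: A(s) and not B(s))
print("both theorems verified")
```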
Probability of an event E
The probability of an event E is the sum of the probabilities of the outcomes in E. That is,
p(E) = ∑_{s∈E} p(s)
Note that, if there are n outcomes in the event E, that is, if E = {a1, a2, …, an}, then
p(E) = ∑_{i=1}^{n} p(ai)
Example
• What is the probability that, if we flip a coin three times, we get an odd number of tails?
The outcomes with an odd number of tails are TTT, THH, HHT, and HTH (among the 8 outcomes TTT, TTH, THH, HTT, HHT, HHH, THT, HTH).
Each outcome has probability 1/8, so
p(odd number of tails) = 1/8 + 1/8 + 1/8 + 1/8 = ½
Visualizing Sample Space
• 1. Listing
– S = {Head, Tail}
• 2. Venn Diagram
• 3. Contingency Table
• 4. Decision Tree Diagram
Venn Diagram
Experiment: Toss 2 Coins. Note Faces.
Sample Space S = {HH, HT, TH, TT}
[Venn diagram: the sample space S contains the four outcomes HH, HT, TH, TT; an oval marks the event "Tail" (the outcomes containing at least one tail).]
Contingency Table
Experiment: Toss 2 Coins. Note Faces.
1st Coin \ 2nd Coin | Head | Tail | Total
Head | HH | HT | HH, HT
Tail | TH | TT | TH, TT
Total | HH, TH | HT, TT | S
S = {HH, HT, TH, TT} is the sample space; each cell is an outcome, and a row such as (Head on 1st coin) is a simple event.
Tree Diagram
Experiment: Toss 2 Coins. Note Faces.
Sample Space S = {HH, HT, TH, TT}
[Tree diagram: the first branch is H or T for the 1st coin; each branch splits again into H or T for the 2nd coin, giving the four outcomes HH, HT, TH, TT.]
Discrete Random Variable
– Possible values (outcomes) are discrete
• E.g., natural numbers (0, 1, 2, 3, etc.)
– Obtained by counting
– Usually a finite number of values
• But could be infinite (must be "countable")
Discrete Probability Distribution ( also called probability mass function (pmf) )
1. List of all possible [x, p(x)] pairs
– x = value of random variable (outcome)
– p(x) = probability associated with value
2. Mutually exclusive (no overlap)
3. Collectively exhaustive (nothing left out)
4. 0 ≤ p(x) ≤ 1
5. ∑ p(x) = 1
Visualizing Discrete Probability Distributions
• Listing: { (0, .25), (1, .50), (2, .25) }
• Table:
# Tails | f(x) Count | p(x)
0 | 1 | .25
1 | 2 | .50
2 | 1 | .25
• Equation: p(x) = n! / (x!(n-x)!) · p^x (1-p)^(n-x)
• Graph: [bar chart of p(x) against x = 0, 1, 2, with bars of height .25, .50, .25]
Arity of Random Variables
• Suppose A can take on more than 2 values.
• A is a random variable with arity k if it can take on exactly one value out of {v1, v2, …, vk}
• Thus…
P(A=vi ∧ A=vj) = 0 if i ≠ j
P(A=v1 ∨ A=v2 ∨ … ∨ A=vk) = 1
Mutually Exclusive Events
• Two events E1, E2 are called mutually exclusive if they are disjoint: E1 ∩ E2 = ∅
– Note that two mutually exclusive events cannot both occur in the same instance of a given experiment.
• For mutually exclusive events,
Pr[E1 ∪ E2] = Pr[E1] + Pr[E2].
Exhaustive Sets of Events
• A set E = {E1, E2, …} of events in the sample space S is called exhaustive iff ∪_i Ei = S.
• An exhaustive set E of events that are all mutually exclusive with each other has the property that
∑_i Pr[Ei] = 1.
An easy fact about Multivalued Random Variables:
• Using the axioms of probability…
0 <= P(A) <= 1, P(True) = 1, P(False) = 0
P(A or B) = P(A) + P(B) - P(A and B)
• And assuming that A obeys…
P(A=vi ∧ A=vj) = 0 if i ≠ j
P(A=v1 ∨ A=v2 ∨ … ∨ A=vk) = 1
• It's easy to prove that
P(A=v1 ∨ A=v2 ∨ … ∨ A=vi) = ∑_{j=1}^{i} P(A=vj)
• And thus we can prove
∑_{j=1}^{k} P(A=vj) = 1
Another fact about Multivalued Random Variables:
• Using the axioms of probability…
0 <= P(A) <= 1, P(True) = 1, P(False) = 0
P(A or B) = P(A) + P(B) - P(A and B)
• And assuming that A obeys…
P(A=vi ∧ A=vj) = 0 if i ≠ j
P(A=v1 ∨ A=v2 ∨ … ∨ A=vk) = 1
• It's easy to prove that
P(B ∧ [A=v1 ∨ A=v2 ∨ … ∨ A=vi]) = ∑_{j=1}^{i} P(B ∧ A=vj)
Elementary Probability Rules
• P(~A) + P(A) = 1
• P(B) = P(B ^ A) + P(B ^ ~A)
∑_{j=1}^{k} P(A=vj) = 1
P(B) = ∑_{j=1}^{k} P(B ∧ A=vj)
Bernoulli Trials
• Each performance of an experiment with only two possible outcomes is called a Bernoulli trial.
• In general, a possible outcome of a Bernoulli trial is called a success or a failure.
• If p is the probability of a success and q is the probability of a failure, then p + q = 1.
Example
A coin is biased so that the probability of heads is 2/3. What is the probability that exactly four heads come up when the coin is flipped seven times, assuming that the flips are independent?
The number of ways that we can get four heads is C(7,4) = 7!/(4!3!) = 35.
The probability of getting four heads and three tails in a particular order is (2/3)^4 (1/3)^3 = 16/3^7.
p(4 heads and 3 tails) = C(7,4) (2/3)^4 (1/3)^3 = 35·16/3^7 = 560/2187
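The arithmetic can be verified both by the formula and by brute force over all 2^7 head/tail sequences (an illustrative sketch):

```python
from fractions import Fraction
from math import comb
from itertools import product

p = Fraction(2, 3)   # probability of heads

# Via the formula: C(7,4) * (2/3)^4 * (1/3)^3
formula = comb(7, 4) * p**4 * (1 - p)**3

# Via brute force: sum the probabilities of all sequences with exactly 4 heads.
brute = sum(p**seq.count("H") * (1 - p)**seq.count("T")
            for seq in product("HT", repeat=7) if seq.count("H") == 4)

print(formula, brute)  # 560/2187 560/2187
```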
Probability of k successes in n independent Bernoulli trials.
The probability of k successes in n independent Bernoulli trials, with probability of success p and probability of failure q = 1-p, is C(n,k) p^k q^(n-k).
Find each of the following probabilities when n independent Bernoulli trials are carried out
with probability of success, p.
• Probability of no successes.
C(n,0) p^0 q^n = 1·(p^0)(1-p)^n = (1-p)^n
• Probability of at least one success.
1 - (1-p)^n (why?)
Find each of the following probabilities when n independent Bernoulli trials are carried out
with probability of success, p.
• Probability of at most one success.
This means there can be no successes or one success:
C(n,0) p^0 q^(n-0) + C(n,1) p^1 q^(n-1)
= (1-p)^n + np(1-p)^(n-1)
• Probability of at least two successes.
1 - (1-p)^n - np(1-p)^(n-1)
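These closed forms can be checked against the binomial pmf for a sample choice of n and p (both values below are arbitrary, chosen only for illustration):

```python
from fractions import Fraction
from math import comb

# Arbitrary example: n = 5 trials, success probability p = 3/10.
n, p = 5, Fraction(3, 10)

def pmf(k):
    # P(exactly k successes) = C(n,k) p^k q^(n-k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

assert pmf(0) == (1 - p)**n                                         # no successes
assert 1 - pmf(0) == sum(pmf(k) for k in range(1, n + 1))           # at least one
assert pmf(0) + pmf(1) == (1 - p)**n + n*p*(1 - p)**(n - 1)         # at most one
assert 1 - pmf(0) - pmf(1) == sum(pmf(k) for k in range(2, n + 1))  # at least two
print("all four formulas agree")
```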
A coin is flipped until it comes up tails. The probability the coin comes up tails is p.
• What is the probability that the experiment ends after n flips, that is, the outcome consists of n-1 heads and a tail?
(1-p)^(n-1) p
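As a sanity check, these probabilities should sum to 1 over all possible stopping times n = 1, 2, …; a sketch with an arbitrary choice of p:

```python
from fractions import Fraction

# P(experiment ends after exactly n flips) = (1-p)^(n-1) * p.
# Arbitrary choice p = 1/4; the partial sum over n <= 200 differs
# from 1 by exactly (3/4)^200, which is astronomically small.
p = Fraction(1, 4)
partial = sum((1 - p)**(n - 1) * p for n in range(1, 201))
print(float(partial))  # effectively 1
```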
Probability vs. Odds
• You may have heard the term "odds."
– It is widely used in the gambling community.
• This is not the same thing as probability!
– But, it is very closely related.
• The odds in favor of an event E means the relative probability of E compared with its complement ¬E:
O(E) :≡ Pr(E)/Pr(¬E)
– E.g., if p(E) = 0.6 then p(¬E) = 0.4 and O(E) = 0.6/0.4 = 1.5.
• Odds are conventionally written as a ratio of integers.
– E.g., 3/2 or 3:2 in the above example. "Three to two in favor."
• The odds against E just means 1/O(E). "2 to 3 against"
Exercise: Express the probability p as a function of the odds in favor O.
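The conversion can be sketched in both directions; note that the `prob` function below encodes one answer to the exercise (p = O/(1+O)), so skip it if you want to derive it yourself:

```python
from fractions import Fraction

def odds(p):
    # Odds in favor: O(E) = Pr(E) / Pr(not E)
    return p / (1 - p)

def prob(O):
    # Inverse (one answer to the exercise): p = O / (1 + O)
    return O / (1 + O)

p = Fraction(6, 10)
print(odds(p))        # 3/2, "three to two in favor"
print(prob(odds(p)))  # 3/5, recovering p = 0.6
```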
Example 1: Balls-and-Urn
• Suppose an urn contains 4 blue balls and 5 red balls.
• An example experiment: Shake up the urn, reach in (without looking) and pull out a ball.
• A random variable V: Identity of the chosen ball.
• The sample space S: The set of all possible values of V:
– In this case, S = {b1,…,b9}
• An event E: "The ball chosen is blue": E = { ______________ }
• What are the odds in favor of E?
• What is the probability of E?
[Figure: the urn containing the nine balls b1 through b9.]
Independent Events
• Two events E, F are called independent if Pr[E∩F] = Pr[E]·Pr[F].
• Relates to the product rule for the number of ways of doing two independent tasks.
• Example: Flip a coin, and roll a die.
Pr[(coin shows heads) ∧ (die shows 1)] = Pr[coin is heads] × Pr[die is 1] = ½ × 1/6 = 1/12.
Example
Suppose a red die and a blue die are rolled. The sample space:
[Table: a 6×6 grid of the 36 equally likely outcomes, one die's value per row and the other's per column.]
Are the events "sum is 7" and "the blue die is 3" independent?
|S| = 36
|sum is 7| = 6
|blue die is 3| = 6
|(sum is 7) ∩ (blue die is 3)| = 1
p(sum is 7 and blue die is 3) = 1/36
p(sum is 7) · p(blue die is 3) = 6/36 · 6/36 = 1/36
Thus, p((sum is 7) and (blue die is 3)) = p(sum is 7) · p(blue die is 3).
The events "sum is 7" and "the blue die is 3" are therefore independent.
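The same check, re-derived by enumerating the 36 outcomes (illustrative sketch):

```python
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely (red, blue) outcomes.
S = list(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(sum(1 for o in S if event(o)), len(S))

sum7 = lambda o: o[0] + o[1] == 7   # sum is 7
blue3 = lambda o: o[1] == 3         # blue die is 3
both = lambda o: sum7(o) and blue3(o)

# Independence: P(E1 and E2) = P(E1) * P(E2)
assert P(both) == P(sum7) * P(blue3) == Fraction(1, 36)
print(P(sum7), P(blue3), P(both))  # 1/6 1/6 1/36
```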
Conditional Probability
• Let E, F be any events such that Pr[F] > 0.
• Then, the conditional probability of E given F, written Pr[E|F], is defined as Pr[E|F] :≡ Pr[E∩F]/Pr[F].
• This is what our probability that E would turn out to occur should be, if we are given only the information that F occurs.
• If E and F are independent then Pr[E|F] = Pr[E]:
Pr[E|F] = Pr[E∩F]/Pr[F] = Pr[E]×Pr[F]/Pr[F] = Pr[E]
Visualizing Conditional Probability
• If we are given that event F occurs, then
– Our attention gets restricted to the subspace F.
• Our posterior probability for E (after seeing F) corresponds to the fraction of F where E occurs also.
• Thus, p′(E) = p(E∩F)/p(F).
[Figure: within the entire sample space S, event F, event E, and their overlap E∩F.]
Conditional Probability Example
• Suppose I choose a single letter out of the 26-letter English alphabet, totally at random.
– Use the Laplacian assumption on the sample space {a,b,…,z}.
– What is the (prior) probability that the letter is a vowel?
• Pr[Vowel] = __ / __ .
• Now, suppose I tell you that the letter chosen happened to be in the first 9 letters of the alphabet.
– Now, what is the conditional (or posterior) probability that the letter is a vowel, given this information?
• Pr[Vowel | First9] = ___ / ___ .
[Figure: the sample space S of 26 letters, with the vowels marked and the first 9 letters circled.]
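A sketch that computes both answers under the Laplacian assumption (note that it reveals the values left blank above):

```python
from fractions import Fraction
from string import ascii_lowercase

# Laplacian assumption: each of the 26 letters is equally likely.
S = set(ascii_lowercase)
vowels = set("aeiou")
first9 = set(ascii_lowercase[:9])   # a through i

prior = Fraction(len(vowels), len(S))                     # Pr[Vowel]
posterior = Fraction(len(vowels & first9), len(first9))   # Pr[Vowel | First9]
print(prior, posterior)  # 5/26 1/3
```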
Example
• What is the probability that, if we flip a coin three times, we get an odd number of tails (event E), given that event F, "the first flip comes up tails," occurs?
Given F, the possible outcomes are (TTT), (TTH), (THT), (THH); each has conditional probability 1/4.
p(E|F) = 1/4 + 1/4 = ½ (from the outcomes TTT and THH), or p(E|F) = p(E∩F)/p(F) = (2/8)/(4/8) = ½.
For comparison, p(E) = 4/8 = ½. E and F are independent, since p(E|F) = Pr(E).
Prior and Posterior Probability
• Suppose that, before you are given any information about the
outcome of an experiment, your personal probability for an event E to occur is p(E) = Pr[E].– The probability of E in your original probability distribution p is called
the prior probability of E.• This is its probability prior to obtaining any information about the outcome.
• Now, suppose someone tells you that some event F (which may overlap with E) actually occurred in the experiment.– Then, you should update your personal probability for event E to occur,
to become p′(E) = Pr[E|F] = p(E∩F)/p(F).• The conditional probability of E, given F.
– The probability of E in your new probability distribution p′ is called the posterior probability of E.
• This is its probability after learning that event F occurred.
• After seeing F, the posterior distribution p′ is defined by letting
p′(v) = p({v}∩F)/p(F) for each individual outcome v∈S.
6.3 Bayes’ Theorem
Longin Jan Latecki, Temple University
Slides for a course based on the text Discrete Mathematics & Its Applications (6th Edition), Kenneth H. Rosen, based on slides by Michael P. Frank and Wolfram Burgard
Bayes' Rule
• One way to compute the probability that a hypothesis H is correct, given some data D:
Pr[H|D] = Pr[D|H]·Pr[H] / Pr[D]
• This follows directly from the definition of conditional probability! (Exercise: Prove it.)
• This rule is the foundation of Bayesian methods for probabilistic reasoning, which are very powerful, and widely used in artificial intelligence applications:
– For data mining, automated diagnosis, pattern recognition, statistical modeling, even evaluating scientific hypotheses!
[Portrait: Rev. Thomas Bayes, 1702-1761]
Bayes’ Theorem
• Allows one to compute the probability that a hypothesis H is correct, given data D:
Pr[H|D] = Pr[D|H]·Pr[H] / Pr[D]
• When the set of hypotheses Hj is exhaustive:
Pr[Hi|D] = Pr[D|Hi]·Pr[Hi] / ∑_j Pr[D|Hj]·Pr[Hj]
Example 1: Two boxes with balls
• Two boxes: the first has 2 blue and 7 red balls; the second has 4 blue and 3 red balls.
• Bob selects a ball by first choosing one of the two boxes, and then one ball from this box.
• If Bob has selected a red ball, what is the probability that he selected a ball from the first box?
• An event E: Bob has chosen a red ball.
• An event F: Bob has chosen a ball from the first box.
• We want to find p(F | E)
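Assuming the two boxes are chosen with equal probability (the slide leaves this implicit), Bayes' theorem gives p(F | E) directly:

```python
from fractions import Fraction

# Assumption: Bob picks each box with probability 1/2.
pF = Fraction(1, 2)              # F: ball came from box 1
pE_given_F = Fraction(7, 9)      # red | box 1 (2 blue, 7 red)
pE_given_notF = Fraction(3, 7)   # red | box 2 (4 blue, 3 red)

pE = pE_given_F * pF + pE_given_notF * (1 - pF)   # total probability of red
pF_given_E = pE_given_F * pF / pE                 # Bayes' theorem
print(pF_given_E, float(pF_given_E))  # 49/76, about 0.645
```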
Example 2:
• Suppose 1% of the population has AIDS.
• Prob. that a positive result is right: 95%.
• Prob. that a negative result is right: 90%.
• What is the probability that someone who has a positive result is actually an AIDS patient?
• H: event that a person has AIDS; D: event of a positive result.
• P[D|H] = 0.95, P[D|¬H] = 1 − 0.9 = 0.1
• P[D] = P[D|H]·P[H] + P[D|¬H]·P[¬H] = 0.95·0.01 + 0.1·0.99 = 0.1085
• P[H|D] = 0.95·0.01/0.1085 ≈ 0.0876
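The arithmetic on this slide can be reproduced directly (a small sketch, using the slide's numbers):

```python
p_H = 0.01                 # prior: 1% of the population has AIDS
p_D_given_H = 0.95         # positive result is right 95% of the time
p_D_given_not_H = 1 - 0.9  # negative result is right 90%, so 10% false positives

# Total probability of a positive result:
p_D = p_D_given_H * p_H + p_D_given_not_H * (1 - p_H)
# Posterior by Bayes' theorem:
p_H_given_D = p_D_given_H * p_H / p_D
print(round(p_D, 4), round(p_H_given_D, 4))  # 0.1085 0.0876
```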
What's behind door number three?
• The Monty Hall problem paradox
– Consider a game show where a prize (a car) is behind one of three doors.
– The other two doors do not have prizes (goats instead).
– After you pick one of the doors, the host (Monty Hall) opens a different door to show you that the door he opened does not hide the prize.
– Do you change your decision?
• Your initial probability to win (i.e., pick the right door) is 1/3.
• What is your chance of winning if you change your choice after Monty opens a wrong door?
• After Monty opens a wrong door, if you change your choice, your chance of winning is 2/3.
– Thus, your chance of winning doubles if you change.
– Huh?
Monty Hall Problem
Ci - The car is behind Door i, for i equal to 1, 2 or 3.
Hij - The host opens Door j after the player has picked Door i, for i and j equal to 1, 2 or 3. Without loss of generality, assume, by re-numbering the doors if necessary, that the player picks Door 1, and that the host then opens Door 3, revealing a goat. In other words, the host makes proposition H13 true.
Then the posterior probability of winning by not switching doors is P(C1 | H13).
P(Ci) = 1/3 for each i.
By Bayes' theorem:

P(C1 | H13) = P(H13 | C1) · P(C1) / P(H13)
= P(H13 | C1) · P(C1) / Σi P(H13 | Ci) · P(Ci)

P(H13 | C1) = 1/2, since the host will always open a door that has no car behind it, chosen from among the two not picked by the player (which are 2 and 3 here).

P(H13) = P(H13 | C1)·P(C1) + P(H13 | C2)·P(C2) + P(H13 | C3)·P(C3)
= (1/2)·(1/3) + 1·(1/3) + 0·(1/3) = 1/2

P(C1 | H13) = (1/2 · 1/3) / (1/2) = 1/3
The probability of winning by switching is P(C2 | H13), since under our assumption switching means switching the selection to Door 2, and since P(C3 | H13) = 0 (the host will never open the door with the car):

P(C2 | H13) = P(H13 | C2) · P(C2) / P(H13)
= P(H13 | C2) · P(C2) / Σi P(H13 | Ci) · P(Ci)
= (1 · 1/3) / ((1/2)·(1/3) + 1·(1/3) + 0·(1/3))
= (1/3) / (1/2) = 2/3
The posterior probability of winning by not switching doors is P(C1 | H13) = 1/3.
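The 1/3 vs. 2/3 split can also be checked by simulation; a minimal Monte Carlo sketch (not part of the slides):

```python
import random

def monty_trial(switch, rng):
    """One round of the game; returns True if the player wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Host opens a goat door that the player did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

rng = random.Random(0)
n = 100_000
stay = sum(monty_trial(False, rng) for _ in range(n)) / n
swap = sum(monty_trial(True, rng) for _ in range(n)) / n
print(stay, swap)  # roughly 1/3 and 2/3
```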
Exercises: 6, p. 424, and 16, p. 425
Continuous random variable
Continuous Prob. Density Function
1. Mathematical formula
2. Shows all values, x, and frequencies, f(x)
– f(x) is not a probability
3. Properties:
∫ f(x) dx = 1 over all x (area under curve)
f(x) ≥ 0, a ≤ x ≤ b

[Figure: a density curve f(x) over values x, with support from a to b.]
Continuous Random Variable Probability
Probability is area under curve!

P(c ≤ x ≤ d) = ∫ from c to d of f(x) dx

[Figure: density f(x) with the area between c and d shaded.]
Probability mass function
In probability theory, a probability mass function (pmf) is a function that gives the probability that a discrete random variable is exactly equal to some value. A pmf differs from a probability density function (pdf) in that the values of a pdf, defined only for continuous random variables, are not probabilities as such. Instead, the integral of a pdf over a range of possible values (a, b] gives the probability of the random variable falling within that range.
Example graphs of pmfs. All the values of a pmf must be non-negative and sum to 1. (Right) The pmf of a fair die: all the numbers on the die have an equal chance of appearing on top when the die is rolled.
Suppose that X is a discrete random variable, taking values on some countable sample space S ⊆ R. Then the probability mass function fX(x) for X is given by

fX(x) = P(X = x) for x ∈ S, and fX(x) = 0 otherwise.

Note that this explicitly defines fX(x) for all real numbers, including all values in R that X could never take; indeed, it assigns such values a probability of zero.

Example. Suppose that X is the outcome of a single coin toss, assigning 0 to tails and 1 to heads. The probability that X = x is 0.5 on the state space {0, 1} (this is a Bernoulli random variable), and hence the probability mass function is

fX(x) = 1/2 for x ∈ {0, 1}, and fX(x) = 0 otherwise.
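The fair-die pmf from the figure caption can be written the same way (a small sketch):

```python
from fractions import Fraction

def die_pmf(x):
    """pmf of a fair die: 1/6 on {1, ..., 6}, 0 for every other value."""
    return Fraction(1, 6) if x in {1, 2, 3, 4, 5, 6} else Fraction(0)

print(die_pmf(3), die_pmf(7))                # 1/6 0
print(sum(die_pmf(x) for x in range(1, 7)))  # 1
```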
Uniform Distribution
1. Equally likely outcomes
2. Probability density:
f(x) = 1/(d − c), c ≤ x ≤ d
3. Mean and standard deviation:
Mean (= median): (c + d)/2
Standard deviation: (d − c)/√12

[Figure: f(x) constant at 1/(d − c) between c and d.]
Uniform Distribution Example
• You're the production manager of a soft drink bottling company. You believe that when a machine is set to dispense 12 oz., it really dispenses 11.5 to 12.5 oz. inclusive.
• Suppose the amount dispensed has a uniform distribution.
• What is the probability that less than 11.8 oz. is dispensed?
Uniform Distribution Solution
f(x) = 1/(d − c) = 1/(12.5 − 11.5) = 1.0

P(11.5 ≤ x ≤ 11.8) = (Base)(Height) = (11.8 − 11.5)(1) = 0.30

[Figure: f(x) = 1.0 on [11.5, 12.5], with the area from 11.5 to 11.8 shaded.]
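A small helper (the name `uniform_prob` is ours, not the slides') generalizes the base-times-height computation:

```python
def uniform_prob(a, b, c, d):
    """P(a <= X <= b) for X ~ Uniform(c, d): base times height, clipped to [c, d]."""
    lo, hi = max(a, c), min(b, d)
    return max(hi - lo, 0.0) * (1.0 / (d - c))

print(round(uniform_prob(11.5, 11.8, 11.5, 12.5), 2))  # 0.3
```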
Normal Distribution
1. Describes many random processes or continuous phenomena
2. Can be used to approximate discrete probability distributions
– Example: binomial
3. Basis for classical statistical inference
4. A.k.a. Gaussian distribution
Normal Distribution
1. 'Bell-shaped' and symmetrical
2. Mean, median, mode are equal
3. Random variable has infinite range
* light-tailed distribution

[Figure: a bell-shaped curve f(X), centered at the mean.]
Probability Density Function
f(x) = (1/(σ√(2π))) · e^(−(1/2)·((x − μ)/σ)²)

f(x) = frequency of random variable x
σ = population standard deviation
π = 3.14159; e = 2.71828
x = value of random variable (−∞ < x < ∞)
μ = population mean
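The formula above can be evaluated with the standard library alone; the Riemann-sum check below is our own sanity test, not part of the slides:

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# The density should integrate to 1; approximate with a Riemann sum on [-8, 8].
dx = 0.001
area = sum(normal_pdf(k * dx, 0.0, 1.0) * dx for k in range(-8000, 8000))
print(round(area, 4))  # 1.0
```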
Effect of Varying Parameters (μ and σ)

[Figure: three normal curves A, B, C on a common axis, differing in mean and standard deviation.]
Normal Distribution Probability
P(c ≤ x ≤ d) = ∫ from c to d of f(x) dx = ?

Probability is area under curve!

[Figure: normal density f(x) with the area between c and d shaded.]
Infinite Number of Tables

Normal distributions differ by mean and standard deviation. Each distribution would require its own table. That's an infinite number!

[Figure: several normal curves f(X) with different means and spreads.]
Standardize the Normal Distribution

Z = (X − μ)/σ

Standardizing maps any normal distribution (mean μ, standard deviation σ) to the standardized normal distribution with μ = 0 and σ = 1. One table!

[Figure: a normal curve in X transformed to the standardized normal curve in Z.]
Intuitions on Standardizing
• Subtracting μ from each value X just moves the curve around, so values are centered on 0 instead of on μ.
• Once the curve is centered, dividing each value by σ > 1 moves all values toward 0, compressing the curve.
Standardizing Example

Normal distribution: μ = 5, σ = 10, X = 6.2

Z = (X − μ)/σ = (6.2 − 5)/10 = 0.12

Standardized normal distribution: μ = 0, σ = 1, Z = 0.12

[Figure: X = 6.2 on the original curve maps to Z = 0.12 on the standardized curve.]
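The same standardization is easy in code; `math.erf` plays the role of the single table (our sketch, not from the slides):

```python
import math

def z_score(x, mu, sigma):
    """Standardize: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def std_normal_cdf(z):
    """Phi(z) for the standardized normal, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

z = z_score(6.2, 5, 10)
print(round(z, 2))                  # 0.12
print(round(std_normal_cdf(z), 4))  # about 0.5478
```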
6.4 Expected Value and Variance
Longin Jan Latecki, Temple University

Slides for a course based on the text Discrete Mathematics & Its Applications (6th Edition) by Kenneth H. Rosen, based on slides by Michael P. Frank
Expected Values
• For any random variable V having a numeric domain, its expectation value or expected value or weighted average value or (arithmetic) mean value Ex[V], under the probability distribution Pr[v] = p(v), is defined as
• The term "expected value" is very widely used for this.
– But this term is somewhat misleading, since the "expected" value might itself be totally unexpected, or even impossible!
• E.g., if p(0) = 0.5 and p(2) = 0.5, then Ex[V] = 1, even though p(1) = 0 and so we know that V ≠ 1!
• Or, if p(0) = 0.5 and p(1) = 0.5, then Ex[V] = 0.5 even if V is an integer variable!
Ex[V] :≡ Σ_{v ∈ dom[V]} p(v)·v
Derived Random Variables
• Let S be a sample space over values of a random variable V (representing possible outcomes).
• Then, any function f over S can also be considered to be a random variable (whose actual value f(V) is derived from the actual value of V).
• If the range R = range[f] of f is numeric, then the mean value Ex[f] of f can still be defined, as

Ex[f] :≡ Σ_{s ∈ S} p(s)·f(s)
Recall that a random variable X is actually a function f: S → X(S),
where S is the sample space and X(S) is the range of X.This fact implies that the expected value of X is
E(X) = Σ_{s ∈ S} p(s)·X(s) = Σ_{r ∈ X(S)} p(X = r)·r

Example 1. Expected Value of a Die.
Let X be the number that comes up when a die is rolled.

E(X) = Σ_{r=1..6} p(X = r)·r = (1/6)·(1 + 2 + 3 + 4 + 5 + 6) = 21/6 = 7/2
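The die computation, checked with exact arithmetic (a one-line sketch):

```python
from fractions import Fraction

# E(X) for a fair die: each outcome r in {1,...,6} has probability 1/6.
E = sum(Fraction(1, 6) * r for r in range(1, 7))
print(E)  # 7/2
```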
Example 2
• A fair coin is flipped 3 times. Let S be the sample space of 8 possible outcomes, and let X be a random variable that assigns to an outcome the number of heads in this outcome.

E(X) = 1/8·[X(TTT) + X(TTH) + X(THH) + X(HTT) + X(HHT) + X(HHH) + X(THT) + X(HTH)]
= 1/8·[0 + 1 + 2 + 1 + 2 + 3 + 1 + 2] = 12/8 = 3/2
Linearity of Expectation Values
• Let X1, X2 be any two random variables derived from the same sample space S, and subject to the same underlying distribution.
• Then we have the following theorems:
Ex[X1 + X2] = Ex[X1] + Ex[X2]
Ex[aX1 + b] = a·Ex[X1] + b
• You should be able to easily prove these for yourself at home.
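Both theorems can be spot-checked on a small sample space; the variables X1 and X2 below are our own made-up examples over three coin flips:

```python
import itertools

# Sample space: 3 fair coin flips, each outcome equally likely.
S = list(itertools.product("HT", repeat=3))
p = 1 / len(S)

def Ex(f):
    """Expectation of a random variable f defined on S."""
    return sum(p * f(s) for s in S)

X1 = lambda s: s.count("H")              # number of heads
X2 = lambda s: 1 if s[0] == "H" else 0   # indicator: first flip is heads

lhs = Ex(lambda s: X1(s) + X2(s))
rhs = Ex(X1) + Ex(X2)
print(lhs, rhs)  # both 2.0
assert abs(Ex(lambda s: 3 * X1(s) + 4) - (3 * Ex(X1) + 4)) < 1e-12
```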
Variance & Standard Deviation
• The variance Var[X] = σ²(X) of a random variable X is the expected value of the square of the difference between the value of X and its expectation value Ex[X]:

Var[X] :≡ Σ_{s ∈ S} (X(s) − Ex[X])²·p(s)

• The standard deviation or root-mean-square (RMS) difference of X is σ(X) :≡ Var[X]^(1/2).
Example 15
• What is the variance of the random variable X, where X is the number that comes up when a die is rolled?
• V(X) = E(X²) − E(X)²
E(X²) = 1/6·[1² + 2² + 3² + 4² + 5² + 6²] = 91/6
V(X) = 91/6 − (7/2)² = 35/12 ≈ 2.92
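Checking Example 15 with exact fractions:

```python
from fractions import Fraction

p = Fraction(1, 6)
E_X = sum(p * r for r in range(1, 7))        # 7/2
E_X2 = sum(p * r * r for r in range(1, 7))   # 91/6
V = E_X2 - E_X ** 2
print(V)  # 35/12
```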