notes3n

12. Examples: On the night of March 1, 1986, in Lorain, Ohio, John Doe was struck by

a speeding taxi as he crossed the street. The taxi was driving the wrong way down

a one-way street and did not stop. An eyewitness thought that the taxi was blue.

Lorain has only two taxi companies, Blue Cab and Green Cab. 85% of the taxis are

Green Cab. Tests have shown that the eyewitness is perfect in distinguishing cabs

from private automobiles, but under conditions approximating those of the night of

the accident he was able to identify the correct color of a taxi 80% of the time. Suppose

you are the judge, and suppose that “preponderance of the evidence” is interpreted to

mean a probability greater than 50 percent. Do you think it is more likely than not that the taxi

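was blue rather than green? [Answer: by Bayes' rule,

$$
P(B \mid \text{eyewitness says it is blue})
= \frac{P(\text{eyewitness says it is blue} \mid B)\,P(B)}{P(\text{eyewitness says it is blue} \mid B)\,P(B) + P(\text{eyewitness says it is blue} \mid G)\,P(G)}
= \frac{0.8 \times 0.15}{0.8 \times 0.15 + 0.2 \times 0.85} \approx 41.38\%.
$$

Since this is below 50 percent, it is not more likely than not that the taxi was blue.]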
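The taxi calculation above can be reproduced in a few lines. This is a minimal sketch; the function name `posterior_blue` and its parameter names are illustrative and not part of the notes.

```python
def posterior_blue(prior_blue: float, accuracy: float) -> float:
    """P(taxi is blue | eyewitness says blue), computed with Bayes' rule."""
    prior_green = 1.0 - prior_blue
    says_blue_given_blue = accuracy          # witness identifies the color correctly
    says_blue_given_green = 1.0 - accuracy   # witness misidentifies a green taxi
    numerator = says_blue_given_blue * prior_blue
    evidence = numerator + says_blue_given_green * prior_green
    return numerator / evidence


# Numbers from the example: 15% of taxis are blue, witness is right 80% of the time.
p = posterior_blue(prior_blue=0.15, accuracy=0.80)
print(f"{p:.4f}")  # 0.4138
```

The low prior on blue taxis outweighs the witness's 80% accuracy, which is why the posterior stays below one half.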

2.2 Random Variable

1. A random variable (r.v.) $X$ is a function from a sample space $S$ into the real numbers. The sample space of the r.v., $\mathcal{X}$, is the range of the r.v.

• Why do we want to introduce random variables? The reason is that in many experiments it is easier to deal with a summary variable than with the original probability structure. For example, in an opinion poll, we may ask 50 people whether they favor Gore or Bush. If we record a ‘1’ for Gore and a ‘0’ for Bush, the sample space for this experiment will have $2^{50}$ elements (too big!). It may be that the only quantity of interest is the number of people who favor Gore; then we can define a random variable $X$ = the number of 1s recorded out of 50. The sample space for $X$ is the set of integers $\{0, 1, \ldots, 50\}$.

• Example: “toss two coins”. Recall the original sample space $\{HH, HT, TH, TT\}$. We can define a r.v. as the number of heads in the two tosses, which corresponds to the following map:

$X(HH) = 2$

$X(HT) = X(TH) = 1$

$X(TT) = 0.$

The number of heads in the two tosses is random (or stochastic) because it

depends on the outcome of the random experiment.
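To make the idea of a r.v. as a function on the sample space concrete, the two-coin map can be written out directly. This is only a sketch; the names `sample_space`, `X`, and `mapping` are chosen for illustration.

```python
from itertools import product

# Sample space for two coin tosses: ('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T').
sample_space = list(product("HT", repeat=2))


def X(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")


mapping = {"".join(s): X(s) for s in sample_space}
print(mapping)  # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}

# Poll example: 50 binary answers versus the summary count.
n_raw_outcomes = 2 ** 50               # size of the original sample space
n_summary_values = len(range(0, 51))   # possible values of X: 0, 1, ..., 50
```

The last two lines show why the summary variable is convenient: the raw sample space has about 1.1 quadrillion elements, while $X$ takes only 51 values.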

2. Notational convention: r.v.s will always be denoted by uppercase letters, and the realized values of a r.v. will be denoted by the corresponding lowercase letter. For example, the r.v. $X$ can take the value $x$.

3. Suppose that we have a random experiment with sample space $S = \{s_1, \ldots, s_n\}$ and probability measure $P$. Now suppose that we define a r.v. $X$ with range $\mathcal{X} = \{x_1, \ldots, x_m\}$. How can we define a probability function $P_X$ on $\mathcal{X}$? We will observe $X = x_i$ if and only if the outcome of the experiment is an $s_j$ such that $X(s_j) = x_i$. Hence

$$P_X(X = x_i) = P(\{s_j \in S : X(s_j) = x_i\}).$$

• Example: Consider an experiment of tossing a fair coin three times. Define the r.v. $X$ to be the number of heads obtained in the three tosses. What is $S$? What are $X$ and $\mathcal{X}$? What is $P_X(1)$?
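One way to work through the three-toss example is to enumerate $S$ and push the probability measure through $X$, exactly as in the definition of $P_X$. The sketch below assumes a fair coin, so every outcome has probability 1/8; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space S: all 8 equally likely sequences of three tosses.
S = list(product("HT", repeat=3))
P = {s: Fraction(1, len(S)) for s in S}  # fair coin, so the measure is uniform


def X(s):
    """Number of heads in the outcome s."""
    return s.count("H")


# P_X(x) = P({s in S : X(s) = x}): add up the probability of every outcome mapped to x.
P_X = {}
for s in S:
    P_X[X(s)] = P_X.get(X(s), Fraction(0)) + P[s]

print(sorted(P_X))  # [0, 1, 2, 3]
print(P_X[1])       # 3/8
```

Exact fractions are used so that the induced probabilities can be compared with hand calculations without rounding.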

4. If the range of $X$ is finite or countably infinite, we call the r.v. a discrete r.v. A continuous r.v. is a r.v. that can take on any value in some interval. A r.v. is called mixed if it is discrete when restricted to one part of its range and continuous when restricted to another.

2.3 Cumulative Distribution Function (cdf)

1. The c.d.f. of a random variable $X$, denoted by $F_X(x)$, is defined by

$$F_X(x) = P_X(X \le x) \quad \text{for all } x.$$

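• Example: Consider the experiment of tossing three fair coins, and let $X$ = number of heads observed. The c.d.f. of $X$ is

$$
F_X(x) =
\begin{cases}
0 & \text{if } x \in (-\infty, 0) \\
1/8 & \text{if } x \in [0, 1) \\
1/2 & \text{if } x \in [1, 2) \\
7/8 & \text{if } x \in [2, 3) \\
1 & \text{if } x \in [3, \infty)
\end{cases}
$$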

[step function graph representation] Note that $F_X$ satisfies the following three properties. [These are general properties of a c.d.f.: any function $F_X$ that satisfies them is the c.d.f. of some random variable.]

- $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$;

- $F_X(x)$ is nondecreasing in $x$;

- $F_X(x)$ is right continuous.
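The step c.d.f. of the three-coin example can be coded as a lookup over its jump points, which makes right continuity easy to see at a jump. This is a sketch under the example's assumptions; `jumps`, `cdf_values`, and `F` are illustrative names.

```python
from bisect import bisect_right
from fractions import Fraction

# Jump points of X (number of heads in three fair tosses) and the
# value of F_X on each interval [jump, next jump).
jumps = [0, 1, 2, 3]
cdf_values = [Fraction(1, 8), Fraction(1, 2), Fraction(7, 8), Fraction(1)]


def F(x):
    """Step c.d.f. F_X(x) = P(X <= x)."""
    k = bisect_right(jumps, x)  # number of jump points that are <= x
    return Fraction(0) if k == 0 else cdf_values[k - 1]


print(F(-0.5), F(0), F(1.5), F(3))  # 0 1/8 1/2 1

# Right continuity at the jump x = 1: the value just to the right equals F(1),
# while the value just to the left is strictly smaller.
just_right = F(1 + 1e-9)
just_left = F(1 - 1e-9)
```

`bisect_right` counts the jump points less than or equal to `x`, which matches the closed-on-the-left intervals in the definition.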

• Example: Suppose we do an experiment that consists of tossing a coin until a head appears. Let $p$ = the probability of a head on any given toss. Define a r.v. $X$ = number of tosses required to get a head. Then for any $x = 1, 2, \ldots$,

$$P(X = x) = (1-p)^{x-1} p,$$

since we must get $x - 1$ tails followed by a head for the event to occur and all trials are independent. Hence for any positive integer $x$,

$$
P(X \le x) = \sum_{i=1}^{x} P(X = i) = \sum_{i=1}^{x} (1-p)^{i-1} p
= \frac{1 - (1-p)^{x}}{1 - (1-p)}\, p = 1 - (1-p)^{x}, \quad x = 1, 2, \ldots
$$

Hence $F_X(x) = 1 - (1-p)^{x}$ for $x = 1, 2, \ldots$ [write out the complete $F_X$] This is called the geometric distribution.
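The closed form for the geometric c.d.f. can be checked against direct summation of the point probabilities. The sketch below also writes out one way to extend $F_X$ to all real $x$: it is 0 below 1 and constant between consecutive integers. The value `p = 0.3` and the function names are arbitrary choices for illustration.

```python
import math


def geometric_pmf(x: int, p: float) -> float:
    """P(X = x): x - 1 tails followed by a head."""
    return (1 - p) ** (x - 1) * p if x >= 1 else 0.0


def geometric_cdf(x: float, p: float) -> float:
    """F_X on the whole real line: 0 below 1, a step function afterwards."""
    if x < 1:
        return 0.0
    return 1 - (1 - p) ** math.floor(x)


p = 0.3  # arbitrary success probability for the check
for x in range(1, 11):
    partial_sum = sum(geometric_pmf(i, p) for i in range(1, x + 1))
    assert math.isclose(partial_sum, geometric_cdf(x, p))
```

The loop confirms, for the first ten integers, that the finite geometric sum agrees with $1 - (1-p)^x$.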

• Example: Consider a continuous c.d.f. [logistic distribution]

$$F_X(x) = \frac{1}{1 + e^{-x}}.$$

Why is this a c.d.f.? [Verify the three defining properties of a c.d.f.]
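A numerical spot check of the three properties for the logistic c.d.f. is sketched below. It does not replace the analytic verification; the grid, the tail points at ±40, and the step size are arbitrary choices.

```python
import math


def logistic_cdf(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# 1) Limits: far out in the tails the values approach 0 and 1.
left_tail = logistic_cdf(-40.0)
right_tail = logistic_cdf(40.0)

# 2) Nondecreasing: compare consecutive values on a grid over [-10, 10].
grid = [i / 10 for i in range(-100, 101)]
values = [logistic_cdf(x) for x in grid]
is_nondecreasing = all(a <= b for a, b in zip(values, values[1:]))

# 3) Right continuity: a tiny step to the right changes the value only slightly.
right_gap = abs(logistic_cdf(0.0 + 1e-9) - logistic_cdf(0.0))
```

Analytically, the limits follow from $e^{-x} \to \infty$ as $x \to -\infty$ and $e^{-x} \to 0$ as $x \to \infty$, monotonicity follows from the positive derivative, and continuity follows because $F_X$ is a composition of continuous functions.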

2.4 Probability Density Function [pdf] and Probability Mass Function [pmf]

1. Associated with a r.v. $X$ and its cdf $F_X$ is another function, called the probability mass function (pmf) for a discrete r.v. and the probability density function (pdf) for a continuous r.v. Both the pdf and the pmf are concerned with “point probabilities” of r.v.s. Notational convention: we use an uppercase letter for the cdf and the corresponding lowercase letter for the pmf or pdf.

2. The pmf, $f_X(x)$, of a discrete r.v. $X$ is given by

$$f_X(x) = P(X = x) \quad \text{for all } x.$$

• The pmf for the geometric distribution is

$$
f_X(x) = P(X = x) =
\begin{cases}
(1-p)^{x-1} p & \text{for } x = 1, 2, \ldots \\
0 & \text{otherwise}
\end{cases}
$$

• For a discrete r.v., to get the cdf from the pmf, we do the following:

$$F_X(x) = P(X \le x) = \sum_{z=-\infty}^{x} f_X(z).$$
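For a pmf with finite support, the sum above is just a running total over the support points in increasing order. The sketch uses the number of heads in two tosses and assumes the coins are fair; the list names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# pmf of X = number of heads in two fair tosses, support listed in increasing order.
support = [0, 1, 2]
pmf = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

# Running sums of the pmf give the cdf at each support point.
cdf = list(accumulate(pmf))
cdf_at = dict(zip(support, cdf))
print(cdf_at[1])  # 3/4
```

Between support points the cdf stays flat, so these three values determine $F_X$ everywhere.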

3. The pdf, $f_X(x)$, of a continuous r.v. $X$ is the function that satisfies

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt \quad \text{for all } x.$$

When $f_X$ is continuous, then by the Fundamental Theorem of Calculus, we have

$$f_X(x) = \frac{dF_X(x)}{dx}.$$

• Example: For the logistic distribution, we have its cdf

$$F_X(x) = \frac{1}{1 + e^{-x}},$$

hence its pdf is

$$f_X(x) = \frac{dF_X(x)}{dx} = \frac{e^{-x}}{(1 + e^{-x})^{2}}.$$

The area under the curve $f_X(x)$ gives us the interval probabilities [graph representation]

$$P(a \le X \le b) = F_X(b) - F_X(a) = \int_{a}^{b} f_X(x)\,dx.$$
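Both relations for the logistic distribution can be checked numerically: the pdf as the derivative of the cdf, and the interval probability as an area. The sketch below uses a central difference and a simple midpoint rule; the interval `[-1, 2]`, the step sizes, and the helper `midpoint_integral` are illustrative choices, not part of the notes.

```python
import math


def logistic_cdf(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def logistic_pdf(x: float) -> float:
    e = math.exp(-x)
    return e / (1.0 + e) ** 2


def midpoint_integral(f, a: float, b: float, n: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# pdf as the derivative of the cdf, via a central difference at x0.
x0, h = 0.7, 1e-5
numeric_derivative = (logistic_cdf(x0 + h) - logistic_cdf(x0 - h)) / (2 * h)

# Interval probability two ways: difference of cdf values, and area under the pdf.
a, b = -1.0, 2.0
by_cdf = logistic_cdf(b) - logistic_cdf(a)
by_area = midpoint_integral(logistic_pdf, a, b)
```

Up to discretization error, `numeric_derivative` should match `logistic_pdf(x0)` and `by_area` should match `by_cdf`.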

4. The support of a r.v. $X$ is defined as

$$\mathrm{Supp}[X] = \{x : f_X(x) > 0\}.$$

That is, the support of a r.v. is the set of values that can arise with positive density.

5. How do we check whether a function $f_X$ is a pdf (or pmf) of some r.v.? It is a pdf (or pmf) if and only if

• $f_X(x) \ge 0$ for all $x$;

• $\sum_{x} f_X(x) = 1$ (for a pmf) or $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$ (for a pdf).
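The two conditions can be spot-checked numerically for the geometric pmf and the logistic pdf from the earlier examples. This is a rough sketch: the infinite sum is truncated at 200 terms, the integral is taken over `[-40, 40]` with a midpoint rule, and `p = 0.3` is arbitrary, so the totals are only approximately 1.

```python
import math


def geometric_pmf(x: int, p: float) -> float:
    return (1 - p) ** (x - 1) * p


def logistic_pdf(x: float) -> float:
    e = math.exp(-x)
    return e / (1.0 + e) ** 2


def midpoint_integral(f, a: float, b: float, n: int) -> float:
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# pmf check: nonnegative terms whose (truncated) sum is close to 1.
p = 0.3
pmf_terms = [geometric_pmf(x, p) for x in range(1, 201)]
pmf_nonnegative = all(t >= 0 for t in pmf_terms)
pmf_total = sum(pmf_terms)

# pdf check: nonnegative values whose integral over a wide interval is close to 1.
pdf_nonnegative = all(logistic_pdf(x / 10) >= 0 for x in range(-400, 401))
pdf_total = midpoint_integral(logistic_pdf, -40.0, 40.0, n=100_000)
```

A numerical total near 1 is evidence rather than proof; the exact statements come from the geometric series and from $F_X(\infty) - F_X(-\infty) = 1$.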

6. Notation: The expression “$X$ has a distribution given by $F_X(x)$” is abbreviated symbolically by “$X \sim F_X(x)$”, where we read the symbol “$\sim$” as “is distributed as”. We can similarly write $X \sim f_X(x)$.
