binomial normal distribution

Upload: gopal-gupta

Post on 14-Apr-2018

TRANSCRIPT

  • 7/30/2019 Binomial Normal Distribution

    1/47

    Normal Distribution

    2/47

    Normal Distribution

    Why are normal distributions so important?

    Many dependent variables are commonly assumed to be normally distributed in the population

    If a variable is approximately normally distributed, we can make inferences about values of that variable

    Example: Sampling distribution of the mean

    3/47

    TYPES OF HISTOGRAM

    4/47

    Normal Distribution

    Symmetrical, bell-shaped curve

    Also known as Gaussian distribution

    Point of inflection = 1 standard deviation from mean

    Mathematical formula

    $f(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$
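As a rough illustration of the density formula, the sketch below evaluates it directly. The function name `normal_pdf` and the sample points are illustrative choices, not part of the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Standard normal (mu = 0, sigma = 1): the peak sits at the mean,
# and the curve takes the same value one standard deviation either side.
peak = normal_pdf(0.0, mu=0.0, sigma=1.0)
left = normal_pdf(-1.0, mu=0.0, sigma=1.0)
right = normal_pdf(1.0, mu=0.0, sigma=1.0)
print(round(peak, 4), round(left, 4), round(right, 4))
```

The equal values at -1 and +1 reflect the symmetry of the bell-shaped curve around its mean.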

    5/47

    Since we know the shape of the curve, we can calculate the area under the curve

    The percentage of that area can be used to determine the probability that a given value could be pulled from a given distribution

    The area under the curve tells us about the probability; in other words, we can obtain a p-value for our result (data) by treating it as a normally distributed data set.

    6/47

    Problem:

    Each normal distribution, with its own values of $\mu$ and $\sigma$, would need its own calculation of the area under various points on the curve

    7/47

    The normal bell-shaped curve: $\mu = 100$, $\sigma^2 = 10$

    [Figure: density f(x) plotted against x, with x from 90 to 110]

    8/47

    Normal curves: ($\mu = 0$, $\sigma^2 = 1$) and ($\mu = 5$, $\sigma^2 = 1$)

    [Figure: both densities plotted against x, with x from -2 to 8]

    9/47

    Normal curves: ($\mu = 0$, $\sigma^2 = 1$) and ($\mu = 0$, $\sigma^2 = 2$)

    [Figure: both densities plotted against x, with x from -3 to 3]

    10/47

    Normal curves: ($\mu = 0$, $\sigma^2 = 1$) and ($\mu = 2$, $\sigma^2 = 0.25$)

    [Figure: both densities plotted with the horizontal axis from -2 to 8]

    11/47

    The standard normal curve: $\mu = 0$ and $\sigma^2 = 1$

    [Figure: standard normal density plotted against x, with x from -3 to 3]

    12/47

    Normal Probability Distributions: Standard Normal Distribution N(0,1)

    We agree to use the standard normal distribution

    Bell shaped

    $\mu = 0$

    $\sigma = 1$

    Note: not all bell-shaped distributions are normal distributions

    13/47

    Normal Distribution

    The standard normal distribution will allow us

    to make claims about the probabilities of

    values related to our own data

    How do we apply the standard normal

    distribution to our data?

    14/47

    Z-score

    If we know the population mean and population standard deviation, for any value of X we can compute a z-score by subtracting the population mean and dividing the result by the population standard deviation

    $z = \frac{X - \mu}{\sigma}$
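A minimal sketch of the z-score calculation, followed by a lookup of areas under the standard normal curve. The values X = 115, mean 100 and standard deviation 15 are hypothetical, chosen only so that z comes out to 1. `statistics.NormalDist` is part of the Python standard library (3.8 and later).

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Subtract the population mean, then divide by the population standard deviation."""
    return (x - mu) / sigma

# Hypothetical population: mean 100, standard deviation 15.
z = z_score(115.0, mu=100.0, sigma=15.0)

std_normal = NormalDist(mu=0.0, sigma=1.0)
p_above = 1.0 - std_normal.cdf(z)                       # area to the right of z
p_within = std_normal.cdf(1.0) - std_normal.cdf(-1.0)   # area between -1 and +1

print(f"z = {z:.2f}, P(Z > z) = {p_above:.4f}, P(-1 < Z < 1) = {p_within:.4f}")
```

Once a value is converted to a z-score, the same standard normal curve serves every normal distribution, which is the point of standardizing.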

    15/47

    Probabilities are depicted by areas under the curve

    Total area under the curve is 1

    The area in red is equal to p(z > 1)

    The area in blue is equal to p(-1 < z

    17/47

    Binomial Probability Distribution

    A fixed number of observations (trials), n

    e.g., 15 tosses of a coin; 20 patients; 1000 people surveyed

    A binary random variable

    e.g., head or tail in each toss of a coin; defective or not defective light bulb

    Generally called "success" and "failure"

    Probability of success is p, probability of failure is 1 - p

    Constant probability for each observation

    e.g., probability of getting a tail is the same each time we toss the coin

    18/47

    Binomial example

    Take the example of 5 coin tosses. What's the probability that you flip exactly 3 heads in 5 coin tosses?

    19/47

    Binomial distribution

    Solution:

    One way to get exactly 3 heads: HHHTT

    What's the probability of this exact arrangement?

    P(heads) x P(heads) x P(heads) x P(tails) x P(tails) = $(1/2)^3 \times (1/2)^2$

    Another way to get exactly 3 heads: THHHT

    Probability of this exact outcome = $(1/2)^1 \times (1/2)^3 \times (1/2)^1 = (1/2)^3 \times (1/2)^2$

    20/47

    Binomial distribution

    In fact, $(1/2)^3 \times (1/2)^2$ is the probability of each unique outcome that has exactly 3 heads and 2 tails.

    So, the overall probability of 3 heads and 2 tails is:

    $(1/2)^3 \times (1/2)^2 + (1/2)^3 \times (1/2)^2 + (1/2)^3 \times (1/2)^2 + \dots$ for as many unique arrangements as there are. But how many are there?

    21/47

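    Outcome    Probability
    THHHT      $(1/2)^3 \times (1/2)^2$
    HHHTT      $(1/2)^3 \times (1/2)^2$
    TTHHH      $(1/2)^3 \times (1/2)^2$
    HTTHH      $(1/2)^3 \times (1/2)^2$
    HHTTH      $(1/2)^3 \times (1/2)^2$
    THTHH      $(1/2)^3 \times (1/2)^2$
    HTHTH      $(1/2)^3 \times (1/2)^2$
    HHTHT      $(1/2)^3 \times (1/2)^2$
    THHTH      $(1/2)^3 \times (1/2)^2$
    HTHHT      $(1/2)^3 \times (1/2)^2$

    Total: 10 arrangements x $(1/2)^3 \times (1/2)^2$

    $(1/2)^3 \times (1/2)^2$ is the probability of each unique outcome (note: they are all equal)

    $\binom{5}{3}$ = number of ways to arrange 3 heads in 5 trials

    $\binom{5}{3} = {}_5C_3 = \frac{5!}{3!\,2!} = 10$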
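The count of 10 arrangements can be checked by brute force. This sketch lists every possible sequence of 5 tosses and keeps those with exactly 3 heads. The variable names are illustrative.

```python
from itertools import product
from math import comb

# All 2**5 = 32 possible sequences of heads (H) and tails (T).
sequences = ["".join(s) for s in product("HT", repeat=5)]

# Keep only the sequences with exactly 3 heads.
three_heads = [s for s in sequences if s.count("H") == 3]
n_arrangements = len(three_heads)

# Every such sequence has the same probability for a fair coin.
prob_each = (1 / 2) ** 3 * (1 / 2) ** 2
prob_three_heads = n_arrangements * prob_each

print(n_arrangements, comb(5, 3), prob_three_heads)
```

Enumeration only works for small n, since the number of sequences doubles with each toss. The factorial formula gives the same count without listing anything.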

    22/47

    P(3 heads and 2 tails) = $\binom{5}{3} \times P(\text{heads})^3 \times P(\text{tails})^2 = 10 \times (1/2)^5 = 31.25\%$

    23/47

    Binomial distribution function: X = the number of heads tossed in 5 coin tosses

    [Figure: bar chart of p(x) against the number of heads, x = 0 to 5]

    24/47

    Example 2

    As voters exit the polls, you ask a representative random sample of 6 voters if they voted for proposition 100. If the true percentage of voters who vote for the proposition is 55.1%, what is the probability that, in your sample, exactly 2 voted for the proposition and 4 did not?

    25/47

    Solution:

    Outcome    Probability
    YYNNNN     $(.551)^2 \times (.449)^4$
    NYYNNN     $(.449)^1 \times (.551)^2 \times (.449)^3 = (.551)^2 \times (.449)^4$
    NNYYNN     $(.449)^2 \times (.551)^2 \times (.449)^2 = (.551)^2 \times (.449)^4$
    NNNYYN     $(.449)^3 \times (.551)^2 \times (.449)^1 = (.551)^2 \times (.449)^4$
    NNNNYY     $(.449)^4 \times (.551)^2 = (.551)^2 \times (.449)^4$
    ...

    $\binom{6}{2}$ = number of ways to arrange 2 yes votes among 6 voters

    Total: 15 arrangements x $(.551)^2 \times (.449)^4$

    P(2 yes votes exactly) = $\binom{6}{2} \times (.551)^2 \times (.449)^4 = 18.5\%$
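The arithmetic for the exit-poll example can be reproduced in a few lines. This is only a sketch using the numbers from the example, with illustrative variable names.

```python
from math import comb

p_yes = 0.551      # true share of voters in favour
n_voters = 6       # sample size
k_yes = 2          # yes votes we ask about

arrangements = comb(n_voters, k_yes)                          # ways to place 2 Y's among 6
prob_one = p_yes ** k_yes * (1 - p_yes) ** (n_voters - k_yes) # probability of one arrangement
prob_two_yes = arrangements * prob_one

print(arrangements, f"{prob_two_yes:.3f}")
```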

    26/47

    Binomial distribution, generally

    Note the general pattern emerging: if you have only two possible outcomes (call them 1/0 or yes/no or success/failure) in n independent trials, then the probability of exactly X "successes" =

    $\binom{n}{X} \, p^X (1-p)^{n-X}$

    n = number of trials

    X = # successes out of n trials

    p = probability of success

    1 - p = probability of failure

    27/47

    Definitions: Binomial

    Binomial: Suppose that n independent experiments, or trials, are performed, where n is a fixed number, and that each experiment results in a "success" with probability p and a "failure" with probability 1 - p. The total number of successes, X, is a binomial random variable with parameters n and p.

    We write: X ~ Bin(n, p) {reads: "X is distributed binomially with parameters n and p"}

    And the probability that X = r (i.e., that there are exactly r successes) is:

    $P(X = r) = \binom{n}{r} \, p^r (1-p)^{n-r}$

    28/47

    Definitions: Bernoulli

    Bernoulli trial: If there is only 1 trial with probability of success p and probability of failure 1 - p, this is called a Bernoulli distribution (a special case of the binomial with n = 1).

    Probability of success: $P(X = 1) = \binom{1}{1} \, p^1 (1-p)^{1-1} = p$

    Probability of failure: $P(X = 0) = \binom{1}{0} \, p^0 (1-p)^{1-0} = 1 - p$

    29/47

    Binomial distribution: example

    If I toss a coin 20 times, what's the probability of getting exactly 10 heads?

    $\binom{20}{10} (.5)^{10} (.5)^{10} = .176$

    30/47

    Binomial distribution: example

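    If I toss a coin 20 times, what's the probability of getting 2 or fewer heads?

    $\binom{20}{0} (.5)^0 (.5)^{20} = \frac{20!}{20!\,0!} (.5)^{20} = 9.5 \times 10^{-7}$

    $\binom{20}{1} (.5)^1 (.5)^{19} = \frac{20!}{19!\,1!} (.5)^{20} = 20 \times 9.5 \times 10^{-7} = 1.9 \times 10^{-5}$

    $\binom{20}{2} (.5)^2 (.5)^{18} = \frac{20!}{18!\,2!} (.5)^{20} = 190 \times 9.5 \times 10^{-7} = 1.8 \times 10^{-4}$

    Sum of the three terms $\approx 2.0 \times 10^{-4}$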
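To avoid rounding in the intermediate terms, the sum can be carried out with exact fractions. The helper names `binomial_pmf` and `binomial_cdf` below are illustrative, not from the slides.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def binomial_cdf(r, n, p):
    """P(X <= r): add up the probabilities of 0, 1, ..., r successes."""
    return sum(binomial_pmf(k, n, p) for k in range(r + 1))

half = Fraction(1, 2)                     # fair coin, kept as an exact fraction
p_exactly_10 = binomial_pmf(10, 20, half)
p_at_most_2 = binomial_cdf(2, 20, half)   # (1 + 20 + 190) / 2**20

print(float(p_exactly_10), p_at_most_2, float(p_at_most_2))
```

With exact fractions, the three coefficients 1, 20 and 190 add to 211 over the common denominator 2^20, which is about 2.0 x 10^-4.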

    31/47

    Things that follow a binomial distribution

    Cohort study (or cross-sectional):

    The number of exposed individuals in your sample that develop the disease

    The number of unexposed individuals in your sample that develop the disease

    Case-control study:

    The number of cases that have had the exposure

    The number of controls that have had the exposure

    32/47

    Answer

    1. You are performing a cohort study. If the probability of developing disease in the exposed group is .05 for the study duration, then if you sample (randomly) 500 exposed people, how many do you expect to develop the disease? Give a margin of error (+/- 1 standard deviation) for your estimate.

    X ~ binomial (500, .05)

    E(X) = 500 (.05) = 25

    Var(X) = 500 (.05) (.95) = 23.75

    StdDev(X) = square root (23.75) = 4.87

    Estimate: 25 +/- 4.87
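The same three quantities, computed from n and p. This is a small sketch with illustrative variable names.

```python
import math

n = 500      # exposed people sampled
p = 0.05     # probability of developing the disease

mean = n * p                     # expected number of cases
variance = n * p * (1 - p)       # binomial variance
std_dev = math.sqrt(variance)    # one standard deviation

low, high = mean - std_dev, mean + std_dev
print(f"{mean:.0f} +/- {std_dev:.2f}  (about {low:.1f} to {high:.1f})")
```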

    33/47

    Answer

    2. What's the probability that at most 10 exposed subjects develop the disease?

    This is asking for a CUMULATIVE PROBABILITY: the probability of 0 getting the disease or 1 or 2 or 3 or 4 or up to 10.

    $P(X \le 10) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) + \dots + P(X=10)$

    $= \binom{500}{0} (.05)^0 (.95)^{500} + \binom{500}{1} (.05)^1 (.95)^{499} + \binom{500}{2} (.05)^2 (.95)^{498} + \dots + \binom{500}{10} (.05)^{10} (.95)^{490} < .01$

    (we'll learn how to approximate this long sum next week)
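A computer can add the eleven terms directly. The sketch below builds each term from the previous one, using the ratio P(X = k+1) / P(X = k) = (n - k)/(k + 1) x p/(1 - p). That recurrence is a convenience chosen here, not a method from the slides, and the function name is illustrative.

```python
def binomial_cdf_recursive(r, n, p):
    """P(X <= r) for X ~ Bin(n, p), building each term from the one before."""
    term = (1 - p) ** n          # P(X = 0)
    total = term
    for k in range(r):
        # Step from P(X = k) to P(X = k + 1).
        term *= (n - k) / (k + 1) * p / (1 - p)
        total += term
    return total

p_at_most_10 = binomial_cdf_recursive(10, n=500, p=0.05)
print(p_at_most_10 < 0.01, p_at_most_10)
```

Ten or fewer cases is about three standard deviations below the expected 25, so the cumulative probability comes out well under .01.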

    34/47

    A brief distraction: Pascal's Triangle Trick

    You'll rarely calculate the binomial by hand. However, it is good to know how to.

    Pascal's Triangle Trick for calculating binomial coefficients

    Recall from math in your past that Pascal's Triangle is used to get the coefficients for binomial expansion

    For example, to expand: $(p + q)^5$

    The powers follow a set pattern: $p^5 + p^4 q^1 + p^3 q^2 + p^2 q^3 + p^1 q^4 + q^5$

    But what are the coefficients?

    Use Pascal's Magic Triangle

    35/47

    Pascal's Triangle

    1
    1 1
    1 2 1
    1 3 3 1
    1 4 6 4 1
    1 5 10 10 5 1
    1 6 15 20 15 6 1
    1 7 21 35 35 21 7 1

    Edges are all 1's

    Add the two numbers in the row above to get the number below, e.g.: 3+1=4; 5+10=15

    To get the coefficients for expanding to the 5th power, use the row that starts with 5 (after the leading 1).

    $(p + q)^5 = 1p^5 + 5p^4 q^1 + 10p^3 q^2 + 10p^2 q^3 + 5p^1 q^4 + 1q^5$
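The two rules (edges are 1, each inner number is the sum of the two above it) are enough to generate the triangle. A minimal sketch, with an illustrative function name:

```python
def pascal_rows(n_rows):
    """Return the first n_rows rows of Pascal's triangle as lists of integers."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # Each inner entry is the sum of the two entries above it.
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])   # edges are always 1
    return rows

rows = pascal_rows(8)
for row in rows:
    print(" ".join(str(c) for c in row))

row5 = rows[5]   # coefficients for expanding to the 5th power
```

Indexing from zero, `rows[5]` is the row used for the 5th power.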

    36/47

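    For example, X = # heads in 5 coin tosses:

    X    P(X)                                           Binomial coefficient
    0    $\binom{5}{0} (.5)^0 (.5)^5 = 1 (.5)^5$        $\binom{5}{0} = 5!/0!5! = 1$
    1    $\binom{5}{1} (.5)^1 (.5)^4 = 5 (.5)^5$        $\binom{5}{1} = 5!/1!4! = 5$
    2    $\binom{5}{2} (.5)^2 (.5)^3 = 10 (.5)^5$       $\binom{5}{2} = 5!/2!3! = 5 \times 4/2 = 10$
    3    $\binom{5}{3} (.5)^3 (.5)^2 = 10 (.5)^5$       $\binom{5}{3} = 5!/3!2! = 10$
    4    $\binom{5}{4} (.5)^4 (.5)^1 = 5 (.5)^5$        $\binom{5}{4} = 5!/4!1! = 5$
    5    $\binom{5}{5} (.5)^5 (.5)^0 = 1 (.5)^5$        $\binom{5}{5} = 5!/5!0! = 1$

    Sum: $32 (.5)^5 = 1.0$

    The coefficients come from line 5 of Pascal's triangle! (Note the symmetry!)

    Same coefficients for X ~ Bin(5, p)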

    37/47

    Relationship between binomial probability distribution and binomial expansion

    If p + q = 1 (which is the case if they are binomial probabilities)

    then: $(p + q)^5 = (1)^5 = 1$ or, equivalently:

    $1p^5 + 5p^4 q^1 + 10p^3 q^2 + 10p^2 q^3 + 5p^1 q^4 + 1q^5 = 1$

    (the probabilities sum to 1, making it a probability distribution!)

    The six terms are labeled, in order: P(X=0), P(X=1), P(X=2), P(X=3), P(X=4), P(X=5)

    38/47

    Introduction to the Poisson Distribution

    Poisson distribution is for counts: if events happen at a constant rate over time, the Poisson distribution gives the probability of X number of events occurring in time T.

    39/47

    Poisson Mean and Variance

    Mean: $\mu = \lambda$

    Variance and Standard Deviation: $\sigma^2 = \lambda$, $\sigma = \sqrt{\lambda}$

    where $\lambda$ = expected number of hits in a given time period

    For a Poisson random variable, the variance and mean are the same!

    40/47

    Poisson Distribution, example

    The Poisson distribution models counts, such as the number of new cases of SARS that occur in women in New England next month.

    The distribution tells you the probability of all possible numbers of new cases, from 0 to infinity.

    If X = # of new cases next month and X ~ Poisson($\lambda$), then the probability that X = k (a particular count) is:

    $p(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$

    41/47

    Example

    For example, if new cases of West Nile Virus in New England are occurring at a rate of about 2 per month, then these are the probabilities that 0, 1, 2, 3, 4, 5, 6, ... to 1000 to 1 million to $\infty$ cases will occur in New England in the next month:

    42/47

    Poisson Probability table

    X    P(X)
    0    $\frac{2^0 e^{-2}}{0!} = .135$
    1    $\frac{2^1 e^{-2}}{1!} = .27$
    2    $\frac{2^2 e^{-2}}{2!} = .27$
    3    $\frac{2^3 e^{-2}}{3!} = .18$
    4    = .09
    5
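The table entries can be regenerated from the Poisson formula with a mean of 2 per month. The function name `poisson_pmf` is illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 2.0   # about 2 new cases per month
table = {k: poisson_pmf(k, lam) for k in range(6)}

for k, prob in table.items():
    print(k, f"{prob:.3f}")
```

The probabilities for k = 1 and k = 2 coincide because 2^1/1! and 2^2/2! are both equal to 2.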

    43/47

    Example: Poisson distribution

    Suppose that a rare disease has an incidence of 1 in 1000 person-years. Assuming that members of the population are affected independently, find the probability of k cases in a population of 10,000 (followed over 1 year) for k = 0, 1, 2.

    The expected value (mean) = $\lambda$ = .001*10,000 = 10

    10 new cases expected in this population per year

    $P(X = 0) = \frac{(10)^0 e^{-(10)}}{0!} = .0000454$

    $P(X = 1) = \frac{(10)^1 e^{-(10)}}{1!} = .000454$

    $P(X = 2) = \frac{(10)^2 e^{-(10)}}{2!} = .00227$

    44/47

    More on Poisson

    "Poisson Process" (rates)

    Note that the Poisson parameter $\lambda$ can be given as the mean number of events that occur in a defined time period OR, equivalently, $\lambda$ can be given as a rate, such as $\lambda$ = 2/month (2 events per 1 month), that must be multiplied by t = time (called a "Poisson Process")

    X ~ Poisson($\lambda$)

    $P(X = k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}$

    E(X) = $\lambda t$

    Var(X) = $\lambda t$

    45/47

    Example

    For example, if new cases of West Nile in New England are occurring at a rate of about 2 per month, then what's the probability that exactly 4 cases will occur in the next 3 months?

    X ~ Poisson($\lambda$ = 2/month)

    $P(X = 4 \text{ in 3 months}) = \frac{(2 \cdot 3)^4 e^{-(2 \cdot 3)}}{4!} = \frac{6^4 e^{-(6)}}{4!} = 13.4\%$

    Exactly 6 cases?

    $P(X = 6 \text{ in 3 months}) = \frac{(2 \cdot 3)^6 e^{-(2 \cdot 3)}}{6!} = \frac{6^6 e^{-(6)}}{6!} = 16\%$
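When the parameter is given as a rate, the only extra step is multiplying by the length of the time window. A brief sketch, with an illustrative function name:

```python
import math

def poisson_process_pmf(k, rate, t):
    """P(exactly k events in a window of length t) when events arrive at a constant rate."""
    mean = rate * t   # expected number of events in the window
    return mean ** k * math.exp(-mean) / math.factorial(k)

# Rate of 2 cases per month, window of 3 months, so the mean is 6.
p_four = poisson_process_pmf(4, rate=2.0, t=3.0)
p_six = poisson_process_pmf(6, rate=2.0, t=3.0)

print(f"{p_four:.3f} {p_six:.3f}")
```

The rate and the window must use the same time unit (here, months).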

    46/47

    Practice problems

    1a. If calls to your cell phone are a Poisson process with a constant rate $\lambda$ = 2 calls per hour, what's the probability that, if you forget to turn your phone off in a 1.5 hour movie, your phone rings during that time?

    1b. How many phone calls do you expect to get during the movie?

    47/47

    Answer

    1a. If calls to your cell phone are a Poisson process with a constant rate $\lambda$ = 2 calls per hour, what's the probability that, if you forget to turn your phone off in a 1.5 hour movie, your phone rings during that time?

    X ~ Poisson($\lambda$ = 2 calls/hour)

    $P(X \ge 1) = 1 - P(X = 0)$

    $P(X = 0) = \frac{(2 \cdot 1.5)^0 e^{-2(1.5)}}{0!} = \frac{(3)^0 e^{-3}}{0!} = e^{-3} = .05$

    $P(X \ge 1) = 1 - .05$ = 95% chance

    1b. How many phone calls do you expect to get during the movie?

    E(X) = $\lambda t$ = 2(1.5) = 3
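Both answers follow from the expected count for the movie. The sketch below uses the complement rule for "at least one call". Variable names are illustrative.

```python
import math

rate_per_hour = 2.0   # calls per hour
hours = 1.5           # length of the movie

expected_calls = rate_per_hour * hours    # answer to 1b
p_no_calls = math.exp(-expected_calls)    # P(X = 0); the k = 0 term reduces to e^(-mean)
p_at_least_one = 1.0 - p_no_calls         # answer to 1a

print(expected_calls, f"{p_no_calls:.3f}", f"{p_at_least_one:.3f}")
```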