
Cheat Sheet

•   Some Theorems Involving Sets: (U denotes the universal set, ∅ the empty set, and A′ the complement of A.)
(1) If A ⊂ B and B ⊂ C, then A ⊂ C.
(2) A ∪ B = B ∪ A.
(3) A ∪ (B ∪ C) = (A ∪ B) ∪ C = A ∪ B ∪ C.
(4) A ∩ B = B ∩ A.
(5) A ∩ (B ∩ C) = (A ∩ B) ∩ C = A ∩ B ∩ C.
(6) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
(7) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
(8) A − B = A ∩ B′.
(9) If A ⊂ B, then B′ ⊂ A′ (equivalently, A′ ⊃ B′).
(10) A ∪ ∅ = A, A ∩ ∅ = ∅.
(11) A ∪ U = U, A ∩ U = A.
(12a) (A ∪ B)′ = A′ ∩ B′.
(12b) (A ∩ B)′ = A′ ∪ B′.
(13) A = (A ∩ B) ∪ (A ∩ B′).

•   Fundamental principle of counting: If an operation consists of k steps, of which the first can be done in n1 ways, for each of these the second step can be done in n2 ways, for each of the first two the third step can be done in n3 ways, and so forth, then the whole operation can be done in n1 · n2 · · · nk ways.

•   Permutations and Combinations:
The number of permutations of n distinct objects taken r at a time is nPr = n!/(n − r)! for r = 0, 1, 2, …, n.
The number of permutations of n objects of which n1 are of one kind, n2 are of a second kind, …, nk are of a kth kind, where n = n1 + n2 + · · · + nk, is nPn1,n2,…,nk = n!/(n1! · n2! · · · nk!).
The number of combinations of n distinct objects taken r at a time is C(n, r) = n!/(r!(n − r)!) for r = 0, 1, 2, …, n.
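A quick numerical check of these counting formulas (a minimal Python sketch; the numbers are purely illustrative):

```python
from math import factorial, perm, comb

n, r = 7, 3
# Permutations of n distinct objects taken r at a time: n!/(n-r)!
print(perm(n, r), factorial(n) // factorial(n - r))                   # 210 210

# Combinations: n!/(r!(n-r)!)
print(comb(n, r), factorial(n) // (factorial(r) * factorial(n - r)))  # 35 35

# Permutations of n objects with repeated kinds, e.g. the letters of "STATISTICS":
# n = 10, with 3 S's, 3 T's, 1 A, 2 I's, 1 C, giving n!/(n1! * n2! * ... * nk!)
counts = [3, 3, 1, 2, 1]
total = factorial(sum(counts))
for c in counts:
    total //= factorial(c)
print(total)  # 50400
```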

•   Binomial coefficient: (x + y)^n = Σ_{r=0}^{n} C(n, r) x^(n−r) y^r, where C(n, r) is the binomial coefficient defined above.

•   Some Important Theorems on Probability: Let Ai (i = 1, 2, 3, …, n) be events of the sample space S. Then
(1) If A1 ⊂ A2, then P(A1) ≤ P(A2).
(2) For every event A, 0 ≤ P(A) ≤ 1.
(3) If A′ is the complement of A, then P(A′) = 1 − P(A).
(4) P(∅) = 0.
(5) If A and B are any two events, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B). More generally, if A1, A2, A3 are any three events, then P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1 ∩ A2) − P(A2 ∩ A3) − P(A3 ∩ A1) + P(A1 ∩ A2 ∩ A3).

•   Conditional probability of B given A: P(B|A) = P(A ∩ B)/P(A), or equivalently P(A ∩ B) = P(A)P(B|A). For any three events A1, A2, A3, we have P(A1 ∩ A2 ∩ A3) = P(A1)P(A2|A1)P(A3|A1 ∩ A2).

•   Independent Events: Events A and B are independent if and only if P(A ∩ B) = P(A)P(B), or in other words P(B|A) = P(B). If A and B are independent, then A′ and B′ are also independent.

•   Bayes’ Theorem: If B1, B2, …, Bk constitute a partition of a sample space S and P(Bi) ≠ 0 for i = 1, 2, …, k, then for any event A in S such that P(A) ≠ 0,
P(Br|A) = P(Br) · P(A|Br) / Σ_{i=1}^{k} P(Bi) · P(A|Bi)   for r = 1, 2, …, k.
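A small worked example of Bayes’ theorem in Python (the priors and conditional probabilities below are made up for illustration):

```python
# Partition B1, B2, B3 of the sample space, with priors P(Bi)
prior = [0.5, 0.3, 0.2]
# Conditional probabilities P(A|Bi) for some event A
likelihood = [0.02, 0.05, 0.10]

# Rule of total probability: P(A) = sum_i P(Bi) * P(A|Bi)
p_a = sum(p * l for p, l in zip(prior, likelihood))

# Bayes' theorem: P(Br|A) = P(Br) * P(A|Br) / P(A)
posterior = [p * l / p_a for p, l in zip(prior, likelihood)]
print(p_a)        # ~0.045
print(posterior)  # ~[0.222, 0.333, 0.444]; the posteriors sum to 1
```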

•   Probability distributions:
I. Discrete case.
Discrete probability distributions (or probability functions): If X is a discrete random variable, the function given by f(x) = P(X = x) for each x within the range of X is called the probability distribution of X. f(x) is a probability distribution if and only if (1) f(x) ≥ 0, and (2) Σ_x f(x) = 1.
Distribution functions for discrete random variables: If X is a discrete random variable, the function given by F(x) = P(X ≤ x) = Σ_{t≤x} f(t) for −∞ < x < ∞, where f(t) is the value of the probability distribution of X at t, is called the distribution function of X. If F(x) is a distribution function, then (1) F(−∞) = 0 and F(∞) = 1, and (2) if a ≤ b, then F(a) ≤ F(b) for any real numbers a and b.
If the range of a random variable X consists of the values x1 < x2 < x3 < · · · < xn, then f(x1) = F(x1) and f(xi) = F(xi) − F(xi−1) for i = 2, 3, …, n.
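A concrete illustration of these definitions (a minimal sketch using an arbitrary three-point distribution):

```python
# An arbitrary discrete distribution on x = 1, 2, 3
f = {1: 0.2, 2: 0.5, 3: 0.3}

# The two defining properties of a probability distribution
assert all(p >= 0 for p in f.values())
assert abs(sum(f.values()) - 1.0) < 1e-12

# Distribution function F(x) = P(X <= x) = sum of f(t) over t <= x
def F(x):
    return sum(p for t, p in f.items() if t <= x)

print(F(1), F(2), F(3))   # ~0.2 ~0.7 ~1.0
# Recovering the pmf from the CDF: f(x_i) = F(x_i) - F(x_{i-1})
print(F(2) - F(1), f[2])  # both ~0.5
```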

II. Continuous case.
Continuous probability density functions: A function with values f(x), defined over the set of all real numbers, is called a probability density function of the continuous random variable X if and only if P(a < X < b) = ∫_a^b f(x) dx, where f(x) has the properties (1) f(x) ≥ 0, and (2) ∫_{−∞}^{∞} f(x) dx = 1.
Distribution functions for continuous random variables: If X is a continuous random variable and the value of its probability density at t is f(t), then the function given by F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt is called the distribution function of X. If F(x) is a distribution function, then (1) F(−∞) = 0 and F(∞) = 1, and (2) if a ≤ b, then F(a) ≤ F(b) for any real numbers a and b.

If f(x) and F(x) are the values of the probability density and the distribution function of the continuous random variable X at x, then P(a < X < b) = F(b) − F(a) for any real a and b with a ≤ b, and f(x) = dF(x)/dx where the derivative exists.
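For instance, with the illustrative density f(x) = e^(−x) for x ≥ 0 (an assumption made purely for this sketch), F(x) = 1 − e^(−x), and the two ways of computing P(a < X < b) agree:

```python
from math import exp

# Illustrative density: f(x) = e^{-x} for x >= 0, 0 otherwise
def f(x):
    return exp(-x) if x >= 0 else 0.0

# Its distribution function: F(x) = 1 - e^{-x} for x >= 0, 0 otherwise
def F(x):
    return 1.0 - exp(-x) if x >= 0 else 0.0

# P(a < X < b) = F(b) - F(a), compared with a crude midpoint-rule integral of f
a, b = 0.5, 2.0
n = 100_000
dx = (b - a) / n
riemann = sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx
print(F(b) - F(a), riemann)   # both approximately 0.4712
```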

•   Joint distributions:
Joint probability distributions: If X and Y are discrete random variables, the function given by f(x, y) = P(X = x, Y = y) for each pair of values (x, y) within the range of X and Y is called the joint probability distribution of X and Y. f(x, y) is a joint probability distribution if and only if (1) f(x, y) ≥ 0 and (2) Σ_x Σ_y f(x, y) = 1.
Joint distribution functions: If X and Y are discrete random variables, the function given by F(x, y) = P(X ≤ x, Y ≤ y) = Σ_{s≤x} Σ_{t≤y} f(s, t) for −∞ < x < ∞ and −∞ < y < ∞, where f(s, t) is the value of the joint probability distribution of X and Y at (s, t), is called the joint distribution function of X and Y. If F(x, y) is the value of the joint distribution function of two discrete random variables X and Y at (x, y), then (1) F(−∞, −∞) = 0, (2) F(∞, ∞) = 1, and (3) if a < b and c < d, then F(a, c) ≤ F(b, d).
Marginal probability functions: If X and Y are discrete random variables and f(x, y) is the value of their joint probability distribution at (x, y), the function given by g(x) = Σ_y f(x, y) for each x within the range of X is called the marginal distribution of X. Correspondingly, the function given by h(y) = Σ_x f(x, y) for each y within the range of Y is called the marginal distribution of Y.
Conditional distribution: If f(x, y) is the value of the joint probability distribution of the discrete random variables X and Y at (x, y), and h(y) is the value of the marginal distribution of Y at y, the function given by f(x|y) = f(x, y)/h(y) (h(y) ≠ 0) for each x within the range of X is called the conditional distribution of X given Y = y. Correspondingly, if g(x) is the value of the marginal distribution of X at x, the function w(y|x) = f(x, y)/g(x) (g(x) ≠ 0) for each y within the range of Y is called the conditional distribution of Y given X = x.
Independent random variables: Suppose that X and Y are discrete random variables. If the events X = x and Y = y are independent for all x and y, then we say that X and Y are independent random variables. In that case, P(X = x, Y = y) = P(X = x) · P(Y = y), or equivalently f(x, y) = f(x|y) · h(y) = g(x) · h(y). Conversely, if for all x and y the joint probability function f(x, y) can be expressed as the product of a function of x alone and a function of y alone (which are then the marginal probability functions of X and Y), then X and Y are independent. If f(x, y) cannot be so expressed, then X and Y are dependent.
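A small numeric illustration of marginals, a conditional distribution, and the independence check (the joint table below is made up, and it deliberately does not factor):

```python
# Joint probability distribution f(x, y) of two discrete random variables
# (an arbitrary, illustrative table; keys are (x, y) pairs)
f = {
    (0, 0): 0.10, (0, 1): 0.30,
    (1, 0): 0.40, (1, 1): 0.20,
}
xs = sorted({x for x, _ in f})
ys = sorted({y for _, y in f})

assert abs(sum(f.values()) - 1.0) < 1e-12   # property (2) of a joint distribution

# Marginal distributions g(x) = sum_y f(x, y) and h(y) = sum_x f(x, y)
g = {x: sum(f[(x, y)] for y in ys) for x in xs}
h = {y: sum(f[(x, y)] for x in xs) for y in ys}
print(g)   # ~{0: 0.4, 1: 0.6}
print(h)   # ~{0: 0.5, 1: 0.5}

# Conditional distribution of X given Y = 0: f(x|0) = f(x, 0)/h(0)
print({x: f[(x, 0)] / h[0] for x in xs})    # ~{0: 0.2, 1: 0.8}

# Independence would require f(x, y) = g(x) * h(y) for every (x, y)
independent = all(abs(f[(x, y)] - g[x] * h[y]) < 1e-12 for x in xs for y in ys)
print(independent)   # False -- this particular X and Y are dependent
```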

•   Mathematical expectation:
I. Discrete case. If X is a discrete random variable and f(x) is the value of its probability distribution at x, the expected value of X is E(X) = Σ_x x f(x). The expected value of g(X) is given by E[g(X)] = Σ_x g(x) f(x).
II. Continuous case. If X is a continuous random variable and f(x) is the value of its probability density at x, the expected value of X is E(X) = ∫_{−∞}^{∞} x f(x) dx. The expected value of g(X) is given by E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.
III. Joint case. If X and Y are discrete random variables and f(x, y) is the value of their joint probability distribution at (x, y), the expected value of g(X, Y) is E[g(X, Y)] = Σ_x Σ_y g(x, y) · f(x, y).
Some theorems on expectation.
(1) If a and b are any constants, then E(aX + b) = aE(X) + b.
(2) If X and Y are any random variables, then E(X + Y) = E(X) + E(Y).
(3) If X and Y are independent random variables, then E(XY) = E(X)E(Y).
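A quick check of the discrete definitions and of theorem (1), using the same kind of made-up pmf as in the earlier sketches:

```python
# Arbitrary discrete distribution
f = {0: 0.2, 1: 0.5, 2: 0.3}

def E(g, dist):
    """Expected value of g(X): E[g(X)] = sum_x g(x) f(x)."""
    return sum(g(x) * p for x, p in dist.items())

mean = E(lambda x: x, f)
print(mean)                        # E(X)   = 1.1
print(E(lambda x: x * x, f))       # E(X^2) = 1.7
# Theorem (1): E(aX + b) = a*E(X) + b
a, b = 3, 2
print(E(lambda x: a * x + b, f), a * mean + b)   # both ~5.3
```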

•   Moments:
rth moment of X about the origin. The rth moment about the origin of a random variable X, denoted by µ′_r, is the expected value of X^r; symbolically, µ′_r = E(X^r) = Σ_x x^r f(x) for r = 0, 1, 2, … when X is discrete, and µ′_r = E(X^r) = ∫_{−∞}^{∞} x^r f(x) dx when X is continuous. µ′_1 is called the mean of the distribution of X, and it is denoted by µ.
rth moment of X about the mean. The rth moment about the mean of a random variable X, denoted by µ_r, is the expected value of (X − µ)^r; symbolically, µ_r = E[(X − µ)^r] = Σ_x (x − µ)^r f(x) for r = 0, 1, 2, … when X is discrete, and µ_r = E[(X − µ)^r] = ∫_{−∞}^{∞} (x − µ)^r f(x) dx when X is continuous. µ_2 is called the variance of the distribution of X, and it is denoted by σ², var(X), or V(X); σ, the positive square root of the variance, is called the standard deviation.

Some theorems on variance.
(1) σ² = µ′_2 − µ² = E(X²) − [E(X)]².
(2) If a and b are any constants, then var(aX + b) = a² var(X).
(3) If X and Y are independent random variables, then var(X ± Y) = var(X) + var(Y).
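Continuing the same illustrative pmf, a check of the shortcut formula σ² = E(X²) − [E(X)]² and of theorem (2):

```python
f = {0: 0.2, 1: 0.5, 2: 0.3}

def E(g):
    return sum(g(x) * p for x, p in f.items())

mu = E(lambda x: x)
var = E(lambda x: (x - mu) ** 2)          # second moment about the mean
print(var, E(lambda x: x * x) - mu ** 2)  # both ~0.49 -- the shortcut formula agrees

# Theorem (2): var(aX + b) = a^2 * var(X); the shift b does not matter
a, b = 3, 2
mu_ab = E(lambda x: a * x + b)
var_ab = E(lambda x: (a * x + b - mu_ab) ** 2)
print(var_ab, a ** 2 * var)               # both ~4.41
```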

•   Chebyshev’s inequality. Suppose that X is a random variable (discrete or continuous) having mean µ and variance σ², both finite. Then for any positive number ε, P(|X − µ| ≥ ε) ≤ σ²/ε², or, with ε = kσ, P(|X − µ| ≥ kσ) ≤ 1/k².
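A numeric check of the bound against the same illustrative pmf (the exact tail probabilities always stay at or below 1/k²):

```python
from math import sqrt

f = {0: 0.2, 1: 0.5, 2: 0.3}
mu = sum(x * p for x, p in f.items())
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in f.items()))

for k in (1.0, 1.5, 2.0):
    tail = sum(p for x, p in f.items() if abs(x - mu) >= k * sigma)
    print(k, tail, 1 / k ** 2)   # the exact tail never exceeds the Chebyshev bound 1/k^2
```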

•   Moment generating function. The moment-generating function of a random variable X, where it exists, is given by M_X(t) = E[e^(tX)] = Σ_x e^(tx) f(x) when X is discrete, and M_X(t) = E[e^(tX)] = ∫_{−∞}^{∞} e^(tx) f(x) dx when X is continuous. Differentiating recovers the moments about the origin: d^r M_X(t)/dt^r evaluated at t = 0 equals µ′_r.
Some theorems on moment generating functions. If a and b are constants, then
(1) M_{X+a}(t) = E[e^((X+a)t)] = e^(at) M_X(t);
(2) M_{bX}(t) = E[e^(bXt)] = M_X(bt);
(3) M_{(X+a)/b}(t) = E[e^(((X+a)/b)t)] = e^(at/b) M_X(t/b).
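A symbolic check of the "differentiate at t = 0" rule with sympy (a third-party symbolic library), using the Bernoulli(θ) variable defined later in this sheet, whose MGF follows directly from the definition as (1 − θ) + θe^t:

```python
import sympy as sp

t, theta = sp.symbols('t theta')

# MGF of a Bernoulli(theta) variable: E[e^{tX}] = (1 - theta)*e^{t*0} + theta*e^{t*1}
M = (1 - theta) + theta * sp.exp(t)

# The r-th derivative at t = 0 gives the r-th moment about the origin
m1 = sp.diff(M, t, 1).subs(t, 0)   # E(X)   = theta
m2 = sp.diff(M, t, 2).subs(t, 0)   # E(X^2) = theta
print(m1, m2)
print(sp.simplify(m2 - m1 ** 2))   # the variance, theta*(1 - theta), in expanded form
```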

•   Product Moments.
Product moment about the origin. The rth and sth product moment about the origin of the random variables X and Y, denoted by µ′_{r,s}, is the expected value of X^r Y^s; symbolically, µ′_{r,s} = E(X^r Y^s) = Σ_x Σ_y x^r y^s f(x, y) for r = 0, 1, 2, … and s = 0, 1, 2, … when X and Y are discrete. Note that µ′_{1,0} = E(X), which we denote here by µ_X, and that µ′_{0,1} = E(Y), which we denote here by µ_Y.
Product moment about the mean. The rth and sth product moment about the means of the random variables X and Y, denoted by µ_{r,s}, is the expected value of (X − µ_X)^r (Y − µ_Y)^s; symbolically, µ_{r,s} = E[(X − µ_X)^r (Y − µ_Y)^s] = Σ_x Σ_y (x − µ_X)^r (y − µ_Y)^s f(x, y) for r = 0, 1, 2, … and s = 0, 1, 2, … when X and Y are discrete. µ_{1,1} is called the covariance of X and Y, and it is denoted by σ_{XY}, cov(X, Y), or C(X, Y).
Some theorems on covariance.
(1) σ_{XY} = cov(X, Y) = E(XY) − E(X)E(Y) = µ′_{1,1} − µ_X µ_Y.
(2) If X and Y are independent random variables, then σ_{XY} = cov(X, Y) = 0.
(3) var(X ± Y) = var(X) + var(Y) ± 2 cov(X, Y).
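Using the same illustrative joint table as above, a check of covariance theorem (1):

```python
f = {(0, 0): 0.10, (0, 1): 0.30, (1, 0): 0.40, (1, 1): 0.20}

E_X  = sum(x * p for (x, y), p in f.items())
E_Y  = sum(y * p for (x, y), p in f.items())
E_XY = sum(x * y * p for (x, y), p in f.items())

cov_shortcut   = E_XY - E_X * E_Y                                    # E(XY) - E(X)E(Y)
cov_definition = sum((x - E_X) * (y - E_Y) * p for (x, y), p in f.items())
print(cov_shortcut, cov_definition)   # both ~-0.1; nonzero, so X and Y are dependent
```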

•   Conditional expectation. If X is a discrete random variable and f(x|y) is the value of the conditional probability distribution of X given Y = y at x, then
Conditional expectation. The conditional expectation of u(X) given Y = y is E[u(X)|y] = Σ_x u(x) f(x|y).
Conditional mean. The conditional mean of the random variable X given Y = y is µ_{X|y} = E(X|y) = Σ_x x f(x|y).
Conditional variance. The conditional variance of the random variable X given Y = y is σ²_{X|y} = E[(X − µ_{X|y})²|y] = E(X²|y) − µ²_{X|y}.
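Again with the illustrative joint table, the conditional mean and variance of X given Y = 0:

```python
f = {(0, 0): 0.10, (0, 1): 0.30, (1, 0): 0.40, (1, 1): 0.20}

h0 = sum(p for (x, y), p in f.items() if y == 0)        # marginal h(0) = 0.5
cond = {x: f[(x, 0)] / h0 for x in (0, 1)}              # conditional distribution f(x|0)

mean_given_0 = sum(x * p for x, p in cond.items())      # E(X|0)   ~0.8
ex2_given_0  = sum(x * x * p for x, p in cond.items())  # E(X^2|0) ~0.8
var_given_0  = ex2_given_0 - mean_given_0 ** 2          # sigma^2_{X|0} ~0.16
print(mean_given_0, var_given_0)
```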

•   Special Probability Distributions.
The Discrete Uniform Distribution. A random variable X has a discrete uniform distribution if and only if its probability distribution is given by f(x) = 1/k for x = x1, x2, …, xk, where xi ≠ xj when i ≠ j.
The Bernoulli Distribution. A random variable X has a Bernoulli distribution if and only if its probability distribution is given by f(x; θ) = θ^x (1 − θ)^(1−x) for x = 0, 1. µ = θ, σ² = θ(1 − θ), and µ′_r = θ for r = 1, 2, 3, ….

The Binomial Distribution. A random variable X has a binomial distribution if and only if its probability distribution is given by b(x; n, θ) = C(n, x) θ^x (1 − θ)^(n−x) for x = 0, 1, 2, …, n. M_X(t) = [1 + θ(e^t − 1)]^n, µ = nθ, and σ² = nθ(1 − θ).
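A direct computation from this formula (illustrative parameter values), confirming that the probabilities sum to 1 and that the mean and variance match nθ and nθ(1 − θ):

```python
from math import comb

def binom_pmf(x, n, theta):
    """b(x; n, theta) = C(n, x) * theta^x * (1 - theta)^(n - x)."""
    return comb(n, x) * theta ** x * (1 - theta) ** (n - x)

n, theta = 10, 0.3
pmf = [binom_pmf(x, n, theta) for x in range(n + 1)]
print(sum(pmf))                                             # ~1.0
print(sum(x * p for x, p in enumerate(pmf)))                # mean ~ n*theta = 3.0
mean = n * theta
print(sum((x - mean) ** 2 * p for x, p in enumerate(pmf)))  # variance ~ n*theta*(1-theta) = 2.1
```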

The Negative Binomial and Geometric Distribution. A random variable X has a negative binomial distribution if and only if its probability distribution is given by b*(x; k, θ) = C(x − 1, k − 1) θ^k (1 − θ)^(x−k) for x = k, k + 1, k + 2, …. Also b*(x; k, θ) = (k/x) · b(k; x, θ), µ = k/θ, and σ² = (k/θ)(1/θ − 1). For k = 1 we obtain the geometric distribution, denoted by g(x; θ).

The Hypergeometric Distribution. A random variable X has a hypergeometric distribution if and only if its probability distribution is given by h(x; n, N, M) = C(M, x) · C(N − M, n − x)/C(N, n) for x = 0, 1, 2, …, n, with x ≤ M and n − x ≤ N − M. µ = nM/N, σ² = nM(N − M)(N − n)/{N²(N − 1)}.

The Binomial Approximation to the Hypergeometric Distribution. As a rule of thumb, if n does not exceed 5 percent of N, we may use binomial probabilities (with θ = M/N) in place of hypergeometric probabilities.

The Poisson Distribution. A random variable X has a Poisson distribution if and only if its probability distribution is given by p(x; λ) = λ^x e^(−λ)/x! for x = 0, 1, 2, …, where λ is the mean number of successes. µ = σ² = λ, M_X(t) = e^(λ(e^t − 1)).

The Poisson Approximation to the Binomial Distribution. The Poisson distribution will provide a good approximation to binomial probabilities when n ≥ 20 and θ ≤ 0.05. When n ≥ 100 and nθ < 10, the approximation will generally be excellent.
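A numeric look at this approximation, taking n = 100 and θ = 0.02 (so λ = nθ = 2; the parameter values are illustrative):

```python
from math import comb, exp, factorial

n, theta = 100, 0.02
lam = n * theta   # Poisson parameter lambda = n*theta

for x in range(6):
    binom   = comb(n, x) * theta ** x * (1 - theta) ** (n - x)
    poisson = lam ** x * exp(-lam) / factorial(x)
    print(x, round(binom, 4), round(poisson, 4))
# e.g. at x = 2 the binomial gives about 0.2734 and the Poisson about 0.2707
```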

•   Measures of Central Tendency

Mean. The mean is equal to the sum of all the values in the data set divided by the number of valuesin the data set.

Median.   The median is the middle score for a set of data that has been arranged in order of magnitude.

Mode.  The mode is the most frequently occurring value in a set of values.
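These three measures are available directly in Python's standard library; a minimal illustration with an invented data set:

```python
import statistics

data = [2, 3, 3, 5, 7, 10]       # invented data set
print(statistics.mean(data))     # (2+3+3+5+7+10)/6 = 5
print(statistics.median(data))   # middle of the ordered values: (3+5)/2 = 4.0
print(statistics.mode(data))     # most frequently occurring value: 3
```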

•   Sampling Distributions.
Sample Mean and Sample Variance. If X1, X2, …, Xn constitute a random sample, then the sample mean is given by X̄ = (Σ_{i=1}^{n} Xi)/n and the sample variance is given by S² = Σ_{i=1}^{n} (Xi − X̄)²/(n − 1).

Some theorems on sampling distributions of means.

(1) If X1, X2, …, Xn constitute a random sample from an infinite population with mean µ and variance σ², then E(X̄) = µ and var(X̄) = σ²/n.
(2) For any positive constant c, the probability that X̄ will take on a value between µ − c and µ + c is at least 1 − σ²/(nc²). When n → ∞, this probability approaches 1.
(3) If X1, X2, …, Xn constitute a random sample from an infinite population with mean µ, variance σ², and moment-generating function M_X(t), then the limiting distribution of Z = (X̄ − µ)/(σ/√n) as n → ∞ is the standard normal distribution.
(4) If X̄ is the mean of a random sample of size n from a normal population with mean µ and variance σ², its sampling distribution is a normal distribution with mean µ and variance σ²/n.
(5) If X̄ is the mean of a random sample of size n taken without replacement from a finite population of size N with mean µ and variance σ², then E(X̄) = µ and var(X̄) = (σ²/n) · (N − n)/(N − 1).
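A quick Monte Carlo sanity check of theorem (1), drawing samples from a Uniform(0, 1) population (the sample size, number of repetitions, and seed are arbitrary choices for this sketch):

```python
import random
import statistics

random.seed(0)
mu, sigma2 = 0.5, 1 / 12     # mean and variance of a Uniform(0, 1) population
n, reps = 25, 20_000         # sample size and number of simulated samples

means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(reps)]

print(statistics.fmean(means))      # close to mu = 0.5
print(statistics.variance(means))   # close to sigma2/n = 1/300 ~ 0.0033
```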

•   The Estimation of Means. If θ̂1 and θ̂2 are values of the random variables Θ̂1 and Θ̂2 such that P(Θ̂1 < θ < Θ̂2) = 1 − α for some specified probability 1 − α, we refer to the interval θ̂1 < θ < θ̂2 as a (1 − α)100% confidence interval for θ. The probability 1 − α is called the degree of confidence, and the endpoints of the interval are called the lower and upper confidence limits.

Some theorems on estimation of means.

(1) If X̄, the mean of a random sample of size n from a normal population with the known variance σ², is to be used as an estimator of the mean of the population, the probability is 1 − α that the error will be less than z_{α/2} · σ/√n.
(2) If x̄ is the value of the mean of a random sample of size n from a normal population with the known variance σ², then x̄ − z_{α/2} · σ/√n < µ < x̄ + z_{α/2} · σ/√n is a (1 − α)100% confidence interval for the mean of the population.
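A sketch of theorem (2) in Python, assuming a known σ and a 95% degree of confidence (the observations and σ below are invented):

```python
import statistics
from math import sqrt

sample = [4.9, 5.1, 5.3, 4.7, 5.0, 5.2, 4.8, 5.4]   # invented observations
sigma = 0.25                                        # assumed known population std. deviation
alpha = 0.05                                        # 95% degree of confidence

n = len(sample)
xbar = statistics.fmean(sample)
z = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96
half_width = z * sigma / sqrt(n)

print(f"{xbar - half_width:.3f} < mu < {xbar + half_width:.3f}")   # the confidence interval
```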