chapter 1: statistics series in two variables

Post on 27-Dec-2021


Chapter 1: Statistics series in two variables

1- RELATION BETWEEN TWO VARIABLES OF A POPULATION

We assume that, following a study, we are interested in two quantitative characters (two numerical variables) on a given population of n individuals.

The modalities (values) of the variable x are denoted by $x_1, x_2, x_3, \dots, x_n$. In general, they are denoted by $x_i$, where $1 \le i \le n$.

The modalities (values) of the variable y are denoted by $y_1, y_2, y_3, \dots, y_n$. In general, they are denoted by $y_i$, where $1 \le i \le n$.

These couples $(x_i ; y_i)$ form a statistical series in two variables.

The results can be summarized in a table of the form:

| Value $x_i$ | $x_1$ | $x_2$ | ... | $x_n$ |
|---|---|---|---|---|
| Value $y_i$ | $y_1$ | $y_2$ | ... | $y_n$ |

The main question is: is there a link (a relation) between the variables x and y, and how strong is this relation?

2- SCATTER PLOT POINTS – MEAN POINT

Scatter plot points: In a well-chosen orthogonal system of reference, the set of points $A_i$ of coordinates $(x_i ; y_i)$, with $1 \le i \le n$, is called the scatter plot associated with the statistical series in two variables.

Mean point: If $\bar{x}$ is the mean (average) of the values $x_i$ and $\bar{y}$ is the mean (average) of the values $y_i$, we have

$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i \quad\text{and}\quad \bar{y} = \frac{y_1 + y_2 + \dots + y_n}{n} = \frac{1}{n}\sum_{i=1}^{n} y_i.$$

The point G of coordinates $(\bar{x} ; \bar{y})$ is called the mean point or center of gravity of the scatter plot associated with the statistical series in two variables.

3- COVARIANCE

a) The covariance of a statistical series in two variables x and y is the real number

$$\operatorname{cov}(x ; y) = \sigma_{xy} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) \quad\text{or}\quad \operatorname{cov}(x ; y) = \sigma_{xy} = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \bar{x}\,\bar{y}.$$

b) Recall

The variance of the variable x is given by $V(x) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2$.

The standard deviation of the variable x is given by: $\sigma_x = \sqrt{V(x)}$.

The variance of the variable y is given by $V(y) = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2 = \frac{1}{n}\sum_{i=1}^{n} y_i^2 - \bar{y}^2$.

The standard deviation of the variable y is given by: $\sigma_y = \sqrt{V(y)}$.
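The mean point, the covariance and the variance can be checked numerically. The following is a minimal Python sketch under stated assumptions: the data lists `xs` and `ys` are made up for illustration, and the function names are hypothetical helpers, not part of the course.

```python
from math import sqrt

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def covariance(xs, ys):
    """cov(x; y) = (1/n) * sum of (xi - x_bar) * (yi - y_bar)."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

def covariance_shortcut(xs, ys):
    """cov(x; y) = (1/n) * sum of xi * yi, minus x_bar * y_bar."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs) - mean(xs) * mean(ys)

def variance(values):
    """V(x) is the covariance of x with itself."""
    return covariance(values, values)

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

G = (mean(xs), mean(ys))      # mean point G
cov_xy = covariance(xs, ys)   # first formula
sigma_x = sqrt(variance(xs))  # standard deviation of x
sigma_y = sqrt(variance(ys))  # standard deviation of y
print(G, cov_xy, variance(xs), variance(ys))
```

Both covariance formulas should agree up to rounding, which is a quick way to catch an arithmetic slip when working by hand.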


4- LINEAR ADJUSTMENT USING THE METHOD OF LEAST SQUARES

a) Consider a statistical series in two variables x and y whose scatter plot shows a "certain" alignment.

We propose to adjust (fit) this scatter plot by a straight line. The least squares method is the one mainly used, because it has many theoretical advantages and the calculations it involves are not too difficult.

The idea is to determine the coefficients a and b of a straight line (D) of equation y = ax + b so that it passes as close as possible to the points of the scatter plot.

For each abscissa $x_i$, one calculates the distance $M_iP_i$ between the point $M_i$ of the scatter plot and the point $P_i$ of the line with the same abscissa, that is to say: $M_iP_i = |y_i - (a x_i + b)|$.

In the least squares method, we seek the values of a and b for which the sum of the squares of these distances, $S = M_1P_1^2 + M_2P_2^2 + \dots + M_nP_n^2$, is minimal. It turns out that there exists only one straight line satisfying this condition.

b) Regression line of y at x:

We admit the following theorem:

The regression line $D_{y/x}$ of y at x, associated to the scatter points $A_i(x_i ; y_i)$, with $1 \le i \le n$:

- Passes through the mean point $G(\bar{x} ; \bar{y})$ of the scatter plot;

- Has an equation y = ax + b where $a = \frac{\operatorname{cov}(x ; y)}{V(x)}$ and $b = \bar{y} - a\bar{x}$.

c) Regression line of x at y: We admit the following theorem:

The regression line $D_{x/y}$ of x at y, associated to the scatter points $A_i(x_i ; y_i)$, with $1 \le i \le n$:

- Passes through the mean point $G(\bar{x} ; \bar{y})$ of the scatter plot;

- Has an equation x = a′y + b′ where $a' = \frac{\operatorname{cov}(x ; y)}{V(y)}$ and $b' = \bar{x} - a'\bar{y}$.

d) Remark:

The regression line $D_{y/x}$ helps to predict or estimate a certain value of the variable y corresponding to a value of the variable x.

The error of prediction in percentage is calculated by: $\frac{|\text{real value} - \text{estimated value}|}{\text{real value}} \times 100$.
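The theorem and the remark above can be sketched in Python as follows. This is only an illustration under stated assumptions: the data set, the "real value" 6 used for the error, and the helper names are all hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

def regression_line(xs, ys):
    """Return (a, b) of the line y = a*x + b, with a = cov(x; y) / V(x)."""
    a = covariance(xs, ys) / covariance(xs, xs)
    b = mean(ys) - a * mean(xs)   # forces the line through the mean point G
    return a, b

def percent_error(real_value, estimated_value):
    """Prediction error as a percentage of the real value."""
    return abs(real_value - estimated_value) / real_value * 100

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = regression_line(xs, ys)           # line of y in terms of x
a_prime, b_prime = regression_line(ys, xs)  # line of x in terms of y (roles swapped)
estimate = a * 6 + b                     # estimated y for x = 6
error = percent_error(6.0, estimate)     # assuming the observed value was 6
print(a, b, estimate, error)
```

Swapping the two lists gives the coefficients a′ and b′ of the second regression line, since the formulas are symmetric in the roles of x and y.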


5- COEFFICIENT OF LINEAR CORRELATION

Consider a statistical series in two variables x and y. The coefficient of linear correlation between x and y is the real number:

$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{\operatorname{cov}(x ; y)}{\sigma_x \sigma_y} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}.$$

a) We admit that $-1 \le r \le 1$.

b) If r = 1, the correlation between x and y is perfect and positive. In this case the points of the scatter plot are collinear.

c) If r = −1, the correlation between x and y is perfect and negative. In this case the points of the scatter plot are collinear.

d) If $-1 < r \le -\frac{\sqrt{3}}{2}$, the correlation between x and y is strong and negative.

e) If $\frac{\sqrt{3}}{2} \le r < 1$, the correlation between x and y is strong and positive.

f) If r = 0, there is no linear dependence (no linear correlation) between x and y.
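As an illustration, the coefficient r can be computed directly from its definition. The sketch below assumes made-up data and hypothetical helper names; the threshold $\sqrt{3}/2$ used for a "strong" correlation is taken as the convention of this chapter.

```python
from math import isclose, sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """r = cov(x; y) / (sigma_x * sigma_y)."""
    sigma_x = sqrt(covariance(xs, xs))
    sigma_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
is_strong = abs(r) >= sqrt(3) / 2     # strong correlation test

# Points lying exactly on the line y = 2x + 1 are collinear.
r_line = correlation(xs, [2 * x + 1 for x in xs])
print(r, is_strong, r_line)
```

For collinear points with positive slope the computed value should be 1 up to rounding, which matches case b).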


Chapter 2 : Numerical sequences

1) Definition

A numerical sequence is a mapping f from a subset I of $\mathbb{N}$ into $\mathbb{R}$:

$$f : I \to \mathbb{R}, \quad n \mapsto f(n).$$

We call I the set of indices of the sequence. The image of n by f, which is f(n), is denoted by $u_n$ and is called the general term of the sequence.

A sequence of general term $u_n$ is denoted by $(u_n)$.

The real numbers $u_0, u_1, u_2, \dots$ are the successive terms of the sequence.

In some cases the sequence is defined starting from a certain rank $n_0$; the corresponding term of the sequence, $u_{n_0}$, is then its first term.

2) Determination

To determine a sequence is to give a method for calculating its terms.

A numerical sequence can be determined in two ways:

a) Explicit sequence: the sequence is defined by an explicit formula $u_n = f(n)$, where f is a numerical function; that is, the general term of the sequence is given.

b) Recurrent sequence: the sequence is defined by giving its first term and a relation of the form $u_{n+1} = g(u_n)$, where g is a numerical function. Such a sequence is called a recurrent sequence, or a sequence defined by recurrence.
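The two ways of determining a sequence can be sketched in Python. The particular sequence used here ($u_n = 3n + 2$, equivalently $u_0 = 2$ and $u_{n+1} = u_n + 3$) is a hypothetical example, as are the function names.

```python
def explicit_term(n):
    """Explicit form: u_n = f(n), here with f(n) = 3n + 2."""
    return 3 * n + 2

def recurrent_terms(first_term, g, count):
    """First `count` terms of the sequence u_0 = first_term, u_{n+1} = g(u_n)."""
    terms = [first_term]
    for _ in range(count - 1):
        terms.append(g(terms[-1]))   # each term is computed from the previous one
    return terms

explicit = [explicit_term(n) for n in range(6)]
recurrent = recurrent_terms(2, lambda u: u + 3, 6)
print(explicit)
print(recurrent)
```

The explicit form gives any term directly, while the recurrent form must compute all the preceding terms first.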

3) Sense of variation

Consider a sequence $(u_n)$ defined over a subset I of $\mathbb{N}$.

$(u_n)$ is an increasing sequence (a strictly increasing sequence) if, for every natural number n in I, we have $u_{n+1} \ge u_n$ ($u_{n+1} > u_n$).

$(u_n)$ is a decreasing sequence (a strictly decreasing sequence) if, for every natural number n in I, we have $u_{n+1} \le u_n$ ($u_{n+1} < u_n$).

An increasing (or decreasing) sequence is called a monotonic sequence.

Remarks

1- Practical methods to determine the sense of variation of a sequence:

a) We determine the sign of the difference $u_{n+1} - u_n$.

If $u_{n+1} - u_n \ge 0$, then $(u_n)$ is increasing.

If $u_{n+1} - u_n \le 0$, then $(u_n)$ is decreasing.

b) In the case where the terms of the sequence $(u_n)$ are strictly positive, we compare $\frac{u_{n+1}}{u_n}$ to 1.

If $\frac{u_{n+1}}{u_n} \ge 1$, then $(u_n)$ is increasing.

If $\frac{u_{n+1}}{u_n} \le 1$, then $(u_n)$ is decreasing.

c) If the general term of the sequence $(u_n)$ is given by $u_n = f(n)$, where f is a monotonic numerical function, then $(u_n)$ has the same sense of variation as f. This sense of variation is given by the sign of f′, the derivative of f.


4) ARITHMETIC SEQUENCE – GEOMETRIC SEQUENCE

| | Arithmetic sequence | Geometric sequence |
|---|---|---|
| Recurrent form | 1st term is given and $U_{n+1} = U_n + r$, where r is the common difference | 1st term is given and $U_{n+1} = qU_n$, where q is the common ratio |
| To prove that a sequence is: | arithmetic, we prove that $U_{n+1} - U_n$ = constant; r = common difference | geometric, we prove that $\frac{U_{n+1}}{U_n}$ = constant; q = common ratio |
| Explicit form, if the first term is $U_0$ | $U_n = U_0 + nr$ | $U_n = U_0\,q^n$ |
| Explicit form, if the first term is $U_1$ | $U_n = U_1 + (n - 1)r$ | $U_n = U_1\,q^{n-1}$ |
| Explicit form, if the first term is $U_p$ | $U_n = U_p + (n - p)r$ | $U_n = U_p\,q^{n-p}$ |
| Sum of terms $U_0 + U_1 + \dots + U_{n-1}$ (n terms) | $\frac{n(U_0 + U_{n-1})}{2}$ | $U_0\,\frac{1 - q^n}{1 - q}$ |
| Sum of terms $U_1 + U_2 + \dots + U_n$ (n terms) | $\frac{n}{2}(U_1 + U_n)$ | $U_1\,\frac{1 - q^n}{1 - q}$ |
| Sum of terms $U_0 + U_1 + \dots + U_n$ (n + 1 terms) | $\frac{n + 1}{2}(U_0 + U_n)$ | $U_0\,\frac{1 - q^{n+1}}{1 - q}$ |

The geometric sum formulas require $q \ne 1$.
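The sum formulas can be compared with a term-by-term addition. The following Python sketch assumes hypothetical sequences (an arithmetic one with $U_0 = 2$, r = 3 and a geometric one with $U_0 = 3$, q = 2); the function names are illustrative.

```python
def arithmetic_sum(u0, r, n):
    """U0 + U1 + ... + U(n-1), n terms, using n(U0 + U(n-1)) / 2."""
    last = u0 + (n - 1) * r           # explicit form of U(n-1)
    return n * (u0 + last) / 2

def geometric_sum(u0, q, n):
    """U0 + U1 + ... + U(n-1), n terms, using U0 (1 - q^n) / (1 - q); needs q != 1."""
    return u0 * (1 - q ** n) / (1 - q)

# Terms listed explicitly, to compare with the closed formulas.
arith_terms = [2 + 3 * k for k in range(10)]   # U0 = 2, r = 3, 10 terms
geo_terms = [3 * 2 ** k for k in range(5)]     # U0 = 3, q = 2, 5 terms

print(arithmetic_sum(2, 3, 10), sum(arith_terms))
print(geometric_sum(3, 2, 5), sum(geo_terms))
```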

5) LIMIT OF ARITHMETIC AND GEOMETRIC SEQUENCES

o To find the limit of a sequence is to study its behavior as n tends to +∞.

o An arithmetic sequence $(u_n)$ of common difference $r \ne 0$ and first term $u_0$ has an infinite limit.

In fact: $u_n = u_0 + nr$ and $\lim_{n \to +\infty} u_n = +\infty$ or $-\infty$ according to whether r > 0 or r < 0.

o If $(u_n)$ is a geometric sequence of common ratio q and first term $u_0$, then we have: $u_n = u_0 \cdot q^n$.

If q > 1, then $\lim_{n \to +\infty} q^n = +\infty$, so $\lim_{n \to +\infty} u_n = +\infty$ or $-\infty$ according to the sign of $u_0$.

If q < −1, then the limit of the sequence $(u_n)$ does not exist.

If |q| < 1, that is −1 < q < 1, then $\lim_{n \to +\infty} q^n = 0$, so $\lim_{n \to +\infty} u_n = 0$.

In this case, if $S_n = u_0\,\frac{1 - q^n}{1 - q}$ is the sum of the first n terms of a geometric sequence $(u_n)$, and if the sequence $(u_n)$ has an infinite number of terms, then $\lim_{n \to +\infty} S_n = S = \frac{u_0}{1 - q}$.
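The convergence of $S_n$ when |q| < 1 can be observed numerically. This is a small sketch assuming the hypothetical values $u_0 = 1$ and q = 0.5; it illustrates the limit but does not prove it.

```python
def partial_sum(u0, q, n):
    """S_n = u0 (1 - q^n) / (1 - q), sum of the first n terms."""
    return u0 * (1 - q ** n) / (1 - q)

u0, q = 1.0, 0.5
limit = u0 / (1 - q)   # S = u0 / (1 - q)

# Distance between S_n and S for increasing values of n.
gaps = [abs(limit - partial_sum(u0, q, n)) for n in (5, 10, 20, 40)]
print(limit, gaps)
```

The gaps shrink toward 0 as n grows, which is what the formula for S expresses.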


Chapter 3: Interest

I - Simple interest

The interest is said to be simple when it is directly proportional to the time of deposit. This is the case when the interest is paid to the client each year, or when the time of investment is less than a year.

The simple interest is computed by the formula $I = C \times r \times n$, where

C: the initial capital

r: the annual interest rate

n: the time in years

► Remark 1:

The time period t of a loan or an investment may be given in:

a) Years: In this case n = t and the formula of the simple interest is $I = C \times r \times t$.

b) Months: In this case $n = \frac{t}{12}$ and the formula of the simple interest is $I = C \times r \times \frac{t}{12}$.

c) Days during an ordinary year: In this case $n = \frac{t}{365}$ and the formula of the simple interest is $I = C \times r \times \frac{t}{365}$.

d) Days during a leap year: In this case $n = \frac{t}{366}$ and the formula of the simple interest is $I = C \times r \times \frac{t}{366}$.

e) Days during a commercial year: In this case $n = \frac{t}{360}$ and the formula of the simple interest is $I = C \times r \times \frac{t}{360}$.
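The cases of Remark 1 can be gathered in one small function. The sketch below is only an illustration: the function name, its parameters and the sample amounts are assumptions, not part of the course.

```python
# Number of days counted in a year, depending on the convention used.
DAYS_IN_YEAR = {"ordinary": 365, "leap": 366, "commercial": 360}

def simple_interest(capital, annual_rate, t, unit="years", year_type="ordinary"):
    """I = C * r * n, where n is the duration t converted to years."""
    if unit == "years":
        n = t
    elif unit == "months":
        n = t / 12
    elif unit == "days":
        n = t / DAYS_IN_YEAR[year_type]
    else:
        raise ValueError("unit must be 'years', 'months' or 'days'")
    return capital * annual_rate * n

# Hypothetical capital of 1000 at an annual rate of 6 %.
print(simple_interest(1000, 0.06, 2))                             # 2 years
print(simple_interest(1000, 0.06, 6, unit="months"))              # 6 months
print(simple_interest(1000, 0.06, 90, unit="days", year_type="commercial"))
```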

► Remark 2:

Time between two dates:

When the period of investment is given in days, we have to find the exact number of days between two calendar dates.

When we count the number of days, we can include either the first day or the last day, but we cannot include both days in our count.

The number of days in each month is given by the following table:

| Jan | Feb | March | April | May | June | July | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 31 | 28 or 29 | 31 | 30 | 31 | 30 | 31 | 31 | 30 | 31 | 30 | 31 |

► Acquired value (future value):

The acquired (future) value after t years is: $C_n = F = C + I = C(1 + rt)$.

► Actual value (present value):

The actual (present) value of an investment or a loan is: $C = \frac{C_n}{1 + rt}$.

II- Compound interest:

The interest is said to be compound when it is added to the capital at the end of every period, before the interest for the next period is computed. The interest is said to be capitalized: that is, it has been added to constitute a new capital.


► Future value (Acquired value):

The acquired (future) value after n periods is: $C_n = C(1 + i)^n$, where:

C: the initial capital

r: the annual interest rate

t: the time in years

i: the interest rate per period

n: the number of periods during t years, where $i = \frac{r}{k}$ and n = kt (k being the number of periods per year).

► Remark:

The interest may be compounded:

Annually, then k = 1

Semiannually, then k = 2

Quarterly, then k = 4

Monthly, then k = 12

Weekly, then k = 52

Daily, then k = 365 (or 366 for a leap year, or 360 for a commercial year)

Hourly, then k = 8760 (for an ordinary year)

Every minute, then k = 525600 (for an ordinary year).

► Calculation of compound interests:

The compound interest earned during n periods is: $I = C_n - C = C(1 + i)^n - C = C\left[(1 + i)^n - 1\right]$.

► Present value (Actual value):

The actual (present) value is: $C = \frac{C_n}{(1 + i)^n} = C_n(1 + i)^{-n}$.
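The compound interest formulas translate directly into code. The following Python sketch uses hypothetical names and amounts (a capital of 1000 at 12 % per year for 2 years), and takes k as the number of compounding periods per year.

```python
def future_value(capital, annual_rate, years, periods_per_year=1):
    """C_n = C (1 + i)^n with i = r / k and n = k * t."""
    i = annual_rate / periods_per_year
    n = periods_per_year * years
    return capital * (1 + i) ** n

def present_value(future, annual_rate, years, periods_per_year=1):
    """C = C_n (1 + i)^(-n)."""
    i = annual_rate / periods_per_year
    n = periods_per_year * years
    return future * (1 + i) ** (-n)

def compound_interest(capital, annual_rate, years, periods_per_year=1):
    """I = C_n - C."""
    return future_value(capital, annual_rate, years, periods_per_year) - capital

yearly = future_value(1000, 0.12, 2)         # k = 1
monthly = future_value(1000, 0.12, 2, 12)    # k = 12
print(yearly, monthly, compound_interest(1000, 0.12, 2))
```

With the same annual rate, compounding more often gives a slightly larger future value, and discounting a future value brings back the initial capital.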

III- Annuities

An annuity is a succession of equal payments made at equal periods to constitute a capital or to pay back a debt. The payments can be monthly, quarterly, semiannual, etc.

The interest used for the calculation of the annuity is compound interest. The annuity in our case is paid at the end of each period.

R: the periodic payment

n: the number of annuities

i: the interest rate per period.

► Future value (Acquired value): $V_n = R\,\frac{(1 + i)^n - 1}{i}$.

► The gained (earned) interest: $I = V_n - R \times n$.

► Present value (Actual value): $V = R\,\frac{1 - (1 + i)^{-n}}{i}$.

► The paid interest: $I = R \times n - V$.

► Remark:

The future value of an annuity is equal to the sum of all the successive payments and all the compound interest earned at the end of the term of the annuity.

The present value of an annuity is equal to the sum of the present values of all the successive payments.
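The remark above can be verified numerically: the closed formulas should match the payment-by-payment sums. This is a minimal sketch assuming hypothetical values (R = 100, i = 5 % per period, n = 3) and illustrative function names.

```python
def annuity_future_value(payment, i, n):
    """V_n = R ((1 + i)^n - 1) / i, payments made at the end of each period."""
    return payment * ((1 + i) ** n - 1) / i

def annuity_present_value(payment, i, n):
    """V = R (1 - (1 + i)^(-n)) / i."""
    return payment * (1 - (1 + i) ** (-n)) / i

R, i, n = 100, 0.05, 3
fv = annuity_future_value(R, i, n)
pv = annuity_present_value(R, i, n)

# Payment number k (made at the end of period k) earns interest for n - k periods.
fv_by_hand = sum(R * (1 + i) ** (n - k) for k in range(1, n + 1))
# Payment number k is discounted over k periods.
pv_by_hand = sum(R / (1 + i) ** k for k in range(1, n + 1))

earned_interest = fv - R * n
paid_interest = R * n - pv
print(fv, pv, earned_interest, paid_interest)
```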


Chapter 4: Logarithmic function

[Table of variations of ln x: ln′(x) is positive (+).]


Chapter 5: Exponential function

[Table of variations of $e^x$: $(e^x)'$ is positive (+).]


Chapter 6: Economical functions

I- Introduction

Let x be the number of units of a product that a manufacturing company produces during a certain time period; x is usually called the production level for that product. The three basic functions of this production level are the cost function C, the revenue function R, and the profit function P. The study of the rate at which one of these economic functions changes relative to x is called marginal analysis. Hence, the word « marginal » means « derivative of » or equivalently « the rate of change of ».

II- Cost functions

When an enterprise produces goods or services, or a factory produces items, during a certain time period, it spends the amount of money necessary to obtain a given production volume. These expenses are called costs.

1) Total cost function

In general, the total cost C(x) of producing x units of a product may be expressed as the sum of the fixed costs and the variable costs:

- The fixed costs are the costs that must be paid even if nothing is produced. These include, for instance, rent, electricity, insurance, etc. The fixed costs are independent of the level of production x. In practice, the fixed costs are obtained by evaluating the total cost at the level of production x = 0.

- The variable costs are the costs of labor and material. These costs depend on the number of units produced x.

- The total cost function is the sum of the fixed costs and the variable costs. We write: $C_T(x) = C_F + C_V(x)$, where $C_F = C_T(0)$ and $C_V$ is a function of x.

►Economical interpretation

If C(x0) = k, the total cost of production of x0 units is equal to k (monetary units).

2) Average cost function or unitary cost function

If C(x) is the total cost of producing x units of a product during a certain time period, then the average cost function $\bar{C}$ per unit is defined by $\bar{C}(x) = \frac{C(x)}{x}$. It is the production cost of one item of the product.

►Economical interpretation:

If $\bar{C}(x_0) = k$, then, when $x_0$ units are produced, the average cost of one unit is equal to k (monetary units).

3) Marginal cost function:

The marginal cost function $M_C$ is the derivative of the cost function; that is, $M_C(x) = C'(x)$.

►Economical interpretation

The marginal cost gives the cost of producing one supplementary unit.

If $M_C(x_0) = k$, then the cost of production of the $(x_0 + 1)$th unit is approximately equal to k (monetary units).


► Remark: minimizing the average cost:

The enterprise has an interest in minimizing the average cost. In general, $\bar{C}(x)$ is minimum if $\bar{C}'(x) = 0$; but $\bar{C}(x) = \frac{C(x)}{x}$, then

$$\bar{C}'(x) = \frac{C'(x) \cdot x - C(x)}{x^2}.$$

$\bar{C}'(x) = 0$ for $C'(x) \cdot x - C(x) = 0$, which gives $C'(x) = \frac{C(x)}{x}$, and that is $M_C(x) = \bar{C}(x)$.

Therefore, the average cost is minimum when it is equal to the marginal cost.

Attention: It is mandatory in some cases to study the variations of the average cost function $\bar{C}$ to determine its extremum in a precise manner.
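A small numerical check of this result, assuming a hypothetical cost function C(x) = x² + 100 (fixed cost 100, variable cost x²) and a search restricted to whole production levels from 1 to 100; the names are illustrative.

```python
def total_cost(x):
    """Hypothetical total cost: fixed cost 100 plus variable cost x^2."""
    return x ** 2 + 100

def average_cost(x):
    """Average cost per unit: C(x) / x."""
    return total_cost(x) / x

def marginal_cost(x):
    """Derivative of the hypothetical total cost, computed by hand: C'(x) = 2x."""
    return 2 * x

# Production level (among 1..100) with the smallest average cost.
best_x = min(range(1, 101), key=average_cost)
print(best_x, average_cost(best_x), marginal_cost(best_x))
```

At the level found, the average cost and the marginal cost coincide, as the derivation predicts.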

III- Revenue functions

1) Total revenue function

If the whole quantity x produced by a factory is sold at a certain unit price p during a given time period, we define the total revenue function by: R(x) = unit price × produced quantity = p × x.

In general, the unit price p is constant, and usually given in the problem.

The total revenue function is then a linear function of the level of production x.

► Remark: In some cases, the produced quantity is not sold in total, but the sold quantity represents a ratio r % of the whole production. In this case we use the formula

$$R(x) = \frac{p \times x \times \text{unit of } x}{\text{unit of } C(x)} \times \frac{r}{100}.$$

►Economical interpretation

If R(x0) = k, the total revenue of selling x0 units is equal to k (monetary units).

2) Average revenue function or unitary revenue function

If R(x) is the total revenue from selling x units of a product during a certain time period, then the average revenue function $\bar{R}$ per unit is defined by $\bar{R}(x) = \frac{R(x)}{x}$. It is the revenue brought by one item of the product.

►Economical interpretation

If $\bar{R}(x_0) = k$, then, when $x_0$ units are produced and sold, the average revenue of selling one unit is equal to k (monetary units).

3) Marginal revenue function:

The marginal revenue function $M_R$ is the derivative of the revenue function; that is, $M_R(x) = R'(x)$.

►Economical interpretation

The marginal revenue gives the revenue of selling one supplementary unit.

If $M_R(x_0) = k$, then the revenue of selling the $(x_0 + 1)$th unit is approximately equal to k (monetary units).

IV- The profit functions

1) Total profit function


The total profit function P(x) is the difference between the total revenue and the total cost. Hence P(x) = R(x) – C(x).

►Economical interpretation

If P(x0) = k, the total profit of selling x0 units is equal to k (monetary units).

► Remark:

- If P(x) > 0, then the enterprise makes a profit. In this case R(x) > C(x). Graphically, this case corresponds to the values of the production level x for which the representative curve of the revenue R is above that of the cost C.

- If P(x) < 0, then the enterprise makes a loss. In this case R(x) < C(x). Graphically, this case corresponds to the values of the production level x for which the representative curve of the revenue R is below that of the cost C.

- If P(x) = 0, then the enterprise is breaking even. That is, the enterprise is neither making nor losing money. In this case R(x) = C(x). Graphically, this case corresponds to the values of the production level x for which the representative curves of the revenue R and of the cost C intersect.

2) Average profit function or unitary profit function

If P(x) is the total profit from producing x units of a product during a certain time period, then the average profit function $\bar{P}$ per unit is defined by $\bar{P}(x) = \frac{P(x)}{x}$. It is the profit brought by one item of the product.

►Economical interpretation

If $\bar{P}(x_0) = k$, then, when $x_0$ units are produced and sold, the average profit of selling one unit is equal to k (monetary units).

3) Marginal profit function:

The marginal profit function $M_P$ is the derivative of the profit function; that is, $M_P(x) = P'(x)$.

►Economical interpretation

The marginal profit gives the profit of selling one supplementary unit.

If $M_P(x_0) = k$, then the profit of selling the $(x_0 + 1)$th unit is approximately equal to k (monetary units).

► Remark: Maximizing the profit: The main objective of the enterprise is to maximize its profit, that is, to determine the level of production x0 that makes the profit maximal. In general, the profit P(x) is maximal if P′(x) = 0, that is R′(x) − C′(x) = 0, so R′(x) = C′(x). So the level of production x0 that maximizes the profit corresponds to the value of x where the marginal revenue is equal to the marginal cost. The maximum amount of profit is P(x0).

Attention: It is mandatory in some cases to study the variations of the profit function P to determine its maximum in a precise manner.
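The rule "marginal revenue = marginal cost" can be illustrated with a short Python sketch. The unit price (50) and the cost function are hypothetical, and the search is limited to whole production levels from 0 to 100.

```python
def revenue(x):
    """Hypothetical revenue with a constant unit price of 50."""
    return 50 * x

def cost(x):
    """Hypothetical total cost: x^2 + 10x + 100."""
    return x ** 2 + 10 * x + 100

def profit(x):
    """P(x) = R(x) - C(x)."""
    return revenue(x) - cost(x)

def marginal_revenue(x):
    return 50            # R'(x), computed by hand

def marginal_cost(x):
    return 2 * x + 10    # C'(x), computed by hand

# Production level (among 0..100) with the largest profit.
best_x = max(range(0, 101), key=profit)
print(best_x, profit(best_x), marginal_revenue(best_x), marginal_cost(best_x))
```

Here P(0) is negative, which reflects the fixed costs paid even when nothing is produced.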

V- DEMAND, SUPPLY, ELASTICITY OF DEMAND

It is well known that if the price of an item rises, its demand will fall, and vice-versa. And, if p defines

the price at which consumers will purchase a product in a given unit of time, then:

1) The demand function D: gives the quantity D(p) of a product that will be sold by an enterprise to

the consumers at price p during the unit of time.


2) The supply function S: gives the quantity S(p) of a product that suppliers (enterprises) are willing

to produce or supply at price p during the unit of time.

3) Market Equilibrium: The point where the representative curves of the demand and supply functions of a given item intersect is called the equilibrium point, as shown in the adjacent figure.

Hence, the price $p_e$ and the quantity $D(p_e)$ at which the curves of the supply and demand functions intersect are called the equilibrium price and the equilibrium quantity, respectively.

Hence, the point of intersection $(p_e ; D(p_e))$ economically means that at this point we have market equilibrium.

4) Elasticity of the demand:

For the demand function D(p), the elasticity of demand E(p) at price p is defined by $E(p) = -p\,\frac{D'(p)}{D(p)}$.

Economists use the elasticity of demand to compare the relative rate of change in the demanded quantity with the relative rate of change in price. It is thus a measure of how responsive demand is to a price change, and hence for a particular price p0, we say that:

a. The demand is elastic at p0 if E(p0) > 1. It means that the quantity purchased changes by a larger percentage than the price.

b. The demand is inelastic at p0 if E(p0) < 1. It means that the quantity purchased changes by a smaller percentage than the price.

c. The demand has unit elasticity at p0 if E(p0) = 1.

d. The demand is perfectly inelastic if E(p0) = 0.

►Economical interpretation of the elasticity of the demand at a certain price p0

If E(p0) = k, then if the price increases by 1% from the unit price p0, the demand decreases by approximately k %.

5) Revenue and elasticity of the demand

Recall that:

Revenue = (Price per Unit) × (Quantity Demanded);

That is, R(p) = p × D(p).

Differentiating the above equation with respect to p, we obtain $R'(p) = D(p) + p\,D'(p) = D(p)\left[1 - E(p)\right]$.

Hence, we have the following criteria for determining how revenue behaves relative to changes in price.

a) If the demand is elastic at p, then the revenue R(p) decreases as the price p increases.

b) If the demand is inelastic at p, then the revenue R(p) increases as the price p increases.

c) If the demand has unit elasticity at p, then R′(p) = 0 and the revenue R(p) is maximum at p, as shown in the figure below.
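The link between elasticity and revenue can be checked on an example. The sketch below assumes a hypothetical linear demand D(p) = 100 − 2p (meaningful for 0 < p < 50); the function names are illustrative.

```python
def demand(p):
    """Hypothetical demand function, valid for 0 < p < 50."""
    return 100 - 2 * p

def demand_prime(p):
    """Derivative of the hypothetical demand, computed by hand."""
    return -2

def elasticity(p):
    """E(p) = -p D'(p) / D(p)."""
    return -p * demand_prime(p) / demand(p)

def revenue(p):
    """R(p) = p D(p)."""
    return p * demand(p)

def revenue_prime(p):
    """R'(p) = D(p) [1 - E(p)]."""
    return demand(p) * (1 - elasticity(p))

for price in (10, 25, 40):
    print(price, elasticity(price), revenue(price), revenue_prime(price))
```

At p = 10 the demand is inelastic and the revenue is increasing, at p = 40 it is elastic and the revenue is decreasing, and at p = 25 the elasticity is 1 and the revenue reaches its maximum.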


Chapter 7: Counting

I- n factorial: Let $n \in \mathbb{N}$ such that $n \ge 2$. n factorial, denoted n!, is defined by $n! = n \times (n - 1) \times (n - 2) \times \dots \times 3 \times 2 \times 1$.

By convention: 1! = 1 and 0! = 1.

II- Arrangement with repetition or p-list:

Let E be a finite set containing n elements and let p be a natural number. An arrangement with repetition of p elements (p by p) out of the set E is an ordered choice (sequence) of p elements of E, distinct or not.

General case: The number of arrangements with repetition of p elements (p by p) out of a set E containing n elements, that is the number of p-lists of E, is $n^p$.

III- Arrangement without repetition:

Let E be a finite set containing n elements and let $p \le n$. An arrangement p by p without repetition out of the set E is an ordered choice of p distinct elements of E.

Filling p boxes in order, there are n choices for the first box, n − 1 for the second, n − 2 for the third, n − 3 for the fourth, and so on.

Notation: We denote by $A_n^p$ the total number of arrangements taken p by p without repetition out of E, where, for $p \le n$, $A_n^p = \frac{n!}{(n - p)!}$.

IV- Permutation:

A permutation is an arrangement without repetition taken n by n out of a set E having n elements.

Letting p = n, $P_n = A_n^n = \frac{n!}{0!} = n!$ is the total number of permutations.

V- Combination

I- Subset of p elements of a finite set E - COMBINATIONS

1) Definition

Let E be a set of n elements and p a natural number such that p ≤ n. We call a combination of p elements of E any subset of E formed of p elements.

Such a combination is therefore an (unordered) choice of p elements of E.

Two combinations are different when they differ by at least one element.

The number of combinations of p elements of E is denoted by $C_n^p$ or $\binom{n}{p}$.

Examples:

$C_n^0 = 1$ because E has only one subset formed of zero elements (the empty set).

$C_n^1 = n$ because E has n subsets formed of a single element.

$C_n^n = 1$ because E has only one subset of n elements, which is E itself.

Let E = {1; 2; 3}. The combinations of two elements of E are the subsets {1; 2}, {1; 3} and {2; 3} of E.


Let E = {a; b; c; d} be a set of 4 elements. We know that this set has 24 arrangements of three elements: $A_4^3 = 24$. Thus (a; b; c), (a; c; b), (b; a; c), (b; c; a), (c; a; b) and (c; b; a) are distinct arrangements of three elements of E. These 6 arrangements define the same combination {a; b; c} because, for combinations, the writing order does not count.

2) Number of combinations

Let E be a set of n elements and p a natural number such that p ≤ n. We know that a combination determines p! arrangements without repetition; the number of combinations of p elements of E is therefore

$$C_n^p = \frac{A_n^p}{p!} = \frac{n!}{p!\,(n - p)!}.$$

Examples:

The number of ways to choose 3 persons from 5 is the number of combinations of 3 persons chosen from 5; this number is then $C_5^3 = \frac{5!}{3!\,2!} = \frac{5 \times 4 \times 3!}{3! \times 2 \times 1} = 10$ ways.

To play the lotto we choose a subset of 6 numbers from the set {1; 2; 3; …; 41; 42}, so we have $C_{42}^6 = \frac{42!}{6!\,36!} = 5245786$ choices.

3) Properties

n and p are two natural numbers such that 0 ≤ p ≤ n.

a) $C_n^p = C_n^{n-p}$.

b) If p ≥ 1, we have: $C_n^p = C_{n-1}^{p} + C_{n-1}^{p-1}$ (relation of Pascal).

4) Pascal's Triangle

Pascal's triangle helps to calculate all the $C_n^p$.

In the first column and on the diagonal we place 1, because $C_n^0 = C_n^n = 1$.

We add the cells $C_{n-1}^{p-1}$ and $C_{n-1}^{p}$ to obtain $C_n^p$, using the relation of Pascal.

| n \ p | 0 | 1 | 2 | 3 | 4 | 5 | … |
|---|---|---|---|---|---|---|---|
| 0 | 1 | | | | | | |
| 1 | 1 | 1 | | | | | |
| 2 | 1 | 2 | 1 | | | | |
| 3 | 1 | 3 | 3 | 1 | | | |
| 4 | 1 | 4 | 6 | 4 | 1 | | |
| 5 | 1 | 5 | 10 | 10 | 5 | 1 | |

5) Binomial formula of Newton

Let a and b be two real numbers and n a natural number.

$$(a + b)^n = C_n^0 a^n + C_n^1 a^{n-1} b + C_n^2 a^{n-2} b^2 + \dots + C_n^p a^{n-p} b^p + \dots + C_n^n b^n = \sum_{p=0}^{n} C_n^p\, a^{n-p}\, b^p.$$

Example: $(x + 1)^5 = x^5 + 5x^4 + 10x^3 + 10x^2 + 5x + 1$.
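The counting formulas of this chapter can be checked with Python's standard `math` module (`factorial`, `perm` and `comb` are available from Python 3.8, which is assumed here). The helper `pascal_triangle` is a hypothetical name; it builds each row from the previous one with the relation of Pascal.

```python
from math import comb, factorial, perm

def pascal_triangle(rows):
    """Rows 0 .. rows-1 of Pascal's triangle, built with C(n,p) = C(n-1,p-1) + C(n-1,p)."""
    triangle = [[1]]
    for n in range(1, rows):
        prev = triangle[-1]
        row = [1] + [prev[p - 1] + prev[p] for p in range(1, n)] + [1]
        triangle.append(row)
    return triangle

triangle = pascal_triangle(6)

print(factorial(4))        # 4!
print(perm(4, 3))          # arrangements of 3 elements out of 4
print(comb(5, 3))          # combinations of 3 elements out of 5
print(comb(42, 6))         # lotto choices
print(triangle[5])         # coefficients of (x + 1)^5

# Binomial formula checked numerically with a = 2, b = 3, n = 5.
binomial_sum = sum(comb(5, p) * 2 ** (5 - p) * 3 ** p for p in range(6))
print(binomial_sum, (2 + 3) ** 5)
```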


II- THE DIFFERENT TYPES OF DRAWINGS

We have a set E of n elements and we "draw" p elements of E.

The following table summarizes the different types of drawings:

| Type of drawing | Order | Repetition of elements | Counting |
|---|---|---|---|
| Successive with replacement | The order is taken into consideration | An element can be drawn several times | $n^p$ (arrangements with repetition) |
| Successive without replacement | The order is taken into consideration | An element is drawn only one time | $A_n^p$ (arrangements without repetition) |
| Simultaneous | The order is not taken into consideration | An element is drawn only one time | $C_n^p$ (combinations) |
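The three types of drawings can be enumerated with Python's standard `itertools` module and compared with the counting formulas. The set E and the value p = 2 below are hypothetical choices for illustration.

```python
from itertools import combinations, permutations, product

E = ["a", "b", "c", "d"]   # n = 4 elements
p = 2                      # number of elements drawn

with_replacement = list(product(E, repeat=p))    # successive, with replacement
without_replacement = list(permutations(E, p))   # successive, without replacement
simultaneous = list(combinations(E, p))          # simultaneous

print(len(with_replacement))      # n^p
print(len(without_replacement))   # A_n^p
print(len(simultaneous))          # C_n^p
```

Listing the outcomes this way is only practical for small sets, but it is a convenient check on the choice of formula.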


I- Conditional probability

► Definition 1

Let A and B be two events of a universe Ω such that P(B) ≠ 0.

The probability that event A is realized, knowing that event B is realized, is called conditional probability, and is denoted by P(A / B) or $P_B(A)$.

► Property 1

Let A and B be two events of a universe Ω such that P(B) ≠ 0.

We have: $P(A / B) = \frac{P(A \cap B)}{P(B)}$.

► Attention: The formula for P(A / B) is often used to determine P(A ∩ B). In most exercises P(A / B) is given or is directly calculated, and P(A ∩ B) is determined by the formula P(A ∩ B) = P(A / B) × P(B) = P(B / A) × P(A).

II- Formula of total probabilities

► Application: Study of a situation: representation by a weighted tree

In a high school, 54% of the students are girls (G). Among the girls, 72% are external (E), while among the boys (B), the percentage of external students is 76%.

A student is chosen at random. The objective is to calculate the probability of the event E: "The chosen student is external".

♥ We are interested in two criteria: sex (G or B) and quality (E or $\bar{E}$).

The universe of this random experiment is the set of students of the high school, and each student has the same probability of being chosen.

The situation can be represented by a chart called a weighted tree.

♥ Branches drawing

Initially, there are two possibilities, G or B, represented by two branches.

At node G, there are two cases, E or $\bar{E}$, depending on whether the girl is external or not. Similarly from node B.


♥ Probability entry

On the branch G, we enter P(G) = 0.54.

On the branch E starting from G, we write the probability that a girl is external: P(E / G) = 0.72.

On the branch $\bar{E}$ starting from G, we write the probability that a girl is not external: P($\bar{E}$ / G) = 0.28.

Among the girls, 72% are external and 28% are not, so P(E / G) + P($\bar{E}$ / G) = 1.

The tree is completed in the same way from B.

♥ Event represented by a path

The path G → E represents the event "The student is a girl and she is external". This event is G ∩ E. Let us calculate its probability.

P(G ∩ E) is the product of the probabilities written along the path G → E, then

P(G ∩ E) = P(E / G) × P(G) = 0.72 × 0.54 = 0.3888.

The path B → E represents the event "The student is a boy and he is external". This event is B ∩ E. Let us calculate its probability.

P(B ∩ E) is the product of the probabilities written along the path B → E, then

P(B ∩ E) = P(E / B) × P(B) = 0.76 × 0.46 = 0.3496.

♥ Formula of total probability

The event E appears on two branches of the tree; we say that E is the union of the two incompatible events G ∩ E and B ∩ E, then:

P(E) = P(G ∩ E) + P(B ∩ E) = 0.3888 + 0.3496 = 0.7384.
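The weighted tree of this example can be reproduced in a few lines of Python. The variable names are illustrative; the last line, which computes P(G / E), is an extra step added here as an application of the conditional probability formula, not a result stated in the example.

```python
from math import isclose

p_G = 0.54               # probability that the student is a girl
p_B = 1 - p_G            # probability that the student is a boy
p_E_given_G = 0.72       # P(E / G)
p_E_given_B = 0.76       # P(E / B)

# Each path of the tree: product of the probabilities along the path.
p_G_and_E = p_E_given_G * p_G
p_B_and_E = p_E_given_B * p_B

# Formula of total probability: E is the union of two incompatible events.
p_E = p_G_and_E + p_B_and_E

# Conditional probability that an external student is a girl.
p_G_given_E = p_G_and_E / p_E
print(p_G_and_E, p_B_and_E, p_E, p_G_given_E)
```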

► Definition 3

E is a set and $B_1, B_2, \dots, B_n$ are n subsets of E.

We say that $B_1, B_2, \dots, B_n$ form a partition of E if and only if we have:

1) $B_i \ne \emptyset$ for every i = 1, 2, …, n;

2) $B_i \cap B_j = \emptyset$ for all $i \ne j$;

3) $B_1 \cup B_2 \cup \dots \cup B_n = E$.

► Attention: If E is an event of the sample space Ω, then E and $\bar{E}$ form a partition of Ω.

► Property 2

If $B_1, B_2, \dots, B_n$ are n events forming a partition of the sample space Ω relative to a random experiment and E is an event of Ω (E ⊂ Ω), then we have:

$P(E) = P(E \cap B_1) + P(E \cap B_2) + \dots + P(E \cap B_n)$, therefore

$P(E) = P(E / B_1) \times P(B_1) + P(E / B_2) \times P(B_2) + \dots + P(E / B_n) \times P(B_n)$.

III- Independent events

► Definition 2

Let A and B be two events of the sample space Ω such that P(A) ≠ 0 and P(B) ≠ 0.

A and B are said to be independent if the probability of one of them does not change whether the other occurs or not, that is P(A / B) = P(A) or P(B / A) = P(B).

► Property 2

Two events A and B such that P(A) ≠ 0 and P(B) ≠ 0 are independent if P(A ∩ B) = P(A) × P(B).

► Attention: To prove that two events A and B are independent we use one of the following methods:

- We prove that P(A / B) = P(A) or P(B / A) = P(B).

- We prove that P(A ∩ B) = P(A) × P(B) by direct calculation of P(A ∩ B) and of P(A) × P(B).


IV- Random variable

Definition

Ω is the sample space of a random experiment. Defining a random variable X over Ω is to associate to each outcome of Ω a real number $x_i$.

If $x_1, x_2, \dots, x_n$ are the values of X, we define the probability distribution of X by calculating the probabilities $p_1, p_2, \dots, p_n$ of the values $x_i$ taken by X. We write $P(X = x_i) = p_i$.

Usually the probability distribution of a random variable X is represented by a table of the form:

| $X = x_i$ | $x_1$ | $x_2$ | … | $x_n$ |
|---|---|---|---|---|
| $p_i = P(X = x_i)$ | $p_1$ | $p_2$ | … | $p_n$ |

Note: $\sum_{i=1}^{n} p_i = p_1 + p_2 + \dots + p_n = 1$.

The mathematical expectation of X is the mean of the values $x_i$ weighted by the probabilities $p_i$:

$E(X) = \sum_{i=1}^{n} p_i x_i = p_1 x_1 + p_2 x_2 + \dots + p_n x_n$.

The variance of X is given by: $V(X) = \sum_{i=1}^{n} p_i x_i^2 - \left[E(X)\right]^2 = p_1 x_1^2 + p_2 x_2^2 + \dots + p_n x_n^2 - \left[E(X)\right]^2$.

The standard deviation of X is denoted by $\sigma(X) = \sqrt{V(X)}$.
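A minimal Python sketch of these formulas, assuming a hypothetical distribution (X takes the values 0, 1, 2 with probabilities 0.25, 0.5, 0.25); the function names are illustrative.

```python
from math import isclose, sqrt

def expectation(values, probs):
    """E(X) = sum of p_i * x_i."""
    return sum(p * x for x, p in zip(values, probs))

def variance(values, probs):
    """V(X) = sum of p_i * x_i^2, minus E(X)^2."""
    mean_of_squares = sum(p * x ** 2 for x, p in zip(values, probs))
    return mean_of_squares - expectation(values, probs) ** 2

# Hypothetical probability distribution.
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]

assert isclose(sum(probs), 1.0)   # the probabilities must add up to 1

e_x = expectation(values, probs)
v_x = variance(values, probs)
sigma_x = sqrt(v_x)
print(e_x, v_x, sigma_x)
```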

Prepared by H. Ahmad