introduction to probability theory and statistics
DESCRIPTION
Introduction to probability theory and statistics. HY 335 Presented by: George Fortetsanakis. Roadmap. Elements of probability theory Probability distributions Statistical estimation Reading plots. Terminology. Event: every possible outcome of an experiment. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/1.jpg)
Introduction to probability theory and statistics
HY 335Presented by: George Fortetsanakis
![Page 2: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/2.jpg)
Roadmap• Elements of probability theory• Probability distributions• Statistical estimation• Reading plots
![Page 3: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/3.jpg)
Terminology• Event: every possible outcome of an experiment.• Sample space: The set of all possible outcomes of an experiment
Example: Roll a dice• Every possible outcome is an event• Sample space: {1, 2, 3, 4, 5, 6}
Example: Toss a coin• Every possible outcome is an event• Sample space: {head, tails}
![Page 4: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/4.jpg)
Probability axioms of Kolmogorov
Probability of an event A is a number assigned to this event such that:1. : all probabilities lie between 0 and 12. P(Ø) = 0 : the probability to occur nothing is zero3. : The probability of the sample space is 14.
1)(0 AP
1)( SP
)()()()( BAPBPAPBAP
A BA∩B
![Page 5: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/5.jpg)
Theorems from the axioms
Proof:
)(1)( APAP
![Page 6: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/6.jpg)
Theorems from the axioms
Proof:
)(1)( APAP
)(1)(
1)()(
1)()()(
1)(1)(
APAP
APAP
AAPAPAP
AAPSP
(From axiom 3)
(From axiom 4)
(From axiom 2)
![Page 7: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/7.jpg)
Independence• A and B are independent if• Outcome A has no effect on outcome B and vice versa.• Example: Probability of tossing heads and then tails is 1/2*1/2 = 1/4
)(*)()( BPAPBAP
![Page 8: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/8.jpg)
Conditional probabilities
Definition of conditional probability:
The chain rule:
If A, B are independent:
)(
)()|(
BP
BAPBAP
)|()()( BAPBPBAP
)()|(
)|()()()()(
APBAP
BAPBPAPBPBAP
![Page 9: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/9.jpg)
Total probability theorem
Assume that B1, B2, …, Bn form a partition of the sampling space:
Then:
},...,2,1{, 0
and ...21
njiBjBi
SBBB n
)()|(
)()(
1
1
i
n
ii
n
ii
BPBAP
BAPAP
B1 B2 B3
B4B5
A
A∩B1
A∩B2
A∩B3
![Page 10: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/10.jpg)
Roadmap• Elements of probability theory• Probability distributions• Statistical estimation• Reading plots
![Page 11: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/11.jpg)
Probability mass function• Defined for discrete random variable X with domain D.• Maps each element x∈D to a positive number P(x) such that
• P(x) is the probability that x will occur.
, 1)(0 xP
Dx
xP 1)(
1 2 3 4 5 6 x
P(x)
Probability mass function of fair dice
![Page 12: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/12.jpg)
Geometric distribution
Distribution of number of trials until desired event occurs.• Desired event: d∈D• Probability of desired event: p
ppkP k 1)1()(
Number of trials Probability of k-1 failures Probability of success
)()|(:property Memoryless kKPnKnkKP
![Page 13: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/13.jpg)
Continuous Distributions• Continuous random variable X takes values in subset of real numbers D⊆R• X corresponds to measurement of some property, e.g., length, weight
• Not possible to talk about the probability of X taking a specific value
• Instead talk about probability of X lying in a given interval
0)( xXP
]) ,[()( 2121 xxXPxXxP
]) ,[()( xXPxXP
![Page 14: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/14.jpg)
Probability density function (pdf)• Continuous function p(x) defined for each x∈D• Probability of X lying in interval I⊆D computed by integral:
• Examples:
• Important property:
lxdxxplXP )()(
1)()( DxdxxpDXP
2
1
)(]) x,[()( 2121
x
x
dxxpxXPxXxP
x
dxxpXPxXP )(]) x,[()(
![Page 15: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/15.jpg)
Cumulative distribution function (cdf)
• For each x∈D defines the probability
Important properties:• • •
Complementary cumulative distribution function (ccdf)
)( xXP
x
dxxpXPxXPxF )(]) x,[()()(
0)( F
1)( F
)()()( 1221 xFxFxXxP
)(1)(1)()( xFxXPxXPxG
![Page 16: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/16.jpg)
Expectations• Mean value
• Variance: indicates depression of samples around the mean value
• Standard deviation
dxxxp )(][
dxxpx )()(])[( 222
2
![Page 17: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/17.jpg)
Gaussian distribution
Probability density function
Parameters• mean: μ• standard deviation: σ
![Page 18: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/18.jpg)
Exponential distribution
Probability density function
Cumulative distribution function
Memoryless property:
)()|( tTPTtTP
![Page 19: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/19.jpg)
Poisson process
Random process that describes the timestamps of various events• Telephone call arrivals • Packet arrivals on a router
Time between two consecutive arrivals follows exponential distribution
Time intervals t1, t2, t3, … are drawn from exponential distribution
t1 t2 t3 t4 t5 t6
…
Arrival 1 Arrival 2 Arrival 3 Arrival 4 Arrival 5 Arrival 6 Arrival 7
![Page 20: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/20.jpg)
Problem definition
Dataset D={x1, x2, …, xk} collected by network administrator • Arrivals of users in the network• Arrivals of packets on a wireless router
How can we compute the parameter λ of the exponential distribution that better fits the data?
![Page 21: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/21.jpg)
Maximum likelihood estimation
Maximize likelihood of obtaining the data with respect to parameter λ
k
ii
i
k
i
k
xp
xp
xxxp
Dp
1
1
21
*
)|(lnmaxarg
)|(maxarg
)|,...,,(maxarg
)|(maxarg
Likelihood function
Due to independence of samples
![Page 22: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/22.jpg)
Estimate parameter λ• Probability density function
• Define the log-likelihood function
• Set derivative equal to 0 to find maximum
k
ii
k
ii
k
i
k
i
x xkxel i
1111
)ln()ln()ln()(
k
ii
k
ii
x
kx
k
d
dl
1
*
1
00)(
![Page 23: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/23.jpg)
Roadmap• Elements of probability theory• Probability distributions• Statistical estimation• Reading plots
![Page 24: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/24.jpg)
Basic statistics
Suppose a set of measurements x = [x1 x2 … xn]
• Estimation of mean value: (matlab m=mean(x);)
• Estimation of standard deviation: (matlab s=std(x);)
n
xn
ii
1^
11
2^
^
n
xn
ii
![Page 25: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/25.jpg)
Estimate pdf
• Suppose dataset x = [x1 x2 … xn] • Can we estimate the pdf that values in x follow?
![Page 26: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/26.jpg)
Estimate pdf
• Suppose dataset x = [x1 x2 … xn] • Can we estimate the pdf that values in x follow?
Produce histogram
![Page 27: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/27.jpg)
Step 1• Divide sampling space into a number of bins
• Measure the number of samples in each bin
-4 -2 0 2 4
-4 -2 0 2 4
3 samples 6 samples5 samples 2 samples
![Page 28: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/28.jpg)
Step 2
• E = total area under histogram plot = 2*3 + 2*5 + 2*6 +2*2 = 32• Normalize y axis by dividing by E
-4 -2 0 2 4
3
65
2
x
Freq
uenc
y
-4 -2 0 2 4
3/32
6/325/32
2/32
x
P(x)
![Page 29: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/29.jpg)
Matlab codefunction produce_histogram(x, bins)% input parameters% X =[x1; x2; … xn]: a column vector containing the data x1, x2, …, xn.
% bins = [b1; b2; …bk]: A vector that Divides the sampling space in bins
% centered around the points b1, b2, …, bk.
figure; % Create a new figure[f y] = hist(x, bins); % Assign your data points to the corresponding binsbar(y, f/trapz(y,f), 1); % Plot the histogramxlabel('x'); % Name axis xylabel('p(x)'); % Name axis y
end
![Page 30: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/30.jpg)
Histogram examples10
00 s
ampl
es10
000
sam
ples
Bin spacing 0.1 Bin spacing 0.05
![Page 31: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/31.jpg)
Empirical cdf
How can we estimate the cdf that values in x follow?
Use matlab function ecdf(x)
Empirical cdf estimated with 300 samples from normal distribution
![Page 32: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/32.jpg)
Percentiles• Values of variable below which a certain percentage of observations fall• 80th percentile is the value, below which 80 % of observations fall.
80th percentile
![Page 33: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/33.jpg)
Estimate percentiles
Percentiles in matlab: p = prctile(x, y);• y takes values in interval [0 100]• 80th percentile: p = prctile(x, 80);
Median: the 50th percentile• med = prctile(x, 50); or• med = median(x);
Why is median different than the mean?• Suppose dataset x = [1 100 100]: mean = 201/3=67, median = 100
![Page 34: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/34.jpg)
Roadmap• Elements of probability theory• Probability distributions• Statistical estimation• Reading plots
![Page 35: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/35.jpg)
Reading empirical cdfs
Short tails Long tails
![Page 36: Introduction to probability theory and statistics](https://reader035.vdocuments.site/reader035/viewer/2022081508/56812ac2550346895d8e8dff/html5/thumbnails/36.jpg)
Reading time series