basic statistical concepts


Upload: haile

Post on 07-Jan-2016

Page 1: BASIC STATISTICAL CONCEPTS

BASIC STATISTICAL CONCEPTS

Statistical Moments & Probability Density Functions

Ocean is not “stationary”

“Stationary” - statistical properties remain constant in time

Data collected have signal and noise

Both signal and noise are assumed to have random behavior

Population Sample

Page 2: BASIC STATISTICAL CONCEPTS

Most basic descriptive parameter for any set of measurements:

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

Sample Mean

over the duration of a time series – “time average”

or over an ensemble of measurements – “ensemble mean”

Sample mean is an unbiased estimate of the population mean μ

The population mean, μ, can be regarded as the expected outcome E(y) of an event y.

If the measurement is repeated many times, the average outcome approaches μ, i.e., E(y) (e.g., the weight printed on a bag of chips)
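The distinction between a time average and an ensemble mean can be sketched in a few lines of Python. The `records` values are hypothetical (not from the slides): each row stands for one repeated realization and each column for one time step.

```python
# Hypothetical data: rows = repeated realizations (ensemble members),
# columns = time steps. Values are illustrative only.
records = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [0.0, 1.0, 2.0, 3.0],
]


def mean(values):
    """Sample mean: (1/N) * sum of the values."""
    values = list(values)
    return sum(values) / len(values)


# Time average: average each record over its duration (one number per record).
time_averages = [mean(record) for record in records]

# Ensemble mean: average across realizations at each time step
# (one number per time step).
ensemble_means = [mean(column) for column in zip(*records)]

print(time_averages)
print(ensemble_means)
```

For a stationary process the two kinds of average would be expected to agree; here they differ because the illustrative records have a trend in time.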

Page 3: BASIC STATISTICAL CONCEPTS

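Sample Mean - locates the center of mass of the data distribution, such that the deviations from the mean sum to zero:

$$\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right) = 0, \qquad \sum_{i=1}^{N} x_i' = 0, \quad x_i' = x_i - \bar{x}$$

Weighted Sample Mean

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} f_i x_i$$

$f_i/N$: relative frequency of occurrence of the $i$-th value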
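A minimal Python sketch of both statements on this slide: the deviations from the sample mean sum to zero, and the frequency-weighted mean reproduces the plain mean. The `temps` readings are hypothetical, and the weighted sum is taken over the distinct values (an assumption about how the frequencies $f_i$ are meant to be counted).

```python
from collections import Counter


def sample_mean(values):
    """Arithmetic mean: (1/N) * sum of x_i."""
    return sum(values) / len(values)


def weighted_sample_mean(values):
    """Mean from frequencies: (1/N) * sum of f_i * x_i over distinct values."""
    counts = Counter(values)      # f_i for each distinct value x_i
    n = sum(counts.values())      # N = total number of observations
    return sum(f * x for x, f in counts.items()) / n


# Hypothetical temperature readings (illustrative only, not from the slides).
temps = [16.0, 16.5, 16.5, 17.0, 17.0, 17.0, 17.5]

xbar = sample_mean(temps)
deviation_sum = sum(x - xbar for x in temps)   # zero up to rounding error
weighted = weighted_sample_mean(temps)

print(xbar, weighted, deviation_sum)
```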

Page 4: BASIC STATISTICAL CONCEPTS

Variance - describes spread about the mean or sample variability

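$$s'^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$

Sample variance

$$s' = \sqrt{s'^2}$$

Sample standard deviation: typical difference from the mean

$$\sigma^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$

Population variance (unbiased estimate)

N needs to be > 1 to define variance and std dev

Only for N < 30 are s' and σ significantly different

$$\sigma^2 = \frac{1}{N-1}\left[\sum_{i=1}^{N} x_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} x_i\right)^2\right]$$

Computationally more efficient (only one pass through the data)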
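The two-pass and one-pass forms of the variance can be compared with a short Python sketch. The function names and the `data` values are illustrative assumptions, not part of the slides.

```python
def biased_variance(x):
    """Sample variance with 1/N normalization."""
    n = len(x)
    xbar = sum(x) / n
    return sum((xi - xbar) ** 2 for xi in x) / n


def variance_two_pass(x):
    """Unbiased estimate, 1/(N-1): one pass for the mean, one for deviations."""
    n = len(x)
    xbar = sum(x) / n
    return sum((xi - xbar) ** 2 for xi in x) / (n - 1)


def variance_one_pass(x):
    """Unbiased estimate from running sums of x and x^2 (single pass)."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for xi in x:
        n += 1
        total += xi
        total_sq += xi * xi
    return (total_sq - total * total / n) / (n - 1)


# Hypothetical data (illustrative only).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(biased_variance(data))
print(variance_two_pass(data))
print(variance_one_pass(data))
```

The one-pass form subtracts two large, nearly equal sums, so it can lose precision in floating point when the mean is large compared with the spread; the two-pass form is the safer choice in that case.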

Page 5: BASIC STATISTICAL CONCEPTS

The population variance estimate

$$\sigma^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$

has one degree of freedom (d.o.f.) fewer than the sample variance

$$s'^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$

because the population variance is estimated from the sample, whose mean $\bar{x}$ has already been computed from the same data (one fewer independent measure)

d.o.f. ν = number of independent pieces of data being used to make a calculation

= measure of how certain we are that our sample is representative of the entire population

The larger ν is, the more certain we are that the sample represents the entire population

Example: with 2 observations, estimating the mean uses 2 independent observations: ν = 2

But when estimating the variance, there is only one independent observation, because the two observations are at the same distance from the mean: ν = 1
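The two-observation example can be checked directly. In this sketch (with a hypothetical pair of values) the two deviations from the mean are equal in size and opposite in sign, so only one of them carries independent information.

```python
def deviations_from_mean(x):
    """Return the list of x_i - xbar."""
    xbar = sum(x) / len(x)
    return [xi - xbar for xi in x]


# Hypothetical pair of observations (illustrative only).
pair = [3.0, 7.0]

dev = deviations_from_mean(pair)

# With N = 2 the variance estimate divides by N - 1 = 1 degree of freedom.
unbiased_var = sum(d * d for d in dev) / (len(pair) - 1)

print(dev, unbiased_var)
```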

Page 6: BASIC STATISTICAL CONCEPTS

Other Values of Importance

Range – difference between the maximum and minimum values: 0.66 − (−0.61) = 1.27

Median – equal number of values above and below: −0.007

Mode – value occurring most often

(Example record: N = 1601)

Page 7: BASIC STATISTICAL CONCEPTS

Mode = -0.3

Two modes: bimodal
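Range, median, and mode can be computed with Python's standard `statistics` module. The `values` list below is hypothetical: it only borrows the minimum and maximum quoted on the earlier slide and is deliberately bimodal.

```python
import statistics

# Hypothetical record (illustrative only); min and max match the slide's
# quoted extremes, and two values are repeated to make it bimodal.
values = [-0.61, -0.3, -0.3, 0.1, 0.4, 0.4, 0.66]

data_range = max(values) - min(values)   # maximum minus minimum
median = statistics.median(values)       # middle value of the sorted data
modes = statistics.multimode(values)     # every value tied for most frequent

print(data_range, median, modes)
```

`statistics.multimode` returns all tied modes, which makes a bimodal record easy to recognize.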

Page 8: BASIC STATISTICAL CONCEPTS

Probability

Provides procedures to infer population distribution from sample distribution

and to determine how good the inference is

The probability of a particular event occurring is the ratio of the number of occurrences of that event to the total number of occurrences of all possible events

P (a die showing ‘6’) = 1/6

The probability of a continuous variable is defined by a PROBABILITY DENSITY FUNCTION -- PDF

0 ≤ P(x) ≤ 1

Page 9: BASIC STATISTICAL CONCEPTS

Probability is measured by the area underneath the PDF

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

Page 10: BASIC STATISTICAL CONCEPTS

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

Page 11: BASIC STATISTICAL CONCEPTS

Probability Density Function: Gauss (or Normal, or Bell)

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$

Area within ±1σ, ±2σ, and ±3σ of the mean:

erf(1/√2) = 68.3%

erf(2/√2) = 95.4%

erf(3/√2) = 99.7%
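The three percentages can be reproduced with the error function in Python's standard `math` module. The helper name `prob_within` is an illustrative choice.

```python
import math


def prob_within(k):
    """Probability that a normal variable lies within k standard
    deviations of its mean: erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * prob_within(k), 1))
```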

Page 12: BASIC STATISTICAL CONCEPTS

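Probability Density Function: Gauss (or Normal, or Bell)

$$F(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$

$$z = \frac{x - \mu}{\sigma}$$

standardized normal variable

Area within ±1, ±2, and ±3 of z = 0: 68.3%, 95.4%, 99.7%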
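Standardizing a variable does not change probabilities; it only rescales the axis. A small sketch with `statistics.NormalDist` illustrates this, using hypothetical values for the mean, standard deviation, and observation.

```python
from statistics import NormalDist

# Hypothetical parameters and observation (illustrative only).
mu, sigma = 26.5, 1.2
x = 27.7

# Standardized normal variable: z = (x - mu) / sigma
z = (x - mu) / sigma

# Cumulative probability computed on the original and on the standardized axis.
p_original = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist().cdf(z)

print(z, p_original, p_standard)
```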

Page 13: BASIC STATISTICAL CONCEPTS

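Probability Density Function: Gamma

$$f(x) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \qquad \Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx$$

Curves shown for β = 1 and α = 1, 2, 3, 4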

Page 14: BASIC STATISTICAL CONCEPTS

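Probability Density Function: Gamma

$$f(x) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \qquad \Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx$$

Curves shown for β = 2 and α = 1, 2, 3, 4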
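As a rough check that the gamma density is a proper PDF, the sketch below integrates it numerically. It assumes the standard two-parameter form with shape α and scale β; the function names, the trapezoidal integrator, and the finite upper limit are illustrative choices.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density x^(alpha-1) exp(-x/beta) / (beta^alpha Gamma(alpha))
    for x >= 0 (assumed shape alpha, scale beta)."""
    if x < 0:
        return 0.0
    return (x ** (alpha - 1) * math.exp(-x / beta)
            / (beta ** alpha * math.gamma(alpha)))


def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


def total_area(alpha, beta):
    """Area under the density; the upper limit is far into the tail."""
    return integrate(lambda x: gamma_pdf(x, alpha, beta), 0.0, 60.0 * beta)


# Same parameter combinations as the plotted curves.
areas = {(alpha, beta): total_area(alpha, beta)
         for beta in (1.0, 2.0) for alpha in (1, 2, 3, 4)}

for key, area in areas.items():
    print(key, round(area, 4))
```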

Page 15: BASIC STATISTICAL CONCEPTS

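Probability Density Function: Chi Square

$$f(x) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}$$

α = ν/2

Special case of the Gamma distribution for β = 2

Curves shown for ν = 2, 4, 6, 8, with variances σ² = 4, 8, 12, 16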
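A chi-square variable with ν degrees of freedom has mean ν and variance 2ν. The sketch below checks this numerically by treating the chi-square density as a gamma density with shape ν/2 and scale 2; the helper names and the finite integration limit are illustrative assumptions.

```python
import math


def chi2_pdf(x, nu):
    """Chi-square density written as a gamma density with
    shape nu/2 and scale 2."""
    alpha, beta = nu / 2.0, 2.0
    return (x ** (alpha - 1) * math.exp(-x / beta)
            / (beta ** alpha * math.gamma(alpha)))


def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


def chi2_mean_and_variance(nu, upper=200.0):
    """Numerical first moment and central second moment."""
    mean = integrate(lambda x: x * chi2_pdf(x, nu), 0.0, upper)
    second = integrate(lambda x: x * x * chi2_pdf(x, nu), 0.0, upper)
    return mean, second - mean ** 2


moments = {nu: chi2_mean_and_variance(nu) for nu in (2, 4, 6, 8)}

for nu, (mean, var) in moments.items():
    print(nu, round(mean, 3), round(var, 3))
```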

Page 16: BASIC STATISTICAL CONCEPTS

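CONFIDENCE INTERVALS

(Figure: PDF with central area 1 − α and tail areas α/2 on each side)

Confidence Interval for μ with σ known

For N > 30 (large enough sample)

the 100 (1 − α)% confidence interval is:

$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{N}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{N}}$$

$$z = \frac{x - \mu}{\sigma}$$

standardized normal variable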

Page 17: BASIC STATISTICAL CONCEPTS

(1 − α/2) = 0.975

http://statistics.laerd.com/statistical-guides/normal-distribution-calculations.php

$z_{\alpha/2}$ = 1.96

Page 18: BASIC STATISTICAL CONCEPTS

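100 (1 − α)% C.I. is:

$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{N}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{N}}$$

If α = 0.05, $z_{\alpha/2}$ = 1.96

Suppose we have a CT sensor at the outlet of a spring into the ocean. We obtain a burst sample of 50 measurements, once per second, with a sample mean of 26.5 °C and a stdev of 1.2 °C for the burst.

What is the range of possible values, at the 95% confidence level, for the population mean?

$$z_{\alpha/2}\frac{\sigma}{\sqrt{N}} = \frac{1.96 \times 1.2}{\sqrt{50}} \approx 0.33$$

$$26.17 < \mu < 26.83$$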
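The arithmetic for the CT-sensor example can be checked with a short Python sketch. It uses `statistics.NormalDist` to obtain the critical value instead of a table; the function name is an illustrative choice, and the burst standard deviation is used in place of σ, as in the example.

```python
import math
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """100(1 - alpha)% confidence interval for the population mean,
    using the standard normal critical value z_{alpha/2}."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Numbers from the example: N = 50, mean 26.5 degC, stdev 1.2 degC.
low, high = z_confidence_interval(26.5, 1.2, 50)
half_width = (high - low) / 2

print(round(half_width, 2), round(low, 2), round(high, 2))
```

With these inputs the half-width evaluates to about 0.33 °C.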

Page 19: BASIC STATISTICAL CONCEPTS

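CONFIDENCE INTERVALS

(Figure: PDF with central area 1 − α and tail areas α/2 on each side)

Confidence Interval for μ with σ unknown

For N < 30 (small samples)

the 100 (1 − α)% confidence interval is:

$$\bar{x} - t_{\alpha/2,\nu}\frac{s}{\sqrt{N}} < \mu < \bar{x} + t_{\alpha/2,\nu}\frac{s}{\sqrt{N}}$$

$$t = \frac{\bar{x} - \mu}{s/\sqrt{N}}$$

Student’s t-distribution with ν = (N − 1) degrees of freedom

$$z = \frac{x - \mu}{\sigma}$$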

Page 20: BASIC STATISTICAL CONCEPTS

α/2 = 0.025

d.o.f. = 19

Page 21: BASIC STATISTICAL CONCEPTS

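100 (1 − α)% C.I. is:

$$\bar{x} - t_{\alpha/2,\nu}\frac{s}{\sqrt{N}} < \mu < \bar{x} + t_{\alpha/2,\nu}\frac{s}{\sqrt{N}}$$

If α = 0.05, $t_{0.025,19}$ = 2.093

Suppose we do 20 CTD profiles at one station in St Augustine Inlet. We obtain a mean at the surface of 16.5 °C and a stdev of 0.7 °C.

What is the range of possible values, at the 95% confidence level, for the population mean?

$$t_{\alpha/2,\nu}\frac{s}{\sqrt{N}} = \frac{2.093 \times 0.7}{\sqrt{20}} \approx 0.33$$

$$16.17 < \mu < 16.83$$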
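The small-sample interval for the CTD example can be sketched the same way. Python's standard library has no Student's t quantile function, so this sketch takes the tabulated critical value 2.093 (19 degrees of freedom, α/2 = 0.025) as an input; the function name is an illustrative choice.

```python
import math


def t_confidence_interval(xbar, s, n, t_crit):
    """Confidence interval for the population mean when sigma is unknown.
    t_crit is the tabulated Student's t value for alpha/2 and n - 1 d.o.f."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


T_CRIT = 2.093   # tabulated t for alpha/2 = 0.025 and 19 d.o.f.

# Numbers from the example: N = 20, mean 16.5 degC, stdev 0.7 degC.
low, high = t_confidence_interval(16.5, 0.7, 20, T_CRIT)

print(round(low, 2), round(high, 2))
```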

Page 22: BASIC STATISTICAL CONCEPTS

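CONFIDENCE INTERVALS

(Figure: chi-square PDF with lower and upper limits $\chi^2_L = \chi^2_{1-\alpha/2,\nu}$ and $\chi^2_U = \chi^2_{\alpha/2,\nu}$)

Confidence Interval for σ²

To determine reliability of spectral peaks

Need to know C.I. for σ² on the basis of s²

$$\frac{(N-1)\,s^2}{\chi^2_{\alpha/2,\nu}} < \sigma^2 < \frac{(N-1)\,s^2}{\chi^2_{1-\alpha/2,\nu}}$$

ν = (N − 1) degrees of freedom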

Page 23: BASIC STATISTICAL CONCEPTS

Suppose that we have ν = 10 spectral estimates of a tidal record.

The background variance near a distinct spectral peak is 0.3 m²

95% C.I. for variance?

How large would the peak have to be to stand out, statistically, from the background level?

100 (1 − α)% C.I. is:

$$\frac{(N-1)\,s^2}{\chi^2_{\alpha/2,\nu}} < \sigma^2 < \frac{(N-1)\,s^2}{\chi^2_{1-\alpha/2,\nu}}$$

α/2 = 0.025; 1 − α/2 = 0.975

Look at Chi square table:

Page 24: BASIC STATISTICAL CONCEPTS

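Chi Square Table (ν = 10):

$$P\left(3.25 < \chi^2_{10} < 20.48\right) = 1 - \alpha$$

$$\frac{(N-1)\,s^2}{\chi^2_{\alpha/2,\nu}} < \sigma^2 < \frac{(N-1)\,s^2}{\chi^2_{1-\alpha/2,\nu}}$$

$$\frac{10 \times 0.3}{20.48} < \sigma^2 < \frac{10 \times 0.3}{3.25}$$

$$0.15 < \sigma^2 < 0.92$$

The background variance lies in this range

The spectral peak has to be greater than 0.92 m² to distinguish it from background levels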
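The variance interval for the spectral example can be reproduced in a few lines. The two chi-square critical values are the tabulated ones quoted on the slide (3.25 and 20.48 for 10 degrees of freedom); the function name and argument layout are illustrative assumptions.

```python
def variance_confidence_interval(s2, nu, chi2_lower, chi2_upper):
    """Confidence interval for the population variance:
    nu * s2 / chi2_upper < sigma^2 < nu * s2 / chi2_lower,
    with tabulated chi-square critical values supplied by the caller."""
    return nu * s2 / chi2_upper, nu * s2 / chi2_lower


# Numbers from the example: background variance 0.3 m^2, 10 d.o.f.,
# tabulated chi-square limits 3.25 and 20.48.
low, high = variance_confidence_interval(0.3, 10, 3.25, 20.48)

print(round(low, 2), round(high, 2))
```

A spectral peak above the upper limit is then distinguishable from the background at the chosen confidence level.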
