Sta 114, spring 2008

Probability and statistics

Instructor: Sayan Mukherjee

TA: Quanlin Li

STAT 113

There are three kinds of lies: lies, damned lies, and statistics.

B. Disraeli

Perspectives on stats

What is probability?

Probability is the branch of mathematics that deals with calculating the likelihood of a given event's occurrence, expressed as a number between 0 and 1.

What is statistics?

The word "statistics" derives from:

• Latin: statisticum collegium ("council of state")

• Italian: statista ("statesman" or "politician")

• German: Statistik, first introduced by Gottfried Achenwall (1749). It originally designated the analysis of data about the state, or the "science of state", and acquired the general meaning of the collection and classification of data in the early 19th century.

Statistics as inverse probability -- estimating parameters from experimental data
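
To make "inverse probability" concrete, here is a minimal sketch that assumes a coin-flip model, which is not in the original slides. The forward (probability) direction generates data from a known parameter; the inverse (statistics) direction recovers the parameter from the data. The names `simulate_flips`, `estimate_p`, and `true_p` are hypothetical.

```python
import random

def simulate_flips(p, n, rng):
    """Forward problem: generate n coin flips with known heads probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def estimate_p(flips):
    """Inverse problem: maximum-likelihood estimate of p from observed flips."""
    return sum(flips) / len(flips)

rng = random.Random(0)   # fixed seed so the run is repeatable
true_p = 0.3             # hypothetical parameter, hidden from the estimator
flips = simulate_flips(true_p, 10_000, rng)
p_hat = estimate_p(flips)
print(f"true p = {true_p}, estimate from data = {p_hat:.3f}")
```

With 10,000 flips the estimate should land close to the true value, and the gap shrinks as more data is collected.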

Well-posed problems

A problem is well-posed if its solution

• exists

• is unique

• is stable, i.e. depends continuously on the data

Inverse problems are typically ill-posed
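
The stability condition can fail even when a solution exists and is unique. The sketch below uses a hypothetical, nearly singular 2x2 linear system (chosen for illustration, not taken from the slides) to show a tiny change in the data producing a large change in the solution.

```python
def solve_2x2(a, b, c, d, y1, y2):
    """Solve [[a, b], [c, d]] @ [x1, x2] = [y1, y2] by Cramer's rule."""
    det = a * d - b * c
    return ((y1 * d - b * y2) / det, (a * y2 - c * y1) / det)

# Nearly singular matrix: the two rows are almost parallel.
a, b, c, d = 1.0, 1.0, 1.0, 1.0001

x_clean = solve_2x2(a, b, c, d, 2.0, 2.0001)   # exact data
x_noisy = solve_2x2(a, b, c, d, 2.0, 2.0002)   # data perturbed by 0.0001

print("solution with exact data:    ", x_clean)
print("solution with perturbed data:", x_noisy)
```

The exact data gives a solution near (1, 1), while a perturbation of 0.0001 in one measurement moves it to roughly (0, 2): the solution does not depend continuously on the data in any practical sense.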

Class requirements and rules

Course webpage

First digits

List of world records

Count entries starting with: {1,2,3,4,5,6,7,8,9}

Count entries ending with: {1,2,3,4,5,6,7,8,9}

Accounting fraud
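
The counting exercise above can be sketched as follows. The world-records list is not available here, so the data is a synthetic stand-in (an assumption): products of random factors, which spread over many orders of magnitude. First digits are compared with Benford's law, P(d) = log10(1 + 1/d), while last digits should be roughly uniform.

```python
import math
import random
from collections import Counter

def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])

def last_digit(n):
    """Trailing decimal digit of a positive integer."""
    return n % 10

# Synthetic stand-in data (an assumption, NOT the world-records list):
# each entry is the integer part of a product of 12 random factors in [1, 10].
rng = random.Random(0)
data = [int(math.prod(rng.uniform(1, 10) for _ in range(12)))
        for _ in range(20_000)]

first_freq = Counter(first_digit(v) for v in data)
last_freq = Counter(last_digit(v) for v in data)

print("digit  first  Benford  last")
for d in range(1, 10):
    benford = math.log10(1 + 1 / d)   # Benford's law for leading digit d
    print(f"{d:5d}  {first_freq[d] / len(data):.3f}  {benford:.3f}    "
          f"{last_freq[d] / len(data):.3f}")
```

Fabricated figures often fail to show this skew toward small leading digits, which is why first-digit counts are used as a screening tool for accounting fraud.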

What’s wrong with the heartland?

It’s the emptiness

The geometry of randomness

Dido’s problem (isoperimetry): among all closed curves in the plane of fixed length, find the one that encloses the largest area.
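
A small numerical check of the isoperimetric idea, restricted to regular polygons as a simplifying assumption: for a fixed perimeter, the enclosed area grows with the number of sides and approaches, but never exceeds, the area of the circle. The function names are hypothetical.

```python
import math

def regular_polygon_area(n_sides, perimeter):
    """Area of a regular polygon with n_sides sides and the given perimeter."""
    return perimeter ** 2 / (4 * n_sides * math.tan(math.pi / n_sides))

def circle_area(perimeter):
    """Area of the circle whose circumference equals the given perimeter."""
    return perimeter ** 2 / (4 * math.pi)

perimeter = 1.0
side_counts = (3, 4, 6, 12, 100)
areas = [regular_polygon_area(n, perimeter) for n in side_counts]

for n, area in zip(side_counts, areas):
    print(f"{n:4d} sides: area = {area:.6f}")
print(f"   circle: area = {circle_area(perimeter):.6f}")
```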

The geometry of Gaussian random variables

A Gaussian distribution:
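
As a reference point, and assuming the univariate case is meant here, a Gaussian random variable with mean $\mu$ and variance $\sigma^2$ (symbols introduced for this note) has the density:

```latex
p(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}
       \exp\!\left( -\frac{(x - \mu)^{2}}{2\sigma^{2}} \right)
```

The standard Gaussian is the special case $\mu = 0$, $\sigma^2 = 1$.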

The geometry of Gaussian random variables

A draw of n Gaussian random variables is a point in an n-dimensional space. How far from the origin is this point?

$\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$

For large n, the answer is that with very high probability

$1 - \frac{c}{\sqrt{n}} \le \frac{\|x\|}{\sqrt{n}} \le 1 + \frac{c}{\sqrt{n}}$
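
A quick simulation of this claim, assuming the coordinates are independent standard Gaussians (the slide does not state this explicitly). For each dimension n it draws 100 points and reports how far the ratio of the norm to the square root of n strays from 1.

```python
import math
import random

def gaussian_norm(n, rng):
    """Euclidean norm of a point with n independent N(0, 1) coordinates."""
    return math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)))

rng = random.Random(0)   # fixed seed so the run is repeatable
spread = {}
for n in (10, 100, 1000, 10000):
    ratios = [gaussian_norm(n, rng) / math.sqrt(n) for _ in range(100)]
    spread[n] = (min(ratios), max(ratios))
    print(f"n = {n:5d}: ratio ranges over [{min(ratios):.3f}, {max(ratios):.3f}]")
```

The range of the ratio should tighten around 1 as n grows: in high dimensions, almost all of the mass sits in a thin shell whose radius is close to the square root of n.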

Law of large numbers or central limit theorem

The previous observation is a special case of the following phenomenon:

Given a smooth function $f$ of n variables $x = (x_1, \ldots, x_n)$, the following is true:

$\Pr\left( |f(x) - \mathbb{E}_x f(x)| \ge h \right) \le C_1 \exp(-C_2 h^2 n)$

A classic example: $f(x) = \frac{x_1 + x_2 + \cdots + x_n}{n}$
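
For the sample mean, the bound can be checked empirically. The sketch below assumes independent standard Gaussian inputs, for which the standard Gaussian tail bound gives the constants C1 = 2 and C2 = 1/2; these constants are specific to that assumption and are not stated on the slide.

```python
import math
import random

def tail_frequency(n, h, trials, rng):
    """Fraction of trials in which |mean of n N(0, 1) draws| >= h."""
    hits = 0
    for _ in range(trials):
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        if abs(mean) >= h:
            hits += 1
    return hits / trials

rng = random.Random(0)   # fixed seed so the run is repeatable
h = 0.2
results = {}
for n in (10, 50, 200):
    empirical = tail_frequency(n, h, 2000, rng)
    bound = 2 * math.exp(-0.5 * h * h * n)   # C1 = 2, C2 = 1/2 (Gaussian case)
    results[n] = (empirical, bound)
    print(f"n = {n:3d}: empirical tail = {empirical:.4f}, bound = {bound:.4f}")
```

Both the observed frequency of large deviations and the bound fall quickly as n grows, which is the law of large numbers seen through a concentration inequality.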

Geometry of real data

Digits in space

Mandarin tones

Regression -- pedestrian detection

Papageorgiou and Poggio, 1998

Daimler Chrysler

Experimental Mercedes

A fast version, integrated with a real-time obstacle detection system

MPEG

Constantine Papageorgiou

People classification/detection

Stuttgart

STA 293 03, fall 2005

More regression: talking faces

Text-to-visual-speech (TTVS) systems:

More regression: talking faces

• Hunter

• It’s automatic

• Today show

Conclusion

Statistics is about predictive modeling that quantifies uncertainty

There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know.

---- Donald Rumsfeld