Exploring Randomness: Delusions and Opportunities LS 812 Mathematics in Science and Civilization November 3, 2007


Page 1: Exploring Randomness: Delusions and Opportunities

Exploring Randomness: Delusions and Opportunities

LS 812 Mathematics in Science and Civilization

November 3, 2007

Page 2: Exploring Randomness: Delusions and Opportunities

Sources and Resources

• Taleb, Nassim Nicholas. Fooled by Randomness, Second Edition. Random House, New York.

• Weldon, K.L. Everyday Benefits of Understanding Variability. Presented at Applied Statistics Conference, Ribno, Slovenia. September, 2007.

• www.stat.sfu.ca/~weldon

Page 3: Exploring Randomness: Delusions and Opportunities

Introduction

• Randomness is about Uncertainty - e.g. Coin

• Is Mathematics about Certainty? - P(H) = 1/2

• Mathematics can help to Describe “Unexplained Variability”

• Randomness concept is key for “Probability”

• Probability is key to exploring implications of “unexplained variability”

Page 4: Exploring Randomness: Delusions and Opportunities

Abstract               Real World
Mathematics            Applications of Mathematics
Randomness             Applied Statistics
Surprising Findings    Useful Principles

Ten Findings and Associated Principles

Page 5: Exploring Randomness: Delusions and Opportunities

Example 1 - When is Success just Good Luck?

An example from the world of Professional Sport

Page 6: Exploring Randomness: Delusions and Opportunities
Page 7: Exploring Randomness: Delusions and Opportunities

Sports League - Football

Success = Quality or Luck?

2007 AFL LADDER

TEAM               Played  Win  Draw  Loss  Points For  Points Against  Ratio  Points
Geelong              22     18    -     4      2542          1664        153     72
Port Adelaide        22     15    -     7      2314          2038        114     60
West Coast Eagles    22     15    -     7      2162          1935        112     60
Kangaroos            22     14    -     8      2183          1998        109     56
Hawthorn             22     13    -     9      2097          1855        113     52
Collingwood          22     13    -     9      2011          1992        101     52
Sydney Swans         22     12    1     9      2031          1698        120     50
Adelaide             22     12    -    10      1881          1712        110     48
St Kilda             22     11    1    10      1874          1941         97     46
Brisbane Lions       22      9    2    11      1986          1885        105     40
Fremantle            22     10    -    12      2254          2198        103     40
Essendon             22     10    -    12      2184          2394         91     40
Western Bulldogs     22      9    1    12      2111          2469         86     38
Melbourne            22      5    -    17      1890          2418         78     20
Carlton              22      4    -    18      2167          2911         74     16
Richmond             22      3    1    18      1958          2537         77     14

Page 8: Exploring Randomness: Delusions and Opportunities
Page 9: Exploring Randomness: Delusions and Opportunities

Recent News Report

“A crowd of 97,302 has witnessed Geelong break its 44-year premiership drought by crushing a hapless Port Adelaide by a record 119 points in Saturday's grand final at the MCG.” (2007 Season)

Page 10: Exploring Randomness: Delusions and Opportunities

Sports League - Football

Success = Quality or Luck? 2007 AFL LADDER (table repeated from Page 7)

Page 11: Exploring Randomness: Delusions and Opportunities

Are there better teams?

• How much variation in the total points table would you expect IF every team had the same chance of winning every game? i.e. every game is 50-50.

• Try the experiment with 5 teams. H=Win T=Loss (ignore Ties for now)

Page 12: Exploring Randomness: Delusions and Opportunities

5 Team Coin Toss Experiment

My experiment …

• Win=4, Tie=2, Loss=0 but we ignore ties. P(W)=1/2
• 5 teams (1,2,3,4,5) so 10 games as follows
• 1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, 4-5
• T T H T T H H H H T

Experiment Result ----->

Team   Points
3      16
2      12
5       8
1       4
4       0

But all teams Equal Quality (Equal Chance to win)
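The tally for this experiment can be reproduced with a short script. This is a minimal sketch, assuming that H means the first-listed team of each pairing wins and T means the second-listed team wins. That reading is not stated on the slide, but it reproduces the point totals shown (16, 12, 8, 4, 0). The function name `tally_points` is illustrative.

```python
from itertools import combinations

def tally_points(tosses, n_teams=5, win_points=4):
    """Award win_points to the winner of each game, in round-robin order."""
    games = list(combinations(range(1, n_teams + 1), 2))  # 1-2, 1-3, ..., 4-5
    if len(tosses) != len(games):
        raise ValueError("need exactly one toss per game")
    points = {team: 0 for team in range(1, n_teams + 1)}
    for (first, second), toss in zip(games, tosses):
        # Assumption: H = first-listed team wins, T = second-listed team wins.
        winner = first if toss == "H" else second
        points[winner] += win_points
    return points

tosses = "T T H T T H H H H T".split()
points = tally_points(tosses)
for team, pts in sorted(points.items(), key=lambda kv: -kv[1]):
    print(team, pts)
```

Replacing `tosses` with a fresh set of ten coin tosses shows how much the ranking changes from one run to the next.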

Page 13: Exploring Randomness: Delusions and Opportunities

Implications?

•Points spread due to chance?

•Top team may be no better than the bottom team (in chance to win).

Page 14: Exploring Randomness: Delusions and Opportunities

Simulation: 16 teams, equal chance to win, 22 games
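A simulation of this kind might look like the sketch below. The slide does not say how games were scheduled, so the code assumes each of the 22 rounds pairs the 16 teams at random. Ties are ignored and a win is worth 4 points, as in the coin toss experiment. The name `simulate_season` and the seed are illustrative.

```python
import random

def simulate_season(n_teams=16, n_rounds=22, win_points=4, seed=None):
    """Simulate a season in which every game is a fair coin toss."""
    rng = random.Random(seed)
    points = [0] * n_teams
    for _ in range(n_rounds):
        # Assumption: each round pairs the teams at random, so every team
        # plays exactly once per round (this is not the real AFL fixture).
        order = list(range(n_teams))
        rng.shuffle(order)
        for i in range(0, n_teams, 2):
            a, b = order[i], order[i + 1]
            winner = a if rng.random() < 0.5 else b
            points[winner] += win_points
    return sorted(points, reverse=True)

ladder = simulate_season(seed=1)
print(ladder)
print("spread between top and bottom:", ladder[0] - ladder[-1])
```

Running it with several seeds gives a feel for the points spread that chance alone produces, which can then be compared with the 2007 ladder.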

Page 15: Exploring Randomness: Delusions and Opportunities

Sports League - Football

Success = Quality or Luck? 2007 AFL LADDER (table repeated from Page 7)

Page 16: Exploring Randomness: Delusions and Opportunities

Does it Matter?

• Avoiding foolish predictions
• Managing competitors (of any kind)
• Understanding the business of sport
• Appreciating the impact of uncontrolled variation in everyday life

Page 17: Exploring Randomness: Delusions and Opportunities

Point of this Example?

Need to discount “chance” in making inferences from everyday observations.

Page 18: Exploring Randomness: Delusions and Opportunities

Example 2 - Order from Apparent Chaos

An example from some personal data collection

Page 19: Exploring Randomness: Delusions and Opportunities

Gasoline Consumption

Each Fill - record kms and litres of fuel used

Smooth ---> Seasonal Pattern … Why?

Page 20: Exploring Randomness: Delusions and Opportunities

Pattern Explainable?

Air temperature?

Rain on roads?

Seasonal Traffic Pattern?

Tire Pressure?

Info Extraction Useful for Exploration of Cause

Smoothing was key technology in info extraction

Page 21: Exploring Randomness: Delusions and Opportunities

Is Smoothing Objective?

1 2 3 4 5 4 3 2 1 2 3 4 5

Data plotted ->>

Page 22: Exploring Randomness: Delusions and Opportunities

How much to smooth?

Page 23: Exploring Randomness: Delusions and Opportunities

Optimal Smoothing Parameter?

• Depends on Purpose of Display
• Choice Ultimately Subjective
• Subjectivity is a necessary part of good data analysis
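The effect of the smoothing parameter can be seen on the 13-point series from the "Is Smoothing Objective?" slide. The sketch below uses a centered moving average, which is only one possible smoother; the slides do not say which method was used. The parameter name `half_width` is illustrative.

```python
def moving_average(values, half_width):
    """Centered moving average; the window is truncated at the ends."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

data = [1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5]
for half_width in (0, 1, 4):
    print(half_width, [round(v, 2) for v in moving_average(data, half_width)])
```

A half-width of 0 returns the data unchanged, while wider windows flatten the peaks and troughs. Which width is best depends on what the display is for.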

Page 24: Exploring Randomness: Delusions and Opportunities

Summary of this Example

• Surprising? Order from Chaos …

• Principle - Smoothing and Averaging reveal patterns encouraging investigation of cause

Page 25: Exploring Randomness: Delusions and Opportunities

Example 3 - Utility of Averages

• Understanding them can contribute to your wealth!

-1  -.5  0  3

AVG = .38

Page 26: Exploring Randomness: Delusions and Opportunities

Preliminary Proposal

I offer you the following “investment opportunity”

You give me $100. At the end of one year, I will return an amount determined by tossing a fair coin twice, as follows:

$0 ……… 25% of the time (TT)
$50 ……. 25% of the time (TH)
$100 …… 25% of the time (HT)
$400 …… 25% of the time (HH)

Would you be interested?

Page 27: Exploring Randomness: Delusions and Opportunities

Stock Market Investment

• Risky Company - example in a known context

• Return in 1 year for 1 share costing $1
  0.00 25% of the time
  0.50 25% of the time
  1.00 25% of the time
  4.00 25% of the time
  i.e. Lose Money 50% of the time

Only Profit 25% of the time

“Risky” because high chance of loss

Page 28: Exploring Randomness: Delusions and Opportunities

Independent Outcomes

• What if you have the chance to put $1 into each of 100 such companies, where the companies are all in very different markets?

• What sort of outcomes then? Use coin-tossing (by computer) to explore

Page 29: Exploring Randomness: Delusions and Opportunities

Diversification Unrelated Companies

• Choose 100 unrelated companies, each one risky like this. Outcome is still uncertain but look at typical outcomes ….

[Figure: dot plot of simulated outcomes; horizontal axis "return", with ticks at 105, 120, 135, 150, 165, 180]

One-Year Returns to a $100 investment

Page 30: Exploring Randomness: Delusions and Opportunities

Looking at Profit only

Avg Profit approx 38%
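The computer coin-tossing behind these results can be sketched as follows. The four equally likely outcomes per dollar come from the risky-company slide. The number of simulated portfolios (2000) and the seed are arbitrary choices, not taken from the slides.

```python
import random
import statistics

OUTCOMES = [0.00, 0.50, 1.00, 4.00]  # value after one year of $1, each 25% likely

def portfolio_value(n_companies=100, rng=random):
    """Total value after one year of $1 placed in each independent company."""
    return sum(rng.choice(OUTCOMES) for _ in range(n_companies))

rng = random.Random(2007)  # arbitrary seed, for repeatability only
values = [portfolio_value(100, rng) for _ in range(2000)]
print("mean value:", round(statistics.mean(values), 1))
print("std dev:", round(statistics.stdev(values), 1))
print("share of portfolios below $100:", sum(v < 100 for v in values) / len(values))
```

The expected value per dollar is (0 + 0.5 + 1 + 4) / 4 = 1.375, so the simulated mean should sit near $137.50, in line with the average profit of about 38%.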

Page 31: Exploring Randomness: Delusions and Opportunities

Gamblers like Averages and Sums!

• The sum of 100 independent investments in risky companies is very predictable!

• Sums (and averages) are more stable than the things summed (or averaged).

• Square root law for variability of averages

VAR -----> VAR/n
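The law can be written out as a short derivation, assuming the n outcomes X_1, ..., X_n are independent with a common variance. The numbers in the second and third lines use the four equally likely returns per dollar (0, 0.5, 1, 4) from the risky-company example.

```latex
\begin{align*}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n},
  \qquad
  \operatorname{SD}(\bar{X}) = \frac{\sigma}{\sqrt{n}} \\[4pt]
\mu &= \tfrac{1}{4}\,(0 + 0.5 + 1 + 4) = 1.375,
  \qquad
  \sigma^2 = \tfrac{1}{4}\,(0^2 + 0.5^2 + 1^2 + 4^2) - \mu^2 \approx 2.42 \\[4pt]
\operatorname{SD}(\bar{X}) &\approx \frac{1.56}{\sqrt{100}} \approx 0.16
  \quad (n = 100)
\end{align*}
```

So a single $1 investment has a standard deviation of about $1.56, while the average over 100 independent investments has a standard deviation of only about $0.16 per dollar.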

Page 32: Exploring Randomness: Delusions and Opportunities

Example 4 - Industrial Quality Control

• Filling Cereal Boxes, Oil Containers, Jam Jars

• Labeled amount should be minimum
• Save money if also maximum
• Variability reduction contributes to profit

• Method: Management by exception …>

Page 33: Exploring Randomness: Delusions and Opportunities

Management by exception

QC = Quality Control

[Figure: control chart; a line marks the Nominal Amount]

Page 34: Exploring Randomness: Delusions and Opportunities

Japan a QC Innovator from 1950

• Consumer Reports
  – Best Maintenance History: Almost all Japanese Makes
  – Worst Maintenance History: American and European Makes

Key Technology was Variability Reduction

Usually via Control Charts

Page 35: Exploring Randomness: Delusions and Opportunities

Summary Example 4

• Surprising that Simple Control Chart could have such influence

• Control Chart is just an implementation of the idea of Management by Exception

Page 36: Exploring Randomness: Delusions and Opportunities

Example 5 - A Simple Law of Life

• Sometimes we see the same pattern in data from many different sources.

• Recognition of patterns aids description, and also helps to identify anomalies

Page 37: Exploring Randomness: Delusions and Opportunities

Example: Zipf’s Law

• An empirical finding

• Frequency * rank = constant

• Example - frequency (i.e. population) of cities
  Largest city is rank 1
  Second largest city is rank 2 ….

Page 38: Exploring Randomness: Delusions and Opportunities

Canadian City Populations

Page 39: Exploring Randomness: Delusions and Opportunities

Population * Rank = Constant? (Frequency * rank = constant)

Page 40: Exploring Randomness: Delusions and Opportunities

Other Applications of Zipf

• Word Frequency in Natural or Programming Language
• Volume of messages at Internet Sites
• Number of Employees of Companies
• Academic Publishing Productivity
• Enrolment of Universities
• ……

•Google “Zipf’s Law” for more in-depth discussion

Page 41: Exploring Randomness: Delusions and Opportunities

Summary for Zipf’s Law

• Surprising that processes involving many accidents of history and social chaos should result in a predictable relationship

• Useful to describe an empirical relationship that has meaning in very different settings - a convenient descriptive tool.

Page 42: Exploring Randomness: Delusions and Opportunities

Example 6 - Obtaining Confidential Information

• How can you ask an individual for data on
  • Incomes
  • Illegal Drug use
  • Sex modes
  • ….. Etc
  in a way that will get an honest response?

There is a need to protect confidentiality of answers.

Page 43: Exploring Randomness: Delusions and Opportunities

Example: Marijuana Usage

• Randomized Response Technique

Pose two Yes-No questions and have coin toss determine which is answered

Head: 1. Do you use Marijuana regularly?
Tail: 2. Is your coin toss outcome a tail?

Page 44: Exploring Randomness: Delusions and Opportunities

Randomized Response Technique

• Suppose 60 of 100 answer Yes. Then about 50 are saying they have a tail. So 10 of the other 50 are users. 20%.

• It is a way of using randomization to protect Privacy. Public Data banks have used this.
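The arithmetic in the first bullet generalizes to a small estimator. This is a minimal sketch that assumes a fair coin and truthful answers. The function and parameter names are illustrative.

```python
def estimate_user_rate(yes_count, n_respondents, p_tail=0.5):
    """Estimate the share who would answer Yes to the sensitive question.

    Respondents who toss a tail always answer Yes (question 2); those who
    toss a head answer the sensitive question (question 1).
    """
    expected_tails = n_respondents * p_tail          # automatic Yes answers
    expected_heads = n_respondents - expected_tails  # answered question 1
    return (yes_count - expected_tails) / expected_heads

print(estimate_user_rate(60, 100))  # (60 - 50) / 50
```

No individual answer reveals anything, because the questioner never learns which question a respondent answered. Only the total count of Yes answers is used.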

Page 45: Exploring Randomness: Delusions and Opportunities

Summary of Example 6

• Surprising that people can be induced to provide sensitive information in public

• The key technique is to make use of the predictability of certain empirical probabilities.

Page 46: Exploring Randomness: Delusions and Opportunities

Example 7 - Survival Assessment

• Personal Data is always hard to get.

• Need to make careful use of minimal data

• Here is an example ….

Page 47: Exploring Randomness: Delusions and Opportunities

Traffic Accidents

• Accident-Free Survival Time- can you get it from ….

• Have you had an accident?
• How many months have you had your driver's license?

Page 48: Exploring Randomness: Delusions and Opportunities

Accident Free Survival Time

[Figure: Probability that Accident Already Occurred as a function of Time Since License Obtained. Horizontal axis: Time Since License, 0 to 100; vertical axis: probability, 0.00 to 1.00]

Page 49: Exploring Randomness: Delusions and Opportunities

Accident Next Month

Can show that, for my 2002 class of 100 students, chance of accident next month was about 1%.

Page 50: Exploring Randomness: Delusions and Opportunities

Summary of Example 7

•Surprising that such minimal information is useful

•Again, key technique is to use empirical probabilities and smoothing

Page 51: Exploring Randomness: Delusions and Opportunities

Example 8 - Lotteries: Expectation and Hope

• Cash flow
  – Ticket proceeds in (100%)
  – Prize money out (50%)
  – Good causes (35%)
  – Administration and Sales (15%)

• $1.00 ticket worth 50 cents, on average
• Typical lottery P(jackpot) = .00000007

Page 52: Exploring Randomness: Delusions and Opportunities

How small is .00000007?

• Buy 10 tickets every week for 60 years

• Cost is $31,200.

• Chance of winning jackpot is = ….

1/5 of 1 percent!
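The calculation can be checked directly. The slide's answer of 1/5 of 1 percent corresponds to a jackpot probability of about 7 in 100 million, which is roughly the chance of matching 6 numbers drawn from 49. The 6/49 format is an assumption here, since the slide does not name the lottery, and tickets are treated as independent.

```python
from math import comb

def chance_of_jackpot(p_jackpot, n_tickets):
    """Probability of at least one jackpot in n independent tickets."""
    return 1 - (1 - p_jackpot) ** n_tickets

tickets = 10 * 52 * 60     # 10 tickets a week for 60 years
cost = tickets * 1.00      # $1.00 per ticket
p = 1 / comb(49, 6)        # assumed 6/49 lottery: about 7 in 100 million
print("tickets:", tickets)
print("cost: $", cost)
print("chance of a jackpot (percent):", round(100 * chance_of_jackpot(p, tickets), 2))
```

For probabilities this small, the result is almost the same as simply multiplying the number of tickets by the chance per ticket.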

Page 53: Exploring Randomness: Delusions and Opportunities

Summary

•Surprising that lottery tickets provide so little hope!

•Key technology is simple use of probabilities

Page 54: Exploring Randomness: Delusions and Opportunities

Example 9 - Peer Review: Is it fair?

Analysis via simulation - assumptions are:

• Average referees accept 20% of average quality papers

• Referees vary in accepting 10%-50% of average papers

• Two referees accepting a paper -> publish.

• Two referees disagreeing -> third ref

• Two referees rejecting -> do not publish

Page 55: Exploring Randomness: Delusions and Opportunities

6

13

6

Ultimately published:

6 + .20*13 (approx) = 9 papers out of 100

16 others just as good!
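The two-referee rule can be explored with a short simulation. This is a sketch under stated assumptions, not the model behind the slide's numbers. The slide gives only a 10%-50% range for referee acceptance rates, so the uniform distribution used below is a guess, and the names `paper_published` and `share_published` are illustrative.

```python
import random

def paper_published(draw_rate, rng):
    """Apply the two-referee rule, with a third referee settling disagreements."""
    first = rng.random() < draw_rate(rng)
    second = rng.random() < draw_rate(rng)
    if first and second:
        return True
    if not first and not second:
        return False
    return rng.random() < draw_rate(rng)  # referees disagree: third ref decides

def share_published(n_papers, draw_rate, seed=None):
    """Fraction of equally good papers that end up published."""
    rng = random.Random(seed)
    published = sum(paper_published(draw_rate, rng) for _ in range(n_papers))
    return published / n_papers

# Assumption: each referee's acceptance rate is uniform between 10% and 50%.
varied = lambda rng: rng.uniform(0.10, 0.50)
print(share_published(10_000, varied, seed=1))
```

All simulated papers have the same quality, so which ones are published is decided entirely by the luck of the referee draw.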

Page 56: Exploring Randomness: Delusions and Opportunities

Peer Review Fair?

• Does select good papers but

• Many equally good papers are rejected

• Similar property of school admission systems, competition review boards, etc.

Page 57: Exploring Randomness: Delusions and Opportunities

Summary of Example 9

•Surprising that peer review is so dependent on chance

•Key procedure is to use simulation to explore effect of randomness

Page 58: Exploring Randomness: Delusions and Opportunities

Example 10 - Investment: Back-the-winner fallacy

• Mutual Funds - a way of diversifying a small investment

• Which mutual fund?
• Look at past performance?
• Experience from symmetric random walk …

Page 59: Exploring Randomness: Delusions and Opportunities

Trends that do not persist

Page 60: Exploring Randomness: Delusions and Opportunities

Implication from Random Walk …?

• Stock market trends may not persist

• Past might not be a good guide to future

• Some fund managers better than others?

• A small difference can result in a big difference over a long time …

Page 61: Exploring Randomness: Delusions and Opportunities

A simulation experiment to determine the value of past performance data

• Simulate good and bad managers

• Pick the best ones based on 5 years data

• Simulate a future 5-yrs for these select managers

Page 62: Exploring Randomness: Delusions and Opportunities

How to describe good and bad fund managers?

• Use TSX Index over past 50 years as a guide ---> annualized return is 10%

• Use a random walk with a slight upward trend to model each manager.

• Daily change positive with probability p

Good manager: ROR = 13% pa, p = .56

Medium manager: ROR = 10% pa, p = .55

Poor manager: ROR = 8% pa, p = .54

Page 63: Exploring Randomness: Delusions and Opportunities
Page 64: Exploring Randomness: Delusions and Opportunities

Simulation to test “Back the Winner”

• 100 managers assigned various p parameters in .54 to .56 range

• Simulate for 5 years

• Pick the top-performing managers (top 15%)

• Use the same 100 p-parameters to simulate a new 5 year experience

• Compare new outcome for “top” and “bottom” managers
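The steps above can be sketched as a small simulation. The slides give the range of p and the 5-year horizon but not the size of a daily move or the number of trading days. The values below (250 days a year, moves of 0.4% up or down) are assumptions chosen because they give annual returns of roughly 8% to 13% for p between .54 and .56. Comparing the top 15% with all remaining managers is also an assumption about what "bottom" means.

```python
import random

DAYS_PER_YEAR = 250   # assumption: trading days per year
STEP = 0.004          # assumption: each day the fund moves up or down by 0.4%

def five_year_value(p_up, rng, years=5, start=100.0):
    """Value of a fund following a random walk with upward bias p_up."""
    value = start
    for _ in range(years * DAYS_PER_YEAR):
        value *= (1 + STEP) if rng.random() < p_up else (1 - STEP)
    return value

def back_the_winner(n_managers=100, top_share=0.15, seed=None):
    """Compare past and future results for the top group and the rest."""
    rng = random.Random(seed)
    skills = [rng.uniform(0.54, 0.56) for _ in range(n_managers)]
    past = [five_year_value(p, rng) for p in skills]
    future = [five_year_value(p, rng) for p in skills]  # same p, new luck
    ranked = sorted(range(n_managers), key=lambda i: past[i], reverse=True)
    n_top = round(n_managers * top_share)
    top, rest = ranked[:n_top], ranked[n_top:]
    mean = lambda idx, xs: sum(xs[i] for i in idx) / len(idx)
    return {
        "top_past": mean(top, past), "top_future": mean(top, future),
        "rest_past": mean(rest, past), "rest_future": mean(rest, future),
    }

for name, value in back_the_winner(seed=1).items():
    print(name, round(value, 1))
```

Running it with several seeds shows how much of the top group's past advantage carries over to the next five years.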

Page 65: Exploring Randomness: Delusions and Opportunities

START=100

Page 66: Exploring Randomness: Delusions and Opportunities

Mutual Fund Advice?

Don’t expect past relative performance to be a good indicator of future relative performance.

Again - need to give due allowance for randomness (i.e. LUCK)

Page 67: Exploring Randomness: Delusions and Opportunities

Summary of Example 10

• Surprising that Past Performance is such a poor indicator of Future Performance

• Simulation is the key to exploring this issue

Page 68: Exploring Randomness: Delusions and Opportunities

Ten Surprising Findings

1. Sports Leagues - Lack of Quality Differentials
2. Gasoline Mileage - Seasonal Patterns
3. Stock Market - Risky Stocks a Good Investment
4. Industrial QC - Variability Reduction Pays
5. Civilization - City Growth follows Zipf’s Law
6. Marijuana - Show of Hands shows 20% are regular users
7. Traffic Accidents - Simple class survey predicts 1% chance of accident in next month
8. Lotteries offer little hope
9. Peer Review is often unfair in judging submissions
10. Past Performance of Mutual Funds a poor indicator of future performance.

Page 69: Exploring Randomness: Delusions and Opportunities

Ten Useful Concepts & Techniques?

1. Sports Leagues - Unexplained variation can cause illusions - simulation can inform

2. Gasoline Mileage - Averaging (and smoothing) amplifies signals

3. Stock Market - Averaging tames unexplained variation - diversification a key to reduce risk

4. Industrial QC - Management by Exception, Continuous Incremental Improvement

5. Population of Cities - Order can emerge from chaos

Page 70: Exploring Randomness: Delusions and Opportunities

Ideas 6-10

6. Marijuana - Randomness can protect privacy and preserve anonymity

7. Traffic Accidents - simple survey data can predict future risk, using probabilities

8. Lotteries - Not a reasonable “investment”

9. Peer Review - Role of “luck” underestimated

10. Mutual Funds - Role of “luck” underestimated!

Page 71: Exploring Randomness: Delusions and Opportunities

Role of Math?

• Key background for
  – Graphs
  – Probabilities
  – Simulation models
  – Smoothing Methods

• Important for constructing theory of inference

Page 72: Exploring Randomness: Delusions and Opportunities

The End


Questions, Comments, Criticisms…..