“Known knowns, known unknowns and unknown unknowns.” The Mathematics of Risk and Uncertainty

Upload: phungquynh

Post on 21-Mar-2018



Puzzles and Problems

Here are some of the problems we are going to look at today.

1. Two sons?:

Mr and Mrs Jones have two children; one of their children is a boy. What is the probability that they have two sons? (Assume that all children are either boys or girls, independently, with probability $\frac{1}{2}$.)

2. The Blue eyed-island puzzle

There is an island upon which a tribe resides. The tribe consists of 1000 people, with various eye colours. Yet, their religion forbids them to know their own eye color, or even to discuss the topic; thus, each resident can (and does) see the eye colors of all other residents, but has no way of discovering his or her own (there are no reflective surfaces). If a tribesperson does discover his or her own eye color, then their religion compels them to commit ritual suicide at noon the following day in the village square for all to witness. All the tribespeople are highly logical and devout, and they all know that each other is also highly logical and devout (and they all know that they all know that each other is highly logical and devout, and so forth).

Of the 1000 islanders, it turns out that 100 of them have blue eyes and 900 of them have brown eyes, although the islanders are not initially aware of these statistics (each of them can of course only see 999 of the 1000 tribespeople). One day, a blue-eyed foreigner visits the island and wins the complete trust of the tribe. One evening, he addresses the entire tribe to thank them for their hospitality. However, not knowing the customs, the foreigner makes the mistake of mentioning eye color in his address, remarking “how unusual it is to see another blue-eyed person like myself in this region of the world”.

What effect, if anything, does this faux pas have on the tribe?

3. Tossing coins:

I toss 20 fair coins. What is the probability of exactly 10 heads?

4. General Election

You visit a large country (population > 1 000 000) during its General Election. Before arriving you know

nothing about the country.

You find out that there are just two political parties, The Reds and The Blues. You chat to five local people, picked at random: three tell you they support The Reds and two support The Blues.

What is the probability that the Reds will win the election?

5. A Marathon Race

You are in a city on the day of its marathon. You are watching the race and see a runner with vest number 1023.

How many runners are taking part in the race?


Conditional Probability

The key result which we will use today is the formula for conditional probability, which tells us exactly how to revise our assessment of probabilities as we discover more information. The formula is easy to derive, but some of the results are not necessarily intuitive, and many people find that conditional probability questions require a great deal of thought.

The probability that A occurs given that we know that B has occurred is given by the formula:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

This result is closely associated with the clergyman Thomas Bayes; rearranged as $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$, it is known as Bayes’ Theorem.

The Medical Test

I think that the following example is perhaps the most important example of conditional probability to understand, and it illustrates why many people jump to wrong conclusions.

You live in a country where a rare and deadly disease affects one person in 10,000.

A test for the disease is accurate 99% of the time. You take the test and it comes back positive, i.e. saying that you have the disease. What is the probability that you do in fact have the disease?

Solution

Imagine a million people live in the country. On average 100 of them will have the disease, and 99 of these will test positive. (On average there will be a hundred ill people, but one will test negative in error.)

The population will also contain 999,900 well people, of whom 9,999 (1% of the well people) will test positive for the disease.

In total 10,098 will test positive, and the overwhelming majority of them (9,999) will be false positives. In fact the probability that you do not have the disease is $\frac{9999}{10098}$, or approximately 99%.

Most people in such a situation would be alarmed by a positive test, but their intuition is very misleading. The

accuracy of the test is less significant than the rarity of the disease.

Misapprehensions of this type (often called the prosecutor’s fallacy) have contributed to miscarriages of justice.
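The counting argument for the medical test can be checked with a few lines of Python. This is only a sketch: the function name `posterior_given_positive` is made up for illustration, and it assumes (as the example does) that "99% accurate" means both that an ill person tests positive 99% of the time and that a well person tests negative 99% of the time.

```python
from fractions import Fraction


def posterior_given_positive(prevalence, accuracy):
    """P(disease | positive test), using the conditional probability formula.

    Assumption: `accuracy` is both the chance that an ill person tests
    positive and the chance that a well person tests negative.
    """
    true_positive = prevalence * accuracy               # ill and tests positive
    false_positive = (1 - prevalence) * (1 - accuracy)  # well but tests positive
    return true_positive / (true_positive + false_positive)


p_ill = posterior_given_positive(Fraction(1, 10_000), Fraction(99, 100))
print(p_ill, float(p_ill))   # exact fraction, then its decimal value
print(float(1 - p_ill))      # probability that the positive result is a false alarm
```

Changing the prevalence to 1 in 100 while keeping the accuracy fixed shows how strongly the answer depends on the rarity of the disease.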


The “Two sons” puzzle.

“Mr and Mrs Jones have two children. You know that one of their children is a boy. What is the proba-

bility that they have two sons?”

Here are two solutions, alarmingly with different answers!

Solution 1: We know that the couple have two children, and there are four equally likely possibilities for this: BB, BG, GB, GG. Of the three possibilities containing a boy (BB, BG, GB), only one consists of two boys. So the answer is $\frac{1}{3}$.

Solution 2: If you consider the son you know about, then he has one sibling. This sibling is either a brother or a sister. So the probability is $\frac{1}{2}$.

The correct answer is “It depends on how we know that one of their children is a boy!”
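To see how the answer depends on how we know, here is a small sketch that enumerates two possible ways of learning the fact. Both "protocols" and the function names are illustrative assumptions, not something specified in the puzzle: in the first we are told only that there is at least one boy; in the second we meet one child, chosen at random, who turns out to be a boy.

```python
from fractions import Fraction
from itertools import product

# The four equally likely families: BB, BG, GB, GG (elder child first).
families = list(product("BG", repeat=2))


def two_sons_given_at_least_one_boy():
    # Protocol A: we are told only that the family has at least one boy.
    consistent = [f for f in families if "B" in f]
    return Fraction(sum(f == ("B", "B") for f in consistent), len(consistent))


def two_sons_given_observed_child_is_boy():
    # Protocol B: we meet one child, picked at random, and he is a boy.
    # All 8 (family, child met) pairs are equally likely.
    cases = [(f, i) for f in families for i in (0, 1)]
    consistent = [(f, i) for f, i in cases if f[i] == "B"]
    return Fraction(sum(f == ("B", "B") for f, _ in consistent), len(consistent))


print(two_sons_given_at_least_one_boy())       # matches Solution 1
print(two_sons_given_observed_child_is_boy())  # matches Solution 2
```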

What do we know? How do we know it?

The Monty Hall Problem is a famous counter-intuitive result, and shows that when we calculate conditional probabilities we need to look very closely at the information provided.

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car;

behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors,

opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is

it to your advantage to switch your choice?

The answer is that you should switch, and in so doing you will have a $\frac{2}{3}$ chance of winning the car. However, many people still refuse to believe this!

A very clear explanation is available here: http://www.youtube.com/watch?v=njqrSvGz8Ps
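For anyone who still refuses to believe it, the Monty Hall probabilities can be computed exactly by listing every case. The sketch below assumes the usual rules: the host always opens a door that hides a goat and is not the contestant's pick, choosing at random when two such doors are available. The function name `monty_hall` is illustrative.

```python
from fractions import Fraction


def monty_hall(switch):
    """Exact probability of winning the car, with or without switching."""
    doors = (0, 1, 2)
    win = Fraction(0)
    for car in doors:          # the car is equally likely to be behind each door
        for pick in doors:     # the contestant's first pick, independent of the car
            # Doors the host may open: not the pick, and hiding a goat.
            openable = [d for d in doors if d != pick and d != car]
            for opened in openable:
                weight = Fraction(1, 3) * Fraction(1, 3) * Fraction(1, len(openable))
                if switch:
                    # Move to the one door that is neither picked nor opened.
                    final = next(d for d in doors if d not in (pick, opened))
                else:
                    final = pick
                if final == car:
                    win += weight
    return win


print(monty_hall(switch=True))
print(monty_hall(switch=False))
```

If the host's behaviour is changed (for example, if he opens a door at random and merely happens to reveal a goat), the answer changes too, which is the point about looking closely at the information provided.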


The Blue-Eyed Island Problem

The reaction of most people when they first meet the blue-eyed island problem is that the traveller tells the tribe a fact that they already know. However, if you follow the details through, the blue-eyed members of the tribe all end up killing themselves on the 100th day.

A good explanation can be found here: http://terrytao.wordpress.com/2008/02/05/the-blue-eyed-islanders-puzzle/

Conclusion

All of the problems we have looked at so far (the Medical Test, the Monty Hall Problem, the Blue-Eyed Island Problem and the Two Sons Problem) show how our intuition can let us down. In particular, they show that we need to look very carefully not only at what we know, but also at exactly how we found it out.

Pascal’s Triangle

The mathematician Blaise Pascal found a neat way of finding probabilities associated with repeated random events (for example tossing many coins, or rolling a large number of dice), and many of you will have met his triangle, which sums up the number of ways of obtaining different outcomes:

A whole talk could be filled with the extraordinary properties of Pascal’s Triangle.
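As a reminder of how the triangle is built, here is a short sketch that generates its first few rows using the rule that each entry is the sum of the two entries above it. The function name `pascal_rows` is illustrative, and it assumes at least one row is requested.

```python
def pascal_rows(n_rows):
    """Return the first n_rows rows of Pascal's triangle (n_rows >= 1)."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # Each inner entry is the sum of the two entries directly above it.
        rows.append([1] + [a + b for a, b in zip(prev, prev[1:])] + [1])
    return rows


for row in pascal_rows(6):
    print(row)
```

Row n (counting from zero) lists the number of ways of getting 0, 1, ..., n heads from n coin tosses.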

We shall derive the formula (which you will meet again at A-Level) for the probability of a given number of successes from n trials.

The probability of k successes from n trials, each with probability of success p, is given by:

$$P(k \text{ successes}) = \binom{n}{k}\, p^k (1-p)^{n-k}$$

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$


This distribution is known as the Binomial Distribution.

Known Unknowns V Unknown Unknowns

We can now answer the coin tossing problem: I toss a fair coin 20 times, what is the probability I obtain 10 heads? We simply need to use Pascal’s Triangle: $\binom{20}{10} / 2^{20} \approx 18\%$.
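The binomial formula and the coin-tossing answer can be checked numerically. This is a minimal sketch using the standard library's `math.comb`; the function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


print(comb(20, 10))               # number of ways of choosing which 10 tosses are heads
print(binomial_pmf(10, 20, 0.5))  # probability of exactly 10 heads from 20 fair coins
```

Even though 10 heads is the single most likely outcome, it happens less than one time in five.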

The following question seems similar, but is more interesting and more difficult. A basketball player has made

20 shots and scored 10 baskets. What is the probability that his next shot is successful? This problem is much

more like real life. In the first case the probability is assumed to be known (the coin is fair) and we find the

probability of obtaining certain data. However, except for special objects like coins and dice, normally we

know the data, and have to try and find the probability.


Bayes’ Billiard Ball Problem

Thomas Bayes’ paper An Essay towards solving a Problem in the Doctrine of Chances was published only after his death, yet it is probably one of the most influential essays ever written and underpins much of contemporary probability and statistics. In it he looked at the exceptionally important problem of how to infer the probability of an event from observed results.

He imagined that a red ball is placed at random on a billiard table, and its position, p, from near cushion to far

cushion is measured on a scale from 0 to 1. If a second white ball is placed at random on the table then the

probability that it lies to the left of the original ball is p. An observer in the billiard room knows the value of p,

(simply by looking where the red ball is) and can simply use Pascal’s Binomial Distribution to work out all the

associated probabilities of future events.

However, imagine a person in the neighbouring room who only knows the running score from lots of trials (i.e. the number of times the white ball lies to the left or to the right). This observer needs to change their assessment of p based on what they have observed. Bayes gave a precise formula for how this should be done.

He derived the formula:

$$f(p \mid \text{Data}) \propto P(\text{Data} \mid p)\, f(p)$$

where $f(p)$ represents our assessment of the probability density of p before we do the experiment (this is known as the prior distribution), and $f(p \mid \text{Data})$ is our assessment of the probability density after we have done the experiment. This is known as the posterior distribution.

The Election Problem

Suppose that before the election we assume that the proportion of red voters is uniformly distributed between 0 and 1 (i.e. all election results are in some sense equally likely). According to Bayes’ result, the probability that a person votes red then has a density function proportional to $p^3(1-p)^2$ (three successes and two failures), and to find the probability that more than half vote red we need to find the area under the graph to the right of $p = \frac{1}{2}$, as a fraction of the total area.

[Figure: probability density (vertical axis, 0.0 to 2.0) against the probability that a person votes red (horizontal axis, 0.0 to 1.0).]

So the probability that more than half the electorate vote red is the area of the shaded region, which is 65.63%.
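The area calculation for the election problem can be done exactly, because the density is a polynomial. The sketch below assumes the uniform prior described above; the function name `prob_majority_red` and its helper `area` are illustrative.

```python
from fractions import Fraction
from math import comb


def prob_majority_red(reds, blues):
    """P(p > 1/2) when the posterior density is proportional to
    p**reds * (1 - p)**blues (uniform prior, updated on the sample)."""

    def area(lo, hi):
        # Expand (1 - p)**blues with the binomial theorem,
        # then integrate p**(reds + j) term by term from lo to hi.
        total = Fraction(0)
        for j in range(blues + 1):
            power = reds + j + 1
            coeff = Fraction((-1) ** j * comb(blues, j), power)
            total += coeff * (Fraction(hi) ** power - Fraction(lo) ** power)
        return total

    # Area to the right of 1/2, as a fraction of the total area.
    return area(Fraction(1, 2), 1) / area(0, 1)


answer = prob_majority_red(3, 2)
print(answer, float(answer))
```

Dividing by the total area is what turns "proportional to" into a proper probability.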


The German Tank Problem

One of the triumphs of Bayesian Statistics occurred during the Second World War, when Allied mathematicians were able to get exceptionally accurate estimates of German tank production from the serial numbers of destroyed and captured vehicles. The records of the Speer Ministry, which was in charge of Germany’s war production, were recovered after the war. The table below gives the actual tank production for three different months, the estimate by statisticians from serial number analysis, and the number obtained by traditional American/British “intelligence” gathering.

Bayesian Analysis is central to modern anti-terrorist measures, and is one of the most active areas of research,

especially since the explosion in the amount of available data makes it harder to tell the signal from the noise.

The Marathon Problem

There is no single correct answer, but I would be interested to hear some of your suggestions!
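One possible suggestion, offered only as a sketch: a classical (non-Bayesian) estimator often quoted for the German tank problem, which is not given in this talk, takes the largest serial number seen, m, from a sample of k numbers and estimates the total as m + m/k - 1. It assumes that vest numbers run consecutively from 1 to N and that the runner you happened to see is equally likely to be any of them; both are assumptions, and the function name `estimate_total` is illustrative.

```python
def estimate_total(observed_numbers):
    """Estimate N from serial numbers assumed to be drawn at random,
    without replacement, from 1..N: largest seen, plus the average gap."""
    k = len(observed_numbers)   # sample size
    m = max(observed_numbers)   # largest number observed
    return m + m / k - 1


print(estimate_total([1023]))   # a single vest number from the marathon
```

A Bayesian answer would also depend on what you believed about the size of marathons before seeing the runner, which is one reason there is no single correct answer.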


Conditional Probability

1. I think of a number between one and ten. a) What is the probability that the number is prime? b) If

the number is prime, what is the probability the number is even?

2. A hand dealt from a standard deck of cards consisting of one ace and one 10, Jack, Queen or King is known as “Blackjack”.

a) If you pick two cards from a randomly shuffled pack, what is the probability of getting Blackjack?

b) Alice is dealt two cards from the pack, and her first card is an Ace. What is the probability that she gets Blackjack (i.e. that her second card is a 10, Jack, Queen or King)?

c) Bob is also dealt two cards from the pack, and his first card is a 10. What is the probability that he gets Blackjack (i.e. that his second card is an Ace)?

d) Chris gets dealt Blackjack. What is the probability that his first card was an ace? Does this contradict parts b) and c)?

3. The following table shows the admissions data for a (fictional) university.

                Science               Humanities             Overall
          Applied  Successful    Applied  Successful    Applied  Successful
Male         1200         600        800         160       2000         760
Female        800         420       1200         300       2000         720
Total        2000        1020       2000         460       4000        1480

a) Which is greater: i) The probability that a male scientist will be accepted. ii) The probability that a female

scientist will be accepted?

b) Which is greater: i) The probability that a male humanities student will be accepted. ii) The probability that

a female humanities student will be accepted?

c) Which is greater: i) The probability that a male student will be accepted. ii) The probability that a female

student will be accepted?

d) What is surprising about these results? Is there any suggestion that the University is biased in favour of male or of female candidates?

4. Albert is taller than Bob, what is the probability Chris is taller than Albert?

5. You are on a magic island where inhabitants always answer all questions (independently) truthfully with probability $\frac{2}{3}$ and lie with probability $\frac{1}{3}$. You have the following conversation:

You: What is your name?

Islander: Bob.

You: Is that true?

Islander: Yes.

a) What is the probability the islander you are talking to is called Bob?

You talk to a different islander and have the following conversation:

You: What is your name?

Islander: Andrew.

You: What is your name?

Islander: Andrew.

b) What is the probability the islander is called Andrew? Is your answer the same as part a?


6. The probability of winning the lottery is one in fourteen million. The probability that a local newspaper makes a typo in the lottery results is one in one hundred thousand. One day the local paper publishes your numbers as the winning lottery numbers. What is the probability that you have in fact won the lottery?

7. Joe has red hair. What is the probability that his mother has red hair?

N.B. This is a very hard problem!! What factors are you going to take into account?

Counting Problems and Pascal’s Triangle

1. How many different ways are there of ordering the letters in the word: i) CAT. ii) BOAT. iii)

APPLE. iv) MATHEMATICS.

2. A chess competition consists of 20 players. Each player plays a single match against each of the other 19 participants. Explain why the total number of matches is given by $\frac{20 \times 19}{2}$.

3. In the card game Brag players are dealt three cards from the pack. Why is the total number of hands not $52 \times 51 \times 50$? How many hands are there?

4. Andrew, Barry, Chris, David and Edward have three tickets for a concert.

a) How many different ways are there of choosing which three people go to the gig? Repeat this question if they have two tickets. Can you explain this result?

b) Andrew, Barry, Chris, David and Edward have three tickets for a concert. One is a VIP Ticket, one is a

Luxury Ticket, and one is a Standard Ticket. How many ways are there of allocating these tickets to three

different people? Why is the number bigger than in part a)?

5. Andrew invites 6 people to a party. How many ways are there in which:

a) Nobody turns up? b) One person turns up? c) Two people turn up? d) Three people turn up? e) Four people

turn up? f) Five people turn up? g) Everyone turns up?

h) How many possible parties are there in total? Can you explain a different way of getting this number?

i) Can you generalize this to a result about Pascal’s Triangle?

6. a) How many ways are there of traveling from the bottom left hand square of an $8 \times 8$ chessboard to the top right hand square if all moves are either up or right?

b) Can you relate this to your solution of question 1?

c) How does this question relate to Pascal’s Triangle?

7. In the National Lottery six lucky numbers are chosen from a bag containing 49 numbers. What is the

probability of having a ticket which contains the six lucky numbers?

8. An ice cream shop has three flavours of ice cream: Chocolate, Vanilla and Strawberry. It sells ice-creams of three scoops, which do not necessarily need to be different flavours. (So you could order chocolate-chocolate-chocolate if you wanted to.) How many different ice-creams are possible a) if order matters? b) if order does not matter?

9. What is the sum of the first twelve triangle numbers? Where is this number on Pascal’s Triangle?

[N.B. This number is the same as the total number of gifts in the song The Twelve Days of

Christmas]

10. An artist paints cubes with three colours: red, green and blue. Two colourings are the same if they

can be rotated on to one another. How many different colourings of the cube are possible?


Pascal’s Triangle and Probability

1. I toss 3 fair coins. By drawing a probability tree, find the probability of each different number of heads.

2. Why could you not use this method to find the distribution of the number of heads from 100 throws?

3. The graph below shows the probability of different numbers of heads from 100 throws.

[Figure: probability (vertical axis, 0.00 to 0.08) against number of heads (horizontal axis, 20 to 80).]

a) How were these probabilities calculated?

b) Describe the shape of the distribution.

c) *Can you find a formula for the shape of the curve?

4. A bag contains three red balls and five green balls. I pick three balls at random. What is the

probability they are all red? a) If I replace the balls. b) If I do not replace the balls? c) Which of

these numbers is bigger and why?

Bayes’ Theorem

1. A bag contains ten coins: nine are fair and one has a head on both sides. A single coin is picked from the bag.

a) If the coin is tossed and lands heads, what is the probability that the coin you picked was the biased coin?

b) If the coin is tossed twice and lands heads both times, what is the probability that the coin is biased?

c) If the coin is tossed one hundred times and lands heads each time, what is the probability that the coin is biased?

d) Is it possible ever to know for sure from the results of the coin tosses whether the coin you have selected is biased? When would you know that the coin is fair?

2. A rare disease affects 1 % of the population. A test for the disease is accurate 99 % of the time. You

take the test and it says that you have the disease. a) What is the probability that you do in fact have

the disease?

b) You take the test again, and it comes back positive for a second time. What is the probability that you have

the disease? What assumptions does your answer depend on?


3. There are ten people in a railway carriage. Andrew is the third tallest person. A new person enters the carriage. Show that the probability that they are taller than Andrew is $\frac{3}{11}$. [Hint: If the people are ranked in height order, consider where the new person can slot in.]

4. A boxer has fought ten fights. She has won seven and lost two. What is the probability that she wins

her next fight? [Hint: how is this question like question 3?]

5. A coin is tossed ten times, and lands heads seven times and tails twice. What is the probability that

the next toss is a heads? [Is this question also like 3 and 4. If not, why not?]

6. In a race against three (randomly chosen) athletes, James is the second fastest runner. James then competes against three different (randomly chosen) athletes.

a) What is the probability he wins the second race?

b) What is the probability he comes second in the second race?

c) What is the probability that he comes third in the second race?

d) Can you generalize? What is the probability that a runner finishing in jth place in a race of m runners,

finishes in kth place in a race of n runners?

7. A coin is tossed 10 times and lands heads 10 times. What is the probability the next toss is a head? (Consider your answers to questions 1, 3 and 5. N.B. There is no single correct answer here, but think what you need to take into account.)

8. The Marathon Problem: You see a runner in a race with vest number 1023. How many people are

running in the race?
