probability


Upload: shadi

Post on 23-Feb-2016

Page 1: probability

PROBABILITY

Page 2: probability

Probability

Probability is a measure of how likely something is to happen. If you flip a fair coin, the probability of the coin landing heads is 50%, meaning that on average you expect it to land heads about 50 times out of every 100 flips. If you roll a fair die, the probability of the die landing on 4 is 1/6 (because a die has 6 faces), meaning that on average, you would roll a 4 once every 6 rolls.

Page 3: probability

Conditional Probability

Sometimes the probability of an event is increased or decreased by other events. The probability that there is no final for this class is very low. But IF I die, the probability is much higher. And IF Lingnan closes, the probability is much much higher. We say that the probability of P IF Q is the probability of P conditional on Q.

Page 4: probability

Representing Probabilities

We can represent probabilities using the symbol P(-). [This is a little confusing, because before we were using P to represent sentences.] For example:

P(H) = 50%

This might mean “the probability of the coin landing heads is 50%.”

Page 5: probability

So, for example, we learned earlier that the probability of A happening is always greater than or equal to the probability of A and B both happening. We can represent this truth as follows:

P(A) ≥ P(A & B)

Page 6: probability

Conditional Probabilities

We also have a way of representing conditional probabilities: P(A/B) means “the probability of A conditional on B” or “the probability that A will happen IF B happens.”

P(~F/C) > P(~F)

The probability that there will be no final conditional on Lingnan closing is greater than the probability that there will be no final.
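
One way to make the slash notation concrete is to count outcomes. The sketch below is only an illustration, not part of the slides: it assumes a fair six-sided die, and the events `four` and `even` and the helpers `prob` and `cond_prob` are made-up names. It uses the standard definition P(A/B) = P(A & B) divided by P(B).

```python
from fractions import Fraction

# Assumption: a fair six-sided die, every face equally likely.
OUTCOMES = range(1, 7)

def prob(event):
    """P(event): fraction of equally likely outcomes where event holds."""
    hits = sum(1 for o in OUTCOMES if event(o))
    return Fraction(hits, len(OUTCOMES))

def cond_prob(a, b):
    """P(a/b) = P(a & b) / P(b), defined only when P(b) > 0."""
    return prob(lambda o: a(o) and b(o)) / prob(b)

def four(o):
    return o == 4

def even(o):
    return o % 2 == 0

print(prob(four))             # 1/6
print(cond_prob(four, even))  # 1/3
```

Learning that the roll is even raises the probability of a 4 from 1/6 to 1/3, in the same way that Lingnan closing raises the probability of there being no final.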

Page 7: probability

Review: Which of the following two statements is true?

1. P(Fido is an animal/ Fido is a dog) = 100%.

2. P(Fido is a dog/ Fido is an animal) = 100%.

Page 8: probability

Experiments

Page 9: probability

Scientific Method

Science proceeds by the hypothetico-deductive method, which consists of four steps:

1. Formulate a hypothesis
2. Generate testable predictions
3. Gather data
4. Check predictions against observations

Page 10: probability

Experiments

Today we’re going to talk about experiments and good experimental design.

How do we design experiments that can test our hypotheses? Experiments that can generate data that are relevant to our predictions?

Page 11: probability

Causation

Much of science is concerned with discovering the causal structure of the world.

We want to understand what causes what so we can predict, explain, and control the events around us.

Page 12: probability

Prediction

For example, if we know that rain is caused by cool, dry air meeting warm, wet air then we can predict when and where it will rain, by tracking air currents, temperature, and moisture.

Page 13: probability

Prediction

This is important because rain affects our ability to engage in everyday activities, like traveling or exercising.

Knowledge of causation lets us make predictions, which helps us make plans.

Page 14: probability

Explanation

One way to explain something is to determine what causes it.

For example, if you find out that a certain virus causes a disease among bears, then you have explained why the animals are getting sick.

Page 15: probability

Explanation

This is important because once you know an explanation for a disease (what causes it), you can begin treating it, for example with antiviral drugs.

Page 16: probability

Control

Finally, if we know what causes some effect, then we can control nature to our advantage.

For example, if you don’t know what causes diamonds, you have to look through mines to find some.

Page 17: probability

Control

But when we know that diamonds are caused by carbon under high pressure, high temperature conditions, we can simply re-create those conditions to grow as many diamonds as we want.

Page 18: probability

CAUSATION VS. CORRELATION

Page 19: probability

Independence

In statistics, we say that two variables are independent when the value of one variable is completely unrelated to the other:

P(A/ B) = P(A) and P(B/ A) = P(B)

B happening does not make A any more likely to happen. (If that’s true, so is the reverse.)
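
The two equalities can be checked by brute-force counting. This is a minimal sketch under an assumed setup that is not in the slides: one fair coin and one fair die, with A = "the coin lands heads" and B = "the die shows 4". The names `prob`, `heads`, and `rolled_four` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: one fair coin and one fair die, so all 12 pairs are equally likely.
OUTCOMES = list(product("HT", range(1, 7)))

def prob(event, given=lambda o: True):
    """P(event / given), computed by counting equally likely outcomes."""
    pool = [o for o in OUTCOMES if given(o)]
    return Fraction(sum(1 for o in pool if event(o)), len(pool))

def heads(o):
    return o[0] == "H"

def rolled_four(o):
    return o[1] == 4

print(prob(heads), prob(heads, given=rolled_four))        # 1/2 1/2
print(prob(rolled_four), prob(rolled_four, given=heads))  # 1/6 1/6
```

Conditioning on the other event leaves each probability unchanged, which is exactly what independence says.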

Page 20: probability

Example

For example, recall one of our non-random sequences of coin flips:

XOXXOXOXOOXXOXOOXOXO

How did we know that this sequence was non-random? Because whether the coin lands X or O is not independent of the other tosses.

Page 21: probability

Example

For example, recall one of our non-random sequences of coin flips:

XOXXOXOXOOXXOXOOXOXO

P(X/ O) = 7/9, P(X) = 10/20
P(O/ X) = 8/10, P(O) = 10/20
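
These figures can be reproduced by counting. The sketch below assumes that P(X/ O) here means "the frequency of X among tosses that immediately follow an O" (this reading matches the 7/9, since the final O has no toss after it). The helper names `freq` and `freq_after` are illustrative.

```python
from fractions import Fraction

SEQ = "XOXXOXOXOOXXOXOOXOXO"

def freq(symbol):
    """Overall frequency of symbol in the sequence."""
    return Fraction(SEQ.count(symbol), len(SEQ))

def freq_after(symbol, previous):
    """Frequency of symbol among tosses that immediately follow previous."""
    followers = [b for a, b in zip(SEQ, SEQ[1:]) if a == previous]
    return Fraction(followers.count(symbol), len(followers))

print(freq("X"), freq_after("X", "O"))  # 1/2 7/9
print(freq("O"), freq_after("O", "X"))  # 1/2 4/5
```

`Fraction` prints 8/10 in lowest terms as 4/5. In both cases the conditional frequency is well above the overall 1/2, so each toss is not independent of the one before it.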

Page 22: probability

Correlation

Two variables A, B that are not independent are said to be correlated.

A and B are positively correlated when P(A/ B) > P(A). If B happens, A is more likely to happen.

A and B are negatively correlated when P(A/ B) < P(A). If B happens, A is less likely to happen.

Page 23: probability

Correlation

Other relationships between variables are often called correlation as well.

A and B are positively correlated when increases in A correspond to increases in B.

A and B are negatively correlated when increases in A correspond to decreases in B.

Page 24: probability

Positive Correlation Example

For example, demand and price are positively correlated.

If demand increases for a certain product, then the price of that product increases. If demand decreases, price decreases.

Page 25: probability

$250,000 for 1 Rhino Horn

A greatly increased demand for rhino horn in traditional Chinese medicine has led to a tremendous price increase for the horns. They are worth so much now that all 5 species of rhino are close to extinction.

Page 26: probability

Negative Correlation Example

On the other hand, supply and price are negatively correlated.

If supply increases for a certain product, then the price of that product decreases. If supply decreases, price increases.

Page 27: probability

Pork Prices Predicted to Soar

So recently, higher corn prices have made pig-farming less profitable, leading to a decreased supply of pigs.

Experts are predicting that there will be an increase in pork prices next year.

Page 28: probability

Causation and Correlation

One thing that can lead two variables A and B to be correlated is when A causes B.

For example, if having a cold causes a runny nose, then having a cold is correlated with having a runny nose:

P(cold/ runny nose) > P(cold)

Page 29: probability

Causation and Correlation

Similarly, the number of cars on the road is correlated with the number of accidents: if there is an increase in the number of people driving, there will be an increase in the number of car accidents.

This is because a larger number of cars causes a larger number of accidents.

Page 30: probability

Causation ≠ Correlation

But correlation does not imply causation. If A and B are correlated, there are several possibilities:

• A causes B
• B causes A
• C causes A and C causes B
• A and B are only accidentally correlated

Page 31: probability

B causes A

Whenever there are lots of police at a location, the chance that there is a criminal there goes up.

So do police cause crime? No, exactly the opposite: crime causes the police to show up!

Page 32: probability

B causes A

Here’s a somewhat more realistic example. It has been observed that democracies tend to get in fewer wars than non-democratic countries.

A plausible inference would be that the negative correlation between democracy and war is due to the fact that democracy causes peace.

Page 33: probability

B causes A

But there’s another explanation, and some studies have suggested that it’s the right one.

Frequent wars cause a country to not be democratic. Countries that get in a lot of wars don’t have the stability that’s necessary for democracy to flourish.

Page 34: probability

Common Cause

Sometimes A and B are correlated, not because A causes B or B causes A, but instead because a third variable C, the common cause, causes both A and B.

Page 35: probability

Porn and Rape

A study of U.S. prison inmates found that prisoners who had been exposed to pornography earlier in life were less likely to be in prison for rape, compared with those exposed to porn later in life.

Page 36: probability

Porn and Rape

Does this mean that exposure to porn early in life prevents men from becoming rapists? Should you give your children porn?

No. Inmates who had been exposed to porn later were more likely to have had a religious fundamentalist upbringing.

Page 37: probability

Porn and Rape

And a religious fundamentalist upbringing was correlated with higher rates of sexual deviancy (and rape).

Fundamentalist upbringing caused both late exposure to porn and higher chances of sexual crimes.

Page 38: probability

Coincidence

Page 39: probability

The “Texas Sharp Shooter”

Suppose I stand in front of a barn. I have a machine gun with me, and I am blindfolded. I shoot wildly at the barn for several minutes.

Afterward, I walk up to the barn. I find a spot where three bullets are very close together, and I paint a target around them. “Look!” I say, “at what an excellent marksman I am!”

Page 40: probability

Rare Things are Frequent

Rare coincidences are bound to happen sometimes. How likely is it that someone will both win the lottery and get struck by lightning?

Well, suppose there is 1 lottery every week, so about 50 every year. In a span of 30 years, about 1,500 people will win the lottery (assuming one winner per draw).

Page 41: probability

Getting Struck by Lightning

There is a 1 in 1 million chance of getting struck by lightning in any given year. Let’s suppose each lottery winner on average lives 30 years after winning. That’s 30 distinct 1 in 1 million chances of getting struck, or roughly a 30 in 1 million chance of getting struck in 30 years.

P(struck) = 1 – P(not struck) = 1 – .999999^30 ≈ .00003

Page 42: probability

Winners Getting Struck by Lightning

So what’s the probability that any of the 1,500 winners will get struck?

P(some winner is struck) = 1 – P(no winner is struck) = 1 – .99997^1500 = 1 – .955997 = .044 = 4.4%.

That’s higher than the probability that a coin will land heads 5 times in a row.
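
The arithmetic can be checked in a few lines. This sketch uses only the figures assumed in the slides (a 1 in 1 million yearly chance, 30 years, 1,500 winners) and, like the slides, treats each year and each winner as independent. The variable names are illustrative.

```python
# Assumptions taken from the slides: a 1-in-1,000,000 yearly chance of being
# struck, 30 years of life after winning, and 1,500 winners.
P_STRUCK_PER_YEAR = 1 / 1_000_000
YEARS = 30
WINNERS = 1_500

# P(a given winner is struck at least once in 30 years)
p_one = 1 - (1 - P_STRUCK_PER_YEAR) ** YEARS

# P(at least one of the 1,500 winners is struck)
p_any = 1 - (1 - p_one) ** WINNERS

# P(a fair coin lands heads 5 times in a row)
p_five_heads = 0.5 ** 5

print(f"{p_one:.6f}")         # 0.000030
print(f"{p_any:.3f}")         # 0.044
print(f"{p_five_heads:.5f}")  # 0.03125
```

The code keeps the unrounded 30-year probability rather than .99997, which changes the result only beyond the third decimal place.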

Page 43: probability

Lucia de Berk

In 2006, Lucia de Berk, a nurse at a hospital in the Netherlands, was convicted of killing 7 children.

There was no evidence against her except for the fact that she was in the room during or before each of the deaths.

Page 44: probability

Correlation

Prosecutors reasoned that there was a correlation: Lucia de Berk in the room & death.

It couldn’t be that the deaths caused her to be in the room. It couldn’t be that some common cause C both caused her to be in the room and the deaths. So the only other option was that she caused the deaths.

Page 45: probability

Coincidence

But there was a fourth option: coincidence.

How many hospitals are there in all the world? How many nurses work at each of those hospitals? What are the chances that, just by accident, in one of those hospitals one of those nurses just happened to be present for 7 deaths?

Page 46: probability

Rare Things are Frequent

Richard Gill, Professor of Mathematical Statistics at the University of Leiden, worked hard to overturn the case. He estimated that the chance that this was an accident was 1 in 9.

This doesn’t prove that she’s innocent (or guilty). But things that have a 1 in 9 chance of happening happen all the time!

Page 47: probability

EXPERIMENTAL DESIGN

Page 48: probability

Types of Scientific Studies

There are two basic types of scientific studies (the stuff that gets published in scientific journals and reported in the “science” section of the newspaper):

• Observational studies
• Controlled experiments

Page 49: probability

Observational Studies

An observational study looks at data in order to determine whether two variables are correlated.

Page 50: probability

Observational Studies

For example, an observational study might ask women to record how much wine they drink, and also to report if they develop breast cancer. After many years, a correlation may be found between wine consumption and cancer.

Page 51: probability

Importantly, observational studies can only show whether two variables A and B are correlated. They cannot show whether A causes B, or B causes A, or some third cause causes both, or if the correlation is accidental.

Page 52: probability

Controlled Experiments

The first recorded controlled experiment occurs in the Book of Daniel, part of the Jewish Tanakh and the Christian Bible.

Page 53: probability

Daniel’s Experiment

Daniel wanted to discover which of two diets was better: a diet of meat and wine, or vegetables. So he proposed that some servants eat one diet and the rest eat the other. Then at the end of 10 days, they’d see who looked healthier.

Page 54: probability

Controlled Experiments

In a controlled experiment there are two groups who get separate treatments.

One group, the “control group”, gets the standard treatment. For example, all of the king’s servants ate meat and wine before Daniel suggested a different diet might be better.

Page 55: probability

Controlled Experiments

The other group, the “experimental group”, gets the treatment we plan to test.

If the experimental group has better results than the control group, we have good evidence that our new treatment should be adopted.

Page 56: probability

Why are They Better?

Observational studies only reveal correlations; they can’t reveal causation.

Controlled experiments are also only studies of correlation: correlation between the control group and outcomes, and correlation between the experimental group and outcomes.

Page 57: probability

Why are They Better?

But controlled experiments are better than observational studies. Why?

In observational studies, people are not randomly assigned to conditions. For example, an observational study might find a correlation between using a cane and dying within a year.

Page 58: probability

Canes

This is because old people are more likely to use a cane and more likely to die (than young people). If you randomly assigned young and old people to cane or no-cane conditions, the correlation would go away. Canes don’t cause death.

Page 59: probability

Confounding Variables

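A confounding variable is a variable that affects both of the variables you want to study, so it can produce a correlation between them even when neither causes the other.

For instance, if you want to study whether canes cause death, age is a confounding variable, because age influences both your chances of using a cane and your chances of death.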

Page 60: probability

Confounding Variables

A controlled experiment lets you “control for” confounding variables. You can make the control group and the experimental group have equal numbers of people from each age group.

Then you know that if more people in your experimental group die, it wasn’t due to their age (the other group had similar ages).
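
The cane example can be worked through with exact arithmetic. Every number below is hypothetical and chosen only for illustration (30% of people are old, old people die and use canes more often, and canes themselves have no effect on death). The names `death_rate`, `P_CANE_OBSERVED`, and `P_CANE_RANDOMIZED` are likewise made up. The sketch compares the death rate with and without a cane when people choose for themselves and when a coin flip assigns the cane.

```python
from fractions import Fraction as F

# Hypothetical numbers, chosen only for illustration.
P_OLD = F(3, 10)                                           # share of old people
P_DIE = {"old": F(10, 100), "young": F(1, 100)}            # death risk by age only
P_CANE_OBSERVED = {"old": F(50, 100), "young": F(2, 100)}  # who chooses a cane
P_CANE_RANDOMIZED = {"old": F(1, 2), "young": F(1, 2)}     # coin-flip assignment

P_AGE = {"old": P_OLD, "young": 1 - P_OLD}

def death_rate(p_cane, with_cane):
    """P(die / cane) or P(die / no cane), averaging over the age groups."""
    weight = {
        age: P_AGE[age] * (p_cane[age] if with_cane else 1 - p_cane[age])
        for age in P_AGE
    }
    total = sum(weight.values())
    return sum(weight[age] * P_DIE[age] for age in P_AGE) / total

for label, p_cane in [("observational", P_CANE_OBSERVED),
                      ("randomized", P_CANE_RANDOMIZED)]:
    with_cane = float(death_rate(p_cane, True))
    without_cane = float(death_rate(p_cane, False))
    print(label, round(with_cane, 4), round(without_cane, 4))
```

Under these assumptions, cane users die more often in the observational setting only because they are mostly old. Once assignment ignores age, both groups have the same age mix and the same death rate.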

Page 61: probability

Controlling

In an observational study, there is no way to rule out a common cause for two correlated variables A and B.

In an experimental study, the common cause is ruled out, because the experimenter is the one who causes (“controls”) whether people have A or not.

Page 62: probability

Controlling

In an observational study, there is no way to rule out B causing A rather than A causing B. Does wine reduce the risk of cancer, or does a lowered risk of cancer increase wine consumption?

If experimenters control who gets wine, then we can rule out the hypothesis that, in our study, lowered cancer risk causes wine drinking.

Page 63: probability

Next Time

We’ll talk more next time about other things that can bias an experiment and how to “control for” them.

Page 64: probability

What about Observational Studies?

Why do scientists still conduct observational studies, if controlled experiments are considered better evidence?

1. Moral reasons
2. Practical reasons

Page 65: probability

Moral Reasons

Sometimes performing a controlled experiment would be unethical.

For example, suppose we want to know whether vaccines cause autism (NOTE: they do not).

We cannot simply stop vaccinating people.

Page 66: probability

Moral Reasons

If you stopped vaccinating children, you’d effectively be killing lots of children (and adults).

Vaccines prevent lots and lots of otherwise deadly infectious diseases.

Page 67: probability

Moral Reasons

Thus you must conduct an observational study. Find people who (for whatever reason) chose not to vaccinate their children, and compare their rates of autism to those of the vaccinated children.

When you do this you find that vaccines do not cause autism. (No correlation, hence no causation.)

Page 68: probability

Practical Reasons

Some controlled experiments are also simply impractical.

Does being smart make you rich? Well, we can’t make a random group of people smart. That’s impossible. Does being rich make you smart? Well, we can’t give a random bunch of people a lot of money. We’re just poor scientists!