statistics and probability
Post on 19-Jun-2015
What is Data?
Data is a collection of facts, such as values or measurements.
It can be numbers, words, measurements, observations or even just
descriptions of things.
Qualitative vs Quantitative
Data can be qualitative or quantitative.
Qualitative data is descriptive information (it describes something).
Quantitative data is numerical information (numbers).
And Quantitative data can also be Discrete or Continuous:
Discrete data can only take certain values (like whole numbers)
Continuous data can take any value (within a range)
Put simply: Discrete data is counted, Continuous data is measured
Example: What do we know about Arrow the Dog?
Qualitative:
He is brown and black
He has long hair
He has lots of energy
Quantitative:
Discrete:
He has 4 legs
He has 2 brothers
Continuous:
He weighs 25.5 kg
He is 565 mm tall
To help you remember think "Quantitative is about Quantity"
Collecting
Data can be collected in many ways. The simplest way is direct observation.
Example: you want to find how many cars pass by a certain point on a
road in a 10-minute interval.
So: stand at that point on the road, and count the cars that pass by in
that interval.
You collect data by doing a Survey.
Census or Sample
A Census is when you collect data for every member of the group (the
whole "population").
A Sample is when you collect data just for selected members of the group.
Example: there are 120 people in your local football club.
You can ask everyone (all 120) what their age is. That is a census.
Or you could just choose the people that are there this afternoon. That
is a sample.
A census is accurate, but hard to do. A sample is not as accurate, but may be
good enough, and is a lot easier.
Language
Data or Datum?
The singular form is "datum", so we would say "that datum is very high".
"Data" is the plural so we can say "the data are available", but it is also
a collection of facts, so "the data is available" is fine too.
Discrete and Continuous Data
Data can be Descriptive (like "high" or "fast") or Numerical (numbers).
And Numerical Data can be Discrete or Continuous:
Discrete data is counted, Continuous data is measured
Discrete Data
Discrete Data can only take certain values.
Example: the number of students in a class (you can't have half a
student).
Continuous Data
Continuous Data can take any value (within a range)
Examples:
A person's height: could be any value (within the range of
human heights), not just certain fixed heights,
Time in a race: you could even measure it to fractions of
a second,
A dog's weight,
The length of a leaf,
Lots more!
Analog and Digital
Analog: something physical with continuous change.
Digital: made up of numbers.
Arrow Barks!
Let's record him barking:
Arrow's bark is analog. It is actual pressure waves in the air, so it is physical
with continuous change.
Continuous change: changes smoothly ... no sudden breaks.
The microphone converts that pressure into an electrical signal. It is
still analog (the electricity is physical, and has continuous change).
But when it gets to your computer or phone it gets converted to digits!
Thousands of times a second the analog signal is measured by special
electronics ... and is then saved as numbers.
So the "sound" is now "12, 25, 39, 52, 68, 71, 78, 82, 82, 79, 70, 59, ..." (in
fact it would be in binary, so would be something like
"000011000001100100100111...")
It is now digital!
Notice the digital data has sudden jumps up and down ... it
does not change continuously.
It is Discrete Data: that means it can only be certain values (such as 1, 2, 3,
etc).
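To see how the numbers become binary, here is a small sketch that writes the first three sample values from above as 8-bit binary numbers. The 8-bit width and the helper name `to_bits` are assumptions made for this example; real recordings often use more bits per sample.

```python
# The first three sample values quoted in the text.
samples = [12, 25, 39]


def to_bits(values, width=8):
    """Join each value's fixed-width binary form into one string of bits."""
    return "".join(format(v, f"0{width}b") for v in values)


bits = to_bits(samples)
print(bits)
```

With those assumptions, the printed digits are the same as the start of the binary string shown above.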
Digital data is very easy for computers and phones to use. It can be saved,
shared electronically, sent all over the world quickly and more.
How can we hear Digits?
Easy! The numbers are used to control the size of an electrical signal, which
is analog.
The electricity can be sent to a speaker ...
... to make sound waves again!
It should sound very much like the original bark (but not perfectly so!)
Digital Pictures
A similar thing happens when you take a picture.
Light (which is analog) gets projected onto a grid of millions of little sensors
inside the camera:
The camera measures the light at each point and produces numbers.
The picture is now digital!
So the "picture" is now "A1DDF9, ADE3FF, B5E7FE, AFE4F8, ...", which
are hexadecimal color numbers (they are used internally in binary, so would
be something like "101000011101110111111001...")
Look really closely at a digital picture ... it is made up of millions of little
squares called "pixels":
Each "pixel" is made using a hexadecimal color number.
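As a sketch of what one of those color numbers holds, the code below splits "A1DDF9" into three parts and also shows it in binary. It assumes the usual convention that the six hexadecimal digits are two for red, two for green and two for blue; the text above does not spell that out, and the function name `split_color` is made up for this example.

```python
def split_color(hex_color):
    """Return (red, green, blue, bits) for a 6-digit hexadecimal color."""
    value = int(hex_color, 16)
    red = (value >> 16) & 0xFF    # first two hex digits
    green = (value >> 8) & 0xFF   # middle two hex digits
    blue = value & 0xFF           # last two hex digits
    return red, green, blue, format(value, "024b")


red, green, blue, bits = split_color("A1DDF9")
print(red, green, blue)   # 161 221 249
print(bits)               # 101000011101110111111001
```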
Digital IS Numbers
So digital pictures, music, videos etc are actually stored on your computer or
phone as numbers.
Numbers rule!
Bar Graphs
A Bar Graph (also called Bar Chart) is a graphical display of data using bars
of different heights.
Imagine you just did a survey of your friends to find which kind of movie
they liked best:
Table: Favorite Type of Movie
Comedy Action Romance Drama SciFi
4 5 6 1 4
You could show that on a bar graph like this:
It is a really good way to show relative sizes: it is easy to see which types of
movie are most liked, and which are least liked, at a glance.
You can use bar graphs to show the relative sizes of many things, such as
what type of car people have, how many customers a shop has on different
days and so on.
Example: Most Popular Fruit
A survey of 145 people revealed their favorite fruit:
Fruit: Apple Orange Banana Kiwifruit Blueberry Grapes
People: 35 30 10 25 40 5
And here is the bar graph:
For that group of people Blueberries are most popular and Grapes are the
least popular.
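If you have no drawing tools handy, a rough bar graph can even be made from text. This is only a sketch: the choice of one "#" for every 5 people and the name `text_bars` are assumptions for the example.

```python
fruit = {"Apple": 35, "Orange": 30, "Banana": 10,
         "Kiwifruit": 25, "Blueberry": 40, "Grapes": 5}


def text_bars(counts, per_symbol=5):
    """Build one line per category, with one '#' for every `per_symbol` people."""
    return [f"{name:<10}{'#' * (n // per_symbol)} {n}"
            for name, n in counts.items()]


for line in text_bars(fruit):
    print(line)

most = max(fruit, key=fruit.get)    # category with the longest bar
least = min(fruit, key=fruit.get)   # category with the shortest bar
print("Most popular:", most, "- least popular:", least)
```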
Example: Student Grades
In a recent test, this many students got the following grades:
Grade: A B C D
Students: 4 12 10 2
And here is the bar graph:
You can create graphs like that using our Data Graphs (Bar, Line and
Pie) page.
Histograms vs Bar Graphs
Bar Graphs are good when your data is in categories (such as
"Comedy", "Drama", etc).
But when you have continuous data (such as a person's height) then
use a Histogram.
Pie Chart
Pie Chart - A special chart that uses "pie slices" to show relative
sizes of data.
Imagine you just did a survey of your friends to find which kind of movie
they liked best.
Here are the results:
Table: Favorite Type of Movie
Comedy Action Romance Drama SciFi
4 5 6 1 4
You could show that by this pie chart:
It is a really good way to show relative sizes: it is easy to see which movie
types are most liked, and which are least liked, at a glance.
You can create graphs like that using our Data Graphs (Bar, Line and
Pie) page.
Or you can make them yourself ...
How to Make Them Yourself
First, put your data into a table (like above), then add up all the values to
get a total:
Comedy Action Romance Drama SciFi TOTAL
4 5 6 1 4 20
Next, divide each value by the total and multiply by 100 to get a percent:
Comedy   Action   Romance  Drama    SciFi    TOTAL
4        5        6        1        4        20
4/20     5/20     6/20     1/20     4/20
= 20%    = 25%    = 30%    = 5%     = 20%    100%
Now you need to figure out how many degrees for each "pie slice" (correctly
called a sector).
A Full Circle has 360 degrees, so we do this calculation:
Comedy       Action       Romance      Drama        SciFi        TOTAL
4            5            6            1            4            20
4/20         5/20         6/20         1/20         4/20
= 20%        = 25%        = 30%        = 5%         = 20%        100%
4/20 × 360°  5/20 × 360°  6/20 × 360°  1/20 × 360°  4/20 × 360°
= 72°        = 90°        = 108°       = 18°        = 72°        360°
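The same two calculations (percent of the total, then degrees out of 360) can be sketched in a few lines of code. The function name `pie_sectors` is just a label chosen for this example.

```python
movies = {"Comedy": 4, "Action": 5, "Romance": 6, "Drama": 1, "SciFi": 4}


def pie_sectors(counts):
    """Map each category to (percent of total, sector angle in degrees)."""
    total = sum(counts.values())
    return {name: (100 * n / total, 360 * n / total)
            for name, n in counts.items()}


sectors = pie_sectors(movies)
for name, (percent, degrees) in sectors.items():
    print(f"{name:<8}{percent:5.0f}% {degrees:6.0f} degrees")
```

The angles should always add up to 360, which is a handy check before you pick up the protractor.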
Now you are ready to start drawing!
Draw a circle.
Then use your protractor to measure
the degrees of each sector.
Here I show the first sector ...
... you can do the rest!
More Examples
You can use pie charts to show the relative sizes of many things, such as:
what type of car people have,
how many customers a shop has on different days,
how popular different breeds of dogs are, and so on.
Example: Student Grades
Here is how many students got each grade in the recent test:
A B C D
4 12 10 2
And here is the pie chart:
Dot Plots
A Dot Plot is a graphical display of data using dots.
Imagine you just did a survey of your friends to find which kind of movie
they liked best:
Table: Favorite Type of Movie
Comedy Action Romance Drama SciFi
4 5 6 1 4
On a dot plot it looks like this:
It is a really good way to show relative sizes: it is easy to see which types of
movie are most liked, and which are least liked, at a glance. Very similar to
a bar graph.
Here is another example:
Example: Minutes To Eat Breakfast
A survey of "How long does it take you to eat breakfast?" had the following
results:
Minutes: 0 1 2 3 4 5 6 7 8 9 10 11 12
People: 6 2 3 5 2 5 0 0 2 3 7 4 1
Which means that 6 people take 0 minutes to eat breakfast (they probably
had no breakfast!), 2 people say they only spend 1 minute having breakfast,
etc.
And here is the dot plot:
Another version of the dot plot has just one dot for each data point like this:
Example: (continued)
This has the same data as above:
But notice that we need to have lines and numbers on the side so we can see
what the dots mean.
Line Graphs
Line Graph - A graph that shows information that is connected in
some way (such as change over time)
You are learning math facts, and each day you do a short test to see how
good you are. These are the results:
Table: Facts I got Correct
Day 1 Day 2 Day 3 Day 4
3 4 12 15
And here is the same data as a Line Graph:
You seem to be improving!
Scatter Plots
A graph of plotted points that show the
relationship between two sets of data.
In this example, each dot represents one person's
weight versus their height.
(The data is plotted on the graph as "Cartesian
(x,y) Coordinates")
Example:
The local ice cream shop keeps track of how much ice cream they sell versus
the temperature on that day. Here are their figures for the last 12 days:
Ice Cream Sales vs Temperature
Temperature °C Ice Cream Sales
14.2° $215
16.4° $325
11.9° $185
15.2° $332
18.5° $406
22.1° $522
19.4° $412
25.1° $614
23.4° $544
18.1° $421
22.6° $445
17.2° $408
And here is the same data as a Scatter Plot:
It is now easy to see that warmer weather leads to more sales, but the
relationship is not perfect.
Line of Best Fit
You can also draw a "Line of Best Fit" (also called a "Trend Line") on your
scatter plot:
Try to have the line as close as possible to all points, and as many points
above the line as below.
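Drawing the line by eye works well on paper. If you want a computer to do it, one common method (not the by-eye method described above) is "least squares", which picks the line that makes the squared vertical distances to the points as small as possible. Here is a minimal sketch using the ice cream figures; the function name `least_squares_line` is made up for this example.

```python
temps = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1, 19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [215, 325, 185, 332, 406, 522, 412, 614, 544, 421, 445, 408]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares_line(temps, sales)
print(f"sales = {slope:.1f} * temperature + ({intercept:.0f})")
```

For this data the slope comes out at roughly 30, meaning about $30 more in sales for each extra degree.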
Example: Sea Level Rise
A Scatter Plot of Sea
Level Rise:
And here I have drawn
on a "Line of Best Fit".
Correlation
When the two sets of data are strongly linked together we say they have
a High Correlation.
The word Correlation is made of Co- (meaning "together"), and Relation
Correlation is Positive when the values increase together, and
Correlation is Negative when one value decreases as the other
increases
Like this:
(Learn More About Correlation)
Negative Correlation
Correlations can be negative, which means there is a correlation but one
value goes down as the other value increases.
Example : Birth Rate vs Income
The birth rate tends to be lower in richer
countries.
Below is a scatter plot for about 100 different
countries.
Country      Yearly Production per Person   Birth Rate
Madagascar   $800                           5.70
India        $3,100                         2.85
Mexico       $9,600                         2.49
Taiwan       $25,300                        1.57
Norway       $40,000                        1.78
It has a negative correlation (the line slopes down)
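You can also put a number on the correlation. The sketch below uses the usual (Pearson) correlation formula on just the five countries listed in the table, not the full set of about 100, so treat the result as an illustration only. The function name `correlation` is chosen for this example.

```python
from math import sqrt

production = [800, 3100, 9600, 25300, 40000]   # yearly production per person ($)
birth_rate = [5.70, 2.85, 2.49, 1.57, 1.78]


def correlation(xs, ys):
    """Pearson correlation: +1 is a perfect rising line, -1 a perfect falling line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = correlation(production, birth_rate)
print(f"correlation = {r:.2f}")
```

The value is below zero, which matches the downward slope of the line.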
Note: I tried to fit a straight line to the data, but maybe a curve would work
better, what do you think?
Pictographs
A Pictograph is a way of showing data using images.
Each image stands for a certain number of things.
Example: Apples Sold
Here is a pictograph of how many apples were sold at the local shop
over 4 months:
Note that each picture of an apple means 10 apples (and the half-
apple picture means 5 apples).
So the pictograph is showing:
In January 10 apples were sold
In February 40 apples were sold
In March 25 apples were sold
In April 20 apples were sold
It is a fun and interesting way to show data.
But it is not very accurate: in the example above we can't show just 1 apple
sold, or 2 apples sold etc.
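Here is a small sketch of that idea, using "O" for a whole apple picture (10 apples) and "c" for a half apple (5 apples). The symbols and the function name `pictograph_row` are assumptions for this example. Notice that it refuses amounts such as 1 or 2 apples, which is exactly the accuracy problem described above.

```python
apples_sold = {"January": 10, "February": 40, "March": 25, "April": 20}


def pictograph_row(count, per_picture=10):
    """Return 'O' for each whole picture plus 'c' for one half picture."""
    whole, rest = divmod(count, per_picture)
    if rest == 0:
        return "O" * whole
    if rest * 2 == per_picture:          # exactly half a picture left over
        return "O" * whole + "c"
    raise ValueError(f"{count} cannot be shown with whole and half pictures")


for month, count in apples_sold.items():
    print(f"{month:<9}{pictograph_row(count)}")
```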
Why don't you try to make your own pictographs? Here are a few ideas:
How much money you have (week by week)
How much exercise you get (each day)
How many hours you watch TV every week
How many sports stories are in each newspaper
Histograms
A Histogram is a graphical display of data using bars of different heights.
It is similar to a Bar Chart, but a histogram groups numbers into ranges.
And you decide what ranges to use!
Example: Dress Shop Survey
You asked customers who bought one of the "Aurora" range of skirts
how old they were.
The ages were from 5 to 25 years old.
You decide to put the results into groups of 5:
The 1 to 5 years old range,
The 6 to 10 years old range,
etc...
So when someone says "I am 17"
you add 1 to the "16-20" range.
And here is the result:
You can see (for example) that there were 30 customers
between 6 and 10 years old.
Histograms are a great way to show results of continuous data, such as:
weight
height
how much time
etc.
But if your data is in categories (such as Country or Favorite Movie),
then you should use a Bar Chart.
Frequency Histogram
A Frequency Histogram is a special histogram that uses vertical columns to
show frequencies (how many times each score occurs):
Here I have added up how often 1 occurs (2
times), how often 2 occurs (5 times), etc, and
shown them as a histogram.
Frequency Distribution
Frequency
Frequency is how often something occurs.
Example: Sam played football on
Saturday Morning,
Saturday Afternoon
Thursday Afternoon
The frequency was 2 on Saturday, 1 on Thursday and 3 for the whole
week.
Frequency Distribution
By counting frequencies we can make a Frequency Distribution table.
Example: Goals
Sam's team has scored the following numbers of goals in
recent games:
2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3
Sam put the numbers in order,
then added up:
how often 1 occurs (2 times),
how often 2 occurs (5 times),
etc,
and wrote them down as a
Frequency Distribution table:
From the table we can see interesting things such as
getting 2 goals happens most frequently
only once did they get 5 goals
This is the definition:
Frequency Distribution: values and their frequency (how often each value
occurs).
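Counting by hand is fine for a short list. For a longer one, a sketch like this does the same job for Sam's goals, using `Counter` from Python's standard library:

```python
from collections import Counter

goals = [2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3]

frequency = Counter(goals)              # value -> how often it occurs
for value in sorted(frequency):
    print(value, frequency[value])

most_common_value, times = frequency.most_common(1)[0]
print("Most frequent:", most_common_value, "goals,", times, "times")
```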
Here is another example:
Example: Newspapers
These are the numbers of newspapers sold at a local shop over the last
10 days:
22, 20, 18, 23, 20, 25, 22, 20, 18, 20
Let us count how many of each number there is:
Papers Sold Frequency
18 2
19 0
20 4
21 0
22 2
23 1
24 0
25 1
It is also possible to group the values. Here they are grouped in 5s:
Papers Sold Frequency
15-19 2
20-24 7
25-29 1
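Grouping can be sketched in code too. The function name `grouped_frequency` and its `start` and `width` parameters are assumptions for this example; here the groups start at 15 and are 5 wide, as in the table above.

```python
papers = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]


def grouped_frequency(values, start, width):
    """Count values in groups start..start+width-1, then the next group, and so on."""
    counts = {}
    for v in values:
        low = start + ((v - start) // width) * width   # bottom of this value's group
        label = f"{low}-{low + width - 1}"
        counts[label] = counts.get(label, 0) + 1
    return counts


table = grouped_frequency(papers, start=15, width=5)
for label in sorted(table):
    print(label, table[label])
```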
(Learn more about Grouped Frequency Distributions)
Graphs
After creating a Frequency Distribution table you might like to make a Bar
Graph or a Pie Chart using the Data Graphs (Bar, Line and Pie) page.
Stem and Leaf Plots
A special table where each data value is split into a "leaf" (usually the last
digit) and a "stem" (the other digits). Like in this example:
Example:
"32" is split into "3" (stem) and "2" (leaf).
The "stem" values are listed down, and the "leaf" values go right (or left)
from the stem values.
The "stem" is used to group the scores and each "leaf" indicates the
individual scores within each group.
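The splitting can be sketched like this for whole numbers of zero or more. Apart from 32, the scores below are made up purely for illustration, and the function name `stem_and_leaf` is an assumption for this example.

```python
def stem_and_leaf(values):
    """Group whole numbers by stem (all digits but the last); leaves are last digits."""
    plot = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)       # 32 -> stem 3, leaf 2
        plot.setdefault(stem, []).append(leaf)
    return plot


# Hypothetical scores (only 32 comes from the example above).
scores = [32, 15, 27, 21, 36, 19, 25]
for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```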
Cumulative Tables and Graphs
Cumulative
Cumulative means "how much so far".
Think of the word "accumulate" which means to gather together.
To have cumulative totals, just add up the values as you go.
Example: Jamie has earned this much in the last 6 months:
Month Earned
March $120
April $50
May $110
June $100
July $50
August $20
To work out the cumulative totals, just add up as you go.
The first line is easy, the total earned so far is the same as Jamie
earned that month:
Month Earned Cumulative
March $120 $120
But for April, the total earned so far is $120 + $50 = $170 :
Month Earned Cumulative
March $120 $120
April $50 $170
And for May we continue to add up: $170 + $110 = $280
Month Earned Cumulative
March $120 $120
April $50 $170
May $110 $280
Do you see how we add the previous month's cumulative total to this
month's earnings?
Here is the calculation for the rest:
June is $280 + $100 = $380
July is $380 + $50 = $430
August is $430 + $20 = $450
And this is the result
Month Earned Cumulative
March $120 $120
April $50 $170
May $110 $280
June $100 $380
July $50 $430
August $20 $450
The last cumulative total should match the total of all earnings:
$450 is the last cumulative total ...
... it is also the total of all earnings:
$120+$50+$110+$100+$50+$20 = $450
So we got it right.
So that's how to do it, add up as you go down the list and you will have
cumulative totals.
You could also call it a "Running Total"
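A running total is easy to sketch in code as well. This version uses `accumulate` from Python's standard library on Jamie's earnings:

```python
from itertools import accumulate

months = ["March", "April", "May", "June", "July", "August"]
earned = [120, 50, 110, 100, 50, 20]

cumulative = list(accumulate(earned))   # add up as you go down the list

for month, amount, total in zip(months, earned, cumulative):
    print(f"{month:<8}${amount:<6}${total}")

# The last cumulative total should match the total of all earnings.
print(cumulative[-1] == sum(earned))
```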
Graphs
You can make cumulative graphs if you want. Just plot each Month's
cumulative total:
Cumulative Bar Graph Cumulative Line Graph
How to Do a Survey
Survey Says ...
Turn on the television, radio or open a
newspaper and you will often see the
results from a survey.
Gathering information is an important way to help people
make decisions about topics of interest.
Surveys can help decide what needs changing, where money
should be spent, what products to purchase, what problems
there might be, or lots of other questions you may have at
any time.
The best part about surveys is that they can be used to
answer any question about any topic.
You can survey people (through questionnaires, opinion polls, etc)
or things (like pollution levels in a river, or traffic flow).
Four Steps
Here are four steps to a successful survey:
Step one: create the questions
Step two: ask the questions
Step three: tally the results
Step four: present the results
Let us look at those steps in more detail ...
Step One: Create the Questions
The first thing to decide is:
What questions do you want answered?
Sometimes these may be simple questions like:
"What is your favorite color?"
Other times the questions may be quite complex
such as:
"Which roads have the worst traffic conditions?"
Simple Surveys
If you are doing a simple survey, you could use tally marks to represent each
person's answer:
Sometimes, it is helpful to be creative in how the people can respond. It
makes it more fun for both you and your respondents (the people answering
the question).
Example: What is your favorite color?
Have them write down their favorite color on a piece of paper and drop
it in a fish bowl.
Then, put all of the pieces of paper into piles and count them.
To help you make a good Questionnaire read our page Survey
Questions.
Step Two: Asking The Questions
Now you have your questions, go out and ask them! But who to ask?
If you survey a small group you can ask everybody (called a Census)
If you want to survey a large group, you may not be able to ask everybody
so you should ask a sample of the population (called a Sample)
If you are sampling, you should be careful who you ask.
To be a good sample, each person should be
chosen randomly
If you only ask people who look
friendly, you will only know what
friendly people think!
If you went to the swimming pool and
asked people "Can you swim?" you
will get a biased answer ... maybe
even 100% will say "Yes"
Note: the surveys where people are asked to ring a number
to vote are not very accurate, because only certain types of
people actually ring up!
So be careful not to bias your survey. Try to choose randomly.
Example: You want to know the favorite colors for people at your
school, but don't have the time to ask everyone.
Solution: Choose 50 people at random:
stand at the gate and choose "the next person to arrive" each time
or choose people randomly from a list and then go and find them!
or you could choose every 5th person
Your results will hopefully be nearly as good as if you asked everyone.
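If you have a list of names on a computer, the last two ideas can be sketched as below. The list of 250 placeholder names is invented for this example, and the fixed seed is only there so the example gives the same choice each time.

```python
import random

# Hypothetical school list: 250 placeholder names, invented for this example.
students = [f"Student {i}" for i in range(1, 251)]

rng = random.Random(1)                     # fixed seed so the example is repeatable

# Choose 50 different people at random from the list.
random_sample = rng.sample(students, 50)

# Or choose every 5th person, starting from a random place among the first 5.
start = rng.randrange(5)
every_fifth = students[start::5]

print(len(random_sample), len(every_fifth))
```

Choosing every 5th person is only fair if the list is not arranged in some pattern that repeats every 5 names.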
If you choose a person and they do not want to answer, just record "no
answer" on the survey form and mention how many people did not answer in
your report.
After completing a sampling survey you can use the information to make
a prediction as to how the rest of the population would respond.
The more people you have asked, the better your result will be.
Example: nationwide opinion polls survey up to 2,000 people, and the
results are nearly as good (typically within 2 or 3 percentage points, if the
sample is random) as asking everyone.
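To get a feel for how sample size matters, statisticians often use the "margin of error". The sketch below uses a standard textbook approximation for a yes/no question asked of a simple random sample, at the usual 95% confidence level. It is not a formula from this page, and it assumes the sample really is random; the function name `margin_of_error` is chosen for this example.

```python
from math import sqrt


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * sqrt(proportion * (1 - proportion) / sample_size)


for n in (100, 500, 2000):
    print(n, "people:", f"about {100 * margin_of_error(n):.1f} percentage points")
```

Notice that asking 20 times as many people (2,000 instead of 100) does not make the result 20 times better, so very large samples are often not worth the extra work.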
Step Three: Tally the Results
Now you have finished asking questions it is time to tally the results.
By "tally" I mean add up. This usually involves lots of paperwork and
computer work (spreadsheets are useful!)
Example: For "favorite colors of my class" you can simply write tally
marks like this (every fifth mark crosses the previous 4 marks, so you
can easily see groups of 5):
Step Four: Presenting the Results
Now you have your results, you will want to show them to other people in
the best possible way.
We have written a special page called Showing the Results of a Survey, but
here is a quick summary:
Tables
Sometimes, you can simply report the information in a table.
A table is a very simple way to show others the results. A table should have
a title, so those looking at it understand what results the table shows:
Table: The Favorite Colors of My Class
Yellow Red Blue Green Pink
4 5 6 1 4
Statistics
You can also summarize the results using statistics, such
as mean or standard deviation
Example: you have lots of information about how long it takes people to
get to school but it may be simpler just to present a summary such as:
Shortest Journey: 3 minutes
Average Journey: 22 minutes
Longest Journey: 58 minutes
Graphs
But nothing makes a report look better than a nice graph or chart.
Use Data Graphs (Bar, Line and Pie) to make them.
Example Survey Question: What is your favorite color?
Have fun asking questions!!!!!
Survey Questions
How to make a good Questionnaire!
The first question is one you should ask yourself:
"What do I hope to learn from asking the questions?"
This defines your objective (the purpose, or why you
are conducting the survey).
Example: you want to clean up the local river. You feel that with some
help and some money you could make it really beautiful again.
You want to survey your local community to find out:
Are other people also worried about the river?
Would they be willing to donate their time or money to help?
Questions
Now you know why you are doing a survey, start writing down the questions
you will ask!
Just write down any questions you think may be useful. Don't worry about
quality at this stage, we will improve your list of questions later.
Example: Questions you could ask for the river survey:
Does pollution worry you?
Do you ever go down to the river?
Can you spare some money to help the river?
Have you noticed the pollution in the river?
Would you be happy to volunteer for river cleanup?
When would you be available to help?
How should we clean up the river?
etc...
You can also ask the person about themselves (not too personal!), such as
approximate age, male or female, etc, so that you know the kind of people
that you have been surveying.
Your Turn: Go ahead and write down the questions for your
own survey!
Types of Questions
A survey question can be:
Open-ended (the person can answer in any way they want), or
Closed-ended (the person chooses from one of several options)
Closed-ended questions are much easier to total up later on, but may stop
people giving the answer they really want.
Example: "What is your favorite color?"
Open-ended: Someone may answer "dark fuchsia", in which case you
will need to have a category "dark fuchsia" in your results.
Closed-ended: With a choice of only 12 colors your work will be
easier, but they may not be able to pick their exact favorite color.
Look at each of your questions and decide if they should be
open-ended or closed-ended (take the opportunity to rewrite
any questions, too).
Example: "What do you think is the best way to clean up the river?"
Make it Open-ended: the answers won't be easy to put in a table or
graph, but you may get some good ideas, and there may be some good
quotes for your report.
Example: "How often do you visit the river?"
Make it Closed-ended with the following options:
Nearly every day
At least 5 times a year
1 to 4 times a year
Almost never
You will be able to present this data in a neat bar graph.
Question Sequence
It is important that the questions don't "lead" people to the answer.
Example: people may say "yes" to donate money if you ask the
questions this way:
Do you love nature?
Will you donate money to help the river?
But probably will say "no" if you ask the questions this way:
Is lack of money a problem for you?
Will you donate money to help the river?
To avoid this kind of thing, try to have your questions go:
from the least sensitive to the most sensitive
from the more general to the more specific
from questions about facts to questions about opinions
Go through your questions and put them in the best
sequence possible
Example: I will ask people how often they visit the river (a fact) before
I ask them what they feel about pollution (an opinion)
I will ask people their general feelings about the environment before I
ask them their feelings about the river.
Neutral Questions
Your questions should also be neutral ... allowing the
person to think their own thoughts about the question.
In the example above we had the question "Do you love nature?" ... that
is a bad question because it is almost forcing the person to say "Yes, of
course."
Try rewording it to be more neutral, for example:
Example: "How important is the natural environment to you?"
Not Important
Some Importance
Very Important
But you can also make statements and see if people agree:
Reword every question to be neutral
Possible Answers
For each "closed-ended" question try to think:
What are the possible answers to this question?
Make sure you have most of the common answers available.
If you are not sure what people might answer, you could always try a small
open-ended survey (maybe ask your friends or people in the street) to find
common answers.
Trick: try to avoid neutral answers (such as "don't care") because people
may choose them so they don't have to think about the answer!
It is also helpful to have an “other” category in case none of the choices are
satisfactory for the person answering the question.
Example: What is your favorite color?
Red, blue, green, yellow, purple, black, brown, orange, other
Scaled Answers
Sometimes you could have a scale on which they can rate their feelings
about the question.
Have "opposite" words at either end and a scale in between like this:
Examples:
The river is ...
Polluted :_____:_____:_____:_____:_____:_____: Clean
Cleaning up the river is ...
Easy :_____:_____:_____:_____:_____:_____: Difficult
Rated Items
For this type of answer the person gets to rate or rank each option.
Don't have too many items though, as that makes it too hard to answer.
Example: Please rank the following activities from 1 to 5, putting 1 next
to your favorite through to 5 for your least favorite.
___ Fishing
___ Football
___ Golf
___ Shopping
___ Sleeping
Number Answers
You can also just ask for a number
Example: "How many times did you visit the river during the past
year?"
____ times
Look at each "closed-ended" question and choose the best
answer options.
How Will I Gather the Answers?
Try to make life easier by thinking how you will
gather the answers before you ask the questions.
It is important to make the process simple, for both
yourself and those responding.
The Questionnaire
You are going to want a neat form that
makes it easy to answer the questions AND
easy to total up the answers later on.
Type your questions and answer options into a word
processor or spreadsheet, and format it neatly.
Remember to leave plenty of space for open-ended
questions.
How Will I Show the Results?
Go over each of the questions and think how you want the answers to go into
your report:
in a table,
a bar graph,
a pie chart,
or just explained in words.
Make sure each question is set up so you can present the
answers in your chosen style.
Example: you decide to have six options for "How many times do you
visit the river" so the bar graph looks best.
Test It Out
You should test your questionnaire on a few people.
Was each question clear and easy to understand?
Were they happy with the options?
It is also a good idea to time how long it takes so you will be
able to tell people "this survey only takes 2 minutes" (or
however long it takes). Use our Stopwatch.
Try the questionnaire on some friends.
Take notes of any difficulties your friends have with the
questionnaire, and see what you can do to improve it.
Your Original Objective
Lastly, look back at your original objectives for this survey ...
will the questions really help you find out what you want to know?
are there some questions you can remove? (smaller surveys are
easier!)
This is your last chance to make sure your questionnaire is a
good one!
You Are Done!
Now you have your questions as perfect as you can get them ...
... go out and ask them!
Showing the Results of a Survey
So you have just Conducted a Survey and want to show
your results in the best possible way?
Here are some suggestions:
Tables
Sometimes, you can simply report the information in a table.
A table is a very simple way to show others the results. A table should have
a title, so those looking at it understand what it shows:
Table: The Favorite Colors of My Class
Yellow Red Blue Green Pink
4 5 6 1 4
Statistics
You can also summarize the results using statistics, such
as Mean, Median, Mode, Standard Deviation and Quartiles
Example: you have lots of information about how long it takes people to
get to school but it may be simpler just to present a summary such as:
Shortest Journey: 3 minutes
Average Journey: 22 minutes
Longest Journey: 58 minutes
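Summaries like that are quick to work out with Python's standard `statistics` module. The ten journey times below are hypothetical, made up so that they match the shortest, average and longest values in the summary above.

```python
from statistics import mean, median, mode

# Hypothetical journey times in minutes, invented for this example.
journeys = [3, 10, 12, 15, 15, 20, 25, 30, 32, 58]

print("Shortest Journey:", min(journeys), "minutes")
print("Average Journey: ", mean(journeys), "minutes")
print("Longest Journey: ", max(journeys), "minutes")
print("Median:", median(journeys), "minutes, Mode:", mode(journeys), "minutes")
```

Here the median (the middle value) is lower than the average, because the one very long journey pulls the average up.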
Graphs
But nothing makes a report look better than a nice graph or chart
There are many different types of graphs. Three of the most
common are:
Line Graph - shows information that is somehow connected (such
as change over time)
Bar Graph - shows relative sizes of different results
Pie Chart - shows sizes as part of a whole (good for showing
percentages).
You can create graphs like those using our Data Graphs (Bar, Line and
Pie) page
People's Comments
If people have given their opinions or comments in the survey, you can
present the more interesting ones:
Example: In response to the question "How can we best clean up the
river?" we received these interesting replies:
"The government has a special fund for this"
"The local gardening group has seedlings you could plant"
Report
Put it all together into a report, with a nice introduction, and conclusions at
the end, and you are done!
Accuracy and Precision
They mean slightly different things!
Accuracy
Accuracy is how close a measured value is to the actual (true) value.
Precision
Precision is how close the measured values are to each other.
Examples of Precision and Accuracy:
Low Accuracy
High Precision
High Accuracy
Low Precision
High Accuracy
High Precision
So, if you are playing soccer and you always hit the left goal post instead of
scoring, then you are not accurate, but you are precise!
Bias (don't let precision fool you!)
If you measure something several times and all values are close,
they may all be wrong if there is a "Bias"
Bias is a systematic (built-in) error which makes all measurements wrong by
a certain amount.
Examples of Bias
The scales read "1 kg" when there is nothing on them
You always measure your height wearing shoes with thick soles.
A stopwatch that takes half a second to stop when clicked
In each case all measurements will be wrong by the same amount. That is
bias.
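One way to spot bias in numbers is to compare the average reading with a value you already know to be true. In this sketch the readings and the 5 kg "true" weight are hypothetical, standing in for scales that read about 1 kg with nothing on them; the function name `describe` is made up for this example.

```python
from statistics import mean, stdev


def describe(readings, true_value):
    """Return (bias, spread): the average error and the sample standard deviation."""
    return mean(readings) - true_value, stdev(readings)


# Hypothetical readings of an object assumed to weigh exactly 5 kg.
readings = [6.0, 6.1, 5.9, 6.0, 6.0]
bias, spread = describe(readings, true_value=5.0)
print(f"bias {bias:+.2f} kg, spread {spread:.2f} kg")
```

A small spread means the readings are precise, but the bias of about 1 kg shows they are not accurate.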
Degree of Accuracy
Accuracy depends on the instrument you are measuring with. But as a
general rule:
The degree of accuracy is half a unit each side of the unit of measure
Examples:
If your instrument measures in "1"s
then any value between 6½ and 7½ is measured as "7"
If your instrument measures in "2"s
then any value between 7 and 9 is measured as "8"
(Notice that the arrow points to the same spot, but the measured values are
different! Read more at Errors in Measurement.)
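The half-a-unit rule can be sketched as below. The true value of 7.2 is hypothetical, chosen so that it reads "7" on the first instrument and "8" on the second; the function names are made up for this example.

```python
def measure(value, unit):
    """Round `value` to the nearest multiple of `unit`, as the instrument would.

    Note: for a value exactly halfway between two marks, Python's round()
    picks the even choice, so avoid relying on this sketch for exact ties.
    """
    return round(value / unit) * unit


def possible_range(reading, unit):
    """The true value lies within half a unit either side of the reading."""
    return reading - unit / 2, reading + unit / 2


print(measure(7.2, 1), possible_range(7, 1))   # 7 (6.5, 7.5)
print(measure(7.2, 2), possible_range(8, 2))   # 8 (7.0, 9.0)
```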
Activity: Asking Questions
As you walk, or in the car or at home, look around and ask yourself
questions about the world around you.
Write down 5 of those questions that can be answered using numbers.
Examples:
How many trees in the park?
How long would it take to cut the grass along the street?
How much paint would it take to do the whole house?
Which Ice Cream sells the most?
You can use this form:
Question
1
2
3
4
5
Why Do This Activity?
It will improve your awareness and understanding of the world
It will increase your curiosity and
It will improve your "number-sense"
Try To Do This Your Whole Life
It is a good habit to always ask questions about the world.
Activity: Improving Questions
First do the Asking Questions Activity where you are asked to write down 5
real-world questions that can be answered using numbers.
Now we want to take those questions and make them better.
For each question:
Is it possible to answer?
Can we answer it exactly (or close enough)?
Do we know what each part of it means?
Ask "does that depend on ... "
Example: How many trees in the park?
When you start counting the trees you may find lots of tiny ones ...
should they be counted?
Maybe the question could be changed to
How many trees taller than 2 meters are in the park?
Example: How long would it take to cut the grass along the
street?
What are you using to cut the grass: A lawn mower? One you sit on?
Maybe the question could be changed to
How long would it take to cut the grass along the street
using our lawn mower?
Also: what does "along the street" mean? Just the grass alongside the
road? Maybe you need a map!
1 Original Question:
Improved Question:
2 Original Question:
Improved Question:
3 Original Question:
Improved Question:
4 Original Question:
Improved Question:
5 Original Question:
Improved Question:
Probability and Statistics
Finding a Central Value
When you have two or more numbers it is nice to find a value for the
"center".
2 Numbers
With just 2 numbers the answer is easy: go half-way in-between.
Example: what is the central value for 3 and 7?
Answer: Half-way in-between, which is 5.
You can calculate it by adding 3 and 7 and then dividing the result by 2:
(3+7) / 2 = 10/2 = 5
3 or More Numbers
You can use the same idea when you have 3 or more numbers:
Example: what is the central value of 3, 7 and 8?
Answer: You calculate it by adding 3, 7 and 8 and then dividing the result by
3 (because there are 3 numbers):
(3+7+8) / 3 = 18/3 = 6
Notice that we divided by 3 because we had 3 numbers ... very important!
The Mean
So far we have been calculating the Mean (or the Average):
Mean: Add up the numbers and divide by how many numbers.
But sometimes the Mean can let you down:
Example: Birthday Activities
Uncle Bob wants to know the average age at the party, to choose an activity.
There will be 6 kids aged 13, and also 5 babies aged 1.
Add up all the ages, and divide by 11 (because there are 11 numbers):
(13+13+13+13+13+13+1+1+1+1+1) / 11 = 7.5...
The mean age is about 7½, so he gets a Jumping Castle!
The 13 year olds are embarrassed,
and the 1-year olds can't jump!
The Mean was accurate, but in this case it was not useful.
The Median
But you could also use the Median: simply list all numbers in order and
choose the middle one:
Example: Birthday Activities (continued)
List the ages in order:
1, 1, 1, 1, 1, 13, 13, 13, 13, 13, 13
Choose the middle number:
1, 1, 1, 1, 1, [13], 13, 13, 13, 13, 13
The Median age is 13 ... so let's have a Disco!
Sometimes there are two middle numbers. Just average them:
Example: What is the Median of 3, 4, 7, 9, 12, 15
There are two numbers in the middle:
3, 4, [7, 9], 12, 15
So we average them:
(7+9) / 2 = 16/2 = 8
The Median is 8
The Mode
The Mode is the value that occurs most often:
Example: Birthday Activities (continued)
Group the numbers so we can count them:
1, 1, 1, 1, 1, 13, 13, 13, 13, 13, 13
"13" occurs 6 times, "1" occurs only 5 times, so the mode is 13.
How to remember? Think "mode is most"
But Mode can be tricky, there can sometimes be more than one Mode.
Example: What is the Mode of 3, 4, 4, 5, 6, 6, 7
Well ... 4 occurs twice but 6 also occurs twice.
So both 4 and 6 are modes.
When there are two modes it is called "bimodal", when there are three or
more modes we call it "multimodal".
Conclusion
There are other ways of measuring central values, but Mean, Median and
Mode are the most common.
Use the one that best suits your data. Or better still, use all three!
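As a rough sketch, Python's standard `statistics` module can compute all three central values for the birthday example. The variable names are illustrative, and `multimode` needs Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Ages at Uncle Bob's party: six 13-year-olds and five 1-year-olds
ages = [13] * 6 + [1] * 5

party_mean = mean(ages)        # sum divided by count
party_median = median(ages)    # middle value of the sorted list
party_modes = multimode(ages)  # every value tied for "most often"

print(round(party_mean, 2), party_median, party_modes)

# A list with two modes, as in the bimodal example
print(multimode([3, 4, 4, 5, 6, 6, 7]))
```

Using `multimode` rather than `mode` is a design choice here: it returns a list, so bimodal data such as the second example reports both 4 and 6.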
How to Find the Mean
The mean is just the average of the numbers.
It is easy to calculate: add up all the numbers, then divide by how
many numbers there are.
In other words it is the sum divided by the count.
Example 1: What is the Mean of these numbers?
6, 11, 7
Add the numbers: 6 + 11 + 7 = 24
Divide by how many numbers (there are 3 numbers): 24 / 3 = 8
The Mean is 8
Why Does This Work?
It is because 6, 11 and 7 added together is the same as 3 lots of 8:
It is like you are "flattening out" the numbers
Example 2: Look at these numbers:
3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
The sum of these numbers is 330
There are fifteen numbers.
The mean is equal to 330 / 15 = 22
The mean of the above numbers is 22
Negative Numbers
How do you handle negative numbers? Adding a negative number is the
same as subtracting the number (without the negative). For example
3 + (-2) = 3 - 2 = 1.
Knowing this, let us try an example:
Example 3: Find the mean of these numbers:
3, -7, 5, 13, -2
The sum of these numbers is 3 - 7 + 5 + 13 - 2 = 12
There are 5 numbers.
The mean is equal to 12 ÷ 5 = 2.4
The mean of the above numbers is 2.4
Here is how to do it in one line:
Mean = (3 − 7 + 5 + 13 − 2) / 5 = 12/5 = 2.4
Now have a look at The Mean Machine.
How to Find the Median Value
It's the middle number in a sorted list.
Median Value
The Median is the "middle number" (in a sorted list of numbers).
How to Find the Median Value
To find the Median, place the numbers in value order and find the middle
number.
Example: find the Median of 12, 3 and 5
Put them in order:
3, 5, 12
The middle number is 5, so the median is 5.
Example:
3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
When we put those numbers in order we have:
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56
There are fifteen numbers. Our middle number will be the eighth number:
3, 5, 7, 12, 13, 14, 21, [23], 23, 23, 23, 29, 39, 40, 56
The median value of this set of numbers is 23.
(It doesn't matter that some numbers are the same in the list.)
Two Numbers in the Middle
BUT, when there is an even number of values things are slightly
different.
In that case we need to find the middle pair of numbers, and then find the
value that would be half way between them. This is easily done by adding
them together and dividing by two.
Example:
3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29
When we put those numbers in order we have:
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56
There are now fourteen numbers and so we don't have just one middle
number, we have a pair of middle numbers:
3, 5, 7, 12, 13, 14, [21, 23], 23, 23, 23, 29, 40, 56
In this example the middle numbers are 21 and 23.
To find the value half-way between them, add them together and divide by 2:
21 + 23 = 44
44 ÷ 2 = 22
So the Median in this example is 22.
(Note that 22 was not in the list of numbers ... but that is OK because half the
numbers in the list are less, and half the numbers are greater.)
Which is the Middle Number?
A quick way to find which is the middle number: count how many
numbers, add 1 then divide by 2
Example: There are 45 numbers
45 plus 1 is 46, then divide by 2 and you get 23
So the median is the 23rd number in the sorted list.
Example: There are 66 numbers
66 plus 1 is 67, then divide by 2 and you get 33.5
33 and a half? That means that the 33rd and 34th numbers in the sorted
list are the two middle numbers.
So to find the median: add the 33rd and 34th numbers together and divide
by 2.
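The "count, add 1, divide by 2" rule can be sketched as a small function. The name `median_by_position` is an assumption made for illustration; it follows the rule as described rather than any particular library's implementation.

```python
def median_by_position(values):
    """Median found with the "(count + 1) / 2" position rule."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2      # 1-based position; may end in .5
    if position.is_integer():
        return ordered[int(position) - 1]  # a single middle number
    lower = ordered[int(position) - 1]     # e.g. the 33rd value for 33.5
    upper = ordered[int(position)]         # e.g. the 34th value for 33.5
    return (lower + upper) / 2

fifteen = [3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
fourteen = [3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29]
print(median_by_position(fifteen), median_by_position(fourteen))
```

The two lists are the fifteen-number and fourteen-number examples from earlier in this section.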
How to Find the Mode or Modal Value
The mode is simply the number which appears most often.
Finding the Mode
To find the mode, or modal value, first put the numbers in order, then count
how many of each number.
Example:
3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
In order these numbers are:
3, 5, 7, 12, 13, 14, 20, 23, 23, 23, 23, 29, 39, 40, 56
This makes it easy to see which numbers appear most often.
In this case the mode is 23.
Another Example: {19, 8, 29, 35, 19, 28, 15}
Arrange them in order: {8, 15, 19, 19, 28, 29, 35}
19 appears twice, all the rest appear only once, so 19 is the mode.
How to remember? Think "mode is most"
More Than One Mode
You can have more than one mode.
Example: {1, 3, 3, 3, 4, 4, 6, 6, 6, 9}
3 appears three times, as does 6.
So there are two modes: at 3 and 6
Having two modes is called "bimodal".
Having more than two modes is called "multimodal".
Grouping
When all values appear the same number of times the idea of a mode is not
useful. But you could group them to see if one group has more than the
others.
Example: {4, 7, 11, 16, 20, 22, 25, 26, 33}
Each value occurs once, so let us try to group them.
We can try groups of 10:
0-9: 2 values (4 and 7)
10-19: 2 values (11 and 16)
20-29: 4 values (20, 22, 25 and 26)
30-39: 1 value (33)
In groups of 10, the "20s" appear most often, so we could choose 25 as
the mode.
You could use different groupings and get a different answer!
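A minimal sketch of the grouping idea, assuming groups of width 10 that start at 0. The variable names are illustrative; changing `width` shows how a different grouping can give a different answer.

```python
from collections import Counter

values = [4, 7, 11, 16, 20, 22, 25, 26, 33]
width = 10  # group width; an assumption you are free to change

# Label each value by the start of its group: 0 for 0-9, 10 for 10-19, ...
group_counts = Counter((v // width) * width for v in values)
modal_start, modal_count = group_counts.most_common(1)[0]

print(dict(group_counts))
print(f"modal group: {modal_start}-{modal_start + width - 1} ({modal_count} values)")
```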
Activity: Averages Brain-Teaser
Here is a little puzzle about averages. Is it right?
Who is Better at Kicking Goals?
At practice last week:
You scored 2 of 10 shots at goal
Sam scored 3 of 10 shots
Sam is better!
This week:
You scored 53 of 100 shots
Sam scored 6 of 10 shots
Sam is still better.
But let's add up the scores for BOTH weeks:
You scored 55 of 110 shots: that is 50%
Sam scored 9 of 20 shots: that is only 45%
Hang on! YOU are better!
Sam was better last week and this week ... but you are better over
both weeks?
Please explain.
...
Maybe make a table with all the data and do the calculations yourself
Sam You
Last Week
This Week
Both Weeks
....
... read on after you have thought about it ...
...
It is All True
Because you had SO MANY shots at goal this week, and did well at
them, you lifted your two-week average above Sam's.
At practice last week:
You scored 2 of 10 (20%)
Sam scored 3 of 10 (30%)
This week:
You scored 53 of 100 (53%)
Sam scored 6 of 10 (60%)
For BOTH weeks:
You scored 55 of 110 (50%)
Sam scored 9 of 20 (45%)
To be fair, we should really compare the averages only when you and Sam have
made roughly the same number of attempts at goal.
If Sam had attempted 100 shots this week, he may have scored 60 out of
100, and his two-week average would have been about 57%, better than
you.
So be careful when comparing two sets of data with widely different counts.
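The puzzle's arithmetic can be checked with a short sketch. The data layout, one (goals, shots) pair per week, and the function names are assumptions made for illustration.

```python
# (goals, shots) for last week and this week, taken from the puzzle
you = [(2, 10), (53, 100)]
sam = [(3, 10), (6, 10)]

def rate(goals, shots):
    """Success rate as a percentage."""
    return 100 * goals / shots

def combined_rate(weeks):
    """Rate over all weeks: total goals divided by total shots."""
    total_goals = sum(goals for goals, _ in weeks)
    total_shots = sum(shots for _, shots in weeks)
    return rate(total_goals, total_shots)

for (yg, ys), (sg, ss) in zip(you, sam):
    print(rate(yg, ys), rate(sg, ss))          # Sam is ahead each week
print(combined_rate(you), combined_rate(sam))  # but not overall
```

The combined rate is not the average of the weekly rates: each week counts in proportion to the number of shots taken, which is why your 100-shot week dominates.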
The Mean from a Frequency Table
It is easy to calculate the Mean:
Add up all the numbers, then divide by how many numbers there are.
Example 1: What is the Mean of these numbers?
6, 11, 7
Add the numbers: 6 + 11 + 7 = 24
Divide by how many numbers (there are 3 numbers): 24 ÷ 3 = 8
The Mean is 8
But sometimes you won't have a simple list of numbers, you might have a
frequency table like this (the "frequency" says how often they occur):
Score Frequency
1 2
2 5
3 4
4 2
5 1
(it says that score 1 occurred 2 times, score 2 occurred 5 times, etc)
You could list all the numbers like this:
Mean = (1+1 + 2+2+2+2+2 + 3+3+3+3 + 4+4 + 5) / (how many numbers)
But rather than do lots of adds (like 3+3+3+3) it is often easier to use
multiplication:
Mean = (2×1 + 5×2 + 4×3 + 2×4 + 1×5) / (how many numbers)
And rather than count how many numbers there are, we can add up the
frequencies:
Mean = (2×1 + 5×2 + 4×3 + 2×4 + 1×5) / (2 + 5 + 4 + 2 + 1)
So let's calculate:
Mean = (2 + 10 + 12 + 8 + 5) / 14 = 37/14 = 2.64...
And that is how to calculate the mean from a frequency table!
Here is another example:
Example: Parking Spaces per House in Hampton Street
Isabella went up and down the street to find out how many parking
spaces each house had. Here are her results:
Parking Spaces   Frequency
1 15
2 27
3 8
4 5
What is the mean number of Parking Spaces?
Answer:
Mean = (15×1 + 27×2 + 8×3 + 5×4) / (15+27+8+5) = (15+54+24+20) / 55 = 113/55 = 2.05...
The Mean is 2.05 (to 2 decimal places)
(much easier than adding all numbers separately!)
Notation
Now you know how to do it, let's do that last example again, but using
formulas.
The symbol Σ (called Sigma) means "sum up"
(read more at Sigma Notation)
So we can say "add up all frequencies" this way:
Σf (where f is frequency)
And we would use it like this:
Σf = 15 + 27 + 8 + 5 = 55
Likewise we can add up "frequency times score" this way:
Σfx (where f is frequency and x is the matching score)
And the formula for calculating the mean from a frequency table is:
x̄ = Σfx / Σf
The x with the bar on top says "the mean of x"
So now we are ready to do our example above, but with correct notation.
Example: Calculate the Mean of this Frequency Table
x f
1 15
2 27
3 8
4 5
And here it is:
x̄ = Σfx / Σf = (15×1 + 27×2 + 8×3 + 5×4) / (15 + 27 + 8 + 5) = 113/55 = 2.05...
There you go! You can use sigma notation.
Calculate in the Table
It is often better to do the calculations in the table.
Example: (continued)
From the previous example, calculate f × x in the right-hand column
and then do totals:
x f fx
1 15 15
2 27 54
3 8 24
4 5 20
TOTALS: 55 113
And the Mean is then easy:
Mean = 113 / 55 = 2.05...
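The same calculation can be sketched in Python. Storing the table as a dictionary that maps each score x to its frequency f is an assumption made for illustration, as is the function name.

```python
def mean_from_frequencies(table):
    """Mean of a frequency table given as {score x: frequency f}."""
    total_fx = sum(x * f for x, f in table.items())  # the "fx" column total
    total_f = sum(table.values())                    # the "f" column total
    return total_fx / total_f

parking = {1: 15, 2: 27, 3: 8, 4: 5}
scores = {1: 2, 2: 5, 3: 4, 4: 2, 5: 1}

print(round(mean_from_frequencies(parking), 2))
print(round(mean_from_frequencies(scores), 2))
```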
Mean, Median and Mode
from Grouped Frequencies
Let's start off with some raw data (not a grouped frequency) ...
Example: Alex did a survey of how many games each of 20
friends owned, and got this:
9, 15, 11, 12, 3, 5, 10, 20, 14, 6, 8, 8, 12, 12, 18, 15, 6, 9, 18, 11
To find the Mean, add up all the numbers, then divide by how many numbers
there are:
Mean = (9+15+11+12+3+5+10+20+14+6+8+8+12+12+18+15+6+9+18+11) / 20
= 222/20 = 11.1
To find the Median, place the numbers in value order and find the middle
number (or the mean of the middle two numbers). In this case the mean of
the 10th and 11th values:
3, 5, 6, 6, 8, 8, 9, 9, 10, 11, 11, 12, 12, 12, 14, 15, 15, 18, 18, 20:
Median = (11 + 11) / 2 = 11
To find the Mode, or modal value, place the numbers in value order then
count how many of each number. The Mode is the number which appears
most often (you can have more than one mode):
3, 5, 6, 6, 8, 8, 9, 9, 10, 11, 11, 12, 12, 12, 14, 15, 15, 18, 18, 20:
12 appears three times, more often than the other values, so Mode = 12
Grouped Frequency Table
Now, let's make a Grouped Frequency Table of Alex's data:
Number of games   Frequency
1 - 5 2
6 - 10 7
11 - 15 8
16 - 20 3
(It says that 2 of Alex's friends own somewhere between 1 and 5 games, 7
own between 6 and 10 games, etc)
Oh No!
Suddenly all the original data gets lost (naughty pup!)
Only the Grouped Frequency Table survived ...
... can we help Alex calculate the Mean, Median and Mode from just that
table?
The answer is ... no we can't. Not accurately anyway. But, we can
make estimates.
Estimating the Mean from Grouped Data
So all we have left is:
Number of games   Frequency
1 - 5 2
6 - 10 7
11 - 15 8
16 - 20 3
The groups (1-5, 6-10, etc) also called class intervals, are
of width 5
The numbers 1, 6, 11 and 16 are the lower class boundaries
The numbers 5, 10, 15 and 20 are the upper class boundaries
The midpoints are halfway between the lower and upper class
boundaries
So the midpoints are 3, 8, 13 and 18
We can estimate the Mean by using the midpoints.
So, how does this work?
Think about Alex's 7 friends who are in the group 6 - 10: all we know is that
they each have between 6 and 10 games:
Maybe all seven of them have 6 games,
Maybe all seven of them have 10 games,
But it is more likely that there is a spread of numbers: some have 6,
some have 7, and so on
So we take an average: we assume that all seven of them have 8 games (8
is the average of 6 and 10), which is the midpoint of the group.
So, we could make the table in a different way:
Midpoint Frequency
3 2
8 7
13 8
18 3
Now we think "2 people have 3 games, 7 people have 8 games, 8 people
have 13 games and 3 people have 18 games", so we imagine the data looks
like this:
3, 3, 8, 8, 8, 8, 8, 8, 8, 13, 13, 13, 13, 13, 13, 13, 13, 18, 18, 18
Now we can add them all up and divide by 20. This is the quick way to do it:
Midpoint x   Frequency f   fx
3 2 6
8 7 56
13 8 104
18 3 54
Totals: 20 220
So an estimate of the mean number of games is:
Estimated Mean = 220/20 = 11
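The midpoint method can be sketched as follows. Representing each group as a (low, high, frequency) tuple is an assumption made for illustration.

```python
def estimated_mean(groups):
    """Estimate the mean from (low, high, frequency) groups using midpoints."""
    total_fx = sum((low + high) / 2 * f for low, high, f in groups)
    total_f = sum(f for _, _, f in groups)
    return total_fx / total_f

games = [(1, 5, 2), (6, 10, 7), (11, 15, 8), (16, 20, 3)]
print(estimated_mean(games))  # 11.0, compared with the true mean of 11.1
```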
Estimating the Median from Grouped Data
To estimate the Median, let's look at our data again:
Number of games   Frequency
1 - 5 2
6 - 10 7
11 - 15 8
16 - 20 3
The median is the mean of the middle two numbers (the 10th and
11th values) ...
... and they are both in the 11 - 15 group:
We can say "the median group is 11 - 15"
But if we need to estimate a single Median value we can use this formula:
Estimated Median = L + (((n/2) − cfb) / fm) × w
where:
L is the lower class boundary of the group containing the median
n is the total number of data
cfb is the cumulative frequency of the groups before the median group
fm is the frequency of the median group
w is the group width
For our example:
L = 11
n = 20
cfb = 2 + 7 = 9
fm = 8
w = 5
Estimated Median = 11 + (((20/2) − 9) / 8) × 5 = 11 + (1/8) × 5 = 11.625
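A sketch of the estimated-median formula in Python. The function name and the choice to pass the lower value L of each group as a separate list are assumptions for illustration; the function finds the median group by accumulating frequencies until it reaches n/2.

```python
def estimated_median(lowers, freqs, width):
    """L + ((n/2 - cfb) / fm) * w, where the median group is located
    by accumulating frequencies until n/2 is reached."""
    n = sum(freqs)
    cfb = 0  # cumulative frequency of the groups before the current one
    for lower, fm in zip(lowers, freqs):
        if cfb + fm >= n / 2:
            return lower + (n / 2 - cfb) / fm * width
        cfb += fm

game_lowers = [1, 6, 11, 16]
game_freqs = [2, 7, 8, 3]
print(estimated_median(game_lowers, game_freqs, 5))  # 11.625
```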
Estimating the Mode from Grouped Data
Again, looking at our data:
Number of games   Frequency
1 - 5 2
6 - 10 7
11 - 15 8
16 - 20 3
We can easily identify the modal group (the group with the highest
frequency), which is 11 - 15
We can say "the modal group is 11 - 15"
But the actual Mode may not even be in that group! Or there may be more
than one mode. Without the raw data we don't really know.
But, we can estimate the Mode using the following formula:
Estimated Mode = L + ((fm − fm-1) / ((fm − fm-1) + (fm − fm+1))) × w
where:
L is the lower class boundary of the modal group
fm-1 is the frequency of the group before the modal group
fm is the frequency of the modal group
fm+1 is the frequency of the group after the modal group
w is the group width
In this example:
L = 11
fm-1 = 7
fm = 8
fm+1 = 3
w = 5
Estimated Mode = 11 + ((8 − 7) / ((8 − 7) + (8 − 3))) × 5 = 11 + (1/6) × 5 = 11.833...
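The estimated-mode formula can be sketched in the same way. Two choices here are assumptions not stated in the text: a missing neighbouring group (when the modal group is first or last) is treated as having frequency 0, and if several groups tie for the highest frequency the first one is used.

```python
def estimated_mode(lowers, freqs, width):
    """L + ((fm - f_before) / ((fm - f_before) + (fm - f_after))) * w."""
    m = freqs.index(max(freqs))  # first group with the highest frequency
    fm = freqs[m]
    f_before = freqs[m - 1] if m > 0 else 0              # assumption: 0 if none
    f_after = freqs[m + 1] if m + 1 < len(freqs) else 0  # assumption: 0 if none
    # The denominator is zero only if both neighbours equal fm; the formula
    # gives no estimate in that case.
    return lowers[m] + (fm - f_before) / ((fm - f_before) + (fm - f_after)) * width

print(round(estimated_mode([1, 6, 11, 16], [2, 7, 8, 3], 5), 3))
```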
Our final result is:
Estimated Mean: 11
Estimated Median: 11.625
Estimated Mode: 11.833...
(Compare that with the true Mean, Median and Mode of 11.1, 11 and
12 that we got at the very start.)
And that is how it is done.
Now let us look at two more special examples, and get some more practice
along the way!
Continuous Data
Data can be Discrete or Continuous:
Discrete data can only take certain values, like our previous example
(games owned)
Continuous data can take any value (within a range), such as length
or weight
Continuous data can be treated in exactly the same way as discrete
data, but with one important difference.
The difference concerns the class boundaries.
Example: You grew fifty baby carrots using special soil. You dig
them up and measure their lengths (to the nearest mm) and group
the results:
Length (mm) Frequency
150 - 154 5
155 - 159 2
160 - 164 6
165 - 169 8
170 - 174 9
175 - 179 11
180 - 184 6
185 - 189 3
Now, what does "155 - 159" mean?
The clue is "to the nearest mm".
A length of 154.5 mm would be rounded up to 155 mm (and placed
in 155 - 159),
Similarly 159.49 mm would be rounded down to 159 mm (and also be
placed in 155 - 159).
So lengths from 154.5 up to (but not including) 159.5 get placed in 155
- 159
And so for continuous data "155 - 159" has two types of numbers at the
beginning and end:
the lower class boundary of 155 and the upper class boundary of
159
the lower class limit of 154.5 and upper class limit of 159.5
Note that the upper class limit of one class interval is the lower class limit of
the next class interval.
So, how does this affect our calculations?
The Mean is not affected
But the Median and Mode now have L = Lower class limit (rather than
Lower class boundary)
Now let's go:
Mean
Length (mm)   Midpoint x   Frequency f   fx
150 - 154 152 5 760
155 - 159 157 2 314
160 - 164 162 6 972
165 - 169 167 8 1336
170 - 174 172 9 1548
175 - 179 177 11 1947
180 - 184 182 6 1092
185 - 189 187 3 561
Totals: 50 8530
Estimated Mean = 8530/50 = 170.6 mm
Median
The Median is the mean of the 25th and the 26th length, so is in the 170 -
174 group:
L = 169.5 (the lower class limit of the 170 - 174 group)
n = 50
cfb = 5 + 2 + 6 + 8 = 21
fm = 9
w = 5
Estimated Median = 169.5 + (((50/2) − 21) / 9) × 5 = 169.5 + 2.22... = 171.7 mm (to 1 decimal)
Mode
The Modal group is the one with the highest frequency, which is 175 - 179:
L = 174.5 (the lower class limit of the 175 - 179 group)
fm-1 = 9
fm = 11
fm+1 = 6
w = 5
Estimated Mode = 174.5 + ((11 − 9) / ((11 − 9) + (11 − 6))) × 5 = 174.5 + 1.42... = 175.9 mm (to 1 decimal)
Ages
Age is a special case.
When we say "Sarah is 17" she stays "17" up until her eighteenth birthday.
She might be 17 years and 364 days old and still be called "17".
In other words, even though "age" is a continuous variable (time), we treat it
as discrete.
Example: The ages of the 112 people who live on a tropical island
were grouped as follows:
Age Number
0 - 9 20
10 - 19 21
20 - 29 23
30 - 39 16
40 - 49 11
50 - 59 10
60 - 69 7
70 - 79 3
80 - 89 1
A child in the first group 0 - 9 could be almost 10 years old. So the midpoint
for this group is 5 not 4.5
The midpoints are 5, 15, 25, 35, 45, 55, 65, 75 and 85
Similarly, in the calculations of Median and Mode, we will use the class
boundaries 0, 10, 20 etc
Mean
Age   Midpoint x   Number f   fx
0 - 9 5 20 100
10 - 19 15 21 315
20 - 29 25 23 575
30 - 39 35 16 560
40 - 49 45 11 495
50 - 59 55 10 550
60 - 69 65 7 455
70 - 79 75 3 225
80 - 89 85 1 85
Totals: 112 3360
Estimated Mean = 3360/112 = 30
Median
The Median is the mean of the ages of the 56th and the 57th people, so is in
the 20 - 29 group:
L = 20 (the lower class boundary of the class interval containing the
median)
n = 112
cfb = 20 + 21 = 41
fm = 23
w = 10
Estimated Median = 20 + (((112/2) − 41) / 23) × 10 = 20 + 6.52... = 26.5 (to 1 decimal)
Mode
The Modal group is the one with the highest frequency, which is 20 - 29:
L = 20 (the lower class boundary of the modal class)
fm-1 = 21
fm = 23
fm+1 = 16
w = 10
Estimated Mode = 20 + ((23 − 21) / ((23 − 21) + (23 − 16))) × 10 = 20 + 2.22... = 22.2 (to 1 decimal)
Summary
For grouped data, we cannot find the exact Mean, Median and Mode,
we can only give estimates.
To estimate the Mean use the midpoints of the class intervals.
Estimated Median = L + (((n/2) − cfb) / fm) × w
where:
L is the lower class boundary of the group containing the median
n is the total number of data
cfb is the cumulative frequency of the groups before the median group
fm is the frequency of the median group
w is the group width
Estimated Mode = L + ((fm − fm-1) / ((fm − fm-1) + (fm − fm+1))) × w
where:
L is the lower class boundary of the modal group
fm-1 is the frequency of the group before the modal group
fm is the frequency of the modal group
fm+1 is the frequency of the group after the modal group
w is the group width
For continuous data use limits (rather than boundaries) for median
and mode
Weighted Mean
Also called Weighted Average
A mean where some values contribute more than others.
Mean
When we do a simple mean (or average), we give equal weight to each
number.
Here is the mean of 1, 2, 3 and 4:
Add up the numbers, divide by how many numbers:
Mean = (1 + 2 + 3 + 4) / 4 = 10/4 = 2.5
Weights
We could think that each of those numbers has a "weight" of ¼ (because
there are 4 numbers):
Mean = ¼ × 1 + ¼ × 2 + ¼ × 3 + ¼ × 4
= 0.25 + 0.5 + 0.75 + 1 = 2.5
Same answer.
Now let's change the weight of 3 to 0.7, and the weights of the other
numbers to 0.1 so the total of the weights is still 1:
Mean = 0.1 × 1 + 0.1 × 2 + 0.7 × 3 + 0.1 × 4
= 0.1 + 0.2 + 2.1 + 0.4 = 2.8
This weighted mean is now a little higher ("pulled" there by the weight of
3).
When some values get more weight than others
the central point (the mean) can change:
Decisions
Weighted means can help with decisions where some things are more
important than others:
Example: Sam wants to buy a new camera, and decides on
the following rating system:
Image Quality 50%
Battery Life 30%
Zoom Range 20%
The Cony camera gets 8 (out of 10) for Image Quality, 6 for Battery
Life and 7 for Zoom Range
The Sanon camera gets 9 for Image Quality, 4 for Battery Life and 6 for
Zoom Range
Which camera is best?
Cony: 0.5 × 8 + 0.3 × 6 + 0.2 × 7 = 4 + 1.8 + 1.4 = 7.2
Sanon: 0.5 × 9 + 0.3 × 4 + 0.2 × 6 = 4.5 + 1.2 + 1.2 = 6.9
Sam decides to buy the Cony.
What if the Weights Don't Add to 1?
When the weights don't add to 1, divide by the sum of weights.
Example: Alex usually works 7 days a week, but sometimes
just 1, 2, or 5 days.
The data:
2 weeks Alex worked 1 day each week
14 weeks Alex worked 2 days each week
8 weeks Alex worked 5 days each week
32 weeks Alex worked 7 days each week
What is the mean number of days Alex works per week?
Use "Weeks" as the weighting:
Weeks × Days = 2 × 1 + 14 × 2 + 8 × 5 + 32 × 7
= 2 + 28 + 40 + 224 = 294
Also add up the weeks:
Weeks = 2 + 14 + 8 + 32 = 56
Divide:
Mean = 294/56 = 5.25
It looks like this:
But it is often better to use a table to make sure you have all the numbers
correct:
Example (continued):
Have:
the number of weeks is the weight w
and days (the value we want the mean of) is x
Multiply w by x, sum up w and sum up wx:
Weight w   Days x   wx
2 1 2
14 2 28
8 5 40
32 7 224
Σw = 56 Σwx = 294
Note: Σ (Sigma) means "Sum Up"
Divide Σwx by Σw:
Mean = 294/56 = 5.25
And that leads us to our formula:
Weighted Mean = Σwx / Σw
In other words: multiply each weight w by its matching value x, sum that all
up, and divide by the sum of weights.
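As a sketch, the formula translates directly into a few lines of Python. The function name and argument order are assumptions; dividing by the sum of the weights means the same function covers both cases, whether or not the weights add to 1.

```python
def weighted_mean(values, weights):
    """Sum of w*x divided by the sum of w."""
    total_wx = sum(w * x for x, w in zip(values, weights))
    return total_wx / sum(weights)

# Alex's working weeks: days worked (x) weighted by number of weeks (w)
print(weighted_mean([1, 2, 5, 7], [2, 14, 8, 32]))

# Camera ratings, where the weights 0.5, 0.3 and 0.2 add to 1
print(round(weighted_mean([8, 6, 7], [0.5, 0.3, 0.2]), 2))  # Cony
print(round(weighted_mean([9, 4, 6], [0.5, 0.3, 0.2]), 2))  # Sanon
```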
Summary
Weighted Mean: A mean where some values contribute more than
others.
When the weights add to 1: just multiply each weight by the
matching value and sum it all up
Otherwise, multiply each weight w by its matching value x, sum
that all up, and divide by the sum of weights:
Weighted Mean = Σwx / Σw
The Range (Statistics)
The Range is the difference between the lowest and highest values.
Example: In {4, 6, 9, 3, 7} the lowest value is 3, and the highest is 9.
So the range is 9-3 = 6.
It is that simple!
But perhaps too simple ...
The Range Can Be Misleading
The range can sometimes be misleading when there are extremely high or
low values.
Example: In {8, 11, 5, 9, 7, 6, 3616}:
the lowest value is 5,
and the highest is 3616,
So the range is 3616-5 = 3611.
The single value of 3616 makes the range large, but most values are
around 10.
So you may be better off using Interquartile Range or Standard Deviation.
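Both range examples can be reproduced with a one-line function. The name `value_range` is an assumption, chosen to avoid clashing with Python's built-in `range`.

```python
def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)

print(value_range([4, 6, 9, 3, 7]))            # 6, as in the first example
print(value_range([8, 11, 5, 9, 7, 6, 3616]))  # 3611, driven by one extreme value
```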
Range of a Function
Range can also mean all the output values of a function, see Domain,
Range and Codomain.
Quartiles
Quartiles are the values that divide a list of numbers into quarters.
First put the list of numbers in order
Then cut the list into four equal parts
The Quartiles are at the "cuts"
Like this:
Example: 5, 8, 4, 4, 6, 3, 8
Put them in order: 3, 4, 4, 5, 6, 8, 8
Cut the list into quarters:
And the result is:
Quartile 1 (Q1) = 4
Quartile 2 (Q2), which is also the Median, = 5
Quartile 3 (Q3) = 8
Sometimes a "cut" is between two numbers ... the Quartile is the average of
the two numbers.
Example: 1, 3, 3, 4, 5, 6, 6, 7, 8, 8
The numbers are already in order
Cut the list into quarters:
In this case Quartile 2 is half way between 5 and 6:
Q2 = (5+6)/2 = 5.5
And the result is:
Quartile 1 (Q1) = 3
Quartile 2 (Q2) = 5.5
Quartile 3 (Q3) = 7
Interquartile Range
The "Interquartile Range" is from Q1 to Q3:
To calculate it just subtract Quartile 1 from Quartile 3, like this:
Example:
The Interquartile Range is:
Q3 - Q1 = 8 - 4 = 4
Box and Whisker Plot
You can show all the important values in a "Box and Whisker Plot", like this:
A final example covering everything:
Example: Box and Whisker Plot and Interquartile Range for
4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11
Put them in order:
3, 4, 4, 4, 7, 10, 11, 12, 14, 16, 17, 18
Cut it into quarters:
3, 4, 4 | 4, 7, 10 | 11, 12, 14 | 16, 17, 18
In this case all the quartiles are between numbers:
Quartile 1 (Q1) = (4+4)/2 = 4
Quartile 2 (Q2) = (10+11)/2 = 10.5
Quartile 3 (Q3) = (14+16)/2 = 15
Also:
The Lowest Value is 3,
The Highest Value is 18
So now we have enough data for the Box and Whisker Plot:
And the Interquartile Range is:
Q3 - Q1 = 15 - 4 = 11
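The "cut into quarters" method used in these examples can be sketched as the median of each half of the sorted list, leaving out the middle value when the count is odd. That reading of the method is an assumption inferred from the worked examples; other conventions exist and can give slightly different quartiles.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.
    Needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # values below the middle
    upper = ordered[-half:]  # values above the middle (middle value skipped if odd)
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles([4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11])
print(q1, q2, q3, "IQR =", q3 - q1)
```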
Percentiles
Percentile: the value below which a percentage of data falls.
Example: You are the fourth tallest person in a group of 20
80% of people are shorter than you:
That means you are at the 80th percentile.
If your height is 1.85m then "1.85m" is the 80th percentile height in
that group.
In Order
The data needs to be in order! So percentiles of height need to be in height
order (sorted by height). If they were percentiles of weight, they would need
to be in weight order.
Deciles
A related idea is Deciles (sounds like decimal and percentile together),
which splits the data into 10% groups:
The 1st decile is the 10th percentile (the value that divides the data
so that 10% is below it)
The 2nd decile is the 20th percentile (the value that divides the
data so that 20% is below it)
etc!
Example: (continued)
You are at the 8th decile (the 80th percentile).
Quartiles
Another related idea is Quartiles, which splits the data into quarters:
Example: 1, 3, 3, 4, 5, 6, 6, 7, 8, 8
The numbers are in order. Cut the list into quarters:
In this case Quartile 2 is half way between 5 and 6:
Q2 = (5+6)/2 = 5.5
And the result is:
Quartile 1 (Q1) = 3
Quartile 2 (Q2) = 5.5
Quartile 3 (Q3) = 7
The Quartiles also divide the data into divisions of 25%, so:
Quartile 1 (Q1) can be called the 25th percentile
Quartile 2 (Q2) can be called the 50th percentile
Quartile 3 (Q3) can be called the 75th percentile
Example: (continued)
For 1, 3, 3, 4, 5, 6, 6, 7, 8, 8:
The 25th percentile = 3
The 50th percentile = 5.5
The 75th percentile = 7
Estimating Percentiles
We can estimate percentiles from a line graph.
Example: Shopping
A total of 10,000 people visited the shopping mall over 12 hours:
Time (hours) People
0 0
2 350
4 1100
6 2400
8 6500
10 8850
12 10,000
a) Estimate the 30th percentile (when 30% of the visitors
had arrived).
b) Estimate what percentile of visitors had arrived after 11
hours.
First draw a line graph of the data: plot the points and join them with a
smooth curve:
a) The 30th percentile occurs when the visits reach 3,000.
Draw a line horizontally across from 3,000 until you hit the curve, then
draw a line vertically downwards to read off the time on the horizontal
axis:
So the 30th percentile occurs after about 6.5 hours.
b) To estimate the percentile of visits after 11 hours: draw a line
vertically up from 11 until you hit the curve, then draw a line
horizontally across to read off the number of visitors on the vertical axis:
So the visits at 11 hours were about 9,500, which is the 95th
percentile.
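Reading values off the graph can be approximated in code. This sketch assumes straight lines between the plotted points instead of the smooth curve used above, so its answers (roughly 6.3 hours and about the 94th percentile) differ slightly from the graphical estimates. The function name `interpolate` is an illustrative assumption.

```python
hours = [0, 2, 4, 6, 8, 10, 12]
people = [0, 350, 1100, 2400, 6500, 8850, 10000]  # cumulative visitors

def interpolate(x, xs, ys):
    """Straight-line estimate of y at x; xs must be in increasing order."""
    points = list(zip(xs, ys))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)
    raise ValueError("x lies outside the data")

total = people[-1]
# a) go across from 30% of the visitors and read off the time
time_of_30th = interpolate(30 * total / 100, people, hours)
# b) go up from 11 hours and read off the visitors, then convert to a percentage
percent_at_11 = 100 * interpolate(11, hours, people) / total
print(time_of_30th, percent_at_11)
```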
Mean Deviation
The mean of the distances of each value from their mean.
Yes, we use "mean" twice: Find the mean ... use it to work out distances ...
then find the mean of those!
Three steps:
1. Find the mean of all values
2. Find the distance of each value from that mean (subtract the mean
from each value, ignore minus signs)
3. Then find the mean of those distances
Like this:
Example: the Mean Deviation of 3, 6, 6, 7, 8, 11, 15, 16
Step 1: Find the mean:
Mean = (3 + 6 + 6 + 7 + 8 + 11 + 15 + 16) / 8 = 72/8 = 9
Step 2: Find the distance of each value from that mean:
Value   Distance from 9
3 6
6 3
6 3
7 2
8 1
11 2
15 6
16 7
Which looks like this:
Step 3. Find the mean of those distances:
Mean Deviation = (6 + 3 + 3 + 2 + 1 + 2 + 6 + 7) / 8 = 30/8 = 3.75
So, the mean = 9, and the mean deviation = 3.75
It tells us how far, on average, all values are from the middle.
In that example the values are, on average, 3.75 away from the middle.
For deviation just think distance
Formula
The formula is:
Mean Deviation = Σ|x - μ| / N
Let's learn more about those symbols!
Firstly:
μ is the mean (in our example μ = 9)
x is each value (such as 3 or 16)
N is the number of values (in our example N = 8)
Absolute Deviation
Each distance we calculated is called an Absolute Deviation, because it is
the Absolute Value of the deviation (how far from the mean).
To show "Absolute Value" we put "|" marks either side like
this: |-3| = 3
For any value x:
Absolute Deviation = |x - μ|
From our example, the value 16 has Absolute Deviation = |x - μ| = |16 -
9| = |7| = 7
And now let's add them all up ...
Sigma
The symbol for "Sum Up" is Σ (called Sigma Notation), so we have:
Sum of Absolute Deviations = Σ|x - μ|
Divide by how many values N and we have:
Mean Deviation = Σ|x - μ| / N
Let's do our example again, using the proper symbols:
Example: the Mean Deviation of 3, 6, 6, 7, 8, 11, 15, 16
Step 1: Find the mean:
μ = (3 + 6 + 6 + 7 + 8 + 11 + 15 + 16) / 8 = 72 / 8 = 9
Step 2: Find the Absolute Deviations:
x |x - μ|
3 6
6 3
6 3
7 2
8 1
11 2
15 6
16 7
Σ|x - μ| = 30
Step 3. Find the Mean Deviation:
Mean Deviation = Σ|x - μ| / N = 30 / 8 = 3.75
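The three steps can also be sketched in a few lines of Python. This is only an illustration; the function name mean_deviation is our own choice and not part of the original page.

```python
def mean_deviation(values):
    """Mean of the absolute distances of each value from the mean."""
    mu = sum(values) / len(values)              # Step 1: find the mean
    distances = [abs(x - mu) for x in values]   # Step 2: absolute deviations
    return sum(distances) / len(values)         # Step 3: mean of the distances


data = [3, 6, 6, 7, 8, 11, 15, 16]
print(mean_deviation(data))  # 3.75
```

Calling it with the dog heights 600, 470, 170, 430, 300 gives 127.2 (up to floating-point rounding).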
What Does It "Mean" ?
Mean Deviation tells us how far, on average, all values are from the middle.
Here is an example (using the same data as on the Standard
Deviation page):
Example: You and your friends have just measured the
heights of your dogs (in millimeters):
The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm
and 300mm.
Step 1: Find the mean:
μ = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394
Step 2: Find the Absolute Deviations:
x |x - μ|
600 206
470 76
170 224
430 36
300 94
Σ|x - μ| = 636
Step 3. Find the Mean Deviation:
Mean Deviation = Σ|x - μ| / N = 636 / 5 = 127.2
So, on average, the dogs' heights are 127.2 mm from the mean.
(Compare that with the Standard Deviation of 147 mm)
A Useful Check
The deviations on one side of the mean should equal the deviations on
the other side.
From our first example:
Example: 3, 6, 6, 7, 8, 11, 15, 16
The deviations are:
6 + 3 + 3 + 2 + 1 = 2 + 6 + 7
15 = 15
Likewise:
Example: Dogs
Deviations left of mean: 224 + 94 = 318
Deviations right of mean: 206 + 76 + 36 = 318
If they are not equal ... you may have made a mistake!
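Here is a small sketch of that check in Python. The helper name side_sums is hypothetical; it simply adds up the distances below the mean and the distances above the mean so the two totals can be compared.

```python
def side_sums(values):
    """Return (total distance below the mean, total distance above the mean)."""
    mu = sum(values) / len(values)
    below = sum(mu - x for x in values if x < mu)
    above = sum(x - mu for x in values if x > mu)
    return below, above


print(side_sums([3, 6, 6, 7, 8, 11, 15, 16]))   # (15.0, 15.0)
print(side_sums([600, 470, 170, 430, 300]))     # (318.0, 318.0)
```

With values that are not whole numbers the two totals may differ by a tiny rounding error, so compare them with a small tolerance rather than expecting exact equality.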
Standard Deviation and Variance
Deviation just means how far from the normal
Standard Deviation
The Standard Deviation is a measure of how spread out numbers are.
Its symbol is σ (the Greek letter sigma).
The formula is easy: it is the square root of the Variance. So now you ask,
"What is the Variance?"
Variance
The Variance is defined as:
The average of the squared differences from the Mean.
To calculate the variance follow these steps:
Work out the Mean (the simple average of the numbers)
Then for each number: subtract the Mean and square the
result (the squared difference).
Then work out the average of those squared differences. (Why
Square?)
Example
You and your friends have just measured the heights of your dogs (in
millimeters):
The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and
300mm.
Find out the Mean, the Variance, and the Standard Deviation.
Your first step is to find the Mean:
Answer:
Mean = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394
so the mean (average) height is 394 mm. Let's plot this on the chart:
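Now we calculate each dog's difference from the Mean:
206, 76, -224, 36, -94
To calculate the Variance, take each difference, square it, and then average
the result:
Variance = (206² + 76² + (-224)² + 36² + (-94)²) / 5
= (42,436 + 5,776 + 50,176 + 1,296 + 8,836) / 5
= 108,520 / 5
= 21,704
So, the Variance is 21,704.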
And the Standard Deviation is just the square root of Variance, so:
Standard Deviation: σ = √21,704 = 147.32... = 147 (to the
nearest mm)
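The same calculation can be sketched in Python, following the steps one at a time. The variable names are our own.

```python
import math

heights = [600, 470, 170, 430, 300]   # dog heights in mm

mean = sum(heights) / len(heights)                  # work out the Mean
squared_diffs = [(h - mean) ** 2 for h in heights]  # subtract the Mean, then square
variance = sum(squared_diffs) / len(heights)        # average of the squared differences
std_dev = math.sqrt(variance)                       # square root of the Variance

print(mean, variance, round(std_dev, 2))  # 394.0 21704.0 147.32
```

Python's standard library can do the same job: statistics.pvariance and statistics.pstdev treat the data as a whole population.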
And the good thing about the Standard Deviation is that it is useful. Now we
can show which heights are within one Standard Deviation (147mm) of the
Mean:
So, using the Standard Deviation we have a "standard" way of knowing what
is normal, and what is extra large or extra small.
Rottweilers are tall dogs. And Dachshunds are a bit short ... but don't tell
them!
Now try the Standard Deviation Calculator.
But ... there is a small change with Sample Data
Our example was for a Population (the 5 dogs were the only dogs we were
interested in).
But if the data is a Sample (a selection taken from a bigger Population),
then the calculation changes!
When you have "N" data values that are:
The Population: divide by N when calculating Variance (like
we did)
A Sample: divide by N-1 when calculating Variance
All other calculations stay the same, including how we calculated the mean.
Example: if our 5 dogs were just a sample of a bigger population of
dogs, we would divide by 4 instead of 5 like this:
Sample Variance = 108,520 / 4 = 27,130
Sample Standard Deviation = √27,130 = 164.71... = 165 (to the nearest
mm)
Think of it as a "correction" when your data is only a sample.
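As a quick illustration, Python's statistics module has separate functions for the two cases: pvariance divides by N, while variance and stdev divide by N-1.

```python
import statistics

heights = [600, 470, 170, 430, 300]   # dog heights in mm

population_variance = statistics.pvariance(heights)  # divides by N
sample_variance = statistics.variance(heights)       # divides by N-1
sample_std = statistics.stdev(heights)               # square root of the sample variance

print(population_variance, sample_variance, f"{sample_std:.1f}")  # 21704 27130 164.7
```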
Formulas
Here are the two formulas, explained at Standard Deviation Formulas if you
want to know more:
The "Population Standard Deviation":
σ = √( (1/N) × Σ (x_i - μ)² )
The "Sample Standard Deviation":
s = √( (1/(N-1)) × Σ (x_i - x̄)² )
Looks complicated, but the important change is to
divide by N-1 (instead of N) when calculating a Sample Variance.
*Footnote: Why square the differences?
If we just added up the differences from the mean ... the negatives would
cancel the positives:
(4 + 4 - 4 - 4) / 4 = 0
So that won't work. How about we use absolute values?
(|4| + |4| + |-4| + |-4|) / 4 = (4 + 4 + 4 + 4) / 4 = 4
That looks good (and is the Mean Deviation), but what about this case:
(|7| + |1| + |-6| + |-2|) / 4 = (7 + 1 + 6 + 2) / 4 = 4
Oh no! It also gives a value of 4, even though the differences are more
spread out!
So let us try squaring each difference (and taking the square root at the
end):
√( (4² + 4² + 4² + 4²) / 4 ) = √(64 / 4) = 4
√( (7² + 1² + 6² + 2²) / 4 ) = √(90 / 4) = 4.74...
That is nice! The Standard Deviation is bigger when the differences are more
spread out ... just what we want!
In fact this method is a similar idea to distance between points, just applied
in a different way.
And it is easier to use algebra on squares and square roots than absolute
values, which makes the standard deviation easy to use in other areas of
mathematics.
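To see the footnote's comparison side by side, here is a short Python sketch. The function names mean_abs and root_mean_square are our own labels for the two methods.

```python
import math

def mean_abs(diffs):
    """Average of the absolute differences (the Mean Deviation idea)."""
    return sum(abs(d) for d in diffs) / len(diffs)

def root_mean_square(diffs):
    """Square each difference, average, then take the square root."""
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

even = [4, 4, -4, -4]      # differences that are all the same size
spread = [7, 1, -6, -2]    # differences that are more spread out

print(mean_abs(even), mean_abs(spread))                                      # 4.0 4.0
print(root_mean_square(even), round(root_mean_square(spread), 2))            # 4.0 4.74
```

The absolute-value method cannot tell the two sets apart, while the squaring method gives the more spread-out set a bigger value.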
Standard Deviation Calculator
When your data is the whole population the formula is:
σ = √( (1/N) × Σ (x_i - μ)² )
(The "Population Standard Deviation")
When your data is a sample the formula is:
s = √( (1/(N-1)) × Σ (x_i - x̄)² )
(The "Sample Standard Deviation")
The important difference is "N-1" instead of "N" ... read more at Standard
Deviation Formulas.
Standard Deviation Formulas
Deviation just means how far from the normal
Standard Deviation
The Standard Deviation is a measure of how spread out numbers are.
You might like to read this simpler page on Standard Deviation first.
But here we explain the formulas.
The symbol for Standard Deviation is σ (the Greek letter sigma).
This is the formula for Standard Deviation:
σ = √( (1/N) × Σ (x_i - μ)² )
Say what? Please explain!
OK. Let us explain it step by step.
Say you have a bunch of numbers like 9, 2, 5, 4, 12, 7, 8, 11.
To calculate the standard deviation of those numbers:
1. Work out the Mean (the simple average of the numbers)
2. Then for each number: subtract the Mean and square the
result
3. Then work out the mean of those squared differences.
4. Take the square root of that and you are done!
The formula actually says all of that, and I will show you how.
The Formula Explained
First, let us have some example values to work on:
Example: Sam has 20 Rose Bushes.
The number of flowers on each bush is
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
Work out the Standard Deviation.
Step 1. Work out the mean
In the formula above μ (the Greek letter "mu") is the mean of all our values
...
Example: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9,
6, 9, 4
The mean is:
(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 =
140/20 = 7
So:
μ = 7
Step 2. Then for each number: subtract the Mean and square the
result
This is the part of the formula that says:
(x_i - μ)²
So what is x_i? They are the individual x values 9, 2, 5, 4, 12, 7, etc...
In other words x_1 = 9, x_2 = 2, x_3 = 5, etc.
So it says "for each value, subtract the mean and square the result", like this:
Example (continued):
(9 - 7)² = (2)² = 4
(2 - 7)² = (-5)² = 25
(5 - 7)² = (-2)² = 4
(4 - 7)² = (-3)² = 9
(12 - 7)² = (5)² = 25
(7 - 7)² = (0)² = 0
(8 - 7)² = (1)² = 1
... etc ...
Step 3. Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by how many.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use "Sigma": Σ
The handy Sigma Notation says to sum up as many terms as we want:
Sigma Notation
We want to add up all the values from 1 to N, where N=20 in our case
because there are 20 values:
Example (continued):
Σ (x_i - 7)², for i from 1 to N
Which means: Sum all values from (x_1 - 7)² to (x_N - 7)²
We already calculated (x_1 - 7)² = 4 etc. in the previous step, so just sum
them up:
4+25+4+9+25+0+1+16+4+16+0+9+25+4+9+9+4+1+4+9 = 178
But that isn't the mean yet, we need to divide by how many, which is
simply done by multiplying by "1/N":
Example (continued):
Mean of squared differences = (1/20) × 178 = 8.9
(Note: this value is called the "Variance")
Step 4. Take the square root of that and you are done!
Example (concluded):
σ = √(8.9) = 2.983...
DONE!
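The four steps map directly onto a few lines of Python. This is just a sketch of the calculation above, with variable names of our own choosing.

```python
import math

flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]
n = len(flowers)                                  # N = 20 rose bushes

mu = sum(flowers) / n                             # Step 1: the mean
sum_sq = sum((x - mu) ** 2 for x in flowers)      # Step 2: squared differences, summed
variance = sum_sq / n                             # Step 3: mean of the squared differences
sigma = math.sqrt(variance)                       # Step 4: the square root

print(mu, sum_sq, variance, round(sigma, 3))  # 7.0 178.0 8.9 2.983
```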
Sample Standard Deviation
But wait, there is more ...
... sometimes your data is only a sample of the whole population.
Example: Sam has 20 rose bushes, but what if Sam only
counted the flowers on 6 of them?
The "population" is all 20 rose bushes,
and the "sample" is the 6 he counted. Let us say they were:
9, 2, 5, 4, 12, 7
We can still estimate the Standard Deviation.
But when you use the sample as an estimate of the whole population,
the Standard Deviation formula changes to this:
The formula for Sample Standard Deviation:
s = √( (1/(N-1)) × Σ (x_i - x̄)² )
The important change is "N-1" instead of "N" (which is called
"Bessel's correction").
The symbols also change to reflect that we are working on a sample instead
of the whole population:
The mean is now x̄ (for sample mean) instead of μ (the
population mean),
And the answer is s (for Sample Standard Deviation) instead of σ.
But that does not affect the calculations. Only N-1 instead of N changes
the calculations.
OK, let us now calculate the Sample Standard Deviation:
Step 1. Work out the mean
Example 2: Using sampled values 9, 2, 5, 4, 12, 7
The mean is (9+2+5+4+12+7) / 6 = 39/6 = 6.5
So:
x̄ = 6.5
Step 2. Then for each number: subtract the Mean and square the
result
Example 2 (continued):
(9 - 6.5)² = (2.5)² = 6.25
(2 - 6.5)² = (-4.5)² = 20.25
(5 - 6.5)² = (-1.5)² = 2.25
(4 - 6.5)² = (-2.5)² = 6.25
(12 - 6.5)² = (5.5)² = 30.25
(7 - 6.5)² = (0.5)² = 0.25
Step 3. Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by how many.
But hang on ... we are calculating the Sample Standard Deviation, so
instead of dividing by how many (N), we are going to divide by N-1
Example 2 (continued):
Sum = 6.25 + 20.25 + 2.25 + 6.25 + 30.25 + 0.25 = 65.5
Divide by N-1: (1/5) × 65.5 = 13.1
(This value is called the "Sample Variance")
Step 4. Take the square root of that and you are done!
Example 2 (concluded):
s = √(13.1) = 3.619...
DONE!
Comparing
When we used the whole population we got: Mean = 7, Standard Deviation
= 2.983...
When we used the sample we got: Sample Mean = 6.5, Sample Standard
Deviation = 3.619...
Our Sample Mean was wrong by 7%, and our Sample Standard Deviation
was wrong by 21%.
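The comparison can be reproduced with a short Python sketch. The helper std_dev and its sample flag are hypothetical names; the flag only switches between dividing by N and dividing by N-1.

```python
import math

def std_dev(values, sample=False):
    """Standard deviation; divides by N-1 when sample=True, else by N."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_sq / (n - 1 if sample else n))

population = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]
sample = population[:6]                      # the 6 bushes Sam counted

pop_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)
sigma = std_dev(population)
s = std_dev(sample, sample=True)

mean_error = abs(sample_mean - pop_mean) / pop_mean
sd_error = abs(s - sigma) / sigma
print(f"{mean_error:.0%} {sd_error:.0%}")  # 7% 21%
```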
Why Would We Take a Sample?
Mostly because it is easier and cheaper.
Imagine you want to know what the whole country thinks ... you can't
ask millions of people, so instead you ask maybe 1,000 people.
There is a nice quote (supposed to be by Samuel Johnson):
"You don't have to eat the whole ox to know that the meat is tough."
This is the essential idea of sampling. To find out information about the
population (such as mean and standard deviation), we do not need to look
at all members of the population; we only need a sample.
But when we take a sample, we lose some accuracy.
Summary
The Population Standard Deviation:
σ = √( (1/N) × Σ (x_i - μ)² )
The Sample Standard Deviation:
s = √( (1/(N-1)) × Σ (x_i - x̄)² )
Univariate and Bivariate Data
Univariate: one variable, Bivariate: two variables
Univariate means "one variable" (one type of data)
Example: Travel Time (minutes): 15, 29, 8, 42, 35, 21, 18, 42, 26
The variable is Travel Time
We can do lots of things with univariate data:
Find a central value using mean, median and mode
Find how spread out it is using range, quartiles and standard deviation
Make plots like Bar Graphs, Pie Charts and Histograms
Bivariate means "two variables", in other words there are two types of data
With bivariate data you have two sets of related data that you want
to compare:
Example:
An ice cream shop keeps track of how much ice cream they sell versus
the temperature on that day.
The two variables are Ice Cream Sales and Temperature.
Here are their figures for the last 12 days:
Ice Cream Sales vs Temperature
Temperature °C Ice Cream Sales
14.2° $215
16.4° $325
11.9° $185
15.2° $332
18.5° $406
22.1° $522
19.4° $412
25.1° $614
23.4° $544
18.1° $421
22.6° $445
17.2° $408
And here is the same data as a Scatter Plot:
It is now easy to see that warmer weather leads to more sales, but
the relationship is not perfect.
So with bivariate data we are interested in comparing the two sets of data
and finding any relationships.
We can use Tables, Scatter Plots, Correlation, Line of Best Fit, and plain old
common sense.
Scatter Plots
A graph of plotted points that show the
relationship between two sets of data.
In this example, each dot represents one person's
weight versus their height.
(The data is plotted on the graph as "Cartesian
(x,y) Coordinates")
Example:
The local ice cream shop keeps track of how much ice cream they sell versus
the temperature on that day. Here are their figures for the last 12 days:
Ice Cream Sales vs Temperature
Temperature °C Ice Cream Sales
14.2° $215
16.4° $325
11.9° $185
15.2° $332
18.5° $406
22.1° $522
19.4° $412
25.1° $614
23.4° $544
18.1° $421
22.6° $445
17.2° $408
And here is the same data as a Scatter Plot:
It is now easy to see that warmer weather leads to more sales, but the
relationship is not perfect.
Line of Best Fit
You can also draw a "Line of Best Fit" (also called a "Trend Line") on your
scatter plot:
Try to have the line as close as possible to all points, and as many points
above the line as below.
Example: Sea Level Rise
A Scatter Plot of Sea Level Rise:
And here I have drawn on a "Line of Best Fit".
Correlation
When the two sets of data are strongly linked together we say they have
a High Correlation.
The word Correlation is made of Co- (meaning "together"), and Relation
Correlation is Positive when the values increase together, and
Correlation is Negative when one value decreases as the other
increases
Like this:
Negative Correlation
Correlations can be negative, which means there is a correlation but one
value goes down as the other value increases.
Example : Birth Rate vs Income
The birth rate tends to be lower in richer
countries.
Below is a scatter plot for about 100 different
countries.
Country       Yearly Production per Person    Birth Rate
Madagascar    $800                            5.70
India         $3,100                          2.85
Mexico        $9,600                          2.49
Taiwan        $25,300                         1.57
Norway        $40,000                         1.78
It has a negative correlation (the line slopes down)
Note: I tried to fit a straight line to the data, but maybe a curve would work
better, what do you think?
Outliers
"Outliers" are values that "lie outside" the other values.
When we collect data, sometimes
there are values that are "far away"
from the main group of data ... what
do we do with them?
Example: Long Jump
A new coach has been working with the Long Jump team this month,
and the athletes' performance has changed. Augustus can now jump
0.15m further, June and Carol can jump 0.06m further.
Here are all the results:
Augustus: +0.15m
Tom: +0.11m
June: +0.06m
Carol: +0.06m
Bob: +0.12m
Sam: -0.56m
Oh no! Sam got worse.
Here are the results on the number line:
The mean is:
(0.15+0.11+0.06+0.06+0.12-0.56) / 6 = -0.06 / 6 = -0.01m
So, on average the performance went DOWN.
The coach is obviously useless ... right?
Sam's result is an "Outlier" ... what if we remove Sam's result?
Example: Long Jump (continued)
Let us try the results WITHOUT Sam:
Mean = (0.15+0.11+0.06+0.06+0.12) / 5 = 0.50 / 5 = 0.10m
Hey, the coach looks much better now!
But is that fair? Can we just get rid of values we don't like?
What To Do?
You need to think "why is that value over there?"
It may be quite normal to have high or low values
People can be short or tall
Some days there is no rain, other days there can be a downpour
Athletes can perform better or worse on different days
Or there may be an unusual reason for extreme data
Example: Long Jump (continued)
We find out that Sam was feeling sick that day. Not the coach's fault at
all.
So it is a good idea in this case to remove Sam's result.
When you remove outliers YOU are influencing the data, it is no longer
"pure", so you shouldn't just get rid of the outliers without a good reason!
And when you do get rid of them, explain what you are doing and why.
Mean, Median and Mode
We saw how outliers affect the mean, but what about the median or mode?
Example: Long Jump (continued)
The median ("middle" value):
including Sam is: 0.085
without Sam is: 0.11 (went up a little)
The mode (the most common value):
including Sam is: 0.06
without Sam is: 0.06 (stayed the same)
The mode and median didn't change very much.
They also stayed around where most of the data is.
So it seems that outliers have the biggest effect on the mean, and not so
much on the median or mode.
Hint: calculate the median and mode when you have outliers.
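If you want to try this yourself, here is a small Python sketch using the standard statistics module. It prints the mean, median and mode of the long jump changes with and without Sam's result, so you can see which of the three moves the most.

```python
import statistics

changes = {
    "Augustus": 0.15, "Tom": 0.11, "June": 0.06,
    "Carol": 0.06, "Bob": 0.12, "Sam": -0.56,
}

with_sam = list(changes.values())
without_sam = [value for name, value in changes.items() if name != "Sam"]

for label, data in (("including Sam", with_sam), ("without Sam", without_sam)):
    print(
        label,
        "mean:", round(statistics.mean(data), 3),
        "median:", round(statistics.median(data), 3),
        "mode:", statistics.mode(data),
    )
```

Note that the mean without Sam is divided by 5 (the number of results left), not 6.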
Correlation
When two sets of data are strongly linked together we say they have a High
Correlation.
The word Correlation is made of Co- (meaning "together"), and Relation
Correlation is Positive when the values increase together, and
Correlation is Negative when one value decreases as the other
increases
Like this:
Correlation can have a value:
1 is a perfect positive correlation
0 is no correlation (the values don't seem linked at all)
-1 is a perfect negative correlation
The value shows how good the correlation is (not how steep the line is),
and if it is positive or negative.
Example: Ice Cream Sales
The local ice cream shop keeps track of how much ice cream they sell versus
the temperature on that day. Here are their figures for the last 12 days:
Ice Cream Sales vs Temperature
Temperature °C Ice Cream Sales
14.2° $215
16.4° $325
11.9° $185
15.2° $332
18.5° $406
22.1° $522
19.4° $412
25.1° $614
23.4° $544
18.1° $421
22.6° $445
17.2° $408
And here is the same data as a Scatter Plot:
You can easily see that warmer weather leads to more sales; the relationship
is good but not perfect.
In fact the correlation is 0.9575 ... see at the end how I calculated
it.
Correlation Is Not Good at Curves
The correlation calculation only works well for relationships that follow a
straight line.
Our Ice Cream Example: there has been a heat wave!
It gets so hot that people aren't going near the shop, and sales start
dropping.
Here is the latest graph:
The correlation is now 0: "No Correlation" ... !
The calculated value of correlation is 0 (trust me, I worked it out),
which says there is "no correlation".
But we can see the data follows a nice curve that reaches a peak
around 25° C. But the correlation calculation is not "smart" enough to
see this.
Moral of the story: make a Scatter Plot, and look at it!
You may see more than the correlation value says.
Correlation Is Not Causation
"Correlation Is Not Causation" ... by that I mean: when there is a correlation
it does not mean that one thing causes the other
Example: Sunglasses vs Ice Cream
Our Ice Cream shop finds how many sunglasses were sold by a big
store for each day and compares them to their ice cream sales:
The correlation between Sunglasses and Ice Cream sales is
high
Does this mean that sunglasses make people want ice cream?
How To Calculate
How did I calculate the value 0.9575 at the top?
I used "Pearson's Correlation". There is software that can calculate it for you,
such as the CORREL() function in Excel or OpenOffice Calc ...
... but here is how to calculate it yourself:
Let us call the two sets of data "x" and "y" (in our case Temperature is x and
Ice Cream Sales is y):
Step 1: Find the mean of x, and the mean of y
Step 2: Subtract the mean of x from every x value (call them "a"), do
the same for y (call them "b")
Step 3: Calculate: a × b, a² and b² for every value
Step 4: Sum up a × b, sum up a² and sum up b²
Step 5: Divide the sum of a × b by the square root of [(sum of a²) ×
(sum of b²)]
Here is how I calculated the first Ice Cream example (values rounded to 1 or
0 decimal places):
As a formula it is:
Correlation = Σ(x - x̄)(y - ȳ) / √( Σ(x - x̄)² × Σ(y - ȳ)² )
Where:
Σ is Sigma, the symbol for "sum up"
(x - x̄) is each x-value minus the mean of x (called "a" above)
(y - ȳ) is each y-value minus the mean of y (called "b" above)
You probably won't have to calculate it like that, but at least you know it is
not "magic", but simply a routine set of calculations.
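The five steps can be followed line by line in Python. This is a minimal sketch; the function name pearson is our own, and the data is the ice cream table from the example.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation, following Steps 1 to 5."""
    mean_x = sum(xs) / len(xs)                      # Step 1: the two means
    mean_y = sum(ys) / len(ys)
    a = [x - mean_x for x in xs]                    # Step 2: subtract the means
    b = [y - mean_y for y in ys]
    sum_ab = sum(ai * bi for ai, bi in zip(a, b))   # Steps 3 and 4: products and sums
    sum_a2 = sum(ai * ai for ai in a)
    sum_b2 = sum(bi * bi for bi in b)
    return sum_ab / math.sqrt(sum_a2 * sum_b2)      # Step 5

temperature = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1, 19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [215, 325, 185, 332, 406, 522, 412, 614, 544, 421, 445, 408]

print(round(pearson(temperature, sales), 4))  # 0.9575
```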
Approximate Values
There are also other ways to calculate a correlation coefficient, such
as "Spearman's rank correlation coefficient" (which uses the rank order of
the values rather than the values themselves), but I prefer using a
spreadsheet like above.
Probability
How likely something is to happen.
Many events can't be predicted with total certainty. The best we can say is
how likely they are to happen, using the idea of probability.
Tossing a Coin
When a coin is tossed, there are two possible outcomes:
heads (H) or
tails (T)
We say that the probability of the coin landing H is ½.
And the probability of the coin landing T is ½.
Throwing Dice
When a single die is thrown, there are six possible
outcomes: 1, 2, 3, 4, 5, 6.
The probability of any one of them is 1/6.
Probability
In general:
Probability of an event happening = Number of ways it can happen / Total number of outcomes
Example: the chances of rolling a "4" with a die
Number of ways it can happen: 1 (there is only 1 face with a "4" on
it)
Total number of outcomes: 6 (there are 6 faces altogether)
So the probability = 1/6
Example: there are 5 marbles in a bag: 4 are blue, and 1 is
red. What is the probability that a blue marble will be
picked?
Number of ways it can happen: 4 (there are 4 blues)
Total number of outcomes: 5 (there are 5 marbles in total)
So the probability = 4/5 = 0.8
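The same formula can be written as a tiny Python function. The name probability is our own; Fraction keeps the answer as an exact fraction.

```python
from fractions import Fraction

def probability(ways, total_outcomes):
    """Number of ways it can happen divided by the total number of outcomes."""
    return Fraction(ways, total_outcomes)

p_four = probability(1, 6)   # rolling a "4" with a die
p_blue = probability(4, 5)   # picking a blue marble from 4 blue and 1 red

print(p_four, float(p_blue))  # 1/6 0.8
```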
Probability Line
You can show probability on a Probability Line:
Probability is always between 0 and 1
Probability is Just a Guide
Probability does not tell us exactly what will happen, it is just a guide
Example: toss a coin 100 times, how many Heads will come
up?
Probability says that heads have a ½ chance, so we would expect 50
Heads.
But when you actually try it out you might get 48 heads, or 55 heads ...
or anything really, but in most cases it will be a number near 50.
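You can try this on a computer too. The sketch below uses Python's random module to "toss" a coin 100 times, and repeats that five times. The seed value is an arbitrary choice that just makes the run repeatable; each count should usually land somewhere near 50, but not exactly on it.

```python
import random

def count_heads(tosses, rng):
    """Toss a fair coin the given number of times and count the Heads."""
    return sum(rng.random() < 0.5 for _ in range(tosses))

rng = random.Random(1)   # fixed seed (arbitrary) so the run can be repeated
results = [count_heads(100, rng) for _ in range(5)]
print(results)           # five counts of Heads, one per set of 100 tosses
```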
Learn more at Probability Index.
Words
Some words have special meaning in Probability:
Experiment or Trial: an action where the result is uncertain.
Tossing a coin, throwing dice, seeing what pizza people choose are all
examples of experiments.
Sample Space: all the possible outcomes of an experiment
Example: choosing a card from a deck
There are 52 cards in a deck (not including Jokers)
So the Sample Space is all 52 possible cards: {Ace of Hearts, 2 of
Hearts, etc... }
The Sample Space is made up of Sample Points:
Sample Point: just one of the possible outcomes
Example: Deck of Cards
the 5 of Clubs is a sample point
the King of Hearts is a sample point
"King" is not a sample point. As there are 4 Kings that is 4 different
sample points.
Event: a single result of an experiment
Example Events:
Getting a Tail when tossing a coin is an event
Rolling a "5" is an event.
An event can include one or more possible outcomes:
Choosing a "King" from a deck of cards (any of the 4 Kings) is an event
Rolling an "even number" (2, 4 or 6) is also an event
The Sample Space is all possible outcomes.
A Sample Point is just one possible outcome.
And an Event can be one or more of the possible outcomes.
Hey, let's use those words, so you get used to them:
Example: Alex decides to see how many times a "double"
would come up when throwing 2 dice.
Each time Alex throws the 2 dice is an Experiment.
It is an Experiment because the result is uncertain.
The Event Alex is looking for is a "double", where both dice have the
same number. It is made up of these 6 Sample Points:
{1,1} {2,2} {3,3} {4,4} {5,5} and {6,6}
The Sample Space is all possible outcomes (36 Sample Points):
{1,1} {1,2} {1,3} {1,4} ... {6,3} {6,4} {6,5} {6,6}
These are Alex's Results:
Experiment Is it a Double?
{3,4} No
{5,1} No
{2,2} Yes
{6,3} No
... ...
After 100 Experiments, Alex had 19 "double" Events ... is that close
to what you would expect?
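To answer that question, here is a sketch that lists the whole Sample Space in Python and counts the Sample Points in the "double" Event. The variable names are our own.

```python
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))       # all 36 Sample Points
doubles = [pair for pair in sample_space if pair[0] == pair[1]]

p_double = Fraction(len(doubles), len(sample_space))      # 6 out of 36
expected = 100 * p_double                                 # expected doubles in 100 throws

print(len(sample_space), len(doubles), p_double, round(float(expected), 1))
# 36 6 1/6 16.7
```

So about 17 doubles would be expected in 100 Experiments, and Alex's 19 is reasonably close to that.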
Probability Line
Probability is the chance that something will happen. It can be shown on a
line.
The probability of an event occurring is somewhere between impossible and
certain.
As well as words we can use numbers (such as fractions or decimals) to show
the probability of something happening:
Impossible is zero
Certain is one.
Here are some fractions on the probability line:
We can also show the chance that something will happen:
a) The sun will rise tomorrow.
b) I will not have to learn mathematics at school.
c) If I flip a coin it will land heads up.
d) Choosing a red ball from a sack with 1 red ball and 3 green balls
Between 0 and 1
The probability of an event will not be less than 0.
This is because 0 is impossible (sure that something will not
happen).
The probability of an event will not be more than 1.
This is because 1 is certain that something will happen.
The Basic Counting Principle
When there are m ways to do one thing,
and n ways to do another,
then there are m×n ways of doing both.
Example: you have 3 shirts and 4 pants.
That means 3×4=12 different outfits.
Example: There are 6 flavors of ice-cream, and 3 different cones.
That means 6×3=18 different single-scoop ice-creams you could order.
It also works when you have more than 2 choices:
Example: You are buying a new car.
There are 2 body styles:
sedan or hatchback
There are 5 colors available:
There are 3 models: GL (standard model),
SS (sports model with bigger engine)
SL (luxury model with leather seats)
How many total choices?
You can see in this "tree" diagram:
You can count the choices, or just do the simple calculation:
Total Choices = 2 × 5 × 3 = 30
Independent or Dependent?
But it only works when all choices are independent of each other.
If one choice affects another choice (i.e. depends on another choice), then a
simple multiplication is not right.
Example: You are buying a new car ... but ...
the salesman says "You can't choose black for the hatchback" ...
well then things change!
You now have only 27 choices.
Because your choices are not independent of each other.
But you can still make your life easier with this calculation:
Choices = 5×3 + 4×3 = 15 + 12 = 27
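Both counts can be checked by listing every combination in Python. In this sketch the color names other than black are made up for illustration, since only black is named in the example.

```python
from itertools import product

styles = ["sedan", "hatchback"]
colors = ["black", "white", "red", "blue", "silver"]   # hypothetical, apart from black
models = ["GL", "SS", "SL"]

all_choices = list(product(styles, colors, models))
print(len(all_choices))   # 30, the same as 2 × 5 × 3

# Now the choices are not independent: no black hatchbacks
allowed = [
    (style, color, model)
    for style, color, model in all_choices
    if not (style == "hatchback" and color == "black")
]
print(len(allowed))       # 27, the same as 5×3 + 4×3
```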
Relative Frequency
How often something happens divided by all outcomes.
Example: if your team has won 9 games from a total of 12 games
played:
the Frequency of winning is 9
the Relative Frequency of winning is 9/12 = 75%
All the Relative Frequencies add up to 1 (except for any rounding error).
Example: Travel Survey
92 people were asked how they got to work:
35 used a car
42 took public transport
8 rode a bicycle
7 walked
The Relative Frequencies (to 2 decimal places) are:
Car: 35/92 = 0.38
Public Transport: 42/92 = 0.46
Bicycle: 8/92 = 0.09
Walking: 7/92 = 0.08
0.38+0.46+0.09+0.08 = 1.01
(It would be exactly 1 if we had used perfect accuracy.)
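Here is the Travel Survey worked through in Python, as a small sketch. It shows that the unrounded Relative Frequencies add up to 1, and that the extra 0.01 comes only from rounding.

```python
survey = {"Car": 35, "Public Transport": 42, "Bicycle": 8, "Walking": 7}
total = sum(survey.values())                      # 92 people

relative = {mode: count / total for mode, count in survey.items()}
rounded = {mode: round(value, 2) for mode, value in relative.items()}

print(rounded)
# {'Car': 0.38, 'Public Transport': 0.46, 'Bicycle': 0.09, 'Walking': 0.08}
print(round(sum(rounded.values()), 2))    # 1.01 (rounding error)
print(round(sum(relative.values()), 10))  # 1.0
```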
Try it for yourself:
Activity: An Experiment with a Die
You will need:
A single die
Interesting point
Many people think that one of these cubes is called "a dice". But no!
The plural is dice, but the singular is die. (i.e. 1 die, 2 dice.)
The common die has six faces:
We usually call the faces 1, 2, 3, 4, 5 and 6.
High, Low, and Most Likely
Before we start, let's think about what might happen.
Question: If you roll a die:
1. What is the least possible score?
2. What is the greatest possible score?
3. What do you think is the most likely score?
The first two questions are quite easy to answer:
1. The least possible score must be 1
2. The greatest possible score must be 6
3. The most likely score is ... ???
Are they all just as likely? Or will some happen more often?
Let us see which is most likely ...
The Experiment
Throw a die 60 times,
record the scores in a tally table.
You can record the results in this table using tally marks:
Score Tally Frequency
1
2
3
4
5
6
Total Frequency = 60
OK, Go!
... ...
Finished ...?
Now draw a bar graph to illustrate
your results.
You can fill in this one:
Or you can use Data Graphs (Bar,
Line and Pie)
then print it out.
You may get something like this:
Are the bars all the same height?
If not ... why not?
60 Throws
OK, why did I ask you to make 60 throws? Well, only 6 throws would not
give you good results, 600 throws would have been too hard, so I chose 60,
which is 10 lots of 6.
So we should expect 10 of each number, like this:
Those are the theoretical values,
as opposed to the experimental ones you got from your experiment!
How do those theoretical results compare with your experimental
results?
This graph and your graph should be similar, but they are not likely to be
exactly the same, as your experiment relied on chance, and the number of
times you did it was fairly small.
If you did the experiment a very large number of times, you would get
results much closer to the theoretical ones.
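If you do not have a die handy, the experiment can be simulated. This Python sketch throws a pretend die 60 times and then 60,000 times; the seed is an arbitrary choice that makes the run repeatable. The small run will be uneven, while the large run should give each face a share close to 1/6 (about 0.167).

```python
import random
from collections import Counter

def throw_die(times, rng):
    """Return a Counter of how often each face came up."""
    return Counter(rng.randint(1, 6) for _ in range(times))

rng = random.Random(2015)        # fixed seed (arbitrary) so the run can be repeated
small = throw_die(60, rng)       # like the experiment above
large = throw_die(60_000, rng)   # a very large number of throws

for face in range(1, 7):
    # face, frequency in 60 throws, relative frequency in 60,000 throws
    print(face, small[face], round(large[face] / 60_000, 3))
```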
Questions
Which face came up most often? ____
Which face came up least often? ____
Do you think you would get the same results if you did this
again? Yes / No
An experiment gives results.
When done again it may give different results!
So it is important to know when results are good quality, or
just random.
Probability
On the page Probability you will find a formula:
Probability of an event happening = Number of ways it can happen / Total number of outcomes
Example: Probability of a 2
We know there are 6 possible outcomes.
And there is only 1 way to get a 2.
So the probability of getting 2 is:
Probability of a 2 = 1/6
Doing that for each score gets us:
Score Probability
1 1/6
2 1/6
3 1/6
4 1/6
5 1/6
6 1/6
Total = 1
The sum of all the probabilities is 1
For any experiment:
The sum of the probabilities of all possible outcomes is always equal to 1
Activity: An Experiment with Dice
We will be throwing two dice and adding the scores ...
You will need:
Two dice
Interesting point
Many people think that one of these cubes is called "a dice". But no!
The plural is dice, but the singular is die: i.e. 1 die, 2 dice.
The common die has six faces:
We usually call the faces 1, 2, 3, 4, 5 and 6.
We Will Be Throwing Two Dice and Adding the Scores ...
Example: if one die shows 2 and the other die shows 6, then the total
score would be 2 + 6 = 8
Question: Can you get a total of 8 any other way?
What about 6 + 2 = 8 (the other way around), is that a different
way?
Yes! Because the two dice are different.
Example: imagine one die is colored red and the other is
colored blue.
There are two possibilities:
So 2 + 6 and 6 + 2 are different.
And you can get 8 with other numbers, such as 3 + 5 = 8 and 4 +
4 = 8
High, Low, and Most Likely
Before we start, let's think about what might happen.
Question: If you throw 2 dice together and add the two scores:
1. What is the least possible total score?
2. What is the greatest possible total score?
3. What do you think is the most likely total score?
The first two questions are quite easy to answer:
1. The least possible total score must be 1 + 1 = 2
2. The greatest possible total score must be 6 + 6 = 12
3. The most likely total score is ... ???
Are they all just as likely? Or will some happen more often?
To help answer the third question let us try an experiment.
The Experiment
Throw two dice together 108 times,
add the scores together each time,
record the scores in a tally table.
Why 108? That seems a strange number to choose. I will explain later.
You can record the results in this table using tally marks:
Added
Scores Tally Frequency
2
3
4
5
6
7
8
9
10
11
12
Total Frequency = 108
OK, Go!
...
...
Finished ...?
Now draw a bar graph
to show your results.
Or you can use Data
Graphs (Bar, Line and
Pie) then print it out.
You may get something
like this:
Are the bars all about the same height?
If not ... why not?
So Why Did We Get That Shape?
The explanation is simple:
There is only one way to get a total of 2 (1 + 1),
but there are six ways of getting a total of 7 (1 + 6, 2 + 5, 3
+ 4, 4 + 3, 5 + 2 and 6 + 1)
Here is a table of all possible outcomes, and the totals. Notice the 7s
running along the diagonal.
                        Score on One Die
                     1    2    3    4    5    6
Score on the    1    2    3    4    5    6    7
Other Die       2    3    4    5    6    7    8
                3    4    5    6    7    8    9
                4    5    6    7    8    9   10
                5    6    7    8    9   10   11
                6    7    8    9   10   11   12
You can see there is only 1 way to get 2, there are 2 ways to get 3, and so
on.
Let us count the ways of getting each total and put them in a table:
Total Score    Number of Ways to Get Score
2 1
3 2
4 3
5 4
6 5
7 6
8 5
9 4
10 3
11 2
12 1
Total = 36
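The counting can also be checked by listing every pair of faces. A minimal sketch, assuming two ordinary six-sided dice (the name `ways` is just chosen for this example):

```python
from collections import Counter
from itertools import product

faces = range(1, 7)
# Every (first die, second die) pair: 6 x 6 = 36 outcomes
ways = Counter(a + b for a, b in product(faces, repeat=2))

for total in range(2, 13):
    print(total, ways[total])
print("Total =", sum(ways.values()))
```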
Can you see the Symmetry in this table?
2 and 12 have the same number of ways = 1 each
3 and 11 have the same number of ways = 2 each
4 and 10 have the same number of ways = 3 each
5 and 9 have the same number of ways = 4 each
6 and 8 have the same number of ways = 5 each
108 Throws
OK, why 108 throws? Well, only 36 throws would not give good results, 360
throws would be good but take a long time, so 108 (which is 3 lots of
36) seemed just right.
So let's multiply all these numbers by 3 to match our total of 108:
Total Score    Expected Frequency (Number of Ways × 3)
2 3
3 6
4 9
5 12
6 15
7 18
8 15
9 12
10 9
11 6
12 3
Total = 108
Those are the theoretical values, as opposed to the experimental ones
you got from your experiment.
The theoretical values look like this in a bar graph:
How do these theoretical results compare with your experimental
results?
This graph and your graph should be quite similar, but they are not likely to
be exactly the same, as your experiment relied on chance, and the number
of times you did it was fairly small.
If you did the experiment a very large number of times, you should get
results much closer to the theoretical ones.
And, by the way, we've now answered the question from near the beginning
of the experiment:
What is the most likely total score?
7 has the highest bar, so 7 is the most likely total score.
Hey, is that why people talk about Lucky 7 ... ?
Probability
On the page Probability you will find a formula:
Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)
Example: Probability of a total of 2
We know there are 36 possible outcomes.
And there is only 1 way to get a total score of 2.
So the probability of getting 2 is:
Probability of a total of 2 = 1/36
Doing that for each score gets us:
Total Score    Probability
2 1/36
3 2/36
4 3/36
5 4/36
6 5/36
7 6/36
8 5/36
9 4/36
10 3/36
11 2/36
12 1/36
Total = 1
(Note: I didn't simplify the fractions)
The sum of all the probabilities is 1
For any experiment:
The sum of the probabilities of all possible outcomes is always equal to 1
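Here is one way the table and the "sum is 1" rule could be checked with exact fractions. It is a sketch only, and the helper name `probability_of_total` is made up for this example. Note that Python's `Fraction` simplifies automatically, so 6/36 shows as 1/6.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # the 36 equally likely pairs

def probability_of_total(total):
    """Number of ways it can happen / total number of outcomes."""
    ways = sum(1 for a, b in outcomes if a + b == total)
    return Fraction(ways, len(outcomes))

probabilities = {t: probability_of_total(t) for t in range(2, 13)}
print(probabilities[2])             # 1/36
print(sum(probabilities.values()))  # 1
```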
Activity: Dropping a Coin onto a Grid
A few hundred years ago people enjoyed betting on coins tossed on to the
floor ... would they cross a line or not?
A man called "Buffon" (see "Buffon's Needle") started thinking about this and
worked out how to calculate the probability.
Now it is your turn to have a go!
You will need:
A small round coin,
such as a US penny, a 1c Euro or 5 Rupee.
A sheet of paper with a grid of 30 mm squares.
Steps
Measure the diameter of your coin: ____ mm
a US Penny is 19mm, a 1c Euro is 16.25mm, a Rs 5 is
23mm
Also measure the spacing of your grid (it may not print at
exactly 30mm): ____ mm
Put your sheet of paper on a flat surface such as a table top or
the floor.
From a height of about 5cm, drop the coin onto the paper and
record whether it lands:
A: Completely inside a square (not touching any grid
lines)
B: Crosses one or more lines
The exact height from which you drop the coin is not important, but don't
drop it so close to the paper that you are cheating!
If the coin rolls completely off the paper, then do not count that turn.
100 Times
Now we will drop the coin 100 times, but first ...
... what percentage do you think will land A, or B?
Make a guess (estimate) before you begin the experiment:
Your Guess for "A" (%):
Your Guess for "B" (%):
OK let's begin.
Drop the coin 100 times and record A (does not touch a line) or B (touches a
line) using Tally Marks:
Coin lands Tally Frequency Percentage
A
B
Totals: 100 100%
Now draw a Bar Graph to illustrate your results. You can create one at Data
Graphs (Bar, Line and Pie).
Are the bars the same height?
Did you expect them to be?
How does the result compare with your guess?
We Can Calculate What It Should Be ...
Here are some positions for the coin to land so it does not quite touch one
of the lines:
Place your coin on your grid (like above), and then put a mark on the paper
where the center of the coin is (just a rough estimate will do).
See how the coin's center is one radius r away from a
line.
(Read about a Circle's Radius and Diameter.)
Make lots of "center marks" then draw a box connecting them all like below:
d = Coin's diameter (2 × r)
When a coin's center is within the yellow box it won't touch any line.
The yellow box is smaller than the grid by two radiuses (= one diameter) of
the coin.
So what are the areas?
The area of the grid square is 30 × 30 = 900 mm²
The area of the yellow box is (30-d) × (30-d) = (30-d)² mm²
The above calculation was for a 30 mm grid, but we can use S for grid size:
The area of the grid square is S × S = S² mm²
The area of the yellow box is (S-d)² mm²
Example: A 1c Euro (d=16.25 mm) on a 29mm grid (S=29
mm):
Grid Square = 29² = 841 mm²
Yellow Box = (29-16.25)² = 12.75² = 162.6 mm² (to one decimal place)
So you should expect the coin to land not crossing a line of the grid
approximately:
"A" = 162.6 / 841 = 19.3% of the time
And "B" = 100% - 19.3% = 80.7%
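The same calculation can be wrapped in a small function for any grid and coin. This is a sketch under the assumption that the coin is smaller than the square. The name `chance_inside` is invented for this example.

```python
def chance_inside(grid_spacing, coin_diameter):
    """Chance that a coin lands completely inside a square ("A").

    Area of yellow box / area of grid square = (S - d)^2 / S^2.
    Assumes the coin is smaller than the square (d < S).
    """
    if not 0 < coin_diameter < grid_spacing:
        raise ValueError("the coin must be smaller than the grid square")
    return (grid_spacing - coin_diameter) ** 2 / grid_spacing ** 2

a = chance_inside(29, 16.25)   # the 1c Euro example
print(f"A = {a:.1%}, B = {1 - a:.1%}")   # A = 19.3%, B = 80.7%
```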
Now do the calculations for your own grid size and coin size.
Grid Spacing S (mm):
Diameter of Coin d (mm):
Area of Grid Square = S² (mm²):
Area of Yellow Box = (S-d)² (mm²):
"A" (%):
"B" (%):
How do these theoretical results compare with your experimental
results?
It won't be exact (because it is a random thing) but it may be close.
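To see how "close" behaves as the number of drops grows, the experiment can be imitated on a computer. This sketch assumes the coin's center is equally likely to land anywhere in a square, which is a simplification of a real drop. The function name and the seed are choices made for this example.

```python
import random

def simulate_coin_drops(grid_spacing, coin_diameter, drops, seed=None):
    """Fraction of drops landing completely inside a square ("A")."""
    rng = random.Random(seed)
    r = coin_diameter / 2
    inside = 0
    for _ in range(drops):
        # Position of the coin's center within one grid square
        x = rng.uniform(0, grid_spacing)
        y = rng.uniform(0, grid_spacing)
        # The coin misses every line when its center is at least r from each edge
        if r < x < grid_spacing - r and r < y < grid_spacing - r:
            inside += 1
    return inside / drops

for drops in (100, 10_000, 100_000):
    share = simulate_coin_drops(29, 16.25, drops, seed=0)
    print(f"{drops:>7} drops: A = {share:.1%}")
```

With only 100 drops the result can be several percent away from the theoretical 19.3%, and it usually settles closer as the number of drops increases.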
Different Sizes of Coin
Try repeating the experiment using a different sized coin.
First calculate the theoretical value ... how does this affect the values
for A and B?
Then do the experiment to see how close it gets.
What You Have Done
You have (hopefully) had fun running an experiment.
You have done some geometry, and had some experience calculating areas
and probabilities.
And you have seen the relationship between theory and reality.
Activity: Buffon's Needle
A few hundred years ago people enjoyed betting on coins tossed on to the
floor ... would they cross a line or not?
A man called "Buffon" started thinking about this and worked out
the probability. It is called "Buffon's Needle" in his honor.
Now it is your turn to have a go!
You will need:
A match, with the head cut off.
It must be less than 50 mm.
(You can use a needle, but be careful!)
A sheet of paper with lines 50 mm apart.
Steps
Measure the spacing of your lines (it may not print at exactly
50mm): ____ mm
Measure the length of your match (must be less than the line
spacing): ____ mm
Make sure your sheet of paper is on a flat surface such as a
table top or the floor.
From a height of about 5cm, drop the match onto the paper
and record whether it lands:
A: Not touching a line
B: Touching or crossing a line
The exact height from which you drop the match is not important, but don't
drop it so close to the paper that you are cheating!
If the match rolls completely off the paper, then do not count that turn.
100 Times
Now we will drop the match 100 times, but first ...
... what percentage do you think will land A, or B?
Make a guess (estimate) before you begin the experiment:
Your Guess for "A" (%):
Your Guess for "B" (%):
OK let's begin.
Drop the match 100 times and record A (does not touch a line)
or B (touches or crosses a line) using Tally Marks:
match lands Tally Frequency Percentage
A (no touch)
B (crosses)
Totals: 100 100%
Now draw a Bar Graph to illustrate your results. You can create one at Data
Graphs (Bar, Line and Pie).
Are the bars the same height?
Did you expect them to be?
How does the result compare with your guess?
We Can Calculate What It Should Be ...
Buffon used the results from his experiment with a needle to estimate the
value of π (Pi). He worked out this formula:
π ≈ 2L / (xp)
Where
L is the length of the needle (or match in our case)
x is the line spacing (50 mm for us)
p is the proportion of needles crossing a line (case B)
But today we are going to "change the subject" of the formula to work out
the "p" (the proportion of B):
Start with: π ≈ 2L / (xp)
multiply both sides by p: πp ≈ 2L / x
divide both sides by π: p ≈ 2L / (πx)
And we get:
p = 2L / (πx)
Example: John had a match of length 36 mm, and a 50 mm
line spacing.
So John has:
L = 36
x = 50
Substituting these values into the formula, John got:
p = (2 × 36) / (π × 50) = 0.458...
So John should expect the match to cross a line (case B) about 46 times out
of 100
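John's calculation can be repeated for any match and line spacing. A minimal sketch, where the function name `chance_of_crossing` is an assumption of this example:

```python
import math

def chance_of_crossing(match_length, line_spacing):
    """p = 2L / (pi * x), valid when the match is no longer than the spacing."""
    if not 0 < match_length <= line_spacing:
        raise ValueError("the match must not be longer than the line spacing")
    return 2 * match_length / (math.pi * line_spacing)

p = chance_of_crossing(36, 50)   # John's match
print(round(p, 3))               # 0.458
```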
Fill in the following table using your own results:
Length of match "L" (mm):
Line Spacing "x" (mm):
Estimate for p (= 2L / (πx)):
How close were you?
It won't be exact (because it is a random thing) but it may be close.
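The match experiment can also be imitated in code, and the result turned back into an estimate of π the way Buffon did. This is a sketch under two assumptions: the match's center is equally likely to be anywhere between two lines, and every angle is equally likely. One caveat: the code uses `math.pi` to pick the random angle, so it shows the idea rather than being a true way to discover π.

```python
import math
import random

def drop_matches(match_length, line_spacing, drops, seed=None):
    """Proportion of dropped matches that touch or cross a line (case B)."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(drops):
        # Distance from the match's center to the nearest line
        center = rng.uniform(0, line_spacing / 2)
        # Acute angle between the match and the lines
        angle = rng.uniform(0, math.pi / 2)
        if center <= (match_length / 2) * math.sin(angle):
            crossings += 1
    return crossings / drops

L, x = 36, 50
p = drop_matches(L, x, drops=100_000, seed=0)
print("simulated p:", p)
print("estimate of pi:", 2 * L / (x * p))
```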
Different Size of Match
Try repeating the experiment using a different sized match (but not larger
than the line spacing!)
Did you get better or worse results?
What You Have Done
You have (hopefully) had fun running an experiment.
You have had some experience with calculations.
And you have seen the relationship between theory and reality.
Random Words
Probability and English ... what a mix!
Random Letters
You would think it was easy to create random words ... just pick letters
randomly and put them together, and voila! a random word.
Well, here are 20 words made that way:
tldkl oewkx dmwol vuptg hvwjk naqid avypr zwtip
zgnzs bvdhd
muyfd ighgd xhlng oyecn vjnsl ssjrx gxald tukxj
rvfoq yxzxq
It turns out that the words are not only nonsense, but quite hard to
pronounce!
(Try saying "tldkl" or "oewkx")
You see, getting a real word this way is very unlikely ... you would have to
try lots of random combinations before getting lucky.
Why? Well, English has around 200,000 words (228,000 in the Oxford
English Dictionary, including many words no longer used) ... but how many
different words can be made with just 5 letters?
26 × 26 × 26 × 26 × 26 = 11,881,376 possible 5 letter words!
And that is just the 5 letter words ...
Let us guess that there are 40,000 words in English that have 5 letters. So
the probability of making a real word just randomly would be:
40,000 / 11,881,376 = 0.003, or about 0.3% chance
So real words are rare. And we can see that putting random letters
together is very unlikely to produce a real word.
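The arithmetic is quick to check. In this short sketch `real_words_guess` is just the 40,000 guess from the text, not a measured count:

```python
possible = 26 ** 5              # one of 26 letters in each of 5 positions
real_words_guess = 40_000       # the guess used in the text, not a measured count

print(f"{possible:,}")                       # 11,881,376
print(f"{real_words_guess / possible:.1%}")  # 0.3%
```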
Vowels
We can improve our success by insisting that a word have at least one vowel,
since nearly every word in English has one (except fly, by and a few others).
Like this:
ectot gjaqv kuifg vzicu zspsu pdidb wqdis uerrs
ucgej okimw
fnevz ewxko ljgew aglgo jpfoq dcytu uwkcj dzioy
wekdx xuybk
This is a great improvement. More words can be pronounced.
But there are still lots of strange words like "zspsu" and "xuybk"
Letter Frequency
So, our next improvement is to use fewer of the letters like j, x, z and q
and more of the letters like e, t and s.
In fact the frequency of letters in the English Language is well known.
Here is how many times you would expect to see a letter in every 1,000
letters:
   a   b   c   d   e   f   g   h   i   j   k   l   m
  82  15  28  42 127  22  20  61  70   2   8  40  24
   n   o   p   q   r   s   t   u   v   w   x   y   z
  67  75  19   1  60  63  90  27  10  24   2  20   1
Can you see that "e" is common, but "z" is rare?
"e" is likely to occur 127 times in every 1,000, or as a ratio 127/1000 =
0.127 (=12.7%)
"z" is likely to occur only 1 time in every 1,000, or as a ratio 1/1000 =
0.001 (=0.1%)
So, by selecting letters based on that frequency (a bit like rolling a
1,000-sided die that has 82 a's, 15 b's ... and only one z on its faces), we
can get output like this:
elnao etgov segty laast aessn siuon oenha eaoas
ncoot ctwka
dmswo dpuoh eewis ebdni laarm syucs idvos lhina
igahh soyie
Still no real words, but some are close. And most of them can be
pronounced. (Great names if you are writing a science fiction novel!)
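The three methods can be sketched in Python. The letter counts are copied from the table above. Everything else is an assumption of this example: the function names, treating only a, e, i, o, u as vowels, and "keep trying until there is a vowel" as the way to insist on one.

```python
import random
import string

# Occurrences per 1,000 letters, copied from the table above (a..z)
PER_THOUSAND = [82, 15, 28, 42, 127, 22, 20, 61, 70, 2, 8, 40, 24,
                67, 75, 19, 1, 60, 63, 90, 27, 10, 24, 2, 20, 1]
LETTERS = string.ascii_lowercase
VOWELS = set("aeiou")

def random_letters(rng, length=5):
    """Method 1: every letter equally likely."""
    return "".join(rng.choice(LETTERS) for _ in range(length))

def with_a_vowel(rng, length=5):
    """Method 2: keep trying until the word has at least one vowel."""
    while True:
        word = random_letters(rng, length)
        if VOWELS & set(word):
            return word

def by_frequency(rng, length=5):
    """Method 3: pick letters weighted by how common they are in English."""
    return "".join(rng.choices(LETTERS, weights=PER_THOUSAND, k=length))

rng = random.Random(2015)  # arbitrary seed so the output is repeatable
for method in (random_letters, with_a_vowel, by_frequency):
    print(method.__name__, [method(rng) for _ in range(5)])
```

Run it a few times with different seeds and see if you can get lucky and find a real word.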
Probability: Complement
Complement of an Event: All outcomes that are NOT the event.
When the event is Heads, the complement is Tails
When the event is {Monday, Wednesday} the complement
is {Tuesday, Thursday, Friday, Saturday, Sunday}
When the event is {Hearts} the complement is {Spades,
Clubs, Diamonds, Jokers}
So the Complement of an event is all the other outcomes (not the ones you
want).
And together the Event and its Complement make all possible outcomes.
Probability
Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)
Example: the chances of rolling a "4" with a die
Number of ways it can happen: 1 (there is only 1 face with a "4" on it)
Total number of outcomes: 6 (there are 6 faces altogether)
So the probability = 1/6
The probability of an event is shown using "P":
P(A) means "Probability of Event A"
The complement is shown by a little ' mark such as A' (or
sometimes Aᶜ or Ā):
P(A') means "Probability of the complement of Event A"
The two probabilities always add to 1
P(A) + P(A') = 1
Example: Rolling a "5" or "6"
Event A: {5, 6}
Number of ways it can happen: 2
Total number of outcomes: 6
P(A) = 2/6 = 1/3
The Complement of Event A is {1, 2, 3, 4}
Number of ways it can happen: 4
Total number of outcomes: 6
P(A') = 4/6 = 2/3
Let us add them:
P(A) + P(A') = 1/3 + 2/3 = 3/3 = 1
Yep, that makes 1
It makes sense, right? Event A plus all outcomes that are not Event A make
up all possible outcomes.
Why is the Complement Useful?
It is sometimes easier to work out the complement first.
Example. Throw two dice. What is the probability the two scores
are different?
Different scores are like getting a 2 and 3, or a 6 and 1. It is quite a long
list:
A = { (1,2), (1,3), (1,4), (1,5), (1,6),
(2,1), (2,3), (2,4), ... etc ! }
But the complement (which is when the two scores are the same) is only 6
outcomes:
A' = { (1,1), (2,2), (3,3), (4,4), (5,5), (6,6) }
And the probability is easy to work out:
P(A') = 6/36 = 1/6
Knowing that P(A) and P(A') together make 1, we can calculate:
P(A) = 1 - P(A') = 1 - 1/6 = 5/6
So in this case it's easier to work out P(A') first, then find P(A)
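Both routes can be compared by listing all 36 outcomes. A small sketch (the variable names are chosen for this example) that works out P(A') first and then cross-checks against the long list:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))      # 36 pairs
same = [pair for pair in outcomes if pair[0] == pair[1]]

p_same = Fraction(len(same), len(outcomes))          # P(A') = 6/36
p_different = 1 - p_same                             # P(A) = 1 - P(A')

# Cross-check by counting the long list directly
different = [pair for pair in outcomes if pair[0] != pair[1]]
assert p_different == Fraction(len(different), len(outcomes))
print(p_same, p_different)   # 1/6 5/6
```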
Probability: Types of Events
Events can be Independent, Mutually Exclusive or Conditional !
Life is full of random events!
You need to get a "feel" for them to be a smart and successful person.
The toss of a coin, the throw of a die and lottery draws are all examples of
random events.
Events
When we say "Event" we mean one (or more) outcomes.
Example Events:
Getting a Tail when tossing a coin is an event
Rolling a "5" is an event.
An event can include several outcomes:
Choosing a "King" from a deck of cards (any of the 4 Kings) is also an
event
Rolling an "even number" (2, 4 or 6) is an event
Independent Events
Events can be "Independent", meaning each event is not affected by any
other events.
This is an important idea! A coin does not "know" that it came up heads
before ... each toss of a coin is a perfect isolated thing.
Example: You toss a coin three times and it comes up "Heads" each
time ... what is the chance that the next toss will also be a "Head"?
The chance is simply 1/2, or 50%, just like ANY OTHER toss of
the coin.
What it did in the past will not affect the current toss!
Some people think "it is overdue for a Tail", but really truly the next toss of
the coin is totally independent of any previous tosses.
Saying "a Tail is due", or "just one more go, my luck is due" is
called The Gambler's Fallacy
(Learn more at Independent Events.)
Dependent Events
But some events can be "dependent" ... which means they can be affected
by previous events ...
Example: Drawing 2 Cards from a Deck
After taking one card from the deck there are fewer cards available, so
the probabilities change!
Let's say you are interested in the chances of getting a King.
For the 1st card the chance of drawing a King is 4 out of 52
But for the 2nd card:
If the 1st card was a King, then the 2nd card is less likely to be a King,
as only 3 of the 51 cards left are Kings.
If the 1st card was not a King, then the 2nd card is slightly more likely
to be a King, as 4 of the 51 cards left are Kings.
This is because you are removing cards from the deck.
Replacement: When you put each card back after drawing it the chances
don't change, as the events are independent.
Without Replacement: The chances will change, and the events
are dependent.
You can learn more about this at Dependent Events: Conditional Probability
Tree Diagrams
When you have Dependent Events it helps to make a "Tree Diagram"
Example: Soccer Game
You are off to soccer, and love being the Goalkeeper, but that depends
who is the Coach today:
with Coach Sam your probability of being Goalkeeper is 0.5
with Coach Alex your probability of being Goalkeeper is 0.3
Sam is Coach more often ... about 6 of every 10 games (a probability
of 0.6).
Let's build the Tree Diagram!
Start with the Coaches. We know 0.6 for Sam, so it must be 0.4 for
Alex (the probabilities must add to 1):
Then fill out the branches for Sam (0.5 Yes and 0.5 No), and then for
Alex (0.3 Yes and 0.7 No):
Now it is neatly laid out we could calculate probabilities (read more at
"Tree Diagrams").
Mutually Exclusive
Mutually Exclusive means you can't get both events at the same
time.
It is either one or the other, but not both
Examples:
Turning left or right are Mutually Exclusive (you can't do both at the
same time)
Heads and Tails are Mutually Exclusive
Kings and Aces are Mutually Exclusive
What isn't Mutually Exclusive
Kings and Hearts are not Mutually Exclusive, because you can have a
King of Hearts!
Like here:
Aces and Kings are Mutually Exclusive
Hearts and Kings are not Mutually Exclusive
Read more at Mutually Exclusive Events
Probability: Independent Events
Life is full of random events!
You need to get a "feel" for them to be a smart and successful person.
The toss of a coin, throwing dice and lottery draws are all examples of
random events.
Sometimes an event can affect the next event.
Example: taking colored marbles from a bag: as you take each marble
there are fewer marbles left in the bag, so the probabilities change.
We call those Dependent Events, because what happens depends on what
happened before (learn more about this at Conditional probability).
But otherwise they are Independent Events ...
Independent Events
Independent Events are not affected by previous events.
This is an important idea!
A coin does not "know" it came up heads before ...
... each toss of a coin is a perfect isolated thing.
Example: You toss a coin and it comes up "Heads" three
times ... what is the chance that the next toss will also be a
"Head"?
The chance is simply ½ (or 0.5) just like ANY toss of the coin.
What it did in the past will not affect the current toss!
Some people think "it is overdue for a Tail", but really truly the next toss of
the coin is totally independent of any previous tosses.
Saying "a Tail is due", or "just one more go, my luck is due" is
called The Gambler's Fallacy
Of course your luck may change, because each toss of the coin has an equal
chance.
Probability of Independent Events
"Probability" (or "Chance") is how likely something is to happen.
So how do we calculate probability?
Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)
Example: what is the probability of getting a "Head" when
tossing a coin?
Number of ways it can happen: 1 (Head)
Total number of outcomes: 2 (Head and Tail)
So the probability = 1/2 = 0.5
Example: what is the probability of getting a "5" or "6" when
rolling a die?
Number of ways it can happen: 2 ("5" and "6")
Total number of outcomes: 6 ("1", "2", "3", "4", "5" and "6")
So the probability = 2/6 = 1/3 = 0.333...
Ways of Showing Probability
Probability goes from 0 (impossible) to 1 (certain):
It is often shown as a decimal or fraction.
Example: the probability of getting a "Head" when tossing a coin:
As a decimal: 0.5
As a fraction: 1/2
As a percentage: 50%
Or sometimes like this: 1-in-2
Two or More Events
You can calculate the chances of two or more independent events
by multiplying the chances.
Example: Probability of 3 Heads in a Row
For each toss of a coin a "Head" has a probability of 0.5:
0.5 × 0.5 × 0.5 = 0.125
And so the chance of getting 3 Heads in a row is 0.125
So each toss of a coin has a ½ chance of being Heads, but lots of Heads in
a row is unlikely.
Example: Why is it unlikely to get, say, 7 heads in a row,
when each toss of a coin has a ½ chance of being Heads?
Because you are asking two different questions:
Question 1: What is the probability of 7 heads in a row?
Answer: ½×½×½×½×½×½×½ = 0.0078125 (less than 1%).
Question 2: Given that you have just got 6 heads in a row, what is
the probability that the next toss is also a head?
Answer: ½, as the previous tosses don't affect the next toss.
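The two questions can be told apart in code as well. The first part multiplies the chances. The second part is a rough simulation (the seed and the 200,000 repeats are arbitrary choices for this sketch) that only looks at the 7th toss in runs where the first 6 tosses were all Heads.

```python
import random

def chance_of_heads_in_a_row(n):
    """Multiply the 1/2 chance once for each independent toss."""
    return 0.5 ** n

print(chance_of_heads_in_a_row(3))   # 0.125
print(chance_of_heads_in_a_row(7))   # 0.0078125

rng = random.Random(7)
after_six_heads = []
for _ in range(200_000):
    tosses = [rng.random() < 0.5 for _ in range(7)]  # True means Heads
    if all(tosses[:6]):
        after_six_heads.append(tosses[6])

# Share of Heads on the 7th toss, given 6 Heads already happened
share = sum(after_six_heads) / len(after_six_heads)
print(share)
```

The share should come out near one half, not near 0.0078, because the 7th toss does not depend on the first six.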
You can have a play with the Quincunx to see how lots of independent effects
can still have a pattern.
Notation
We use "P" to mean "Probability Of",
So, for Independent Events:
P(A and B) = P(A) × P(B)
Probability of A and B equals the probability of A times the probability of B
Example: you are going to a concert, and your friend says it
is some time on the weekend between 4 and 12, but won't
say more.
What are the chances it is on Sunday between 10 and 12?
Day: there are two days on the weekend, so P(Sunday) = 0.5
Time: between 4 and 12 is 8 hours, but you want between 10 and 12
which is only 2 hours:
P(Your Time) = 2/8 = 0.25
And:
P(Sunday and Your Time) = P(Sunday) × P(Your Time) = 0.5 × 0.25
= 0.125
Or a 12.5% chance
Another Example
Imagine there are two groups:
A member of each group gets randomly chosen for the winners circle,
then one of those gets randomly chosen to get the big money prize:
What is your chance of winning the big prize?
there is a 1/5 chance of going to the winners circle
and a 1/2 chance of winning the big prize
So you have a 1/5 chance followed by a 1/2 chance ... which makes a 1/10
chance overall:
1/5 × 1/2 = 1/(5 × 2) = 1/10
Or you can calculate using decimals (1/5 is 0.2, and 1/2 is 0.5):
0.2 x 0.5 = 0.1
So your chance of winning the big money is 0.1 (which is the same as 1/10).
Coincidence!
Many "Coincidences" are, in fact, likely.
Example: you are in a room with 30 people, and find that
Zach and Anna celebrate their birthday on the same day.
Would you say "wow, how strange", or "that seems reasonable, with so
many people here".
In fact there is a 70% chance that would happen ... so it is likely.
Why is the chance so high?
Because you are comparing everyone to everyone else (not just one to many).
And with 30 people that is 435 comparisons
(Read Shared Birthdays to find out more.)
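Both numbers can be checked with a short calculation. This sketch assumes 365 equally likely birthdays (no leap years), and the function name is chosen for this example. It uses the "work out the chance of no match first" idea.

```python
from math import comb

def chance_of_shared_birthday(people, days=365):
    """1 minus the chance that everyone's birthday is different.

    Assumes 365 equally likely birthdays (no leap years).
    """
    all_different = 1.0
    for i in range(people):
        all_different *= (days - i) / days
    return 1 - all_different

print(comb(30, 2))                              # 435 comparisons
print(f"{chance_of_shared_birthday(30):.1%}")   # 70.6%
```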
Example: Snap!
Did you ever say something the same as someone else, at the same
time too?
Wow, how amazing!
But you were probably sharing an experience (movie, journey,
whatever) and so your thoughts would be similar.
And there are only so many ways of saying something ...
... so it is like the card game "Snap!" ...
... if you speak enough words together, they will eventually match up.
So, maybe not so amazing, just simple chance at work.
Can you think of other cases where a "coincidence" was simply a likely thing?
Conclusion
Probability is: (Number of ways it can happen) / (Total number
of outcomes)
Dependent Events (such as removing marbles from a bag) are
affected by previous events
Independent events (such as a coin toss) are not affected by
previous events
You can calculate the probability of 2 or
more Independent events by multiplying
Not all coincidences are really unlikely (when you think about
them).
Conditional Probability
How to handle Dependent Events
Life is full of random events! You need to get a "feel" for them to be a smart
and successful person.
Independent Events
Events can be "Independent", meaning each event is not affected by any
other events.
Example: Tossing a coin.
Each toss of a coin is a perfect isolated thing.
What it did in the past will not affect the current toss.
The chance is simply 1-in-2, or 50%, just like ANY toss of the coin.
So each toss is an Independent Event.
Dependent Events
But events can also be "dependent" ... which means they can be affected
by previous events ...
Example: Marbles in a Bag
2 blue and 3 red marbles are in a bag.
What are the chances of getting a blue marble?
The chance is 2 in 5
But after taking one out you change the chances!
So the next time:
if you got a red marble before, then the chance of a blue marble next
is 2 in 4
if you got a blue marble before, then the chance of a blue marble next
is 1 in 4
See how the chances change each time? Each event depends on what
happened in the previous event, and is called dependent.
That is the kind of thing we will be looking at here.
"Replacement"
Note: if you had replaced the marbles in the bag each time, then the
chances would not have changed and the events would be independent:
With Replacement: the events are Independent (the chances
don't change)
Without Replacement: the events are Dependent (the chances
change)
Tree Diagram
A Tree Diagram is a wonderful way to picture what is going on, so let's build
one for our marbles example.
There is a 2/5 chance of pulling out a Blue marble, and a 3/5 chance for Red:
We can even go one step further and see what happens when we select a
second marble:
If a blue marble was selected first there is now a 1/4 chance of getting a blue
marble and a 3/4 chance of getting a red marble.
If a red marble was selected first there is now a 2/4 chance of getting a blue
marble and a 2/4 chance of getting a red marble.
Now we can answer questions like "What are the chances of drawing 2
blue marbles?"
Answer: it is a 2/5 chance followed by a 1/4 chance:
2/5 × 1/4 = 2/20 = 1/10
Did you see how we multiplied the chances? And got 1/10 as a result.
The chances of drawing 2 blue marbles is 1/10
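The 1/10 can be confirmed by listing every way of taking two marbles out of the bag, without replacement. A minimal sketch (variable names are chosen for this example):

```python
from fractions import Fraction
from itertools import permutations

bag = ["blue", "blue", "red", "red", "red"]
# Every ordered way of taking 2 different marbles out of the bag: 5 x 4 = 20
draws = list(permutations(range(len(bag)), 2))
both_blue = [d for d in draws if bag[d[0]] == "blue" and bag[d[1]] == "blue"]

p_by_counting = Fraction(len(both_blue), len(draws))
p_by_tree = Fraction(2, 5) * Fraction(1, 4)   # multiply along the branch
print(p_by_counting, p_by_tree)               # 1/10 1/10
```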
Notation
We love notation in mathematics! It means we can then use the power of
algebra to play around with the ideas. So here is the notation for probability:
P(A) means "Probability Of Event A"
In our marbles example Event A is "get a Blue Marble first" with a probability
of 2/5:
P(A) = 2/5
And Event B is "get a Blue Marble second" ... but for that we have 2 choices:
If we got a Blue Marble first the chance is now 1/4
If we got a Red Marble first the chance is now 2/4
So we have to say which one we want, and use the symbol "|" to mean
"given":
P(B|A) means "Event B given Event A"
In other words, event A has already happened, now what is the chance of
event B?
P(B|A) is also called the "Conditional Probability" of B given A.
And in our case:
P(B|A) = 1/4
So the probability of getting 2 blue marbles is:
P(A) x P(B|A) = 2/5 x 1/4 = 1/10
And we write it as
P(A and B) = P(A) x P(B|A)
"Probability of event A and event B equals
the probability of event A times the probability of event B given event A"
Let's do the next example using only notation:
Example: Drawing 2 Kings from a Deck
Event A is drawing a King first, and Event B is drawing a King second.
For the first card the chance of drawing a King is 4 out of 52
P(A) = 4/52
But after removing a King from the deck the 2nd card drawn is less
likely to be a King (only 3 of the 51 cards left are Kings):
P(B|A) = 3/51
And so:
P(A and B) = P(A) x P(B|A) = (4/52) x (3/51) = 12/2652
= 1/221
So the chance of getting 2 Kings is 1 in 221, or about 0.5%
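The same steps in code, using exact fractions. The variable names simply mirror the notation and are a choice made for this sketch:

```python
from fractions import Fraction

p_a = Fraction(4, 52)          # P(A): King on the first card
p_b_given_a = Fraction(3, 51)  # P(B|A): King on the second, given the first was a King
p_a_and_b = p_a * p_b_given_a  # P(A and B) = P(A) x P(B|A)

print(p_a_and_b)                  # 1/221
print(f"{float(p_a_and_b):.2%}")  # 0.45%
```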
Finding Hidden Data
Using Algebra we can also "change the subject" of the formula, like this:
Start with: P(A and B) = P(A) x P(B|A)
Swap sides: P(A) x P(B|A) = P(A and B)
Divide by P(A): P(B|A) = P(A and B) / P(A)
And we have another useful formula:
P(B|A) = P(A and B) / P(A)
"The probability of event B given event A equals
the probability of event A and event B divided by the probability of event
A"
Example: Ice Cream
70% of your friends like Chocolate, and 35% like Chocolate AND like
Strawberry.
What percent of those who like Chocolate also like Strawberry?
P(Strawberry|Chocolate) = P(Chocolate and Strawberry) /
P(Chocolate)
0.35 / 0.7 = 50%
50% of your friends who like Chocolate also like Strawberry
Big Example: Soccer Game
You are off to soccer, and want to be the Goalkeeper, but that depends who
is the Coach today:
with Coach Sam the probability of being Goalkeeper is 0.5
with Coach Alex the probability of being Goalkeeper is 0.3
Sam is Coach more often ... about 6 out of every 10 games (a probability
of 0.6).
So, what is the probability you will be a Goalkeeper today?
Let's build a tree diagram. First we show the two possible coaches: Sam or
Alex:
The probability of getting Sam is 0.6, so the probability of Alex must be 0.4
(together the probability is 1)
Now, if you get Sam, there is 0.5 probability of being Goalie (and 0.5 of not
being Goalie):
If you get Alex, there is 0.3 probability of being Goalie (and 0.7 not):
The tree diagram is complete, now let's calculate the overall probabilities.
Remember that:
P(A and B) = P(A) x P(B|A)
Here is how to do it for the "Sam, Yes" branch:
0.6 x 0.5 = 0.3
(When we take the 0.6 chance of Sam being coach and include the 0.5
chance that Sam will let you be Goalkeeper we end up with an 0.3 chance.)
But we are not done yet! We haven't included Alex as Coach:
An 0.4 chance of Alex as Coach, followed by an 0.3 chance gives 0.12
And the two "Yes" branches of the tree together make:
0.3 + 0.12 = 0.42 probability of being a Goalkeeper today
(That is a 42% chance)
Check
One final step: complete the calculations and make sure they add to 1:
0.3 + 0.3 + 0.12 + 0.28 = 1
Yes, they add to 1, so that looks right.
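The whole tree fits in a few lines of Python. This is a sketch only: storing the tree as a dictionary of (coach probability, Goalkeeper probability) pairs is a choice made for this example.

```python
# Each branch: (probability of the coach, probability of Goalkeeper given that coach)
coaches = {"Sam": (0.6, 0.5), "Alex": (0.4, 0.3)}

leaves = {}
for coach, (p_coach, p_yes) in coaches.items():
    leaves[(coach, "Yes")] = p_coach * p_yes        # P(A and B) = P(A) x P(B|A)
    leaves[(coach, "No")] = p_coach * (1 - p_yes)

p_goalkeeper = sum(p for (coach, answer), p in leaves.items() if answer == "Yes")
print(round(p_goalkeeper, 2))            # 0.42
print(round(sum(leaves.values()), 2))    # 1.0
```

The rounding is only there because decimals like 0.12 are not stored exactly by the computer.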
Friends and Random Numbers
Here is another quite different example of Conditional Probability.
4 friends (Alex, Blake, Chris and Dusty) each choose a random
number between 1 and 5. What is the chance that any of them
chose the same number?
Let's add our friends one at a time ...
First, what is the chance that Alex and Blake have the same
number?
Blake compares his number to Alex's number. There is a 1 in 5 chance of a
match.
As a tree diagram:
Note: "Yes" and "No" together make 1
(1/5 + 4/5 = 5/5 = 1)
Now, let's include Chris ...
But there are now two cases to consider:
If Alex and Blake did match, then Chris has only one number to
compare to.
But if Alex and Blake did not match then Chris has two numbers to
compare to.
And we get this:
For the top line (Alex and Blake did match) we already have a match (a
chance of 1/5).
But for the "Alex and Blake did not match" there is now a 2/5 chance of
Chris matching (because Chris gets to match his number against both Alex
and Blake).
And we can work out the combined chance by multiplying the chances it
took to get there:
Following the "No, Yes" path ... there is a 4/5 chance of No, followed
by a 2/5 chance of Yes:
(4/5) × (2/5) = 8/25
Following the "No, No" path ... there is a 4/5 chance of No, followed
by a 3/5 chance of No:
(4/5) × (3/5) = 12/25
Also notice that when you add all chances together you still get 1 (a good
check that we haven't made a mistake):
(5/25) + (8/25) + (12/25) = 25/25 = 1
Now what happens when we include Dusty?
It is the same idea, just more of it:
OK, that is all 4 friends, and the "Yes" chances together make 101/125:
Answer: 101/125
But notice something interesting ... if we had followed the "No" path we
could have skipped all the other calculations and made our life easier:
The chances of not matching are:
(4/5) × (3/5) × (2/5) = 24/125
So the chances of matching are:
1 - (24/125) = 101/125
(And we didn't really need a tree diagram for that!)
And that is a popular trick in probability:
It is often easier to work out the "No" case
(This idea is shown in more detail at Shared Birthdays.)
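The answer can be double-checked by listing every possible set of choices. A sketch, assuming each friend is equally likely to pick any whole number from 1 to 5 (variable names are chosen for this example):

```python
from fractions import Fraction
from itertools import product

numbers = range(1, 6)   # each friend picks 1, 2, 3, 4 or 5
friends = 4

choices = list(product(numbers, repeat=friends))             # 5^4 = 625 outcomes
no_match = [c for c in choices if len(set(c)) == friends]    # all four different

p_no_match = Fraction(len(no_match), len(choices))
p_match = 1 - p_no_match
print(p_no_match, p_match)   # 24/125 101/125
```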
Probability Tree Diagrams
Calculating probabilities can be hard. Sometimes you add them, sometimes
you multiply them, and often it is hard to figure out what to do ... tree
diagrams to the rescue!
Here is a tree diagram for the toss of a coin:
There are two "branches" (Heads and Tails)
The probability of each branch is written on the branch
The outcome is written at the end of the branch
We can extend the tree diagram to two tosses of a coin:
How do you calculate the overall probabilities?
You multiply probabilities along the branches
You add probabilities down columns
Now we can see such things as:
The probability of "Head, Head" is 0.5×0.5 = 0.25
All probabilities add to 1.0 (which is always a good check)
The probability of getting at least one Head from two tosses is
0.25+0.25+0.25 = 0.75
... and more
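The two rules (multiply along the branches, add down the columns) can be sketched in a few lines of Python. The variable names here are just for illustration:

```python
from itertools import product

branch = {"H": 0.5, "T": 0.5}   # probability written on each branch

# Multiply along the branches to get the probability of each path.
paths = {first + second: branch[first] * branch[second]
         for first, second in product(branch, repeat=2)}

print(paths["HH"])              # 0.25
print(sum(paths.values()))      # 1.0

# Add down the column: every path that contains at least one Head.
at_least_one_head = sum(p for path, p in paths.items() if "H" in path)
print(at_least_one_head)        # 0.75
```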
That was a simple example using independent events (each toss of a coin is
independent of the previous toss), but tree diagrams are really wonderful for
figuring out dependent events (where an event depends on what happens
in the previous event) like this example:
Example: Soccer Game
You are off to soccer, and love being the Goalkeeper, but that depends who
is the Coach today:
with Coach Sam the probability of being Goalkeeper is 0.5
with Coach Alex the probability of being Goalkeeper is 0.3
Sam is Coach more often ... about 6 out of every 10 games (a probability
of 0.6).
So, what is the probability you will be a Goalkeeper today?
Let's build the tree diagram. First we show the two possible coaches: Sam or
Alex:
The probability of getting Sam is 0.6, so the probability of Alex must be 0.4
(together the probability is 1)
Now, if you get Sam, there is 0.5 probability of being Goalie (and 0.5 of not
being Goalie):
If you get Alex, there is 0.3 probability of being Goalie (and 0.7 not):
The tree diagram is complete, now let's calculate the overall probabilities.
This is done by multiplying each probability along the "branches" of the tree.
Here is how to do it for the "Sam, Yes" branch:
(When we take the 0.6 chance of Sam being coach and include the 0.5
chance that Sam will let you be Goalkeeper we end up with a 0.3 chance.)
But we are not done yet! We haven't included Alex as Coach:
A 0.4 chance of Alex as Coach, followed by a 0.3 chance gives 0.12.
Now we add the column:
0.3 + 0.12 = 0.42 probability of being a Goalkeeper today
(That is a 42% chance)
Check
One final step: complete the calculations and make sure they add to 1:
0.3 + 0.3 + 0.12 + 0.28 = 1
Yes, it all adds up.
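The same soccer tree can be written out as a short Python sketch (the dictionary names are made up for this illustration). Each "leaf" is a coach probability multiplied by the Goalkeeper probability for that coach:

```python
coach = {"Sam": 0.6, "Alex": 0.4}          # chance of each coach
goalie_given = {"Sam": 0.5, "Alex": 0.3}   # chance of Goalkeeper with that coach

leaves = {}
for name, p_coach in coach.items():
    leaves[(name, "Yes")] = p_coach * goalie_given[name]
    leaves[(name, "No")] = p_coach * (1 - goalie_given[name])

# Add the "Yes" column.
p_goalkeeper = leaves[("Sam", "Yes")] + leaves[("Alex", "Yes")]
print(round(p_goalkeeper, 2))            # 0.42
print(round(sum(leaves.values()), 2))    # 1.0
```

The results are rounded only because decimal fractions such as 0.12 are not stored exactly in floating point.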
Conclusion
So there you go: when in doubt, draw a tree diagram, multiply along the
branches and add the columns. Make sure all probabilities add to 1 and you
are good to go.
Mutually Exclusive Events
Mutually Exclusive: can't happen at the same time.
Examples:
Turning left and turning right are Mutually Exclusive (you can't do both
at the same time)
Tossing a coin: Heads and Tails are Mutually Exclusive
Cards: Kings and Aces are Mutually Exclusive
What is not Mutually Exclusive:
Turning left and scratching your head can happen at the same time
Kings and Hearts, because you can have a King of Hearts!
Like here:
Aces and Kings are Mutually Exclusive (can't be both)
Hearts and Kings are not Mutually Exclusive (can be both)
Probability
Let's look at the probabilities of Mutually Exclusive events. But first, a
definition:
Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)
Example: there are 4 Kings in a deck of 52 cards. What is the
probability of picking a King?
Number of ways it can happen: 4 (there are 4 Kings)
Total number of outcomes: 52 (there are 52 cards in total)
So the probability = 4/52 = 1/13
Mutually Exclusive
When two events (call them "A" and "B") are Mutually Exclusive it
is impossible for them to happen together:
P(A and B) = 0
"The probability of A and B together equals 0 (impossible)"
But the probability of A or B is the sum of the individual probabilities:
P(A or B) = P(A) + P(B)
"The probability of A or B equals the probability of A plus the probability of
B"
Example: A Deck of Cards
In a Deck of 52 Cards:
the probability of a King is 1/13, so P(King)=1/13
the probability of an Ace is also 1/13, so P(Ace)=1/13
When we combine those two Events:
The probability of a card being a King and an Ace is 0 (Impossible)
The probability of a card being a King or an Ace is (1/13) + (1/13)
= 2/13
Which is written like this:
P(King and Ace) = 0
P(King or Ace) = (1/13) + (1/13) = 2/13
Special Notation
Instead of "and" you will often see the symbol ∩ (which is the "Intersection"
symbol used in Venn Diagrams)
Instead of "or" you will often see the symbol ∪ (the "Union" symbol)
Example: Scoring Goals
If the probability of:
scoring no goals (Event "A") is 20%
scoring exactly 1 goal (Event "B") is 15%
Then:
The probability of scoring no goals and 1 goal is 0 (Impossible)
The probability of scoring no goals or 1 goal is 20% + 15% = 35%
Which is written:
P(A ∩ B) = 0
P(A ∪ B) = 20% + 15% = 35%
Remembering
To help you remember, think:
"Or has more ... than And"
∪ is like a cup which holds more than ∩
Not Mutually Exclusive
Now let's see what happens when events are not Mutually Exclusive.
Example: Hearts and Kings
Hearts and Kings together is only the King of Hearts:
But Hearts or Kings is:
all the Hearts (13 of them)
all the Kings (4 of them)
But that counts the King of Hearts twice!
So we correct our answer, by subtracting the extra "and" part:
16 Cards = 13 Hearts + 4 Kings - the 1 extra King of Hearts
Count them to make sure this works!
As a formula this is:
P(A or B) = P(A) + P(B) - P(A and B)
"The probability of A or B equals the probability of A plus the probability of
B
minus the probability of A and B"
Here is the same formula, but using ∪ and ∩:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
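Both versions of the "or" rule can be checked by counting cards. This is a small Python sketch, with a deck built as (rank, suit) pairs; the helper names are only for illustration:

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
deck = list(product(ranks, suits))   # 52 (rank, suit) pairs

def prob(event):
    """Number of ways it can happen / total number of outcomes."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_king(card):
    return card[0] == "K"

def is_ace(card):
    return card[0] == "A"

def is_heart(card):
    return card[1] == "Hearts"

# Mutually exclusive: the "and" part is 0, so "or" is a plain sum.
print(prob(lambda c: is_king(c) or is_ace(c)))      # 2/13

# Not mutually exclusive: subtract the King of Hearts that was counted twice.
print(prob(lambda c: is_heart(c) or is_king(c)))    # 4/13
print(prob(is_heart) + prob(is_king)
      - prob(lambda c: is_heart(c) and is_king(c)))  # 4/13
```

Here 4/13 is just 16/52 in lowest terms, matching the 16 cards counted above.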
A Final Example
16 people study French, 21 study Spanish and there are 30
altogether. Work out the probabilities!
This is definitely a case of not Mutually Exclusive (you can study French AND
Spanish).
Let's say b is how many study both languages:
people studying French Only must be 16-b
people studying Spanish Only must be 21-b
And we get:
And we know there are 30 people, so:
(16-b) + b + (21-b) = 30
37 - b = 30
b = 7
And we can put in the correct numbers:
So we know all this now:
P(French) = 16/30
P(Spanish) = 21/30
P(French Only) = 9/30
P(Spanish Only) = 14/30
P(French or Spanish) = 30/30 = 1
P(French and Spanish) = 7/30
Lastly, let's check with our formula:
P(A or B) = P(A) + P(B) - P(A and B)
Put the values in:
30/30 = 16/30 + 21/30 – 7/30
Yes, it works!
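The French and Spanish example is easy to re-run with other numbers. This sketch assumes, as the example does, that everyone studies at least one of the two languages; the variable names are illustrative:

```python
from fractions import Fraction

french, spanish, total = 16, 21, 30

# (french - b) + b + (spanish - b) = total  rearranges to  b = french + spanish - total
both = french + spanish - total
french_only = french - both
spanish_only = spanish - both
print(both, french_only, spanish_only)    # 7 9 14

def p(count):
    return Fraction(count, total)

# Check with the formula: P(A or B) = P(A) + P(B) - P(A and B)
print(p(french) + p(spanish) - p(both) == p(total))   # True
```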
Summary:
Mutually Exclusive
A and B together is impossible: P(A and B) = 0
A or B is the sum of A and B: P(A or B) = P(A) + P(B)
Not Mutually Exclusive
A or B is the sum of A and B minus A and B: P(A or B) =
P(A) + P(B) - P(A and B)
False Positives and False Negatives
Test Says "Yes" ... or does it?
When you have a test that can say "Yes" or "No" (such as a medical test),
you have to think:
It could be wrong when it says "Yes".
It could be wrong when it says "No".
Wrong?
It is like being told you did something when
you didn't!
Or you didn't do it when you really did.
There are special names for this, called "False Positive" and "False
Negative":
                     They say you did     They say you didn't
You really did       They are right!      "False Negative"
You really didn't    "False Positive"     They are right!
Here are some examples of "false positives" and "false negatives":
Airport Security: a "false positive" is when ordinary items
such as keys or coins get mistaken for weapons (machine
goes "beep")
Quality Control: a "false positive" is when a good quality
item gets rejected, and a "false negative" is when a poor
quality item gets accepted
Antivirus software: a "false positive" is when a normal file is
thought to be a virus
Medical screening: low-cost tests given to a large group can
give many false positives (saying you have a disease when
you don't), and then ask you to get more accurate tests.
But many people don't understand the true numbers behind "Yes" or "No",
like in this example:
Example: Allergy or Not?
Hunter says she is itchy. There is a test for Allergy to Cats, but
this test is not always right:
For people that really do have the allergy, the test says
"Yes" 80% of the time
For people that do not have the allergy, the test says
"Yes" 10% of the time ("false positive")
Here it is in a table:
                 Test says "Yes"          Test says "No"
Have allergy     80%                      20% "False Negative"
Don't have it    10% "False Positive"     90%
Question: If 1% of the population have the allergy, and Hunter's
test says "Yes", what are the chances that Hunter really has the
allergy?
Do you think 75%? Or maybe 50%?
A test similar to this was given to Doctors and most guessed around
75% ...
... but they were very wrong!
(Source: "Probabilistic reasoning in clinical medicine: Problems and
opportunities" by David M. Eddy 1982, which this example is based on)
There are two good ways to work this out: "Imagine a 1000" and "Tree
Diagrams".
Try Imagining A Thousand People
When trying to understand questions like this, just imagine a large group
(say 1000) and play with the numbers:
Of 1000 people, only 10 really have the allergy (1% of 1000 is
10)
The test is 80% right for people who have the allergy, so it
will get 8 of those 10 right.
But 990 do not have the allergy, and the test will say "Yes" to
10% of them,
which is 99 people it says "Yes" to wrongly (false positive)
So out of 1000 people the test says "Yes" to (8+99) = 107
people
As a table:
                 1% have it    Test says "Yes"    Test says "No"
Have allergy     10            8                  2
Don't have it    990           99                 891
Total            1000          107                893
So 107 people get a "Yes" but only 8 of those really have the allergy:
8 / 107 = about 7%
So, even though Hunter's test said "Yes", it is still only 7% likely that
Hunter has a Cat Allergy.
As A Tree
Drawing a tree diagram can really help:
First of all, let's check that all the percentages add up:
0.8% + 0.2% + 9.9% + 89.1% = 100% (good!)
And the two "Yes" answers add up to 0.8% + 9.9% = 10.7%, but only 0.8%
are correct.
0.8/10.7 = 7% (same answer as above)
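The same calculation as a short Python sketch, using exact fractions so nothing is lost to rounding (the variable names are just for illustration):

```python
from fractions import Fraction

prevalence = Fraction(1, 100)          # 1% of people have the allergy
yes_if_allergic = Fraction(80, 100)    # test says "Yes" for 80% of them
yes_if_not = Fraction(10, 100)         # test says "Yes" for 10% of everyone else

true_yes = prevalence * yes_if_allergic        # 0.8% of everyone
false_yes = (1 - prevalence) * yes_if_not      # 9.9% of everyone

# Of all the "Yes" results, what fraction are correct?
p_allergy_given_yes = true_yes / (true_yes + false_yes)
print(p_allergy_given_yes)                     # 8/107
print(round(float(p_allergy_given_yes), 3))    # 0.075
```

Changing `prevalence` shows how strongly the answer depends on how common the allergy is.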
Conclusion
When dealing with false positives and false negatives (or other tricky
probability questions) it pays to:
Imagine you have 1,000 (of whatever)
Or make a tree diagram
Shared Birthdays
This is a great puzzle, and you get to learn a lot about probability along the
way ...
There are 30 people in a room ... what is the chance that any
two of them celebrate their birthday on the same day? Assume
365 days in a year.
Some people think "there are 30 people, and 365 days, so 30/365 sounds
about right, and 30/365 = 0.08..."
But no!
The probability is much higher. It is actually likely there are people who
share a birthday in that room.
Because you should compare everyone to everyone else.
And with 30 people that is 435 comparisons.
But you also have to be careful not to over-count the chances.
I will show you how to do it ... starting with a smaller example:
Friends and Random Numbers
4 friends (Alex, Billy, Chris and Dusty) each choose a random
number between 1 and 5. What is the chance that any of them
chose the same number?
We will add our friends one at a time ...
First, what is the chance that Alex and Billy have the same
number?
Billy compares his number to Alex's number. There is a 1 in 5 chance of a
match.
As a tree diagram:
Note: "Yes" and "No" together make 1
(1/5 + 4/5 = 5/5 = 1)
Now, let's include Chris ...
But there are now two cases to consider (called "Conditional Probability"):
If Alex and Billy did match, then Chris has only one number to
compare to.
But if Alex and Billy did not match then Chris has two numbers to
compare to.
And we get this:
For the top line (Alex and Billy did match) we already have a match (a
chance of 1/5).
But for the "Alex and Billy did not match" there is a 2/5 chance of Chris
matching (against both Alex and Billy).
And we can work out the combined chance by multiplying the chances it
took to get there:
Following the "No, Yes" path ... there is a 4/5 chance of No, followed
by a 2/5 chance of Yes:
(4/5) × (2/5) = 8/25
Following the "No, No" path ... there is a 4/5 chance of No, followed
by a 3/5 chance of No:
(4/5) × (3/5) = 12/25
Also notice that adding all the chances together gives 1 (a good check that
we haven't made a mistake):
(5/25) + (8/25) + (12/25) = 25/25 = 1
Now what happens when we include Dusty?
It is the same idea, just more of it:
OK, that is all 4 friends, and the "Yes" chances together make 101/125:
Answer: 101/125
But notice something interesting ... if we had followed the "No" path we
could have skipped all the other calculations and made our life easier:
The chances of not matching are:
(4/5) × (3/5) × (2/5) = 24/125
So the chances of matching are:
1 - (24/125) = 101/125
(And we didn't really need a tree diagram for that!)
And that is a popular trick in probability:
It is often easier to work out the "No" case
Example: what are the chances that with 6 people any of
them celebrate their Birthday in the same month? (Assume
equal months)
The "no match" case for:
2 people is 11/12
3 people is (11/12) × (10/12)
4 people is (11/12) × (10/12) × (9/12)
5 people is (11/12) × (10/12) × (9/12) × (8/12)
6 people is (11/12) × (10/12) × (9/12) × (8/12) × (7/12)
So the chance of not matching is:
(11/12) × (10/12) × (9/12) × (8/12) × (7/12) = 0.22...
Flip that around and we get the chance of matching:
1 - 0.22... = 0.78...
So, there is a 78% chance of any of them celebrating their Birthday in
the same month
And now we can try calculating the "Shared Birthday" question we started
with:
There are 30 people in a room ... what is the chance that any
two of them celebrate their birthday on the same day? Assume
365 days in a year.
It is just like the previous example! But bigger and more numbers:
The chance of not matching:
364/365 × 363/365 × 362/365 × ... × 336/365 = 0.294...
(I did that calculation in a spreadsheet, but there are also mathematical
shortcuts)
And the probability of matching is 1 - 0.294...:
The probability of sharing a birthday = 1 - 0.294... = 0.706...
Or a 70.6% chance, which is likely!
In fact the probability for 23 people is about 50%.
And for 57 people it is 99% (almost certain!)
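Instead of a spreadsheet, the "no match" product can be worked out with a short loop. This is a minimal Python sketch assuming every day (or month) is equally likely; the function name is made up for this illustration:

```python
def chance_of_shared(people, slots):
    """Chance that at least two of `people` land in the same one of
    `slots` equally likely slots (months, days, numbers ...)."""
    no_match = 1.0
    for already_taken in range(1, people):
        no_match *= (slots - already_taken) / slots
    return 1 - no_match

print(round(chance_of_shared(6, 12), 2))     # 0.78  (birthday months)
print(round(chance_of_shared(30, 365), 3))   # 0.706
print(round(chance_of_shared(23, 365), 3))   # 0.507
print(round(chance_of_shared(57, 365), 2))   # 0.99
```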
So, next time you are in a room with a group of people why not find out if
there are any shared birthdays?
Footnote: In real life birthdays are not evenly spread out ... more babies are
born in Spring. Also Hospitals prefer to work on weekdays, not weekends, so
there are more births early in the week. And then there are leap years. But
you get the idea.
Combinations and Permutations
What's the Difference?
In English we use the word "combination" loosely, without thinking if
the order of things is important. In other words:
"My fruit salad is a combination of apples, grapes and
bananas." We don't care what order the fruits are in, they could also
be "bananas, grapes and apples" or "grapes, apples and bananas", it's
the same fruit salad.
"The combination to the safe was 472". Now we do care about
the order. "724" would not work, nor would "247". It has to be
exactly 4-7-2.
So, in Mathematics we use more precise language:
If the order doesn't matter, it is a Combination.
If the order does matter it is a Permutation.
Combinations and Permutations Calculator
Find out how many different ways you can choose items.
For an in-depth explanation of the formulas please visit Combinations and
Permutations.
Power Users!
You can now add "Rules" that will reduce the List:
The "has" rule which says that certain items must be included (for
the entry to be included).
Example: has 2,a,b,c means that an entry must have at least two
of the letters a, b and c.
The "no" rule which means that some items from the list must not
occur together.
Example: no 2,a,b,c means that an entry must not have two or
more of the letters a, b and c.
The "pattern" rule is used to impose some kind of pattern to each
entry.
Example: pattern c,* means that the letter c must be first
(anything else can follow)
Rules In Detail
The "has" Rule
The word "has" followed by a space and a number. Then a comma and a list
of items separated by commas.
The number says how many (minimum) from the list are needed for that
result to be allowed.
Example: has 1,a,b,c
Will allow if there is an a, or b, or c, or a and b, or a and c, or b and
c, or all three a,b and c.
In other words, it insists there be an a or b or c in the result.
So {a,e,f} is accepted, but {d,e,f} is rejected.
Example: has 2,a,b,c
Will allow if there is an a and b, or a and c, or b and c, or all
three a,b and c.
In other words, it insists there be at least 2 of a or b or c in the result.
So {a,b,f} is accepted, but {a,e,f} is rejected.
The "no" Rule
The word "no" followed by a space and a number. Then a comma and a list
of items separated by commas.
The number says how many (minimum) from the list are needed to be a
rejection.
Example: n=5, r=3, Order=no, Replace=no
Which normally produces:
{a,b,c} {a,b,d} {a,b,e} {a,c,d} {a,c,e} {a,d,e} {b,c,d} {b,c,e}
{b,d,e} {c,d,e}
But when we add a "no" rule like this:
a,b,c,d,e,f,g
no 2,a,b
We get:
{a,c,d} {a,c,e} {a,d,e} {b,c,d} {b,c,e} {b,d,e} {c,d,e}
The entries {a,b,c}, {a,b,d} and {a,b,e} are missing because the rule
says you can't have 2 from the list a,b (having an a or b is fine, but not
together)
Example: no 2,a,b,c
Allows only these:
{a,d,e} {b,d,e} {c,d,e}
It has rejected any with a and b, or a and c, or b and c, or even all
three a,b and c.
So {a,d,e} is allowed (only one out of a,b and c is in that)
But {b,c,d} is rejected (it has 2 from the list a,b,c)
Example: no 3,a,b,c
Allows all of these:
{a,b,d} {a,b,e} {a,c,d} {a,c,e} {a,d,e} {b,c,d} {b,c,e} {b,d,e}
{c,d,e}
Only {a,b,c} is missing because that is the only one that has 3 from the
list a,b,c
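To see how the "has" and "no" rules filter a list, here is a rough Python sketch. It is not the calculator's own code, just a small re-creation under the assumption that the items are a to e with r=3, order=no, replace=no; the function names are hypothetical:

```python
from itertools import combinations

items = "abcde"   # n=5, r=3, order does not matter, no repeats
entries = ["".join(c) for c in combinations(items, 3)]

def count_from(entry, listed):
    """How many of the listed items appear in this entry."""
    return sum(1 for item in listed if item in entry)

def has_rule(minimum, listed):
    """Keep entries with at least `minimum` of the listed items."""
    return [e for e in entries if count_from(e, listed) >= minimum]

def no_rule(minimum, listed):
    """Reject entries with `minimum` or more of the listed items."""
    return [e for e in entries if count_from(e, listed) < minimum]

print(no_rule(2, "ab"))          # ['acd', 'ace', 'ade', 'bcd', 'bce', 'bde', 'cde']
print(no_rule(2, "abc"))         # ['ade', 'bde', 'cde']
print(len(no_rule(3, "abc")))    # 9
print(len(has_rule(2, "abc")))   # 7
```

Notice that "has 2,a,b,c" and "no 2,a,b,c" split the 10 entries between them (7 + 3).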
The "pattern" Rule
The word "pattern" followed by a space and a list of items separated by
commas.
You can include these "special" items:
? (question mark) means any item. It is like a "wildcard".
* (an asterisk) means any number of items (0, 1, or more). Like a
"super wildcard".
Example: pattern ?,c,*,f
Means "any item, followed by c, followed by zero or more items, then f"
So {a,c,d,f} is allowed
And {b,c,f,g} is also allowed (there are no items between c and f,
which is OK)
But {c,d,e,f} is not, because there is no item before c.
Example: how many ways can Alex, Betty, Carol and John be
lined up, with John after Alex?
Use: n=4, r=4, order=yes, replace=no.
Alex, Betty, Carol, John
pattern *,Alex,*,John
The result is:
{Alex,Betty,Carol,John} {Alex,Betty,John,Carol}
{Alex,Carol,Betty,John} {Alex,Carol,John,Betty}
{Alex,John,Betty,Carol} {Alex,John,Carol,Betty}
{Betty,Alex,Carol,John} {Betty,Alex,John,Carol}
{Betty,Carol,Alex,John} {Carol,Alex,Betty,John}
{Carol,Alex,John,Betty} {Carol,Betty,Alex,John}
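That list of 12 can be double-checked without the calculator. This small Python sketch does not implement the "pattern" rule itself; it simply keeps the line-ups where Alex comes before John:

```python
from itertools import permutations

people = ["Alex", "Betty", "Carol", "John"]

# Keep only the line-ups where Alex stands somewhere before John.
lineups = [order for order in permutations(people)
           if order.index("Alex") < order.index("John")]

print(len(lineups))   # 12
print(lineups[0])     # ('Alex', 'Betty', 'Carol', 'John')
```

It is exactly half of the 24 possible line-ups, because in every line-up either Alex or John comes first.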
So, we should really call this a "Permutation Lock"!
In other words:
A Permutation is an ordered Combination.
To help you to remember, think "Permutation ... Position"
Permutations
There are basically two types of permutation:
Repetition is Allowed: such as the lock above. It could be "333".
No Repetition: for example the first three people in a running race.
You can't be first and second.
1. Permutations with Repetition
These are the easiest to calculate.
When you have n things to choose from ... you have n choices each time!
When choosing r of them, the permutations are:
n × n × ... (r times)
(In other words, there are n possibilities for the first choice, THEN there
are n possibilities for the second choice, and so on, multiplying each time.)
Which is easier to write down using an exponent of r:
n × n × ... (r times) = n^r
Example: in the lock above, there are 10 numbers to choose from
(0, 1, ..., 9) and you choose 3 of them:
10 × 10 × ... (3 times) = 10^3 = 1,000 permutations
So, the formula is simply:
n^r
where n is the number of things to choose
from, and you choose r of them
(Repetition allowed, order matters)
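A quick way to see n^r in action is to list every code for a 3-digit lock. A minimal Python sketch:

```python
from itertools import product

digits = range(10)                        # n = 10 things to choose from
codes = list(product(digits, repeat=3))   # r = 3 choices, repeats allowed

print(len(codes))           # 1000
print(10 ** 3)              # 1000
print((3, 3, 3) in codes)   # True
```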
2. Permutations without Repetition
In this case, you have to reduce the number of available choices each time.
For example, what order could 16
pool balls be in?
After choosing, say, number "14" you
can't choose it again.
So, your first choice would have 16 possibilities, and your next choice would
then have 15 possibilities, then 14, 13, etc. And the total permutations would
be:
16 × 15 × 14 × 13 × ... = 20,922,789,888,000
But maybe you don't want to choose them all, just 3 of them, so that would
be only:
16 × 15 × 14 = 3,360
In other words, there are 3,360 different ways that 3 pool balls could be
selected out of 16 balls.
But how do we write that mathematically? Answer: we use the "factorial
function"
The factorial function (symbol: !) just means to multiply a series
of descending natural numbers. Examples:
4! = 4 × 3 × 2 × 1 = 24
7! = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5,040
1! = 1
Note: it is generally agreed that 0! = 1. It may seem funny that multiplying no
numbers together gets you 1, but it helps simplify a lot of equations.
So, if you wanted to select all of the billiard balls the permutations would be:
16! = 20,922,789,888,000
But if you wanted to select just 3, then you have to stop the multiplying after
14. How do you do that? There is a neat trick ... you divide by 13! ...
(16 × 15 × 14 × 13 × 12 × ...) / (13 × 12 × ...) = 16 × 15 × 14 = 3,360
Do you see? 16! / 13! = 16 × 15 × 14
The formula is written:
n! / (n - r)!
where n is the number of things to choose
from, and you choose r of them
(No repetition, order matters)
Examples:
Our "order of 3 out of 16 pool balls example" would be:
16! / (16-3)! = 16! / 13! = 20,922,789,888,000 / 6,227,020,800 = 3,360
(which is just the same as: 16 × 15 × 14 = 3,360)
How many ways can first and second place be awarded to 10 people?
10! / (10-2)! = 10! / 8! = 3,628,800 / 40,320 = 90
(which is just the same as: 10 × 9 = 90)
Notation
Instead of writing the whole formula, people use different notations such as
these:
P(n,r) = nPr = n! / (n - r)!
Example: P(10,2) = 90
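These values can be checked in Python. The sketch below uses `math.factorial` and `math.perm` from the standard library (`math.perm` needs Python 3.8 or later):

```python
import math

# The factorial formula n! / (n - r)! ...
print(math.factorial(16) // math.factorial(16 - 3))   # 3360

# ... and the built-in shortcut for the same thing.
print(math.perm(16, 3))     # 3360
print(math.perm(10, 2))     # 90
print(math.perm(16, 16))    # 20922789888000
```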
Combinations
There are also two types of combinations (remember the order
does not matter now):
Repetition is Allowed: such as coins in your pocket (5,5,5,10,10)
No Repetition: such as lottery numbers (2,14,15,27,30,33)
1. Combinations with Repetition
Actually, these are the hardest to explain, so I will come back to this later.
2. Combinations without Repetition
This is how lotteries work. The numbers are drawn one at a time, and if you
have the lucky numbers (no matter what order) you win!
The easiest way to explain it is to:
assume that the order does matter (i.e. permutations),
then alter it so the order does not matter.
Going back to our pool ball example, let us say that you just want to know
which 3 pool balls were chosen, not the order.
We already know that 3 out of 16 gave us 3,360 permutations.
But many of those will be the same to us now, because we don't care what
order!
For example, let us say balls 1, 2 and 3 were chosen. These are the
possibilities:
Order does matter     Order doesn't matter
1 2 3                 1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
So, the permutations will have 6 times as many possibilities.
In fact there is an easy way to work out how many ways "1 2 3" could be
placed in order, and we have already talked about it. The answer is:
3! = 3 × 2 × 1 = 6
(Another example: 4 things can be placed in 4! = 4 × 3 × 2 × 1 =
24 different ways, try it for yourself!)
So, all we need to do is adjust our permutations formula to reduce it by how
many ways the objects could be in order (because we aren't interested in the
order any more):
n! / (n - r)! × 1 / r! = n! / (r!(n - r)!)
That formula is so important it is often just written in big parentheses like
this:
(n r) = n! / (r!(n - r)!)   (with n written above r inside the parentheses)
where n is the number of things to choose
from, and you choose r of them
(No repetition, order doesn't matter)
It is often called "n choose r" (such as "16 choose 3")
And is also known as the "Binomial Coefficient"
Notation
As well as the "big parentheses", people also use these notations:
C(n,r) = nCr = n! / (r!(n - r)!)
Example
So, our pool ball example (now without order) is:
16! / (3!(16-3)!) = 16! / (3! × 13!) = 20,922,789,888,000 / (6 × 6,227,020,800) = 560
Or you could do it this way:
(16 × 15 × 14) / (3 × 2 × 1) = 3360 / 6 = 560
So remember, do the permutation, then reduce by a further "r!"
... or better still ...
Remember the Formula!
It is interesting to also note how this formula is nice and symmetrical:
n! / (r!(n - r)!) gives the same value for r and for n - r
In other words choosing 3 balls out of 16, or choosing 13 balls out of 16 have
the same number of combinations.
16! / (3!(16-3)!) = 16! / (13!(16-13)!) = 16! / (3! × 13!) = 560
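Here is a short Python check of the pool ball numbers, including the symmetry. It uses `math.comb` (Python 3.8 or later) and also counts the selections directly:

```python
import math
from itertools import combinations

# Do the permutation, then reduce by a further r!
print(math.perm(16, 3) // math.factorial(3))   # 560

# "16 choose 3" and "16 choose 13" give the same answer.
print(math.comb(16, 3))     # 560
print(math.comb(16, 13))    # 560

# Count every selection of 3 balls out of balls 1 to 16.
print(len(list(combinations(range(1, 17), 3))))   # 560
```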
Pascal's Triangle
You can also use Pascal's Triangle to find the values. Go down to row "n" (the
top row is 0), and then along "r" places and the value there is your answer.
Here is an extract showing row 16:
1 14 91 364 ...
1 15 105 455 1365 ...
1 16 120 560 1820 4368 ...
1. Combinations with Repetition
OK, now we can tackle this one ...
Let us say there are five flavors of ice cream: banana,
chocolate, lemon, strawberry and vanilla. You can have
three scoops. How many variations will there be?
Let's use letters for the flavors: {b, c, l, s, v}. Example
selections would be
{c, c, c} (3 scoops of chocolate)
{b, l, v} (one each of banana, lemon and vanilla)
{b, v, v} (one of banana, two of vanilla)
(And just to be clear: There are n=5 things to choose from, and you
choose r=3 of them.
Order does not matter, and you can repeat!)
Now, I can't describe directly to you how to calculate this, but I can show
you a special technique that lets you work it out.
Think about the ice cream being in boxes, you could
say "move past the first box, then take 3 scoops,
then move along 3 more boxes to the end" and you
will have 3 scoops of chocolate!
So, it is like you are ordering a robot to get your ice
cream, but it doesn't change anything, you still get
what you want.
Now you could write this down as → ○ ○ ○ → → → (arrow means move,
circle means scoop).
In fact the three examples above would be written like this:
{c, c, c} (3 scoops of chocolate): → ○ ○ ○ → → →
{b, l, v} (one each of banana, lemon and vanilla): ○ → → ○ → → ○
{b, v, v} (one of banana, two of vanilla): ○ → → → → ○ ○
OK, so instead of worrying about different flavors, we have
a simpler problem to solve: "how many different ways can you arrange
arrows and circles"
Notice that there are always 3 circles (3 scoops of ice cream) and 4 arrows
(you need to move 4 times to go from the 1st to 5th container).
So (being general here) there are r + (n-1) positions, and we want to
choose r of them to have circles.
This is like saying "we have r + (n-1) pool balls and want to choose r of
them". In other words it is now like the pool balls problem, but with slightly
changed numbers. And you would write it like this:
(r + n - 1)! / (r!(n - 1)!)
where n is the number of things to choose
from, and you choose r of them
(Repetition allowed, order doesn't matter)
Interestingly, we could have looked at the arrows instead of the circles, and
we would have then been saying "we have r + (n-1) positions and want to
choose (n-1) of them to have arrows", and the answer would be the same
...
So, what about our example, what is the answer?
(5+3-1)! / (3!(5-1)!) = 7! / (3! × 4!) = 5040 / (6 × 24) = 35
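The 35 ice cream selections can also be listed directly. A minimal Python sketch using the flavor letters from the example:

```python
import math
from itertools import combinations_with_replacement

flavors = ["b", "c", "l", "s", "v"]   # n = 5
scoops = 3                            # r = 3

# Every selection where order does not matter and repeats are allowed.
selections = list(combinations_with_replacement(flavors, scoops))
print(len(selections))                                # 35

# The formula: choose r of the r + (n-1) positions to hold circles.
print(math.comb(scoops + len(flavors) - 1, scoops))   # 35
print(("b", "v", "v") in selections)                  # True
```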
In Conclusion
Phew, that was a lot to absorb, so maybe you could read it again to be sure!
But knowing how these formulas work is only half the battle. Figuring out
how to interpret a real world situation can be quite hard.
But at least now you know how to calculate all 4 variations of "Order
does/does not matter" and "Repeats are/are not allowed".
Random Variables
A Random Variable is a set of possible values from a random experiment.
Example: Tossing a coin: we could get Heads or Tails.
Let's give them the values Heads=0 and Tails=1 and we have a
Random Variable "X":
In short:
X = {0, 1}
Note: We could have chosen Heads=100 and Tails=150 if we wanted! It
is our choice.
So:
We have an experiment (such as tossing a coin)
We give values to each event
The set of values is a Random Variable
Not Like an Algebra Variable
In Algebra a variable, like x, is an unknown value:
Example: x + 2 = 6
In this case we can find that x=4
But a Random Variable is different ...
A Random Variable has a whole set of values ...
... and it could take on any of those values, randomly.
Example: X = {0, 1, 2, 3}
X could be 0, 1, 2 or 3, randomly.
And they might each have a different probability.
Capital Letters
We use a capital letter, like X or Y, to avoid confusion with the Algebra type
of variable.
Sample Space
A Random Variable's set of values is the Sample Space.
Example: Throw a die once
Random Variable X = "The score shown on the top face".
X could be 1, 2, 3, 4, 5 or 6
So the Sample Space is {1, 2, 3, 4, 5, 6}
Probability
We can show the probability of any one value using this style:
P(X = value) = probability of that value
Example (continued): Throw a die once
X = {1, 2, 3, 4, 5, 6}
In this case they are all equally likely, so the probability of any one is
1/6
P(X = 1) = 1/6
P(X = 2) = 1/6
P(X = 3) = 1/6
P(X = 4) = 1/6
P(X = 5) = 1/6
P(X = 6) = 1/6
Note that the sum of the probabilities = 1, as it should be.
Example: Toss three coins.
X = "The number of Heads" is the Random Variable.
In this case, there could be 0 Heads (if all the coins land Tails up), 1 Head, 2
Heads or 3 Heads.
So the Sample Space = {0, 1, 2, 3}
But this time the outcomes are NOT all equally likely.
The three coins can land in eight possible ways:
Outcome   X = "number of Heads"
HHH       3
HHT       2
HTH       2
HTT       1
THH       2
THT       1
TTH       1
TTT       0
Looking at the table we see just 1 case of Three Heads, but 3 cases of Two
Heads, 3 cases of One Head, and 1 case of Zero Heads. So:
P(X = 3) = 1/8
P(X = 2) = 3/8
P(X = 1) = 3/8
P(X = 0) = 1/8
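The counting above can also be done by a short program. This is only a sketch (the function name is ours): it lists every way the coins can land, counts the Heads in each, and turns the counts into exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_coins: int) -> dict:
    """Map each possible number of Heads to its probability (fair coins)."""
    outcomes = list(product("HT", repeat=n_coins))       # e.g. ('H', 'H', 'T')
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {heads: Fraction(c, len(outcomes)) for heads, c in sorted(counts.items())}


dist = heads_distribution(3)
for heads, p in dist.items():
    print(f"P(X = {heads}) = {p}")
```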
Example: Two dice are tossed.
The Random Variable is X = "The sum of the scores on the two dice".
Let's make a table of all possible values:
            1st Die
2nd Die   1   2   3   4   5   6
   1      2   3   4   5   6   7
   2      3   4   5   6   7   8
   3      4   5   6   7   8   9
   4      5   6   7   8   9  10
   5      6   7   8   9  10  11
   6      7   8   9  10  11  12
There are 6 × 6 = 36 of them, and the Sample Space = {2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12}
Let's count how often each value occurs, and work out the probabilities:
2 occurs just once, so P(X = 2) = 1/36
3 occurs twice, so P(X = 3) = 2/36 = 1/18
4 occurs three times, so P(X = 4) = 3/36 = 1/12
5 occurs four times, so P(X = 5) = 4/36 = 1/9
6 occurs five times, so P(X = 6) = 5/36
7 occurs six times, so P(X = 7) = 6/36 = 1/6
8 occurs five times, so P(X = 8) = 5/36
9 occurs four times, so P(X = 9) = 4/36 = 1/9
10 occurs three times, so P(X = 10) = 3/36 = 1/12
11 occurs twice, so P(X = 11) = 2/36 = 1/18
12 occurs just once, so P(X = 12) = 1/36
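Here is a small Python sketch of the same tally (the function name is our own). It walks through all 36 pairs of scores, counts how often each sum occurs, and divides by 36.

```python
from collections import Counter
from fractions import Fraction


def two_dice_sum_distribution() -> dict:
    """Probability of each possible sum when two fair dice are tossed."""
    counts = Counter(first + second
                     for first in range(1, 7)
                     for second in range(1, 7))
    return {total: Fraction(c, 36) for total, c in sorted(counts.items())}


dice_dist = two_dice_sum_distribution()
for total, p in dice_dist.items():
    print(f"P(X = {total}) = {p}")
```

Because Fraction reduces automatically, 2/36 is shown as 1/18, 3/36 as 1/12 and so on, matching the list above.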
A Range of Values
We could also calculate the probability that a Random Variable takes on a
range of values.
Example (continued) What is the probability that the sum of
the scores is 5, 6, 7 or 8?
In other words: What is P(5 ≤ X ≤ 8)?
P(5 ≤ X ≤ 8) = P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8) =
(4+5+6+5)/36 = 20/36 = 5/9
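A range like this is easy to check by counting. The sketch below (function name is ours) counts the pairs of scores whose sum falls in the range and divides by the 36 possible pairs.

```python
from fractions import Fraction


def prob_sum_between(low: int, high: int) -> Fraction:
    """P(low <= X <= high) where X is the sum of two fair dice."""
    favorable = sum(1
                    for first in range(1, 7)
                    for second in range(1, 7)
                    if low <= first + second <= high)
    return Fraction(favorable, 36)


print(prob_sum_between(5, 8))  # prints: 5/9
```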
Solving
We can also solve a Random Variable equation.
Example (continued) If P(X = x) = 1/12, what is the value of
x?
P(X = 4) = 1/12, and P(X = 10) = 1/12
So there are two solutions: x = 4 or x = 10
Notice the different uses of X and x:
X represents the Random Variable "The sum of the scores on the two
dice".
x represents a value that X can take.
Continuous
Random Variables can be either Discrete or Continuous:
Discrete Data can only take certain values (such as 1,2,3,4,5)
Continuous Data can take any value within a range (such as a person's
height)
All our examples have been Discrete.
Learn more at Continuous Random Variables.
Mean, Variance, Standard Deviation
You can also learn how to find the Mean, Variance and Standard Deviation of
Random Variables.
Summary
A Random Variable is a set of possible values from a random
experiment.
The set of possible values is called the Sample Space.
A Random Variable is given a capital letter, such as X or Z.
Random Variables can be discrete or continuous.
Random Variables - Continuous
A Random Variable is a set of possible values from a random experiment.
Example: Tossing a coin: we could get Heads or Tails.
Let's give them the values Heads=0 and Tails=1 and we have a
Random Variable "X":
In short:
X = {0, 1}
Note: We could have chosen Heads=100 and Tails=150 if we wanted! It
is our choice.
Continuous
Random Variables can be either Discrete or Continuous:
Discrete Data can only take certain values (such as 1,2,3,4,5)
Continuous Data can take any value within a range (such as a person's
height)
In our Introduction to Random Variables (please read that first!) we look at
many examples of Discrete Random Variables.
But here we look at the more advanced topic of Continuous Random
Variables.
The Uniform Distribution
(Also called the Rectangular Distribution).
The Uniform Distribution has equal probability for all values of the Random
variable between a and b:
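The height of the rectangle between a and b is p (strictly, p is a probability density: for a continuous Random Variable, probabilities come from areas under the graph)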
We also know that p = 1/(b-a), because the total of all probabilities must be
1, so
the area of the rectangle = 1
p × (b−a) = 1
p = 1/(b−a)
We can write this using f(x) for the height of the graph at x (the probability density):
f(x) = 1/(b−a) for a ≤ x ≤ b
f(x) = 0 otherwise
Example: Old Faithful erupts every 91 minutes. You arrive
there at random and wait for 20 minutes ... what is the
probability you will see it erupt?
This is actually easy to calculate, 20 minutes out of 91 minutes is:
p = 20/91 = 0.22 (to 2 decimals)
But let's use the Uniform Distribution for practice.
To find the probability between a and a+20, find the blue area:
Area = (1/91) × (a+20 − a) = (1/91) × 20 = 20/91 = 0.22 (to
2 decimals)
So there is a 0.22 probability you will see Old Faithful erupt.
If you waited the full 91 minutes you would be sure (p=1) to have seen
it erupt.
But remember this is a random thing! It might erupt the moment you
arrive, or any time in the 91 minutes.
Cumulative Uniform Distribution
We can have the Uniform Distribution as a cumulative (adding up as it goes
along) distribution:
The probability starts at 0 and builds up to 1
This type of thing is called a "Cumulative distribution function", often
shortened to "CDF"
Example (continued):
Let's use the "CDF" of the Uniform Distribution to work out the
probability:
At a+20 the probability has accumulated to about 0.22
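As a rough sketch, the CDF of a Uniform Distribution is simple enough to write by hand. The function names are our own, and we assume the eruption cycle runs from 0 to 91 minutes with you arriving at time 0 (any arrival time gives the same answer, as long as the 20-minute wait stays inside the cycle).

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """Cumulative probability P(X <= x) for a Uniform Distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def uniform_prob_between(low: float, high: float, a: float, b: float) -> float:
    """P(low <= X <= high): the difference of two CDF values."""
    return uniform_cdf(high, a, b) - uniform_cdf(low, a, b)


p_erupt = uniform_prob_between(0, 20, 0, 91)
print(round(p_erupt, 2))  # prints: 0.22
```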
Other Distributions
Knowing how to use the Uniform
Distribution helps when dealing with
more complicated distributions like
this one:
The general name for any of these is probability density function or "pdf"
The Normal Distribution
The most important continuous distribution is the Standard Normal
Distribution
It is so important the Random Variable has its own special letter Z.
The graph for Z is a symmetrical bell-shaped curve:
Usually we want to find the probability of Z being between certain values.
Example: P(0 < Z < 0.45)
(What is the probability that Z is between 0 and 0.45)
This is found by using the Standard Normal Distribution Table
Start at the row for 0.4, and read along until 0.45: there is the value
0.1736
P(0 < Z < 0.45) = 0.1736
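If you do not have the table handy, the same value can be approximated in Python. This sketch relies on a standard identity that is not part of the text above: the Standard Normal CDF can be written with the error function as 0.5 × (1 + erf(z/√2)). The function names are ours.

```python
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    """P(Z <= z) for the Standard Normal Distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_between(z_low: float, z_high: float) -> float:
    """P(z_low < Z < z_high)."""
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)


p = prob_between(0.0, 0.45)
print(round(p, 4))  # prints: 0.1736
```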
Summary
A Random Variable is a variable whose possible values are
numerical outcomes of a random experiment.
Random Variables can be discrete or continuous.
An important example of a continuous Random variable is
the Standard Normal variable, Z.
Random Variables - Mean, Variance, Standard
Deviation
A Random Variable is a set of possible values from a random experiment.
Example: Tossing a coin: we could get Heads or Tails.
Let's give them the values Heads=0 and Tails=1 and we have a
Random Variable "X":
So:
We have an experiment (like tossing a coin)
We give values to each event
The set of values is a Random Variable
Learn more at Random Variables.
Mean, Variance and Standard Deviation
They have special notation:
μ is the Mean of X and is also called the Expected Value of X
Var(X) is the Variance of X
σ is the Standard Deviation of X
Mean or Expected Value
When we know the probability p of every value x we can calculate the
Expected Value (Mean) of X:
μ = Σxp
Note: Σ is Sigma Notation, and means to sum up.
To calculate the Expected Value:
multiply each value by its probability
sum them up
It is a weighted mean: values with higher probability have higher
contribution to the mean.
Variance
The Variance is:
Var(X) = Σx²p − μ²
To calculate the Variance:
square each value and multiply by its probability
sum them up and we get Σx²p
then subtract the square of the Expected Value μ²
Standard Deviation
The Standard Deviation is the square root of the Variance:
σ = √Var(X)
An example will help!
You plan to open a new McDougals Fried Chicken, and found
these stats for similar restaurants:
Percent Year's Earnings
20% $50,000 Loss
30% $0
40% $50,000 Profit
10% $150,000 Profit
Using that as probabilities for your new restaurant's profit, what is the
Expected Value and Standard Deviation?
The Random Variable is X = 'possible profit'.
Sum up xp and x²p:
Probability p   Earnings ($'000s) x   xp    x²p
0.2             -50                   -10   500
0.3             0                     0     0
0.4             50                    20    1000
0.1             150                   15    2250
Σp = 1                                Σxp = 25   Σx²p = 3750
μ = Σxp = 25
Var(X) = Σx²p − μ² = 3750 − 25² = 3750 − 625 = 3125
σ = √3125 = 56 (to nearest whole number)
But remember these are in thousands of dollars, so:
μ = $25,000
σ = $56,000
So you might expect to make $25,000, but with a very wide deviation
possible.
Let's try that again, but with a much higher probability for $50,000:
Example (continued):
Now with different probabilities (the $50,000 value has a high
probability of 0.7 now):
Probability p   Earnings ($'000s) x   xp    x²p
0.1             -50                   -5    250
0.1             0                     0     0
0.7             50                    35    1750
0.1             150                   15    2250
Σp = 1                                Σxp = 45   Σx²p = 4250
μ = Σxp = 45
Var(X) = Σx²p − μ² = 4250 − 45² = 4250 − 2025 = 2225
σ = √2225 = 47 (to nearest whole number)
In thousands of dollars:
μ = $45,000
σ = $47,000
The mean is now much closer to the most probable value.
And the standard deviation is a little smaller (showing that the values
are more central.)
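Both versions of the example follow the same recipe, so it can be written once as a small function. This is a sketch only (the function name and the list-of-pairs layout are our own choices), with earnings in thousands of dollars as in the tables.

```python
from math import sqrt


def mean_variance_sd(distribution):
    """distribution: list of (probability p, value x) pairs.

    Returns (mean, variance, standard deviation) using
    mean = sum of x*p and variance = sum of x*x*p minus mean squared.
    """
    mean = sum(p * x for p, x in distribution)
    variance = sum(p * x * x for p, x in distribution) - mean ** 2
    return mean, variance, sqrt(variance)


first = [(0.2, -50), (0.3, 0), (0.4, 50), (0.1, 150)]
second = [(0.1, -50), (0.1, 0), (0.7, 50), (0.1, 150)]

for name, dist in (("first", first), ("second", second)):
    mu, var, sd = mean_variance_sd(dist)
    print(f"{name}: mean = {mu:.0f}, variance = {var:.0f}, sd = {sd:.0f}")
```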
Continuous
Random Variables can be either Discrete or Continuous:
Discrete Data can only take certain values (such as 1,2,3,4,5)
Continuous Data can take any value within a range (such as a person's
height)
Here we looked only at discrete data, as finding the Mean, Variance and
Standard Deviation of continuous data needs Integration.
Summary
A Random Variable is a variable whose possible values are
numerical outcomes of a random experiment.
The Mean (Expected Value) is: μ = Σxp
The Variance is: Var(X) = Σx²p − μ²
The Standard Deviation is: σ = √Var(X)
Quincunx Explained
A Quincunx or "Galton Board" (named after Sir Francis
Galton) is a triangular array of pegs.
Balls are dropped onto the top peg and then bounce their
way down to the bottom where they are collected in little
bins.
Each time a ball hits one of the pegs, it bounces either left
or right.
But this is interesting: if there is an equal
chance of bouncing left or right, then the
balls collecting in the bins form the
classic "bell-shaped" curve of the normal
distribution.
(If the probabilities are not even, you still
get a nice "skewed" version of the normal
distribution.)
Formula
You can actually calculate the probabilities!
Think about this: a ball would end up in the bin k places
from the right if it has taken k left turns.
In this example, the ball has taken two bounces to the left,
and all other bounces were to the right. It ended up in the
bin two places from the right.
In the general case, if the quincunx has n rows then a
possible path for the ball would be k bounces to the left
and (n-k)bounces to the right.
And if the probability of bouncing to the left is p then we
can calculate the probability of a certain path like this:
The ball bounces k times to the left with a probability of p: p^k
And the other bounces (n-k) have the opposite probability
of: (1-p)^(n-k)
So, the probability of following such a path is p^k(1-p)^(n-k)
But there could be many such paths! For example the left turns could be
the 1st and 2nd, or 1st and 3rd, or 2nd and 7th, etc.
You could list all such paths (LLRRR.., LRLRR..., LRRL...), but there are two
easier ways.
How Many Paths
You can look at Pascal's Triangle. In fact, the Quincunx is just like Pascal's
Triangle, with pegs instead of numbers. The number on each peg shows you
how many different paths can be taken to get to that peg. Amazing but true.
Or you can use this formula from the subject of Combinations:
C(n,k) = n! / (k!(n-k)!)
This is commonly called "n choose k" and written C(n,k).
It is the number of ways of placing k things in a sequence of n positions.
(The "!" means "factorial", for example 4! = 1×2×3×4 = 24)
Putting it all together, the resulting formula is:
P(k) = n! / (k!(n-k)!) × p^k(1-p)^(n-k)
(Which, by the way, is the formula for the binomial distribution.)
Example:
For 10 rows (n=10) and probability of bouncing left of 0.5 (p=0.5), we can
calculate the probability of being in the 3rd bin from the right (k=3) as:
p^k(1-p)^(n-k) = 0.5^3 × 0.5^7 = 0.5^10 = 0.0009765625
also:
n! / (k!(n-k)!) = 10! / (3!×7!) = 120
(This means there are 120 different paths that would end
up with the ball in the 3rd bin from the right.)
So we get:
120 × 0.0009765625 = 0.1171875
In fact we can build a whole table for rows=10 and probability=0.5 like this:
From Right: 10 9 8 7 6 5 4 3 2 1 0
Probability: 0.001 0.010 0.044 0.117 0.205 0.246 0.205 0.117 0.044 0.010 0.001
Example: 100 balls 0 1 4 12 21 24 21 12 4 1 0
Now, of course, this is a random thing so your results may vary from this
ideal situation.
Another Example:
If the probability were 0.8 then the table would look like this:
From Right 10 9 8 7 6 5 4 3 2 1 0
Probability 0.107 0.268 0.302 0.201 0.088 0.026 0.006 0.001 0.000 0.000 0.000
Example: 100 balls 11 26 30 20 9 3 1 0 0 0 0
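The "Probability" rows of both tables can be reproduced with a few lines of Python. This is a sketch with names of our own choosing. It lists k from 10 down to 0 so the values line up with the "From Right" row.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k left bounces in n rows (p = chance of left)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def quincunx_row(n: int, p: float) -> list:
    """Probabilities for k = n, n-1, ..., 0, rounded to 3 decimals."""
    return [round(binomial_pmf(k, n, p), 3) for k in range(n, -1, -1)]


row_even = quincunx_row(10, 0.5)
row_biased = quincunx_row(10, 0.8)
print(row_even)
print(row_biased)
```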
Try It Yourself
Run 100 (or more) balls through the Quincunx and see what results you get.
I have done this many times myself while developing the software. I never
got the perfect result, but always something surprisingly close. Good Luck!
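If you do not have a Quincunx nearby, you can imitate one. The sketch below is our own simplified stand-in, not the original software: each ball makes one random left-or-right choice per row, and we count how many balls finish with k left bounces. The seed argument is just there so a run can be repeated.

```python
import random
from collections import Counter


def simulate_quincunx(n_rows: int, p_left: float, n_balls: int, seed=None) -> list:
    """Drop n_balls balls and return the bin counts for k = n_rows down to 0."""
    rng = random.Random(seed)
    bins = Counter()
    for _ball in range(n_balls):
        # Count the left bounces this ball makes on its way down.
        left_turns = sum(1 for _row in range(n_rows) if rng.random() < p_left)
        bins[left_turns] += 1
    return [bins[k] for k in range(n_rows, -1, -1)]


counts = simulate_quincunx(n_rows=10, p_left=0.5, n_balls=100, seed=1)
print(counts)
```

Try different seeds, or many more balls: the counts wobble from run to run, but the bell shape keeps showing up.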
The Binomial Distribution
"Bi" means "two" (like a bicycle has two wheels) ...
... so this is about things with two results.
Tossing a Coin:
Did we get Heads (H) or
Tails (T)
We say the probability of the coin landing H is ½
And the probability of the coin landing T is ½
Throwing a Die:
Did we get a four ... ?
... or not?
We say the probability of a four is 1/6 (one of the six faces is a four).
And the probability of not four is 5/6 (five of the six faces are not a four)
Let's Toss a Coin!
Toss a fair coin three times ... what is the chance of getting two Heads?
Outcome: the result of three coin tosses
Event: "Two Heads" out of three coin tosses
We could get any one of these outcomes (H stands for heads and T for
Tails):
HHH
HHT
HTH
HTT
THH
THT
TTH
TTT
Which outcomes do we want?
"Two Heads" could be in any order: "HHT", "THH" and "HTH" all have two
Heads (and one Tail).
So 3 of the outcomes produce "Two Heads".
What is the probability of each outcome?
Each outcome is equally likely, and there are 8 of them. So each has a
probability of 1/8
So the probability of event "Two Heads" is:
Number of outcomes we want × Probability of each outcome = 3 × 1/8 = 3/8
Let's Calculate Them All:
The calculations are (P means "Probability of"):
P(Three Heads) = P(HHH) = 1/8
P(Two Heads) = P(HHT) + P(HTH) + P(THH) = 1/8 + 1/8 + 1/8
= 3/8
P(One Head) = P(HTT) + P(THT) + P(TTH) = 1/8 + 1/8 + 1/8 = 3/8
P(Zero Heads) = P(TTT) = 1/8
We can write this in terms of a Random Variable, X, = "The number of Heads
from 3 tosses of a coin":
P(X = 3) = 1/8
P(X = 2) = 3/8
P(X = 1) = 3/8
P(X = 0) = 1/8
And we can also draw a Bar Graph:
It is symmetrical!
Making a Formula
Now ... what are the chances of 5 heads in 9 tosses ... to list all outcomes
(512) would take a long time!
So let's make a formula.
In our previous example, how could we get the values 1, 3, 3 and 1 ?
They are actually in the third row of Pascal's
Triangle ... !
Can we make them using a formula?
Sure we can, and here it is:
n! / (k!(n-k)!)
n = total number
k = number we want
It is often called "n choose k" and you can read more
about it at Combinations and Permutations.
Note: the "!" means "factorial", for example 4! = 1×2×3×4 = 24
Let's use it:
Example: 3 tosses getting 2 Heads
We have n=3 and k=2
n! / (k!(n-k)!) = 3! / (2!(3-2)!) = (3×2×1) / (2×1 × 1) = 3
So there are 3 outcomes for "2 Heads"
(We knew that already, but now we have a formula for it.)
Let's use it for a harder question:
Example: what are the chances of 5 heads in 9 tosses?
We have n=9 and k=5
n! / (k!(n-k)!) = 9! / (5!(9-5)!) = (9×8×7×6×5×4×3×2×1) / (5×4×3×2×1 × 4×3×2×1) = 126
And for 9 tosses there are 2^9 = 512 total outcomes, so we get the
probability:
Number of outcomes we want × Probability of each outcome = 126 × 1/512 = 126/512
P(X=5) = 126/512 = 63/256 = 0.24609375
About a 25% chance.
(Easier than listing them all.)
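Here is the same calculation as a short Python sketch (the function name is ours). It counts the outcomes with "n choose k" and divides by the 2^n equally likely outcomes, keeping the answer as an exact fraction.

```python
from fractions import Fraction
from math import comb


def prob_heads(k: int, n: int) -> Fraction:
    """Probability of exactly k Heads in n tosses of a fair coin."""
    return Fraction(comb(n, k), 2 ** n)


p_five_of_nine = prob_heads(5, 9)
print(p_five_of_nine, float(p_five_of_nine))  # prints: 63/256 0.24609375
```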
Bias!
So far the chances of success or failure have been equally likely.
But what if the coins are biased (land more on one side than another) or
choices are not 50/50?
Example: You sell sandwiches. 70% of people choose
chicken, the rest choose pork.
What is the probability of selling 2 chicken sandwiches to the
next 3 customers?
This is just like the heads and tails example, but with 70/30 instead of
50/50.
Let's draw a tree diagram:
The "Two Chicken" cases are highlighted.
Notice that the probabilities for "two chickens" all work out to be 0.147 ,
because we are multiplying two 0.7s and one 0.3 in each case.
Can we get the 0.147 from a formula? What we want is "two 0.7s and one
0.3"
0.7 is the probability of each choice we want, call it p
2 is the number of choices we want, call it k
Probability of "choices we want" (two chickens) is: p^k
And
The probability of the opposite choice is: 1-p
The total number of choices is: n
The number of opposite choices is: n-k
Probability of "opposite choices" (one pork) is: (1-p)^(n-k)
So all choices together is:
p^k(1-p)^(n-k)
Example: (continued)
p = 0.7 (chance of chicken)
n = 3
k = 2
So we get:
p^k(1-p)^(n-k) = 0.7^2 × (1-0.7)^(3-2) = 0.7^2 × 0.3^1 = 0.7 × 0.7 × 0.3
= 0.147
which is the probability of each outcome.
And the total number of those outcomes is:
n! / (k!(n-k)!) = 3! / (2!(3-2)!) = (3×2×1) / (2×1 × 1) = 3
And we get:
Number of outcomes we want × Probability of each outcome = 3 × 0.147 = 0.441
So the probability of event "2 people out of 3 choose chicken" = 0.441
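The tree diagram can be imitated in code by walking every branch. This sketch (function name is ours) lists each possible sequence of choices, multiplies the probabilities along the branch, and adds up the branches that have exactly k chickens.

```python
from itertools import product


def prob_k_chicken(k: int, n_customers: int, p_chicken: float) -> float:
    """Add up the probabilities of all tree branches with exactly k chickens."""
    total = 0.0
    for choices in product(("chicken", "pork"), repeat=n_customers):
        if choices.count("chicken") != k:
            continue
        branch = 1.0
        for choice in choices:
            branch *= p_chicken if choice == "chicken" else 1 - p_chicken
        total += branch
    return total


p_two_of_three = prob_k_chicken(2, 3, 0.7)
print(round(p_two_of_three, 3))  # prints: 0.441
```

Walking the whole tree is fine for 3 customers (8 branches), but the number of branches doubles with every extra customer, which is why the formula is worth having.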
OK. That was a lot of work for something we knew already, but now we can
answer harder questions.
Example: You say "70% choose chicken, so 7 of the next 10
customers should choose chicken" ... what are the chances
you are right?
p = 0.7
n = 10
k = 7
So we get:
p^k(1-p)^(n-k) = 0.7^7 × (1-0.7)^(10-7) = 0.7^7 × 0.3^3 = 0.0022235661
That is the probability of each outcome.
And the total number of those outcomes is:
n! / (k!(n-k)!) = 10! / (7!(10-7)!) = (10×9×8) / (3×2×1) = 120
And we get:
Number of outcomes we want × Probability of each outcome = 120 × 0.0022235661 = 0.266827932
In fact the probability of 7 out of 10 choosing chicken is only
about 27%
Moral of the story: even though the long-run average is 70%, don't
expect 7 out of the next 10.
Putting it Together
Now we know how to calculate how many:
n! / (k!(n-k)!)
And the probability of each:
p^k(1-p)^(n-k)
We can multiply them together:
Probability of k out of n ways:
P(k out of n) = n! / (k!(n-k)!) × p^k(1-p)^(n-k)
The General Binomial Probability Formula
Important Notes:
The trials are independent,
There are only two possible outcomes at each trial,
The probability of "success" at each trial is constant.
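The general formula translates almost word for word into Python. A minimal sketch (the function name is our own), checked against the two sandwich examples:

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(k out of n): k successes in n independent trials, success chance p."""
    ways = comb(n, k)                       # how many outcomes: n! / (k!(n-k)!)
    each = p ** k * (1 - p) ** (n - k)      # probability of each such outcome
    return ways * each


print(round(binomial_pmf(2, 3, 0.7), 3))    # prints: 0.441
print(round(binomial_pmf(7, 10, 0.7), 4))   # prints: 0.2668
```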
Throw the Die
A fair die is thrown four times. Calculate the probabilities of getting:
0 Twos
1 Two
2 Twos
3 Twos
4 Twos
In this case n=4, p = P(Two) = 1/6
X is the Random Variable "Number of Twos from four throws".
Substitute k = 0 to 4 into the formula:
P(k out of n) = n! / (k!(n-k)!) × p^k(1-p)^(n-k)
Like this (to 4 decimal places):
P(X = 0) = (4!/0!4!) × (1/6)^0 × (5/6)^4 = 1 × 1 × (5/6)^4 = 0.4823
P(X = 1) = (4!/1!3!) × (1/6)^1 × (5/6)^3 = 4 × (1/6) × (5/6)^3 = 0.3858
P(X = 2) = (4!/2!2!) × (1/6)^2 × (5/6)^2 = 6 × (1/6)^2 × (5/6)^2 = 0.1157
P(X = 3) = (4!/3!1!) × (1/6)^3 × (5/6)^1 = 4 × (1/6)^3 × (5/6) = 0.0154
P(X = 4) = (4!/4!0!) × (1/6)^4 × (5/6)^0 = 1 × (1/6)^4 × 1 = 0.0008
Summary: "for the 4 throws, there is a 48% chance of no twos, 39% chance
of 1 two, 12% chance of 2 twos, 1.5% chance of 3 twos, and a tiny 0.08%
chance of all throws being a two (but it still could happen!)"
This time the Bar Graph is not symmetrical:
It is not symmetrical!
It is skewed because p is not 0.5
Sports Bikes
Your company makes sports bikes. 90% pass final inspection (and 10% fail
and need to be fixed).
What is the expected Mean and Variance of the 4 next inspections?
First, let's calculate all probabilities.
n = 4,
p = P(Pass) = 0.9
X is the Random Variable "Number of passes from four inspections".
Substitute k = 0 to 4 into the formula:
P(k out of n) = n! / (k!(n-k)!) × p^k(1-p)^(n-k)
Like this:
P(X = 0) = (4!/0!4!) × 0.9^0 × 0.1^4 = 1 × 1 × 0.0001 = 0.0001
P(X = 1) = (4!/1!3!) × 0.9^1 × 0.1^3 = 4 × 0.9 × 0.001 = 0.0036
P(X = 2) = (4!/2!2!) × 0.9^2 × 0.1^2 = 6 × 0.81 × 0.01 = 0.0486
P(X = 3) = (4!/3!1!) × 0.9^3 × 0.1^1 = 4 × 0.729 × 0.1 = 0.2916
P(X = 4) = (4!/4!0!) × 0.9^4 × 0.1^0 = 1 × 0.6561 × 1 = 0.6561
Summary: "for the 4 next bikes, there is a tiny 0.01% chance of no passes,
0.36% chance of 1 pass, 5% chance of 2 passes, 29% chance of 3 passes,
and a whopping 66% chance they all pass the inspection."
Mean, Variance and Standard Deviation
Let's calculate the Mean, Variance and Standard Deviation for the Sports Bike
inspections.
There are (relatively) simple formulas for them. They are a little hard to
prove, but they do work!
The mean, or "expected value", is:
μ = np
For the sports bikes:
μ = 4 × 0.9 = 3.6
So we would expect 3.6 bikes (out of 4) to pass the inspection.
Makes sense really ... 0.9 chance for each bike times 4 bikes equals 3.6
The formula for Variance is:
Variance: σ² = np(1-p)
And Standard Deviation is the square root of variance:
σ = √(np(1-p))
For the sports bikes:
Variance: σ² = 4 × 0.9 × 0.1 = 0.36
Standard Deviation is:
σ = √(0.36) = 0.6
Note: we could also calculate them manually, by making a table like this:
X      P(X)     X × P(X)   X² × P(X)
0      0.0001   0          0
1      0.0036   0.0036     0.0036
2      0.0486   0.0972     0.1944
3      0.2916   0.8748     2.6244
4      0.6561   2.6244     10.4976
SUM:            3.6        13.32
The mean is the Sum of (X × P(X)):
μ = 3.6
The variance is the Sum of (X² × P(X)) minus Mean²:
Variance: σ² = 13.32 − 3.6² = 0.36
Standard Deviation is:
σ = √(0.36) = 0.6
And we got the same results as before (yay!)
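The "manual table" and the shortcut formulas can be compared in a few lines. This is just a sketch with a function name of our own: it builds every P(X = k), sums the table columns, and lets you check the result against np and np(1-p).

```python
from math import comb, sqrt


def binomial_summary(n: int, p: float):
    """Mean, variance and standard deviation computed from the full table."""
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum(k * k * pk for k, pk in enumerate(pmf)) - mean ** 2
    return mean, variance, sqrt(variance)


mean, variance, sd = binomial_summary(4, 0.9)
print(round(mean, 4), round(variance, 4), round(sd, 4))   # table method
print(4 * 0.9, round(4 * 0.9 * 0.1, 4))                   # shortcut formulas
```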
Summary
The General Binomial Probability Formula
P(k out of n) = n! / (k!(n-k)!) × p^k(1-p)^(n-k)
Mean value of X: μ = np
Variance of X: σ² = np(1-p)
Standard Deviation of X: σ = √(np(1-p))
Normal Distribution
Data can be "distributed" (spread out) in different ways.
It can be spread out
more on the left
Or more on the right
Or it can be all jumbled up
But there are many cases where the data tends to be around a central value
with no bias left or right, and it gets close to a "Normal Distribution" like this:
A Normal Distribution
The "Bell Curve" is a Normal Distribution.
And the yellow histogram shows some data that follows it closely, but not
perfectly (which is usual).
It is often called a "Bell Curve"
because it looks like a bell.
Many things closely follow a Normal Distribution:
heights of people
size of things produced by machines
errors in measurements
blood pressure
marks on a test
We say the data is "normally distributed".
The Normal Distribution has:
mean = median = mode
symmetry about the center
50% of values less than the mean and 50% greater than the mean
Quincunx
You can see a normal distribution being created by random
chance!
It is called the Quincunx and it is an amazing machine.
Have a play with it!
Standard Deviations
The Standard Deviation is a measure of how spread out numbers are (read
that page for details on how to calculate it).
When you calculate the standard deviation of your data, you will find that
(generally):
68% of values are within 1 standard deviation of the mean
95% are within 2 standard deviations
99.7% are within 3 standard deviations
Example: 95% of students at school are between 1.1m and
1.7m tall.
Assuming this data is normally distributed can you calculate the
mean and standard deviation?
The mean is halfway between 1.1m and 1.7m:
Mean = (1.1m + 1.7m) / 2 = 1.4m
95% is 2 standard deviations either side of the mean (a
total of 4 standard deviations) so:
1 standard deviation = (1.7m-1.1m) / 4
= 0.6m / 4 = 0.15m
And this is the result:
It is good to know the standard deviation, because we can say that any value
is:
likely to be within 1 standard deviation (68 out of 100 should be)
very likely to be within 2 standard deviations (95 out of 100 should
be)
almost certainly within 3 standard deviations (997 out of 1000
should be)
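Where do 68%, 95% and 99.7% come from? For an exactly Normal Distribution they can be computed. The sketch below uses a standard identity that is not stated in the text above: the fraction within k standard deviations of the mean equals erf(k/√2). The function name is ours.

```python
from math import erf, sqrt


def fraction_within(k: float) -> float:
    """Fraction of a Normal Distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
```

The exact figures are roughly 68.3%, 95.4% and 99.7%, so the familiar rule is a rounded version of them.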
Standard Scores
The number of standard deviations from the mean is also called the
"Standard Score", "sigma" or "z-score". Get used to those words!
Example: In that same school one of your friends is 1.85m
tall
You can see on the bell curve that 1.85m is 3 standard
deviations from the mean of 1.4, so:
Your friend's height has a "z-score" of 3.0
It is also possible to calculate how many standard deviations 1.85 is
from the mean
How far is 1.85 from the mean?
It is 1.85 - 1.4 = 0.45m from the mean
How many standard deviations is that? The standard deviation is
0.15m, so:
0.45m / 0.15m = 3 standard deviations
So to convert a value to a Standard Score ("z-score"):
first subtract the mean,
then divide by the Standard Deviation
And doing that is called "Standardizing":
You can take any Normal Distribution and convert it to The Standard Normal
Distribution.
Example: Travel Time
A survey of daily travel time had these results (in minutes):
26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32,
28, 34
The Mean is 38.8 minutes, and the Standard Deviation is 11.4
minutes (you can copy and paste the values into the Standard
Deviation Calculator if you want).
Convert the values to z-scores ("standard scores").
To convert 26:
first subtract the mean: 26 - 38.8 = -12.8,
then divide by the Standard Deviation: -12.8/11.4 = -1.12
So 26 is -1.12 Standard Deviations from the Mean
Here are the first three conversions
Original Value   Calculation          Standard Score (z-score)
26               (26-38.8) / 11.4 =   -1.12
33               (33-38.8) / 11.4 =   -0.51
65               (65-38.8) / 11.4 =   +2.30
...              ...                  ...
And here they are graphically:
You can calculate the rest of the z-scores yourself!
Here is the formula for z-score that we have been using:
z = (x − μ) / σ
z is the "z-score" (Standard Score)
x is the value to be standardized
μ is the mean
σ is the standard deviation
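To calculate the rest of the travel-time z-scores, a short Python sketch like this will do. The function name is ours, and we assume the Standard Deviation quoted above is the population version (statistics.pstdev), which is the one that gives 11.4 for this data.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize each value: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)          # population standard deviation (assumption)
    return [(x - mu) / sigma for x in values]


travel_times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
                26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

scores = z_scores(travel_times)
for value, z in zip(travel_times, scores):
    print(f"{value:>3}  {z:+.2f}")
```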
Why Standardize ... ?
It can help you make decisions about your data.
Example: Professor Willoughby is marking a test.
Here are the students' results (out of 60 points):
20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17
Most students didn't even get 30 out of 60, and most will fail.
The test must have been really hard, so the Prof decides to Standardize
all the scores and only fail people 1 standard deviation below the mean.
The Mean is 23, and the Standard Deviation is 6.6, and these are
the Standard Scores:
-0.45, -1.21, 0.45, 1.36, -0.76, 0.76, 1.82, -1.36, 0.45, -0.15, -0.91
Only 2 students will fail (the ones who scored 15 and 14 on the test)
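The Prof's rule can be written as a small filter. This is a sketch under the same assumption as before (population standard deviation), and the function name and cutoff argument are our own.

```python
from statistics import mean, pstdev


def failing_scores(scores, cutoff_z=-1.0):
    """Return the scores that are more than 1 standard deviation below the mean."""
    mu = mean(scores)
    sigma = pstdev(scores)          # population standard deviation (assumption)
    return [s for s in scores if (s - mu) / sigma < cutoff_z]


results = [20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17]
print(failing_scores(results))  # prints: [15, 14]
```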
It also makes life easier because we only need one table (the Standard
Normal Distribution Table), rather than doing calculations individually for
each value of mean and standard deviation.
In More Detail
Here is the Standard Normal Distribution with percentages for every half of
a standard deviation, and cumulative percentages:
Example: Your score in a recent test was 0.5 standard
deviations above the average, how many people scored lower than
you did?
Between 0 and 0.5 is 19.1%
Less than 0 is 50% (left half of the curve)
So the total less than you is:
50% + 19.1% = 69.1%
In theory 69.1% scored less than you did (but with real data the
percentage may be different)
A Practical Example: Your company packages sugar in 1 kg bags.
When you weigh a sample of bags you get these
results:
1007g, 1032g, 1002g, 983g, 1004g, ... (a
hundred measurements)
Mean = 1010g
Standard Deviation = 20g
Some values are less than 1000g ... can you fix
that?
The normal distribution of your measurements looks like this:
31% of the bags are less than 1000g,
which is cheating the customer!
Because it is a random thing we can't stop bags having less than 1000g, but
we can reduce it a lot ...
if 1000g was at -3 standard deviations there would be
only 0.1% (very small)
at -2.5 standard deviations we can calculate:
below -3 is 0.1% and between -3 and -2.5 standard deviations
is 0.5%, together that is 0.1%+0.5% = 0.6% (a good
choice I think)
So let us adjust the machine to have 1000g at -2.5 standard
deviations from the mean.
We could adjust it to:
increase the amount of sugar in each bag (this would change the
mean), or
make it more accurate (this would reduce the standard deviation)
Let us try both:
Adjust the mean amount in each bag
The standard deviation is 20g,
and we need 2.5 of them:
2.5 × 20g = 50g
So the machine should
average 1050g, like this:
Adjust the accuracy of the machine
Or we can keep the same mean (of 1010g),
but then we
need 2.5 standard deviations to be equal to
10g:
10g / 2.5 = 4g
So the standard deviation should be 4g,
like this:
(We hope the machine is that accurate!)
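We can check both adjustments with a quick calculation. This sketch assumes the bag weights really are normally distributed, and uses the standard erf form of the Normal CDF (not given in the text above). The function name is ours.

```python
from math import erf, sqrt


def fraction_below(limit: float, mean: float, sd: float) -> float:
    """Fraction of a Normal Distribution (mean, sd) that falls below limit."""
    z = (limit - mean) / sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


current = fraction_below(1000, 1010, 20)      # machine as it is now
higher_mean = fraction_below(1000, 1050, 20)  # average raised to 1050g
more_accurate = fraction_below(1000, 1010, 4) # standard deviation cut to 4g

print(f"{current:.1%}  {higher_mean:.1%}  {more_accurate:.1%}")
```

Both fixes put 1000g at -2.5 standard deviations, so they leave the same small fraction of underweight bags (about 0.6%), compared with about 31% now.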
Or perhaps we could have some combination of better accuracy and slightly
larger average size, I will leave that up to you!
In Even More Detail!
We have a Standard Normal Distribution Table if you want more accurate
values.
Standard Normal Distribution Table
This is the "bell-shaped" curve of the Standard Normal Distribution.
It is a Normal Distribution with mean 0 and standard deviation 1.
It shows you the percent of population:
between 0 and Z (option "0 to Z")
less than Z (option "Up to Z")
greater than Z (option "Z onwards")
It is correct to 0.1%, for example 17.36% is rounded to 17.4%
The Table
You can get more accurate values from the table below. The table shows the
area from 0 to Z.
Instead of one LONG table, we have put the "0.1"s running down, then the
"0.01"s running along. (Example of how to use is below)
Z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1   0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2   0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3   0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4   0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5   0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6   0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7   0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8   0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9   0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0   0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
Example: Percent of Population Between 0 and 0.45
Start at the row for 0.4, and read along to the column for 0.45: the value
there is 0.1736
And 0.1736 is 17.36%
So 17.36% of the population is between 0 and 0.45 Standard Deviations
from the Mean.
Because the curve is symmetrical, the same table can be used for values
in either direction, so the area from 0 to -0.45 is also 0.1736.
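The table value can also be checked with a short calculation. This is only a sketch in Python: the function name area_from_mean is made up for illustration, and it relies on the standard identity that the area from 0 to z under the standard normal curve equals 0.5 * erf(z / sqrt(2)).

```python
import math

def area_from_mean(z):
    """Area under the standard normal curve between 0 and z."""
    # The curve is symmetrical, so only the size of z matters, not its sign.
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

print(round(area_from_mean(0.45), 4))   # 0.1736
print(round(area_from_mean(-0.45), 4))  # 0.1736
```

Both calls give the same answer, which is the symmetry the table relies on.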
Example: Percent of Population With Z Between -1 and 2
From -1 to 0 is the same as from 0 to +1:
In the row for 1.0, the first column (1.00) has the value 0.3413
From 0 to +2 is:
In the row for 2.0, the first column (2.00) has the value 0.4772
Add the two to get the total between -1 and 2:
0.3413 + 0.4772 = 0.8185
And 0.8185 is 81.85%
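The same two-step lookup can be sketched in Python with the standard library's NormalDist class (Python 3.8 or later). The helper name area_between is an assumption made for this illustration. Note that the direct calculation gives about 0.8186 rather than 0.8185, because each table entry has already been rounded to four decimal places before being added.

```python
from statistics import NormalDist

def area_between(z_low, z_high):
    """Area under the standard normal curve between two z values."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

left = area_between(-1, 0)   # same area as from 0 to +1
right = area_between(0, 2)
total = area_between(-1, 2)

print(round(left, 4), round(right, 4))  # 0.3413 0.4772
print(round(total, 4))
```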
Skewed Data
Data can be "skewed", meaning it tends to have a long tail on one side or
the other:
[Figure: three curves labelled Negative Skew, No Skew and Positive Skew]
Negative Skew?
Why is it called negative skew? Because the long "tail" is on the negative
side of the peak.
People sometimes say it is "skewed to the left" (the long tail is on the
left hand side).
The mean is also on the left of the peak.
The Normal Distribution has No Skew
A Normal Distribution is not skewed.
It is perfectly symmetrical.
And the Mean is exactly at the peak.
Positive Skew
And positive skew is when the long tail is on the positive side of the peak,
and some people say it is "skewed to the right".
The mean is on the right of the peak value.
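Here is a small sketch in Python of where the mean sits compared to the peak. The two data sets are made up for illustration (they do not come from the text), and the "peak" is taken to be the most common value (the mode). The pattern shown is typical of skewed data rather than a guarantee for every data set.

```python
from statistics import mean, mode

# Made-up data with a long tail on the right (positive skew)
right_tail = [1, 2, 2, 2, 3, 3, 4, 10]
# Mirror image of the same data: long tail on the left (negative skew)
left_tail = [11 - x for x in right_tail]

print(mode(right_tail), mean(right_tail))  # peak at 2, mean 3.375
print(mode(left_tail), mean(left_tail))    # peak at 9, mean 7.625
```

In the first set the mean is pulled to the right of the peak by the single large value, and in the mirrored set it is pulled to the left.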
Example: Income Distribution
Here is some data I extracted from a recent Census.
As you can see it is positively skewed ... in fact the tail continues way
past $100,000
Calculating Skewness
"Skewness" (the amount of skew) can be calculated. For example, you could
use the SKEW() function in Excel or OpenOffice Calc.
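For readers without a spreadsheet, here is a minimal sketch in Python. It assumes the adjusted sample skewness formula n / ((n - 1)(n - 2)) times the sum of ((x - mean) / s) cubed, where s is the sample standard deviation. As far as I know this is the formula Excel documents for SKEW(), but treat that match as an assumption. The function name sample_skewness and the data are made up for illustration.

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Adjusted sample skewness (assumed to match the spreadsheet SKEW formula)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    if s == 0:
        raise ValueError("all values are equal, so skewness is undefined")
    # Cube each standardized deviation so that sign and long tails both count
    cubed = sum(((x - m) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

right_tail = [1, 2, 2, 2, 3, 3, 4, 10]    # made-up data, long tail on the right
left_tail = [11 - x for x in right_tail]  # mirror image, long tail on the left

print(sample_skewness(right_tail))       # positive number
print(sample_skewness(left_tail))        # negative number, same size
print(sample_skewness([1, 2, 3, 4, 5]))  # symmetrical data, so zero or very close to it
```

The sign of the result tells you the direction of the long tail, and a value near zero means the data is close to symmetrical.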