MATB344 Applied Statistics, Chapter 0: Introduction to Statistics

Post on 01-Jan-2016


1

MATB344 Applied Statistics

Chapter 0: Introduction to Statistics

2

What is Statistics?

• Statistics - a branch of mathematics that has applications in daily life
• Like a “language” that has to be learned
• Involves gathering, analysis and presentation of data.
• We always encounter it in our daily life. It is unavoidable.
• Example: in the Department of Statistics, Malaysia…

3

Uses of Statistics

• General usage:
  – a theoretical discipline in its own right
  – a tool for researchers in other fields
  – a general tool to draw general conclusions in a large variety of applications

• In Politics
  – Forecasting and predicting winners of elections
    • where to concentrate campaign appearances
    • how and when to advertise
    • where to spend money effectively …

• In Industry
  – To market a product.
    • For example, to predict the average length of life of a light bulb, we cannot test all the bulbs, so we choose a sample to obtain the statistics.

4

Statistics in IT?

• Analysis of experimental results

• Design and development of statistical software

• One of the machine learning methods uses statistical concepts – Statistical Learning Theory

5

What’s involved in Statistics?

• Collects numbers or data

• Systematically organizes or arranges the data

• Analyzes the data…extracts relevant information to provide a complete numerical description

• Infers general conclusions about the problem using this numerical description

6

Example use of Statistics

• A newspaper produces the results of its survey on terrorism as follows:

THE WAR ON TERRORISM

Do you think that the United States war on terrorism will spread to countries other than Afghanistan?

YES 64%   NO 34%

Do you think that the United States should be directly involved in negotiating peace agreements in other parts of the world?

YES 62%   NO 31%

How do we get the poll?

Ask everyone?

Is it possible?

OF COURSE..NOT

7

Problems in Statistics

• We need conclusions and predictions about the whole body of measurements (the POPULATION), e.g. Malaysians.

• But we cannot survey them all.

• Sometimes the whole body of measurements is large and cannot be totally enumerated.

• Solution: use a smaller set of measurements (the SAMPLE) to represent the whole body of measurements.

8

Examples

• To predict the average length of life of a light bulb
  - Enumerating the population is destructive: we cannot take every light bulb and test it.
  - So, select a smaller number of light bulbs as a sample.

• To forecast the winner of an election
  - The population of the whole country is too big, and people do change their minds.
  - So, select a group of people in certain locations to be the sample.
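The light bulb example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 10,000 lifetimes, the mean of 1000 hours, the spread of 100 hours and the sample size of 200 are all made-up values, not figures from the slides.

```python
import random
import statistics

# Hypothetical population: lifetimes (in hours) of 10,000 bulbs. In reality we
# could never measure all of these, because testing a bulb destroys it.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.gauss(1000, 100) for _ in range(10_000)]

# Select a small sample of bulbs to test instead of the whole population.
sample = rng.sample(population, 200)

population_mean = statistics.mean(population)  # unknown in practice
sample_mean = statistics.mean(sample)          # what we can actually measure

print(f"Sample mean life:     {sample_mean:.1f} hours")
print(f"Population mean life: {population_mean:.1f} hours")
```

The two printed values should be close but not identical, which is the point of sampling: the sample gives an estimate of the population value, not the value itself.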

9

Common Terms

• Experimental Units: The set of items or objects on which measurements are taken

• Sample (or Population): The set of measurements taken on the experimental units.

• Examples:
  – Light bulbs
    • Experimental unit = a bulb
    • Sample = the measured operating lives of all the bulbs tested.
  – Opinion polls
    • Experimental unit = a person
    • Sample = the set of each person's opinion on who the winner of the election will be.

10

Two kinds of Statistics

• Inferential Statistics

• Descriptive Statistics

11

Descriptive Statistics

• Used to describe sets of measurements.

• Examples: bar charts, pie charts, line charts, etc.

• Suitable for an entire population.

• DESCRIPTIVE STATISTICS consists of procedures used to summarize and describe the set of measurements.
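Besides charts, a set of measurements can be summarized numerically. The sketch below uses Python's standard `statistics` module on a small made-up data set; the `scores` values are hypothetical and chosen only to keep the arithmetic easy to check by hand.

```python
import statistics

# Hypothetical set of measurements (for example, eight quiz scores).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "mean": statistics.mean(scores),      # arithmetic average
    "median": statistics.median(scores),  # middle value of the sorted data
    "mode": statistics.mode(scores),      # most frequent value
    # pstdev treats the data as the entire population, not a sample
    "std_dev": statistics.pstdev(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

`pstdev` is used here because the slide describes descriptive statistics as suitable for an entire population; for a sample, `statistics.stdev` would be the usual choice.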

12

Inferential Statistics

• Used to describe / make inferences about a population based on statistics of the sample.

• Used when we cannot enumerate the whole population

• INFERENTIAL STATISTICS: Procedures used to draw conclusions or inferences about the population from information contained in the sample.

13

Objective of Inferential Statistics

• To make inferences about a population
  – draw conclusions
  – make predictions
  – make decisions
  from information contained in a sample.

• The statistician’s job is to find the best way to do this.

14

But, …

• How can we be sure that the poll result is reliable?

• We need a measure of reliability.

Consider this internet opinion poll…

Who makes the best burgers?            Votes   Percent
McDonalds                                123       13%
Burger King                              384       39%
Wendy’s                                  304       31%
All three have equally good burgers       72        7%
None of these have good burgers           98       10%

Our conclusions could be incorrect…
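As a quick check, the percentages in this poll are simply each vote count divided by the total number of votes. The sketch below recomputes them from the counts shown in the poll; the variable names are illustrative only.

```python
# Vote counts copied from the internet opinion poll in this slide.
votes = {
    "McDonalds": 123,
    "Burger King": 384,
    "Wendy's": 304,
    "All three have equally good burgers": 72,
    "None of these have good burgers": 98,
}

total_votes = sum(votes.values())

# Each percentage is the share of all votes, rounded to a whole number.
percentages = {
    choice: round(100 * count / total_votes) for choice, count in votes.items()
}

print(f"Total votes: {total_votes}")
for choice, percent in percentages.items():
    print(f"{choice}: {percent}%")
```

The calculation describes only the people who chose to vote. Nothing in it says how well those voters represent everyone who eats burgers, which is why a measure of reliability is needed.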

15

Steps in Inferential Statistics

• Define the objective of the experiment and the population of interest

• Determine the design of the experiment and the sampling plan to be used

• Collect and analyze the data

• Make inferences about the population from information in the sample

• Determine the goodness or reliability of the inference.

16

Step 1:

• Define the objective of the experiment and the population of interest

Example: In a Presidential Election
Objective: To determine who will get the most votes
Population: Set of all votes (from all registered voters)
Sample: Sample voters from each state in the USA.

17

Step 2:

• Determine the design of the experiment and the sampling plan to be used

• Decide how to select the sample.

• Decide how big a sample to select.

• Consider how much it will cost to select the sample.
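One common way to reason about "how big a sample" when estimating a proportion is the normal-approximation formula n = z² p (1 − p) / e². The sketch below is an illustration, not a method taken from these slides: the function name `required_sample_size`, the 95% confidence level (z = 1.96) and the conservative guess p = 0.5 are all assumptions, and the formula presumes simple random sampling.

```python
import math

def required_sample_size(margin_of_error, z=1.96, p=0.5):
    """Approximate sample size needed to estimate a proportion.

    Assumes simple random sampling and the normal approximation.
    z=1.96 corresponds to roughly 95% confidence, and p=0.5 is the
    most conservative guess for the true proportion.
    """
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

# A tighter margin of error needs a larger (and more costly) sample.
print(required_sample_size(0.05))  # margin of +/- 5 percentage points
print(required_sample_size(0.03))  # margin of +/- 3 percentage points
```

Under these assumptions, moving from a 5-point to a 3-point margin roughly triples the number of people to survey, which is where the cost question in this step comes from.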

18

Step 3:

• Collect and analyze the data

• Collect information from the sample

• Use appropriate method of analysis

19

Step 4:

• Make inferences about the population from information in the sample

• Use information from the analysis to make inferences

• Many methods are available, but only one is the most accurate. Choose the best.

20

Step 5:

• Determine the goodness or reliability of the inference.

• Inference might be wrong because we’re not looking at the whole population.

• We need a measure of reliability.
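One widely used measure of reliability for a poll percentage is the margin of error. The sketch below applies the usual normal-approximation formula to the 64% "YES" figure from the newspaper survey earlier in the slides. The sample size `n = 1000` is hypothetical, since the slides do not report one, and the formula assumes a simple random sample.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation; z=1.96 gives roughly 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.64  # "YES" share from the newspaper survey
n = 1000      # hypothetical sample size; the slides do not give one

moe = margin_of_error(p_hat, n)
print(f"Estimate: {p_hat:.0%} +/- {moe:.1%}")
```

With these assumed inputs the margin is about three percentage points. A smaller sample would widen it, so the same 64% could be much less trustworthy than it looks.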

21

Conclusion: Learn to View Statistics Critically

• Why? Because statistics can lie.

• According to people against statistics, there are three kinds of lies…
  – Lies
  – Damn Lies
  – Statistics

• Be positive! You need to make statistics work for you, not lie for you!

22

Making Reliable Statistics

• Use software tools to help perform the procedures.

• List of software:

– MINITAB

– SPSS

– Microsoft EXCEL

– Java applets.
