
Introduction of Statistics

By: Dr. Po-Lin Lai

Dr. Po-Lin Lai, PhD, Cardiff University
Research interests:
• Air transportation
• Air cargo supply chain
• Airport management and operations
• Airline finance

Email: [email protected]

This module is an applied statistics course for first-year students of the Department of International Logistics.

Topics include the main elements of business statistics and tutorials.

This course provides students with a range of statistical approaches to data analysis and techniques that are useful in management and business applications.

This course aims to familiarise students with statistical concepts, theory, and terminology, and with applying them to practical situations.

In addition, students will learn how to analyse data using SPSS.

Anderson, D., Sweeney, D., Williams, T. Statistics for Business and Economics, 11th Edition, South-Western.

Silver, M. Business Statistics, 2nd Edition, McGraw-Hill.

Pallant, J. SPSS Survival Manual, 10th Edition, McGraw-Hill.

• Mid-Term Exam: 35%
• Final Exam: 35%
• Assignment: 20%
• Class Participation & Discussion: 10%

Content
• The concepts of statistics
• Types of statistical applications in business
• Types of data
• Collecting data

Learning Objectives
• Introduce the field of statistics
• Demonstrate how statistics applies to business
• Establish the link between statistics and data
• Identify the different types of data and data-collection methods
• Differentiate between population and sample data
• Differentiate between descriptive and inferential statistics

Statistics is the science of data. It involves collecting, classifying, summarising, organising, analysing, and interpreting numerical information.

• Population
• Sample
• Statistical inference

Population: A population is the group of all items of interest to a statistics practitioner. It is frequently very large and may be infinitely large.

In the language of statistics, population does not necessarily refer to a group of people. It may refer to the population of ball bearings produced at a large plant.

A descriptive measure of a population is called a parameter.

Sample: A sample is a set of data drawn from the studied population. A descriptive measure of a sample is called a statistic.

We use statistics to make inferences about parameters.

Some universities have signed agreements with a variety of private companies. These agreements bind the university to sell these companies’ products exclusively on the campus.

Example: CAU, with a total enrollment of about 5,000 students, has offered Pepsi-Cola an exclusivity agreement that would give Pepsi exclusive rights to sell its products at all university facilities for the next year, with an option for future years. In return, the university would receive 35% of the on-campus revenues and an additional lump sum of $200,000 per year.

In the Pepsi case, the statistic we would compute is the mean number of soft drinks consumed in the last week by the 500 students in the sample. We would then use the sample mean to infer the value of the population mean, which is the parameter of interest in this problem.

The parameter of interest in the Pepsi case is the mean number of soft drinks consumed by all the students at the university. In most applications of inferential statistics, the parameter represents the information we need.
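The distinction between a parameter and a statistic in the Pepsi case can be sketched in Python. This is only an illustration: the consumption figures are simulated (an assumed range of 0 to 14 drinks per week), and the names `population` and `sample` are hypothetical, not data from the actual case.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: weekly soft-drink counts for all 5,000 students.
# These values are simulated for illustration only; they are not real data.
population = [random.randint(0, 14) for _ in range(5000)]

# Parameter: a descriptive measure of the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample of 500 students.
sample = random.sample(population, 500)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the population mean is unknown, which is why the sample mean is used to infer it.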

Statistical inference: Statistical inference is the process of making an estimate, prediction, or decision about a population based on sample data.

Because populations are almost always very large, investigating each member of the population would be impractical and expensive. It is far easier and cheaper to take a sample from the population of interest and draw conclusions or make estimates about the population on the basis of information provided by the sample.

However, such conclusions and estimates are not always going to be correct. For this reason, we build into the statistical inference a measure of reliability. There are two such measures: the confidence level and the significance level. The confidence level is the proportion of times that an estimating procedure will be correct.

For example, in the Pepsi case, we will produce an estimate of the average number of soft drinks to be consumed by all 5,000 students that has a confidence level of 95%. In other words, estimates based on this form of statistical inference will be correct 95% of the time.
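A minimal sketch of how a 95% confidence interval for a mean might be computed is given here. It assumes a large sample, so that the normal approximation is reasonable, and it uses simulated consumption figures; the sample values and variable names are hypothetical, not part of the case.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical sample: soft drinks consumed last week by 500 students.
sample = [random.randint(0, 14) for _ in range(500)]

n = len(sample)
x_bar = mean(sample)   # sample mean (the statistic)
s = stdev(sample)      # sample standard deviation

# Critical value for a 95% confidence level (about 1.96).
z = NormalDist().inv_cdf(0.975)

# Margin of error and interval under the normal approximation.
margin = z * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"sample mean: {x_bar:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The 95% refers to the procedure: if the sampling were repeated many times, about 95% of the intervals built this way would contain the population mean.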

Accounting
• Sampling: audit

Marketing
• Consumer preferences: Tesco
• Financial trends

Economics
• Forecasting
• Demographics

Finance
• Recommendations for investment: coefficient of variation

Statistical Methods
• Descriptive Statistics
• Inferential Statistics

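Descriptive statistics involves:
• Collecting data
• Presenting data
• Characterizing data

Purpose: describe data

Example summary measures: X̄ = 30.5, S² = 113

[Figure: chart of dollar values ($) by quarter (Q1 to Q4), axis ticks 0, 25, 50]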
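Summary measures such as the sample mean (X̄) and sample variance (S²) can be computed with Python's standard `statistics` module. The quarterly figures below are hypothetical and are not the data behind the values quoted on the slide.

```python
import statistics

# Hypothetical dollar values; chosen only to illustrate the calculation.
values = [20, 25, 30, 35, 45]

x_bar = statistics.mean(values)          # sample mean
s_squared = statistics.variance(values)  # sample variance (divides by n - 1)

print(f"mean = {x_bar}, variance = {s_squared}")
```

Note that `statistics.variance` divides by n - 1 (the sample variance); `statistics.pvariance` divides by n and is meant for a whole population.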

Inferential statistics involves:
• Estimation
• Hypothesis testing

Purpose: make decisions about population characteristics

[Figure: a population, labeled "Population?"]

Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation. By type, data can be classified into:
• Quantitative data
• Qualitative data

Quantitative data are recorded on a naturally occurring numerical scale.

Examples: height, weight, salaries, and distances.


Qualitative data cannot be measured on a natural numerical scale; they can only be classified into one of a group of categories.

Examples:
• College major of each student in a class
• Gender of each employee at a company
• Method of payment (cash, check, credit card)

By the way the data are obtained:
• Primary: commissioned to solve the problem at hand.
• Secondary: commissioned by somebody else.

By the purpose of analysis:

Cross-sectional: a “snapshot” taken at the same point in time.
▪ A market research report of the EU car market in 2009.
▪ The UK temperature in July 2012.

Time series: collected over several periods of time.
▪ The UK temperature from 2000 to 2012.
▪ The price of petrol between 2005 and 2009.

But the majority of data are derived from surveys, so we must consider possible sources of error.

Five types:

Mis-recording: human error linked with data entry.
▪ Interviewer records respondent’s age as 32, not 23.

Sampling error:
▪ Occurs naturally; depends on the sampling method.
▪ Can be calculated (estimated, e.g., +/- 3%).
▪ Declines as sample size rises, but not proportionately.

Response error:
▪ Arises because questions are asked in a social context.
▪ Importance of question wording.
▪ E.g. attribute ratings: 10-point scale, allocate 100 points.
▪ People answer because of what is expected of them.

Non-response error:
▪ Low response rates in surveys are normal (40% is excellent).
▪ Low response reduces precision (increases sampling error).
▪ Non-response bias: responders differ from non-responders.
▪ Scott Armstrong: compare early and late respondents on key questions.

Design error:
▪ Arises because of inappropriate sampling methods.
▪ Choice of sampling frame (list).
▪ Problems with quota sampling.
▪ E.g.: members of CIM; Cardiff high street, Tuesday a.m.
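The point that sampling error declines as sample size rises, but not proportionately, can be illustrated with a short calculation. This is a sketch under stated assumptions: simple random sampling, a 95% confidence level (z = 1.96), and a survey proportion of 0.5. The function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion p based on n responses."""
    return z * math.sqrt(p * (1 - p) / n)

# The error shrinks with the square root of n, not with n itself.
for n in (250, 1000, 4000):
    print(f"n = {n:>4}: +/- {margin_of_error(n):.1%}")
```

Under these assumptions a sample of 1,000 gives roughly +/- 3%, and quadrupling the sample to 4,000 only halves that margin.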

• Data from a published source
• Data from a designed experiment
• Data from a survey
• Data collected observationally

Published source: book, journal, newspaper, Web site

Designed experiment: researcher exerts strict control over units

Survey: a group of people are surveyed and their responses are recorded

Observational study: units are observed in their natural setting and variables of interest are recorded

A representative sample exhibits characteristics typical of those possessed by the population of interest.

A random sample of n experimental units is a sample selected from the population in such a way that every different sample of size n has an equal chance of selection.

Every sample of size n has an equal chance of selection.
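A random sample of this kind can be drawn with Python's `random.sample`, which selects without replacement so that every possible sample of size n is equally likely. The sampling frame below (a list of made-up student IDs) is a hypothetical stand-in for a real list of experimental units.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: one ID per unit in the population.
sampling_frame = [f"student_{i:04d}" for i in range(1, 5001)]

n = 10  # sample size
sample = random.sample(sampling_frame, n)  # no unit is selected twice

print(sample)
```

Whether such a sample is also representative depends on the sampling frame covering the population of interest.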

1. Typical Software
• SPSS
• MINITAB
• Excel

2. Need Statistical Understanding
• Assumptions
• Limitations

Content
• Describing qualitative data
• Graphical methods for describing quantitative data

Learning Objectives
• Describe data using graphs

Key terms
• A class is one of the categories into which qualitative data can be classified.
• The class frequency is the number of observations in the data set falling into a particular class.
• The class relative frequency is the class frequency divided by the total number of observations in the data set.
• The class percentage is the class relative frequency multiplied by 100.

Data Presentation

Qualitative Data
• Summary Table
• Bar Graph
• Pie Chart

Quantitative Data
• Dot Plot
• Stem-&-Leaf Display
• Frequency Distribution
• Histogram

1. Lists categories and the number of elements in each category
2. Obtained by tallying responses in each category
3. May show frequencies (counts), percentages, or both

Major        Count
Accounting     130
Economics       20
Management      50
Total          200

Each row is a category; each count is the tally of responses in that category.
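The key terms defined earlier (class frequency, class relative frequency, class percentage) can be applied to this summary table. A minimal sketch, using the counts from the table; the variable names are illustrative only:

```python
# Class frequencies taken from the summary table of majors.
counts = {"Accounting": 130, "Economics": 20, "Management": 50}

total = sum(counts.values())  # total number of observations

# Class relative frequency = class frequency / total observations.
relative = {major: count / total for major, count in counts.items()}

# Class percentage = class relative frequency * 100.
print(f"{'Major':<12}{'Count':>6}{'Rel. freq.':>12}{'Percent':>9}")
for major, count in counts.items():
    rel = relative[major]
    print(f"{major:<12}{count:>6}{rel:>12.2f}{rel * 100:>8.0f}%")
print(f"{'Total':<12}{total:>6}")
```

The relative frequencies sum to 1 and the percentages to 100, which is a quick check on any summary table.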