Data Collection & Sampling Techniques

Upload: margaret-bradley

Post on 23-Dec-2015


MEANING OF STATISTICS

• Statistics is used to mean either statistical data or statistical methods

• Statistics is a method of collecting, organising and analysing numerical data in order to understand a phenomenon or make wise decisions


FUNCTIONS OF STATISTICS

• 1. To present facts in proper form.

• 2. To simplify unwieldy and complex data and to make them easily understandable.

• 3. To help the classification of data according to various characteristics.

• 4. To provide techniques for making comparisons.

• 5. To study relationships between different phenomena.

• 6. To indicate trend behaviour.


LIMITATIONS OF STATISTICS

• 1. Statistics does not study individuals.

• 2. Statistics does not study qualitative phenomena.

• 3. Statistical results are true only on an average.

• 4. Statistical laws are not exact. (Unlike the laws of physical and natural sciences, statistical laws are only approximations.)

• 5. Statistics does not reveal the entire story.

• 6. Statistics is liable to be misused.


Uses of Statistics

• Describe data

• Compare two or more data sets

• Determine if a relationship exists between variables

• Make estimates about population characteristics

• Predict past or future behavior of data


Misuse of statistics

• “There are three types of lies: lies, damn lies, and statistics” Benjamin Disraeli

• “Figures don’t lie, but liars figure”

• “Statistics can be used to prove anything, especially statisticians” Franklin P. Jones


Sources of Misuse

• There are two main sources of misuse of statistics:

– An agenda on the part of a dishonest researcher

– Unintentional errors on the part of a researcher


Misuses of Statistics

• Survey Questions

– Loaded Questions: intentional wording to elicit a desired response

– Order of Questions

– Nonresponse (Refusal): subject refuses to answer questions

– Self-Interest: sponsor of the survey could enjoy monetary gains from the results


Misuses of Statistics

• Missing Data (Partial Pictures)

– Detached Statistics: no comparison is made

– Percentages --

• Implied Connections

– Correlation and Causality: when we find a statistical association between two variables, we cannot conclude that one of the variables is the cause of (or directly affects) the other variable

Exercise 1


Data Collection

• In research, statisticians use data in many different ways.

• Data can be used to describe situations.

• Data can be collected in a variety of ways, BUT if the sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.


Data Analysis


Course objectives

Trainees will analyze graphs.

a. Analyze data presented in a graph.

b. Compare and contrast multiple graphic representations (circle graphs, line graphs, line plot graphs, pictographs, Venn diagrams, and bar graphs) for a single set of data and discuss the advantages/disadvantages of each.

c. Determine and justify the mean, range, mode, and median of a set of data.


Terms

• Mean: The sum of the numbers in a set of data divided by the number of pieces of data. (E.g. D+X analysis, scan compliance, delivery percentage, etc. in MNOP KPI; workload calculation for a post office.)

• Median: The number in the middle of a set of data when the data are arranged in order from least to greatest. When there are 2 middle numbers, the median is the number that is halfway between the two middle numbers.


Terms

• Mode: The number that occurs most frequently in a set of numbers.

• Range: The difference between the largest and smallest values in a numerical data set.


Finding the Mean

• Step 1: Add all numbers in your set of data.

• Step 2: Divide the sum by the number of pieces of data.

Example:

Set of Data: 15, 15, 14, 16

Sum: 60

Total number of pieces of data: 4

Mean: 60 ÷ 4 = 15


Finding the Median

• Step 1: Put all numbers in order from least to greatest.

• Step 2: Find the middle number.

Example:

Set of Data: 15, 15, 14, 16

Ordered: 14, 15, 15, 16

Middle Numbers: 15 and 15

Median: 15
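The two steps above, together with the rule for two middle numbers given under Terms, can be sketched in Python. This is only an illustration; the function name `median` is chosen here and is not part of the original slides.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)      # Step 1: order from least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd count: there is a single middle number
        return ordered[mid]
    # even count: take the value halfway between the two middle numbers
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [15, 15, 14, 16]           # the data set used on the slide
print(median(data))               # the two middle numbers are 15 and 15
```

With an odd number of observations the function returns the single middle value unchanged; with an even number it averages the two middle values, as in the slide's example.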


• Ex: test check figures for two days for unregistered articles. The days should be normal working days, e.g. Wednesday or Thursday.

• Here we are assuming that these days will have normal transactions. Hence these figures are taken as the median for normal transactions.

• Work on other days may vary between the minimum and the maximum.


Finding the Mode

• Step 1: Put all numbers in order from least to greatest.

• Step 2: Find the most popular number.

Example:

Set of Data: 15, 15, 14, 16

Ordered: 14, 15, 15, 16

Mode: 15


• Ex: checking a postman on the beat by the PRIP. The PRIP should select a point where the probability of the postman visiting is high, i.e. the PRIP should select the mode, the point visited most frequently.


Finding the Range

• Step 1: Put all numbers in order from least to greatest.

• Step 2: Subtract the lowest number from the highest number.

Example:

Set of Data: 15, 15, 14, 16

Ordered: 14, 15, 15, 16

Range: 16 – 14 = 2
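As a cross-check of the four worked examples, the same measures can be computed with Python's standard `statistics` module. This is a minimal sketch on the slide's data set; the variable names are chosen here for illustration.

```python
import statistics

data = [15, 15, 14, 16]                  # the data set used in the examples

mean = statistics.mean(data)             # sum of the data divided by the count
med = statistics.median(data)            # middle value of the ordered data
modes = statistics.multimode(data)       # list of all most frequent values
data_range = max(data) - min(data)       # largest value minus smallest value

print(mean, med, modes, data_range)
```

`multimode` is used rather than `mode` so that a data set with several equally frequent values reports all of them instead of only the first.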


Activity

• The number of articles booked in an MPCM in a post office is as follows:

– Monday: 175

– Tuesday: 202

– Wednesday: 180

– Thursday: 130

– Friday: 198

– Saturday: 175

• Find the mean, median, mode and range for the above set of data.


Types of Graphs

Bar Graph

Circle Graph

Line Graph

Line Plot Graph

Venn Diagram

Pictograph


Bar Graph

Definition: a graph that shows data using horizontal or vertical bars.

Advantages:

•Easy to read

•Compares multiple sets of data

Disadvantages

•Not best for showing trends


Exercise

Prepare a bar chart with the given information

Year       Revenue
2009-10       6266
2010-11       6962
2011-12       7899
2012-13       9366
2013-14      10720

Circle (Pie) Graph

Definition: A graph that shows data in the form of a circle.

Advantages:

•Shows percentages

•Shows how a total is divided into parts

Disadvantages

•Not best for showing trends


Exercise: Prepare a pie chart

Revenue, year 2013-14

Products                  Revenue
Speed Post                 1372.0
Business Post              1029.4
Bill Mail Service           103.0
Express Parcel Post          77.6
Retail Post                  70.2
Sale of Postage Stamps      622.8
Logistic Post                15.3
Money Orders                606.9
Others                      852.5
Revenue from P.O.          4749.6
SBCC                       5971.3
Total Revenue             10720.9
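Drawing a pie chart by hand starts with converting each figure to a share of the total and then to an angle out of 360 degrees. The sketch below shows that arithmetic only. It assumes that "Revenue from P.O." is the subtotal of the nine product lines above it (they add up to 4749.7, which matches the listed 4749.6 up to rounding), so that row is left out to avoid counting the products twice. The dictionary name `revenue` is chosen here for illustration.

```python
# Revenue figures for 2013-14 as listed in the exercise (units are not stated there).
# Assumption: "Revenue from P.O." is a subtotal of the product lines, so it is omitted.
revenue = {
    "Speed Post": 1372.0,
    "Business Post": 1029.4,
    "Bill Mail Service": 103.0,
    "Express Parcel Post": 77.6,
    "Retail Post": 70.2,
    "Sale of Postage Stamps": 622.8,
    "Logistic Post": 15.3,
    "Money Orders": 606.9,
    "Others": 852.5,
    "SBCC": 5971.3,
}

total = sum(revenue.values())

# For every slice: (percentage of the total, angle of the sector in degrees)
slices = {
    name: (value / total * 100, value / total * 360)
    for name, value in revenue.items()
}

for name, (percent, angle) in slices.items():
    print(f"{name:<24}{percent:6.1f} %{angle:8.1f} deg")
```

The sector angles always add up to 360 degrees because each one is the same fraction of the circle as its value is of the total.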


Line Graph

Definition: A graph that shows data in the form of a line.

Advantages:

•Shows change over time

•Helps you see trends

Disadvantages

•Not easy to use to compare different categories of data


Exercise

From the following table prepare a line diagram

Year       Expenditure
2009-10          13346
2010-11          13793
2011-12          14163
2012-13          15481
2013-14          16796


Pictograph

Definition: A graph that displays data using symbols or pictures.

Advantages:

•Compares multiple sets of data

•Visually appealing

Disadvantages

•Hard to read when there are parts of pictures.


Venn Diagram

Definition: Circles that show relationships among sets.

Advantages:

•Shows comparisons and contrasts easily.

Disadvantages

•Does not show trends


• 100 trainees attended the PA induction program in your PTC

• 80 trainees attended the IP induction program in your PTC

• 20 have attended both the IP and PA induction programs in the same PTC

• Make a Venn diagram

Exercise 2


Sampling and Sampling Distributions


Sample and population (ASW, 15)

• A population is the collection of all the elements of interest (census enumeration).

• A sample is a part of the population.

– Good or bad samples.

– Representative or non-representative samples. A researcher hopes to obtain a sample that represents the population, at least in the variables of interest for the issue being examined.

– Probabilistic samples are samples selected using the principles of probability. This may allow a researcher to determine the sampling distribution of a sample statistic.


MEANING OF SAMPLING

Sampling is a method in which only those items that are included in the sample are observed, for the purpose of drawing conclusions about the population from which the sample is drawn. A measure computed from the sample so obtained is called a statistic (i.e. the measures of central tendency and measures of dispersion of the sample are called statistics and are used as a basis for estimating population parameters).


NEED FOR SAMPLING

• 1. Savings in time and money

• 2. When the population is infinitely large

• The fact that the characteristics of the sample are able to provide an approximately correct idea about the population parameters is borne out by the theory of probability.


Methods of sampling – probabilistic

• Random sampling methods – each member has an equal probability of being selected.

• Systematic – every kth case. Equivalent to random if patterns in the list are unrelated to issues of interest. Eg. inspection of BOs by the divisional head.

• Stratified samples – sample from each stratum or subgroup of a population. Eg. SB withdrawal verification (more than 10000).

• Cluster samples – sample only certain clusters of members of a population. Eg. city blocks, firms, test cards only on the addressees in the periphery of the jurisdiction, SB withdrawals checked only for C class offices, inspection of bad BOs.

• Multistage samples – combinations of random, systematic, stratified, and cluster sampling. Ex: checking of transaction particulars of selected days during the inspection of a BO.

• If probability is involved at each stage, then the distribution of sample statistics can be obtained.


Basic Methods of Sampling

• Random Sampling

– Selected by using chance or random numbers

– Each individual subject (human or otherwise) has an equal chance of being selected

– Examples:

• MO verification by PRIP

• Drawing names from a hat

• Random Numbers
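Selection "by using chance or random numbers" can be sketched with Python's standard `random` module. The list of money orders below is hypothetical, and the fixed seed is there only so that the run can be repeated.

```python
import random

# Hypothetical frame: 50 money orders numbered MO-001 to MO-050
population = [f"MO-{i:03d}" for i in range(1, 51)]

rng = random.Random(2015)               # seeded generator (assumption: reproducibility wanted)
sample = rng.sample(population, k=5)    # 5 distinct items, each equally likely to be chosen

print(sample)
```

`sample` draws without replacement, so no money order can appear twice in the selection.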


Basic Methods of Sampling

• Systematic Sampling

– Select a random starting point and then select every kth subject in the population

– Simple to use so it is used often
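A minimal sketch of systematic sampling, assuming the population is available as an ordered list. The function name `systematic_sample` and the list of numbered offices are illustrative, not taken from the slides.

```python
import random


def systematic_sample(items, k, rng):
    """Pick a random start among the first k items, then take every kth item."""
    start = rng.randrange(k)        # random starting point: 0, 1, ..., k-1
    return items[start::k]          # every kth subject from the start onward


offices = list(range(1, 101))       # hypothetical list of 100 numbered offices
picked = systematic_sample(offices, k=10, rng=random.Random(7))

print(picked)
```

Because only the starting point is random, the result is comparable to a random sample only when the order of the list is unrelated to what is being studied.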


Basic Methods of Sampling

• Convenience Sampling

– Use subjects that are easily accessible

– Examples:

• Using family members or students in a classroom

• Mall shoppers


Basic Methods of Sampling

• Stratified Sampling

– Divide the population into at least two different groups with common characteristic(s), then draw SOME subjects from each group (each group is called a stratum; the plural is strata)

– Basically, randomly sample each subgroup or stratum

– Results in a more representative sample
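Stratified sampling can be sketched as a simple random sample taken inside every stratum. The strata below (offices grouped by class) and the 10 % sampling fraction are hypothetical, as is the function name `stratified_sample`.

```python
import random


def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))   # at least one subject per stratum
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical strata: offices grouped by class
strata = {
    "A class": [f"A{i}" for i in range(20)],
    "B class": [f"B{i}" for i in range(30)],
    "C class": [f"C{i}" for i in range(50)],
}
chosen = stratified_sample(strata, fraction=0.1, rng=random.Random(1))

for name, members in chosen.items():
    print(name, members)
```

Every stratum contributes some subjects, which is what makes the resulting sample more representative than one that could miss a subgroup entirely.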


Basic Methods of Sampling

• Cluster Sampling

– Divide the population into groups (called clusters), randomly select some of the groups, and then collect data from ALL members of the selected groups

– Used extensively by government and private research organizations

– Examples:

• Exit Polls
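The contrast with stratified sampling is that whole groups are chosen at random and then every member of a chosen group is included. A minimal sketch, using hypothetical city blocks as clusters and an illustrative function name `cluster_sample`:

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters, then keep ALL members of each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical clusters: six city blocks with five households each
clusters = {f"Block {b}": [f"{b}-{h}" for h in range(1, 6)] for b in "ABCDEF"}
surveyed = cluster_sample(clusters, n_clusters=2, rng=random.Random(3))

for name, members in surveyed.items():
    print(name, members)
```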


Objects of sampling

1. To obtain information about the population on the basis of a sample drawn from that population.

2. To set up the limits of accuracy of the estimates of the population parameters computed on the basis of sample statistics.


Some terms used in sampling

• Sampled population – the population from which the sample is drawn (ASW, 258). The researcher should define it clearly.

• Frame – the list of elements from which the sample is selected (ASW, 258). Eg. telephone book, city business directory. It may be possible to construct a frame.

• Parameter – a numerical characteristic of a population (ASW, 259). Eg. total (annual GDP or exports), proportion p of the population that votes Liberal in a federal election. Also, µ or σ of a probability distribution are termed parameters.

• Statistic – a numerical characteristic of a sample. Eg. pre-election polls.

• Sampling distribution of a statistic is the probability distribution of the statistic.


Sampling distribution of a statistic

• The sampling distribution of a statistic is the distribution of the various values that the statistic can take when it is computed from the various samples of the same size randomly drawn from the population. Any statistic, such as the mean or the standard deviation, may be computed for each of the samples so drawn, and a series of those values may be compiled. The values so obtained may be arranged as a frequency distribution, which is known as the sampling distribution.
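For a very small population the sampling distribution can be built by listing every possible sample. The sketch below does this for the sample mean, using a hypothetical population of four values and samples of size 2 drawn without replacement.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8]     # hypothetical population, N = 4
n = 2                         # sample size

# The statistic (here the mean) computed for every possible sample of size n
sample_means = [mean(sample) for sample in combinations(population, n)]

# Arrange the values as a frequency distribution: the sampling distribution
distribution = Counter(sample_means)

for value in sorted(distribution):
    print(value, distribution[value])
```

In this example the average of all the sample means equals the population mean, which is why the sample mean is a sensible estimate of it.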


Selecting a sample (ASW, 259-261)

• N is the symbol given for the size of the population or the number of elements in the population.

• n is the symbol given for the size of the sample or the number of elements in the sample.

• Simple random sample is a sample of size n selected in a manner that each possible sample of size n has the same probability of being selected.

• In the case of a random sample of size n = 1, each element has the same chance of being selected.


Selecting a simple random sample

• Sample with replacement – after an element is randomly selected, return it to the population and randomly select another element. This could lead to the same element being selected more than once.

• More common is sampling without replacement. Make sure that at each stage, each element remaining in the population has the same probability of being selected.
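The difference between the two schemes can be seen with the standard `random` module: `choices` draws with replacement and `sample` draws without. The four-element population and the seed are illustrative.

```python
import random

rng = random.Random(42)                 # seeded only so the run can be repeated
population = ["A", "B", "C", "D"]

# With replacement: each draw is made from the full population, so repeats can occur
with_replacement = rng.choices(population, k=3)

# Without replacement: a selected element is not available for later draws
without_replacement = rng.sample(population, k=3)

print(with_replacement)
print(without_replacement)
```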


Simple random sample of size 2 from a population of 4 elements

Population elements are A, B, C, D. N = 4, n = 2.

The 1st element selected could be any one of the 4 elements and this leaves 3, so there are 4 x 3 = 12 possible samples, each equally likely: AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC.

If the order of selection does not matter (i.e. we are interested only in which elements are selected), then this reduces to 6 combinations. If {AB} is AB or BA, etc., then the equally likely random samples are {AB}, {AC}, {AD}, {BC}, {BD}, {CD}. This is the number of combinations (ASW, 261, note 1).

\[ {}^{N}P_{n} = \frac{N!}{(N-n)!} = \frac{4!}{(4-2)!} = 12 \]

\[ {}^{N}C_{n} = \frac{N!}{n!\,(N-n)!} = \frac{4!}{2!\,(4-2)!} = 6 \]
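The two counts can be checked by listing the samples with Python's `itertools` and comparing against `math.perm` and `math.comb`. This is a small sketch for the A, B, C, D example only.

```python
from itertools import combinations, permutations
from math import comb, perm

population = ["A", "B", "C", "D"]   # N = 4
n = 2                               # sample size

# Ordered samples: AB and BA count as different samples
ordered = ["".join(p) for p in permutations(population, n)]

# Unordered samples: {AB} covers both AB and BA
unordered = ["".join(c) for c in combinations(population, n)]

print(len(ordered), ordered)
print(len(unordered), unordered)
print(perm(4, 2), comb(4, 2))       # the same counts from the factorial formulas
```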


Standard error of a statistic

• When the average amount of variability of the observations of a population is computed, it is known as the standard deviation; when the average amount of variability of the values in a sampling distribution is computed, it is known as the standard error.
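The distinction can be illustrated numerically. The sketch below uses a hypothetical population of four values and enumerates every ordered sample of size 2 drawn with replacement. Under that assumption the standard deviation of the sample means (the standard error) equals the population standard deviation divided by the square root of n; when sampling without replacement from a small population a correction factor applies, which is not shown here.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8]     # hypothetical population
n = 2                         # sample size

# Mean of every ordered sample of size n drawn with replacement
sample_means = [mean(sample) for sample in product(population, repeat=n)]

sigma = pstdev(population)              # standard deviation of the population
standard_error = pstdev(sample_means)   # standard deviation of the sampling distribution

print(sigma, standard_error, sigma / sqrt(n))
```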


Sampling from a process (ASW, 261)

• Careful design for the sample is especially important.

– Sample production of milk at random times.

– Sample data of various products in the department, like Speed Post, Logistic Post, Business Post, etc.

– We need to calculate the mean and standard deviation for the observations from the samples.

– How to calculate the mean and standard deviation of the population.

– (The standard deviation is the square root of the average of the squared distances of the observations from the mean.)
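Written as formulas, the verbal definition above reads as follows. The symbols $x_i$ for the individual observations are introduced here for illustration; $N$, $\mu$ and $\sigma$ are used as on the earlier slides. The definition as stated averages over all $N$ observations, which is the population form; for a sample, a divisor of $n - 1$ is commonly used instead.

```latex
\[
  \mu = \frac{1}{N}\sum_{i=1}^{N} x_i ,
  \qquad
  \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^{2}}
\]
```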