introduction to statistics

CHAPTER 1: INTRODUCTION TO STATISTICS
INTRODUCTION - DEFINITION OF STATISTICS - FUNCTIONS - SCOPE - LIMITATIONS, CLASSIFICATION AND TABULATION OF DATA

Upload: sanjeevseshanna

Post on 25-Dec-2015


Definitions

Data: observations (such as measurements, genders, survey responses) that have been collected.

What is statistics?

• Statistics is the study of how to collect, organize, analyze, and interpret numerical information from data.
• It is a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions.

There are two main branches of statistics:

1. Descriptive Statistics: Descriptive statistics involves methods of organizing, picturing and summarizing information from data. Example: Mean, Median, Mode.

2. Inferential Statistics: Inferential statistics involves methods of using information from a sample to draw conclusions about the population, i.e., the use of descriptive statistics to make educated guesses about something.
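The descriptive measures named above (mean, median, mode) can be computed with Python's standard `statistics` module. This is only a minimal sketch: the data are the six yearly pass counts from the line-diagram example in this chapter, and the variable names `passed` and `summary` are illustrative.

```python
from statistics import mean, median, mode

# Students passing with distinction, 2001 to 2006
# (values taken from the line-diagram example in this chapter).
passed = [5, 7, 12, 5, 13, 15]

summary = {
    "mean": mean(passed),      # arithmetic average of the six values
    "median": median(passed),  # middle of the sorted values
    "mode": mode(passed),      # most frequent value
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Each measure condenses the six observations into a single figure, which is the sense in which descriptive statistics "summarizes" data.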

The basic fallacies of inferential statistics:

• They assume that the past is prologue to the future.
• They assume you are going to test your inference an infinite number of times.

Characteristics of Statistics

• Statistics are aggregates of facts
• Statistics are numerically expressed
• Statistics are affected to a marked extent by a multiplicity of causes
• Statistics are enumerated, estimated or collected according to reasonable standards of accuracy
• Statistics are collected in a systematic manner
• Statistics are collected for a predetermined purpose

Five stages of statistical investigation:

Collection of data

↓ Organization of data

↓ Presentation of data

↓ Analysis

↓ Interpretation of Results

(1) Collection of data: A statistical investigation is built on a systematic collection of data. The data are classified into two groups: (i) internal data and (ii) external data. Internal data are obtained from internal records related to the operations of a business organisation, such as production, sources of income and expenditure, inventory, purchases and accounts. External data are collected by, or purchased from, external agencies. External data can be either primary or secondary. Primary data are original and collected for the first time, while secondary data have already been collected and published by some agency.

(2) Organisation of data: The collected data form a large mass of figures that needs to be organised. The data must be edited to rectify any omissions, irrelevant answers and wrong computations. The edited data must then be classified and tabulated to suit further analysis.

(3) Presentation of data: A large mass of collected data cannot be understood or analysed easily and quickly in its raw form. Therefore, the collected data need to be presented in tabular or graphic form. This systematic ordering and graphical presentation helps further analysis.

(4) Analysis of data: Analysis requires establishing the relationship between two or more variables. It includes condensation, abstraction, summarization, conclusion, etc. Analysis can be done with the help of statistical tools and techniques such as measures of central tendency, measures of dispersion, correlation and variance analysis.

(5) Interpretation of data: Interpretation requires deep insight into the subject. It involves drawing valid conclusions on the basis of the analysis of data. This stage is very important because the conclusions of the investigation rest on the interpretation.

Functions

• Presentation of facts
• Simplification of complexities
• Facilitating comparisons
• Facilitating the formulation of policies
• Widening of human knowledge
• Useful in testing the laws of other sciences
• Facilitates forecasting
• Establishment of correlation between two facts

FUNCTIONS OF STATISTICS: Statistics as a discipline is considered indispensable in almost all spheres of human knowledge. There is hardly any branch of study which does not use statistics. Scientific, social and economic studies use statistics in one form or another. These disciplines make use of observations, facts and figures, enquiries and experiments through statistics and statistical methods. Statistics studies almost all aspects of an enquiry. It mainly aims at simplifying the complexity of the information collected in an enquiry. It presents data in a simplified form so as to make them intelligible. It analyses data and facilitates the drawing of conclusions. Now let us briefly discuss some of the important functions of statistics.

1. Presents facts in simple form: Statistics presents facts and figures in a definite form. That makes a statement more logical and convincing than mere description. It condenses the whole mass of figures into a single figure. This makes the problem intelligible.

2. Reduces the complexity of data: Statistics simplifies the complexity of data. Raw data are unintelligible. We make them simple and intelligible by using different statistical measures. Some commonly used measures are graphs, averages, dispersion, skewness, kurtosis, correlation and regression. These measures help in interpretation and in drawing inferences. Therefore, statistics helps to enlarge the horizon of one's knowledge.

3. Facilitates comparison: Comparison between different sets of observations is an important function of statistics. Comparison is necessary to draw conclusions. As Professor Boddington rightly points out, "the object of statistics is to enable comparison between past and present results to ascertain the reasons for changes which have taken place and the effect of such changes in future." So, to determine the efficiency of any measure, comparison is necessary. Statistical devices like averages, ratios and coefficients are used for the purpose of comparison.

4. Testing hypotheses: Formulating and testing hypotheses is an important function of statistics. This helps in developing new theories. So statistics examines the truth of a claim and helps in developing new ideas.

5. Formulation of policies: Statistics helps in formulating plans and policies in different fields. Statistical analysis of data forms the beginning of policy formulation. Hence, statistics is essential for planners, economists, scientists and administrators in preparing different plans and programmes.

6. Forecasting: The future is uncertain. Statistics helps in forecasting trends and tendencies. Statistical techniques are used for predicting the future values of a variable. For example, a producer forecasts his future production on the basis of present demand conditions and his past experience. Similarly, planners can forecast the future population by considering present population trends.

7. Derives valid inferences: Statistical methods mainly aim at deriving inferences from an enquiry. Statistical techniques are often used by scholars, planners and scientists to evaluate different projects. These techniques are also used to draw inferences regarding population parameters on the basis of sample information.

Limitations

• Statistics does not study individuals
• Statistics deals with quantitative facts
• Statistics is true only on average
• Statistics may lead to fallacious conclusions
• Only experts can make use of statistics
• Homogeneity and uniformity of data are a must

Limitations of statistics: Statistics, with all its wide application in every sphere of human activity, has its own limitations. Some of them are given below.

1. Statistics is not suitable for the study of qualitative phenomena: Since statistics is basically a science that deals with sets of numerical data, it is applicable only to those subjects of enquiry which can be expressed in terms of quantitative measurements. Qualitative phenomena like honesty, poverty, beauty and intelligence cannot be expressed numerically, so statistical analysis cannot be applied to them directly. Nevertheless, statistical techniques may be applied indirectly by first reducing the qualitative expressions to quantitative terms. For example, the intelligence of a group of students can be studied on the basis of their marks in a particular examination.

2. Statistics does not study individuals: Statistics does not give any specific importance to individual items; it deals with an aggregate of objects. Individual items, taken individually, do not constitute statistical data and do not serve any purpose in a statistical enquiry.

3. Statistical laws are not exact: The laws of the mathematical and physical sciences are exact, but statistical laws are only approximations. Statistical conclusions are not universally true. They are true only on an average.

4. Statistics may be misused: Statistics must be used only by experts; otherwise, statistical methods are the most dangerous tools in the hands of the inexpert. The use of statistical tools by inexperienced and untrained persons might lead to wrong conclusions. Statistics can easily be misused by quoting wrong figures. As King aptly says, 'statistics are like clay of which one can make a God or Devil as one pleases'.

5. Statistics is only one of the methods of studying a problem: Statistical methods do not provide a complete solution to a problem, because problems must be studied against the background of a country's culture, philosophy or religion. Thus a statistical study should be supplemented by other evidence.

6. Statistics can analyse only aggregated observations or data: Any statistic is a collection of data. An individual observation does not belong to statistics; hence, statistics analyses a collection of data and gives an overall estimated result. For example, the average income of the labourers of a business can be estimated by observing their per capita

Scope and importance of statistics

• Useful to bankers
• Useful to insurance companies
• Useful to railways and other transport agencies
• Useful to business
• Useful to economists
• Useful to planning

Classification and Tabulation

Definition of Classification

"Classification is the process of arranging data into sequences and groups according to their common characteristics or separating them into different but related parts."
- Secrist

"The process of grouping a large number of individual facts and observations on the basis of similarity among the items is called classification."
- Stockton & Clark

Meaning of Classification

• Classification is a process of arranging things or data in groups or classes according to their resemblances and affinities; it gives expression to the unity of attributes that may subsist among a diversity of individuals.

Characteristics of classification

– Classification performs homogeneous grouping of data
– It brings out points of similarity and dissimilarity
– The classification may be either real or imaginary
– Classification is flexible enough to accommodate adjustments

Objectives / purposes of classification

• To simplify and condense large data
• To present the facts in an easily understandable form
• To allow comparisons
• To help draw valid inferences
• To relate the variables among the data
• To help further analysis
• To eliminate unwanted data
• To prepare for tabulation

Important types of classification

• Geographical (i.e. on the basis of area or region wise)

• Chronological (On the basis of Temporal / Historical, i.e. with respect to time)

• Qualitative (on the basis of character / attributes)

• Numerical, quantitative (on the basis of magnitude)

Geographical Classification

In geographical classification, the classification is based on geographical regions.

Ex: Sales of the company (in million rupees), region-wise

Region    Sales
North     285
South     300
East      185
West      235

Chronological Classification

If the statistical data are classified according to the time of their occurrence, the classification is called chronological classification.

Ex: Sales reported by a departmental store

Month      Sales (Rs. in lakh)
January    22
February   26
March      32
April      25
May        27

Qualitative Classification

In qualitative classification, the data are classified according to the presence or absence of attributes in the given units. Thus, the classification is based on some quality characteristics / attributes.

Ex: Literacy, Education, Class grade etc.

Further, it may be classified as a) Simple classification b) Manifold classification

Simple classification: If the classification is done into only two classes, it is known as simple classification.

Ex: a) Population into Male / Female b) Population into Educated / Uneducated

Manifold classification:

In this classification, the classification is based on more than one attribute at a time.

Population
    Smokers
        Illiterate: Male, Female
        Literate: Male, Female
    Non-smokers
        Literate: Male, Female
        Illiterate: Male, Female
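Simple and manifold classification can be sketched in a few lines of Python. The records below are hypothetical and exist only to illustrate the idea; the attribute names follow the smokers / literacy / sex breakdown shown above, and `collections.Counter` is used for the counting.

```python
from collections import Counter

# Hypothetical records: (smoking habit, literacy, sex) for each person.
people = [
    ("Smoker", "Literate", "Male"),
    ("Smoker", "Illiterate", "Female"),
    ("Non-smoker", "Literate", "Female"),
    ("Non-smoker", "Literate", "Male"),
    ("Smoker", "Literate", "Male"),
    ("Non-smoker", "Illiterate", "Male"),
]

# Manifold classification: every distinct combination of the three
# attributes forms one class.
manifold = Counter(people)

# Simple classification: one attribute only, giving two classes.
by_habit = Counter(habit for habit, _, _ in people)

for combo, count in sorted(manifold.items()):
    print(" / ".join(combo), count)
print(dict(by_habit))
```

Using the whole tuple as the key keeps all attributes together, so the same records can be regrouped by any single attribute without re-collecting data.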

Quantitative Classification

In quantitative classification, the classification is based on quantitative measurements of some characteristic, such as age, marks, income, production or sales. The quantitative phenomenon under study is known as a variable, and hence this classification is also called classification by variable.

Ex: For a 50-mark test, the marks obtained by students are classified as follows

Marks      No. of students
0 – 10     5
10 – 20    7
20 – 30    10
30 – 40    25
40 – 50    3

Total Students = 50
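As a quick check on the marks table above, the short sketch below confirms that the class frequencies add up to the stated total and expresses each class as a percentage of all students. The dictionary name `marks_table` is illustrative only.

```python
# Class intervals and frequencies from the marks table above.
marks_table = {
    "0 - 10": 5,
    "10 - 20": 7,
    "20 - 30": 10,
    "30 - 40": 25,
    "40 - 50": 3,
}

# The frequencies should add up to the stated total of 50 students.
total = sum(marks_table.values())

# Share of each class in the whole group, in percent.
percent = {interval: 100 * freq / total for interval, freq in marks_table.items()}

for interval, freq in marks_table.items():
    print(f"{interval:<8} {freq:>3} {percent[interval]:>6.1f}%")
print("Total", total)
```

Half of the students fall in the 30 - 40 class, which is easy to miss in the raw counts but obvious in the percentage column.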

Meaning and Definition of Tabulation

Tabulation may be defined as the systematic arrangement of data in columns and rows. It is designed to simplify the presentation of data for the purpose of analysis and statistical inference.

Major Objectives of Tabulation

• To simplify complex data
• To facilitate comparison
• To economise space
• To draw valid inferences / conclusions
• To help further analysis

Differences between Classification and Tabulation

• Data are first classified and then presented in tables; classification is the basis for tabulation.

• Tabulation is a mechanical function of classification, because in tabulation the classified data are placed in rows and columns.

• Classification is a process of statistical analysis, while tabulation is a process of presenting data in a suitable structure.

Classification of tables

Classification is done based on

• Coverage (simple and complex tables)
• Objective / purpose (general purpose / reference table; special purpose / summary table)
• Nature of inquiry (primary and derived tables)

Diagrammatic and Graphic Representation

• Diagrammatic presentation

A diagram is a visual form for the presentation of statistical data. The term covers various devices such as bars, circles, maps, pictorials and cartograms.

Some important types of diagrams

Line diagram

This is the simplest type of one-dimensional diagram. The heights of the bars / lines are drawn according to the size of the figures. The distance between the bars is kept uniform. The limitations of this diagram are that it is not attractive and cannot convey more than one piece of information.

Ex: Draw the line diagram for the following data

Year    No. of students passed in first class with distinction
2001    5
2002    7
2003    12
2004    5
2005    13
2006    15

[Figure: line diagram; horizontal axis: Year (2001 to 2006); vertical axis: No. of students passed in FCD; plotted values 5, 7, 12, 5, 13, 15]

Simple bar diagram

Ex: The annual expenses of maintaining cars of various types are given below. Draw the vertical bar diagram. The annual expenses of maintenance include fuel + maintenance + repair + assistance + insurance.

Type of the car    Expense in Rs. / Year
Maruthi Udyog      47533
Hyundai            59230
Tata Motors        63270

[Figure: vertical bar diagram; bars for Maruthi Udyog (47533), Hyundai (59230) and Tata Motors (63270); vertical axis scaled from 30000 to 70000]

Horizontal bar diagram

Ex: Data on the world's top 10 steel makers are given below. Draw the horizontal bar diagram.

Steel maker       Production of steel (million tonnes)
Arcelor Mittal    110
Nippon            32
POSCO             31
JFE               30
BAO Steel         24
US Steel          20
NUCOR             18
RIVA              18
Thyssen-krupp     17
Tangshan          16

[Figure: horizontal bar diagram; vertical axis: Top-10 Steel Makers; horizontal axis: Production of Steel (Million Tonnes), scaled 0 to 120]
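A horizontal bar diagram can be roughed out in plain text before drawing it properly. The sketch below is an illustration only: it assumes the steel makers and the production figures above are listed in the same order, and the helper name `bar_rows` and the scale of two million tonnes per `#` are arbitrary choices.

```python
# Production of steel in million tonnes. Pairing each name with each
# value assumes both are listed in the same order in the example above.
steel = [
    ("Arcelor Mittal", 110),
    ("Nippon", 32),
    ("POSCO", 31),
    ("JFE", 30),
    ("BAO Steel", 24),
    ("US Steel", 20),
    ("NUCOR", 18),
    ("RIVA", 18),
    ("Thyssen-krupp", 17),
    ("Tangshan", 16),
]


def bar_rows(data, scale=2):
    """Return one text row per item; each '#' stands for `scale` units,
    rounded down to a whole number of symbols."""
    width = max(len(name) for name, _ in data)
    return [
        f"{name:<{width}} | {'#' * (value // scale)} {value}"
        for name, value in data
    ]


rows = bar_rows(steel)
print("\n".join(rows))
```

Bar length is proportional to the value, so the gap between the largest producer and the rest is visible at a glance, which is the point of the diagram.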

Compound bar diagram (Multiple bar diagram)

Ex: Draw the bar diagram for the following data. The resale values of the cars (Rs. '000) are as follows.

Year (Model)    Santro    Zen    Wagonr
2003            208       252    248
2004            240       278    274
2005            261       296    302

[Figure: multiple bar diagram; horizontal axis: Model of Car, grouped by year; vertical axis: Value in Rs., scaled 0 to 350; series: Santro, Zen, Wagonr]

• Pie diagram

[Figure: pie diagram with four sectors; labels: Customs, Excise, Corporate Tax and Dividend, State's taking; sector values: 7.58, 39, 14.92, 38.5]
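To draw a pie diagram, each value is converted into a sector angle in proportion to its share of the total (value / total × 360°). The sketch below applies this to the four values above. Pairing each label with each value in the order they appear is an assumption, since the original figure is not available.

```python
# Sector values from the pie diagram. Matching labels to values in the
# order they appear is an assumption, not something the source confirms.
shares = {
    "Customs": 7.58,
    "Excise": 39,
    "Corporate Tax and Dividend": 14.92,
    "State's taking": 38.5,
}

total = sum(shares.values())

# Angle of each sector in degrees: its share of the full 360-degree circle.
angles = {label: round(360 * value / total, 1) for label, value in shares.items()}

for label, angle in angles.items():
    print(f"{label:<28} {angle:>6.1f} degrees")
```

The four values add up to 100, so they can be read as percentages, and the angles add up to a full circle apart from rounding.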

1. Ungrouped Series

Constructing a Frequency Table:

Ex 1: The marks obtained by 50 students in an examination are given below. Prepare a frequency table.

30 45 48 55 39 32 31 22 21 18 54 59 61 33 34 44 10

38 19 62 74 43 73 41 46 33 51 37 85 85 71 29 22 62

29 58 55 63 64 44 43 27 32 43 52 31 47 64 18 51
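One way to prepare the frequency table for Ex 1 is sketched below. The exercise does not fix a class width, so the width of 10 is an assumption, as are the names `frequency_table` and `WIDTH`. Classes follow the exclusive method: the lower limit belongs to the class and the upper limit does not.

```python
# Marks of the 50 students from Ex 1.
marks = [
    30, 45, 48, 55, 39, 32, 31, 22, 21, 18, 54, 59, 61, 33, 34, 44, 10,
    38, 19, 62, 74, 43, 73, 41, 46, 33, 51, 37, 85, 85, 71, 29, 22, 62,
    29, 58, 55, 63, 64, 44, 43, 27, 32, 43, 52, 31, 47, 64, 18, 51,
]

WIDTH = 10  # assumed class width; the exercise does not specify one


def frequency_table(values, width):
    """Count values per class [lower, lower + width) (exclusive method)."""
    start = (min(values) // width) * width
    stop = (max(values) // width + 1) * width
    table = {(lower, lower + width): 0 for lower in range(start, stop, width)}
    for v in values:
        lower = (v // width) * width
        table[(lower, lower + width)] += 1
    return table


table = frequency_table(marks, WIDTH)
for (lower, upper), freq in table.items():
    print(f"{lower} - {upper}: {freq}")
print("Total:", sum(table.values()))
```

Checking that the frequencies add up to 50 guards against a mark being skipped or counted twice during tallying.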

Ex 2: Form a frequency distribution from the following data by the inclusive method, taking 4 as the magnitude of the class intervals. Then convert it to the exclusive method.

10 17 15 23 11 33 38 41 25 26 8 11 9 24 29 32 37

39 40 14 27 37 28 13 14 18 17 16 15 23 25 28 30 32

34 26 12 15 19 31 40 39
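A possible approach to Ex 2 is sketched below. It assumes the first class starts at the smallest observation (8), giving inclusive classes 8-11, 12-15 and so on, each holding 4 values. The conversion to the exclusive method uses the usual correction of half the gap between consecutive classes, here 0.5. The function names are illustrative.

```python
# The 42 observations from Ex 2.
values = [
    10, 17, 15, 23, 11, 33, 38, 41, 25, 26, 8, 11, 9, 24, 29, 32, 37,
    39, 40, 14, 27, 37, 28, 13, 14, 18, 17, 16, 15, 23, 25, 28, 30, 32,
    34, 26, 12, 15, 19, 31, 40, 39,
]

SIZE = 4  # magnitude of each class interval, as stated in the exercise


def inclusive_table(data, size):
    """Inclusive classes such as 8-11: both limits belong to the class.
    Starting at the smallest observation is an assumption."""
    table = {}
    for lower in range(min(data), max(data) + 1, size):
        upper = lower + size - 1
        table[(lower, upper)] = sum(lower <= x <= upper for x in data)
    return table


def to_exclusive(table):
    """Close the gap of 1 between consecutive classes by moving each
    limit outwards by half the gap (0.5)."""
    return {(lower - 0.5, upper + 0.5): freq for (lower, upper), freq in table.items()}


inclusive = inclusive_table(values, SIZE)
exclusive = to_exclusive(inclusive)

for (inc, freq), exc in zip(inclusive.items(), exclusive):
    print(f"{inc[0]}-{inc[1]}  ->  {exc[0]}-{exc[1]}  frequency {freq}")
```

The frequencies do not change in the conversion; only the class limits move, so that the upper limit of one class coincides with the lower limit of the next.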

INTRODUCTION TO TABULATION

DEFINITION

According to Tuttle, “A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers, with sufficient explanatory and qualifying words, phrases and statements in the form of titles, heading and footnotes to make clear the full meaning of the data and their origin”

OBJECTIVES OF TABULATION

1. To simplify the complex data

2. To economize space

3. To facilitate comparison

4. To facilitate statistical analysis

5. To save time

6. To depict trend

7. To help reference

Components of a Table

1. Table number

2. Title of the table

3. Caption / Box head

4. Stub

5. Body / Field

6. Head note

7. Foot note

8. Source data

Layout of a table:

Stub headings  | Caption                                               | Total (rows)
               | Subhead                   | Subhead                   |
               | Column head | Column head | Column head | Column head |
Stub entries   |             |             |             |             |
Total (columns)

Foot note:
Source note:
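The components listed above can be seen in a small rendered table. The sketch below reuses the resale values from the compound bar diagram example. The table number, title wording, footnote text and the function name `render` are hypothetical, chosen only to show where each component sits.

```python
# Resale values (Rs. '000) from the compound bar diagram example.
title = "Table 1: Resale value of cars (Rs. '000)"  # table number and title (hypothetical)
caption = ["Santro", "Zen", "Wagonr"]               # column heads under the caption
body = {                                            # stub entry -> row of body values
    2003: [208, 252, 248],
    2004: [240, 278, 274],
    2005: [261, 296, 302],
}


def render(title, caption, body, stub_head="Year", col=8):
    """Return the table as text lines, with row and column totals."""
    lines = [title]
    lines.append(stub_head.ljust(col) + "".join(c.rjust(col) for c in caption) + "Total".rjust(col))
    for stub, row in body.items():
        lines.append(str(stub).ljust(col) + "".join(str(v).rjust(col) for v in row) + str(sum(row)).rjust(col))
    col_totals = [sum(column) for column in zip(*body.values())]
    lines.append("Total".ljust(col) + "".join(str(t).rjust(col) for t in col_totals) + str(sum(col_totals)).rjust(col))
    lines.append("Foot note: totals are shown only to illustrate the layout.")
    return lines


lines = render(title, caption, body)
print("\n".join(lines))
```

The row totals and column totals must lead to the same grand total, which gives a simple arithmetic check on the body of any table.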

REQUIREMENTS OF GOOD STATISTICAL TABLES

1. Suit the purpose
2. Scientifically prepared
3. Clarity
4. Manageable size
5. Columns and rows should be numbered
6. Suitably approximated
7. Attractive get-up
8. Units
9. Averages and totals
10. Logical arrangement of items
11. Proper lettering

Types of tables

1. Simple and complex tables.

2. General purpose and special purpose tables.

3. Original and derived tables.

Advantages of classification and tabulation

1. Clarifies the object

2. Simplifies complex data

3. Economises space

4. Facilitates comparison

5. Helps in reference

6. Depicts the trend

Disadvantages of classification and tabulation

1. Complicated process

2. Not all data can be put into tables

3. Lack of flexibility