introduction to statistics
CHAPTER 1
INTRODUCTION TO STATISTICS
INTRODUCTION - DEFINITION OF STATISTICS - FUNCTIONS - SCOPE - LIMITATIONS, CLASSIFICATION AND TABULATION OF DATA
Definitions
Data: observations (such as measurements, genders, survey responses) that have been collected.
What is statistics?
• Statistics is the study of how to collect, organize, analyze, and interpret numerical information from data.
• It is a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions.
There are two main branches of statistics:
1. Descriptive statistics: involves methods of organizing, picturing and summarizing information from data. Example: mean, median, mode.
2. Inferential statistics: involves methods of using information from a sample to draw conclusions about the population, i.e. the use of descriptive statistics to make educated guesses about something.
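The three descriptive measures named above can be sketched with Python's standard `statistics` module. The six values are illustrative only (yearly counts of students passing with distinction), and the variable names are assumptions made for this sketch.

```python
import statistics

# Illustrative data: students passing with distinction in six successive years
passed = [5, 7, 12, 5, 13, 15]

mean_value = statistics.mean(passed)      # arithmetic average of all values
median_value = statistics.median(passed)  # middle value of the sorted data
mode_value = statistics.mode(passed)      # most frequently occurring value

print("Mean:", mean_value)
print("Median:", median_value)
print("Mode:", mode_value)
```

Each measure condenses the whole list into a single figure, which is the sense in which descriptive statistics "summarizes" data.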
The basic fallacies of inferential statistics:
• They assume that the past is prologue to the future
• They assume you are going to test your inference an infinite number of times.
Characteristics of Statistics
• Statistics are aggregates of facts
• Statistics are numerically expressed
• Statistics are affected to a marked extent by multiplicity of causes
• Statistics are enumerated, estimated or collected according to reasonable standards of accuracy
• Statistics are collected in a systematic manner
• Statistics are collected for a predetermined purpose
Five stages of statistical investigation:
Collection of data
↓ Organization of data
↓ Presentation of data
↓ Analysis
↓ Interpretation of Results
(1) Collection of data: A statistical investigation is built on a systematic collection of data. The data are classified into two groups: i) internal data and ii) external data. Internal data are obtained from internal records related to the operations of a business organisation, such as production, sources of income and expenditure, inventory, purchases and accounts. External data are collected by, or purchased from, external agencies. External data can be either primary or secondary. Primary data are original and collected for the first time, while secondary data have already been collected and published by some agency.
(2) Organisation of data: The collected data are a large mass of figures that needs to be organised. The collected data must be edited to rectify any omissions, irrelevant answers and wrong computations. The edited data must then be classified and tabulated to suit further analysis.
(3) Presentation of data: A large mass of collected data cannot be understood easily and quickly in raw form. Therefore, the collected data need to be presented in tabular or graphic form. This systematic ordering and graphical presentation helps further analysis.
(4) Analysis of data: Analysis requires establishing the relationships among the variables. Analysis of data includes condensation, abstraction, summarization, conclusion etc. It can be done with the help of statistical tools and techniques such as measures of central tendency, measures of dispersion, correlation and variance analysis.
(5) Interpretation of data: Interpretation requires deep insight into the subject. It involves drawing valid conclusions on the basis of the analysis of data. This step is very important because the conclusions of the investigation rest on it.
Functions
• Presentation of facts
• Simplification of complexities
• Facilitating comparisons
• Facilitating the formulation of policies
• Widening of human knowledge
• Useful in testing the laws of other sciences
• Facilitating forecasting
• Establishment of correlation between two facts
FUNCTIONS OF STATISTICS: Statistics as a discipline is considered indispensable in almost all spheres of human knowledge. There is hardly any branch of study which does not use statistics. Scientific, social and economic studies use statistics in one form or another. These disciplines make use of observations, facts and figures, enquiries and experiments, all of which rely on statistics and statistical methods. Statistics can study almost all aspects of an enquiry. It mainly aims at simplifying the complexity of the information collected in an enquiry. It presents data in a simplified form so as to make them intelligible. It analyses data and facilitates the drawing of conclusions. Now let us briefly discuss some of the important functions of statistics.
1. Presents facts in simple form: Statistics presents facts and figures in a definite form. This makes a statement more logical and convincing than mere description. It condenses the whole mass of figures into a single figure, which makes the problem intelligible.
2. Reduces the complexity of data: Statistics simplifies the complexity of data. Raw data are unintelligible. We make them simple and intelligible by using different statistical measures. Some commonly used measures are graphs, averages, dispersion, skewness, kurtosis, correlation and regression. These measures help in interpretation and in drawing inferences. Statistics therefore enlarges the horizon of one's knowledge.
3. Facilitates comparison: Comparison between different sets of observations is an important function of statistics. Comparison is necessary to draw conclusions. As Professor Boddington rightly points out, “the object of statistics is to enable comparison between past and present results to ascertain the reasons for changes which have taken place and the effect of such changes in future.” So, to determine the efficiency of any measure, comparison is necessary. Statistical devices like averages, ratios, coefficients etc. are used for the purpose of comparison.
4. Testing hypotheses: Formulating and testing hypotheses is an important function of statistics. This helps in developing new theories. Statistics thus examines the truth of a claim and helps in developing new ideas.
5. Formulation of policies: Statistics helps in formulating plans and policies in different fields. Statistical analysis of data forms the beginning of policy formulation. Hence, statistics is essential for planners, economists, scientists and administrators in preparing different plans and programmes.
6. Forecasting: The future is uncertain. Statistics helps in forecasting trends and tendencies. Statistical techniques are used for predicting the future values of a variable. For example, a producer forecasts his future production on the basis of present demand conditions and his past experience. Similarly, planners can forecast the future population by considering present population trends.
7. Derives valid inferences: Statistical methods mainly aim at deriving inferences from an enquiry. Statistical techniques are often used by scholars, planners and scientists to evaluate different projects. These techniques are also used to draw inferences about population parameters on the basis of sample information.
Limitation
• Statistics does not study individuals
• Statistics deals with quantitative facts
• Statistics is true only on average
• Statistics may lead to fallacious conclusions
• Only experts can make use of statistics
• Homogeneity and uniformity are a must
Limitations of statistics: Statistics, with all its wide application in every sphere of human activity, has its own limitations. Some of them are given below.
1. Statistics is not suitable for the study of qualitative phenomena: Since statistics is basically a science that deals with sets of numerical data, it is applicable only to those subjects of enquiry which can be expressed in terms of quantitative measurements. Qualitative phenomena like honesty, poverty, beauty and intelligence cannot be expressed numerically, so statistical analysis cannot be applied to them directly. Nevertheless, statistical techniques may be applied indirectly by first reducing the qualitative expressions to quantitative terms. For example, the intelligence of a group of students can be studied on the basis of their marks in a particular examination.
2. Statistics does not study individuals: Statistics does not give any specific importance to individual items; it deals with an aggregate of objects. Individual items, taken individually, do not constitute statistical data and do not serve any purpose in a statistical enquiry.
3. Statistical laws are not exact: It is well known that the mathematical and physical sciences are exact. Statistical laws, by contrast, are only approximations. Statistical conclusions are not universally true; they are true only on average.
4. Statistics may be misused: Statistics must be used only by experts; otherwise, statistical methods are the most dangerous tools in the hands of the inexpert. The use of statistical tools by inexperienced and untrained persons might lead to wrong conclusions. Statistics can easily be misused by quoting wrong figures. As King aptly says, ‘statistics are like clay of which one can make a God or Devil as one pleases’.
5. Statistics is only one of the methods of studying a problem: Statistical methods do not provide a complete solution to a problem, because problems have to be studied against the background of a country's culture, philosophy or religion. Thus a statistical study should be supplemented by other evidence.
6. Statistics can analyse only aggregated observations or data: Any statistic is drawn from a collection of data. An individual observation does not constitute statistics; statistics analyses a collection of data and gives the overall estimated result. For example, the average income of the labourers of a business can be estimated by observing their per capita
Scope and importance of statistics
• Useful to bankers
• Useful to insurance companies
• Useful to railways and other transport agencies
• Useful to business
• Useful to economists
• Useful to planning
Definition of Classification
“Classification is the process of arranging data into sequences and groups according to their common characteristics or separating them into different but related parts.”
- Secrist
“The process of grouping large number of individual facts and observations on the basis of similarity among the items, is called classification”.
- Stockton & Clark
Meaning of Classification
• Classification is a process of arranging things or data in groups or classes according to their resemblances and affinities; it gives expression to the unity of attributes that may subsist among a diversity of individuals.
Characteristics of classification
– Classification performs homogeneous grouping of data
– It brings out points of similarity and dissimilarity
– The classification may be either real or imaginary
– Classification is flexible to accommodate adjustments
Objectives / purposes of classifications
• To simplify and condense large data
• To present the facts in an easily understandable form
• To allow comparisons
• To help draw valid inferences
• To relate the variables among the data
• To help further analysis
• To eliminate unwanted data
• To prepare for tabulation
Important types of classification
• Geographical (i.e. on the basis of area or region wise)
• Chronological (On the basis of Temporal / Historical, i.e. with respect to time)
• Qualitative (on the basis of character / attributes)
• Numerical, quantitative (on the basis of magnitude)
Geographical Classification
In geographical classification, the classification is based on the geographical regions.
Ex: Sales of the company, region-wise (in million rupees)
Region Sales
North 285
South 300
East 185
West 235
Chronological Classification
If the statistical data are classified according to the time of their occurrence, the type of classification is called chronological classification.
Ex: Sales reported by a departmental store
Month Sales (Rs. in lakh)
January 22
February 26
March 32
April 25
May 27
Qualitative Classification
In qualitative classification, the data are classified according to the presence or absence of attributes in the given units. Thus, the classification is based on some quality characteristics / attributes.
Ex: Literacy, Education, Class grade etc.
Further, it may be classified as a) Simple classification b) Manifold classification
Simple classification: If the data are classified into only two classes, the classification is known as simple classification.
Ex: a) Population into Male / Female b) Population into Educated / Uneducated
Manifold classification:
In this classification, the classification is based on more than one attribute at a time.
Population
  Smokers
    Literate: Male, Female
    Illiterate: Male, Female
  Non-smokers
    Literate: Male, Female
    Illiterate: Male, Female
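A manifold classification can be sketched by counting units on several attributes at once. The six records below are hypothetical and exist only to show the mechanics; `collections.Counter` is a standard-library class.

```python
from collections import Counter

# Hypothetical records: (smoking habit, literacy, sex) for each person
people = [
    ("Smoker", "Literate", "Male"),
    ("Smoker", "Illiterate", "Female"),
    ("Non-smoker", "Literate", "Female"),
    ("Non-smoker", "Literate", "Male"),
    ("Smoker", "Literate", "Male"),
    ("Non-smoker", "Illiterate", "Male"),
]

# Manifold classification: each distinct combination of the three
# attributes forms one class, and Counter tallies the units in it
classes = Counter(people)

for (habit, literacy, sex), count in sorted(classes.items()):
    print(f"{habit:<11} {literacy:<11} {sex:<7} {count}")
```

Dropping the second and third attributes from each record would reduce this to a simple classification (Smoker / Non-smoker).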
Quantitative Classification
In quantitative classification, the classification is based on quantitative measurements of some characteristic, such as age, marks, income, production, sales etc. The quantitative phenomenon under study is known as a variable, and hence this classification is also called classification by variable.
Ex: For a 50-mark test, the marks obtained by students are classified as follows:
Marks No. of students
0 – 10 5
10 – 20 7
20 – 30 10
30 – 40 25
40 – 50 3
Total Students = 50
Meaning and Definition of Tabulation
Tabulation may be defined as the systematic arrangement of data in columns and rows. It is designed to simplify the presentation of data for the purpose of analysis and statistical inference.
Major Objectives of Tabulation
• To simplify the complex data
• To facilitate comparison
• To economise space
• To draw valid inferences / conclusions
• To help further analysis
Differences between Classification and Tabulation
• First data are classified and presented in tables; classification is the basis for tabulation.
• Tabulation is a mechanical function of classification, because in tabulation the classified data are placed in rows and columns.
• Classification is a process of statistical analysis, while tabulation is a process of presenting data in a suitable structure.
Classification of tables
Classification is done based on
• Coverage (simple and complex tables)
• Objective / purpose (general purpose / reference table, special table or summary table)
• Nature of inquiry (primary and derived tables)
Diagrammatic and Graphic Representation
• Diagrammatic presentation
A diagram is a visual form for the presentation of statistical data. The term covers various types of devices such as bars, circles, maps, pictorials and cartograms.
Some important types of diagrams
Line diagram
This is the simplest type of one-dimensional diagram. The heights of the bars / lines are drawn in proportion to the size of the figures. The distance between the bars is kept uniform. The limitations of this diagram are that it is not attractive and cannot convey more than one piece of information.
Ex: Draw the line diagram for the following data
Year No. of students passed in first class with distinction
2001 5
2002 7
2003 12
2004 5
2005 13
2006 15
[Figure: line diagram of the data above. Horizontal axis: Year; vertical axis: No. of students passed in FCD]
Simple bar diagram
The annual expenses of maintaining cars of various types are given below. Draw the vertical bar diagram. The annual maintenance expenses include fuel + maintenance + repair + assistance + insurance.
Type of the car Expense in Rs. / Year
Maruthi Udyog 47533
Hyundai 59230
Tata Motors 63270
[Figure: vertical bar diagram of the data above. Horizontal axis: type of the car; vertical axis: expense in Rs. / year]
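The idea of a bar diagram, bar length proportional to the figure, can be sketched in plain text. This is only a rough stand-in for a drawn chart: the bars run horizontally rather than vertically, and the scale of one `#` per Rs. 2000 and the helper name `bar` are assumptions made for this sketch.

```python
# Annual maintenance expense (Rs. per year) from the car example
expenses = {
    "Maruthi Udyog": 47533,
    "Hyundai": 59230,
    "Tata Motors": 63270,
}

SCALE = 2000  # assumed scale: one '#' stands for Rs. 2000


def bar(value, scale=SCALE):
    """Return a row of '#' whose length is proportional to value."""
    return "#" * round(value / scale)


# One bar per car type, all drawn to the same scale so they are comparable
for car, expense in expenses.items():
    print(f"{car:<14} {bar(expense)} {expense}")
```

Because every bar uses the same scale, the relative lengths carry the comparison, which is the purpose of the diagram.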
Horizontal bar diagram
Data on the world's top 10 steel makers are given below. Draw the horizontal bar diagram.
Steel maker Production of steel (million tonnes)
Arcelor Mittal 110
Nippon 32
POSCO 31
JFE 30
BAO Steel 24
US Steel 20
NUCOR 18
RIVA 18
Thyssen-krupp 17
Tangshan 16
[Figure: horizontal bar diagram of the data above. Horizontal axis: Production of Steel (Million Tonnes); vertical axis: Top-10 Steel Makers]
Compound bar diagram (Multiple bar diagram)
Ex: Draw the bar diagram for the following data. Resale values of the cars (Rs. 000) are as follows.
Year (Model) Santro Zen Wagonr
2003 208 252 248
2004 240 278 274
2005 261 296 302
[Figure: multiple bar diagram of the data above. Horizontal axis: Model of Car; vertical axis: Value in Rs.; bars: Santro, Zen, Wagonr]
1. Ungrouped Series
Constructing Frequency Table:
Ex 1: The marks obtained by 50 students in an examination are given below. Prepare a frequency table.
30 45 48 55 39 32 31 22 21 18 54 59 61 33 34 44 10
38 19 62 74 43 73 41 46 33 51 37 85 85 71 29 22 62
29 58 55 63 64 44 43 27 32 43 52 31 47 64 18 51
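One way to prepare the frequency table for Ex 1 is sketched below. The exercise does not fix a class width, so the width of 10 and the use of exclusive classes (lower limit included, upper limit excluded) are assumptions made for this sketch.

```python
from collections import Counter

# Marks of the 50 students from Ex 1
marks = [
    30, 45, 48, 55, 39, 32, 31, 22, 21, 18, 54, 59, 61, 33, 34, 44, 10,
    38, 19, 62, 74, 43, 73, 41, 46, 33, 51, 37, 85, 85, 71, 29, 22, 62,
    29, 58, 55, 63, 64, 44, 43, 27, 32, 43, 52, 31, 47, 64, 18, 51,
]

WIDTH = 10  # assumed class width; not specified in the exercise

# Exclusive method: a mark m belongs to the class [lower, lower + WIDTH),
# so the lower limit of its class is (m // WIDTH) * WIDTH
counts = Counter((m // WIDTH) * WIDTH for m in marks)

print("Marks     No. of students")
for lower in range(min(counts), max(counts) + WIDTH, WIDTH):
    print(f"{lower:>2} - {lower + WIDTH:<3}  {counts[lower]}")
print("Total    ", sum(counts.values()))
```

With these assumptions the classes run from 10 - 20 up to 80 - 90 and the frequencies add up to the 50 students.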
Ex 2: Form a frequency distribution from the following data by the inclusive method, taking 4 as the magnitude of the class intervals. Then convert it into the exclusive method.
10 17 15 23 11 33 38 41 25 26 8 11 9 24 29 32 37
39 40 14 27 37 28 13 14 18 17 16 15 23 25 28 30 32
34 26 12 15 19 31 40 39
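A possible working of Ex 2 is sketched below. It assumes that the first class starts at the smallest observation (8), and it converts to the exclusive form by moving each class limit by 0.5, half the gap of 1 between adjacent inclusive classes. Both choices are assumptions, not part of the exercise.

```python
from collections import Counter

# The 42 observations from Ex 2
data = [
    10, 17, 15, 23, 11, 33, 38, 41, 25, 26, 8, 11, 9, 24, 29, 32, 37,
    39, 40, 14, 27, 37, 28, 13, 14, 18, 17, 16, 15, 23, 25, 28, 30, 32,
    34, 26, 12, 15, 19, 31, 40, 39,
]

WIDTH = 4          # class magnitude given in the exercise
START = min(data)  # assumption: the first class starts at the smallest value

# Inclusive method: both limits belong to the class, e.g. 8 - 11 holds 8, 9, 10, 11
counts = Counter((x - START) // WIDTH for x in data)

table = []
for k in range(max(counts) + 1):
    lower = START + k * WIDTH
    upper = lower + WIDTH - 1
    # Exclusive form: shift each limit by 0.5 so adjacent classes meet
    table.append((lower, upper, lower - 0.5, upper + 0.5, counts[k]))

print("Inclusive   Exclusive      Frequency")
for lower, upper, ex_lower, ex_upper, freq in table:
    print(f"{lower:>2} - {upper:<2}     {ex_lower:>4} - {ex_upper:<4}    {freq}")
```

The inclusive classes 8 - 11, 12 - 15, ... become 7.5 - 11.5, 11.5 - 15.5, ... in the exclusive form, with the frequencies unchanged.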
INTRODUCTION TO TABULATION
DEFINITION
According to Tuttle, “A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers, with sufficient explanatory and qualifying words, phrases and statements in the form of titles, heading and footnotes to make clear the full meaning of the data and their origin”
OBJECTIVES OF TABULATION
1. To simplify the complex data
2. To economize space
3. To facilitate comparison
4. To facilitate statistical analysis
5. To save time
6. To depict trend
7. To help reference
Components of Table
1. Table number
2. Title of the table
3. Caption / Box head
4. Stub
5. Body / Field
6. Head note
7. Foot note
8. Source data
| Stub headings   | Caption                                               | Total  |
|                 | Subhead                   | Subhead                   | (rows) |
|                 | Column head | Column head | Column head | Column head |        |
| Stub entries    |             |             |             |             |        |
| Total (columns) |             |             |             |             |        |
Foot note:
Source note:
REQUIREMENTS OF GOOD STATISTICAL TABLES
1. Suit the purpose
2. Scientifically prepared
3. Clarity
4. Manageable size
5. Columns and rows should be numbered
6. Suitably approximated
7. Attractive get-up
8. Units
9. Average and totals
10. Logical arrangement of items
11. Proper lettering
Types of tables
1. Simple and complex tables.
2. General purpose and special purpose tables.
3. Original and derived tables.
Advantages of classification and tabulation
1. Clarifies the object
2. Simplifies the complex data
3. Economises space
4. Facilitates comparison
5. Helps in reference
6. Depicts the trend
Disadvantages of classification and tabulation
1. Complicated process
2. Not all data can be put into tables
3. Lack of flexibility