soc303 social statistics · 2020. 8. 27. · called inferential statistics. computing the single...


SOC303 Social Statistics

In this study session, you will discuss the nature of statistical inquiries. You will start by highlighting the importance and roles of statistics. This will lead to a discussion of the applicability of statistics to science and other fields of study. Moving on, you will categorize data based on source and form. Here you will discuss the primary and secondary types of data. Likewise, you will describe cross-sectional, time-series and panel data.

When you have studied this session, you should be able to:

1.1 define statistics

1.2 describe data

1.3 highlight methods of data collection

1.4 discuss the sources of data in Nigeria

This Study Session requires one hour of formal study time. You may spend an additional two hours on revision.


Study Session 1 | Nature of Statistical Inquiries

Key terms: business statistics, quantitative data, qualitative data, inferential statistics, descriptive statistics

Session outline: What is Statistics? (importance, roles, applications); Data in Statistics (types of data based on sources and based on forms); Methods of Data Collection (observation, mail questionnaire, personal interview, telephone interview)


To get a clear grasp of what statistics is, we need to examine a series of definitions provided by various authors. Some authors argue that Statistics is the practice or science of collecting and analysing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample. Another definition sees Statistics as a form of mathematical analysis that uses quantified models, representations and synopses for a given set of experimental data or real-life studies. Statistics studies methodologies to gather, review, analyse and draw conclusions from data. Some statistical measures and techniques include the mean, regression analysis, skewness, kurtosis, variance and analysis of variance. In addition, some see Statistics as the study of the collection, analysis, interpretation, presentation, and organization of data.

For most students, statistics is not a favourite course. It is viewed as hard, or cosmic, or just plain confusing. By now, you should be thinking: "I could just skip stat, and avoid making inferences about what populations are like by always collecting data on the whole population and knowing for sure what the population is like." Well, many things come back to money, and it is money that makes you take stat. Collecting data on a whole population is usually very expensive, and often almost impossible. If you can make a good, educated inference about a population from data collected from a small portion of that population, you will be able to save yourself, and your employer, a lot of time and money. You will also be able to make inferences about populations for which collecting data on the whole population is virtually impossible. Learning statistics now will allow you to save resources later and if the resources saved later are greater than the cost of learning statistics now, it will be worthwhile to learn statistics. It is my hope that the approach followed in this text will reduce the initial cost of learning statistics. If you have already had finance, you'll understand it this way—this approach to learning statistics will increase the net present value of investing in learning statistics by decreasing the initial cost.

Imagine how long it would take and how expensive it would be if Adamu and Suleiman decided that they had to find out what size of canvas every school football team player wore in order to see if football players wore the same size of canvas as basketball players. By knowing how samples are related to populations, Adamu and Suleiman can quickly and inexpensively get a good idea of what size of canvas football players wear, saving the school a lot of money.


There are two basic types of inferences that can be made. The first is to estimate something about the population, usually its mean. The second is to see if the population has certain characteristics; for example, you might want to infer whether a population has a mean greater than 5.6. This second type of inference is called hypothesis testing. If you understand hypothesis testing, estimation is easy. There are many applications, especially in more advanced statistics, in which the difference between estimation and hypothesis testing seems blurred.
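As a rough illustration of the two kinds of inference, the sketch below first estimates a population mean and then tests whether that mean is greater than 5.6. The ten sample values are hypothetical and are not taken from this session. The sketch assumes a one-sample t-test at the 5% level; the value 1.833 is the usual one-sided table value of the t distribution for 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample of 10 measurements (illustrative values only).
sample = [5.9, 6.1, 5.4, 6.3, 5.8, 6.0, 5.7, 6.2, 5.5, 6.1]
hypothesized_mean = 5.6

# Estimation: the sample mean is a point estimate of the population mean.
sample_mean = statistics.mean(sample)

# Hypothesis testing: is the population mean greater than 5.6?
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
t_statistic = (sample_mean - hypothesized_mean) / standard_error

# One-sided 5% critical value of the t distribution with n - 1 = 9 degrees of freedom.
T_CRITICAL = 1.833
reject_null = t_statistic > T_CRITICAL

print(f"Estimated mean: {sample_mean:.2f}")
print(f"t statistic: {t_statistic:.2f}")
if reject_null:
    print("The sample gives evidence that the mean exceeds 5.6")
else:
    print("The sample gives no evidence that the mean exceeds 5.6")
```

Both steps use the same sample mean and standard error, which is one reason the line between estimation and hypothesis testing can seem blurred.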

You will realize that statistics is useful in all spheres of human life. A woman with a given amount of money, going to the market to purchase foodstuff for the family, decides on the types of food items to purchase, and on the quantity and quality of those items, so as to maximize the satisfaction she will derive from the purchase. For all these decisions, the woman makes use of statistics.

Government uses statistics as a tool for collecting data on economic aggregates such as national income, savings, consumption and gross national product. Government also uses statistics to measure the effects of external factors on its policies and to assess the trends in the economy so that it can plan future policies.

Government uses statistics during a census. The various forms sent by the government to individuals and firms on annual income, tax returns, prices, costs, output and wage rates generate a lot of statistical data for the use of the government.

Business uses statistics to monitor the various changes in the national economy for the various budget decisions. Business makes use of statistics in production, marketing, administration and in personnel management.

Statistics is also used extensively to control and analyse stock levels such as minimum, maximum and reorder levels. It is used by business in market research to determine the acceptability of a product and the quantity that will be demanded at various prices by a given population in a geographical area. Management also uses statistics to make forecasts about the sales and labour cost of a firm.

Management uses statistics to establish a mathematical relationship between two or more variables for the purpose of predicting one variable in terms of the others. Statistics is also used extensively in the conduct and analysis of biological, physical, medical and social research.
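One common way to establish such a relationship is a least-squares straight line. The sketch below is only an illustration: the advertising and sales figures are hypothetical, and the choice of a simple linear model is an assumption, not something prescribed by this session.

```python
# Hypothetical data: advertising spend (x) and sales (y), in arbitrary units.
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.0]

n = len(advertising)
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

# Least-squares estimates for the line: sales = intercept + slope * advertising.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
s_xx = sum((x - mean_x) ** 2 for x in advertising)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict_sales(spend: float) -> float:
    """Predict sales for a given advertising spend using the fitted line."""
    return intercept + slope * spend


print(f"sales = {intercept:.2f} + {slope:.2f} * advertising")
print(f"Predicted sales at a spend of 6: {predict_sales(6.0):.2f}")
```

Once the line is fitted, one variable (sales) can be predicted in terms of the other (advertising), which is the use described above.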

In applying statistics to, e.g., a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse topics such as "all people living in a country" or "every atom composing a crystal". Statistics deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments. We therefore begin with a simple example:

There are thousands of commercial vehicles in Yola, Adamawa State. What is their average value?

It is obviously impractical to attempt to solve this problem directly by assessing the value of every single car in the state, adding up all those numbers, and then dividing by however many numbers there are. Instead, the best we can do would be to estimate the average. One natural way to do so would be to randomly select some of the cars, say 100 of them, ascertain the value of each of those cars, and find the average of those 100 numbers. The set of all those thousands of cars is called the population of interest, and the number attached to each one, its value, is a measurement. The average value is a parameter: a number that describes a characteristic of the population, in this case monetary worth. The set of 100 cars selected from the population is called a sample, and the 100 numbers, the monetary values of the cars we selected, are the sample data. The average of the data is called a statistic: a number calculated from the sample data.

If the average value of the cars in our sample was N80,357 then it seems reasonable to conclude that the average value of all cars is about N80,357. In reasoning this way we have drawn an inference about the population based on information obtained from the sample. In general, statistics is a study of data: describing properties of the data, which is called descriptive statistics, and drawing conclusions about a population of interest from information extracted from a sample, which is called inferential statistics. Computing the single number N80,357 to summarize the data was an operation of descriptive statistics; using it to make a statement about the population was an operation of inferential statistics.
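The vocabulary above (population, parameter, sample, statistic) can be sketched in a few lines of code. The population below is synthetic: the 5,000 vehicle values are randomly generated for illustration and do not describe real vehicles in Yola. In practice the population mean would be unknown; it is computed here only so that the sample mean can be compared with it.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic population: values (in naira) of 5,000 vehicles. These numbers are made up.
population = [random.uniform(40_000, 120_000) for _ in range(5_000)]

# Parameter: the population average, normally unknown in practice.
population_mean = statistics.mean(population)

# Sample: a simple random sample of 100 vehicles, drawn without replacement.
sample = random.sample(population, 100)

# Statistic: the sample average (descriptive statistics),
# used as an estimate of the parameter (inferential statistics).
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): N{population_mean:,.0f}")
print(f"Statistic (sample mean):     N{sample_mean:,.0f}")
```

Running the sketch with different seeds gives different samples, and so different sample means, while the parameter stays fixed for a given population.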

o A form of mathematical analysis that uses quantified models, representations and synopses for a given set of experimental data or real-life studies is referred to as ______________.

Statistics is a form of mathematical analysis that uses quantified models, representations and synopses for a given set of experimental data or real-life studies.


We are going to talk a lot about data in the course, so it is necessary for us to understand what it means. Data is a set of values of qualitative or quantitative variables. An example of qualitative data would be an anthropologist's handwritten notes about her interviews with people of an Indigenous tribe. Pieces of data are individual pieces of information. While the concept of data is commonly associated with scientific research, data is collected by a huge range of organizations and institutions, including businesses (e.g., sales data, revenue, profits, and stock price), governments (e.g., crime rates, unemployment rates, literacy rates) and non-governmental organizations (e.g., censuses of the number of homeless people by non-profit organizations). Data is measured, collected, reported and analysed, whereupon it can be visualized using graphs, images or other analysis tools. Data as a general concept refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing.

For us to have a clear understanding of the different types of data that exist, we have categorized them based on source and form. These categorizations are discussed below.

Data can be classified based on the sources from which they are obtained. In this regard, we have primary data and secondary data.

Primary data are data collected directly from the field of enquiry by the user(s) themselves. The researcher here collects the data directly from the source of information. Primary data are collected where:

i. The needed information does not exist elsewhere

ii. The needed information exists but is not reliable

iii. Collecting the information at first hand is the only way such information can be obtained

ADVANTAGES OF PRIMARY DATA

i. They are always relevant to the subject under study because they are collected primarily for the purpose.

ii. The investigator is able to get what he wants and to gain considerable insight into the issue he is seeking information about.

iii. The error rate is limited. They are more accurate and reliable.

iv. They provide an opportunity for the researcher to interact with the study population.

v. Information on other relevant issues can be obtained.

DISADVANTAGES OF PRIMARY DATA

i. Always costly to collect

ii. Inadequate cooperation from the study population

iii. Wastes a lot of time and energy

Secondary data are data which already exist and may be adapted for use in the current survey. Secondary data can be sourced from publications and records of governments and non-governmental organisations, journals of universities and research institutes, the media, and organizational and administrative records such as hospital records.

Secondary data are data which have been collected by someone else or some organization, either in published or unpublished form.

There are agencies saddled with the responsibility of collecting such statistical data on a regular basis and collating them into a form that most users will find suitable. With this type of data, a lesser degree of control is exercised by the individual investigator or user, as the initial collection is often done for a specific purpose that may be different from that of the secondary user.

For secondary data to be used with a reasonable degree of confidence, the validity of such data must be assessed. This involves checking the following:

a. The source of the data

b. The purpose for which it was collected

c. The method of data collection used

d. Definition of terms used

e. Coverage and changes over time, if any

f. Method of analysis

Secondary data may be used in three ways by a researcher. First, some specific information from secondary sources may be used for reference purposes; second, secondary data may be used as benchmarks against which the findings of a research project may be tested; and third, secondary data may be used as the sole source of information for a research project.

ADVANTAGES OF SECONDARY DATA

i. Secondary data, if available, can be secured quickly and cheaply.

ii. It is less expensive. A wider geographical area and a longer reference period may be covered without much cost.

iii. The use of secondary data broadens the database from which scientific generalizations can be made.

iv. The use of secondary data enables a researcher to verify the findings based on primary data.

DISADVANTAGES OF SECONDARY DATA

i. May not completely meet the need of the research at hand because it was not collected primarily for that purpose.

ii. Available data may not be as accurate as desired.

iii. Secondary data may not be up to date and might have become obsolete by the time they appear in print, because of the time lag in producing them.

iv. Information about the whereabouts of sources may not be available to all researchers.

Classification based on form of the data: Sometimes, data are classified based on the form of the data at hand, and may be classified as cross-sectional, time-series or panel data.

Cross-sectional data are data collected on a cross-section of subjects (the population under study) at one point in time. For example, data collected on a cross-section of households on demand for recharge cards for the month of August 2013.

Time-series data are data collected on a particular variable or set of variables over time, e.g. a set of Nigeria’s Gross Domestic Product (GDP) values from 1970 to 2012.

Panel data combine the features of cross-sectional and time-series data. They are data collected from the same subjects over time. For example, a set of data collected on monthly recharge card expenditure from about 100 households in Lagos from January to December 2013 will form panel data.
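The difference between the three forms can be seen in how the observations are indexed. The sketch below is a minimal illustration using hypothetical recharge card expenditure (in naira) for three made-up households over three months; none of the figures come from a real survey.

```python
# Panel data: the same subjects (households) observed over several periods (months).
# All household names and amounts are hypothetical.
panel = {
    ("Household A", "2013-01"): 1500,
    ("Household A", "2013-02"): 1700,
    ("Household A", "2013-03"): 1600,
    ("Household B", "2013-01"): 900,
    ("Household B", "2013-02"): 1000,
    ("Household B", "2013-03"): 950,
    ("Household C", "2013-01"): 2000,
    ("Household C", "2013-02"): 2200,
    ("Household C", "2013-03"): 2100,
}

# Cross-sectional data: many subjects at one point in time (here, February 2013).
cross_section = {
    household: amount
    for (household, month), amount in panel.items()
    if month == "2013-02"
}

# Time-series data: one subject followed over time (here, Household A).
time_series = {
    month: amount
    for (household, month), amount in panel.items()
    if household == "Household A"
}

print("Cross-section:", cross_section)
print("Time series:  ", time_series)
```

Fixing the month in a panel leaves a cross-section, and fixing the subject leaves a time series, which is why panel data are said to combine the features of both.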

Social and economic data of national importance are collected routinely as a by-product of governmental activities, e.g. information on trade, wages, prices, education, health, crime, aids and grants, etc.

Observation is a means of collecting first-hand information. In addition, there are three survey methods of data collection: mail questionnaires, personal interviews, and telephone interviews.


In observation, the information is sought through the investigator’s own direct observation, without asking the respondent. This could be by participant observation (events are observed without the participants knowing it, e.g. a study of bank queues) or mechanical observation (mechanical means are employed to gather more complex information, e.g. weather forecast).

ADVANTAGES OF OBSERVATION

i. Elimination of subjective bias

ii. Elicited information is current

iii. The method is independent of respondents’ willingness or otherwise to respond

LIMITATIONS OF OBSERVATION

i. It could be expensive

ii. Information provided is very limited

iii. Unforeseen factors may interfere with the observation activities

The mail questionnaire is an impersonal survey method. Here, the survey instrument (the questionnaire) is mailed to the selected respondents, who mail it back to the researcher after filling it in. This is very common in developed countries where the citizens appreciate the relevance of data and research. Under certain conditions and for a number of research purposes, an impersonal method of data collection can be useful.

ADVANTAGES OF MAIL QUESTIONNAIRE

i. The cost is low compared to other methods.

ii. Biasing error is reduced because respondents are not influenced by interviewer characteristics or techniques.

iii. Questionnaires provide a high degree of anonymity for respondents. This is especially important when sensitive issues are involved.

iv. Respondents have time to think about their answers and/or consult other sources.

v. Questionnaires provide wide access to geographically dispersed samples at low cost.

DISADVANTAGES OF MAIL QUESTIONNAIRE

i. Questionnaires require simple, easily understood questions and instructions.

ii. Mail questionnaires do not offer researchers the opportunity to probe for additional information or to clarify answers.

iii. Researchers cannot control who fills out the questionnaire.

iv. Response rates are low.


Factors Affecting the Response Rate of Mail Questionnaires

Researchers use various strategies to overcome the difficulty of securing an acceptable response rate to mail questionnaires and to increase the response rate.

i. Sponsorship: The sponsorship of a questionnaire motivates the respondents to fill the questionnaires and return them. Therefore, investigators must include information on sponsorship, usually in the cover letter accompanying the questionnaire.

ii. Inducement to respond: Researchers who use mail surveys must appeal to the respondents and persuade them that they should participate by filling out the questionnaires and mailing them back. For example, a student conducting a survey for a class project may mention that his or her grade may be affected by the response to the questionnaire.

iii. Questionnaire format and methods of mailing: Designing a mail questionnaire involves several considerations: typography, colour, and the length and type of cover letter.

The personal interview is a face-to-face, interpersonal role situation in which an interviewer asks respondents questions designed to elicit answers pertinent to the research hypotheses. The questions, their wording, and their sequence define the structure of the interview.

ADVANTAGES OF PERSONAL INTERVIEW

i. Flexibility: The interview allows great flexibility in the questioning process, and the greater the flexibility, the less structured the interview. Some interviews allow the interviewer to determine the wording of the questions, to clarify terms that are unclear, to control the order in which the questions are presented, and to probe for additional information and details.

ii. Control of the interview situation: An interviewer can ensure that the respondents answer the questions in the appropriate sequence or that they answer certain questions before they are asked subsequent questions.

iii. High response rate: The personal interview results in a higher response rate than the mail questionnaire.

iv. Fuller information: An interviewer can collect supplementary information about respondents. This may include background information, personal characteristics and their environment that can aid the researcher in interpreting the results.

DISADVANTAGES OF THE PERSONAL INTERVIEW

i. Higher cost: The cost of interview studies is significantly higher than that of mail surveys. Costs are involved in selecting, training, and supervising interviewers; in paying them; and in the travel and time required to conduct interviews.

ii. Interviewer bias: The very flexibility that is the chief advantage of interviews leaves room for the interviewer’s personal influence and bias.


iii. Lack of anonymity: The interview lacks the anonymity of the mail questionnaire. Often the interviewer knows all or many of the potential respondents (their names, addresses, and telephone numbers). Thus respondents may feel threatened or intimidated by the interviewer, especially if a respondent is sensitive to the topic or some of the questions.

The telephone interview, also called the telephone survey, can be characterized as a semi-personal method of collecting information. Compared with the personal interview, the telephone is convenient and produces a very significant cost saving.

ADVANTAGES OF TELEPHONE INTERVIEW

i. Moderate cost

ii. Speed: Telephone interviews can reach a large number of respondents in a short time. Interviewers can code data directly into computers, which can later compile the data.

iii. High response rate: Telephone interviews provide access to people who might be unlikely to reply to a mail questionnaire or refuse a personal interview.

iv. Quality: High quality data can be collected when interviewers are centrally located and supervisors can ensure that questions are being asked correctly and answers are recorded properly.

DISADVANTAGES OF TELEPHONE INTERVIEW

i. Reluctance to discuss sensitive topics: Respondents may be reluctant to discuss some issues over the phone.

ii. The “broken off” interview: Respondents can terminate the interview before it is completed.

iii. Less information: Interviewers cannot provide supplemental information about the respondents’ characteristics or environment.

o The three methods of data collection are _______, ___________, and ______________.

Mail questionnaire, personal interview, and telephone interview are the three major methods of collecting data.


1.1 define statistics

Statistics is the practice or science of collecting and analysing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.

We can also see Statistics as a form of mathematical analysis that uses quantified models, representations and synopses for a given set of experimental data or real-life studies.

1.2 describe data

Data is a set of values of qualitative or quantitative variables.

Data is measured, collected and reported, and analysed, whereupon it can be visualized using graphs, images or other analysis tools.

Data as a general concept refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing.

1.3 highlight methods of data collection

We were able to discover that there are basically three methods of data collection. They are:

Mail questionnaire

Personal interview

Telephone interview

SAQ 1.1 (tests Learning Outcome 1.1) Statistics is seen as the process of data collection and analysis. Can you explain further?

SAQ 1.2 (tests Learning Outcome 1.2) Mr Adamu came to your school to collect some information which he will process as data for his personal project. Under which category of data, based on source, do Mr Adamu's data fall? Discuss.

SAQ 1.3 (tests Learning Outcome 1.3) From our discussion in this study session, you realized there are 3 major methods of data collection. Discuss any two of these methods.


Scan the QR code with your mobile device to reveal the feedback to the SAQs. The feedback is also available on your course website: https://lms.cdlce.uniabuja.edu.ng/course/view.php?id=133