why include statistics as part of psychology? –doing psychology research –reading psychology...

• Why include statistics as part of Psychology?
– Doing psychology research
– Reading psychology research articles
– Analytical reasoning, critical thinking
• Statistics
– Fundamental tool for all scientific inquiry
– Way of making sense out of data

Statistics

Populations and Samples

• Population
– The group of individuals (or things) of interest in a particular study
• For example, a researcher may be interested in the relation between class size (variable 1) and academic performance (variable 2) for the population of third-grade children.
• Sample
– Usually populations are so large that a researcher cannot examine the entire group
• A sample is selected to represent the population in a research study
• Sample size depends on the type of research
– The goal is to use the results obtained from the sample to help answer questions about the population

Sampling from a Population

Figure 1-1 (p. 6) The relationship between a population and a sample. The sample is selected to be representative of the population, and results from the sample are used to make an inference about the population.

Variables And Data

• A variable is a characteristic or condition that can change or take on different values
– Most research begins with a general question about the relationship between two variables for a specific group of individuals (similar to forming a hypothesis)
• Data are measurements or observations
– The measurements obtained in a research study are called the data or data set
– Each measurement is a datum (singular) or score
– The goal of statistics is to help researchers organize and interpret the data.

Sources of Data

• Observation Research
– Naturalistic, no intervention
– Poor control
– Correlational data
• Survey Research
– A correlational method of collecting data
– Does not exercise any control over time order
– Poor control of alternatives
– Can show relationships
• Experiments
– Exercise control over covariation, time order and alternatives
– Can help establish causation

Using Statistics in Psychology

• Carrying out psychological research using an empirical approach means collecting data. Statistics are a way of making use of these data
– Descriptive Statistics: used to describe characteristics of our sample
– Inferential Statistics: used to generalise from our sample to our population
– Any samples used should therefore be representative of the target population

Descriptive Statistics

• Descriptive statistics are methods for organizing and summarizing data.
• For example, tables or graphs are used to organize data, and descriptive values such as the average score are used to summarize data.

• A descriptive value for a population is called a parameter and a descriptive value for a sample is called a statistic.

Inferential Statistics

• Inferential statistics
– Methods for using sample data to make general conclusions (inferences) about populations
– A sample is only a part of the population
– Sample data provide only limited information about the population
– Sample statistics are imperfect representatives of the population parameters because of sampling error

Sampling Error

• The discrepancy between a sample statistic and its population parameter is called sampling error.

• Defining and measuring sampling error is a large part of inferential statistics.

Sampling from a Population

Figure 1-2 (p. 9) A demonstration of sampling error. Two samples are selected from the same population. Notice that the sample statistics are different from one sample to another, and all of the sample statistics are different from the corresponding population parameters. The natural differences that exist, by chance, between a sample statistic and a population parameter are called sampling error.
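The idea of sampling error can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 1,000 scores, the sample size of 25, and the helper name `sampling_error` are all made up for the demonstration and do not come from the figure.

```python
import random
import statistics

# Hypothetical population: 1,000 scores (illustrative values, not from the textbook).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(100, 15) for _ in range(1000)]

# Parameter: a descriptive value for the whole population.
mu = statistics.mean(population)


def sampling_error(sample, parameter):
    """Return the sample mean (a statistic) and its discrepancy from the parameter."""
    m = statistics.mean(sample)
    return m, m - parameter


# Two samples of n = 25 selected from the same population.
sample_1 = rng.sample(population, 25)
sample_2 = rng.sample(population, 25)

m1, err1 = sampling_error(sample_1, mu)
m2, err2 = sampling_error(sample_2, mu)

print(f"population mean (parameter): {mu:.2f}")
print(f"sample 1 mean (statistic):   {m1:.2f}, sampling error {err1:+.2f}")
print(f"sample 2 mean (statistic):   {m2:.2f}, sampling error {err2:+.2f}")
```

Changing the seed or the sample size changes the two sample means, but neither is expected to match the population mean exactly.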

Margin of Error Box 1.1

• Is an example of sampling error

• Terminology used in polling data such as political polls

• Amount of error between a sample statistic and a population parameter

• There will always be sampling error in:
– survey research
– experiments

Figure 1-3 (p. 10) The role of statistics in experimental research.

Relationship Between Variables

• Correlational Method
– Measuring two variables for each individual
• Height and Weight
• SAT and GPA
• Wake-up Time and Academic Performance (Figure 1.4)
– Determine the relationship between the variables
– Limitations of the correlational method
• Cannot demonstrate cause-and-effect relationships

Figure 1.4 One of two data structures for studies evaluating the relationship between variables. Note that there are two separate measurements for each individual (wake-up time and academic performance). The same scores are shown in a table (a) and in a graph (b).

Hypothetical data showing results from a correlational study evaluating the relationship between exposure to TV violence and aggressive behavior for a sample of 10 children. Note that we have measured two different variables, obtaining two different scores, for each child. The data show a tendency for higher levels of TV violence to be associated with higher levels of aggressive behavior.
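A correlational data structure (two scores per individual) can be summarized with a correlation coefficient. The sketch below assumes the Pearson correlation as the statistic, which the slides do not name, and uses invented scores for 10 children rather than the values shown in the figure.

```python
import math

# Hypothetical scores for 10 children (illustrative only, not the figure's data).
tv_violence = [1, 2, 2, 3, 4, 5, 5, 6, 7, 8]
aggression = [2, 1, 3, 3, 5, 4, 6, 6, 8, 7]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)


r = pearson_r(tv_violence, aggression)
print(f"r = {r:.2f}")  # a positive r means higher X tends to go with higher Y
```

A positive value describes the tendency for the two variables to rise together; it does not show that one causes the other.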

• Comparing two groups of scores
– Experimental (see Figure 1.5)
• One variable defines the groups (violence vs no violence)
– Independent variable
• Another variable is the measurement, scores from the groups
– Dependent variable
– Nonexperimental (“quasi-experimental”)
• Natural or pre-existing groups, such as gender, which are selected, not manipulated
• Before and after measurements, for example before and after therapy
– Do not confuse with control vs experimental groups
– There is only one group of participants, who get measured twice

Relationship Between Variables

Figure 1.6 The Structure of an experiment. Participants are randomly assigned to one of two treatment conditions: counting money or counting blank pieces of paper. Later, each participant is tested by placing one hand in a bowl of hot (122 F) water and rating the level of pain. A difference between the ratings for the two groups is attributed to the treatment (paper vs money).

The structure of an experiment. Volunteers are randomly assigned to one of two treatment conditions: a 70° room or a 90° room. A list of words is presented and the participants are tested by writing down as many words as they can remember from the list. A difference between groups is attributed to the treatment (the temperature of the room).

In this experiment, the effect of instructional method (the independent variable) on test performance (the dependent variable) is examined. However, any difference between groups in performance cannot be attributed to the method of instruction. In this experiment, there is a confounding variable. The instructor teaching the course varies with the independent variable, so that the treatment of the groups differs in more ways than one (instructional method and instructor vary).

Figure 1-7 (p. 17) Two examples of nonexperimental studies that involve comparing two groups of scores. In (a) the study uses two preexisting groups (boys/girls) and measures a dependent variable (verbal scores) in each group.

In (b), time is the variable used to define the two groups, and the dependent variable (depression) is measured at each of the two times.

NonExperimental “quasi-experimental” Terminology

• Similar data structure to experiments
– One variable identifies groups (independent)
– A second variable is measured to obtain data (dependent)
• For nonexperimental studies
– Independent variable such as gender is not manipulated
– So it is called a “quasi-independent variable”

Data Structures and Statistical Method

• Data structure is used to classify statistical methods
– One group with two variables measured for each individual
• Survey research
– Collect GPA and SAT scores for each person
– Use correlational statistics to describe the data
• Survey or Observational Research
– Number of individuals in a group

– Groups based on “natural” categories such as gender

– Groups based on some activity such as “talk” vs “text” (table 1.1)

– Use Chi-square statistic to describe the data

– See scales of measurement on page 23
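When the data are counts of individuals in categories, a chi-square statistic compares the observed counts with the counts expected if the two classifications were unrelated. The sketch below is a minimal version of that calculation; the group labels and frequencies are invented for illustration and are not the values in Table 1.1.

```python
# Hypothetical frequency counts (illustrative only; not the values in Table 1.1).
observed = {
    "group A": {"talk": 30, "text": 20},
    "group B": {"talk": 15, "text": 35},
}


def chi_square(table):
    """Chi-square statistic for a table of counts given as {row: {column: count}}."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_total = {r: sum(table[r].values()) for r in rows}
    col_total = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_total.values())
    chi2 = 0.0
    for r in rows:
        for c in cols:
            # Expected count if row and column classifications are independent.
            expected = row_total[r] * col_total[c] / n
            chi2 += (table[r][c] - expected) ** 2 / expected
    return chi2


print(f"chi-square = {chi_square(observed):.2f}")
```

A value of zero would mean the observed counts match the expected counts exactly; larger values indicate a bigger discrepancy.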

Data Structures and Statistical Method

• Data structure is used to classify statistical methods
– Two or more groups of scores

• Compare two groups such as “Money” and “Paper” (fig 1.6)

• Two groups of individuals

• Compare average from each group of scores

• Several different statistical tests are used such as t-test or ANOVA based on number of groups
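Comparing the averages of two groups can be sketched as follows. The ratings are invented for illustration (they are not the data behind Figure 1.6), and the pooled-variance independent-measures t statistic is one common choice assumed here, not a method the slides spell out.

```python
import math
import statistics

# Hypothetical pain ratings for two treatment groups (illustrative only).
money = [5, 6, 4, 7, 5, 6]
paper = [8, 7, 9, 6, 8, 7]


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two separate groups."""
    n1, n2 = len(a), len(b)
    m1, m2 = statistics.mean(a), statistics.mean(b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, n1 + n2 - 2


t, df = independent_t(money, paper)
print(f"mean (money) = {statistics.mean(money)}, mean (paper) = {statistics.mean(paper)}")
print(f"t({df}) = {t:.2f}")
```

With more than two groups, an analysis of variance (ANOVA) would take the place of the t statistic.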

Constructs and Operational Definitions

To form a hypothesis from a research question the researcher needs to define the variables
– What are the effects of drug “Bulk-O” on weight gain?
• Independent variable is drug or no drug
• Dependent variable is weight gain which is “concrete”
– What are the effects of drug PQX1450 on Anxiety?
• Independent variable is drug or no drug

• Dependent variable is “anxiety” which is a construct

• so we need to define the construct of anxiety

• Need an Operational Definition for anxiety

– How intelligent are students taking Methods course?

Variables And Measurement

Discrete Variable
– Discrete categories such as students, cars, houses
– Usually a count of the number of individuals or things
• number of students in class
• number of cars in the parking lot
• number of houses along the street
– Also called “Categorical variables”
– Sometimes referred to as “Qualitative variables”, which is confusing because qualitative is just description, not counting

Continuous Variable
– Variable can be divided into an infinite number of values
– height, weight, time

Use of Real Limits with Continuous Variables

When working with a continuous variable
– Can adjust precision by changing units
• Hours vs Minutes vs Seconds, up to the limit of accuracy for the measuring device such as a wall clock or a stopwatch

Because a variable such as weight is infinitely divisible, the researcher needs to set boundaries or limits
– Use real limits, which are boundaries located exactly halfway between adjacent categories.
– Researcher decides where to set limits as a practical matter, such as recording weight to the nearest pound
• So if someone has a weight of 149.6 they are recorded as 150
• Each value such as “150” is an interval with upper and lower limits
• Values that fall on the boundary, such as “150.5”, can be rounded up or down; just be consistent with the rounding rule

Figure 1.8 (p. 21) When measuring weight to the nearest whole pound, 149.6 and 150.3 are assigned the value of 150 (top). Any value in the interval between 149.5 and 150.5 is given the value of 150.
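Real limits can be expressed as a short calculation. The function names below are hypothetical, and the "always round boundary values up" rule is just one consistent choice assumed for the sketch.

```python
import math


def real_limits(value, unit=1.0):
    """Lower and upper real limits: half a unit below and above the recorded value."""
    half = unit / 2
    return value - half, value + half


def record_to_nearest(measurement, unit=1.0):
    """Round to the nearest unit; boundary values always go up (one consistent rule)."""
    return math.floor(measurement / unit + 0.5) * unit


print(real_limits(150))          # (149.5, 150.5)
print(record_to_nearest(149.6))  # 150.0
print(record_to_nearest(150.3))  # 150.0
print(record_to_nearest(150.5))  # 151.0
```

Python's built-in `round` sends exact halves to the nearest even number (so `round(150.5)` gives 150), which is a different but equally consistent rule.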

Measuring Variables

To establish relationships between variables, researchers must observe the variables and record their observations. This requires that the variables be measured.

The process of measuring a variable requires a set of categories called a scale of measurement and a process that classifies each individual into one category.

Four Types of Measurement Scales :

Nominal (by name / category)

Ordinal (by order / rank)

Interval (meaningful, equal interval scaling)

Ratio (interval with a “real” zero point, e.g. degrees Kelvin)

Scales of Measurement

Nominal Scale
– “Names”
– Classifying subjects into categories
– No category is “more” or “less,” just different
– Categories can be labeled by

• words (e.g., Male, Female) or

• numbers (e.g., 0, 1) which can be confusing

– Nominal scale always yields discrete variable


Ordinal Scale
– “Ordered”
– Categories are in ordered sequence, ranked
– Examples:
• Gold, silver, bronze medals
• Don’t know how far gold was from silver, or silver from bronze
• Class standing (33rd out of 108)
– Ordinal scale technically yields discrete variables (cannot be ranked 33rd and a half)

– Different statistical procedures are required.


Interval Scales
– Distance between two values is the same at any point on the scale
• The difference between scores of 6 and 10 is 4 units
• The difference between scores of 26 and 30 is 4 units
– Interval scale does not have absolute zero
• Attitudinal scales: on a scale from 1 (not at all) to 10 (a great deal), how much do you like anchovy pizza?
• Example 1.2 page 25: convert ratio scale to interval scale
• Height measurements (ratio scale) can be converted to difference scores, i.e. difference from the average score
• Average height of 50 inches, so a height of 52 becomes a difference score of +2, which is an interval scale measurement
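The conversion from heights to difference scores can be sketched as below. The four heights are invented so that their average is 50 inches, matching the example; only the 52-inch case comes from the slide.

```python
# Hypothetical heights in inches (ratio scale), chosen so the average is 50.
heights = [44, 50, 52, 54]
average = sum(heights) / len(heights)  # 50.0

# Difference scores: distance from the average (interval scale, zero is arbitrary).
difference_scores = [h - average for h in heights]
print(difference_scores)  # [-6.0, 0.0, 2.0, 4.0]

# Distances between individuals are preserved by the conversion...
print(heights[3] - heights[2], difference_scores[3] - difference_scores[2])  # 2 2.0

# ...but ratios are not: 54 in. is about 1.04 times 52 in.,
# while the matching difference scores (+4 and +2) give a ratio of 2.
print(heights[3] / heights[2], difference_scores[3] / difference_scores[2])
```

This is why ratio comparisons such as "twice as tall" only make sense on the original scale, where zero means no height at all.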


Ratio Scales
– In addition to having equal intervals, we can calculate ratios
– Ratio scale has meaningful, absolute zero

• Distance: zero distance

• Weight: zero weight

• Temperature: absolute zero but not zero degrees Fahrenheit

• Time: zero time ??

But what about
– IQ score? Interval
– Score on test of neuroticism? Interval


In practice, many psychological variables give more than ordinal-level information, but it is not possible to clearly establish that they are interval-level. They are generally treated as interval data.

Many statistical procedures assume at least interval-level data, but function reasonably well with ordinal-level data.

Statistical Notation

N refers to number of subjects; N = 6.

X_i refers to the ith person’s score on variable X.

For this data set, X_4 = 1, Y_2 = 10.

Subject#   Gender (X)   Age (Y)
1          1            8
2          1            10
3          0            7
4          1            6
5          0            10
6          0            12

– X is a discrete variable, where 0 = men and 1 = women.
– Y is a continuous variable, representing years of age.


Greek letter Sigma (Σ) symbolizes summation

Subject#   Gender (X)   Age (Y)
1          Girl         8
2          Girl         10
3          Boy          7
4          Girl         6
5          Boy          10
6          Boy          12

ΣY = ? 53 (8 + 10 + 7 + 6 + 10 + 12)

ΣX = ? Illegal operation: Gender (X) is a set of category labels, so its scores cannot be added
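The notation maps onto code fairly directly. In this sketch the helper `score` is a hypothetical name, added only because the subscripts count from 1 while Python lists count from 0.

```python
# Data from the table: gender codes (0 = men, 1 = women) and ages in years.
X = [1, 1, 0, 1, 0, 0]
Y = [8, 10, 7, 6, 10, 12]

N = len(X)  # number of subjects


def score(values, i):
    """Return the i-th person's score using 1-based subscripts, as in X_i."""
    return values[i - 1]


print(N, score(X, 4), score(Y, 2))  # 6 1 10
print(sum(Y))                       # 53, the sum of the ages

# Summing category labels is not a defined operation.
gender_labels = ["Girl", "Girl", "Boy", "Girl", "Boy", "Boy"]
try:
    sum(gender_labels)
except TypeError:
    print("Summation is not defined for category labels")
```

Note that `sum(X)` on the 0/1 codes would run and return 3, but that number is only a count of the cases coded 1, not a meaningful total of "gender".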

Statistical Notation: Examples 1.3, 1.4 & 1.5 (p. 28)

X    X²   (X-1)   (X-1)²
3    9    2       4
1    1    0       0
7    49   6       36
4    16   3       9

ΣX = 3+1+7+4 = 15
ΣX² = 9+1+49+16 = 75
Σ(X-1) = 2+0+6+3 = 11
Σ(X-1)² = 4+0+36+9 = 49

However, ΣX - 1 = 15 - 1 = 14, because the order of operations is:
1. parentheses
2. exponents
3. multiply / divide
4. summation (Σ)
5. addition / subtraction
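The order of operations can be checked by writing each expression out explicitly. The variable names are illustrative; the last line, (ΣX)², is an extra contrast case that is not among the slide's examples.

```python
X = [3, 1, 7, 4]

sum_x = sum(X)                               # ΣX: add the scores -> 15
sum_x_squared = sum(x ** 2 for x in X)       # ΣX²: square each score, then add -> 75
sum_x_minus_1 = sum(x - 1 for x in X)        # Σ(X-1): subtract inside parentheses first -> 11
sum_sq_x_minus_1 = sum((x - 1) ** 2 for x in X)  # Σ(X-1)²: subtract, square, then add -> 49
sum_then_minus_1 = sum(X) - 1                # ΣX - 1: summation comes before subtraction -> 14
squared_sum = sum(X) ** 2                    # (ΣX)²: add first, then square -> 225

print(sum_x, sum_x_squared, sum_x_minus_1, sum_sq_x_minus_1, sum_then_minus_1, squared_sum)
```

Placing the arithmetic inside or outside the call to `sum` plays the same role as the parentheses in the notation.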


Example 1.6 (p. 29)

Person   X   Y   XY
A        3   5   15
B        1   3   3
C        7   4   28
D        4   2   8

ΣX = 3+1+7+4 = 15
ΣY = 5+3+4+2 = 14
ΣXY = 15+3+28+8 = 54

Σ(X + Y) = ΣX + ΣY
Σ(X + Y) = 8+4+11+6 = 29
ΣX + ΣY = 15+14 = 29
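The sums in Example 1.6 can be reproduced with paired lists. The final line, (ΣX)(ΣY), is an added contrast that is not part of the example; it is there only to show that multiplying the two sums is a different calculation from ΣXY.

```python
# Scores for persons A, B, C, D.
X = [3, 1, 7, 4]
Y = [5, 3, 4, 2]

sum_xy = sum(x * y for x, y in zip(X, Y))         # ΣXY: multiply each pair, then add -> 54
sum_of_pairs = sum(x + y for x, y in zip(X, Y))   # Σ(X + Y): add each pair, then sum -> 29
sum_x_plus_sum_y = sum(X) + sum(Y)                # ΣX + ΣY -> 29
product_of_sums = sum(X) * sum(Y)                 # (ΣX)(ΣY) -> 210, not the same as ΣXY

print(sum_xy, sum_of_pairs, sum_x_plus_sum_y, product_of_sums)
```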
