• Why include statistics as part of Psychology?
– Doing psychology research
– Reading psychology research articles
– Analytical reasoning, critical thinking
– Statistics
• Fundamental tool for all scientific inquiry
• Way of making sense out of data
Statistics
Populations and Samples
• Population
– the group of individuals (or things) of interest in a particular study
• For example, a researcher may be interested in the relation between class size (variable 1) and academic performance (variable 2) for the population of third-grade children.
• Sample
– Usually populations are so large that a researcher cannot examine the entire group
• a sample is selected to represent the population in a research study
• Sample size depends on the type of research
– The goal is to use the results obtained from the sample to help answer questions about the population
Sampling from a Population
Figure 1-1 (p. 6): The relationship between a population and a sample. (Figure labels: "representative of the population"; "Make an Inference".)
Variables And Data
• A variable is a characteristic or condition that can change or take on different values
– Most research begins with a general question about the relationship between two variables for a specific group of individuals (similar to forming a hypothesis).
• Data are measurements or observations
– The measurements obtained in a research study are called the data or data set
– Each measurement is a datum (singular) or score
– The goal of statistics is to help researchers organize and interpret the data.
Sources of Data
• Observation Research
– Naturalistic: no intervention
– Poor control
– Correlational data
• Survey Research
– A correlational method of collecting data
– Do not exercise any control over time order
– Poor control of alternatives
– Can show relationships
• Experiments
– Exercise control over covariation, time order and alternatives
– Can help establish causation
Using Statistics in Psychology
• Carrying out psychological research using an empirical approach means the collection of data. Statistics are a way of making use of these data
– Descriptive Statistics: used to describe characteristics of our sample
– Inferential Statistics: used to generalise from our sample to our population
– Any samples used should therefore be representative of the target population
Descriptive Statistics
• Descriptive statistics are methods for organizing and summarizing data.
• For example, tables or graphs are used to organize data, and descriptive values such as the average score are used to summarize data.
• A descriptive value for a population is called a parameter and a descriptive value for a sample is called a statistic.
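The two descriptive steps named above, organizing and summarizing, can be sketched in a few lines of Python. The scores below are hypothetical values chosen only for illustration. Because they are treated as a sample, the resulting mean is a statistic rather than a parameter.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample of scores (illustrative values, not real data).
scores = [8, 10, 7, 6, 10, 12]

# Organize: a frequency table listing how often each score occurs.
frequency_table = dict(sorted(Counter(scores).items()))

# Summarize: the average score of the sample (a statistic).
sample_mean = mean(scores)

for score, count in frequency_table.items():
    print(f"score {score}: frequency {count}")
print("sample mean:", round(sample_mean, 2))
```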
Inferential Statistics
• Inferential statistics
– methods for using sample data to make general conclusions (inferences) about populations
– a sample is only a part of the population
– sample data provide only limited information about the population
– sample statistics are imperfect representatives of the population parameters because of sampling error
Sampling Error
• The discrepancy between a sample statistic and its population parameter is called sampling error.
• Defining and measuring sampling error is a large part of inferential statistics.
Sampling from a Population
Figure 1-2 (p. 9): A demonstration of sampling error. Two samples are selected from the same population. Notice that the sample statistics are different from one sample to another, and all of the sample statistics are different from the corresponding population parameters. The natural differences that exist, by chance, between a sample statistic and a population parameter are called sampling error.
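A small simulation can make the idea in the figure concrete. This is only a sketch under stated assumptions: the population is simulated (1,000 scores drawn from a normal distribution with mean 100 and standard deviation 15), the sample size of 25 is arbitrary, and the names `population` and `sampling_error` are illustrative.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 1,000 scores (simulated, not real data).
population = [rng.gauss(100, 15) for _ in range(1000)]
mu = mean(population)  # population parameter

def sampling_error(n):
    """Draw a random sample of size n; return sample mean minus population mean."""
    sample = rng.sample(population, n)
    return mean(sample) - mu

# Two samples from the same population give two different sampling errors.
errors = [sampling_error(25) for _ in range(2)]

print("population mean:", round(mu, 2))
for i, err in enumerate(errors, start=1):
    print(f"sample {i}: sampling error = {err:+.2f}")
```

Changing the seed or the sample size changes the individual errors, but some discrepancy between the statistic and the parameter will almost always remain.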
Margin of Error Box 1.1
• Is an example of sampling error
• Terminology used in polling data such as political polls
• Amount of error between a sample statistic and a population parameter
• There will always be sampling error in:
– survey research
– experiments
Figure 1-3 (p. 10): The role of statistics in experimental research.
Relationship Between Variables
• Correlational Method
– Measuring two variables for each individual
• Height and Weight
• SAT and GPA
• Wake-up Time and Academic Performance (Figure 1.4)
– Determine the relationship between the variables
– Limitations of the correlational method
• Cannot demonstrate cause-and-effect relationships
Figure 1.4: One of two data structures for studies evaluating the relationship between variables. Note that there are two separate measurements for each individual (wake-up time and academic performance). The same scores are shown in a table (a) and in a graph (b).
Hypothetical data showing results from a correlational study evaluating the relationship between exposure to TV violence and aggressive behavior for a sample of 10 children. Note that we have measured two different variables, obtaining two different scores, for each child. The data show a tendency for higher levels of TV violence to be associated with higher levels of aggressive behavior.
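As an illustration of describing such paired scores with a correlational statistic, the sketch below computes Pearson's r by hand. The slides do not name a specific statistic, so the choice of Pearson's r is an assumption, and the five pairs of scores are made-up values, not the data from the figure.

```python
from math import sqrt

# Hypothetical paired scores for 5 children (illustrative values only).
tv_violence = [1, 2, 3, 4, 5]
aggression = [2, 1, 4, 3, 5]

def pearson_r(x, y):
    """Pearson correlation: sum of products of deviations over sqrt(SSx * SSy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

r = pearson_r(tv_violence, aggression)
print("r =", round(r, 2))  # 0.8 for these values
```

A positive r describes a tendency for higher scores on one variable to go with higher scores on the other. It does not show that one variable causes the other.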
• Comparing two groups of scores
– Experimental (see Figure 1.5)
• One variable defines the groups (violence vs no violence)
– Independent variable
• Another variable is the measurement, scores from the groups
– Dependent variable
– NonExperimental “quasi-experimental”
• Natural or pre-existing groups, such as gender, which are selected, not manipulated
• Before and after measurements, for example before and after therapy
– Do not confuse with control vs experimental groups
– There is only one group of participants, who are measured twice
Relationship Between Variables
Figure 1.6: The structure of an experiment. Participants are randomly assigned to one of two treatment conditions: counting money or counting blank pieces of paper. Later, each participant is tested by placing one hand in a bowl of hot (122°F) water and rating the level of pain. A difference between the ratings for the two groups is attributed to the treatment (paper vs money).
The structure of an experiment. Volunteers are randomly assigned to one of two treatment conditions: a 70° room or a 90° room. A list of words is presented and the participants are tested by writing down as many words as they can remember from the list. A difference between groups is attributed to the treatment (the temperature of the room).
In this experiment, the effect of instructional method (the independent variable) on test performance (the dependent variable) is examined. However, any difference between groups in performance cannot be attributed to the method of instruction. In this experiment, there is a confounding variable. The instructor teaching the course varies with the independent variable, so that the treatment of the groups differs in more ways than one (instructional method and instructor vary).
Figure 1-7 (p. 17) Two examples of nonexperimental studies that involve comparing two groups of scores. In (a) the study uses two preexisting groups (boys/girls) and measures a dependent variable (verbal scores) in each group.
In (b), time is the variable used to define the two groups, and the dependent variable (depression) is measured at each of the two times.
NonExperimental “quasi-experimental” Terminology
• Similar data structure to experiments
– One variable identifies groups (independent)
– A second variable is measured to obtain data (dependent)
• For nonexperimental studies
– Independent variable such as gender is not manipulated
– So it is called a “quasi-independent variable”
Data Structures and Statistical Method
• Data structure is used to classify statistical methods
– One group with two variables measured for each individual
• Survey research
– Collect GPA and SAT scores for each person
– Use correlational statistics to describe the data
• Survey or Observational Research
– Number of individuals in a group
– Groups based on “natural” categories such as gender
– Groups based on some activity such as “talk” vs “text” (table 1.1)
– Use Chi-square statistic to describe the data
– See scales of measurement on page 23
Data Structures and Statistical Method
• Data structure is used to classify statistical methods
– Two or more groups of scores
• Compare two groups such as “Money” and “Paper” (fig 1.6)
• Two groups of individuals
• Compare average from each group of scores
• Several different statistical tests are used such as t-test or ANOVA based on number of groups
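The "compare the average from each group" step can be sketched as follows. The pain ratings are hypothetical values invented for illustration and are not results from the money vs paper study. The sketch stops at the descriptive difference between means. Deciding whether such a difference is larger than sampling error would require an inferential test such as the t-test mentioned above.

```python
from statistics import mean

# Hypothetical pain ratings for two groups (illustrative values only).
money_group = [5, 6, 4, 5]
paper_group = [7, 8, 6, 7]

mean_money = mean(money_group)
mean_paper = mean(paper_group)

# Descriptive comparison: difference between the two group averages.
mean_difference = mean_paper - mean_money

print("money mean:", mean_money)
print("paper mean:", mean_paper)
print("difference (paper - money):", mean_difference)
```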
Constructs and Operational Definitions
To form a hypothesis from a research question the researcher needs to define the variables
– What are the effects of drug “Bulk-O” on weight gain?
• Independent variable is drug or no drug
• Dependent variable is weight gain which is “concrete”
– What are the effects of drug PQX1450 on Anxiety?
• Independent variable is drug or no drug
• Dependent variable is “anxiety” which is a construct
• so we need to define the construct of anxiety
• Need an Operational Definition for anxiety
– How intelligent are students taking Methods course?
Variables And Measurement
Discrete Variable
– Discrete categories such as students, cars, houses
– Usually a count of the number of individuals or things
• number of students in class
• number of cars in the parking lot
• number of houses along the street
– Also called “Categorical variables”
– Sometimes referred to as “Qualitative variables” which is confusing because qualitative is just description not counting
Continuous Variable
– Variable can be divided into an infinite number of values
– height, weight, time
Use of Real Limits with Continuous Variables
When working with a continuous variable
– Can adjust precision by changing units
• Hours vs Minutes vs Seconds up to the limit of accuracy for the measuring device such as a wall clock or a stop watch
Because a variable such as weight is infinitely divisible the researcher needs to set boundaries or limits
– use real limits which are boundaries located exactly halfway between adjacent categories.
– Researcher decides where to set limits as a practical matter such as record weight to the nearest pound
• So if someone has a weight of 149.6, they are recorded as 150
• Each value “150” is an interval with upper and lower limits
• Values that fall on the boundary “150.5” can be rounded up or down; just be consistent with the rounding rule
Figure 1.8 (p. 21): When measuring weight to the nearest whole pound, 149.6 and 150.3 are assigned the value of 150 (top). Any value in the interval between 149.5 and 150.5 is given the value of 150.
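The real-limits idea can be expressed as a short Python sketch. The function names `real_limits` and `record` are hypothetical, and the choice to round boundary values up is only one possible rule. The slides require only that the rule be applied consistently.

```python
import math

def real_limits(value, unit=1.0):
    """Return the (lower, upper) real limits of a recorded value.

    The limits sit exactly halfway between adjacent categories.
    """
    half = unit / 2
    return value - half, value + half

def record(measurement, unit=1.0):
    """Record a measurement to the nearest unit.

    Assumption: boundary values (e.g. 150.5) are always rounded up.
    """
    return math.floor(measurement / unit + 0.5) * unit

lower, upper = real_limits(150)
print(lower, upper)    # 149.5 150.5
print(record(149.6))   # 150.0
print(record(150.3))   # 150.0
print(record(150.5))   # 151.0 under the round-up rule
```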
Measuring Variables
To establish relationships between variables, researchers must observe the variables and record their observations. This requires that the variables be measured.
The process of measuring a variable requires a set of categories called a scale of measurement and a process that classifies each individual into one category.
Four Types of Measurement Scales:
Nominal (by name / category)
Ordinal (by order / rank)
Interval (meaningful, equal interval scaling)
Ratio (interval with a “real” zero point, e.g. degrees Kelvin)
Scales of Measurement
Nominal Scale
– “Names”
– Classifying subjects into categories
– No category is “more” or “less,” just different
– Categories can be labeled by
• words (e.g., Male, Female) or
• numbers (e.g., 0, 1) which can be confusing
– Nominal scale always yields discrete variable
Scales of Measurement
Ordinal Scale
– “ordered”
– Categories are in ordered sequence, ranked
– Examples:
• Gold, silver, bronze medals
• Don’t know how far gold was from silver, or silver from bronze
• Class standing (33rd out of 108)
– Ordinal scale technically yields discrete variables (can not be ranked 33rd and a half)
– Different statistical procedures are required.
Scales of Measurement
Interval Scales
– Distance between two values is the same at any point on the scale
• The difference between scores of 6 and 10 is 4 units
• The difference between scores of 26 and 30 is 4 units
– Interval scale does not have absolute zero
• Attitudinal scales: on a scale from 1 (not at all) to 10 (a great deal), how much do you like anchovy pizza?
• Example 1.2 page 25: convert ratio scale to interval scale
• Height measurements (ratio scale) can be converted to difference scores, i.e. difference from the average score
• Average height of 50 inches, so a height of 52 becomes a difference score of +2, which is an interval scale measurement
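The conversion from heights to difference scores can be sketched as below. The five heights are hypothetical values chosen so that their average is 50 inches, matching the example. Only the height of 52 and the difference score of +2 come from the slide.

```python
# Hypothetical heights in inches (ratio scale); chosen so the average is 50.
heights = [48, 52, 50, 47, 53]

average_height = sum(heights) / len(heights)

# Difference scores: each height minus the average (interval scale).
# Zero now means "average height", not "no height", so ratios of
# difference scores are not meaningful.
difference_scores = [h - average_height for h in heights]

print("average:", average_height)
print("difference scores:", difference_scores)
```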
Scales of Measurement
Ratio Scales
– In addition to having even intervals we can calculate ratios
– Ratio scale has meaningful, absolute zero
• Distance: zero distance
• Weight: zero weight
• Temperature: absolute zero but not zero degrees Fahrenheit
• Time: zero time ??
But what about
– IQ score? Interval
– Score on test of neuroticism? Interval
Scales of Measurement
In practice, many psychological variables give more than ordinal-level information, but it is not possible to clearly establish that they are interval-level. They are generally treated as interval data.
Many statistical procedures assume at least interval level data, but function reasonably well with ordinal-level data
Statistical Notation
N refers to number of subjects; N=6.
X_i refers to the ith person’s score on variable X.
For this data set, X_4 = 1, Y_2 = 10.
Subject#   Gender (X)   Age (Y)
1          1            8
2          1            10
3          0            7
4          1            6
5          0            10
6          0            12
–X is a discrete variable, where 0=men and 1=women.
–Y is a continuous variable, representing years of age.
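The subscript notation maps directly onto list indexing, with one caveat: the notation counts subjects from 1, while Python lists count from 0. The sketch below uses the data set from the table. The helper name `score` is hypothetical.

```python
# Data from the table: one entry per subject, in subject order.
gender = [1, 1, 0, 1, 0, 0]   # X
age = [8, 10, 7, 6, 10, 12]   # Y

N = len(age)  # number of subjects

def score(values, i):
    """Return the ith subject's score, where i counts from 1 as in X_i."""
    return values[i - 1]

x4 = score(gender, 4)
y2 = score(age, 2)

print("N =", N)      # 6
print("X_4 =", x4)   # 1
print("Y_2 =", y2)   # 10
```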
Statistical Notation
Greek letter Sigma symbolizes summation
Subject#   Gender (X)   Age (Y)
1          1 (Girl)     8
2          1 (Girl)     10
3          0 (Boy)      7
4          1 (Girl)     6
5          0 (Boy)      10
6          0 (Boy)      12
ΣY = ? 53
ΣX = ? Illegal operation (Gender (X) is a nominal variable)
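A short sketch of the two cases on this slide: summing the ages is a meaningful operation, while the gender codes are only labels. For a nominal variable, counting how many individuals fall in each category is the operation that makes sense. The use of `Counter` here is an illustrative choice, not something the slides prescribe.

```python
from collections import Counter

age = [8, 10, 7, 6, 10, 12]                            # Y
gender = ["Girl", "Girl", "Boy", "Girl", "Boy", "Boy"]  # X, as labels

# Summation is meaningful for age.
sum_y = sum(age)
print("ΣY =", sum_y)  # 53

# For a nominal variable, count category membership instead of summing.
gender_counts = Counter(gender)
print(dict(gender_counts))  # {'Girl': 3, 'Boy': 3}
```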
Statistical Notation
– Examples 1.3, 1.4 & 1.5 p 28
X    X²    (X-1)   (X-1)²
3    9     2       4
1    1     0       0
7    49    6       36
4    16    3       9
– ΣX = 3+1+7+4 = 15
– ΣX² = 9+1+49+16 = 75
– Σ(X-1) = 2+0+6+3 = 11
– Σ(X-1)² = 4+0+36+9 = 49
– However, ΣX-1 = 15 - 1 = 14
– Because the order of operations is:
1. parentheses
2. exponents
3. multiply / divide
4. summation (Σ)
5. addition / subtraction
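The same five quantities can be checked in Python, where the position of the parentheses plays the role of the order of operations. This is a sketch using the X values from the table. The variable names are illustrative.

```python
X = [3, 1, 7, 4]

sum_x = sum(X)                                    # ΣX
sum_x_squared = sum(x ** 2 for x in X)            # ΣX²: square first, then sum
sum_x_minus_1 = sum(x - 1 for x in X)             # Σ(X-1): subtract inside, then sum
sum_x_minus_1_sq = sum((x - 1) ** 2 for x in X)   # Σ(X-1)²
sum_x_then_minus_1 = sum(X) - 1                   # ΣX-1: sum first, then subtract once

print(sum_x)               # 15
print(sum_x_squared)       # 75
print(sum_x_minus_1)       # 11
print(sum_x_minus_1_sq)    # 49
print(sum_x_then_minus_1)  # 14
```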
Statistical Notation
Example 1.6 p 29
ΣX = 3+1+7+4 = 15
ΣY = 5+3+4+2 = 14
ΣXY = 15+3+28+8 = 54
Σ(X + Y) = ΣX + ΣY
Σ(X + Y) = 8+4+11+6 = 29
ΣX + ΣY = 15+14 = 29
Person X Y XY
A 3 5 15
B 1 3 3
C 7 4 28
D 4 2 8
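The calculations in Example 1.6 can be reproduced with the table's values. The last line is an added contrast that is not on the slide: multiplying the two sums gives a different number from ΣXY, because in ΣXY the products are formed person by person before summing.

```python
X = [3, 1, 7, 4]
Y = [5, 3, 4, 2]

sum_xy = sum(x * y for x, y in zip(X, Y))        # ΣXY: multiply each pair, then sum
sum_x_plus_y = sum(x + y for x, y in zip(X, Y))  # Σ(X + Y)
sum_x_plus_sum_y = sum(X) + sum(Y)               # ΣX + ΣY

print(sum_xy)            # 54
print(sum_x_plus_y)      # 29
print(sum_x_plus_sum_y)  # 29, equal to Σ(X + Y)
print(sum(X) * sum(Y))   # 210, not equal to ΣXY
```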