Research Methodology: Tools
Applied Data Analysis (with SPSS)
Lecture 01: Introduction to SPSS
February 2014
Prof. Dr. Jürg Schwarz, Lic. phil. Heidi Bruderer Enzler
MSc Business Administration
Slide 2
Contents
Resources I (slide 3)
First Steps (slide 4)
Design of the Data Editor and Datasets (slide 7)
Running Analyses with SPSS (slide 9)
Special Issue: Using the Syntax Editor (slide 14)
Modifying Data Values (slide 21)
Select Cases & Split File (slide 25)
Data Entry (slide 29)
Resources II (slide 40)
-
Slide 3
Resources I
Manuals
This introduction is based on the manual IBM SPSS Statistics 21 Brief Guide. You can find this manual and the IBM SPSS Statistics 21 Core System User's Guide on Ilias or at http://www-01.ibm.com/support/docview.wss?uid=swg27024972
Slide 4
First Steps
Change the Application Language
The language can be selected through the "General" tab under Edit > Options:
-
Slide 5
Sample Dataset "demo.sav"
This introduction is based on the demo.sav dataset. It is a fictional survey that includes basic demographic data and consumer information of several thousand persons (n = 6400).
Name       Label
age        Age in years
marital    Marital status
address    Years at current address
income     Household income in thousands
inccat     Income category in thousands
car        Price of primary vehicle
carcat     Primary vehicle price category
ed         Level of education
employ     Years with current employer
retire     Retired
empcat     Years with current employer
jobsat     Job satisfaction
gender     Gender
reside     Number of people in household
wireless   Wireless service
multline   Multiple lines
voice      Voice mail
pager      Paging service
internet   Internet
callid     Caller ID
callwait   Call waiting
owntv      Owns TV
ownvcr     Owns VCR
owncd      Owns stereo/CD player
ownpda     Owns PDA
ownpc      Owns computer
Slide 6
Opening the Data File "demo.sav"
In preparation for today, you have saved the dataset "demo.sav" locally.
You can find it in one of the following locations:
On your laptop: C:\Programs\IBM\SPSS\Statistics\21\Samples\
On Ilias: Lectures > Lecture 01 Introduction > Data Resources
To open this dataset you can do either of the following:
Double-click on the SPSS data file (the dataset "demo.sav")
Use the menu File > Open > Data
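The dataset can also be opened with command syntax (the Syntax Editor is introduced later in this lecture). A minimal sketch, assuming demo.sav lies in the Samples folder named above; the path is an assumption and needs to be adjusted to your own installation:

```spss
* Open the sample dataset (path is an assumption, adjust as needed).
GET FILE='C:\Programs\IBM\SPSS\Statistics\21\Samples\demo.sav'.
```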
-
Slide 7
Design of the Data Editor and Datasets
The Two Parts of the Data Editor
The Data Editor shows the contents of the current data file.
Variable view
Each row is a variable, each column is an attribute of the variable.
Data view
Columns represent variables, rows represent cases.
Slide 8
Structure of an SPSS Dataset
SPSS data are organized according to cases (rows) and variables (columns).
Cases (Rows) In a survey of individuals, each row represents a respondent. In a scientific experiment, each row usually corresponds to a measurement.
Variables (Columns) Each column of the data editor corresponds to a particular attribute. In many areas of research, these measurements are called variables.
Data View
-
Slide 9
Running Analyses with SPSS
Running an Analysis
The "Analyze" menu contains different methods of analysis. For example, a simple frequency table with histogram:
Analyze > Descriptive Statistics > Frequencies
Slide 10
Intermezzo: Alphabetical View of the Variables in the Dialog Boxes
By default, SPSS dialog boxes display variables with their labels:
This could make the search for particular variables difficult.
Variables are displayed with their labels.
-
Slide 11
SPSS can be adjusted so that variables are displayed with their names and in alphabetical order.
To do so, select the following setting under the General tab of Edit > Options:
Slide 12
Place the cursor in the box that contains the variables, and enter a character from the keyboard. The first variable beginning with this character will appear.
This allows you to quickly search through the variable box to find a variable.
Variables are displayed alphabetically by their names.
-
Slide 13
Viewing Results: SPSS Output
The output includes the syntax of the command and its results (frequency table, histogram).
Syntax is the internal "language" of SPSS.
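For the frequency table with histogram requested through the menu, the syntax shown in the output log would look roughly like the following. This is a sketch, assuming the variable age was selected and the histogram was requested in the Charts dialog:

```spss
* Frequency table and histogram for one variable.
FREQUENCIES VARIABLES=age
  /HISTOGRAM
  /ORDER=ANALYSIS.
```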
Slide 14
Special Issue: Using the Syntax Editor
Structure of SPSS
Data Editor: *.sav files
Output: *.spv files
Syntax Editor: *.sps files
-
Slide 15
Working with Syntax
Open a new syntax file through the menu: File > New > Syntax
Slide 16
How do you get the command syntax?
Option I: Perform an analysis through the menu
Example: Analyze > Descriptive Statistics > Frequencies
-
Slide 17
Where is the syntax for this analysis? => The syntax is displayed in the output.
Double-click the syntax part in the log, highlight and copy the syntax.
Paste the syntax into the Syntax Editor.
Slide 18
Option II: Paste the syntax directly from the dialog box ("Paste" button).
Option III: Write the syntax yourself.
Executing the Syntax
Place the cursor inside the command in the Syntax Editor (or highlight several commands) and run the analysis through the menu Run > Selection.
-
Slide 19
Typical Syntax File
Why should you use syntax?
Rapidly leads to greater efficiency.
Documentation
Reproducing the results
Automatically process many commands
Allows access to all commands
Communication with other persons
Opens the world of macros
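As an illustration of these points, a short syntax file might be laid out as follows. This is only a sketch: the file path and the choice of variables are assumptions, and it is not the file shown in the lecture.

```spss
* Lecture 01 example: comments document what the file does.
* Open the data (path is an assumption).
GET FILE='C:\Programs\IBM\SPSS\Statistics\21\Samples\demo.sav'.

* Frequency tables for two categorical variables.
FREQUENCIES VARIABLES=marital gender
  /ORDER=ANALYSIS.

* Descriptive statistics for two scale variables.
DESCRIPTIVES VARIABLES=age income
  /STATISTICS=MEAN STDDEV MIN MAX.
```

Because every step is written down, the whole file can be rerun later or passed on to another person to reproduce the same results.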
Slide 20
What if the Syntax is not Displayed in the Output?
Through the menu Edit > Options > Viewer, choose "Display commands in the log"
The syntax is now displayed in the output.
-
Slide 21
Modifying Data Values
The data may not always exist in a form that can be used for analysis or reporting. For example, you may want to:
convert a scale variable into a categorical variable.
merge different response categories into a single category.
calculate a new variable from the difference between two existing variables.
Slide 22
Computing a new variable
New variables can be computed based on existing ones, for example by averaging or summing scores. For example, you may want to compute the equivalence income (based on the household income and the number of persons in the household).
Transform > Compute Variable
Syntax
COMPUTE income_equiv = income / SQRT(reside).
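The formula divides the household income by the square root of the household size. For example, a household income of 80 (thousand) shared by 4 persons gives 80 / 2 = 40. A sketch of how the computed variable could be labelled and checked; the label text and the follow-up DESCRIPTIVES call are assumptions, not part of the slide:

```spss
* Equivalence income: household income divided by the square root of household size.
COMPUTE income_equiv = income / SQRT(reside).
VARIABLE LABELS income_equiv 'Equivalence income in thousands'.
EXECUTE.

* Compare the original and the computed variable.
DESCRIPTIVES VARIABLES=income income_equiv
  /STATISTICS=MEAN STDDEV MIN MAX.
```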
-
Slide 23
Recoding a variable
Example: creating a categorical variable from a scale variable. For example, based on age in years we could build age categories.
Menu: Transform > Recode into Different Variables
Slide 24
Syntax
RECODE age (Lowest thru 24=1) (25 thru 44=2) (45 thru 60=3) (61 thru Highest=4) INTO age_r.
FREQUENCIES VARIABLES=age age_r /ORDER ANALYSIS.
Result
Scale values (age) ==> Categorical values (age_r)
Categories
1: up to 24 years
2: 25 - 44 years
3: 45 - 60 years
4: over 60 years
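The new categories can be given value labels so that tables show the category text rather than the bare codes. A sketch using the four categories of age_r; the labelling step itself is an addition and not part of the original slide:

```spss
* Attach the category descriptions to the codes of age_r.
VALUE LABELS age_r
  1 'up to 24 years'
  2 '25 - 44 years'
  3 '45 - 60 years'
  4 'over 60 years'.
```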
-
Slide 25
Select Cases & Split File
Select cases
A particular subset of the data can be analyzed by selecting specific cases. All unselected cases are then either temporarily filtered out or permanently deleted from the dataset.
For example, you may want to analyze only respondents who are older than 45 years.
Menu: Data > Select Cases
Slide 26
Syntax Result
USE ALL.
COMPUTE filter_$=(age > 45).
FILTER BY filter_$.
EXECUTE .
FREQUENCIES VARIABLES=age
/FORMAT=NOTABLE
/HISTOGRAM
/ORDER=ANALYSIS.
FILTER OFF.
USE ALL.
EXECUTE .
These lines remove the "filter" for all analyses to come.
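The filter shown here only hides cases temporarily. As a sketch of two other ways to select cases, TEMPORARY limits a selection to the next procedure, while SELECT IF on its own deletes the unselected cases from the active dataset. Both are standard SPSS commands, but they are not shown in the lecture:

```spss
* Variant A: the selection applies only to the next procedure.
TEMPORARY.
SELECT IF (age > 45).
FREQUENCIES VARIABLES=age
  /FORMAT=NOTABLE
  /HISTOGRAM.

* Variant B: unselected cases are removed from the active dataset.
* The file on disk stays unchanged until the dataset is saved.
SELECT IF (age > 45).
EXECUTE.
```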
-
Slide 27
Split File
Sometimes data in different categories should be analyzed separately. To do this, the data can be split up, and the same analysis can be performed on each of the two or more subsets.
For example, we could split the dataset by means of the variable age_r which means we are conducting separate analyses for each of the age categories.
Menu: Data > Split File
Slide 28
Syntax Result
SORT CASES BY age_r .
SPLIT FILE SEPARATE BY age_r .
FREQUENCIES VARIABLES=income
/FORMAT=NOTABLE
/HISTOGRAM
/ORDER=ANALYSIS.
SPLIT FILE OFF.
This line removes the split for all analyses to come.
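SEPARATE produces one block of output per group. As a sketch of the alternative, LAYERED BY (the "Compare groups" option in the dialog) presents the groups together in one table where the procedure supports it. DESCRIPTIVES is used here only as an illustration and is not part of the slide:

```spss
* Groups are shown side by side in a single table.
SORT CASES BY age_r.
SPLIT FILE LAYERED BY age_r.
DESCRIPTIVES VARIABLES=income
  /STATISTICS=MEAN STDDEV MIN MAX.
SPLIT FILE OFF.
```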
-
Slide 29
Data Entry
Data Entry Options
There are different ways to enter data into SPSS.
Data can be directly entered into SPSS or can be imported from many different sources:
Direct: SPSS Data Editor
From a spreadsheet program (such as Excel)
From a database program (such as Access)
From other applications (such as a text editor)
Scanners may be efficient for entering large amounts of data.
Slide 30
Data Editor: Defining Variables, Entering Data & Missing Values
Entering (new) numerical data
Open a new data file (through the menu File > New > Data)
At the bottom of the Data Editor window, switch to Variable View.
Enter age in the first row of the first column.
Enter marital in the second row.
Enter income in the third row.
New variables are automatically assigned the "Numeric" data type.
-
Slide 31
Switch to the Data View in order to enter values.
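Small amounts of data can also be entered through syntax instead of typing them into the Data View. A sketch with three made-up cases; the values are purely illustrative and do not come from demo.sav:

```spss
* Define three numeric variables and read the values that follow.
DATA LIST FREE / age marital income.
BEGIN DATA
55 1 72
56 0 153
28 1 28
END DATA.
LIST.
```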
To suppress the decimal places for the variables age, marital and income:
At the bottom of the Data Editor window, switch to Variable View.
Select the "Decimals" column and enter a 0 for age.
Select the "Decimals" column and enter a 0 for marital.
Do the same for income.
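The same setting can be applied in one step with the FORMATS command. A minimal sketch, assuming a display width of 8 (the width is an assumption; only the 0 decimals come from the slide):

```spss
* Display the three variables with width 8 and no decimal places.
FORMATS age marital income (F8.0).
```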
Slide 32
Adding variable labels and value labels
Enter "Respondent's age" into the age cell of the "Label" column. Do the same for marital ("Marital Status"), and so on.
Select the Values cell for marital and open the dialog box.
For "Value", enter 1.
For "Label", enter "single".
Click on "Add" so that this designation is registered.
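The labelling steps can be sketched in syntax as follows. The label texts and the code 1 = single are taken from the slide; the code 0 and its label are assumptions added only to make the example complete:

```spss
* Variable labels describe what each variable measures.
VARIABLE LABELS age "Respondent's age"
  /marital 'Marital Status'.

* Code 1 is taken from the slide, code 0 and its label are assumptions.
VALUE LABELS marital
  0 'married'
  1 'single'.
```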
-
Slide 33
Handling missing values
In general, missing or invalid data should not be ignored.
Sometimes survey participants refuse to answer particular questions. They may not know an answer, or may respond in an unexpected way.
If these data are not identified or filtered out, your analysis may not yield correct results.
Empty data cells, or cells that contain invalid input, are converted to missing values, which are displayed as a period.
Slide 34
The reason why data is missing could be important for your analysis.
For example, for a particular question, it could be useful to distinguish between those who refused to answer and those for whom the question was not applicable.
In "Variable View" select the "Missing" cell for income and open the dialog box.
In this dialog box you can specify either up to three discrete missing values, or a range of values plus one optional discrete value.
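In syntax, user-defined missing values are declared with MISSING VALUES. A sketch in which the codes 997 to 999 and their meanings are hypothetical; the lecture does not prescribe particular codes:

```spss
* Codes 997 to 999 are hypothetical missing-value codes for income.
MISSING VALUES income (997, 998, 999).
VALUE LABELS income
  997 'Not applicable'
  998 'Refused to answer'
  999 'Do not know'.

* Alternative: a range plus one single value (replaces the definition above).
MISSING VALUES income (990 THRU 999, -1).
```

Labelling each code keeps the reason for the missing answer visible, which is the distinction the slide asks for.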
-
Slide 35
Importing Data
Data can be imported from different sources.
Reading an SPSS Data File
SPSS data files have the file extension *.sav.
Importing data from a spreadsheet
In addition to entering data into the Data Editor, you can import data from programs such as Microsoft Excel. The column headings serve as variable names.
Importing data from a text file
Text files are common sources of data. Many spreadsheet programs and databases can save their contents in text file format. For example, in CSV files, values are separated by commas; other text files use tabs.
Importing data from a database (not in this course)
Data from a database can be imported with the help of a database wizard.
Slide 36
Importing data from a spreadsheet
Locate the Excel file demo.xls.
On your computer: in the "Samples" subdirectory of the installation directory, C:\Programs\IBM\SPSS\Statistics\21\Samples\
On Ilias: Lectures > Lecture 01 Introduction > Data Resources
Column headings = Variable names
-
Slide 37
Open the Excel file through the SPSS File menu (Excel file must be closed)
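The import can also be written as syntax with GET DATA. A sketch, assuming demo.xls lies in the Samples folder and that the data are on the first worksheet; both the path and the sheet index are assumptions:

```spss
* Read the first worksheet; the first row supplies the variable names.
GET DATA /TYPE=XLS
  /FILE='C:\Programs\IBM\SPSS\Statistics\21\Samples\demo.xls'
  /SHEET=INDEX 1
  /CELLRANGE=FULL
  /READNAMES=ON.
```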
Slide 38
Importing data from a text file
Locate the text file demo.txt.
On your computer: C:\Programs\IBM\SPSS\Statistics\21\Samples\
On Ilias: > > Data Resources
Open the text file through the SPSS File menu (text file must be closed)
-
Slide 39
Slide 40
Resources II
SPSS Help System (Core System User's Guide)
Help Menu (the most important entries)
Topics: This allows you to access the tabs Contents, Index and Search. Use these tabs to search for particular Help topics.
Tutorial: Step-by-step instructions for many basic functions.
Case studies: Practical examples of how to carry out different types of statistical analyses and how to interpret the results.
Statistics Coach: This coach helps you to find the procedure that you would like to use. The Statistics Coach offers access to most of the procedures.
Command Syntax Reference: Detailed information about command syntax is available from two sources: as a component of the Help system, and as a separate PDF document, the Command Syntax Reference manual, which is also available through the Help menu.
Context dependent help
Dialog box help: Most dialog boxes contain a Help button, through which you can call up the corresponding help topics for the dialog box.
Pivot Table Context Help Menu: If you right-click on a term in an activated pivot table in the Viewer, and then select Direct Help from the context menu, you obtain a definition of the term.
Command syntax: Place your cursor inside a block of command syntax in the command syntax window, and press the F1 key on the keyboard.
-
Slide 41
Help Menu => Help dialog box
Slide 42
Tutorials
-
Slide 43
Online Resources
SPSS Solutions for Education
www-01.ibm.com/software/analytics/spss/academic/students/resources.html
User [email protected] Password 7mydevelopper
SPSS Support (primarily for the Knowledge Base)
http://support.spss.com/tech/default.asp
User spssswitzerland Password spssswitzerland
SPSS Support (resources for all levels of users and application developers)
www.spss.com/devcentral
User [email protected] Password 7mydevelopper
Other Resources / Forum / Discussion
www.ats.ucla.edu/stat/spss
http://spssx-discussion.1045642.n5.nabble.com
www.spssusers.co.uk
www.dynelytics.com
Slide 44
Notes: