IOWA STATE UNIVERSITY
Department of Animal Science
Getting Started Using SAS Software
Animal Science 500
Lecture No. 2
SAS History
- Developed in the late 1960s and 1970s at North Carolina State University
  - Original use was for management and analysis of agricultural field experiments
- Headquarters still in Cary, NC
- Claims to be the most widely used statistical software
- SAS used to be an acronym for “Statistical Analysis System”
  - No real meaning today – just “SAS”
SAS Products
- Base SAS – data management and basic procedures
- SAS/STAT – statistical analysis
- SAS/GRAPH – presentation quality graphics
- SAS/OR – operations research
- SAS/ETS – econometrics and time series analysis
- SAS/IML – interactive matrix language
- SAS/AF – applications facility (menus and interfaces)
- SAS/QC – quality control
- SAS/Genetics – used for analyzing genetic marker data
- Other products for spreadsheets, databases, and connectivity between different machines that are running SAS
Resources for SAS
- Numerous books
  - Published by SAS and frequently authored by users themselves
  - Published independent of SAS
  - All can be purchased from a variety of on-line book stores
    - Amazon
    - ABE.com
    - Etc.
- SAS documentation is available on-line
Resources for SAS
- SAS documentation is available on-line
  - Majority of book or “hard” documentation can be found on-line
  - Extensive help section for SAS
- Can e-mail SAS software consultants with technical questions that do not appear to have solutions available.
On-line support
- http://support.sas.com/onlinedoc/913/docMainpage.jsp
Format of SAS program
- SAS is very user friendly
  - No rules about the format of your program; several layout styles are allowed:
    - Statements or programs can be written in UPPERCASE, lowercase, or a COMbinATion
    - Statements can continue across multiple lines
    - Two statements can be on the same line
    - You can start the program or statements in any column
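As a small illustration of the free-format rules above, the sketch below writes the same logic twice: once in a conventional layout and once in mixed case with statements sharing and spanning lines. The data set and variable names (demo1, demo2, x, y) are made up for the example.

```sas
* Conventional layout with one statement per line;
data demo1;
    x = 5;
    y = x * 2;
run;

* Same logic in mixed case, with two statements on one line
  and one statement continued across several lines;
DATA Demo2; X = 5;
        y = x
            * 2;
RUN;

proc print data=demo2;
run;
```

Both DATA steps should produce the same values; only the layout differs.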
Making your SAS program User Friendly
- Keep your program organized
  - Often it is easier to review your program by writing only one statement per line.
  - Start DATA and PROC statements in leading columns
    - Indent other code that follows the DATA or PROC step
  - To keep your code understandable, include comments
    - Future use
    - Someone else has to use or manipulate your data or program
Making your SAS program User Friendly
- To keep your code understandable, include comments
  - Future use
  - Someone else has to use or manipulate your data or program
  - To insert a comment, put * ahead of it and end it with a semicolon; SAS will not read that statement.
    - Often start comment lines with multiple **** so comments are easily identifiable.
  - Can also “comment out” lines of SAS code
    - Surround the SAS code you want to save but not run with /* at the beginning and */ at the end.
Making your SAS program User Friendly
- To keep your code understandable, include comments
  - SAS ignores any text that is marked as a comment
  - Two ways to insert comments
    1. Start the comment with * and end it with a ;
    2. Start with /* and end with */
       - Use care when using this method if using the program with another program or in another ...
- Commonly called commenting or “commenting out” lines of code
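The two comment styles described above might look like the following in practice. This is only a sketch; the data set weights and its values are invented for illustration.

```sas
*** Read a small illustrative data set (values are made up) ***;
data weights;
    input id $ weight;
    /* gain = weight - 50;   kept for later, currently switched off */
    datalines;
A1 101
A2 98
;
run;

/* A whole step can be commented out in the same way:
proc print data=weights;
run;
*/

proc means data=weights;
    var weight;   * summary statistics for weight only;
run;
```

Note that a comment started with * ends at the first semicolon, so the /* ... */ form is the safer choice when the text being commented out contains semicolons of its own.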
Making your SAS program User Friendly
- Particularly important when:
  - Writing materials and methods for a variety of publications
  - In the review process of journal articles
  - Keeping meticulous records on your analyses
Programming Tips
- Programming in SAS is a step by step process
  - Write a portion of the program
  - Begin with writing the program in small steps
    - The infile statement
    - Obtain means
    - Examine the distribution
    - Etc.
  - Sort out any errors, correct them, and move to the next step of the process
    - Just because your program “works” or does not give you an error message does not mean everything is correct
    - Check results or output at this point
  - The step by step process helps to find and correct errors more easily than attempting to write the entire program and begin the analysis at that point.
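The step by step approach above could be sketched as follows, submitting and checking one step at a time. The data set pigs, its variables, and its values are hypothetical; INFILE DATALINES is used here only so the example is self-contained, where a real analysis would usually point INFILE at an external file.

```sas
* Step 1: read the data, then stop and check the log;
data pigs;
    infile datalines;
    input id $ start_wt end_wt;
    datalines;
P1 25.0 110.5
P2 27.5 118.0
P3 24.0 105.5
;
run;

* Step 2: look at the data that was actually read;
proc print data=pigs;
run;

* Step 3: obtain means;
proc means data=pigs;
    var start_wt end_wt;
run;

* Step 4: examine the distribution;
proc univariate data=pigs;
    var end_wt;
run;
```

Running each step separately and reading its log and output makes it easier to tell which step introduced a problem.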
Getting Started with A SAS Analysis
- Your data set must be imported
- Once imported, your data set is a “SAS data set”
- SAS can read almost any type of data
- Once your data is read, SAS keeps track of what is where and what form it is in
- The user only has to provide the name and location of the data set to use
  - SAS can figure out what is in it
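A minimal sketch of "name and location": the example below stores a made-up data set in the temporary WORK library and then asks SAS what it contains. The names work.pigs, id, and weight are assumptions for illustration; a permanent data set would typically use a library assigned with a LIBNAME statement instead of WORK.

```sas
* Create a small SAS data set in the temporary WORK library;
data work.pigs;
    input id $ weight;
    datalines;
P1 110.5
P2 118.0
;
run;

* PROC CONTENTS reports the variables, their types, and their lengths;
proc contents data=work.pigs;
run;
```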
Data
- Data you want analyzed is in a table format
  - Columns represent the variables that were measured
  - Rows represent the observations
- Data types
  - SAS simplifies data into 2 types
    1. Numeric
    2. Character
Data
Data types
Numeric – are numbers
- Can have mathematical operations carried out (added, subtracted, multiplied, divided, etc.)
- Can have any number of decimal places
- Can be positive or negative (the sign + or – can be included in the numeric data column; E can be used for scientific notation)
Data
Data types
Character
- Anything that is not numeric data is character
- Data can contain numbers, letters, and special characters ($, #, !, etc.)
- If data has letters or special characters, it must be character
- If data contains only numbers, it can be either numeric or character
- Sometimes data that is only numbers may still be better as a character variable
  - Example – zip codes
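To make the two data types concrete, here is a small sketch that reads character and numeric columns, including a zip code kept as character so its leading zero survives, a signed value, and a value in E notation. The data set farms and all of its variables and values are invented for the example.

```sas
data farms;
    /* The $ after a name marks that variable as character;
       names without a $ are read as numeric */
    input farm_id $ zip $ herd_size net_change;
    datalines;
F01 50011 1200 -35
F02 01002 2.5E3 +40
;
run;

proc print data=farms;
run;
```

If zip were read as numeric, 01002 would be stored as the number 1002, which is one reason a character variable can be the better choice.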
Missing data
- Missing data
  - Character missing data is depicted in a data set as a blank
  - Numeric missing data is noted in a data set by a period (.)
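The sketch below shows both kinds of missing value in one made-up data set. Column input is assumed here so that the blank in the sex column is read as a missing character value; the names and values are hypothetical.

```sas
data pigs;
    /* Column input: id in columns 1-4, sex in column 6, weight in columns 8-12 */
    input id $ 1-4 sex $ 6 weight 8-12;
    datalines;
P001 M 112.5
P002      .
P003 F 98.0
;
run;

* List only the rows where something is missing;
proc print data=pigs;
    where missing(sex) or missing(weight);
run;
```

The MISSING function works for both character and numeric variables, so the same test covers the blank and the period.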
Data characteristics
- Data set size
  - Prior to SAS 9.1, data sets could contain at most 32,767 variables (columns)
  - In SAS 9.1 and later, the number of variables is not limited by software
    - Hardware may limit the size of data set capable of being evaluated
  - The number of observations (rows) is not limited in any version of SAS
    - Limited only by computer hardware
    - The spreadsheet used to record data observations may be a limiting factor
SAS Naming rules
- Variable names are limited to 32 characters or fewer
  - Limited to letters, digits, and the underscore (_); the first character must be a letter or an underscore; names can be in either upper or lower case
  - SAS remembers the case of the first occurrence of each variable name and uses that case when printing results.
  - Before SAS version 7 the limit was 8
  - Advice: use as few characters as possible, so long as you can clearly identify the variable at some later point
    - Long names have to be typed in any procedure (PROC) where the variable must be identified or used in some modeling process
SAS Naming Rules
- Variable names may not contain embedded blanks.
  - V1 and V_1 are acceptable; V 1 is not.
  - Graduation_Date is acceptable; Graduation Date is not.
- Certain names are reserved for use by SAS
  - _N_
  - _TYPE_
  - _NAME_
- Logical operators such as ge, lt, and, and eq should not be used as variable names.
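A short sketch of the naming rules above, using the acceptable names from the slide. The values assigned are arbitrary, and the invalid forms are kept inside a comment so the step still runs.

```sas
data names_demo;
    V1 = 1;
    V_1 = 2;
    Graduation_Date = '15MAY2010'd;   /* arbitrary date, for illustration only */
    format Graduation_Date date9.;
    /* These would fail because of the embedded blank:
       V 1 = 3;
       Graduation Date = 0;
    */
run;

* Printed column headings use the case of the first occurrence of each name;
proc print data=names_demo;
run;
```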
SAS Program Structure
- Program might also be referred to as code by some
- Two components to SAS programs
  1. Data step
  2. Procedure step (often called the Proc step)
- Every SAS statement MUST end with a semicolon ( ; ).
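Put together, the two components might look like this minimal sketch: one DATA step that reads and modifies data, followed by one PROC step that analyzes it. The data set gain and its variables and values are made up.

```sas
* DATA step: read the data and compute a new variable;
data gain;
    input id $ start_wt end_wt;
    total_gain = end_wt - start_wt;
    datalines;
P1 25.0 110.5
P2 27.5 118.0
;
run;

* PROC step: analyze the data set created above;
proc means data=gain;
    var total_gain;
run;
```

Every statement ends with a semicolon, including the lone semicolon that closes the DATALINES block.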
SAS Program Structure
- The Data step
  - Reads data from an external source – data can be read into your program in a variety of ways
  - Manipulates your data – often making new calculations based on the original data
  - Combines data with other data – combining data sets to do even more calculations or manipulations for a desired outcome
  - Prints reports based on the data that is input – can be used for a variety of purposes
  - These tools can be used to prepare the data for use by one of the procedures available in SAS
SAS Program Structure
- The Procedure step
  - Performs analyses on your data
    - Proc Sort
    - Proc Means
    - Proc Anova
    - Proc Mixed
    - Note: merging is done with the MERGE statement in a DATA step; there is no Proc Merge
  - Can produce volumes of output
  - Often the most effective way to learn SAS (using both the data step and the procedure step) is by doing
SAS Program Structure
- Statements used are exclusive to either the DATA or the PROC steps
- Remember
  - Data steps read and modify the data
  - Proc steps analyze, perform a utility, or print the data
- Data steps must begin with the word DATA
  - This step can also include:
    - DO loops
    - IF – THEN / ELSE logic
    - SELECT – WHEN / OTHERWISE
    - A large assortment of other numeric and/or character functions
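The DATA step tools listed above can be combined as in the sketch below, which uses a DO loop, IF-THEN/ELSE logic, a SELECT group, and one numeric function. The growth rate, cut-off weights, and variable names are invented purely for illustration.

```sas
data phases;
    length phase $ 8;
    do day = 0 to 100 by 25;                 /* DO loop */
        weight = round(25 + 0.85 * day, 0.1); /* numeric function; made-up growth rate */
        if weight < 50 then phase = 'nursery';      /* IF-THEN/ELSE logic */
        else if weight < 90 then phase = 'grower';
        else phase = 'finisher';
        select (phase);                      /* SELECT-WHEN with OTHERWISE */
            when ('nursery') diet = 1;
            when ('grower')  diet = 2;
            otherwise        diet = 3;
        end;
        output;                              /* write one row per loop pass */
    end;
run;

proc print data=phases;
run;
```

The LENGTH statement is there so that phase is long enough for the longest value; without it, the length would be set by the first value assigned.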
SAS Program Structure
- Data steps must begin with the word DATA
  - Can combine data in a variety of ways
    - Match and merge – for example, merging by id
    - Concatenate – stacking two data sets end to end
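A hedged sketch of both ways of combining data: a match-merge by id, and a concatenation with SET. All data set names, variables, and values here are hypothetical.

```sas
data weights;
    input id $ weight;
    datalines;
P1 110.5
P2 118.0
;
run;

data backfat;
    input id $ bf;
    datalines;
P1 0.75
P2 0.82
;
run;

* Match-merge needs both data sets sorted by the BY variable;
proc sort data=weights; by id; run;
proc sort data=backfat; by id; run;

data combined;
    merge weights backfat;   /* side by side, matched on id */
    by id;
run;

data more_weights;
    input id $ weight;
    datalines;
P3 105.5
;
run;

data all_weights;
    set weights more_weights;   /* stacked end to end */
run;
```

The merge adds columns for matching ids, while the concatenation adds rows.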
SAS Program Structure
- Procedure statements must begin with PROC followed by the name of a procedure
  - Print
  - Sort
  - Means
- A step ends when SAS finds:
  1. A new step (the next DATA or PROC statement)
  2. A run statement
  3. The end of the program (if running in batch mode)
SAS Program Structure
- The run statement tells SAS to run all of the preceding lines of a step
  - Can also indicate to run the lines that are highlighted
- There are also the rare “global” statements that are not part of a DATA or PROC step
- I typically place a “quit” statement (quit;) after each run statement
  - Prevents your program from remaining in a do loop, or from continuing to run (potentially indefinitely) when the program has an error.
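The sketch below illustrates step boundaries, a global statement, and the run/quit habit described above. PROC DATASETS is chosen here as an assumption for the QUIT example because it is an interactive procedure in Base SAS that stays active after RUN until QUIT is submitted; the data set one is made up.

```sas
title 'Step boundary illustration';   /* TITLE is a global statement */

data one;
    x = 1;
run;                                  /* RUN ends the DATA step */

proc print data=one;                  /* no RUN here: the next PROC ends this step */

proc means data=one;
    var x;
run;

proc datasets library=work nolist;
    contents data=one;
run;                                  /* executes the CONTENTS request */
quit;                                 /* ends the interactive procedure */

title;                                /* clears the title */
```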
Arithmetic Operators
- Arithmetic operators indicate that an arithmetic calculation is performed, as shown in the following table:
Symbol   Operation        Example                                    Result
+        addition         Num + Num, e.g. 5 + 3                      add two numbers together
-        subtraction      Num - Num, e.g. 5 - 3, or two variables:   subtract 3 from 5
                          ending wt. - beginning wt.
*        multiplication   2*y                                        multiply 2 by the value of Y
/        division         var/5, or two variables:                   divide the value of VAR by 5
                          weight gain / days on test
**       exponentiation   a**2                                       raise A to the second power
Note: multiplication always needs the *; 2(y) or 2y cannot be used.
Note: in the DATA step, ^ is one of the symbols for NOT, so write a**2 rather than a^2.
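The operators in the table can be tried out in a single DATA step, as in this sketch. The variable names and starting values (y, a, var1) are assumptions chosen to mirror the table's examples.

```sas
data arithmetic;
    y = 4;
    a = 3;
    var1 = 20;
    sum_ex  = 5 + 3;       /* addition:       8 */
    diff_ex = 5 - 3;       /* subtraction:    2 */
    prod_ex = 2 * y;       /* multiplication: 8 */
    quot_ex = var1 / 5;    /* division:       4 */
    pow_ex  = a ** 2;      /* exponentiation: 9 */
run;

proc print data=arithmetic;
run;
```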