IOWA STATE UNIVERSITY Department of Animal Science Getting Started Using SAS Software Animal Science 500 Lecture No. 2

Upload: lucy-williams

Post on 17-Dec-2015

SAS History
- Developed in the late 1960s and 1970s at North Carolina State University
  - Original use was for management and analysis of agricultural field experiments
- Headquarters still in Cary, NC
- Claims to be the most widely used statistical software
- SAS used to be an acronym for "Statistical Analysis System"
  - No real meaning today; it is just "SAS"

SAS Products
- Base SAS: data management and basic procedures
- SAS/STAT: statistical analysis
- SAS/GRAPH: presentation quality graphics
- SAS/OR: operations research
- SAS/ETS: econometrics and time series analysis
- SAS/IML: interactive matrix language
- SAS/AF: applications facility (menus and interfaces)
- SAS/QC: quality control
- SAS/Genetics: used for analyzing genetic marker data
- Other products for spreadsheets, databases, and connectivity between different machine interfaces that are running SAS

Resources for SAS
- Numerous books
  - Published by SAS and frequently authored by users themselves
  - Published independently of SAS
  - All can be purchased from a variety of on-line bookstores
    - Amazon
    - ABE.com
    - Etc.
- SAS documentation is available on-line

Resources for SAS
- SAS documentation is available on-line
  - The majority of the book or "hard" documentation can be found on-line
  - Extensive help section for SAS
- You can e-mail SAS software consultants with technical questions that do not appear to have solutions available.

On-line support

u http://support.sas.com/onlinedoc/913/docMainpage.jsp

Format of a SAS program
- SAS is very user friendly
  - No strict rules about the layout of your program; several styles are allowed:
    - Statements can be written in UPPERCASE, lowercase, or a COMbinATion
    - Statements can continue across multiple lines
    - Two statements can be on the same line
    - You can start a statement in any column
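The layout freedoms listed above can be sketched in one short program. This is only an illustration: the data set name, the variable names, and the in-line data (read with a DATALINES statement, which this lecture has not introduced yet) are assumptions made for the example.

```sas
* Mixed case, a statement split across lines, and two statements on one line;
data Demo;
   INPUT id
         weight;                      /* one statement continued on a second line */
   gain = weight - 50; label gain = 'Gain over 50';   /* two statements, one line */
   datalines;
1 62
2 71
;
run;

         PROC print DATA=demo; RUN;   /* a step may start in any column */
```

SAS treats Demo and demo as the same data set name, which is why the PROC PRINT step finds the data.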

Making your SAS program User Friendly
- Keep your program organized
  - It is often easier to review your program if you write only one statement per line.
  - Start DATA and PROC statements in the leading columns
    - Indent the other code that follows in the DATA or PROC step
  - To keep your code understandable, include comments
    - For future use
    - For when someone else has to use or manipulate your data or program

Making your SAS program User Friendly
- To keep your code understandable, include comments
  - For future use
  - For when someone else has to use or manipulate your data or program
  - To insert a comment, put * ahead of it and end it with a semicolon; SAS will not execute that statement.
    - Comment lines often start with multiple asterisks (****) so comments are easily identifiable.
  - You can also "comment out" lines of SAS code
    - Surround the SAS code you want to save but not run with /* at the beginning and */ at the end.

Making your SAS program User Friendly
- To keep your code understandable, include comments
  - SAS ignores text that is marked as a comment
  - Two ways to insert comments
    1. Start the comment with * and end it with a ;
    2. Start with /* and end with */
       - Use care when using this method if using program with another program or in another
- Commonly called commenting or "commenting out" lines of code
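A minimal sketch of the two comment styles described on these slides. The data set name, variable names, and values are hypothetical and serve only to give the comments something to sit next to.

```sas
**** Statement-style comment: starts with an asterisk and ends at the semicolon;

/* Block-style comment: everything between the delimiters is ignored,
   even text that contains semicolons; */

data weights;                         /* hypothetical data set name */
   input id start_wt end_wt;
   gain = end_wt - start_wt;
   /* gain_kg = gain * 0.4536;     commented out, kept for later */
   datalines;
1 50 112
2 48 105
;
run;
```

A statement-style comment stops at the first semicolon it meets, so the block style is the safer choice when the text being hidden contains whole statements.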

Making your SAS program User Friendly
- Particularly important when:
  - Writing materials and methods for a variety of publications
  - In the review process of journal articles
  - Keep meticulous records on your analyses

Programming Tips
- Programming in SAS is a step-by-step process
  - Write the program in small steps, one portion at a time
    - The infile statement
    - Obtain means
    - Examine the distribution
    - Etc.
  - Sort out and correct any errors, then move to the next step of the process
    - Just because your program "works" or does not give you an error message does not mean everything is correct
    - Check the results or output at this point
  - The step-by-step process makes it easier to find and correct errors than writing the entire program and only then beginning the analysis.
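The step-by-step approach above might look like the sketch below, where each step is submitted and checked before the next one is written. The data set, variable names, and values are hypothetical, and INFILE DATALINES stands in for the external file that a real INFILE statement would name.

```sas
* Step 1: read the data, then check the log before going further;
data pigs;                            /* hypothetical data set and variables */
   infile datalines;                  /* stands in for an external file */
   input id start_wt end_wt;
   datalines;
1 50 112
2 48 105
3 52 .
;
run;

* Step 2: obtain means and confirm that N, minimum, and maximum look plausible;
proc means data=pigs n mean min max;
   var start_wt end_wt;
run;

* Step 3: examine the distribution;
proc univariate data=pigs;
   var end_wt;
run;
```

The missing end weight for pig 3 produces no error message, which is the kind of thing that only shows up when the output of step 2 is inspected (N is 2 for end_wt, not 3).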

Getting Started with a SAS Analysis
- Your data set must be imported
- Once imported, your data set is a "SAS data set"
- SAS can read almost any type of data
- Once your data are read, SAS keeps track of what is where and what form it is in
- You only have to provide the name and location of the data set you want to use
  - SAS can figure out what is in it
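As a small sketch of "name and location": in the two-level name below, sashelp is the location (a library) and class is the data set name. This assumes the SASHELP sample library that normally ships with SAS is available on your installation. PROC CONTENTS is used here only to show what SAS already knows about the data set.

```sas
/* List the variables, their types, and their lengths for an existing data set */
proc contents data=sashelp.class;
run;
```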

Data
- The data you want analyzed are in a table format
  - Columns represent the variables (what was measured)
  - Rows represent the observations
- Data types
  - SAS simplifies data into 2 types
    1. Numeric
    2. Character

Data
- Data types
  - Numeric: the values are numbers
    - Can have mathematical operations carried out (added, subtracted, multiplied, divided, etc.)
    - Can have any number of decimal places
    - Can be positive or negative (the sign + or - can be included in the numeric data column, and E can be used for scientific notation)

Data
- Data types
  - Character: anything that is not numeric data is character
    - Data can contain numbers, letters, and special characters ($, #, !, etc.)
    - If the data have letters or special characters, the variable must be character
    - If the data contain only numbers, the variable can be either numeric or character
    - Sometimes data that are only numbers may still be better as a character variable
      - Example: zip codes
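A brief sketch of the two data types, using made-up values. The data set and variable names are hypothetical. The $ after zip tells SAS to read it as character, which keeps the leading zero in the second zip code; change is numeric and shows a signed value and a value in E notation.

```sas
data farms;                           /* hypothetical example data */
   input farm_id zip $ herd_size change;
   datalines;
1 50011 120 -4.5
2 02134 85 1.2E1
;
run;

proc print data=farms;                /* zip prints as 02134; change prints as 12 */
run;
```

Had zip been read as numeric, the second value would be stored as the number 2134 and the leading zero would be lost.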

Missing data
- Character missing data are depicted in a data set as a blank
- Numeric missing data are noted in a data set by a period (.)
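The sketch below illustrates both kinds of missing value with hypothetical names and values. One detail that is an addition to the slide: with this simple style of input, a lone period in the raw data marks a missing value for either type, and SAS then stores a missing character value as a blank.

```sas
data scores;                          /* hypothetical example data */
   input id name $ weight;
   datalines;
1 Ann 62
2 .   71
3 Cal .
;
run;

data flagged;
   set scores;
   no_name   = missing(name);         /* 1 when the character value is blank */
   no_weight = (weight = .);          /* 1 when the numeric value is missing */
run;

proc print data=flagged;
run;
```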

Data characteristics
- Data set size
  - Prior to SAS 9.1, data sets could contain up to 32,767 variables (columns)
  - In SAS 9.1 and later, the number of variables is not limited by the software
    - Hardware may limit the size of the data set that can be evaluated
  - The number of observations (rows) is not limited in any version of SAS
    - Limited only by computer hardware
    - The spreadsheet used to record data observations may be a limiting factor

SAS Naming rules
- Variable names are limited to 32 characters or fewer
  - Limited to letters, digits, and the underscore (_), in either upper or lower case
  - SAS remembers the case of the first occurrence of each variable name and uses that case when printing results.
  - Before SAS version 7 the limit was 8 characters
  - Advice: use as few characters as possible, so long as you can clearly identify the variable at some later point
    - Long names have to be typed out in any procedure (PROC) where the variable must be identified or used in some modeling process

SAS Naming Rules
- Variable names may not contain embedded blanks.
  - V1 and V_1 are acceptable; V 1 is not.
  - Graduation_Date is acceptable; Graduation Date is not.
- Certain names are reserved for use by SAS:
  - _N_
  - _TYPE_
  - _NAME_
- Logical operators such as ge, lt, and, and eq should not be used as variable names.
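A short sketch of names that follow these rules. The data set name, the values, and the date are made up for illustration. The automatic variable _N_ is only read here, which is allowed; the point of the rule is not to create your own variable with that name.

```sas
data names_demo;                      /* hypothetical data set name */
   V1  = 1;                           /* letters and digits */
   V_1 = 2;                           /* underscore instead of a blank */
   Graduation_Date = '15MAY2015'd;    /* underscore joins the two words */
   format Graduation_Date date9.;
   obs_number = _N_;                  /* reads the automatic variable _N_ */
run;

proc print data=names_demo;
run;
```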

SAS Program Structure
- A program might also be referred to as code
- Two components to SAS programs
  1. Data step
  2. Procedure step (often called the Proc step)
- Every SAS statement MUST end with a semicolon ( ; ).
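A minimal sketch of a complete program with one step of each kind. The data set name, variable names, and values are hypothetical; note that every statement ends with a semicolon.

```sas
data calves;                          /* DATA step: creates the data set */
   input id birth_wt;
   datalines;
1 38
2 41
;
run;

proc print data=calves;               /* PROC step: uses the data set */
run;
```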

SAS Program Structure
- The Data step
  - Reads data from an external source; data can be read into your program in a variety of ways
  - Manipulates your data, often making new calculations based on the original data
  - Combines data with other data, combining data sets to do even more calculations or manipulations for a desired outcome
  - Prints reports based on the data that are input; can be used for a variety of purposes
  - These tools can be used to prepare the data for use by one of the procedures available in SAS

SAS Program Structure
- The Procedure step
  - Performs analyses on your data
    - Proc Sort
    - Proc Means
    - Proc Anova
    - Proc Mixed
    - (Merging is done with the MERGE statement in a Data step; there is no Proc Merge)
  - Can produce volumes of output
  - Often the most effective way to learn SAS (using both the data step and the procedure step) is by doing

SAS Program Structure
- Most statements are exclusive to either the DATA or the PROC step
- Remember
  - Data steps read and modify the data
  - Proc steps analyze the data, perform a utility, or print the data
- Data steps must begin with the word DATA
  - This step can also include:
    - DO loops
    - IF - THEN / ELSE logic
    - SELECT - WHEN / OTHERWISE
    - A large assortment of other numeric and/or character functions
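The three control structures listed above can be sketched in a single DATA step. All names and values are hypothetical. The catch-all branch of a SELECT group is written OTHERWISE.

```sas
data logic_demo;                      /* hypothetical data set name */
   length status $4 phase $5;         /* reserve room for the longest values */
   do day = 1 to 5;                   /* DO loop: one pass per day */
      gain = 2 * day;
      if gain >= 6 then status = 'high';   /* IF - THEN / ELSE */
      else status = 'low';
      select (day);                   /* SELECT - WHEN / OTHERWISE */
         when (1)  phase = 'start';
         when (5)  phase = 'end';
         otherwise phase = 'mid';
      end;
      output;                         /* write one observation per pass */
   end;
run;

proc print data=logic_demo;
run;
```

The result has five observations; days 3 through 5 are flagged high.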

SAS Program Structure
- Data steps must begin with the word DATA
  - Can combine data in a variety of ways
    - Match and merge, e.g. merging by id
    - Concatenate: joining two data sets end to end
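A sketch of the two ways of combining data named above, using small made-up data sets (all names and values are hypothetical). MERGE with BY matches observations on id; SET with two data set names stacks one after the other.

```sas
data weights;
   input id weight;
   datalines;
1 62
2 71
;
run;

data ages;
   input id age;
   datalines;
1 5
2 6
;
run;

/* Match-merge: both data sets must be sorted by the BY variable first */
proc sort data=weights; by id; run;
proc sort data=ages;    by id; run;

data merged;                          /* one row per id: id, weight, age */
   merge weights ages;
   by id;
run;

data more_weights;
   input id weight;
   datalines;
3 66
;
run;

/* Concatenate: observations of more_weights follow those of weights */
data stacked;                         /* three rows: ids 1, 2, 3 */
   set weights more_weights;
run;
```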

SAS Program Structure
- Procedure steps must begin with PROC followed by the name of a procedure
  - Print
  - Sort
  - Means
- A step ends when SAS finds:
  1. A new step (the next DATA or PROC statement)
  2. A run statement
  3. The end of the program (if running in batch mode)
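The first two step boundaries can be seen in the sketch below. It assumes the SASHELP.CLASS sample data set that normally ships with SAS; class_sorted is a hypothetical name for the sorted copy.

```sas
proc sort data=sashelp.class out=class_sorted;
   by age;
                                      /* no RUN here: the next PROC ends this step */
proc means data=class_sorted;
   var height weight;
run;                                  /* RUN ends the PROC MEANS step */
```

Ending every step with its own RUN statement is still the clearer habit, because the log messages for a step then appear right after that step is submitted.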

SAS Program Structure
- The run statement tells SAS to run all of the preceding lines of a step
  - Can also be used to run only the lines that are highlighted
- RUN is also one of the rare "global" statements that are not part of a DATA or PROC step
- I typically place a "quit" statement (quit;) after each run statement
  - Intended to keep the program from remaining in a DO loop or continuing to run, potentially indefinitely, after it hits an error.

Arithmetic Operators
- Arithmetic operators indicate that an arithmetic calculation is performed, as shown in the following table:

Symbol  Operation       Example                                 Result
+       addition        5 + 3                                   add two numbers together
-       subtraction     5 - 3, or ending wt. - beginning wt.    subtract 3 from 5
*       multiplication  2*y (see note 1)                        multiply 2 by the value of Y
/       division        var/5, or weight gain / days on test    divide the value of VAR by 5
**      exponentiation  a**2 (see note 2)                       raise A to the second power

Note 1: The * is always required; 2(y) and 2y cannot be used.
Note 2: Write exponentiation as **. The caret (^) is not an exponentiation operator in DATA step expressions; SAS treats it as one of the symbols for NOT.
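The operators in the table can be tried in a short DATA step. The data set name, the variable names, and the starting values are hypothetical.

```sas
data arith;                           /* hypothetical data set name */
   y = 4;
   v = 20;
   a = 3;
   total = 5 + 3;                     /* addition:       8 */
   diff  = 5 - 3;                     /* subtraction:    2 */
   prod  = 2*y;                       /* multiplication: 8 */
   quot  = v/5;                       /* division:       4 */
   sq    = a**2;                      /* exponentiation: 9 */
run;

proc print data=arith;
run;
```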