sas programming basics - university of north carolina at ...people.uncw.edu/blumj/stt305/ppt/sas...

SAS Programming Basics

SAS Programs

SAS Programs consist of three major components:

Global statements

Procedures

Data steps

Notes

Data steps and procedures are made up of one or more statements.

All statements end with a semicolon. This is the only restriction; SAS programs are otherwise free-form. For example, the following are equivalent:

proc sort data=mysas.fish out=work.fish2;

by lt hg;

run;

proc sort data=mysas.fish out=work.fish2;by lt hg;run;

Notes

The run statement is optional (only the final one is required). The run statement indicates to SAS that the procedure or data step is complete.

Starting a new data step or procedure is also taken by SAS as an indication that the previous one is complete (nesting of data steps and/or procedures is not allowed).

Using the run statement is generally good practice…
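
As a rough illustration of the three components and of step boundaries, the sketch below puts a global statement, a data step and a procedure in one program. The data set name squares and its variables are made up for this example and are not part of the course data.

```sas
/* Global statement: takes effect when submitted, outside any step */
title 'Step boundaries';

/* Data step: builds the temporary data set work.squares (hypothetical name) */
data squares;
  do x = 1 to 5;
    y = x**2;
    output;
  end;
run;   /* explicit end of the data step */

/* Procedure step: lists the data set just created */
proc print data=squares;
run;   /* explicit end of the procedure */
```

Leaving out the first run would still work here, because the proc print statement marks the end of the data step; writing it anyway makes each boundary visible.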

SAS Data Set Basics

SAS data set reference

Two levels: libref.data-set

libref is the SAS library (work library is default if libref is omitted)

data-set is the name of the SAS data set

SAS naming conventions

Names can be up to 32 characters in length

Not case sensitive

Must begin with a letter or underscore

Contain only letters, numbers or underscores

Applies to data sets, variables and libraries, but libraries are limited to 8 characters.
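
A small sketch of two-level names and case insensitivity, assuming the sashelp.class sample data set that ships with SAS is available. The name class_copy is invented for the example.

```sas
/* Two-level name: library work, data set class_copy (hypothetical name) */
data work.class_copy;
  set sashelp.class;
run;

/* One-level name: the libref defaults to work, so this is the same data set */
proc print data=class_copy;
run;

/* Names are not case sensitive, so this refers to it as well */
proc print data=WORK.Class_Copy;
run;
```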

SAS Data Libraries

A SAS data library is a directory on your computer where SAS data sets are stored or will be stored.

A library reference name (libref) can be assigned using the libname statement:

libname libref 'path-specification';

The libref must follow naming conventions and the path specified must exist.
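
For instance, a libref could be assigned and later released as sketched below. The folder shown is a placeholder only, an assumption for illustration; substitute a directory that exists on your machine.

```sas
/* Hypothetical path: replace with an existing folder on your computer */
libname mysas 'C:\SASData\stt305';

/* List the data sets stored in that folder */
proc contents data=mysas._all_ nods;
run;

/* Release the libref when it is no longer needed */
libname mysas clear;
```

The libref mysas has five characters, begins with a letter and contains only letters, so it satisfies the naming conventions and the 8-character limit for libraries.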

SAS Data Set Basics

SAS data sets consist of two portions

Descriptor portion

Data portion

Descriptor portion

Contains various status information

Can be viewed with PROC CONTENTS

The Contents Procedure

To view the descriptor portion of a data set:

proc contents data=libref.filename;

run;

To view data set listing for a particular library:

proc contents data=libref._all_ nods;

run;

_all_ is a keyword to display all data sets in the library.

nods suppresses the descriptor portion.
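
To try both forms without any course data, one option is the sashelp library that is installed with SAS. This is only a sketch and assumes sashelp.class is present in your installation.

```sas
/* Descriptor portion of a single data set */
proc contents data=sashelp.class;
run;

/* Listing of every data set in the library, descriptors suppressed */
proc contents data=sashelp._all_ nods;
run;
```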

Quick Exercise

Assign a library reference to the Orion sub-folder of the SAS Programming Data folder; then run PROC CONTENTS on this library.

SAS Data Set Basics

Data portion

Contains variable names and values

Can be viewed with various procedures or via the explorer.

SAS variables and values are of two types

Numeric

Character
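
The two variable types can be seen in a tiny data set read from in-line data. The data set lakes and its values are invented for illustration.

```sas
/* Hypothetical data: name is character, depth is numeric */
data lakes;
  input name $ depth;   /* the $ marks name as a character variable */
  datalines;
Alpha 12.5
Beta .
Gamma 30
;
run;

/* The descriptor portion reports the type of each variable */
proc contents data=lakes;
run;

/* The data portion holds the values; the missing depth prints as a period */
proc print data=lakes;
run;
```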

The Print Procedure

The print procedure is primarily used to display raw data.

In its most basic form, it simply sends the data to the output window.

Print does allow for some levels of customization and summarization.

General Syntax

proc print data=SAS-data-set options ;

var variable-list ;

by <descending> variable-list ;

pageby variable-list ;

id variable-list ;

sum variable-list ;

sumby variable-list ;

run;

The proc print statement invokes the procedure with options (including the data set).

The var statement selects the set of variables (columns) to display; the default is all.

The by statement groups the output by one or more variables; the data set must be sorted on them.

descending is optional here. In SAS help, optional keywords are enclosed in < >; however, the angle brackets are not actually typed when the option is used.

If by variables are used, the pageby statement can be included to get page breaks.

The id statement is used in conjunction with by to alter the display of groups.

The sum statement totals the listed variables; it applies only to numeric variables.

If by variables are used, the sumby statement permits altering the sub-totals.
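
The statements above might be combined as in the following sketch. It uses the sashelp.class sample data rather than the course data, and the output data set name class_sorted is made up for the example.

```sas
/* BY processing needs sorted data, so sort a copy first */
proc sort data=sashelp.class out=work.class_sorted;
  by sex age;
run;

proc print data=work.class_sorted noobs;
  by sex age;              /* one group per sex/age combination */
  pageby sex;              /* start a new page whenever sex changes */
  var name height weight;  /* columns to display */
  sum weight;              /* total the weight column */
  sumby sex;               /* sub-totals only when sex changes, not for every age */
run;
```

Without the sumby statement, a sub-total would be printed for each by group, which here means each combination of sex and age.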

Simple Examples

Try this:

proc print data=mysas.fish;

var name elv sa z hg;

run;

Vs. this:

proc print data=mysas.fish label;

var name elv sa z hg;

run;

Most SAS procedures will use labels whenever present; PRINT does not unless the label option is given.

Vs. this:

proc print data=mysas.fish label noobs;

var name elv sa z hg;

run;

The noobs option removes the observation column.

Using BY Processing (and SORT)

Any time a BY statement is used in a procedure, the data set must be sorted on the listed variables. The sort procedure can be used to sort a data set.

Syntax:

proc sort data=SAS-data-set <out=SAS-data-set>;

by <descending> variable1 … <descending> variablek;

run;

Sorts by first variable listed, then by the second variable within each group of the first, and so on.
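
A short sketch of a two-key sort with one descending key, again on the sashelp.class sample data (the output name class_by_sex_age is hypothetical):

```sas
/* Sort by sex, then by age from oldest to youngest within each sex */
proc sort data=sashelp.class out=work.class_by_sex_age;
  by sex descending age;
run;

/* A later BY statement has to match the sort order, including descending */
proc print data=work.class_by_sex_age noobs;
  by sex descending age;
  var name height weight;
run;
```

Note that descending applies only to the variable that immediately follows it.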

An Example with BY Groups

proc sort data=mysas.fish out=work.fish_sort;

by lt;

run;

title 'Grouped by Lake Type';

proc print data=fish_sort label noobs;

by lt;

var name elv sa z hg;

run;

Normally the sorted data would replace the previous data set; here an output data set must be specified. Why?

Remember, as far as SAS is concerned, work.fish_sort and fish_sort are the same data set.

Result (First Page)

Modification

proc print data=fish_sort label noobs;

by lt;

id lt;

var name elv sa z hg;

sum hg;

run;

ID alters the display of by groups.

SUM gets totals and sub-totals for the hg variable.

Result (First Page)

Other Statements of Use

PRINT is not a very sophisticated procedure, and we will not use it much. But it is good for illustrating some general concepts.

Titles and Footnotes

General form

titlen 'title text';

footnoten 'footnote text';

Titles appear at the top of the page, footnotes at the bottom (the default title is The SAS System).

n can be any whole number between 1 and 10 (title is equivalent to title1).

More on Titles and Footnotes

Titles remain in effect for the SAS session unless they are cancelled or changed.

Submission of a titlen statement:

Replaces previous title with same number

Removes all titles with higher number

Submission of a null title statement (title;) cancels all titles.

Above works same for footnotes.
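
The replacement and cancellation rules can be checked with a sketch like the one below, which assumes the sashelp.class sample data. The title and footnote text is arbitrary.

```sas
title1 'Main title';
title2 'Second line';
title3 'Third line';
footnote 'A footnote';

/* Output carries all three titles and the footnote */
proc print data=sashelp.class(obs=3);
run;

/* Replaces title2 and removes title3; title1 stays in effect */
title2 'Replacement second line';

proc print data=sashelp.class(obs=3);
run;

/* Null statements cancel all titles and all footnotes */
title;
footnote;
```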

The Label Statement

The label statement applies a label to be written in place of the variable name for display.

label variable1='label1' variable2='label2' …;

Labels assigned using the label statement in a procedure are temporary—only in effect for that procedure.
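
One way to see that such labels are temporary is to run the same PRINT twice, once with and once without the label statement. The sketch assumes the sashelp.class sample data; the label text is made up.

```sas
/* Labels apply to this procedure only */
proc print data=sashelp.class label noobs;
  var name height weight;
  label height='Height (inches)' weight='Weight (pounds)';
run;

/* Same step without the label statement: the labels above are gone,
   and the columns show whatever the data set itself stores */
proc print data=sashelp.class label noobs;
  var name height weight;
run;
```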

Example

Add the command

label name='Name of Lake/Pond';

inside the PROC PRINT to get…

Output

Where Processing

One can also subset output using a where statement in any procedure (or data step)

where expression;

Procedure only processes records for which the expression is true

Example: Add the following to your last PRINT.

where hg ge 0.5;

Relational Operators

Relation               Symbol   Mnemonic
Equal                  =        eq
Not Equal              ^=       ne
Greater Than           >        gt
Less Than              <        lt
Greater or Equal To    >=       ge
Less or Equal To       <=       le
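
Symbols and mnemonics are interchangeable, as this sketch on the sashelp.class sample data suggests. Both steps should list the same observations.

```sas
/* Symbol form */
proc print data=sashelp.class noobs;
  where age >= 15;
run;

/* Mnemonic form of the same condition */
proc print data=sashelp.class noobs;
  where age ge 15;
run;

/* Character values are compared against quoted strings */
proc print data=sashelp.class noobs;
  where sex ne 'F';
run;
```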

Compound Conditions

Ex.

proc print data=fish_sort label noobs;

where lt ne . and hg ge 0.5;

by lt;

id lt;

var name elv sa z hg;

run;

The keywords and, or are available along with parentheses to help you set compound conditions.
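
Parentheses matter when and and or are mixed. The sketch below, on the sashelp.class sample data, shows two conditions that differ only in grouping.

```sas
/* Females who are either 12 or younger, or taller than 65 inches */
proc print data=sashelp.class noobs;
  where sex = 'F' and (age le 12 or height gt 65);
run;

/* Different grouping: females 12 or younger, plus anyone taller than 65 inches */
proc print data=sashelp.class noobs;
  where (sex = 'F' and age le 12) or height gt 65;
run;
```

Without parentheses, and is evaluated before or, so an unparenthesized version would behave like the second step.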

Result

Other Operators

Between; example:

proc print data=fish_sort label noobs;

where hg between 0.3 and 0.5;

by lt;

id lt;

var name elv sa z hg;

run;

In; example:

proc print data=fish_sort label noobs;

where lt in (1,3);

by lt;

id lt;

var name elv sa z hg;

run;
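
Two further points about these operators, sketched on the sashelp.class sample data: between includes both endpoints, and in also accepts a list of quoted character values.

```sas
/* Ages 12, 13 and 14 are all selected: both endpoints are included */
proc print data=sashelp.class noobs;
  where age between 12 and 14;
run;

/* in with character values */
proc print data=sashelp.class noobs;
  where name in ('Alice', 'Henry');
run;
```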

General System Options

Options statement

options SAS-system-options;

Option                  Function
linesize=n (ls=n)       Sets the number of characters per line for output (and log).
pagesize=n (ps=n)       Sets the number of lines per page for output (and log).
date / nodate           Turns printing of date and time on/off.
number / nonumber       Turns printing of page numbers on/off.
pageno=n                Specifies beginning page number for output.
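
Several options can be set in one statement. The values below are arbitrary choices for illustration, and the sketch assumes the sashelp.class sample data.

```sas
/* Narrower, shorter pages with no date stamp and no page numbers */
options linesize=80 pagesize=60 nodate nonumber;

proc print data=sashelp.class;
run;

/* Turn the date and page numbers back on and restart numbering at 1 */
options date number pageno=1;
```

Like titles, system options are global: they stay in effect until they are changed or the session ends.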

Exercises

Create the following output from the “fish” data set, grouped on both the dam and lt variables (first page shown):

Exercises

Modify the previous to produce what follows (use a where statement):

Exercises

Use the “projects” data set to create:

Exercises

Since these variables do not have labels, modify the code to include them as follows: