Kaz SAS

Kaz’s SAS manual

BASIC

Version 11/12/2004

by Kazuaki Uekawa, Ph.D.

www.estat.us

[email protected]

Copyright © 2002 By Kazuaki Uekawa All rights reserved.


Profile:

Kazuaki (Kaz) Uekawa, Ph.D.

I was born in Hiroshima, Japan, went to college at Doshisha University in Kyoto, and worked for a few years as a high school teacher in Osaka. In 2000 I got my doctorate in Sociology at the University of Chicago. I learned SAS while working for a research project led by Charles Bidwell and Anthony Bryk, both Sociology professors at the U of C. Currently I am a research analyst at a non-profit research organization located in Washington DC. From my window, I can see the Washington Monument.

Table of Contents

I. Basic Operations
1. Ask questions to SAS by emailing support@sas.com
2. How do I start and what mini-windows do I look at?
3. How do I look at data sets?
4. Assigning library name and create folders
5. How do we create SAS data?
A) Create a SAS data using a SAS syntax
B) Create SAS data via MS Excel sheets
C) Create a SAS data set via an external text file
6. Examples of data steps
7. Manipulating variables in data steps
8. Lots of manipulation techniques to be used in a data step
9. Using Character Functions to create new variables
10. Application: How do we restrict analytical samples using NMISS function

II. Procedures
11. PROC CONTENTS: Description of Contents
12. PROC PRINT: See Data
13. PROC SORT: Sorting Observations based on a value of variable
14. PROC MEANS: Get Descriptive Statistics (Mean, STD, Min, Max)
15. PROC FREQ: Get Frequencies
16. PROC UNIVARIATE: Get elaborate statistics and a univariate plot
17. PROC PLOT: Plotting Two Variables
18. PROC TIMEPLOT: Time Plot
19. PROC CORR: Correlation
20. PROC REG: OLS Regression
21. PROC LOGISTIC: Logistic Regression
22. MAKE AN ASCII FILE

III. More Procedures
23. PROC STANDARD: Standardize Values
24. PROC RANK: Rank observations
25. PROC SQL: Creating group-level mean variables
26. PROC IMPORT

IV. Merging Data Sets

V. MACROs
27. Most common way of using a macro
28. Simple macro using LET statement
29. Macro can be specified from data (not directly by you)


I. Basic Operations

1. Ask questions to SAS by emailing support@sas.com

When you have a question about SAS, you can email SAS Institute's technical support team. The address is support@sas.com. At the beginning of your email, paste the information that appears at the head of your log file. The log file is what you get when you run SAS. It looks like this:

NOTE: Copyright (c) 1999-2001 by SAS Institute Inc., Cary, NC, USA.

NOTE: SAS (r) Proprietary Software Release 8.2 (TS2M0)

Licensed to UNIVERSITY OF XXXXX, Site XXXXX.

NOTE: This session is executing on the WIN_ME platform.

I developed my SAS skills mostly by communicating with the SAS technical support team.

I often use Google to get answers to my questions. I find SAS's help menu hard to understand because it doesn't always show the best examples.


2. How do I start and what mini-windows do I look at?

In Windows, you can start SAS by going to START > All Programs > The SAS System. Confirm that you get three windows.

1. Editor file. This is where you write your syntax.

2. Log file. This file shows your errors.

3. Output file. You get results in this window.

Three toolbar items are worth knowing:

Click Explorer to look at the data sets (see the next section).

Click the running-man icon to run your program.

Click the ! icon to cancel while the program is running.

3. How do I look at data sets?

This syntax (which you type into the editor window) gives you an example data set to look at.

data abcd;

set sashelp.Prdsale;

run;

To open the data set in a viewer where you can examine it, follow these four steps.

1. Click Explorer.

2. Click Libraries.

3. Click Work or one of the other folders.

4. Click the data set.

Notes: I look at the data sets to check whether anything is wrong with them. Look closely for any irregularity in the data. If the syntax you are about to run affects a data set, you must close that data set before running it.
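If you prefer not to click through the Explorer, a quick look at the first few rows can also be printed to the output window. This is only a minimal sketch; it assumes the data set abcd created above, and the choice of 10 rows is arbitrary.

```sas
/* Print only the first 10 observations of the temporary data set abcd */
proc print data=abcd (obs=10);
run;
```

Printing a handful of rows is usually enough to spot obvious irregularities without opening (and later having to close) the data set viewer.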


4. Assigning library name and create folders

You need libname statements at the head of your SAS programs. With these, you assign nicknames (library names) to the folders that host your SAS data sets. For example:

libname here "C:\TEMP";

libname there "C:\";

Running the above creates two folders, "here" and "there," under Libraries in the Explorer view (see the previous section for how to get to this view).

Imagine you want a data set called MYDATA stored in C:\TEMP. You can create it in this way:

libname here "C:\TEMP";

data here.MYDATA;

X=1;

run;

This silly data set has one observation and one variable, X, whose value is 1. Because you decided to call that folder by the nickname HERE, you will refer to the data set as "here.MYDATA." For example, to print the contents of that data set, you will do this:

proc print data=here.MYDATA;

run;

To see what variables are in the data, do this:

proc contents data=here.MYDATA;

run;

What are the other folders? Sashelp hosts lots of data sets that SAS Institute ships with the SAS software for demonstration's sake. I have never opened Sasuser or Maps. "Work" hosts temporary data sets that you create as you program in SAS. Temporary data sets disappear when you close SAS. Permanent data sets, on the other hand, are the data sets you create to keep even after you quit SAS. The examples that follow elaborate on these things.

Here is some silly example syntax to show you what the folders do and what temporary and permanent data sets are.

/*libname statements just need to occur at the beginning of the syntax file*/

libname here "C:\TEMP";

libname there "C:\";

/*this creates a data called Wally in WORK folder*/

data Wally;

x=1;

y=2;

z=3;

run;

/*this creates a data called Wally in HERE folder*/

data here.Wally;

x=4;

y=5;

z=6;

run;

/*this creates a data called Wally in THERE folder*/

data there.Wally;

x=7;

y=8;

z=9;

run;

/*Use proc print to see the content of the data sets*/

proc print data=work.Wally;
run;

proc print data=here.Wally;
run;

proc print data=there.Wally;
run;

In the Explorer, click on the WORK, HERE, and THERE folders to find the three different "Wally" data sets.

Two shortcuts:

proc print;run; /*when data is not specified, SAS uses the most recently created data set*/

proc print data=Wally;run; /*"work." can be omitted in this way. I always omit it.*/
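To check which data sets a library holds without opening the Explorer, something like the following may help. It is only a sketch and assumes the HERE library assigned above.

```sas
/* List the data sets stored in the HERE library (C:\TEMP) */
proc datasets library=here;
quit;

/* Same idea with PROC CONTENTS: NODS suppresses the per-data-set detail */
proc contents data=here._all_ nods;
run;
```

Replacing here with work or there lists the members of those libraries instead.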


5. How do we create SAS data?

A) Create a SAS data using a SAS syntax

Of course you can create data in your syntax.

libname here "C:\";

data kaz;

input ID 1 SEX $ 4-9 height 13-15 ;

cards;
1  Male     170
2  Female   165
3  Male
4  Male     168
5  Female   170
;

run;

proc print;

run;

When a value is missing, as in the third row above, it is safer to enter a dot than to leave the spot empty. Leaving it empty is also okay here because the INPUT statement explicitly tells SAS where to find the values of each variable (e.g., height 13-15).

After creating a data set, look at the data to see if there is anything wrong. Because this is a small data set, you can use PROC PRINT to print it in your output window. The other useful way is to click on the actual SAS data set to see its contents, as explained earlier.

B) Create SAS data via MS Excel sheets

The Excel sheet uses its first row for variable names.

Use this syntax to import the Excel sheet (C:\mary.xls) as a SAS data set (JOHN):

PROC IMPORT OUT= JOHN

DATAFILE= "C:\mary.xls"

DBMS=EXCEL2000 REPLACE;

RUN;

/*This one ignores variable names.

It also specifies the sheet from

which to take data*/

PROC IMPORT OUT= JOHN

DATAFILE= "C:\mary.xls"

DBMS=EXCEL2000 REPLACE;

GETNAMES=NO;

SHEET="Sheet1";

RUN;

Be sure to close the Excel sheet before you run the syntax to import it. Otherwise, you get this error message:

ERROR: File _IMEX_.'Sheet1$'n.DATA does not exist.

ERROR: Import unsuccessful. See SAS Log for details.

NOTE: The SAS System stopped processing this step because of errors.


C) Create a SAS data set via an external text file

Imagine you have a text file (say, kaz.txt) in your C:\TEMP folder that holds rows like the ones entered after CARDS above.

If you know where the data points are exactly in the data, you can indicate the locations in the

following way.

data kaz;

infile "C:\TEMP\kaz.txt" ;

input ID 1 SEX $ 4-9 height 13-15 ;

run;

proc print;

run;

If a character variable is just one word (e.g., Male), then we don't really need to tell SAS the exact locations. SAS will treat each block of letters or numbers separated by blanks as one value. But you need to say "missover," so that when SAS does not find a value at an expected place (as in the third observation in this data set), it treats it as a missing value. If a character variable contains more than one word, use the method above instead of the one below.

libname here "C:\TEMP";

data kaz;

infile "C:\TEMP\kaz.txt" missover;

input ID SEX $ height ;

run;

proc print;

run;

missover: when a line ends before all the variables have been read, SAS sets the remaining variables to missing instead of looking for them on the next line.

It's okay for a value to be missing, but a dot "." is often used to indicate a missing value. It is safer that way.

$ indicates that SEX is a character variable. SAS always needs to know whether a variable is character or numeric.
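To try MISSOVER without creating a text file first, the same idea can be sketched with in-stream data. This is an illustration under assumptions: kaz_demo is a hypothetical data set name, and the rows are the five example rows used earlier.

```sas
data kaz_demo;
  infile datalines missover;  /* do not jump to the next line when a value is absent */
  input ID SEX $ height;
  datalines;
1 Male 170
2 Female 165
3 Male
4 Male 168
5 Female 170
;
run;

proc print data=kaz_demo;
run;
```

The third observation should come out with a missing height. Removing missover from the INFILE statement and rerunning shows what goes wrong without it: SAS moves on to the fourth line to look for the third person's height.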

Data Steps and Creating New Variables

6. Examples of data steps

Any SAS program consists of two elements: DATA steps and PROCs (such as proc print or proc means). I discuss data steps in this chapter. I show you some variations of data steps, so you can understand them by example.

libname here "C:\TEMP";

libname there "C:\";

data xyz;

set abc;

/*here manipulation of data */

run;

data abc;

set abc;

/*here manipulation of data */

run;

data xyz;

set here.abc;

/*here manipulation of data */

run;

data here.abc;

set xyz;

/*here manipulation of data */

run;

data there.abc;

set here.abc;

/*here manipulation of data */

run;

The five data steps above do the following, in order:

1. I am creating a new temporary data set XYZ (to be found in the WORK folder) based on an already existing temporary data set called ABC (found in the WORK folder).

2. I am creating a new temporary data set ABC (to be found in the WORK folder) based on an already existing temporary data set called ABC (found in the WORK folder). The old ABC will be overwritten by the new ABC. This is perfectly okay.

3. I am creating a new temporary data set XYZ based on an already existing permanent data set called ABC (found in the HERE folder, which is C:\TEMP).

4. I am creating a new permanent data set ABC in the HERE folder (which is C:\TEMP) based on an already existing temporary data set called XYZ.

5. I am creating a new permanent data set ABC in the THERE folder (which is C:\) based on an already existing permanent data set called ABC in the HERE folder (which is C:\TEMP).

Reminder:

Temporary data sets: Found in the WORK folder. They disappear when a session ends.

Work folder: Click on Explorer, then on Libraries, then on Work.

The HERE folder and THERE folder: HERE and THERE are arbitrary names that I assigned with LIBNAME statements. They refer to the paths that I specified.

7. Manipulating variables in data steps

We use a SAS sample data set sashelp.Class (a data set called Class stored in

SASHELP folder) to practice creating new variables. Do this to find out

what this data set has:

proc contents data=sashelp.Class;

run;

You get information below, telling you that the data set has AGE, Height,

Name, SEX, and weight.

#    Variable    Type    Len    Pos
-----------------------------------
3    Age         Num       8      0
4    Height      Num       8      8
1    Name        Char      8     24
2    Sex         Char      1     32
5    Weight      Num       8     16

Here is a sample of how you can work on this data set to create Body Mass Index, as well as other useful

variables.

data ABC;

set sashelp.Class;

*Creating a character variable indicating a person's BMI status (Body

Mass Index);

weight_metric=weight*0.45359237;

height_metric=(height* 2.54)/100 ;

BMI=weight_metric/(height_metric**2);

/*Definition of obesity Normal weight = 18.5-24.9

Overweight = 25-29.9 Obesity = BMI of 30 or greater */

length status $ 15;

If BMI < 18.5 then status="Underweight";

If BMI >= 18.5 and BMI < 25 then status="Normal";

If BMI >= 25 and BMI < 30 then status="Overweight";

If BMI >= 30 then status="Obese";

run;

You always need a data step to create new variables.

Without the LENGTH statement, SAS would set the length of the character variable from the first value assigned to it, which would be "Underweight" (11 characters) in this case. Any longer value would be cut off at that length.
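One thing to watch in IF conditions like the ones above: SAS treats a missing numeric value as smaller than any number, so a person with a missing BMI would satisfy BMI < 18.5. The following is a hedged sketch of one way to guard against that; ABC2 and status2 are hypothetical names, and it assumes the data set ABC with the variable BMI created above.

```sas
data ABC2;
  set ABC;                     /* ABC is the data set created above */
  length status2 $ 15;
  /* Check for a missing BMI first; otherwise it would be
     labeled "Underweight" by the next condition */
  if missing(BMI) then status2=" ";
  else if BMI < 18.5 then status2="Underweight";
  else if BMI < 25 then status2="Normal";
  else if BMI < 30 then status2="Overweight";
  else status2="Obese";
run;

/* Tabulate the new variable to check the categories */
proc freq data=ABC2;
  tables status2;
run;
```

The ELSE IF chain is a design choice: once a condition is met, the remaining ones are skipped, so each person lands in exactly one category.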

8. Lots of manipulation techniques to be used in a data step

data abc;

set sashelp.Class;

var1=height+weight;

var2=sum(of height weight);

var3=weight-height;

var4=height*weight;

var5=height/weight;

var6=1/(height+weight);

var7=mean(of height weight);

var7B=mean(height, weight);/*this way is okay too*/

var8=max(of height weight);

var9=min(of height weight);

var10=log(height);

var11=abs(var3); /*Absolute values: this takes out negative signs*/

var12=nmiss(of height weight);/*N of missing cases*/

var13=n(of height weight); /*N of observations*/

run;

proc print;

run;

How is Z=mean(of X1 X2 X3) different from Z=(X1+X2+X3)/3;?

How is Z=sum(of X1 X2 X3) different from Z=X1+X2+X3;?

Functions such as mean(of …) or sum(of …) compute statistics from the non-missing values. They return a value even when some of the variables in the parentheses are missing. For example, if X1 is missing:

X=mean(of X1 X2 X3); will return the average of X2 and X3.

In contrast,

X=(X1+X2+X3)/3; will return a missing value, namely, "."
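The difference can be checked with a small test. This is a minimal sketch with made-up values; Z_function and Z_arithmetic are hypothetical names chosen for illustration.

```sas
data _null_;                      /* _null_ runs the step without creating a data set */
  X1=.; X2=4; X3=6;
  Z_function=mean(of X1 X2 X3);   /* average of the non-missing values X2 and X3 */
  Z_arithmetic=(X1+X2+X3)/3;      /* missing, because X1 is missing */
  put Z_function= Z_arithmetic=;  /* writes both values to the log */
run;
```

The log should show Z_function=5 and a dot for Z_arithmetic, along with a note that missing values were generated.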


9. Using Character Functions to create new variables

data abc;

set sashelp.Class;

var1=name||sex;

var2=compress(name||sex);/*COMPRESS gets rid of space in between*/

var3=substr(name,1,3);/*take the first 3 letters starting from the first

letter*/

var4=upcase(name);/*upper case*/

run;

proc print;

run;
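Two related tools may be worth knowing for joining character values. The sketch below is an illustration, not part of the original examples: abc2, var5, and var6 are hypothetical names, and CATX assumes SAS 9 or later.

```sas
data abc2;
  set sashelp.Class;
  length var5 var6 $ 12;
  var5=trim(name)||sex;        /* TRIM drops trailing blanks from name before joining */
  var6=catx("_", name, sex);   /* CATX strips blanks and inserts a delimiter */
run;

proc print data=abc2;
  var name sex var5 var6;
run;
```

Unlike COMPRESS, which removes every blank including those inside a value, TRIM only removes the blanks at the end.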

10. Application: How do we restrict analytical samples using NMISS function

Imagine we are running several procedures on our data. We want to always use the same number of observations, but depending on the pattern of missing values, each procedure may end up using a different set of cases. Here is a way to force the sample to be the same, by keeping only the cases that have no missing values on a set of variables. (This is listwise deletion, which is a stricter way of selecting cases than pairwise deletion.)

Use the NMISS function to create a new variable X.

data ABC;

set sashelp.class;

if name="Janet" then height=.; /*just imagine Janet was missing a value for her height*/

X=nmiss(of height weight);/*this returns the number of missing cases*/

run;

proc means data=ABC;

where X=0; /*Run only when X=0, namely, number of missing cases is 0*/

var weight height;

run;

proc reg data=ABC;

where X=0; /*Run only when X=0, namely, number of missing cases is 0*/

model height=weight;


run;
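Instead of repeating the WHERE statement in every procedure, the complete cases could also be saved once as their own data set. A minimal sketch, assuming the data set ABC with the variable X created above; complete is a hypothetical name.

```sas
data complete;
  set ABC;
  if X=0;   /* subsetting IF: keep only rows with no missing height or weight */
run;

proc means data=complete;
  var weight height;
run;
```

Every later procedure that uses data=complete then works with the same set of observations.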


II. Procedures

11. PROC CONTENTS: Description of Contents

Data ABC;set sashelp.Prdsale;

run;

/*1111111111111111111111111*/

/*simple way*/

proc contents data=ABC;

run;

/*I like "position option" because it gives me a table that is sorted by

the position of variables in the data, in addition to alphabetically

sorted table*/

proc contents data=ABC position;

run;

/*Easiest way to produce RTF or EXCEL documents off PROC CONTENTS*/

/*but I don't like this way because it comes with too many details*/

ods rtf file ="C:\TEMP\datadictionary1.rtf";

proc contents data=ABC position;

run;

ods rtf close;

ods html file ="C:\TEMP\datadictionary1.xls";

proc contents data=ABC position;

run;

ods html close;

/*Using ODS we get only the data we want.*/

proc contents data=ABC position;

ods output position=whatever_name_you_want ;

run;

ods rtf file ="C:\TEMP\datadictionary2.rtf";

proc print data=whatever_name_you_want noobs;

title "data dictionary in RTF";

var variable label ;

run;

ods rtf close;


12. PROC PRINT: See Data

proc print data=sashelp.class;

run;

proc print data=sashelp.class;

VAR name weight height sex;

run;

proc print data=sashelp.class round noobs;

where sex="M";

VAR name weight height sex;

run;

proc sort data=sashelp.class out=kaz;

by height;

run;

proc print data=kaz (obs=5);

title "Observations sorted by height. Also show only five shortest people";

var name height sex;

run;


I have cleaned up this document up to here. I am still working on the

rest.

The rest of this manual is based on this data set:

http://www.estat.us/sas/kazclass.txt

Download the digital version of this document and cut and paste the following data. The data come from TIMSS (Third International Mathematics and Science Study). MAT7 is the 7th graders' and MAT8 the 8th graders' national mean mathematics score. NATEXAM is 1 when a nation has a national examination system, NATTEXT is 1 if a nation decides on textbooks at the national level, and NATSYLB is 1 when a nation decides on the syllabus at the national level. BLOCK is a geographical area. PROP is the proportion of kids in middle school.

data kaz;
/*NATION must sit in columns 6-14 and NAME in columns 15-33*/
input acro $ NATION $ 6-14 NAME $ 15-33 MAT7 MAT8 GNP14 PROP NATEXAM NATSYLB NATTEXT block $;
cards;
aus  Australi Australia           498 529.63 -0.15526 84 0 1 0 ocea
aut  Austria  Austria             509 539.43 -0.29163 100 0 0 1 weuro
bfl  Belgi_FL Belgium (Fl)        558 565.18 -0.25157 100 1 1 0 weuro
bfr  Belgi_FR Belgium (Fr)        507 526.26 -0.25157 100 0 1 0 weuro
can  Canada   Canada              494 527.24 0.07184 88 0 0 0 namer
col  Colombia Colombia            369 384.76 -0.23699 62 0 1 0 samer
cyp  Cyprus   Cyprus              446 473.59 -0.41906 95 0 1 1 seuro
csk  Czech    Czech Republic      523 563.75 -0.34840 86 0 1 0 eeuro
dnk  Denmark  Denmark             465 502.29 -0.34057 100 1 0 0 weuro
fra  France   France              492 537.83 0.55791 100 0 1 0 weuro
deu  Germany  Germany             484 509.16 0.91992 100 0 0 0 weuro
grc  Greece   Greece              440 483.90 -0.32620 99 0 1 1 seuro
hkg  HongKong Hong Kong           564 588.02 -0.31638 98 1 1 1 seasia
hun  Hungary  Hungary             502 537.26 -0.37602 81 0 0 0 eeuro
isl  Iceland  Iceland             459 486.78 -0.42606 100 0 0 0 neuro
irn  Iran     Iran, Islamic Rep.  401 428.33 -0.17095 66 0 1 1 meast
irl  Ireland  Ireland             500 527.40 -0.38919 100 1 1 0 weuro
isr  Israel   Israel              . 521.59 -0.35464 87 0 1 0 meast
jpn  Japan    Japan               571 604.77 1.85543 96 0 1 0 seasia
kor  Korea    Korea               577 607.38 -0.01168 93 0 1 1 seasia
kwt  Kuwait   Kuwait              . 392.18 -0.40359 60 0 1 1 meast
lva  Latvia   Latvia (LSS)        462 493.36 -0.42319 87 0 0 0 eeuro
ltu  Lithuani Lithuania           428 477.23 -0.41785 78 1 1 1 eeuro
nld  Netherla Netherlands         516 540.99 -0.18184 93 1 0 0 weuro
nzl  NewZeala New Zealand         472 507.80 -0.38319 100 1 1 0 ocea
nor  Norway   Norway              461 503.29 -0.35450 100 0 1 1 neuro
prt  Portugal Portugal            423 454.45 -0.32588 81 0 1 0 weuro
rom  Romania  Romania             454 481.55 -0.35396 82 1 1 1 eeuro
rus  RussianF Russian Federation  501 535.47 0.12827 88 1 0 0 eeuro
sco  Scotland Scotland            463 498.46 0.48017 100 0 0 0 weuro
sgp  Singapor Singapore           601 643.30 -0.37279 84 1 1 1 seasia
slv  SlovakRe Slovak Republic     508 547.11 -0.40217 89 0 1 0 eeuro
svn  Slovenia Slovenia            498 540.80 -0.41310 85 0 1 1 eeuro
esp  Spain    Spain               448 487.35 0.03461 100 0 1 1 weuro
swe  Sweden   Sweden              477 518.64 -0.30049 99 0 1 0 neuro
che  Switzerl Switzerland         506 545.44 -0.27916 91 0 0 0 weuro
tha  Thailand Thailand            495 522.37 -0.14533 37 0 1 1 seasia
usa  USA      United States       476 499.76 5.37506 97 0 0 0 namer
;
run;

proc print;
run;

13. PROC SORT: Sorting Observations based on a value of variable

You will use this procedure a lot, but be careful with large data sets: sorting consumes a lot of computation time.

PROC SORT data=kaz out=kaz2;

/*If you don’t want to create a new data set, just write “out=kaz”*/

by mat8;

run;

Advanced topics:

proc sort data=kaz out=kaz2 nodupkey;

by block;

run;

proc print data=kaz2;run;

This takes only the first observation of each block. Imagine that you have data with individual-level variables (e.g., 100 students) and group-level variables (e.g., 10 schools), and you want to get school-level information from these data. The procedure above would take just the first observation of each school and give you ten lines of data for the 10 schools. Ignore the individual-level variables in the result, however, since they belong only to the first student of each school.
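To avoid carrying individual-level variables into the group-level file at all, the KEEP= data set option can be combined with NODUPKEY. This is a sketch under assumptions: blocks is a hypothetical output name, and only the group variable block is kept.

```sas
/* Keep only the group-level variable, then drop duplicate blocks */
proc sort data=kaz (keep=block) out=blocks nodupkey;
  by block;
run;

proc print data=blocks;
run;
```

Any other group-level variables can be added to the KEEP= list alongside block.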

You can use more than one variable in the BY statement.

proc sort data=kaz out=kaz2;

by natexam block;

run;

/*What would the new data look like?*/

proc print data=kaz2;run;

14. PROC MEANS: Get Descriptive Statistics (Mean, STD, Min, Max)

PROC MEANS data=kaz;

VAR mat7 mat8;

run;

Advanced topic: Group means.

/*Report group means*/

proc sort data=kaz out=kaz2;by block;run;

proc means data=kaz2;

by block;

var mat7 mat8;

run;

You can also use a "class" statement instead of a "by" statement. The class statement is easier because you don't need to sort the data by the by-variable first. One downside is that CLASS holds all the group levels in memory, so BY can be the safer choice for very large data with many groups.

proc means data=kaz2; /*now, kaz2 does not have to be sorted by block*/

class block;

var mat7 mat8;

run;

/*Save group means*/

ods listing close; /*printing of results suppressed*/

proc means data=kaz2; /*make sure kaz2 is already sorted by group ID*/

by block;


var mat7 mat8;

ods output summary=john; /*Output Delivery System Used. See SAS manual 2*/

run;

ods listing on; /*printing of results resumed*/

proc print data=john;

run;
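Group means can also be saved with the OUTPUT statement of PROC MEANS, which avoids both the sort and the ODS statements. A minimal sketch; blockmeans, mean_mat7, and mean_mat8 are hypothetical names.

```sas
proc means data=kaz noprint nway;   /* NOPRINT suppresses the printed table */
  class block;                      /* CLASS needs no prior sorting */
  var mat7 mat8;
  output out=blockmeans mean=mean_mat7 mean_mat8;
run;

proc print data=blockmeans;
run;
```

NWAY keeps only the rows for the individual blocks; without it, the output data set would also contain a row for the overall mean.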

/*Get standard errors by adding STDERR*/

/*Requesting STDERR alone prints only the standard error, so list the other statistics you want with it: MEAN, N, STD, MAX, and MIN*/

PROC MEANS data=kaz mean n std max min stderr;

VAR mat7 mat8;

run;

I recommend reading a chapter on PROC MEANS in SAS CD-online. It is a very versatile procedure.

15. PROC FREQ: Get Frequencies

PROC FREQ data=kaz;

Tables natexam ;

Run;

Advanced topics:

Get cross tabulation:

PROC FREQ data=kaz;

tables natexam*block;

run;

16. PROC UNIVARIATE: Get elaborate statistics and a univariate plot

PROC UNIVARIATE PLOT DATA=KAZ;

var mat7 mat8 gnp14;

run;

Advanced topic: Get a box-and-whisker plot by subgroups, so you can compare group values. The output is text-based and pretty ugly, though.

proc sort data=kaz out=kaz2;

by block;

run;

PROC UNIVARIATE data=kaz2 plot;


by block;

var mat8;

run;

17. PROC PLOT: Plotting Two Variables

This is a text-based graph. Use proc gplot for a nicer graphic.

PROC PLOT data=KAZ;

Plot mat7*mat8;

run;

18. PROC TIMEPLOT: Time Plot

proc timeplot data=KAZ;

plot mat8= '*';

id NAME;

run;

Advanced topics:

/*Sort first by the variable of your interest and see it*/

/*you will be seeing a ranking of nations*/

proc sort data=kaz out=kaz2;

by mat8;

run;

proc timeplot data=KAZ2;

plot mat8= '*';

id NAME;

run;

Add bells and whistles. Below, I am asking, "Does GNP have anything to do with test scores?"

/*First sort by GNP*/

proc sort data=kaz out=kaz2;

by gnp14;

run;

proc timeplot data=KAZ2;

title "TIMSS countries sorted by GNP";

plot mat7 mat8/overlay hiloc npp ;

id NAME block gnp14 prop;

run;

19. PROC CORR: Correlation

PROC CORR DATA=KAZ;


VAR mat7 mat8 gnp14;

Run;

20. PROC REG: OLS Regression

PROC REG DATA=KAZ;

MODEL mat8=natexam gnp14;

Run;

Advanced Topic:

http://www.estat.us/sas/OLS%20tables%20for%20learning.txt

21. PROC LOGISTIC: Logistic Regression

/*I don’t know if natexam can be considered a dependent variable, but for the sake of demonstration*/

PROC logistic data=kaz descend;

Model natexam=gnp14;

run;

/*The descend option makes sure that PROC LOGISTIC is modeling the probability that the outcome=1. Without this option, it would model the probability that the outcome=0*/
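In newer releases, the modeled category can also be named directly in the MODEL statement, which some find easier to read than the descend option. This sketch assumes SAS 9 or later, where the EVENT= response option is available.

```sas
proc logistic data=kaz;
  /* EVENT='1' asks for the probability that natexam=1 to be modeled */
  model natexam(event='1')=gnp14;
run;
```

Either way, it is worth checking the "Probability modeled is ..." note that PROC LOGISTIC prints with the results.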

22. MAKE AN ASCII FILE

To use a stand-alone software program, you may have to create a simple ASCII (plain text) file. I rarely do this lately because many programs read SAS data directly.

data timss;set kaz;

file "aschi_example.txt";

put (nation) ($10.) (mat7 mat8) (8.0);

run;


III. More Procedures

23. PROC STANDARD: Standardize Values

Make Z-score with a mean of 0 and standard deviation of 1

proc standard data=kaz out=kaz2 mean=0 std=1;

var mat7 mat8;

run;

/*then see what you did*/

proc print data=kaz2;

run;

Advanced technique: Standardize within groups.

/*First sort by group ID*/

proc sort data=kaz out=kaz2;

by block;

run;

/*Use by statement*/

proc standard data=kaz2 out=kaz3 mean=0 std=1;

by block;

var mat7 mat8;

run;


24. PROC RANK: Rank observations

proc rank data=kaz out=kaz2 group=3;

/*Creates 3 groups. The new values will be 0, 1, and 2. */

var mat7 mat8;

RANKS Rmat7 Rmat8;

/*give names to the new variables*/

Run;

/*see what happened*/

proc print data=kaz2;

var mat7 Rmat7 mat8 Rmat8;

RUN;

Research Tip:

Why do we use rank?

a. We can split the sample based on the rank, e.g., a high-SES student sample versus a low-SES student sample.

b. We can create dummy variables quickly by specifying group=2, e.g., high-SES students receive 1; all others 0. This grouping occurs at the median of a variable, which may or may not be the best strategy. The alternative is to assign 1 and 0 based on some meaningful threshold. For example, if I have temperature data, I may split the data at the median if that makes sense, but I may instead use 0 degrees (the freezing point) as a meaningful point at which to split the data.
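A threshold-based dummy variable can be created in a data step. The sketch below is only an illustration: the cut-off of 500 and the names kaz_dummy and high_mat8 are assumptions made for the example, not values taken from the manual.

```sas
data kaz_dummy;
  set kaz;
  /* 500 is an arbitrary cut-off chosen only for illustration */
  if missing(mat8) then high_mat8=.;   /* keep missing scores missing */
  else if mat8 >= 500 then high_mat8=1;
  else high_mat8=0;
run;

proc freq data=kaz_dummy;
  tables high_mat8;
run;
```

Unlike group=2 in PROC RANK, this split does not move when the sample changes, which makes results easier to compare across data sets.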

25. PROC SQL: Creating group-level mean variables

One could use proc means to derive group-level means. I don't recommend this since it involves the extra step of merging the mean data back to the main data set. Extra steps always create room for errors. PROC SQL does it all at once.

proc sql;

create table kaz2 as

select *,

mean(mat7) as mean_mat7,

mean(mat8) as mean_mat8,

mean(gnp14) as mean_gnp

from kaz

group by block;

quit; /*PROC SQL ends with QUIT rather than RUN*/

proc print data=kaz2;

run;

26. PROC IMPORT

PROC IMPORT reads Excel data into SAS (see also Section 5 B). As an experiment, create an Excel sheet in the C drive and import it into SAS using the following code.

PROC IMPORT OUT= mine

DATAFILE= "C:\example.xls"

DBMS=EXCEL2000 REPLACE;

GETNAMES=YES;

RUN;

proc print data=mine;

run;


IV. Merging Data Sets

libname here "C:\";

/*Create two data sets A and B.*/

data A;

set kaz; /*I am assuming that you already have this data set “kaz” */

keep nation mat7;

run;

data B;

set kaz;

keep nation mat8;

run;

/*MERGE DATA SETS*/

/*First sort them by a common ID*/

/*Here they are already sorted, so the following two lines are not really necessary*/

proc sort data=A;by nation;run;

proc sort data=B;by nation;run;

data NEW;

merge A B;

by nation;

run;

/*Confirm*/

proc print data=NEW;

run;
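When the two data sets may not contain exactly the same IDs, the IN= data set option shows where each row came from. A minimal sketch, assuming A and B are sorted by nation as above; NEW2, inA, inB, from_A, and from_B are hypothetical names.

```sas
data NEW2;
  merge A (in=inA) B (in=inB);
  by nation;
  from_A=inA;      /* IN= variables are temporary, so copy them to keep them */
  from_B=inB;
  if inA and inB;  /* keep only nations found in both data sets */
run;

proc print data=NEW2;
run;
```

Dropping the subsetting IF keeps all nations, and from_A and from_B then flag the ones that appear in only one of the two data sets.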


V. MACROs

27. Most common way of using a macro

%macro john (group=,var1=,var2=);

proc means data=sashelp.class;

class &group;

var &var1;

run;

%mend john;

%john(group=sex,var1=height weight);

%john(group=sex,var1=age height);
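When a macro does not behave as expected, it helps to see the SAS code it actually generates. The sketch below turns on the MPRINT system option for that purpose; the macro name jane and the parameter name vars are hypothetical.

```sas
options mprint;   /* write the SAS code generated by the macro to the log */

%macro jane (group=, vars=);
proc means data=sashelp.class;
class &group;
var &vars;
run;
%mend jane;

%jane(group=sex, vars=height weight);

options nomprint; /* switch the option off again */
```

With MPRINT on, the log shows the PROC MEANS step with sex and height weight already filled in.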

28. Simple macro using LET statement

%let john=height weight;

proc reg data=sashelp.class;

title "&john ";

model weight=height;

run;
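A macro variable becomes more useful when it also stands in for part of the syntax, so that a variable list is typed only once. A minimal sketch; myvars is a hypothetical name.

```sas
%let myvars=height weight;   /* myvars is a hypothetical name */

proc means data=sashelp.class;
  title "Descriptive statistics for: &myvars";
  var &myvars;               /* resolves to: var height weight; */
run;

proc corr data=sashelp.class;
  var &myvars age;
run;

title;  /* clear the title */
```

Changing the %LET line is then enough to rerun both procedures on a different set of variables.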

29. Macro can be specified from data (not directly by you)

data X;

x="John's phone Number";

y="312-234-3999";

run;

data x2;

set x;

call symput ("example", x);

run;

proc print data=x2;

title "&example";

var y;

run;
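The same technique can pass a computed value, rather than a fixed string, into later code. This sketch stores the number of rows of sashelp.class in a macro variable; n_students is a hypothetical name, and CALL SYMPUTX assumes SAS 9 or later.

```sas
data _null_;
  set sashelp.class end=last;   /* last=1 on the final observation */
  if last then call symputx("n_students", _n_);  /* store the row count */
run;

%put Number of students: &n_students;   /* writes the value to the log */

proc print data=sashelp.class (obs=3);
  title "First 3 of &n_students students";
run;
```

CALL SYMPUTX works like CALL SYMPUT but also trims blanks, which keeps numeric values tidy inside titles.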
