Statistical Analysis System (SAS)

Upload: annabelle-lane

Post on 03-Jan-2016


SAS (Statistical Analysis System)

It was developed by James Goodnight.

1970: it was a package
1980: a language
1990: software

SAS

Technical      Techno-Functional      Functional
SAS/BASE       SAS/Warehouse Admin    SAS/STAT
SAS/MACROS     SAS/ETL Studio         SAS/GRAPH
SAS/ACCESS     SAS/OLAP               SAS/OR
SAS/AF

Domains in which SAS can be used:

CLINICAL

BANKING

INSURANCE

INTRODUCTION TO THE SAS SYSTEM

SAS is an integrated system of software solutions that enables you to perform the following tasks:

- data entry, retrieval, and management
- report writing and graphics design
- statistical and mathematical analysis
- business forecasting and decision support
- applications development

Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer.

Accessing Data
You can access data that is stored almost anywhere, whether it is in a file on your system or in another database system, and in almost any format, including raw data, SAS data sets, and files created by other vendors' software.

Managing Data
After you have accessed your data, you can use the SAS programming language to manipulate it:

- format your data
- create variables (columns)
- use operators to evaluate data values
- use functions to create and recode data values
- subset data
- perform conditional processing
- merge a wide range of data sources
- create, retrieve, and update database information

Analyzing Data and Presenting Information
Once your data is in shape, you can use SAS to analyze data and produce reports. Your SAS output can range from a simple listing of a data set to customized reports of complex relationships.

Analysis
Base SAS provides powerful data analysis tools. For example, you can:

- produce tables, frequency counts, and cross-tabulation tables
- create a variety of charts and plots
- compute a variety of descriptive statistics, including the mean, sum, variance, standard deviation, and more
- compute correlations and other measures of association, as well as multi-way cross-tabulations and inferential statistics

Presentation
For reporting and displaying analytical results, SAS gives you an almost limitless number of visually appealing output formats, such as:

- an array of markup languages, including HTML4 and XML
- output that is formatted for a high-resolution printer, such as PostScript, PDF, and PCL files
- RTF
- color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

Five windows in SAS:

1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window

1. Editor window
The Editor window is where you write programs, which are saved with the extension .sas. You can type any number of programs in the Editor window and run them all at once or individually.

2. Output window
The results of a program are displayed in the Output window. Saved output has the extension .lst.

3. Log window
Any errors or warnings in a program are displayed in the Log window. It also displays the SAS license and version information and, for each data set created, the number of variables and the number of observations.

4. Results window
It lists the results of all the programs run from the Editor window. There is no file extension for the Results window.

5. Explorer window
Contains Libraries and My Computer.

SAS LANGUAGE

The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.

In SAS, you use these elements within one of two groups of SAS statements:

- DATA steps
- PROC steps

DATA STEP
A DATA step consists of a group of statements in the SAS language that can:

- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values

Syntax

DATA <data-set-name>;
   INPUT <var1> <var2> ... <varn>;
   CARDS;
data values
;
RUN;

E.g.

data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

SAS DATA SETS
A SAS data set consists of the following:

- descriptor information
- data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS
The figure shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots; to create SQL queries; and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

PROC <procedure-name>;
   statement-1;
   statement-2;
   statement-n;
RUN;

E.g.

proc print data=temp;
run;

proc sort data=temp out=samp;
   by name;
run;
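As a further sketch of how a PROC step consumes a data set, the following rebuilds the small temp data set and summarises it with two Base SAS procedures. The choice of PROC MEANS and PROC FREQ is only an illustration; they are not part of the original examples.

```sas
/* Sketch: build a small data set, then summarise it with two procedures. */
data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

/* Descriptive statistics for the numeric variable no */
proc means data=temp n mean min max;
   var no;
run;

/* Frequency count of each value of the character variable name */
proc freq data=temp;
   tables name;
run;
```

Each PROC step names its input with DATA= and ends with RUN, just like the PROC PRINT and PROC SORT steps.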

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

- By default, SAS reads every variable as numeric. To read character data, put the $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set is limited by the system configuration or the available hard disk space.
- In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.

Different Data bases

Best db storage

Text, Excel, Access, DB2, Oracle, Teradata

LIBRARIES

There are 2 ways of creating libraries:

1. Menu driven
2. Programming (coding)

1. Menu driven

Explorer > Right click > New

2. Through programming

In the Editor window:

LIBNAME <library-name> '<path>';

Example:

LIBNAME Hari "D:\Ganesh";

To delete a library:

Libname Guru clear;
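A short sketch of the full life cycle of a library may help: assign it, store a data set in it, list its contents, and clear it. The libref mylib and the folder C:\sasdata are assumptions for illustration; substitute any folder that exists and is writable on your system.

```sas
/* Assumption: the folder C:\sasdata exists and is writable. */
libname mylib 'C:\sasdata';

/* A two-level name (libref.dataset) stores the data set in that library. */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

/* List the members of the library without per-data-set detail. */
proc contents data=mylib._all_ nods;
run;

/* Clearing removes the libref only; the files stay on disk. */
libname mylib clear;
```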

RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.

DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- input buffer
- program data vector

INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred while the current observation was processed. It has two values:

i) _ERROR_=0 means that no error was encountered.

ii) _ERROR_=1 means that an error, such as invalid input data, was encountered.
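One way to watch the two automatic variables is to write them to the log on every iteration. The sketch below is illustrative only: the data set name check and the data values are made up, and the second record deliberately contains a non-numeric score.

```sas
data check;
   input name $ score;
   /* Write the automatic variables and the values just read to the log. */
   put _n_= _error_= name= score=;
   datalines;
hari 90
ravi abc
kiran 75
;
run;
```

In the log, _N_ counts up by one per record. For the record with the invalid score, SAS sets score to missing and _ERROR_ to 1; on the next iteration _ERROR_ is back to 0.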

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example above), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the input buffer and the program data vector after DATA step compilation.

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTS
Output SAS data set TOTAL_POINTS: 1st observation

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase, SAS:

Checks code for errors

Translates code to machine code

Establishes an area of memory called the input buffer, if reading raw data

Establishes an area of memory called the program data vector

Assigns required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

Initializes the PDV to missing

Reads data values into the PDV

Carries out assignment statements and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the DATA step

Returns to the top of the DATA step

Initializes any variables that are not read from SAS data sets to missing

Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. _N_ is incremented at the beginning of each iteration of the DATA step. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
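The retained running total can be seen directly in the log with a small sketch. The data set name running and the variable names points and total are illustrative, not taken from the slides.

```sas
data running;
   input points;
   /* Sum statement: total starts at 0 and is retained across iterations. */
   total + points;
   put _n_= points= total=;
   datalines;
5
7
9
;
run;
```

Here points is reset to missing at the start of each iteration and then re-read, while total keeps its value and grows to 5, 12, and 21.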

Word Scanner

A submitted program is first placed in the input stack:

data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

The word scanner takes tokens from the input stack and passes them to the compiler. Once the word scanner has processed the first statement, data temp has moved to the compiler, and the rest of the program is still in the input stack.

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. Tokens fall into these classes:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =

What Are the SAS Language Elements?

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format
A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32
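To see how the choice of format changes only the written form of a value, the following sketch writes one number and one date with several Base SAS formats. The date value is arbitrary, and DATA _NULL_ is used so that no data set is created.

```sas
data _null_;
   amount = 1145.32;
   bdate  = '21OCT1984'd;    /* SAS date literal, stored as a number */

   put amount dollar10.2;    /* currency symbol and commas */
   put amount comma10.2;     /* commas only */
   put amount 8.1;           /* plain number, one decimal place */
   put bdate date9.;         /* 21OCT1984 */
   put bdate ddmmyy10.;      /* 21/10/1984 */
run;
```

The stored values do not change; each PUT statement only controls how the value appears in the log.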

Informats

Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
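The COMMA11. case described above can be sketched as follows. The data set name sales, the variable names, and the data values are made up for illustration; the colon modifier tells SAS to read up to the next blank while still applying the informat.

```sas
data sales;
   /* :comma11. strips the dollar sign and commas before storing the number */
   input item $ amount :comma11.;
   datalines;
car $1,000,000
pen $12
;
run;

/* Print with a DOLLAR format to confirm the values were stored as numbers. */
proc print data=sales;
   format amount dollar12.;
run;
```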

Functions

Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name(argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
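A minimal runnable sketch of function calls inside a DATA step follows. The values of cash and credit are made up, and ROUND and UPCASE are added only to show that functions can take numeric or character arguments.

```sas
data _null_;
   cash   = 1200;
   credit = 1500;

   x = max(cash, credit);      /* largest argument: 1500 */
   y = sqrt(1500);             /* square root */
   z = round(y, 0.01);         /* y rounded to two decimals: 38.73 */
   name = upcase('hari');      /* character function: HARI */

   put x= y= z= name=;
run;
```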

Statements

Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of statements include INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, and RETAIN.

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, then the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, then the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form of a DATA step that reads raw data from an external file.

data <data-set-name>;
   infile '<path>\<filename>.<extension>';
   input <var1> <var2> ...;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data temp;
   set weight_club;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. The most commonly used options are described below.

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations: OBS=n makes observation n the last one that is processed.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
   set weight_club (obs=4 firstobs=2);
run;
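Because OBS= names the last observation to process rather than a count, combining it with FIRSTOBS= selects a range. The sketch below uses a made-up data set numbers with ten observations so that the effect is easy to check.

```sas
/* Ten observations with i = 1, 2, ..., 10 */
data numbers;
   do i = 1 to 10;
      output;
   end;
run;

/* Start at observation 2 and stop after observation 4. */
data middle;
   set numbers (firstobs=2 obs=4);
run;

proc print data=middle;
run;
```

The data set middle holds three observations (i = 2, 3, 4), not four.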

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
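IN= is most often used when combining data sets. The following sketch, with made-up data sets members and payments and made-up flag names inMem and inPay, keeps only the ids that appear in both inputs.

```sas
data members;
   input id name $;
   datalines;
1 hari
2 ravi
3 kiran
;
run;

data payments;
   input id amount;
   datalines;
1 500
3 250
4 100
;
run;

/* Both inputs are already in id order, as a MERGE with BY requires. */
data matched;
   merge members (in=inMem) payments (in=inPay);
   by id;
   if inMem and inPay;    /* keep ids found in both data sets */
run;
```

The IN= variables exist only during the DATA step and are not written to the output data set; matched ends up with ids 1 and 3.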

DATA STEP STATEMENTS

DATA Statement
Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
   set weight_club;
   put Name;
run;

CARDS Statement
Indicates that data lines follow.

DATALINES Statement (New version)
Indicates that data lines follow.

Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
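A minimal sketch of such a step might look like this. The data set name person and the data values are made up for illustration; only the variable names NAME and DEPT come from the description.

```sas
data person;
   /* Two character variables read with list input */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```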

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement
Associates formats with variables.

INFORMAT Statement
Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List INPUT METHOD

2) Column INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) List INPUT METHOD

In this method, the data values should be separated by at least a single space.

E.g.: same as the earlier examples.

2) Column INPUT METHOD

In this method, each variable is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by a width and a period, which specify how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method, column pointer controls (@n and +n) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   infile datalines dsd delimiter='|';   /* the delimiter character is missing in the source; '|' is an assumption */
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;

Thanks

Feedback at info@Sitworld.in


SAS (Statistical Analysis System)

It was developed by James Good knight

1970-It was package1980-Language 1990-Software

SAS

Technical Techno Functional Functional

| | |

SASBASE SASWare house admin SASStat

SASMACROrsquoS SASETL Studio SASGraph

SASACCESS SASOLAP SASOR

SASAF

In which domains SAS can be used

CLINICAL

BANKING

INSURANCE

INTRODUCTION TO THE SAS SYSTEM

SAS is an integrated system of software solutions that enables you to perform thefollowing tasks

data entry retrieval and managementreport writing and graphics designstatistical and mathematical analysisbusiness forecasting and decision supportapplications development

Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer

Accessing Datayou can access data that is stored almost anywhere whether it is in a file on your system or data that is stored another database system

In almost any format including raw data SAS data sets and files created by other vendors software

Managing Data After you have accessed your data you can use the SAS programming language to manipulate it

Format your data create variables (columns) use operators to evaluate data values use functions to create and recode data values subset data perform conditional processingmerge a wide range of data sources create retrieve and update database information

Analyzing Data and Presenting Information Once your data is in shape you can use SAS to analyze data and produce reports Your SAS output can range from a simple listing of a data set to customized reports of complex relationships

Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables

create a variety of charts and plots

compute a variety of descriptive statistics including the mean sum variance standard deviation and more

compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics

Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

Five windows in SAS

1Editor window2Output window3Log window4Result window5Explorer window

1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually

2OUT PUT window Results of program will be displayed in output window which has an extension of LIST

3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations

4Result windowIt displays result of all the programs in editor windowNo extension for result window

5Explorer windowContains Libraries and Mycomputer

SAS LANGUAGE

The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages

In SAS you use these elements within one of two groups of SAS statements

DATA stepsPROC steps

DATA STEPA DATA step consists of a group of statements in the SAS language that can

read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values

Syntax

DATA ltDATA Set Namegt

INPUT ltvar1gtltvar2gthellipltvarngt

CARDS

Data valuesRUN

EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run

SAS DATA SETSA SAS data set consists of the following-descriptor information-data values

The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic

SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program

PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files

PROCEDURE STEP BLOCK

SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun

EG

proc print data=temp

Run

proc sort data=temp out=sampby namerun

Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- INPUT BUFFER
- PROGRAM DATA VECTOR

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

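The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during the current iteration. It has two possible values:

i) _ERROR_=0 means that no error has been encountered.

ii) _ERROR_=1 means that an error, such as an input data error, a conversion error, or a math error, has been encountered.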
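
The two automatic variables can be watched in the log with a PUT statement. The sketch below is an illustration only; the data set name check and the data values are assumptions made for this example.

```sas
/* Hypothetical example: the second data line is not a valid number. */
data check;
   input x;
   /* Write the automatic variables and x to the log on every iteration */
   put _N_= _ERROR_= x=;
   datalines;
10
abc
30
;
run;
```

On the second iteration SAS cannot read abc as a number, so it sets x to missing and _ERROR_ to 1. On the third iteration _ERROR_ is back to 0, while _N_ keeps counting. Neither automatic variable is written to the output data set.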

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: the input buffer and the program data vector after DATA step compilation]

[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]

[Figure: Program Data Vector with Computed Value of the Sum Statement]

Writing an Observation to the SAS Data Set

[Figure: The First Observation in Data Set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase, SAS:

- Checks the code for errors
- Translates the code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

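Word Scanner

[Figure: the input stack feeds the word scanner, which passes tokens on to the compiler. At the start, the input stack holds the whole program:]

data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

[Figure: after tokenization begins, the tokens DATA and temp have been taken from the input stack by the word scanner, and the rest of the program is still in the input stack]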

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that has special meaning to SAS. Examples of special characters include + - $ ( ) & =
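
To make the token types concrete, the sketch below marks the tokens of three statements in comments. The data set temp and its variables are assumptions made for this illustration.

```sas
/* Hypothetical data set used only to give the statements something to run on */
data temp;
   input x1-x3 z;
   datalines;
1 2 3 4
;
run;

title "Listing of temp";   /* 3 tokens: title | "Listing of temp" | ;       */

proc print data=temp;      /* 6 tokens: proc | print | data | = | temp | ;  */
   var x1-x3 z;            /* 6 tokens: var | x1 | - | x3 | z | ;           */
run;
```

The quoted title text is a single literal token, the words are name tokens, and the equal sign, hyphen, and semicolon are special tokens.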

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   /* The informat is declared before INPUT so that it applies when edate is read */
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
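
The two function calls above are fragments. A minimal runnable sketch, assuming made-up values for cash and credit and a hypothetical data set name fn_demo, could look like this:

```sas
/* Hypothetical values chosen only to exercise the two functions */
data fn_demo;
   cash   = 1200;
   credit = 1500;
   larger = max(cash, credit);   /* the larger of the two arguments: 1500 */
   root   = sqrt(1500);          /* square root of the argument           */
   put larger= root=;            /* write both results to the log         */
run;
```

Each function takes its arguments in parentheses, separated by commas, and returns a single value that is assigned to a variable.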

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of DATA step statements: INPUT (list), PUT, DATALINES, DO, iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard format, the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in a standard format, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.

Numeric Informats

Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
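
The numeric informats in the table can be tried out with a short DATA step. This is a sketch under assumptions: the data set name pay, the variable names, and the second data line are invented for the illustration.

```sas
/* Hypothetical data: salary contains commas, bonus contains a dollar sign */
data pay;
   /* The colon lets SAS scan to the next blank and then apply the informat */
   input name $ salary :comma6. bonus :dollar6.;
   /* Formats control how the stored numbers are written back out */
   format salary comma6. bonus dollar6.;
   datalines;
hari 25,000 $3,000
ravi 31,500 $1,250
;
run;

proc print data=pay;
run;
```

The informats strip the commas and dollar signs so that plain numbers are stored, and the formats put them back when the values are printed.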

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form for reading raw data that is stored in an external file.

data <dataset name>;
   infile 'your-input-file-path\filename.extension';
   input <var 1> <var 2> ... ;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data Temp;
   set weight_club;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. The most commonly used options are described here.

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation that SAS processes.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. In this example SAS reads observations 2 through 4.

data s1;
   set weight_club (firstobs=2 obs=4);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name="David Shaw"));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
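
IN= is the only option in this list without an example, so here is a minimal sketch of how it is commonly used in a merge. The data sets club_ids and paid_ids, the flag names in_club and in_paid, and all data values are assumptions made for this illustration.

```sas
/* Two small hypothetical data sets, both already in IdNumber order */
data club_ids;
   input IdNumber;
   datalines;
1023
1049
1219
;
run;

data paid_ids;
   input IdNumber;
   datalines;
1049
1219
1300
;
run;

data status;
   merge club_ids (in=in_club) paid_ids (in=in_paid);
   by IdNumber;
   /* IN= variables are temporary, so copy them to keep them in the output */
   member = in_club;   /* 1 if club_ids contributed to this observation */
   paid   = in_paid;   /* 1 if paid_ids contributed to this observation */
run;
```

In the result, IdNumber 1023 has member=1 and paid=0, 1049 and 1219 have both flags set to 1, and 1300 has member=0 and paid=1.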

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (new version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
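
The slide text describes the example without showing it. A minimal sketch that matches the description might look like the following; the data set name person and the data values are assumptions.

```sas
/* Hypothetical data: each line supplies one NAME and one DEPT value */
data person;
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

The semicolon on the line after the data marks the end of the data lines, and each data line becomes one observation.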

DELETE Statement

Stops processing the current observation. SAS does not write the observation to the output data set and returns to the top of the DATA step.

if Team="red" then delete;
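
Placed in a complete DATA step, the DELETE statement can be sketched as follows. The data sets teams and no_red and their values are made up for this illustration.

```sas
/* Hypothetical input data */
data teams;
   input Name $ Team $;
   datalines;
David red
Amelia yellow
Alan red
Ravi yellow
;
run;

data no_red;
   set teams;
   /* For red-team rows the step stops here and nothing is written out */
   if Team="red" then delete;
run;
```

The data set no_red keeps only the two yellow-team observations.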

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   /* The informat is declared before INPUT so that it applies when hiredate is read */
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
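
The FORMAT statement has no example of its own in the slides. A sketch that uses FORMAT and INFORMAT together, assuming a hypothetical data set hires with a made-up salary column, could look like this:

```sas
/* Hypothetical data: dates are typed as mm/dd/yy, salaries as plain numbers */
data hires;
   informat hiredate mmddyy8.;               /* how to read hiredate           */
   format hiredate date9. salary dollar8.;   /* how to display both variables  */
   input ename $ hiredate salary;
   datalines;
hari 12/01/05 25000
ravi 11/03/06 31000
;
run;

proc print data=hires;
run;
```

The informat affects only how the raw text is converted into a stored value, while the formats affect only how the stored values are displayed.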

DATALINES4 Statement (or CARDS4)

Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name="teamname";
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

These statements show how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List INPUT METHOD

2) Column INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) List INPUT METHOD

In this method, the data values must be separated by at least one blank space.

E.g.: as in the previous examples.
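
For reference, a minimal list-input sketch is shown here. The data set name list_demo and its values are assumptions made for the illustration.

```sas
/* Hypothetical data: values are separated by one or more blanks */
data list_demo;
   input name $ age weight;
   datalines;
hari 25 61.5
ravi   31   70
;
run;
```

Extra blanks between values do not matter, but with simple list input a character value cannot contain a blank and is stored with a default length of 8.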

2) Column INPUT METHOD

In this method, each value is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, an informat (a width followed by a period) after each variable specifies how many columns to read for that variable.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method, column pointer controls (@n) give the exact starting column of each data value.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step, so that one data line can supply several observations.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

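Reading delimited data with the DSD and DELIMITER= options of the INFILE statement:

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   /* Any delimiter character can be named here; '|' is assumed for illustration */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;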

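Reading records that have fewer values than variables with the MISSOVER option:

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;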

Thanks

Feedback at infoSitworldin


SAS

Technical Techno Functional Functional

| | |

SASBASE SASWare house admin SASStat

SASMACROrsquoS SASETL Studio SASGraph

SASACCESS SASOLAP SASOR

SASAF

In which domains SAS can be used

CLINICAL

BANKING

INSURANCE

INTRODUCTION TO THE SAS SYSTEM

SAS is an integrated system of software solutions that enables you to perform thefollowing tasks

data entry retrieval and managementreport writing and graphics designstatistical and mathematical analysisbusiness forecasting and decision supportapplications development

Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer

Accessing Datayou can access data that is stored almost anywhere whether it is in a file on your system or data that is stored another database system

In almost any format including raw data SAS data sets and files created by other vendors software

Managing Data After you have accessed your data you can use the SAS programming language to manipulate it

Format your data create variables (columns) use operators to evaluate data values use functions to create and recode data values subset data perform conditional processingmerge a wide range of data sources create retrieve and update database information

Analyzing Data and Presenting Information Once your data is in shape you can use SAS to analyze data and produce reports Your SAS output can range from a simple listing of a data set to customized reports of complex relationships

Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables

create a variety of charts and plots

compute a variety of descriptive statistics including the mean sum variance standard deviation and more

compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics

Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

Five windows in SAS

1Editor window2Output window3Log window4Result window5Explorer window

1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually

2OUT PUT window Results of program will be displayed in output window which has an extension of LIST

3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations

4Result windowIt displays result of all the programs in editor windowNo extension for result window

5Explorer windowContains Libraries and Mycomputer

SAS LANGUAGE

The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages

In SAS you use these elements within one of two groups of SAS statements

DATA stepsPROC steps

DATA STEPA DATA step consists of a group of statements in the SAS language that can

read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values

Syntax

DATA ltDATA Set Namegt

INPUT ltvar1gtltvar2gthellipltvarngt

CARDS

Data valuesRUN

EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run

SAS DATA SETSA SAS data set consists of the following-descriptor information-data values

The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic

SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program

PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files

PROCEDURE STEP BLOCK

SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun

EG

proc print data=temp

Run

proc sort data=temp out=sampby namerun

Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.

Numeric Informats

Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
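To see how an informat and a format from the tables work together, the following sketch reads 25,000 with COMMA6. and displays it with a dollar format. The names pay and salary are hypothetical, and DOLLAR8. is chosen here only so that the dollar sign and comma fit in the output width.

```sas
data pay;
  input salary comma6.;      /* informat: reads 25,000 as the number 25000 */
  format salary dollar8.;    /* format: controls how the number is written */
  datalines;
25,000
;
run;

proc print data=pay;
run;
```

The stored value is numeric in both cases; only the way it is read and the way it is printed differ.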

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label
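The attributes listed above can all be set in one DATA step and then inspected with PROC CONTENTS. This is only a sketch: the data set attrs_demo, the variables Name and Joined, and the data value are invented for illustration.

```sas
data attrs_demo;
  length Name $ 20;                 /* type (character) and length */
  informat Joined date9.;           /* informat used when reading */
  format Joined ddmmyy10.;          /* format used when writing */
  label Joined = 'Date joined';     /* descriptive label */
  input Name $ Joined;
  datalines;
Hari 22jan1989
;
run;

/* Lists name, type, length, format, informat and label of each variable */
proc contents data=attrs_demo;
run;
```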

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and the data lines must occur last in the DATA step statements.

data weight_club;
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
  datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data Temp;
  set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The code that follows shows the general form when the raw data is stored in an external file:

data <dataset name>;
  infile 'your-input-file path\filename.extension';
  input <var 1> <var 2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
  set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as the following.

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
  set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
  set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation that SAS processes.

data s1;
  set weight_club (obs=3);   /* reads observations 1 through 3 */
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
  set weight_club (firstobs=2 obs=4);   /* reads observations 2 through 4 */
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
  set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
  set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
  set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
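The IN= option is the only one in this list without an example, so here is a hedged sketch of a typical use in a MERGE. The goals data set, its GoalWeight variable, and the flag names inClub and inGoal are hypothetical; weight_club is the data set created earlier.

```sas
/* Hypothetical second data set: goal weights for some members */
data goals;
  input IdNumber GoalWeight;
  datalines;
1023 160
1219 185
;
run;

/* MERGE with BY needs both inputs sorted by the BY variable */
proc sort data=weight_club out=club_sorted;
  by IdNumber;
run;

data with_goals;
  merge club_sorted (in=inClub) goals (in=inGoal);
  by IdNumber;
  /* inClub and inGoal are temporary 1/0 flags; they are not written out */
  if inClub and inGoal;    /* keep only members found in both data sets */
run;
```

The subsetting IF keeps an observation only when both data sets contributed to it, which is the usual way to turn a merge into a match-only join.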

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
  set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
  set weight_club;
  put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

With the DATALINES statement, SAS can read a data line and assign values to, for instance, two character variables, NAME and DEPT, for each observation in the DATA step.
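The slide describes a DATALINES example with the variables NAME and DEPT but does not show the code. A minimal sketch of what such a step could look like is given here; the data set name person and the data values are assumptions.

```sas
data person;
  input name $ dept $;    /* two character variables read with list input */
  datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line produces one observation, and the lone semicolon after the last data line ends the data.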

DELETE Statement

Stops processing the current observation. The observation is not written to the output data set.

if Team="red" then delete;
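For context, the DELETE statement above could be placed in a DATA step such as the following sketch, which reuses the weight_club data set from earlier. The output data set name not_red is illustrative.

```sas
data not_red;
  set weight_club;
  /* observations on the red team are dropped; all others are output */
  if Team = "red" then delete;
run;
```

With the five participants shown earlier, only the two yellow-team members remain in not_red.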

FORMAT Statement

Associates formats with variables.
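A short sketch of the FORMAT statement in use, again based on the weight_club data set. The data set name club_loss and the computed variable Loss are assumptions added for illustration.

```sas
data club_loss;
  set weight_club;
  Loss = StartWeight - EndWeight;          /* pounds lost during the program */
  format StartWeight EndWeight Loss 5.1;   /* print with one decimal place */
run;

proc print data=club_loss;
run;
```

Because the format is assigned in the DATA step, it is stored with the data set and is used by later procedures unless they override it.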

INFORMAT Statement

Associates informats with variables.

data two1;
  informat hiredate mmddyy8.;
  input ename $ eid hiredate;
  datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.

data biblio;
  input number citation $50.;
  datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
  set weight_club;
  keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
  set weight_club;
  label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
  input firstname $ lastname $ n1 n2;
  length name $25;   /* name is declared with 25 bytes but not assigned here */
  datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step fragment demonstrates how to read input data records with column input:

data scores;
  input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List input method

2) Column input method

3) Named input method

4) Formatted input method

5) Absolute input method

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one space.

E.g. as in the earlier examples that use a plain INPUT statement.
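Since the list input method is shown without code of its own, here is a minimal sketch. The data set name list_demo and the values are illustrative only.

```sas
data list_demo;
  /* list input: variables are named in order, values are separated by blanks */
  input name $ no;
  datalines;
hari 102
ravi 104
ganesh 105
;
run;

proc print data=list_demo;
run;
```

With plain list input a character value cannot contain blanks and is stored with a default length of 8 bytes, which is one reason the other input methods exist.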

2) COLUMN INPUT METHOD

In this method, character values can be longer than 8 characters and can contain embedded blanks, because each variable is read from fixed columns.

data temp;
  input id 1-3 name $ 7-18 age 21-22;
  cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
  input id= name= $ age=;
  cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable on every data line.

data one;
  input id 3. name $11. age 3.;
  datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method, column pointer controls (@n and +n) give the exact location of the data values.

data two;
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing at sign (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

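data num;
  infile datalines dsd;   /* DSD: values are separated by commas by default */
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
  /* the delimiter character is missing from the source; a colon is assumed here */
  infile datalines dsd delimiter=':';
  input X Y Z;
  datalines;
1:2:3
4:5:6
7:8:9
;
run;

data weather;
  infile datalines missover;   /* short lines leave the remaining variables missing */
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;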

Thanks

Feedback at info@Sitworld.in





INTRODUCTION TO THE SAS SYSTEM

SAS is an integrated system of software solutions that enables you to perform the following tasks:

- data entry, retrieval, and management
- report writing and graphics design
- statistical and mathematical analysis
- business forecasting and decision support
- applications development

Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer.

Accessing Data

You can access data that is stored almost anywhere, whether it is in a file on your system or in another database system, and in almost any format, including raw data, SAS data sets, and files created by other vendors' software.

Managing Data

After you have accessed your data, you can use the SAS programming language to manipulate it. You can:

- format your data
- create variables (columns)
- use operators to evaluate data values
- use functions to create and recode data values
- subset data
- perform conditional processing
- merge a wide range of data sources
- create, retrieve, and update database information

Analyzing Data and Presenting Information

Once your data is in shape, you can use SAS to analyze data and produce reports. Your SAS output can range from a simple listing of a data set to customized reports of complex relationships.

Analysis

Base SAS provides powerful data analysis tools. For example, you can:

- produce tables, frequency counts, and cross-tabulation tables
- create a variety of charts and plots
- compute a variety of descriptive statistics, including the mean, sum, variance, standard deviation, and more
- compute correlations and other measures of association, as well as multi-way cross-tabulations and inferential statistics

Presentation

For reporting and displaying analytical results, SAS gives you an almost limitless number of visually appealing output formats, such as:

- an array of markup languages, including HTML4 and XML
- output that is formatted for a high-resolution printer, such as PostScript, PDF, and PCL files
- RTF
- color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

Five windows in SAS:

1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window

1. Editor window: The Editor window holds the programs, which are saved with the extension .sas. We can type any number of programs in the Editor window and execute them all at once or individually.

2. Output window: The results of a program are displayed in the Output window; saved output has the extension .lst.

3. Log window: If there are any errors or warnings in the program, those messages are displayed in the Log window. It also displays the SAS license and version information and, for each data set, the number of variables and the number of observations.

4. Results window: It displays the results of all the programs run from the Editor window. There is no extension for the Results window.

5. Explorer window: Contains Libraries and My Computer.

SAS LANGUAGE

The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.

In SAS, you use these elements within one of two groups of SAS statements:

- DATA steps
- PROC steps

DATA STEP

A DATA step consists of a group of statements in the SAS language that can:

- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values

Syntax

DATA <data-set-name>;
INPUT <var1> <var2> ... <varn>;
CARDS;
data values
;
RUN;

EG

data temp;
input name $ no;
datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

SAS DATA SETS

A SAS data set consists of the following:

- descriptor information
- data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS

The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP

Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

PROC <procedure-name>;
statement 1;
statement 2;
statement n;
RUN;

EG

proc print data=temp;
run;

proc sort data=temp out=samp;
by name;
run;
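As a minimal sketch of how PROC steps produce statistics and tables from a data set, the following program rebuilds the small temp data set and runs two standard Base SAS procedures on it. The choice of PROC MEANS and PROC FREQ is only an illustration; the slides do not prescribe them.

```sas
/* Small data set used only for this illustration */
data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

/* Descriptive statistics for the numeric variable NO */
proc means data=temp n mean min max;
   var no;
run;

/* Frequency count of each NAME value */
proc freq data=temp;
   tables name;
run;
```

Each PROC step ends with its own RUN statement, so the two reports are produced independently of each other.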

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

- In an INPUT statement, SAS reads every variable as numeric by default. To read character data, put a $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set depends on the system configuration or hard disk space.
- In output, a missing numeric value is shown as a period (.) and a missing character value is shown as a blank space.

Different Databases

Best db storage

- Text
- Excel
- Access
- DB2
- Oracle
- Teradata

LIBRARIES

There are 2 ways of creating libraries:

1. Menu driven
2. Programming (coding)

1. Menu driven

Explorer > Right click > New

2. Through programming

In the Editor window:

LIBNAME <library-name> '<path>';

Example

LIBNAME Hari "D:\Ganesh";

To delete a library (clear the library reference):

LIBNAME Guru CLEAR;
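The following sketch shows the whole life cycle of a library in code: assign it, store a data set in it, and clear it. The libref mylib and the folder C:\sasdata are placeholders chosen for this illustration, and the folder is assumed to exist already.

```sas
/* Assign the libref MYLIB to a folder (placeholder path, must already exist) */
libname mylib "C:\sasdata";

/* A two-level name (libref.dataset) stores the data set permanently in that folder */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

proc print data=mylib.temp;
run;

/* Disassociate the libref; the files in the folder are not deleted */
libname mylib clear;
```

A data set created with a one-level name such as temp goes to the temporary WORK library instead and is removed at the end of the session.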

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
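The rules above can be seen together in a short program. The data set and variable names here are made up for the illustration.

```sas
/* Mixed case is accepted, and two statements may share one line */
DATA _scores2024; Input Student_Name $ test_1;
datalines;
anil 78
bina 91
;
run;

/* One statement may continue across several lines, as long as no word is split */
proc print
   data=_scores2024;
run;

/* Names such as 2024scores (starts with a digit) or student name (contains a blank)
   would break the naming rules and are rejected by SAS */
```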

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- input buffer
- program data vector

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_: counts the number of times the DATA step has begun to iterate.

2) _ERROR_: signals whether an error occurred during the current iteration. It takes one of 2 values:

i) _ERROR_=0 means no error was encountered

ii) _ERROR_=1 means an error, such as invalid input data, was encountered
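A small sketch can make the two automatic variables visible. The data set name check and the data values are invented for this illustration; the second data line deliberately contains a non-numeric score.

```sas
data check;
   input name $ score;
   /* Write the automatic variables to the log for every iteration */
   put _n_= _error_= name= score=;
   datalines;
asha 71
ben abc
cara 88
;
run;
```

In the log, _N_ should count up by one per iteration, and _ERROR_ should be 1 only for the iteration that read the invalid value abc, because SAS resets it to 0 at the start of each iteration. Neither variable is written to the output data set.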

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

Figure: Input buffer and program data vector after DATA step compilation

Figure: Position of the pointer in the input buffer before SAS reads data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

Figure: Position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Figure: Program data vector with computed value of the Sum statement

Figure: Writing an observation to the SAS data set

Figure: The first observation in data set TOTAL_POINTS

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
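One way to watch the PDV change from iteration to iteration is to dump it to the log with PUT _ALL_. This is a sketch based on the first three records of the TOTAL_POINTS example; the PUT statement is an addition made for illustration.

```sas
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   /* Sum statement: TeamTotal starts at 0 and is retained across iterations */
   TeamTotal + (Event1 + Event2 + Event3);
   /* Write every variable in the PDV, including _N_ and _ERROR_, to the log */
   put _all_;
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

With these three records the running TeamTotal is 22, then 46, then 67, while the variables read by INPUT are replaced on every iteration.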

Compilation

During the compilation phase, SAS:

- checks the code for errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector
- assigns required attributes to variables
- creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

Word Scanner, Compiler, and Input Stack

When the following program is submitted, it is placed in the input stack:

data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

The word scanner takes the text from the input stack one token at a time, starting with data temp, and passes it to the compiler.

"The process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
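As a rough illustration of tokenization, the comment in the sketch below lists the tokens that the word scanner would be expected to produce for one assignment statement. The variable names are invented for the example.

```sas
/* The assignment statement below is split into six tokens:
     total   name
     =       special
     price   name
     *       special
     1.08    number
     ;       special
*/
data _null_;
   price = 50;
   total = price * 1.08;
   put total=;
run;
```

Blanks separate tokens but are not tokens themselves, which is why spacing inside a statement is free.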

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

- renaming variables
- selecting only the first or last n observations for processing
- dropping variables from processing or from the output data set
- specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
amount=1145.32;
put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
input ename $ edate;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 101007
;
run;
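The difference between an informat (used when reading) and a format (used when writing) can be sketched with the $1,000,000 value from the text. The variable name amount and the particular output formats are choices made for this illustration.

```sas
data _null_;
   /* COMMA11. strips the dollar sign and commas while reading */
   input amount comma11.;
   /* The stored value is a plain number; formats only change how it is written */
   put 'Unformatted: ' amount;
   put 'DOLLAR12.:   ' amount dollar12.;
   put 'COMMA12.2:   ' amount comma12.2;
   datalines;
$1,000,000
;
run;
```

The three PUT statements write the same stored value in three different appearances to the log, which shows that a format never changes the value itself.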

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
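To try the function calls above, they can be placed in a DATA step that creates no data set. The values assigned to cash and credit, and the extra UPCASE call, are illustrative assumptions.

```sas
data _null_;
   cash = 250;
   credit = 400;
   x = max(cash, credit);      /* larger of the two arguments */
   y = sqrt(1500);             /* square root of a constant */
   z = upcase('sas');          /* a character function, for comparison */
   put x= y= z=;
run;
```

DATA _NULL_ is used here because only the log output is of interest.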

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

- INPUT (list)
- PUT
- DATALINES
- DO (iterative)
- DO UNTIL
- DO WHILE
- SELECT
- DROP
- MERGE
- SET
- FILE
- LENGTH
- Sum
- END
- OUTPUT
- KEEP
- DATA
- RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, then the data is called standard data.

Eg: 467

NON STANDARD DATA

If the data values are not in the standard format, then the data is called non-standard data.

Eg:

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format

12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format

25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
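A short sketch shows some of the informats from the tables in use. The data set name dates_demo, the variable names, and the colon modifier (which reads each value up to the next blank) are assumptions made for this illustration.

```sas
data dates_demo;
   /* The colon modifier applies the informat to a blank-delimited value */
   input d1 :ddmmyy8. d2 :date9. amt :comma6.;
   /* Without a format, a date prints as the number of days since 01JAN1960 */
   put d1= d2= amt=;
   /* With formats, the same stored values are written in readable form */
   put d1 ddmmyy10. +1 d2 date9. +1 amt dollar8.;
   datalines;
12-07-78 22jan1989 25,000
;
run;
```

The first PUT statement shows the raw stored numbers and the second shows the formatted values, so the two log lines can be compared side by side.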

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data temp;
set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset-name>;
infile 'your-input-file';   /* path\filename.extension */
input <var1> <var2> ...;
run;
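As a sketch of reading raw data from an external file with INFILE, the weight club example could look as follows. The file path is a placeholder invented for this illustration, and the file is assumed to hold the same column layout as the in-stream data.

```sas
/* Hypothetical path: replace it with the location of your own raw data file */
data weight_club;
   infile 'C:\mydata\weight_club.txt';
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

proc print data=weight_club;
run;
```

No DATALINES statement is needed, because the INFILE statement tells the INPUT statement where the records come from.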

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. In the following example, SAS processes observations 2 through 4.

data s1;
set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
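The IN= option is easiest to see in a merge. The sketch below uses two small data sets, members and payments, that are invented for this illustration and are already in id order.

```sas
data members;
   input id name $;
   datalines;
1 asha
2 ben
3 cara
;
run;

data payments;
   input id amount;
   datalines;
1 500
3 750
;
run;

data paid unpaid;
   merge members (in=in_members) payments (in=in_payments);
   by id;
   /* An IN= variable is 1 when that data set contributed to the current observation */
   if in_members and in_payments then output paid;
   else if in_members then output unpaid;
run;
```

The IN= variables exist only during the DATA step and are not written to the output data sets.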

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
set weight_club;
put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement: In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
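A minimal sketch of such a DATA step is shown here. The data set name person and the data values are illustrative assumptions, since the slide text does not list them.

```sas
data person;
   /* Both variables are character, so each name is followed by $ */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```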

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;
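In context, the DELETE statement might be used as in the following sketch. The data set name club, the shortened names, and the Loss variable are made up for the illustration.

```sas
data club;
   input Name $ Team $ StartWeight EndWeight;
   /* For red-team records, processing stops here and nothing is written */
   if Team = "red" then delete;
   /* Reached only for observations that were not deleted */
   Loss = StartWeight - EndWeight;
   datalines;
David red 189 165
Amelia yellow 145 124
Alan red 210 192
Ravi yellow 194 177
;
run;
```

Only the two yellow-team observations reach the output data set, because DELETE returns control to the top of the DATA step without writing the current observation.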

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
input ename $ eid hiredate;
informat hiredate mmddyy8.;
datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data lines end with a line of four semicolons.

data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL, 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
set weight_club;
keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
set weight_club;
label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
input firstname $ lastname $ n1 n2;
length name $25;
datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
input name $ 1-18 score1 25-27 score2 30-32;
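To make the column-input step runnable, it needs data lines whose values sit in the named columns. The records below are illustrative and are laid out so that the name falls in columns 1-18, score1 in columns 25-27, and score2 in columns 30-32.

```sas
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph                   11   32
Mitchel                  13   29
Sue Ellen                14   27
;
run;

proc print data=scores;
run;
```

Because the columns are fixed, a name such as Sue Ellen can contain a blank without being split into two values.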

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be separated by at least a single space.

EG: as in the earlier examples.
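As a minimal sketch of list input, the following step reads blank-separated values in the order in which the variables are named. The data set name emp and the values are invented for the illustration.

```sas
data emp;
   /* Values are matched to variables by position; one or more blanks separate them */
   input id name $ age;
   datalines;
101 shiva 38
102 ravi 35
;
run;
```

With plain list input, a character value cannot contain embedded blanks and is truncated to 8 characters unless a longer length is declared.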

2)Column INPUT METHOD

In this method each variable is read from fixed columns, so character data values can be longer than 8 characters and can also contain blank spaces.

data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3)NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4)FORMATTED INPUT METHOD

In this method each variable name is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.

data one;
input id 3. name $ 11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;



Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer

Accessing Datayou can access data that is stored almost anywhere whether it is in a file on your system or data that is stored another database system

In almost any format including raw data SAS data sets and files created by other vendors software

Managing Data After you have accessed your data you can use the SAS programming language to manipulate it

Format your data create variables (columns) use operators to evaluate data values use functions to create and recode data values subset data perform conditional processingmerge a wide range of data sources create retrieve and update database information

Analyzing Data and Presenting Information Once your data is in shape you can use SAS to analyze data and produce reports Your SAS output can range from a simple listing of a data set to customized reports of complex relationships

Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables

create a variety of charts and plots

compute a variety of descriptive statistics including the mean sum variance standard deviation and more

compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics

Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

Five windows in SAS

1Editor window2Output window3Log window4Result window5Explorer window

1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually

2OUT PUT window Results of program will be displayed in output window which has an extension of LIST

3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations

4Result windowIt displays result of all the programs in editor windowNo extension for result window

5Explorer windowContains Libraries and Mycomputer

SAS LANGUAGE

The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages

In SAS you use these elements within one of two groups of SAS statements

DATA stepsPROC steps

DATA STEPA DATA step consists of a group of statements in the SAS language that can

read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values

Syntax

DATA ltDATA Set Namegt

INPUT ltvar1gtltvar2gthellipltvarngt

CARDS

Data valuesRUN

EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run

SAS DATA SETSA SAS data set consists of the following-descriptor information-data values

The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic

SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program

PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files

PROCEDURE STEP BLOCK

SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun

EG

proc print data=temp

Run

proc sort data=temp out=sampby namerun

Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the input buffer and the program data vector after DATA step compilation.

Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

Figure: Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

Figure: The First Observation in Data Set TOTAL_POINTS (output SAS data set TOTAL_POINTS, 1st observation)

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

Checks code for errors

Translates code to machine code

Establishes an area of memory called the input buffer, if reading raw data

Establishes an area of memory called the program data vector

Assigns required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values into the PDV

Carries out assignment statements and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the DATA step

Returns to the top of the DATA step

Initializes any variables that are not read from SAS data sets to missing

Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. _N_ is incremented at the beginning of each iteration of the DATA step. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
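The iteration behavior described above can be observed in the log. The following is a minimal sketch, assuming a shortened copy of the sample data; it creates no data set (DATA _NULL_) and uses a PUT statement to print the automatic variables and the running total on each iteration.

```sas
data _null_;
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   /* Sum statement: TeamTotal is retained across iterations */
   TeamTotal + (Event1 + Event2 + Event3);
   /* Write the automatic variables and the running total to the log */
   put _N_= _ERROR_= ParticipantName= TeamTotal=;
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

With these three records, _N_ should run from 1 to 3 and TeamTotal should accumulate to 22, 46, and 67, while _ERROR_ stays 0 because every record is read cleanly.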

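Figure: Input stack, word scanner, and compiler. The input stack holds the submitted program:

Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

Figure: The same diagram after the first statement (Data temp) has moved out of the input stack toward the word scanner and compiler; the remaining statements are still in the input stack.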

"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It sorts what it finds into four classes of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
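As an illustration of the token classes, the comments in this small sketch label the tokens that the word scanner would be expected to produce for each statement. The variable name amount and the quoted text are made up for the example.

```sas
data _null_;
   /* Tokens: name (amount), special (=), number (25),
      special (+), number (10), special (;) */
   amount = 25 + 10;
   /* Tokens: name (put), literal ("Total is "), name (amount), special (;) */
   put "Total is " amount;
run;
```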

What Are the SAS Language Elements?

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

- $ indicates a character format; its absence indicates a numeric format.
- format names the format.
- w specifies the format width, which for most formats is the number of columns in the output data.
- d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
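A minimal sketch of that read, assuming the value sits at the start of an in-stream data line; the variable name amount is chosen only for illustration.

```sas
data _null_;
   /* COMMA11. strips the dollar sign and commas while reading */
   input amount comma11.;
   /* The stored value is an ordinary number */
   put amount=;
   datalines;
$1,000,000
;
run;
```

The log line written by PUT should show amount=1000000, confirming that only the numeric value is stored.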

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

- $ indicates a character informat; its absence indicates a numeric informat.
- informat names the informat.
- w specifies the informat width.
- d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

- option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard format, the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in a standard format, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.

Numeric Informats

Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
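To show how an informat and a format work together, here is a minimal sketch that reads two of the values from the tables above and writes them back. The variable names sale and hired are assumptions made for the example.

```sas
data _null_;
   /* Informats: read a comma-punctuated number, skip one blank, read a date */
   input sale comma6. +1 hired date7.;
   /* Formats: write the stored numeric values back in readable form */
   put sale= comma6. hired= date9.;
   datalines;
25,000 22Jan89
;
run;
```

The stored values are plain numbers (25000 and a SAS date value); the formats in the PUT statement only control how they are displayed.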

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data Temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <data set name>;
   infile 'your-input-file path/filename.extension';
   input <var 1> <var 2> ...;
run;
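As a concrete sketch of the INFILE approach, the step below reads the weight club records from a text file instead of in-stream data lines. The file path is hypothetical and assumes the file uses the same column layout as the in-stream example.

```sas
/* Hypothetical path: replace with the location of your own raw data file */
data weight_club;
   infile 'C:\mydata\weight_club.txt';
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

/* List the resulting data set to check what was read */
proc print data=weight_club;
run;
```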

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

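OBS=

Specifies when to stop processing observations, that is, the number of the last observation that SAS processes.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. In this example, SAS processes observations 2 through 4.

data s1;
   set weight_club (obs=4 firstobs=2);
run;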

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
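A minimal sketch of IN= in a match-merge follows. The data sets team_a and team_b, the flag variables inA and inB, and the data values are assumptions made for the example; both inputs are already in IdNumber order, as the BY statement requires.

```sas
data team_a;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
;
run;

data team_b;
   input IdNumber EndWeight;
   datalines;
1049 124
1219 192
;
run;

data both;
   /* inA and inB are 1 when that data set contributed to the observation */
   merge team_a (in=inA) team_b (in=inB);
   by IdNumber;
   /* Keep only the IDs that appear in both data sets */
   if inA and inB;
run;
```

Only IdNumber 1049 appears in both inputs, so the data set both should contain that single observation. The IN= variables themselves are temporary and are not written to the output data set.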

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
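A minimal sketch of such a DATA step follows; the data set name person and the data values are assumptions chosen for illustration.

```sas
data person;
   /* Both variables are character, so each name is followed by $ */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```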

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List INPUT METHOD

2) Column INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) List INPUT METHOD

In this method, the data values must be separated by at least one blank space.

EG

-Do-
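A minimal sketch of list input, using made-up names and numbers: the INPUT statement simply names the variables in order, and SAS reads each blank-separated value into the next variable.

```sas
data temp;
   /* List input: values are read in order, separated by blanks */
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
;
run;
```

With default list input, a character variable such as name holds at most 8 characters unless a LENGTH or INFORMAT statement says otherwise.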

2) Column INPUT METHOD

In this method, each value is read from a fixed range of columns, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable name is followed by an informat, a width ending in a period, that specifies how many columns to read for that variable.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method, the column pointer control @n moves the input pointer to an exact column, which gives the exact location of each data value.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

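data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
   /* assumption: the delimiter character is missing in the original text,
      so '/' is used here as an example */
   infile datalines dsd delimiter='/';
   input X Y Z;
   datalines;
1/2/3
4/5/6
7/8/9
;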

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin



The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library that contains the DBMS file. In this example, a libref is declared and points to a library that contains Oracle data. SAS then reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the listed variables from the input data set.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations: OBS=n names the last observation to process.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
   set weight_club (firstobs=2 obs=4);
run;
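OBS= gives the number of the last observation to read, not a count, so combining it with FIRSTOBS= selects a range. A minimal sketch, using a hypothetical five-row data set named FIVE:

```sas
/* Hypothetical input: five observations numbered 1 to 5 */
data five;
   input n;
   datalines;
1
2
3
4
5
;
run;

/* FIRSTOBS=2 and OBS=4 read observations 2 through 4, that is, three rows */
data middle;
   set five (firstobs=2 obs=4);
run;

proc print data=middle;
run;
```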

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
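IN= is most often used when combining data sets. The following is a minimal sketch with two hypothetical data sets, CLUB_A and CLUB_B, and hypothetical flag variables inA and inB; it keeps only the observations whose IdNumber appears in both inputs.

```sas
/* Two small hypothetical data sets, each already in IdNumber order */
data club_a;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
;
run;

data club_b;
   input IdNumber EndWeight;
   datalines;
1049 124
1219 192
;
run;

/* inA and inB are temporary 0/1 flags; they are not written to the output data set */
data both;
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   if inA and inB;   /* keep only IdNumber values found in both inputs */
run;

proc print data=both;
run;
```

With these values, only IdNumber 1049 is present in both data sets, so BOTH contains a single observation.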

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
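A minimal sketch of such a DATA step follows. The data set name PERSON and the data values are assumptions made for illustration; only the variable names NAME and DEPT come from the text above.

```sas
/* Each data line supplies one value for NAME and one for DEPT */
data person;
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```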

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL 1995; BRADY 1993
3 BERG 1990; ROA 1994; WILLIAMS 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
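With list input, a character variable gets a default length of 8 bytes, so a value such as Alexander would be cut to Alexande unless a longer length is declared before the INPUT statement. A minimal sketch, assuming a hypothetical data set name TESTLENGTH2 and a length of 15:

```sas
/* LENGTH comes before INPUT so that both character variables are created with 15 bytes */
data testlength2;
   length firstname lastname $ 15;
   input firstname $ lastname $ n1 n2;
   datalines;
Alexander Robinson 35 11
;
run;

proc print data=testlength2;
run;
```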

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List input method

2) Column input method

3) Named input method

4) Formatted input method

5) Absolute input method

1) List input method

In this method, the data values must be separated by at least one blank space.

E.g.: as in the previous examples (ditto).
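A minimal sketch of list input follows; the data set name SCORES_LIST and the values are assumptions made for illustration. Each value is separated from the next by a blank, and SAS assigns the values to the variables in the order given in the INPUT statement.

```sas
/* List input: values are read in order and must be separated by at least one blank */
data scores_list;
   input name $ score1 score2;
   datalines;
Riley 11 32
Joseph 14 28
;
run;

proc print data=scores_list;
run;
```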

2) Column input method

In this method, character data values can be longer than 8 characters and can contain embedded blanks, because each variable is read from a fixed range of columns.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) Named input method

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name=$ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) Formatted input method

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) Absolute input method

In this input method, column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns forward.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

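The DSD option in the INFILE statement reads delimited data (a comma is the default delimiter); the DELIMITER= option names other delimiter characters.

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
   infile datalines dsd delimiter='<delimiter characters>';
   input X Y Z;
   datalines;
<the same values, separated by the delimiter characters>
;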

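The MISSOVER option prevents the INPUT statement from reading the next record when a data line is short; the remaining variables are set to missing.

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;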

Thanks

Feedback at infoSitworldin


Managing Data

After you have accessed your data, you can use the SAS programming language to manipulate it:

- format your data
- create variables (columns)
- use operators to evaluate data values
- use functions to create and recode data values
- subset data
- perform conditional processing
- merge a wide range of data sources
- create, retrieve, and update database information

Analyzing Data and Presenting Information

Once your data is in shape, you can use SAS to analyze data and produce reports. Your SAS output can range from a simple listing of a data set to customized reports of complex relationships.

Analysis

Base SAS provides powerful data analysis tools. For example, you can:

- produce tables, frequency counts, and cross-tabulation tables
- create a variety of charts and plots
- compute a variety of descriptive statistics, including the mean, sum, variance, standard deviation, and more
- compute correlations and other measures of association, as well as multi-way cross-tabulations and inferential statistics
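Two of the analysis tasks listed above can be sketched with Base SAS procedures. This is a minimal illustration, assuming a hypothetical data set named CLUB that reuses the team and weight values of the weight-club example; PROC FREQ produces frequency counts and PROC MEANS computes descriptive statistics.

```sas
/* Hypothetical input data set for the two procedures below */
data club;
   input Team $ StartWeight EndWeight;
   datalines;
red 189 165
yellow 145 124
red 210 192
yellow 194 177
red 127 118
;
run;

/* Frequency count of observations per team */
proc freq data=club;
   tables Team;
run;

/* Mean, sum, variance, and standard deviation of the two weight variables */
proc means data=club mean sum var std;
   var StartWeight EndWeight;
run;
```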

Presentation

For reporting and displaying analytical results, SAS gives you an almost limitless number of visually appealing output formats, such as:

- an array of markup languages, including HTML4 and XML
- output that is formatted for a high-resolution printer, such as PostScript, PDF, and PCL files
- RTF
- color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

There are five windows in SAS:

1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window

1. Editor window: holds your programs, which are saved with the extension .sas. You can type any number of programs in the Editor window and run them all at once or individually.

2. Output window: displays the results of a program. Saved output has the extension .lst.

3. Log window: displays any errors or warnings in the program. It also displays the SAS license and version information, and the number of variables and observations.

4. Results window: lists the results of all the programs run from the Editor window. It has no file extension.

5. Explorer window: contains Libraries and My Computer.

SAS LANGUAGE

The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.

In SAS, you use these elements within one of two groups of SAS statements:

- DATA steps
- PROC steps

DATA STEP

A DATA step consists of a group of statements in the SAS language that can:

- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values

Syntax

DATA <data-set-name>;
   INPUT <var1> <var2> ... <varn>;
   CARDS;
data values
;
RUN;

E.g.

data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

SAS DATA SETS

A SAS data set consists of the following:

- descriptor information
- data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS

The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP

Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

PROC <procedure-name>;
   statement 1;
   statement 2;
   ...
   statement n;
RUN;

E.g.

proc print data=temp;
run;

proc sort data=temp out=samp;
   by name;
run;

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

- By default, SAS reads every variable named in an INPUT statement as numeric. To read character data, put the $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set depends on the system configuration or hard disk space.
- In output, SAS shows a period (.) for a missing numeric value and a blank space for a missing character value.
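The points about the $ symbol and missing values can be checked with a short DATA step. This is a minimal sketch with a hypothetical data set named TYPES_DEMO; in list input, a single period in the data marks a missing value for either type.

```sas
/* NAME is character because of the $; AGE is numeric by default */
data types_demo;
   input name $ age;
   datalines;
hari 25
ravi .
. 31
;
run;

/* The listing shows a period for the missing AGE and a blank for the missing NAME */
proc print data=types_demo;
run;
```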

Different Databases

Best db storage

- Text
- Excel
- Access
- DB2
- Oracle
- Teradata

LIBRARIES

There are 2 ways of creating libraries:

1. Menu driven
2. Programming (coding)

1. Menu driven

Explorer > right-click > New

2. Through programming

In the Editor window:

LIBNAME <name of library> '<path>';

Example

LIBNAME Hari 'D:\Ganesh';

To delete (deassign) a library

LIBNAME Guru clear;
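The following is a minimal sketch of assigning, using, and clearing a library through code. The libref MYLIB is a hypothetical name, and pointing it at the directory of the WORK library is only a device to keep the sketch runnable on any machine; in practice the path would be a folder of your own.

```sas
/* Hypothetical libref MYLIB, pointed at the WORK directory so that the path always exists */
libname mylib "%sysfunc(pathname(work))";

/* A two-level name (libref.dataset) stores the data set in that library */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

proc print data=mylib.temp;
run;

/* CLEAR removes the libref only; it does not delete the data set */
libname mylib clear;
```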

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- INPUT BUFFER
- PROGRAM DATA VECTOR

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has iterated.

2) _ERROR_ takes one of 2 values:

i) _ERROR_=0 means that no error has occurred

ii) _ERROR_=1 means that an error occurred while the current observation was processed
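Both automatic variables can be written to the log with a PUT statement. The following is a minimal sketch with hypothetical data; the second data line is deliberately not a number, so reading it into the numeric variable X is expected to set _ERROR_ to 1 for that iteration only.

```sas
/* _NULL_ runs the DATA step without creating a data set */
data _null_;
   input x;
   put _n_= x= _error_=;   /* write the iteration counter, the value, and the error flag to the log */
   datalines;
10
abc
30
;
run;
```

The log should show _N_ counting up by 1 on each iteration, and _ERROR_ returning to 0 on the third iteration, because SAS resets it at the start of every pass through the DATA step.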

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the input buffer and the program data vector after DATA step compilation.

Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

Figure: Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

Figure: The First Observation in Data Set TOTAL_POINTS

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
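The retained value of a sum statement can be watched in the log. This is a minimal sketch with a hypothetical data set RUNNING_TOTAL and a shortened version of the scores above; the PUT statement writes the running total after each record is read.

```sas
data running_total;
   input Name $ Event1 Event2 Event3;
   Total + (Event1 + Event2 + Event3);   /* sum statement: Total starts at 0 and is retained */
   put _n_= Name= Total=;                /* log the running total for each iteration */
   datalines;
Sue 6 8 8
Jane 9 7 8
John 7 7 7
;
run;
```

Because Total is not reset to missing between iterations, it accumulates to 22, then 46, then 67, while Event1 through Event3 are set to missing and read again on each pass.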

Compilation

During the compilation phase, SAS:

- checks the code for errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector (PDV)
- assigns required attributes to variables
- creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

Word Scanner

Figure: Input stack > Word scanner > Compiler

Input stack:

data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

The word scanner takes the code from the input stack piece by piece, starting with DATA and temp, and passes it on to the compiler.

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

- renaming variables
- selecting only the first or last n observations for processing
- dropping variables from processing or from the output data set
- specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
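The COMMA11. case described above can be sketched in a few lines. The variable name AMOUNT is an assumption made for illustration; the PUT statement writes the stored numeric value to the log.

```sas
/* COMMA11. reads 11 columns and strips the dollar sign and commas */
data _null_;
   input amount comma11.;
   put amount=;
   datalines;
$1,000,000
;
run;
```

The log line shows amount=1000000, the plain numeric value without the dollar sign or commas.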

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name(argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
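Function calls like the two above are used inside a DATA step. A minimal sketch, assuming hypothetical values for CASH and CREDIT and hypothetical result variables LARGER and ROOT:

```sas
data _null_;
   cash=120;                   /* hypothetical input values */
   credit=450;
   larger=max(cash, credit);   /* MAX returns the largest of its arguments */
   root=sqrt(1500);            /* SQRT returns the square root */
   put larger= root=;          /* write both results to the log */
run;
```

With these values, LARGER is 450 because CREDIT is the larger argument.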

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

Data values that are in the standard format, such as plain numbers, can be read without an informat. Such data is called standard data.

E.g. 467

NON STANDARD DATA

If data values are not in the standard format, for example dates or numbers that contain commas or currency symbols, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.



Analyzing Data and Presenting Information Once your data is in shape you can use SAS to analyze data and produce reports Your SAS output can range from a simple listing of a data set to customized reports of complex relationships

Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables

create a variety of charts and plots

compute a variety of descriptive statistics including the mean sum variance standard deviation and more

compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics

Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

Five windows in SAS

1Editor window2Output window3Log window4Result window5Explorer window

1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually

2OUT PUT window Results of program will be displayed in output window which has an extension of LIST

3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations

4Result windowIt displays result of all the programs in editor windowNo extension for result window

5Explorer windowContains Libraries and Mycomputer

SAS LANGUAGE

The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages

In SAS you use these elements within one of two groups of SAS statements

DATA stepsPROC steps

DATA STEPA DATA step consists of a group of statements in the SAS language that can

read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values

Syntax

DATA ltDATA Set Namegt

INPUT ltvar1gtltvar2gthellipltvarngt

CARDS

Data valuesRUN

EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run

SAS DATA SETS

A SAS data set consists of the following:

- descriptor information
- data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS

The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP

Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax:

PROC <procedure name>;
statement 1;
statement 2;
statement n;
RUN;

Example:

proc print data=temp;
run;

proc sort data=temp out=samp;
by name;
run;

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

- By default, SAS reads every variable as numeric. To read character data, put a $ symbol after the variable name.
- A character value can be up to 32,767 characters long.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set is limited only by the system configuration or hard disk space.
- In output, a missing numeric value is shown as a period (.) and a missing character value is shown as a blank space.
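The points above can be seen in a small sketch. The data set name types_demo and its rows are made up for illustration; they are not from the slides.

```sas
/* name is character because of the $; score is numeric by default */
data types_demo;
   input name $ score;
   datalines;
hari 90
ravi .
. 75
;
run;

/* In the listing, the missing score prints as a period
   and the missing name prints as a blank */
proc print data=types_demo;
run;
```

In list input a single period stands for a missing value of either type, which is why the third record produces a blank name.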

Different Databases

Best db storage

- Text
- Excel
- Access
- DB2
- Oracle
- Teradata

LIBRARIES

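There are two ways of creating libraries:

1. Menu driven
2. Through programming

1. Menu driven

In the Explorer window, right click and choose New.

2. Through programming

In the Editor window, submit a LIBNAME statement:

LIBNAME <name of library> "<path>";

Example:

LIBNAME Hari "D:\Ganesh";

To delete a library (this clears the library reference; the files themselves are not deleted):

LIBNAME Guru CLEAR;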
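A minimal sketch of the programming route, assuming a hypothetical folder C:\sasdata that already exists. The libref mylib is also just an illustration.

```sas
/* The path below is hypothetical; point it at a folder that exists */
libname mylib "C:\sasdata";

/* A two-level name (libref.dataset) stores the data set permanently */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

/* Remove the association; the files in the folder are not deleted */
libname mylib clear;
```

A data set written with a one-level name such as temp goes to the temporary WORK library and disappears when the session ends.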

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- input buffer
- program data vector

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

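The PDV contains two automatic variables. Neither is written to the output data set.

1) _N_ counts the number of times the DATA step has begun to iterate. When each iteration reads one record, it matches the number of the observation being built.

2) _ERROR_ signals whether an error occurred during the current iteration. It has two values:

i) _ERROR_=0 means that no error has been encountered

ii) _ERROR_=1 means that an error, such as invalid input data, has been encountered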
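To watch the two automatic variables change, a sketch like the following writes them to the log on every iteration. The data lines are invented, and the second record deliberately holds a non-numeric value for no.

```sas
data _null_;
   input name $ no;
   /* Write the iteration counter and the error flag to the log */
   put _n_= _error_= name= no=;
   datalines;
hari 102
ravi abc
kiran 109
;
run;
```

On the second iteration SAS cannot read abc as a number, so it sets no to missing, sets _ERROR_ to 1 for that iteration, and writes an invalid-data note to the log. _ERROR_ is back to 0 on the third iteration.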

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example above), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: the input buffer and the program data vector after DATA step compilation]

[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]

[Figure: Program Data Vector with Computed Value of the Sum Statement]

Writing an Observation to the SAS Data Set

[Figure: The First Observation in Data Set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
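The difference between a retained Sum statement variable and an ordinary variable can be sketched as follows. The data set running_total and the variables total and plain are illustrative names, not part of the original example.

```sas
data running_total;
   input score;
   /* Sum statement: total starts at 0 and is retained across iterations */
   total + score;
   /* Ordinary assignment: plain is set to missing at the start of every
      iteration, so missing + score stays missing */
   plain = plain + score;
   datalines;
6
8
8
;
run;

proc print data=running_total;
run;
```

In the printed data set, total accumulates as 6, 14, 22, while plain is missing on every observation and the log notes that missing values were generated.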

Compilation

During the compilation phase, SAS:

- checks the code for errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector
- assigns required attributes to variables
- creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- reinitializes to missing any variables that are not read from SAS data sets
- repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

Word Scanner

[Figure: the input stack feeds the word scanner, which passes tokens to the compiler]

Input stack:

data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

[Figure: the word scanner has taken the tokens "data temp" from the input stack and passed them to the compiler; the rest of the program remains in the input stack]

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

- renaming variables
- selecting only the first or last n observations for processing
- dropping variables from processing or from the output data set
- specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <... option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

- $ indicates a character format; its absence indicates a numeric format
- format names the format
- w specifies the format width, which for most formats is the number of columns in the output data
- d specifies an optional decimal scaling factor in the numeric formats

data temp;
amount=1145.32;
put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32
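As a further sketch, the same stored value can be written with different formats. The variable bdate and its value are assumptions added for illustration.

```sas
data _null_;
   amount = 1145.32;
   bdate  = '21OCT1984'd;      /* date literal, stored as a number of days */
   put amount dollar10.2;      /* $1,145.32 */
   put amount comma10.2;       /* same value without the dollar sign */
   put bdate date9.;           /* 21OCT1984 */
   put bdate ddmmyy10.;        /* 21/10/1984 */
run;
```

A format changes only how a value is displayed; the value stored in the variable is unchanged.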

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

- $ indicates a character informat; its absence indicates a numeric informat
- informat names the informat
- w specifies the informat width
- d specifies an optional decimal scaling factor in the numeric informats

data tmp1;
input ename $ edate;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 101007
;
run;
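A short sketch of the COMMA11. case described above, using the INPUT function to apply an informat to a character string. The variable names salary and hired are illustrative.

```sas
data _null_;
   /* The INPUT function reads a character string with the given informat */
   salary = input('$1,000,000', comma11.);
   hired  = input('10/10/07', ddmmyy8.);
   put salary=;                 /* salary=1000000 */
   put hired= date9.;           /* the century depends on the YEARCUTOFF= option */
run;
```

The informat strips the dollar sign and commas on the way in, so salary holds an ordinary number that can be used in arithmetic.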

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name(argument-1<, ... argument-n>)

x=max(cash,credit);
x=sqrt(1500);
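A runnable sketch of a few function calls inside a DATA step. The values of cash and credit are invented for the example.

```sas
data _null_;
   cash   = 1200;
   credit = 1500;
   x = max(cash, credit);       /* largest argument: 1500 */
   y = sqrt(1600);              /* square root: 40 */
   z = sum(cash, credit, .);    /* SUM ignores missing arguments: 2700 */
   u = upcase('hari');          /* character function: HARI */
   put x= y= z= u=;
run;
```

Numeric functions such as SUM skip missing arguments, whereas the expression cash + credit + . would return a missing value.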

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of statements: INPUT (list), PUT, DATALINES, DO, iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

- option specifies one or more SAS system options that you want to change

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard form that SAS can read without special instructions, the data is called standard data.

Example: 467

NON STANDARD DATA

If the data values are not in a standard form, the data is called non-standard data.

Examples:

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.

Numeric Informats

Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label
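The attributes listed above can all be set explicitly in a DATA step. The following is a sketch only; the data set members and the variable joined are hypothetical.

```sas
data members;
   length name $ 20;                 /* type (character) and length */
   informat joined ddmmyy8.;         /* how the raw value is read */
   format joined date9.;             /* how the value is displayed */
   label name   = 'Participant name'
         joined = 'Date joined';     /* descriptive labels */
   input name $ joined;
   datalines;
hari 21-10-84
ravi 22-11-86
;
run;

/* PROC CONTENTS lists each variable with its type, length, format, informat, and label */
proc contents data=members;
run;
```

Placing the LENGTH statement before INPUT matters, because a variable's length is fixed the first time SAS encounters it during compilation.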

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data temp;
set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <data set name>;
infile 'your-input-file path\filename.extension';
input <var 1> <var 2> ...;
run;
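A concrete sketch of the pattern above, assuming a hypothetical text file C:\sasdata\temp.txt in which each line holds a name and a number separated by blanks.

```sas
/* Hypothetical file: each line looks like "hari 102" */
data temp;
   infile "C:\sasdata\temp.txt";
   input name $ no;
run;

proc print data=temp;
run;
```

No DATALINES statement is needed here, because the INFILE statement tells the INPUT statement where the records come from.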

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Combined with OBS=, it selects a range: the step below reads observations 2 through 4.

data s1;
set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
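The slides give no example for IN=, so here is a minimal sketch. The data sets club and fees, their contents, and the variable names inclub and infees are all invented for illustration.

```sas
data club;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
1219 red
;
run;

data fees;
   input IdNumber Paid;
   datalines;
1023 50
1219 75
1300 60
;
run;

/* Both inputs are already in IdNumber order, as MERGE with BY requires */
data matched;
   merge club (in=inclub) fees (in=infees);
   by IdNumber;
   /* Keep only the IdNumber values that appear in both data sets */
   if inclub and infees;
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to the output data set. With the data above, matched contains the observations for 1023 and 1219.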

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
set weight_club;
put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (new version)

Indicates that data lines follow.

Using the DATALINES Statement: In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
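The code for this example is missing from the slides. A sketch of what such a step could look like follows; the data set name person and the data values are illustrative assumptions.

```sas
data person;
   /* Both variables are character, so each name is followed by $ */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

The semicolon on its own line after the last data line marks the end of the data.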

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
input ename $ eid hiredate;
informat hiredate mmddyy8.;
datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data is ended by a line of four semicolons.

data biblio;
input number citation $50.;
datalines4;
46 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
set weight_club;
keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
set weight_club;
label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
input firstname $ lastname $ n1 n2;
length name $25;
datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List input method
2) Column input method
3) Named input method
4) Formatted input method
5) Absolute input method

1) LIST INPUT METHOD

In this method, the data values must be separated by at least a single space.

Example: the data temp step shown earlier uses list input.

2) COLUMN INPUT METHOD

In this method, each variable is read from fixed column positions, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable name is followed by an informat whose width, ending in a period, gives the number of columns to read for that variable.

data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n to move to column n, +n to skip n columns) give the exact location of the data values.

data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;

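Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the same record across iterations of the DATA step, so that several observations can be read from one line.

data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

The DSD option in the INFILE statement reads delimited data, with a comma as the default delimiter:

data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
run;

The DELIMITER= option names a different delimiter character:

data nums;
infile datalines dsd delimiter=<delimiter>;
input X Y Z;
datalines;
<data lines separated by the delimiter>
;
run;

The MISSOVER option sets the remaining variables to missing when a data line has fewer values than the INPUT statement expects, instead of reading from the next line:

data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;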

Thanks

Feedback at infoSitworldin


Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables

create a variety of charts and plots

compute a variety of descriptive statistics including the mean sum variance standard deviation and more

compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics

Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

Five windows in SAS

1Editor window2Output window3Log window4Result window5Explorer window

1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually

2OUT PUT window Results of program will be displayed in output window which has an extension of LIST

3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations

4Result windowIt displays result of all the programs in editor windowNo extension for result window

5Explorer windowContains Libraries and Mycomputer

SAS LANGUAGE

The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages

In SAS you use these elements within one of two groups of SAS statements

DATA stepsPROC steps

DATA STEPA DATA step consists of a group of statements in the SAS language that can

read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values

Syntax

DATA ltDATA Set Namegt

INPUT ltvar1gtltvar2gthellipltvarngt

CARDS

Data valuesRUN

EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run

SAS DATA SETSA SAS data set consists of the following-descriptor information-data values

The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic

SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program

PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files

PROCEDURE STEP BLOCK

SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun

EG

proc print data=temp

Run

proc sort data=temp out=sampby namerun

Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat
informat names the informat
w specifies the informat width
d specifies an optional decimal scaling factor in the numeric informats

data tmp1;
   informat edate ddmmyy8.;
   input ename $ edate;
   format edate date9.;
   cards;
hari 101007
;
run;

Functions

Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
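The two calls above are fragments. A small sketch of how they might sit inside a complete DATA step follows; the values assigned to cash and credit are invented for the example.

```sas
data _null_;
   cash   = 1200;                 /* sample values, not from the slides */
   credit = 1500;
   larger = max(cash, credit);    /* largest of the arguments           */
   root   = sqrt(larger);         /* square root of 1500                */
   put larger= root=;             /* writes both values to the log      */
run;
```

DATA _NULL_ is used so that the step runs the functions without creating a data set.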

Statements

Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

name
type
length
informat
format
label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

SET statement
MERGE statement

data Temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset-name>;
   infile 'your-input-file-path\filename.extension';
   input <var-1> <var-2> ...;
run;
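Filled in for the weight club data, the template might look like the sketch below. The fileref wtfile and the path C:\data\weight_club.txt are hypothetical; the file is assumed to hold the same five records, laid out in the same columns, as the in-stream example.

```sas
/* Hypothetical location of the raw data file */
filename wtfile 'C:\data\weight_club.txt';

data weight_club;
   infile wtfile;    /* read records from the external file */
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

proc print data=weight_club;
run;
```

Using a FILENAME statement keeps the path in one place; the path could equally be quoted directly in the INFILE statement.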

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation that SAS processes.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
   set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
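The slides give no example for IN=, so here is a minimal sketch using a match-merge. The data sets members and payments, and the variables inMembers, inPayments, Amount, and HasPaid, are all invented for the illustration.

```sas
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
;
run;

/* Both inputs are already in IdNumber order, as BY requires */
data paid_status;
   merge members (in=inMembers) payments (in=inPayments);
   by IdNumber;
   if inMembers;             /* keep ids that appear in members        */
   HasPaid = inPayments;     /* 1 if payments contributed, otherwise 0 */
run;
```

The IN= variables exist only while the step runs and are not written to the output data set, which is why the flag is copied into HasPaid.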

DATA STEP STATEMENTS

DATA Statement
Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement
Indicates that data lines follow.

DATALINES Statement (newer name for CARDS)
Indicates that data lines follow.

Using the DATALINES Statement: In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
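The example itself did not survive in the transcript. A sketch consistent with the description could look like this; the data set name person and the two data lines are assumptions.

```sas
data person;
   input name $ dept $;    /* two character variables */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line becomes one observation, and the lone semicolon after the last line ends the data.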

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;
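Placed in a full DATA step, the statement might be used as sketched below. The output name not_red is made up, and weight_club is assumed to exist as created earlier.

```sas
data not_red;
   set weight_club;
   if Team = "red" then delete;   /* skip output and start the next iteration */
run;
```

When DELETE executes, the observation is not written and SAS returns to the top of the DATA step, so not_red contains only the other teams.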

FORMAT Statement
Associates formats with variables.

INFORMAT Statement
Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method, the data values must be separated by at least one blank space.

E.g. as in the preceding examples.
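A small sketch of list input follows. The data set name list_demo and the values are invented; the second data line uses extra blanks to show that the number of separating blanks does not matter.

```sas
data list_demo;
   input id name $ age;    /* variables are read in the order listed */
   datalines;
101 shiva 38
102   ravi   41
103 rama 29
;
run;

proc print data=list_demo;
run;
```

Because blanks are the delimiter, a value read this way cannot itself contain a blank, and character values are limited to 8 characters unless a longer length is declared.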

2)Column INPUT METHOD

In this method, character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3)NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name=$ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4)FORMATTED INPUT METHOD

In this method, each variable is followed by an informat: a width ending in a period that gives the number of columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5)ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n and +n) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

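data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
   infile datalines dsd delimiter='|';   /* the delimiter character is not legible in the source; '|' is assumed */
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;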

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin


Presentation: For reporting and displaying analytical results, SAS gives you an almost limitless number of visually appealing output formats, such as:

an array of markup languages, including HTML4 and XML
output that is formatted for a high-resolution printer, such as PostScript, PDF, and PCL files
RTF
color graphs that you can make interactive using ActiveX controls or Java applets

SAS WINDOW ENVIRONMENT

Five windows in SAS:

1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window

1. Editor window: The Editor window holds programs, which are saved with the extension .sas. You can type any number of programs in the Editor window and run them all at once or individually.

2. Output window: The results of a program are displayed in the Output window; saved output has the extension .lst.

3. Log window: Any errors or warnings in the program are displayed in the Log window. It also shows the SAS license and version information and, for each data set, the number of variables and observations.

4. Results window: It lists the results of all the programs run from the Editor window. The Results window has no file extension.

5. Explorer window: Contains Libraries and My Computer.

SAS LANGUAGE

The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.

In SAS, you use these elements within one of two groups of SAS statements:

DATA steps
PROC steps

DATA STEP
A DATA step consists of a group of statements in the SAS language that can

read data from external files
write data to external files
read SAS data sets and data views
create SAS data sets and data views
create multiple SAS data sets in one DATA step
combine existing data sets
create accumulating totals
manipulate numeric and character values

Syntax

DATA <data-set-name>;
   INPUT <var1> <var2> ... <varn>;
   CARDS;
data values
;
RUN;

E.g.

data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

SAS DATA SETS
A SAS data set consists of the following:

descriptor information
data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.
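One way to look at the two parts separately is sketched here: PROC CONTENTS reports the descriptor information and PROC PRINT lists the data values. The small data set temp is rebuilt in the sketch so that it stands on its own.

```sas
data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

proc contents data=temp;   /* descriptor: variable names, types, lengths */
run;

proc print data=temp;      /* data values: one row per observation       */
run;
```

In the PROC CONTENTS output, name should appear as a character variable and no as a numeric variable, each with the default length of 8.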

SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

PROC <procedure-name>;
   statement-1;
   statement-2;
   statement-n;
RUN;

E.g.

proc print data=temp;
run;

proc sort data=temp out=samp;
   by name;
run;

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

By default, the INPUT statement reads every variable as numeric. To read character data, put a $ after the variable name.
A character variable can hold up to 32,767 characters.
SAS reads data values observation by observation.
The number of observations in a SAS data set depends on the system configuration or hard disk space.
In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
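The sketch below shows both data types and how missing values appear. The data set name types_demo and its values are made up; in list input a lone period in the data marks a missing value for either type.

```sas
data types_demo;
   input name $ score;     /* $ makes name character; score is numeric */
   datalines;
hari 102
ravi .
. 105
;
run;

proc print data=types_demo;
run;
```

In the printed listing, the missing score for ravi shows as a period, while the missing name in the third observation shows as a blank.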

Different Databases

Best db storage

Text
Excel
Access
DB2
Oracle
Teradata

LIBRARIES

There are 2 ways of creating libraries:

1. Menu driven
2. Programming code

1. Menu driven

Explorer -> right click -> New

2. Through programming

In the Editor window:

LIBNAME <library-name> '<path>';

Example

LIBNAME Hari "D:\Ganesh";

To delete a library

LIBNAME Guru CLEAR;
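Putting the pieces together, a library is typically assigned, used with a two-level name, and then cleared. In this sketch the libref mylib and the folder C:\sasdata are hypothetical, and the folder is assumed to exist already.

```sas
libname mylib 'C:\sasdata';      /* hypothetical folder, must exist */

data mylib.temp;                 /* two-level name: libref.dataset  */
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

proc print data=mylib.temp;
run;

libname mylib clear;             /* removes the libref, not the files */
```

Data sets written through a libref other than WORK remain on disk after the session ends, so they can be read again later by reassigning the library.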

RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:

SAS statements end with a semicolon.
You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
You can begin SAS statements in any column of a line and write several statements on the same line.
You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

A SAS name can contain from one to 32 characters.
The first character must be a letter or an underscore (_).
Subsequent characters must be letters, numbers, or underscores.
Blanks cannot appear in SAS names.

DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

INPUT BUFFER
PROGRAM DATA VECTOR

INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

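The PDV contains two automatic variables:

1) _N_: counts the number of times the DATA step has begun to iterate. It starts at 1.

2) _ERROR_: signals whether an error occurred while the DATA step was executing. It takes one of two values:

i) _ERROR_=0 means no error has occurred in the current iteration

ii) _ERROR_=1 means an error (for example, invalid input data) has occurred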
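Both automatic variables can be watched in the log with a PUT statement. In this sketch the data set name check and the data lines are invented; the second line deliberately puts text where a number is expected.

```sas
data check;
   input name $ no;
   put _n_= _error_= name= no=;   /* one log line per iteration */
   datalines;
hari 102
ravi abc
kiran 109
;
run;
```

On the second iteration SAS cannot convert abc to a number, so it sets no to missing, sets _ERROR_ to 1, and writes an invalid-data note to the log. The automatic variables are not written to the output data set.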

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: Input buffer and program data vector after DATA step compilation]

[Figure: Position of the pointer in the input buffer before SAS reads data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

[Figure: Program data vector with computed value of the Sum statement]

Writing an Observation to the SAS Data Set

[Figure: Output SAS data set TOTAL_POINTS, 1st observation]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

The values of variables created by the INPUT statement are set to missing.
The value created by the Sum statement is automatically retained.
The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.
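The difference between a value that is reset and one that is retained can be seen with a shortened version of the same data. The data set name running and the variable RowTotal are assumptions added for this sketch.

```sas
data running;
   input Event1 Event2 Event3;
   RowTotal = Event1 + Event2 + Event3;   /* assignment: recomputed every iteration */
   TeamTotal + RowTotal;                  /* sum statement: retained and added to   */
   put _n_= RowTotal= TeamTotal=;
   datalines;
6 8 8
9 7 8
7 7 7
;
run;
```

The log shows RowTotal as 22, 24, and 21, while TeamTotal accumulates to 22, 46, and 67 because the Sum statement starts it at 0 and keeps its value from one iteration to the next.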

Compilation

During the compilation phase SAS

Checks code for errors
Translates code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the program data vector
Assigns required attributes to variables
Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

Word Scanner, Compiler, and Input Stack

Data temp;
   Input name $ no;
   Cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

[Figure: the input stack holds the program text from "Input name $ no" onward; the tokens "Data temp" have already moved on to the word scanner and the compiler]
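The program in this figure relies on a macro variable, which is resolved to plain text before the PROC step is compiled. The sketch below repeats the program with a %PUT statement added (an addition for illustration) so that the resolved value can be seen in the log.

```sas
data temp;
   input name $ no;
   cards;
Hari 101
;
run;

%let list = name;                        /* macro variable holding a variable name */
%put NOTE: list resolves to &list;       /* shows the resolved text in the log     */

proc print data=temp;
   var &list;                            /* the compiler receives: var name;       */
run;
```

Because the substitution is purely textual, changing the %LET value to another variable list, such as name no, changes what PROC PRINT displays without editing the step itself.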



SAS WINDOW ENVIRONMENT

There are five windows in SAS:

1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window

1. Editor window: This is where you write SAS programs; a saved program has the extension .sas. You can type any number of programs in the Editor window and run them all at once or individually.

2. Output window: The results of a program are displayed in the Output window; saved output has the extension .lst.

3. Log window: Any errors or warnings in the program are reported in the Log window. It also displays the SAS license and version information and, for each data set created, the number of variables and observations.

4. Results window: It lists the results of all the programs run from the Editor window. There is no file extension for the Results window.

5. Explorer window: Contains Libraries and My Computer.

SAS LANGUAGE

The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.

In SAS, you use these elements within one of two groups of SAS statements:

- DATA steps
- PROC steps

DATA STEP

A DATA step consists of a group of statements in the SAS language that can

- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values

Syntax

DATA <data-set-name>;
INPUT <var1> <var2> ... <varn>;
CARDS;
data values
;
RUN;

Example

data temp;
input name $ no;
datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

SAS DATA SETS

A SAS data set consists of the following:

- descriptor information
- data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS

The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP

Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

PROC <procedure-name>;
statement-1;
statement-2;
statement-n;
RUN;

Example

proc print data=temp;
run;

proc sort data=temp out=samp;
by name;
run;

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

- By default, SAS reads every variable as numeric. To read character data, put a $ symbol after the variable name in the INPUT statement.
- A character value can be up to 32,767 characters long.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set is limited only by the system configuration or the available disk space.
- In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
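The points above about the $ symbol and missing values can be tried with a small sketch. The data set name people and its values are made up for illustration.

```sas
data people;
   /* name is character because of the $; age is numeric by default */
   input name $ age;
   datalines;
hari 25
ravi .
. 31
;
run;

proc print data=people;
run;
```

In list input a single period stands for a missing value. PROC PRINT shows the missing age as a period and the missing name as a blank.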

Different Databases

Best db storage

- Text
- Excel
- Access
- DB2
- Oracle
- Teradata

LIBRARIES

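There are two ways of creating libraries:

1. Menu driven
2. Programming (coding)

1. Menu driven

In the Explorer window, right-click and choose New.

2. Through programming

In the Editor window, submit a LIBNAME statement:

LIBNAME <library-name> "<path>";

Example

LIBNAME Hari "D:\Ganesh";

To delete a library

LIBNAME Guru CLEAR;

The CLEAR option removes the library reference only; the files stored in the folder are not deleted.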
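A minimal sketch of how a library is used once it has been assigned. The libref mylib and the folder C:\sasdata are assumptions chosen for illustration; the folder must already exist.

```sas
/* 'C:\sasdata' is a hypothetical folder; replace it with a real one */
libname mylib 'C:\sasdata';

/* a two-level name (libref.dataset) stores the data set in that folder */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

/* remove the libref; the data set file stays on disk */
libname mylib clear;
```

A data set created with a one-level name, such as temp, goes to the temporary WORK library and is deleted when the session ends.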

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- input buffer
- program data vector

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during the current iteration. It has two possible values:

i) _ERROR_=0 means that no error was encountered.

ii) _ERROR_=1 means that an error, such as invalid input data, was encountered.
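The two automatic variables can be watched in the log with a PUT statement. This is only a sketch; the data set name check and the deliberately invalid value abc are made up for illustration.

```sas
data check;
   input name $ no;
   /* write the automatic variables and the data values to the log */
   put _N_= _ERROR_= name= no=;
   datalines;
hari 102
ravi abc
kiran 109
;
run;
```

For the second record SAS cannot read abc as a number, so it sets no to missing and _ERROR_ to 1 for that iteration. _N_ and _ERROR_ exist only in the PDV and are not written to the output data set.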

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the input buffer and the program data vector after DATA step compilation.

Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

Figure: Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

Figure: The First Observation in Data Set TOTAL_POINTS

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase, SAS

- checks the code for errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector
- assigns required attributes to variables
- creates the descriptor portion of the new data set

Execution

During the execution phase, SAS

- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
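The execution loop described above can be traced with a short sketch. The data set running and its three scores are made up for illustration; the PUT statement writes one line to the log per iteration.

```sas
data running;
   input score;
   /* sum statement: total starts at 0 and is retained across iterations */
   total + score;
   put _N_= score= total=;
   datalines;
6
8
8
;
run;
```

The log shows total growing to 6, 14, and 22 while _N_ goes from 1 to 3. The variable score is reset to missing at the top of each iteration, but total keeps its value.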

Word Scanner

Figure: The submitted program waits in the input stack; the word scanner takes tokens from it and passes them to the compiler.

Input stack (program as submitted):

Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

Input stack after the word scanner has taken the first statement (Data temp) and passed it to the compiler:

Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

- renaming variables
- selecting only the first or last n observations for processing
- dropping variables from processing or from the output data set
- specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
amount=1145.32;
put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
input ename $ edate;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
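The two function calls above can be run inside a DATA step. This is a small sketch; the values of cash and credit and the extra ROUND call are added for illustration.

```sas
data _null_;
   cash = 250;
   credit = 400;
   x = max(cash, credit);   /* larger of the two arguments */
   y = sqrt(1500);          /* square root */
   z = round(y, 0.01);      /* y rounded to two decimal places */
   put x= y= z=;            /* results are written to the log */
run;
```

Here x is 400 and z is 38.73. DATA _NULL_ is used because the step only writes to the log and does not need an output data set.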

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

DATA step statements include the following:

INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard format, the data is called standard data.

Example: 467

NON STANDARD DATA

If the data values are not in a standard format, the data is called non-standard data.

Examples:

18-10-05
45,000
$21,000

Informats are used to read non-standard data.

data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
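As a sketch of how the numeric informats and formats in the table work together, the following step reads values that contain commas and dollar signs and prints them in the same style. The data set name pay and its values are made up for illustration.

```sas
data pay;
   /* the colon lets list input use an informat for each value */
   input name $ salary : comma6. bonus : dollar6.;
   /* formats control how the stored numbers are displayed */
   format salary comma6. bonus dollar6.;
   datalines;
hari 25,000 $3,000
ravi 45,000 $1,500
;
run;

proc print data=pay;
run;
```

The informats strip the commas and dollar signs so that plain numbers are stored; the formats put them back only for display.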

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset-name>;
infile 'your-input-file-path\filename.extension';
input <var1> <var2> ...;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data temp;
set weight_club;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. The most commonly used options are described below.

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from the data set weight_club.

data samp;
set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation to process.

data s1;
set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. In this example SAS reads observations 2 through 4.

data s1;
set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
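A minimal sketch of the IN= option in a match-merge, assuming the weight_club data set shown earlier has been created. The second data set team_info, its variable Coach, and its values are hypothetical and exist only for this illustration.

```sas
/* hypothetical lookup data set, already in Team order */
data team_info;
   input Team $ Coach $;
   datalines;
red Smith
yellow Jones
;
run;

/* a match-merge needs both inputs sorted by the BY variable */
proc sort data=weight_club out=club_sorted;
   by Team;
run;

data matched;
   merge club_sorted (in=inClub) team_info (in=inInfo);
   by Team;
   /* inClub and inInfo are 1 when that data set contributed to the row */
   if inClub and inInfo;
run;
```

The IN= variables are temporary: they can be tested in the DATA step but are not written to the output data set.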

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
set weight_club;
put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
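A sketch of such a step is given here. The data set name person and the two data lines are assumptions chosen for illustration; only the variables NAME and DEPT come from the description above.

```sas
data person;
   /* both variables are character, so each is followed by $ */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line becomes one observation, and the semicolon on its own line marks the end of the data.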

DELETE Statement

Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step.

if Team="red" then delete;
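Placed in a complete DATA step, the statement could be used as follows. This sketch assumes the weight_club data set from the earlier example; the output name no_red is made up.

```sas
data no_red;
   set weight_club;
   /* observations from the red team are dropped from the output */
   if Team = "red" then delete;
run;
```

Only the observations for which the condition is false reach the end of the step and are written to no_red.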

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
input ename $ eid hiredate;
informat hiredate mmddyy8.;
datalines;
hari 101 120105
ravi 102 110306
;
run;
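The FORMAT statement has no example of its own above, so here is a sketch that pairs it with the INFORMAT statement. It reuses the data from the previous step; the data set name hires is made up.

```sas
data hires;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;   /* how the raw value is read */
   format hiredate date9.;       /* how the stored date is displayed */
   datalines;
hari 101 120105
ravi 102 110306
;
run;

proc print data=hires;
run;
```

Without the FORMAT statement the date would print as a plain number of days; with DATE9. the first hire date is displayed as 01DEC2005.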

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.

data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
set weight_club;
keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
set weight_club;
label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
input firstname $ lastname $ n1 n2;
length name $25;
datalines;
Alexander Robinson 35 11
;
run;

Here name is given a length of 25 bytes. firstname and lastname are not listed in a LENGTH statement, so they get the default length of 8 bytes and Alexander is stored as Alexande.

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
input name $ 1-18 score1 25-27 score2 30-32;
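The step above has no data lines, so the following sketch completes it. The two records are made up for illustration and are laid out so that each value falls in the columns named in the INPUT statement.

```sas
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph Rao               11   32
Mitchel Lee              13   29
;
run;

proc print data=scores;
run;
```

Because the columns are fixed, a name can contain a blank and still be read as one value. The data lines must start in column 1; indenting them would shift every value out of its column range.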

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method the data values must be separated by at least one space.

Example: as in the earlier DATA step examples, where the INPUT statement lists only the variable names (input name $ no;).

2) COLUMN INPUT METHOD

In this method the column range of each value is given after the variable name. Character values can be longer than 8 characters and can contain blank spaces.

data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by the variable name and an equal sign.

data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

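4) FORMATTED INPUT METHOD

In this method each variable name is followed by an informat, a width ending in a period (for example 3. or $11.), which tells SAS how many columns to read for that variable. The data values must therefore line up in the same columns on every record.

data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method we use column pointer controls to give the exact location of each data value: @n moves the pointer to column n, and +n moves it n columns forward.

data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing at sign (@@) to hold the current record across iterations of the DATA step, so that several observations can be read from one line.

data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

The DSD option on the INFILE statement reads comma-separated values. DELIMITER= names a different separator; a vertical bar is used here for illustration.

data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
infile datalines dsd delimiter='|';
input X Y Z;
datalines;
1|2|3
4|5|6
7|8|9
;
run;

The MISSOVER option keeps SAS from moving to the next line when a record has fewer values than the INPUT statement expects; the remaining variables are set to missing.

data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;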

2OUT PUT window Results of program will be displayed in output window which has an extension of LIST

3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations

4Result windowIt displays result of all the programs in editor windowNo extension for result window

5Explorer windowContains Libraries and Mycomputer

SAS LANGUAGE

The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages

In SAS you use these elements within one of two groups of SAS statements

DATA stepsPROC steps

DATA STEPA DATA step consists of a group of statements in the SAS language that can

read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values

Syntax

DATA ltDATA Set Namegt

INPUT ltvar1gtltvar2gthellipltvarngt

CARDS

Data valuesRUN

EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run

SAS DATA SETSA SAS data set consists of the following-descriptor information-data values

The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic

SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program

PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files

PROCEDURE STEP BLOCK

SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun

EG

proc print data=temp

Run

proc sort data=temp out=sampby namerun

Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
    amount=1145.32;
    put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
    informat edate ddmmyy8.;
    format edate date9.;
    input ename $ edate;
    cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1 <, ... argument-n>)

x=max(cash, credit);
x=sqrt(1500);

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
    input name $ Bdate ddmmyy8.;
    format Bdate ddmmyy8.;
    cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
    input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
    datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data Temp;
    set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general pattern when the raw data is stored in an external file.

data <dataset name>;
    infile 'your-input-file path\filename.extension';
    input <var 1> <var 2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
    set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
    set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
    set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
    set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
    set weight_club (obs=4 firstobs=2);   /* reads observations 2 through 4 */
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
    set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
    set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
    set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
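A minimal sketch of IN= in a match-merge is given below. It assumes the weight_club data set created earlier; the second data set club_fees, its variable Fee, and the flag names inClub and inFees are hypothetical and exist only for this illustration.

```sas
/* Hypothetical second data set keyed by IdNumber. */
data club_fees;
    input IdNumber Fee;
    datalines;
1023 50
1049 45
1300 60
;
run;

/* A match-merge needs both inputs sorted by the BY variable. */
proc sort data=weight_club; by IdNumber; run;
proc sort data=club_fees;   by IdNumber; run;

data matched;
    merge weight_club (in=inClub) club_fees (in=inFees);
    by IdNumber;
    /* inClub and inFees are 1 when that data set contributed to the
       current observation, 0 otherwise; keep IDs found in both. */
    if inClub and inFees;
run;
```

The IN= variables are temporary: they are available to the DATA step logic but are not written to the output data set.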

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
    set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
    set weight_club;
    put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (newer name for CARDS)

Indicates that data lines follow.

Using the DATALINES Statement

A typical use: SAS reads each data line that follows DATALINES and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
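A minimal sketch of such a step follows. The data set name person and the sample values are assumptions chosen for illustration.

```sas
data person;
    /* $ after each name makes NAME and DEPT character variables */
    input name $ dept $;
    datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line becomes one observation, and the lone semicolon on its own line ends the data.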

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
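As a small sketch of the FORMAT statement, the step below assumes the weight_club data set from earlier; the output data set club_loss and the computed variable Loss are hypothetical names introduced here.

```sas
data club_loss;
    set weight_club;
    Loss = StartWeight - EndWeight;
    /* w.d formats: total width, then number of decimal places */
    format StartWeight EndWeight 5.1 Loss 6.2;
run;

proc print data=club_loss;
run;
```

A format changes only how the values are displayed; the stored numeric values stay the same.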

INFORMAT Statement

Associates informats with variables.

data two1;
    informat hiredate mmddyy8.;
    input ename $ eid hiredate;
    datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.

data biblio;
    input number citation $50.;
    datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
    set weight_club;
    keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
    set weight_club;
    label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
    input firstname $ lastname $ n1 n2;
    length name $25;
    datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
    input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List input method
2) Column input method
3) Named input method
4) Formatted input method
5) Absolute input method

1) List INPUT METHOD

In this method the data values should be separated by at least a single space.

E.g. the same form of INPUT statement as in the earlier examples.
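A minimal sketch of list input, assuming the same kind of name and number data used elsewhere in these slides:

```sas
data temp;
    /* list input: variables are read in order, values split on blanks */
    input name $ no;
    datalines;
hari 102
ravi   104
ganesh 105
;
run;
```

More than one blank between values is fine, but a value that itself contains a blank would be split in two, which is where column input becomes useful.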

2) Column INPUT METHOD

In this method character data values can contain more than 8 characters and can also contain blank spaces.

data temp;
    input id 1-3 name $ 7-18 age 21-22;
    cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
    input id= name= $ age=;
    cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat (a width followed by a period) that specifies the length of the data values.

data one;
    input id 3. name $11. age 3.;
    datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method we use column pointer controls (@n and +n) to give the exact location of data values.

data two;
    input @1 id 3. +5 @10 name $4. +5 @19 age;
    cards;
102      hari     29
;
run;

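Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold a record in the input buffer across iterations of the DATA step, so that several observations can be read from one line.

data test;
    input name $ age @@;
    datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

data num;
    infile datalines dsd;                  /* DSD: comma-delimited values */
    input x y z;
    datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
    infile datalines dsd delimiter='|';    /* '|' is an assumed delimiter */
    input X Y Z;
    datalines;
1|2|3
4|5|6
7|8|9
;
run;

data weather;
    infile datalines missover;             /* short lines leave the remaining variables missing */
    input temp1-temp5;
    datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;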

Thanks

Feedback at infoSitworldin


SAS LANGUAGE

The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.

In SAS you use these elements within one of two groups of SAS statements:

- DATA steps
- PROC steps

DATA STEP

A DATA step consists of a group of statements in the SAS language that can

- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values

Syntax

DATA <data set name>;
    INPUT <var1> <var2> ... <varn>;
    CARDS;
data values
;
RUN;

E.g.

data temp;
    input name $ no;
    datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

SAS DATA SETS

A SAS data set consists of the following:

- descriptor information
- data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS

The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP

Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

PROC <procedure name>;
    Statement 1;
    Statement 2;
    Statement n;
RUN;

E.g.

proc print data=temp;
run;

proc sort data=temp out=samp;
    by name;
run;

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

- By default, SAS reads every variable named in an INPUT statement as numeric. To read character data, put a $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set depends on system configuration or hard disk space.
- In output, SAS shows a period (.) for a missing numeric value and a blank space for a missing character value.

Different Databases

Best db storage

- Text
- Excel
- Access
- DB2
- Oracle
- Teradata

LIBRARIES

There are 2 ways of creating libraries:

1. Menu driven
2. Programming (coding)

1. Menu driven

Explorer > Right click > New

2. Through programming

In the Editor window:

LIBNAME <name of library> '<path>';

Example:

LIBNAME Hari "D:\Ganesh";

To delete (clear) a library reference:

LIBNAME Guru clear;

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
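The rules above can be sketched with a few names. All data set and variable names below are made up for illustration; the two invalid cases are left inside comments so that the step still runs.

```sas
/* statements may be typed in any case and may share a line */
DATA _club_2024; Start_Weight = 189; team1 = 'red';
    /* 1team = 'red';       not valid: the first character is a digit */
    /* team name = 'red';   not valid: blanks cannot appear in a name */
run;

proc print data=_club_2024;
run;
```

Both _club_2024 and Start_Weight are valid because they start with a letter or underscore and contain only letters, digits, and underscores.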

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- INPUT BUFFER
- PROGRAM DATA VECTOR

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during the current iteration. It takes one of two values:

i) _ERROR_=0 means no error was encountered.

ii) _ERROR_=1 means an error occurred, such as an input data error.
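A small sketch for watching the two automatic variables in the log is shown below. The data set name check and the data values are assumptions; the second record deliberately carries a non-numeric score so that an input data error occurs on that iteration.

```sas
data check;
    input name $ score;
    /* write the automatic variables and the values read to the log */
    put _N_= _ERROR_= name= score=;
    datalines;
hari 90
ravi abc
kiran 85
;
run;
```

In the log, _N_ goes up by 1 for each record. On the second iteration the invalid value abc should leave score missing and set _ERROR_ to 1, and _ERROR_ returns to 0 at the start of the third iteration. Neither variable is written to the output data set.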

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
    input TeamName $ ParticipantName $ Event1 Event2 Event3;
    TeamTotal + (Event1 + Event2 + Event3);
    datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: the input buffer and the program data vector after DATA step compilation.]

[Figure: position of the pointer in the input buffer before SAS reads data.]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.]

[Figure: program data vector with computed value of the Sum statement.]

[Figure: writing an observation to the SAS data set; the first observation in output data set TOTAL_POINTS.]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
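The difference between a retained and a reset variable can be sketched as follows. The data set name totals and the variables RunningA, RunningB, and NotRetained are hypothetical; the scores are arbitrary.

```sas
data totals;
    input score;

    /* Sum statement: RunningA starts at 0 and is retained automatically. */
    RunningA + score;

    /* The same behaviour written out with RETAIN and the SUM function. */
    retain RunningB 0;
    RunningB = sum(RunningB, score);

    /* No RETAIN: NotRetained is reset to missing at the top of each
       iteration, so it only ever holds the current score. */
    NotRetained = sum(NotRetained, score);

    datalines;
6
8
8
;
run;

proc print data=totals;
run;
```

RunningA and RunningB accumulate across observations (6, 14, 22), while NotRetained simply repeats score, which is the reset behaviour described in the list above.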

Compilation

During the compilation phase, SAS

- checks the code for errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector
- assigns required attributes to variables
- creates the descriptor portion of the new data set

Execution

During the execution phase, SAS

- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process



DATA STEPA DATA step consists of a group of statements in the SAS language that can

read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values

Syntax

DATA <data-set-name>;
INPUT <var1> <var2> ... <varn>;
CARDS;
data values
;
RUN;

Example

data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

SAS DATA SETS

A SAS data set consists of the following:

• descriptor information
• data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS

The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP

Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

PROC <procedure-name>;
statement 1;
statement 2;
...
statement n;
RUN;

Example

proc print data=temp;
run;

proc sort data=temp out=samp;
   by name;
run;

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

• By default SAS reads every variable as numeric. To read character data, put the $ symbol after the variable name.
• A character variable can hold up to 32767 characters.
• SAS reads data values observation by observation.
• The number of observations in a SAS data set is limited only by the system configuration or hard disk space.
• In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
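To make the points about the $ symbol and missing values concrete, here is a minimal sketch. The data set name missing_demo and its three records are hypothetical, made up only for illustration. In list input a lone period stands for a missing value for either data type.

```sas
/* Hypothetical records; a lone period marks a missing value in list input. */
data missing_demo;
   input name $ age;      /* $ makes name a character variable; age is numeric */
   datalines;
hari 25
. 30
ravi .
;
run;

/* The listing should show a blank for the missing name and a period for the missing age. */
proc print data=missing_demo;
run;
```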

Different Databases

Best DB storage

Text, Excel, Access, DB2, Oracle, Teradata

LIBRARIES

There are two ways of creating a library

1 Menu driven

Explorer > Right click > New

2 Through programming (in the Editor window)

LIBNAME <library-name> "<path>";

Example

LIBNAME Hari "D:\Ganesh";

To clear (deassign) a library reference. This removes only the libref; the files in the folder are not deleted.

LIBNAME Guru CLEAR;

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

• SAS statements end with a semicolon.
• You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
• You can begin SAS statements in any column of a line and write several statements on the same line.
• You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

• A SAS name can contain from one to 32 characters.
• The first character must be a letter or an underscore (_).
• Subsequent characters must be letters, numbers, or underscores.
• Blanks cannot appear in SAS names.

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

• input buffer
• program data vector

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

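The PDV contains two automatic variables

1) _N_ counts the number of times the DATA step has begun to iterate

2) _ERROR_ signals whether a data error occurred during execution

i) _ERROR_=0 means no error has been encountered in the current iteration

ii) _ERROR_=1 means an error occurred, for example invalid data read for a numeric variable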
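The two automatic variables can be watched in the log with a PUT statement. The following is a minimal sketch, not part of the original slides: the variable names and the three records are hypothetical, and the third record carries a non-numeric score on purpose so that _ERROR_ changes.

```sas
/* Hypothetical data: the third record has a non-numeric score on purpose. */
data _null_;
   input name $ score;
   /* Write the automatic variables and the values just read to the log. */
   put _N_= _ERROR_= name= score=;
   datalines;
hari 90
ravi 85
kiran abc
;
run;
```

In the log, _N_ should count 1, 2, 3 and _ERROR_ should stay 0 for the first two records and become 1 for the third, where SAS also reports invalid data for score.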

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

Output SAS Data Set TOTAL_POINTS: the first observation

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

• The values of variables created by the INPUT statement are set to missing.
• The value created by the Sum statement is automatically retained.
• The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase SAS

• checks the code for errors
• translates the code to machine code
• establishes an area of memory called the input buffer, if reading raw data
• establishes an area of memory called the program data vector
• assigns required attributes to variables
• creates the descriptor portion of the new data set

Execution

During the execution phase SAS

• initializes the PDV to missing
• reads data values into the PDV
• carries out assignment statements and conditional processing
• writes the observation in the PDV to the output SAS data set at the end of the DATA step
• returns to the top of the DATA step
• initializes any variables that are not read from SAS data sets to missing
• repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record

Word Scanner

[Figure: the submitted program waits in the input stack; the word scanner takes tokens from it (in the second panel, "Data temp") and passes them to the compiler]

Input stack

Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:

• Literal: a string of characters enclosed in quotation marks
• Number: digits, date values, time values, and hexadecimal numbers
• Name: a string of characters beginning with an underscore or letter
• Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ; $ ( ) . & % =

What Are the SAS Language Elements

• Data set options
• Informats and formats
• Functions
• Statements
• SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

• Renaming variables
• Selecting only the first or last n observations for processing
• Dropping variables from processing or from the output data set
• Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format
format names the format
w specifies the format width, which for most formats is the number of columns in the output data
d specifies an optional decimal scaling factor in the numeric formats

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat
informat names the informat
w specifies the informat width
d specifies an optional decimal scaling factor in the numeric informats

data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1 <, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
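The two function calls above are fragments, so here is a small self-contained sketch that runs them inside a DATA step. The values assigned to cash and credit are hypothetical, chosen only so that the step can execute on its own.

```sas
/* cash and credit are hypothetical values chosen for illustration. */
data _null_;
   cash   = 1200;
   credit = 1500;
   larger = max(cash, credit);   /* largest of the arguments */
   root   = sqrt(1500);          /* square root of 1500 */
   put larger= root=;            /* write both results to the log */
run;
```

With these values, larger is 1500 because MAX returns the largest argument.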

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, that is, values SAS can read without an informat, the data is called standard data.

Eg 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

Eg

18-10-05
45,000
$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date value   Informat    Format
12-07-78     DDMMYY8.    DDMMYY8.
21-09-2005   DDMMYY10.   DDMMYY10.
22Jan89      DATE7.      DATE7.
22jan1989    DATE9.      DATE9.

Numeric Informats

Numeric value   Informat   Format
25,000          COMMA6.    COMMA6.
$3,000          DOLLAR6.   DOLLAR6.
25,000          COMMA6.    WORDS20.
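As a rough illustration of how an informat and a format work as a pair, the sketch below reads one non-standard record and writes it back. The data line is hypothetical; only COMMA6., DATE9. and DDMMYY10. from the tables are used.

```sas
/* Hypothetical record: a number with a comma and a date in ddMONyyyy form. */
data _null_;
   /* Informats strip the comma and convert the date text to a SAS date value. */
   input amount comma6. +1 bdate date9.;
   put amount= bdate=;                      /* stored values are plain numbers */
   put amount comma6. +1 bdate ddmmyy10.;   /* formats control how they are written */
   datalines;
25,000 22jan1989
;
run;
```

The first PUT shows what is actually stored (an unformatted number and a day count), while the second shows the same values written through formats.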

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

• name
• type
• length
• informat
• format
• label

DIFFERENT WAYS TO READ DATA

1 RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

2 DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the same kind of DATA step, this time with the raw data stored in an external file.

data <dataset-name>;
   infile '<path>\<filename>.<extension>';
   input <var1> <var2> ...;
run;

3 DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

• SET statement
• MERGE statement

data temp;
   set weight_club;
run;

4 DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these.

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Combined with OBS=4, this step reads observations 2 through 4.

data s1;
   set weight_club (firstobs=2 obs=4);
run;

RENAME=

Changes the name of a variable

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
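The IN= option has no example in the slides, so here is a minimal sketch of its most common use, matching two data sets in a MERGE. The data sets members and payments, and the flag variables inMembers and inPayments, are hypothetical names introduced only for this illustration.

```sas
/* Hypothetical data sets for illustration; not from the original slides. */
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
1300 20
;
run;

/* Both inputs are already in IdNumber order, as MERGE with BY requires. */
data paid_members;
   merge members (in=inMembers) payments (in=inPayments);
   by IdNumber;
   /* Each IN= flag is 1 when its data set contributed to the current observation. */
   if inMembers and inPayments;   /* keep only IDs present in both data sets */
run;

proc print data=paid_members;
run;
```

With this data only IdNumber 1023 and 1219 appear in both inputs, so only those two observations are kept. The IN= variables are temporary and are not written to the output data set.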

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow

DATALINES Statement (New version)

Indicates that data lines follow

Using the DATALINES Statement

A typical use reads each data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
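The NAME and DEPT case described above can be sketched as follows. The data set name person and the two data lines are assumptions added for illustration, since the slide's own example did not survive.

```sas
/* Illustrative values; the original slide's data lines were not preserved. */
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line becomes one observation, and the lone semicolon after the last line marks the end of the data.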

DELETE Statement

Stops processing the current observation

if Team="red" then delete;
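To show the DELETE statement inside a complete step, here is a small sketch. The data set teams and its four records are hypothetical; only the IF ... THEN DELETE line comes from the slide.

```sas
/* Hypothetical records used only to demonstrate DELETE. */
data teams;
   input Name $ Team $;
   /* For red-team rows SAS stops here, writes nothing, and starts the next iteration. */
   if Team = "red" then delete;
   datalines;
David red
Amelia yellow
Alan red
Ravi yellow
;
run;

proc print data=teams;
run;
```

Only the two yellow-team observations should remain in teams, because a deleted observation is never written to the output data set.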

FORMAT Statement

Associates formats with variables

INFORMAT Statement

Associates informats with variables

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step fragment shows how to read input data records with column input

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD

1) List INPUT METHOD

In this method the data values must be separated by at least one blank space.

Example: the data temp step shown earlier (input name $ no;) uses list input.

2) Column INPUT METHOD

In this method values are read from fixed column positions. Character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat, a width and a period, that tells SAS how many columns to read for the value. The data lines must be aligned to those widths.

data one;
   input id 3. name $ 11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns forward.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;



Syntax

DATA ltDATA Set Namegt

INPUT ltvar1gtltvar2gthellipltvarngt

CARDS

Data valuesRUN

EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run

SAS DATA SETSA SAS data set consists of the following-descriptor information-data values

The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic

SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program

PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files

PROCEDURE STEP BLOCK

SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun

EG

proc print data=temp

Run

proc sort data=temp out=sampby namerun

Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.

Numeric Informats

Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
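The table entries can be tried out with a short DATA step. The following is only a sketch: the data set name pay, the variable names, and the data values are made up for illustration. The colon (:) modifier tells list input to use the informat that follows it.

```sas
/* Illustrative sketch: data set, variables and values are assumptions */
data pay;
   /* COMMA6. and DOLLAR6. strip commas and the dollar sign while reading */
   input name $ salary : comma6. bonus : dollar6.;
   /* the formats put the punctuation back when the values are printed */
   format salary comma6. bonus dollar6.;
   datalines;
hari 25,000 $3,000
ravi 18,500 $1,250
;
run;

proc print data=pay;
run;
```

The values are stored as plain numbers (for example 25000); the commas and dollar signs appear only because of the FORMAT statement.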

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general pattern when the raw data is stored in an external file.

data <dataset name>;
   infile 'your-input-file-path\filename.extension';
   input <var 1> <var 2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from the data set weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation to process, that is, when to stop processing observations.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
   set weight_club (firstobs=2 obs=4);   /* reads observations 2 through 4 */
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
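The IN= option is easiest to see in a match-merge. The sketch below assumes the weight_club data set created earlier; the second data set club_fees, its variable Fee, and its values are hypothetical and exist only for this illustration.

```sas
/* Hypothetical second data set, invented for this illustration */
data club_fees;
   input IdNumber Fee;
   datalines;
1023 50
1049 45
1300 60
;
run;

/* A match-merge needs both inputs sorted by the BY variable */
proc sort data=weight_club; by IdNumber; run;
proc sort data=club_fees;   by IdNumber; run;

data matched;
   merge weight_club (in=inClub) club_fees (in=inFees);
   by IdNumber;
   /* inClub and inFees are temporary 0/1 flags and are not written to MATCHED */
   if inClub and inFees;   /* keep only members found in both data sets */
run;
```

Changing the subsetting IF to "if inClub and not inFees;" would instead keep the members that have no row in club_fees.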

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
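A minimal sketch of such a step is shown here. The data set name person and the data values are assumptions made for illustration; only the variable names NAME and DEPT come from the text above.

```sas
/* Illustrative sketch: data set name and values are assumptions */
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```

The lone semicolon after the last data line marks the end of the data.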

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
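As a small illustration of the FORMAT statement, the sketch below reads a date with an informat and attaches a display format to it. The data set name joined and the data values are made up for this example.

```sas
/* Illustrative sketch: data set name and values are assumptions */
data joined;
   input ename $ hiredate : ddmmyy8.;   /* informat: how the value is read */
   format hiredate date9.;              /* format: how the value is written */
   datalines;
hari 12-01-05
ravi 03-11-06
;
run;

proc print data=joined;
run;
```

Without the FORMAT statement, PROC PRINT would show hiredate as the underlying SAS date number rather than in a DATE9. form such as 12JAN2005.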

INFORMAT Statement

Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;   /* declared before INPUT so that list input uses it */
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data is ended by a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   /* LENGTH comes before INPUT so that names longer than 8 characters are not truncated */
   length firstname lastname $ 15 name $ 25;
   input firstname $ lastname $ n1 n2;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method the data values should be separated by at least one space.

E.g.

-Do-

2) COLUMN INPUT METHOD

In this method each value is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign (variable=value).

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat, written as a width followed by a period, which specifies how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input record and control the input pointer across iterations of the DATA step.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   /* ':' is an assumed delimiter; the character used in the original slide is not recoverable */
   infile datalines dsd delimiter=':';
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;
run;

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;

Thanks

Feedback at infoSitworldin


E.g.

data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

SAS DATA SETS

A SAS data set consists of the following:

- descriptor information
- data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS

The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.

PROC STEP

Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

proc <procedure name>;
   statement 1;
   statement 2;
   statement n;
run;

E.g.

proc print data=temp;
run;

proc sort data=temp out=samp;
   by name;
run;

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

By default, SAS reads every value as numeric. To read character data, use the $ symbol after the variable name.

A character variable can hold up to 32,767 characters.

SAS reads data values observation by observation.

The number of observations in a SAS data set depends on the system configuration or the available hard disk space.

In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.

Different Data bases

Best db storage

- Text
- Excel
- Access
- DB2
- Oracle
- Teradata

LIBRARIES

There are 2 ways of creating libraries:

1. Menu driven
2. Through programming

1. Menu driven

Explorer

|

Right click

|

New

2. Through programming

In the Editor window:

LIBNAME <name of library> '<path>';

Example

libname Hari "D:\Ganesh";

To delete a library

libname Guru clear;
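Once a libref is assigned, it is used as the first part of a two-level data set name. The sketch below assumes the weight_club data set from earlier exists in the WORK library; the libref mylib and the folder path are placeholders, and the folder must already exist on your system.

```sas
/* 'C:\sasdata' is a placeholder path, not a path from the original slides */
libname mylib 'C:\sasdata';

/* Two-level name libref.dataset stores the data set permanently in that folder */
data mylib.weight_club;
   set work.weight_club;
run;

/* CLEAR removes the libref only; the stored files are not deleted */
libname mylib clear;
```

Data sets written with a one-level name go to the temporary WORK library and are removed when the SAS session ends.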

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- INPUT BUFFER
- PROGRAM DATA VECTOR

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The program data vector is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ has two possible values:

i) _ERROR_=0 means no error has been encountered in the current iteration.

ii) _ERROR_=1 means an error, such as invalid input data, has been encountered.
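The two automatic variables can be watched in the log with a PUT statement. The following is a sketch with made-up data; the second record deliberately contains a non-numeric age so that _ERROR_ changes.

```sas
/* Illustrative sketch: the data values are assumptions */
data _null_;
   input name $ age;
   /* writes the iteration counter, the error flag and the values read to the log */
   put _N_= _ERROR_= name= age=;
   datalines;
hari 25
ravi xx
kiran 30
;
run;
```

For the record "ravi xx" SAS cannot convert xx to a number, so it sets age to missing, sets _ERROR_ to 1 for that iteration, and writes a note about invalid data to the log. _ERROR_ goes back to 0 at the start of the next iteration.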

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the input buffer and the program data vector after DATA step compilation.

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTS

Output SAS Data Set TOTAL_POINTS: 1st observation

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase SAS:

- Checks code for errors
- Translates code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set

Execution

During the execution phase SAS:

- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

Figure: the input stack, the word scanner, and the compiler

Input stack:

data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that has special meaning to SAS, for example + - $ ( ) & =

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32



SAS DATA SETSA SAS data set consists of the following-descriptor information-data values

The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic

SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program

PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files

PROCEDURE STEP BLOCK

SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun

EG

proc print data=temp

Run

proc sort data=temp out=sampby namerun

Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- Input buffer
- Program data vector

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

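The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate. When each iteration reads one record, _N_ matches the observation number.

2) _ERROR_ signals whether an error occurred during the current iteration. It takes one of 2 values:

i) _ERROR_=0 means no error was encountered.

ii) _ERROR_=1 means an error, such as invalid data in an input record, was encountered.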
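To see the two automatic variables change, the following sketch writes them to the log on every iteration. The variable x and the three data lines are made up for illustration; the second line deliberately holds a non-numeric value.

```sas
data _null_;
  input x;                 /* x is numeric, so 'abc' is invalid data */
  put _N_= x= _ERROR_=;    /* write the iteration counter and error flag to the log */
  datalines;
10
abc
30
;
run;
```

On the second iteration SAS cannot read abc as a number, so it sets x to missing and _ERROR_ to 1 and writes an invalid-data note to the log. On the third iteration _ERROR_ is back to 0, while _N_ has counted 1, 2, 3.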

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
  input TeamName $ ParticipantName $ Event1 Event2 Event3;
  TeamTotal + (Event1 + Event2 + Event3);
  datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: the input buffer and the program data vector after DATA step compilation]

[Figure: position of the pointer in the input buffer before SAS reads data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]

[Figure: program data vector with the computed value of the Sum statement]

Writing an Observation to the SAS Data Set

[Figure: the first observation in the output SAS data set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase SAS:

- Checks the code for syntax errors
- Translates the code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns the required attributes to variables
- Creates the descriptor portion of the new data set

Execution

During the execution phase SAS:

- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
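One way to watch the PDV change from iteration to iteration is to add a PUT _ALL_ statement, which writes every variable in the PDV to the log. The sketch below reuses the TOTAL_POINTS step with only the first three data lines; adding the PUT statement is an illustration, not part of the original example.

```sas
data total_points (drop=TeamName);
  input TeamName $ ParticipantName $ Event1 Event2 Event3;
  TeamTotal + (Event1 + Event2 + Event3);   /* Sum statement: TeamTotal is retained */
  put _all_;   /* writes all PDV variables, including _N_ and _ERROR_, to the log */
  datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

The log shows TeamTotal growing from 22 to 46 to 67 while _N_ counts 1, 2, 3. The event scores are replaced on each iteration because the INPUT variables are reset to missing before the next record is read.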

[Figure: SAS processing flow. Submitted code enters the input stack, the word scanner breaks it into tokens, and the tokens are passed to the compiler.]

Program in the input stack:

data temp;
  input name $ no;
  cards;
Hari 101
;
%let list=name;
proc print data=temp;
  var &list;
run;

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
  input TName $ PName $ Event1 Event2;
  datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

- $ indicates a character format; its absence indicates a numeric format.
- format names the format.
- w specifies the format width, which for most formats is the number of columns in the output data.
- d specifies an optional decimal scaling factor in the numeric formats.

data temp;
  amount=1145.32;
  put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

- $ indicates a character informat; its absence indicates a numeric informat.
- informat names the informat.
- w specifies the informat width.
- d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
  informat edate ddmmyy8.;
  format edate date9.;
  input ename $ edate;
  cards;
hari 101007
;
run;
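The difference between reading with an informat and writing with a format can be sketched with the $1,000,000 value from the text. The variable name amount and the choice of DOLLAR12. for output are assumptions made for illustration.

```sas
data _null_;
  input amount comma11.;   /* informat: strips the $ and the commas while reading */
  put amount=;             /* stored value, written without a format */
  put amount dollar12.;    /* format: writes the value back with $ and commas */
  datalines;
$1,000,000
;
run;
```

The first PUT statement shows that the stored value is the plain number 1000000. The second shows the same number written as $1,000,000. The informat changes how the raw text is interpreted, and the format changes only how the value is displayed.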

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
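The two calls above can be run inside a small DATA step. In this sketch the values assigned to cash and credit are made up, and ROUND is added only to make the square root easier to read.

```sas
data _null_;
  cash = 1200;               /* illustrative values */
  credit = 1500;
  x = max(cash, credit);     /* largest of the arguments: 1500 */
  y = sqrt(1500);            /* square root */
  z = round(y, 0.01);        /* rounded to two decimals: 38.73 */
  put x= y= z=;
run;
```

Each function takes its arguments in parentheses, separated by commas, and returns a single value that is assigned to a variable.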

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Commonly used DATA step statements:

- INPUT (list)
- PUT
- DATALINES
- DO (iterative)
- DO UNTIL
- DO WHILE
- SELECT
- DROP
- MERGE
- SET
- FILE
- LENGTH
- Sum
- END
- OUTPUT
- KEEP
- DATA
- RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

- option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard format, the data is called standard data.

Eg: 467

NON STANDARD DATA

If the data values are not in a standard format, the data is called non-standard data.

Eg:

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
  input name $ Bdate ddmmyy8.;
  format Bdate ddmmyy8.;
  cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date value   Informat    Format
12-07-78     DDMMYY8.    DDMMYY8.
21-09-05     DDMMYY10.   DDMMYY10.
22Jan89      DATE7.      DATE7.
22jan1989    DATE9.      DATE9.

Numeric Informats

Numeric value   Informat   Format
25,000          COMMA6.    COMMA6.
$3,000          DOLLAR6.   DOLLAR6.
25,000          COMMA6.    WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements. Because IdNumber and Name are read with column input, the data lines must be aligned so that each name falls in columns 6-20.

data weight_club;
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
  datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data temp;
  set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general pattern when the raw data is stored in an external file.

data <dataset-name>;
  infile '<path>\<filename>.<extension>';
  input <var-1> <var-2> ...;
run;
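As a concrete sketch of the INFILE pattern, the step below reads the weight club data from a text file. The path C:\sasdata\weight_club.txt is hypothetical, and the file is assumed to contain the same column-aligned lines that were shown in the job stream example.

```sas
/* Hypothetical file: assumed to hold the weight club records,
   with the id in columns 1-4 and the name in columns 6-20. */
data weight_club;
  infile 'C:\sasdata\weight_club.txt';
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

proc print data=weight_club;
run;
```

The INPUT statement is the same as for in-stream data. Only the source of the records changes, so no DATALINES statement is needed.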

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
  set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. The most common options are described below.

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
  set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
  set weight_club (drop=IdNumber Name);
run;

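OBS=

Specifies the last observation to process, that is, when to stop processing observations.

data s1;
  set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Used together with OBS=, it selects a range: the following step reads observations 2 through 4.

data s1;
  set weight_club (firstobs=2 obs=4);
run;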

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
  set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
  set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
  set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
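IN= is most often used with MERGE to keep only the observations that are present in both inputs. The sketch below uses two small made-up data sets, club_a and club_b, and the hypothetical flag names inA and inB.

```sas
data club_a;
  input IdNumber Team $;
  datalines;
1023 red
1049 yellow
1219 red
;
run;

data club_b;
  input IdNumber EndWeight;
  datalines;
1049 124
1219 192
1246 177
;
run;

/* Both inputs are already in IdNumber order, as BY-group merging requires. */
data both;
  merge club_a (in=inA) club_b (in=inB);
  by IdNumber;
  if inA and inB;   /* keep only ids found in both data sets */
run;
```

The result contains ids 1049 and 1219. The IN= variables exist only during the DATA step and are not written to the output data set.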

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
  set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
  set weight_club;
  put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement: in this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
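A minimal sketch of such a step follows. The data set name person and the two data lines are illustrative values, not taken from the slides.

```sas
data person;
  input name $ dept $;   /* two character variables read with list input */
  datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation. The semicolon on a line by itself ends the data, and CARDS can be used in place of DATALINES with the same effect.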

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;
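In context, the DELETE statement usually sits inside an IF-THEN statement in a step that reads another data set. The data sets teams and not_red below are hypothetical names with made-up values.

```sas
data teams;
  input Name $ Team $;
  datalines;
David red
Amelia yellow
Alan red
Ravi yellow
;
run;

data not_red;
  set teams;
  if Team = 'red' then delete;   /* skip this observation and start the next iteration */
run;
```

When DELETE executes, SAS does not write the current observation to the output data set and returns to the top of the DATA step, so not_red keeps only the two yellow team members.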

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
  informat hiredate mmddyy8.;
  input ename $ eid hiredate;
  datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.

data biblio;
  input number citation $50.;
  datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
  set weight_club;
  keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
  set weight_club;
  label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
  input firstname $ lastname $ n1 n2;
  length name $25;
  datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step fragment demonstrates how to read input data records with column input:

data scores;
  input name $ 1-18 score1 25-27 score2 30-32;

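INPUT METHODS

1) List input method

2) Column input method

3) Named input method

4) Formatted input method

5) Absolute input method

1) LIST INPUT METHOD

In this method the data values must be separated by at least one blank space. By default, character values are limited to 8 characters and cannot contain embedded blanks.

Example: the earlier DATA steps that read blank-separated values, such as weight_club and two1, use list input.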
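A small, self-contained list input step might look like the following sketch. The data set emp and its values are made up for illustration.

```sas
data emp;
  input ename $ eid salary;   /* values are matched to variables in order */
  datalines;
hari 101 25000
ravi   102   30000
;
run;

proc print data=emp;
run;
```

The second data line uses several blanks between values, which list input treats the same as a single blank. The values do not need to line up in columns.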

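2) COLUMN INPUT METHOD

In this method each value is read from fixed column positions, so character values can be longer than 8 characters and can contain blank spaces. The data lines must be aligned to the columns named in the INPUT statement.

data temp;
  input id 1-3 name $ 7-18 age 21-22;
  cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;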

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
  input id= name= $ age=;
  cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

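4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat, that is, a width followed by a period, which specifies how many columns to read for that variable in every data line. Because the widths are fixed, shorter values must be padded with blanks so that the next value starts in the expected column.

data one;
  input id 3. name $ 11. age 3.;
  datalines;
101 praveenraj 25
102 kiranraj   23
;
run;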

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls give the exact location of the data values. @n moves the pointer to column n, and +n moves it n columns forward.

data two;
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing at sign (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

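The DSD option in the INFILE statement reads delimiter-separated values. The default delimiter is a comma, and two consecutive delimiters are treated as a missing value.

data num;
  infile datalines dsd;
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;
run;

The DELIMITER= option names a different delimiter character.

data nums;
  infile datalines dsd delimiter=':';   /* any single character; a colon is assumed here */
  input X Y Z;
  datalines;
1:2:3
4:5:6
7:8:9
;
run;

The MISSOVER option prevents SAS from moving to the next data line when a line has fewer values than the INPUT statement expects. The remaining variables are set to missing.

data weather;
  infile datalines missover;
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;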

Thanks

Feedback at infoSitworldin



SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4DATA IN A DBMS FILE

If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set

libname dblib oracle user=scott password=tiger

data employeesset dblibemployeesrun

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement

Data sampSet weight_club (Keep= IdNumber Team)Run

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

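DELETE Statement

Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step for the next iteration.

if Team="red" then delete;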

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
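The slide shows the INFORMAT statement but not the FORMAT statement. The sketch below uses both, so that the date is read with one layout and printed with another. The data set name hires and the salary column are assumptions added for illustration.

```sas
/* Hypothetical data: hires and salary are not in the original slides */
data hires;
   informat hiredate mmddyy8.;                /* how to read the raw date  */
   format hiredate date9. salary dollar10.2;  /* how to display the values */
   input ename $ hiredate salary;
   datalines;
hari 120105 45000
ravi 110306 52500
;
run;

proc print data=hires;
run;
```

The informat affects only how the raw value is converted on input; the stored value is a SAS date number, and the format controls how it appears in output.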

DATALINES4 Statement (or CARDS4)

Indicates that data lines that contain semicolons follow. The data lines end with a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one blank.

Eg: the same form as the earlier DATA steps whose INPUT statement lists only variable names.
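Since the slide only points back to earlier examples, here is a small self-contained sketch of list input. The data set name club and the values are illustrative assumptions.

```sas
/* Hypothetical data, for illustration only */
data club;
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1023 David red 189 165
1049 Amelia   yellow 145 124
1219 Alan red . 192
;
run;

proc print data=club;
run;
```

Extra blanks between values are ignored, and a single period stands in for a missing value, which keeps the remaining values lined up with their variables.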

2) COLUMN INPUT METHOD

In this method, each variable is read from fixed column positions. Character values can be longer than 8 characters and can contain embedded blanks.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

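4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable. The data values must therefore be aligned in the same columns on every line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;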

5) ABSOLUTE INPUT METHOD

In this method, column pointer controls (@n to move to column n, +n to skip n columns) give the exact location of each data value.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the current record in the input buffer across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

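data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

/* The delimiter character was lost from the original slide; '|' is assumed here. */
data nums;
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;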

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;

Thanks

Feedback at infoSitworldin


PROC STEP

Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

proc <procedure name>;
   statement 1;
   statement 2;
   statement n;
run;

Eg

proc print data=temp;
run;

proc sort data=temp out=samp;
   by name;
run;
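The two PROC steps above assume that a data set named temp already exists. The sketch below makes that assumption concrete by creating a tiny temp first; the values and the extra PROC MEANS step are illustrative additions.

```sas
/* Hypothetical input so that the PROC steps have something to work on */
data temp;
   input name $ no;
   datalines;
Ravi 102
Hari 101
Rani 103
;
run;

/* Sort into a new data set and leave temp unchanged */
proc sort data=temp out=samp;
   by name;
run;

proc print data=samp noobs;
run;

/* A simple statistics step: count and mean of the variable no */
proc means data=samp n mean;
   var no;
run;
```

Using OUT= on PROC SORT keeps the original data set in its original order, which is often the safer choice while exploring data.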

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

• By default, the INPUT statement reads every variable as numeric. To read character data, put a $ after the variable name.
• A character variable can hold up to 32,767 characters.
• SAS reads data values observation by observation.
• The number of observations in a SAS data set is limited only by system configuration and disk space.
• In output, a missing numeric value appears as a period (.) and a missing character value appears as a blank.
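A short sketch of the points above: one character variable, one numeric variable, and a missing value of each kind. The data set name types_demo and the values are assumptions for illustration.

```sas
/* Hypothetical data, for illustration only */
data types_demo;
   input name $ score;   /* name is character, score is numeric */
   datalines;
Asha 82
Ben .
. 75
;
run;

proc print data=types_demo;
run;
```

In the printed output the missing score should show as a period and the missing name as a blank, which is the behavior the last bullet describes.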

Different Data bases

Best db storage

• Text
• Excel
• Access
• DB2
• Oracle
• Teradata

LIBRARIES

There are 2 ways of creating libraries:

1. Menu driven
2. Programming (coding)

1. Menu driven

Explorer > right click > New

2. Through programming

In the Editor window, submit a LIBNAME statement:

LIBNAME <name of library> "<path>";

Example

libname Hari "D:\Ganesh";

To delete a library

The CLEAR option removes the library reference; the files in the folder are not deleted.

libname Guru clear;
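The statements above depend on a folder that exists on the author's machine. The sketch below avoids that by pointing a library at the folder SAS already uses for WORK, so it should run in any session. The libref demo and the data set club are hypothetical names.

```sas
/* Assumption: reuse the WORK folder so that no real path is needed */
libname demo "%sysfunc(pathname(work))";

/* A two-level name stores the data set in the demo library */
data demo.club;
   input name $ team $;
   datalines;
David red
Amelia yellow
;
run;

proc contents data=demo.club;
run;

/* Remove the libref; the underlying files are left in place */
libname demo clear;
```

In real use the quoted string would be a folder of your own, and data sets written there persist after the SAS session ends.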

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

• SAS statements end with a semicolon.
• You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
• You can begin SAS statements in any column of a line and write several statements on the same line.
• You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
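The rules above can be seen together in one short sketch. The data set name demo and its variables are illustrative only.

```sas
DATA demo; x = 1; y = 2;     /* several statements on one line, uppercase keyword */
   total = x
         + y;                /* one statement continued on a second line */
run;

Proc Print Data=demo; Run;   /* mixed case is accepted */
```

Free layout is convenient, but one statement per line with consistent indentation is much easier to read and debug.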

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

• A SAS name can contain from one to 32 characters.
• The first character must be a letter or an underscore (_).
• Subsequent characters must be letters, numbers, or underscores.
• Blanks cannot appear in SAS names.
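One way to try these rules is the NVALID function, which reports whether a string is a valid variable name under a given naming convention. The candidate names below are made up for illustration.

```sas
data _null_;
   length candidate $40;
   /* The first two candidates follow the rules above; the third starts */
   /* with a digit and the fourth contains a blank.                      */
   do candidate = 'total_2024', '_temp', '2024total', 'start weight';
      ok = nvalid(candidate, 'V7');   /* 1 = valid name, 0 = not valid */
      put candidate= ok=;
   end;
run;
```

The 'V7' argument asks for the traditional naming rules listed on this slide, independent of the session's VALIDVARNAME setting.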

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

• INPUT BUFFER
• PROGRAM DATA VECTOR

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

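The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during execution. It has two possible values:

i) _ERROR_=0 means that no error has been encountered in the current iteration.

ii) _ERROR_=1 means that an error, such as an input data error, has been encountered.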
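Both automatic variables can be written to the log with a PUT statement. In this sketch the second data line is deliberately not a number, so that _ERROR_ changes; the variable x and the values are illustrative.

```sas
data _null_;
   input x;
   put _n_= x= _error_=;   /* write the iteration counter, the value, and the error flag */
   datalines;
10
abc
30
;
run;
```

On the iteration that reads abc, SAS cannot convert the text to a number, so x is set to missing and _ERROR_ becomes 1 for that iteration only. It is reset to 0 when the next iteration begins.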

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the input buffer and the program data vector after DATA step compilation.

[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

[Figure: Program Data Vector with Computed Value of the Sum Statement]

Writing an Observation to the SAS Data Set

[Figure: The First Observation in Data Set TOTAL_POINTS (output SAS data set TOTAL_POINTS, 1st observation)]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

• The values of variables created by the INPUT statement are set to missing.
• The value created by the Sum statement is automatically retained.
• The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
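Because the figures are not available here, one way to watch the PDV change is to print it to the log at two points in each iteration. This sketch reuses the total_points step with fewer data lines; the two PUT statements and their label text are additions for illustration.

```sas
data total_points (drop=TeamName);
   put 'Top of iteration: ' _all_;   /* PDV before the record is read */
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   put 'After sum stmt:   ' _all_;   /* PDV after the sum statement   */
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

In the log, the variables read by INPUT should appear as missing at the top of each iteration while TeamTotal keeps its running value, which is the reset behavior listed above.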

Compilation

During the compile phase, SAS:

• Checks the code for errors
• Translates the code to machine code
• Establishes an area of memory called the input buffer, if reading raw data
• Establishes an area of memory called the program data vector
• Assigns required attributes to variables
• Creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

• Initializes the PDV to missing
• Reads data values into the PDV
• Carries out assignment statements and conditional processing
• Writes the observation in the PDV to the output SAS data set at the end of the DATA step
• Returns to the top of the DATA step
• Initializes any variables that are not read from SAS data sets to missing
• Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

Word Scanner

[Figure: the submitted program waits in the input stack; the word scanner reads it and passes tokens, starting with "data temp", to the compiler]

Program in the input stack:

data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn, grouping the characters into four kinds of tokens:

• Literal: a string of characters enclosed in quotation marks.
• Number: digits, date values, time values, and hexadecimal numbers.
• Name: a string of characters beginning with an underscore or letter.
• Special: any character or group of characters that has special meaning to SAS. Examples of special characters include + - $ ( ) & =

What Are the SAS Language Elements

• Data set options
• Informats and formats
• Functions
• Statements
• SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

• Renaming variables
• Selecting only the first or last n observations for processing
• Dropping variables from processing or from the output data set
• Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <... option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
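Several data set options can be combined on one data set, separated by spaces as the syntax above describes. The sketch below is one possible combination; the data set club, its values, and the computed variable Loss are assumptions for illustration.

```sas
/* Hypothetical input data */
data club;
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1023 David red 189 165
1049 Amelia yellow 145 124
1219 Alan red 210 192
1246 Ravi yellow 194 177
;
run;

/* Three options on the same input data set, separated by spaces */
data red_team;
   set club (keep=Name Team StartWeight EndWeight
             where=(Team='red')
             rename=(Name=Member));
   Loss = StartWeight - EndWeight;
run;

proc print data=red_team;
run;
```

Placing the options on the input data set means that only the needed variables and observations are brought into the DATA step in the first place.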

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$         indicates a character format; its absence indicates a numeric format.
format    names the format.
w         specifies the format width, which for most formats is the number of columns in the output data.
d         specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$         indicates a character informat; its absence indicates a numeric informat.
informat  names the informat.
w         specifies the informat width.
d         specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
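The COMMA11. case described above can be tried directly. This is a minimal sketch; the data set name amounts and the second data value are assumptions.

```sas
/* Hypothetical data set; the first value is the one from the text */
data amounts;
   input amount comma11.;     /* strips the dollar sign and commas on input */
   format amount dollar12.;   /* puts them back for display                 */
   datalines;
$1,000,000
$25,500
;
run;

proc print data=amounts;
run;
```

The stored value is an ordinary number, so it can be used in arithmetic even though the raw text contained a dollar sign and commas.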

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1 <, ... argument-n>)

x=max(cash,credit);
x=sqrt(1500);
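The two calls above are fragments. A runnable sketch needs a DATA step and values for cash and credit; the values below, and the extra UPCASE call, are assumptions for illustration.

```sas
data _null_;
   cash   = 1200;                /* hypothetical amounts */
   credit = 950;
   larger = max(cash, credit);   /* largest of the arguments  */
   root   = sqrt(1500);          /* square root               */
   name   = upcase('hari');      /* a character function      */
   put larger= root= name=;      /* results go to the SAS log */
run;
```

Using data _null_ with PUT is a convenient way to experiment with functions without creating a data set.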

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Commonly used DATA step statements include:

INPUT (list), PUT, DATALINES, DO, iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option    specifies one or more SAS system options that you want to change.

options nodate linesize=72;
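To confirm that an option has taken effect, its current value can be listed with PROC OPTIONS. This is a small sketch; NONUMBER is an extra option added here for illustration.

```sas
/* Suppress the date and page number in output and set the line width */
options nodate nonumber linesize=72;

/* Write the current value of LINESIZE to the log */
proc options option=linesize;
run;
```

System options stay in effect for the rest of the session, or until another OPTIONS statement changes them.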

STANDARD DATA

If the data values are in the standard format, then the data is called standard data.

Eg: 467

NON STANDARD DATA

If the data values are not in the standard format, then the data is called non-standard data.

Eg:

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
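The informats in these tables can be combined in a single INPUT statement. The sketch below reads a date and two amounts that contain punctuation; the data set name nonstd, the variable names, and the values are assumptions, and the widths are chosen to fit these particular values.

```sas
/* Hypothetical data, for illustration only */
data nonstd;
   /* The colon modifier applies the informat but still stops at the next blank */
   input name $ bdate :ddmmyy8. salary :comma7. bonus :dollar7.;
   format bdate date9. salary comma7. bonus dollar7.;
   datalines;
hari 21-10-84 25,000 $3,000
ravi 22-11-86 45,000 $2,100
;
run;

proc print data=nonstd;
run;
```

The colon modifier is a common choice when non-standard values are separated by blanks rather than aligned in fixed columns.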

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

• name
• type
• length
• informat
• format
• label
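Each attribute in the list can be set with its own statement and then inspected with PROC CONTENTS. This is a minimal sketch; the data set attrs, the variables city and joined, and the values are hypothetical.

```sas
data attrs;
   length city $20;                  /* length (and character type)      */
   label city='City of residence';   /* label                            */
   informat joined ddmmyy8.;         /* informat used when reading       */
   format joined date9.;             /* format used when displaying      */
   input city $ joined;              /* names come from the INPUT list   */
   datalines;
Chennai 21-10-05
;
run;

/* Lists name, type, length, format, informat, and label per variable */
proc contents data=attrs;
run;
```

The LENGTH statement is placed before INPUT because the first statement that mentions a variable determines its length.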

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

2. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

• SET statement
• MERGE statement

data Temp;
   set weight_club;
run;

3. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the same approach, this time with the raw data stored in an external file.

data <data set name>;
   infile 'your-input-file';   /* path\filename.extension */
   input <var 1> <var 2> ...;
run;
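The template above needs a real file to run. The sketch below first writes a small temporary file and then reads it back with INFILE, so it does not depend on any particular folder. The fileref rawfile, the data set name, and the data values are assumptions for illustration.

```sas
/* Assumption: a temporary file stands in for your own raw data file */
filename rawfile temp;

/* Step 1: create the external file */
data _null_;
   file rawfile;
   put '1023 David red 189 165';
   put '1049 Amelia yellow 145 124';
run;

/* Step 2: read the external file into a SAS data set */
data weight_club_ext;
   infile rawfile;
   input IdNumber Name $ Team $ StartWeight EndWeight;
run;

proc print data=weight_club_ext;
run;

filename rawfile clear;
```

With your own data, the INFILE statement would name the file directly in quotation marks, or a FILENAME statement would point the fileref at it.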

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;



PROCEDURE STEP BLOCK

SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun

EG

proc print data=temp

Run

proc sort data=temp out=sampby namerun

Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32
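To make the roles of w and d more concrete, here is a small sketch that writes the same value with a few common numeric formats. No data set is created (DATA _NULL_), and the formats chosen (8.2, COMMA10.2, DOLLAR10.2) are only examples.

```sas
data _null_;
   amount = 1145.32;
   put amount 8.2;          /* plain numeric: width 8, 2 decimal places */
   put amount comma10.2;    /* adds the thousands separator */
   put amount dollar10.2;   /* adds the dollar sign as well */
run;
```

The stored value of amount is the same in all three cases; only the way it is written to the log changes.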

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
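The COMMA11. case described above can be tried with a short sketch like the following. The variable name salary and the data line are invented for the example; the colon modifier asks list input to apply the informat.

```sas
data _null_;
   input salary : comma11.;   /* strips the $ and the commas while reading */
   put salary=;               /* the stored value is the plain number */
   datalines;
$1,000,000
;
run;
```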

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of SAS statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, then the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, then the data is called non-standard data.

E.g.

18-10-05
45,000
$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

name
type
length
informat
format
label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

SET statement
MERGE statement

data Temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset-name>;
   infile 'your-input-file-path\filename.extension';
   input <var-1> <var-2> ...;
run;
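Filled in for the weight club data, the template could look like the sketch below. The file path is a hypothetical placeholder, and the file is assumed to contain the same fixed-column records as the in-stream example.

```sas
/* Assumption: this path is hypothetical; point it at your own raw data file. */
data weight_club;
   infile 'C:\sasdata\weight_club.txt';
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

proc print data=weight_club;
run;
```

Only the source of the records changes; the INPUT statement is the same as when the data lines are placed in the job stream.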

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
   set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
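The slides give no example for IN=, so here is a minimal sketch of a typical use: keeping only the observations that appear in both data sets of a merge. The data sets members and points, and the variable names inMembers and inPoints, are invented for the illustration.

```sas
/* Two small hypothetical data sets, already sorted by id. */
data members;
   input id name $;
   datalines;
1 Sue
2 Jane
3 John
;
run;

data points;
   input id score;
   datalines;
2 85
3 90
4 70
;
run;

data both;
   merge members (in=inMembers) points (in=inPoints);
   by id;
   if inMembers and inPoints;   /* keep ids found in both data sets (2 and 3) */
run;
```

The IN= variables exist only during the DATA step; they are not written to the output data set.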

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
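A minimal sketch of such a step is shown here. The data set name person and the two data lines are illustrative assumptions, not taken from the slides.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```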

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method the data values should be separated by at least one space.

E.g.: same as the earlier examples, in which the values are separated by blanks.
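For reference, a small list input step could be sketched as follows. The data set name club and its values are invented; the point is only that the values are read in order and are separated by blanks.

```sas
data club;
   input id name $ age;   /* values are read left to right, blank-delimited */
   datalines;
101 hari 25
102 ravi 31
103 rani 28
;
run;

proc print data=club;
run;
```

With plain list input a character value is limited to 8 characters by default and cannot contain embedded blanks, which is what the column input method addresses.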

2) COLUMN INPUT METHOD

In this method the values are read from fixed column positions, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for its value.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method column pointer controls (@n and +n) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

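Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

/* DSD: values are separated by commas, the default DSD delimiter */
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

/* DSD with DELIMITER=: any delimiter character can be named; | is used here for illustration */
data nums;
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;

/* MISSOVER: a short record leaves the remaining variables missing */
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;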

Thanks

Feedback at infoSitworldin


E.g.

proc print data=temp;
run;

proc sort data=temp out=samp;
   by name;
run;

Data Types in SAS System

1) Numeric data (0-9)
2) Character data (A-Z)

By default, SAS reads every variable as numeric. To read character data, put the $ symbol after the variable name in the INPUT statement.
A character variable can hold up to 32,767 characters.
SAS reads data values observation by observation.
The number of observations in a SAS data set depends on the system configuration or the available hard disk space.
In output, SAS shows a missing numeric value as a period (.) and a missing character value as a blank space.
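The two data types and their missing values can be seen in a short sketch such as the one below. The data set name types_demo and the data lines are made up; in list input a single period stands for a missing value of either type.

```sas
data types_demo;
   input name $ age;   /* $ makes name character; age is numeric by default */
   datalines;
hari 25
. 30
ravi .
;
run;

proc print data=types_demo;
run;
```

In the printed output the second observation should show a blank name and the third a period for age.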

Different Databases

Best db storage

Text
Excel
Access
DB2
Oracle
Teradata

LIBRARIES

There are 2 ways of creating libraries:

1. Menu driven
2. Programming (coding)

1. Menu driven

Explorer -> Right click -> New

2. Through programming

In the Editor window:

LIBNAME <name-of-library> '<path>';

Example:

LIBNAME Hari "D:\Ganesh";

To delete a library:

LIBNAME Guru CLEAR;

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

SAS statements end with a semicolon.
You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
You can begin SAS statements in any column of a line and write several statements on the same line.
You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

A SAS name can contain from one to 32 characters.
The first character must be a letter or an underscore (_).
Subsequent characters must be letters, numbers, or underscores.
Blanks cannot appear in SAS names.
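The rules above can be tried out with a few throwaway statements like these. The data set name _team1 and the variable total_2024 are hypothetical and were chosen only because they satisfy the naming rules.

```sas
/* Mixed case and several statements on one line are both allowed. */
DATA _team1; total_2024 = 1; Run;

/* One statement may continue over several lines,
   as long as no word is split between lines. */
proc print
   data=_team1;
run;
```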

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

INPUT BUFFER
PROGRAM DATA VECTOR

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during execution:

i) _ERROR_=0 means that no error has occurred in the current iteration

ii) _ERROR_=1 means that an error (for example, invalid input data) has occurred
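One way to watch both automatic variables is to write them to the log on every iteration, as in the sketch below. The data set name check and the deliberately invalid value abc are assumptions made for the illustration.

```sas
data check;
   input id score;
   put _n_= _error_= id= score=;   /* one log line per iteration */
   datalines;
1 90
2 abc
3 75
;
run;
```

On the second iteration SAS cannot read abc as a number, so score is set to missing, _ERROR_ becomes 1, and a note about invalid data appears in the log. On the third iteration _ERROR_ is back to 0.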

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

Figure: The input buffer and the program data vector after DATA step compilation

Figure: Position of the pointer in the input buffer before SAS reads data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

Figure: Position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Figure: Program data vector with computed value of the Sum statement

Writing an Observation to the SAS Data Set

Figure: Output SAS data set TOTAL_POINTS, first observation

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

The values of variables created by the INPUT statement are set to missing.
The value created by the Sum statement is automatically retained.
The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

Checks code for errors
Translates code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the program data vector
Assigns required attributes to variables
Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
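The iteration-by-iteration behaviour described above can be observed by dumping the PDV to the log with PUT _ALL_. The sketch below is a shortened variant of the TOTAL_POINTS step; the data set name pdv_demo is an assumption, and only three records are used to keep the log short.

```sas
data pdv_demo (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);   /* Sum statement: TeamTotal is retained */
   put _all_;   /* writes every PDV variable, including _N_ and _ERROR_, to the log */
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

In the log, _N_ should run from 1 to 3 while TeamTotal accumulates to 22, 46, and 67, which shows that the Sum statement variable is retained while the INPUT variables are reset on each iteration.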

Input stack, word scanner, and compiler

When a program is submitted, its statements are placed in the input stack:

data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

Figure: the word scanner takes tokens (here DATA and TEMP) from the top of the input stack and passes them to the compiler.



Data Types in SAS System

1)Numerical Data(0-9)2)Character Data (A-Z)

SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value

Different Data bases

Best db storage

TextExcelAccessDB2OracleTera Data

LIBRARIES

There are 2 ways of creating libraries

1Menu driven

2Programming coding

1Menu driven

Explorer

|

Right click

|

New

2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

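The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during the current iteration. It takes one of two values:

i) _ERROR_=0 means that no error has been encountered.

ii) _ERROR_=1 means that an error, such as an input data error, a conversion error, or a math error, has been encountered.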
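As a rough illustration of the two automatic variables, the sketch below writes them to the log on every iteration. The data set name check, the variable score, and the data lines are assumptions made up for this example; the second data line is deliberately not a number.

```sas
/* Hypothetical data set (check) and variable (score) for illustration. */
data check;
   input score;
   put _n_= _error_= score=;   /* show the automatic variables in the log */
   datalines;
10
abc
30
;
run;
```

The second iteration should report _ERROR_=1 and a missing value for score, because "abc" cannot be read as a number. The other iterations should report _ERROR_=0, since _ERROR_ is reset at the start of each iteration.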

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: the input buffer and the program data vector after DATA step compilation. Caption: Position of the Pointer in the Input Buffer Before SAS Reads Data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record. Caption: Program Data Vector with Computed Value of the Sum Statement]

Writing an Observation to the SAS Data Set

[Figure: The First Observation in Data Set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase, SAS:

- Checks the code for errors
- Translates the code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
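One way to watch this cycle is to write the whole PDV to the log on each iteration. The sketch below reuses the TOTAL_POINTS step with only its first three data lines and adds a PUT _ALL_ statement. The PUT statement is an addition made for illustration and is not part of the original example.

```sas
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   put _all_;   /* write every PDV variable, including _N_ and _ERROR_, to the log */
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

With these three records, TeamTotal should read 22, 46, and 67 in successive iterations, while _N_ moves from 1 to 3. The retained Sum-statement variable is never reset to missing, unlike the variables read by INPUT.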

[Figure: the input stack, the word scanner, and the compiler. The input stack holds the program below. The word scanner takes tokens from the top of the stack, starting with DATA and TEMP, and passes them on to the compiler.]

data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes the following types of tokens:

- Literal: a string of characters enclosed in quotation marks.
- Number: digits, date values, time values, and hexadecimal numbers.
- Name: a string of characters beginning with an underscore or letter.
- Special: any character or group of characters that has special meaning to SAS. Examples of special characters include + - $ ( ) & =

What Are the SAS Language Elements?

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

- $ indicates a character format; its absence indicates a numeric format.
- format names the format.
- w specifies the format width, which for most formats is the number of columns in the output data.
- d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
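A minimal sketch of that read, assuming the value sits at the start of a data line. The variable name amount and the use of DATA _NULL_ are choices made for this illustration.

```sas
/* Hypothetical variable name (amount); _NULL_ avoids creating a data set. */
data _null_;
   input amount comma11.;   /* COMMA11. strips the dollar sign and commas */
   put amount=;             /* writes amount=1000000 to the log           */
   datalines;
$1,000,000
;
run;
```

The stored value is an ordinary number, so it can be used in arithmetic or written back out with a format such as DOLLAR or COMMA.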

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

- $ indicates a character informat; its absence indicates a numeric informat.
- informat names the informat.
- w specifies the informat width.
- d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   /* declare the informat before INPUT so that list input uses it */
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
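To try the two calls, a DATA _NULL_ step can assign sample values and write the results to the log. The values given to cash and credit below are hypothetical, and the variable names larger and root are introduced only for this sketch.

```sas
data _null_;
   cash = 1200;                  /* hypothetical sample values */
   credit = 950;
   larger = max(cash, credit);   /* the larger of the two arguments */
   root = sqrt(1500);            /* square root of a constant       */
   put larger= root=;
run;
```

Function arguments can be variables, constants, or expressions, and the returned value can be assigned to a variable or used inside a larger expression.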

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Commonly used DATA step statements include INPUT (list), PUT, DATALINES, DO (iterative, UNTIL, and WHILE), SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, and RETAIN.

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

- option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

When data values are in a form that SAS can read without special instructions, the data is called standard data.

E.g. 467

NON STANDARD DATA

If data values are not in the standard form, then the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25000        COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general pattern when the raw data is stored in an external file.

data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from the data set weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

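OBS=

Specifies the last observation that SAS processes, that is, when to stop processing observations.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. In this example, SAS reads observations 2 through 4.

data s1;
   set weight_club (obs=4 firstobs=2);
run;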

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
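The sketch below shows one common use of IN=: flagging which input data set contributed to each merged observation. The data sets members and payments, their variables, and the flag names inMembers and inPayments are all hypothetical, made up for this illustration.

```sas
/* Hypothetical data sets, already in IdNumber order as BY requires. */
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
;
run;

data paid unpaid;
   merge members (in=inMembers) payments (in=inPayments);
   by IdNumber;
   /* each IN= variable is 1 when its data set contributed to this observation */
   if inMembers and inPayments then output paid;
   else if inMembers then output unpaid;
run;
```

With this data, PAID should hold the observations for 1023 and 1219, and UNPAID the one for 1049. The IN= variables are temporary and are not written to either output data set.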

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

A typical use reads each data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
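A minimal sketch of such a step follows. The data set name person and the two data lines are assumptions made for illustration, since only the variable names NAME and DEPT are given.

```sas
/* Hypothetical data set name (person) and data values. */
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

The line containing only a semicolon ends the data, and RUN ends the step. Writing CARDS in place of DATALINES behaves the same way.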

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   /* declare the informat before INPUT so that list input uses it */
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. A line of four semicolons marks the end of the data.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This fragment of a DATA step shows how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

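1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one space.

E.g. the earlier DATA steps that list only variable names in the INPUT statement, such as input name $ dept $, use list input.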
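A small sketch of list input follows. The data set name club, its variables, and the data lines are hypothetical, chosen only for illustration.

```sas
/* Hypothetical data set (club); values are separated by one or more blanks. */
data club;
   input Name $ Team $ Weight;
   datalines;
David red 189
Amelia   yellow   145
Alan red 210
;
run;

proc print data=club;
run;
```

List input does not need aligned columns, so extra blanks between values are harmless. By default a character variable read this way holds at most 8 characters and cannot contain embedded blanks.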

2) COLUMN INPUT METHOD

In this method, each value is read from fixed column positions, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat, a width followed by a period, that specifies how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n to move to column n, +n to skip n columns) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

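The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the same record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

The DSD option reads delimiter-separated values, using a comma as the default delimiter. The DELIMITER= option names a different delimiter. The delimiter characters of the second step are missing from this copy, so its option value and data line are left as they appear.

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   infile datalines dsd delimiter= ;
   input X Y Z;
   datalines;
123456789
;
run;

The MISSOVER option stops SAS from moving to the next data line when a line has fewer values than the INPUT statement expects; the remaining variables are set to missing.

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;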

Thanks

Feedback at infoSitworldin


RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4DATA IN A DBMS FILE

If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set

libname dblib oracle user=scott password=tiger

data employeesset dblibemployeesrun

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement

Data sampSet weight_club (Keep= IdNumber Team)Run

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5) Absolute input method

In this method column pointer controls (@n and +n) give the exact location of each data value.

data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing at sign (@@) to hold the current record, so that the next iteration of the DATA step continues reading from the same line.

data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

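The DSD option in the INFILE statement reads comma-delimited values. DELIMITER= names a different delimiter.

data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
infile datalines dsd delimiter='|';   /* delimiter assumed; the original character is not legible in the slide */
input X Y Z;
datalines;
1|2|3
4|5|6
7|8|9
;
run;

The MISSOVER option sets the remaining variables to missing when a data line runs out of values, instead of reading the next line.

data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;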

Thanks

Feedback at infoSitworldin


LIBRARIES

There are two ways of creating libraries:

1. Menu driven
2. Through programming

1. Menu driven

Explorer > Right click > New

2. Through programming

In the Editor window, submit a LIBNAME statement:

LIBNAME <name of library> "<path>";

Example

LIBNAME Hari "D:\Ganesh";

To delete a library (clear the library reference)

LIBNAME Guru CLEAR;

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
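The rules above can be seen in a short sketch. The data set name rules_demo and the variables are illustrative assumptions.

```sas
/* Keywords are not case sensitive: DATA/RUN and proc/run both work. */
DATA rules_demo;
   x = 1; y = 2;          /* several statements on one line       */
   total = x
         + y;             /* one statement continued on two lines */
RUN;

proc print data=rules_demo; run;
```

Every statement ends with a semicolon, and the line break inside the assignment falls between words, never inside one.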

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- Input buffer
- Program data vector (PDV)

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during the current iteration:

i) _ERROR_=0 means that no error has occurred.

ii) _ERROR_=1 means that an error, such as invalid input data, has occurred.
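A small sketch can make the two automatic variables visible in the log. The data set name autovars and the data lines are illustrative assumptions; the second line deliberately contains invalid numeric data.

```sas
/* Write _N_ and _ERROR_ to the log on every iteration. */
data autovars;
   input id score;
   put _n_= _error_= id= score=;
   datalines;
1 90
2 abc
3 75
;
run;
```

In this sketch _N_ takes the values 1, 2, and 3. _ERROR_ is 0 on the first and third iterations and 1 on the second, where "abc" cannot be read as a number and score is set to missing.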

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: Input buffer and program data vector after DATA step compilation]

[Figure: Position of the pointer in the input buffer before SAS reads data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: Position of the pointer in the input buffer and values in the PDV after SAS reads the first record]

[Figure: Program data vector with computed value of the Sum statement]

[Figure: Writing an observation to the SAS data set: the first observation in data set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
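The retain behavior of the Sum statement can be sketched by placing it next to an explicit RETAIN. The data set name running and the variable names are illustrative assumptions.

```sas
/* Sum statement versus an explicit RETAIN with the SUM function. */
data running;
   input points;
   total_sum + points;                  /* Sum statement: starts at 0 and is retained */
   retain total_ret 0;                  /* explicit equivalent                        */
   total_ret = sum(total_ret, points);
   put _n_= points= total_sum= total_ret=;
   datalines;
6
9
7
;
run;
```

Both totals read 6, 15, and 22 across the three iterations, while points is reset to missing at the top of each iteration before the INPUT statement fills it again.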

Compilation

During the compilation phase SAS

- checks the code for errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector (PDV)
- assigns required attributes to variables
- creates the descriptor portion of the new data set

Execution

During the execution phase SAS

- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

[Figure: the input stack feeds the word scanner, which passes tokens to the compiler. In the second panel the tokens "data temp" have reached the compiler and the rest of the program is still in the input stack.]

data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that has special meaning to SAS, for example + - $ ( ) & =

What Are the SAS Language Elements?

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
amount=1145.32;
put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
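A minimal sketch of that COMMA11. read follows. The data set name money and the variable name amount are illustrative assumptions.

```sas
/* Read a value with a dollar sign and commas as a plain number. */
data money;
   input amount comma11.;
   put amount=;
   datalines;
$1,000,000
;
run;
```

The PUT statement writes amount=1000000 to the log, which shows that the dollar sign and commas were stripped before the value was stored.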

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
informat edate ddmmyy8.;
format edate date9.;
input ename $ edate;
cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
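The two calls above only run inside a step that defines their arguments. The sketch below supplies hypothetical values for cash and credit; the data set name funcdemo and the result variable names are also assumptions.

```sas
/* Function calls inside a DATA step; input values are made up. */
data funcdemo;
   cash = 250;
   credit = 400;
   larger = max(cash, credit);   /* the larger of the two arguments */
   root = sqrt(1500);            /* square root of a constant       */
   put larger= root=;
run;
```

Each function returns a single value, so the result can be assigned to a variable or used directly in an expression.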

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Commonly used DATA step statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN.

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard format, the data is called standard data.

E.g. 467

NON-STANDARD DATA

If the data values are not in a standard format, the data is called non-standard data.

E.g.

18-10-05
45,000
$21,000

Informats are used to read non-standard data.

data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS20.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form when the raw data is stored in an external file.

data <dataset name>;
infile 'your-input-file path\filename.extension';
input <var 1> <var 2> ...;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data Temp;
set weight_club;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. The most commonly used options are described below.

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from the input data set.

data samp;
set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Combined with OBS=, the following step reads observations 2 through 4.

data s1;
set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
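The slides give no example for IN=, so here is a minimal sketch. The data sets members and payments, their values, and the flag names inMembers and inPayments are all illustrative assumptions.

```sas
/* Two small data sets, both already in IdNumber order. */
data members;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
1219 red
;
run;

data payments;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
;
run;

/* IN= creates temporary 0/1 flags that show which data set */
/* contributed to the current observation.                  */
data paid unpaid;
   merge members (in=inMembers) payments (in=inPayments);
   by IdNumber;
   if inMembers and inPayments then output paid;
   else if inMembers then output unpaid;
run;
```

In this sketch members 1023 and 1219 go to paid and member 1049 goes to unpaid. The IN= variables exist only during the step and are not written to either output data set.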

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
set weight_club;
run;

When Not Creating a Data Set

data _null_;
set weight_club;
put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (newer name for CARDS)

Indicates that data lines follow.

Using the DATALINES Statement

In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
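The code for that example is not present in the slide text. A minimal sketch of such a step could look like the following; the data set name person and the data values are assumptions.

```sas
/* Each data line supplies one NAME and one DEPT value. */
data person;
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

The single semicolon on its own line ends the data, and each data line becomes one observation.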

DELETE Statement

Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step.

if Team="red" then delete;



2Through programming

Editor window

LIBNAME ltName of librarygt ltpathgt

LIBNAME Hari ldquoDGaneshrdquo

Example

To delete Library

Libname Guru clear

RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements

_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation that SAS processes.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Combined with OBS=, the following step reads observations 2 through 4.

data s1;
   set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
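The IN= option is easiest to see in a merge. The sketch below is illustrative only: the data sets members and payments, the flag names inMem and inPay, and all values are hypothetical.

```sas
/* Hypothetical data sets, already in IdNumber order */
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
;
run;

/* inMem and inPay are temporary 0/1 flags created by IN= */
data paid_members;
   merge members (in=inMem) payments (in=inPay);
   by IdNumber;
   if inMem and inPay;   /* keep only IDs found in both data sets */
run;
```

The IN= variables exist only while the DATA step runs; they are not written to the output data set.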

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

A typical use of the DATALINES statement reads one data line per observation and assigns values to two character variables, NAME and DEPT.
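A minimal sketch of such a step follows. The data set name person and the data values are assumptions chosen for illustration.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line becomes one observation, and the lone semicolon after the last data line ends the data.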

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
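As a hedged illustration of the FORMAT statement, the step below attaches a currency format and a date format to two variables. The data set sales, its variables, and its values are hypothetical.

```sas
data sales;
   input item $ amount saledate :date9.;          /* :date9. reads the date with an informat */
   format amount dollar10.2 saledate ddmmyy10.;   /* formats control how values are displayed */
   datalines;
pen 1250.5 01JAN2020
ink 300 15FEB2020
;
run;

proc print data=sales;
run;
```

The stored values are unchanged; only their appearance in the printed output is affected.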

INFORMAT Statement

Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. Four consecutive semicolons mark the end of the data.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step fragment demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List input method

2) Column input method

3) Named input method

4) Formatted input method

5) Absolute input method

1) LIST INPUT METHOD

In this method the data values must be separated by at least one blank.
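A short sketch of list input, using hypothetical single-word values so that every field is separated by one blank:

```sas
data club;
   /* list input: variables are read in order, separated by blanks */
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1023 David red 189 165
1049 Amelia yellow 145 124
1219 Alan red 210 192
;
run;

proc print data=club;
run;
```

With simple list input, a character value cannot contain embedded blanks and is stored with a default length of 8 bytes, which is why the next method exists.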

2) COLUMN INPUT METHOD

In this method character values can be longer than 8 characters and can contain embedded blanks, because each variable is read from a fixed range of columns.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

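4) FORMATTED INPUT METHOD

In this method each variable name is followed by an informat (a width followed by a period) that tells SAS how many columns to read for that variable in every data line. The data values must therefore line up in the same columns on each line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns forward.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;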

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the current record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

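The DSD option in the INFILE statement reads delimited data, with a comma as the default delimiter.

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

The DELIMITER= option names a different delimiter (a colon in this example).

data nums;
   infile datalines dsd delimiter=':';
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;
run;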

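The MISSOVER option sets the remaining variables to missing when a data line has fewer values than the INPUT statement expects, instead of reading the next line.

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;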

Thanks

Feedback at infoSitworldin

RULES FOR SAS STATEMENTS

There are only a few rules for writing SAS statements:

- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
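These rules can be seen in a few lines. The data set name demo and its variables are hypothetical.

```sas
/* Mixed case, with several statements on one line */
DATA demo; x = 1; Y = 2; Run;

/* One statement continued onto a second line; no word is split */
proc print
   data=demo;
run;
```

Both steps run because SAS relies on the semicolon, not on line breaks or letter case, to find the end of each statement.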

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names, variable names, and other items. The following rules apply:

- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

- Input buffer
- Program data vector

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during the current iteration. It has two possible values:

i) _ERROR_=0 means that no error has been encountered.

ii) _ERROR_=1 means that an error, such as an input data error, has occurred.
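Both automatic variables can be inspected by writing them to the log. In this sketch the data set check and its values are hypothetical, and the second data line is deliberately invalid for the numeric variable score.

```sas
data check;
   input id score;
   put _n_= _error_= id= score=;   /* write the automatic variables to the log */
   datalines;
1 90
2 abc
3 75
;
run;
```

On the second iteration SAS cannot convert abc to a number, so it sets score to missing and _ERROR_ to 1; on the third iteration _ERROR_ is back to 0. Neither _N_ nor _ERROR_ is written to the output data set.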

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: the input buffer and the program data vector after DATA step compilation]

[Figure: position of the pointer in the input buffer before SAS reads data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]

[Figure: program data vector with the computed value of the Sum statement]

Writing an Observation to the SAS Data Set

[Figure: output SAS data set TOTAL_POINTS, first observation]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
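The retained accumulator can be watched in the log. This is a cut-down sketch of the same idea, with a hypothetical data set name total_demo and three data lines only.

```sas
data total_demo;
   input Name $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);   /* Sum statement: TeamTotal is retained */
   put _n_= Name= TeamTotal=;                /* show the running total on each iteration */
   datalines;
Sue 6 8 8
John 7 7 7
Lisa 8 9 9
;
run;
```

TeamTotal is 22 after the first record, 43 after the second, and 69 after the third, while Name and the Event variables are reset to missing at the start of each iteration before being read again.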

Compilation

During the compilation phase SAS:

- Checks code for errors
- Translates code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set

Execution

During the execution phase SAS:

- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

Word Scanner

[Figure: the submitted program waits in the input stack; the word scanner takes tokens from it and passes them to the compiler]

Input stack:

data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

[Figure: the same program after the word scanner has taken the tokens data and temp from the input stack and sent them to the compiler]

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
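To make the token classes concrete, the comments in the short program below label each token the word scanner would produce. The program itself is a hypothetical example, and the labels are an informal annotation rather than SAS output.

```sas
data tokens;            /* data (name)  tokens (name)  ; (special)                    */
   x = 10 + 2;          /* x (name)  = (special)  10 (number)  + (special)            */
                        /* 2 (number)  ; (special)                                    */
   msg = 'two words';   /* msg (name)  = (special)  'two words' (literal)  ; (special) */
run;                    /* run (name)  ; (special)                                    */
```

Note that the quoted string is a single literal token even though it contains a blank.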

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1 <, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, then the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, then the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.

Numeric Informats

Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
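The sketch below puts several rows of these tables to work in one step. The data set pay, its variable names, and the single data line are hypothetical; the colon modifier is used so that each informat is applied to a blank-delimited value.

```sas
data pay;
   /* informats strip the commas and dollar sign and convert the date */
   input name $ salary :comma6. bonus :dollar6. hired :date9.;
   /* formats put the punctuation back when the values are displayed */
   format salary comma6. bonus dollar6. hired ddmmyy10.;
   datalines;
hari 25,000 $3,000 22jan1989
;
run;

proc print data=pay;
run;
```

Internally salary and bonus are stored as plain numbers and hired as a SAS date value; the formats affect only how they are printed.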

RULES FOR MOST SAS NAMES

SAS names are used for SAS data set names variable names and other items The following rules apply

_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names

DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
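The two function calls above can be tried in a complete step. This is only a sketch: the values assigned to cash and credit, the result variable names, and the argument 1600 are hypothetical, chosen so that the results are easy to check by hand.

```sas
data _null_;
   cash=1200;                 /* hypothetical values */
   credit=1500;
   larger=max(cash, credit);  /* MAX returns the largest argument */
   root=sqrt(1600);           /* SQRT returns the square root */
   put larger= root=;         /* write both results to the log */
run;
```

The log should show larger=1500 root=40.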

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard format, the data is called standard data.

E.g.: 467

NON STANDARD DATA

If the data values are not in a standard format, the data is called non-standard data.

E.g.:

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

name, type, length, informat, format, label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form of a DATA step that reads raw data stored in an external file:

data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

SET statement

MERGE statement

data temp;
   set weight_club;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation that SAS processes in a data set.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies the first observation that SAS processes in a data set. In this example, SAS reads observations 2 through 4.

data s1;
   set weight_club (firstobs=2 obs=4);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
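No example accompanies the IN= option in the slides, so the following is a minimal sketch under stated assumptions. The data sets club_a and club_b, their values, and the flag names inA and inB are all hypothetical. Both data sets are entered in IdNumber order, so no sorting step is needed before the MERGE.

```sas
/* Two small, hypothetical data sets, already in IdNumber order */
data club_a;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;

data club_b;
   input IdNumber EndWeight;
   datalines;
1049 124
1219 192
;

data both;
   /* inA and inB are temporary 0/1 flags; they are not written to BOTH */
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   /* keep only the IDs that appear in both data sets */
   if inA and inB;
run;
```

With these values, BOTH should contain a single observation, for IdNumber 1049, because that is the only ID present in both inputs.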

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
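The code for this example does not appear in the slides. A minimal sketch that matches the description might look like the following; the data set name person and the two data lines are assumptions.

```sas
data person;
   /* list input: NAME and DEPT are both character variables */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
```

Each data line produces one observation, and the single semicolon on its own line marks the end of the data.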

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one blank space.

E.g.: same as the preceding examples
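A minimal sketch of list input follows. The data set name members and the data values are assumptions, modeled on the weight_club data used elsewhere in these slides.

```sas
data members;
   /* values are separated by blanks; $ marks a character variable */
   input IdNumber Team $ StartWeight EndWeight;
   datalines;
1023 red 189 165
1049 yellow 145 124
;

/* list the resulting observations */
proc print data=members;
run;
```

Because list input scans for the next non-blank value, the columns do not need to line up, but no value may contain an embedded blank.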

2) COLUMN INPUT METHOD

In this method, each value is read from fixed column positions, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat (a width followed by a period) that specifies the number of columns to read for that variable.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n and +n) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold a record in the input buffer across iterations of the DATA step.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
   /* the delimiter character is missing in the source; ':' is assumed here */
   infile datalines dsd delimiter=':';
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin

DATA STEP PROCESSING

The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

Input buffer

Program data vector

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The program data vector is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during execution. It has two possible values:

i) _ERROR_=0 means that no error has occurred.

ii) _ERROR_=1 means that an error, such as a data error, has occurred.

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: Input buffer and program data vector after DATA step compilation]

[Figure: Position of the pointer in the input buffer before SAS reads data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: Position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]

[Figure: Program data vector with computed value of the Sum statement]

Writing an Observation to the SAS Data Set

[Figure: The first observation in output SAS data set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

The values of variables created by the INPUT statement are set to missing.

The value created by the Sum statement is automatically retained.

The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
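The retained Sum statement value and the incrementing _N_ can be observed by writing the PDV values to the log on each iteration. The step below is a sketch for illustration only; the variable names, the three data lines, and the use of DATA _NULL_ are assumptions and are not part of the TOTAL_POINTS example.

```sas
data _null_;
   input Event1 Event2;
   /* Sum statement: Total starts at 0 and is retained across iterations */
   Total + (Event1 + Event2);
   /* write the current PDV values to the log */
   put _N_= Event1= Event2= Total= _ERROR_=;
   datalines;
6 8
9 7
7 7
;
```

The log should show Total growing from 14 to 30 to 44 while _N_ counts 1, 2, 3 and _ERROR_ stays at 0.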

Compilation

Checks code for errors

Translates code to machine code

Establishes an area of memory called the input buffer, if reading raw data

Establishes an area of memory called the program data vector

Assigns required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

Initializes the PDV to missing

Reads data values into the PDV

Carries out assignment statements and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the DATA step

Returns to the top of the DATA step

Initializes any variables that are not read from SAS data sets to missing

Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.

Input stack, word scanner, and compiler

[Figure: The submitted program waits in the input stack before it passes to the word scanner and the compiler. The input stack contains:]

data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;

[Figure: The word scanner has taken the tokens "data temp" from the input stack and passes them to the compiler; the remaining statements are still in the input stack]

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:

Literal: a string of characters enclosed in quotation marks

Number: digits, date values, time values, and hexadecimal numbers

Name: a string of characters beginning with an underscore or letter

Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =
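As a small illustration of tokenization, the VAR statement in the sketch below is split by the word scanner into six tokens. The data set wide and its variables are hypothetical and exist only so that the statement can run.

```sas
/* Build a hypothetical data set with variables x1-x10 and z */
data wide;
   array x{10} x1-x10;
   do i=1 to 10;
      x{i}=i;
   end;
   z=0;
   drop i;
run;

proc print data=wide;
   /* The next statement contains six tokens:        */
   /*   var   x1   -   x10   z   ;                   */
   /* name tokens: var, x1, x10, z                   */
   /* special tokens: - and ;                        */
   var x1-x10 z;
run;
```

Blanks only separate tokens; they are not tokens themselves, which is why x1-x10 scans the same way with or without spaces around the hyphen.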

What Are the SAS Language Elements?

Data set options

Informats and formats

Functions

Statements

SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

OVERVIEW OF THE DATA STEP

The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action

DURING THE COMPILE PHASE

When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items

INPUT BUFFER PROGRAM DATA VECTOR

INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.

format names the format.

w specifies the format width, which for most formats is the number of columns in the output data.

d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.

informat names the informat.

w specifies the informat width.

d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   informat edate ddmmyy8.;   /* declared before INPUT, the conventional placement */
   input ename $ edate;
   format edate date9.;
   cards;
hari 101007
;
run;
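The COMMA11. case described above can be tried directly. This is a minimal sketch rather than slide content; the variable name amount is an assumption.

```sas
data _null_;
   input amount comma11.;   /* COMMAw.d strips the dollar sign and the commas */
   put amount=;             /* the log should show amount=1000000 */
   datalines;
$1,000,000
;
run;
```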

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard format, the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in a standard format, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
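Two rows of the table can be checked with a short DATA step. This is a sketch that is not from the slides; d1 and d2 are made-up variable names.

```sas
data _null_;
   input d1 ddmmyy8. +1 d2 date9.;   /* columns 1-8, skip one column, columns 10-18 */
   put d1= d2=;                      /* stored values: days since 01JAN1960 */
   put d1= ddmmyy10. d2= date9.;     /* the same values written with formats */
   datalines;
12-07-78 22jan1989
;
run;
```

The first PUT shows that SAS stores dates as plain numbers; the second shows that a format only changes how they are written. How the two-digit year 78 is expanded depends on the YEARCUTOFF= system option.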

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

name
type
length
informat
format
label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

SET statement
MERGE statement

data temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation that SAS processes in a data set.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Combined with OBS=, the following step reads observations 2 through 4.

data s1;
   set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
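The slides give no example for IN=. A minimal sketch follows; the data sets club and fees, and the flag names inClub and inFees, are all hypothetical.

```sas
data club;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
1219 red
;
run;

data fees;                         /* hypothetical second data set */
   input IdNumber Paid;
   datalines;
1023 50
1219 75
1300 20
;
run;

data both;
   merge club (in=inClub) fees (in=inFees);
   by IdNumber;                    /* both inputs are already in IdNumber order */
   if inClub and inFees;           /* keep only IDs that appear in both data sets */
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to the output data set. Here only IdNumber 1023 and 1219 are kept.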

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
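The example that this slide describes did not survive in the transcript. A minimal sketch consistent with the description is given here; the data set name person and the data values are assumptions.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```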

DELETE Statement

Stops processing the current observation

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;   /* declared before INPUT, the conventional placement */
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
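The FORMAT statement has no example of its own on the slide. Here is a minimal sketch, assuming a made-up data set hires with a salary column (neither is from the original).

```sas
data hires;
   informat hiredate mmddyy8.;                 /* how to read the raw date */
   format hiredate date9. salary dollar10.2;   /* how to write the stored values */
   input ename $ hiredate salary;
   datalines;
hari 12/01/05 1145.32
ravi 11/03/06 980
;
run;

proc print data=hires;   /* the listing uses the formats stored with the data set */
run;
```

Because the formats are stored in the descriptor portion of hires, later steps that print the data pick them up without repeating the FORMAT statement.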

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data then ends with a line of four consecutive semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   length firstname lastname $15;   /* added so that "Alexander" is not cut to the default 8 bytes */
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one blank space.

E.g. same as the earlier examples
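For reference, a self-contained list input sketch. The data set name list_demo and its values are made up for illustration.

```sas
data list_demo;
   input id name $ age;   /* values are taken in order, separated by blanks */
   datalines;
101 shiva 38
102   ravi   41
;
run;
```

Extra blanks between values are harmless, as the second data line shows. With plain list input a character value cannot contain a blank and is stored in 8 bytes unless a longer length is declared.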

2) COLUMN INPUT METHOD

In this method, each variable is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat, that is, a width followed by a period, which specifies how many columns to read for its value.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n for an absolute column, +n for a relative move) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

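The INPUT statement uses the double trailing at sign (@@) to hold the input pointer on the current record across iterations of the DATA step.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

data num;
   infile datalines dsd;   /* DSD: comma-delimited values */
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   infile datalines dsd delimiter='|';   /* the delimiter was lost from the slide; '|' is assumed here */
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;

data weather;
   infile datalines missover;   /* MISSOVER: a short record leaves the remaining variables missing */
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;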

Thanks

Feedback at infoSitworldin


DURING THE COMPILE PHASE

When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:

INPUT BUFFER

PROGRAM DATA VECTOR

INPUT BUFFER

The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

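The PDV contains two automatic variables:

1) _N_ counts the number of times the DATA step has begun to iterate.

2) _ERROR_ signals whether an error occurred during the current iteration. It takes one of two values:

i) _ERROR_=0 means that no error was encountered.

ii) _ERROR_=1 means that an error, such as an input data error, was encountered.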
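Both automatic variables can be written to the log. The following is a sketch, not slide content; the second data line is deliberately invalid for the numeric variable age.

```sas
data _null_;
   input age;
   put _N_= age= _ERROR_=;   /* automatic variables can be used like any other variable */
   datalines;
25
abc
30
;
run;
```

For the second record SAS should write an "Invalid data" note, set age to missing and set _ERROR_ to 1. On the third iteration _ERROR_ is back to 0.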

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the input buffer and the program data vector after DATA step compilation.

[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

[Figure: Program Data Vector with Computed Value of the Sum Statement]

Writing an Observation to the SAS Data Set

[Figure: The First Observation in Data Set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

The values of variables created by the INPUT statement are set to missing.

The value created by the Sum statement is automatically retained.

The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
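The reset rules can be observed by writing the PDV to the log at the top of each iteration, before INPUT runs. This is a minimal sketch with a single made-up score per record instead of the three events used in total_points.

```sas
data _null_;
   put 'Start of iteration: ' _N_= Event1= TeamTotal=;
   input Event1;
   TeamTotal + Event1;   /* sum statement: TeamTotal starts at 0 and is retained */
   datalines;
6
9
;
run;
```

At the start of every iteration Event1 should be missing (shown as a period), while TeamTotal carries over as 0, 6 and then 15. The third iteration begins but ends at the INPUT statement because no record is left.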

Compilation

Checks code for errors

Translates code to machine code

Establishes an area of memory called the input buffer, if reading raw data

Establishes an area of memory called the program data vector

Assigns required attributes to variables

Creates the descriptor portion of the new data set



INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement

PROGRAM DATA VECTOR (PDV)

Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation

The PDV contain two automatic variables

1) _N_ It gives information about variables and observations

2) _ERROR_ classified as 2 types

i) If _error_=0 means no error in program

ii) _error_=1 means there are errors in program

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements?

Data set options

Informats and formats

Functions

Statements

SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.

format names the format.

w specifies the format width, which for most formats is the number of columns in the output data.

d specifies an optional decimal scaling factor in the numeric formats.

data temp;
amount=1145.32;
put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
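A minimal sketch of that conversion, using the INPUT function so that no raw data file is needed. The variable name amount is an assumption made for illustration.

```sas
/* Sketch only: the variable name is made up for illustration. */
data _null_;
  /* COMMA11. strips the dollar sign and commas while reading the text. */
  amount = input('$1,000,000', comma11.);
  put amount=;   /* should write amount=1000000 to the log */
run;
```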

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.

informat names the informat.

w specifies the informat width.

d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
input ename $ edate;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of DATA step statements:

INPUT, LIST, PUT, DATALINES, DO, iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

Eg 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

Eg

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
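The pairing of an informat (for reading) with a format (for writing) can be sketched as follows. The variable name bdate is hypothetical; the informat and formats are the ones listed in the table.

```sas
/* Sketch only: bdate is a made-up variable name. */
data _null_;
  /* The DATE9. informat converts the text into a SAS date value,
     which is stored as the number of days since 01JAN1960. */
  bdate = input('22jan1989', date9.);
  put bdate=;            /* unformatted numeric date value */
  put bdate date9.;      /* written back in DATE9. form */
  put bdate ddmmyy10.;   /* same stored value written in day-month-year order */
run;
```

The stored value is the same in all three PUT statements; only the format used to display it changes.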

Numeric Informats

Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

name

type

length

informat

format

label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

SET statement

MERGE statement

data Temp;
set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset name>;
infile 'your-input-file path/filename.extension';
input <var 1> <var 2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations

data s1;
set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Used together with OBS=, the following step reads observations 2 through 4.

data s1;
set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable

data two (rename=(name=Pname));
set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file

data two1 (pw=ram);
set weight_club;
run;

WHERE=

Selects observations that meet the specified condition

data tmp;
set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
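The IN= option is most often seen with MERGE. The sketch below is an illustration only; the data sets a and b, their variables, and the flag names inA and inB are all assumptions, not part of the original slides.

```sas
/* Sketch only: data sets, variables, and flag names are made up. */
data a;
  input id x;
  datalines;
1 10
2 20
;

data b;
  input id y;
  datalines;
2 200
3 300
;

data both;
  /* inA and inB are temporary 0/1 flags; they are not written to the output. */
  merge a (in=inA) b (in=inB);
  by id;
  /* Keep only the ids that appear in both input data sets. */
  if inA and inB;
run;
```

Both inputs are already in id order, which the BY statement requires. With these values only id 2 appears in both data sets, so only that observation should be kept.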

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1;
set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
set weight_club;
put Name;
run;

CARDS Statement

Indicates that data lines follow

DATALINES Statement (New version)

Indicates that data lines follow

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
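A minimal sketch consistent with that description might look like the following. The data set name person and the data values are assumptions chosen for illustration.

```sas
/* Sketch only: data set name and values are made up for illustration. */
data person;
  /* Two character variables read with list input, one observation per line. */
  input name $ dept $;
  datalines;
John Sales
Mary Acctng
;
run;
```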

DELETE Statement

Stops processing the current observation

if Team="red" then delete;

FORMAT Statement

Associates formats with variables
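As a hedged illustration of the FORMAT statement, the sketch below attaches display formats to two variables. The data set pay, its variables, and the values are hypothetical.

```sas
/* Sketch only: data set, variables, and values are made up. */
data pay;
  /* The colon modifier lets list input use an informat for hiredate. */
  input ename $ salary hiredate : mmddyy8.;
  /* FORMAT changes how values are displayed, not how they are stored. */
  format salary dollar10.2 hiredate date9.;
  datalines;
hari 45000 12/01/05
ravi 52000 11/03/06
;
run;

proc print data=pay;
run;
```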

INFORMAT Statement

Associates informats with variables

data two1;
input ename $ eid hiredate;
informat hiredate mmddyy8.;
datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons follow

data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets

data average;
set weight_club;
keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables

data rtest;
set weight_club;
label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables

data testlength;
input firstname $ lastname $ n1 n2;
length name $25;
datalines;
Alexander Robinson 35 11
;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores;
input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one blank space.

EG

-Do-
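A minimal list-input sketch, assuming a hypothetical data set listdemo with made-up values:

```sas
/* Sketch only: data set name and values are made up for illustration. */
data listdemo;
  /* No columns or informats: values are read in order, separated by blanks. */
  input id name $ age;
  datalines;
101 hari 25
102   ravi    31
;
run;
```

The second data line uses several blanks between values to show that list input does not depend on column positions.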

2) COLUMN INPUT METHOD

In this method, each value is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks.

data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable.

data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n and +n) give the exact location of the data values.

data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to control the input pointer across iterations of the DATA step.

data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
/* Assumption: the delimiter character is missing from the source; '|' is used here. */
infile datalines dsd delimiter='|';
input X Y Z;
datalines;
1|2|3
4|5|6
7|8|9
;

data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin

The PDV contains two automatic variables

1) _N_ counts the number of times the DATA step has begun to iterate

2) _ERROR_ takes one of two values

i) _ERROR_=0 means no error has occurred during the current iteration

ii) _ERROR_=1 means an error, such as invalid input data, has occurred during the current iteration
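Both automatic variables can be written to the log with a PUT statement. The sketch below is an illustration only; the data values are made up, and the second record is deliberately invalid for a numeric variable.

```sas
/* Sketch only: values are made up; 'abc' is deliberately not a number. */
data _null_;
  input x;
  /* _N_ and _ERROR_ exist only in the PDV and are never written to a data set. */
  put _N_= x= _ERROR_=;
  datalines;
10
abc
30
;
run;
```

On the second iteration SAS should report invalid data for x, set x to missing, and set _ERROR_ to 1. On the third iteration _ERROR_ is back to 0.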

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

[Figure: Input Buffer and Program Data Vector after DATA step compilation]

[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.

[Figure: Position of the pointer in the input buffer and values in the PDV after SAS reads the first record]

[Figure: Program Data Vector with Computed Value of the Sum Statement]

Writing an Observation to the SAS Data Set

[Figure: The First Observation in Data Set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

The values of variables created by the INPUT statement are set to missing.

The value created by the Sum statement is automatically retained.

The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
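The retain behavior of the Sum statement can be compared with its long form. This is a sketch under assumptions: the data set totals and the variables points, RunningA, and RunningB are hypothetical names.

```sas
/* Sketch only: data set and variable names are made up for illustration. */
data totals;
  input points;
  /* Sum statement: RunningA starts at 0 and is retained across iterations. */
  RunningA + points;
  /* Equivalent long form using RETAIN and the SUM function. */
  retain RunningB 0;
  RunningB = sum(RunningB, points);
  datalines;
22
24
21
;
run;
```

The two running totals should match on every observation, while points is reset to missing at the start of each iteration before the next record is read.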

Creating the Input Buffer and the Program Data Vector

When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)

data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run

The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
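The rule about separating several options with spaces can be sketched as follows. This is only an illustration: the data set names, team names, and scores are invented here, and the team names are kept to eight characters or fewer so that default list input does not truncate them.

```sas
/* Hypothetical sample data, invented for this sketch */
data scores;
   input team $ game1 game2 game3;
   datalines;
Knights 6 8 8
Eagles 9 7 8
Royals 7 7 7
;
run;

/* Two options on the input data set, separated by a space:
   KEEP= limits the variables read, OBS= stops after observation 2 */
data first_two;
   set scores (keep=team game1 obs=2);
run;

proc print data=first_two;
run;
```

Placing the options on the SET statement applies them while the data set is read, so only team and game1 for the first two observations should reach the new data set.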

Formats and Informats

Definition of a Format
A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
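A minimal sketch of that COMMA11. read is shown below. The variable name amount and the use of a DATA _NULL_ step with PUT are choices made for this illustration only.

```sas
data _null_;
   input amount comma11.;   /* strips the $ and the commas while reading */
   put amount=;             /* writes the stored numeric value to the log */
   datalines;
$1,000,000
;
run;
```

The log should show the plain number with no dollar sign or commas, because the informat only affects how the raw text is converted, not how the value is later displayed.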

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;

Functions

Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);

Statements

Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

Eg: 467

NON STANDARD DATA

If data values are not in the standard format, the data is called non-standard data.

Eg:

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.

Numeric Informats

Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form of a DATA step that reads raw data stored in an external file:

data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Combined with OBS=, the following step reads observations 2 through 4.

data s1;
   set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
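A short sketch of IN= in a merge is given below. The two data sets, their variables, and the flag names inMembers and inWeights are all invented for this illustration; both data sets are entered in id order because MERGE with BY expects sorted input.

```sas
/* Hypothetical data sets, invented for this sketch */
data members;
   input id name $;
   datalines;
1 Sue
2 Jane
3 John
;
run;

data weights;
   input id weight;
   datalines;
1 120
3 165
;
run;

data matched;
   merge members (in=inMembers) weights (in=inWeights);
   by id;
   /* IN= variables are temporary: 1 if that data set contributed, else 0 */
   if inMembers and inWeights;   /* keep only ids found in both */
run;

proc print data=matched;
run;
```

Because the IN= variables are not written to the output data set, they are typically used only to decide which observations to keep, as the subsetting IF does here.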

DATA STEP STATEMENTS

DATA Statement
Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement
Indicates that data lines follow.

DATALINES Statement (New version)
Indicates that data lines follow.

Using the DATALINES Statement: In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
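The slide's own code for this example did not survive, so the following is a minimal sketch of what such a step could look like. The data set name person and the two data lines are assumptions made for illustration.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line becomes one observation, and the lone semicolon after the last data line ends the in-stream data.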

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement
Associates formats with variables.
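As a minimal sketch of the FORMAT statement, the step below attaches a DOLLAR format and a DATE format to two variables. The data set name, variable names, and values are invented for this illustration; the colon modifier in the INPUT statement is used so that the date is read with an informat under list input.

```sas
data sales;
   input item $ amount saledate :ddmmyy8.;
   format amount dollar10.2 saledate date9.;   /* display formats only */
   datalines;
pen 1145.32 21-10-84
ink 20 22-11-86
;
run;

proc print data=sales;
run;
```

The stored values stay numeric; the formats only control how PROC PRINT and other output display them.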

INFORMAT Statement
Associates informats with variables.

data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List input method
2) Column input method
3) Named input method
4) Formatted input method
5) Absolute input method

1) LIST INPUT METHOD

In this method the data values must be separated by at least one blank.

EG

-Do-
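A minimal list input sketch is shown below; the data set name, variable names, and values are invented for this illustration.

```sas
data temp;
   input id name $ age;   /* values are read in order, separated by blanks */
   cards;
101 shiva 38
102 ravi 25
103 rama 41
;
run;

proc print data=temp;
run;
```

With plain list input, character values are limited to eight characters and cannot contain blanks, which is the limitation the column input method addresses.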

2) COLUMN INPUT METHOD

In this method character data values can contain more than 8 characters and can also contain blank spaces, because each variable is read from fixed columns.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat: a width followed by a period, which specifies how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

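Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold a record in the input buffer across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

data num;
   infile datalines dsd;        /* DSD: values are separated by commas */
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
   infile datalines dsd delimiter='|';   /* the original delimiter character was lost; '|' is assumed */
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;

data weather;
   infile datalines missover;   /* MISSOVER: a short record leaves the remaining variables missing */
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;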

Thanks

Feedback at infoSitworldin


data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the input buffer and the program data vector (PDV) after DATA step compilation.

[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

[Figure: Program Data Vector with Computed Value of the Sum Statement]

Writing an Observation to the SAS Data Set

[Figure: The First Observation in Data Set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

- The values of variables created by the INPUT statement are set to missing.
- The value created by the sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
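The reset behavior described above can be traced in the log with a PUT statement. The sketch below reuses the participant data from the TOTAL_POINTS example but runs as a DATA _NULL_ step, a choice made here so that no data set is created; the PUT statement is an addition for illustration.

```sas
data _null_;
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);   /* sum statement: value is retained */
   put _N_= ParticipantName= TeamTotal=;     /* trace each iteration in the log */
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
```

Each log line should show _N_ one higher than the line before and TeamTotal carrying forward the running sum (22 after the first record, 139 after the last), while the INPUT variables are refilled from each new record.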

Compilation

During the compilation phase, SAS:

- Checks code for errors
- Translates code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector (PDV)
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process



The following figure shows the Input Buffer and the program data vector after DATA step compilation

Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

DATA step statements include: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, and RETAIN.

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, then the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, then the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

  • name
  • type
  • length
  • informat
  • format
  • label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

  • SET statement
  • MERGE statement

data temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as the following.

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
   set weight_club(keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club(drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
   set weight_club(obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. In this example, SAS reads observations 2 through 4.

data s1;
   set weight_club(obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two(rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1(pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club(where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
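The IN= option has no example above, so here is a minimal sketch of one common use: keeping only the observations that appear in both data sets of a merge. The data set names members and weights, the variable names inMembers and inWeights, and the data values are all hypothetical.

```sas
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data weights;
   input IdNumber EndWeight;
   datalines;
1023 165
1219 192
1300 150
;
run;

/* Both inputs are already in IdNumber order, which BY-group merging requires */
data both;
   merge members(in=inMembers) weights(in=inWeights);
   by IdNumber;
   if inMembers and inWeights;   /* keep IDs that occur in both data sets */
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to the output data set.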

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
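The code for this example did not survive in the slides. A minimal sketch that matches the description could look like this; the data set name person and the data values are assumptions.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```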

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
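The FORMAT statement has no example in the slides. The sketch below is one possible illustration; the data set name pay, the variable names, and the data values are all hypothetical.

```sas
data pay;
   input ename $ hiredate : mmddyy8. salary;
   format hiredate date9. salary dollar10.2;   /* stored with the data set */
   datalines;
hari 12/01/05 45000
ravi 11/03/06 52000
;
run;

proc print data=pay;   /* the report uses the associated formats */
run;
```

Because the formats are associated in the DATA step, they are stored with the data set, and later procedures use them without a further FORMAT statement.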

INFORMAT Statement

Associates informats with variables.

data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data must be ended by a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List input method
2) Column input method
3) Named input method
4) Formatted input method
5) Absolute input method

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one space.

E.g. the testlength example above, which reads firstname, lastname, n1, and n2 with list input.
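A minimal, self-contained sketch of list input follows. The data set name club and the data values are hypothetical, and the names are kept to single words because list input treats each blank as a separator.

```sas
data club;
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1023 David red 189 165
1049 Amelia yellow 145 124
;
run;
```

With list input, values are matched to variables by order, not by column position, so the fields do not need to line up.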

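2) COLUMN INPUT METHOD

In this method, character data values can contain more than 8 characters and can also contain embedded blanks. Each variable is read from the columns that are named after it, so the data lines must be aligned to those columns.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;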

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n for an absolute column, +n for a relative move) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to control the input pointer across iterations of the DATA step.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

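data num;
   infile datalines dsd;   /* DSD: values are separated by commas */
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
   /* the delimiter character is assumed here; any single character works */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;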

data weather;
   infile datalines missover;   /* short lines leave the remaining variables missing */
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin


Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTS

Output SAS Data Set TOTAL_POINTS: 1st observation

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

  • The values of variables created by the INPUT statement are set to missing.
  • The value created by the Sum statement is automatically retained.
  • The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase, SAS:

  • Checks code for errors
  • Translates code to machine code
  • Establishes an area of memory called the input buffer, if reading raw data
  • Establishes an area of memory called the program data vector (PDV)
  • Assigns required attributes to variables
  • Creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

  • Initializes the PDV to missing
  • Reads data values into the PDV
  • Carries out assignment statements and conditional processing
  • Writes the observation in the PDV to the output SAS data set at the end of the DATA step
  • Returns to the top of the DATA step
  • Initializes any variables that are not read from SAS data sets to missing
  • Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
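The iteration behavior described above can be watched in the log with a small sketch like the following. The data set name TOTAL_POINTS and the variable TeamTotal come from the text; the input variable names and the data values are assumptions.

```sas
data total_points(drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   /* Sum statement: TeamTotal is retained across iterations, not reset to missing */
   TeamTotal + (Event1 + Event2 + Event3);
   put _N_= TeamTotal=;   /* one log line per iteration of the DATA step */
   datalines;
Knights Sue 6 8 8
Knights John 7 7 7
Knights Lisa 8 9 9
;
run;
```

Each log line shows _N_ one higher than the line before and TeamTotal carrying forward the running total, while Event1 through Event3 are read fresh on every iteration.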

Word Scanner, Compiler, and Input Stack

[Figure: the input stack feeds the word scanner, which passes tokens to the compiler. The input stack holds the following program.]

data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;

[Figure: the word scanner has taken the tokens "data temp" from the input stack; the rest of the program remains in the input stack.]

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:

  • Literal: a string of characters enclosed in quotation marks
  • Number: digits, date values, time values, and hexadecimal numbers
  • Name: a string of characters beginning with an underscore or letter
  • Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =

What Are the SAS Language Elements

  • Data set options
  • Informats and formats
  • Functions
  • Statements
  • SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

  • Renaming variables
  • Selecting only the first or last n observations for processing
  • Dropping variables from processing or from the output data set
  • Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1<...option-n=value-n>)
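To illustrate several options separated by spaces inside one set of parentheses, here is a small sketch. The data set names club and subset and the data values are hypothetical.

```sas
data club;
   input IdNumber Name $ Team $;
   datalines;
1023 David red
1049 Amelia yellow
1219 Alan red
1246 Ravi yellow
;
run;

data subset(rename=(Name=PName));                    /* option on the output data set */
   set club(keep=IdNumber Name firstobs=2 obs=3);    /* three options on the input data set */
run;
```

Options on the SET statement apply as the data is read, and options on the DATA statement apply as it is written, so the same option can behave differently depending on where it is placed.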



Program Data Vector with Computed Value of the Sum Statement

Writing an Observation to the SAS Data Set

The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation

SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library that contains the DBMS file. In this example, a libref is declared that points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
  set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
  set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are excluded rather than which ones are included. The following DATA step reads all of the variables from the data set weight_club except those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
  set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation that SAS processes.

data s1;
  set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
  set weight_club (obs=4 firstobs=2);
run;
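Because OBS= names the number of the last observation rather than a count, combining it with FIRSTOBS= selects a range. The sketch below assumes the weight_club data set created earlier; the output name s2 and the PUT statement are additions for illustration.

```sas
data s2;
  /* Reads observations 2, 3, and 4 of weight_club: three observations, not four */
  set weight_club (firstobs=2 obs=4);
  /* Writes the iteration counter and the ID to the log for each observation read */
  put _N_= IdNumber=;
run;
```

Note that _N_ counts iterations of this DATA step, so it starts at 1 even though the first observation read is the second one in weight_club.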

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
  set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
  set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
  set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
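The IN= option is most often seen in a MERGE. The following is a minimal sketch, not taken from the slides: the data set club_fees, its Fee variable, and the names inClub and inFees are hypothetical, while weight_club is the data set created earlier.

```sas
/* Hypothetical second data set; only the IdNumber values match weight_club. */
data club_fees;
  input IdNumber Fee;
  datalines;
1023 50
1219 40
;
run;

/* MERGE with BY requires both inputs to be sorted by the BY variable */
proc sort data=weight_club out=wc_sorted;
  by IdNumber;
run;

data paid;
  merge wc_sorted (in=inClub) club_fees (in=inFees);
  by IdNumber;
  /* inClub and inFees are 1 when that data set contributed to the current observation */
  if inClub and inFees;
run;
```

The IN= variables exist only during the DATA step and are not written to the output data set.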

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
  set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
  set weight_club;
  put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
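The code for this example did not survive in the slides. A minimal sketch of such a step could look like the following; the data set name person and the data values are assumptions.

```sas
/* Hypothetical names and values; each data line yields one observation. */
data person;
  input name $ dept $;
  datalines;
John Sales
Mary Acctng
;
run;
```

The semicolon on the line after the last data line ends the data, and RUN ends the step.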

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
  input ename $ eid hiredate;
  informat hiredate mmddyy8.;
  datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. Four consecutive semicolons mark the end of the data.

data biblio;
  input number citation $50.;
  datalines4;
6 1988
2 LIN ET AL 1995; BRADY 1993
3 BERG 1990; ROA 1994; WILLIAMS 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
  set weight_club;
  keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
  set weight_club;
  label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
  input firstname $ lastname $ n1 n2;
  length name $25;
  datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
  input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD

1) List INPUT METHOD

In this method, the data values must be separated by at least one space.

Eg: the data testlength step shown earlier reads its values with list input.
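For reference, a minimal list input step might look like the sketch below. The data set name list_demo and the values are hypothetical.

```sas
/* List input: values are read in order and separated by one or more blanks. */
data list_demo;
  input id name $ age;
  datalines;
101 shiva 38
102   ravi   29
;
run;
```

Extra blanks between values are harmless, but with plain list input a character value cannot contain a blank and is stored with a default length of 8 bytes.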

2) Column INPUT METHOD

In this method, character data values can contain more than 8 characters and can also contain blank spaces.

data temp;
  input id 1-3 name $ 7-18 age 21-22;
  cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
  input id= name= $ age=;
  cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.

data one;
  input id 3. name $11. age 3.;
  datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n and +n) give the exact location of the data values.

data two;
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step.

data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

data num;
  infile datalines dsd;
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
  /* The delimiter character was lost from the slide; '|' is assumed here. */
  infile datalines dsd delimiter='|';
  input X Y Z;
  datalines;
1|2|3
4|5|6
7|8|9
;

data weather;
  infile datalines missover;
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin


Writing an Observation to the SAS Data Set

[Figure: the first observation in the output SAS data set TOTAL_POINTS]

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:

  • The values of variables created by the INPUT statement are set to missing.
  • The value created by the Sum statement is automatically retained.
  • The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.

Compilation

During the compilation phase, SAS:

  • Checks the code for errors
  • Translates the code to machine code
  • Establishes an area of memory called the input buffer, if reading raw data
  • Establishes an area of memory called the program data vector (PDV)
  • Assigns required attributes to variables
  • Creates the descriptor portion of the new data set

Execution

During the execution phase, SAS:

  • Initializes the PDV to missing
  • Reads data values into the PDV
  • Carries out assignment statements and conditional processing
  • Writes the observation in the PDV to the output SAS data set at the end of the DATA step
  • Returns to the top of the DATA step
  • Initializes any variables that are not read from SAS data sets to missing
  • Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
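The iteration behavior described above can be watched in the log with a small sketch like the one below. The variable names follow the TOTAL_POINTS example that the text refers to, but the data lines and the PUT statement are assumptions added for illustration.

```sas
/* Hypothetical scores; the PUT statement traces the PDV once per iteration. */
data total_points;
  input TeamName $ ParticipantName $ Event1 Event2 Event3;
  /* Sum statement: TeamTotal starts at 0 and is retained across iterations */
  TeamTotal + (Event1 + Event2 + Event3);
  put _N_= TeamTotal= _ERROR_=;
  datalines;
Knights Sue 6 8 8
Knights Jane 9 7 8
Knights John 7 7 7
;
run;
```

With these values the log should show TeamTotal growing from 22 to 46 to 67 while _N_ goes from 1 to 3. The variables read by INPUT are reset to missing at the top of each iteration, but TeamTotal is not.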

[Figure: the submitted program waits in the input stack; the word scanner takes it from there and passes it to the compiler]

data temp;
  input name $ no;
  cards;
Hari 101
;
%let list=name;
proc print data=temp;
  var &list;
run;

"The process that SAS uses to extract words and symbols from the input stack to the word scanner is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It classifies what it finds into these types of tokens:

  • Literal: a string of characters enclosed in quotation marks
  • Number: digits, date values, time values, and hexadecimal numbers
  • Name: a string of characters beginning with an underscore or letter
  • Special: any character or group of characters that has special meaning to SAS. Examples of special characters include + - $ ( ) & =
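As a rough illustration of tokenization, the comments in the sketch below count the tokens in a few ordinary statements. The TITLE text is hypothetical; the PROC PRINT step assumes the data set temp with variables name and no from the program above.

```sas
title 'Report for May';   /* 3 tokens: title (name), 'Report for May' (literal), ; (special) */
proc print data=temp;     /* 6 tokens: proc, print, data, =, temp, ;                          */
  var name no;            /* 4 tokens: var, name, no, ;                                       */
run;                      /* 2 tokens: run, ;                                                 */
```

Blanks separate tokens but are not tokens themselves, and a quoted string counts as one literal token however many words it contains.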

What Are the SAS Language Elements?

  • Data set options
  • Informats and formats
  • Functions
  • Statements
  • SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

  • Renaming variables
  • Selecting only the first or last n observations for processing
  • Dropping variables from processing or from the output data set
  • Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
  input TName $ PName $ Event1 Event2;
  datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$         indicates a character format; its absence indicates a numeric format.
format    names the format.
w         specifies the format width, which for most formats is the number of columns in the output data.
d         specifies an optional decimal scaling factor in the numeric formats.

data temp;
  amount=1145.32;
  put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
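A minimal sketch of that conversion is shown below. The variable name amount and the use of a DATA _NULL_ step with a PUT statement are choices made for this illustration.

```sas
/* Reads the punctuated value with COMMA11. and writes the stored number to the log. */
data _null_;
  input amount comma11.;
  put amount=;
  datalines;
$1,000,000
;
run;
```

The log line should read amount=1000000, which shows that the dollar sign and commas were removed before the value was stored.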

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$           indicates a character informat; its absence indicates a numeric informat.
informat    names the informat.
w           specifies the informat width.
d           specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
  input ename $ edate;
  informat edate ddmmyy8.;
  format edate date9.;
  cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
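The two calls above can be tried in a short DATA step such as the sketch below. The values assigned to cash and credit and the result variable names are hypothetical.

```sas
/* DATA _NULL_ runs the statements without creating a data set. */
data _null_;
  cash = 1200;                  /* hypothetical values */
  credit = 950;
  larger = max(cash, credit);   /* returns the larger of its arguments */
  root = sqrt(1500);            /* returns the square root of its argument */
  put larger= root=;
run;
```

Function arguments are separated by commas, and the returned value is typically used in an assignment statement or an expression.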

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples: INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option    specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

Eg: 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

Eg:

18-10-05

45,000

$21,000

Informats are used to read non-standard data.



SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0

Compilation

Checks code for errors

Translate code to machine code

Establishes an area of memory called input buffer if reading raw data

Establishes an area of memory called the program Data Vector

Assign required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values in to PDV

Carries out assignment statement and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the data step

Returns to the top of the Data step

Initialize any variable that are not read from SAS data sets to missing

Repeat the process

Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4DATA IN A DBMS FILE

If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set

libname dblib oracle user=scott password=tiger

data employeesset dblibemployeesrun

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement

Data sampSet weight_club (Keep= IdNumber Team)Run

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlength;
  input firstname $ lastname $ n1 n2;
  length name $25;
  datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores;
  input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method the data values must be separated by at least one blank

EG

-Do-
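A minimal sketch of list input, assuming a made-up data set named club with single-word names (by default, list input treats every blank as a separator, so values cannot contain embedded blanks):

```sas
data club;
  input Name $ Team $ StartWeight EndWeight;   /* values are read in order, blank-separated */
  datalines;
David red 189 165
Amelia yellow 145 124
Alan red 210 192
;
run;
```

Character variables read this way get the default length of 8 bytes unless a LENGTH or INFORMAT statement says otherwise.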

2) COLUMN INPUT METHOD

In this method each value is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks

data temp;
  input id 1-3 name $ 7-18 age 21-22;
  cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign

data samp;
  input id= name= $ age=;
  cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line

data one;
  input id 3. name $11. age 3.;
  datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method column pointer controls (@n and +n) give the exact location of each data value

data two;
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the current record and control the input pointer across iterations of the DATA step

data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

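data num;
  infile datalines dsd;
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;

/* The delimiter character did not survive in the source; ':' is assumed here */
data nums;
  infile datalines dsd delimiter=':';
  input X Y Z;
  datalines;
1:2:3
4:5:6
7:8:9
;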

data weather;
  infile datalines missover;
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin


Compilation

During the compilation phase SAS

Checks the code for syntax errors

Translates the code to machine code

Establishes an area of memory called the input buffer, if reading raw data

Establishes an area of memory called the Program Data Vector (PDV)

Assigns required attributes to variables

Creates the descriptor portion of the new data set

Execution

During the execution phase SAS

Initializes the PDV to missing

Reads data values into the PDV

Carries out assignment statements and conditional processing

Writes the observation in the PDV to the output SAS data set at the end of the DATA step

Returns to the top of the DATA step

Initializes any variables that are not read from SAS data sets to missing

Repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. _N_ is incremented at the beginning of each iteration of the DATA step. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record
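The slides mention TeamTotal without showing the step that builds it. A minimal sketch of the execution loop, assuming a made-up data set scores_total with hypothetical variables Participant and Score, and using a sum statement to accumulate TeamTotal:

```sas
data scores_total;
  input Participant $ Score;
  TeamTotal + Score;    /* sum statement: TeamTotal starts at 0 and is retained */
  Iteration = _N_;      /* copy the automatic variable so that it is saved      */
  datalines;
Sue 6
Jane 9
John 7
;
run;

proc print data=scores_total;
run;
```

TeamTotal is 6, 15 and 22 on the three observations and Iteration is 1, 2 and 3. Variables created by a sum statement are not reset to missing at the top of each iteration, which is why the total keeps growing, and _N_ itself is never written to the output data set, hence the copy.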

Input stack -> Word scanner -> Compiler

The input stack holds the submitted program:

data temp;
  input name $ no;
  cards;
Hari 101
;
%let list=name;
proc print data=temp;
  var &list;
run;

The word scanner takes tokens from the top of the input stack, beginning with DATA temp, and passes them to the compiler.

The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:

Literal: a string of characters enclosed in quotation marks

Number: digits, date values, time values, and hexadecimal numbers

Name: a string of characters beginning with an underscore or letter

Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
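To make the token classes concrete, the comments in the sketch below split two ordinary statements into tokens by hand. The program reuses the temp data set from the input-stack slide; the title text is made up, and the token breakdown is an illustration, not output produced by SAS.

```sas
data temp;
  input name $ no;
  cards;
Hari 101
;

/* title 'Report for May';  is 3 tokens:                   */
/*   title (name) | 'Report for May' (literal) | ; (special) */
title 'Report for May';

/* var name no;  is 4 tokens:                              */
/*   var (name) | name (name) | no (name) | ; (special)    */
proc print data=temp;
  var name no;
run;
```

A token ends when the word scanner meets a blank or the start of a different kind of token, which is why the quoted title text counts as a single literal token even though it contains blanks.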

What Are the SAS Language Elements?

Data set options

Informats and formats

Functions

Statements

SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
  input TName $ PName $ Event1 Event2;
  datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.

format names the format.

w specifies the format width, which for most formats is the number of columns in the output data.

d specifies an optional decimal scaling factor in the numeric formats.

data temp;
  amount=1145.32;
  put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
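A minimal sketch of that step, assuming a hypothetical variable named amount and writing the result to the log with PUT:

```sas
data _null_;
  input amount comma11.;   /* strips the dollar sign and commas while reading */
  put amount=;             /* writes amount=1000000 to the log                */
  datalines;
$1,000,000
;
```

The width of 11 only has to be large enough to cover the raw text, including the dollar sign and commas; the stored value is an ordinary number.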

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.

informat names the informat.

w specifies the informat width.

d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
  input ename $ edate;
  informat edate ddmmyy8.;
  format edate date9.;
  cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash, credit);
x=sqrt(1500);
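The two calls above are fragments. A small runnable sketch, assuming illustrative values for cash and credit and hypothetical result names larger and root:

```sas
data _null_;
  cash   = 100;                 /* illustrative values */
  credit = 250;
  larger = max(cash, credit);   /* largest of the arguments */
  root   = sqrt(1500);          /* square root              */
  put larger= root=;            /* results go to the log    */
run;
```

Here larger is 250 and root is approximately 38.73. DATA _NULL_ is used because nothing needs to be saved.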

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If data values are in a standard format, the data is called standard data

E.g. 467

NON-STANDARD DATA

If data values are not in a standard format, the data is called non-standard data

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data

data dates;
  input name $ Bdate ddmmyy8.;
  format Bdate ddmmyy8.;
  cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

name

type

length

informat

format

label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
  datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

SET statement

MERGE statement

data temp;
  set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset-name>;
  infile 'your-input-file-path/filename.extension';
  input <var-1> <var-2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
  set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
  set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
  set weight_club (drop=IdNumber Name);
run;





Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file

The DATA step stops executing after it processes the last input record

Word Scanner

Compiler

Input stack

Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements?

Data set options
Informats and formats
Functions
Statements
SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format
A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
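A short sketch of that read, assuming the value sits at the start of an inline data line (the variable name amount is an assumption). COMMA11. strips the dollar sign and commas, and the PUT statement writes the stored numeric value to the log:

```sas
data _null_;
   /* COMMA11. reads 11 columns and ignores the dollar sign and commas */
   input amount comma11.;
   /* Writes amount=1000000 to the log */
   put amount=;
   datalines;
$1,000,000
;
run;
```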

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;

Functions

Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
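The two calls above can be run inside a DATA step. In this sketch the values assigned to cash and credit, and the variable names larger and root, are assumptions made for illustration:

```sas
data _null_;
   cash = 250;
   credit = 400;
   /* MAX returns the largest of its nonmissing arguments: larger=400 */
   larger = max(cash, credit);
   /* SQRT returns the square root of a nonnegative argument */
   root = sqrt(1500);
   put larger= root=;
run;
```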

Statements

Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of DATA step statements:

INPUT, List      PUT          DATALINES
DO, Iterative    DO UNTIL     DO WHILE
SELECT           DROP         MERGE
SET              FILE         LENGTH
Sum              END          OUTPUT
KEEP             DATA         RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
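A sketch of how one informat and several formats from the table work together, assuming an inline data line and a variable named amount (both illustrative). The widths DOLLAR7. and WORDS30. are chosen here so that the full value fits; they are wider than the widths shown in the table:

```sas
data _null_;
   /* COMMA6. reads the six columns "25,000" and drops the comma */
   input amount comma6.;
   put amount comma6.;    /* writes 25,000 */
   put amount dollar7.;   /* writes $25,000 */
   put amount words30.;   /* writes the number spelled out in words */
   datalines;
25,000
;
run;
```

The informat controls only how the value is read. The stored value is the plain number 25000, and each format controls how it is written.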

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

name
type
length
informat
format
label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

SET statement
MERGE statement

data Temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;
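A self-contained sketch of the same idea. It assumes a temporary fileref (rawfile) in place of a real file path and first writes two weight_club records to it, so the INFILE step has something to read. The names rawfile and weight_club_ext are hypothetical:

```sas
/* Temporary fileref standing in for a real external file */
filename rawfile temp;

/* Write two raw records to the file so the example can run on its own */
data _null_;
   file rawfile;
   put '1023 David Shaw      red    189 165';
   put '1049 Amelia Serrano  yellow 145 124';
run;

/* Read the external file with INFILE instead of DATALINES */
data weight_club_ext;
   infile rawfile;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

proc print data=weight_club_ext;
run;
```

With a real file, the FILENAME statement and the first DATA step are not needed. The INFILE statement would name the quoted path of the file directly.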

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies the last observation that SAS processes.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. In this example, SAS reads observations 2 through 4.

data s1;
   set weight_club (firstobs=2 obs=4);
run;
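To see which observations FIRSTOBS= and OBS= select, the sketch below applies them to sashelp.class, a sample data set that normally ships with SAS (its use here, and the output name middle, are assumptions for illustration). OBS= gives the number of the last observation to read, not a count, so three observations are processed:

```sas
data middle;
   /* Reads observations 2, 3, and 4 of sashelp.class */
   set sashelp.class (firstobs=2 obs=4);
   /* _N_ counts DATA step iterations, so it runs from 1 to 3 here */
   put _N_= Name=;
run;
```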

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
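A minimal sketch of IN= in a match-merge. The data sets roster and weights, their values, and the flag names inRoster and inWeights are all assumptions made for illustration. Each IN= variable is 1 when its data set contributed to the current observation and 0 otherwise:

```sas
data roster;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data weights;
   input IdNumber EndWeight;
   datalines;
1023 165
1219 192
;
run;

/* Both inputs are already in IdNumber order, as BY-group merging requires */
data matched;
   merge roster (in=inRoster) weights (in=inWeights);
   by IdNumber;
   /* Keep only the IdNumber values found in both data sets (1023 and 1219) */
   if inRoster and inWeights;
run;
```

The IN= variables are temporary, so they are available during the DATA step but are not written to the output data set.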

DATA STEP STATEMENTS

DATA Statement
Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
   set weight_club;
   put Name;
run;

CARDS Statement
Indicates that data lines follow.

DATALINES Statement (New version)
Indicates that data lines follow.

Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
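A minimal sketch matching that description. The data set name person and the data values are assumptions made here for illustration:

```sas
data person;
   /* Two character variables read with list input */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```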

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement
Associates formats with variables.

INFORMAT Statement
Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data lines end with a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
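One detail worth noting: with list input, a character variable defaults to a length of 8 bytes, so Alexander would be stored as Alexande unless a LENGTH statement for that variable appears before the INPUT statement. A sketch of that variation, in which the data set name testlength2, the $15 lengths, and the assignment to name are assumptions added for illustration:

```sas
data testlength2;
   /* Declare lengths before INPUT so the values are not truncated to 8 bytes */
   length firstname lastname $15 name $25;
   input firstname $ lastname $ n1 n2;
   /* Build "Robinson, Alexander" from the two parts */
   name = catx(', ', lastname, firstname);
   put name=;
   datalines;
Alexander Robinson 35 11
;
run;
```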

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one blank.

E.g. the weight_club example above reads Team, StartWeight, and EndWeight with list input.
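A small list input sketch. The data set name employees_list and the data values are assumptions made for illustration. Each value is read in order, and a blank ends one value and starts the next:

```sas
data employees_list;
   /* Variables are read left to right; $ marks a character variable */
   input id name $ age;
   datalines;
101 shiva 38
102 ravi 35
103 rama 41
;
run;

proc print data=employees_list;
run;
```

Because a blank ends a value, plain list input cannot read a character value that contains a blank. Column input, described next, handles that case.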

2) COLUMN INPUT METHOD

In this method, each value is read from a fixed range of columns, so character values can be longer than 8 characters and can contain blanks.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, an informat after each variable name (a width followed by a period) specifies how many columns to read for that variable.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method, column pointer controls give the exact location of each data value: @n moves the pointer to column n, and +n moves it n columns to the right.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

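Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

With the DSD option, a comma is the default delimiter and two consecutive delimiters indicate a missing value. The DELIMITER= option names a different delimiter.

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   /* The delimiter character is missing from the source slide; a vertical bar is assumed here */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;

With the MISSOVER option, when a data line has fewer values than the INPUT statement expects, SAS sets the remaining variables to missing instead of reading the next line.

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;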

Thanks

Feedback at infoSitworldin




Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun

Data temp

compilerWord scanner

ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =

What Are the SAS Language Elements

Data set optionsInformats and formatsFunctionsStatementsSAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value

Syntax of Functions

The syntax of a function is as follows

function-name (argument-1ltargument-ngt)

x=max (cashcredit)x=sqrt(1500)

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4DATA IN A DBMS FILE

If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set

libname dblib oracle user=scott password=tiger

data employeesset dblibemployeesrun

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

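OBS=

Specifies the last observation that SAS processes in a data set.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies the first observation that SAS processes in a data set. Combined with OBS=, it selects a range: the following step reads observations 2 through 4.

data s1;
   set weight_club (obs=4 firstobs=2);
run;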

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, and ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
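
A minimal sketch of IN= in a MERGE, assuming two small hypothetical data sets named members and payments (neither appears in the original deck). The temporary variables inMem and inPay are 1 when the corresponding data set contributed to the current observation and 0 otherwise.

```sas
/* Hypothetical data sets, for illustration only */
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
1300 20
;
run;

/* Both inputs are already sorted by IdNumber, as a BY-group merge requires */
data paid_members;
   merge members (in=inMem) payments (in=inPay);
   by IdNumber;
   /* keep only the observations that have a match in both data sets */
   if inMem and inPay;
run;
```

Variables created with IN= are available only during the DATA step; they are not written to the output data set.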

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (newer form of CARDS)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
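
A minimal sketch of such a step, assuming a data set named person and two made-up data lines (both are illustrative, not taken from the deck):

```sas
data person;
   /* NAME and DEPT are character variables, so each is followed by $ */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line becomes one observation. The semicolon on a line of its own marks the end of the data lines.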

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
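
A short sketch of the FORMAT statement, assuming a hypothetical data set named sales with made-up values. The formats change only how the values are displayed, not how they are stored.

```sas
data sales;
   /* the colon modifier reads the date with the DDMMYY8. informat */
   input item $ amount saledate :ddmmyy8.;
   /* FORMAT controls how values are written, not how they are stored */
   format amount dollar10.2 saledate date9.;
   datalines;
pen 1145.32 21-10-84
ink 20 22-11-86
;
run;

proc print data=sales;
run;
```

Because the formats are associated with the variables in the DATA step, PROC PRINT uses them without any further instruction.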

INFORMAT Statement

Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data lines end with a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL 1995; BRADY 1993
3 BERG 1990; ROA 1994; WILLIAMS 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   /* name is declared with a storage length of 25 bytes */
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input.

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
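
A complete, runnable version of this step might look like the sketch below. The data lines are made up for illustration and are aligned to the columns named in the INPUT statement.

```sas
data scores;
   /* name comes from columns 1-18, score1 from 25-27, score2 from 30-32 */
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph                  11   32
Mitchel                 13   29
Sue Ellen               14   27
;
run;
```

With column input, a value can sit anywhere inside its column range, and a character value such as Sue Ellen can contain an embedded blank.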

INPUT METHODS

1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method, the data values must be separated by at least one blank.
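
A minimal list-input sketch, assuming a hypothetical data set named club with made-up values. Values are read in the order in which the variables appear in the INPUT statement.

```sas
data club;
   /* list input: values are separated by one or more blanks */
   input id name $ age;
   datalines;
101 shiva 38
102 ravi  35
103 rama 41
;
run;
```

With plain list input, a character value cannot contain embedded blanks and is stored with a default length of 8 bytes unless a LENGTH or INFORMAT statement says otherwise.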

2) COLUMN INPUT METHOD

In this method, each variable is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, an informat (a width followed by a period) after each variable name specifies how many columns to read for that variable.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method, column pointer controls (@n and +n) give the exact location of each data value.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the current record and control the input pointer across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

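data num;
   /* DSD reads comma-delimited values */
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
   /* the delimiter character shown here is an assumption */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;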

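data weather;
   /* MISSOVER sets variables without values to missing instead of reading the next line */
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;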

Thanks

Feedback at infoSitworldin


"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:

- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that has special meaning to SAS. Examples of special characters include + - $ ( ) & =
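
As a rough illustration of these token classes, the comments in the sketch below label each token of one assignment statement. The variable names total and price are made up for the example.

```sas
/* Tokens of the statement  total = price * 12;  */
/*   total  is a name token                      */
/*   =      is a special token                   */
/*   price  is a name token                      */
/*   *      is a special token                   */
/*   12     is a number token                    */
/*   ;      is a special token                   */
data _null_;
   price = 10;
   total = price * 12;
   put total=;
run;
```

The PUT statement writes total=120 to the SAS log.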

What Are the SAS Language Elements

- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)
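
A small sketch of two data set options used together, assuming a hypothetical data set named club with made-up values. Both options sit inside one pair of parentheses and are separated by a blank.

```sas
data club;
   input id name $ age;
   datalines;
101 shiva 38
102 ravi 35
103 rama 41
104 hari 29
;
run;

/* KEEP= limits the variables, OBS= limits the observations that are read */
data first_two;
   set club (keep=id name obs=2);
run;
```

The data set first_two contains the variables id and name for the first two observations of club.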

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$        indicates a character format; its absence indicates a numeric format.
format   names the format.
w        specifies the format width, which for most formats is the number of columns in the output data.
d        specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$          indicates a character informat; its absence indicates a numeric informat.
informat   names the informat.
w          specifies the informat width.
d          specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
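
The two calls above can be tried in a small DATA step. This is only a sketch: the values assigned to cash and credit, and the variable names larger and root, are made up for illustration.

```sas
data _null_;
   cash = 1200;
   credit = 950;
   larger = max(cash, credit);   /* largest of the arguments */
   root = sqrt(1500);            /* square root of the argument */
   put larger= root=;
run;
```

The PUT statement writes both results to the SAS log; with these values, larger is 1200.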

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of DATA step statements:

INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option   specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, that is, values that SAS can read without a special informat, then the data is called standard data.

E.g. 467

NON STANDARD DATA

If data values are not in the standard format, then the data is called non-standard data.

E.g.

18-10-05
45,000
$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat     Format
12-07-78    DDMMYY8.     DDMMYY8.
21-09-05    DDMMYY10.    DDMMYY10.
22Jan89     DATE7.       DATE7.
22jan1989   DATE9.       DATE9.

Numeric Informats

Numeric     Informat     Format
25,000      COMMA6.      COMMA6.
$3,000      DOLLAR6.     DOLLAR6.
25,000      COMMA6.      WORDS6.
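
A short sketch that reads a few of the values from the tables above with the listed informats and writes them back with formats. The data set name money and the variable names are assumptions made for the example.

```sas
data money;
   /* the colon modifier reads each blank-delimited value with the given informat */
   input amount :comma6. price :dollar6. bdate :ddmmyy8.;
   format amount comma6. price dollar6. bdate date9.;
   datalines;
25,000 $3,000 12-07-78
;
run;

proc print data=money;
run;
```

The informats strip the commas and the dollar sign so that plain numbers are stored; the formats put the punctuation back when the values are printed.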

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label
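
As a sketch of how these attributes can be set explicitly, the step below assigns a length, an informat, a format, and labels, and then lists them with PROC CONTENTS. The data set name staff, the variable names, and the label text are hypothetical.

```sas
data staff;
   length name $ 20;                 /* type (character) and length */
   informat hired ddmmyy8.;          /* informat used when reading */
   format hired date9.;              /* format used when writing */
   label name = 'Employee name'
         hired = 'Date hired';       /* descriptive labels */
   input name $ hired;
   datalines;
hari 21-10-84
ravi 22-11-86
;
run;

/* PROC CONTENTS lists each variable's name, type, length, format, informat, and label */
proc contents data=staff;
run;
```

The LENGTH statement appears before the INPUT statement because the first reference to a variable fixes its length.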

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;



Definition of Data Set Option

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

Renaming variables

Selecting only the first or last n observations for processing

Dropping variables from processing or from the output data set

Specifying a password for a data set

Syntax for Data Set Options

Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces

(option-1=value-1ltoption-n=value-ngt)

These examples show data set options in SAS statements

data scores (keep=team game1 game2 game3)

data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run

Formats and Informats

Denition of a FormatA format is an instruction that SAS uses to write data values

Syntax of a Format

SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format

wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats

data tempamount=114532put amount dollar102run

The DOLLARwd format in the PUT statement produces this result

$114532

Informats

Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas

$1000000

To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat

Syntax of an Informat

SAS informats have the following formlt$gtinformatltwgtltdgt

Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats

data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
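
The two calls above can be tried in a small self-contained step such as the following sketch. The values assigned to cash and credit are made up for illustration, and the second result is stored in y rather than x so that both values can be printed.

```sas
data _null_;
   cash=250;                 /* illustrative values, not from the original */
   credit=400;
   x=max(cash, credit);      /* returns the largest argument, here 400 */
   y=sqrt(1500);             /* returns the square root of 1500 */
   put x= y=;                /* writes both results to the log */
run;
```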

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT, List
PUT
DATALINES
DO, Iterative
DO UNTIL
DO WHILE
SELECT
DROP
MERGE
SET
FILE
LENGTH
Sum
END
OUTPUT
KEEP
DATA
RETAIN

SAS System Options

System options are instructions that affect your SAS session.

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in a standard format, the data is called standard data.

E.g., 467

NON-STANDARD DATA

If the data values are not in a standard format, the data is called non-standard data.

E.g.,

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date        Informat    Format

12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.

Numeric Informats

Numeric     Informat    Format

25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
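
The first two rows of the table can be exercised with a sketch like the one below, which reads both values with the listed informats and prints them back with the listed formats. The data set name pay and the variable names salary and bonus are hypothetical.

```sas
data pay;                                   /* hypothetical data set name */
   input salary comma6. +1 bonus dollar6.;  /* columns 1-6, skip one, columns 8-13 */
   format salary comma6. bonus dollar6.;    /* write the values back with punctuation */
   datalines;
25,000 $3,000
;
run;

proc print data=pay;                        /* shows the formatted values */
run;
```

The informats remove the punctuation so that plain numbers are stored; the formats put it back only when the values are displayed.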

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

- name
- type
- length
- informat
- format
- label
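
One way to see all six attributes together is to set them explicitly and then list them with PROC CONTENTS, as in this sketch. The data set staff, its variables, and the label text are assumptions made for the illustration.

```sas
data staff;                              /* hypothetical data set */
   length name $20;                      /* type (character) and length */
   informat hired ddmmyy8.;              /* informat used when reading */
   format hired date9.;                  /* format used when writing */
   label name='Employee name'
         hired='Hire date';              /* descriptive labels */
   input name $ hired;
   datalines;
hari 21-10-84
;
run;

proc contents data=staff;                /* lists each variable's attributes */
run;
```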

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data Temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file:

data <dataset name>;
   infile 'your-input-file path/filename.extension';
   input <var 1> <var 2> ...;
run;
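
Filled in for the weight_club data, the template might look like the following sketch. The file name weight_club.txt is a hypothetical placeholder; it is assumed to sit in the current working directory and to hold the same five records in the same column layout.

```sas
data weight_club;
   infile 'weight_club.txt';      /* hypothetical file name, replace with a real path */
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;
```

No DATALINES statement is needed here because the INFILE statement tells SAS where the records come from.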

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
   set weight_club(obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
   set weight_club(obs=4 firstobs=2);
run;
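
A point that is easy to miss when the two options are combined: OBS= gives the number of the last observation to process, not a count. The sketch below therefore reads observations 2 through 4. The data set name s2 and the PROC PRINT step are additions for illustration, and weight_club is assumed to exist from the earlier DATA step.

```sas
data s2;                                  /* hypothetical output data set */
   set weight_club(firstobs=2 obs=4);     /* reads observations 2, 3, and 4 */
run;

proc print data=s2;                       /* should list three observations */
run;
```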

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
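
A common use of IN= is to keep only the observations that appear in both inputs of a merge. The following sketch assumes two small made-up data sets, members and payments, already in IdNumber order; the names inMembers and inPayments are arbitrary.

```sas
data members;                    /* hypothetical data set */
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;                   /* hypothetical data set */
   input IdNumber Amount;
   datalines;
1023 50
1219 75
;
run;

data paid;
   merge members(in=inMembers) payments(in=inPayments);
   by IdNumber;
   if inMembers and inPayments;  /* keep only ids found in both data sets */
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to the output data set. Here paid should contain the two members with a payment record, 1023 and 1219.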

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
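
The transcript does not include the code for this example, so the following is only a sketch of what such a step could look like. The data set name person and the two data lines are assumptions; NAME and DEPT are the variable names given in the text.

```sas
data person;                 /* data set name is an assumption */
   input name $ dept $;      /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```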

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.

INFORMAT Statement

Associates informats with variables.

data two1;
   informat hiredate mmddyy8.;   /* declared before INPUT so that list input uses it */
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
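
The step above stops before any data lines. A runnable version could look like the sketch below, where the three records are made up for illustration and laid out so that each value falls inside the column range named in the INPUT statement.

```sas
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph                   11   32
Mitchel                  13   29
Sue Ellen                14   27
;
run;
```

Because the name is read from fixed columns 1-18, a value such as Sue Ellen can contain a blank without being split into two values.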

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method, the data values should be separated by at least one space.

E.g., the same as the preceding examples.
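
A minimal sketch of list input, with a hypothetical data set club and made-up values, might look like this:

```sas
data club;                         /* hypothetical data set name */
   input Name $ Team $ Weight;     /* values are read in order, split on blanks */
   datalines;
David red 189
Amelia yellow 145
;
run;
```

With plain list input a character value cannot contain an embedded blank and is stored with a default length of 8 bytes, which is why the column method that follows is used for longer names.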

2) COLUMN INPUT METHOD

In this method, character data values can contain more than 8 characters and can also contain embedded blanks.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n and +n) give the exact location of each data value.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to control the input pointer across iterations of the DATA step.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
   infile datalines dsd delimiter=':';   /* a colon is assumed here as the delimiter */
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin






Formats and Informats

Definition of a Format

A format is an instruction that SAS uses to write data values.

Syntax of a Format

SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:

$ indicates a character format; its absence indicates a numeric format.

format names the format.

w specifies the format width, which for most formats is the number of columns in the output data.

d specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32

Informats

Definition of an Informat

An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
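As a minimal sketch of the COMMA11. informat described above, the step below reads the sample value and writes the stored number to the log. The variable name amount is only illustrative, and DATA _NULL_ is used so that no data set is created.

```sas
data _null_;
   /* amount is a hypothetical variable name */
   input amount comma11.;   /* strips the dollar sign and commas while reading */
   put amount=;             /* writes the stored numeric value to the log */
   datalines;
$1,000,000
;
run;
```

The value is stored as an ordinary number, so a format such as DOLLARw.d would be needed to display the dollar sign and commas again.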

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.

informat names the informat.

w specifies the informat width.

d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   /* declare the informat before INPUT */
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);
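The two function calls above can be tried in a small DATA step. This is only a sketch: the values assigned to cash and credit, and the result variable names, are assumptions made for illustration.

```sas
data _null_;
   cash=250;                    /* hypothetical values */
   credit=400;
   larger=max(cash, credit);    /* returns the largest argument */
   root=sqrt(1500);             /* returns the square root */
   put larger= root=;           /* writes both results to the log */
run;
```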

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

E.g. 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

E.g.

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

  • name
  • type
  • length
  • informat
  • format
  • label

DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

  • SET statement
  • MERGE statement

data Temp;
   set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset name>;
   infile 'your-input-file-path\filename.extension';
   input <var 1> <var 2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations.

data s1;
   set weight_club (obs=3);   /* reads observations 1 through 3 */
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
   set weight_club (obs=4 firstobs=2);   /* reads observations 2 through 4 */
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation.
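The IN= option is typically used with MERGE and a BY statement. The sketch below is an illustration only: the data sets club_a and club_b, their values, and the flag names inA and inB are assumptions, not part of the original slides.

```sas
data club_a;                     /* hypothetical input, sorted by IdNumber */
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

data club_b;                     /* hypothetical input, sorted by IdNumber */
   input IdNumber EndWeight;
   datalines;
1049 124
1078 118
;
run;

data both;
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   if inA and inB;               /* keep only IDs found in both data sets */
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to the output data set.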

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
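A step matching that description might look like the following sketch. The data set name person and the two data lines are illustrative assumptions; only the variable names NAME and DEPT come from the text above.

```sas
data person;                 /* hypothetical data set name */
   input name $ dept $;      /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```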

DELETE Statement

Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
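As a hedged illustration of the FORMAT statement, the step below attaches DOLLAR10.2 and DATE9. to two variables so that PROC PRINT displays them in readable form. The data set pay, its variables, and its values are all assumed for this sketch.

```sas
data pay;                                    /* hypothetical data set */
   input ename $ salary hiredate :mmddyy8.;  /* colon modifier: list input with an informat */
   format salary dollar10.2 hiredate date9.; /* affects display only, not stored values */
   datalines;
hari 25000 12/01/05
ravi 31000 11/03/06
;
run;

proc print data=pay;
run;
```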

INFORMAT Statement

Associates informats with variables.

data two1;
   /* declare the informat before INPUT */
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
   /* firstname and lastname get the default length of 8, */
   /* so "Alexander" is stored as "Alexande" */
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method, the data values must be separated by at least one space.

EG

-Do-
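A short list input step could look like the sketch below. The data set name club and the data values are assumptions chosen to resemble the weight_club example; the point is only that values are separated by one or more blanks and do not have to be aligned in columns.

```sas
data club;                                   /* hypothetical data set name */
   input IdNumber Team $ StartWeight EndWeight;
   datalines;
1023 red 189 165
1049 yellow   145 124
;
run;
```

With simple list input, character values are limited to 8 characters by default and cannot contain embedded blanks, which is what column input addresses.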

2)Column INPUT METHOD

In this method, character data values can contain more than 8 characters and can also contain embedded blanks.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3)NAMED INPUT METHOD

In this method, each data value is preceded by its variable name and an equal sign (variable=value).

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4)FORMATTED INPUT METHOD

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5)ABSOLUTE INPUT METHOD

In this input method, column pointer controls (@n and +n) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to control the input pointer across iterations of the DATA step.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

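data num;
   infile datalines dsd;   /* DSD: comma-delimited values */
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   /* the delimiter value is missing in the source; '|' is an assumed stand-in */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;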

data weather;
   /* MISSOVER: short lines leave the remaining variables missing */
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;

Thanks

Feedback at infoSitworldin

Syntax of an Informat

SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:

$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
informat edate ddmmyy8.;
input ename $ edate;
format edate date9.;
cards;
hari 101007
;
run;

Functions

Definition of Functions

A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions

The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash,credit);
x=sqrt(1500);

Statements

Definition of Statements

A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

INPUT, LIST, PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:

option specifies one or more SAS system options that you want to change.

options nodate linesize=72;

STANDARD DATA

If the data values are in the standard format, the data is called standard data.

Eg: 467

NON STANDARD DATA

If the data values are not in the standard format, the data is called non-standard data.

Eg:

18-10-05

45,000

$21,000

Informats are used to read non-standard data.

data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;

Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
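As a rough sketch of how the numeric informats in the table might be used, the step below reads comma and dollar values with modified list input (the colon modifier) and writes them back with the matching formats. The data set name pay, the variable names, and the data values are assumptions made for illustration only.

```sas
/* Hypothetical example: pay, ename, salary and bonus are illustrative names. */
data pay;
/* The colon modifier reads each blank-delimited field with the given informat. */
input ename $ salary :comma6. bonus :dollar6.;
/* Formats control how the stored numbers are displayed. */
format salary comma6. bonus dollar6.;
datalines;
hari 25,000 $3,000
ravi 18,500 $1,200
;
run;

proc print data=pay;
run;
```

The informats strip the commas and dollar signs so that salary and bonus are stored as ordinary numbers; the formats only affect the printed output.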

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:

  • name
  • type
  • length
  • informat
  • format
  • label

DIFFERENT WAYS TO READ DATA

1) RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;

2) DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.

data <dataset name>;
infile 'your-input-file path/filename.extension';
input <var 1> <var 2> ...;
run;

3) DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

  • SET statement
  • MERGE statement

Data Temp;
Set weight_club;
Run;

4) DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.

Data samp;
Set weight_club (Keep=IdNumber Team);
Run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

Data A1;
Set weight_club (Drop=IdNumber Name);
Run;

OBS=

Specifies when to stop processing observations.

data s1;
set weight_club(obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first.

data s1;
set weight_club(obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
set weight_club;
run;

PW=

Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

Data tmp;
set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
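As a hedged sketch of IN=, the steps below merge weight_club with a second data set and keep only the observations found in both. The club_fees data set, its Fee variable, and the names club_sorted, paid_members, inClub, and inFees are assumptions made for illustration; they do not appear in the original slides.

```sas
/* Hypothetical second data set keyed by IdNumber. */
data club_fees;
input IdNumber Fee;
datalines;
1023 50
1219 75
;
run;

/* MERGE with BY requires both inputs to be sorted by the BY variable. */
proc sort data=weight_club out=club_sorted;
by IdNumber;
run;

/* inClub and inFees are temporary 0/1 flags created by IN=;
   they are not written to the output data set. */
data paid_members;
merge club_sorted (in=inClub) club_fees (in=inFees);
by IdNumber;
if inClub and inFees;
run;
```

Changing the subsetting IF to "if inClub and not inFees;" would instead keep the members that have no matching row in club_fees.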

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
set weight_club;
put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
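The example itself did not survive in the slide text. A minimal sketch of what such a step could look like follows; the data set name person and the data values are assumptions, and only the variable names NAME and DEPT come from the description above.

```sas
/* Illustrative reconstruction: the data set name and values are assumed. */
data person;
/* Both variables are read as character values with list input. */
input name $ dept $;
datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation, and the single semicolon on its own line marks the end of the data.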

DELETE Statement

Stops processing the current observation

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
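The slides give no example for the FORMAT statement, so here is a small sketch that assumes the weight_club data set created earlier. The output data set name club_report and the computed variable Loss are hypothetical additions for illustration.

```sas
/* Hypothetical example built on weight_club; club_report and Loss are assumed names. */
data club_report;
set weight_club;
Loss = StartWeight - EndWeight;
/* Show the weights with one decimal place and Loss as a whole number. */
format StartWeight EndWeight 5.1 Loss 4.;
run;
```

The FORMAT statement changes only how the values are displayed; the stored numeric values are unchanged.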

INFORMAT Statement

Associates informats with variables.

data two1;
informat hiredate mmddyy8.;
input ename $ eid hiredate;
datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow.

data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL, 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets

data average;
set weight_club;
keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
set weight_club;
label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
input firstname $ lastname $ n1 n2;
length name $25;
datalines;
Alexander Robinson 35 11
;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method the data values must be separated by at least one blank space.

Eg: same as the earlier examples that read blank-separated values.
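A minimal sketch of list input, with an assumed data set name (members) and made-up values, could look like this:

```sas
/* Hypothetical example: members and its values are illustrative only. */
data members;
/* Variables are read in order; the $ marks name as character. */
input id name $ age;
datalines;
101 hari 29
102 ravi 31
;
run;
```

With plain list input, a character value such as name is stored with a default length of 8 bytes and cannot contain embedded blanks, which is the limitation that column input addresses.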

2) COLUMN INPUT METHOD

In this method character data values can contain more than 8 characters and can also contain blank spaces.

data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat; the width before the period specifies how many columns to read for that variable in every data line.

data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.

data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

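data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
infile datalines dsd delimiter=':';  /* the delimiter was lost in the source slide; ':' is assumed here */
input X Y Z;
datalines;
1:2:3
4:5:6
7:8:9
;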

data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin

Statements

Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon

INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data Temp;
   set weight_club;
run;
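
The MERGE statement has no example here, so the following is a hedged sketch of a match-merge. It assumes the weight_club data set created earlier; the second data set club_fees, its variable Fee, and its values are hypothetical.

```sas
/* Hypothetical second data set keyed by IdNumber */
data club_fees;
   input IdNumber Fee;
   datalines;
1023 50
1049 45
1078 60
;
run;

/* Both inputs must be sorted by the BY variable before a match-merge */
proc sort data=weight_club out=club_sorted;
   by IdNumber;
run;

proc sort data=club_fees;
   by IdNumber;
run;

/* Observations with the same IdNumber are combined into one */
data club_merged;
   merge club_sorted club_fees;
   by IdNumber;
run;
```

Members with no row in club_fees keep a missing value for Fee in club_merged.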

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, you do not have to bring that file into the data stream. Use an INFILE statement to specify the file that contains the raw data. The statements that follow show the general form of a DATA step that reads raw data from an external file.

data <dataset name>;
   infile 'your-input-file-path/filename.extension';
   input <var 1> <var 2> ...;
run;
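
A concrete version of the same idea might look like the sketch below. It assumes a text file named weight_club.txt in the current working directory that holds the same five data lines as the in-stream example; the file name and the data set name weight_club_ext are hypothetical.

```sas
/* 'weight_club.txt' is a hypothetical file; adjust the path for your system */
data weight_club_ext;
   infile 'weight_club.txt';
   /* same INPUT statement as the in-stream example */
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;
```

No DATALINES statement is needed, because the INFILE statement tells SAS where the raw data comes from.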

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library that contains the DBMS file. In this example, a libref is declared and points to a library that contains Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are excluded rather than which ones are included. The following DATA step reads all of the variables from the data set weight_club except those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

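OBS=

Specifies the last observation that SAS processes. This example reads observations 1 through 3.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first. Combined with OBS=4, this example reads observations 2 through 4.

data s1;
   set weight_club (obs=4 firstobs=2);
run;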

RENAME=

Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=

Assigns a READ, WRITE, and ALTER password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=

Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a temporary variable that indicates whether the data set contributed data to the current observation.
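
A minimal sketch of IN= follows, assuming the weight_club data set from the earlier example. The data set names red_team, yellow_team, and combined and the variables from_red, from_yellow, and source are hypothetical names chosen for illustration.

```sas
/* Split weight_club into two data sets to have something to combine */
data red_team yellow_team;
   set weight_club;
   if Team='red' then output red_team;
   else output yellow_team;
run;

/* IN= variables are 1 when that data set supplied the current observation */
data combined;
   set red_team (in=from_red) yellow_team (in=from_yellow);
   length source $ 6;
   if from_red then source='red';
   else if from_yellow then source='yellow';
run;
```

The IN= variables exist only while the DATA step runs and are not written to the output data set, which is why the sketch copies the result into source.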

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets.

Creating an Output Data Set

data example1;
   set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
   set weight_club;
   put Name;
run;

CARDS Statement

Indicates that data lines follow.

DATALINES Statement (New version)

Indicates that data lines follow.

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
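
The DATA step described above is not shown in the slides. A minimal sketch of what it could look like follows; the data set name person and the two data lines are assumptions, and only the variable names NAME and DEPT come from the text.

```sas
/* Each data line supplies one value for name and one for dept */
data person;
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```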

DELETE Statement

Stops processing the current observation. The observation is not written to the output data set.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
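
As an illustrative sketch of the FORMAT statement, the step below reads a date and a salary and attaches display formats to both. The data set name hires, the variable names, and the data values are hypothetical.

```sas
/* Hypothetical data used only to show the FORMAT statement */
data hires;
   input ename $ hiredate : date9. salary;
   /* formats change how values are displayed, not how they are stored */
   format hiredate ddmmyy10. salary dollar8.;
   datalines;
hari 12JAN2005 25000
ravi 03NOV2006 30000
;
run;

proc print data=hires;
run;
```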

INFORMAT Statement

Associates informats with variables.

data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 (or CARDS4) Statement

Indicates that data lines that contain semicolons follow. The data lines end with a line of four consecutive semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement

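Specifies the number of bytes for storing variables. A LENGTH statement takes effect only if it appears before the first use of the variable.

data testlength;
   /* declared first so that 'Alexander' is not truncated to 8 characters */
   length firstname lastname $ 15 name $ 25;
   input firstname $ lastname $ n1 n2;
   /* name holds up to 25 bytes */
   name=trim(lastname)||', '||firstname;
   datalines;
Alexander Robinson 35 11
;
run;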

INPUT Statement

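Describes the arrangement of values in the input data record and assigns the input values to the corresponding SAS variables. With column input, the values are read from the specified columns.

This DATA step demonstrates how to read input data records with column input.

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
   /* illustrative data lines, aligned to the columns named above */
   datalines;
Joseph                  11   32
Sue Ellen               14   27
;
run;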

INPUT METHODS

1) LIST INPUT METHOD

2) COLUMN INPUT METHOD

3) NAMED INPUT METHOD

4) FORMATTED INPUT METHOD

5) ABSOLUTE INPUT METHOD

1) LIST INPUT METHOD

In this method the data values must be separated by at least one blank space. Character values are limited to 8 characters by default and cannot contain embedded blanks.
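
A minimal sketch of list input follows. The data set name list_demo and the data values are made up for illustration.

```sas
/* Values are read in order, separated by one or more blanks */
data list_demo;
   input id name $ age;
   datalines;
101 hari 29
102 ravi 31
103 rani 27
;
run;
```

Because list input scans for the next nonblank value, the columns do not need to line up from one data line to the next.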

2) COLUMN INPUT METHOD

In this method each variable is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD

In this method each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD

In this method each variable name is followed by an informat, that is, a width followed by a period, which specifies how many columns to read for that variable in every data line.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD

In this method column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns forward.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that one data line can supply several observations.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

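The DSD option in the INFILE statement reads comma-delimited data and treats two consecutive delimiters as a missing value. The DELIMITER= option names a different delimiter.

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   /* the delimiter character is not legible in the source; a slash is assumed here */
   infile datalines dsd delimiter='/';
   input X Y Z;
   datalines;
1/2/3
4/5/6
7/8/9
;
run;

The MISSOVER option prevents the INPUT statement from reading the next data line when the current line has fewer values than expected; the remaining variables are set to missing.

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;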

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

SAS System Options

System options are instructions that affect your SAS session

Syntax of SAS System Options

The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change

options nodate linesize=72

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4DATA IN A DBMS FILE

If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set

libname dblib oracle user=scott password=tiger

data employeesset dblibemployeesrun

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement

Data sampSet weight_club (Keep= IdNumber Team)Run

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this input method we are using column hold pointer to give exact location of data values

data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step

data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17

data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789

data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

STANDARD DATA

The data values are in the standard format then the data is called standard data

Eg 467

NON STANDARD DATA

If data values are not in the standard format then data is called as non-standard data

Eg

18-10-05

45000

$21000

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4DATA IN A DBMS FILE

If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set

libname dblib oracle user=scott password=tiger

data employeesset dblibemployeesrun

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement

Data sampSet weight_club (Keep= IdNumber Team)Run

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this input method we are using column hold pointer to give exact location of data values

data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step

data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17

data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789

data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

Informats are used to read non-standard data

data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run

Date Informats

Date Informat Format

12-07-78 DDMMYY8 DDMMYY8

21-09-05 DDMMYY10 DDMMYY10

22Jan89 Date7 Date7

22jan1989 Date9 Date9

Numeric Informats

Numeric Informat Format

25000 COMMA6 COMMA6

$3000 DOLLAR6 DOLLAR6

25000 COMMA6 WORDS6

DEFINING VARIABLES IN SAS

INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data

SAS variables can have these attributes

_ name_ type_ length_ informat_ format_ label

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4DATA IN A DBMS FILE

If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set

libname dblib oracle user=scott password=tiger

data employeesset dblibemployeesrun

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement

Data sampSet weight_club (Keep= IdNumber Team)Run

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this input method we are using column hold pointer to give exact location of data values

data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step

data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17

data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789

data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965

Thanks

Feedback at infoSitworldin

Date Informats

Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
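A minimal sketch of how a date informat and a date format work together. The data set name (dates) and variable names (d1, d2) are assumptions for illustration. The informat tells SAS how to read the raw text, and the format controls how the stored value is displayed:

```sas
/* Hypothetical example: read dates with informats, display them with formats */
data dates;
  input d1 ddmmyy8. +1 d2 date9.;   /* d1 from columns 1-8, d2 from columns 10-18 */
  format d1 ddmmyy10. d2 date9.;
  datalines;
12-07-78 22jan1989
;
run;

proc print data=dates;
run;
```

SAS stores each date as a number (the count of days since January 1, 1960), so without the FORMAT statement the printed values would appear as plain integers.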

Numeric Informats

Numeric     Informat    Format
25000       COMMA6.     COMMA6.
$3000       DOLLAR6.    DOLLAR6.
25000       COMMA6.     WORDS6.

DEFINING VARIABLES IN SAS

The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes: name, type, length, informat, format, and label.

DIFFERENT WAYS TO READ DATA

1) RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step.

The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and the data lines must occur last in the DATA step statements.

data weight_club;
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
  datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;

3) DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements: the SET statement or the MERGE statement.

data Temp;
  set weight_club;
run;

2) DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form of a DATA step that reads raw data stored in an external file.

data <dataset name>;
  infile 'your-input-file path/filename.extension';
  input <var 1> <var 2> ...;
run;

4) DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
  set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team

data samp;
  set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
  set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations

data s1;
  set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first

data s1;
  set weight_club (obs=4 firstobs=2);
run;
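OBS= names the last observation to process, not a count, so combining it with FIRSTOBS= selects a range of observations. A small sketch, assuming the weight_club data set created earlier (the output data set name range_demo is made up):

```sas
/* Reads observations 2 through 4 of weight_club, that is, 3 observations */
data range_demo;
  set weight_club (firstobs=2 obs=4);
run;

proc print data=range_demo;
run;
```

The order in which the two options are written inside the parentheses does not change which observations are read.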

RENAME=

Changes the name of a variable

data two (rename=(name=Pname));
  set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file

data two1 (pw=ram);
  set weight_club;
run;

WHERE=

Selects observations that meet the specified condition

data tmp;
  set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
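A minimal sketch of IN= in a match-merge, assuming the weight_club data set created earlier. The second data set (team_info), its Coach variable, and all values in it are hypothetical and only serve to illustrate the option:

```sas
/* Hypothetical lookup table: one row per team */
data team_info;
  input Team $ Coach $;
  datalines;
red Lee
yellow Kim
;
run;

/* Both inputs must be sorted by the BY variable before a match-merge */
proc sort data=weight_club out=club_sorted;
  by Team;
run;

proc sort data=team_info;
  by Team;
run;

data matched;
  merge club_sorted (in=inClub) team_info (in=inInfo);
  by Team;
  if inClub and inInfo;   /* keep only teams present in both data sets */
run;
```

The variables named by IN= (here inClub and inInfo) are temporary: they are 1 when the data set contributed to the current observation and 0 otherwise, and they are not written to the output data set.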

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1;
  set weight_club;
run;

When Not Creating a Data Set

data _null_;
  set weight_club;
  put Name;
run;

CARDS Statement

Indicates that data lines follow

DATALINES Statement (New version)

Indicates that data lines follow

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step
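A minimal sketch of such a DATA step. The data set name (person) and the two data lines are assumptions for illustration:

```sas
/* Each data line supplies one observation with two character values */
data person;
  input name $ dept $;
  datalines;
John Sales
Mary Acctng
;
```

The single semicolon on its own line ends the data lines and also ends the DATA step.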

DELETE Statement

Stops processing the current observation

if Team="red" then delete;

FORMAT Statement

Associates formats with variables
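A minimal sketch of the FORMAT statement, assuming the weight_club data set created earlier. The output data set name (club_loss), the computed variable Loss, and the chosen formats are illustrative only:

```sas
/* Hypothetical example: attach display formats to numeric variables */
data club_loss;
  set weight_club;
  Loss = StartWeight - EndWeight;            /* weight lost by each member */
  format StartWeight EndWeight 5.1 Loss 4.;  /* w.d formats control how values print */
run;

proc print data=club_loss;
run;
```

A format changes only how a value is displayed, not the value that is stored in the data set.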

INFORMAT Statement

Associates informats with variables

data two1;
  informat hiredate mmddyy8.;
  input ename $ eid hiredate;
  datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow

data biblio;
  input number citation $50.;
  datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets

data average;
  set weight_club;
  keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables

data rtest;
  set weight_club;
  label name='teamname';
run;

DIFFERENT WAYS TO READ DATA

1RAW DATA IN THE JOB STREAM

You can place data directly in the job stream with the programming statements that make up the DATA step

The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements

data weight_club

input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight

datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118

3DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set

To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements

_ SET statement_ MERGE statement

Data TempSet weight_clubRun

2DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file

data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run

4DATA IN A DBMS FILE

If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set

libname dblib oracle user=scott password=tiger

data employeesset dblibemployeesrun

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement

Data sampSet weight_club (Keep= IdNumber Team)Run

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5) ABSOLUTE INPUT METHOD

In this input method we use column pointer controls (@n and +n) to give the exact location of the data values

data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step

data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

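data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
infile datalines dsd delimiter='<delimiter>';
input X Y Z;
datalines;
1<delimiter>2<delimiter>3
4<delimiter>5<delimiter>6
7<delimiter>8<delimiter>9
;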

data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin

data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;

3. DATA IN A SAS DATA SET

You can also use data that is already stored in a SAS data set as input to a new data set.

To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:

- SET statement
- MERGE statement

data Temp;
set weight_club;
run;

2. DATA IN AN EXTERNAL FILE

If your raw data is already stored in a file, you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow demonstrate the same example, this time with the raw data stored in an external file.

data <dataset name>;
infile 'your-input-file path/filename.extension';
input <var 1> <var 2> ...;
run;

4. DATA IN A DBMS FILE

If you have data that is stored in another vendor's database management system (DBMS) files, you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared that points to a library containing Oracle data, and SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
set dblib.employees;
run;

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:

KEEP=

This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from the data set weight_club.

data samp;
set weight_club (keep=IdNumber Team);
run;

DROP=

Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
set weight_club (drop=IdNumber Name);
run;

OBS=

Specifies when to stop processing observations

data s1;
set weight_club (obs=3);
run;

FIRSTOBS=

Specifies which observation SAS processes first

data s1;
set weight_club (obs=4 firstobs=2);
run;

RENAME=

Changes the name of a variable

data two (rename=(name=Pname));
set weight_club;
run;

PW=

Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file

data two1 (pw=ram);
set weight_club;
run;

WHERE=

Selects observations that meet the specified condition

data tmp;
set weight_club (where=(Name='David Shaw'));
run;

IN=

Creates a variable that indicates whether the data set contributed data to the current observation
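The IN= option is usually seen with MERGE or SET when several data sets are combined. The following is a minimal sketch under stated assumptions: the data sets club_a and club_b, their values, and the variable names inA and inB are hypothetical and do not come from the slides. Each IN= variable is temporary and is set to 1 when its data set contributes to the current observation.

```sas
/* Hypothetical input data sets, already in IdNumber order */
data club_a;
   input IdNumber Name $;
   datalines;
1 Anna
2 Ben
3 Cara
;
run;

data club_b;
   input IdNumber Team $;
   datalines;
2 red
3 yellow
4 red
;
run;

/* inA and inB are temporary flags; they are not written to the output data set */
data both;
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   if inA and inB;   /* keep only IdNumbers found in both data sets */
run;
```

In this sketch only IdNumber 2 and 3 appear in both inputs, so the subsetting IF keeps those two observations.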

DATA STEP STATEMENTS

DATA Statement

Begins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1;
set weight_club;
run;

When Not Creating a Data Set

data _NULL_;
set weight_club;
put Name;
run;

CARDS Statement

Indicates that data lines follow

DATALINES Statement (New version)

Indicates that data lines follow

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
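The code for this example did not survive in the slides, so the following is only a sketch of what such a step could look like. The data set name person and the two data lines are assumptions chosen for illustration; only the variable names NAME and DEPT come from the text.

```sas
/* Sketch: data set name and data values are assumed, not taken from the slides */
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```

Replacing DATALINES with CARDS in this step should behave the same way, since the slides describe both statements as indicating that data lines follow.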

DELETE Statement

Stops processing the current observation

if Team="red" then delete;

FORMAT Statement

Associates formats with variables
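The slides give no example for the FORMAT statement, so here is a minimal sketch. It assumes the weight_club data set created earlier; the output data set name weight_fmt and the computed variable Loss are hypothetical additions for illustration.

```sas
/* Sketch: weight_fmt and Loss are illustrative names, not from the slides */
data weight_fmt;
   set weight_club;
   Loss = StartWeight - EndWeight;             /* weight lost by each member */
   format StartWeight EndWeight 5.1 Loss 4.;   /* display formats only; stored values are unchanged */
run;

proc print data=weight_fmt;
run;
```

A format changes only how values are displayed, whereas an informat controls how raw values are read.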

DATA SET OPTIONS

Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these

KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement

Data sampSet weight_club (Keep= IdNumber Team)Run

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation. The observation is not written to the output data set.

if Team="red" then delete;

FORMAT Statement

Associates formats with variables.
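A short sketch of the FORMAT statement, assuming a hypothetical data set named hired with the same style of values used elsewhere in these slides. The informat controls how the raw value is read; the format only controls how the stored date is displayed.

```sas
data hired;
  informat hiredate mmddyy8.;    /* read 120105 as a date (month, day, year) */
  input ename $ hiredate;
  format hiredate date9.;        /* display the stored date, for example 01DEC2005 */
  datalines;
hari 120105
ravi 110306
;
run;

proc print data=hired;
run;
```

Without the FORMAT statement, PROC PRINT would show hiredate as the underlying SAS date number.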

INFORMAT Statement

Associates informats with variables.

data two1;
  informat hiredate mmddyy8.;   /* declared before INPUT so it is used when reading */
  input ename $ eid hiredate;
  datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement or CARDS4

Indicates that data lines that contain semicolons follow. The data lines end with a line of four semicolons.

data biblio;
  input number citation $50.;
  datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;

DM Statement

Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement

Includes variables in output SAS data sets.

data average;
  set weight_club;
  keep name team;
run;

LABEL Statement

Assigns descriptive labels to variables.

data rtest;
  set weight_club;
  label name='teamname';
run;

LENGTH Statement

Specifies the number of bytes for storing variables.

data testlength;
  input firstname $ lastname $ n1 n2;
  length name $25;
  datalines;
Alexander Robinson 35 11
;
run;

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step demonstrates how to read input data records with column input:

data scores;
  input name $ 1-18 score1 25-27 score2 30-32;

INPUT METHODS

1) List input method

2) Column input method

3) Named input method

4) Formatted input method

5) Absolute input method

1) List input method

In this method, the data values must be separated by at least one space.

Example: the INFORMAT example above reads ename, eid, and hiredate this way.
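A further minimal sketch of list input. The data set name club and its values are hypothetical; the variable names echo those used for weight_club in the earlier slides.

```sas
data club;
  /* variables are listed in the order the values appear on each line */
  input IdNumber Name $ Team $ Weight;
  datalines;
1023 David red 189
1049 Amelia yellow 145
;
run;
```

With plain list input, a character value cannot contain embedded blanks and is stored in 8 bytes by default, which is the limitation that column input removes.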

2) Column input method

In this method, each value is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks.

data temp;
  input id 1-3 name $ 7-18 age 21-22;
  cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) Named input method

In this method, each data value is preceded by its variable name and an equal sign.

data samp;
  input id= name= $ age=;
  cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) Formatted input method

In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable on every data line.

data one;
  input id 3. name $11. age 3.;
  datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) Absolute input method

In this method, the column pointer control (@n) gives the exact starting column of each data value, and +n moves the pointer forward n columns.

data two;
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step, so several observations can be read from one data line.

data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

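The DSD option in the INFILE statement reads comma-delimited data lines. DELIMITER= names a different delimiter.

data num;
  infile datalines dsd;
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
  /* the delimiter character was lost in the source; '|' is assumed here */
  infile datalines dsd delimiter='|';
  input X Y Z;
  datalines;
1|2|3
4|5|6
7|8|9
;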

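The MISSOVER option sets the remaining variables to missing when a data line is too short, instead of reading the next line.

data weather;
  infile datalines missover;
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;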

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1

Data A1Set weight_club (Drop= IdNumber Name)Run

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this input method we are using column hold pointer to give exact location of data values

data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step

data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17

data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789

data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

OBS=

Specifies when to stop processing observations

data s1 set weight_club(obs=3)run

Firstobs=

Specifies which observation SAS processes first

data s1 set weight_club(obs=4 firstobs=2)run

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this input method we are using column hold pointer to give exact location of data values

data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step

data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17

data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789

data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

RENAME=

Changes the name of a variable

data two (rename=(name=Pname))set weight_club run

PW=

Assigns a read write or alter password to a SAS and enables access to a password-protected SAS

data two1 (Pw=ram) set weight_club run

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this input method we are using column hold pointer to give exact location of data values

data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step

data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17

data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789

data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

WHERE=

Selects observations that meets the specified conditiondata weight_club

Data tmpset weight_club (where=(Name =David Shaw))run

IN=

Creates a variable that indicates whether the data set contributed data to the current observation

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this input method we are using column hold pointer to give exact location of data values

data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step

data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17

data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789

data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

DATA STEP STATEMENTS

Data statementBegins a DATA step and provides names for any output SAS data sets

Creating an Output Data Set

data example1

set weight_club run

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this input method we are using column hold pointer to give exact location of data values

data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step

data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17

data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789

data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965

Thanks

Feedback at infoSitworldin

  • PowerPoint Presentation
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • Slide 52
  • Slide 53
  • Slide 54
  • Slide 55
  • Slide 56
  • Slide 57
  • Slide 58
  • Slide 59
  • Slide 60
  • Slide 61
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Slide 66
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • Slide 75
  • Slide 76
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Slide 81
  • Slide 82
  • Slide 83
  • Slide 84
  • Slide 85
  • Slide 86
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Slide 93
  • Slide 94

When Not Creating a Data Set

data _NULL_set weight_clubput Name run

CARDS StatementIndicates that data lines follow

DATALINES Statement (New version)Indicates that data lines follow

Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step

DELETE Statement

Stops processing the current observation

if Team=ldquoredrdquo then delete

FORMAT StatementAssociates formats with variables

INFORMAT StatementAssociates informats with variables

data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run

DATALINES4 Statement or Cards4

Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992

DM Statement

Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements

dm log lsquoclearrsquo

KEEP Statement

Includes variables in output SAS data sets

data averageset weight_clubkeep name teamrun

LABEL Statement

Assigns descriptive labels to variables

data rtestset weight_clublabel name=teamnamerun

LENGTH Statement

Specifies the number of bytes for storing variables

data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11

INPUT Statement

Reads input values from specified columns and assigns them to the corresponding SAS variables

This DATA step demonstrates how to read input data records with column input

data scores

input name $ 1-18 score1 25-27 score2 30-32

INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

4)FORMATTED INPUT METHOD

In this method variables length followed by period to specify the length of the variable for all data values

data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run

5)ABSOLUTE INPUT METHOD

In this method column pointer controls give the exact location of each data value: @n moves the pointer to column n, and +n moves it forward n columns.

data two;
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;

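data num;
  infile datalines dsd;
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;

data nums;
  /* the delimiter character is missing in the source; '&' is assumed here */
  infile datalines dsd delimiter='&';
  input X Y Z;
  datalines;
1&2&3
4&5&6
7&8&9
;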

data weather;
  infile datalines missover;
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;

Thanks

Feedback at infoSitworldin


CARDS Statement

Indicates that data lines follow

DATALINES Statement (new version)

Indicates that data lines follow

Using the DATALINES Statement

In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
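The code for this example did not survive in the transcript. A minimal sketch of what such a DATA step could look like follows; the data set name person and the two data lines are assumptions made for illustration.

```sas
data person;              /* person is a hypothetical data set name */
  input name $ dept $;    /* two character variables read with list input */
  datalines;
John Sales
Mary Acctng
;
run;
```

The single semicolon on its own line marks the end of the data lines.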

DELETE Statement

Stops processing the current observation

if Team="red" then delete;
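To show the statement in context, here is a minimal sketch of a complete DATA step; the data set name teams and the data values are hypothetical. When the condition is true, DELETE stops processing the current observation, so it is not written to the output data set.

```sas
data teams;                        /* teams and its values are hypothetical */
  input name $ team $;
  if team = "red" then delete;     /* observations on the red team are dropped */
  datalines;
anil red
sita blue
ravi red
gita green
;
run;
```

With these four data lines, only the observations for sita and gita are kept.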









INPUT METHODS

1)List INPUT METHOD

2)Column INPUT METHOD

3)NAMED INPUT METHOD

4)FORMATTED INPUT METHOD

5)ABSOLUTE INPUT METHOD

1)List INPUT METHOD

In this method the data values should be seperated by at least single space

EG

-Do-

2)Column INPUT METHOD

In this method character data values contain more than 8 characters and it can contain blank spaces also

data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run

3)NAMED INPUT METHOD

In this method data values are followed by variable names

data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run

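4) FORMATTED INPUT METHOD

In this method each variable is followed by an informat, a width followed by a period (for example 3. or $11.), which tells SAS how many columns to read for that variable. The same widths apply to every data line, so the values must be aligned in the same columns.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;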

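5) ABSOLUTE INPUT METHOD

In this method the column pointer control @n moves the input pointer to an exact column before a value is read, which gives the exact location of each data value. The +n control moves the pointer n columns forward from its current position; here the +5 moves are redundant, because each is immediately followed by an @n that sets the column explicitly.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;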

Holding a Record Across Iterations of the DATA Step

The INPUT statement uses the double trailing at sign (@@) to hold the current record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

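The DSD option on the INFILE statement reads delimiter-separated values: the default delimiter is a comma, and two consecutive delimiters are treated as a missing value. The DELIMITER= option names a different delimiter character.

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

In the second example the delimiter is assumed to be an ampersand, for illustration:

data nums;
   infile datalines dsd delimiter='&';
   input X Y Z;
   datalines;
1&2&3
4&5&6
7&8&9
;
run;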
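The effect of DSD is easier to see on data that contains a quoted value and an empty field. The following is a sketch only: the data set name quoted, the variables, and the values are hypothetical, and the colon modifier with $20. is used so that the character values can be longer than 8 characters.

```sas
/* DSD sketch: quoted values and consecutive delimiters.            */
/* Data set name, variables, and values are illustrative only.      */
data quoted;
   infile datalines dsd;               /* comma is the default DSD delimiter */
   input id name :$20. city :$20.;
   datalines;
1,"Smith, John",Boston
2,,Chicago
;
run;

proc print data=quoted;
run;
```

With DSD, the comma inside the quotation marks is kept as part of the name and the quotation marks themselves are removed, while the two consecutive commas on the second line leave name missing for that observation.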

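The MISSOVER option prevents the INPUT statement from moving to the next line when a line has fewer values than variables; the remaining variables are set to missing.

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;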
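For contrast, the sketch below reads temperature lines of unequal length without MISSOVER, relying on the default FLOWOVER behaviour; the data set name weather_flow is a hypothetical name chosen for illustration. Under that default, when a line runs out of values the INPUT statement continues on the next line, so the short first line should borrow two values from the second line and only two observations should result.

```sas
/* Default FLOWOVER: a short line makes INPUT continue on the next line. */
/* The data set name weather_flow is illustrative only.                  */
data weather_flow;
   infile datalines;            /* no MISSOVER, so FLOWOVER applies */
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;

proc print data=weather_flow;
run;
```

Comparing this output with the MISSOVER version shows why the option matters whenever data lines may be incomplete: MISSOVER keeps one observation per line and fills the gaps with missing values.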

Thanks

Feedback at infoSitworldin
