Statistical Analysis System (SAS)

SAS (Statistical Analysis System) was developed by James Goodnight and colleagues.
- 1970s: a package
- 1980s: a language
- 1990s: software
SAS products

Technical      Techno-Functional              Functional
SAS/BASE       SAS/Warehouse Administrator    SAS/STAT
SAS/MACRO      SAS ETL Studio                 SAS/GRAPH
SAS/ACCESS     SAS/OLAP                       SAS/OR
SAS/AF
Domains in which SAS can be used
CLINICAL
BANKING
INSURANCE
INTRODUCTION TO THE SAS SYSTEM
SAS is an integrated system of software solutions that enables you to perform the following tasks:
- data entry, retrieval, and management
- report writing and graphics design
- statistical and mathematical analysis
- business forecasting and decision support
- applications development

Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer.

Accessing Data
You can access data that is stored almost anywhere, whether it is in a file on your system or in another database system, and in almost any format, including raw data, SAS data sets, and files created by other vendors' software.

Managing Data
After you have accessed your data, you can use the SAS programming language to manipulate it:
- format your data
- create variables (columns)
- use operators to evaluate data values
- use functions to create and recode data values
- subset data
- perform conditional processing
- merge a wide range of data sources
- create, retrieve, and update database information
Analyzing Data and Presenting Information
Once your data is in shape, you can use SAS to analyze data and produce reports. Your SAS output can range from a simple listing of a data set to customized reports of complex relationships.

Analysis
Base SAS provides powerful data analysis tools. For example, you can:
- produce tables, frequency counts, and cross-tabulation tables
- create a variety of charts and plots
- compute a variety of descriptive statistics, including the mean, sum, variance, standard deviation, and more
- compute correlations and other measures of association, as well as multi-way cross-tabulations and inferential statistics
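As a minimal sketch of these analysis tools, the steps below produce frequency counts, a cross-tabulation, and descriptive statistics. The data set name club and all of its values are made up for illustration.

```sas
/* Hypothetical sample data, for illustration only */
data club;
   input Team $ Gender $ Weight;
   datalines;
red M 189
yellow F 145
red M 210
yellow M 194
red F 127
;
run;

/* One-way frequency counts for Team and a Team-by-Gender cross-tabulation */
proc freq data=club;
   tables Team Team*Gender;
run;

/* Descriptive statistics: mean, sum, variance, standard deviation */
proc means data=club mean sum var std;
   var Weight;
run;
```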
Presentation
For reporting and displaying analytical results, SAS gives you an almost limitless number of visually appealing output formats, such as:
- an array of markup languages, including HTML4 and XML
- output that is formatted for a high-resolution printer, such as PostScript, PDF, and PCL files
- RTF
- color graphs that you can make interactive using ActiveX controls or Java applets
SAS WINDOW ENVIRONMENT
There are five windows in SAS:
1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window

1. Editor window
The Editor window is where you write programs, which are saved with the extension .sas. You can type any number of programs in the Editor window and run them all at once or individually.

2. Output window
The results of a program are displayed in the Output window. Saved output has the extension .lst.

3. Log window
Any errors or warnings in the program are reported in the Log window. It also displays the SAS license and version information, and notes such as the number of variables and observations in each data set that is created.

4. Results window
It lists the results of all the programs run from the Editor window. There is no file extension for the Results window.

5. Explorer window
It contains Libraries and My Computer.
SAS LANGUAGE
The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages. In SAS, you use these elements within one of two groups of SAS statements:
- DATA steps
- PROC steps

DATA STEP
A DATA step consists of a group of statements in the SAS language that can:
- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values
Syntax

DATA <data set name>;
   INPUT <var1> <var2> ... <varn>;
   CARDS;
<data values>
;
RUN;

Example:

data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
SAS DATA SETS
A SAS data set consists of the following:
- descriptor information
- data values

The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.

SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.

PROCEDURE STEP BLOCK

Syntax

PROC <procedure name>;
   <statement 1>;
   <statement 2>;
   <statement n>;
RUN;

Example:

proc print data=temp;
run;

proc sort data=temp out=samp;
   by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)

- By default, SAS reads every variable as numeric. To read character data, put the $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set is limited only by system configuration or hard disk space.
- In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
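To illustrate the $ symbol and the display of missing values, here is a small sketch. The data set name types_demo and its values are assumptions, not part of the original slides.

```sas
data types_demo;
   /* $ marks name as character; id and score are numeric by default */
   input id name $ score;
   datalines;
1 hari 90
2 . .
3 ravi 85
;
run;

/* In the listing, the missing score prints as a period and the missing name as a blank */
proc print data=types_demo;
run;
```

With list input, a single period in the raw data stands for a missing value for both numeric and character variables.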
Different Databases
Best DB storage
- Text
- Excel
- Access
- DB2
- Oracle
- Teradata
LIBRARIES
There are two ways of creating libraries:
1. Menu driven
2. Programming (coding)

1. Menu driven
Explorer > right-click > New

2. Through programming
In the Editor window, submit a LIBNAME statement:

LIBNAME <library name> "<path>";

Example:

LIBNAME Hari "D:\Ganesh";

To clear (disassociate) a library reference:

LIBNAME Guru CLEAR;
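Once a libref is assigned, a two-level name (libref.dataset) stores a data set permanently in that library. The sketch below assumes a hypothetical libref mylib and a placeholder folder path; substitute a folder that exists on your system.

```sas
/* Placeholder path, an assumption for illustration only */
libname mylib "C:\sasdata";

/* The two-level name mylib.temp writes the data set into that folder */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

/* Disassociate the libref; the stored files themselves are not deleted */
libname mylib clear;
```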
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.

RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.

OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.

DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- input buffer
- program data vector

INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.

PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.

The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during the current iteration. It has two possible values:
   i) _ERROR_=0 means that no error has been encountered.
   ii) _ERROR_=1 means that an error, such as invalid input data, has been encountered.
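The two automatic variables can be watched in the log with a PUT statement. This is a minimal sketch; the data set name check and the deliberately invalid value abc are assumptions chosen to trigger an error.

```sas
data check;
   input name $ no;
   /* Write the automatic variables and the values just read to the log */
   put _N_= _ERROR_= name= no=;
   datalines;
hari 102
ravi abc
kiran 109
;
run;
```

The second record cannot be read into the numeric variable no, so SAS sets no to missing and _ERROR_ to 1 for that iteration. On the third iteration _ERROR_ is back to 0.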
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).

data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

The following figure shows the input buffer and the program data vector after DATA step compilation.

Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data

The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.

Figure: Program Data Vector with Computed Value of the Sum Statement
Figure: Writing an Observation to the SAS Data Set
Figure: The First Observation in Data Set TOTAL_POINTS

SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
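The retaining behavior of the Sum statement can be spelled out with RETAIN and the SUM function. The sketch below is an equivalent long form under that assumption, using a hypothetical data set name total_points2 and only the first three records.

```sas
data total_points2;
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   /* Long form of  TeamTotal + (Event1 + Event2 + Event3);  */
   retain TeamTotal 0;   /* keep the value across iterations, starting at 0 */
   TeamTotal = sum(TeamTotal, Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;

proc print data=total_points2;
run;
```

TeamTotal is 22 after the first record, 46 after the second, and 67 after the third, because each iteration adds to the retained value instead of starting from missing.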
Compilation
During the compilation phase, SAS:
- checks the code for errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector (PDV)
- assigns required attributes to variables
- creates the descriptor portion of the new data set

Execution
During the execution phase, SAS:
- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process

Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.

The DATA step stops executing after it processes the last input record.
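One way to watch the execution loop is to print the whole PDV before and after the INPUT statement. This is only a sketch; the data set name trace and its two records are assumptions.

```sas
data trace;
   /* _ALL_ writes every variable in the PDV, including _N_ and _ERROR_ */
   put 'Before INPUT: ' _all_;
   input name $ score;
   put 'After INPUT:  ' _all_;
   datalines;
hari 90
ravi 85
;
run;
```

The log shows name and score reset to missing at the top of each iteration. A third "Before INPUT" line appears with _N_=3, but no matching "After INPUT" line, because the step stops when INPUT finds no more records.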
[Figure: the program below sits in the input stack. The word scanner takes it one token at a time, starting with "data temp", and passes the tokens to the compiler.]

data temp;
   input name $ no;
   cards;
hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn, grouping the characters into tokens of four types:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that has special meaning to SAS, for example + - $ ( ) & =
What Are the SAS Language Elements?
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options

Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- renaming variables
- selecting only the first or last n observations for processing
- dropping variables from processing or from the output data set
- specifying a password for a data set

Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.

(option-1=value-1 <...option-n=value-n>)

These examples show data set options in SAS statements:

data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats

Definition of a Format
A format is an instruction that SAS uses to write data values.

Syntax of a Format
SAS formats have the following form:

<$>format<w>.<d>

Here is an explanation of the syntax:
$        indicates a character format; its absence indicates a numeric format.
format   names the format.
w        specifies the format width, which for most formats is the number of columns in the output data.
d        specifies an optional decimal scaling factor in the numeric formats.

data temp;
   amount=1145.32;
   put amount dollar10.2;
run;

The DOLLARw.d format in the PUT statement produces this result:

$1,145.32
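The same stored value can be written in several ways by changing only the format. The sketch below is illustrative; the variable hired and its date are assumptions.

```sas
data _null_;
   amount = 1145.32;
   hired  = '15JAN2007'd;    /* date literal, stored as a number of days */
   put amount comma10.2;     /* writes 1,145.32 */
   put amount dollar10.2;    /* writes $1,145.32 */
   put hired date9.;         /* writes 15JAN2007 */
   put hired ddmmyy10.;      /* writes 15/01/2007 */
run;
```

A format changes only how a value is displayed. The stored value of amount and hired stays the same.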
Informats

Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:

$1,000,000

To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.

Syntax of an Informat
SAS informats have the following form:

<$>informat<w>.<d>

Here is an explanation of the syntax:
$          indicates a character informat; its absence indicates a numeric informat.
informat   names the informat.
w          specifies the informat width.
d          specifies an optional decimal scaling factor in the numeric informats.

data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
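The $1,000,000 value discussed in this section can be read as follows. This is a minimal sketch; the data set name sales and the variable name amount are assumptions.

```sas
data sales;
   /* COMMA11. strips the dollar sign and the commas while reading */
   input amount comma11.;
   /* The log shows the stored number: amount=1000000 */
   put amount=;
   datalines;
$1,000,000
;
run;
```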
Functions

Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.

Syntax of Functions
The syntax of a function is as follows:

function-name (argument-1<, ...argument-n>)

x=max(cash, credit);
x=sqrt(1500);
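A few functions in a runnable form, as a sketch. The input values here are assumptions chosen so that the results are easy to check.

```sas
data _null_;
   cash = 250;
   credit = 400;
   x = max(cash, credit);        /* larger of the two arguments: 400 */
   y = sqrt(1600);               /* square root: 40 */
   z = upcase('hari');           /* character function: HARI */
   w = round(1145.327, 0.01);    /* rounded to two decimals: 1145.33 */
   put x= y= z= w=;
run;
```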
Statements

Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.

Examples of DATA step statements:
INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN

SAS System Options
System options are instructions that affect your SAS session.

Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:

OPTIONS option(s);

Here is an explanation of the syntax:
option   specifies one or more SAS system options that you want to change.

options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, then the data is called standard data.
Example: 467

NON-STANDARD DATA
If the data values are not in a standard format, then the data is called non-standard data.
Examples:
18-10-05
45,000
$21,000

Informats are used to read non-standard data.

data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats

Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.

Numeric Informats

Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.

SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
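Each of these attributes can be set explicitly in a DATA step and then inspected with PROC CONTENTS. The sketch below uses a hypothetical data set staff; the labels and the data line are assumptions.

```sas
data staff;
   length name $ 20;                 /* type (character) and length */
   informat hired ddmmyy8.;          /* how the raw value is read */
   format hired date9.;              /* how the value is displayed */
   label name  = 'Employee name'
         hired = 'Hire date';
   input name $ hired;
   datalines;
hari 21-10-84
;
run;

/* Lists each variable's name, type, length, format, informat, and label */
proc contents data=staff;
run;
```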
DIFFERENT WAYS TO READ DATA

1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step. The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.

data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set. To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement

data temp;
   set weight_club;
run;

3. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the same kind of DATA step, this time with the raw data stored in an external file.

data <data set name>;
   infile 'your-input-file-path\filename.extension';
   input <var 1> <var 2> ...;
run;
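For the external-file case, a self-contained sketch is shown below. It first writes a temporary file so that there is something to read; the fileref raw, the data set name from_file, and the records are all assumptions.

```sas
/* FILENAME ... TEMP creates a temporary file, used here only to keep the sketch self-contained */
filename raw temp;

/* Write two raw records to the file */
data _null_;
   file raw;
   put '1023 David 189';
   put '1049 Amelia 145';
run;

/* Read the external file with INFILE instead of DATALINES */
data from_file;
   infile raw;
   input IdNumber Name $ StartWeight;
run;

proc print data=from_file;
run;
```

In practice the INFILE statement would name an existing file by its quoted path or by a fileref assigned to that path.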
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.

libname dblib oracle user=scott password=tiger;

data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as the following.

KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.

data samp;
   set weight_club (keep=IdNumber Team);
run;

DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.

data A1;
   set weight_club (drop=IdNumber Name);
run;

OBS=
Specifies when to stop processing observations.

data s1;
   set weight_club (obs=3);
run;

FIRSTOBS=
Specifies which observation SAS processes first. Together, FIRSTOBS=2 and OBS=4 read observations 2 through 4.

data s1;
   set weight_club (obs=4 firstobs=2);
run;

RENAME=
Changes the name of a variable.

data two (rename=(name=Pname));
   set weight_club;
run;

PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.

data two1 (pw=ram);
   set weight_club;
run;

WHERE=
Selects observations that meet the specified condition.

data tmp;
   set weight_club (where=(Name='David Shaw'));
run;

IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
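IN= is most often used when merging. The sketch below keeps only the IDs that appear in both inputs; the data sets names and scores, and the flag variables in_n and in_s, are hypothetical.

```sas
/* Hypothetical inputs, both already in id order as MERGE with BY requires */
data names;
   input id name $;
   datalines;
1 hari
2 ravi
3 kiran
;
run;

data scores;
   input id score;
   datalines;
1 90
3 75
;
run;

data matched;
   /* in_n and in_s are temporary 0/1 flags and are not written to the output */
   merge names(in=in_n) scores(in=in_s);
   by id;
   if in_n and in_s;   /* keep only ids present in both data sets */
run;
```

Here matched contains the observations for id 1 and id 3, because id 2 has no record in scores.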
DATA STEP STATEMENTS

DATA Statement
Begins a DATA step and provides names for any output SAS data sets.

Creating an output data set:

data example1;
   set weight_club;
run;

When not creating a data set:

data _null_;
   set weight_club;
   put Name;
run;

CARDS Statement
Indicates that data lines follow.

DATALINES Statement (newer name for CARDS)
Indicates that data lines follow.

Using the DATALINES statement: in this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
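A minimal sketch of such a DATA step follows. The data set name person and the two data lines are assumptions.

```sas
data person;
   /* Both variables are character, so each name is followed by $ */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```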
DELETE Statement
Stops processing the current observation.

if Team="red" then delete;

FORMAT Statement
Associates formats with variables.

INFORMAT Statement
Associates informats with variables.

data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

DATALINES4 Statement (or CARDS4)
Indicates that data lines that contain semicolons follow. The data must end with a line of four semicolons.

data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL 1995; BRADY 1993
3 BERG 1990; ROA 1994; WILLIAMS 1992
;;;;
run;

DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.

dm log 'clear';

KEEP Statement
Includes variables in output SAS data sets.

data average;
   set weight_club;
   keep name team;
run;

LABEL Statement
Assigns descriptive labels to variables.

data rtest;
   set weight_club;
   label name='teamname';
run;

LENGTH Statement
Specifies the number of bytes for storing variables.

data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
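The position of the LENGTH statement matters for variables that are read with list input, which gives character variables a default length of 8 bytes. The sketch below declares the lengths before the INPUT statement; the data set name names_len and the length of 15 are assumptions.

```sas
data names_len;
   /* Declared before INPUT, so the full first name fits */
   length firstname lastname $ 15;
   input firstname $ lastname $ n1 n2;
   datalines;
Alexander Robinson 35 11
;
run;

proc print data=names_len;
run;
```

Without the LENGTH statement, firstname would be stored as Alexande, because only the first 8 characters are kept.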
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.

This DATA step fragment demonstrates how to read input data records with column input:

data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List input method
2) Column input method
3) Named input method
4) Formatted input method
5) Absolute input method

1) LIST INPUT METHOD
In this method, the data values must be separated by at least a single space.
Example: as in the earlier examples.

2) COLUMN INPUT METHOD
In this method, each variable is read from a fixed range of columns, so character values can be longer than 8 characters and can contain blank spaces.

data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;

3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.

data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;

4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable.

data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;

5) ABSOLUTE INPUT METHOD
In this method, column pointer controls (@n and +n) give the exact location of the data values.

data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
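Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold a record across iterations of the DATA step, so that several observations can be read from one data line.

data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

The DSD option in the INFILE statement reads delimited data. The default delimiter is a comma, and the DELIMITER= option names a different one.

data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;

data nums;
   infile datalines dsd delimiter='<delimiter characters>';
   input X Y Z;
   datalines;
<three values per line, separated by the delimiter characters>
;
run;

The MISSOVER option keeps SAS from moving to the next line when a data line has fewer values than the INPUT statement expects. The remaining variables are set to missing.

data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;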
Thanks
Feedback at infoSitworldin
SAS (Statistical Analysis System)
It was developed by James Good knight
1970-It was package1980-Language 1990-Software
SAS
Technical Techno Functional Functional
| | |
SASBASE SASWare house admin SASStat
SASMACROrsquoS SASETL Studio SASGraph
SASACCESS SASOLAP SASOR
SASAF
In which domains SAS can be used
CLINICAL
BANKING
INSURANCE
INTRODUCTION TO THE SAS SYSTEM
SAS is an integrated system of software solutions that enables you to perform thefollowing tasks
data entry retrieval and managementreport writing and graphics designstatistical and mathematical analysisbusiness forecasting and decision supportapplications development
Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer
Accessing Datayou can access data that is stored almost anywhere whether it is in a file on your system or data that is stored another database system
In almost any format including raw data SAS data sets and files created by other vendors software
Managing Data After you have accessed your data you can use the SAS programming language to manipulate it
Format your data create variables (columns) use operators to evaluate data values use functions to create and recode data values subset data perform conditional processingmerge a wide range of data sources create retrieve and update database information
Analyzing Data and Presenting Information Once your data is in shape you can use SAS to analyze data and produce reports Your SAS output can range from a simple listing of a data set to customized reports of complex relationships
Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables
create a variety of charts and plots
compute a variety of descriptive statistics including the mean sum variance standard deviation and more
compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics
Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets
SAS WINDOW ENVIRONMENT
Five windows in SAS
1Editor window2Output window3Log window4Result window5Explorer window
1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually
2OUT PUT window Results of program will be displayed in output window which has an extension of LIST
3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations
4Result windowIt displays result of all the programs in editor windowNo extension for result window
5Explorer windowContains Libraries and Mycomputer
SAS LANGUAGE
The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages
In SAS you use these elements within one of two groups of SAS statements
DATA stepsPROC steps
DATA STEPA DATA step consists of a group of statements in the SAS language that can
read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values
Syntax
DATA ltDATA Set Namegt
INPUT ltvar1gtltvar2gthellipltvarngt
CARDS
Data valuesRUN
EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run
SAS DATA SETSA SAS data set consists of the following-descriptor information-data values
The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic
SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program
PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files
PROCEDURE STEP BLOCK
SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun
EG
proc print data=temp
Run
proc sort data=temp out=sampby namerun
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
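The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during execution. It takes one of two values:
i) _ERROR_=0 means that no error has been encountered.
ii) _ERROR_=1 means that an error, such as invalid input data, has occurred.
Both variables exist only in the PDV and are not written to the output data set.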
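To see the two automatic variables in action, the following minimal sketch writes them to the log on every iteration. The variable x and the data values are hypothetical; the second record is deliberately invalid.

```sas
data _null_;
   input x;
   /* Write the automatic variables and x to the log on each iteration */
   put _N_= _ERROR_= x=;
   datalines;
10
abc
30
;
run;
```

On the second record the value abc cannot be read as a number, so SAS sets x to missing and _ERROR_ to 1 for that iteration. _ERROR_ is reset to 0 when the next iteration begins.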
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example that follows), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
[Figure: Input Buffer and Program Data Vector after DATA step compilation]
[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.
[Figure: Position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]
[Figure: Program Data Vector with Computed Value of the Sum Statement]
Writing an Observation to the SAS Data Set
[Figure: The First Observation in Data Set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase SAS
Checks the code for syntax errors
Translates the code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the program data vector
Assigns the required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
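As a rough check of the retained Sum statement and the _N_ counter, the sketch below writes both to the log on each iteration. Using DATA _NULL_ and only the first three data lines are choices made for this illustration.

```sas
data _null_;
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   /* Sum statement: TeamTotal starts at 0 and is retained across iterations */
   TeamTotal + (Event1 + Event2 + Event3);
   put _N_= TeamTotal=;
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

With these three records the log should show TeamTotal growing from 22 to 46 to 67 while _N_ goes from 1 to 3.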
[Figure: Input stack, word scanner, and compiler. The input stack holds the submitted program shown below.]
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements?
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <... option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
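Several data set options can be combined on one data set, and options can appear on input as well as output data sets. The sketch below is illustrative only; the data set names all_points and first_two and the new variable name Participant are hypothetical.

```sas
/* Build a small input data set (names and values are hypothetical) */
data all_points;
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
;
run;

/* OBS= limits what is read; DROP= and RENAME= shape what is written */
data first_two (drop=TName rename=(PName=Participant));
   set all_points (obs=2);
run;
```

Here OBS=2 applies to the data set being read, while DROP= and RENAME= apply only to the data set being created.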
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
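The same stored value can be written in different ways simply by changing the format. The following sketch writes one amount and one date to the log with several standard formats; the variable hired and its date value are hypothetical.

```sas
data _null_;
   amount = 1145.32;
   hired  = '15JAN2005'd;     /* hypothetical date constant */
   put amount comma10.2;      /* commas, two decimal places */
   put amount 8.1;            /* plain numeric, one decimal place */
   put hired date9.;          /* ddMONyyyy style */
   put hired ddmmyy10.;       /* dd/mm/yyyy style */
run;
```

The formats change only how the values are displayed; the stored values are unchanged.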
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
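The dollar-amount case described above can be sketched as follows. The variable name amount is an assumption made for this illustration.

```sas
data _null_;
   /* COMMA11. strips the dollar sign and commas while reading */
   input amount comma11.;
   put amount=;               /* stored as a plain number */
   put amount= dollar12.;     /* written back with a format */
   datalines;
$1,000,000
;
run;
```

The informat affects only how the raw text is read; a format is still needed to display the dollar sign and commas again.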
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
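A minimal sketch of these function calls inside a DATA step follows. The values of cash and credit are hypothetical, and ROUND and UPCASE are added only to show a few more standard functions.

```sas
data _null_;
   cash = 1200;                 /* hypothetical values */
   credit = 950;
   x = max(cash, credit);       /* largest of the arguments */
   y = sqrt(1500);              /* square root */
   z = round(y, 0.01);          /* round to two decimal places */
   name = upcase('hari');       /* character function */
   put x= y= z= name=;
run;
```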
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Commonly used DATA step statements include:
INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in a standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
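The informat and format pairs in these tables can be tried together in one step. The sketch below is an illustration only: the data set name pay and its variables are hypothetical, and the colon modifier is used so that each informat reads up to the next blank.

```sas
data pay;
   /* The colon modifier applies the informat to the next blank-delimited value */
   input name $ salary :comma6. bonus :dollar6. joined :date9.;
   format salary comma6. bonus dollar6. joined date9.;
   datalines;
hari 25,000 $3,000 22jan1989
;
run;

proc print data=pay;
run;
```

The informats turn the raw text into plain numbers and a SAS date value; the FORMAT statement controls how PROC PRINT displays them.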
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
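The attributes listed above can be set explicitly and then inspected with PROC CONTENTS. This is a minimal sketch; the data set name members, its variables, and the label text are hypothetical.

```sas
data members;
   length Name $ 20;                       /* type and length */
   informat Joined date9.;                 /* informat used when reading */
   format Joined ddmmyy10.;                /* format used when writing */
   label Name = 'Member name'
         Joined = 'Joining date';          /* descriptive labels */
   input Name $ Joined;
   datalines;
hari 22jan1989
;
run;

/* Lists each variable with its type, length, format, informat, and label */
proc contents data=members;
run;
```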
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form when the raw data is stored in an external file:
data <dataset-name>;
   infile 'your-input-file-path\filename.extension';
   input <var-1> <var-2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
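OBS=
Specifies the last observation that SAS processes in a data set.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies the first observation that SAS processes in a data set. Combined with OBS=, it selects a range; the following step reads observations 2 through 4.
data s1;
   set weight_club (firstobs=2 obs=4);
run;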
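Because OBS= names the last observation to read rather than a count, it can help to try the two options on a tiny data set. The sketch below is self-contained; the data set names five and middle and the variable n are hypothetical.

```sas
/* Create five observations with n = 1, 2, 3, 4, 5 */
data five;
   do n = 1 to 5;
      output;
   end;
run;

/* FIRSTOBS=2 and OBS=4 together read observations 2, 3 and 4 */
data middle;
   set five (firstobs=2 obs=4);
run;

proc print data=middle;
run;
```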
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
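A typical use of IN= is in a merge, to record which input data sets supplied each observation. The following is a minimal sketch; the data sets club and paid, their variables, and the flag HasPaid are all hypothetical.

```sas
data club;
   input Id Name $;
   datalines;
1 hari
2 ravi
3 kiran
;
run;

data paid;
   input Id Amount;
   datalines;
1 500
3 750
;
run;

data status;
   merge club (in=inClub) paid (in=inPaid);
   by Id;
   /* IN= variables are temporary, so copy the flag to keep it */
   HasPaid = inPaid;
run;
```

In this sketch HasPaid is 1 for Id 1 and 3 and 0 for Id 2, because the paid data set contributed nothing to that observation.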
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (newer name for CARDS)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
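A minimal sketch of such a step might look like this. The data set name person and the data values are hypothetical.

```sas
data person;
   /* Two character variables read with list input */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```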
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
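Placed in a complete step, the DELETE statement keeps the matching observations out of the output data set. The sketch below uses a hypothetical data set named teams with made-up values.

```sas
data teams;
   input Name $ Team $;
   /* Observations on the red team are not written to the output data set */
   if Team = 'red' then delete;
   datalines;
David red
Amelia yellow
Alan red
Ravi yellow
;
run;
```

Only the two yellow-team observations are written, because DELETE returns to the top of the DATA step before the automatic output takes place.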
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data must end with a line of four semicolons.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
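The fragment above stops before any data lines. A complete version might look like the sketch below; the data values are hypothetical and are aligned to the column ranges named in the INPUT statement.

```sas
data scores;
   /* name in columns 1-18, score1 in 25-27, score2 in 30-32 */
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph                  11   32
Mitchel                 13   29
Sue Ellen               14   27
;
run;
```

Because the name is read from a fixed column range, a value such as Sue Ellen can contain a blank.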
INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method the data values should be separated by at least a single space.
Eg: same as the earlier examples, such as data temp with input name $ no.
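A minimal sketch of list input, reusing the variable names from the earlier temp example:

```sas
data temp;
   /* Values are read in order, separated by one or more blanks */
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
;
run;
```

With list input and no LENGTH statement, character values are stored with a default length of 8 and cannot contain embedded blanks.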
2) COLUMN INPUT METHOD
In this method SAS reads each value from the column range given after the variable name. Character values can be longer than 8 characters and can contain blank spaces.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method an informat (a width followed by a period) after each variable name specifies how many columns to read for that variable. The data values must be aligned to those widths.
data one;
   input id 3. name $ 11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this method column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns to the right.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the current record, so that the next iteration of the DATA step continues reading from the same line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
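For contrast, the following sketch reads the same two data lines without the double trailing @. The data set name test_single is hypothetical.

```sas
data test_single;
   /* No @@: each iteration starts on a new record */
   input name $ age;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;

proc print data=test_single;
run;
```

Without @@ the step creates only two observations (John 13 and Marc 22), because the rest of each line is discarded when the iteration ends. With @@ the same data yields six observations.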
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
   infile datalines dsd delimiter='…';
   input X Y Z;
   datalines;
1…2…3
4…5…6
7…8…9
;
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
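The effect of MISSOVER is easiest to see next to the default behavior (FLOWOVER). The sketch below reads temperature data with a short first line and no MISSOVER option; the data set name weather_flow and the line layout are assumptions made for this illustration.

```sas
data weather_flow;
   /* Default FLOWOVER: a short record makes INPUT continue on the next line */
   infile datalines;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;

proc print data=weather_flow;
run;
```

Here the first observation takes temp4 and temp5 from the second line, and the rest of that line is lost, so only two observations are created. With MISSOVER the same data gives three observations, and temp4 and temp5 are set to missing in the first one.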
Thanks
Feedback at infoSitworldin
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form of a DATA step that reads raw data stored in an external file.
data <dataset name>;
infile 'your-input-file path/filename.extension';
input <var 1> <var 2> ...;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies the last observation that SAS processes.
data s1;
set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first.
data s1;
set weight_club (obs=4 firstobs=2);
run;
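Note that OBS= names the number of the last observation to read, not a count of observations. A minimal sketch, using a made-up data set numbers that holds the values 1 to 10:

```sas
/* Build ten observations, n = 1 to 10. */
data numbers;
   do n = 1 to 10;
      output;
   end;
run;

/* FIRSTOBS=2 with OBS=4 reads observations 2, 3, and 4. */
data middle;
   set numbers (firstobs=2 obs=4);
run;

proc print data=middle;
run;
```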
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
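IN= is most often used when combining data sets. The sketch below assumes two small made-up data sets, club and fees, and keeps only the IdNumber values that appear in both.

```sas
data club;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
1219 red
;
run;

data fees;
   input IdNumber Fee;
   datalines;
1023 50
1219 75
1300 60
;
run;

/* Both inputs are already in IdNumber order, as MERGE with BY requires. */
data paid;
   merge club (in=inClub) fees (in=inFees);
   by IdNumber;
   /* inClub and inFees are temporary 0/1 flags; keep only the matches. */
   if inClub and inFees;
run;
```

The IN= variables exist only during the DATA step and are not written to the output data set.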
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
set weight_club;
put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (new version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
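A minimal sketch of such a DATA step. The data set name person and the data values are assumptions, since the original example did not survive in the transcript.

```sas
data person;
   /* Two character variables read with list input. */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```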
DELETE Statement
Stops processing the current observation. The observation is not written to the output data set.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
input ename $ eid hiredate;
informat hiredate mmddyy8.;
datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data ends at a line of four semicolons.
data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
set weight_club;
keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
set weight_club;
label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
input firstname $ lastname $ n1 n2;
length name $25;
datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values should be separated by at least a single space.
EG
-Do- (ditto: same as the earlier examples whose values are separated by blanks)
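List input can be sketched as follows. The data set list_demo and its variables are hypothetical; the second data line shows that extra blanks between values do no harm.

```sas
data list_demo;
   /* Values are read in order; each blank-separated field maps to one variable. */
   input name $ age weight;
   datalines;
hari 25 61.5
ravi    31   70
;
run;

proc print data=list_demo;
run;
```

With plain list input a character value cannot contain blanks and is stored with a default length of 8, which is why the other input methods exist.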
2) Column INPUT METHOD
In this method character data values can contain more than 8 characters and can also contain blank spaces.
data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign (variable=value).
data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat: a width followed by a period, which specifies how many columns to read for that variable in every data line.
data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold a record in the input buffer across iterations of the DATA step, so that several observations can be read from one data line.
data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
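The DSD option in the INFILE statement reads comma-separated values. The DELIMITER= option names a different separator.
data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
run;
data nums;
/* The delimiter character was lost in the source; '|' is assumed here. */
infile datalines dsd delimiter='|';
input X Y Z;
datalines;
1|2|3
4|5|6
7|8|9
;
run;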
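The MISSOVER option sets the remaining variables to missing when a data line has fewer values than the INPUT statement expects, instead of reading them from the next line.
data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;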
Thanks
Feedback at infoSitworldin
SAS WINDOW ENVIRONMENT
Five windows in SAS
1. Editor window
2. Output window
3. Log window
4. Result window
5. Explorer window
1. Editor window
The Editor window holds the programs, which are saved with the extension .SAS. We can type any number of programs in the Editor window and execute them all at once or individually.
2. Output window
The results of a program are displayed in the Output window, which is saved with the extension .LST.
3. Log window
If there are any errors or warnings in the program, those messages are displayed in the Log window. It also displays the SAS license information, the SAS version, the number of variables, and the number of observations.
4. Result window
It displays the results of all the programs run from the Editor window. There is no extension for the Result window.
5. Explorer window
Contains Libraries and My Computer.
SAS LANGUAGE
The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.
In SAS, you use these elements within one of two groups of SAS statements:
- DATA steps
- PROC steps
DATA STEP
A DATA step consists of a group of statements in the SAS language that can
- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values
Syntax
DATA <data set name>;
INPUT <var1> <var2> ... <varn>;
CARDS;
Data values
;
RUN;
EG
data temp;
input name $ no;
datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
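Two of the DATA step capabilities listed earlier, creating multiple data sets in one step and creating accumulating totals, can be sketched like this. The data set names, variable names, and values are all made up for illustration.

```sas
/* Two output data sets are named in a single DATA statement. */
data red_team yellow_team;
   input Name $ Team $ Weight;
   /* Sum statement: TotalWeight keeps a running total across observations. */
   TotalWeight + Weight;
   /* OUTPUT with a data set name routes the observation explicitly. */
   if Team = 'red' then output red_team;
   else output yellow_team;
   datalines;
David red 189
Amelia yellow 145
Alan red 210
;
run;
```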
SAS DATA SETS
A SAS data set consists of the following:
- descriptor information
- data values
The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.
SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
PROCEDURE STEP BLOCK
Syntax
PROC <procedure name>;
Statement 1;
Statement 2;
Statement n;
RUN;
EG
proc print data=temp;
run;
proc sort data=temp out=samp;
by name;
run;
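Beyond printing and sorting, a PROC step can summarize data. The sketch below rebuilds the small temp data set so that it runs on its own; the choice of PROC MEANS and PROC FREQ and of the statistics requested is an assumption made for illustration.

```sas
data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;

/* Descriptive statistics for the numeric variable. */
proc means data=temp n mean min max;
   var no;
run;

/* Frequency count of each name. */
proc freq data=temp;
   tables name;
run;
```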
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
By default, SAS reads every variable as numeric. To read character data, put the $ symbol after the variable name.
A character variable can hold up to 32,767 characters.
SAS reads data values observation by observation.
The number of observations in a SAS data set depends on the system configuration or hard disk space.
In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
Different Data bases
Best db storage
Text, Excel, Access, DB2, Oracle, Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer
|
Right click
|
New
2. Through programming
In the Editor window:
LIBNAME <name of library> "<path>";
Example
LIBNAME Hari "D:\Ganesh";
To delete a library (the libref is cleared; the files in the folder remain)
LIBNAME Guru clear;
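Once a libref is assigned, a two-level name stores a data set permanently in that folder. A minimal sketch, assuming a hypothetical libref mylib and a hypothetical folder C:\sasdata that already exists:

```sas
/* Hypothetical folder; it must already exist on disk. */
libname mylib "C:\sasdata";

/* A two-level name (libref.dataset) stores the data set in that folder. */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

/* Remove the libref; the stored files are not deleted. */
libname mylib clear;
```

A data set created with a one-level name, such as temp, goes to the temporary WORK library and is removed when the session ends.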
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- INPUT BUFFER
- PROGRAM DATA VECTOR
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.
The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during the current iteration. It has 2 values:
i) _ERROR_=0 means no error has occurred
ii) _ERROR_=1 means an error has occurred, for example invalid data in a numeric field
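The two automatic variables can be watched in the log with a PUT statement. This is a sketch with made-up data; the second data line deliberately contains a non-numeric score so that _ERROR_ is set for that iteration only.

```sas
data check;
   input id score;
   /* Write the iteration counter and the error flag to the log. */
   put _n_= _error_= id= score=;
   datalines;
1 90
2 abc
3 75
;
run;
```

The automatic variables are available during the DATA step but are not written to the output data set.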
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTS
Output SAS Data Set TOTAL_POINTS: 1st observation
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase SAS
- checks the code for syntax errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector (PDV)
- assigns required attributes to variables
- creates the descriptor portion of the new data set
Execution
During the execution phase SAS
- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
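The execution loop can be traced by dumping the PDV to the log on every iteration. The sketch below is a shortened copy of the TOTAL_POINTS step with only two data lines; the PUT _ALL_ statement is an addition made for illustration.

```sas
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   /* Dump every variable in the PDV, including _N_ and _ERROR_. */
   put _all_;
   datalines;
Knights Sue 6 8 8
Knights John 7 7 7
;
run;
```

The log should show TeamTotal retained from the first iteration (22) and increased to 43 in the second, while _N_ goes from 1 to 2.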
Input stack, Word Scanner, Compiler
The submitted program is placed in the input stack:
data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;
[Figure: the word scanner has taken the first statement, data temp, from the input stack and passed it to the compiler; the remaining statements are still waiting in the input stack.]
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that has special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$
indicates a character format; its absence indicates a numeric format.
format
names the format.
w
specifies the format width, which for most formats is the number of columns in the output data.
d
specifies an optional decimal scaling factor in the numeric formats.
data temp;
amount=1145.32;
put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
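The same stored value can be written in several ways just by changing the format. A small sketch, assuming a made-up date value alongside the amount; DATA _NULL_ is used so that nothing but log output is produced.

```sas
data _null_;
   amount = 1145.32;
   today  = '22JAN1989'd;   /* date literal, stored as a number */
   put amount dollar10.2;   /* currency with two decimals */
   put amount comma10.;     /* rounded, with a thousands separator */
   put today date9.;        /* writes 22JAN1989 */
   put today ddmmyy10.;     /* day, month, year with separators */
run;
```

A format never changes the stored value; it only changes how the value is written.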
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$
indicates a character informat; its absence indicates a numeric informat.
informat
names the informat.
w
specifies the informat width.
d
specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
input ename $ edate;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
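The two function calls above can be run inside a DATA step as sketched here. The values of cash and credit and the extra UPCASE call are assumptions added for illustration.

```sas
data _null_;
   cash = 1200;
   credit = 950;
   largest = max(cash, credit);   /* larger of the two arguments */
   root = sqrt(1500);             /* square root */
   name = upcase('hari');         /* a character function */
   /* Write the results to the log. */
   put largest= root= name=;
run;
```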
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of statements:
INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is
OPTIONS option(s);
Here is an explanation of the syntax:
option
specifies one or more SAS system options that you want to change.
options nodate linesize=72;
INTRODUCTION TO THE SAS SYSTEM
SAS is an integrated system of software solutions that enables you to perform thefollowing tasks
data entry retrieval and managementreport writing and graphics designstatistical and mathematical analysisbusiness forecasting and decision supportapplications development
Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer
Accessing Datayou can access data that is stored almost anywhere whether it is in a file on your system or data that is stored another database system
In almost any format including raw data SAS data sets and files created by other vendors software
Managing Data After you have accessed your data you can use the SAS programming language to manipulate it
Format your data create variables (columns) use operators to evaluate data values use functions to create and recode data values subset data perform conditional processingmerge a wide range of data sources create retrieve and update database information
Analyzing Data and Presenting Information Once your data is in shape you can use SAS to analyze data and produce reports Your SAS output can range from a simple listing of a data set to customized reports of complex relationships
Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables
create a variety of charts and plots
compute a variety of descriptive statistics including the mean sum variance standard deviation and more
compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics
Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets
SAS WINDOW ENVIRONMENT
There are five windows in SAS:
1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window
1. Editor window
The Editor window holds your programs, which are saved with the extension .sas. You can type any number of programs in the Editor window and run them all at once or individually.
2. Output window
The results of a program are displayed in the Output window. Saved output has the extension .lst.
3. Log window
Any errors or warnings in the program are reported in the Log window. The log also shows the SAS license and version information and, for each data set created, the number of variables and observations.
4. Results window
Lists the results of all the programs run from the Editor window. There is no file extension for the Results window.
5. Explorer window
Contains Libraries and My Computer.
SAS LANGUAGE
The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.
In SAS, you use these elements within one of two groups of SAS statements:
- DATA steps
- PROC steps
DATA STEP
A DATA step consists of a group of statements in the SAS language that can:
- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values
Syntax
DATA <data-set-name>;
INPUT <var1> <var2> ... <varn>;
CARDS;
data values
;
RUN;
Example
data temp;
input name $ no;
datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
SAS DATA SETS
A SAS data set consists of the following:
- descriptor information
- data values
The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.
SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
PROCEDURE STEP BLOCK
Syntax
PROC <procedure-name>;
Statement 1;
Statement 2;
Statement n;
RUN;
Example
proc print data=temp;
run;
proc sort data=temp out=samp;
by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
- By default, SAS reads every variable as numeric. To read character data, put a $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set depends on the system configuration or hard disk space.
- In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
Different Databases
Best DB storage:
- Text
- Excel
- Access
- DB2
- Oracle
- Teradata
LIBRARIES
There are two ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer > right-click > New
2. Through programming
In the Editor window:
LIBNAME <name-of-library> "<path>";
Example
LIBNAME Hari "D:\Ganesh";
To delete a library
LIBNAME Guru CLEAR;
This removes the library reference only. The files stored in the folder are not deleted.
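A minimal sketch of how a libref is used once it has been assigned. The libref mylib and the folder C:\sasdata are placeholders chosen for this example, not names from the slides, and the folder is assumed to exist already.

```sas
/* Assign the libref (hypothetical path; the folder must already exist) */
libname mylib "C:\sasdata";

/* A two-level name (libref.dataset) stores the data set permanently */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

/* A one-level name is stored in the temporary WORK library */
data temp;
   set mylib.temp;
run;

/* Release the libref; the stored files stay on disk */
libname mylib clear;
```

Data sets written with a one-level name disappear when the SAS session ends, which is the main reason to assign a library for anything you want to keep.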
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- input buffer
- program data vector
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
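The PDV contains two automatic variables. They are available while the DATA step runs but are not written to the output data set.
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred while the current observation was processed. It takes one of two values:
i) _ERROR_=0 means that no error has been detected.
ii) _ERROR_=1 means that an error, such as invalid input data, has occurred.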
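To watch the two automatic variables change, a small sketch like the following can help. The data set name check and the deliberately invalid value abc are assumptions made for this illustration.

```sas
data check;
   input name $ no;
   /* Write the automatic variables and the input values to the log */
   put _N_= _ERROR_= name= no=;
   datalines;
hari 102
ravi abc
kiran 109
;
run;
```

In the log, _N_ should count 1, 2, 3. The second record should set _ERROR_ to 1 and leave no missing, because abc is not a valid numeric value, and _ERROR_ should be back to 0 for the third record.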
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the input buffer and the program data vector after DATA step compilation.
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTS
Output SAS Data Set TOTAL_POINTS: 1st observation
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase, SAS:
- Checks the code for errors
- Translates the code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set
Execution
During the execution phase, SAS:
- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
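The iteration described above can be followed in the log with a shortened version of the TOTAL_POINTS step. This is only a sketch: the PUT _ALL_ statement and the reduced set of three records are additions made for illustration.

```sas
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   /* Dump the whole PDV, including _N_ and _ERROR_, once per iteration */
   put _all_;
   datalines;
Knights Sue 6 8 8
Knights John 7 7 7
Knights Lisa 8 9 9
;
run;
```

The three log lines should show TeamTotal as 22, 43, and 69, because the Sum statement retains its value, while the INPUT variables are replaced on every pass and _N_ counts 1, 2, 3.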
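Word Scanner and Compiler
[Figure: the input stack holds the submitted program. The word scanner takes tokens from the input stack and passes them to the compiler. In the second panel, the tokens of the DATA statement (Data temp) have moved from the input stack to the compiler.]
Program in the input stack:
Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;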
"The process that SAS uses to extract words and symbols from the input stack to the word scanner is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes these types of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
- $ indicates a character format; its absence indicates a numeric format
- format names the format
- w specifies the format width, which for most formats is the number of columns in the output data
- d specifies an optional decimal scaling factor in the numeric formats
data temp;
amount=1145.32;
put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
- $ indicates a character informat; its absence indicates a numeric informat
- informat names the informat
- w specifies the informat width
- d specifies an optional decimal scaling factor in the numeric informats
data tmp1;
input ename $ edate;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1 <, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
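The two function calls above can be tried in a complete step. In this sketch the values assigned to cash and credit, and the extra ROUND and UPCASE calls, are assumptions added for illustration.

```sas
data _null_;
   cash = 1200;
   credit = 1500;
   x = max(cash, credit);   /* larger of the two arguments */
   y = sqrt(1500);          /* square root */
   z = round(y, 0.01);      /* y rounded to two decimal places */
   name = upcase('hari');   /* a character function */
   put x= y= z= name=;      /* results go to the log */
run;
```

Because the step uses DATA _NULL_, no data set is created; the PUT statement only writes the values to the log.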
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
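A few of the statements listed above can be seen working together in a short sketch. The data set name squares and the variables i, sq, and total are invented for this example.

```sas
data squares;
   do i = 1 to 5;     /* iterative DO */
      sq = i * i;
      total + sq;     /* Sum statement: total is retained across passes */
      output;         /* write one observation per pass of the loop */
   end;
run;

proc print data=squares;
run;
```

The explicit OUTPUT statement inside the loop replaces the automatic output at the end of the DATA step, so the data set gets five observations with total running through 1, 5, 14, 30, and 55.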
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
- option specifies one or more SAS system options that you want to change
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
Example: 467
NON-STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
Examples:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
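As a sketch of how the informats and formats in these tables are applied, the step below reads comma, dollar, and date values and prints them back. The data set pay, its variables, and its values are invented for this example, and the colon (:) modifier is used so that each informat reads up to the next blank.

```sas
data pay;
   /* The : modifier applies the informat to a blank-delimited value */
   input name $ salary : comma7. bonus : dollar7. joined : date9.;
   format salary comma7. bonus dollar7. joined ddmmyy10.;
   datalines;
hari 25,000 $3,000 22jan1989
ravi 45,000 $2,500 05mar1990
;
run;

proc print data=pay;
run;
```

The informats strip the commas and dollar signs so that plain numbers are stored, and the FORMAT statement controls only how those numbers are displayed.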
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form of a DATA step that reads raw data stored in an external file.
data <dataset-name>;
infile 'your-input-file-path\filename.extension';
input <var1> <var2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
Data Temp;
Set weight_club;
Run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. The most common ones are shown here.
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
Data samp;
Set weight_club (Keep=IdNumber Team);
Run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
Data A1;
Set weight_club (Drop=IdNumber Name);
Run;
OBS=
Specifies the last observation that SAS processes.
data s1;
set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. In this example, SAS reads observations 2 through 4.
data s1;
set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
Data tmp;
set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a temporary variable that indicates whether the data set contributed data to the current observation (1 if it did, 0 if it did not).
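The slides give no example for IN=, so here is a minimal sketch of its most common use, a match-merge that keeps only the matching rows. The data sets club and fees, the variable Paid, and the flag names inClub and inFees are all invented for this illustration.

```sas
data club;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
1219 red
;
run;

data fees;
   input IdNumber Paid;
   datalines;
1023 50
1219 75
1300 20
;
run;

/* Both inputs are already in IdNumber order, as MERGE with BY requires */
data both;
   merge club (in=inClub) fees (in=inFees);
   by IdNumber;
   if inClub and inFees;   /* keep only IDs present in both data sets */
run;
```

Only IdNumber 1023 and 1219 should reach the data set both. The IN= variables exist only during the step and are not written to the output.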
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
set weight_club;
put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
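The code for this example is not included in the slides. A sketch that matches the description might look like the following, where the data set name person and the two data lines are assumptions.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line supplies one observation, and the single semicolon on its own line marks the end of the data.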
DELETE Statement
Stops processing the current observation, which is then not written to the output data set.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
input ename $ eid hiredate;
informat hiredate mmddyy8.;
datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data must end with a line of four semicolons.
data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
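KEEP Statement
Includes variables in output SAS data sets.
data average;
set weight_club;
keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
set weight_club;
label name="teamname";
run;
LENGTH Statement
Specifies the number of bytes for storing variables. Place the LENGTH statement before the INPUT statement so that the lengths apply when the variables are first defined; otherwise list input stores character values in the default 8 bytes and Alexander and Robinson are not given the room set aside for them.
data testlength;
length firstname lastname $ 15 name $ 25;
input firstname $ lastname $ n1 n2;
name=trim(firstname)||' '||lastname;
datalines;
Alexander Robinson 35 11
;
run;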
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
input name $ 1-18 score1 25-27 score2 30-32;
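The fragment above stops before any data lines. A complete version might look like this sketch, in which the three records are invented and are laid out so that each value falls inside the columns named in the INPUT statement.

```sas
data scores;
   /* name in columns 1-18, score1 in 25-27, score2 in 30-32 */
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph                   11   32
Mitchel                  13   29
Sue Ellen                14   27
;
run;

proc print data=scores;
run;
```

Because the columns are fixed, a name such as Sue Ellen can contain a blank without being split into two values.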
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method, the data values must be separated by at least one blank space.
Example: same as the earlier examples (for instance, input name $ no).
2) Column INPUT METHOD
In this method, each value is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces.
data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.
data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method, column pointer controls (@n and +n) give the exact location of the data values.
data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing at sign (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
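The DSD option on the INFILE statement reads delimited data. The default delimiter is a comma, and two consecutive delimiters are treated as a missing value.
data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
run;
The DELIMITER= option names a different delimiter character.
data nums;
infile datalines dsd delimiter='<delimiter>';
input X Y Z;
datalines;
1<delimiter>2<delimiter>3
4<delimiter>5<delimiter>6
7<delimiter>8<delimiter>9
;
run;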
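The MISSOVER option prevents the INPUT statement from reading the next data line when a line has too few values. The remaining variables are set to missing.
data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;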
Thanks
Feedback at info@Sitworld.in
Base SAS software provides you with essential tools for the basic data-driven tasks that you commonly perform as a programmer
Accessing Datayou can access data that is stored almost anywhere whether it is in a file on your system or data that is stored another database system
In almost any format including raw data SAS data sets and files created by other vendors software
Managing Data After you have accessed your data you can use the SAS programming language to manipulate it
Format your data create variables (columns) use operators to evaluate data values use functions to create and recode data values subset data perform conditional processingmerge a wide range of data sources create retrieve and update database information
Analyzing Data and Presenting Information Once your data is in shape you can use SAS to analyze data and produce reports Your SAS output can range from a simple listing of a data set to customized reports of complex relationships
Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables
create a variety of charts and plots
compute a variety of descriptive statistics including the mean sum variance standard deviation and more
compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics
Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets
SAS WINDOW ENVIRONMENT
Five windows in SAS
1Editor window2Output window3Log window4Result window5Explorer window
1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually
2OUT PUT window Results of program will be displayed in output window which has an extension of LIST
3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations
4Result windowIt displays result of all the programs in editor windowNo extension for result window
5Explorer windowContains Libraries and Mycomputer
SAS LANGUAGE
The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages
In SAS you use these elements within one of two groups of SAS statements
DATA stepsPROC steps
DATA STEPA DATA step consists of a group of statements in the SAS language that can
read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values
Syntax
DATA ltDATA Set Namegt
INPUT ltvar1gtltvar2gthellipltvarngt
CARDS
Data valuesRUN
EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run
SAS DATA SETSA SAS data set consists of the following-descriptor information-data values
The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic
SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program
PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files
PROCEDURE STEP BLOCK
SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun
EG
proc print data=temp
Run
proc sort data=temp out=sampby namerun
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements?
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
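A quick way to try this is the INPUT function, which applies an informat to a character string. The step below is a small sketch; the variable name amount is chosen only for illustration.

```sas
data _null_;
   amount = input('$1,000,000', comma11.);   /* COMMA11. strips the $ and the commas */
   put amount=;                              /* writes amount=1000000 to the log */
run;
```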
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
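The two function calls above can be tried inside a complete DATA step. The sketch below assumes illustrative values for cash and credit, which the original does not give.

```sas
data _null_;
   cash   = 200;                 /* illustrative values */
   credit = 350;
   larger = max(cash, credit);   /* largest of the arguments: 350 */
   root   = sqrt(1500);          /* square root of 1500 */
   put larger= root=;
run;
```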
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
DATA step statements include INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, and RETAIN.
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, the data is called standard data. Standard numeric data contains only digits, an optional sign, a decimal point, or scientific notation.
Eg: 467
NON STANDARD DATA
If the data values are not in a standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
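The numeric rows of the table can be exercised with a short step that reads values with an informat and prints them with the matching format. This is a sketch only; the data set name pay, the variable names, and the second data row are invented for the example.

```sas
data pay;
   /* the colon lets list input use the informat while still stopping at blanks */
   input name $ salary :comma6. bonus :dollar6.;
   format salary comma6. bonus dollar6.;
   datalines;
hari 25,000 $3,000
ravi 18,500 $1,250
;
run;

proc print data=pay;
run;
```

The values are stored as plain numbers (25000, 3000, and so on); the formats only control how they are displayed.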
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data temp;
   set weight_club;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form of a DATA step that reads raw data from an external file:
data <dataset name>;
   infile 'your-input-file path/filename.extension';
   input <var 1> <var 2> ...;
run;
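A self-contained way to try the INFILE statement is to write a few records to a temporary file first and then read them back. The sketch below does that; the fileref wtfile and the data set name weight_club_ext are hypothetical names, and the records repeat the weight club layout.

```sas
filename wtfile temp;               /* temporary external file */

data _null_;                        /* write three raw records to the file */
   file wtfile;
   put '1023 David Shaw      red 189 165';
   put '1049 Amelia Serrano  yellow 145 124';
   put '1219 Alan Nance      red 210 192';
run;

data weight_club_ext;               /* read the raw records back from the file */
   infile wtfile;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

proc print data=weight_club_ext;
run;
```

With a real file, the fileref in the INFILE statement would be replaced by the quoted path of the file.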
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. The options used most often are described here.
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies the last observation to process. This example reads the first three observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Combined with OBS=, this example reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
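IN= is most often seen in a MERGE, where it shows which input data set supplied each observation. The following is a small sketch with two invented data sets, team_a and team_b; the flag names inA and inB are also illustrative.

```sas
data team_a;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

data team_b;
   input IdNumber EndWeight;
   datalines;
1049 124
1219 192
;
run;

data combined;
   merge team_a (in=inA) team_b (in=inB);
   by IdNumber;     /* both inputs are already in IdNumber order */
   FromA = inA;     /* IN= variables are temporary, so copy them to keep them */
   FromB = inB;
run;
```

In the result, 1023 has FromA=1 and FromB=0, 1049 has both flags set to 1, and 1219 has FromA=0 and FromB=1.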
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow. CARDS is an older alias for DATALINES.
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
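A step of the kind just described might look like the sketch below. The data set name person and the data values are illustrative; only the variable names NAME and DEPT come from the description.

```sas
data person;
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```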
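DELETE Statement
Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step for the next iteration.
if Team="red" then delete;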
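In a complete step the DELETE statement looks like this. The sketch builds a tiny invented data set, club, so that it can run on its own.

```sas
data club;
   input Name $ Team $;
   datalines;
David red
Amelia yellow
Alan red
;
run;

data not_red;
   set club;
   if Team = 'red' then delete;   /* red-team rows are not written to not_red */
run;
```

Only the observation for Amelia remains in not_red.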
FORMAT Statement
Associates formats with variables.
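A FORMAT statement inside a DATA step stores the format with the variable, so later procedures display the value that way. The data set name sales and its values below are made up for this sketch.

```sas
data sales;
   input item $ amount;
   format amount dollar10.2;   /* stored with the variable in the data set */
   datalines;
pen 1145.32
book 250
;
run;

proc print data=sales;         /* amount is displayed with the DOLLAR10.2 format */
run;
```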
INFORMAT Statement
Associates informats with variables.
data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. Four consecutive semicolons (;;;;) mark the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL 1995; BRADY 1993
3 BERG 1990; ROA 1994; WILLIAMS 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
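LENGTH Statement
Specifies the number of bytes for storing variables. Place the LENGTH statement before the first use of a variable; otherwise character variables read with list input get the default length of 8 and longer values are truncated.
data testlength;
   length firstname lastname $ 15 name $ 25;
   input firstname $ lastname $ n1 n2;
   name = catx(' ', firstname, lastname);
   datalines;
Alexander Robinson 35 11
;
run;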
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
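To run the column input statement above, the data lines have to place each value in the stated columns. The sketch below adds three illustrative records, with names starting in column 1 and the two scores starting in columns 25 and 30.

```sas
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph                  11   32
Mitchel                 13   29
Sue Ellen               14   27
;
run;

proc print data=scores;
run;
```

Because the name is read from columns 1-18, a value such as Sue Ellen keeps its embedded blank.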
INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method the data values must be separated by at least one blank.
EG: the DATA steps shown earlier that read name and no from blank-separated data lines use list input.
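A minimal list input step is sketched here. The last record is an added illustration of one rule of list input: a missing value must be marked with a period so that the remaining values stay aligned with their variables.

```sas
data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran .
;
run;

proc print data=temp;
run;
```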
2) COLUMN INPUT METHOD
In this method each variable is read from fixed column positions. Character values can be longer than 8 characters and can contain embedded blanks.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat, whose width and period specify how many columns to read for that variable.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this method column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns to the right of its current position.
data two;
   input @1 id 3. @10 name $4. +5 age 2.;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
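The DSD option in the INFILE statement reads delimited data. The default delimiter is a comma, and two consecutive delimiters are treated as a missing value. The DELIMITER= option names a different delimiter.
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
data nums;
   /* any delimiter character can be named here; '|' is an illustrative choice */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;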
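The MISSOVER option prevents the INPUT statement from moving to the next data line when a line has fewer values than expected; the remaining variables are set to missing.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;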
Thanks
Feedback at infoSitworldin
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2) Column INPUT METHOD
In this method, each value is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces.
data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat: a width and a period that specify how many columns to read for that variable.
data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this method, column pointer controls (@n and +n) give the exact location of the data values.
data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
infile datalines dsd delimiter=;
input X Y Z;
datalines;
123456789
;
data weather;
infile datalines missover;
input temp1-temp5;
datalines;
979 981 983
986 992 991 985 975
962 973 983 976 965
;
Thanks
Feedback at infoSitworldin
Managing Data
After you have accessed your data, you can use the SAS programming language to manipulate it:
- format your data
- create variables (columns)
- use operators to evaluate data values
- use functions to create and recode data values
- subset data
- perform conditional processing
- merge a wide range of data sources
- create, retrieve, and update database information
Analyzing Data and Presenting Information
Once your data is in shape, you can use SAS to analyze data and produce reports. Your SAS output can range from a simple listing of a data set to customized reports of complex relationships.
Analysis
Base SAS provides powerful data analysis tools. For example, you can
- produce tables, frequency counts, and cross-tabulation tables
- create a variety of charts and plots
- compute a variety of descriptive statistics, including the mean, sum, variance, standard deviation, and more
- compute correlations and other measures of association, as well as multi-way cross-tabulations and inferential statistics
Presentation
For reporting and displaying analytical results, SAS gives you an almost limitless number of visually appealing output formats, such as
- an array of markup languages, including HTML4 and XML
- output that is formatted for a high-resolution printer, such as PostScript, PDF, and PCL files
- RTF
- color graphs that you can make interactive using ActiveX controls or Java applets
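To make the analysis tools listed above concrete, here is a small sketch that produces a frequency count and a few descriptive statistics. The data is a shortened, simplified version of the weight club example (first names only, three rows), which is an assumption made to keep the block self-contained.

```sas
/* Sketch only: simplified weight club data for illustration */
data weight_club;
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1023 David red 189 165
1049 Amelia yellow 145 124
1219 Alan red 210 192
;
run;

/* Frequency count of participants per team */
proc freq data=weight_club;
   tables Team;
run;

/* Mean, sum, variance, and standard deviation of the two weights */
proc means data=weight_club mean sum var std;
   var StartWeight EndWeight;
run;
```

PROC FREQ covers the frequency tables mentioned above, and PROC MEANS covers the descriptive statistics; other procedures would be needed for charts or correlations.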
SAS WINDOW ENVIRONMENT
Five windows in SAS
1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window
1. Editor window
The Editor window holds programs, which are saved with the extension .sas. You can type any number of programs in the Editor window and execute them all at once or individually.
2. Output window
The results of a program are displayed in the Output window; saved output has the extension .lst.
3. Log window
Any errors or warnings in the program are displayed in the Log window. It also displays the SAS license and version information and the number of variables and observations.
4. Results window
It lists the results of all the programs run from the Editor window. The Results window has no file extension.
5. Explorer window
Contains Libraries and My Computer.
SAS LANGUAGE
The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.
In SAS, you use these elements within one of two groups of SAS statements:
- DATA steps
- PROC steps
DATA STEP
A DATA step consists of a group of statements in the SAS language that can
- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values
Syntax
DATA <data set name>;
INPUT <var1> <var2> ... <varn>;
CARDS;
data values
;
RUN;
EG
data temp;
input name $ no;
datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
SAS DATA SETS
A SAS data set consists of the following:
- descriptor information
- data values
The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.
SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
PROCEDURE STEP BLOCK
Syntax
PROC <procedure name>;
statement 1;
statement 2;
...
statement n;
RUN;
EG
proc print data=temp;
run;
proc sort data=temp out=samp;
by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
- By default, the INPUT statement reads every variable as numeric. To read character data, put the $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set depends on system configuration and hard disk space.
- In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
Different Data bases
Best db storage
Text, Excel, Access, DB2, Oracle, Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer
|
Right click
|
New
2. Through programming
In the Editor window:
LIBNAME <name of library> '<path>';
Example
LIBNAME Hari "D:\Ganesh";
To clear (deassign) a library reference
LIBNAME Guru clear;
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- INPUT BUFFER
- PROGRAM DATA VECTOR
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during the current iteration. It has 2 values:
i) _ERROR_=0 means that no error has occurred
ii) _ERROR_=1 means that an error (for example, invalid data) has occurred
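The two automatic variables can be inspected by writing them to the log. The following is a minimal sketch; the data set CHECK and its data lines are hypothetical, and the second line deliberately contains a non-numeric age.

```sas
/* Sketch only: CHECK and its values are hypothetical */
data check;
   input name $ age;
   /* write the automatic variables and the values just read to the log */
   put _N_= _ERROR_= name= age=;
   datalines;
hari 25
ravi xx
kiran 30
;
run;
```

On the second iteration the invalid value for AGE should set _ERROR_ to 1 and leave AGE missing; _ERROR_ is reset to 0 when the next iteration starts. Neither automatic variable is written to the output data set.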
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example above), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
input TeamName $ ParticipantName $ Event1 Event2 Event3;
TeamTotal + (Event1 + Event2 + Event3);
datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the input buffer and the program data vector after DATA step compilation.
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTS
Output SAS Data Set TOTAL_POINTS: 1st observation
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase, SAS
- checks code for errors
- translates code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector
- assigns required attributes to variables
- creates the descriptor portion of the new data set
Execution
During the execution phase, SAS
- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
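The execution loop described above can be watched in the log. The sketch below is a shortened variant of the TOTAL_POINTS step (three data lines, and a PUT statement added for illustration); it is not the original program.

```sas
/* Sketch only: shortened variant of the TOTAL_POINTS example */
data total_points;
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   /* Sum statement: TeamTotal starts at 0 and is retained across iterations */
   TeamTotal + (Event1 + Event2 + Event3);
   /* one log line per iteration */
   put _N_= ParticipantName= TeamTotal=;
   datalines;
Knights Sue 6 8 8
Knights John 7 7 7
Knights Lisa 8 9 9
;
run;
```

With these three records the log should show TeamTotal growing from 22 to 43 to 69 while _N_ goes from 1 to 3, which is the retain-and-increment behaviour the text describes.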
Word Scanner
The input stack initially holds the submitted program:
Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;
The word scanner then takes the first tokens (Data temp) from the input stack and passes them to the compiler; the rest of the program stays in the input stack.
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format
format names the format
w specifies the format width, which for most formats is the number of columns in the output data
d specifies an optional decimal scaling factor in the numeric formats
data temp;
amount=1145.32;
put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat
informat names the informat
w specifies the informat width
d specifies an optional decimal scaling factor in the numeric informats
data tmp1;
informat edate ddmmyy8.;
input ename $ edate;
format edate date9.;
cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
INPUT (List), PUT, DATALINES, DO, Iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
Eg 467
NON STANDARD DATA
If data values are not in the standard format, the data is called non-standard data.
Eg
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
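The informat and format pairs in the tables above can be combined in one step. The sketch below reads non-standard values with informats and writes them back with formats; the data set PAY, its variables, and its values are assumptions made for illustration.

```sas
/* Sketch only: PAY and its values are hypothetical */
data pay;
   /* the colon modifier applies each informat up to the next blank */
   input name $ salary :comma10. bonus :dollar10. joined :ddmmyy8.;
   /* formats control how the stored numbers are displayed */
   format salary comma10. bonus dollar10. joined date9.;
   datalines;
hari 25,000 $3,000 12-07-78
ravi 45,000 $1,500 21-09-05
;
run;

proc print data=pay;
run;
```

The informats strip the commas and dollar signs on the way in, so SALARY and BONUS are stored as plain numbers and JOINED as a SAS date value.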
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data temp;
set weight_club;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
infile '<path>\<filename>.<extension>';
input <var 1> <var 2> ...;
run;
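Because the INFILE template above needs a real file, here is a self-contained sketch that first writes a small temporary file and then reads it back. The fileref RAWFILE and the simplified data lines are assumptions; in practice the INFILE statement would point at an existing file.

```sas
/* Sketch only: RAWFILE is a hypothetical fileref for a temporary file */
filename rawfile temp;

/* Write two raw data records to the temporary external file */
data _null_;
   file rawfile;
   put '1023 David red 189 165';
   put '1049 Amelia yellow 145 124';
run;

/* Read the external file back with INFILE and list input */
data weight_club;
   infile rawfile;
   input IdNumber Name $ Team $ StartWeight EndWeight;
run;

proc print data=weight_club;
run;
```

The INFILE statement accepts either a fileref, as here, or a quoted physical path; the INPUT statement is the same in both cases.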
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations.
data s1;
set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first.
data s1;
set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
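The slides give no example for IN=, so the following is a sketch of one common use: keeping only the observations that appear in both data sets of a MERGE. The data sets MEMBERS and WEIGHTS, the flag names inMembers and inWeights, and all values are hypothetical.

```sas
/* Sketch only: both input data sets are hypothetical and already sorted by IdNumber */
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data weights;
   input IdNumber EndWeight;
   datalines;
1023 165
1219 192
1300 150
;
run;

data both;
   /* inMembers and inWeights are 1 when that data set contributed to the observation */
   merge members (in=inMembers) weights (in=inWeights);
   by IdNumber;
   if inMembers and inWeights;   /* keep only IDs found in both data sets */
run;
```

With this data only IdNumber 1023 and 1219 should remain in BOTH. The IN= variables exist only in the PDV and are not written to the output data set.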
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965
Thanks
Feedback at infoSitworldin
- PowerPoint Presentation
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
- Slide 48
- Slide 49
- Slide 50
- Slide 51
- Slide 52
- Slide 53
- Slide 54
- Slide 55
- Slide 56
- Slide 57
- Slide 58
- Slide 59
- Slide 60
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Slide 66
- Slide 67
- Slide 68
- Slide 69
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Slide 76
- Slide 77
- Slide 78
- Slide 79
- Slide 80
- Slide 81
- Slide 82
- Slide 83
- Slide 84
- Slide 85
- Slide 86
- Slide 87
- Slide 88
- Slide 89
- Slide 90
- Slide 91
- Slide 92
- Slide 93
- Slide 94
-
Analyzing Data and Presenting Information Once your data is in shape you can use SAS to analyze data and produce reports Your SAS output can range from a simple listing of a data set to customized reports of complex relationships
Analysis Base SAS provides powerful data analysis tools For example you can produce tables frequency counts and cross-tabulation tables
create a variety of charts and plots
compute a variety of descriptive statistics including the mean sum variance standard deviation and more
compute correlations and other measures of association as well as multi-way cross-tabulations and inferential statistics
Presentation For reporting and displaying analytical results SAS gives you an almost limitless number of visually appealing output formats such as an array of markup languages including HTML4 and XML output that is formatted for a high-resolution printer such PostScript PDF and PCL files RTF color graphs that you can make interactive using ActiveX controls or Java applets
SAS WINDOW ENVIRONMENT
Five windows in SAS
1Editor window2Output window3Log window4Result window5Explorer window
1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually
2OUT PUT window Results of program will be displayed in output window which has an extension of LIST
3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations
4Result windowIt displays result of all the programs in editor windowNo extension for result window
5Explorer windowContains Libraries and Mycomputer
SAS LANGUAGE
The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages
In SAS you use these elements within one of two groups of SAS statements
DATA stepsPROC steps
DATA STEPA DATA step consists of a group of statements in the SAS language that can
read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values
Syntax
DATA ltDATA Set Namegt
INPUT ltvar1gtltvar2gthellipltvarngt
CARDS
Data valuesRUN
EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run
SAS DATA SETSA SAS data set consists of the following-descriptor information-data values
The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic
SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program
PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files
PROCEDURE STEP BLOCK
SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun
EG
proc print data=temp
Run
proc sort data=temp out=sampby namerun
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
[Figure: Input buffer and program data vector after DATA step compilation, with the position of the pointer in the input buffer before SAS reads data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.
[Figure: Position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]
[Figure: Program data vector with computed value of the Sum statement]
Writing an Observation to the SAS Data Set
[Figure: The first observation in output SAS data set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase SAS
Checks code for syntax errors
Translates code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the program data vector
Assigns required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
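The iteration behavior described above can be traced with a small sketch. The data set name running and the point values are hypothetical. The PUT statement writes the PDV contents to the log once per iteration.

```sas
/* Hypothetical example: trace the PDV across iterations */
data running;
    input points;
    total + points;            /* Sum statement: total is retained */
    put _N_= points= total=;   /* one log line per iteration */
    datalines;
6
8
8
;
run;
```

Because total is created by a Sum statement, it should keep its value between iterations and accumulate, while points is reset to missing before each new record is read.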
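[Figure: The submitted program waits in the input stack. The word scanner takes tokens from the input stack and passes them to the compiler. In the second panel, the tokens DATA TEMP have already moved from the input stack through the word scanner to the compiler.]
The program in the input stack:
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;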
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements?
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1<...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   /* declare the informat before INPUT so that list input uses it */
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
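The two function calls above are fragments. A minimal sketch that runs them inside a DATA step might look like this, assuming hypothetical values for cash and credit and hypothetical result names larger and root:

```sas
/* Hypothetical values; _NULL_ means no data set is created */
data _null_;
    cash = 120;
    credit = 450;
    larger = max(cash, credit);   /* largest of the arguments */
    root = sqrt(1500);            /* square root of the argument */
    put larger= root=;            /* write both results to the log */
run;
```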
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of SAS statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, then the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in the standard format, then the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
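As a sketch of how the numeric informats in the table can be used, the following DATA step reads values that contain commas and dollar signs. The data set name pay, the variable names, and the values are hypothetical.

```sas
/* Hypothetical data: read non-standard numeric values */
data pay;
    /* The colon modifier lets list input apply an informat to each value */
    input ename $ salary :comma6. bonus :dollar6.;
    /* Formats control how the stored numbers are displayed */
    format salary comma6. bonus dollar6.;
    datalines;
hari 25,000 $3,000
ravi 18,500 $1,250
;
run;

proc print data=pay;
run;
```

The informats strip the commas and dollar signs on input, so the variables hold plain numbers. The FORMAT statement puts the punctuation back only for display.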
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
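The attributes listed above can all be set in one DATA step. The following is a minimal sketch with a hypothetical data set staff and hypothetical variables. PROC CONTENTS then lists the attributes that SAS stored in the descriptor portion.

```sas
/* Hypothetical data set showing each variable attribute */
data staff;
    length ename $ 12;                  /* type (character) and length */
    informat hired ddmmyy8.;            /* informat used when reading */
    format hired date9.;                /* format used when writing */
    label ename = 'Employee name'
          hired = 'Hire date';          /* descriptive labels */
    input ename $ hired;
    datalines;
hari 21-10-04
;
run;

/* List name, type, length, format, informat, and label */
proc contents data=staff;
run;
```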
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file-path\filename.extension';
   input <var 1> <var 2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
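OBS=
Specifies the last observation that SAS processes. This example reads the first three observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Because OBS= gives the number of the last observation to process, this example reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;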
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
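A minimal sketch of the IN= option with a MERGE statement. The data sets members and payments, their variables, and the flag names inMembers and inPayments are all hypothetical. Both inputs are already in IdNumber order, which BY-group merging requires.

```sas
/* Hypothetical input data sets, already sorted by IdNumber */
data members;
    input IdNumber Name $;
    datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
    input IdNumber Amount;
    datalines;
1023 50
1219 75
;
run;

/* IN= creates temporary 0/1 flags that are not written to the output */
data paid;
    merge members (in=inMembers) payments (in=inPayments);
    by IdNumber;
    if inMembers and inPayments;   /* keep ids found in both inputs */
run;
```

With this data, only the IdNumber values that appear in both data sets should remain in paid.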
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
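The code for the DATALINES description above did not survive in the slides. A sketch consistent with that description could look like the following, where the data set name person and the data values are assumptions:

```sas
/* Assumed reconstruction: two character variables read from data lines */
data person;
    input name $ dept $;
    datalines;
John Sales
Mary Acctng
;
run;
```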
DELETE Statement
Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
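A minimal sketch of the FORMAT statement, using a hypothetical data set sales. The format is stored with the variable, so later procedures such as PROC PRINT display the value with it.

```sas
/* Hypothetical data: attach a display format to a numeric variable */
data sales;
    input item $ amount;
    format amount dollar10.2;   /* stored with the variable */
    datalines;
pen 1145.32
;
run;

proc print data=sales;
run;
```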
INFORMAT Statement
Associates informats with variables.
data two1;
   /* declare the informat before INPUT so that list input uses it */
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. Four consecutive semicolons mark the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
   /* hypothetical record, aligned to the columns above */
   datalines;
Joseph                   11   32
;
run;
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values must be separated by at least one space.
Eg: as in the earlier DATA steps that read values with INPUT name $ no
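A minimal sketch of list input, using a hypothetical data set students. Values are read in order and only need a blank between them. They do not have to line up in columns.

```sas
/* Hypothetical data: values separated by one or more blanks */
data students;
    input name $ age marks;
    datalines;
hari 21 78
ravi   22    85
;
run;

proc print data=students;
run;
```

With plain list input, character values are limited to 8 characters by default and cannot contain embedded blanks.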
2) Column INPUT METHOD
In this method each variable is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat, a width followed by a period, that specifies how many columns to read for that variable in every data line.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method, column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns forward.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing at sign (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
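/* DSD treats commas as delimiters */
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
/* DELIMITER= names another delimiter; '|' is an assumed character here */
data nums;
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;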
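/* MISSOVER sets the remaining variables to missing when a data line is short */
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;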
Thanks
Feedback at infoSitworldin
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies the last observation that SAS processes.
data s1;
  set weight_club(obs=3);
run;
FIRSTOBS=
Specifies the first observation that SAS processes. Combined with OBS=4, the step below reads observations 2 through 4.
data s1;
  set weight_club(obs=4 firstobs=2);
run;
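OBS= is an observation number, not a count, which is easy to overlook when it is combined with FIRSTOBS=. The sketch below uses a made-up data set named ten (an assumption for illustration) whose variable n simply records the observation number.

```sas
/* Illustrative data set with 10 observations, n = 1 to 10 */
data ten;
  do n = 1 to 10;
    output;
  end;
run;

/* OBS=3 stops after observation 3: n = 1, 2, 3 */
data first3;
  set ten(obs=3);
run;

/* FIRSTOBS=2 with OBS=4 reads observations 2 through 4: n = 2, 3, 4 */
data obs2to4;
  set ten(firstobs=2 obs=4);
run;

proc print data=obs2to4;
run;
```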
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
  set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
  set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
  set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
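A common use of IN= is to keep only the observations that appear in both data sets of a merge. The sketch below assumes two small made-up data sets, members and payments, and the flag names inMem and inPay; none of these names come from the slides.

```sas
/* Illustrative data sets, both already in order by id */
data members;
  input id name $;
  datalines;
1 hari
2 ravi
3 kiran
;
run;

data payments;
  input id amount;
  datalines;
1 500
3 750
4 300
;
run;

/* inMem and inPay are temporary 0/1 flags; they are not written to the output */
data paid_members;
  merge members(in=inMem) payments(in=inPay);
  by id;
  if inMem and inPay;   /* keep only ids found in both data sets: 1 and 3 */
run;
```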
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
  set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
  set weight_club;
  put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (newer name for CARDS)
Indicates that data lines follow.
Using the DATALINES Statement
With the DATALINES statement, SAS reads each data line and assigns values to the variables in the INPUT statement, for example to two character variables NAME and DEPT, for each observation in the DATA step.
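A minimal sketch of that pattern follows. The data set name staff and the three data lines are made up for illustration; only the variable names NAME and DEPT come from the text.

```sas
data staff;
  /* Two character variables, read with list input */
  input name $ dept $;
  datalines;
hari sales
ravi finance
kiran audit
;
run;

proc print data=staff;
run;
```

The single semicolon on its own line marks the end of the data lines.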
DELETE Statement
Stops processing the current observation; the observation is not written to the output data set.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
  informat hiredate mmddyy8.;   /* declared before INPUT so that it is applied when hiredate is read */
  input ename $ eid hiredate;
  datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement (or CARDS4)
Indicates that data lines that contain semicolons follow. The data lines end with a line of four semicolons.
data biblio;
  input number citation $50.;
  datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
  set weight_club;
  keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
  set weight_club;
  label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
  input firstname $ lastname $ n1 n2;
  length name $25;
  datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
  input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List input method
2) Column input method
3) Named input method
4) Formatted input method
5) Absolute input method
1) List input method
In this method the data values must be separated by at least one blank.
Example: as in the earlier examples that read values separated by blanks.
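A short sketch of list input follows. The data set name people and its values are assumptions for illustration. The second data line uses extra blanks between values, which list input skips.

```sas
data people;
  /* Variables are listed in the order the values appear; $ marks a character variable */
  input id name $ age;
  datalines;
101 hari 25
102   ravi    31
103 kiran 28
;
run;

proc print data=people;
run;
```

With plain list input a character value cannot contain a blank and is stored with a default length of 8.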
2) Column input method
In this method each value is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks.
data temp;
  input id 1-3 name $ 7-18 age 21-22;
  cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) Named input method
In this method each data value is preceded by its variable name and an equal sign (variable=value).
data samp;
  input id= name= $ age=;
  cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) Formatted input method
In this method each variable is followed by an informat whose width (a number followed by a period) gives the number of columns to read, so every data value must occupy the same columns on each line.
data one;
  input id 3. name $11. age 3.;
  datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) Absolute input method
In this method column pointer controls (@n moves to column n, +n skips n columns) give the exact location of the data values.
data two;
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the same record across iterations of the DATA step, so that one data line can supply several observations.
data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
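The DSD option in the INFILE statement reads delimited values; the default delimiter is a comma.
data num;
  infile datalines dsd;
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
  infile datalines dsd delimiter='|';   /* '|' is an illustrative delimiter */
  input X Y Z;
  datalines;
1|2|3
4|5|6
7|8|9
;
The MISSOVER option sets the remaining variables to missing when a record has fewer values than the INPUT statement expects, instead of reading the next record.
data weather;
  infile datalines missover;
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;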
Thanks
Feedback at infoSitworldin
Presentation
For reporting and displaying analytical results, SAS gives you an almost limitless number of visually appealing output formats, such as:
- an array of markup languages, including HTML4 and XML
- output that is formatted for a high-resolution printer, such as PostScript, PDF, and PCL files
- RTF
- color graphs that you can make interactive using ActiveX controls or Java applets
SAS WINDOW ENVIRONMENT
Five windows in SAS
1. Editor window
2. Output window
3. Log window
4. Results window
5. Explorer window
1. Editor window
The Editor window is where programs are written; they are saved with the extension .sas. You can type any number of programs in the Editor window and run them all at once or individually.
2. Output window
The results of a program are displayed in the Output window; saved output has the extension .lst.
3. Log window
Any errors or warnings in the program are reported in the Log window. It also displays SAS license and version information and, for each data set, the number of variables and observations.
4. Results window
Lists the results of all the programs run from the Editor window. It has no file extension of its own.
5. Explorer window
Contains Libraries and My Computer.
SAS LANGUAGE
The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.
In SAS you use these elements within one of two groups of SAS statements:
- DATA steps
- PROC steps
DATA STEP
A DATA step consists of a group of statements in the SAS language that can:
- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values
Syntax
DATA <data set name>;
  INPUT <var1> <var2> ... <varn>;
  CARDS;
data values
;
RUN;
EG
data temp;
  input name $ no;
  datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
SAS DATA SETS
A SAS data set consists of the following:
- descriptor information
- data values
The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.
SAS VARIABLES AND OBSERVATIONS
The example SAS data set describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
PROCEDURE STEP BLOCK
Syntax
PROC <procedure name>;
  statement 1;
  statement 2;
  statement n;
RUN;
EG
proc print data=temp;
run;
proc sort data=temp out=samp;
  by name;
run;
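To show how a DATA step and several PROC steps fit together, here is a small self-contained sketch. The data values are made up, and PROC MEANS is added as one more example of a procedure; it is not part of the slides.

```sas
/* Illustrative data; the names and values are made up for this sketch */
data temp;
  input name $ no;
  datalines;
ravi 104
hari 102
kiran 109
;
run;

/* Sort by name into a new data set, leaving temp unchanged */
proc sort data=temp out=samp;
  by name;
run;

/* Descriptive statistics for the numeric variable */
proc means data=samp n mean min max;
  var no;
run;
```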
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
- By default, SAS reads every value as numeric. To read character data, put the $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set depends on the system configuration and the available hard disk space.
- In output, a missing numeric value is shown as a period (.) and a missing character value as a blank.
Different Databases
Best db storage
Text, Excel, Access, DB2, Oracle, Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer > right-click > New
2. Through programming
In the Editor window:
LIBNAME <name of library> "<path>";
Example
LIBNAME Hari "D:\Ganesh";
To delete a library
LIBNAME Guru CLEAR;
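The sketch below walks through the life of a library: assign it, store a data set in it with a two-level name, and clear it. The libref mylib and the data set scores are hypothetical names. To keep the example runnable without a real folder, the libref points at the WORK directory, which is an assumption made only for this illustration.

```sas
/* Point the libref at the WORK folder so the sketch runs without a real path */
libname mylib "%sysfunc(pathname(work))";

/* A two-level name (libref.dataset) stores the data set in that library */
data mylib.scores;
  input name $ score;
  datalines;
hari 82
ravi 91
;
run;

proc print data=mylib.scores;
run;

/* Disassociate the libref; the stored data set itself is not deleted */
libname mylib clear;
```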
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
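The rules above can be seen in a short sketch. The data set name demo is an assumption for illustration.

```sas
/* Several statements on one line, in mixed case */
DATA demo; x = 1; Y = 2; run;

/* One statement continued over several lines, starting in any column */
      proc print
   data=demo
          noobs;
run;
```

Both steps run, although in practice one statement per line with consistent indentation is easier to read.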
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- input buffer
- program data vector
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.
The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during the current iteration. It has 2 values:
i) _ERROR_=0 means that no error was encountered
ii) _ERROR_=1 means that an error (such as invalid data) was encountered
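One way to watch the two automatic variables is to write them to the log on every iteration. In the sketch below the data set name check and the data lines are made up; the second line deliberately contains a non-numeric value for age.

```sas
data check;
  input name $ age;
  /* Write the automatic variables to the log for every iteration */
  put _n_= _error_= name= age=;
  datalines;
hari 25
ravi abc
kiran 28
;
run;
```

The log shows _N_ counting 1, 2, 3. _ERROR_ is 1 only on the second iteration, where age is set to missing because abc is not a valid number, and it is back to 0 on the third.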
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example that follows), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
  input TeamName $ ParticipantName $ Event1 Event2 Event3;
  TeamTotal + (Event1 + Event2 + Event3);
  datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the input buffer and the program data vector after DATA step compilation.
[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
[Figure: Program Data Vector with Computed Value of the Sum Statement]
Writing an Observation to the SAS Data Set
[Figure: The First Observation in Data Set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.
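The contents of the PDV can be inspected directly by adding PUT _ALL_ to a DATA step. The sketch below reuses the sum statement from the example with only the first three data lines; the data set name pdv_trace is an assumption chosen so that TOTAL_POINTS is not overwritten.

```sas
data pdv_trace (drop=TeamName);
  input TeamName $ ParticipantName $ Event1 Event2 Event3;
  /* Sum statement: TeamTotal starts at 0 and is retained across iterations */
  TeamTotal + (Event1 + Event2 + Event3);
  /* Dump the whole PDV, including _N_ and _ERROR_, to the log */
  put _all_;
  datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

In the log, TeamTotal is 22, then 46, then 67, while _N_ goes from 1 to 3. TeamName still appears in the PDV even though DROP= keeps it out of the output data set.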
Compilation
During the compilation phase SAS:
- Checks code for errors
- Translates code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set
Execution
During the execution phase SAS:
- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
Input Stack, Word Scanner, and Compiler
The submitted program waits in the input stack:
data temp;
  input name $ no;
  cards;
Hari 101
;
%let list=name;
proc print data=temp;
  var &list;
run;
[Figure: the word scanner takes tokens such as "Data temp" from the top of the input stack and passes them to the compiler]
The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
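As a rough illustration of the four token types, the comments in the sketch below label the tokens in two assignment statements. The variable names title_text and no are made up, and the comments are a reading aid, not output that SAS produces.

```sas
data _null_;
  /* name: title_text   special: =   literal: 'Report for May'   special: ; */
  title_text = 'Report for May';
  /* name: no   special: =   number: 101   special: ; */
  no = 101;
  put title_text= no=;
run;
```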
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
  input TName $ PName $ Event1 Event2;
  datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format
format names the format
w specifies the format width, which for most formats is the number of columns in the output data
d specifies an optional decimal scaling factor in the numeric formats
data temp;
  amount=1145.32;
  put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat
informat names the informat
w specifies the informat width
d specifies an optional decimal scaling factor in the numeric informats
data tmp1;
  informat edate ddmmyy8.;   /* declared before INPUT so that it is applied when edate is read */
  input ename $ edate;
  format edate date9.;
  cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1 <, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
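Function calls are normally used inside a DATA step, on the right side of an assignment. The sketch below assumes made-up values for cash and credit and writes the results to the log.

```sas
data _null_;
  cash = 100;
  credit = 250;
  larger = max(cash, credit);   /* largest of the arguments: 250 */
  root   = sqrt(1600);          /* square root: 40 */
  put larger= root=;
run;
```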
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of statements: INPUT (list), PUT, DATALINES, iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
  input name $ Bdate ddmmyy8.;
  format Bdate ddmmyy8.;
  cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-2005  DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
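The sketch below reads non-standard values with informats and writes them back with formats. The data set name pay, the variable names, and the data lines are assumptions for illustration. The colon modifier is an addition to the slides' examples; it lets list input use an informat while still stopping at the next blank.

```sas
data pay;
  /* The colon modifier reads up to the next blank using the informat */
  input name $ salary :comma7. bonus :dollar7. hired :date9.;
  /* Formats control how the stored numbers are written */
  format salary comma7. bonus dollar7. hired ddmmyy10.;
  datalines;
hari 25,000 $3,000 22jan1989
ravi 31,500 $1,250 05mar1992
;
run;

proc print data=pay;
run;
```

The values are stored as plain numbers (dates as SAS date values); the commas, dollar signs, and date layout in the printed output come from the formats.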
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
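Each of these attributes can be set explicitly in a DATA step and then inspected with PROC CONTENTS. The sketch below uses a hypothetical data set named members with made-up values.

```sas
data members;
  length name $ 15;                     /* type (character) and length */
  informat joined ddmmyy8.;             /* informat used when reading */
  format joined date9.;                 /* format used when writing */
  label name = 'Member name'
        joined = 'Date joined';         /* descriptive labels */
  input name $ joined;
  datalines;
hari 21-10-84
ravi 22-11-86
;
run;

/* PROC CONTENTS lists each variable's name, type, length, format, informat, and label */
proc contents data=members;
run;
```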
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
  datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;
SAS WINDOW ENVIRONMENT
Five windows in SAS
1Editor window2Output window3Log window4Result window5Explorer window
1Editor window Editor window contains the list of programs which hasan extension of SASWe can type any no of programs in editor window We can execute all programs at a time or individually
2OUT PUT window Results of program will be displayed in output window which has an extension of LIST
3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations
4Result windowIt displays result of all the programs in editor windowNo extension for result window
5Explorer windowContains Libraries and Mycomputer
SAS LANGUAGE
The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages
In SAS you use these elements within one of two groups of SAS statements
DATA stepsPROC steps
DATA STEPA DATA step consists of a group of statements in the SAS language that can
read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values
Syntax
DATA ltDATA Set Namegt
INPUT ltvar1gtltvar2gthellipltvarngt
CARDS
Data valuesRUN
EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run
SAS DATA SETSA SAS data set consists of the following-descriptor information-data values
The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic
SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program
PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files
PROCEDURE STEP BLOCK
SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun
EG
proc print data=temp
Run
proc sort data=temp out=sampby namerun
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- input buffer
- program data vector
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
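The PDV contains two automatic variables. Both are available while the DATA step runs, but they are not written to the output data set.
1) _N_ counts the number of times the DATA step has begun to iterate. When each iteration reads one record, it matches the number of the observation being processed.
2) _ERROR_ signals whether an error occurred while the current iteration was executing:
i) _ERROR_=0 means that no error has been encountered
ii) _ERROR_=1 means that an error (for example, invalid input data or an illegal calculation) has been encountered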
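A small sketch can make the two automatic variables visible by writing them to the log. The data values are made up, and the second record deliberately contains a non-numeric score:

```sas
data _null_;
   input name $ score;
   /* Write the automatic variables and the values just read to the log */
   put _N_= _ERROR_= name= score=;
   datalines;
Sue 6
Jane abc
John 7
;
run;
```

The first and third records should show _ERROR_=0. For the second record, SAS cannot convert abc to a number, so score is set to missing, _ERROR_ becomes 1, and a note about invalid data appears in the log. _N_ runs from 1 to 3.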
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example that follows), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the input buffer and the program data vector after DATA step compilation.
[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
[Figure: Program Data Vector with Computed Value of the Sum Statement]
Writing an Observation to the SAS Data Set
[Figure: The First Observation in Data Set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
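To see why the retained value of the Sum statement matters, here is a minimal sketch that contrasts it with an ordinary assignment. The data set name running and the variables Total and Plain are hypothetical:

```sas
data running;
   input Score;
   /* Sum statement: Total starts at 0 and is retained across iterations */
   Total + Score;
   /* Ordinary assignment: Plain is reset to missing at the start of
      each iteration, and missing + Score is still missing */
   Plain = Plain + Score;
   datalines;
6
8
8
;
run;

proc print data=running;
run;
```

Total should accumulate to 6, 14, and 22, while Plain stays missing in every observation and the log reports that missing values were generated.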
Compilation
During the compilation phase, SAS:
- Checks the code for errors
- Translates the code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector (PDV)
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set
Execution
During the execution phase, SAS:
- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
[Figure: Input stack, word scanner, and compiler. At first the whole program sits in the input stack:]
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
[Figure: The word scanner has taken the first statement, DATA temp, from the input stack and passed it to the compiler; the remaining statements are still in the input stack.]
The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =
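As a rough illustration of how statements break into tokens, the comments in this sketch count the tokens in each statement. The data set temp and its variables are sample names chosen for the example:

```sas
data temp;
   input x1 x2 z;
   datalines;
1 2 3
;
run;

title 'Report for May';   /* 3 tokens: title, 'Report for May', ;     */
proc print data=temp;     /* 6 tokens: proc, print, data, =, temp, ;  */
   var x1-x2 z;           /* 6 tokens: var, x1, -, x2, z, ;           */
run;
```

The quoted string counts as a single literal token, and each special character, such as = or ;, is a token of its own.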
What Are the SAS Language Elements?
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
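The same stored value can be written in several ways by changing only the format. The following sketch is illustrative; the variable d and the date value are assumptions added for the example:

```sas
data _null_;
   amount = 1145.32;
   put amount dollar10.2;   /* $1,145.32 */
   put amount comma10.2;    /* 1,145.32  */
   put amount 8.1;          /* 1145.3    */

   d = '21OCT1984'd;        /* a SAS date constant */
   put d date9.;            /* 21OCT1984  */
   put d ddmmyy10.;         /* 21/10/1984 */
run;
```

A format changes only how a value is displayed; the value stored in the variable stays the same.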
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
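The $1,000,000 case described above can be tried directly. This is a minimal sketch; the data set name sales and the variable name amount are hypothetical:

```sas
data sales;
   /* The colon modifier reads up to the next blank with the informat */
   input amount : comma11.;
   datalines;
$1,000,000
;
run;

proc print data=sales;
run;
```

The dollar sign and commas are stripped while reading, so amount should hold the plain number 1000000.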
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<...,argument-n>)
x=max(cash,credit);
x=sqrt(1500);
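Placed inside a DATA step, the two calls above might be tried like this. The values of cash and credit and the UPCASE call are assumptions added for illustration:

```sas
data _null_;
   cash = 250;
   credit = 400;
   larger = max(cash, credit);   /* the larger argument, 400 */
   root = sqrt(1500);            /* about 38.73 */
   name = upcase('hari');        /* a character function: HARI */
   put larger= root= name=;
run;
```

Functions can take constants, variables, or expressions as arguments, and the returned value can be assigned or used inside a larger expression.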
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of SAS statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, one that SAS can read without an informat, the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in a standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
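A short sketch shows a few of the pairs from the tables working together. The data set name demo and the variable names are hypothetical, and the values are taken from the tables:

```sas
data demo;
   /* Colon modifier: read each value up to the next blank
      using the named informat */
   input bdate  : ddmmyy8.
         hired  : date9.
         salary : comma6.
         bonus  : dollar6.;
   /* Formats control how the stored numbers are displayed */
   format bdate ddmmyy10. hired date9. salary comma6. bonus dollar6.;
   datalines;
12-07-78 22jan1989 25,000 $3,000
;
run;

proc print data=demo;
run;
```

Dates are stored as numbers (days since 1 January 1960), which is why a date needs a format to print readably. How a two-digit year such as 78 is expanded depends on the YEARCUTOFF= system option.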
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
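The attributes listed above can each be set explicitly and then inspected with PROC CONTENTS. This is only a sketch; the data set members, its variables, and the label text are made up for the example:

```sas
data members;
   length Name $ 20;                  /* type (character) and length */
   informat StartDate date9.;         /* how the value is read       */
   format StartDate ddmmyy10.;        /* how the value is displayed  */
   label Name = 'Member name'
         StartDate = 'Date joined';   /* descriptive labels          */
   input Name $ StartDate;
   datalines;
Amelia 22jan1989
;
run;

/* Lists each variable with its type, length, format, informat, label */
proc contents data=members;
run;
```

Attributes that are not set explicitly take defaults; for example, a character variable read with simple list input gets a length of 8.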
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The code that follows shows the general form of the same approach, this time with the raw data stored in an external file:
data <dataset name>;
   infile 'your-input-file-path\filename.extension';
   input <var 1> <var 2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies the last observation that SAS processes.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Together with OBS=4, the following step reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
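IN= is most often used when data sets are combined. The following is a minimal sketch with two small made-up data sets (members and payments, both already in IdNumber order); the flag names inMembers and inPayments are arbitrary:

```sas
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
1300 20
;
run;

data paid_members;
   merge members (in=inMembers) payments (in=inPayments);
   by IdNumber;
   /* Keep only the IDs that appear in both data sets */
   if inMembers and inPayments;
run;

proc print data=paid_members;
run;
```

Only IdNumber 1023 and 1219 should remain. The IN= variables exist only during the DATA step and are not written to the output data set.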
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
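A minimal sketch of such a step might look like this. The data set name person and the two data lines are illustrative assumptions:

```sas
data person;
   /* Two character variables read with list input */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line produces one observation, and the lone semicolon after the last data line ends the data.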
DELETE Statement
Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step for the next iteration.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. A line of four semicolons marks the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   /* Lengths are set before INPUT so that "Alexander" (9 characters)
      is not truncated to the default length of 8 */
   length firstname lastname $ 15 name $ 25;
   input firstname $ lastname $ n1 n2;
   datalines;
Alexander Robinson 35 11
;
run;
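The effect of the LENGTH statement is easiest to see side by side. In this sketch the data set names default_len and longer_len are hypothetical:

```sas
/* Without LENGTH, list input gives a character variable 8 bytes */
data default_len;
   input firstname $;
   datalines;
Alexander
;
run;

/* With LENGTH placed before INPUT, the full value fits */
data longer_len;
   length firstname $ 15;
   input firstname $;
   datalines;
Alexander
;
run;

proc print data=default_len;   /* shows Alexande  */
run;
proc print data=longer_len;    /* shows Alexander */
run;
```

The LENGTH statement has to come before the first statement that mentions the variable, because the length of a character variable is fixed at compile time the first time SAS sees it.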
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method, the data values must be separated by at least one space.
Eg: -do- (same as the earlier examples)
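For completeness, a minimal list-input sketch follows. The data set name emp and the values are made up; the extra blanks in the second line and the period in the third are deliberate:

```sas
data emp;
   /* List input: values are read in order, separated by blanks */
   input id name $ age;
   datalines;
101 shiva 38
102   ravi   41
103 rama .
;
run;

proc print data=emp;
run;
```

Any number of blanks can separate the values, and a missing numeric value has to be marked with a period so that the remaining values stay in step with the variables.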
2) Column INPUT METHOD
In this method, values are read from fixed column positions. Character values can be longer than 8 characters and can contain embedded blanks.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable. The data values must line up in those columns.
data one;
   input id 3. name $ 11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this method, column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns forward.
data two;
   input @1 id 3. +5 @10 name $ 4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold a record in the input buffer across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
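The DSD option in the INFILE statement reads delimited data: the default delimiter becomes a comma, and two consecutive delimiters are treated as a missing value. The DELIMITER= option names a different delimiter.
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
data nums;
   /* The delimiter character is missing in the source;
      '|' is assumed here for illustration */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;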
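The MISSOVER option in the INFILE statement prevents SAS from reading the next data line when a record has fewer values than the INPUT statement expects; the remaining variables are set to missing.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;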
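For contrast, here is a sketch of the same temperature data read without MISSOVER, so that the default behavior (FLOWOVER) applies. The data set name weather_flow is hypothetical:

```sas
data weather_flow;
   infile datalines;      /* no MISSOVER: the default FLOWOVER is used */
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;

proc print data=weather_flow;
run;
```

Here the first record has only three values, so SAS moves to the second line to fill temp4 and temp5 and discards the rest of that line. The result should be two observations instead of three, with a log note that SAS went to a new line. With MISSOVER, all three records become observations and the short one gets missing values for temp4 and temp5.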
Thanks
Feedback at infoSitworldin
2OUT PUT window Results of program will be displayed in output window which has an extension of LIST
3LOG WINDOWSuppose if there are any errors or warnings in the program those messages will be displayed in log windowIt displays the licensed agreement of SAS Version no of variables no of observations
4Result windowIt displays result of all the programs in editor windowNo extension for result window
5Explorer windowContains Libraries and Mycomputer
SAS LANGUAGE
The SAS language consists of statements expressions options formats and functions similar to those of many other programming languages
In SAS you use these elements within one of two groups of SAS statements
DATA stepsPROC steps
DATA STEPA DATA step consists of a group of statements in the SAS language that can
read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values
Syntax
DATA ltDATA Set Namegt
INPUT ltvar1gtltvar2gthellipltvarngt
CARDS
Data valuesRUN
EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run
SAS DATA SETSA SAS data set consists of the following-descriptor information-data values
The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic
SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program
PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files
PROCEDURE STEP BLOCK
SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun
EG
proc print data=temp
Run
proc sort data=temp out=sampby namerun
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
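The IN= option is most often seen in a MERGE. The sketch below is only an illustration: the second data set club_fees, its variable Fee, and the flag names inClub and inFees are assumptions made for this example, while weight_club is the data set used throughout these slides.

```sas
/* Hypothetical second data set, used only for this sketch */
data club_fees;
   input IdNumber Fee;
   datalines;
1023 50
1219 45
9999 60
;
run;

/* A match-merge needs both inputs sorted by the BY variable */
proc sort data=weight_club; by IdNumber; run;
proc sort data=club_fees;   by IdNumber; run;

data matched;
   merge weight_club (in=inClub) club_fees (in=inFees);
   by IdNumber;
   /* inClub and inFees are 1 when that data set contributed to the
      current observation, 0 otherwise; keep IDs found in both */
   if inClub and inFees;
run;
```

The IN= variables are temporary: they are available while the DATA step runs but are not written to the output data set.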
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (new version of CARDS)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
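A minimal sketch of such a DATA step follows. The data set name person and the two data lines are illustrative assumptions, not values taken from the slides.

```sas
data person;
   /* two character variables read with list input */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation, and the single semicolon on its own line marks the end of the data.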
DELETE Statement
Stops processing the current observation, so it is not written to the output data set.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement (or CARDS4)
Indicates that data lines that contain semicolons follow. A line of four semicolons marks the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL, 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input (the data lines are not shown):
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method, the data values should be separated by at least one blank.
Example: same as the earlier examples that read blank-separated values (for instance, data set two1).
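For reference, a list-input step might look like the sketch below. The data set name list_demo and its values are made up for illustration.

```sas
data list_demo;
   /* values are read in order; one or more blanks separate them */
   input name $ age weight;
   datalines;
hari 25 61.5
ravi    31   70
kiran 28 58
;
run;
```

Because list input scans for the next non-blank value, the extra blanks in the second data line do not matter. Character values read this way cannot contain embedded blanks and are stored with a default length of 8.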
2) COLUMN INPUT METHOD
In this method, each variable is read from a fixed range of columns, so character values can be longer than 8 characters and can contain embedded blanks.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat, a width followed by a period, that specifies how many columns to read for its data values.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this method, column pointer controls (@n moves to column n, +n skips n columns) give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold a record in the input buffer across iterations of the DATA step, so that one data line can supply several observations.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
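Reading delimited data with the DSD option (comma-separated values by default):
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
   /* the delimiter character was lost in this transcript; ':' is assumed here */
   infile datalines dsd delimiter=':';
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;
Reading short records with the MISSOVER option (variables with no value on the line are set to missing):
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;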
Thanks
Feedback at infoSitworldin
SAS LANGUAGE
The SAS language consists of statements, expressions, options, formats, and functions similar to those of many other programming languages.
In SAS, you use these elements within one of two groups of SAS statements:
- DATA steps
- PROC steps
DATA STEP
A DATA step consists of a group of statements in the SAS language that can:
- read data from external files
- write data to external files
- read SAS data sets and data views
- create SAS data sets and data views
- create multiple SAS data sets in one DATA step
- combine existing data sets
- create accumulating totals
- manipulate numeric and character values
Syntax
DATA <data set name>;
   INPUT <var1> <var2> ... <varn>;
   CARDS;
data values
;
RUN;
Example
data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
SAS DATA SETS
A SAS data set consists of the following:
- descriptor information
- data values
The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.
SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
PROCEDURE STEP BLOCK
Syntax
PROC <procedure name>;
   statement 1;
   statement 2;
   statement n;
RUN;
Example
proc print data=temp;
run;
proc sort data=temp out=samp;
   by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
- By default, the INPUT statement reads every variable as numeric. To read character data, put the $ symbol after the variable name.
- A character variable can hold up to 32,767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set depends on the system configuration or the available hard disk space.
- In output, SAS shows a missing numeric value as a period (.) and a missing character value as a blank space.
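To see how missing values are displayed, here is a sketch that reads one record with a blank name and one with a blank score. The data set name missing_demo and its values are made up for illustration.

```sas
data missing_demo;
   /* column input: a blank field becomes a missing value */
   input name $ 1-8 score 10-12;
   datalines;
hari     85
         90
ravi
;
run;

proc print data=missing_demo;
run;
```

In the PROC PRINT output, the missing score appears as a period and the missing name appears as a blank.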
Different Data bases
Best db storage
- Text
- Excel
- Access
- DB2
- Oracle
- Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer
|
Right click
|
New
2. Through programming
In the Editor window, submit a LIBNAME statement:
LIBNAME <name of library> '<path>';
Example
LIBNAME Hari "D:\Ganesh";
To delete a library (this clears the libref; the files themselves are not deleted)
LIBNAME Guru CLEAR;
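As an illustration of how a libref is used once it is assigned, the sketch below stores a copy of a data set in a permanent library and then clears the libref. The libref mylib, the folder C:\sasdata, and the data set name temp_copy are assumptions made for this sketch, and the folder would need to exist already.

```sas
/* Assign a libref to an existing folder (hypothetical path) */
libname mylib 'C:\sasdata';

/* A two-level name, libref.dataset, writes a permanent data set */
data mylib.temp_copy;
   set work.temp;   /* temp is the data set created in the earlier example */
run;

/* Disassociate the libref; the stored files are not deleted */
libname mylib clear;
```

Data sets written with a one-level name go to the temporary WORK library and are removed when the SAS session ends.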
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- input buffer
- program data vector
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ takes one of two values:
i) _ERROR_=0 means that no error has occurred in the current iteration.
ii) _ERROR_=1 means that an error, such as an input data error, has occurred.
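A small sketch can make the two automatic variables visible. It assumes a hypothetical data set auto_vars with one deliberately invalid numeric value; the PUT statement writes _N_ and _ERROR_ to the log on every iteration.

```sas
data auto_vars;
   input name $ score;
   /* write the automatic variables and the values just read to the log */
   put _N_= _ERROR_= name= score=;
   datalines;
hari 85
ravi abc
kiran 90
;
run;
```

On the second record the value abc cannot be read as a number, so SAS sets score to missing and _ERROR_ to 1 for that iteration; _ERROR_ is reset to 0 when the next iteration begins. Neither automatic variable is written to the output data set.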
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the input buffer and the program data vector after DATA step compilation.
Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
Figure: Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
Figure: The First Observation in Data Set TOTAL_POINTS
Output: SAS Data Set TOTAL_POINTS, 1st observation
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase, SAS
- checks code for errors
- translates code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector
- assigns required attributes to variables
- creates the descriptor portion of the new data set
Execution
During the execution phase, SAS
- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
Word Scanner, Compiler, and Input Stack
Figure: the input stack holds the submitted program:
Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;
Figure: the tokens Data temp have moved from the input stack to the word scanner, which passes them on to the compiler; the rest of the program is still in the input stack.
The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
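A minimal sketch of this, assuming a hypothetical data set named money with a single data line:

```sas
data money;
   /* COMMA11. strips the dollar sign and commas while reading */
   input amount comma11.;
   put amount=;
   datalines;
$1,000,000
;
run;
```

The PUT statement writes amount=1000000 to the log, which shows that the dollar sign and commas were removed before the value was stored.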
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of statements: INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, then the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in a standard format, then the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
DATA STEPA DATA step consists of a group of statements in the SAS language that can
read data from external fileswrite data to external filesread SAS data sets and data viewscreate SAS data sets and data viewsCreate multiple SAS data sets in one DATA STEPCombine existing data setsCreating accumulating totalsManipulate numeric and character values
Syntax
DATA ltDATA Set Namegt
INPUT ltvar1gtltvar2gthellipltvarngt
CARDS
Data valuesRUN
EGdata tempinput name $ nodatalineshari 102ravi 104ganesh 105kiran 109run
SAS DATA SETSA SAS data set consists of the following-descriptor information-data values
The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic
SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program
PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files
PROCEDURE STEP BLOCK
SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun
EG
proc print data=temp
Run
proc sort data=temp out=sampby namerun
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
During the compilation phase SAS
Checks the code for syntax errors
Translates the code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the program data vector (PDV)
Assigns required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
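The iteration behaviour described above can be watched in the SAS log. The following is a minimal sketch, not part of the original slides: it reuses the first three data lines of the TOTAL_POINTS example, and the data set name total_points_trace is made up for illustration. A PUT statement writes _N_, the retained sum, and _ERROR_ once per iteration.

```sas
/* Sketch: trace the PDV once per iteration of the DATA step.        */
/* total_points_trace is a hypothetical name used only here.         */
data total_points_trace;
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);   /* Sum statement: value is retained */
   put _N_= TeamTotal= _ERROR_=;             /* written to the log, not the data set */
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;

/* Lines written to the log:
   _N_=1 TeamTotal=22 _ERROR_=0
   _N_=2 TeamTotal=46 _ERROR_=0
   _N_=3 TeamTotal=67 _ERROR_=0
*/
```

TeamTotal keeps growing because the Sum statement retains its value, while the event scores are reset to missing and read again in each iteration.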
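[Figure: Input stack, word scanner, and compiler. In the first panel the whole submitted program sits in the input stack. In the second panel the word scanner has taken the first tokens (data temp) from the input stack and passed them to the compiler.]
Program in the input stack:
data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;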
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:
Literal: a string of characters enclosed in quotation marks
Number: digits, date values, time values, and hexadecimal numbers
Name: a string of characters beginning with an underscore or letter
Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
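To make the four token types concrete, here is a small sketch (not from the original slides) in which the comments show how the word scanner would classify the pieces of each statement. The title text and the WHERE condition are assumptions added for illustration.

```sas
/* Small data set so that the PROC step below can run. */
data temp;
   input name $ no;
   datalines;
Hari 101
;
run;

title "Listing of TEMP";   /* name: title | literal: "Listing of TEMP" | special: ;  */

proc print data=temp;      /* names: proc, print, data, temp | specials: = and ;     */
   where no > 100;         /* names: where, no | special: > | number: 100 | special: ; */
   var name no;            /* names: var, name, no | special: ;                      */
run;

title;                     /* clears the title again */
```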
What Are the SAS Language Elements?
Data set options
Informats and formats
Functions
Statements
SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
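A minimal sketch of that read, assuming a one-variable data set (the names amounts and amount are made up for illustration):

```sas
/* Sketch: read a value that contains a dollar sign and commas. */
data amounts;                 /* hypothetical data set name */
   input amount comma11.;     /* COMMA11. strips the $ and the commas */
   put amount=;               /* writes amount=1000000 to the log */
   datalines;
$1,000,000
;
run;
```

The stored value is the plain number 1000000. A format such as DOLLAR12. can be applied later if the dollar sign and commas should reappear in output.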
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
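The two function calls above can be tried in a DATA step that creates no data set. This is only a sketch: the values assigned to cash and credit are made up, and the result variables are named larger and root for illustration.

```sas
/* Sketch: call two functions and write the results to the log. */
data _null_;
   cash=250;                    /* hypothetical amounts */
   credit=400;
   larger=max(cash, credit);    /* returns the larger argument, here 400 */
   root=sqrt(1500);             /* square root of 1500 */
   put larger= root=;
run;
```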
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of statements:
INPUT, LIST, PUT, DATALINES, DO, iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, then the data is called standard data.
E.g. 467
NON STANDARD DATA
If the data values are not in the standard format, then the data is called non-standard data.
E.g.
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file path\filename.extension';
   input <var 1> <var 2> ...;
run;
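A runnable sketch of the same idea follows. Because no real file path is given in the slides, the sketch first writes two records to a temporary file and then reads them back with INFILE. The fileref rawfile and the data set name weight_club_ext are assumptions made for illustration.

```sas
/* Sketch: read raw data from an external file.                    */
filename rawfile temp;        /* hypothetical fileref for a temporary file */

/* Step 1: create the external file so the example is self-contained. */
data _null_;
   file rawfile;
   put "1023 David Shaw      red 189 165";
   put "1049 Amelia Serrano  yellow 145 124";
run;

/* Step 2: read the file with INFILE instead of DATALINES. */
data weight_club_ext;
   infile rawfile;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

proc print data=weight_club_ext;
run;
```

In real use, the FILENAME statement and the first step would be dropped and the INFILE statement would name the existing file in quotation marks.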
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
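OBS=
Specifies the last observation that SAS processes, that is, when to stop processing observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. In this example SAS processes observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;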
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
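The slides give no example for IN=, so the following is only a sketch. It merges two small made-up data sets (club_a and club_b) by IdNumber and keeps the observations to which both data sets contributed. The flag names inA and inB are arbitrary.

```sas
/* Sketch: use IN= flags to keep only matching observations. */
data club_a;                       /* hypothetical data set, sorted by IdNumber */
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

data club_b;                       /* hypothetical data set, sorted by IdNumber */
   input IdNumber EndWeight;
   datalines;
1023 165
1219 192
;
run;

data both;
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   if inA and inB;                 /* keep only IdNumber values found in both */
run;
```

The IN= variables exist only in the PDV during the step and are not written to the output data set. Here BOTH ends up with a single observation, for IdNumber 1023.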
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
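The code for this example did not survive in the slides. A minimal sketch of such a step, with a made-up data set name and made-up data lines, might look like this:

```sas
/* Sketch: read two character variables from in-stream data lines. */
data person;                  /* hypothetical data set name */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```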
DELETE Statement
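Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step for the next iteration.
if Team="red" then delete;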
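In context, the statement sits inside a DATA step that reads another data set. The sketch below uses a small made-up data set (club) so that it runs on its own. The names club and not_red are assumptions.

```sas
/* Sketch: drop every observation whose Team is red. */
data club;                    /* hypothetical input data */
   input Name $ Team $;
   datalines;
David red
Amelia yellow
Alan red
;
run;

data not_red;
   set club;
   if Team="red" then delete;   /* skip the rest of the step for this observation */
run;
```

Only the observation for Amelia reaches NOT_RED, because the other two are deleted before the implicit output at the end of the step.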
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL, 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method, the data values must be separated by at least one blank space.
E.g. input name $ no; (as in the earlier DATA step examples)
2)Column INPUT METHOD
In this method, each value is read from fixed column positions, so character values can be longer than 8 characters and can contain blank spaces.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3)NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4)FORMATTED INPUT METHOD
In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for its values.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5)ABSOLUTE INPUT METHOD
In this input method, column pointer controls (@n and +n) give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
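Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing at sign (@@) to hold the input record, and with it the position of the input pointer, across iterations of the DATA step.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
The DSD option of the INFILE statement reads delimiter-separated values (commas by default), and the DELIMITER= option names a different delimiter. The delimiter characters did not survive in this example, so commas and a colon are assumed here.
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
data nums;
   infile datalines dsd delimiter=':';
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;
run;
The MISSOVER option sets the remaining variables to missing when a data line has fewer values than the INPUT statement expects.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;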
Thanks
Feedback at infoSitworldin
Syntax
DATA <data set name>;
   INPUT <var1> <var2> ... <varn>;
   CARDS;
data values
;
RUN;
E.g.
data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
SAS DATA SETS
A SAS data set consists of the following:
- descriptor information
- data values
The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.
SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
PROCEDURE STEP BLOCK
Syntax
PROC <procedure name>;
   statement 1;
   statement 2;
   statement n;
RUN;
E.g.
proc print data=temp;
run;
proc sort data=temp out=samp;
   by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
By default, SAS reads every variable as numeric. To read character data, use the $ symbol after the variable name.
A character variable can hold up to 32,767 characters.
SAS reads data values observation by observation.
The number of observations in a SAS data set is limited only by the system configuration or hard disk space.
In output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
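The way missing values show up can be checked with a short sketch. The data set name missing_demo and its data lines are made up for illustration. The MISSOVER option is used so that a short line leaves age missing instead of reading the next line.

```sas
/* Sketch: how missing numeric and character values are displayed. */
data missing_demo;            /* hypothetical data set name */
   infile datalines missover;
   input name $ age;          /* $ marks name as character; age is numeric */
   datalines;
hari 25
ravi
. 30
;
run;

proc print data=missing_demo;
run;
```

In the printed output, the second observation shows a period for the missing age, and the third shows a blank for the missing name, because list input treats a lone period as a missing character value.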
Different Databases
Best db storage
Text
Excel
Access
DB2
Oracle
Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer > Right click > New
2. Through programming
In the Editor window:
LIBNAME <name of library> "<path>";
Example
LIBNAME Hari "D:\Ganesh";
To delete (clear) a library reference
LIBNAME Guru CLEAR;
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
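Both sets of rules can be seen at work in one short step. This is only an illustrative sketch. The data set and variable names are made up.

```sas
/* Statements may be written in any case, start in any column,
   and share a line, as long as each one ends with a semicolon. */
DATA team_scores; Event1 = 6; event_2 = 8;
   _total = Event1
            + event_2;        /* a statement may continue on the next line */
run;

proc print data=team_scores; run;

/* Valid names used above: team_scores, Event1, event_2, _total.
   Names such as 2ndEvent (starts with a digit) or team score
   (contains a blank) would break the naming rules.             */
```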
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- INPUT BUFFER
- PROGRAM DATA VECTOR
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
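The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate
2) _ERROR_ signals whether an error occurred; it takes one of two values:
i) _ERROR_=0 means that no error was encountered in the current iteration of the DATA step
ii) _ERROR_=1 means that an error (for example, invalid input data) was encountered in the current iteration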
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965
Thanks
Feedback at infoSitworldin
EG
data temp;
   input name $ no;
   datalines;
hari 102
ravi 104
ganesh 105
kiran 109
;
run;
SAS DATA SETS
A SAS data set consists of the following:
- descriptor information
- data values
The descriptor information describes the contents of the SAS data set to SAS. The data values are data that has been collected or calculated. They are organized into rows, called observations, and columns, called variables. An observation is a collection of data values that usually relate to a single object. A variable is the set of data values that describe a given characteristic.
SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
PROCEDURE STEP BLOCK
Syntax
PROC <procedure name>;
   statement 1;
   statement 2;
   statement n;
RUN;
EG
proc print data=temp;
run;
proc sort data=temp out=samp;
   by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
By default, the INPUT statement reads every variable as numeric. To read character data, put the $ symbol after the variable name.
A character variable can hold up to 32767 characters.
SAS reads data values observation by observation.
The number of observations in a SAS data set is limited only by the system configuration or hard disk space.
In output, SAS shows a missing numeric value as a period (.) and a missing character value as a blank space.
Different Data bases
Best db storage
Text
Excel
Access
DB2
Oracle
Teradata
LIBRARIES
There are 2 ways of creating libraries
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer
|
Right click
|
New
2. Through programming
Editor window
LIBNAME <name of library> "<path>";
Example
LIBNAME Hari "D:\Ganesh";
To delete a library (this clears the library reference; the files in the folder are kept)
LIBNAME Guru CLEAR;
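A library reference becomes useful once it is combined with a data set name. The following is a minimal sketch, assuming a hypothetical libref mylib and a hypothetical folder C:\sasdata that already exists on the system.

```sas
/* Hypothetical folder: replace it with a directory that exists on your system */
libname mylib "C:\sasdata";

/* A two-level name (libref.dataset) stores the data set in that folder,
   so it is still available after the SAS session ends */
data mylib.temp;
   input name $ no;
   datalines;
hari 102
ravi 104
;
run;

proc print data=mylib.temp;
run;

/* Removes only the association between the libref and the folder;
   the data set file stays on disk */
libname mylib clear;
```

A data set written with a one-level name, such as temp, goes to the temporary WORK library instead and is removed at the end of the session.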
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements
- SAS statements end with a semicolon
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two
- You can begin SAS statements in any column of a line and write several statements on the same line
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply
- A SAS name can contain from one to 32 characters
- The first character must be a letter or an underscore (_)
- Subsequent characters must be letters, numbers, or underscores
- Blanks cannot appear in SAS names
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items
- INPUT BUFFER
- PROGRAM DATA VECTOR
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.
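The PDV contains two automatic variables
1) _N_ counts the number of times the DATA step has begun to iterate
2) _ERROR_ signals whether an error occurred during execution and takes one of 2 values
i) _error_=0 means no error was encountered in the current iteration
ii) _error_=1 means an error (for example invalid input data) was encountered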
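The two automatic variables can be watched by writing them to the log. The following is a minimal sketch with hypothetical data; the second record deliberately holds a non-numeric age.

```sas
data check;
   input name $ age;
   /* Write both automatic variables and the data values to the log
      on every iteration of the DATA step */
   put _N_= _ERROR_= name= age=;
   datalines;
hari 29
ravi xx
rani 19
;
run;
```

For the second record SAS cannot convert xx to a number, so it sets age to missing and _ERROR_ to 1 and writes a note about invalid data to the log. For the first and third records _ERROR_ is 0. Neither _N_ nor _ERROR_ is written to the output data set.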
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the following example), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTS
Output SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way
- The values of variables created by the INPUT statement are set to missing
- The value created by the Sum statement is automatically retained
- The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
During the compilation phase SAS
Checks the code for errors
Translates the code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the Program Data Vector
Assigns the required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record
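The retained running total described above can be reproduced in a few lines. The following is a minimal sketch with hypothetical variable names (score, total, total2) that shows the Sum statement next to its explicit RETAIN equivalent.

```sas
data running;
   input score;
   /* Sum statement: total starts at 0 and keeps its value
      from one iteration to the next */
   total + score;
   /* The same logic written out with RETAIN and the SUM function */
   retain total2 0;
   total2 = sum(total2, score);
   /* Log the iteration counter and both running totals */
   put _N_= score= total= total2=;
   datalines;
6
8
8
;
run;
```

The log shows both totals growing from 6 to 14 to 22 while score is reset and read again on every iteration, and _N_ runs from 1 to 3.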
Word Scanner, Compiler and Input Stack
[Figure: the submitted program waits in the input stack. The word scanner takes tokens from the top of the stack, starting with "Data temp", and passes them to the compiler.]
Program in the input stack
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
"The process that SAS uses to extract words and symbols from the input stack to the word scanner is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens
Literal: a string of characters enclosed in quotation marks
Number: digits, date values, time values, and hexadecimal numbers
Name: a string of characters beginning with an underscore or letter
Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements
Data set options
Informats and formats
Functions
Statements
SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following form
<$>format<w>.<d>
Here is an explanation of the syntax
$ indicates a character format; its absence indicates a numeric format
format names the format
w specifies the format width, which for most formats is the number of columns in the output data
d specifies an optional decimal scaling factor in the numeric formats
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result
$1,145.32
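The same stored value can be written in different ways simply by changing the format. The following is a minimal sketch, assuming a hypothetical date variable hired in addition to amount; nothing is stored, the values are only written to the log.

```sas
data _null_;
   amount = 1145.32;
   hired  = '21OCT1984'd;     /* SAS date constant */
   put amount dollar10.2;     /* $1,145.32 */
   put amount comma10.2;      /* 1,145.32  */
   put amount 8.1;            /* 1145.3    */
   put hired date9.;          /* 21OCT1984 */
   put hired ddmmyy10.;       /* 21/10/1984 */
run;
```

A format changes only how a value is displayed. The value stored in the variable stays the same.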
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat
Syntax of an Informat
SAS informats have the following form
<$>informat<w>.<d>
Here is an explanation of the syntax
$ indicates a character informat; its absence indicates a numeric informat
informat names the informat
w specifies the informat width
d specifies an optional decimal scaling factor in the numeric informats
data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
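The COMMA11. case described above can be checked directly. The following is a minimal sketch that reads the value $1,000,000 and writes it back to the log; the variable name amount is hypothetical.

```sas
data _null_;
   /* COMMA11. strips the dollar sign and the commas while reading */
   input amount comma11.;
   put amount=;             /* amount=1000000 */
   put amount dollar12.;    /* $1,000,000     */
   datalines;
$1,000,000
;
run;
```

The informat controls how the raw text is converted to a number, and the DOLLAR12. format in the second PUT statement puts the punctuation back for display.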
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
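Function calls like the two above are normally used inside a DATA step. The following is a minimal sketch with hypothetical values for cash and credit that writes the results to the log.

```sas
data _null_;
   cash   = 250;
   credit = 400;
   x    = max(cash, credit);   /* larger of the two arguments: 400 */
   y    = sqrt(1500);          /* square root, about 38.73 */
   z    = round(y, 0.01);      /* rounded to two decimals: 38.73 */
   name = upcase('hari');      /* character function: HARI */
   put x= y= z= name=;
run;
```

ROUND and UPCASE are additional Base SAS functions, included here only to show a numeric and a character function side by side.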
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
INPUT (list)
PUT
DATALINES
DO (iterative)
DO UNTIL
DO WHILE
SELECT
DROP
MERGE
SET
FILE
LENGTH
Sum
END
OUTPUT
KEEP
DATA
RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is
OPTIONS option(s);
Here is an explanation of the syntax
option specifies one or more SAS system options that you want to change
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, then the data is called standard data
Eg 467
NON STANDARD DATA
If the data values are not in the standard format, then the data is called non-standard data
Eg
18-10-05
45,000
$21,000
Informats are used to read non-standard data
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.
SAS variables can have these attributes
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file-path\filename.extension';
   input <var 1> <var 2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables that are listed in the option
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first
data s1;
   set weight_club (obs=4 firstobs=2);
run;
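When both options are given, OBS= names the number of the last observation to process, not a count. The following is a minimal sketch with two hypothetical data sets, five and middle, that makes the effect visible in the log.

```sas
/* Build a small data set with the values 1 to 5 */
data five;
   do n = 1 to 5;
      output;
   end;
run;

/* Start at observation 2 and stop after observation 4 */
data middle;
   set five (firstobs=2 obs=4);
   put n=;     /* the log shows n=2, n=3 and n=4 */
run;
```

The data set middle therefore has three observations, not four.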
RENAME=
Changes the name of a variable
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
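IN= is most often used when data sets are merged. The following is a minimal sketch with two hypothetical data sets, club and paid, and hypothetical flag variables inclub and inpaid. Both inputs are already in IdNumber order, which the BY statement requires.

```sas
data club;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
1219 red
;
run;

data paid;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
1300 20
;
run;

data both;
   /* inclub and inpaid are 1 when that data set contributed
      to the current observation, otherwise 0 */
   merge club (in=inclub) paid (in=inpaid);
   by IdNumber;
   /* Keep only the IDs that are present in both data sets */
   if inclub and inpaid;
run;
```

The data set both keeps the observations for 1023 and 1219. The IN= variables are temporary and are not written to the output data set.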
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow
DATALINES Statement (New version)
Indicates that data lines follow
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step
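The example itself is not present in this transcript. The following is a minimal sketch of a DATA step that matches the description, with a hypothetical data set name and hypothetical data values.

```sas
data person;
   /* Both variables are character, so each name is followed by $ */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line produces one observation, and the single semicolon on its own line ends the data.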
DELETE Statement
Stops processing the current observation
if Team="red" then delete;
FORMAT Statement
Associates formats with variables
INFORMAT Statement
Associates informats with variables
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow
data biblio;
   input number citation $50.;
   datalines4;
46 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables
data rtest;
   set weight_club;
   label name='teamname';
run;
SAS DATA SETSA SAS data set consists of the following-descriptor information-data values
The descriptor information describes the contents of the SAS data set to SAS The data values are data that has been collected or calculated They are organized into rows called observations and columns called variables An observation is a collection of data values that usually relate to a single object A variable is the set of data values that describe a given characteristic
SAS VARIABLES AND OBSERVATIONSThe below figure shows a SAS data set The data describes participants in a 16-week weight program at a health and fitness club The data for each participant includes an identification number name team name and weight at the beginning and end of the program
PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files
PROCEDURE STEP BLOCK
SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun
EG
proc print data=temp
Run
proc sort data=temp out=sampby namerun
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, then the data is called standard data.
E.g. 467
NON STANDARD DATA
If the data values are not in the standard format, then the data is called non-standard data.
E.g.
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
  input name $ Bdate ddmmyy8.;
  format Bdate ddmmyy8.;
  cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat    Format
12-07-78     DDMMYY8.    DDMMYY8.
21-09-05     DDMMYY10.   DDMMYY10.
22Jan89      DATE7.      DATE7.
22jan1989    DATE9.      DATE9.
Numeric Informats
Numeric      Informat    Format
25,000       COMMA6.     COMMA6.
$3,000       DOLLAR6.    DOLLAR6.
25,000       COMMA6.     WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
  datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The code that follows shows the general form when the raw data is stored in an external file; the file path and the variable list are placeholders.
data <dataset-name>;
  infile 'your-input-file-path\filename.extension';
  input <var-1> <var-2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
  set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
  set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
  set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
  set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations.
data s1;
  set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. In this example SAS reads observations 2 through 4.
data s1;
  set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
  set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
  set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
  set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
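The IN= option has no example in the slides, so here is a minimal sketch of one common use: keeping only the observations that appear in both inputs of a merge. The data sets members and payments, their values, and the flag names inMembers and inPayments are all hypothetical.

```sas
/* hypothetical data sets, already in IdNumber order */
data members;
  input IdNumber Name $;
  datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
  input IdNumber Amount;
  datalines;
1023 50
1219 75
;
run;

data paid_members;
  /* each IN= variable is 1 when its data set contributed to the observation */
  merge members (in=inMembers) payments (in=inPayments);
  by IdNumber;
  /* keep only the IDs that are found in both data sets */
  if inMembers and inPayments;
run;
```

The IN= variables exist only while the DATA step runs; they are not written to the output data set.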
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
  set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
  set weight_club;
  put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
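The code for this example did not survive in the slides. A minimal sketch that matches the description could look like the following; the data set name person and the two data lines are assumptions made for illustration.

```sas
data person;
  /* NAME and DEPT are both read as character variables */
  input name $ dept $;
  datalines;
John Sales
Mary Acctng
;
run;
```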
DELETE Statement
Stops processing the current observation, which is then not written to the output data set.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
  /* the INFORMAT statement is placed before INPUT so that list input uses it */
  informat hiredate mmddyy8.;
  input ename $ eid hiredate;
  datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.
data biblio;
  input number citation $50.;
  datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
  set weight_club;
  keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
  set weight_club;
  label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
  /* LENGTH comes first so that "Alexander" is not truncated to 8 characters */
  length firstname lastname $15 name $25;
  input firstname $ lastname $ n1 n2;
  datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
  input name $ 1-18 score1 25-27 score2 30-32;
  /* data lines follow here */
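INPUT METHODS
1) List input method
2) Column input method
3) Named input method
4) Formatted input method
5) Absolute input method
1) LIST INPUT METHOD
In this method the data values must be separated by at least one blank space.
E.g. the DATA steps shown earlier that read values separated by blanks, such as the one under the INFORMAT statement.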
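A minimal sketch of list input, with a hypothetical data set club and made-up values, could look like this:

```sas
data club;
  /* the variables are listed in the order of the values on each line */
  input id name $ age;
  datalines;
101 hari 25
102 ravi 31
;
run;
```

Because blanks act as separators here, a character value that itself contains a blank needs one of the other input methods, such as column input.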
2) COLUMN INPUT METHOD
In this method each value is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces.
data temp;
  input id 1-3 name $ 7-18 age 21-22;
  cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
  input id= name= $ age=;
  cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by a width and a period (an informat) that specifies how many columns are read for that variable on every data line.
data one;
  input id 3. name $ 11. age 3.;
  datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
  /* @n moves the pointer to column n; +n moves it n columns forward */
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;
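Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold a record in the input buffer across iterations of the DATA step.
data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
/* DSD reads comma-delimited values */
data num;
  infile datalines dsd;
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;
run;
/* the delimiter character was lost in the source; '|' is assumed here for illustration */
data nums;
  infile datalines dsd delimiter='|';
  input X Y Z;
  datalines;
1|2|3
4|5|6
7|8|9
;
run;
/* MISSOVER sets the remaining variables to missing when a line is short */
data weather;
  infile datalines missover;
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;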
Thanks
Feedback at infoSitworldin
SAS VARIABLES AND OBSERVATIONS
The figure below shows a SAS data set. The data describes participants in a 16-week weight program at a health and fitness club. The data for each participant includes an identification number, name, team name, and weight at the beginning and end of the program.
PROC STEP
Once your data is accessible as a SAS data set, you can analyze the data and write reports by using a set of tools known as SAS procedures. A group of procedure statements is called a PROC step. SAS procedures analyze data in SAS data sets to produce statistics, tables, reports, charts, and plots, to create SQL queries, and to perform other analyses and operations on your data. They also provide ways to manage and print SAS files.
PROCEDURE STEP BLOCK
Syntax
proc <procedure-name>;
  statement-1;
  statement-2;
  statement-n;
run;
E.g.
proc print data=temp;
run;
proc sort data=temp out=samp;
  by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
By default, SAS reads every variable as numeric. To read character data, put the $ symbol after the variable name in the INPUT statement.
A character variable can hold up to 32,767 characters.
SAS reads data values observation by observation.
The number of observations in a SAS data set depends on the system configuration or the hard disk space.
In SAS output, a missing numeric value is shown as a period (.) and a missing character value as a blank space.
Different Data bases
Best db storage
Text, Excel, Access, DB2, Oracle, Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer
|
Right click
|
New
2. Through programming
In the Editor window:
LIBNAME <name of library> "<path>";
Example
LIBNAME Hari "D:\Ganesh";
To delete a library reference (the files stored in the folder are not deleted):
LIBNAME Guru CLEAR;
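Once a libref is assigned, a two-level name of the form libref.dataset stores a data set in that library. The sketch below assumes that the folder from the example above exists on the machine; the data set name members and its values are hypothetical.

```sas
/* assign the libref; the folder must already exist */
libname Hari "D:\Ganesh";

/* the two-level name stores the data set in the Hari library */
data Hari.members;
  input id name $;
  datalines;
101 hari
102 ravi
;
run;

/* release the libref when it is no longer needed */
libname Hari clear;
```

A data set created with a one-level name, such as temp, goes to the temporary WORK library instead and is removed when the SAS session ends.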
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- input buffer
- program data vector
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.
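The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during execution. It has two possible values:
i) _ERROR_=0 means that no error was encountered.
ii) _ERROR_=1 means that an error (for example, invalid input data) occurred while the current observation was processed.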
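A small sketch makes the two automatic variables visible in the log. The variable x and the three data lines are hypothetical; the second line is deliberately not numeric.

```sas
data _null_;
  input x;
  /* write the iteration counter, the value read, and the error flag */
  put _n_= x= _error_=;
  datalines;
10
abc
30
;
run;
```

For the second line SAS cannot read abc as a number, so x should be missing and _ERROR_ should be 1 for that iteration, while _N_ keeps counting 1, 2, 3.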
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
  input TeamName $ ParticipantName $ Event1 Event2 Event3;
  TeamTotal + (Event1 + Event2 + Event3);
  datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the input buffer and the program data vector after DATA step compilation.
Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
Figure: Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
Figure: The First Observation in Data Set TOTAL_POINTS
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase, SAS:
- checks the code for errors
- translates the code to machine code
- establishes an area of memory called the input buffer, if reading raw data
- establishes an area of memory called the program data vector
- assigns required attributes to variables
- creates the descriptor portion of the new data set
Execution
During the execution phase, SAS:
- initializes the PDV to missing
- reads data values into the PDV
- carries out assignment statements and conditional processing
- writes the observation in the PDV to the output SAS data set at the end of the DATA step
- returns to the top of the DATA step
- initializes any variables that are not read from SAS data sets to missing
- repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
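The way a Sum statement keeps its value from one iteration to the next can also be written out by hand. The sketch below, with a hypothetical data set totals and made-up scores, uses RETAIN and the SUM function to get the same running total that the statement total + score; would give.

```sas
data totals;
  input score;
  /* RETAIN stops total from being reset to missing on each iteration */
  retain total 0;
  /* SUM ignores missing values, as the Sum statement does */
  total = sum(total, score);
  datalines;
22
23
24
;
run;
```

With these three scores, total holds 22, 45, and 69 in the three observations.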
Word Scanner
Figure: the input stack, the word scanner, and the compiler
Input stack (the program as submitted):
data temp;
  input name $ no;
  cards;
Hari 101
;
%let list=name;
proc print data=temp;
  var &list;
run;
After the word scanner has passed the first statement (data temp;) to the compiler, the remaining statements are still waiting in the input stack.
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes these kinds of tokens:
- Literal: a string of characters enclosed in quotation marks.
- Number: digits, date values, time values, and hexadecimal numbers.
- Name: a string of characters beginning with an underscore or letter.
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
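To make the token classes concrete, the sketch below annotates two short statements with the tokens the word scanner would extract from them. The statements themselves are only illustrations and are not taken from the slides.

```sas
/* 3 tokens:  title (name)  'Report for May' (literal)  ; (special) */
title 'Report for May';

data _null_;
  /* 6 tokens:  total (name)  = (special)  15 (number)    */
  /*            + (special)   2 (number)   ; (special)    */
  total = 15 + 2;
run;
```

Blanks between tokens are not tokens themselves; they only mark where one token ends and the next begins.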
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1<...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
  input TName $ PName $ Event1 Event2;
  datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
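The examples above use one option at a time. As a sketch of several options separated by spaces, the step below reads only two variables and only the first two observations. The data set all_points is a hypothetical copy of the data above that keeps every variable, so that the block is self-contained.

```sas
/* hypothetical input data set with all four variables */
data all_points;
  input TName $ PName $ Event1 Event2;
  datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;

data first_two;
  /* KEEP= and OBS= apply to the same data set and are separated by a space */
  set all_points (keep=PName Event1 obs=2);
run;
```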
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
PROC STEPOnce your data is accessible as a SAS data set you can analyze the data and write reports by using a set of tools known as SAS proceduresA group of procedure statements is called a PROC step SAS procedures analyze data in SAS data sets to produce statistics tables reports charts and plots to create SQL queries and to perform other analyses and operations on your data They also provide ways to manage and print SAS files
PROCEDURE STEP BLOCK
SyntaxProc ltProcedure namegtStatement 1Statement 2Statement nRun
EG
proc print data=temp
Run
proc sort data=temp out=sampby namerun
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
[Figure: the input buffer and the program data vector after DATA step compilation]
[Figure: position of the pointer in the input buffer before SAS reads data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.
[Figure: position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]
[Figure: program data vector with the computed value of the Sum statement]
Writing an Observation to the SAS Data Set
[Figure: the first observation in the output SAS data set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
Checks code for errors
Translates code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the program data vector
Assigns required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
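The iteration behaviour described above can be watched in the log. The following is only a sketch: it reuses the first three data lines of the TOTAL_POINTS example, writes no data set (DATA _NULL_), and prints _N_ and the running total on each pass.

```sas
data _null_;
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   /* Sum statement: TeamTotal starts at 0 and is retained across iterations */
   TeamTotal + (Event1 + Event2 + Event3);
   put _n_= ParticipantName= TeamTotal=;
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

The log should show TeamTotal growing from 22 to 46 to 67 while _N_ goes from 1 to 3, which is the retained Sum statement value at work.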
Word Scanner
[Figure: a submitted program waits in the input stack; the word scanner reads it and passes tokens to the compiler]
The program in the input stack is:
data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
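To make the token types concrete, here is a small annotated sketch. The step itself is made up for illustration; the comment after each statement lists the tokens that the word scanner would be expected to hand to the compiler.

```sas
data temp;               /* name: data, temp      special: ;                  */
   x = 10 + 5;           /* name: x   number: 10, 5   special: = + ;          */
   msg = 'two words';    /* name: msg   literal: 'two words'   special: = ;   */
run;                     /* name: run             special: ;                  */
```

Note that the quoted string counts as a single literal token even though it contains a blank.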
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
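A minimal sketch of the two function calls above inside a complete step. The values assigned to cash and credit are invented for illustration, and the results are written to the log with PUT rather than to a data set.

```sas
data _null_;
   cash = 250;                   /* hypothetical sample values */
   credit = 400;
   larger = max(cash, credit);   /* MAX returns the largest argument */
   root = sqrt(1500);            /* SQRT returns the square root     */
   put larger= root=;
run;
```

With these inputs, larger should be 400, since MAX simply picks the bigger of its arguments.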
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Commonly used DATA step statements include: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, then the data is called standard data.
E.g. 467
NON STANDARD DATA
If the data values are not in the standard format, then the data is called non-standard data.
E.g.
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
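The table rows can be tried out directly. The sketch below is an illustration only: the data set name pay and the variables amount and bonus are made up. It reads one value with COMMA6. and one with DOLLAR6., then attaches the matching formats so that PROC PRINT shows the values the way they were typed.

```sas
data pay;
   /* columns 1-6 hold the comma value, column 7 is a blank, columns 8-13 hold the dollar value */
   input amount comma6. +1 bonus dollar6.;
   format amount comma6. bonus dollar6.;
   datalines;
25,000 $3,000
;
run;

proc print data=pay;
run;
```

Internally both variables are stored as plain numbers (25000 and 3000); the informat strips the punctuation on the way in and the format puts it back on the way out.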
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The code that follows shows the general form when the raw data is stored in an external file.
data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies the last observation that SAS processes.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. In this example SAS reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
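The IN= option has no example above, so here is a minimal sketch. Everything in it is hypothetical: the data sets club_a and club_b, the flag names inA and inB, and the sample rows. It merges two small data sets by IdNumber and keeps only the observations that both data sets contributed to.

```sas
data club_a;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
;
run;

data club_b;
   input IdNumber Team $;
   datalines;
1049 yellow
1219 red
;
run;

data both;
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   /* inA and inB are temporary 0/1 flags; they are not written to BOTH */
   if inA and inB;
run;
```

Only IdNumber 1049 appears in both inputs, so BOTH should end up with a single observation. Both inputs are already in IdNumber order here; otherwise a PROC SORT would be needed before the MERGE.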
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
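The code for this example did not survive in the transcript. A minimal sketch of what such a step could look like follows; the data set name person and the two data lines are assumptions made for illustration.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation, and the lone semicolon after the last data line ends the data.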
DELETE Statement
Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step.
if Team="red" then delete;
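Placed in a complete step, the statement could be used as sketched below. The output name no_red is made up, and the step assumes that the weight_club data set from the earlier example already exists in the session.

```sas
data no_red;
   set weight_club;
   /* skip every observation from the red team; nothing after DELETE runs for it */
   if Team = 'red' then delete;
run;
```

With the five sample rows shown earlier, only the two yellow-team observations should remain in no_red.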
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. Four consecutive semicolons (;;;;) mark the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step fragment demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method the data values should be separated by at least one blank space.
E.g.
As in the earlier examples.
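A minimal list input sketch, with made-up names and values: each value on a data line is separated from the next by at least one blank, and the INPUT statement simply lists the variables in the order in which the values appear.

```sas
data temp;
   input id name $ age;   /* no columns or informats: values are found by scanning for blanks */
   datalines;
101 shiva 38
102 ravi 35
;
run;
```

With plain list input a character value cannot contain blanks and is limited to 8 characters by default, which is the gap that column input fills.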
2) COLUMN INPUT METHOD
This method can read character values that are longer than 8 characters and values that contain embedded blanks, because each variable is read from fixed columns.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat, a width followed by a period, that specifies how many columns to read for that variable.
data one;
   input id 3. name $ 11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
/* The delimiter character was lost in the transcript; ':' is assumed here for illustration. */
data nums;
   infile datalines dsd delimiter=':';
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;
run;
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;
Thanks
Feedback at infoSitworldin
PROCEDURE STEP BLOCK
Syntax
proc <procedure-name>;
   statement-1;
   statement-2;
   statement-n;
run;
E.g.
proc print data=temp;
run;
proc sort data=temp out=samp;
   by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
By default, SAS reads every variable in an INPUT statement as numeric. To read character data, use the $ symbol after the variable name.
A character variable can hold up to 32,767 characters.
SAS reads data values observation by observation.
The number of observations in a SAS data set depends on the system configuration or hard disk space.
In output, SAS shows a period (.) for a missing numeric value and a blank space for a missing character value.
Different Data bases
Best db storage
Text, Excel, Access, DB2, Oracle, Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer > Right click > New
2. Through programming
In the Editor window, submit a LIBNAME statement:
libname <name-of-library> '<path>';
Example
libname Hari "D:\Ganesh";
To delete (clear) a library reference
libname Guru clear;
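Once a libref is assigned, data sets in that library are addressed with a two-level name, libref.dataset. The sketch below is illustrative only: the libref mylib, the data set scores, and the sample rows are made up, and the path is a placeholder that must be replaced with a folder that exists on your system.

```sas
/* placeholder path: replace with a real folder before running */
libname mylib 'your-folder-path';

data mylib.scores;          /* two-level name: stored permanently in the library */
   input name $ score;
   datalines;
hari 81
ravi 77
;
run;

proc print data=mylib.scores;
run;

libname mylib clear;        /* removes the libref; the stored data set file stays on disk */
```

A one-level name such as scores would instead go to the temporary WORK library and disappear at the end of the session.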
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965
Thanks
Feedback at infoSitworldin
Eg:
proc print data=temp;
run;
proc sort data=temp out=samp;
   by name;
run;
Data Types in SAS System
1) Numeric data (0-9)
2) Character data (A-Z)
- By default, SAS reads every variable as numeric. To read character data, put the $ symbol after the variable name.
- A character variable can store up to 32767 characters.
- SAS reads data values observation by observation.
- The number of observations in a SAS data set depends on the system configuration or hard disk space.
- In output, a SAS data set shows a period (.) for a missing numeric value and a blank space for a missing character value.
Different Databases
Best db storage
Text, Excel, Access, DB2, Oracle, Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer
|
Right click
|
New
2. Through programming
In the Editor window:
LIBNAME <name of library> "<path>";
Example
LIBNAME Hari "D:\Ganesh";
To delete (clear) a library
LIBNAME Guru CLEAR;
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- INPUT BUFFER
- PROGRAM DATA VECTOR
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during execution. It takes 2 values:
i) _ERROR_=0 means that no error was encountered
ii) _ERROR_=1 means that an error, such as invalid input data, was encountered
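A minimal sketch of how the two automatic variables can be watched, assuming made-up data in which the second record holds a value that is not a valid number. The PUT statement writes the variables to the SAS log on every iteration:

```sas
data _null_;                 /* no output data set is needed */
   input x;
   put _n_= x= _error_=;     /* write the automatic variables to the log */
   datalines;
10
abc
30
;
run;
```

On the second iteration the value abc cannot be read as a number, so x should be set to missing and _ERROR_ to 1 for that iteration only. _N_ simply counts 1, 2, 3 as the step iterates.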
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
[Figure: the input buffer and the program data vector after DATA step compilation]
[Figure: position of the pointer in the input buffer before SAS reads data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.
[Figure: position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]
[Figure: program data vector with the computed value of the Sum statement]
Writing an Observation to the SAS Data Set
[Figure: the first observation in the output SAS data set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase, SAS:
- Checks the code for errors
- Translates the code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set
Execution
During the execution phase, SAS:
- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
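Word Scanner
[Figure: statements move from the input stack to the word scanner and then to the compiler]
Input stack:
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
[Figure: the tokens "data temp" have been taken from the top of the input stack by the word scanner for the compiler; the remaining statements wait in the input stack]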
The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
- $ indicates a character format; its absence indicates a numeric format.
- format names the format.
- w specifies the format width, which for most formats is the number of columns in the output data.
- d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
- $ indicates a character informat; its absence indicates a numeric informat.
- informat names the informat.
- w specifies the informat width.
- d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax: option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, then the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in the standard format, then the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat    Format
12-07-78     DDMMYY8.    DDMMYY8.
21-09-2005   DDMMYY10.   DDMMYY10.
22Jan89      DATE7.      DATE7.
22jan1989    DATE9.      DATE9.
Numeric Informats
Numeric      Informat    Format
25,000       COMMA6.     COMMA6.
$3,000       DOLLAR6.    DOLLAR6.
25,000       COMMA6.     WORDS6.
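The pairing of an informat for reading and a format for writing can be sketched as follows, assuming a hypothetical data set named pay with made-up salaries. COMMA6. removes the comma while the value is read, and DOLLAR8. puts a dollar sign and comma back when the value is printed:

```sas
data pay;                        /* hypothetical data set name */
   input name $ salary comma6.;  /* read 6 columns, ignoring the comma */
   format salary dollar8.;       /* write with a dollar sign and comma */
   datalines;
hari 25,000
ravi 3,000
;
run;

proc print data=pay;
run;
```

The stored values are the plain numbers 25000 and 3000; the format changes only how they are displayed.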
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file path\filename.extension';
   input <var 1> <var 2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. In this example, SAS processes observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
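A minimal sketch of the IN= option in a match-merge. The data sets club_a and club_b, their variables, and the flag names inA and inB are all hypothetical and chosen only for illustration:

```sas
data club_a;                 /* hypothetical input data set */
   input id name $;
   datalines;
1 hari
2 ravi
;
run;

data club_b;                 /* hypothetical input data set */
   input id team $;
   datalines;
2 red
3 yellow
;
run;

data both;
   merge club_a (in=inA) club_b (in=inB);
   by id;                    /* both inputs are already in id order */
   if inA and inB;           /* keep only ids found in both data sets */
run;
```

Here inA and inB are set to 1 when the corresponding data set contributes to the current observation and to 0 otherwise, so only id 2 is written to the data set both. The IN= variables are temporary and are not stored in the output data set.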
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
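A minimal sketch of such a step, assuming a hypothetical data set named person and two made-up data lines:

```sas
data person;                 /* hypothetical data set name */
   input name $ dept $;      /* two character variables */
   datalines;
John Sales
Mary Acctng
;
run;
```

The single semicolon on its own line after the last data line marks the end of the data.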
DELETE Statement
Stops processing the current observation. The observation is not written to the output data set.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
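A minimal sketch of the FORMAT statement, assuming a hypothetical data set named sales with made-up amounts. The DOLLAR10.2 format is attached to the variable amount, so every procedure that prints it uses that format:

```sas
data sales;                  /* hypothetical data set name */
   input item $ amount;
   format amount dollar10.2; /* associate a format with the variable */
   datalines;
pen 1145.32
ink 250
;
run;

proc print data=sales;
run;
```

The values are stored as plain numbers; the format affects only how they are written.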
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. A line of four semicolons (;;;;) marks the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
46 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965
Thanks
Feedback at infoSitworldin
- PowerPoint Presentation
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
- Slide 48
- Slide 49
- Slide 50
- Slide 51
- Slide 52
- Slide 53
- Slide 54
- Slide 55
- Slide 56
- Slide 57
- Slide 58
- Slide 59
- Slide 60
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Slide 66
- Slide 67
- Slide 68
- Slide 69
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Slide 76
- Slide 77
- Slide 78
- Slide 79
- Slide 80
- Slide 81
- Slide 82
- Slide 83
- Slide 84
- Slide 85
- Slide 86
- Slide 87
- Slide 88
- Slide 89
- Slide 90
- Slide 91
- Slide 92
- Slide 93
- Slide 94
-
Data Types in SAS System
1)Numerical Data(0-9)2)Character Data (A-Z)
SAS System by default reads both numeric and character data as numeric onlyTo read character data Use $ symbolThe no of variables in SAS System is up to 32767 charactersSAS Reads Data values observation by observationThe no of observations in SAS Data set depends on system configuration or hard disk spaceIn SAS data set for missing value in output it shows a period() for missing value and ldquoblank space ldquo for character value
Different Data bases
Best db storage
TextExcelAccessDB2OracleTera Data
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library that contains the DBMS file. In this example, a libref is declared that points to a library containing Oracle data, and SAS reads data from an Oracle table into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from the input data set.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are excluded rather than which ones are included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies the last observation that SAS processes. This example reads only the first three observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Combined with OBS=4, this example reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable. Here Name is written to the output data set as Pname.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a temporary variable that indicates whether the data set contributed data to the current observation.
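The slides give no example for IN=. The following is a minimal sketch, assuming two small hypothetical data sets (members and payments) that share the key IdNumber; the data set names, the variable names inMembers and inPayments, and the values are made up for illustration.

```sas
/* Hypothetical data sets, used only for illustration */
data members;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data payments;
   input IdNumber Amount;
   datalines;
1023 50
1219 75
;
run;

/* Both inputs are already in IdNumber order, which MERGE with BY requires */
data paid_members;
   merge members (in=inMembers) payments (in=inPayments);
   by IdNumber;
   /* IN= variables are temporary: 1 if that data set contributed, otherwise 0 */
   if inMembers and inPayments;
run;
```

Only IdNumber 1023 and 1219 appear in both inputs, so paid_members holds those two observations. The IN= variables themselves are not written to the output data set.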
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow. CARDS is an alias for DATALINES.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
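The code for this slide did not survive in the transcript. A minimal sketch of what the description implies might look like this; the data set name person and the data values are assumptions.

```sas
data person;
   /* Two character variables read with list input */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation, and the lone semicolon after the last data line marks the end of the data.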
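DELETE Statement
Stops processing the current observation. The observation is not written to the output data set, and SAS returns to the top of the DATA step for the next iteration.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables. The INFORMAT statement is placed before the INPUT statement so that the informat is in effect when hiredate is read.
data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;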
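The FORMAT statement has no example of its own in the slides. As a hedged sketch, the same data can be read with an informat and displayed with a format; the data set name hires is an assumption.

```sas
data hires;
   informat hiredate mmddyy8.;  /* how the raw value is read   */
   format   hiredate date9.;    /* how the stored value prints */
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;

proc print data=hires;
run;
```

The informat turns 120105 into a SAS date value (a count of days), and the format only controls how that number is displayed. With the default YEARCUTOFF= setting the first date should print as 01DEC2005.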
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input.
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method, the data values must be separated by at least one blank.
E.g.: see the earlier examples that read values separated by blanks (for example, data two1).
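A minimal list-input sketch, assuming a hypothetical data set named club with single-word names:

```sas
data club;
   /* List input: values are read in order, separated by blanks */
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1023 David red 189 165
1049 Amelia yellow 145 124
1219 Alan red 210 192
;
run;
```

With plain list input, a character value cannot contain embedded blanks and is stored with a default length of 8 bytes, which is why column or formatted input is used for longer names.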
2) COLUMN INPUT METHOD
In this method, each variable is read from a fixed range of columns, so character values can be longer than 8 characters and can contain embedded blanks.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for its value.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this method, column pointer controls (@n and +n) give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
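Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
/* DSD: values are separated by commas */
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
/* DELIMITER= names the separator; the character was lost in the source, a colon is assumed here */
data nums;
   infile datalines dsd delimiter=':';
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;
/* MISSOVER: variables without a value on a short line are set to missing */
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;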
Thanks
Feedback at infoSitworldin
Different Databases
Best db storage
- Text
- Excel
- Access
- DB2
- Oracle
- Teradata
LIBRARIES
There are 2 ways of creating libraries:
1. Menu driven
2. Programming (coding)
1. Menu driven
Explorer
|
Right click
|
New
2. Through programming
Editor window
LIBNAME <name of library> '<path>';
Example
LIBNAME Hari "D:\Ganesh";
To delete a library (clear the libref)
LIBNAME Guru CLEAR;
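As a hedged sketch of how a libref is used once it is assigned: the libref mylib, the folder C:\sasdata, and the data set names are hypothetical, and the folder is assumed to exist already.

```sas
/* 'C:\sasdata' is a hypothetical folder that must already exist */
libname mylib 'C:\sasdata';

/* A two-level name stores the data set permanently in that folder */
data mylib.club;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

/* WORK is the temporary library; it is cleared when the session ends */
data work.club_copy;
   set mylib.club;
run;

/* CLEAR removes the libref only; the files stay on disk */
libname mylib clear;
```

Clearing a libref does not delete any data sets; it only breaks the association between the name and the folder for the rest of the session.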
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- INPUT BUFFER
- PROGRAM DATA VECTOR
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during execution. It has two possible values:
i) _ERROR_=0 means that no error has been encountered.
ii) _ERROR_=1 means that an error (such as invalid input data) has occurred.
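One way to watch the two automatic variables is to write them to the log with a PUT statement. This is a minimal sketch; the data set name check and the deliberately invalid value abc are assumptions made for illustration.

```sas
data check;
   input id score;
   /* Writes the automatic variables and the data values to the log */
   put _N_= _ERROR_= id= score=;
   datalines;
1 90
2 abc
3 75
;
run;
```

On the second iteration the value abc cannot be read as a number, so score is set to missing and _ERROR_ becomes 1 for that iteration; it is reset to 0 when the next iteration starts. Neither automatic variable is written to the output data set.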
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
[Figure: the input buffer and the program data vector after DATA step compilation]
[Figure: position of the pointer in the input buffer before SAS reads data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.
[Figure: position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]
[Figure: program data vector with the computed value of the Sum statement]
Writing an Observation to the SAS Data Set
[Figure: the first observation in the output SAS data set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
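The difference between a retained and a reset variable can be seen in a small sketch. The data set name running and the variable names are hypothetical; total is built with a Sum statement and plain with an ordinary assignment statement.

```sas
data running;
   input points;
   /* Sum statement: total starts at 0 and is retained across iterations */
   total + points;
   /* Assignment statement: plain is reset to missing at the top of each iteration */
   plain = sum(plain, points);
   datalines;
6
9
7
;
run;

proc print data=running;
run;
```

The variable total accumulates to 6, 15, and 22, while plain is only 6, 9, and 7, because it is set back to missing before each new record is read.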
Compilation
During the compilation phase, SAS:
- Checks the code for errors
- Translates the code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set
Execution
During the execution phase, SAS:
- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
Input Stack, Word Scanner, and Compiler
[Figure: the submitted program waits in the input stack; the word scanner passes tokens such as DATA TEMP to the compiler]
Input stack:
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
"The process that SAS uses to extract words and symbols from the input stack to the word scanner is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
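To make the four token types concrete, here is a small sketch that annotates one statement with its tokens in a comment. The variable names and values are made up for illustration.

```sas
data _null_;
   cash = 1200;
   /* Tokens in the next statement, in the order the word scanner finds them:
      x      name
      =      special
      max    name
      (      special
      cash   name
      ,      special
      1500   number
      )      special
      ;      special  */
   x = max(cash, 1500);
   /* The quoted text below is a single literal token */
   put 'Larger value:' x;
run;
```

Blanks between tokens are not tokens themselves; they only mark where one token ends and the next begins.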
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
In this example, the INFORMAT statement is placed before the INPUT statement so that the informat is in effect when edate is read.
data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
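The COMMA11. case described above can be tried directly. This is a minimal sketch; the data set name sales and the variable name amount are assumptions.

```sas
data sales;
   /* COMMA11. strips the dollar sign and commas while reading */
   input amount comma11.;
   /* The stored value is the plain number 1000000 */
   put amount=;
   /* A format puts the dollar sign and commas back for display only */
   put amount= dollar12.;
   datalines;
$1,000,000
;
run;
```

The first PUT statement shows that the informat changed only how the raw text was read; the variable holds an ordinary numeric value that can be used in calculations.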
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1 <, ...argument-n>)
x=max(cash, credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of statements: INPUT, INPUT (list), PUT, DATALINES, DO, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, that is, a form that SAS can read without an informat, then the data is called standard data.
E.g. 467
NON STANDARD DATA
If the data values are not in a standard format, then the data is called non-standard data.
E.g.
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
LIBRARIES
There are 2 ways of creating libraries
1Menu driven
2Programming coding
1Menu driven
Explorer
|
Right click
|
New
2Through programming
Editor window
LIBNAME ltName of librarygt ltpathgt
LIBNAME Hari ldquoDGaneshrdquo
Example
To delete Library
Libname Guru clear
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1 <, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of SAS statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
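The tables above pair informats (for reading) with formats (for writing). A minimal sketch of how the two work together in one DATA step follows; the data set name pay_demo, the variable names, and the data values are hypothetical and chosen only for illustration. The colon (:) modifier lets each informat read up to the next blank.

```sas
/* Sketch: read non-standard numeric and date values with informats, */
/* then attach formats so the stored numbers print readably.         */
data pay_demo;                          /* hypothetical data set name */
   input name $ salary :comma7. hired :date9.;
   format salary dollar8. hired ddmmyy10.;
   datalines;
hari 25,000 22jan1989
ravi $3,000 12jul1978
;
run;

proc print data=pay_demo;
run;
```

The informats strip the commas and the dollar sign so that salary is stored as a plain number, and hired is stored as a SAS date value. The formats only control how the values are displayed.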
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file path/filename.extension';
   input <var 1> <var 2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. The most commonly used options are described here.
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations. OBS=3 stops after the third observation.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Combined with OBS=, the following step reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
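The IN= option has no example in the slides, so here is a minimal sketch of a typical use: keeping only the observations that appear in both data sets of a merge. The data set club_fees, its variable Fee, its values, and the names inClub and inFees are hypothetical; weight_club is the data set created earlier.

```sas
/* Hypothetical second data set keyed by IdNumber */
data club_fees;
   input IdNumber Fee;
   datalines;
1023 50
1049 45
9999 60
;
run;

/* MERGE with a BY statement requires both inputs sorted by the key */
proc sort data=weight_club; by IdNumber; run;
proc sort data=club_fees;   by IdNumber; run;

data matched;
   merge weight_club (in=inClub) club_fees (in=inFees);
   by IdNumber;
   /* inClub and inFees are 1 when that data set contributed to the row */
   if inClub and inFees;
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to the output data set.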
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
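The code for this example did not survive in the slides. A minimal sketch that matches the description might look like this; the data set name person and the data values are assumptions made for illustration.

```sas
/* Sketch: read two character variables, NAME and DEPT, from in-stream data */
data person;                 /* hypothetical data set name */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

The single semicolon on the line after the last data line marks the end of the data.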
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
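The slides give no example for the FORMAT statement. The following is a small sketch that reuses the weight_club data set from earlier; the output data set name club_report and the computed variable Loss are hypothetical.

```sas
/* Sketch: associate display formats with numeric variables */
data club_report;                          /* hypothetical data set name */
   set weight_club;
   Loss = StartWeight - EndWeight;         /* hypothetical computed variable */
   format StartWeight EndWeight 5.1 Loss 4.;
run;

proc print data=club_report;
run;
```

The formats are stored with the data set, so any later procedure that prints these variables uses them unless it specifies its own.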
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. Four consecutive semicolons mark the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values must be separated by at least one space.
Eg: as in the earlier examples that read blank-separated values, such as the testlength DATA step above.
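For reference, a minimal list-input sketch is shown here. The data set name list_demo and the data values are hypothetical.

```sas
/* Sketch: list input reads blank-separated values in order */
data list_demo;              /* hypothetical data set name */
   input id name $ age;
   datalines;
101 hari 29
102 ravi 31
;
run;
```

With list input, character values are stored with a default length of 8 bytes and cannot contain embedded blanks, which is why the column method that follows is used for longer names.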
2) Column INPUT METHOD
In this method character data values can be longer than 8 characters and can also contain blank spaces. Each variable is read from the columns named after it.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for its data values.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the same record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
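data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
   infile datalines dsd delimiter='<delimiter>';
   input X Y Z;
   datalines;
1<delimiter>2<delimiter>3
4<delimiter>5<delimiter>6
7<delimiter>8<delimiter>9
;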
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
Thanks
Feedback at infoSitworldin
2. Through programming
In the Editor window:
LIBNAME <name of library> "<path>";
Example
LIBNAME Hari "D:\Ganesh";
To delete a library (clear the libref)
LIBNAME Guru CLEAR;
RULES FOR SAS STATEMENTS
There are only a few rules for writing SAS statements:
- SAS statements end with a semicolon.
- You can enter SAS statements in lowercase, uppercase, or a mixture of the two.
- You can begin SAS statements in any column of a line and write several statements on the same line.
- You can begin a statement on one line and continue it on another line, but you cannot split a word between two lines.
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- INPUT BUFFER
- PROGRAM DATA VECTOR
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during the current iteration. It has two possible values:
i) _ERROR_=0 means no error was encountered.
ii) _ERROR_=1 means an error, such as an input data error, was encountered.
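One way to watch the two automatic variables is to write them to the log on every iteration. The sketch below is only an illustration; the data set name check and the data values are hypothetical, and the second record contains a deliberately invalid numeric value.

```sas
/* Sketch: print _N_ and _ERROR_ for each iteration of the DATA step */
data check;                  /* hypothetical data set name */
   input id age;
   put _N_= _ERROR_= id= age=;
   datalines;
101 29
102 xx
103 31
;
run;
```

On the second iteration the value xx cannot be read as a number, so age is set to missing and _ERROR_ is 1. On the other iterations _ERROR_ is 0. Neither automatic variable is written to the output data set.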
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example above), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
[Figure: the input buffer and the program data vector after DATA step compilation]
[Figure: position of the pointer in the input buffer before SAS reads data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.
[Figure: position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]
[Figure: program data vector with computed value of the Sum statement]
Writing an Observation to the SAS Data Set
[Figure: the first observation in the output SAS data set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase, SAS:
Checks the code for errors
Translates the code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the program data vector
Assigns required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase, SAS:
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
Word Scanner, Compiler, and Input Stack
[Figure: a submitted program waits in the input stack; the word scanner takes it one token at a time and passes the tokens to the compiler. In the second panel, "data temp" has already left the input stack.]
Program in the input stack:
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
The process that SAS uses to extract words and symbols from the input stack is called tokenization. Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ; $ ( ) . & % =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
RULES FOR SAS STATEMENTSThere are only a few rules for writing SAS statements
_ SAS statements end with a semicolon_ You can enter SAS statements in lowercase uppercase or a mixture of the two_ You can begin SAS statements in any column of a line and write several statements on the same line_ You can begin a statement on one line and continue it on another line but you cannot split a word between two lines
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names variable names and other items The following rules apply
_ A SAS name can contain from one to 32 characters_ The first character must be a letter or an underscore (_)_ Subsequent characters must be letters numbers or underscores_ Blanks cannot appear in SAS names
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
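Input Stack, Word Scanner, and Compiler
The submitted program first sits in the input stack:
data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;
The word scanner takes tokens from the top of the input stack and passes them to the compiler. After the first statement (data temp;) has been scanned, the input stack holds the rest of the program, starting with the INPUT statement.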
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four classes of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =
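As an illustration of the four token classes, the comments in the sketch below classify the tokens of a single IF statement. The statement and the variable names are made up for this example; the classification follows the definitions given above.

```sas
/* Tokens of:  if name = 'Hari' then count + 1;                   */
/*   if       name token                                          */
/*   name     name token                                          */
/*   =        special token                                       */
/*   'Hari'   literal token                                       */
/*   then     name token                                          */
/*   count    name token                                          */
/*   +        special token                                       */
/*   1        number token                                        */
/*   ;        special token (ends the statement)                  */
data _null_;
   length name $8;
   name = 'Hari';
   if name = 'Hari' then count + 1;   /* the statement classified above */
   put count=;                        /* writes count=1 to the log      */
run;
```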
What Are the SAS Language Elements?
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
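A short sketch of several data set options on one data set, separated by a space as described above. It assumes the SASHELP.CLASS sample data set that ships with SAS; the output name first_five is illustrative.

```sas
/* Two data set options on the same input data set, separated by a space. */
data first_five;
   set sashelp.class (keep=Name Age obs=5);   /* read two variables, stop after observation 5 */
run;

proc print data=first_five;
run;
```

Because the options are attached to the input data set, they affect only what the SET statement reads from SASHELP.CLASS, not any other data set in the step.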
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
- $ indicates a character format; its absence indicates a numeric format.
- format names the format.
- w specifies the format width, which for most formats is the number of columns in the output data.
- d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
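To show that a format changes only how a value is written, not the stored value, here is a small sketch that writes one number with three different formats. The value is illustrative; DOLLARw.d, COMMAw.d and w.d are standard SAS formats.

```sas
/* One stored value, three ways of writing it to the log. */
data _null_;
   amount = 1145.32;
   put amount dollar10.2;   /* dollar sign, comma, two decimals: $1,145.32 */
   put amount comma10.2;    /* comma and two decimals:            1,145.32 */
   put amount 8.1;          /* plain numeric, one decimal:        1145.3   */
run;
```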
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
- $ indicates a character informat; its absence indicates a numeric informat.
- informat names the informat.
- w specifies the informat width.
- d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
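The COMMA11. case described above can be tried directly. This is a minimal sketch; the data set name money and the variable name amount are illustrative.

```sas
/* Read a value that contains a dollar sign and commas into a numeric variable. */
data money;
   input amount comma11.;   /* the informat strips the $ and the commas */
   put amount=;             /* writes amount=1000000 to the log         */
   datalines;
$1,000,000
;
run;
```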
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
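The two function calls above are fragments; a complete step that runs them might look like the following sketch. The values assigned to cash and credit are made up for illustration.

```sas
/* Call two functions and write their return values to the log. */
data _null_;
   cash   = 250;
   credit = 400;
   x = max(cash, credit);   /* largest of the arguments: 400   */
   y = sqrt(1500);          /* square root, roughly 38.73      */
   put x= y=;
run;
```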
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of DATA step statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
- option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, that is, one that SAS can read without an informat (such as plain numbers), then the data is called standard data.
Eg: 467
NON STANDARD DATA
If data values are not in a standard format, then the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data temp;
   set weight_club;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file path\filename.extension';
   input <var 1> <var 2> ...;
run;
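The template above cannot run as written, so here is a self-contained sketch of the same idea. To avoid assuming any real path, it first writes two raw records to a temporary external file and then reads them back with INFILE. The fileref rawfile, the data set name from_file, and the records are all illustrative.

```sas
filename rawfile temp;        /* temporary external file managed by SAS */

data _null_;                  /* step 1: create the raw data file       */
   file rawfile;
   put '1023 David 189';
   put '1049 Amelia 145';
run;

data from_file;               /* step 2: read it with INFILE and INPUT  */
   infile rawfile;
   input IdNumber Name $ StartWeight;
run;

proc print data=from_file;
run;
```

With a real file, the INFILE statement would name the path in quotation marks instead of a fileref.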
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from the input data set.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies the number of the last observation that SAS processes.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
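A point that is easy to miss: OBS= names the last observation to process, not a count, so FIRSTOBS=2 with OBS=4 reads observations 2, 3 and 4. The sketch below assumes the SASHELP.CLASS sample data set that ships with SAS; the output name is illustrative.

```sas
/* FIRSTOBS= and OBS= together select a range of observation numbers. */
data rows_2_to_4;
   set sashelp.class (firstobs=2 obs=4);   /* reads 3 observations: numbers 2, 3 and 4 */
run;

proc print data=rows_2_to_4;
run;
```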
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
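The slides give no example for IN=, so the following is a hedged sketch of a typical use in a MERGE. The data sets club and paid, their contents, and the variable names inClub, inPaid and HasPaid are all made up for illustration.

```sas
data club;                      /* members, sorted by IdNumber            */
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data paid;                      /* payments, sorted by IdNumber           */
   input IdNumber Fee;
   datalines;
1023 50
1219 75
;
run;

data status;
   merge club (in=inClub) paid (in=inPaid);
   by IdNumber;
   HasPaid = inPaid;            /* IN= variables are temporary, so copy the flag to keep it */
run;
```

In the result, HasPaid is 1 for members 1023 and 1219 and 0 for member 1049, whose Fee is missing because PAID contributed nothing to that observation.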
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
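The example that this description refers to did not survive in the transcript. A minimal sketch that matches the description (two character variables, NAME and DEPT, read from each data line) could look like this; the data set name person and the data values are illustrative.

```sas
data person;
   input name $ dept $;   /* two character variables per data line        */
   datalines;
John Sales
Mary Acctng
;
run;
```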
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values must be separated by at least one blank space.
Eg: same as the earlier examples that read blank-separated values.
2) Column INPUT METHOD
In this method each variable is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign (variable=value).
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
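Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
The DSD option in the INFILE statement reads comma-delimited values; the DELIMITER= option names a different delimiter.
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
   infile datalines dsd delimiter='|';   /* the delimiter character is missing in the source; '|' is assumed */
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
The MISSOVER option sets the remaining variables to missing when a data line has fewer values than the INPUT statement expects, instead of reading the next line.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;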
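For contrast with MISSOVER, this sketch reads the same kind of short record without the option, so the default FLOWOVER behaviour applies. The data set name weather_flow is illustrative, and the values are written here with decimal points.

```sas
/* Without MISSOVER, a short record makes INPUT continue on the next line. */
data weather_flow;
   infile datalines;            /* default behaviour is FLOWOVER */
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;
```

This step should create only two observations: the first takes temp4 and temp5 from the start of the second line and discards the rest of that line, and the log notes that SAS went to a new line. With MISSOVER the same data gives three observations, the first with temp4 and temp5 missing.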
Thanks
Feedback at infoSitworldin
RULES FOR MOST SAS NAMES
SAS names are used for SAS data set names, variable names, and other items. The following rules apply:
- A SAS name can contain from one to 32 characters.
- The first character must be a letter or an underscore (_).
- Subsequent characters must be letters, numbers, or underscores.
- Blanks cannot appear in SAS names.
DATA STEP PROCESSING
The DATA step is one of the basic building blocks of SAS programming. It creates the data sets that are used in a SAS program's analysis and reporting procedures.
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement. The DATA statement begins the process of building a SAS data set and names the data set. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, then the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action.
DURING THE COMPILE PHASE
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS further processes the code and creates the following two items:
- INPUT BUFFER
- PROGRAM DATA VECTOR
INPUT BUFFER
The input buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement.
PROGRAM DATA VECTOR (PDV)
The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here SAS writes the values to a SAS data set as a single observation.
The PDV contains two automatic variables:
1) _N_ counts the number of times the DATA step has begun to iterate.
2) _ERROR_ signals whether an error occurred during execution. It has two possible values:
i) _ERROR_=0 means that no error has been encountered.
ii) _ERROR_=1 means that an error, such as an input data error, has been encountered.
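The two values of _ERROR_ can be seen by feeding one invalid value to a numeric variable. This is a minimal sketch; the variable name x and the data lines are illustrative.

```sas
/* The second record is not a valid number, so _ERROR_ is set to 1 for that iteration only. */
data _null_;
   input x;
   put _N_= x= _ERROR_=;
   datalines;
1
abc
3
;
run;
```

For the second record the log should report invalid data for x and show x as missing with _ERROR_=1; on the third iteration _ERROR_ is back to 0.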
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
The following figure shows the input buffer and the program data vector after DATA step compilation.
DATA STEP PROCESSINGThe DATA step is one of the basic building blocks of SAS programming It creates the data sets that are used in a SAS programrsquos analysis and reporting procedures
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the input buffer and the program data vector after DATA step compilation.
Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
Figure: Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
Figure: The First Observation in Data Set TOTAL_POINTS
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.
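The difference between a retained Sum statement variable and an ordinary assignment can be sketched as follows. This is an illustrative example only; the variable names running and plain are hypothetical and the scores are taken from the first data line of the example above. Because plain is reset to missing at the start of every iteration, adding to it would be expected to give a missing value each time, while running accumulates.

```sas
data _null_;
   input score;
   running + score;          /* Sum statement: starts at 0 and is retained */
   plain = plain + score;    /* assignment: plain is reset to missing each iteration */
   put _N_= score= running= plain=;
   datalines;
6
8
8
;
run;
```

SAS also writes a note about missing values being generated for the assignment statement.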
Compilation
During the compilation phase SAS
Checks the code for syntax errors
Translates the code to machine code
Establishes an area of memory called the input buffer if reading raw data
Establishes an area of memory called the program data vector
Assigns required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
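Word Scanner
Figure: Input stack, word scanner and compiler. The input stack holds the submitted program:
Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;
Figure: The word scanner has taken the first statement (Data temp) from the input stack and passes it to the compiler. The remaining statements are still in the input stack.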
"The process that SAS uses to extract words and symbols from the input stack to the word scanner is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:
Literal: a string of characters enclosed in quotation marks
Number: digits, date values, time values, and hexadecimal numbers
Name: a string of characters beginning with an underscore or letter
Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements?
Data set options
Informats and formats
Functions
Statements
SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1<...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
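As a minimal sketch of the case just described (the variable name amount is an assumption chosen for illustration), the following step reads the value with the COMMA11. informat and writes the stored number to the log:

```sas
data _null_;
   input amount comma11.;   /* strips the dollar sign and commas while reading */
   put amount=;             /* the stored value is a plain number */
   datalines;
$1,000,000
;
run;
```

No format is attached to amount here, so the log shows the value as it is stored, without the dollar sign or commas.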
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   informat edate ddmmyy8.;   /* declared before INPUT so that list input uses it */
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of DATA step statements:
INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in a standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     Date7.      Date7.
22jan1989   Date9.      Date9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
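The attributes listed above can be set explicitly with declarative statements and then inspected. The sketch below is illustrative only: the data set name attr_demo, the variable names and the label text are assumptions, and the data values are borrowed from the date example earlier in this section.

```sas
data attr_demo;
   length name $ 12;                 /* type (character) and length */
   informat hired ddmmyy8.;          /* informat used when reading */
   format hired date9.;              /* format used when writing */
   label name='Employee name'
         hired='Hire date';          /* descriptive labels */
   input name $ hired;
   datalines;
hari 21-10-84
ravi 22-11-86
;
run;

proc contents data=attr_demo;        /* lists the attributes of each variable */
run;
```

PROC CONTENTS reports the name, type, length, format, informat and label stored in the descriptor portion of the data set.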
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file path\filename.extension';
   input <var 1> <var 2> ... ;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. In this example SAS reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
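A common use of IN= is to keep only the observations that are found in both data sets of a merge. The following is a sketch under stated assumptions: the data sets start_weights and end_weights and the flag variables inStart and inEnd are hypothetical names, and the values are borrowed from the weight_club example. Both inputs are already in IdNumber order, which the BY statement requires.

```sas
data start_weights;
   input IdNumber StartWeight;
   datalines;
1023 189
1049 145
1219 210
;
run;

data end_weights;
   input IdNumber EndWeight;
   datalines;
1023 165
1219 192
1246 177
;
run;

data both;
   merge start_weights (in=inStart) end_weights (in=inEnd);
   by IdNumber;
   if inStart and inEnd;   /* keep only ids that occur in both data sets */
run;
```

The IN= variables exist only in the program data vector during the step; they are not written to the output data set.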
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
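The code for this example did not survive in the transcript. A minimal sketch of what such a step could look like is shown here; the data set name person and the two data lines are assumptions made for illustration.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation, and the single semicolon on its own line marks the end of the data.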
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   informat hiredate mmddyy8.;   /* declared before INPUT so that list input uses it */
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. Four consecutive semicolons mark the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method the data values must be separated by at least one blank.
Eg: as in the earlier examples that read values separated by blanks.
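A small list input example can be sketched as follows. The data set name list_demo, the variable names and the values are made up for illustration. A single period stands in for a missing numeric value so that the remaining values stay in step with their variables.

```sas
data list_demo;
   input name $ age height;   /* values are separated by blanks */
   datalines;
ravi 20 5.6
rani 19 5.2
sita . 5.4
;
run;

proc print data=list_demo;
run;
```

With list input, character values are limited to 8 characters by default and cannot contain embedded blanks unless a LENGTH statement or modifier is used.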
2) COLUMN INPUT METHOD
In this method values are read from fixed column positions, so character data values can contain more than 8 characters and can also contain blank spaces.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for its data values.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method column pointer controls (@n and +n) give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing at sign (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
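The DSD option in the INFILE statement reads delimited values, with a comma as the default delimiter. The DELIMITER= option names a different delimiter.
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
   /* the delimiter character is missing from the source; a vertical bar is assumed here */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;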
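The MISSOVER option prevents the INPUT statement from reading the next data line when a line has fewer values than expected. The remaining variables are set to missing.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;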
Thanks
Feedback at infoSitworldin
OVERVIEW OF THE DATA STEP
The DATA step consists of a group of SAS statements that begins with a DATA statement The DATA statement begins the process of building a SAS data set and names the data set The statements that make up the DATA step are compiled and the syntax is checked If the syntax is correct then the statements are executed In its simplest form the DATA step is a loop with an automatic output and return action
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965
Thanks
Feedback at infoSitworldin
- PowerPoint Presentation
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
- Slide 48
- Slide 49
- Slide 50
- Slide 51
- Slide 52
- Slide 53
- Slide 54
- Slide 55
- Slide 56
- Slide 57
- Slide 58
- Slide 59
- Slide 60
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Slide 66
- Slide 67
- Slide 68
- Slide 69
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Slide 76
- Slide 77
- Slide 78
- Slide 79
- Slide 80
- Slide 81
- Slide 82
- Slide 83
- Slide 84
- Slide 85
- Slide 86
- Slide 87
- Slide 88
- Slide 89
- Slide 90
- Slide 91
- Slide 92
- Slide 93
- Slide 94
-
DURING THE COMPILE PHASE
When you submit a DATA step for execution SAS checks the syntax of the SAS statements and compiles them that is automatically translates the statements into machine code SAS further processes the code and creates the following two items
INPUT BUFFER PROGRAM DATA VECTOR
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled, SAS determines whether to create an input buffer. If the input file contains raw data (as in the example below), SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV).
data total_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal + (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;
[Figure: the input buffer and the program data vector after DATA step compilation]
[Figure: Position of the Pointer in the Input Buffer Before SAS Reads Data]
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values.
[Figure: the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record]
[Figure: Program Data Vector with Computed Value of the Sum Statement]
Writing an Observation to the SAS Data Set
[Figure: The First Observation in Data Set TOTAL_POINTS]
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase, SAS:
- Checks the code for errors
- Translates the code to machine code
- Establishes an area of memory called the input buffer, if reading raw data
- Establishes an area of memory called the program data vector
- Assigns required attributes to variables
- Creates the descriptor portion of the new data set
Execution
During the execution phase, SAS:
- Initializes the PDV to missing
- Reads data values into the PDV
- Carries out assignment statements and conditional processing
- Writes the observation in the PDV to the output SAS data set at the end of the DATA step
- Returns to the top of the DATA step
- Initializes any variables that are not read from SAS data sets to missing
- Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
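The iteration behavior described above can be traced with a PUT statement. This is a sketch that reuses the first three records of the total_points example; the data set name trace_points and the PUT line are additions for illustration and are not part of the original slides.

```sas
/* Sketch: trace each iteration of the DATA step in the log.            */
/* trace_points is a hypothetical name; the data come from the example. */
data trace_points (drop=TeamName);
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   /* Sum statement: TeamTotal is retained across iterations. */
   TeamTotal + (Event1 + Event2 + Event3);
   put _N_= ParticipantName= TeamTotal=;
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
;
run;
```

With these three records, TeamTotal should read 22, 46, and 67 on successive iterations, while the variables read by INPUT are reset to missing at the top of each iteration.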
Word Scanner
[Figure: submitted statements move from the input stack through the word scanner to the compiler]
Input stack:
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
"The process that SAS uses to extract words and symbols from the input stack to the word scanner is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It classifies each token as one of the following:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);

data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
- $ indicates a character format; its absence indicates a numeric format.
- format names the format.
- w specifies the format width, which for most formats is the number of columns in the output data.
- d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
- $ indicates a character informat; its absence indicates a numeric informat.
- informat names the informat.
- w specifies the informat width.
- d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   /* INFORMAT is placed before INPUT so it is in effect when edate is read */
   informat edate ddmmyy8.;
   input ename $ edate;
   format edate date9.;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Commonly used DATA step statements include INPUT, LIST, PUT, DATALINES, DO, iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, and RETAIN.
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax: option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, the data is called standard data.
E.g. 467
NON STANDARD DATA
If the data values are not in a standard format, the data is called non-standard data.
E.g.
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
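The informat/format pairs in the tables above can be tried together in one step. This is a minimal sketch: the data set name pay and the variables salary and bonus are hypothetical, and the colon modifier (:) is used so that list input applies each informat to a blank-delimited value.

```sas
/* Hypothetical example: pay, salary, and bonus are illustrative names. */
data pay;
   /* Informats strip the comma and dollar sign while reading. */
   input salary :comma6. bonus :dollar6.;
   /* Formats put them back when the values are written. */
   format salary comma6. bonus dollar6.;
   put salary= bonus=;
   datalines;
25,000 $3,000
;
run;
```

The values are stored as the plain numbers 25000 and 3000; only the displayed form carries the punctuation.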
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form for reading raw data that is stored in an external file.
data <dataset-name>;
   infile 'your-input-file-path\filename.extension';
   input <var-1> <var-2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. The most common ones are described below.
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Combined with OBS=, the following step reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
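The slides give no example for IN=, so here is a minimal sketch of a typical use. The data sets club_a and club_b, the flag names inA and inB, and all data values are hypothetical and chosen only for illustration.

```sas
/* Hypothetical data sets, already in IdNumber order. */
data club_a;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

data club_b;
   input IdNumber EndWeight;
   datalines;
1049 124
1219 192
;
run;

data both;
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   /* inA and inB are temporary 0/1 flags and are not written to BOTH. */
   /* Keep only the IdNumbers that appear in both input data sets.     */
   if inA and inB;
run;
```

With this input, only IdNumber 1049 appears in both data sets, so BOTH should contain a single observation.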
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (newer name for CARDS)
Indicates that data lines follow.
Using the DATALINES Statement: in this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
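The code for this example did not survive in the slides. A minimal sketch of what such a step looks like follows; the data set name person and the two data lines are assumptions made for illustration.

```sas
/* Sketch: person and the data values are illustrative assumptions. */
data person;
   /* Both variables are character, so each name is followed by $. */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line becomes one observation, and the lone semicolon after the last data line marks the end of the data.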
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   /* INFORMAT is placed before INPUT so it is in effect when hiredate is read */
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
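The FORMAT statement has no example of its own above, so the following sketch pairs it with the INFORMAT statement on the same variable. The data set name hired is hypothetical; the data line is borrowed from the example above with the eid column left out.

```sas
/* Sketch: hired is an illustrative name, not from the slides. */
data hired;
   informat hiredate mmddyy8.;   /* how the raw value is read   */
   input ename $ hiredate;
   format hiredate date9.;       /* how the value is displayed  */
   datalines;
hari 120105
;
run;

proc print data=hired;
run;
```

The informat turns 120105 into a SAS date value, and the format controls only how that value is printed, which under default settings should be 01DEC2005.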
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List input method
2) Column input method
3) Named input method
4) Formatted input method
5) Absolute input method
1) LIST INPUT METHOD
In this method, the data values must be separated by at least one blank space.
Example: same as the earlier list-input examples (such as the total_points DATA step).
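For completeness, list input can be sketched as follows. The data set name list_demo and the data values are made up for illustration; only the form of the INPUT statement matters.

```sas
/* Sketch of list input: values are separated by at least one blank. */
/* list_demo and its data values are hypothetical.                   */
data list_demo;
   input id name $ age;
   datalines;
101 shiva 38
102 ravi  35
;
run;
```

Because list input scans for the next non-blank value, the extra blank in the second data line is harmless. Character values read this way are limited to 8 characters by default and cannot contain embedded blanks.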
2) COLUMN INPUT METHOD
In this method, each value is read from fixed column positions, so character values can be longer than 8 characters and can contain embedded blanks.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat, that is, a width followed by a period, which specifies how many columns to read for that variable in every data line.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this method, column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns forward.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the current record and control the input pointer across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
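/* DSD reads comma-delimited values and treats two consecutive delimiters as a missing value */
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
/* The delimiter character was lost in the source; '|' is assumed here for illustration */
data nums;
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;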
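/* MISSOVER keeps SAS from reading the next line when a record is short; the remaining variables are set to missing */
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;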
Thanks
Feedback at infoSitworldin
- PowerPoint Presentation
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
- Slide 48
- Slide 49
- Slide 50
- Slide 51
- Slide 52
- Slide 53
- Slide 54
- Slide 55
- Slide 56
- Slide 57
- Slide 58
- Slide 59
- Slide 60
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Slide 66
- Slide 67
- Slide 68
- Slide 69
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Slide 76
- Slide 77
- Slide 78
- Slide 79
- Slide 80
- Slide 81
- Slide 82
- Slide 83
- Slide 84
- Slide 85
- Slide 86
- Slide 87
- Slide 88
- Slide 89
- Slide 90
- Slide 91
- Slide 92
- Slide 93
- Slide 94
-
INPUT BUFFERInput buffer is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement
PROGRAM DATA VECTOR (PDV)
Is a logical area in memory where SAS builds a data set oneobservation at a time When a program executes SAS reads data values from the input buffer or creates them by executing SAS language statements The data values are assigned to the appropriate variables in the program data vector From here SAS writes the values to a SAS data set as a single observation
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965
Thanks
Feedback at infoSitworldin
- PowerPoint Presentation
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
- Slide 48
- Slide 49
- Slide 50
- Slide 51
- Slide 52
- Slide 53
- Slide 54
- Slide 55
- Slide 56
- Slide 57
- Slide 58
- Slide 59
- Slide 60
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Slide 66
- Slide 67
- Slide 68
- Slide 69
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Slide 76
- Slide 77
- Slide 78
- Slide 79
- Slide 80
- Slide 81
- Slide 82
- Slide 83
- Slide 84
- Slide 85
- Slide 86
- Slide 87
- Slide 88
- Slide 89
- Slide 90
- Slide 91
- Slide 92
- Slide 93
- Slide 94
-
The PDV contain two automatic variables
1) _N_ It gives information about variables and observations
2) _ERROR_ classified as 2 types
i) If _error_=0 means no error in program
ii) _error_=1 means there are errors in program
Creating the Input Buffer and the Program Data Vector
When DATA step statements are compiled SAS determines whether to create an input buffer If the input file contains raw data (as in the example above) SAS creates an input buffer to hold the data before moving the data to the program data vector (PDV)
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the input buffer and the program data vector after DATA step compilation.
Figure: Position of the pointer in the input buffer before SAS reads data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV, where they become variable values. The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record.
Figure: Program data vector with computed value of the Sum statement
Writing an Observation to the SAS Data Set
Figure: The first observation in the output SAS data set TOTAL_POINTS
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1, and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase SAS
Checks code for syntax errors
Translates code to machine code
Establishes an area of memory called the input buffer if reading raw data
Establishes an area of memory called the program data vector (PDV)
Assigns required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
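The difference between a retained value and one that is reset at the top of each iteration can be sketched as follows. This is an illustration under assumptions, not part of the original slides: the variable names score, RunningTotal and PlainTotal and the three data values are invented for the example.

```sas
data _null_;
   input score;
   RunningTotal + score;               /* Sum statement: value is retained across iterations */
   PlainTotal = PlainTotal + score;    /* assignment: PlainTotal is reset to missing each iteration,
                                          so missing + score stays missing */
   put _N_= score= RunningTotal= PlainTotal=;
   datalines;
6
8
8
;
run;
```

In the log, RunningTotal accumulates from one iteration to the next while PlainTotal remains missing, because only the Sum statement variable survives the reset of the PDV.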
Word Scanner
Figure: The submitted program waits in the input stack. The word scanner takes tokens from the input stack and passes them to the compiler.
Input stack
data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format
format names the format
w specifies the format width, which for most formats is the number of columns in the output data
d specifies an optional decimal scaling factor in the numeric formats
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat
informat names the informat
w specifies the informat width
d specifies an optional decimal scaling factor in the numeric informats
data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of DATA step statements: INPUT, INPUT (List), PUT, DATALINES, DO, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format (plain numbers or character strings that SAS can read without an informat), the data is called standard data.
Eg 467
NON STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
Eg
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date value   Informat    Format
12-07-78     DDMMYY8.    DDMMYY8.
21-09-05     DDMMYY10.   DDMMYY10.
22Jan89      DATE7.      DATE7.
22jan1989    DATE9.      DATE9.
Numeric Informats
Numeric value   Informat   Format
25,000          COMMA6.    COMMA6.
$3,000          DOLLAR6.   DOLLAR6.
25,000          COMMA6.    WORDS6.
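The pairing of informats (for reading) and formats (for writing) in the two tables can be tried out with the INPUT function and the PUT statement. This is a sketch under assumptions: the variable names amount and bdate are invented, and the widths DOLLAR7. and WORDS25. are chosen here only so that the whole value fits.

```sas
data _null_;
   /* the INPUT function applies an informat to a character string */
   amount = input('25,000', comma6.);
   bdate  = input('22jan1989', date9.);

   put amount=;              /* stored numeric value, no format applied */
   put amount comma6.;       /* written with commas */
   put amount dollar7.;      /* written with dollar sign and commas */
   put amount words25.;      /* written out in words */

   put bdate=;               /* stored as a SAS date value (days since 01JAN1960) */
   put bdate ddmmyy10.;
   put bdate date9.;
run;
```

The stored values never change. Only the way they are read or displayed depends on the informat or format.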
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.
SAS variables can have these attributes
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1) RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2) DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file-path/filename.extension';
   input <var 1> <var 2> ...;
run;
3) DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements
- SET statement
- MERGE statement
data temp;
   set weight_club;
run;
4) DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
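OBS=
Specifies the last observation that SAS processes in a data set. The following step reads observations 1 through 3.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Combined with OBS=4, the following step reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;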
RENAME=
Changes the name of a variable
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
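A minimal sketch of the IN= option in a MERGE, assuming two small made-up data sets. The data set names club_a and club_b, the variables inA, inB, FromA and FromB, and the data values are hypothetical and not taken from the slides. Because IN= variables are temporary, the step copies them into ordinary variables so that they appear in the output.

```sas
data club_a;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

data club_b;
   input IdNumber EndWeight;
   datalines;
1049 124
1219 192
;
run;

/* both inputs are already in IdNumber order, as BY-merging requires */
data both;
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   FromA = inA;    /* 1 if club_a contributed to this observation, else 0 */
   FromB = inB;    /* 1 if club_b contributed to this observation, else 0 */
run;
```

With this data, IdNumber 1023 comes only from club_a, 1219 only from club_b, and 1049 from both. A subsetting statement such as if inA and inB; would keep only the matches.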
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow
DATALINES Statement (New version)
Indicates that data lines follow
Using the DATALINES Statement
With the DATALINES statement, SAS can read each data line and assign values to, for instance, two character variables NAME and DEPT for each observation in the DATA step.
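The slide describes a DATALINES example with two character variables but does not show it. A sketch of what such a step could look like follows. The data set name person and the data values are assumptions made for illustration.

```sas
data person;
   input name $ dept $;    /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;
```

The semicolon on its own line ends the data. Replacing DATALINES with CARDS gives the same result.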
DELETE Statement
Stops processing the current observation
if Team="red" then delete;
FORMAT Statement
Associates formats with variables
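A short sketch of the FORMAT statement, assuming a made-up data set. The names pay, ename, hiredate and salary and the data values are hypothetical. The format changes only how the values are displayed, not what is stored.

```sas
data pay;
   input ename $ hiredate :mmddyy8. salary;   /* colon modifier: list input with an informat */
   format hiredate date9. salary dollar10.2;  /* display formats stored with the variables */
   datalines;
hari 12/01/05 2500
ravi 11/03/06 3100.5
;
run;

proc print data=pay;
run;
```

Because the formats are stored in the descriptor portion of the data set, PROC PRINT uses them without a FORMAT statement of its own.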
INFORMAT Statement
Associates informats with variables
data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step fragment demonstrates how to read input data records with column input
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values must be separated by at least one blank space.
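A minimal list input sketch, assuming an invented data set. The name scores_list, its variables and the data values are illustrative only. It shows that any number of blanks may separate the values and that a period stands for a missing numeric value.

```sas
data scores_list;
   input name $ age score;   /* variables are read in the order listed */
   datalines;
ravi 20 81
rani  19   77.5
hari . 64
;
run;
```

With plain list input, character values are limited to 8 characters by default and cannot contain embedded blanks. The column input method in the next section removes both limits.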
2) Column INPUT METHOD
In this method each value is read from fixed column positions, so character values can be longer than 8 characters and can contain blank spaces.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat whose width (ending in a period) gives the number of columns to read for that variable in every data line.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method column pointer controls give the exact location of the data values: @n moves the pointer to column n and +n moves it n columns forward.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
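Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
The DSD option on the INFILE statement reads comma-delimited values. DELIMITER= names a different delimiter.
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
data nums;
   infile datalines dsd delimiter='|';   /* the delimiter character was lost in the source; '|' is assumed */
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;
The MISSOVER option sets the remaining variables to missing when a record is short instead of reading the next record.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;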
Thanks
Feedback at infoSitworldin
data total_points (drop=TeamName) input TeamName $ ParticipantName $ Event1 Event2 Event3 TeamTotal + (Event1 + Event2 + Event3)datalinesKnights Sue 6 8 8Cardinals Jane 9 7 8Knights John 7 7 7Knights Lisa 8 9 9Knights Fran 7 6 6Knights Walter 9 8 10run
The following figure shows the Input Buffer and the program data vector after DATA step compilation
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
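The same value can be written in several ways simply by changing the format. The sketch below is illustrative only; the variable hired and the particular formats chosen are assumptions added for comparison with the DOLLARw.d example.

```sas
data _null_;
   amount = 1145.32;
   hired  = '21oct1984'd;      /* a SAS date value */
   put amount dollar10.2;      /* $1,145.32, right-aligned in 10 columns */
   put amount comma10.2;       /* 1,145.32, right-aligned in 10 columns  */
   put amount 8.1;             /* 1145.3, rounded to one decimal place   */
   put hired date9.;           /* 21OCT1984                              */
   put hired ddmmyy10.;        /* 21/10/1984                             */
run;
```

A format changes only how a value is displayed; the stored value of amount and hired stays the same.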
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   /* INFORMAT precedes INPUT so that list input reads edate with ddmmyy8. */
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
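The $1,000,000 case mentioned above can be tried directly. This is a minimal sketch; the data set name sales and the variable name amount are assumptions made for illustration.

```sas
data sales;
   /* COMMA11. strips the dollar sign and commas while reading */
   input amount comma11.;
   /* The log shows the stored numeric value: amount=1000000 */
   put amount=;
   datalines;
$1,000,000
;
run;
```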
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash, credit);
x=sqrt(1500);
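The two function calls above can be run inside a complete DATA step. In this sketch the values of cash and credit, and the extra ROUND and UPCASE calls, are assumptions added to make the step self-contained.

```sas
data _null_;
   cash   = 250;
   credit = 400;
   x = max(cash, credit);     /* larger of the two arguments: 400 */
   y = sqrt(1500);            /* square root of 1500              */
   z = round(y, 0.01);        /* rounded to two decimals: 38.73   */
   name = upcase('hari');     /* character function: HARI         */
   put x= z= name=;
run;
```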
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Commonly used DATA step statements include INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, and RETAIN.
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
E.g.: 467
NON STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
E.g.:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
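The pairing of informats and formats in the tables above can be sketched in one step that reads non-standard values and then displays them. The data set name pay, the variable names, the second data line, and the use of the colon (:) modifier with list input are assumptions chosen for illustration.

```sas
data pay;
   /* The : modifier applies the informat while still reading up to the next blank */
   input name $ hired : date9. salary : comma6.;
   /* Formats control how the stored numbers are displayed */
   format hired ddmmyy10. salary dollar8.;
   datalines;
hari 22jan1989 25,000
ravi 05mar1990 $3,000
;
run;

proc print data=pay;
run;
```

The informats convert the text into plain numbers (a SAS date value and 25000 or 3000); the formats only affect how PROC PRINT shows them.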
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
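One way to see all six attributes at once is to set them with an ATTRIB statement and then list them with PROC CONTENTS. This is a sketch under assumed names: the data set staff, its variables, and the label text are hypothetical.

```sas
data staff;
   /* name and type come from the variable list; the rest are set explicitly */
   attrib ename    length=$20 label='Employee name'
          hiredate length=8   informat=ddmmyy8. format=date9. label='Hire date';
   input ename $ hiredate;
   datalines;
hari 21-10-84
;
run;

/* Lists name, type, length, format, informat, and label for each variable */
proc contents data=staff;
run;
```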
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data temp;
   set weight_club;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;
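As a concrete sketch of the template above, the weight_club step can read the same records from a file. The path is a placeholder and the fileref name rawdata is an assumption; substitute a real file whose columns match the INPUT statement.

```sas
/* Placeholder path: point this at a real text file before running */
filename rawdata '/path/to/weight_club.txt';

data weight_club;
   /* INFILE names the external file (through the fileref) instead of DATALINES */
   infile rawdata;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;
```

Using a FILENAME statement keeps the path in one place; the path could equally be quoted directly in the INFILE statement.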
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. The most commonly used options are described here.
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
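OBS=
Specifies the last observation that SAS processes. This step reads observations 1 through 3.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Combined with OBS=, this step reads observations 2 through 4.
data s1;
   set weight_club (firstobs=2 obs=4);
run;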
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
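The IN= option is most often used in a match-merge. The following is a minimal sketch; the data sets club_a and club_b, the flag variables inA and inB, and the data values are all hypothetical.

```sas
data club_a;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

data club_b;
   input IdNumber EndWeight;
   datalines;
1049 124
1078 118
;
run;

data both;
   /* inA and inB are temporary 0/1 flags; they are not written to the output */
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   /* Keep only the IDs that appear in both data sets (here, 1049) */
   if inA and inB;
run;
```

Both inputs are already in IdNumber order here; in general they must be sorted by the BY variable before the merge.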
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
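A minimal sketch of such a step follows. The variable names NAME and DEPT come from the description above; the data set name person and the data values are assumptions.

```sas
data person;
   /* Two character variables read with list input */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```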
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   /* INFORMAT precedes INPUT so that list input reads hiredate with mmddyy8. */
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data must be ended with a line of four semicolons.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
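With list input, a character variable gets a default length of 8 bytes, so a value such as Alexander would be truncated in the step above. The sketch below is one possible variation, not the original author's code: it declares the lengths before the INPUT statement and fills name by concatenation. The length of 15 and the assignment statement are assumptions.

```sas
data testlength;
   /* Declare lengths before INPUT so the longer values are not truncated */
   length firstname lastname $15 name $25;
   input firstname $ lastname $ n1 n2;
   /* Build "Robinson, Alexander" from the two parts */
   name = trim(lastname) || ', ' || firstname;
   put name=;
   datalines;
Alexander Robinson 35 11
;
run;
```

The LENGTH statement only takes effect for a variable if it appears before the first statement that would otherwise define that variable.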
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values must be separated by at least one blank.
E.g.:
-do-
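A minimal sketch of list input follows, assuming a hypothetical data set named members. Extra blanks between values are ignored, and a period stands for a missing numeric value.

```sas
data members;
   /* Values are read in order; one or more blanks separate them */
   input id name $ age;
   cards;
101 hari 29
102   ravi   31
103 rama .
;
run;
```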
2) Column INPUT METHOD
In this method each value is read from a fixed range of columns, so character values can be longer than 8 characters and can contain embedded blanks.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable name is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.
data one;
   input id 3. name $ 11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to control the input pointer across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
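data num;
   /* DSD: values are separated by commas */
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
data nums;
   /* DELIMITER= names another separator; '|' is assumed here for illustration */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;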
data weather;
   /* MISSOVER: short records leave the remaining variables missing */
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;
Thanks
Feedback at infoSitworldin
Position of the Pointer in the Input Buffer Before SAS Reads Data
The INPUT statement then reads data values from the record in the input buffer and writes them to the PDV where they become variable values The following figure shows both the position of the pointer in the input buffer and the values in the PDV after SAS reads the first record
Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
The First Observation in Data Set TOTAL_POINTSOutput SAS Data Set TOTAL_POINTS 1st observation
SAS then returns to the DATA statement to begin the next iteration SAS resets the values in the PDV in the following way1048576 The values of variables created by the INPUT statement are set to missing1048576 The value created by the Sum statement is automatically retained1048576 The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0
Compilation
Checks code for errors
Translate code to machine code
Establishes an area of memory called input buffer if reading raw data
Establishes an area of memory called the program Data Vector
Assign required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values in to PDV
Carries out assignment statement and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the data step
Returns to the top of the Data step
Initialize any variable that are not read from SAS data sets to missing
Repeat the process
Each time the DATA statement executes a new iteration of the DATA step begins and the _N_ automatic variableis incremented by 1As SAS continues to read records the value in TeamTotal grows larger as more participant scores are added to the variable _N_ is incremented at the beginning of each iteration of the DATA step This process continues until SAS reaches the end ofthe input file
The DATA step stops executing after it processes the last input record
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
  input name $ Bdate ddmmyy8.;
  format Bdate ddmmyy8.;
  cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
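The informat and format pairs in the tables can be combined in one step. The sketch below is illustrative: the data set name, variable names and the single data line are assumptions, while COMMA6. and DOLLAR6. are used as both informats and formats, as the table suggests.

```sas
/* Sketch: read non-standard numbers with informats, print them with formats. */
data amounts;
  input amt comma6. +1 paid dollar6.;   /* columns 1-6, skip one column, columns 8-13 */
  format amt comma6. paid dollar6.;     /* written form used by PROC PRINT            */
  datalines;
25,000 $3,000
;
run;

proc print data=amounts;
run;
```

The values are stored as plain numbers (25000 and 3000); the FORMAT statement only controls how they are displayed.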
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
name
type
length
informat
format
label
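The attributes listed above can all be set explicitly and then inspected. This is a minimal sketch; the data set name staff, the variable names and the labels are hypothetical.

```sas
/* Sketch: set variable attributes, then list them with PROC CONTENTS. */
data staff;
  length name $ 20;                 /* type (character) and length  */
  informat hired ddmmyy8.;          /* informat used when reading   */
  format hired date9.;              /* format used when writing     */
  label name  = 'Employee name'     /* descriptive labels           */
        hired = 'Hire date';
  input name $ hired;
  datalines;
hari 21-10-84
;
run;

proc contents data=staff;           /* shows name, type, length, format, informat, label */
run;
```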
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
  datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form for reading raw data that is stored in an external file.
data <dataset name>;
  infile 'your-input-file-path/filename.extension';
  input <var 1> <var 2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
SET statement
MERGE statement
data temp;
  set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
  set dblib.employees;
run;
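For the external-file case described above, the following sketch is self-contained: it first writes two records to a temporary file and then reads them back with INFILE. The fileref rawdata and the data set name weight_club_ext are assumptions made for illustration.

```sas
/* Sketch: create a small raw file, then read it with INFILE. */
filename rawdata temp;                     /* temporary external file */

data _null_;
  file rawdata;                            /* write raw records       */
  put '1023 David Shaw      red    189 165';
  put '1049 Amelia Serrano  yellow 145 124';
run;

data weight_club_ext;
  infile rawdata;                          /* same INPUT statement as the in-stream example */
  input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
run;

proc print data=weight_club_ext;
run;
```

In practice the FILENAME statement, or a quoted path in the INFILE statement, would point to an existing file instead of a temporary one.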
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. The options below are among those used most often.
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
  set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
  set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies the last observation that SAS processes, that is, when to stop processing observations.
data s1;
  set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first.
data s1;
  set weight_club (firstobs=2 obs=4);
run;
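Because OBS= names the number of the last observation rather than a count, combining it with FIRSTOBS= can be surprising. A small sketch with a made-up data set (nums and middle are hypothetical names) shows the effect.

```sas
/* Sketch: FIRSTOBS= and OBS= both refer to observation numbers. */
data nums;
  do i = 1 to 6;                   /* six observations, i = 1 to 6 */
    output;
  end;
run;

data middle;
  set nums (firstobs=2 obs=4);     /* reads observations 2, 3 and 4 */
run;

proc print data=middle;            /* three observations are listed */
run;
```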
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
  set weight_club;
run;
PW=
Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
  set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
  set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
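The slides give no example for IN=. A minimal sketch, assuming two small hypothetical data sets that share the key IdNumber, could look like this:

```sas
/* Sketch: IN= flags which data set contributed to each merged observation. */
data club_a;
  input IdNumber Team $;
  datalines;
1023 red
1049 yellow
;

data club_b;
  input IdNumber EndWeight;
  datalines;
1049 124
1219 192
;

data both;
  merge club_a (in=inA) club_b (in=inB);   /* inA and inB are temporary 0/1 variables */
  by IdNumber;                             /* both inputs are already in key order    */
  if inA and inB;                          /* keep only IDs found in both data sets   */
run;
```

Here only IdNumber 1049 appears in both inputs, so the output data set holds a single observation. The IN= variables are not written to the output data set.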
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
  set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
  set weight_club;
  put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
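The code for this example does not survive in the transcript. A step of the kind described, with a hypothetical data set name and made-up data lines, might look like this:

```sas
/* Sketch: two character variables read from in-stream data lines. */
data person;
  input name $ dept $;
  datalines;
John Sales
Mary Acctng
;
run;
```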
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
  input ename $ eid hiredate;
  informat hiredate mmddyy8.;
  datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow.
data biblio;
  input number citation $50.;
  datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
  set weight_club;
  keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
  set weight_club;
  label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
  input firstname $ lastname $ n1 n2;
  length name $25;
  datalines;
Alexander Robinson 35 11
;
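One point the LENGTH example leaves implicit: with simple list input a character variable gets a default length of 8 bytes, so a nine-letter value such as Alexander would be cut short. The sketch below is a variant, not the original slide code; it declares the lengths before the INPUT statement and builds name with the CATX function, which is an added assumption.

```sas
/* Sketch: declare lengths first so long values are not truncated. */
data testlength2;
  length firstname lastname $ 15 name $ 25;   /* must come before INPUT   */
  input firstname $ lastname $ n1 n2;
  name = catx(', ', lastname, firstname);     /* e.g. Robinson, Alexander */
  datalines;
Alexander Robinson 35 11
;
run;

proc print data=testlength2;
run;
```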
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
  input name $ 1-18 score1 25-27 score2 30-32;
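The step above has no data lines in the transcript. A complete version might look like the following sketch; the three records are assumed sample data, laid out so that the name falls in columns 1-18 and the scores in columns 25-27 and 30-32.

```sas
/* Sketch: column input with data lines aligned to the stated columns. */
data scores;
  input name $ 1-18 score1 25-27 score2 30-32;
  datalines;
Joseph                   11   32
Mitchel                  13   29
Sue Ellen                14   27
;
run;

proc print data=scores;
run;
```

Because the columns are fixed, a name such as Sue Ellen can contain a blank without being split into two values.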
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values must be separated by at least one blank space.
Eg:
-do- (as in the earlier examples)
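Since the slide gives only a ditto mark for list input, a short sketch may help. The data set name, variables and data lines here are assumptions for illustration.

```sas
/* Sketch: list input, values separated by one or more blanks. */
data club;
  input name $ team $ weight;    /* $ marks character variables */
  datalines;
David red 189
Amelia   yellow   145
;
run;
```

Extra blanks between values are ignored, but a value cannot itself contain a blank with this simple form of list input.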
2) Column INPUT METHOD
In this method character data values can contain more than 8 characters and can also contain blank spaces.
data temp;
  input id 1-3 name $ 7-18 age 21-22;
  cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign (variable=value).
data samp;
  input id= name= $ age=;
  cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable on every data line.
data one;
  input id 3. name $11. age 3.;
  datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
  input @1 id 3. +5 @10 name $4. +5 @19 age;
  cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input pointer on the current record across iterations of the DATA step.
data test;
  input name $ age @@;
  datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
data num;
  infile datalines dsd;
  input x y z;
  datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
  /* the delimiter character is missing in the source; a colon is assumed here */
  infile datalines dsd delimiter=':';
  input X Y Z;
  datalines;
1:2:3
4:5:6
7:8:9
;
data weather;
  infile datalines missover;
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
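To see what MISSOVER changes, the weather data can be read once more without the option. This is a contrast sketch, not part of the slides; the data set name weather_flow is an assumption.

```sas
/* Sketch: the same data lines read with the default (FLOWOVER) behavior. */
data weather_flow;
  infile datalines;              /* no MISSOVER option */
  input temp1-temp5;
  datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;

proc print data=weather_flow;
run;
```

Without MISSOVER, SAS moves to the next data line to find values for temp4 and temp5 when the first line runs short, so records and observations no longer match one to one. With MISSOVER, the short first line yields an observation in which temp4 and temp5 are missing.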
Thanks
Feedback at infoSitworldin
Figure: Program Data Vector with Computed Value of the Sum Statement
Writing an Observation to the SAS Data Set
Figure: The First Observation in Data Set TOTAL_POINTS (output SAS data set TOTAL_POINTS, 1st observation)
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
The values of variables created by the INPUT statement are set to missing.
The value created by the Sum statement is automatically retained.
The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase SAS
Checks code for errors
Translates code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the Program Data Vector (PDV)
Assigns required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase SAS
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
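The iteration behavior described above can be watched in the log. The sketch below is an assumption-laden illustration: the slides name only TOTAL_POINTS and TeamTotal, so the input variables and the three data lines are made up.

```sas
/* Sketch: a Sum statement is retained across iterations; _N_ counts them. */
data total_points;
  input TeamName $ ParticipantName $ Event1 Event2 Event3;
  TeamTotal + (Event1 + Event2 + Event3);   /* Sum statement: not reset to missing */
  put _N_= TeamTotal=;                      /* one log line per iteration          */
  datalines;
Knights Sue 6 8 8
Kings Jane 9 7 8
Knights John 7 7 7
;
run;
```

The log shows TeamTotal growing from 22 to 46 to 67 while _N_ goes from 1 to 3. The variables read by INPUT are reset to missing at the start of each iteration, but TeamTotal keeps its value.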
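Input Stack, Word Scanner, and Compiler
Figure: the submitted program waits in the input stack; the word scanner takes it from the input stack token by token and passes the tokens to the compiler. In the second panel the tokens Data and temp have already left the input stack.
The program in the input stack:
Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;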
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four kinds of tokens:
Literal: a string of characters enclosed in quotation marks
Number: digits, date values, time values, and hexadecimal numbers
Name: a string of characters beginning with an underscore or letter
Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
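As a rough illustration of the token classes, the comments in the sketch below list the tokens the word scanner would find in two statements. The title text is an assumption; the DATA step repeats the small program from the slide.

```sas
/* Sketch: tokens per statement, listed in the comments. */
data temp;
  input name $ no;
  cards;
Hari 101
;

proc print data=temp;
  title 'Report for May';   /* 3 tokens: name (title), literal ('Report for May'), special (;) */
  var name no;              /* 4 tokens: three names (var, name, no) and one special (;)       */
run;
```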
What Are the SAS Language Elements
Data set options
Informats and formats
Functions
Statements
SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1<...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
  input TName $ PName $ Event1 Event2;
  datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable. Here the variable name is written to the data set two as Pname.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
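The IN= option is easiest to see in a merge. The following is a minimal sketch, not part of the original slides. It assumes that weight_club exists as created elsewhere in this deck; the data set paid and the variable names inClub and inPaid are hypothetical.

```sas
/* Hypothetical lookup data set keyed by IdNumber (already in sorted order) */
data paid;
   input IdNumber;
   datalines;
1023
1219
;
run;

/* MERGE with BY needs both inputs sorted by the BY variable */
proc sort data=weight_club out=club_sorted;
   by IdNumber;
run;

data club_paid;
   merge club_sorted (in=inClub) paid (in=inPaid);
   by IdNumber;
   /* inClub and inPaid are 1 when that data set contributed to the observation */
   if inClub and inPaid;
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to club_paid.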
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (newer form of CARDS)
Indicates that data lines follow.
Using the DATALINES Statement: SAS reads each data line and assigns values to the variables named in the INPUT statement, for example to two character variables NAME and DEPT, once for each observation in the DATA step.
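The slide describes a DATALINES example with the variables NAME and DEPT but the code itself did not survive. A minimal sketch of what such a step could look like follows; the data set name person and the data values are illustrative assumptions.

```sas
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line produces one observation, and the single semicolon on its own line marks the end of the data.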
DELETE Statement
Stops processing the current observation, which is then not written to the output data set.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
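DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. Four consecutive semicolons on a line by themselves mark the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;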
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input.
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
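INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method the data values should be separated by at least a single space.
E.g. the same as the earlier examples that list only variable names in the INPUT statement (for instance, data two1 above).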
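As a minimal sketch of list input (not from the original slides), the step below reads blank-separated values in the order the variables are listed. The data set name club_list is hypothetical; the values are borrowed from the weight_club example used in this deck.

```sas
data club_list;
   /* list input: values are matched to variables by position */
   input IdNumber Team $ StartWeight EndWeight;
   datalines;
1023 red 189 165
1049 yellow 145 124
;
run;

proc print data=club_list;
run;
```

With plain list input a character value cannot contain embedded blanks and is stored with a default length of 8 bytes, which is why names such as "David Shaw" are read with column input elsewhere in the deck.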
2) COLUMN INPUT METHOD
In this method, character data values can contain more than 8 characters and can also contain embedded blanks, because each value is read from a fixed range of columns.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.
data one;
   input id 3. name $ 11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method, column pointer controls give the exact location of the data values: @n moves the pointer to column n, and +n moves it n columns to the right.
data two;
   input @1 id 3. +5 @10 name $ 4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the current record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
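data num;
   infile datalines dsd;   /* DSD: values are separated by commas */
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
   /* the delimiter character was lost in the source; a colon is assumed here */
   infile datalines dsd delimiter=':';
   input X Y Z;
   datalines;
1:2:3
4:5:6
7:8:9
;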
data weather;
   infile datalines missover;   /* MISSOVER: short lines leave the remaining variables missing */
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
Thanks
Feedback at info@Sitworld.in
SAS then returns to the DATA statement to begin the next iteration. SAS resets the values in the PDV in the following way:
- The values of variables created by the INPUT statement are set to missing.
- The value created by the Sum statement is automatically retained.
- The value of the automatic variable _N_ is incremented by 1 and the value of _ERROR_ is reset to 0.
Compilation
During the compilation phase, SAS:
Checks the code for errors
Translates the code to machine code
Establishes an area of memory called the input buffer, if reading raw data
Establishes an area of memory called the program data vector (PDV)
Assigns required attributes to variables
Creates the descriptor portion of the new data set
Execution
During the execution phase, SAS:
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to the variable. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
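The iteration behavior described above can be sketched with a Sum statement. This is an illustrative example rather than the original slide code: TeamTotal comes from the text, while the data set name team_totals, the variables Team, Score, and Iteration, and the data values are assumptions.

```sas
data team_totals;
   input Team $ Score;
   TeamTotal + Score;   /* Sum statement: TeamTotal starts at 0 and is retained */
   Iteration = _N_;     /* copy _N_, which is not written to the output by itself */
   datalines;
red 10
red 12
blue 7
;
run;

proc print data=team_totals;
run;
```

Team and Score are reset to missing at the start of each iteration, while TeamTotal keeps its value, so it holds 10, 22, and 29 in the three observations and Iteration holds 1, 2, and 3.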
Word Scanner
[Diagram: the submitted program sits in the input stack and passes through the word scanner to the compiler. In the second frame the first statement, Data temp, has moved from the input stack to the word scanner.]
data temp;
   input name $ no;
   cards;
Hari 101
;
%let list=name;
proc print data=temp;
   var &list;
run;
"The process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include * / + - ** ; $ ( ) . & % =
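To make the four token types concrete, the short step below annotates each statement with the tokens the word scanner would produce. It is a hand-made illustration (the statements are hypothetical), not output from SAS.

```sas
data _null_;
   /* name: x, sqrt   special: = ( ) + ;   number: 1500, 1 */
   x = sqrt(1500) + 1;
   /* name: put, x    literal: 'x is '     special: ; */
   put 'x is ' x;
run;
```

Blanks separate tokens but are not tokens themselves, and a literal in quotation marks is a single token no matter how many words it contains.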
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of DATA step statements: INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
E.g. 467
NON STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
E.g.
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
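As a minimal sketch of how the numeric rows in the table could be used together (the data set name pay and the variable names bonus and salary are hypothetical), the step below reads both values with informats and prints them with the matching formats.

```sas
data pay;
   /* dollar6. reads columns 1-6, +1 skips the blank, comma6. reads columns 8-13 */
   input bonus dollar6. +1 salary comma6.;
   format bonus dollar6. salary comma6.;
   datalines;
$3,000 25,000
;
run;

proc print data=pay;
run;
```

The informats strip the dollar sign and commas so that 3000 and 25000 are stored as ordinary numbers; the FORMAT statement only controls how they are displayed.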
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data temp;
   set weight_club;
run;
Execution
During the execution phase SAS:
Initializes the PDV to missing
Reads data values into the PDV
Carries out assignment statements and conditional processing
Writes the observation in the PDV to the output SAS data set at the end of the DATA step
Returns to the top of the DATA step
Initializes any variables that are not read from SAS data sets to missing
Repeats the process
Each time the DATA statement executes, a new iteration of the DATA step begins and the _N_ automatic variable is incremented by 1. As SAS continues to read records, the value in TeamTotal grows larger as more participant scores are added to it. This process continues until SAS reaches the end of the input file.
The DATA step stops executing after it processes the last input record.
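The loop described above can be sketched with a small step that adds each score to TeamTotal and writes _N_ to the log on every iteration. This is only an illustration: the data set name team_scores, the variables Participant and Score, and the data values are assumptions; only TeamTotal and _N_ come from the text.

```sas
data team_scores;                 /* hypothetical data set name */
   input Participant $ Score;     /* hypothetical variables */
   TeamTotal + Score;             /* sum statement: TeamTotal is retained across iterations */
   put _N_= Participant= Score= TeamTotal=;   /* one log line per iteration */
   datalines;
Sue 6
Jane 9
John 7
;
run;
```

The sum statement is used here because it keeps TeamTotal from being reset to missing at the top of each iteration; an ordinary assignment statement would not accumulate.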
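Input stack, word scanner and compiler
A submitted program first goes to the input stack:
data temp;
input name $ no;
cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;
The word scanner takes tokens from the top of the input stack and passes them to the compiler. Once the first statement (data temp;) has been passed on, the input stack holds the rest of the program, starting with the INPUT statement.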
"The process that SAS uses to extract words and symbols from the input stack is called tokenization."
Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes these kinds of tokens:
Literal: a string of characters enclosed in quotation marks
Number: digits, date values, time values, and hexadecimal numbers
Name: a string of characters beginning with an underscore or letter
Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
What Are the SAS Language Elements?
Data set options
Informats and formats
Functions
Statements
SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
input TName $ PName $ Event1 Event2;
datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format
format names the format
w specifies the format width, which for most formats is the number of columns in the output data
d specifies an optional decimal scaling factor in the numeric formats
data temp;
amount=1145.32;
put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat
informat names the informat
w specifies the informat width
d specifies an optional decimal scaling factor in the numeric informats
data tmp1;
input ename $ edate;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
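A minimal sketch of the two function calls inside a DATA step is given here, so that the results can be inspected in the log. The values assigned to cash and credit and the variable names larger and root are assumptions made for the example.

```sas
data _null_;                      /* no output data set is needed */
   cash = 250;                    /* hypothetical values */
   credit = 400;
   larger = max(cash, credit);    /* MAX returns the largest argument */
   root = sqrt(1500);             /* SQRT returns the square root */
   put larger= root=;             /* write both results to the log */
run;
```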
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
INPUT (list), PUT, DATALINES, DO (iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change
options nodate linesize=72;
STANDARD DATA
If the data values are in a standard format, the data is called standard data.
E.g. 467
NON STANDARD DATA
If the data values are not in a standard format, the data is called non-standard data.
E.g.
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
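As an illustration of the numeric informats in the table, the sketch below reads values that contain commas and a dollar sign with COMMA6. and prints them with a DOLLAR format. The data set name pay, the variable names, and the DOLLAR8. width are assumptions chosen for the example.

```sas
data pay;                              /* hypothetical data set */
   input name $ salary comma6.;        /* COMMA6. strips commas and dollar signs */
   format salary dollar8.;             /* write the stored number with $ and commas */
   datalines;
hari 25,000
ravi $3,000
;
run;

proc print data=pay;
run;
```

The informat only affects how the raw text is read; the stored values are the plain numbers 25000 and 3000, and the FORMAT statement controls how they are displayed.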
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
name
type
length
informat
format
label
DIFFERENT WAYS TO READ DATA
1)RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2)DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, you do not have to bring that file into the data stream. Use an INFILE statement to specify the file that contains the raw data. The statements that follow show the same kind of DATA step, this time with the raw data stored in an external file.
data <dataset name>;
infile 'your-input-file-path/filename.extension';
input <var 1> <var 2> ...;
run;
3)DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
SET statement
MERGE statement
data temp;
set weight_club;
run;
4)DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
set weight_club (drop=IdNumber Name);
run;
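OBS=
Specifies the last observation that SAS processes, that is, when to stop processing observations
data s1;
set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Here SAS reads observations 2 through 4.
data s1;
set weight_club (firstobs=2 obs=4);
run;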
RENAME=
Changes the name of a variable
data two (rename=(name=Pname));
set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file
data two1 (pw=ram);
set weight_club;
run;
WHERE=
Selects observations that meet the specified condition
data tmp;
set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
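A common use of IN= is to keep only the observations that appear in both data sets of a merge. The following is a sketch under assumed data: the data sets club_a and club_b, the flag variables inA and inB, and all data values are hypothetical.

```sas
data club_a;                       /* hypothetical first data set */
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

data club_b;                       /* hypothetical second data set */
   input IdNumber EndWeight;
   datalines;
1049 124
1219 192
;
run;

data both;
   merge club_a (in=inA) club_b (in=inB);   /* inA/inB are 1 when that data set contributed */
   by IdNumber;                             /* both inputs are already in IdNumber order */
   if inA and inB;                          /* keep IDs present in both; here only 1049 */
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to the output data set.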
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1;
set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
set weight_club;
put Name;
run;
CARDS Statement
Indicates that data lines follow
DATALINES Statement (New version)
Indicates that data lines follow
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step
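The example referred to above is not reproduced on the slide; a step of the kind it describes might look like this. The data set name person and the two data lines are assumptions, while NAME and DEPT are the variables named in the text.

```sas
data person;                  /* hypothetical data set name */
   input name $ dept $;       /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```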
DELETE Statement
Stops processing the current observation
if Team="red" then delete;
FORMAT Statement
Associates formats with variables
INFORMAT Statement
Associates informats with variables
data two1;
input ename $ eid hiredate;
informat hiredate mmddyy8.;
datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.
data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets
data average;
set weight_club;
keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables
data rtest;
set weight_club;
label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables
data testlength;
input firstname $ lastname $ n1 n2;
length name $25;
datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input:
data scores;
input name $ 1-18 score1 25-27 score2 30-32;
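The slide shows only the first two statements of this step. A complete version might look like the sketch below, with data lines laid out so that each value falls in the stated columns; the names and scores are made up for illustration.

```sas
data scores;
   /* name in columns 1-18, score1 in 25-27, score2 in 30-32 */
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph                   11   32
Mitchel                  13   29
Sue Ellen                14   27
;
run;

proc print data=scores;
run;
```

Because the columns are fixed, the embedded blank in Sue Ellen is read as part of the name rather than as a delimiter.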
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values must be separated by at least one blank space
EG
-Do-
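A minimal list input sketch follows. The variable names match those used in the other input-method examples, but the data values here are assumptions.

```sas
data temp;
   input id name $ age;      /* values are read in order, separated by blanks */
   datalines;
101 shiva 38
102 ravi 25
;
run;
```

With list input a character value cannot contain embedded blanks and is limited to 8 characters unless a longer length is declared.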
2)Column INPUT METHOD
In this method each variable is read from fixed columns, so character values can be longer than 8 characters and can contain blank spaces
data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3)NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign, for example id=290 name=ravi age=20
Word Scanner
Compiler
Input stack
Data tempInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Input stackInput name $ noCardsHari 101let list=nameproc print data=tempvar amplistrun
Data temp
compilerWord scanner
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, you do not have to bring that file into the data stream. Use an INFILE statement to specify the file that contains the raw data. The statements that follow show the general form of a DATA step that reads raw data from an external file.
data <dataset-name>;
   infile 'your-input-file-path/filename.extension';
   input <var-1> <var-2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library that contains the DBMS file. In this example, a libref is declared and points to a library that contains Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. The most commonly used options are described below.
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are excluded rather than which ones are included. The following DATA step reads all of the variables from the data set weight_club except those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
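OBS=
Specifies the last observation that SAS processes. This example reads only the first three observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies the first observation that SAS processes. Combined with OBS=4, this example reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;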
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
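A short sketch of how IN= is typically used in a match-merge may help here. The data sets club_a and club_b, the flag variables inA and inB, and the data values are hypothetical and serve only to illustrate the option.

```sas
data club_a;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
;
run;

data club_b;
   input IdNumber Team $;
   datalines;
1049 yellow
1219 red
;
run;

/* Both inputs are already in IdNumber order, as BY-group merging requires. */
data both;
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   /* inA and inB are temporary 0/1 flags; keep only IDs found in both data sets. */
   if inA and inB;
run;
```

The IN= variables exist only during the DATA step and are not written to the output data set, so a value has to be copied into an ordinary variable if it is needed later.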
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
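The code for this example does not appear in the transcript, so the following is only a minimal sketch of what such a step could look like. The data set name person and the two data lines are assumptions.

```sas
data person;
   /* Two character variables read with list input. */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```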
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
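As an illustration, the FORMAT statement might be used as in the sketch below. The data set name sales, the variable names, and the data line are assumptions.

```sas
data sales;
   input item $ amount;
   /* Attach the DOLLAR10.2 format so amount prints with a dollar sign and comma. */
   format amount dollar10.2;
   datalines;
pen 1145.32
;
run;

proc print data=sales;
run;
```

The stored value of amount is still the number 1145.32; the format changes only how it is written.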
INFORMAT Statement
Associates informats with variables.
data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. A line of four semicolons marks the end of the data.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input.
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method, the data values must be separated by at least one blank space.
E.g. same as the examples above (for instance, data two1 under the INFORMAT statement).
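A minimal sketch of list input is shown below. The data set name emp, the variable names, and the data lines are assumptions; the second line uses several blanks between values to show that any number of blanks acts as a separator.

```sas
data emp;
   /* Variables are listed in the order in which the values appear. */
   input ename $ eid age;
   datalines;
hari 101 25
ravi   102   30
;
run;
```

With plain list input, character values are limited to 8 characters and cannot contain embedded blanks, which is the limitation that column input removes.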
2) COLUMN INPUT METHOD
In this method, each value is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat (a width followed by a period) that specifies how many columns to read for that variable in every data line.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method, column pointer controls (@n and +n) give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing at sign (@@) to hold the input pointer on the current record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
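The DSD option in the INFILE statement reads comma-delimited values.
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
The DELIMITER= option names a different delimiter; here <char> stands for the delimiter character.
data nums;
   infile datalines dsd delimiter='<char>';
   input X Y Z;
   datalines;
1<char>2<char>3
4<char>5<char>6
7<char>8<char>9
;
The MISSOVER option prevents the INPUT statement from moving to the next data line when the current line has fewer values than variables; the remaining variables are set to missing.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;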
Thanks
Feedback at infoSitworldin
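Input stack (diagram): the submitted program waits in the input stack; the word scanner reads it and passes tokens to the compiler.
Data temp;
Input name $ no;
Cards;
Hari 101
;
%let list=name;
proc print data=temp;
var &list;
run;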
"The process that SAS uses to extract words and symbols from the input stack is called tokenization." Tokenization is performed by a component of SAS called the word scanner. The word scanner starts at the first character in the input stack and examines each character in turn. It recognizes four types of tokens:
- Literal: a string of characters enclosed in quotation marks
- Number: digits, date values, time values, and hexadecimal numbers
- Name: a string of characters beginning with an underscore or letter
- Special: any character or group of characters that have special meaning to SAS. Examples of special characters include + - $ ( ) & =
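To make the token types concrete, the sketch below annotates two ordinary statements with the tokens that the word scanner would produce. The step itself is a hypothetical example; the comments are the point.

```sas
data _null_;
   total = 40 + 2;
   /* Six tokens: total (name), = (special), 40 (number),
      + (special), 2 (number), ; (special) */
   put 'Total is' total;
   /* Four tokens: put (name), 'Total is' (literal),
      total (name), ; (special) */
run;
```

Note that the quoted string counts as a single literal token, blanks included.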
What Are the SAS Language Elements
- Data set options
- Informats and formats
- Functions
- Statements
- SAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
- Renaming variables
- Selecting only the first or last n observations for processing
- Dropping variables from processing or from the output data set
- Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
(option-1=value-1 <...option-n=value-n>)
These examples show data set options in SAS statements:
data scores (keep=team game1 game2 game3);
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
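The two assignments above can be tried in a complete step, as in this small sketch. The values given to cash and credit and the result variable names are assumptions.

```sas
data _null_;
   cash = 120;
   credit = 450;
   /* MAX returns the largest of its arguments, here 450. */
   larger = max(cash, credit);
   /* SQRT returns the square root of its argument. */
   root = sqrt(1500);
   /* Write both results to the SAS log. */
   put larger= root=;
run;
```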
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
Examples of statements: INPUT, LIST, PUT, DATALINES, DO, Iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965
Thanks
Feedback at infoSitworldin
- PowerPoint Presentation
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
- Slide 48
- Slide 49
- Slide 50
- Slide 51
- Slide 52
- Slide 53
- Slide 54
- Slide 55
- Slide 56
- Slide 57
- Slide 58
- Slide 59
- Slide 60
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Slide 66
- Slide 67
- Slide 68
- Slide 69
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Slide 76
- Slide 77
- Slide 78
- Slide 79
- Slide 80
- Slide 81
- Slide 82
- Slide 83
- Slide 84
- Slide 85
- Slide 86
- Slide 87
- Slide 88
- Slide 89
- Slide 90
- Slide 91
- Slide 92
- Slide 93
- Slide 94
-
ldquoThe process that SAS uses to extract words and symbols from the input stack to word scanner is called tokenizationrdquo Tokenization is performed by a component of SAS called the word scannerThe word scanner starts at the first character in the input stack and examines each character in turnLiteral a string of characters enclosed in quotation marks Number digits date values time values and hexadecimal numbers Name a string of characters beginning with an underscore or letter Special any character or group of characters that have special meaning to SAS Examples of special characters include + - $ ( ) amp =
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Denition of a FormatA format is an instruction that SAS uses to write data values
Syntax of a Format
SAS formats have the following formlt$gtformatltwgtltdgtHere is an explanation of the syntax$indicates a character format its absence indicates a numeric formatformatnames the format
wspecifies the format width which for most formats is the number of columns in theoutput datadspecifies an optional decimal scaling factor in the numeric formats
data tempamount=114532put amount dollar102run
The DOLLARwd format in the PUT statement produces this result
$114532
Informats
Denition of an InformatAn informat is an instruction that SAS uses to read data values into a variable For example the following value contains a dollar sign and commas
$1000000
To remove the dollar sign ($) and commas () before storing the numeric value 1000000 in a variable read this value with the COMMA11 informat
Syntax of an Informat
SAS informats have the following formlt$gtinformatltwgtltdgt
Here is an explanation of the syntax$indicates a character informat its absence indicates a numeric informatinformatnames the informatwspecifies the informat widthdspecifies an optional decimal scaling factor in the numeric informats
data tmp1input ename $ edate informat edate ddmmyy8format edate date9cardshari 101007run
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather infile datalines missover input temp1-temp5 datalines 979 981 983 986 992 991 985 975 962 973 983 976 965
Thanks
Feedback at infoSitworldin
- PowerPoint Presentation
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
- Slide 48
- Slide 49
- Slide 50
- Slide 51
- Slide 52
- Slide 53
- Slide 54
- Slide 55
- Slide 56
- Slide 57
- Slide 58
- Slide 59
- Slide 60
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Slide 66
- Slide 67
- Slide 68
- Slide 69
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Slide 76
- Slide 77
- Slide 78
- Slide 79
- Slide 80
- Slide 81
- Slide 82
- Slide 83
- Slide 84
- Slide 85
- Slide 86
- Slide 87
- Slide 88
- Slide 89
- Slide 90
- Slide 91
- Slide 92
- Slide 93
- Slide 94
-
What Are the SAS Language Elements
Data set optionsInformats and formatsFunctionsStatementsSAS system options
Definition of Data Set Option
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
Renaming variables
Selecting only the first or last n observations for processing
Dropping variables from processing or from the output data set
Specifying a password for a data set
Syntax for Data Set Options
Specify a data set option in parentheses after a SAS data set name To specify several data set options separate them with spaces
(option-1=value-1ltoption-n=value-ngt)
These examples show data set options in SAS statements
data scores (keep=team game1 game2 game3)
data points (Keep=Event1 Event2) input TName $ PName $ Event1 Event2 datalinesKnights Sue 6 8 Cardinals Jane 9 7 Knights John 7 7 Knights Lisa 8 9 run
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   informat edate ddmmyy8.;
   format edate date9.;
   input ename $ edate;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, then the data is called standard data.
E.g. 467
NON STANDARD DATA
If the data values are not in the standard format, then the data is called non-standard data.
E.g.
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date         Informat     Format
12-07-78     DDMMYY8.     DDMMYY8.
21-09-05     DDMMYY10.    DDMMYY10.
22Jan89      DATE7.       DATE7.
22jan1989    DATE9.       DATE9.
Numeric Informats
Numeric      Informat     Format
25,000       COMMA6.      COMMA6.
$3,000       DOLLAR6.     DOLLAR6.
25,000       COMMA6.      WORDS6.
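As a rough illustration of how numeric informats are applied, the sketch below reads values that contain commas and dollar signs. The data set name pay, the variable names, and the data lines are assumptions made up for this example. The colon (:) modifier tells SAS to read each value up to the next blank while still applying the informat.

```sas
/* Minimal sketch: data set name, variables, and values are hypothetical */
data pay;
   /* :comma6. and :dollar6. strip commas and dollar signs while reading */
   input ename $ salary :comma6. bonus :dollar6.;
   /* formats only control how the stored numbers are displayed */
   format salary comma6. bonus dollar6.;
   datalines;
hari 25,000 $3,000
ravi 18,500 $1,250
;
run;

proc print data=pay;
run;
```

The values are stored as plain numbers (for example 25000 and 3000); the FORMAT statement only affects how they are printed.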
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
name
type
length
informat
format
label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file path\filename.extension';
   input <var 1> <var 2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
SET statement
MERGE statement
data Temp;
   set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations, that is, the number of the last observation to process.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Combined with OBS=4, the following step reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
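A minimal sketch of the IN= option in a match-merge follows. The data sets club_a and club_b, the flag variables inA and inB, and the data lines are hypothetical names and values chosen for this illustration. The flag variables are temporary: they are available during the DATA step but are not written to the output data set.

```sas
/* Hypothetical input data sets, both already in IdNumber order */
data club_a;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
1219 Alan
;
run;

data club_b;
   input IdNumber Team $;
   datalines;
1023 red
1219 red
1246 yellow
;
run;

data both;
   /* inA is 1 when club_a contributed to the current observation, else 0 */
   merge club_a (in=inA) club_b (in=inB);
   by IdNumber;
   /* keep only the IdNumber values that appear in both data sets */
   if inA and inB;
run;
```

With these assumed inputs, only IdNumber 1023 and 1219 occur in both data sets, so only those two observations pass the subsetting IF.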
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _null_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
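A minimal sketch of such a DATA step is shown here. The data set name person and the two data lines are illustrative assumptions; only the variable names NAME and DEPT come from the description.

```sas
/* Sketch: data set name and data values are assumed for illustration */
data person;
   /* the $ after each name marks the variable as character */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation, and the single semicolon on its own line marks the end of the data.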
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
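As a rough illustration of the FORMAT statement, the sketch below attaches a currency format and a date format to two numeric variables. The data set name sales, the variable names, and the data lines are hypothetical.

```sas
/* Sketch: names and values are assumptions made up for this example */
data sales;
   /* :date9. reads a value such as 22jan1989 as a SAS date */
   input item $ amount saledate :date9.;
   /* the formats change how values are displayed, not how they are stored */
   format amount dollar10.2 saledate ddmmyy10.;
   datalines;
pen 1145.32 22jan1989
ink 250 05mar1990
;
run;

proc print data=sales;
run;
```

Because the FORMAT statement is in the DATA step, the formats are stored with the data set and are used by later procedures such as PROC PRINT.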
INFORMAT Statement
Associates informats with variables.
data two1;
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method, the data values must be separated by at least one blank space.
E.g. as in the previous examples.
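A minimal sketch of list input follows. The data set name scores_list, the variable names, and the data lines are hypothetical. Values only need to be separated by one or more blanks, and a single period stands for a missing numeric value.

```sas
/* Sketch: names and values are assumptions made up for this example */
data scores_list;
   /* variables are read in order, left to right, separated by blanks */
   input name $ score1 score2;
   datalines;
hari 78 91
ravi   85 67
rani 90 .
;
run;
```

With plain list input, character values are limited to 8 characters by default and cannot contain embedded blanks, which is where the column and formatted methods become useful.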
2) COLUMN INPUT METHOD
In this method, each variable is read from fixed columns, so character data values can be longer than 8 characters and can also contain blank spaces.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign (variable=value).
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat, a width followed by a period, that specifies how many columns to read for that variable in every data line.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method, column pointer controls (@n and +n) give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
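The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
The DSD option in the INFILE statement reads delimited values: the default delimiter is a comma, and two consecutive delimiters indicate a missing value. The DELIMITER= option specifies a different delimiter.
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
   infile datalines dsd delimiter='&';
   input X Y Z;
   datalines;
1&2&3
4&5&6
7&8&9
;
The MISSOVER option prevents the INPUT statement from reading the next data line when a line has fewer values than variables; the remaining variables are set to missing.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;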
Thanks
Feedback at infoSitworldin
data points (keep=Event1 Event2);
   input TName $ PName $ Event1 Event2;
   datalines;
Knights Sue 6 8
Cardinals Jane 9 7
Knights John 7 7
Knights Lisa 8 9
;
run;
Formats and Informats
Definition of a Format
A format is an instruction that SAS uses to write data values.
Syntax of a Format
SAS formats have the following form:
<$>format<w>.<d>
Here is an explanation of the syntax:
$ indicates a character format; its absence indicates a numeric format.
format names the format.
w specifies the format width, which for most formats is the number of columns in the output data.
d specifies an optional decimal scaling factor in the numeric formats.
data temp;
   amount=1145.32;
   put amount dollar10.2;
run;
The DOLLARw.d format in the PUT statement produces this result:
$1,145.32
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
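A minimal sketch of reading that value with the COMMA11. informat could look like the following. The data set name money and the variable name amount are made up for illustration; the PUT statement writes the stored value to the log so the effect of the informat can be checked.

```sas
data money;
   /* COMMA11. removes the dollar sign and commas while reading */
   input amount comma11.;
   /* write the stored numeric value to the log */
   put amount=;
   datalines;
$1,000,000
;
run;
```

The variable holds the plain number 1000000; the dollar sign and commas are not stored.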
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat.
informat names the informat.
w specifies the informat width.
d specifies an optional decimal scaling factor in the numeric informats.
data tmp1;
   input ename $ edate;
   informat edate ddmmyy8.;
   format edate date9.;
   cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
INPUT, LIST, PUT, DATALINES, DO, Iterative DO, DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change.
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
   infile 'your-input-file path/filename.extension';
   input <var 1> <var 2> ...;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
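The IN= option is most often seen when data sets are combined. The sketch below is illustrative only: the data sets teams_a and teams_b and the flag variables inA and inB are hypothetical names, and both data sets are assumed to be sorted by IdNumber.

```sas
data teams_a;
   input IdNumber Team $;
   datalines;
1023 red
1049 yellow
;
run;

data teams_b;
   input IdNumber EndWeight;
   datalines;
1049 124
1219 192
;
run;

data both;
   /* inA and inB are temporary 0/1 flags; they are not written to the output */
   merge teams_a (in=inA) teams_b (in=inB);
   by IdNumber;
   /* keep only observations that have a match in both data sets */
   if inA and inB;
run;
```

Here only IdNumber 1049 appears in both inputs, so it is the only observation kept in the data set both.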
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
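A minimal sketch of such a step could look like the following. The data set name person and the data values are illustrative assumptions; only the variable names NAME and DEPT come from the description above.

```sas
data person;
   /* the $ after each name marks the variable as character */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation, with the first value assigned to name and the second to dept.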
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
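As a hedged illustration of the FORMAT statement, the sketch below attaches the DOLLAR10.2 format to a numeric variable. The data set name pay and the variables ename and salary are hypothetical.

```sas
data pay;
   input ename $ salary;
   /* DOLLAR10.2 controls how salary is written, not how it is stored */
   format salary dollar10.2;
   datalines;
hari 1145.32
ravi 2500
;
run;

proc print data=pay;
run;
```

Because the format is assigned in the DATA step, it is stored with the data set and is used by later procedures such as PROC PRINT.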
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
run;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values should be separated by at least a single space.
EG
-Do-
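A minimal sketch of list input, assuming a hypothetical data set emp with the variables id, name, and age, might look like this:

```sas
data emp;
   /* values are read in order; blanks separate one value from the next */
   input id name $ age;
   datalines;
101 shiva 38
102 ravi 41
103 rama 29
;
run;
```

With plain list input, a character value cannot contain embedded blanks and is stored with a default length of 8 bytes, which is why the column method that follows is used for longer names.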
2) Column INPUT METHOD
In this method character data values can contain more than 8 characters and can also contain blank spaces.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by a width and a period (an informat), which specifies the number of columns to read for that variable in every data line.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to control the input pointer across iterations of the DATA step.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
run;
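data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;
run;
data nums;
   /* '|' is an assumed delimiter, chosen here for illustration */
   infile datalines dsd delimiter='|';
   input X Y Z;
   datalines;
1|2|3
4|5|6
7|8|9
;
run;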
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
run;
Thanks
Feedback at infoSitworldin
Informats
Definition of an Informat
An informat is an instruction that SAS uses to read data values into a variable. For example, the following value contains a dollar sign and commas:
$1,000,000
To remove the dollar sign ($) and commas (,) before storing the numeric value 1000000 in a variable, read this value with the COMMA11. informat.
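The COMMA11. case can be sketched as a short DATA step. The data set name money and the variable name amount are assumptions made for illustration:

```sas
/* Hypothetical example: money and amount are illustrative names */
data money;
   /* COMMA11. reads 11 columns and strips the dollar sign and commas */
   input amount comma11.;
   datalines;
$1,000,000
;
run;

/* amount is stored as the plain numeric value 1000000 */
proc print data=money;
run;
```

The informat only controls how the value is read. To show the dollar sign and commas again in output, a format such as DOLLAR10. would have to be attached to the variable.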
Syntax of an Informat
SAS informats have the following form:
<$>informat<w>.<d>
Here is an explanation of the syntax:
$ indicates a character informat; its absence indicates a numeric informat
informat names the informat
w specifies the informat width
d specifies an optional decimal scaling factor in the numeric informats
data tmp1;
input ename $ edate;
informat edate ddmmyy8.;
format edate date9.;
cards;
hari 101007
;
run;
Functions
Definition of Functions
A SAS function performs a computation or system manipulation on arguments and returns a value.
Syntax of Functions
The syntax of a function is as follows:
function-name (argument-1<, ...argument-n>)
x=max(cash,credit);
x=sqrt(1500);
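The two function calls above can be tried in a DATA step that creates no data set. The values given to cash and credit are assumptions chosen only for illustration:

```sas
/* Hypothetical values for cash and credit; _NULL_ means no data set is written */
data _null_;
   cash = 1200;
   credit = 800;
   larger = max(cash, credit);   /* returns the largest of its arguments */
   root = sqrt(1500);            /* returns the square root of its argument */
   put larger= root=;            /* writes both results to the SAS log */
run;
```

Writing the results with PUT is a convenient way to check what a function returns before using it in a larger program.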
Statements
Definition of Statements
A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators. All SAS statements end with a semicolon.
INPUT (List), PUT, DATALINES, DO (Iterative), DO UNTIL, DO WHILE, SELECT, DROP, MERGE, SET, FILE, LENGTH, Sum, END, OUTPUT, KEEP, DATA, RETAIN
SAS System Options
System options are instructions that affect your SAS session.
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement is:
OPTIONS option(s);
Here is an explanation of the syntax:
option specifies one or more SAS system options that you want to change
options nodate linesize=72;
STANDARD DATA
If the data values are in the standard format, the data is called standard data.
Eg: 467
NON STANDARD DATA
If the data values are not in the standard format, the data is called non-standard data.
Eg:
18-10-05
45,000
$21,000
Informats are used to read non-standard data.
data dates;
input name $ Bdate ddmmyy8.;
format Bdate ddmmyy8.;
cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data; it defines the variables for the data set that come from the raw data.
SAS variables can have these attributes:
- name
- type
- length
- informat
- format
- label
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
infile 'your-input-file-path/filename.extension';
input <var 1> <var 2> ...;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
set weight_club;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example, a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies when to stop processing observations.
data s1;
set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. In this example, SAS reads observations 2 through 4.
data s1;
set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation.
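A minimal sketch of the IN= option in a match-merge. The data sets club_a and club_b, their variables, and their values are all assumptions made for illustration:

```sas
/* Hypothetical input data sets, both already in order of id */
data club_a;
   input id name $;
   datalines;
1 hari
2 ravi
;
run;

data club_b;
   input id team $;
   datalines;
2 red
3 yellow
;
run;

/* in_a and in_b are temporary variables: 1 when that data set */
/* contributed to the current observation, 0 otherwise          */
data both;
   merge club_a (in=in_a) club_b (in=in_b);
   by id;
   if in_a and in_b;   /* keep only ids that are present in both data sets */
run;

proc print data=both;
run;
```

The variables created by IN= are not written to the output data set. They exist only while the DATA step runs, which makes them suitable for this kind of filtering.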
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
set weight_club;
put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example, SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
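A minimal sketch of such a step, assuming a data set named person and two sample data lines (both are illustrative and do not come from the slides):

```sas
/* Hypothetical example: the data set name and the data values are illustrative */
data person;
   input name $ dept $;   /* two character variables read with list input */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```

Each data line produces one observation, and the semicolon on its own line marks the end of the data.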
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
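A minimal sketch of the FORMAT statement, assuming a hypothetical data set named payments with made-up values:

```sas
/* Hypothetical example: data set name, variable names, and values are illustrative */
data payments;
   /* The colon applies the DDMMYY8. informat while still using list input */
   input ename $ amount paydate :ddmmyy8.;
   /* FORMAT changes only how the values are displayed, not what is stored */
   format amount dollar10.2 paydate date9.;
   datalines;
hari 25000 21-10-84
ravi 3000 22-11-86
;
run;

proc print data=payments;
run;
```

A format placed in a DATA step is stored with the data set, so later procedures display the variables the same way unless another format is specified.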
INFORMAT Statement
Associates informats with variables.
data two1;
input ename $ eid hiredate;
informat hiredate mmddyy8.;
datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. Four consecutive semicolons mark the end of the data.
data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
set weight_club;
keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
set weight_club;
label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables. Place the LENGTH statement before the INPUT statement: with list input, a character variable otherwise gets a default length of 8 bytes, which would truncate Alexander.
data testlength;
length firstname lastname $15 name $25;
input firstname $ lastname $ n1 n2;
datalines;
Alexander Robinson 35 11
;
run;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
input name $ 1-18 score1 25-27 score2 30-32;
Functions
Denition of FunctionsA SAS function performs a computation or system manipulation on arguments and returns a value
Syntax of Functions
The syntax of a function is as follows
function-name (argument-1ltargument-ngt)
x=max (cashcredit)x=sqrt(1500)
Statements
Definition of StatementsA SAS statement is a series of items that may include keywords SAS names special characters and operatorsAll SAS statements end with a semicolon
INPUT ListPUT DATALINESDO Iterative DO Until DO While SELECT DROPMERGE SETFILE LENGTHSum ENDOUTPUT KEEP DATA RETAIN
SAS System Options
System options are instructions that affect your SAS session
Syntax of SAS System Options
The syntax for specifying system options in an OPTIONS statement isOPTIONS option(s)Here is an explanation of the syntaxoptionspecifies one or more SAS system options that you want to change
options nodate linesize=72
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
Data Temp;
Set weight_club;
Run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements in the code that follows demonstrate the same example, this time showing that the raw data is stored in an external file.
data <dataset name>;
infile 'your-input-file path/filename.extension';
input <var 1> <var 2> ……;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team from weight_club.
Data samp;
Set weight_club (Keep= IdNumber Team);
Run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1.
Data A1;
Set weight_club (Drop= IdNumber Name);
Run;
OBS=
Specifies when to stop processing observations.
data s1;
set weight_club(obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first.
data s1;
set weight_club(obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
set weight_club;
run;
PW=
Assigns a READ, WRITE, or ALTER password to a SAS file and enables access to a password-protected SAS file.
data two1 (Pw=ram);
set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
Data tmp;
set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
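A minimal sketch of IN= in a match-merge might look like the following. The data sets club_a and club_b, their variables, and the flag names inA, inB, fromA, and fromB are hypothetical and chosen only for illustration.

```sas
/* Hypothetical input data sets, both already in IdNumber order */
data club_a;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
;
run;

data club_b;
   input IdNumber Team $;
   datalines;
1049 yellow
1219 red
;
run;

data both;
   merge club_a(in=inA) club_b(in=inB);
   by IdNumber;
   /* IN= variables are temporary and are not written to the output, */
   /* so copy them into ordinary variables to inspect them           */
   fromA = inA;
   fromB = inB;
run;
```

Each IN= variable is 1 when its data set contributed to the current observation and 0 otherwise, so a subsetting statement such as if inA and inB; would keep only the IdNumber values found in both data sets.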
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
set weight_club;
put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
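A minimal sketch of such a DATA step might look like the following. The data set name person and the data values are assumptions chosen for illustration.

```sas
/* Hypothetical example: data set name and values are illustrative only */
data person;
   /* Both variables are character, so each is followed by $ */
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation, and the single semicolon on its own line marks the end of the data.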
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
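As a short sketch, a FORMAT statement could be used to display a date that is read with an informat. The data set name hires, the variable names, and the data values below are hypothetical.

```sas
/* Hypothetical example: names and values are illustrative only */
data hires;
   /* The informat converts the text into a SAS date value */
   input ename $ hiredate :mmddyy8.;
   /* The format controls how that stored value is displayed */
   format hiredate date9.;
   datalines;
hari 12-01-05
ravi 11-03-06
;
run;

proc print data=hires;
run;
```

Without the FORMAT statement the hire dates would print as the underlying numeric date values.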
INFORMAT Statement
Associates informats with variables.
data two1;
informat hiredate mmddyy8.;
input ename $ eid hiredate;
datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons follow.
data biblio;
input number citation $50.;
datalines4;
6 1988
2 LIN ET AL, 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
set weight_club;
keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
set weight_club;
label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
input firstname $ lastname $ n1 n2;
length name $25;
datalines;
Alexander Robinson 35 11
;
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
input name $ 1-18 score1 25-27 score2 30-32;
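The DATA step above stops before any data lines. As a sketch of how it could be completed, the version below adds a DATALINES block whose values are made up for illustration. The only requirement is that each value sits inside the column range named in the INPUT statement.

```sas
/* Data values are hypothetical; positions follow the column ranges in INPUT */
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
   datalines;
Joseph                  11   32
Mitchel                 13   29
Sue Ellen               14   27
;
run;
```

Because the name field spans columns 1-18, a value with an embedded blank such as Sue Ellen is read as a single value.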
INPUT METHODS
1) List INPUT METHOD
2) Column INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) List INPUT METHOD
In this method the data values should be separated by at least one space.
EG
-Do-
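A minimal sketch of list input, assuming a hypothetical data set emp with made-up values, might look like this:

```sas
/* Hypothetical example: names and values are illustrative only */
data emp;
   /* Variables are listed in the order the values appear on each line */
   input ename $ eid age;
   datalines;
hari 101 25
ravi 102 31
;
run;
```

With plain list input, character values are limited to 8 characters by default and cannot contain embedded blanks, which is the limitation the column input method addresses.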
2) Column INPUT METHOD
In this method character data values can contain more than 8 characters and can also contain blank spaces.
data temp;
input id 1-3 name $ 7-18 age 21-22;
cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method each data value is preceded by its variable name and an equal sign.
data samp;
input id= name= $ age=;
cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
4) FORMATTED INPUT METHOD
In this method each variable is followed by an informat: a width followed by a period, which specifies how many columns to read for every data value.
data one;
input id 3. name $11. age 3.;
datalines;
101 praveenraj 25
102 kiranraj   23
;
run;
5) ABSOLUTE INPUT METHOD
In this input method we use column pointer controls (@n and +n) to give the exact location of the data values.
data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input record and control the input pointer across iterations of the DATA step.
data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
infile datalines dsd delimiter='&';   /* '&' is an assumed delimiter character */
input X Y Z;
datalines;
1&2&3
4&5&6
7&8&9
;
data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
Thanks
Feedback at infoSitworldin
STANDARD DATA
The data values are in the standard format then the data is called standard data
Eg 467
NON STANDARD DATA
If data values are not in the standard format then data is called as non-standard data
Eg
18-10-05
45000
$21000
Informats are used to read non-standard data
data datesinput name $ Bdate ddmmyy8 format Bdate ddmmyy8cardshari 21-10-84ravi 22-11-86run
Date Informats
Date Informat Format
12-07-78 DDMMYY8 DDMMYY8
21-09-05 DDMMYY10 DDMMYY10
22Jan89 Date7 Date7
22jan1989 Date9 Date9
Numeric Informats
Numeric Informat Format
25000 COMMA6 COMMA6
$3000 DOLLAR6 DOLLAR6
25000 COMMA6 WORDS6
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
Thanks
Feedback at infoSitworldin
Informats are used to read non-standard data.
data dates;
   input name $ Bdate ddmmyy8.;
   format Bdate ddmmyy8.;
   cards;
hari 21-10-84
ravi 22-11-86
;
run;
Date Informats
Date        Informat    Format
12-07-78    DDMMYY8.    DDMMYY8.
21-09-05    DDMMYY10.   DDMMYY10.
22Jan89     DATE7.      DATE7.
22jan1989   DATE9.      DATE9.
Numeric Informats
Numeric     Informat    Format
25,000      COMMA6.     COMMA6.
$3,000      DOLLAR6.    DOLLAR6.
25,000      COMMA6.     WORDS6.
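As a rough sketch of how the numeric informats in the table might be used, the step below reads values written with commas and dollar signs and prints them with the matching formats. The data set name pay, the variable names, and the values are assumptions made for illustration.

```sas
/* Reading non-standard numeric data with COMMAw. and DOLLARw. informats. */
/* Data set, variables, and values are illustrative assumptions.          */
data pay;
   /* the colon modifier reads up to the next blank using the informat */
   input name $ salary :comma6. bonus :dollar6.;
   format salary comma6. bonus dollar6.;
   datalines;
hari 25,000 $3,000
ravi 18,500 $1,250
;
run;

proc print data=pay;
run;
```

The informat controls how the raw text is converted to a number, and the FORMAT statement controls only how the stored number is displayed.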
DEFINING VARIABLES IN SAS
The INPUT statement provides instructions for reading data. It defines the variables for the data set that come from the raw data.
SAS variables can have these attributes: name, type, length, informat, format, and label.
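The attributes listed above can be set explicitly and then inspected. The sketch below is one possible illustration; the data set name attr_demo, the variable names, the labels, and the data line are assumptions.

```sas
/* Setting variable attributes explicitly and listing them. */
/* All names and values here are illustrative assumptions.  */
data attr_demo;
   length name $ 20;                 /* length (and character type)  */
   informat hired ddmmyy8.;          /* informat used when reading   */
   format hired date9.;              /* format used when displaying  */
   label name='Employee name'
         hired='Hire date';          /* descriptive labels           */
   input name $ hired;
   datalines;
hari 21-10-84
;
run;

/* PROC CONTENTS reports name, type, length, format, informat, and label */
proc contents data=attr_demo;
run;
```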
DIFFERENT WAYS TO READ DATA
1) RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and data lines must occur last in the DATA step statements.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
;
3) DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements: the SET statement or the MERGE statement.
data Temp;
   set weight_club;
run;
2) DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form for reading raw data that is stored in an external file.
data <dataset name>;
   infile 'your-input-file path/filename.extension';
   input <var 1> <var 2> ...;
run;
4) DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option, and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
OBS=
Specifies the last observation that SAS processes.
data s1;
   set weight_club (obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. The following step reads observations 2 through 4.
data s1;
   set weight_club (obs=4 firstobs=2);
run;
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
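No example accompanies IN= in the slides, so the following is only a sketch under stated assumptions. It assumes the weight_club data set created earlier, and it introduces a hypothetical second data set team_info (with a made-up Coach variable and values) to show IN= together with the MERGE statement.

```sas
/* Hypothetical lookup data set: one row per team (illustrative values). */
data team_info;
   input Team $ Coach $;
   datalines;
red Smith
blue Jones
;
run;

/* MERGE with BY requires both inputs to be sorted by the BY variable. */
proc sort data=weight_club out=club_sorted;
   by Team;
run;

proc sort data=team_info;
   by Team;
run;

data matched;
   /* inClub and inInfo are temporary 0/1 flags; they are not written out */
   merge club_sorted (in=inClub) team_info (in=inInfo);
   by Team;
   if inClub and inInfo;   /* keep only teams present in both data sets */
run;
```

Because IN= variables exist only during the DATA step, assign them to an ordinary variable if the flag needs to be kept in the output data set.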
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
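The code for this example did not survive in the transcript. A minimal sketch consistent with the description might look like the following; the data set name person and the data values are assumptions.

```sas
/* Reads one data line per observation into two character variables. */
/* Data set name and values are illustrative assumptions.            */
data person;
   input name $ dept $;
   datalines;
John Sales
Mary Acctng
;
run;
```

The CARDS statement can be used in place of DATALINES here with the same effect.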
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
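As a small illustration of the FORMAT statement, the sketch below assumes the weight_club data set from earlier. The output data set name club_fmt, the computed variable WeightLoss, and the chosen formats are hypothetical.

```sas
/* Associates display formats with variables; stored values are unchanged. */
/* club_fmt and WeightLoss are hypothetical names used for illustration.   */
data club_fmt;
   set weight_club;
   WeightLoss = StartWeight - EndWeight;
   format StartWeight EndWeight 5.1 WeightLoss 4.;
run;

proc print data=club_fmt;
run;
```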
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL., 1995; BRADY, 1993
3 BERG, 1990; ROA, 1994; WILLIAMS, 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
DEFINING VARIABLES IN SAS
INPUT statement provides instructions for reading data it defines the variables for the data set that come from the raw data
SAS variables can have these attributes
_ name_ type_ length_ informat_ format_ label
DIFFERENT WAYS TO READ DATA
1RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step
The DATALINES statement tells SAS that raw data followsThe single semicolon that follows the last line of data marks the end of the data The DATALINES statement and data lines must occur last in the DATA step statements
data weight_club
input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight
datalines1023 David Shaw red 189 1651049 Amelia Serrano yellow 145 1241219 Alan Nance red 210 1921246 Ravi Sinha yellow 194 1771078 Ashley McKnight red 127 118
3DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set
To read data from an existing SAS data set you must specify the existing data setrsquos name in one of these statements
_ SET statement_ MERGE statement
Data TempSet weight_clubRun
2DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file then you do not have to bring that file into the data stream Use an INFILE statement to specify the file containing the raw data The statements in the code that follows demonstrate the same example this time showing that the raw data is stored in an external file
data ltdataset namegtinfile rsquoyour-input-file pathfilenameextensionrsquoinput ltvar 1gt ltvar 2gt helliphellip run
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
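The MISSOVER option keeps the INPUT statement from moving to the next record when a line has fewer values than variables. The remaining variables are set to missing.
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;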
Thanks
Feedback at infoSitworldin
DIFFERENT WAYS TO READ DATA
1. RAW DATA IN THE JOB STREAM
You can place data directly in the job stream with the programming statements that make up the DATA step.
The DATALINES statement tells SAS that raw data follows. The single semicolon that follows the last line of data marks the end of the data. The DATALINES statement and the data lines must come last in the DATA step.
data weight_club;
   input IdNumber 1-4 Name $ 6-20 Team $ StartWeight EndWeight;
   datalines;
1023 David Shaw      red 189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red 210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red 127 118
;
run;
3. DATA IN A SAS DATA SET
You can also use data that is already stored in a SAS data set as input to a new data set.
To read data from an existing SAS data set, you must specify the existing data set's name in one of these statements:
- SET statement
- MERGE statement
data Temp;
   set weight_club;
run;
2. DATA IN AN EXTERNAL FILE
If your raw data is already stored in a file, then you do not have to bring that file into the data stream. Use an INFILE statement to specify the file containing the raw data. The statements that follow show the general form for reading raw data that is stored in an external file.
data <dataset name>;
   infile 'your-input-file path\filename.extension';
   input <var 1> <var 2> ...;
run;
4. DATA IN A DBMS FILE
If you have data that is stored in another vendor's database management system (DBMS) files, then you can use SAS/ACCESS software to bring this data into a SAS data set. SAS/ACCESS software enables you to assign a libref to a library containing the DBMS file. In this example a libref is declared and points to a library containing Oracle data. SAS reads data from an Oracle file into a SAS data set.
libname dblib oracle user=scott password=tiger;
data employees;
   set dblib.employees;
run;
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear. They enable you to perform operations such as these:
KEEP=
This example uses the KEEP= data set option in the SET statement to read only the variables IdNumber and Team.
data samp;
   set weight_club (keep=IdNumber Team);
run;
DROP=
Use the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included. The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1.
data A1;
   set weight_club (drop=IdNumber Name);
run;
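OBS=
Specifies when to stop processing observations, that is, the number of the last observation to process.
data s1;
   set weight_club(obs=3);
run;
FIRSTOBS=
Specifies which observation SAS processes first. Combined with OBS=4, the following step reads observations 2 through 4.
data s1;
   set weight_club(obs=4 firstobs=2);
run;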
RENAME=
Changes the name of a variable.
data two (rename=(name=Pname));
   set weight_club;
run;
PW=
Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file.
data two1 (pw=ram);
   set weight_club;
run;
WHERE=
Selects observations that meet the specified condition.
data tmp;
   set weight_club (where=(Name='David Shaw'));
run;
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
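The IN= option is most often used when several data sets are combined. The sketch below is an illustration only: the data sets club_a and club_b, their values, and the flag names inA and inB are assumptions, not part of the original slides. Each IN= variable is a temporary 0/1 flag that is not written to the output data set.

```sas
/* two small hypothetical inputs, both already in IdNumber order */
data club_a;
   input IdNumber Name $;
   datalines;
1023 David
1049 Amelia
;
run;

data club_b;
   input IdNumber Team $;
   datalines;
1049 yellow
1219 red
;
run;

data both;
   merge club_a(in=inA) club_b(in=inB);
   by IdNumber;
   /* inA and inB are 1 when that data set contributed to the current observation */
   if inA and inB;   /* keep only IDs found in both data sets */
run;
```

Changing the subsetting IF to "if inA and not inB;" would instead keep the IDs that appear only in club_a.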
DATA STEP STATEMENTS
DATA Statement
Begins a DATA step and provides names for any output SAS data sets.
Creating an Output Data Set
data example1;
   set weight_club;
run;
When Not Creating a Data Set
data _NULL_;
   set weight_club;
   put Name;
run;
CARDS Statement
Indicates that data lines follow.
DATALINES Statement (New version)
Indicates that data lines follow.
Using the DATALINES Statement: In this example SAS reads a data line and assigns values to two character variables, NAME and DEPT, for each observation in the DATA step.
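The code for this example did not survive in the slides. A minimal sketch of what such a step could look like follows; the data set name person and the two data lines are assumptions chosen for illustration.

```sas
data person;
   /* two character variables read with list input */
   input name $ dept $;
   /* illustrative data lines; a single semicolon ends the data */
   datalines;
John Sales
Mary Acctng
;
run;

proc print data=person;
run;
```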
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
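The IF statement above is only a fragment. A hedged sketch of a complete step follows, assuming the weight_club data set from the earlier slides already exists; the output name no_red is made up for illustration.

```sas
data no_red;
   set weight_club;
   /* DELETE stops processing this observation, does not write it out,
      and returns to the top of the DATA step for the next one */
   if Team = "red" then delete;
run;
```

Observations whose Team is red are left out of no_red, while weight_club itself is unchanged.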
FORMAT Statement
Associates formats with variables.
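The slides give no example for the FORMAT statement, so here is a small sketch. The data set pay, its variables, and its values are assumptions for illustration. DOLLAR10.2 and DATE9. are standard SAS formats, and a format changes only how a value is displayed, not the stored value.

```sas
data pay;
   /* the colon modifier lets list input apply the MMDDYY8. informat */
   input ename $ salary hiredate :mmddyy8.;
   /* display salary as currency and hiredate as a readable date */
   format salary dollar10.2 hiredate date9.;
   /* illustrative data only */
   datalines;
hari 25000 12/01/05
ravi 31000 11/03/06
;
run;

proc print data=pay;
run;
```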
INFORMAT Statement
Associates informats with variables.
data two1;
   input ename $ eid hiredate;
   informat hiredate mmddyy8.;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4
Indicates that data lines that contain semicolons follow. The data ends with a line of four semicolons.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL 1995; BRADY 1993
3 BERG 1990; ROA 1994; WILLIAMS 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
4DATA IN A DBMS FILE
If you have data that is stored in another vendorrsquos database management system (DBMS) files then you can use SASACCESS software to bring this data into a SAS data set SASACCESS software enables you to assign a libref to a library containing the DBMS file In this example a libref is declared and points to a library containing Oracle data SAS reads data from an Oracle file into a SAS data set
libname dblib oracle user=scott password=tiger
data employeesset dblibemployeesrun
DATA SET OPTIONS
Data set options specify actions that apply only to the SAS data set with which they appear They enable you to perform operations such as these
KEEPThis example uses the KEEP= data set option in the SET statement to read only the variables that represent the inSet Statement
Data sampSet weight_club (Keep= IdNumber Team)Run
DROPUse the DROP= option to create a subset of a larger data set when you want to specify which variables are being excluded rather than which ones are being included The following DATA step reads all of the variables from the data set weight_club except for those that are specified with the DROP= option and then creates a data set named A1
Data A1Set weight_club (Drop= IdNumber Name)Run
OBS=
Specifies when to stop processing observations
data s1 set weight_club(obs=3)run
Firstobs=
Specifies which observation SAS processes first
data s1 set weight_club(obs=4 firstobs=2)run
RENAME=
Changes the name of a variable
data two (rename=(name=Pname))set weight_club run
PW=
Assigns a read write or alter password to a SAS and enables access to a password-protected SAS
data two1 (Pw=ram) set weight_club run
WHERE=
Selects observations that meets the specified conditiondata weight_club
Data tmpset weight_club (where=(Name =David Shaw))run
IN=
Creates a variable that indicates whether the data set contributed data to the current observation
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation.
if Team="red" then delete;
FORMAT Statement
Associates formats with variables.
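The slides show no code for the FORMAT statement. As a rough sketch, the step below reuses the hire-date data from the INFORMAT example in this deck and attaches the DATE9. format so that the stored date value is displayed readably. The data set name hires is an assumption.

```sas
data hires;
   informat hiredate mmddyy8.;   /* how to read the raw value    */
   input ename $ eid hiredate;
   format hiredate date9.;       /* how to display the SAS date  */
   datalines;
hari 101 120105
ravi 102 110306
;
run;

proc print data=hires;
run;
```

The informat only controls how the value is read. Without the FORMAT statement, PROC PRINT would show hiredate as the underlying numeric SAS date value.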
INFORMAT Statement
Associates informats with variables.
data two1;
   /* declare the informat before INPUT reads hiredate */
   informat hiredate mmddyy8.;
   input ename $ eid hiredate;
   datalines;
hari 101 120105
ravi 102 110306
;
run;
DATALINES4 Statement or CARDS4 Statement
Indicates that data lines that contain semicolons follow. The data lines end with a line of four semicolons.
data biblio;
   input number citation $50.;
   datalines4;
6 1988
2 LIN ET AL 1995; BRADY 1993
3 BERG 1990; ROA 1994; WILLIAMS 1992
;;;;
DM Statement
Submits SAS Program Editor, Log, Procedure Output, or text editor commands as SAS statements.
dm log 'clear';
KEEP Statement
Includes variables in output SAS data sets.
data average;
   set weight_club;
   keep name team;
run;
LABEL Statement
Assigns descriptive labels to variables.
data rtest;
   set weight_club;
   label name='teamname';
run;
LENGTH Statement
Specifies the number of bytes for storing variables.
data testlength;
   input firstname $ lastname $ n1 n2;
   length name $25;
   datalines;
Alexander Robinson 35 11
;
INPUT Statement
With column input, the INPUT statement reads input values from specified columns and assigns them to the corresponding SAS variables.
This DATA step demonstrates how to read input data records with column input:
data scores;
   input name $ 1-18 score1 25-27 score2 30-32;
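INPUT METHODS
1) LIST INPUT METHOD
2) COLUMN INPUT METHOD
3) NAMED INPUT METHOD
4) FORMATTED INPUT METHOD
5) ABSOLUTE INPUT METHOD
1) LIST INPUT METHOD
In this method, the data values must be separated by at least one space.
Example: as in the preceding DATA steps that read values with a plain INPUT statement (for instance, data two1 under the INFORMAT statement).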
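As a small self-contained illustration of list input, the sketch below reads three blank-separated values per line. The data set name members and the data values are hypothetical. The variable names follow the weight_club examples used elsewhere in these slides.

```sas
/* List input: values are read in order, separated by blanks. */
/* Character variables get a default length of 8 bytes.       */
data members;
   input IdNumber Name $ Team $;
   datalines;
1023 David red
1049 Amelia yellow
1219 Alan red
;
run;
```

Because blanks act as delimiters here, a character value with an embedded blank would be split across variables; column or formatted input avoids that.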
2) COLUMN INPUT METHOD
In this method, each variable is read from fixed columns, so character values can be longer than 8 characters and can contain embedded blanks.
data temp;
   input id 1-3 name $ 7-18 age 21-22;
   cards;
101   shiva krish   38
102   ravi krish    38
103   rama krish    38
;
run;
3) NAMED INPUT METHOD
In this method, each data value is preceded by its variable name and an equal sign.
data samp;
   input id= name= $ age=;
   cards;
id=290 name=ravi age=20
id=291 name=rani age=19
;
run;
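4) FORMATTED INPUT METHOD
In this method, each variable is followed by an informat, a width followed by a period, that specifies how many columns to read for that variable. The data values must therefore line up in the same columns on every line.
data one;
   input id 3. name $11. age 3.;
   datalines;
101 praveenraj 25
102 kiranraj   23
;
run;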
5) ABSOLUTE INPUT METHOD
In this method, column pointer controls (@n to move to column n, +n to skip n columns) give the exact location of each data value.
data two;
   input @1 id 3. +5 @10 name $4. +5 @19 age;
   cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the current record and keep the input pointer in place across iterations of the DATA step, so that several observations can be read from one data line.
data test;
   input name $ age @@;
   datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
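/* DSD: values are separated by commas, the default DSD delimiter */
data num;
   infile datalines dsd;
   input x y z;
   datalines;
1,2,3
4,5,6
7,8,9
;

/* DSD with DELIMITER=: the delimiter character is missing from this transcript */
data nums;
   infile datalines dsd delimiter= ;
   input X Y Z;
   datalines;
123456789
;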
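/* MISSOVER: a short data line leaves the remaining variables missing */
/* instead of reading them from the next line                        */
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;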
Thanks
Feedback at infoSitworldin
- PowerPoint Presentation
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
- Slide 48
- Slide 49
- Slide 50
- Slide 51
- Slide 52
- Slide 53
- Slide 54
- Slide 55
- Slide 56
- Slide 57
- Slide 58
- Slide 59
- Slide 60
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Slide 66
- Slide 67
- Slide 68
- Slide 69
- Slide 70
- Slide 71
- Slide 72
- Slide 73
- Slide 74
- Slide 75
- Slide 76
- Slide 77
- Slide 78
- Slide 79
- Slide 80
- Slide 81
- Slide 82
- Slide 83
- Slide 84
- Slide 85
- Slide 86
- Slide 87
- Slide 88
- Slide 89
- Slide 90
- Slide 91
- Slide 92
- Slide 93
- Slide 94
-
DATA STEP STATEMENTS
Data statementBegins a DATA step and provides names for any output SAS data sets
Creating an Output Data Set
data example1
set weight_club run
When Not Creating a Data Set
data _NULL_set weight_clubput Name run
CARDS StatementIndicates that data lines follow
DATALINES Statement (New version)Indicates that data lines follow
Using the DATALINES Statement In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
DELETE Statement
Stops processing the current observation
if Team=ldquoredrdquo then delete
FORMAT StatementAssociates formats with variables
INFORMAT StatementAssociates informats with variables
data two1input ename $ eid hiredate informat hiredate mmddyy8datalineshari 101 120105ravi 102 110306run
DATALINES4 Statement or Cards4
Indicates that data lines that contain semicolons followdata biblioinput number citation $50datalines46 19882 LIN ET AL 1995 BRADY 19933 BERG 1990 ROA 1994 WILLIAMS 1992
DM Statement
Submits SAS Program Editor Log Procedure Output or text editor commands as SAS Statements
dm log lsquoclearrsquo
KEEP Statement
Includes variables in output SAS data sets
data averageset weight_clubkeep name teamrun
LABEL Statement
Assigns descriptive labels to variables
data rtestset weight_clublabel name=teamnamerun
LENGTH Statement
Specifies the number of bytes for storing variables
data testlengthinput firstname$ lastname$ n1 n2length name $25 datalinesAlexander Robinson 35 11
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method column pointer controls (@n moves to column n, +n skips n columns) give the exact location of the data values
data two;
input @1 id 3. +5 @10 name $4. +5 @19 age;
cards;
102      hari     29
;
run;
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing @ (@@) to hold the input record across iterations of the DATA step, so several observations can be read from one data line
data test;
input name $ age @@;
datalines;
John 13 Monica 12 Sue 15 Stephen 10
Marc 22 Lily 17
;
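data num;
infile datalines dsd;
input x y z;
datalines;
1,2,3
4,5,6
7,8,9
;
data nums;
/* the delimiter character did not survive in the transcript; '|' is assumed here */
infile datalines dsd delimiter='|';
input X Y Z;
datalines;
1|2|3
4|5|6
7|8|9
;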
data weather;
infile datalines missover;
input temp1-temp5;
datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
Thanks
Feedback at infoSitworldin
When Not Creating a Data Set
data _NULL_;
set weight_club;
put Name;
run;
CARDS Statement
Indicates that data lines follow
DATALINES Statement (New version)
Indicates that data lines follow
Using the DATALINES Statement
In this example SAS reads a data line and assigns values to two character variables NAME and DEPT for each observation in the DATA step
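A minimal sketch of a step matching this description is shown here. The data set name person and the two data lines are assumptions chosen for illustration.

```sas
data person;
   /* two character variables, read with list input */
   input name $ dept $;
   /* the data lines start after DATALINES and end at the line holding a lone semicolon */
   datalines;
John Sales
Mary Acctng
;
run;
```

Each data line produces one observation. Replacing DATALINES with CARDS gives the same result.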
DELETE Statement
Stops processing the current observation
if Team="red" then delete;
FORMAT Statement
Associates formats with variables
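As a hedged illustration, a FORMAT statement might be used as in the sketch below. The data set sales, its variables and its values are hypothetical. DOLLAR8.2 and DATE9. are standard SAS formats, and the colon modifier on the INPUT statement lets the date be read with the MMDDYY8. informat.

```sas
data sales;
   /* the colon modifier applies the informat while still stopping at the next blank */
   input item $ price saledate : mmddyy8.;
   /* formats change how values are displayed, not how they are stored */
   format price dollar8.2 saledate date9.;
   datalines;
pen 12.5 12/01/05
book 199 11/03/06
;
run;

proc print data=sales;
run;
```

In the printed output price appears with a dollar sign and two decimals, and saledate appears as a readable date rather than the underlying day count.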
INFORMAT Statement
Associates informats with variables
INPUT Statement
Reads input values from specified columns and assigns them to the corresponding SAS variables
This DATA step demonstrates how to read input data records with column input
data scores
input name $ 1-18 score1 25-27 score2 30-32
INPUT METHODS
1)List INPUT METHOD
2)Column INPUT METHOD
3)NAMED INPUT METHOD
4)FORMATTED INPUT METHOD
5)ABSOLUTE INPUT METHOD
1)List INPUT METHOD
In this method the data values should be seperated by at least single space
EG
-Do-
2)Column INPUT METHOD
In this method character data values contain more than 8 characters and it can contain blank spaces also
data tempinput id 1-3 name $ 7-18 age 21-22cards101 shiva krish 38102 ravi krish 38103 rama krish 38run
3)NAMED INPUT METHOD
In this method data values are followed by variable names
data sampinput id= name= $ age=cardsid=290 name=ravi age=20id=291 name=rani age=19run
4)FORMATTED INPUT METHOD
In this method variables length followed by period to specify the length of the variable for all data values
data oneinput id 3 name $ 11 age 3datalines101 praveenraj 25102 kiranraj 23run
5)ABSOLUTE INPUT METHOD
In this input method we are using column hold pointer to give exact location of data values
data twoinput 1 id 3+5 10 name $ 4+5 19 agecards102 hari 29run
Holding a Record Across Iterations of the DATA Step
The INPUT statement uses the double trailing to control the input pointer across iterations of the DATA step
data testinput name $ age datalinesJohn 13 Monica 12 Sue 15 Stephen 10Marc 22 Lily 17
data num infile datalines dsd input x y z datalines123456789 data nums infile datalines dsd delimiter= input X Y Z datalines123456789
data weather;
   infile datalines missover;
   input temp1-temp5;
   datalines;
97.9 98.1 98.3
98.6 99.2 99.1 98.5 97.5
96.2 97.3 98.3 97.6 96.5
;
Thanks
Feedback at infoSitworldin