i am making document for the convenience of group members

Upload: crismaraj

Post on 06-Apr-2018


  • 8/3/2019 I Am Making Document for the Convenience of Group Members


    I am making document for the convenience of group members.

    SAS-INTERVIEW QUESTIONS

    1. What SAS statements would you code to read an external raw data file into a DATA step?

    Ans : The INFILE and INPUT statements are used to read an external raw data file into a DATA step.

    2. How do you read in the variables that you need?

    Ans : To read only particular variables, list the variables you want in the INPUT statement.

    3. Are you familiar with special input delimiters? How are they used?

    Ans : Yes. SAS provides the DLM= (DELIMITER=) and DSD options, both of which are coded in the INFILE statement.

    DLM= names the character or characters that separate data values, for example commas instead of the default blank. You may choose any delimiter you wish with this option. If you list several characters, such as DLM=',;', each listed character is treated as a delimiter. The DSD option treats two consecutive delimiters as enclosing a missing value.

    4. If reading a variable-length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value?

    Ans : We can use the MISSOVER option in the INFILE statement.
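    The INFILE options above can be tried out with a small sketch. It assumes in-stream DATALINES in place of a real external file, and the data set and variable names are made up for illustration.

    ```sas
    /* Hypothetical in-stream data; DATALINES stands in for an external file. */
    data scores;
        /* DSD: comma is the default delimiter and two commas in a row mean a missing value.
           MISSOVER: a short record leaves the remaining variables missing instead of
           reading them from the next line. */
        infile datalines dsd missover;
        input name :$10. test1 test2 test3;
        datalines;
    Alice,90,,85
    Bob,70,80
    Cara,88,92,79
    ;
    run;

    proc print data=scores;
    run;
    ```

    Here test2 is missing for Alice (consecutive delimiters) and test3 is missing for Bob (short record).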

    5. What is the difference between an informat and a format? Name three informats or formats.

    Ans : An informat is an instruction that SAS uses to read data values into a variable. A format is an instruction that SAS uses to write data values.

    The three kinds of informat are:
    A) Date informat
    B) Character informat
    C) Numeric informat

    The three kinds of format are:
    A) Date format
    B) Character format
    C) Numeric format

    6. Name and describe three SAS functions that you have used, if any.

    Ans : A) SUM function: It adds its arguments together, ignoring any missing values.

    E.G : Var = SUM(var1, var2, ..., varn);
    Var1 = SUM(1, ., 3); returns 4

    B) MEAN function: This function returns the arithmetic mean (average) and ignores missing values.
    E.G : Var = MEAN(var1, var2, var3, ..., varn);

    C) SUBSTR function: The SUBSTR function extracts a portion of a character value, based on a start position and how many characters are designated for retrieval.

    E.G : Var=SUBSTR (var, start
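    A minimal sketch of the three functions in one DATA _NULL_ step; the variable names and the literal values are chosen only for illustration.

    ```sas
    data _null_;
        total   = sum(1, ., 3);                 /* the missing argument is ignored */
        average = mean(2, ., 4);                /* mean of the non-missing values */
        part    = substr('INTERVIEW', 1, 5);    /* 5 characters starting at position 1 */
        put total= average= part=;              /* writes total=4 average=3 part=INTER */
    run;
    ```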

    The single trailing @ tells SAS to hold the line for the next INPUT statement in the same iteration of the DATA step.

    The double trailing @@ tells SAS to hold the line more strongly: the line is held even when the DATA step begins its next iteration.

    NOTE : An INPUT statement ending with @@ instructs the program to release the current raw data line only when there are no data values left to be read from that line. The @@ therefore holds the input record even across multiple iterations of the DATA step.
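    The two line-hold specifiers can be compared side by side. This is only a sketch with made-up in-stream data and variable names.

    ```sas
    /* Double trailing @@: one raw line yields several observations. */
    data pairs;
        input name $ score @@;
        datalines;
    Ann 81 Ben 74 Cy 90
    Dee 66 Eli 59
    ;
    run;

    /* Single trailing @: hold the line so a second INPUT in the same
       iteration can read the rest of it. */
    data typed;
        input type $ @;
        if type = 'A' then input x y;
        else input z;
        datalines;
    A 1 2
    B 3
    ;
    run;
    ```

    The first step creates five observations from two lines; the second creates one observation per line.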

    9. Under what circumstances would you code a SELECT construct instead of IF statements?

    Ans : Especially if you are recoding a variable into a large number of categories.
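    A short sketch of recoding with SELECT, assuming the SASHELP.CLASS sample data set is available; the GROUP variable and the age cut-offs are hypothetical.

    ```sas
    data graded;
        set sashelp.class;            /* sample data set shipped with SAS */
        length group $8;
        select;                       /* the first WHEN that is true wins */
            when (age <= 12) group = 'Child';
            when (age <= 15) group = 'Teen';
            otherwise        group = 'Older';
        end;
    run;
    ```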

    10. What statements do you code to tell SAS that it is to write to an external file?

    Ans : FILENAME fileref 'path';
    FILE fileref;
    PUT _ALL_;   /* writes all the variables */
    Or PUT only the variables that you require.

    11. If reading an external file to produce an external file, what is the shortcut to write a record without coding every single variable on the record?

    Ans : PUT _INFILE_; writes the current input record exactly as it was read. PUT _ALL_; instead writes every variable in name=value form.

    12. If you do not want any SAS output from a DATA step, how would you code the DATA statement to prevent SAS from producing a data set?

    Ans : By using DATA _NULL_; the desired output is then a file and not a SAS data set.
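    Writing a file from DATA _NULL_ might look like the sketch below. It assumes the TEMP device so that no real path is needed, and the fileref OUT and the chosen variables are only illustrative.

    ```sas
    filename out temp;      /* TEMP lets SAS pick a scratch file; swap in a real path as needed */

    data _null_;            /* no output data set is created */
        set sashelp.class;
        file out;           /* route PUT output to the fileref instead of the log */
        put name $10. age 3. +1 sex $1.;
    run;
    ```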

    13. What is the one statement to set the criteria of data that can be coded in any step?

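    Ans : The WHERE statement, which subsets observations in both DATA and PROC steps. (The OPTIONS statement can also be coded anywhere, but it sets system options rather than selection criteria for the data.)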

    14. Have you ever linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself.

    Ans : The LINK statement tells SAS to jump immediately to the statement label that is indicated in the LINK statement and to continue executing statements from that point until a RETURN statement is executed. The RETURN statement sends program control to the statement immediately following the LINK statement.

    Note:

    The LINK statement and the destination must be in the same DATA step. The destination is identified by a statement label in the LINK statement.
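    A small sketch of LINK and RETURN; the label FIX, the data set name and the in-stream values are hypothetical.

    ```sas
    data checked;
        input id value;
        if value < 0 then link fix;   /* jump to the label and come back at its RETURN */
        output;
        return;                       /* ends this iteration so control does not fall into FIX */
    fix:
        value = 0;
        return;                       /* goes back to the statement after LINK */
        datalines;
    1 10
    2 -5
    ;
    run;
    ```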

    15. How would you include common or reusable code to be processed along with your statements?

    Ans : By using %INCLUDE.

    16. When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: SCAN, INDEX or INDEXC?

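    Ans : INDEX, when you are looking for a known substring: it returns the position where the substring starts. INDEXC looks for any one of a set of characters, and SCAN extracts the nth word instead of locating text.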

    17. If you have a data set that contains 100 variables, but you need only five of those, what is the code to force SAS to use only those variables?

    Ans : Use the KEEP= data set option.

    18. Code a PROC SORT on a data set containing state, district and country as the primary variables, along with several numeric variables.

    Ans : PROC SORT DATA=data-set-name;
    BY state district country;
    RUN;

    19. How would you delete duplicate observations?

    Ans : There are three ways to delete duplicate observations in a data set.

    1) PROC SORT DATA=sas-data-set NODUPS;
    BY var;
    RUN;

    2) PROC SQL;
    CREATE TABLE sas-data-set AS SELECT DISTINCT * FROM old_sas_data_set;
    QUIT;

    3) DATA clean;
    SET temp;
    BY group;
    IF first.group;   /* keeps the first observation of each BY group; temp must be sorted by group */
    RUN;

    20. How would you code a merge that will keep only the observations that have matches from both data sets?

    Ans : By using the IN= variables in the MERGE statement.

    DATA NEW;
    MERGE ONE_TEMP (IN=ONE) TWO_TEMP (IN=TWO);
    BY NAME;
    IF ONE=1 AND TWO=1;
    RUN;
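    The same merge can be run end to end with two tiny inputs. The data values below are invented for illustration, and both inputs are typed in NAME order so that no PROC SORT is needed.

    ```sas
    data one_temp;
        input name $ age;
        datalines;
    Ann 30
    Ben 41
    Cy 25
    ;
    run;

    data two_temp;
        input name $ city $;
        datalines;
    Ann Pune
    Cy Delhi
    Dee Agra
    ;
    run;

    data new;
        merge one_temp(in=one) two_temp(in=two);
        by name;
        if one and two;    /* keeps Ann and Cy, the names present in both inputs */
    run;
    ```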

    21. What is the Program Data Vector (PDV)? What are its functions?

    Ans : The Program Data Vector is the temporary holding area in which SAS builds each observation. For example, the WHERE statement may be more efficient than the subsetting IF (especially if you are taking a very small subset from a large file) because it checks the validity of the condition, to see whether the observation is to be kept or not, before the observation is brought into this temporary holding area. This temporary holding area is called the program data vector (PDV).

    22. Does SAS translate (compile) or does it interpret? Explain.

    Ans : When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. In this phase, SAS identifies the type and length of each new variable, and determines whether a type conversion is necessary for each subsequent reference to a variable.

    23. At compile time when a SAS data set is read, what items are created?

    Ans : At compile time SAS creates the following:

    A) Input buffer
    B) Program Data Vector (PDV)
    C) Descriptor information

    24. Name statements that are recognized at compile time only.

    Ans : DROP, KEEP, etc.

    25. Identify statements whose placement in the DATA step is critical.

    Ans : INPUT statement.

    26. Name statements that function at both compile and execution time.

    27. Name statements that are execution only.

    28. In the flow of DATA step processing, what is the first action in a typical DATA step?

    Ans : SAS first performs a syntax check.

    29. What is _n_?

    Ans:

    This is an automatic variable created by SAS during DATA step processing. It counts the number of times the DATA step has iterated, which for a simple SET is the number of records read so far. It is available only in the DATA step and not in procs.

    E.G : If we want to select every third record in a data set then we can use _n_ as follows:
    DATA new-sas-data-set;
    SET old;
    IF MOD(_n_, 3) = 1;
    RUN;

    Note : If we use a WHERE clause to subset the data, _n_ will not yield the required result.
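    A runnable version of the idea, assuming the SASHELP.CLASS sample data set in place of the unnamed input; the output data set name is made up.

    ```sas
    data every_third;
        set sashelp.class;
        if mod(_n_, 3) = 1;   /* keeps iterations 1, 4, 7, ... */
    run;
    ```

    The reason for the note about WHERE is that WHERE filters observations before they reach the DATA step, so _n_ then counts only the observations that passed the filter.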

    BASE SAS:

    30. What is the effect of the OPTIONS statement ERRORS=1?

    Ans : It limits to one the number of observations for which SAS prints complete data error messages in the log. It does not by itself stop the step from processing.

    31. What's the difference between VAR A1-A4 and VAR A1--A4?

    32. What does the SAS log message "numeric values have been converted to character" mean?

    Ans : If we use a character function on numeric values, SAS will automatically convert the numeric values into character values.

    33. Why is a STOP statement needed for the POINT= option on a SET statement?

    Ans : Because POINT= reads only the specified observations, SAS cannot detect an end-of-file condition as it would if the file were being read sequentially. Because detecting an end-of-file condition terminates a DATA step automatically, failure to substitute another means of terminating the DATA step when you use POINT= can cause the DATA step to go into a continuous loop.

    NOTE :

    You cannot use the POINT= option with any of the following:

    BY statement
    WHERE statement
    WHERE= data set option
    transport format data sets
    sequential data sets (on tape or disk)
    a table from another vendor's relational database management system
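    A sketch of direct access with POINT= and the STOP that ends the step, assuming the SASHELP.CLASS sample data set; the step size of 5 and the names I and TOTAL are arbitrary.

    ```sas
    data sample;
        do i = 1 to total by 5;
            set sashelp.class point=i nobs=total;   /* read observation number i directly */
            output;
        end;
        stop;    /* no end-of-file is ever detected, so the step must be ended explicitly */
    run;
    ```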

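    34. How do you control the number of observations and/or variables read or written?

    Ans : By specifying the OBS= option (with FIRSTOBS= for the starting point) for observations, and the KEEP= or DROP= data set options for variables.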

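    35. Approximately what date is represented by the SAS date value of 730?

    Ans : 31 December 1961. SAS dates count days from 1 January 1960 (day 0), and 1960 was a leap year, so 730 falls one day before 1 January 1962.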
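    The date arithmetic can be checked in the log with a two-line sketch; the variable names are arbitrary.

    ```sas
    data _null_;
        d = 730;
        put d= date9.;       /* writes d=31DEC1961 */
        zero = 0;
        put zero= date9.;    /* writes zero=01JAN1960, the SAS date epoch */
    run;
    ```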

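    36. How would you remove a format that has been permanently associated with a variable?

    Ans : By using PROC DATASETS:
    PROC DATASETS LIBRARY=somelibrary;
    MODIFY sasdataset;
    FORMAT variable-name;   /* naming the variable with no format removes the format */
    RUN;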
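    A self-contained sketch of the same steps in the WORK library; the data set PAY and the variable SALARY are hypothetical.

    ```sas
    data work.pay;
        salary = 85000;
        format salary dollar10.;    /* permanently associate a format */
    run;

    proc datasets library=work nolist;
        modify pay;
        format salary;              /* variable name with no format removes the association */
    run;
    quit;
    ```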

    37. What does the RUN statement do?

    Ans : The RUN statement executes the preceding DATA or PROC step.

    38. Why is SAS considered self-documenting?

    Ans : When a SAS data set is created, SAS creates the descriptor portion and the data portion of the data set. The descriptor portion contains details such as when the data set was created, the number of observations, the number of variables, etc. Hence SAS is considered self-documenting.

    39. Briefly describe 5 ways to do a table lookup in SAS.

    Ans :

    1) Simple table lookup by merging (MERGE, including the IN= option, and a subsetting IF statement).

    2) Simple table lookup with formats (PROC FORMAT and the PUT function).

    3) Looking up with two variables by merging (MERGE, including the IN= option, and a subsetting IF statement).

    4) Looking up with two variables using formats (PROC FORMAT, and the PUT and INPUT functions).

    5) A two-way lookup table (MERGE statement using two variables).

    40. What are some good SAS programming practices for processing very large data sets?

    Ans : For a very large data set with many variables we can make use of arrays in the SAS System.

    41. How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?

    Ans : Using PROC TRANSPOSE; it can also be done with SAS arrays.
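    The PROC TRANSPOSE route can be sketched as follows. The input data set LONG, the variable X and its values are generated here only for illustration.

    ```sas
    data long;                 /* 30 observations, 1 variable */
        do i = 1 to 30;
            x = i * 10;
            output;
        end;
        drop i;
    run;

    /* One output observation with variables x1-x30. */
    proc transpose data=long out=wide(drop=_name_) prefix=x;
        var x;
    run;
    ```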

    44. What are _numeric_ and _character_ and what do they do?

    Ans : If we want to do a particular task for all the numeric variables we can use _numeric_, and in the same way, if we want to do a particular task for all the character variables we can use _character_.

    46. What is the order of application for output data set options, input data set options and SAS statements?

    Ans : Input data set options, then SAS statements, and then output data set options.

    47. What is the order of evaluation of the comparison operators: + - * / ** ()?

    Missing Value:

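    56. How many missing values are available? When might you use them?

    Ans : Two kinds of missing value are available in SAS: numeric and character. Numeric data also has special missing values (._ and .A through .Z) that can be used to record why a value is missing.

    57. How do you test for missing values?

    Ans : We can test for missing values with the MISSING function, or by comparing with . for numeric and ' ' for character variables. The NMISS function counts the missing values among its numeric arguments.

    58. How are numeric and character missing values represented internally?

    Ans : Numeric missing values are represented as dots (.) and character missing values are represented as blanks.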
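    A minimal sketch of testing for missing values with the MISSING and NMISS functions; the variable names are invented.

    ```sas
    data _null_;
        length name $8;
        name  = ' ';                           /* character missing is a blank */
        score = .;                             /* numeric missing is a period */
        if missing(name)  then put 'name is missing';
        if missing(score) then put 'score is missing';
        n_missing = nmiss(score, 1, 2);        /* counts missing numeric arguments */
        put n_missing=;                        /* writes n_missing=1 */
    run;
    ```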

    FUNCTIONS:

    59. What is the significance of the OF in X=SUM(OF a1-a4, a6, a9);?

    60. What do the PUT and INPUT functions do?

    Ans : The PUT statement is used to identify logic problems: which piece of code is executed and which is not, what the current value of a particular variable is, and what the current values of all variables are. The PUT and INPUT functions, in contrast, convert values between numeric and character.

    INPUT function :

    The traditional use is to reread a character variable with a numeric informat, executing a character-to-numeric conversion.

    The character-to-numeric conversion function:

    INPUT(variable, informat-name)
    The INPUT function converts the character variable to numeric.

    Salary = input(EMP_SALARY, dollar7.);

    Character value    Numeric value
    EMP_SALARY         SALARY
    $85,000            85000

    Use a new name for the variable being assigned; it cannot have the same name as the source, because a variable's type cannot change.

    Like : EMP_SALARY = input(EMP_SALARY, dollar7.);

    The numeric-to-character conversion function:

    PUT(variable, format-name);
    newphone = put(phone, 7.);

    Numeric value    Character value
    PHONE            NEWPHONE
    6778000          6778000
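    Both conversions can be tried in one step. This sketch reuses the names from the text, with the literal values assigned inline as an assumption.

    ```sas
    data _null_;
        emp_salary = '$85,000';
        salary = input(emp_salary, dollar7.);   /* character to numeric */
        phone = 6778000;
        newphone = put(phone, 7.);              /* numeric to character */
        put salary= newphone=;                  /* writes salary=85000 newphone=6778000 */
    run;
    ```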

    61. Which date function advances a date, time or date/time value by a given interval?

    62. What do the MOD and INT functions do?

    Ans : The MOD function returns the remainder of a division. It is very useful if, for example, you want to select every third observation from a SAS data set.
    Example:
    data third;
    set old;
    if mod(_N_, 3) = 1;
    run;

    The INT function returns the integer portion of an argument. To truncate a number (drop off the fractional part), you use the INT function.

    63. In ARRAY processing, what does the DIM function do?

    Ans : DIM is the dimension function. It returns the number of elements in the array (i.e. the number of variables in the list).

    64. How would you determine the number of missing or non-missing values in a computation?

    Ans : We can use the N function for the number of non-missing values and the NMISS function for the number of missing values.

    65. What is the difference between: X=a+b+c+d; and X=SUM(a, b, c, d);?

    Ans : If we use SUM(a, b, c, d) it will ignore the missing values, if any, and compute the sum. With the + operator, any missing operand makes the result missing.

    For E.G SUM(1, ., 2, 3) = 6
    X = 1 + . + 2 + 3 = MISSING.
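    The difference shows up directly in the log. A small sketch with the same values as the example above:

    ```sas
    data _null_;
        a = 1; b = .; c = 2; d = 3;
        plus  = a + b + c + d;      /* missing, because one operand is missing */
        total = sum(a, b, c, d);    /* the missing value is ignored */
        put plus= total=;           /* writes plus=. total=6 */
    run;
    ```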

    66. There is a field containing a date. It needs to be displayed in the format ddmonyy if it's before 1975, ddmon ccyy if it's after 1985, and as "disco years" if it's between 1975 and 1985. How would you accomplish this in DATA step code? Using only PROC FORMAT?

    67. In the following DATA step, what is needed for fraction to print to the log?

    Ans : data _null_;
    X = 1/3;
    if X = .333 then ;
    put fraction;
    run;
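    One common way to make such a comparison succeed is to round before comparing, since 1/3 is never exactly equal to .333. The sketch below is an assumption about what the question is after, not a confirmed answer from the source, and it writes the literal text 'fraction'.

    ```sas
    data _null_;
        x = 1/3;
        /* Round to three decimals so the comparison with .333 can be true. */
        if round(x, .001) = .333 then put 'fraction';
    run;
    ```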

    68. What is the difference between calculating the mean using the MEAN function and PROC MEANS?

    Ans : The MEAN function works across the variables of one observation, whereas PROC MEANS computes statistics for a variable across observations. The MEAN function returns the mean of the non-missing values in the variable list. Actually, you may not have figured out the importance of the way the MEAN function deals with missing values, and this is quite important. If you calculate SCORE by simply adding up all the items and dividing by 50 as follows:

    SCORE = (item1 + item2 + item3 + ... + item50) / 50;

    you would be in big trouble if any of the items had missing values. When a SAS statement tries to do an arithmetic operation on missing values, the result is always missing.

    PROCs:

    69. If you were given several SAS data sets you were unfamiliar with, how would you find out the variable names and formats of each data set?

    Ans:

    I can use the CONTENTS procedure on all the data sets in the library and see the variable names and formats of each data set.
    EG : PROC CONTENTS DATA=LIBREF._ALL_;
    RUN;

    70. How would you keep SAS from overlaying the SAS data set with its sorted version?

    Ans : By creating a new data set when sorting, by specifying OUT=new-sas-dataset.

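    71. In PROC PRINT, can you print only variables that begin with the letter A?

    Ans : Yes. To print only the variables whose names begin with the letter A, use a name prefix list in the VAR statement:

    VAR A:;

    To print only the observations in which a character variable's value begins with the letter A, use a WHERE statement in the PROC PRINT step:

    WHERE variable-name LIKE 'A%';
    Or WHERE variable-name =: 'A';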

    72. What are some differences between PROC SUMMARY and PROC MEANS?

    Ans :

    1) PROC MEANS produces subgroup statistics when a BY statement is used and the input data has been previously sorted (use PROC SORT) by the BY variables. PROC SUMMARY with a CLASS statement automatically produces statistics for all subgroups, giving you all the information in one run that you would get by repeatedly sorting a data set by the variables that define each subgroup and running PROC MEANS. (In current releases PROC MEANS accepts a CLASS statement as well.)

    2) PROC SUMMARY does not produce any printed output by default, so you will need to use the OUTPUT statement to create a new data set and use PROC PRINT to see the computed statistics.
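    The difference in default output can be seen with a short sketch, assuming the SASHELP.CLASS sample data set; the output data set STATS and the variable AVG_HEIGHT are hypothetical names.

    ```sas
    proc means data=sashelp.class n mean;   /* prints a report by default */
        class sex;
        var height;
    run;

    proc summary data=sashelp.class nway;   /* prints nothing unless PRINT is specified */
        class sex;
        var height;
        output out=stats mean=avg_height;   /* statistics go to a data set instead */
    run;

    proc print data=stats;
    run;
    ```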

    PROC FREQ:

    73. Code the TABLE statement for a single-level (most common) frequency.

    Ans : The statement for single-level.

    DATA MAR.FREQTEST;
    SET BAS.AMPERS;
    PROC FREQ DATA=MAR.FREQTEST;
    TABLE AGE;
    RUN;

    74. Code the TABLE statement to produce a multi-level frequency.

    Ans : The statement for multilevel.

    DATA MAR.FREQTEST;
    SET BAS.AMPERS;
    PROC FREQ DATA=MAR.FREQTEST;
    TABLE AGE * gender;
    RUN;

    75. Name the option to produce a frequency as line items rather than a table.

    76. Produce output from a frequency. Restrict the printing of the table.

    PROC MEANS:

    77. Code a PROC MEANS that shows both summed and averaged output of the data.

    78. Code the option that will allow MEANS to include missing numeric data in the report.

    79. Code the MEANS to produce output to be used later.

    80. Do you use PROC REPORT or PROC TABULATE? Which do you prefer? Explain.

    MERGING/UPDATING:

    81. What happens in a one-on-one merge? When would you use one?

    Ans : If you want to merge two data sets that have different variables and only one variable in common, and that variable is unique in each data set, we can merge the data sets with a one-on-one merge on that variable.

    82. How would you combine 3 or more tables with different structures?

    83. What is the problem with merging two data sets that have variables with the same name but different data?

    Ans : The second data set's value will overwrite the value from the first data set.

    84. When would you choose to MERGE two data sets together and when would you SET two data sets?

    Ans : If we want to create a data set as an exact copy of the old data sets, without caring which data set contributes to the new data set, then we will use the SET statement. If we want to control the contribution of the old data sets to the new data set, then we will use the MERGE statement.

    85. Which data set is the controlling data set in the MERGE statement?

    Ans : The last data set named in the MERGE statement, since its values overwrite those of same-named variables read earlier.

    86. How do the IN= variables improve the capability of a MERGE?

    Ans : IN= creates a temporary variable in SAS which helps in controlling which data set needs to contribute to the new data set.

    87. Explain the message "MERGE HAS ONE OR MORE DATASETS WITH REPEATS OF BY VARIABLE".

    CUSTOMIZED REPORT WRITING:

    88. What is the purpose of the statement DATA _NULL_?

    Ans : Use the keyword _NULL_, which allows the power of the DATA step without creating a data set.

    89. What is the pound sign used for in DATA _NULL_?

    Ans :

    90. What is the purpose of using the N=PS option?

    Ans : Specifying N=PS in the FILE statement allows the output pointer to write on any line of the current output page.

    MACRO:

    91. What system options would you use to help debug a macro?

    Ans : SYMBOLGEN, MLOGIC and MPRINT.

    92. Describe how you would create a macro variable.

    Ans : %let var=value;

    93. How do you identify a macro variable?

    94. How do you define the end of a macro?

    Ans : %mend

    95. How do you assign a macro variable to a SAS variable?

    Ans : Using CALL SYMPUT, which copies a DATA step value into a macro variable. To go the other way, use the SYMGET function or a &var reference in the DATA step.
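    A sketch tying %LET, CALL SYMPUT and %PUT together, assuming the SASHELP.CLASS sample data set; the macro variable names CUTOFF and N_OLDER are made up.

    ```sas
    %let cutoff = 13;                      /* macro variable created with %LET */

    data _null_;
        set sashelp.class end=last;
        if age > &cutoff then n + 1;       /* &cutoff resolves before the step compiles */
        /* Copy the DATA step value into a macro variable on the last observation. */
        if last then call symput('n_older', strip(put(n, 8.)));
    run;

    /* The new macro variable is usable once the DATA step has finished. */
    %put NOTE: &n_older students are older than &cutoff;
    ```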

    96. What is the difference between %LOCAL and %GLOBAL?

    Ans:

    A %LOCAL variable can be used only inside the macro in which it is declared, but a %GLOBAL variable can be used until the end of the SAS session.

    97. How long can a macro variable be? A token?

    Ans : Till it passes to the word scanner.

    98. If you use a SYMPUT in a DATA step, when and where can you use the macro variable?

    Ans : It can be used outside the DATA step that creates it, once that step has finished executing, and will be globally available.

    100. How would you code a macro statement to produce information on the SAS log?

    Ans : %PUT statement.