Please Don't Lag Behind LAG!
Anjan Matlapudi and J. Daniel Knapp
Pharmacy Informatics and Finance
PerformRx
Introduction
Lag is a common word meaning to fail to keep up or to fall behind.
- From Wikipedia
Lag: fail to maintain a desired pace or to keep up; fall or stay behind: After five minutes of hard running, some of them began to lag.
A person who lags behind, is the last to arrive, etc. - From Dictionary.com
Simple Example
Day        Temp  Lag  Lead
Yesterday  70
Today      60    70   80
Tomorrow   80
Why We Need the LAG Function
It is useful when computing across observations.
It lets us compute values with reference to previous observations, such as previous dates and times.
LAG is powerful when combined with other SAS functions and statements such as DIF, INTCK, and RETAIN.
It simplifies our code and makes the data easier to manipulate.
Syntax
data TempTbl;
  set Dailytemp;
  lag_qty = lag(qty);
run;
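A self-contained sketch of the syntax above, using the three rows from the Simple Example. The variable names day and temp and the output data set name are assumptions made for illustration; they do not appear in the slides.

```sas
/* Hypothetical input built from the Simple Example slide */
data Dailytemp;
  input day :$9. temp;
  datalines;
Yesterday 70
Today 60
Tomorrow 80
;
run;

data TempTbl;
  set Dailytemp;
  lag_temp = lag(temp);  /* temp from the previous observation; missing on the first row */
run;

proc print data=TempTbl noobs;
run;
```

On the Today row, lag_temp holds 70, the value shown in the Lag column of the Simple Example.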
Table 1
Member ID  Drug Fill Date  Drug Name  Qty
1111       01/01/2010      ABILIFY    10
1111       01/16/2010      ABILIFY    20
1111       03/16/2010      ABILIFY    30
1111       03/30/2010      ABILIFY    40
2222       01/04/2010      GLEEVEC    10
2222       02/10/2010      GLEEVEC    20
2222       02/15/2010      GLEEVEC    30
2222       07/01/2010      GLEEVEC    40
Example 1
data Example1;
  set Table1;
  lag_qty = lag(qty);
run;
Example 1
Member ID  Drug Fill Date  Drug Name  Qty  Lag Qty
1111 01/01/2010 ABILIFY 10 .
1111 01/16/2010 ABILIFY 20 10
1111 03/16/2010 ABILIFY 30 20
1111 03/30/2010 ABILIFY 40 30
2222 01/04/2010 GLEEVEC 10 40
2222 02/10/2010 GLEEVEC 20 10
2222 02/15/2010 GLEEVEC 30 20
2222 07/01/2010 GLEEVEC 40 30
Example 1
data Example1;
  set Table1;
  lag_qty = lag(qty);
  by MemberId;
run;
Table 1
Member ID  Drug Fill Date  Drug Name  Qty  Lag Qty
1111 01/01/2010 ABILIFY 10 .
1111 01/16/2010 ABILIFY 20 10
1111 03/16/2010 ABILIFY 30 20
1111 03/30/2010 ABILIFY 40 30
2222 01/04/2010 GLEEVEC 10 40
2222 02/10/2010 GLEEVEC 20 10
2222 02/15/2010 GLEEVEC 30 20
2222 07/01/2010 GLEEVEC 40 30
Example 1
data Example1;
  set Table1;
  lag_qty = lag(qty);
  by MemberId;
  if not first.MemberId then lag_qty = lag(qty);
run;
Wrong Results
Member ID  Drug Fill Date  Drug Name  Qty  Lag Qty
1111 01/01/2010 ABILIFY 10 .
1111 01/16/2010 ABILIFY 20 .
1111 03/16/2010 ABILIFY 30 10
1111 03/30/2010 ABILIFY 40 20
2222 01/04/2010 GLEEVEC 10 .
2222 02/10/2010 GLEEVEC 20 40
2222 02/15/2010 GLEEVEC 30 30
2222 07/01/2010 GLEEVEC 40 20
Question
What Is the Reason Behind the Wrong Results?
Memory Allocation in Queue
[Figure: contents of the LAG queue as each observation is read, for LAG 1 and LAG 2]
LAG 1
Input data set   Output data set
Observation 1    .
Observation 2    Observation 1
Observation 3    Observation 2
Observation 4    Observation 3
Observation 5    Observation 4
LAG 2
Input data set   Output data set
Observation 1    .
Observation 2    .
Observation 3    Observation 1
Observation 4    Observation 2
Observation 5    Observation 3
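The queue behavior can be seen in a minimal sketch like the one below. The data set names and the variables x, always_lag, and cond_lag are hypothetical. Each LAG call in the code keeps its own queue, and a queue is updated only when its call actually executes.

```sas
/* Hypothetical data: x = 1, 2, 3, 4, 5 */
data QueueDemo;
  do x = 1 to 5;
    output;
  end;
run;

data CondLag;
  set QueueDemo;
  always_lag = lag(x);      /* executed on every observation */
  if mod(x, 2) = 0 then
    cond_lag = lag(x);      /* executed only when x is even */
run;
```

Here always_lag is ., 1, 2, 3, 4. The conditional call runs only for x = 2 and x = 4, so cond_lag is missing at x = 2 (its first execution) and 2 at x = 4: it returns the value stored the last time that call ran, not the value from the previous observation.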
Example 1
data Example1;
  set Table1;
  lag_qty = lag(qty);  /* call LAG on every observation */
  by MemberId;
  if first.MemberId then lag_qty = .;  /* reset at the start of each member */
run;
Correct Results
Member ID  Drug Fill Date  Drug Name  Qty  Lag Qty
1111 01/01/2010 ABILIFY 10 .
1111 01/16/2010 ABILIFY 20 10
1111 03/16/2010 ABILIFY 30 20
1111 03/30/2010 ABILIFY 40 30
2222 01/04/2010 GLEEVEC 10 .
2222 02/10/2010 GLEEVEC 20 10
2222 02/15/2010 GLEEVEC 30 20
2222 07/01/2010 GLEEVEC 40 30
Iterations
data Example1;
  set Table1;
  /* LAGn looks n observations back; repeating lag(qty) would return the same value in every column */
  prevQty  = lag(qty);
  prevQty1 = lag2(qty);
  prevQty2 = lag3(qty);
  prevQty3 = lag4(qty);
  prevQty4 = lag5(qty);
  prevQty5 = lag6(qty);
  prevQty6 = lag7(qty);
  prevQty7 = lag8(qty);
  prevQty8 = lag9(qty);
run;
Output Results
Member ID  Drug Fill Date  Drug Name  Qty  PrevQty  PrevQty1  PrevQty2  PrevQty3  PrevQty4  PrevQty5  PrevQty6  PrevQty7
1111 01/01/2010 ABILIFY 10 . . . . . . . .
1111 01/16/2010 ABILIFY 20 10 . . . . . . .
1111 03/16/2010 ABILIFY 30 20 10 . . . . . .
1111 03/30/2010 ABILIFY 40 30 20 10 . . . . .
2222 01/04/2010 GLEEVEC 10 40 30 20 10 . . . .
2222 02/10/2010 GLEEVEC 20 10 40 30 20 10 . . .
2222 02/15/2010 GLEEVEC 30 20 10 40 30 20 10 . .
2222 07/01/2010 GLEEVEC 40 30 20 10 40 30 20 10 .
Best Practice
Understand your data and make sure to deal with missing values.
PROC FREQ is a convenient way to check for missing values in the data.
proc freq data = Example1;
  table MemberId*FillDate / missing list;
run;
Question
Does a LEAD Function Exist in SAS®?
LEAD Example
data LeadTbl;
  merge Table1
        Table1(firstobs=2 keep=qty rename=(qty=lead_qty));
run;
LEAD Output
Member ID  Drug Fill Date  Drug Name  Qty  Lead Qty
1111 01/01/2010 ABILIFY 10 20
1111 01/16/2010 ABILIFY 20 30
1111 03/16/2010 ABILIFY 30 40
1111 03/30/2010 ABILIFY 40 10
2222 01/04/2010 GLEEVEC 10 20
2222 02/10/2010 GLEEVEC 20 30
2222 02/15/2010 GLEEVEC 30 40
2222 07/01/2010 GLEEVEC 40 .
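In the output above, the last ABILIFY row for member 1111 picks up a Lead Qty of 10 from member 2222. A minimal sketch of one way to keep the look-ahead inside each member follows. It assumes Table1 is sorted by MemberId; the names LeadByMember and next_id are hypothetical.

```sas
data LeadByMember;
  /* No BY statement: a one-to-one merge pairs row n with row n+1 */
  merge Table1
        Table1(firstobs=2 keep=MemberId qty
               rename=(MemberId=next_id qty=lead_qty));
  /* If the next row belongs to another member (or does not exist), there is no lead value */
  if next_id ne MemberId then lead_qty = .;
  drop next_id;
run;
```

With this check, lead_qty is missing on the last row of every member rather than only on the last row of the data set.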
Question
Do the LAG/LEAD Functions Work in PROC SQL?
LAG in Proc SQL
proc sql;
  select DrugName, qty, LAG(qty,1) as prevQty
  from Example1;
quit;
Log Output
520 proc sql;
521 select drug, qty,
522 LAG(qty,1) as prevQty
523 from demo;
ERROR: The LAG function is not supported in PROC SQL, it is only valid within the DATA step.
524 quit;
NOTE: The SAS System stopped processing this step because of errors.
LEAD in Proc SQL
proc sql;
  select DrugName, qty, LEAD(qty,1) as LeadQty
  from Example1;
quit;
Log Output
520 proc sql;
521 select drug, qty,
522 LEAD(qty,1) as NextQty
523 from Example1;
ERROR: The LEAD Could not be Located.
524 quit;
NOTE: The SAS System stopped processing this step because of errors.
Proc Expand
proc expand data = Table1 out = Leadtbl method=none;
  by MemberId;
  convert Qty = lead_qty  / transformout = (lead);
  convert Qty = lag_qty   / transformout = (lag);
  convert Qty = lead2_qty / transformout = (lead 2);
  convert Qty = lag2_qty  / transformout = (lag 2);
run;
LAG/Lead Output Results
Member ID  Drug Name  Qty  Lag1  Lag2  Lead  Lead2
1111 ABILIFY 10 . . 20 30
1111 ABILIFY 20 10 . 30 40
1111 ABILIFY 30 20 10 40 .
1111 ABILIFY 40 30 20 . .
2222 GLEEVEC 10 . . 20 30
2222 GLEEVEC 20 10 . 30 40
2222 GLEEVEC 30 20 10 40 .
2222 GLEEVEC 40 30 20 . .
Table 2: LAG with INTCK Function Example
Member ID  Drug Fill Date  Drug Name  Qty
1111       01/01/2010      ABILIFY    10
1111       01/15/2010      AMBIEN     20
1111       03/16/2010      ALORA      5
2222       01/01/2010      ASPIRIN    1
3333       01/04/2010      TYLENOL    1
4444       02/10/2010      ABILIFY    10
4444       02/15/2010      AMOXIL     20
4444       07/01/2010      DOXIL      20
INTCK Function
data singlelPrescriptions DualPrescriptions;
  set Table2;
  by MemberID DrugFill_Date;
  *----Output single prescriptions----*;
  if first.MemberID and last.MemberID then output singlelPrescriptions;
  else do;
    lag_memberid = lag(MemberID);
    lag_drugFill_date = lag(DrugFill_Date);
    if lag_memberid = MemberID then do;
      numdays = intck('DAY', lag_drugFill_date, DrugFill_Date);
      if numdays < 30 then star = "*";  /* Output with '*' flag */
    end;
    output DualPrescriptions;
  end;
run;
LAG with INTCK Output
Member ID  Drug Fill Date  Drug Name  Qty  Num Days  Star
1111       01/01/2010      ABILIFY    10   .
1111       01/15/2010      AMBIEN     20   14        *
1111       03/16/2010      ALORA      5    60
4444       02/10/2010      ABILIFY    10   .
4444       02/15/2010      AMOXIL     20   5         *
4444       07/01/2010      DOXIL      20   136
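As a quick check of how INTCK counts days, the sketch below reproduces one Num Days value from the output, using the two fill dates 01/15/2010 and 03/16/2010 from Table 2. The variable names prev_fill and this_fill are hypothetical.

```sas
data _null_;
  prev_fill = '15JAN2010'd;   /* previous fill date */
  this_fill = '16MAR2010'd;   /* current fill date */
  numdays = intck('DAY', prev_fill, this_fill);
  put numdays=;               /* writes numdays=60 to the log */
run;
```

For the 'DAY' interval, INTCK returns the same number as subtracting the two SAS date values.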
Table 3: LAG with DIF Function Example
Member ID  Drug Fill Date  Drug Name  Price
1111       01/01/2007      ABILIFY    $30
1111       01/15/2008      ABILIFY    $60
1111       10/16/2009      ABILIFY    $80
1111       07/01/2010      ABILIFY    $90
2222       01/04/2007      GLEEVEC    $30
2222       02/10/2008      GLEEVEC    $40
2222       02/15/2009      GLEEVEC    $60
2222       07/01/2010      GLEEVEC    $90
DIF Function
data DifTbl;
  set Table3;
  by MemberID DrugFill_Date;
  diff_drugFill_date = dif(DrugFill_Date);
  lag_price = lag(price);
  dif_price = dif(price);
  if first.memberid then do;
    diff_drugFill_date = .;
    lag_price = .;
    dif_price = .;
  end;
  if diff_drugFill_date > 365 then do;
    percent_increase = round((dif_price/lag_price)*100);
  end;
run;
LAG with DIF Output
Member ID  Drug Name  Price  Fill Date Dif (days)  Lag Price  Dif Price  Percent Increase
1111 ABILIFY $30 . . . .
1111 ABILIFY $60 379 30 30 100%
1111 ABILIFY $80 640 60 20 33%
1111 ABILIFY $90 259 80 10 .
2222 GLEEVEC $30 . . . .
2222 GLEEVEC $40 402 30 10 33%
2222 GLEEVEC $60 674 40 20 50%
2222 GLEEVEC $90 198 60 30 .
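DIF(x) is equivalent to x minus LAG(x). The sketch below shows this on the four prices of member 1111 from Table 3; the data set name DifDemo and the variable manual_dif are hypothetical.

```sas
data DifDemo;
  input price;
  lag_price  = lag(price);
  dif_price  = dif(price);          /* price minus the previous price */
  manual_dif = price - lag_price;   /* same result, computed by hand */
  datalines;
30
60
80
90
;
run;
```

Both dif_price and manual_dif are missing on the first row and then 30, 20, and 10, matching the Dif Price column for member 1111. Like LAG, DIF keeps a queue, so it should also be called on every observation.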
Table 4: LAG with RETAIN Example
Member ID  Drug Fill Date  Drug Name  Qty  Lag Price
1111 01/01/2007 ABILIFY 10 $6.25
1111 01/15/2008 ABILIFY 20
1111 10/16/2009 ABILIFY 30
1111 07/01/2010 ABILIFY 40
2222 01/04/2007 GLEEVEC 10 $5.55
2222 02/10/2008 GLEEVEC 20
2222 02/15/2009 GLEEVEC 30
2222 07/01/2010 GLEEVEC 40
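RETAIN Statement
/* drop=price avoids a name clash when old_price is renamed to price */
data RetainTbl(drop=price rename=(old_price=price));
  set Table4;
  by MemberID;
  retain old_price;
  /* Start each member with its own first price */
  if first.MemberID then old_price = price;
  /* Carry the last non-missing price forward */
  if price ne . then old_price = price;
  lag_qty = lag(qty);  /* called on every observation */
  if first.MemberID then lag_qty = .;
  else avgprice = mean(old_price*lag_qty);
run;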
LAG with RETAIN Output
Member ID  Drug Fill Date  Drug Name  QTY  Lag Price  Price Var  Lag Qty  Avg Price
1111       01/01/2007      ABILIFY    10   $6.25      $6.25      .        .
1111       01/15/2008      ABILIFY    20   .          $6.25      10       $62.50
1111       10/16/2009      ABILIFY    30   .          $6.25      20       $125.00
1111       07/01/2010      ABILIFY    40   .          $6.25      30       $187.50
2222       01/04/2007      GLEEVEC    10   $5.55      $5.55      .        .
2222       02/10/2008      GLEEVEC    20   .          $5.55      10       $55.50
2222       02/15/2009      GLEEVEC    30   .          $5.55      20       $111.00
2222       07/01/2010      GLEEVEC    40   .          $5.55      30       $166.50
Conclusions
We demonstrated the basic use of the LAG function.
We hope the LAG code examples combined with other functions in this paper will be useful to Base SAS® programmers.
We suggest looking carefully at the data structure, dealing with null values, and assigning values correctly when you use the LAG function.
Kennett Square, PA: Mushroom Capital of the World
Kennett Square, PA: Mushroom Festival
Growing Mushrooms
Longwood Gardens
Thank you
Acknowledgements
We thank Mr. Shimels Afework, Sr. Director of our company, for his support.
SAS® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.
Questions?
Anjan Matlapudi and J. Daniel Knapp PerformRx
The Next Generation PBM, 200 Stevens Drive, Philadelphia, PA 19113
A Way to Work with Invoice Flat Files in SAS®
Tips to Use Character String Functions in Record Lookup