04 February 2010

prxparse(dates)

suppose i’ve got some odd character dates defined as follows:

d1='29-31 DEC 2009,04-05 JAN 2010'; *broken across month, year;
d2='04-05,08-09,11-12 FEB 2010'; *broken within month;
d3='01-03 MAR 2010'; *contiguous within month;
d4='31 MAR-01 APR 2010'; *contiguous across month;
d5='29 DEC 2009-01 JAN 2010'; *contiguous across year;
d6='01 FEB 2010'; *single day;


trying to query based on dates would be next to impossible. so, prx those things.



Original: >>>29-31 DEC 2009,04-05 JAN 2010<<<
sd=29 sm=DEC sy=2009 sdate=29DEC2009
ed=05 em=JAN ey=2010 edate=05JAN2010

Original: >>>04-05,08-09,11-12 FEB 2010<<<
sd=04 sm=FEB sy=2010 sdate=04FEB2010
ed=12 em=FEB ey=2010 edate=12FEB2010

Original: >>>01-03 MAR 2010<<<
sd=01 sm=MAR sy=2010 sdate=01MAR2010
ed=03 em=MAR ey=2010 edate=03MAR2010

Original: >>>31 MAR-01 APR 2010<<<
sd=31 sm=MAR sy=2010 sdate=31MAR2010
ed=01 em=APR ey=2010 edate=01APR2010

Original: >>>29 DEC 2009-01 JAN 2010<<<
sd=29 sm=DEC sy=2009 sdate=29DEC2009
ed=01 em=JAN ey=2010 edate=01JAN2010

Original: >>>01 FEB 2010<<<
sd= sm= sy= sdate=01FEB2010
ed= em= ey= edate=01FEB2010


full code lives here.

09 November 2009

the utility of generic macros

i’ve been using sas for roughly 5 years now. for the past few years, i’ve been building up my toolbox of what i call ‘utility macros’. these are macros that are generic enough to be used in a wide variety of situations, but perform very specific, routine, and/or tedious tasks. these macros will very rarely produce the final product, but are quite useful in achieving that final state. these macros are in persistent development and various stages of stability. so examples may help?

20 October 2009

an mmddyyyy hhmmss informat. kinda.

a while ago, i took the hack shortcut of reading character datetime stamps into datetime values. i even went so far as suggesting a regex would be a better solution.

well, here’s that better solution:

data mmddyyyy_hhmmss_xm;
if _n_=1 then do;
re=prxparse('/(\d+)\/(\d+)\/(\d\d\d\d)\s+(\d+):(\d\d):(\d\d)\s+([AP]M)/');
retain re;
end;

infile datalines; input; _sdt=_infile_;

format dt datetime19.;

array _s(*) _sdt;
array _d(*) dt;
do i = 1 to dim(_s);
if prxmatch(re, _s[i]) then
_d[i]=dhms(mdy(input(prxposn(re,1,_s[i]),2.) /*month*/
,input(prxposn(re,2,_s[i]),2.) /*day*/
,input(prxposn(re,3,_s[i]),4.) /*year*/)
,input(prxposn(re,4,_s[i]),2.) /*hour*/
+ 12*(prxposn(re,7,_s[i]) eq 'PM') /*pm offset*/
,input(prxposn(re,5,_s[i]),2.) /*minute*/
,input(prxposn(re,6,_s[i]),2.) /*second*/);
end;
output;
drop re i;
datalines;
9/21/2009 10:02:42 PM
10/01/2009 9:44:37 AM
1/1/2009 12:13:14 AM
run;


and the output



_sdt                                                       dt
9/21/2009 10:02:42 PM 21SEP2009:22:02:42
10/01/2009 9:44:37 AM 01OCT2009:09:44:37
1/1/2009 12:13:14 AM 01JAN2009:12:13:14

15 July 2009

random dates

this is a cross posting from this sas discussion forum.


assuming you know the min and max of allowable dates (01jan2005 - 30jun2009, for example), you can use random functions.

data randates;
mindate='01jan2005'd;
maxdate='30jun2009'd;
range = maxdate-mindate+1;
format mindate maxdate randate date9.;
do i = 1 to 10000;
RanDate = mindate + int(ranuni(12345)*range);
output;
end;
run;


using this method, you don't need to find the number of days in a given month/year, SAS just knows.



a simple min/max check shows the date limits were respected



proc sql;
select distinct
min(randate) format=date9.
, max(randate) format=date9.
from randates;
quit;


output



--------------------
01JAN2005 30JUN2009