SAS关于宏、宏函数、宏变量、data步、proc步和call execute的理解
SAS宏和宏函数的问题困扰了我三年之久,终于在昨日想通了,而想通的原因也很搞笑,仅仅是当时意识到了sas宏和宏函数是两个东西,自定的宏并不是宏函数(在其他编程语言中自定义函数和语言本身函数是一样的,受此影响!)
转载请注明出处:http://www.cnblogs.com/SSSR/p/6380957.html
结论:sas宏并不具备sas宏函数的功能,它仅仅只是一段文本,这段文本中如果有参数和宏函数,我们只是把参数替换掉和宏函数执行了,然后生成一个正常的文本(包含data步和proc步)提交给sas运行。
遇到宏函数时会直接执行,遇到宏时会直接进行文本替换(宏中的宏函数也会直接执行),宏函数返回的文本会和其他的data步和proc步组合然后一起提交给sas步运行。
宏函数直接执行,sas函数需要在data步中执行(%put和put的区别)
下面来看两个简单的例子:
options symbolgen mprint;*将宏编译的过程也打印出来,以便进行测试; %macro s(i); %put &i.; %mend; data _null_; set sashelp.class; put name; %s('program'); run; /*以下是结果:汉字为注释 105 %macro s(i); 106 107 %put &i.; 108 %mend; 109 110 data _null_; 111 set sashelp.class; 112 put name; 113 %s('program'); //下面是这句宏调用的解析 SYMBOLGEN: Macro variable I resolves to 'program' // I是%s的宏参数,被替换成了program。下面是直接打印了program,这个是为啥呢?就是因为%put是宏函数,会立即执行,从而先打印了program,然后才会将正常的data步提交给sas执行,run;后的部分是宏编译完以后执行的结果。 'program' 114 run; Alfred Alice Barbara Carol Henry James Jane Janet Jeffrey John Joyce Judy Louise Mary Philip Robert Ronald Thomas William NOTE: There were 19 observations read from the data set SASHELP.CLASS. NOTE: DATA statement used (Total process time): real time 0.01 seconds */ %macro t(i); put &i.; %mend; data _null_; set sashelp.class; %t(name); %s('programe') run; /*以下为结果:中文为注释 135 %macro t(i); 136 137 put &i.; 138 %mend; 139 140 data _null_; 141 set sashelp.class; 142 %t(name); SYMBOLGEN: Macro variable I resolves to name //宏变量I被解析成name,注意这个name不带双引号, MPRINT(T): put name; //这里put name和data set等一起被提交给sas执行 143 %s('programe') SYMBOLGEN: Macro variable I resolves to 'programe' //这个因为是%put所以被直接执行了。 'programe' 144 run; Alfred Alice Barbara Carol Henry James Jane Janet Jeffrey John Joyce Judy Louise Mary Philip Robert Ronald Thomas William NOTE: There were 19 observations read from the data set SASHELP.CLASS. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds */ proc print data=sashelp.class; run;
再来看一个复杂的例子:
options symbolgen mprint; data b; input a1 a2 a3; cards; 1 2 3 ; run; %macro varx; %do i=0 %to 2; proc sql ; select a%eval(&i.+1) into:a%eval(&i.+1) from b; quit; %let j=%eval(&i.+1); data _null_; a= &&a&j.; put a; run; %end ; %mend varx; %varx; /*以下为结果:中文为注释,整个程序循环了三次i=0 1 2,所以会生成三次proc sql和三次data步,然后提交给sas运行。从这个就可以看出sas宏的一个作用,减少代码量,将重复的代码进行封装。 145 data b; 146 input a1 a2 a3; 147 cards; NOTE: The data set WORK.B has 1 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds 149 ; 150 run; //以上为数据集生成 151 152 %macro varx; 153 %do i=0 %to 2; 154 proc sql ; 155 select a%eval(&i.+1) into:a%eval(&i.+1) 156 from b; 157 quit; 158 %let j=%eval(&i.+1); 159 data _null_; 160 a= &&a&j.; 161 put a; 162 run; 163 %end ; 164 %mend varx; 165 %varx; MPRINT(VARX): proc sql ; SYMBOLGEN: Macro variable I resolves to 0 SYMBOLGEN: Macro variable I resolves to 0 MPRINT(VARX): select a1 into:a1 from b; MPRINT(VARX): quit; //从proc sql到这里就是宏varx编译生成的一段proc sql代码,这部分提交给sas运行。 NOTE: PROCEDURE SQL used (Total process time): real time 0.00 seconds cpu time 0.00 seconds SYMBOLGEN: Macro variable I resolves to 0 MPRINT(VARX): data _null_; SYMBOLGEN: && resolves to &. SYMBOLGEN: Macro variable J resolves to 1 SYMBOLGEN: Macro variable A1 resolves to 1 MPRINT(VARX): a= 1; MPRINT(VARX): put a; MPRINT(VARX): run; //从data步到这一行是宏varx编译生成的第二段代码data步,这段代码中宏变量I被编译成了0,j直接被计算成了1.这段代码也同样展示了select into生成的宏变量在其他程序中调用的执行顺序。 1 NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds MPRINT(VARX): proc sql ; SYMBOLGEN: Macro variable I resolves to 1 SYMBOLGEN: Macro variable I resolves to 1 MPRINT(VARX): select a2 into:a2 from b; MPRINT(VARX): quit; //重复proc sql NOTE: PROCEDURE SQL used (Total process time): real time 0.00 seconds cpu time 0.00 seconds SYMBOLGEN: Macro variable I resolves to 1 MPRINT(VARX): data _null_; SYMBOLGEN: && resolves to &. SYMBOLGEN: Macro variable J resolves to 2 SYMBOLGEN: Macro variable A2 resolves to 2 MPRINT(VARX): a= 2; MPRINT(VARX): put a; MPRINT(VARX): run; //重复data 2 NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds MPRINT(VARX): proc sql ; SYMBOLGEN: Macro variable I resolves to 2 SYMBOLGEN: Macro variable I resolves to 2 MPRINT(VARX): select a3 into:a3 from b; MPRINT(VARX): quit; //重复proc sql NOTE: PROCEDURE SQL used (Total process time): real time 0.01 seconds cpu time 0.01 seconds SYMBOLGEN: Macro variable I resolves to 2 MPRINT(VARX): data _null_; SYMBOLGEN: && resolves to &. SYMBOLGEN: Macro variable J resolves to 3 SYMBOLGEN: Macro variable A3 resolves to 3 MPRINT(VARX): a= 3; MPRINT(VARX): put a; MPRINT(VARX): run; //重复data 3 NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds */
未完,后续预告call execute(http://bbs.pinggu.org/thread-2377205-1-1.html 此链接的讲解非常透彻,同时call excute可以调用宏,而且讲解了在使用宏和宏变量时单引号和双引号的处理)。
在Python和R语言中,有一个for循环的功能非常好用,但是在sas中没有类似的功能。鉴于sas data步中的pdv是一行一行的读取执行的,so可以结合call execute来实现for循环功能:
以下代码实现的功能是将三行数据生成三个数据集,数据集的名称是dataset列的值。
data a;
input A B dataset $;
datalines;
1 2 ds_1
6 7 ds_2a
2 3 ds_100c
;
run;
data _null_;
set a;
call execute('data ' ||dataset||';'||'set a (firstobs=' ||_n_|| ' obs=' || _n_ || ');run;' );
run;
/* or */
data _null_;
set a;
call execute('data ' || dataset || ';' || 'set a ; if dataset= "' || dataset || '"; run ;' );
run;