Advanced Uses of the Colon Modifier [Reposted]

In SAS, the colon (:) can be used in conjunction with all of the comparison operators (=, >, <, >=, <=, ne, in) to compare prefixes, so that only the leading characters of the values are compared.

Suppose a dataset contains the following surnames:

  • Bevan

  • Bosley

  • Bowen

  • Burden

  • Bush

The following data step statements use the colon modifier:

  

if surname =: 'B' then ...
 

will find all 5 surnames.

 

if surname =: 'Bo' then ...
 

will find Bosley and Bowen only.

 

if surname >=: 'Bo' then ...
 

will find Bosley, Bowen, Burden and Bush only.

 

if surname <: 'Bo' then ...
 

will find Bevan only.

 

if surname ne: 'Be' then ...
 

will find Bosley, Bowen, Burden and Bush only.

 

if surname in: ('Be','Bu') then ...
 

will find Bevan, Burden and Bush only.
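
The comparisons above can be tried together in one short program. This is a minimal sketch: the dataset name surnames and the flag variable names are illustrative assumptions, not part of the original examples. Each flag is 1 when the condition is met and 0 otherwise, so the log shows which surnames each comparison selects.

```sas
/* Build a small test dataset holding the five surnames. */
data surnames;
  input surname $;
  datalines;
Bevan
Bosley
Bowen
Burden
Bush
;
run;

/* Evaluate each colon-modified comparison and write the flags to the log. */
data _null_;
  set surnames;
  starts_b  = (surname =: 'B');            /* begins with B           */
  starts_bo = (surname =: 'Bo');           /* begins with Bo          */
  ge_bo     = (surname >=: 'Bo');          /* prefix sorts at/after Bo */
  lt_bo     = (surname <: 'Bo');           /* prefix sorts before Bo   */
  not_be    = (surname ne: 'Be');          /* does not begin with Be   */
  in_be_bu  = (surname in: ('Be', 'Bu'));  /* begins with Be or Bu     */
  put surname= starts_b= starts_bo= ge_bo= lt_bo= not_be= in_be_bu=;
run;
```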

 

Using the colon modifier with equals (=:) is commonly referred to as ‘begins with’. BEWARE! What SAS actually does is make the two character strings the same length before comparing them: it truncates the longer string to the length of the shorter string during the comparison.

 

So, given the list of surnames Tom, Tomlinson and Tomson, the following conditional statement can return both Tom and Tomson:

 

if surname =: 'Tomson' then ...
 

The surprising result here is that surname='Tom' can meet this condition. When the value on the left is only three characters long (for example trim(surname), or a surname variable defined with a length of 3), SAS determines that it is shorter than the right-hand side of the comparison, and so the right-hand side gets truncated to the same length. Bear in mind that a character variable is padded with trailing blanks to its defined length, and those blanks take part in the comparison.
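
A minimal sketch of the truncation pitfall and one possible guard against it. It uses trim() so that each value is compared at its actual length, without trailing blanks. The variable names loose and strict, and the explicit length check, are illustrative assumptions rather than part of the original example.

```sas
data _null_;
  length surname $9;
  do surname = 'Tom', 'Tomlinson', 'Tomson';
    /* trim() removes trailing blanks, so 'Tom' is compared as a      */
    /* 3-character value and 'Tomson' is truncated to 'Tom' to match. */
    loose = (trim(surname) =: 'Tomson');

    /* Guard: also require the value to be at least as long as the    */
    /* 6-character prefix, so a shorter value can no longer match.    */
    strict = (length(surname) >= 6 and trim(surname) =: 'Tomson');

    put surname= loose= strict=;
  end;
run;
```

With this guard in place, only values that really begin with the full prefix should be flagged by strict, while loose also flags the shorter value Tom.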

 

A zero-length character value compared using either in: or =: will always evaluate to 0, or false.

 

The colon modifier can also be used as a variable name wildcard. For example, if you have a dataset with many variables sharing the same prefix but with different suffixes (e.g. balance_012008, balance_022008, balance_032008), the colon modifier can save a lot of typing.

 

data subset;
set bigdataset (keep=balance:);
...processing statements…
run;


This will keep all the variables in ‘bigdataset’ whose names begin with ‘balance’.

This feature can be used in many ways, such as:

 

sum (of balance:) 
 

will sum all the variables beginning with ‘balance’

 

array bal (*) balance:; 
 

creates an array for all ‘balance’ variables

 

drop balance:; 
 

drops all variables beginning with ‘balance’. 
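
The prefix-list uses above can be combined in a single step. This is a rough sketch under stated assumptions: the sample values, the id and other variables, and the output names summary, total, n_months and highest are all made up for illustration.

```sas
/* Hypothetical input with three monthly balance variables. */
data bigdataset;
  id = 1;
  balance_012008 = 100;
  balance_022008 = 150;
  balance_032008 = 125;
  other = 9;
run;

data summary;
  /* Read only id and the variables whose names begin with balance. */
  set bigdataset (keep=id balance:);

  /* Group every balance variable into one array. */
  array bal (*) balance:;

  /* Total across all balance variables. */
  total = sum(of balance:);

  /* Number of balance variables, and the largest single value. */
  n_months = dim(bal);
  highest = bal(1);
  do i = 2 to dim(bal);
    if bal(i) > highest then highest = bal(i);
  end;

  /* Keep only the derived values in the output dataset. */
  drop balance: i;
  put id= total= n_months= highest=;
run;
```

Note that the derived variables deliberately avoid the balance prefix; a new variable named, say, balance_total would itself be removed by drop balance:.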

posted @ 2013-12-12 16:33 by 寒秋绝月