Variable block lengths -- block randomization

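One of our projects had a sample size that is not a multiple of 4. Searching Baidu and CNKI turned up no ready-made 'wheel', so we tried to build our own. What follows is the conclusion several statisticians reached after discussion; criticism and corrections are welcome.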

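First, ordinary block randomization uses proc plan, and nobody disputes that. However, the factors statement can only give every block the same length: for 24 subjects, for example, factors block=6 length=4.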

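Cite: internally, SAS proc plan first randomizes the 6 blocks and then generates 4 random numbers (a random permutation of 1 to 4) for each block. The SAS documentation describes this for an example with factors of 4 and 3 levels:

(The procedure first generates a random permutation of the integers 1 to 4 and then, for each of these, generates a random permutation of the integers 1 to 3. You can think of factor Two as being nested within factor One, where the levels of factor One are to be randomly assigned to 4 units.)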
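
For reference, the fixed-length case described above could be sketched as follows. This is only a minimal illustration, not the project's production code: the seed, the factor name pos, the data set name fixed_plan and the 1:1 two-arm coding in nvals are all assumptions made for the example, and block is declared ordered here so that the block numbers stay in sequence.

```sas
/* Sketch only: 24 subjects, 6 blocks of fixed length 4, two arms 1:1.  */
/* Seed, factor names and data set name are illustrative assumptions.   */
proc plan seed=2020;
   /* blocks stay in order; within each block the levels 1..4 are randomly permuted */
   factors block=6 ordered pos=4 random / noprint;
   /* recode the permuted levels: 1,2 -> arm 1 and 3,4 -> arm 2 */
   output out=fixed_plan pos nvals=(1 1 2 2);
run;

proc print data=fixed_plan;
run;
```

Because every block must have the same number of levels, this approach only covers sample sizes that are a multiple of the block length, which is the limitation the rest of this post works around.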

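The logic of this example is instead based on the idea of complete randomization: number the subjects id = 1, 2, 3, 4, 5, ..., give each one a random number rand = uniform(seed), and let the ranks of those random numbers decide the assignment. See the code further down.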
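
To make the complete-randomization idea concrete before blocks are introduced, here is a small sketch for 30 subjects. It is an illustration under assumptions: the data set names, the variable names u and r, the seed and the one-letter group labels are made up for the example, and it uses call streaminit with rand('uniform') rather than uniform(seed).

```sas
/* Sketch only: complete randomization of 30 subjects into two arms of 15. */
data cr;
   call streaminit(2020);          /* hypothetical seed */
   do id = 1 to 30;
      u = rand('uniform');         /* one random number per subject */
      output;
   end;
run;

/* rank the random numbers over all subjects */
proc rank data=cr out=cr_ranked;
   var u;
   ranks r;
run;

/* the 15 smallest random numbers go to A, the rest to B */
data cr_alloc;
   set cr_ranked;
   length g $1;
   if r <= 15 then g = 'A';
   else g = 'B';
run;
```

This balances the two arms overall but not within any stretch of consecutive subjects, which is why the approach in this post applies the same ranking idea inside each block.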

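Example: if the sample size is 30, the logic is:

1. Set the block lengths (length) to 4 4 4 4 4 4 6.
2. Randomly reorder the lengths, e.g. 4 4 4 4 6 4 4.
3. Create block_ID: for example, the four subjects in the second block get 24 24 24 24, and the six subjects in the fifth block get 56 56 56 56 56 56 (this step makes the by statement of proc rank convenient later on).
4. Give every subject a random number.
5. Rank within each block and assign the groups.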

The code is as follows:

data core;
input blockl blockID;
cards;
4 1
4 2
4 3
4 4
4 5
4 6
6 7
;/* make sure sum(blockl)=30, the sample size */
run;

%macro rd_blk;

/* first, put the blocks in random order */
data random;
set core;
rand=uniform(2020);
output;
run;

proc rank data=random out=rank;
var rand;
ranks r_rank;
run;

proc sort data=rank out=result;
by r_rank;
run;

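/* step 3 above: expand each block to one row per subject and build block_id */
data core1;
set result;
lent=blockl;
/* block_id = block position followed by block length, e.g. 24 or 56; */
/* cats() avoids the numeric-to-character conversion notes of compress() */
block_id=input(cats(r_rank,lent),8.);
do i=1 to blockl;
output;
end;
run;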

data random1;
set core1;
rand1=uniform(111);
run;

proc rank data=random1 out=rank1;
by block_id ;
var rand1;
ranks r_rank1;
run;

%mend rd_blk;
%rd_blk;

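data result1(keep=blockID r_rank1 SUBJid g);
set rank1;
retain SUBJid 100;
SUBJid+1; /* subject numbers run 101, 102, ... */
/* id = block length followed by the within-block rank, e.g. 41 or 63 */
id=input(cats(lent,r_rank1),8.);
if id in (41 42 61 62 63) then g='A(T-R)';
if id in (43 44 64 65 66) then g='B(R-T)';
label blockID='Block number' r_rank1='Random number (rank within block)' SUBJid='Subject number' g='Group';
run;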

/* g holds the full labels, so the where clauses must match them exactly */
proc sql;
select count(*) as numA from result1
where g='A(T-R)';
quit;

proc sql;
select count(*) as numB from result1
where g='B(R-T)';
quit;

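The parts you need to customize are: the block lengths, whose sum must equal the sample size; the seeds; and the final group assignment. For example, if blockl=8: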

    if id in (81 82 83 84) then g='A';

    if id in (85 86 87 88) then g='B';
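
One way to avoid editing the id lists for every new block length is to compare the within-block rank with half the block length. The sketch below is a variant under assumptions, not the code we actually ran: it assumes two arms with 1:1 allocation and even block lengths, and the names blocks, subjects, ranked, alloc, block_seq, u, u1 and pos, as well as the use of call streaminit with rand('uniform'), are choices made for this example.

```sas
/* Sketch only: variable block lengths without hard-coded id lists. */
data blocks;
   if _n_ = 1 then call streaminit(2020);   /* hypothetical seed */
   input blockl @@;                         /* lengths must sum to the sample size */
   u = rand('uniform');                     /* random key used to shuffle the blocks */
   cards;
4 4 4 4 4 4 6
;
run;

proc sort data=blocks;
   by u;
run;

/* one row per subject; block_seq is the block's position after shuffling */
data subjects;
   if _n_ = 1 then call streaminit(111);    /* hypothetical seed */
   set blocks;
   block_seq = _n_;
   do i = 1 to blockl;
      u1 = rand('uniform');
      output;
   end;
   drop i u;
run;

/* rank the random numbers within each block */
proc rank data=subjects out=ranked;
   by block_seq;
   var u1;
   ranks pos;
run;

/* lower half of the ranks in each block -> A, upper half -> B */
data alloc;
   set ranked;
   length g $6;
   subjid = 100 + _n_;
   if pos <= blockl / 2 then g = 'A(T-R)';
   else g = 'B(R-T)';
run;

/* check the balance inside every block */
proc freq data=alloc;
   tables block_seq * g / norow nocol nopercent;
run;
```

With this rule only the list of block lengths and the two seeds change from study to study, and the proc freq table shows directly whether each block is split evenly.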

posted on 2020-08-17 13:17 be·freedom