Sphinx: investigating searches that use several indexes at once

2014-02-15 11:24:34

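Conclusions:

1. When one query searches several indexes at once, the fields entry of the returned result set has no meaning I could pin down (and I found no documentation that explains it). In this experiment it simply mirrored the full-text field of whichever index was listed first.

2. If one search in the program uses several indexes and the attributes declared in their configuration (sql_attr_uint, sql_field_string, ...) are not all the same, the final result set contains only the attributes that all of those indexes share.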
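Conclusion 2 can be modelled as a plain set intersection over attribute names. The following is only a sketch of the observed behaviour, not of how Sphinx implements it; sharedAttributes() is a hypothetical helper, and the attribute lists are the ones declared for the two indexes in this experiment.

```php
<?php
// Hypothetical helper, not part of the Sphinx API: models the observed
// behaviour by keeping only attribute names present in every index.
function sharedAttributes(array $schemas): array
{
    if ($schemas === []) {
        return [];
    }
    // Start from the first index's attributes and intersect with the rest.
    $shared = array_shift($schemas);
    foreach ($schemas as $attrs) {
        $shared = array_intersect($shared, $attrs);
    }
    // Reindex so the result is a clean 0-based list.
    return array_values($shared);
}

// Attribute lists as declared for the two indexes in this experiment.
$schemas = [
    'goods_brand' => ['siteid', 'goodsid', 'brandname'],
    'goods_cate'  => ['siteid', 'goodsid', 'catename'],
];

print_r(sharedAttributes($schemas)); // leaves siteid and goodsid
```

brandname and catename drop out because each exists in only one of the two lists, which matches what the multi-index dumps show.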

Experiment:

Build two indexes, goods_brand and goods_cate, which hold product info + brand info and product info + category info respectively.

    # source for goods_cate (label inferred from the view name)
    sql_query = select gid, gid as goodsid, siteid, catename from v_goods_info_cate
    sql_attr_uint = siteid
    sql_attr_uint = goodsid
    sql_field_string = catename

    #######################

    # source for goods_brand (label inferred from the view name)
    sql_query = select gid, gid as goodsid, siteid, brandname from v_goods_info_brand
    sql_attr_uint = siteid
    sql_attr_uint = goodsid
    sql_field_string = brandname

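Notes:

1. brandname is the product's brand name; catename is the product's category name.

2. Because they are declared with sql_field_string, brandname and catename are both full-text indexed and returned as attribute values.

3. siteid and goodsid exist in both indexes; brandname and catename each exist only in their own index.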

Test program code

1 $sphObj->AddQuery($keyword, 'goods_brand');
2 $sphObj->AddQuery($keyword, 'goods_cate');
3 $sphObj->AddQuery($keyword, 'goods_cate, goods_brand');
4 $sphObj->AddQuery($keyword, 'goods_brand,goods_cate');
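5 $rs = $sphObj->RunQueries(); // runs the queued query; this call is not shown in the original listing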
6 var_dump($rs[0]['fields'], $rs[0]['words'], $rs[0]['matches']);
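For reference, the batching pattern behind the listing above can be sketched as a small function. This is an assumption-laden sketch rather than the original test program: searchIndexLists() is a hypothetical name, and $client only needs the AddQuery() and RunQueries() methods that the SphinxClient class from sphinxapi.php provides.

```php
<?php
// Sketch only: queues one query per index list and runs them as one batch.
// $client is expected to expose AddQuery() and RunQueries(), as the
// SphinxClient class from sphinxapi.php does.
function searchIndexLists($client, string $keyword, array $indexLists): array
{
    foreach ($indexLists as $indexes) {
        // $indexes is a string such as 'goods_brand' or 'goods_brand,goods_cate'.
        $client->AddQuery($keyword, $indexes);
    }
    $results = $client->RunQueries();
    if ($results === false) {
        // RunQueries() reports a failed batch by returning false.
        return [];
    }
    // Pair each result set with the index list it was queried against.
    return array_combine($indexLists, $results);
}

// Usage against a real server (host and port are assumptions):
// require 'sphinxapi.php';
// $cl = new SphinxClient();
// $cl->SetServer('localhost', 9312);
// $rs = searchIndexLists($cl, '机', ['goods_brand', 'goods_cate', 'goods_brand,goods_cate']);
// var_dump($rs['goods_brand,goods_cate']['fields']);
```

Keying the results by index list makes it easier to compare the single-index and multi-index dumps side by side; the index lists passed in must be distinct, otherwise later entries overwrite earlier ones.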

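Note:

The program is set up so that a search for the character "机" returns only two matching records from each of the goods_cate and goods_brand indexes (four records in total):

1. Run lines 1 and 2 of the test code separately, printing the result with line 6 each time: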

//goods_brand
array (size=1)
  0 => string 'brandname' (length=9)

array (size=1)
  '机' => 
    array (size=2)
      'docs' => string '10049' (length=5)
      'hits' => string '10049' (length=5)

array (size=2)
  0 => 
    array (size=3)
      'id' => string '157978' (length=6)
      'weight' => string '1' (length=1)
      'attrs' => 
        array (size=3)
          'goodsid' => string '157978' (length=6)
          'siteid' => string '102' (length=3)
          'brandname' => string '无锡一机' (length=12)
  1 => 
    array (size=3)
      'id' => string '157980' (length=6)
      'weight' => string '1' (length=1)
      'attrs' => 
        array (size=3)
          'goodsid' => string '157980' (length=6)
          'siteid' => string '102' (length=3)
          'brandname' => string '无锡一机' (length=12)

//goods_cate
array (size=1)
  0 => string 'catename' (length=8)

array (size=1)
  '机' => 
    array (size=2)
      'docs' => string '43986' (length=5)
      'hits' => string '43986' (length=5)

array (size=2)
  0 => 
    array (size=3)
      'id' => string '158010' (length=6)
      'weight' => string '1' (length=1)
      'attrs' => 
        array (size=3)
          'goodsid' => string '158010' (length=6)
          'siteid' => string '102' (length=3)
          'catename' => string '磨齿机' (length=9)
  1 => 
    array (size=3)
      'id' => string '158014' (length=6)
      'weight' => string '1' (length=1)
      'attrs' => 
        array (size=3)
          'goodsid' => string '158014' (length=6)
          'siteid' => string '102' (length=3)
          'catename' => string '旋压机' (length=9)

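Notes:

When each index is used on its own, it returns its two records (four records in total).

The 'attrs' array of each match holds goodsid + siteid + brandname, or goodsid + siteid + catename.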

 

2. When a single query uses several indexes: run lines 3 and 4 of the test code separately, printing the result with line 6 each time:

 1 //goods_brand first, goods_cate second
 2 array (size=1)
 3   0 => string 'brandname' (length=9)
 4 
 5 array (size=1)
 6   '机' => 
 7     array (size=2)
 8       'docs' => string '54035' (length=5)
 9       'hits' => string '54035' (length=5)
10 
11 array (size=4)
12   0 => 
13     array (size=3)
14       'id' => string '157978' (length=6)
15       'weight' => string '1' (length=1)
16       'attrs' => 
17         array (size=2)
18           'goodsid' => string '157978' (length=6)
19           'siteid' => string '102' (length=3)
20   1 => 
21     array (size=3)
22       'id' => string '157980' (length=6)
23       'weight' => string '1' (length=1)
24       'attrs' => 
25         array (size=2)
26           'goodsid' => string '157980' (length=6)
27           'siteid' => string '102' (length=3)
28   2 => 
29     array (size=3)
30       'id' => string '158010' (length=6)
31       'weight' => string '1' (length=1)
32       'attrs' => 
33         array (size=2)
34           'goodsid' => string '158010' (length=6)
35           'siteid' => string '102' (length=3)
36   3 => 
37     array (size=3)
38       'id' => string '158014' (length=6)
39       'weight' => string '1' (length=1)
40       'attrs' => 
41         array (size=2)
42           'goodsid' => string '158014' (length=6)
43           'siteid' => string '102' (length=3)
44 
45 //goods_cate first, goods_brand second
46 array (size=1)
47   0 => string 'catename' (length=8)
48 
49 array (size=1)
50   '机' => 
51     array (size=2)
52       'docs' => string '54035' (length=5)
53       'hits' => string '54035' (length=5)
54 
55 array (size=4)
56   0 => 
57     array (size=3)
58       'id' => string '157978' (length=6)
59       'weight' => string '1' (length=1)
60       'attrs' => 
61         array (size=2)
62           'goodsid' => string '157978' (length=6)
63           'siteid' => string '102' (length=3)
64   1 => 
65     array (size=3)
66       'id' => string '157980' (length=6)
67       'weight' => string '1' (length=1)
68       'attrs' => 
69         array (size=2)
70           'goodsid' => string '157980' (length=6)
71           'siteid' => string '102' (length=3)
72   2 => 
73     array (size=3)
74       'id' => string '158010' (length=6)
75       'weight' => string '1' (length=1)
76       'attrs' => 
77         array (size=2)
78           'goodsid' => string '158010' (length=6)
79           'siteid' => string '102' (length=3)
80   3 => 
81     array (size=3)
82       'id' => string '158014' (length=6)
83       'weight' => string '1' (length=1)
84       'attrs' => 
85         array (size=2)
86           'goodsid' => string '158014' (length=6)
87           'siteid' => string '102' (length=3)

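Notes:

When both indexes are used together and only their order differs, all four records are returned (which is the correct result).

However, the key/value pairs on lines 3 and 47 of the output above show that the fields value in the result set has no special meaning (at least none that I know of; perhaps it just follows the full-text field of the index listed first? It surely means something, I simply have not worked out what).

The output also shows that each match's 'attrs' array contains only the goodsid and siteid key/value pairs; brandname and catename are no longer returned.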
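To check which attributes actually come back without reading a long var_dump, a small helper can collect the attribute names from a result set. This is a sketch under the assumption that the result has the shape shown in the dumps above; returnedAttributes() is a hypothetical name, and the sample data is copied from two of the matches in this post.

```php
<?php
// Hypothetical helper: lists every attribute name that appears in the
// 'attrs' array of at least one match of a single result set.
function returnedAttributes(array $result): array
{
    $names = [];
    foreach ($result['matches'] ?? [] as $match) {
        foreach (array_keys($match['attrs'] ?? []) as $name) {
            // Use the name as a key so duplicates collapse automatically.
            $names[$name] = true;
        }
    }
    return array_keys($names);
}

// Two matches copied from the multi-index dump in this post.
$result = [
    'fields'  => ['brandname'],
    'matches' => [
        ['id' => '157978', 'weight' => '1', 'attrs' => ['goodsid' => '157978', 'siteid' => '102']],
        ['id' => '158010', 'weight' => '1', 'attrs' => ['goodsid' => '158010', 'siteid' => '102']],
    ],
];

print_r(returnedAttributes($result)); // goodsid and siteid only
```

The helper iterates over the values of 'matches', so it does not matter whether the matches are keyed by position, as in these dumps, or by document id.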

 

 

 

posted @ 2014-02-15 13:46  myD