Sphinx: an experiment in searching several indexes in one query
2014-02-15 11:24:34
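Conclusions:
1. When a single query searches several indexes, the fields entry of the returned result set has no clear meaning (and I could not find any documentation explaining it). In this experiment it only listed the full-text field of whichever index was named first.
2. If a single search uses several indexes whose attribute declarations in the configuration file (sql_attr_uint, sql_field_string, ...) are not all identical, the final result set contains only the attributes that those indexes have in common.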
Experiment:
Build two indexes, goods_brand and goods_cate, holding product info + brand info and product info + category info respectively.
    # source for goods_cate
    sql_query = select gid, gid as goodsid, siteid, catename from v_goods_info_cate
    sql_attr_uint = siteid
    sql_attr_uint = goodsid
    sql_field_string = catename

    #######################

    # source for goods_brand
    sql_query = select gid, gid as goodsid, siteid, brandname from v_goods_info_brand
    sql_attr_uint = siteid
    sql_attr_uint = goodsid
    sql_field_string = brandname
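Notes:
1. brandname is the product's brand name; catename is the product's category name.
2. Because they are declared with sql_field_string, brandname and catename are both full-text indexed and returned as attribute values.
3. siteid and goodsid exist in both indexes; brandname and catename exist only in their own index.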
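The attribute layout described in these notes can be written down as plain data, which makes it easy to see which attributes two indexes share. The sketch below is only a model of the behavior observed in this experiment, not a description of how Sphinx works internally; commonAttributes() is a hypothetical helper name, and the attribute lists are copied by hand from the configuration above.

```php
<?php
// Attribute names per index, copied by hand from the two source configurations.
$schemas = [
    'goods_cate'  => ['siteid', 'goodsid', 'catename'],
    'goods_brand' => ['siteid', 'goodsid', 'brandname'],
];

/**
 * Hypothetical helper: return the attribute names shared by every index
 * in $indexes, in the order they appear in the first index listed.
 */
function commonAttributes(array $schemas, array $indexes): array
{
    if (count($indexes) === 0) {
        return [];
    }
    $lists = [];
    foreach ($indexes as $name) {
        $lists[] = $schemas[$name];
    }
    // With a single index there is nothing to intersect.
    if (count($lists) === 1) {
        return array_values($lists[0]);
    }
    // array_intersect() keeps the keys of the first array, so reindex.
    return array_values(array_intersect(...$lists));
}

print_r(commonAttributes($schemas, ['goods_brand', 'goods_cate']));
```

For the two indexes in this experiment the shared attributes are siteid and goodsid, which matches what the combined queries return.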
Test program code
    1 $sphObj->AddQuery($keyword, 'goods_brand');
    2 $sphObj->AddQuery($keyword, 'goods_cate');
    3 $sphObj->AddQuery($keyword, 'goods_cate, goods_brand');
    4 $sphObj->AddQuery($keyword, 'goods_brand,goods_cate');
    5 $rs = $sphObj->RunQueries(); // assumed: the original listing did not show how $rs was obtained
    6 var_dump($rs[0]['fields'], $rs[0]['words'], $rs[0]['matches']);
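Notes:
The program is constrained so that a search for the character "机" returns only two matching records from each of the goods_cate and goods_brand indexes (four records in total).
1. Run line 1 and line 2 of the test code separately (only one AddQuery call enabled at a time) and print the result with line 6: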
    //goods_brand
    array (size=1)
      0 => string 'brandname' (length=9)

    array (size=1)
      '机' =>
        array (size=2)
          'docs' => string '10049' (length=5)
          'hits' => string '10049' (length=5)

    array (size=2)
      0 =>
        array (size=3)
          'id' => string '157978' (length=6)
          'weight' => string '1' (length=1)
          'attrs' =>
            array (size=3)
              'goodsid' => string '157978' (length=6)
              'siteid' => string '102' (length=3)
              'brandname' => string '无锡一机' (length=12)
      1 =>
        array (size=3)
          'id' => string '157980' (length=6)
          'weight' => string '1' (length=1)
          'attrs' =>
            array (size=3)
              'goodsid' => string '157980' (length=6)
              'siteid' => string '102' (length=3)
              'brandname' => string '无锡一机' (length=12)

    //goods_cate
    array (size=1)
      0 => string 'catename' (length=8)

    array (size=1)
      '机' =>
        array (size=2)
          'docs' => string '43986' (length=5)
          'hits' => string '43986' (length=5)

    array (size=2)
      0 =>
        array (size=3)
          'id' => string '158010' (length=6)
          'weight' => string '1' (length=1)
          'attrs' =>
            array (size=3)
              'goodsid' => string '158010' (length=6)
              'siteid' => string '102' (length=3)
              'catename' => string '磨齿机' (length=9)
      1 =>
        array (size=3)
          'id' => string '158014' (length=6)
          'weight' => string '1' (length=1)
          'attrs' =>
            array (size=3)
              'goodsid' => string '158014' (length=6)
              'siteid' => string '102' (length=3)
              'catename' => string '旋压机' (length=9)
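Notes:
When each index is used on its own, it returns its two records (four records in total).
The 'attrs' of every match contains goodsid + siteid + brandname, or goodsid + siteid + catename.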
2. When one query uses several indexes: run line 3 and line 4 separately and print the result with line 6:
    1  //goods_brand first, goods_cate second
    2  array (size=1)
    3    0 => string 'brandname' (length=9)
    4
    5  array (size=1)
    6    '机' =>
    7      array (size=2)
    8        'docs' => string '54035' (length=5)
    9        'hits' => string '54035' (length=5)
    10
    11 array (size=4)
    12   0 =>
    13     array (size=3)
    14       'id' => string '157978' (length=6)
    15       'weight' => string '1' (length=1)
    16       'attrs' =>
    17         array (size=2)
    18           'goodsid' => string '157978' (length=6)
    19           'siteid' => string '102' (length=3)
    20   1 =>
    21     array (size=3)
    22       'id' => string '157980' (length=6)
    23       'weight' => string '1' (length=1)
    24       'attrs' =>
    25         array (size=2)
    26           'goodsid' => string '157980' (length=6)
    27           'siteid' => string '102' (length=3)
    28   2 =>
    29     array (size=3)
    30       'id' => string '158010' (length=6)
    31       'weight' => string '1' (length=1)
    32       'attrs' =>
    33         array (size=2)
    34           'goodsid' => string '158010' (length=6)
    35           'siteid' => string '102' (length=3)
    36   3 =>
    37     array (size=3)
    38       'id' => string '158014' (length=6)
    39       'weight' => string '1' (length=1)
    40       'attrs' =>
    41         array (size=2)
    42           'goodsid' => string '158014' (length=6)
    43           'siteid' => string '102' (length=3)
    44
    45 //goods_cate first, goods_brand second
    46 array (size=1)
    47   0 => string 'catename' (length=8)
    48
    49 array (size=1)
    50   '机' =>
    51     array (size=2)
    52       'docs' => string '54035' (length=5)
    53       'hits' => string '54035' (length=5)
    54
    55 array (size=4)
    56   0 =>
    57     array (size=3)
    58       'id' => string '157978' (length=6)
    59       'weight' => string '1' (length=1)
    60       'attrs' =>
    61         array (size=2)
    62           'goodsid' => string '157978' (length=6)
    63           'siteid' => string '102' (length=3)
    64   1 =>
    65     array (size=3)
    66       'id' => string '157980' (length=6)
    67       'weight' => string '1' (length=1)
    68       'attrs' =>
    69         array (size=2)
    70           'goodsid' => string '157980' (length=6)
    71           'siteid' => string '102' (length=3)
    72   2 =>
    73     array (size=3)
    74       'id' => string '158010' (length=6)
    75       'weight' => string '1' (length=1)
    76       'attrs' =>
    77         array (size=2)
    78           'goodsid' => string '158010' (length=6)
    79           'siteid' => string '102' (length=3)
    80   3 =>
    81     array (size=3)
    82       'id' => string '158014' (length=6)
    83       'weight' => string '1' (length=1)
    84       'attrs' =>
    85         array (size=2)
    86           'goodsid' => string '158014' (length=6)
    87           'siteid' => string '102' (length=3)
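Notes:
When both indexes are used together and only their order differs, all four records are returned (which is the correct result).
However, the key/value pairs on lines 3 and 47 of the output ('brandname' and 'catename') show that the fields value in the returned result set has no particular meaning, at least none that I know of. Does it simply follow the full-text field of the index listed first? It surely means something; I just have not worked it out.
Also, the output shows that the 'attrs' array of every match contains only the goodsid and siteid key/value pairs, that is, the attributes common to both indexes; brandname and catename are missing.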
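If the index-specific attributes are still needed, one possible workaround (an assumption on my part, not something tested against searchd here) is to run one query per index and combine the matches in PHP. The sketch below works on arrays shaped like the dumps above; mergeMatchesById() is a hypothetical helper, and the sample data is trimmed from the single-index results shown earlier.

```php
<?php
/**
 * Hypothetical helper: merge the 'matches' of several single-index result
 * sets into one array keyed by document id, keeping every attribute seen.
 */
function mergeMatchesById(array $resultSets): array
{
    $merged = [];
    foreach ($resultSets as $result) {
        foreach ($result['matches'] ?? [] as $match) {
            $id = $match['id'];
            // Array union keeps attributes already stored and adds new ones.
            $merged[$id] = ($merged[$id] ?? []) + $match['attrs'];
        }
    }
    return $merged;
}

// Sample data trimmed from the single-index dumps in this post.
$brand = ['matches' => [
    ['id' => '157978', 'attrs' => ['goodsid' => '157978', 'siteid' => '102', 'brandname' => '无锡一机']],
    ['id' => '157980', 'attrs' => ['goodsid' => '157980', 'siteid' => '102', 'brandname' => '无锡一机']],
]];
$cate = ['matches' => [
    ['id' => '158010', 'attrs' => ['goodsid' => '158010', 'siteid' => '102', 'catename' => '磨齿机']],
    ['id' => '158014', 'attrs' => ['goodsid' => '158014', 'siteid' => '102', 'catename' => '旋压机']],
]];

$merged = mergeMatchesById([$brand, $cate]);
print_r($merged);
```

The trade-off is that weights from separate queries are not merged or re-ranked here, so any ordering across indexes would have to be handled by the caller.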