当A基数很小时,select B from tbl order by A limit 10查询结果每次不一样
查询常见问题 | StarRocks https://docs.starrocks.io/zh/docs/faq/Sql_faq/#%E5%BD%93a%E5%9F%BA%E6%95%B0%E5%BE%88%E5%B0%8F%E6%97%B6select-b-from-tbl-order-by-a-limit-10%E6%9F%A5%E8%AF%A2%E7%BB%93%E6%9E%9C%E6%AF%8F%E6%AC%A1%E4%B8%8D%E4%B8%80%E6%A0%B7
当A基数很小时,select B from tbl order by A limit 10查询结果每次不一样
解决方案:
使用select B from tbl order by A,B limit 10
,将B也进行排序就能保证结果一致。
问题原因
上面的SQL只能保证A是有序的,并不能保证每次查询出来的B顺序是一致的,MySQL能保证这点因为它是单机数据库,而StarRocks是分布式数据库,底层表数据存储是sharding的。A的数据分布在多台机器上,每次查询多台机器返回的顺序可能不同,就会导致每次B顺序不一致。
When cardinality of column A is small, the query results of select B from tbl order by A limit 10
vary each time
SQL can only guarantee that column A is ordered, and it cannot guarantee that the order of column B is the same for each query. MySQL can guarantee the order of column A and column B because it is a standalone database.
StarRocks is a distributed database, of which data stored in the underlying table is in a sharding pattern. The data of column A is distributed across multiple machines, so the order of column B returned by multiple machines may be different for each query, resulting in inconsistent order of B each time. To solve this problem, change select B from tbl order by A limit 10
to select B from tbl order by A,B limit 10
.