Linked Server 3：SQL Server 分布式数据库性能测试

我在三台安装SQL Server 2012的服务器上搭建分布式数据库，把产品环境中一年近1.4亿条数据大致均匀地存储在这三台服务器中，每台Server 存储4个月的数据，物理机的系统配置基本相同：内存16G，双核 CPU 3.6GHz，软件环境是Windows Server 2012 R，和SQL Server 2012。

1，创建水平分区视图

基础表是dbo.Commits，每个基础表大致存储4个月的数据，近5000万条记录：

CREATE TABLE [dbo].[Commits]
(
    [CommitID] [bigint] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
    [AuthorID] [bigint] NOT NULL,
    [CreatedDate] [datetime2](7) NOT NULL,
    [CreatedDateKey] [int] NOT NULL,
 CONSTRAINT [PK__Commits_CommitID] PRIMARY KEY CLUSTERED 
(
    [CommitID] ASC,
    [CreatedDateKey] ASC
)
)

View Code

创建分区视图，Linked Server的Alias是db2 和 db3，Catalog 是 tdw（test data warehouse）：

CREATE view [dbo].[view_commits]
as

select [CommitID]
      ,[AuthorID]
      ,[CreatedDate]
      ,[CreatedDateKey]
from dbo.commits c with(nolock)
where c.[CreatedDateKey] between 20150900 and 20160000

union ALL
select [CommitID]
      ,[AuthorID]
      ,[CreatedDate]
      ,[CreatedDateKey]
from db3.tdw.dbo.commits c with(nolock)
where c.[CreatedDateKey] between 20150000 and 20150500

union ALL
select [CommitID]
      ,[AuthorID]
      ,[CreatedDate]
      ,[CreatedDateKey]
from db2.tdw.dbo.commits c with(nolock)
where c.[CreatedDateKey] between 20150500 and 20150900
WITH check OPTION;
GO

View Code

2，查询性能测试

Test1，在基础表上测试，基础表是全部的数据，cost：79s

select count(0)
from dbo.commits_total c  with(nolock)
where day(c.[CreatedDate])=1

Test2，使用分区视图测试，cost=134s，比Test1的查询性能明显降低。

select count(0)
from dbo.view_commits c  with(nolock)
where day(c.[CreatedDate])=1

3，使用OpenQuery查询

OpenQuery把查询语句直接发送到Linked Server上执行，返回查询的结果，cost:105s，还是很高，相对提高20%的性能。

select sum(t.cnt) as cnt
from
(
    select count(0) as cnt
    from dbo.commits c  with(nolock)
    where day(c.[CreatedDate])=1

    UNION all
    select p.cnt
    from openquery(db2,
    N'select count(0) as cnt
    from dbo.commits c  with(nolock)
    where day(c.[CreatedDate])=1') as p

    UNION all
    select p.cnt
    from openquery(db3,
    N'select count(0) as cnt
    from dbo.commits c  with(nolock)
    where day(c.[CreatedDate])=1') as p
) as t

View Code

4，使用C# 多线程编程

创建三个Task同时运行在三台Server上，Cost：28s

static void Main(string[] args)
{
    List<Task> tasks = new List<Task>();
    int c1=0, c2=0, c3=0;

    Task t1 = new Task(()=> 
    {
        c1= GetCount("xxx");
    });

    Task t2 = new Task(() =>
    {
        c2=  GetCount("xxx");
    });

    Task t3 = new Task(() =>
    {
        c3= GetCount("xxx");
    });
    tasks.Add(t1);
    tasks.Add(t2);
    tasks.Add(t3);

    Stopwatch sw = new Stopwatch();
    sw.Start(); 
    t1.Start();
    t2.Start();
    t3.Start();

    Task.WaitAll(tasks.ToArray());

    int sum = c1 + c2 + c3;
    sw.Stop();

    Console.Read();
}

static int GetCount(string str)
{
    using (SqlConnection con = new SqlConnection(str))
    {
        con.Open();
        var cmd = con.CreateCommand();
        cmd.CommandText = @" select count(0) as cnt
        from dbo.commits c  with(nolock)
        where day(c.[CreatedDate]) = 1";
        int count = (int)cmd.ExecuteScalar();
        con.Close();
        return count;
    }
}

View Code

5，结论

将数据水平切分，分布式部署在不同的SQL Server上，其查询性能并不一定比单一DB性能更好。
使用OpenQuery函数将查询语句在Remote Server上执行，返回查询结果，能够优化Linked Server 的查询性能。
在使用分布式数据库查询数据时，针对特定的应用，编写特定的代码，这需要fore-end 更多的参与。

参考doc：

Top 3 Performance Killers For Linked Server Queries

[翻译]——SQL Server使用链接服务器的5个性能杀手

posted @ 2016-06-19 19:34 悦光阴阅读(2036) 评论(0) 收藏举报

刷新页面返回顶部

悦光阴

请不要假装很努力，因为结果不会陪你演戏。

Linked Server 3：SQL Server 分布式数据库性能测试

公告