Disruptor Tuning Options

Translated from the Disruptor documentation on GitHub: https://github.com/LMAX-Exchange/disruptor/wiki/Getting-Started

 

Basic Tuning Options

Using the above approach will work functionally in the widest set of deployment scenarios. However, if you are able to make certain assumptions about the hardware and software environment that the Disruptor will run in, then you can take advantage of a number of tuning options to improve performance. There are two main options for tuning: single vs. multiple producers, and alternative wait strategies.

Single vs. Multiple Producers

One of the best ways to improve performance in concurrent systems is to adhere to the Single Writer Principle, and this applies to the Disruptor. If you are in the situation where there will only ever be a single thread producing events into the Disruptor, then you can take advantage of this to gain additional performance.

public class LongEventMain
{
    public static void main(String[] args) throws Exception
    {
        //.....

        // Construct the Disruptor with a SingleProducerSequencer
        Disruptor<LongEvent> disruptor = new Disruptor<>(factory,
                                                         bufferSize,
                                                         executor,
                                                         ProducerType.SINGLE, // Single producer
                                                         new BlockingWaitStrategy());

        //.....
    }
}

To give an indication of how much of a performance advantage can be achieved through this technique, we can change the producer type in the OneToOne performance test. Tests were run on an i7 Sandy Bridge MacBook Air. (Translator's note: i.e. ProducerType.SINGLE vs. ProducerType.MULTI.)

Multiple Producer

Run 0, Disruptor=26,553,372 ops/sec

Run 1, Disruptor=28,727,377 ops/sec

Run 2, Disruptor=29,806,259 ops/sec

Run 3, Disruptor=29,717,682 ops/sec

Run 4, Disruptor=28,818,443 ops/sec

Run 5, Disruptor=29,103,608 ops/sec

Run 6, Disruptor=29,239,766 ops/sec

Single Producer

Run 0, Disruptor=89,365,504 ops/sec

Run 1, Disruptor=77,579,519 ops/sec

Run 2, Disruptor=78,678,206 ops/sec

Run 3, Disruptor=80,840,743 ops/sec

Run 4, Disruptor=81,037,277 ops/sec

Run 5, Disruptor=81,168,831 ops/sec

Run 6, Disruptor=81,699,346 ops/sec
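A large part of the gap above comes from how a slot is claimed: a single producer can advance its sequence with a plain write, while multiple producers must coordinate through atomic read-modify-write operations. The following is an illustrative sketch of that cost difference, not the actual OneToOne benchmark; the class and method names are mine, and timings will vary by machine:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: an uncontended plain counter (what a single
// writer can use) vs. an atomic counter (what coordinating producers
// need). Even with no contention, the atomic form is more expensive.
public class SingleWriterSketch {
    // Single-writer style: one thread owns the counter, plain increments.
    static long singleWriter(long n) {
        long counter = 0;
        for (long i = 0; i < n; i++) {
            counter++;
        }
        return counter;
    }

    // Multi-producer style: every increment is an atomic RMW operation.
    static long atomicWriter(long n) {
        AtomicLong counter = new AtomicLong();
        for (long i = 0; i < n; i++) {
            counter.incrementAndGet();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        long n = 50_000_000L;
        long t0 = System.nanoTime();
        singleWriter(n);
        long t1 = System.nanoTime();
        atomicWriter(n);
        long t2 = System.nanoTime();
        System.out.printf("plain: %d ms, atomic: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
```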

Alternative Wait Strategies

BlockingWaitStrategy

The default wait strategy used by the Disruptor is the BlockingWaitStrategy. Internally the BlockingWaitStrategy uses a typical lock and condition variable to handle thread wake-up. The BlockingWaitStrategy is the slowest of the available wait strategies, but is the most conservative with respect to CPU usage, and will give the most consistent behaviour across the widest variety of deployment options. However, again, knowledge of the deployed system can allow for additional performance.
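The lock-and-condition-variable mechanism described above can be sketched in plain Java. This is a simplified illustration, not the actual BlockingWaitStrategy source; the class and method names are mine:

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Simplified sketch of a blocking wait: the consumer parks on a
// condition variable until the cursor reaches its sequence, and the
// producer must take the lock and signal on every publish.
public class BlockingWaitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition available = lock.newCondition();
    private long cursor = -1;

    public void publish(long sequence) {
        lock.lock();
        try {
            cursor = sequence;
            available.signalAll(); // the producer pays the signalling cost
        } finally {
            lock.unlock();
        }
    }

    public long waitFor(long sequence) {
        lock.lock();
        try {
            while (cursor < sequence) {
                try {
                    available.await(); // consumer sleeps, burning no CPU
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return cursor;
        } finally {
            lock.unlock();
        }
    }
}
```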

SleepingWaitStrategy

Like the BlockingWaitStrategy, the SleepingWaitStrategy attempts to be conservative with CPU usage by using a simple busy-wait loop, but with a call to LockSupport.parkNanos(1) in the middle of the loop. On a typical Linux system this will pause the thread for around 60μs. It has the benefit that the producing thread does not need to take any action other than incrementing the appropriate counter, and does not incur the cost of signalling a condition variable. However, the mean latency of moving an event between the producer and consumer threads will be higher. It works best in situations where low latency is not required but a low impact on the producing thread is desired. A common use case is asynchronous logging.
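The sleeping loop can be sketched as below; again a simplified illustration with names of my own, not the library source:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

// Sketch of the sleeping wait loop: the consumer spins on the published
// cursor and parks briefly on each pass. The producer only moves the
// counter forward; there is no lock and nothing to signal.
public class SleepingWaitSketch {
    private final AtomicLong cursor = new AtomicLong(-1);

    public void publish(long sequence) {
        cursor.set(sequence); // a plain ordered write, no wake-up needed
    }

    public long waitFor(long sequence) {
        while (cursor.get() < sequence) {
            // parkNanos(1) typically pauses around 60us on Linux,
            // trading latency for much lower CPU usage than a hard spin.
            LockSupport.parkNanos(1);
        }
        return cursor.get();
    }
}
```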

YieldingWaitStrategy

The YieldingWaitStrategy is one of two wait strategies that can be used in low-latency systems, where there is the option to burn CPU cycles with the goal of improving latency. The YieldingWaitStrategy will busy-spin, waiting for the sequence to increment to the appropriate value. Inside the body of the loop, Thread.yield() is called to allow other queued threads to run. This is the recommended wait strategy when you need very high performance and the number of event handler threads is less than the total number of logical cores, e.g. you have hyper-threading enabled. (Translator's note: hyper-threading is an Intel CPU technology that exposes two logical threads per physical core; for example, a 4-core CPU with hyper-threading presents 8 logical cores.)
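The yielding spin can be sketched like this (a simplified illustration with my own names, not the library source):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the yielding busy-spin: wait in a tight loop, but call
// Thread.yield() on each pass so other runnable threads get the core.
public class YieldingWaitSketch {
    private final AtomicLong cursor = new AtomicLong(-1);

    public void publish(long sequence) {
        cursor.set(sequence);
    }

    public long waitFor(long sequence) {
        while (cursor.get() < sequence) {
            Thread.yield(); // low latency, but CPU usage stays high
        }
        return cursor.get();
    }
}
```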

BusySpinWaitStrategy

The BusySpinWaitStrategy is the highest-performing wait strategy, but it puts the highest constraints on the deployment environment. This wait strategy should only be used if the number of event handler threads is smaller than the number of physical cores on the box, e.g. hyper-threading should be disabled.
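For contrast with the previous sketch, a pure busy-spin drops even the yield (again an illustration with my own names; Thread.onSpinWait() is a JDK 9+ spin hint, not something the original text mentions):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a pure busy-spin: the tightest possible wait loop, with no
// yield and no park. Only sensible when the spinning thread can keep a
// physical core to itself.
public class BusySpinSketch {
    private final AtomicLong cursor = new AtomicLong(-1);

    public void publish(long sequence) {
        cursor.set(sequence);
    }

    public long waitFor(long sequence) {
        while (cursor.get() < sequence) {
            Thread.onSpinWait(); // JDK 9+ hint to the CPU that we are spinning
        }
        return cursor.get();
    }
}
```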

Clearing Objects From the Ring Buffer

When passing data via the Disruptor, it is possible for objects to live longer than intended. To avoid this, it may be necessary to clear out the event after processing it. If you have a single event handler, clearing out the value within that handler is sufficient. If you have a chain of event handlers, then you may need a specific handler placed at the end of the chain to handle clearing out the object.

class ObjectEvent<T>
{
    T val;

    void clear()
    {
        val = null;
    }
}

public class ClearingEventHandler<T> implements EventHandler<ObjectEvent<T>>
{
    public void onEvent(ObjectEvent<T> event, long sequence, boolean endOfBatch)
    {
        // Failing to call clear here will result in the
        // object associated with the event living until
        // it is overwritten once the ring buffer has wrapped
        // around to the beginning.
        event.clear(); 
    }
}

public static void main(String[] args)
{
    Disruptor<ObjectEvent<String>> disruptor = new Disruptor<>(
        () -> new ObjectEvent<String>(), bufferSize, DaemonThreadFactory.INSTANCE);

    disruptor
        .handleEventsWith(new ProcessingEventHandler())
        .then(new ClearingEventHandler<String>());
}

 

posted on 2016-01-13 13:47 by 肥兔子爱豆畜子
