Java 8: Parallel Computing

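Java 8 includes the fork/join framework (introduced in Java SE 7). It is an implementation of the ExecutorService interface that helps you take full advantage of a multi-core machine. It is designed for work that can be broken recursively into smaller tasks. These smaller tasks can run in parallel on different processors, with the goal of improving the application's performance.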

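The core of the fork/join framework is the ForkJoinPool class, an extension of the AbstractExecutorService class. ForkJoinPool implements the core work-stealing algorithm and can execute ForkJoinTask tasks. The basic usage follows this pseudocode: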

if (my portion of the work is small enough)
  do the work directly
else
  split my work into two pieces
  invoke the two pieces and wait for the results

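Wrap this logic in a ForkJoinTask subclass, typically one of its more specialized subclasses: RecursiveTask (which can return a result) or RecursiveAction. Once the subclass is ready, create an object that represents all of the work to be done and pass it to the invoke() method of a ForkJoinPool instance.

The official documentation gives an example that blurs an image, which is a good starting point for understanding fork/join. The image is converted into an array of integers in which each value represents one pixel, and each destination pixel is computed by averaging the source pixels around it. Such an array is very large, so to speed things up the work is split into smaller ranges of the array that are processed in parallel. Each subtask writes its results into a shared destination array, which is then turned back into the blurred image. The code is as follows: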

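import java.awt.image.BufferedImage;
import java.io.File;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import javax.imageio.ImageIO;

class ForkBlur extends RecursiveAction {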

    private int[] mSource;
    private int mStart;
    private int mLength;
    private int[] mDestination;
    private int mBlurWidth = 15; // Processing window size, should be odd.

    public ForkBlur(int[] src, int start, int length, int[] dst) {
        mSource = src;
        mStart = start;
        mLength = length;
        mDestination = dst;
    }

    // Average the pixels from the source and write the results into the destination.
    protected void computeDirectly() {
        int sidePixels = (mBlurWidth - 1) / 2;
        for (int index = mStart; index < mStart + mLength; index++) {
            // Calculate the average.
            float rt = 0, gt = 0, bt = 0;
            for (int mi = -sidePixels; mi <= sidePixels; mi++) {
                int mindex = Math.min(Math.max(mi + index, 0), mSource.length - 1);
                int pixel = mSource[mindex];
                rt += (float) ((pixel & 0x00ff0000) >> 16) / mBlurWidth;
                gt += (float) ((pixel & 0x0000ff00) >> 8) / mBlurWidth;
                bt += (float) ((pixel & 0x000000ff) >> 0) / mBlurWidth;
            }

            // Reassemble the destination pixel.
            int dpixel = (0xff000000)
                    | (((int) rt) << 16)
                    | (((int) gt) << 8)
                    | (((int) bt) << 0);
            mDestination[index] = dpixel;
        }
    }
    // Threshold below which a range is computed directly instead of being split.
    protected static int sThreshold = 10000;
    
    @Override
    protected void compute() {
        // If the range is shorter than the threshold, compute it directly.
        if (mLength < sThreshold) {
            computeDirectly();
            return;
        }
        // Otherwise split the range in half.
        int split = mLength / 2;
        // Run both halves in parallel and wait for them to complete.
        invokeAll(new ForkBlur(mSource, mStart, split, mDestination),
                new ForkBlur(mSource, mStart + split, mLength - split,
                        mDestination));
    }

    // Plumbing follows.
    public static void main(String[] args) throws Exception {
        String srcName = "C:\\Users\\Administrator\\Desktop\\信息文件\\研究院工作\\个人信息\\aaa.jpg";
        File srcFile = new File(srcName);
        BufferedImage image = ImageIO.read(srcFile);

        System.out.println("Source image: " + srcName);

        BufferedImage blurredImage = blur(image);

        String dstName = "C:\\Users\\Administrator\\Desktop\\信息文件\\研究院工作\\个人信息\\bbb.jpg";
        File dstFile = new File(dstName);
        ImageIO.write(blurredImage, "jpg", dstFile);

        System.out.println("Output image: " + dstName);

    }

    public static BufferedImage blur(BufferedImage srcImage) {
        int w = srcImage.getWidth();
        int h = srcImage.getHeight();

        int[] src = srcImage.getRGB(0, 0, w, h, null, 0, w);
        int[] dst = new int[src.length];

        System.out.println("Array size is " + src.length);
        System.out.println("Threshold is " + sThreshold);

        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println(Integer.toString(processors) + " processor"
                + (processors != 1 ? "s are " : " is ")
                + "available");

        // Create the task that represents all of the work.
        ForkBlur fb = new ForkBlur(src, 0, src.length, dst);
        // Create the pool that will run the task.
        ForkJoinPool pool = new ForkJoinPool();

        long startTime = System.currentTimeMillis();
        // Run the blur.
        pool.invoke(fb);
        long endTime = System.currentTimeMillis();

        System.out.println("Image blur took " + (endTime - startTime) +
                " milliseconds.");

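        // JPEG has no alpha channel, so build an opaque RGB image;
        // ImageIO.write may fail to encode an ARGB image as "jpg".
        BufferedImage dstImage =
                new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);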
        dstImage.setRGB(0, 0, w, h, dst, 0, w);

        return dstImage;
    }
}


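In the code above, the ForkBlur task extends the RecursiveAction abstract class. Running it takes three steps:

1. Create a task that represents all of the work to be done: ForkBlur fb = new ForkBlur(...)

2. Create the ForkJoinPool that will run the task: ForkJoinPool pool = new ForkJoinPool();

3. Run the task: pool.invoke(fb)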
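The same three steps apply to a task that returns a value. Below is a minimal sketch, assuming a hypothetical SumTask class (it is not part of the official example) that extends RecursiveTask<Long> and sums a long array. The THRESHOLD of 1,000 elements is an arbitrary assumption, not a tuned value.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums the range [start, end) of a long[] with fork/join.
class SumTask extends RecursiveTask<Long> {
    // Assumed cutoff: ranges at or below this size are summed sequentially.
    private static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int start; // inclusive
    private final int end;   // exclusive

    SumTask(long[] data, int start, int end) {
        this.data = data;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Long compute() {
        // Small enough: do the work directly.
        if (end - start <= THRESHOLD) {
            long sum = 0;
            for (int i = start; i < end; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Otherwise split the range into two halves.
        int mid = start + (end - start) / 2;
        SumTask left = new SumTask(data, start, mid);
        SumTask right = new SumTask(data, mid, end);
        left.fork();                      // schedule the left half asynchronously
        long rightSum = right.compute();  // compute the right half in this thread
        return left.join() + rightSum;    // wait for the left half and combine
    }

    static long parallelSum(long[] data) {
        SumTask task = new SumTask(data, 0, data.length); // step 1: create the task
        ForkJoinPool pool = new ForkJoinPool();           // step 2: create the pool
        try {
            return pool.invoke(task);                     // step 3: run it and get the result
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling SumTask.parallelSum on an array holding 1 through 10,000 returns 50005000. Forking one half and computing the other in the current thread keeps that worker busy instead of leaving it idle while it waits.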

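Parallel execution is also widely used with streams. The idea is that the Java runtime partitions a stream into multiple substreams, aggregate operations iterate over and process these substreams in parallel, and the partial results are then combined into one. To create a parallel stream, call the parallelStream() method: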

double average = roster
    .parallelStream()
    .filter(p -> p.getGender() == Person.Sex.MALE)
    .mapToInt(Person::getAge)
    .average()
    .getAsDouble();
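To try the pipeline above in isolation, here is a minimal sketch. The Person class is a hypothetical stand-in with only the two accessors the pipeline needs, and sampleRoster() is made-up test data. It also swaps getAsDouble() for orElse(Double.NaN), an assumption on my part so that an empty result does not throw NoSuchElementException.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ParallelAverageDemo {
    // Hypothetical minimal stand-in for the Person class used in the snippet above.
    static class Person {
        enum Sex { MALE, FEMALE }

        private final Sex gender;
        private final int age;

        Person(Sex gender, int age) {
            this.gender = gender;
            this.age = age;
        }

        Sex getGender() { return gender; }
        int getAge() { return age; }
    }

    // Made-up sample data for illustration only.
    static List<Person> sampleRoster() {
        return Arrays.asList(
            new Person(Person.Sex.MALE, 20),
            new Person(Person.Sex.FEMALE, 50),
            new Person(Person.Sex.MALE, 30));
    }

    // Average age of the male members, computed on a parallel stream.
    // Returns NaN instead of throwing when no element passes the filter.
    static double averageMaleAge(List<Person> roster) {
        return roster
            .parallelStream()
            .filter(p -> p.getGender() == Person.Sex.MALE)
            .mapToInt(Person::getAge)
            .average()
            .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.println(averageMaleAge(sampleRoster()));
        System.out.println(averageMaleAge(Collections.<Person>emptyList()));
    }
}
```

The only difference from the serial version is parallelStream() in place of stream(); the filter, map, and average steps are unchanged.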

Next, concurrent reduction in Java. As covered previously, grouping people by gender can be written as the following reduction:

Map<Person.Sex, List<Person>> byGender =
    roster
        .stream()
        .collect(
            Collectors.groupingBy(Person::getGender));

The following is the parallel equivalent, which groups the same elements but returns a ConcurrentMap:

ConcurrentMap<Person.Sex, List<Person>> byGender =
    roster
        .parallelStream()
        .collect(
            Collectors.groupingByConcurrent(Person::getGender));

This is called a concurrent reduction. A pipeline that contains the collect operation performs a concurrent reduction if all of the following conditions are true:

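1. The stream is parallel.

2. The parameter of the collect operation, the collector, has the characteristic Collector.Characteristics.CONCURRENT. To find out the characteristics of a collector, call the Collector.characteristics method.

3. Either the stream is unordered, or the collector has the characteristic Collector.Characteristics.UNORDERED. To make sure that the stream is unordered, call the BaseStream.unordered operation.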
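The conditions above can be checked in code. The sketch below is illustrative only: the class name, the parity-based grouping, and the helper isConcurrentAndUnordered are all assumptions of mine. It inspects Collector.characteristics() for the collectors returned by groupingByConcurrent and groupingBy, then runs a concurrent reduction on a parallel stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ConcurrentReductionDemo {
    // Groups integers by "is even" into a ConcurrentMap.
    static final Collector<Integer, ?, ConcurrentMap<Boolean, List<Integer>>> CONCURRENT_BY_PARITY =
        Collectors.groupingByConcurrent(n -> n % 2 == 0);

    // The non-concurrent counterpart, for comparison.
    static final Collector<Integer, ?, Map<Boolean, List<Integer>>> PLAIN_BY_PARITY =
        Collectors.groupingBy(n -> n % 2 == 0);

    // Hypothetical helper: true when the collector itself is both CONCURRENT
    // and UNORDERED, i.e. conditions 2 and 3 hold without calling unordered().
    static boolean isConcurrentAndUnordered(Collector<?, ?, ?> collector) {
        Set<Collector.Characteristics> traits = collector.characteristics();
        return traits.contains(Collector.Characteristics.CONCURRENT)
            && traits.contains(Collector.Characteristics.UNORDERED);
    }

    // Condition 1 (parallel stream) plus a concurrent, unordered collector.
    static ConcurrentMap<Boolean, List<Integer>> groupByParity(List<Integer> numbers) {
        return numbers.parallelStream().collect(CONCURRENT_BY_PARITY);
    }

    public static void main(String[] args) {
        System.out.println(isConcurrentAndUnordered(CONCURRENT_BY_PARITY));
        System.out.println(isConcurrentAndUnordered(PLAIN_BY_PARITY));
        System.out.println(groupByParity(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8)));
    }
}
```

Because the collector is unordered, the elements inside each group may appear in a different order from run to run; only the membership of each group is stable.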

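The order in which a pipeline processes the elements of a stream depends on whether the stream is executed in serial or in parallel, on the source of the stream, and on the intermediate operations. For example, the following code prints the elements of a list several times using the forEach operation: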

Integer[] intArray = {1, 2, 3, 4, 5, 6, 7, 8 };
List<Integer> listOfIntegers =
    new ArrayList<>(Arrays.asList(intArray));

System.out.println("listOfIntegers:");
listOfIntegers
    .stream()
    .forEach(e -> System.out.print(e + " "));
System.out.println("");

System.out.println("listOfIntegers sorted in reverse order:");
Comparator<Integer> normal = Integer::compare;
Comparator<Integer> reversed = normal.reversed(); 
Collections.sort(listOfIntegers, reversed);  
listOfIntegers
    .stream()
    .forEach(e -> System.out.print(e + " "));
System.out.println("");
     
System.out.println("Parallel stream:");
listOfIntegers
    .parallelStream()
    .forEach(e -> System.out.print(e + " "));
System.out.println("");
    
System.out.println("Another parallel stream:");
listOfIntegers
    .parallelStream()
    .forEach(e -> System.out.print(e + " "));
System.out.println("");
     
System.out.println("With forEachOrdered:");
listOfIntegers
    .parallelStream()
    .forEachOrdered(e -> System.out.print(e + " "));
System.out.println("");

=========== Output ===========

listOfIntegers:
1 2 3 4 5 6 7 8
listOfIntegers sorted in reverse order:
8 7 6 5 4 3 2 1
Parallel stream:
3 4 1 6 2 5 7 8
Another parallel stream:
6 3 1 5 7 8 4 2
With forEachOrdered:
8 7 6 5 4 3 2 1


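In the code above, the third and fourth pipelines print the elements in an apparently random order that can change from run to run. The fifth pipeline uses forEachOrdered, which processes the elements in the order specified by the source regardless of whether the stream is serial or parallel. With a parallel stream, however, this may give up the benefits of parallelism.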
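The difference can also be observed without reading console output. The following is a small sketch under my own assumptions (the class and method names are hypothetical): it gathers the elements into lists instead of printing them. It also shows one alternative that the text above does not cover: collect(Collectors.toList()) on a parallel stream built from a List keeps the source order, so ordered results do not always require forEachOrdered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class EncounterOrderDemo {
    // forEach on a parallel stream: the action may run on several threads in any
    // order, so the target list must be thread-safe and its order is not guaranteed.
    static List<Integer> viaForEach(List<Integer> source) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEach(out::add);
        return out;
    }

    // forEachOrdered: the action runs one element at a time in the source order,
    // even though the stream is parallel.
    static List<Integer> viaForEachOrdered(List<Integer> source) {
        List<Integer> out = new ArrayList<>();
        source.parallelStream().forEachOrdered(out::add);
        return out;
    }

    // map + collect(toList()): the mapping can run in parallel, and the resulting
    // list still follows the source order.
    static List<Integer> timesTen(List<Integer> source) {
        return source.parallelStream()
            .map(n -> n * 10)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(8, 7, 6, 5, 4, 3, 2, 1);
        System.out.println("forEach:        " + viaForEach(source));
        System.out.println("forEachOrdered: " + viaForEachOrdered(source));
        System.out.println("collect:        " + timesTen(source));
    }
}
```

Only the first line of output is expected to vary between runs; the other two follow the order of the source list.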

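These are small examples of parallel data processing in Java. Deciding when parallel processing is actually worth using is a question in its own right. There is always more to learn, so don't underestimate anything...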

posted on 2022-02-20 16:36 Judy518