谈谈stream的运行原理
害,别误会,我这里说的stream不是流式编程,不是大数据处理框架。我这里说的是stream指的是jdk中的一个开发工具包stream. 该工具包在jdk8中出现,可以说已经是冷饭了,为何还要你说?只因各家一言,不算得自家理解,如若有空,何多听一版又何妨。
本篇主要从几个方面讲讲:1. 我们常见的stream都有哪些? 2. stream包有哪些好处? 3. stream包的实现原理? 相信这些多少会解开大家的一些迷惑。
1. 我们常见的stream都有哪些?
stream直接翻译为流。何谓流?我们最常见的,比如网络中的数据传输,即tcp/udp那一套东西,都是建立在二进制流的基础上的。用流来形容这些数据或文件的传输,非常形象,因为数据总是源源不断地从一端流向另一端,这是不流是什么。只是,传输到另一端之后,我们再做解析,便有了数据或文件之说。其实这说的,便是高层协议了。
另说一个stream, 那就是jdk中的各种InputStream了,它用于读取文件数据,读取byte数据,其实也是源源不断将数据从一个设备流入到另一设备。jdk中有InputStream/OutputStream, 作为根基,其上则是各种 FileInputStream, FileOutputStream, FileReader, FileWriter,... 实际上,整个io包几乎都是在围绕流这个概念来展开的。可见,io是相当的重要啊。
再说一stream, 则是对大数据的处理了,stream,即是实时数据处理的重要技术实现,因与实时二字吻合,恰好又类似于数据从一设备流入另一设备,且是实时的。所以,stream在大数据领域也是大放异彩啊!比如 spark, flink 你可知?比如 图数据库语言标准 gremlin 的算子。
还有更多的流概念,更多的流实现,不必细说,也无法细说。单只知道,流无处不在,非常重要。
还有本文要议的stream包,到底是何生物,且看后续说来。
2. stream包有何好处?
stream包,在java中是以一个工具包的形式存在,即你用则以,不用亦可。
那么,用它到底有何好处?好处主要有二: 1.可以减少冗余代码的编写;比如要写一个过滤器则只需调用一filter()传入处理逻辑即可; 2.可以很方便的利用一些隐藏的升级好处或者多核带来的好处;(当然你可能用不上这些好处)
说实话,这两个功能,看起来实际没有太多的诱惑力,但凡我们封装几个方法,供外部调用,不也可以达到同等效果?是了!如果你有这等造诣,能够抽象出足够通用的方法,供各方使用,那你不算大牛何人算?说到底,stream也就是高手封装的工具包而已。
来几个应用实例,看看stream都如何使用的:
public class StreamUtilTest { @Test public void testArrayStream() { // 1. 过滤值;改变值;排序; Integer[] intArr = {1, 2, 3, 5, 22, 8, 5}; List<Integer> iArrList = Arrays.stream(intArr) .filter(r -> r < 20) .map(r -> r + 1) .sorted().collect(Collectors.toList()); System.out.println("result:" + iArrList); String[] strArr = {"a1,a2", "q,y,h", "ddd,bb,n", null}; // 2. 过滤数组;拆分值;输出; Arrays.stream(strArr).filter(Objects::nonNull) .flatMap(r -> Arrays.stream(r.split(","))) .forEach(System.out::println); } @Test public void testListStream() { List<String> list = new ArrayList<>(); list.add("ab"); list.add("ccc"); list.add("ddd"); // 3. 求list中的最大值 Optional<String> maxStr = list.stream().max(Comparator.naturalOrder()); System.out.println(maxStr); } }
害,不必纠结里面干的事情复不复杂,有没有意义,只知道有这用法即可。 反正就当你会这么用,即能解决这般问题。这也是我们高级语言使用必备技能,学会调用api.
不过需要说明的,java中有一句老话,叫做万事万物皆对象。 但见上面的写法,自然不太像对象。是了,这是lamda语法,虽说另一主题,但何妨在此处一题。但既然说到这,不妨来想想这lamda到底是何物?从某种角度来说,它可以看作是一种内部类,不过写法不太一样。但是当我们仔细观察class文件的变化情况时,发现它与内部类又不太一致,因为java的内部类会在class中生成$xx.class的类文件,而lamda表达式却不会。但是不管怎么样,它是可以使用内部类的表达方式获得同样的效果,只需将该类代入到其中,即可达到同样的效果。
但要细说lamda表达式,则可以反编译下class文件,可以见些许端倪。
# 调用lamda表达式示例 ... 59: invokestatic #4 // Method java/util/Arrays.stream:([Ljava/lang/Object;)Ljava/util/stream/Stream; 62: invokedynamic #5, 0 // InvokeDynamic #0:test:()Ljava/util/function/Predicate; 67: invokeinterface #6, 2 // InterfaceMethod java/util/stream/Stream.filter:(Ljava/util/function/Predicate;)Ljava/ut il/stream/Stream; 72: invokedynamic #7, 0 // InvokeDynamic #1:apply:()Ljava/util/function/Function; 77: invokeinterface #8, 2 // InterfaceMethod java/util/stream/Stream.map:(Ljava/util/function/Function;)Ljava/util/s tream/Stream; ... # 常量池定义,实际是定义了lamda的实现方式为 #0 号方法 #5 = InvokeDynamic #0:#102 // #0:test:()Ljava/util/function/Predicate; # lamda表达式的具体实现1示例 BootstrapMethods: 0: #98 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lan g/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite ; Method arguments: #99 (Ljava/lang/Object;)Z // 此处为调用具体的实现方法 #100 invokestatic com/my/test/common/util/StreamUtilTest.lambda$testArrayStream$0:(Ljava/lang/Integer;)Z #101 (Ljava/lang/Integer;)Z 1: #98 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite; Method arguments: #105 (Ljava/lang/Object;)Ljava/lang/Object; #106 invokestatic com/my/test/common/util/StreamUtilTest.lambda$testArrayStream$1:(Ljava/lang/Integer;)Ljava/lang/Integer; #107 (Ljava/lang/Integer;)Ljava/lang/Integer; # lamda表达式具体实现2, 上一步的静态调用 private static boolean lambda$testArrayStream$0(java.lang.Integer); descriptor: (Ljava/lang/Integer;)Z flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC Code: stack=2, locals=1, args_size=1 0: aload_0 1: invokevirtual #48 // Method java/lang/Integer.intValue:()I 4: bipush 20 6: if_icmpge 13 9: iconst_1 10: goto 14 13: iconst_0 14: ireturn LineNumberTable: line 16: 0 LocalVariableTable: Start Length Slot Name Signature 0 15 0 r Ljava/lang/Integer; StackMapTable: number_of_entries = 2 frame_type = 13 /* same */ frame_type = 64 /* same_locals_1_stack_item */ stack = [ int ] MethodParameters: Name Flags r synthetic
害,往深了就不说了。单说这lamda表达式,并非使用内部类来实现的,而是使用内部静态函数来做的,所以也叫函数式编程呢。烦话休提。
最后,再来看看,这stream包究竟有何神圣地方?其实,就是一个以一个 Stream 接口定义为核心展开的,且看如下:
/** * A sequence of elements supporting sequential and parallel aggregate * operations. The following example illustrates an aggregate operation using * {@link Stream} and {@link IntStream}: * * <pre>{@code * int sum = widgets.stream() * .filter(w -> w.getColor() == RED) * .mapToInt(w -> w.getWeight()) * .sum(); * }</pre> * * In this example, {@code widgets} is a {@code Collection<Widget>}. We create * a stream of {@code Widget} objects via {@link Collection#stream Collection.stream()}, * filter it to produce a stream containing only the red widgets, and then * transform it into a stream of {@code int} values representing the weight of * each red widget. Then this stream is summed to produce a total weight. * * <p>In addition to {@code Stream}, which is a stream of object references, * there are primitive specializations for {@link IntStream}, {@link LongStream}, * and {@link DoubleStream}, all of which are referred to as "streams" and * conform to the characteristics and restrictions described here. * * <p>To perform a computation, stream * <a href="package-summary.html#StreamOps">operations</a> are composed into a * <em>stream pipeline</em>. A stream pipeline consists of a source (which * might be an array, a collection, a generator function, an I/O channel, * etc), zero or more <em>intermediate operations</em> (which transform a * stream into another stream, such as {@link Stream#filter(Predicate)}), and a * <em>terminal operation</em> (which produces a result or side-effect, such * as {@link Stream#count()} or {@link Stream#forEach(Consumer)}). * Streams are lazy; computation on the source data is only performed when the * terminal operation is initiated, and source elements are consumed only * as needed. * * <p>Collections and streams, while bearing some superficial similarities, * have different goals. Collections are primarily concerned with the efficient * management of, and access to, their elements. By contrast, streams do not * provide a means to directly access or manipulate their elements, and are * instead concerned with declaratively describing their source and the * computational operations which will be performed in aggregate on that source. * However, if the provided stream operations do not offer the desired * functionality, the {@link #iterator()} and {@link #spliterator()} operations * can be used to perform a controlled traversal. * * <p>A stream pipeline, like the "widgets" example above, can be viewed as * a <em>query</em> on the stream source. Unless the source was explicitly * designed for concurrent modification (such as a {@link ConcurrentHashMap}), * unpredictable or erroneous behavior may result from modifying the stream * source while it is being queried. * * <p>Most stream operations accept parameters that describe user-specified * behavior, such as the lambda expression {@code w -> w.getWeight()} passed to * {@code mapToInt} in the example above. To preserve correct behavior, * these <em>behavioral parameters</em>: * <ul> * <li>must be <a href="package-summary.html#NonInterference">non-interfering</a> * (they do not modify the stream source); and</li> * <li>in most cases must be <a href="package-summary.html#Statelessness">stateless</a> * (their result should not depend on any state that might change during execution * of the stream pipeline).</li> * </ul> * * <p>Such parameters are always instances of a * <a href="../function/package-summary.html">functional interface</a> such * as {@link java.util.function.Function}, and are often lambda expressions or * method references. Unless otherwise specified these parameters must be * <em>non-null</em>. * * <p>A stream should be operated on (invoking an intermediate or terminal stream * operation) only once. This rules out, for example, "forked" streams, where * the same source feeds two or more pipelines, or multiple traversals of the * same stream. A stream implementation may throw {@link IllegalStateException} * if it detects that the stream is being reused. However, since some stream * operations may return their receiver rather than a new stream object, it may * not be possible to detect reuse in all cases. * * <p>Streams have a {@link #close()} method and implement {@link AutoCloseable}, * but nearly all stream instances do not actually need to be closed after use. * Generally, only streams whose source is an IO channel (such as those returned * by {@link Files#lines(Path, Charset)}) will require closing. Most streams * are backed by collections, arrays, or generating functions, which require no * special resource management. (If a stream does require closing, it can be * declared as a resource in a {@code try}-with-resources statement.) * * <p>Stream pipelines may execute either sequentially or in * <a href="package-summary.html#Parallelism">parallel</a>. This * execution mode is a property of the stream. Streams are created * with an initial choice of sequential or parallel execution. (For example, * {@link Collection#stream() Collection.stream()} creates a sequential stream, * and {@link Collection#parallelStream() Collection.parallelStream()} creates * a parallel one.) This choice of execution mode may be modified by the * {@link #sequential()} or {@link #parallel()} methods, and may be queried with * the {@link #isParallel()} method. * * @param <T> the type of the stream elements * @since 1.8 * @see IntStream * @see LongStream * @see DoubleStream * @see <a href="package-summary.html">java.util.stream</a> */ public interface Stream<T> extends BaseStream<T, Stream<T>> { /** * Returns a stream consisting of the elements of this stream that match * the given predicate. * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * @param predicate a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * predicate to apply to each element to determine if it * should be included * @return the new stream */ Stream<T> filter(Predicate<? super T> predicate); /** * Returns a stream consisting of the results of applying the given * function to the elements of this stream. * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * @param <R> The element type of the new stream * @param mapper a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function to apply to each element * @return the new stream */ <R> Stream<R> map(Function<? super T, ? extends R> mapper); /** * Returns an {@code IntStream} consisting of the results of applying the * given function to the elements of this stream. * * <p>This is an <a href="package-summary.html#StreamOps"> * intermediate operation</a>. * * @param mapper a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function to apply to each element * @return the new stream */ IntStream mapToInt(ToIntFunction<? super T> mapper); /** * Returns a {@code LongStream} consisting of the results of applying the * given function to the elements of this stream. * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * @param mapper a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function to apply to each element * @return the new stream */ LongStream mapToLong(ToLongFunction<? super T> mapper); /** * Returns a {@code DoubleStream} consisting of the results of applying the * given function to the elements of this stream. * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * @param mapper a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function to apply to each element * @return the new stream */ DoubleStream mapToDouble(ToDoubleFunction<? super T> mapper); /** * Returns a stream consisting of the results of replacing each element of * this stream with the contents of a mapped stream produced by applying * the provided mapping function to each element. Each mapped stream is * {@link java.util.stream.BaseStream#close() closed} after its contents * have been placed into this stream. (If a mapped stream is {@code null} * an empty stream is used, instead.) * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * @apiNote * The {@code flatMap()} operation has the effect of applying a one-to-many * transformation to the elements of the stream, and then flattening the * resulting elements into a new stream. * * <p><b>Examples.</b> * * <p>If {@code orders} is a stream of purchase orders, and each purchase * order contains a collection of line items, then the following produces a * stream containing all the line items in all the orders: * <pre>{@code * orders.flatMap(order -> order.getLineItems().stream())... * }</pre> * * <p>If {@code path} is the path to a file, then the following produces a * stream of the {@code words} contained in that file: * <pre>{@code * Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8); * Stream<String> words = lines.flatMap(line -> Stream.of(line.split(" +"))); * }</pre> * The {@code mapper} function passed to {@code flatMap} splits a line, * using a simple regular expression, into an array of words, and then * creates a stream of words from that array. * * @param <R> The element type of the new stream * @param mapper a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function to apply to each element which produces a stream * of new values * @return the new stream */ <R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper); /** * Returns an {@code IntStream} consisting of the results of replacing each * element of this stream with the contents of a mapped stream produced by * applying the provided mapping function to each element. Each mapped * stream is {@link java.util.stream.BaseStream#close() closed} after its * contents have been placed into this stream. (If a mapped stream is * {@code null} an empty stream is used, instead.) * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * @param mapper a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function to apply to each element which produces a stream * of new values * @return the new stream * @see #flatMap(Function) */ IntStream flatMapToInt(Function<? super T, ? extends IntStream> mapper); /** * Returns an {@code LongStream} consisting of the results of replacing each * element of this stream with the contents of a mapped stream produced by * applying the provided mapping function to each element. Each mapped * stream is {@link java.util.stream.BaseStream#close() closed} after its * contents have been placed into this stream. (If a mapped stream is * {@code null} an empty stream is used, instead.) * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * @param mapper a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function to apply to each element which produces a stream * of new values * @return the new stream * @see #flatMap(Function) */ LongStream flatMapToLong(Function<? super T, ? extends LongStream> mapper); /** * Returns an {@code DoubleStream} consisting of the results of replacing * each element of this stream with the contents of a mapped stream produced * by applying the provided mapping function to each element. Each mapped * stream is {@link java.util.stream.BaseStream#close() closed} after its * contents have placed been into this stream. (If a mapped stream is * {@code null} an empty stream is used, instead.) * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * @param mapper a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function to apply to each element which produces a stream * of new values * @return the new stream * @see #flatMap(Function) */ DoubleStream flatMapToDouble(Function<? super T, ? extends DoubleStream> mapper); /** * Returns a stream consisting of the distinct elements (according to * {@link Object#equals(Object)}) of this stream. * * <p>For ordered streams, the selection of distinct elements is stable * (for duplicated elements, the element appearing first in the encounter * order is preserved.) For unordered streams, no stability guarantees * are made. * * <p>This is a <a href="package-summary.html#StreamOps">stateful * intermediate operation</a>. * * @apiNote * Preserving stability for {@code distinct()} in parallel pipelines is * relatively expensive (requires that the operation act as a full barrier, * with substantial buffering overhead), and stability is often not needed. * Using an unordered stream source (such as {@link #generate(Supplier)}) * or removing the ordering constraint with {@link #unordered()} may result * in significantly more efficient execution for {@code distinct()} in parallel * pipelines, if the semantics of your situation permit. If consistency * with encounter order is required, and you are experiencing poor performance * or memory utilization with {@code distinct()} in parallel pipelines, * switching to sequential execution with {@link #sequential()} may improve * performance. * * @return the new stream */ Stream<T> distinct(); /** * Returns a stream consisting of the elements of this stream, sorted * according to natural order. If the elements of this stream are not * {@code Comparable}, a {@code java.lang.ClassCastException} may be thrown * when the terminal operation is executed. * * <p>For ordered streams, the sort is stable. For unordered streams, no * stability guarantees are made. * * <p>This is a <a href="package-summary.html#StreamOps">stateful * intermediate operation</a>. * * @return the new stream */ Stream<T> sorted(); /** * Returns a stream consisting of the elements of this stream, sorted * according to the provided {@code Comparator}. * * <p>For ordered streams, the sort is stable. For unordered streams, no * stability guarantees are made. * * <p>This is a <a href="package-summary.html#StreamOps">stateful * intermediate operation</a>. * * @param comparator a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * {@code Comparator} to be used to compare stream elements * @return the new stream */ Stream<T> sorted(Comparator<? super T> comparator); /** * Returns a stream consisting of the elements of this stream, additionally * performing the provided action on each element as elements are consumed * from the resulting stream. * * <p>This is an <a href="package-summary.html#StreamOps">intermediate * operation</a>. * * <p>For parallel stream pipelines, the action may be called at * whatever time and in whatever thread the element is made available by the * upstream operation. If the action modifies shared state, * it is responsible for providing the required synchronization. * * @apiNote This method exists mainly to support debugging, where you want * to see the elements as they flow past a certain point in a pipeline: * <pre>{@code * Stream.of("one", "two", "three", "four") * .filter(e -> e.length() > 3) * .peek(e -> System.out.println("Filtered value: " + e)) * .map(String::toUpperCase) * .peek(e -> System.out.println("Mapped value: " + e)) * .collect(Collectors.toList()); * }</pre> * * @param action a <a href="package-summary.html#NonInterference"> * non-interfering</a> action to perform on the elements as * they are consumed from the stream * @return the new stream */ Stream<T> peek(Consumer<? super T> action); /** * Returns a stream consisting of the elements of this stream, truncated * to be no longer than {@code maxSize} in length. * * <p>This is a <a href="package-summary.html#StreamOps">short-circuiting * stateful intermediate operation</a>. * * @apiNote * While {@code limit()} is generally a cheap operation on sequential * stream pipelines, it can be quite expensive on ordered parallel pipelines, * especially for large values of {@code maxSize}, since {@code limit(n)} * is constrained to return not just any <em>n</em> elements, but the * <em>first n</em> elements in the encounter order. Using an unordered * stream source (such as {@link #generate(Supplier)}) or removing the * ordering constraint with {@link #unordered()} may result in significant * speedups of {@code limit()} in parallel pipelines, if the semantics of * your situation permit. If consistency with encounter order is required, * and you are experiencing poor performance or memory utilization with * {@code limit()} in parallel pipelines, switching to sequential execution * with {@link #sequential()} may improve performance. * * @param maxSize the number of elements the stream should be limited to * @return the new stream * @throws IllegalArgumentException if {@code maxSize} is negative */ Stream<T> limit(long maxSize); /** * Returns a stream consisting of the remaining elements of this stream * after discarding the first {@code n} elements of the stream. * If this stream contains fewer than {@code n} elements then an * empty stream will be returned. * * <p>This is a <a href="package-summary.html#StreamOps">stateful * intermediate operation</a>. * * @apiNote * While {@code skip()} is generally a cheap operation on sequential * stream pipelines, it can be quite expensive on ordered parallel pipelines, * especially for large values of {@code n}, since {@code skip(n)} * is constrained to skip not just any <em>n</em> elements, but the * <em>first n</em> elements in the encounter order. Using an unordered * stream source (such as {@link #generate(Supplier)}) or removing the * ordering constraint with {@link #unordered()} may result in significant * speedups of {@code skip()} in parallel pipelines, if the semantics of * your situation permit. If consistency with encounter order is required, * and you are experiencing poor performance or memory utilization with * {@code skip()} in parallel pipelines, switching to sequential execution * with {@link #sequential()} may improve performance. * * @param n the number of leading elements to skip * @return the new stream * @throws IllegalArgumentException if {@code n} is negative */ Stream<T> skip(long n); /** * Performs an action for each element of this stream. * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * <p>The behavior of this operation is explicitly nondeterministic. * For parallel stream pipelines, this operation does <em>not</em> * guarantee to respect the encounter order of the stream, as doing so * would sacrifice the benefit of parallelism. For any given element, the * action may be performed at whatever time and in whatever thread the * library chooses. If the action accesses shared state, it is * responsible for providing the required synchronization. * * @param action a <a href="package-summary.html#NonInterference"> * non-interfering</a> action to perform on the elements */ void forEach(Consumer<? super T> action); /** * Performs an action for each element of this stream, in the encounter * order of the stream if the stream has a defined encounter order. * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * <p>This operation processes the elements one at a time, in encounter * order if one exists. Performing the action for one element * <a href="../concurrent/package-summary.html#MemoryVisibility"><i>happens-before</i></a> * performing the action for subsequent elements, but for any given element, * the action may be performed in whatever thread the library chooses. * * @param action a <a href="package-summary.html#NonInterference"> * non-interfering</a> action to perform on the elements * @see #forEach(Consumer) */ void forEachOrdered(Consumer<? super T> action); /** * Returns an array containing the elements of this stream. * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * @return an array containing the elements of this stream */ Object[] toArray(); /** * Returns an array containing the elements of this stream, using the * provided {@code generator} function to allocate the returned array, as * well as any additional arrays that might be required for a partitioned * execution or for resizing. * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * @apiNote * The generator function takes an integer, which is the size of the * desired array, and produces an array of the desired size. This can be * concisely expressed with an array constructor reference: * <pre>{@code * Person[] men = people.stream() * .filter(p -> p.getGender() == MALE) * .toArray(Person[]::new); * }</pre> * * @param <A> the element type of the resulting array * @param generator a function which produces a new array of the desired * type and the provided length * @return an array containing the elements in this stream * @throws ArrayStoreException if the runtime type of the array returned * from the array generator is not a supertype of the runtime type of every * element in this stream */ <A> A[] toArray(IntFunction<A[]> generator); /** * Performs a <a href="package-summary.html#Reduction">reduction</a> on the * elements of this stream, using the provided identity value and an * <a href="package-summary.html#Associativity">associative</a> * accumulation function, and returns the reduced value. This is equivalent * to: * <pre>{@code * T result = identity; * for (T element : this stream) * result = accumulator.apply(result, element) * return result; * }</pre> * * but is not constrained to execute sequentially. * * <p>The {@code identity} value must be an identity for the accumulator * function. This means that for all {@code t}, * {@code accumulator.apply(identity, t)} is equal to {@code t}. * The {@code accumulator} function must be an * <a href="package-summary.html#Associativity">associative</a> function. * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * @apiNote Sum, min, max, average, and string concatenation are all special * cases of reduction. Summing a stream of numbers can be expressed as: * * <pre>{@code * Integer sum = integers.reduce(0, (a, b) -> a+b); * }</pre> * * or: * * <pre>{@code * Integer sum = integers.reduce(0, Integer::sum); * }</pre> * * <p>While this may seem a more roundabout way to perform an aggregation * compared to simply mutating a running total in a loop, reduction * operations parallelize more gracefully, without needing additional * synchronization and with greatly reduced risk of data races. * * @param identity the identity value for the accumulating function * @param accumulator an <a href="package-summary.html#Associativity">associative</a>, * <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function for combining two values * @return the result of the reduction */ T reduce(T identity, BinaryOperator<T> accumulator); /** * Performs a <a href="package-summary.html#Reduction">reduction</a> on the * elements of this stream, using an * <a href="package-summary.html#Associativity">associative</a> accumulation * function, and returns an {@code Optional} describing the reduced value, * if any. This is equivalent to: * <pre>{@code * boolean foundAny = false; * T result = null; * for (T element : this stream) { * if (!foundAny) { * foundAny = true; * result = element; * } * else * result = accumulator.apply(result, element); * } * return foundAny ? Optional.of(result) : Optional.empty(); * }</pre> * * but is not constrained to execute sequentially. * * <p>The {@code accumulator} function must be an * <a href="package-summary.html#Associativity">associative</a> function. * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * @param accumulator an <a href="package-summary.html#Associativity">associative</a>, * <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function for combining two values * @return an {@link Optional} describing the result of the reduction * @throws NullPointerException if the result of the reduction is null * @see #reduce(Object, BinaryOperator) * @see #min(Comparator) * @see #max(Comparator) */ Optional<T> reduce(BinaryOperator<T> accumulator); /** * Performs a <a href="package-summary.html#Reduction">reduction</a> on the * elements of this stream, using the provided identity, accumulation and * combining functions. This is equivalent to: * <pre>{@code * U result = identity; * for (T element : this stream) * result = accumulator.apply(result, element) * return result; * }</pre> * * but is not constrained to execute sequentially. * * <p>The {@code identity} value must be an identity for the combiner * function. This means that for all {@code u}, {@code combiner(identity, u)} * is equal to {@code u}. Additionally, the {@code combiner} function * must be compatible with the {@code accumulator} function; for all * {@code u} and {@code t}, the following must hold: * <pre>{@code * combiner.apply(u, accumulator.apply(identity, t)) == accumulator.apply(u, t) * }</pre> * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * @apiNote Many reductions using this form can be represented more simply * by an explicit combination of {@code map} and {@code reduce} operations. * The {@code accumulator} function acts as a fused mapper and accumulator, * which can sometimes be more efficient than separate mapping and reduction, * such as when knowing the previously reduced value allows you to avoid * some computation. * * @param <U> The type of the result * @param identity the identity value for the combiner function * @param accumulator an <a href="package-summary.html#Associativity">associative</a>, * <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function for incorporating an additional element into a result * @param combiner an <a href="package-summary.html#Associativity">associative</a>, * <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function for combining two values, which must be * compatible with the accumulator function * @return the result of the reduction * @see #reduce(BinaryOperator) * @see #reduce(Object, BinaryOperator) */ <U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner); /** * Performs a <a href="package-summary.html#MutableReduction">mutable * reduction</a> operation on the elements of this stream. A mutable * reduction is one in which the reduced value is a mutable result container, * such as an {@code ArrayList}, and elements are incorporated by updating * the state of the result rather than by replacing the result. This * produces a result equivalent to: * <pre>{@code * R result = supplier.get(); * for (T element : this stream) * accumulator.accept(result, element); * return result; * }</pre> * * <p>Like {@link #reduce(Object, BinaryOperator)}, {@code collect} operations * can be parallelized without requiring additional synchronization. * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * @apiNote There are many existing classes in the JDK whose signatures are * well-suited for use with method references as arguments to {@code collect()}. * For example, the following will accumulate strings into an {@code ArrayList}: * <pre>{@code * List<String> asList = stringStream.collect(ArrayList::new, ArrayList::add, * ArrayList::addAll); * }</pre> * * <p>The following will take a stream of strings and concatenates them into a * single string: * <pre>{@code * String concat = stringStream.collect(StringBuilder::new, StringBuilder::append, * StringBuilder::append) * .toString(); * }</pre> * * @param <R> type of the result * @param supplier a function that creates a new result container. For a * parallel execution, this function may be called * multiple times and must return a fresh value each time. * @param accumulator an <a href="package-summary.html#Associativity">associative</a>, * <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function for incorporating an additional element into a result * @param combiner an <a href="package-summary.html#Associativity">associative</a>, * <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * function for combining two values, which must be * compatible with the accumulator function * @return the result of the reduction */ <R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner); /** * Performs a <a href="package-summary.html#MutableReduction">mutable * reduction</a> operation on the elements of this stream using a * {@code Collector}. A {@code Collector} * encapsulates the functions used as arguments to * {@link #collect(Supplier, BiConsumer, BiConsumer)}, allowing for reuse of * collection strategies and composition of collect operations such as * multiple-level grouping or partitioning. * * <p>If the stream is parallel, and the {@code Collector} * is {@link Collector.Characteristics#CONCURRENT concurrent}, and * either the stream is unordered or the collector is * {@link Collector.Characteristics#UNORDERED unordered}, * then a concurrent reduction will be performed (see {@link Collector} for * details on concurrent reduction.) * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * <p>When executed in parallel, multiple intermediate results may be * instantiated, populated, and merged so as to maintain isolation of * mutable data structures. Therefore, even when executed in parallel * with non-thread-safe data structures (such as {@code ArrayList}), no * additional synchronization is needed for a parallel reduction. * * @apiNote * The following will accumulate strings into an ArrayList: * <pre>{@code * List<String> asList = stringStream.collect(Collectors.toList()); * }</pre> * * <p>The following will classify {@code Person} objects by city: * <pre>{@code * Map<String, List<Person>> peopleByCity * = personStream.collect(Collectors.groupingBy(Person::getCity)); * }</pre> * * <p>The following will classify {@code Person} objects by state and city, * cascading two {@code Collector}s together: * <pre>{@code * Map<String, Map<String, List<Person>>> peopleByStateAndCity * = personStream.collect(Collectors.groupingBy(Person::getState, * Collectors.groupingBy(Person::getCity))); * }</pre> * * @param <R> the type of the result * @param <A> the intermediate accumulation type of the {@code Collector} * @param collector the {@code Collector} describing the reduction * @return the result of the reduction * @see #collect(Supplier, BiConsumer, BiConsumer) * @see Collectors */ <R, A> R collect(Collector<? super T, A, R> collector); /** * Returns the minimum element of this stream according to the provided * {@code Comparator}. This is a special case of a * <a href="package-summary.html#Reduction">reduction</a>. * * <p>This is a <a href="package-summary.html#StreamOps">terminal operation</a>. * * @param comparator a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * {@code Comparator} to compare elements of this stream * @return an {@code Optional} describing the minimum element of this stream, * or an empty {@code Optional} if the stream is empty * @throws NullPointerException if the minimum element is null */ Optional<T> min(Comparator<? super T> comparator); /** * Returns the maximum element of this stream according to the provided * {@code Comparator}. This is a special case of a * <a href="package-summary.html#Reduction">reduction</a>. * * <p>This is a <a href="package-summary.html#StreamOps">terminal * operation</a>. * * @param comparator a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * {@code Comparator} to compare elements of this stream * @return an {@code Optional} describing the maximum element of this stream, * or an empty {@code Optional} if the stream is empty * @throws NullPointerException if the maximum element is null */ Optional<T> max(Comparator<? super T> comparator); /** * Returns the count of elements in this stream. This is a special case of * a <a href="package-summary.html#Reduction">reduction</a> and is * equivalent to: * <pre>{@code * return mapToLong(e -> 1L).sum(); * }</pre> * * <p>This is a <a href="package-summary.html#StreamOps">terminal operation</a>. * * @return the count of elements in this stream */ long count(); /** * Returns whether any elements of this stream match the provided * predicate. May not evaluate the predicate on all elements if not * necessary for determining the result. If the stream is empty then * {@code false} is returned and the predicate is not evaluated. * * <p>This is a <a href="package-summary.html#StreamOps">short-circuiting * terminal operation</a>. * * @apiNote * This method evaluates the <em>existential quantification</em> of the * predicate over the elements of the stream (for some x P(x)). * * @param predicate a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * predicate to apply to elements of this stream * @return {@code true} if any elements of the stream match the provided * predicate, otherwise {@code false} */ boolean anyMatch(Predicate<? super T> predicate); /** * Returns whether all elements of this stream match the provided predicate. * May not evaluate the predicate on all elements if not necessary for * determining the result. If the stream is empty then {@code true} is * returned and the predicate is not evaluated. * * <p>This is a <a href="package-summary.html#StreamOps">short-circuiting * terminal operation</a>. * * @apiNote * This method evaluates the <em>universal quantification</em> of the * predicate over the elements of the stream (for all x P(x)). If the * stream is empty, the quantification is said to be <em>vacuously * satisfied</em> and is always {@code true} (regardless of P(x)). * * @param predicate a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * predicate to apply to elements of this stream * @return {@code true} if either all elements of the stream match the * provided predicate or the stream is empty, otherwise {@code false} */ boolean allMatch(Predicate<? super T> predicate); /** * Returns whether no elements of this stream match the provided predicate. * May not evaluate the predicate on all elements if not necessary for * determining the result. If the stream is empty then {@code true} is * returned and the predicate is not evaluated. * * <p>This is a <a href="package-summary.html#StreamOps">short-circuiting * terminal operation</a>. * * @apiNote * This method evaluates the <em>universal quantification</em> of the * negated predicate over the elements of the stream (for all x ~P(x)). If * the stream is empty, the quantification is said to be vacuously satisfied * and is always {@code true}, regardless of P(x). * * @param predicate a <a href="package-summary.html#NonInterference">non-interfering</a>, * <a href="package-summary.html#Statelessness">stateless</a> * predicate to apply to elements of this stream * @return {@code true} if either no elements of the stream match the * provided predicate or the stream is empty, otherwise {@code false} */ boolean noneMatch(Predicate<? super T> predicate); /** * Returns an {@link Optional} describing the first element of this stream, * or an empty {@code Optional} if the stream is empty. If the stream has * no encounter order, then any element may be returned. * * <p>This is a <a href="package-summary.html#StreamOps">short-circuiting * terminal operation</a>. * * @return an {@code Optional} describing the first element of this stream, * or an empty {@code Optional} if the stream is empty * @throws NullPointerException if the element selected is null */ Optional<T> findFirst(); /** * Returns an {@link Optional} describing some element of the stream, or an * empty {@code Optional} if the stream is empty. * * <p>This is a <a href="package-summary.html#StreamOps">short-circuiting * terminal operation</a>. * * <p>The behavior of this operation is explicitly nondeterministic; it is * free to select any element in the stream. This is to allow for maximal * performance in parallel operations; the cost is that multiple invocations * on the same source may not return the same result. (If a stable result * is desired, use {@link #findFirst()} instead.) * * @return an {@code Optional} describing some element of this stream, or an * empty {@code Optional} if the stream is empty * @throws NullPointerException if the element selected is null * @see #findFirst() */ Optional<T> findAny(); // Static factories /** * Returns a builder for a {@code Stream}. * * @param <T> type of elements * @return a stream builder */ public static<T> Builder<T> builder() { return new Streams.StreamBuilderImpl<>(); } /** * Returns an empty sequential {@code Stream}. * * @param <T> the type of stream elements * @return an empty sequential stream */ public static<T> Stream<T> empty() { return StreamSupport.stream(Spliterators.<T>emptySpliterator(), false); } /** * Returns a sequential {@code Stream} containing a single element. * * @param t the single element * @param <T> the type of stream elements * @return a singleton sequential stream */ public static<T> Stream<T> of(T t) { return StreamSupport.stream(new Streams.StreamBuilderImpl<>(t), false); } /** * Returns a sequential ordered stream whose elements are the specified values. * * @param <T> the type of stream elements * @param values the elements of the new stream * @return the new stream */ @SafeVarargs @SuppressWarnings("varargs") // Creating a stream from an array is safe public static<T> Stream<T> of(T... values) { return Arrays.stream(values); } /** * Returns an infinite sequential ordered {@code Stream} produced by iterative * application of a function {@code f} to an initial element {@code seed}, * producing a {@code Stream} consisting of {@code seed}, {@code f(seed)}, * {@code f(f(seed))}, etc. * * <p>The first element (position {@code 0}) in the {@code Stream} will be * the provided {@code seed}. For {@code n > 0}, the element at position * {@code n}, will be the result of applying the function {@code f} to the * element at position {@code n - 1}. * * @param <T> the type of stream elements * @param seed the initial element * @param f a function to be applied to to the previous element to produce * a new element * @return a new sequential {@code Stream} */ public static<T> Stream<T> iterate(final T seed, final UnaryOperator<T> f) { Objects.requireNonNull(f); final Iterator<T> iterator = new Iterator<T>() { @SuppressWarnings("unchecked") T t = (T) Streams.NONE; @Override public boolean hasNext() { return true; } @Override public T next() { return t = (t == Streams.NONE) ? seed : f.apply(t); } }; return StreamSupport.stream(Spliterators.spliteratorUnknownSize( iterator, Spliterator.ORDERED | Spliterator.IMMUTABLE), false); } /** * Returns an infinite sequential unordered stream where each element is * generated by the provided {@code Supplier}. This is suitable for * generating constant streams, streams of random elements, etc. * * @param <T> the type of stream elements * @param s the {@code Supplier} of generated elements * @return a new infinite sequential unordered {@code Stream} */ public static<T> Stream<T> generate(Supplier<T> s) { Objects.requireNonNull(s); return StreamSupport.stream( new StreamSpliterators.InfiniteSupplyingSpliterator.OfRef<>(Long.MAX_VALUE, s), false); } /** * Creates a lazily concatenated stream whose elements are all the * elements of the first stream followed by all the elements of the * second stream. The resulting stream is ordered if both * of the input streams are ordered, and parallel if either of the input * streams is parallel. When the resulting stream is closed, the close * handlers for both input streams are invoked. * * @implNote * Use caution when constructing streams from repeated concatenation. * Accessing an element of a deeply concatenated stream can result in deep * call chains, or even {@code StackOverflowException}. * * @param <T> The type of stream elements * @param a the first stream * @param b the second stream * @return the concatenation of the two input streams */ public static <T> Stream<T> concat(Stream<? extends T> a, Stream<? extends T> b) { Objects.requireNonNull(a); Objects.requireNonNull(b); @SuppressWarnings("unchecked") Spliterator<T> split = new Streams.ConcatSpliterator.OfRef<>( (Spliterator<T>) a.spliterator(), (Spliterator<T>) b.spliterator()); Stream<T> stream = StreamSupport.stream(split, a.isParallel() || b.isParallel()); return stream.onClose(Streams.composedClose(a, b)); } /** * A mutable builder for a {@code Stream}. This allows the creation of a * {@code Stream} by generating elements individually and adding them to the * {@code Builder} (without the copying overhead that comes from using * an {@code ArrayList} as a temporary buffer.) * * <p>A stream builder has a lifecycle, which starts in a building * phase, during which elements can be added, and then transitions to a built * phase, after which elements may not be added. The built phase begins * when the {@link #build()} method is called, which creates an ordered * {@code Stream} whose elements are the elements that were added to the stream * builder, in the order they were added. * * @param <T> the type of stream elements * @see Stream#builder() * @since 1.8 */ public interface Builder<T> extends Consumer<T> { /** * Adds an element to the stream being built. * * @throws IllegalStateException if the builder has already transitioned to * the built state */ @Override void accept(T t); /** * Adds an element to the stream being built. * * @implSpec * The default implementation behaves as if: * <pre>{@code * accept(t) * return this; * }</pre> * * @param t the element to add * @return {@code this} builder * @throws IllegalStateException if the builder has already transitioned to * the built state */ default Builder<T> add(T t) { accept(t); return this; } /** * Builds the stream, transitioning this builder to the built state. * An {@code IllegalStateException} is thrown if there are further attempts * to operate on the builder after it has entered the built state. * * @return the built stream * @throws IllegalStateException if the builder has already transitioned to * the built state */ Stream<T> build(); } }
只是,这接口中定义的参数,都是些经过特殊定义的接口,即函数式接口,即默认只需实现一个方法即可接口类定义。
3. stream包的具体实现?
如上一节,我们已知stream中主要依赖于许多的接口定义。既然是接口,那就必然无法直接调用,须要有与之对应的实现方可调用。所以,我们需要有特定的场景,才可以来谈stream 的实现问题。
所以,我们先以相对简单的 Integer 的流转化与处理过程,一探stream究竟。
// java.util.Arrays#stream(T[]) /** * Returns a sequential {@link Stream} with the specified array as its * source. * * @param <T> The type of the array elements * @param array The array, assumed to be unmodified during use * @return a {@code Stream} for the array * @since 1.8 */ public static <T> Stream<T> stream(T[] array) { return stream(array, 0, array.length); } // java.util.Arrays#stream(T[], int, int) /** * Returns a sequential {@link Stream} with the specified range of the * specified array as its source. * * @param <T> the type of the array elements * @param array the array, assumed to be unmodified during use * @param startInclusive the first index to cover, inclusive * @param endExclusive index immediately past the last index to cover * @return a {@code Stream} for the array range * @throws ArrayIndexOutOfBoundsException if {@code startInclusive} is * negative, {@code endExclusive} is less than * {@code startInclusive}, or {@code endExclusive} is greater than * the array size * @since 1.8 */ public static <T> Stream<T> stream(T[] array, int startInclusive, int endExclusive) { // 构造 iterator, 带入 StreamSupport 中 return StreamSupport.stream(spliterator(array, startInclusive, endExclusive), false); } /** * Returns a {@link Spliterator} covering the specified range of the * specified array. * * <p>The spliterator reports {@link Spliterator#SIZED}, * {@link Spliterator#SUBSIZED}, {@link Spliterator#ORDERED}, and * {@link Spliterator#IMMUTABLE}. * * @param <T> type of elements * @param array the array, assumed to be unmodified during use * @param startInclusive the first index to cover, inclusive * @param endExclusive index immediately past the last index to cover * @return a spliterator for the array elements * @throws ArrayIndexOutOfBoundsException if {@code startInclusive} is * negative, {@code endExclusive} is less than * {@code startInclusive}, or {@code endExclusive} is greater than * the array size * @since 1.8 */ public static <T> Spliterator<T> spliterator(T[] array, int startInclusive, int endExclusive) { return Spliterators.spliterator(array, startInclusive, endExclusive, Spliterator.ORDERED | Spliterator.IMMUTABLE); } // java.util.stream.StreamSupport#stream(java.util.Spliterator<T>, boolean) /** * Creates a new sequential or parallel {@code Stream} from a * {@code Spliterator}. * * <p>The spliterator is only traversed, split, or queried for estimated * size after the terminal operation of the stream pipeline commences. * * <p>It is strongly recommended the spliterator report a characteristic of * {@code IMMUTABLE} or {@code CONCURRENT}, or be * <a href="../Spliterator.html#binding">late-binding</a>. Otherwise, * {@link #stream(java.util.function.Supplier, int, boolean)} should be used * to reduce the scope of potential interference with the source. See * <a href="package-summary.html#NonInterference">Non-Interference</a> for * more details. * * @param <T> the type of stream elements * @param spliterator a {@code Spliterator} describing the stream elements * @param parallel if {@code true} then the returned stream is a parallel * stream; if {@code false} the returned stream is a sequential * stream. * @return a new sequential or parallel {@code Stream} */ public static <T> Stream<T> stream(Spliterator<T> spliterator, boolean parallel) { Objects.requireNonNull(spliterator); return new ReferencePipeline.Head<>(spliterator, StreamOpFlag.fromCharacteristics(spliterator), parallel); } // java.util.stream.ReferencePipeline.Head#Head(java.util.Spliterator<?>, int, boolean) /** * Constructor for the source stage of a Stream. * * @param source {@code Spliterator} describing the stream source * @param sourceFlags the source flags for the stream source, described * in {@link StreamOpFlag} */ Head(Spliterator<?> source, int sourceFlags, boolean parallel) { super(source, sourceFlags, parallel); } // java.util.stream.ReferencePipeline#ReferencePipeline(java.util.Spliterator<?>, int, boolean) /** * Constructor for the head of a stream pipeline. * * @param source {@code Spliterator} describing the stream source * @param sourceFlags The source flags for the stream source, described in * {@link StreamOpFlag} * @param parallel {@code true} if the pipeline is parallel */ ReferencePipeline(Spliterator<?> source, int sourceFlags, boolean parallel) { super(source, sourceFlags, parallel); } // java.util.stream.AbstractPipeline#AbstractPipeline(java.util.Spliterator<?>, int, boolean) /** * Constructor for the head of a stream pipeline. * * @param source {@code Spliterator} describing the stream source * @param sourceFlags the source flags for the stream source, described in * {@link StreamOpFlag} * @param parallel {@code true} if the pipeline is parallel */ AbstractPipeline(Spliterator<?> source, int sourceFlags, boolean parallel) { this.previousStage = null; this.sourceSpliterator = source; this.sourceStage = this; this.sourceOrOpFlags = sourceFlags & StreamOpFlag.STREAM_MASK; // The following is an optimization of: // StreamOpFlag.combineOpFlags(sourceOrOpFlags, StreamOpFlag.INITIAL_OPS_VALUE); this.combinedFlags = (~(sourceOrOpFlags << 1)) & StreamOpFlag.INITIAL_OPS_VALUE; this.depth = 0; this.parallel = parallel; }
如上,就返回了一 Stream 的具体实例,即是 ReferencePipeline.Head 的实例。故而,之后的每个stream操作如 filter,map,foreach方法,都尽在该 head 中进行实现了。一瞅便知。
// java.util.stream.ReferencePipeline#filter @Override public final Stream<P_OUT> filter(Predicate<? super P_OUT> predicate) { Objects.requireNonNull(predicate); // 只返回了一个 StreamlessOp实例 return new StatelessOp<P_OUT, P_OUT>(this, StreamShape.REFERENCE, StreamOpFlag.NOT_SIZED) { @Override Sink<P_OUT> opWrapSink(int flags, Sink<P_OUT> sink) { return new Sink.ChainedReference<P_OUT, P_OUT>(sink) { @Override public void begin(long size) { downstream.begin(-1); } @Override public void accept(P_OUT u) { // 在必要时候调用 test() 方法即可 // 当test返回 true 时,该元素被保留传入下一级调用中,此即filter的语义 if (predicate.test(u)) downstream.accept(u); } }; } }; } // java.util.stream.ReferencePipeline#map @Override @SuppressWarnings("unchecked") public final <R> Stream<R> map(Function<? super P_OUT, ? extends R> mapper) { Objects.requireNonNull(mapper); // 同样,仅返回一个 StatelessOp 的实例 return new StatelessOp<P_OUT, R>(this, StreamShape.REFERENCE, StreamOpFlag.NOT_SORTED | StreamOpFlag.NOT_DISTINCT) { @Override Sink<P_OUT> opWrapSink(int flags, Sink<R> sink) { return new Sink.ChainedReference<P_OUT, R>(sink) { @Override public void accept(P_OUT u) { // 同样,在必要的时候调用 apply 方法 // 即 map 的语义为 每个元素都会调用该方法 downstream.accept(mapper.apply(u)); } }; } }; } @Override public final <R> Stream<R> flatMap(Function<? super P_OUT, ? extends Stream<? extends R>> mapper) { Objects.requireNonNull(mapper); // We can do better than this, by polling cancellationRequested when stream is infinite return new StatelessOp<P_OUT, R>(this, StreamShape.REFERENCE, StreamOpFlag.NOT_SORTED | StreamOpFlag.NOT_DISTINCT | StreamOpFlag.NOT_SIZED) { @Override Sink<P_OUT> opWrapSink(int flags, Sink<R> sink) { return new Sink.ChainedReference<P_OUT, R>(sink) { @Override public void begin(long size) { downstream.begin(-1); } @Override public void accept(P_OUT u) { // flatmap 语义,所得结果,依次往下传输 try (Stream<? extends R> result = mapper.apply(u)) { // We can do better that this too; optimize for depth=0 case and just grab spliterator and forEach it if (result != null) result.sequential().forEach(downstream); } } }; } }; }
如上,几个方法调用下来,我们基本都可以看到,都是一个个的 StatelessOp 的实例的返回,但都没有触发真正的计算。那么,真正计算又要到几时呢?相信有些其他知识面的你,定然会想到,在合适的时候再来触发真正的运算操作。当数据结构不会发生本质的变化时,这种平衡就是存在的。只是在一些关键时候,才会触发运算。这为后续进行并行计算或者性能优化提供了可能。
那么,stream包中,哪些运算是作为真正的触发行为呢?至少 collect(), foreach(), reduce() 是会进行触发的。 这些优化手段,不知和其他框架实现,谁先谁后,谁主谁从。反正,总是好的想法。在其他地方,也许叫许多算子。
我们以collect()探查如何使用这stream的威力?
// java.util.stream.ReferencePipeline#collect(java.util.stream.Collector<? super P_OUT,A,R>) @Override @SuppressWarnings("unchecked") public final <R, A> R collect(Collector<? super P_OUT, A, R> collector) { A container; // 即分并行与串行 if (isParallel() && (collector.characteristics().contains(Collector.Characteristics.CONCURRENT)) && (!isOrdered() || collector.characteristics().contains(Collector.Characteristics.UNORDERED))) { container = collector.supplier().get(); BiConsumer<A, ? super P_OUT> accumulator = collector.accumulator(); forEach(u -> accumulator.accept(container, u)); } else { // 串行执行 container = evaluate(ReduceOps.makeRef(collector)); } return collector.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH) ? (R) container : collector.finisher().apply(container); } /** * Constructs a {@code TerminalOp} that implements a mutable reduce on * reference values. * * @param <T> the type of the input elements * @param <I> the type of the intermediate reduction result * @param collector a {@code Collector} defining the reduction * @return a {@code ReduceOp} implementing the reduction */ public static <T, I> TerminalOp<T, I> makeRef(Collector<? super T, I, ?> collector) { Supplier<I> supplier = Objects.requireNonNull(collector).supplier(); BiConsumer<I, ? super T> accumulator = collector.accumulator(); BinaryOperator<I> combiner = collector.combiner(); class ReducingSink extends Box<I> implements AccumulatingSink<T, I, ReducingSink> { @Override public void begin(long size) { state = supplier.get(); } @Override public void accept(T t) { accumulator.accept(state, t); } @Override public void combine(ReducingSink other) { state = combiner.apply(state, other.state); } } // 返回ReuceOp return new ReduceOp<T, I, ReducingSink>(StreamShape.REFERENCE) { @Override public ReducingSink makeSink() { return new ReducingSink(); } @Override public int getOpFlags() { return collector.characteristics().contains(Collector.Characteristics.UNORDERED) ? StreamOpFlag.NOT_ORDERED : 0; } }; } // 运算一系列任务 /** * Evaluate the pipeline with a terminal operation to produce a result. * * @param <R> the type of result * @param terminalOp the terminal operation to be applied to the pipeline. * @return the result */ final <R> R evaluate(TerminalOp<E_OUT, R> terminalOp) { assert getOutputShape() == terminalOp.inputShape(); if (linkedOrConsumed) throw new IllegalStateException(MSG_STREAM_LINKED); linkedOrConsumed = true; return isParallel() ? terminalOp.evaluateParallel(this, sourceSpliterator(terminalOp.getOpFlags())) : terminalOp.evaluateSequential(this, sourceSpliterator(terminalOp.getOpFlags())); } // java.util.stream.ReduceOps.ReduceOp#evaluateSequential @Override public <P_IN> R evaluateSequential(PipelineHelper<T> helper, Spliterator<P_IN> spliterator) { return helper.wrapAndCopyInto(makeSink(), spliterator).get(); } // java.util.stream.AbstractPipeline#wrapAndCopyInto @Override final <P_IN, S extends Sink<E_OUT>> S wrapAndCopyInto(S sink, Spliterator<P_IN> spliterator) { copyInto(wrapSink(Objects.requireNonNull(sink)), spliterator); return sink; } // java.util.stream.AbstractPipeline#wrapSink @Override @SuppressWarnings("unchecked") final <P_IN> Sink<P_IN> wrapSink(Sink<E_OUT> sink) { Objects.requireNonNull(sink); // 基本是按照倒序来排的 for ( @SuppressWarnings("rawtypes") AbstractPipeline p=AbstractPipeline.this; p.depth > 0; p=p.previousStage) { // 一层层包装算子 sink = p.opWrapSink(p.previousStage.combinedFlags, sink); } return (Sink<P_IN>) sink; } // java.util.stream.AbstractPipeline#copyInto @Override final <P_IN> void copyInto(Sink<P_IN> wrappedSink, Spliterator<P_IN> spliterator) { Objects.requireNonNull(wrappedSink); // 依次调用 begin, foreach, end 方法 if (!StreamOpFlag.SHORT_CIRCUIT.isKnown(getStreamAndOpFlags())) { wrappedSink.begin(spliterator.getExactSizeIfKnown()); // 每个元素依次迭代, 一层层退出来 spliterator.forEachRemaining(wrappedSink); wrappedSink.end(); } else { copyIntoWithCancel(wrappedSink, spliterator); } } // java.util.Spliterators.ArraySpliterator#forEachRemaining @SuppressWarnings("unchecked") @Override public void forEachRemaining(Consumer<? super T> action) { Object[] a; int i, hi; // hoist accesses and checks from loop if (action == null) throw new NullPointerException(); if ((a = array).length >= (hi = fence) && (i = index) >= 0 && i < (index = hi)) { do { action.accept((T)a[i]); } while (++i < hi); } }
可见,该stream包的实现中,大量使用了包装器模式,责任链模式,模板方法模式,以及在必要的节点再进行统一的运算触发。且在必要的时候开启并行计算,为上层应用带了各种可能。在使用起来极其简单的同时,又兼顾了性能。(我说的不是通常的性能,比如我自己写几个简单的filter岂不性能更好?)而以上,仅仅是 stream 中的一种实现,针对每个不同类型的数据,其处理方式自然不一样。比如 IntStream, DoubleStream, LongStream 虽同为Stream,但特性都都不一样,不能一概而论。当然,一般这些实现都会遵守一定的接口规范。
其中,以上这些简便的写法,得益于lamda语法的支持,以及几个简单的函数式接口定义。比如 Consumer, Function... 它们都被定义在java.util.function包下面。
@FunctionalInterface public interface Consumer<T> { /** * Performs this operation on the given argument. * * @param t the input argument */ void accept(T t); /** * Returns a composed {@code Consumer} that performs, in sequence, this * operation followed by the {@code after} operation. If performing either * operation throws an exception, it is relayed to the caller of the * composed operation. If performing this operation throws an exception, * the {@code after} operation will not be performed. * * @param after the operation to perform after this operation * @return a composed {@code Consumer} that performs in sequence this * operation followed by the {@code after} operation * @throws NullPointerException if {@code after} is null */ default Consumer<T> andThen(Consumer<? super T> after) { Objects.requireNonNull(after); return (T t) -> { accept(t); after.accept(t); }; } } @FunctionalInterface public interface Function<T, R> { /** * Applies this function to the given argument. * * @param t the function argument * @return the function result */ R apply(T t); /** * Returns a composed function that first applies the {@code before} * function to its input, and then applies this function to the result. * If evaluation of either function throws an exception, it is relayed to * the caller of the composed function. * * @param <V> the type of input to the {@code before} function, and to the * composed function * @param before the function to apply before this function is applied * @return a composed function that first applies the {@code before} * function and then applies this function * @throws NullPointerException if before is null * * @see #andThen(Function) */ default <V> Function<V, R> compose(Function<? super V, ? extends T> before) { Objects.requireNonNull(before); return (V v) -> apply(before.apply(v)); } /** * Returns a composed function that first applies this function to * its input, and then applies the {@code after} function to the result. * If evaluation of either function throws an exception, it is relayed to * the caller of the composed function. * * @param <V> the type of output of the {@code after} function, and of the * composed function * @param after the function to apply after this function is applied * @return a composed function that first applies this function and then * applies the {@code after} function * @throws NullPointerException if after is null * * @see #compose(Function) */ default <V> Function<T, V> andThen(Function<? super R, ? extends V> after) { Objects.requireNonNull(after); return (T t) -> after.apply(apply(t)); } /** * Returns a function that always returns its input argument. * * @param <T> the type of the input and output objects to the function * @return a function that always returns its input argument */ static <T> Function<T, T> identity() { return t -> t; } } @FunctionalInterface public interface Supplier<T> { /** * Gets a result. * * @return a result */ T get(); }
话说为何单叫lamda式写法又叫作函数式编程?想来原因有二,一是调用手法像是函数一般,只须传入参数即可调用,二来lamda实现方式为生出静态函数调用而成。不知是也不是。