Java 8 new features: how a reference stream executes, and the filter, map, and collect operations
Example:
public class User implements Comparable<User> {
private String name;
private Integer age;
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public Integer getAge() {
return age;
}
public void setAge(Integer age) {
this.age = age;
}
public User(){}
public User(String name, Integer age) {
this.name = name;
this.age = age;
}
@Override
public String toString() {
return "User{" +
"name='" + name + '\'' +
", age=" + age +
'}';
}
@Override
public int compareTo(User o) {
return age.compareTo(o.getAge());
}
}
Test code:
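import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
List<User> users = new ArrayList<>();
users.add(new User("张三",30));
users.add(new User("李四",34));
users.add(new User("王五",20));
// Keep users aged 30 or over, then collect their names into a List.
List<String> list = users.stream()
        .filter(user -> user.getAge() != null && user.getAge() >= 30)
        .map(User::getName)
        .collect(Collectors.toList());
System.out.println(list);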
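The code above selects the name of every User whose age is at least 30, collects the names into a List, and prints it.
In the example above, collect is a terminal operation; once it has run, the stream is consumed and cannot be reused. users.stream() creates a ReferencePipeline.Head, the head of the pipeline, which mainly holds a reference to the Spliterator (the data) and the stream flags. Everything between users.stream() and collect is an intermediate operation. Each operation is modelled as a ReferencePipeline; filter and map, for instance, return a StatelessOp, which is a subclass of ReferencePipeline. Inside a StatelessOp, a Sink carries out the operation on each element, and no data is processed until the terminal operation runs.
The stages are linked through fields of AbstractPipeline: previousStage points to the preceding operation, nextStage points to the following one, and the sourceStage field of every intermediate operation points to the ReferencePipeline.Head. Each intermediate operation has a depth one greater than that of the stage before it.
Source code analysis: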
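The claim that intermediate operations only link stages, and that data moves only when the terminal operation runs, can be checked with the public API alone. The following is a minimal sketch; the class name LazyPipelineDemo and the seen list are illustrative and not part of the original example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyPipelineDemo {
    // Records every element the filter predicate is asked to test.
    static final List<Integer> seen = new ArrayList<>();

    static List<Integer> run() {
        seen.clear();
        List<Integer> ages = Arrays.asList(30, 34, 20);
        // Building the pipeline only links stages; no element is tested yet.
        Stream<Integer> pipeline = ages.stream().filter(age -> {
            seen.add(age);
            return age >= 30;
        });
        if (!seen.isEmpty()) {
            throw new AssertionError("filter ran before the terminal operation");
        }
        // The terminal operation pulls the data through the linked stages.
        return pipeline.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run());  // the elements that passed the filter
        System.out.println(seen);   // the elements the predicate examined
    }
}
```

The predicate records nothing until collect is called, which matches the behaviour traced through the source code in the rest of this article.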
ReferencePipeline#filter(Predicate<? super P_OUT> predicate)
@Override
public final Stream<P_OUT> filter(Predicate<? super P_OUT> predicate) {
Objects.requireNonNull(predicate);
return new StatelessOp<P_OUT, P_OUT>(this, StreamShape.REFERENCE,
StreamOpFlag.NOT_SIZED) {
@Override
Sink<P_OUT> opWrapSink(int flags, Sink<P_OUT> sink) {
return new Sink.ChainedReference<P_OUT, P_OUT>(sink) {
@Override
public void begin(long size) {
downstream.begin(-1);
}
@Override
public void accept(P_OUT u) {
if (predicate.test(u))
downstream.accept(u);
}
};
}
};
}
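filter is modelled as a StatelessOp, which is returned to the caller. this (the upstream Stream, which in this example is the ReferencePipeline.Head) is passed to the constructor as the upstream stage. StatelessOp is a stateless ReferencePipeline: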
abstract static class StatelessOp<E_IN, E_OUT>
extends ReferencePipeline<E_IN, E_OUT> {
StatelessOp(AbstractPipeline<?, E_IN, ?> upstream,
StreamShape inputShape,
int opFlags) {
super(upstream, opFlags);
assert upstream.getOutputShape() == inputShape;
}
@Override
final boolean opIsStateful() {
return false;
}
}
ReferencePipeline#map(Function<? super P_OUT, ? extends R> mapper)
public final <R> Stream<R> map(Function<? super P_OUT, ? extends R> mapper) {
Objects.requireNonNull(mapper);
return new StatelessOp<P_OUT, R>(this, StreamShape.REFERENCE,
StreamOpFlag.NOT_SORTED | StreamOpFlag.NOT_DISTINCT) {
@Override
Sink<P_OUT> opWrapSink(int flags, Sink<R> sink) {
return new Sink.ChainedReference<P_OUT, R>(sink) {
@Override
public void accept(P_OUT u) {
downstream.accept(mapper.apply(u));
}
};
}
};
}
map is likewise modelled as a StatelessOp and returned. Note that no data has been processed at this point.
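The two operations pass different flags: filter passes StreamOpFlag.NOT_SIZED and calls downstream.begin(-1), because it cannot know how many elements will survive, while map leaves the size information alone. StreamOpFlag is internal, but the effect should be observable through the public Spliterator API. The sketch below assumes a sized source (Arrays.asList); the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;

public class SizedFlagDemo {
    static final List<Integer> AGES = Arrays.asList(30, 34, 20);

    // map does not clear the SIZED flag, so the exact size is still known.
    static long sizeAfterMap() {
        return AGES.stream()
                .map(age -> age + 1)
                .spliterator()
                .getExactSizeIfKnown();
    }

    // filter clears the SIZED flag, so the size is reported as unknown (-1).
    static long sizeAfterFilter() {
        return AGES.stream()
                .filter(age -> age >= 30)
                .spliterator()
                .getExactSizeIfKnown();
    }

    public static void main(String[] args) {
        System.out.println("after map:    " + sizeAfterMap());
        System.out.println("after filter: " + sizeAfterFilter());
    }
}
```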
ReferencePipeline#collect(Collector<? super P_OUT, A, R> collector)
public final <R, A> R collect(Collector<? super P_OUT, A, R> collector) {
A container;
if (isParallel()
&& (collector.characteristics().contains(Collector.Characteristics.CONCURRENT))
&& (!isOrdered() || collector.characteristics().contains(Collector.Characteristics.UNORDERED))) {
container = collector.supplier().get();
BiConsumer<A, ? super P_OUT> accumulator = collector.accumulator();
forEach(u -> accumulator.accept(container, u));
}
else {
container = evaluate(ReduceOps.makeRef(collector));
}
return collector.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)
? (R) container
: collector.finisher().apply(container);
}
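collect is a terminal operation. The first branch handles a parallel stream whose collector is CONCURRENT and for which ordering does not matter, accumulating directly into a single shared container. In every other case, including the sequential stream of this example, evaluate(ReduceOps.makeRef(collector)) is called. ReduceOps.makeRef models collect as a ReduceOp, and AbstractPipeline#evaluate(TerminalOp<E_OUT, R> terminalOp) performs the actual processing. Finally, if the collector has the IDENTITY_FINISH characteristic, the container itself is returned; otherwise the finisher is applied to it.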
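The last step of collect, choosing between returning the container and applying the finisher, depends on the IDENTITY_FINISH characteristic. As a rough illustration, the sketch below builds a hypothetical collector with Collector.of that does not declare IDENTITY_FINISH, so its finisher (StringBuilder::toString) has to run, and compares it with Collectors.toList(). The names FinisherDemo and JOINING are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class FinisherDemo {
    // Hypothetical collector: accumulates into a StringBuilder and needs
    // a real finisher to turn the container into the final String.
    static final Collector<String, StringBuilder, String> JOINING =
            Collector.of(
                    StringBuilder::new,                   // supplier
                    (sb, s) -> sb.append(s),              // accumulator
                    (left, right) -> left.append(right),  // combiner
                    StringBuilder::toString);             // finisher

    static boolean joiningHasIdentityFinish() {
        return JOINING.characteristics()
                .contains(Collector.Characteristics.IDENTITY_FINISH);
    }

    static boolean toListHasIdentityFinish() {
        return Collectors.toList().characteristics()
                .contains(Collector.Characteristics.IDENTITY_FINISH);
    }

    static String join(List<String> names) {
        return names.stream().collect(JOINING);
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b")));
        System.out.println(joiningHasIdentityFinish());
        System.out.println(toListHasIdentityFinish());
    }
}
```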
ReduceOps#makeRef(Collector collector)
public static <T, I> TerminalOp<T, I>
makeRef(Collector<? super T, I, ?> collector) {
Supplier<I> supplier = Objects.requireNonNull(collector).supplier();
BiConsumer<I, ? super T> accumulator = collector.accumulator();
BinaryOperator<I> combiner = collector.combiner();
class ReducingSink extends Box<I>
implements AccumulatingSink<T, I, ReducingSink> {
@Override
public void begin(long size) {
state = supplier.get();
}
@Override
public void accept(T t) {
accumulator.accept(state, t);
}
@Override
public void combine(ReducingSink other) {
state = combiner.apply(state, other.state);
}
}
return new ReduceOp<T, I, ReducingSink>(StreamShape.REFERENCE) {
@Override
public ReducingSink makeSink() {
return new ReducingSink();
}
@Override
public int getOpFlags() {
return collector.characteristics().contains(Collector.Characteristics.UNORDERED)
? StreamOpFlag.NOT_ORDERED
: 0;
}
};
}
ReduceOps#makeRef wraps the collector in a ReduceOp whose makeSink() returns a ReducingSink.
AbstractPipeline#evaluate(TerminalOp<E_OUT, R> terminalOp)
final <R> R evaluate(TerminalOp<E_OUT, R> terminalOp) {
assert getOutputShape() == terminalOp.inputShape();
if (linkedOrConsumed)
throw new IllegalStateException(MSG_STREAM_LINKED);
linkedOrConsumed = true;
return isParallel()
? terminalOp.evaluateParallel(this, sourceSpliterator(terminalOp.getOpFlags()))
: terminalOp.evaluateSequential(this, sourceSpliterator(terminalOp.getOpFlags()));
}
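Because the stream here is sequential, terminalOp.evaluateSequential is called. sourceSpliterator returns the Spliterator of the stream's source, which holds the data to be processed. terminalOp is the ReduceOp. The linkedOrConsumed check is what makes a second terminal operation on the same stream fail with an IllegalStateException.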
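The linkedOrConsumed guard at the top of evaluate is easy to observe from outside. A small sketch, with an illustrative class name, that runs a terminal operation twice on the same stream:

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ConsumedStreamDemo {
    // Returns true if the second terminal operation is rejected.
    static boolean secondTerminalOpFails() {
        Stream<String> names = Stream.of("a", "b");
        names.collect(Collectors.toList());      // first evaluate(): marks the stream consumed
        try {
            names.collect(Collectors.toList());  // second evaluate(): the guard throws
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOpFails());
    }
}
```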
ReduceOp#evaluateSequential(PipelineHelper<T> helper, Spliterator<P_IN> spliterator)
@Override
public <P_IN> R evaluateSequential(PipelineHelper<T> helper,
Spliterator<P_IN> spliterator) {
return helper.wrapAndCopyInto(makeSink(), spliterator).get();
}
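makeSink() is the method implemented by the ReduceOp that ReduceOps.makeRef created for collect above; it returns a ReducingSink. Next, look at wrapAndCopyInto.
AbstractPipeline#wrapAndCopyInto(S sink, Spliterator<P_IN> spliterator)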
final <P_IN, S extends Sink<E_OUT>> S wrapAndCopyInto(S sink, Spliterator<P_IN> spliterator) {
copyInto(wrapSink(Objects.requireNonNull(sink)), spliterator);
return sink;
}
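wrapSink links the Sinks of the intermediate operations, working from the last stage back to the first. copyInto then performs the actual data processing.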
AbstractPipeline#wrapSink(Sink<E_OUT> sink)
final <P_IN> Sink<P_IN> wrapSink(Sink<E_OUT> sink) {
Objects.requireNonNull(sink);
for ( @SuppressWarnings("rawtypes") AbstractPipeline p=AbstractPipeline.this; p.depth > 0; p=p.previousStage) {
sink = p.opWrapSink(p.previousStage.combinedFlags, sink);
}
return (Sink<P_IN>) sink;
}
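Starting from the current stage, wrapSink walks backwards through the previousStage fields until it reaches a stage whose depth is 0, calling opWrapSink at each step to wrap the Sink built so far. In this example the stage before collect is map; the StatelessOp returned by map overrides opWrapSink to return a Sink.ChainedReference, an abstract class that implements Sink. The stage before that is filter, whose StatelessOp also overrides opWrapSink to return a Sink.ChainedReference. The ReferencePipeline.Head has depth 0, so wrapSink does not wrap it. When wrapSink returns, the sink chain is:
filter's Sink.ChainedReference -> map's Sink.ChainedReference -> ReducingSink
Each Sink refers to the next one through its downstream field. Next, look at AbstractPipeline#copyInto: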
final <P_IN> void copyInto(Sink<P_IN> wrappedSink, Spliterator<P_IN> spliterator) {
Objects.requireNonNull(wrappedSink);
if (!StreamOpFlag.SHORT_CIRCUIT.isKnown(getStreamAndOpFlags())) {
wrappedSink.begin(spliterator.getExactSizeIfKnown());
spliterator.forEachRemaining(wrappedSink);
wrappedSink.end();
}
else {
copyIntoWithCancel(wrappedSink, spliterator);
}
}
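The first branch handles streams without short-circuiting operations; copyIntoWithCancel handles streams that have them. The stream in this example does not short-circuit. wrappedSink.begin starts at the first Sink in the chain, the Sink.ChainedReference created by filter, whose begin(long size) is: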
public void begin(long size) {
downstream.begin(-1);
}
This calls the begin method of map's Sink. map does not override begin, so the default inherited from Sink.ChainedReference is used:
public void begin(long size) {
downstream.begin(size);
}
That in turn calls begin on the ReducingSink created for collect:
public void begin(long size) {
state = supplier.get();
}
supplier comes from Collectors.toList(), the argument passed to collect. As the code below shows, supplier.get() invokes ArrayList::new to instantiate an ArrayList.
Collectors#toList()
public static <T>
Collector<T, ?, List<T>> toList() {
return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
(left, right) -> { left.addAll(right); return left; },
CH_ID);
}
CollectorImpl(Supplier<A> supplier,
BiConsumer<A, T> accumulator,
BinaryOperator<A> combiner,
Set<Characteristics> characteristics) {
this(supplier, accumulator, combiner, castingIdentity(), characteristics);
}
private static <I, R> Function<I, R> castingIdentity() {
return i -> (R) i;
}
So: supplier is ArrayList::new, accumulator is List::add, combiner is (left, right) -> { left.addAll(right); return left; }, finisher is i -> (R) i, and characteristics is Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH)).
Back in AbstractPipeline#copyInto, spliterator.forEachRemaining(wrappedSink) is called to process the data. The spliterator is ArrayListSpliterator, a nested class of ArrayList.
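To make the role of these parts concrete, a collector can be driven by hand, without a stream, by calling its supplier, accumulator, and finisher in the same order the sequential path uses. This is only a sketch of that order; collectManually is a hypothetical helper, not a JDK method, and the combiner is left out because it is only needed for parallel evaluation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ManualCollectDemo {
    // Hypothetical helper: applies a collector's parts in sequential order.
    static <T, A, R> R collectManually(List<T> source,
                                       Collector<? super T, A, R> collector) {
        A container = collector.supplier().get();          // what begin() does
        BiConsumer<A, ? super T> accumulator = collector.accumulator();
        for (T element : source) {
            accumulator.accept(container, element);        // what accept() does
        }
        return collector.finisher().apply(container);      // identity cast for toList()
    }

    static List<String> run() {
        return collectManually(Arrays.asList("a", "b"), Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```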
ArrayList#ArrayListSpliterator#forEachRemaining(Consumer<? super E> action)
public void forEachRemaining(Consumer<? super E> action) {
int i, hi, mc; // hoist accesses and checks from loop
ArrayList<E> lst; Object[] a;
if (action == null)
throw new NullPointerException();
if ((lst = list) != null && (a = lst.elementData) != null) {
if ((hi = fence) < 0) {
mc = lst.modCount;
hi = lst.size;
}
else
mc = expectedModCount;
if ((i = index) >= 0 && (index = hi) <= a.length) {
for (; i < hi; ++i) {
@SuppressWarnings("unchecked") E e = (E) a[i];
action.accept(e);
}
if (lst.modCount == mc)
return;
}
}
throw new ConcurrentModificationException();
}
This iterates over all the elements and calls action.accept on each one; action here is the wrappedSink from above. That first invokes accept(P_OUT u) on the Sink.ChainedReference created by filter:
public void accept(P_OUT u) {
if (predicate.test(u))
downstream.accept(u);
}
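predicate is user -> user.getAge() != null && user.getAge() >= 30. If predicate.test returns true, accept is called on the downstream Sink.ChainedReference created by map; otherwise the element is dropped. This is where the data is filtered.
accept(P_OUT u) of the Sink.ChainedReference created by map: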
public void accept(P_OUT u) {
downstream.accept(mapper.apply(u));
}
mapper is User::getName. getName is called on each User that passed the filter, and the result is passed to the accept method of ReducingSink:
public void accept(T t) {
accumulator.accept(state, t);
}
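As shown above, accumulator is List::add and state is the ArrayList created in begin, so each name is added to that ArrayList.
Once spliterator.forEachRemaining(wrappedSink) has traversed all the data, wrappedSink.end() is called. The Sink.ChainedReference instances only forward end() downstream, and ReducingSink does not override it, so nothing happens here. evaluateSequential then calls get() on the ReducingSink to obtain the ArrayList, and because the collector has IDENTITY_FINISH, collect returns that list directly.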
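The whole flow traced above (wrap the sinks from back to front, then begin, accept for each element, end) can be approximated with a small stand-alone model. The code below is a simplified sketch, not the JDK implementation: MiniSink, ListSink, filterSink, and mapSink are hypothetical stand-ins for Sink, ReducingSink, and the Sink.ChainedReference instances returned by opWrapSink, and integer ages replace the User objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

public class MiniSinkChain {
    // Hypothetical stand-in for java.util.stream.Sink.
    interface MiniSink<T> {
        default void begin(long size) {}
        void accept(T t);
        default void end() {}
    }

    // Plays the role of ReducingSink: creates the container in begin().
    static final class ListSink<T> implements MiniSink<T> {
        List<T> state;
        @Override public void begin(long size) { state = new ArrayList<>(); }
        @Override public void accept(T t) { state.add(t); }
    }

    // Plays the role of the sink built by filter's opWrapSink.
    static <T> MiniSink<T> filterSink(Predicate<? super T> predicate,
                                      MiniSink<T> downstream) {
        return new MiniSink<T>() {
            @Override public void begin(long size) { downstream.begin(-1); }
            @Override public void accept(T t) {
                if (predicate.test(t)) downstream.accept(t);
            }
            @Override public void end() { downstream.end(); }
        };
    }

    // Plays the role of the sink built by map's opWrapSink.
    static <T, R> MiniSink<T> mapSink(Function<? super T, ? extends R> mapper,
                                      MiniSink<R> downstream) {
        return new MiniSink<T>() {
            @Override public void begin(long size) { downstream.begin(size); }
            @Override public void accept(T t) { downstream.accept(mapper.apply(t)); }
            @Override public void end() { downstream.end(); }
        };
    }

    static List<String> run() {
        List<Integer> ages = Arrays.asList(30, 34, 20);
        ListSink<String> terminal = new ListSink<>();
        // Like wrapSink: wrap from the last stage back to the first.
        MiniSink<Integer> mapStage = mapSink((Integer age) -> "age-" + age, terminal);
        MiniSink<Integer> head = filterSink((Integer age) -> age >= 30, mapStage);
        // Like copyInto: begin, push every element, end.
        head.begin(ages.size());
        ages.forEach(head::accept);
        head.end();
        return terminal.state;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each element travels through the whole chain before the next one is read, which is why a filter followed by a map needs only one pass over the source.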