Stateful Future Transformation

As an async programming pattern, Future is popular with programmers across a wide range of languages. Loosely speaking, a Future is a wrapper around a value that will become available at some point in the future. Strictly speaking, Future is a monad that supports the following three operations:

unit :: T -> Future<T>
map :: (T -> R) -> (Future<T> -> Future<R>)
flatMap :: (T -> Future<R>) -> (Future<T> -> Future<R>)
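As a rough illustration, the three operations can be mapped onto Java's `CompletableFuture`. This is only a sketch: the article's `Future` type is generic pseudocode, and the `FutureOps` class and its method names are made up here for illustration. The standard-library calls used are `completedFuture`, `thenApply`, and `thenCompose`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class FutureOps {
  // unit :: T -> Future<T>
  static <T> CompletableFuture<T> unit(T value) {
    return CompletableFuture.completedFuture(value);
  }

  // map :: (T -> R) -> (Future<T> -> Future<R>)
  static <T, R> Function<CompletableFuture<T>, CompletableFuture<R>> map(Function<T, R> f) {
    return future -> future.thenApply(f);
  }

  // flatMap :: (T -> Future<R>) -> (Future<T> -> Future<R>)
  static <T, R> Function<CompletableFuture<T>, CompletableFuture<R>> flatMap(
      Function<T, CompletableFuture<R>> f) {
    return future -> future.thenCompose(f);
  }

  public static void main(String[] args) {
    // Lift a plain function and apply it to a future of a string.
    CompletableFuture<Integer> length =
        FutureOps.<String, Integer>map(String::length).apply(unit("hello"));
    // Lift a function that itself returns a future.
    CompletableFuture<String> label =
        FutureOps.<Integer, String>flatMap(n -> unit("len=" + n)).apply(length);
    System.out.println(length.join()); // 5
    System.out.println(label.join());  // len=5
  }
}
```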

When holding a future, we know the type of its value, and we can register callbacks that will be called when the future is done. But callbacks are not the recommended way to deal with futures. The point of the Future pattern is to avoid callbacks in favor of future transformation. By using future transformation properly, we can make our async code look like sequential code, with the callbacks hidden from us by the futures.
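To make the contrast concrete, here is a small sketch of the same two-step lookup written once with nested callbacks and once with transformations, using Java's `CompletableFuture`. The `findUserId` and `findEmail` functions are hypothetical stand-ins invented for this example; they are not part of the article's RPCs.

```java
import java.util.concurrent.CompletableFuture;

class CallbackVsTransform {
  // Hypothetical async lookups used only for illustration.
  static CompletableFuture<Integer> findUserId(String name) {
    return CompletableFuture.supplyAsync(() -> name.length());
  }

  static CompletableFuture<String> findEmail(int userId) {
    return CompletableFuture.supplyAsync(() -> "user" + userId + "@example.com");
  }

  // Callback style: each step is nested inside the previous callback, and the
  // result has to be pushed into a future that we complete by hand.
  static CompletableFuture<String> emailWithCallbacks(String name) {
    CompletableFuture<String> result = new CompletableFuture<>();
    findUserId(name).thenAccept(id ->
        findEmail(id).thenAccept(email -> result.complete(email.toUpperCase())));
    return result;
  }

  // Transformation style: the steps read top to bottom like sequential code.
  static CompletableFuture<String> emailWithTransforms(String name) {
    return findUserId(name)
        .thenCompose(CallbackVsTransform::findEmail)
        .thenApply(String::toUpperCase);
  }

  public static void main(String[] args) {
    System.out.println(emailWithCallbacks("alice").join());
    System.out.println(emailWithTransforms("alice").join());
  }
}
```

Both functions produce the same value, but the callback version also silently drops failures unless every callback forwards them by hand, whereas the transformed future propagates them automatically.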

Here is an example. Say there are two async RPCs: one takes a user ID and returns a future of a list of the user's new message headers (ID and title); the other takes a message ID and returns a future of the full message.

// RPC 1: Gets the list of new message headers of a user.
Future<NewMessagesResponse> getNewMessages(UserId userId);

// RPC 2: Gets the full message for a message ID.
Future<Message> getMessage(MessageId messageId);

// Data structures.
class Message {
  class Header {
    MessageId id;
    String title;
  }
  class Body {
    ...
  }

  Header header;
  Body body;
}

class NewMessagesResponse {
  List<Message.Header> headers;
}

Your task: given a user ID and a keyword, get the user's new messages whose titles contain the keyword. With future transformation, the code may look like:

// Gets the future of a list of a user's new messages whose titles contain a given keyword.
Future<List<Message>> getNewMessages(UserId userId, String keyword) {
  Future<NewMessagesResponse> newMessagesFuture = getNewMessages(userId);
  Future<List<MessageId>> interestingIdsFuture = filter(newMessagesFuture, keyword);
  Future<List<Message>> messagesFuture = getMessages(interestingIdsFuture);
  return messagesFuture;
}

The structure of the code is similar to what we do with synchronous code:

List<Message> getNewMessages(UserId userId, String keyword) {
  NewMessagesResponse newMessages = getNewMessages(userId);
  List<MessageId> interestingIds = filter(newMessages, keyword);
  List<Message> messages = getMessages(interestingIds);
  return messages;
}

The async and sync functions are isomorphic: there is a one-to-one correspondence in their code structure. Their runtime behaviors differ, though. One runs asynchronously, the other synchronously.
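The `filter` and `getMessages` helpers in the async version are left undefined above. The following is one possible sketch of them on top of `CompletableFuture`, assuming Java 11 or later. The types are simplified stand-ins (IDs are plain strings, the response is just a list of headers), and both RPCs are in-memory stubs invented for the example.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class FixedStepChain {
  // Simplified stand-ins for the article's types; IDs are plain strings here.
  static class Header {
    final String id;
    final String title;
    Header(String id, String title) { this.id = id; this.title = title; }
  }

  static class Message {
    final Header header;
    final String body;
    Message(Header header, String body) { this.header = header; this.body = body; }
  }

  // In-memory data standing in for the server side.
  static final List<Header> HEADERS = List.of(
      new Header("m1", "Lunch on Friday"),
      new Header("m2", "Build failure"),
      new Header("m3", "Friday release notes"));

  // Stub for RPC 1: completes immediately with all new headers.
  static CompletableFuture<List<Header>> getNewMessages(String userId) {
    return CompletableFuture.completedFuture(HEADERS);
  }

  // Stub for RPC 2: completes on another thread with the full message.
  static CompletableFuture<Message> getMessage(String messageId) {
    return CompletableFuture.supplyAsync(() -> {
      Header header = HEADERS.stream()
          .filter(h -> h.id.equals(messageId)).findFirst().orElseThrow();
      return new Message(header, "body of " + messageId);
    });
  }

  // Step 2 (a map): keeps the IDs of headers whose title contains the keyword.
  static CompletableFuture<List<String>> filter(
      CompletableFuture<List<Header>> headersFuture, String keyword) {
    return headersFuture.thenApply(headers -> headers.stream()
        .filter(h -> h.title.contains(keyword))
        .map(h -> h.id)
        .collect(Collectors.toList()));
  }

  // Step 3 (a flatMap): fetches every message and joins the futures into one.
  static CompletableFuture<List<Message>> getMessages(
      CompletableFuture<List<String>> idsFuture) {
    return idsFuture.thenCompose(ids -> {
      List<CompletableFuture<Message>> futures =
          ids.stream().map(FixedStepChain::getMessage).collect(Collectors.toList());
      return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
          .thenApply(done -> futures.stream()
              .map(CompletableFuture::join) // Already completed, so this does not block.
              .collect(Collectors.toList()));
    });
  }

  static CompletableFuture<List<Message>> getNewMessages(String userId, String keyword) {
    return getMessages(filter(getNewMessages(userId), keyword));
  }

  public static void main(String[] args) {
    List<Message> messages = getNewMessages("alice", "Friday").join();
    messages.forEach(m -> System.out.println(m.header.id + ": " + m.header.title));
  }
}
```

Note that the filtering step is a plain `map`, while fetching the messages needs `flatMap` because it starts new asynchronous work.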

Now here comes the real challenge. Suppose we change the first RPC a bit: there may be too many new messages to return at once, so it returns them page by page. Each response may carry a next page token indicating that there are more pages.

// RPC 1: Gets one page of the new message headers of a user. The page is identified by pageToken; an empty token requests the first page.
Future<NewMessagesResponse> getNewMessages(UserId userId, String pageToken);

class NewMessagesResponse {
  List<Message.Header> headers;
  String nextPageToken; // Non-empty nextPageToken indicates there are more pages.
}

Your task remains the same: write a function that takes a user ID and a keyword and returns a list of the user's new messages whose titles contain the keyword.

Future<List<Message>> getNewMessages(UserId userId, String keyword) {
  //TODO
}

The difficulty is that a regular future transformation has a fixed number of steps, which we can simply chain together sequentially to get one future of the final result. With pagination, the number of steps is not known in advance, so how can we chain them together?

For synchronous code, we may use a loop like:

List<Message> getNewMessages(UserId userId, String keyword) {
  List<MessageId> allInterestingIds = new ArrayList<>();
  String pageToken = "";
  do {
    // Fetch one page and keep the IDs whose titles contain the keyword.
    NewMessagesResponse newMessages = getNewMessages(userId, pageToken);
    List<MessageId> interestingIds = filter(newMessages, keyword);
    allInterestingIds.addAll(interestingIds);
    pageToken = newMessages.nextPageToken;
  } while (!isEmpty(pageToken));
  return getMessages(allInterestingIds);
}

Unfortunately, a loop is not applicable to futures, because the loop condition depends on a page token that is not yet available. How can we get one future for all the pages? Recursion comes to the rescue. This is what I call *Stateful Future Transformation*.

class State {
  UserId userId;
  String keyword;
  int pageIndex;
  String pageToken;
  List<MessageId> buffer;
}

Future<State> getInterestingMessages(Future<State> stateFuture) {
  return Future.transform(
      stateFuture, (State state) -> {
        if (state.pageIndex > 0 && isEmpty(state.pageToken)) {
          // Final state: at least one page was fetched and there is no next page.
          return Future.immediate(state);
        } else {
          // Intermediate state: fetch the next page, update the state, then recurse.
          Future<NewMessagesResponse> newMessagesFuture =
              getNewMessages(state.userId, state.pageToken);
          Future<State> nextStateFuture =
              Future.transform(newMessagesFuture, newMessages -> {
                state.pageIndex++;
                state.pageToken = newMessages.nextPageToken;
                state.buffer.addAll(filter(newMessages, state.keyword));
                return Future.immediate(state);
              });
          return getInterestingMessages(nextStateFuture);
        }
      });
}

Future<State> getInterestingMessages(UserId userId, String keyword) {
  State initialState = new State(userId, keyword, 0, "", new ArrayList<>());
  Future<State> initialStateFuture = Future.immediate(initialState);
  return getInterestingMessages(initialStateFuture);
}
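For readers who want to run the idea, here is a sketch of the same recursion on top of `CompletableFuture`, assuming Java 11 or later. Everything outside the recursion is an assumption made for the example: the `Page` type replaces `NewMessagesResponse` and carries only titles, the user ID is dropped, and the paged RPC is a stub with three hard-coded pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class PagedFetch {
  // One page of titles plus the token of the next page ("" means no more pages).
  static class Page {
    final List<String> titles;
    final String nextPageToken;
    Page(List<String> titles, String nextPageToken) {
      this.titles = titles;
      this.nextPageToken = nextPageToken;
    }
  }

  // State carried from one page to the next.
  static class State {
    final String keyword;
    int pageIndex = 0;
    String pageToken = "";
    final List<String> buffer = new ArrayList<>();
    State(String keyword) { this.keyword = keyword; }
  }

  // Stub paged RPC: three hard-coded pages, each completed on another thread.
  static CompletableFuture<Page> getNewMessages(String pageToken) {
    return CompletableFuture.supplyAsync(() -> {
      switch (pageToken) {
        case "": return new Page(List.of("Friday lunch", "Build failure"), "p2");
        case "p2": return new Page(List.of("Friday release"), "p3");
        default: return new Page(List.of("Weekly report"), "");
      }
    });
  }

  // Recursive step: stop at the final state, or fetch one page and recurse.
  static CompletableFuture<State> getInterestingMessages(CompletableFuture<State> stateFuture) {
    return stateFuture.thenCompose(state -> {
      if (state.pageIndex > 0 && state.pageToken.isEmpty()) {
        return CompletableFuture.completedFuture(state); // Final state.
      }
      CompletableFuture<State> next = getNewMessages(state.pageToken).thenApply(page -> {
        state.pageIndex++;
        state.pageToken = page.nextPageToken;
        page.titles.stream()
            .filter(title -> title.contains(state.keyword))
            .forEach(state.buffer::add);
        return state;
      });
      return getInterestingMessages(next);
    });
  }

  static CompletableFuture<State> getInterestingMessages(String keyword) {
    return getInterestingMessages(CompletableFuture.completedFuture(new State(keyword)));
  }

  public static void main(String[] args) {
    State finalState = getInterestingMessages("Friday").join();
    System.out.println(finalState.pageIndex + " pages: " + finalState.buffer);
  }
}
```

The state object is mutated in place, which is safe here only because each page is processed strictly after the previous one completes.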

The code above can be refactored into a general stateful future transformation function:

// Transforms the future of an initial state into the future of its final state.
<StateT> Future<StateT> transform(
    Future<StateT> stateFuture,
    Function<StateT, Boolean> isFinalState,
    Function<StateT, Future<StateT>> getNextState) {
  return Future.transform(
      stateFuture,
      (StateT state) -> {
        // Stop at the final state; otherwise advance one step and recurse.
        return isFinalState.apply(state)
            ? Future.immediate(state)
            : transform(getNextState.apply(state), isFinalState, getNextState);
      });
}
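A runnable counterpart of this general function might look as follows with `CompletableFuture`. It is a sketch rather than a definitive implementation: the `StatefulTransform` class name and the toy doubling state in `main` are invented for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class StatefulTransform {
  // Repeatedly applies getNextState until isFinalState returns true.
  static <S> CompletableFuture<S> transform(
      CompletableFuture<S> stateFuture,
      Function<S, Boolean> isFinalState,
      Function<S, CompletableFuture<S>> getNextState) {
    return stateFuture.thenCompose(state ->
        isFinalState.apply(state)
            ? CompletableFuture.completedFuture(state)
            : transform(getNextState.apply(state), isFinalState, getNextState));
  }

  public static void main(String[] args) {
    // Toy state: an integer doubled asynchronously until it reaches at least 100.
    CompletableFuture<Integer> start = CompletableFuture.completedFuture(3);
    CompletableFuture<Integer> result = transform(
        start,
        n -> n >= 100,
        n -> CompletableFuture.supplyAsync(() -> n * 2));
    System.out.println(result.join()); // 192
  }
}
```

The pagination example fits this shape by using "no next page token after the first fetch" as `isFinalState` and "fetch one page and update the state" as `getNextState`.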

Posted by Todd Wei
