
LangMem


https://blog.langchain.dev/langmem-sdk-launch/

On memory and adaptive agents

Agents use memory to learn, but the way their memories are formed, stored, updated, and retrieved shapes the types of things your agent can learn to know or do. At LangChain, we’ve found it useful to first identify the capabilities your agent needs to be able to learn, map these to specific memory types or approaches, and only then implement them in your agent. Before adding memory, we think you should consider:

  • What behavior should be learned (user-informed) vs. pre-defined?
  • What types of knowledge or facts should be tracked?
  • What conditions should trigger a memory to be recalled?

While there may be some overlap, each memory type serves distinct functions when building adaptive agents:

Memory Type | Purpose | Agent Example | Human Example | Typical Storage Pattern
Semantic | Facts & Knowledge | User preferences; knowledge triplets | Knowing Python is a programming language | Profile or Collection
Episodic | Past Experiences | Few-shot examples; summaries of past conversations | Remembering your first day at work | Collection
Procedural | System Behavior | Core personality and response patterns | Knowing how to ride a bicycle | Prompt rules or Collection


All of these memory types are meant to address recall beyond individual conversations. Memory within a given conversation, or thread, is already handled reasonably well using checkpointing in LangGraph (so long as it doesn’t extend beyond the model’s effective context window), which serves as the “short-term” or “working” memory system for your agent.

Note that this also differs from standard RAG in a couple of ways. One is how the information is gained: through interaction rather than offline data ingestion. The other is the type of information that’s prioritized. Below, we describe each memory type in more detail.

Semantic memory: facts

Semantic memory stores key facts (and their relationships) and other information that ground an agent's responses. It lets your agent remember important details that wouldn’t be “pre-trained” into the model itself and that isn’t accessible from a web search or generic retriever.

Code

memories = [
    ExtractedMemory(
        id="27e96a9d-8e53-4031-865e-5ec50c1f7ad5",
        content=Memory(
            content="Alice manages the ML team and mentors Bob, who is also on the team."
        ),
    ),
    ExtractedMemory(
        id="e2f6b646-cdf1-4be1-bb40-0fd91d25d00f",
        content=Memory(
            content="Bob now leads the ML team and the NLP project."
        ),
    ),
]

In our experience, semantic memory is the most common form of “memory” that engineers ask for and imagine (after, perhaps, short-term “conversation history” memory) when they first seek to add a memory layer.

It also (debatably) has the most overlap with traditional RAG systems. If the knowledge is available from another store (docs site, codebase, etc.), and if that store is the source of truth (rather than the interactions themselves), then your agent may work fine simply retrieving over that knowledge corpus directly. Or you can periodically ingest that knowledge to integrate it into the semantic memory system. If the knowledge concerns personalization (facts about the user) or conceptual relationships not found in the raw materials, then semantic memory is a perfect fit.

Procedural memory: evolving behavior

Procedural memory represents internalized knowledge of how to perform tasks. It is distinct from episodic memory in that it focuses on generalized skills, rules, and behaviors. For AI agents, procedural memory is spread across a combination of model weights, agent code, and the agent's prompt, which collectively determine the agent's functionality. In LangMem, we focus on saving learned procedures as updated instructions in the agent's prompt.

Code

"""
You are a helpful assistant.
If the user asks about astronomy, explain topics clearly using real-world examples and current scientific data.
Use visual references when helpful and adapt to the user's knowledge level.
Balance practical observational astronomy with theoretical concepts, providing either viewing advice or technical explanations based on user needs.
"""

The optimizer is prompted to identify patterns in successful and unsuccessful interactions, then to update the system prompt to reinforce effective behaviors. This creates a feedback loop in which the agent's core instructions evolve based on observed performance.

Informed by our work on prompt optimization, LangMem provides multiple algorithms for generating prompt update proposals:

  • metaprompt: uses reflection and additional “thinking” time to study the conversations, then applies a meta-prompt to propose the update.
  • gradient: explicitly divides the work into separate critique and prompt-proposal steps, simplifying the task at each step.
  • prompt_memory: a simple algorithm that attempts to do the above in a single step.
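
To make this concrete, here is a minimal sketch of the optimizer API, assuming create_prompt_optimizer and its kind parameter behave as in the LangMem docs; the conversation and feedback below are invented for illustration:

from langmem import create_prompt_optimizer

# Choose one of the algorithms above via `kind`:
# "metaprompt", "gradient", or "prompt_memory"
optimizer = create_prompt_optimizer(
    "anthropic:claude-3-5-sonnet-latest",
    kind="metaprompt",
)

# Each trajectory pairs a conversation with (optional) feedback;
# the feedback values here are illustrative.
trajectories = [
    (
        [
            {"role": "user", "content": "Explain lunar eclipses"},
            {"role": "assistant", "content": "A lunar eclipse occurs when..."},
        ],
        {"score": 0.9, "comment": "Clear, used a real-world example"},
    ),
]

updated_prompt = optimizer.invoke(
    {"trajectories": trajectories, "prompt": "You are a helpful assistant."}
)
print(updated_prompt)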

Episodic memory: events and experiences

Episodic memory stores memories of past interactions. It is distinct from procedural memory in its focus on recalling specific experiences. It is distinguished from semantic memory in its focus on past events rather than general knowledge, answering “how” the agent solved a particular problem rather than just “what” the answer was. It often takes the form of few-shot examples, with each example distilled from a longer raw interaction. LangMem doesn't yet support opinionated utilities for episodic memory.


Video

https://www.bilibili.com/video/BV1yLPieVEvo/?spm_id_from=333.337.search-card.all.click&vd_source=57e261300f39bf692de396b55bf8c41b

The LangChain family of projects is arguably the most serious, persistently improving effort out there right now. Lately I have been thinking about where the boundary between Agents and LLMs lies. Regarding the recent moves by Letta and LangMem, I think they reflect how these two teams' definition and understanding of agent systems is shifting with the arrival of R1-style models that reason natively and the RFT-style agentic LLM training methods represented by Deep Research. Models are swallowing the classic components of agents: test-time thinking is swallowing planning, and RFT is swallowing workflows and tool calling. For now, only memory seems to be left to the agent layer, because targeted modification of model weights is too difficult and online learning is not cost-effective.

 

https://langchain-ai.github.io/langmem/

LangMem helps agents learn and adapt from their interactions over time.

It provides tooling to extract important information from conversations, optimize agent behavior through prompt refinement, and maintain long-term memory.

It offers both functional primitives you can use with any storage system and native integration with LangGraph's storage layer.

This lets your agents continuously improve, personalize their responses, and maintain consistent behavior across sessions.

Key features

  • 🧩 Core memory API that works with any storage system
  • 🧠 Memory management tools that agents can use to record and search information during active conversations "in the hot path"
  • ⚙️ Background memory manager that automatically extracts, consolidates, and updates agent knowledge
  • Native integration with LangGraph's Long-term Memory Store, available by default in all LangGraph Platform deployments

 

Hot Path Quickstart Guide

https://langchain-ai.github.io/langmem/hot_path_quickstart/

Memories can be created in two ways:

  1. 👉 In the hot path (this guide): the agent consciously saves notes using tools.
  2. In the background: memories are "subconsciously" extracted automatically from conversations (see Background Quickstart).

Hot Path Quickstart Diagram

In this guide, we will create a LangGraph agent that actively manages its own long-term memory through LangMem's manage_memory tool.

 

from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
from langgraph.config import get_store
from langmem import (
    # Lets agent create, update, and delete memories 
    create_manage_memory_tool,
)


def prompt(state):
    """Prepare the messages for the LLM."""
    # Get store from configured contextvar; 
    store = get_store() # Same as that provided to `create_react_agent`
    memories = store.search(
        # Search within the same namespace as the one
        # we've configured for the agent
        ("memories",),
        query=state["messages"][-1].content,
    )
    system_msg = f"""You are a helpful assistant.

## Memories
<memories>
{memories}
</memories>
"""
    return [{"role": "system", "content": system_msg}, *state["messages"]]


store = InMemoryStore(
    index={ # Configure semantic search over stored memories
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
) 
checkpointer = MemorySaver() # Checkpoint graph state 

agent = create_react_agent( 
    "anthropic:claude-3-5-sonnet-latest",
    prompt=prompt,
    tools=[ # Add memory tools 
        # The agent can call "manage_memory" to
        # create, update, and delete memories by ID
        # Namespaces add scope to memories. To
        # scope memories per-user, do ("memories", "{user_id}"): 
        create_manage_memory_tool(namespace=("memories",)),
    ],
    # Our memories will be stored in this provided BaseStore instance
    store=store,
    # And the graph "state" will be checkpointed after each node
    # completes executing for tracking the chat history and durable execution
    checkpointer=checkpointer, 
)
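
For illustration, here is a hedged usage sketch (thread IDs and messages are invented): the checkpointer scopes chat history to a thread, while memories saved through the manage_memory tool persist in the store across threads:

config = {"configurable": {"thread_id": "thread-a"}}
agent.invoke(
    {"messages": [{"role": "user", "content": "Remember that I prefer dark mode."}]},
    config=config,
)

# A new thread starts with an empty history, but the saved memory survives:
config = {"configurable": {"thread_id": "thread-b"}}
response = agent.invoke(
    {"messages": [{"role": "user", "content": "What are my display preferences?"}]},
    config=config,
)
print(response["messages"][-1].content)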

 

Background Quickstart Guide

https://langchain-ai.github.io/langmem/background_quickstart/

Memories can be created in two ways:

  1. In the hot path: the agent consciously saves notes using tools (see Hot path quickstart).
  2. 👉In the background (this guide): memories are "subconsciously" extracted automatically from conversations.

Background Quickstart Diagram

This guide shows you how to extract and consolidate memories in the background using create_memory_store_manager. The agent will continue as normal while memories are processed in the background.

 

from langchain.chat_models import init_chat_model
from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore

from langmem import create_memory_store_manager

store = InMemoryStore( 
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)  
llm = init_chat_model("anthropic:claude-3-5-sonnet-latest")

# Create memory manager Runnable to extract memories from conversations
memory_manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    # Store memories in the "memories" namespace (aka directory)
    namespace=("memories",),  
)

@entrypoint(store=store)  # Create a LangGraph workflow
async def chat(message: str):
    response = llm.invoke(message)

    # memory_manager extracts memories from conversation history
    # We'll provide it in OpenAI's message format
    to_process = {"messages": [{"role": "user", "content": message}] + [response]}
    await memory_manager.ainvoke(to_process)  
    return response.content
# Run conversation as normal
response = await chat.ainvoke(
    "I like dogs. My dog's name is Fido.",
)
print(response)
# Output: That's nice! Dogs make wonderful companions. Fido is a classic dog name. What kind of dog is Fido?
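
To verify what was extracted, you can search the store directly; a quick sketch (the extracted content will vary from run to run):

for item in store.search(("memories",)):
    print(item.value)
# e.g. {'kind': 'Memory', 'content': {'content': "User has a dog named Fido."}}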

 

Long-term Memory in LLM Applications

https://langchain-ai.github.io/langmem/concepts/conceptual_guide/


standalone_examples

https://github.com/langchain-ai/langmem/tree/main/examples/standalone_examples

"""Example demonstrating how to use a custom store with the memory manager.

This example shows how to:
1. Create a custom InMemoryStore
2. Define a structured memory schema using Pydantic
3. Initialize a memory manager with the custom store
4. Use the memory manager to store and retrieve memories

The example demonstrates that the memory manager can work independently of LangGraph's
context store, making it usable in standalone applications.
"""

from langgraph.store.memory import InMemoryStore
from pydantic import BaseModel

from langmem import create_memory_store_manager


class PreferenceMemory(BaseModel):
    """Store preferences about the user."""
    category: str
    preference: str
    context: str


def create_store():
    """Create a custom InMemoryStore with OpenAI embeddings."""
    return InMemoryStore(
        index={
            "dims": 1536,
            "embed": "openai:text-embedding-3-small",
        }
    )


async def run_example():
    """Run the example demonstrating custom store usage."""
    # Create our custom store
    store = create_store()

    # Initialize memory manager with custom store
    manager = create_memory_store_manager(
        "openai:gpt-4o-mini",
        schemas=[PreferenceMemory],
        namespace=("project", "{langgraph_user_id}"),
        store=store  # Pass our custom store here
    )

    # Simulate a conversation
    conversation = [
        {"role": "user", "content": "I prefer dark mode in all my apps"},
        {"role": "assistant", "content": "I'll remember that preference"}
    ]

    # Process the conversation and store memories
    print("Processing conversation...")
    await manager.ainvoke(
        {"messages": conversation},
        config={"configurable": {"langgraph_user_id": "user123"}}
    )

    # Retrieve and display stored memories
    print("\nStored memories:")
    memories = store.search(("project", "user123"))
    for memory in memories:
        print(f"\nMemory {memory.key}:")
        print(f"Content: {memory.value['content']}")
        print(f"Kind: {memory.value['kind']}")


if __name__ == "__main__":
    import asyncio
    print("\nStarting custom store example...\n")
    asyncio.run(run_example())
    print("\nExample completed.\n") 

 

How to Extract Episodic Memories

https://langchain-ai.github.io/langmem/guides/extract_episodic_memories/#without-storage

from langmem import create_memory_manager
from pydantic import BaseModel, Field


class Episode(BaseModel):  
    """Write the episode from the perspective of the agent within it. Use the benefit of hindsight to record the memory, saving the agent's key internal thought process so it can learn over time."""

    observation: str = Field(..., description="The context and setup - what happened")
    thoughts: str = Field(
        ...,
        description="Internal reasoning process and observations of the agent in the episode that let it arrive"
        ' at the correct action and result. "I ..."',
    )
    action: str = Field(
        ...,
        description="What was done, how, and in what format. (Include whatever is salient to the success of the action.) I ...",
    )
    result: str = Field(
        ...,
        description="Outcome and retrospective. What did you do well? What could you do better next time? I ...",
    )

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[Episode],  
    instructions="Extract examples of successful explanations, capturing the full chain of reasoning. Be concise in your explanations and precise in the logic of your reasoning.",
    enable_inserts=True,
)
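
A hedged usage sketch (the conversation is invented): invoking the manager over a finished exchange returns the extracted episodes directly, with no store involved:

conversation = [
    {"role": "user", "content": "What's a binary tree? I work with family trees if that helps"},
    {"role": "assistant", "content": "A binary tree is like a family tree, except each node has at most two children..."},
]
extracted = manager.invoke({"messages": conversation})
for item in extracted:
    print(item.content)  # each `content` is an Episode instance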

The same Episode schema can also be paired with a store, so episodes are saved automatically and retrieved to prime future responses:

from langchain.chat_models import init_chat_model
from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from langmem import create_memory_store_manager

# Set up vector store for similarity search
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

# Configure memory manager with storage
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "episodes"),
    schemas=[Episode],
    instructions="Extract exceptional examples of noteworthy problem-solving scenarios, including what made them effective.",
    enable_inserts=True,
)

llm = init_chat_model("anthropic:claude-3-5-sonnet-latest")


@entrypoint(store=store)
def app(messages: list):
    # Step 1: Find similar past episodes
    similar = store.search(
        ("memories", "episodes"),
        query=messages[-1]["content"],
        limit=1,
    )

    # Step 2: Build system message with relevant experience
    system_message = "You are a helpful assistant."
    if similar:
        system_message += "\n\n### EPISODIC MEMORY:"
        for i, item in enumerate(similar, start=1):
            episode = item.value["content"]
            system_message += f"""

Episode {i}:
When: {episode['observation']}
Thought: {episode['thoughts']}
Did: {episode['action']}
Result: {episode['result']}
        """

    # Step 3: Generate response using past experience
    response = llm.invoke([{"role": "system", "content": system_message}, *messages])

    # Step 4: Store this interaction if successful
    manager.invoke({"messages": messages})
    return response


app.invoke(
    [
        {
            "role": "user",
            "content": "What's a binary tree? I work with family trees if that helps",
        },
    ],
)
print(store.search(("memories", "episodes"), query="Trees"))

# [
#     Item(
#         namespace=["memories", "episodes"],
#         key="57f6005b-00f3-4f81-b384-961cb6e6bf97",
#         value={
#             "kind": "Episode",
#             "content": {
#                 "observation": "User asked about binary trees and mentioned familiarity with family trees. This presented an opportunity to explain a technical concept using a relatable analogy.",
#                 "thoughts": "I recognized this as an excellent opportunity to bridge understanding by connecting a computer science concept (binary trees) to something the user already knows (family trees). The key was to use their existing mental model of hierarchical relationships in families to explain binary tree structures.",
#                 "action": "Used family tree analogy to explain binary trees: Each person (node) in a binary tree can have at most two children (left and right), unlike family trees where people can have multiple children. Drew parallel between parent-child relationships in both structures while highlighting the key difference of the two-child limitation in binary trees.",
#                 "result": "Successfully translated a technical computer science concept into familiar terms. This approach demonstrated effective teaching through analogical reasoning - taking advantage of existing knowledge structures to build new understanding. For future similar scenarios, this reinforces the value of finding relatable real-world analogies when explaining technical concepts. The family tree comparison was particularly effective because it maintained the core concept of hierarchical relationships while clearly highlighting the key distinguishing feature (binary limitation).",
#             },
#         },
#         created_at="2025-02-09T03:40:11.832614+00:00",
#         updated_at="2025-02-09T03:40:11.832624+00:00",
#         score=0.30178054939692683,
#     )
# ]

 

Delayed Background Memory Processing

https://langchain-ai.github.io/langmem/guides/delayed_processing/#problem

from langchain.chat_models import init_chat_model
from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from langmem import ReflectionExecutor, create_memory_store_manager

# Create memory manager to extract memories from conversations 
memory_manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories",),
)
# Wrap memory_manager to handle deferred background processing 
executor = ReflectionExecutor(memory_manager)
store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
llm = init_chat_model("anthropic:claude-3-5-sonnet-latest")

@entrypoint(store=store)
def chat(message: str):
    response = llm.invoke(message)
    # Format conversation for memory processing
    # Must follow OpenAI's message format
    to_process = {"messages": [{"role": "user", "content": message}] + [response]}

    # Defer processing until `delay` seconds have passed.
    # If new messages arrive before then:
    # 1. The pending processing task is cancelled
    # 2. Processing is rescheduled with the new messages included
    delay = 0.5  # seconds; in practice choose longer (e.g. 30-60 min) depending on app context
    executor.submit(to_process, after_seconds=delay)
    return response.content
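
A quick usage sketch: run one turn, then wait past the (shortened) delay so the deferred reflection task completes before inspecting the store:

import time

chat.invoke("I like dogs. My dog's name is Fido.")
time.sleep(2)  # longer than delay=0.5, so the deferred task has run
print(store.search(("memories",)))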

 
