
Implementing Memory in LLM Applications Using LangChain


https://www.codecademy.com/article/implementing-memory-in-llm-applications-using-lang-chain

Older version (LangChain v0.1 docs)

https://python.langchain.com/v0.1/docs/modules/memory/types/buffer/

 

How to migrate to LangGraph memory

https://python.langchain.com/docs/versions/migrating_memory/

 

What is Memory in LangChain?

In LangChain, memory is implemented by passing information from the chat history along with the query as part of the prompt. LangChain provides us with different modules we can use to implement memory.

Based on their implementation and functionality, LangChain provides the following memory types (a brief construction sketch follows the list):

  1. Conversation Buffer Memory: This memory stores all the messages in the conversation history.
  2. Conversation Buffer Window Memory: The conversation buffer window memory stores the k most recent interactions of the conversation history. We can specify k according to our needs.
  3. Entity Memory: This type of memory remembers facts about entities in the conversation, such as people, places, and objects. It extracts information about entities and builds up its knowledge as the conversation progresses.
  4. Conversation Summary Memory: As the name suggests, conversation summary memory summarizes the conversation and stores the current summary. This memory is helpful for longer conversations and saves costs by minimizing the number of tokens used in the conversation.
  5. Conversation Summary Buffer Memory: This memory combines Conversation Summary Memory and Conversation Buffer Window Memory: it keeps the last k messages of the conversation verbatim along with a summary of the earlier messages.
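
Below is a minimal sketch of how each of these memory types is constructed with the legacy langchain.memory classes; the Gemini model mirrors the example further down, and the k and max_token_limit values are illustrative assumptions, not requirements:

import os
from langchain.memory import (
    ConversationBufferMemory,
    ConversationBufferWindowMemory,
    ConversationEntityMemory,
    ConversationSummaryMemory,
    ConversationSummaryBufferMemory,
)
from langchain_google_genai import ChatGoogleGenerativeAI

os.environ['GOOGLE_API_KEY'] = "YOUR_API_KEY"
llm = ChatGoogleGenerativeAI(model="gemini-pro")

buffer_memory = ConversationBufferMemory()            # keeps every message
window_memory = ConversationBufferWindowMemory(k=3)   # keeps only the last k interactions
entity_memory = ConversationEntityMemory(llm=llm)     # extracts and stores facts about entities
summary_memory = ConversationSummaryMemory(llm=llm)   # keeps a running summary of the conversation
summary_buffer_memory = ConversationSummaryBufferMemory(
    llm=llm, max_token_limit=200                      # recent messages verbatim plus a summary of older ones
)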

How to Implement Memory in LangChain?

To implement memory in LangChain, we need to store and use previous conversations while answering a new query.

For this, we will first create a conversation buffer memory that stores the previous interactions. Next, we will create a prompt template that passes the messages stored in memory to the LLM along with each new query.

Finally, we will use an LLM chain to run the queries with the memory, the prompt template, and the LLM object, as shown below:

import os 
from langchain.chains import LLMChain 
from langchain.memory import ConversationBufferMemory 
from langchain_core.prompts import HumanMessagePromptTemplate, ChatPromptTemplate, MessagesPlaceholder 
from langchain_google_genai import ChatGoogleGenerativeAI 
os.environ['GOOGLE_API_KEY'] = "YOUR_API_KEY" 
llm = ChatGoogleGenerativeAI(model="gemini-pro") 
first_prompt="Who is Elon Musk? Answer in 1 sentence."
second_prompt="When was he born?" 
# return_messages=True so MessagesPlaceholder receives a list of message objects rather than a string
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
prompt_template = ChatPromptTemplate(
    messages=[ 
        MessagesPlaceholder(variable_name="chat_history"), 
        HumanMessagePromptTemplate.from_template("{query}") 
    ] 
) 
conversation_chain = LLMChain( 
    llm=llm, 
    prompt=prompt_template, 
    memory=memory 
) 
first_output=conversation_chain.run({"query":first_prompt}) 
second_output=conversation_chain.run({"query":second_prompt}) 
print("The first prompt is:",first_prompt) 
print("The second prompt is:",second_prompt) 
print("The output for the first prompt is:") 
print(first_output) 
print("The output for the second prompt is:") 
print(second_output) 
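
To verify that the chain is actually reusing the history, here is a quick sketch that dumps what the buffer memory currently holds via its standard load_memory_variables accessor:

# Inspect the conversation history accumulated after both queries
print(memory.load_memory_variables({}))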

 

Memory

https://langchain-ai.github.io/langgraph/concepts/memory/

 

How to add memory to chatbots

https://python.langchain.com/docs/how_to/chatbots_memory/

from langchain_core.messages import HumanMessage, RemoveMessage, SystemMessage
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, MessagesState, StateGraph

# Any chat model can back the graph; an OpenAI chat model is assumed here
model = ChatOpenAI(model="gpt-4o-mini")

workflow = StateGraph(state_schema=MessagesState)


# Define the function that calls the model
def call_model(state: MessagesState):
    system_prompt = (
        "You are a helpful assistant. "
        "Answer all questions to the best of your ability. "
        "The provided chat history includes a summary of the earlier conversation."
    )
    system_message = SystemMessage(content=system_prompt)
    message_history = state["messages"][:-1]  # exclude the most recent user input
    # Summarize the messages if the chat history reaches a certain size
    if len(message_history) >= 4:
        last_human_message = state["messages"][-1]
        # Invoke the model to generate conversation summary
        summary_prompt = (
            "Distill the above chat messages into a single summary message. "
            "Include as many specific details as you can."
        )
        summary_message = model.invoke(
            message_history + [HumanMessage(content=summary_prompt)]
        )

        # Delete messages that we no longer want to show up
        delete_messages = [RemoveMessage(id=m.id) for m in state["messages"]]
        # Re-add user message
        human_message = HumanMessage(content=last_human_message.content)
        # Call the model with summary & response
        response = model.invoke([system_message, summary_message, human_message])
        message_updates = [summary_message, human_message, response] + delete_messages
    else:
        message_updates = model.invoke([system_message] + state["messages"])

    return {"messages": message_updates}


# Define the node and edge
workflow.add_node("model", call_model)
workflow.add_edge(START, "model")

# Add simple in-memory checkpointer
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
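
A short usage sketch for this graph: the compiled app is invoked with a thread_id in the config so the MemorySaver checkpointer knows which conversation to restore and persist; the thread id and messages below are illustrative:

# Run two turns on the same thread; the checkpointer carries the history between calls
config = {"configurable": {"thread_id": "demo-thread"}}
app.invoke({"messages": [HumanMessage(content="Hi, I'm Bob.")]}, config)
result = app.invoke({"messages": [HumanMessage(content="What is my name?")]}, config)
print(result["messages"][-1].content)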

 

LangMem

https://langchain-ai.github.io/long-term-memory/

import uuid

from langmem import AsyncClient

client = AsyncClient()
user_id = str(uuid.uuid4())
thread_id = str(uuid.uuid4())

messages = [
    {
        "role": "user",
        "content": "Hi, I love playing basketball!",
        "metadata": {"user_id": user_id},
    },
    {
        "role": "assistant",
        "content": "That's great! Basketball is a fun sport. Do you have a favorite player?",
    },
    {
        "role": "user",
        "content": "Yeah, Steph Curry is amazing!",
        "metadata": {"user_id": user_id},
    },
]

# Send the conversation to LangMem and trigger memory formation for this thread
await client.add_messages(thread_id=thread_id, messages=messages)
await client.trigger_all_for_thread(thread_id=thread_id)

import anthropic
from langsmith import traceable

anthropic_client = anthropic.AsyncAnthropic()


@traceable(name="Claude", run_type="llm")
async def completion(messages: list, model: str = "claude-3-haiku-20240307"):
    system_prompt = messages[0]["content"]
    msgs = []
    for m in messages[1:]:
        msgs.append({k: v for k, v in m.items() if k != "metadata"})
    response = await anthropic_client.messages.create(
        model=model,
        system=system_prompt,
        max_tokens=1024,
        messages=msgs,
    )
    return response


async def completion_with_memory(messages, user_id):
    # Look up memories relevant to the latest user message and inject them via a system prompt
    memories = await client.query_user_memory(
        user_id=user_id,
        text=messages[-1]["content"],
    )
    facts = "\n".join([mem["text"] for mem in memories["memories"]])

    system_prompt = {
        "role": "system",
        "content": "Here are some things you know" f" about the user:\n\n{facts}",
    }

    return await completion([system_prompt] + messages)


new_messages = [
    {
        "role": "user",
        "content": "Do you remember who my favorite basketball player is?",
        "metadata": {"user_id": user_id},
    }
]

response = await completion_with_memory(new_messages, user_id=user_id)
print(response.content[0].text)

 
