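LangChain Tutorial
Reference URL:
1、Introduction
LangChain is a framework for developing applications powered by large language models (LLMs).
LangChain simplifies every stage of the LLM application lifecycle.
Specifically, the framework consists of the following open-source libraries: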
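- langchain-core: Base abstractions and the LangChain Expression Language.
- langchain-community: Third-party integrations.
- Partner packages (e.g. langchain-openai, langchain-anthropic): Some integrations have been further split into their own lightweight packages that depend only on langchain-core.
- langchain: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- langgraph: Build robust, stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
- langserve: Deploy LangChain chains as REST APIs.
The broader ecosystem includes:
- LangSmith: A developer platform that lets you debug, test, evaluate, and monitor LLM applications, and that integrates seamlessly with LangChain.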
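1、How to install LangChain, set up your environment, and start building:
https://python.langchain.com/docs/get_started/installation/
2、If you want to build something specific, or are more of a hands-on learner, check out the use cases. They are walkthroughs and techniques for common end-to-end tasks; section 2 below covers several of them.
3、Expression Language
LangChain Expression Language (LCEL) is the foundation of many of LangChain's components and is a declarative way to compose chains. LCEL was designed from day one to support putting prototypes into production with no code changes, from the simplest "prompt + LLM" chain to the most complex chains.
- Get started: LCEL and its benefits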
- Runnable interface: The standard interface for LCEL objects
- Primitives: More on the primitives LCEL includes
2、Examples
2.1、RAG
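One of the most powerful applications enabled by LLMs is the sophisticated question-answering (Q&A) chatbot. These applications can answer questions about specific source information, and they use a technique known as retrieval-augmented generation (RAG).
RAG is a technique for augmenting LLM knowledge with additional data.
LLMs can reason about a wide range of topics, but their knowledge is limited to public data up to the specific point in time at which they were trained. If you want to build AI applications that can reason about private data or data introduced after a model's cutoff date, you need to augment the model's knowledge with the specific information it needs. The process of bringing in the appropriate information and inserting it into the model prompt is known as retrieval-augmented generation (RAG).
LangChain has a number of components designed to help build Q&A applications, and RAG applications more generally.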
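2.1.1、A typical RAG application has two main components
- Indexing: a pipeline for ingesting data from a source and indexing it. This usually happens offline.
- Retrieval and generation: the actual RAG chain, which takes the user query at run time, retrieves the relevant data from the index, and then passes it to the model.
The most common full sequence from raw data to answer looks like this: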
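2.1.2、Indexing
- Load: First we need to load our data. This is done with DocumentLoaders.
- Split: Text splitters break large documents into smaller chunks. This is useful both for indexing data and for passing it to a model, since large chunks are harder to search over and won't fit in a model's finite context window.
- Store: We need somewhere to store and index our splits so that they can be searched later. This is often done using a VectorStore and an Embeddings model.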
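To make the split step concrete, here is a minimal sketch of fixed-size chunking with overlap. It is only an illustration of what chunk_size and chunk_overlap mean; the function name split_text is hypothetical, and LangChain's RecursiveCharacterTextSplitter uses a more careful, separator-aware strategy than this.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Naive splitter: fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, so text that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.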
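2.1.3、Retrieval and generation
- Retrieve: Given a user input, relevant splits are retrieved from storage using a Retriever.
- Generate: An LLM produces an answer using a prompt that includes the question and the retrieved data.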
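The two steps can be sketched without any external service. The sketch below assumes a toy bag-of-words "embedding" and cosine similarity in place of a real Embeddings model and VectorStore; the names embed, cosine, retrieve, and build_prompt are all hypothetical, and the final LLM call is left out.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, splits: list[str], k: int = 1) -> list[str]:
    """Return the k splits most similar to the query."""
    q = embed(query)
    ranked = sorted(splits, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Insert the retrieved splits and the question into one prompt string."""
    joined = "\n\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}"


splits = ["harrison worked at kensho", "bears like to eat honey"]
top = retrieve("where did harrison work", splits, k=1)
prompt_text = build_prompt("where did harrison work", top)
print(prompt_text)
```

In a real application the prompt string would then be sent to the model, which produces the answer.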
2.1.4、Code example
import getpass
import os
os.environ["OPENAI_API_KEY"] = getpass.getpass()
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
import bs4
from langchain import hub
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")
def format_docs(docs):
    # Join the retrieved chunks into a single context string.
    return "\n\n".join(doc.page_content for doc in docs)
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
rag_chain.invoke("What is Task Decomposition?")
'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done through prompting techniques like Chain of Thought or Tree of Thoughts, or by using task-specific instructions or human inputs. Task decomposition helps agents plan ahead and manage complicated tasks more effectively.'
# cleanup
vectorstore.delete_collection()
2.2、Extracting structured output
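Classical solutions to information extraction rely on a combination of people, (many) hand-crafted rules (e.g. regular expressions), and custom fine-tuned ML models. Such systems tend to grow complex over time and become progressively more expensive to maintain and harder to enhance. LLMs can be adapted quickly to a specific extraction task just by providing appropriate instructions and appropriate reference examples. This guide shows how to use LLMs for extraction applications.
2.2.1、Approaches
There are three broad approaches to information extraction using LLMs:
- Tool/function calling mode: Some LLMs support a tool or function calling mode. These LLMs can structure output according to a given schema. Generally, this approach is the easiest to work with and is expected to yield good results.
- JSON mode: Some LLMs can be forced to output valid JSON. This is similar to the tool/function calling approach, except that the schema is provided as part of the prompt. Generally, our intuition is that this performs worse than the tool/function calling approach, but verify this for your own use case.
- Prompting based: LLMs that can follow instructions well can be instructed to generate text in a desired format. The generated text can be converted into a structured format such as JSON using existing output parsers or a custom parser. This approach can be used with LLMs that do not support JSON mode or tool/function calling mode. It is more broadly applicable, though it may yield worse results than models that have been fine-tuned for extraction or function calling.
2.2.2、Worked example
We use a chat model that supports function/tool calling to extract information from text.
1、Schema
First, we need to describe what information we want to extract from the text. We use Pydantic to define an example schema for extracting personal information.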
from typing import Optional
from langchain_core.pydantic_v1 import BaseModel, Field
class Person(BaseModel):
    """Information about a person."""
    # ^ Doc-string for the entity Person.
    # This doc-string is sent to the LLM as the description of the schema Person,
    # and it can help to improve extraction results.
    # Note that:
    # 1. Each field is an `optional` -- this allows the model to decline to extract it!
    # 2. Each field has a `description` -- this description is used by the LLM.
    # Having a good description can help improve extraction results.
    name: Optional[str] = Field(default=None, description="The name of the person")
    hair_color: Optional[str] = Field(
        default=None, description="The color of the person's hair if known"
    )
    height_in_meters: Optional[str] = Field(
        default=None, description="Height measured in meters"
    )
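There are two best practices when defining a schema:
- Document the attributes and the schema itself: this information is sent to the LLM and is used to improve the quality of information extraction.
- Do not force the LLM to make up information! Above we used Optional for the attributes, which allows the LLM to output None when it doesn't know the answer.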
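The effect of optional fields can be seen in a small standard-library sketch. It assumes the model has returned a JSON string (the raw_json value below is made up for illustration), and PersonRecord and parse_person are hypothetical names, not part of LangChain or Pydantic.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PersonRecord:
    """Plain-dataclass analogue of a schema whose fields are all optional."""
    name: Optional[str] = None
    hair_color: Optional[str] = None
    height_in_meters: Optional[str] = None


def parse_person(raw_json: str) -> PersonRecord:
    """Build a record from model output; missing or null attributes stay None."""
    data = json.loads(raw_json)
    known = {f.name for f in fields(PersonRecord)}
    # Ignore keys the schema does not define instead of failing on them.
    return PersonRecord(**{k: v for k, v in data.items() if k in known})


# Hypothetical model output: the height is unknown, so it is simply absent.
record = parse_person('{"name": "Alan Smith", "hair_color": null}')
print(record)
```

Because every field defaults to None, the model is never pushed to invent a value just to satisfy the schema.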
2、Extractor
Let's create an information extractor using the schema defined above.
from typing import Optional
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI
# Define a custom prompt to provide instructions and any additional context.
# 1) You can add examples into the prompt template to improve extraction quality
# 2) Introduce additional parameters to take context into account (e.g., include metadata
# about the document from which the text was extracted.)
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are an expert extraction algorithm. "
            "Only extract relevant information from the text. "
            "If you do not know the value of an attribute asked to extract, "
            "return null for the attribute's value.",
        ),
        # Please see the how-to about improving performance with
        # reference examples.
        # MessagesPlaceholder('examples'),
        ("human", "{text}"),
    ]
)
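We need to use a model that supports function/tool calling.
See the structured output documentation for a list of some models that can be used with this API.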
from langchain_mistralai import ChatMistralAI
llm = ChatMistralAI(model="mistral-large-latest", temperature=0)
runnable = prompt | llm.with_structured_output(schema=Person)
Let's test it out
text = "Alan Smith is 6 feet tall and has blond hair."
runnable.invoke({"text": text})
Output:
Person(name='Alan Smith', hair_color='blond', height_in_meters='1.8288')
2.3、ChatBot
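Chatbots are one of the most popular use cases for LLMs. The core features of chatbots are that they can hold long-running, stateful conversations and can answer user questions using relevant information.
Designing a chatbot involves weighing various techniques with different benefits and tradeoffs, depending on what kinds of questions you expect it to handle.
Chatbots commonly use RAG over private data to better answer domain-specific questions. You might also choose to route between multiple data sources to ensure it only uses the most relevant context for the final answer, or to use more specialized types of chat history or memory than just passing messages back and forth.
2.3.1、Example
We'll walk through an example of how to design and implement an LLM-powered chatbot. Here are a few of the high-level components we'll be working with:
- Chat models: The chatbot interface is based around messages rather than raw text, so it is best suited to chat models rather than text LLMs. You can also use text LLMs for chatbots, but chat models have a more conversational tone and natively support a message interface.
- Prompt templates, which simplify the process of assembling prompts that combine default messages, user input, chat history, and (optionally) additional retrieved context.
- Chat history, which allows a chatbot to "remember" past interactions and take them into account when responding to follow-up questions.
- Retrievers (optional), which are useful if you want to build a chatbot that can use domain-specific, up-to-date knowledge as context to augment its responses.
We'll cover how to fit the above components together to create a powerful conversational chatbot.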
1、chatmodel
from langchain_openai import ChatOpenAI
chat = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0.2)
2、prompt template
Let's define a prompt template to make formatting a bit easier. We can create a chain by piping it into the model:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)
chain = prompt | chat
3、Message history
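As a shortcut for managing the chat history, we can use a MessageHistory class, which is responsible for saving and loading chat messages. There are many built-in message history integrations that persist messages to a variety of databases, but for this quickstart we'll use an in-memory demo message history called ChatMessageHistory.
Here's an example of using it directly: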
from langchain.memory import ChatMessageHistory
demo_ephemeral_chat_history = ChatMessageHistory()
demo_ephemeral_chat_history.add_user_message("hi!")
demo_ephemeral_chat_history.add_ai_message("whats up?")
demo_ephemeral_chat_history.messages
Output:
[HumanMessage(content='hi!'), AIMessage(content='whats up?')]
Once we do that, we can pass the stored messages directly into our chain as a parameter:
demo_ephemeral_chat_history.add_user_message(
    "Translate this sentence from English to French: I love programming."
)
response = chain.invoke({"messages": demo_ephemeral_chat_history.messages})
response
Output:
AIMessage(content='The translation of "I love programming" in French is "J\'adore la programmation."')
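Now we have a basic chatbot!
While this chain can serve as a useful chatbot on its own with just the model's internal knowledge, it is often useful to introduce some form of retrieval-augmented generation (RAG) over domain-specific knowledge to make our chatbot more focused. We'll cover this next.
4、Retrievers
We can set up and use a retriever to pull domain-specific knowledge for our chatbot.
Next, we'll use a document loader to pull data from a web page: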
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
data = loader.load()
Next, we split it into smaller chunks that the LLM's context window can handle:
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
Then we embed those chunks and store them in a vector database:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
And finally, let's create a retriever from our initialized vectorstore:
# k is the number of chunks to retrieve; it is passed through search_kwargs
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
docs = retriever.invoke("how can langsmith help with testing?")
docs
Output:
[Document(page_content='You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring\u200bAfter all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be', metadata={'description': 'Building reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'LangSmith Overview and User Guide | 🦜️🛠️ LangSmith'}),
Document(page_content='inputs, and see what happens. At some point though, our application is performing\nwell and we want to be more rigorous about testing changes. We can use a dataset\nthat we’ve constructed along the way (see above). Alternatively, we could spend some\ntime constructing a small dataset by hand. For these situations, LangSmith simplifies', metadata={'description': 'Building reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'LangSmith Overview and User Guide | 🦜️🛠️ LangSmith'}),
Document(page_content='Skip to main content🦜️🛠️ LangSmith DocsPython DocsJS/TS DocsSearchGo to AppLangSmithOverviewTracingTesting & EvaluationOrganizationsHubLangSmith CookbookOverviewOn this pageLangSmith Overview and User GuideBuilding reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.Over the past two months, we at LangChain', metadata={'description': 'Building reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'LangSmith Overview and User Guide | 🦜️🛠️ LangSmith'}),
Document(page_content='have been building and using LangSmith with the goal of bridging this gap. This is our tactical user guide to outline effective ways to use LangSmith and maximize its benefits.On by default\u200bAt LangChain, all of us have LangSmith’s tracing running in the background by default. On the Python side, this is achieved by setting environment variables, which we establish whenever we launch a virtual environment or open our bash shell and leave them set. The same principle applies to most JavaScript', metadata={'description': 'Building reliable LLM applications can be challenging. LangChain simplifies the initial setup, but there is still work needed to bring the performance of prompts, chains and agents up the level where they are reliable enough to be used in production.', 'language': 'en', 'source': 'https://docs.smith.langchain.com/overview', 'title': 'LangSmith Overview and User Guide | 🦜️🛠️ LangSmith'})]
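We can see that invoking the retriever above returns some parts of the LangSmith docs that contain information about testing, which our chatbot can use as context when answering questions.
5、Putting it together
Let's modify our previous prompt to accept documents as context. We'll use the create_stuff_documents_chain helper function to "stuff" all of the input documents into the prompt, which also conveniently handles formatting. We use the ChatPromptTemplate.from_messages method to format the message input that we want to pass to the model, including a MessagesPlaceholder where chat history messages will be directly injected: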
from langchain.chains.combine_documents import create_stuff_documents_chain
chat = ChatOpenAI(model="gpt-3.5-turbo-1106")
question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user's questions based on the below context:\n\n{context}",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)
document_chain = create_stuff_documents_chain(chat, question_answering_prompt)
We can invoke this document_chain with the raw documents we retrieved above:
from langchain.memory import ChatMessageHistory
demo_ephemeral_chat_history = ChatMessageHistory()
demo_ephemeral_chat_history.add_user_message("how can langsmith help with testing?")
document_chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
        "context": docs,
    }
)
'LangSmith can assist with testing by providing the capability to quickly edit examples and add them to datasets. This allows for the expansion of evaluation sets or fine-tuning of a model for improved quality or reduced costs. Additionally, LangSmith simplifies the construction of small datasets by hand, providing a convenient way to rigorously test changes in the application.'
Awesome! We see an answer synthesized from information in the input documents.
3、LCEL
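LangChain Expression Language (LCEL) is a declarative way to easily compose chains together.
LCEL was designed from day one to support putting prototypes into production with no code changes, from the simplest "prompt + LLM" chain to the most complex chains.
3.1、Examples
3.1.1、Sequential chain
LCEL makes it easy to build complex chains from basic components, and supports out-of-the-box functionality such as streaming, parallelism, and logging.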
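from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-3.5-turbo")  # assumed model name; any chat model works here
prompt = ChatPromptTemplate.from_template("tell me a short joke about {topic}")
output_parser = StrOutputParser()
chain = prompt | model | output_parser
chain.invoke({"topic": "ice cream"})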
"Why don't ice creams ever get invited to parties?\n\nBecause they always drip when things heat up!"
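- The | symbol is similar to the Unix pipe operator: it chains the different components together, feeding the output of one component as the input to the next.
- In this chain the user input is passed to the prompt template, the prompt template output is passed to the model, and the model output is passed to the output parser. Let's look at each component individually to really understand what's going on.
Steps:
- We pass in user input on the desired topic as {"topic": "ice cream"}.
- The prompt component takes the user input and uses the topic to construct a PromptValue.
- The model component takes the generated prompt and passes it to the OpenAI model for evaluation. The output generated by the model is a ChatMessage object.
- Finally, the output_parser component takes in the ChatMessage and transforms it into a Python string, which is returned from the invoke method.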
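The pipe behaviour described above can be imitated in a few lines by overloading Python's | operator. This is a toy sketch, not LangChain's implementation: the Step class and the fake model are hypothetical, and the "model" simply upper-cases the prompt so the example runs offline.

```python
class Step:
    """A tiny composable unit: wraps a function and supports `a | b` chaining."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # The output of self becomes the input of other.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the prompt, the model, and the output parser.
prompt_step = Step(lambda d: f"tell me a short joke about {d['topic']}")
fake_model = Step(lambda text: {"content": text.upper()})
parser_step = Step(lambda message: message["content"])

toy_chain = prompt_step | fake_model | parser_step
result = toy_chain.invoke({"topic": "ice cream"})
print(result)  # TELL ME A SHORT JOKE ABOUT ICE CREAM
```

The chained object is itself a Step, which is why a composed chain can be invoked, or composed further, exactly like a single component.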
3.1.2、RAG
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
# Reuses ChatPromptTemplate, StrOutputParser, and model from section 3.1.1.
vectorstore = DocArrayInMemorySearch.from_texts(
    ["harrison worked at kensho", "bears like to eat honey"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
output_parser = StrOutputParser()
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser
chain.invoke("where did harrison work?")
In this case, the composed chain is:
chain = setup_and_retrieval | prompt | model | output_parser
retriever.invoke("where did harrison work?")
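Then we use RunnableParallel to prepare the expected inputs to the prompt: the retriever performs the document search to fill the context entry, and RunnablePassthrough passes the user's original question through:
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
The flow is:
- The first step creates a RunnableParallel object with two entries. The first entry, context, will hold the document results fetched by the retriever. The second entry, question, will contain the user's original question. To pass the question through, we use RunnablePassthrough to copy this entry.
- The dictionary from the step above is fed to the prompt component. It then takes the user input (question) and the retrieved documents (context) to construct a prompt and output a PromptValue.
- The model component takes the generated prompt and passes it to the OpenAI model for evaluation. The output generated by the model is a ChatMessage object.
- Finally, the output_parser component takes in the ChatMessage and transforms it into a Python string, which is returned from the invoke method.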
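The first step of this flow, fanning one input out into a dictionary of results, can be sketched as follows. The functions run_parallel, fake_retriever, and passthrough are hypothetical stand-ins; the retriever here is a simple keyword match and the branches run one after another rather than concurrently.

```python
corpus = ["harrison worked at kensho", "bears like to eat honey"]


def fake_retriever(question: str) -> list[str]:
    """Keyword-overlap stand-in for a real retriever."""
    words = set(question.lower().rstrip("?").split())
    return [doc for doc in corpus if words & set(doc.split())]


def passthrough(value):
    """Return the input unchanged, like a pass-through step."""
    return value


def run_parallel(steps: dict, value) -> dict:
    """Feed the same input to every step and collect the results by key."""
    return {key: step(value) for key, step in steps.items()}


prompt_inputs = run_parallel(
    {"context": fake_retriever, "question": passthrough},
    "where did harrison work?",
)
print(prompt_inputs)
```

The resulting dictionary has exactly the two keys, context and question, that the prompt template expects.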
3.2、Runnable interface
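To make it as easy as possible to create custom chains, LangChain implements a "Runnable" protocol. Many LangChain components implement the Runnable protocol, including chat models, LLMs, output parsers, retrievers, prompt templates, and more. There are also several useful primitives for working with runnables.
This is a standard interface that makes it easy to define custom chains and invoke them in a standard way. The standard interface includes methods such as invoke, batch, and stream, along with their async counterparts ainvoke, abatch, and astream.
Input and output types vary by component: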
Component | Input Type | Output Type |
---|---|---|
Prompt | Dictionary | PromptValue |
ChatModel | Single string, list of chat messages or a PromptValue | ChatMessage |
LLM | Single string, list of chat messages or a PromptValue | String |
OutputParser | The output of an LLM or ChatModel | Depends on the parser |
Retriever | Single string | List of Documents |
Tool | Single string or dictionary, depending on the tool | Depends on the tool |
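As a rough mental model of a standard calling interface, the sketch below gives one wrapped function three entry points: a single call, a call over a list of inputs, and an incremental call that yields pieces of the output. ToyRunnable is a hypothetical class written for this note; it only mirrors the method names invoke, batch, and stream and does not reproduce LangChain's behaviour.

```python
from typing import Callable, Iterator


class ToyRunnable:
    """Wraps a str -> str function behind invoke / batch / stream."""

    def __init__(self, func: Callable[[str], str]):
        self.func = func

    def invoke(self, value: str) -> str:
        """Run on a single input."""
        return self.func(value)

    def batch(self, values: list[str]) -> list[str]:
        """Run on a list of inputs, preserving order."""
        return [self.invoke(v) for v in values]

    def stream(self, value: str) -> Iterator[str]:
        """Yield the output piece by piece (here: word by word)."""
        for word in self.invoke(value).split():
            yield word


shout = ToyRunnable(str.upper)
print(shout.invoke("hi there"))
print(shout.batch(["a", "b"]))
print(list(shout.stream("hi there")))
```

Because every component exposes the same entry points, calling code does not need to know whether it holds a prompt, a model, or a whole chain.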
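4、Modules
4.1、prompt
A prompt for a language model is a set of instructions or input provided by a user to guide the model's response, helping it understand the context and generate relevant and coherent language-based output, such as answering questions, completing sentences, or engaging in a conversation.
Prompt templates are predefined recipes for generating prompts for language models.
A template may include instructions, few-shot examples, and specific context and questions appropriate for a given task.
LangChain provides tooling to create and work with prompt templates.
LangChain strives to create model-agnostic templates so that existing templates can easily be reused across different language models.
Typically, language models expect the prompt to be either a string or a list of chat messages.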
4.1.1、PromptTemplate
1、Use PromptTemplate to create a template for a string prompt.
from langchain_core.prompts import PromptTemplate
prompt_template = PromptTemplate.from_template(
    "Tell me a {adjective} joke about {content}."
)
prompt_template.format(adjective="funny", content="chickens")
'Tell me a funny joke about chickens.'
2、The template supports any number of variables, including no variables:
from langchain_core.prompts import PromptTemplate
prompt_template = PromptTemplate.from_template("Tell me a joke")
prompt_template.format()
'Tell me a joke'
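The templates above use the same {name} placeholder style as Python's built-in string formatting, so the idea can be illustrated with the standard library alone. The helper input_variables is a hypothetical name; it lists the placeholders a template expects, which is a sketch of the concept rather than of how PromptTemplate is implemented.

```python
from string import Formatter


def input_variables(template: str) -> list[str]:
    """Return the placeholder names found in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


template = "Tell me a {adjective} joke about {content}."
names = input_variables(template)
text = template.format(adjective="funny", content="chickens")

print(names)  # ['adjective', 'content']
print(text)   # Tell me a funny joke about chickens.
print(input_variables("Tell me a joke"))  # []
```

A template with no placeholders is just a constant string, which matches the zero-variable case shown above.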
4.1.2、ChatPromptTemplate
Each chat message is associated with content and an additional parameter called role. For example, in the OpenAI API a chat message can be associated with an AI assistant, a human, or a system role.
Create a chat prompt template like this:
from langchain_core.prompts import ChatPromptTemplate
chat_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful AI bot. Your name is {name}."),
        ("human", "Hello, how are you doing?"),
        ("ai", "I'm doing well, thanks!"),
        ("human", "{user_input}"),
    ]
)
messages = chat_template.format_messages(name="Bob", user_input="What is your name?")
Piping these formatted messages into LangChain's ChatOpenAI chat model class is roughly equivalent to using the OpenAI client directly:
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful AI bot. Your name is Bob."},
        {"role": "user", "content": "Hello, how are you doing?"},
        {"role": "assistant", "content": "I'm doing well, thanks!"},
        {"role": "user", "content": "What is your name?"},
    ],
)
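The correspondence between the two snippets comes down to filling in the template variables and renaming the roles. A minimal sketch of that mapping, assuming the role names shown in the two examples above (ROLE_MAP and to_openai_messages are hypothetical names, not library functions):

```python
# Role names used in the template tuples -> role names used in the client payload.
ROLE_MAP = {"system": "system", "human": "user", "ai": "assistant"}


def to_openai_messages(pairs, **variables):
    """Fill in template variables and convert (role, text) pairs to role/content dicts."""
    return [
        {"role": ROLE_MAP[role], "content": content.format(**variables)}
        for role, content in pairs
    ]


pairs = [
    ("system", "You are a helpful AI bot. Your name is {name}."),
    ("human", "Hello, how are you doing?"),
    ("ai", "I'm doing well, thanks!"),
    ("human", "{user_input}"),
]
payload = to_openai_messages(pairs, name="Bob", user_input="What is your name?")
for message in payload:
    print(message)
```

The printed dictionaries match the messages list passed to the client in the example above.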
You can also pass in instances of MessagePromptTemplate or BaseMessage.
from langchain_core.messages import SystemMessage
from langchain_core.prompts import HumanMessagePromptTemplate
chat_template = ChatPromptTemplate.from_messages(
    [
        SystemMessage(
            content=(
                "You are a helpful assistant that re-writes the user's text to "
                "sound more upbeat."
            )
        ),
        HumanMessagePromptTemplate.from_template("{text}"),
    ]
)
messages = chat_template.format_messages(text="I don't like eating tasty things")
print(messages)
[SystemMessage(content="You are a helpful assistant that re-writes the user's text to sound more upbeat."), HumanMessage(content="I don't like eating tasty things")]
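This gives you a lot of flexibility in how you construct your chat prompts. LangChain provides different types of MessagePromptTemplate. The most commonly used are AIMessagePromptTemplate, SystemMessagePromptTemplate, and HumanMessagePromptTemplate, which create an AI message, a system message, and a human message respectively.
4.1.3、ChatMessagePromptTemplate
In cases where the chat model supports taking chat messages with an arbitrary role, you can use ChatMessagePromptTemplate, which lets the user specify the role name.
from langchain_core.prompts import ChatMessagePromptTemplate
prompt = "May the {subject} be with you"
chat_message_prompt = ChatMessagePromptTemplate.from_template(
    role="Jedi", template=prompt
)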
chat_message_prompt.format(subject="force")
ChatMessage(content='May the force be with you', role='Jedi')
4.1.4、MessagesPlaceholder
LangChain also provides MessagesPlaceholder, which gives you full control over which messages are rendered during formatting. This is useful when you are unsure what role a message prompt template should use, or when you want to insert a list of messages during formatting.
from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
)
human_prompt = "Summarize our conversation so far in {word_count} words."
human_message_template = HumanMessagePromptTemplate.from_template(human_prompt)
chat_prompt = ChatPromptTemplate.from_messages(
    [MessagesPlaceholder(variable_name="conversation"), human_message_template]
)
from langchain_core.messages import AIMessage, HumanMessage
human_message = HumanMessage(content="What is the best way to learn programming?")
ai_message = AIMessage(
    content="""\
1. Choose a programming language: Decide on a programming language that you want to learn.

2. Start with the basics: Familiarize yourself with the basic programming concepts such as variables, data types and control structures.

3. Practice, practice, practice: The best way to learn programming is through hands-on experience\
"""
)
chat_prompt.format_prompt(
    conversation=[human_message, ai_message], word_count="10"
).to_messages()
[HumanMessage(content='What is the best way to learn programming?'),
AIMessage(content='1. Choose a programming language: Decide on a programming language that you want to learn.\n\n2. Start with the basics: Familiarize yourself with the basic programming concepts such as variables, data types and control structures.\n\n3. Practice, practice, practice: The best way to learn programming is through hands-on experience'),
HumanMessage(content='Summarize our conversation so far in 10 words.')]
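What the placeholder does during formatting can be sketched in plain Python: templated messages are filled in, while a placeholder is replaced by the whole list of messages supplied under its variable name. Placeholder and render are hypothetical names for this illustration, and messages are modelled as simple (role, text) tuples.

```python
from dataclasses import dataclass


@dataclass
class Placeholder:
    """Marks a slot to be replaced by a list of already-built messages."""
    variable_name: str


def render(parts, **variables):
    """Expand placeholders and fill in templated (role, text) messages."""
    messages = []
    for part in parts:
        if isinstance(part, Placeholder):
            # Splice in the supplied message list unchanged.
            messages.extend(variables[part.variable_name])
        else:
            role, text = part
            messages.append((role, text.format(**variables)))
    return messages


parts = [
    Placeholder("conversation"),
    ("human", "Summarize our conversation so far in {word_count} words."),
]
history = [
    ("human", "What is the best way to learn programming?"),
    ("ai", "Practice."),
]
rendered = render(parts, conversation=history, word_count="10")
for message in rendered:
    print(message)
```

The history messages keep their own roles, which is why a placeholder is handy when the roles are not known when the template is written.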
4.2、Chains
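A chain is a sequence of calls, whether to an LLM, a tool, or a data preprocessing step. The primary supported way to build one is with LCEL. LCEL is great for constructing your own chains, but it is also convenient to have chains that work off the shelf. LangChain supports two kinds of off-the-shelf chains:
- Chains that are built with LCEL. In this case, LangChain offers a higher-level constructor method; however, all that is being done under the hood is constructing a chain with LCEL.
- [Legacy] Chains constructed by subclassing the legacy Chain class. These chains do not use LCEL under the hood and are standalone classes.
4.2.1、LCEL chains
Below is a table of the LCEL chain constructors.
- Chain Constructor: The constructor function for this chain. These are all methods that return LCEL Runnables.
- Function Calling: Whether this chain requires OpenAI function calling.
- Other Tools: Other tools (if any) used in this chain.
- When to Use: Our commentary on when to use this chain.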
Chain Constructor | Function Calling | Other Tools | When to Use |
---|---|---|---|
create_stuff_documents_chain | | | This chain takes a list of documents and formats them all into a prompt, then passes that prompt to an LLM. It passes ALL documents, so you should make sure it fits within the context window of the LLM you are using. |
create_openai_fn_runnable | ✅ | | If you want to use OpenAI function calling to OPTIONALLY structure an output response. You may pass in multiple functions for it to call, but it does not have to call any of them. |
create_structured_output_runnable | ✅ | | If you want to use OpenAI function calling to FORCE the LLM to respond with a certain function. You may only pass in one function, and the chain will ALWAYS return this response. |
load_query_constructor_runnable | | | Can be used to generate queries. You must specify a list of allowed operations and then return a runnable that converts a natural language query into those allowed operations. |
create_sql_query_chain | | SQL Database | If you want to construct a query for a SQL database from natural language. |
create_history_aware_retriever | | Retriever | This chain takes in conversation history and then uses that to generate a search query which is passed to the underlying retriever. |
create_retrieval_chain | | Retriever | This chain takes in a user inquiry, which is then passed to the retriever to fetch relevant documents. Those documents (and original inputs) are then passed to an LLM to generate a response. |
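The "stuff" strategy in the first row is simple enough to sketch directly: join every document into one context string and place it in the prompt. The function stuff_documents and its max_chars guard are hypothetical; the character limit is only a crude stand-in for checking that the prompt fits the model's context window, which a real application would measure in tokens.

```python
def stuff_documents(docs: list[str], template: str, max_chars: int) -> str:
    """Put ALL documents into a single prompt; fail if the result is too long."""
    context = "\n\n".join(docs)
    prompt = template.format(context=context)
    if len(prompt) > max_chars:
        raise ValueError(
            f"prompt has {len(prompt)} characters, more than the allowed {max_chars}"
        )
    return prompt


stuffed = stuff_documents(
    ["doc one", "doc two"],
    "Answer based on the below context:\n\n{context}",
    max_chars=200,
)
print(stuffed)
```

Because nothing is filtered or summarized, the caller is responsible for limiting how many documents are passed in.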
4.3、Agent
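4.3.1、Defining tools
1、We first need to create the tools we want to use. We will use two tools: the Tavily search tool and a retriever over an index that we create ourselves.
LangChain has a built-in tool that makes it easy to use the Tavily search engine as a tool. Note that this requires an API key. Tavily has a free tier, but if you don't have a key or don't want to create one, you can skip this step.
Once you have created your API key, export it as: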
export TAVILY_API_KEY="..."
from langchain_community.tools.tavily_search import TavilySearchResults
search = TavilySearchResults()
search.invoke("what is the weather in SF")
Output:
[{'url': 'https://www.weatherapi.com/',
'content': "{'location': {'name': 'San Francisco', 'region': 'California', 'country': 'United States of America', 'lat': 37.78, 'lon': -122.42, 'tz_id': 'America/Los_Angeles', 'localtime_epoch': 1712847697, 'localtime': '2024-04-11 8:01'}, 'current': {'last_updated_epoch': 1712847600, 'last_updated': '2024-04-11 08:00', 'temp_c': 11.1, 'temp_f': 52.0, 'is_day': 1, 'condition': {'text': 'Partly cloudy', 'icon': '//cdn.weatherapi.com/weather/64x64/day/116.png', 'code': 1003}, 'wind_mph': 2.2, 'wind_kph': 3.6, 'wind_degree': 10, 'wind_dir': 'N', 'pressure_mb': 1015.0, 'pressure_in': 29.98, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 97, 'cloud': 25, 'feelslike_c': 11.5, 'feelslike_f': 52.6, 'vis_km': 14.0, 'vis_miles': 8.0, 'uv': 4.0, 'gust_mph': 2.8, 'gust_kph': 4.4}}"},
{'url': 'https://www.yahoo.com/news/april-11-2024-san-francisco-122026435.html',
'content': "2024 NBA Mock Draft 6.0: Projections for every pick following March Madness With the NCAA tournament behind us, here's an updated look at Yahoo Sports' first- and second-round projections for the ..."},
{'url': 'https://world-weather.info/forecast/usa/san_francisco/april-2024/',
'content': 'Extended weather forecast in San Francisco. Hourly Week 10 days 14 days 30 days Year. Detailed ⚡ San Francisco Weather Forecast for April 2024 - day/night 🌡️ temperatures, precipitations - World-Weather.info.'},
{'url': 'https://www.wunderground.com/hourly/us/ca/san-francisco/94144/date/date/2024-4-11',
'content': 'Personal Weather Station. Inner Richmond (KCASANFR1685) Location: San Francisco, CA. Elevation: 207ft. Nearby Weather Stations. Hourly Forecast for Today, Thursday 04/11Hourly for Today, Thu 04/11 ...'},
{'url': 'https://weatherspark.com/h/y/557/2024/Historical-Weather-during-2024-in-San-Francisco-California-United-States',
'content': 'San Francisco Temperature History 2024\nHourly Temperature in 2024 in San Francisco\nCompare San Francisco to another city:\nCloud Cover in 2024 in San Francisco\nDaily Precipitation in 2024 in San Francisco\nObserved Weather in 2024 in San Francisco\nHours of Daylight and Twilight in 2024 in San Francisco\nSunrise & Sunset with Twilight and Daylight Saving Time in 2024 in San Francisco\nSolar Elevation and Azimuth in 2024 in San Francisco\nMoon Rise, Set & Phases in 2024 in San Francisco\nHumidity Comfort Levels in 2024 in San Francisco\nWind Speed in 2024 in San Francisco\nHourly Wind Speed in 2024 in San Francisco\nHourly Wind Direction in 2024 in San Francisco\nAtmospheric Pressure in 2024 in San Francisco\nData Sources\n See all nearby weather stations\nLatest Report — 3:56 PM\nWed, Jan 24, 2024\xa0\xa0\xa0\xa013 min ago\xa0\xa0\xa0\xa0UTC 23:56\nCall Sign KSFO\nTemp.\n60.1°F\nPrecipitation\nNo Report\nWind\n6.9 mph\nCloud Cover\nMostly Cloudy\n1,800 ft\nRaw: KSFO 242356Z 18006G19KT 10SM FEW015 BKN018 BKN039 16/12 A3004 RMK AO2 SLP171 T01560122 10156 20122 55001\n While having the tremendous advantages of temporal and spatial completeness, these reconstructions: (1) are based on computer models that may have model-based errors, (2) are coarsely sampled on a 50 km grid and are therefore unable to reconstruct the local variations of many microclimates, and (3) have particular difficulty with the weather in some coastal areas, especially small islands.\n We further caution that our travel scores are only as good as the data that underpin them, that weather conditions at any given location and time are unpredictable and variable, and that the definition of the scores reflects a particular set of preferences that may not agree with those of any particular reader.\n 2024 Weather History in San Francisco California, United States\nThe data for this report comes from the San Francisco International Airport.'}]
2、We will also create a retriever over some data of our own.
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
chunk_size=1000, chunk_overlap=200
).split_documents(docs)
vector = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vector.as_retriever()
retriever.invoke("how to upload a dataset")[0]
Output:
Document(page_content='import Clientfrom langsmith.evaluation import evaluateclient = Client()# Define dataset: these are your test casesdataset_name = "Sample Dataset"dataset = client.create_dataset(dataset_name, description="A sample dataset in LangSmith.")client.create_examples( inputs=[ {"postfix": "to LangSmith"}, {"postfix": "to Evaluations in LangSmith"}, ], outputs=[ {"output": "Welcome to LangSmith"}, {"output": "Welcome to Evaluations in LangSmith"}, ], dataset_id=dataset.id,)# Define your evaluatordef exact_match(run, example): return {"score": run.outputs["output"] == example.outputs["output"]}experiment_results = evaluate( lambda input: "Welcome " + input[\'postfix\'], # Your AI system goes here data=dataset_name, # The data to predict and grade over evaluators=[exact_match], # The evaluators to score the results experiment_prefix="sample-experiment", # The name of the experiment metadata={ "version": "1.0.0", "revision_id":', metadata={'source': 'https://docs.smith.langchain.com/overview', 'title': 'Getting started with LangSmith | 🦜️🛠️ LangSmith', 'description': 'Introduction', 'language': 'en'})
Now that we have populated the index that we will retrieve from, we can easily turn it into a tool (the format an agent needs in order to use it properly):
from langchain.tools.retriever import create_retriever_tool
retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
)
3、tools
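Now that we have created both, we can create the list of tools that we will use downstream.
tools = [search, retriever_tool]
4、agent
Next, we choose the prompt we want to use to guide the agent.
If you want to see the contents of this prompt and have access to LangSmith, you can go to: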
https://smith.langchain.com/hub/hwchase17/openai-functions-agent
from langchain import hub
# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
prompt.messages
[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant')),
MessagesPlaceholder(variable_name='chat_history', optional=True),
HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')),
MessagesPlaceholder(variable_name='agent_scratchpad')]
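Now we can initialize the agent with the LLM, the prompt, and the tools.
from langchain.agents import create_tool_calling_agent
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")  # assumed: same model as in section 2.1.4
agent = create_tool_calling_agent(llm, tools, prompt)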
Finally, we combine the agent (the brains) with the tools inside the AgentExecutor, which will repeatedly call the agent and execute tools.
from langchain.agents import AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
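The "repeatedly call the agent and execute tools" loop can be sketched in plain Python. Everything here is hypothetical: fake_agent is a hard-coded policy standing in for the LLM, the decision dictionaries are an invented format, and run_agent only illustrates the control flow, not AgentExecutor's actual implementation.

```python
def fake_agent(user_input: str, scratchpad: list) -> dict:
    """Hypothetical policy: call the search tool once, then give a final answer."""
    if not scratchpad:
        return {"tool": "search", "tool_input": user_input}
    last_observation = scratchpad[-1][1]
    return {"final": f"Answer based on: {last_observation}"}


def run_agent(agent, tools: dict, user_input: str, max_iterations: int = 5) -> str:
    """Ask the agent what to do, run the chosen tool, and repeat until it finishes."""
    scratchpad = []  # (decision, observation) pairs from earlier steps
    for _ in range(max_iterations):
        decision = agent(user_input, scratchpad)
        if "final" in decision:
            return decision["final"]
        observation = tools[decision["tool"]](decision["tool_input"])
        scratchpad.append((decision, observation))
    raise RuntimeError("agent did not finish within max_iterations")


toy_tools = {"search": lambda query: f"results for '{query}'"}
answer = run_agent(fake_agent, toy_tools, "weather in SF")
print(answer)  # Answer based on: results for 'weather in SF'
```

The scratchpad plays the role that the agent_scratchpad placeholder plays in the prompt: it carries earlier tool calls and their results into the next decision.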
4.3.2、Run the agent
agent_executor.invoke({"input": "hi!"})
> Entering new AgentExecutor chain...
Hello! How can I assist you today?
> Finished chain.
agent_executor.invoke({"input": "how can langsmith help with testing?"})
> Entering new AgentExecutor chain...
Invoking: `langsmith_search` with `{'query': 'how can LangSmith help with testing'}`
Getting started with LangSmith | 🦜️🛠️ LangSmith
Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookQuick StartOn this pageGetting started with LangSmithIntroductionLangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!Install LangSmithWe offer Python and Typescript SDKs for all your LangSmith needs.PythonTypeScriptpip install -U langsmithyarn add langchain langsmithCreate an API keyTo create an API key head to the setting pages. Then click Create API Key.Setup your environmentShellexport LANGCHAIN_TRACING_V2=trueexport LANGCHAIN_API_KEY=<your-api-key># The below examples use the OpenAI API, though it's not necessary in generalexport OPENAI_API_KEY=<your-openai-api-key>Log your first traceWe provide multiple ways to log traces
Learn about the workflows LangSmith supports at each stage of the LLM application lifecycle.Pricing: Learn about the pricing model for LangSmith.Self-Hosting: Learn about self-hosting options for LangSmith.Proxy: Learn about the proxy capabilities of LangSmith.Tracing: Learn about the tracing capabilities of LangSmith.Evaluation: Learn about the evaluation capabilities of LangSmith.Prompt Hub Learn about the Prompt Hub, a prompt management tool built into LangSmith.Additional ResourcesLangSmith Cookbook: A collection of tutorials and end-to-end walkthroughs using LangSmith.LangChain Python: Docs for the Python LangChain library.LangChain Python API Reference: documentation to review the core APIs of LangChain.LangChain JS: Docs for the TypeScript LangChain libraryDiscord: Join us on our Discord to discuss all things LangChain!FAQHow do I migrate projects between organizations?Currently we do not support project migration betwen organizations. While you can manually imitate this by
team deals with sensitive data that cannot be logged. How can I ensure that only my team can access it?If you are interested in a private deployment of LangSmith or if you need to self-host, please reach out to us at sales@langchain.dev. Self-hosting LangSmith requires an annual enterprise license that also comes with support and formalized access to the LangChain team.Was this page helpful?NextUser GuideIntroductionInstall LangSmithCreate an API keySetup your environmentLog your first traceCreate your first evaluationNext StepsAdditional ResourcesFAQHow do I migrate projects between organizations?Why aren't my runs aren't showing up in my project?My team deals with sensitive data that cannot be logged. How can I ensure that only my team can access it?CommunityDiscordTwitterGitHubDocs CodeLangSmith SDKPythonJS/TSMoreHomepageBlogLangChain Python DocsLangChain JS/TS DocsCopyright © 2024 LangChain, Inc.
LangSmith is a platform for building production-grade LLM applications that can help with testing in the following ways:
1. **Tracing**: LangSmith provides tracing capabilities that allow you to closely monitor and evaluate your application during testing. You can log traces to track the behavior of your application and identify any issues.
2. **Evaluation**: LangSmith offers evaluation capabilities that enable you to assess the performance of your application during testing. This helps you ensure that your application functions as expected and meets the required standards.
3. **Production Monitoring & Automations**: LangSmith allows you to monitor your application in production and automate certain processes, which can be beneficial for testing different scenarios and ensuring the stability of your application.
4. **Prompt Hub**: LangSmith includes a Prompt Hub, a prompt management tool that can streamline the testing process by providing a centralized location for managing prompts and inputs for your application.
Overall, LangSmith can assist with testing by providing tools for monitoring, evaluating, and automating processes to ensure the reliability and performance of your application during testing phases.
> Finished chain.
{'input': 'how can langsmith help with testing?',
'output': 'LangSmith is a platform for building production-grade LLM applications that can help with testing in the following ways:\n\n1. **Tracing**: LangSmith provides tracing capabilities that allow you to closely monitor and evaluate your application during testing. You can log traces to track the behavior of your application and identify any issues.\n\n2. **Evaluation**: LangSmith offers evaluation capabilities that enable you to assess the performance of your application during testing. This helps you ensure that your application functions as expected and meets the required standards.\n\n3. **Production Monitoring & Automations**: LangSmith allows you to monitor your application in production and automate certain processes, which can be beneficial for testing different scenarios and ensuring the stability of your application.\n\n4. **Prompt Hub**: LangSmith includes a Prompt Hub, a prompt management tool that can streamline the testing process by providing a centralized location for managing prompts and inputs for your application.\n\nOverall, LangSmith can assist with testing by providing tools for monitoring, evaluating, and automating processes to ensure the reliability and performance of your application during testing phases.'}
Asking about the weather:
agent_executor.invoke({"input": "whats the weather in sf?"})
> Entering new AgentExecutor chain...
Invoking: `tavily_search_results_json` with `{'query': 'weather in San Francisco'}`
[{'url': 'https://www.weatherapi.com/', 'content': "{'location': {'name': 'San Francisco', 'region': 'California', 'country': 'United States of America', 'lat': 37.78, 'lon': -122.42, 'tz_id': 'America/Los_Angeles', 'localtime_epoch': 1712847697, 'localtime': '2024-04-11 8:01'}, 'current': {'last_updated_epoch': 1712847600, 'last_updated': '2024-04-11 08:00', 'temp_c': 11.1, 'temp_f': 52.0, 'is_day': 1, 'condition': {'text': 'Partly cloudy', 'icon': '//cdn.weatherapi.com/weather/64x64/day/116.png', 'code': 1003}, 'wind_mph': 2.2, 'wind_kph': 3.6, 'wind_degree': 10, 'wind_dir': 'N', 'pressure_mb': 1015.0, 'pressure_in': 29.98, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 97, 'cloud': 25, 'feelslike_c': 11.5, 'feelslike_f': 52.6, 'vis_km': 14.0, 'vis_miles': 8.0, 'uv': 4.0, 'gust_mph': 2.8, 'gust_kph': 4.4}}"}, {'url': 'https://www.yahoo.com/news/april-11-2024-san-francisco-122026435.html', 'content': "2024 NBA Mock Draft 6.0: Projections for every pick following March Madness With the NCAA tournament behind us, here's an updated look at Yahoo Sports' first- and second-round projections for the ..."}, {'url': 'https://www.weathertab.com/en/c/e/04/united-states/california/san-francisco/', 'content': 'Explore comprehensive April 2024 weather forecasts for San Francisco, including daily high and low temperatures, precipitation risks, and monthly temperature trends. Featuring detailed day-by-day forecasts, dynamic graphs of daily rain probabilities, and temperature trends to help you plan ahead. ... 
11 65°F 49°F 18°C 9°C 29% 12 64°F 49°F ...'}, {'url': 'https://weatherspark.com/h/y/557/2024/Historical-Weather-during-2024-in-San-Francisco-California-United-States', 'content': 'San Francisco Temperature History 2024\nHourly Temperature in 2024 in San Francisco\nCompare San Francisco to another city:\nCloud Cover in 2024 in San Francisco\nDaily Precipitation in 2024 in San Francisco\nObserved Weather in 2024 in San Francisco\nHours of Daylight and Twilight in 2024 in San Francisco\nSunrise & Sunset with Twilight and Daylight Saving Time in 2024 in San Francisco\nSolar Elevation and Azimuth in 2024 in San Francisco\nMoon Rise, Set & Phases in 2024 in San Francisco\nHumidity Comfort Levels in 2024 in San Francisco\nWind Speed in 2024 in San Francisco\nHourly Wind Speed in 2024 in San Francisco\nHourly Wind Direction in 2024 in San Francisco\nAtmospheric Pressure in 2024 in San Francisco\nData Sources\n See all nearby weather stations\nLatest Report — 3:56 PM\nWed, Jan 24, 2024\xa0\xa0\xa0\xa013 min ago\xa0\xa0\xa0\xa0UTC 23:56\nCall Sign KSFO\nTemp.\n60.1°F\nPrecipitation\nNo Report\nWind\n6.9 mph\nCloud Cover\nMostly Cloudy\n1,800 ft\nRaw: KSFO 242356Z 18006G19KT 10SM FEW015 BKN018 BKN039 16/12 A3004 RMK AO2 SLP171 T01560122 10156 20122 55001\n While having the tremendous advantages of temporal and spatial completeness, these reconstructions: (1) are based on computer models that may have model-based errors, (2) are coarsely sampled on a 50 km grid and are therefore unable to reconstruct the local variations of many microclimates, and (3) have particular difficulty with the weather in some coastal areas, especially small islands.\n We further caution that our travel scores are only as good as the data that underpin them, that weather conditions at any given location and time are unpredictable and variable, and that the definition of the scores reflects a particular set of preferences that may not agree with those of any particular reader.\n 2024 Weather History in 
San Francisco California, United States\nThe data for this report comes from the San Francisco International Airport.'}, {'url': 'https://www.msn.com/en-us/weather/topstories/april-11-2024-san-francisco-bay-area-weather-forecast/vi-BB1lrXDb', 'content': 'April 11, 2024 San Francisco Bay Area weather forecast. Posted: April 11, 2024 | Last updated: April 11, 2024 ...'}]
The current weather in San Francisco is partly cloudy with a temperature of 52.0°F (11.1°C). The wind speed is 3.6 kph coming from the north, and the humidity is at 97%.
> Finished chain.
{'input': 'whats the weather in sf?',
'output': 'The current weather in San Francisco is partly cloudy with a temperature of 52.0°F (11.1°C). The wind speed is 3.6 kph coming from the north, and the humidity is at 97%.'}
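As both runs above show, the call returns a plain dictionary with an 'input' key and an 'output' key, and the final answer sits under 'output'. The snippet below is a minimal sketch of reading that answer. The dictionary is copied from the run above so the example works without an API key, and the helper name extract_answer is hypothetical and not part of LangChain.

```python
# Result dictionary copied from the weather run above. In a live session this
# would be the value returned by agent_executor.invoke({"input": ...}).
result = {
    "input": "whats the weather in sf?",
    "output": (
        "The current weather in San Francisco is partly cloudy with a "
        "temperature of 52.0°F (11.1°C). The wind speed is 3.6 kph coming "
        "from the north, and the humidity is at 97%."
    ),
}


def extract_answer(result: dict) -> str:
    """Hypothetical helper: return the final answer, or "" if it is missing."""
    return result.get("output", "")


answer = extract_answer(result)
print(answer)
```

Using .get with a default rather than indexing directly is only a defensive choice for this sketch, so that a result without an 'output' key yields an empty string instead of a KeyError.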