LangServe
https://python.langchain.com/v0.1/docs/langserve/
LangServe helps developers deploy LangChain runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation.
In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
Features
- Input and Output schemas automatically inferred from your LangChain object, and enforced on every API call, with rich error messages
- API docs page with JSONSchema and Swagger (insert example link)
- Efficient /invoke, /batch and /stream endpoints with support for many concurrent requests on a single server
- /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
- New as of 0.0.40: supports /stream_events to make it easier to stream without needing to parse the output of /stream_log.
- Playground page at /playground/ with streaming output and intermediate steps
- Built-in (optional) tracing to LangSmith, just add your API key (see Instructions)
- All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio.
- Use the client SDK to call a LangServe server as if it were a Runnable running locally (or call the HTTP API directly); a sketch follows this list
- LangServe Hub
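To illustrate the client SDK, here is a minimal sketch. It assumes a chain is already being served at http://localhost:8000/my_chain/ (both the host and the /my_chain/ path are placeholders for whatever was passed to add_routes), and that the chain accepts a {"topic": ...} input:

```python
from langserve import RemoteRunnable

# The URL and input shape are placeholders: point this at a running
# LangServe deployment and use the input schema of your own chain.
chain = RemoteRunnable("http://localhost:8000/my_chain/")

result = chain.invoke({"topic": "cats"})                       # single request
results = chain.batch([{"topic": "cats"}, {"topic": "dogs"}])  # parallel requests
for chunk in chain.stream({"topic": "fish"}):                  # streamed via SSE
    print(chunk, end="", flush=True)
```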
https://python.langchain.com/docs/langserve/#custom-user-types
Inherit from CustomUserType when you want the request payload de-serialized into a pydantic model rather than the equivalent dict representation:

```python
from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes
from langserve.schema import CustomUserType

app = FastAPI()


class Foo(CustomUserType):
    bar: int


def func(foo: Foo) -> int:
    """Sample function that expects a Foo type which is a pydantic model"""
    assert isinstance(foo, Foo)
    return foo.bar


# Note that the input and output type are automatically inferred!
# You do not need to specify them.
# runnable = RunnableLambda(func).with_types( # <-- Not needed in this case
#     input_type=Foo,
#     output_type=int,
# )

add_routes(app, RunnableLambda(func), path="/foo")
```
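Once the app above is running (for example with uvicorn), the /foo/invoke endpoint can be exercised over plain HTTP. A minimal sketch, assuming the server listens on localhost:8000:

```python
import requests

# LangServe's /invoke endpoints wrap the chain input under "input"
# and return the result under "output" (alongside run metadata).
response = requests.post(
    "http://localhost:8000/foo/invoke",
    json={"input": {"bar": 42}},
)
response.raise_for_status()
print(response.json()["output"])  # -> 42
```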
langserve_launch_example
https://github.com/langchain-ai/langserve-launch-example/tree/main
To customise this project, edit the following files:

- langserve_launch_example/chain.py contains an example chain, which you can edit to suit your needs.
- langserve_launch_example/server.py contains a FastAPI app that serves that chain using langserve. You can edit this to add more endpoints or customise your server.
- tests/test_chain.py contains tests for the chain. You can edit this to add more tests (a sketch follows this list).
- pyproject.toml contains the project metadata, including the project name, version, and dependencies. You can edit this to add more dependencies or customise your project metadata.
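As a sketch of what a test in tests/test_chain.py might look like, assuming chain.py exposes a runnable named chain (the actual repository may structure this differently, and invoking an LLM-backed chain needs the relevant API key):

```python
from langserve_launch_example.chain import chain  # assumed export name


def test_chain_invoke() -> None:
    # Only asserts that the chain runs end-to-end and returns something;
    # an LLM-backed chain will call out to the model provider here.
    result = chain.invoke("hello")
    assert result is not None
```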
https://blog.langchain.dev/introducing-langserve/
How it works
First we create our chain, here using a conversational retrieval chain, but any other chain would work. This is the my_package/chain.py file.

```python
"""A conversational retrieval chain."""
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

vectorstore = FAISS.from_texts(
    ["cats like fish", "dogs like sticks"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

model = ChatOpenAI()

chain = ConversationalRetrievalChain.from_llm(model, retriever)
```
Then, we pass that chain to add_routes. This is the my_package/server.py file.

```python
#!/usr/bin/env python
"""A server for the chain above."""
from fastapi import FastAPI

from langserve import add_routes

from my_package.chain import chain

app = FastAPI(title="Retrieval App")

add_routes(app, chain)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)
```
That's it! This gets you a scalable Python web server with

- Input and Output schemas automatically inferred from your chain, and enforced on every API call, with rich error messages
- /docs endpoint serves API docs with JSONSchema and Swagger (insert example link)
- /invoke endpoint that accepts JSON input and returns JSON output from your chain, with support for many concurrent requests in the same server
- /batch endpoint that produces output for several inputs in parallel, batching calls to LLMs where possible
- /stream endpoint that sends output chunks as they become available, using SSE (same as OpenAI Streaming API)
- /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
- Built-in (optional) tracing to LangSmith, just add your API key as an environment variable
- Support for hosting multiple chains in the same server under separate paths (a sketch follows this list)
- All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio.
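For the multiple-chains point above, a minimal sketch: each call to add_routes takes a path argument, and every chain gets its own /invoke, /batch, /stream, and playground under that prefix. The chain names and module here are hypothetical:

```python
from fastapi import FastAPI

from langserve import add_routes

# `chat_chain` and `summary_chain` are hypothetical runnables; import
# or construct your own. Each is mounted under its own URL prefix.
from my_package.chains import chat_chain, summary_chain

app = FastAPI(title="Multi-chain App")
add_routes(app, chat_chain, path="/chat")
add_routes(app, summary_chain, path="/summary")
```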
https://www.nowcoder.com/discuss/675470604253900800
1 Overview
LangServe provides an end-to-end solution for deploying LLM applications as production services. It connects LLM application chains to common Python web infrastructure (such as FastAPI, Pydantic, uvloop, and asyncio) to produce a RESTful API. LangServe reduces developers' operations and deployment workload so they can focus on building LLM applications. It not only simplifies the transition from development to production, but also helps ensure the service's performance and security. It provides components such as a model manager, request handler, inference engine, result cache, monitoring and logging, and an API gateway. LangServe's goal is to let developers easily integrate, deploy, and manage AI models, taking an LLM application seamlessly from prototype to product.
Repository: https://github.com/langchain-ai/langserve
2 Features
Multi-model support
LangServe supports deploying multiple kinds of AI models, including text generation, image recognition, and speech processing, letting developers switch between them as needed.
Efficient inference caching
To improve response times and save compute, LangServe includes an efficient result-caching system that can intelligently store and manage frequently requested data.
Secure access control
Through role and policy management, LangServe provides a flexible access-control mechanism that safeguards service security and data privacy.
Real-time monitoring and logging
A built-in monitoring system tracks the service's runtime state in real time, and detailed logs help with debugging and analysis.
Clean, easy-to-use API
LangServe's API is designed to be simple and intuitive, easy to understand and use, which greatly reduces the learning curve for developers.