A brief look at how LitServe starts multiple workers

LitServe is a fast inference API service built as a wrapper around FastAPI. This post only covers how the server is started.

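Usage

By configuring devices and the number of workers per device we control how many workers are launched, and we can also choose the mode in which the API servers are started (multi-threaded or multi-process, via api_server_worker_type).

  • Example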
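import litserve as ls

# SimpleLitAPI is a user-defined ls.LitAPI subclass (its definition is not shown here)

if __name__ == "__main__":
    # Enable the OpenAISpec in LitServer
    api = SimpleLitAPI()
    # Two inference workers for each device
    server = ls.LitServer(api, workers_per_device=2, spec=ls.OpenAISpec())
    server.run(port=8000)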
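
To make the relationship between the worker counts concrete, here is a minimal sketch. It enumerates one inference worker per device slot and applies the same default as the run() method discussed in this post, where num_api_servers falls back to the number of inference workers. The helper plan_workers and the device names are hypothetical. The idea that the worker list is simply devices multiplied by workers_per_device is an assumption, not something taken from the LitServe code quoted here.

```python
from typing import List, Optional, Tuple


def plan_workers(
    devices: List[str],
    workers_per_device: int,
    num_api_servers: Optional[int] = None,
) -> Tuple[List[Tuple[str, int]], int]:
    """Hypothetical helper: list (device, slot) pairs and the API server count."""
    if workers_per_device < 1:
        raise ValueError("workers_per_device must be greater than 0")
    # Assumption: one inference worker per (device, slot) pair.
    workers = [(device, slot) for device in devices for slot in range(workers_per_device)]
    # Same default as in run(): one API server per inference worker.
    if num_api_servers is None:
        num_api_servers = len(workers)
    if num_api_servers < 1:
        raise ValueError("num_api_servers must be greater than 0")
    return workers, num_api_servers


workers, api_servers = plan_workers(["cuda:0", "cuda:1"], workers_per_device=2)
print(len(workers), api_servers)
```

Passing num_api_servers explicitly decouples the number of API servers from the number of inference workers, which is what the num_api_servers argument of run() allows.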

Code walkthrough

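  • Server startup works much like uvicorn's own multi-worker mode. Because LitServe passes the app to uvicorn as an object rather than as an import string, uvicorn's built-in multi-worker option cannot take effect, so LitServe wraps the startup itself with a mechanism similar to uvicorn's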
def run(
    self,
    port: Union[str, int] = 8000,
    num_api_servers: Optional[int] = None,
    log_level: str = "info",
    generate_client_file: bool = True,
    api_server_worker_type: Optional[str] = None,
    **kwargs,
):
    if generate_client_file:
        self.generate_client_file()

    port_msg = f"port must be a value from 1024 to 65535 but got {port}"
    try:
        port = int(port)
    except ValueError:
        raise ValueError(port_msg)

    if not (1024 <= port <= 65535):
        raise ValueError(port_msg)
    # Bind the socket here; the servers started later reuse this socket
    config = uvicorn.Config(app=self.app, host="0.0.0.0", port=port, log_level=log_level, **kwargs)
    sockets = [config.bind_socket()]

    if num_api_servers is None:
        num_api_servers = len(self.workers)

    if num_api_servers < 1:
        raise ValueError("num_api_servers must be greater than 0")

    if sys.platform == "win32":
        print("Windows does not support forking. Using threads api_server_worker_type will be set to 'thread'")
        api_server_worker_type = "thread"
    elif api_server_worker_type is None:
        api_server_worker_type = "process"
    # Launch the inference worker processes according to the configured devices and workers_per_device
    manager, litserve_workers = self.launch_inference_worker(num_api_servers)

    try:
        # Start the uvicorn API servers, one process (or thread) each
        servers = self._start_server(port, num_api_servers, log_level, sockets, api_server_worker_type, **kwargs)
        print(f"Swagger UI is available at http://0.0.0.0:{port}/docs")
        for s in servers:
            s.join()
    finally:
        print("Shutting down LitServe")
        for w in litserve_workers:
            w.terminate()
            w.join()
        manager.shutdown()
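
The "bind once, then hand the same socket to every server" idea in run() can be illustrated with the standard library alone. The sketch below is not LitServe or uvicorn code: bind_once, serve_one and demo are hypothetical names, it uses threads rather than forked processes, and each worker handles a single connection so the demo terminates.

```python
import socket
import threading
from typing import List


def bind_once(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind and listen a single time; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(16)
    return sock


def serve_one(sock: socket.socket, worker_id: int) -> None:
    """Accept one connection on the shared socket and say which worker took it."""
    conn, _ = sock.accept()
    with conn:
        conn.sendall(f"worker-{worker_id}".encode())


def demo(num_workers: int = 2) -> List[str]:
    sock = bind_once()
    port = sock.getsockname()[1]
    # Every worker receives the same already-bound socket object.
    workers = [threading.Thread(target=serve_one, args=(sock, i)) for i in range(num_workers)]
    for w in workers:
        w.start()
    replies = []
    for _ in range(num_workers):
        with socket.create_connection(("127.0.0.1", port)) as client:
            replies.append(client.recv(64).decode())
    for w in workers:
        w.join()
    sock.close()
    return replies


if __name__ == "__main__":
    print(sorted(demo()))
```

Because all workers accept on one listening socket, the port is bound only once, and the operating system decides which worker receives each incoming connection.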

_start_server handling

def _start_server(self, port, num_uvicorn_servers, log_level, sockets, uvicorn_worker_type, **kwargs):
    servers = []
    for response_queue_id in range(num_uvicorn_servers):
        self.app.response_queue_id = response_queue_id
        if self.lit_spec:
            self.lit_spec.response_queue_id = response_queue_id
        app = copy.copy(self.app)

        config = uvicorn.Config(app=app, host="0.0.0.0", port=port, log_level=log_level, **kwargs)
        server = uvicorn.Server(config=config)
        # Create and start each server as a process or a thread, depending on the worker type; every server reuses the pre-bound socket
        if uvicorn_worker_type == "process":
            ctx = mp.get_context("fork")
            w = ctx.Process(target=server.run, args=(sockets,))
        elif uvicorn_worker_type == "thread":
            w = threading.Thread(target=server.run, args=(sockets,))
        else:
            raise ValueError("Invalid value for api_server_worker_type. Must be 'process' or 'thread'")
        w.start()
        servers.append(w)
    return servers  
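
The loop in _start_server sets response_queue_id on the shared app and then takes a shallow copy for each server. A small sketch of what that order of operations gives you is shown below. FakeApp and make_per_server_apps are hypothetical stand-ins, not LitServe classes. The sketch only shows the general behaviour of copy.copy on a plain object.

```python
import copy
from typing import List


class FakeApp:
    """Hypothetical stand-in for the app object; not part of LitServe."""

    def __init__(self) -> None:
        self.response_queue_id = None
        self.routes = ["/predict"]  # mutable state, shared by shallow copies


def make_per_server_apps(app: FakeApp, num_servers: int) -> List[FakeApp]:
    apps = []
    for queue_id in range(num_servers):
        # Set the id on the original first, then snapshot it with a shallow
        # copy: the same order of operations as in _start_server.
        app.response_queue_id = queue_id
        apps.append(copy.copy(app))
    return apps


apps = make_per_server_apps(FakeApp(), 3)
print([a.response_queue_id for a in apps])
print(all(a.routes is apps[0].routes for a in apps))
```

Each copy keeps the id that was set just before it was taken, while attributes that refer to other objects (here the routes list) still point at the same shared object.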

Notes

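The above only covers how LitServe starts its FastAPI service on top of uvicorn servers. Other parts will be covered in later posts. None of it is particularly difficult; the key is understanding the internal mechanism.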

References

https://github.com/Lightning-AI/LitServe

https://github.com/encode/uvicorn/blob/master/uvicorn/supervisors/multiprocess.py

posted on 2024-11-09 08:00  荣锋亮