MetaGPT day04 MetaGPT ActionNode

ActionNode

Guide to the documentation

# What is an ActionNode?
    1. ActionNode is a generalized abstraction of Action
    2. ActionNode is the smallest unit of an SOP

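# ActionNode is a generalized abstraction of Action:
    Working backwards, does this mean Action is not general enough? In other words, is ActionNode finer-grained than Action?
    Action  -finer granularity-> ActionNode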
    
# ActionNode is the smallest unit of an SOP:
    Every SOP can be seen as a collection of actions made up of ActionNodes, that is:
    SOP = {ActionNode1,ActionNode2,...}

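# An action tree built from ActionNodes can model the complex structure of language more effectively, including syntactic and semantic relations, which helps the system handle natural language.

ActionNode -forms-> action tree -models-> language structure
language structure = syntax + semantic relations + ...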
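
To make the "action tree" idea concrete, here is a minimal sketch in plain Python. The `Node` class and its `walk` method are hypothetical stand-ins written for this note, not MetaGPT's `ActionNode`; they only show an SOP modelled as a root whose children are its steps.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an ActionNode: a key, an instruction, children."""
    key: str
    instruction: str = ""
    children: dict = field(default_factory=dict)

    def add_child(self, node: "Node") -> None:
        self.children[node.key] = node

    def walk(self, depth: int = 0):
        """Yield (depth, key) pairs in depth-first order."""
        yield depth, self.key
        for child in self.children.values():
            yield from child.walk(depth + 1)


# An SOP modelled as a root node whose children are its steps.
sop = Node("WritePRD")
for key in ("Language", "Original Requirements", "Project Name"):
    sop.add_child(Node(key))
sop.children["Project Name"].add_child(Node("Naming Style"))

for depth, key in sop.walk():
    print("  " * depth + key)
```

Walking the tree depth-first visits every step exactly once, which is the kind of traversal the text has in mind.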

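# ActionNode lets the large language model (LLM) focus on the slot-filling task of each smallest unit, and gives the agent's planning an efficient data structure for quickly traversing states and decision points to find the best or the required path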
    
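# Slot-filling task:
In a multi-turn dialogue, the key pieces of information the system must collect to complete a task are called "slots". As I understand it, slot filling amounts to completing the prompt.

ActionNode lets the LLM focus on the prompt-completion task of each individual ActionNode. ActionNode also provides an efficient data structure (a tree) for quickly traversing states and decision points to find the best or the required path.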
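
A small sketch of slot filling, under my own assumptions: `fake_llm` is a hypothetical stub that stands in for a real LLM call, and the slot names are invented for illustration. The point is only that each empty slot triggers one focused question.

```python
def fake_llm(prompt: str) -> str:
    """Hypothetical stub standing in for a real LLM call."""
    answers = {"destination": "Beijing", "date": "2024-02-01"}
    for slot, value in answers.items():
        if f"slot '{slot}'" in prompt:
            return value
    return ""


def fill_slots(slots: dict, ask) -> dict:
    """Ask for every slot that is still empty; keep the ones already filled."""
    filled = dict(slots)
    for name, value in slots.items():
        if value is None:
            filled[name] = ask(f"Provide a value for slot '{name}'.")
    return filled


booking = {"destination": None, "date": None, "passengers": 2}
result = fill_slots(booking, fake_llm)
print(result)
```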

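# ActionNode makes later attempts at automatic SOP generation possible.

Just as different skills in the game Minecraft can be combined into new skills, arranging the nodes of an action tree lets us create new skills and move closer to the goal of natural-language programming. This unified abstraction and structured approach helps the system adapt and learn, and gives agent evolution a solid foundation.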

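ActionNode features

In MetaGPT (MG), the aim is to move the functionality carried out by the LLM into ActionNode. (Does this mean rewriting some existing functionality?)
For more detail, see the source code: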
https://github.com/geekan/MetaGPT/blob/dev/metagpt/actions/action_node.py

# ActionNode feature overview:
fill / review / revise / plan / cache / with_docs 

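# fill: 
Slot filling. Runs the given prompt, returns the result, and stores the result on the object itself.

# review: 
Reviews the result of slot filling. Assesses how complete the fill is and what problems it has.

# revise: 
Revision. Improves the content based on the suggestions and results from review.

# plan: 
Planning. Generates new child slots, which may contain only an instruction.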
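
The fill, review, and revise steps can be pictured as a loop. The sketch below is only an illustration with hypothetical stub functions (no LLM is involved, and these are not MetaGPT's method signatures): `fill` returns a malformed draft, `review` lists the problems, and `revise` fixes them until the review passes.

```python
def fill(instruction: str) -> str:
    """Stub fill: returns a deliberately malformed first draft."""
    return "1,1,2,3,5"


def review(content: str) -> list:
    """Return a list of problems; an empty list means the slot passed review."""
    problems = []
    if not content.startswith("["):
        problems.append("missing opening bracket")
    if not content.endswith("]"):
        problems.append("missing closing bracket")
    return problems


def revise(content: str, problems: list) -> str:
    """Apply one fix per reported problem."""
    if "missing opening bracket" in problems:
        content = "[" + content
    if "missing closing bracket" in problems:
        content = content + "]"
    return content


def fill_review_revise(instruction: str, max_rounds: int = 3) -> str:
    content = fill(instruction)
    for _ in range(max_rounds):
        problems = review(content)
        if not problems:
            break
        content = revise(content, problems)
    return content


final = fill_review_revise("Return the list in the format [1,2,3,4]")
print(final)
```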

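# cache: 
Supports procedural memory. When an ActionNode performs a concrete action such as fill, it needs a matching "procedural memory" that can be cached, replayed, referenced, and called.
When a previously seen query comes up, the cache is used first to resolve it.
   Example 1: generating a product document
      1. First fill: product document for the 2048 game -> PRD: the actual 2048 product document
        1. The actual 2048 product document is generated
        2. The GPT-4 review gives a full score
        3. Cache: because the review score is full, this pair is cached locally
      2. Second fill: product document for the 2048 game -> PRD: the actual 2048 product document
        1. Replay: the previously cached result is used and the PRD is returned directly
      3. Third fill: product document for the 4096 game -> PRD: the 4096 product document
        1. Reference: the 2048 game document is retrieved and used as a reference
        2. The actual 4096 product document is generated (generating it from scratch is fairly hard; the reference lowers the difficulty)
   Example 2: generating code (abbreviated)
      1. First fill: generate the 2048 move function; cached on success
      2. Second fill: generate the 2048 move function; the cache is reused
      3. Third fill: generate a move function for an 8*8 grid; the cache is used as a reference
      4. Fourth fill: a new function must be written that calls k existing functions, and those functions live in memory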
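
The cache, replay, and reference behaviour in the two examples can be sketched as follows. `FillCache` is a hypothetical class written for this note, and using `difflib` to find a "similar" query is my own simplification; the source does not say how retrieval is done.

```python
import difflib


class FillCache:
    """Hypothetical procedural memory: query -> result pairs."""

    def __init__(self):
        self._store = {}

    def put(self, query, result, review_score, full_score=10):
        """Cache only results that received a full review score."""
        if review_score >= full_score:
            self._store[query] = result

    def replay(self, query):
        """Exact hit: return the cached result, or None."""
        return self._store.get(query)

    def reference(self, query, cutoff=0.6):
        """Near hit: return the result of the most similar cached query, or None."""
        matches = difflib.get_close_matches(query, list(self._store), n=1, cutoff=cutoff)
        return self._store[matches[0]] if matches else None


cache = FillCache()
cache.put("PRD for the 2048 game", "PRD: 2048 product document", review_score=10)
cache.put("PRD for a chess game", "PRD: weak draft", review_score=6)  # not cached

print(cache.replay("PRD for the 2048 game"))     # second fill: replay
print(cache.replay("PRD for the 4096 game"))     # third fill: no exact hit
print(cache.reference("PRD for the 4096 game"))  # third fill: reference
```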
        
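# with_docs:
Used to implement RAG (retrieval-augmented generation): external documents are searched, and the high-quality content that is retrieved is passed to the LLM to improve generation. (This code is planned to be supported after version 0.7.)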
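
Since with_docs is not implemented yet, the following is only a toy picture of the retrieve-then-generate idea, not MetaGPT code. The word-overlap scoring, the `retrieve` and `build_prompt` helpers, and the sample documents are all assumptions made for illustration.

```python
def tokenize(text: str) -> set:
    return set(text.lower().split())


def retrieve(query: str, docs: list, top_k: int = 1) -> list:
    """Rank documents by how many words they share with the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, docs: list) -> str:
    """Prepend the retrieved text so the LLM can use it as reference."""
    reference = "\n".join(retrieve(query, docs))
    return f"## reference\n{reference}\n\n## question\n{query}"


DOCS = [
    "2048 is a sliding tile puzzle played on a 4x4 grid",
    "Snake is an arcade game where the player steers a growing line",
    "Tetris is a puzzle game with falling blocks",
]

prompt = build_prompt("write a move function for the 2048 grid", DOCS)
print(prompt)
```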

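ActionNode parameters

schema: str  # structure of the data, e.g. one of raw/json/markdown; default: ""

# Action context
context: str  # the full context, including all necessary information
llm: BaseLLM  # an LLM that exposes the aask interface
children: dict[str, "ActionNode"]  # dict of child nodes; keys are strings, values are ActionNode objects

# Action input
key: str  # e.g. product requirements / file list / code
expected_type: Type  # e.g. str / int / float
# context: str  # everything in the history
instruction: str  # the instruction to follow
example: Any  # example for in-context learning

# Action output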
content: str
instruct_content: BaseModel
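
To show how `key` and `expected_type` can drive a structured output such as `instruct_content`, here is a sketch that builds a class dynamically from the children's declared types. As far as I can tell MetaGPT does this with a pydantic model; the version below uses the standard library's `dataclasses` as a stand-in, and the field names and the `build_instruct_content` helper are hypothetical.

```python
from dataclasses import asdict, fields, make_dataclass

# Hypothetical children: field name -> expected_type
CHILD_TYPES = {"language": str, "project_name": str, "file_count": int}

# Build the output class from the declared types.
WritePRDOutput = make_dataclass("WritePRDOutput", list(CHILD_TYPES.items()))


def build_instruct_content(cls, data: dict):
    """Check each value against its declared type, then build the instance."""
    for f in fields(cls):
        if not isinstance(data[f.name], f.type):
            raise TypeError(f"{f.name} should be {f.type.__name__}")
    return cls(**data)


parsed = {"language": "en_us", "project_name": "game_2048", "file_count": 3}
content = build_instruct_content(WritePRDOutput, parsed)
print(asdict(content))
```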

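A quick way to learn ActionNode: printing the Fibonacci sequence

Goal: print the first 10 numbers of the Fibonacci sequence. The LLM must return the sequence in a specific, parseable format, and the numbers are then printed one by one by parsing that format.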

import asyncio
import re

from metagpt.actions.action import Action
from metagpt.actions.action_node import ActionNode
from metagpt.logs import logger
from metagpt.roles import Role
from metagpt.schema import Message

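# The task of thinking about 10 Fibonacci numbers is the prompt input; here "think about which list of numbers to generate" is written as the instruction
# The expected return type (expected_type) is set to str; no example is needed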
SIMPLE_THINK_NODE = ActionNode(
    key="Simple Think Node",
    expected_type=str,
    instruction="""
            Think about what list of numbers you need to generate
            """,
    example=""
)

# Here the instruction specifies the format of the number list to generate, and the example helps the LLM understand it
SIMPLE_CHECK_NODE = ActionNode(
    key="Simple CHECK Node",
    expected_type=str,
    instruction="""
            Please provide the number list for me, strictly following the following requirements:
            1. Answer strictly in the list format like [1,2,3,4]
            2. Do not have extra spaces or line breaks.
            Return the list here:
            """,
    example="[1,2,3,4]"
            "[4,5,6]",
)


class THINK_NODES(ActionNode):
    def __init__(self, name="Think Nodes", expected_type=str, instruction="", example=""):
        super().__init__(key=name, expected_type=str, instruction=instruction, example=example)
        self.add_children([SIMPLE_THINK_NODE, SIMPLE_CHECK_NODE])  # during initialization, add the two nodes defined above as children of THINK_NODES

    async def fill(self, context, llm, to="raw", mode="auto", strgy="complex"):
        self.set_llm(llm)
        self.set_context(context)
        if hasattr(self, 'to'):
            to = self.to

        if strgy == "simple":
            return await self.simple_fill(to=to, mode=mode)
        elif strgy == "complex":
            # this branch implicitly assumes the node has children
            child_context = context  # the input context becomes the context of the first child
            for _, i in self.children.items():
                i.set_context(child_context)  # set the context on the child
                child = await i.simple_fill(to=to, mode=mode)
                child_context = child.content  # the returned content (child.content) becomes the context of the next child

            self.content = child_context  # the content returned by the last child becomes the parent's content (self.content)
            return self


class SimplePrint(Action):
    """
    Action that print the num inputted
    """
    input_num = 0

    def __init__(self, name="SimplePrint", input_num: int = 0):
        super().__init__()

        self.input_num = input_num

    async def run(self, **kwargs):
        print(str(self.input_num) + "\n")
        return 0


class ThinkAction(Action):
    """
    Action that think
    """

    def __init__(self, name="ThinkAction", context=None, llm=None):
        super().__init__()
        self.node = THINK_NODES()  # when the Action is initialized, create a THINK_NODES instance and assign it to self.node

    async def run(self, instruction) -> list:
        PROMPT = """
            You are now a number list generator, follow the instruction {instruction} and 
            generate a number list to be printed please.
            """

        prompt = PROMPT.format(instruction=instruction)
        rsp_node = await self.node.fill(context=prompt, llm=self.llm, to="raw",
                                        strgy="complex")  # run the child nodes; the result comes back as an ActionNode (note to="raw")
        rsp = rsp_node.content  # the returned text

        rsp_match = self.find_in_brackets(rsp)  # locate the content between "[" and "]" in the returned text

        try:
            rsp_list = list(map(int, rsp_match[0].split(',')))  # split the content on "," and build a Python list of ints

            return rsp_list
        except (IndexError, ValueError):  # no bracketed list found, or an item is not an integer
            return []

    @staticmethod
    def find_in_brackets(s):
        pattern = r'\[(.*?)\]'
        match = re.findall(pattern, s)
        return match


class Printer(Role):

    def __init__(self, name="Jerry", profile="Printer", goal="Print the number", constraints=""):
        super().__init__()

        self._init_actions([ThinkAction])
        # self.num_list = list()

    async def _think(self) -> None:
        """Determine the action"""
        # logger.info(self._rc.state)

        if self._rc.todo is None:
            self._set_state(0)
            return

        if self._rc.state + 1 < len(self._states):
            self._set_state(self._rc.state + 1)
        else:
            self._rc.todo = None

    async def _prepare_print(self, num_list: list) -> Message:
        """Add actions"""
        actions = list()

        for num in num_list:
            actions.append(SimplePrint(input_num=num))

        self._init_actions(actions)
        self._rc.todo = None
        return Message(content=str(num_list))

    async def _act(self) -> Message:
        """Action"""
        todo = self._rc.todo

        if type(todo) is ThinkAction:
            msg = self._rc.memory.get(k=1)[0]
            self.goal = msg.content
            resp = await todo.run(instruction=self.goal)
            # logger.info(resp)

            return await self._prepare_print(resp)

        resp = await todo.run()
        # logger.info(resp)

        return Message(content=resp, role=self.profile)

    async def _react(self) -> Message:
        """"""
        while True:
            await self._think()

            if self._rc.todo is None:
                break
            msg = await self._act()

        return msg


async def main():
    msg = "Provide the first 10 numbers of the Fibonacci series"
    role = Printer()
    logger.info(msg)
    result = await role.run(msg)
    logger.info(result)


if __name__ == '__main__':
    asyncio.run(main())
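
The parsing step in `ThinkAction.run` can be checked on its own, without an LLM. The sketch below reuses the same regular expression as `find_in_brackets`; the `parse_number_list` wrapper and the sample strings are mine, added only to show what happens with well-formed and malformed replies.

```python
import re


def find_in_brackets(s: str) -> list:
    """Return the text between each "[" and "]" pair (non-greedy)."""
    return re.findall(r'\[(.*?)\]', s)


def parse_number_list(rsp: str) -> list:
    """Parse the first bracketed list in rsp into ints; [] if that fails."""
    matches = find_in_brackets(rsp)
    try:
        return [int(x) for x in matches[0].split(',')]
    except (IndexError, ValueError):
        return []


print(parse_number_list("Here is the list: [0,1,1,2,3,5,8,13,21,34]"))
print(parse_number_list("Sorry, I cannot help with that."))
print(parse_number_list("[one,two,three]"))
```

An empty list means `Printer._prepare_print` creates no `SimplePrint` actions, so a malformed reply simply prints nothing.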

Prompt example:

## context

            You are now a number list generator, follow the instruction Provide the first 10 numbers of the Fibonacci series and 
            generate a number list to be printed please.
            

-----

## format example
[CONTENT]
{'Simple Think Node': ''}
[/CONTENT]

## nodes: "<node>: <type>  # <comment>"
- Simple Think Node: <class 'str'>  # 
            Think about what list of numbers you need to generate
            


## constraint

- Language: Please use the same language as the user input.
- Format: output wrapped inside [CONTENT][/CONTENT] as format example, nothing else.


## action
Fill in the above nodes based on the format example.

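ActionNode example

Below, four child nodes are added to a parent node with the from_children method. Calling compile on the parent node joins all the child nodes with a template into a single prompt. The prompt is then sent to the LLM: the fill method does this and stores the returned structure on the ActionNode object itself, while the script below calls aask directly so the raw response can be logged.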

import asyncio
from metagpt.actions.action_node import ActionNode
from metagpt.logs import logger
from metagpt.llm import LLM

LANGUAGE = ActionNode(
    key="语言",
    expected_type=str,
    instruction="提供项目中使用的语言，通常应与用户的需求语言相匹配。",
    example="en_us",
)

PROGRAMMING_LANGUAGE = ActionNode(
    key="编程语言",
    expected_type=str,
    instruction="Python/JavaScript或其他主流编程语言。",
    example="Python",
)

ORIGINAL_REQUIREMENTS = ActionNode(
    key="原始需求",
    expected_type=str,
    instruction="将原始的用户需求放在这里。",
    example="创建2048游戏",
)

PROJECT_NAME = ActionNode(
    key="项目名称",
    expected_type=str,
    instruction="根据“原始需求”的内容，使用蛇形命名风格为项目命名，例如 'game_2048' 或 'simple_crm'。",
    example="game_2048",
)

NODES = [
    LANGUAGE,
    PROGRAMMING_LANGUAGE,
    ORIGINAL_REQUIREMENTS,
    PROJECT_NAME,
]

WRITE_PRD_NODE = ActionNode.from_children("WritePRD", NODES)


async def main():
    prompt = WRITE_PRD_NODE.compile(context="你是一个产品经理，你需要为游戏幻兽帕鲁写需求文档", to='markdown', mode='auto')
    logger.info(prompt)
    response = await LLM().aask(prompt)
    logger.info(response)


if __name__ == '__main__':
    asyncio.run(main())

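The prompt is shown below. Two things are worth pointing out:

The structure of the LLM's reply can be configured through the to parameter of compile (called schema in the older 0.4 version); it can be markdown or json.
The constraint and action sections are fixed parts of the template.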

## context
你是一个产品经理，你需要为游戏幻兽帕鲁写需求文档

-----

## format example
[CONTENT]
- 语言: en_us
- 编程语言: Python
- 原始需求: 创建2048游戏
- 项目名称: game_2048

[/CONTENT]

## nodes: "<node>: <type>  # <comment>"
- 语言: <class 'str'>  # 提供项目中使用的语言，通常应与用户的需求语言相匹配。
- 编程语言: <class 'str'>  # Python/JavaScript或其他主流编程语言。
- 原始需求: <class 'str'>  # 将原始的用户需求放在这里。
- 项目名称: <class 'str'>  # 根据“原始需求”的内容，使用蛇形命名风格为项目命名，例如 'game_2048' 或 'simple_crm'。


## constraint
- Language: Please use the same language as the user input.
- Format: output wrapped inside [CONTENT][/CONTENT] as format example, nothing else.


## action
Fill in the above nodes based on the format example.

From the prompt's point of view, an ActionNode maps to one parameter (slot) in the prompt. This lets us query the LLM repeatedly and obtain a value for each slot.

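LANGUAGE = ActionNode(
    key="语言",  # the key in the dict returned by the LLM
    expected_type=str,  # the LLM is expected to return a str for this slot
    instruction="提供项目中使用的语言，通常应与用户的需求语言相匹配。",  # tells the LLM how to fill the slot and how to return this parameter
    example="en_us",  # the example in the prompt, i.e. what a correct return looks like
)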
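
Going the other way, a reply that follows the markdown format example can be turned back into slot values. This is a minimal sketch of that step: `extract_content` and `parse_markdown_slots` are hypothetical helpers written for this note, not MetaGPT's parser, and the sample reply is made up.

```python
import re


def extract_content(rsp: str, tag: str = "CONTENT") -> str:
    """Return the text wrapped in [tag]...[/tag], or "" if the tags are missing."""
    m = re.search(rf"\[{tag}\](.*?)\[/{tag}\]", rsp, re.DOTALL)
    return m.group(1).strip() if m else ""


def parse_markdown_slots(block: str) -> dict:
    """Turn "- key: value" lines into a dict of slot values."""
    slots = {}
    for line in block.splitlines():
        line = line.strip()
        if line.startswith("- ") and ":" in line:
            key, value = line[2:].split(":", 1)
            slots[key.strip()] = value.strip()
    return slots


reply = """[CONTENT]
- 语言: zh_cn
- 编程语言: Python
- 项目名称: game_palworld
[/CONTENT]"""

slots = parse_markdown_slots(extract_content(reply))
print(slots)
```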

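ActionNode.add_children

# What does it do?
Adds child nodes (ActionNodes) in bulk, storing them in the children attribute (a dict). Each dict key is the child node's key attribute.

For example, adding two child nodes: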
SIMPLE_THINK_NODE = ActionNode(
    key="Simple Think Node",
)

SIMPLE_CHECK_NODE = ActionNode(
    key="Simple CHECK Node",
)
	
This gives something like:
    self.children = {'Simple Think Node':SIMPLE_THINK_NODE_obj,'Simple CHECK Node':SIMPLE_CHECK_NODE_obj,...}
	
class THINK_NODES(ActionNode):
    def __init__(self, name="Think Nodes", expected_type=str, instruction="", example=""):
        super().__init__(key=name, expected_type=str, instruction=instruction, example=example)
        self.add_children([SIMPLE_THINK_NODE, SIMPLE_CHECK_NODE])
        
def add_children(self, nodes: List["ActionNode"]):
    """Add child ActionNodes in bulk"""
    for node in nodes:
        self.add_child(node)

def add_child(self, node: "ActionNode"):
    """Add a single child ActionNode"""
    self.children[node.key] = node
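
Because children are stored in a dict keyed by `key`, adding a second node with the same key replaces the first one. The sketch below shows this with a hypothetical minimal `Node` class that copies only the two methods above; it is not the real `ActionNode`.

```python
from typing import Dict, List


class Node:
    """Hypothetical minimal node with the same add_child/add_children logic."""

    def __init__(self, key: str, instruction: str = ""):
        self.key = key
        self.instruction = instruction
        self.children: Dict[str, "Node"] = {}

    def add_child(self, node: "Node") -> None:
        self.children[node.key] = node

    def add_children(self, nodes: List["Node"]) -> None:
        for node in nodes:
            self.add_child(node)


parent = Node("Think Nodes")
parent.add_children([Node("Simple Think Node"), Node("Simple CHECK Node")])
print(list(parent.children))

# Same key, different instruction: the old child is overwritten.
parent.add_child(Node("Simple CHECK Node", instruction="stricter check"))
print(len(parent.children), parent.children["Simple CHECK Node"].instruction)
```

So keys should be unique among siblings, otherwise a slot is silently lost.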

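ActionNode.set_llm

# What does it do?
1. Sets the llm attribute on the parent node
2. Recursively sets the llm attribute on the child nodes (children of children are covered too)

The llm here is a large language model with a predefined system message.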
def set_llm(self, llm):
    self.set_recursive("llm", llm)
    
def set_recursive(self, name, value):
    setattr(self, name, value)
    for _, i in self.children.items():
        i.set_recursive(name, value)
        
self.children = {'Simple Think Node':SIMPLE_THINK_NODE_obj,'Simple CHECK Node':SIMPLE_CHECK_NODE_obj,...}

ActionNode.set_context

# What does it do?
Same as above: recursively sets the context attribute on all nodes.
def set_context(self, context):
    self.set_recursive("context", context)
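
The recursion used by set_llm and set_context can be tried on a three-level tree. This is a sketch with a hypothetical `Node` class that copies only the `set_recursive` logic; the string "stub-llm" stands in for a real LLM object.

```python
class Node:
    """Hypothetical minimal node; only the recursive setter is modelled."""

    def __init__(self, key):
        self.key = key
        self.children = {}
        self.llm = None
        self.context = ""

    def add_child(self, node):
        self.children[node.key] = node

    def set_recursive(self, name, value):
        setattr(self, name, value)
        for child in self.children.values():
            child.set_recursive(name, value)


root, mid, leaf = Node("root"), Node("mid"), Node("leaf")
mid.add_child(leaf)
root.add_child(mid)

root.set_recursive("llm", "stub-llm")
root.set_recursive("context", "print Fibonacci numbers")
print(leaf.llm, "|", leaf.context)

# Setting a value on a subtree leaves the nodes above it untouched.
mid.set_recursive("context", "only below mid")
print(root.context, "|", leaf.context)
```

This is why the `fill` override in THINK_NODES can call `i.set_context(child_context)` on one child without changing the context of the parent.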

ActionNode.compile

# What does it do?
As of version 0.5.2, only the children mode is implemented.

def compile(self, context, to="json", mode="children", template=SIMPLE_TEMPLATE) -> str:
    """
    mode: all/root/children
        mode="children": compile all child nodes into one unified template, including instruction and example
        mode="all": NotImplemented
        mode="root": NotImplemented
    """

    # FIXME: a json instruction causes format problems, e.g. "Project name": "web_2048  # use underscores in the project name",
    self.instruction = self.compile_instruction(to="markdown", mode=mode)
    self.example = self.compile_example(to=to, tag="CONTENT", mode=mode)
    prompt = template.format(
        context=context, example=self.example, instruction=self.instruction, constraint=CONSTRAINT
    )
    return prompt
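
To see how the pieces of compile fit together, here is a self-contained sketch of the children mode. `TEMPLATE`, `CONSTRAINT`, and the helper functions are simplified stand-ins modelled on the prompts shown earlier, and the children are plain tuples; none of this is MetaGPT's actual implementation.

```python
import json

TEMPLATE = """## context
{context}

-----

## format example
{example}

## nodes: "<node>: <type>  # <comment>"
{instruction}

## constraint
{constraint}

## action
Fill in the above nodes based on the format example.
"""

CONSTRAINT = "- Format: output wrapped inside [CONTENT][/CONTENT] as format example, nothing else."

# Hypothetical children: (key, expected_type, instruction, example)
CHILDREN = [
    ("Language", str, "Language used in the project.", "en_us"),
    ("Project Name", str, "Snake-case project name.", "game_2048"),
]


def compile_instruction(children) -> str:
    """One "- key: type  # instruction" line per child."""
    return "\n".join(f"- {key}: {typ}  # {ins}" for key, typ, ins, _ in children)


def compile_example(children, to="json", tag="CONTENT") -> str:
    """Render the children's examples as json or markdown inside [tag] markers."""
    if to == "json":
        body = json.dumps({key: ex for key, _, _, ex in children}, indent=4)
    else:
        body = "\n".join(f"- {key}: {ex}" for key, _, _, ex in children)
    return f"[{tag}]\n{body}\n[/{tag}]"


def compile_prompt(context, children, to="json") -> str:
    return TEMPLATE.format(
        context=context,
        example=compile_example(children, to=to),
        instruction=compile_instruction(children),
        constraint=CONSTRAINT,
    )


print(compile_prompt("You are a product manager.", CHILDREN, to="markdown"))
```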

posted @ 2024-01-26 17:44 passion2021
