
How does Auto-GPT work?

Posted on 2023-04-25 04:53 by 蝈蝈俊

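Auto-GPT is a remarkable tool. Given a goal described in natural language, it uses GPT to analyze the goal in depth, breaks it into an ordered series of subtasks, and automatically carries out actions such as accessing the internet. It then reflects on the results of each action to refine its goals. The whole process loops autonomously until it produces a satisfactory result.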

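Features:

  • No human intervention is needed. With ChatGPT, every task has to be started manually, whereas Auto-GPT assigns itself new sub-goals in order to achieve the larger goal.
  • Auto-GPT manages its short-term and long-term memory by writing to and reading from databases and files.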

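What can AutoGPT do? A look at an example

Below is the official example, taken from:
https://github.com/Significant-Gravitas/Auto-GPT

The goal of this example is to put together a write-up about AutoGPT. The initial action plan is:

  1. Search for AutoGPT;
  2. Find the project's repository and figure out what AutoGPT does;
  3. Explain it in a file named autogpt.txt;
  4. Terminate the task.

Name: AutoGPT-Demo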
Role: an ai designed to teach me about auto gpt
Goals: ['search auto gpt','find the github and figure out what the project is','explain what auto gpt is in a file named autogpt.txt','terminate']

After the first round of analysis, it refined the task into:

AUTOGPT-DEMO THOUGHTS : I think the first step should be to use the 'google' command to search for 'Auto GPT'
REASONING:  This will help us gather more information about Auto GPT and we can proceed with identifying the relevant GitHub project .
PLAN: 
- Use 'google' to search for 'Auto GPT'
- Browse relevant websites to find the GitHub project 
- Write a document explaining what Auto GPT is

CRITICISM: I need to be sure to remain focused and efficient in my use of the 'google' command to minimize the number of steps needed to identify the relevant GitHub project and answer the key questions

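It then carries out the first task, the Google search.
Based on the search results, AutoGPT updated its understanding and added a step to explore the GitHub repository: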

AUTOGPT-DEMO THOUGHTS : The next step would be to browse the GitHub repository of Auto GPT. This will help us to gather more information about the project and understand it better. 
REASONING:  Browsing the GitHub repository is the best way to gather more detailed information about the project and understand its functionality
PLAN: 
- Browse the Github repository of Auto GPT 
- Explore the repository to understand the project better
- Write a document explaining what Auto GPT is
CRITICISM : I need to ensure that I focus on identifying the key functionalities of Auto GPT and understand the codebase effectively so that I can accurately describe the project in the document I write 

What follows is a series of steps that explore the AutoGPT project, after which it writes the introduction file.

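From the example above we can see:

For every task we assign, Auto-GPT comes up with a corresponding plan.
If, for example, it needs to browse the internet or use new data, it adjusts its strategy until the task is complete.
It is like having a personal assistant that can handle all kinds of tasks, such as market analysis, customer service, marketing, and finance.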

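AutoGPT's core code logic

This is how we humans act:

We take an action, get feedback, and update our ideas accordingly; after a few rounds of this, the goal can usually be solved well.

The core code logic of the AutoGPT project follows the same pattern.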
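
To make the act, feedback, update cycle concrete, here is a minimal Python sketch of such a loop. It is not Auto-GPT's actual code: run_loop, scripted_think, fake_commands, and max_steps are hypothetical names introduced for illustration, and the LLM call is replaced by a function that replays a fixed plan.

```python
from typing import Any, Callable, Dict, List

# Hypothetical types for this sketch only (not Auto-GPT's real API).
Thought = Dict[str, Any]
Command = Callable[[Dict[str, str]], str]


def run_loop(
    goals: List[str],
    think: Callable[[List[str], List[str]], Thought],
    commands: Dict[str, Command],
    max_steps: int = 10,
) -> List[str]:
    """Act, collect feedback, update the context, and repeat."""
    history: List[str] = []  # feedback from earlier steps, passed back to `think`
    for _ in range(max_steps):
        thought = think(goals, history)        # update the idea
        name = thought["command"]["name"]
        args = thought["command"]["args"]
        if name == "task_complete":
            break
        result = commands[name](args)          # take an action
        history.append(f"{name} -> {result}")  # record the feedback
    return history


def scripted_think(goals: List[str], history: List[str]) -> Thought:
    """Stand-in for the LLM call: replays a fixed plan, one step per round."""
    plan = [
        {"name": "google", "args": {"input": "Auto GPT"}},
        {"name": "write_to_file", "args": {"file": "autogpt.txt", "text": "notes"}},
        {"name": "task_complete", "args": {"reason": "done"}},
    ]
    return {"command": plan[len(history)]}


# Fake commands that only return a description of what they would do.
fake_commands: Dict[str, Command] = {
    "google": lambda args: f"results for {args['input']}",
    "write_to_file": lambda args: f"wrote {args['file']}",
}

history = run_loop(["search auto gpt", "terminate"], scripted_think, fake_commands)
print(history)
```

In a real agent the think step would send the prompt and the accumulated history to the model; here the history list is the only "memory", which keeps the sketch self-contained.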

The program's entry point is the main method in autogpt/cli.py. This method prepares the various configuration settings and then starts an interaction loop.

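The start_interaction_loop() function is implemented in autogpt/agent/agent.py. Its internal logic is shown in the figure below:

[Figure: flowchart of start_interaction_loop()]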


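As we can see, this matches the human action pattern described earlier, except that the ideas and the feedback are folded into the prompt. The prompt consists of the following parts (it is updated on every iteration):

  • GOALS - the task goals; several can be listed, which keeps things organized
  • CONSTRAINTS - rules the LLM is told to follow
  • COMMANDS - the functions it can call
  • RESOURCES - supplementary rules
  • PERFORMANCE EVALUATION - supplementary rules
  • RESPONSE FORMAT - the format of the reply, which requires the LLM to include its "Thoughts" so that there is clearer context while a series of tasks is being executed.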
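
As a rough illustration of how such sections could be assembled into a single prompt string, here is a small Python sketch. The helper names numbered and build_prompt are assumptions made for this example, not functions from the Auto-GPT codebase, and only three of the sections are shown.

```python
from typing import Dict, List


def numbered(items: List[str]) -> str:
    """Render a list as '1. ...' lines, the way the prompt sections are written."""
    return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))


def build_prompt(ai_name: str, ai_role: str, sections: Dict[str, List[str]]) -> str:
    """Join the header line and the titled sections into one prompt string."""
    parts = [f"You are {ai_name}, {ai_role}"]
    for title, items in sections.items():
        parts.append(f"{title}:\n{numbered(items)}")
    return "\n\n".join(parts)


prompt = build_prompt(
    "AutoGPT-Demo",
    "an ai designed to teach me about auto gpt",
    {
        "GOALS": ["search auto gpt", "terminate"],
        "CONSTRAINTS": ["No user assistance"],
        "COMMANDS": ['Google Search: "google", args: "input": "<search>"'],
    },
)
print(prompt)
```

Because the prompt is rebuilt from plain data on each call, updating it every iteration only requires changing the lists that are passed in.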

Below is the content of the prompt.


You are {ai_name}, {ai_role}
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.
GOALS:
{Goals}
CONSTRAINTS:

1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"

COMMANDS:

1. Google Search: "google", args: "input": "<search>"
5. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
6. Start GPT Agent: "start_agent",  args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"
7. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"
8. List GPT Agents: "list_agents", args: ""
9. Delete GPT Agent: "delete_agent", args: "key": "<key>"
10. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"
11. Read file: "read_file", args: "file": "<file>"
12. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"
13. Delete file: "delete_file", args: "file": "<file>"
14. Search Files: "search_files", args: "directory": "<directory>"
15. Evaluate Code: "evaluate_code", args: "code": "<full_code_string>"
16. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
17. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
18. Execute Python File: "execute_python_file", args: "file": "<file>"
19. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"
20. Generate Image: "generate_image", args: "prompt": "<prompt>"
21. Do Nothing: "do_nothing", args: ""

RESOURCES:

1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

PERFORMANCE EVALUATION:

1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

You should only respond in JSON format as described below

RESPONSE FORMAT:
{
    "thoughts":
    {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted\n- list that conveys\n- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    },
    "command": {
        "name": "command name",
        "args":{
            "arg name": "value"
        }
    }
}

Ensure the response can be parsed by Python json.loads

Note the requirements on the response format: they make it easy to parse out the next command to execute and its arguments.
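
A minimal sketch of that parsing step, using Python's standard json module, might look like the following. The function name parse_reply and the fallback behaviour (returning an "error" pseudo-command, or defaulting to "do_nothing") are assumptions for illustration and are not taken from the Auto-GPT source.

```python
import json
from typing import Dict, Tuple


def parse_reply(reply: str) -> Tuple[str, Dict[str, str]]:
    """Extract the command name and its arguments from a JSON reply."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        # Hypothetical fallback: report the problem as a pseudo-command.
        return "error", {"message": f"invalid JSON: {exc.msg}"}
    if not isinstance(data, dict):
        return "error", {"message": "reply is not a JSON object"}
    command = data.get("command", {})
    # Assumption: fall back to "do_nothing" when no command name is given.
    return command.get("name", "do_nothing"), command.get("args", {})


reply = """
{
    "thoughts": {"text": "search first", "reasoning": "need information",
                 "plan": "- search", "criticism": "stay focused",
                 "speak": "Searching for Auto GPT."},
    "command": {"name": "google", "args": {"input": "Auto GPT"}}
}
"""
name, args = parse_reply(reply)
print(name, args)
```

The returned name can then be looked up among the available commands and called with args.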

The heart of the whole mechanism is the flow and the prompt: the flow reproduces the human action pattern, and the prompt makes better use of GPT.

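Summary

Auto-GPT currently has two main problems:

  • Auto-GPT is still being iterated on rapidly, so you may run into all sorts of issues;
  • Although the code exits once a certain limit is reached in order to reduce cost, using the API of GPT-4, which has stronger comprehension, is still quite expensive.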
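
The exit limit mentioned in the second point can be pictured as a simple cycle budget. The sketch below is only an illustration of the idea; CycleBudget, max_cycles, and spend are hypothetical names, and the real project's limit may be implemented differently.

```python
from dataclasses import dataclass


@dataclass
class CycleBudget:
    """Hypothetical guard that stops a loop after a fixed number of cycles."""

    max_cycles: int
    used: int = 0

    def spend(self) -> bool:
        """Consume one cycle; return False once the budget is exhausted."""
        if self.used >= self.max_cycles:
            return False
        self.used += 1
        return True


budget = CycleBudget(max_cycles=3)
cycles_run = 0
while budget.spend():
    cycles_run += 1  # one think-act-feedback round, i.e. at least one API call
print(cycles_run)
```

Since each round costs at least one model call, capping the number of rounds puts an upper bound on the number of billed requests, though not on the tokens used per request.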

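Nevertheless, Auto-GPT has demonstrated the ability to reason and to reach a goal through multiple autonomous steps, and its long-term and short-term memory mechanism lets it keep learning new things. Much of human intelligence and behavior works in the same way, so in the foreseeable future it is quite plausible that similar products will be able to replace people.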

References: