Challenge Map: Advanced Island
Level 1
In CompassArena, select dual-model dialogue and chat with InternLM2.5 plus any other model. Collect 5 conversation cases where InternLM2.5's output is worse than the other model's, plus 5 Good Cases for InternLM2.5, write them up as a Feishu document, and submit it to: https://aicarrier.feishu.cn/share/base/form/shrcnZ4bQ4YmhEtMtnKxZUcf1vd
Assignment link: https://p157xvvpk2d.feishu.cn/docx/KDzJd4Q66o6jhLxPsqYc1G1PnIf
Level 2
Basic Task
- Use Lagent to build a custom agent, deploy and call it successfully with the Lagent Web Demo, and document the reproduction process with screenshots.
- Environment setup:

```bash
git clone https://github.com/InternLM/lagent.git
cd lagent && git checkout 81e7ace && pip install -e .
```
- Deploy the InternLM2.5-7B-Chat model with LMDeploy:

```bash
model_dir="/home/scy/models/internlm/internlm2_5-7b-chat"  # local model path
lmdeploy serve api_server $model_dir --model-name internlm2_5-7b-chat
```
- Build the custom agent tool on Lagent by creating the file `lagent/actions/magicmaker.py` with the following code:

```python
import json

import requests

from lagent.actions.base_action import BaseAction, tool_api
from lagent.actions.parser import BaseParser, JsonParser
from lagent.schema import ActionReturn, ActionStatusCode


class MagicMaker(BaseAction):
    styles_option = [
        'dongman',  # anime
        'guofeng',  # Chinese traditional style
        'xieshi',   # realistic
        'youhua',   # oil painting
        'manghe',   # blind box
    ]
    aspect_ratio_options = [
        '16:9', '4:3', '3:2', '1:1', '2:3', '3:4', '9:16'
    ]

    def __init__(self, style='guofeng', aspect_ratio='4:3'):
        super().__init__()
        if style in self.styles_option:
            self.style = style
        else:
            raise ValueError(f'The style must be one of {self.styles_option}')

        if aspect_ratio in self.aspect_ratio_options:
            self.aspect_ratio = aspect_ratio
        else:
            raise ValueError(
                f'The aspect ratio must be one of {self.aspect_ratio_options}')

    @tool_api
    def generate_image(self, keywords: str) -> dict:
        """Run magicmaker and get the generated image according to the keywords.

        Args:
            keywords (:class:`str`): the keywords to generate image

        Returns:
            :class:`dict`: the generated image
                * image (str): path to the generated image
        """
        try:
            response = requests.post(
                url='https://magicmaker.openxlab.org.cn/gw/edit-anything/api/v1/bff/sd/generate',
                data=json.dumps({
                    "official": True,
                    "prompt": keywords,
                    "style": self.style,
                    "poseT": False,
                    "aspectRatio": self.aspect_ratio
                }),
                headers={'content-type': 'application/json'})
        except Exception as exc:
            return ActionReturn(
                errmsg=f'MagicMaker exception: {exc}',
                state=ActionStatusCode.HTTP_ERROR)
        image_url = response.json()['data']['imgUrl']
        return {'image': image_url}
```
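As an optional quick check (not part of the original steps), the tool can be exercised directly before wiring it into the Web Demo. Note that `generate_image` is decorated with `@tool_api`, so depending on the Lagent version the return value may be wrapped rather than a plain dict; treat this as a rough sketch:

```python
from lagent.actions.magicmaker import MagicMaker

# Instantiate the tool with an explicit style and aspect ratio and call it directly.
# The keyword string below is just an arbitrary example prompt.
tool = MagicMaker(style='guofeng', aspect_ratio='4:3')
result = tool.generate_image(keywords='a cat playing the guzheng under moonlight')
print(result)  # expected to contain the URL of the generated image
```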
- Modify `lagent/examples/internlm2_agent_web_demo.py` to register our custom MagicMaker tool:

```diff
  from lagent.actions import ActionExecutor, ArxivSearch, IPythonInterpreter
+ from lagent.actions.magicmaker import MagicMaker
  from lagent.agents.internlm2_agent import INTERPRETER_CN, META_CN, PLUGIN_CN, Internlm2Agent, Internlm2Protocol
  ...
          action_list = [
              ArxivSearch(),
+             MagicMaker(),
          ]
```
- In another terminal window, launch the Lagent Web Demo:

```bash
streamlit run examples/internlm2_agent_web_demo.py
```
- The result is shown below:
Level 3
Basic Task
- Serve the `internlm2_5-1_8b-chat` model, quantized with both W4A16 and kv cache quantization, as a local API and hold one conversation with the model. The submission screenshot must include GPU memory usage and the model's reply. See section 4.1, API Development (required for outstanding learners); note that sections 2.2.3 and 4.1 should use the assignment-version commands.
- Quantize the model with W4A16 by running the following script:

```bash
# The positional argument is the local model path;
# --work-dir is where the quantized model will be stored.
lmdeploy lite auto_awq \
    /home/scy/models/internlm/internlm2_5-1_8b-chat \
    --calib-dataset 'ptb' \
    --calib-samples 128 \
    --calib-seqlen 2048 \
    --w-bits 4 \
    --w-group-size 128 \
    --batch-size 1 \
    --search-scale False \
    --work-dir /home/scy/models/internlm/internlm2_5-1_8b-chat-w4a16-4bit
```
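Optionally (a sketch that is not part of the original steps), the quantized weights can be smoke-tested offline before serving; this assumes an LMDeploy version that exposes `TurbomindEngineConfig`:

```python
from lmdeploy import pipeline, TurbomindEngineConfig

# Load the W4A16 (AWQ) weights produced by the quantization step above.
backend_config = TurbomindEngineConfig(model_format='awq', cache_max_entry_count=0.4)
pipe = pipeline('/home/scy/models/internlm/internlm2_5-1_8b-chat-w4a16-4bit',
                backend_config=backend_config)
print(pipe(['Briefly introduce yourself.']))
```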
- Check the disk space taken by the quantized model with the following command:

```bash
du -sh ~/models/internlm/*
```

The result is shown below:
Before quantization the model takes 3.6 GB on disk; after quantization it takes 1.5 GB.
- Apply kv cache int4 quantization on top of the W4A16-quantized model by running the following script:

```bash
lmdeploy serve api_server \
    /home/scy/models/internlm/internlm2_5-1_8b-chat-w4a16-4bit \
    --model-format awq \
    --quant-policy 4 \
    --cache-max-entry-count 0.4 \
    --server-name 0.0.0.0 \
    --server-port 23333 \
    --tp 1
```

GPU memory usage is shown below:
- Call the deployed endpoint with the following command:

```bash
lmdeploy serve api_client http://0.0.0.0:23333
```

The conversation is shown below:
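As an optional alternative (a sketch, not one of the original steps), the same server can be queried through its OpenAI-compatible `/v1` interface with the `openai` client, which is the same pattern the Function-call script below uses:

```python
from openai import OpenAI

# Quick check against the quantized model served above on port 23333.
# No --api-keys was set for this server, so any placeholder key is accepted.
client = OpenAI(api_key='none', base_url='http://0.0.0.0:23333/v1')
model_name = client.models.list().data[0].id
resp = client.chat.completions.create(
    model=model_name,
    messages=[{'role': 'user', 'content': 'Briefly introduce yourself.'}],
    temperature=0.8,
)
print(resp.choices[0].message.content)
```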
- Use the Function call feature to have the model complete one simple "add" and "multiply" function call. The submission screenshot must show the tool calls in the model's reply. See section 4.2, Function call (optional).
- Deploy the internlm2_5-7b-chat model with the following script:

```bash
model_dir="/home/scy/models/internlm/internlm2_5-7b-chat"  # local model path
lmdeploy serve api_server $model_dir --server-port 23333 --api-keys internlm
```
- Create the file `internlm2_5_func.py` with the following code:

```python
from openai import OpenAI


def add(a: int, b: int):
    return a + b


def mul(a: int, b: int):
    return a * b


tools = [{
    'type': 'function',
    'function': {
        'name': 'add',
        'description': 'Compute the sum of two numbers',
        'parameters': {
            'type': 'object',
            'properties': {
                'a': {'type': 'int', 'description': 'A number'},
                'b': {'type': 'int', 'description': 'A number'},
            },
            'required': ['a', 'b'],
        },
    }
}, {
    'type': 'function',
    'function': {
        'name': 'mul',
        'description': 'Calculate the product of two numbers',
        'parameters': {
            'type': 'object',
            'properties': {
                'a': {'type': 'int', 'description': 'A number'},
                'b': {'type': 'int', 'description': 'A number'},
            },
            'required': ['a', 'b'],
        },
    }
}]

messages = [{'role': 'user', 'content': 'Compute (3+5)*2'}]

client = OpenAI(
    api_key='internlm',  # must match the --api-keys value used above
    base_url='http://0.0.0.0:23333/v1')
model_name = client.models.list().data[0].id

# First round: the model is expected to call `add`.
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    tools=tools)
print(response)
func1_name = response.choices[0].message.tool_calls[0].function.name
func1_args = response.choices[0].message.tool_calls[0].function.arguments
func1_out = eval(f'{func1_name}(**{func1_args})')
print(func1_out)

# Feed the intermediate result back, then ask for the second step (`mul`).
messages.append({
    'role': 'assistant',
    'content': response.choices[0].message.content
})
messages.append({
    'role': 'environment',
    'content': f'3+5={func1_out}',
    'name': 'plugin'
})
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    tools=tools)
print(response)
func2_name = response.choices[0].message.tool_calls[0].function.name
func2_args = response.choices[0].message.tool_calls[0].function.arguments
func2_out = eval(f'{func2_name}(**{func2_args})')
print(func2_out)
```
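When the script is run (`python internlm2_5_func.py`), the first completion is expected to return a tool call to `add` with arguments 3 and 5; after the intermediate result is appended back into the messages, the second completion should return a tool call to `mul`. These printed tool-call fields are what the submission screenshot needs to capture.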
The result is shown below:
Level 4
Basic Task
- Follow the tutorial document and video to fine-tune the model with QLoRA, reproduce the fine-tuning result, and get the model to successfully explain the meme image.
- First, download the `InternVL2-2B` model; downloading it from ModelScope is recommended.
- Download the `CLoT_cn_2000` dataset (this step can be skipped on an InternStudio server). Download link: https://github.com/chengyingshe/Tutorial/archive/refs/tags/datasets.zip
- Copy the XTuner QLoRA fine-tuning config file with the following command:

```bash
xtuner copy-cfg internvl_v2_internlm2_2b_qlora_finetune ./
```
- Modify the config file as shown below:
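The actual edits appear as a screenshot in the original write-up. As a rough, hypothetical sketch only: the typical changes point the config at the local model weights and the CLoT_cn_2000 data. The variable names below follow the stock XTuner InternVL config and the file/directory names are assumptions that must be adapted to local paths:

```python
# internvl_v2_internlm2_2b_qlora_finetune_copy.py (excerpt; hypothetical local paths)

# Point the config at the locally downloaded InternVL2-2B weights.
path = '/home/scy/models/OpenGVLab/InternVL2-2B'

# Point the data settings at the CLoT_cn_2000 dataset (annotation JSON + image folder).
data_root = '/home/scy/datasets/CLoT_cn_2000/'
data_path = data_root + 'ex_cn.json'   # assumed annotation file name
image_folder = data_root
```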
- Launch QLoRA fine-tuning with the following script:

```bash
# NPROC_PER_NODE: number of GPUs to use
NPROC_PER_NODE=2 xtuner train ./internvl_v2_internlm2_2b_qlora_finetune_copy.py \
    --work-dir /home/scy/models/internvl_ft_run_8_filter \
    --deepspeed deepspeed_zero1
```
- Merge the .pth checkpoint into an official-format model with the following script:

```bash
xtuner_dir="/home/scy/llm/xtuner"                                   # path to the xtuner repo
pth_path="/home/scy/models/internvl_ft_run_8_filter/iter_3000.pth"  # trained checkpoint
model_path="/home/scy/models/OpenGVLab/InternVL2-2B-Merged"         # save path
python $xtuner_dir/xtuner/configs/internvl/v1_5/convert_to_official.py \
    ./internvl_v2_internlm2_2b_qlora_finetune_copy.py \
    $pth_path \
    $model_path
```
- Write a test script with the following code:

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Local path of the merged, fine-tuned model.
pipe = pipeline("/home/scy/models/OpenGVLab/InternVL2-2B-Merged")

image = load_image('004atEXYgy1gpx0ifty7tj60x80x51jt02.jpg')
response = pipe(('请你根据这张图片,讲一个脑洞大开的梗', image))
print(response.text)
```
The meme image used is shown below:
Output before fine-tuning:
Output after fine-tuning:
- Try using LoRA, or adjust the XTuner config (e.g. the LoRA rank or the learning rate), observe how the training loss changes, and record the results after the adjustment (optional; using LoRA or adjusting the config, pick either one).
Level 5
Basic Task
- In InternStudio, build the standard-version HuixiangDou knowledge assistant on InternLM2-7B and complete 2 rounds of Q&A through the Gradio interface (the questions must not duplicate those in the tutorial; the submission screenshot must include the questions and HuixiangDou's answers in the Gradio interface). The knowledge base can be adjusted to your own work, studies, or interests, such as finance, healthcare, law, music, or anime (required for outstanding learners).
If the Q&A quality is unsatisfactory, try adjusting the positive and negative examples.
The result is shown below:
Challenge Task
- Create your own Q&A knowledge assistant in the web version of HuixiangDou, deploy it to a WeChat or Feishu group, and complete at least 1 round of Q&A with the HuixiangDou assistant in the group (the submission screenshot must include the question and HuixiangDou's answer).
During testing, I found that the knowledge Q&A bot lags one round behind in the conversation.
Level 6
Basic Task
- Following the tutorial, deploy MindSearch to Hugging Face, polish the Gradio interface, and provide a screenshot and a link to the Hugging Face Space.
- First, create a new Space on Hugging Face.
- Add the SiliconFlow API key.
- Clone the newly created Hugging Face Space to the local machine:

```bash
git clone git@hf.co:spaces/<user_name>/<space_name>
```

Then copy the example code from the MindSearch project into the freshly cloned folder:

```bash
cp -r MindSearch/mindsearch/* <space_name>/
```
- Add the file `<space_name>/app.py` with the following code:

```python
import json
import os

import gradio as gr
import requests
from lagent.schema import AgentStatusCode

# Start the MindSearch backend in the background.
os.system("python -m mindsearch.app --lang cn --model_format internlm_silicon &")

PLANNER_HISTORY = []
SEARCHER_HISTORY = []


def rst_mem(history_planner: list, history_searcher: list):
    '''
    Reset the chatbot memory.
    '''
    history_planner = []
    history_searcher = []
    if PLANNER_HISTORY:
        PLANNER_HISTORY.clear()
    return history_planner, history_searcher


def format_response(gr_history, agent_return):
    if agent_return['state'] in [
            AgentStatusCode.STREAM_ING, AgentStatusCode.ANSWER_ING
    ]:
        gr_history[-1][1] = agent_return['response']
    elif agent_return['state'] == AgentStatusCode.PLUGIN_START:
        thought = gr_history[-1][1].split('```')[0]
        if agent_return['response'].startswith('```'):
            gr_history[-1][1] = thought + '\n' + agent_return['response']
    elif agent_return['state'] == AgentStatusCode.PLUGIN_END:
        thought = gr_history[-1][1].split('```')[0]
        if isinstance(agent_return['response'], dict):
            gr_history[-1][
                1] = thought + '\n' + f'```json\n{json.dumps(agent_return["response"], ensure_ascii=False, indent=4)}\n```'  # noqa: E501
    elif agent_return['state'] == AgentStatusCode.PLUGIN_RETURN:
        assert agent_return['inner_steps'][-1]['role'] == 'environment'
        item = agent_return['inner_steps'][-1]
        gr_history.append([
            None,
            f"```json\n{json.dumps(item['content'], ensure_ascii=False, indent=4)}\n```"
        ])
        gr_history.append([None, ''])
    return


def predict(history_planner, history_searcher):

    def streaming(raw_response):
        for chunk in raw_response.iter_lines(chunk_size=8192,
                                             decode_unicode=False,
                                             delimiter=b'\n'):
            if chunk:
                decoded = chunk.decode('utf-8')
                if decoded == '\r':
                    continue
                if decoded[:6] == 'data: ':
                    decoded = decoded[6:]
                elif decoded.startswith(': ping - '):
                    continue
                response = json.loads(decoded)
                yield (response['response'], response['current_node'])

    global PLANNER_HISTORY
    PLANNER_HISTORY.append(dict(role='user', content=history_planner[-1][0]))
    new_search_turn = True

    url = 'http://localhost:8002/solve'
    headers = {'Content-Type': 'application/json'}
    data = {'inputs': PLANNER_HISTORY}
    raw_response = requests.post(url,
                                 headers=headers,
                                 data=json.dumps(data),
                                 timeout=20,
                                 stream=True)

    for resp in streaming(raw_response):
        agent_return, node_name = resp
        if node_name:
            if node_name in ['root', 'response']:
                continue
            agent_return = agent_return['nodes'][node_name]['detail']
            if new_search_turn:
                history_searcher.append([agent_return['content'], ''])
                new_search_turn = False
            format_response(history_searcher, agent_return)
            if agent_return['state'] == AgentStatusCode.END:
                new_search_turn = True
            yield history_planner, history_searcher
        else:
            new_search_turn = True
            format_response(history_planner, agent_return)
            if agent_return['state'] == AgentStatusCode.END:
                PLANNER_HISTORY = agent_return['inner_steps']
            yield history_planner, history_searcher
    return history_planner, history_searcher


with gr.Blocks() as demo:
    gr.HTML("""<h1 align="center">MindSearch Gradio Demo</h1>""")
    gr.HTML("""<p style="text-align: center; font-family: Arial, sans-serif;">MindSearch is an open-source AI Search Engine Framework with Perplexity.ai Pro performance. You can deploy your own Perplexity.ai-style search engine using either closed-source LLMs (GPT, Claude) or open-source LLMs (InternLM2.5-7b-chat).</p>""")
    gr.HTML("""
    <div style="text-align: center; font-size: 16px;">
        <a href="https://github.com/InternLM/MindSearch" style="margin-right: 15px; text-decoration: none; color: #4A90E2;">🔗 GitHub</a>
        <a href="https://arxiv.org/abs/2407.20183" style="margin-right: 15px; text-decoration: none; color: #4A90E2;">📄 Arxiv</a>
        <a href="https://huggingface.co/papers/2407.20183" style="margin-right: 15px; text-decoration: none; color: #4A90E2;">📚 Hugging Face Papers</a>
        <a href="https://huggingface.co/spaces/internlm/MindSearch" style="text-decoration: none; color: #4A90E2;">🤗 Hugging Face Demo</a>
    </div>
    """)
    with gr.Row():
        with gr.Column(scale=10):
            with gr.Row():
                with gr.Column():
                    planner = gr.Chatbot(label='planner',
                                         height=700,
                                         show_label=True,
                                         show_copy_button=True,
                                         bubble_full_width=False,
                                         render_markdown=True)
                with gr.Column():
                    searcher = gr.Chatbot(label='searcher',
                                          height=700,
                                          show_label=True,
                                          show_copy_button=True,
                                          bubble_full_width=False,
                                          render_markdown=True)
            with gr.Row():
                user_input = gr.Textbox(show_label=False,
                                        placeholder='帮我搜索一下 InternLM 开源体系',
                                        lines=5,
                                        container=False)
            with gr.Row():
                with gr.Column(scale=2):
                    submitBtn = gr.Button('Submit')
                with gr.Column(scale=1, min_width=20):
                    emptyBtn = gr.Button('Clear History')

    def user(query, history):
        return '', history + [[query, '']]

    submitBtn.click(user, [user_input, planner], [user_input, planner],
                    queue=False).then(predict, [planner, searcher],
                                      [planner, searcher])
    emptyBtn.click(rst_mem, [planner, searcher], [planner, searcher],
                   queue=False)

demo.queue()
demo.launch(server_name='0.0.0.0',
            server_port=7860,
            inbrowser=True,
            share=True)
```
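As the code shows, the Space launches the MindSearch backend (`python -m mindsearch.app`) as a background process listening on port 8002, while the Gradio frontend streams planner and searcher updates from `http://localhost:8002/solve` into the two chat panels and is itself served on port 7860.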
- Push the code to the remote Hugging Face repository (same workflow as pushing to GitHub):

```bash
cd <space_name>
git add .
git commit -m "some comments"
git branch -M main
git push -u origin main
```
- Open the newly created Hugging Face Space; the result is shown below:
The Hugging Face link is: https://huggingface.co/spaces/MaximeSHE/mindsearch_demo