Challenge Map: Advanced Island

Level 1

In CompassArena, select the dual-model chat mode and converse with InternLM2.5 and any other model. Collect 5 dialogue cases where InternLM2.5's output is worse than the other model's, plus 5 Good Cases for InternLM2.5, write them up as a Feishu document, and submit it to: https://aicarrier.feishu.cn/share/base/form/shrcnZ4bQ4YmhEtMtnKxZUcf1vd

Assignment link: https://p157xvvpk2d.feishu.cn/docx/KDzJd4Q66o6jhLxPsqYc1G1PnIf

Level 2

Basic Tasks

  • Use Lagent to define a custom agent, deploy and invoke it successfully through the Lagent Web Demo, and document the reproduction process with screenshots.
  1. Environment setup

    git clone https://github.com/InternLM/lagent.git
    cd lagent && git checkout 81e7ace && pip install -e .
    
  2. Deploy the InternLM2.5-7B-Chat model with LMDeploy (a quick sanity check of the server follows the command):

    model_dir="/home/scy/models/internlm/internlm2_5-7b-chat"  # local path to the model
    lmdeploy serve api_server $model_dir --model-name internlm2_5-7b-chat
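
    Once the server is up, it exposes an OpenAI-compatible API on port 23333 by default. A minimal sanity check, assuming that default port:

    import requests

    # List the models served by the OpenAI-compatible endpoint.
    resp = requests.get('http://0.0.0.0:23333/v1/models')
    print(resp.json())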
    
  3. Define a custom agent based on Lagent: create the file lagent/actions/magicmaker.py (the module name must match the import added in step 4) with the following code:

    import json
    import requests
    
    from lagent.actions.base_action import BaseAction, tool_api
    from lagent.actions.parser import BaseParser, JsonParser
    from lagent.schema import ActionReturn, ActionStatusCode
    
    class MagicMaker(BaseAction):
        styles_option = [
            'dongman',  # anime
            'guofeng',  # traditional Chinese style
            'xieshi',   # realistic
            'youhua',   # oil painting
            'manghe',   # blind box
        ]
        aspect_ratio_options = [
            '16:9', '4:3', '3:2', '1:1',
            '2:3', '3:4', '9:16'
        ]
    
        def __init__(self,
                    style='guofeng',
                    aspect_ratio='4:3'):
            super().__init__()
            if style in self.styles_option:
                self.style = style
            else:
                raise ValueError(f'The style must be one of {self.styles_option}')
            
            if aspect_ratio in self.aspect_ratio_options:
                self.aspect_ratio = aspect_ratio
            else:
                raise ValueError(f'The aspect ratio must be one of {self.aspect_ratio_options}')
        
        @tool_api
        def generate_image(self, keywords: str) -> dict:
            """Run magicmaker and get the generated image according to the keywords.
    
            Args:
                keywords (:class:`str`): the keywords to generate image
    
            Returns:
                :class:`dict`: the generated image
                    * image (str): path to the generated image
            """
            try:
                response = requests.post(
                    url='https://magicmaker.openxlab.org.cn/gw/edit-anything/api/v1/bff/sd/generate',
                    data=json.dumps({
                        "official": True,
                        "prompt": keywords,
                        "style": self.style,
                        "poseT": False,
                        "aspectRatio": self.aspect_ratio
                    }),
                    headers={'content-type': 'application/json'}
                )
            except Exception as exc:
                return ActionReturn(
                    errmsg=f'MagicMaker exception: {exc}',
                    state=ActionStatusCode.HTTP_ERROR)
            image_url = response.json()['data']['imgUrl']
            return {'image': image_url}
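
    Before wiring the tool into the Web Demo, it can be smoke-tested on its own. A minimal sketch, assuming @tool_api leaves the method directly callable and the MagicMaker endpoint is reachable (the prompt is arbitrary):

    from lagent.actions.magicmaker import MagicMaker

    # Instantiate the custom action and call the decorated method directly.
    tool = MagicMaker(style='dongman', aspect_ratio='16:9')
    result = tool.generate_image('a cat wearing sunglasses')
    print(result)  # expected: {'image': '<url of the generated image>'}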
    
  4. Modify examples/internlm2_agent_web_demo.py in the lagent repo to add our custom MagicMaker tool:

    from lagent.actions import ActionExecutor, ArxivSearch, IPythonInterpreter
    + from lagent.actions.magicmaker import MagicMaker
    from lagent.agents.internlm2_agent import INTERPRETER_CN, META_CN, PLUGIN_CN, Internlm2Agent, Internlm2Protocol
    
    ...
            action_list = [
                ArxivSearch(),
    +             MagicMaker(),
            ]
    
  5. In another terminal window, start the Lagent Web Demo:

    streamlit run examples/internlm2_agent_web_demo.py
    
  6. The result:

    img

Level 3

Basic Tasks

  • Wrap the internlm2_5-1_8b-chat model, quantized with both W4A16 and KV cache quantization, as a local API and hold one conversation with the model. The submission screenshot must show GPU memory usage and the model's reply. See section 4.1 API Development (required for outstanding students); note that sections 2.2.3 and 4.1 should use the assignment-version commands.
  1. Quantize the model with W4A16 by running the following script:

    model_dir="/home/scy/models/internlm/internlm2_5-1_8b-chat"  # local path to the model
    work_dir="/home/scy/models/internlm/internlm2_5-1_8b-chat-w4a16-4bit"  # output path for the quantized model
    lmdeploy lite auto_awq \
        $model_dir \
        --calib-dataset 'ptb' \
        --calib-samples 128 \
        --calib-seqlen 2048 \
        --w-bits 4 \
        --w-group-size 128 \
        --batch-size 1 \
        --search-scale False \
        --work-dir $work_dir
    
  2. Check how much disk space the quantized model occupies:

    du -sh ~/models/internlm/*
    

    The result:

    img

    Before quantization the model occupied 3.6 GB on disk; after quantization, 1.5 GB.
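
    Optionally, the quantized weights can be sanity-checked locally before serving them. A minimal sketch using the lmdeploy pipeline API (the prompt is arbitrary):

    from lmdeploy import pipeline, TurbomindEngineConfig

    # Load the W4A16 (AWQ) weights produced above and run one prompt.
    pipe = pipeline(
        '/home/scy/models/internlm/internlm2_5-1_8b-chat-w4a16-4bit',
        backend_config=TurbomindEngineConfig(model_format='awq'))
    print(pipe(['Introduce yourself in one sentence.']))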

  3. Serve the W4A16-quantized model with KV cache int4 quantization (--quant-policy 4) by running the following script:

    lmdeploy serve api_server \
        /home/scy/models/internlm/internlm2_5-1_8b-chat-w4a16-4bit \
        --model-format awq \
        --quant-policy 4 \
        --cache-max-entry-count 0.4 \
        --server-name 0.0.0.0 \
        --server-port 23333 \
        --tp 1
    

    GPU memory usage is shown below:

    img

  4. Call the deployed endpoint with the interactive client (an OpenAI-client alternative follows the command):

    lmdeploy serve api_client http://0.0.0.0:23333
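
    The same endpoint is OpenAI-compatible, so the conversation can also be scripted. A minimal sketch; since no --api-keys was passed to this server, any placeholder key works:

    from openai import OpenAI

    client = OpenAI(api_key='none', base_url='http://0.0.0.0:23333/v1')
    model_name = client.models.list().data[0].id
    resp = client.chat.completions.create(
        model=model_name,
        messages=[{'role': 'user', 'content': 'Hello! Who are you?'}])
    print(resp.choices[0].message.content)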
    

    A sample conversation with the api_client:

    img

  • Use the Function Call feature to have the model complete one simple "add" and "multiply" function call; the screenshot must show the model's tool-call output. See section 4.2 Function Call (optional).
  1. Deploy the internlm2_5-7b-chat model with the following script:

    model_dir="/home/scy/models/internlm/internlm2_5-7b-chat"  # local path to the model
    lmdeploy serve api_server $model_dir --server-port 23333 --api-keys internlm
    
  2. Create a new file internlm2_5_func.py with the following code:

    from openai import OpenAI
    
    def add(a: int, b: int):
        return a + b
    
    def mul(a: int, b: int):
        return a * b
    
    tools = [{
        'type': 'function',
        'function': {
            'name': 'add',
            'description': 'Compute the sum of two numbers',
            'parameters': {
                'type': 'object',
                'properties': {
                    'a': {
                        'type': 'int',
                        'description': 'A number',
                    },
                    'b': {
                        'type': 'int',
                        'description': 'A number',
                    },
                },
                'required': ['a', 'b'],
            },
        }
    }, {
        'type': 'function',
        'function': {
            'name': 'mul',
            'description': 'Calculate the product of two numbers',
            'parameters': {
                'type': 'object',
                'properties': {
                    'a': {
                        'type': 'int',
                        'description': 'A number',
                    },
                    'b': {
                        'type': 'int',
                        'description': 'A number',
                    },
                },
                'required': ['a', 'b'],
            },
        }
    }]
    messages = [{'role': 'user', 'content': 'Compute (3+5)*2'}]
    
    client = OpenAI(
        api_key='internlm',   # must match the --api-keys value passed to the server
        base_url='http://0.0.0.0:23333/v1')
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.8,
        top_p=0.8,
        stream=False,
        tools=tools)
    print(response)
    func1_name = response.choices[0].message.tool_calls[0].function.name
    func1_args = response.choices[0].message.tool_calls[0].function.arguments
    func1_out = eval(f'{func1_name}(**{func1_args})')
    print(func1_out)
    
    messages.append({
        'role': 'assistant',
        'content': response.choices[0].message.content
    })
    messages.append({
        'role': 'environment',
        'content': f'3+5={func1_out}',
        'name': 'plugin'
    })
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.8,
        top_p=0.8,
        stream=False,
        tools=tools)
    print(response)
    func2_name = response.choices[0].message.tool_calls[0].function.name
    func2_args = response.choices[0].message.tool_calls[0].function.arguments
    func2_out = eval(f'{func2_name}(**{func2_args})')
    print(func2_out)
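
    A design note: eval on model-generated strings is fine for a quick demo but unsafe in general. A sketch of a safer dispatch over the same two tools (TOOL_REGISTRY and dispatch are hypothetical names):

    import json

    TOOL_REGISTRY = {'add': add, 'mul': mul}

    def dispatch(tool_call):
        # Look up the function by name and parse its JSON arguments
        # instead of eval-ing a synthesized expression.
        fn = TOOL_REGISTRY[tool_call.function.name]
        return fn(**json.loads(tool_call.function.arguments))

    # e.g. dispatch(response.choices[0].message.tool_calls[0])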
    

Running internlm2_5_func.py produces the following output:

img

Level 4

Basic Tasks

  • Follow the tutorial document and video to fine-tune the model with QLoRA, reproduce the fine-tuning results, and get the model to successfully explain meme images.
  1. First, download the InternVL2-2B model; downloading from ModelScope is recommended.

  2. Download the CLoT_cn_2000 dataset (users on an InternStudio server can skip this step). Download link: https://github.com/chengyingshe/Tutorial/archive/refs/tags/datasets.zip

  3. Copy the xtuner QLoRA fine-tuning config file:

    xtuner copy-cfg internvl_v2_internlm2_2b_qlora_finetune ./
    
  4. Modify the config file as shown below (a rough sketch of the edits follows the screenshot):

    img
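
    For reference, the edits boil down to pointing the stock config's path variables at the local model and dataset. A rough sketch, where the variable names follow the stock xtuner InternVL config and all paths are assumptions for this machine:

    # Near the top of internvl_v2_internlm2_2b_qlora_finetune_copy.py
    path = '/home/scy/models/OpenGVLab/InternVL2-2B'  # base model (assumed local path)
    data_root = '/home/scy/datasets/CLoT_cn_2000/'    # assumed dataset location
    data_path = data_root + 'ex_cn.json'              # CLoT annotation file
    image_folder = data_root                          # images referenced by the annotations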

  5. Start QLoRA fine-tuning with the following script:

    # NPROC_PER_NODE: number of GPUs to use
    NPROC_PER_NODE=2 xtuner train ./internvl_v2_internlm2_2b_qlora_finetune_copy.py  \
            --work-dir /home/scy/models/internvl_ft_run_8_filter  \
            --deepspeed deepspeed_zero1
    
  6. Merge the .pth checkpoint into an official-format model:

    xtuner_dir="/home/scy/llm/xtuner"  # path to the xtuner repo (contains the conversion script)
    pth_path="/home/scy/models/internvl_ft_run_8_filter/iter_3000.pth"  # trained checkpoint (.pth)
    model_path="/home/scy/models/OpenGVLab/InternVL2-2B-Merged"  # output path for the merged model
    python $xtuner_dir/xtuner/configs/internvl/v1_5/convert_to_official.py \
            ./internvl_v2_internlm2_2b_qlora_finetune_copy.py \
            $pth_path \
            $model_path
    
  7. Write a test script with the following code:

    from lmdeploy import pipeline
    from lmdeploy.vl import load_image
    
    pipe = pipeline("/home/scy/models/OpenGVLab/InternVL2-2B-Merged")  # local path to the merged model
    
    image = load_image('004atEXYgy1gpx0ifty7tj60x80x51jt02.jpg')
    response = pipe(('请你根据这张图片,讲一个脑洞大开的梗', image))  # prompt: "based on this image, tell a wildly imaginative joke"
    print(response.text)
    

The meme image used:
img

Before fine-tuning:
img

After fine-tuning:
img

  • Try LoRA, or adjust the xtuner config (e.g., LoRA rank or learning rate), observe how the model loss changes, and record the results (optional; using LoRA or adjusting the config, pick one).

Level 5

Basic Tasks

  • On InternStudio, build a standard HuixiangDou knowledge assistant with InternLM2-7B and complete 2 rounds of Q&A through the Gradio interface (the questions must not repeat the tutorial's, and the screenshot must include the Gradio questions and HuixiangDou's answers). The knowledge base can be tailored to your own work, studies, or interests, such as finance, medicine, law, music, or anime (required for outstanding students).

If the Q&A quality is unsatisfactory, try adjusting the positive and negative examples.

The results:

img

img

Challenge Task

  • Create your own Q&A knowledge assistant in the web version of HuixiangDou, deploy it to a WeChat or Feishu group, and complete at least 1 round of Q&A with the assistant in the group (the screenshot must include the question and HuixiangDou's answer).

During testing, I found that the assistant's answers lag one round behind the questions.

img

Level 6

Basic Tasks

  • Following the tutorial, deploy MindSearch to HuggingFace, beautify the Gradio interface, and provide screenshots and the link to the Hugging Face Space.
  1. First, create a new Space on HuggingFace

    img

    img

  2. Add the SiliconFlow API key as a Space secret

    img

  3. Clone the newly created HuggingFace Space to your local machine:

    git clone git@hf.co:spaces/<user_name>/<space_name>

    Copy the sample code from the MindSearch project into the folder you just cloned:

    cp -r MindSearch/mindsearch/* <space_name>/

  4. Add the file <space_name>/app.py with the following code:

    import json
    import os
    
    import gradio as gr
    import requests
    from lagent.schema import AgentStatusCode
    
    os.system("python -m mindsearch.app --lang cn --model_format internlm_silicon &")
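    # Note: the backend launched above reads the SiliconFlow key from the
    # environment (the Space secret added in step 2). If it is missing, the
    # backend only fails at request time; a guard such as
    #     assert os.environ.get('SILICON_API_KEY'), 'SiliconFlow API key not set'
    # can surface the problem at startup (the variable name is my assumption
    # of what MindSearch's internlm_silicon model format reads).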
    
    PLANNER_HISTORY = []
    SEARCHER_HISTORY = []
    
    
    def rst_mem(history_planner: list, history_searcher: list):
        '''
        Reset the chatbot memory.
        '''
        history_planner = []
        history_searcher = []
        if PLANNER_HISTORY:
            PLANNER_HISTORY.clear()
        return history_planner, history_searcher
    
    
    def format_response(gr_history, agent_return):
        if agent_return['state'] in [
                AgentStatusCode.STREAM_ING, AgentStatusCode.ANSWER_ING
        ]:
            gr_history[-1][1] = agent_return['response']
        elif agent_return['state'] == AgentStatusCode.PLUGIN_START:
            thought = gr_history[-1][1].split('```')[0]
            if agent_return['response'].startswith('```'):
                gr_history[-1][1] = thought + '\n' + agent_return['response']
        elif agent_return['state'] == AgentStatusCode.PLUGIN_END:
            thought = gr_history[-1][1].split('```')[0]
            if isinstance(agent_return['response'], dict):
                gr_history[-1][
                    1] = thought + '\n' + f'```json\n{json.dumps(agent_return["response"], ensure_ascii=False, indent=4)}\n```'  # noqa: E501
        elif agent_return['state'] == AgentStatusCode.PLUGIN_RETURN:
            assert agent_return['inner_steps'][-1]['role'] == 'environment'
            item = agent_return['inner_steps'][-1]
            gr_history.append([
                None,
                f"```json\n{json.dumps(item['content'], ensure_ascii=False, indent=4)}\n```"
            ])
            gr_history.append([None, ''])
        return
    
    
    def predict(history_planner, history_searcher):
    
        def streaming(raw_response):
            for chunk in raw_response.iter_lines(chunk_size=8192,
                                                decode_unicode=False,
                                                delimiter=b'\n'):
                if chunk:
                    decoded = chunk.decode('utf-8')
                    if decoded == '\r':
                        continue
                    if decoded[:6] == 'data: ':
                        decoded = decoded[6:]
                    elif decoded.startswith(': ping - '):
                        continue
                    response = json.loads(decoded)
                    yield (response['response'], response['current_node'])
    
        global PLANNER_HISTORY
        PLANNER_HISTORY.append(dict(role='user', content=history_planner[-1][0]))
        new_search_turn = True
    
        url = 'http://localhost:8002/solve'
        headers = {'Content-Type': 'application/json'}
        data = {'inputs': PLANNER_HISTORY}
        raw_response = requests.post(url,
                                    headers=headers,
                                    data=json.dumps(data),
                                    timeout=20,
                                    stream=True)
    
        for resp in streaming(raw_response):
            agent_return, node_name = resp
            if node_name:
                if node_name in ['root', 'response']:
                    continue
                agent_return = agent_return['nodes'][node_name]['detail']
                if new_search_turn:
                    history_searcher.append([agent_return['content'], ''])
                    new_search_turn = False
                format_response(history_searcher, agent_return)
                if agent_return['state'] == AgentStatusCode.END:
                    new_search_turn = True
                yield history_planner, history_searcher
            else:
                new_search_turn = True
                format_response(history_planner, agent_return)
                if agent_return['state'] == AgentStatusCode.END:
                    PLANNER_HISTORY = agent_return['inner_steps']
                yield history_planner, history_searcher
        return history_planner, history_searcher
    
    
    with gr.Blocks() as demo:
        gr.HTML("""<h1 align="center">MindSearch Gradio Demo</h1>""")
        gr.HTML("""<p style="text-align: center; font-family: Arial, sans-serif;">MindSearch is an open-source AI Search Engine Framework with Perplexity.ai Pro performance. You can deploy your own Perplexity.ai-style search engine using either closed-source LLMs (GPT, Claude) or open-source LLMs (InternLM2.5-7b-chat).</p>""")
        gr.HTML("""
        <div style="text-align: center; font-size: 16px;">
            <a href="https://github.com/InternLM/MindSearch" style="margin-right: 15px; text-decoration: none; color: #4A90E2;">🔗 GitHub</a>
            <a href="https://arxiv.org/abs/2407.20183" style="margin-right: 15px; text-decoration: none; color: #4A90E2;">📄 Arxiv</a>
            <a href="https://huggingface.co/papers/2407.20183" style="margin-right: 15px; text-decoration: none; color: #4A90E2;">📚 Hugging Face Papers</a>
            <a href="https://huggingface.co/spaces/internlm/MindSearch" style="text-decoration: none; color: #4A90E2;">🤗 Hugging Face Demo</a>
        </div>
        """)
        with gr.Row():
            with gr.Column(scale=10):
                with gr.Row():
                    with gr.Column():
                        planner = gr.Chatbot(label='planner',
                                            height=700,
                                            show_label=True,
                                            show_copy_button=True,
                                            bubble_full_width=False,
                                            render_markdown=True)
                    with gr.Column():
                        searcher = gr.Chatbot(label='searcher',
                                            height=700,
                                            show_label=True,
                                            show_copy_button=True,
                                            bubble_full_width=False,
                                            render_markdown=True)
                with gr.Row():
                    user_input = gr.Textbox(show_label=False,
                                            placeholder='帮我搜索一下 InternLM 开源体系',
                                            lines=5,
                                            container=False)
                with gr.Row():
                    with gr.Column(scale=2):
                        submitBtn = gr.Button('Submit')
                    with gr.Column(scale=1, min_width=20):
                        emptyBtn = gr.Button('Clear History')
    
        def user(query, history):
            return '', history + [[query, '']]
    
        submitBtn.click(user, [user_input, planner], [user_input, planner],
                        queue=False).then(predict, [planner, searcher],
                                        [planner, searcher])
        emptyBtn.click(rst_mem, [planner, searcher], [planner, searcher],
                    queue=False)
    
    demo.queue()
    demo.launch(server_name='0.0.0.0',
                server_port=7860,
                inbrowser=True,
                share=True)
    
    
  5. Push the code to the remote HuggingFace repository (the same workflow as pushing to GitHub):

    cd <space_name>
    git add .
    git commit -m "some comments"
    git branch -M main
    git push -u origin main
    
  6. Open the newly created HuggingFace Space. The result:

    The HuggingFace Space link: https://huggingface.co/spaces/MaximeSHE/mindsearch_demo

    img

posted @ 2024-09-03 13:17 MaximeSHE